
Annotation Import Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Allow project administrators to import annotations from external tools (CSV, JSON, YAML) into a SyRF project stage via a hybrid async-validate / synchronous-commit flow.

Architecture: Upload triggers synchronous header parsing and job creation. A MassTransit IJobConsumer validates the full file asynchronously (study resolution, annotator resolution, tree validation) and writes a ValidationResult to the pmAnnotationImportJob document. The admin reviews conflicts and POSTs resolutions; the server re-reads the file from GCS, reconstructs the annotation tree depth-first, resolves the target AQVersion/StageQuestionSetVersion references, and writes new Annotation/AnnotationVersion/AnnotationSession/AnnotationSessionVersion aggregates. A new IGcsStorageService in SyRF.AppServices replaces S3 for this feature and establishes GCS as the standard for new features going forward.

Tech Stack: .NET 10, MongoDB (C# driver), MassTransit (IJobConsumer), Google.Cloud.Storage.V1, YamlDotNet, CsvHelper (check existing usage first), Angular 21 (signal forms, Material stepper), xUnit.

Design doc: docs/features/annotation-import/design.md


Prerequisite: QM v2 domain must be landed

This plan assumes the QM v2 domain from PRs #2572–#2575 is merged. Before starting, confirm the following types exist in src/libs/project-management/SyRF.ProjectManagement.Core/Model/QuestionVersioning/:

  • AnnotationQuestionV2 (question entity — uses Guid Id as aggregate root)
  • AQVersion (embedded on AnnotationQuestionV2; has own Guid Id and int VersionNumber)
  • ProjectQuestionSet + PQSVersion
  • StageQuestionSet + SQSVersion
  • DraftQuestion (pre-publish mutable question; formerly referred to as "DraftAQ" in earlier specs)
  • VersioningValueObjects.cs with the composite reference records:
      • AnnotationQuestionVersionReference(Guid QuestionId, Guid VersionId)
      • StageQuestionSetVersionReference(Guid StageId, Guid VersionId)
      • AnnotationVersionReference(Guid AnnotationId, Guid VersionId)
      • AnnotationSessionVersionReference(Guid SessionId, Guid VersionId)

Where this plan refers to "AQ", "AQVersion", "QSV", use the actual types listed above. Commit-time writes must pin every imported annotation to its AnnotationQuestionVersionReference and every session to its StageQuestionSetVersionReference.

Terminology sweep required during implementation: the task descriptions below were drafted before QM v2 landed and occasionally use "AQ" or "QSV" shorthand. When implementing, replace with the full current class/record names and add the typed references where needed.
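
As a minimal sketch of what "pinning" means at commit time, the snippet below attaches the typed references to each imported annotation and session. The two reference records are the ones listed above; PinnedAnnotation, PinnedSession, and CommitPinning are hypothetical names used only for illustration, and the real Annotation/AnnotationSession aggregates will carry more state.

```csharp
using System;

// Reference records as listed in the prerequisite section.
public record AnnotationQuestionVersionReference(Guid QuestionId, Guid VersionId);
public record StageQuestionSetVersionReference(Guid StageId, Guid VersionId);

// Hypothetical, simplified write models used only for this illustration.
public record PinnedAnnotation(Guid StudyId, Guid AnnotatorId, string? Answer,
    AnnotationQuestionVersionReference QuestionVersion);
public record PinnedSession(Guid StudyId, Guid AnnotatorId,
    StageQuestionSetVersionReference QuestionSetVersion);

public static class CommitPinning
{
    // Every annotation carries the exact question version it was imported against.
    public static PinnedAnnotation PinAnnotation(
        Guid studyId, Guid annotatorId, string? answer, Guid questionId, Guid aqVersionId)
        => new(studyId, annotatorId, answer,
            new AnnotationQuestionVersionReference(questionId, aqVersionId));

    // Every session carries the stage question set version it was committed against.
    public static PinnedSession PinSession(
        Guid studyId, Guid annotatorId, Guid stageId, Guid sqsVersionId)
        => new(studyId, annotatorId,
            new StageQuestionSetVersionReference(stageId, sqsVersionId));
}
```

Because the references are records, two pins to the same question version compare equal by value, which may be convenient when asserting on commit output in tests.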


Task 1: GCS Storage Service

Add IGcsStorageService to SyRF.AppServices following the same pattern as S3FileService. This is the foundation everything else depends on.

Files:
  • Create: src/libs/appservices/SyRF.AppServices/FileServices/GcsFileService.cs
  • Create: src/libs/appservices/SyRF.AppServices/FileServices/IGcsStorageService.cs
  • Create: src/libs/appservices/SyRF.AppServices/FileServices/Settings/GcsSettings.cs
  • Modify: src/libs/appservices/SyRF.AppServices/SyRF.AppServices.csproj
  • Test: src/libs/appservices/SyRF.AppServices.Tests/GcsFileServiceTests.cs

Step 1: Add the NuGet package

cd src/libs/appservices/SyRF.AppServices
dotnet add package Google.Cloud.Storage.V1

Step 2: Write the failing test

// src/libs/appservices/SyRF.AppServices.Tests/GcsFileServiceTests.cs
using Google.Cloud.Storage.V1;
using NSubstitute;
using SyRF.AppServices.FileServices;
using SyRF.AppServices.FileServices.Settings;
using Xunit;

public class GcsFileServiceTests
{
    [Fact]
    public async Task UploadAsync_WritesStreamToGcsBucket()
    {
        var storageClient = Substitute.For<StorageClient>();
        var settings = new GcsSettings { AnnotationImportsBucket = "test-bucket" };
        var sut = new GcsFileService(storageClient, settings);
        var stream = new MemoryStream("hello"u8.ToArray());

        await sut.UploadAsync(stream, "test-object.csv", "text/csv");

        await storageClient.Received(1).UploadObjectAsync(
            "test-bucket", "test-object.csv", "text/csv", stream,
            null, Arg.Any<CancellationToken>(), null);
    }

    [Fact]
    public async Task DeleteAsync_RemovesObjectFromBucket()
    {
        var storageClient = Substitute.For<StorageClient>();
        var settings = new GcsSettings { AnnotationImportsBucket = "test-bucket" };
        var sut = new GcsFileService(storageClient, settings);

        await sut.DeleteAsync("test-object.csv");

        await storageClient.Received(1).DeleteObjectAsync(
            "test-bucket", "test-object.csv", null, Arg.Any<CancellationToken>());
    }
}

Step 3: Run test to verify it fails

dotnet test src/libs/appservices/SyRF.AppServices.Tests/ --filter "GcsFileServiceTests" -v normal
Expected: FAIL — GcsFileService does not exist yet.

Step 4: Create the settings, interface, and implementation

// GcsSettings.cs
namespace SyRF.AppServices.FileServices.Settings;
public class GcsSettings
{
    public string AnnotationImportsBucket { get; set; } = string.Empty;
}

// IGcsStorageService.cs
namespace SyRF.AppServices.FileServices;
public interface IGcsStorageService
{
    Task UploadAsync(Stream content, string objectName, string contentType, CancellationToken ct = default);
    Task<Stream> DownloadAsync(string objectName, CancellationToken ct = default);
    Task DeleteAsync(string objectName, CancellationToken ct = default);
}

// GcsFileService.cs
using Google.Cloud.Storage.V1;
using SyRF.AppServices.FileServices.Settings;

namespace SyRF.AppServices.FileServices;

public class GcsFileService(StorageClient storageClient, GcsSettings settings) : IGcsStorageService
{
    public async Task UploadAsync(Stream content, string objectName, string contentType,
        CancellationToken ct = default)
        => await storageClient.UploadObjectAsync(settings.AnnotationImportsBucket, objectName,
            contentType, content, cancellationToken: ct);

    public async Task<Stream> DownloadAsync(string objectName, CancellationToken ct = default)
    {
        var stream = new MemoryStream();
        await storageClient.DownloadObjectAsync(settings.AnnotationImportsBucket, objectName,
            stream, cancellationToken: ct);
        stream.Position = 0;
        return stream;
    }

    public async Task DeleteAsync(string objectName, CancellationToken ct = default)
        => await storageClient.DeleteObjectAsync(settings.AnnotationImportsBucket, objectName,
            cancellationToken: ct);
}

Step 5: Run tests to verify they pass

dotnet test src/libs/appservices/SyRF.AppServices.Tests/ --filter "GcsFileServiceTests" -v normal
Expected: PASS (2 tests).

Step 6: Commit

git add src/libs/appservices/
git commit -m "feat(gcs): add IGcsStorageService to SyRF.AppServices

Wraps Google.Cloud.Storage.V1 for annotation import temp file storage.
Establishes GCS as the standard for new features; existing S3 usage unchanged."

Task 2: Domain Enums and Value Objects

This task adds all the supporting types for AnnotationImportJob. They are pure C# records and enums with no MongoDB or DI dependencies, which makes this the fastest task in the plan.

Files:
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Model/AnnotationImportJobAggregate/ (directory + files below)
      • AnnotationImportStatus.cs
      • ImportFileFormat.cs
      • StudyIdentifierType.cs
      • AnnotatorIdentifierType.cs
      • ImportFileInfo.cs
      • ParsedStructure.cs
      • AnnotationImportMapping.cs
      • AnnotationValidationResult.cs
      • AnnotationImportResolutions.cs
      • ImportStats.cs

Step 1: Write the failing test (record construction check)

// src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/AnnotationImportMappingTests.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using Xunit;

public class AnnotationImportMappingTests
{
    [Fact]
    public void QuestionFieldMapping_WithArrayType_HasSemicolonDelimiterByDefault()
    {
        // ArrayDelimiter is deliberately omitted so the default value is what gets asserted.
        var mapping = new QuestionFieldMapping(
            SourceField: "col_a",
            QuestionId: Guid.NewGuid(),
            IsArrayType: true);

        Assert.Equal(";", mapping.ArrayDelimiter);
        Assert.True(mapping.IsArrayType);
    }

    [Fact]
    public void AnnotationImportMapping_RequiresStudyAndAnnotatorFields()
    {
        var mapping = new AnnotationImportMapping(
            StudyIdentifierField: "StudyId",
            StudyIdentifierType: StudyIdentifierType.SyRFStudyId,
            AnnotatorIdentifierField: "Email",
            AnnotatorIdentifierType: AnnotatorIdentifierType.Email,
            QuestionMappings: []);

        Assert.Equal("StudyId", mapping.StudyIdentifierField);
        Assert.Equal("Email", mapping.AnnotatorIdentifierField);
    }
}

Step 2: Run to verify it fails

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "AnnotationImportMappingTests" -v normal

Step 3: Create all types

// AnnotationImportStatus.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum AnnotationImportStatus
{
    Created, MappingPending, Validating, ValidationComplete,
    Committing, Completed, Failed, Cancelled
}

// ImportFileFormat.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum ImportFileFormat { Csv, Json, Yaml }

// StudyIdentifierType.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum StudyIdentifierType { SyRFStudyId, CustomId }

// AnnotatorIdentifierType.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum AnnotatorIdentifierType { Email, InvestigatorId }

// ImportFileInfo.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record ImportFileInfo(
    string FileName,
    string GcsObjectName,
    ImportFileFormat Format,
    long SizeBytes);

// ParsedStructure.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record ParsedStructure(
    IReadOnlyList<string> Fields,           // Column headers (CSV) or detected JSON keys
    IReadOnlyList<IReadOnlyList<string>> SampleRows); // Up to 3 sample rows

// AnnotationImportMapping.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record AnnotationImportMapping(
    string StudyIdentifierField,
    StudyIdentifierType StudyIdentifierType,
    string AnnotatorIdentifierField,
    AnnotatorIdentifierType AnnotatorIdentifierType,
    IReadOnlyList<QuestionFieldMapping> QuestionMappings);

public record QuestionFieldMapping(
    string SourceField,
    Guid QuestionId,
    bool IsArrayType,
    string ArrayDelimiter = ";");

// AnnotationValidationResult.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record AnnotationValidationResult(
    IReadOnlyList<CleanGroup> CleanGroups,
    IReadOnlyList<AnnotationConflict> Conflicts,
    IReadOnlyList<ConditionalWarning> ConditionalWarnings,
    IReadOnlyList<OrphanedChildWarning> OrphanedChildren,
    IReadOnlyList<string> UnmatchedStudyIds,
    IReadOnlyList<string> UnresolvedAnnotators,
    int TotalAnnotationsToImport,
    int TotalStudiesAffected);

public record CleanGroup(Guid StudyId, Guid AnnotatorId, int AnnotationCount);

public record AnnotationConflict(
    Guid StudyId, Guid AnnotatorId,
    int ExistingCount, int IncomingCount,
    ConflictResolutionChoice Resolution = ConflictResolutionChoice.Pending);

public enum ConflictResolutionChoice { Pending, Overwrite, Skip }

public record ConditionalWarning(
    Guid StudyId, Guid AnnotatorId,
    Guid ChildQuestionId, Guid ParentQuestionId, string ParentAnswer);

public record OrphanedChildWarning(
    Guid StudyId, Guid AnnotatorId, Guid OrphanedQuestionId,
    OrphanedResolutionChoice Resolution = OrphanedResolutionChoice.Pending);

public enum OrphanedResolutionChoice { Pending, PromoteToRoot, Skip }

// AnnotationImportResolutions.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record AnnotationImportResolutions(
    IReadOnlyList<ConflictResolution> ConflictResolutions,
    IReadOnlyList<ConditionalWarningResolution> ConditionalWarningResolutions,
    IReadOnlyList<OrphanedChildResolution> OrphanedChildResolutions);

public record ConflictResolution(Guid StudyId, Guid AnnotatorId, ConflictResolutionChoice Choice);
public record ConditionalWarningResolution(Guid StudyId, Guid AnnotatorId, Guid ChildQuestionId, bool Include);
public record OrphanedChildResolution(Guid StudyId, Guid AnnotatorId, Guid OrphanedQuestionId, OrphanedResolutionChoice Choice);

// ImportStats.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record ImportStats(int AnnotationsImported, int StudiesAffected, int AnnotationsSkipped);
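
To illustrate how the resolution records relate to the validation result, here is a minimal sketch of a pre-commit check that finds conflicts still lacking a decision. ResolutionCheck and Unresolved are hypothetical names, AnnotationConflict is reduced to the fields the check needs, and matching a resolution to a conflict by (StudyId, AnnotatorId) is an assumption about how the two are keyed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum ConflictResolutionChoice { Pending, Overwrite, Skip }

// Simplified copies of the records above, kept local so the sketch is self-contained.
public record AnnotationConflict(Guid StudyId, Guid AnnotatorId, int ExistingCount, int IncomingCount);
public record ConflictResolution(Guid StudyId, Guid AnnotatorId, ConflictResolutionChoice Choice);

public static class ResolutionCheck
{
    // Returns the conflicts that have no resolution, or only a Pending one.
    public static IReadOnlyList<AnnotationConflict> Unresolved(
        IEnumerable<AnnotationConflict> conflicts,
        IEnumerable<ConflictResolution> resolutions)
    {
        var decided = resolutions
            .Where(r => r.Choice != ConflictResolutionChoice.Pending)
            .Select(r => (r.StudyId, r.AnnotatorId))
            .ToHashSet();

        return conflicts
            .Where(c => !decided.Contains((c.StudyId, c.AnnotatorId)))
            .ToList();
    }
}
```

A commit endpoint could reject the request while this list is non-empty, so that a Pending choice never reaches the write path.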

Step 4: Run tests to verify they pass

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "AnnotationImportMappingTests" -v normal

Step 5: Commit

git add src/libs/project-management/SyRF.ProjectManagement.Core/Model/AnnotationImportJobAggregate/
git add src/libs/project-management/SyRF.ProjectManagement.Core.Tests/
git commit -m "feat(annotation-import): add domain enums and value objects"

Task 3: AnnotationImportJob Entity, Repository, and UoW Wiring

Follows the exact pattern from Task 03-01/03-02 of Phase 3. Read DataExportJob.cs and DataExportJobRepository.cs as your reference.

Files:
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Model/AnnotationImportJobAggregate/AnnotationImportJob.cs
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Interfaces/IAnnotationImportJobRepository.cs
  • Modify: src/libs/project-management/SyRF.ProjectManagement.Core/Interfaces/IPmUnitOfWork.cs
  • Create: src/libs/project-management/SyRF.ProjectManagement.Mongo.Data/Repositories/AnnotationImportJobRepository.cs
  • Modify: src/libs/project-management/SyRF.ProjectManagement.Mongo.Data/MongoPmUnitOfWork.cs

Step 1: Write the failing build test

dotnet build src/libs/project-management/SyRF.ProjectManagement.Core/SyRF.ProjectManagement.Core.csproj
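Expected: PASS at this point. Re-run the build after each step below as a continuous check. Once IPmUnitOfWork gains the new property (Step 4), any project that implements it will not compile until MongoPmUnitOfWork is wired up (Step 6); the solution-filter build in Step 7 verifies the full wiring.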

Step 2: Create the entity

// AnnotationImportJob.cs
using SyRF.SharedKernel.BaseClasses;

namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;

public class AnnotationImportJob : AggregateRoot<Guid>
{
    public AnnotationImportJob() : base(Guid.NewGuid()) { DefaultSchemaVersion = 1; }

    public AnnotationImportJob(
        Guid projectId, Guid stageId, Guid createdByInvestigatorId,
        ImportFileInfo file, ParsedStructure parsedStructure)
        : base(Guid.NewGuid())
    {
        DefaultSchemaVersion = 1;
        ProjectId = projectId;
        StageId = stageId;
        CreatedByInvestigatorId = createdByInvestigatorId;
        CreatedAt = DateTime.UtcNow;
        File = file;
        ParsedStructure = parsedStructure;
        Status = AnnotationImportStatus.MappingPending;
    }

    public Guid ProjectId { get; set; }
    public Guid StageId { get; set; }
    public Guid CreatedByInvestigatorId { get; set; }
    public DateTime CreatedAt { get; set; }
    public DateTime? CompletedAt { get; set; }

    public ImportFileInfo File { get; set; } = null!;
    public ParsedStructure? ParsedStructure { get; set; }
    public AnnotationImportMapping? Mapping { get; set; }
    public AnnotationValidationResult? ValidationResult { get; set; }
    public AnnotationImportResolutions? Resolutions { get; set; }
    public ImportStats? Stats { get; set; }

    public AnnotationImportStatus Status { get; set; }
    public string? ErrorMessage { get; set; }

    public void SetMapping(AnnotationImportMapping mapping)
    {
        Mapping = mapping;
        Status = AnnotationImportStatus.Validating;
    }

    public void SetValidationResult(AnnotationValidationResult result)
    {
        ValidationResult = result;
        Status = AnnotationImportStatus.ValidationComplete;
    }

    public void SetResolutions(AnnotationImportResolutions resolutions)
        => Resolutions = resolutions;

    public void MarkCommitting() => Status = AnnotationImportStatus.Committing;

    public void MarkCompleted(ImportStats stats)
    {
        Stats = stats;
        Status = AnnotationImportStatus.Completed;
        CompletedAt = DateTime.UtcNow;
    }

    public void MarkFailed(string errorMessage)
    {
        ErrorMessage = errorMessage;
        Status = AnnotationImportStatus.Failed;
        CompletedAt = DateTime.UtcNow;
    }

    public void MarkCancelled()
    {
        Status = AnnotationImportStatus.Cancelled;
        CompletedAt = DateTime.UtcNow;
    }
}
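
The entity above sets Status directly and does not guard against out-of-order calls. As a hedged sketch of the lifecycle those methods imply, the table below encodes the happy-path edges (inferred from SetMapping, SetValidationResult, MarkCommitting, and MarkCompleted) and treats Failed and Cancelled as reachable from any non-terminal state. ImportStatusTransitions is a hypothetical helper, and the rule that terminal states cannot be left is an assumption rather than something the plan specifies.

```csharp
using System.Collections.Generic;

public enum AnnotationImportStatus
{
    Created, MappingPending, Validating, ValidationComplete,
    Committing, Completed, Failed, Cancelled
}

public static class ImportStatusTransitions
{
    // Happy-path edges inferred from the entity's state-changing methods.
    private static readonly Dictionary<AnnotationImportStatus, AnnotationImportStatus> Next = new()
    {
        [AnnotationImportStatus.MappingPending] = AnnotationImportStatus.Validating,
        [AnnotationImportStatus.Validating] = AnnotationImportStatus.ValidationComplete,
        [AnnotationImportStatus.ValidationComplete] = AnnotationImportStatus.Committing,
        [AnnotationImportStatus.Committing] = AnnotationImportStatus.Completed,
    };

    public static bool IsTerminal(AnnotationImportStatus status) =>
        status is AnnotationImportStatus.Completed
            or AnnotationImportStatus.Failed
            or AnnotationImportStatus.Cancelled;

    public static bool CanMove(AnnotationImportStatus from, AnnotationImportStatus to)
    {
        // Assumption: a finished job is never reopened.
        if (IsTerminal(from)) return false;

        // Failure and cancellation are allowed from any in-flight state.
        if (to is AnnotationImportStatus.Failed or AnnotationImportStatus.Cancelled) return true;

        return Next.TryGetValue(from, out var next) && next == to;
    }
}
```

If guards are wanted, each Mark*/Set* method could call CanMove and throw on an invalid transition; whether that belongs in the aggregate is a design choice this plan leaves open.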

Step 3: Create repository interface

// IAnnotationImportJobRepository.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.SharedKernel.Interfaces;

namespace SyRF.ProjectManagement.Core.Interfaces;

public interface IAnnotationImportJobRepository : ICrudRepository<AnnotationImportJob, Guid>
{
    Task<IReadOnlyList<AnnotationImportJob>> GetByStageAsync(Guid stageId, CancellationToken ct = default);
}

Step 4: Add to IPmUnitOfWork

// Add to IPmUnitOfWork.cs after the DataExportJobs property:
IAnnotationImportJobRepository AnnotationImportJobs { get; }

Step 5: Create repository implementation

// AnnotationImportJobRepository.cs
using Microsoft.Extensions.Logging;
using MongoDB.Bson.Serialization;
using SyRF.Mongo.Common;
using SyRF.ProjectManagement.Core.Interfaces;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;

namespace SyRF.ProjectManagement.Mongo.Data.Repositories;

public class AnnotationImportJobRepository
    : MongoRepositoryBase<AnnotationImportJob, Guid>, IAnnotationImportJobRepository
{
    public AnnotationImportJobRepository(
        MongoContext context,
        RepositoryCache<Guid, AnnotationImportJob> cache,
        ILogger<AnnotationImportJobRepository> logger,
        Func<IPmUnitOfWork> unitOfWork)
        : base(context, cache, logger, unitOfWork) { }

    public async Task<IReadOnlyList<AnnotationImportJob>> GetByStageAsync(
        Guid stageId, CancellationToken ct = default)
    {
        var filter = FilterDefinitionBuilder.Eq(j => j.StageId, stageId);
        var cursor = await Collection.FindAsync(filter, cancellationToken: ct);
        return await cursor.ToListAsync(ct);
    }

    public override async Task InitialiseIndexesAsync()
    {
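        await CreateAscendingIndexAsync(j => j.ProjectId);
        await CreateAscendingIndexAsync(j => j.ProjectId, j => j.StageId);
        // GetByStageAsync filters on StageId alone, which the compound index above cannot serve.
        await CreateAscendingIndexAsync(j => j.StageId);
        await CreateAscendingIndexAsync(j => j.Status);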
    }

    public override void CreateMappings()
    {
        if (!BsonClassMap.IsClassMapRegistered(typeof(AnnotationImportJob)))
            BsonClassMap.RegisterClassMap<AnnotationImportJob>(cm =>
            {
                cm.AutoMap();
                cm.SetIgnoreExtraElements(true);
            });
    }
}

Step 6: Wire into MongoPmUnitOfWork

Add IAnnotationImportJobRepository annotationImportJobRepository as a constructor parameter. Add AddRepository(annotationImportJobRepository) in the constructor body. Add the typed accessor:

public IAnnotationImportJobRepository AnnotationImportJobs =>
    (IAnnotationImportJobRepository) GetRepository<AnnotationImportJob, Guid>();

Step 7: Build to verify

dotnet build src/services/project-management/project-management.slnf
Expected: PASS with no errors.

Step 8: Commit

git add src/libs/project-management/
git commit -m "feat(annotation-import): add AnnotationImportJob collection, repository, UoW wiring"

Task 4: Import Record Types and IAnnotationFileParser Interface

The shared internal representation all three parsers produce. Pure domain types — no I/O.

Files:
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/IAnnotationFileParser.cs
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/ImportRecord.cs

Step 1: Create the types

// ImportRecord.cs
namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;

public record ImportRecord(
    string StudyIdentifierRaw,
    string AnnotatorIdentifierRaw,
    IReadOnlyList<ImportAnnotation> Annotations);

public record ImportAnnotation(
    string QuestionKey,        // Source field name — resolved to QuestionId via mapping
    string? RawValue,          // null = not present in source
    IReadOnlyList<ImportAnnotation> Children = null!) // Populated for JSON/YAML nested structures
{
    public IReadOnlyList<ImportAnnotation> Children { get; init; } = Children ?? [];
}

// IAnnotationFileParser.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;

namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;

public interface IAnnotationFileParser
{
    ImportFileFormat Format { get; }

    /// <summary>
    /// Parses only the structure (field names + up to 3 sample rows). Fast — used on upload.
    /// </summary>
    Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default);

    /// <summary>
    /// Parses all records using the supplied mapping. Used during validation and commit.
    /// </summary>
    IAsyncEnumerable<ImportRecord> ParseRecordsAsync(
        Stream content,
        AnnotationImportMapping mapping,
        CancellationToken ct = default);
}
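
Since each parser exposes its Format, callers need a way to pick the right one for an uploaded file. The sketch below shows one possible approach: detect the format from the file extension, then look up the matching parser. ParserSelection, DetectFormat, and Resolve are hypothetical names, the interface is cut down to the single member used here, and extension-based detection is an assumption (the plan does not say how the format is determined).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public enum ImportFileFormat { Csv, Json, Yaml }

// Reduced to the one member this sketch needs.
public interface IAnnotationFileParser
{
    ImportFileFormat Format { get; }
}

public static class ParserSelection
{
    // Returns null for extensions the import feature does not support.
    public static ImportFileFormat? DetectFormat(string fileName) =>
        Path.GetExtension(fileName).ToLowerInvariant() switch
        {
            ".csv" => ImportFileFormat.Csv,
            ".json" => ImportFileFormat.Json,
            ".yaml" or ".yml" => ImportFileFormat.Yaml,
            _ => null
        };

    // Picks the registered parser whose Format matches, or fails loudly.
    public static IAnnotationFileParser Resolve(
        IEnumerable<IAnnotationFileParser> parsers, ImportFileFormat format) =>
        parsers.FirstOrDefault(p => p.Format == format)
            ?? throw new NotSupportedException($"No parser registered for {format}.");
}
```

With the three parsers registered in DI as IAnnotationFileParser, an upload handler could inject IEnumerable<IAnnotationFileParser> and call Resolve with the detected format.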

Step 2: Commit

git add src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/
git commit -m "feat(annotation-import): add IAnnotationFileParser interface and import record types"

Task 5: CsvAnnotationParser

Files:
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/CsvAnnotationParser.cs
  • Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/CsvAnnotationParserTests.cs

First, check whether CsvHelper is already referenced in the project:

grep -r "CsvHelper" src/libs/project-management/ --include="*.csproj"

If not found, add it:

cd src/libs/project-management/SyRF.ProjectManagement.Core
dotnet add package CsvHelper

Step 1: Write failing tests

// CsvAnnotationParserTests.cs
using System.Text;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;

public class CsvAnnotationParserTests
{
    private static Stream ToStream(string content)
        => new MemoryStream(Encoding.UTF8.GetBytes(content));

    private static AnnotationImportMapping MakeMapping(params (string col, Guid qId)[] questions)
        => new(
            StudyIdentifierField: "StudyId",
            StudyIdentifierType: StudyIdentifierType.SyRFStudyId,
            AnnotatorIdentifierField: "Annotator",
            AnnotatorIdentifierType: AnnotatorIdentifierType.Email,
            QuestionMappings: questions.Select(q =>
                new QuestionFieldMapping(q.col, q.qId, false)).ToList());

    [Fact]
    public async Task ParseStructure_ReturnsCsvHeaders()
    {
        var csv = "StudyId,Annotator,Q1,Q2\n123,a@b.com,Yes,No\n456,c@d.com,No,Yes\n";
        var sut = new CsvAnnotationParser();

        var result = await sut.ParseStructureAsync(ToStream(csv));

        Assert.Equal(["StudyId", "Annotator", "Q1", "Q2"], result.Fields);
        Assert.Equal(2, result.SampleRows.Count);
        Assert.Equal(["123", "a@b.com", "Yes", "No"], result.SampleRows[0]);
    }

    [Fact]
    public async Task ParseRecords_ReturnsOneRecordPerRow()
    {
        var q1Id = Guid.NewGuid();
        var csv = "StudyId,Annotator,Q1\n111,a@b.com,Yes\n222,c@d.com,No\n";
        var mapping = MakeMapping(("Q1", q1Id));
        var sut = new CsvAnnotationParser();

        var records = await sut.ParseRecordsAsync(ToStream(csv), mapping)
            .ToListAsync();

        Assert.Equal(2, records.Count);
        Assert.Equal("111", records[0].StudyIdentifierRaw);
        Assert.Equal("a@b.com", records[0].AnnotatorIdentifierRaw);
        Assert.Single(records[0].Annotations);
        Assert.Equal("Yes", records[0].Annotations[0].RawValue);
    }

    [Fact]
    public async Task ParseRecords_ArrayColumn_SplitsOnSemicolon()
    {
        var q1Id = Guid.NewGuid();
        var csv = "StudyId,Annotator,Tags\n111,a@b.com,\"A;B;C\"\n";
        var mapping = new AnnotationImportMapping(
            "StudyId", StudyIdentifierType.SyRFStudyId,
            "Annotator", AnnotatorIdentifierType.Email,
            [new QuestionFieldMapping("Tags", q1Id, IsArrayType: true, ArrayDelimiter: ";")]);
        var sut = new CsvAnnotationParser();

        var records = await sut.ParseRecordsAsync(ToStream(csv), mapping).ToListAsync();

        // RawValue for array types is the raw cell; children carry individual values
        Assert.Equal("A;B;C", records[0].Annotations[0].RawValue);
    }
}

Step 2: Run to verify it fails

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "CsvAnnotationParserTests" -v normal

Step 3: Implement CsvAnnotationParser

// CsvAnnotationParser.cs
using CsvHelper;
using CsvHelper.Configuration;
using System.Globalization;
using System.Runtime.CompilerServices;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;

namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;

public class CsvAnnotationParser : IAnnotationFileParser
{
    public ImportFileFormat Format => ImportFileFormat.Csv;

    public async Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default)
    {
        using var reader = new StreamReader(content, leaveOpen: true);
        using var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.InvariantCulture));

        await csv.ReadAsync();
        csv.ReadHeader();
        var headers = csv.HeaderRecord!.ToList();

        var sampleRows = new List<IReadOnlyList<string>>();
        while (sampleRows.Count < 3 && await csv.ReadAsync())
            sampleRows.Add(headers.Select(h => csv.GetField(h) ?? "").ToList());

        return new ParsedStructure(headers, sampleRows);
    }

    public async IAsyncEnumerable<ImportRecord> ParseRecordsAsync(
        Stream content,
        AnnotationImportMapping mapping,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        using var reader = new StreamReader(content, leaveOpen: true);
        using var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.InvariantCulture));

        await csv.ReadAsync();
        csv.ReadHeader();

        while (await csv.ReadAsync())
        {
            ct.ThrowIfCancellationRequested();
            var studyId = csv.GetField(mapping.StudyIdentifierField) ?? "";
            var annotator = csv.GetField(mapping.AnnotatorIdentifierField) ?? "";

            var annotations = mapping.QuestionMappings
                .Where(qm => csv.HeaderRecord!.Contains(qm.SourceField))
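                .Select(qm =>
                {
                    var rawValue = csv.GetField(qm.SourceField);
                    // Array cells keep the raw value; each delimited item becomes a child.
                    var children = qm.IsArrayType && !string.IsNullOrEmpty(rawValue)
                        ? rawValue.Split(qm.ArrayDelimiter, StringSplitOptions.TrimEntries)
                            .Select(item => new ImportAnnotation(qm.SourceField, item)).ToList()
                        : new List<ImportAnnotation>();
                    return new ImportAnnotation(qm.SourceField, rawValue, children);
                })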
                .ToList();

            yield return new ImportRecord(studyId, annotator, annotations);
        }
    }
}

Step 4: Run tests to verify they pass

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "CsvAnnotationParserTests" -v normal
Expected: PASS (3 tests).

Step 5: Commit

git add src/libs/project-management/
git commit -m "feat(annotation-import): add CsvAnnotationParser with header + record parsing"

Task 6: JsonAnnotationParser and YamlAnnotationParser

Add YamlDotNet for YAML support. The YAML parser converts to JSON then delegates.

cd src/libs/project-management/SyRF.ProjectManagement.Core
dotnet add package YamlDotNet

Files:
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/JsonAnnotationParser.cs
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/YamlAnnotationParser.cs
  • Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/JsonAnnotationParserTests.cs

Step 1: Write failing tests

// JsonAnnotationParserTests.cs
using System.Text;
using System.Text.Json;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;

public class JsonAnnotationParserTests
{
    private static Stream ToStream(string content)
        => new MemoryStream(Encoding.UTF8.GetBytes(content));

    private static string BuildJson(object studies)
        => JsonSerializer.Serialize(new { studies });

    [Fact]
    public async Task ParseStructure_DetectsAnnotationFieldNames()
    {
        var json = BuildJson(new[]
        {
            new
            {
                studyId = "111", annotator = "a@b.com",
                annotations = new[] { new { questionKey = "Q1", answer = "Yes" } }
            }
        });
        var sut = new JsonAnnotationParser();

        var result = await sut.ParseStructureAsync(ToStream(json));

        Assert.Contains("studyId", result.Fields);
        Assert.Contains("annotator", result.Fields);
    }

    [Fact]
    public async Task ParseRecords_NestedAnnotations_PreservesChildHierarchy()
    {
        var parentQ = Guid.NewGuid();
        var childQ = Guid.NewGuid();
        var json = JsonSerializer.Serialize(new
        {
            studies = new[]
            {
                new
                {
                    studyId = "111", annotator = "a@b.com",
                    annotations = new[]
                    {
                        new
                        {
                            questionKey = parentQ.ToString(), answer = "Yes",
                            children = new[] { new { questionKey = childQ.ToString(), answer = "5" } }
                        }
                    }
                }
            }
        });

        var mapping = new AnnotationImportMapping(
            "studyId", StudyIdentifierType.SyRFStudyId,
            "annotator", AnnotatorIdentifierType.Email,
            [new QuestionFieldMapping(parentQ.ToString(), parentQ, false),
             new QuestionFieldMapping(childQ.ToString(), childQ, false)]);
        var sut = new JsonAnnotationParser();

        var records = await sut.ParseRecordsAsync(ToStream(json), mapping).ToListAsync();

        Assert.Single(records);
        var parentAnnotation = records[0].Annotations[0];
        Assert.Equal("Yes", parentAnnotation.RawValue);
        Assert.Single(parentAnnotation.Children);
        Assert.Equal("5", parentAnnotation.Children[0].RawValue);
    }
}

Step 2: Run to verify failure

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "JsonAnnotationParserTests" -v normal

Step 3: Implement JsonAnnotationParser

// JsonAnnotationParser.cs
using System.Runtime.CompilerServices;
using System.Text.Json;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;

namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;

public class JsonAnnotationParser : IAnnotationFileParser
{
    public ImportFileFormat Format => ImportFileFormat.Json;

    public async Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default)
    {
        // JsonDocument rents pooled buffers; dispose it once the values have been copied out.
        using var doc = await JsonDocument.ParseAsync(content, cancellationToken: ct);
        var root = doc.RootElement;
        var studies = root.GetProperty("studies").EnumerateArray().Take(3).ToList();

        var topFields = studies.FirstOrDefault() is { ValueKind: JsonValueKind.Object } first
            ? first.EnumerateObject().Select(p => p.Name).ToList()
            : new List<string>();

        var sampleRows = studies.Select(s =>
            (IReadOnlyList<string>) s.EnumerateObject()
                .Select(p => p.Value.ToString())
                .ToList()).ToList();

        return new ParsedStructure(topFields, sampleRows);
    }

    public async IAsyncEnumerable<ImportRecord> ParseRecordsAsync(
        Stream content,
        AnnotationImportMapping mapping,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        // Disposed when enumeration completes or the enumerator is disposed.
        using var doc = await JsonDocument.ParseAsync(content, cancellationToken: ct);
        foreach (var study in doc.RootElement.GetProperty("studies").EnumerateArray())
        {
            ct.ThrowIfCancellationRequested();
            var studyId = study.GetProperty(mapping.StudyIdentifierField).GetString() ?? "";
            var annotator = study.GetProperty(mapping.AnnotatorIdentifierField).GetString() ?? "";
            var annotations = ParseAnnotations(
                study.TryGetProperty("annotations", out var ann) ? ann : default, mapping);
            yield return new ImportRecord(studyId, annotator, annotations);
        }
    }

    private static List<ImportAnnotation> ParseAnnotations(
        JsonElement annotationsEl, AnnotationImportMapping mapping)
    {
        if (annotationsEl.ValueKind != JsonValueKind.Array) return [];
        return annotationsEl.EnumerateArray()
            .Select(a =>
            {
                var key = a.TryGetProperty("questionKey", out var k) ? k.GetString() ?? "" : "";
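                // JSON null must stay null ("not answered"); JsonElement.ToString() would return "".
                var value = a.TryGetProperty("answer", out var v) && v.ValueKind != JsonValueKind.Null
                    ? v.ToString()
                    : null;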
                var children = a.TryGetProperty("children", out var ch)
                    ? ParseAnnotations(ch, mapping)
                    : new List<ImportAnnotation>();
                return new ImportAnnotation(key, value, children);
            })
            .ToList();
    }
}

Step 4: Implement YamlAnnotationParser

// YamlAnnotationParser.cs
using System.Text;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using YamlDotNet.Serialization;

namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;

// Converts the YAML document to JSON in memory, then delegates to JsonAnnotationParser.
public class YamlAnnotationParser(JsonAnnotationParser jsonParser) : IAnnotationFileParser
{
    public ImportFileFormat Format => ImportFileFormat.Yaml;

    private static Stream ConvertToJson(Stream yaml)
    {
        using var reader = new StreamReader(yaml, leaveOpen: true);
        var yamlContent = reader.ReadToEnd();
        var deserializer = new DeserializerBuilder().Build();
        var obj = deserializer.Deserialize(yamlContent);
        var serializer = new SerializerBuilder()
            .JsonCompatible().Build();
        var json = serializer.Serialize(obj);
        return new MemoryStream(Encoding.UTF8.GetBytes(json));
    }

    public Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default)
        => jsonParser.ParseStructureAsync(ConvertToJson(content), ct);

    // No [EnumeratorCancellation] here: the attribute only has an effect on async
    // iterator methods, and this method simply forwards the token.
    public IAsyncEnumerable<ImportRecord> ParseRecordsAsync(Stream content,
        AnnotationImportMapping mapping, CancellationToken ct = default)
        => jsonParser.ParseRecordsAsync(ConvertToJson(content), mapping, ct);
}

Step 5: Run tests

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "JsonAnnotationParserTests" -v
Expected: PASS (2 tests).

Step 6: Commit

git add src/libs/project-management/
git commit -m "feat(annotation-import): add JSON and YAML annotation file parsers"

Task 7: AnnotationTreeBuilder

Pure domain logic with no I/O: takes the parsed ImportRecords, the stage question tree, and the resolutions, and produces a List<Annotation> with correct ParentId/Children links. This is the most important algorithmic piece of the feature.

Files:
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/AnnotationTreeBuilder.cs
  • Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/AnnotationTreeBuilderTests.cs

Step 1: Write failing tests

// AnnotationTreeBuilderTests.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Model.StudyAggregate;
using SyRF.ProjectManagement.Core.Model.ProjectAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;

public class AnnotationTreeBuilderTests
{
    private static readonly Guid ProjectId = Guid.NewGuid();
    private static readonly Guid StageId = Guid.NewGuid();
    private static readonly Guid StudyId = Guid.NewGuid();
    private static readonly Guid AnnotatorId = Guid.NewGuid();

    // Helper: build a minimal question tree entry
    private static AnnotationQuestion MakeQuestion(Guid id, string type, Guid? parentId = null)
        => new() { Id = id, QuestionType = type, AnswerArray = false /* adjust as needed */ };

    [Fact]
    public void Build_SingleRootQuestion_ProducesRootAnnotation()
    {
        var qId = Guid.NewGuid();
        var importAnnotations = new List<ImportAnnotation>
        {
            new(qId.ToString(), "Yes", [])
        };
        var questions = new List<AnnotationQuestion> { MakeQuestion(qId, "boolean") };
        var mapping = new AnnotationImportMapping("", StudyIdentifierType.SyRFStudyId,
            "", AnnotatorIdentifierType.Email,
            [new QuestionFieldMapping(qId.ToString(), qId, false)]);

        var result = AnnotationTreeBuilder.Build(
            ProjectId, StageId, StudyId, AnnotatorId,
            importAnnotations, questions, mapping, resolutions: null);

        Assert.Single(result);
        Assert.True(result[0].Root);
        Assert.Null(result[0].ParentId);
    }

    [Fact]
    public void Build_ParentAndChild_LinksParentIdAndChildren()
    {
        var parentQId = Guid.NewGuid();
        var childQId = Guid.NewGuid();
        var importAnnotations = new List<ImportAnnotation>
        {
            new(parentQId.ToString(), "Yes",
            [
                new ImportAnnotation(childQId.ToString(), "5", [])
            ])
        };
        var questions = new List<AnnotationQuestion>
        {
            MakeQuestion(parentQId, "boolean"),
            MakeQuestion(childQId, "integer", parentQId)
        };
        var mapping = new AnnotationImportMapping("", StudyIdentifierType.SyRFStudyId,
            "", AnnotatorIdentifierType.Email,
            [new QuestionFieldMapping(parentQId.ToString(), parentQId, false),
             new QuestionFieldMapping(childQId.ToString(), childQId, false)]);

        var result = AnnotationTreeBuilder.Build(
            ProjectId, StageId, StudyId, AnnotatorId,
            importAnnotations, questions, mapping, resolutions: null);

        Assert.Equal(2, result.Count);
        var parent = result.Single(a => a.Root);
        var child = result.Single(a => !a.Root);
        Assert.Equal(parent.Id, child.ParentId);
        Assert.Contains(child.Id, parent.Children);
    }

    [Fact]
    public void Build_NullValue_SkipsAnnotation()
    {
        var qId = Guid.NewGuid();
        var importAnnotations = new List<ImportAnnotation>
        {
            new(qId.ToString(), null, [])  // null = not answered
        };
        var questions = new List<AnnotationQuestion> { MakeQuestion(qId, "string") };
        var mapping = new AnnotationImportMapping("", StudyIdentifierType.SyRFStudyId,
            "", AnnotatorIdentifierType.Email,
            [new QuestionFieldMapping(qId.ToString(), qId, false)]);

        var result = AnnotationTreeBuilder.Build(
            ProjectId, StageId, StudyId, AnnotatorId,
            importAnnotations, questions, mapping, resolutions: null);

        Assert.Empty(result);
    }
}

Step 2: Run to verify failure

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "AnnotationTreeBuilderTests" -v

Step 3: Implement AnnotationTreeBuilder

Read Annotation.cs, BoolAnnotation.cs, StringAnnotation.cs, IntAnnotation.cs to understand the type hierarchy before implementing. The builder must instantiate the correct concrete subclass based on the question's QuestionType.

// AnnotationTreeBuilder.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Model.ProjectAggregate;
using SyRF.ProjectManagement.Core.Model.StudyAggregate;

namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;

public static class AnnotationTreeBuilder
{
    public static List<Annotation> Build(
        Guid projectId, Guid stageId, Guid studyId, Guid annotatorId,
        IReadOnlyList<ImportAnnotation> importAnnotations,
        IReadOnlyList<AnnotationQuestion> stageQuestions,
        AnnotationImportMapping mapping,
        AnnotationImportResolutions? resolutions)
    {
        var result = new List<Annotation>();
        var questionById = stageQuestions.ToDictionary(q => q.Id);
        var mappingByKey = mapping.QuestionMappings.ToDictionary(m => m.SourceField);

        BuildDepthFirst(
            importAnnotations, questionById, mappingByKey,
            projectId, stageId, studyId, annotatorId,
            parentAnnotationId: null, result);

        return result;
    }

    private static void BuildDepthFirst(
        IReadOnlyList<ImportAnnotation> annotations,
        Dictionary<Guid, AnnotationQuestion> questionById,
        Dictionary<string, QuestionFieldMapping> mappingByKey,
        Guid projectId, Guid stageId, Guid studyId, Guid annotatorId,
        Guid? parentAnnotationId,
        List<Annotation> result)
    {
        foreach (var importAnnotation in annotations)
        {
            if (importAnnotation.RawValue is null) continue;
            if (!mappingByKey.TryGetValue(importAnnotation.QuestionKey, out var qm)) continue;
            if (!questionById.TryGetValue(qm.QuestionId, out var question)) continue;

            var annotation = CreateAnnotation(
                question, importAnnotation.RawValue, qm,
                projectId, stageId, studyId, annotatorId,
                isRoot: parentAnnotationId is null,
                parentId: parentAnnotationId);

            if (annotation is null) continue;

            // Wire parent → child
            if (parentAnnotationId.HasValue)
            {
                var parent = result.First(a => a.Id == parentAnnotationId.Value);
                parent.Children.Add(annotation.Id);
            }

            result.Add(annotation);

            // Recurse for children
            if (importAnnotation.Children.Count > 0)
                BuildDepthFirst(importAnnotation.Children, questionById, mappingByKey,
                    projectId, stageId, studyId, annotatorId, annotation.Id, result);
        }
    }

    private static Annotation? CreateAnnotation(
        AnnotationQuestion question, string rawValue, QuestionFieldMapping qm,
        Guid projectId, Guid stageId, Guid studyId, Guid annotatorId,
        bool isRoot, Guid? parentId)
    {
        // Parse numbers with the invariant culture so "1.5" means the same on every host.
        var invariant = System.Globalization.CultureInfo.InvariantCulture;

        // Read Annotation.cs subclasses to confirm property names before finalising.
        // The array arm must precede the scalar arm: a single-element array value such
        // as "true" would otherwise match the scalar arm and yield a BoolAnnotation.
        Annotation? annotation = question.QuestionType switch
        {
            "boolean" when question.AnswerArray
                => new BoolArrayAnnotation { Answer = rawValue.Split(qm.ArrayDelimiter)
                    .Select(v => bool.TryParse(v.Trim(), out var parsed) && parsed).ToList() },
            "boolean" when bool.TryParse(rawValue, out var b)
                => new BoolAnnotation { Answer = b },
            "integer" when int.TryParse(rawValue, invariant, out var i)
                => new IntAnnotation { Answer = i },
            "decimal" when decimal.TryParse(rawValue, invariant, out var d)
                => new DecimalAnnotation { Answer = d },
            "string" => new StringAnnotation { Answer = rawValue },
            _ => null  // Unknown type or unparseable value: skip
        };
        if (annotation is null) return null;

        // Base properties shared by every subclass. This assumes they are settable;
        // if they are init-only, set them in each object initializer instead.
        annotation.Id = Guid.NewGuid();
        annotation.ProjectId = projectId;
        annotation.StageId = stageId;
        annotation.StudyId = studyId;
        annotation.AnnotatorId = annotatorId;
        annotation.QuestionId = question.Id;
        annotation.Question = question.QuestionText ?? "";
        annotation.Root = isRoot;
        annotation.ParentId = parentId;
        return annotation;
    }
}

Note to implementer: The CreateAnnotation method needs to set all base Annotation properties on each concrete subclass. Read the actual Annotation.cs base class before writing this — the property names may differ from the above sketch. Extract a void SetBaseProps(Annotation a, ...) helper to keep it DRY.
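
The order of the type checks matters when array and scalar questions share a QuestionType. The following is a language-agnostic illustration, written in TypeScript only so that it runs standalone. `parseAnswer`, `parseScalar`, `QuestionSpec`, and the `;` default delimiter are hypothetical names and values, not part of the SyRF codebase. It checks for arrays before scalars and returns `null` for unparseable values so the caller can skip them:

```typescript
// Hypothetical shapes for illustration; the real types live in the C# domain model.
type QuestionType = "boolean" | "integer" | "decimal" | "string";

interface QuestionSpec {
  type: QuestionType;
  isArray: boolean;
}

type Scalar = boolean | number | string;

// Parses one scalar token; returns null when the token does not fit the type.
function parseScalar(type: QuestionType, token: string): Scalar | null {
  const t = token.trim();
  if (type === "boolean") {
    const lower = t.toLowerCase();
    if (lower === "true") return true;
    if (lower === "false") return false;
    return null;
  }
  if (type === "integer") return /^[+-]?\d+$/.test(t) ? Number(t) : null;
  if (type === "decimal") return /^[+-]?(\d+\.?\d*|\.\d+)$/.test(t) ? Number(t) : null;
  return token; // "string": keep the raw text unchanged
}

// Array questions are handled first, so a single-element value such as "true"
// still produces an array rather than a scalar.
function parseAnswer(q: QuestionSpec, raw: string, delimiter = ";"): Scalar | Scalar[] | null {
  if (q.isArray) {
    const parts = raw.split(delimiter).map(p => parseScalar(q.type, p));
    // Reject the whole value if any element fails, rather than defaulting it.
    return parts.some(p => p === null) ? null : (parts as Scalar[]);
  }
  return parseScalar(q.type, raw);
}
```

Unlike the C# sketch, which maps unparseable array elements to false, this variant rejects the whole value. Which behaviour is wanted is a design decision for the validation step.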

Step 4: Run tests

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "AnnotationTreeBuilderTests" -v
Expected: PASS (3 tests).

Step 5: Commit

git add src/libs/project-management/
git commit -m "feat(annotation-import): add AnnotationTreeBuilder with depth-first tree reconstruction"

Task 8: AnnotationImportValidationService

Resolves study IDs and annotator identifiers, validates the tree structure (orphans and conditional warnings), and produces an AnnotationValidationResult. Depends on IAnnotationFileParser.

Files:
  • Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/AnnotationImportValidationService.cs
  • Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/AnnotationImportValidationServiceTests.cs

Step 1: Write failing tests

// AnnotationImportValidationServiceTests.cs
using NSubstitute;
using SyRF.ProjectManagement.Core.Interfaces;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;

public class AnnotationImportValidationServiceTests
{
    [Fact]
    public async Task Validate_UnmatchedStudyId_ReportedInResult()
    {
        var studyRepo = Substitute.For<IStudyRepository>();
        studyRepo.GetByCustomIdAsync("MISSING", Arg.Any<Guid>(), Arg.Any<CancellationToken>())
            .Returns((Study?)null);
        // ... set up parser to return one record with studyId="MISSING"
        // ... assert result.UnmatchedStudyIds.Contains("MISSING")
    }

    [Fact]
    public async Task Validate_ExistingAnnotations_ReportedAsConflict()
    {
        // ... study has existing annotations for the annotator+stage
        // ... assert result.Conflicts has one entry with correct ExistingCount
    }

    [Fact]
    public async Task Validate_OrphanedChildAnnotation_ReportedAsOrphan()
    {
        // ... import record has child question key but no parent question key
        // ... assert result.OrphanedChildren has one entry
    }
}

Step 2: Implement the service

The service signature:

public class AnnotationImportValidationService(
    IStudyRepository studyRepository,
    IInvestigatorRepository investigatorRepository,
    IEnumerable<IAnnotationFileParser> parsers)  // the built-in DI container resolves IEnumerable<T>, not T[]
{
    public async Task<AnnotationValidationResult> ValidateAsync(
        AnnotationImportJob job,
        Stream fileContent,
        IReadOnlyList<AnnotationQuestion> stageQuestions,
        CancellationToken ct = default)
    { ... }
}

Logic:

  1. Select parser by job.File.Format
  2. Stream records via ParseRecordsAsync
  3. Per record: resolve study ID, resolve annotator → build clean/conflict/orphan/warning buckets
  4. Return AnnotationValidationResult

For orphan detection: flag a child ImportAnnotation as orphaned when it has a non-null value but its parent ImportAnnotation in the same record is missing or has a null value. Use the stage question tree to determine parent-child relationships.
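
As a rough, language-agnostic sketch of the orphan check (TypeScript is used only so the snippet runs standalone): `findOrphans`, `collectValues`, and the simplified `ImportAnnotation` shape are hypothetical. It assumes the stage question tree has already been reduced to a child-to-parent map keyed by question key.

```typescript
// Hypothetical, simplified shape; the real ImportAnnotation is a C# record.
interface ImportAnnotation {
  questionKey: string;
  rawValue: string | null;
  children: ImportAnnotation[];
}

// Flattens the nested import annotations of one record into questionKey -> rawValue.
function collectValues(
  annotations: ImportAnnotation[],
  out: Map<string, string | null> = new Map<string, string | null>(),
): Map<string, string | null> {
  for (const a of annotations) {
    out.set(a.questionKey, a.rawValue);
    collectValues(a.children, out);
  }
  return out;
}

// Returns the keys of answered annotations whose parent question is missing or unanswered.
// parentByQuestion maps each child question key to its parent question key.
function findOrphans(record: ImportAnnotation[], parentByQuestion: Map<string, string>): string[] {
  const values = collectValues(record);
  const orphans: string[] = [];
  values.forEach((value, key) => {
    if (value === null) return;                 // unanswered: cannot be an orphan
    const parentKey = parentByQuestion.get(key);
    if (parentKey === undefined) return;        // root question: no parent required
    const parentValue = values.get(parentKey);  // undefined when the parent is absent
    if (parentValue === undefined || parentValue === null) orphans.push(key);
  });
  return orphans;
}
```

Looking parents up in the flattened map rather than in the nesting of the file means the check works the same way for flat CSV rows and nested JSON/YAML records.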

Step 3: Run tests and iterate until passing

dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
  --filter "AnnotationImportValidationServiceTests" -v

Step 4: Commit

git add src/libs/project-management/
git commit -m "feat(annotation-import): add AnnotationImportValidationService"

Task 9: MassTransit Validation Consumer

Files:
  • Create: src/services/project-management/SyRF.ProjectManagement.Endpoint/Consumers/AnnotationImportValidationConsumer.cs
  • Create: src/libs/project-management/SyRF.ProjectManagement.Messages/Commands/IValidateAnnotationImportCommand.cs
  • Test: src/services/project-management/SyRF.ProjectManagement.Endpoint.Tests/AnnotationImportValidationConsumerTests.cs

Step 1: Create the command message

// IValidateAnnotationImportCommand.cs
namespace SyRF.ProjectManagement.Messages.Commands;

public interface IValidateAnnotationImportCommand
{
    Guid JobId { get; }
    Guid ProjectId { get; }
}

Step 2: Write failing test

// AnnotationImportValidationConsumerTests.cs
// Follow the pattern in ReferenceFileParseJobConsumerTests.cs
// Test that: consumer loads job, downloads file from GCS, calls validation service,
// updates job with result

Step 3: Implement the consumer

// AnnotationImportValidationConsumer.cs
using MassTransit;
using SyRF.AppServices.FileServices;
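using SyRF.ProjectManagement.Core.Interfaces;
using SyRF.ProjectManagement.Core.Model.ProjectAggregate; // AnnotationQuestion (namespace assumed; confirm before use)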
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using SyRF.ProjectManagement.Messages.Commands;

namespace SyRF.ProjectManagement.Endpoint.Consumers;

public class AnnotationImportValidationConsumer(
    IGcsStorageService gcs,
    IPmUnitOfWork uow,
    AnnotationImportValidationService validationService)
    : IConsumer<IValidateAnnotationImportCommand>
{
    public async Task Consume(ConsumeContext<IValidateAnnotationImportCommand> context)
    {
        var job = await uow.AnnotationImportJobs.GetAsync(context.Message.JobId);
        if (job is null) return;

        try
        {
            await using var fileStream = await gcs.DownloadAsync(job.File.GcsObjectName, context.CancellationToken);
            var stageQuestions = await GetStageQuestionsAsync(job.ProjectId, job.StageId, context.CancellationToken);
            var result = await validationService.ValidateAsync(job, fileStream, stageQuestions, context.CancellationToken);
            job.SetValidationResult(result);
        }
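        // Let cancellation propagate instead of recording it as a job failure.
        catch (Exception ex) when (ex is not OperationCanceledException)
        {
            job.MarkFailed(ex.Message);
        }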

        await uow.SaveAsync();
    }

    private async Task<IReadOnlyList<AnnotationQuestion>> GetStageQuestionsAsync(
        Guid projectId, Guid stageId, CancellationToken ct)
    {
        var project = await uow.Projects.GetAsync(projectId);
        var stage = project?.Stages.FirstOrDefault(s => s.Id == stageId);
        return stage?.AnnotationQuestions ?? [];
    }
}

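Step 4: Register the consumer. Add it to the MassTransit registration in Program.cs / host configuration of the project-management service, following the pattern of ReferenceFileParseJobConsumer. Note that the architecture calls for an IJobConsumer, whereas the sketch above implements IConsumer<T>; if ReferenceFileParseJobConsumer is a job consumer, mirror its interface and registration.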

Step 5: Run tests

dotnet test src/services/project-management/SyRF.ProjectManagement.Endpoint.Tests/ \
  --filter "AnnotationImportValidation" -v

Step 6: Build full solution

dotnet build src/services/project-management/project-management.slnf

Step 7: Commit

git add src/services/project-management/ src/libs/project-management/SyRF.ProjectManagement.Messages/
git commit -m "feat(annotation-import): add MassTransit validation consumer"

Task 10: API Endpoints

Five endpoints in the project-management service. Follow the pattern of ReviewController.cs and the data export endpoints.

Files:
  • Create: src/services/project-management/SyRF.ProjectManagement.Endpoint/Controllers/AnnotationImportController.cs
  • Create: src/services/project-management/SyRF.ProjectManagement.Endpoint/Models/AnnotationImport/ (DTOs)

Step 1: Create DTOs

// CreateAnnotationImportResponseDto.cs: returned on POST (upload)
public record CreateAnnotationImportResponseDto(
    Guid JobId,
    string Format,
    ParsedStructureDto ParsedStructure,
    IReadOnlyList<QuestionNodeDto> StageQuestions);  // tree

// ParsedStructureDto.cs
public record ParsedStructureDto(
    IReadOnlyList<string> Fields,
    IReadOnlyList<IReadOnlyList<string>> SampleRows);

// QuestionNodeDto.cs (recursive tree)
public record QuestionNodeDto(
    Guid Id, string Text, string Type, bool IsArrayType,
    IReadOnlyList<QuestionNodeDto> Children);

// SubmitMappingDto.cs — body of POST /mapping
public record SubmitMappingDto(
    string StudyIdentifierField, string StudyIdentifierType,
    string AnnotatorIdentifierField, string AnnotatorIdentifierType,
    IReadOnlyList<QuestionFieldMappingDto> QuestionMappings);

// AnnotationImportJobStatusDto.cs — returned from GET
public record AnnotationImportJobStatusDto(
    Guid JobId, string Status,
    ValidationResultDto? ValidationResult,
    ImportStatsDto? Stats,
    string? ErrorMessage);

Step 2: Implement controller

[ApiController]
[Route("api/projects/{projectId}")]
public class AnnotationImportController(
    IPmUnitOfWork uow,
    IGcsStorageService gcs,
    IPublishEndpoint publishEndpoint,
    IAnnotationFileParserResolver parserResolver,
    GcsSettings gcsSettings) : ControllerBase
{
    // POST stages/{stageId}/annotation-imports
    [HttpPost("stages/{stageId}/annotation-imports")]
    public async Task<IActionResult> Upload(Guid projectId, Guid stageId, IFormFile file) { ... }

    // POST annotation-imports/{jobId}/mapping
    [HttpPost("annotation-imports/{jobId}/mapping")]
    public async Task<IActionResult> SubmitMapping(Guid projectId, Guid jobId,
        [FromBody] SubmitMappingDto dto) { ... }

    // GET annotation-imports/{jobId}
    [HttpGet("annotation-imports/{jobId}")]
    public async Task<IActionResult> GetStatus(Guid projectId, Guid jobId) { ... }

    // POST annotation-imports/{jobId}/confirm
    [HttpPost("annotation-imports/{jobId}/confirm")]
    public async Task<IActionResult> Confirm(Guid projectId, Guid jobId,
        [FromBody] AnnotationImportResolutionsDto resolutions) { ... }

    // DELETE annotation-imports/{jobId}
    [HttpDelete("annotation-imports/{jobId}")]
    public async Task<IActionResult> Cancel(Guid projectId, Guid jobId) { ... }
}

The Confirm endpoint re-reads the file from GCS, iterates studies in the validation result applying resolutions, calls AnnotationTreeBuilder.Build per (study, annotator) group, calls study.AddSessionData(), saves, then deletes the GCS file.
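
The grouping and resolution step of Confirm could be sketched roughly as follows, in TypeScript only so that it runs standalone. Everything here is hypothetical: the `ResolvedRecord` shape, the `"skip"` / `"overwrite"` resolution options, and the `planCommit` helper are assumptions for illustration, since the plan does not spell out the resolution model. The real endpoint would call AnnotationTreeBuilder.Build once per resulting group.

```typescript
// Hypothetical shapes; the real records and resolution model are defined in the C# domain.
interface ResolvedRecord {
  studyId: string;
  annotatorId: string;
  annotations: string[];
}

type Resolution = "skip" | "overwrite"; // assumed options, not confirmed by the plan

const groupKey = (studyId: string, annotatorId: string): string => `${studyId}|${annotatorId}`;

// Groups records per (study, annotator) and drops conflicting groups that were not
// explicitly resolved as "overwrite".
function planCommit(
  records: ResolvedRecord[],
  conflicts: Set<string>,               // group keys that already have annotations
  resolutions: Map<string, Resolution>, // admin decision per conflicting group key
): Map<string, ResolvedRecord[]> {
  const groups = new Map<string, ResolvedRecord[]>();
  for (const r of records) {
    const key = groupKey(r.studyId, r.annotatorId);
    if (conflicts.has(key) && resolutions.get(key) !== "overwrite") continue;
    const bucket = groups.get(key) ?? [];
    bucket.push(r);
    groups.set(key, bucket);
  }
  return groups;
}
```

Treating an unresolved conflict the same as "skip" is the conservative default in this sketch: existing annotations are never replaced unless the admin has said so.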

Step 3: Write controller tests following the pattern of existing controller tests in the endpoint test project.

Step 4: Build and run tests

dotnet build src/services/project-management/project-management.slnf
dotnet test src/services/project-management/SyRF.ProjectManagement.Endpoint.Tests/ \
  --filter "AnnotationImportController" -v

Step 5: Commit

git add src/services/project-management/
git commit -m "feat(annotation-import): add API endpoints (upload, mapping, status, confirm, cancel)"

Task 11: Angular UI — Import Wizard

Add an annotation import wizard to the stage admin area. Uses Angular Material Stepper.

Files:
  • Create: src/services/web/src/app/stage/stage-admin/annotation-import/ (directory)
    • annotation-import.component.ts
    • annotation-import.component.html
    • annotation-import.component.scss
    • annotation-import.component.spec.ts
    • annotation-import.service.ts
    • annotation-import.service.spec.ts
  • Modify: src/services/web/src/app/stage/stage-admin/ (add route + nav entry)

Step 1: Write failing service test

// annotation-import.service.spec.ts
import { TestBed } from '@angular/core/testing';
import { provideHttpClient } from '@angular/common/http';
import { HttpTestingController, provideHttpClientTesting } from '@angular/common/http/testing';
import { AnnotationImportService } from './annotation-import.service';

describe('AnnotationImportService', () => {
  let service: AnnotationImportService;
  let http: HttpTestingController;

  beforeEach(() => {
    TestBed.configureTestingModule({
      // provideHttpClient() must be registered before provideHttpClientTesting()
      providers: [AnnotationImportService, provideHttpClient(), provideHttpClientTesting()]
    });
    service = TestBed.inject(AnnotationImportService);
    http = TestBed.inject(HttpTestingController);
  });

  // Fail the test if any request was left unhandled
  afterEach(() => http.verify());

  it('should upload file and return job with parsed structure', () => {
    const projectId = 'proj-1', stageId = 'stage-1';
    const file = new File(['col1,col2\nv1,v2'], 'test.csv', { type: 'text/csv' });

    service.upload(projectId, stageId, file).subscribe(result => {
      expect(result.jobId).toBe('job-1');
      expect(result.parsedStructure.fields).toEqual(['col1', 'col2']);
    });

    const req = http.expectOne(
      `api/projects/${projectId}/stages/${stageId}/annotation-imports`);
    expect(req.request.method).toBe('POST');
    req.flush({ jobId: 'job-1', format: 'Csv',
      parsedStructure: { fields: ['col1', 'col2'], sampleRows: [] },
      stageQuestions: [] });
  });
});

Step 2: Run to verify failure

cd src/services/web
npx ng test --no-watch --filter="AnnotationImportService"

Step 3: Implement the service

// annotation-import.service.ts
import { Injectable, inject } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { Observable, interval, switchMap, takeWhile } from 'rxjs';

@Injectable({ providedIn: 'root' })
export class AnnotationImportService {
  readonly #http = inject(HttpClient);

  upload(projectId: string, stageId: string, file: File): Observable<ImportJobResponse> {
    const form = new FormData();
    form.append('file', file);
    return this.#http.post<ImportJobResponse>(
      `api/projects/${projectId}/stages/${stageId}/annotation-imports`, form);
  }

  submitMapping(projectId: string, jobId: string, mapping: SubmitMappingDto): Observable<void> {
    return this.#http.post<void>(
      `api/projects/${projectId}/annotation-imports/${jobId}/mapping`, mapping);
  }

  pollStatus(projectId: string, jobId: string): Observable<ImportJobStatus> {
    return interval(2000).pipe(
      switchMap(() => this.#http.get<ImportJobStatus>(
        `api/projects/${projectId}/annotation-imports/${jobId}`)),
      // inclusive = true so the first non-'Validating' status is also emitted
      takeWhile(s => s.status === 'Validating', true));
  }

  confirm(projectId: string, jobId: string, resolutions: ImportResolutionsDto): Observable<ImportStats> {
    return this.#http.post<ImportStats>(
      `api/projects/${projectId}/annotation-imports/${jobId}/confirm`, resolutions);
  }

  cancel(projectId: string, jobId: string): Observable<void> {
    return this.#http.delete<void>(
      `api/projects/${projectId}/annotation-imports/${jobId}`);
  }
}
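
The service above refers to ImportJobResponse, SubmitMappingDto, ImportJobStatus and related types without defining them. A minimal sketch of client-side models mirroring the Task 10 DTOs might look like the following. The camelCase property names follow the JSON used in the service spec. `unknown` marks DTOs this plan does not specify, and `flattenQuestions` is a hypothetical helper for rendering the question tree in the mapping step. In the real code these would presumably be exported from a models file and imported by the service.

```typescript
// Client-side mirrors of the Task 10 DTOs (a sketch; confirm against the final C# records).
interface ParsedStructure {
  fields: string[];
  sampleRows: string[][];
}

interface QuestionNode {
  id: string;
  text: string;
  type: string;
  isArrayType: boolean;
  children: QuestionNode[];
}

interface ImportJobResponse {
  jobId: string;
  format: string;
  parsedStructure: ParsedStructure;
  stageQuestions: QuestionNode[];
}

interface SubmitMappingDto {
  studyIdentifierField: string;
  studyIdentifierType: string;
  annotatorIdentifierField: string;
  annotatorIdentifierType: string;
  questionMappings: unknown[]; // QuestionFieldMappingDto is not specified in this plan
}

interface ImportJobStatus {
  jobId: string;
  status: string;
  validationResult: unknown; // ValidationResultDto is not specified in this plan
  stats: unknown;            // ImportStatsDto is not specified in this plan
  errorMessage: string | null;
}

// Hypothetical helper: depth-first flattening of the question tree, keeping each
// node's depth so a template can indent child questions.
function flattenQuestions(
  nodes: QuestionNode[],
  depth = 0,
): Array<{ node: QuestionNode; depth: number }> {
  const out: Array<{ node: QuestionNode; depth: number }> = [];
  for (const node of nodes) {
    out.push({ node, depth });
    out.push(...flattenQuestions(node.children, depth + 1));
  }
  return out;
}
```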

Step 4: Implement the stepper component

Use MatStepper with four steps matching the UX flow. Each step is a child component for clarity:
  • Step 1: <app-import-upload> (drag-and-drop file input; shows the format after upload)
  • Step 2: <app-import-mapping> (tree-rendered question list with column dropdowns)
  • Step 3: <app-import-validation> (polling spinner → conflict resolution table)
  • Step 4: <app-import-complete> (stats summary)

Use Angular signals for state: jobId = signal<string | null>(null), status = signal<ImportJobStatus | null>(null).
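
One way to keep the stepper in sync with those two signals is to derive the active step from them with a pure function. The sketch below is framework-free so it runs standalone. `stepIndexFor` and `STEP` are hypothetical names, and 'Completed' is an assumed name for the post-confirm status; only 'Validating' and 'ValidationComplete' appear in this plan.

```typescript
// 'Validating' and 'ValidationComplete' come from this plan; 'Completed' is an assumed
// name for the post-confirm status and should be checked against the job aggregate.
type ImportStatus = "Validating" | "ValidationComplete" | "Completed";

// Stepper indices for the four wizard steps (upload, mapping, validation, complete).
const STEP = { upload: 0, mapping: 1, validation: 2, complete: 3 } as const;

// Derives the active step from the wizard state.
function stepIndexFor(jobId: string | null, status: ImportStatus | null): number {
  if (jobId === null) return STEP.upload;    // nothing uploaded yet
  if (status === null) return STEP.mapping;  // job exists, mapping not submitted
  if (status === "Completed") return STEP.complete;
  return STEP.validation;                    // validating, or awaiting conflict resolution
}
```

In the component this could be wrapped in a `computed(...)` that reads `jobId()` and `status()`, so the stepper index never has to be set by hand.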

Step 5: Run component tests

npx ng test --no-watch --filter="AnnotationImport"

Step 6: Commit

git add src/services/web/src/app/stage/stage-admin/annotation-import/
git commit -m "feat(annotation-import): add Angular import wizard (upload, mapping, validation, commit)"

Task 12: GCS DI Registration + Config Wiring

Wire IGcsStorageService into DI for the project-management service. Add config section.

Files:
  • Modify: src/services/project-management/SyRF.ProjectManagement.Endpoint/Program.cs (or Startup equivalent)
  • Modify: src/services/project-management/SyRF.ProjectManagement.Endpoint/appsettings.json

Step 1: Add config

// appsettings.json — add under existing sections:
"Gcs": {
  "AnnotationImportsBucket": "syrf-annotation-imports"
}

In staging/production, override via GCP Secret Manager / Kubernetes ConfigMap (follow existing pattern for S3Settings).

Step 2: Register in DI

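// Program.cs (requires: using Google.Cloud.Storage.V1;)
// Fail fast at startup with a clear message if the config section is missing.
var gcsSettings = builder.Configuration.GetSection("Gcs").Get<GcsSettings>()
    ?? throw new InvalidOperationException("Missing 'Gcs' configuration section.");
builder.Services.AddSingleton(gcsSettings);
builder.Services.AddSingleton(StorageClient.Create());  // Application Default Credentials (Workload Identity on GKE)
builder.Services.AddScoped<IGcsStorageService, GcsFileService>();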

Step 3: Build and smoke test

dotnet build src/services/project-management/project-management.slnf

Step 4: Commit

git add src/services/project-management/SyRF.ProjectManagement.Endpoint/
git commit -m "feat(annotation-import): wire GCS DI registration and appsettings config"

Task 13: End-to-End Smoke Test

Verify the full flow works locally using the project-management service.

Manual test steps:

  1. Start the project-management service locally
  2. Pick an existing project + stage with at least 2 questions
  3. Export a small set of existing annotations using the existing data-export endpoint with ImportCompatible=true
  4. POST the exported CSV to POST /api/projects/{id}/stages/{id}/annotation-imports
  5. Verify: response contains correct column headers and stageQuestions tree
  6. POST column mapping to /mapping — verify 202 response
  7. Poll GET until status == "ValidationComplete" — inspect validationResult
  8. POST /confirm with resolutions — verify stats in response
  9. Verify annotations appear on a study via the review endpoint

Step: Commit any fixes found during smoke test

git commit -m "fix(annotation-import): [describe issue found]"

Execution Options

Plan complete and saved to docs/features/annotation-import/implementation-plan.md.

1. Subagent-Driven (this session) — dispatch fresh subagent per task, review between tasks, fast iteration

2. Parallel Session (separate) — open new session with executing-plans, batch execution with checkpoints

Which approach?