Annotation Import Implementation Plan¶
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Allow project administrators to import annotations from external tools (CSV, JSON, YAML) into a SyRF project stage via a hybrid async-validate / synchronous-commit flow.
Architecture: Upload triggers synchronous header parsing and job creation. A MassTransit IJobConsumer validates the full file asynchronously (study resolution, annotator resolution, tree validation) and writes a ValidationResult to the pmAnnotationImportJob document. The admin reviews conflicts and POSTs resolutions; the server re-reads the file from GCS, reconstructs the annotation tree depth-first, resolves the target AQVersion/StageQuestionSetVersion references, and writes new Annotation/AnnotationVersion/AnnotationSession/AnnotationSessionVersion aggregates. A new IGcsStorageService in SyRF.AppServices replaces S3 for this feature and establishes GCS as the standard for new features going forward.
Tech Stack: .NET 10, MongoDB (C# driver), MassTransit (IJobConsumer), Google.Cloud.Storage.V1, YamlDotNet, CsvHelper (check existing usage first), Angular 21 (signal forms, Material stepper), xUnit.
Design doc: docs/features/annotation-import/design.md
Prerequisite: QM v2 domain must be landed¶
This plan assumes the QM v2 domain from PRs #2572–#2575 is merged. Before starting, confirm the following types exist in src/libs/project-management/SyRF.ProjectManagement.Core/Model/QuestionVersioning/:
AnnotationQuestionV2(question entity — usesGuid Idas aggregate root)AQVersion(embedded onAnnotationQuestionV2; has ownGuid Idandint VersionNumber)ProjectQuestionSet+PQSVersionStageQuestionSet+SQSVersionDraftQuestion(pre-publish mutable question; formerly referred to as "DraftAQ" in earlier specs)VersioningValueObjects.cswith the composite reference records:AnnotationQuestionVersionReference(Guid QuestionId, Guid VersionId)StageQuestionSetVersionReference(Guid StageId, Guid VersionId)AnnotationVersionReference(Guid AnnotationId, Guid VersionId)AnnotationSessionVersionReference(Guid SessionId, Guid VersionId)
Where this plan refers to "AQ", "AQVersion", "QSV", use the actual types listed above. Commit-time writes must pin every imported annotation to its AnnotationQuestionVersionReference and every session to its StageQuestionSetVersionReference.
Terminology sweep required during implementation: the task descriptions below were drafted before QM v2 landed and occasionally use "AQ" or "QSV" shorthand. When implementing, replace with the full current class/record names and add the typed references where needed.
Task 1: GCS Storage Service¶
Add IGcsStorageService to SyRF.AppServices following the same pattern as S3FileService. This is the foundation everything else depends on.
Files:
- Create: src/libs/appservices/SyRF.AppServices/FileServices/GcsFileService.cs
- Create: src/libs/appservices/SyRF.AppServices/FileServices/IGcsStorageService.cs
- Create: src/libs/appservices/SyRF.AppServices/FileServices/Settings/GcsSettings.cs
- Modify: src/libs/appservices/SyRF.AppServices/SyRF.AppServices.csproj
- Test: src/libs/appservices/SyRF.AppServices.Tests/GcsFileServiceTests.cs
Step 1: Add the NuGet package
Step 2: Write the failing test
// src/libs/appservices/SyRF.AppServices.Tests/GcsFileServiceTests.cs
using Google.Cloud.Storage.V1;
using NSubstitute;
using SyRF.AppServices.FileServices;
using SyRF.AppServices.FileServices.Settings;
using Xunit;
public class GcsFileServiceTests
{
[Fact]
public async Task UploadAsync_WritesStreamToGcsBucket()
{
var storageClient = Substitute.For<StorageClient>();
var settings = new GcsSettings { AnnotationImportsBucket = "test-bucket" };
var sut = new GcsFileService(storageClient, settings);
var stream = new MemoryStream("hello"u8.ToArray());
await sut.UploadAsync(stream, "test-object.csv", "text/csv");
await storageClient.Received(1).UploadObjectAsync(
"test-bucket", "test-object.csv", "text/csv", stream,
null, Arg.Any<CancellationToken>(), null);
}
[Fact]
public async Task DeleteAsync_RemovesObjectFromBucket()
{
var storageClient = Substitute.For<StorageClient>();
var settings = new GcsSettings { AnnotationImportsBucket = "test-bucket" };
var sut = new GcsFileService(storageClient, settings);
await sut.DeleteAsync("test-object.csv");
await storageClient.Received(1).DeleteObjectAsync(
"test-bucket", "test-object.csv", null, Arg.Any<CancellationToken>());
}
}
Step 3: Run test to verify it fails
Expected: FAIL —GcsFileService does not exist yet.
Step 4: Create the settings, interface, and implementation
// GcsSettings.cs
namespace SyRF.AppServices.FileServices.Settings;
public class GcsSettings
{
public string AnnotationImportsBucket { get; set; } = string.Empty;
}
// IGcsStorageService.cs
namespace SyRF.AppServices.FileServices;
public interface IGcsStorageService
{
Task UploadAsync(Stream content, string objectName, string contentType, CancellationToken ct = default);
Task<Stream> DownloadAsync(string objectName, CancellationToken ct = default);
Task DeleteAsync(string objectName, CancellationToken ct = default);
}
// GcsFileService.cs
using Google.Cloud.Storage.V1;
using SyRF.AppServices.FileServices.Settings;
namespace SyRF.AppServices.FileServices;
public class GcsFileService(StorageClient storageClient, GcsSettings settings) : IGcsStorageService
{
public async Task UploadAsync(Stream content, string objectName, string contentType,
CancellationToken ct = default)
=> await storageClient.UploadObjectAsync(settings.AnnotationImportsBucket, objectName,
contentType, content, cancellationToken: ct);
public async Task<Stream> DownloadAsync(string objectName, CancellationToken ct = default)
{
var stream = new MemoryStream();
await storageClient.DownloadObjectAsync(settings.AnnotationImportsBucket, objectName,
stream, cancellationToken: ct);
stream.Position = 0;
return stream;
}
public async Task DeleteAsync(string objectName, CancellationToken ct = default)
=> await storageClient.DeleteObjectAsync(settings.AnnotationImportsBucket, objectName,
cancellationToken: ct);
}
Step 5: Run tests to verify they pass
Expected: PASS (2 tests).Step 6: Commit
git add src/libs/appservices/
git commit -m "feat(gcs): add IGcsStorageService to SyRF.AppServices
Wraps Google.Cloud.Storage.V1 for annotation import temp file storage.
Establishes GCS as the standard for new features; existing S3 usage unchanged."
Task 2: Domain Enums and Value Objects¶
All the supporting types for AnnotationImportJob. Pure C# records and enums — no MongoDB, no DI. Fastest task in the plan.
Files:
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Model/AnnotationImportJobAggregate/ (directory + files below)
- AnnotationImportStatus.cs
- ImportFileFormat.cs
- StudyIdentifierType.cs
- AnnotatorIdentifierType.cs
- ImportFileInfo.cs
- ParsedStructure.cs
- AnnotationImportMapping.cs
- AnnotationValidationResult.cs
- AnnotationImportResolutions.cs
- ImportStats.cs
Step 1: Write the failing test (round-trip serialisation check)
// src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/AnnotationImportMappingTests.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using Xunit;
public class AnnotationImportMappingTests
{
[Fact]
public void QuestionFieldMapping_WithArrayType_HasSemicolonDelimiterByDefault()
{
var mapping = new QuestionFieldMapping(
SourceField: "col_a",
QuestionId: Guid.NewGuid(),
IsArrayType: true,
ArrayDelimiter: ";");
Assert.Equal(";", mapping.ArrayDelimiter);
Assert.True(mapping.IsArrayType);
}
[Fact]
public void AnnotationImportMapping_RequiresStudyAndAnnotatorFields()
{
var mapping = new AnnotationImportMapping(
StudyIdentifierField: "StudyId",
StudyIdentifierType: StudyIdentifierType.SyRFStudyId,
AnnotatorIdentifierField: "Email",
AnnotatorIdentifierType: AnnotatorIdentifierType.Email,
QuestionMappings: []);
Assert.Equal("StudyId", mapping.StudyIdentifierField);
Assert.Equal("Email", mapping.AnnotatorIdentifierField);
}
}
Step 2: Run to verify it fails
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "AnnotationImportMappingTests" -v
Step 3: Create all types
// AnnotationImportStatus.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum AnnotationImportStatus
{
Created, MappingPending, Validating, ValidationComplete,
Committing, Completed, Failed, Cancelled
}
// ImportFileFormat.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum ImportFileFormat { Csv, Json, Yaml }
// StudyIdentifierType.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum StudyIdentifierType { SyRFStudyId, CustomId }
// AnnotatorIdentifierType.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public enum AnnotatorIdentifierType { Email, InvestigatorId }
// ImportFileInfo.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record ImportFileInfo(
string FileName,
string GcsObjectName,
ImportFileFormat Format,
long SizeBytes);
// ParsedStructure.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record ParsedStructure(
IReadOnlyList<string> Fields, // Column headers (CSV) or detected JSON keys
IReadOnlyList<IReadOnlyList<string>> SampleRows); // Up to 3 sample rows
// AnnotationImportMapping.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record AnnotationImportMapping(
string StudyIdentifierField,
StudyIdentifierType StudyIdentifierType,
string AnnotatorIdentifierField,
AnnotatorIdentifierType AnnotatorIdentifierType,
IReadOnlyList<QuestionFieldMapping> QuestionMappings);
public record QuestionFieldMapping(
string SourceField,
Guid QuestionId,
bool IsArrayType,
string ArrayDelimiter = ";");
// AnnotationValidationResult.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record AnnotationValidationResult(
IReadOnlyList<CleanGroup> CleanGroups,
IReadOnlyList<AnnotationConflict> Conflicts,
IReadOnlyList<ConditionalWarning> ConditionalWarnings,
IReadOnlyList<OrphanedChildWarning> OrphanedChildren,
IReadOnlyList<string> UnmatchedStudyIds,
IReadOnlyList<string> UnresolvedAnnotators,
int TotalAnnotationsToImport,
int TotalStudiesAffected);
public record CleanGroup(Guid StudyId, Guid AnnotatorId, int AnnotationCount);
public record AnnotationConflict(
Guid StudyId, Guid AnnotatorId,
int ExistingCount, int IncomingCount,
ConflictResolutionChoice Resolution = ConflictResolutionChoice.Pending);
public enum ConflictResolutionChoice { Pending, Overwrite, Skip }
public record ConditionalWarning(
Guid StudyId, Guid AnnotatorId,
Guid ChildQuestionId, Guid ParentQuestionId, string ParentAnswer);
public record OrphanedChildWarning(
Guid StudyId, Guid AnnotatorId, Guid OrphanedQuestionId,
OrphanedResolutionChoice Resolution = OrphanedResolutionChoice.Pending);
public enum OrphanedResolutionChoice { Pending, PromoteToRoot, Skip }
// AnnotationImportResolutions.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record AnnotationImportResolutions(
IReadOnlyList<ConflictResolution> ConflictResolutions,
IReadOnlyList<ConditionalWarningResolution> ConditionalWarningResolutions,
IReadOnlyList<OrphanedChildResolution> OrphanedChildResolutions);
public record ConflictResolution(Guid StudyId, Guid AnnotatorId, ConflictResolutionChoice Choice);
public record ConditionalWarningResolution(Guid StudyId, Guid AnnotatorId, Guid ChildQuestionId, bool Include);
public record OrphanedChildResolution(Guid StudyId, Guid AnnotatorId, Guid OrphanedQuestionId, OrphanedResolutionChoice Choice);
// ImportStats.cs
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public record ImportStats(int AnnotationsImported, int StudiesAffected, int AnnotationsSkipped);
Step 4: Run tests to verify they pass
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "AnnotationImportMappingTests" -v
Step 5: Commit
git add src/libs/project-management/SyRF.ProjectManagement.Core/Model/AnnotationImportJobAggregate/
git add src/libs/project-management/SyRF.ProjectManagement.Core.Tests/
git commit -m "feat(annotation-import): add domain enums and value objects"
Task 3: AnnotationImportJob Entity, Repository, and UoW Wiring¶
Follows the exact pattern from Task 03-01/03-02 of Phase 3. Read DataExportJob.cs and DataExportJobRepository.cs as your reference.
Files:
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Model/AnnotationImportJobAggregate/AnnotationImportJob.cs
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Interfaces/IAnnotationImportJobRepository.cs
- Modify: src/libs/project-management/SyRF.ProjectManagement.Core/Interfaces/IPmUnitOfWork.cs
- Create: src/libs/project-management/SyRF.ProjectManagement.Mongo.Data/Repositories/AnnotationImportJobRepository.cs
- Modify: src/libs/project-management/SyRF.ProjectManagement.Mongo.Data/MongoPmUnitOfWork.cs
Step 1: Write the failing build test
dotnet build src/libs/project-management/SyRF.ProjectManagement.Core/SyRF.ProjectManagement.Core.csproj
Step 2: Create the entity
// AnnotationImportJob.cs
using SyRF.SharedKernel.BaseClasses;
namespace SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
public class AnnotationImportJob : AggregateRoot<Guid>
{
public AnnotationImportJob() : base(Guid.NewGuid()) { DefaultSchemaVersion = 1; }
public AnnotationImportJob(
Guid projectId, Guid stageId, Guid createdByInvestigatorId,
ImportFileInfo file, ParsedStructure parsedStructure)
: base(Guid.NewGuid())
{
DefaultSchemaVersion = 1;
ProjectId = projectId;
StageId = stageId;
CreatedByInvestigatorId = createdByInvestigatorId;
CreatedAt = DateTime.UtcNow;
File = file;
ParsedStructure = parsedStructure;
Status = AnnotationImportStatus.MappingPending;
}
public Guid ProjectId { get; set; }
public Guid StageId { get; set; }
public Guid CreatedByInvestigatorId { get; set; }
public DateTime CreatedAt { get; set; }
public DateTime? CompletedAt { get; set; }
public ImportFileInfo File { get; set; } = null!;
public ParsedStructure? ParsedStructure { get; set; }
public AnnotationImportMapping? Mapping { get; set; }
public AnnotationValidationResult? ValidationResult { get; set; }
public AnnotationImportResolutions? Resolutions { get; set; }
public ImportStats? Stats { get; set; }
public AnnotationImportStatus Status { get; set; }
public string? ErrorMessage { get; set; }
public void SetMapping(AnnotationImportMapping mapping)
{
Mapping = mapping;
Status = AnnotationImportStatus.Validating;
}
public void SetValidationResult(AnnotationValidationResult result)
{
ValidationResult = result;
Status = AnnotationImportStatus.ValidationComplete;
}
public void SetResolutions(AnnotationImportResolutions resolutions)
=> Resolutions = resolutions;
public void MarkCommitting() => Status = AnnotationImportStatus.Committing;
public void MarkCompleted(ImportStats stats)
{
Stats = stats;
Status = AnnotationImportStatus.Completed;
CompletedAt = DateTime.UtcNow;
}
public void MarkFailed(string errorMessage)
{
ErrorMessage = errorMessage;
Status = AnnotationImportStatus.Failed;
CompletedAt = DateTime.UtcNow;
}
public void MarkCancelled()
{
Status = AnnotationImportStatus.Cancelled;
CompletedAt = DateTime.UtcNow;
}
}
Step 3: Create repository interface
// IAnnotationImportJobRepository.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.SharedKernel.Interfaces;
namespace SyRF.ProjectManagement.Core.Interfaces;
public interface IAnnotationImportJobRepository : ICrudRepository<AnnotationImportJob, Guid>
{
Task<IReadOnlyList<AnnotationImportJob>> GetByStageAsync(Guid stageId, CancellationToken ct = default);
}
Step 4: Add to IPmUnitOfWork
// Add to IPmUnitOfWork.cs after the DataExportJobs property:
IAnnotationImportJobRepository AnnotationImportJobs { get; }
Step 5: Create repository implementation
// AnnotationImportJobRepository.cs
using Microsoft.Extensions.Logging;
using MongoDB.Bson.Serialization;
using SyRF.Mongo.Common;
using SyRF.ProjectManagement.Core.Interfaces;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
namespace SyRF.ProjectManagement.Mongo.Data.Repositories;
public class AnnotationImportJobRepository
: MongoRepositoryBase<AnnotationImportJob, Guid>, IAnnotationImportJobRepository
{
public AnnotationImportJobRepository(
MongoContext context,
RepositoryCache<Guid, AnnotationImportJob> cache,
ILogger<AnnotationImportJobRepository> logger,
Func<IPmUnitOfWork> unitOfWork)
: base(context, cache, logger, unitOfWork) { }
public async Task<IReadOnlyList<AnnotationImportJob>> GetByStageAsync(
Guid stageId, CancellationToken ct = default)
{
var filter = FilterDefinitionBuilder.Eq(j => j.StageId, stageId);
var cursor = await Collection.FindAsync(filter, cancellationToken: ct);
return await cursor.ToListAsync(ct);
}
public override async Task InitialiseIndexesAsync()
{
await CreateAscendingIndexAsync(j => j.ProjectId);
await CreateAscendingIndexAsync(j => j.ProjectId, j => j.StageId);
await CreateAscendingIndexAsync(j => j.Status);
}
public override void CreateMappings()
{
if (!BsonClassMap.IsClassMapRegistered(typeof(AnnotationImportJob)))
BsonClassMap.RegisterClassMap<AnnotationImportJob>(cm =>
{
cm.AutoMap();
cm.SetIgnoreExtraElements(true);
});
}
}
Step 6: Wire into MongoPmUnitOfWork
Add IAnnotationImportJobRepository annotationImportJobRepository as a constructor parameter. Add AddRepository(annotationImportJobRepository) in the constructor body. Add the typed accessor:
public IAnnotationImportJobRepository AnnotationImportJobs =>
(IAnnotationImportJobRepository) GetRepository<AnnotationImportJob, Guid>();
Step 7: Build to verify
Expected: PASS with no errors.Step 8: Commit
git add src/libs/project-management/
git commit -m "feat(annotation-import): add AnnotationImportJob collection, repository, UoW wiring"
Task 4: Import Record Types and IAnnotationFileParser Interface¶
The shared internal representation all three parsers produce. Pure domain types — no I/O.
Files:
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/IAnnotationFileParser.cs
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/ImportRecord.cs
Step 1: Create the types
// ImportRecord.cs
namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
public record ImportRecord(
string StudyIdentifierRaw,
string AnnotatorIdentifierRaw,
IReadOnlyList<ImportAnnotation> Annotations);
public record ImportAnnotation(
string QuestionKey, // Source field name — resolved to QuestionId via mapping
string? RawValue, // null = not present in source
IReadOnlyList<ImportAnnotation> Children = null!) // Populated for JSON/YAML nested structures
{
public IReadOnlyList<ImportAnnotation> Children { get; init; } = Children ?? [];
}
// IAnnotationFileParser.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
public interface IAnnotationFileParser
{
ImportFileFormat Format { get; }
/// <summary>
/// Parses only the structure (field names + up to 3 sample rows). Fast — used on upload.
/// </summary>
Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default);
/// <summary>
/// Parses all records using the supplied mapping. Used during validation and commit.
/// </summary>
IAsyncEnumerable<ImportRecord> ParseRecordsAsync(
Stream content,
AnnotationImportMapping mapping,
CancellationToken ct = default);
}
Step 2: Commit
git add src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/
git commit -m "feat(annotation-import): add IAnnotationFileParser interface and import record types"
Task 5: CsvAnnotationParser¶
Files:
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/CsvAnnotationParser.cs
- Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/CsvAnnotationParserTests.cs
First, check whether CsvHelper is already referenced in the project:
If not found, add it:
Step 1: Write failing tests
// CsvAnnotationParserTests.cs
using System.Text;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;
public class CsvAnnotationParserTests
{
private static Stream ToStream(string content)
=> new MemoryStream(Encoding.UTF8.GetBytes(content));
private static AnnotationImportMapping MakeMapping(params (string col, Guid qId)[] questions)
=> new(
StudyIdentifierField: "StudyId",
StudyIdentifierType: StudyIdentifierType.SyRFStudyId,
AnnotatorIdentifierField: "Annotator",
AnnotatorIdentifierType: AnnotatorIdentifierType.Email,
QuestionMappings: questions.Select(q =>
new QuestionFieldMapping(q.col, q.qId, false)).ToList());
[Fact]
public async Task ParseStructure_ReturnsCsvHeaders()
{
var csv = "StudyId,Annotator,Q1,Q2\n123,a@b.com,Yes,No\n456,c@d.com,No,Yes\n";
var sut = new CsvAnnotationParser();
var result = await sut.ParseStructureAsync(ToStream(csv));
Assert.Equal(["StudyId", "Annotator", "Q1", "Q2"], result.Fields);
Assert.Equal(2, result.SampleRows.Count);
Assert.Equal(["123", "a@b.com", "Yes", "No"], result.SampleRows[0]);
}
[Fact]
public async Task ParseRecords_ReturnsOneRecordPerRow()
{
var q1Id = Guid.NewGuid();
var csv = "StudyId,Annotator,Q1\n111,a@b.com,Yes\n222,c@d.com,No\n";
var mapping = MakeMapping(("Q1", q1Id));
var sut = new CsvAnnotationParser();
var records = await sut.ParseRecordsAsync(ToStream(csv), mapping)
.ToListAsync();
Assert.Equal(2, records.Count);
Assert.Equal("111", records[0].StudyIdentifierRaw);
Assert.Equal("a@b.com", records[0].AnnotatorIdentifierRaw);
Assert.Single(records[0].Annotations);
Assert.Equal("Yes", records[0].Annotations[0].RawValue);
}
[Fact]
public async Task ParseRecords_ArrayColumn_SplitsOnSemicolon()
{
var q1Id = Guid.NewGuid();
var csv = "StudyId,Annotator,Tags\n111,a@b.com,\"A;B;C\"\n";
var mapping = new AnnotationImportMapping(
"StudyId", StudyIdentifierType.SyRFStudyId,
"Annotator", AnnotatorIdentifierType.Email,
[new QuestionFieldMapping("Tags", q1Id, IsArrayType: true, ArrayDelimiter: ";")]);
var sut = new CsvAnnotationParser();
var records = await sut.ParseRecordsAsync(ToStream(csv), mapping).ToListAsync();
// RawValue for array types is the raw cell; children carry individual values
Assert.Equal("A;B;C", records[0].Annotations[0].RawValue);
}
}
Step 2: Run to verify it fails
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "CsvAnnotationParserTests" -v
Step 3: Implement CsvAnnotationParser
// CsvAnnotationParser.cs
using CsvHelper;
using CsvHelper.Configuration;
using System.Globalization;
using System.Runtime.CompilerServices;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
public class CsvAnnotationParser : IAnnotationFileParser
{
public ImportFileFormat Format => ImportFileFormat.Csv;
public async Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default)
{
using var reader = new StreamReader(content, leaveOpen: true);
using var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.InvariantCulture));
await csv.ReadAsync();
csv.ReadHeader();
var headers = csv.HeaderRecord!.ToList();
var sampleRows = new List<IReadOnlyList<string>>();
while (sampleRows.Count < 3 && await csv.ReadAsync())
sampleRows.Add(headers.Select(h => csv.GetField(h) ?? "").ToList());
return new ParsedStructure(headers, sampleRows);
}
public async IAsyncEnumerable<ImportRecord> ParseRecordsAsync(
Stream content,
AnnotationImportMapping mapping,
[EnumeratorCancellation] CancellationToken ct = default)
{
using var reader = new StreamReader(content, leaveOpen: true);
using var csv = new CsvReader(reader, new CsvConfiguration(CultureInfo.InvariantCulture));
await csv.ReadAsync();
csv.ReadHeader();
while (await csv.ReadAsync())
{
ct.ThrowIfCancellationRequested();
var studyId = csv.GetField(mapping.StudyIdentifierField) ?? "";
var annotator = csv.GetField(mapping.AnnotatorIdentifierField) ?? "";
var annotations = mapping.QuestionMappings
.Where(qm => csv.HeaderRecord!.Contains(qm.SourceField))
.Select(qm =>
{
var rawValue = csv.GetField(qm.SourceField);
return new ImportAnnotation(qm.SourceField, rawValue);
})
.ToList();
yield return new ImportRecord(studyId, annotator, annotations);
}
}
}
Step 4: Run tests to verify they pass
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "CsvAnnotationParserTests" -v
Step 5: Commit
git add src/libs/project-management/
git commit -m "feat(annotation-import): add CsvAnnotationParser with header + record parsing"
Task 6: JsonAnnotationParser and YamlAnnotationParser¶
Add YamlDotNet for YAML support. The YAML parser converts to JSON then delegates.
Files:
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/JsonAnnotationParser.cs
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/YamlAnnotationParser.cs
- Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/JsonAnnotationParserTests.cs
Step 1: Write failing tests
// JsonAnnotationParserTests.cs
using System.Text;
using System.Text.Json;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;
public class JsonAnnotationParserTests
{
private static Stream ToStream(string content)
=> new MemoryStream(Encoding.UTF8.GetBytes(content));
private static string BuildJson(object studies)
=> JsonSerializer.Serialize(new { studies });
[Fact]
public async Task ParseStructure_DetectsAnnotationFieldNames()
{
var json = BuildJson(new[]
{
new
{
studyId = "111", annotator = "a@b.com",
annotations = new[] { new { questionKey = "Q1", answer = "Yes" } }
}
});
var sut = new JsonAnnotationParser();
var result = await sut.ParseStructureAsync(ToStream(json));
Assert.Contains("studyId", result.Fields);
Assert.Contains("annotator", result.Fields);
}
[Fact]
public async Task ParseRecords_NestedAnnotations_PreservesChildHierarchy()
{
var parentQ = Guid.NewGuid();
var childQ = Guid.NewGuid();
var json = JsonSerializer.Serialize(new
{
studies = new[]
{
new
{
studyId = "111", annotator = "a@b.com",
annotations = new[]
{
new
{
questionKey = parentQ.ToString(), answer = "Yes",
children = new[] { new { questionKey = childQ.ToString(), answer = "5" } }
}
}
}
}
});
var mapping = new AnnotationImportMapping(
"studyId", StudyIdentifierType.SyRFStudyId,
"annotator", AnnotatorIdentifierType.Email,
[new QuestionFieldMapping(parentQ.ToString(), parentQ, false),
new QuestionFieldMapping(childQ.ToString(), childQ, false)]);
var sut = new JsonAnnotationParser();
var records = await sut.ParseRecordsAsync(ToStream(json), mapping).ToListAsync();
Assert.Single(records);
var parentAnnotation = records[0].Annotations[0];
Assert.Equal("Yes", parentAnnotation.RawValue);
Assert.Single(parentAnnotation.Children);
Assert.Equal("5", parentAnnotation.Children[0].RawValue);
}
}
Step 2: Run to verify failure
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "JsonAnnotationParserTests" -v
Step 3: Implement JsonAnnotationParser
// JsonAnnotationParser.cs
using System.Runtime.CompilerServices;
using System.Text.Json;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
public class JsonAnnotationParser : IAnnotationFileParser
{
public ImportFileFormat Format => ImportFileFormat.Json;
public async Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default)
{
var doc = await JsonDocument.ParseAsync(content, cancellationToken: ct);
var root = doc.RootElement;
var studies = root.GetProperty("studies").EnumerateArray().Take(3).ToList();
var topFields = studies.FirstOrDefault() is { ValueKind: JsonValueKind.Object } first
? first.EnumerateObject().Select(p => p.Name).ToList()
: new List<string>();
var sampleRows = studies.Select(s =>
(IReadOnlyList<string>) s.EnumerateObject()
.Select(p => p.Value.ToString())
.ToList()).ToList();
return new ParsedStructure(topFields, sampleRows);
}
public async IAsyncEnumerable<ImportRecord> ParseRecordsAsync(
Stream content,
AnnotationImportMapping mapping,
[EnumeratorCancellation] CancellationToken ct = default)
{
var doc = await JsonDocument.ParseAsync(content, cancellationToken: ct);
foreach (var study in doc.RootElement.GetProperty("studies").EnumerateArray())
{
ct.ThrowIfCancellationRequested();
var studyId = study.GetProperty(mapping.StudyIdentifierField).GetString() ?? "";
var annotator = study.GetProperty(mapping.AnnotatorIdentifierField).GetString() ?? "";
var annotations = ParseAnnotations(
study.TryGetProperty("annotations", out var ann) ? ann : default, mapping);
yield return new ImportRecord(studyId, annotator, annotations);
}
}
private static List<ImportAnnotation> ParseAnnotations(
JsonElement annotationsEl, AnnotationImportMapping mapping)
{
if (annotationsEl.ValueKind != JsonValueKind.Array) return [];
return annotationsEl.EnumerateArray()
.Select(a =>
{
var key = a.TryGetProperty("questionKey", out var k) ? k.GetString() ?? "" : "";
var value = a.TryGetProperty("answer", out var v) ? v.ToString() : null;
var children = a.TryGetProperty("children", out var ch)
? ParseAnnotations(ch, mapping)
: new List<ImportAnnotation>();
return new ImportAnnotation(key, value, children);
})
.ToList();
}
}
Step 4: Implement YamlAnnotationParser
// YamlAnnotationParser.cs
using System.Runtime.CompilerServices;
using System.Text.Json;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using YamlDotNet.Serialization;
namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
public class YamlAnnotationParser(JsonAnnotationParser jsonParser) : IAnnotationFileParser
{
public ImportFileFormat Format => ImportFileFormat.Yaml;
private static Stream ConvertToJson(Stream yaml)
{
using var reader = new StreamReader(yaml, leaveOpen: true);
var yamlContent = reader.ReadToEnd();
var deserializer = new DeserializerBuilder().Build();
var obj = deserializer.Deserialize(yamlContent);
var serializer = new SerializerBuilder()
.JsonCompatible().Build();
var json = serializer.Serialize(obj);
return new MemoryStream(System.Text.Encoding.UTF8.GetBytes(json));
}
public Task<ParsedStructure> ParseStructureAsync(Stream content, CancellationToken ct = default)
=> jsonParser.ParseStructureAsync(ConvertToJson(content), ct);
public IAsyncEnumerable<ImportRecord> ParseRecordsAsync(Stream content,
AnnotationImportMapping mapping, [EnumeratorCancellation] CancellationToken ct = default)
=> jsonParser.ParseRecordsAsync(ConvertToJson(content), mapping, ct);
}
Step 5: Run tests
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "JsonAnnotationParserTests" -v
Step 6: Commit
git add src/libs/project-management/
git commit -m "feat(annotation-import): add JSON and YAML annotation file parsers"
Task 7: AnnotationTreeBuilder¶
Pure domain logic — takes parsed ImportRecords, the stage question tree, and resolutions, and produces a List<Annotation> with correct ParentId/Children links. Most important algorithmic piece. No I/O.
Files:
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/AnnotationTreeBuilder.cs
- Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/AnnotationTreeBuilderTests.cs
Step 1: Write failing tests
// AnnotationTreeBuilderTests.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Model.StudyAggregate;
using SyRF.ProjectManagement.Core.Model.ProjectAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;
public class AnnotationTreeBuilderTests
{
private static Guid ProjectId = Guid.NewGuid();
private static Guid StageId = Guid.NewGuid();
private static Guid StudyId = Guid.NewGuid();
private static Guid AnnotatorId = Guid.NewGuid();
// Helper: build a minimal question tree entry
private static AnnotationQuestion MakeQuestion(Guid id, string type, Guid? parentId = null)
=> new() { Id = id, QuestionType = type, AnswerArray = false /* adjust as needed */ };
[Fact]
public void Build_SingleRootQuestion_ProducesRootAnnotation()
{
var qId = Guid.NewGuid();
var importAnnotations = new List<ImportAnnotation>
{
new(qId.ToString(), "Yes", [])
};
var questions = new List<AnnotationQuestion> { MakeQuestion(qId, "boolean") };
var mapping = new AnnotationImportMapping("", StudyIdentifierType.SyRFStudyId,
"", AnnotatorIdentifierType.Email,
[new QuestionFieldMapping(qId.ToString(), qId, false)]);
var result = AnnotationTreeBuilder.Build(
ProjectId, StageId, StudyId, AnnotatorId,
importAnnotations, questions, mapping, resolutions: null);
Assert.Single(result);
Assert.True(result[0].Root);
Assert.Null(result[0].ParentId);
}
[Fact]
public void Build_ParentAndChild_LinksParentIdAndChildren()
{
var parentQId = Guid.NewGuid();
var childQId = Guid.NewGuid();
var importAnnotations = new List<ImportAnnotation>
{
new(parentQId.ToString(), "Yes",
[
new ImportAnnotation(childQId.ToString(), "5", [])
])
};
var questions = new List<AnnotationQuestion>
{
MakeQuestion(parentQId, "boolean"),
MakeQuestion(childQId, "integer", parentQId)
};
var mapping = new AnnotationImportMapping("", StudyIdentifierType.SyRFStudyId,
"", AnnotatorIdentifierType.Email,
[new QuestionFieldMapping(parentQId.ToString(), parentQId, false),
new QuestionFieldMapping(childQId.ToString(), childQId, false)]);
var result = AnnotationTreeBuilder.Build(
ProjectId, StageId, StudyId, AnnotatorId,
importAnnotations, questions, mapping, resolutions: null);
Assert.Equal(2, result.Count);
var parent = result.Single(a => a.Root);
var child = result.Single(a => !a.Root);
Assert.Equal(parent.Id, child.ParentId);
Assert.Contains(child.Id, parent.Children);
}
[Fact]
public void Build_NullValue_SkipsAnnotation()
{
var qId = Guid.NewGuid();
var importAnnotations = new List<ImportAnnotation>
{
new(qId.ToString(), null, []) // null = not answered
};
var questions = new List<AnnotationQuestion> { MakeQuestion(qId, "string") };
var mapping = new AnnotationImportMapping("", StudyIdentifierType.SyRFStudyId,
"", AnnotatorIdentifierType.Email,
[new QuestionFieldMapping(qId.ToString(), qId, false)]);
var result = AnnotationTreeBuilder.Build(
ProjectId, StageId, StudyId, AnnotatorId,
importAnnotations, questions, mapping, resolutions: null);
Assert.Empty(result);
}
}
Step 2: Run to verify failure
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "AnnotationTreeBuilderTests" -v
Step 3: Implement AnnotationTreeBuilder
Read Annotation.cs, BoolAnnotation.cs, StringAnnotation.cs, IntAnnotation.cs to understand the type hierarchy before implementing. The builder must instantiate the correct concrete subclass based on the question's QuestionType.
// AnnotationTreeBuilder.cs
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Model.ProjectAggregate;
using SyRF.ProjectManagement.Core.Model.StudyAggregate;
namespace SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
public static class AnnotationTreeBuilder
{
public static List<Annotation> Build(
Guid projectId, Guid stageId, Guid studyId, Guid annotatorId,
IReadOnlyList<ImportAnnotation> importAnnotations,
IReadOnlyList<AnnotationQuestion> stageQuestions,
AnnotationImportMapping mapping,
AnnotationImportResolutions? resolutions)
{
var result = new List<Annotation>();
var questionById = stageQuestions.ToDictionary(q => q.Id);
var mappingByKey = mapping.QuestionMappings.ToDictionary(m => m.SourceField);
BuildDepthFirst(
importAnnotations, questionById, mappingByKey,
projectId, stageId, studyId, annotatorId,
parentAnnotationId: null, result);
return result;
}
private static void BuildDepthFirst(
IReadOnlyList<ImportAnnotation> annotations,
Dictionary<Guid, AnnotationQuestion> questionById,
Dictionary<string, QuestionFieldMapping> mappingByKey,
Guid projectId, Guid stageId, Guid studyId, Guid annotatorId,
Guid? parentAnnotationId,
List<Annotation> result)
{
foreach (var importAnnotation in annotations)
{
if (importAnnotation.RawValue is null) continue;
if (!mappingByKey.TryGetValue(importAnnotation.QuestionKey, out var qm)) continue;
if (!questionById.TryGetValue(qm.QuestionId, out var question)) continue;
var annotation = CreateAnnotation(
question, importAnnotation.RawValue, qm,
projectId, stageId, studyId, annotatorId,
isRoot: parentAnnotationId is null,
parentId: parentAnnotationId);
if (annotation is null) continue;
// Wire parent → child
if (parentAnnotationId.HasValue)
{
var parent = result.First(a => a.Id == parentAnnotationId.Value);
parent.Children.Add(annotation.Id);
}
result.Add(annotation);
// Recurse for children
if (importAnnotation.Children.Count > 0)
BuildDepthFirst(importAnnotation.Children, questionById, mappingByKey,
projectId, stageId, studyId, annotatorId, annotation.Id, result);
}
}
private static Annotation? CreateAnnotation(
AnnotationQuestion question, string rawValue, QuestionFieldMapping qm,
Guid projectId, Guid stageId, Guid studyId, Guid annotatorId,
bool isRoot, Guid? parentId)
{
var baseProps = new
{
Id = Guid.NewGuid(),
StudyId = studyId, StageId = stageId,
AnnotatorId = annotatorId, ProjectId = projectId,
QuestionId = question.Id, Question = question.QuestionText ?? "",
Root = isRoot, ParentId = parentId
};
// Read Annotation.cs subclasses to confirm property names before finalising
return question.QuestionType switch
{
"boolean" when bool.TryParse(rawValue, out var b)
=> new BoolAnnotation { /* set all props */ Answer = b },
"integer" when int.TryParse(rawValue, out var i)
=> new IntAnnotation { Answer = i },
"decimal" when decimal.TryParse(rawValue, out var d)
=> new DecimalAnnotation { Answer = d },
"string" => new StringAnnotation { Answer = rawValue },
// Array types: split rawValue on delimiter
"boolean" when question.AnswerArray
=> new BoolArrayAnnotation { Answer = rawValue.Split(qm.ArrayDelimiter)
.Select(v => bool.TryParse(v.Trim(), out var b) ? b : false).ToList() },
_ => null // Unknown type — skip
};
// Note: set StudyId, StageId, AnnotatorId, ProjectId, QuestionId, Question, Root, ParentId
// on each branch. Extract a private helper to avoid repetition.
}
}
Note to implementer: The
CreateAnnotationmethod needs to set all baseAnnotationproperties on each concrete subclass. Read the actualAnnotation.csbase class before writing this — the property names may differ from the above sketch. Extract avoid SetBaseProps(Annotation a, ...)helper to keep it DRY.
Step 4: Run tests
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "AnnotationTreeBuilderTests" -v
Step 5: Commit
git add src/libs/project-management/
git commit -m "feat(annotation-import): add AnnotationTreeBuilder with depth-first tree reconstruction"
Task 8: AnnotationImportValidationService¶
Resolves study IDs, annotator identifiers, validates tree structure (orphans + conditional warnings), produces AnnotationValidationResult. Depends on IAnnotationFileParser.
Files:
- Create: src/libs/project-management/SyRF.ProjectManagement.Core/Services/AnnotationImportServices/AnnotationImportValidationService.cs
- Test: src/libs/project-management/SyRF.ProjectManagement.Core.Tests/AnnotationImport/AnnotationImportValidationServiceTests.cs
Step 1: Write failing tests
// AnnotationImportValidationServiceTests.cs
using NSubstitute;
using SyRF.ProjectManagement.Core.Interfaces;
using SyRF.ProjectManagement.Core.Model.AnnotationImportJobAggregate;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using Xunit;
public class AnnotationImportValidationServiceTests
{
[Fact]
public async Task Validate_UnmatchedStudyId_ReportedInResult()
{
var studyRepo = Substitute.For<IStudyRepository>();
studyRepo.GetByCustomIdAsync("MISSING", Arg.Any<Guid>(), Arg.Any<CancellationToken>())
.Returns((Study?)null);
// ... set up parser to return one record with studyId="MISSING"
// ... assert result.UnmatchedStudyIds.Contains("MISSING")
}
[Fact]
public async Task Validate_ExistingAnnotations_ReportedAsConflict()
{
// ... study has existing annotations for the annotator+stage
// ... assert result.Conflicts has one entry with correct ExistingCount
}
[Fact]
public async Task Validate_OrphanedChildAnnotation_ReportedAsOrphan()
{
// ... import record has child question key but no parent question key
// ... assert result.OrphanedChildren has one entry
}
}
Step 2: Implement the service
The service signature:
public class AnnotationImportValidationService(
IStudyRepository studyRepository,
IInvestigatorRepository investigatorRepository,
IAnnotationFileParser[] parsers)
{
public async Task<AnnotationValidationResult> ValidateAsync(
AnnotationImportJob job,
Stream fileContent,
IReadOnlyList<AnnotationQuestion> stageQuestions,
CancellationToken ct = default)
{ ... }
}
Logic:
1. Select parser by job.File.Format
2. Stream records via ParseRecordsAsync
3. Per record: resolve study ID, resolve annotator → build clean/conflict/orphan/warning buckets
4. Return AnnotationValidationResult
For orphan detection: check if a child ImportAnnotation has a non-null value but its parent ImportAnnotation in the same record has a null value. Use the stage question tree to determine parent-child relationships.
Step 3: Run tests and iterate until passing
dotnet test src/libs/project-management/SyRF.ProjectManagement.Core.Tests/ \
--filter "AnnotationImportValidationServiceTests" -v
Step 4: Commit
git add src/libs/project-management/
git commit -m "feat(annotation-import): add AnnotationImportValidationService"
Task 9: MassTransit Validation Consumer¶
Files:
- Create: src/services/project-management/SyRF.ProjectManagement.Endpoint/Consumers/AnnotationImportValidationConsumer.cs
- Create: src/libs/project-management/SyRF.ProjectManagement.Messages/Commands/IValidateAnnotationImportCommand.cs
- Test: src/services/project-management/SyRF.ProjectManagement.Endpoint.Tests/AnnotationImportValidationConsumerTests.cs
Step 1: Create the command message
// IValidateAnnotationImportCommand.cs
namespace SyRF.ProjectManagement.Messages.Commands;
public interface IValidateAnnotationImportCommand
{
Guid JobId { get; }
Guid ProjectId { get; }
}
Step 2: Write failing test
// AnnotationImportValidationConsumerTests.cs
// Follow the pattern in ReferenceFileParseJobConsumerTests.cs
// Test that: consumer loads job, downloads file from GCS, calls validation service,
// updates job with result
Step 3: Implement the consumer
// AnnotationImportValidationConsumer.cs
using MassTransit;
using SyRF.AppServices.FileServices;
using SyRF.ProjectManagement.Core.Interfaces;
using SyRF.ProjectManagement.Core.Services.AnnotationImportServices;
using SyRF.ProjectManagement.Messages.Commands;
namespace SyRF.ProjectManagement.Endpoint.Consumers;
public class AnnotationImportValidationConsumer(
IGcsStorageService gcs,
IPmUnitOfWork uow,
AnnotationImportValidationService validationService)
: IConsumer<IValidateAnnotationImportCommand>
{
public async Task Consume(ConsumeContext<IValidateAnnotationImportCommand> context)
{
var job = await uow.AnnotationImportJobs.GetAsync(context.Message.JobId);
if (job is null) return;
try
{
await using var fileStream = await gcs.DownloadAsync(job.File.GcsObjectName, context.CancellationToken);
var stageQuestions = await GetStageQuestionsAsync(job.ProjectId, job.StageId, context.CancellationToken);
var result = await validationService.ValidateAsync(job, fileStream, stageQuestions, context.CancellationToken);
job.SetValidationResult(result);
}
catch (Exception ex)
{
job.MarkFailed(ex.Message);
}
await uow.SaveAsync();
}
private async Task<IReadOnlyList<AnnotationQuestion>> GetStageQuestionsAsync(
Guid projectId, Guid stageId, CancellationToken ct)
{
var project = await uow.Projects.GetAsync(projectId);
var stage = project?.Stages.FirstOrDefault(s => s.Id == stageId);
return stage?.AnnotationQuestions ?? [];
}
}
Step 4: Register the consumer — add to the MassTransit registration in Program.cs / host configuration of the project-management service. Follow the pattern of ReferenceFileParseJobConsumer.
Step 5: Run tests
dotnet test src/services/project-management/SyRF.ProjectManagement.Endpoint.Tests/ \
--filter "AnnotationImportValidation" -v
Step 6: Build full solution
Step 7: Commit
git add src/services/project-management/ src/libs/project-management/SyRF.ProjectManagement.Messages/
git commit -m "feat(annotation-import): add MassTransit validation consumer"
Task 10: API Endpoints¶
Five endpoints in the project-management service. Follow the pattern of ReviewController.cs and the data export endpoints.
Files:
- Create: src/services/project-management/SyRF.ProjectManagement.Endpoint/Controllers/AnnotationImportController.cs
- Create: src/services/project-management/SyRF.ProjectManagement.Endpoint/Models/AnnotationImport/ (DTOs)
Step 1: Create DTOs
// CreateAnnotationImportDto.cs — returned on POST (upload)
public record CreateAnnotationImportResponseDto(
Guid JobId,
string Format,
ParsedStructureDto ParsedStructure,
IReadOnlyList<QuestionNodeDto> StageQuestions); // tree
// ParsedStructureDto.cs
public record ParsedStructureDto(
IReadOnlyList<string> Fields,
IReadOnlyList<IReadOnlyList<string>> SampleRows);
// QuestionNodeDto.cs (recursive tree)
public record QuestionNodeDto(
Guid Id, string Text, string Type, bool IsArrayType,
IReadOnlyList<QuestionNodeDto> Children);
// SubmitMappingDto.cs — body of POST /mapping
public record SubmitMappingDto(
string StudyIdentifierField, string StudyIdentifierType,
string AnnotatorIdentifierField, string AnnotatorIdentifierType,
IReadOnlyList<QuestionFieldMappingDto> QuestionMappings);
// AnnotationImportJobStatusDto.cs — returned from GET
public record AnnotationImportJobStatusDto(
Guid JobId, string Status,
ValidationResultDto? ValidationResult,
ImportStatsDto? Stats,
string? ErrorMessage);
Step 2: Implement controller
[ApiController]
[Route("api/projects/{projectId}")]
public class AnnotationImportController(
IPmUnitOfWork uow,
IGcsStorageService gcs,
IPublishEndpoint publishEndpoint,
IAnnotationFileParserResolver parserResolver,
GcsSettings gcsSettings) : ControllerBase
{
// POST stages/{stageId}/annotation-imports
[HttpPost("stages/{stageId}/annotation-imports")]
public async Task<IActionResult> Upload(Guid projectId, Guid stageId, IFormFile file) { ... }
// POST annotation-imports/{jobId}/mapping
[HttpPost("annotation-imports/{jobId}/mapping")]
public async Task<IActionResult> SubmitMapping(Guid projectId, Guid jobId,
[FromBody] SubmitMappingDto dto) { ... }
// GET annotation-imports/{jobId}
[HttpGet("annotation-imports/{jobId}")]
public async Task<IActionResult> GetStatus(Guid projectId, Guid jobId) { ... }
// POST annotation-imports/{jobId}/confirm
[HttpPost("annotation-imports/{jobId}/confirm")]
public async Task<IActionResult> Confirm(Guid projectId, Guid jobId,
[FromBody] AnnotationImportResolutionsDto resolutions) { ... }
// DELETE annotation-imports/{jobId}
[HttpDelete("annotation-imports/{jobId}")]
public async Task<IActionResult> Cancel(Guid projectId, Guid jobId) { ... }
}
The Confirm endpoint re-reads the file from GCS, iterates studies in the validation result applying resolutions, calls AnnotationTreeBuilder.Build per (study, annotator) group, calls study.AddSessionData(), saves, then deletes the GCS file.
Step 3: Write controller tests following the pattern of existing controller tests in the endpoint test project.
Step 4: Build and run tests
dotnet build src/services/project-management/project-management.slnf
dotnet test src/services/project-management/SyRF.ProjectManagement.Endpoint.Tests/ \
--filter "AnnotationImportController" -v
Step 5: Commit
git add src/services/project-management/
git commit -m "feat(annotation-import): add API endpoints (upload, mapping, status, confirm, cancel)"
Task 11: Angular UI — Import Wizard¶
Add an annotation import wizard to the stage admin area. Uses Angular Material Stepper.
Files:
- Create: src/services/web/src/app/stage/stage-admin/annotation-import/ (directory)
- annotation-import.component.ts
- annotation-import.component.html
- annotation-import.component.scss
- annotation-import.component.spec.ts
- annotation-import.service.ts
- annotation-import.service.spec.ts
- Modify: src/services/web/src/app/stage/stage-admin/ (add route + nav entry)
Step 1: Write failing service test
// annotation-import.service.spec.ts
import { TestBed } from '@angular/core/testing';
import { HttpTestingController, provideHttpClientTesting } from '@angular/common/http/testing';
import { AnnotationImportService } from './annotation-import.service';
describe('AnnotationImportService', () => {
let service: AnnotationImportService;
let http: HttpTestingController;
beforeEach(() => {
TestBed.configureTestingModule({
providers: [AnnotationImportService, provideHttpClientTesting()]
});
service = TestBed.inject(AnnotationImportService);
http = TestBed.inject(HttpTestingController);
});
it('should upload file and return job with parsed structure', () => {
const projectId = 'proj-1', stageId = 'stage-1';
const file = new File(['col1,col2\nv1,v2'], 'test.csv', { type: 'text/csv' });
service.upload(projectId, stageId, file).subscribe(result => {
expect(result.jobId).toBe('job-1');
expect(result.parsedStructure.fields).toEqual(['col1', 'col2']);
});
const req = http.expectOne(
`api/projects/${projectId}/stages/${stageId}/annotation-imports`);
expect(req.request.method).toBe('POST');
req.flush({ jobId: 'job-1', format: 'Csv',
parsedStructure: { fields: ['col1', 'col2'], sampleRows: [] },
stageQuestions: [] });
});
});
Step 2: Run to verify failure
Step 3: Implement the service
// annotation-import.service.ts
import { Injectable, inject } from '@angular/core';
import { HttpClient } from '@angular/common/http';
import { Observable, interval, switchMap, takeWhile } from 'rxjs';
@Injectable({ providedIn: 'root' })
export class AnnotationImportService {
readonly #http = inject(HttpClient);
upload(projectId: string, stageId: string, file: File): Observable<ImportJobResponse> {
const form = new FormData();
form.append('file', file);
return this.#http.post<ImportJobResponse>(
`api/projects/${projectId}/stages/${stageId}/annotation-imports`, form);
}
submitMapping(projectId: string, jobId: string, mapping: SubmitMappingDto): Observable<void> {
return this.#http.post<void>(
`api/projects/${projectId}/annotation-imports/${jobId}/mapping`, mapping);
}
pollStatus(projectId: string, jobId: string): Observable<ImportJobStatus> {
return interval(2000).pipe(
switchMap(() => this.#http.get<ImportJobStatus>(
`api/projects/${projectId}/annotation-imports/${jobId}`)),
takeWhile(s => s.status === 'Validating', inclusive: true));
}
confirm(projectId: string, jobId: string, resolutions: ImportResolutionsDto): Observable<ImportStats> {
return this.#http.post<ImportStats>(
`api/projects/${projectId}/annotation-imports/${jobId}/confirm`, resolutions);
}
cancel(projectId: string, jobId: string): Observable<void> {
return this.#http.delete<void>(
`api/projects/${projectId}/annotation-imports/${jobId}`);
}
}
Step 4: Implement the stepper component
Use MatStepper with four steps matching the UX flow. Each step is a child component for clarity:
- Step 1: <app-import-upload> — drag-and-drop file input, shows format after upload
- Step 2: <app-import-mapping> — tree-rendered question list with column dropdowns
- Step 3: <app-import-validation> — polling spinner → conflict resolution table
- Step 4: <app-import-complete> — stats summary
Use Angular signals for state: jobId = signal<string | null>(null), status = signal<ImportJobStatus | null>(null).
Step 5: Run component tests
Step 6: Commit
git add src/services/web/src/app/stage/stage-admin/annotation-import/
git commit -m "feat(annotation-import): add Angular import wizard (upload, mapping, validation, commit)"
Task 12: GCS DI Registration + Config Wiring¶
Wire IGcsStorageService into DI for the project-management service. Add config section.
Files:
- Modify: src/services/project-management/SyRF.ProjectManagement.Endpoint/Program.cs (or Startup equivalent)
- Modify: src/services/project-management/SyRF.ProjectManagement.Endpoint/appsettings.json
Step 1: Add config
// appsettings.json — add under existing sections:
"Gcs": {
"AnnotationImportsBucket": "syrf-annotation-imports"
}
In staging/production, override via GCP Secret Manager / Kubernetes ConfigMap (follow existing pattern for S3Settings).
Step 2: Register in DI
// Program.cs
var gcsSettings = builder.Configuration.GetSection("Gcs").Get<GcsSettings>()!;
builder.Services.AddSingleton(gcsSettings);
builder.Services.AddSingleton(StorageClient.Create()); // Uses Workload Identity
builder.Services.AddScoped<IGcsStorageService, GcsFileService>();
Step 3: Build and smoke test
Step 4: Commit
git add src/services/project-management/SyRF.ProjectManagement.Endpoint/
git commit -m "feat(annotation-import): wire GCS DI registration and appsettings config"
Task 13: End-to-End Smoke Test¶
Verify the full flow works locally using the project-management service.
Manual test steps:
- Start the project-management service locally
- Pick an existing project + stage with at least 2 questions
- Export a small set of existing annotations using the existing data-export endpoint with
ImportCompatible=true - POST the exported CSV to
POST /api/projects/{id}/stages/{id}/annotation-imports - Verify: response contains correct column headers and
stageQuestionstree - POST column mapping to
/mapping— verify 202 response - Poll
GETuntilstatus == "ValidationComplete"— inspectvalidationResult - POST
/confirmwith resolutions — verify stats in response - Verify annotations appear on a study via the review endpoint
Step: Commit any fixes found during smoke test
Execution Options¶
Plan complete and saved to docs/features/annotation-import/implementation-plan.md.
1. Subagent-Driven (this session) — dispatch fresh subagent per task, review between tasks, fast iteration
2. Parallel Session (separate) — open new session with executing-plans, batch execution with checkpoints
Which approach?