Annotation Question Management & Reconciliation — Design Decisions¶

Purpose: This document captures all architectural and domain model decisions from the brainstorming sessions for this initiative. It is the authoritative reference for design intent. Where this document contradicts earlier spec files (reconciliation.md, data-model-migration.md, README.md), this document takes precedence and those specs must be updated to align.

Parent: Annotation Management & Reconciliation

Contradictions with Existing Spec Files¶

The following contradictions were identified between brainstorming decisions and the initial spec files, plus refinements from the annotation-versioning brainstorming (D37-D50). All 20 contradictions must be resolved in favour of the authoritative source before implementation begins.

Document precedence: annotation-versioning-design.md (D37-D50) supersedes this document (D1-D36), which in turn supersedes the original spec files.

Layer 1: Original 12 contradictions (this document supersedes spec files)¶

#	Spec File	Contradiction	Required Change	Resolution Status
1	`reconciliation.md`	Uses claim/release workflow (self-assigned tasks)	Replace with random assignment from pool (D9)	Resolved
2	`reconciliation.md`	Uses `Reconciled` flag on annotations as authority mechanism	Replace with Reconciliation Session model (D5)	Resolved
3	`reconciliation.md`	`DecisionSource` enum (AnnotatorA/AnnotatorB/New/Averaged) implies picking a candidate's answer	Remove — reconciler always creates own annotation (D7)	Resolved
4	`reconciliation.md`	`ReconciliationRecord` with `ClaimedBy`, `ClaimedAt` fields	Remove claiming fields; use random assignment pool (D9)	Resolved
5	`reconciliation.md`	No mention of screening reconciliation ordering	Add ordering constraint (D14)	Resolved
6	`reconciliation.md`	No candidate blinding requirement	Add as system invariant (D10)	Resolved
7	`data-model-migration.md`	`ReconciliationRecord` schema differs from `reconciliation.md`	Replace both with Reconciliation Session model (D5)	Resolved
8	`data-model-migration.md`	No `FinalAnnotationOutcome` concept	Add Reconciliation Session entity and migration strategy	Resolved
9	`README.md`	API surface shows `/claim` and `/release` endpoints	Replace with pool/assignment endpoints (D9)	Resolved
10	`README.md`	Domain model diagram shows `ReconciliationRecord` embedded in ExtractionInfo	Replace with Reconciliation Session model (D5)	Resolved
11	`README.md`	Missing session model (Session → SessionVersion)	Add to proposed architecture (D3)	Resolved
12	`README.md`	Missing screening reconciliation ordering in milestones	Add dependency note (D14)	Resolved

Layer 2: D37-D50 contradictions (annotation-versioning-design.md supersedes this document)¶

#	Source	Contradiction	Required Change	Resolution Status
13	`design-decisions.md` S1 (AQVersion)	`AnswerType` and `ParentQuestionId` in AQVersion (versionable)	Move to AQ identity level — immutable after publish (D38)	Resolved
14	`design-decisions.md` S1	No DraftAQ concept	Add DraftAQ as separate type from AQ (D37)	Resolved
15	`design-decisions.md` S1	No `pendingChanges` / `pendingAnswer` concept	Add pending buffers for auto-save (D40, D46)	Resolved
16	`design-decisions.md` S1 / `README.md`	"No new collections are created"	Create `pmAnnotation` and `pmAnnotationSession` as separate collections (D41, D42)	Resolved
17	`design-decisions.md` S1	Annotation identity is composite `(StudyId, AnnotatorId, QuestionId)` only	Still true for candidates. Reconciliation annotations: `(StudyId, QuestionId)` with `annotatorId: null` (D45)	Resolved
18	`design-decisions.md` S2	ReconciliationAnswer as separate entity with Resolution enum	Reconciliation answers become AVs on reconciliation Annotations. `committedBy` and `stageId` on AV replace separate ReconciliationAnswer. (D45)	Resolved
19	`product-overview.md`	No auto-save mechanics described	Server-side auto-save via `pendingAnswer` (D46)	Resolved
20	`data-model-migration.md`	"Embedded-first, extract-if-needed" principle with all data in pmProject/pmStudy	Annotations and sessions move to own collections. Study holds references only (D41, D42, D50)	Resolved

Section 1: Entity Model — Identity + Immutable Versions¶

Every core entity follows the same pattern: a stable identity with append-only immutable versions. All entities are scope-aware, supporting cross-scope sharing (see Section 9).

Draft Annotation Question (DraftAQ)¶

Before activation, questions exist as DraftAQ entities: fully mutable factory objects that live on the Project aggregate. DraftAQs have no version history, no impact assessment requirements, and no downstream dependencies.

DraftAQ (factory — lives on Project aggregate)
├── DraftId: Guid                     ← temporary, replaced by AqId on activation
├── DataType: string                  ← mutable during draft phase
├── ParentId: Guid?                   ← mutable during draft phase
├── GroupAsSingle: boolean            ← mutable during draft phase
├── Text: string                      ← mutable
├── Options: List<AnswerOption>       ← mutable
├── HelpText: string?                 ← mutable
└── AnswerFilters: List<AnswerFilter> ← mutable

When a project admin activates a stage (or explicitly publishes a question), DraftAQs are converted to AQs. The DraftAQ ceases to exist; the AQ is born with its first AQVersion. See annotation-versioning-design.md Section 1.1 for the full rationale.

Annotation Question (AQ → AQVersion)¶

AnnotationQuestion (Identity — collection: pmAnnotationQuestion)
├── Id: Guid                           ← stable across versions
├── Scope: System | Organisation | Researcher | Project
├── OwnerId: Guid                     ← SystemId | OrgId | InvestigatorId | ProjectId
├── Category: QuestionCategory
├── DataType: AnswerType               ← IDENTITY: immutable after activation (D38)
├── ParentQuestionId: Guid?            ← IDENTITY: immutable after activation (D38)
├── GroupAsSingle: boolean             ← IDENTITY: immutable after activation (D38)
├── DerivedFrom: QuestionId?           ← null if original, source QuestionId if forked
├── PendingChanges: {Text, Options, HelpText, AnswerFilters} | null  ← mutable auto-save buffer (D40)
├── CurrentVersionNumber: int
└── Versions: List<AQVersion>          ← append-only

AQVersion (Immutable Snapshot — content properties only)
├── Id: Guid                           ← AQVersionId, globally unique
├── VersionNumber: int                 ← 1, 2, 3, ...
├── DerivedFrom: AQVersionId?          ← null if original, source AQVersionId if forked
├── QuestionText: string
├── Options: List<AnswerOption>
├── HelpText: string?
├── AnswerFilters: List<AnswerFilter>  ← conditional display / option filters
├── CreatedAt: DateTime
├── CreatedBy: Guid
└── ChangeReason: string?

Key rules:

Editing a question's content (text, options, helpText, answerFilters) always creates a new AQVersion. The AQVersion.Id (AQVersionId) is the globally unique identifier used by all downstream references.
Identity properties (DataType, ParentQuestionId, GroupAsSingle) are immutable after activation. Changing these would effectively create a different question — use a new AQ instead. See annotation-versioning-design.md Section 1.2 for rationale.
The PendingChanges buffer supports server-side auto-save of in-progress edits. The PA commits pending changes to create a new AQVersion. See Section 11 (D40) for details.
All AQs regardless of scope live in a single global collection (pmAnnotationQuestion). Scope + OwnerId provide access control. No routing logic needed — query by scope/owner.
Immutable AQVersions are safe to share across aggregates and scope boundaries (like Git objects, Docker layers). The DDD Shared Kernel pattern explicitly supports this.
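
The versioning rules above can be illustrated with a minimal sketch. It assumes the pending buffer merges successive auto-saves and that committing copies the latest AQVersion with the buffered fields applied; neither detail is specified here. The method names `auto_save` and `commit_pending` and the constant `CONTENT_FIELDS` are hypothetical, and AnswerFilters is left out for brevity.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID, uuid4

# Content fields that stay editable after activation (AnswerFilters omitted for brevity).
CONTENT_FIELDS = {"question_text", "options", "help_text"}


@dataclass(frozen=True)
class AQVersion:
    """Immutable content snapshot, mirroring the AQVersion tree above."""
    id: UUID
    version_number: int
    question_text: str
    options: tuple
    help_text: Optional[str]
    created_at: datetime
    created_by: UUID
    change_reason: Optional[str] = None


@dataclass
class AnnotationQuestion:
    id: UUID
    data_type: str                          # identity property, not editable here
    versions: list = field(default_factory=list)
    pending_changes: Optional[dict] = None  # auto-save buffer (D40)

    def auto_save(self, **content) -> None:
        """Buffer in-progress edits without creating a version (assumed merge semantics)."""
        unknown = set(content) - CONTENT_FIELDS
        if unknown:
            raise ValueError(f"not a content field: {sorted(unknown)}")
        self.pending_changes = {**(self.pending_changes or {}), **content}

    def commit_pending(self, committed_by: UUID, change_reason: Optional[str] = None) -> AQVersion:
        """Turn the buffer into a new immutable AQVersion and clear the buffer."""
        if not self.pending_changes:
            raise ValueError("no pending changes to commit")
        latest = self.versions[-1]
        new_version = replace(
            latest,
            id=uuid4(),
            version_number=latest.version_number + 1,
            created_at=datetime.now(timezone.utc),
            created_by=committed_by,
            change_reason=change_reason,
            **self.pending_changes,
        )
        self.versions.append(new_version)
        self.pending_changes = None
        return new_version
```

Because `AQVersion` is frozen, earlier versions cannot be altered after the commit; downstream references to an older AQVersionId keep resolving to the same content.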

Question Set → Question Set Version (QS → QSV)¶

The QuestionSet is the unified concept that replaces both StageQuestionSet (SQS) and QuestionTemplate/TemplateVersion from earlier designs. Within a project context, question sets are referred to as "Question Sets"; when browsing the cross-project catalogue, they are presented as "Templates" in the UI.

QuestionSet (Identity — collection: pmQuestionSet)
├── Id: Guid                           ← QuestionSetId, stable across versions
├── Scope: System | Organisation | Researcher | Project
├── OwnerId: Guid                     ← SystemId | OrgId | InvestigatorId | ProjectId
├── Name: string
├── Description: string?
├── DerivedFrom: QuestionSetId?        ← null if original, source QuestionSetId if forked
├── Published: bool                    ← visible in community catalogue when true
├── Metadata: QuestionSetMetadata?     ← optional catalogue fields (domain, species, tags, etc.)
├── ImportPolicy: AllOrNothing | SelectiveImport    ← optional, for shared sets
├── EditPolicy: Editable | ReadOnly | EditableWithWarning  ← derivation/fork policy
├── CurrentVersionNumber: int
└── Versions: List<QuestionSetVersion> ← append-only

QuestionSetVersion (Immutable — QSV)
├── Id: Guid                           ← QSVId, globally unique
├── VersionNumber: int                 ← per-set lineage: 1, 2, 3, ...
├── AQVersionIds: OrderedList<Guid>    ← the question versions, in display order
├── CreatedAt: DateTime
├── CreatedBy: Guid
└── ChangeReason: string?

Key rules:

QSV is immutable. Changing the question set (adding, removing, reordering, or upgrading a question version) creates a new QSV with a new QSVId and incremented VersionNumber.
QSV entries are AQVersionIds only. The QuestionId is derivable from the AQVersion when needed, but the QSV stores only the version references.
A stage references a QSVId to define its active question configuration. What happens to existing sessions when a new QSV is assigned is an admin decision, not system-prescribed. The admin chooses from configurable options (e.g., pin sessions to their starting QSV, require re-annotation, etc.).
EditPolicy is a fork/derivation policy, not about mutating the QS itself (which follows the immutable versioning pattern). It controls what happens when a project imports this question set: Editable allows free forking, ReadOnly prevents derivation, EditableWithWarning allows derivation with a warning.
Published promotes visibility to all authenticated users. Community catalogue = WHERE Published == true. See Section 9 for details.
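
As an illustration of QSV immutability, the sketch below models a QSV as a frozen value and produces a successor for any change, including a pure reorder. The helper `next_qsv` is hypothetical, and the audit fields (CreatedAt, CreatedBy, ChangeReason) are omitted for brevity.

```python
from dataclasses import dataclass, FrozenInstanceError
from uuid import UUID, uuid4


@dataclass(frozen=True)
class QuestionSetVersion:
    id: UUID                  # QSVId, globally unique
    version_number: int       # per-set lineage: 1, 2, 3, ...
    aq_version_ids: tuple     # ordered AQVersionIds (display order)


def next_qsv(current: QuestionSetVersion, aq_version_ids) -> QuestionSetVersion:
    """Any add, remove, reorder, or upgrade yields a new QSV; the old one is untouched."""
    return QuestionSetVersion(
        id=uuid4(),
        version_number=current.version_number + 1,
        aq_version_ids=tuple(aq_version_ids),
    )


a, b = uuid4(), uuid4()
v1 = QuestionSetVersion(uuid4(), 1, (a, b))
v2 = next_qsv(v1, [b, a])     # reordering alone creates version 2
```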

Parent Integrity Constraint¶

All ancestors (via ParentQuestionId hierarchy — not DerivedFrom) of any AQVersion in a QSV must also be present in the same QSV. This is enforced at QSV composition time, not at stage assignment time, because AQVersions are always created in the context of a QSV.

This constraint means that when a stage has child questions whose parents belong to another stage's question set, those parent AQVersions are necessarily included in both QSVs. Cross-stage question overlap is therefore structural (forced by parent integrity), not exceptional. This is an expected and common pattern.
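
A composition-time check for this constraint might look like the following sketch. It works at the QuestionId level (derived from each AQVersion in the QSV) and assumes a `parent_of` lookup built from ParentQuestionId; both the function name and the lookup shape are hypothetical.

```python
def missing_ancestors(qsv_question_ids: set, parent_of: dict) -> set:
    """Return ancestor QuestionIds that the QSV needs but does not contain.

    parent_of maps QuestionId -> ParentQuestionId (None for root questions).
    DerivedFrom links are deliberately ignored; only the parent hierarchy counts.
    """
    missing = set()
    for question_id in qsv_question_ids:
        visited = set()
        parent = parent_of.get(question_id)
        # Walk up the hierarchy; the visited set guards against malformed cycles.
        while parent is not None and parent not in visited:
            visited.add(parent)
            if parent not in qsv_question_ids:
                missing.add(parent)
            parent = parent_of.get(parent)
    return missing


parent_of = {"q1": None, "q2": "q1", "q3": "q2"}
gaps = missing_ancestors({"q3"}, parent_of)   # q3 alone needs both q2 and q1
```

A composer could either reject a QSV with a non-empty result or add the missing ancestors automatically, which is how cross-stage overlap arises.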

Annotation → AnnotationVersion¶

Annotation (Identity — collection: pmAnnotation)
├── StudyId: Guid                      ┐
├── AnnotatorId: Guid | null           ├── composite identity (null for reconciliation, D45)
├── QuestionId: Guid                   ┘
├── PendingAnswer: {Value, Notes} | null  ← mutable auto-save buffer (D46)
├── CurrentVersionNumber: int
└── Versions: List<AnnotationVersion>  ← append-only (embedded in same document, D43)

AnnotationVersion (AV — Immutable)
├── Id: Guid                           ← AnnotationVersionId, globally unique
├── VersionNumber: int
├── AQVersionId: Guid                  ← which question version was answered
├── QSVId: Guid                        ← which question set version was active
├── Answer: TypedAnswer                ← bool/int/decimal/string/arrays
├── Notes: string?
├── CommittedBy: Guid                  ← who committed (important for reconciliation, D45)
├── StageId: Guid                      ← which stage this was committed from (D45)
├── CreatedAt: DateTime
├── CreatedByAction: "save" | "complete" | "impact-update"
└── SessionVersionId: Guid             ← which session submission included this

Key rules:

Candidate annotation identity is study-scoped: (StudyId, AnnotatorId, QuestionId). Not stage-scoped — this supports cross-stage question sharing via explicit version references.
Reconciliation annotation identity is (StudyId, QuestionId) with AnnotatorId: null (D45). Authorship is tracked per-AV via CommittedBy. The gold standard has no single owner — multiple reconcilers may contribute AVs across stages.
PendingAnswer is a mutable server-side auto-save buffer, updated with each auto-save (D46). Cleared on Save/Complete/Revert.
Every save/submit creates a new AnnotationVersion. Previous versions are preserved.
Each AnnotationVersion records the AQVersionId and QSVId that were active when the answer was collected, providing full audit traceability.
Annotations live in their own pmAnnotation collection (D41). The Study document holds reference arrays (annotationIds) only (D50). See Section 11 for rationale.
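
The identity and versioning rules above can be sketched as follows. The sketch assumes that Save commits whatever is in the PendingAnswer buffer; the `auto_save` and `commit` method names and the plain-dict AV shape are hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import uuid4


@dataclass
class Annotation:
    study_id: str
    annotator_id: Optional[str]             # None marks a reconciliation annotation (D45)
    question_id: str
    pending_answer: Optional[dict] = None   # auto-save buffer (D46)
    versions: list = field(default_factory=list)

    @property
    def key(self) -> tuple:
        """Composite identity; study-scoped, never stage-scoped."""
        return (self.study_id, self.annotator_id, self.question_id)

    def auto_save(self, value, notes: Optional[str] = None) -> None:
        self.pending_answer = {"value": value, "notes": notes}

    def commit(self, *, aq_version_id: str, qsv_id: str, committed_by: str,
               stage_id: str, action: str = "save") -> dict:
        """Append a new AV from the buffer and clear the buffer."""
        if self.pending_answer is None:
            raise ValueError("no pending answer to commit")
        av = {
            "id": str(uuid4()),
            "version_number": len(self.versions) + 1,
            "aq_version_id": aq_version_id,
            "qsv_id": qsv_id,
            "answer": self.pending_answer["value"],
            "notes": self.pending_answer["notes"],
            "committed_by": committed_by,
            "stage_id": stage_id,
            "created_by_action": action,
        }
        self.versions.append(av)
        self.pending_answer = None
        return av
```

A candidate annotation would be created as `Annotation("study-1", "annotator-7", "q1")`, while the gold-standard one for the same question is `Annotation("study-1", None, "q1")`, with authorship carried by `committed_by` on each AV.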

Session → SessionVersion¶

Session (Identity — collection: pmAnnotationSession)
├── Id: Guid                           ← SessionId
├── StageId: Guid
├── InvestigatorId: Guid
├── Reconciliation: boolean            ← is this a reconciliation session?
├── AnnotationIds: List<Guid>          ← references to Annotation entities in this session
├── CurrentVersionNumber: int
└── Versions: List<SessionVersion>     ← append-only (embedded in same document)

SessionVersion (ASV — Immutable)
├── Id: Guid                           ← SessionVersionId, globally unique
├── VersionNumber: int
├── Status: SessionStatus              ← Incomplete | Completed
├── QSVId: Guid                        ← which question set version this submission used
├── AnnotationAVMap: Map<AnnotationId, AVId>  ← pinned AV for each annotation at save time
├── CreatedAt: DateTime
├── CreatedByAction: "save" | "complete"
└── SubmittedAt: DateTime?

Key rules:

Each SessionVersion holds an explicit AnnotationAVMap that pins every annotation to a specific AV at save time, instead of relying on computed filters. There are no predicates or "latest" lookups: the session knows exactly which annotation versions it contains.
This replaces the current AnnotationSession.MatchingAnnotationsPredicate pattern (which uses computed filters based on Reconciled, StageId, AnnotatorId).
Sessions live in their own pmAnnotationSession collection (D42). The Study document holds reference arrays (sessionIds) only (D50). See Section 11 for rationale.
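
To make the pinning behaviour concrete, here is a minimal sketch in which a session version captures the latest AV of each annotation at save time. The function `snapshot_session` and the `av_history` input shape (AnnotationId mapped to its append-only list of AV ids) are hypothetical.

```python
def snapshot_session(av_history: dict, *, version_number: int, status: str, qsv_id: str) -> dict:
    """Build a SessionVersion-like record with an explicit AnnotationAVMap."""
    return {
        "version_number": version_number,
        "status": status,
        "qsv_id": qsv_id,
        # Pin the AV that is current right now; later AVs do not affect this record.
        "annotation_av_map": {
            annotation_id: avs[-1] for annotation_id, avs in av_history.items() if avs
        },
    }


history = {"ann-1": ["av-1"], "ann-2": ["av-7", "av-8"]}
v1 = snapshot_session(history, version_number=1, status="Incomplete", qsv_id="qsv-1")
history["ann-1"].append("av-2")   # a later save creates a new AV
v2 = snapshot_session(history, version_number=2, status="Completed", qsv_id="qsv-1")
```

Here `v1` still pins `ann-1` to `av-1` after the later save, which is the property that computed "latest" filters cannot provide.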

Stages Are Unordered¶

Stages do not have an inherent sequential order. They can be concurrent and orthogonal — e.g., one stage focused on risk of bias, another on data extraction, running in parallel. The system must not assume stage ordering for any authority or precedence logic.

Section 2: Reconciliation Session — Gold Standard Authority¶

Why Not Reconciled Annotations?¶

The current system uses a Reconciled: bool flag on annotations to mark reconciliation answers. The brainstorming established that this approach has fundamental problems:

Duplication: In 3 of 4 scenarios (SingleAnnotator, CandidateAgreement, ManualReconciliation-with-agreement), the system would need to create reconciled annotations that are identical copies of existing annotations.
Authority ambiguity: A Reconciled flag doesn't distinguish between "this annotation was auto-promoted because it was the only one" vs "a reconciler reviewed this and approved it".
Inconsistent with screening: The screening annotation feature uses FinalScreeningOutcome for authority — annotation should follow the same pattern.

The Reconciliation Session Model¶

Each study has a single reconciliation session that acts as the gold standard. Each time a reconciler reconciles a stage, the system creates a new version of this session. The latest version IS the project-level authoritative answer set for that study.

The reconciliation form presents all AQs in the project (the "Project Question Set"). A question is required on the form only if it is both in the current stage's QSV and defined as required. AQs that are defined as required but do not belong to the current stage's QSV become optional, so reconcilers can see and optionally engage with the full project context.

Each stage's reconciliation therefore adds another version to the same session. Over time, every AQ that belongs to a stage gains a reconciled answer, and reconcilers always have the full context of previous answers available.

The reconciliation session model retains its role as the gold standard authority. However, reconciliation answers are now implemented as AnnotationVersions (AVs) on reconciliation Annotations rather than as a separate ReconciliationAnswer entity. The committedBy and stageId fields on AV (see Section 1) replace the separate ResolvedById and OriginStageId fields that were previously on ReconciliationAnswer.

ReconciliationSession (per Study — follows Identity + Versions pattern)
├── CurrentVersionNumber: int
└── Versions: List<ReconciliationSessionVersion>   ← append-only

ReconciliationSessionVersion (Immutable — equivalent to a reconciliation ASV)
├── Id: Guid                           ← globally unique
├── VersionNumber: int
├── ReconciledStageId: Guid            ← which stage's reconciliation produced this version
├── ReconcilerId: Guid                 ← who reconciled
├── QSVId: Guid                        ← the stage's active QSV (audit)
├── AnnotationAVMap: Map<AnnotationId, AVId>  ← reconciliation Annotation → authoritative AV
├── ResolutionMetadata: Map<QuestionId, ResolutionMeta>  ← per-question resolution tracking
├── SubmittedAt: DateTime
└── ChangeReason: string?

ResolutionMeta (per-question metadata — replaces ReconciliationAnswer)
├── Resolution: Resolution             ← how this answer was determined
└── Rationale: string?                 ← optional per-answer rationale

The AnnotationAVMap maps each reconciliation Annotation (identified by annotationId) to its authoritative AV. Since reconciliation annotations have annotatorId: null (D45), each (studyId, questionId) pair has exactly one reconciliation Annotation whose latest AV is the gold standard.

Resolution enum:

Value	Meaning	AV committed by
`SingleAnnotator`	Only one annotator completed; auto-promoted	System (on behalf of the candidate)
`CandidateAgreement`	Multiple annotators agreed; reconciler bulk-approved	Reconciler (creating own AV with agreed answer)
`ManualReconciliation`	Annotators disagreed; reconciler manually resolved	Reconciler (creating own AV with their answer)

See annotation-versioning-design.md Section 2.7 for the full reconciliation annotation model.

How Versioning Works¶

When a reconciler reconciles Stage B for a study that already has a v1 (from Stage A):

Load v1's AnnotationAVMap (the current gold standard — a map of reconciliation annotations to their authoritative AVs)
Stage B's QSV questions are required — reconciler must provide answers
All other project questions are optional — previous answers shown as context
Reconciler submits → new AVs created on reconciliation Annotations for Stage B questions, with committedBy: reconcilerId and stageId: Stage B
Carried-forward entries from v1 retain their existing AVs (no new AV created for unchanged answers)
Write v2 with the complete merged AnnotationAVMap

The latest version's AnnotationAVMap is always a complete snapshot of the gold standard — no need to traverse version history.
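
The merge step can be sketched as a plain dictionary update. The function name `next_gold_standard` is hypothetical, and the maps use short string ids in place of Guids.

```python
def next_gold_standard(previous_map: dict, stage_avs: dict) -> dict:
    """Carry forward every previous entry, then apply this stage's new AVs.

    previous_map: AnnotationAVMap of the latest session version ({} if none exists).
    stage_avs:    reconciliation AnnotationId -> AV id committed in this submission.
    """
    merged = dict(previous_map)   # copy, so the earlier version stays immutable
    merged.update(stage_avs)      # re-answered questions point at their new AV
    return merged


v1_map = {"ann-q1": "av-1", "ann-q2": "av-2"}
stage_b = {"ann-q1": "av-3", "ann-q3": "av-4"}   # q1 overlaps with Stage A
v2_map = next_gold_standard(v1_map, stage_b)
# v2_map == {"ann-q1": "av-3", "ann-q2": "av-2", "ann-q3": "av-4"}
```

The result is a complete snapshot: `ann-q2` is carried forward unchanged, `ann-q1` now points at the Stage B reconciler's AV, and `v1_map` itself is not modified.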

Cross-Stage Overlap¶

For questions appearing in multiple stages (structurally forced by the parent integrity rule — see Section 1):

Stage A reconciler answers q1 → creates AV on q1's reconciliation Annotation → v1 maps {ann-q1: av-1}
Stage B reconciler sees v1's answer for q1 as context, plus Stage B's candidate annotations for q1
Stage B reconciler provides their own answer for q1 (it's required — q1 is in Stage B's QSV)
If they agree with Stage A's answer, they confirm it. If they disagree, they change it.
A new AV is created on q1's reconciliation Annotation with committedBy: Stage B reconciler, stageId: Stage B
v2 maps {ann-q1: av-2} — the Stage B reconciler's AV

Cross-stage disagreement is resolved through the natural act of reconciliation — no separate conflict detection or resolution workflow is needed. The reconciler sees the previous answer, considers the candidates, and makes a deliberate decision.

Key Properties¶

Not a DDD entity: The reconciliation session is a materialised decision record, not a traditional aggregate. It has no lifecycle or state transitions — each version is an immutable fact.
Uses AV mechanism: Rather than a separate ReconciliationAnswer entity, reconciliation answers are AVs on reconciliation Annotations (D45). This unifies the annotation model — both candidate and reconciliation answers are AVs, just on different Annotation entities.
Complete snapshot: Each version contains ALL authoritative answers (newly reconciled + carried forward). The latest version is self-sufficient.
Audit trail: Version history shows exactly how the gold standard evolved — which stage, which reconciler, what changed at each step.
Per-answer attribution: Each reconciliation AV records committedBy (which reconciler) and stageId (which stage), preserving full provenance even when answers are carried forward across versions. Resolution metadata (Resolution, Rationale) is tracked per-question in ResolutionMetadata.

Key Rules¶

The reconciliation session is per study — one gold standard per study for the entire project.
Once a reconciliation AV exists for a question, no further candidate sessions are accepted for that study/question in the originating stage.
Resolution metadata (Resolution, Rationale) lives on ResolutionMeta in the session version, never on the annotation itself. Annotations are pure data; the reconciliation session carries authority and resolution tracking.
The Rationale field defaults to optional. Admins can configure it as required per stage (MVP), with potential per-question configurability in the future.
Concurrent stage reconciliation: Handled by optimistic concurrency — if two reconcilers submit for the same study simultaneously, the second writer detects a version mismatch, reloads, merges (stage questions are mostly disjoint), and retries. This is rare in practice (stages reach coverage at different times).
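
The optimistic-concurrency rule can be sketched with an in-memory stand-in for the persistence layer. `InMemorySessionStore`, `VersionConflict`, and `submit` are hypothetical names, and the merge is the simple "carry forward, then apply stage answers" update; a real store would perform the version check atomically.

```python
class VersionConflict(Exception):
    """Raised when a writer's expected version is no longer the latest."""


class InMemorySessionStore:
    """Toy store holding the AnnotationAVMap of each session version, oldest first."""

    def __init__(self):
        self.versions = []

    def load(self):
        """Return (current version number, copy of the current AnnotationAVMap)."""
        current = dict(self.versions[-1]) if self.versions else {}
        return len(self.versions), current

    def append(self, expected_version: int, av_map: dict) -> None:
        if expected_version != len(self.versions):
            raise VersionConflict(f"expected v{expected_version}, found v{len(self.versions)}")
        self.versions.append(dict(av_map))


def submit(store: InMemorySessionStore, stage_avs: dict, max_attempts: int = 3) -> int:
    """Reload, merge, and retry until the write lands on the version that was read."""
    for _ in range(max_attempts):
        version, current = store.load()
        merged = {**current, **stage_avs}
        try:
            store.append(version, merged)
            return version + 1
        except VersionConflict:
            continue   # another reconciler wrote first; reload and merge again
    raise VersionConflict("gave up after repeated conflicts")
```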

Section 3: Authority Determination Rules¶

When Is Authority Determined?¶

Authority is determined per stage when a study reaches sufficient annotation coverage. The outcome is a new version of the study's reconciliation session (see Section 2):

CompletedSessions == 0
  → Not ready. No action.

CompletedSessions == 1, MinAnnotators == 1
  → Auto-promote (SingleAnnotator).
  → System creates a new ReconciliationSessionVersion,
    committing reconciliation AVs for this stage's questions
    that carry the candidate's answers (D45).
  → Resolution = SingleAnnotator, committedBy = System (on behalf of the candidate).

CompletedSessions == 1, MinAnnotators > 1
  → Not ready. Needs more annotators.

CompletedSessions >= 2
  → Reconciliation required (regardless of MinAnnotators value).
  → Study enters the reconciliation pool for this stage.

MinAnnotators Is a Readiness Threshold¶

MinAnnotators controls when a study is eligible for authority determination, not what kind of determination occurs. The actual completed session count determines the pathway:

If exactly 1 session completed and that meets MinAnnotators → auto-promote.
If 2+ sessions completed → reconciliation is always required, even if MinAnnotators was 1.

How can a study have more sessions than MinAnnotators? Admins assign annotators to stages, not individual studies. Multiple annotators working on the same stage will independently annotate the same studies. The system does not prevent this — it is expected and handled by the reconciliation pathway.
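
The decision tree in this section reduces to a small pure function. The sketch below follows the four cases exactly as listed and assumes MinAnnotators is at least 1; the `Pathway` enum and function name are hypothetical.

```python
from enum import Enum


class Pathway(Enum):
    NOT_READY = "not_ready"
    AUTO_PROMOTE = "auto_promote"   # Resolution = SingleAnnotator
    RECONCILE = "reconcile"         # study enters the reconciliation pool


def determine_pathway(completed_sessions: int, min_annotators: int) -> Pathway:
    """Map a stage's completed session count to an authority pathway."""
    if completed_sessions < 0 or min_annotators < 1:
        raise ValueError("completed_sessions must be >= 0 and min_annotators >= 1")
    if completed_sessions >= 2:
        # Two or more sessions always need a reconciler, even when MinAnnotators is 1.
        return Pathway.RECONCILE
    if completed_sessions == 1 and min_annotators == 1:
        return Pathway.AUTO_PROMOTE
    return Pathway.NOT_READY
```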

Section 4: Reconciliation Workflow¶

Random Assignment (Not Claiming)¶

Reconcilers are randomly assigned studies from the reconciliation pool. They do not browse or claim tasks.

Rationale:

Prevents cherry-picking (avoiding difficult studies)
Consistent with the screening annotation reconciliation pattern
Ensures fair workload distribution
Reduces bias in reconciliation outcomes

Workflow:

Reconciler opens the reconciliation dashboard for a stage
System presents a randomly-selected study from the unresolved pool
Reconciler sees the full Project Question Set:
Stage questions (required): candidate answers shown side-by-side (blinded)
Previously reconciled questions (context): answers from earlier reconciliation session versions shown as read-only context
Other project questions (optional): available for the reconciler to answer if they choose
Reconciler reviews and resolves (or skips with reason)
System creates a new ReconciliationSessionVersion, carrying forward previous answers and adding the new stage's reconciled answers
System presents the next random study
Pool shrinks as studies are resolved
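
A minimal sketch of the pool draw, assuming a uniform random choice over the unresolved study ids. The function name `next_study` is hypothetical, and the optional `rng` parameter exists only to make the draw reproducible in tests; the actual assignment algorithm is not specified here.

```python
import random
from typing import Optional


def next_study(pool: set, rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick one unresolved study uniformly at random, or None when the pool is empty."""
    if not pool:
        return None
    chooser = rng if rng is not None else random
    # Sort first so a seeded generator gives the same pick regardless of set ordering.
    return chooser.choice(sorted(pool))


pool = {"study-1", "study-2", "study-3"}
study = next_study(pool)
pool.discard(study)   # resolving (or skipping with reason) shrinks the pool
```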

Reconciler Always Creates Own Annotation¶

When reconciliation takes place (any scenario where a reconciler is involved), the reconciler always creates their own annotation, even if they agree with a candidate's answer.

Rationale:

Treating a candidate's AV as the authoritative answer when the reconciler agrees would arbitrarily elevate one candidate over the other in the system.
The reconciler's annotation represents a deliberate decision, distinct from the candidates' independent assessments.
Clean attribution: the authoritative AV recorded in the session version's AnnotationAVMap is always the reconciler's own work (D45).

How this works for agreed studies (bulk approve):

Reconciler reviews a batch of studies where all candidates agree.
On bulk approve, the system creates reconciliation annotations on behalf of the reconciler (copying the agreed answer, attributed to the reconciler).
The session version's AnnotationAVMap maps each question's reconciliation Annotation to the reconciler's newly created AV (D45).
Resolution = CandidateAgreement.

How this works for conflicted studies (manual resolution):

Reconciler manually reviews side-by-side comparisons.
Reconciler creates their own annotation for each question (may match a candidate or be entirely new).
The AnnotationAVMap entry for each question points to the reconciler's AV (D45).
Resolution = ManualReconciliation.

Candidate Blinding — System Invariant¶

Candidate annotators must never see other candidates' answers during annotation. This is a system invariant for independent assessment, not merely a reconciliation UX consideration.

During annotation: annotator sees only their own previous answers (if returning to an in-progress session).
After completing their session: annotator still cannot see other candidates' answers.
Only reconcilers (and admins) see the side-by-side comparison of candidate answers.

This blinding rule applies to all annotation questions at all times, regardless of whether reconciliation is configured or has occurred.

Cross-Stage Visibility — Admin Configurable¶

When a question has been reconciled in a previous stage, the admin can configure how (or whether) that answer is shown to annotators and reconcilers in subsequent stages:

Setting	Meaning
`Blind`	Annotators and reconcilers in this stage do not see reconciliation answers from other stages
`ShowOwnPrior`	Annotators see only their own prior answers from other stages (not the reconciled gold standard)
`ShowReconciled`	Reconcilers see the reconciled answer from other stages as context (annotators remain blinded)

This is configured per stage by the project admin. Default: ShowReconciled for reconcilers, Blind for annotators (preserves independent assessment while giving reconcilers maximum context).

Section 5: Agreement Metrics¶

Metric Definitions¶

Two metrics are computed:

Percent Agreement (all question types):

PercentAgreement = (agreed_questions / total_questions) × 100

Simple, intuitive, universally applicable.

Cohen's Kappa (categorical questions only):

κ = (Po - Pe) / (1 - Pe)
where:
  Po = observed agreement proportion
  Pe = expected agreement by chance

Accounts for chance agreement. Only meaningful for categorical (boolean, select, checklist) questions; reported as null for free-text and numeric questions.
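
Both formulas can be sketched for the two-annotator case, where each list holds one annotator's answers to the same items in the same order. Returning None when chance agreement equals 1 (both annotators always give the same single category, so the denominator is zero) is an assumption of this sketch; the function names are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def percent_agreement(a: Sequence, b: Sequence) -> float:
    """Share of paired answers that match, as a percentage."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty answer lists of equal length")
    agreed = sum(x == y for x, y in zip(a, b))
    return agreed / len(a) * 100


def cohens_kappa(a: Sequence, b: Sequence) -> Optional[float]:
    """Cohen's kappa for two annotators over categorical answers."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty answer lists of equal length")
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n              # observed agreement
    counts_a, counts_b = Counter(a), Counter(b)
    pe = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)   # chance agreement
    if pe == 1:
        return None   # kappa is undefined when 1 - Pe is zero
    return (po - pe) / (1 - pe)


annotator_a = ["yes", "yes", "no", "no"]
annotator_b = ["yes", "no", "no", "no"]
percent_agreement(annotator_a, annotator_b)   # 75.0
cohens_kappa(annotator_a, annotator_b)        # 0.5
```

In the example, Po = 3/4 and Pe = (2·1 + 2·3)/16 = 0.5, so κ = (0.75 − 0.5)/(1 − 0.5) = 0.5.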

Metric Granularity¶

Level	Computation
Per question, per study	Do all annotators agree on this question for this study?
Per question, across stage	Percent agreement and kappa across all studies with 2+ annotators
Per study	Percent of questions in agreement for this study
Stage aggregate	Average percent agreement across all studies

When Metrics Are Computed¶

On session completion: When a study reaches 2+ completed sessions, per-study metrics are computed.
On reconciliation resolution: Updated to reflect resolved state.
On demand: Admin can trigger recomputation for a stage.
Background: MassTransit consumer handles computation asynchronously.

Section 6: Screening Integration¶

Separate Processes, Aligned Patterns¶

Screening annotation reconciliation and data annotation reconciliation are separate processes with aligned patterns and shared infrastructure.

Aspect	Screening Reconciliation	Annotation Reconciliation
What is reconciled	Include/exclude decision + exclusion reasons	Annotation answers per question
Authority mechanism	`FinalScreeningOutcome`	Reconciliation Session (versioned gold standard per study)
Pool management	Random assignment from pool	Random assignment from pool
Bulk operations	Bulk approve agreed studies	Bulk approve agreed studies
Reconciler annotation	Creates own screening annotation	Creates own data annotation
Outcome	Study included or excluded	Authoritative answer per question (latest session version)

Why not combine? The entities are fundamentally different (boolean include/exclude vs typed multi-question answers), the reconciliation UIs are different (simple decision vs side-by-side multi-question review), and the downstream effects are different (pipeline gating vs data authority).

What is shared: Pool management logic, random assignment algorithm, bulk approve patterns, metrics computation framework, permission model.

Ordering Constraint¶

When a project uses both screening and annotation stages:

Candidates complete screening + annotations
  → Screening Annotation Reconciliation
    → Reconciler resolves screening conflicts
      → FinalScreeningOutcome
        → Exclude: Study is excluded. No annotation reconciliation needed.
        → Include: Study is included. Check annotation sessions per stage.
          → CompletedSessions >= MinAnnotators?
            → Yes, 1 session, MinAnnotators==1:
                Auto-promote (SingleAnnotator)
                → New ReconciliationSessionVersion with auto-promoted answers
            → Yes, 2+ sessions:
                Study enters Annotation Reconciliation Pool for this stage
                → Reconciler resolves annotation conflicts
                  → New ReconciliationSessionVersion
                    (carries forward previous answers, adds this stage's)
                    → Study's gold standard evolves toward completeness
            → No:
                Not ready. Awaiting more annotation sessions.

Key constraint: Annotation reconciliation cannot proceed until screening reconciliation has determined that the study is included. Excluded studies are never reconciled for annotations — this saves work and avoids meaningless reconciliation of studies that won't be used.
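
The screening gate at the top of this flow can be sketched as a guard that runs before any per-stage session check. The string outcomes and return labels below are hypothetical placeholders, and `None` stands for a study whose screening reconciliation has not yet produced a FinalScreeningOutcome.

```python
from typing import Optional


def annotation_reconciliation_gate(final_screening_outcome: Optional[str]) -> str:
    """Decide whether annotation reconciliation may be considered for a study."""
    if final_screening_outcome is None:
        return "await_screening"          # screening reconciliation not finished
    if final_screening_outcome == "Exclude":
        return "skip"                     # excluded studies are never reconciled
    if final_screening_outcome == "Include":
        return "check_stage_sessions"     # continue with per-stage authority rules
    raise ValueError(f"unknown screening outcome: {final_screening_outcome!r}")
```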

No Bypass for Annotation Reconciliation (MVP)¶

Screening reconciliation supports bypass criteria (admin-configurable rules for skipping the pool) because the screening pipeline is an operational gate — studies must flow through to reach annotation. If every study required manual reconciliation, throughput would bottleneck.

Annotation reconciliation does not need bypass for MVP:

There is no equivalent pipeline pressure. Studies with agreed annotations get bulk-approved efficiently.
Removing reconciler involvement entirely (bypass) for some studies could undermine the quality assurance purpose of reconciliation.
Bulk approve already handles the efficiency concern for agreed studies without removing reconciler oversight.
Bypass can be considered as a future enhancement if demand arises.

Section 7: UI/UX Design Decisions¶

Question Management (Design + Assign Tabs)¶

Editing a question: Opens a "Save as New Version" flow. Admin sees the current version number and is informed they are creating a new version.
Version history: Panel showing all versions with diffs. Side-by-side comparison between any two versions.
Version badge: Displayed on each question card (e.g., "v3").
Locked questions: Questions that have been answered are marked as locked. Editing is still possible but triggers the versioning wizard.

Question Set Version Management¶

Creating a new QSV: Admin selects which AQVersionIds to include and their display order. Drag-and-drop reordering. Version selection per question. Parent integrity is enforced: all ancestors of any selected AQV are automatically included.
QSV immutability: Any change to the question set (add, remove, reorder, upgrade version) creates a new QSV.
Admin decision on active sessions: When a new QSV is assigned to a stage and sessions exist under the previous QSV, the admin chooses what to do. This is an admin decision, not a system-prescribed behaviour. Options may include: pin existing sessions to their starting QSV, invalidate in-progress sessions, require re-annotation for specific questions, etc. The system presents the options; the admin decides.
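
The parent integrity rule for QSV composition can be sketched as an ancestor closure over the admin's selection. This is an illustration only: `parent_of` is a hypothetical mapping from each AQVersionId to the AQVersionId of its parent (or `None` for a root), and how the parent's version is chosen is not specified here. Placing parents before children is also an assumption; the real display order is chosen by the admin.

```python
def with_ancestors(selected, parent_of):
    """Return the selected ids plus every ancestor, parents before children."""
    result = []
    seen = set()
    for version_id in selected:
        chain = []
        current = version_id
        # Walk up the parent chain until a root or an already-included id.
        while current is not None and current not in seen:
            chain.append(current)
            seen.add(current)
            current = parent_of.get(current)
        # The chain was collected child-first; emit it root-first.
        result.extend(reversed(chain))
    return result
```

For example, selecting only a grandchild question yields the root, the parent, and then the grandchild, which is the closure a new QSV would need to contain.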

Annotation Form¶

Question version changes: If an admin creates a new QSV that upgrades a question version and configures re-annotation, annotators are informed of what changed. The notification is informational — it helps the annotator understand why they are being asked to re-answer.
Per-question saving: Form saves individual answers rather than the entire session, enabling auto-save and reducing data loss risk.

Reconciliation Dashboard¶

Random presentation: Reconciler clicks "Next study" to receive a randomly-assigned study from the pool. No browsing or selection.
Full Project Question Set: The reconciliation form shows all project AQs. The current stage's QSV questions are required; other project questions are optional. Previously reconciled answers (from earlier session versions) are shown as context.
Candidate blinding: Reconciler sees all candidates' answers side-by-side for the current stage's questions. Candidate annotators never see each other's answers (system invariant).
Anonymised candidates: During reconciliation review, candidate answers are presented anonymously (e.g., "Annotator A", "Annotator B") to prevent identity bias. The reconciler does not know which annotator gave which answer.
Cross-stage context: For questions that have been reconciled in previous stages, the reconciler sees the existing gold standard answer alongside the current stage's candidates (if applicable).
Bulk approve: For studies where all candidates agree, reconciler can approve in bulk. System creates reconciliation annotations on the reconciler's behalf and writes a new ReconciliationSessionVersion.
Manual resolution: For conflicted studies, reconciler enters their own answer per question. Optional rationale per decision (admin-configurable to require).
Metrics visibility: Per-question and per-study agreement metrics visible on the dashboard.
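
Three of the dashboard behaviours above (random presentation, anonymised candidates, and detecting studies eligible for bulk approve) can be sketched as follows. All function names and the dictionary shapes are hypothetical. Agreement uses exact equality, matching the current exact-match decision noted under Open Items, and the labelling assumes at most 26 candidates.

```python
import random
import string


def next_study(pool, rng):
    """Pick a study uniformly at random from the pool; the reconciler cannot choose."""
    return rng.choice(sorted(pool)) if pool else None


def anonymise_candidates(answers_by_annotator, rng):
    """Replace annotator ids with shuffled labels such as 'Annotator A'."""
    annotator_ids = list(answers_by_annotator)
    rng.shuffle(annotator_ids)
    return {
        f"Annotator {string.ascii_uppercase[i]}": answers_by_annotator[annotator_id]
        for i, annotator_id in enumerate(annotator_ids)
    }


def all_candidates_agree(answers_by_annotator):
    """True when at least two candidates gave identical answers to every question."""
    answer_sets = list(answers_by_annotator.values())
    return len(answer_sets) >= 2 and all(a == answer_sets[0] for a in answer_sets[1:])
```

Shuffling before labelling means the label order carries no information about who annotated first or about any stable ordering of annotator ids.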

Section 8: Migration Strategy¶

Verified: No Existing Reconciliation Data¶

Production database verified via MongoDB MCP:

Sessions: 194,741 total, all with Reconciliation: false
Annotations: 3,096,894 total, all with Reconciled: false

There are zero reconciliation sessions or reconciled annotations in production. The reconciliation feature was partially implemented in the backend but never exposed in the UI and never used. This means:

No legacy reconciliation data handling is needed.
The Reconciled flag can be phased out cleanly.
Migration is a clean slate for reconciliation-related entities.

Migration Approach: Additive Only¶

No data is deleted or moved. All changes are additive.

Phase 1: Versioning Infrastructure

Add CurrentVersionNumber and Versions[] to AnnotationQuestion
Add Scope, OwnerId fields (default: Project scope, ProjectId as owner for existing questions)
Mark all existing questions as v1 (accurate — they have never been versioned)
Create initial QuestionSet and QuestionSetVersion for each stage based on current Stage.AnnotationQuestions
Add version reference fields to annotations (nullable initially, backfilled in Phase 2)
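
A minimal sketch of the additive Phase 1 step for a single question document, operating on plain dictionaries. The field names CurrentVersionNumber, Versions, Scope and OwnerId come from the list above; the shape of the Versions entry and the function name are assumptions made for illustration.

```python
import copy


def add_versioning_fields(question, project_id):
    """Return a copy of a legacy question document with Phase 1 fields added.

    Existing fields are never removed or overwritten, so applying the
    function twice gives the same result as applying it once.
    """
    migrated = copy.deepcopy(question)
    migrated.setdefault("CurrentVersionNumber", 1)
    # Hypothetical minimal shape for the v1 entry.
    migrated.setdefault("Versions", [{"VersionNumber": 1}])
    migrated.setdefault("Scope", "Project")
    migrated.setdefault("OwnerId", project_id)
    return migrated
```

Because only new keys are added, rolling back amounts to dropping those keys again.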

Phase 2: Reconciliation Session Backfill

For every study with exactly 1 completed session per stage where MinAnnotators == 1:
  Create a ReconciliationSessionVersion whose answers have Resolution = SingleAnnotator
  Point each ReconciliationAnswer.AnnotationVersionId at the candidate's (now v1) AnnotationVersionId
Studies with 0 completed sessions: no action (not ready)
Studies with 2+ completed sessions: enter the reconciliation pool (no auto-backfill — reconciler must review)

Phase 3: Feature Flag Rollout

New question management UI: behind feature flag (extends existing newQuestionManagement flag)
Reconciliation dashboard: behind new feature flag
Annotation form v2: behind feature flag (independent rollout)
Gradual enablement per project

Backward Compatibility¶

Existing annotation form continues to work throughout migration.
Existing question management UI continues to work.
API consumers see the same structures with additional optional fields.
The Reconciled boolean is preserved on existing annotations but not used for new authority determination (replaced by the Reconciliation Session model).

Section 9: Cross-Scope Sharing — Import, Fork, Publish¶

Discussion: GitHub Discussion #1598 — Feature Spec Question Templates (June 2024)

Problem¶

Many systematic review projects use the same or very similar annotation questions. Project admins spend time recreating identical questions across projects. CAMARADES also wants to provide "best practice methodology" question sets (e.g., risk of bias, reporting quality) and aggregate structured data across projects for automation/LLM training. This feature was specifically requested by the funder.

Design Principle: Reference-First with Copy-on-Write¶

Cross-scope sharing uses a reference model, not a copy model. Since AQVersions and QSVs are immutable, they are inherently safe to share across project boundaries — an immutable object can never cause consistency issues when referenced from multiple contexts.

Action	Result
Import QSV as-is	Stage references the shared QSV directly. No copy created.
Modify imported QSV	System creates a new project-local QSV (`DerivedFrom` → source QSVId)
Use shared AQ as-is	QSV references the shared AQVersionId directly. No copy created.
Modify imported AQ	System creates a new project-local AQ (`DerivedFrom` → source QuestionId) with a new project-local AQVersion (`DerivedFrom` → source AQVersionId)

This follows established patterns: Git (shared immutable commits, fork to modify), Docker (shared immutable layers), npm (shared published versions, fork to patch), and Copy-on-Write filesystems (ZFS/Btrfs).

DDD justification: Immutable objects satisfy the safety requirements for cross-aggregate sharing. In DDD, Value Objects can be freely shared because they have no mutable state. AQVersions are effectively Value Objects with an ID for referencing convenience. The Shared Kernel pattern explicitly supports this — multiple bounded contexts referencing shared immutable concepts.
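
The reference-first, copy-on-write behaviour can be illustrated with an immutable record. This is a sketch under stated assumptions: the `AQVersion` dataclass below is a simplified stand-in for the real entity, and `use_as_is` and `fork_for_project` are hypothetical helper names.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AQVersion:
    """Simplified immutable question version (illustrative fields only)."""
    id: str
    scope: str
    owner_id: str
    text: str
    derived_from: Optional[str] = None


def use_as_is(shared):
    """Importing as-is creates nothing: the caller stores the shared id."""
    return shared.id


def fork_for_project(shared, project_id, new_text):
    """Modifying an imported version creates a project-local copy with provenance."""
    return AQVersion(
        id=str(uuid.uuid4()),
        scope="Project",
        owner_id=project_id,
        text=new_text,
        derived_from=shared.id,
    )
```

The frozen dataclass mirrors the immutability guarantee: the shared version cannot be changed in place, so the only way to alter it is to fork.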

Unified Entity Model¶

The QuestionSet and QuestionSetVersion entities defined in Section 1 serve both as project-local question set configurations (assigned to stages) and as cross-project reusable templates. There is no separate "template" entity — a template is simply a QuestionSet at a non-Project scope, optionally with Published: true and rich Metadata.

Within a project context, question sets are referred to as "Question Sets". When browsing the cross-project catalogue, they are presented as "Templates" in the UI. The distinction is purely presentational.

Asset Scoping Model¶

All versionable assets (AQ, AQVersion, QS, QSV) exist at one of four ownership scope levels:

Ownership Scopes:
  System        → SyRF platform (curated by CAMARADES)
  Organisation  → Org-private library (see Section 10)
  Researcher    → Personal library
  Project       → Project-local

Community Catalogue:
  Not a scope level — it is a query: WHERE Published == true
  Any asset at System, Organisation, or Researcher scope can be published

Visibility rules:

Asset Scope	Published: false	Published: true
System	All authenticated users	All authenticated users
Organisation	Org members only	All authenticated users
Researcher	Researcher only	All authenticated users
Project	Project members only	N/A (promote to higher scope to publish)
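
The visibility table can be expressed as a single predicate. The sketch below assumes the viewer is already authenticated; the `Asset` and `Viewer` types and their field names are hypothetical simplifications.

```python
from dataclasses import dataclass

SCOPES = {"System", "Organisation", "Researcher", "Project"}


@dataclass(frozen=True)
class Asset:
    scope: str
    owner_id: str
    published: bool = False


@dataclass(frozen=True)
class Viewer:
    investigator_id: str
    org_ids: frozenset = frozenset()
    project_ids: frozenset = frozenset()


def can_view(asset, viewer):
    """Apply the visibility table to one asset and one authenticated viewer."""
    if asset.scope not in SCOPES:
        raise ValueError(f"unknown scope: {asset.scope}")
    if asset.scope == "System":
        return True
    if asset.scope == "Project":
        # Project assets cannot be published; they must be promoted first.
        return asset.owner_id in viewer.project_ids
    if asset.published:
        return True
    if asset.scope == "Organisation":
        return asset.owner_id in viewer.org_ids
    return asset.owner_id == viewer.investigator_id  # Researcher scope
```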

Derivation rules:

Any scope can derive (fork) from any equal-or-higher-visibility scope
Derivation creates a new immutable asset at the target scope
The derived asset contains DerivedFrom pointing to the source
Unmodified assets in the source are referenced, not copied

Project Question Library¶

A project maintains a Question Library — a manifest of all AQ references available to the project. This is a flat set of AQVersionId references (not entity copies):

ProjectQuestionLibrary
└── Entries: Set<QuestionLibraryEntry>

QuestionLibraryEntry
├── QuestionId: Guid
├── VersionIds: List<Guid>            ← AQVersionIds available in this project
├── Ownership: Owned | Imported
└── SourceScope: System | Organisation | Researcher | Project

Import invariant: Any QSV assigned to a stage must have all of its constituent AQVersionIds already present in the project's Question Library. This is enforced at QSV assignment time.
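
A minimal sketch of how the import invariant might be checked at assignment time, using plain dictionaries. The `VersionIds` field follows QuestionLibraryEntry above; the `AQVersionIds` and `ActiveQSVId` keys and both function names are assumptions for illustration.

```python
def missing_library_versions(qsv_version_ids, library_entries):
    """Return the AQVersionIds in the QSV that the project library lacks."""
    available = {vid for entry in library_entries for vid in entry["VersionIds"]}
    return [vid for vid in qsv_version_ids if vid not in available]


def assign_qsv_to_stage(stage, qsv, library_entries):
    """Return an updated stage, or raise if the import invariant is violated."""
    missing = missing_library_versions(qsv["AQVersionIds"], library_entries)
    if missing:
        raise ValueError(f"AQVersionIds not in Question Library: {missing}")
    return {**stage, "ActiveQSVId": qsv["Id"]}
```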

The Fork Graph¶

The DerivedFrom fields on AQ, AQVersion, QS, and QSV create a fork graph (DAG) that provides full provenance. Any project-local asset can be traced back through its derivation chain to its origin:

AQ "Sample Size" (System scope)
  └── AQVersion v1 (System scope)
        │
        ├── referenced as-is by Org A's QSV (no fork, shared reference)
        │     └── referenced as-is by Project X's stage (no fork)
        │
        └── forked by Project Y → Project AQVersion v1
              (DerivedFrom = System AQVersion v1)
              (added "Include dropouts?" option)
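
Tracing provenance through the fork graph amounts to following DerivedFrom pointers until an origin is reached. The sketch below assumes a hypothetical in-memory mapping `derived_from` from asset id to source id (`None` for an origin); the ids in the usage example are made up.

```python
def trace_lineage(asset_id, derived_from):
    """Return [asset, its source, the source's source, ...] ending at the origin."""
    lineage = [asset_id]
    seen = {asset_id}
    current = derived_from.get(asset_id)
    while current is not None:
        if current in seen:
            # A DAG has no cycles; treat one as corrupted data.
            raise ValueError(f"cycle in DerivedFrom chain at {current}")
        lineage.append(current)
        seen.add(current)
        current = derived_from.get(current)
    return lineage


derived_from = {"project-y-aqv-1": "system-aqv-1", "system-aqv-1": None}
print(trace_lineage("project-y-aqv-1", derived_from))
```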

Import Workflow¶

Import as-is (reference, no copy):

Admin browses community catalogue (search/filter by metadata) or org/system libraries
Selects a QuestionSet (and optionally specific questions if SelectiveImport policy)
AQVersionIds are added to the project's Question Library
Stage's active QSV is set to the selected QSVId (or a subset QSV is created)
No new AQ or AQVersion entities created — the stage references the shared immutable assets
Annotations will reference the shared AQVersionIds directly

Fork to customize:

Admin wants to modify an imported QSV (add/remove/reorder questions, or edit a question)
For modified questions: system creates project-local AQ + AQVersion with DerivedFrom pointers
System creates project-local QSV with DerivedFrom pointer, referencing:
  Shared AQVersionIds for unchanged questions (shared reference)
  Project-local AQVersionIds for modified questions (forked)
Stage now references the project-local QSV
New project-local entities added to the Question Library

Publish Workflow¶

Publish project questions to the community catalogue:

Admin selects questions from their project
Admin enters QuestionSet metadata (name, domain, species, guidance URL)
System creates a new QuestionSet at Researcher or Organisation scope with Published: true
The QSV references the selected AQVersionIds (shared immutable references)
QuestionSet versioned independently from that point

Cross-Project Aggregation¶

The reference model makes cross-project data aggregation trivially correct:

Projects using a shared QSV as-is:
  → Annotations directly reference shared AQVersionIds
  → Query: ReconciliationSession answers WHERE AQVersionId IN [shared AQVersionIds]
  → Result: All authoritative answers, semantically identical across projects

Projects that forked:
  → Follow DerivedFrom chain to identify forked AQVersions
  → Decide if forks are semantically comparable (did the fork change meaning?)

This is the key enabler for CAMARADES' LLM training data extraction use case: when hundreds of projects reference the same AQVersionId, their reconciliation session answers are structured, comparable training data with zero ambiguity about semantic equivalence.
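
The aggregation query for projects that use a shared QSV as-is can be sketched as a filter and group over reconciliation answers. The record shape here (`ProjectId`, `AQVersionId`, `Value`) is a hypothetical flattening chosen for illustration, not the stored schema.

```python
from collections import defaultdict


def aggregate_answers(reconciliation_answers, shared_version_ids):
    """Group authoritative answers by shared AQVersionId across projects."""
    shared = set(shared_version_ids)
    grouped = defaultdict(list)
    for answer in reconciliation_answers:
        if answer["AQVersionId"] in shared:
            grouped[answer["AQVersionId"]].append(
                (answer["ProjectId"], answer["Value"])
            )
    return dict(grouped)
```

Answers attached to forked AQVersions are deliberately left out of the result; whether a fork is comparable to its source is a separate judgement made by following the DerivedFrom chain.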

System Questions as a QuestionSet¶

The 18 hardcoded system questions (with fixed GUIDs) should be remodelled as a CAMARADES curated QuestionSet — the "SyRF Core" question set at System scope with ImportPolicy: AllOrNothing and EditPolicy: ReadOnly. This provides:

A versioning pathway for system questions (currently impossible)
Conceptual alignment: system questions ARE shared questions from the platform
A migration path to eventually remove hardcoded GUIDs

This is a future migration, not an MVP requirement.

Physical Storage¶

All AQs live in a single global collection (pmAnnotationQuestion) regardless of scope. Scope + OwnerId fields provide access control. No routing logic needed — query by scope/owner.

All QuestionSets live in a single global collection (pmQuestionSet) regardless of scope. Same pattern.

Since QSVs reference AQVersionIds by Guid, resolution is straightforward: query pmAnnotationQuestion by AQVersionId. Because AQVersions are immutable, a resolved version can be cached indefinitely without invalidation.

Prioritization¶

Phase	Scope	Depends On
With M1	`Scope`, `OwnerId`, `DerivedFrom` fields on AQ/AQVersion (with defaults for existing data)	Question versioning (M1)
With M1	`pmQuestionSet` collection, QS/QSV entities, stage references QSVId	Question versioning (M1)
New milestone	Curated catalogue, import-as-reference workflow, Question Library	M1 complete
Follow-on	Fork-to-customize workflow, community publishing, metadata search	Import MVP
Follow-on	Cross-project aggregation API, data export, LLM training pipeline	Sharing + Reconciliation Session
Follow-on	System questions as QuestionSet, update propagation notifications	Sharing stable

Section 10: Organisation Model¶

Overview¶

Organisations provide a multi-tenancy layer above projects. Projects and users belong to organisations, enabling:

Private template libraries: Org-level question templates visible only to org members
Centralised user management: Invite/remove users at org level
Shared methodology: Enforce or recommend org-standard question sets
Permission hierarchy: Org-level roles in addition to existing project/stage roles

Entity Model¶

Organisation (New Aggregate — new collection: pmOrganisation)
├── Id: Guid
├── Name: string
├── Slug: string                      ← URL-friendly identifier
├── Settings: OrgSettings
│   ├── DefaultEditPolicy: Editable | ReadOnly | EditableWithWarning
│   ├── RequireTemplateForNewStages: bool
│   └── AllowCommunityPublishing: bool
├── Members: List<OrgMembership>
│   ├── InvestigatorId: Guid
│   ├── Role: OrgOwner | OrgAdmin | OrgMember
│   └── JoinedAt: DateTime
└── CreatedAt: DateTime

Relationship to Existing Entities¶

Organisation
  ├── has many → Projects (Project.OrganisationId: Guid?)
  ├── has many → Members (OrgMembership → Investigator)
  └── owns → Org-scoped AQs, AQVersions, QuestionSets

Project (modified)
  ├── OrganisationId: Guid?           ← nullable for backward compatibility
  └── (all existing fields unchanged)

Investigator (modified)
  ├── OrgMemberships: List<OrgMembershipRef>?
  │   ├── OrganisationId: Guid
  │   └── Role: OrgOwner | OrgAdmin | OrgMember
  └── (all existing fields unchanged)

Permission Hierarchy¶

System (SyRF platform)
  └── Organisation
        ├── OrgOwner:  Full org management, delete org, transfer ownership
        ├── OrgAdmin:  Manage members, manage org templates, create projects
        ├── OrgMember: View org templates, import into projects, create projects
        │
        └── Project (inherits org context but NOT automatic access)
              ├── ProjectAdmin: Manage project settings, stages, members
              ├── ProjectMember: Participate in assigned stages
              │
              └── Stage
                    ├── Annotate
                    ├── Reconcile
                    └── (other stage activities)

Key principle: Org-level roles do not automatically grant project-level access. An OrgAdmin can create projects and manage org templates, but must still be explicitly added to a project to access its data. This follows the principle of least privilege and prevents accidental data exposure.
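
The separation between org-level roles and project access can be sketched as two independent checks. The dictionary shapes, the `MemberIds` key, and the function names are hypothetical, and treating OrgOwner as able to manage templates is an assumption based on "full org management".

```python
def can_manage_org_templates(investigator_id, organisation):
    """Org-level capability: depends only on the investigator's org role."""
    return any(
        m["InvestigatorId"] == investigator_id and m["Role"] in {"OrgOwner", "OrgAdmin"}
        for m in organisation["Members"]
    )


def can_access_project(investigator_id, project):
    """Project-level access: requires explicit project membership.

    Org roles are intentionally not consulted here.
    """
    return investigator_id in project["MemberIds"]
```

An OrgAdmin who is not listed as a project member passes the first check and fails the second, which is the least-privilege behaviour described above.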

Template Visibility with Organisations¶

Visibility matrix:

Who can see...          | System | Org A private | Org B private | Community | Project X (Org A)
─────────────────────── | ────── | ───────────── | ───────────── | ──────── | ─────────────────
System templates        | ✓      | ✓             | ✓             | ✓        | ✓
Org A templates         | ✗      | ✓             | ✗             | ✗        | ✓
Org B templates         | ✗      | ✗             | ✓             | ✗        | ✗
Community templates     | ✓      | ✓             | ✓             | ✓        | ✓
Project X local assets  | ✗      | ✗             | ✗             | ✗        | ✓

An OrgAdmin can publish an org template to the community catalogue, changing its visibility from org-private to public. This is a one-way promotion (private → public).

Researcher-Level Libraries¶

In addition to org and project scopes, individual researchers can maintain a personal question library:

Researcher-scoped assets:
├── Scope: Researcher
├── OwnerId: InvestigatorId
├── Visibility: private to the researcher
└── Can be imported into any project the researcher is a member of

This enables researchers to build up personal libraries of question sets they reuse across their projects and organisations.

Migration Approach¶

The organisation model is additive — existing projects and users continue to work without organisations:

Project.OrganisationId is nullable — existing projects have no org
Investigator.OrgMemberships is nullable — existing users have no org affiliations
Organisations can be created and populated gradually
Existing projects can be migrated into organisations when ready

Phase 1: Add nullable org fields to Project and Investigator
Phase 2: Organisation CRUD, member management, org-level UI
Phase 3: Org template libraries, visibility rules
Phase 4: Org settings enforcement (e.g., require template for new stages)

MongoDB Storage¶

New collection: pmOrganisation

{
  _id: CSUUID("..."),
  name: "CAMARADES Edinburgh",
  slug: "camarades-edinburgh",
  settings: {
    defaultEditPolicy: "EditableWithWarning",
    requireTemplateForNewStages: false,
    allowCommunityPublishing: true
  },
  members: [
    {
      investigatorId: CSUUID("..."),
      role: "OrgAdmin",
      joinedAt: ISODate("2026-01-15T00:00:00Z")
    }
  ],
  createdAt: ISODate("2026-01-15T00:00:00Z")
}

Modified pmProject:

{
  _id: CSUUID("..."),
  organisationId: CSUUID("..."),  // nullable — null for legacy projects
  // ... all existing fields unchanged
}

Decision Log¶

Summary of all decisions made during brainstorming, for quick reference:

#	Decision	Rationale
D1	Identity + Immutable Versions pattern for all entities	Provides full audit trail, enables time-travel queries, consistent pattern
D2	AQVersionId as sole reference in QSV	QuestionId is derivable; single reference simplifies lookups
D3	Explicit AnnotationVersionIds in sessions (not computed filters)	Eliminates ambiguity; each session knows exactly what it contains
D4	Annotation identity is study-scoped: (StudyId, AnnotatorId, QuestionId)	Supports cross-stage question sharing via explicit version refs
D5	Reconciliation Session as authority mechanism (not Reconciled flag or separate FAO entity)	Versioned gold standard per study; each stage's reconciliation is another version; latest version IS the authoritative answer set
D6	Resolution metadata on ReconciliationAnswer only, not on annotations	Clean separation: annotations are pure data; the reconciliation session carries authority
D7	Reconciler always creates own annotation when involved	Avoids arbitrarily elevating one candidate; clean attribution
D8	MinAnnotators is a readiness threshold, not an authority mechanism	CompletedSessions count determines pathway, not MinAnnotators
D9	Random assignment for reconciliation (not claiming)	Prevents cherry-picking; fair distribution; matches screening pattern
D10	Candidate blinding is a system invariant	Independent assessment requires annotators never see each other's answers
D11	QSV is immutable; admin decides session handling on change	System does not prescribe what happens to sessions; admin chooses
D12	Rationale is optional by default, admin-configurable	Per-stage for MVP; potentially per-question in future
D13	Separate screening and annotation reconciliation processes	Different entities, UIs, and downstream effects; share infrastructure
D14	Screening reconciliation precedes annotation reconciliation	Excluded studies skip annotation reconciliation entirely
D15	No annotation reconciliation bypass for MVP	Bulk approve handles efficiency; bypass risks undermining quality assurance
D16	Reconciler annotations attributed to reconciler	Bulk approve creates annotations on reconciler's behalf, not copied from candidates
D17	No legacy reconciliation migration needed	Verified: 0 reconciled annotations in production
D18	Additive-only migration	No data deleted; new fields alongside existing; rollback = $unset
D19	Reference-first with copy-on-write for cross-scope sharing	Immutable assets are safe to share; eliminates duplication; cross-project aggregation is trivially correct
D20	Four-level ownership scoping: System, Organisation, Researcher, Project	Enables private org libraries, personal libraries, and project-local customization
D21	Fork graph (DAG) for provenance tracking via DerivedFrom	Complete traceable lineage; any asset traceable back to its origin
D22	QuestionSet/QuestionSetVersion unifies SQS and QuestionTemplate	Single entity serves both as stage question config and cross-project template; eliminates TemplateVersion indirection
D23	Organisation as new aggregate with nullable foreign keys	Additive adoption; existing projects/users unaffected until migrated
D24	Org roles do NOT grant automatic project access	Principle of least privilege; explicit project membership still required
D25	System questions to be remodelled as CAMARADES curated QuestionSet (future)	Enables versioning of system questions; conceptual alignment
D26	Researcher-level personal question libraries	Researchers build reusable libraries across their projects
D27	Community is a Published flag, not a scope level	Community catalogue = `WHERE Published == true`; avoids a fifth scope level; any ownership scope can publish
D28	Single global collection for all AQs (`pmAnnotationQuestion`)	Scope + OwnerId fields for access control; no routing logic; shared immutable versions are safe
D29	Parent integrity: all AQV ancestors must be in the same QSV	Enforced at QSV composition time; creates structural cross-stage question overlap
D30	Stages are unordered — can be concurrent and orthogonal	No sequential ordering assumption; no "latest stage wins" precedence
D31	Cross-stage question overlap is structural, not exceptional	Forced by parent integrity rule; expected and common pattern
D32	Cross-stage visibility is admin-configurable per stage	Blind, ShowOwnPrior, or ShowReconciled; default: reconcilers see context, annotators are blinded
D33	Reconciliation session is a materialised decision record, not a DDD entity	No lifecycle or state transitions; computed synchronously on the study document; not a separate aggregate
D34	Reconciliation form shows full Project Question Set	Stage questions required, others optional; reconcilers have full project context; cross-stage answers visible
D35	Cross-stage disagreement resolved through natural reconciliation	Later stage's reconciler sees and engages with earlier answers; no separate conflict resolution workflow needed
D36	Project Question Library as flat reference manifest	Set of AQVersionId references, not entity copies; import invariant enforced at QSV assignment

Section 11: Annotation Versioning & Entity Model Refinements (D37-D50)¶

Source: annotation-versioning-design.md — brainstorming session (Feb 2026). These decisions refine D1-D36 above. Where they conflict, the D37-D50 decisions take precedence and the affected sections of this document have been updated inline (see audit trail comments).

Decision Table¶

#	Decision	Rationale
D37	DraftAQ is a separate type from AQ	Creation phase has fundamentally different invariants (all properties mutable) vs modification phase (identity frozen, only content versionable). Conditional logic signals two domain concepts forced into one model.
D38	`dataType`, `parentId`, `groupAsSingle` are identity properties — immutable after publish	Changing parent restructures entity subtrees and form shape. Changing dataType invalidates all AVs. These are not "new versions of the same question" — they are different questions.
D39	SQS does not need a Draft type	SQS has no identity properties. Its only content (questionIds) is the same shape for initial assignment and subsequent modification. `pendingChanges` is sufficient.
D40	"Pending" terminology for existing entities, "Draft" for under-construction	Avoids confusion. "Draft" = not yet real. "Pending" = real entity with uncommitted edits.
D41	Annotations are own aggregate in `pmAnnotation` collection	Avoids Study document contention and unbounded growth. Annotation + embedded AVs form natural aggregate boundary.
D42	Sessions are own aggregate in `pmAnnotationSession` collection	Same reasoning as D41. Session + embedded ASVs form natural aggregate boundary.
D43	AVs embedded in Annotation document (not separate collection)	Annotation and its AVs must be consistent (`currentAVId` must match). Embedding gives single-doc atomicity for AV creation.
D44	Cross-stage annotation sharing via same Annotation entity with multiple AVs	No forking, no copying. Both stages' sessions reference the same annotationId. Editing creates a new AV; other stages' ASVs pin to the older AV.
D45	Reconciliation annotations have `annotatorId: null` and track authorship on AVs	Reconciliation annotations are shared across stages AND reviewers. The gold standard has no single owner — authorship tracked per-AV via `committedBy`.
D46	Server-side auto-save via `pendingAnswer` field	Users change machines/browsers. Client-only auto-save (IndexedDB) is insufficient. `pendingAnswer` is a mutable field on the Annotation document, updated with each auto-save.
D47	Save/Complete operations use MongoDB multi-document transactions	Save touches N annotation docs + 1 session doc. Transactions ensure consistency. Acceptable overhead for infrequent explicit user actions.
D48	Graduated impact assessment at commit time, not immutable property restrictions	No property is "forbidden to edit" — the system assesses impact and warns proportionally. A dataType change with annotations is allowed, it just requires creating a new AQ (because it IS a new question).
D49	`entityInstanceId` concept removed — `annotationId` serves as stable entity identity	The existing `Annotation.Id` already provides stable identity for entity subtree references. No new concept needed. Migration is direct: current Id becomes annotationId.
D50	Study document holds only references (`sessionIds`, `annotationIds`)	Study is slimmed to metadata + reference arrays. Actual annotation and session data lives in their own collections. Prevents document bloat and contention.

Collection Boundaries Summary¶

Collection	Aggregate root	Contents	Rationale
`pmStudy`	Study	Study metadata + `sessionIds: GUID[]` + `annotationIds: GUID[]` (references only)	Slimmed to avoid contention and bloat (D50)
`pmAnnotation`	Annotation	Annotation identity + `pendingAnswer` + embedded `avs: AV[]`	Own aggregate boundary; grows with saves (D41, D43)
`pmAnnotationSession`	Session	Session metadata + `annotationIds` + embedded `asvs: ASV[]`	Own aggregate boundary; grows with saves (D42)
`pmProject`	Project	DraftAQs, AQs (or refs), SQS config	Unchanged
`pmAnnotationQuestion`	AQ	AQ identity + embedded AQVersions	Global collection, scope-based access (D28)
`pmQuestionSet`	QS	QS identity + embedded QSVs	Global collection, scope-based access

Key Patterns Established¶

DraftAQ → AQ lifecycle: Fully mutable drafts convert to versioned entities on activation (D37)
Identity / Content / Derived property classification: Determines which properties can be versioned vs require a new entity (D38)
Pending buffers: pendingChanges on AQ/SQS and pendingAnswer on Annotation enable server-side auto-save without creating versions (D40, D46)
Reconciliation via AV mechanism: Reconciliation answers are AVs on annotatorId: null annotations, unifying candidate and reconciliation answer storage (D45)

Open Items¶

Items that remain unresolved or require product-owner input:

Decimal comparison tolerance: Should decimal answer comparison use a tolerance for agreement detection? Current decision: exact match (safer). Tolerance can be added later.
Reconciliation access control: Who can reconcile? Options: (a) any project member, (b) designated reconcilers only, (c) project administrators only. Recommendation: (b) designated reconcilers, consistent with screening.
Export format changes: Adding version references and reconciliation metrics to exports may affect downstream tools. Survey current consumers before finalising.
Legacy annotation audit: Should we audit for annotations referencing non-existent questions before migration? Low risk (versioning is additive) but good hygiene.
Admin UI for session handling on QSV change: Exact options and UX for the admin decision when assigning a new QSV to a stage with active sessions. Needs UX design.
Notification system for re-annotation: When admin requires re-annotation after a question version change, how are annotators notified? In-app notification, email, or both?
QuestionSet update notifications: When a shared QuestionSet author publishes a new version, should projects referencing the old version be notified? How? Opt-in upgrade flow needs UX design.
QuestionSet deprecation policy: What happens when a shared QuestionSet is deprecated? Projects still reference immutable versions (safe), but the catalogue should communicate lifecycle status.
Organisation migration strategy: When and how do existing projects migrate into organisations? Voluntary opt-in? Bulk migration? What happens to projects that never join an org?
Auth0 integration for org-level roles: How do org-level permissions integrate with the existing Auth0-based authentication? New claims/roles needed?
Selective import and parent integrity interaction: For QuestionSets with SelectiveImport policy, the import UI needs to show a question picker. Parent integrity means selecting a child question auto-includes all ancestors — UX should make this clear.
Optimistic concurrency strategy for reconciliation session: When concurrent stage reconciliation produces a version conflict, the merge strategy is straightforward (disjoint stage questions), but the implementation details (retry logic, UI feedback) need design.
ReconciliationAnswer per-question rationale vs per-session: Current design supports per-answer rationale. Is this needed for MVP, or is per-session rationale sufficient? Per-answer is richer but adds UI complexity.
Reconciliation session re-reconciliation workflow: When a stage needs re-reconciliation (e.g., new candidate data), how should previously carried-forward answers from other stages be handled? Preserve as-is, or present for re-confirmation?


Key Patterns Established¶

Open Items¶