Reconciliation Workflow¶

When multiple people annotate the same study and disagree on one or more answers, reconciliation resolves those disagreements to produce a single authoritative answer -- the "gold standard." This gold-standard record is what appears in data exports and feeds agreement metrics for publication.

Reconciliation is the core quality assurance workflow in SyRF. Everything built in previous phases -- versioned questions, the rebuilt annotation form, per-question auto-save, and authority determination -- exists to make this workflow possible, traceable, and trustworthy.

Understanding Reconciliation¶

Every study in your project needs a gold-standard answer set before its data can be trusted for analysis. How a study gets its gold standard depends on how many people annotated it:

Scenario	What Happens
One annotator completed the study, and the project requires only one annotator	Their answers are automatically promoted -- no reconciliation needed
Two or more annotators completed the study	The study enters the reconciliation pool, where it is bulk-approved if all annotators agreed on every answer, or otherwise reviewed by a reconciler

Most projects that use dual annotation will have a mix of both scenarios. Auto-promotion handles the simple cases silently; reconciliation handles the cases that need human judgement.
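The routing rule in the table above can be sketched as a small function. This is an illustration only: `Route` and `route_study` are hypothetical names, not part of SyRF, and the "awaiting annotation" fallback is an assumption about cases the table does not cover.

```python
from enum import Enum


class Route(Enum):
    AWAITING_ANNOTATION = "awaiting annotation"
    AUTO_PROMOTE = "auto-promote"
    RECONCILIATION_POOL = "reconciliation pool"


def route_study(completed_annotators: int, required_annotators: int) -> Route:
    """Decide how a study obtains its gold standard (hypothetical helper)."""
    if completed_annotators >= 2:
        # Two or more completed annotations: the study goes to the pool.
        return Route.RECONCILIATION_POOL
    if completed_annotators == 1 and required_annotators == 1:
        # Only one annotator is required: their answers are promoted as-is.
        return Route.AUTO_PROMOTE
    # Assumption: anything else is still waiting for annotation to finish.
    return Route.AWAITING_ANNOTATION
```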

Key Principles¶

Random assignment, not cherry-picking. When a study needs reconciliation, it enters a pool. The system randomly assigns it to an available reconciler. Reconcilers cannot browse or select studies. This prevents bias and ensures fair workload distribution.
Anonymised comparison. The reconciler sees candidate answers labelled "Annotator A" and "Annotator B" -- never the actual identities. This ensures the reconciler judges answers on their merits, not on who provided them.
Reconciler always records their own answer. Even when the reconciler agrees with one of the candidates, they record their own independent answer. The gold standard is always the reconciler's deliberate decision, never a pointer to someone else's work. This is consistent with best practice in systematic review methodology.
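The first two principles can be illustrated with a short sketch. The function names and data shapes are hypothetical, and shuffling the answers before labelling them is an assumption about how label order might be kept from revealing identities; SyRF's actual mechanism is not described here.

```python
import random
import string


def assign_random_study(pool: list[str], rng: random.Random) -> str:
    """Remove and return one study ID chosen uniformly at random from the pool."""
    if not pool:
        raise ValueError("reconciliation pool is empty")
    return pool.pop(rng.randrange(len(pool)))


def anonymise(answers_by_annotator: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Replace annotator identities with 'Annotator A', 'Annotator B', ...

    This sketch supports at most 26 annotators (one letter each).
    """
    answers = list(answers_by_annotator.values())
    # Assumption: shuffle so the label order does not hint at who answered.
    rng.shuffle(answers)
    return {
        f"Annotator {string.ascii_uppercase[i]}": answer
        for i, answer in enumerate(answers)
    }


rng = random.Random(0)
pool = ["study-1", "study-2", "study-3"]
study = assign_random_study(pool, rng)          # the reconciler does not choose
labelled = anonymise({"alice": "Yes", "bob": "No"}, rng)
```

The reconciler-facing result contains only the labels and the answers; the original annotator names never leave `anonymise`.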

The Reconciliation Dashboard¶

The reconciliation dashboard gives project administrators a high-level view of reconciliation progress for each stage.

Navigating to the Dashboard¶

Navigate to a stage and select the Reconciliation tab. The dashboard shows:

Pool size: How many studies still need reconciliation
Bulk approve eligible: How many studies can be approved without manual review (all annotators agreed on every answer)
Manual reconciliation required: How many studies have at least one disagreement and need a reconciler
Resolved: How many studies already have a gold standard (reconciled, bulk-approved, or auto-promoted)
Agreement summary: The distribution of agreement levels across the pool

Progress Tracking¶

A progress bar shows reconciliation completion for the stage. As reconcilers work through the pool and administrators bulk-approve unanimous studies, the bar fills. When the pool is empty, reconciliation for that stage is complete.
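As a rough illustration of how the dashboard counts and the progress bar relate, the sketch below derives them from a list of study statuses. `StudyStatus` and both functions are hypothetical, and the sketch assumes the pool is simply every unresolved study, split into bulk-approve-eligible and manual.

```python
from dataclasses import dataclass


@dataclass
class StudyStatus:
    study_id: str
    resolved: bool    # already has a gold standard
    unanimous: bool   # all annotators agreed on every answer


def dashboard_summary(studies: list[StudyStatus]) -> dict[str, int]:
    """Count studies for each dashboard figure (assumed definitions)."""
    unresolved = [s for s in studies if not s.resolved]
    bulk_eligible = sum(1 for s in unresolved if s.unanimous)
    return {
        "pool_size": len(unresolved),
        "bulk_approve_eligible": bulk_eligible,
        "manual_reconciliation_required": len(unresolved) - bulk_eligible,
        "resolved": len(studies) - len(unresolved),
    }


def completion_fraction(studies: list[StudyStatus]) -> float:
    """Fraction of studies resolved; an empty stage counts as complete."""
    if not studies:
        return 1.0
    return sum(1 for s in studies if s.resolved) / len(studies)
```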

Per-Question Agreement¶

Below the summary, you see agreement rates broken down by individual question. This is useful for identifying questions that are commonly misunderstood or ambiguously worded. If a question has notably low agreement, consider revising its text or help content in a new question version.
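A per-question agreement rate can be thought of as the share of studies in which every annotator gave the same answer to that question. The sketch below assumes a nested dictionary of answers and a review threshold of 0.8; both are illustrative choices, not SyRF settings.

```python
from collections import defaultdict

# answers[study_id][question_id] -> list of answers, one per annotator
Answers = dict[str, dict[str, list[str]]]


def per_question_agreement(answers: Answers) -> dict[str, float]:
    """Return, for each question, the fraction of studies with full agreement."""
    agreed: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for questions in answers.values():
        for question_id, candidate_answers in questions.items():
            total[question_id] += 1
            if len(set(candidate_answers)) == 1:
                agreed[question_id] += 1
    return {q: agreed[q] / total[q] for q in total}


def questions_to_review(rates: dict[str, float], threshold: float = 0.8) -> list[str]:
    """List questions whose agreement falls below a hypothetical threshold."""
    return sorted(q for q, rate in rates.items() if rate < threshold)
```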

Reconciling a Study¶

This section is for users with the Reconcile permission on a stage (see Project Groups).

Starting a Reconciliation Session¶

Navigate to the stage's Reconciliation tab.
Click Start Reconciling.
The system assigns you a random study from the unresolved pool. You do not choose which study to reconcile.

The Reconciliation Form¶

For each study, you see the annotation form with candidate answers displayed side-by-side:

Disagreed questions are highlighted and shown first. The candidates' answers are displayed next to each other: "Annotator A" on the left, "Annotator B" on the right. Neither identity is revealed.
Agreed questions are grouped separately below. When all candidates gave the same answer, it is shown as a single pre-filled value.
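The grouping described above amounts to partitioning questions by whether the candidates' answers differ. A minimal sketch, assuming a hypothetical `candidates` mapping from question ID to anonymised label to answer:

```python
def partition_questions(
    candidates: dict[str, dict[str, str]],
) -> tuple[dict[str, dict[str, str]], dict[str, str]]:
    """Split questions into (disagreed, agreed).

    Disagreed questions keep every candidate's answer for side-by-side
    display; agreed questions collapse to the single shared value.
    """
    disagreed: dict[str, dict[str, str]] = {}
    agreed: dict[str, str] = {}
    for question_id, by_label in candidates.items():
        distinct = set(by_label.values())
        if len(distinct) == 1:
            agreed[question_id] = next(iter(distinct))
        else:
            disagreed[question_id] = by_label
    return disagreed, agreed
```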

For every question in the stage:

Review the candidate answers. Read what Annotator A and Annotator B answered. Consider the study text alongside their answers.
Record your own answer. Even if you agree with one candidate, enter your own answer deliberately. This creates an independent record.
Add a rationale (if required). Your project administrator may require you to explain your decision for some or all questions. If a rationale field appears, describe why you chose your answer. This is especially important for disagreed questions.

Cross-Stage Context¶

If a question was reconciled in a previous stage, the existing gold-standard answer is displayed as read-only context. This gives you additional information to inform your decision, but you still record your own answer for the current stage.

The visibility of cross-stage answers depends on your project's configuration:

Setting	What You See
Blind	No answers from other stages
Show own prior	Your own answers from other stages (if you were also an annotator)
Show reconciled	The gold-standard answer from prior stages

Your project administrator configures this per stage.
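The three settings in the table can be read as a simple selection rule. The enum and function below are hypothetical and only mirror the table; they are not SyRF's configuration API.

```python
from enum import Enum
from typing import Optional


class CrossStageVisibility(Enum):
    BLIND = "Blind"
    SHOW_OWN_PRIOR = "Show own prior"
    SHOW_RECONCILED = "Show reconciled"


def prior_context(
    setting: CrossStageVisibility,
    own_prior_answer: Optional[str],
    reconciled_answer: Optional[str],
) -> Optional[str]:
    """Return the read-only answer to display, or None if nothing is shown."""
    if setting is CrossStageVisibility.SHOW_OWN_PRIOR:
        # None if the reconciler did not annotate the earlier stage.
        return own_prior_answer
    if setting is CrossStageVisibility.SHOW_RECONCILED:
        return reconciled_answer
    return None  # Blind: no answers from other stages
```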

Submitting Your Reconciliation¶

When you have answered all required questions, click Submit. The system:

Creates a gold-standard record for this study
Records how each answer was determined (your manual decision)
Preserves the full audit trail (what the candidates answered, what you decided, and why)

After submission, the system automatically assigns you the next study from the pool. Continue until the pool is empty or you choose to stop.
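To make the audit trail concrete, the sketch below shows one way a submitted reconciliation could be represented. Every name here is hypothetical, including the "Manual Decision" label, and the sketch assumes every question in the stage must be answered; see Reconciliation Model for the actual data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GoldStandardAnswer:
    question_id: str
    answer: str                        # the reconciler's own answer
    resolution: str                    # how the answer was determined
    candidate_answers: dict[str, str]  # anonymised label -> answer, for audit
    rationale: Optional[str] = None


@dataclass(frozen=True)
class GoldStandardRecord:
    study_id: str
    answers: tuple[GoldStandardAnswer, ...]


def submit_reconciliation(
    study_id: str,
    decisions: dict[str, str],
    candidates: dict[str, dict[str, str]],
    rationales: dict[str, str],
    rationale_required: set[str],
) -> GoldStandardRecord:
    """Validate a reconciler's decisions and build the gold-standard record."""
    missing = [q for q in candidates if q not in decisions]
    if missing:
        raise ValueError(f"unanswered questions: {missing}")
    unexplained = sorted(q for q in rationale_required if not rationales.get(q))
    if unexplained:
        raise ValueError(f"rationale required for: {unexplained}")
    answers = tuple(
        GoldStandardAnswer(
            question_id=q,
            answer=decisions[q],
            resolution="Manual Decision",
            candidate_answers=dict(candidates[q]),
            rationale=rationales.get(q),
        )
        for q in candidates
    )
    return GoldStandardRecord(study_id=study_id, answers=answers)
```

Keeping the candidates' answers and the rationale alongside the reconciler's answer is what lets the record explain, later, what was decided and why.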

Bulk Approve¶

For studies where all annotators gave identical answers on every question, there is nothing to reconcile. Project administrators can approve these studies in bulk.

How to Bulk Approve¶

Open the Reconciliation tab for a stage.
The dashboard shows how many studies are eligible for bulk approval.
Click Bulk Approve.
Review the list of unanimous studies. Each entry shows the study's question count and confirms that all annotators agreed.
Click Approve All.

For each approved study, the system creates gold-standard records automatically, copying the agreed-upon answers. The resolution is recorded as "Candidate Agreement" -- the audit trail shows that all annotators agreed and no manual reconciliation was needed.
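In outline, bulk approval selects the studies where every question has a single distinct answer and copies that answer into the gold standard. The data shapes and function name below are hypothetical; only the "Candidate Agreement" label comes from the description above.

```python
def bulk_approve(
    studies: dict[str, dict[str, list[str]]],
) -> dict[str, dict[str, dict[str, str]]]:
    """Create gold-standard entries for unanimous studies only.

    studies[study_id][question_id] is the list of candidate answers.
    """
    approved: dict[str, dict[str, dict[str, str]]] = {}
    for study_id, questions in studies.items():
        unanimous = bool(questions) and all(
            len(set(candidate_answers)) == 1
            for candidate_answers in questions.values()
        )
        if not unanimous:
            continue  # left in the pool for a reconciler
        approved[study_id] = {
            question_id: {
                "answer": candidate_answers[0],
                "resolution": "Candidate Agreement",
            }
            for question_id, candidate_answers in questions.items()
        }
    return approved
```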

When to Use Bulk Approve¶

Use bulk approve early and often. It clears unanimous studies from the pool, leaving only genuine disagreements for reconcilers. This saves significant time for projects with high inter-rater agreement.

Agreement Metrics¶

SyRF automatically computes two agreement metrics that are standard in systematic review methodology.

Percent Agreement¶

The proportion of questions (or studies) where all annotators gave the same answer. This is a simple, intuitive metric applicable to all question types.

Example: If 10 studies each have 5 questions, and annotators agreed on 42 out of 50 question-study pairs, the percent agreement is 84%.
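The arithmetic in this example, written out as a minimal sketch (the function name is illustrative):

```python
def percent_agreement(agreed_pairs: int, total_pairs: int) -> float:
    """Percentage of question-study pairs on which all annotators agreed."""
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    return 100.0 * agreed_pairs / total_pairs


studies, questions_per_study = 10, 5
total_pairs = studies * questions_per_study    # 50 question-study pairs
result = percent_agreement(42, total_pairs)
print(result)  # 84.0
```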

Cohen's Kappa¶

A statistical measure of inter-rater agreement that corrects for chance. It is more rigorous than percent agreement because it discounts the agreement that annotators would reach by chance alone.

Kappa Value	Interpretation
0.81 -- 1.00	Almost perfect agreement
0.61 -- 0.80	Substantial agreement
0.41 -- 0.60	Moderate agreement
0.21 -- 0.40	Fair agreement
0.00 -- 0.20	Slight agreement
< 0.00	Less than chance agreement

Cohen's Kappa is computed for categorical questions only (yes/no, single select, multiple select). It is not computed for free-text or numeric questions where chance agreement is not meaningful.
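For reference, the standard definition of Cohen's Kappa for two raters is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's answer frequencies. The sketch below implements that textbook formula and the interpretation bands from the table; it is not SyRF's implementation, and how SyRF treats studies with more than two annotators is not covered here. Treating each band's upper bound as inclusive is an assumption.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's Kappa for two raters answering the same categorical items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of answers")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category throughout: undefined.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the interpretation bands in the table above."""
    if kappa < 0.0:
        return "Less than chance agreement"
    if kappa <= 0.20:
        return "Slight agreement"
    if kappa <= 0.40:
        return "Fair agreement"
    if kappa <= 0.60:
        return "Moderate agreement"
    if kappa <= 0.80:
        return "Substantial agreement"
    return "Almost perfect agreement"


rater_a = ["Yes", "Yes", "Yes", "Yes", "No", "No", "No", "No"]
rater_b = ["Yes", "Yes", "Yes", "No", "No", "No", "No", "Yes"]
kappa = cohens_kappa(rater_a, rater_b)   # observed 0.75, expected 0.5
print(kappa, interpret_kappa(kappa))     # 0.5 Moderate agreement
```

In this example the raters agree on 6 of 8 items (75%), but half of that agreement would be expected by chance, so kappa is only 0.5.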

Viewing Metrics¶

Metrics are available at multiple levels:

Per question: What percentage of studies had agreement on this question?
Per study: How many questions had disagreement for this study?
Per stage: The overall agreement summary across all studies

Navigate to the Reconciliation tab and scroll to the Agreement Metrics section. Metrics update as reconciliation progresses.

Exporting Metrics¶

Click Export Metrics to download a CSV file containing per-question and per-study agreement statistics. This data is ready for inclusion in systematic review publications. Journals and funding bodies increasingly expect inter-rater agreement statistics, and these exports provide them directly.
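If you want to assemble a comparable file from your own calculations, a CSV of agreement statistics can be produced with Python's standard `csv` module. The column names below are hypothetical; the actual layout of SyRF's export is not specified in this guide.

```python
import csv
import io

# Hypothetical column layout, not the documented export format.
FIELDNAMES = ["level", "id", "percent_agreement", "cohens_kappa"]


def metrics_to_csv(rows: list[dict[str, object]]) -> str:
    """Serialise agreement statistics to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"level": "question", "id": "q1", "percent_agreement": 84.0, "cohens_kappa": 0.5},
    # Kappa left empty where it is not computed (for example, free-text questions).
    {"level": "question", "id": "q2", "percent_agreement": 60.0, "cohens_kappa": ""},
]
csv_text = metrics_to_csv(rows)
```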

Using Agreement Metrics Effectively¶

Agreement metrics are not just for publication -- they are a quality management tool:

Identify problematic questions. If a question has notably low agreement, it may be ambiguously worded. Consider revising the question text in a new version (see Question Lifecycle).
Identify annotators who may need training. If one annotator's answers consistently diverge from others, they may benefit from additional guidance on the annotation protocol.
Monitor quality over time. Track agreement across reconciliation sessions. If agreement drops, investigate whether the protocol has become unclear or whether new team members need onboarding.

Workflow Summary¶

```mermaid
flowchart TD
    A[Two or more annotators complete a study] --> B{Do all annotators agree on every answer?}
    B -->|Yes| C[Eligible for bulk approve]
    B -->|No| D[Study enters reconciliation pool]
    D --> E[System randomly assigns study to a reconciler]
    E --> F[Reconciler sees anonymised answers side-by-side]
    F --> G[Reconciler records their own answer for every question]
    G --> H[Gold-standard record created]
    C --> I[Admin bulk-approves unanimous studies]
    I --> H

    style H fill:#c8e6c9
```

Project Groups -- configure who has the Reconcile permission
Reconciliation Model -- the data structures behind reconciliation
Screening Annotations -- screening disagreements follow the same reconciliation pattern
Feature Brief
Reconciliation Specification
Platform Architecture