Screening Profiles

Overview

Feature Name: Screening Profiles
Target Users: Project Administrators, Reviewers
Business Value: Enable multi-stage screening pipelines within a single project, eliminating workarounds and preserving audit trails
Phase: Phase 1 (with Stage Filtering)


Problem & Solution

SyRF's current screening model supports only a single, project-wide set of inclusion/exclusion criteria. This forces teams conducting multi-stage reviews (e.g., Title/Abstract followed by Full-Text) into a workaround: creating separate SyRF projects for each screening phase, then manually exporting and re-importing studies between them. This fragments the audit trail, duplicates administrative effort, and makes collaboration harder.

Screening Profiles solve this by decoupling screening criteria from their application. A Screening Profile is a named, reusable configuration containing:

  • Criteria text (inclusion and exclusion criteria)
  • Agreement mode (how individual decisions are aggregated: Single, Dual Manual, Dual Automated)
  • Notes (optional admin context)

Profiles are defined at the project level and assigned to one or more stages. A project can have multiple profiles — e.g., "Title/Abstract Criteria" and "Full-Text Eligibility Criteria" — enabling a complete multi-stage pipeline within a single project.

Key design principle: Immutable once used. Once any screening decision has been recorded against a profile, that profile can no longer be edited or deleted. To revise criteria, admins clone the profile (which pre-fills a new creation form), edit the copy, and assign it to stages going forward. The original profile and all its decisions remain intact. This protects scientific integrity: decisions are always traceable to the exact criteria under which they were made.


Key Concepts & Data Model

Screening Decision vs Screening Outcome

These are distinct concepts:

  • A Screening Decision is an individual reviewer's vote (Include or Exclude) on a study, recorded against a specific Screening Profile.
  • A Screening Outcome is the aggregate result derived from all screening decisions on that study for that profile, resolved via the profile's agreement mode. Possible values: Included | Excluded | Conflict | Pending.

The Screening Outcome is what downstream systems act on — stage filtering references outcomes, not individual decisions. A study's outcome is Pending until enough decisions exist to satisfy the agreement mode, then resolves to Included, Excluded, or Conflict.
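
The resolution rule can be sketched roughly as follows. This is a minimal illustration, not the SyRF implementation: the function name `resolve_outcome` is hypothetical, the mode names are taken from the workflow diagram later in this document, and the treatment of a split vote under automated dual screening (kept Pending until a tie-breaking vote arrives, then decided by majority) is an assumption, since the source only states that a third reviewer breaks the tie.

```python
from collections import Counter

# Mode names follow the workflow diagram; thresholds below are assumptions.
SINGLE = "SingleScreening"
DUAL_AUTOMATED = "AutomatedDualScreening"
DUAL_MANUAL = "ManualDualScreening"


def resolve_outcome(mode: str, decisions: list[str]) -> str:
    """Derive a Screening Outcome from individual "Included"/"Excluded" votes."""
    if mode not in (SINGLE, DUAL_AUTOMATED, DUAL_MANUAL):
        raise ValueError(f"unknown agreement mode: {mode}")

    votes = Counter(decisions)
    included, excluded = votes["Included"], votes["Excluded"]

    if mode == SINGLE:
        # One decision is enough; with none, the study is still Pending.
        return decisions[0] if decisions else "Pending"

    if len(decisions) < 2:
        return "Pending"  # both dual modes need at least two votes

    if included == 0 or excluded == 0:
        return "Included" if included else "Excluded"  # unanimous

    if mode == DUAL_MANUAL:
        return "Conflict"  # disagreement waits for manual resolution

    # DUAL_AUTOMATED (assumption): a tie waits for another vote, else majority wins.
    if included == excluded:
        return "Pending"
    return "Included" if included > excluded else "Excluded"
```

For example, two opposing votes give Conflict under `ManualDualScreening` but stay Pending under `AutomatedDualScreening` in this sketch until a third reviewer votes.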

Status mapping from legacy:

  • Legacy InsufficientlyScreened → Pending
  • Legacy Disagree → Conflict

Data Model (MongoDB)

Screening Profiles are embedded in the Project document:

Project {
  screeningProfiles: [
    { id, name, criteriaText, agreementMode, notes, createdBy, createdAt, used: bool }
  ]
}

The used flag is set to true when the first screening decision is recorded against this profile. Once used = true, the profile is immutable.
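
One way to picture the immutability rule and the clone path is the sketch below. The class and function names (`ScreeningProfile`, `ProfileLockedError`, `edit_criteria`, `clone`) are hypothetical and chosen for illustration; only the `used` flag and its semantics come from the data model above.

```python
from dataclasses import dataclass, replace
from uuid import uuid4


class ProfileLockedError(Exception):
    """Raised when an edit targets a profile that already has decisions."""


@dataclass(frozen=True)
class ScreeningProfile:
    id: str
    name: str
    criteria_text: str
    agreement_mode: str
    notes: str = ""
    used: bool = False  # set to True when the first decision is recorded


def edit_criteria(profile: ScreeningProfile, criteria_text: str) -> ScreeningProfile:
    """Return an edited copy, refusing once any decision has been recorded."""
    if profile.used:
        raise ProfileLockedError(f"profile {profile.name!r} is in use")
    return replace(profile, criteria_text=criteria_text)


def clone(profile: ScreeningProfile, new_name: str) -> ScreeningProfile:
    """Copy a profile into a fresh, unused, independently editable one."""
    return replace(profile, id=str(uuid4()), name=new_name, used=False)
```

Cloning a locked profile yields a new entity with `used = False`, so the copy can be edited while the original and its decisions stay untouched.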

Screening Decisions and Outcomes are embedded in the Study document:

Study {
  screeningOutcomes: [
    {
      profileId,
      status: "Included" | "Excluded" | "Conflict" | "Pending",
      decisions: [ { reviewerId, decision, timestamp } ],
      decisionCount,              // materialized len(decisions) — efficient threshold queries
      updatedAt
    }
  ]
}

This embedding ensures atomic updates — a new decision and the recomputed outcome are written in a single document operation.
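
As a rough sketch of what such a single-document write could look like, the helper below builds a MongoDB filter and update document using the positional `$` operator. The helper name `build_decision_update` is hypothetical. The sketch assumes an outcome entry for the profile already exists on the study, and that the caller has already computed `new_status`; a real implementation would also need to handle the first decision and concurrent writers.

```python
from datetime import datetime, timezone


def build_decision_update(study_id, profile_id, reviewer_id, decision, new_status):
    """Return (filter, update) for one atomic single-document write."""
    now = datetime.now(timezone.utc)
    # Matching on profileId binds the positional `$` to that outcome entry.
    query = {"_id": study_id, "screeningOutcomes.profileId": profile_id}
    update = {
        # Append the individual reviewer's decision.
        "$push": {
            "screeningOutcomes.$.decisions": {
                "reviewerId": reviewer_id,
                "decision": decision,
                "timestamp": now,
            }
        },
        # Keep the materialized count in step with the array length.
        "$inc": {"screeningOutcomes.$.decisionCount": 1},
        # Store the recomputed outcome alongside the new decision.
        "$set": {
            "screeningOutcomes.$.status": new_status,
            "screeningOutcomes.$.updatedAt": now,
        },
    }
    return query, update
```

With a driver such as PyMongo, the pair would be passed to `collection.update_one(query, update)`, so the decision, the count, and the status change together or not at all.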

Materialized fields: status and decisionCount are pre-computed on write so that MongoDB queries (stats aggregation, selection filters) can operate at the database level without unpacking the decisions array. The legacy codebase materializes additional screening-derived fields for its $facet aggregation pipeline:

Legacy field | New equivalent | Notes
ScreeningInfo.Inclusion (ratio) | status | Pre-computed outcome replaces the ratio
ScreeningInfo.NumberOfScreenings | decisionCount | Direct replacement
ScreeningInfo.InclusionInfo[] | screeningOutcomes[] itself | Per-profile outcome replaces per-stage materialized array
ScreeningInfo.AgreementMeasure | Computable from decisions[] | Materialized only if reporting queries require it

The stats aggregation pipeline (StudyStats.cs) will need to be rewritten to query screeningOutcomes[] via $elemMatch (scoped by profileId) rather than the flat legacy fields. This is a necessary migration cost — the legacy pipeline assumes a single set of screening data per study.
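
To make the profile-scoped query shape concrete, here is a hedged sketch of what the rewritten queries might look like, expressed as plain Python dictionaries. The function names are hypothetical, and the `projectId` field on the Study document is an assumption (the schema above does not show it); the operators `$elemMatch`, `$unwind`, `$match`, and `$group` are standard MongoDB.

```python
def profile_status_counts_pipeline(project_id, profile_id):
    """Aggregation stages counting studies per outcome status for one profile."""
    return [
        # Keep only studies that have an outcome entry for this profile.
        {"$match": {
            "projectId": project_id,  # assumed field name
            "screeningOutcomes": {"$elemMatch": {"profileId": profile_id}},
        }},
        {"$unwind": "$screeningOutcomes"},
        # After unwinding, discard entries that belong to other profiles.
        {"$match": {"screeningOutcomes.profileId": profile_id}},
        {"$group": {"_id": "$screeningOutcomes.status", "count": {"$sum": 1}}},
    ]


def included_with_min_decisions(profile_id, min_decisions):
    """Filter for studies Included under a profile with enough decisions."""
    # $elemMatch requires all three conditions to hold on the SAME array entry.
    return {"screeningOutcomes": {"$elemMatch": {
        "profileId": profile_id,
        "status": "Included",
        "decisionCount": {"$gte": min_decisions},
    }}}
```

Using `$elemMatch` matters here: without it, separate dotted conditions could be satisfied by different entries of `screeningOutcomes[]`, mixing one profile's status with another's count.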

Screening Stats Are Profile-Scoped

A study's screening outcome belongs to its Screening Profile regardless of which stage(s) use that profile. This has important consequences:

  • Screening progress (screened count, pending, conflicts) is aggregated per profile, not per stage
  • Reviewer screening stats (studies screened by investigator X) are scoped by (project, profile) — not by stage
  • Stage-scoped stats exist only for annotation/data-extraction work, not for screening decisions
  • A stage displays screening stats for its assigned profile, but does not own them
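
The consequences above can be illustrated with a small in-memory sketch. The function names and the `profileId` key on a stage are assumptions made for the example; the point is only that stats are keyed by profile and a stage merely looks them up.

```python
from collections import Counter


def profile_progress(studies, profile_id):
    """Count outcomes by status for one profile, ignoring stages entirely."""
    counts = Counter()
    for study in studies:
        for outcome in study.get("screeningOutcomes", []):
            if outcome["profileId"] == profile_id:
                counts[outcome["status"]] += 1
    return counts


def reviewer_progress(studies, profile_id):
    """Count decisions per reviewer for one (project, profile) pair."""
    counts = Counter()
    for study in studies:
        for outcome in study.get("screeningOutcomes", []):
            if outcome["profileId"] == profile_id:
                counts.update(d["reviewerId"] for d in outcome["decisions"])
    return counts


def stage_screening_stats(stage, studies):
    """A stage displays the stats of its assigned profile; it does not own them."""
    return profile_progress(studies, stage["profileId"])
```

Two stages assigned to the same profile therefore report identical screening progress in this sketch, because neither stage contributes anything to the calculation.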

API & Admin Workflow

Admin Workflow

In Project Settings, admins manage Screening Profiles:

  1. Create — Name (must be unique within the project), criteria text, agreement mode, optional notes
  2. Edit/Delete — Only while used = false (no decisions recorded). Profile shows an "In Use" indicator once locked.
  3. Clone — Pre-fills a new profile creation form from source. Independent entity.
  4. Assign — Link a profile to one or more stages

Screening Decision Submission

The current endpoint pattern is preserved — the study is a path resource:

POST /api/projects/{projectId}/stages/{stageId}/studies/{studyId}/review
Body: "Included" | "Excluded"
Response: NextStudyResponseDto (next study + review status + stats)

This is a combined submit-and-get-next design: submitting a decision atomically records it and returns the next study, avoiding an extra round-trip.

Under the new model, the server-side behavior evolves:

  1. Resolve the stage's active Screening Profile
  2. Record a Screening Decision for this (study, profile, reviewer) tuple
  3. Recompute the Screening Outcome for this (study, profile) using the profile's agreement mode
  4. Select and return the next study from the stage's study pool
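
The four steps can be sketched as a single handler over an in-memory store. Everything here is illustrative: the `store` layout, the function name `submit_review`, and the injected `resolve` and `select_next` callables are assumptions, and outcomes are keyed by profile id in a dictionary for brevity rather than stored as the array shown in the data model.

```python
def submit_review(store, stage_id, study_id, reviewer_id, decision,
                  resolve, select_next):
    """Record a decision, recompute the outcome, and return the next study."""
    # 1. Resolve the stage's active Screening Profile.
    profile = store["profiles"][store["stages"][stage_id]["profileId"]]

    # 2. Record a Screening Decision for this (study, profile, reviewer) tuple.
    study = store["studies"][study_id]
    outcome = study["screeningOutcomes"].setdefault(
        profile["id"], {"status": "Pending", "decisions": []}
    )
    outcome["decisions"].append({"reviewerId": reviewer_id, "decision": decision})
    profile["used"] = True  # the first decision locks the profile

    # 3. Recompute the Screening Outcome using the profile's agreement mode.
    votes = [d["decision"] for d in outcome["decisions"]]
    outcome["status"] = resolve(profile["agreementMode"], votes)

    # 4. Select and return the next study from the stage's study pool.
    return select_next(store, stage_id, reviewer_id)
```

Passing `resolve` and `select_next` in as parameters keeps the sketch independent of any particular agreement rule or study-selection strategy.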

The screening annotation reconciliation endpoint creates a Reconciled Screening — the authoritative decision + screening annotation for the study (see Screening Annotations for the full model):

POST /api/projects/{projectId}/stages/{stageId}/studies/{studyId}/reconcile
Body: "Included" | "Excluded"
Auth: StageReconcilePolicy

This is distinct from data extraction reconciliation (a future feature), which would create a Reconciled Annotation Session. The two reconciliation acts produce different entities; see the terminology note in the end-to-end workflow below.

Profile Management API

Method | Path | Description
GET | /api/projects/{projectId}/screeningProfiles | List all profiles
POST | /api/projects/{projectId}/screeningProfiles | Create profile
PUT | /api/projects/{projectId}/screeningProfiles/{id} | Edit (409 if used)
DELETE | /api/projects/{projectId}/screeningProfiles/{id} | Delete (409 if used)
POST | /api/projects/{projectId}/screeningProfiles/{id}/clone | Clone to new profile
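
The "409 if used" behavior for PUT and DELETE might be handled along these lines. This is a sketch over an in-memory dictionary with hypothetical function names; only the 409 response comes from the table above, while the 200, 204, and 404 codes and the error messages are assumptions based on common REST conventions.

```python
def update_profile(profiles, profile_id, changes):
    """PUT sketch: returns (status_code, body)."""
    profile = profiles.get(profile_id)
    if profile is None:
        return 404, {"error": "profile not found"}  # assumed status code
    if profile["used"]:
        # Locked profiles are immutable; the caller should clone instead.
        return 409, {"error": "profile is in use; clone it to revise criteria"}
    profile.update(changes)
    return 200, profile


def delete_profile(profiles, profile_id):
    """DELETE sketch: returns (status_code, body)."""
    profile = profiles.get(profile_id)
    if profile is None:
        return 404, {"error": "profile not found"}  # assumed status code
    if profile["used"]:
        return 409, {"error": "profile is in use and cannot be deleted"}
    del profiles[profile_id]
    return 204, None
```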

Backwards compatibility: The /review endpoint shape is unchanged — the only difference is that decisions are now recorded under a profile (resolved from the stage) rather than globally. Legacy projects (without profiles) continue to work as-is.


Migration & Legacy Compatibility

Approach: Just-in-Time Profile Creation

Rather than a migration wizard, profiles are created on-demand when admins first need them:

  1. Screening Profiles appear in Project Settings for all projects — no feature flag, no "enable" step
  2. When the admin creates their first profile, if the project has existing screening data, the system shows a clear message:

    "Your project has X studies with screening decisions. These will be associated with this profile so they're available for stage filtering. Your existing data is not changed — this adds new metadata alongside it."

  3. Admin confirms → Profile is created, background job writes screeningOutcomes[] entries (additive only)
  4. Admin continues configuring their stage and filter set. If the migration is still running for large projects, the filter builder shows "Importing screening data... (85%)" with a live count preview that updates as data becomes available.
  5. No rollback UI needed — if the admin deletes the profile (only possible if unused), the associated screeningOutcomes data is cleaned up automatically.

For new projects: Profiles are just there from the start. No migration, no special messaging.

For existing projects with no screening data: Creating a profile is just creating a profile. No migration messaging.

Three-Phase Rollout

Phase 1: Just-in-time (at launch)

  • Migration happens per-project when an admin first creates a Screening Profile
  • Active projects self-select: the ones that need the feature migrate first
  • Each migration is small, observable, and reversible
  • Gives real-world validation across varied project sizes and data shapes

Phase 2: Bulk migration (once proven safe)

  • After N projects have migrated without issues (or after a time threshold)
  • Background job migrates all remaining legacy projects
  • Same additive operation: writes screeningOutcomes[] alongside existing ScreeningInfo
  • Can be batched, monitored, paused if issues arise
  • No user-facing impact: the data is additive and doesn't change existing behavior

Phase 3: Deprecate legacy code (after bulk migration)

  • All projects now have screeningOutcomes data
  • Remove dual read paths (ScreeningInfo vs screeningOutcomes)
  • Eventually archive or drop ScreeningInfo from the schema
  • Single code path going forward, which is simpler to maintain

The key metric for triggering Phase 2: zero data discrepancies across the first N just-in-time migrations (verified by automated count checks comparing ScreeningInfo totals with screeningOutcomes totals).
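
An automated count check of this kind could look like the sketch below. It assumes the comparison is made per study between the legacy `ScreeningInfo.NumberOfScreenings` and the migrated `decisionCount` for the profile (the mapping table earlier calls these a direct replacement); the function name and the `id` key on each study are hypothetical.

```python
def count_discrepancies(studies, profile_id):
    """Return ids of studies whose legacy and migrated decision counts differ."""
    mismatched = []
    for study in studies:
        # Legacy total; studies never screened are treated as zero.
        legacy = study.get("ScreeningInfo", {}).get("NumberOfScreenings", 0)
        # Migrated total for this profile only.
        migrated = sum(
            outcome["decisionCount"]
            for outcome in study.get("screeningOutcomes", [])
            if outcome["profileId"] == profile_id
        )
        if legacy != migrated:
            mismatched.append(study["id"])
    return mismatched
```

An empty result for every just-in-time migration would correspond to the "zero data discrepancies" condition for starting Phase 2.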


Success Criteria

  • 80% fewer new projects requiring the multi-project workaround for Title/Abstract → Full-Text (TA→FT) screening
  • ≤ 5 minutes to configure a two-stage pipeline (profile + filter set + stage settings)
  • Reviewer throughput no worse than legacy; select_next p95 < 400 ms
  • Clean audit trail for screening decisions; no data loss during migration

End-to-End Screening Workflow

Terminology: Reconciliation Acts vs Workflow

The screening pipeline involves two distinct reconciliation acts — each producing a different entity:

Reconciliation Act | Entity Created | When | Authority
Screening annotation reconciliation | Reconciled Screening (decision + screening annotation) | Reconciler reviews candidate screenings for a study under a Screening Profile | Determines the Final Screening Outcome
Data extraction reconciliation | Reconciled Annotation Session (authoritative data extraction answers) | Reconciler reviews candidate annotation sessions for an included study | Determines the authoritative data extraction record

These acts are always conceptually distinct — they produce different entities with different purposes. However, the reconciliation workflow (the user experience) can evolve:

  • MVP: Separate workflows — screening annotation reconciliation is implemented first; data extraction reconciliation is a future feature with its own design process
  • Future: An integrated reconciler workbench could present both acts in a single session (e.g., shared assignment queues, unified study view), while preserving the distinction between the two entity types. This integration is a UX concern — it does not merge the underlying data models

The diagram below uses these terms precisely. "Reconciliation" without a qualifier refers to the workflow context (e.g., a selection mode); the specific act is always named explicitly.

Workflow Diagram

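The full screening workflow involves three distinct reconciliation processes: automatic screening decision reconciliation (step 2) and the two manual reconciliation acts described above (steps 3 and 6).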

1. SCREENING: Reviewers screen study under Profile X
   ├── Each reviewer makes Include/Exclude decision
   └── If Exclude + Screening Annotations configured → fill annotation form

2. SCREENING DECISION RECONCILIATION (automatic)
   ├── Handled by agreement mode — no special role needed
   ├── SingleScreening: 1 decision = done
   ├── AutomatedDualScreening: 2 agree = done; disagree = 3rd reviewer tie-breaks
   └── ManualDualScreening: 2 agree = done; disagree = manual resolution
   Result: Screening Outcome (Included | Excluded | Conflict | Pending)

3. SCREENING ANNOTATION RECONCILIATION (manual — new feature)
   Only applies when: Screening Annotations configured + reconciliation enabled
   ├── Studies enter Screening Annotation Reconciliation Pool
   │   (configurable criteria; default: all studies with screening annotations)
   ├── Reconciler reviews candidate screening annotations side-by-side
   ├── Creates Reconciled Screening (decision + screening annotation)
   │   └── Can confirm OR override the automatic decision from step 2
   └── Reconciled Screening → Final Screening Outcome

   OR (if study bypasses pool due to agreement criteria):
   └── Agreement-derived outcome → Final Screening Outcome
       (screening annotation truncated at disagreement branches)

   IF no screening annotations configured:
   └── Step 2 outcome → Final Screening Outcome directly

4. STAGE FILTERING: Final Screening Outcome feeds Stage Study Pools
   └── e.g., Full-Text stage pool = "studies where Profile X outcome = Included"

5. DATA EXTRACTION: Included studies go through annotation
   ├── Reviewers answer regular annotation questions (data extraction)
   └── Candidate annotation sessions created per SessionCountTarget

6. REGULAR ANNOTATION RECONCILIATION (future feature — not yet designed)
   Only applies to included studies with completed candidate annotation sessions
   ├── Studies enter Regular Annotation Reconciliation Pool
   ├── Reconciler reviews candidate annotation sessions side-by-side
   ├── Creates Reconciled Annotation Session (authoritative data extraction answers)
   └── Reconciled Annotation Session used for final data export

   Note: Steps 3 and 6 are distinct reconciliation ACTS that produce
   different entities (Reconciled Screening vs Reconciled Annotation
   Session). For MVP, they also have separate WORKFLOWS — separate
   pools, assignment queues, and UX. When data extraction reconciliation
   is designed, the two workflows could be integrated into a unified
   reconciler workbench (shared assignment queues, single study view
   covering both screening annotations and data extraction annotations).
   Such integration is a UX improvement — it does not merge the
   underlying entity types or data models.

7. OUTPUTS
   ├── PRISMA reporting ← Final Screening Outcome (from step 3)
   ├── Data exports ← Reconciled Annotation Sessions (from step 6) or candidate data
   └── Stage Study Pools ← Final Screening Outcome (from step 3)
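
Steps 3 and 4 of the diagram can be sketched as a pool filter over Final Screening Outcomes. This is an illustration only: `reconciledDecision` is a hypothetical field standing in for the decision carried by a Reconciled Screening, and the function names are not taken from the codebase.

```python
def final_outcome(outcome):
    """Prefer the Reconciled Screening; else use the agreement-derived status."""
    # `reconciledDecision` is a hypothetical field name used only in this sketch.
    return outcome.get("reconciledDecision") or outcome["status"]


def stage_study_pool(studies, profile_id, required="Included"):
    """Ids of studies whose final outcome under the profile matches `required`."""
    pool = []
    for study in studies:
        for outcome in study.get("screeningOutcomes", []):
            if (outcome["profileId"] == profile_id
                    and final_outcome(outcome) == required):
                pool.append(study["id"])
                break  # one matching entry per study is enough
    return pool
```

Under these assumptions, a Full-Text stage pool would be `stage_study_pool(studies, profile_x_id)`, and a reconciler's override of the automatic decision changes pool membership without touching the individual decisions.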