SyRF Platform Architecture

Reading time: ~30 minutes. This is the primary architecture reference for the SyRF platform. For detailed build dependencies, see dependency-map.md. For MongoDB collection details, see mongodb-reference.md. For PRISMA 2020 data model constraints, see ../features/prisma-specification/.

1. What Is SyRF

SyRF (Systematic Review Facility) is a web platform for conducting systematic reviews and meta-analyses of preclinical research. It is developed by the CAMARADES group (Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies) and used by research teams worldwide to synthesise evidence from animal studies.

The Problem SyRF Solves

A systematic review requires a research team to:

1. Search multiple bibliographic databases for potentially relevant studies
2. Remove duplicate citations across those databases
3. Screen thousands of titles and abstracts against inclusion criteria
4. Retrieve and assess full-text reports for eligible studies
5. Annotate (extract data from) included studies using structured questions
6. Reconcile disagreements between independent annotators
7. Export structured data and produce a PRISMA 2020 flow diagram documenting the review process

Each step involves multiple researchers working independently, producing decisions that must be auditable, reproducible, and traceable. SyRF provides the collaborative infrastructure for all of these steps.

Core Workflow

flowchart LR
    A[Import Citations] --> B[Deduplicate]
    B --> C[Screen]
    C --> D[Annotate]
    D --> E[Reconcile]
    E --> F[Export & PRISMA]

    style A fill:#e1f5fe
    style B fill:#e1f5fe
    style C fill:#fff3e0
    style D fill:#fff3e0
    style E fill:#e8f5e9
    style F fill:#e8f5e9

Import: Users upload citation files (RIS, PubMed XML, CSV, etc.) from systematic searches. Each citation becomes a study in the project.

Deduplicate: [TARGET] The ASySD algorithm (R subprocess) detects duplicate citations. High-confidence duplicates are auto-confirmed; probable duplicates are queued for admin review. A canonical study is enriched with best-of-breed metadata from all import records.

Screen: Reviewers evaluate studies against structured inclusion/exclusion criteria at title/abstract and full-text levels. Screening decisions are per-profile, allowing multi-stage pipelines.

Annotate: Annotators answer versioned question sets for each study. Per-question auto-save creates individual annotation versions; submitting creates an immutable session version.

Reconcile: When multiple annotators disagree, a reconciler is randomly assigned to resolve disagreements by creating their own independent answers, producing a gold-standard reconciliation session version.

Export & PRISMA: Structured data export includes reconciliation status, version references, and agreement metrics. A PRISMA 2020 flow diagram is auto-generated from lifecycle and screening data.

Current state: Import and basic annotation/screening work today. Deduplication, versioned questions, reconciliation, screening profiles, and PRISMA reporting are being built across three releases (see ROADMAP.md).

For a product-level framing of the annotation and reconciliation capabilities, see the product overview.

2. Service Architecture

SyRF is a microservices-based platform following Domain-Driven Design principles. Five services collaborate to deliver the platform's capabilities.

graph TB
    subgraph "Client"
        Web["Web Service<br/>(Angular 21 SPA)<br/>Port 80"]
    end

    subgraph "Backend Services"
        API["API Service<br/>(REST Gateway)<br/>Port 8080"]
        PM["Project Management Service<br/>(Core Domain Logic)<br/>Port 8081"]
        Quartz["Quartz Service<br/>(Background Jobs)<br/>Port 8082"]
    end

    subgraph "External"
        S3N["S3 Notifier<br/>(AWS Lambda)"]
        S3["AWS S3<br/>(File Uploads)"]
    end

    subgraph "Infrastructure"
        RMQ["RabbitMQ<br/>(Message Bus)"]
        Mongo["MongoDB Atlas<br/>(Data Store)"]
        Auth0["Auth0<br/>(Authentication)"]
    end

    Web -->|REST/HTTP| API
    API <-->|MassTransit/RabbitMQ| PM
    API --> Mongo
    PM --> Mongo
    Quartz --> Mongo
    PM <-->|MassTransit/RabbitMQ| RMQ
    Quartz -->|MassTransit/RabbitMQ| RMQ
    S3 -->|S3 Event| S3N
    S3N -->|Publishes to| RMQ
    API --> Auth0

    classDef service fill:#bbdefb,stroke:#1976d2
    classDef infra fill:#c8e6c9,stroke:#388e3c
    classDef external fill:#fff9c4,stroke:#f9a825
    classDef client fill:#e1bee7,stroke:#7b1fa2

    class Web client
    class API,PM,Quartz service
    class RMQ,Mongo,Auth0 infra
    class S3N,S3 external

API Service (`syrf-api`)

The REST gateway for all client interactions. Handles authentication (Auth0 JWT), request routing, and response aggregation. The API service is intentionally thin -- it delegates business logic to the Project Management service via MassTransit messages.

- Technology: .NET 10.0, ASP.NET Core
- Port: 8080
- Database: Direct MongoDB access (read-heavy queries)
- Key responsibility: REST API surface, authentication/authorization, request routing

Project Management Service (`syrf-project-management`)

The core domain service -- the "brain" of SyRF. All business logic for projects, studies, annotations, screening, reconciliation, and question management lives here. It follows DDD patterns with aggregate roots (Project, Study, Investigator) and MongoDB repositories.

- Technology: .NET 10.0, DDD architecture
- Port: 8081
- Database: Direct MongoDB access (domain collections)
- Communication: MassTransit/RabbitMQ (receives commands from API, publishes domain events)
- Key responsibilities: Project CRUD, study management, annotation processing, screening workflows, reconciliation, question management, import pipeline orchestration (MassTransit sagas)

Quartz Service (`syrf-quartz`)

Background job scheduling and processing. Handles data exports, cleanup tasks, and scheduled operations.

- Technology: .NET 10.0, Quartz.NET
- Port: 8082
- Database: SQL Server (job storage), MongoDB (domain data reads)
- Key responsibilities: Scheduled reports, data export processing, cleanup tasks

Web Service (`syrf-web`)

The Angular 21 single-page application served by NGINX. Communicates exclusively with the API service, using REST for requests and SignalR for real-time notifications. Uses ngrx for state management.

- Technology: Angular 21, Material, ngrx, rxjs
- Port: 80
- Key responsibilities: User interface, form rendering, real-time updates

S3 File Saved Notifier (`syrf-s3-notifier`)

An AWS Lambda function that receives S3 upload events when users upload citation files and publishes messages to RabbitMQ, triggering the import pipeline in the Project Management service.

- Technology: .NET 10.0, AWS Lambda
- Region: eu-west-1
- Key responsibility: Bridge between S3 file uploads and the MassTransit message bus
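The bridging step can be illustrated with a minimal sketch. It is written in TypeScript purely for illustration (the actual notifier is a .NET Lambda), and the `CitationFileUploaded` message shape and the function names are hypothetical. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the standard S3 event notification format, in which object keys arrive URL-encoded with spaces written as `+`.

```typescript
// Relevant subset of an S3 event notification record.
interface S3EventRecord {
  eventName: string; // e.g. "ObjectCreated:Put"
  s3: { bucket: { name: string }; object: { key: string; size?: number } };
}

interface S3Event {
  Records: S3EventRecord[];
}

// Hypothetical message contract; the real contract is not described in this document.
interface CitationFileUploaded {
  bucket: string;
  key: string;
  sizeBytes: number | null;
}

// S3 URL-encodes object keys in notifications and writes spaces as "+".
function decodeS3Key(rawKey: string): string {
  return decodeURIComponent(rawKey.replace(/\+/g, " "));
}

// Keep only object-creation events and map each one to a bus message.
function toUploadMessages(s3Event: S3Event): CitationFileUploaded[] {
  return s3Event.Records
    .filter((record) => record.eventName.indexOf("ObjectCreated:") === 0)
    .map((record) => ({
      bucket: record.s3.bucket.name,
      key: decodeS3Key(record.s3.object.key),
      sizeBytes: record.s3.object.size ?? null,
    }));
}
```

Publishing the resulting messages to RabbitMQ is left out, since the exchange and contract names are not given here.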

Communication Patterns

| Pattern | Path | Protocol | Purpose |
| --- | --- | --- | --- |
| REST | Web → API | HTTP/JSON | All client-to-server communication |
| Message Bus | API ↔ PM | MassTransit/RabbitMQ | Command/event exchange between services |
| Message Bus | S3 Notifier → PM | RabbitMQ | File upload event notification |
| Direct DB | API, PM, Quartz → MongoDB | MongoDB Driver | Data access |
| Real-time | API → Web | SignalR (WebSocket) | Push notifications (MongoDB change streams) |

For build-time dependency details (shared libraries, Docker contexts, CI triggers), see dependency-map.md.

3. Three-Level Data Model

SyRF separates bibliographic data into three levels, each serving a distinct purpose in the systematic review pipeline.

erDiagram
    PUBLICATION ||--o{ CITATION : "linked from"
    STUDY ||--|{ CITATION : "embeds"
    STUDY }o--|| PUBLICATION : "canonical ref"

    PUBLICATION {
        Guid id PK
        string doi UK "unique sparse"
        string pmid UK "unique sparse"
        string canonicalTitle
        string canonicalAuthors
        string canonicalAbstract
        int canonicalYear
        array metadataProvenance
        array linkedProjectIds
    }

    CITATION {
        Guid id PK
        Guid publicationId FK
        Guid projectId
        Guid systematicSearchId
        SearchSourceType sourceType
        string sourceName
        string rawTitle
        string rawAuthors
        string rawAbstract
        DateTime importedAt
    }

    STUDY {
        Guid id PK
        Guid projectId
        Guid publicationId FK
        StudyLifecycleStatus lifecycleStatus
        FullTextStatus fullTextStatus
        Guid duplicateGroupId
        array citations "embedded"
        array screeningOutcomes
        bool metaAnalysisIncluded
    }

Publication (System-Scoped)

Collection: pmPublication [TARGET -- Phase 12]

The global bibliographic identity. A Publication represents a unique piece of research across the entire SyRF system, not tied to any single project. DOI and PMID have unique sparse indexes for instant deduplication on import.

- Scope: System-wide -- exists independently of any project
- Mutability: Canonical fields are updated when better metadata arrives from any project's imports
- Deletion: Never deleted, even if all linked studies are removed
- Key fields: doi, pmid, canonicalTitle, canonicalAuthors, metadataProvenance[], linkedProjectIds[]

When multiple projects import the same citation, they all link to the same Publication. The Publication accumulates the best metadata from all sources using canonical enrichment rules (prefer longest title, most complete author list, non-empty abstract, etc.). Field-level provenance tracks which Citation provided each canonical value.
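A minimal sketch of how such enrichment rules could be applied, assuming three of the rules named above (longest title, most authors, first non-empty abstract). The function names, the semicolon-separated author format, and the provenance entry shape are illustrative assumptions rather than the specified implementation.

```typescript
interface RawCitation {
  id: string;
  rawTitle: string;
  rawAuthors: string; // assumed to be a semicolon-separated list
  rawAbstract: string;
}

type CanonicalField = "canonicalTitle" | "canonicalAuthors" | "canonicalAbstract";

interface CanonicalMetadata {
  canonicalTitle: string;
  canonicalAuthors: string;
  canonicalAbstract: string;
  metadataProvenance: { field: CanonicalField; citationId: string }[];
}

// Return the citation with the highest score; ties keep the earliest citation.
function pickBest(citations: RawCitation[], score: (c: RawCitation) => number): RawCitation {
  return citations.reduce((best, c) => (score(c) > score(best) ? c : best));
}

function countAuthors(rawAuthors: string): number {
  return rawAuthors.split(";").filter((a) => a.trim() !== "").length;
}

function enrichPublication(citations: RawCitation[]): CanonicalMetadata {
  if (citations.length === 0) throw new Error("at least one citation is required");
  const title = pickBest(citations, (c) => c.rawTitle.trim().length); // longest title
  const authors = pickBest(citations, (c) => countAuthors(c.rawAuthors)); // most authors
  const abstract = pickBest(citations, (c) => (c.rawAbstract.trim() ? 1 : 0)); // first non-empty
  return {
    canonicalTitle: title.rawTitle,
    canonicalAuthors: authors.rawAuthors,
    canonicalAbstract: abstract.rawAbstract,
    // Field-level provenance: which citation supplied each canonical value.
    metadataProvenance: [
      { field: "canonicalTitle", citationId: title.id },
      { field: "canonicalAuthors", citationId: authors.id },
      { field: "canonicalAbstract", citationId: abstract.id },
    ],
  };
}
```

Because each field is chosen independently, one Publication can take its title from one Citation and its author list from another, which is why provenance is tracked per field.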

Citation (Project-Scoped, Immutable)

Storage: Embedded on the Study document as citations[] [TARGET -- Phase 12]

An immutable record of a single citation as imported from a specific source. Citations are never modified after creation -- they preserve the exact raw bibliographic data from the import file.

- Scope: Project-scoped (belongs to one project)
- Mutability: Immutable -- never modified after creation
- Key fields: publicationId, sourceType, sourceName, rawTitle, rawAuthors, rawDoi, importedAt
- PRISMA role: Enables per-source-type counting (PRISMA boxes 2, 11) because each Citation preserves its source classification

After deduplication, multiple Citations that represent the same research are linked to the same Study. The Citation count per Study enables the "records identified vs. studies included" distinction that PRISMA requires.

Study (Project-Scoped, Mutable)

Collection: pmStudy (existing)

The reviewable entity that annotators, screeners, and reconcilers interact with. A Study carries lifecycle status, screening outcomes, annotation sessions, and reconciliation sessions.

- Scope: Project-scoped (belongs to one project)
- Mutability: Mutable -- evolves as the review progresses
- Key fields: lifecycleStatus, publicationId, citations[], screeningOutcomes[], fullTextStatus, metaAnalysisIncluded
- PRISMA role: Lifecycle status feeds terminal-state boxes (3, 6-7, 10, 16). Screening outcomes feed screening boxes (4-5, 8-9, 14-15).

Current State vs. Target

| Level | Today | Target (Phase 12+) |
| --- | --- | --- |
| Publication | Does not exist | `pmPublication` collection with DOI/PMID indexes |
| Citation | Does not exist | Embedded on Study as `citations[]` |
| Study | Exists as `pmStudy` -- no lifecycle status, no Publication link, no Citations | Enriched with `lifecycleStatus`, `publicationId`, `citations[]`, `screeningOutcomes[]` |

The three-level model is introduced incrementally: Phase 7 adds forward-compatible fields (nullable sourceType on SystematicSearch); Phase 12 creates the Publication collection and Citation structure; Phase 16 backfills lifecycle status and source types on existing data.

For the complete entity specification, see three-level-data-model.md.

4. Study Lifecycle

Every study in a project has a lifecycle status tracking its position in the systematic review pipeline. This is distinct from screening outcomes, which track per-criteria inclusion/exclusion decisions.

State Machine

stateDiagram-v2
    [*] --> Active : Import

    Active --> Duplicate : Dedup auto-confirm
    Active --> PendingDuplicateReview : Dedup probable match
    Active --> FullTextSought : Retrieval initiated
    Active --> Included : All screening profiles pass
    Active --> Merged : Admin confirms merge
    Active --> RemovedByAutomation : Automation tool
    Active --> RemovedOther : Admin removal

    PendingDuplicateReview --> Duplicate : Admin confirms
    PendingDuplicateReview --> Active : Admin rejects

    Duplicate --> Active : Admin reversal

    FullTextSought --> Active : Full text obtained
    FullTextSought --> FullTextNotRetrieved : Retrieval fails

    state "Terminal States" as terminal {
        Included
        Merged
        RemovedByAutomation
        RemovedOther
        FullTextNotRetrieved
    }

    note right of terminal
        Terminal states are admin-overridable
        but not routinely reversible
    end note

The Nine States

| State | Value | Pool Visibility | PRISMA Box | Description |
| --- | --- | --- | --- | --- |
| Active | 0 | Yes | (working state) | Default. Available for screening and annotation. |
| Duplicate | 1 | No | Box 3 | Confirmed duplicate (auto or admin). |
| PendingDuplicateReview | 2 | No | -- | Probable duplicate awaiting admin review. |
| FullTextSought | 3 | No | Box 6/12 | Full-text retrieval attempted. |
| FullTextNotRetrieved | 4 | No | Box 7/13 | Full text could not be obtained. Terminal. |
| Included | 5 | No (annotation pools: yes) | Box 10/16 | Passed all screening profiles. Terminal. |
| Merged | 6 | No | Box 3 | Merged into canonical study during dedup. Terminal. |
| RemovedByAutomation | 7 | No | Box 3 | Removed by automation tool pre-screening. Terminal. |
| RemovedOther | 8 | No | Box 3 | Removed for other pre-screen reasons. Terminal. |
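The routine transitions in the state diagram can be expressed as a lookup table. The sketch below copies the edges from the diagram; admin overrides of terminal states are deliberately left out because their rules are not given here, and the names `routineTransitions`, `canTransition`, and `isTerminal` are illustrative.

```typescript
type LifecycleStatus =
  | "Active" | "Duplicate" | "PendingDuplicateReview" | "FullTextSought"
  | "FullTextNotRetrieved" | "Included" | "Merged" | "RemovedByAutomation" | "RemovedOther";

// Routine transitions, copied from the state diagram. Terminal states have no outgoing edges.
const routineTransitions: Record<LifecycleStatus, readonly LifecycleStatus[]> = {
  Active: [
    "Duplicate", "PendingDuplicateReview", "FullTextSought", "Included",
    "Merged", "RemovedByAutomation", "RemovedOther",
  ],
  PendingDuplicateReview: ["Duplicate", "Active"],
  Duplicate: ["Active"],
  FullTextSought: ["Active", "FullTextNotRetrieved"],
  FullTextNotRetrieved: [],
  Included: [],
  Merged: [],
  RemovedByAutomation: [],
  RemovedOther: [],
};

// True when the diagram contains a direct edge from one state to the other.
function canTransition(from: LifecycleStatus, to: LifecycleStatus): boolean {
  return routineTransitions[from].indexOf(to) !== -1;
}

// A state is terminal when no routine transition leaves it.
function isTerminal(state: LifecycleStatus): boolean {
  return routineTransitions[state].length === 0;
}
```

Keeping the edges in one table means a rejected transition can be reported with both states named, rather than being scattered across conditionals.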

Critical Invariant: Lifecycle vs. Screening

Lifecycle status tracks pipeline position ONLY. Screening exclusion is NOT a lifecycle status -- it is recorded per-profile on the screeningOutcomes[] array.

This distinction matters because:

- A study can be excluded under one screening profile and included under another in a multi-stage pipeline
- PRISMA uses both systems for different boxes -- lifecycle status feeds boxes 3, 6-7, 10, 16; screening outcomes feed boxes 4-5, 8-9, 14-15
- The lifecycle status transitions to Included ONLY when the study passes ALL required screening profiles
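The "passes ALL required screening profiles" rule can be sketched as a pure check over the outcomes array. The `ScreeningOutcome` shape and the treatment of an empty required-profile list are assumptions made for this example.

```typescript
// Hypothetical shape of one entry in Study.screeningOutcomes[].
interface ScreeningOutcome {
  profileId: string;
  decision: "Included" | "Excluded";
}

// True only when every required profile has an Included outcome.
function passesAllRequiredProfiles(
  outcomes: ScreeningOutcome[],
  requiredProfileIds: string[],
): boolean {
  // Assumption: with no required profiles configured there is nothing to pass yet.
  if (requiredProfileIds.length === 0) return false;
  return requiredProfileIds.every((profileId) =>
    outcomes.some((o) => o.profileId === profileId && o.decision === "Included"),
  );
}
```

An Excluded outcome simply leaves the check false; the study's lifecycle status is not changed by the exclusion itself, in line with the invariant above.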

Pool Visibility Rules

- Only Active studies appear in screening stage pools, and only Active or Included studies appear in annotation stage pools. This is a system invariant.
- PendingDuplicateReview studies are conservatively excluded to prevent wasted screening effort.
- Included studies may appear in annotation pools for data extraction but not in screening pools.
- All other non-Active states exclude the study from all pools.
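These rules reduce to a small predicate. The sketch below assumes two pool kinds, named `screening` and `annotation` for illustration, and takes the lifecycle status as its name from the states table.

```typescript
type PoolKind = "screening" | "annotation";

// Active studies are visible everywhere; Included studies only in annotation pools;
// every other lifecycle status is hidden from all pools.
function isVisibleInPool(lifecycleStatus: string, pool: PoolKind): boolean {
  if (lifecycleStatus === "Active") return true;
  return lifecycleStatus === "Included" && pool === "annotation";
}
```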

Current State vs. Target

The lifecycleStatus field does not exist today. It will be added in Phase 12 as a nullable field and backfilled to Active for all existing studies in Phase 16. All state transition logic is new.

For the complete lifecycle specification including transition rules and edge cases, see study-lifecycle-and-source-taxonomy.md.

5. End-to-End Data Flow

This section traces a citation from import through to PRISMA reporting, showing how data moves through the three-level model and service architecture.

sequenceDiagram
    participant User
    participant Web
    participant API
    participant S3
    participant Lambda as S3 Notifier
    participant RMQ as RabbitMQ
    participant PM as PM Service
    participant Mongo as MongoDB

    Note over User,Mongo: 1. IMPORT

    User->>Web: Upload citation file
    Web->>S3: Upload to S3 bucket
    S3->>Lambda: S3 event notification
    Lambda->>RMQ: Publish upload event
    RMQ->>PM: Deliver to import consumer
    PM->>Mongo: Create Study + Citation
    PM->>Mongo: Create/link Publication (DOI/PMID lookup)

    Note over User,Mongo: 2. DEDUPLICATION [TARGET]

    PM->>PM: Run ASySD algorithm (R subprocess)
    PM->>Mongo: Auto-confirm high-confidence duplicates
    PM->>Mongo: Queue probable duplicates for admin review
    PM->>Mongo: Enrich canonical Publication metadata

    Note over User,Mongo: 3. SCREENING [TARGET]

    User->>Web: Open screening stage
    Web->>API: Request study from pool
    API->>PM: Get study for screening
    PM->>Mongo: Query Active studies in stage pool
    PM-->>API: Return study
    API-->>Web: Display study
    User->>Web: Record screening decision
    Web->>API: Submit screening outcome
    API->>PM: Save screening outcome
    PM->>Mongo: Update Study.screeningOutcomes[]

    Note over User,Mongo: 4. ANNOTATION

    User->>Web: Open annotation form
    Web->>API: Request study + question set
    API->>PM: Get study and current QSV
    PM-->>API: Return study + questions
    User->>Web: Answer questions (auto-save per question)
    Web->>API: Save annotation version
    User->>Web: Submit session
    Web->>API: Create immutable SessionVersion
    PM->>Mongo: Store SessionVersion with AV references

    Note over User,Mongo: 5. RECONCILIATION [TARGET]

    PM->>PM: Detect 2+ completed sessions
    PM->>Mongo: Add study to reconciliation pool
    User->>Web: Open reconciliation page
    Web->>API: Request assignment
    PM->>PM: Random assignment from pool
    PM-->>API: Return study with blinded candidates
    User->>Web: Create reconciler's own answers
    Web->>API: Submit reconciliation
    PM->>Mongo: Create immutable RSV (gold standard)

    Note over User,Mongo: 6. EXPORT & PRISMA [TARGET]

    User->>Web: Request data export / PRISMA diagram
    Web->>API: Trigger export
    API->>PM: Generate export / compute PRISMA counts
    PM->>Mongo: Aggregate lifecycle + screening + dedup data
    PM-->>API: Return export data / PRISMA fields
    API-->>Web: Download / display

Step-by-Step Detail

Step 1: Import

A user uploads a citation file (RIS, PubMed XML, CSV, etc.) through the web interface. The file is stored in AWS S3. The S3 Notifier Lambda detects the upload event and publishes a message to RabbitMQ. The Project Management service's import consumer processes the file:

1. Parses citations from the file
2. Creates one Study per citation in the project
3. [TARGET] Creates one Citation per citation, preserving raw bibliographic data
4. [TARGET] Looks up or creates a Publication by DOI/PMID match
5. [TARGET] Sets the study's lifecycle status to Active

Current state: Studies are created directly from parsed citations. Citations and Publications do not exist yet.

Step 2: Deduplication [TARGET -- Phase 12]

After import, the deduplication service runs:

1. Exact match: Check new Citations against existing Publications by DOI and PMID
2. Fuzzy match: Run the ASySD algorithm (R subprocess via stdin/stdout) for non-exact matches
3. Auto-confirm: High-confidence pairs (above the auto-confirm threshold) are automatically confirmed as duplicates -- the secondary study's lifecycleStatus is set to Duplicate
4. Queue for review: Probable duplicates (below the auto-confirm threshold) are set to PendingDuplicateReview for admin resolution
5. Canonical enrichment: The Publication entity is updated with best-of-breed metadata from all linked Citations
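The auto-confirm and review steps amount to bucketing each candidate pair by its match confidence. The sketch below is a rough illustration only: the threshold values, the lower `reviewFloor` bound for "probable" duplicates, and the use of "at or above" comparisons are all assumptions, since the actual confidence model lives in the deduplication specification.

```typescript
type DedupOutcome = "AutoConfirm" | "QueueForReview" | "NoMatch";

interface DedupThresholds {
  autoConfirm: number; // at or above: confirmed automatically
  reviewFloor: number; // at or above (but below autoConfirm): sent to admin review
}

// Hypothetical values, chosen only to make the example concrete.
const exampleThresholds: DedupThresholds = { autoConfirm: 0.9, reviewFloor: 0.5 };

function classifyPair(confidence: number, thresholds: DedupThresholds): DedupOutcome {
  if (confidence >= thresholds.autoConfirm) return "AutoConfirm";
  if (confidence >= thresholds.reviewFloor) return "QueueForReview";
  return "NoMatch";
}

// Lifecycle status applied to the secondary study of the pair.
function secondaryStatusFor(
  outcome: DedupOutcome,
): "Duplicate" | "PendingDuplicateReview" | "Active" {
  if (outcome === "AutoConfirm") return "Duplicate";
  if (outcome === "QueueForReview") return "PendingDuplicateReview";
  return "Active";
}
```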

Current state: No deduplication exists. Users manually identify duplicates.

Step 3: Screening [TARGET -- Phases 13-15]

Screening uses named profiles that define inclusion/exclusion criteria. Stage filtering computes which Active studies belong to each screening pool. Screeners evaluate studies and record structured decisions with exclusion reasons. Screening reconciliation resolves disagreements using the same random-assignment pattern as annotation reconciliation.

- Each screening decision is stored in Study.screeningOutcomes[] as a per-profile entry
- A FinalScreeningOutcome tracks the authoritative decision (CandidateAgreement or Reconciled)
- When all required screening profiles resolve to Included, the study's lifecycleStatus transitions to Included

Current state: Basic screening exists (binary include/exclude) but without profiles, structured reasons, or reconciliation.

Step 4: Annotation

Annotators answer questions from versioned question sets. The annotation form uses per-question auto-save:

- Each answer creates an AnnotationVersion linked to a specific AQVersion (question version)
- The working session is mutable until submission
- Submitting creates an immutable SessionVersion holding explicit AnnotationVersionIds
- Candidate blinding ensures annotators never see each other's answers

Current state: Annotation works but questions are not versioned (mutable on the project entity), and there is no per-question version tracking.

Step 5: Reconciliation [TARGET -- Phases 9-10]

When two or more annotators have completed sessions for the same study:

1. The system detects the study has multiple completed sessions
2. If all annotators agree on all answers, the study is eligible for bulk approve
3. Otherwise, the study enters the reconciliation pool
4. A reconciler is randomly assigned -- they see anonymised candidate answers ("Annotator A" vs. "Annotator B")
5. The reconciler creates their own answers for every question (even when agreeing)
6. Submitting creates an immutable ReconciliationSessionVersion -- the gold-standard record
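The routing, blinding, and assignment steps above can be sketched as three small functions. Answers are simplified to one string per question, and all names (`routeStudy`, `blindCandidates`, `pickReconciler`) are hypothetical. Who counts as an eligible reconciler is not defined here, so the caller supplies that list.

```typescript
interface CompletedSession {
  annotatorId: string;
  answers: Record<string, string>; // questionId -> answer (simplified)
}

type ReconciliationRoute = "Waiting" | "BulkApprove" | "ReconciliationPool";

function sameAnswers(a: Record<string, string>, b: Record<string, string>): boolean {
  const keysA = Object.keys(a).sort();
  const keysB = Object.keys(b).sort();
  return (
    keysA.length === keysB.length &&
    keysA.every((key, i) => key === keysB[i] && a[key] === b[key])
  );
}

// Fewer than two sessions: keep waiting. Full agreement: bulk approve. Otherwise: pool.
function routeStudy(sessions: CompletedSession[]): ReconciliationRoute {
  if (sessions.length < 2) return "Waiting";
  const unanimous = sessions.every((s) =>
    sessions.every((t) => sameAnswers(s.answers, t.answers)),
  );
  return unanimous ? "BulkApprove" : "ReconciliationPool";
}

// Replace annotator identities with neutral labels: "Annotator A", "Annotator B", ...
function blindCandidates(
  sessions: CompletedSession[],
): { label: string; answers: Record<string, string> }[] {
  return sessions.map((s, i) => ({
    label: "Annotator " + String.fromCharCode(65 + i),
    answers: s.answers,
  }));
}

// Uniform random pick; the random source is injectable so tests can be deterministic.
function pickReconciler(
  eligibleIds: string[],
  random: () => number = Math.random,
): string | undefined {
  return eligibleIds[Math.floor(random() * eligibleIds.length)];
}
```

Note that the blinded candidates carry no `annotatorId`, so the identity cannot leak through the payload sent to the reconciler.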

Current state: No reconciliation workflow exists. Production has zero reconciled annotations.

Step 6: Export & PRISMA [TARGET -- Phase 16]

Data export includes reconciliation status, version references, and agreement metrics. The PRISMA 2020 flow diagram is auto-generated by aggregating lifecycle status, screening outcomes, and dedup counts across the project's studies and import records. All 17 PRISMA boxes (34 fields) are derivable from the three-level model.

Current state: Basic CSV/Google Sheets export exists but without reconciliation data, version references, or PRISMA reporting.

6. Question Management and Versioning

SyRF uses an Identity + Immutable Versions pattern for annotation questions, ensuring full traceability from question creation through annotation to reconciliation.

Core Concepts

Draft Question (DQ): A mutable factory object living on the Project entity. Created and freely edited by project administrators in the QM Design tab. No version history, no audit trail, no downstream dependencies.

Annotation Question (AQ): Created when a stage is enabled (DQ activation). The AQ has a stable identity with fixed structural properties.

Annotation Question Version (AQV): An immutable snapshot of the AQ's content properties. Each edit after activation creates a new version.

Question Set (QS) / Question Set Version (QSV): A QS is an ordered collection of AQVersions assigned to a stage. A QSV is an immutable snapshot -- a specific list of AQVersionIds in a specific order. When annotators work, they see the questions defined by the active QSV.

What Is Fixed vs. What Is Versionable

| Property | Category | Rationale |
| --- | --- | --- |
| Parent question | Structural (fixed) | Changing parent would break hierarchy integrity |
| Data type (boolean, select, etc.) | Structural (fixed) | Changing type would invalidate all existing answers |
| Group-as-single | Structural (fixed) | Changes grouping semantics |
| Question text | Content (versionable) | Wording refinements are common and safe |
| Answer options | Content (versionable) | Adding/removing options is common |
| Help text | Content (versionable) | Guidance evolves with use |
| Answer option filters | Content (versionable) | Filtering logic may need refinement |
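The split between fixed and versionable properties maps naturally onto two types: an identity that never changes and a series of frozen content snapshots. This is a minimal sketch with hypothetical field names; answer option filters are omitted for brevity.

```typescript
// Structural properties: fixed for the lifetime of the question.
interface AnnotationQuestion {
  readonly id: string;
  readonly parentId: string | null;
  readonly dataType: string; // e.g. "boolean", "select"
  readonly groupAsSingle: boolean;
}

// Content properties: captured in immutable, numbered versions.
interface AQVersion {
  readonly aqId: string;
  readonly version: number;
  readonly text: string;
  readonly options: readonly string[];
  readonly helpText: string;
}

// Only content properties can change between versions.
type ContentChanges = Partial<Pick<AQVersion, "text" | "options" | "helpText">>;

// An edit never mutates the previous version; it yields a new frozen snapshot.
function createNextVersion(previous: AQVersion, changes: ContentChanges): AQVersion {
  return Object.freeze({ ...previous, ...changes, version: previous.version + 1 });
}
```

Because `ContentChanges` excludes `aqId` and every structural property, an attempt to change the data type or parent through an edit does not type-check.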

Session Versioning

When an annotator submits their work, a SessionVersion is created that holds:

- Explicit AnnotationVersionIds -- the exact answer versions submitted
- Each AnnotationVersion references its AQVersionId -- which question version it was answered against
- The QSV that was active at the time

This creates a complete audit trail: for any annotation, you can reconstruct exactly what question was asked (AQV), in what context (QSV), and what was answered (AV).
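Following those references can be sketched as a pair of lookups per submitted answer. The record shapes and the in-memory lookup tables below are simplifications for illustration; real data would come from the corresponding MongoDB collections.

```typescript
interface AnnotationVersion {
  id: string;
  aqVersionId: string;
  answer: string;
}

interface QuestionVersion {
  id: string;
  text: string;
}

interface SessionVersion {
  id: string;
  qsvId: string;
  annotationVersionIds: readonly string[];
}

interface AuditEntry {
  question: string; // what was asked (AQV)
  answer: string; // what was answered (AV)
  aqVersionId: string;
  qsvId: string; // in what context (QSV)
}

// Resolve SessionVersion -> AV -> AQV for every submitted answer.
function reconstructAudit(
  session: SessionVersion,
  avById: Record<string, AnnotationVersion>,
  aqvById: Record<string, QuestionVersion>,
): AuditEntry[] {
  return session.annotationVersionIds.map((avId) => {
    const av = avById[avId];
    if (!av) throw new Error(`Unknown AnnotationVersion ${avId}`);
    const aqv = aqvById[av.aqVersionId];
    if (!aqv) throw new Error(`Unknown AQVersion ${av.aqVersionId}`);
    return { question: aqv.text, answer: av.answer, aqVersionId: aqv.id, qsvId: session.qsvId };
  });
}
```

Since every version record is immutable, running this reconstruction later yields the same result as at submission time, even if the question has since been edited.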

Current State vs. Target

| Concept | Today | Target (Phase 4+) |
| --- | --- | --- |
| Questions | Mutable on the Project entity | AQ identity + AQVersion immutable snapshots |
| Question assignment | Question IDs on stage | QSV -- immutable snapshots of ordered AQVersionId lists |
| Annotations | Basic save/submit | Per-question auto-save as AVs; immutable SessionVersion on submit |
| Audit trail | Question text snapshot only | Full: AQV -> AV -> SessionVersion -> QSV |

For the complete question management specification, see Question Management. For annotation versioning details, see the design decisions in design-decisions.md.

7. PRISMA 2020 Integration

PRISMA 2020 (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) defines a standardised flow diagram that systematic reviews must include. SyRF's data model is designed to auto-generate this diagram.

Flow Diagram Structure

The PRISMA 2020 flow diagram has 17 boxes requiring 34 data fields, organised into three phases:

- Identification -- Records discovered from all sources, with duplicates and pre-screen removals
- Screening -- Records screened at title/abstract and full-text levels, split by source column
- Included -- Studies and reports included in the final review and/or meta-analysis

The diagram uses a dual-column layout:

- Column 1 (Databases/Registers): Searches from PubMed, Embase, Scopus, ClinicalTrials.gov, CENTRAL, etc.
- Column 2 (Other Sources): Websites, organisations contacted, citation searching, other methods

Source Type Taxonomy

The SearchSourceType enum determines which PRISMA column a search's records appear in:

| Source Type | PRISMA Column | Examples |
| --- | --- | --- |
| Database | Column 1 | PubMed, Embase, Scopus, PsycINFO |
| Register | Column 1 | ClinicalTrials.gov, CENTRAL, ICTRP |
| Website | Column 2 | Google Scholar, institutional sites |
| Organisation | Column 2 | Organisations contacted for studies |
| CitationSearching | Column 2 | Forward/backward citation chasing |
| Other | Column 2 | Expert contacts, conference abstracts |

When a study has Citations from different source types, the first import's sourceType determines column assignment for screening/inclusion boxes. All Citations are counted in their respective source boxes regardless.
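The column mapping and the first-import rule can be sketched as follows. The mapping table mirrors the taxonomy above; treating `importedAt` as an ISO 8601 UTC string (so that plain string comparison orders imports chronologically) is an assumption of this example.

```typescript
type SearchSourceType =
  | "Database" | "Register" | "Website" | "Organisation" | "CitationSearching" | "Other";

type PrismaColumn = 1 | 2;

const columnBySourceType: Record<SearchSourceType, PrismaColumn> = {
  Database: 1,
  Register: 1,
  Website: 2,
  Organisation: 2,
  CitationSearching: 2,
  Other: 2,
};

interface ImportedCitation {
  sourceType: SearchSourceType;
  importedAt: string; // assumed ISO 8601 in UTC, e.g. "2024-03-01T09:00:00Z"
}

// The earliest import decides the study's column for screening/inclusion boxes.
function columnForStudy(citations: ImportedCitation[]): PrismaColumn | undefined {
  const byImportTime = citations
    .slice()
    .sort((a, b) => (a.importedAt < b.importedAt ? -1 : a.importedAt > b.importedAt ? 1 : 0));
  const first = byImportTime[0];
  return first ? columnBySourceType[first.sourceType] : undefined;
}
```

A study first imported from a database and later found again via a website therefore stays in Column 1, while both Citations still count in their own source boxes.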

How SyRF Data Maps to PRISMA Boxes

Every PRISMA box is derivable from the three-level data model:

| PRISMA Phase | Boxes | Data Source |
| --- | --- | --- |
| Identification | Box 2 (records from databases/registers) | `SystematicSearch.sourceType` + `numberOfCitations` |
| Identification | Box 3 (duplicates, automation removal) | `Study.lifecycleStatus` + `Citation` counts |
| Identification | Box 11 (records from other sources) | `SystematicSearch.sourceType` + `numberOfCitations` |
| Screening | Boxes 4-5 (T/A screening) | `Study.screeningOutcomes[]` per profile |
| Screening | Boxes 6-7 (retrieval) | `Study.fullTextStatus` |
| Screening | Boxes 8-9 (FT screening with reasons) | `Study.screeningOutcomes[]` + `primaryExclusionReason` |
| Screening | Boxes 12-15 (other sources screening) | Same as 4-9 but filtered to Column 2 sources |
| Included | Box 10 (new studies/reports) | `Study.lifecycleStatus = Included` + `citations[]` count |
| Included | Box 16 (total studies/reports) | Same as box 10 (no updated reviews yet) |
| Included | Box 17 (meta-analysis) | `Study.metaAnalysisIncluded = true` |
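As a rough illustration of how a few of these boxes could be derived, the sketch below computes boxes 2, 11, 10, and 17 from simplified summaries. The summary shapes are hypothetical, and reading "reports" in box 10 as the total number of Citations on included studies is one interpretation of the table rather than the authoritative derivation rule.

```typescript
interface SearchSummary {
  sourceType: string;
  numberOfCitations: number;
}

interface StudySummary {
  lifecycleStatus: string;
  citationCount: number; // length of Study.citations[]
  metaAnalysisIncluded: boolean;
}

// Source types that belong to PRISMA Column 1 (databases and registers).
const columnOneSourceTypes = ["Database", "Register"];

function sumOf(values: number[]): number {
  return values.reduce((total, value) => total + value, 0);
}

function prismaCounts(searches: SearchSummary[], studies: StudySummary[]) {
  const isColumnOne = (s: SearchSummary) => columnOneSourceTypes.indexOf(s.sourceType) !== -1;
  const included = studies.filter((s) => s.lifecycleStatus === "Included");
  return {
    box2Records: sumOf(searches.filter(isColumnOne).map((s) => s.numberOfCitations)),
    box11Records: sumOf(searches.filter((s) => !isColumnOne(s)).map((s) => s.numberOfCitations)),
    box10Studies: included.length,
    box10Reports: sumOf(included.map((s) => s.citationCount)), // assumed reading of "reports"
    box17Studies: studies.filter((s) => s.metaAnalysisIncluded).length,
  };
}
```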

For the complete box-by-box derivation rules with exact queries, see prisma-flow-diagram-mapping.md. For source type taxonomy details, see study-lifecycle-and-source-taxonomy.md.

8. Technology Stack

Backend

- .NET 10.0 -- All backend services and shared libraries
- MongoDB -- Primary data store (Atlas M20, shared syrftest database)
- MassTransit -- Message bus abstraction over RabbitMQ
- Quartz.NET -- Job scheduling (SQL Server for job storage)
- R -- [TARGET] ASySD deduplication algorithm (R subprocess via stdin/stdout)
- Auth0 -- Authentication (JWT, policy-based authorization)

Frontend

- Angular 21 -- Single-page application
- Angular Material -- UI component library
- ngrx 20 -- State management
- rxjs 7 -- Reactive programming
- Signal forms (experimental) -- [TARGET] New form infrastructure (@angular/forms/signals)
- NGINX -- Web server for serving the built Angular app

Infrastructure

- GKE (Google Kubernetes Engine) -- Container orchestration (europe-west2)
- ArgoCD -- GitOps continuous delivery
- Helm -- Kubernetes package management
- External Secrets Operator -- GCP Secret Manager to Kubernetes secrets sync
- GitHub Actions -- CI/CD pipeline
- GitHub Container Registry (GHCR) -- Docker image storage
- AWS S3 -- File uploads (eu-west-1)
- AWS Lambda -- S3 event processing

For the full technology reference including version details, deployment configuration, and CI/CD workflow, see CLAUDE.md.

9. Cross-References

Architecture Documents

| Document | Purpose |
| --- | --- |
| dependency-map.md | Build-time service and library dependencies |
| mongodb-reference.md | Database collections, CSUUID format, collection naming |
| llm-navigation-guide.md | Multi-repository navigation for AI assistants |
| CLAUDE.md | Full technology reference and project context |

PRISMA Specification Documents

| Document | Purpose |
| --- | --- |
| PRISMA Specification README | Specification package overview and usage guide |
| prisma-flow-diagram-mapping.md | Box-by-box PRISMA field mapping with derivation rules |
| three-level-data-model.md | Publication/Citation/Study entity specifications |
| study-lifecycle-and-source-taxonomy.md | 9 lifecycle states, 6 source types, PRISMA count derivation |
| deduplication-service-specification.md | ASySD integration, confidence model, enrichment rules |

Feature Specification Documents

| Document | Purpose |
| --- | --- |
| Product Overview | Product-level framing of annotation and reconciliation |
| Design Decisions | 50 authoritative design decisions (D1-D50) |
| Question Management | Versioning, question sets, stage configuration |
| Annotation Form v2 | Signal forms, virtual scroll, auto-save |
| Reconciliation | Workflow, metrics, authority determination |

Planning Documents

| Document | Purpose |
| --- | --- |
| ROADMAP.md | 16-phase delivery plan across 3 releases |
| REQUIREMENTS.md | 101 requirements with phase traceability |
| PROJECT.md | Project context, constraints, key decisions |

This document is the primary architecture reference for the SyRF platform. It replaces system-overview.md as the architecture entry point. For the generic system overview, see the deprecated system-overview.md.
