
SyRF Platform Architecture

Reading time: ~30 minutes. This is the primary architecture reference for the SyRF platform. For detailed build dependencies, see dependency-map.md. For MongoDB collection details, see mongodb-reference.md. For PRISMA 2020 data model constraints, see ../features/prisma-specification/.


1. What Is SyRF

SyRF (Systematic Review Facility) is a web platform for conducting systematic reviews and meta-analyses of preclinical research. It is developed by the CAMARADES group (Collaborative Approach to Meta-Analysis and Review of Animal Data from Experimental Studies) and used by research teams worldwide to synthesise evidence from animal studies.

The Problem SyRF Solves

A systematic review requires a research team to:

  1. Search multiple bibliographic databases for potentially relevant studies
  2. Remove duplicate citations across those databases
  3. Screen thousands of titles and abstracts against inclusion criteria
  4. Retrieve and assess full-text reports for eligible studies
  5. Annotate (extract data from) included studies using structured questions
  6. Reconcile disagreements between independent annotators
  7. Export structured data and produce a PRISMA 2020 flow diagram documenting the review process

Each step involves multiple researchers working independently, producing decisions that must be auditable, reproducible, and traceable. SyRF provides the collaborative infrastructure for all of these steps.

Core Workflow

flowchart LR
    A[Import Citations] --> B[Deduplicate]
    B --> C[Screen]
    C --> D[Annotate]
    D --> E[Reconcile]
    E --> F[Export & PRISMA]

    style A fill:#e1f5fe
    style B fill:#e1f5fe
    style C fill:#fff3e0
    style D fill:#fff3e0
    style E fill:#e8f5e9
    style F fill:#e8f5e9

Import: Users upload citation files (RIS, PubMed XML, CSV, etc.) from systematic searches. Each citation becomes a study in the project.

Deduplicate: [TARGET] The ASySD algorithm (R subprocess) detects duplicate citations. High-confidence duplicates are auto-confirmed; probable duplicates are queued for admin review. A canonical study is enriched with best-of-breed metadata from all import records.

Screen: Reviewers evaluate studies against structured inclusion/exclusion criteria at title/abstract and full-text levels. Screening decisions are per-profile, allowing multi-stage pipelines.

Annotate: Annotators answer versioned question sets for each study. Per-question auto-save creates individual annotation versions; submitting creates an immutable session version.

Reconcile: When multiple annotators disagree, a reconciler is randomly assigned to resolve disagreements by creating their own independent answers, producing a gold-standard reconciliation session version.

Export & PRISMA: Structured data export includes reconciliation status, version references, and agreement metrics. A PRISMA 2020 flow diagram is auto-generated from lifecycle and screening data.

Current state: Import and basic annotation/screening work today. Deduplication, versioned questions, reconciliation, screening profiles, and PRISMA reporting are being built across three releases (see ROADMAP.md).

For a product-level framing of the annotation and reconciliation capabilities, see the product overview.


2. Service Architecture

SyRF is a microservices-based platform following Domain-Driven Design principles. Five services collaborate to deliver the platform's capabilities.

graph TB
    subgraph "Client"
        Web["Web Service<br/>(Angular 21 SPA)<br/>Port 80"]
    end

    subgraph "Backend Services"
        API["API Service<br/>(REST Gateway)<br/>Port 8080"]
        PM["Project Management Service<br/>(Core Domain Logic)<br/>Port 8081"]
        Quartz["Quartz Service<br/>(Background Jobs)<br/>Port 8082"]
    end

    subgraph "External"
        S3N["S3 Notifier<br/>(AWS Lambda)"]
        S3["AWS S3<br/>(File Uploads)"]
    end

    subgraph "Infrastructure"
        RMQ["RabbitMQ<br/>(Message Bus)"]
        Mongo["MongoDB Atlas<br/>(Data Store)"]
        Auth0["Auth0<br/>(Authentication)"]
    end

    Web -->|REST/HTTP| API
    API <-->|MassTransit/RabbitMQ| PM
    API --> Mongo
    PM --> Mongo
    Quartz --> Mongo
    PM <-->|MassTransit/RabbitMQ| RMQ
    Quartz -->|MassTransit/RabbitMQ| RMQ
    S3 -->|S3 Event| S3N
    S3N -->|Publishes to| RMQ
    API --> Auth0

    classDef service fill:#bbdefb,stroke:#1976d2
    classDef infra fill:#c8e6c9,stroke:#388e3c
    classDef external fill:#fff9c4,stroke:#f9a825
    classDef client fill:#e1bee7,stroke:#7b1fa2

    class Web client
    class API,PM,Quartz service
    class RMQ,Mongo,Auth0 infra
    class S3N,S3 external

API Service (syrf-api)

The REST gateway for all client interactions. Handles authentication (Auth0 JWT), request routing, and response aggregation. The API service is intentionally thin -- it delegates business logic to the Project Management service via MassTransit messages.

  • Technology: .NET 10.0, ASP.NET Core
  • Port: 8080
  • Database: Direct MongoDB access (read-heavy queries)
  • Key responsibility: REST API surface, authentication/authorization, request routing

Project Management Service (syrf-project-management)

The core domain service -- the "brain" of SyRF. All business logic for projects, studies, annotations, screening, reconciliation, and question management lives here. It follows DDD patterns with aggregate roots (Project, Study, Investigator) and MongoDB repositories.

  • Technology: .NET 10.0, DDD architecture
  • Port: 8081
  • Database: Direct MongoDB access (domain collections)
  • Communication: MassTransit/RabbitMQ (receives commands from API, publishes domain events)
  • Key responsibilities: Project CRUD, study management, annotation processing, screening workflows, reconciliation, question management, import pipeline orchestration (MassTransit sagas)

Quartz Service (syrf-quartz)

Background job scheduling and processing. Handles data exports, cleanup tasks, and scheduled operations.

  • Technology: .NET 10.0, Quartz.NET
  • Port: 8082
  • Database: SQL Server (job storage), MongoDB (domain data reads)
  • Key responsibilities: Scheduled reports, data export processing, cleanup tasks

Web Service (syrf-web)

The Angular 21 single-page application served by NGINX. All requests to the backend go through the API service over REST; real-time notifications arrive from the API over SignalR. Uses ngrx for state management.

  • Technology: Angular 21, Material, ngrx, rxjs
  • Port: 80
  • Key responsibilities: User interface, form rendering, real-time updates

S3 File Saved Notifier (syrf-s3-notifier)

An AWS Lambda function that receives S3 upload events when users upload citation files and publishes messages to RabbitMQ, triggering the import pipeline in the Project Management service.

  • Technology: .NET 10.0, AWS Lambda
  • Region: eu-west-1
  • Key responsibility: Bridge between S3 file uploads and the MassTransit message bus
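
As an illustration of the bridging step, here is a minimal sketch written in Python for brevity (the real function is .NET). It relies on the standard S3 event notification layout (`Records[].s3.bucket.name`, `Records[].s3.object.key`). The function name `to_upload_messages` and the output message fields are hypothetical, and publishing to RabbitMQ is omitted.

```python
from urllib.parse import unquote_plus


def to_upload_messages(s3_event: dict) -> list:
    """Map an S3 event notification to one (hypothetical) bus message per uploaded object."""
    messages = []
    for record in s3_event.get("Records", []):
        # Only object-creation events represent a new upload.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        s3 = record["s3"]
        messages.append({
            "bucket": s3["bucket"]["name"],
            # S3 URL-encodes object keys in event payloads (spaces arrive as '+').
            "key": unquote_plus(s3["object"]["key"]),
            "size_bytes": s3["object"].get("size", 0),
        })
    return messages
```

Decoding the key before publishing means downstream consumers can use it directly when fetching the file.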

Communication Patterns

| Pattern | Path | Protocol | Purpose |
| --- | --- | --- | --- |
| REST | Web → API | HTTP/JSON | All client-to-server communication |
| Message Bus | API ↔ PM | MassTransit/RabbitMQ | Command/event exchange between services |
| Message Bus | S3 Notifier → PM | RabbitMQ | File upload event notification |
| Direct DB | API, PM, Quartz → MongoDB | MongoDB Driver | Data access |
| Real-time | API → Web | SignalR (WebSocket) | Push notifications (MongoDB change streams) |

For build-time dependency details (shared libraries, Docker contexts, CI triggers), see dependency-map.md.


3. Three-Level Data Model

SyRF separates bibliographic data into three levels, each serving a distinct purpose in the systematic review pipeline.

erDiagram
    PUBLICATION ||--o{ CITATION : "linked from"
    STUDY ||--|{ CITATION : "embeds"
    STUDY }o--|| PUBLICATION : "canonical ref"

    PUBLICATION {
        Guid id PK
        string doi UK "unique sparse"
        string pmid UK "unique sparse"
        string canonicalTitle
        string canonicalAuthors
        string canonicalAbstract
        int canonicalYear
        array metadataProvenance
        array linkedProjectIds
    }

    CITATION {
        Guid id PK
        Guid publicationId FK
        Guid projectId
        Guid systematicSearchId
        SearchSourceType sourceType
        string sourceName
        string rawTitle
        string rawAuthors
        string rawAbstract
        DateTime importedAt
    }

    STUDY {
        Guid id PK
        Guid projectId
        Guid publicationId FK
        StudyLifecycleStatus lifecycleStatus
        FullTextStatus fullTextStatus
        Guid duplicateGroupId
        array citations "embedded"
        array screeningOutcomes
        bool metaAnalysisIncluded
    }

Publication (System-Scoped)

Collection: pmPublication [TARGET -- Phase 12]

The global bibliographic identity. A Publication represents a unique piece of research across the entire SyRF system, not tied to any single project. DOI and PMID have unique sparse indexes, so an incoming citation can be matched to an existing Publication by exact identifier lookup on import.

  • Scope: System-wide -- exists independently of any project
  • Mutability: Canonical fields are updated when better metadata arrives from any project's imports
  • Deletion: Never deleted, even if all linked studies are removed
  • Key fields: doi, pmid, canonicalTitle, canonicalAuthors, metadataProvenance[], linkedProjectIds[]

When multiple projects import the same citation, they all link to the same Publication. The Publication accumulates the best metadata from all sources using canonical enrichment rules (prefer longest title, most complete author list, non-empty abstract, etc.). Field-level provenance tracks which Citation provided each canonical value.
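
The enrichment rules can be sketched as follows. This is an illustration under stated assumptions, not the specified algorithm: "best" is approximated as the longest non-empty value for every field, and the `Citation` class, `enrich` function, and snake_case attribute names are hypothetical stand-ins for the entities described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Immutable raw import record (hypothetical, trimmed to three fields)."""
    id: str
    raw_title: str = ""
    raw_authors: str = ""
    raw_abstract: str = ""


def enrich(citations):
    """Return (canonical fields, provenance) chosen from the linked citations."""
    fields = {
        "canonicalTitle": "raw_title",
        "canonicalAuthors": "raw_authors",
        "canonicalAbstract": "raw_abstract",
    }
    canonical, provenance = {}, {}
    for canonical_name, raw_name in fields.items():
        # Ignore citations that carry no value for this field.
        candidates = [c for c in citations if getattr(c, raw_name).strip()]
        if not candidates:
            continue
        # Assumption: the longest value is the most complete; ties keep the first.
        best = max(candidates, key=lambda c: len(getattr(c, raw_name)))
        canonical[canonical_name] = getattr(best, raw_name)
        # Field-level provenance: which Citation supplied the canonical value.
        provenance[canonical_name] = best.id
    return canonical, provenance
```

Because the Citations themselves are never modified, the canonical fields can be recomputed at any time from the full set of linked records.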

Citation (Project-Scoped, Immutable)

Storage: Embedded on the Study document as citations[] [TARGET -- Phase 12]

An immutable record of a single citation as imported from a specific source. Citations are never modified after creation -- they preserve the exact raw bibliographic data from the import file.

  • Scope: Project-scoped (belongs to one project)
  • Mutability: Immutable -- never modified after creation
  • Key fields: publicationId, sourceType, sourceName, rawTitle, rawAuthors, rawDoi, importedAt
  • PRISMA role: Enables per-source-type counting (PRISMA boxes 2, 11) because each Citation preserves its source classification

After deduplication, multiple Citations that represent the same research are linked to the same Study. The Citation count per Study enables the "records identified vs. studies included" distinction that PRISMA requires.

Study (Project-Scoped, Mutable)

Collection: pmStudy (existing)

The reviewable entity that annotators, screeners, and reconcilers interact with. A Study carries lifecycle status, screening outcomes, annotation sessions, and reconciliation sessions.

  • Scope: Project-scoped (belongs to one project)
  • Mutability: Mutable -- evolves as the review progresses
  • Key fields: lifecycleStatus, publicationId, citations[], screeningOutcomes[], fullTextStatus, metaAnalysisIncluded
  • PRISMA role: Lifecycle status feeds terminal-state boxes (3, 6-7, 10, 16). Screening outcomes feed screening boxes (4-5, 8-9, 14-15).

Current State vs. Target

| Level | Today | Target (Phase 12+) |
| --- | --- | --- |
| Publication | Does not exist | pmPublication collection with DOI/PMID indexes |
| Citation | Does not exist | Embedded on Study as citations[] |
| Study | Exists as pmStudy -- no lifecycle status, no Publication link, no Citations | Enriched with lifecycleStatus, publicationId, citations[], screeningOutcomes[] |

The three-level model is introduced incrementally: Phase 7 adds forward-compatible fields (nullable sourceType on SystematicSearch); Phase 12 creates the Publication collection and Citation structure; Phase 16 backfills lifecycle status and source types on existing data.

For the complete entity specification, see three-level-data-model.md.


4. Study Lifecycle

Every study in a project has a lifecycle status tracking its position in the systematic review pipeline. This is distinct from screening outcomes, which track per-criteria inclusion/exclusion decisions.

State Machine

stateDiagram-v2
    [*] --> Active : Import

    Active --> Duplicate : Dedup auto-confirm
    Active --> PendingDuplicateReview : Dedup probable match
    Active --> FullTextSought : Retrieval initiated
    Active --> Included : All screening profiles pass
    Active --> Merged : Admin confirms merge
    Active --> RemovedByAutomation : Automation tool
    Active --> RemovedOther : Admin removal

    PendingDuplicateReview --> Duplicate : Admin confirms
    PendingDuplicateReview --> Active : Admin rejects

    Duplicate --> Active : Admin reversal

    FullTextSought --> Active : Full text obtained
    FullTextSought --> FullTextNotRetrieved : Retrieval fails

    state "Terminal States" as terminal {
        Included
        Merged
        RemovedByAutomation
        RemovedOther
        FullTextNotRetrieved
    }

    note right of terminal
        Terminal states are admin-overridable
        but not routinely reversible
    end note

The Nine States

| State | Value | Pool Visibility | PRISMA Box | Description |
| --- | --- | --- | --- | --- |
| Active | 0 | Yes | (working state) | Default. Available for screening and annotation. |
| Duplicate | 1 | No | Box 3 | Confirmed duplicate (auto or admin). |
| PendingDuplicateReview | 2 | No | -- | Probable duplicate awaiting admin review. |
| FullTextSought | 3 | No | Box 6/12 | Full-text retrieval attempted. |
| FullTextNotRetrieved | 4 | No | Box 7/13 | Full text could not be obtained. Terminal. |
| Included | 5 | No (annotation pools: yes) | Box 10/16 | Passed all screening profiles. Terminal. |
| Merged | 6 | No | Box 3 | Merged into canonical study during dedup. Terminal. |
| RemovedByAutomation | 7 | No | Box 3 | Removed by automation tool pre-screening. Terminal. |
| RemovedOther | 8 | No | Box 3 | Removed for other pre-screen reasons. Terminal. |
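
The state machine can be expressed as a transition table. The sketch below copies the numeric values and the routine transitions from the diagram and table above; the `transition` function and its `admin_override` flag are hypothetical, standing in for whatever mechanism implements "admin-overridable but not routinely reversible".

```python
from enum import IntEnum


class StudyLifecycleStatus(IntEnum):
    """The nine lifecycle states with their numeric values."""
    ACTIVE = 0
    DUPLICATE = 1
    PENDING_DUPLICATE_REVIEW = 2
    FULL_TEXT_SOUGHT = 3
    FULL_TEXT_NOT_RETRIEVED = 4
    INCLUDED = 5
    MERGED = 6
    REMOVED_BY_AUTOMATION = 7
    REMOVED_OTHER = 8


S = StudyLifecycleStatus

# Routine transitions from the state diagram; states absent as keys are terminal.
ROUTINE_TRANSITIONS = {
    S.ACTIVE: {S.DUPLICATE, S.PENDING_DUPLICATE_REVIEW, S.FULL_TEXT_SOUGHT,
               S.INCLUDED, S.MERGED, S.REMOVED_BY_AUTOMATION, S.REMOVED_OTHER},
    S.PENDING_DUPLICATE_REVIEW: {S.DUPLICATE, S.ACTIVE},
    S.DUPLICATE: {S.ACTIVE},
    S.FULL_TEXT_SOUGHT: {S.ACTIVE, S.FULL_TEXT_NOT_RETRIEVED},
}


def transition(current, target, admin_override=False):
    """Return the new status, or raise ValueError for a disallowed move."""
    if target in ROUTINE_TRANSITIONS.get(current, frozenset()):
        return target
    # Assumption: a terminal state can only be left through an explicit admin override.
    if admin_override and current not in ROUTINE_TRANSITIONS:
        return target
    raise ValueError(f"{current.name} -> {target.name} is not a routine transition")
```

Keeping the transitions in one table makes it easy to check the code against the diagram when either changes.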

Critical Invariant: Lifecycle vs. Screening

Lifecycle status tracks pipeline position ONLY. Screening exclusion is NOT a lifecycle status -- it is recorded per-profile on the screeningOutcomes[] array.

This distinction matters because:

  1. A study can be excluded under one screening profile and included under another in a multi-stage pipeline
  2. PRISMA uses both systems for different boxes -- lifecycle status feeds boxes 3, 6-7, 10, 16; screening outcomes feed boxes 4-5, 8-9, 14-15
  3. The lifecycle status transitions to Included ONLY when the study passes ALL required screening profiles

Pool Visibility Rules

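  • Only Active studies appear in screening stage pools, and Active is also the default requirement for annotation stage pools (Included is the one exception, noted below). This is a system invariant.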
  • PendingDuplicateReview studies are conservatively excluded to prevent wasted screening effort.
  • Included studies may appear in annotation pools for data extraction but not in screening pools.
  • All other non-Active states exclude the study from all pools.
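
These rules reduce to a small predicate. The sketch below uses plain strings for the status and for the pool kind ("screening" or "annotation"); both the function name and the pool-kind labels are hypothetical.

```python
def visible_in_pool(lifecycle_status: str, pool_kind: str) -> bool:
    """Apply the pool visibility rules to one study."""
    if lifecycle_status == "Active":
        return True
    if lifecycle_status == "Included":
        # Included studies stay available for data extraction only.
        return pool_kind == "annotation"
    # PendingDuplicateReview and every other state are excluded from all pools.
    return False
```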

Current State vs. Target

The lifecycleStatus field does not exist today. It will be added in Phase 12 as a nullable field and backfilled to Active for all existing studies in Phase 16. All state transition logic is new.

For the complete lifecycle specification including transition rules and edge cases, see study-lifecycle-and-source-taxonomy.md.


5. End-to-End Data Flow

This section traces a citation from import through to PRISMA reporting, showing how data moves through the three-level model and service architecture.

sequenceDiagram
    participant User
    participant Web
    participant API
    participant S3
    participant Lambda as S3 Notifier
    participant RMQ as RabbitMQ
    participant PM as PM Service
    participant Mongo as MongoDB

    Note over User,Mongo: 1. IMPORT

    User->>Web: Upload citation file
    Web->>S3: Upload to S3 bucket
    S3->>Lambda: S3 event notification
    Lambda->>RMQ: Publish upload event
    RMQ->>PM: Deliver to import consumer
    PM->>Mongo: Create Study + Citation
    PM->>Mongo: Create/link Publication (DOI/PMID lookup)

    Note over User,Mongo: 2. DEDUPLICATION [TARGET]

    PM->>PM: Run ASySD algorithm (R subprocess)
    PM->>Mongo: Auto-confirm high-confidence duplicates
    PM->>Mongo: Queue probable duplicates for admin review
    PM->>Mongo: Enrich canonical Publication metadata

    Note over User,Mongo: 3. SCREENING [TARGET]

    User->>Web: Open screening stage
    Web->>API: Request study from pool
    API->>PM: Get study for screening
    PM->>Mongo: Query Active studies in stage pool
    PM-->>API: Return study
    API-->>Web: Display study
    User->>Web: Record screening decision
    Web->>API: Submit screening outcome
    API->>PM: Save screening outcome
    PM->>Mongo: Update Study.screeningOutcomes[]

    Note over User,Mongo: 4. ANNOTATION

    User->>Web: Open annotation form
    Web->>API: Request study + question set
    API->>PM: Get study and current QSV
    PM-->>API: Return study + questions
    User->>Web: Answer questions (auto-save per question)
    Web->>API: Save annotation version
    User->>Web: Submit session
    Web->>API: Create immutable SessionVersion
    PM->>Mongo: Store SessionVersion with AV references

    Note over User,Mongo: 5. RECONCILIATION [TARGET]

    PM->>PM: Detect 2+ completed sessions
    PM->>Mongo: Add study to reconciliation pool
    User->>Web: Open reconciliation page
    Web->>API: Request assignment
    PM->>PM: Random assignment from pool
    PM-->>API: Return study with blinded candidates
    User->>Web: Create reconciler's own answers
    Web->>API: Submit reconciliation
    PM->>Mongo: Create immutable RSV (gold standard)

    Note over User,Mongo: 6. EXPORT & PRISMA [TARGET]

    User->>Web: Request data export / PRISMA diagram
    Web->>API: Trigger export
    API->>PM: Generate export / compute PRISMA counts
    PM->>Mongo: Aggregate lifecycle + screening + dedup data
    PM-->>API: Return export data / PRISMA fields
    API-->>Web: Download / display

Step-by-Step Detail

Step 1: Import

A user uploads a citation file (RIS, PubMed XML, CSV, etc.) through the web interface. The file is stored in AWS S3. The S3 Notifier Lambda detects the upload event and publishes a message to RabbitMQ. The Project Management service's import consumer processes the file:

  • Parses citations from the file
  • Creates one Study per citation in the project
  • [TARGET] Creates one Citation per citation, preserving raw bibliographic data
  • [TARGET] Looks up or creates a Publication by DOI/PMID match
  • [TARGET] Sets the study's lifecycle status to Active

Current state: Studies are created directly from parsed citations. Citations and Publications do not exist yet.

Step 2: Deduplication [TARGET -- Phase 12]

After import, the deduplication service runs:

  1. Exact match: Check new Citations against existing Publications by DOI and PMID
  2. Fuzzy match: Run the ASySD algorithm (R subprocess via stdin/stdout) for non-exact matches
  3. Auto-confirm: Pairs scoring above the confidence threshold are automatically confirmed as duplicates -- the secondary study's lifecycleStatus is set to Duplicate
  4. Queue for review: Probable duplicates (matched pairs scoring below that threshold) are set to PendingDuplicateReview for admin resolution
  5. Canonical enrichment: The Publication entity is updated with best-of-breed metadata from all linked Citations
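
Steps 3 and 4 can be sketched as a triage over scored pairs. Everything specific here is an assumption for illustration: the `(primary_id, secondary_id, score)` tuple shape, scores in the range 0 to 1, and the 0.9 threshold. Only pairs the matcher already flagged as possible duplicates are expected as input.

```python
def triage(pairs, auto_confirm_threshold=0.9):
    """Map each secondary study id to the lifecycle status it should receive.

    pairs: iterable of (primary_id, secondary_id, score) for matched pairs.
    """
    updates = {}
    for _primary_id, secondary_id, score in pairs:
        if score > auto_confirm_threshold:
            # High confidence: confirm without human review.
            updates[secondary_id] = "Duplicate"
        elif updates.get(secondary_id) != "Duplicate":
            # Probable match: queue for an admin, unless already auto-confirmed.
            updates[secondary_id] = "PendingDuplicateReview"
    return updates
```

A study that appears in several pairs ends up with the strongest outcome, so an auto-confirmed duplicate is never downgraded to pending review.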

Current state: No deduplication exists. Users manually identify duplicates.

Step 3: Screening [TARGET -- Phases 13-15]

Screening uses named profiles that define inclusion/exclusion criteria. Stage filtering computes which Active studies belong to each screening pool. Screeners evaluate studies and record structured decisions with exclusion reasons. Screening reconciliation resolves disagreements using the same random-assignment pattern as annotation reconciliation.

  • Each screening decision is stored in Study.screeningOutcomes[] as a per-profile entry
  • A FinalScreeningOutcome tracks the authoritative decision (CandidateAgreement or Reconciled)
  • When all required screening profiles resolve to Included, the study's lifecycleStatus transitions to Included

Current state: Basic screening exists (binary include/exclude) but without profiles, structured reasons, or reconciliation.

Step 4: Annotation

Annotators answer questions from versioned question sets. The annotation form uses per-question auto-save:

  • Each answer creates an AnnotationVersion linked to a specific AQVersion (question version)
  • The working session is mutable until submission
  • Submitting creates an immutable SessionVersion holding explicit AnnotationVersionIds
  • Candidate blinding ensures annotators never see each other's answers

Current state: Annotation works but questions are not versioned (mutable on the project entity), and there is no per-question version tracking.

Step 5: Reconciliation [TARGET -- Phases 9-10]

When two or more annotators have completed sessions for the same study:

  1. The system detects the study has multiple completed sessions
  2. If all annotators agree on all answers, the study is eligible for bulk approve
  3. Otherwise, the study enters the reconciliation pool
  4. A reconciler is randomly assigned -- they see anonymised candidate answers ("Annotator A" vs. "Annotator B")
  5. The reconciler creates their own answers for every question (even when agreeing)
  6. Submitting creates an immutable ReconciliationSessionVersion -- the gold-standard record
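
Steps 1 to 4 can be sketched as two small functions. The session shape (annotator id mapped to a dict of question id to answer), the function names, and the returned labels are hypothetical, and answers are compared by simple equality; the real agreement rules may be richer.

```python
import random
import string


def route_study(sessions):
    """Decide where a study goes, given {annotator_id: {question_id: answer}}."""
    if len(sessions) < 2:
        return "awaiting-second-annotator"
    answers = list(sessions.values())
    if all(a == answers[0] for a in answers[1:]):
        return "bulk-approve-eligible"
    return "reconciliation-pool"


def blind_candidates(sessions, rng):
    """Relabel annotators as 'Annotator A', 'Annotator B', ... in shuffled order.

    Supports up to 26 candidates, which is enough for this sketch.
    """
    annotator_ids = list(sessions)
    rng.shuffle(annotator_ids)  # labels must not reveal submission order
    return {
        f"Annotator {string.ascii_uppercase[i]}": sessions[annotator_id]
        for i, annotator_id in enumerate(annotator_ids)
    }
```

A caller would pass `random.Random()` in production and a seeded instance in tests, which keeps the blinding reproducible where it matters.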

Current state: No reconciliation workflow exists. Production has zero reconciled annotations.

Step 6: Export & PRISMA [TARGET -- Phase 16]

Data export includes reconciliation status, version references, and agreement metrics. The PRISMA 2020 flow diagram is auto-generated by aggregating lifecycle status, screening outcomes, and dedup counts across the project's studies and import records. All 17 PRISMA boxes (34 fields) are derivable from the three-level model.

Current state: Basic CSV/Google Sheets export exists but without reconciliation data, version references, or PRISMA reporting.


6. Question Management and Versioning

SyRF uses an Identity + Immutable Versions pattern for annotation questions, ensuring full traceability from question creation through annotation to reconciliation.

Core Concepts

Draft Question (DQ): A mutable factory object living on the Project entity. Created and freely edited by project administrators in the QM Design tab. No version history, no audit trail, no downstream dependencies.

Annotation Question (AQ): Created when a stage is enabled (DQ activation). The AQ has a stable identity with fixed structural properties.

Annotation Question Version (AQV): An immutable snapshot of the AQ's content properties. Each edit after activation creates a new version.

Question Set (QS) / Question Set Version (QSV): A QS is an ordered collection of AQVersions assigned to a stage. A QSV is an immutable snapshot -- a specific list of AQVersionIds in a specific order. When annotators work, they see the questions defined by the active QSV.

What Is Fixed vs. What Is Versionable

| Property | Category | Rationale |
| --- | --- | --- |
| Parent question | Structural (fixed) | Changing parent would break hierarchy integrity |
| Data type (boolean, select, etc.) | Structural (fixed) | Changing type would invalidate all existing answers |
| Group-as-single | Structural (fixed) | Changes grouping semantics |
| Question text | Content (versionable) | Wording refinements are common and safe |
| Answer options | Content (versionable) | Adding/removing options is common |
| Help text | Content (versionable) | Guidance evolves with use |
| Answer option filters | Content (versionable) | Filtering logic may need refinement |
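
The split between fixed and versionable properties can be sketched as an edit function that either produces a new immutable version or refuses. The class layout, the snake_case field names, and the `edit_question` helper are hypothetical; only the fixed/versionable categories come from the table above.

```python
from dataclasses import dataclass, replace

STRUCTURAL_FIELDS = {"parent_id", "data_type", "group_as_single"}
CONTENT_FIELDS = {"text", "options", "help_text", "option_filters"}


@dataclass(frozen=True)
class AQVersion:
    """Immutable snapshot of a question's content properties."""
    question_id: str
    version: int
    text: str
    options: tuple = ()
    help_text: str = ""
    option_filters: tuple = ()


def edit_question(current, **changes):
    """Return a new AQVersion for a content edit; reject structural edits."""
    structural = STRUCTURAL_FIELDS.intersection(changes)
    if structural:
        raise ValueError(f"structural properties are fixed: {sorted(structural)}")
    unknown = set(changes) - CONTENT_FIELDS
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    # The previous version is left untouched; answers that reference it stay valid.
    return replace(current, version=current.version + 1, **changes)
```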

Session Versioning

When an annotator submits their work, a SessionVersion is created that holds:

  • Explicit AnnotationVersionIds -- the exact answer versions submitted
  • Each AnnotationVersion references its AQVersionId -- which question version it was answered against
  • The QSV that was active at the time

This creates a complete audit trail: for any annotation, you can reconstruct exactly what question was asked (AQV), in what context (QSV), and what was answered (AV).
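
A minimal sketch of that reconstruction, assuming in-memory lookups keyed by id. The class fields are trimmed to what the audit trail needs, and `audit_trail` plus the `question_texts` lookup are hypothetical helpers, not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationVersion:
    id: str
    aq_version_id: str  # the question version this answer was given against
    answer: str


@dataclass(frozen=True)
class SessionVersion:
    id: str
    qsv_id: str  # the question set version active at submission
    annotation_version_ids: tuple


def audit_trail(session, annotation_versions, question_texts):
    """Return (QSV id, question text as asked, answer) for each submitted answer."""
    trail = []
    for av_id in session.annotation_version_ids:
        av = annotation_versions[av_id]
        trail.append((session.qsv_id, question_texts[av.aq_version_id], av.answer))
    return trail
```

Because every object in the chain is immutable, the trail reads the same no matter how the questions are edited afterwards.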

Current State vs. Target

| Concept | Today | Target (Phase 4+) |
| --- | --- | --- |
| Questions | Mutable on the Project entity | AQ identity + AQVersion immutable snapshots |
| Question assignment | Question IDs on stage | QSV -- immutable snapshots of ordered AQVersionId lists |
| Annotations | Basic save/submit | Per-question auto-save as AVs; immutable SessionVersion on submit |
| Audit trail | Question text snapshot only | Full: AQV -> AV -> SessionVersion -> QSV |

For the complete question management specification, see Question Management. For annotation versioning details, see the design decisions in design-decisions.md.


7. PRISMA 2020 Integration

PRISMA 2020 (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) defines a standardised flow diagram that systematic reviews must include. SyRF's data model is designed to auto-generate this diagram.

Flow Diagram Structure

The PRISMA 2020 flow diagram has 17 boxes requiring 34 data fields, organised into three phases:

  1. Identification -- Records discovered from all sources, with duplicates and pre-screen removals
  2. Screening -- Records screened at title/abstract and full-text levels, split by source column
  3. Included -- Studies and reports included in the final review and/or meta-analysis

The diagram uses a dual-column layout:

  • Column 1 (Databases/Registers): Searches from PubMed, Embase, Scopus, ClinicalTrials.gov, CENTRAL, etc.
  • Column 2 (Other Sources): Websites, organisations contacted, citation searching, other methods

Source Type Taxonomy

The SearchSourceType enum determines which PRISMA column a search's records appear in:

| Source Type | PRISMA Column | Examples |
| --- | --- | --- |
| Database | Column 1 | PubMed, Embase, Scopus, PsycINFO |
| Register | Column 1 | ClinicalTrials.gov, CENTRAL, ICTRP |
| Website | Column 2 | Google Scholar, institutional sites |
| Organisation | Column 2 | Organisations contacted for studies |
| CitationSearching | Column 2 | Forward/backward citation chasing |
| Other | Column 2 | Expert contacts, conference abstracts |

When a study has Citations from different source types, the first import's sourceType determines column assignment for screening/inclusion boxes. All Citations are counted in their respective source boxes regardless.
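
The column rules can be sketched as a lookup plus two helpers. The mapping comes from the taxonomy table; the tuple shapes, the function names, and the reading of "first import" as the earliest `importedAt` (given here as ISO date strings) are assumptions.

```python
COLUMN_BY_SOURCE_TYPE = {
    "Database": 1,
    "Register": 1,
    "Website": 2,
    "Organisation": 2,
    "CitationSearching": 2,
    "Other": 2,
}


def study_column(citations):
    """Column for screening/inclusion boxes: decided by the earliest import.

    citations: non-empty list of (imported_at_iso, source_type) tuples.
    """
    _imported_at, source_type = min(citations, key=lambda c: c[0])
    return COLUMN_BY_SOURCE_TYPE[source_type]


def records_identified(searches):
    """Per-column record totals; every Citation counts in its own source column.

    searches: list of (source_type, number_of_citations) tuples.
    """
    totals = {1: 0, 2: 0}
    for source_type, number_of_citations in searches:
        totals[COLUMN_BY_SOURCE_TYPE[source_type]] += number_of_citations
    return totals
```

A study first imported from a database therefore stays in Column 1 for screening counts even if a later website import matches it, while that later Citation still adds to the Column 2 identification total.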

How SyRF Data Maps to PRISMA Boxes

Every PRISMA box is derivable from the three-level data model:

| PRISMA Phase | Boxes | Data Source |
| --- | --- | --- |
| Identification | Box 2 (records from databases/registers) | SystematicSearch.sourceType + numberOfCitations |
| Identification | Box 3 (duplicates, automation removal) | Study.lifecycleStatus + Citation counts |
| Identification | Box 11 (records from other sources) | SystematicSearch.sourceType + numberOfCitations |
| Screening | Boxes 4-5 (T/A screening) | Study.screeningOutcomes[] per profile |
| Screening | Boxes 6-7 (retrieval) | Study.fullTextStatus |
| Screening | Boxes 8-9 (FT screening with reasons) | Study.screeningOutcomes[] + primaryExclusionReason |
| Screening | Boxes 12-15 (other sources screening) | Same as 4-9 but filtered to Column 2 sources |
| Included | Box 10 (new studies/reports) | Study.lifecycleStatus = Included + citations[] count |
| Included | Box 16 (total studies/reports) | Same as box 10 (no updated reviews yet) |
| Included | Box 17 (meta-analysis) | Study.metaAnalysisIncluded = true |
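
As one worked instance of the mapping, the sketch below derives Box 3 style counts from lifecycle statuses alone. It is a simplification: it counts studies rather than Citations, it groups Merged with Duplicate (both are listed under Box 3 in the lifecycle table), and the function name and output keys are hypothetical. The exact derivation rules live in prisma-flow-diagram-mapping.md.

```python
from collections import Counter


def removed_before_screening(lifecycle_statuses):
    """Count pre-screening removals from an iterable of lifecycle status names."""
    counts = Counter(lifecycle_statuses)
    return {
        # Assumption: merged studies are reported alongside confirmed duplicates.
        "duplicates": counts["Duplicate"] + counts["Merged"],
        "automation": counts["RemovedByAutomation"],
        "other": counts["RemovedOther"],
    }
```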

For the complete box-by-box derivation rules with exact queries, see prisma-flow-diagram-mapping.md. For source type taxonomy details, see study-lifecycle-and-source-taxonomy.md.


8. Technology Stack

Backend

  • .NET 10.0 -- All backend services and shared libraries
  • MongoDB -- Primary data store (Atlas M20, shared syrftest database)
  • MassTransit -- Message bus abstraction over RabbitMQ
  • Quartz.NET -- Job scheduling (SQL Server for job storage)
  • R -- [TARGET] ASySD deduplication algorithm (R subprocess via stdin/stdout)
  • Auth0 -- Authentication (JWT, policy-based authorization)

Frontend

  • Angular 21 -- Single-page application
  • Angular Material -- UI component library
  • ngrx 20 -- State management
  • rxjs 7 -- Reactive programming
  • Signal forms (experimental) -- [TARGET] New form infrastructure (@angular/forms/signals)
  • NGINX -- Web server for serving the built Angular app

Infrastructure

  • GKE (Google Kubernetes Engine) -- Container orchestration (europe-west2)
  • ArgoCD -- GitOps continuous delivery
  • Helm -- Kubernetes package management
  • External Secrets Operator -- GCP Secret Manager to Kubernetes secrets sync
  • GitHub Actions -- CI/CD pipeline
  • GitHub Container Registry (GHCR) -- Docker image storage
  • AWS S3 -- File uploads (eu-west-1)
  • AWS Lambda -- S3 event processing

For the full technology reference including version details, deployment configuration, and CI/CD workflow, see CLAUDE.md.


9. Cross-References

Architecture Documents

| Document | Purpose |
| --- | --- |
| dependency-map.md | Build-time service and library dependencies |
| mongodb-reference.md | Database collections, CSUUID format, collection naming |
| llm-navigation-guide.md | Multi-repository navigation for AI assistants |
| CLAUDE.md | Full technology reference and project context |

PRISMA Specification Documents

| Document | Purpose |
| --- | --- |
| PRISMA Specification README | Specification package overview and usage guide |
| prisma-flow-diagram-mapping.md | Box-by-box PRISMA field mapping with derivation rules |
| three-level-data-model.md | Publication/Citation/Study entity specifications |
| study-lifecycle-and-source-taxonomy.md | 9 lifecycle states, 6 source types, PRISMA count derivation |
| deduplication-service-specification.md | ASySD integration, confidence model, enrichment rules |

Feature Specification Documents

| Document | Purpose |
| --- | --- |
| Product Overview | Product-level framing of annotation and reconciliation |
| Design Decisions | 50 authoritative design decisions (D1-D50) |
| Question Management | Versioning, question sets, stage configuration |
| Annotation Form v2 | Signal forms, virtual scroll, auto-save |
| Reconciliation | Workflow, metrics, authority determination |

Planning Documents

| Document | Purpose |
| --- | --- |
| ROADMAP.md | 16-phase delivery plan across 3 releases |
| REQUIREMENTS.md | 101 requirements with phase traceability |
| PROJECT.md | Project context, constraints, key decisions |

This document replaces system-overview.md as the architecture entry point for the SyRF platform. The deprecated system-overview.md remains available as a generic system overview.