CanDIGv2 Architecture

Simplified repo structure with relevant files.

  • etc/
    • env/
      • example.env Copy and edit to configure variables
      • nightly_env.sh
    • tests/ Integration and performance tests
      • test_integration.py Integration tests
    • venv/ Script and requirements for the candig venv
      • activate.sh Activate candig conda env
      • requirements.txt Dependencies for the stack
  • lib/ Submodule directories, docker compose, setup scripts and repos
  • Makefile
  • create_service_store.sh
  • nightly_build.sh Script that can deploy stack nightly
  • nightly_build_token.py Post nightly results to Slack
  • post_build.sh Runs after the build to check running containers
  • pre-build-check.sh Runs before build and checks dependencies
  • settings.py Gets env variables from .env
  • setup_hosts.sh Sets up hosts if LOCAL_IP_ADDR not set
  • site_admin_token.py
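
The role listed for settings.py above (getting variables from .env) can be illustrated with a minimal sketch. This is not the repository's actual settings.py: the function names `load_env_file` and `get_setting` are hypothetical, and the sketch assumes the file contains only simple KEY=VALUE lines, blank lines and `#` comments.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper for illustration; it does not handle `export`
    prefixes, multi-line values, or variable interpolation.
    """
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(name, file_values, default=None):
    """Prefer the process environment, then the file, then the default."""
    return os.environ.get(name, file_values.get(name, default))
```

With a file containing `LOCAL_IP_ADDR=192.0.2.10`, `get_setting("LOCAL_IP_ADDR", load_env_file(path))` would return that address unless the variable is already set in the process environment, in which case the environment value is used instead.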

The following table lists the individual repos for each service and helper library developed by the CanDIG team that contribute to the CanDIGv2 stack.

| Service/Component Name | Source | Description |
| --- | --- | --- |
| authx | candigv2-authx | Library to facilitate interacting with AuthZ/AuthN services, Keycloak, Tyk, Opa, Vault & Access to minIO S3 objects |
| CanDIG Data Portal | candig-data-portal | Front-end User interface for CanDIG Services |
| CanDIGv2 Ingest Service | candigv2-ingest | Ingests clinical and genomic data into the CanDIG infrastructure. |
| Clinical ETL Code | clinical_ETL_code | Code to convert spreadsheet format into the MoH data model in preparation for ingest into katsu |
| Federation Service | federation-service | Distributes requests across each federated node of the distributed infrastructure |
| HTSGet | htsget_app | Implementation of GA4GH htsget API which ingests and indexes VCF files and stores GA4GH DRS objects for retrieval |
| Katsu | katsu | Manages the clinical metadata in a PostgreSQL database |
| CanDIG OPA | candig-opa | Manages role-based access policies |
| CanDIG Query service | candigv2-query | Manages front-end querying of services |
| CanDIG logging | candigv2-logging | Logging package to unify logging across all services |
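
The idea behind the Federation Service row (distributing a request across each federated node) can be sketched as a simple fan-out. This is only an illustration of the general pattern, not the federation-service implementation: the function name `fan_out` is hypothetical, and each node is represented by a plain Python callable standing in for a network call.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(request, nodes):
    """Send the same request to every node and collect per-node results.

    `nodes` maps a node name to a callable that stands in for a call to
    that node. A failure on one node is recorded under that node's name
    without discarding the responses from the others.
    """
    def call(item):
        name, handler = item
        try:
            return name, {"status": "ok", "results": handler(request)}
        except Exception as exc:  # one failing node should not sink the rest
            return name, {"status": "error", "message": str(exc)}

    # One worker per node so that slow nodes do not block the others.
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        return dict(pool.map(call, nodes.items()))
```

Reporting results per node, rather than merging them into a single list, keeps it visible which node answered and which one failed.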

In addition to the services developed in-house, the CanDIG stack relies on external software that is configured to work within the stack. The configuration for each of these is found in the /lib folder. They include:

| Service/Component Name | Role |
| --- | --- |
| Keycloak | Authentication management |
| minio | Object storage for genomic files |
| OPA | Manages role-based access policies |
| Tyk | API management and redirection |
| Vault | Secret and password management |

New services can be added under the lib directory. Please refer to the README in the template for new services for more details.