2 changes: 2 additions & 0 deletions .gitignore
@@ -39,6 +39,8 @@ uv.lock
# ignore .json files unless if SNT_config.json
*.json
!SNT_config.json
!SNT_config_XXX.json
!snt_config.schema.json

# R ----------------------------------------
*.rds
68 changes: 68 additions & 0 deletions configuration/SNT_config_XXX.json
@@ -0,0 +1,68 @@
{
  "$schema": "./snt_config.schema.json",
  "SNT_CONFIG": {
    "COUNTRY_CODE": "",
    "COUNTRY_NAME": "",
    "DHIS2_ADMINISTRATION_1": "level_2_name",
    "DHIS2_ADMINISTRATION_2": "level_3_name",
    "ANALYTICS_ORG_UNITS_LEVEL": 0,
    "REPORTING_RATE_PRODUCT_UID": []
  },
  "SNT_DATASET_IDENTIFIERS": {
    "DHIS2_DATASET_EXTRACTS": "snt-dhis2-extracts",
    "DHIS2_DATASET_FORMATTED": "snt-dhis2-formatted",
    "DHIS2_POPULATION_TRANSFORMATION": "snt-dhis2-pop-transformation",
    "DHIS2_REPORTING_RATE": "snt-dhis2-reporting-rate",
    "DHIS2_OUTLIERS_IMPUTATION": "snt-dhis2-outliers-imputation",
    "DHIS2_INCIDENCE": "snt-dhis2-incidence",
    "DHS_INDICATORS": "snt-dhs-indicators",
    "WORLDPOP_DATASET_EXTRACT": "snt-worldpop-extract",
    "SNT_HEALTHCARE_ACCESS": "snt-healthcare-access",
    "ERA5_DATASET_CLIMATE": "snt-era5-climate",
    "SNT_SEASONALITY_RAINFALL": "snt-seasonality-rainfall",
    "SNT_SEASONALITY_CASES": "snt-seasonality-cases",
    "DHIS2_QUALITY_OF_CARE": "snt-dhis2-quality-of-care",
    "SNT_MAP_EXTRACTS": "snt-map-extracts",
    "SNT_RESULTS": "snt-results"
  },
  "DHIS2_DATA_DEFINITIONS": {
    "POPULATION_DEFINITIONS": {
      "TOTAL_POPULATION_REF": null,
      "GROWTH_FACTOR": null,
      "REFERENCE_YEAR": 2024,
      "POPULATION_INDICATORS": {
        "POPULATION": { "ids": [], "type": "dataElement" }
      },
      "POPULATION_DISAGGREGATIONS": {
        "POP_UNDER_5": null,
        "POP_0_1_Y": null,
        "POP_1_2_Y": null,
        "POP_5_10_Y": null,
        "POP_5_36_M": null,
        "POP_PREGNANT_WOMAN": null
      }
    },
    "DHIS2_INDICATOR_DEFINITIONS": {
      "SUSP": [],
      "TEST": [],
      "CONF": [],
      "PRES": [],
      "PRESSEV": [],
      "MALTREAT": [],
      "MALADM": [],
      "MALSEV": [],
      "MALSEV_ABOVE5": [],
      "MALSEV_UNDER5": [],
      "MALDTH": [],
      "CASRECU": [],
      "CPN1": []
    },
    "DHIS2_REPORTING_RATES": {
      "REPORTING_DATASETS": [],
      "REPORTING_INDICATORS": {
        "ACTUAL_REPORTS": "",
        "EXPECTED_REPORTS": ""
      }
    }
  }
}
45 changes: 45 additions & 0 deletions configuration/readme.md
@@ -0,0 +1,45 @@
# Best Practices for SNT Configuration Management

Hi! Here are some ideas for a more robust way of working with SNT config files.
My suggestions are listed below, looking forward to your feedback!

Please refer to this **Jira task**: https://bluesquare.atlassian.net/browse/SNT25-453

## 1. Use a JSON Schema

A JSON Schema defines the structure and data types of your JSON files. It lets us validate config files against a predefined schema, ensuring that they are correctly formatted and contain the expected fields.

How to implement it:

- Keep "**snt_config.schema.json**" in the same folder as your config files.
- Ensure the "**SNT_config_XXX.json**" files have the `$schema` property:
  `"$schema": "./snt_config.schema.json"`.
  With this in place, VS Code shows red squiggles when a value does not match the schema, for example if a user puts a string where a number is expected.

_I took the liberty of adding these bits already, hope they don't break anything ... !_

**FYI**: The URL http://json-schema.org/draft-07/schema# identifies the **Meta-Schema**. Nothing is actually "downloaded" from it while the code runs; rather, it tells software (like VS Code or a Python validator) which "version" of the JSON Schema language you are using.<br>
This is comparable to a `<!DOCTYPE html>` tag: it ensures the rules for validation are interpreted correctly.
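
For anyone who wants a quick check outside VS Code, here is a minimal sketch of the kind of rule the schema expresses. It is not a JSON Schema validator: it uses only the Python standard library and hand-codes a small subset of the rules (the three required top-level sections, plus the required keys and types of `SNT_CONFIG`). The function name `check_snt_config` and the sample values are made up for illustration; a full validator library would be the better choice in a real pipeline.

```python
import json

# Hypothetical helper, NOT a JSON Schema validator: it hand-codes a small subset
# of the rules declared in snt_config.schema.json.
REQUIRED_SECTIONS = ("SNT_CONFIG", "SNT_DATASET_IDENTIFIERS", "DHIS2_DATA_DEFINITIONS")
REQUIRED_SNT_CONFIG_TYPES = {
    "COUNTRY_CODE": str,
    "COUNTRY_NAME": str,
    "ANALYTICS_ORG_UNITS_LEVEL": int,
}


def check_snt_config(config: dict) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if section not in config:
            problems.append(f"missing section: {section}")

    snt = config.get("SNT_CONFIG", {})
    for key, expected in REQUIRED_SNT_CONFIG_TYPES.items():
        if key not in snt:
            problems.append(f"SNT_CONFIG.{key}: missing")
        # bool is a subclass of int in Python, so reject it explicitly
        elif isinstance(snt[key], bool) or not isinstance(snt[key], expected):
            problems.append(
                f"SNT_CONFIG.{key}: expected {expected.__name__}, "
                f"got {type(snt[key]).__name__}"
            )

    # The schema also sets "minimum": 1 for the org unit level
    level = snt.get("ANALYTICS_ORG_UNITS_LEVEL")
    if isinstance(level, int) and not isinstance(level, bool) and level < 1:
        problems.append("SNT_CONFIG.ANALYTICS_ORG_UNITS_LEVEL: must be >= 1")
    return problems


# Illustrative config where a string was typed instead of a number
sample = json.loads("""
{
  "SNT_CONFIG": {
    "COUNTRY_CODE": "COD",
    "COUNTRY_NAME": "DRC",
    "ANALYTICS_ORG_UNITS_LEVEL": "2"
  },
  "SNT_DATASET_IDENTIFIERS": {},
  "DHIS2_DATA_DEFINITIONS": {}
}
""")
problems = check_snt_config(sample)
for problem in problems:
    print(problem)
```

The string `"2"` is reported because the schema expects an integer there, which is the same mistake VS Code would underline.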

## 2. Placeholders in the config template "SNT_config_XXX.json"

The generic version of the config file ("**SNT_config_XXX.json**") now contains all the mandatory fields, and none of the country-specific fields. The following placeholders are used to help fill in these fields with the correct data type:

- `null` for missing numbers/floats.
- `""` for missing mandatory strings.
- `[]` for missing lists of IDs.
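
To see which fields of a copied template still need attention, something like the sketch below could help. It walks a loaded config and lists the dotted paths that still hold one of the three placeholders. The helper name `find_placeholders` and the shortened `template` dict are assumptions for illustration only; some fields may legitimately stay empty, so the result is a to-do list rather than a list of errors.

```python
def find_placeholders(node, path=""):
    """Return the dotted paths whose value is still null, "" or []."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            found.extend(find_placeholders(value, child_path))
    elif node is None or node == "" or node == []:
        found.append(path)
    return found


# Shortened, illustrative excerpt of the template (None is JSON null)
template = {
    "SNT_CONFIG": {
        "COUNTRY_CODE": "",
        "ANALYTICS_ORG_UNITS_LEVEL": 0,
        "REPORTING_RATE_PRODUCT_UID": [],
    },
    "DHIS2_DATA_DEFINITIONS": {
        "POPULATION_DEFINITIONS": {"GROWTH_FACTOR": None, "REFERENCE_YEAR": 2024},
    },
}

pending = find_placeholders(template)
for field in pending:
    print(field)
```

Note that a numeric placeholder such as `0` is not reported by this sketch, since it only looks for the three placeholder values listed above.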

## 3. (not mandatory but it helps) Use the same JSON code formatter

In VS Code, I'm using [Prettier](https://marketplace.visualstudio.com/items?itemName=esbenp.prettier-vscode) to ensure JSON files are automatically formatted on save. This way, spacing, line breaks, etc. stay consistent and whitespace-only differences don't get flagged as changes in git diff.

---

## (idea) Architectural Split: a More Robust and Intuitive Approach

Instead of having one large file that users edit, we should consider splitting it into 2 files:

- `SNT_config_global.json`: Contains only the "Fixed" components (like `SNT_DATASET_IDENTIFIERS`). The **user** should **not** be **exposed** to this because it is structural (do not touch!).
- `SNT_config_COD.json`: Contains only the country-specific fields. In this case I'm using DRC (COD) as a concrete example. The **user** must be able to correctly **modify** this file.

I put an example of each in the folder `./new_approach_idea`. It will require updating all the existing pipelines BUT I reckon it will improve everyone's experience with the config file ... fewer things breaking, more clarity for the user.
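
As a rough idea of what the pipeline side could look like, here is a minimal sketch that merges the two parts back into the single structure the pipelines read today. It assumes the global file holds `SNT_DATASET_IDENTIFIERS` and the country file holds everything else; the function name `merge_snt_config` and the sample values are hypothetical, and in practice the two dicts would come from `json.load` on the two files.

```python
def merge_snt_config(global_part: dict, country_part: dict) -> dict:
    """Combine the fixed and the country-specific config into one dict.

    Raises ValueError if a section other than "$schema" appears in both
    parts, so that a country file cannot silently override the fixed part.
    """
    overlap = (global_part.keys() & country_part.keys()) - {"$schema"}
    if overlap:
        raise ValueError(f"sections defined in both files: {sorted(overlap)}")
    return {**global_part, **country_part}


# Illustrative contents only; the real files would be loaded with json.load
global_part = {"SNT_DATASET_IDENTIFIERS": {"SNT_RESULTS": "snt-results"}}
country_part = {"SNT_CONFIG": {"COUNTRY_CODE": "COD"}}

merged = merge_snt_config(global_part, country_part)
print(sorted(merged))
```

Failing loudly on overlapping sections is one possible design choice: it keeps the "do not touch" part out of reach of the country file instead of letting the last file loaded win.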
60 changes: 60 additions & 0 deletions configuration/snt_config.schema.json
@@ -0,0 +1,60 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "SNT Configuration Schema",
  "type": "object",
  "required": [
    "SNT_CONFIG",
    "SNT_DATASET_IDENTIFIERS",
    "DHIS2_DATA_DEFINITIONS"
  ],
  "properties": {
    "SNT_CONFIG": {
      "type": "object",
      "required": ["COUNTRY_CODE", "COUNTRY_NAME", "ANALYTICS_ORG_UNITS_LEVEL"],
      "properties": {
        "COUNTRY_CODE": {
          "type": "string",
          "description": "ISO 3166-1 alpha-3 code (e.g., COD)"
        },
        "COUNTRY_NAME": { "type": "string" },
        "DHIS2_ADMINISTRATION_1": { "type": "string" },
        "DHIS2_ADMINISTRATION_2": { "type": "string" },
        "ANALYTICS_ORG_UNITS_LEVEL": { "type": "integer", "minimum": 1 },
        "REPORTING_RATE_PRODUCT_UID": {
          "type": "array",
          "items": { "type": "string" }
        }
      }
    },
    "SNT_DATASET_IDENTIFIERS": {
      "type": "object",
      "description": "FIXED: Do not change keys or values here unless updating the global pipeline logic.",
      "additionalProperties": { "type": "string" }
    },
    "DHIS2_DATA_DEFINITIONS": {
      "type": "object",
      "properties": {
        "POPULATION_DEFINITIONS": {
          "type": "object",
          "properties": {
            "TOTAL_POPULATION_REF": { "type": ["number", "null"] },
            "GROWTH_FACTOR": { "type": ["number", "null"] },
            "REFERENCE_YEAR": { "type": "integer" },
            "POPULATION_INDICATORS": { "type": "object" },
            "POPULATION_DISAGGREGATIONS": {
              "type": "object",
              "additionalProperties": { "type": ["number", "null"] }
            }
          }
        },
        "DHIS2_INDICATOR_DEFINITIONS": {
          "type": "object",
          "additionalProperties": {
            "type": "array",
            "items": { "type": "string" }
          }
        }
      }
    }
  }
}