feat(tooling): add Makefile and extract check_ids script #11

Merged
ningzimu merged 1 commit into main from fix/data-quality-and-tooling
Feb 25, 2026

@ningzimu
Collaborator

Summary

  • Add a Makefile with four targets (validate, check-ids, check, build-indexes) for local development
  • Extract the duplicate ID check into scripts/check_ids.py so CI workflows no longer need an inline heredoc
  • Simplify both validate-sources.yml and update-indexes.yml to call scripts/check_ids.py directly
  • Remove two duplicate source files found by the new check:
    • firstdata/sources/academic/economics/bis-statistics.json (duplicate of international/economics/bis.json)
    • firstdata/sources/sectors/education/arwu.json (duplicate of sectors/P-education/arwu.json, also had invalid schema format)
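
For readers unfamiliar with this kind of check, here is a minimal sketch of a duplicate ID scan. It is not the contents of scripts/check_ids.py, which this page does not show. It assumes each source file is a JSON object with a top-level "id" field and that sources live under firstdata/sources; the names find_duplicate_ids and main are hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path


def find_duplicate_ids(root):
    """Map each duplicated "id" value to the list of files that declare it."""
    files_by_id = defaultdict(list)
    for path in sorted(Path(root).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: each source file is a JSON object with a top-level "id".
        source_id = data.get("id") if isinstance(data, dict) else None
        if source_id is not None:
            files_by_id[source_id].append(str(path))
    # Keep only the IDs that appear in more than one file.
    return {sid: paths for sid, paths in files_by_id.items() if len(paths) > 1}


def main(root="firstdata/sources"):
    """Print every duplicated ID and return a shell-style exit code."""
    duplicates = find_duplicate_ids(root)
    for source_id, paths in sorted(duplicates.items(), key=lambda item: str(item[0])):
        print(f"Duplicate ID {source_id!r}:")
        for path in paths:
            print(f"  {path}")
    return 1 if duplicates else 0
```

A script built this way would typically end with `sys.exit(main())` so that a Makefile target or CI step fails when duplicates exist.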

Test plan

  • make check passes: all 134 source files valid, all IDs unique
  • make build-indexes runs successfully
  • CI workflows simplified and functional

- add Makefile with validate, check-ids, check, build-indexes targets
- extract duplicate ID check into scripts/check_ids.py for reuse
- simplify CI workflows to call scripts/check_ids.py instead of inline heredoc
- remove duplicate source files: bis-statistics.json and sectors/education/arwu.json
@ningzimu merged commit 153d0ed into main on Feb 25, 2026
2 checks passed
@ningzimu deleted the fix/data-quality-and-tooling branch on February 25, 2026 at 11:14