Duplicate Cleanup

The duplicate-maintenance scripts run against the remote D1 database by default and perform a dry run unless --apply is passed.

bun scripts/merge-duplicate-persons.ts
bun scripts/merge-duplicate-persons.ts --high-only
bun scripts/merge-duplicate-persons.ts --report tmp/duplicate-person-candidates.csv
bun scripts/merge-duplicate-persons.ts --apply
bun scripts/repair-merged-person-fields.ts

merge-duplicate-persons.ts reads active persons from D1 and groups them into duplicate candidates using three match tiers:

  • Same valid national_id.
  • Same normalized name, age, and gender.
  • Same hospital and normalized name (this tier is skipped with --high-only).
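
The three tiers above can be sketched as a bucketing pass. This is only an illustration, not the script's actual code: the Person shape, normalizeName, validNationalId, the 6 to 10 digit validity rule, and the key format are all assumptions, and the sketch does not reconcile a record that lands in more than one tier.

```typescript
// Hypothetical row shape; the real D1 schema may differ.
interface Person {
  id: string;
  name: string;
  age: number | null;
  gender: string | null;
  hospital: string | null;
  national_id: string | null;
}

type Tier = "national_id" | "name_age_gender" | "hospital_name";

// Assumed normalization: strip accents, collapse whitespace, lowercase.
function normalizeName(name: string): string {
  return name
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .replace(/\s+/g, " ")
    .trim()
    .toLowerCase();
}

// Assumed validity rule: 6 to 10 digits once separators are removed.
function validNationalId(raw: string | null): string | null {
  if (!raw) return null;
  const digits = raw.replace(/[^0-9]/g, "");
  return digits.length >= 6 && digits.length <= 10 ? digits : null;
}

// Build one key per tier the record qualifies for.
function tierKeys(p: Person, highOnly: boolean): Array<[Tier, string]> {
  const keys: Array<[Tier, string]> = [];
  const nid = validNationalId(p.national_id);
  const name = normalizeName(p.name);
  if (nid) keys.push(["national_id", nid]);
  if (name && p.age !== null && p.gender) {
    keys.push(["name_age_gender", `${name}|${p.age}|${p.gender.toLowerCase()}`]);
  }
  // The hospital tier is the one dropped by --high-only.
  if (!highOnly && name && p.hospital) {
    keys.push(["hospital_name", `${p.hospital.trim().toLowerCase()}|${name}`]);
  }
  return keys;
}

function groupCandidates(persons: Person[], highOnly = false): Map<string, Person[]> {
  const buckets = new Map<string, Person[]>();
  for (const p of persons) {
    for (const [tier, key] of tierKeys(p, highOnly)) {
      const bucketKey = `${tier}:${key}`;
      const bucket = buckets.get(bucketKey) ?? [];
      bucket.push(p);
      buckets.set(bucketKey, bucket);
    }
  }
  // Only buckets with two or more records are duplicate candidates.
  const groups = new Map<string, Person[]>();
  buckets.forEach((members, key) => {
    if (members.length > 1) groups.set(key, members);
  });
  return groups;
}
```

Calling groupCandidates(rows, true) in this sketch mirrors --high-only by never emitting hospital-tier keys.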

The script picks the richest survivor, merges fields into it, repoints tips, sources, and reports, then soft-deletes the rest. Deceased and found records outrank missing records.
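
A minimal sketch of survivor selection and field merging follows. The source does not define "richest" or say how status and richness interact, so this sketch assumes that status is compared first (deceased and found tie, both above missing), that richness is the count of non-empty fields, and that ties break on id. PersonRecord, pickSurvivor, and mergeInto are hypothetical names.

```typescript
type Status = "missing" | "found" | "deceased";

// Hypothetical record shape used only for this illustration.
interface PersonRecord {
  id: string;
  status: Status;
  fields: Record<string, string | null>;
}

// Assumed ranking: the text only says deceased and found outrank missing.
const STATUS_RANK: Record<Status, number> = { missing: 0, found: 1, deceased: 1 };

function isEmpty(value: string | null | undefined): boolean {
  return value === null || value === undefined || value.trim() === "";
}

// Assumed richness measure: number of non-empty fields.
function richness(r: PersonRecord): number {
  return Object.keys(r.fields).filter((k) => !isEmpty(r.fields[k])).length;
}

function pickSurvivor(group: PersonRecord[]): PersonRecord {
  if (group.length === 0) throw new Error("cannot pick a survivor from an empty group");
  const sorted = group.slice().sort(
    (a, b) =>
      STATUS_RANK[b.status] - STATUS_RANK[a.status] ||
      richness(b) - richness(a) ||
      a.id.localeCompare(b.id)
  );
  return sorted[0];
}

// Fill the survivor's empty fields from the other records; never overwrite.
function mergeInto(survivor: PersonRecord, others: PersonRecord[]): PersonRecord {
  const fields: Record<string, string | null> = { ...survivor.fields };
  for (const other of others) {
    for (const key of Object.keys(other.fields)) {
      if (isEmpty(fields[key]) && !isEmpty(other.fields[key])) {
        fields[key] = other.fields[key];
      }
    }
  }
  return { ...survivor, fields };
}
```

Repointing tips, sources, and reports and soft-deleting the remaining records would happen after this step and are not shown here.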

Groups with conflicting valid cedulas (national_id values) or incompatible ages are skipped.
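
The skip rule might look roughly like the check below. The source does not say what makes ages "incompatible" or what counts as a valid cedula, so MAX_AGE_GAP (2 years) and the digit-count rule are placeholder assumptions, and Candidate, validCedula, and skipReason are hypothetical names.

```typescript
// Hypothetical minimal shape for the conflict check.
interface Candidate {
  id: string;
  national_id: string | null;
  age: number | null;
}

// Assumption: ages further apart than this are treated as incompatible.
const MAX_AGE_GAP = 2;

// Assumed validity rule: 6 to 10 digits once separators are removed.
function validCedula(raw: string | null): string | null {
  if (!raw) return null;
  const digits = raw.replace(/[^0-9]/g, "");
  return digits.length >= 6 && digits.length <= 10 ? digits : null;
}

// Returns a reason to skip the group, or null if it is safe to merge.
function skipReason(group: Candidate[]): string | null {
  const cedulas = new Set<string>();
  const ages: number[] = [];
  for (const c of group) {
    const cedula = validCedula(c.national_id);
    if (cedula) cedulas.add(cedula);
    if (c.age !== null) ages.push(c.age);
  }
  // Two distinct valid IDs in one group suggest two distinct people.
  if (cedulas.size > 1) return "conflicting national IDs";
  if (ages.length > 1 && Math.max(...ages) - Math.min(...ages) > MAX_AGE_GAP) {
    return "incompatible ages";
  }
  return null;
}
```

Records with a missing or invalid national_id, or with no age, do not trigger a skip in this sketch; only disagreement between known values does.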

Use --report <path> to dump candidate groups as CSV. Use --apply only after reviewing the dry-run output.