CZ 1:1 population synthesis QA cycle
Country-specific layer for synthetic people in households, dwellings, real building stock where available, hidden-population overlays, and work/school assignment evidence.
Board: synthestat-population-qa · Tenant: synthestat · Country: CZ · Overall status: running
Ideal-country quality criteria: impossible 1:1 benchmark
This is the common gold-standard benchmark for an ideal country. It is intentionally impossible to fully satisfy: complete success would mean a 1:1 replica of the real population where every person, household, dwelling, attribute, and assignment is exactly represented. The QA page uses it as an asymptote and gap taxonomy, not as a release promise.
Apply this same rubric to this country’s latest run, then report which needs are measured, constrained, modelled, unavailable, or blocked.
| Need | Unachievable ideal | QA evidence we require instead | Why perfection cannot be achieved |
|---|---|---|---|
| Complete de jure resident coverage | Every real resident represented exactly once in the right country, municipality, small area, household, and dwelling. | Synthetic person count equals official population at all enforced geographies; no unexplained duplicate, missing, or out-of-universe people. | A true 1:1 resident list is a confidential population register and changes continuously; Synthestat can only match official aggregates and declared source universes. |
| Complete attribute truth | Each synthetic person has the same age, sex, household role, education, occupation, industry, origin, health proxy, income proxy, and lifecycle state as the corresponding real person. | Published marginal and cross-tab constraints pass within HARD/FIRM/SOFT tolerances; modelled fields carry uncertainty and measured/constrained/modelled provenance. | Official releases do not expose a complete individual joint distribution, and many attributes are survey-derived, lagged, suppressed, or unavailable at fine geography. |
| Perfect household and family structure | Every household contains the exact real members and relationships, including multi-generation, partnership, child, shared, institutional, and edge-case arrangements. | Household totals, household-type distributions, age/sex/role consistency, fertility/child constraints, and structural invariants pass with explicit residuals. | Household membership is sensitive microdata; public sources usually expose only aggregate household/family tables and partial cross-tabs. |
| Exact dwelling and building grounding | Every household is assigned to its real dwelling and building with exact occupancy, vacancy, dwelling type, floor area, tenure, and address-level geography. | Dwelling/building capacity checks pass; vacancy/second-home/institutional dwellings are represented or explicitly unavailable; building links have source provenance. | Many countries lack open address-level registers; dwelling occupancy is confidential and time-varying. |
| Complete de facto and hidden-population overlays | Homeless, undocumented, refugees, students away from home, seasonal, institutional, tourists, and daytime populations are all represented with exact location and timing. | Overlay layers use interval estimates, source-specific quality flags, and never silently modify de jure HARD constraints. | Hidden populations are partly unobserved by definition; ethical/privacy constraints forbid exact person-level labels. |
| Exact school, workplace, facility, and mobility assignment | Every person is assigned to the real school, workplace, care provider, commute, and daily activity chain they use. | Assignment layers use official registers/OD flows where available; modelled assignments are flagged and validated only against aggregate flows/capacities. | Operational assignments are usually protected registers or dynamic behavioural data; Phase 1 must not imply they are known. |
| Full joint-distribution realism | The full multivariate joint distribution is identical to reality across all attributes, households, geography, and rare subgroups. | High-priority marginals/cross-tabs pass; sparse zones and prior-dominated attributes are clearly marked with quality tiers and credible intervals. | The joint distribution is non-identifiable from published marginals; IPF/BN/hierarchical pooling choose plausible distributions, not truth. |
| Zero uncertainty and zero lag | All values are current today and known without error. | Every output records reference period, retrieval timestamp, lag, confidence, uncertainty bounds, and degradation decisions. | Official statistics are lagged, revised, sampled, suppressed, and harmonized after collection. |
| Privacy-safe yet maximally detailed release | The system releases maximum useful detail while creating zero re-identification risk. | Release mode, k-anonymity/cell safeguards, perturbation/aggregation policy, and sensitive-field treatment are explicit. | Fine-area synthetic microdata can still create structurally unique records; synthetic does not mean anonymous. |
| Perfect reproducibility and auditability | Any user can trace every output record to exact source snapshots, transformations, constraints, relaxations, seeds, and code versions. | Run manifests, source provenance, checksums, frozen extracts, seeds, versioned crosswalks, validation reports, and relaxation logs are complete. | This is approachable but never final: source portals, classifications, geography, and code keep changing, so audits must be continuously renewed. |
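The privacy row of the rubric names k-anonymity/cell safeguards as required evidence. A minimal sketch of such a check follows, assuming records are plain dictionaries; the function name `small_cells`, the field names (`zone`, `age_band`, `sex`), and the threshold `k=2` are hypothetical and not taken from the Synthestat codebase. It flags every combination of quasi-identifiers shared by fewer than `k` synthetic records.

```python
from collections import Counter

def small_cells(records, quasi_identifiers, k):
    """Return {cell: count} for every quasi-identifier combination with count < k."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {cell: n for cell, n in counts.items() if n < k}

# Hypothetical toy records; real field names and bands would come from the run bundle.
people = [
    {"zone": "A", "age_band": "30-39", "sex": "F"},
    {"zone": "A", "age_band": "30-39", "sex": "F"},
    {"zone": "A", "age_band": "80+", "sex": "M"},
]

risky = small_cells(people, ["zone", "age_band", "sex"], k=2)
print(risky)  # cells that would need aggregation or perturbation before release
```

A structurally unique record (count 1) is exactly the case the rubric warns about: synthetic does not mean anonymous.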
Population output status
| People | Target population | National coverage | Absolute shortfall | Households | Dwellings | Houses/buildings | Max marginal deviation | HARD status | Run |
|---|---|---|---|---|---|---|---|---|---|
| 10,524,167 | — | — | — | 4,813,103 | — | — | 0.00% | pass_exact | cz_population_targeted_priors_7eeb18af_seed420987 |
Deviation is the maximum absolute relative error across collected HARD/FIRM/SOFT marginal constraints in the latest review bundle. GUIDE/INFORMATIONAL priors are excluded. National target/coverage are read from build_manifest.json when available and override any visual impression of completion.
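The deviation definition above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's actual implementation: the function name `max_marginal_deviation` is hypothetical, entries without a numeric `target` and `actual` (such as integrated priors) are skipped, and a zero target is treated as infinite error unless the actual value is also zero.

```python
SCORED_PRECEDENCES = {"HARD", "FIRM", "SOFT"}  # GUIDE/INFORMATIONAL are excluded

def max_marginal_deviation(constraints):
    """Maximum absolute relative error over scored constraints with numeric targets."""
    worst = 0.0
    for c in constraints:
        if c.get("precedence") not in SCORED_PRECEDENCES:
            continue
        target, actual = c.get("target"), c.get("actual")
        if target is None or actual is None:
            continue  # assumption: priors without a target/actual pair are not scored
        if target == 0:
            # assumption: relative error is undefined for a zero target
            error = 0.0 if actual == 0 else float("inf")
        else:
            error = abs(actual - target) / abs(target)
        worst = max(worst, error)
    return worst

# The two numeric constraints reported for this run.
run_constraints = [
    {"constraint": "national_person_count", "precedence": "HARD",
     "target": 10524167, "actual": 10524167},
    {"constraint": "national_household_count", "precedence": "FIRM",
     "target": 4813103, "actual": 4813103},
]

deviation = max_marginal_deviation(run_constraints)
print(f"{deviation:.2%}")
```

With both national counts matched exactly, the result agrees with the 0.00% shown in the status table.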
Datasets and distributions
Lists come from the latest run bundle: source_provenance.json, distribution_diagnostics.json, and build_manifest.json.
Summary
| Metric | Value |
|---|---|
| Datasets used | 0 |
| Distributions available | 0 |
| Constraints/distributions used in synthesis | 12 |
| Constraint types | — |
| Dataset variants | — |
| Finest-geography status | — |
Source gaps
- No source gaps listed.
Datasets used
| Dataset/source ID |
|---|
| None listed yet. |
Best source by distribution family
| Distribution family | Dataset/source ID |
|---|---|
| None listed yet. | |
Available distributions / priors in registry
| Spec | Label | Type | Geo | Status | Variant | Confidence | Data URI |
|---|---|---|---|---|---|---|---|
| None listed yet. | |||||||
Constraints/distributions used in synthesis manifest
| Constraint or distribution ID | Precedence | Source ID | Status | Target | Actual |
|---|---|---|---|---|---|
| national_person_count | HARD | sldb2021_vek1_pohlavi | pass_exact | 10,524,167 | 10,524,167 |
| national_household_count | FIRM | CZ_CZSO_CENS21_HH_TYPE_SIZE_CSV | pass_exact | 4,813,103 | 4,813,103 |
| household_type_x_size_prior | SOFT | CZ_CZSO_CENS21_HH_TYPE_SIZE_CSV | integrated_component_prior | — | — |
| dependent_children_by_household_type_prior | SOFT | CZ_CZSO_CENS21_HH_TYPE_CHILDREN_CSV | integrated_component_prior | — | — |
| education_age_munisize_sex_prior | SOFT | CZ_CZSO_CENS21_EDU_AGE_MUNISIZE_SEX_CSV | integrated_component_prior | — | — |
| economic_activity_age10_sex_prior | SOFT | CZ_CZSO_CENS21_ACTIVITY_AGE10_SEX_CSV | integrated_component_prior | — | — |
| work_status_sex_prior | SOFT | CZ_CZSO_CENS21_WORK_STATUS_SEX_CSV | integrated_component_prior | — | — |
| occupation_isco1_sex_prior | SOFT | CZ_CZSO_CENS21_OCCUPATION_SEX_CSV | integrated_component_prior | — | — |
| industry_sex_prior | SOFT | CZ_CZSO_CENS21_INDUSTRY_SEX_CSV | integrated_component_prior | — | — |
| direct_member_age_relationship_structure | GUIDE | None | EVIDENCE_EXHAUSTED_PUBLIC_CZ_DIRECT_MEMBER_AGE_STRUCTURE | — | — |
| isco2_isco3_fine_joint | GUIDE | None | EVIDENCE_EXHAUSTED_PUBLIC_CZ_ISCO2_3_BY_AGE_SEX_EDU_REGION | — | — |
| full_non_demographic_tensor | GUIDE | None | EVIDENCE_EXHAUSTED_PUBLIC_CZ_FULL_EDU_LABOUR_OCC_INDUSTRY_TENSOR | — | — |
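The manifest entries listed above can be cross-checked against the Summary counts. The sketch below is one plausible consistency check, not the dashboard's own logic: it transcribes each entry's constraint name, precedence, and source ID into a hypothetical `MANIFEST` list, then tallies entries by precedence, collects the distinct source IDs, and lists the entries that carry no source.

```python
from collections import Counter

# (constraint, precedence, source_id) transcribed from the manifest entries above.
MANIFEST = [
    ("national_person_count", "HARD", "sldb2021_vek1_pohlavi"),
    ("national_household_count", "FIRM", "CZ_CZSO_CENS21_HH_TYPE_SIZE_CSV"),
    ("household_type_x_size_prior", "SOFT", "CZ_CZSO_CENS21_HH_TYPE_SIZE_CSV"),
    ("dependent_children_by_household_type_prior", "SOFT", "CZ_CZSO_CENS21_HH_TYPE_CHILDREN_CSV"),
    ("education_age_munisize_sex_prior", "SOFT", "CZ_CZSO_CENS21_EDU_AGE_MUNISIZE_SEX_CSV"),
    ("economic_activity_age10_sex_prior", "SOFT", "CZ_CZSO_CENS21_ACTIVITY_AGE10_SEX_CSV"),
    ("work_status_sex_prior", "SOFT", "CZ_CZSO_CENS21_WORK_STATUS_SEX_CSV"),
    ("occupation_isco1_sex_prior", "SOFT", "CZ_CZSO_CENS21_OCCUPATION_SEX_CSV"),
    ("industry_sex_prior", "SOFT", "CZ_CZSO_CENS21_INDUSTRY_SEX_CSV"),
    ("direct_member_age_relationship_structure", "GUIDE", None),
    ("isco2_isco3_fine_joint", "GUIDE", None),
    ("full_non_demographic_tensor", "GUIDE", None),
]

by_precedence = Counter(precedence for _, precedence, _ in MANIFEST)
source_ids = sorted({source for _, _, source in MANIFEST if source is not None})
unsourced = [name for name, _, source in MANIFEST if source is None]

print(len(MANIFEST), dict(by_precedence))
print(len(source_ids), "distinct source IDs referenced")
print("no source:", unsourced)
```

The total of 12 entries matches the Summary row for constraints/distributions used in synthesis. The manifest references 8 distinct source IDs while the Summary reports 0 datasets used, which a reviewer may want to reconcile against source_provenance.json.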
Current country tasks
| ID | Title | Assignee | Status | Created | Latest summary |
|---|---|---|---|---|---|
| t_62645062 | CZ integrate household/non-demographic distribution priors into national synthesis | synth-modeler | running | 2026-05-19 20:26:36 CEST | |
| t_edd59a28 | CZ re-review after registry-aware national synthesis implementation | synth-reviewer | todo | 2026-05-19 20:00:45 CEST | |
| t_61551a80 | Implement CZ registry-aware national synthesis path beyond seeded 8-person fixture | synth-modeler | blocked | 2026-05-19 19:41:42 CEST | review-required: CZ registry-aware national LKOD path implemented and generated exact 10,524,167-person output; needs human review before treating code/artifacts as accepted. |
| t_9f132a82 | CZ freeze targeted household/non-demographic distribution tables | synth-downloader | done | 2026-05-19 20:26:27 CEST | Froze/catalogued the CZ targeted household/non-demographic distribution bundle: 5/5 requested CZSO LKOD CSV+metadata+LKOD endpoint snapshots downloaded with checksums, row/value counts, sidecars, and combined catalogue records. Downloader handoffs and latest snapshot were written under /home/synthes |
| t_e468ed5e | CZ RUIAN VFR building/address downloader and parser freeze | synth-downloader | done | 2026-05-19 20:24:42 CEST | Froze CZ ČÚZK RÚIAN/VFR 202602 at national manifest plus verified sample-parser scope: 6,258 OB municipality package records, checksummed sample/dependency raw assets, parsed sample building/address tables, parsed ST_UCIS usage codelist, and updated downloader/manager handoffs. Full national OB payl |
| t_16d03050 | CZ household composition and non-demographic joint distributions | synth-distributions-researcher | done | 2026-05-19 20:13:36 CEST | Completed CZ household/non-demographic distribution evidence research. Wrote findings, extraction specs, and latest evidence board; identified actionable CZSO/Eurostat priors for household templates and broad education/labour/occupation/industry fields while marking direct member-age structure, publ |
| t_6d08f1e9 | CZ bounded hidden-population overlay source sweep | synth-marginals-researcher | done | 2026-05-19 20:13:35 CEST | Completed CZ bounded hidden-population overlay sweep and wrote durable handoffs for downloader/modeler. Found usable bounded sources for CZSO Census 2021 group-quarter/outside-dwelling/census-homeless counts, RILSA/MPSV ETHOS homelessness/housing-need estimates, MVCR temporary-protection district x |
| t_5b3ef51c | CZ RUIAN dwelling/building linkage and capacity evidence for national placement | synth-marginals-researcher | done | 2026-05-19 20:13:34 CEST | Closed CZ RÚIAN dwelling/building source research: official ČÚZK VFR OB municipality packages provide building IDs, address links, obec linkage, usage codes, and registered flat count (`PocetBytu`) with schema/codelist provenance; capacity caveats remain explicit. Wrote required marginals handoffs a |
| t_f0432571 | CZ immediate re-review: national LKOD 10.5M output after t_61551a80 | synth-reviewer | done | 2026-05-19 20:08:57 CEST | PASS_FOR_INTERNAL_REVIEW for CZ national de jure core only. Verified the 8-person fixture regression is resolved: 10,524,167 persons, 4,813,103 households, HARD population pass_exact, FIRM household pass_exact; hidden populations, RUIAN dwelling/building placement, and work/school/facility layers re |
| t_7543c007 | CZ manager gate: route source closures into >8-person implementation rerun | synth-manager | done | 2026-05-19 19:49:59 CEST | Routed CZ source closures into the implementation path: source/marginal/distribution/downloader handoffs agree the old 8-person output was a model-runner fixture blocker, not a missing-source blocker. Verified t_61551a80 received the exact source payload comment and has produced a 10,524,167-person |
| t_71b60a40 | CZ urgent downloader verification: source payload readiness for national synthesis | synth-downloader | done | 2026-05-19 19:49:57 CEST | Verified CZ frozen official-source payload readiness and wrote downloader handoffs. Verdict SOURCE_PAYLOAD_READY: the core CZSO LKOD 2021 population/person/household/dwelling files and checksums are already frozen; the 8-person blocker is implementation wiring, not a missing exact source payload. |
| t_f35ab7ff | CZ urgent source closure: exact distribution priors needed for >8-person national synthesis | synth-distributions-researcher | done | 2026-05-19 19:49:56 CEST | DISTRIBUTION_READY_FOR_MODEL_FIX: completed CZ urgent distribution source closure and wrote findings/extraction/latest handoffs plus a wiki output memo. The >8-person failure is a fixture-bound generator/manifest-runner implementation bug, not a vague missing-distribution blocker; remaining true gap |
| t_401b35ae | CZ urgent source closure: exact marginals needed for >8-person national synthesis | synth-marginals-researcher | done | 2026-05-19 19:49:55 CEST | SOURCE_READY_FOR_MODEL_FIX: closed CZ national marginal-source question with exact CZSO/LKOD population, household, dwelling, and building controls; the 8-person failure is a model/runner fixture blocker, not a missing core-source blocker. Wrote marginals handoffs and wiki memo, and appended the man |
| t_3e7c7885 | Review CZ population QA cycle 3 bundle: frozen catalogue consumed but output remains 8-person fixture | synth-reviewer | done | 2026-05-19 19:41:30 CEST | Reviewed CZ population QA cycle-3 bundle and wrote the QA report at /home/synthestat/workspace/manager_handoffs/reviewer/2026-05-19_174546Z_CZ_population_review_cycle3.md. Verdict NEEDS_MODEL_FIX: bundle is contract-complete/readable and frozen catalogues are correctly recorded as provenance/target/ |
| t_8ccfccd8 | CZ population QA cycle 3: rerun using frozen official source bundle, no 8-person fixture stop | synth-modeler | done | 2026-05-19 19:37:45 CEST | Built and verified CZ population QA cycle-3 review bundle at /home/synthestat/output/runs/CZ/cz_population_review_cycle3_841ad7a3_seed420987 using the frozen t_ddbeb722 catalogues for provenance/target/gap assessment. The bundle is contract-complete with HARD pass_exact, but explicitly not a nationa |
| t_ddbeb722 | CZ downloader freeze/catalogue official sources from cycle-2 source-gap handoffs before next model rerun | synth-downloader | done | 2026-05-19 19:01:18 CEST | Froze/catalogued the CZ cycle-2 official source bundle for the next model rerun: CZSO LKOD marginal/admin assets, existing 51-artifact distribution/prior bundle, and MV Ukrainian temporary-protection XLSX are tied into a combined modeller-facing catalogue with checksum verification and preserved evi |
| t_2d605ad5 | CZ population QA cycle 2 review: inspect rerun bundle after source-gap handoffs | synth-reviewer | done | 2026-05-19 18:58:21 CEST | Reviewed CZ cycle-2 bundle and wrote handoff at /home/synthestat/workspace/manager_handoffs/reviewer/2026-05-19_170104Z_CZ_population_review_cycle2.md. Verdict NEEDS_MORE_SOURCES: bundle is contract-complete and fixed the dwelling household_id issue, but no frozen/catalogued source upgrades were con |
| t_70f8ac25 | CZ downloader freeze for distribution evidence sources | synth-downloader | done | 2026-05-19 18:53:20 CEST | Froze CZ distribution-evidence sources for the model rerun: 51/51 CZSO, Eurostat, ILO lead, and Smith 2024 metadata artifacts downloaded with SHA256 checksums, retrieval timestamps, sidecars, and row/value/status-count inspections. Wrote downloader handoffs and latest snapshot; no production registr |
| t_ae2f8661 | CZ downloader follow-up: ingest CZSO LKOD/Census 2021, CUZK/RUIAN, MPSV/UZIS/MŠMT official sources | synth-downloader | done | 2026-05-19 18:47:23 CEST | Completed CZ official-source ingestion/provenance freeze: downloaded 16/16 requested CZSO LKOD datasets plus schemas, MPSV RPSS JSON/schema, UZIS NRPZS route+CSV exports, MŠMT register/data route evidence, and CUZK VDP/VFR route pages. Wrote downloader handoffs and catalogue artifacts with checksums |
| t_d0d7c4c0 | CZ population rerun after source upgrades with degraded-layer metadata preserved | synth-modeler | done | 2026-05-19 18:36:57 CEST | Built and verified CZ cycle-2 population review bundle at /home/synthestat/output/runs/CZ/cz_population_review_cycle2_59810317_seed420987 after rerunning the CZ seeded synthesis/validation path. The bundle is contract-complete with HARD residuals pass_exact, explicit unavailable hidden/work-school l |
| t_5393f669 | CZ joint/conditional distribution evidence for modelled Cxx/Dxx surfaces | synth-distributions-researcher | done | 2026-05-19 18:36:47 CEST | Completed CZ joint/conditional evidence search and wrote distribution handoffs: findings, extraction specs, and latest evidence board. Key result: most weak Cxx surfaces can be upgraded with CZSO/Eurostat Czech-specific sources, but public ISCO-2/3 occupation and C08 income mobility remain evidence- |
| t_6869427a | CZ source gap closure for official marginals, buildings/dwellings, hidden populations, and assignments | synth-marginals-researcher | done | 2026-05-19 18:36:14 CEST | Completed CZ marginals source-gap closure and wrote the required handoff files under /home/synthestat/workspace/manager_handoffs/marginals/. Verified concrete official source routes for CZSO LKOD/Census 2021, CUZK/RUIAN, MPSV RPSS, UZIS NRPZS, MŠMT school registers, and MV Ukraine statistics; unreso |
| t_a0950153 | CZ population QA cycle 1: review population bundle | synth-reviewer | done | 2026-05-19 18:23:12 CEST | Reviewed CZ cycle-1 population bundle and wrote reviewer handoff at /home/synthestat/workspace/manager_handoffs/reviewer/2026-05-19_163440Z_CZ_population_review_cycle1.md. Verdict NEEDS_MORE_SOURCES: bundle is structurally reviewable and HARD/FIRM constraints pass, but it is only a 2-zone/8-person s |
| t_d29e80e4 | CZ population QA cycle 1: build population review bundle | synth-modeler | done | 2026-05-19 18:22:59 CEST | Built and verified the CZ cycle-1 population review bundle at /home/synthestat/output/runs/CZ/cz_population_review_cycle1_7ec6406c_seed420987. The bundle is contract-complete for internal review, HARD residuals pass exact, and hidden/work-school layers are explicitly unavailable rather than silently |
| t_6636bd4f | orchestrate CZ population synthesis QA loop | synth-manager | done | 2026-05-19 18:21:43 CEST | Routed CZ population synthesis QA cycle 1: created synth-modeler bundle-build task t_d29e80e4 and dependent synth-reviewer bundle-review task t_a0950153. Updated /home/synthestat/workspace/manager_handoffs/manager_updates.md with bottleneck, routing, blocker, and next-action notes. |
Process
synth-manager creates and controls the country loop.
synth-modeler generates the review bundle: people, households, dwellings/buildings or unavailable markers, overlays, assignments, manifests, residuals, diagnostics, uncertainty, provenance.
synth-reviewer audits constraints, marginals, household/family realism, hidden populations, dwelling/building grounding, work/school assignment, uncertainty, provenance, and privacy.
PASS finalizes the cycle. NEEDS_MODEL_FIX routes back to the modeler. NEEDS_MORE_SOURCES routes to the marginal and distribution researchers, then to the downloader. Exhausted evidence or a model plateau stops the loop for a human decision.
Quality gates and stop conditions
- PASS: satisfactory for declared country evidence tier and internal review mode.
- NEEDS_MODEL_FIX: an issue in model logic, the bundle, uncertainty reporting, or household/dwelling/assignment handling; routes back to the modeler.
- NEEDS_MORE_SOURCES: missing marginal or joint/conditional evidence; routes to the researchers, then to the downloader.
- EVIDENCE_EXHAUSTED_HUMAN_REVIEW: further source search cannot responsibly improve the output.
- MODEL_IMPROVEMENT_EXHAUSTED_HUMAN_REVIEW: the modeler cannot materially improve the output, or diagnostics have plateaued.
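The gates and routing described above can be sketched as a small lookup table. This is a minimal illustration, assuming each verdict maps to an ordered list of next assignees; the names `ROUTES` and `next_assignees` and the `human-review` placeholder are hypothetical, while the role names are those shown in the task table.

```python
# Ordered next assignees per reviewer verdict; an empty list means the cycle is final.
ROUTES = {
    "PASS": [],
    "NEEDS_MODEL_FIX": ["synth-modeler"],
    "NEEDS_MORE_SOURCES": [
        "synth-marginals-researcher",
        "synth-distributions-researcher",
        "synth-downloader",
    ],
    "EVIDENCE_EXHAUSTED_HUMAN_REVIEW": ["human-review"],
    "MODEL_IMPROVEMENT_EXHAUSTED_HUMAN_REVIEW": ["human-review"],
}

def next_assignees(verdict):
    """Return the ordered assignees for a verdict, rejecting unknown verdicts."""
    if verdict not in ROUTES:
        raise ValueError(f"unknown verdict: {verdict!r}")
    return list(ROUTES[verdict])  # copy so callers cannot mutate the table

def is_stop_condition(verdict):
    """True when the automated loop ends: finalized or handed to a human."""
    return next_assignees(verdict) in ([], ["human-review"])

print(next_assignees("NEEDS_MORE_SOURCES"))
print(is_stop_condition("NEEDS_MODEL_FIX"))
```

Rejecting unknown verdicts explicitly keeps a misspelled status from silently finalizing a cycle.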