LT 1:1 population synthesis QA cycle
Country-specific layer for synthetic people in households, dwellings, real building stock where available, hidden-population overlays, and work/school assignment evidence.
Board: synthestat-population-qa · Tenant: synthestat · Country: LT · Overall status: running
Ideal-country quality criteria: impossible 1:1 benchmark
This is the common gold-standard benchmark for an ideal country. It is intentionally impossible to fully satisfy: complete success would mean a 1:1 replica of the real population where every person, household, dwelling, attribute, and assignment is exactly represented. The QA page uses it as an asymptote and gap taxonomy, not as a release promise.
Apply this same rubric to this country’s latest run, then report which needs are measured, constrained, modelled, unavailable, or blocked.
| Need | Unachievable ideal | QA evidence we require instead | Why perfection cannot be achieved |
|---|---|---|---|
| Complete de jure resident coverage | Every real resident represented exactly once in the right country, municipality, small area, household, and dwelling. | Synthetic person count equals official population at all enforced geographies; no unexplained duplicate, missing, or out-of-universe people. | A true 1:1 resident list is a confidential population register and changes continuously; Synthestat can only match official aggregates and declared source universes. |
| Complete attribute truth | Each synthetic person has the same age, sex, household role, education, occupation, industry, origin, health proxy, income proxy, and lifecycle state as the corresponding real person. | Published marginal and cross-tab constraints pass within HARD/FIRM/SOFT tolerances; modelled fields carry uncertainty and measured/constrained/modelled provenance. | Official releases do not expose a complete individual joint distribution, and many attributes are survey-derived, lagged, suppressed, or unavailable at fine geography. |
| Perfect household and family structure | Every household contains the exact real members and relationships, including multi-generation, partnership, child, shared, institutional, and edge-case arrangements. | Household totals, household-type distributions, age/sex/role consistency, fertility/child constraints, and structural invariants pass with explicit residuals. | Household membership is sensitive microdata; public sources usually expose only aggregate household/family tables and partial cross-tabs. |
| Exact dwelling and building grounding | Every household is assigned to its real dwelling and building with exact occupancy, vacancy, dwelling type, floor area, tenure, and address-level geography. | Dwelling/building capacity checks pass; vacancy/second-home/institutional dwellings are represented or explicitly unavailable; building links have source provenance. | Many countries lack open address-level registers; dwelling occupancy is confidential and time-varying. |
| Complete de facto and hidden-population overlays | Homeless, undocumented, refugees, students away from home, seasonal, institutional, tourists, and daytime populations are all represented with exact location and timing. | Overlay layers use interval estimates, source-specific quality flags, and never silently modify de jure HARD constraints. | Hidden populations are partly unobserved by definition; ethical/privacy constraints forbid exact person-level labels. |
| Exact school, workplace, facility, and mobility assignment | Every person is assigned to the real school, workplace, care provider, commute, and daily activity chain they use. | Assignment layers use official registers/OD flows where available; modelled assignments are flagged and validated only against aggregate flows/capacities. | Operational assignments are usually protected registers or dynamic behavioural data; Phase 1 must not imply they are known. |
| Full joint-distribution realism | The full multivariate joint distribution is identical to reality across all attributes, households, geography, and rare subgroups. | High-priority marginals/cross-tabs pass; sparse zones and prior-dominated attributes are clearly marked with quality tiers and credible intervals. | The joint distribution is non-identifiable from published marginals; IPF/BN/hierarchical pooling choose plausible distributions, not truth. |
| Zero uncertainty and zero lag | All values are current today and known without error. | Every output records reference period, retrieval timestamp, lag, confidence, uncertainty bounds, and degradation decisions. | Official statistics are lagged, revised, sampled, suppressed, and harmonized after collection. |
| Privacy-safe yet maximally detailed release | The system releases maximum useful detail while creating zero re-identification risk. | Release mode, k-anonymity/cell safeguards, perturbation/aggregation policy, and sensitive-field treatment are explicit. | Fine-area synthetic microdata can still create structurally unique records; synthetic does not mean anonymous. |
| Perfect reproducibility and auditability | Any user can trace every output record to exact source snapshots, transformations, constraints, relaxations, seeds, and code versions. | Run manifests, source provenance, checksums, frozen extracts, seeds, versioned crosswalks, validation reports, and relaxation logs are complete. | This is approachable but never final: source portals, classifications, geography, and code keep changing, so audits must be continuously renewed. |
Population output status
| People | Target population | National coverage | Absolute shortfall | Households | Dwellings | Houses/buildings | Max marginal deviation | HARD status | Run |
|---|---|---|---|---|---|---|---|---|---|
| 2,810,761 | — | — | — | 1,215,360 | 1,215,360 | — | — | pass_exact | lt_population_review_cycle4_888b7387_seed420987 |
Max marginal deviation is the maximum absolute relative error across the collected HARD/FIRM/SOFT marginal constraints in the latest review bundle. GUIDE/INFORMATIONAL priors are excluded. The national target and coverage are read from build_manifest.json when available and take precedence over any visual impression of completion.
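The deviation and coverage figures described above can be sketched as follows. This is a minimal illustration, not the pipeline's code: the `Residual` record, its field names, and both function names are hypothetical, and skipping zero-target cells is an assumption (the page does not say how they are treated).

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Tiers that count toward the deviation; GUIDE/INFORMATIONAL priors are excluded.
ENFORCED_TIERS = {"HARD", "FIRM", "SOFT"}


@dataclass(frozen=True)
class Residual:
    """One marginal constraint cell (hypothetical shape, for illustration only)."""
    constraint_id: str
    tier: str
    target: float
    synthetic: float


def max_marginal_deviation(rows: Iterable[Residual]) -> Optional[float]:
    """Maximum absolute relative error over HARD/FIRM/SOFT cells.

    Returns None when no enforced cell is available, which a page could render as a dash.
    Assumption: cells with a zero target are skipped, since relative error is undefined.
    """
    errors = [
        abs(r.synthetic - r.target) / abs(r.target)
        for r in rows
        if r.tier in ENFORCED_TIERS and r.target != 0
    ]
    return max(errors) if errors else None


def coverage_and_shortfall(
    people: int, target: Optional[int]
) -> Tuple[Optional[float], Optional[int]]:
    """National coverage (people / target) and absolute shortfall (target - people).

    Both are None when the manifest gives no usable target.
    A negative shortfall would indicate an overshoot.
    """
    if target is None or target <= 0:
        return None, None
    return people / target, target - people
```

With a missing target, `coverage_and_shortfall(2_810_761, None)` returns `(None, None)`, which matches the empty target, coverage, and shortfall cells in the status table.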
Datasets and distributions
Lists come from the latest run bundle: source_provenance.json, distribution_diagnostics.json, and build_manifest.json.
Summary
| Metric | Value |
|---|---|
| Datasets used | 0 |
| Distributions available | 38 |
| Constraints/distributions used in synthesis | 3 |
| Constraint types | — |
| Dataset variants | — |
| Finest-geography status | — |
Source gaps
- No source gaps listed.
Datasets used
| Dataset/source ID |
|---|
| None listed yet. |
Best source by distribution family
| Distribution family | Dataset/source ID |
|---|---|
| None listed yet. | |
Available distributions / priors in registry
| Spec | Label | Type | Geo | Status | Variant | Confidence | Data URI |
|---|---|---|---|---|---|---|---|
| C01_education_occupation_coupling | Education-occupation coupling strength | GUIDE | national | modelled | comparable_country | 0.615 | data/literature/seeded_occupation_priors.yaml |
| C02_assortative_mating_education | Assortative mating by education | GUIDE | municipality | modelled | comparable_country | 0.625 | data/literature/seeded_occupation_priors.yaml |
| C03_assortative_mating_age | Assortative mating by age | GUIDE | municipality | modelled | comparable_country | 0.695 | data/literature/seeded_occupation_priors.yaml |
| C04_assortative_mating_origin | Assortative mating by origin | GUIDE | municipality | modelled | comparable_country | 0.635 | data/literature/seeded_occupation_priors.yaml |
| C05_spatial_sorting_education | Spatial sorting by education | GUIDE | national | modelled | comparable_country | 0.715 | data/literature/seeded_occupation_priors.yaml |
| C06_spatial_sorting_income | Spatial sorting by income | GUIDE | national | modelled | comparable_country | 0.715 | data/literature/seeded_occupation_priors.yaml |
| C07_spatial_sorting_origin | Spatial sorting by origin | GUIDE | national | modelled | comparable_country | 0.735 | data/literature/seeded_occupation_priors.yaml |
| C08_intergenerational_income_elasticity | Intergenerational income elasticity | GUIDE | national | modelled | comparable_country | 0.595 | data/literature/seeded_occupation_priors.yaml |
| C09_intergenerational_occupation_transmission | Intergenerational occupation transmission | GUIDE | national | modelled | comparable_country | 0.595 | data/literature/seeded_occupation_priors.yaml |
| C10_commuting_mode_distance | Commuting mode × distance × occupation × region | GUIDE | municipality | modelled | comparable_country | 0.655 | data/literature/seeded_occupation_priors.yaml |
| C11_health_age_sex_education | Health × age × sex × education | GUIDE | national | modelled | comparable_country | 0.635 | data/literature/seeded_occupation_priors.yaml |
| D01_age_sex_nuts3 | Age × sex at NUTS-3 | HARD | NUTS-3 | constrained | robust | 0.74 | docs/wiki/compiled/D01_age_sex_nuts3.md |
| D01_census_age_sex_nuts3 | Census age × sex at NUTS-3 | HARD | NUTS-3 | constrained | robust | 0.74 | docs/wiki/compiled/D01_census_age_sex_nuts3.md |
| D02_marital_nuts3 | Marital status × age × sex at NUTS-3 | FIRM | NUTS-3 | constrained | robust | 0.73 | docs/wiki/compiled/D02_marital_nuts3.md |
| D03_origin_age_sex | Origin group × age × sex | FIRM | NUTS-3 | constrained | robust | 0.73 | docs/wiki/compiled/D03_origin_age_sex.md |
| D04_religion_age_sex_region | Religion × age × sex × region | GUIDE | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D04_religion_age_sex_region.md |
| D05_census_education_nuts3 | Census education at NUTS-3 | FIRM | NUTS-3 | constrained | robust | 0.73 | docs/wiki/compiled/D05_census_education_nuts3.md |
| D05_education_nuts2 | Education at NUTS-2 | FIRM | NUTS-2 | constrained | current | 0.7 | docs/wiki/compiled/D05_education_nuts2.md |
| D06_employment_age_sex_education | Employment status × age × sex × education | FIRM | unknown | constrained | robust | 0.73 | docs/wiki/compiled/D06_employment_age_sex_education.md |
| D07_occupation_isco3 | Occupation ISCO-3 distribution | SOFT | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D07_occupation_isco3.md |
| D08_occupation_education | Occupation × education | SOFT | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D08_occupation_education.md |
| D09_industry_nace2 | Industry NACE-2 distribution | SOFT | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D09_industry_nace2.md |
| D10_income_education_occupation | Income × education × occupation | SOFT | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D10_income_education_occupation.md |
| D11_income_household_type_region | Income × household type × region | SOFT | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D11_income_household_type_region.md |
| D12_household_type_size_region | Household type × size × region | FIRM | NUTS-3 | constrained | robust | 0.73 | docs/wiki/compiled/D12_household_type_size_region.md |
| D13_children_mother_age_education | Children × mother age × education | SOFT | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D13_children_mother_age_education.md |
| D14_partner_age_gap_homogamy | Partner age gap × homogamy | SOFT | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D14_partner_age_gap_homogamy.md |
| D15_coresidence_structure | Co-residence structure | SOFT | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D15_coresidence_structure.md |
| D16_household_income_type_region | Household income × type × region | SOFT | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D16_household_income_type_region.md |
| D17_education_mobility | Education mobility | GUIDE | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D17_education_mobility.md |
| D18_occupation_given_education | Occupation \| education | SOFT | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D18_occupation_given_education.md |
| D19_employment_given_demographics | Employment \| demographics | SOFT | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D19_employment_given_demographics.md |
| D20_birth_intervals | Birth intervals | GUIDE | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D20_birth_intervals.md |
| D21_age_first_birth | Age at first birth × education × cohort | GUIDE | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D21_age_first_birth.md |
| D22_age_leaving_home | Age leaving home | GUIDE | unknown | constrained | robust | 0.71 | docs/wiki/compiled/D22_age_leaving_home.md |
| D23_divorce_duration_children_education | Divorce × duration × children × education | GUIDE | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D23_divorce_duration_children_education.md |
| D24_age_marriage_sex_education | Age at marriage × sex × education | GUIDE | NUTS-3 | constrained | robust | 0.71 | docs/wiki/compiled/D24_age_marriage_sex_education.md |
| D25_internal_migration | Internal migration | FIRM | unknown | constrained | robust | 0.73 | docs/wiki/compiled/D25_internal_migration.md |
Constraints/distributions used in synthesis manifest
| Constraint or distribution ID |
|---|
| Eurostat:cens_21resh_r2 |
| Eurostat:cens_21hhcs_r3 |
| Eurostat:cens_21empo_r2 |
Current country tasks
| ID | Title | Assignee | Status | Created | Latest summary |
|---|---|---|---|---|---|
| t_d3b17db4 | LT population QA downloader: freeze exact official payloads for national-scale rerun | synth-downloader | running | 2026-05-19 20:17:42 CEST | |
| t_bc67be5c | LT population QA reviewer: inspect national-scale non-fixture rerun | synth-reviewer | todo | 2026-05-19 20:17:44 CEST | |
| t_ee0b60c7 | LT population QA modeler: national-scale non-fixture synthesis rerun | synth-modeler | todo | 2026-05-19 20:17:43 CEST | |
| t_78a5c4a3 | LT cycle 4 model fix: enforce household-family age/role/type realism | synth-modeler | done | 2026-05-19 20:26:45 CEST | Built LT cycle-4 population review bundle at /home/synthestat/output/runs/LT/lt_population_review_cycle4_888b7387_seed420987. It preserves 2,810,761 persons / 1,215,360 households / 1,215,360 dwellings, 76/76 HARD residual rows pass_exact, complete explicit unavailable markers, and now passes househ |
| t_292b6371 | LT cycle 3 reviewer inspection: national/NUTS2 constrained model fix bundle | synth-reviewer | done | 2026-05-19 20:22:16 CEST | Reviewed LT cycle-3 bundle and returned NEEDS_MODEL_FIX. Cycle-2 artefact blockers are materially addressed (contract complete, no 8-person fixture, 76/76 HARD residual rows pass_exact, variation present, unavailable layers explicit), but household-family realism fails at national scale with 487,000 |
| t_1bfbaf35 | LT population QA distribution closure: joint priors for non-fixture synthesis | synth-distributions-researcher | done | 2026-05-19 20:17:40 CEST | Completed LT distribution evidence closure: verdict DISTRIBUTION_READY_FOR_MODEL_FIX for national/NUTS2-constrained non-fixture synthesis, with unsupported fine joints kept modelled/unavailable. Wrote LT findings, extraction specs, refreshed distributions/latest.md, and appended a manager update. |
| t_405d5f40 | LT population QA source closure: exact national marginals for non-fixture synthesis | synth-marginals-researcher | done | 2026-05-19 20:17:39 CEST | Completed LT national marginal source closure and wrote the required handoffs under /home/synthestat/workspace/manager_handoffs/marginals. Status is SOURCE_READY_FOR_MODEL_FIX: official controls close 2,810,761 de jure residents, 1,215,360 private households, dwelling targets, non-private/institutio |
| t_4408d0ca | LT cycle 3 model fix: integrate frozen official LT sources beyond seeded fixture | synth-modeler | done | 2026-05-19 20:09:27 CEST | Built LT cycle-3 model-fix review bundle at /home/synthestat/output/runs/LT/lt_population_review_cycle3_888b7387_seed420987, replacing the prior 8-person fixture with a national/NUTS2-constrained internal candidate: 2,810,761 persons, 1,215,360 households, and 1,215,360 synthetic dwellings. Frozen E |
| t_bfeae4a3 | LT cycle 2 review after source-upgraded model rerun | synth-reviewer | done | 2026-05-19 19:59:54 CEST | Reviewed LT cycle-2 bundle and wrote the QA report at /home/synthestat/workspace/manager_handoffs/reviewer/2026-05-19_181900Z_LT_population_review_cycle2.md. Verdict NEEDS_MODEL_FIX: the bundle is contract-complete and honest about unavailable layers, but remains an 8-person seeded fixture, not a cr |
| t_cd2a944e | LT cycle 2 model rerun after source/download follow-ups | synth-modeler | done | 2026-05-19 19:59:39 CEST | Built and verified the LT cycle-2 population review bundle at /home/synthestat/output/runs/LT/lt_population_review_cycle2_837bfa75_seed420987 after rerunning the LT seeded source/building/country/hierarchical/population/validation path. The bundle is contract-complete with explicit unavailable hidde |
| t_4cf4377e | LT downloader schema profile: RC/VDA/NTR official source candidates from t_faad3b28 | synth-downloader | done | 2026-05-19 18:56:45 CEST | Completed LT official-source downloader schema profiling for the approved RC/VDA/NTR/VMI/Migration/RPPC candidates. Wrote an auditable manifest with small API samples, cheap row counts, bounded VDA/RC cache downloads, checksums, and explicit Sodra/hidden-population degradation flags; no modelling or |
| t_fbb4483c | LT downloader: freeze Eurostat distribution evidence pulls and OSP native table discovery | synth-downloader | done | 2026-05-19 18:52:29 CEST | Froze the approved Lithuania Eurostat distribution evidence queue: all 26 datasets returned HTTP 200 and are stored with raw JSON-stat payloads, headers, URLs, checksums, retrieval timestamps, dimensions/geographies/periods, and flag/status metadata under /home/synthestat/data/ingestion/lt/official_ |
| t_fd1d7d8f | LT cycle-1 distribution gaps: household microstructure and joint priors beyond comparable-country fallbacks | synth-distributions-researcher | done | 2026-05-19 18:43:41 CEST | Completed LT distribution-gap research and wrote findings/extraction/latest handoffs under /home/synthestat/workspace/manager_handoffs/distributions/. Found actionable Lithuania-specific Eurostat Census/LFS/SILC/SES/fertility sources for household microstructure, occupation-origin-workplace, educati |
| t_faad3b28 | LT cycle-1 source gaps: live national marginals, hidden overlays, building/work-school evidence | synth-marginals-researcher | done | 2026-05-19 18:43:33 CEST | Completed LT marginal-source research freeze for cycle-1 gaps. Wrote official source handoffs covering RC population-register marginals, VDA census/geospatial/housing packages, RC/Datalab NTR building evidence, VMI/Sodra/LFS/EU-SILC income-employment sources, and Migration/RPPC overlays; created dow |
| t_c2ade41e | LT population synthesis QA cycle 1 — review bundle and verdict | synth-reviewer | done | 2026-05-19 18:35:23 CEST | Reviewed LT cycle-1 population bundle and wrote reviewer memo at /home/synthestat/workspace/manager_handoffs/reviewer/2026-05-19_164321Z_LT_population_review_cycle1.md. Verdict NEEDS_MORE_SOURCES: bundle is contract-complete with exact HARD/FIRM fit, but source/distribution gaps block PASS beyond th |
| t_bef90222 | LT population synthesis QA cycle 1 — build review bundle | synth-modeler | done | 2026-05-19 18:35:09 CEST | Built and verified the LT cycle-1 population review bundle at /home/synthestat/output/runs/LT/lt_population_review_cycle1_813e5acd_seed420987. The bundle is contract-complete for internal review, HARD residuals pass exactly, and hidden-population/work-school layers are explicit unavailable markers r |
| t_2bb29733 | orchestrate LT population synthesis QA loop | synth-manager | done | 2026-05-19 18:34:10 CEST | Routed LT population synthesis QA cycle 1: created synth-modeler bundle-build task t_bef90222 and dependent synth-reviewer bundle-review task t_c2ade41e. Updated /home/synthestat/workspace/manager_handoffs/manager_updates.md with bottleneck, routing decisions, blockers, and next branch instructions. |
Process
synth-manager creates and controls the country loop.
synth-modeler generates the review bundle: people, households, dwellings/buildings or unavailable markers, overlays, assignments, manifests, residuals, diagnostics, uncertainty, provenance.
synth-reviewer audits constraints, marginals, household/family realism, hidden populations, dwelling/building grounding, work/school assignment, uncertainty, provenance, and privacy.
PASS finalizes the cycle; NEEDS_MODEL_FIX routes back to synth-modeler; NEEDS_MORE_SOURCES routes to the marginals and distributions researchers and then to synth-downloader; exhausted evidence or a model plateau stops the loop for a human decision.
Quality gates and stop conditions
- PASS: satisfactory for declared country evidence tier and internal review mode.
- NEEDS_MODEL_FIX: an issue in model logic, the bundle, uncertainty, or household/dwelling/assignment handling.
- NEEDS_MORE_SOURCES: missing marginal or joint/conditional evidence; routes to the researchers, then the downloader.
- EVIDENCE_EXHAUSTED_HUMAN_REVIEW: source search cannot responsibly improve the output.
- MODEL_IMPROVEMENT_EXHAUSTED_HUMAN_REVIEW: the modeler cannot materially improve the output, or diagnostics have plateaued.
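The verdict routing described in the Process section and the gates above can be sketched as a small lookup. This is an illustrative sketch under stated assumptions: `Verdict`, `ROUTES`, and `next_step` are hypothetical names, the action labels are invented for the example, and the assignee names are taken from the task table rather than from any documented API.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Verdict(Enum):
    PASS = "PASS"
    NEEDS_MODEL_FIX = "NEEDS_MODEL_FIX"
    NEEDS_MORE_SOURCES = "NEEDS_MORE_SOURCES"
    EVIDENCE_EXHAUSTED_HUMAN_REVIEW = "EVIDENCE_EXHAUSTED_HUMAN_REVIEW"
    MODEL_IMPROVEMENT_EXHAUSTED_HUMAN_REVIEW = "MODEL_IMPROVEMENT_EXHAUSTED_HUMAN_REVIEW"


# Ordered assignees for verdicts that keep the loop running.
# Researchers come first, then the downloader, as the Process section states.
ROUTES: Dict[Verdict, List[str]] = {
    Verdict.NEEDS_MODEL_FIX: ["synth-modeler"],
    Verdict.NEEDS_MORE_SOURCES: [
        "synth-marginals-researcher",
        "synth-distributions-researcher",
        "synth-downloader",
    ],
}


def next_step(verdict: Verdict) -> Tuple[str, List[str]]:
    """Map a reviewer verdict to an action label and the ordered assignees.

    Action labels ("finalize", "route", "stop_for_human") are hypothetical.
    """
    if verdict is Verdict.PASS:
        return "finalize", []
    if verdict in ROUTES:
        return "route", list(ROUTES[verdict])
    # Both *_EXHAUSTED_HUMAN_REVIEW verdicts stop the loop for a human decision.
    return "stop_for_human", []
```

For example, `next_step(Verdict("NEEDS_MODEL_FIX"))` yields `("route", ["synth-modeler"])`, the path the cycle-3 review took before the cycle-4 model fix.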