EE population QA reviewer: inspect national-scale non-fixture rerun

done synth-reviewer

Task metadata

id: t_8f902059
title: EE population QA reviewer: inspect national-scale non-fixture rerun
assignee: synth-reviewer
status: done
tenant: synthestat
priority: 55
workspace_kind: dir
workspace_path: /home/synthestat
created_by: user
created_at: 2026-05-19 20:17:33 CEST
started_at: 2026-05-19 20:48:03 CEST
completed_at: 2026-05-19 20:52:29 CEST

Latest summary

EE review complete: the rerun is a real national-scale private-household artifact (1,317,666 persons; 561,655 households), not the 8-person fixture, and it matches the declared private-household HARD counts. Verdict is NEEDS_MODEL_FIX because household assignment is structurally invalid (174,570 child-without-adult households, 132,650 single-person child households, 174,571 minor reference persons) and the household diagnostics misreport the generated large-household sizes. Created modeler follow-up t_1bf7ff30.

Body

Country: EE — Estonia
Parent modeler task: t_4eb48eb2
Mission: review national-scale population artifact or blocker.

Verify:
- actual persons.parquet and households.parquet row counts, not board/card claims;
- official target population/households and diff;
- whether artifact is fixture/sample/national;
- HARD/FIRM/SOFT constraint status;
- provenance, uncertainty, degradation flags;
- no sensitive/hidden/de facto overlay misuse;
- tests and run bundle validity.

Verdict must be one of: PASS_FOR_INTERNAL_REVIEW, NEEDS_MODEL_FIX, NEEDS_SOURCE_FIX, FAIL.
If it is still a fixture, say so loudly and create/require follow-up routing rather than final delivery.

Parents

[
  "t_4eb48eb2"
]

Children

[
  "t_1bf7ff30",
  "t_e0e72e3c"
]

Runs

ID · Profile · Status · Outcome · Started · Ended · Summary/error
134 · synth-reviewer · done · completed · 2026-05-19 20:48:03 CEST · 2026-05-19 20:52:29 CEST · EE review complete: the rerun is a real national-scale private-household artifact (1,317,666 persons; 561,655 households), not the 8-person fixture, and it matches declared private-household HARD counts. Verdict is NEEDS_MODEL_FIX because household assignment is structurally invalid (174,570 child-without-adult households, 132,650 single-child households, 174,571 minor reference persons) and household diagnostics misreport generated large-household sizes. Created modeler follow-up t_1bf7ff30.

Events

Time · Kind · Payload
2026-05-19 20:17:33 CEST · created · { "assignee": "synth-reviewer", "status": "todo", "parents": [ "t_4eb48eb2" ], "tenant": "synthestat", "skills": [ "synthestat-infra05-validation-framework" ] }
2026-05-19 20:47:07 CEST · promoted · null
2026-05-19 20:48:03 CEST · claimed · { "lock": "vmi3188806:1590352", "expires": 1779217383, "run_id": 134 }
2026-05-19 20:48:03 CEST · spawned · { "pid": 1697883 }
2026-05-19 20:52:18 CEST · commented · { "author": "synth-reviewer", "len": 5512 }
2026-05-19 20:52:29 CEST · completed · { "result_len": 0, "summary": "EE review complete: the rerun is a real national-scale private-household artifact (1,317,666 persons; 561,655 households), not the 8-person fixture, and it matches declared private-household HARD counts. Verdict is NEEDS_MODEL_FIX because household assignment is structurally invalid (174,570 child-without-adult households, 132,650 single-child households, 174,571 minor reference persons) and house", "verified_cards": [ "t_1bf7ff30" ] }

Comments

2026-05-19 20:52:18 CEST · synth-reviewer
QA report for EE national-scale non-fixture rerun

country: EE
run_id: ee_population_private_household_national_2021_seed420987
artifact_reviewed: /home/synthestat/output/runs/EE/ee_population_private_household_national_2021_seed420987
verdict: NEEDS_MODEL_FIX
confidence_in_verdict: high
summary: The artifact is a national-scale private-household bundle, not the old 8-person fixture: actual parquet counts are persons=1,317,666, households=561,655, dwellings=561,655. It matches the declared 2021 Statistics Estonia private-household HARD totals exactly and documents the 14,158 total-population residual as unavailable/non-private rather than fabricating hidden persons. However, it fails household-family realism badly because age groups are streamed into household slots: 174,570 households contain children and no adult; 132,650 are single-person child households; 174,571 child records are household reference_person. Verdict is NEEDS_MODEL_FIX, not PASS_FOR_INTERNAL_REVIEW.

constraint_fit:
  hard: PASS for declared private-household counts only. Independent raw freeze check: RL21707 households=561,655; RL21703 private-household members=1,317,666; RL21001 total population=1,331,824, residual=14,158. Parquet row counts match declared private-household persons/households. Age-sex generated counts match distribution_diagnostics totals.
  firm: No FIRM constraints declared.
  soft: No SOFT constraints declared.
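The HARD comparison above reduces to three subtractions. A minimal sketch using only the totals quoted in this report; the function name and the plain-integer interface are assumptions for illustration (the review itself read the row counts from the parquet files):

```python
# Targets quoted in this report from the frozen Statistics Estonia tables.
TARGET_HOUSEHOLDS = 561_655         # RL21707 private households
TARGET_PRIVATE_PERSONS = 1_317_666  # RL21703 private-household members
TOTAL_POPULATION = 1_331_824        # RL21001 total population


def hard_count_check(persons_rows: int, households_rows: int) -> dict:
    """Diff observed row counts against the declared private-household targets.

    hard_count_check is a hypothetical helper, not part of the reviewed bundle.
    """
    return {
        "persons_diff": persons_rows - TARGET_PRIVATE_PERSONS,
        "households_diff": households_rows - TARGET_HOUSEHOLDS,
        # Persons in the total population but outside private households.
        "non_private_residual": TOTAL_POPULATION - TARGET_PRIVATE_PERSONS,
    }


# Row counts as reported for persons.parquet and households.parquet.
result = hard_count_check(persons_rows=1_317_666, households_rows=561_655)
print(result)
```

Both diffs are zero for the reported row counts, and the residual reproduces the 14,158 figure.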
household_family_checks: FAIL. Household-size member sums are internally consistent, but household composition is impossible for ordinary private households: 174,570 child-without-adult households, 132,650 single-person child households, 174,571 child reference persons. Root cause visible in the builder: person_rows streams age-sex blocks sequentially into household slots and assigns the first member as reference_person without adult/role safeguards.
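The three composition counts can be computed with a single pass over person records. A minimal sketch on toy in-memory data, assuming a numeric age, an adult threshold of 18, and the field names `household_id`, `age`, and `is_reference_person`; none of these is confirmed by the bundle schema, and the report does not state its child/adult cut-off:

```python
from collections import defaultdict

ADULT_AGE = 18  # assumption: the report does not state the cut-off it used


def audit_households(persons):
    """Count child-only households, single-person child households,
    and minor reference persons. Field names are hypothetical."""
    members = defaultdict(list)
    for person in persons:
        members[person["household_id"]].append(person)

    counts = {"child_without_adult": 0, "single_person_child": 0,
              "minor_reference_person": 0}
    for household in members.values():
        # A household with no adult is one where every member is a minor.
        if all(p["age"] < ADULT_AGE for p in household):
            counts["child_without_adult"] += 1
            if len(household) == 1:
                counts["single_person_child"] += 1
        counts["minor_reference_person"] += sum(
            1 for p in household
            if p["is_reference_person"] and p["age"] < ADULT_AGE
        )
    return counts


# Toy records, not taken from the artifact.
toy_persons = [
    {"household_id": 1, "age": 40, "is_reference_person": True},
    {"household_id": 1, "age": 7, "is_reference_person": False},
    {"household_id": 2, "age": 5, "is_reference_person": True},
    {"household_id": 3, "age": 3, "is_reference_person": True},
    {"household_id": 3, "age": 9, "is_reference_person": False},
]
audit = audit_households(toy_persons)
print(audit)
```

On the toy data, households 2 and 3 have no adult, household 2 is a single-person child household, and both have a minor reference person.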
dwelling_building_checks: DEGRADED/NOT PASS. One synthetic dwelling per private household is internally linked; real building assignment is explicitly unavailable pending Maa-amet reconciliation. This is acceptable as an explicit unavailable layer for a scoped review bundle, but not as real-house grounding.
hidden_population_checks: PASS for honesty of scope, not completeness. The 14,158 total-vs-private residual is documented as aggregate unavailable/non-private and not silently injected or labelled as hidden persons.
work_school_assignment_checks: UNAVAILABLE, honestly declared. No work/school assignments are claimed.
distribution_checks: MIXED. Age-sex counts are exact at the declared private-household aggregate level. Socioeconomic attributes are unassigned/null and explicitly marked unavailable/modelled. Large-household diagnostics are inaccurate: the raw RL21707 open class 6-10 has households=17,764 and members=114,596; the actual generated split is size 6=9,752 and size 7=8,012, but household_diagnostics.json lists 6=16,966 and 7=798, so the diagnostics do not match the parquet.
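The large-household mismatch can be confirmed by arithmetic on the quoted figures alone. A short sketch, in which `open_class_totals` is a hypothetical helper: each split is reduced to its household and member totals and compared with the RL21707 open-class control.

```python
def open_class_totals(split):
    """Return (households, members) for a {household_size: count} split."""
    households = sum(split.values())
    members = sum(size * count for size, count in split.items())
    return households, members


# Figures quoted in the distribution check.
source_control = (17_764, 114_596)        # RL21707 open class: households, members
generated_split = {6: 9_752, 7: 8_012}    # measured from the parquet
diagnostics_split = {6: 16_966, 7: 798}   # listed in household_diagnostics.json

generated_totals = open_class_totals(generated_split)
diagnostics_totals = open_class_totals(diagnostics_split)

print(generated_totals == source_control)    # True
print(diagnostics_totals == source_control)  # False
print(diagnostics_totals)                    # (17764, 107382)
```

The generated split reproduces both source totals. The split listed in the diagnostics file matches the household count but implies 107,382 members, 7,214 short of the 114,596 control, so it cannot describe the generated data.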
geography_checks: Scoped to country EE only. geography_quality_tiers.json labels country B and buildings C; no subnational geography is synthesized. This is explicit, not a silent dropped-zone issue.
uncertainty_provenance_checks: Mostly adequate for scoped candidate. Required review files are present (16 files). Provenance includes frozen Statistics Estonia and Maa-amet records with paths/checksums/timestamps. Uncertainty summary notes midpoint ages, unassigned socioeconomic attributes, missing collective/hidden/work-school/building layers. Gap: household-family structural uncertainty/invalidity is not surfaced as a failed diagnostic.
privacy_release_checks: Low immediate sensitivity because no fine geography, real building IDs, occupation, origin, nationality, or work/school assignments are present. Still, the synthetic persons are 1:1 national private-household rows and should not be described as anonymous by default.
critical_failures:
  - Ordinary private-household composition is structurally invalid: 174,570 child-without-adult households and 132,650 single-person child households.
  - 174,571 minor records are household reference_person in ordinary private households.
  - household_diagnostics.json mismatches actual generated size distribution for large households and does not report the child-alone failure.
model_fix_requests:
  - Rebuild person-to-household assignment so household roles and adult/child composition are plausible under private-household rules while preserving HARD private-household totals.
  - Add validation metrics for minor reference persons, child-alone households, household type/member composition coherence, and role/age plausibility.
  - Correct household_diagnostics.json to distinguish measured open-class source controls from generated deterministic splits.
  - Keep explicit unavailable statuses for real building assignment, hidden/non-private overlay, and work/school assignment until validated.
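One way to read the first request is to seed every household with an adult reference person before filling the remaining slots. The sketch below is an illustration of that idea only, not the project's builder and not a complete fix: `assign_adult_first`, the age threshold of 18, and the use of bare ages as person records are all assumptions, and it does nothing for household-type or role plausibility beyond the first member.

```python
ADULT_AGE = 18  # assumption: cut-off for an eligible reference person


def assign_adult_first(household_sizes, ages):
    """Place one adult first in each household, then fill the remaining slots.

    Preserves the total person count and every household size.
    Hypothetical helper; the first member of each result is the reference person.
    """
    if any(size < 1 for size in household_sizes):
        raise ValueError("every household needs at least one member")
    if sum(household_sizes) != len(ages):
        raise ValueError("household sizes must sum to the number of persons")

    adults = [a for a in ages if a >= ADULT_AGE]
    children = [a for a in ages if a < ADULT_AGE]
    n = len(household_sizes)
    if len(adults) < n:
        raise ValueError("fewer adults than households")

    # Adults not used as seeds, followed by all children, fill the other slots.
    remaining = iter(adults[n:] + children)
    households = []
    for seed, size in zip(adults[:n], household_sizes):
        households.append([seed] + [next(remaining) for _ in range(size - 1)])
    return households


# Toy input, not taken from the artifact.
toy_sizes = [1, 2, 3]
toy_ages = [5, 40, 7, 35, 60, 3]
households = assign_adult_first(toy_sizes, toy_ages)
print(households)  # [[40], [35, 5], [60, 7, 3]]
```

Because every person is placed exactly once and sizes are unchanged, the HARD person and household totals are preserved while no household starts with a minor.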
source_gap_requests:
  marginals: []
  distributions: []
stopping_condition_assessment: Not evidence exhausted and not model-improvement exhausted. The failure is model logic/diagnostics, so route to synth-modeler.
recommended_next_cards:
  - assignee: synth-modeler
    title: EE model fix: repair invalid child-alone household assignment in national private-household bundle
    reason: Required before any internal-review pass; created as t_1bf7ff30.
    depends_on: t_8f902059

Evidence commands run: pyarrow parquet row/schema checks; independent frozen CSV target extraction; household composition audit; builder source inspection.