2026-05-19 21:10:12 CEST · synth-reviewer
QA report — SE population QA cycle 2 reviewer
country: SE
run_id: se_population_review_cycle2_3a9d999a_seed420987
artifact_reviewed: /home/synthestat/output/runs/SE/se_population_review_cycle2_3a9d999a_seed420987
verdict: NEEDS_MODEL_FIX
confidence_in_verdict: high
summary: Cycle 2 is a material improvement over cycle 1: it is contract-complete and has moved from an 8-person/two-test-zone fixture to a 91,030-person, 43,739-household, 42-DeSO stratified slice whose HARD residuals against SCB age-sex targets are exactly zero in the selected zones. It still cannot PASS because household/family construction is structurally invalid, FIRM/SOFT residuals are not exercised despite newly frozen source data, modelled attributes lack attribute-specific uncertainty bounds, and the unavailable building/hidden/work-school layers, although correctly declared as unavailable, remain unresolved.
constraint_fit:
hard: PASS for declared selected-zone age-sex scope. constraint_residuals.json reports hard_constraint_status=pass_exact, 84/84 HARD residual rows pass, selected-zone official target population 91,030 equals synthetic person count 91,030, residual 0. Independent parquet count check found 91,030 persons and 42 selected DeSO zones in diagnostics.
firm: NOT ADEQUATELY EXERCISED. build_manifest lists national household-size prior and modelled/fallback metadata, but constraint_residuals.json contains only HARD rows. Frozen cycle-2 sources include DeSO household type, education, labour, housing/tenure, income, etc.; the bundle does not report FIRM residuals/tolerances for these, so fit cannot be accepted beyond age-sex.
soft: NOT ADEQUATELY EXERCISED. Household/family realism, occupation/industry, origin/categorical attributes, and dwelling shells are labelled modelled/fallback, but no SOFT residual summaries or numeric tolerance checks are present.
household_family_checks: FAIL. Deterministic parquet check found 11,526 households in which every member is a child under 18; for example, household 240 contains only three age-2 persons, all labelled child. Household type counts are implausible and internally inconsistent: only 10 households are HH_COUPLE_CHILDREN, while 23,973 persons are labelled child and 6,659 households are HH_SINGLE_PARENT; child allocation appears not to be linked to adult guardian/couple shells. Household sizes match reported member counts, but member composition violates the rule against impossible child-only households and is incoherent with the assigned household types.
dwelling_building_checks: PARTIAL/UNAVAILABLE. synthetic_dwellings.parquet is present with 43,739 shell dwellings and household backlinks now consistent; building_id is null for all dwellings and synthetic_building_assignments.unavailable.json honestly states no official residential building/address/dwelling anchor is frozen. This is acceptable as explicit unavailability for a slice, but it remains non-PASS for real-house grounding.
hidden_population_checks: UNAVAILABLE BUT HONEST. hidden_population_overlays.unavailable.json states hidden overlays are unavailable and not folded into de jure private households. This preserves HARD de jure constraints, but homelessness/irregular/seasonal/student/institutional/refugee overlays remain unresolved and should not be represented as covered.
work_school_assignment_checks: UNAVAILABLE BUT HONEST. work_school_assignments.unavailable.json and assignment_diagnostics.json state OD commuters are future-prior provenance only and no individual work/school/facility assignments are emitted. This avoids hallucinated assignments, but the layer is not reviewable.
distribution_checks: FAIL FOR CURRENT MODEL RUN. distribution_diagnostics confirms that the only distributional input used is the national household-size prior; fine attributes are modelled. Source acquisition produced stronger SCB tables, including HushallDesoTyp, but the model did not convert them into reported FIRM/SOFT residuals or coherent household-family generation. Occupation/industry are fallback_1digit/not_applicable and correctly not measured.
geography_checks: MIXED. The bundle clearly declares stratified_multi_DeSO_not_full_national and 42 selected zones, which fixes the cycle-1 toy-scope issue. However geography_quality_tiers reports degraded_zone_count=0 and every zone quality_tier=B/degraded=false even though every zone lacks building assignment and non-age-sex attributes are modelled. That is less severe than cycle-1 A-tier overclaiming, but still overstates zone quality; zones with unavailable buildings/assignments and national-prior household construction should be degraded or explicitly tier-C for those layers.
uncertainty_provenance_checks: FAIL/PARTIAL. source_provenance has 20 frozen records with source IDs, retrieval timestamps, table IDs, geography levels, reference periods, checksums, source systems, and license_access_notes. Per-row provenance/fallback columns exist. However, modelled attributes do not have attribute-specific numeric uncertainty bounds: synthetic_persons has uncertainty_low=uncertainty_high=1.0 for all rows even though many columns are modelled, and uncertainty_summary is qualitative, describing categorical uncertainty as wide without exposing bounds. Source records use evidence_tier/quality_flags but lack the normalized candidate_use/quality_flag fields expected by the task wording.
privacy_release_checks: Fine-geography DeSO slice with unique household/person records and modelled sensitive attributes has material re-identification and misinterpretation risk. Internal review only; do not treat as anonymous or production-release safe.
critical_failures:
- Household/family graph is structurally invalid: 11,526 child-only households with no adult/guardian, including age-2 child-only households.
- Household type/member composition is incoherent: almost no HH_COUPLE_CHILDREN households despite 23,973 child-labelled persons and thousands of single-parent labels.
- FIRM/SOFT residuals are not reported for available newly frozen SCB household/person attribute sources; only age-sex HARD residuals are exercised.
- Modelled attributes lack attribute-specific uncertainty bounds; constant person-level uncertainty_low/high=1.0 is misleading for modelled origin/education/labour/income/occupation/industry fields.
- Geography quality still overclaims: degraded_zone_count=0 and zone degraded=false despite unavailable buildings/assignments and national-prior household construction for every selected zone.
model_fix_requests:
- Rebuild household generation so children are placed only with adult guardians/parents or explicitly sourced institutional/exceptional placements; enforce household_type/member-role coherence and parent/guardian age-gap checks.
- Consume frozen HushallDesoTyp and other SCB cycle-2 tables as FIRM/SOFT constraints where appropriate, and emit residual rows with tolerances/reasons for household type, education/labour/housing/income/origin where claimed.
- Replace constant per-person uncertainty_low/high=1.0 with attribute-level uncertainty/status fields or a diagnostics table covering each modelled attribute/zone; do not imply exact certainty for modelled categorical assignments.
- Mark zones/layers degraded honestly: at minimum layer-specific tier C for building/work-school/hidden and household-family layers when only national priors or unavailable anchors are used.
- Keep unavailable building/hidden/work-school artifacts explicit unless approved source anchors exist; do not silently infer them.
source_gap_requests:
marginals:
- Residential building/address/dwelling anchors remain blocked on Lantmäteriet/contract credentials or a human-approved proxy/scaffold decision.
- Hidden-population overlays remain evidence-exhausted/partial for DeSO-ready homelessness, undocumented, seasonal, institutional, student-dormitory and refugee/Ukrainian/Syrian resident-stock semantics.
- Current workplace/school destination evidence remains insufficient for individual assignment; OD source is stale/GUIDE for assignments.
distributions:
- The claimed Sweden household-composition prior bundle remains absent and should be rebuilt/mirrored if the model needs richer parent-child/couple/guardian priors beyond SCB household tables.
stopping_condition_assessment: Do not PASS. Do not use NEEDS_MORE_SOURCES as the primary verdict: cycle 2 source acquisition materially improved the evidence base, and the most immediate blockers are model logic, diagnostic, and uncertainty failures that can be fixed using already-frozen inputs. Findings are not merely a repeat of cycle 1: the toy-scope and source-freeze issues improved, but household-family and residual-reporting failures now block a PASS. Human review may be needed for licensed building/proxy and hidden-overlay scope decisions, but model improvement is not exhausted because concrete fixes exist.
recommended_next_cards:
- assignee: synth-modeler
title: SE cycle-2 household-family/residual/uncertainty model fix rerun
reason: Fix child-only households, household-type coherence, FIRM/SOFT residual reporting, attribute uncertainty, and layer-specific degradation metadata using frozen SCB sources.
depends_on: t_66bdf062
- assignee: synth-manager
title: SE building-anchor and hidden-overlay scope decision
reason: Decide whether to pause production building/hidden layers, obtain licensed Lantmäteriet/official sources, or permit labelled proxy/scaffold anchors for internal review only.
depends_on: t_66bdf062
Checks run:
- Read current task t_66bdf062 and parent modeler handoff t_29a0c9c4 via kanban_show.
- Read prior cycle-1 reviewer t_33ff07f7 and source/downloader handoff t_1bbf9f63 via kanban_show.
- Read /home/synthestat/workspace/manager_handoffs/SE_other_synthesis_ingest.md.
- Read /home/synthestat/docs/contracts/population_review_bundle.md, /home/synthestat/docs/SOUL.md, and /home/synthestat/docs/specs/research_knowledge_base.md.
- Read model_notes.md and downloader freeze handoff.
- Listed bundle files and validated required contract artifacts/unavailable artifacts exist.
- Parsed all JSON diagnostics/provenance with Python json.
- Read parquet files with pyarrow and checked row counts, schemas, IDs, quality/evidence/fallback fields, person-household joins, household-dwelling joins, age/sex counts, and household age composition.
- Parsed source_provenance fields/timestamps/checksums and unavailable artifact reasons.
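Illustrative sketch of the household age-composition check listed above. This is a minimal stdlib-only example, not the reviewer's actual script: the column names household_id and age, the helper find_child_only_households, and the toy rows are assumptions. In practice the rows would be loaded from the persons parquet table (for example with pyarrow's Table.to_pylist()).

```python
from collections import defaultdict

ADULT_AGE = 18  # the report treats persons under 18 as children


def find_child_only_households(person_rows, adult_age=ADULT_AGE):
    """Return sorted household IDs whose members are all under adult_age.

    person_rows: iterable of dicts with (assumed) keys 'household_id' and 'age'.
    """
    ages_by_household = defaultdict(list)
    for row in person_rows:
        ages_by_household[row["household_id"]].append(row["age"])
    # A household is child-only when even its oldest member is a child.
    return sorted(
        hh for hh, ages in ages_by_household.items() if max(ages) < adult_age
    )


# Toy rows (hypothetical). Household 240 mirrors the sample cited in
# household_family_checks: three age-2 persons and no adult.
toy_persons = [
    {"household_id": 240, "age": 2},
    {"household_id": 240, "age": 2},
    {"household_id": 240, "age": 2},
    {"household_id": 241, "age": 34},
    {"household_id": 241, "age": 5},
]

child_only = find_child_only_households(toy_persons)
print(child_only)  # [240]
```

Grouping by household and testing the maximum age keeps the check deterministic and independent of the role labels, so it still catches invalid households when the child/adult labels themselves are wrong.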
Cycle comparison to t_33ff07f7: Materially improved, not a flat repeat. Cycle 1 was an 8-person/two-zone fixture with seeded buildings and inconsistent A-tier geography. Cycle 2 is a non-toy 42-DeSO/91,030-person slice with frozen SCB provenance and exact selected-zone age-sex HARD controls. The old building/hidden/work-school source gaps remain explicit, but the decisive new blocker is model/build quality: invalid household-family construction, absent FIRM/SOFT residual reporting despite new sources, inadequate modelled-attribute uncertainty, and still-overoptimistic geography/layer quality metadata.
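Illustrative sketch of what the requested FIRM/SOFT residual rows with tolerances might look like. This is only one possible shape under stated assumptions: the function residual_row, all field names, the 5% relative tolerance, and the FIRM counts are hypothetical and are not taken from constraint_residuals.json or any SCB table. Only the HARD population total of 91,030 comes from this report.

```python
def residual_row(tier, attribute, category, target, synthetic, rel_tolerance):
    """Build one residual row.

    status is 'pass' when |synthetic - target| <= rel_tolerance * target.
    A rel_tolerance of 0 therefore demands an exact match (HARD behaviour).
    """
    residual = synthetic - target
    allowed = rel_tolerance * target
    return {
        "tier": tier,
        "attribute": attribute,
        "category": category,
        "target": target,
        "synthetic": synthetic,
        "residual": residual,
        "rel_tolerance": rel_tolerance,
        "status": "pass" if abs(residual) <= allowed else "fail",
    }


rows = [
    # HARD: total from the report, exact match required.
    residual_row("HARD", "population", "selected_zones_total", 91030, 91030, 0.0),
    # FIRM: hypothetical counts and tolerance, for illustration only.
    residual_row("FIRM", "household_type", "HH_SINGLE_PARENT", 100, 130, 0.05),
    residual_row("FIRM", "household_type", "HH_COUPLE_CHILDREN", 200, 204, 0.05),
]

for row in rows:
    print(row["tier"], row["category"], row["residual"], row["status"])
```

Carrying the tolerance in every row lets a reviewer see why a row passed or failed without consulting a separate configuration file.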