SE cycle 2 review after other_synthesis-informed rerun

Task metadata

id: t_f8b6f296
title: SE cycle 2 review after other_synthesis-informed rerun
assignee: synth-reviewer
status: done
tenant: synthestat
priority: 105
workspace_kind: dir
workspace_path: /home/synthestat
created_by: synth-manager
created_at: 2026-05-19 20:00:09 CEST
started_at: 2026-05-19 20:51:05 CEST
completed_at: 2026-05-19 20:55:17 CEST

Latest summary

Reviewed SE cycle-2 bundle and returned BLOCKED_INVALID_OUTPUT. It is non-toy and contract-complete, but invalid: claimed SCB HARD age-sex residuals fail independent source comparison, household formation creates 11,526 children-only households, zone degradation flags contradict unavailable building anchors, and modelled categorical uncertainty is effectively zero-width. Created synth-modeler remediation card t_dd8de9c4.

Body

Country: SE. Review the SE cycle-2 non-toy bundle from t_82562c50 under docs/contracts/population_review_bundle.md and Synthestat SOUL guardrails. Return an explicit verdict. Verify the bundle is not the old 8-person fixture unless explicitly scoped as such; check official SCB target comparisons, DeSO/fallback provenance, uncertainty, source/retrieval metadata, unavailable hidden layers, household/dwelling/building/work/school diagnostics, and degraded-zone flags.

Parents

[
  "t_82562c50"
]

Children

[
  "t_dd8de9c4"
]

Runs

ID | Profile | Status | Outcome | Started | Ended | Summary/error
136 | synth-reviewer | done | completed | 2026-05-19 20:51:05 CEST | 2026-05-19 20:55:17 CEST | Reviewed SE cycle-2 bundle and returned BLOCKED_INVALID_OUTPUT. It is non-toy and contract-complete, but invalid: claimed SCB HARD age-sex residuals fail independent source comparison, household formation creates 11,526 children-only households, zone degradation flags contradict unavailable building anchors, and modelled categorical uncertainty is effectively zero-width. Created synth-modeler remediation card t_dd8de9c4.

Events

Time | Kind | Payload
2026-05-19 20:00:09 CEST | created | { "assignee": "synth-reviewer", "status": "todo", "parents": [ "t_82562c50" ], "tenant": "synthestat", "skills": null }
2026-05-19 20:50:25 CEST | promoted | null
2026-05-19 20:51:05 CEST | claimed | { "lock": "vmi3188806:1590352", "expires": 1779217565, "run_id": 136 }
2026-05-19 20:51:05 CEST | spawned | { "pid": 1698770 }
2026-05-19 20:55:09 CEST | commented | { "author": "synth-reviewer", "len": 7759 }
2026-05-19 20:55:17 CEST | completed | { "result_len": 0, "summary": "Reviewed SE cycle-2 bundle and returned BLOCKED_INVALID_OUTPUT. It is non-toy and contract-complete, but invalid: claimed SCB HARD age-sex residuals fail independent source comparison, household formation creates 11,526 children-only households, zone degradation flags contradict unavailable building anchors, and modelled categorical uncertainty is effectively zero-width. Created synth-modeler reme", "verified_cards": [ "t_dd8de9c4" ] }

Comments

2026-05-19 20:55:09 CEST · synth-reviewer
QA report

country: SE
run_id: se_population_review_cycle2_3a9d999a_seed420987
artifact_reviewed: /home/synthestat/output/runs/SE/se_population_review_cycle2_3a9d999a_seed420987
verdict: BLOCKED_INVALID_OUTPUT
confidence_in_verdict: high
summary: The bundle is non-toy (91,030 persons / 43,739 households / 42 DeSO zones) and includes the required contract files plus explicit unavailable files for building assignments, hidden overlays, and work/school assignments. However, it is not valid for PASS or for iterative research routing, because core diagnostics are inconsistent both with the emitted rows and with the frozen SCB source: the claimed HARD SCB 2025 DeSO age-sex fit is not demonstrated and fails independent comparison, and household formation contains thousands of impossible children-only private households.
constraint_fit:
  hard: FAIL. constraint_residuals.json reports that all 84 zone-sex HARD rows (42 zones x 2 sexes) pass, but the claimed fit is at SCB 2025 DeSO age-sex grain, which zone-sex rows do not test. An independent check against the frozen processed source /home/synthestat/data/ingestion/SE/source_freeze/20260519T180750Z/processed/population_age_sex_deso.csv found that emitted persons use only 16 midpoint ages [2,7,...,77], while the source has 17 age groups, including 80-. All 2,772 zone-sex-age key comparisons mismatched; even the 84 zone-sex totals taken from the source total rows mismatched the emitted counts. hard_constraint_broken_rows is therefore not credible.
  firm: FAIL/UNVERIFIED. Household and dwelling fields are largely modelled, but no firm residual table is supplied beyond source-informed notes. Household size prior is used, but household realism violations are not quantified in diagnostics.
  soft: FAIL/UNVERIFIED. Modelled attributes are flagged, but residual/tolerance evidence is absent for the emitted categorical distributions.
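
The independent comparison described under hard can be sketched as follows. This is only an illustration under stated assumptions, not the reviewer's actual script: the row shapes, the five-year band labels, and the zero-tolerance reading of HARD are all hypothetical, and only the open-ended "80-" group is taken from the report.

```python
from collections import Counter

# Assumption: 16 five-year bands (0-4 ... 75-79) plus one open-ended top band,
# giving the 17 age groups the report mentions. Labels other than "80-" are hypothetical.
TOP_BAND_START = 80


def age_to_band(age: int) -> str:
    """Map a single-year (or midpoint) age to a five-year band label."""
    if age >= TOP_BAND_START:
        return f"{TOP_BAND_START}-"
    low = (age // 5) * 5
    return f"{low}-{low + 4}"


def band_residuals(persons, targets):
    """Return {(zone, sex, band): emitted - target} over the union of keys.

    persons: iterable of (zone, sex, age) tuples (hypothetical row shape).
    targets: dict {(zone, sex, band): count} read from the frozen source.
    """
    emitted = Counter((zone, sex, age_to_band(age)) for zone, sex, age in persons)
    keys = set(emitted) | set(targets)
    return {k: emitted.get(k, 0) - targets.get(k, 0) for k in sorted(keys)}


def broken_rows(residuals):
    """Assumption: a HARD constraint allows no deviation, so any non-zero residual is broken."""
    return {k: r for k, r in residuals.items() if r != 0}


# Toy data: the emitted rows never reach the 80- band, so that target is missed.
persons = [("Z1", "F", 2), ("Z1", "F", 2), ("Z1", "F", 77)]
targets = {("Z1", "F", "0-4"): 2, ("Z1", "F", "75-79"): 1, ("Z1", "F", "80-"): 1}
broken = broken_rows(band_residuals(persons, targets))
print(broken)
```

Taking the union of keys matters here: a band that exists only in the source (such as 80-) or only in the emitted rows still produces a residual row instead of being silently dropped.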
household_family_checks: FAIL. Row integrity is good (all persons reference emitted households; household sizes equal member counts; all households reference emitted dwellings), but private household composition is not plausible. Reviewer check found 11,526 children-only households and 4,883 one-person child households. household_diagnostics.json only reports counts and mean size; it does not surface these non-negotiable realism failures. Household type counts are suspicious: only 10 HH_COUPLE_CHILDREN versus 6,659 single-parent households despite many children.
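
A minimal sketch of a children-only household diagnostic of the kind this check calls for. The adult-age threshold and the (household_id, age) row shape are assumptions; the report does not state which threshold the reviewer used.

```python
from collections import defaultdict

ADULT_AGE = 18  # Assumption: threshold not stated in the report.


def children_only_households(persons, adult_age=ADULT_AGE):
    """Return (children_only_ids, one_person_child_ids).

    persons: iterable of (household_id, age) tuples (hypothetical row shape).
    A household is children-only when its oldest member is below adult_age.
    """
    ages = defaultdict(list)
    for household_id, age in persons:
        ages[household_id].append(age)
    children_only = sorted(hh for hh, a in ages.items() if max(a) < adult_age)
    one_person = [hh for hh in children_only if len(ages[hh]) == 1]
    return children_only, one_person


# Toy data: H1 holds two children, H2 a parent and child, H3 a child alone.
rows = [("H1", 7), ("H1", 12), ("H2", 37), ("H2", 7), ("H3", 2)]
children_only, one_person = children_only_households(rows)
print(children_only, one_person)
```

Because the emitted ages are band midpoints, the threshold choice affects the counts: a midpoint of 17 stands for a band that may include adults, so a diagnostic like this should state which bands it treats as children.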
dwelling_building_checks: FAIL for production grounding, acceptable only as explicit unavailability. synthetic_dwellings.parquet has 43,739 shell dwellings, one per household, with building_id null and status modelled_unassigned_to_official_building. The unavailable JSON correctly avoids silent real-building claims, but geography degradation metadata does not reflect this.
hidden_population_checks: Explicitly unavailable. hidden_population_overlays.unavailable.json states no hard/firm DeSO overlay evidence and lists required evidence. This is acceptable as an explicit gap for the declared non-production scope, but not a PASS for full Sweden.
work_school_assignment_checks: Explicitly unavailable. assignment_diagnostics notes municipality OD commuter source as future prior only and no individual assignments emitted. This is acceptable as an explicit gap for the declared candidate scope, not a PASS.
distribution_checks: FAIL. distribution_diagnostics is too shallow for emitted attributes: records_frozen=20 and source notes are present, but no joint-distribution residuals/diagnostics for age x education, labour x education, occupation/industry, origin, income, etc. Person rows claim broad provenance but all rows share identical provenance/fallback/uncertainty method.
geography_checks: FAIL. geography_quality_tiers.json reports selected_zone_count=42 and all zones quality_tier B, degraded=false, degraded_zone_count=0, while the same file says building assignments are unavailable for every zone. This contradicts Synthestat guardrails requiring degraded-zone flags and honest quality tiers when a whole grounding layer is absent.
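
The contradiction flagged here can be expressed as a simple consistency rule. The sketch below assumes each zone record carries a degraded flag (named in the report) and a hypothetical building_assignments_available field; the real schema of geography_quality_tiers.json is not shown in this report.

```python
def inconsistent_zones(zones):
    """Return ids of zones that lack building anchors but are not flagged degraded.

    zones: list of dicts with keys zone_id, degraded, and
    building_assignments_available (the last key name is hypothetical).
    """
    return [
        z["zone_id"]
        for z in zones
        if not z["building_assignments_available"] and not z["degraded"]
    ]


# Toy data: Z1 reproduces the reported contradiction, Z2 is flagged honestly.
zones = [
    {"zone_id": "Z1", "quality_tier": "B", "degraded": False,
     "building_assignments_available": False},
    {"zone_id": "Z2", "quality_tier": "C", "degraded": True,
     "building_assignments_available": False},
]
bad = inconsistent_zones(zones)
print(bad)
```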
uncertainty_provenance_checks: FAIL. source_provenance has 20 records with retrieval timestamps and checksums, including official SCB sources and labelled inherited/synthetic fallback sources. But uncertainty_summary and row-level uncertainty are insufficient/inconsistent: every person has uncertainty_low=1.0 and uncertainty_high=1.0 while the method says categorical attributes are modelled with wide bounds. Modelled categorical attributes therefore lack real uncertainty bounds.
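
A zero-width bound check along these lines would surface the problem automatically. This is a sketch assuming person rows are available as dicts with the uncertainty_low and uncertainty_high fields named above; the tolerance eps is an illustrative choice.

```python
def zero_width_share(rows, low_key="uncertainty_low", high_key="uncertainty_high", eps=1e-12):
    """Return the fraction of rows whose uncertainty interval has (near) zero width."""
    rows = list(rows)
    if not rows:
        return 0.0
    zero = sum(1 for r in rows if abs(r[high_key] - r[low_key]) <= eps)
    return zero / len(rows)


# Toy data: every row carries low == high, as the reviewer observed.
people = [
    {"uncertainty_low": 1.0, "uncertainty_high": 1.0},
    {"uncertainty_low": 1.0, "uncertainty_high": 1.0},
]
share = zero_width_share(people)
print(share)
```

A share close to 1.0 for attributes documented as modelled with wide bounds would indicate placeholder values rather than real uncertainty metadata.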
privacy_release_checks: Partial. No release-mode privacy analysis beyond candidate limitations; fine DeSO row-level outputs with modelled categorical attributes should carry explicit privacy/re-identification risk notes.
critical_failures:
  - Claimed HARD SCB DeSO age-sex fit is not credible and fails independent source comparison; residual diagnostics appear to use only zone-sex and do not match frozen 2025 source totals.
  - 11,526 children-only households, including 4,883 one-person child households, violate household/family realism guardrails and are not reported by diagnostics.
  - Geography degradation metadata is internally inconsistent: all zones marked B/not degraded despite unavailable building anchors in every zone.
  - Row-level uncertainty bounds for modelled categorical attributes are effectively zero-width despite method text claiming wide categorical uncertainty.
model_fix_requests:
  - Recompute and emit HARD residuals at the actual SCB TAB6574 2025 DeSO age-group x sex grain, including the 80- group, or explicitly relabel the model as age-band-midpoint only and stop claiming exact age-sex fit.
  - Rebuild household formation to prevent children-only private households unless separately classified as institutional/exceptional with evidence; add diagnostics for children-alone, parent-child gaps, adult/child composition by household type, and couple/single-parent plausibility.
  - Make geography_quality_tiers and build_manifest zones_degraded reflect missing building anchors; likely quality C or degraded=true for building-grounding dimensions until official anchors are frozen.
  - Replace dummy/zero-width uncertainty_low/high for modelled categorical person attributes with meaningful categorical uncertainty metadata or mark unavailable.
  - Expand distribution_diagnostics beyond source notes to include residuals/checks for emitted education, labour, origin, income, occupation/industry, and household-type distributions.
source_gap_requests:
  marginals:
    - Official residential building/address/dwelling anchor with DeSO/municipality linkage remains needed for real-building grounding.
    - Fine-geography household/family roster or household composition constraints remain needed.
    - Hidden population resident-stock evidence by geography/reference period remains needed if overlays are in scope.
  distributions:
    - Joint distributions for household type x age, parent/child age gaps, education x age/sex, labour x education/age, origin x age, occupation/industry x education/age remain needed if modelled attributes are to be more than prior-dominated.
stopping_condition_assessment: BLOCKED_INVALID_OUTPUT is the appropriate stopping condition because the bundle is complete/readable but internally inconsistent and violates non-negotiable HARD-target and household-realism constraints. Do not route to more source research until the modeler emits a valid diagnostics bundle.
recommended_next_cards:
  - assignee: synth-modeler
    title: SE cycle 2 invalid-output remediation: SCB target residuals, household realism, degraded-zone flags
    reason: Correct invalid residual diagnostics, impossible household formation, inconsistent zone degradation, and uncertainty metadata.
    depends_on: t_f8b6f296