データ資産化 — Provenance / Source Weight / Owner 承認フロー Data asset — Provenance / Source weight / Owner approval flow

Parky は駐車場データを 11 種類のソース(owner / admin / field / ugc / akippa / navitime / google / osm / kokudo / jichitai / import)から取り込みます。Phase 0 (2026-05-04 dev DB apply 済) で field 単位の provenance モデルを導入し、駐車場の各属性ごとに「誰が・いつ・どこから」入れた値かを記録できる構造に切り替えました。本ページは社内エンジニア向けに、provenance / source_weight / Owner 承認フロー / Raw landing zone / ハウスキーピング / バックアップを横断的にまとめます。

Parky ingests parking-lot data from 11 sources (owner / admin / field / ugc / akippa / navitime / google / osm / kokudo / jichitai / import). Phase 0 (applied to dev on 2026-05-04) introduced field-level provenance: every attribute is recorded with who supplied it, when, and from which source. This page is an internal engineering reference covering provenance, source_weight, the owner approval flow, the raw landing zone, housekeeping, and backups.

関連プラン文書: Related design docs: 実装プランは infra/supabase/migrations/2026-05-04_parking_data_asset_phase0_README.md と Phase 0c/0d/0e/0g の各 SQL ヘッダコメントを参照。 バックアップ運用は ops/db-backup-runbook See infra/supabase/migrations/2026-05-04_parking_data_asset_phase0_README.md and the SQL header comments for Phase 0c/0d/0e/0g. Backup ops live in ops/db-backup-runbook.

1. 全体構造1. Architecture overview

Phase 0 では parking_lots テーブルを VIEW にリネーム し、その背後に canonical identity テーブル parking_lots_master + provenance ログ parking_field_values を置きました。INSTEAD OF triggers で透過動作させているため、API 層の SQL (FROM public.parking_lots) は無改修で動きます。

Phase 0 turned parking_lots into a VIEW backed by a canonical identity table (parking_lots_master) and a provenance log (parking_field_values). INSTEAD OF triggers make this transparent — existing API SQL (FROM public.parking_lots) keeps working unchanged.

flowchart LR
  subgraph CLIENT["Client / API"]
    A[FROM public.parking_lots]
  end
  subgraph VIEW["VIEW layer (透過)"]
    V["parking_lots (VIEW)
INSTEAD OF triggers"] end subgraph CORE["Core tables"] M["parking_lots_master
canonical identity
(id / code / status /
location / shape_type)"] F["parking_field_values
provenance log
(field_name / source /
captured_at / priority_score)"] C["parking_field_confirmations
UGC 補強"] end subgraph SEARCH["Search layer"] S["parking_search_v (MV)
field_values primary を pivot"] end subgraph INGEST["Raw landing zone"] R["parking_ingest_runs"] P["parking_ingest_payloads"] end A ==> V V -->|read| M V -->|read primary| F V -->|INSTEAD OF INSERT/UPDATE| M V -->|INSTEAD OF INSERT/UPDATE| F F --> C M --> S F --> S R --> P P -->|upsert_ingest_payload| F

2. parking_field_values テーブル2. The parking_field_values table

Phase 0 の中心。(parking_lot_id, field_name, source, captured_at) を 1 行として、駐車場 1 属性につき複数 source の値が時系列で蓄積されます。

The heart of Phase 0. One row per (parking_lot_id, field_name, source, captured_at) tuple — multiple values per attribute, accumulated chronologically.

役割Purpose
parking_lot_id対象駐車場 (FK to parking_lots_master)Target lot (FK to parking_lots_master)
field_name属性名 (Phase 0: name / address / total_spaces / structure / entry_method / receipt_available / pricing_notes / vehicle_type_max。Phase 0.5 追加: contact_phone / contact_email / contact_url / capacity_disabled / capacity_kei / capacity_regular / capacity_oversized / capacity_motorcycle / capacity_ev)Attribute name (Phase 0: name / address / total_spaces / structure / entry_method / receipt_available / pricing_notes / vehicle_type_max. Phase 0.5 added: contact_phone / contact_email / contact_url / capacity_disabled / capacity_kei / capacity_regular / capacity_oversized / capacity_motorcycle / capacity_ev)
valuejsonb (string / number / boolean any)jsonb (string / number / boolean / any)
sourcedata_source enum (11 種)data_source enum (11 sources)
contributor_id入稿元 user / admin / surveyor の uuid (匿名 ugc は NULL)Submitter (user / admin / surveyor uuid; NULL for anonymous UGC)
captured_at入稿時刻Submission timestamp
confirms_n他ソースが同値で補強した回数 (UGC で重要)Times this value was confirmed by other sources (matters for UGC)
is_primaryこの (lot, field) で公開される値か (1 行のみ true)Whether this is the primary value exposed for the (lot, field) (only one row)
priority_scoresource_weight + recency + confirms から計算 (§4 参照)Computed from source_weight + recency + confirms (see §4)
approved_atowner / admin 承認の時刻 (未承認 = NULL)Approval timestamp (NULL = unapproved)
approved_by承認した admin の uuidApproving admin uuid

UNIQUE INDEX idx_pfv_primary ON (parking_lot_id, field_name) WHERE is_primary AND deleted_at IS NULL で、同 (lot, field) で primary が 1 行のみであることを物理保証します。

A partial UNIQUE INDEX (idx_pfv_primary) physically enforces "one primary row per (lot, field) where not deleted."

3. data_source ENUM と source_weight3. data_source ENUM and source_weight

どのソース由来の値を信頼するかは source_weight で表現します。実装は public.calc_priority_score(source data_source, captured_at timestamptz, confirms_n int) RETURNS double precision 関数。

How much we trust each source is encoded in source_weight, materialized in public.calc_priority_score(source, captured_at, confirms_n) RETURNS double precision.

sourceweight由来Origin承認要否Approval
field1.00実地調査Field survey不要 (現地確認済)none (already verified)
owner0.90オーナー入稿Owner submissionactive 中の編集は admin 承認後active lots: admin approval required
admin0.85運営手動編集Manual ops edit即時 (権限内)immediate
ugc0.60ユーザー投稿User-generated contentconfirms_n で重み付けweighted by confirms_n
akippa0.50提携 APIPartner API自動承認auto-approved
navitime0.50提携 APIPartner API自動承認auto-approved
jichitai0.45自治体オープンデータMunicipal open data自動承認auto-approved
kokudo0.40国土数値情報National geographic data自動承認auto-approved
google0.35Google PlacesGoogle Places自動承認auto-approved
osm0.30OpenStreetMapOpenStreetMap自動承認auto-approved
import0.25Phase 0 マイグレーション由来 (旧 parking_lots からの shadow record)Migrated from Phase 0 baseline (shadow records from old parking_lots rows)遡及承認retrospectively approved

4. priority_score の計算4. priority_score calculation

calc_priority_score は 3 要素を線形合成します。係数は dev DB 側で調整可能 (将来 displacement batch でチューニング)。

calc_priority_score linearly combines three signals. Coefficients are tunable on dev DB (will be tuned via the displacement batch later).

priority_score
  = 1.00 * source_weight(source)             -- 0.25 .. 1.00
  + 0.30 * recency_factor(captured_at)       -- 0.0 .. 0.3 (90 日 half-life の指数減衰)
  + 0.15 * confirm_factor(confirms_n)        -- 0.0 .. 0.15 (log scale, cap 5)

同点処理: priority_score 同点は captured_at DESC で勝負。それでも引き分けなら最後に書き込んだ行が primary に。

Tie-breaks: ties on priority_score fall back to captured_at DESC; further ties go to the most recent insert.

5. is_primary の自動再計算5. is_primary auto-recompute

parking_field_values への INSERT / UPDATE で trg_pfv_recompute_primary が発火し、対象 (lot, field) の中から 承認済 (approved_at IS NOT NULL) かつ priority_score 最大の 1 行を is_primary=TRUE、それ以外を FALSE に書き換えます。未承認の owner / admin レコードは候補から除外されるため、Owner Portal で PATCH しても admin 承認まで公開値は変わりません。

Insert / update on parking_field_values triggers trg_pfv_recompute_primary, which picks the highest priority_score among approved rows and flips it to is_primary=TRUE (others to FALSE). Unapproved owner / admin rows are skipped — so an owner PATCH never moves the public value until admin approves.

6. Owner 承認フロー (parking_lots VIEW INSTEAD OF triggers)6. Owner approval flow (parking_lots VIEW INSTEAD OF triggers)

Owner 編集の挙動は 駐車場 statuscaller の権限で分岐します。INSTEAD OF triggers (parking_lots_view_insert/update/delete) と provenance helper (parking_lots_view_provenance) で実装。

Owner edit behavior branches on lot status and caller role. Implemented in INSTEAD OF triggers (parking_lots_view_insert/update/delete) plus a provenance helper (parking_lots_view_provenance).

callerCallerstatus挙動Behavior
is_admin() = true (any) source='admin', approved_at=now() で即時公開 source='admin', approved_at=now() — published immediately
owner active source='owner', approved_at=NULL でキューに入る (admin 審査待ち) source='owner', approved_at=NULL — queued for admin review
owner draft / pending source='owner', approved_at=now() で即時反映 (作業中) source='owner', approved_at=now() — applied immediately (still being drafted)
admin が status: pending → active トリガ trg_parking_lot_approve_owner_data が当該駐車場の全 owner 未承認レコードを一括承認 Trigger trg_parking_lot_approve_owner_data bulk-approves every queued owner row on that lot

申請中の field を Owner Portal で表示するため、GET /v1/owner/parking-lots/mine/{lotId}/pending-fields endpoint が source='owner' AND approved_at IS NULL AND deleted_at IS NULL の行を返します。Portal 側は PublicationTab で「N 件の修正が運営審査中」と提示します。

For Owner Portal visibility, GET /v1/owner/parking-lots/mine/{lotId}/pending-fields returns rows with source='owner' AND approved_at IS NULL AND deleted_at IS NULL. The portal renders "N edits awaiting admin review" in PublicationTab.

7. Raw landing zone (Phase 0d)7. Raw landing zone (Phase 0d)

外部 API (akippa / navitime / google / osm 等) から取り込んだ生データを parking_field_values へ流し込む前段parking_ingest_payloads として保管します。失敗時のリトライ、過去版の参照、新フィールド抽出のリプレイに使います。

External API ingests (akippa / navitime / google / osm / ...) land in parking_ingest_payloads before being projected into parking_field_values. Used for retry, historical lookup, and replay when extracting new fields.

8. ハウスキーピング (Phase 0e)8. Housekeeping (Phase 0e)

Raw landing zone は何もしないと無限に肥大化するため、リテンション + プルーンを pg_cron で自動運用します。

The raw zone would grow unbounded — pg_cron handles retention + pruning.

関数Function役割Purposecron
public.prune_ingest_payloads(retention_days, min_versions, dry_run, triggered_by)superseded 版を保持期間後に hard delete (直近 N 版は確保)Hard-delete superseded payloads after retention (keep last N)Sat 19:00 UTC (Sun 04:00 JST)
public.prune_ingest_runs(retention_days)古い run metadata を削除 (FK cascade なし、子 payload とは別管理)Drop old run rows (no FK cascade; child payloads are managed separately)(同上)
public.purge_soft_deleted_payloads(grace_days)deleted_at から N 日経過した soft-delete 行を物理削除Hard-delete soft-deleted rows older than N days(同上)

全ハウスキーピング実行は infra.parking_ingest_housekeeping_log に行き先 (rows_examined / rows_deleted / parameters / triggered_by) と一緒に記録され、infra.parking_ingest_storage_v view でテーブルサイズと併せて監視できます。

Each housekeeping run is logged in infra.parking_ingest_housekeeping_log (rows_examined / rows_deleted / parameters / triggered_by). Combine with infra.parking_ingest_storage_v for table-size monitoring.

9. 検索層 (parking_search_v MV)9. Search layer (parking_search_v MV)

地図検索 / 一覧 API は parking_search_v (MATERIALIZED VIEW) を読みます。parking_lots_master + parking_field_values の primary を pivot した形で、ratings / hourly_rate も JOIN 済。setup_parking_search_v_refresh_cron() で pg_cron に REFRESH MATERIALIZED VIEW CONCURRENTLY が登録されます。

Map search and list APIs read parking_search_v (MATERIALIZED VIEW): a pivot of parking_lots_master + the primary parking_field_values, pre-joined with ratings and hourly_rate. setup_parking_search_v_refresh_cron() registers a pg_cron REFRESH MATERIALIZED VIEW CONCURRENTLY.

10. バックアップ (Phase 0g)10. Backups (Phase 0g)

Phase 0f (per-row R2 archive) は over-engineered として revert し、代わりに pg_dump → R2 → Wasabi の 3 拠点バックアップに統一しました。

Phase 0f (per-row R2 archive) was reverted as over-engineered, replaced with a unified pg_dump → R2 → Wasabi 3-vendor backup.

詳細運用は ops/db-backup-runbook.html

Operational details in ops/db-backup-runbook.html.

11. 表で覚える Phase 0 の構成物11. Phase 0 inventory at a glance

Phase追加されたものAdded主目的Goal
0a/0bparking_lots → VIEW rename / parking_lots_master / parking_field_values / INSTEAD OF triggers / RLSparking_lots VIEW rename / parking_lots_master / parking_field_values / INSTEAD OF triggers / RLSfield 単位 provenance の coreCore of field-level provenance
0cparking_field_confirmations / increment_confirms_n / run_displacement_batchparking_field_confirmations / increment_confirms_n / run_displacement_batchUGC confirms 補強と priority_score 再計算UGC confirmation + priority_score recompute
0dparking_ingest_runs / parking_ingest_payloads / upsert_ingest_payloadparking_ingest_runs / parking_ingest_payloads / upsert_ingest_payload外部取り込みの raw 永続化 (replay 可)Persist raw external ingests (for replay)
0eprune_ingest_payloads / prune_ingest_runs / purge_soft_deleted_payloads / housekeeping_logprune_ingest_payloads / prune_ingest_runs / purge_soft_deleted_payloads / housekeeping_logretention / プルーンの自動化Automated retention / pruning
0fper-row R2 cold archive (revert 済)per-row R2 cold archive (reverted)over-engineered と判断Reverted as over-engineered
0gbackup-supabase-to-r2.sh / backup-r2-to-wasabi.sh / restore-supabase-from-r2.sh + Xserver cronbackup-supabase-to-r2.sh / backup-r2-to-wasabi.sh / restore-supabase-from-r2.sh + Xserver cronDB の長期 3 拠点バックアップLong-term 3-vendor DB backup
0.5codes 拡張 (lot_structure 10 値 / lot_entry 8 値 / lot_type 6 値) / parking_field_values の field_name 8→17 / parking_operators テーブル新設 + master.operator_id FK / tags 14 個追加 (vehicle / accessibility / 監視員 / 照明) / parking_spots に EV 詳細 5 列codes expansion (lot_structure 10 / lot_entry 8 / lot_type 6) / parking_field_values field_name 8→17 / new parking_operators table + master.operator_id FK / 14 tags (vehicle / accessibility / supervisor / lighting) / parking_spots EV detail 5 columnsPhase 1 (OSM / UGC) 着手前の駐車場データモデル整地 — 監査レポート (.work/output/parky/2026-05-04_008) の High 5 件 + Mid 7 件を解消Pre-Phase-1 schema tidy-up — addresses High 5 + Mid 7 gaps from audit (.work/output/parky/2026-05-04_008)

12. 参考: 主要 RPC 一覧12. Reference: key RPCs

schema.function役割Purpose
public.calc_priority_score(source, captured_at, confirms_n)priority_score 計算 (provenance 評価)Compute priority_score from provenance
public.run_displacement_batch(...)priority_score を全件再計算 + primary 入替 (Phase 1 で recency 改良予定)Recompute priority_score + flip primaries (recency tuning planned in Phase 1)
public.prune_ingest_payloads(retention_days, min_versions, dry_run, triggered_by)superseded 版の hard delete (直近 N 版確保)Hard-delete superseded payloads (keep last N versions)
public.prune_ingest_runs(retention_days)古い run の削除Drop old runs
public.purge_soft_deleted_payloads(grace_days)soft-delete からの確定削除Hard-delete from soft-deleted state
public.setup_parking_search_v_refresh_cron()parking_search_v REFRESH の pg_cron 登録Schedule parking_search_v REFRESH via pg_cron
public.setup_displacement_batch_cron()displacement batch の pg_cron 登録Schedule the displacement batch via pg_cron
public.setup_ingest_housekeeping_cron()ハウスキーピング系 3 関数の pg_cron 登録Schedule the 3 housekeeping functions via pg_cron
bff_only.upsert_ingest_payload(...)外部取り込みの単位 RPC (hash 冪等)Per-record ingest RPC (hash-idempotent)
bff_only.start_ingest_run / finish_ingest_run取り込み batch の lifecycleIngest batch lifecycle
bff_only.increment_confirms_n(field_value_id)UGC confirm の単位 RPCUGC confirm increment RPC

13. Phase 1 以降 (スコープ外 / 参考)13. Future phases (out of scope, for reference)

Phase 0 で provenance / source_weight / Owner 承認フロー / Raw landing zone / ハウスキーピング / バックアップが揃ったため、Phase 1 以降は 新ソースの追加と新 ingestion script の実装 だけで済むはず、という設計判断になっています。

With Phase 0 in place — provenance, source_weight, owner approval flow, raw landing zone, housekeeping, backups — later phases should reduce to adding new sources and new ingestion scripts only.