Field Mapping: Mill 1 (CSV) to Mill 2 (JSON)
Both mills record the same business data -- paddy intake from farmers -- but use different systems with different field names and formats.
Field mapping
| Mill 1 field | Mill 2 field | Unified name | Notes |
|---|---|---|---|
| record_id | id | record_id | Sequential integer. Not a natural key -- assigned by each mill's system independently. |
| farmer_name | supplier_name | farmer_name | Same concept: the person delivering paddy. Mill 2's newer system uses "supplier" terminology. |
| paddy_weight_kg | weight_kg | paddy_weight_kg | Weight of paddy delivered in kilograms. Null in Mill 2 for advance payment records. |
| moisture_pct | moisture_percent | moisture_pct | Moisture content as a percentage. Null in Mill 2 for advance payment records. |
| grade | harvest_quality | grade | Mill 1 uses text labels (premium/standard/low). Mill 2 uses letter codes (A/B/C). Map A->premium, B->standard, C->low. Null in Mill 2 for advance payment records. |
| price_mmk | payment_amount | price_mmk | Amount paid in Myanmar Kyat. Present on all records including advance payments. |
| mill_date | processing_date | mill_date | Date of the milling operation. Mill 2 uses ISO format (YYYY-MM-DD). |
| intake_time | timestamp | intake_time | Exact time of the transaction. Mill 2 uses ISO timestamp format. |
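
The table above can be expressed as a simple rename step. The following is a minimal Python sketch, not the project's actual staging code: the names `MILL2_TO_UNIFIED` and `rename_mill2_record` are hypothetical, and the sketch assumes each Mill 2 record has already been parsed from JSON into a dict. It only renames keys; grade codes and date formats are left untouched.

```python
# Mill 2 field name -> unified field name, copied from the mapping table.
MILL2_TO_UNIFIED = {
    "id": "record_id",
    "supplier_name": "farmer_name",
    "weight_kg": "paddy_weight_kg",
    "moisture_percent": "moisture_pct",
    "harvest_quality": "grade",
    "payment_amount": "price_mmk",
    "processing_date": "mill_date",
    "timestamp": "intake_time",
}


def rename_mill2_record(record: dict) -> dict:
    """Return a copy of a Mill 2 record keyed by unified field names.

    Values are not converted here. Keys that are not in the mapping
    are kept under their original name (an assumption of this sketch).
    """
    return {MILL2_TO_UNIFIED.get(key, key): value for key, value in record.items()}
```

Keeping the rename separate from value conversion (grade codes, dates) makes each step easy to check against the table on its own.
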
Advance payment records (Mill 2 only)
Some Mill 2 records represent advance payments to farmers for future paddy delivery. These records have:
- supplier_name and payment_amount populated (who was paid, how much)
- weight_kg, moisture_percent, and harvest_quality set to null (no paddy delivered yet)
These are normal business operations. Kyaw Zin Oo pays farmers in advance to secure future supply. The payment appears in the data before the corresponding paddy delivery. When the paddy is eventually delivered, it appears as a separate record with all fields populated.
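One possible way to flag these records before they reach the unified model is sketched below. The function name `is_advance_payment` is hypothetical, and the sketch assumes Mill 2 records are plain dicts using Mill 2's original field names, with JSON null parsed to Python `None`.

```python
# Delivery fields that are null on an advance payment record.
DELIVERY_FIELDS = ("weight_kg", "moisture_percent", "harvest_quality")


def is_advance_payment(record: dict) -> bool:
    """Return True if a Mill 2 record matches the advance payment pattern.

    The pattern: supplier_name and payment_amount are populated, and all
    three delivery fields are null. A missing key is treated like null
    (an assumption of this sketch).
    """
    has_payment = (
        record.get("supplier_name") is not None
        and record.get("payment_amount") is not None
    )
    no_delivery = all(record.get(field) is None for field in DELIVERY_FIELDS)
    return has_payment and no_delivery
```

Flagging these rows explicitly, rather than dropping rows with null weight, keeps the payment in the data while making it clear that no paddy was delivered on that record.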
Key considerations
- record_id / id is NOT a natural key for MERGE. Each mill assigns IDs independently. The same ID number means different records at different mills.
- The natural key for deduplication should combine mill identifier + farmer/supplier name + mill_date (and potentially grade) to uniquely identify a paddy intake event.
- Grade mapping must be explicit -- do not rely on AI to infer that A=premium. Define the mapping in the staging model.
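
The last two points can be sketched in Python as follows. This is an illustration under stated assumptions, not the staging model itself: `GRADE_MAP`, `map_grade`, and `natural_key` are hypothetical names, the mill identifier is assumed to be a string supplied by the loader, and records are assumed to already use the unified field names.

```python
# Explicit Mill 2 letter code -> unified grade, as stated in the mapping table.
GRADE_MAP = {"A": "premium", "B": "standard", "C": "low"}


def map_grade(code):
    """Convert a Mill 2 grade code to the unified grade text.

    None (advance payment records) passes through as None. Any other
    unmapped code raises instead of being guessed.
    """
    if code is None:
        return None
    if code not in GRADE_MAP:
        raise ValueError(f"Unmapped Mill 2 grade code: {code!r}")
    return GRADE_MAP[code]


def natural_key(mill_id: str, record: dict, include_grade: bool = False) -> tuple:
    """Build a composite key: mill identifier + farmer name + mill_date.

    record_id is deliberately excluded, because each mill assigns IDs
    independently. Grade is appended only when include_grade is True.
    """
    key = (mill_id, record["farmer_name"], record["mill_date"])
    if include_grade:
        key += (record.get("grade"),)
    return key
```

Raising on an unknown code is a design choice of this sketch: a new letter code surfaces as a failure in staging rather than as a silently wrong grade downstream.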