● What this map shows
Schema-level connections. Each line means "this column feeds that column." For example: IHS.LRIMOShipNo is the source for TT.sitechanges.Imo.
1. An AI agent profiled 1,100 tables, examining sample data, column meanings, and cross-system references.
2. Overlap tested in production: Databricks queries measured how many values appear in both columns (exact match).
3. Each edge semantically reviewed: does this relationship make real-world sense given the domain?
Confidence % = how sure we are the relationship exists, not how many values match.
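To make the difference between overlap and confidence concrete, here is a minimal Python sketch of an exact-match overlap measurement. It is an illustration only: the production check runs as Databricks queries, and the function name `exact_overlap`, the choice of denominator (distinct target values), and the sample IMO values are all assumptions, not part of the actual pipeline.

```python
def exact_overlap(source_values, target_values):
    """Share of distinct, non-null target values that also appear in the source.

    Hypothetical helper: the choice of denominator is an assumption.
    """
    source = {v for v in source_values if v is not None}
    target = {v for v in target_values if v is not None}
    if not target:
        return 0.0
    return len(source & target) / len(target)


# Made-up sample values standing in for IHS.LRIMOShipNo and TT.sitechanges.Imo.
ihs_imo = ["9074729", "9321483", "9450648", None]
tt_imo = ["9321483", "9450648", "1234567", "9321483"]

overlap = exact_overlap(ihs_imo, tt_imo)  # 2 of 3 distinct target values match
```

An edge could score a modest overlap like this and still carry a high confidence %, because confidence reflects whether the relationship exists, not how many values currently match.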
● How the DQ pipeline uses this
The lineage map is the input to the data quality (DQ) pipeline. It tells the pipeline where to look. At runtime the pipeline then answers three questions:
A. Where to check: lineage says "these columns are connected"
B. Who is right: each family has a default authority (e.g. IHS for vessel identity)
C. Do values match: actual row-by-row comparison (exact, fuzzy, and AI matching)
Step C happens at DQ runtime, not here. This map provides A and B.
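The three steps can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the pipeline's implementation: the names `EDGES`, `AUTHORITY`, `compare`, the family key `vessel_identity`, and the 0.85 fuzzy threshold are hypothetical, fuzzy matching is approximated with the standard library's `difflib.SequenceMatcher`, and the AI matching stage is left out.

```python
from difflib import SequenceMatcher

# Step A (from the map): which columns are connected.
EDGES = [("IHS.LRIMOShipNo", "TT.sitechanges.Imo")]

# Step B (from the map): default authority per family.
# The family key is a hypothetical label.
AUTHORITY = {"vessel_identity": "IHS"}


def compare(authoritative, candidate, fuzzy_threshold=0.85):
    """Step C (DQ runtime): classify one pair of values.

    Returns "exact", "fuzzy", or "mismatch". The threshold is an assumption.
    """
    if authoritative == candidate:
        return "exact"
    ratio = SequenceMatcher(None, authoritative.lower(), candidate.lower()).ratio()
    return "fuzzy" if ratio >= fuzzy_threshold else "mismatch"


# Made-up values for illustration.
results = [
    compare("9321483", "9321483"),
    compare("EVER GIVEN", "Ever Given"),
    compare("EVER GIVEN", "MAERSK ALABAMA"),
]
```

When a pair comes back as "fuzzy" or "mismatch", the authority from step B decides which side is treated as correct.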