Leeroo Data Lineage | Marlink Maritime Telecom

Data Lineage Map

This map shows which columns in which systems refer to the same thing, such as vessel IMO numbers, customer IDs, and billing codes. It is the wiring diagram, not the data itself.

What this map shows

Schema-level connections. Each line means "this column feeds that column." For example: IHS.LRIMOShipNo is the source for TT.sitechanges.Imo.

1. An AI agent profiled 1,100 tables, examining sample data, column meanings, and cross-system references.
2. Overlap tested in production: Databricks queries measured how many values appear in both columns (exact match).
3. Each edge semantically reviewed: does this relationship make real-world sense given the domain?
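
The exact-match overlap test in step 2 could look roughly like the sketch below. It is only an illustration: the function name `exact_overlap`, the choice of denominator (distinct non-null values in the target column), and the sample IMO values are all assumptions made up for this example, not the production Databricks query.

```python
def exact_overlap(source_values, target_values):
    """Fraction of distinct non-null target values that also appear in the source.

    Assumption: overlap is measured relative to the target column.
    The real pipeline may normalize differently.
    """
    source = {v for v in source_values if v is not None}
    target = {v for v in target_values if v is not None}
    if not target:
        return 0.0
    return len(source & target) / len(target)


# Made-up sample values, standing in for IHS.LRIMOShipNo and TT.sitechanges.Imo.
ihs_imos = ["9074729", "9321483", "9454436", None]
tt_imos = ["9321483", "9454436", "1234567"]

print(f"{exact_overlap(ihs_imos, tt_imos):.2f}")  # prints 0.67
```

Note that this number is overlap evidence only. The confidence percentage on an edge is a separate judgment about whether the relationship exists.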

Confidence % = how sure we are the relationship exists, not how many values match.

How the DQ pipeline uses this

The lineage map is the input to data quality. It tells the pipeline where to look. Then at runtime:

A. Where to check: lineage says "these columns are connected"
B. Who is right: each family has a default authority (e.g. IHS for vessel identity)
C. Do values match: actual row-by-row comparison (exact, fuzzy, and AI matching)

Step C happens at DQ runtime, not here. This map provides A and B.
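
A minimal sketch of how steps A and B might be handed to the pipeline, assuming each edge is a (family, source, target) record and each family maps to one authoritative system. The `Edge` class, the family name `vessel_identity`, and the helper functions are hypothetical names for illustration. Only the IHS-to-TT example edge and the IHS authority come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    family: str   # hypothetical family label, e.g. "vessel_identity"
    source: str   # fully qualified column, "System.[table.]column"
    target: str


# Step A: lineage lists which columns are connected.
EDGES = [Edge("vessel_identity", "IHS.LRIMOShipNo", "TT.sitechanges.Imo")]

# Step B: each family has a default authority.
DEFAULT_AUTHORITY = {"vessel_identity": "IHS"}


def system_of(column):
    """Return the system prefix of a qualified column name."""
    return column.split(".", 1)[0]


def authoritative_side(edge):
    """Return the column on this edge that belongs to the family's authority, if any."""
    authority = DEFAULT_AUTHORITY.get(edge.family)
    for column in (edge.source, edge.target):
        if system_of(column) == authority:
            return column
    return None


def check_plan(edges):
    """Build (source, target, trusted column) tuples for the runtime comparison in step C."""
    return [(e.source, e.target, authoritative_side(e)) for e in edges]


print(check_plan(EDGES))
```

The row-by-row comparison itself (step C) is deliberately left out, since it runs at DQ runtime rather than in the map.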

-- systems mapped: from IHS ship registry to satellite providers, billing, CRM, and more
-- column relationships confirmed: schema connections with production overlap evidence
-- need your expertise: relationships the agent can't confirm without domain knowledge
Relationship types:
Identity: same field, direct reference
Copy: independent duplicate
Transform: reformatted/mapped
Derived: computed from source

Confidence levels: Confirmed (≥85%), Uncertain, Weak

Data Flow

Confirmed (≥85%)
Uncertain (70-84%)
Weak (<70%)
Default Authority
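
The confidence bands in the legend above can be expressed as a small lookup. This is a sketch only: the function name `confidence_band` is hypothetical, and treating fractional values between 84 and 85 as Uncertain is an assumption, since the legend lists whole-number ranges.

```python
def confidence_band(confidence_pct):
    """Map a confidence percentage to the legend's band name."""
    if confidence_pct >= 85:
        return "Confirmed"
    if confidence_pct >= 70:
        return "Uncertain"
    return "Weak"


for pct in (92, 85, 84, 70, 69):
    print(pct, confidence_band(pct))
```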

All Connections

Family | Type | From | To | Confidence | Evidence

Detail