Question 1

What is schema drift?

Accepted Answer

Schema drift is when the structure, format, or meaning of data in a source system changes over time without those changes being communicated to or handled by the downstream systems that consume the data. The integrations keep running and report success, but the data flowing through is now incomplete or incorrect.

Question 2

What are the types of schema drift?

Accepted Answer

There are three: structural drift (a field is added, removed, renamed, or retyped), semantic drift (a field's meaning changes while its name and type stay the same), and temporal drift (the timing or ordering of data changes, such as a timezone or batch-frequency shift).
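Structural drift is the easiest of the three to check mechanically. The following is a minimal sketch, assuming each schema can be represented as a plain mapping of field name to type name. The function name `diff_schemas` and the sample "orders" schemas are hypothetical, chosen only for illustration.

```python
def diff_schemas(old, new):
    """Compare two {field_name: type_name} mappings and report structural changes."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    # A field is "retyped" when it exists in both schemas with different types.
    retyped = sorted(
        name for name in set(old) & set(new) if old[name] != new[name]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


# Hypothetical schemas for an "orders" feed before and after an upstream release.
old_schema = {"order_id": "int", "amount": "float", "created_at": "str"}
new_schema = {
    "order_id": "str",    # retyped
    "total": "float",     # "amount" renamed to "total"
    "created_at": "str",
    "currency": "str",    # newly added
}

report = diff_schemas(old_schema, new_schema)
print(report)
```

A rename cannot be distinguished from an unrelated removal plus addition by names alone, so in this sketch `amount` shows up under `removed` and `total` under `added`. Semantic and temporal drift leave this report empty, because names and types are unchanged.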

Question 3

Why don't monitoring tools catch schema drift?

Accepted Answer

Most monitoring tools measure infrastructure signals (uptime, error rates, job success), not data correctness. A pipeline that drops or misreads records because of drift can still report a successful execution, so structural and semantic drift pass through undetected.
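To illustrate how a job can succeed while the data is wrong, here is a small sketch. The consumer `load_orders`, the field names, and the sample records are all hypothetical. It assumes the upstream system renamed `amount` to `total` without notice.

```python
def load_orders(records):
    """Hypothetical consumer written when the field was still called 'amount'."""
    revenue = 0.0
    for record in records:
        # .get() with a default hides the missing key instead of raising an error.
        revenue += record.get("amount", 0.0)
    return {"status": "success", "rows": len(records), "revenue": revenue}


# The same two orders, before and after the upstream rename.
before = [{"order_id": 1, "amount": 20.0}, {"order_id": 2, "amount": 30.0}]
after = [{"order_id": 1, "total": 20.0}, {"order_id": 2, "total": 30.0}]

result_before = load_orders(before)
result_after = load_orders(after)
print(result_before)
print(result_after)
```

Both runs report `success` and the same row count, which is all a job-level monitor sees. Only the `revenue` value reveals that the second run silently lost every amount.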

Question 4

How do you detect schema drift?

Accepted Answer

Effective detection works at the semantic layer: continuously comparing schemas across integration points, baselining the expected values of each field, and recognizing when a field's meaning has shifted even if its structure has not. Techniques like vector-embedding comparison and statistical distribution monitoring make this tractable.
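As one illustration of statistical distribution monitoring, the sketch below baselines a numeric field and flags a batch whose mean moves too far from that baseline. It is a deliberately crude check, not a definitive method: the function names, the threshold of 3 standard deviations, and the sample values are assumptions, and a real system might prefer a proper two-sample test or a stability index.

```python
from statistics import mean, stdev


def mean_shift(baseline, current):
    """Return how many baseline standard deviations the current mean has moved."""
    spread = stdev(baseline)  # requires at least two baseline values
    if spread == 0:
        return 0.0 if mean(current) == mean(baseline) else float("inf")
    return abs(mean(current) - mean(baseline)) / spread


def has_drifted(baseline, current, threshold=3.0):
    """Flag the batch when its mean is more than `threshold` deviations away."""
    return mean_shift(baseline, current) > threshold


# Hypothetical "amount" values. The field name and type never change,
# but the last batch is expressed in cents instead of dollars.
baseline_amounts = [19.99, 24.50, 21.00, 18.75, 22.30, 20.10]
same_units = [20.50, 23.00, 19.00, 21.75]
in_cents = [2050.0, 2300.0, 1900.0, 2175.0]

print(has_drifted(baseline_amounts, same_units))
print(has_drifted(baseline_amounts, in_cents))
```

A schema comparison would see nothing wrong with the cents batch, since the field is still a number called `amount`. The value baseline is what exposes the change in meaning.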
