Duplicate Columns
What It Detects
This flag identifies columns within the same semantic model that are likely duplicates: they share similar names, identical data types, and comparable value distributions, suggesting they carry the same data.
Why It Matters
- Wasted memory — Duplicate columns double the VertiPaq storage for the same information.
- Ambiguity for report authors — Two columns with similar names and identical data create confusion about which one to use.
- Relationship conflicts — Duplicate columns can lead to ambiguous relationship paths, causing unexpected DAX results.
Trigger Conditions
Two columns within the same model are flagged when all of the following are true:
- Name similarity — Column names match after normalization (removing spaces and underscores and ignoring case). For example, `Customer_ID` and `CustomerID` are considered a match.
- Data type match — Both columns have the same data type.
- Same cardinality range — Distinct value counts are within 10% of each other.
- Different tables — The columns reside in different tables within the same model (same-table duplicates are rare and usually intentional).
Threshold: Name match + data type match + cardinality within 10%.
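
The trigger conditions above can be sketched in a few lines of Python. This is a minimal illustration, not the flag's actual implementation: `ColumnInfo`, `normalize_name`, `cardinality_within`, and `find_duplicate_candidates` are hypothetical names, and the 10% gap is assumed to be measured relative to the larger of the two distinct counts, which the conditions do not specify.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class ColumnInfo:
    # Hypothetical record; field names are illustrative only.
    table: str
    name: str
    data_type: str
    cardinality: int  # distinct value count


def normalize_name(name: str) -> str:
    """Remove spaces and underscores, then ignore case."""
    return name.replace(" ", "").replace("_", "").casefold()


def cardinality_within(a: int, b: int, tolerance: float = 0.10) -> bool:
    """Assumption: the gap is measured relative to the larger count."""
    larger = max(a, b)
    if larger == 0:
        return True  # two empty columns trivially agree
    return abs(a - b) / larger <= tolerance


def is_duplicate_candidate(x: ColumnInfo, y: ColumnInfo) -> bool:
    """All four trigger conditions must hold."""
    return (
        x.table != y.table
        and normalize_name(x.name) == normalize_name(y.name)
        and x.data_type == y.data_type
        and cardinality_within(x.cardinality, y.cardinality)
    )


def find_duplicate_candidates(columns):
    """Check every unordered pair of columns in one model."""
    return [
        (x, y)
        for x, y in combinations(columns, 2)
        if is_duplicate_candidate(x, y)
    ]
```

Under these assumptions, `Sales[Customer_ID]` with 1,000 distinct values and `Returns[CustomerID]` with 950 would be paired, while the same two columns with different data types would not.
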
Output
| Field | Description |
|---|---|
| `columnA` | First column (table.column format) |
| `columnB` | Second column (table.column format) |
| `dataType` | Shared data type |
| `cardinalityA` | Distinct value count in column A |
| `cardinalityB` | Distinct value count in column B |
| `datasetName` | Semantic model name |
| `workspaceName` | Workspace name |
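
To make the output fields concrete, the sketch below assembles one finding as a Python dictionary and prints it as JSON. The helper `build_finding`, the sample values, and the JSON serialization are all assumptions for illustration; only the field names come from the output table.

```python
import json


def build_finding(col_a, col_b, data_type, dataset_name, workspace_name):
    """Hypothetical helper: col_a and col_b are (table, column, distinct_count)."""
    table_a, name_a, card_a = col_a
    table_b, name_b, card_b = col_b
    return {
        "columnA": f"{table_a}.{name_a}",  # table.column format
        "columnB": f"{table_b}.{name_b}",
        "dataType": data_type,
        "cardinalityA": card_a,
        "cardinalityB": card_b,
        "datasetName": dataset_name,
        "workspaceName": workspace_name,
    }


finding = build_finding(
    ("Sales", "Customer_ID", 1000),
    ("Returns", "CustomerID", 950),
    data_type="Int64",
    dataset_name="Retail Model",  # illustrative value
    workspace_name="Analytics",   # illustrative value
)
print(json.dumps(finding, indent=2))
```
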
Notes and Edge Cases
- Foreign key columns — Columns that exist on both sides of a relationship (e.g., `Orders[ProductID]` and `Products[ProductID]`) are expected and are excluded from this flag if they participate in an active relationship.
- Denormalized tables — In star-schema models, dimension attributes may be intentionally denormalized into fact tables for performance. These may be flagged but could be intentional.
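
The foreign key exclusion can be pictured as a filter applied to each candidate pair. The sketch below assumes relationships are available as simple records with an `is_active` marker; `Relationship` and `joined_by_active_relationship` are hypothetical names, not part of any real API.

```python
from typing import Iterable, NamedTuple, Tuple

ColumnRef = Tuple[str, str]  # (table, column)


class Relationship(NamedTuple):
    # Hypothetical record describing one model relationship.
    from_column: ColumnRef
    to_column: ColumnRef
    is_active: bool


def joined_by_active_relationship(
    a: ColumnRef, b: ColumnRef, relationships: Iterable[Relationship]
) -> bool:
    """True when a and b are the two ends of an active relationship."""
    pair = frozenset((a, b))  # direction of the relationship is ignored
    return any(
        r.is_active and frozenset((r.from_column, r.to_column)) == pair
        for r in relationships
    )


relationships = [
    Relationship(("Orders", "ProductID"), ("Products", "ProductID"), True),
    Relationship(("Orders", "ShipDateKey"), ("Date", "DateKey"), False),
]

# Excluded from the flag: both ends of an active relationship.
print(joined_by_active_relationship(
    ("Products", "ProductID"), ("Orders", "ProductID"), relationships
))
```

In this sketch a pair joined only by an inactive relationship is not excluded, which matches the wording "participate in an active relationship" above.
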
Related Flags
- Duplicate Tables — Same concept applied at the table level across datasets.
- Unused Columns — One of the duplicate pair may be unused.