Duplicate Tables

What It Detects

This flag identifies tables across the scanned semantic models that appear to be duplicates of each other — same structure, same (or very similar) data, loaded independently into multiple datasets.

Why It Matters

Redundant memory usage — The same data is loaded into RAM multiple times across different datasets, wasting capacity memory.
Inconsistent data — If source queries are even slightly different, “duplicate” tables may produce conflicting numbers in different reports.
Refresh overhead — Every copy is refreshed independently, multiplying CU consumption.
Governance gap — Indicates a lack of shared/reusable datasets in the data architecture.

Trigger Conditions

Two tables are flagged as duplicates when all of the following match:

Schema match — Both tables have the same column names and data types (column order is ignored).
Row count proximity — Row counts are within 5% of each other.
Different datasets — The tables reside in different semantic models (same-dataset duplicates are handled separately).

Threshold: Schema match ≥ 100% column overlap + row count within 5%.

Output

Field	Description
`tableName`	Table name
`datasetA`	First dataset name
`workspaceA`	First dataset’s workspace
`datasetB`	Second dataset name
`workspaceB`	Second dataset’s workspace
`columnOverlap`	Percentage of matching columns
`rowCountA`	Row count in dataset A
`rowCountB`	Row count in dataset B

Notes and Edge Cases

Shared datasets — If you already use Power BI shared datasets (live connections), the downstream models should not contain copies of the same tables. This flag helps identify cases where copy was used instead of live connection.
Date tables — Many models have independently created Date/Calendar tables with similar structure. These are common and expected; evaluate case by case whether to consolidate.

Duplicate Columns — Detects column-level duplication within a single model.
Duplicate Measures — Detects identical DAX measures.

Video Tutorial

▶ Watch the video tutorial