Skip to content

Duplicate Tables

This flag identifies tables across the scanned semantic models that appear to be duplicates of each other — same structure, same (or very similar) data, loaded independently into multiple datasets.


  • Redundant memory usage — The same data is loaded into RAM multiple times across different datasets, wasting capacity memory.
  • Inconsistent data — If source queries are even slightly different, “duplicate” tables may produce conflicting numbers in different reports.
  • Refresh overhead — Every copy is refreshed independently, multiplying CU consumption.
  • Governance gap — Indicates a lack of shared/reusable datasets in the data architecture.

Two tables are flagged as duplicates when all of the following match:

  1. Schema match — Both tables have the same column names and data types (column order is ignored).
  2. Row count proximity — Row counts are within 5% of each other.
  3. Different datasets — The tables reside in different semantic models (same-dataset duplicates are handled separately).

Threshold: Schema match ≥ 100% column overlap + row count within 5%.


FieldDescription
tableNameTable name
datasetAFirst dataset name
workspaceAFirst dataset’s workspace
datasetBSecond dataset name
workspaceBSecond dataset’s workspace
columnOverlapPercentage of matching columns
rowCountARow count in dataset A
rowCountBRow count in dataset B

  • Shared datasets — If you already use Power BI shared datasets (live connections), the downstream models should not contain copies of the same tables. This flag helps identify cases where copy was used instead of live connection.
  • Date tables — Many models have independently created Date/Calendar tables with similar structure. These are common and expected; evaluate case by case whether to consolidate.

Video Tutorial