Effects of materialized view with Cluster BY

malterializedvw · ‎03-10-2026

Hi folks,
I have a question on whether I am using materialized views right.
Our pipeline looks like this:
1. A spark job creates a table `source` with columns a, b and c.
2. A materialized view `target` is created on `source`. I want to partition it by `a`, so I add a `CLUSTER BY`.
3. PowerBI queries `target` by doing `select * from target where a is between x and y` (the reason why I want a partition)

CREATE MATERIALIZED VIEW IF NOT EXISTS target
            TBLPROPERTIES ('delta.columnMapping.mode' = 'name')
            TRIGGER ON UPDATE
    select
       a,
       b,
       sum(c)
    from source
    group by all
        CLUSTER BY (`a`)

Unfortunately the PowerBi querying is still very slow (compared to a simple view without clustering).
But this could be a bandwith issue, bit hard to find out.
Two questions:
1. Is the definition of the materialized view correct?
2. Would it matter to add partitioning additionally in the source table (let's assume this table is not accessed otherwise)?

For any clarifications or hints, please let me know.
Thanks and best regards