Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.

Databricks Liquid Cluster

techuser
New Contributor III

Hi,

Is it possible to convert an existing Delta table that is partitioned and already contains data to liquid clustering? If so, can you please suggest the steps required? I tried and searched but couldn't find any. Or can liquid clustering be used only for new Delta tables? Please help.

6 REPLIES

techuser
New Contributor III

@Retired_mod How can this be applied to an existing Delta table that is partitioned and already contains data? Can you please suggest the steps involved? The existing Delta table is partitioned, so its data files are all laid out by partition. Is it possible to convert this to clustering? The existing Databricks documentation mentions it only for NEW tables.

techuser
New Contributor III

Hi,

Sorry for asking this again. My requirement is not to change the partition column of an existing Delta table. My requirement is to change an existing Delta table, for example Table A partitioned by Column 1 and Column 2, into Table A clustered by Column 3.

In short, the requirement is to convert an existing partitioned Delta table to a clustered Delta table that uses a new column.

 

techuser
New Contributor III

When defining a new table that uses liquid clustering, we specify 'USING DELTA CLUSTER BY (Column1)' at the end.

In the solution above, Point 3 (Cluster by the new column) is written as:

.partitionBy("colB")

How does this identify the table as clustered? When creating a table, CLUSTER BY and PARTITION BY are two different clauses, and that is how a table is identified as clustered or partitioned.

With the approach explained above, does DESCRIBE TABLE show it as clustered?
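
One way to check which layout a table actually has is to look at the `partitionColumns` and `clusteringColumns` fields returned by `DESCRIBE DETAIL`. A DataFrame write with `.partitionBy("colB")` produces a partitioned layout, not a clustered one. The following is a minimal sketch, assuming a runtime recent enough to report `clusteringColumns`; the helper name `layout_of` and the sample rows are hypothetical.

```python
def layout_of(detail_row):
    """Classify a table from one DESCRIBE DETAIL row given as a dict.

    On a cluster the dict could come from something like
    spark.sql("DESCRIBE DETAIL my_table").first().asDict()
    (my_table is a placeholder name).
    """
    # Either field may be missing or None on older runtimes, so default to [].
    clustering = list(detail_row.get("clusteringColumns") or [])
    partitions = list(detail_row.get("partitionColumns") or [])
    if clustering:
        return ("clustered", clustering)
    if partitions:
        return ("partitioned", partitions)
    return ("unpartitioned", [])


# Hypothetical rows, shaped like the relevant part of DESCRIBE DETAIL output.
written_with_partition_by = {"partitionColumns": ["colB"], "clusteringColumns": []}
created_with_cluster_by = {"partitionColumns": [], "clusteringColumns": ["colB"]}

print(layout_of(written_with_partition_by))
print(layout_of(created_with_cluster_by))
```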

 

techuser
New Contributor III

Thank you for the response!

DESCRIBE TABLE gives the column and data type details, plus the partition and clustering columns. If clustering is used, the output has a Clustering Information section that lists the columns used, and likewise for partitioning.

So, back to my previous question: is there a way to convert an existing partitioned Delta table to a clustered Delta table?

1. Table A is partitioned by column A

2. Take a backup of Table A as A_bkp

3. Replace, or drop and re-create, Table A clustered by column B

4. INSERT INTO Table A SELECT * FROM A_bkp

5. Drop A_bkp and remove the associated files

Is this a good approach?
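
The steps above can be sketched as an ordered list of SQL statements. This is only a sketch under stated assumptions: the helper name `migration_statements` and the table and column names are hypothetical, Table A is assumed to be a managed Delta table, and the runtime is assumed to support `CLUSTER BY`. It re-creates the table with a CTAS, which already loads the data, so a separate INSERT from the backup is not needed (running both would load the rows twice).

```python
def migration_statements(table, cluster_col, backup=None):
    """Return SQL for a backup / drop / re-create migration, in execution order."""
    backup = backup or f"{table}_bkp"
    return [
        # Back up the partitioned table; DEEP CLONE copies the data files too.
        f"CREATE TABLE {backup} DEEP CLONE {table}",
        # Drop the partitioned table (assumption: managed table).
        f"DROP TABLE {table}",
        # Re-create it clustered by the new column and load the data in one step.
        f"CREATE TABLE {table} USING DELTA CLUSTER BY ({cluster_col}) "
        f"AS SELECT * FROM {backup}",
        # Optional: ask Delta to cluster the freshly written files.
        f"OPTIMIZE {table}",
        # Clean up; run this only after verifying row counts in the new table.
        f"DROP TABLE {backup}",
    ]


for stmt in migration_statements("A", "colB"):
    print(stmt)
```

On a cluster, each statement could be passed to `spark.sql(...)` one at a time, checking the new table before the final `DROP TABLE` runs.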

 

This is an old reply, but I want to verify the last comment by Fatma.
You create the new table using "AS SELECT * FROM A_bkp", and in the next step you write another "INSERT INTO Table_A SELECT * FROM A_bkp". Is this just a typo, or why does it insert the data from the backup table twice?

Raja_Databricks
New Contributor III

Does liquid clustering support MERGE, and how can an upsert be done efficiently on a liquid-clustered Delta table?
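
For illustration, an upsert on a clustered table can be written as an ordinary `MERGE INTO` followed by `OPTIMIZE`. This is a minimal sketch, not a confirmed recommendation: it assumes MERGE is accepted on a liquid-clustered table in the runtime being used, that rows written by MERGE are clustered at the next `OPTIMIZE`, and that the source and target share the same columns. The helper name `upsert_statements` and all table and column names are hypothetical.

```python
def upsert_statements(target, source, key_cols):
    """Return SQL for an upsert into `target` from `source`, then OPTIMIZE."""
    if not key_cols:
        raise ValueError("at least one key column is required")
    # Match rows on every key column.
    on_clause = " AND ".join(f"t.{c} = s.{c}" for c in key_cols)
    return [
        # Update matching rows and insert the rest (same schema on both sides).
        f"MERGE INTO {target} AS t USING {source} AS s ON {on_clause} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *",
        # Re-cluster the files that the merge rewrote or added.
        f"OPTIMIZE {target}",
    ]


for stmt in upsert_statements("A", "A_updates", ["id"]):
    print(stmt)
```

If the merge keys are also the clustering columns, the match condition may be able to skip more files, but that depends on the data and is worth measuring rather than assuming.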
