Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Yuki
by Contributor
  • 316 Views
  • 1 reply
  • 0 kudos

Is it possible to retain original deltatable data with Unity Catalog?

Hi everyone, I have a question regarding data retention in Unity Catalog. In the pre-Unity Catalog setup, I believe that even if we dropped an external table, the underlying data files remained intact. However, in the current best practices for Unity C...

Latest Reply
mani_22
Databricks Employee
  • 0 kudos

Hi @Yuki, if you drop an external table, the underlying data remains accessible even now. Only the table definition is removed from the metastore, while the actual data is retained. The UNDROP command for an EXTERNAL table simply recreates the table...

SakthiGanesh
by New Contributor II
  • 195 Views
  • 2 replies
  • 0 kudos

Delta table partition folder names are getting changed

I am facing an issue where the date partition folders should be named in a format like "campaign_created_date=2024-01-17", but they are instead being written with what look like random folder names such as "ad" and "8B". Usually it will be like below; now it changed lik...

Latest Reply
Krishnamatta
Contributor
  • 0 kudos

Hi Satish, this is due to column mapping being enabled on the table. From the Databricks docs: when you enable column mapping for a Delta table, random prefixes replace column names in partition directories for Hive-style partitioning. See Rename and drop colu...

1 More Replies
elgeo
by Valued Contributor II
  • 37454 Views
  • 12 replies
  • 6 kudos

SQL Stored Procedure in Databricks

Hello. Is there an equivalent of a SQL stored procedure in Databricks? Please note that I need a procedure that allows DML statements, not only the SELECT statements that a function provides. Thank you in advance.

Latest Reply
sridharplv
Valued Contributor
  • 6 kudos

It's working for me without any issues if we create a cluster with DBR 17.0: https://docs.databricks.com/aws/en/sql/language-manual/sql-ref-syntax-ddl-create-procedure

11 More Replies
Ganeshch
by New Contributor II
  • 466 Views
  • 6 replies
  • 0 kudos

No option to create cluster

I don't see any option to create a cluster inside Compute. How do I create a cluster? Please help me.

Latest Reply
nayan_wylde
Valued Contributor III
  • 0 kudos

Yes, if you are using the legacy Community Edition you are able to create clusters, but the Free Edition is limited to serverless compute.

5 More Replies
AkhileshVB
by New Contributor
  • 580 Views
  • 0 replies
  • 0 kudos

Syncing lakebase table to delta table

I have been exploring Lakebase and I wanted to know if there is a way to sync CDC data from Lakebase tables to Delta tables in the Lakehouse. I know the other direction is possible, and that's what was shown in the demo. Can you tell me how I can sync both the ta...

Dewlap
by New Contributor II
  • 245 Views
  • 1 reply
  • 1 kudos

How to handle exploded records with overwrite-by-key logic in Delta Live Tables

I'm using Delta Live Tables (DLT) with the apply_changes API to manage SCD Type 1 on a source table. However, I've run into a limitation. Context: after apply_changes, I have a derived view that flattens and explodes a JSON array field in the source d...

Latest Reply
Brahmareddy
Honored Contributor III
  • 1 kudos

Hi Dewlap, how are you doing today? As per my understanding, you're right to notice that apply_changes in DLT works best for one-row-per-key updates and doesn't fit well when you need to replace multiple rows for the same key, especially after explodi...

Sreejuv
by New Contributor
  • 354 Views
  • 1 reply
  • 0 kudos

Lakebridge code conversion

I'm currently working on a proof of concept to convert Oracle & Synapse procedures into Databricks SQL, and none of them are getting converted. I followed the steps mentioned in the documentation. I wanted to check whether anyone has been able to successfully convert and exe...

Latest Reply
lingareddy_Alva
Honored Contributor II
  • 0 kudos

Hi @Sreejuv, you're encountering a very common challenge. Oracle and Synapse to Databricks SQL procedure conversion is notoriously difficult, and many organizations struggle with this. Common issues with automated conversion: why procedures often fail to...

AdamIH123
by New Contributor II
  • 467 Views
  • 1 reply
  • 0 kudos

Resolved! Agg items in a map

What is the best way to aggregate a map across rows? In the example below, the aggregated results would be red: 4, green: 7, blue: 10. This can be achieved using explode; I'm wondering if there is a better way. %sql with cte as ( select 1 as id, map('red', 1, 'green...

Latest Reply
SP_6721
Contributor III
  • 0 kudos

Hi @AdamIH123, the explode-based approach is widely used and remains the most reliable and readable method. But if you're looking for an alternative without using explode, you can try the REDUCE + MAP_FILTER approach. It lets you aggregate maps across...
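As a plain-Python sketch of the semantics being discussed (the per-row map values are hypothetical, chosen only to reproduce the totals of red: 4, green: 7, blue: 10 stated in the question):

```python
from collections import Counter

# Two rows, each carrying a map<string, int> column; values are
# hypothetical, picked to match the totals stated in the question.
rows = [
    {"red": 1, "green": 2, "blue": 3},
    {"red": 3, "green": 5, "blue": 7},
]

# Summing the maps key-by-key is exactly what the SQL explode +
# GROUP BY (or a REDUCE over collected maps) computes.
totals = Counter()
for row in rows:
    totals.update(row)

print(dict(totals))  # {'red': 4, 'green': 7, 'blue': 10}
```

Whichever SQL variant you choose, it should agree with this key-wise sum.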

seefoods
by Contributor
  • 1296 Views
  • 1 reply
  • 0 kudos

asset bundle

Hello guys, I built a custom asset bundle config, but I have an issue when I create several subdirectories inside the resources directory. After running the command databricks bundle summary, the Databricks libraries mention that resources its...

  • 1296 Views
  • 1 replies
  • 0 kudos
Latest Reply
Renu_
Contributor III
  • 0 kudos

Hi @seefoods, Databricks asset bundles don't automatically detect resources in subdirectories unless they're explicitly listed or a recursive pattern is used in the config. To resolve this, you can update the include section with a pattern like resour...
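A minimal sketch of that fix, assuming a standard databricks.yml at the bundle root (the bundle name and glob pattern are illustrative, not taken from the thread):

```yaml
# databricks.yml (bundle root) -- pick up resource files in nested
# subdirectories with a recursive glob instead of listing each folder.
bundle:
  name: my_bundle   # hypothetical bundle name

include:
  - resources/**/*.yml
```

After updating the include patterns, running databricks bundle summary again should list the resources found in the subdirectories.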

jeremy98
by Honored Contributor
  • 253 Views
  • 3 replies
  • 1 kudos

How to reference a workflow to use multiple GIT sources?

Hi community, is it possible for a workflow to reference multiple Git sources? Specifically, can different tasks within the same workflow point to different Git repositories or types of Git sources? Thanks.

Latest Reply
mai_luca
New Contributor III
  • 1 kudos

A workflow can reference multiple Git sources: you can specify the Git information for each task. However, I am not sure you can have multiple Git providers for the same workspace...

2 More Replies
frosti_pro
by New Contributor II
  • 581 Views
  • 3 replies
  • 1 kudos

UC external tables to managed tables

Dear community, I would like to know if there is any procedure and/or recommendation for safely and efficiently migrating UC external tables to managed tables (in a production context with a high volume of data). Thank you for your support!

Latest Reply
ElizabethB
New Contributor II
  • 1 kudos

Please check out our new docs page! This has some information which may help you, including information about our new SET MANAGED command. We are also looking to make this process smoother over time, so if you have any feedback, please let us know. h...

2 More Replies
suk
by New Contributor II
  • 277 Views
  • 1 reply
  • 0 kudos

Databricks pipeline script is not creating the schema before table creation

Hello, we are facing an issue while executing the Databricks pipeline: it takes all scripts in random sequence, and if no schema was created before the job that creates a table is scheduled, it will fail. As an alternative we are executing the schema pipeline fi...

Latest Reply
lingareddy_Alva
Honored Contributor II
  • 0 kudos

Hi @suk, this is a common issue with Databricks pipelines where dependencies aren't properly managed, causing scripts to execute in random order. Use Databricks Workflows with task dependencies and configure explicit task dependencies:
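A hedged sketch of what that could look like as a Databricks Asset Bundles job definition (the job name, task keys, and notebook paths are hypothetical):

```yaml
# Explicit task dependencies force the schema step to finish before
# any table-creation step runs, instead of relying on script order.
resources:
  jobs:
    build_warehouse:            # hypothetical job name
      name: build_warehouse
      tasks:
        - task_key: create_schemas
          notebook_task:
            notebook_path: ../notebooks/create_schemas
        - task_key: create_tables
          depends_on:
            - task_key: create_schemas   # waits for the schema step
          notebook_task:
            notebook_path: ../notebooks/create_tables
```

The same depends_on relationship can also be set from the Workflows UI when editing each task.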

Ganeshch
by New Contributor II
  • 491 Views
  • 3 replies
  • 0 kudos

No option to create cluster

I don't see any option to create a cluster inside Compute in Community Edition. Is it disabled? How do I create a cluster? Please help me.

Latest Reply
Ganeshch
New Contributor II
  • 0 kudos

If I create a notebook and run it, a cluster will not be explicitly created, but one will work in the backend. Am I right?

2 More Replies
