Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am able to use the expectations feature in Delta Live Tables by creating the expectations as below:
checks = {}
checks["validate circuitId col for null values"] = "(circuitId IS NOT NULL)"
checks["validate name col for null values"] = "(name IS not ...
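The dict-of-checks pattern in the post above can be sketched in plain Python. Note that the `dlt` module only imports inside a running Delta Live Tables pipeline, so the runnable part here is just the dictionary; the decorator lines are shown as comments for context.

```python
# Sketch: building a dict of named expectations for Delta Live Tables.
# Inside a real pipeline you would apply it with @dlt.expect_all(checks);
# here we only build and inspect the dict, since `dlt` is importable only
# inside a DLT pipeline run.

checks = {}
checks["validate circuitId col for null values"] = "(circuitId IS NOT NULL)"
checks["validate name col for null values"] = "(name IS NOT NULL)"

# @dlt.expect_all(checks)          # record violations, keep the rows
# @dlt.expect_all_or_drop(checks)  # drop rows failing any expectation
# @dlt.expect_all_or_fail(checks)  # fail the update on any violation

for name, constraint in checks.items():
    print(f"{name}: {constraint}")
```

The dict keys become the expectation names shown in the pipeline event log, which is why descriptive names like the ones above are worth the typing.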
Hi All, We are ingesting 1,000 JSON files of varying sizes per minute. The DLT pipeline is in continuous mode, in a Unity Catalog-enabled workspace. We are using Auto Loader's default setting (directory listing), and the Silver layer has CDC as well. We aim to ...
Hi Team, We have a job that completes in 3 minutes on one Databricks cluster, but if we run the same job on another Databricks cluster it takes 3 hours to complete. I am quite new to Databricks and need your guidance on how to find out where Databricks s...
We are using Unity Catalog. Is there a way to set up relationships between Unity Catalog tables, such as key-column relationships (one-to-many, many-to-one)? Can we also generate ER diagrams if we are able to set up these relationships?
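Unity Catalog does support declaring primary-key and foreign-key constraints, though they are informational only (not enforced). The sketch below is plain Python assembling the DDL statements; the catalog/schema/table names (`main.sales.*`) are hypothetical, and in a notebook you would run each statement with `spark.sql(...)`.

```python
# Sketch: Unity Catalog supports *informational* PK/FK constraints
# (declared for tooling and lineage, but not enforced at write time).
# Table names below are hypothetical. Note the PK columns must be NOT NULL.

def pk_ddl(table, cols):
    """Build an ALTER TABLE statement adding an informational primary key."""
    name = table.split(".")[-1]
    return (f"ALTER TABLE {table} ADD CONSTRAINT {name}_pk "
            f"PRIMARY KEY ({', '.join(cols)})")

def fk_ddl(table, cols, ref_table, ref_cols):
    """Build an ALTER TABLE statement adding an informational foreign key."""
    name = table.split(".")[-1]
    return (f"ALTER TABLE {table} ADD CONSTRAINT {name}_fk "
            f"FOREIGN KEY ({', '.join(cols)}) "
            f"REFERENCES {ref_table} ({', '.join(ref_cols)})")

print(pk_ddl("main.sales.customers", ["customer_id"]))
print(fk_ddl("main.sales.orders", ["customer_id"],
             "main.sales.customers", ["customer_id"]))
```

Since the constraints are informational, they model one-to-many relationships for catalog tooling, but duplicate or orphaned keys will not be rejected on write.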
Hi, I am using DLT with Auto Loader. The DLT pipeline is running in continuous mode, and Auto Loader is in directory-listing mode (the default). Question: I want to move files that have been processed by DLT to another folder (archive) and am planning to have another no...
Hi, when using the MERGE statement, if the merge key is not unique in both the source and the target, it throws an error. If the merge key is unique in the source but not unique in the target, should WHEN MATCHED THEN DELETE/UPDATE work or not? For example, the merge key is id....
Cool, this is what I tested out; great to get it confirmed. Thanks. BTW, https://medium.com/@ritik20023/delta-lake-upserting-without-primary-key-f4a931576b0 has a workaround that can fix the merge with a duplicate merge key in both source and target.
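To make the matching rule from this thread concrete, here is a minimal pure-Python simulation (not actual Spark/Delta code) of the MERGE matching semantics: the error is raised only when one target row is matched by more than one source row, so a source-unique key with target-side duplicates works fine.

```python
# Pure-Python sketch of Delta MERGE matching semantics (not actual Spark).
# Delta raises an error only when one TARGET row is matched by multiple
# SOURCE rows (the update/delete would be ambiguous). Duplicates on the
# target side are fine: each duplicate target row matches the one source row.

def merge_matches(source, target, key):
    """Return {target_index: matching source rows}, erroring on ambiguity."""
    matches = {}
    for t_idx, t_row in enumerate(target):
        hits = [s for s in source if s[key] == t_row[key]]
        if len(hits) > 1:
            raise ValueError(
                "MERGE error: multiple source rows match one target row")
        if hits:
            matches[t_idx] = hits
    return matches

source = [{"id": 1, "v": "new"}]                      # key unique in source
target = [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}]   # key duplicated in target

# Both target rows match the single source row, so WHEN MATCHED
# UPDATE/DELETE applies to both and no error is raised.
print(merge_matches(source, target, "id"))
```

Flipping the example so the source holds the duplicates is what triggers the ambiguous-match error in Delta.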
I have a SQL query that generates a table. I created a visualization from that table with the UI. I then have a widget that updates a value used in the query and re-runs the SQL, but then the visualization shows nothing; it reports "1 row," but if...
I want to install my own Python wheel package on a cluster but can't get it working. I tried two ways. I followed these steps: https://docs.databricks.com/en/workflows/jobs/how-to/use-python-wheels-in-workflows.html#:~:text=March%2025%2C%202024,code%...
@397973 - Once you uploaded the .whl file, did you have a chance to list the file manually in the notebook?
Also, did you have a chance to move the .whl file to /Volumes?
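Along the lines of the listing suggestion above, a quick sanity check is to confirm the .whl is actually visible from the driver before attempting `%pip install`. This is a sketch; the `/Volumes` path is hypothetical, so substitute your own catalog/schema/volume.

```python
# Sketch: verify a wheel file is visible from the driver before installing.
# The /Volumes path below is hypothetical; substitute your own location.
import os

def wheel_visible(path):
    """True if the path exists on this node and looks like a wheel file."""
    return path.endswith(".whl") and os.path.isfile(path)

whl = "/Volumes/main/default/libs/mypkg-0.1.0-py3-none-any.whl"  # hypothetical
if wheel_visible(whl):
    print(f"found {whl}; now run: %pip install {whl}")
else:
    print(f"not found: {whl} -- check the upload location first")
```

If the file is not visible here, the install failure is a path problem rather than a packaging problem, which narrows the debugging considerably.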
Hi, I have a Delta Live Tables workflow with storage enabled for cloud storage to a blob store. Syntax of the bronze table in the notebook:
@dlt.table(
  spark_conf = {"spark.databricks.delta.schema.autoMerge.enabled": "true"},
  table_properties = {"quality": "bron...
Hi Kaniz, thanks for replying. I am using Python for Delta Live Table creation, so how can I set these configurations? "When creating the table, add the IF NOT EXISTS clause to tolerate pre-existing objects. Consider using the OR REFRESH clause." Answe...
I am facing an issue when using Databricks: when I set a specific type in my schema and read a JSON file, its values are fine, but after saving my DataFrame and loading it again, the value is gone. I have this sample code that shows the issue: from pyspark.sql.typ...
I have a requirement to read and parse JSON files using Auto Loader, where each incoming JSON file has multiple sub-entities. Each sub-entity needs to go into its own Delta table. Alternatively, we can write each entity's data to individual files. We can use D...
I think using DLT's medallion architecture should help in this scenario: write all the incoming data to one bronze table and one silver table, then define multiple gold tables based on the value of the sub-entities.
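The fan-out step itself is simple. As a plain-Python sketch (the entity names here are hypothetical, and no Spark is involved), splitting one stream of records into per-entity collections looks like:

```python
# Pure-Python sketch of the bronze -> per-entity fan-out (no Spark here).
# In DLT each bucket would instead be its own gold table defined with a
# filter on the entity column, e.g. df.filter(col("entity") == name).
from collections import defaultdict

def split_by_entity(records, entity_field="entity"):
    """Group records into one list per distinct entity value."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec[entity_field]].append(rec)
    return dict(buckets)

bronze = [
    {"entity": "customer", "id": 1},
    {"entity": "order", "id": 10},
    {"entity": "customer", "id": 2},
]
gold = split_by_entity(bronze)
print(sorted(gold))  # → ['customer', 'order']
```

Keeping one bronze table and filtering downstream means a new sub-entity only requires adding one more gold definition, not changing the ingestion path.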
Hi, I'm trying to run VACUUM on a Delta table within Unity Catalog. The default retention is 7 days. Though I vacuum the table, I'm still able to see history beyond 7 days. I tried restarting the cluster, but it still doesn't work. What would be the fix? ...
No, that's wrong. VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold. VACUUM - Azu...
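This also explains the original symptom: DESCRIBE HISTORY reads the transaction log, whose retention is controlled by table properties, while VACUUM only removes data files. A sketch of the relevant statements (the table name is hypothetical; run each with `spark.sql(...)` in a notebook):

```python
# Sketch: VACUUM removes unreferenced DATA files older than the retention
# threshold; the history shown by DESCRIBE HISTORY lives in the transaction
# log and is governed by table properties instead. Table name hypothetical.

table = "main.default.events"  # hypothetical

statements = [
    # Remove stale data files older than 7 days (168 hours):
    f"VACUUM {table} RETAIN 168 HOURS",
    # History / time-travel windows are set via table properties:
    f"ALTER TABLE {table} SET TBLPROPERTIES ("
    "'delta.logRetentionDuration' = 'interval 7 days', "
    "'delta.deletedFileRetentionDuration' = 'interval 7 days')",
]
for s in statements:
    print(s)
```

So seeing history entries older than 7 days after a VACUUM is expected behavior, not a failed VACUUM.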
How can I fix the IP address of my Azure cluster so that I can whitelist it to run my job daily from my Python notebook? Or can I find out the IP address to perform the whitelisting? Thanks
Hi Everyone, I am new to DLT and am trying to run the code below to create tables dynamically, but I get the error "AttributeError: module 'dlt' has no attribute 'table'". Code snippet:
def generate_tables(model_name):
    try:
        spark.sql("select * from dlt.{0}"....
Thank you, @DE_K. I see your point. I believe you should use @dlt.table instead of @dlt.create_table to begin with, since you want the table to be created rather than defining an existing one. (https://community.databricks.com/t5/data-engineering/differenc...