Machine Learning

by thushar • Contributor

06-14-2023 5:46:37 AM

11663 Views
4 replies
2 kudos

MetadataChangedException Exception in databricks

We read around 20 text files from ADLS, apply some transformations, and then write the results back to ADLS as a single Delta file (all operations run in parallel through a thread pool). Here, 20 threads are writing to a single f...

Latest Reply

jkb7
New Contributor III

02-20-2025 3:12:17 AM

2 kudos

How can we import the exception "MetadataChangedException"? Or does Databricks recommend catching a generic Exception and parsing the message string?
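
One possible way to handle both options from the question is sketched below; it is not an official Databricks recommendation. It assumes the open-source delta-spark Python package exposes MetadataChangedException in delta.exceptions, and falls back to matching the class name in the error text when that import is unavailable. The names is_metadata_changed and write_with_retry are hypothetical helpers invented for this sketch.

```python
import time

try:
    # Assumption: delta-spark ships this class in delta.exceptions.
    from delta.exceptions import MetadataChangedException
except ImportError:
    # Fall back to string matching when the package is not importable.
    MetadataChangedException = None


def is_metadata_changed(exc):
    """Return True if exc looks like a Delta metadata-change conflict."""
    if MetadataChangedException is not None and isinstance(exc, MetadataChangedException):
        return True
    # Fallback: look for the class name in the exception type or message.
    return "MetadataChangedException" in f"{type(exc).__name__}: {exc}"


def write_with_retry(write_fn, max_attempts=5, base_delay_s=1.0):
    """Call write_fn, retrying only on metadata-change conflicts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()
        except Exception as exc:
            # Re-raise unrelated errors, and give up after the last attempt.
            if not is_metadata_changed(exc) or attempt == max_attempts:
                raise
            # Linear backoff so concurrent writers do not collide again at once.
            time.sleep(base_delay_s * attempt)
```

Each worker thread would wrap its own write call, for example write_with_retry(lambda: df.write.format("delta").mode("append").save(path)), where df and path are placeholders.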

by aladda • Databricks Employee

05-14-2021 12:07:59 PM

5006 Views
2 replies
1 kudos

Resolved! How do I use the Copy Into command to copy data into a Delta Table? Looking for examples where you want to have a pre-defined schema

I've reviewed the COPY INTO docs here - https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-copy-into.html#examples but there's only one simple example. Looking for some additional examples that show loading data from CSV - with ...

Latest Reply

aladda
Databricks Employee

06-21-2021 1:32:35 PM

1 kudos

Here's an example for a predefined schema. Using COPY INTO with a predefined table schema: the trick here is to CAST the CSV dataset into your desired schema in the SELECT statement of COPY INTO. Example below: %sql CREATE OR REPLACE TABLE copy_into_bronze_te...
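
Since the example in the reply is cut off, here is a separate, minimal sketch of the same idea: a small Python helper that builds a COPY INTO statement with one CAST per column. The helper name build_copy_into, the table name, the path, and the column list are all hypothetical, and the sketch assumes the CSV has a header row so columns can be referenced by name.

```python
def build_copy_into(table, source_path, schema, header=True):
    """Build a COPY INTO statement that casts each CSV column to a target type.

    schema is a list of (column_name, sql_type) pairs (illustrative only).
    """
    casts = ",\n    ".join(
        f"CAST({name} AS {dtype}) AS {name}" for name, dtype in schema
    )
    return (
        f"COPY INTO {table}\n"
        f"FROM (\n  SELECT\n    {casts}\n  FROM '{source_path}'\n)\n"
        f"FILEFORMAT = CSV\n"
        f"FORMAT_OPTIONS ('header' = '{str(header).lower()}')"
    )


# Hypothetical table, path, and columns, for illustration only.
statement = build_copy_into(
    "my_bronze_table",
    "/path/to/csv/files",
    [("id", "INT"), ("name", "STRING"), ("amount", "DOUBLE")],
)
print(statement)
```

The resulting string could then be passed to spark.sql(...) or pasted into a %sql cell, after the target table has been created with the matching schema.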

by brickster_2018 • Databricks Employee

06-25-2021 7:01:28 AM

3814 Views
2 replies
0 kudos

Resolved! How is Idempotency ensured for COPY INTO command

Latest Reply

N_M
Contributor

12-01-2023 7:27:10 AM

0 kudos

How does COPY INTO work with table restore? I ran some tests, and the restore method does NOT restore the key-store values of the target at the specific version, which means that the data that came after the chosen version cannot be inserted (unless ...

by SRK • Contributor III

12-08-2022 6:29:20 AM

12329 Views
6 replies
3 kudos

How to apply Primary Key constraint in Delta Live Table?

In this blog I can see that the primary key constraint has been applied to the dimension and fact tables. Following is the example: -- Store dimension CREATE OR REPLACE TABLE dim_store( store_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, business_key ...

Latest Reply

Oliver_Angelil
Valued Contributor II

10-24-2023 8:52:14 AM

3 kudos

@SRK Please see a copy of this answer on Stack Overflow here. You can use DLT Expectations to have this check (see my previous answer if you're using SQL and not Python): @dlt.table(name="table1",) def create_df(): schema = T.StructType([T.StructField("i...

by alesventus • Contributor

06-20-2023 2:02:48 AM

3613 Views
1 reply
2 kudos

Pyspark Merge parquet and delta file

Is it possible to use the merge command when the source file is Parquet and the destination file is Delta? Or must both files be Delta files? Currently, I'm using this code, which transforms Parquet into Delta, and it works. But I want to avoid this transformation. T...

Latest Reply

Anonymous
Not applicable

06-20-2023 8:16:32 PM

2 kudos

Hi @Ales ventus, we haven't heard from you since the last response from @Kaniz Fatma, and I was checking back to see if her suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to others...

by vittal • New Contributor

01-24-2023 10:35:44 PM

1747 Views
1 reply
0 kudos

Getting errors in DLT Pipeline while using ML Model

I am getting the following error when I try to run ML models in a Delta Live Tables pipeline: File "/local_disk0/.ephemeral_nfs/repl_tmp_data/ReplId-55c61-9b898-2c4b6-d/mlflow/envs/virtualenv_envs/mlflow-888f8c9b966409e6bddca3894244b4df9d1f94c1/lib/pyth...

Latest Reply

shan_chandra
Databricks Employee

04-27-2023 9:17:21 AM

0 kudos

@Vittal Pai - In general, please follow the steps below for the mlflow CLI error. Step 1: set up an API token and create secrets as described in the document below: https://docs.databricks.com/machine-learning/manage-model-lifecycle/multiple-workspaces.h...

by lurban • New Contributor II

01-25-2023 9:56:15 AM

1993 Views
1 reply
0 kudos

CloudFilesIllegalStateException: Found mismatched event: key old_file_path doesn't have the prefix: new_file_path

My team currently uses Auto Loader and Delta Live Tables to process incremental data from ADLS storage. We need to keep the same table and history, but switch the file path to a different location in storage. When I test a file path change, I rec...

Latest Reply

DD_Sharma
New Contributor III

04-14-2023 12:15:03 AM

0 kudos

Auto Loader doesn't support changing the source path of a running job, so if you change the source path, your stream fails. However, if you really want to change the path, you can do so by using the new checkpoint ...

by Anonymous • Not applicable

04-09-2023 7:11:39 PM

1782 Views
2 replies
3 kudos

www.databricks.com

Hello Dolly: Democratizing the magic of ChatGPT with open models. Databricks has just released a groundbreaking new blog post exploring ChatGPT, an open-source language model with the potential to transform the way we interact with technology. From cha...

Latest Reply

Anonymous
Not applicable

04-10-2023 5:50:07 AM

3 kudos

Let's get candid! Let me know your initial thoughts about LLMs, ChatGPT, and Dolly.

by Aviral-Bhardwaj • Esteemed Contributor III

12-23-2022 8:55:16 PM

10835 Views
2 replies
36 kudos

Delta Lake vs. Data Lake in Databricks

Delta Lake vs. Data Lake in Databricks. Delta Lake is an open-source storage layer that sits on top of existing data lake storage, such as Azure Data Lake Store or Amazon S3. It provides a more robust and scalable alternative to traditional data lake st...

Latest Reply

Meghala
Valued Contributor II

12-26-2022 2:25:31 AM

36 kudos

This is very informative and I learned a lot from it, so thank you, @Aviral Bhardwaj sir.

by elgeo • Valued Contributor II

11-28-2022 5:26:02 AM

5874 Views
1 reply
4 kudos

Resolved! Insert into delta table fails

Hello experts. We are trying to execute an insert command with fewer columns than the target table: Insert into table_name( col1, col2, col10) Select col1, col2, col10 from table_name2 However, the above fails with: Error in SQL statement: DeltaAnalysisExce...

Latest Reply

UmaMahesh1
Honored Contributor III

11-29-2022 10:51:19 AM

4 kudos

Hi @ELENI GEORGOUSI Yes. When you do an insert, the schema you provide should match the target schema; otherwise it throws an error. But you can still insert the data using another approach. Create a dataframe with your data having less colu...
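
The rest of this reply is cut off, so the following is only one possible workaround and not necessarily the one described above. The idea is to make the SELECT list match the full target schema by emitting NULL for every column that is not supplied. The helper build_insert_with_nulls is hypothetical, and the sketch assumes the missing target columns are nullable and that the engine casts an untyped NULL to each column's type on insert.

```python
def build_insert_with_nulls(target_table, target_columns, source_table, provided_columns):
    """Build an INSERT whose SELECT list covers every target column, in order.

    Columns not present in provided_columns are filled with NULL.
    """
    unknown = [c for c in provided_columns if c not in target_columns]
    if unknown:
        raise ValueError(f"columns not in target table: {unknown}")
    select_list = ", ".join(
        c if c in provided_columns else f"NULL AS {c}" for c in target_columns
    )
    return f"INSERT INTO {target_table} SELECT {select_list} FROM {source_table}"


# Illustrative column names; the real target table has more columns.
sql = build_insert_with_nulls(
    "table_name",
    ["col1", "col2", "col3", "col10"],
    "table_name2",
    ["col1", "col2", "col10"],
)
print(sql)
```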

by MA • New Contributor II

10-20-2022 2:48:14 PM

2126 Views
1 reply
4 kudos

Stream data from Delta tables replicated with Fivetran into DLT

I'm attempting to stream data into a DLT pipeline. The data is replicated from Fivetran directly into Delta tables in a different database from the one the DLT pipeline uses. This is not an aggregate, and I don't want to recompute the entire data model eac...

Latest Reply

Anonymous
Not applicable

11-27-2022 5:49:58 AM

4 kudos

Hi @M A, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Otherwise, Bricksters will get back to you soon. Thanks

by Shuvi • New Contributor III

09-13-2022 2:25:49 AM

3631 Views
3 replies
5 kudos

Resolved! What is the use case for having both Azure Synapse (DWH) and Delta Lake (Gold), given that we can connect BI to Delta directly?

The curated zone is pushed to a cloud data warehouse such as Synapse Dedicated SQL Pools, which then acts as a serving layer for BI tools and analysts. I believe we can have models in the gold layer and have BI connect to this layer, or we can have serverless ...

Latest Reply

Shuvi
New Contributor III

09-13-2022 2:41:21 AM

5 kudos

Thank you. So for a large workload that needs a lot of optimization, we might need Synapse, but for a small or medium workload, we could stick with Delta tables.

by vaver_3 • New Contributor III

08-05-2022 12:45:07 PM

17424 Views
1 reply
5 kudos

Resolved! ingest a .csv file with spaces in column names using Delta Live into a streaming table

How do I ingest a .csv file with spaces in column names using Delta Live into a streaming table? All of the fields should be read using the default behavior for .csv files in DLT Auto Loader, that is, as strings. Running the pipeline gives me an error about in...

Latest Reply

vaver_3
New Contributor III

08-11-2022 5:30:07 AM

5 kudos

After additional googling on "withColumnRenamed", I was able to replace all spaces in column names with "_" all at once by using select and alias instead: @dlt.view( comment="" ) def vw_raw(): return ( spark.readStream.format("cloudF...
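
The code in this reply is truncated, so here is an independent, minimal sketch of the renaming step it describes. The helper names sanitize_column_name and rename_plan are hypothetical; the PySpark usage is shown only in comments because it needs a live DataFrame.

```python
def sanitize_column_name(name):
    """Trim surrounding whitespace and replace inner spaces with underscores."""
    return name.strip().replace(" ", "_")


def rename_plan(columns):
    """Return (old_name, new_name) pairs for every column, in order."""
    return [(c, sanitize_column_name(c)) for c in columns]


# Sample header names, made up for illustration.
plan = rename_plan(["Order ID", "Customer Name", "total"])
print(plan)

# With a PySpark DataFrame df, the plan could be applied in a single select
# (backticks quote the original names that contain spaces):
#   from pyspark.sql import functions as F
#   df = df.select([F.col(f"`{old}`").alias(new) for old, new in rename_plan(df.columns)])
```

Doing the rename in one select avoids chaining a separate withColumnRenamed call per column.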

by amits • New Contributor III

05-23-2022 8:48:56 AM

5626 Views
6 replies
4 kudos

Tableau extract creation frozen

Heya, I'm having an issue with extract creation from a Delta Lake table. Tableau is frozen on "Rows retrieved: X" for too long. I actually succeeded in creating the first extract but saw I was missing a column. I went ahead and did a full rewrite -even...

Latest Reply

Prabakar
Databricks Employee

05-24-2022 2:44:27 PM

4 kudos

@Amit Steiner What is the size of the table? Do you see any error, or does Tableau freeze without any error? I believe this to be more of a Tableau-related issue than a Databricks one. What version of Tableau are you using? What is the conne...

by MadelynM • Databricks Employee

10-01-2021 2:10:35 PM

2322 Views
1 reply
7 kudos

2021-07-Webinar--Hassle-Free-Data-Ingestion-Social-1200x628

Thanks to everyone who joined the Hassle-Free Data Ingestion webinar. You can access the on-demand recording here. We're sharing a subset of the phenomenal questions asked and answered throughout the session. You'll find Ingestion Q&A listed first, f...

Latest Reply

Emily_S
New Contributor III

11-09-2021 6:32:13 AM

7 kudos

Check out Part 2 of this Data Ingestion webinar to find out how to easily ingest semi-structured data at scale into your Delta Lake, including how to use Databricks Auto Loader to ingest JSON data into Delta Lake.

Databricks Community
