Data Engineering

Forum Posts

by Valentin1 • New Contributor III

04-02-2023 2:30:24 AM

3813 Views
5 replies
2 kudos

Delta Live Tables Incremental Batch Loads & Failure Recovery

Hello Databricks community, I'm working on a pipeline and would like to implement a common use case using Delta Live Tables. The pipeline should include the following steps: Incrementally load data from Table A as a batch. If the pipeline has previously...

Latest Reply

Anonymous
Not applicable

04-04-2023 11:43:05 PM

2 kudos

Hi @Valentin Rosca, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Tha...

4 More Replies

by adrianlwn • New Contributor III

10-06-2022 10:35:20 AM

6787 Views
18 replies
17 kudos

How to activate ignoreChanges in Delta Live Table read_stream?

Hello everyone, I'm using DLT (Delta Live Tables) and I've implemented some Change Data Capture for deduplication purposes. Now I am creating a downstream table that will read the DLT as a stream (dlt.read_stream("<tablename>")). I keep receiving thi...

Latest Reply

gopínath
New Contributor II

02-27-2023 7:03:18 PM

17 kudos

In DLT read_stream, we can't use ignoreChanges / ignoreDeletes. These configs help avoid the failures, but they actually ignore the operations done on the upstream table. So you need to manually perform the deletes or updates in the downstrea...

17 More Replies

by Colter • New Contributor II

04-14-2023 9:18:44 AM

995 Views
3 replies
0 kudos

Is there a way to use cluster policies within jobs api to define cluster configuration rather than in the jobs api itself?

I want to create a cluster policy that is referenced by most of our repos/jobs so we have one place to update whenever there is a spark version change or when we need to add additional spark configurations. I figured cluster policies might be a good ...
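One way to picture what the question is after: a job definition that carries only a policy reference, with the Spark version and Spark configs supplied by the policy. The sketch below just builds such a payload as a Python dict. It assumes the `policy_id` and `apply_policy_default_values` fields of `new_cluster` in the Jobs API behave as described in the Databricks REST documentation; the job name, policy ID, and notebook path are made-up placeholders, and whether a policy alone can fill every required cluster field depends on how that policy is written.

```python
import json


def build_job_payload(job_name, policy_id, notebook_path):
    """Build a create-job payload whose cluster settings defer to a cluster policy.

    Assumption: the policy identified by policy_id fixes or defaults
    spark_version, node type, and any shared Spark configs, so they are
    deliberately left out of the job definition itself.
    """
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    # Reference the shared policy instead of repeating its values.
                    "policy_id": policy_id,
                    # Ask the service to fill omitted fields from the policy.
                    "apply_policy_default_values": True,
                    "num_workers": 2,
                },
            }
        ],
    }


# Hypothetical values, for illustration only.
payload = build_job_payload("nightly-load", "ABC123DEF456", "/Repos/team/project/main")
print(json.dumps(payload, indent=2))
```

With this shape, a Spark version bump would be a change to the policy rather than to each repo's job definition, which is the single point of update the post asks about.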

Latest Reply

Anonymous
Not applicable

04-17-2023 2:31:09 AM

0 kudos

Hi @Colter Nattrass, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answe...

2 More Replies

by tototox • New Contributor III

04-16-2023 12:46:05 AM

1323 Views
3 replies
2 kudos

dbutils.fs.ls overlaps with managed storage error

I created a schema with this path as its managed location (abfss://~~@~~.dfs.core.windows.net/dejeong/). However, I then dropped the schema with the cascade option, deleted the path directly in the Azure portal, and created it again (abfss://~~@~~....

Latest Reply

Anonymous
Not applicable

04-18-2023 1:18:58 AM

2 kudos

Hi @jin park, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your...

2 More Replies

by Dean_Lovelace • New Contributor III

04-17-2023 7:14:04 AM

1243 Views
3 replies
4 kudos

What is the PySpark equivalent of FSCK REPAIR TABLE?

I am using the Delta format and occasionally get the following error: "xx.parquet referenced in the transaction log cannot be found. This occurs when data has been manually deleted from the file system rather than using the table `DELETE` statement" FS...

Latest Reply

shan_chandra
Honored Contributor III

04-19-2023 7:40:38 AM

4 kudos

## Delta check when a file was added %scala (oldest-version-available to newest-version-available).map { version => var df = spark.read.json(f"<delta-table-location>/_delta_log/$version%020d.json").where("add is not null").select("add.path") var ...
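The Scala snippet above is cut off, so what follows is not a reconstruction of it, only a small plain-Python sketch of the same idea: Delta commit files under `_delta_log` are named with the version zero-padded to 20 digits, and each line of a commit file is a JSON action, so the `add` actions tell you which data files a given version introduced. The sample commit content and file names here are invented for illustration; in practice you would read the real commit file from the table location.

```python
import json
from typing import List


def commit_file_name(version: int) -> str:
    """Return the _delta_log commit file name for a table version (20-digit, zero-padded)."""
    return f"{version:020d}.json"


def added_paths(commit_text: str) -> List[str]:
    """Collect the data file paths from the 'add' actions in one commit file's text."""
    paths = []
    for line in commit_text.splitlines():
        line = line.strip()
        if not line:
            continue
        action = json.loads(line)  # each line is one JSON action
        add = action.get("add")
        if add is not None:
            paths.append(add["path"])
    return paths


# Invented sample commit: one commitInfo, one add, one remove action.
sample_commit = "\n".join([
    json.dumps({"commitInfo": {"operation": "WRITE"}}),
    json.dumps({"add": {"path": "part-00000-example.snappy.parquet", "size": 1024, "dataChange": True}}),
    json.dumps({"remove": {"path": "part-00001-example.snappy.parquet", "dataChange": True}}),
])

print(commit_file_name(7))
print(added_paths(sample_commit))
```

Scanning the versions in order and noting where a missing file's path first shows up in an `add` action should indicate when that file was written, which is the check the reply's heading describes.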

2 More Replies

by Dean_Lovelace • New Contributor III

04-17-2023 12:55:08 AM

2054 Views
3 replies
0 kudos

Delta Table Optimize Error

I have started getting an error message when running the following optimize command: deltaTable.optimize().executeCompaction() Error: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: Number of records changed after Optimi...

Latest Reply

Anonymous
Not applicable

04-18-2023 2:06:28 AM

0 kudos

@Dean Lovelace: The error message suggests that the number of records in the Delta table changed after the optimize() command was run. The optimize() command is used to improve the performance of Delta tables by removing small files and compacting l...

2 More Replies

by haraldh • New Contributor II

12-09-2022 8:00:29 AM

864 Views
1 replies
2 kudos

Databricks JDBC driver connection pooling support

When using Camel JDBC with the Databricks JDBC driver I get an error: Caused by: java.sql.SQLFeatureNotSupportedException: [Databricks][JDBC](10220) Driver does not support this optional feature. Is there any means to work around this limitation?

Latest Reply

swethaNandan
New Contributor III

04-19-2023 7:06:11 AM

2 kudos

Tools like SDI can connect to a generic JDBC source such as Databricks SQL Warehouse via the SDI Camel JDBC adapter. Can you see if these will help you? https://help.sap.com/docs/HANA_SMART_DATA_INTEGRATION/7952ef28a6914997abc01745fef1b607/1247c9518...

by System1999 • New Contributor III

04-10-2023 10:46:30 AM

2503 Views
7 replies
0 kudos

My 'Data' menu item shows 'No Options' for Databases. How can I fix?

Hi, I'm new to Databricks and I've signed up for the Community Edition. First, I've noticed that I cannot return to a previously created cluster, as I get a message telling me that restarting a cluster is not available to me. Ok, inconvenient, but I...

Latest Reply

System1999
New Contributor III

04-19-2023 6:42:01 AM

0 kudos

Hi @Suteja Kanuri, I get the error message under Data before I've created a cluster. Then I still get it when I've created a cluster and a notebook (having attached the notebook to the cluster). Thanks.

6 More Replies

by Student185 • New Contributor III

11-23-2021 7:12:03 AM

4141 Views
7 replies
5 kudos

Resolved! Is that long-term free version for students still available now?

Dear sir/madam, I've tried lots of methods to access the long-term free version of Databricks (the community version for students). Also, I followed the instructions - Introduction to Databricks - in Coursera step by step: https://www.coursera.org/l...

Latest Reply

shreeves
New Contributor II

04-19-2023 6:41:13 AM

5 kudos

Look for the "Community Edition" in small print below the button

6 More Replies

by mangel • New Contributor III

05-10-2022 1:54:58 AM

3725 Views
6 replies
3 kudos

Resolved! Delta Live Tables error pivot

I'm facing an error in Delta Live Tables when I want to pivot a table. The error is the following: And the code to replicate the error is the following: import pandas as pd import pyspark.sql.functions as F pdf = pd.DataFrame({"A": ["foo", "foo", "f...

Latest Reply

Khalil
Contributor

04-19-2023 6:36:16 AM

3 kudos

The DLT documentation says that "pivot" is not supported in DLT, but I noticed that if you want the pivot function to work you have to do one of the following things: apply the pivot in your first dlt.view + the config "spark.databricks.d...

5 More Replies

by Anonymous • Not applicable

04-18-2023 1:08:12 AM

262 Views
1 replies
2 kudos

www.databricks.com

Dear Community - @Youssef Mrini will answer all your questions on April 19, 2023 from 9:00am to 10:00am GMT during the Databricks EMEA Office Hours. Make sure to join this amazing 'Ask Me Anything' session by Databricks - https://www.databricks.com/r...

Latest Reply

youssefmrini
Honored Contributor III

04-19-2023 5:50:36 AM

2 kudos

It was a successful office hours session. Make sure to join the next one.

by youssefmrini • Honored Contributor III

04-19-2023 5:48:43 AM

688 Views
1 replies
0 kudos

Resolved! How can I use Databricks Connect v2?

Latest Reply

youssefmrini
Honored Contributor III

04-19-2023 5:49:32 AM

0 kudos

Make sure to watch the following video: https://www.youtube.com/watch?v=DkzwFTC7WWs This section lists the requirements for Databricks Connect. Only Databricks Runtime 13.0 ML and Databricks Runtime 13.0 are supported. Only clusters that are compatible w...

by Hubert-Dudek • Esteemed Contributor III

04-19-2023 2:08:20 AM

624 Views
2 replies
8 kudos

Databricks has recently introduced a new SQL function allowing easy integration of LLMs (large language models) with Databricks. This exciting new feat...

Databricks has recently introduced a new SQL function allowing easy integration of LLMs (large language models) with Databricks. This exciting new feature simplifies calling LLMs, making them more accessible and user-friendly. To try it out, che...

Latest Reply

Vartika
Moderator

04-19-2023 3:49:29 AM

8 kudos

Hi @Hubert Dudek, I wanted to take a moment to express our gratitude for sharing your valuable insights and information with us. Thank you for taking the time to share your thoughts with us. We truly appreciate your contribution. You are awesome! Cheer...

1 More Replies

by JLSy • New Contributor III

04-16-2023 7:29:52 PM

7677 Views
5 replies
6 kudos

cannot convert Parquet type INT64 to Photon type string

I am receiving an error similar to the post in this link: https://community.databricks.com/s/question/0D58Y00009d8h4tSAA/cannot-convert-parquet-type-int64-to-photon-type-double However, instead of type double, the error message states that the type can...

Latest Reply

Anonymous
Not applicable

04-18-2023 1:53:12 AM

6 kudos

@John Laurence Sy: It sounds like you are encountering a schema conversion error when trying to read in a Parquet file that contains an INT64 column that cannot be converted to a string type. This error can occur when the Parquet file has a schema t...

4 More Replies

by Aakash_Bhandari • New Contributor III

02-20-2023 10:24:16 PM

3256 Views
6 replies
2 kudos

Resolved! Accessing a FastAPI endpoint using Personal Access Token (PAT)

Hello Community, I have a FastAPI endpoint on a cluster with address 0.0.0.0:8084/predict, and I want to send a request to this endpoint from a React app which is hosted locally on my computer. I have a personal access token for the workspace but don't ...

Latest Reply

Anonymous
Not applicable

03-31-2023 8:32:58 AM

2 kudos

@Aakash Bhandari: To send a request from a React App to a FastAPI endpoint on a Databricks cluster using a Personal Access Token (PAT), you can use the requests module in Python to make HTTP requests. Here's an example of how to use requests to send ...
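The reply's own example is truncated, so the sketch below is a separate illustration rather than a recovery of it. It shows only the general pattern of attaching a PAT as a Bearer token to a JSON POST, and it uses the standard library's `urllib.request` in place of the `requests` module so it runs without extra installs. The URL, token, and feature payload are placeholders; how the cluster's port 8084 is actually exposed to a machine outside the workspace is not covered here and would need to be confirmed separately.

```python
import json
import urllib.request


def build_predict_request(url, token, features):
    """Build (but do not send) a JSON POST request that carries a PAT as a Bearer token."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder URL and token, for illustration only.
request = build_predict_request(
    "https://example.invalid/predict",
    "dapi-EXAMPLE-TOKEN",
    {"x": 1.0},
)
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request), once the real URL is known.
```

A React app would set the same `Authorization: Bearer <token>` header in its own HTTP call; the Python version is only the shortest way to show the header layout.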

5 More Replies
