Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

standup1
by Contributor
  • 1666 Views
  • 7 replies
  • 3 kudos

Delta Live Table Path/Directory help

Hello, I am working on a DLT pipeline and I've been facing an issue. I hope someone here can help me find a solution. My files are JSON in Azure storage. These files are stored in a directory like this (blobName/FolderName/xx.csv). The folder name is li...

Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @standup1, I'm glad the example was helpful!

6 More Replies
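The thread above concerns pointing DLT at JSON files in nested Azure storage folders. A minimal sketch of how Auto Loader options and a folder glob are typically assembled — the container, storage account, and path below are hypothetical, and the cluster-only `dlt`/`spark` calls are shown as comments:

```python
# Sketch: Auto Loader options for reading nested folders in Azure storage.
# Container, storage account, and glob below are hypothetical placeholders.

def autoloader_options(fmt: str = "json") -> dict:
    """Options typically passed to spark.readStream.format('cloudFiles')."""
    return {
        "cloudFiles.format": fmt,
        # Infer column types and pick up new columns as they appear
        "cloudFiles.inferColumnTypes": "true",
        "cloudFiles.schemaEvolutionMode": "addNewColumns",
    }

# A glob matching every folder directly under the container root
input_path = "abfss://mycontainer@mystorage.dfs.core.windows.net/*/"

# Inside a DLT pipeline this would be used roughly as:
# import dlt
# @dlt.table
# def raw_files():
#     return (spark.readStream.format("cloudFiles")
#             .options(**autoloader_options())
#             .load(input_path))
```

With a glob like this, Auto Loader discovers files in every matching subfolder, so date- or entity-named folders do not need to be listed one by one.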
JR61276126
by New Contributor II
  • 1245 Views
  • 5 replies
  • 1 kudos

Data Engineering with Databricks 3.1.12 - Unable to run Classroom-Setup-01.2

Receiving the following error when attempting to run the classroom setup for lesson 1.2 of the Data Engineering with Databricks 3.1.12 course. This has been tested with multiple accounts, both admins and non-admins. Below is the error message I am receiving....

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @JR61276126, Since your workspace is deployed in Azure with VNet injection, I assume it might be a network/firewall-related issue. Could you also check your driver logs?

4 More Replies
ADB0513
by New Contributor III
  • 3534 Views
  • 1 reply
  • 1 kudos

Databricks Asset Bundle "Credential was not sent or was of an unsupported type"

I am working on setting up an asset bundle and it is failing when I try to validate the bundle. I am getting an error saying "Credential was not sent or was of an unsupported type for this API." I have a profile created and am using an access token t...

Latest Reply
mvmiller
New Contributor III
  • 1 kudos

I am having a similar issue when trying to deploy my asset bundle. I ran the following: databricks auth login --host <hostname>. I was then authenticated just fine, without issue. I then pointed to the relevant directory containing the asset bundle and ...

Mathias_Peters
by Contributor
  • 1774 Views
  • 2 replies
  • 0 kudos

How to properly implement incremental batching from Kinesis Data Streams

Hi, I implemented a job that should incrementally read all the available data from a Kinesis Data Stream and terminate afterwards. I schedule the job daily. The data retention period of the data stream is 7 days, i.e., there should be enough time to ...

Latest Reply
fixhour
New Contributor II
  • 0 kudos

It seems like the issue might be caused by potential data loss in the Kinesis stream. Even though you're using checkpoints and specifying the "earliest" position, data can expire due to the 7-day retention period, especially if there's a delay in job...

1 More Replies
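For the incremental-Kinesis pattern discussed in this thread, the usual Databricks approach is a streaming read that runs with an `availableNow` trigger: it processes everything currently retained in the stream and then stops, so a daily scheduled job terminates on its own while the checkpoint tracks progress. A sketch with hypothetical stream, region, and checkpoint names — the source options are assembled as plain Python so their shape is visible, and the cluster-only calls are commented:

```python
# Sketch: incremental batch over a Kinesis stream using an availableNow trigger.
# Stream name, region, and checkpoint path are hypothetical placeholders.

def kinesis_options(stream_name: str, region: str) -> dict:
    """Options for the Databricks Kinesis source (spark.readStream.format('kinesis'))."""
    return {
        "streamName": stream_name,
        "region": region,
        # First run starts from the oldest retained record; later runs
        # resume from the checkpoint instead.
        "initialPosition": "earliest",
    }

opts = kinesis_options("my-stream", "us-east-1")
checkpoint = "/Volumes/main/default/checkpoints/kinesis_daily"

# On a cluster this would run roughly as:
# (spark.readStream.format("kinesis").options(**opts).load()
#      .writeStream
#      .option("checkpointLocation", checkpoint)
#      .trigger(availableNow=True)   # process what's available, then stop
#      .toTable("main.default.kinesis_raw"))
```

As long as the job runs well inside the 7-day retention window, the checkpoint guarantees each record is picked up exactly once before it expires.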
MGeiss
by New Contributor III
  • 2711 Views
  • 3 replies
  • 1 kudos

Resolved! Suddenly Getting Timeout Errors Across All Environments while waiting for Python REPL to start.

Hey - we currently have 4 environments spread out across separate workspaces, and as of Monday we've begun to have transient failures in our DLT pipeline runs with the following error: "java.util.concurrent.TimeoutException: Timed out after 60 seconds...

Latest Reply
MGeiss
New Contributor III
  • 1 kudos

For anyone else who may be experiencing this issue - it seems to have been related to serverless compute for notebooks/workflows, which we had enabled for the account, but WERE NOT using for our DLT pipelines. After noticing references to serverless ...

2 More Replies
varshini_reddy
by New Contributor III
  • 3127 Views
  • 14 replies
  • 2 kudos
Latest Reply
filipniziol
Esteemed Contributor
  • 2 kudos

Hi @varshini_reddy, There is no option to stop all the other iterations while a For Each task is running and one of the iterations has failed. This is why the shared workaround simply skips/fails all the subsequent iterations without doing anything. You can fai...

13 More Replies
pritam_epam
by New Contributor III
  • 1945 Views
  • 9 replies
  • 0 kudos

WHERE 1=0, Error message from Server

Hi, I am getting this error: "WHERE 1=0, Error message from Server: Configuration db table is not available." I am using PySpark and a JDBC connection. Please help with this.

Latest Reply
pritam_epam
New Contributor III
  • 0 kudos

@szymon_dybczak Can you help us on this? Or could you provide a complete structure/steps for how to connect to Databricks using PySpark and JDBC, step by step: initiate the Spark session, then the JDBC connection URL, then the SQL read, all of these in detail. Also...

8 More Replies
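The last reply asks for the end-to-end shape of a PySpark-over-JDBC read. A hedged sketch follows — the host, HTTP path, token, and table name are placeholders, the URL format is simplified (check the JDBC driver documentation for the full set of properties), and the actual `spark.read` call needs a live session plus the Databricks JDBC driver on the classpath, so it is shown commented:

```python
# Sketch: reading a table over JDBC from PySpark.
# Host, httpPath, token, and table name below are placeholders.

def jdbc_url(host: str, http_path: str) -> str:
    """Build a Databricks JDBC URL (simplified; real URLs may need extra properties)."""
    return (f"jdbc:databricks://{host}:443/default;"
            f"transportMode=http;ssl=1;httpPath={http_path}")

def jdbc_properties(token: str) -> dict:
    """Connection properties: personal access tokens use the literal user 'token'."""
    return {"user": "token",
            "password": token,
            "driver": "com.databricks.client.jdbc.Driver"}

url = jdbc_url("adb-1234.azuredatabricks.net", "/sql/1.0/warehouses/abc123")

# With a SparkSession this would run roughly as:
# df = (spark.read.format("jdbc")
#       .option("url", url)
#       .option("dbtable", "my_schema.my_table")
#       .options(**jdbc_properties(my_token))
#       .load())
# df.show()
```

Storing the token in a secret scope rather than inline, and confirming the driver JAR is attached to the cluster, are the two setup steps most often missed.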
TheManOfSteele
by New Contributor III
  • 951 Views
  • 1 reply
  • 1 kudos

azure pipeline databricks bundle deploy duplicating jobs

I am deploying an asset bundle using an Azure pipeline. I use:

# Databricks Bundle Validate
- bash: |
    databricks bundle validate -t $(BUNDLE_TARGET)
  displayName: 'Validate Asset Bundle'

# Databricks Bundle Deploy
- bash: |
    databricks bundle deploy...

Latest Reply
Ricklen
New Contributor III
  • 1 kudos

Hey! Same problem over here; I tried upgrading to the latest version of the Databricks CLI, but to no avail. I did find the issue on GitHub: https://github.com/databricks/cli/issues/1650

ahsan_aj
by Contributor II
  • 11749 Views
  • 27 replies
  • 20 kudos

Resolved! Databricks connect 14.3.2 SparkConnectGrpcException Not found any cached local relation withthe hash

Hi All, I am using Databricks Connect 14.3.2 with Databricks Runtime 14.3 LTS to execute the code below. The CSV file is only 7MB; the code runs without issues on Databricks Runtime 15+ clusters but consistently produces the error message shown below ...

Data Engineering
databricks-connect
spark-connect
Latest Reply
ahsan_aj
Contributor II
  • 20 kudos

As a workaround, please try the following Spark configuration, which seems to have resolved the issue for me on both 14.3 LTS and 15.4 LTS:

spark.conf.set("spark.sql.session.localRelationCacheThreshold", 64 * 1024 * 1024)

26 More Replies
Angus-Dawson
by New Contributor III
  • 2940 Views
  • 5 replies
  • 3 kudos

Resolved! PARSE_EMPTY_STATEMENT error when trying to use spark.sql via Databricks Connect

I'm trying to use Databricks Connect to run queries on Delta Tables locally. However, SQL queries using spark.sql don't seem to work properly, even though spark.read.table works.

>>> from databricks.connect import DatabricksSession
>>> spark = Databric...

Latest Reply
alex_khakhlyuk
Databricks Employee
  • 3 kudos

Hi everyone! I am an engineer working on Databricks Connect. This error appears because of the incompatibility between the Serverless Compute and Databricks Connect versions. The current Serverless Compute release roughly corresponds to Databricks Ru...

4 More Replies
lauracoursera
by New Contributor II
  • 1440 Views
  • 4 replies
  • 3 kudos

Create New Table, Infer Schema gives error: Invalid column type {colSchemaType}

I'm doing the course 'Distributed Computing with Spark SQL' on Coursera, and need to create a table by uploading a csv file. That seems to work at first, but as soon as I check the box for 'Infer schema' for the preview table, I get the following m...

Latest Reply
filipniziol
Esteemed Contributor
  • 3 kudos

Hi @lauracoursera, In Databricks Community Edition I am getting the same error as you. The Community Edition is very limited: the UI is not updated to the newest version, it has old runtimes, missing features, etc. My recommendation is to register a fr...

3 More Replies
alexandrexixe
by New Contributor
  • 635 Views
  • 1 reply
  • 0 kudos

Best approach for handling batch processess from cloud object storage.

I'm working on a Databricks implementation project where external Kafka processes write JSON files to S3. I need to ingest these files daily, or in some cases every four hours, but I don't need to perform stream processing. I'm considering two approac...

  • 635 Views
  • 1 replies
  • 0 kudos
Latest Reply
filipniziol
Esteemed Contributor
  • 0 kudos

Hi @alexandrexixe, Are you building a production solution, or do you want to simply explore the data? For something long-term I would recommend the autoloader option. With external tables you do not get the benefits of working with Delta tables: the queries...

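The Auto Loader recommendation above pairs naturally with a scheduled `availableNow` run: the job wakes every four hours (or daily), ingests only the files that arrived since the last checkpoint, and exits, so there is no always-on stream. A sketch with hypothetical bucket, checkpoint, and table names; the cluster-only calls are commented:

```python
# Sketch: batch-style ingestion of Kafka-produced JSON files from S3 via Auto Loader.
# Bucket, checkpoint path, and target table below are hypothetical placeholders.

source = "s3://my-bucket/kafka-output/"
checkpoint = "s3://my-bucket/_checkpoints/kafka_ingest"
target = "main.bronze.kafka_events"

reader_options = {
    "cloudFiles.format": "json",
    # On the first run, also pick up files already present in the bucket
    "cloudFiles.includeExistingFiles": "true",
}

# Scheduled as a job, this processes only new files and then terminates:
# (spark.readStream.format("cloudFiles").options(**reader_options).load(source)
#      .writeStream
#      .option("checkpointLocation", checkpoint)
#      .trigger(availableNow=True)
#      .toTable(target))
```

Compared with an external table over the raw JSON, this lands the data in a managed Delta table, so downstream queries get statistics, OPTIMIZE, and time travel.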
ryomen_sukuna
by New Contributor II
  • 1029 Views
  • 4 replies
  • 0 kudos

Adding tags without Cluster Restart

Hi Team, I want to add tags to a cluster that is shared across multiple job runs. To isolate cost we want to leverage tagging. If a job run is already in place on a running cluster, I want to trigger one more run and add tags. I'm not able to add tags withou...

Latest Reply
ryomen_sukuna
New Contributor II
  • 0 kudos

Correct, we can leverage job-level tagging, but again it comes at the cost of handling concurrent runs, repair runs, and some special cases, as the job definition will change. We generally don't trigger the runs using the UI, as we have an automated proces...

3 More Replies
pablobd
by Contributor II
  • 2338 Views
  • 2 replies
  • 0 kudos

Resolved! Cluster scoped init script failing

I am creating a cluster with asset bundles and adding an init script to it with asset bundles too. The init script is a .sh file in a UC Volume. When I run a job, the cluster spins up and fails with this error: Cluster '****' was terminated. Reason: ...

Latest Reply
PabloCSD
Valued Contributor
  • 0 kudos

Hello Pablo, where did you change the permissions? We are having the exact same issue, but with dbx. We are using a .sh for installing a library (just doing "pip install ***.whl").

1 More Replies
