Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jeremy98
by Contributor
  • 111 Views
  • 1 replies
  • 0 kudos

Conflicts MetaDataChanged

Hello Community, I'm encountering a problem with Databricks Delta tables. Specifically, I have several tables that are accessed by different processes, which include both write and update operations. The main issue arises when these operations overlap...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

You can refer to the best practices mentioned in https://docs.databricks.com/en/optimizations/isolation-level.html 
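Alongside choosing an isolation level, a common way to cope with overlapping writers is to retry the losing transaction. The following is a minimal sketch of that idea, not something taken from the linked page: `retry_on_conflict` and its parameters are hypothetical names, and the exception classes to retry on are passed in by the caller. In delta-spark the conflict classes are exposed under `delta.exceptions` (for example `MetadataChangedException`), but check which ones your runtime actually raises.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_on_conflict(
    operation: Callable[[], T],
    conflict_errors: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying with jittered exponential backoff on conflicts.

    `conflict_errors` is supplied by the caller; this helper is a hypothetical
    illustration and not part of any Databricks or Delta library.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except conflict_errors:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last conflict
            # Wait 1x, 2x, 4x ... the base delay, plus up to 1x random jitter
            delay = base_delay_s * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, base_delay_s))
    raise ValueError("max_attempts must be at least 1")
```

The operation passed in should be the whole write or update (a zero-argument callable), so that each retry re-reads the table's latest state instead of replaying stale data.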

  • 0 kudos
zuzsad
by New Contributor II
  • 247 Views
  • 3 replies
  • 0 kudos

Azure Asset Bundle deploy removes the continuous: true configuration

I have this pipeline configuration that I'm deploying using Azure Asset Bundles:

ingest-pipeline.test.yml
```
resources:
  pipelines:
    ingest-pipeline-test:
      name: ingest-pipeline-test-2
      clusters:
        - label: default
          node_type_id:...
```

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

I just got confirmation internally that as of now Continuous is disabled in mode: development

  • 0 kudos
2 More Replies
RobsonNLPT
by Contributor II
  • 346 Views
  • 2 replies
  • 0 kudos

Delta UserMetadata attribute using Serverless Compute

To automate the configuration of Spark on serverless compute, Databricks has removed support for manually setting most Spark configurations. I've used the userMetadata attribute to add context for all workloads that write to Delta tables. I have the follow...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@RobsonNLPT I see, so you're currently getting a "Configuration spark.databricks.delta.commitInfo.userMetadata is not available" error, correct? I did some internal research and there doesn't seem to be a workaround for this yet, but it'll be good if you co...

  • 0 kudos
1 More Replies
brady_tyson
by New Contributor
  • 242 Views
  • 1 replies
  • 0 kudos

Databricks Connect Vscode. Cannot find package installed on cluster

I am using Databricks Connect v2 to connect to a UC enabled cluster. I have a package I have made and installed in a wheel file on the cluster. When using vscode to import the package and use it I get a module not found error when running cell by cel...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Hi @brady_tyson just checking—are you still facing this issue with using your custom package and Databricks Connect? If so, here are a few questions to collect some data points about your setup:   Is Databricks Connect properly installed and configur...

  • 0 kudos
prakharcode
by New Contributor II
  • 485 Views
  • 1 replies
  • 0 kudos

Problem with streaming jobs (foreachBatch) with USER_ISOLATION compute cluster

We have been trying to run a streaming job on an all-purpose compute (4 cores, 16 GB) in “user_isolation” mode, which Databricks recommends for use with Unity Catalog. The job reads CDC files produced by a table refreshed every hour and produces aro...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@prakharcode  Thank you for sharing the detailed information about your issue. Before diving into solutions, I want to confirm if this is still an ongoing problem you're facing. Regarding the difference in job performance between "NO_ISOLATION" mode ...

  • 0 kudos
sakuraDev
by New Contributor II
  • 362 Views
  • 1 replies
  • 0 kudos

Why does soda not initialize?

Hey everyone, I'm using Auto Loader with Soda. I'm new to both. The idea is to ingest with quality checks in my silver table for every batch in a continuous ingestion. I tried to configure Soda as a str just like the docs show, but it seems that it keeps on t...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@sakuraDev is this still an ongoing issue? If so, could you please share the error stacktrace as a file attachment? Thanks.

  • 0 kudos
ceceliac
by New Contributor III
  • 366 Views
  • 7 replies
  • 0 kudos

inconsistent behavior with serverless sql: user is not an owner of table error with views

We get the following error with some basic views and not others when using serverless compute (from a notebook or from SQL Editor or from the Catalog Explorer).  Views are simple select * from table x and underlying schemas/tables are using managed m...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@ceceliac just a quick check: if you rerun the same query after it has initially failed, will it go through or still fail? If it runs fine, wait another 10-15 minutes, rerun it, and share the outcome. So: 1. Run it once, it will fail. 2. Rerun it inm...

  • 0 kudos
6 More Replies
zsh24
by New Contributor
  • 1382 Views
  • 3 replies
  • 0 kudos

Python worker exited unexpectedly (crashed)

I have a failing pipeline which results in the following failure: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2053.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2053.0 (TID 4594) (10.171.199.129 e...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

@zsh24 , just checking if you were able to address the problem or need further guidance? 

  • 0 kudos
2 More Replies
khishore
by Contributor
  • 4116 Views
  • 7 replies
  • 6 kudos

Resolved! i haven't received my certificate or the badge for Databricks Certified Data Engineer Associate

Hi @Lindsay Olson @Kaniz Fatma, I passed my Databricks Certified Data Engineer Associate exam on 29 October 2022, but I haven't received my badge or certificate yet. Can you please help? Thanks

Latest Reply
gokul2
New Contributor II
  • 6 kudos

Hi @Lindsay Olson @Kaniz Fatma, I passed my Databricks Certified Data Engineer Associate exam on 29 October 2022, but I haven't received my badge or certificate yet. Thanks, Gokul P

  • 6 kudos
6 More Replies
bobbysidhartha
by New Contributor
  • 15583 Views
  • 2 replies
  • 0 kudos

How to merge data in parallel into partitions of a Databricks Delta table using PySpark/Spark streaming?

I have a PySpark streaming pipeline which reads data from a Kafka topic; the data goes through various transformations and finally gets merged into a Databricks Delta table. In the beginning we were loading data into the delta table by using the merge ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@bobbysidhartha: When merging data into a partitioned Delta table in parallel, it is important to ensure that each job only accesses and modifies the files in its own partition to avoid concurrency issues. One way to achieve this is to use partition...
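The preview above is cut off, so the following is only a sketch of one widely used way to keep each job inside its own partition: state the partition values explicitly in the merge condition, so the merge does not have to scan or rewrite files in other partitions. The helper names, the aliases `t` and `s`, and the table and column names in the usage comment are all hypothetical.

```python
from typing import Iterable, Sequence


def sql_literal(value: object) -> str:
    """Render a Python value as a SQL literal (strings quoted, quotes doubled)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def partition_scoped_merge_condition(
    key_columns: Sequence[str],
    partition_column: str,
    partition_values: Iterable[object],
    target_alias: str = "t",
    source_alias: str = "s",
) -> str:
    """Build a merge condition limited to the partitions this job owns."""
    if not key_columns:
        raise ValueError("at least one key column is required")
    values = ", ".join(sql_literal(v) for v in partition_values)
    if not values:
        raise ValueError("at least one partition value is required")
    key_match = " AND ".join(
        f"{target_alias}.{col} = {source_alias}.{col}" for col in key_columns
    )
    # The explicit IN (...) predicate on the target is what scopes the merge
    return f"{target_alias}.{partition_column} IN ({values}) AND {key_match}"


# Hypothetical usage inside one job (table and column names are made up):
#   condition = partition_scoped_merge_condition(["id"], "country", ["US"])
#   DeltaTable.forName(spark, "events").alias("t").merge(
#       batch_df.alias("s"), condition
#   ).whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
```

The sketch only handles string and numeric partition values; dates or timestamps would need their own literal formatting.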

  • 0 kudos
1 More Replies
FilipezAR
by New Contributor
  • 6692 Views
  • 2 replies
  • 1 kudos

Failed to create new KafkaAdminClient

I want to create connections to kafka with spark.readStream using the following parameters: kafkaParams = { "kafka.sasl.jaas.config": f'org.apache.kafka.common.security.plain.PlainLoginModule required username="{kafkaUsername}" password="{kafkaPa...

Latest Reply
john533
New Contributor III
  • 1 kudos

The error indicates a missing Kafka client dependency for Spark in Databricks. Ensure the correct Kafka connector library is attached to your Databricks cluster, such as org.apache.spark:spark-sql-kafka-0-10_2.12:x.x.x (replace x.x.x with your Spark ...
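Besides the connector library, the options passed to the reader are worth checking. Below is a small sketch of how the SASL/PLAIN options could be assembled; the function name, the `SASL_SSL` protocol, and the `subscribe` option are assumptions, since the original `kafkaParams` is truncated. Two details are easy to get wrong: the JAAS string must end with a semicolon, and on Databricks runtimes the bundled Kafka client is typically shaded, in which case the login module class needs a `kafkashaded.` prefix (verify this against your runtime).

```python
from typing import Dict

# Unshaded class name; on Databricks the shaded variant is usually
# "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule".
DEFAULT_LOGIN_MODULE = "org.apache.kafka.common.security.plain.PlainLoginModule"


def kafka_sasl_plain_options(
    bootstrap_servers: str,
    topic: str,
    username: str,
    password: str,
    login_module: str = DEFAULT_LOGIN_MODULE,
) -> Dict[str, str]:
    """Build reader options for a SASL/PLAIN Kafka source (hypothetical helper)."""
    # The trailing semicolon is required by the JAAS config syntax
    jaas = f'{login_module} required username="{username}" password="{password}";'
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SASL_SSL",  # assumption: TLS-enabled brokers
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
        "subscribe": topic,
    }


# Hypothetical usage:
#   opts = kafka_sasl_plain_options("broker:9093", "events", kafkaUsername, kafkaPassword)
#   df = spark.readStream.format("kafka").options(**opts).load()
```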

  • 1 kudos
1 More Replies
JothyGanesan
by New Contributor II
  • 321 Views
  • 3 replies
  • 0 kudos

DLT Merge tables into Delta

We are trying to load a Delta table from streaming tables using DLT. This target table needs a MERGE of 3 source tables. But when we use the DLT command with merge it says Merge is not supported. Is this anything related to DLT version? Please help u...

Latest Reply
RiyazAli
Valued Contributor II
  • 0 kudos

Hey @JothyGanesan, please take a look at the Apply Changes API (https://docs.databricks.com/en/delta-live-tables/cdc.html). This is the replacement for MERGE INTO in DLT pipelines. Cheers!
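To make the idea concrete, here is a toy, plain-Python model of what an SCD type 1 apply-changes step does conceptually: upsert by key and let the row with the highest sequence value win, even when events arrive out of order. This is not the DLT API (the real call is `dlt.apply_changes`, which takes a target, a source, keys, and a sequencing column and only runs inside a pipeline); the function and field names below are hypothetical, and deletes are not modelled.

```python
from typing import Dict, Hashable, Iterable, Mapping


def apply_changes_scd1(
    target: Mapping[Hashable, Mapping[str, object]],
    changes: Iterable[Mapping[str, object]],
    key: str,
    sequence_by: str,
) -> Dict[Hashable, Dict[str, object]]:
    """Upsert change rows into a copy of `target`, keeping the newest row per key."""
    result = {k: dict(row) for k, row in target.items()}
    for row in changes:
        row_key = row[key]
        current = result.get(row_key)
        # Insert new keys; overwrite only if the change is at least as recent
        if current is None or row[sequence_by] >= current[sequence_by]:
            result[row_key] = dict(row)
    return result


target = {1: {"id": 1, "seq": 1, "value": "a"}}
changes = [
    {"id": 1, "seq": 3, "value": "c"},
    {"id": 1, "seq": 2, "value": "b"},  # late arrival, must not overwrite seq 3
    {"id": 2, "seq": 1, "value": "x"},
]
merged = apply_changes_scd1(target, changes, key="id", sequence_by="seq")
```

With three source tables, as in the question, the usual pattern would be to combine them into one change feed first and then apply it to the target.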

  • 0 kudos
2 More Replies
Taja
by New Contributor II
  • 134 Views
  • 1 replies
  • 0 kudos

Delta Live Tables: large use

Does anyone use Delta Live Tables at large scale in production pipelines? Are they satisfied with the product? Recently, I've started a PoC to evaluate DLT and noticed some concerns: - Excessive use of compute resources when you check the cluster m...

Latest Reply
RiyazAli
Valued Contributor II
  • 0 kudos

Hi @Taja, I agree that DLT pipelines don't accept a single-node cluster to begin with, but you can always choose the instance type for both your driver and the worker nodes. As far as `waiting for resources` time is concerned, I've seen that DLT takes...

  • 0 kudos
NK_123
by New Contributor II
  • 678 Views
  • 3 replies
  • 0 kudos

DELTA_INVALID_SOURCE_VERSION issue on Spark Structured Streaming

I am running a Structured Streaming job and getting this error on Databricks; the source table already has 2 versions (0, 1). It is still not able to find  Query {'_id': UUID('fe7a563e-f487-4d0e-beb0-efe794ab4708'), '_runId': UUID('bf0e94b5-b6ce-42bb-9bc7-15...

Latest Reply
lukinkratas
New Contributor II
  • 0 kudos

Are you using checkpoints? If so, make sure the permissions on that location are OK; alternatively, delete all the checkpoints you have created in that location and try again. This was my case.

  • 0 kudos
2 More Replies
Akash_Wadhankar
by New Contributor III
  • 128 Views
  • 0 replies
  • 1 kudos

Data Engineering Journey on Databricks

For any new Data Engineering aspirant, it has always been difficult to know where to start the learning journey. I faced this challenge a decade ago. To help new aspirants, I created a series of Medium articles for new learners. I hope it brings mor...

