Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Hubert-Dudek
by Esteemed Contributor III
  • 1863 Views
  • 2 replies
  • 1 kudos

Resolved! Introducing 'Run-If' Feature in Databricks Jobs API for Efficient Task Failure Management

Databricks Jobs API now includes a 'run-if' feature for task creation in workflows. This upgrade enables the execution of repair jobs in scenarios where one or all tasks fail. 
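The post above describes the feature at a high level; a minimal sketch of what a job spec using `run_if` might look like, assuming hypothetical job and notebook names (the `run_if` values such as `AT_LEAST_ONE_FAILED` come from the Jobs API docs):

```python
import json

# Sketch of a Jobs API 2.1 job spec where a repair task runs only when
# an upstream task fails. Job and notebook names are illustrative.
job_spec = {
    "name": "etl-with-repair",
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Jobs/ingest"},
        },
        {
            "task_key": "repair_on_failure",
            "depends_on": [{"task_key": "ingest"}],
            # Run this task only if at least one dependency failed.
            "run_if": "AT_LEAST_ONE_FAILED",
            "notebook_task": {"notebook_path": "/Jobs/repair"},
        },
    ],
}

# This spec would be POSTed to /api/2.1/jobs/create with an auth token.
print(json.dumps(job_spec, indent=2))
```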

Latest Reply
Kaniz_Fatma
Community Manager

@Hubert-Dudek, A heartfelt salute to our dedicated community champion who continually lights up our Databricks community with insightful posts about the latest features! Keep going!

1 More Replies
piterpan
by New Contributor III
  • 5010 Views
  • 8 replies
  • 11 kudos

Resolved! _sqldf not defined on Azure job cluster v12.2

Since yesterday we have errors in notebooks that were previously working: NameError: name '_sqldf' is not defined. It was working previously. We are on Azure Databricks, using a job pool. Driver: Standard_D4s_v5 · Workers: Standard_D4s_v5 · 1-6 workers ·...

Data Engineering
azure
Notebook
pyspark
Latest Reply
Tharun-Kumar
Honored Contributor II

@piterpan This was a regression that impacted jobs where _sqldf was referenced in notebooks that weren't run interactively. Our engineering team fixed this issue yesterday. Could you check whether you are still facing the issue?
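While a regression like this is in play, a notebook can guard its reference to the implicit `_sqldf` variable instead of assuming it exists. A defensive sketch (the helper name is illustrative; `_sqldf` is the variable Databricks injects after an interactive `%sql` cell):

```python
# In non-interactive job runs _sqldf may be missing, so look it up
# instead of referencing it directly.
def latest_sql_result(namespace):
    """Return the last %sql result if present, else None."""
    return namespace.get("_sqldf")

df = latest_sql_result(globals())
if df is None:
    # Fall back, e.g. re-run the query explicitly with spark.sql(...)
    print("_sqldf not defined; falling back to an explicit query")
```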

7 More Replies
marianopenn
by New Contributor III
  • 8970 Views
  • 6 replies
  • 4 kudos

Resolved! [UDF_MAX_COUNT_EXCEEDED] Exceeded query-wide UDF limit of 5 UDFs

We are using DLT to ingest data into our Unity Catalog and then, in a separate job, we are reading and manipulating this data and then writing it to a table like: df.write.saveAsTable(name=target_table_path). We are getting an error which I cannot find ...

Data Engineering
data engineering
dlt
python
udf
Unity Catalog
Latest Reply
Tharun-Kumar
Honored Contributor II

@AlexPrev You can navigate to the Advanced Settings in the cluster configuration and include this config in the Spark section.
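The reply doesn't quote the exact config key, so the one below is an assumption based on the `UDF_MAX_COUNT_EXCEEDED` error class; verify the name against the error message or Databricks docs before using it. A sketch of where such an override would sit in a cluster spec:

```python
# Sketch of a cluster spec fragment raising the query-wide UDF limit.
# The config key is an assumption (not quoted in the thread).
cluster_spec = {
    "cluster_name": "udf-heavy-cluster",  # hypothetical
    "spark_conf": {
        "spark.databricks.safespark.externalUDF.plan.limit": "10",
    },
}

# Equivalently, at runtime inside a notebook:
# spark.conf.set("spark.databricks.safespark.externalUDF.plan.limit", "10")
```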

5 More Replies
Atifdatabricks
by New Contributor II
  • 1196 Views
  • 3 replies
  • 1 kudos

Suspended - Databricks Certified Associate Developer for Apache Spark

During the middle of the exam I was suspended. It said it was due to my eye movement. I had the test on the left part of my monitor and the PDF (which was provided as a testing aid for this exam) on the right side. I was just moving my eyes left and right as I was using the PD...

Latest Reply
Kaniz_Fatma
Community Manager

Adding @Cert-Team for visibility.

2 More Replies
Rishi045
by New Contributor III
  • 11270 Views
  • 11 replies
  • 0 kudos

Data getting missed while reading from azure event hub using spark streaming

Hi All, I am facing an issue of data getting missed. I am reading the data from Azure Event Hub and, after flattening the JSON data, I am storing it in a Parquet file and then using another Databricks notebook to perform the merge operations on my delta ...

Data Engineering
Azure event hub
Spark streaming
Latest Reply
Hubert-Dudek
Esteemed Contributor III

- In Event Hub, you can preview the Event Hub job using Azure Analytics, so please first check whether all records are there.
- Please set Databricks to save directly to the bronze delta table without performing any aggregation, just 1 to 1, and...

10 More Replies
ThomasVanBilsen
by New Contributor III
  • 1142 Views
  • 1 reply
  • 1 kudos

Catalog names in a DTAP scenario

Hi everyone, I'm currently in the process of migrating to Unity Catalog. I have several Azure Databricks workspaces, one for each phase of development (development, test, acceptance, and production). In accordance with the best practices (ht...

Data Engineering
DTAP
Unity Catalog
Latest Reply
-werners-
Esteemed Contributor III

You could also store the environment name in a config file, e.g. in the Databricks FileStore. These config files can also be managed by CI/CD. To be honest, that has been my preferred way of working lately.
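The config-file approach above can be sketched with the standard library; the file name, key, and catalog naming convention below are all illustrative, and in practice the file would live in the workspace filestore and be deployed by CI/CD:

```python
import json
from pathlib import Path

# Hypothetical config file, e.g. written per environment by the CI/CD pipeline.
cfg_path = Path("config.json")
cfg_path.write_text(json.dumps({"environment": "dev"}))

def catalog_for(env_cfg_path):
    """Derive the Unity Catalog name from the deployed environment config."""
    env = json.loads(Path(env_cfg_path).read_text())["environment"]
    return f"{env}_catalog"  # naming convention is illustrative

print(catalog_for(cfg_path))  # → dev_catalog
```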

sparkstreaming
by New Contributor III
  • 4292 Views
  • 5 replies
  • 4 kudos

Resolved! Missing rows while processing records using foreachbatch in spark structured streaming from Azure Event Hub

I am new to real-time scenarios and I need to create Spark Structured Streaming jobs in Databricks. I am trying to apply some rule-based validations from backend configurations on each incoming JSON message. I need to do the following actions on th...

Latest Reply
Rishi045
New Contributor III

Were you able to find a solution? If yes, could you please share it?

4 More Replies
Oliver_Angelil
by Valued Contributor II
  • 7091 Views
  • 9 replies
  • 8 kudos

In what circumstances are both UAT/DEV and PROD environments actually necessary?

I wanted to ask this question yesterday in the Q&A session with Mohan Mathews, but didn't get around to it (@Kaniz Fatma, do you know his handle here so I can tag him?). We (and most development teams) have two environments: UAT/DEV and PROD. For those that d...

Latest Reply
Anonymous
Not applicable

Hi @Oliver Angelil, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Tha...

8 More Replies
DipsikhaDas
by New Contributor II
  • 982 Views
  • 2 replies
  • 2 kudos

Databricks notebook exceptions into Service Now

Hello Community members, I am looking for options for redirecting exceptions raised within a Databricks notebook's exception block to ServiceNow. Is there a way the connection can be made directly from the notebook? Looking for suggestions. ...
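One common pattern (not necessarily the one this thread settled on) is to catch the exception in the notebook and POST it to ServiceNow's Table API. A stdlib sketch; the instance URL is a placeholder and real calls also need authentication headers:

```python
import json
import urllib.request

def build_incident(exc, instance="https://example.service-now.com"):
    """Build a ServiceNow Table API request for an incident from an exception.

    The instance URL is a placeholder; auth headers are omitted here.
    """
    payload = {
        "short_description": f"Databricks notebook failure: {type(exc).__name__}",
        "description": str(exc),
    }
    return urllib.request.Request(
        f"{instance}/api/now/table/incident",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

try:
    raise ValueError("demo failure")  # stand-in for the notebook's logic
except ValueError as e:
    req = build_incident(e)
    # urllib.request.urlopen(req) would send it; skipped in this sketch.
```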

Latest Reply
DipsikhaDas
New Contributor II

Thank you for the solution. I will definitely try this and share with the community if it works.

1 More Replies
adivandhya
by New Contributor III
  • 1930 Views
  • 4 replies
  • 4 kudos

Resolved! configuration for Job Queueing in Terraform

When defining the databricks_job resource in Terraform, we are trying to enable the job queueing flag for the job. However, from the Terraform provider docs, we are not able to find any config related to queueing. Is there a different method to configure...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @adivandhya, it’s a private preview feature - you need to work with your account SA for that.

3 More Replies
HasiCorp
by New Contributor II
  • 11640 Views
  • 3 replies
  • 2 kudos

Resolved! AnalysisException: [RequestId=... ErrorClass=INVALID_PARAMETER_VALUE] Missing cloud file system scheme

Hi community, I get an AnalysisException when executing the following code in a notebook using a personal compute cluster. It seems to be an issue with permissions, but I am logged in with my admin account. Any help would be appreciated. USE CATALOG catalog; ...

Latest Reply
Leonardo
New Contributor III

I was having the same issue because I was trying to set the location with the absolute path, just like you did. I solved it by creating an external location, then copying its URL and using it in the location path options.

2 More Replies
felix_counter
by New Contributor III
  • 2651 Views
  • 3 replies
  • 3 kudos

Resolved! Order of delta table after read not as expected

Dear Databricks Community, I am performing three consecutive 'append' writes to a delta table, where the first append creates the table. Each append consists of two rows, which are ordered by column 'id' (see example in the attached screenshot). Whe...

Latest Reply
felix_counter
New Contributor III

Thanks a lot @Lakshay and @Tharun-Kumar for your valued contributions!

2 More Replies
ivanychev
by Contributor
  • 1505 Views
  • 2 replies
  • 1 kudos

Is there a way to avoid using EBS drives on workers with local NVMe SSD?

The Databricks on AWS docs claim that 30 GB + 150 GB EBS drives are mounted to every node by default. But if I use an instance type like r5d.2xlarge, it already has a local disk, so I want to avoid mounting the 150 GB EBS drive to it. Is there a way to do it? We ...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @ivanychev, based on the provided information, if you want to avoid mounting the 150 GB EBS drive to a node with a local disk, you can set ebs_volume_count to 0 in the Clusters API when creating the cluster. Another option could be manually det...
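The ebs_volume_count setting mentioned above lives in the cluster spec's aws_attributes section. A sketch, with the cluster name being illustrative:

```python
# Sketch of a Clusters API spec that skips the extra EBS volume on
# instance types that already carry local NVMe disks.
cluster_spec = {
    "cluster_name": "nvme-workers",  # hypothetical
    "node_type_id": "r5d.2xlarge",
    "aws_attributes": {
        # 0 means: do not attach additional EBS volumes to this cluster.
        "ebs_volume_count": 0,
    },
}
```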

1 More Replies
bearys
by New Contributor II
  • 2121 Views
  • 2 replies
  • 2 kudos

Illegal character in partition path when attempting REORG ... (PURGE)

I have a large delta table partitioned by an identifier column that I have now discovered has blank spaces in some of the identifiers; e.g. one partition can be defined by "Identifier=first identifier". Most partitions do not have these blank space...

Latest Reply
Kaniz_Fatma
Community Manager

Hi @bearys, the error message suggests an illegal character in the path at a specific index. The error points to a blank space in the path "dbfs:/mnt/container/table_name/Identifier=first identifier/part-01347-8a9a157b-6d0d-75dd-b1b7-2aed12e057...
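Before running REORG it can help to enumerate which partition values carry problematic characters. A small stdlib sketch; the "safe" character set below is an assumption and the partition values are illustrative:

```python
import re

# Characters commonly safe in partition directory names; anything else
# (spaces, colons, etc.) tends to trigger "illegal character in path".
SAFE = re.compile(r"^[A-Za-z0-9_\-.]+$")

def unsafe_partition_values(values):
    """Return partition values that contain characters likely to break paths."""
    return [v for v in values if not SAFE.match(v)]

parts = ["first identifier", "second_identifier", "third-identifier"]
print(unsafe_partition_values(parts))  # → ['first identifier']
```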

1 More Replies
DB_PROD_Molina
by New Contributor
  • 1132 Views
  • 2 replies
  • 3 kudos

Job aborted due to stage failure. Relative path in absolute URI

Hello team, we frequently have Databricks job failures with the following message; any help would be appreciated: Job aborted due to stage failure. Relative path in absolute URI.

Latest Reply
Tharun-Kumar
Honored Contributor II

@DB_PROD_Molina One of the reasons this error shows up is a file path/name containing special characters. If that is the case, could you rename your file to remove the special characters?
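The rename suggested above can be automated with a small sanitizer. A stdlib sketch; the kept character set and example filename are illustrative:

```python
import re

def sanitize_filename(name):
    """Replace characters that commonly break URI parsing (spaces, ':', '%')
    with underscores, keeping letters, digits, dot, dash and underscore."""
    return re.sub(r"[^A-Za-z0-9._\-]", "_", name)

print(sanitize_filename("sales report:2024%final.csv"))
# → sales_report_2024_final.csv
```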

1 More Replies
