Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

VovaVili
by New Contributor II
  • 2213 Views
  • 4 replies
  • 0 kudos

Databricks Runtime 13.3 - can I use Databricks Connect without Unity Catalog?

Hello all, The official documentation for Databricks Connect states that, for Databricks Runtime versions 13.0 and above, my cluster needs to have Unity Catalog enabled for me to use Databricks Connect, and use a Databricks cluster through an IDE like...

Latest Reply
ZivadinM
New Contributor II
  • 0 kudos

Did you manage to configure Databricks Connect without Unity Catalog in the end? If so, can you share how?

3 More Replies
SDas1
by New Contributor
  • 7833 Views
  • 2 replies
  • 2 kudos

The identity column of a Databricks Delta table does not start at 0 and increase by 1. It always starts at something like 1 or 2 and increases by 2. Below is the sample code; any logical input is appreciated.

spark.sql("CREATE TABLE integrated.TrailingWeeks(ID bigint GENERATED BY DEFAULT AS IDENTITY (START WITH 0 increment by 1) ,Week_ID int NOT NULL) USING delta OPTIONS (path 'dbfs:/<Path in Azure datalake>/delta')")

Latest Reply
agallardrivilla
New Contributor II
  • 2 kudos

Hi, When you define an identity column in Databricks with GENERATED BY DEFAULT AS IDENTITY (START WITH 0 INCREMENT BY 1), it is expected to start at 0 and increment by 1. However, due to Databricks' distributed architecture, the values may not be str...
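
If consecutive numbers are needed downstream, one option is to derive them at read time rather than rely on the identity column. Below is a minimal pure-Python sketch of that idea, assuming the generated IDs are unique but may contain gaps. The function name `dense_ids` and the sample values are hypothetical, not taken from the thread. In Spark SQL the analogous approach would be `ROW_NUMBER() OVER (ORDER BY ID)`, which numbers from 1.

```python
def dense_ids(generated_ids, start=0):
    """Map unique but gappy identity values to consecutive numbers.

    Assumption: the values are unique and only their relative order matters.
    """
    ordered = sorted(generated_ids)
    if len(set(ordered)) != len(ordered):
        raise ValueError("identity values must be unique")
    return {value: start + position for position, value in enumerate(ordered)}


# Hypothetical identity values as they might come back from a table.
sample = [1, 3, 5, 7]
mapping = dense_ids(sample)
print(mapping)
```

The mapping is recomputed on every read, so it stays consecutive even when later inserts leave new gaps in the stored IDs.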

1 More Replies
Pavan578
by New Contributor II
  • 189 Views
  • 2 replies
  • 0 kudos

Cluster is not starting

Cluster 'xxxxxxx' was terminated. Reason: WORKER_SETUP_FAILURE (SERVICE_FAULT). Parameters: databricks_error_message:DBFS Daemomn is not reachable., gcp_error_message:Unable to reach the colocated DBFS Daemon. Can anyone help me with how we can resolve thi...

Latest Reply
Pavan578
New Contributor II
  • 0 kudos

Thanks @agallardrivilla. I will check the above steps and let you know.

1 More Replies
elikvar
by New Contributor III
  • 18817 Views
  • 9 replies
  • 9 kudos

Cluster occasionally fails to launch

I have a daily running notebook that occasionally fails with the error: "Run result unavailable: job failed with error message Unexpected failure while waiting for the cluster Some((xxxxxxxxxxxxxxx) )to be readySome(: Cluster xxxxxxxxxxxxxxxx is in un...

Latest Reply
Pavan578
New Contributor II
  • 9 kudos

Cluster 'xxxxxxx' was terminated. Reason: WORKER_SETUP_FAILURE (SERVICE_FAULT). Parameters: databricks_error_message:DBFS Daemomn is not reachable., gcp_error_message:Unable to reach the colocated DBFS Daemon. Can anyone help me with how we can resolve thi...

8 More Replies
tanjil
by New Contributor III
  • 13352 Views
  • 9 replies
  • 6 kudos

Resolved! Downloading sharepoint lists using python

Hello, I am trying to download lists from SharePoint into a pandas dataframe. However, I cannot get any information successfully. I have attempted many solutions mentioned on Stack Overflow. Below is one of those attempts: # https://pypi.org/project/sha...

Latest Reply
huntaccess
New Contributor II
  • 6 kudos

The error "<urlopen error [Errno -2] Name or service not known>" suggests that there's an issue with the server URL or network connectivity. Double-check the server URL to ensure it's correct and accessible. Also, verify that your network connection ...
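
The `[Errno -2] Name or service not known` message is what Python raises as `socket.gaierror` when a host name cannot be resolved, so a quick first check is whether the host part of the site URL resolves at all. Below is a minimal sketch using only the standard library. The function name `host_resolves` and the example URL are hypothetical, not taken from the thread.

```python
import socket
from urllib.parse import urlparse


def host_resolves(url, resolver=socket.getaddrinfo):
    """Return True if the host part of `url` resolves via DNS.

    `resolver` is injectable so the check can be exercised offline.
    """
    host = urlparse(url).hostname
    if not host:
        # Typically a missing scheme, e.g. "contoso.sharepoint.com/sites/x".
        raise ValueError(f"no host name found in {url!r}")
    try:
        resolver(host, None)
        return True
    except socket.gaierror:
        return False


# Hypothetical SharePoint URL; replace with the real site address.
site_url = "https://contoso.sharepoint.com/sites/example"
print(urlparse(site_url).hostname)
```

Calling `host_resolves(site_url)` from the same cluster or machine that runs the download separates a mistyped URL or DNS problem from an authentication problem.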

8 More Replies
pesky_chris
by New Contributor III
  • 1382 Views
  • 5 replies
  • 0 kudos

Resolved! Problem with SQL Warehouse (Serverless)

I get the following error message when attempting to use SQL Warehouse (Serverless) compute with Materialized Views (a simple interaction, e.g. DML, UI sample lookup). The MVs are created off the back of Federated Tables (PostgreSQL), MVs are created ...

Latest Reply
pesky_chris
New Contributor III
  • 0 kudos

Hey, To clarify, as I think I'm potentially hitting unintended Databricks "functionality". Materialised Views are managed by a DLT pipeline, which was deployed with DABs off a CI/CD pipeline. The DLT pipeline runs a notebook with Python code creating MVs dynami...

4 More Replies
Edthehead
by Contributor II
  • 1692 Views
  • 2 replies
  • 0 kudos

Parameterized Delta live table pipeline

I'm trying to create an ETL framework on Delta Live Tables and basically use the same pipeline for all the transformations from bronze to silver to gold. This works absolutely fine when I hard code the tables and the SQL transformations as an array wi...

Data Engineering
Databricks
Delta Live Table
dlt
Latest Reply
canadiandataguy
New Contributor II
  • 0 kudos

Here is how you can do it

1 More Replies
calvinchan_iot
by New Contributor II
  • 258 Views
  • 1 replies
  • 0 kudos

SparkRuntimeException: [UDF_ERROR.ENV_LOST] the execution environment was lost during execution

Hey everyone, I have been facing a weird error since upgrading to use Unity Catalog. org.apache.spark.SparkRuntimeException: [UDF_ERROR.ENV_LOST] Execution of function line_string_linear_interp(geometry#1432) failed - the execution environment was lost ...

Latest Reply
Brahmareddy
Valued Contributor III
  • 0 kudos

Hi @calvinchan_iot, How are you doing today? As per my understanding, it sounds like the error may be due to environment instability when running the UDF after enabling Unity Catalog. The [UDF_ERROR.ENV_LOST] error often points to the UDF execution en...

KartRasi_10779
by New Contributor
  • 265 Views
  • 2 replies
  • 0 kudos

Glue Catalog Metadata Management with Enforced Tagging in Databricks

As part of the data governance team, we're trying to enforce table-level tagging when users create tables in a Databricks environment where metadata is managed by AWS Glue Catalog (non-Unity Catalog). Is there a way to require tagging at table creati...

Latest Reply
145676
New Contributor II
  • 0 kudos

You can use lakeFS pre-merge hooks to force this. Works great with this stack -> https://lakefs.io/blog/lakefs-hooks/ 

1 More Replies
htu
by New Contributor III
  • 4622 Views
  • 7 replies
  • 18 kudos

Installing Databricks Connect breaks pyspark local cluster mode

Hi, It seems that when databricks-connect is installed, pyspark is modified at the same time so that it no longer works with a local master node. This has been especially useful in testing, when unit tests for spark-related code without any remot...

Latest Reply
Kolath
New Contributor II
  • 18 kudos

Also frustrated by this behavior. Databricks-connect should not replace the rest of local spark. Is there any solution to this?

6 More Replies
iamgoda
by New Contributor III
  • 2149 Views
  • 11 replies
  • 3 kudos

Databricks SQL script slow execution in workflows using serverless

I am running a very simple SQL script within a notebook, using an X-Small SQL Serverless warehouse (that is already running). The execution time is different depending on how it's run: 4s if run interactively (and through SQL editor); 26s if run within ...

Latest Reply
iamgoce
New Contributor II
  • 3 kudos

So I was told that the Q4 date was incorrect - in fact there is currently no ETA for when this issue will be fixed. It's considered lower priority by Databricks as not enough customers are impacted or have raised this type of an issue. I would recomm...

10 More Replies
steveanderson
by New Contributor
  • 189 Views
  • 1 replies
  • 0 kudos

Comparing Ultrawide Curved Monitors and Dual-Monitor Setups for Databricks Projects

Hello everyone, I’m currently exploring the best setup for my data engineering tasks in Databricks and have been considering the benefits of using an ultrawide curved monitor compared to a standard dual-monitor setup. I’d love to hear from the communit...

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

Here’s a clearer version: I don’t use a curved monitor; instead, I have two 35" monitors, which work perfectly for my Databricks work. I chose two large monitors over one extra-large one because I frequently share screens, and it’s easier to share an ...

Jana
by New Contributor III
  • 7611 Views
  • 9 replies
  • 4 kudos

Resolved! Parsing 5 GB json file is running long on cluster

I was creating a Delta table from an ADLS JSON input file, but the job ran long while creating the Delta table from JSON. Below is my cluster configuration. Is the issue related to the cluster config? Do I need to upgrade the cluster config? The cluster ...

Latest Reply
-werners-
Esteemed Contributor III
  • 4 kudos

With multiline = true, the JSON is read as a whole and processed as such. I'd try with a beefier cluster.
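
In multi-line mode Spark loads each JSON file as a whole and cannot split it across tasks, whereas line-delimited JSON (one object per line) can be split. Below is a minimal sketch of converting a top-level JSON array into JSON Lines. It assumes the input is a single array and fits in memory on the machine doing the conversion; a 5 GB file would likely need a streaming parser instead. The function name `array_to_json_lines` and the sample document are hypothetical.

```python
import json


def array_to_json_lines(array_text):
    """Convert text holding a top-level JSON array into JSON Lines."""
    records = json.loads(array_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # One compact object per line, so each line parses on its own.
    return "\n".join(json.dumps(record) for record in records)


# Hypothetical multi-line document standing in for the real input file.
multiline_doc = """
[
  {"id": 1, "name": "a"},
  {"id": 2, "name": "b"}
]
"""
lines = array_to_json_lines(multiline_doc)
print(lines)
```

A file in this shape can be read without the multiline option, which lets Spark spread the parsing over the cluster.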

8 More Replies
1npo
by New Contributor II
  • 312 Views
  • 2 replies
  • 2 kudos

Dark mode broken by "New version of this app is available" popup

Hello, I have the interface theme set to "Prefer dark" in Databricks. I just got a popup on the Workflow page while reviewing a job run that said something like "A new version of this app is available, click to refresh". I clicked refresh, and now my...

Latest Reply
1npo
New Contributor II
  • 2 kudos

I just got another "New version of this app is available" popup, and clicking "Refresh" fixed the dark mode issue. Thanks for the quick response to whichever engineer at Databricks just pushed a hotfix.

1 More Replies
TamD
by Contributor
  • 1134 Views
  • 6 replies
  • 2 kudos

How do I drop a delta live table?

I'm a newbie and I've just done the "Run your first Delta Live Tables pipeline" tutorial. The tutorial downloads a publicly available CSV baby names file and creates two new Delta Live Tables from it. Now I want to be a good dev and clean up the reso...

Latest Reply
TamD
Contributor
  • 2 kudos

Thank you @gchandra. Deleting the pipeline does indeed remove the materialized view definitions from the Catalog. How can I confirm that the underlying S3 storage has also been cleared? Just removing the pointers in the Catalog is not enough, if ...

5 More Replies
