Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

apurvasawant
by New Contributor II
  • 463 Views
  • 1 replies
  • 3 kudos

Job Run Failed - "Cluster became unreachable during run" with Cause: "requirement failed: Execution

I'm encountering a failure while running a job in Databricks. The run fails with the following error message: "Cluster became unreachable during run. Cause: requirement failed: Execution is done." Details: Runtime version: 15.4 LTS (includes Apache Spark 3....

Latest Reply
mmayorga
Databricks Employee
  • 3 kudos

Hello @apurvasawant, I'm sorry you are seeing this behavior while using Jobs. These messages definitely don't help much. When this happens, I suggest taking a step back and reviewing your Job's configuration, plus some troubleshooting: What is th...

adrianhernandez
by New Contributor III
  • 500 Views
  • 3 replies
  • 1 kudos

GIT automate DEV Databricks instance to PROD instance

Hello, I'd like to use Git to automate the process of syncing between a DEV Databricks instance and a PROD Databricks instance. Something like: on the Git console, pull changes/sync with DEV Databricks. Have some kind of approval process in Git (like some...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @adrianhernandez, this can be easily achieved using Databricks Asset Bundles in combination with an Azure DevOps pipeline (or GitHub Actions). A typical CI/CD workflow looks something like this: Store: store your Databricks code and notebooks in a ...

2 More Replies
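The Asset Bundles approach described in that reply can be sketched with a minimal bundle configuration. This is a hedged sketch, not the replier's exact setup: the bundle name, target names, and workspace hosts below are hypothetical.

```yaml
# databricks.yml — minimal Asset Bundle sketch (names and hosts are hypothetical)
bundle:
  name: my_project

targets:
  dev:
    mode: development
    workspace:
      host: https://dev-workspace.cloud.databricks.com
  prod:
    mode: production
    workspace:
      host: https://prod-workspace.cloud.databricks.com
```

A CI pipeline (Azure DevOps or GitHub Actions) would then run `databricks bundle validate` on pull requests and `databricks bundle deploy -t prod` once the approval gate passes.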
Carlton
by Contributor II
  • 1431 Views
  • 19 replies
  • 2 kudos

No Longer Able to Create DeltaTables in ADLS Gen 2

Hi Community, up until recently I was happily deleting Delta tables in ADLS Gen 2, with their associated _delta_log table, and subsequently recreating the same table with a new _delta_log table. Now, after deleting a table with its associated _delta_log ta...

Latest Reply
Carlton
Contributor II
  • 2 kudos

Hi @szymon_dybczak, you must have a special version of Databricks Community Edition, as I don't have those options after I select Settings. Shall I try from my premium, paid-for version of Databricks?

18 More Replies
MoodyDirk
by New Contributor III
  • 594 Views
  • 4 replies
  • 4 kudos

Resolved! Workspace Catalog Connection to 'on-prem' MS SQL Server

Hi, first post/question here... I'm trying to add an MS SQL connection to my workspace (us-west-2) Catalog. The connection type is SQL Server, the host equals the public IP of that server, and the port is the default 1433. (The server is an AWS EC2 instance.) All attempts to...

Latest Reply
MoodyDirk
New Contributor III
  • 4 kudos

Thank you to all responses. I've added 44.234.192.32/28 and 52.27.216.188/32 as a (for now) all-traffic inbound rule, but no success... I'll keep looking and trying. We're currently evaluating with a Serverless Cloud workspace, but at the end might go t...

3 More Replies
DR07
by New Contributor II
  • 585 Views
  • 6 replies
  • 2 kudos

Notebook Dashboard refreshes for all users when one user refreshes

Hi Team, I have created a dashboard using the dashboard functionality of the notebook. But when one user refreshes the dashboard, it refreshes for all the users with whom the dashboard is shared. What are the ways this issue can be resolved?

Latest Reply
DR07
New Contributor II
  • 2 kudos

Got it, thank you!

5 More Replies
Anish_2
by New Contributor II
  • 1270 Views
  • 3 replies
  • 0 kudos

Delta live tables - ignore updates on some columns

Hello Team, I have a scenario where, in apply_changes, I want to ignore updates on one column. Is there any way we can achieve this in Delta Live Tables?

Latest Reply
ashraf1395
Honored Contributor
  • 0 kudos

Hi there @Anish_2, yes, you can do that. Here is the doc link: https://docs.databricks.com/aws/en/dlt/cdc?language=Python. For Python you can simply add the except_column_list argument, like this: dlt.apply_changes( target = "target", source = "users...

2 More Replies
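The reply above can be sketched as follows. This is a hedged sketch, not the replier's exact code: the table, key, and column names are hypothetical, and `dlt` only exists inside a Delta Live Tables pipeline, so the import is guarded here.

```python
# Hedged sketch of ignoring updates to one column via except_column_list.
# `dlt` is only available inside a Delta Live Tables pipeline, so the import
# is guarded; table and column names below are hypothetical.
try:
    import dlt
except ImportError:
    dlt = None  # not running inside a DLT pipeline

def define_cdc_flow():
    dlt.create_streaming_table("target")
    dlt.apply_changes(
        target="target",
        source="users_cdc",
        keys=["user_id"],
        sequence_by="event_ts",
        # Changes to this column are excluded from the applied updates.
        except_column_list=["last_login"],
    )
```

Inside a pipeline, the function body would run as part of the pipeline definition; outside one, it is only readable, not executable.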
LukaszJ
by Contributor III
  • 23148 Views
  • 7 replies
  • 2 kudos

Resolved! Install ODBC driver by init script

Hello, I want to install an ODBC driver (for pyodbc). I have tried to do it using Terraform; however, I think it is impossible. So I want to do it with an init script in my cluster. I have the code from the internet, and it works when it is at the beginning of ...

Latest Reply
MayaBakh_80151
New Contributor II
  • 2 kudos

Actually, I found this article and am using it to migrate my shell script to the workspace: Cluster-named and cluster-scoped init script migration notebook - Databricks

6 More Replies
GowthamR
by New Contributor II
  • 384 Views
  • 2 replies
  • 2 kudos

Connecting SQL Server From Databricks

Hi Team, good day! I am trying to connect to SQL Server from Databricks using pyodbc, but I am not able to connect. I have tried many ways, like adding the init script in the cluster configuration, etc., but it is showing me an error. I want to know ea...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor III
  • 2 kudos

@GowthamR supplying the errors, if possible (please mask any credentials in the screenshots so they aren't leaked), will be really useful for helping us debug what's happening. All the best, BS

1 More Replies
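For anyone hitting the same pyodbc issue, a small pure-Python sketch like the one below can help with sharing errors safely: it assembles a SQL Server connection string and masks credentials before anything is pasted into a post. The helper names and driver version are assumptions, not a Databricks or pyodbc API; the resulting string is what you would pass to `pyodbc.connect()`.

```python
import re

# Hypothetical helper: builds a pyodbc-style SQL Server connection string.
# The driver name is an assumption — match it to what your init script installs.
def build_mssql_conn_str(host, port, database, user, password,
                         driver="ODBC Driver 18 for SQL Server"):
    return (
        f"DRIVER={{{driver}}};SERVER={host},{port};DATABASE={database};"
        f"UID={user};PWD={password};TrustServerCertificate=yes"
    )

def mask_secrets(text):
    # Redact UID/PWD values before sharing an error message or screenshot.
    return re.sub(r"(UID|PWD)=[^;]*", r"\1=***", text)
```

Masking the string before posting keeps the credential out of public threads while preserving the rest of the error context.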
sandeepsuresh16
by New Contributor II
  • 1093 Views
  • 4 replies
  • 7 kudos

Resolved! Azure Databricks Job Run Failed with Error - Could not reach driver of cluster

Hello Community, I am facing an intermittent issue while running a Databricks job. The job fails with the following error message: "Run failed with error message: Could not reach driver of cluster <cluster-id>." Here are some additional details: Cluster Typ...

Latest Reply
sandeepsuresh16
New Contributor II
  • 7 kudos

Hello Anudeep, thank you for your detailed response and the helpful recommendations. I would like to provide some additional context: for our jobs, we are running only one notebook at a time, not multiple notebooks or tasks concurrently. The issue occurs...

3 More Replies
Vinil
by New Contributor III
  • 743 Views
  • 7 replies
  • 1 kudos

Upgrading Drivers and Authentication Method for Snowflake Integration

Hello Databricks Support Team, I am reaching out to request assistance with upgrading the drivers and configuring authentication methods for our Snowflake–Databricks integration. We would like to explore and implement one of the recommended secure auth...

Latest Reply
Vinil
New Contributor III
  • 1 kudos

@Khaja_Zaffer, I need assistance with upgrading the Snowflake drivers on the cluster. We installed the Snowflake package on the cluster; how do we upgrade the Snowflake library? For authentication, I will reach out to the Azure team. Thanks for the details.

6 More Replies
yit
by Contributor III
  • 783 Views
  • 3 replies
  • 6 kudos

Resolved! Schema hints: define column type as struct and incrementally add fields with schema evolution

Hey everyone, I want to set a column's type to an empty struct via schema hints, without specifying subfields. I then expect the struct to evolve with subfields through schema evolution when new subfields appear in the data. But I've found in the documen...

Data Engineering
autoloader
schema hints
Struct
Latest Reply
K_Anudeep
Databricks Employee
  • 6 kudos

Hello @yit, you can't. An "empty struct" is treated as a fixed struct with zero fields, so Auto Loader will not expand it later. The NOTE in the screenshot applies to JSON just as much as to Parquet/Avro/CSV. If your goal is to "discover whatever shows up u...

2 More Replies
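As a concrete illustration of the reply above: a schema hint has to name the struct's subfields explicitly, because an empty `STRUCT<>` hint stays a zero-field struct. The option values, column names, and paths below are hypothetical, and the commented `spark.readStream` line shows where these options would be used inside Databricks.

```python
# Hedged sketch of Auto Loader options where the schema hint names the
# struct's subfields explicitly. Paths and column names are hypothetical.
autoloader_options = {
    "cloudFiles.format": "json",
    "cloudFiles.schemaLocation": "/tmp/_schemas/events",  # hypothetical path
    # Name the subfields you already know; a zero-field struct hint
    # ("payload STRUCT<>") would never gain fields through evolution.
    "cloudFiles.schemaHints": "payload STRUCT<id BIGINT, name STRING>",
}

# Inside Databricks this would feed a stream, roughly:
# spark.readStream.format("cloudFiles").options(**autoloader_options).load(src)
```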
ToBeDataDriven
by New Contributor II
  • 642 Views
  • 4 replies
  • 3 kudos

Resolved! Disable Logging in Python `dbutils.fs.put`?

This function prints "Wrote n bytes." to stdout every time it writes. I want to disable its logging, as I'm writing thousands of files and it floods the log with meaningless information. Does anyone know if it's possible?

Latest Reply
K_Anudeep
Databricks Employee
  • 3 kudos

@ToBeDataDriven , If the above answered your question, then could you please help accept the solution?

3 More Replies
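One generic way to silence per-file chatter like "Wrote n bytes." is to redirect stdout around the call. The sketch below uses a stand-in function rather than the real `dbutils.fs.put`; whether redirection actually catches `dbutils` output depends on how the runtime wires its stdout, so treat this as an assumption to verify on a small batch first.

```python
import contextlib
import io

def noisy_put(path, contents):
    # Stand-in for dbutils.fs.put, which prints "Wrote n bytes." on success.
    print(f"Wrote {len(contents)} bytes.")
    return True

def quiet_put(path, contents):
    # Swallow the stdout chatter while keeping the return value.
    with contextlib.redirect_stdout(io.StringIO()):
        return noisy_put(path, contents)
```

When looping over thousands of files, calling `quiet_put` keeps the driver log free of the repeated write confirmations.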
ManojkMohan
by Honored Contributor II
  • 1019 Views
  • 1 replies
  • 2 kudos

Parsing from PDF to a Structured Table | Looking for Best Practices

Use case: converting unstructured data from a PDF to a structured format before sending it to Salesforce. Ask: best practices to structure my table better before sending it to a system like Salesforce. Output in structured format looks like: My code: Extract Tables f...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor III
  • 2 kudos

@ManojkMohan My advice for parsing PDFs: 1. Will your project have PDFs that are all the same in terms of formatting? I.e., invoices of a particular type where things like addresses and values might change, but their position on the page is mostly the ...

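For the table-extraction step, one commonly used approach is `pdfplumber` (a third-party library, assumed installed; the import is guarded below so the sketch can be read without it). The path handling is illustrative, not the poster's actual code.

```python
# Hedged sketch: pull every table from a PDF with pdfplumber.
# pdfplumber is an assumed dependency; the import is guarded so this
# file can be inspected even where the library is absent.
try:
    import pdfplumber
except ImportError:
    pdfplumber = None

def extract_pdf_tables(path):
    # Returns a list of tables; each table is a list of rows of cell strings.
    tables = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            tables.extend(page.extract_tables())
    return tables
```

Each extracted table is row-oriented, so it can be loaded into a DataFrame and cleaned (headers, types, units) before being pushed to a downstream system like Salesforce.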
animesh_kumar27
by New Contributor
  • 384 Views
  • 2 replies
  • 1 kudos

not able to create a compute

Hello all, I deleted the resource group three times and selected three different regions: East US, Central India, and South India. Every time I try to create a single-node compute, it takes a long time and then finally says resource out of st...

Latest Reply
nayan_wylde
Esteemed Contributor
  • 1 kudos

"Resource out of stock" is becoming a common issue with Microsoft these days. It happens when they don't have enough VMs in a region. I would suggest trying a different SKU for your cluster; changing the SKU usually resolves the issue.

1 More Replies
gerard_gv
by New Contributor
  • 548 Views
  • 1 replies
  • 1 kudos

Resolved! readStream with readChangeFeed option in SQL

I have spent some days trying to find the SQL equivalent of: spark.readStream.option("readChangeFeed", "true").table("table_name"). I suspect that it works like AUTO CDC FROM SNAPSHOT, since CDF adds the column "_commit_version", a ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @gerard_gv , there isn’t currently a direct SQL equivalent to the readChangeFeed option. This option is only supported in streaming through the Python and Scala DataFrame APIs. As a workaround, take a look at the table_changes SQL function....

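The `table_changes` workaround mentioned in the reply can be illustrated with a small helper that builds the batch SQL query. The table name and starting version are placeholders; the string would be run via `spark.sql(...)` inside Databricks.

```python
def table_changes_query(table_name, starting_version):
    # Builds a batch SQL query over the table's change data feed using the
    # table_changes table-valued function. The CDF exposes extra columns such
    # as _change_type and _commit_version alongside the table's own columns.
    return (
        f"SELECT * FROM table_changes('{table_name}', {starting_version}) "
        f"ORDER BY _commit_version"
    )
```

Unlike `readStream` with `readChangeFeed`, this is a batch read, so a scheduled job would need to track the last processed `_commit_version` itself between runs.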
