Data Engineering
Forum Posts

Paul92S
by New Contributor III
  • 748 Views
  • 2 replies
  • 1 kudos

Resolved! DELTA_EXCEED_CHAR_VARCHAR_LIMIT

Hi, I am having an issue loading source data into a Delta table in Unity Catalog. The error we are receiving is the following: grpc_message:"[DELTA_EXCEED_CHAR_VARCHAR_LIMIT] Exceeds char/varchar type length limitation. Failed check: (isnull(\'metric_...

Latest Reply
Palash01
Contributor III
  • 1 kudos

Hey @Paul92S, Looking at the error message, it looks like the column "metric_name" is the culprit here. Understanding the Error: Character Limit Violation: The error indicates that values in the metric_name column are exceeding the maximum length allowed fo...

1 More Replies
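To make the fix concrete, here is a hypothetical plain-Python pre-check of the same idea (the 255-character limit, the function name, and the sample values are assumptions; in PySpark the usual remedy is casting the column to STRING or truncating it with F.substring before the write):

```python
def enforce_varchar_limit(values, limit=255):
    """Truncate strings that exceed a VARCHAR limit, returning the
    fixed values and the offending originals (limit is an assumption)."""
    offenders = [v for v in values if len(v) > limit]
    fixed = [v[:limit] for v in values]
    return fixed, offenders

# One value under the limit, one 300 characters long
fixed, offenders = enforce_varchar_limit(["ok", "x" * 300], limit=255)
```

Running the same check over the metric_name column before loading would surface the rows that trip DELTA_EXCEED_CHAR_VARCHAR_LIMIT instead of failing the whole write.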
Olaoye_Somide
by New Contributor
  • 172 Views
  • 1 replies
  • 0 kudos

How to Implement Custom Logging in Databricks without Using _jvm Attribute with Spark Connect?

Hello Databricks Community, I am currently working in a Databricks environment and trying to set up custom logging using Log4j in a Python notebook. However, I've run into a problem due to the use of Spark Connect, which does not support the _jvm attr...

Data Engineering
Apache Spark
data engineering
Latest Reply
arpit
Contributor III
  • 0 kudos

import logging

logging.getLogger().setLevel(logging.WARN)
log = logging.getLogger("DATABRICKS-LOGGER")
log.warning("Hello")

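Expanding on that reply, a self-contained sketch that adds a handler and timestamped formatter, staying entirely in Python's standard logging module so nothing depends on the JVM-backed Log4j bridge that Spark Connect does not expose (the logger name "DATABRICKS-LOGGER" is just illustrative):

```python
import logging
import sys

log = logging.getLogger("DATABRICKS-LOGGER")
log.setLevel(logging.INFO)

# Stream handler with a timestamped format; pure-Python, no _jvm needed
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)
log.addHandler(handler)

log.warning("Hello")
```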
Phani1
by Valued Contributor
  • 117 Views
  • 1 replies
  • 0 kudos

Boomi integrating with Databricks

Hi Team, Is there any impact when integrating Databricks with Boomi as opposed to Azure Event Hub? Could you offer some insights on the integration of Boomi with Databricks? https://boomi.com/blog/introducing-boomi-event-streams/ Regards, Janga

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Phani1, Let’s explore the integration of Databricks with Boomi and compare it to Azure Event Hub. Databricks Integration with Boomi: Databricks is a powerful data analytics platform that allows you to process large-scale data and build machin...

CarstenWeber
by New Contributor
  • 217 Views
  • 4 replies
  • 1 kudos

Resolved! Invalid configuration fs.azure.account.key trying to load ML Model with OAuth

Hi Community, I was trying to load an ML model from an Azure storage account (abfss://...) with: model = PipelineModel.load(path). I set the Spark config: spark.conf.set("fs.azure.account.auth.type", "OAuth") spark.conf.set("fs.azure.account.oauth.provi...

Latest Reply
CarstenWeber
New Contributor
  • 1 kudos

@daniel_sahal using the settings above did indeed work. 

3 More Replies
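For reference, a hedged sketch of the per-storage-account service-principal configuration this thread converges on; the keys follow the documented fs.azure.account.* pattern, and every placeholder (<storage-account>, <application-id>, secret scope and key, <tenant-id>) is an assumption to fill in:

```python
# OAuth config for ABFSS access, scoped to a single storage account
storage = "<storage-account>.dfs.core.windows.net"

spark.conf.set(f"fs.azure.account.auth.type.{storage}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage}", "<application-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage}",
               dbutils.secrets.get(scope="<scope>", key="<key>"))
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage}",
               "https://login.microsoftonline.com/<tenant-id>/oauth2/token")
```

Scoping each key to the account's hostname (rather than the bare fs.azure.account.auth.type) avoids clobbering credentials for other storage accounts on the same cluster.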
niruban
by New Contributor II
  • 258 Views
  • 2 replies
  • 0 kudos

Databricks Asset Bundle to deploy only one workflow

Hello Community - I am trying to deploy only one workflow from my CI/CD pipeline. But whenever I try to deploy one workflow using "databricks bundle deploy - prod", it deletes all the existing workflows in the target environment. Is there any option av...

Data Engineering
CICD
DAB
Databricks Asset Bundle
DevOps
Latest Reply
niruban
New Contributor II
  • 0 kudos

@Rajani: This is what I am doing. I have a GitHub Actions step that runs:
- name: bundle-deploy
  run: |
    cd ${{ vars.HOME }}/dev-ops/databricks_cicd_deployment
    databricks bundle deploy --debug
Before running this step, I am creatin...

1 More Replies
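One common workaround for this thread's problem is to keep the workflow in its own bundle, so a deploy only reconciles that bundle's resources and leaves other jobs untouched. A hypothetical minimal databricks.yml (all names, paths, and the host URL are illustrative placeholders):

```yaml
bundle:
  name: single-workflow-bundle   # deploy tracks only the resources below

resources:
  jobs:
    my_workflow:                 # illustrative job key
      name: my-workflow
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: /Workspace/path/to/notebook  # placeholder

targets:
  prod:
    workspace:
      host: https://<workspace-url>   # placeholder
```

Because a bundle deploy reconciles everything the bundle owns, jobs that were deployed from the same bundle but removed from its config get deleted; splitting independent workflows into separate bundles avoids that.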
EWhitley
by New Contributor II
  • 223 Views
  • 0 replies
  • 0 kudos

Custom ENUM input as parameter for SQL UDF?

Hello - We're migrating from T-SQL to Spark SQL, and we're migrating a significant number of queries. "datediff(unit, start, end)" is different between these two implementations (in a good way). For the purpose of migration, we'd like to stay as consiste...

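As a sketch of the boundary-counting semantics T-SQL's DATEDIFF uses (a hypothetical plain-Python helper, not a Spark SQL UDF; the unit names and the three supported units are assumptions for illustration):

```python
from datetime import date

def datediff(unit: str, start: date, end: date) -> int:
    """Count unit boundaries crossed between start and end,
    mimicking T-SQL DATEDIFF for a few common units."""
    if unit == "day":
        return (end - start).days
    if unit == "month":
        # Month boundaries crossed, regardless of day-of-month
        return (end.year - start.year) * 12 + (end.month - start.month)
    if unit == "year":
        return end.year - start.year
    raise ValueError(f"unsupported unit: {unit}")
```

Note the boundary semantics: one day apart across New Year's Eve still counts as 1 month and 1 year, which is how T-SQL behaves and where Spark SQL's interval arithmetic differs.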
amde99
by New Contributor
  • 248 Views
  • 2 replies
  • 0 kudos

How can I throw an exception when a .json.gz file has multiple roots?

I have a situation where source files in .json.gz sometimes arrive with invalid syntax containing multiple roots separated by empty brackets []. How can I detect this and throw an exception? Currently the code runs and picks up only record set 1, and ...

Latest Reply
Lakshay
Esteemed Contributor
  • 0 kudos

Schema validation should help here.

1 More Replies
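One way to implement that validation is to raw-decode the payload and raise if anything follows the first JSON document. A hypothetical plain-Python sketch (the function name is illustrative; for large files you would run this per-file before, or alongside, the Spark read):

```python
import gzip
import json

def load_single_root(path: str):
    """Parse a .json.gz file, raising ValueError if more than one
    JSON root is present (e.g. roots separated by stray brackets)."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        text = f.read().strip()
    # raw_decode stops at the end of the first complete JSON value
    obj, end = json.JSONDecoder().raw_decode(text)
    if text[end:].strip():
        raise ValueError(f"multiple JSON roots detected in {path}")
    return obj
```

json.JSONDecoder().raw_decode returns both the parsed value and the offset where parsing stopped, so any non-whitespace trailing content is exactly the "second root" condition described above.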
Karlo_Kotarac
by New Contributor II
  • 180 Views
  • 3 replies
  • 0 kudos

Run failed with error message ContextNotFound

Hi all! Recently we've been getting lots of these errors when running Databricks notebooks. At that time we observed a DRIVER_NOT_RESPONDING (Driver is up but is not responsive, likely due to GC.) log on the single-user cluster we use. Previously, when thi...

Latest Reply
Lakshay
Esteemed Contributor
  • 0 kudos

You may also try to run the failing notebook on a job cluster.

2 More Replies
Phani1
by Valued Contributor
  • 115 Views
  • 1 replies
  • 0 kudos

Code Review tools

Could you kindly recommend any Code Review tools that would be suitable for our Databricks tech stack?

Data Engineering
code review
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @Phani1, When it comes to code review tools for your Databricks tech stack, here are some options you might find useful: Built-in Interactive Debugger in Databricks Notebook: The interactive debugger is available exclusively for Python code withi...

dilkushpatel
by New Contributor II
  • 253 Views
  • 4 replies
  • 0 kudos

Databricks connecting SQL Azure DW - Confused between Polybase and Copy Into

I see two articles in the Databricks documentation: https://docs.databricks.com/en/archive/azure/synapse-polybase.html#language-python and https://docs.databricks.com/en/connect/external-systems/synapse-analytics.html#service-principal. The Polybase one is legacy o...

Data Engineering
azure
Copy
help
Polybase
Synapse
Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @dilkushpatel, Thank you for sharing your confusion regarding PolyBase and the COPY INTO command in Databricks when working with Azure Synapse.  PolyBase (Legacy): PolyBase was previously used for data loading and unloading operations in Azure...

3 More Replies
Abhi0607
by New Contributor II
  • 203 Views
  • 2 replies
  • 0 kudos

Variables passed from ADF to Databricks Notebook Try-Catch are not accessible

Dear Members, I need your help with the scenario below. I am passing a few parameters from an ADF pipeline to a Databricks notebook. If I execute the ADF pipeline to run my Databricks notebook and use these variables as-is in my code (Python), it works fine. But as s...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @Abhi0607, Can you please confirm whether you are reading or defining these parameter values outside the try/catch block or inside it?

1 More Replies
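The reason the reply asks where the values are defined: a name first assigned inside a try block simply does not exist if the exception fires before the assignment runs. A hypothetical plain-Python sketch of the safe pattern for an ADF-supplied parameter (function and parameter names are illustrative):

```python
def parse_param(raw, default=None):
    """Coerce a notebook parameter to int, leaving the name
    defined even when the incoming value is missing or bad."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        value = default  # fallback keeps `value` defined after the except
    return value
```

Reading the widget/parameter into a variable before the try block (or assigning a fallback in every except branch, as above) guarantees the name is bound no matter which path executes.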
PrebenOlsen
by New Contributor III
  • 171 Views
  • 2 replies
  • 0 kudos

Job stuck while utilizing all workers

Hi! Started a job yesterday. It was iterating over data, 2 months at a time, and writing to a table. It successfully did this for 4 out of 6 time periods. The 5th time period, however, got stuck 5 hours in. I can find one failed stage that reads ...

Data Engineering
job failed
Job froze
need help
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

As Spark is lazily evaluated, using only small clusters for reads and large ones for writes is not something that will happen. The data is read when you apply an action (a write, for example). That being said: I have no knowledge of a bug in Databricks on clusters...

1 More Replies
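The lazy-evaluation point can be illustrated with a plain-Python generator (an analogy only, not Spark itself): nothing is "read" until a terminal operation consumes the pipeline, just as Spark reads data only when an action such as a write runs.

```python
def read_rows(n):
    # Transformation analogue: defines the work, performs none of it yet
    for i in range(n):
        yield i

rows = read_rows(3)   # no rows produced yet (lazy, like a DataFrame plan)
total = sum(rows)     # the "action": iteration forces the read
```

This is why the read and the write cannot run on differently sized clusters: the read only actually happens as part of executing the action.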
laurenskuiper97
by New Contributor
  • 178 Views
  • 1 replies
  • 0 kudos

JDBC / SSH-tunnel to connect to PostgreSQL not working on multi-node clusters

Hi everybody, I'm trying to set up a connection between Databricks notebooks and an external PostgreSQL database through an SSH tunnel. On a single-node cluster, this works perfectly fine. However, when it is run on a multi-node cluster, this co...

Data Engineering
clusters
JDBC
spark
SSH
Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

I doubt it is possible. The driver runs the program and sends tasks to the executors. But since creating the SSH tunnel is not a Spark task, I don't think it will be established on any executor.
