Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mkEngineer
by New Contributor III
  • 1129 Views
  • 2 replies
  • 0 kudos

How to Version & Deploy Databricks Workflows with Azure DevOps (CI/CD)?

Hi everyone, I’m trying to set up versioning and CI/CD for my Databricks workflows using Azure DevOps and Git. While I’ve successfully versioned notebooks in a Git repo, I’m struggling with handling workflows (which define orchestration, dependencies,...

Latest Reply
mkEngineer
New Contributor III
  • 0 kudos

As of now, my current approach is to manually copy/paste YAMLs across workspaces and version them using Git/Azure DevOps by saving them as DBFS files. The CD process is then handled using Databricks DBFS File Deployment by Data Thirst Ltd. While this ...

1 More Replies
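Instead of copying YAMLs by hand, one option is to export each job's definition through the Jobs 2.1 REST API and commit the JSON to the repo from an Azure DevOps pipeline. A minimal sketch, assuming a workspace host, a PAT, and a known job ID (all placeholders here); Databricks Asset Bundles are the current recommended path for this, but the raw API keeps the workflow fully scriptable:

```python
import json
import urllib.request
from pathlib import Path

def fetch_job_settings(host: str, token: str, job_id: int) -> dict:
    """Fetch a job's settings from the Jobs 2.1 API (host/token/job_id are placeholders)."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/get?job_id={job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["settings"]

def save_job_settings(settings: dict, path: str) -> Path:
    """Write job settings as pretty-printed, key-sorted JSON so Git diffs stay readable."""
    out = Path(path)
    out.write_text(json.dumps(settings, indent=2, sort_keys=True))
    return out
```

In the CD stage, the same payload can be pushed to the target workspace with `POST /api/2.1/jobs/reset`, which avoids DBFS entirely.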
BillBishop
by New Contributor III
  • 285 Views
  • 2 replies
  • 0 kudos

DAB for_each_task python wheel fail

Using the experimental python_wheel_wrapper: true setting allows me to use a python_wheel_task on an older cluster. However, if I embed the python_wheel_task in a for_each_task, it fails at runtime with: "Library installation failed for library due to user error.  Er...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @BillBishop, I will check on this internally, as the outcome does not seem correct. If possible, upgrade your cluster to DBR 14.1 or later; this should resolve the issue without relying on the experimental feature.

1 More Replies
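For reference, the nesting the thread describes looks like this when expressed as a Jobs API 2.1 payload: the wheel task sits inside the for_each_task's `task` field and receives the per-iteration value via `{{input}}`. A sketch only; the task keys, package, and entry point are illustrative names, not values from the thread:

```python
def for_each_wheel_task(task_key: str, inputs: str, package: str, entry_point: str) -> dict:
    """Build a Jobs API 2.1 task that runs a python_wheel_task once per input.

    `inputs` is a JSON array string or a task-value reference such as
    "{{tasks.generate.values.items}}"; all names here are illustrative.
    """
    return {
        "task_key": task_key,
        "for_each_task": {
            "inputs": inputs,
            "task": {
                "task_key": f"{task_key}_iteration",
                "python_wheel_task": {
                    "package_name": package,
                    "entry_point": entry_point,
                    # {{input}} resolves to the current iteration's value
                    "parameters": ["{{input}}"],
                },
            },
        },
    }
```

Rendering the payload explicitly like this makes it easier to compare what a DAB deployment actually produced (via `GET /api/2.1/jobs/get`) against what was intended.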
Yuppp
by New Contributor
  • 1084 Views
  • 0 replies
  • 0 kudos

Need help with setting up ForEach task in Databricks

Hi everyone, I have a workflow involving two notebooks: Notebook A and Notebook B. At the end of Notebook A, we generate a variable number of files; let's call it N. I want to run Notebook B for each of these N files. I know Databricks has a Foreach ta...

Labels: Data Engineering, ForEach, Workflows
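A common pattern for a variable-length ForEach input (a sketch; the key and task names are assumptions): Notebook A publishes the file list as a task value, and the ForEach task's Inputs field references it as `{{tasks.notebook_a.values.files}}`. Since task values should be JSON-serializable, serializing explicitly keeps the hand-off predictable:

```python
import json

def files_as_task_value(files: list) -> str:
    """Serialize a file list for publishing as a job task value.

    In Notebook A (inside Databricks, not runnable locally) you would call:
        dbutils.jobs.taskValues.set(key="files", value=files_as_task_value(files))
    then set the ForEach task's Inputs to {{tasks.notebook_a.values.files}},
    and read the per-iteration value in Notebook B via a task parameter.
    """
    return json.dumps(files)
```

Each iteration of the inner task then receives one element as `{{input}}`, which Notebook B can accept as a widget/parameter.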
rushi29
by New Contributor III
  • 2471 Views
  • 5 replies
  • 0 kudos

sparkContext in Runtime 15.3

Hello All, Our Azure Databricks cluster is running under the "Legacy Shared Compute" policy with the 15.3 runtime. One of the Python notebooks is used to connect to an Azure SQL database to read/insert data. The following snippet of code is responsible for r...

Latest Reply
jayct
New Contributor II
  • 0 kudos

@rushi29 @Gangster I ended up implementing pyodbc with the mssql driver using init scripts. The Spark context is no longer usable on shared compute, so that was the only approach we could take.

4 More Replies
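The pyodbc approach from the reply can be sketched as below. This assumes the "ODBC Driver 18 for SQL Server" (msodbcsql18) and pyodbc are installed on the cluster via an init script, and uses `Authentication=ActiveDirectoryMsi` for managed identity; server and database names are placeholders:

```python
def azure_sql_conn_str(server: str, database: str, client_id=None) -> str:
    """Build an ODBC connection string for Azure SQL using Entra ID managed identity.

    Assumes msodbcsql18 is installed on the cluster; pass client_id for a
    user-assigned identity (left out for a system-assigned one).
    """
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server=tcp:{server},1433",
        f"Database={database}",
        "Encrypt=yes",
        "Authentication=ActiveDirectoryMsi",
    ]
    if client_id:
        parts.append(f"UID={client_id}")
    return ";".join(parts)

def connect(server: str, database: str):
    """Open the connection; pyodbc is imported lazily so the module loads anywhere."""
    import pyodbc  # requires the ODBC driver + pyodbc wheel on the cluster
    return pyodbc.connect(azure_sql_conn_str(server, database))
```

Cursor operations (`cursor.execute`, `fetchall`) then replace the old sparkContext-based JDBC path on shared compute.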
ila-de
by New Contributor III
  • 1251 Views
  • 7 replies
  • 1 kudos

Resolved! databricks workspace import_dir not working without any failure message

Morning everyone! I'm trying to copy all the notebooks from the repo into the Databricks workspace. I'm using the command databricks workspace import_dir . /Shared/Notebooks, but it just prints all the info regarding the Workspace API. If I launch dat...

Latest Reply
ila-de
New Contributor III
  • 1 kudos

Hi all, I've uninstalled and reinstalled databricks-cli and now it works. It's not a real solution, but it worked after one week...

6 More Replies
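Silent failures like this one are easier to catch by invoking the CLI with explicit error checking, so a non-zero exit code or stderr output is never swallowed. A small wrapper sketch (the import_dir invocation in the docstring is the legacy databricks-cli command from the thread):

```python
import subprocess

def run_cli(args: list) -> str:
    """Run a CLI command and raise with its stderr if it exits non-zero.

    Example (legacy databricks-cli, as used in the thread):
        run_cli(["databricks", "workspace", "import_dir", ".", "/Shared/Notebooks"])
    """
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"{args[0]} failed: {result.stderr.strip()}")
    return result.stdout
```

Running the same command with this wrapper (or just checking `echo $?` after the bare CLI call) would have surfaced whether the import was actually failing or exiting cleanly without doing anything.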
p_romm
by New Contributor III
  • 297 Views
  • 1 reply
  • 0 kudos

Autoloader is not able to infer schema from json

Hi, I have JSON files containing a JSON array with only one object (payload below). I have set inferSchema to true in Auto Loader; however, Auto Loader throws: "Failed to infer schema for format json from existing files ..." I have also checked the option to ...

Latest Reply
p_romm
New Contributor III
  • 0 kudos

Yep, my mistake; the JSON file was corrupted.

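Since a single corrupted file can break schema inference for the whole directory, a quick pre-check that parses each file and reports the bad ones can save a lot of debugging time. A pure-Python sketch (the flat `*.json` layout is an assumption; adjust the glob for nested paths):

```python
import json
from pathlib import Path

def find_corrupt_json(directory: str) -> list:
    """Return paths of *.json files in a directory that fail to parse."""
    bad = []
    for path in Path(directory).glob("*.json"):
        try:
            json.loads(path.read_text())
        except (json.JSONDecodeError, UnicodeDecodeError):
            bad.append(str(path))
    return bad
```

Quarantining the files this reports, then re-running Auto Loader, separates "my schema is hard to infer" from "one of my files is broken".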
akuma643
by New Contributor II
  • 1003 Views
  • 1 reply
  • 0 kudos

The authentication value "ActiveDirectoryManagedIdentity" is not valid.

Hi Team, I am trying to connect to a SQL Server hosted in an Azure VM using Entra ID authentication from Databricks ("authentication", "ActiveDirectoryManagedIdentity"). Below is the notebook script I am using: driver = "com.microsoft.sqlserver.jdbc.SQLServe...

Latest Reply
akuma643
New Contributor II
  • 0 kudos

Can anyone help me out on this, please?

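For anyone hitting this: the "authentication value is not valid" error often means the mssql-jdbc version on the cluster predates support for ActiveDirectoryManagedIdentity, so installing a recent driver as a cluster library is worth trying (an assumption to verify against the driver's release notes). A sketch of the Spark JDBC options, with a placeholder server name:

```python
def entra_mi_jdbc_options(server: str, database: str, msi_client_id=None) -> dict:
    """Options for spark.read.format("jdbc") using Entra ID managed identity.

    Usage inside Databricks (not runnable locally):
        df = (spark.read.format("jdbc")
              .options(**entra_mi_jdbc_options("srv.database.windows.net", "mydb"))
              .option("dbtable", "dbo.my_table")
              .load())
    """
    opts = {
        "url": f"jdbc:sqlserver://{server}:1433;databaseName={database};encrypt=true",
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "authentication": "ActiveDirectoryManagedIdentity",
    }
    if msi_client_id:
        opts["msiClientId"] = msi_client_id  # only for a user-assigned identity
    return opts
```

If upgrading the driver isn't possible, the pyodbc route described in the sparkContext thread above is a workable fallback.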
cmunteanu
by Contributor
  • 429 Views
  • 2 replies
  • 0 kudos

External connection to Azure ADLS Gen2 storage

Hello, I have a problem trying to make an external connection to a blob storage account configured as ADLS Gen2 with hierarchical namespace (HNS) enabled. I have set up the storage account with a container with HNS enabled, as in the image attached. Next I hav...

Latest Reply
hao_hu
New Contributor II
  • 0 kudos

Hi, would it work if you try removing "landing" at the end? The error seems to be complaining that the external location should be a directory.

1 More Replies
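Building the abfss URL with a small helper makes the reply's point concrete: an external location must point at a directory (container root or a folder ending in "/"), not a file-like path. A sketch with hypothetical container/account names:

```python
def abfss_url(container: str, storage_account: str, directory: str = "") -> str:
    """Build an abfss:// URL for an ADLS Gen2 external location.

    Pass directory="" for the container root; any directory is normalized to
    end with "/" so the location always refers to a directory.
    """
    base = f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    return base + (directory.strip("/") + "/" if directory else "")
```

The returned URL is what goes into `CREATE EXTERNAL LOCATION ... URL '<abfss_url>'` in Unity Catalog.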
Splush_
by New Contributor III
  • 363 Views
  • 1 reply
  • 0 kudos

Resolved! Hostname not resolving using Spark JDBC

Hey guys, I've run into a weird error this morning. Last week I was testing a new Oracle connector and it ran smoothly the whole week! This morning at 7 it ran again and showed a "SQLRecoverableException: IO Error: Unknown h...

Latest Reply
Splush_
New Contributor III
  • 0 kudos

I have even cloned the cluster and it worked on the new one. But after turning the cluster off overnight, it started working again the next morning. This is really weird.

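Transient "Unknown host" errors like this usually point at DNS resolution on the driver rather than at the connector. A small probe (sketch) run at the start of the job can confirm whether resolution is flaky before blaming the JDBC layer:

```python
import socket
import time

def resolve_with_retry(hostname: str, attempts: int = 3, delay_s: float = 2.0) -> str:
    """Try to resolve a hostname a few times; return the IPv4 address or re-raise
    the last DNS error so the job fails with a clear cause."""
    last_err = None
    for _ in range(attempts):
        try:
            return socket.gethostbyname(hostname)
        except socket.gaierror as err:
            last_err = err
            time.sleep(delay_s)
    raise last_err
```

If this probe fails intermittently for the Oracle host while other names resolve, the issue is the VNet/DNS configuration of the cluster, not the connector.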
kenmyers-8451
by New Contributor II
  • 1006 Views
  • 1 reply
  • 0 kudos

Long runtimes on simple copying of data

Hi, my team has been trying to identify areas where we can improve our processes. We have some long runtimes on processes that have multiple joins and aggregations. To create a baseline, we have been running tests on a simple select-and-write operation...

Latest Reply
kenmyers-8451
New Contributor II
  • 0 kudos

I ran another test this week where I changed my source table to not have deletion vectors, and I no longer believe that step is the limiting factor. Without it, the compute times seemed to be as follows: reading in data + wscg + exchange = 7.9 hours, ~1/3 wal...

shubham_007
by Contributor II
  • 899 Views
  • 0 replies
  • 0 kudos

Dear experts, need urgent help on logic.

Dear experts, I am facing difficulty developing PySpark automation logic to delete/remove the display() and cache() methods used in scripts across multiple Databricks notebooks (tasks). Kindly advise on developing automati...

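One way to approach this automation is a regex-based source transformer: fetch each notebook's source via the Workspace export API, rewrite it, and re-import. A deliberately simple sketch that handles single-line calls only; multi-line `display(...)` calls or `cache()` with arguments would need a real parser (e.g. `ast`) instead:

```python
import re

# A whole line consisting of display(...), and any bare .cache() call.
DISPLAY_LINE = re.compile(r"^\s*display\s*\(.*\)\s*$")
CACHE_CALL = re.compile(r"\.cache\s*\(\s*\)")

def strip_debug_calls(source: str) -> str:
    """Remove display(...) lines and .cache() calls from notebook source text."""
    kept = [line for line in source.splitlines() if not DISPLAY_LINE.match(line)]
    return "\n".join(CACHE_CALL.sub("", line) for line in kept)
```

Running this over an exported notebook and diffing the result before re-importing keeps the change reviewable.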
ashraf1395
by Honored Contributor
  • 1209 Views
  • 3 replies
  • 1 kudos

Resolved! How to capture dlt pipeline id / name using dynamic value reference

Hi there, I have a use case where I want to set the DLT pipeline ID in the configuration parameters of that DLT pipeline, the way we can use workspace IDs or task IDs in a notebook task (task_id = {{task.id}} / {{task.name}}) and save them as parameters a...

Latest Reply
ashraf1395
Honored Contributor
  • 1 kudos

Hi @mourakshit, I tried all three methods you mentioned; none of them worked. method_1 returned pipeline_name or pipeline_id as the printed value ({{dlt_pipeline.name}} {{dlt_pipeline.id}}), not the actual values. method_2 returned no conf like spark.da...

2 More Replies
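One avenue worth checking: inside a running DLT pipeline, the pipeline ID is reported to be readable from the Spark conf under a key like `pipelines.id` (an assumption to verify on your runtime version). A helper that takes the conf getter as a parameter can be tested outside Databricks and tried in the pipeline notebook as `get_pipeline_id(spark.conf.get)`:

```python
def get_pipeline_id(conf_get, keys=("pipelines.id", "pipelines.pipelineId")):
    """Return the DLT pipeline ID from Spark conf, trying candidate keys in order.

    The key names are assumptions to verify; conf_get is spark.conf.get inside
    a pipeline, or any callable mapping key -> value for testing.
    """
    for key in keys:
        try:
            value = conf_get(key)
        except Exception:
            continue  # key not set on this runtime; try the next candidate
        if value:
            return value
    return None
```

If neither key is populated, listing the full conf (where permitted) and grepping for "pipeline" is a quick way to find the runtime's actual key name.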
Dominos
by New Contributor II
  • 479 Views
  • 4 replies
  • 0 kudos

Does DBR 14.3 not support Describe history command?

Hello, We have recently updated the DBR version from 9.1 LTS to 14.3 LTS and observed that DESCRIBE HISTORY is not supported in 14.3 LTS. Could you please suggest an alternative for viewing table history?

Latest Reply
holly
Databricks Employee
  • 0 kudos

Hi, I'm still not able to recreate this issue with Standard_DS3_v2. I'm not sure if this is relevant, but do you also see the issue on an old High Concurrency cluster with custom access mode using Standard_DS3_v2?

3 More Replies