Data Engineering

Forum Posts

baatchus
by New Contributor III
  • 7254 Views
  • 6 replies
  • 2 kudos

Resolved! Call Databricks notebook in a specific branch from Azure Data Factory?

I'm using the new Databricks Repos functionality, and in the Azure Data Factory UI for the notebook activity you can browse the Databricks workspace and select Repos > username > project > folder > notebook. Is it possible to call a Databricks notebook in ...

Latest Reply
Maksym
New Contributor III
  • 2 kudos

Greetings, I have a similar problem. Did you try using Databricks Workflows instead of scheduling through Data Factory? Inside Workflows it is possible to select a specific branch, so it may actually work. What do you think?
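For reference, a minimal sketch of what that could look like through the Jobs API 2.1 (which backs Workflows): the job's git_source pins a branch, so each run checks the notebook out from that branch. The repo URL, branch name, and cluster settings below are hypothetical placeholders, not taken from this thread.

```python
import os
import requests

# Hypothetical workspace URL (e.g. https://adb-123.azuredatabricks.net) and PAT.
host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

payload = {
    "name": "branch-pinned-notebook-job",
    "git_source": {                          # the job checks its source out of git...
        "git_url": "https://github.com/my-org/project",
        "git_provider": "gitHub",
        "git_branch": "feature/my-branch",   # ...from this specific branch
    },
    "tasks": [{
        "task_key": "run_notebook",
        "notebook_task": {"notebook_path": "folder/notebook"},  # path inside the repo
        "new_cluster": {
            "spark_version": "11.3.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 1,
        },
    }],
}

resp = requests.post(
    f"{host}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json())  # -> {"job_id": ...}; trigger it from ADF or a Workflows schedule
```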

5 More Replies
nk76
by New Contributor III
  • 3995 Views
  • 11 replies
  • 5 kudos

Resolved! Custom library import fails randomly with error: not found: value it

Hello, I have an issue with the import of a custom library in Azure Databricks. Roughly 95% of the time it works fine, but sometimes it fails. I searched the internet and this community with no luck so far. It is a Scala library in a Scala notebook,...

Latest Reply
Naskar
New Contributor II
  • 5 kudos

I also encountered the same error. While importing a file I get: "Import failed with error: Could not deserialize: Exceeded 16777216 bytes (current = 16778609)" (16777216 bytes is the 16 MB limit).

10 More Replies
B_Seibert
by New Contributor III
  • 1145 Views
  • 3 replies
  • 9 kudos

Restore Delta table after adding columns.

At version 3 of our Delta Lake table we added a column. We later restored from version 11 back to version 10, which is now the most current version. But now when we run the table build from Azure Data Factory (ADF) on the full history of the data, we...

Latest Reply
B_Seibert
New Contributor III
  • 9 kudos

ignoreDeletes works. But I recommend to other developers that you think through all of the schema-change scenarios and solve the problem above as part of a complete solution to every schema-change scenario, instead of dealing with it as a one...
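For context, a minimal sketch of the ignoreDeletes fix, assuming the downstream build reads the table as a Delta stream (paths are hypothetical): the option lets the stream continue past delete-only commits, such as those a RESTORE produces.

```python
# Runs in a Databricks notebook, where `spark` is the ambient SparkSession.
stream = (
    spark.readStream.format("delta")
    .option("ignoreDeletes", "true")   # tolerate delete-only commits (e.g. from RESTORE)
    .load("/mnt/delta/my_table")       # hypothetical source table
)

(stream.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/my_table")  # hypothetical path
    .start("/mnt/delta/my_table_out"))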

2 More Replies
labromb
by Contributor
  • 633 Views
  • 0 replies
  • 0 kudos

Capturing notebook return codes in Databricks jobs

Hi, I currently run a number of notebook jobs from Azure Data Factory. A new requirement has come up where I need to capture in ADF a return code that has been generated from the notebook. I tried using dbutils.notebook.exit(json.dumps({"return_v...
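For anyone landing here, a minimal sketch of the pattern the post describes, assuming the ADF activity is named Notebook1 (hypothetical): the JSON exit value surfaces in ADF as @activity('Notebook1').output.runOutput.

```python
import json

# Compute whatever status the pipeline needs; this payload is illustrative.
result = {"return_value": 0, "rows_processed": 1234}

# `dbutils` is ambient in Databricks notebooks; the string passed here becomes
# the notebook activity's runOutput in Azure Data Factory.
dbutils.notebook.exit(json.dumps(result))
```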

StephanieRivera
by Valued Contributor II
  • 5041 Views
  • 7 replies
  • 10 kudos

Resolved! How do I kick off Azure Data Factory from within Databricks?

I want to kick off ingestion in ADF from Databricks. When ADF ingestion is done, my DBX bronze-silver-gold pipeline follows within DBX. I see it is possible to call Databricks notebooks from ADF. Can I also go the other way? I want to start the ingest...
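One possible answer, sketched rather than taken from this thread: ADF exposes a createRun REST endpoint, so a notebook can start a pipeline with an Azure AD token. The subscription, resource group, factory, and pipeline names below are placeholders.

```python
import requests

# Acquire an AAD token for https://management.azure.com (e.g. via azure-identity);
# shown as a placeholder here.
token = "<AAD-bearer-token>"

url = (
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>/providers/Microsoft.DataFactory"
    "/factories/<factory-name>/pipelines/<pipeline-name>/createRun"
    "?api-version=2018-06-01"
)

resp = requests.post(url, headers={"Authorization": f"Bearer {token}"}, json={})
resp.raise_for_status()
print(resp.json())  # -> {"runId": "..."}, which can then be polled for completion
```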

Latest Reply
Kaniz
Community Manager
  • 10 kudos

Hi @Stephanie Rivera, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if you have a resolution yet. If you have any solution, please share it with the community as it can be helpful to othe...

6 More Replies
sage5616
by Valued Contributor
  • 4652 Views
  • 2 replies
  • 3 kudos

Resolved! Running local python code with arguments in Databricks via dbx utility.

I am trying to execute a local PySpark script on a Databricks cluster via the dbx utility to test how passing arguments to Python works in Databricks when developing locally. However, the test arguments I am passing are not being read for some reason. Co...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

You can pass parameters using dbx launch --parameters. If you want to define them in the deployment template, please try to follow exactly the Databricks API 2.1 schema https://docs.databricks.com/dev-tools/api/latest/jobs.html#operation/JobsCreate (for examp...
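As a sanity check on the receiving side, a minimal sketch of an entrypoint that actually consumes such parameters: whatever dbx passes through the job spec's parameters list arrives on sys.argv. The argument names below are hypothetical.

```python
import argparse

def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", default="dev")        # hypothetical argument
    parser.add_argument("--run-date", required=False)  # hypothetical argument
    args = parser.parse_args()  # parameters from the job spec land in sys.argv
    print(f"running with env={args.env}, run_date={args.run_date}")

if __name__ == "__main__":
    main()
```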

1 More Replies
fsm
by New Contributor II
  • 4745 Views
  • 5 replies
  • 2 kudos

Resolved! Implementation of a stable Spark Structured Streaming Application

Hi folks, I have an issue. It's not critical, but it's annoying. We have implemented a Spark Structured Streaming application. This application is triggered via Azure Data Factory (every 8 minutes). OK, this setup sounds a little bit weird, and it's no...
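A hedged sketch of one common shape for this setup (not the poster's actual code; paths are hypothetical): run the stream with a one-shot trigger each time ADF fires, so the job processes whatever is available and exits cleanly.

```python
# `spark` is the ambient SparkSession in a Databricks notebook.
(spark.readStream.format("delta")
    .load("/mnt/bronze/events")                               # hypothetical source
    .writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/events")  # hypothetical path
    .trigger(once=True)   # process available data, stop, and wait for the next ADF run
    .start("/mnt/silver/events")
    .awaitTermination())
```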

Latest Reply
User16869510359
Esteemed Contributor
  • 2 kudos

@Markus Freischlad Looks like the Spark driver was stuck. It would be good to capture a thread dump of the Spark driver to understand which operation is stuck.

4 More Replies
RicksDB
by Contributor II
  • 2388 Views
  • 6 replies
  • 6 kudos

Resolved! SingleNode all-purpose cluster for small ETLs

Hi, I have many "small" jobs that need to be executed quickly and at a predictable low cost from several Azure Data Factory pipelines. For this reason, I configured a small single-node cluster to execute those processes. For the moment, everything se...
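For reference, a sketch of a single-node job-cluster spec in Clusters-API terms (the runtime version and node type are placeholders): the singleNode profile, the ResourceClass tag, and num_workers=0 are what make it single node.

```python
# Usable as the "new_cluster" block of a job definition.
single_node_cluster = {
    "spark_version": "11.3.x-scala2.12",      # placeholder runtime
    "node_type_id": "Standard_DS3_v2",        # placeholder VM size
    "num_workers": 0,                         # driver only, no workers
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}
```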

Latest Reply
RicksDB
Contributor II
  • 6 kudos

@Bilal Aslam In my case, it usually depends on the customers and their SLA. Most of them do not have a "true" high-SLA requirement and thus prefer the jobs to be throttled when the actual cost is within a certain range of the budget instead of ...

5 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 868 Views
  • 2 replies
  • 18 kudos

I thought that Azure Data Factory was built on Spark, but now that I crashed it I see that it is built directly on Databricks :-)

I thought that Azure Data Factory was built on Spark, but now that I crashed it I see that it is built directly on Databricks.

Latest Reply
-werners-
Esteemed Contributor III
  • 18 kudos

Correct. That's because Data Flows were available before Microsoft's own Spark pools were. But let's be honest: that is only a good thing.

1 More Replies
JK2021
by New Contributor III
  • 4762 Views
  • 10 replies
  • 5 kudos

Resolved! An unidentified special character is added to the outbound file when transformed in Databricks. Any suggestions?

Data from an external source is copied to ADLS, which then gets picked up by Databricks; this massaged data is put in the outbound file. A special character ? (question mark in a black diamond) is seen in some fields in the outbound file, which may br...

Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

Are you sure it is Databricks that puts the special character in place? It could also have happened during the copy from the external system to ADLS. If you use Azure Data Factory, for example, you have to define the encoding (UTF-8 or UTF-16, ...)
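To illustrate the encoding point, a hedged sketch with hypothetical paths: the black-diamond question mark is typically U+FFFD, the Unicode replacement character, which appears when bytes are decoded with the wrong charset. Being explicit on both read and write avoids the silent mismatch.

```python
# `spark` is the ambient SparkSession in a Databricks notebook.
df = (spark.read
      .option("encoding", "UTF-8")   # must match what the producer actually wrote
      .option("header", "true")
      .csv("/mnt/adls/inbound/file.csv"))

(df.write
   .option("encoding", "UTF-8")      # be explicit on the outbound side too
   .option("header", "true")
   .mode("overwrite")
   .csv("/mnt/adls/outbound/"))
```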

9 More Replies
MarcoCaviezel
by New Contributor III
  • 3178 Views
  • 6 replies
  • 3 kudos

Resolved! Use Spot Instances with Azure Data Factory Linked Service

In my pipeline I'm using Azure Data Factory to trigger Databricks notebooks as a linked service. I want to use spot instances for my job clusters. Is there a way to achieve this? I didn't find a way to do this in the GUI. Thanks for your help! Marco
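One possible route, sketched under assumptions the thread does not confirm (the ADF linked-service GUI may not expose this at all): on the Jobs/Clusters API, a new cluster can request Azure spot VMs through azure_attributes, so a job cluster defined that way can be invoked from ADF. All values below are placeholders.

```python
# Usable as the "new_cluster" block of a job definition (values are placeholders).
spot_cluster = {
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "azure_attributes": {
        "availability": "SPOT_WITH_FALLBACK_AZURE",  # fall back to on-demand on eviction
        "first_on_demand": 1,                        # keep the driver on-demand
        "spot_bid_max_price": -1,                    # -1 = cap at the on-demand price
    },
}
```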

Latest Reply
MarcoCaviezel
New Contributor III
  • 3 kudos

Hi @Werner Stinckens, just a quick follow-up question. Does it make sense to you that you can select the following options in Azure Data Factory? To my understanding, "cluster version", "Python Version" and the "Worker options" are defined when I crea...

5 More Replies
StephanieRivera
by Valued Contributor II
  • 1119 Views
  • 1 replies
  • 0 kudos

Is it possible to turn off the redaction of secrets? Is there a better way to solve this?

As part of our Azure Data Factory pipeline, we utilize Databricks to run some scripts that identify which files we need to load from a certain source. This list of files is then passed back into Azure Data Factory utilizing the Exit status from the n...

Latest Reply
StephanieRivera
Valued Contributor II
  • 0 kudos

No, it is not possible to turn off redaction, and no, there is not another way to return values from a notebook. 1) Using a native Databricks feature such as Auto Loader is suggested. 2) They could write the list of files to be processed to a Delta table an...
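A minimal sketch of option 2 from the reply, with a hypothetical table name: persist the file list to a Delta table instead of returning it through the (redacted) exit value, and let the next step read it from there.

```python
# `spark` is the ambient SparkSession in a Databricks notebook.
files = [("/mnt/src/a.csv",), ("/mnt/src/b.csv",)]   # illustrative discovery result

(spark.createDataFrame(files, ["path"])
     .write.format("delta")
     .mode("overwrite")
     .saveAsTable("etl.files_to_load"))              # hypothetical table name

# Downstream, ADF or the next notebook reads it back:
# spark.table("etl.files_to_load")
```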

Ryan_Chynoweth
by Honored Contributor III
  • 1661 Views
  • 1 replies
  • 1 kudos
Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 1 kudos

Yes, Azure Data Factory can execute code on Azure Databricks. The best way to return values from the notebook to Data factory is to use the dbutils.notebook.exit() function at the end of your notebook or whenever you want to terminate execution.

Yogi
by New Contributor III
  • 6966 Views
  • 15 replies
  • 0 kudos

Resolved! Can we pass Databricks output to Azure function body?

Hi, can anyone help me with Databricks and Azure Functions? I'm trying to pass Databricks JSON output to an Azure Function body in an ADF job. Is it possible? If yes, how? If no, what is an alternative way to do the same?

Latest Reply
AbhishekNarain_
New Contributor III
  • 0 kudos

You can now pass values back to ADF from a notebook. @Yogi Though there is a size limit, so if you are passing a dataset larger than 2 MB, rather write it to storage and consume it directly with Azure Functions. You can pass the file path/refe...
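A sketch of the suggested pattern, with hypothetical paths: write the large result to storage and hand ADF (and from there the Azure Function) only a small pointer via dbutils.notebook.exit.

```python
import json

# Illustrative large result that would blow past the ~2 MB exit-value limit.
big_result = {"files": [f"/mnt/src/part-{i}.csv" for i in range(10_000)]}

output_path = "/mnt/adls/exchange/run_1234/result.json"   # hypothetical location
dbutils.fs.put(output_path, json.dumps(big_result), True)  # True = overwrite

# ADF receives only the pointer and can forward it in the Azure Function body.
dbutils.notebook.exit(json.dumps({"result_path": output_path}))
```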

14 More Replies