Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

KKo
by Contributor III
  • 593 Views
  • 1 reply
  • 0 kudos

On Prem MS sql to Azure Databricks

Hi all, I need to ingest data from on-prem MS SQL tables using Databricks to Azure Cloud. For the ingest, I previously used notebooks with JDBC connectors to read the SQL tables and write to Unity Catalog tables. Now I want to experiment with Databricks connectors f...

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

This feature is good to go... I can't think of any disadvantages. Here is a guide: https://landang.ca/2025/01/31/simple-data-ingestion-from-sql-server-to-databricks-using-lakeflow-connect/

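For reference, the notebook-plus-JDBC approach the question describes can be sketched as below. The host, database, table, and credential values are placeholders, and the final read/write is shown as a comment because it needs a live Spark session:

```python
# Sketch of JDBC ingestion options for SQL Server (placeholder names throughout).
def sqlserver_jdbc_options(host: str, database: str, table: str,
                           user: str, password: str, port: int = 1433) -> dict:
    """Build the options a notebook would pass to spark.read.format('jdbc')."""
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database};encrypt=true",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }

opts = sqlserver_jdbc_options("onprem-sql.example.com", "sales", "dbo.orders",
                              "ingest_user", "<secret>")

# In a Databricks notebook (not runnable here):
# df = spark.read.format("jdbc").options(**opts).load()
# df.write.mode("overwrite").saveAsTable("main.bronze.orders")
```

The Lakeflow Connect route in the reply replaces this hand-rolled wiring with a managed ingestion pipeline.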
Suheb
by Contributor
  • 257 Views
  • 1 reply
  • 0 kudos

How have you set up a governance structure (data access control, workspace management, cluster polic

If your company uses Databricks with many people, how do you manage security, organize teams, and control costs — and what tools do you use to make it all work smoothly?

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

Please take a look here to get some initial ideas. https://medium.com/databricks-unity-catalog-sme/a-practical-guide-to-catalog-layout-data-sharing-and-distribution-with-databricks-unity-catalog-763e4c7b7351  

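One concrete governance lever from the title is cluster policies. A minimal policy sketch (hypothetical limits; JSON shape follows the Databricks cluster-policy reference) might pin an LTS runtime and cap autoscaling and idle time:

```json
{
  "spark_version": { "type": "fixed", "value": "15.4.x-scala2.12" },
  "autoscale.max_workers": { "type": "range", "maxValue": 8 },
  "autotermination_minutes": { "type": "range", "minValue": 10, "maxValue": 60, "defaultValue": 30 }
}
```

Policies like this, assigned per team, are one common way to keep cluster costs and configurations under control.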
him
by New Contributor III
  • 27219 Views
  • 14 replies
  • 10 kudos

I am getting the below error while making a GET request to a job in Databricks after successfully running it

"error_code": "INVALID_PARAMETER_VALUE",  "message": "Retrieving the output of runs with multiple tasks is not supported. Please retrieve the output of each individual task run instead."}

Latest Reply
Octavian1
Contributor
  • 10 kudos

Hi @Debayan, I'd suggest also mentioning this explicitly in the documentation of the workspace client's get_run_output. One has to pay extra attention to the example run_id=run.tasks[0].run_id; otherwise it can easily be missed.

13 More Replies
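The fix the error message asks for is to fetch each task's run_id from the parent run and query the output per task. A small sketch of that extraction step; the sample payload below is illustrative, not a real API response:

```python
# Given the JSON returned by the Jobs API "get run" endpoint
# (GET /api/2.1/jobs/runs/get), collect the per-task run_ids so each
# task's output can be fetched individually via runs/get-output.
def task_run_ids(run: dict) -> dict:
    """Map task_key -> run_id for every task in a multi-task run."""
    return {t["task_key"]: t["run_id"] for t in run.get("tasks", [])}

sample_run = {
    "run_id": 1000,
    "tasks": [
        {"task_key": "extract", "run_id": 1001},
        {"task_key": "load", "run_id": 1002},
    ],
}
per_task = task_run_ids(sample_run)
print(per_task)  # {'extract': 1001, 'load': 1002}
```

With the Python SDK, the same idea is the reply's `run_id=run.tasks[0].run_id` pattern: pass a task-level run_id, not the parent run's, to get_run_output.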
alhuelamo
by New Contributor II
  • 11686 Views
  • 5 replies
  • 1 kudos

Getting non-traceable NullPointerExceptions

We're running a job that's issuing NullPointerException without traces of our job's code. Does anybody know what would be the best course of action when it comes to debugging these issues? The job is a Scala job running on DBR 11.3 LTS. In case it's rel...

Latest Reply
Amora
New Contributor II
  • 1 kudos

You could try enabling full stack traces and checking the Spark executor logs for hidden errors. NullPointerExceptions in Scala on DBR often come from lazy evaluations or missing schema fields during I/O. Reviewing your DataFrame transformations a...

4 More Replies
Phani1
by Databricks MVP
  • 5301 Views
  • 4 replies
  • 2 kudos

Convert EBCDIC (Binary) file format to ASCII

Hi Team, how can we convert the EBCDIC (binary) file format to ASCII in Databricks? Do we have any libraries in Databricks?

Latest Reply
amulight
Databricks Partner
  • 2 kudos

Hi Phani1, were you able to do that successfully? Can you share the details and steps, please? Thanks.

3 More Replies
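For plain character data, no extra library is needed: EBCDIC is a single-byte encoding, so conversion to ASCII is just a decode with one of Python's built-in EBCDIC codecs. A sketch (cp037 is the common US/Canada code page; a given mainframe file may use another, e.g. cp500 or cp1140):

```python
# "HELLO" encoded in EBCDIC code page 037
ebcdic_bytes = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])
text = ebcdic_bytes.decode("cp037")
print(text)  # HELLO

# For files: read binary, then decode (path is a placeholder):
# with open("/Volumes/main/raw/mainframe/input.ebc", "rb") as f:
#     text = f.read().decode("cp037")
```

Note this only covers text fields. Fixed-width COBOL records with packed-decimal (COMP-3) fields need a copybook-aware reader; the open-source Cobrix (spark-cobol) package is one option often used on Databricks for that case.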
67
by New Contributor
  • 339 Views
  • 1 reply
  • 1 kudos

Simple integration to push data from third-party into a client's Databricks instance

Hi there, we have an industry data platform with multiple customers using it. We provide each customer with their own data every night via .csv. Some of our customers use Databricks and import their data from us into it. We would like to offer a more...

Latest Reply
jeffreyaven
Databricks Employee
  • 1 kudos

You could use external volumes with a Cloudflare R2 bucket as an intermediary - you write the nightly data files to R2 (using S3-compatible API), and your customers create external volumes in their Databricks workspace pointing to their designated R2...

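On the customer side, the external-volume approach in the reply looks roughly like the sketch below (catalog, schema, and bucket names are hypothetical, and the volume's path must be covered by an external location the workspace admin has set up):

```sql
-- Customer creates a volume pointing at the path the vendor writes to:
CREATE EXTERNAL VOLUME main.ingest.vendor_dropbox
  LOCATION 's3://vendor-bucket/customer-123/';

-- Then reads the nightly CSVs directly:
SELECT * FROM read_files('/Volumes/main/ingest/vendor_dropbox/', format => 'csv');
```

Delta Sharing is the other common pattern for vendor-to-customer delivery and avoids the intermediary bucket entirely, at the cost of you hosting the shared tables.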
GiriSreerangam
by Databricks Partner
  • 563 Views
  • 2 replies
  • 1 kudos

Resolved! org.apache.spark.SparkRuntimeException: [UDF_USER_CODE_ERROR.GENERIC]

Hi everyone, I am writing a small function with a Spark read from a CSV and a Spark write into a table. I could execute this function within the notebook. But when I register the same function as a Unity Catalog function and call it from Playground, i...

Latest Reply
KaushalVachhani
Databricks Employee
  • 1 kudos

Hi @GiriSreerangam, You cannot use a Unity Catalog user-defined function (UDF) in Databricks to perform Spark read from a CSV and write to a table. Unity Catalog Python UDFs execute in a secure, isolated environment without access to the file system ...

1 More Replies
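To illustrate the reply's point: a Unity Catalog Python UDF must be self-contained pure computation (hypothetical example below); the Spark read from CSV and the table write belong in the notebook or job code that calls the function, not inside its body:

```sql
-- Allowed: pure computation, no Spark or file-system access in the body
CREATE OR REPLACE FUNCTION main.default.clean_code(code STRING)
RETURNS STRING
LANGUAGE PYTHON
AS $$
return code.strip().upper() if code else None
$$;
```

Anything that needs `spark.read`/`spark.write` should stay in regular notebook or workflow code.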
dheeraj98
by New Contributor II
  • 586 Views
  • 1 reply
  • 2 kudos

dbt Cloud + Databricks SQL Warehouse with microbatching (48h lookback) — intermittent failures

Hey everyone, I'm currently running an hourly dbt Cloud job (27 models with 8 threads) on a Databricks SQL Warehouse using the dbt microbatch approach, with a 48-hour lookback window. But I'm running into some recurring issues: jobs failing intermittently; O...

Latest Reply
nayan_wylde
Esteemed Contributor II
  • 2 kudos

Here are a few options you can try to see if they resolve your issue. 1. SQL Warehouse tuning: use a serverless SQL warehouse with Photon for faster spin-up and query execution [docs.getdbt.com]. Size appropriately: start with Medium or Large, and enable au...

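For context, the microbatch setup the question describes would look roughly like this model config (column and model names are hypothetical; per the dbt incremental docs, `lookback` counts batches, so 48 hourly batches gives the 48-hour window):

```sql
{{ config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='event_ts',
    batch_size='hour',
    lookback=48,
    begin='2024-01-01'
) }}

select * from {{ ref('stg_events') }}
```

A 48-batch lookback means every hourly run reprocesses 48 batches, which multiplies warehouse load and can explain the intermittent failures under 8 threads.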
tt_921
by New Contributor II
  • 460 Views
  • 2 replies
  • 0 kudos

Databricks CLI binding storage credential to a workspace

In the documentation from Databricks, it says to run the below for binding a storage credential to a workspace (after already completing step 1 to update the `isolation-mode` to `ISOLATED`): databricks workspace-bindings update-bindings storage-cre...

Latest Reply
AbhaySingh
Databricks Employee
  • 0 kudos

This appears to be a documentation inconsistency. The CLI implementation seems to:
1. Require binding_type to be explicitly specified (contradicting the docs)
2. Require it to be placed within each workspace object, not as a top-level parameter
...

1 More Replies
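Following the reply's reading of the CLI behavior, the `--json` payload would put `binding_type` inside each workspace object. A sketch of that shape (the workspace ID is a placeholder, and this reflects the thread's interpretation rather than the official docs):

```python
import json

# Payload for: databricks workspace-bindings update-bindings \
#   storage-credential <credential-name> --json '<payload>'
payload = {
    "add": [
        {
            "workspace_id": 1234567890,
            "binding_type": "BINDING_TYPE_READ_WRITE",
        }
    ]
}
print(json.dumps(payload))
```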
tgburrin-afs
by New Contributor
  • 11257 Views
  • 7 replies
  • 3 kudos

Limiting concurrent tasks in a job

I have a job with > 10 tasks in it that interacts with an external system outside of Databricks. At the moment that external system cannot handle more than 3 of the tasks executing concurrently. How can I limit the number of tasks that concurrently...

Latest Reply
_J
Databricks Partner
  • 3 kudos

You do something like:

E1        E4
E2   Z    E5   Z   ...
E3        E6

So Z does not actually do anything; it's just a funnel that waits for the 3 tasks at a time to complete ...

6 More Replies
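The funnel pattern above throttles at the task-graph level. If the work can instead live inside one task, a semaphore gives the same cap of 3 in code; a minimal sketch with illustrative names:

```python
import threading
import time

MAX_CONCURRENT = 3
gate = threading.BoundedSemaphore(MAX_CONCURRENT)
active, peak = 0, 0
lock = threading.Lock()

def call_external_system(item):
    """Do one unit of work; at most MAX_CONCURRENT calls run at once."""
    global active, peak
    with gate:                 # blocks while 3 calls are already in flight
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)       # stand-in for the real external call
        with lock:
            active -= 1

threads = [threading.Thread(target=call_external_system, args=(i,))
           for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)  # never exceeds 3
```

The `peak` counter is only there to demonstrate the cap; the semaphore alone is the mechanism.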
Adam_Borlase
by New Contributor III
  • 1859 Views
  • 4 replies
  • 2 kudos

Resolved! Error trying to edit Job Cluster via Databricks CLI

Good day all, after having issues with Cloud resources allocated to Lakeflow jobs and Gateways, I am trying to apply a policy to the cluster that is allocated to the job. I am very new to a lot of the Databricks platform and the administration, so all h...

Latest Reply
Adam_Borlase
New Contributor III
  • 2 kudos

Thank you so much, Louis. This has resolved all of our issues! Really appreciate the help.

3 More Replies
1GauravS
by Databricks Partner
  • 1717 Views
  • 2 replies
  • 0 kudos

Resolved! Ingesting Data from Event Hubs via Kafka API with Serverless Compute

Hi! I'm currently working on ingesting log data from Azure Event Hubs into Databricks. Initially, I was using a managed Databricks workspace, which couldn't access Event Hubs over a private endpoint. To resolve this, our DevOps team provisioned a VNet...

Latest Reply
1GauravS
Databricks Partner
  • 0 kudos

Hi @mark_ott, thanks for your response. I followed the documentation mentioned below to configure private connectivity with Azure resources, and was able to ingest logs using serverless compute. Having an NCC set up is the key here. https://learn.microsoft.co...

1 More Replies
hf-databricks
by New Contributor II
  • 472 Views
  • 2 replies
  • 0 kudos

Unable to create workspace

Hi Team, we have a challenge creating a workspace in a Databricks account created on top of AWS. Below are the details: Databricks account name: saichaitanya.vaddadhi@healthfirsttech.com's Lakehouse; AWS account ID: 720016114009; Databricks ID: 1ee8765f-b472-4...

Latest Reply
BS_THE_ANALYST
Databricks Partner
  • 0 kudos

@hf-databricks there's a quickstart guide for creating a workspace with AWS: https://docs.databricks.com/aws/en/admin/workspace/quick-start There's a list of requirements. There are more options for creating workspaces; above, I just listed the recommen...

1 More Replies
AniruddhaGI
by New Contributor II
  • 3465 Views
  • 3 replies
  • 1 kudos

Workspace allows dbf path to install in Databricks 16.4 LTS

Feature: library installation using requirements.txt on DB Runtime 16.4 LTS. Affected areas: workspace isolation, library management. Steps to reproduce: upload a wheel file to DBFS; put the requirements.txt file in the Workspace and put the DBFS path in require...

Data Engineering
library
Security
Workspace
Latest Reply
AniruddhaGI
New Contributor II
  • 1 kudos

I would like to know if workspace isolation is a priority, and whether only Databricks 14.3 and lower allow installation via DBFS. Why should requirements.txt allow you to install libraries or packages via a DBFS path? Could someone please explain why th...

2 More Replies
shubham_007
by Contributor III
  • 8111 Views
  • 6 replies
  • 3 kudos

Resolved! What are powerful data quality tools/libraries to build a data quality framework in Databricks?

Dear Community Experts, I need your expert advice and suggestions on the development of a data quality framework. Which powerful data quality tools or libraries are good choices for building a data quality framework in Databricks? Please guide, team. R...

Latest Reply
ChrisBergh-Data
New Contributor II
  • 3 kudos

Consider our open-source data quality tool, DataOps Data Quality TestGen. Our goal is to help data teams automatically generate 80% of the data tests they need with just a few clicks, while offering a nice UI for collaborating on the remaining 20% th...

5 More Replies
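Whatever tool is chosen, the checks a data quality framework automates boil down to a few recurring assertions. A generic illustration in plain Python (not the API of any tool in the thread; data and column names are made up):

```python
# Two classic checks: completeness (null rate) and uniqueness.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 3, "email": "c@example.com"},
]

def null_rate(rows, col):
    """Fraction of rows where `col` is null (completeness metric)."""
    return sum(1 for r in rows if r[col] is None) / len(rows)

def is_unique(rows, col):
    """True if no value of `col` repeats (primary-key check)."""
    vals = [r[col] for r in rows]
    return len(vals) == len(set(vals))

print(is_unique(rows, "id"))               # True
print(round(null_rate(rows, "email"), 2))  # 0.33
```

On Databricks the same checks typically run as Spark expectations (e.g. Lakeflow/DLT `expect` clauses) or through libraries like Great Expectations; the sketch only shows the underlying idea.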