Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

saurabh_aher
by New Contributor III
  • 578 Views
  • 8 replies
  • 4 kudos

Resolved! Databricks SQL CREATE FUNCTION - input table name as parameter and return the complete table

Hi, I am trying to create a Databricks SQL Unity Catalog function which will take table_name as an input parameter and return the full table as output. I am getting an error, kindly help: CREATE OR REPLACE FUNCTION catalog.schema.get_table( table_name STR...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor
  • 4 kudos

@szymon_dybczak I admire that you always find the appropriate information in the documentation. I will try my best to emulate this behaviour with other posts. @saurabh_aher great workaround suggestion with a stored procedure. Lots of lessons learned ...
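The stored-procedure workaround itself isn't quoted in this excerpt. As a rough sketch of the underlying idea only (resolving the table name dynamically outside the SQL UDF), assuming a Databricks notebook where `spark` is predefined and using a hypothetical helper name:

```python
def get_table(table_name: str):
    # Hypothetical helper: a SQL UDF cannot take an identifier (table name)
    # as a runtime parameter, so the name is resolved here in Python instead.
    # spark.table(...) is equivalent to SELECT * FROM <table_name>.
    return spark.table(table_name)

# Example usage (assumes this fully qualified table exists):
df = get_table("catalog.schema.some_table")
display(df)
```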

7 More Replies
Daan
by New Contributor III
  • 419 Views
  • 6 replies
  • 2 kudos

Resolved! Databricks Asset Bundles: using loops

Hey, I am using DABs to deploy the job below. This code works but I would like to use it for other suppliers as well. Is there a way to loop over a list of suppliers: ['nike', 'adidas',...] and fill those variables so that config_nike_gsheet_to_databri...

Latest Reply
MujtabaNoori
New Contributor III
  • 2 kudos

@Daan, you can maintain a default template that holds the common configuration. While creating a job-specific configuration, you can safely merge your job-specific dictionary with the base template using the | operator.
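A minimal Python sketch of that merge pattern, with purely illustrative keys and supplier names (the thread's real job configuration is not shown here):

```python
# Shared settings used by every supplier job (illustrative values only).
base_template = {
    "job_cluster": "shared_cluster",
    "schedule": "0 0 6 * * ?",
    "notifications": ["data-team@example.com"],
}

suppliers = ["nike", "adidas"]

# Build one config per supplier by merging supplier-specific overrides into
# the base template; with dict | dict (Python 3.9+), right-hand keys win.
job_configs = {
    supplier: base_template | {"job_name": f"{supplier}_ingest_job"}
    for supplier in suppliers
}
```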

5 More Replies
N_M
by Contributor
  • 2281 Views
  • 3 replies
  • 4 kudos

Access to For Each run IDs from the Jobs REST API

Hello Community, I'm using the for_each tasks in workflows, but I'm struggling to access the job information through the Jobs APIs. In short, using the runs API (Get a single job run | Jobs API | REST API reference | Databricks on AWS), I'm able to ac...

Data Engineering
API
jobs API
Latest Reply
prabhatika
New Contributor II
  • 4 kudos

This feature would be extremely helpful in monitoring each task in the `foreachtask` task. 

2 More Replies
jtrousdale-lyb
by New Contributor III
  • 618 Views
  • 6 replies
  • 4 kudos

Resolved! DLT pipelines - sporadic ModuleNotFoundError

When we run DLT pipelines (which we deploy via DABs), we get a sporadic issue when attempting to install our bundle's wheel file. First, in every DLT pipeline, we run as a first step a script that looks like the following: import subprocess as sp from impor...

Latest Reply
WiliamRosa
New Contributor III
  • 4 kudos

If you're encountering intermittent ModuleNotFoundError when your DLT pipeline tries to install your asset bundle’s wheel file, this typically points to inconsistencies in how your dependencies are packaged or where they’re being deployed. Common cul...

5 More Replies
TalessRocha
by New Contributor II
  • 911 Views
  • 9 replies
  • 8 kudos

Resolved! Connect to Azure Data Lake Storage using Databricks Free Edition

Hello guys, I'm using Databricks Free Edition (serverless) and I am trying to connect to Azure Data Lake Storage. The problem I'm having is that in the Free Edition we can't configure the cluster, so I tried to make the connection via notebook using ...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor
  • 8 kudos

@TalessRocha thanks for getting back to us! Glad to hear you got it working, that's awesome. Best of luck with your projects. All the best, BS

8 More Replies
susanne
by Contributor
  • 334 Views
  • 3 replies
  • 4 kudos

Resolved! Asset Bundles: define an entire folder for source code transformation files

Hi all, I used the new Lakeflow UI in order to create a pipeline. Now I am struggling with the asset bundle configuration. When I am creating the pipeline manually I can configure the correct folder for the transformations where my sql and python trans...

Screenshot 2025-08-16 at 14.52.24.png
Latest Reply
susanne
Contributor
  • 4 kudos

Hi Szymon, thanks once again for your help! It worked now with your approach. Do you maybe know why this warning is displayed after databricks bundle validate/deploy: "Warning: unknown field: glob"? This was one reason I thought this could not be the r...

2 More Replies
RohanIyer
by New Contributor II
  • 204 Views
  • 1 reply
  • 3 kudos

Resolved! Azure RBAC Support for Secret Scopes

Hi there! I am using multiple Azure Key Vaults within our Azure Databricks workspaces, and we have set up secret scopes that are backed by these Key Vaults. Azure provides two authentication methods for accessing Key Vaults: Access Policies, which is c...

Latest Reply
WiliamRosa
New Contributor III
  • 3 kudos

Actually, RBAC is supported for authentication for secret scopes. The thing is, when you set up the secret scope, Databricks automatically assigns permissions through access policies. With RBAC, you'll need to grant the role on your own. As a ...

YosepWijaya
by New Contributor II
  • 32416 Views
  • 7 replies
  • 2 kudos

How can I embed an image in a cell using Markdown or code?

I have been trying to embed an image from the DBFS location; when I run the code, the image shows as unknown or a question mark. I have tried the following code. The path of the file is dbfs:/FileStore/tables/svm.jpg: displayHTML("<img src ='dbfs:/FileStore/tabl...
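A commonly suggested fix, not quoted from this thread and assuming the file really sits at dbfs:/FileStore/tables/svm.jpg, is to reference it through the workspace's /files/ path rather than the raw dbfs:/ URI, which the browser cannot resolve:

```python
# Files under dbfs:/FileStore/<path> are served over HTTP at /files/<path>,
# so displayHTML can render them as a normal image tag.
displayHTML("<img src='/files/tables/svm.jpg' width='400'>")
```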

Data Engineering
markdown
Notebook
Latest Reply
BS_THE_ANALYST
Esteemed Contributor
  • 2 kudos

@WiliamRosa You've stated: "1. Drag and drop images directly into Markdown cells. You can simply drag an image file from your local system into a markdown cell. Databricks will upload it automatically to your workspace directory and display it inline in...

6 More Replies
Malthe
by Contributor
  • 309 Views
  • 4 replies
  • 3 kudos

Self-referential foreign key constraint for streaming tables

When defining a streaming table using DLT (declarative pipelines), we can provide a schema which lets us define primary and foreign key constraints. However, references to self, i.e. the defining table, are not currently allowed (you get a "table not...

Latest Reply
Malthe
Contributor
  • 3 kudos

Each of these workarounds gives up the optimizations that are enabled by the use of key constraints.

3 More Replies
CzarR
by New Contributor III
  • 624 Views
  • 7 replies
  • 2 kudos

Maximum string length to pass to a Databricks notebook widget

Is there a limitation on the string length to pass to a Databricks notebook widget? An ADF lookup outputs about 1000 tables that I am trying to pass to the Databricks notebook via a widget parameter. ADF spends 30 minutes opening the Databricks notebook and e...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @CzarR, yes, there's a limitation. A maximum of 2048 characters can be input to a text widget: https://docs.databricks.com/aws/en/notebooks/notebook-limitations#databricks-widgets
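For context, a minimal sketch of reading such a parameter inside the notebook (the widget name and delimiter are assumptions); because of the 2048-character cap, a list of ~1000 table names usually has to be passed indirectly, for example as a pointer to a file or table rather than the list itself:

```python
# Declare the text widget; ADF passes its value as a base parameter of the same name.
dbutils.widgets.text("table_list", "")

# Read the value; a text widget accepts at most 2048 characters,
# so ~1000 comma-separated table names will not fit inline.
tables = [t.strip() for t in dbutils.widgets.get("table_list").split(",") if t.strip()]
print(f"Received {len(tables)} table names")
```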

6 More Replies
ChristianRRL
by Valued Contributor III
  • 466 Views
  • 4 replies
  • 5 kudos

Resolved! AutoLoader - Cost of Directory Listing Mode

I'm curious to get thoughts and experience on this. Intuitively, the directory listing mode makes sense to me in order to ensure that only the latest unprocessed files are picked up and processed, but I'm curious about what the cost impact of this wo...

Latest Reply
kerem
Contributor
  • 5 kudos

Hi @ChristianRRL, Autoloader ingests your data incrementally regardless of whether you are on directory listing mode or file notification mode. The key difference lies in how it discovers new files. In directory listing mode, Autoloader queries the cl...
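As a rough illustration of the two discovery modes (paths, format, and schema location below are assumptions, not taken from the thread), the same Auto Loader stream differs only in whether file notifications are enabled:

```python
# Directory listing mode (default): Auto Loader lists the input path to find new files.
df_listing = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/Volumes/main/default/_schemas/events")
    .load("/Volumes/main/default/raw/events")
)

# File notification mode: Auto Loader subscribes to storage events instead of
# repeatedly listing the directory, which matters for cost on large paths.
df_notifications = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")
    .option("cloudFiles.schemaLocation", "/Volumes/main/default/_schemas/events")
    .load("/Volumes/main/default/raw/events")
)
```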

3 More Replies
Ivaylo
by New Contributor II
  • 241 Views
  • 1 reply
  • 1 kudos

Resolved! read_files vs. cloud_files

I was wondering what the difference is between read_files and cloud_files. I can't find an explicit explanation or comparison in the Databricks documentation. Best regards, Ivaylo

Latest Reply
nayan_wylde
Honored Contributor II
  • 1 kudos

Key differences between `read_files` and `cloud_files`: `read_files` is a table-valued function that reads files under a provided location and returns the data in tabular form. It supports reading JSON, CSV, XML, TEXT, B...
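A short sketch of how the two are typically invoked from a notebook (paths, format, and schema location are placeholders): `read_files` is a SQL table-valued function suited to batch-style reads, while `cloud_files` corresponds to the Auto Loader source used for incremental streaming ingestion:

```python
# read_files: batch read via the SQL table-valued function.
batch_df = spark.sql(
    "SELECT * FROM read_files('/Volumes/main/default/raw/', format => 'csv')"
)

# cloud_files / Auto Loader: incremental streaming ingestion of newly arrived files.
stream_df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.schemaLocation", "/Volumes/main/default/_schemas/raw")
    .load("/Volumes/main/default/raw/")
)
```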

WiliamRosa
by New Contributor III
  • 238 Views
  • 1 reply
  • 4 kudos

Resolved! Recommended approach for handling deletes in a Delta table

What is the recommended approach for handling deletes in a Delta table? I have a table in MySQL (no soft delete flag) that I read and write into Azure as a Delta table. My current flow is: - If an ID exists in both MySQL and the Delta table → update th...

Latest Reply
nayan_wylde
Honored Contributor II
  • 4 kudos

The recommended way of handling CDC in Databricks is by using the MERGE command (https://docs.databricks.com/aws/en/sql/language-manual/delta-merge-into). If you are using SQL: -- Delete all target rows that have a match in the source table. > MERGE INTO targe...
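For the original question (rows that disappear from MySQL should be deleted from the Delta table), a rough sketch with placeholder table names; the WHEN NOT MATCHED BY SOURCE clause handles the deletes:

```python
# Placeholder names: "mysql_snapshot" is assumed to be a view or table holding
# the latest full extract from MySQL; "catalog.schema.target" is the Delta table.
spark.sql("""
    MERGE INTO catalog.schema.target AS t
    USING mysql_snapshot AS s
      ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
    WHEN NOT MATCHED BY SOURCE THEN DELETE
""")
```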

Hari_P
by New Contributor II
  • 392 Views
  • 2 replies
  • 0 kudos

IBM DataStage to Databricks Migration

Hi All, We are currently exploring a use case involving migration from IBM DataStage to Databricks. I noticed that LakeBridge supports automated code conversion for this process. If anyone has experience using LakeBridge, could you please share any be...

Latest Reply
Hari_P
New Contributor II
  • 0 kudos

Thank you for your response. Do you know if there is any documentation on how much of it converts, what the limitations in the conversion are, etc.?

1 More Replies
AmarK
by Databricks Employee
  • 14640 Views
  • 5 replies
  • 0 kudos

Is there a way to programmatically retrieve a workspace name?

Is there a Spark command in Databricks that will tell me what Databricks workspace I am using? I'd like to parameterise my code so that I can update Delta Lake file paths automatically depending on the workspace (i.e. it picks up the dev workspace na...

Latest Reply
WiliamRosa
New Contributor III
  • 0 kudos

To programmatically retrieve the Databricks workspace name from within a notebook, you can use Spark configuration or the notebook context. One method is to read the workspace URL using spark.conf.get("spark.databricks.workspaceUrl") and then extract...
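A minimal sketch of that approach; note that the split below is an assumption about the URL shape, and on some clouds the first label is a workspace ID rather than a friendly name:

```python
# e.g. "my-workspace.cloud.databricks.com" -> "my-workspace"
workspace_url = spark.conf.get("spark.databricks.workspaceUrl")
workspace_name = workspace_url.split(".")[0]
print(workspace_name)
```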

4 More Replies
