Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

pop_smoke
by New Contributor III
  • 738 Views
  • 4 replies
  • 5 kudos

Resolved! Switching to Databricks from Ab Initio (an old ETL tool) - need advice

As far as I know, all courses on the market and on YouTube for Databricks are outdated, since they target the Community Edition; there is no new course for the Free Edition of Databricks. I am a working professional and do not get much time. Do you guys kno...

Latest Reply
markjvickers-im
  • 5 kudos

@pop_smoke What were the arguments that swayed your organization to switch to Databricks from Ab Initio? Purely on a cost basis?

3 More Replies
Ved88
by New Contributor III
  • 16 Views
  • 1 reply
  • 0 kudos

All-purpose Databricks cluster disappears

Hi, I can see that sometimes the cluster disappears even though it was created some time back using a cluster pipeline. What could be the reason for it to disappear? We can recreate the cluster, but I wanted to know why it disappeared. Thanks! Ve...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Ved88, 30 days after a compute is terminated, it is permanently deleted. To keep an all-purpose compute configuration after a compute has been terminated for more than 30 days, an administrator can pin the compute. Up to 100 compute resources can...

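The pinning step described in the reply above can also be done through the Clusters API. Below is a minimal Python sketch that builds (but does not send) a `POST /api/2.0/clusters/pin` call; the workspace URL, token, and cluster ID are placeholders, not values from this thread.

```python
import json

# Sketch: pin an all-purpose cluster so it survives the 30-day
# auto-delete after termination. Uses the Clusters API pin endpoint
# (POST /api/2.0/clusters/pin); requires an admin token.

def build_pin_request(host: str, token: str, cluster_id: str):
    """Return the URL, headers, and JSON body for a cluster-pin call."""
    url = f"{host}/api/2.0/clusters/pin"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"cluster_id": cluster_id})
    return url, headers, body

url, headers, body = build_pin_request(
    "https://adb-12345.6.azuredatabricks.net",  # placeholder workspace URL
    "dapi-example-token",                       # placeholder PAT
    "0123-456789-abcde123",                     # placeholder cluster ID
)
# To actually send it:
# import urllib.request
# req = urllib.request.Request(url, data=body.encode(), headers=headers, method="POST")
# urllib.request.urlopen(req)
```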
a_user12
by New Contributor III
  • 31 Views
  • 0 replies
  • 0 kudos

Unity Catalog Schema management

From time to time I read articles, such as here, which suggest using a Unity Catalog schema management tool, so that all table schema changes are applied via this tool. Usually SPs (or users) have the MODIFY permission on tables. This allows them t...

smpa01
by Contributor
  • 90 Views
  • 2 replies
  • 2 kudos

Python DataSource API utilities/ Import Fails in Spark Declarative Pipeline

TL;DR: UDFs work fine when imported from the `utilities/` folder in DLT pipelines, but custom Python DataSource APIs fail with `ModuleNotFoundError: No module named 'utilities'` during serialization. Only inline definitions work. Need reusable DataSource ...

Latest Reply
smpa01
Contributor
  • 2 kudos

@emma_s Thank you for the guidance! The wheel package approach worked perfectly. I also tried putting the .py directly in /Workspace/Libraries/custom_datasource.py, but it did not work.

1 More Replies
Digvijay_11
by New Contributor
  • 92 Views
  • 2 replies
  • 2 kudos

Lakeflow Spark Declarative Pipeline

How can we run an SDP pipeline in a parallel manner with dynamic parameter parsing at the pipeline level? How can we consume job-level parameters in a pipeline? If parameters with the same name are defined at the pipeline level, then the job-level parameters are getting over...

Data Engineering
Spark Declarative Pipelines
Latest Reply
JacekLaskowski
Databricks MVP
  • 2 kudos

Just FYI, as of Jan 16th (the time I'm writing this answer), SDP and Delta Lake in their OSS versions don't work together yet. SDP is part of Apache Spark 4.1, but Delta Lake does not support it at the moment. It's coming. No idea when it's gonna be a...

1 More Replies
ChristianRRL
by Honored Contributor
  • 92 Views
  • 3 replies
  • 4 kudos

Resolved! Is Auto Loader open source now in Apache Spark 4.1 SDP?

With Spark Declarative Pipelines (SDP) being open source now, does this mean that the Databricks Auto Loader functionality is also open source? Is it called something else? If not, how does the open-source version handle incremental data processing a...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 4 kudos

Hi @ChristianRRL, no, Auto Loader is proprietary to Databricks. It's not open sourced. The open-source version of SDP uses Spark Structured Streaming for incremental processing. Keep in mind that Auto Loader is basically just Spark streaming under the hood ...

2 More Replies
adeosthali
by New Contributor
  • 43 Views
  • 0 replies
  • 0 kudos

External to Managed

We are looking to migrate to managed tables using ALTER TABLE fq_table_name SET MANAGED. During the migration process we need the ability to switch between external and managed tables and vice versa. UNSET MANAGED works for 14 days, but I'm unable to just ...

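For reference, the round-trip the post describes can be sketched as plain SQL; `main.demo.events` is a hypothetical table name, and in a Databricks notebook each statement would be passed to `spark.sql(...)`.

```python
# Sketch of the external-to-managed migration round-trip described above.
# The table name is a placeholder; run each statement via spark.sql(...).
table = "main.demo.events"

# Convert the external table to a managed table.
set_managed = f"ALTER TABLE {table} SET MANAGED"

# Roll back to an external table (per the post, this works for 14 days).
unset_managed = f"ALTER TABLE {table} UNSET MANAGED"
```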
dpc
by Contributor
  • 95 Views
  • 4 replies
  • 4 kudos

Case insensitive data

For all its positives, one of the first general issues we had with Databricks was case sensitivity. We have a lot of data-specific filters in our code. The problem is, we land and view data from lots of different case-insensitive source systems, e.g. SQL Se...

Latest Reply
emma_s
Databricks Employee
  • 4 kudos

Hi, you can set the default collation at the catalog level or schema level, and the tables in the catalog will inherit the collation. This is supported from DBR 17.1 and above.

3 More Replies
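A minimal sketch of the DDL the reply describes, assuming a hypothetical catalog `main` and schema `demo`; the exact `DEFAULT COLLATION` keywords and the `UTF8_LCASE` collation name should be checked against the Databricks SQL reference for your DBR version. Each statement would be run via `spark.sql(...)`.

```python
# Sketch of setting a case-insensitive default collation (DBR 17.1+),
# per the reply above. Names are placeholders; the DDL keywords are an
# assumption to verify against the Databricks SQL reference.
catalog, schema = "main", "demo"

ddl = [
    # New objects under the catalog inherit this collation.
    f"ALTER CATALOG {catalog} DEFAULT COLLATION UTF8_LCASE",
    # Or scope it to a single schema.
    f"ALTER SCHEMA {catalog}.{schema} DEFAULT COLLATION UTF8_LCASE",
]
```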
Bkr-dbricks
by New Contributor
  • 78 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks free Edition to Azure Connectivity

Hello everyone, as a beginner in Databricks, I have a question: can we connect Databricks Free Edition to Azure Blob / Gen2 storage? I would like to create external tables on files in Azure and Delta Lake tables on top of them. Your help is apprec...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Bkr-dbricks, according to the following topic, Free Edition doesn't support external locations: Solved: If use databricks free version not free trail can ... - Databricks Community - 127421

kivaniutenko
by New Contributor
  • 518 Views
  • 1 reply
  • 0 kudos

HTML Formatting Issue in Databricks Alerts

Hello everyone, I have recently encountered an issue with HTML formatting in custom templates for Databricks Alerts. Previously, the formatting worked correctly, but now the alerts display raw HTML instead of properly rendered content. For example, an ...

Latest Reply
mmayorga
Databricks Employee
  • 0 kudos

Hi @kivaniutenko, thanks for reaching out. Databricks Alerts still support basic HTML in email templates, but HTML will render correctly only for email destinations and only with simple, allowed tags. Quick things to try: make sure you are using Ale...

Alf01
by New Contributor II
  • 195 Views
  • 2 replies
  • 3 kudos

Resolved! Databricks Serverless Pipelines - Incremental Refresh Doubts

Hello everyone, I would like to clarify some doubts regarding how Databricks Pipelines (DLT) behave when using serverless pipelines with incremental updates. In general, incremental processing is enabled and works as expected. However, I have observed ...

Latest Reply
aleksandra_ch
Databricks Employee
  • 3 kudos

Hi @Alf01, thanks for accepting the solution! To keep you updated, the REFRESH POLICY feature that I mentioned in my post is out now! It allows manual control of the refresh strategy (AUTO, INCREMENTAL, INCREMENTAL STRICT, FULL), just as you stat...

1 More Replies
SparkMan
by New Contributor II
  • 94 Views
  • 2 replies
  • 2 kudos

Resolved! Job Cluster Reuse

Hi, I have a job where a job cluster is reused twice for task A and task C. Between A and C, task B runs for 4 hours on a different interactive cluster. The issue here is that the job cluster doesn't terminate as soon as Task A is completed and sits ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @SparkMan, this is expected behavior with Databricks job cluster reuse unless you change your job/task configuration. Look at the following documentation entry. So with your flow you have something like this: Task A (job cluster) → Task B (interactive c...

1 More Replies
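The task layout this thread describes can be sketched with Jobs API-style settings; the cluster IDs and sizes below are placeholders. Because tasks A and C share a `job_cluster_key` and C depends on B, the shared job cluster stays up while B runs on the interactive cluster.

```python
# Jobs API-style sketch of the flow in this thread: A and C share a job
# cluster, B runs on an interactive cluster. All IDs/sizes are placeholders.
job_settings = {
    "job_clusters": [
        {"job_cluster_key": "shared", "new_cluster": {"num_workers": 2}}
    ],
    "tasks": [
        {"task_key": "A", "job_cluster_key": "shared"},
        {
            "task_key": "B",
            "existing_cluster_id": "0123-456789-interactive",  # placeholder
            "depends_on": [{"task_key": "A"}],
        },
        {
            "task_key": "C",
            "job_cluster_key": "shared",  # reuse keeps the cluster up during B
            "depends_on": [{"task_key": "B"}],
        },
    ],
}
```

Splitting A and C into separate jobs (or giving C its own job cluster) avoids paying for the shared cluster while B runs.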
nkrish
by New Contributor II
  • 73 Views
  • 1 reply
  • 1 kudos

Regarding Accelerators

Are there any Databricks accelerators to convert C# and QlikView code to PySpark? We are using open-source AI tools to convert now, but wondering if there is a better way to do the same. Thanks in advance!

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @nkrish, unfortunately, I don't think so. You can find the available accelerators here: Databricks Solution Accelerators for Data & AI | Databricks. But I haven't heard anything about an accelerator for C# and QlikView specifically.

GergoBo
by New Contributor
  • 78 Views
  • 1 reply
  • 0 kudos

How to Play or Stream MP4 Videos from Unity Catalog Volumes in Databricks (Flask/Dash)?

Hello Databricks Community, I am working on a Dash dashboard (Python/Flask backend) deployed on Databricks, and I need to play or stream MP4 video files stored in a Unity Catalog Volume. I have tried accessing these files both from a Databricks notebo...

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

@GergoBo - Since notebooks cannot reach out to the file system to stream, you must embed the video as a Base64-encoded string. I tried the code below and it works well in a notebook, as it plays the video in the output. import base64; from IPython.display imp...

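The Base64 approach from the reply can be sketched as follows; the UC Volume path in the comments is hypothetical.

```python
import base64

# Sketch of the Base64-embedding approach from the reply above: inline
# the MP4 bytes into the HTML as a data URI so no file-system streaming
# is needed.

def video_html(video_bytes: bytes, mime: str = "video/mp4") -> str:
    """Build an HTML <video> tag with the file inlined as a data URI."""
    b64 = base64.b64encode(video_bytes).decode("ascii")
    return (
        f'<video controls width="640">'
        f'<source src="data:{mime};base64,{b64}" type="{mime}">'
        f"</video>"
    )

# In a notebook you would read the file and render it:
# with open("/Volumes/main/demo/videos/clip.mp4", "rb") as f:  # placeholder path
#     displayHTML(video_html(f.read()))  # Databricks; or IPython.display.HTML

html = video_html(b"\x00\x00\x00\x18ftypmp42")  # tiny stand-in header bytes
```

Note that data URIs embed the whole file in the page, so this suits short clips rather than large videos.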
Malthe
by Contributor III
  • 66 Views
  • 1 reply
  • 0 kudos

Intermittent task execution issues

We're getting intermittent errors: [ISOLATION_STARTUP_FAILURE.SANDBOX_STARTUP] Failed to start isolated execution environment. Sandbox startup failed. Exception class: INTERNAL. Exception message: INTERNAL: LaunchSandboxRequest create failed - Error e...

Latest Reply
sandy_123
New Contributor
  • 0 kudos

Hi @Malthe, this might be because of the new DBR (18.0) GA release yesterday (January 2026 - Azure Databricks | Microsoft Learn). You might need to use a custom Spark version until the engineering team fixes this issue in DBR. Below is the response from...
