Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Punit_Prajapati
by New Contributor III
  • 140 Views
  • 1 reply
  • 0 kudos

Long-lived authentication for Databricks Apps / FastAPI when using Service Principal (IoT use case)

Hi Community, I’m working with Databricks Apps (FastAPI) and invoking the API from external IoT devices. Currently, the recommended approach is to authenticate using a Bearer token generated via a Databricks Apps Service Principal (Client ID + Client S...

[Attachment: Punit_Prajapati_1-1767935971292.png]
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

So from your IoT device you don't have a way to use any of the Unified Authentication mechanisms? https://docs.databricks.com/aws/en/dev-tools/auth/unified-auth
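For context, the unified-auth alternative being suggested boils down to the OAuth client-credentials (M2M) flow. A minimal sketch of what an IoT device could do, assuming the app accepts workspace OAuth tokens; the host and credentials below are placeholders:

    import requests

    WORKSPACE_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
    CLIENT_ID = "<service-principal-client-id>"                       # placeholder
    CLIENT_SECRET = "<service-principal-client-secret>"               # placeholder

    # Exchange the service principal's credentials for a short-lived OAuth token.
    resp = requests.post(
        f"{WORKSPACE_HOST}/oidc/v1/token",
        auth=(CLIENT_ID, CLIENT_SECRET),
        data={"grant_type": "client_credentials", "scope": "all-apis"},
    )
    resp.raise_for_status()
    token = resp.json()["access_token"]

    # Call the FastAPI app with the token as a Bearer credential.
    headers = {"Authorization": f"Bearer {token}"}

Because the token is short-lived, a long-lived device would refresh it on a timer or on 401 responses rather than baking in one static token.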

fintech_latency
by Visitor
  • 50 Views
  • 4 replies
  • 0 kudos

How to guarantee “always-warm” serverless compute for low-latency Jobs workloads?

We’re building a low-latency processing pipeline on Databricks and are running into serverless cold-start constraints. We ingest events (calls) continuously via a Spark Structured Streaming listener. For each event, we trigger a serverless compute tha...

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

There is no always-warm option for serverless compute. Your latency-sensitive use case is a better fit for a dedicated cluster.
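To illustrate the dedicated-cluster suggestion, a minimal sketch with the Databricks Python SDK that creates a cluster which never auto-terminates; the name, runtime, and node type are placeholder assumptions:

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    # autotermination_minutes=0 keeps the cluster warm until stopped explicitly.
    cluster = w.clusters.create_and_wait(
        cluster_name="low-latency-warm",   # hypothetical name
        spark_version="15.4.x-scala2.12",  # placeholder: pick a current LTS runtime
        node_type_id="i3.xlarge",          # placeholder: depends on cloud/region
        num_workers=2,
        autotermination_minutes=0,
    )
    print(cluster.cluster_id)

An always-on cluster trades idle cost for predictable latency, which is usually the right trade for event-driven, latency-sensitive triggers.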

3 More Replies
dpc
by Contributor II
  • 81 Views
  • 5 replies
  • 2 kudos

Using AD groups for object ownership

Databricks has a general issue with object ownership in that only the creator can delete them. So, if I create a catalog, table, view, schema etc., I am the only person who can delete it. No good if it's a general table or view and some other developer ...

Latest Reply
dbxdev
New Contributor II
  • 2 kudos

I had the problem with another client at a much larger scale. This is what we did: at the end of each pipeline that we ran in the development environment we had an AlterOwnership task. When a user runs a pipeline with his/her credentials, all the objec...
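For readers who want to try the same pattern, a hedged sketch of what such an ownership-transfer task could run in a notebook; the table list and group name are hypothetical:

    # Hand ownership of pipeline outputs to an AD-backed account group so that
    # drops and alterations are not gated on the original creator.
    # "spark" is the notebook-provided SparkSession.
    tables = ["main.sales.orders", "main.sales.customers"]  # hypothetical tables
    for t in tables:
        spark.sql(f"ALTER TABLE {t} SET OWNER TO `data-engineering-group`")  # hypothetical group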

4 More Replies
DataGuy2
by New Contributor
  • 109 Views
  • 1 reply
  • 0 kudos

Databricks notebook Issue

Hello Databricks Community, I’m facing multiple issues while working in Azure Databricks notebooks, and I’d appreciate guidance or troubleshooting suggestions. Issue 1: Failed to reconnect. While running a notebook, I frequently see a “Failed to reconnec...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, there are a few things that could cause these types of problems: 1. Azure service availability (when these happen, check the Azure service status to make sure there are no outages). 2. Local network connection problems (verify all your other inter...

rvakr
by New Contributor II
  • 50 Views
  • 4 replies
  • 1 kudos

Resolved! Scheduling in a DLT pipeline

Hi Team, when I create DLT pipelines I am not able to add schedules via asset bundles; it only lets me do it from the UI. Is there any other option to create schedules dynamically, such as via the SDK, API, or CLI?

Latest Reply
dbxdev
New Contributor II
  • 1 kudos

If you still want to use the SDK, here is what you can do: update the job using the SDK.

    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()

    def update_job_schedule(job_id: int):
        return w.jobs.update(
            job_id=job_id,
            new_setti...
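The preview cuts off mid-call; a complete hedged sketch of what such a schedule update can look like with the Databricks Python SDK (the cron expression and job ID are placeholders):

    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.jobs import JobSettings, CronSchedule

    w = WorkspaceClient()

    def update_job_schedule(job_id: int, cron: str) -> None:
        # Patch only the schedule; other job settings are left untouched.
        w.jobs.update(
            job_id=job_id,
            new_settings=JobSettings(
                schedule=CronSchedule(
                    quartz_cron_expression=cron,  # e.g. "0 0 6 * * ?"
                    timezone_id="UTC",
                )
            ),
        )

    update_job_schedule(123456789, "0 0 6 * * ?")  # hypothetical job ID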

3 More Replies
vamsi_simbus
by New Contributor III
  • 96 Views
  • 2 replies
  • 0 kudos

Databricks App deployment fails: mysqlclient build error (pkg-config not found)

Hi Community Members, I’m deploying a Python project as a Databricks App, but deployment fails during dependency installation with: ERROR: Failed to build 'mysqlclient'. pkg-config: not found. Exception: Can not find valid pkg-config name. The dependenc...

Latest Reply
aleksandra_ch
Databricks Employee
  • 0 kudos

Hi @vamsi_simbus , Databricks Apps don’t yet support installing OS-level packages (apt-get, pkg-config, native client libs). Could you try installing a Python-native MySQL client as a backend for ibis? Install the core framework only: ibis-framework. In...
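One pure-Python route, as a hedged sketch rather than the exact ibis-specific fix the reply goes on to describe: PyMySQL needs no native build tooling, so it installs cleanly where mysqlclient cannot; host and credentials below are hypothetical:

    # requirements.txt: replace mysqlclient with the pure-Python driver "pymysql"
    import pymysql

    # Optional shim: lets libraries that import MySQLdb transparently use PyMySQL.
    pymysql.install_as_MySQLdb()

    conn = pymysql.connect(
        host="mysql.example.com",  # hypothetical host
        user="app_user",           # hypothetical credentials
        password="***",
        database="appdb",
    )
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())
    conn.close()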

1 More Replies
Malthe
by Contributor III
  • 127 Views
  • 3 replies
  • 0 kudos

Intermittent task execution issues

We're getting intermittent errors: [ISOLATION_STARTUP_FAILURE.SANDBOX_STARTUP] Failed to start isolated execution environment. Sandbox startup failed. Exception class: INTERNAL. Exception message: INTERNAL: LaunchSandboxRequest create failed - Error e...

Latest Reply
aleksandra_ch
Databricks Employee
  • 0 kudos

Hi @Malthe , Please check if a custom Spark image is used in the jobs. If it is, try removing it and sticking to the default parameters. If not, I highly recommend opening a support ticket (assuming you are on Azure Databricks) via the Azure portal. Best regard...

2 More Replies
ajay_wavicle
by Visitor
  • 57 Views
  • 3 replies
  • 1 kudos

Resolved! How to change a Delta table to a managed Iceberg table by writing its metadata and using the same Parquet data

I need to change a Delta table to a managed Iceberg table efficiently, by writing its metadata and reusing the same Parquet data rather than rewriting the table. I don't want to use the Delta UniForm format.

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @ajay_wavicle , I did some digging as this is a "not so common" request.  With that said, here is what I found. What you’re aiming for is very specific: you want to end up with a Unity Catalog managed Apache Iceberg table, reusing the exact...

2 More Replies
_its_akshaye
by Visitor
  • 24 Views
  • 1 reply
  • 0 kudos

How to Track Hourly or Daily # of Upsert/Delete Metrics in a DLT Streaming Pipeline

We created a Delta Live Tables (DLT) streaming pipeline to ingest data from the Bronze layer to the Silver layer with Change Data Feed (CDF) enabled. The stream runs continuously and shows # of upserted and deleted rows at an aggregate level from the...

Latest Reply
Saritha_S
Databricks Employee
  • 0 kudos

Hi @_its_akshaye Yes: capture it from the DLT event log and derive it directly from the target table’s CDF, then aggregate by time. Options that work well: 1. Use the DLT event log “rows written” metrics. Every pipeline writes a structured event log to y...
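A hedged sketch of the CDF half of that suggestion, counting inserts, updates, and deletes per hour; the table name is a placeholder and CDF is assumed to be enabled on it:

    from pyspark.sql import functions as F

    # Read the Change Data Feed of the Silver table (placeholder name).
    changes = (
        spark.read.format("delta")
        .option("readChangeFeed", "true")
        .option("startingVersion", 0)  # or a startingTimestamp window
        .table("main.silver.events")   # hypothetical table
    )

    # _change_type distinguishes insert/delete/update_preimage/update_postimage;
    # count inserts, deletes, and update post-images per hour of commit time.
    hourly = (
        changes
        .where(F.col("_change_type").isin("insert", "delete", "update_postimage"))
        .groupBy(F.date_trunc("hour", "_commit_timestamp").alias("hour"), "_change_type")
        .count()
        .orderBy("hour")
    )
    hourly.show()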

a_user12
by Contributor
  • 138 Views
  • 1 reply
  • 1 kudos

Resolved! Unity Catalog Schema management

From time to time I read articles such as here which suggest using a Unity Catalog schema management tool. All table schema changes should be applied via this tool. Usually SPs (or users) have the "Modify" permission on tables. This allows them t...

Latest Reply
MoJaMa
Databricks Employee
  • 1 kudos

I tend to mostly agree with you. Trying to do table-schema management like I would have done while developing ETL flows in an RDBMS world is quite different from trying to do this in a fast-moving "new-sources-all-the-time" data engineering world.  T...

sandy_123
by New Contributor
  • 71 Views
  • 1 reply
  • 0 kudos

Getting a 'Multiple failures in stage materialization' error in one of my jobs with a notebook task

Multiple failures in stage materialization. I tried using a powerful job cluster but it did not work out. Any suggestions on how I should fix it? FYI: my dataframe (uniq_rec_df) has around 30M rows. Screenshot attached.

[Attachment: sandy_123_0-1768844333336.png]
Latest Reply
Saritha_S
Databricks Employee
  • 0 kudos

Hi @sandy_123 The "Multiple failures in stage materialization" error at line 120 is caused by a massive shuffle bottleneck. Check the Spark UI and try to understand the reason for the failure, such as an RPC or heartbeat error. Primary issues: Window...
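As a hedged illustration of the kind of shuffle tuning such a diagnosis usually leads to (the settings and window columns are assumptions, not a diagnosis of this specific job):

    from pyspark.sql import functions as F, Window

    # Let AQE size shuffle partitions, or raise the count for a ~30M-row shuffle.
    spark.conf.set("spark.sql.adaptive.enabled", "true")
    spark.conf.set("spark.sql.shuffle.partitions", "400")  # assumption: tune to cluster

    # A window partitioned by a high-cardinality key shuffles less skewed data than
    # one over a hot key or no key at all. uniq_rec_df is the poster's DataFrame.
    w = Window.partitionBy("customer_id").orderBy(F.col("event_ts").desc())  # hypothetical columns
    deduped = uniq_rec_df.withColumn("rn", F.row_number().over(w)).where("rn = 1")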

Alf01
by New Contributor II
  • 275 Views
  • 4 replies
  • 6 kudos

Resolved! Databricks Serverless Pipelines - Incremental Refresh Doubts

Hello everyone, I would like to clarify some doubts regarding how Databricks Pipelines (DLT) behave when using serverless pipelines with incremental updates. In general, incremental processing is enabled and works as expected. However, I have observed ...

Latest Reply
aleksandra_ch
Databricks Employee
  • 6 kudos

Hi @Alf01 , thanks for accepting the solution! To keep you updated, the REFRESH POLICY feature that I mentioned in my post is out now! It allows manual control of the refresh strategy (AUTO, INCREMENTAL, INCREMENTAL STRICT, FULL), just as you stat...

3 More Replies
RIDBX
by Contributor
  • 59 Views
  • 1 reply
  • 1 kudos

Replicating DBX Demo set in Databricks FREE tier?

Thanks for reviewing my threads. I'd like to replicate/port the Databricks demo artifacts/set into my personal Databricks Free tier. I am getting some ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hello @RIDBX , I’m not surprised that some of the actions you’re seeing through Vocareum don’t work in the Free Edition. When our training is developed, parts of it are intentionally tied to Vocareum APIs to support specific tasks and workflows. Beca...

ajay_wavicle
by Visitor
  • 54 Views
  • 4 replies
  • 0 kudos

Migrate managed UC tables from one Databricks workspace to another while retaining Delta history

I am unable to access the Databricks attached storage account due to a deny assignment, and hence I am unable to move the Delta log from one storage account to another. How can I get around this so that I can move the Delta log and data from one storage acco...

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

@ajay_wavicle - in that case, you will have to do it from a notebook using a dbutils command. Map the new storage account as an external location accessible by the old workspace. Then use dbutils.fs.cp like below with the recursive option set to true. dbutils.fs.cp...
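The preview cuts off at the call; a hedged sketch of what it likely looks like, with placeholder paths:

    # Recursively copy the table directory (Delta log + Parquet files)
    # from the old storage account to the new external location.
    src = "abfss://data@oldaccount.dfs.core.windows.net/tables/orders"  # placeholder
    dst = "abfss://data@newaccount.dfs.core.windows.net/tables/orders"  # placeholder
    dbutils.fs.cp(src, dst, recurse=True)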

3 More Replies