Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

sakuraDev
by New Contributor
  • 41 Views
  • 0 replies
  • 0 kudos

Why does Soda not initialize?

Hey everyone, I'm using Auto Loader with Soda. I'm new to both. The idea is to ingest with quality checks into my silver table for every batch of a continuous ingestion. I tried to configure Soda as a string just like the docs show, but it seems that it keeps on t...

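One way this kind of setup is often wired together (a minimal sketch, assuming the soda-core Spark DataFrame Scan API is installed on the cluster and that `spark` is the notebook's SparkSession; the paths, table names, and checks below are illustrative, not from the original post):

from soda.scan import Scan

def write_with_soda_checks(batch_df, batch_id):
    # Expose the micro-batch to Soda's spark_df data source as a temp view
    batch_df.createOrReplaceTempView("silver_batch")

    scan = Scan()
    scan.set_scan_definition_name(f"silver_batch_{batch_id}")
    scan.set_data_source_name("spark_df")
    scan.add_spark_session(spark, data_source_name="spark_df")
    # Checks are passed as a YAML string, as the Soda docs describe
    scan.add_sodacl_yaml_str("""
checks for silver_batch:
  - row_count > 0
  - missing_count(site) = 0
""")
    scan.execute()
    if scan.has_check_fails():
        raise Exception(scan.get_logs_text())

    batch_df.write.mode("append").saveAsTable("silver_events")

(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/Volumes/landing/raw/_schemas/events")
    .load("/Volumes/landing/raw/events")
    .writeStream
    .foreachBatch(write_with_soda_checks)
    .option("checkpointLocation", "/Volumes/landing/raw/_checkpoints/silver_events")
    .start())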
devpdi
by New Contributor
  • 46 Views
  • 0 replies
  • 0 kudos

Re-use jobs as tasks with the same cluster.

Hello, I am facing an issue with my workflow. I have a job (call it the main job) that, among others, runs 5 concurrent tasks, which are defined as jobs (not notebooks). Each of these jobs is identical to the others (call them sub-job-1), with the only diff...

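For reference, a rough sketch of one way to define such a parent job with the Databricks Python SDK using run-job tasks (the job IDs and names are placeholders; as far as I know, each run-job task executes the child job on that job's own compute, which is what makes sharing a single cluster awkward):

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

# Placeholder job IDs for the five identical sub-jobs
sub_job_ids = [1001, 1002, 1003, 1004, 1005]

created = w.jobs.create(
    name="main-job",
    tasks=[
        jobs.Task(
            task_key=f"sub_job_{i}",
            run_job_task=jobs.RunJobTask(job_id=job_id),
        )
        for i, job_id in enumerate(sub_job_ids, start=1)
    ],
)
print(f"Created parent job {created.job_id}")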
Kurtis_R
by New Contributor
  • 34 Views
  • 0 replies
  • 0 kudos

Excel Formula results

Hi all, just wanted to raise a question regarding Databricks workbooks and viewing the results in the cells. For the example provided in the screenshot, I want to view the results of an Excel formula that has been applied to a cell in our workbooks. Fo...

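If the goal is to see the computed value of an Excel formula rather than the formula text, one workaround is to read the workbook with openpyxl's data_only mode, which returns the result Excel last cached when the file was saved (a sketch; the path and cell reference are made up):

import openpyxl

# data_only=True returns cached formula results instead of the formula strings;
# it yields None if the file was never saved by Excel after the formula was added.
wb = openpyxl.load_workbook("/Volumes/raw/finance/report.xlsx", data_only=True)
ws = wb.active
print(ws["B2"].value)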
sakuraDev
by New Contributor
  • 87 Views
  • 1 reply
  • 2 kudos

Resolved! How does Auto Loader handle a source outage?

Hey guys, I've been looking for some docs on how Auto Loader manages a source outage. I am currently running the following code: dfBronze = (spark.readStream .format("cloudFiles") .option("cloudFiles.format", "json") .schema(json_schema_b...

Latest Reply
filipniziol
New Contributor II
  • 2 kudos

Hi @sakuraDev,
1. Using the availableNow trigger to process all available data immediately and then stop the query. As you noticed, your data was processed once, and now you need to trigger the process once again to process new files.
2. Changing the tr...

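A minimal sketch of the two trigger choices described above (pick one writeStream; the paths and table names are placeholders, and `spark` is the notebook's SparkSession):

df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/Volumes/bronze/_schemas/events")
      .load("/Volumes/landing/events"))

# Option 1: availableNow processes everything discovered so far, then stops the query
(df.writeStream
   .option("checkpointLocation", "/Volumes/bronze/_checkpoints/events")
   .trigger(availableNow=True)
   .toTable("bronze_events"))

# Option 2: keep the stream running and poll for new files on a fixed interval
(df.writeStream
   .option("checkpointLocation", "/Volumes/bronze/_checkpoints/events")
   .trigger(processingTime="1 minute")
   .toTable("bronze_events"))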
ashraf1395
by Contributor II
  • 62 Views
  • 0 replies
  • 0 kudos

Databricks FinOps Assessment

We have to deliver a Databricks FinOps Assessment project. I am trying to write a proposal for it. I haven't done one before. I have created a general process of how the assessment will look and then restructured it using GPT. Please give your feedb...

Rishabh-Pandey
by Esteemed Contributor
  • 54 Views
  • 1 reply
  • 2 kudos

Creating a shareable dashboard

AI/BI Dashboards offer a robust solution for securely sharing visualizations and insights throughout your organization. You can easily share these dashboards with users within your Databricks workspace, across other workspaces in your organization, ...

Latest Reply
Anushree_Tatode
  • 2 kudos

Hi Rishabh, nice post! AI/BI Dashboards make it easy to share data securely within and across workspaces, even with view-only users. This way, everyone gets the right info while keeping things controlled. Excited to learn more about the key features! A...

sakuraDev
by New Contributor
  • 84 Views
  • 1 reply
  • 1 kudos

Resolved! Schema is not enforced when using Auto Loader

Hi everyone, I am currently trying to enforce the following schema: StructType([ StructField("site", StringType(), True), StructField("meter", StringType(), True), StructField("device_time", StringType(), True), StructField("data", St...

Latest Reply
Slash
Contributor
  • 1 kudos

Hi @sakuraDev, I'm afraid your assumption is wrong. Here you define the data field as a struct type, and the result is as expected. So once you have this column as a struct type, you can refer to nested objects using dot notation. So if you would like to get e...

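For illustration, a small sketch of the dot-notation access described above; the site/meter/device_time/data fields follow the schema in the question, while the nested value field and the path are assumptions:

from pyspark.sql.types import StructType, StructField, StringType

json_schema = StructType([
    StructField("site", StringType(), True),
    StructField("meter", StringType(), True),
    StructField("device_time", StringType(), True),
    # "data" is itself a struct, so its fields are reached with dot notation
    StructField("data", StructType([
        StructField("value", StringType(), True),
    ]), True),
])

df = spark.read.schema(json_schema).json("/Volumes/landing/events")

# Dot notation pulls a nested field out of the struct column
df.select("site", "data.value").show()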
suryateja405555
by New Contributor
  • 66 Views
  • 1 reply
  • 0 kudos

Databricks SQL Alerts

Hi all, I need one piece of help. Is there any possibility to trigger a Databricks SQL Alert as an email notification to a group of users/individual users without the schedule option? We can add the email ID in the destinations, but it will trigger an alert only if we s...

Latest Reply
holly
Valued Contributor II
  • 0 kudos

Hi there, can you provide a bit more detail: why do you need email addresses if you don't send an alert? Are you trying to email when the job finishes? Or do you want to send the results?

Stellar
by New Contributor II
  • 3034 Views
  • 2 replies
  • 1 kudos

Resolved! Databricks CI/CD Azure Devops

Hi all, I am looking for advice on what would be the best approach when it comes to CI/CD in Databricks and the repo in general. Would it be best to have a main branch and branch off of it, or something else? How will changes be propagated from dev to QA an...

Latest Reply
Kaniz_Fatma
Community Manager
  • 1 kudos

Hi @Stellar, setting up a robust CI/CD (Continuous Integration/Continuous Deployment) pipeline for Databricks involves thoughtful planning and adherence to best practices. Let's break down the key aspects:
Development Workflow: Branching Strateg...

1 More Reply
m_weirath
by New Contributor
  • 261 Views
  • 2 replies
  • 0 kudos

DLT-META requires ddl when using cdc_apply_changes

We are setting up new DLT Pipelines using the DLT-Meta package. Everything is going well in bringing our data in from Landing to our Bronze layer when we keep the onboarding JSON fairly vanilla. However, we are hitting an issue when using the cdc_app...

Latest Reply
dbuser17
New Contributor
  • 0 kudos

Please check these details: https://github.com/databrickslabs/dlt-meta/issues/90

1 More Reply
lprevost
by Contributor
  • 115 Views
  • 1 reply
  • 0 kudos

GraphFrames and DLT

I am trying to run a DLT job that uses GraphFrames, which is in the ML standard image. I am using it successfully in my job compute instances, but I'm running into problems trying to use it in a DLT job. Here are my overrides for the standard job c...

Latest Reply
lprevost
Contributor
  • 0 kudos

@Kaniz_Fatma - any chance I can get a definitive answer to this question? I know I can %pip install in DLT jobs, but graphframes requires a Maven-type install as it uses underlying Java/Scala modules/JAR files. A related question is whether there i...

ggsmith
by New Contributor II
  • 255 Views
  • 1 reply
  • 0 kudos

DLT Streaming Checkpoint Not Found

I am using Delta Live Tables and have my pipeline defined using the code below. My understanding is that a checkpoint is automatically set when using Delta Live Tables. I am using the Unity Catalog and Schema settings in the pipeline as the storage d...

Latest Reply
Slash
Contributor
  • 0 kudos

Hi @ggsmith, if you use Delta Live Tables, then checkpoints are stored under the storage location specified in the DLT settings. Each table gets a dedicated directory under storage_location/checkpoints/<dlt_table_name>.

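For context, a minimal DLT streaming table definition; note there is no explicit checkpointLocation because DLT manages it under the pipeline's storage location as described above (the table name and path are illustrative):

import dlt
from pyspark.sql import functions as F

@dlt.table(name="orders_bronze", comment="Raw orders ingested with Auto Loader")
def orders_bronze():
    # DLT creates and manages this table's checkpoint directory itself
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/landing/orders")
        .withColumn("ingested_at", F.current_timestamp())
    )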
DBUser2
by New Contributor II
  • 134 Views
  • 2 replies
  • 0 kudos

How to use transaction when connecting to Databricks using Simba ODBC driver

I'm connecting to a Databricks instance using the Simba ODBC driver (version 2.8.0.1002), and I am able to perform reads and writes on the Delta tables. But if I want to do some INSERT/UPDATE/DELETE operations within a transaction, I get the below error, an...

Latest Reply
florence023
New Contributor II
  • 0 kudos

@DBUser2 wrote: I'm connecting to a Databricks instance using the Simba ODBC driver (version 2.8.0.1002), and I am able to perform reads and writes on the Delta tables. But if I want to do some INSERT/UPDATE/DELETE operations within a transaction, I get the ...

1 More Reply
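For context, a small pyodbc sketch of the autocommit-style pattern that generally works with the Simba/Databricks ODBC driver, where each DML statement commits on its own rather than inside an explicit BEGIN/COMMIT transaction (the DSN and table names are placeholders):

import pyodbc

# Autocommit: every INSERT/UPDATE/DELETE is committed individually,
# since explicit multi-statement transactions are typically rejected here.
conn = pyodbc.connect("DSN=Databricks", autocommit=True)
cur = conn.cursor()

cur.execute("INSERT INTO demo.sales.orders VALUES (1, 'widget', 3)")
cur.execute("UPDATE demo.sales.orders SET qty = 4 WHERE order_id = 1")

cur.close()
conn.close()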
dashawn
by New Contributor
  • 1604 Views
  • 4 replies
  • 1 kudos

DLT Pipeline Error Handling

Hello all. We are a new team implementing DLT and have set up a number of tables in a pipeline loading from S3 with UC as the target. I'm noticing that if any of the 20 or so tables fails to load, the entire pipeline fails even when there are no depende...

Latest Reply
jose_gonzalez
Moderator
  • 1 kudos

Thank you for sharing this, @Kaniz_Fatma. @dashawn, were you able to check Kaniz's docs? Do you still need help, or will you accept Kaniz's solution?

3 More Replies
dcrezee
by New Contributor III
  • 100 Views
  • 0 replies
  • 0 kudos

workflow set maximum queued items

Hi all, I have a question regarding Workflows and the queuing of job runs. I'm running into a case where jobs are running longer than expected and result in job runs being queued, which is expected and desired. However, in this particular case we only nee...
