Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

iskidet_glenny
by New Contributor
  • 1379 Views
  • 2 replies
  • 0 kudos

Possibility of creating and running concurrent job runs from a single job, all parameter-driven

Hello Community, I hope everyone is doing well. I've been exploring the idea of creating multiple instances of a job, which would be job runs with different parameter configurations. Has anyone else considered this approach? Imagine a scenario where you ...
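A minimal sketch of one way to do this (not from the thread), using the Databricks Python SDK to fire several runs of the same job with different job-level parameters. The job_id, parameter names, and values below are placeholders, and the job's max_concurrent_runs setting must be raised above its default of 1 for the runs to overlap:

```python
# Hypothetical example: launch several runs of one job, each with its own
# job-level parameters. Requires the databricks-sdk package and workspace
# credentials in the environment; job_id 123 and the parameters are made up.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

param_sets = [
    {"region": "emea", "run_date": "2024-06-01"},
    {"region": "amer", "run_date": "2024-06-01"},
]

# run_now returns immediately with a waiter per run, so the runs overlap.
waiters = [w.jobs.run_now(job_id=123, job_parameters=p) for p in param_sets]

# Block until every run finishes and report its terminal state.
for waiter in waiters:
    run = waiter.result()
    print(run.run_id, run.state.result_state)
```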

Latest Reply
Roshaan
New Contributor II
  • 0 kudos

I have seen a correlation where a bigger cluster configuration leads to more job runs completing concurrently. Is that true, and if so, why?

1 More Replies
joshuat
by Contributor
  • 4419 Views
  • 5 replies
  • 0 kudos

How to partition a JDBC Oracle read query and cast the partition date field with TO_DATE?

I'm attempting to fetch an Oracle NetSuite table in parallel via JDBC using the NetSuite Connect JAR, which is already installed on the cluster and set up correctly. I can do this successfully with a single-threaded approach using the `dbtable` option: table = 'Tran...
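For readers landing here, a hedged sketch of the usual pattern: push the TO_DATE cast into an Oracle subquery so Spark can range-partition on the resulting date column. The table, column names, JDBC URL, and bound values are illustrative, not from the post:

```python
# Illustrative only: wrap the Oracle query in a subquery that materializes the
# casted date, then partition the JDBC read on it. partitionColumn must be a
# numeric, date, or timestamp column.
src = """
    (SELECT t.*, TO_DATE(trandate, 'YYYY-MM-DD') AS part_date
     FROM Transactions t) spark_src
"""

df = (spark.read.format("jdbc")
      .option("url", jdbc_url)                 # e.g. jdbc:oracle:thin:@host:1521/service
      .option("dbtable", src)
      .option("partitionColumn", "part_date")
      .option("lowerBound", "2020-01-01")      # bounds only steer partition ranges
      .option("upperBound", "2024-12-31")
      .option("numPartitions", 8)
      .load())
```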

Latest Reply
joshuat
Contributor
  • 0 kudos

@pavlosskev I did not, and I have to do partitioned reads via the ID.

  • 0 kudos
4 More Replies
Y2DTL
by New Contributor III
  • 3582 Views
  • 5 replies
  • 6 kudos

Resolved! Stream/static Join

Hi all, I would appreciate your help on a topic. When performing a join between a static and a streaming DataFrame, is the latest version of the static table used at the start of the job or within each micro-batch? The documentation doesn't seem to specifically...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 6 kudos

Hi @Y2DTL, here's an answer from the documentation: A stream-static join joins the latest valid version of a Delta table (the static data) to a data stream using a stateless join. When Databricks processes a micro-batch of data in a stream-static join, ...
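For context, a minimal stream-static join sketch (table names are illustrative, not from the thread); per the quoted documentation, the static side is re-resolved to the latest valid table version for each micro-batch:

```python
# Illustrative stream-static join: the Delta table on the static side is read
# fresh per micro-batch; the streaming side flows through a stateless join.
static_dim = spark.read.table("dim_customers")   # static side (Delta table)
events = spark.readStream.table("events")        # streaming side

enriched = events.join(static_dim, on="customer_id", how="left")

(enriched.writeStream
    .option("checkpointLocation", "/tmp/checkpoints/events_enriched")  # placeholder path
    .toTable("events_enriched"))
```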

4 More Replies
joeyslaptop
by New Contributor II
  • 9194 Views
  • 6 replies
  • 3 kudos

How to add a column to a new table containing the original source filenames in Databricks.

If this isn't the right spot to post this, please move it or refer me to the right area. I recently learned about "_metadata.file_name". It's not quite what I need. I'm creating a new table in Databricks and want to add a USR_File_Name column cont...
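For anyone with the same need, a hedged sketch of the usual approach: capture _metadata.file_name into a named column at read time. Paths and table names are placeholders, and whether this matches the poster's exact requirement depends on the part of the question cut off above:

```python
# Illustrative: persist each row's source file name into USR_File_Name.
# _metadata is a hidden column on file-based sources and must be referenced
# explicitly; the load path and target table are made up.
from pyspark.sql.functions import col

df = (spark.read.format("csv")
      .option("header", "true")
      .load("/Volumes/main/raw/uploads/")
      .withColumn("USR_File_Name", col("_metadata.file_name")))

df.write.mode("append").saveAsTable("main.bronze.uploaded_files")
```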

Labels: Data Engineering, Databricks, filename, import, SharePoint, Upload
Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi, could you please elaborate more on the expectation here?

5 More Replies
allyallen
by New Contributor III
  • 3911 Views
  • 5 replies
  • 0 kudos

Resolved! Variable Compute clusters within a Job

We have 3 possible compute clusters that we can run a notebook against. They are of varying sizes, and the one the notebook uses will depend on the size of the data being processed. We "t-shirt size" each tenant based on their data size (S, M, L) and c...

Latest Reply
allyallen
New Contributor III
  • 0 kudos

Hi @eniwoke, that's a great solution, thank you so much! Our process is now as follows: NB1 gets the tenant t-shirt size and sets the cluster_id for each size as a variable. The notebook then loops through each tenant and, using the Databricks API, updates ...
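A rough sketch of the API call described above, as I understand the Jobs 2.1 update endpoint; all ids are hypothetical placeholders, so adapt before relying on it:

```python
# Hypothetical: point a job's task at a differently sized existing cluster
# before triggering it. Note that jobs/update replaces top-level fields in
# new_settings, so the tasks array you send must contain every task the job
# should keep.
import requests

HOST = "https://<workspace-url>"        # placeholder
TOKEN = "<api-token>"                   # placeholder
SIZE_TO_CLUSTER = {"S": "<small-id>", "M": "<medium-id>", "L": "<large-id>"}

def set_job_cluster(job_id: int, task_key: str, size: str) -> None:
    body = {
        "job_id": job_id,
        "new_settings": {
            "tasks": [{"task_key": task_key,
                       "existing_cluster_id": SIZE_TO_CLUSTER[size]}],
        },
    }
    resp = requests.post(f"{HOST}/api/2.1/jobs/update",
                         headers={"Authorization": f"Bearer {TOKEN}"},
                         json=body)
    resp.raise_for_status()
```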

4 More Replies
Steffen
by New Contributor III
  • 3316 Views
  • 4 replies
  • 1 kudos

Resolved! DictionaryFilters Pushdown on Views

Hello, I have a very simple table with time series data and three columns:
  • id (long): unique id of the signal
  • ts (unix timestamp): timestamp of the event in unix timestamp format
  • value (double): value of the signal at the given timestamp
For every second ther...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @Steffen, this happens because you're applying functions like FLOOR, from_unix_timestamp, etc. to the ts attribute, which hides the raw ts from Spark's optimizer, so it can't push down filters. If you can, try to add an additional attribute to your u...
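A short sketch of that suggestion (names are illustrative): keep the raw ts in the view next to the derived column so predicates on ts still reach the scan:

```python
# Illustrative: expose both the raw ts and the derived bucket in the view.
spark.sql("""
    CREATE OR REPLACE VIEW signals_by_minute AS
    SELECT
        id,
        ts,                                   -- raw column kept for filter pushdown
        FLOOR(ts / 60) * 60 AS ts_minute,     -- derived bucket for consumers
        value
    FROM signals
""")

# Filtering on the raw column lets Spark push the predicate down to the scan.
spark.table("signals_by_minute").where("ts >= 1700000000").explain()
```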

3 More Replies
ShankarM
by Contributor
  • 2328 Views
  • 3 replies
  • 0 kudos

DBR version 10.4 impact

Hi, for one of our projects which is in production we are using DBR 10.4, for which EOL was Mar 18th, 2025. I wanted to know whether there will be any impact to existing workloads which are running in production. If yes, then can you let me know the impact and risk...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hello @ShankarM, actually there is no official End of Life (EoL) date provided by Databricks. If you check the documentation I referenced in my previous message, EoL is the next phase after End of Support (EoS), but Databricks does not announce a spe...

2 More Replies
om_bk_00
by New Contributor III
  • 1871 Views
  • 5 replies
  • 1 kudos

Resolved! Passing job parameters to a job through the terminal

I am having trouble overriding the job parameters that are deployed in my local workspace. E.g. I have a job that fills tables with data; the parameters given to it are random, and I would like to override them when I run it through my terminal: databricks b...

Latest Reply
EduardoSB
New Contributor II
  • 1 kudos

Hi! I just found this post because I'm having trouble trying to pass custom values to some parameters in my jobs. I guess databricks bundle run <job_name> --python-params "--param1=value1,--param2=value2,..." should work, shouldn't it? Is any other e...
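If the CLI flags stay troublesome, one alternative (a sketch, not from the thread) is triggering the deployed job with job-level parameters through the Python SDK; the job name and parameters below are placeholders:

```python
# Hypothetical: look the deployed job up by name, then run it with explicit
# job-level parameters. Requires databricks-sdk and workspace credentials.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
job = next(w.jobs.list(name="fill_tables_job"))   # placeholder job name
run = w.jobs.run_now(job_id=job.job_id,
                     job_parameters={"param1": "value1", "param2": "value2"})
print(run.result().state.result_state)
```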

4 More Replies
adhi_databricks
by Contributor
  • 5139 Views
  • 7 replies
  • 1 kudos

Resolved! Requirement to run a Databricks job from another job based on custom conditions using DAB

Hi everyone, I'm using Databricks Asset Bundles to deploy a job that includes a run_job_task, which requires a job_id to trigger another job. For different targets (dev, staging, prod), I need to pass different job_ids dynamically. To achieve this, I'v...

Latest Reply
adhi_databricks
Contributor
  • 1 kudos

Hey folks, thanks for the help here. I was able to solve this issue by updating the Databricks CLI to the latest version. Thanks once again!

6 More Replies
liu
by New Contributor III
  • 3206 Views
  • 2 replies
  • 1 kudos

Resolved! I encountered an error when trying to use dbutils to operate on files with a file: prefix.

When I execute the statement dbutils.fs.ls("file:/tmp/"), I receive the following error: ExecutionError: (java.lang.SecurityException) Cannot use com.databricks.backend.daemon.driver.WorkspaceLocalFileSystem - local filesystem access is forbidden. Does an...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @liu, which type of cluster are you using? Which access mode? Your compute must have Dedicated (formerly Single User) access mode.
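On access modes where dbutils.fs with the file: scheme is blocked, plain Python on the driver is a common workaround, assuming the driver's local disk is what's needed (a minimal sketch):

```python
# Lists the driver node's local /tmp directly, bypassing dbutils.fs.
import os

for name in os.listdir("/tmp"):
    print(name)
```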

1 More Replies
noorbasha534
by Valued Contributor II
  • 668 Views
  • 5 replies
  • 0 kudos

DQ anomaly detection: _quality_monitoring_summary table DDL

Dears, does anyone have the DDL for the _quality_monitoring_summary table? This is created by the DQ anomaly detector. Since the detector was trying to create a managed table, which is not allowed in the environment I work in, I am attempting to create this on ...

Latest Reply
Yogesh_Verma_
Contributor
  • 0 kudos

Hi, the _quality_monitoring_summary table is an internal table created by the Data Quality Anomaly Detector in Databricks Lakehouse Monitoring. Unfortunately, the full DDL is not publicly documented in detail, and trying to create it manually can lead...

4 More Replies
ismaelhenzel
by Contributor
  • 7149 Views
  • 4 replies
  • 11 kudos

Resolved! Delta Live Tables - materialized view does not increment anything!

I'm very disappointed with this framework. The documentation is inadequate, and it has many limitations. I want to run materialized views with incremental updates, but DLT insists on performing a full recompute. Why is it doing this? Here is the log ...

Latest Reply
1ct0
New Contributor II
  • 11 kudos

I'm seeing a subtype of EXCESSIVE_OPERATOR_NESTING that is preventing incremental updates. Is there any documentation so that these issues can be resolved?

3 More Replies
manish1987c
by New Contributor III
  • 2413 Views
  • 6 replies
  • 1 kudos

Delta Live Table - Flow detected an update or delete to one or more rows in the source table

I have created a pipeline where I am ingesting the data from bronze to silver using SCD 1; however, when I try to create the gold table as DLT it gives me the error "Flow 'user_silver' has FAILED fatally. An error occurred because we detected ...

  • 2413 Views
  • 6 replies
  • 1 kudos
Latest Reply
Pat
Esteemed Contributor
  • 1 kudos

Streaming tables in Delta Live Tables (DLT) only support append-only operations in the SOURCE. The error occurs because:
1. Your silver table uses SCD Type 1, which performs UPDATE and DELETE operations on existing records.
2. Your gold table is defined ...
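One commonly documented fix is to read the SCD1 source with skipChangeCommits so update/delete commits don't fail the streaming flow. A hedged sketch follows; the table names follow the thread, the rest is illustrative, and skipped change commits are never re-read, so confirm that trade-off is acceptable:

```python
# Illustrative DLT gold table that tolerates non-append commits in the silver
# source. Assumes user_silver lives in the same pipeline (hence the LIVE prefix).
import dlt

@dlt.table(name="user_gold")
def user_gold():
    return (
        spark.readStream
             .option("skipChangeCommits", "true")   # ignore UPDATE/DELETE commits
             .table("LIVE.user_silver")
             .where("active = true")                # placeholder transformation
    )
```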

5 More Replies
ShivangiB1
by New Contributor III
  • 2858 Views
  • 3 replies
  • 0 kudos

Embed a Databricks AI/BI dashboard in an external website and authenticate using a service principal

Hey team, I tried embedding my Databricks AI/BI dashboard in SharePoint and it worked. But I don't want to authenticate using my own credentials; can I use a service principal instead?

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @ShivangiB1! You can publish the dashboard using a service principal via the API, which allows you to embed it in SharePoint without requiring individual user logins. For more details, please refer to the documentation here: https://docs.databricks....
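A hedged sketch of that API-based publish (the dashboard id, warehouse id, and token are placeholders; verify the request body against the linked docs):

```python
# Hypothetical: publish a dashboard with embedded credentials using a service
# principal's token, so viewers of the embed don't log in individually.
import requests

HOST = "https://<workspace-url>"                   # placeholder
SP_TOKEN = "<service-principal-token>"             # placeholder
DASHBOARD_ID = "<dashboard-id>"                    # placeholder

resp = requests.post(
    f"{HOST}/api/2.0/lakeview/dashboards/{DASHBOARD_ID}/published",
    headers={"Authorization": f"Bearer {SP_TOKEN}"},
    json={"embed_credentials": True, "warehouse_id": "<warehouse-id>"},
)
resp.raise_for_status()
```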

2 More Replies
prasanna_r
by New Contributor
  • 1427 Views
  • 1 reply
  • 0 kudos

Resolved! Download all pages of a multi-page dashboard

Hi, I have created a multi-page dashboard in Databricks. I want to download all the pages of the dashboard as a single PDF file. But when I export the dashboard I get it only in .json format. Is there a way to download all the pages as a PDF file?

Latest Reply
ilir_nuredini
Honored Contributor
  • 0 kudos

Hello @prasanna_r, currently Databricks does not support exporting dashboards in PDF format. What I can suggest is to use the browser's print-to-PDF feature, or take screenshots per page and combine them (e.g. via Word) into a PDF. Then you can use the PDF programmatically h...
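For the screenshot route, a tiny sketch (assumes the Pillow package is available and one saved image per dashboard page; the file names are placeholders):

```python
# Stitch per-page screenshots into a single PDF with Pillow.
from PIL import Image

pages = ["page1.png", "page2.png", "page3.png"]    # placeholder screenshots
images = [Image.open(p).convert("RGB") for p in pages]
images[0].save("dashboard.pdf", save_all=True, append_images=images[1:])
```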

