Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

aranjan99
by Contributor
  • 1034 Views
  • 2 replies
  • 0 kudos

How to switch serverless dlt pipeline to cost optimized mode from performance optimized

We have a few serverless DLT pipelines that we want to optimize for cost, as we are OK with increased latency. Where can I change the pipeline to run in cost-optimized mode? I don't see this option in the UI or API.

Latest Reply
wawefog260
New Contributor II
  • 0 kudos

Hello! To enable cost-optimized mode for your serverless DLT pipeline, switch it to Triggered mode and edit the schedule trigger; there you’ll find the option to disable “Performance optimized.” This setting isn’t visible in the main UI or API unless t...

1 More Replies
elgeo
by Valued Contributor II
  • 43267 Views
  • 13 replies
  • 6 kudos

SQL Stored Procedure in Databricks

Hello. Is there an equivalent of a SQL stored procedure in Databricks? Please note that I need a procedure that allows DML statements, not only SELECT statements as a function provides. Thank you in advance.

Latest Reply
SanthoshU
New Contributor II
  • 6 kudos

How do I connect the stored procedures to Power BI Report Builder? It does not seem to be working.

12 More Replies
zensardigital
by New Contributor II
  • 1284 Views
  • 3 replies
  • 0 kudos

Convert a Managed Table to Streaming Table

Hi, I have applied transformations on a set of streaming tables and saved the result as a managed table. How can I change the managed table to a streaming table with minimal changes? Regards, ZD

Latest Reply
zensardigital
New Contributor II
  • 0 kudos

I am just writing the DataFrame to a Delta table. Are you suggesting that I first define a STREAMING TABLE (using the DLT definition) and then save the DataFrame into that table?

2 More Replies
Naga05
by New Contributor III
  • 2336 Views
  • 4 replies
  • 2 kudos

Databricks app with parameters from databricks asset bundle

Hello, I tried setting up a Databricks App using an asset bundle, where I was able to successfully parameterize the SQL warehouse ID, which was specified on specific targets. However, I was unable to get values of other variables from the targets, the...

Latest Reply
Naga05
New Contributor III
  • 2 kudos

Found that this is an implementation in progress on the Databricks CLI. https://github.com/databricks/cli/issues/3679

3 More Replies
smoortema
by Contributor
  • 1853 Views
  • 2 replies
  • 3 kudos

Resolved! Handling both PySpark and Python exceptions

In a Python notebook, I am using error handling according to the official documentation:
try:
    [some data transformation steps]
except PySparkException as ex:
    [logging steps to log the error condition and error message in a table]
However, this catches o...

Latest Reply
mark_ott
Databricks Employee
  • 3 kudos

To handle both PySpark exceptions and general Python exceptions without double-logging or overwriting error details, the recommended approach is to use multiple except clauses that distinguish the exception type clearly. In Python, exception handlers...

1 More Replies
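
The approach described in the reply to the exception-handling thread (separate except clauses, most specific first) might look roughly like the sketch below. It assumes PySparkException is importable from pyspark.errors; the stand-in class, the run_step helper, and the "pipeline" logger name are hypothetical and exist only so the sketch runs without a Spark installation.

```python
import logging

try:
    from pyspark.errors import PySparkException
except ImportError:
    # Hypothetical stand-in so this sketch runs where PySpark is not installed.
    class PySparkException(Exception):
        pass

logger = logging.getLogger("pipeline")  # illustrative logger name


def run_step(step):
    """Run `step` and return (ok, kind, message); a failure is logged exactly once."""
    try:
        step()
    except PySparkException as ex:
        # The more specific handler comes first, so Spark errors never reach
        # the generic clause below.
        kind, message = "pyspark", str(ex)
    except Exception as ex:
        # Any other Python error (ValueError, KeyError, ...).
        kind, message = "python", f"{type(ex).__name__}: {ex}"
    else:
        return (True, None, None)
    # Single logging point shared by both handlers, which avoids double-logging.
    logger.error("%s error: %s", kind, message)
    return (False, kind, message)
```

Because Python tries except clauses in order and runs only the first match, placing the PySpark clause before the generic one keeps each error in exactly one branch. In a notebook, the logger.error call would be replaced by whatever writes the error condition and message to the log table.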
tom_1
by New Contributor III
  • 2474 Views
  • 5 replies
  • 1 kudos

Resolved! BUG in Job Task of Type DBT

Hi, just wanted to let the Databricks team know that there is a bug in the task UI. Currently it is not possible to save a task of "Type: dbt" if the "SQL Warehouse" is set to "None (Manual)". Some weeks ago this was possible, also the "Profiles Direc...

Latest Reply
Aishu95
New Contributor II
  • 1 kudos

I am still facing this bug. I don't want to select any SQL warehouse; what do I do? And from where can I pass the profiles directory?

4 More Replies
Navi991100
by New Contributor II
  • 840 Views
  • 3 replies
  • 1 kudos

Resolved! I recently made a new account on Databricks under Free Edition

It created SQL warehouse compute by default, but I want all-purpose compute, as I want to test and learn the capabilities of PySpark and Databricks. I can't connect to the serverless compute in the notebook; it gives an error as follows: "An error occurr...

Latest Reply
belforte
New Contributor II
  • 1 kudos

In the free Databricks edition, to use PySpark you need to create and start a cluster, since the SQL Warehouse is only for SQL queries. Go to Compute > Create Cluster, set up a free cluster, click Start, and then attach your notebook to it; this will ...

2 More Replies
vishal_balaji
by New Contributor II
  • 2297 Views
  • 2 replies
  • 1 kudos

Unable to access metrics from Driver node on localhost:4040

Greetings, I am trying to set up monitoring in Grafana for all my Databricks clusters. I have added 2 things as part of this:
Under Compute > Configuration > Advanced > Spark > Spark Config, I have added spark.ui.prometheus.enabled true
Under init_scripts, I...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @vishal_balaji, you're following guides that were prepared for OSS Apache Spark. localhost certainly won't work in this case because in Databricks all compute is cloud-based. Please follow the guide below on how to configure it properly on Databricks: Data...

1 More Replies
saurabh18cs
by Honored Contributor III
  • 1812 Views
  • 4 replies
  • 3 kudos

Autoloader - File Notification Mode

Hello All, we have started to consume source messages/files via Auto Loader directory listing mode and want to convert this to file notification mode instead, so consumption can be faster without scanning entire directories/folders. I ...

Latest Reply
saurabh18cs
Honored Contributor III
  • 3 kudos

Hi @K_Anudeep @szymon_dybczak, how should I understand a situation where 100 jobs are running in parallel and minimal latency is needed? Does Auto Loader connect directly to the cloud queue service, or does Databricks store and manage detected files somewhere? ...

3 More Replies
RobsonNLPT
by Contributor III
  • 933 Views
  • 2 replies
  • 0 kudos

Foreign Catalog Wrong Mapping - Azure SQL Database Binary Column

Hi all. I've used foreign catalogs attached to Azure SQL databases and never had problems except in 2 situations:
1) Foreign catalogs don't support SQL schemas/objects like [xxxx.yyyy].tablename. The workaround is creating views on the SQL database.
2) This i...

Latest Reply
Isi
Honored Contributor III
  • 0 kudos

Hello @RobsonNLPT, one thing that might help to narrow this down: could you check whether the problem occurs for the entire column, or if some of the batches you receive actually contain the full (non-truncated) value? If some batches are complete but ...

1 More Replies
DataDev
by New Contributor
  • 2414 Views
  • 5 replies
  • 3 kudos

Schedule databricks job based on custom calendar

I want to schedule Databricks jobs based on a custom calendar, for example skipping the job run on arbitrary days or holidays. #databricks @DataBricks @DATA

Latest Reply
Advika
Community Manager
  • 3 kudos

Hello @DataDev! Did the suggestions shared above help address your question? If so, please consider marking one or more responses as the accepted solution. If you found another approach that worked for you, sharing it with the community would be real...

4 More Replies
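
For the custom-calendar question above, one possible pattern (not taken from the thread's replies, which are not shown here) is to keep the regular schedule and let the first task of the job decide whether the run should proceed. A minimal sketch of that check follows; the SKIP_DATES set and the should_run function are hypothetical names, and the dates are placeholders.

```python
from datetime import date

# Hypothetical custom calendar: dates on which the job must not run.
# In practice this could be loaded from a table or a config file.
SKIP_DATES = {date(2025, 1, 1), date(2025, 12, 25)}


def should_run(day: date, skip_dates=SKIP_DATES, skip_weekends: bool = True) -> bool:
    """Return True if the job should run on `day` under the custom calendar."""
    if skip_weekends and day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    return day not in skip_dates


if __name__ == "__main__":
    # A gate task would evaluate this for the run date and exit early
    # (or signal downstream tasks to skip) when it returns False.
    print(should_run(date.today()))
```

Keeping the calendar logic in a plain function makes it easy to unit test independently of the scheduler.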
shan-databricks
by Databricks Partner
  • 828 Views
  • 3 replies
  • 3 kudos

How to load all the previous day's data only into the newly added column of the existing delta table

How to load all the previous day's data only into the newly added column of the existing delta table? Is there any option available to do that without writing any logic?

Latest Reply
Advika
Community Manager
  • 3 kudos

Hello @shan-databricks! Did the suggestions shared above help resolve your concern? If so, please consider marking one of the responses as the accepted solution. If you found a different approach that worked for you, it would be great if you could sh...

2 More Replies
philsch
by New Contributor III
  • 5318 Views
  • 8 replies
  • 3 kudos

Resolved! How to create a managed iceberg table via REST catalog

We're using Iceberg's Java lib to write managed Iceberg tables in Databricks. We can actually create these tables using Databricks as the Iceberg REST catalog, but this only works when we provide a partitioning spec. This is then picked up as cluster_columns f...

Latest Reply
liko
Databricks Employee
  • 3 kudos

Why are you using the iceberg-core Java library instead of an existing open source Iceberg client (like Apache Spark)? Any of these can create a table with partitions when using Unity Catalog.

7 More Replies
chirag_nagar
by New Contributor
  • 3893 Views
  • 1 replies
  • 2 kudos

Resolved! Guidance Required for Informatica to Databricks Workflow Migration Using AI

Hi Team, I am currently exploring approaches to convert Informatica PowerCenter workflows into Databricks-compatible code using AI capabilities. As part of this effort, I would like to highlight that Informatica generates individual XML files for each...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Greetings @chirag_nagar, as you can imagine or know, migrations are extremely complex and time-consuming. There are a few approaches to migrations, but I want to focus on one - Bladebridge. This is a free tool provided by Databricks that is AI powe...

Rainer
by New Contributor
  • 1215 Views
  • 2 replies
  • 0 kudos

pyspark.testing.assertSchemaEqual() ignoreColumnOrder parameter exists in 3.5.0 only on Databricks

Hi, I am using the pyspark.testing.assertSchemaEqual() function in my code with the ignoreColumnOrder parameter, which is available since PySpark 4.0.0. https://spark.apache.org/docs/4.0.0/api/python/reference/api/pyspark.testing.assertSchemaEqual.htm...

Latest Reply
saurabh18cs
Honored Contributor III
  • 0 kudos

Hi @Rainer, when you use Databricks Connect, your local code is executed against the Databricks cluster, which uses the Databricks Runtime’s PySpark, not your local PySpark installation. This means your driver node is also running on remote comput...

1 More Replies
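
Since the availability of ignoreColumnOrder depends on which PySpark build actually runs the code, tests that must behave the same locally and on a cluster could compare schemas with a small order-insensitive helper instead. The sketch below is one such workaround, not the thread's accepted answer; as_pairs and schemas_match_ignoring_order are hypothetical helper names. It only looks at top-level column names and type strings, so nullability and the field order inside nested structs are not normalized.

```python
def as_pairs(schema):
    """Turn a StructType-like object into [(column_name, type_string), ...].

    Relies only on `.fields`, `.name`, and `.dataType.simpleString()`,
    which pyspark.sql.types.StructType and StructField provide.
    """
    return [(f.name, f.dataType.simpleString()) for f in schema.fields]


def schemas_match_ignoring_order(actual, expected) -> bool:
    """Compare two schemas given as iterables of (column_name, type_string) pairs."""
    # Sorting makes the comparison independent of column order while still
    # detecting missing, extra, or differently typed columns.
    return sorted(actual) == sorted(expected)


if __name__ == "__main__":
    left = [("id", "bigint"), ("name", "string")]
    right = [("name", "string"), ("id", "bigint")]
    print(schemas_match_ignoring_order(left, right))
```

With real DataFrames, the call would be schemas_match_ignoring_order(as_pairs(df1.schema), as_pairs(df2.schema)).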