Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Djelany
by New Contributor II
  • 4581 Views
  • 3 replies
  • 1 kudos

Resolved! DLT Event Logs

Hi, does anyone know what details:planning_information:technique_information[0]:cost under the planning_information event type means in my DLT workflow system event logs? For context, I'm trying to track the cost per run of my DLT workflow and I do not ha...

Latest Reply
adriennn
Valued Contributor
  • 1 kudos

You can enable the system.billing schema and see the cost of each run in the usage table.

2 More Replies
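As a hedged sketch of the reply above (assuming the system.billing schema is enabled; the pipeline ID is a placeholder), a query along these lines aggregates DBU usage per day for a single DLT pipeline via the usage_metadata column:

```sql
-- Sketch: daily DBU usage for one DLT pipeline ('<your-pipeline-id>' is a placeholder).
SELECT
  usage_date,
  sku_name,
  SUM(usage_quantity) AS dbus
FROM system.billing.usage
WHERE usage_metadata.dlt_pipeline_id = '<your-pipeline-id>'
GROUP BY usage_date, sku_name
ORDER BY usage_date;
```

To turn DBUs into currency, this can be joined against the list-price system table for your SKUs.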
jay971
by New Contributor II
  • 2406 Views
  • 3 replies
  • 0 kudos

Error: Cannot use legacy parameters because the job has job parameters configured.

I created a job which has two job parameters. How can I use the Databricks CLI to pass different values to those parameters?

Latest Reply
jay971
New Contributor II
  • 0 kudos

The job ran but did not pick up the values from the CLI.

2 More Replies
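For context, the error in the title typically appears when a run is triggered with legacy fields such as notebook_params while the job defines job-level parameters. A non-authoritative sketch with the newer Databricks CLI (the job ID and parameter names below are placeholders):

```shell
# Placeholder job ID and parameter names; requires a configured Databricks CLI.
# Pass job-level parameters via "job_parameters" instead of legacy notebook_params.
databricks jobs run-now 123456789 \
  --json '{"job_parameters": {"env": "dev", "run_date": "2024-01-01"}}'
```

If the run still ignores the values, it is worth checking that the parameter names in the payload exactly match the names defined on the job.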
Saf4Databricks
by New Contributor III
  • 3028 Views
  • 2 replies
  • 0 kudos

Reading single file from Databricks DBFS

I have a Test.csv file in the FileStore of DBFS in Databricks Community Edition. When I try to read the file using `with open`, I get the following error: FileNotFoundError: [Errno 2] No such file or directory: '/dbfs/FileStore/tables/Test.csv' import os wi...

Latest Reply
Saf4Databricks
New Contributor III
  • 0 kudos

@EricRM It should work. Please see the accepted response from this same forum here. So, we still need to find the cause of the error. Following is the detailed error message. Maybe this will help readers understand the issue better and help it resolve...

1 More Replies
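One likely cause worth noting: on Community Edition the local /dbfs FUSE mount is not available, so plain Python `open()` on `/dbfs/...` paths can fail even when the file exists in DBFS; Spark readers that accept `dbfs:/` URIs still work. A small illustrative helper (hypothetical, not a Databricks API) that maps between the two path forms:

```python
def to_fuse_path(dbfs_uri: str) -> str:
    """Map a dbfs:/ URI to its local FUSE path (/dbfs/...).

    Note: the FUSE mount is unavailable on Community Edition, so there
    prefer Spark readers, e.g. spark.read.csv("dbfs:/FileStore/tables/Test.csv").
    """
    if not dbfs_uri.startswith("dbfs:/"):
        raise ValueError(f"not a dbfs URI: {dbfs_uri}")
    return "/dbfs/" + dbfs_uri[len("dbfs:/"):].lstrip("/")


def to_dbfs_uri(fuse_path: str) -> str:
    """Map a /dbfs/... FUSE path back to a dbfs:/ URI."""
    if not fuse_path.startswith("/dbfs/"):
        raise ValueError(f"not a FUSE path: {fuse_path}")
    return "dbfs:/" + fuse_path[len("/dbfs/"):]
```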
databicky
by Contributor II
  • 20986 Views
  • 13 replies
  • 4 kudos
Latest Reply
FerArribas
Contributor
  • 4 kudos

Hi @Hubert Dudek, the Pandas API doesn't support the abfss protocol. You have three options: if you need to use pandas, you can write the Excel file to the local file system (dbfs) and then move it to ABFSS (for example with dbutils); write as csv directly in abfss...

12 More Replies
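The first option in the reply above can be sketched as follows (a minimal sketch, not a definitive implementation: the storage account and container names are placeholders, `dbutils` exists only inside a Databricks notebook, and `to_excel` needs an engine such as openpyxl installed):

```python
# Sketch: write the Excel file to local driver storage first,
# then copy it to ABFSS with dbutils (account/container are placeholders).
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})
local_path = "/tmp/report.xlsx"        # local driver filesystem
df.to_excel(local_path, index=False)   # requires openpyxl

dbutils.fs.cp(
    f"file:{local_path}",
    "abfss://<container>@<account>.dfs.core.windows.net/reports/report.xlsx",
)
```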
sakuraDev
by New Contributor II
  • 1061 Views
  • 1 reply
  • 2 kudos

Resolved! how does autoloader handle source outage

Hey guys, I've been looking for some docs on how Auto Loader handles a source outage. I am currently running the following code: dfBronze = (spark.readStream .format("cloudFiles") .option("cloudFiles.format", "json") .schema(json_schema_b...

(attached screenshot: sakuraDev_0-1725478024362.png)
Latest Reply
filipniziol
Esteemed Contributor
  • 2 kudos

Hi @sakuraDev,
1. Using the availableNow trigger to process all available data immediately and then stop the query. As you noticed, your data was processed once and now you need to trigger the process again to process new files.
2. Changing the tr...

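The availableNow option from the reply can be sketched like this (a hedged sketch assuming a stream similar to the question's; the schema, paths, and target table name are placeholders):

```python
# Sketch: trigger(availableNow=True) processes everything currently in the
# source and then stops; re-run the job to pick up files that arrive later.
# A processingTime trigger instead keeps the query up and polling continuously.
(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .schema(json_schema_bronze)            # placeholder schema from the question
    .load("s3://<bucket>/<path>/")         # placeholder source path
 .writeStream
    .trigger(availableNow=True)
    .option("checkpointLocation", "s3://<bucket>/<checkpoints>/bronze")
    .toTable("bronze.events"))             # placeholder target table
```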
Soma
by Valued Contributor
  • 4867 Views
  • 6 replies
  • 3 kudos

Resolved! Dynamically supplying partitions to autoloader

We have a streaming use case and we see a lot of time spent listing files from Azure. Is it possible to supply partitions to Auto Loader dynamically, on the fly?

Latest Reply
Anonymous
Not applicable
  • 3 kudos

@somanath Sankaran - Thank you for posting your solution. Would you be happy to mark your answer as best so that other members may find it more quickly?

5 More Replies
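The accepted solution itself is truncated above, but one hedged way to sketch the idea: compute the partition prefix at runtime and narrow Auto Loader's listing via the load path (optionally with `pathGlobFilter`). Paths and the partition column name below are placeholders:

```python
# Sketch: restrict Auto Loader's file listing to today's partition so it
# does not list the whole container (placeholder paths and layout).
from datetime import date

part = date.today().strftime("date=%Y-%m-%d")   # e.g. date=2024-09-04

df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("pathGlobFilter", "*.json")       # optional extra filename filter
      .load(f"abfss://<container>@<account>.dfs.core.windows.net/events/{part}/"))
```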
188386
by New Contributor II
  • 1528 Views
  • 2 replies
  • 0 kudos

Databricks Learning - Get Started with Databricks for Data Engineering -> Next button not active

Hi, Databricks Learning - Get Started with Databricks for Data Engineering (ID: E-03ZW80) got stuck at the lesson where the file "get-started-with-data-engineering-on-databricks-2.1.zip" is downloaded. The "Next" button is not active - see attached picture. Li...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

Hi @188386, as I can see you have skipped multiple lessons (Introduction to Data on Databricks). First complete them in sequence; then the Next button will be enabled for you.

1 More Replies
IN
by New Contributor II
  • 827 Views
  • 1 reply
  • 1 kudos

Connect to remote SQL Server (add databricks cluster IP to the whitelist)

Hi, I need to connect from a notebook in my workspace to a remote SQL Server instance. This server is protected by a firewall, so I need to add an IP address to the whitelist. Ideally, it would be possible to set up/allocate a static IP a...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @IN, in Azure you can deploy the workspace in VNet injection mode and attach a NAT gateway to your VNet. The NAT gateway requires a public IP; this IP will be the static egress IP for all clusters in this workspace. I've never worked with GCP, but I think you ...

Henrik_
by New Contributor III
  • 2641 Views
  • 1 reply
  • 1 kudos

Resolved! Optimizing recursive joins on group and UNION-operations.

The code snippet below takes each group (based on id) and performs recursive joins to build parent-child relations (id1 and id2) within a group. The code produces the correct output, an array in column 'path'. However, in my real-world use case, this c...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

The recursive join is definitely a performance killer; it will blow up the query plan, so I would advise against using it. Alternatives? Well, a fixed number of joins, for example, if that is an option of course. Using a graph algorithm is also an opti...

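Outside Spark, the per-group path expansion the question describes can be illustrated with a plain-Python traversal (hypothetical data and function; this shows the logic only, not a distributed implementation):

```python
from collections import defaultdict

def build_paths(edges):
    """Expand (parent, child) edges within one group into root-to-leaf paths.

    edges: iterable of (id1, id2) parent->child pairs.
    Returns a list of paths, each a list of ids (the 'path' array).
    """
    children = defaultdict(list)
    parents, kids = set(), set()
    for p, c in edges:
        children[p].append(c)
        parents.add(p)
        kids.add(c)
    roots = sorted(parents - kids)      # nodes that never appear as a child

    paths = []
    def walk(node, path):
        path = path + [node]
        if not children[node]:          # leaf: emit the accumulated path
            paths.append(path)
        for nxt in children[node]:
            walk(nxt, path)
    for r in roots:
        walk(r, [])
    return paths
```

For large graphs, an iterative approach (or a graph library such as GraphFrames on Spark, as the reply suggests) avoids both deep recursion and the exploding query plan.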
n_joy
by Contributor
  • 5113 Views
  • 6 replies
  • 2 kudos

Resolved! Change data feed for tables with allowColumnDefaults property "enabled"

I have a Delta table already created, with both the #enableChangeDataFeed option and the #allowColumnDefaults property enabled. However, when writing to the CDC table with streaming queries, it fails with the following error: [CREATE TABLE command because it ...

Latest Reply
n_joy
Contributor
  • 2 kudos

@filipniziol Yes, that is what I do. Thanks for the feedback!

5 More Replies
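For reference, the two properties named in the question are typically set like this (a sketch; the table name is a placeholder, and column defaults are enabled as a Delta table feature rather than a plain property):

```sql
-- Placeholder table name; enables change data feed and column defaults.
ALTER TABLE my_catalog.my_schema.my_table SET TBLPROPERTIES (
  'delta.enableChangeDataFeed' = 'true',
  'delta.feature.allowColumnDefaults' = 'supported'
);
```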
enda
by New Contributor II
  • 872 Views
  • 3 replies
  • 1 kudos

Course missing required content.

https://customer-academy.databricks.com/learn/course/1266/play/7847/navigate-the-workspace-user-interface;lp=10 It's missing the file you import to your workspace; there's no file to download?

Latest Reply
enda
New Contributor II
  • 1 kudos

Cheers, thanks, I see it now!

2 More Replies
ashraf1395
by Honored Contributor
  • 4011 Views
  • 0 replies
  • 0 kudos

Databricks Finops Assessment

We have to deliver a Databricks FinOps assessment project. I am trying to write a proposal for it. I haven't done one before. I have created a general outline of how the assessment will look and then restructured it using GPT. Please give your feedb...

Rishabh-Pandey
by Esteemed Contributor
  • 1777 Views
  • 1 reply
  • 3 kudos

Creating a shareable dashboard

AI/BI Dashboards offer a robust solution for securely sharing visualizations and insights throughout your organization. You can easily share these dashboards with users within your Databricks workspace, across other workspaces in your organization, ...

Latest Reply
Anushree_Tatode
Databricks Employee
  • 3 kudos

Hi Rishabh, nice post! AI/BI Dashboards make it easy to share data securely within and across workspaces, even with view-only users. This way, everyone gets the right info while keeping things controlled. Excited to learn more about the key features! A...

AndyG
by New Contributor II
  • 5299 Views
  • 6 replies
  • 1 kudos

Remove partition column from delta table

I have Delta tables with multiple partition columns. I want to remove most of the partition columns and retain just one. I can see there are ALTER TABLE...PARTITION options, but these are not supported for Delta Lake tables. So is there a way to do th...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @AndyG, maybe try the approach the official Delta guide suggests: Adding and Deleting Partitions in Delta Lake tables | Delta Lake. You can delete all rows from a given partition to remove the partition from the Delta table. Here's how to delete all the ro...

5 More Replies
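Since ALTER TABLE ... PARTITION isn't supported for Delta tables, another common route is a full rewrite under the new partitioning, sketched below (table and column names are placeholders; this rewrites all data, so it can be expensive on large tables):

```sql
-- Sketch: rewrite the table keeping only one partition column.
CREATE OR REPLACE TABLE my_schema.my_table
PARTITIONED BY (event_date)
AS SELECT * FROM my_schema.my_table;
```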
guangyi
by Contributor III
  • 1284 Views
  • 2 replies
  • 2 kudos

Resolved! How to use dlt.expect to validate table level constraints?

I know how to validate a column-level constraint, like checking whether the specified column value is larger than a target value. Can I validate table-level constraints? For example, validate whether the total record count of a table is larger t...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @guangyi, unfortunately there is no out-of-the-box solution for this requirement in DLT. But as a workaround you can add an additional view/table to your pipeline that defines an expectation in a similar way to the below: CREATE OR REFRESH MATERIALIZED...

1 More Replies
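The truncated workaround above can be sketched along these lines (names are placeholders): define an auxiliary materialized view that aggregates the table, and attach the table-level expectation to that view:

```sql
-- Sketch: table-level check via an auxiliary materialized view.
-- Fails the pipeline update if my_table has 1000 rows or fewer.
CREATE OR REFRESH MATERIALIZED VIEW my_table_row_count_check (
  CONSTRAINT min_row_count EXPECT (row_count > 1000) ON VIOLATION FAIL UPDATE
) AS
SELECT COUNT(*) AS row_count FROM LIVE.my_table;
```

Using ON VIOLATION DROP ROW or the default (warn only) instead of FAIL UPDATE changes how strictly the check is enforced.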
