Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Forum Posts

lauraxyz
by Contributor
  • 6903 Views
  • 5 replies
  • 1 kudos

Put file into volume within Databricks

Hi! From a Databricks job, I want to copy a workspace file into a volume. How can I do that? I tried `dbutils.fs.cp("/Workspace/path/to/the/file", "/Volumes/path/to/destination")` but got: Public DBFS root is disabled. Access is denied on path: /Workspac...

Latest Reply
fjrodriguez
New Contributor III
  • 1 kudos

I have one question, and I think this post is the best fit for it. I want to overwrite wheel files in a Volume I have already created in my CI/CD process. I have something like this:          - ${{if parameters.filesPackages}}:            - $...

4 More Replies
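The error in the question arises because `dbutils.fs.cp` interprets a bare `/Workspace` path as a DBFS path. One common workaround is to use plain local-filesystem I/O instead, since both paths are mounted on the driver. A minimal sketch, with hypothetical paths:

```python
import shutil

def copy_workspace_file_to_volume(src: str, dst: str) -> str:
    """Copy a file using plain local-filesystem I/O.

    On a Databricks cluster, /Workspace and /Volumes are both mounted
    on the driver's local filesystem, so standard Python file I/O
    sidesteps the DBFS-root access check that dbutils.fs.cp hits when
    it treats a bare /Workspace path as a DBFS path.
    """
    return shutil.copy(src, dst)

# Hypothetical paths, for illustration only:
# copy_workspace_file_to_volume(
#     "/Workspace/path/to/the/file",
#     "/Volumes/catalog/schema/volume/file",
# )
```

An alternative that keeps `dbutils` is prefixing the source with the `file:/` scheme, e.g. `dbutils.fs.cp("file:/Workspace/path/to/the/file", "/Volumes/path/to/destination")`, so the workspace path is read from the local filesystem rather than resolved against DBFS.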
Aneruth
by New Contributor II
  • 805 Views
  • 1 reply
  • 0 kudos

[INTERNAL_ERROR] Cannot refresh quality dashboard

Hi all, I'm encountering an INTERNAL_ERROR issue when refreshing a Databricks Lakehouse Monitoring job. Here's the full error message: `ProfilingError: INTERNAL_ERROR. Please contact the Databricks team for further assistance and include the refresh id...

Latest Reply
Aneruth
New Contributor II
  • 0 kudos

Thank you! I'll modify my query based on your explanation. Currently, I'm manually parsing the custom metrics output data types, which works but isn't ideal. I'll implement proper data type formatting through asset bundles to ensure the UI receives c...

san11
by New Contributor II
  • 980 Views
  • 2 replies
  • 0 kudos

Enabled IP access list for azure databricks workspace but it is not working

Hi, we enabled the IP access list for our Azure Databricks workspace using the REST API, and we can see the IPs in the allow and block lists, but it is not working: we can still log in to the Web UI from any IP address and run queries. Does this approach not...

Latest Reply
Khaja_Zaffer
Esteemed Contributor
  • 0 kudos

Hello @san11, what error are you getting? I mean, is there any error in the Web UI? Can you also share a screenshot of the IPs you allowed in the Azure portal?

1 More Replies
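A frequent cause of the behavior described in this thread is that the lists are created but enforcement was never switched on: the `enableIpAccessLists` workspace setting must also be `"true"`. A hedged sketch of the `workspace-conf` call that enables it (the workspace URL below is a placeholder, and the request would be sent with an admin bearer token):

```python
def build_enable_ip_access_request(host: str):
    """Build the workspace-conf call that switches IP access list
    enforcement on.

    Creating allow/block lists via the REST API does not enforce them
    by itself: the enableIpAccessLists workspace setting must also be
    "true", which is a common reason the lists appear to be ignored.
    """
    url = f"{host.rstrip('/')}/api/2.0/workspace-conf"
    payload = {"enableIpAccessLists": "true"}
    return "PATCH", url, payload

# Usage (hypothetical workspace URL; send with requests.patch plus a
# bearer token for a workspace admin):
# method, url, payload = build_enable_ip_access_request(
#     "https://adb-1234567890123456.7.azuredatabricks.net")
```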
HariharaSam
by Databricks Partner
  • 39873 Views
  • 10 replies
  • 4 kudos

Resolved! To get Number of rows inserted after performing an Insert operation into a table

Consider we have two tables, A and B: `qry = """INSERT INTO Table A SELECT * FROM Table B WHERE Id IS NULL"""` then `spark.sql(qry)`. I need to get the number of records inserted after running this in Databricks.

Latest Reply
User16653924625
Databricks Employee
  • 4 kudos

In case someone is looking for a purely SQL-based solution (add LIMIT 1 to the query if you only want the last operation): select t.timestamp, t.operation, t.operationMetrics.numOutputRows as numOutputRows from ( DESCRIBE HISTORY <catalog>.<schema>....

9 More Replies
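The answers in this thread rely on the table's Delta history, where each commit records `operationMetrics.numOutputRows`. The extraction step can be sketched in plain Python; in a notebook the rows would come from something like `spark.sql(f"DESCRIBE HISTORY {table}").collect()`, but here they are plain dicts for illustration ("WRITE" is the operation name Delta records for INSERT INTO):

```python
def last_insert_row_count(history_rows):
    """Return numOutputRows for the most recent WRITE commit.

    history_rows: DESCRIBE HISTORY output, newest first, where each
    row carries an 'operation' name and an 'operationMetrics' mapping
    of string-valued metrics.
    """
    for row in history_rows:
        if row["operation"] == "WRITE":
            metrics = row.get("operationMetrics") or {}
            n = metrics.get("numOutputRows")
            return int(n) if n is not None else None
    return None
```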
ajgold
by New Contributor II
  • 1563 Views
  • 6 replies
  • 2 kudos

DLT Expectations Alert for Warning

I want to receive an alert via email or Slack when the @dlt.expect declaration fails the validation check in my DLT pipeline. I only see the option to add an email alert for @dlt.expect_or_fail failures, but not for warnings.

Latest Reply
RiyazAliM
Honored Contributor
  • 2 kudos

Hey @ajgold, I don't think DLT has this feature yet. You may raise a feature request for Databricks to add it in a future release over here: https://databricks.aha.io/ Cheers!

5 More Replies
ande
by New Contributor
  • 2557 Views
  • 2 replies
  • 0 kudos

IP address for accessing external SFTP server

I am trying to pull in data to my Databricks workspace via an external SFTP server. I am using Azure for my compute. To access the SFTP server they need to whitelist my IP address. My IP address in Azure Databricks seems to be constantly changing fro...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Azure Databricks, like many cloud services, does not provide static IP addresses for outbound connections. This is because the compute resources are dynamically allocated and can change over time. One potential workaround could be to use a Virtual N...

1 More Replies
fjrodriguez
by New Contributor III
  • 642 Views
  • 2 replies
  • 0 kudos

Job Preview in ADF

I have one Spark job that is triggered via ADF as a usual "Python" activity. Now I want to move to the "Job" activity, which is in Preview. Normally, at the linked service level, I have the Spark config and environment that are needed for the execution of this scri...

Latest Reply
radothede
Valued Contributor II
  • 0 kudos

Hi @fjrodriguez, my understanding is you've already created a cluster for your job. If that's the case, you can put that Spark configuration and those env variables directly in the cluster your job is using. If for some reason that's not possible, then you c...

1 More Replies
jdlogos
by New Contributor III
  • 4850 Views
  • 5 replies
  • 2 kudos

apply_changes_from_snapshot with expectations

Hi, Question: Are expectations supposed to function in conjunction with create_streaming_table() and apply_changes_from_snapshot()? Our team is investigating Delta Live Tables and we have a working prototype using Autoloader to ingest some files from a m...

Latest Reply
jbrmn
New Contributor II
  • 2 kudos

Also facing the same issue - did you find a solution? Thinking I will have to apply expectations at the next stage of the pipeline until this is worked out.

4 More Replies
dnz
by New Contributor
  • 1342 Views
  • 1 reply
  • 0 kudos

Performance Issue with OPTIMIZE Command for Historical Data Migration Using Liquid Clustering

Hello Databricks Community, I'm experiencing performance issues with the OPTIMIZE command when migrating historical data into a table with liquid clustering. Specifically, I am processing one year's worth of data at a time. For example: the OPTIMIZE co...

Latest Reply
HimanshuSingh
New Contributor II
  • 0 kudos

Did you get any solution? If yes, please post it.

yuinagam
by New Contributor II
  • 853 Views
  • 2 replies
  • 0 kudos

how can I verify that the result of a dlt will have enough rows before updating the table?

I have a DLT/Lakeflow pipeline that creates a table, and I need to make sure that it will only update the resulting materialized view if it will have more than one million records. I've found this, but it seems to only work if I have already updated t...

Latest Reply
yuinagam
New Contributor II
  • 0 kudos

Thank you for the quick reply. Is there a common/recommended/possible way to work around this limitation? I don't mind not using the expectation API if it doesn't support logic that's based on aggregations.

1 More Replies
shan-databricks
by Databricks Partner
  • 3652 Views
  • 9 replies
  • 4 kudos

Resolved! Databricks Autoloader BadRecords path Issue

I have one file with 100 rows, of which two rows are bad data and the remaining 98 rows are good data. But when I use badRecordsPath, it moves the entire file to the bad records path, good data included, when it should move ...

Latest Reply
ShaileshBobay
Databricks Employee
  • 4 kudos

Why Entire Files Go to badRecordsPath: when you enable badRecordsPath in Autoloader or in Spark's file readers (with formats like CSV/JSON), here's what happens: Spark expects each data file to be internally well-formed with respect to the declared s...

8 More Replies
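The reply above draws a line between record-level and file-level handling: bad-record options can only quarantine individual rows when Spark can still parse the file against the declared schema, while a structurally broken file is quarantined whole. The record-level half of that behavior can be sketched in plain Python, with column count standing in for the schema check:

```python
import csv
import io

def split_good_bad(csv_text: str, expected_cols: int):
    """Split CSV rows into (good, bad) lists by column count.

    Mirrors record-level bad-record handling: only malformed rows are
    set aside, so the 98 good rows of a 100-row file survive instead
    of the whole file being quarantined.
    """
    good, bad = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        (good if len(row) == expected_cols else bad).append(row)
    return good, bad
```

In Spark itself, the analogous record-level option for CSV/JSON is PERMISSIVE mode with a corrupt-record column (or Auto Loader's rescued-data column), which keeps the parsable rows and captures malformed ones per record rather than per file.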
yit
by Databricks Partner
  • 5270 Views
  • 8 replies
  • 4 kudos

Resolved! Schema evolution for JSON files with AutoLoader

I am using Auto Loader to ingest JSON files into a managed table. Auto Loader saves only the first-level fields as new columns, while nested structs are stored as values within those columns. My goal is to support schema evolution when loading new fi...

Latest Reply
BS_THE_ANALYST
Databricks Partner
  • 4 kudos

@yit awesome. Glad that you got this solved. I look forward to the next problem. All the best, BS

7 More Replies
ZD
by New Contributor III
  • 2406 Views
  • 5 replies
  • 0 kudos

How to replace ${param} by :param

Hello, we previously used ${param} in our SQL queries: SELECT * FROM json.`${source_path}/file.json` However, this syntax is now deprecated. The recommended approach is to use :param instead. But when I attempt to replace ${param} with :param, I encounte...

Latest Reply
radothede
Valued Contributor II
  • 0 kudos

Hi @ZD, please try this syntax in your notebook for SQL: %sql declare _my_path = 'some_path'; select _my_path;

4 More Replies
Johannes_E
by New Contributor III
  • 1371 Views
  • 2 replies
  • 1 kudos

Resolved! Job cluster has no permission to create folder in Unity Catalog Volume

Hello everybody, I want to run a job that collects some CSV files from an SFTP server and saves them on my Unity Catalog Volume. While my personal cluster, defined like the following, has access to create folders on the volume, my job cluster doesn't. Defi...

Latest Reply
Johannes_E
New Contributor III
  • 1 kudos

Thank you, that helped although I had to use "SINGLE_USER" instead of "DATA_SECURITY_MODE_DEDICATED". According to the docs (https://docs.databricks.com/api/workspace/clusters/create) "SINGLE_USER" is an alias for "DATA_SECURITY_MODE_DEDICATED".

1 More Replies