Data Engineering

Forum Posts

Sorted by:

Start a conversation

by Jerry01 • New Contributor III

03-16-2023 3:22:14 AM

11321 Views
3 replies
2 kudos

Is ABAC feature enabled?

Can anyone please share me the example of how it works in terms of access controls?

Data Engineering

11321 Views
3 replies
2 kudos

03-16-2023 3:22:14 AM

View Replies

Latest Reply

Anonymous
Not applicable

03-25-2023 3:48:20 AM

2 kudos

Hi @Naveena G Thank you for posting your question in our community! We are happy to assist you.To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

2 kudos

03-25-2023 3:48:20 AM

2 More Replies

by santhoshKumarV • New Contributor II

01-17-2025 2:23:06 AM

1851 Views
2 replies
2 kudos

Code coverage on Databricks notebook

I have a scenario where my application code a scala package and notebook code[Scala] under /resources folder is being maitained.I am trying to look for a easiest way to perform code coverage on my notebook , does Databricks provide any option for it....

Data Engineering

1851 Views
2 replies
2 kudos

01-17-2025 2:23:06 AM

View Replies

Latest Reply

santhoshKumarV
New Contributor II

01-21-2025 11:22:02 PM

2 kudos

Important thing which missed to add in post is , we do maintan notebook code as .scala under resources and maitian in github. Files(.scala) from resources gets deployed as notebook using github action.With my approach of moving under package, I will ...

2 kudos

01-21-2025 11:22:02 PM

1 More Replies

by yvishal519 • Contributor

10-22-2024 11:01:22 AM

4234 Views
8 replies
2 kudos

Handling Audit Columns and SCD Type 1 in Databricks DLT Pipeline with Unity Catalog: Circular Depend

I am working on a Delta Live Tables (DLT) pipeline with Unity Catalog, where we are reading data from Azure Data Lake Storage (ADLS) and creating a table in the silver layer with Slowly Changing Dimensions (SCD) Type 1 enabled. In addition, we are ad...

Data Engineering

4234 Views
8 replies
2 kudos

10-22-2024 11:01:22 AM

View Replies

Latest Reply

yvishal519
Contributor

01-21-2025 11:13:30 PM

2 kudos

@NandiniN @RBlum I haven’t found an ideal solution for handling audit columns effectively in Databricks Delta Live Tables (DLT) when implementing SCD Type 1. It seems there’s no straightforward way to incorporate these columns into the apply_changes...

2 kudos

01-21-2025 11:13:30 PM

7 More Replies

by Deloitte_DS • New Contributor II

08-16-2023 12:30:00 PM

8725 Views
5 replies
1 kudos

Resolved! Unable to install poppler-utils

Hi,I'm trying to install system level package "Poppler-utils" for the cluster. I added the following line to the init.sh script.sudo apt-get -f -y install poppler-utilsI got the following error: PDFInfoNotInstalledError: Unable to get page count. Is ...

Data Engineering

8725 Views
5 replies
1 kudos

08-16-2023 12:30:00 PM

View Replies

Latest Reply

Raghavan93513
Databricks Employee

01-21-2025 8:08:56 PM

1 kudos

Hi Team, If you use a single user cluster and use the below init script, it will work: sudo rm -r /var/lib/apt/lists/* sudo apt clean && sudo apt update --fix-missing -ysudo apt-get install poppler-utils tesseract-ocr -y But if you are using a shared...

1 kudos

01-21-2025 8:08:56 PM

4 More Replies

by vvk • New Contributor II

03-13-2024 12:24:15 AM

6082 Views
2 replies
0 kudos

Unable to upload a wheel file in Azure DevOps pipeline

Hi, I am trying to upload a wheel file to Databricks workspace using Azure DevOps release pipeline to use it in the interactive cluster. I tried "databricks workspace import" command, but looks like it does not support .whl files. Hence, I tried to u...

Data Engineering

6082 Views
2 replies
0 kudos

03-13-2024 12:24:15 AM

View Replies

Latest Reply

Satyadeepak
Databricks Employee

01-21-2025 10:53:40 AM

0 kudos

Hi @vvk - The HTTP 403 error typically indicates a permissions issue. Ensure that the SP has the necessary permissions to perform the fs cp operation on the specified path. Verify that the path specified in the fs cp command is correct and that the v...

0 kudos

01-21-2025 10:53:40 AM

1 More Replies

by stvayers • New Contributor

03-21-2024 2:03:08 PM

6118 Views
1 replies
0 kudos

How to mount AWS EFS via NFS on a Databricks Cluster

I'm trying to read in ~500 million small json files into an spark autoloader pipeline, and I seem to be slowed down massively by S3 request limits, so I want to explore using AWS EFS instead. I found this blog post: https://www.databricks.com/blog/20...

Data Engineering

6118 Views
1 replies
0 kudos

03-21-2024 2:03:08 PM

View Replies

Latest Reply

Satyadeepak
Databricks Employee

01-21-2025 10:31:05 AM

0 kudos

Hi @stvayers Please refer to this doc. https://docs.databricks.com/api/workspace/clusters/create It has instructions on how to mount using EFS.

0 kudos

01-21-2025 10:31:05 AM

by Bepposbeste1993 • New Contributor III

11-30-2024 2:41:18 PM

1896 Views
4 replies
0 kudos

Resolved! select 1 query not finishing

Hello,I have the issue that even a query like "select 1" is not finishing. The sql warehouse runs infinite. I have no idea where to look for any issues because in the SPARK UI I cant see any error.What is intresting is that also allpurpose clusters (...

Data Engineering

1896 Views
4 replies
0 kudos

11-30-2024 2:41:18 PM

View Replies

Latest Reply

Alberto_Umana
Databricks Employee

12-10-2024 11:46:44 AM

0 kudos

Hi @Bepposbeste1993, Do you have the case ID raised for this issue?

0 kudos

12-10-2024 11:46:44 AM

3 More Replies

by cmilligan • Contributor II

01-18-2023 8:57:36 AM

4203 Views
4 replies
0 kudos

Undescriptive error when trying to insert overwrite into a table

I have a query that I'm trying to insert overwrite into a table. In an effort to try and speed up the query I added a range join hint. After adding it I started getting the error below.I can get around this though by creating a temporary view of the ...

Data Engineering

4203 Views
4 replies
0 kudos

01-18-2023 8:57:36 AM

View Replies

Latest Reply

jose_gonzalez
Databricks Employee

01-30-2023 2:11:55 PM

0 kudos

Could you share your code and the full error stack trace please? Check the driver logs for the full stack trace.

0 kudos

01-30-2023 2:11:55 PM

3 More Replies

by pranitha • New Contributor II

01-21-2025 2:58:34 AM

815 Views
3 replies
0 kudos

How to fetch max or avg worker nodes for a specific time period using system tables?

Data Engineering

815 Views
3 replies
0 kudos

01-21-2025 2:58:34 AM

View Replies

Latest Reply

MadhuB
Valued Contributor

01-21-2025 6:45:21 AM

0 kudos

Hi @pranitha Use this query to get the cluster details along with cost info as well. WITH hourly_metrics AS ( SELECT date_trunc('hour', usage_start_time) as hour, usage_metadata.cluster_id, sku_name, MAX(usage_quantity) as max_usage, ...

0 kudos

01-21-2025 6:45:21 AM

2 More Replies

by abelian-grape • New Contributor III

01-21-2025 6:29:57 AM

992 Views
1 replies
0 kudos

Near real time processing with CDC from snowflake to databricks

Hi I would like to configure near real time streaming on Databricks to process data as soon as a new data finish processing on snowflake e.g. with DLT pipelins and Auto Loader. Which option would be better for this setup? Option A)Export the Snowpark...

Data Engineering

992 Views
1 replies
0 kudos

01-21-2025 6:29:57 AM

View Replies

Latest Reply

saurabh18cs
Honored Contributor II

01-21-2025 6:36:40 AM

0 kudos

it is like latency vs complexity and cost. you have to choose for yourself for me option A sounds reasonable

0 kudos

01-21-2025 6:36:40 AM

by Sans • New Contributor III

03-12-2024 11:18:49 PM

5353 Views
9 replies
3 kudos

Unable to create new compute in community databricks

Hi Team,I am unable to create computer in databricks community due to below error. Please advice.Bootstrap Timeout:Node daemon ping timeout in 780000 ms for instance i-0ab6798b2c762fb25 @ 10.172.246.217. Please check network connectivity between the ...

Data Engineering

5353 Views
9 replies
3 kudos

03-12-2024 11:18:49 PM

View Replies

Latest Reply

drag7ter
Contributor

01-21-2025 1:30:44 AM

3 kudos

The same get this error regularly in eu-west-1 workspace. So many issues. Did databricks try to check this issue, as it could be a bug? No any response so far?

3 kudos

01-21-2025 1:30:44 AM

8 More Replies

by jyothib • New Contributor II

06-13-2024 1:51:04 PM

2299 Views
2 replies
3 kudos

Resolved! System tables latency

How much time is the latency of system tables#unitycatalog

Data Engineering

2299 Views
2 replies
3 kudos

06-13-2024 1:51:04 PM

View Replies

Latest Reply

raphaelblg
Databricks Employee

06-13-2024 3:25:41 PM

3 kudos

@jyothib at the current moment, system tables are still under Public Preview stage (more details at: https://docs.databricks.com/en/admin/system-tables/index.html)We don’t offer data freshness SLOs for system tables at this point and there are no pla...

3 kudos

06-13-2024 3:25:41 PM

1 More Replies

by Kanna • New Contributor II

01-20-2025 9:50:09 PM

1659 Views
1 replies
4 kudos

Resolved! Autoloader clarification

Hi team,Good day! I would like to know how we can perform an incremental load using Autoloader.I am uploading one file to DBFS and writing it into a table. When I upload a similar file to the same directory, it does not perform an incremental load; i...

Data Engineering

1659 Views
1 replies
4 kudos

01-20-2025 9:50:09 PM

View Replies

Latest Reply

boitumelodikoko
Valued Contributor

01-21-2025 12:49:02 AM

4 kudos

Hi @Kanna,Good day! Based on the issue you’re encountering, I believe the problem stems from missing deduplication or upsert logic in your current implementation. Here's an approach that combines the power of Databricks Autoloader and Delta Lake to h...

4 kudos

01-21-2025 12:49:02 AM

by harlemmuniz • New Contributor II

01-24-2024 12:33:22 PM

3932 Views
8 replies
1 kudos

Issue with Job Versioning with “Run Job” tasks and Deployments between envinronments

Hello,I am writing to bring to your attention an issue that we have encountered while working with Databricks and seek your assistance in resolving it.When running a Job of Workflow with the task "Run Job" and clicking on "View YAML/JSON," we have ob...

Data Engineering

3932 Views
8 replies
1 kudos

01-24-2024 12:33:22 PM

View Replies

Latest Reply

saurabh18cs
Honored Contributor II

01-14-2025 5:36:57 AM

1 kudos

Hi , Sorry if I don't understand your usecase, are your trying to start/stop databricks job via terraform? for this reason do you want to harcode job-id??

1 kudos

01-14-2025 5:36:57 AM

7 More Replies

by Kumar4567 • New Contributor II

04-20-2023 6:51:26 AM

7252 Views
4 replies
0 kudos

disable downloading files for specific group of users ?

I see we can disable/enable download button for entire workspace using download button for notebook results.is there a way to disable/enable this just for specific group of users ?

Data Engineering

7252 Views
4 replies
0 kudos

04-20-2023 6:51:26 AM

View Replies

Latest Reply

_anonymous
New Contributor II

01-20-2025 6:20:14 PM

0 kudos

To future adventurers, the feature described by responder to OP does not exist.

0 kudos

01-20-2025 6:20:14 PM

3 More Replies

Databricks Community

Forum Posts

Is ABAC feature enabled?

Code coverage on Databricks notebook

Handling Audit Columns and SCD Type 1 in Databricks DLT Pipeline with Unity Catalog: Circular Depend

Resolved! Unable to install poppler-utils

Unable to upload a wheel file in Azure DevOps pipeline

How to mount AWS EFS via NFS on a Databricks Cluster

Resolved! select 1 query not finishing

Undescriptive error when trying to insert overwrite into a table

How to fetch max or avg worker nodes for a specific time period using system tables?

Near real time processing with CDC from snowflake to databricks

Unable to create new compute in community databricks

Resolved! System tables latency

Resolved! Autoloader clarification

Issue with Job Versioning with “Run Job” tasks and Deployments between envinronments

disable downloading files for specific group of users ?

Join Us as a Local Community Builder!

Row tracking in Delta tables

mongodb connector duplicate writes

Why is spark creating 5 jobs and 200 tasks?

What are the options for "spark_conf.spark.databri...

Databricks AutoLoader IncrementalListing mode chan...