Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

srinivasu
by New Contributor II
  • 1078 Views
  • 5 replies
  • 0 kudos

Unable to find course materials for the course: Delivery Specialization: CDW Migration Best Practice

Hi, I tried searching everywhere but am unable to find course materials for the course "Delivery Specialization: CDW Migration Best Practice". In the video it says see course materials, but I don't see anything with that name anywhere. Please let me know if s...

[Attachment: srinivasu_0-1736756512399.png]
Latest Reply
srinivasu
New Contributor II
  • 0 kudos

I believe it is closed; I'm unable to check the status of the ticket. It is giving an error: "You currently do not have access to Help Center. Please reach out to your admin or send an email to help@databricks.com". If you are able to check the request ...

  • 0 kudos
4 More Replies
UlrikChristense
by New Contributor II
  • 1283 Views
  • 5 replies
  • 0 kudos

Apply-changes-table (SCD2) with huge amounts of `rowIsHidden=True` rows

I have a lot of DLT tables created using the `apply_changes` function with type 2 history. This function creates a physical table `__apply_changes_storage_<table_name>` and a view `<table_name>` on top of it. The number of rows the physical table...

Latest Reply
UlrikChristense
New Contributor II
  • 0 kudos

I'm trying, but it doesn't seem to change anything. When are these table properties "applied": when the job is run, or as a background process?

  • 0 kudos
4 More Replies
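For the thread above, a minimal sketch of the kind of table properties under discussion. Assumptions (not confirmed by the thread): the property names `pipelines.cdc.tombstoneGCThresholdInSeconds` and `pipelines.cdc.tombstoneGCFrequencyInSeconds` follow the Databricks DLT CDC documentation, and the threshold/frequency values here are purely illustrative; verify both against your runtime.

```python
# Sketch (not a confirmed fix): table properties that tune how the hidden
# backing table __apply_changes_storage_<table_name> garbage-collects
# rowIsHidden tombstone rows. Property names per Databricks DLT CDC docs;
# the values are illustrative placeholders.
tombstone_gc_properties = {
    # A tombstone row must be at least this old (seconds) before it is
    # eligible for cleanup: 5 days here.
    "pipelines.cdc.tombstoneGCThresholdInSeconds": str(5 * 24 * 3600),
    # How often (seconds) the cleanup check runs.
    "pipelines.cdc.tombstoneGCFrequencyInSeconds": "300",
}

# These would be passed as table_properties when declaring the
# apply_changes target, e.g.
# dlt.create_streaming_table("my_table", table_properties=tombstone_gc_properties)
```

On the "when are they applied" question in the reply: as I understand it, properties set on a DLT target take effect on subsequent pipeline updates rather than retroactively in the background, but this is worth confirming with Databricks support.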
MichielPovre
by New Contributor II
  • 2070 Views
  • 1 reply
  • 1 kudos

Resolved! Delta Live Tables - use cluster-scoped init scripts

Hi All, according to the documentation for Delta Live Tables (https://docs.databricks.com/en/delta-live-tables/external-dependencies.html), one can use either global or cluster-scoped init scripts. However, I don't see an option to select init scripts in...

Data Engineering
Delta Live Tables
Latest Reply
AngadSingh
New Contributor III
  • 1 kudos

Hi, you can do it via a cluster policy. It can be achieved in two steps: create a cluster policy with the required attributes, and provide the init_scripts attribute in that policy. For reference: https://learn.microsoft.com/en-us/azure/databricks/admi...

  • 1 kudos
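The two-step approach in the accepted reply can be sketched as a policy definition. Assumptions: the attribute path `init_scripts.0.workspace.destination` follows the cluster-policy attribute syntax in the linked docs, and the script path is a placeholder.

```python
# Hypothetical cluster-policy definition fixing a cluster-scoped init
# script (step 1 of the reply). The resulting policy is then attached to
# the DLT pipeline's cluster (step 2). The script path is a placeholder.
init_script_policy = {
    "init_scripts.0.workspace.destination": {
        "type": "fixed",
        "value": "/Shared/init-scripts/install-deps.sh",
    }
}
```

Because the attribute is "fixed", every cluster created under the policy runs the script, which is what makes it usable for DLT clusters where the init-script UI picker is absent.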
Meenambigai
by New Contributor
  • 631 Views
  • 1 reply
  • 0 kudos

Link for webinar Get Started with Databricks for Data Engineering session

Where can I find the link for the "Get Started with Databricks for Data Engineering" webinar session?

Latest Reply
Advika_
Databricks Employee
  • 0 kudos

Hello @Meenambigai! If you have successfully enrolled in the course, open the Databricks Academy, click on the kebab menu icon (upper left corner), select "My Calendar". You’ll see the courses you’re enrolled in, organized by date. Click on the link ...

  • 0 kudos
adrjuju
by New Contributor II
  • 814 Views
  • 2 replies
  • 0 kudos

Resolved! Custom library in clean rooms

Hello! I want to use a clean room to run some algorithms developed for one of my customers without exchanging any data; the code is stored as a Python library in a private Git repo connected to Databricks. 1 - We'd like to import the library in...

Latest Reply
adrjuju
New Contributor II
  • 0 kudos

Thanks for the solution

  • 0 kudos
1 More Replies
Brad
by Contributor II
  • 1012 Views
  • 3 replies
  • 0 kudos

How to add shared libs

Hi team, I want to add some shared libs which might be used by many repos, e.g. some util functions. 1. What is the recommended way to add those libs? E.g. create a separate repo and reference it in another repo? 2. How ...

Latest Reply
radothede
Valued Contributor II
  • 0 kudos

Hi @Brad Typically, you specify shared libraries in an init script. From there, the init script is executed for each job compute, ensuring library consistency. The other way: you could use a job cluster policy and specify the desired libraries that will be pro...

  • 0 kudos
2 More Replies
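The cluster-policy route in the reply can be sketched like this. Assumptions: cluster policies accept a `libraries` array using the same schema as the Libraries API, and the wheel path and package name below are placeholders for whatever the shared repo publishes.

```python
# Hypothetical "libraries" section for a cluster policy, so every
# cluster created under the policy pre-installs the shared utilities.
shared_libraries = [
    # a wheel built from the shared repo and published to a UC volume
    {"whl": "/Volumes/main/default/libs/shared_utils-1.2.0-py3-none-any.whl"},
    # or a version pinned from an (internal) PyPI index
    {"pypi": {"package": "shared-utils==1.2.0"}},
]
```

Publishing the shared code as a versioned wheel (rather than referencing another repo's source directly) keeps consumers pinned to known-good versions.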
pinaki1
by New Contributor III
  • 2689 Views
  • 4 replies
  • 2 kudos

Serverless compute databricks

1. How to connect an S3 bucket to Databricks, since DBFS mounts are not supported? 2. In serverless compute, Spark Context (sc), spark.sparkContext, and sqlContext are not supported? Does that mean it will not leverage the power of distributed processing? 3. Wha...

Latest Reply
User16653924625
Databricks Employee
  • 2 kudos

Please see this documentation for accessing cloud storage by setting up the Unity Catalog objects Storage Credential and External Location: https://docs.databricks.com/en/connect/unity-catalog/cloud-storage/index.html

  • 2 kudos
3 More Replies
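The reply's pointer can be sketched with the SQL it implies. Assumptions: a storage credential named `my_s3_credential` has already been created by an admin, and the location name and bucket are placeholders; syntax per the linked Unity Catalog docs.

```python
# Hypothetical: register an S3 path as a Unity Catalog external location,
# replacing DBFS mounts (which serverless compute does not support).
# Assumes the storage credential "my_s3_credential" already exists.
create_location_sql = """
CREATE EXTERNAL LOCATION IF NOT EXISTS my_s3_location
URL 's3://my-bucket/raw'
WITH (STORAGE CREDENTIAL my_s3_credential)
"""

# On a warehouse or cluster: spark.sql(create_location_sql), after which
# the governed path can be read directly, e.g.
# spark.read.json("s3://my-bucket/raw/events/")
```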
Jerry01
by New Contributor III
  • 11344 Views
  • 3 replies
  • 2 kudos

Is the ABAC feature enabled?

Can anyone please share an example of how it works in terms of access controls?

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Naveena G Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...

  • 2 kudos
2 More Replies
santhoshKumarV
by New Contributor II
  • 1906 Views
  • 2 replies
  • 2 kudos

Code coverage on Databricks notebook

I have a scenario where my application code is a Scala package, and notebook code [Scala] under the /resources folder is maintained alongside it. I am trying to find the easiest way to perform code coverage on my notebooks; does Databricks provide any option for it...

Latest Reply
santhoshKumarV
New Contributor II
  • 2 kudos

An important thing I missed adding to the post: we maintain the notebook code as .scala files under /resources in GitHub. Files (.scala) from resources get deployed as notebooks using a GitHub Action. With my approach of moving under a package, I will ...

  • 2 kudos
1 More Replies
yvishal519
by Contributor
  • 4312 Views
  • 8 replies
  • 2 kudos

Handling Audit Columns and SCD Type 1 in Databricks DLT Pipeline with Unity Catalog: Circular Depend

I am working on a Delta Live Tables (DLT) pipeline with Unity Catalog, where we are reading data from Azure Data Lake Storage (ADLS) and creating a table in the silver layer with Slowly Changing Dimensions (SCD) Type 1 enabled. In addition, we are ad...

[Attachment: yvishal519_0-1729619599002.png]
Latest Reply
yvishal519
Contributor
  • 2 kudos

@NandiniN  @RBlum I haven’t found an ideal solution for handling audit columns effectively in Databricks Delta Live Tables (DLT) when implementing SCD Type 1. It seems there’s no straightforward way to incorporate these columns into the apply_changes...

  • 2 kudos
7 More Replies
Deloitte_DS
by New Contributor II
  • 8835 Views
  • 5 replies
  • 1 kudos

Resolved! Unable to install poppler-utils

Hi, I'm trying to install the system-level package poppler-utils for the cluster. I added the following line to the init.sh script: "sudo apt-get -f -y install poppler-utils". I got the following error: PDFInfoNotInstalledError: Unable to get page count. Is ...

Latest Reply
Raghavan93513
Databricks Employee
  • 1 kudos

Hi Team, if you use a single-user cluster and the init script below, it will work:
sudo rm -r /var/lib/apt/lists/*
sudo apt clean && sudo apt update --fix-missing -y
sudo apt-get install poppler-utils tesseract-ocr -y
But if you are using a shared...

  • 1 kudos
4 More Replies
vvk
by New Contributor II
  • 6156 Views
  • 2 replies
  • 0 kudos

Unable to upload a wheel file in Azure DevOps pipeline

Hi, I am trying to upload a wheel file to a Databricks workspace using an Azure DevOps release pipeline, to use it on an interactive cluster. I tried the "databricks workspace import" command, but it looks like it does not support .whl files. Hence, I tried to u...

Latest Reply
Satyadeepak
Databricks Employee
  • 0 kudos

Hi @vvk - The HTTP 403 error typically indicates a permissions issue. Ensure that the SP (service principal) has the necessary permissions to perform the fs cp operation on the specified path. Verify that the path specified in the fs cp command is correct and that the v...

  • 0 kudos
1 More Replies
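A sketch of the workaround this thread converges on. Assumptions: the wheel and volume paths are placeholders, `databricks fs cp` is the CLI's file-copy command for workspace file systems, and the service principal running the release has write access to the target.

```python
# Hypothetical DevOps release step: "databricks workspace import" rejects
# binary .whl files, so copy the wheel with the CLI's fs commands instead.
# Paths are placeholders.
wheel = "dist/my_lib-0.1.0-py3-none-any.whl"
target = "dbfs:/Volumes/main/default/libs/my_lib-0.1.0-py3-none-any.whl"
upload_cmd = ["databricks", "fs", "cp", "--overwrite", wheel, target]
# e.g. subprocess.run(upload_cmd, check=True) inside the pipeline task.
# A 403 at this step usually means the service principal lacks write
# permission on the target volume or path.
```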
stvayers
by New Contributor
  • 6199 Views
  • 1 reply
  • 0 kudos

How to mount AWS EFS via NFS on a Databricks Cluster

I'm trying to read ~500 million small JSON files into a Spark Auto Loader pipeline, and I seem to be slowed down massively by S3 request limits, so I want to explore using AWS EFS instead. I found this blog post: https://www.databricks.com/blog/20...

Latest Reply
Satyadeepak
Databricks Employee
  • 0 kudos

Hi @stvayers Please refer to this doc. https://docs.databricks.com/api/workspace/clusters/create It has instructions on how to mount using EFS.  

  • 0 kudos
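The linked Clusters API doc describes NFS mounts via `cluster_mount_infos`. A sketch of the relevant fragment of a clusters/create request body; the EFS DNS name, mount options, and paths below are placeholders to adapt.

```python
# Hypothetical fragment of a clusters/create request body mounting an
# EFS share over NFS on cluster nodes. Field names follow the Clusters
# API; all values are placeholders.
efs_cluster_spec = {
    "cluster_mount_infos": [
        {
            "network_filesystem_info": {
                "server_address": "fs-0123456789abcdef0.efs.us-east-1.amazonaws.com",
                "mount_options": "rsize=1048576,wsize=1048576,hard,timeo=600",
            },
            # export path on the EFS side and mount point on each node
            "remote_mount_dir_path": "/",
            "local_mount_dir_path": "/mnt/efs",
        }
    ]
}
```

Files under /mnt/efs are then visible to Spark as local paths on every node, sidestepping S3 request-rate limits for the small-file workload described above.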
Bepposbeste1993
by New Contributor III
  • 1958 Views
  • 4 replies
  • 0 kudos

Resolved! select 1 query not finishing

Hello, I have the issue that even a query like "select 1" is not finishing; the SQL warehouse runs indefinitely. I have no idea where to look for the issue, because in the Spark UI I can't see any error. What is interesting is that all-purpose clusters (...

Latest Reply
Alberto_Umana
Databricks Employee
  • 0 kudos

Hi @Bepposbeste1993, Do you have the case ID raised for this issue? 

  • 0 kudos
3 More Replies
cmilligan
by Contributor II
  • 4262 Views
  • 4 replies
  • 0 kudos

Nondescript error when trying to insert overwrite into a table

I have a query that I'm trying to insert overwrite into a table. In an effort to speed up the query, I added a range join hint. After adding it, I started getting the error below. I can get around this, though, by creating a temporary view of the ...

[Attachment: Screenshot_20230118_104626]
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Could you share your code and the full error stack trace please? Check the driver logs for the full stack trace.

  • 0 kudos
3 More Replies
