Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

htd350
by New Contributor II
  • 1150 Views
  • 1 reply
  • 1 kudos

Predictive Optimization & Serverless Compute

Hello, I have a hard time understanding how predictive optimization works if serverless compute is not enabled. According to the documentation: Predictive optimization identifies tables that would benefit from ANALYZE, OPTIMIZE, and VACUUM operations and que...

Latest Reply
Alberto_Umana
Databricks Employee

Hi @htd350, Predictive optimization in Databricks largely depends on serverless compute to execute operations like ANALYZE, OPTIMIZE, and VACUUM, but I'm not 100% sure whether serverless is needed in all scenarios. I'll check internally and confirm...

mrstevegross
by Contributor III
  • 1036 Views
  • 3 replies
  • 0 kudos

Graviton & containers?

Currently, DBR does not permit a user to run a containerized job on Graviton machines (per these docs). In our case, we're running containerized jobs on a pool. We are exploring adopting Graviton, but, per those docs, DBR won't let us do that. Are t...

Latest Reply
Isi
Honored Contributor III

Hey @mrstevegross Steve, I found these docs from Databricks about environments; as you can see, it's in public preview... If you find my previous answer helpful, feel free to mark it as the solution so it can help others as well. Thanks! Isi

2 More Replies
suppome
by New Contributor
  • 508 Views
  • 1 reply
  • 0 kudos

Can CAN RESTART read logs from Job and Spark?

Is it possible to read logs from a Job or workflow run when I have the CAN RESTART role?

Latest Reply
Isi
Honored Contributor III

Hey @suppome, I'm sharing the security model for clusters and Jobs. Hope this helps. Isi

felix_counter
by New Contributor III
  • 18517 Views
  • 7 replies
  • 3 kudos

How to authenticate databricks provider in terraform using a system-managed identity?

Hello, I want to authenticate the Databricks provider using a system-managed identity in Azure. The identity resides in a different subscription than the Databricks workspace. According to the "authentication" section of the Databricks provider docume...

Data Engineering
authentication
databricks provider
managed identity
Terraform
Latest Reply
goTEEMgo
New Contributor II

Add an environment variable to your run environment: add TF_LOG and set it to true. Scroll through and look for an OAuth API call. Look at the resource. I have run into the same problem, and it looks like our app registration for the AzureDatabricks enterprise applicatio...

6 More Replies
aswithap
by New Contributor
  • 1135 Views
  • 1 reply
  • 0 kudos

Feasibility of Dynamically Reusing Common user defined functions Across Multiple DLT Notebooks

Hi @DataBricks team, I'm exploring ways to enable dynamic reusability of common user-defined functions across multiple notebooks in a DLT (Delta Live Tables) pipeline. The goal is to avoid duplicating code and maintain a centralized location for commo...

Latest Reply
ashraf1395
Honored Contributor

A simple and recommended approach: if possible, bundle all those common user-defined functions into a structured Python package / .whl file. Once this .whl file is created, you can simply upload it to your catalog volume and the f...

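As an illustration of the packaging approach above, here is a minimal sketch of what a shared helper module inside such a package might contain (the package name `common_udfs` and the functions are hypothetical, not from the original thread):

```python
# common_udfs/transforms.py -- hypothetical module bundled into a .whl,
# uploaded to a Unity Catalog volume, and installed in each DLT pipeline.

def normalize_name(raw: str) -> str:
    """Trim surrounding whitespace and lowercase a name value."""
    return raw.strip().lower()

def is_valid_id(value: str) -> bool:
    """Check that an identifier is non-empty and numeric."""
    return len(value) > 0 and value.isdigit()
```

Each DLT notebook would then import from the installed wheel (e.g. `from common_udfs.transforms import normalize_name`), so the logic lives in one place instead of being copied into every notebook.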
vishaldevarajan
by New Contributor II
  • 2546 Views
  • 3 replies
  • 0 kudos

Unable to read excel files in the Azure databricks (UC enabled workspace)

Hello, After adding the Maven library com.crealytics:spark-excel_2.12:0.13.5 under the artifact allowlist, I installed it at the Azure Databricks cluster level (shared, Unity Catalog enabled, runtime 15.4). Then I tried to create a df for the exc...

Data Engineering
Azure Databricks
Excel File
Latest Reply
Louis_Frolio
Databricks Employee

I did a little more digging and found further information:   Unity Catalog does not natively support reading Excel files directly. Based on the provided context, there are a few key points to consider: Third-Party Libraries: Reading Excel files in D...

2 More Replies
walgt
by New Contributor II
  • 2694 Views
  • 1 reply
  • 1 kudos

Resolved! Databricks data engineer associate exam

Hi everyone, I'm preparing for the Databricks Data Engineer Associate certification. On the Databricks website, they list the following self-paced courses available in Databricks Academy for exam preparation: Data Ingestion with Delta Lake, Deploy Worklo...

Latest Reply
Louis_Frolio
Databricks Employee

Greetings, Yes, you have identified the correct sequence of courses to take before attempting the exam. I would also recommend gaining at least six months of practical experience using Databricks for data engineering tasks prior to sitting for the ce...

Sadam97
by New Contributor III
  • 1064 Views
  • 3 replies
  • 0 kudos

Predictive Optimization is not running

We have enabled predictive optimization at the account level and the metastore level. The enabled checkbox can be seen in the catalog details and table details. When I query the system.storage.predictive_optimization_operations_history table, it is still empty....

Latest Reply
Louis_Frolio
Databricks Employee

I can't help with your specific workspace as I don't have access to any customer environment. Support can help if you open a ticket with them, but at this point I am out of suggestions.

2 More Replies
ShivangiB
by New Contributor III
  • 1668 Views
  • 8 replies
  • 0 kudos

Not Able To Access GCP storage bucket from Databricks

While running:
df = spark.read.format("csv") \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .load('path')
df.show()
Getting error: java.io.IOException: Invalid PKCS8 data. Cluster Spark config: spark.hadoop.fs.gs.auth.service....

Latest Reply
Louis_Frolio
Databricks Employee

At this point it is out of my area of knowledge and I don't have any further suggestions. You may want to consider contacting Databricks Support if you have a support contract.

7 More Replies
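One common cause of an "Invalid PKCS8 data" error from the GCS connector (not confirmed for this specific thread) is a service-account private key whose newlines were pasted into the Spark config as literal `\n` sequences, or a key that lost its PEM header/footer. A small hypothetical helper to sanity-check and repair such a key:

```python
def repair_private_key(key: str) -> str:
    """Convert literal backslash-n sequences (as copied from a JSON
    keyfile into a Spark config field) back into real newlines."""
    return key.replace("\\n", "\n")

def looks_like_pem(key: str) -> bool:
    """Rough check that the key still has its PEM delimiters intact."""
    return (key.startswith("-----BEGIN PRIVATE KEY-----")
            and key.rstrip().endswith("-----END PRIVATE KEY-----"))
```

If `looks_like_pem` fails on the value you configured, re-copy the `private_key` field from the service-account JSON keyfile rather than hand-editing it.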
Phani1
by Valued Contributor II
  • 6895 Views
  • 2 replies
  • 0 kudos

Resolved! Databricks with Private cloud

Hi Databricks Team, Is it possible for Databricks to offer support for private cloud environments other than Azure, GCP, and AWS? The client intends to utilize Databricks in their own cloud for enhanced security. If this is feasible, what is the proce...

Latest Reply
mtatusDHS
New Contributor II

We're looking at Databricks, but would prefer to use a Pure Storage Array to house data, mostly because of the cost of data storage for cloud providers. We're okay using cloud compute, but storage is much more feasible for us with local/private stora...

1 More Replies
ChristianRRL
by Valued Contributor III
  • 4020 Views
  • 3 replies
  • 2 kudos

Databricks UMF Best Practice

Hi there, I would like to get some feedback on the ideal/suggested ways to get UMF data from our Azure cloud into Databricks. For context, UMF can mean either: User Managed File or User Maintained File. Basically, a UMF could be something like a si...

Data Engineering
Data ingestion
UMF
User Maintained File
User Managed File
Latest Reply
Louis_Frolio
Databricks Employee

I am not an expert on this topic or Azure services, but I did some research and have some suggested courses of action for you to test out. To address your request for suggested ways to get User Managed Files (UMF) from Azure into Databricks, here are...

2 More Replies
ChristianRRL
by Valued Contributor III
  • 1433 Views
  • 3 replies
  • 4 kudos

Resolved! toml file syntax highlighting

Hi there, I'm curious if there's a way for Databricks to support syntax highlighting for a language that is currently not supported in our DBX configuration. For example, I'm using .toml files, but Databricks doesn't understand them and displays them as ...

Latest Reply
Advika
Databricks Employee

Hello @ChristianRRL! Sorry for the delayed response. Databricks currently does not support syntax highlighting for .toml files. As a workaround, you can edit TOML files in external editors like VS Code (with plugins) and sync them to Databricks using...

2 More Replies
georgef
by New Contributor III
  • 7137 Views
  • 3 replies
  • 3 kudos

Resolved! Cannot import relative python paths

Hello, Some variations of this question have been asked before, but there doesn't seem to be an answer for the following simple use case: I have the following file structure in a Databricks Asset Bundles project:
src
--dir1
----file1.py
--dir2
----file2...

Latest Reply
klaas
New Contributor II

This works as long as the script calling the module is indeed __main__; I've changed it a bit to make it more generic:
import os
import sys

def find_module(path):
    while path:
        if os.path.basename(path) == "src":
            return path
        ...

2 More Replies
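Assuming the helper above simply walks upward until it finds the `src` directory (the reply's preview is truncated), a complete hedged version of the pattern might look like this:

```python
import os
import sys

def find_src_root(path):
    """Walk up from `path` until a directory named 'src' is found,
    returning that path, or None if no such ancestor exists."""
    while path:
        if os.path.basename(path) == "src":
            return path
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            return None
        path = parent
    return None

# Hypothetical usage from any script inside the bundle: make modules
# under src/ importable regardless of where the script lives.
# root = find_src_root(os.path.dirname(os.path.abspath(__file__)))
# if root and root not in sys.path:
#     sys.path.append(root)
```

The design point is that the script locates `src` relative to its own position, so imports keep working whether the code runs locally or as a deployed bundle.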
lsrinivas2k13
by New Contributor II
  • 1127 Views
  • 3 replies
  • 0 kudos

Not able to run Python script even after everything is in place in Azure Databricks

Getting the below error while running a Python script which connects to an Azure SQL DB: Database connection error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)"). Can some on...

Latest Reply
Louis_Frolio
Databricks Employee

The error occurs because the Microsoft ODBC Driver 17 for SQL Server is missing on your Azure Databricks cluster. Here's how to fix it:
Steps to Resolve
Step 1: Create an Init Script to Install the ODBC Driver
1. Create a file named `odbc-install.sh` with...

2 More Replies
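Once the driver is installed via the init script, the Python script must reference it by the exact name unixODBC knows it under. A hypothetical sketch of assembling the connection string (server, database, and credential values are placeholders, not from the thread):

```python
def build_conn_str(server, database, user, password):
    """Assemble an ODBC connection string referencing the installed
    'ODBC Driver 17 for SQL Server' by name, as pyodbc expects."""
    return (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={server};DATABASE={database};"
        f"UID={user};PWD={password}"
    )

# Hypothetical usage with pyodbc once the driver exists on the cluster:
# import pyodbc
# conn = pyodbc.connect(build_conn_str(
#     "myserver.database.windows.net", "mydb", "user", "secret"))
```

If the driver name in the string does not match an entry in /etc/odbcinst.ini, unixODBC raises exactly the "Can't open lib ... file not found" error from the original post.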
NikosLoutas
by New Contributor III
  • 1013 Views
  • 3 replies
  • 2 kudos

Resolved! Materialized Views Compute

When creating a Materialized View (MV) without a schedule, there seems to be a cost associated with the MV once it is created, even if it is not queried. The question is, once the MV is created, is there already a "hot" compute ready for use in case a...

Latest Reply
Louis_Frolio
Databricks Employee

Please select "Accept as Solution" so that others can benefit from this exchange.  Regards, Louis.

2 More Replies
