Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

matanper
by New Contributor III
  • 4988 Views
  • 6 replies
  • 1 kudos

Custom docker image fails to initialize

I'm trying to use a custom docker image for my job. This is my Dockerfile:
FROM databricksruntime/standard:12.2-LTS
COPY . .
RUN /databricks/python3/bin/pip install -U pip
RUN /databricks/python3/bin/pip install -r requirements.txt
USER root
My job ...

Latest Reply
mrstevegross
Contributor III
  • 1 kudos

Did y'all ever figure this out? I'm running into a similar issue.

5 More Replies
badari_narayan
by New Contributor II
  • 454 Views
  • 1 reply
  • 0 kudos

Having an issue assigning databricks_current_metastore with the Terraform provider

I am trying to assign my databricks_current_metastore in Terraform and I get the following error back as output: Error: cannot read current metastore: cannot get client current metastore: invalid Databricks Workspace configuration with data.databric...

Latest Reply
Panda
Valued Contributor
  • 0 kudos

@badari_narayan Based on the above Terraform code, you are trying to use the databricks.accounts provider to read the current workspace metastore, which is incorrect — the databricks_current_metastore data source is a workspace-level resource, and must b...

johschmidt42
by New Contributor II
  • 1729 Views
  • 2 replies
  • 0 kudos

Autoloader cloudFiles.maxFilesPerTrigger ignored with .trigger(availableNow=True)?

Hi, I'm using the Auto Loader feature to read streaming data from Delta Lake files and process them in a batch. The trigger is set to availableNow to include all new data from the checkpoint offset, but I limit the number of delta files for the batch ...

Latest Reply
p_romm
New Contributor III
  • 0 kudos

In the docs it is "cloudFiles.maxFilesPerTrigger": https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/options

1 More Replies
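For later readers of this thread: a minimal sketch of the behavior under discussion, with illustrative paths, format, and limits. Per the Auto Loader options page p_romm links, the limit only takes effect when spelled with the cloudFiles. prefix, and .trigger(availableNow=True) (unlike the legacy Trigger.Once) honors it by splitting the backlog into multiple capped micro-batches.

```python
# Hedged sketch: source path, format, schema location, and limit are made up.
stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.maxFilesPerTrigger", 100)     # note the cloudFiles. prefix
    .option("cloudFiles.schemaLocation", "s3://my-bucket/_schema/landing/")
    .load("s3://my-bucket/landing/")
)

(stream.writeStream
    .option("checkpointLocation", "s3://my-bucket/_checkpoints/landing/")
    .trigger(availableNow=True)   # drain the backlog in capped batches, then stop
    .toTable("bronze.landing"))
```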
verargulla
by New Contributor III
  • 14481 Views
  • 5 replies
  • 4 kudos

Azure Databricks: Error Creating Cluster

We have provisioned a new workspace in Azure using our own VNet. Upon creating the first cluster, I encounter this error: Control Plane Request Failure: Failed to get instance bootstrap steps from the Databricks Control Plane. Please check that instan...

Latest Reply
Mohamednazeer
New Contributor III
  • 4 kudos

We are also facing the same issue.

4 More Replies
ayushmangal72
by New Contributor III
  • 534 Views
  • 2 replies
  • 1 kudos

Resolved! Revert cluster DBR version to the previous DBR

Hi Team, we updated our clusters' DBR version and later learned that some of our jobs had started failing. Now we want to revert to the previous DBR version, but we forgot which DBR version the jobs were running fine on. Is there any way ...

Latest Reply
ayushmangal72
New Contributor III
  • 1 kudos

Thank you for your reply. I also found another solution: I checked the event_logs, and the old DBR version was mentioned there.

1 More Replies
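For anyone else hunting for a lost DBR version: a hedged sketch of the event-log approach using the databricks-sdk Python client (the cluster ID is a placeholder). EDITED events record the cluster spec before and after each configuration change, including the old spark_version.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import EventType

w = WorkspaceClient()
# Walk the cluster's edit events; previous_attributes holds the pre-edit spec.
for event in w.clusters.events(cluster_id="0101-123456-abcdef12",
                               event_types=[EventType.EDITED]):
    prev = event.details.previous_attributes
    if prev and prev.spark_version:
        print(event.timestamp, "previous DBR:", prev.spark_version)
```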
Kayla
by Valued Contributor II
  • 4300 Views
  • 5 replies
  • 0 kudos

Errors When Using R on Unity Catalog Clusters

We are running into errors when running workflows with multiple jobs using the same notebook with different parameters. They are reading from tables we still have in hive_metastore; there are no Unity Catalog tables or functionality referenced anywhere. We'...

Latest Reply
Anwarchubb
New Contributor II
  • 0 kudos

An R-enabled cluster only supports a single user/group, so please check permissions at your group level.

4 More Replies
stef2
by New Contributor III
  • 11092 Views
  • 14 replies
  • 5 kudos

Resolved! 2023-03-22 10:29:23 | Error 403 | https://customer-academy.databricks.com/

I would like to know why I am getting this error when I tried to earn badges for lakehouse fundamentals. I can't access the quiz page. Can you please help on this?

Latest Reply
dkn_data
New Contributor II
  • 5 kudos

Log in with your Gmail account at customer-academy.databricks.com, search for the Lakehouse short course, and enroll for free.

13 More Replies
slimbnsalah
by New Contributor II
  • 1194 Views
  • 1 reply
  • 0 kudos

Use Salesforce Lakeflow Connector with a Salesforce Connected App

Hello, I'm trying to use the new Salesforce Lakeflow connector to ingest data into my Databricks account. However, I see only the option to connect using a normal user, whereas I want to use a Salesforce Connected App, just like how it is described here: Run fede...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

@slimbnsalah Please select the connection type as Salesforce Data Cloud; then you will be asked for the details.

ashap551
by New Contributor II
  • 1267 Views
  • 1 reply
  • 0 kudos

Streaming vs Batch with Continuous Trigger

Not sure what the concrete advantage is for me to create a streaming table vs a static one. In my case, I designed a table with a job that extracts the latest files from an S3 location and then appends them to a Delta table. I set the job...

Latest Reply
Ajay-Pandey
Esteemed Contributor III
  • 0 kudos

@ashap551 You're essentially implementing a well-optimized micro-batching process, and functionally, it's very similar to what readStream() with Autoloader would do. However, there are some advantages to using Autoloader and a proper streaming table ...

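To make the contrast in this reply concrete, a rough sketch with made-up paths and table names: the hand-rolled batch job must do its own new-file bookkeeping, while Auto Loader keeps that state in the stream checkpoint with exactly-once guarantees, yet still runs as a scheduled batch via availableNow.

```python
# Manual variant: the job itself must track which files it has already seen.
already_processed = set()   # in practice, loaded from wherever the job keeps state
new_files = [f.path for f in dbutils.fs.ls("s3://bucket/incoming/")
             if f.path not in already_processed]
if new_files:
    spark.read.parquet(*new_files).write.mode("append").saveAsTable("bronze.incoming")

# Auto Loader variant: the checkpoint does the bookkeeping.
(spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", "s3://bucket/_schema/incoming/")
    .load("s3://bucket/incoming/")
    .writeStream
    .option("checkpointLocation", "s3://bucket/_checkpoints/incoming/")
    .trigger(availableNow=True)
    .toTable("bronze.incoming"))
```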
Nik21
by New Contributor II
  • 462 Views
  • 2 replies
  • 1 kudos

warning message when secrets are added in cluster

When I try to add secrets in the cluster config, Databricks shows a warning that secrets should not be hardcoded. Although it does not stop me from saving them, it shows the warning even though they are not hardcoded.

Latest Reply
chexa_Wee
New Contributor III
  • 1 kudos

Hi Nik21, you can use Azure Key Vault to store your secrets and then import them into Databricks.

1 More Replies
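A small sketch of the pattern chexa_Wee suggests, assuming a Key Vault-backed secret scope named kv-scope with a key storage-key (both names illustrative): the cluster's Spark config references the secret via the {{secrets/...}} placeholder instead of a literal value, and notebook code reads it through the secrets utility.

```python
# In the cluster's Spark config you would reference the secret, e.g.:
#   spark.hadoop.fs.azure.account.key.mystorage.dfs.core.windows.net {{secrets/kv-scope/storage-key}}
# In a notebook, read it through dbutils.secrets; the value is redacted in output.
storage_key = dbutils.secrets.get(scope="kv-scope", key="storage-key")
print(storage_key)   # prints [REDACTED]
```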
chexa_Wee
by New Contributor III
  • 2292 Views
  • 7 replies
  • 2 kudos

Resolved! How to manage two separate projects ?

Hi all, I am managing one project in Databricks, with one more coming soon. Can anyone guide me on how to use Unity Catalog or any other method for this?

Latest Reply
mnorland
Contributor II
  • 2 kudos

There is a wide variety of needs to consider, such as governance, compute, and storage; the right approach depends on the size of your projects.

6 More Replies
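As one simplified illustration of the governance angle (catalog and group names are made up): giving each project its own Unity Catalog catalog keeps grants, storage, and lineage cleanly separated.

```python
# One catalog per project, with access granted per project team.
spark.sql("CREATE CATALOG IF NOT EXISTS project_a")
spark.sql("CREATE CATALOG IF NOT EXISTS project_b")
spark.sql("CREATE SCHEMA IF NOT EXISTS project_a.bronze")
spark.sql("GRANT USE CATALOG, SELECT ON CATALOG project_a TO `project-a-team`")
```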
Mulder81
by New Contributor II
  • 1334 Views
  • 3 replies
  • 2 kudos

PDF Generation via databricks Job

We have a Databricks job that aggregates some data and creates some data tables. This needs to be exported in PDF format. I have seen a few Python libraries that can generate PDFs, but was wondering if the PDF can be generated and dropped in a...

Latest Reply
Mulder81
New Contributor II
  • 2 kudos

Are there any specific ways to generate the PDF file from a dataframe, and libraries that work?

2 More Replies
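One possible answer to this follow-up, sketched with the reportlab library (not preinstalled on DBR, so it would need %pip install reportlab; the source table and volume path are hypothetical): collect a small dataframe to the driver and render it as a PDF table.

```python
from reportlab.lib.pagesizes import A4
from reportlab.platypus import SimpleDocTemplate, Table

df = spark.table("main.reporting.daily_summary").limit(100)   # keep the collect small
data = [df.columns] + [[str(v) for v in row] for row in df.collect()]

doc = SimpleDocTemplate("/Volumes/main/reporting/exports/summary.pdf", pagesize=A4)
doc.build([Table(data)])   # the job can then share or mail the file from the volume
```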
PraveenReddy21
by New Contributor III
  • 1244 Views
  • 3 replies
  • 0 kudos

how to create catalog

Hi, I am trying to create a catalog and database, but Databricks is not allowing it; please suggest. Here is my code:
base_dir = "/mnt/files"
spark.sql(f"CREATE CATALOG IF NOT EXISTS dev")
spark.sql(f"CREATE DATABASE IF NOT EXISTS dev.demo_db")
first i ne...

Latest Reply
JairoCollante
New Contributor II
  • 0 kudos

I got a similar error trying to create a catalog with the "databricks.sdk" library. I resolved it by adding the "storage_root" parameter: w.catalogs.create(name=c.name, storage_root='s3://databricks-workspace-bucket/unity-catalog/426335709'). In my case all catalog...

2 More Replies
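For later readers, a fuller sketch of JairoCollante's fix (the bucket path is illustrative): when the metastore was created without a default storage root, a new catalog needs an explicit one, either through the SDK's storage_root parameter or the SQL MANAGED LOCATION clause.

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
w.catalogs.create(name="dev",
                  storage_root="s3://my-workspace-bucket/unity-catalog/dev")

# Equivalent SQL form:
spark.sql("CREATE CATALOG IF NOT EXISTS dev "
          "MANAGED LOCATION 's3://my-workspace-bucket/unity-catalog/dev'")
spark.sql("CREATE SCHEMA IF NOT EXISTS dev.demo_db")
```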
MadhuB
by Valued Contributor
  • 1256 Views
  • 1 reply
  • 0 kudos

Resolved! Installing Maven (3rd party) libraries on Job Cluster

I'm trying to install Maven libraries on the job cluster (non-interactive cluster) as part of a Databricks workflow. I've added the context in the cluster configuration as part of deployment, but I can't find the same in the post-deployment configurati...

Latest Reply
MadhuB
Valued Contributor
  • 0 kudos

I found a workaround. Below are the steps:
1. Add the required library to the allowed list at the workspace level (requires workspace/metastore admin access); you will need the coordinates groupId:artifactId:version.
2. At the task level, include under De...

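A sketch of step 2 via the databricks-sdk Python client instead of the UI (job name, notebook path, cluster spec, and coordinates are placeholders): the Maven coordinates go onto the task as a library, and the job cluster installs it at start-up, provided step 1 put those coordinates on the workspace allowlist.

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs
from databricks.sdk.service.compute import ClusterSpec, Library, MavenLibrary

w = WorkspaceClient()
w.jobs.create(
    name="etl-with-maven-lib",
    tasks=[jobs.Task(
        task_key="main",
        notebook_task=jobs.NotebookTask(notebook_path="/Workspace/etl/main"),
        new_cluster=ClusterSpec(spark_version="14.3.x-scala2.12",
                                node_type_id="Standard_DS3_v2",
                                num_workers=2),
        # Dependent library: installed on the job cluster before the task runs.
        libraries=[Library(maven=MavenLibrary(coordinates="com.example:my-connector:1.2.3"))],
    )],
)
```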
Pu_123
by New Contributor
  • 504 Views
  • 1 reply
  • 0 kudos

Cluster configuration

Please help me configure/choose the cluster configuration. I need to process and merge 6 million records into Azure SQL DB. At the end of the week, 9 billion records need to be processed and merged into Azure SQL DB, and a few transformations need to...

  • 504 Views
  • 1 replies
  • 0 kudos
Latest Reply
Shua42
Databricks Employee
  • 0 kudos

It will depend on the transformations and how you're loading them. Assuming it's mostly in Spark, I recommend starting small, using a job compute cluster with autoscaling enabled for cost efficiency. For daily loads (6 million records), a driver and 2...

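A hedged sketch of the write path this sizing advice assumes (server, secret scope, and table names are placeholders): for Azure SQL DB, JDBC batch size and write parallelism usually matter as much as cluster size, and the MERGE itself typically runs inside Azure SQL against a staging table the job appends to.

```python
df = spark.table("bronze.weekly_records")            # hypothetical source

(df.repartition(8)                                   # one JDBC connection per partition
   .write.format("jdbc")
   .option("url", "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb")
   .option("dbtable", "dbo.staging_records")         # land in staging, MERGE server-side
   .option("user", dbutils.secrets.get("kv-scope", "sql-user"))
   .option("password", dbutils.secrets.get("kv-scope", "sql-password"))
   .option("batchsize", "10000")                     # rows per JDBC batch insert
   .mode("append")
   .save())
```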
