Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

johschmidt42
by New Contributor II
  • 2597 Views
  • 2 replies
  • 0 kudos

Autoloader cloudFiles.maxFilesPerTrigger ignored with .trigger(availableNow=True)?

Hi, I'm using the Auto Loader feature to read streaming data from Delta Lake files and process them in a batch. The trigger is set to availableNow to include all new data from the checkpoint offset, but I limit the number of Delta files for the batch ...

Latest Reply
p_romm
New Contributor III
  • 0 kudos

In the docs, the option is "cloudFiles.maxFilesPerTrigger": https://docs.databricks.com/aws/en/ingestion/cloud-object-storage/auto-loader/options
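For anyone landing here, a minimal sketch of where that option would go in an Auto Loader stream. The file format ("parquet"), the limit of 100 files, and all paths and table names passed to `start_available_now` are hypothetical placeholders, and schema handling (an explicit schema or `cloudFiles.schemaLocation`) is left out. Only the option names and the `availableNow` trigger come from the thread and the linked options page.

```python
def autoloader_options(file_format, max_files_per_trigger):
    """Build Auto Loader reader options; Spark expects option values as strings."""
    if max_files_per_trigger < 1:
        raise ValueError("max_files_per_trigger must be >= 1")
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.maxFilesPerTrigger": str(max_files_per_trigger),
    }


def start_available_now(spark, source_path, checkpoint_path, target_table, options):
    """Start a stream that processes the current backlog and then stops.

    Needs a Databricks runtime with Auto Loader; it is not called in this sketch.
    """
    reader = spark.readStream.format("cloudFiles").options(**options)
    return (
        reader.load(source_path)
        .writeStream.option("checkpointLocation", checkpoint_path)
        .trigger(availableNow=True)
        .toTable(target_table)
    )


# Hypothetical values, for illustration only.
opts = autoloader_options("parquet", 100)
print(opts)
```

Keeping the options in a plain dict makes it easy to log exactly what was passed to the reader when checking whether the limit is being applied.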

1 More Replies
verargulla
by New Contributor III
  • 16210 Views
  • 5 replies
  • 4 kudos

Azure Databricks: Error Creating Cluster

We have provisioned a new workspace in Azure using our own VNet. Upon creating the first cluster, I encounter this error: Control Plane Request Failure: Failed to get instance bootstrap steps from the Databricks Control Plane. Please check that instan...

Latest Reply
Mohamednazeer
New Contributor III
  • 4 kudos

We are also facing the same issue.

4 More Replies
ayushmangal72
by New Contributor III
  • 1371 Views
  • 2 replies
  • 1 kudos

Resolved! Revert cluster DBR version to last DBR

Hi Team, we updated our clusters' DBR version and later found that some of our jobs started failing. We now want to revert to the previous DBR version, but we forgot which DBR version the jobs were running fine on. Is there any way ...

Latest Reply
ayushmangal72
New Contributor III
  • 1 kudos

Thank you for your reply. I also found another solution: I checked the event logs, and the old DBR versions were mentioned there.
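A rough sketch of that event-log check, for anyone who has exported the cluster events as JSON. The event shape used here (a `details` object with `previous_attributes.spark_version`) is an assumption modelled on the cluster events API response, and the sample versions are made up, so check the keys against your own payload before relying on it.

```python
def previous_dbr_versions(events):
    """Return distinct earlier spark_version values, in the order encountered."""
    seen = []
    for event in events:
        details = event.get("details") or {}
        previous = details.get("previous_attributes") or {}
        version = previous.get("spark_version")
        if version and version not in seen:
            seen.append(version)
    return seen


# Made-up sample events; the field names are an assumption about the export format.
sample_events = [
    {
        "type": "EDITED",
        "details": {
            "previous_attributes": {"spark_version": "13.3.x-scala2.12"},
            "attributes": {"spark_version": "14.3.x-scala2.12"},
        },
    },
    {"type": "RUNNING", "details": {}},
]

print(previous_dbr_versions(sample_events))
```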

1 More Replies
Kayla
by Valued Contributor II
  • 5474 Views
  • 5 replies
  • 0 kudos

Errors When Using R on Unity Catalog Clusters

We are running into errors when running workflows with multiple jobs using the same notebook/different parameters. They are reading from tables we still have in hive_metastore; there are no Unity Catalog tables or functionality referenced anywhere. We'...

Latest Reply
Anwarchubb
New Contributor II
  • 0 kudos

An R-enabled cluster only supports a single user group, so please check the permissions at your group level.

4 More Replies
stef2
by New Contributor III
  • 13315 Views
  • 14 replies
  • 5 kudos

Resolved! 2023-03-22 10:29:23 | Error 403 | https://customer-academy.databricks.com/

I would like to know why I am getting this error when I try to earn badges for Lakehouse Fundamentals. I can't access the quiz page. Can you please help with this?

Latest Reply
dkn_data
New Contributor II
  • 5 kudos

Log in with your Gmail account at customer-academy.databricks.com, search for the Lakehouse short course, and enroll for free.

13 More Replies
ashap551
by New Contributor II
  • 3835 Views
  • 1 replies
  • 0 kudos

Streaming vs Batch with Continuous Trigger

I'm not sure what concrete advantage there is for me in creating a streaming table vs. a static one. In my case, I designed a table with a job that extracts the latest files from an S3 location and then appends them to a Delta table. I set the job...

Latest Reply
Ajay-Pandey
Databricks MVP
  • 0 kudos

@ashap551 You're essentially implementing a well-optimized micro-batching process, and functionally, it's very similar to what readStream() with Autoloader would do. However, there are some advantages to using Autoloader and a proper streaming table ...

Nik21
by New Contributor II
  • 1005 Views
  • 2 replies
  • 1 kudos

warning message when secrets are added in cluster

When I try to add secrets in the cluster config, Databricks shows an error saying that secrets should not be hardcoded. It does not block saving them, but the warning appears even when they are not hardcoded.

Latest Reply
chexa_Wee
New Contributor III
  • 1 kudos

Hi Nik21, you can use Azure Key Vault to store your secrets and then import them into Databricks.
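As a small illustration, once a secret scope exists (for example one backed by Azure Key Vault), a cluster's Spark config can point at a secret with the `{{secrets/<scope>/<key>}}` reference syntax instead of a literal value. The scope name, key name and Spark config key below are hypothetical, and whether this silences the warning from the original post is not something this sketch can confirm.

```python
def secret_ref(scope, key):
    """Build a cluster-config secret reference of the form {{secrets/scope/key}}."""
    for part in (scope, key):
        if not part or "/" in part:
            raise ValueError("scope and key must be non-empty and contain no '/'")
    return "{{secrets/" + scope + "/" + key + "}}"


# Hypothetical scope, key and config key, for illustration only.
spark_conf = {"spark.myapp.db.password": secret_ref("kv-scope", "db-password")}
print(spark_conf)
```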

1 More Replies
chexa_Wee
by New Contributor III
  • 3848 Views
  • 7 replies
  • 2 kudos

Resolved! How to manage two separate projects ?

Hi all, I am managing one project in Databricks, with one more coming soon. Can anyone guide me on how to use Unity Catalog or any other method for this?

Latest Reply
mnorland
Valued Contributor II
  • 2 kudos

There is a wide variety of needs to consider, such as governance, compute, and storage. This depends on the size of your projects.

6 More Replies
Mulder81
by New Contributor II
  • 3253 Views
  • 3 replies
  • 2 kudos

PDF Generation via databricks Job

We have a Databricks job that aggregates some data and creates some data tables. This needs to be exported in PDF format. I have seen a few Python libraries that can generate PDFs, but was wondering if the PDF can be generated and dropped in a...

Latest Reply
Mulder81
New Contributor II
  • 2 kudos

Are there any specific ways to generate the PDF file from a DataFrame, and libraries that work?

2 More Replies
PraveenReddy21
by New Contributor III
  • 2660 Views
  • 3 replies
  • 0 kudos

how to create catalog

Hi, I am trying to create a catalog and database, but Databricks is not allowing it. Please suggest. Here is my code:
base_dir = "/mnt/files"
spark.sql(f"CREATE CATALOG IF NOT EXISTS dev")
spark.sql(f"CREATE DATABASE IF NOT EXISTS dev.demo_db")
first i ne...

Latest Reply
JairoCollante
New Contributor II
  • 0 kudos

I got a similar error when trying to create a catalog with the "databricks.sdk" library. I resolved it by adding the "storage_root" parameter: w.catalogs.create(name=c.name, storage_root='s3://databricks-workspace-bucket/unity-catalog/426335709') In my case all catalog...
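The same idea can be sketched in SQL, since `CREATE CATALOG` accepts a `MANAGED LOCATION` clause. The helper below only builds the statement string; the bucket path is a hypothetical placeholder, and whether your metastore actually requires a location depends on how it was set up, which is an assumption here.

```python
def create_catalog_sql(name, managed_location=None):
    """Build a CREATE CATALOG statement, optionally with a managed location."""
    if not name.replace("_", "").isalnum():
        raise ValueError("catalog name should be alphanumeric or underscores")
    statement = f"CREATE CATALOG IF NOT EXISTS {name}"
    if managed_location:
        if "'" in managed_location:
            raise ValueError("location must not contain single quotes")
        statement += f" MANAGED LOCATION '{managed_location}'"
    return statement


# Hypothetical storage path, for illustration only.
sql = create_catalog_sql("dev", "s3://example-bucket/unity-catalog/dev")
print(sql)
# In a notebook this would be run with: spark.sql(sql)
```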

2 More Replies
MadhuB
by Valued Contributor
  • 2215 Views
  • 1 replies
  • 0 kudos

Resolved! Installing Maven (3rd party) libraries on Job Cluster

I'm trying to install Maven libraries on the job cluster (non-interactive cluster) as part of a Databricks workflow. I've added the context in the cluster configuration as part of deployment, but I can't find the same in the post-deployment configurati...

Latest Reply
MadhuB
Valued Contributor
  • 0 kudos

I found the workaround. Below are the steps:
1. Add the required library to the allowed list at the workspace level (requires workspace/metastore admin access); you might need the coordinates groupId:artifactId:version.
2. At the task level, include under De...
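As a rough illustration of attaching a Maven library at the task level: a Jobs API task definition accepts a `libraries` list whose entries can carry Maven coordinates. The task key, notebook path and coordinates below are hypothetical, and the sketch only builds the payload fragment rather than calling any API. It also accepts only plain three-part coordinates, which is a simplification.

```python
def maven_library(coordinates, repo=None):
    """Return one library entry for a Maven artifact (groupId:artifactId:version)."""
    if len(coordinates.split(":")) != 3:
        raise ValueError("expected groupId:artifactId:version")
    maven = {"coordinates": coordinates}
    if repo:
        maven["repo"] = repo  # optional custom Maven repository URL
    return {"maven": maven}


def task_with_libraries(task_key, notebook_path, libraries):
    """Return a task fragment with its dependent libraries attached."""
    return {
        "task_key": task_key,
        "notebook_task": {"notebook_path": notebook_path},
        "libraries": list(libraries),
    }


# Hypothetical task and artifact, for illustration only.
task = task_with_libraries(
    "ingest",
    "/Workspace/jobs/ingest",
    [maven_library("com.example:example-lib:1.0.0")],
)
print(task)
```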

Pu_123
by New Contributor
  • 1187 Views
  • 1 replies
  • 0 kudos

Cluster configuration

Please help me configure/choose the cluster configuration. I need to process and merge 6 million records into Azure SQL DB. At the end of the week, 9 billion records need to be processed and merged into Azure SQL DB, and a few transformations need to...

Latest Reply
Shua42
Databricks Employee
  • 0 kudos

It will depend on the transformations and how you're loading them. Assuming it's mostly in Spark, I recommend starting small using a job compute cluster with autoscaling enabled for cost efficiency. For daily loads (6 million records), a driver and 2...

walgt
by Databricks Partner
  • 1333 Views
  • 1 replies
  • 1 kudos

Resolved! Permission Issue in Delta Lake Course

Hi everyone, I'm new to Databricks and working on the "Data Ingestion with Delta Lake" course. I encountered a permission error with the following query: Can anyone help with this? Thanks!

walgt_0-1742915294633.png
Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @walgt! Apologies for the inconvenience. This was a known issue, but it has now been fixed! You should now be able to run your query without any problems. Thanks for your patience!

kp12
by New Contributor II
  • 8731 Views
  • 1 replies
  • 0 kudos

Connecting to Azure PostgreSQL from Azure Databricks

Hello, in Databricks there are 2 ways to connect to PostgreSQL, i.e., using the JDBC driver or the named connector, as mentioned in the document: https://learn.microsoft.com/en-us/azure/databricks/external-data/postgresql For JDBC, the driver needs to be ...

Latest Reply
sharukh_lodhi
New Contributor III
  • 0 kudos

Hi Kp12, I just wanted to check whether you found the answer. I also want to know the difference, because the named connector "PostgreSQL" is overwhelming the CPU of PostgreSQL while inserting 41M rows.
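For comparison, a minimal sketch of the plain JDBC route with the two write-side knobs one might experiment with when the target database is under pressure: `batchsize` (rows per JDBC batch) and `numPartitions` (upper bound on parallel connections). The host, database, table, user and the default values are hypothetical, and whether tuning them helps with the 41M-row case is an assumption, not something confirmed in this thread.

```python
def postgres_jdbc_options(host, database, table, user, password,
                          port=5432, batch_size=10000, num_partitions=8):
    """Build Spark JDBC writer options for a PostgreSQL target."""
    if batch_size < 1 or num_partitions < 1:
        raise ValueError("batch_size and num_partitions must be positive")
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
        # Rows sent per JDBC batch insert.
        "batchsize": str(batch_size),
        # Upper bound on parallel connections used for the write.
        "numPartitions": str(num_partitions),
    }


def append_to_postgres(df, options):
    """Append a DataFrame through the JDBC writer (needs a Spark runtime)."""
    df.write.format("jdbc").options(**options).mode("append").save()


# Hypothetical connection details; read the password from a secret scope in practice.
opts = postgres_jdbc_options(
    host="example.postgres.database.azure.com",
    database="appdb",
    table="public.events",
    user="loader",
    password="<from-a-secret-scope>",
)
print(opts["url"])
```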

Brianben
by New Contributor III
  • 1544 Views
  • 1 replies
  • 0 kudos

Choice of SQL Warehouse

Hi community, I am studying the documentation about the different kinds of SQL warehouses (https://docs.databricks.com/aws/en/compute/sql-warehouse/warehouse-types#:~:text=A%20classic%20SQL%20warehouse%20supports,than%20in%20your%20Databricks%20account.) I s...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @Brianben! Classic SQL warehouses are better for cost-sensitive 24/7 workloads, stable query patterns, and older workflows that depend on traditional data warehouse setups or external Hive metastores. They also allow some manual configuration, ...
