Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

MadhuB
by Valued Contributor
  • 2409 Views
  • 1 replies
  • 0 kudos

Resolved! Installing Maven (3rd party) libraries on Job Cluster

I'm trying to install Maven libraries on a job cluster (non-interactive cluster) as part of a Databricks workflow. I've added the context in the cluster configuration as part of deployment, but I can't find the same in the post deployment configurati...

MadhuB_0-1742919949369.png
Latest Reply
MadhuB
Valued Contributor
  • 0 kudos

I found the workaround. Below are the steps:
1. Add the required library to the Allowed list at the workspace level (requires workspace/metastore admin access); you might need the coordinates groupId:artifactId:version
2. At the task level, include under De...
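The reply above is cut off, so the following is only a minimal sketch of one way to attach a Maven coordinate to a task: through the `libraries` field of a Jobs API task definition. The task key, notebook path, and coordinates are hypothetical placeholders, not values from the thread.

```python
import json


def task_with_maven_library(task_key: str, notebook_path: str, coordinates: str) -> dict:
    """Build a Jobs API task entry that requests one Maven library."""
    # Unpacking raises ValueError unless the string has exactly three parts
    # in the groupId:artifactId:version form mentioned in the reply.
    group_id, artifact_id, version = coordinates.split(":")
    return {
        "task_key": task_key,
        "notebook_task": {"notebook_path": notebook_path},
        "libraries": [
            {"maven": {"coordinates": f"{group_id}:{artifact_id}:{version}"}}
        ],
    }


# Hypothetical values, for illustration only.
task = task_with_maven_library(
    "ingest", "/Workspace/Shared/ingest", "com.example:demo-lib:1.0.0"
)
print(json.dumps(task, indent=2))
```

The library still has to be on the workspace allowlist (step 1) before a task defined this way can install it.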

Pu_123
by New Contributor
  • 1305 Views
  • 1 replies
  • 0 kudos

Cluster configuration

Please help me configure/choose the cluster configuration. I need to process and merge 6 million records into Azure SQL DB. At the end of the week, 9 billion records need to be processed and merged into Azure SQL DB, and a few transformations need to...

Latest Reply
Shua42
Databricks Employee
  • 0 kudos

It will depend on the transformations and how you're loading them. Assuming it's mostly in Spark, I recommend starting small using a job compute cluster with autoscaling enabled for cost efficiency. For daily loads (6 million records), a driver and 2...
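As a rough sketch of what "a job compute cluster with autoscaling enabled" can look like, the snippet below builds a `new_cluster` block of the kind the Jobs API accepts. The worker counts, node type, and runtime version are illustrative placeholders; the reply is truncated, so none of these numbers should be read as its actual sizing advice.

```python
def autoscaling_job_cluster(
    min_workers: int, max_workers: int, node_type_id: str, spark_version: str
) -> dict:
    """Build a job cluster spec that scales between min_workers and max_workers."""
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
    }


# Placeholder sizing: start small and let autoscaling absorb heavier runs.
daily_cluster = autoscaling_job_cluster(
    min_workers=1,
    max_workers=4,
    node_type_id="Standard_DS3_v2",    # assumed Azure node type
    spark_version="15.4.x-scala2.12",  # assumed runtime version string
)
print(daily_cluster)
```

A larger `max_workers` for the weekly load can be tried in a separate job definition, so the daily job keeps its smaller ceiling.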

walgt
by Databricks Partner
  • 1473 Views
  • 1 replies
  • 1 kudos

Resolved! Permission Issue in Delta Lake Course

Hi everyone, I'm new to Databricks and working on the "Data Ingestion with Delta Lake" course. I encountered a permission error with the following query: Can anyone help with this? Thanks!

walgt_0-1742915294633.png
Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @walgt! Apologies for the inconvenience. This was a known issue, but it has now been fixed! You should now be able to run your query without any problems. Thanks for your patience!

kp12
by New Contributor II
  • 9004 Views
  • 1 replies
  • 0 kudos

Connecting to Azure PostgreSQL from Azure Databricks

Hello, In Databricks there are two ways to connect to PostgreSQL: using the JDBC driver or the named connector, as mentioned in the documentation: https://learn.microsoft.com/en-us/azure/databricks/external-data/postgresql For JDBC, the driver needs to be ...

Latest Reply
sharukh_lodhi
New Contributor III
  • 0 kudos

Hi Kp12, I just wanted to check whether you found the answer or not. I also want to know the difference because the named connector "PostgreSQL" is overwhelming the CPU of PostgreSQL while inserting 41M rows.

Brianben
by New Contributor III
  • 1966 Views
  • 1 replies
  • 0 kudos

Choice of SQL Warehouse

Hi community, I am studying the documentation about the different kinds of SQL warehouse (https://docs.databricks.com/aws/en/compute/sql-warehouse/warehouse-types#:~:text=A%20classic%20SQL%20warehouse%20supports,than%20in%20your%20Databricks%20account.) I s...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @Brianben! Classic SQL warehouses are better for cost-sensitive 24/7 workloads, stable query patterns, and older workflows that depend on traditional data warehouse setups or external Hive metastores. They also allow some manual configuration, ...

sergecom
by New Contributor III
  • 5018 Views
  • 9 replies
  • 4 kudos

Databricks SQL Exists does not work correct?

Can someone explain to me how this is possible?  

Sergecom_0-1742460269477.png Sergecom_1-1742460306964.png
Latest Reply
sergecom
New Contributor III
  • 4 kudos

@Shua42 I've realized that I didn’t mention the subquery issue in my first post, so I guess this can be handled as a separate ticket.

8 More Replies
T_I
by New Contributor II
  • 3860 Views
  • 5 replies
  • 0 kudos

Connect Databricks to Airflow

Hi, I have Databricks on top of AWS. I have a Databricks connection on Airflow (MWAA). I am able to connect and execute a Databricks job via Airflow using a personal access token. I believe the best practice is to connect using a service principal. I und...

Latest Reply
Sloka
New Contributor II
  • 0 kudos

https://airflow.apache.org/docs/apache-airflow-providers-databricks/6.9.0/connections/databricks.html

4 More Replies
Mailendiran
by New Contributor III
  • 7244 Views
  • 3 replies
  • 0 kudos

Unity Catalog - Storage Account Data Access

I was exploring the Unity Catalog option on a Databricks premium workspace. I understood that I need to create storage account credentials and an external connection in the workspace. Later, I can access the cloud data using 'abfss://storage_account_details'. I ...

Latest Reply
DouglasMoore
Databricks Employee
  • 0 kudos

Databricks' strategic direction is to deprecate mount points in favor of Unity Catalog Volumes. Set up a STORAGE CREDENTIAL and an EXTERNAL LOCATION to define how to get to your cloud storage account. To access data on the account, define a Tab...

2 More Replies
ChrisLawford
by New Contributor II
  • 6126 Views
  • 4 replies
  • 2 kudos

PyTest working in Repos but not in Databricks Asset Bundles

Hello, I am trying to run PyTest from a notebook or Python file that exists due to being deployed by a Databricks Asset Bundle (DAB). I have a repository that contains a number of files with the end goal of trying to run PyTest in a directory to valida...

Latest Reply
cinyoung
Databricks Employee
  • 2 kudos

@ChrisLawford You can run pytest through a job:
databricks bundle run -t dev pytest_job
I was able to work around it this way.
resource/pytest.job.yml
resources:
  jobs:
    pytest_job:
      name: pytest_job
      tasks:
        - task_key: pytest_task
          ...

3 More Replies
Aquib
by Databricks Partner
  • 4167 Views
  • 2 replies
  • 0 kudos

How to migrate DBFS from one tenant to another tenant

I am working on a Databricks workspace migration, where I need to copy the Databricks workspace, including DBFS, from source to target (source and target are in different subscriptions/accounts). Can someone suggest what could be the approach to migrate D...

Latest Reply
arjunappani
New Contributor II
  • 0 kudos

Hi @jose_gonzalez, how can we migrate the data from the managed storage account in Azure Databricks to a new tenant?

1 More Replies
hdu
by New Contributor III
  • 1451 Views
  • 1 replies
  • 1 kudos

Resolved! Change cluster owner API call failed

I am trying to change a cluster's owner using an API call, but I get the following error. I am positive that host, cluster_id and owner_username are all correct. The error message says No API found. Is this related to the compute I am using, or something else...

hdu_0-1742837197352.png
Latest Reply
Brahmareddy
Esteemed Contributor II
  • 1 kudos

Hi hdu, How are you doing today? As per my understanding, it sounds like you’re really close! That “No API found” error usually means either the wrong API endpoint is being used, or the cluster type doesn’t support ownership changes, for example, shar...
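To make the "wrong API endpoint" possibility concrete, here is a small sketch that only assembles the request and does not send it. It assumes the change-owner call is `POST /api/2.1/clusters/change-owner` with `cluster_id` and `owner_username` in the JSON body; that path should be checked against the current REST API reference. The host, cluster ID, and user name are made up.

```python
def change_owner_request(host: str, cluster_id: str, owner_username: str) -> tuple[str, dict]:
    """Return the URL and JSON body for a change-owner call (nothing is sent)."""
    # Assumed path; a different version segment or a typo here is one way
    # to end up with an "endpoint not found" style error.
    url = f"{host.rstrip('/')}/api/2.1/clusters/change-owner"
    payload = {"cluster_id": cluster_id, "owner_username": owner_username}
    return url, payload


# Hypothetical workspace host and identifiers.
url, payload = change_owner_request(
    "https://adb-1234567890123456.7.azuredatabricks.net/",
    "0101-123456-abcdefgh",
    "new.owner@example.com",
)
print(url)
print(payload)
```

Printing the assembled URL before sending makes it easy to compare against the documented path character by character.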

Shivap
by New Contributor III
  • 2731 Views
  • 4 replies
  • 3 kudos

What's the recommended way of creating tables in Databricks with unity catalog (External/Managed)

I have Databricks with Unity Catalog enabled and created an external ADLS location. When I create the catalog/schema, it uses the external location. When I try to create the table, it uses the external location, but they are managed tables. What's the r...

Latest Reply
Brahmareddy
Esteemed Contributor II
  • 3 kudos

Hi Shivap, How are you doing today? As per my understanding, in Unity Catalog, if you want to create an external table, you just need to make sure the external location is registered and approved first. Even though you're specifying a path with LOCAT...

3 More Replies
bidek56
by Contributor
  • 3898 Views
  • 8 replies
  • 2 kudos

Resolved! When will DB release runtime with Scala 2.13

When will DB release runtime with Scala 2.13? Thx

Latest Reply
JoseSoto
New Contributor III
  • 2 kudos

Spark 4 is coming and it's only going to support Scala 2.13, so a Databricks Runtime with Spark 3.5.x and Scala 2.13 should be released soonish.

7 More Replies
samye760
by New Contributor II
  • 3884 Views
  • 1 replies
  • 1 kudos

Job Retry Wait Policy and Cluster Shutdown

Hi all, I have a Databricks Workflow job in which the final task makes an external API call. Sometimes this API will be overloaded and the call will fail. In the spirit of automation, I want this task to retry the call an hour later if it fails in the...
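A minimal sketch of the retry setup described above, assuming the task is defined through the Jobs API, where `max_retries`, `min_retry_interval_millis`, and `retry_on_timeout` are task-level settings. The task key, notebook path, and retry count are hypothetical; the post is truncated, so this does not address whatever follow-up problem it goes on to describe.

```python
ONE_HOUR_MS = 60 * 60 * 1000  # one hour expressed in milliseconds


def with_retry_policy(task: dict, max_retries: int, wait_ms: int = ONE_HOUR_MS) -> dict:
    """Return a copy of a task definition with a retry policy attached."""
    if max_retries < 0 or wait_ms < 0:
        raise ValueError("max_retries and wait_ms must be non-negative")
    return {
        **task,
        "max_retries": max_retries,
        "min_retry_interval_millis": wait_ms,
        "retry_on_timeout": False,
    }


# Hypothetical final task that calls the external API.
final_task = with_retry_policy(
    {"task_key": "call_external_api",
     "notebook_task": {"notebook_path": "/Workspace/Shared/call_api"}},
    max_retries=3,
)
print(final_task["min_retry_interval_millis"])
```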

Data Engineering
clusters
jobs
retries
Workflows
Latest Reply
rmartinezdezaya
New Contributor II
  • 1 kudos

What about this? Any reply? Any alternative? I'm facing the same issue.

Jennifer
by New Contributor III
  • 3228 Views
  • 6 replies
  • 0 kudos

Can external tables be created backed by current cloud files without ingesting files in Databricks?

Hi, We have a huge amount of Parquet files in S3 with the path pattern <bucket>/<customer>/yyyy/mm/dd/hh/.*.parquet. The question is: can I create an external table in Unity Catalog from this external location without actually ingesting the files? Like wha...

Latest Reply
Data_Mavericks
New Contributor III
  • 0 kudos

I think the issue is that you are trying to create a Delta table in Unity Catalog from a Parquet source without converting it to Delta format first, as Unity Catalog will not allow a Delta table to be created in a non-empty location. Since you want t...

5 More Replies