Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

chexa_Wee
by New Contributor III
  • 3323 Views
  • 7 replies
  • 2 kudos

Resolved! How to manage two separate projects ?

Hi all, I am managing one project in Databricks, with one more coming soon. Can anyone guide me on how to use Unity Catalog or any other method for this?

Latest Reply
mnorland
Valued Contributor II
  • 2 kudos

A wide variety of needs must be considered, such as governance, compute, and storage; the right approach depends on the size of your projects.

6 More Replies
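A common pattern for the isolation mnorland describes is one catalog per project, with access scoped per team. A minimal sketch, assuming hypothetical catalog and group names:

```sql
-- Hypothetical names: one catalog per project keeps governance,
-- storage, and permission boundaries separable.
CREATE CATALOG IF NOT EXISTS project_a;
CREATE CATALOG IF NOT EXISTS project_b;

-- Scope access per team (the group must already exist in the workspace).
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG project_a TO `project_a_team`;
```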
Mulder81
by New Contributor II
  • 2689 Views
  • 3 replies
  • 2 kudos

PDF Generation via Databricks Job

We have a Databricks job that aggregates some data and creates some data tables. This needs to be exported in PDF format. I have seen a few Python libraries that can generate PDFs, but was wondering if the PDF can be generated and dropped in a...

Latest Reply
Mulder81
New Contributor II
  • 2 kudos

Are there any specific ways to generate a PDF file from a DataFrame, and which libraries work?

2 More Replies
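On libraries that work: reportlab and fpdf2 are common choices for rendering tabular data to PDF (convert the Spark DataFrame with `.toPandas()` first). As a dependency-free illustration of the mechanics only, a tiny one-page PDF can even be assembled by hand and written to any path the job can reach, such as a Unity Catalog volume; the function, rows, and output path below are all illustrative:

```python
def rows_to_pdf(rows, path):
    """Write an (unstyled, single-page) PDF listing one row of text per line."""
    lines, y = [], 780
    for row in rows:
        text = " | ".join(str(c) for c in row)
        # Escape characters that are special inside PDF string literals.
        text = text.replace("\\", r"\\").replace("(", r"\(").replace(")", r"\)")
        lines.append(f"BT /F1 10 Tf 40 {y} Td ({text}) Tj ET")
        y -= 14
    stream = "\n".join(lines).encode("latin-1", "replace")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out, offsets = bytearray(b"%PDF-1.4\n"), []
    for i, body in enumerate(objects, start=1):
        offsets.append(len(out))
        out += b"%d 0 obj\n" % i + body + b"\nendobj\n"
    xref = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += (b"trailer\n<< /Size %d /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n"
            % (len(objects) + 1, xref))
    with open(path, "wb") as f:
        f.write(bytes(out))

rows = [("region", "revenue"), ("EMEA", 1200), ("APAC", 950)]
rows_to_pdf(rows, "/tmp/report.pdf")
```

In a real job you would point `path` at a volume (e.g. `/Volumes/<catalog>/<schema>/<volume>/report.pdf`) and use a maintained PDF library rather than hand-built objects.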
PraveenReddy21
by New Contributor III
  • 2238 Views
  • 3 replies
  • 0 kudos

How to create a catalog

Hi, I am trying to create a catalog and database but Databricks is not allowing it, please suggest. Here is my code: base_dir = "/mnt/files"; spark.sql(f"CREATE CATALOG IF NOT EXISTS dev"); spark.sql(f"CREATE DATABASE IF NOT EXISTS dev.demo_db"). First I ne...

Latest Reply
JairoCollante
New Contributor II
  • 0 kudos

I got a similar error trying to create a catalog with the "databricks.sdk" library. I resolved it by adding the "storage_root" parameter: w.catalogs.create(name=c.name, storage_root='s3://databricks-workspace-bucket/unity-catalog/426335709'). In my case all catalog...

2 More Replies
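In SQL, the equivalent of the SDK's storage_root is the MANAGED LOCATION clause, which is required when the metastore has no default storage root; the bucket path below is taken from the reply and otherwise illustrative:

```sql
CREATE CATALOG IF NOT EXISTS dev
  MANAGED LOCATION 's3://databricks-workspace-bucket/unity-catalog/426335709';
CREATE SCHEMA IF NOT EXISTS dev.demo_db;
```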
MadhuB
by Valued Contributor
  • 1919 Views
  • 1 reply
  • 0 kudos

Resolved! Installing Maven (3rd party) libraries on Job Cluster

I'm trying to install Maven libraries on the job cluster (non-interactive cluster) as part of a Databricks workflow. I've added the context in the cluster configuration as part of deployment, but I can't find the same in the post-deployment configurati...

Latest Reply
MadhuB
Valued Contributor
  • 0 kudos

I found the workaround. Below are the steps: 1. Add the required library to the allowed list at the workspace level (requires workspace/metastore admin access); you might need the coordinates groupId:artifactId:version. 2. At the task level, include under De...

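The task-level piece of that workaround looks roughly like this in a job spec; the keys and coordinates are illustrative, and the library must already be on the workspace allowlist:

```yaml
tasks:
  - task_key: my_task
    job_cluster_key: job_cluster
    notebook_task:
      notebook_path: /Workspace/project/my_notebook
    libraries:
      - maven:
          coordinates: "com.databricks:spark-xml_2.12:0.18.0"
```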
Pu_123
by New Contributor
  • 925 Views
  • 1 reply
  • 0 kudos

Cluster configuration

Please help me configure/choose the cluster configuration. I need to process and merge 6 million records into Azure SQL DB. At the end of the week, 9 billion records need to be processed and merged into Azure SQL DB, and a few transformations need to...

Latest Reply
Shua42
Databricks Employee
  • 0 kudos

It will depend on the transformations and how you're loading them. Assuming it's mostly in Spark, I recommend starting small, using a job compute cluster with autoscaling enabled for cost efficiency. For daily loads (6 million records), a driver and 2...

walgt
by New Contributor II
  • 1206 Views
  • 1 reply
  • 1 kudos

Resolved! Permission Issue in Delta Lake Course

Hi everyone, I'm new to Databricks and working on the "Data Ingestion with Delta Lake" course. I encountered a permission error with the following query. Can anyone help with this? Thanks!

Latest Reply
Advika
Community Manager
  • 1 kudos

Hello @walgt! Apologies for the inconvenience. This was a known issue, but it has now been fixed! You should now be able to run your query without any problems. Thanks for your patience!

kp12
by New Contributor II
  • 8242 Views
  • 1 reply
  • 0 kudos

Connecting to Azure PostgreSQL from Azure Databricks

Hello, in Databricks there are two ways to connect to PostgreSQL: using the JDBC driver or the named connector, as mentioned in the document - https://learn.microsoft.com/en-us/azure/databricks/external-data/postgresql. For JDBC, the driver needs to be ...

Latest Reply
sharukh_lodhi
New Contributor III
  • 0 kudos

Hi kp12, I just wanted to check whether you found the answer or not. I also want to know the difference, because the named connector "PostgreSQL" is overwhelming the CPU of PostgreSQL while inserting 41M rows.

Brianben
by New Contributor III
  • 1169 Views
  • 1 reply
  • 0 kudos

Choice of SQL Warehouse

Hi community, I am studying the documentation about the different kinds of SQL warehouses (https://docs.databricks.com/aws/en/compute/sql-warehouse/warehouse-types). I s...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @Brianben! Classic SQL warehouses are better for cost-sensitive 24/7 workloads, stable query patterns, and older workflows that depend on traditional data warehouse setups or external Hive metastores. They also allow some manual configuration, ...

Sergecom
by New Contributor III
  • 3507 Views
  • 9 replies
  • 4 kudos

Databricks SQL EXISTS does not work correctly?

Can someone explain to me how this is possible?  

Latest Reply
Sergecom
New Contributor III
  • 4 kudos

@Shua42 I've realized that I didn’t mention the subquery issue in my first post, so I guess this can be handled as a separate ticket.

8 More Replies
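The screenshots are not reproduced here, but a frequent source of surprise with EXISTS is correlation: an uncorrelated subquery that returns any row at all makes EXISTS true for every outer row. A sketch with hypothetical tables:

```sql
-- Correlated: true only for orders that have at least one item.
SELECT o.id FROM orders o
WHERE EXISTS (SELECT 1 FROM order_items i WHERE i.order_id = o.id);

-- Uncorrelated: if order_items is non-empty, every order passes.
SELECT o.id FROM orders o
WHERE EXISTS (SELECT 1 FROM order_items i);
```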
T_I
by New Contributor II
  • 2875 Views
  • 5 replies
  • 0 kudos

Connect Databricks to Airflow

Hi, I have Databricks on top of AWS. I have a Databricks connection in Airflow (MWAA). I am able to connect and execute a Databricks job via Airflow using a personal access token. I believe the best practice is to connect using a service principal. I und...

Latest Reply
Sloka
New Contributor II
  • 0 kudos

https://airflow.apache.org/docs/apache-airflow-providers-databricks/6.9.0/connections/databricks.html

4 More Replies
Mailendiran
by New Contributor III
  • 6374 Views
  • 3 replies
  • 0 kudos

Unity Catalog - Storage Account Data Access

I was exploring the Unity Catalog option on a Databricks premium workspace. I understood that I need to create storage account credentials and an external connection in the workspace. Later, I can access the cloud data using 'abfss://storage_account_details'. I ...

Latest Reply
DouglasMoore
Databricks Employee
  • 0 kudos

Databricks' strategic direction is to deprecate mount points in favor of Unity Catalog volumes. Set up a STORAGE CREDENTIAL and EXTERNAL LOCATION to define how to get to your cloud storage account. To access data on the account, define a Tab...

2 More Replies
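The setup DouglasMoore describes can be sketched in SQL, assuming a storage credential named my_cred already exists and all other names are illustrative:

```sql
CREATE EXTERNAL LOCATION IF NOT EXISTS project_loc
  URL 'abfss://container@account.dfs.core.windows.net/data'
  WITH (STORAGE CREDENTIAL my_cred);

-- An external table whose path sits under that location:
CREATE TABLE main.default.events
  LOCATION 'abfss://container@account.dfs.core.windows.net/data/events';
```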
ChrisLawford
by New Contributor II
  • 5145 Views
  • 4 replies
  • 2 kudos

PyTest working in Repos but not in Databricks Asset Bundles

Hello, I am trying to run PyTest from a notebook or Python file that exists due to being deployed by a Databricks Asset Bundle (DAB). I have a repository that contains a number of files, with the end goal of trying to run PyTest in a directory to valida...

Latest Reply
cinyoung
Databricks Employee
  • 2 kudos

@ChrisLawford You can run pytest through a job: databricks bundle run -t dev pytest_job. I was able to work around it this way. resource/pytest.job.yml: resources: jobs: pytest_job: name: pytest_job tasks: - task_key: pytest_task ...

3 More Replies
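The truncated bundle resource above can be sketched along these lines; the paths, keys, and runner script are illustrative, assuming a small script that calls pytest.main() and exits non-zero on failures:

```yaml
# resources/pytest.job.yml - hypothetical sketch
resources:
  jobs:
    pytest_job:
      name: pytest_job
      tasks:
        - task_key: pytest_task
          job_cluster_key: test_cluster
          spark_python_task:
            python_file: ../tests/run_pytest.py
```

Deployed, it is triggered exactly as in the reply: databricks bundle run -t dev pytest_job.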
Aquib
by New Contributor
  • 3868 Views
  • 2 replies
  • 0 kudos

How to migrate DBFS from one tenant to another tenant

I am working on a Databricks workspace migration, where I need to copy the Databricks workspace including DBFS from source to target (the source and target are in different subscriptions/accounts). Can someone suggest what the approach could be to migrate D...

Latest Reply
arjunappani
New Contributor II
  • 0 kudos

Hi @jose_gonzalez, how can we migrate the data from the managed storage account of Azure Databricks to a new tenant?

1 More Replies
hdu
by New Contributor II
  • 1023 Views
  • 1 reply
  • 1 kudos

Resolved! Change cluster owner API call failed

I am trying to change a cluster's owner using an API call, but get the following error. I am positive that host, cluster_id and owner_username are all correct. The error message says "No API found". Is this related to the compute I am using? Or something else...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi hdu, how are you doing today? As per my understanding, it sounds like you're really close! That "No API found" error usually means either the wrong API endpoint is being used, or the cluster type doesn't support ownership changes, for example, shar...

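For reference, the Clusters API documents a change-owner endpoint (POST /api/2.1/clusters/change-owner; verify the version against your workspace). A quick way to rule out a malformed call is to build the request and inspect it before sending; the host, token, and IDs below are placeholders:

```python
import json
import urllib.request

# Placeholder workspace host and identifiers - substitute your own.
host = "https://adb-1234567890123456.7.azuredatabricks.net"
payload = {
    "cluster_id": "0101-120000-abcd1234",
    "owner_username": "new.owner@example.com",
}

# Build (but don't send) the request so the URL, method, and body
# can be checked; a wrong path is what typically yields "No API found".
req = urllib.request.Request(
    url=f"{host}/api/2.1/clusters/change-owner",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Authorization": "Bearer <token>",
             "Content-Type": "application/json"},
    method="POST",
)
print(req.get_method(), req.full_url)
```

Sending it is then one `urllib.request.urlopen(req)` call with a real token; note that shared/serverless compute may still reject ownership changes regardless of the request shape.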
Shivap
by New Contributor III
  • 1852 Views
  • 4 replies
  • 3 kudos

What's the recommended way of creating tables in Databricks with Unity Catalog (external/managed)?

I have Databricks with Unity Catalog enabled and created an external ADLS location. When I create the catalog/schema it uses the external location. When I try to create a table it uses the external location, but they are managed tables. What's the r...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 3 kudos

Hi Shivap, how are you doing today? As per my understanding, in Unity Catalog, if you want to create an external table, you just need to make sure the external location is registered and approved first. Even though you're specifying a path with LOCAT...

3 More Replies
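The distinction Brahmareddy describes comes down to the LOCATION clause; a sketch with hypothetical names, assuming the path sits under a registered external location:

```sql
-- External table: you manage the files at the given path.
CREATE TABLE dev.demo.ext_events (id INT, ts TIMESTAMP)
  LOCATION 'abfss://container@account.dfs.core.windows.net/events';

-- Managed table: omit LOCATION and Unity Catalog manages the storage.
CREATE TABLE dev.demo.managed_events (id INT, ts TIMESTAMP);
```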