Data Engineering

Forum Posts

by SamAdams • New Contributor III

4 hours ago

35 Views
2 replies
1 kudos

Migrating source directory in an existing DLT Pipeline with Autoloader

I have a DLT pipeline that reads data in S3 into an append-only bronze layer using Autoloader. The data sink needs to be changed to a new S3 bucket in a new account, and the data in the existing S3 bucket migrated to the new one. Will Autoloader still be ...

Latest Reply

Brahmareddy
Honored Contributor II

an hour ago

1 kudos

Hi SamAdams, how are you doing today? As per my understanding, you're on the right track here! When you change the S3 path for Autoloader, even if the files are exactly the same and just copied from the old bucket, Autoloader will treat them as new f...
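The behaviour described in this reply can be pictured with a toy model. The sketch below is not Auto Loader's implementation; it only assumes, as the reply states, that already-ingested files are recognised by their full path, so an identical file copied to a different bucket looks new. The bucket names and the `discover_new_files` helper are hypothetical.

```python
def discover_new_files(listing, seen_paths):
    """Return the paths in `listing` not yet recorded in `seen_paths`, then record them."""
    new_files = [path for path in listing if path not in seen_paths]
    seen_paths.update(new_files)
    return new_files


# Hypothetical state: the set of paths already ingested by the pipeline.
seen = set()

old_listing = ["s3://old-bucket/events/a.json", "s3://old-bucket/events/b.json"]
first_run = discover_new_files(old_listing, seen)   # both files are picked up
second_run = discover_new_files(old_listing, seen)  # nothing new on a re-run

# Same files, copied unchanged to a new bucket: the paths differ.
new_listing = [p.replace("old-bucket", "new-bucket") for p in old_listing]
after_migration = discover_new_files(new_listing, seen)  # picked up again
```

Under this assumption, the copied files would be appended to the bronze table a second time unless they are excluded or deduplicated downstream.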

by Mulder81 • New Contributor II

7 hours ago

91 Views
3 replies
2 kudos

PDF Generation via databricks Job

We have a Databricks job that will aggregate some data and create some data tables. This needs to be exported out in a PDF format. I have seen a few Python libraries that can generate PDF, but was wondering if the PDF can be generated and dropped in a...

Latest Reply

Mulder81
New Contributor II

6 hours ago

2 kudos

Are there any specific ways to generate the PDF file from a DataFrame, and any libraries that work?

by PraveenReddy21 • New Contributor III

08-23-2024 4:33:12 AM

803 Views
3 replies
0 kudos

how to create catalog

Hi, I am trying to create a catalog and a database, but Databricks is not allowing it. Please suggest. Here is my code:

base_dir = "/mnt/files"
spark.sql(f"CREATE CATALOG IF NOT EXISTS dev")
spark.sql(f"CREATE DATABASE IF NOT EXISTS dev.demo_db")

first i ne...

Latest Reply

JairoCollante
Visitor

4 hours ago

0 kudos

I got a similar error trying to create a catalog with the "databricks.sdk" library. I resolved it by adding the "storage_root" parameter: w.catalogs.create(name=c.name, storage_root='s3://databricks-workspace-bucket/unity-catalog/426335709') In my case all catalog...
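For the `spark.sql` route used in the original question, the same idea can be sketched by adding a `MANAGED LOCATION` clause to `CREATE CATALOG`. This is a minimal sketch, assuming a Unity Catalog workspace where the clause plays the role of `storage_root`; the `create_catalog_sql` helper and the placeholder location are hypothetical, and the location must be one your metastore is permitted to use.

```python
import re


def create_catalog_sql(catalog, managed_location=None):
    """Build a CREATE CATALOG statement, optionally with an explicit managed location."""
    # Keep the identifier simple so it can be interpolated without quoting.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", catalog):
        raise ValueError(f"unsupported catalog name: {catalog!r}")
    statement = f"CREATE CATALOG IF NOT EXISTS {catalog}"
    if managed_location is not None:
        # Refuse quotes rather than guess at escaping rules.
        if "'" in managed_location:
            raise ValueError("managed_location must not contain single quotes")
        statement += f" MANAGED LOCATION '{managed_location}'"
    return statement


# Placeholder location (an assumption): replace with a path your metastore can access.
sql = create_catalog_sql("dev", "s3://<your-bucket>/<prefix>")
# In a notebook this string would be passed to spark.sql(sql).
```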

by TomHauf • Visitor

5 hours ago

38 Views
0 replies
0 kudos

Sending my weather data to a client's cloud storage

Hi, One of our clients is asking to switch from our API feed to have weather data delivered automatically to their Cloud Storage. What steps do I need to take from my end? Do I need to join Databricks to do so? Thanks. Tom

by MadhuB • Contributor III

9 hours ago

75 Views
1 replies
0 kudos

Installing Maven (3rd party) libraries on Job Cluster

I'm trying to install Maven libraries on the job cluster (non-interactive cluster) as part of a Databricks workflow. I've added the context in the cluster configuration as part of deployment, but I can't find it in the post deployment configurati...

Latest Reply

MadhuB
Contributor III

6 hours ago

0 kudos

I found the workaround. Below are the steps:
1. Add the required library to the allowed list at the workspace level (requires workspace/metastore admin access); you might need the coordinates groupId:artifactId:version.
2. At the task level, include under De...
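As an illustration of the task-level step, a Maven library entry in a Jobs API task payload has roughly the shape below. This is a sketch under assumptions: the `maven_library` helper, the task key, and the `com.example:my-lib:1.0.0` coordinates are hypothetical, and the allowlist step from item 1 still has to be done separately by an admin.

```python
import re

# groupId:artifactId:version, with no empty parts and no whitespace.
_COORDINATES = re.compile(r"^[^:\s]+:[^:\s]+:[^:\s]+$")


def maven_library(coordinates, repo=None):
    """Build one entry for a task's `libraries` list from Maven coordinates."""
    if not _COORDINATES.match(coordinates):
        raise ValueError(
            f"expected groupId:artifactId:version, got {coordinates!r}"
        )
    spec = {"coordinates": coordinates}
    if repo:
        # Optional custom repository URL; omitted when not given.
        spec["repo"] = repo
    return {"maven": spec}


# Hypothetical task definition using the helper.
task = {
    "task_key": "example_task",
    "libraries": [maven_library("com.example:my-lib:1.0.0")],
}
```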

by BobCat62 • New Contributor II

6 hours ago

49 Views
0 replies
0 kudos

How to copy notebooks from local to the target folder via asset bundles

Hi all, I am able to deploy Databricks assets to the target workspace. Jobs and workflows can also be created successfully. But I have a special requirement: I need to copy the notebooks to the target folder on the Databricks workspace. Example: on local I have...

by Pu_123 • Visitor

yesterday

91 Views
1 replies
0 kudos

Cluster configuration

Please help me configure/choose the cluster configuration. I need to process and merge 6 million records into Azure SQL DB. At the end of the week, 9 billion records need to be processed and merged into Azure SQL DB, and a few transformations need to...

Latest Reply

Shua42
Databricks Employee

8 hours ago

0 kudos

It will depend on the transformations and how you're loading them. Assuming it's mostly in Spark, I recommend starting small using a job compute cluster with autoscaling enabled for cost efficiency. For daily loads (6 million records), a driver and 2...
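To make the "job compute cluster with autoscaling" suggestion concrete, here is a minimal sketch of a `new_cluster` block as it might appear in a job definition. The helper name, the worker bounds, and the placeholder runtime and node type strings are assumptions for illustration; they are not the sizes recommended in the truncated reply.

```python
def autoscaling_job_cluster(spark_version, node_type_id, min_workers, max_workers):
    """Build a job cluster spec that scales between min_workers and max_workers."""
    if min_workers < 0 or min_workers > max_workers:
        raise ValueError("need 0 <= min_workers <= max_workers")
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "autoscale": {
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
    }


# Placeholder values (assumptions): choose a runtime and node type available
# in your workspace, and tune the bounds after observing real runs.
daily_cluster = autoscaling_job_cluster(
    spark_version="<runtime-version>",
    node_type_id="<node-type>",
    min_workers=1,
    max_workers=4,
)
```

Starting with a narrow range and widening it only when runs show sustained pressure keeps the cost of the small daily load separate from the sizing needed for the weekly one.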

by walgt • Visitor

10 hours ago

79 Views
1 replies
0 kudos

Permission Issue in Delta Lake Course

Hi everyone, I'm new to Databricks and working on the "Data Ingestion with Delta Lake" course. I encountered a permission error with the following query: Can anyone help with this? Thanks!

Latest Reply

Advika
Databricks Employee

8 hours ago

0 kudos

Hello @walgt! Apologies for the inconvenience. This was a known issue, but it has now been fixed! You should now be able to run your query without any problems. Thanks for your patience!

by subhas • New Contributor

9 hours ago

53 Views
0 replies
0 kudos

Auto Loader bringing NULL Records

Hi, I am using Auto Loader to fetch some records stored in two files. Please see my code below. It fetches records from the two files correctly and then it starts fetching NULL records. I attach option("cleanSource", ) to readStream. But it is ...

by balu_9309 • Visitor

9 hours ago

63 Views
0 replies
0 kudos

databricks job runs connect with powerbi

Hi, I have Databricks job runs. How do I connect them to the Power BI app, or save those runs in Blob storage or a Delta table?

by chexa_Wee • Visitor

18 hours ago

211 Views
5 replies
0 kudos

How to manage two separate projects?

Hi all, I am managing one project in Databricks, with one more coming soon. Can anyone guide me on how to use Unity Catalog or any other method for this?

Latest Reply

mnorland
New Contributor III

10 hours ago

0 kudos

A wide variety of needs must be considered, such as governance, compute, and storage. This depends on the size of your projects.

by ayushmangal72 • Visitor

14 hours ago

113 Views
1 replies
0 kudos

Revert cluster DBR version to last DBR

Hi Team, We updated our clusters' DBR version and later found that some of our jobs started failing. We now want to revert to the previous DBR version, but we forgot which DBR version the jobs were running fine on. Is there any way ...

Latest Reply

adhi_databricks
New Contributor III

9 hours ago

0 kudos

Hey @ayushmangal72, try using the Databricks Job Run API (/api/2.2/jobs/runs/list) to fetch older run IDs for the job. Once you have the run_id, make a request to the API at /api/2.2/jobs/runs/get. You'll be able to find the DBR version in the API r...
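Once a `/api/2.2/jobs/runs/get` response has been fetched and parsed as JSON, the runtime version can be pulled out of it. The reply is cut off before saying where in the response the version sits, so this sketch does not assume a location: it walks the whole payload looking for `spark_version`, the field cluster specs use for the DBR version. The `find_spark_versions` helper and the sample payload are hypothetical.

```python
def find_spark_versions(payload):
    """Collect every `spark_version` string found anywhere in a parsed JSON payload."""
    found = set()
    if isinstance(payload, dict):
        for key, value in payload.items():
            if key == "spark_version" and isinstance(value, str):
                found.add(value)
            else:
                found |= find_spark_versions(value)
    elif isinstance(payload, list):
        for item in payload:
            found |= find_spark_versions(item)
    return found


# Hypothetical, heavily trimmed stand-in for a runs/get response.
sample_run = {
    "run_id": 1,
    "job_clusters": [
        {
            "job_cluster_key": "main",
            "new_cluster": {"spark_version": "<old-dbr-version>", "num_workers": 2},
        }
    ],
    "tasks": [{"task_key": "t1", "job_cluster_key": "main"}],
}

versions = find_spark_versions(sample_run)
```

Running this over a run that predates the upgrade should surface the version string to restore in the cluster configuration.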

by mrstevegross • Contributor

10 hours ago

62 Views
0 replies
0 kudos

Attempt to use a custom container with an instance pool fails

I am trying to run a job with (1) custom containers, and (2) via an instance pool. Here's the setup: The custom container is just the DBR-provided `databricksruntime/standard:12.2-LTS`. The instance pool is defined via the UI (see screenshot, below). At ...

by cmathieu • New Contributor II

12 hours ago

71 Views
0 replies
0 kudos

DAB - All projects files deployed

I have an issue with DAB where all the project files, starting from root ., get deployed to the /files folder in the bundle. I would prefer being able to deploy certain util notebooks, but not all the files of the project. I'm able to not deploy any ...
