Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
In my project, I want the job to be terminated if it takes too long, and then retried even when there is a timeout error. In Databricks, the launched status should show "Retried by scheduler", and it should respect min_retry_interval_millis before starting the retry...
Hi, I want to set an execution termination time/timeout limit for a job in the job config file. Please help me understand how to do this by passing a parameter in the job config file.
Hi @Pratibha, you can configure optional duration thresholds for a job, including an expected completion time for the job and a maximum completion time for the job. To configure duration thresholds, click Set duration thresholds. If you are creating j...
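For setting this in the job config itself, the Jobs API exposes timeout and retry fields at the task level. A sketch of the relevant payload, with the job name and notebook path as placeholders:

```python
# Sketch of a Jobs API 2.1 task payload that enforces a timeout and retries.
# The job name and notebook path are placeholders for illustration.
job_settings = {
    "name": "example-job-with-timeout",
    "tasks": [
        {
            "task_key": "main",
            "notebook_task": {"notebook_path": "/path/to/notebook"},  # placeholder
            "timeout_seconds": 3600,             # terminate the task after 1 hour
            "max_retries": 3,                    # retry up to 3 times
            "min_retry_interval_millis": 60000,  # wait at least 60 s between retries
            "retry_on_timeout": True,            # retry even when the failure was a timeout
        }
    ],
}
```

The same dict can be sent as the body of a `POST /api/2.1/jobs/create` request or kept in a JSON job config file.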
Hello, I am wondering if there is a way in Databricks to run a job continuously except for 1 or 2 hours every night, during which the cluster could restart. We are using interactive clusters for our jobs and development in Dev and UAT. In Prod we are still...
Is it possible to avoid using Service Principal (and managing their secrets) via the Python MSAL library and, instead, use the "Access Connector for Azure Databricks" to access Azure SQL Server (just like we do for connecting to Azure Data Lake Stora...
Hi, currently all the data required resides in an Azure SQL database. We have a project in which we need to query this data on demand in Salesforce Data Cloud, to be further used for reporting in a CRMA dashboard. Do we need to move this data from Azure SQL to Delta l...
It depends. If Salesforce Data Cloud has a connector for Azure SQL (either a native one or ODBC/JDBC), you can query it directly. MS also has something like OData. AFAIK, Azure SQL does not have a query API, only APIs for DB-management purposes. If all of the above is no...
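If a direct connector is not available, the Azure SQL data can also be read from Databricks over JDBC and landed in Delta. A minimal sketch, assuming the Microsoft SQL Server JDBC driver is on the cluster; all connection values are placeholders:

```python
def read_azure_sql_table(spark, server, database, table, user, password):
    """Sketch: read an Azure SQL table into a Spark DataFrame over JDBC.

    Assumes the Microsoft SQL Server JDBC driver is available on the
    cluster; server, database, table, and credentials are placeholders.
    """
    jdbc_url = (
        f"jdbc:sqlserver://{server}.database.windows.net:1433;"
        f"database={database};encrypt=true;"
    )
    return (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
        .load()
    )
```

The resulting DataFrame can then be written to a Delta table for Salesforce Data Cloud to pick up.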
I have around 25 GB of data in my Azure storage. I am performing data ingestion using Auto Loader in Databricks. Below are the steps I am performing: setting enableChangeDataFeed to true; reading the complete raw data using readStream; writing as del...
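The steps above can be sketched roughly as follows, assuming a Databricks runtime where the cloudFiles source is available; all paths, the source file format, and the target table name are placeholders:

```python
def start_autoloader_ingest(spark, source_path, schema_path,
                            checkpoint_path, target_table):
    """Sketch of an Auto Loader ingest with Change Data Feed enabled.

    Assumes a Databricks runtime (the cloudFiles source is not part of
    open-source Spark). Paths, the file format, and the table name are
    placeholders.
    """
    stream = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")           # raw format (assumption)
        .option("cloudFiles.schemaLocation", schema_path)  # schema tracking location
        .load(source_path)
    )
    return (
        stream.writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)
        # set the Change Data Feed table property on the target
        .option("delta.enableChangeDataFeed", "true")
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```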
Hello All, the following command is not working when run through a Databricks notebook:
%sh
# Bash code to print 'Hello, PowerShell!'
echo 'Hello, PowerShell!'
# powershell.exe -ExecutionPolicy Restricted -File /dbfs:/FileStore/Read_Vault_Inventory.ps1...
[Situation] I am using AWS DMS to store MySQL CDC in S3 as Parquet files. I have implemented a streaming pipeline using the DLT module. The target destination is Unity Catalog. [Questions and issues] - Where are the tables and materialized views specifi...
Using Databricks Runtime 12.0, when attempting to mount an Azure blob storage container, I'm getting the following exception: `IllegalArgumentException: Unsupported Azure Scheme: abfss`
dbutils.fs.mount(
    source="abfss://container@my-storage-accoun...
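The abfss scheme cannot be mounted the same way as a plain wasbs blob mount; it needs OAuth settings passed via extra_configs. A sketch under that assumption, with all credential values as placeholders and the notebook's built-in dbutils passed in:

```python
def mount_abfss(dbutils, container, storage_account, mount_point,
                client_id, client_secret, tenant_id):
    """Sketch: mount an ADLS Gen2 container over abfss using OAuth.

    All credential values are placeholders; pass the notebook's
    built-in `dbutils` object.
    """
    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }
    dbutils.fs.mount(
        source=f"abfss://{container}@{storage_account}.dfs.core.windows.net/",
        mount_point=mount_point,
        extra_configs=configs,
    )
```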
I'm writing some code to perform regression testing which requires the notebook path and its default language. Based on the default language it will perform further analysis. So how can I programmatically get my notebook's default language and save it in some vari...
You can get the default language of a notebook using dbutils.notebook.get_notebook_language().
Try this example:
%python
# dbutils is available by default in Databricks notebooks; no import is needed
default_language = dbutils.notebook.get_notebook_language()
print(default_language)
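If that helper is not available on your runtime, a notebook's language is also returned by the Workspace API get-status endpoint. A sketch using only the standard library, with the workspace host, token, and notebook path as placeholders:

```python
import json
import urllib.parse
import urllib.request


def get_notebook_language(host, token, notebook_path):
    """Sketch: query /api/2.0/workspace/get-status, whose response for a
    notebook object includes a `language` field.

    `host` (e.g. "https://<workspace>.cloud.databricks.com"), `token`
    (a PAT), and `notebook_path` are placeholders.
    """
    query = urllib.parse.urlencode({"path": notebook_path})
    req = urllib.request.Request(
        f"{host}/api/2.0/workspace/get-status?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        # language is e.g. "PYTHON", "SCALA", "SQL", or "R"
        return json.load(resp)["language"]
```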
I have a dataframe that is the result of a series of transformations on big data (167 million rows), and I want to write it to Delta files and tables using the below:
try:
    (df_new.write.format('delta')
        .option("delta.minReaderVersion", "2")
        .optio...
Hi @Retired_mod, I am having the same issue: when I do an inner join on two Spark DataFrames, it runs on only a single node, and I am not sure how to modify it to run on many nodes. The same thing happens when I write 30 GB of data to a Delta table; it is almost 3...
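In cases like the one above, a join or write that runs on a single node often means the DataFrame has collapsed to one partition. A minimal sketch of repartitioning before the write, with the target path and partition count as placeholders to tune for your data size:

```python
def write_delta_in_parallel(df, target_path, num_partitions=200):
    """Sketch: repartition before writing so the work is spread across
    executors rather than a single node.

    `target_path` and `num_partitions` are placeholders; pick a
    partition count suited to your cluster and data volume.
    """
    (
        df.repartition(num_partitions)  # redistribute rows across executors
        .write.format("delta")
        .mode("overwrite")
        .save(target_path)
    )
```

Checking `df.rdd.getNumPartitions()` before the write is a quick way to confirm whether this is the bottleneck.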
Importing or cloning the .dbc folder from "advanced-data-engineering-with-databricks" into my own workspace fails with a time-out, and the folder is incomplete. How can I fix this? I tried downloading and importing the file, and also importing via URL...
I am having an issue with Databricks (Community Edition) where I can use Pandas to read a parquet file into a dataframe, but when I use Spark it states the file doesn't exist. I have tried reformatting the file path for spark but I can't seem to find...
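One common cause: pandas reads from the driver's local filesystem (where DBFS is mounted under /dbfs), while Spark resolves bare paths against DBFS itself (dbfs:/), so the same string can point at two different locations; for files that exist only on the driver's local disk, Spark needs a file:/ prefix instead. A small sketch of the /dbfs ↔ dbfs:/ translation (the example path in the usage note is a placeholder):

```python
def to_spark_path(pandas_path):
    """Convert a local /dbfs/... path (as used by pandas) into the
    dbfs:/... form that Spark expects."""
    if pandas_path.startswith("/dbfs/"):
        return "dbfs:/" + pandas_path[len("/dbfs/"):]
    return pandas_path


def to_pandas_path(spark_path):
    """Convert a dbfs:/... path into the /dbfs/... local-mount form
    that pandas (and other local-file libraries) can read."""
    if spark_path.startswith("dbfs:/"):
        return "/dbfs/" + spark_path[len("dbfs:/"):]
    return spark_path
```

For example, `spark.read.parquet(to_spark_path("/dbfs/FileStore/data.parquet"))` reads the same file pandas sees at `/dbfs/FileStore/data.parquet`.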
We are trying to retrieve the XML file name using _metadata, but it is not working. We are not able to use input_file_name() either, as we are using a shared cluster. We are reading the XML files using the com.databricks.spark.xml library.