Data Engineering

Forum Posts

Sorted by:

by 563641 • New Contributor II

07-06-2022 8:25:48 AM

1188 Views
1 replies
2 kudos

Advanced ML Virtual Training Video from 2022 Summit (not currently accessible)

There does not seem to be a way to log into and view the recent "paid" training sessions from the 2022 Data/AI Summit. I was able to log in and view the videos yesterday, but the website currently posted has no option for logging in/access. Is the...

Data Engineering

1188 Views
1 replies
2 kudos

07-06-2022 8:25:48 AM

View Replies

Latest Reply

Vidula
Honored Contributor

09-01-2022 12:34:27 AM

2 kudos

Hey there @Christopher Warner Just wanted to check in if you were able to resolve your issue or do you need more help? We'd love to hear from you.Thanks!

2 kudos

09-01-2022 12:34:27 AM

by pawelmitrus • Contributor

07-24-2022 5:31:17 AM

6196 Views
4 replies
1 kudos

Why Databricks spawns multiple jobs

I have a Delta table spark101.airlines (sourced from `/databricks-datasets/airlines/`) partitioned by `Year`. My `spark.sql.shuffle.partitions` is set to default 200. I run a simple query:select Origin, count(*) from spark101.airlines group by Origi...

Data Engineering

6196 Views
4 replies
1 kudos

07-24-2022 5:31:17 AM

View Replies

Latest Reply

User16753725469
Databricks Employee

09-01-2022 12:01:18 AM

1 kudos

Could you please paste the query plan here to analyse the issue

1 kudos

09-01-2022 12:01:18 AM

3 More Replies

by hamzatazib96 • New Contributor III

07-05-2022 9:54:01 AM

3326 Views
1 replies
1 kudos

Snowflake/GCP error: Premature end of chunk coded message body: closing chunk expected

Hello all,I've been experiencing the error described below, where I try to query a table from Snowflake which is about ~5.5B rows and ~30columns, and it fails almost systematically; specifically, either the Spark Job doesn't even start or I get the ...

Data Engineering

3326 Views
1 replies
1 kudos

07-05-2022 9:54:01 AM

View Replies

Latest Reply

Vidula
Honored Contributor

08-31-2022 11:30:52 PM

1 kudos

Hey there @hamzatazib96 Does @Kaniz Fatma response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly?We'd love to hear from you.Thanks!

1 kudos

08-31-2022 11:30:52 PM

by Anonymous • Not applicable

07-05-2022 9:50:03 AM

5379 Views
5 replies
3 kudos

Encryption/Decryption options in ADB

Hello all,We are working on one of the client requirements to implement suitable data encryption in Azure Databricks.We should be able to encrypt and decrypt the data based on the access, we explored fernet library but client denied it saying it degr...

Data Engineering

5379 Views
5 replies
3 kudos

07-05-2022 9:50:03 AM

View Replies

Latest Reply

Vidula
Honored Contributor

08-31-2022 11:27:49 PM

3 kudos

Hi @purushotham Chanda Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

3 kudos

08-31-2022 11:27:49 PM

4 More Replies

by Hubert-Dudek • Esteemed Contributor III

08-25-2022 9:18:06 AM

8284 Views
3 replies
28 kudos

Some key notes from today's Databricks Azure Roadmap Q4 event: - Faster merge and updates with Photon + Deletion vectors in Q3,- Unity Catalog - s...

Some key notes from today's Databricks Azure Roadmap Q4 event:- Faster merge and updates with Photon + Deletion vectors in Q3,- Unity Catalog - system tables, including system table for lineage,- ML features and models under Unity Catalog governance,...

Data Engineering

8284 Views
3 replies
28 kudos

08-25-2022 9:18:06 AM

View Replies

Latest Reply

Pat
Esteemed Contributor

08-29-2022 4:57:24 AM

28 kudos

Thanks, for sharing this.Nothing about Delta Live Tables availability in Unity Catalog ?

28 kudos

08-29-2022 4:57:24 AM

2 More Replies

by chainavarro • New Contributor

08-30-2022 1:23:53 PM

2311 Views
1 replies
1 kudos

Resolved! Attended 27th July 2022 webinar but have not recieved voucher,even uploaded Lakehouse certificate

@Kaniz Fatma (Databricks) @Samantha (Databricks) This is Diego Navarro .Actually I attended Databricks webinar on 27th July for (Databricks Certification Exam Overview Training: Databricks Certified Data Analyst Associate).I was expecting vouchers ...

Data Engineering

2311 Views
1 replies
1 kudos

08-30-2022 1:23:53 PM

View Replies

Latest Reply

Nadia1
Databricks Employee

08-31-2022 7:00:04 AM

1 kudos

Hello,you received your voucher on 8/4. It might have gone to spam. Here you go: APP82P22diwBJHT4Thank you

1 kudos

08-31-2022 7:00:04 AM

by Zoltar • New Contributor III

08-20-2022 10:07:03 PM

11078 Views
4 replies
10 kudos

Resolved! UI Improvements / Personalization?

I have a few suggestions for UI improvement on Databricks console -- Or maybe if anyone has figured out a way (using greasemonkey or similar scripts) to make some changes to Databricks UI -- i would like to know. # 1 - Workspace NavigationCan we have...

Data Engineering

11078 Views
4 replies
10 kudos

08-20-2022 10:07:03 PM

View Replies

Latest Reply

Hubert-Dudek
Esteemed Contributor III

08-30-2022 1:33:24 PM

10 kudos

Great ideas.I know that regarding #1 new file manager is in development.#3 I also proposed when we discussed possible improvements.@Lindsay Olson @Jose Gonzalez @Prabakar Ammeappin maybe we can push it as user feedback as that are great ideas with...

10 kudos

08-30-2022 1:33:24 PM

3 More Replies

by spyderfaye • New Contributor II

07-04-2022 2:38:54 AM

2532 Views
3 replies
1 kudos

Has anyone come across an issue where a table join fails for a single row, when there is no reason for this to happen?

So, I have a super simple left join from one table to another it's purpose to retrieve the date of birth for a customer from the customer ID FK in the transaction table to the customer ID PK in the customer table. A customer will have several transac...

Data Engineering

2532 Views
3 replies
1 kudos

07-04-2022 2:38:54 AM

View Replies

Latest Reply

Vidula
Honored Contributor

08-31-2022 2:00:03 AM

1 kudos

Hi @Faye Hughes Thank you so much for getting back to us. It's really great of you to send in the solution and mark the answer as best. We really appreciate your time.Wish you a great Databricks journey ahead!

1 kudos

08-31-2022 2:00:03 AM

2 More Replies

by Michael_Galli • Contributor III

07-04-2022 2:11:21 AM

2635 Views
3 replies
3 kudos

Streaming with Delta table source- definition of "File"?

Hi all,I have a Delta Table as a Spark Streaming source.This table contains signals on row level -> each signal is one append to the source table that creates a new version in the delta transaction history.I am not really sure now how Spark streaming...

Data Engineering

2635 Views
3 replies
3 kudos

07-04-2022 2:11:21 AM

View Replies

Latest Reply

Vidula
Honored Contributor

08-31-2022 1:56:50 AM

3 kudos

Hey there @Michael Galli Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...

3 kudos

08-31-2022 1:56:50 AM

2 More Replies

by c038644 • New Contributor II

08-30-2022 4:58:07 AM

2932 Views
3 replies
3 kudos

Use of venv pack

Hi, I very new so this probably sounds stupid...I'm following the blog on How to Manage Python Dependencies in PySpark:https://www.databricks.com/blog/2020/12/22/how-to-manage-python-dependencies-in-pyspark.html...but when I try the packing works fin...

Data Engineering

2932 Views
3 replies
3 kudos

08-30-2022 4:58:07 AM

View Replies

Latest Reply

Debayan
Databricks Employee

08-30-2022 9:54:24 PM

3 kudos

Can you try using an absolute path instead of a relative path for the same? For example: https://stackoverflow.com/questions/38661464/filenotfounderror-winerror-3

3 kudos

08-30-2022 9:54:24 PM

2 More Replies

by AnandR • New Contributor

08-26-2022 11:18:33 AM

1360 Views
1 replies
1 kudos

I have 2 roles created for my Dbricks acc on AWS. Want to know which role will be used by Dbricks for AWS resources (ex. Cluster Creation)

I have 1 role with AWS root account and 1 role wit AWS non-root account. How do I tell Dbricks to use specific role for cluster creation ? Please guide me here or if any documentation will also suffice . Thanks.

Data Engineering

1360 Views
1 replies
1 kudos

08-26-2022 11:18:33 AM

View Replies

Latest Reply

AmanSehgal
Honored Contributor III

08-30-2022 9:21:31 PM

1 kudos

Go to settings > Admin ConsoleUnder instance profiles tab you can add an instance profile which is a container for IAM role. Using this you can let EC2 instance know which S3 buckets it can access. Under users tab you can manage users who have access...

1 kudos

08-30-2022 9:21:31 PM

by TT1 • New Contributor III

08-30-2022 10:24:09 AM

3350 Views
2 replies
8 kudos

My co-worker has had issues with her notebook changes not saving. Definitely not a file size issue. What are the potential things that I can have her do to get to the bottom of this? Is there a way to force the system to save the notebook?

Data Engineering

3350 Views
2 replies
8 kudos

08-30-2022 10:24:09 AM

View Replies

Latest Reply

AmanSehgal
Honored Contributor III

08-30-2022 8:50:14 PM

8 kudos

Notebooks are auto saved and you can track changes by clicking on Revision History on top right corner of the notebook. Also link git repo to your notebook to track changes.

8 kudos

08-30-2022 8:50:14 PM

1 More Replies

by zyang • Contributor II

07-21-2022 11:42:33 PM

2533 Views
1 replies
4 kudos

pyspark delta table schema evolution

I am using the schema evolution in the delta table and the code is written in databricks notebook. df.write .format("delta") .mode("append") .option("mergeSchema", "true") .partitionBy("date") .save(path)But I ...

Data Engineering

2533 Views
1 replies
4 kudos

07-21-2022 11:42:33 PM

View Replies

Latest Reply

Noopur_Nigam
Databricks Employee

08-30-2022 2:04:11 PM

4 kudos

Hi @z yang Please provide the df creation code as well to understand the complete exception and scenario.

4 kudos

08-30-2022 2:04:11 PM

by j02424 • New Contributor

08-30-2022 1:09:27 AM

4259 Views
1 replies
4 kudos

Best practice to delete /dbfs/tmp ?

What is best practice regarding the tmp folder? We have a very large amount of data in that folder and not sure whether to delete, back up etc?

Data Engineering

4259 Views
1 replies
4 kudos

08-30-2022 1:09:27 AM

View Replies

Latest Reply

Debayan
Databricks Employee

08-30-2022 2:04:10 PM

4 kudos

/dbfs/tmp can contain a lot of files including temporary system files used for intermediary calculations or other sub directories which can contain packages of user defined installations. It is always better to backup the files.

4 kudos

08-30-2022 2:04:10 PM

by Akshith_Rajesh • New Contributor III

08-30-2022 9:36:29 AM

7069 Views
3 replies
6 kudos

Unable to write Data frame to Azure Synapse Table

When I am trying to insert records into the azure synapse Table using JDBC Its throwing below error com.microsoft.sqlserver.jdbc.SQLServerException: The statement failed. Column 'COMPANY_ADDRESS_STATE' has a data type that cannot participate ...

Data Engineering

7069 Views
3 replies
6 kudos

08-30-2022 9:36:29 AM

View Replies

Latest Reply

Hubert-Dudek
Esteemed Contributor III

08-30-2022 1:12:07 PM

6 kudos

Columns that use any of the following data types cannot be included in a columnstore index:nvarchar(max), varchar(max), and varbinary(max) (Applies to SQL Server 2016 and prior versions, and nonclustered columnstore indexes)so the issue is on the Azu...

6 kudos

08-30-2022 1:12:07 PM

2 More Replies

Databricks Community

Forum Posts

Advanced ML Virtual Training Video from 2022 Summit (not currently accessible)

Why Databricks spawns multiple jobs

Snowflake/GCP error: Premature end of chunk coded message body: closing chunk expected

Encryption/Decryption options in ADB

Some key notes from today's Databricks Azure Roadmap Q4 event: - Faster merge and updates with Photon + Deletion vectors in Q3,- Unity Catalog - s...

Resolved! Attended 27th July 2022 webinar but have not recieved voucher,even uploaded Lakehouse certificate

Resolved! UI Improvements / Personalization?

Has anyone come across an issue where a table join fails for a single row, when there is no reason for this to happen?

Streaming with Delta table source- definition of "File"?

Use of venv pack

I have 2 roles created for my Dbricks acc on AWS. Want to know which role will be used by Dbricks for AWS resources (ex. Cluster Creation)

My co-worker has had issues with her notebook changes not saving. Definitely not a file size issue. What are the potential things that I can have her do to get to the bottom of this? Is there a way to force the system to save the notebook?

pyspark delta table schema evolution

Best practice to delete /dbfs/tmp ?

Unable to write Data frame to Azure Synapse Table

Join Us as a Local Community Builder!

List workspace permissions should return identity

How Are You Using Local IDEs (VS Code / Cursor/ Wh...

What strategies have you found most effective for ...

What are best practices for designing a large-scal...

How do I choose between a standard cluster and a s...