Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

563641
by New Contributor II
  • 1188 Views
  • 1 reply
  • 2 kudos

Advanced ML Virtual Training Video from 2022 Summit (not currently accessible)

There does not seem to be a way to log into and view the recent "paid" training sessions from the 2022 Data/AI Summit. I was able to log in and view the videos yesterday, but the currently posted website has no option for logging in/access. Is the...

Latest Reply
Vidula
Honored Contributor

Hey there @Christopher Warner Just wanted to check in: were you able to resolve your issue, or do you need more help? We'd love to hear from you. Thanks!

pawelmitrus
by Contributor
  • 6196 Views
  • 4 replies
  • 1 kudos

Why Databricks spawns multiple jobs

I have a Delta table spark101.airlines (sourced from `/databricks-datasets/airlines/`) partitioned by `Year`. My `spark.sql.shuffle.partitions` is set to the default 200. I run a simple query: select Origin, count(*) from spark101.airlines group by Origi...
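As a rough, pure-Python sketch of what happens here (this is not Spark code): the group-by shuffles rows so that each `Origin` value lands in one of `spark.sql.shuffle.partitions` buckets via hash partitioning, which is one reason you can see up to 200 shuffle tasks even for a handful of groups. The key values below are illustrative.

```python
# Pure-Python sketch of hash partitioning, as done by a Spark shuffle.
# NUM_PARTITIONS mirrors the default spark.sql.shuffle.partitions = 200.
NUM_PARTITIONS = 200

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Spark uses a hash partitioner on the grouping key; Python's built-in
    # hash() stands in for Spark's hash function here.
    return hash(key) % num_partitions

# Illustrative Origin values, not taken from the airlines dataset itself.
origins = ["JFK", "SFO", "ORD", "ATL", "LAX"]
assignments = {o: partition_for(o) for o in origins}
print(assignments)

# Only a few partitions actually receive data, but Spark still plans a task
# per shuffle partition unless adaptive query execution coalesces them.
```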

Latest Reply
User16753725469
Databricks Employee

Could you please paste the query plan here so we can analyse the issue?

3 More Replies
hamzatazib96
by New Contributor III
  • 3326 Views
  • 1 reply
  • 1 kudos

Snowflake/GCP error: Premature end of chunk coded message body: closing chunk expected

Hello all, I've been experiencing the error described below, where I try to query a table from Snowflake which is about ~5.5B rows and ~30 columns, and it fails almost systematically; specifically, either the Spark job doesn't even start or I get the ...

Latest Reply
Vidula
Honored Contributor

Hey there @hamzatazib96 Does @Kaniz Fatma's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

Anonymous
by Not applicable
  • 5379 Views
  • 5 replies
  • 3 kudos

Encryption/Decryption options in ADB

Hello all, We are working on one of our client's requirements to implement suitable data encryption in Azure Databricks. We should be able to encrypt and decrypt the data based on access. We explored the fernet library, but the client declined it, saying it degr...

Latest Reply
Vidula
Honored Contributor

Hi @purushotham Chanda Hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you...

4 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 8284 Views
  • 3 replies
  • 28 kudos

Some key notes from today's Databricks Azure Roadmap Q4 event: - Faster merge and updates with Photon + Deletion vectors in Q3,- Unity Catalog - s...

Some key notes from today's Databricks Azure Roadmap Q4 event:
  • Faster merge and updates with Photon + Deletion vectors in Q3,
  • Unity Catalog - system tables, including a system table for lineage,
  • ML features and models under Unity Catalog governance, ...

Latest Reply
Pat
Esteemed Contributor

Thanks for sharing this. Nothing about Delta Live Tables availability in Unity Catalog?

2 More Replies
chainavarro
by New Contributor
  • 2311 Views
  • 1 reply
  • 1 kudos

Resolved! Attended 27th July 2022 webinar but have not received voucher, even uploaded Lakehouse certificate

@Kaniz Fatma (Databricks) @Samantha (Databricks) This is Diego Navarro. Actually, I attended the Databricks webinar on 27th July (Databricks Certification Exam Overview Training: Databricks Certified Data Analyst Associate). I was expecting vouchers ...

Latest Reply
Nadia1
Databricks Employee

Hello, you received your voucher on 8/4. It might have gone to spam. Here you go: APP82P22diwBJHT4. Thank you!

Zoltar
by New Contributor III
  • 11078 Views
  • 4 replies
  • 10 kudos

Resolved! UI Improvements / Personalization?

I have a few suggestions for UI improvement on the Databricks console -- or, if anyone has figured out a way (using Greasemonkey or similar scripts) to make some changes to the Databricks UI, I would like to know. #1 - Workspace Navigation: Can we have...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

Great ideas. I know that regarding #1, a new file manager is in development. #3 I also proposed when we discussed possible improvements. @Lindsay Olson @Jose Gonzalez @Prabakar Ammeappin maybe we can push it as user feedback, as those are great ideas with...

3 More Replies
spyderfaye
by New Contributor II
  • 2532 Views
  • 3 replies
  • 1 kudos

Has anyone come across an issue where a table join fails for a single row, when there is no reason for this to happen?

So, I have a super simple left join from one table to another; its purpose is to retrieve the date of birth for a customer, joining the customer ID FK in the transaction table to the customer ID PK in the customer table. A customer will have several transac...

Latest Reply
Vidula
Honored Contributor

Hi @Faye Hughes Thank you so much for getting back to us. It's really great of you to send in the solution and mark the answer as best. We really appreciate your time. Wish you a great Databricks journey ahead!

2 More Replies
Michael_Galli
by Contributor III
  • 2635 Views
  • 3 replies
  • 3 kudos

Streaming with Delta table source- definition of "File"?

Hi all, I have a Delta table as a Spark Streaming source. This table contains signals at row level -> each signal is one append to the source table that creates a new version in the Delta transaction history. I am not really sure now how Spark Streaming...

Latest Reply
Vidula
Honored Contributor

Hey there @Michael Galli Hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from y...

2 More Replies
c038644
by New Contributor II
  • 2932 Views
  • 3 replies
  • 3 kudos

Use of venv pack

Hi, I'm very new, so this probably sounds stupid... I'm following the blog on How to Manage Python Dependencies in PySpark: https://www.databricks.com/blog/2020/12/22/how-to-manage-python-dependencies-in-pyspark.html ...but when I try it, the packing works fin...

Latest Reply
Debayan
Databricks Employee

Can you try using an absolute path instead of a relative path? For example: https://stackoverflow.com/questions/38661464/filenotfounderror-winerror-3
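A minimal sketch of the suggestion above, using only the Python standard library; the archive name `environment.tar.gz` is an illustrative placeholder, not a path from the original post.

```python
import os

# Relative paths are resolved against the current working directory, which
# on a cluster may not be the directory you expect; resolving to an absolute
# path up front avoids FileNotFoundError surprises.
rel_path = "environment.tar.gz"        # illustrative archive name
abs_path = os.path.abspath(rel_path)

print(abs_path)
```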

2 More Replies
AnandR
by New Contributor
  • 1360 Views
  • 1 reply
  • 1 kudos

I have 2 roles created for my Databricks account on AWS. Want to know which role will be used by Databricks for AWS resources (e.g. cluster creation)

I have 1 role with the AWS root account and 1 role with an AWS non-root account. How do I tell Databricks to use a specific role for cluster creation? Please guide me here, or any documentation will also suffice. Thanks.

Latest Reply
AmanSehgal
Honored Contributor III

Go to Settings > Admin Console. Under the Instance Profiles tab you can add an instance profile, which is a container for an IAM role. Using this you can let the EC2 instance know which S3 buckets it can access. Under the Users tab you can manage users who have access...

TT1
by New Contributor III
  • 3350 Views
  • 2 replies
  • 8 kudos
Latest Reply
AmanSehgal
Honored Contributor III

Notebooks are auto-saved, and you can track changes by clicking Revision History in the top-right corner of the notebook. You can also link a git repo to your notebook to track changes.

1 More Reply
zyang
by Contributor II
  • 2533 Views
  • 1 reply
  • 4 kudos

pyspark delta table schema evolution

I am using schema evolution in the Delta table, and the code is written in a Databricks notebook: df.write.format("delta").mode("append").option("mergeSchema", "true").partitionBy("date").save(path) But I ...
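Conceptually, `mergeSchema = "true"` lets the append add columns that the target table does not have yet. A pure-Python sketch of that merge rule (this illustrates the semantics only, not Delta's implementation; the column names are made up):

```python
def merge_schema(table_schema: dict, incoming_schema: dict) -> dict:
    """Union of existing table columns and incoming columns, existing first."""
    merged = dict(table_schema)                  # column name -> type name
    for col, typ in incoming_schema.items():
        if col not in merged:
            merged[col] = typ                    # new column gets appended
        elif merged[col] != typ:
            # Incompatible type changes still fail, even with mergeSchema.
            raise TypeError(f"column {col!r}: {merged[col]} vs {typ}")
    return merged

table = {"id": "long", "date": "date", "amount": "double"}
incoming = {"id": "long", "date": "date", "amount": "double", "channel": "string"}
print(merge_schema(table, incoming))
# → {'id': 'long', 'date': 'date', 'amount': 'double', 'channel': 'string'}
```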

Latest Reply
Noopur_Nigam
Databricks Employee

Hi @z yang Please provide the df creation code as well, to understand the complete exception and scenario.

j02424
by New Contributor
  • 4259 Views
  • 1 reply
  • 4 kudos

Best practice to delete /dbfs/tmp ?

What is the best practice regarding the tmp folder? We have a very large amount of data in that folder and are not sure whether to delete it, back it up, etc.

Latest Reply
Debayan
Databricks Employee

/dbfs/tmp can contain a lot of files, including temporary system files used for intermediary calculations, and subdirectories that can contain packages from user-defined installations. It is always better to back up the files first.
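A minimal shell sketch of that advice: copy the directory aside before deleting anything. The paths here are stand-ins created with `mktemp`; on Databricks the FUSE path would be `/dbfs/tmp`, and you might use `dbutils.fs.cp` from a notebook instead.

```shell
# Back up a scratch directory before deleting it. Illustrative paths only.
SRC=$(mktemp -d)                              # stand-in for /dbfs/tmp
BACKUP="$(mktemp -d)/tmp-backup-$(date +%Y%m%d)"

echo "scratch data" > "$SRC/example.txt"      # pretend existing content

mkdir -p "$BACKUP"
cp -r "$SRC/." "$BACKUP/"                     # copy everything first
ls "$BACKUP"                                  # verify before any rm
```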

Akshith_Rajesh
by New Contributor III
  • 7069 Views
  • 3 replies
  • 6 kudos

Unable to write Data frame to Azure Synapse Table

When I am trying to insert records into the Azure Synapse table using JDBC, it throws the error below: com.microsoft.sqlserver.jdbc.SQLServerException: The statement failed. Column 'COMPANY_ADDRESS_STATE' has a data type that cannot participate ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

Columns that use any of the following data types cannot be included in a columnstore index: nvarchar(max), varchar(max), and varbinary(max) (applies to SQL Server 2016 and prior versions, and nonclustered columnstore indexes). So the issue is on the Azu...
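A common workaround (hedged -- check your connector version) is to avoid writing unlimited-length strings, e.g. via the Azure Synapse connector's `maxStrLength` option, so the column becomes a bounded nvarchar instead of nvarchar(max). This pure-Python snippet just shows how you might pick a safe bound from your data; the column values are made up.

```python
# Find the longest value in a string column to help choose a bounded length
# (illustrative values, not from the original table).
values = ["CA", "NY", "New South Wales"]
max_len = max(len(v) for v in values)
print(max_len)  # → 15
```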

2 More Replies
