Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shamly
by New Contributor III
  • 3608 Views
  • 2 replies
  • 3 kudos

Spark exception error while reading a Parquet file

When I try to read a Parquet file from an Azure Data Lake container in Databricks, I get a Spark exception. Below is my query:
import pyarrow.parquet as pq
from pyspark.sql.functions import *
from datetime import datetime
data = spark.read.parquet(f"/mnt...

Latest Reply
DavideAnghileri
Contributor
  • 3 kudos

Hi @shamly pt, more information is needed to solve the issue. However, common problems are: the storage is not mounted; the file doesn't exist in the mounted storage. Also, there is no need to use an f-string if there are no curly brackets with expressions in ...
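
To illustrate the f-string point in the reply, here is a minimal sketch. The path and folder name are hypothetical (the path in the original post is truncated). An f-string without `{}` placeholders is just an ordinary string, so the `f` prefix adds nothing in that case.

```python
# Hypothetical path for illustration only; the real path in the post is truncated.
plain_path = "/mnt/example-container/data.parquet"
fstring_path = f"/mnt/example-container/data.parquet"  # no {} placeholders

# Both literals produce exactly the same string.
same = plain_path == fstring_path

# The f prefix is only useful when the string interpolates an expression.
folder = "example-container"  # hypothetical name
built_path = f"/mnt/{folder}/data.parquet"
```

Either form can be passed to `spark.read.parquet`; the plain literal is simply the clearer choice when nothing is interpolated.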

1 More Replies
db-avengers2rul
by Contributor II
  • 3080 Views
  • 8 replies
  • 18 kudos

Code snippet error from course - Databricks Academy - Delta Lake Rapid Start with Python

Dear Team, while I was doing hands-on practice from the course Delta Lake Rapid Start with Python (https://customer-academy.databricks.com/learn/course/97/delta-lake-rapid-start-with-python), I came across false as the output: dbutils.fs.rm(health_t...

Latest Reply
Anonymous
Not applicable
  • 18 kudos

Could you describe your issue in more detail (a screenshot or something similar)? I hope to help you find the issue.

7 More Replies
rajat1
by New Contributor
  • 14606 Views
  • 2 replies
  • 1 kudos

How to convert a DataFrame (df) to an Excel file that I can share with my colleagues?

I am working on Microsoft Azure Databricks. I have a final DataFrame of shape (3276*23) and I want to share it as an Excel file. How can I do it? (I am using df.to_excel('fileOutput.xlsx', sheet_name = 'Sheet1', index = False); the command is runn...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

You could try this: convert the PySpark DataFrame to a pandas DataFrame, then export it to an Excel file.
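
A minimal sketch of the approach suggested in this reply, assuming `spark_df` is a PySpark DataFrame small enough to fit in driver memory (a 3276 x 23 frame should be). The function name and output path are hypothetical. `toPandas()` and `to_excel()` are the standard PySpark and pandas methods, and `to_excel()` needs an Excel writer package such as openpyxl installed on the cluster.

```python
def spark_df_to_excel(spark_df, output_path, sheet_name="Sheet1"):
    """Collect a small Spark DataFrame to the driver and write it as .xlsx."""
    # toPandas() pulls every row to the driver, so keep this to small results.
    pandas_df = spark_df.toPandas()
    # index=False leaves the pandas row index out of the spreadsheet.
    pandas_df.to_excel(output_path, sheet_name=sheet_name, index=False)
    return output_path

# Example call (hypothetical path):
# spark_df_to_excel(df, "/tmp/fileOutput.xlsx")
```

If `df` is already a pandas DataFrame, as the `df.to_excel(...)` call in the question suggests, the `toPandas()` step is unnecessary and the remaining question is where the file gets written.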

1 More Replies
LPlates
by New Contributor III
  • 11623 Views
  • 2 replies
  • 1 kudos

Resolved! How do you read an Excel spreadsheet with Databricks

My cluster has Scala 2.12. I've installed the Maven library com.crealytics:spark-excel_2.12:0.14.0. I get the error java.lang.IllegalStateException: Cannot get a STRING value from a NUMERIC cell when trying to execute the following:
%python
excelFileName="/mnt/dl...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Another approach that may help in your case is to read the Excel file with pandas, then convert the pandas DataFrame to a PySpark DataFrame.

1 More Replies
JanakaNaw
by New Contributor II
  • 4790 Views
  • 9 replies
  • 3 kudos

Resolved! Databricks Certified Data Engineer Associate Certificate or Badge not received

Hello, I passed Databricks Certified Data Engineer Associate on 28th October 2022, but I haven't received my certificate/badge yet. Please help me with this. Best Regards, Janaka Nawarathna.

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Have you received your badge?

8 More Replies
Ryan_Chynoweth
by Esteemed Contributor
  • 8428 Views
  • 3 replies
  • 7 kudos

Resolved! Best language to use

Databricks supports SQL, Scala, Python, and R. Is there a most performant language to use on Databricks? I know SQL well but would like to get into one of the other languages and don't know which to focus on.

Latest Reply
Anonymous
Not applicable
  • 7 kudos

It totally depends on you. BTW, you can choose Python and SQL.

2 More Replies
NOOR_BASHASHAIK
by Contributor
  • 701 Views
  • 0 replies
  • 2 kudos

Databricks SQL endpoint authentication

Hi all, I have a requirement that goes like this: users of a particular software that doesn't have out-of-the-box integration with Databricks click on a dashboard, and the button click then sends a SQL query to Databricks (user gets authenticated in Da...

wyzer
by Contributor II
  • 5465 Views
  • 2 replies
  • 12 kudos

Resolved! Add the creation date of a parquet file into a DataFrame

Currently I load multiple Parquet files with this code:
df = spark.read.parquet("/mnt/dev/bronze/Voucher/*/*")
(Inside the Voucher folder, there is one folder per date, each containing one Parquet file.)
How can I add a column into this DataFrame, that...
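
One way to approach this, sketched under assumptions: Spark's `input_file_name()` function (in `pyspark.sql.functions`) returns the source file path for each row, and the date can then be parsed out of the folder name. The pure-Python helper below shows only the parsing step. It assumes the date folders are named `YYYY-MM-DD`, which the post does not state, and the function name and example file name are hypothetical. This recovers the date encoded in the folder name, not the file system's creation timestamp.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed layout: .../Voucher/<YYYY-MM-DD>/<file>.parquet
_DATE_FOLDER = re.compile(r"/Voucher/(\d{4}-\d{2}-\d{2})/")

def folder_date_from_path(file_path: str) -> Optional[date]:
    """Return the date encoded in the Voucher sub-folder, or None if absent."""
    match = _DATE_FOLDER.search(file_path)
    if match is None:
        return None
    # Raises ValueError if the folder looks like a date but is not a valid one.
    return datetime.strptime(match.group(1), "%Y-%m-%d").date()

# Hypothetical file name under the path pattern from the post.
example = folder_date_from_path(
    "dbfs:/mnt/dev/bronze/Voucher/2022-11-03/part-0000.parquet"
)
```

Inside Spark, the same pattern could be applied column-wise with `regexp_extract(input_file_name(), ...)` instead of a Python function, which avoids a UDF.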

Latest Reply
wyzer
Contributor II
  • 12 kudos

Thanks @Michail Karamanos

1 More Replies
AnubhavG
by Contributor
  • 3846 Views
  • 5 replies
  • 18 kudos

Resolved! External KMS integration with Databricks like AWS KMS, Azure Key Vault.

I would like to know how we can integrate Databricks with external KMS providers, as it currently does with AWS KMS and Azure Key Vault. Can we import keys from any other KMS?

Latest Reply
Vivian_Wilfred
Databricks Employee
  • 18 kudos

@Anubhav Gupta Databricks is hosted on the cloud provider, which means that all resources used by Databricks in the backend are in the cloud. For instance, if you create a cluster, the VMs are launched in AWS as EC2 instances. So the integration of K...

4 More Replies
rgb
by New Contributor
  • 910 Views
  • 0 replies
  • 0 kudos

Migration_pipeline.py failing to get default credentials

cat ~/.databrickscfg looks like this (with the correct token/host values in place of xxxxxx):
[DEFAULT]
host = xxxxxx
token = xxxxxx
jobs-api-version = 2.0
The command I run to start the pipeline with default configured credentials is:
sudo python3 migrati...
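
When debugging a credentials lookup like this, it can help to confirm which config file is actually being read and what it contains. Below is a minimal sketch using Python's standard `configparser`; the function name is hypothetical, and this is not how the migration script itself loads credentials. One thing worth checking, as a guess rather than a diagnosis: under `sudo`, `~` may resolve to a different user's home directory, so a different `.databrickscfg` (or none) may be found.

```python
import configparser
from pathlib import Path

def read_default_profile(cfg_path=None):
    """Return the [DEFAULT] section of a .databrickscfg-style file as a dict."""
    # Falls back to the current user's home, which can differ under sudo.
    path = Path(cfg_path) if cfg_path else Path.home() / ".databrickscfg"
    # interpolation=None keeps '%' characters in tokens from being interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):  # read() returns the list of files it parsed
        raise FileNotFoundError(f"No config file found at {path}")
    profile = dict(parser.defaults())
    missing = [key for key in ("host", "token") if not profile.get(key)]
    if missing:
        raise KeyError(f"Missing keys in [DEFAULT]: {missing}")
    return profile
```

Running this once as the normal user and once under `sudo` shows whether both invocations see the same file.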

693872
by New Contributor II
  • 3085 Views
  • 5 replies
  • 2 kudos

I am getting this error when I execute a left join on two DataFrames: PythonException: 'pyspark.serializers.SerializationError: Caused by Traceback (most recent call last): going to post full traceback:

I simply do a left join on two DataFrames, and I was able to print the contents of both. Here is what the code looks like: df_silver = spark.sql("select ds.PropertyID,\              ds.*             from dfsilver as ds LEFT JOIN dfaddmaster as dm \        ...

Latest Reply
Dooley
Valued Contributor II
  • 2 kudos

Did that answer your question? Did it work?

4 More Replies
marcus1
by New Contributor III
  • 472 Views
  • 0 replies
  • 0 kudos

Why does the Databricks SCIM get-users API (https://docs.databricks.com/dev-tools/api/latest/scim/scim-users.html#get-users) take so long?

I've observed that, as we added more workspaces and users to those workspaces, fetching users per workspace now takes 11 minutes or more. Our automation to provision group access is now unacceptably slow. I've noted that the UI doesn't suffer...

J_M_W
by Contributor
  • 3221 Views
  • 2 replies
  • 5 kudos

Resolved! Databricks is automatically creating a _apply_changes_storage table in the database when using apply_changes for Delta Live Tables

Hi there, I am using apply_changes (aka Delta Live Tables Change Data Capture) and it works fine. However, it seems to automatically create a secondary table in the database metastore called _apply_storage_changes_{tableName}. So for every table I use ...

Latest Reply
J_M_W
Contributor
  • 5 kudos

Hi, thanks @Hubert Dudek. I will look into disabling access for the users!

1 More Replies
