Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Deepak_Kandpal
by New Contributor III
  • 10225 Views
  • 3 replies
  • 3 kudos

Resolved! Invalid configuration value detected for fs.azure.account.key with com.crealytics:spark-excel

I have set up my Databricks notebook to use a service principal to access ADLS using the configuration below. service_credential = dbutils.secrets.get(scope="<scope>",key="<service-credential-key>") spark.conf.set("fs.azure.account.auth.type.<storage-accou...

Latest Reply
Harsha_Dbrs
New Contributor II
  • 3 kudos

Below is the implementation of the same code in Scala: spark.sparkContext.hadoopConfiguration.set("fs.azure.account.key.<accountName>.dfs.core.windows.net", <accountKey>)

  • 3 kudos
2 More Replies
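
The Scala reply above sets the storage account key on the SparkContext's Hadoop configuration. A minimal Python sketch of the same idea might look like the following. The helper names `account_key_conf` and `apply_to_hadoop_conf` are hypothetical, the account name and key are placeholders, and `_jsc` is an internal PySpark handle, so treat this as an illustration under those assumptions and not as the thread's accepted fix.

```python
def account_key_conf(account_name: str, account_key: str) -> dict:
    """Build the Hadoop setting used in the Scala reply (hypothetical helper)."""
    return {f"fs.azure.account.key.{account_name}.dfs.core.windows.net": account_key}


def apply_to_hadoop_conf(spark, settings: dict) -> None:
    """Copy settings onto the SparkContext-level Hadoop configuration.

    Assumption: `spark` is an active SparkSession; `_jsc` is an internal
    PySpark handle to the underlying JavaSparkContext.
    """
    hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
    for key, value in settings.items():
        hadoop_conf.set(key, value)


# Placeholder values; in a notebook the key would normally come from a secret scope.
settings = account_key_conf("mystorageaccount", "<account-key>")
print(settings)
```

In a notebook, `apply_to_hadoop_conf(spark, settings)` would be called before the `spark.read` call that uses the Excel reader.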
Rishabh-Pandey
by Esteemed Contributor
  • 3002 Views
  • 3 replies
  • 5 kudos

www.linkedin.com

Woah, an #Excel plug-in for #DeltaSharing. Now I can import Delta tables directly into my spreadsheet using Delta Sharing. It puts the power of #DeltaLake into the hands of millions of business users. What does this mean? Imagine a data provider delivering...

Latest Reply
udit02
New Contributor II
  • 5 kudos

If you have any uncertainties, feel free to inquire here or connect with me on my LinkedIn profile for further assistance. https://whatsgbpro.org/

  • 5 kudos
2 More Replies
PraveenSaini
by New Contributor
  • 140601 Views
  • 34 replies
  • 6 kudos

How to read excel file using databricks

I have an Excel file as a source file, and I want to read its data and convert it into a DataFrame using Databricks. I have already added the Maven dependency for the Excel file format. When I try the code below, it gives an error. (Error: java.io....

Latest Reply
Jenish_lodha
New Contributor II
  • 6 kudos

To read an Excel file using Databricks, you can use the Databricks runtime, which supports multiple programming languages such as Python, Scala, and R. Here are the general steps to read an Excel file in Databricks using Python: 1. **Upload the Excel ...

  • 6 kudos
33 More Replies
vanessafvg
by New Contributor III
  • 1964 Views
  • 1 replies
  • 3 kudos

Extracting data from excel in datalake storage using openpyxl

I am trying to extract some data into Databricks but am tripping all over openpyxl; I am a newish user of Databricks. from openpyxl import load_workbook directory_id="hidden" scope="hidden" client_id="hidden" service_credential_key="hidden" container_name="hidden" s...

Latest Reply
Anonymous
Not applicable
  • 3 kudos

Hi @Vanessa Van Gelder, great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question. Thanks.

  • 3 kudos
Yash_542965
by New Contributor II
  • 7962 Views
  • 2 replies
  • 3 kudos

Resolved! Access Excel file in delta live pipeline

I'm having an issue accessing an Excel file through a DLT pipeline. The file is in ADLS, and I'm using pandas to read it. It seems pandas does not understand the abfss protocol. Is there any way to read Excel with pandas in a DLT pipeline? I'm getting thi...

Latest Reply
Yash_542965
New Contributor II
  • 3 kudos

Thanks for the info. It works; I just needed to install an additional library using "%pip install openpyxl".

  • 3 kudos
1 More Replies
mbaumga
by New Contributor III
  • 7311 Views
  • 3 replies
  • 2 kudos

Performance issues when loading an Excel file from DBFS using R

I have uploaded small Excel files to DBFS. I then use the function read_xlsx() from the "readxl" package in R to import a file into R memory. I use a standard cluster (12.1, non-ML). The function works, but it takes ages. E.g. a simple Excel tabl...

Latest Reply
Anonymous
Not applicable
  • 2 kudos

Hi @Marcel Baumgartner, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best an...

  • 2 kudos
2 More Replies
Rishabh-Pandey
by Esteemed Contributor
  • 2631 Views
  • 0 replies
  • 6 kudos

To connect Delta Lake with Microsoft Excel, you can use the Microsoft Power Query for Excel add-in. Power Query is a data connection tool that allows ...

To connect Delta Lake with Microsoft Excel, you can use the Microsoft Power Query for Excel add-in. Power Query is a data connection tool that allows you to connect to various data sources, including Delta Lake. Here's how to do it: Install the Micros...

CBull
by New Contributor III
  • 1762 Views
  • 3 replies
  • 2 kudos

Spark Notebook to import data into Excel

Is there a way to create a notebook that will take the SQL that I want to put into the Notebook and populate Excel daily and send it to a particular person?

Latest Reply
Meghala
Valued Contributor II
  • 2 kudos

@Aviral Bhardwaj thanks for this, I needed this info.

  • 2 kudos
2 More Replies
Ajay-Pandey
by Esteemed Contributor III
  • 6215 Views
  • 3 replies
  • 13 kudos

Resolved! Fetching data in excel through delta sharing

Hi all, is there any way to access or push data in Delta Sharing using Microsoft Excel?

Latest Reply
Rishabh-Pandey
Esteemed Contributor
  • 13 kudos

Hey @Ajay Pandey, yes. A new Excel feature recently came to market that lets you enable Delta Sharing from Excel, so whatever changes you make to the Delta table are automatically saved in the Excel file as well. Refer to this lin...

  • 13 kudos
2 More Replies
pkgltn
by New Contributor III
  • 2135 Views
  • 2 replies
  • 2 kudos

Resolved! Load an Excel File (located in Databricks Repo connected to Azure DevOps) into a dataframe

Hi, how can I load an Excel file (located in a Databricks Repo connected to Azure DevOps) into a DataFrame? When I pass the full path into the load method, it displays an error: java.io.FileNotFoundException. Has anyone done this before?

Latest Reply
pkgltn
New Contributor III
  • 2 kudos

Hi, I just managed to do it. I upgraded the cluster to the latest version, because Files in Repos only works in the most recent versions of the cluster. When loading the DataFrame, specify the path as follows: file:/Workspace/Repos/user@email.com/filepath/filena...

  • 2 kudos
1 More Replies
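
As a rough illustration of the path format described in the reply above, the sketch below builds a `file:/Workspace/Repos/...` URI. The helper name `repo_file_uri` and the example repo, folder, and file names are hypothetical. The reply's own path is truncated, so the exact layout beneath the user folder is an assumption.

```python
from pathlib import PurePosixPath


def repo_file_uri(user: str, *parts: str) -> str:
    """Return a local-file URI under /Workspace/Repos for the given user.

    Assumption: files in a repo live under /Workspace/Repos/<user>/<parts...>.
    """
    path = PurePosixPath("/Workspace/Repos", user, *parts)
    return f"file:{path}"


# Hypothetical repo and file names, for illustration only.
uri = repo_file_uri("user@email.com", "my-repo", "data", "input.xlsx")
print(uri)
```

The resulting string could then be passed to the reader's `load(...)` call in place of a DBFS path.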
rajat1
by New Contributor
  • 13729 Views
  • 2 replies
  • 1 kudos

How to convert a dataframe (df) to an Excel file that I can share with my colleagues?

I am working on Microsoft Azure Databricks. I have a final DataFrame of shape (3276*23) and want to share it as an Excel file. How can I do it? (I am using df.to_excel('fileOutput.xlsx', sheet_name = 'Sheet1', index = False); the command is runn...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

You could try it this way: convert the PySpark DataFrame to a pandas DataFrame, then export it to an Excel file.

  • 1 kudos
1 More Replies
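
The suggestion in the reply above can be sketched as follows. This is a minimal illustration, not the poster's code: the helper name `export_to_excel` and the output path are hypothetical, and it assumes pandas plus an Excel writer such as openpyxl are installed on the cluster. Note that `toPandas()` collects every row to the driver, which is reasonable for a table of the size mentioned in the question but not for large data.

```python
def export_to_excel(df, path: str, sheet_name: str = "Sheet1") -> str:
    """Write a DataFrame to an .xlsx file and return the path.

    Assumptions: `df` is either a PySpark DataFrame (it has `toPandas`) or
    already a pandas DataFrame, and an Excel writer such as openpyxl is
    installed.
    """
    # Collect to the driver only when given a Spark DataFrame.
    pandas_df = df.toPandas() if hasattr(df, "toPandas") else df
    pandas_df.to_excel(path, sheet_name=sheet_name, index=False)
    return path
```

A call such as `export_to_excel(df, "/tmp/fileOutput.xlsx")` would write the file on the driver node; where the file should go so that colleagues can download it depends on the workspace setup.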
LPlates
by New Contributor III
  • 10885 Views
  • 2 replies
  • 1 kudos

Resolved! How do you read an Excel spreadsheet with Databricks

My cluster has Scala 2.12. I've installed the Maven library com.crealytics:spark-excel_2.12:0.14.0. I get the error java.lang.IllegalStateException: Cannot get a STRING value from a NUMERIC cell when trying to execute the following: %python excelFileName="/mnt/dl...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Another way that may help in your case is using pandas to read the Excel file, then converting the pandas DataFrame to a PySpark DataFrame.

  • 1 kudos
1 More Replies
sreedata
by New Contributor III
  • 4623 Views
  • 4 replies
  • 10 kudos

Resolved! Date field getting changed when reading from excel file to dataframe

The date field changes while reading data from the source .xls file into the DataFrame. In the source Excel file all columns are strings, but I am not sure why the date column alone behaves differently. In the source file the date is 1/24/2022. In the DataFrame it is ...

Latest Reply
Pradeep_Namani
New Contributor III
  • 10 kudos

Hi Team, @Merca Ovnerud, I am also facing the same issue. Below is the code snippet I am using: df=spark.read.format("com.crealytics.spark.excel").option("header","true").load("/mnt/dataplatform/Tenant_PK/Results.xlsx") I have a couple of date colum...

  • 10 kudos
3 More Replies
sarvesh
by Contributor III
  • 3976 Views
  • 0 replies
  • 0 kudos

Can we read an Excel file with many sheets by their indexes?

I am trying to read an Excel file that has 3 sheets with integers as their names: sheet 1 name = 21, sheet 2 name = 24, sheet 3 name = 224. I got this data from a user, so I can't change the sheet names, but reading these with Spark is an issue. code -v...
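
One thing that may be worth trying for numeric sheet names is quoting the name in spark-excel's `dataAddress` option, which accepts addresses of the form `'My Sheet'!A1`. The sketch below only builds that string. The helper name `sheet_data_address` is hypothetical, and the thread does not confirm whether this resolves the issue described in the post.

```python
def sheet_data_address(sheet_name, start_cell: str = "A1") -> str:
    """Build a dataAddress string with the sheet name in single quotes.

    Assumption: the sheet name itself contains no single quotes.
    """
    return f"'{sheet_name}'!{start_cell}"


# Sheet names taken from the post; they are integers in the workbook.
for name in (21, 24, 224):
    print(sheet_data_address(name))

# Possible use with the reader (assumes an active `spark` session and the
# spark-excel library attached to the cluster):
#   spark.read.format("com.crealytics.spark.excel") \
#       .option("dataAddress", sheet_data_address(21)) \
#       .option("header", "true") \
#       .load(path)
```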

User16790091296
by Contributor II
  • 1689 Views
  • 1 replies
  • 0 kudos

How to read a Databricks table via Databricks api in Python?

Using Python 3, I am trying to compare an Excel (xlsx) sheet to an identical Spark table in Databricks. I want to avoid doing the comparison in Databricks, so I am looking for a way to read the Spark table via the Databricks API. Is this possible? How c...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

What is the format of the table? If it is Delta, you could use the Python bindings for the native Rust API to read the table from your Python code and compare, bypassing the metastore.

  • 0 kudos