cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

How to load xlsx Files to Delta Live Tables (DLT)?

avrm91
Contributor

I want to load a .xlsx file to DLT but struggling as it is not available with Autoloader.


With the Assistant I tried to load the .xlsx first to a data frame and then send it to DLT.

 

 

import dlt
from pyspark.sql import SparkSession

# Load xlsx file into DataFrame
df = spark.read \
    .format("com.crealytics.spark.excel") \
    .option("dataAddress", "${my_etl.sheet_address}") \
    .option("header", "${my_etl.header}") \
    .option("inferSchema", "${my_etl.infer_schema}") \
    .option("timestampFormat", "${my_etl.timestamp_format}") \
    .load("${my_etl.input_path}")

@dlt.table
def conformed():
    return df

 

 

but it is not working.

4 REPLIES 4

shan_chandra
Databricks Employee
Databricks Employee

@avrm91  - can try dividing xlsx files into a csv as a preprocessing step and ingest them in to a dataframe using Autoloader. Also, you can use openpyxl to load into a dataframe. refer to this doc for example.

 

For now, I will use the Azure Data Factory to convert the xlsx to csv and then process it within Databricks. 
Overall, I wonder if there will be a xlsx Autoloader implementation in the future, as the preprocessing will not have the Autoloader features and I will need to take care what was already loaded and what not.
All over I would not say that it is a solution so I will let this ticket open.

shan_chandra
Databricks Employee
Databricks Employee

@avrm91 - There is a feature request in place. DB engg will priorotized. As of now no ETA yet on this. 

avrm91
Contributor

Added a feature request into Azure Community Portal
XLSX - DLT Autoloader · Community (azure.com)

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group