I'm trying to extract some data into Databricks but keep tripping over openpyxl (newish Databricks user).
from openpyxl import load_workbook
directory_id="hidden"
scope="hidden"
client_id="hidden"
service_credential_key="hidden"
container_name="hidden"
storage_account="hidden"
application_id = dbutils.secrets.get(scope=scope, key=client_id)
service_credential = dbutils.secrets.get(scope=scope, key=service_credential_key)
adls_path = f"abfss://{container_name}@{storage_account}.dfs.core.windows.net/"
spark.sql("set spark.databricks.delta.optimizeWrite.enabled = True")
project = "hidden/hidden/"
path = f"abfss://{container_name}@{storage_account}.dfs.core.windows.net/"
raw_path = f"{path}raw/{project}"
raw_schema_path = f"{raw_path}/schema/"
bronze_path = f"{path}bronze/{project}"
silver_path = f"{path}silver/{project}"
gold_path = f"{path}gold/{project}"
temp_path = f"{path}{project}"
print(temp_path)
dbutils.fs.ls(temp_path)
user = load_workbook(f"{temp_path}staff.xlsx")
This throws:
FileNotFoundError: [Errno 2] No such file or directory: 'hidden'
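My current guess is that openpyxl just hands whatever string it's given to Python's built-in open(), so an abfss:// URI gets treated as an ordinary (nonexistent) local relative path, which would explain the FileNotFoundError. A quick check outside Databricks reproduces the same error — the URI below is a made-up placeholder, not my real one:

```python
# Python's open() treats the abfss:// URI as a plain local filename,
# which is exactly the FileNotFoundError that openpyxl surfaces.
uri = "abfss://container@account.dfs.core.windows.net/raw/staff.xlsx"
try:
    open(uri, "rb")
except FileNotFoundError as err:
    print(type(err).__name__)  # FileNotFoundError
```

So (if my guess is right) the error isn't about the storage account at all — openpyxl never reaches ADLS.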
The file definitely exists and the path is correct, but I'm not understanding why it's not being picked up. If I convert it to CSV and use
user = spark.read.format("csv").options(**{"sep":",","header":True,"inferSchema":True}).load(f"{temp_path}staff.csv")
it works fine, but I'm trying to avoid the extra hop. Working with Excel seems a bit painful, though. Fairly new on Databricks, so I might be missing something obvious.
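For context, the workaround I'm leaning towards instead of the CSV hop is copying the blob down to the driver's local disk first and then pointing openpyxl at the local copy — on Databricks presumably something like dbutils.fs.cp(f"{temp_path}staff.xlsx", "file:/tmp/staff.xlsx") followed by load_workbook("/tmp/staff.xlsx"). The sketch below is a runnable stand-in for that pattern: shutil.copy plays the role of dbutils.fs.cp, and every path and file is made up for illustration.

```python
import os
import shutil
import tempfile

# Stand-in demo of "copy to local disk, then open locally".
# remote_dir plays the role of the ADLS container, local_dir the driver's
# /tmp, and shutil.copy the role of dbutils.fs.cp (all names hypothetical).
remote_dir = tempfile.mkdtemp()
local_dir = tempfile.mkdtemp()

src = os.path.join(remote_dir, "staff.xlsx")
with open(src, "wb") as f:
    f.write(b"fake workbook bytes")  # placeholder content, not a real .xlsx

dst = os.path.join(local_dir, "staff.xlsx")
shutil.copy(src, dst)  # on Databricks: dbutils.fs.cp(remote_uri, "file:/tmp/...")

print(os.path.exists(dst))  # True — now a local-path library could open dst
```

Is something like this the idiomatic way to feed Excel files to openpyxl on Databricks, or is there a cleaner route I'm missing?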