Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

create delta table in free edition

Hritik_Moon
New Contributor
table_name = f"project.bronze.{file_name}"
spark.sql(
    f"""
    CREATE TABLE IF NOT EXISTS {table_name}
    USING DELTA
    """
)
 
What am I getting wrong?
1 ACCEPTED SOLUTION

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

Hi @Hritik_Moon ,

For the JSON issue, try adding the following option to the reader. It very often resolves the issue:

.option("multiLine", True)

 


6 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @Hritik_Moon ,

What error did you get? The code itself is correct, as you can see in the screenshot below, assuming you've already created a catalog named project and a schema named bronze. One thing that is a bit odd: why do you create a schemaless table here? Is there a particular reason for doing that?

[screenshot: szymon_dybczak_0-1759834112938.png]
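For contrast, a minimal sketch of the same statement with an explicit schema (the column names here are made-up examples, not from your files):

spark.sql("""
    CREATE TABLE IF NOT EXISTS project.bronze.orders (
        order_id   BIGINT,
        order_date DATE,
        amount     DOUBLE
    )
    USING DELTA
""")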

 

 

The error says to define an absolute path and to add /dbfs before the path. When I do that, it says no permission.

Hritik_Moon
New Contributor

I will explain what I am trying to do.

I have created a catalog and schemas as follows:

project
    files
        raw_files
            orders.json
            orders.csv
    bronze

 

Notebook 1 reads the files present in raw_files, splits each name into file_name and file_format, and stores them as dicts inside a list, which I have set as a taskValue.
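Roughly, Notebook 1 looks something like this (a sketch only: it assumes raw_files is a Unity Catalog volume under the files schema, and the volume path and taskValue key are illustrative):

# Notebook 1: discover files in the raw_files volume and publish them as a taskValue
raw_path = "/Volumes/project/files/raw_files"  # assumed volume path

files = []
for f in dbutils.fs.ls(raw_path):
    # split e.g. "orders.json" into file_name="orders" and file_format="json"
    file_name, _, file_format = f.name.partition(".")
    files.append({"file_name": file_name, "file_format": file_format, "path": f.path})

# make the list available to downstream tasks in the same job run
dbutils.jobs.taskValues.set(key="files", value=files)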

Notebook 2 reads these parameters, performs basic cleaning (null handling, duplicate handling, type casting), and stores the data in bronze in Delta format.

Now the cleaned data is stored in bronze.cleaned{file_name} and the bad records in bronze.bad_records_{file_name} using

df_clean.write.format("delta").mode("append").saveAsTable(f"project.bronze.cleaned{file_name}")
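Roughly, the Notebook 2 loop looks something like this (a simplified sketch: the task name, key, and cleaning steps are illustrative):

# Notebook 2: read the taskValue from Notebook 1, clean each file, write to bronze
files = dbutils.jobs.taskValues.get(taskKey="discover_files", key="files", default=[])

for f in files:
    file_name, file_format = f["file_name"], f["file_format"]

    reader = spark.read.format(file_format)
    if file_format == "csv":
        reader = reader.option("header", True).option("inferSchema", True)
    df = reader.load(f["path"])

    # basic cleaning: drop exact duplicates and fully-null rows
    df_clean = df.dropDuplicates().dropna(how="all")

    df_clean.write.format("delta").mode("append").saveAsTable(f"project.bronze.cleaned{file_name}")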

--------------------------------------------------------------------------------------------------------------------------------------------------------

This was giving me errors, and I thought it was because no table was present (I misinterpreted the error).

The actual issue is that my JSON file is corrupted; when I try with other CSV files the job works fine.

Now I am trying to figure out what's wrong with the JSON file.

 

 

szymon_dybczak
Esteemed Contributor III

Hi @Hritik_Moon ,

For the JSON issue, try adding the following option to the reader. It very often resolves the issue:

.option("multiLine", True)

 

Hritik_Moon
New Contributor

Yes, multiLine solved it. 😀

Is there any better approach to this scenario?

szymon_dybczak
Esteemed Contributor III

If you don't expect many files, then it could be a fine approach. But if you expect to handle thousands of files, this approach won't scale, because you iterate over one file after the other.
You can check how to deal with ingestion at scale with Auto Loader (a minimal sketch follows below). But for learning purposes your scenario is good.
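For example, a minimal Auto Loader sketch (the paths and table name are illustrative, not from this thread):

# incrementally ingest new JSON files from the volume with Auto Loader
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("multiLine", True)
    .option("cloudFiles.schemaLocation", "/Volumes/project/files/_schemas/orders")
    .load("/Volumes/project/files/raw_files/")
)

(
    df.writeStream
    .option("checkpointLocation", "/Volumes/project/files/_checkpoints/orders")
    .trigger(availableNow=True)  # process everything available, then stop
    .toTable("project.bronze.orders")
)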

Anyway, if my previous answer was helpful to you, please consider marking it as a solution. That way we help community members find answers to similar questions faster.
