Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

create delta table in free edition

Hritik_Moon
New Contributor
table_name = f"project.bronze.{file_name}"
spark.sql(
    f"""
    CREATE TABLE IF NOT EXISTS {table_name}
    USING DELTA
    """
)
 
What am I getting wrong?
1 ACCEPTED SOLUTION

Accepted Solutions

szymon_dybczak
Esteemed Contributor III

Hi @Hritik_Moon ,

For the JSON issue, try adding the following option to the reader. It very often resolves the issue:

.option("multiLine", True)

 


6 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @Hritik_Moon ,

What error did you get? The code itself is correct, as you can see in the screenshot below, assuming you've already created a catalog named project and a schema named bronze. One thing that is a bit odd: why do you create a schemaless table here? Is there a particular reason for doing that?

[screenshot: szymon_dybczak_0-1759834112938.png]
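For contrast, a minimal sketch of the same statement with an explicit schema (the column names here are made-up examples, not from your files):

spark.sql("""
    CREATE TABLE IF NOT EXISTS project.bronze.orders (
        order_id   BIGINT,
        order_date DATE,
        amount     DOUBLE
    )
    USING DELTA
""")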

 

 

The error says to define an absolute path and to add /dbfs before the path. When I do that, it says no permission.

Hritik_Moon
New Contributor

I will explain what I am trying to do.

I have created a catalog and schemas as follows:

project
    files
        raw_files
            orders.json
            orders.csv
    bronze

 

Notebook 1 reads the files present in raw_files, splits each name into file_name and file_format, and stores them as dicts inside a list, which I have set as a taskValue.
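Roughly, Notebook 1 looks something like this (a sketch only: it assumes raw_files is a Unity Catalog volume under the files schema, and the volume path and taskValue key are illustrative):

# Notebook 1: discover files in the raw_files volume and publish them as a taskValue
raw_path = "/Volumes/project/files/raw_files"  # assumed volume path

files = []
for f in dbutils.fs.ls(raw_path):
    # split e.g. "orders.json" into file_name="orders" and file_format="json"
    file_name, _, file_format = f.name.partition(".")
    files.append({"file_name": file_name, "file_format": file_format, "path": f.path})

# make the list available to downstream tasks in the same job run
dbutils.jobs.taskValues.set(key="files", value=files)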

Notebook 2 reads these parameters, performs basic cleaning (null handling, duplicate handling, type casting), and stores the data in bronze in Delta format.

Now the cleaned data is stored in bronze.cleaned{file_name} and the bad records in bronze.bad_records_{file_name} using

df_clean.write.format("delta").mode("append").saveAsTable(f"project.bronze.cleaned{file_name}")
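Roughly, the Notebook 2 loop looks something like this (a simplified sketch: the task name, key, and cleaning steps are illustrative):

# Notebook 2: read the taskValue from Notebook 1, clean each file, write to bronze
files = dbutils.jobs.taskValues.get(taskKey="discover_files", key="files", default=[])

for f in files:
    file_name, file_format = f["file_name"], f["file_format"]

    reader = spark.read.format(file_format)
    if file_format == "csv":
        reader = reader.option("header", True).option("inferSchema", True)
    df = reader.load(f["path"])

    # basic cleaning: drop exact duplicates and fully-null rows
    df_clean = df.dropDuplicates().dropna(how="all")

    df_clean.write.format("delta").mode("append").saveAsTable(f"project.bronze.cleaned{file_name}")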

--------------------------------------------------------------------------------------------------------------------------------------------------------

This was giving me errors, and I thought it was because no table was present (I misinterpreted the error).

The actual issue is that my JSON file is corrupted; when I try with other CSV files the job works fine.

Now I am trying to figure out what's wrong with the JSON file.

 

 

szymon_dybczak
Esteemed Contributor III

Hi @Hritik_Moon ,

For the JSON issue, try adding the following option to the reader. It very often resolves the issue:

.option("multiLine", True)

 

Hritik_Moon
New Contributor

Yes, multiLine solved it. 😀

Is there any better approach to this scenario?

szymon_dybczak
Esteemed Contributor III

If you don't expect many files, then it could be a fine approach. But if you expect to handle thousands of files, this approach won't scale, because you iterate over one file after the other.
You can check how to deal with ingestion at scale with Auto Loader (a minimal sketch follows below). But for learning purposes your scenario is good.
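For example, a minimal Auto Loader sketch (the paths and table name are illustrative, not from this thread):

# incrementally ingest new JSON files from the volume with Auto Loader
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("multiLine", True)
    .option("cloudFiles.schemaLocation", "/Volumes/project/files/_schemas/orders")
    .load("/Volumes/project/files/raw_files/")
)

(
    df.writeStream
    .option("checkpointLocation", "/Volumes/project/files/_checkpoints/orders")
    .trigger(availableNow=True)  # process everything available, then stop
    .toTable("project.bronze.orders")
)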

Anyway, if my previous answer was helpful to you, please consider marking it as a solution. That way we help community members find answers to similar questions faster.
