Copy Into command to copy into delta table with predefined schema and csv file has no headers

DataInsight
New Contributor II

How do I use the COPY INTO command to load 200+ tables, each with 50+ columns, into Delta Lake tables with predefined schemas? I am looking for a more generic approach that can be handled in PySpark code.

I am aware that we can pass column expressions into the SELECT clause, but writing out the column names in the SELECT clause for every table seems like a very tedious task.

Any help with this is really appreciated.
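One generic approach (a minimal sketch; the table name, path, and column names below are placeholders, not from the thread) is to generate the COPY INTO statement in Python rather than hand-writing the SELECT list. When a CSV has no header, Spark exposes its columns as _c0, _c1, ..., so the predefined target column names can be applied positionally:

```python
def build_copy_into(table, source_path, columns):
    """Build one COPY INTO statement for a headerless CSV source.

    Headerless CSV columns surface as _c0, _c1, ..., so each one is
    aliased positionally onto the target table's predefined column names.
    All identifiers here are hypothetical examples.
    """
    select_list = ", ".join(f"_c{i} AS {col}" for i, col in enumerate(columns))
    return (
        f"COPY INTO {table}\n"
        f"FROM (SELECT {select_list} FROM '{source_path}')\n"
        f"FILEFORMAT = CSV\n"
        f"FORMAT_OPTIONS ('header' = 'false', 'delimiter' = ',')"
    )

stmt = build_copy_into(
    "sales.orders", "s3://my-bucket/orders", ["order_id", "customer_id", "amount"]
)
# On Databricks, the generated statement would be executed with spark.sql(stmt).
```

Because the SELECT list is derived from the target table's column list, the same function covers all 200+ tables without per-table SQL.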

1 REPLY

Lakshay
Esteemed Contributor

Does your source data have the same number of columns as your target Delta tables? In that case, you can do it this way:
COPY INTO my_pipe_data
  FROM 's3://my-bucket/pipeData'
  FILEFORMAT = CSV
  FORMAT_OPTIONS (
    'mergeSchema' = 'true',
    'delimiter' = '|',
    'header' = 'true'
  )
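To scale this pattern to 200+ tables, one option (a sketch under assumed names; the registry contents and paths are hypothetical) is to keep a small mapping of target table to source path and ordered column list, and generate one COPY INTO per entry. Since the asker's CSVs have no headers, this variant uses 'header' = 'false' with positional aliases instead of mergeSchema:

```python
# Hypothetical registry: target table -> (source path, ordered target columns).
TABLE_REGISTRY = {
    "sales.orders": ("s3://my-bucket/orders", ["order_id", "customer_id", "amount"]),
    "sales.customers": ("s3://my-bucket/customers", ["customer_id", "name"]),
}

def build_all_copy_statements(registry):
    """Yield one COPY INTO statement per registered table.

    Headerless CSV columns appear as _c0, _c1, ..., so each is aliased
    positionally onto the target table's predefined column names.
    """
    for table, (path, columns) in registry.items():
        select_list = ", ".join(f"_c{i} AS {c}" for i, c in enumerate(columns))
        yield (
            f"COPY INTO {table} "
            f"FROM (SELECT {select_list} FROM '{path}') "
            f"FILEFORMAT = CSV "
            f"FORMAT_OPTIONS ('header' = 'false')"
        )

statements = list(build_all_copy_statements(TABLE_REGISTRY))
# On Databricks you would submit each one, e.g.:
# for stmt in statements:
#     spark.sql(stmt)
```

Keeping the registry in a config table or file means new tables only need a registry entry, not new code.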
