
How can we get AutoLoader to detect a file footer?

BF7
Contributor

We are dealing with CSVs that contain footers. When a file has no data rows, the footer seems to impair AutoLoader's schema inference.

I know there is a header = true parameter, but I don't see a footer parameter in the documentation.

Has anyone found a good way to deal with the presence of a FOOTER in CSV files being detected by AutoLoader?

1 ACCEPTED SOLUTION

BigRoux
Databricks Employee

Then no, there is no option in the Spark API to handle that. You would have to do some custom coding. Hope this helps, Louis.


3 REPLIES

BigRoux
Databricks Employee

To be clear, when you say footer, are you referring to the last row of the file? E.g., header = row 1, footer = last row.

If the CSV were read correctly into a DataFrame whose rows are tuples, then yes, the footer would be the last row.

BigRoux
Databricks Employee

Then no, there is no option in the Spark API to handle that. You would have to do some custom coding. Hope this helps, Louis.
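
For anyone wondering what that custom coding might look like, here is a minimal sketch in plain Python. It is not part of AutoLoader or the Spark API. It assumes the footer is exactly the last non-empty line of each file and that no quoted field spans multiple lines. The names strip_footer and read_rows and the TRAILER sample data are hypothetical, made up for illustration. One possible use is as a preprocessing step that rewrites files before AutoLoader picks them up.

```python
import csv
import io
from typing import Dict, List


def strip_footer(text: str, footer_lines: int = 1) -> str:
    """Return CSV text with the last `footer_lines` non-empty lines removed.

    Assumption: the footer is always the trailing non-empty line(s) and
    no quoted field contains embedded newlines.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if footer_lines > 0:
        lines = lines[:-footer_lines]
    return "\n".join(lines)


def read_rows(text: str) -> List[Dict[str, str]]:
    """Parse the footer-free CSV text, treating the first line as the header."""
    return list(csv.DictReader(io.StringIO(strip_footer(text))))


# Hypothetical sample: header, two data rows, then a footer with a row count.
sample = "id,name\n1,alpha\n2,beta\nTRAILER,2\n"
rows = read_rows(sample)

# A file with only a header and a footer yields no data rows.
empty = "id,name\nTRAILER,0\n"
empty_rows = read_rows(empty)
```

Because the footer is dropped before parsing, a file that holds only a header and a footer reduces to the header line alone, so the footer never reaches whatever infers the schema downstream.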
