Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

THIAM_HUATTAN
by Valued Contributor
  • 44577 Views
  • 8 replies
  • 2 kudos

Skip number of rows when reading CSV files

staticDataFrame = (spark.read.format("csv")
    .option("header", "true")
    .option("inferSchema", "true")
    .load("/FileStore/tables/Consumption_2019/*.csv"))

With the above, I need an option to skip, say, the first 4 lines of each CSV file. How do I do that?

Latest Reply
Michael_Appiah
Contributor
  • 2 kudos

The option .option("skipRows", <number of rows to skip>) works for me as well. However, I am surprised that the official Spark doc does not list it as a CSV Data Source Option: https://spark.apache.org/docs/latest/sql-data-sources-csv.html#data...

7 More Replies
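Since skipRows does not appear in the official CSV option list, a portable way to get the same effect is to drop the first N lines of each file before parsing. The sketch below shows the idea with plain Python on an in-memory sample (the SKIP count and sample data are made up for illustration); in Spark, the same line-skipping can be done by reading the files as text and filtering before CSV parsing.

```python
import csv
import io
import itertools

SKIP = 4  # hypothetical number of preamble lines to drop

# In-memory stand-in for one CSV file with 4 junk lines before the header.
sample = "\n".join(
    ["junk 1", "junk 2", "junk 3", "junk 4",
     "id,value", "1,10", "2,20"]
)

buf = io.StringIO(sample)
# islice skips the first SKIP lines; the CSV reader then sees the real header.
rows = list(csv.DictReader(itertools.islice(buf, SKIP, None)))
print(rows)  # [{'id': '1', 'value': '10'}, {'id': '2', 'value': '20'}]
```

The same filter-then-parse shape applies in Spark: read with spark.read.text, drop the leading lines per file, and parse the remainder as CSV.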
shamly
by New Contributor III
  • 3885 Views
  • 3 replies
  • 2 kudos

How to remove extra ENTER line in csv UTF-16 while reading

Dear Friends, I have a csv and it looks like this:
‡‡Id‡‡,‡‡Version‡‡,‡‡Questionnaire‡‡,‡‡Date‡‡
‡‡123456‡‡,‡‡Version2‡‡,‡‡All questions have been answered accurately and the guidance in the questionnaire was understood and followed‡‡,‡‡2010-12-16 00:01:...

Latest Reply
Aviral-Bhardwaj
Esteemed Contributor III
  • 2 kudos

This is working fine:

from pyspark.sql.functions import regexp_replace

path = "dbfs:/FileStore/df/test.csv"
dff = spark.read.option("header", "true").option("inferSchema", "true").option('multiline', 'true').option('encoding', 'UTF-8').option("delimi...

2 More Replies
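The accepted approach (read with an explicit encoding, then strip the stray characters with regexp_replace) can be sketched outside Spark with Python's re module. The double-dagger "quotes" and the extra blank line below mimic the data from the question and are illustrative only.

```python
import re

# Sample mimicking the question: fields wrapped in double daggers,
# plus a spurious blank line between records.
raw = "\u2021\u2021Id\u2021\u2021,\u2021\u2021Version\u2021\u2021\n\n\u2021\u2021123456\u2021\u2021,\u2021\u2021Version2\u2021\u2021\n"

# Strip the double-dagger quoting, then collapse runs of newlines so the
# extra ENTER lines disappear.
cleaned = re.sub(r"\u2021\u2021", "", raw)
cleaned = re.sub(r"\n+", "\n", cleaned)
print(cleaned)  # Id,Version\n123456,Version2\n
```

In Spark the equivalent is applying regexp_replace with the same patterns to each column (or to the raw text) after reading with the correct encoding option.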
Raagavi
by New Contributor
  • 2503 Views
  • 1 reply
  • 1 kudos

Is there a way to read the CSV files automatically from on-premises network locations and write back to the same from Databricks?


Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi @Raagavi Rajagopal​, you can access files on mounted object storage (just an example); please refer to https://docs.databricks.com/files/index.html#access-files-on-mounted-object-storage. And in DBFS, CSV files can be read and write fr...

Shay
by New Contributor III
  • 7046 Views
  • 8 replies
  • 6 kudos

Resolved! How do you Upload TXT and CSV files into Shared Workspace in Databricks?

I try to upload the needed files under the right directory of the project to work. The files are zipped first, as that is an accepted format. I have a Python project which requires the TXT and CSV format files, as they are called and used via .py files ...

Latest Reply
-werners-
Esteemed Contributor III
  • 6 kudos

@Shay Alam​, can you share the code with which you read the files? Apparently python interprets the file format as a language, so it seems like some options are not filled in correctly.

7 More Replies
lprevost
by Contributor
  • 2576 Views
  • 1 reply
  • 1 kudos

Resolved! Schema inference for CSV picks up \r carriage returns

I'm using:

frame = spark.read.csv(path=bucket + folder, inferSchema=True, header=True, multiLine=True)

to read in a series of CSV ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

Files saved on the Windows operating system end every line with a carriage return and a line feed. Please add the following option; it can help: .option("ignoreTrailingWhiteSpace", true)

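A minimal illustration of what goes wrong, using plain Python on a made-up Windows-style sample: when lines are split on \n alone, each record keeps a trailing \r, which sticks to the last column and corrupts the inferred type. Stripping that trailing whitespace, which is what Spark's ignoreTrailingWhiteSpace option does, restores clean values.

```python
# Windows line endings are "\r\n"; splitting on "\n" alone leaves "\r"
# attached to the last field of every record.
raw = "id,score\r\n1,3.5\r\n2,4.0\r\n"

rows = [
    [field.rstrip("\r") for field in line.split(",")]  # drop the stray \r
    for line in raw.split("\n")
    if line  # skip the empty trailing piece after the final newline
]
print(rows)  # [['id', 'score'], ['1', '3.5'], ['2', '4.0']]
```

Without the rstrip, the last column would contain "score\r", "3.5\r", and "4.0\r", so a schema-inference pass would see strings instead of doubles.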