Showing only a limited number of lines from the CSV file
07-24-2024 02:14 AM - last edited on 07-24-2024 03:48 AM by Retired_mod
Expected number of lines: 16400
Only 20 records are shown.
Script
07-24-2024 02:45 AM
Hi, the show() method prints only the top 20 rows by default:
DataFrame.show(n: int = 20, truncate: Union[bool, int] = True, vertical: bool = False)
(cf. https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.show...)
You can either use show() with a bigger n parameter, or use the Databricks display() command to print the dataframe in a tabular format:
df = spark.read.format("CSV").option("inferSchema", "true").option("header", "true").option("delimiter", ",").load(file_location)
display(df)
https://www.databricks.com/spark/getting-started-with-apache-spark/dataframes#view-the-dataframe
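For the first option (calling show() with a larger n), here is a minimal sketch. The helper name show_rows and its max_rows parameter are made up for illustration and are not part of PySpark; it only assumes that df is a PySpark DataFrame such as the one loaded above.

```python
def show_rows(df, max_rows=None, truncate=False):
    """Print up to max_rows rows of a PySpark DataFrame (all rows if max_rows is None).

    Hypothetical helper, not a PySpark API. Returns the total row count so it
    can be compared with the number of lines expected in the CSV file.
    """
    # count() scans the whole DataFrame; it is not limited by what show() prints.
    total = df.count()
    # show() defaults to n=20, so pass n explicitly to lift that limit.
    n = total if max_rows is None else min(total, max_rows)
    df.show(n=n, truncate=truncate)
    return total
```

Calling show_rows(df) should print every row and return the count, which can then be checked against the expected 16400. Printing that many rows to the notebook output can be slow, so display(df) or a smaller max_rows is probably the more practical choice for day-to-day inspection.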
07-24-2024 03:16 AM - edited 07-24-2024 04:14 AM
07-24-2024 03:56 AM
Hi @Yyyyy ,
You should edit your question and redact the key you're setting in the Spark session.

