Show Existing Header From CSV in External Table

Frantz
New Contributor III

Hello, is there a way to load csv data into an external table without the _c0, _c1 columns showing?

Accepted Solutions

Frantz
New Contributor III

My question was answered in a separate thread here.

4 REPLIES

Frantz
New Contributor III

The Databricks community discussion post creation sucks. I've been trying for the past 20 minutes to post a question, but I keep getting a "Correct Highlighted errors" message. When I correct the "errors", the message still does not go through. I've resorted to posting test messages just to see what goes through.

Hi @Frantz, I'm sorry you're experiencing difficulties posting on the Databricks community discussion. Hang in there, and hopefully, we can get this sorted out soon so you can participate fully in the community discussions.

Kaniz_Fatma
Community Manager

Hi @Frantz! When you load CSV data with PySpark, setting the header option to true makes Spark use the first row of the CSV file as the column names instead of the default generated names (_c0, _c1, etc.). This can be done as follows:

# Assuming you have a CSV file named "file.csv"
dff = spark.read.format("csv") \
   .option("delimiter", ",") \
   .option("header", "true") \
   .option("inferSchema", "true") \
   .load("file.csv")

This snippet sets the header option to "true" so that Spark treats the first row of the CSV file as column names. Setting inferSchema to "true" makes Spark infer each column's data type from the data, at the cost of an extra pass over the file; without it, every column is read as a string. As a result, the DataFrame dff carries the column names from the CSV file rather than the generic _c0, _c1, etc.
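The snippet above reads the file into a DataFrame, while the original question asked about an external table. One possible way to carry the same header setting over is to declare it in the table's OPTIONS clause. The sketch below only builds the SQL statement; the helper name build_external_csv_table_ddl, the table name, and the path are all hypothetical placeholders, and whether the statement is accepted as-is may depend on your workspace setup (for example, catalog and storage permissions).

```python
def build_external_csv_table_ddl(table_name: str, location: str, delimiter: str = ",") -> str:
    """Build a CREATE TABLE statement for an external CSV table.

    Illustrative helper, not part of any Databricks or Spark API.
    'header' = 'true' asks Spark to take column names from the first row
    of the CSV files instead of generating _c0, _c1, ...
    """
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name}\n"
        "USING CSV\n"
        f"OPTIONS ('header' = 'true', 'inferSchema' = 'true', 'delimiter' = '{delimiter}')\n"
        f"LOCATION '{location}'"
    )


# Hypothetical table name and storage path; replace with your own.
ddl = build_external_csv_table_ddl("my_schema.my_csv_table", "/mnt/data/csv_folder/")
print(ddl)

# In a notebook or job where `spark` is already defined, the statement
# could then be executed with: spark.sql(ddl)
```

Because the header option is stored with the table definition, queries against the table should show the CSV's own column names, assuming every file under the location starts with the same header row.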

 
