Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Float values change when loading a table with Spark? Full path?

Hola1801
New Contributor

Hello,

I have created my table in Databricks, and at this point everything is perfect: I get the same values as in my CSV.

For my column "Exposure" I have:

0         0,00
1         0,00
2         0,00
3         0,00
4         0,00
...

But when I load my table with Spark, the Exposure column contains different values:

0        14 032,24
1        14 032,24
2         8 061,94
3         8 061,94
4        15 506,37

I use this code to load my table:

import pandas as pd

# Read the Databricks table into a pandas DataFrame
df = spark.table("imos_prior").toPandas()
df['Exposure']

Do you know how I could specify the full path of my table?

My table is located by default in default/imos_prior
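For reference, a table in the default database can be addressed with its database-qualified name. A minimal sketch, assuming a Databricks notebook where `spark` is already defined and the table name from this post:

```python
# Qualify the table name with its database (schema); the default database
# is "default", so "default/imos_prior" is addressed as "default.imos_prior".
df = spark.table("default.imos_prior").toPandas()

# Equivalently, via SQL:
df2 = spark.sql("SELECT Exposure FROM default.imos_prior").toPandas()
```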

Or do you have any idea why the values could have changed? I hope it is just a question of a different file being used by Spark.

Thank you

1 ACCEPTED SOLUTION


-werners-
Esteemed Contributor III

Do you happen to know the path where the actual data resides?

Tables in Databricks are not the actual data; they are more like a view on top of the underlying files (Parquet, CSV, etc.).
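Building on that point, the storage location behind a table can be inspected from a notebook. A sketch, assuming a Databricks notebook with `spark` defined and the table name from this thread:

```python
# DESCRIBE TABLE EXTENDED includes a "Location" row with the storage path
# of the files that back the table:
spark.sql("DESCRIBE TABLE EXTENDED default.imos_prior").show(truncate=False)

# For Delta tables, DESCRIBE DETAIL returns the location as a column:
detail = spark.sql("DESCRIBE DETAIL default.imos_prior")
print(detail.select("location").first()[0])
```

Comparing that location with the path of the original CSV would show whether Spark is reading a different file than expected.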


3 REPLIES

Anonymous
Not applicable

Hello @Anis Ben Salem​ - My name is Piper and I'm a moderator for Databricks. Welcome and thank you for posting your question. Let's give it a bit longer for other members to respond. If we don't hear anything, we'll circle back around.


jose_gonzalez
Databricks Employee

Hi @Anis Ben Salem​,

How do you read your CSV file? Do you use the pandas or PySpark APIs? Also, how did you create your table?

Could you share more details about the code you are trying to run?
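One detail worth checking when sharing that code: the sample values in this thread ("0,00", "14 032,24") look like European-formatted numbers, with a comma as the decimal separator and a space as the thousands separator. If such a column is read as text, it must be converted explicitly. A minimal pure-Python sketch (the helper name is illustrative, not from the original post):

```python
def parse_european(value: str) -> float:
    """Convert a European-formatted number string (space as thousands
    separator, comma as decimal separator) to a float."""
    # Normalize non-breaking spaces, drop thousands separators,
    # then swap the decimal comma for a dot.
    cleaned = value.replace("\u00a0", " ").replace(" ", "").replace(",", ".")
    return float(cleaned)

print(parse_european("14 032,24"))  # 14032.24
print(parse_european("0,00"))       # 0.0
```

With pandas, the same conversion can often be done at read time via `pd.read_csv(path, decimal=",", thousands=" ")`.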
