
Importing a large csv file into databricks free

Mamdouh_Dabjan
New Contributor III

Basically, I have a large CSV file that does not fit in a single worksheet; I can only use it in Power Query. I am trying to import this file into my Databricks notebook. I imported it and created a table from it, but when I viewed the table, it was full of random symbols instead of the data I imported. Is there a way to convert these symbols back into my data?

The symbols look something like this:

6��@#W&���9�`�ϻ��U1�ѵL�T���E)�N�9;��l01H�O���>�4Q+(�2�wiɆ�������%?-2��7��A�ze�C��H��r+�;�>���(�2����~Y���D����[�2g�����eϢ��ԯy�ir#��~�

6 REPLIES

weldermartins
Honored Contributor

Hello, if you manually open one of the parts of the CSV file, does it look any different?

I don't really understand what you mean. If you mean opening the CSV in a worksheet, it does not fit because the data has over 1 million rows.

If you open the CSV, it will show a message saying that not all lines can be read, but you can still preview the file. Please post more information than you have so far.

It does say what you describe. But what can I do to import this CSV into Databricks? As I said before, I uploaded the data and created a table from it, but it displays these random symbols instead of the data I imported.
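Symbols like these often mean the uploaded file is not plain text in the encoding the table reader expects, for example because it is compressed, is really an Excel workbook, or is UTF-16 text. That is only a guess for this case. A minimal sketch for checking it is to look at the first bytes of the file; `sniff_file_kind` and the labels it returns are hypothetical names made up for this example, and the list of signatures is far from complete.

```python
import codecs

# Well-known leading byte signatures (a small, incomplete selection).
MAGIC_PREFIXES = [
    (b"PK\x03\x04", "zip archive (also .xlsx)"),
    (b"\x1f\x8b", "gzip"),
    (b"\xef\xbb\xbf", "text, UTF-8 with BOM"),
    (b"\xff\xfe", "text, UTF-16 little-endian"),
    (b"\xfe\xff", "text, UTF-16 big-endian"),
]


def sniff_file_kind(head: bytes) -> str:
    """Guess the kind of file from its first bytes (hypothetical helper)."""
    for prefix, label in MAGIC_PREFIXES:
        if head.startswith(prefix):
            return label
    # No known signature: check whether the bytes are valid UTF-8.
    # final=False tolerates a multi-byte character cut off at the end of head.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return "unknown (binary or non-UTF-8 text)"
    return "text, UTF-8 or ASCII"
```

To use it, open the uploaded file in binary mode (`open(path, "rb")`, where `path` is wherever your copy of the file lives) and pass the first 64 bytes or so to the function. If the result is not plain text, the file probably needs to be unpacked, re-exported as CSV, or read with an explicit encoding before creating the table.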

You can use PySpark to remove the unwanted Unicode characters.

This example removes the null character (U+0000). You will have to find the characters in your own data and adapt the pattern, or search for a solution on Google.

from pyspark.sql.functions import regexp_replace

# Replace the null character (U+0000) with an empty string in each column
null = u'\u0000'
dfCnae = df\
    .withColumn('id', regexp_replace(df['id'], null, ''))\
    .withColumn('description', regexp_replace(df['description'], null, ''))
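If more than one kind of stray character shows up, it can help to try a broader pattern in plain Python before putting it into `regexp_replace`. The sketch below is an assumption about what "unwanted" means: it strips ASCII control characters but keeps tab, newline and carriage return. `CONTROL_CHARS` and `strip_control_chars` are hypothetical names for this example.

```python
import re

# ASCII control characters, excluding tab (\x09), newline (\x0A)
# and carriage return (\x0D), plus DEL (\x7F).
CONTROL_CHARS = r"[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]"


def strip_control_chars(text: str) -> str:
    """Remove the control characters matched by CONTROL_CHARS."""
    return re.sub(CONTROL_CHARS, "", text)
```

Once the pattern behaves as expected on a few sample values, the same pattern string could be passed to `regexp_replace` in place of the single null character. Note that this only cleans up stray characters inside otherwise readable text; it will not recover data if the whole file was read in the wrong format or encoding.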

Thanks
