Data Engineering

Write DataFrame into Azure Data Lake Storage

juan_perez
New Contributor

I am manipulating some data using Azure Databricks. The data lives in Azure Data Lake Storage Gen1. I mounted it into DBFS, but now, after transforming the data, I would like to write it back into my data lake.

To mount the data I used the following:

configs = {
    "dfs.adls.oauth2.access.token.provider.type": "ClientCredential",
    "dfs.adls.oauth2.client.id": "<your-service-client-id>",
    "dfs.adls.oauth2.credential": "<your-service-credentials>",
    "dfs.adls.oauth2.refresh.url": "https://login.microsoftonline.com/<your-directory-id>/oauth2/token"
}

dbutils.fs.mount(
    source = "adl://<your-data-lake-store-account-name>.azuredatalakestore.net/<your-directory-name>",
    mount_point = "/mnt/<mount-name>",
    extra_configs = configs
)

I want to write back a .csv file. For this task I am using the following line:

dfGPS.write.mode("overwrite") \
    .format("com.databricks.spark.csv") \
    .option("header", "true") \
    .csv("adl://<your-data-lake-store-account-name>.azuredatalakestore.net/<your-directory-name>")

However, I get the following error:

IllegalArgumentException: u'No value for dfs.adls.oauth2.access.token.provider found in conf file.'

Any piece of code or suggestion that can help me? Or a link that walks me through it?

Thanks.

2 REPLIES

AndrewSears
New Contributor III

Hi there,

Did you try writing to your mount point location?

dfGPS.write.mode("overwrite").format("com.databricks.spark.csv").option("header", "true").csv("/mnt/<mount-name>")

There is a related post here to configure the appropriate Hadoop properties if you want to use ADL syntax.

https://forums.databricks.com/questions/13205/azure-data-lake-config-issue-no-value-for-dfsadlso.htm...
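If you do want to keep the adl:// syntax, the gist of that post is that the OAuth properties you passed as extra_configs only apply to the mount, not to direct adl:// reads and writes. Setting the same properties on the session's Spark conf should clear that error — a sketch using the same placeholder credentials from your mount code (fill in your own values):

```python
# Same ADLS Gen1 OAuth properties as the mount's extra_configs, but set
# on the Spark configuration so direct adl:// paths can authenticate too.
spark.conf.set("dfs.adls.oauth2.access.token.provider.type", "ClientCredential")
spark.conf.set("dfs.adls.oauth2.client.id", "<your-service-client-id>")
spark.conf.set("dfs.adls.oauth2.credential", "<your-service-credentials>")
spark.conf.set("dfs.adls.oauth2.refresh.url", "https://login.microsoftonline.com/<your-directory-id>/oauth2/token")
```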

cheers,

Andrew

PawanShukla
New Contributor III

I am new to Azure Databricks, and I am trying to write the DataFrame to a mounted ADLS file with the command below:

dfGPS.write.mode("overwrite").format("com.databricks.spark.csv").option("header","true").csv("/mnt/<mount-name>")
