Data Engineering

DBFS init script migration

Ameshj
New Contributor II

I need help migrating init scripts from DBFS to workspace files on Databricks. I am new to Databricks and am struggling with what is in the links provided.

My workspace.yml also has DBFS paths hard-coded. Included is a full deployment with Great Expectations.

This was done by an external vendor.
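Since the post mentions a workspace.yml with DBFS hard-coded, one mechanical part of the migration can be sketched as a path rewrite over the config text. This is only a sketch: the dbfs:/databricks/scripts source prefix and the /Workspace/Shared/scripts target are assumptions, so substitute the paths actually used in the vendor's deployment.

```python
# Assumed prefixes; replace with the real ones from your workspace.yml.
DBFS_PREFIX = "dbfs:/databricks/scripts"
WORKSPACE_PREFIX = "/Workspace/Shared/scripts"

def migrate_paths(yaml_text: str) -> str:
    # Replace hard-coded DBFS script paths with workspace equivalents.
    return yaml_text.replace(DBFS_PREFIX, WORKSPACE_PREFIX)

config = "init_script: dbfs:/databricks/scripts/pyodbc-install.sh\n"
print(migrate_paths(config))
```

A plain string replace keeps the rest of the YAML untouched, which is usually safer than re-serializing the whole file.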

9 REPLIES

NandiniN
Valued Contributor III

Hi @Ameshj , 

I do not see any attachments or links in this post.

However, I will add a document which I feel should help you: https://docs.databricks.com/en/_extras/documents/aws-init-workspace-files.pdf

If you can specify the exact error you are facing, we can guide you better.

Thanks!
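For context on what the linked document covers: the core change to a cluster spec is moving the init_scripts entry from a dbfs destination to a workspace destination. A minimal before/after sketch, with illustrative (assumed) paths:

```python
# Old cluster spec fragment: init script stored on DBFS (now retired).
old_spec = {
    "init_scripts": [
        {"dbfs": {"destination": "dbfs:/databricks/scripts/pyodbc-install.sh"}}
    ]
}

# New cluster spec fragment: init script stored as a workspace file.
new_spec = {
    "init_scripts": [
        {"workspace": {"destination": "/Workspace/Shared/scripts/pyodbc-install.sh"}}
    ]
}
```

Only the destination type and path change; everything else in the cluster configuration stays the same.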

 

Ameshj
New Contributor II

Hello - thank you for your reply.

I have only gone as far as creating a new folder and copying the generated init file over.

Where it says to copy from DBFS, I am unable to, as the DBFS path no longer exists.

Regards
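If the DBFS path no longer exists, there may be nothing left to copy; one alternative is to recreate the script contents directly at the new location, for example from a notebook that can write under /Workspace. A minimal sketch, where the target path and script body are placeholders for the real generated init script:

```python
from pathlib import Path

# Placeholder body; use the real contents of your generated init script.
SCRIPT_BODY = "#!/bin/bash\nsudo apt-get update\n"

def recreate_init_script(target: str, body: str = SCRIPT_BODY) -> Path:
    # Write the script directly at the new location instead of copying
    # it from a DBFS path that no longer exists.
    path = Path(target)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(body)
    return path

recreate_init_script("/Workspace/Shared/scripts/pyodbc-install.sh")
```

On a cluster, /Workspace paths are exposed through the local file API, so a plain file write from a notebook is enough to create the workspace file.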

 

Ameshj
New Contributor II

Here is another error I received when running the visualize job on Databricks.

NandiniN
Valued Contributor III

There's also this KB article specific to init script migration: https://kb.databricks.com/clusters/migration-guidance-for-init-scripts-on-dbfs

Ameshj
New Contributor II

Hi,

I have attached the global init script migration, but it does not work either.

dbutils.fs.put(
    "/databricks/scripts/pyodbc-install.sh",
    """#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev""",
    True,  # overwrite if the file already exists
)

It says the script exit status is non-zero.

NandiniN
Valued Contributor III

Hi @Ameshj ,

Sorry for the delay in the response.

For the all_df screenshot: how are you creating that DataFrame? Does it contain Tablename? How is it related to the init script migration?

Kindly add set -x after the first line, then enable cluster log delivery and share the logs if possible.
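The set -x suggestion above can be sketched as a small edit to the stored script text: insert the tracing line immediately after the shebang so every command the init script runs is echoed to the cluster logs. The helper name here is hypothetical.

```python
def add_trace(script: str) -> str:
    # Insert "set -x" right after the shebang line; bash will then print
    # each command before executing it, which shows up in the cluster logs.
    lines = script.splitlines()
    lines.insert(1, "set -x")
    return "\n".join(lines)

print(add_trace("#!/bin/bash\nsudo apt-get update"))
```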

Thanks & Regards,

Nandini

 

NandiniN
Valued Contributor III

One other suggestion is to use Lakehouse Federation. It is possible it may be a driver issue (we will know from the logs).

Ameshj
New Contributor II

Hi,
The issue is that DBFS is not there anymore, since it reached end of life.
Now when the cluster starts to run the job, it needs to fire off the init script, but it can't anymore (see the cluster settings image).
So how do I get the init script to start up elsewhere? I am stuck here.
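To start the init script from somewhere other than DBFS, the cluster's init_scripts setting needs to reference the workspace file instead of the retired DBFS location. A minimal sketch of that edit over a cluster spec, with an assumed target path; apply the resulting spec through the cluster UI or the clusters API:

```python
def patched_cluster_spec(spec: dict, workspace_path: str) -> dict:
    # Return a copy of the cluster spec whose init_scripts entry points at
    # a workspace file instead of a DBFS location.
    patched = dict(spec)
    patched["init_scripts"] = [{"workspace": {"destination": workspace_path}}]
    return patched

spec = {
    "cluster_name": "ge-job-cluster",  # placeholder name
    "init_scripts": [{"dbfs": {"destination": "dbfs:/databricks/scripts/pyodbc-install.sh"}}],
}
print(patched_cluster_spec(spec, "/Workspace/Shared/scripts/pyodbc-install.sh"))
```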

NandiniN
Valued Contributor III

#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev