Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Dbfs init script migration

Ameshj
New Contributor III

I need help with migrating from DBFS to workspace files on Databricks. I am new to Databricks and am struggling with what is in the links provided.

My workspace.yml also has DBFS paths hard-coded. Included is a full deployment with Great Expectations.

This was done by an external vendor.

9 REPLIES

NandiniN
Honored Contributor

Hi @Ameshj,

I do not see any attachments or links in this post.

However, I will add a document which I feel should help you: https://docs.databricks.com/en/_extras/documents/aws-init-workspace-files.pdf

If you can specify the exact error you are facing, we can guide you better.
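For reference, once an init script is stored as a workspace file, the cluster configuration can point at it directly instead of a dbfs:/ path. A minimal sketch of the relevant cluster-spec fragment, assuming the script has already been saved under /Workspace (the destination path below is a placeholder, not from this thread):

```json
{
  "init_scripts": [
    {
      "workspace": {
        "destination": "/Users/someone@example.com/init/pyodbc-install.sh"
      }
    }
  ]
}
```

In the cluster UI this corresponds to choosing "Workspace" as the init script source rather than "DBFS".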

Thanks!

 

Ameshj
New Contributor III

Hello - thank you for your reply.

I have only gone as far as creating a new folder and copying the generated init file over.

Where the guide says to copy from DBFS, I am unable to, as the DBFS path no longer exists.

Regards

 

Ameshj
New Contributor III

Here is another error I received when running the visualize job on Databricks.

NandiniN
Honored Contributor

There's also this KB article specific to init script migration: https://kb.databricks.com/clusters/migration-guidance-for-init-scripts-on-dbfs

Ameshj
New Contributor III

Hi,

I have attached the global init script migration, but it does not work either.

dbutils.fs.put(
"/databricks/scripts/pyodbc-install.sh","""
#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev""",
True)

It says the script exit status is non-zero.
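Note that dbutils.fs.put with a /databricks/... path writes to the DBFS root, which is exactly what is being deprecated here. On recent runtimes /Workspace is mounted as an ordinary directory on the driver, so the script can instead be written as a plain file. A minimal sketch, assuming /Workspace is mounted and the target folder exists (the /Workspace/Shared path in the comment is a placeholder, not from this thread):

```python
from pathlib import Path

# Same pyodbc setup as the script above, as a plain string.
INIT_SCRIPT = """#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
apt-get update
ACCEPT_EULA=Y apt-get install -y msodbcsql17 unixodbc-dev
"""

def write_init_script(target_dir: str, name: str = "pyodbc-install.sh") -> Path:
    """Write the init script as an ordinary file. On Databricks, pass a
    /Workspace/... directory so the cluster can reference it as a
    workspace-file init script."""
    path = Path(target_dir) / name
    path.write_text(INIT_SCRIPT)
    return path

# In a Databricks notebook this might look like (hypothetical path):
# write_init_script("/Workspace/Shared/init-scripts")
```

After writing the file, the cluster's init script source would be switched from DBFS to Workspace and pointed at that path.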

NandiniN
Honored Contributor

Hi @Ameshj ,

Sorry for the delay in the response.

For the all_df screenshot - how are you creating that DataFrame? Does it contain Tablename? How is it related to the init script migration?

Kindly add set -x after the first line, enable cluster log delivery to DBFS, and share the logs if possible.
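To illustrate that suggestion: with set -x on the line after the shebang, bash prints each command (prefixed with +) to stderr before running it, so the failing command shows up in the cluster's init script log. A toy sketch, with a harmless echo standing in for the apt-get steps:

```shell
#!/bin/bash
set -x                           # trace: print each command before it runs
echo "installing pyodbc deps"    # placeholder for the apt-get steps above
```

Running this prints a trace line such as "+ echo 'installing pyodbc deps'" alongside the normal output, which is what to look for in the logs.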

Thanks & Regards,

Nandini

 

NandiniN
Honored Contributor

One other suggestion is to use Lakehouse Federation. It is possible this is a driver issue (we will know from the logs).

Ameshj
New Contributor III

Hi,

The issue is that DBFS is not there anymore, since it reached end of life.

Now, when the cluster starts to run the job, it needs to fire off the init script, but it can't anymore (see the cluster settings image).

So how do I get the init script to start up elsewhere? I am stuck here.

NandiniN
Honored Contributor

#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev

 
