Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

DBFS init script migration

Ameshj
New Contributor III

I need help migrating from DBFS to the workspace on Databricks. I am new to Databricks and am struggling with the content of the links provided.

My workspace.yml also has DBFS hard-coded. Included is a full deployment with Great Expectations.

This was done by an external vendor.

1 ACCEPTED SOLUTION

Accepted Solutions

NandiniN
Databricks Employee

#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev
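
Once the cluster has started with this script, one way to confirm that the driver actually installed is to look for its entry in the ODBC driver registry. The sketch below is not from the thread. It assumes the registry lives at /etc/odbcinst.ini and that msodbcsql17 registers itself as "ODBC Driver 17 for SQL Server"; the function name driver_registered is made up for illustration.

```shell
#!/bin/bash
# Returns 0 if the given odbcinst.ini lists the SQL Server 17 driver.
# The default path /etc/odbcinst.ini is an assumption about the cluster image.
driver_registered() {
  local ini_file="${1:-/etc/odbcinst.ini}"
  [ -r "$ini_file" ] && grep -qF '[ODBC Driver 17 for SQL Server]' "$ini_file"
}

if driver_registered; then
  echo "msodbcsql17 driver is registered"
else
  echo "msodbcsql17 driver not found in odbcinst.ini"
fi
```

Running this from a notebook shell cell or a web terminal on the cluster should give a quick yes/no before debugging the Python side (for example pyodbc).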

 


12 REPLIES

NandiniN
Databricks Employee

Hi @Ameshj , 

I do not see any attachments or links in this post.

However, here is a document that I think should help you: https://docs.databricks.com/en/_extras/documents/aws-init-workspace-files.pdf

If you can specify the specific error that you are facing, we can guide you better.

Thanks!

 

Ameshj
New Contributor III

Hello - thank you for your reply.

I have only gone as far as creating a new folder and copying the generated init file over.

Where it says to copy from DBFS, I am unable to, as the DBFS path no longer exists.

Regards

 

Ameshj
New Contributor III

Here is another error I received when running the visualize job on Databricks.

NandiniN
Databricks Employee

There's also this KB specific to init script migration - https://kb.databricks.com/clusters/migration-guidance-for-init-scripts-on-dbfs

Ameshj
New Contributor III

Hi,

I have attached the global init script migration, but it does not work either.

dbutils.fs.put(
"/databricks/scripts/pyodbc-install.sh","""
#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev""",
True)

It says the script exit status is non-zero.

NandiniN
Databricks Employee

Hi @Ameshj ,

Sorry for the delay in the response.

For the all_df screenshot - how are you creating that df? Does it contain Tablename? How is it related to init script migration?

Please add set -x after the first line of the script, enable cluster log delivery to DBFS, and share the logs if possible.
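
As a rough illustration of the set -x suggestion, here is a shortened copy of the script with tracing enabled right after the shebang. To keep the sketch safe to run anywhere, it writes the script to a temporary file (a hypothetical location, not where Databricks expects it) and only syntax-checks it rather than executing the install.

```shell
#!/bin/bash
# Write a traced copy of the init script to a temporary file (hypothetical location).
script_path="$(mktemp)"
cat > "$script_path" <<'EOF'
#!/bin/bash
set -x   # print each command before it runs, so the failing step shows up in the log
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
EOF

# Syntax-check the script without executing any of the install commands.
bash -n "$script_path" && echo "syntax OK: $script_path"
```

With tracing on, the last command echoed before the non-zero exit is usually the one to look at first.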

Thanks & Regards,

Nandini

 

NandiniN
Databricks Employee

Another suggestion is to use Lakehouse Federation. It is also possible that this is a driver issue (we will know from the logs).

Ameshj
New Contributor III

Hi,
The issue is that DBFS is not there anymore, since it reached end of life.
Now when the cluster starts to run the job, it needs to fire off the init script, but it can't anymore (cluster settings image).
So how do I get the init script to start from somewhere else? I am stuck here.

NandiniN
Databricks Employee

#!/bin/bash
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get install -y msodbcsql17
# optional: for bcp and sqlcmd
sudo ACCEPT_EULA=Y apt-get install -y mssql-tools
echo 'export PATH="$PATH:/opt/mssql-tools/bin"' >> ~/.bashrc
source ~/.bashrc
# optional: for unixODBC development headers
sudo apt-get install -y unixodbc-dev
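
The thread does not show where this script was saved or how the cluster was pointed at it. Assuming it was stored as a workspace file, as the migration guides linked earlier describe, the cluster's init script entry would reference a workspace path instead of a dbfs:/ one. A rough sketch that prints such a fragment follows; the path is hypothetical, and the JSON shape follows the init_scripts field of the Clusters API as I understand it, so check it against your own workspace.yml or cluster definition.

```shell
#!/bin/bash
# Hypothetical workspace path; replace with wherever the script was actually saved.
init_script_path="/Users/someone@example.com/init/pyodbc-install.sh"

# Build the init_scripts fragment of a cluster spec that points at the workspace file.
init_scripts_json=$(cat <<EOF
{
  "init_scripts": [
    { "workspace": { "destination": "${init_script_path}" } }
  ]
}
EOF
)

echo "$init_scripts_json"
```

The same idea applies to any hard-coded dbfs:/ reference in deployment config: the destination changes, the script body does not.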

 

Ameshj
New Contributor III

Wow, you are a magician at this.

Thank you for your guidance and assistance. It finally worked.

NandiniN
Databricks Employee

Hi @Ameshj, can you please "accept as solution" the reply that worked for you?

NandiniN
Databricks Employee

Glad it worked and helped you.
