Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How do you create a sandbox in your data environment?

William_Scardua
Valued Contributor

Hi guys,

How do you create a sandbox in your data environment? Any ideas?

Azure/AWS + Data Lake + Databricks

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

Maybe use Azure Data Factory to periodically copy your data to sandbox storage.

If you need to configure mounts and SQL databases/tables, the best approach is a single notebook that does this and runs in both environments, with a widget that specifies the deployment (so the notebook sets the mount and table locations accordingly).

1) Copy the data with Data Factory.

2) When the copy completes, run the deployment notebook (this can also be triggered from Data Factory, which can pass the parameter that sets the widget).
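A minimal sketch of what such a deployment notebook could look like, in case it helps. All names here (the `deployment` widget, the `/mnt/...` mount points, the `analytics` databases, the `events` table) are made up for illustration, and the `CREATE TABLE ... LOCATION` statement assumes Delta data already exists at that path after the copy in step 1. The Databricks-specific part only runs where `dbutils` is defined, so the helper functions can be tried anywhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    mount_point: str  # where the storage is mounted
    database: str     # SQL database that holds the tables
    table_root: str   # folder under the mount that holds table data


# Hypothetical settings; replace with your own mounts and database names.
DEPLOYMENTS = {
    "prod": Deployment("/mnt/prod", "analytics", "/mnt/prod/tables"),
    "sandbox": Deployment("/mnt/sandbox", "analytics_sandbox", "/mnt/sandbox/tables"),
}


def resolve_deployment(name: str) -> Deployment:
    """Map the widget value to the settings for that environment."""
    key = name.strip().lower()
    if key not in DEPLOYMENTS:
        raise ValueError(
            f"unknown deployment {name!r}; expected one of {sorted(DEPLOYMENTS)}"
        )
    return DEPLOYMENTS[key]


def create_table_sql(dep: Deployment, table: str) -> str:
    """Build a statement that registers a table at the environment's location."""
    return (
        f"CREATE TABLE IF NOT EXISTS {dep.database}.{table} "
        f"USING DELTA LOCATION '{dep.table_root}/{table}'"
    )


# Databricks-only part: skipped when the code runs outside a notebook.
if "dbutils" in globals():
    # The widget can be set by hand or passed in as a parameter by the caller.
    dbutils.widgets.dropdown("deployment", "sandbox", sorted(DEPLOYMENTS))
    dep = resolve_deployment(dbutils.widgets.get("deployment"))
    spark.sql(f"CREATE DATABASE IF NOT EXISTS {dep.database}")
    spark.sql(create_table_sql(dep, "events"))  # "events" is a placeholder table
```

The same notebook then runs unchanged in both environments; only the widget value differs.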


6 REPLIES

Thank you, @Hubert Dudek

-werners-
Esteemed Contributor III

Depends on how much you want the sandbox to be disconnected from the rest.

The ideal scenario is a completely separate setup, as in DEV-QA-PRD-SANDBOX.

But to be honest, I think that is overkill.

If you have a separate storage account, a separate Blob container, or even a subdirectory protected with permissions, you already have quite a lot.

That is because the data is the most important part.

Then the notebooks: you can opt for a separate Databricks account, but again, you can do without.

For example, use Repos for your 'official' notebooks and the workspace/user folder for playing around.

You only have to make sure that you use the correct mount, which can be set with a widget as Hubert mentioned.
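One way to make "use the correct mount" harder to get wrong is a small guard that sandbox notebooks call before writing. This is only a sketch under assumptions: the mount names are hypothetical, and a check like this is a convenience for catching mistakes, not a substitute for real permissions on the storage.

```python
import posixpath

# Hypothetical mount points; adjust to your own environment.
MOUNTS = {"official": "/mnt/prod", "sandbox": "/mnt/sandbox"}


def is_under_mount(path: str, mount: str) -> bool:
    """True if path, after resolving '..' segments, lies inside mount."""
    norm = posixpath.normpath(path)
    root = posixpath.normpath(mount)
    return norm == root or norm.startswith(root + "/")


def check_write_path(path: str, environment: str) -> str:
    """Return the normalized path, or raise if it is outside the mount."""
    mount = MOUNTS[environment]
    if not is_under_mount(path, mount):
        raise PermissionError(
            f"{path!r} is outside the {environment} mount {mount!r}"
        )
    return posixpath.normpath(path)
```

For instance, `check_write_path("/mnt/sandbox/tmp/run1", "sandbox")` returns the path, while a path that climbs out with `..` or points at the other mount raises `PermissionError`.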

It also depends on the number of people working on Databricks. If you are only a small team, you do not have to be too strict. But with lots of people and frequent personnel changes (consultants, for example), it is a good idea to have strict permissions and procedures.

I agree, @Werner Stinckens, thank you

missyT
New Contributor III

In a sandbox environment, you will find the Designer enabled. You can activate Designer by selecting the Designer icon on a page, or by choosing the Design menu item in the Settings menu.

Thank you @Missy Trussell​ 
