Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Sync prod WS DBs to dev WS DBs

Mr__E
Contributor II

We have a couple of sources that we've already set up to stream to prod using a third-party (3p) system. Is there a way to sync this data directly to our dev workspace so we can build pipelines there? E.g., connecting directly to a cluster in prod and pulling with a job cluster, dumping to S3 and using Auto Loader, or maybe creating a shared DBFS and sharing the data through that?

We initially created the dev / prod workspaces using the automagical workspace-creation tool, so I'm unfamiliar with how setting up a shared DBFS would work.

2 REPLIES

Debayan
Esteemed Contributor III

DBFS can be used in many ways. Per the documentation linked below, DBFS:

  • Allows you to interact with object storage using directory and file semantics instead of cloud-specific API commands.
  • Allows you to mount cloud object storage locations so that you can map storage credentials to paths in the Databricks workspace.
  • Simplifies the process of persisting files to object storage, allowing virtual machines and attached volume storage to be safely deleted on cluster termination.
  • Provides a convenient location for storing init scripts, JARs, libraries, and configurations for cluster initialization.
  • Provides a convenient location for checkpoint files created during model training with OSS deep learning libraries.

https://docs.databricks.com/dbfs/index.html#what-can-you-do-with-dbfs
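
For the "dump to S3 and use Auto Loader" option mentioned in the question, a minimal sketch might look like the following. It assumes a Databricks runtime where the `cloudFiles` source is available and a `spark` session already exists, and it assumes the dev workspace has read access to the bucket. The function names, paths, and table name are hypothetical placeholders, not anything from the original setup.

```python
from typing import Any, Dict


def autoloader_options(file_format: str, schema_location: str) -> Dict[str, str]:
    """Build the reader options for the Auto Loader ("cloudFiles") source."""
    return {
        # Format of the files that prod dumps to object storage.
        "cloudFiles.format": file_format,
        # Where Auto Loader persists the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
    }


def start_dev_ingest(
    spark: Any,
    source_path: str,
    target_table: str,
    checkpoint_path: str,
    file_format: str = "json",
):
    """Incrementally load new files from source_path into a dev table.

    All arguments are hypothetical placeholders; this only runs on a
    cluster where the "cloudFiles" source exists (a Databricks runtime).
    """
    options = autoloader_options(file_format, f"{checkpoint_path}/_schema")
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**options)
        .load(source_path)
    )
    return (
        stream.writeStream
        # The checkpoint tracks which files have already been ingested.
        .option("checkpointLocation", checkpoint_path)
        # Process everything currently available, then stop.
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

In a dev notebook this could be called as, for example, `start_dev_ingest(spark, "s3://<bucket>/<prefix>/", "dev_db.my_table", "s3://<bucket>/<checkpoints>/my_table")`, with the bucket, prefix, and table name replaced by your own. The `availableNow` trigger suits a scheduled job cluster, since the stream stops once the backlog is processed.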

Please let us know if this helps or if you need further clarification.

Kaniz_Fatma
Community Manager

Hi @Erik Louie, we haven't heard from you since the last response from @Debayan Mukherjee, and I was checking back to see if his suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
