Customer deployment

thomasthomas
New Contributor II

Hi,

I have a bunch of scripts in Databricks that perform a decent amount of data-wrangling. All of these scripts contain sensitive information and I have no intention of making them public.

I would like to provide a service to my customers so that they can use all these scripts without being able to see their content. My idea was to deploy Databricks in their own subscription, but it looks like even with Databricks Premium I cannot hide those scripts from my customers. The admin can see everything. (I wouldn't be the admin, obviously.)

My second idea was to keep my Databricks workspace as is in my own subscription and then create Databricks clusters in their subscriptions. However, I can't find such an option: if I create a cluster, it runs in my own subscription.

For obvious reasons, my clients do not want their data to leave their subscription.

What's the recommended way of deploying to customers, then? All of my scripts are PySpark scripts, so there is no obvious way to compile them, given that Python is an interpreted language. There are a few libraries out there but I don't trust them 100%.

Please advise,

Tamas

1 ACCEPTED SOLUTION

Accepted Solutions

Atanu
Esteemed Contributor

@Tamas D I understand your concern.

  1. Creating a cluster in a different subscription is not possible; I think that is by design at the moment. I would ask you to add your use case to https://feedback.azure.com/d365community/forum/2efba7dc-ef24-ec11-b6e6-000d3a4f0da0# which is reviewed by our product team, so that this may be implemented in the future.
  2. The same goes for hiding the scripts (if our ACLs do not serve the purpose).

Thanks. Please let me know if you have any queries.


4 REPLIES

Anonymous
Not applicable

Hello, @Tamas D​ - My name is Piper, and I'm a community moderator for Databricks. Please accept my apologies for not responding sooner. 😞

I'll pass this to the subject matter experts now and the team will go looking for the best person to answer your question.

@Piper Wilson​  Thank you.

In the meantime I found out that Databricks supports REST API calls to execute scripts in a given language, but I am not convinced that it would be a great approach. With API 1.2 we need to define the language, and in almost all notebooks I keep switching between Python and Spark SQL.

https://docs.databricks.com/dev-tools/api/1.2/index.html

Also, it is uncertain how temporary views would be handled if the scripts were sent separately (Python, SQL, Python, SQL, etc.). Maybe it is not worth going down that road...
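One way around the language switching might be to send everything as Python and wrap each SQL cell in `spark.sql(...)`. Since `spark.sql` runs in the same Spark session as the surrounding Python code, temporary views created earlier in the script should stay visible to the SQL statements. Below is a minimal sketch of that idea. The `cells_to_python` helper and the `(language, source)` cell layout are hypothetical names made up for illustration; they are not part of any Databricks API, and the sketch assumes one SQL statement per cell.

```python
from typing import Iterable, Tuple


def cells_to_python(cells: Iterable[Tuple[str, str]]) -> str:
    """Merge mixed Python/SQL cells into one Python script.

    Hypothetical helper: each cell is a (language, source) pair.
    SQL cells are wrapped in spark.sql(...), so the whole script can be
    submitted under a single language. Assumes one statement per SQL cell,
    because spark.sql() executes a single statement per call.
    """
    parts = []
    for language, source in cells:
        if language == "python":
            # Keep Python cells as they are, minus surrounding blank lines.
            parts.append(source.strip("\n"))
        elif language == "sql":
            # repr() produces a valid Python string literal, quotes included.
            parts.append(f"spark.sql({source.strip()!r})")
        else:
            raise ValueError(f"unsupported cell language: {language!r}")
    return "\n".join(parts) + "\n"


# Example cells, alternating between Python and SQL as in a notebook.
cells = [
    ("python", "df.createOrReplaceTempView('t')"),
    ("sql", "SELECT COUNT(*) AS n FROM t"),
]
script = cells_to_python(cells)
print(script)
```

The resulting string could then be submitted as a single Python command. Note that `spark.sql` returns a DataFrame rather than displaying it, so a SQL cell whose result matters would still need an explicit `.show()` or `.collect()`.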

Anonymous
Not applicable

@Tamas D​  Thank you for the update.

