cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Is there something like Oracle's VPD-Feature in Databricks?

quakenbush
Contributor

Since I am porting some code from Oracle to Databricks, I have another specific question.

In Oracle there's something called Virtual Private Database, VPD. It's a simple security feature used to generate a WHERE-clause which the system will add to a user's query.

Basically, you set an "Application Context" (something like a tag) when a connection is made. This context is read in a specific function which then returns a string to be attached as the WHERE clause to the actual query.

I am aware of the current_user()-function in Pyspark/Databricks. However, I can't use it, because it would return the technical username of the BI-tool. That's why the VPD was used in the first place.

Thanks 🙂

1 REPLY 1

Anonymous
Not applicable

@Roger Bieri​ :

In Databricks, you can use the UserDefinedFunction (UDF) feature to create a custom function that will be applied to a DataFrame. You can use this feature to add a WHERE clause to a DataFrame based on the user context. Here's an example of how you can use UDF to apply a WHERE clause to a DataFrame based on the user context:

from pyspark.sql.functions import udf
from pyspark.sql.types import StringType
 
# Define your custom function to add WHERE clause to DataFrame
def add_where_clause(user_context):
    if user_context == "context_1":
        return "WHERE column_1 > 10"
    elif user_context == "context_2":
        return "WHERE column_2 < 100"
    else:
        return ""
 
# Create a UDF from the custom function
add_where_clause_udf = udf(add_where_clause, StringType())
 
# Apply the UDF to the DataFrame to add WHERE clause based on user context
df = df.select("*", add_where_clause_udf(current_user()).alias("where_clause"))
df_filtered = df.filter(df.where_clause != "")

In this example, add_where_clause is a custom function that takes the user context as an input and returns a WHERE clause string based on the context. The add_where_clause_udf function creates a UDF from the custom function. The df DataFrame is then modified by applying the UDF to create a new column where_clause. Finally, the df_filtered DataFrame is created by filtering the rows where where_clause is not an empty string. You can modify the add_where_clause function to suit your specific use case, and apply the UDF to your DataFrame to add a WHERE clause based on the user context.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group