Introduction
Every hour, global manufacturing plants across multiple regions generate gigabytes of telemetry data from sensors, machines, and production lines. This vast amount of data holds immense potential for predictive maintenance, operational optimization, and efficiency improvements. Manufacturing organizations face a uniquely complex data governance landscape, as they often span multiple regions with different infrastructure capabilities and local compliance requirements.
Manufacturing telemetry data is not just large in volume; it is also highly diverse, ranging from time-series sensor readings and batch logs to equipment configuration files. Much of this data is sensitive, containing proprietary machine configurations or insights into operator behavior, and is often needed in real time by cross-functional teams spanning engineering, data science, and operations.
Traditional data platforms struggle to scale governance across such diverse formats, use cases, and regions without creating bottlenecks. Databricks offers a unified and scalable approach to this problem. Its lakehouse architecture combines the reliability and governance of data warehouses with the openness and flexibility of data lakes. It supports structured and unstructured data at scale, enables fine-grained access controls across clouds and regions, and facilitates secure data sharing and real-time processing—making it particularly well suited to the unique demands of modern manufacturing data governance.
In this blog post, we explore best practices for implementing such a governance model using Databricks:
Defining and partitioning tables at a plant level (a table definition sketch follows below)
Structuring teams and access in Unity Catalog
A well-defined team topology ensures that data access aligns with organizational roles by using account-level groups, which can be synchronized from the organization's identity provider (such as Azure AD, Okta, or others).
By structuring access in this way, organizations can maintain a scalable governance model that supports innovation while enforcing least-privilege access principles.
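As a minimal sketch of what such grants can look like (the manufacturing catalog and telemetry schema names are assumptions for illustration; the group name matches the personas used later in this post), a regional analyst group might be given read-only access like this:
-- Illustrative Unity Catalog grants to an account-level group (least-privilege read access)
grant use catalog on catalog manufacturing to `group_us_east_1`;
grant use schema on schema manufacturing.telemetry to `group_us_east_1`;
grant select on schema manufacturing.telemetry to `group_us_east_1`;
Because the grant targets an account-level group synced from the identity provider, access automatically follows changes in group membership rather than individual user assignments.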
Implementing fine-grained governance with row filters, tags, and mapping tables
To enforce fine-grained access control, organizations can combine row filters, metadata tags, and mapping tables to define dynamic governance rules. Mapping tables serve as a central component for encoding governance logic — specifying which users, roles, or account-level groups (synced from the identity provider) are authorized to access specific subsets of data. This approach ensures the principle of least privilege is consistently applied, granting teams access only to the data relevant to their role. Additionally, metadata tagging enhances data discoverability and facilitates lifecycle management.
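Before moving on, here is a minimal sketch of the plant-level table definition and tagging referenced in the practices above. The sensor columns and tag names are illustrative assumptions, not a prescribed production schema:
create table if not exists telemetry_measurements (
  plant_id int,          -- identifies the plant that produced the reading
  sensor_id string,      -- illustrative sensor identifier
  measured_at timestamp, -- illustrative reading timestamp
  measurement double     -- illustrative sensor value
)
partitioned by (plant_id);

-- Table-level tag to aid discoverability and lifecycle management (tag name is illustrative)
alter table telemetry_measurements set tags ('domain' = 'manufacturing_telemetry');
Partitioning by plant_id keeps each plant's rows physically grouped, which pairs naturally with the plant-level row filters described later in this post.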
To understand how these capabilities apply in real-world manufacturing environments, let's first examine the typical data setup in these plants.
Current Setup
Manufacturing plants often employ a diverse range of data solutions to handle telemetry and operational data, and these setups vary widely from plant to plant.
The next section explores key challenges and best practices for optimizing data management while balancing flexibility and compliance.
Challenges
Without structured governance, sensitive data remains vulnerable, and collaboration across regions is restricted. These challenges highlight the need for a structured governance framework that balances security with operational flexibility.
Solution
Databricks enables a scalable governance model by defining clear access privileges for different teams while maintaining compliance. The following matrix outlines the levels of access required for various operational teams.
| Databricks Account Groups | Telemetry data | ERP data |
|---|---|---|
| Operating Plant Data Analysts | Access to only specific rows for their plant | No access to PII data |
| Regional Data Analysts | Access to multiple plants in the region | No access to PII data |
| Global Data Analytics Teams | Access to all plant telemetry data | No access to PII data |
| Global HR Teams | No access to telemetry data | Can access PII data |
To translate the access matrix above into ACLs and apply them to their respective groups, we use the concept of mapping tables. The diagram below outlines this setup.
The mapping table, as outlined in the diagram, stores the access control lists for both telemetry and ERP data.
Mapping table structure and data
The mapping table has the following columns: group_name (the Databricks account group), entity_type (the scope of access: plant, region, global, or hr), entity_ids (the plant or region identifiers the group may access), and pii_access (whether the group may view PII columns).
The following naming conventions are used when defining the persona-based groups: plant-level groups are named group_<region>_<plant_id> (for example, group_us_east_1_101), regional groups are named group_<region> (for example, group_us_east_1), and global groups use descriptive names such as group_global and group_global_hr.
Execute the DDL and DML statements below to create the ACL_MAPPING table and insert sample data into it.
Create table:
create or replace table acl_mapping (
group_name string,
entity_type string,
entity_ids array<int>,
pii_access string
);
Insert sample data into the table:
insert into acl_mapping values ('group_us_east_1_101', 'plant', array(101), 'no');
insert into acl_mapping values ('group_us_east_1', 'region', array(101, 102), 'no');
insert into acl_mapping values ('group_global', 'global', array(), 'no');
insert into acl_mapping values ('group_global_hr', 'hr', array(), 'yes');
The table below shows the sample data inserted into the ACL_MAPPING mapping table.
ACL_MAPPING Data:
| group_name | entity_type | entity_ids | pii_access |
|---|---|---|---|
| group_us_east_1_101 | plant | [101] | no |
| group_us_east_1 | region | [101, 102] | no |
| group_global | global | [] | no |
| group_global_hr | hr | [] | yes |
Row Filter function:
To define row-based access control, we use a row filter function. The function accepts an integer plant_id parameter, which is bound to the plant_id column of the telemetry table; plant_id identifies the plant that produced each telemetry row.
create or replace function telemetry_rls_mapping(plant_id int)
returns boolean
return
  -- Members of the global analytics group can see every row
  is_account_group_member('group_global')
  -- Plant- and region-level groups can see only rows for their assigned plants
  or exists (
    select 1
    from acl_mapping mt
    where is_account_group_member(mt.group_name)
      and (
        (mt.entity_type = 'plant' and array_contains(mt.entity_ids, plant_id))
        or (mt.entity_type = 'region' and array_contains(mt.entity_ids, plant_id))
      )
  );
Once defined, the row filter can be applied to the telemetry table using an ALTER statement, as shown below:
alter table telemetry_measurements
set row filter telemetry_rls_mapping on (plant_id);
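As an illustration (assuming the telemetry_measurements table has a plant_id column as above), the same query returns different rows depending on the user's group membership:
select plant_id, count(*) as reading_count
from telemetry_measurements
group by plant_id;
-- A member of only group_us_east_1_101 sees a single row for plant 101;
-- a member of group_us_east_1 sees plants 101 and 102;
-- a member of group_global sees every plant.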
Data Masking function:
To define column masks for PII columns, we use column mask functions. The function below references the mapping table to determine whether a group should have access to a PII-sensitive column, based on the pii_access flag. If a group/persona should not have access to PII columns, users in that group see the masked value '*********'.
create or replace function pii_mask(pii_column_value string)
returns string
return
  case
    -- Reveal the value only when the user belongs to a group flagged with pii_access = 'yes'
    when exists (
      select 1
      from acl_mapping mt
      where is_account_group_member(mt.group_name)
        and mt.pii_access = 'yes'
    ) then pii_column_value
    else '*********'
  end;
The masking function above can be applied at the column level, for example to the worker table's phone_number column, using an ALTER statement as shown below.
alter table worker alter column phone_number set mask pii_mask;
Please note that if a user belongs only to groups that have no entry in the ACL_MAPPING table and queries a table on which the data masking function has been applied, they will see masked values.
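For example (assuming the worker table above), querying the masked column behaves as follows:
select phone_number from worker limit 5;
-- Members of group_global_hr (pii_access = 'yes') see the actual phone numbers;
-- all other users, including those whose groups have no ACL_MAPPING entry, see '*********'.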
With these governance controls in place, organizations can ensure that manufacturing telemetry and ERP data remain secure and compliant while enabling structured, role-based access. By leveraging Databricks’ row filtering and column masking capabilities, businesses can maintain a scalable, region-aware access model that enhances operational agility and mitigates risks associated with data silos and unauthorized access.
Conclusion:
In this blog, we outlined a structured approach to governing telemetry and ERP data in manufacturing environments. We examined how to implement team-based access control using mapping tables, row-level security, and column masking in Databricks. A well-defined team topology plays a crucial role in this governance model, ensuring that access is aligned with organizational roles and responsibilities. By applying these best practices, organizations can strengthen data governance, ensure compliance, and unlock deeper insights from their manufacturing data. This approach not only protects sensitive information but also empowers teams to make data-driven decisions with confidence.
Here are some related links for your reference:
Filter sensitive table data using row filters and column masks
Support and Limitations of Row filters and column masks