How Digital Payment Lending Platforms Can Collaborate with Banks Without Exposing Sensitive Data
1. Business Context & Regulatory Reality
In 2020, large Indian fintech platforms faced a unique regulatory constraint: NBFC-led digital platforms were not allowed to issue credit cards directly due to RBI compliance restrictions. However, these platforms had something extremely valuable: deep, high-frequency behavioral and credit-adjacent data.
To bridge this gap, fintechs partnered with regulated banks to jointly offer co-branded credit cards. The business goal was simple but technically complex:
Identify common eligible customers between Fintech and Bank
Ensure zero leakage of raw PII or credit bureau data
Maintain auditability, repeatability, and regulatory compliance
Scale the process to millions of users, every 2 weeks
This collaboration problem is exactly what Databricks Clean Rooms are designed to solve.
2. Data Landscape at Fintech (Digital Payment Platform)
2.1 Credit Bureau Ingestion (CIBIL & Experian)
Each user onboarding or lending event triggered:
Soft pull from CIBIL & Experian
JSON/XML credit report ingestion
Secure storage in encrypted data lake
2.2 Feature Engineering at Scale (500+ Features)
The ML platform generated 500+ engineered features, broadly grouped as:
| Feature Category | Examples |
| --- | --- |
| Credit Behavior | DPD buckets, credit vintage, utilization |
| Velocity | Credit inquiries (7/30/90 days) |
| Stability | Address consistency, employer tenure |
| Risk Signals | Write-offs, settlements, delinquency trends |
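As an illustration of the Velocity category, the sketch below counts credit inquiries in trailing 7/30/90-day windows. The function name, the input shape, and the window boundaries (the as-of date is included, the start date is excluded) are assumptions made for this example, not the platform's actual feature code.

```python
from datetime import date, timedelta

def inquiry_velocity(inquiry_dates, as_of, windows=(7, 30, 90)):
    """Count inquiries in each trailing window ending at `as_of`.

    Hypothetical helper: an inquiry counts for window `n` if it falls
    in the interval (as_of - n days, as_of].
    """
    features = {}
    for n in windows:
        start = as_of - timedelta(days=n)
        features[f"inquiries_{n}d"] = sum(
            1 for d in inquiry_dates if start < d <= as_of
        )
    return features

# Made-up inquiry dates for a single user.
dates = [date(2020, 6, 28), date(2020, 6, 10), date(2020, 4, 15), date(2019, 12, 1)]
velocity = inquiry_velocity(dates, as_of=date(2020, 6, 30))
```

Each wider window includes the narrower ones, so the counts never decrease from 7 to 30 to 90 days.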
All features were computed using incremental pipelines, ensuring:
Only new bureau deltas were processed
Historical recomputation was avoided
Feature freshness SLAs were maintained
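One common way to get this behavior is a watermark: each run processes only bureau records newer than the last processed timestamp, then advances the watermark. The sketch below is a minimal in-memory version of that idea. The record fields (`user_id`, `pulled_at`, `dpd_days`) and the single `max_dpd` feature are hypothetical stand-ins, and the article does not state which mechanism the real pipelines used.

```python
def process_bureau_deltas(records, watermark, feature_store):
    """Update features only for bureau records newer than `watermark`.

    Assumed shapes (hypothetical): each record is a dict with `user_id`,
    `pulled_at` (any orderable timestamp) and `dpd_days`.
    Returns the new watermark to persist for the next run.
    """
    new_watermark = watermark
    for rec in records:
        if rec["pulled_at"] <= watermark:
            continue  # already handled by an earlier run, no recomputation
        # Stand-in for real feature logic: track the worst DPD seen so far.
        prev = feature_store.get(rec["user_id"], {"max_dpd": 0})
        feature_store[rec["user_id"]] = {
            "max_dpd": max(prev["max_dpd"], rec["dpd_days"])
        }
        new_watermark = max(new_watermark, rec["pulled_at"])
    return new_watermark

store = {"u1": {"max_dpd": 30}}
batch = [
    {"user_id": "u1", "pulled_at": 5, "dpd_days": 90},   # older than watermark, skipped
    {"user_id": "u1", "pulled_at": 11, "dpd_days": 10},  # new, but not worse than 30
    {"user_id": "u2", "pulled_at": 12, "dpd_days": 0},   # new user
]
next_watermark = process_bureau_deltas(batch, watermark=10, feature_store=store)
```

Because the first record predates the watermark, it is never re-read into the features, which is what keeps historical recomputation out of the regular run.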
3. The Core Challenge: Secure Partner Matching
Problem Statement
How do two independent entities (Fintech + Bank) identify common eligible customers without sharing raw PII or internal scores?
Traditional solutions involved:
Manual data rooms
Shared EC2 servers
Ad-hoc scripts
These approaches:
Increased compliance risk
Were hard to audit
Didn't scale
4. Re-imagining the DRE Using Databricks Clean Rooms
Databricks Clean Rooms allow multi-party computation over governed datasets, where:
Each party keeps data in its own account
Only approved queries are executed
Output is strictly controlled (aggregated, masked, or whitelisted)
Architecture Overview
5. Data Preparation (Fintech Side)
5.1 Masking & Tokenization
Only irreversible tokens were exposed:
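CREATE TABLE fintech_clean.masked_users AS
SELECT
    sha2(pan, 256) AS pan_hash,        -- SHA-256 token of the PAN
    sha2(mobile, 256) AS mobile_hash,  -- SHA-256 token of the mobile number
    eligibility_score,
    risk_bucket
FROM fintech_prod.user_features
WHERE eligibility_score >= 0.75;       -- expose only users above the eligibility cutoff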
No raw PAN, mobile, or bureau attributes ever left the Fintech boundary.
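For the hashed keys to join, both parties have to hash byte-identical strings, so some agreed normalization step is usually needed before hashing. The sketch below shows that in Python, with `plain_token` intended to mirror the `sha2(col, 256)` call in the SQL above. The normalization rule, the function names, and the sample PAN-shaped string are assumptions for illustration. The `keyed_token` variant is also an assumption rather than something the original pipeline is stated to use: identifiers such as mobile numbers have a small value space, so an unkeyed hash can be reversed by enumeration, and a shared secret key is one way to raise that bar.

```python
import hashlib
import hmac

def normalize(identifier: str) -> str:
    """Remove whitespace and upper-case so both parties hash identical strings."""
    return "".join(identifier.split()).upper()

def plain_token(identifier: str) -> str:
    """SHA-256 hex digest of the normalized identifier."""
    return hashlib.sha256(normalize(identifier).encode("utf-8")).hexdigest()

def keyed_token(identifier: str, shared_key: bytes) -> str:
    """HMAC-SHA256 variant (assumption): requires a secret agreed by both parties."""
    return hmac.new(
        shared_key, normalize(identifier).encode("utf-8"), hashlib.sha256
    ).hexdigest()

# "ABCDE1234F" is a made-up placeholder in PAN-like format.
fintech_side = plain_token(" abcde1234f ")
bank_side = plain_token("ABCDE1234F")
```

Here `fintech_side` and `bank_side` are equal even though the raw inputs differ in case and spacing, which is the property the equality join depends on.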
6. Clean Room Policy Definition
Databricks Clean Room policies strictly control who can query what.
6.1 Allowed Operations
Equality joins on hashed keys
Bank-owned eligibility filters
Row-level output constraints
6.2 Disallowed Operations
Raw data export
Reverse joins
Free-form SELECT *
Example policy (conceptual):
allowed_operations:
  - join_on: [pan_hash, mobile_hash]
  - filters: bank_rules
output_constraints:
  max_rows: 200000
  columns:
    - pan_hash
    - offer_flag
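To make the conceptual policy concrete, the sketch below enforces the same three constraints (approved join keys, a row cap, and a column whitelist) in plain Python. It is not the Databricks Clean Rooms API; the `POLICY` dict and both function names are hypothetical and only show what such checks amount to.

```python
POLICY = {
    "join_on": {"pan_hash", "mobile_hash"},
    "max_rows": 200_000,
    "columns": ["pan_hash", "offer_flag"],
}

def validate_join_keys(keys, policy=POLICY):
    """Reject joins on anything other than the approved hashed keys."""
    disallowed = set(keys) - policy["join_on"]
    if disallowed:
        raise PermissionError(f"join keys not allowed: {sorted(disallowed)}")

def apply_output_constraints(rows, policy=POLICY):
    """Enforce the row cap, then keep only whitelisted columns."""
    if len(rows) > policy["max_rows"]:
        raise ValueError("result exceeds max_rows")
    return [{col: row[col] for col in policy["columns"]} for row in rows]

validate_join_keys(["pan_hash"])  # passes silently
released = apply_output_constraints(
    [{"pan_hash": "a1", "offer_flag": "APPROVED", "cibil_score": 780}]
)
```

In this example `cibil_score` is dropped from `released` because it is not on the column whitelist, and a join on a raw column such as `mobile` would raise `PermissionError`.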
7. Matching Logic Inside Clean Room
7.1 Bank Side: Internal Rules
The bank applied its proprietary checks:
7.2 Joint Matching Query
SELECT
    f.pan_hash,
    CASE
        WHEN b.cibil_score >= 750
             AND b.internal_risk = 'LOW'
        THEN 'APPROVED'
        ELSE 'REJECTED'
    END AS offer_flag
FROM fintech_clean.masked_users f
JOIN bank_clean.customer_base b
    ON f.pan_hash = b.pan_hash
WHERE f.risk_bucket IN ('LOW', 'MEDIUM');
Neither party ever sees the other's raw data.
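The same matching logic can be traced on a few in-memory rows. The sketch below mirrors the query step by step: the fintech-side risk filter, the inner join on `pan_hash`, and the bank-side approval rule. The sample rows are made up, and the helper assumes `pan_hash` is unique on the bank side and that `cibil_score` is never null.

```python
def match_customers(fintech_rows, bank_rows):
    """Inner-join on pan_hash and derive offer_flag, mirroring the SQL logic."""
    bank_by_hash = {b["pan_hash"]: b for b in bank_rows}  # assumes unique pan_hash
    results = []
    for f in fintech_rows:
        if f["risk_bucket"] not in ("LOW", "MEDIUM"):
            continue  # WHERE f.risk_bucket IN ('LOW', 'MEDIUM')
        b = bank_by_hash.get(f["pan_hash"])
        if b is None:
            continue  # inner join: no bank match means no output row
        approved = b["cibil_score"] >= 750 and b["internal_risk"] == "LOW"
        results.append({
            "pan_hash": f["pan_hash"],
            "offer_flag": "APPROVED" if approved else "REJECTED",
        })
    return results

fintech = [
    {"pan_hash": "h1", "risk_bucket": "LOW"},
    {"pan_hash": "h2", "risk_bucket": "MEDIUM"},
    {"pan_hash": "h3", "risk_bucket": "HIGH"},  # filtered out by risk bucket
    {"pan_hash": "h4", "risk_bucket": "LOW"},   # not a bank customer
]
bank = [
    {"pan_hash": "h1", "cibil_score": 780, "internal_risk": "LOW"},
    {"pan_hash": "h2", "cibil_score": 700, "internal_risk": "LOW"},
    {"pan_hash": "h3", "cibil_score": 800, "internal_risk": "LOW"},
]
matches = match_customers(fintech, bank)
approved_hashes = [m["pan_hash"] for m in matches if m["offer_flag"] == "APPROVED"]
```

Only `h1` ends up approved: `h2` misses the score threshold, `h3` is excluded by the fintech risk bucket, and `h4` has no bank-side match.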
8. Output Whitelisting & Activation
Only APPROVED hashes were released:
Bank whitelisted customers internally
Fintech triggered in-app invite banners
No PII exchange required post-matching
This ensured:
RBI-compliant separation of duties
Zero data duplication
Full audit trace
9. Operational Excellence
9.1 Bi-Weekly Runs
9.2 Audit & Compliance
Query lineage
Policy enforcement logs
Time-bound access
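As a rough picture of how an enforcement log and a time-bound access check fit together, the sketch below builds one audit entry per query attempt. The record fields, the function name, and the idea of fingerprinting the query text with SHA-256 are assumptions for illustration; they do not describe the audit format that Databricks produces.

```python
import hashlib
from datetime import datetime, timezone

def audit_record(query_text, requested_by, access_start, access_end, now=None):
    """Build a hypothetical audit entry and check the access window."""
    now = now or datetime.now(timezone.utc)
    return {
        # Fingerprint of the exact query text, so reruns can be compared.
        "query_sha256": hashlib.sha256(query_text.encode("utf-8")).hexdigest(),
        "requested_by": requested_by,
        "checked_at": now.isoformat(),
        "allowed": access_start <= now <= access_end,
    }

window_start = datetime(2020, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2020, 1, 15, tzinfo=timezone.utc)
inside = audit_record("SELECT 1", "bank_analyst", window_start, window_end,
                      now=datetime(2020, 1, 10, tzinfo=timezone.utc))
outside = audit_record("SELECT 1", "bank_analyst", window_start, window_end,
                       now=datetime(2020, 2, 1, tzinfo=timezone.utc))
```

The attempt inside the window is logged as allowed and the later one as denied, while both entries carry the same query fingerprint.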
This drastically reduced regulator and partner friction.
10. Why This Pattern Scales
| Dimension | Traditional DRE | Databricks Clean Room |
| --- | --- | --- |
| Security | Medium | Enterprise-grade |
| Auditability | Low | Native |
| Scalability | Manual | Elastic |
| Compliance | Risky | By design |
11. Key Takeaways for Data Leaders
Data collaboration is inevitable in regulated industries
The future is compute-to-data, not data sharing
Clean Rooms unlock new revenue partnerships without legal risk
This architecture is applicable to:
12. Final Thoughts
What started as a compliance workaround evolved into a blueprint for secure, scalable data partnerships.
Databricks Clean Rooms enable organizations to:
Collaborate with confidence, innovate with speed, and comply by default.
If you are designing partner data ecosystems in fintech, banking, or healthcare, Clean Rooms are no longer optional; they are foundational.