DataHacks 2026: University Alliances in Action at UCSD
How a single weekend of hands-on exposure creates the next generation of Databricks advocates
Workshop Lead: Anjana Sriram
Why University Alliances Matters in the Field
Early in my career, the tools I defaulted to were the ones I already had access to. I had a Microsoft license in college, so Excel and Power BI were second nature before I ever walked into an office.
Same story with Tableau. I had student access, built projects on it, and when it came time to recommend a BI tool professionally I didn't have to think twice. That familiarity created a bias that stuck with me for years. It wasn't that other tools were worse. I just knew these ones. I knew the shortcuts, the quirks, the workflows. And when you're a new hire trying to prove yourself, you reach for what you're confident in.
I've watched this same pattern play out with every early-career analyst and engineer I've worked with. The tools you learn first become the tools you advocate for. When a student graduates having built real projects on a platform, that platform gets carried into their first job, their first architecture decision, their first recommendation to leadership.
This is exactly the pattern the University Alliances program is designed to build on. Getting Databricks into the hands of the next generation of data professionals early (through Free Edition, workshops, and real hackathon projects) is one of the highest-leverage investments we can make in long-term platform adoption. The students at DataHacks are the analysts, engineers, and decision makers our Sales and FE teams will be working with in two to five years.
That's why events like DataHacks matter so much. Last weekend, over 400 students at UC San Diego spent 36 hours building real projects on real platforms, and Databricks was right in the middle of it. When these students graduate and walk into their first data engineering or analytics role, they won't be Googling "what is a lakehouse." They'll have built pipelines in Lakeflow, tracked models in MLflow, and explored data with Genie. That kind of hands-on experience creates organic, bottom-up momentum for platform adoption across the organizations they join.
Sameep and Ben at the Mentors table
On the Ground at UCSD
DataHacks 2026 is a 36-hour, MLH-certified hackathon hosted by UC San Diego's Data Science Student Society (DS3). Over 400 students packed into the Rec Gym to build solutions across tracks like AI/ML, Cloud Development, Data Analytics, and more. Through the University Alliances program, Databricks sponsored a dedicated challenge track for the best project implementation using the Databricks platform.
I spent 12 hours mentoring student teams throughout the weekend, helping them work through architecture decisions, debug pipelines, and get the most out of the platform. I got to speak with well over 100 students across the event, answering questions about Databricks, data engineering careers, and what working with data at scale actually looks like.
- 400+ Students
- 36 Hours
- 13 Databricks Submissions
- 12 Hours Mentoring
It Takes a Team
Our Databricks presence at DataHacks was a cross-team effort coordinated through the University Alliances program. I want to call out the people who made it happen.
Sameep Mohta
Booth Lead, Saturday
Showed up Saturday and immediately started fielding questions from dozens of students at our booth. We were way busier than the AWS table right next to us. A great problem to have, but impossible to manage alone.
Anjana Sriram
Workshop Co-Lead
Co-led the University Alliances Workshop and built an excellent hands-on project that walked students through the full functionality available in Databricks Free Edition.
Manu Mehra
Databricks Challenge Judge
Joined me in reviewing all 13 Databricks Challenge submissions, evaluating how effectively each team leveraged the platform end to end.
Ben Novak
Event Coordinator / Judge
Cross-team coordinator for the University Alliances activation at UCSD. Led mentoring shifts, co-hosted the workshop, judged the Databricks Challenge, and managed our presence across the full three-day event.
Ben Novak introducing the Databricks Notebook
The University Alliances Workshop: Zero to Lakehouse in Four Hours
As part of the hackathon, Anjana and I hosted a 4-hour University Alliances Workshop on site. Forty students showed up in person, and 38 of them had never touched Databricks before.
We started from scratch: building dummy datasets, setting up Lakeflow pipelines, training models in MLflow, and exploring AI/BI with Genie. By the end of the session, students who had zero Databricks experience were navigating the lakehouse with confidence. All on Free Edition.
This is the kind of bottom-up exposure that compounds. These students will carry that familiarity into internships, into their first full-time roles, and into the tooling conversations that shape how their teams build. For our Sales and FE teams, this means more prospects who already know and trust the platform before we ever walk into a room.
Databricks Challenge Winner: Solarify
The winning submission
The Problem
San Diego has some of the most expensive electricity in the United States, with the average household paying nearly $400/month. Solar energy is the obvious solution, but the tools to act on it have always been scattered across different platforms. Solarify brings solar placement, savings estimates, and environmental impact together in one place so you can see the full impact of your decisions.
What It Does
Solarify maps solar opportunity across every zip code in San Diego. Click any zip code to see a 10-year savings projection, CO2 impact, and a side-by-side energy bill forecast with and without solar. A what-if simulator lets you drag a slider to see in real time what happens as solar adoption increases, and a scoring system ranks every zip code by priority so you always know where to act first.
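To make the side-by-side forecast concrete, here is a minimal sketch of the arithmetic behind a with/without-solar bill projection. It is not the team's code: the function names, the compounding rate increase, and the `solar_offset` fraction are all assumptions made for illustration, and installation cost is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class BillForecast:
    year: int
    without_solar: float  # projected annual bill with no solar, in dollars
    with_solar: float     # projected annual bill with solar, in dollars


def project_bills(monthly_bill, annual_rate_increase, solar_offset, years=10):
    """Project annual bills with and without solar.

    All inputs are hypothetical: the current monthly bill in dollars,
    an assumed yearly rate increase (0.05 means 5%), and the assumed
    share of the bill that solar offsets (0.7 means 70%).
    """
    forecasts = []
    for year in range(1, years + 1):
        # The baseline bill compounds yearly with the assumed rate increase.
        baseline = monthly_bill * 12 * (1 + annual_rate_increase) ** (year - 1)
        forecasts.append(
            BillForecast(year, baseline, baseline * (1 - solar_offset))
        )
    return forecasts


def total_savings(forecasts):
    """Sum the yearly gap between the two forecasts."""
    return sum(f.without_solar - f.with_solar for f in forecasts)
```

A what-if slider then amounts to calling `project_bills` again with a different `solar_offset` and redrawing the two curves.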
Solarify on Databricks: Architecture Diagram
Architecture
The team built a full end-to-end data pipeline across four real-world datasets:
- SDG&E: Electricity consumption by zip code
- EIA: Commercial pricing, CO2 emissions, facility-level data
- NREL: Solar irradiance data across zip codes
- ZenPower: Solar permit data for ground truth on installations
All data flowed into Databricks, where it was ingested and cleaned using Delta Tables and Spark DataFrames. Two ML models handled the heavy lifting:
- Solar Opportunity Scorer: Ranks every zip code by solar potential using consumption, cost, heat intensity, and adoption gaps to support customer acquisition.
- Dual Forecast Engine: Runs status quo predictions vs. solar-based predictions so users can see exactly where the savings diverge over time.
The output feeds a FastAPI backend serving a map-based React frontend in real time.
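The post doesn't describe how the scorer works internally, so the following is only a sketch of the general idea of ranking zip codes on the four inputs named above. A plain weighted sum of min-max scaled features stands in for the team's ML model; the weights, field names, and function names are all hypothetical.

```python
def min_max(values):
    """Scale a list of numbers to the 0..1 range (all zeros if constant)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical weights, chosen only for illustration; they sum to 1.
WEIGHTS = {"consumption": 0.3, "cost": 0.3, "heat": 0.2, "adoption_gap": 0.2}


def rank_zip_codes(rows):
    """Rank zip codes from highest to lowest opportunity score.

    rows: list of dicts with a 'zip' label plus one numeric value
    for each key in WEIGHTS.
    """
    # Scale each feature across all zip codes so the weights are comparable.
    scaled = {k: min_max([r[k] for r in rows]) for k in WEIGHTS}
    scored = []
    for i, r in enumerate(rows):
        score = sum(WEIGHTS[k] * scaled[k][i] for k in WEIGHTS)
        scored.append((r["zip"], round(score, 3)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Scaling before weighting keeps a feature with large raw numbers, such as consumption in kWh, from drowning out a fraction like the adoption gap.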
The team's architecture diagram: from raw data sources through Databricks' Bronze/Silver/Gold medallion layers, MLflow models, and out to the FastAPI + React frontend.
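For readers new to the medallion pattern, the Bronze/Silver/Gold flow can be sketched in plain Python: raw rows land as-is (Bronze), get renamed and typed into a shared schema (Silver), then get aggregated for consumers (Gold). This is an illustration of the pattern only. The team used Delta tables and Spark DataFrames, and the source names and column mappings below are invented.

```python
from collections import defaultdict

# Hypothetical per-source column mappings into one shared schema.
COLUMN_MAPS = {
    "utility": {"ZipCode": "zip", "TotalkWh": "kwh"},
    "permits": {"zip_code": "zip", "system_kw": "solar_kw"},
}


def to_silver(source, bronze_rows):
    """Rename columns to the shared schema, coerce types, drop rows with no zip."""
    mapping = COLUMN_MAPS[source]
    silver = []
    for raw in bronze_rows:
        # Keep only mapped columns; anything else stays behind in Bronze.
        row = {mapping[k]: v for k, v in raw.items() if k in mapping}
        zip_code = str(row.get("zip") or "").strip()
        if not zip_code:
            continue
        clean = {"zip": zip_code}
        for key, value in row.items():
            if key != "zip":
                clean[key] = float(value)
        silver.append(clean)
    return silver


def to_gold(silver_rows):
    """Aggregate Silver rows: sum every numeric column per zip code."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in silver_rows:
        for key, value in row.items():
            if key != "zip":
                totals[row["zip"]][key] += value
    return {zip_code: dict(cols) for zip_code, cols in totals.items()}
```

Keeping the raw Bronze rows untouched means a bad mapping can be fixed and the Silver and Gold layers rebuilt without re-fetching the source files.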
UCSD DataHacks 2026, Databricks Challenge Winners
What Made It Special
This was a two-person team that wrangled 52 quarterly CSV files across four completely different data schemas, trained two ML models, built a full backend, and shipped a polished map-based frontend, all in 36 hours. They learned Databricks from scratch during the hackathon, picking up Delta Tables, Spark DataFrames, and Mosaic AI on the fly. That kind of ambition and execution is exactly what the Databricks Challenge was designed to reward.
Get Involved with University Alliances
If you're on a Sales or FE team and want to build grassroots awareness with the next generation of data professionals, the University Alliances program is the best place to start.
Hackathons, workshops, and Free Edition give students hands-on experience that turns into long-term platform advocacy. Reach out to the University Alliances team or ping me directly if you want to get involved in upcoming events.