Databricks recently released Genie Code. It sounds very promising. We know it's a Databricks product, so it should come as no surprise that it works within Databricks. In this experiment, we want to see how much effort it takes to migrate someone else's code and onboard it to Databricks. We will take the latest autoresearch project by Andrej Karpathy.
Having done a similar migration last year with Gemini (https://www.tredence.com/blog/game-arena-deepmind-on-databricks), I know it takes about an evening's work to migrate with LLMs that don't understand Databricks, at least back when skills were not available.
While I am confident that Genie Code will live up to expectations, I want to measure its effectiveness along the following dimensions:
- How many prompts do I need to use to achieve my goal?
- How long does the prompt need to be? Do I need to create a very detailed project plan, or will it just read my mind?
- Will the first run be successful?
- Does it recover from errors automatically?
Let's find out.
First of all, we need to clone the repo, and I am glad to do this step manually: https://github.com/karpathy/autoresearch

The second step is to “design” a prompt. I'd say this part worried me, but I tried to start with something simple and improve it from there. Below is the prompt I used:
“Can you convert autoresearch using databricks FMAPI and run a research topic.
Make sure you track the chain of thoughts using MLflow and log the experiment.”
Having used many agents in the past year, I know for a fact that the agent won't know a few things:
- The syntax for connecting to FMAPI
- It is not aware of the Unity Catalog (UC) version of MLflow
- It definitely doesn't know about tracing, which I didn't mention in the prompt
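For context on the first point: connecting to FMAPI typically means POSTing a chat-style payload to a model serving endpoint. The sketch below is my own illustration of that call shape, not code from this experiment; the host, token, and endpoint name are placeholders.

```python
import json
import urllib.request

def build_payload(prompt: str, max_tokens: int = 512) -> dict:
    """Build a chat-completions style request body for a serving endpoint."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def call_fmapi(host: str, endpoint: str, token: str, prompt: str) -> dict:
    """POST the prompt to <host>/serving-endpoints/<endpoint>/invocations.

    Illustrative sketch only: host, endpoint, and token are placeholders.
    """
    url = f"{host}/serving-endpoints/{endpoint}/invocations"
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

An agent without Databricks knowledge tends to guess at exactly this kind of URL shape and auth header, which is why I listed it as a likely failure point.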
With the code and the above prompt in place, I hit enter. I expected some clarifying questions and failures, the kind of questions even the most senior data scientist would ask. For example:
- Where is the autoresearch code located?
- Where is the endpoint located?
- Which function do I want to log?
- Where should I log the experiment?
But we will let Genie Code figure it out or allow it to ask follow-up questions. Below is the output:

It took a while, but as we can see, everything ran automatically, and it was able to find the code and modify it.
The results? Yes, it was able to run an experiment and log the traces into MLflow! We can see below that it ran for five iterations and produced a result in the sixth step, which is clean and organized. It also did some testing by itself in the first step to make sure the logging was working.

We can open up the final step and see the research output!

What's more, Genie Code also summarizes the iterations to determine which ones to keep and which to drop, aligning with the README.

Conclusion
We now have the answer. Genie Code is not a simple rebrand of Assistant; it is built with a deep understanding of Databricks. To answer my own questions:
- How many prompts do I need to use to achieve my goal?
One prompt only. I did not even try to improve it.
- How long does the prompt need to be? Do I need to create a very detailed project plan, or will it just read my mind?
I wrote two sentences. Very short.
- Will the first run be successful?
Yes: I can see the output.
- Does it recover from errors automatically?
Yes: Genie Code automatically recovers from errors and determines the next step!
This is a huge step forward for any code migration and onboarding to Databricks. No prior expert knowledge is required, and a single prompt is all you need!