Hi @greengil, good question. I went through something similar recently, so I'm sharing what I found.
My instinct was also to build it in Python, but once I dug in, the "just write a script" path hides a lot of pain:
- Deletions are invisible. Jira's REST API doesn't return deleted issues. Without webhooks, you'll have ghost records in Delta forever.
- Field history isn't free. By default the API gives you current state, not change history; you have to request each issue's changelog separately. Reporting usually needs history, which means building and maintaining it yourself.
- Archived issues aren't returned in JQL queries, only by ID.
- Rate limits, pagination, and schema drift for custom fields are all real work.
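To make the pagination, rate-limit, and deletion points concrete, here is a rough Python sketch of what the custom path involves. It is not tied to a real client: `fetch_page`, `RateLimited`, `iter_issues`, and `find_ghost_ids` are all hypothetical names, and the page shape (`issues` plus `total`, offset-style paging) is an assumption you would adapt to whichever Jira search endpoint you call.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Set


class RateLimited(Exception):
    """Hypothetical error the fetch callable raises on an HTTP 429."""

    def __init__(self, retry_after: float = 1.0):
        super().__init__("rate limited")
        self.retry_after = retry_after


def iter_issues(
    fetch_page: Callable[[int, int], Dict],
    page_size: int = 50,
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict]:
    """Yield every issue, paging by offset and backing off when rate limited.

    fetch_page(start_at, page_size) is assumed to return a dict shaped like
    {"issues": [...], "total": int}.
    """
    start_at = 0
    while True:
        for attempt in range(max_retries):
            try:
                page = fetch_page(start_at, page_size)
                break
            except RateLimited as exc:
                # Exponential backoff scaled by the server's retry hint.
                sleep(exc.retry_after * (2 ** attempt))
        else:
            raise RuntimeError("gave up after repeated rate limiting")

        issues = page["issues"]
        yield from issues
        start_at += len(issues)
        if not issues or start_at >= page["total"]:
            return


def find_ghost_ids(delta_ids: Iterable[str], live_ids: Iterable[str]) -> Set[str]:
    """IDs present in the Delta table but no longer returned by Jira."""
    return set(delta_ids) - set(live_ids)
```

You would run `find_ghost_ids` after a full scan and soft-delete the result in Delta. Note the catch from the list above: if the scan is JQL-based, archived issues will also look like ghosts, so you would need to check those by ID before deleting anything.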
Fivetran's Jira connector handles all of this natively: JQL-based incremental sync, webhook-based deletion capture, auto-populated ISSUE_FIELD_HISTORY tables, schema drift detection, and MERGE into Delta. It's also available through Databricks Partner Connect for quick setup, and there's a free dbt package (fivetran/dbt_jira) with pre-built analytics models.
My take: I'd suggest going with Fivetran unless you have a specific reason not to, such as cost concerns at high volume, a need for archived issues, or data residency restrictions. Custom Python makes sense for narrow use cases, but it's weeks of build time plus ongoing maintenance.
Happy to dig in further if you're leaning one way.