Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Delta Jira data import to Databricks

greengil
New Contributor III

We need to import a large amount of Jira data into Databricks, importing only the delta changes. What's the best approach: the Fivetran Jira connector, or developing our own Python scripts/pipeline code? Thanks.

4 REPLIES

Ashwin_DSA
Databricks Employee

Hi @greengil,

Have you considered Lakeflow Connect?  Databricks now has a native Jira connector in Lakeflow Connect that can achieve what you are looking for. It's in beta, but something you may want to consider. 

It ingests Jira into Delta with incremental (delta) loads out of the box, supports SCD1/SCD2, handles deletes via audit logs, and runs fully managed on serverless with Unity Catalog governance. This is lower-effort and better integrated than either Fivetran or custom Python, and directly addresses your large-volume, changes-only requirement.
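As a rough mental model of what SCD1 vs SCD2 means for an ingested issues table, here is a pure-Python sketch (illustrative only; the connector actually implements this with Delta MERGE, and the function and field names below are my own, not the connector's API):

```python
# SCD1: overwrite in place -- only the latest state of each issue survives.
# SCD2: keep history -- end-date the old version and append a new one.

def scd1_apply(table: dict, change: dict) -> None:
    """Upsert the change, discarding the previous state (no history)."""
    table[change["key"]] = change

def scd2_apply(history: list, change: dict, ts: str) -> None:
    """Close the currently-open version of the row, then append a new
    version stamped with valid_from/valid_to columns."""
    for row in history:
        if row["key"] == change["key"] and row["valid_to"] is None:
            row["valid_to"] = ts  # end-date the old version
    history.append({**change, "valid_from": ts, "valid_to": None})
```

With SCD2 you can answer "what was this issue's status in January?", which SCD1 cannot.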

If you can't use the Databricks Jira connector, prefer Fivetran Jira --> Databricks over custom code for a managed, low-maintenance ELT path. Only build custom Python pipelines if you have very specific requirements that neither managed option can meet.

If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

greengil
New Contributor III

Hi @Ashwin_DSA - Thank you for the information, appreciate it. Regarding the built-in Lakeflow Connect, I see that it will ingest all the Jira tables into Databricks. Is there a way to ingest only a subset of the data? For example, instead of all issues, I want only a subset. Thanks.

Ashwin_DSA
Databricks Employee

Hi @greengil,

Yes, you can restrict what Lakeflow Connect for Jira ingests, both at the table level and (partially) at the row level.

In the UI, on the Source step, you can select only the tables you care about (for example, just issues, or issues + projects) instead of all source tables. In DABs/API, only list the tables you want under objects.

The Jira connector supports filtering by Jira project/space via jira_options.include_jira_spaces (list of project keys). In the UI, this is exposed as an option to filter the data by Jira spaces or projects (you enter project keys, not names or IDs).
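To make the two knobs above concrete, a sketch of what the pipeline spec might look like in a Databricks Asset Bundle. Note this is illustrative only: the thread confirms the `objects` list and `jira_options.include_jira_spaces`, but the surrounding field names are my assumptions; check the Lakeflow Connect Jira docs for the exact spec.

```yaml
# Hypothetical DABs fragment: ingest only two tables, and only two projects.
ingestion_definition:
  objects:
    - table:
        source_table: issues
    - table:
        source_table: projects
  jira_options:
    include_jira_spaces:   # project KEYS, not names or IDs
      - PROJ
      - OPS
```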

If you are looking for anything more granular than project/space (e.g. specific issue types, statuses, or labels), that isn't supported as of now. The connector ingests all matching issues for those projects/spaces, and you then filter downstream in silver/gold tables. More general row-level filtering for Jira is on the backlog but not yet available.

Refer to the documentation pages on the Jira pipeline and its limitations.

If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

abhi_dabhi
Databricks Partner

Hi @greengil - good question. I went through something similar recently, so I'm sharing what I found.

My instinct was also to build it in Python, but once I dug in, the "just write a script" path hides a lot of pain:

  • Deletions are invisible. Jira's REST API doesn't return deleted issues. Without webhooks, you'll have ghost records in Delta forever.
  • Field history isn't free. The API gives you current state, not change history. Reporting usually needs history, which means building and maintaining it yourself.
  • Archived issues aren't returned in JQL queries, only by ID.
  • Rate limits, pagination, and schema drift for custom fields are all real work.
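To give a feel for the "just write a script" path, here is a minimal sketch of the incremental-pull core: a JQL query that captures only changes since the last sync, plus a pagination loop over Jira's startAt/maxResults scheme. Function names are illustrative, and this deliberately omits the hard parts listed above (deletes, field history, rate-limit backoff):

```python
from datetime import datetime

def build_incremental_jql(last_sync: datetime, project_keys=None) -> str:
    """JQL that fetches only issues updated since the last sync."""
    ts = last_sync.strftime("%Y-%m-%d %H:%M")
    jql = f'updated >= "{ts}" ORDER BY updated ASC'
    if project_keys:
        jql = f"project IN ({', '.join(project_keys)}) AND " + jql
    return jql

def paginate(fetch_page, page_size=100):
    """Iterate all issues via Jira's startAt/maxResults pagination.
    fetch_page is any callable wrapping the REST search endpoint."""
    start = 0
    while True:
        page = fetch_page(start_at=start, max_results=page_size)
        issues = page.get("issues", [])
        yield from issues
        start += len(issues)
        if not issues or start >= page.get("total", 0):
            break
```

Even this toy version needs a persisted `last_sync` watermark per table, which is exactly the bookkeeping the managed connectors do for you.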

Fivetran's Jira connector handles all of this natively: JQL-based incremental sync, webhook-based deletion capture, auto-populated ISSUE_FIELD_HISTORY tables, schema drift detection, and MERGE into Delta. It's available through Databricks Partner Connect for quick setup, and there's also a free dbt package (fivetran/dbt_jira) with pre-built analytics models.
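To see why deletion capture matters, a pure-Python sketch of what a sync with delete events does, versus a plain incremental pull that never learns about deletions (this is a mental model only, not Fivetran's actual Delta MERGE implementation; the `_deleted` soft-delete flag is my own convention):

```python
def apply_sync(table: dict, upserts: list, delete_keys: list) -> None:
    """Apply one sync batch: upsert changed rows, then soft-delete rows
    whose keys were reported by deletion webhooks. Without delete_keys,
    deleted issues would linger as ghost records forever."""
    for row in upserts:
        table[row["key"]] = row            # insert or update (SCD1-style)
    for key in delete_keys:
        row = table.get(key)
        if row is not None:
            row["_deleted"] = True         # soft delete keeps an audit trail
```

A hard delete (`del table[key]`) is the other option; soft deletes are common because downstream history tables can still join against the row.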

My take: go with Fivetran unless you have a specific reason not to (high-volume cost concerns, a need for archived issues, or data residency restrictions). Custom Python makes sense for narrow use cases, but it's weeks of build plus ongoing maintenance.

I did some research before landing on this conclusion; please take a look, I think you'll find it helpful.

Happy to dig in further if you're leaning one way.