Databricks Platform Discussions
Dive into comprehensive discussions covering various aspects of the Databricks platform. Join the conversation to deepen your understanding and maximize your usage of the Databricks platform.

Browse the Community

Data Engineering

Join discussions on data engineering best practices, architectures, and optimization strategies with...

12242 Posts

Data Governance

Join discussions on data governance practices, compliance, and security within the Databricks Commun...

530 Posts

Generative AI

Explore discussions on generative artificial intelligence techniques and applications within the Dat...

379 Posts

Machine Learning

Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithm...

1024 Posts

Warehousing & Analytics

Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Communi...

684 Posts

Activity in Databricks Platform Discussions

by greengil, New Contributor III
  • 48 Views
  • 1 reply
  • 0 kudos

Delta Jira data import to Databricks

We need to import a large amount of Jira data into Databricks, and should import only the delta changes. What's the best approach: using the Fivetran Jira connector, or developing our own Python scripts/pipeline code? Thanks.

Latest Reply
Ashwin_DSA
Databricks Employee
  • 0 kudos

Hi @greengil, have you considered Lakeflow Connect? Databricks now has a native Jira connector in Lakeflow Connect that can achieve what you are looking for. It's in beta, but something you may want to consider. It ingests Jira into Delta with incr...
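For the "develop our own Python scripts" option from the original question, a minimal sketch of incremental (delta-only) Jira extraction might look like the following. It assumes Jira's standard REST search endpoint and a JQL `updated >=` filter; the project key, base URL, and auth are placeholders, and the sync watermark would need to be persisted (e.g. in a Delta table) between runs:

```python
from datetime import datetime


def build_incremental_jql(project_key: str, last_sync: datetime) -> str:
    """JQL that selects only issues updated since the last successful sync."""
    # Jira's JQL accepts timestamps in "yyyy-MM-dd HH:mm" format.
    ts = last_sync.strftime("%Y-%m-%d %H:%M")
    return f'project = {project_key} AND updated >= "{ts}" ORDER BY updated ASC'


def fetch_changed_issues(base_url, auth, jql, page_size=100):
    """Page through Jira's /rest/api/2/search endpoint, yielding raw issue dicts."""
    import requests

    start_at = 0
    while True:
        resp = requests.get(
            f"{base_url}/rest/api/2/search",
            params={"jql": jql, "startAt": start_at, "maxResults": page_size},
            auth=auth,
            timeout=30,
        )
        resp.raise_for_status()
        payload = resp.json()
        issues = payload.get("issues", [])
        yield from issues
        start_at += len(issues)
        if not issues or start_at >= payload.get("total", 0):
            break
```

The yielded issues could then be appended to a Delta table and the watermark advanced, which is essentially what a managed connector automates for you.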

by muaaz, Visitor
  • 71 Views
  • 2 replies
  • 1 kudos

Registering Delta tables from external storage (GCS, S3, Azure Blob) in Databricks Unity Catalog

Hi everyone, I am currently working on a migration project from Azure Databricks to GCP Databricks, and I need some guidance from the community on best practices around registering external Delta tables into Unity Catalog. Currently I am doing this but...

Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @muaaz, on GCP Databricks the SQL pattern you are using is fine, but the recommended best practice is to back it with a Unity Catalog external location instead of pointing tables directly at arbitrary gs:// paths. In practice, that means first cr...
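The pattern the reply describes can be sketched in SQL roughly as follows. All names (credential, location, catalog/schema/table, bucket path) are placeholders, and the storage credential for GCS must already exist:

```sql
-- 1) One-time: register the external location backed by an existing storage credential
CREATE EXTERNAL LOCATION IF NOT EXISTS gcs_landing
  URL 'gs://my-bucket/delta'
  WITH (STORAGE CREDENTIAL my_gcp_credential);

-- 2) Then register each Delta table against a path under that location
CREATE TABLE IF NOT EXISTS main.bronze.events
  USING DELTA
  LOCATION 'gs://my-bucket/delta/events';
```

With the external location in place, Unity Catalog governs access to the path, rather than each table carrying its own ad-hoc storage permissions.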

1 More Replies
by prakharsachan, New Contributor
  • 44 Views
  • 2 replies
  • 0 kudos

Accessing secrets (secret scope) in a pipeline YAML file

How can I access secrets in the pipeline YAML, or directly in a Python script file?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @prakharsachan, in Declarative Automation Bundles YAML (formerly known as Databricks Asset Bundles) you can only define secret scopes. If you want to read secrets from a secret scope, you can use dbutils in a Python script: password = dbutils.secrets.ge...
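A sketch of the split the reply describes, assuming a recent bundle schema version that supports the `secret_scopes` resource (the scope name is a placeholder):

```yaml
# databricks.yml (fragment) — the bundle can only *define* the scope
resources:
  secret_scopes:
    my_scope:
      name: my-secret-scope
```

Reading the secret value then happens at runtime in the task code, not in YAML, e.g. `password = dbutils.secrets.get(scope="my-secret-scope", key="db-password")` in the Python script.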

1 More Replies
by 200649021, New Contributor II
  • 303 Views
  • 1 reply
  • 1 kudos

Data System & Architecture - PySpark Assignment

Title: Spark Structured Streaming – Airport Counts by Country. This notebook demonstrates how to set up a Spark Structured Streaming job in Databricks Community Edition. It reads new CSV files from a Unity Catalog volume, processes them to count airport...

Latest Reply
amirabedhiafi
  • 1 kudos

That's cool! Why not put it on Git?

by RPalmer, Contributor
  • 200 Views
  • 7 replies
  • 0 kudos

Unable to connect to any cluster from a notebook

I'm experiencing an unusual issue following my return from annual leave. I'm unable to connect to any compute from a notebook (both Classic Compute and Serverless), despite having Can Manage permissions on the clusters. The error shown is: "Unk...

Latest Reply
alex1234
Visitor
  • 0 kudos

I'm also having the same issue.

6 More Replies
by prakharsachan, New Contributor
  • 56 Views
  • 1 reply
  • 1 kudos

pipeline config DAB

I am deploying a DLT pipeline in a dev environment using DABs. The source code is in a Python script file. In the pipeline's YAML file the configuration key is set to true (with all correct indentation), yet the pipeline isn't deploying in continuous mode....

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @prakharsachan, continuous must be set inside the pipeline resource definition, not under configuration. The configuration block in an SDP (former DLT) pipeline definition is for Spark/pipeline settings (key-value string pairs passed to the runtime)...
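A minimal sketch of where each key belongs in the bundle YAML; the pipeline and setting names are placeholders:

```yaml
# databricks.yml (fragment)
resources:
  pipelines:
    my_pipeline:
      name: my-pipeline
      continuous: true        # pipeline-level property: lives on the resource itself
      configuration:          # only string key/value Spark/pipeline settings go here
        my.custom.key: "value"
```

Placing `continuous: true` under `configuration` instead would just pass it through as an opaque runtime setting, which is why the pipeline keeps deploying in triggered mode.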

by tsam, Visitor
  • 54 Views
  • 1 reply
  • 0 kudos

Driver memory utilization grows continuously during job

I have a batch job that runs thousands of Deep Clone commands; it uses a ForEach task to run multiple Deep Clones in parallel. It was taking a very long time, and I realized that the driver was the main culprit, since it was using up all of its memory ...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @tsam, I think your problem might be caused by the fact that each "CREATE OR REPLACE TABLE ... DEEP CLONE" call accumulates state on the driver even though you're not collecting data. The main culprits are: 1. Spark Plan / Query Plan Caching: Every S...
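One common mitigation for this kind of driver-state accumulation (a sketch, not the author's exact fix) is to run the clones in bounded batches and clear driver-side state between batches. The batching skeleton itself is plain Python; on Databricks, `run_one` would be something like `lambda sql: spark.sql(sql)` and `on_batch_end` could call `spark.catalog.clearCache()`:

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_batches(commands, run_one, batch_size=50, parallelism=8, on_batch_end=None):
    """Execute `commands` with bounded parallelism, in batches, so that
    driver-side state can be cleared between batches via `on_batch_end`."""
    results = []
    for i in range(0, len(commands), batch_size):
        batch = commands[i:i + batch_size]
        with ThreadPoolExecutor(max_workers=parallelism) as pool:
            # pool.map preserves input order within the batch
            results.extend(pool.map(run_one, batch))
        if on_batch_end is not None:
            on_batch_end()  # hook: clear caches, checkpoint progress, etc.
    return results
```

Bounding parallelism also keeps the driver from holding thousands of in-flight command states at once, which is often what exhausts its memory in ForEach-style fan-outs.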

by IM_01, Contributor III
  • 75 Views
  • 4 replies
  • 0 kudos
Latest Reply
IM_01
Contributor III
  • 0 kudos

@Ashwin_DSA, could you please provide an example?

3 More Replies
by ChristianRRL, Honored Contributor
  • 301 Views
  • 6 replies
  • 2 kudos

Resolved! Get task_run_id that is nested in a job_run task

Hi, I'm wondering if there is an easier way to accomplish this. I can use a Dynamic Value reference to pull the run_id of Parent 1 into Parent 2; however, what I'm looking for is for Child 1's task run_id to be referenced within Parent 2. Currently I am ...

Latest Reply
anuj_lathi
Databricks Employee
  • 2 kudos

Hi @ChristianRRL, you're absolutely right, and I apologize for the earlier suggestion. I've verified that task values from child jobs are not propagated back through run_job tasks. Your instinct about the REST API was correct. Here's the fix: Solutio...
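The REST-API route the reply points at can be sketched as: fetch the child job's run with `GET /api/2.1/jobs/runs/get` and read each task's `run_id` out of the response's `tasks` array. The endpoint and field names are the Jobs API's; the host and token here are placeholders:

```python
def child_task_run_ids(run_payload: dict) -> dict:
    """Map task_key -> run_id from a Jobs API `runs/get` response."""
    return {t["task_key"]: t["run_id"] for t in run_payload.get("tasks", [])}


def fetch_run(host: str, token: str, run_id: int) -> dict:
    """Call GET /api/2.1/jobs/runs/get for one job run."""
    import requests

    resp = requests.get(
        f"{host}/api/2.1/jobs/runs/get",
        headers={"Authorization": f"Bearer {token}"},
        params={"run_id": run_id},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```

Parent 2 would first resolve the child job's run_id (e.g. from the `{{tasks.<name>.run_id}}` dynamic value of the run_job task), call `fetch_run`, then look up the nested task it needs in the returned mapping.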

5 More Replies
by ChristianRRL, Honored Contributor
  • 144 Views
  • 2 replies
  • 2 kudos

Resolved! Get task_run_id (or job_run_id) of a *launched* job_run task

Hi there, I'm finding this a bit trickier than originally expected and am hoping someone can help me understand if I'm missing something. I have 3 jobs: one orchestrator job (tasks are type run_job) and two "Parent" jobs (tasks are type notebook). parent1 run...

Latest Reply
emma_s
Databricks Employee
  • 2 kudos

Hi, I ran into the same confusion and did some testing on this. Here's what I found: Task values don't cross the run_job boundary. So even if child1 sets a task value with dbutils.jobs.taskValues.set(), the orchestrator can't read it. But {{tasks.par...

1 More Replies
by abhishek0306, New Contributor
  • 156 Views
  • 4 replies
  • 0 kudos

Databricks file based trigger to sharepoint

Hi, can we create a file-based trigger from a SharePoint location for Excel files in Databricks? My need is to copy the Excel files from SharePoint to external volumes in Databricks, so can it be done using a trigger such that whenever the file drops in ...

Latest Reply
rohan22sri
New Contributor II
  • 0 kudos

File-based triggers in Databricks are designed to work with data that already resides in cloud storage (such as ADLS, S3, or GCS). In this case, since the source system is SharePoint, expecting a native file-based trigger from Databricks is not feasi...
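The copy step the reply implies (SharePoint into cloud storage first, then let Databricks' file arrival trigger fire) could be sketched with Microsoft Graph. The `/drives/{drive-id}/root/children` listing endpoint is Graph's; the drive ID, token, and the idea of filtering only Excel files are placeholders/assumptions for illustration:

```python
import posixpath


def excel_items(children_payload: dict) -> list:
    """Pick out .xlsx/.xls file names from a Graph /children listing."""
    out = []
    for item in children_payload.get("value", []):
        name = item.get("name", "")
        # Graph marks files with a "file" facet and folders with "folder"
        if "file" in item and posixpath.splitext(name)[1].lower() in (".xlsx", ".xls"):
            out.append(name)
    return out


def list_drive_children(token: str, drive_id: str, folder: str = "root") -> dict:
    """GET the children of a SharePoint drive folder via Microsoft Graph."""
    import requests

    url = f"https://graph.microsoft.com/v1.0/drives/{drive_id}/{folder}/children"
    resp = requests.get(url, headers={"Authorization": f"Bearer {token}"}, timeout=30)
    resp.raise_for_status()
    return resp.json()
```

A scheduled job (Azure Function, Logic App, or a small Databricks job) would download each matched file and write it to the external volume's cloud path, at which point the file arrival trigger takes over.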

3 More Replies
by iamgoce, New Contributor III
  • 676 Views
  • 3 replies
  • 2 kudos

External embedding for reports using federated credentials fails

Hi, we are implementing external dashboard embedding in Azure Databricks and want to avoid using client secrets by leveraging **Azure Managed Identity** with **OAuth token federation** for generating the embedded report token. Following OAuth token fed...

Latest Reply
gsbence
Visitor
  • 2 kudos

Did you get any information on whether this is on their roadmap? I came across this issue last week, and the documentation doesn't mention this limitation.

2 More Replies
by thomasm, New Contributor III
  • 722 Views
  • 7 replies
  • 3 kudos

MLflow Detailed Trace view doesn't work in some workspaces

I've created a Databricks Model Serving Endpoint which serves an MLflow Pyfunc model. The model uses LangChain, and I'm using mlflow.langchain.autolog(). At my company we have some production(-like) workspaces where users cannot e.g. run notebooks and ...

Latest Reply
lkt1
New Contributor III
  • 3 kudos

Funnily enough, the problem also disappeared on my end this morning. Previously, I saw a networking issue in my logs, but that also went away. Let's hope it stays that way!

6 More Replies
by krishna007, New Contributor II
  • 296 Views
  • 4 replies
  • 2 kudos

Resolved! What is the best way to use Unity catalog with medallion architecture using ADLS2

Hi, I am using a medallion architecture on Azure Data Lake Storage Gen2 with Azure Databricks. Currently, I am storing data in Parquet format (not Delta tables), and I am planning to implement Unity Catalog (UC). As part of this setup, I understand tha...

Latest Reply
krishna007
New Contributor II
  • 2 kudos

I was going to follow the 3rd option, but it violates our medallion architecture, and we don't have that much data to separate it physically, so I'm going with the 1st approach. Thank you very much @karthickrs, I'll keep this in mind.

3 More Replies
by Akshatkumar69, New Contributor
  • 168 Views
  • 3 replies
  • 1 kudos

Resolved! Metric views joins

I am currently working on a migration project from Power BI to AI/BI dashboards in Databricks. Now I am using metric views to create all the measures and DAX queries which I have in my Power BI report in YAML in the metric views, but the main prob...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Hey @Akshatkumar69, welcome to the community. You're not alone on this one; it is common with folks coming from Power BI. The key thing to understand is that AI/BI charts do expect a single data source, but that source can be a metric view that alrea...

2 More Replies