Databricks Platform Discussions
Dive into comprehensive discussions covering various aspects of the Databricks platform. Join the co...
Engage in vibrant discussions covering diverse learning topics within the Databricks Community. Expl...
How do I start in Databricks by creating a notebook and use it to run a simple data processing task (a Spark job)?
I'm trying to add tables to an existing SQL Server CDC ingestion pipeline, and today I am getting this mysterious error message: "Failed to edit ingestion pipeline. PostgreSQL slot name cannot be empty or null". I have not encountered this before. Is this simp...
After I posted this I noticed that the gateway compute for this pipeline was repeatedly failing and retrying. This was resolved by increasing our quota of "Standard FS Family" compute on Azure. And when that was resolved the above error also disappea...
I am trying to create a simple Databricks custom app, but I am getting the error "Error: Could not import 'app'." app.yaml file: env: - name: FLASK_APP value: '/Workspace/Users/sam@xxx.com/databricks_apps/hello-world_2025_11_13-16_19/Gaap_commentry/app' comm...
It seems you are combining a file path in FLASK_APP with running a file via command. When FLASK_APP is set to a full path, Flask expects that path to point to a Python file (e.g., app.py) or package that contains the application instance. Correc...
Hi, how do I disable Data Apps on my workspace? It is really annoying that Databricks pushes new features without any option to disable them. At least you should have some tools to control access before rolling it out. It seems you only care about fe...
@Raman_Unifeye , I don't have visibility into the roadmap. However, if you are a customer you can always log a feature request. Cheers, Louis.
Hi, I want to run a dbt workflow task and would like to use the Git integration for that. Using my personal user I am able to do so, but I am running my workflows using a service principal. I added Git credentials and the repository using Terraform. I a...
Alternatively, here is another approach you could use: configure your tasks with relative paths to notebooks and deploy all of them with DAB. Your job will then reference the deployed notebook directly, with no need to access Git from jobs/notebooks. That is deleg...
What is the best way to know which kind of join was used for a SQL query: broadcast, shuffle hash, or sort merge? How can the Spark UI or the query plan be interpreted?
Hello @smoortema , here are some helpful tips and tricks. Here’s how to quickly determine which join strategy Spark used—between broadcast hash join, shuffle hash join, and sort-merge join—and how to read both the query plan and the Spark UI to ver...
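As a rough illustration of reading the plan, the sketch below scans physical-plan text (for example, what `df.explain()` prints or what an `EXPLAIN` statement returns) for Spark's physical join operator names. The helper `find_join_strategies` and the sample plan string are hypothetical, written for this example; they are not part of Spark.

```python
import re

# Physical join operator names as they appear in Spark physical plans.
JOIN_OPERATORS = {
    "BroadcastHashJoin": "broadcast hash join",
    "ShuffledHashJoin": "shuffle hash join",
    "SortMergeJoin": "sort-merge join",
    "BroadcastNestedLoopJoin": "broadcast nested loop join",
    "CartesianProduct": "cartesian product",
}

_JOIN_PATTERN = re.compile(r"\b(" + "|".join(JOIN_OPERATORS) + r")\b")


def find_join_strategies(plan_text: str) -> list[str]:
    """Return the join strategies named in the plan text, in order of appearance."""
    return [JOIN_OPERATORS[m.group(1)] for m in _JOIN_PATTERN.finditer(plan_text)]


# Hypothetical, abbreviated plan text used only to exercise the helper.
sample_plan = """
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- Project [id#0L, name#5]
   +- BroadcastHashJoin [id#0L], [id#4L], Inner, BuildRight, false
      :- Range (0, 100, step=1, splits=8)
      +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]),false)
"""

print(find_join_strategies(sample_plan))
```

Note that a formatted plan can mention the same operator more than once (in the tree and again in the per-node details), so repeated entries in the result do not necessarily mean multiple joins.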
Hello, I have a problem. When I try to run the MLlib Assembler (from pyspark.ml.feature import VectorAssembler) I get this error and I don't know what to do anymore. Please help.
Do you plan to support this in Serverless Free Edition? Migration from Community Edition to Serverless has been fraught with these limitations.
Hello, I created a Multi Agent Supervisor yesterday and it has been working fine so far, but when I created a second Multi Agent Supervisor assistant, I am facing the error below: Endpoint update failed. Failed to deploy: Quota Exceeded: You've hit the ...
Hello @shivamrai162! The error is from a workspace quota, not billing. You’ve hit the Model Serving provisioned concurrency quota, which is enforced independently of your remaining trial credits. That’s why you can still have $200 left and see a quot...
Hello, I am building a data pipeline which extracts data from Oracle Fusion and pushes it to Databricks Delta Lake. I am using the Bronze, Silver and Gold approach. Could someone please help me with how to control all three segments, that is Bronze, Silver and Gold, wit...
Here’s how you can implement DQ at each stage:
Bronze Layer
Checks:
- File format validation (CSV, JSON, etc.).
- Schema validation (column names, types).
- Row count vs. source system.
Tools:
- Use Databricks Auto Loader with schema evolution and badRecordsPath
Impl...
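The Bronze-layer checks listed above (schema validation and row count vs. source) can be sketched in plain Python. This is only an illustration of the logic, assuming rows arrive as dictionaries; in a real pipeline the same checks would run on Spark DataFrames. `EXPECTED_SCHEMA`, `split_by_schema`, `row_count_matches`, and the sample rows are all hypothetical names and data made up for this example.

```python
# Hypothetical expected schema: column name -> Python type.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "currency": str}


def split_by_schema(rows, schema):
    """Separate rows matching the expected column names and types from the rest."""
    good, bad = [], []
    for row in rows:
        names_ok = set(row) == set(schema)
        types_ok = names_ok and all(
            isinstance(row[col], expected) for col, expected in schema.items()
        )
        (good if types_ok else bad).append(row)
    return good, bad


def row_count_matches(source_count, good, bad):
    """Every source row should be accounted for as either loaded or quarantined."""
    return source_count == len(good) + len(bad)


# Made-up sample rows: one valid, one with a wrong type, one with a missing column.
rows = [
    {"order_id": 1, "amount": 9.5, "currency": "EUR"},
    {"order_id": "2", "amount": 3.0, "currency": "USD"},
    {"order_id": 3, "amount": 1.0},
]

good, bad = split_by_schema(rows, EXPECTED_SCHEMA)
print(len(good), len(bad), row_count_matches(3, good, bad))
```

Routing the rejected rows into a separate collection, instead of dropping them, mirrors the idea of keeping bad records aside so that the row-count reconciliation against the source still adds up.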
Can anyone help with an official practice exam set for the Databricks Certified Data Engineer Professional exam, like the one below for the Databricks Certified Data Engineer Associate? Practice exam for the Databricks Certified Data Engineer Associate exam
Hi Databricks Community, I’m trying to deploy a model serving endpoint that uses Databricks Feature Store (Unity Catalog, online tables). My offline and online feature tables are created and visible in Databricks. The model is logged with FeatureEnginee...
Root cause in plain English: the lookup client is trying to read SQL-style credentials like PREFIX_USER/PREFIX_PASSWORD for a third‑party online store, and the “prefix” is empty, so it searches for “_USER” and fails. That auth scheme applies only to th...
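To make the failure mode concrete, here is a minimal sketch of how an empty prefix turns `PREFIX_USER` into `_USER`. It is not the lookup client's actual code; `credential_env_names`, `read_credentials`, and the `ONLINE` prefix are hypothetical and only mimic the naming pattern described above.

```python
def credential_env_names(prefix: str) -> tuple[str, str]:
    """Build the credential variable names for a prefix (hypothetical helper)."""
    return f"{prefix}_USER", f"{prefix}_PASSWORD"


def read_credentials(prefix: str, env: dict) -> tuple[str, str]:
    """Look up both credentials and fail loudly, naming whichever keys are missing."""
    user_key, password_key = credential_env_names(prefix)
    missing = [key for key in (user_key, password_key) if key not in env]
    if missing:
        raise KeyError(f"missing credentials: {', '.join(missing)}")
    return env[user_key], env[password_key]


# Made-up environment with credentials stored under the hypothetical ONLINE prefix.
env = {"ONLINE_USER": "svc_reader", "ONLINE_PASSWORD": "example-only"}

print(credential_env_names("ONLINE"))  # names that exist in env
print(credential_env_names(""))        # an empty prefix yields "_USER" and "_PASSWORD"
```

With an empty prefix the lookup targets `_USER`, which is not set, so the error message mentions a variable name that nobody configured.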
Hello Databricks experts, In "Automating Best Practices with Agentic AI Workload Analyzer", Krishna Satyavarapu and Nikhil Mishra mentioned "The Agentic AI Workload Analyzer". Is it known when this will be available? Cheers, Mario
Generative AI is transforming how we handle data and automation. The key challenge now is balancing model creativity with control — ensuring reliable outputs while keeping innovation at the core.
Hello guys, I'm building an ETL pipeline and need to access the HANA data lake file system. To do that, I need to have the sap-hdlfs library in the compute environment; the library is available in the Maven repository. My job will have multiple notebook tasks and ETL ...
DLT doesn’t have a UI for library installation, but you can:
Use libraries configuration in the pipeline JSON or YAML spec:
{ "libraries": [ { "maven": { "coordinates": "com.sap.hana.hadoop:sap-hdlfs:<version>" } } ] }
Or...
Data Profile on a table is not a securable object in Unity Catalog or at the workspace level. This makes the management of Data Profiles difficult for workspace admins. Why isn’t “profile” a securable object in Databricks? It makes sense to require “Manage...
Well, there is no concrete answer on why. Perhaps Data Profile is treated as ephemeral, computed metadata or a snapshot of summary statistics (like min/max, distinct counts, etc.). It is created by a user's compute job within a specific workspace environment. ...
Hello, My name is Shubham. I recently watched your video and found it very informative. I have completed my B.Tech and am in the process of joining a company that requires a Databricks certification. I am reaching out to request a voucher for the Datab...
When is the next festival date? I missed it.