Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best prac...
I would guess unusual but want to hear from others before I nag my managers about it. In Databricks (I access in web browser) we have a compute cluster specifically for Git; you need to start it to push code or even to change branches. This is separa...
It means you are on the old classic Git Proxy that helped establish connectivity from the Databricks Control Plane to your on-prem Git Server. If your Git Server was cloud-based you would not need the proxy cluster. That being said, the new way is th...
What permissions does a Service Principal need to run Databricks jobs that reference notebooks created by a user and stored in Git? Hi everyone, We are exploring the notebooks‑first development approach with Databricks Bundles, and we’ve run into a wor...
Hi @DineshOjha, This is a good question, and researching this helped me learn some best practices along the way. What you’re seeing is actually expected behaviour. Service principals aren’t meant to execute notebooks directly from users’ personal wor...
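If the fix is deploying the notebooks to a shared workspace path and granting the service principal explicit access, that grant can be sketched as a Databricks Permissions API payload. This is a hedged sketch: the endpoint shape follows `PATCH /api/2.0/permissions/directories/{directory_id}`, and the application ID below is a placeholder, not a real principal.

```python
import json

# Placeholder application ID of the service principal; substitute your own.
SP_APPLICATION_ID = "12345678-abcd-ef01-2345-6789abcdef01"

def directory_read_grant(sp_application_id):
    # Payload for PATCH /api/2.0/permissions/directories/{directory_id},
    # granting the service principal read access to the deployed folder so
    # its job runs can resolve the notebooks inside it.
    return {
        "access_control_list": [
            {
                "service_principal_name": sp_application_id,
                "permission_level": "CAN_READ",
            }
        ]
    }

payload = json.dumps(directory_read_grant(SP_APPLICATION_ID))
```

With bundle-based deployment this is usually handled for you via the bundle's `permissions`/`run_as` settings rather than a manual API call.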
Hi Databricks Team, Is there a standard Databricks cost estimation template (Excel), sizing calculator, or TCO tool that allows us to provide the following inputs and derive an approximate monthly and annual platform cost: source systems and their types (...
Hi, There isn't anything publicly available that I'm aware of. For this kind of complex migration I'd recommend working with your account team. As somebody who does Databricks sizing a lot, it's a nuanced art which I suspect is why we don't have any ...
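Even without an official calculator, a rough first pass is just DBU/hour × hours/month × $/DBU, summed per workload. The sketch below is purely illustrative: every workload, rate, and hour figure is a made-up placeholder, not a Databricks list price, and it ignores cloud VM and storage costs entirely.

```python
# Toy first-pass cost model. All numbers are placeholder assumptions,
# not Databricks list prices; real sizing depends on workload shape.
workloads = [
    # (name, DBUs consumed per hour, hours per month, $ per DBU)
    ("nightly ETL job", 40.0, 60, 0.15),
    ("SQL warehouse", 24.0, 200, 0.22),
]

def monthly_cost(workloads):
    # Sum DBU-hours x rate across workloads.
    return sum(dbu * hours * rate for _, dbu, hours, rate in workloads)

cost = monthly_cost(workloads)   # monthly platform estimate, VM costs excluded
annual = cost * 12
```

A real estimate from your account team will also factor in photon/serverless premiums, autoscaling behavior, and discounts, which is exactly the nuance a flat spreadsheet misses.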
I have created custom tags on a column and plan to mask columns with tags via policy. I am facing two issues: 1. I can't see the custom tag under "Mask column if it has specific tag". 2. If I type my custom tag, I get an error when creating the policy: Policy creat...
Hi, to use a tag in a tag policy it needs to be a governed tag rather than just a general tag. If you just create it using Set tags, UC sees it as an informational tag rather than a governed tag. If you use a CREATE tag statement to create it, then you'...
Is there any documentation available for connecting from an Azure SQL database to an Azure Databricks SQL warehouse? We created a SQL warehouse personal access token for a user in a different team who can connect from his on-prem SQL DB to Databricks using the conn...
Thank you for the detailed answer
How can we configure a job in a different Azure application to be triggered after the completion of an Azure Databricks job? Once the Databricks job is successful, the job in the third-party application hosted in Azure should start. I attempted to us...
Thank you for the detailed answer! I have tested the Azure Function approach and also an Azure Runbook; both work fine. I also tested the option of adding it as the final task with an "if all other notebooks succeeded" condition, the...
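For the Azure Function route, the final Databricks task just has to send an HTTP POST to the function's trigger URL. A minimal sketch, assuming a hypothetical function URL and payload shape (both are placeholders, not a real endpoint):

```python
import json
import urllib.request

def build_trigger_request(function_url, job_name, run_id):
    # Builds the HTTP POST that a final Databricks notebook task could send
    # to an Azure Function (HTTP trigger) to start the downstream job.
    payload = json.dumps({"job": job_name, "databricks_run_id": run_id}).encode()
    return urllib.request.Request(
        function_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Hypothetical URL; in the notebook task you would then call
# urllib.request.urlopen(req) and check the response status.
req = build_trigger_request(
    "https://myfunc.azurewebsites.net/api/start-job", "downstream-job", "12345"
)
```

Keeping the function key out of the notebook (e.g. in a Databricks secret scope) is the main design point; the payload itself can stay minimal.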
I have a job that runs a notebook, the notebook uses serverless GPU (A10) and it keeps failing with a "Run failed with error message Cluster 'xxxxxxxxxxx' was terminated. Reason: UNKNOWN (SUCCESS)". The base environment is 'Standard v4' and I have tr...
Hi @rtglorenabasul, Thanks for sharing the details. The behaviour you’re seeing is consistent with an issue in how the job is bringing up Serverless GPU compute, rather than with the notebook code itself. Having done some checks, that error usually m...
Hello Databricks Support Team, I'm experiencing a severe issue in my Databricks workspace related to the new Git‑Integrated Alerts behavior. Overnight, my workspace went from 67 alerts to nearly 1,000 alerts, all of which appear to have been auto‑gene...
Hi @kcheng, Thanks for sharing the details. This looks like behaviour that will need workspace‑specific investigation by Databricks Support, rather than something the community can reliably diagnose or fix. Because it resulted in a sudden, large volu...
Hi! I'm facing an error related to checkpoints whenever I try to display a dataframe using Auto Loader in Databricks Free Edition. Please refer to the screenshot. To work around this, I have to delete the checkpoint folder and then execute the display or writ...
Hi @AanchalSoni, I can’t see the full history of your notebook, so I’m not sure of the exact cause. But the behaviour strongly suggests that an earlier version of the stream used complete mode against the same checkpointLocation, and that configurati...
Hi, We've created an agent using Copilot Studio for Genie and integrated it with a Teams channel. The feedback there is working and we can see the reactions in the Copilot Studio analytics. But the feedback is not reaching the actual Genie space, nor the...
Hi @souravg, @Ale_Armillotta is right. At the moment, Genie only records feedback (thumbs up/down, "Fix it", comments) when it’s given directly in the Genie UI. The public Genie Conversation APIs that Copilot Studio/Teams use don’t expose any endpoin...
I followed the official Databricks documentation (https://docs.databricks.com/en/_extras/notebooks/source/mongodb.html) to integrate MongoDB Atlas with Spark by setting up the MongoDB Spark Connector and configuring the connection string in my Datab...
Try using a single-user (dedicated) cluster or a no-isolation-shared cluster instead of a shared cluster. Shared access mode restricts some third-party libraries and connectors, which is a common cause of MongoDB Spark Connector failures.
I'm trying to add the _metadata column while reading a JSON file:

from pyspark.sql.functions import col
from pyspark.sql.types import StructType, StructField, LongType, TimestampType

df_accounts_read = spark.readStream.format("cloudFiles").\
    option("clo...
Hi @AanchalSoni, Looking at the first snapshot, it appears the path in all three records points to the checkpoint location. The _metadata column isn’t the root cause here. The issue is that Autoloader is ingesting your checkpoint files as data. Becau...
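The underlying rule is simple: anything under the Auto Loader source directory is treated as input, so the checkpoint must live outside it. A small path check makes the failure mode concrete (the paths are hypothetical examples):

```python
from pathlib import PurePosixPath

def checkpoint_inside_source(source_dir, checkpoint_dir):
    # Auto Loader will ingest anything under source_dir, including checkpoint
    # files if you put them there. Returns True when the layout is unsafe.
    src = PurePosixPath(source_dir)
    chk = PurePosixPath(checkpoint_dir)
    return src == chk or src in chk.parents

checkpoint_inside_source("/mnt/raw/accounts", "/mnt/raw/accounts/_checkpoint")  # unsafe
checkpoint_inside_source("/mnt/raw/accounts", "/mnt/chk/accounts")              # safe
```

In other words, pass a checkpointLocation that is a sibling of (or wholly separate from) the ingestion path, never a child of it.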
Hi, I was going through the documentation on quarantining records. Initially I thought that partitioning is not supported for temporary tables; however, I came across the following: @dp.table(temporary=True, partition_cols=["is_quarantined"]) @dp.ex...
I'm trying to deploy using Databricks Asset Bundles via an Azure DevOps pipeline. I keep getting this error when trying to use oauth:Error: default auth: oauth-m2m: oidc: databricks OAuth is not supported for this host. Config: host=https://<workspac...
Hi @bradleyjamrozik, thank you for posting your question. You will need to use the ARM_ environment variables to make it work, specifically ARM_CLIENT_ID, ARM_TENANT_ID, and ARM_CLIENT_SECRET. https://learn.microsoft.com/en-us/azure/databricks/dev-tools/auth#environment-3 f...
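A cheap guard against this class of failure is a pre-flight check in the pipeline that verifies the three variables are actually set before `databricks bundle deploy` runs. A minimal sketch (variable names per the linked Azure Databricks auth docs):

```python
import os

# The Azure service principal auth chain reads these ARM_ variables.
REQUIRED = ("ARM_CLIENT_ID", "ARM_TENANT_ID", "ARM_CLIENT_SECRET")

def missing_arm_vars(environ=os.environ):
    # Returns the required variables that are unset or empty, so the
    # pipeline can fail fast with a clear message instead of an opaque
    # "OAuth is not supported for this host" error.
    return [name for name in REQUIRED if not environ.get(name)]

missing_arm_vars({"ARM_CLIENT_ID": "x"})  # -> two names still missing
```

In Azure DevOps, remember that secret variables are not exported to the environment automatically; they must be mapped explicitly in the task's `env:` block.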
When trying to create ingestion pipelines, the auto-generated cluster is hitting quota limit errors. The VM type it tries to use is not available in our region, and there seems to be no way to add a fallback to different VM types. Can you please help h...
Hi @Neelimak, For managed ingestion pipelines, the auto‑generated cluster is just a classic jobs cluster whose shape is controlled by a compute policy, so you can override the VM type and add fallbacks. Ask a workspace admin to create or edit a Job C...
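The relevant part of such a policy is an allowlist over `node_type_id`. A hedged sketch of the definition JSON, built in Python so it can be fed to the policy API or pasted into the UI (the VM SKUs are placeholders; pick ones you actually have quota for in your region):

```python
import json

# Sketch of a cluster policy definition that pins node_type_id to an
# allowlist of VM families. SKU names below are placeholder assumptions.
policy_definition = {
    "node_type_id": {
        "type": "allowlist",
        "values": ["Standard_D4ds_v5", "Standard_E4ds_v5", "Standard_D4s_v3"],
        "defaultValue": "Standard_D4ds_v5",
    },
    # Restrict the policy to pipeline (DLT) compute.
    "cluster_type": {"type": "fixed", "value": "dlt"},
}

definition_json = json.dumps(policy_definition)
```

Because it's an allowlist with a default, the pipeline picks the default SKU but an admin can switch to any listed fallback without editing the policy again.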