Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best prac...
Hello, Our Azure Databricks workspace (URL: https://adb-3568788088379780.0.azuredatabricks.net) was deployed by the Azure Databricks Resource Provider. No “Manage Account” option appears in the UI, and no Account Admin is listed. Please link this work...
Good Afternoon, I’m using Databricks with Git integration to Azure DevOps (ADO). Authentication is via Microsoft Entra federated credentials for a service principal (SPN). The SPN has Basic access in ADO, is in the same project groups as my user, and Gi...
The issue stems from a fundamental architectural difference in how Databricks handles Git authentication: 1. Git Credential Gap: While your SPN successfully authenticates to Databricks via Microsoft Entra federated credentials, it lacks the sec...
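To close the Git credential gap described above, the SPN needs its own Git credential registered in the workspace. A minimal sketch, assuming the Databricks Git Credentials REST API (`POST /api/2.0/git-credentials`) and an Azure DevOps PAT issued for the SPN; the URL and placeholder values are hypothetical.

```python
import json

# Hypothetical values; replace with your workspace URL and the SPN's ADO PAT.
WORKSPACE_URL = "https://adb-example.azuredatabricks.net"


def git_credential_payload(ado_username: str, ado_pat: str) -> dict:
    """Body for POST /api/2.0/git-credentials. Workspace authentication via
    Entra does not carry over to Git, so the credential must be registered
    explicitly while authenticated AS the SPN."""
    return {
        "git_provider": "azureDevOpsServices",
        "git_username": ado_username,
        "personal_access_token": ado_pat,
    }


payload = git_credential_payload("spn-client-id", "ado-pat-placeholder")
print(json.dumps(payload, indent=2))
```

The payload would be POSTed to `{WORKSPACE_URL}/api/2.0/git-credentials` using a token obtained as the SPN, not as your own user, since Git credentials are per-identity.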
I am trying to explore the new Databricks Free Edition, but the SQL Server connector ingestion pipeline cannot be set up through the UI. It shows the error "Serverless compute must be enabled for the workspace," yet the Free Edition only has a serverless option ...
Hi @RakeshRakesh_De The error is misleading. As mentioned in the second row of the table linked here, the gateway runs on classic compute, and the ingestion pipeline runs on serverless compute (mentioned in the third row of the same table). Hop...
Dear All, I am trying to use a JDBC driver to connect to an Oracle database and append a new record to a table. The table has a column that needs to be populated with a sequence number. I've been trying to use select `<sequence_name>.nextval` to get the sequ...
Hey @austinoyoung , Short answer: don't try to pull the sequence in your Spark insert; let Oracle assign it. Why this happens (ORA-02287: sequence number not allowed here): Spark's JDBC writer generates parameterized INSERT statements like: INSERT INT...
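A minimal sketch of the workaround described above: exclude the sequence-populated ID column from the Spark-side append and let Oracle assign it (this assumes the table has a trigger or identity default on that column; the function and column names here are hypothetical).

```python
def columns_for_insert(all_columns: list, sequence_column: str) -> list:
    """Columns Spark should send over JDBC: everything except the
    sequence-populated ID, so Oracle's trigger/identity default fills it."""
    return [c for c in all_columns if c != sequence_column]


def append_without_sequence(df, jdbc_url: str, table: str, props: dict,
                            sequence_column: str = "ID"):
    """Sketch of the Spark side (not invoked here): drop the ID column
    before the append so the generated INSERT never mentions it."""
    (df.drop(sequence_column)
       .write
       .mode("append")
       .jdbc(jdbc_url, table, properties=props))


print(columns_for_insert(["ID", "NAME", "PRICE"], "ID"))
```

If the column cannot be given a default on the Oracle side, the usual alternative is to write to a staging table and run a small PL/SQL insert-select that calls `<sequence_name>.nextval`.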
Hi everyone, I'm new to Databricks and am trying to connect my Google Cloud Storage bucket to my Databricks workspace. I have a 43GB CSV file stored in a GCP bucket that I want to work with. Here’s what I've done so far: Bucket Setup: I created a GCP bu...
Hey @refah_1 , Thanks for laying out the steps; you’re very close. Here’s a structured checklist to get GCS working with Unity Catalog, plus a couple of common gotchas to check. What’s likely going on: the region mismatch isn’t the root cause; docs em...
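The Unity Catalog route sketched in the checklist boils down to a storage credential (wrapping the GCP service account) plus an external location over the bucket. A minimal sketch that builds the DDL; the location, bucket, and credential names are hypothetical, and the statement would be run via `spark.sql` or the SQL editor.

```python
def create_external_location_sql(name: str, gcs_url: str, credential: str) -> str:
    """DDL registering a GCS bucket as a Unity Catalog external location.
    The storage credential must already wrap the service account that was
    granted access to the bucket."""
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS {name} "
        f"URL '{gcs_url}' "
        f"WITH (STORAGE CREDENTIAL {credential})"
    )


print(create_external_location_sql("my_gcs_loc", "gs://my-bucket", "my_gcs_cred"))
```

Once the external location exists, the 43GB CSV can be read directly with `spark.read.csv("gs://my-bucket/path/")` from any Unity Catalog-enabled cluster.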
Hi, I'm trying to update the GCP permissions for Databricks as described here: https://docs.databricks.com/gcp/en/admin/cloud-configurations/gcp/gce-update To be able to do that, I have to log in to the account console here: https://accounts.gcp.databr...
Greetings @borft , It sounds like you’re being redirected into a workspace without the right privileges; let’s get you into the correct Databricks account console for your GCP Marketplace subscription and identify the right login. What login is requ...
Hi,I am currently trying to use the Accounts SDK to add External groups from Entra ID to functional groups within Databricks. I expect thousands of groups in Entra and I want to add these groups programmatically (for example) to a group in Databricks...
Hi @Sven_Relijveld — great to hear that your bulk-initial activation workflow is working as expected. Thanks for the update. Regarding the 5K external group limit you’re seeing: That is the current default soft quota for Azure Databricks accounts. It...
Hi all, For our set-up we have configured SCIM provisioning using Entra ID, group assignment on Azure is handled by IdentityIQ SailPoint, and we have enabled SSO for Databricks. It has been working fine apart from one scenario. The original email assign...
The other option is to raise a ticket with Databricks Accounts team. Our Databricks team worked on the backend and the new email was synced.
When using DBR 16.4, I am seeing a lot of stack traces as standard error in jobs. Any idea why they are showing up and how to turn them off? Thx "FlagSettingCacheMetricsTimer" id=18 state=WAITING - waiting on <0x2d1573c6> (a java.util.TaskQueue) - locke...
Setting spark.databricks.driver.disableJvmThreadDump=true will remove the stack traces.
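A sketch of where that flag goes in a cluster definition, assuming a standard cluster/job-cluster JSON spec:

```json
{
  "spark_conf": {
    "spark.databricks.driver.disableJvmThreadDump": "true"
  }
}
```

The same key can be set in the UI under Advanced options → Spark config on the cluster edit page.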
I am trying to develop a declarative pipeline. Per platform policy, I cannot use serverless, which is why I am using an asset bundle to create the declarative pipeline. In the bundle, I am trying to specify compute for the pipeline. However, I am constantly f...
Hello @crami, good day!! As the error indicates, you need to increase the VM quota. I know you have enough capacity in place, but spot fallback + Photon + autoscale together trigger the failure. Go to Azure Portal → Subscriptions → Usage + quotas. Filter: Provide...
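For the bundle side of the question, a sketch of specifying classic (non-serverless) compute for a pipeline in a Databricks Asset Bundle `databricks.yml`; the pipeline name, node type, and autoscale bounds are hypothetical, and a node size within your Azure quota should be chosen.

```yaml
resources:
  pipelines:
    my_pipeline:                # hypothetical resource key
      name: my_declarative_pipeline
      clusters:
        - label: default
          node_type_id: Standard_DS4_v2   # pick a size your quota allows
          autoscale:
            min_workers: 1
            max_workers: 4
```

Tightening `max_workers` (or disabling autoscale) also reduces the peak core count requested, which helps when the quota increase is still pending.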
I am working on a personalized price-package recommendation and implemented AutoGluon code integrated with MLflow. The code has been written in a modular fashion to be used by other team members; they just need to pass the data, target column a...
Hi @cleversuresh Thanks for sharing the code and the context. Here are the core issues I see and how to fix them so MLflow logging works reliably on Databricks. What’s breaking MLflow logging in your code Your PyFunc wrapper loads the AutoGluon mod...
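The fix for the wrapper issue described above is the standard load-in-`load_context` pattern. A minimal sketch: in real code the class would subclass `mlflow.pyfunc.PythonModel` and load the AutoGluon predictor from `context.artifacts`; here a plain-Python stand-in and a fake context keep the sketch self-contained and runnable, and the artifact path is hypothetical.

```python
class AutoGluonWrapper:
    """Stand-in for an mlflow.pyfunc.PythonModel subclass: load the heavy
    AutoGluon predictor once in load_context, never inside predict."""

    def load_context(self, context):
        # In MLflow this would be:
        #   from autogluon.tabular import TabularPredictor
        #   self.predictor = TabularPredictor.load(context.artifacts["model_dir"])
        self.predictor = context.artifacts["model_dir"]  # placeholder load

    def predict(self, context, model_input):
        # Delegate to the already-loaded predictor.
        return [f"predicted:{x}" for x in model_input]


class FakeContext:
    """Mimics the context MLflow passes in, for local testing only."""
    artifacts = {"model_dir": "/dbfs/tmp/ag_model"}  # hypothetical path


wrapper = AutoGluonWrapper()
wrapper.load_context(FakeContext())
print(wrapper.predict(FakeContext(), [1, 2]))
```

Logging the wrapper with `mlflow.pyfunc.log_model(..., artifacts={"model_dir": local_model_path})` then makes the saved predictor available to `load_context` on every load, regardless of which team member runs it.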
I have recently been able to run AutoML successfully on a certain dataset. But it has just failed on a second dataset of similar construction, before being able to produce any machine learning training runs or output. The Experiments page says```Mo...
Hi @dkxxx-rc , Thanks for the detailed context. This error is almost certainly coming from AutoML’s internal handling of imbalanced data and sampling, not your dataset itself. The internal column _automl_sample_weight_0000 is created by AutoML when i...
When we export a dashboard with maps, the map background doesn't show up in the pdf.
When exporting a Databricks dashboard with maps to PDF, it is a known issue that the map background sometimes does not appear in the exported PDF file. This problem has been discussed in the Databricks community as of early 2025, and appears to be a ...
Hey all, I am trying to read data from multiple S3 locations using a single streaming DLT pipeline and load the data into a single target. Here is the scenario. S3 Locations: Below are my S3 raw locations, with changes in the directory names at the end. Ba...
You are using Databricks Autoloader (cloudFiles) within a Delta Live Tables (DLT) pipeline to ingest streaming Parquet data from multiple S3 directories with a wildcard pattern, and you want to ensure all matching directories’ data is included in a s...
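A minimal sketch of the single-stream approach: one Auto Loader read whose path glob matches all sibling raw directories, feeding one target table. The bucket and prefix names are hypothetical, and the DLT read is shown as a function that is not invoked here, so the sketch stays self-contained.

```python
def raw_glob_path(bucket: str, prefix: str) -> str:
    """One glob matching all sibling raw directories, e.g.
    s3://bucket/raw/events_a/, s3://bucket/raw/events_b/, ..."""
    return f"s3://{bucket}/{prefix}_*/"


def bronze_stream_sketch(spark, path: str):
    """Sketch of the DLT/Auto Loader read (not invoked here); in a pipeline
    this would sit inside a @dlt.table-decorated function."""
    return (spark.readStream
                 .format("cloudFiles")
                 .option("cloudFiles.format", "parquet")
                 .load(path))


print(raw_glob_path("my-bucket", "raw/events"))
```

Because a single `cloudFiles` source owns one checkpoint, every directory matched by the glob, including ones that appear later, lands in the same target without per-directory streams.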
Databricks recently introduced a change that requires Unity Catalog as the output of a DLT pipeline, where previously it was the Hive Metastore. At first I was working using CDC plus expectations, which resulted in the "allow_expectations_c...
Databricks has recently enforced Unity Catalog as the output target for Delta Live Tables (DLT), replacing the legacy Hive Metastore approach. As a result, the familiar "allow_expectations_col" column, which was automatically added to help track and ...