Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices.
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community.
Join discussions on data governance practices, compliance, and security within the Databricks Community.
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community.
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms...
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community.
Hello, I want to host a webapp whose frontend will be on Streamlit and backend running on FastAPI. Currently the Databricks App listens on host 0.0.0.0 and port 8000, and my backend is running on host 127.0.0.1 and port 8080 (if it's available). I want to...
Hi @prajwalpoojary , Given you already have Streamlit on 0.0.0.0:8000 and FastAPI on 127.0.0.1:8080, you can keep that split and do server-side calls from Streamlit to http://127.0.0.1:8080/. It’s efficient and avoids cross-origin/auth issues. If you...
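The loopback pattern described above can be sketched without any Databricks-specific pieces. In this stand-in, a stdlib `http.server` plays the FastAPI backend (the real app would bind it to 127.0.0.1:8080; here port 0 is used so the sketch runs anywhere), and a `urllib` call plays the Streamlit frontend's server-side request. All names are illustrative, not actual app code.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the FastAPI backend. In the real app this would be FastAPI
# bound to 127.0.0.1:8080; binding port 0 keeps the sketch portable.
class BackendStub(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "ok", "path": self.path}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet

server = HTTPServer(("127.0.0.1", 0), BackendStub)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The Streamlit side: a server-side loopback call. Because the request never
# leaves the container, there is no CORS negotiation and no extra auth hop.
base = f"http://127.0.0.1:{server.server_address[1]}"
with urllib.request.urlopen(f"{base}/health") as resp:
    payload = json.loads(resp.read())

print(payload)  # -> {'status': 'ok', 'path': '/health'}
server.shutdown()
```

The same shape carries over directly: Streamlit calls the backend over loopback, and only the frontend is exposed on 0.0.0.0:8000.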
How should the Databricks workspace folder architecture be designed to support cross-team collaboration, access governance, and scalability in an enterprise platform? Please suggest below or share some ideas from your experience. Thanks! Note: I'm new to...
Thanks for the detailed information, I will review and get back to you if I have any questions. Meanwhile, can you please help with this query: Databricks Workspace ACL Enforcement – How to Prevent Users from Creating Objects Outside Team Folder and Attaching to Shared...
Context: We are in the process of extracting data between SAP BDC Datasphere and Databricks (brownfield implementation). SAP Datasphere is hosted in AWS (eu10); Databricks is hosted in Azure (West Europe); the BDC Connect system is located in the same regio...
The error DELTA_SHARING_INVALID_RECIPIENT_AUTH refers to an invalid authorization specification when accessing Delta Sharing resources. This maps to SQLSTATE code 28000 ("invalid authorization specification") and typically occurs when the recipient's...
Hi, as part of a small OSS project I am doing, dbt-unity-lineage, I need to enable Bring your own data lineage (Public Preview as of December 2025). But it seems you can't enable that Preview in either the Free Edition or the Trial? I'd rather not use my emplo...
Thanks, the trial is currently created as Premium; I did not see any options to choose otherwise. I tried US East and EU Central, thinking it might have been a regional thing. But thanks for checking it out, and for your reply.
Hello, in the context of reviewing our company's Databricks structure and migrating legacy workspaces to Unity Catalog-enabled ones, we're stuck on a few questions regarding enabling the automatic identity management feature. We currently provision D...
Spark 3.4 introduced parameterized SQL queries, and Databricks also discussed this new functionality in a recent blog post (https://www.databricks.com/blog/parameterized-queries-pyspark). Problem: I cannot run any of the examples provided in the PySpark...
@adriennn this has nothing to do with DLT, but about Databricks providing a different session implementation here than regular Spark.
I am running a relatively simple SQL query that writes back to a table on a Databricks serverless SQL warehouse, and I'm trying to understand why there is a "Columnar To Row" node in the query profile that is consuming the vast majority of the time s...
@dave_d We do not have a document with a list of operations that would bring up a ColumnarToRow node. This node provides a common executor to translate an RDD of ColumnarBatch into an RDD of InternalRow. It is inserted whenever such a transition is de...
We have a Lakeflow Spark Declarative Pipeline using the new PySpark Pipelines API. This was working fine until about 7am (Central European) this morning when the pipeline started failing with a PYTHON.NAME_ERROR: name 'kdf' is not defined. Did you me...
It turns out this problem was caused by a package that was pip installed using an init script. This package had for some reason started pulling in pandas 3.x (despite the fact that the package itself had not been updated), and our Databricks contact ...
After implementing StreamingQueryListener to enable integration with our monitoring solution we have noticed some strange metrics for our DeltaSource streams (based on https://learn.microsoft.com/en-us/azure/databricks/structured-streaming/stream-mon...
Firstly, let's talk about batch vs. trigger. A trigger is the scheduling event that tells Spark when to check for new data (e.g., processingTime, availableNow, once). A batch (micro-batch) is the actual unit of work that processes data, reads input, and...
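The distinction can be illustrated with a Spark-free toy model (entirely hypothetical names; a queue drain stands in for a source read): each trigger firing checks the source, and whatever it drains at that moment is one micro-batch.

```python
import queue

# Toy model, no Spark involved: a "trigger" is the moment we check the source;
# a "micro-batch" is whatever accumulated input we drain and process then.
source = queue.Queue()
for record in ["a", "b", "c", "d", "e"]:
    source.put(record)

def fire_trigger(src, max_per_batch=2):
    """One trigger firing: drain up to max_per_batch records as one micro-batch."""
    batch = []
    while len(batch) < max_per_batch and not src.empty():
        batch.append(src.get())
    return batch

batches = []
while not source.empty():
    batches.append(fire_trigger(source))  # each firing yields one micro-batch

print(batches)  # -> [['a', 'b'], ['c', 'd'], ['e']]
```

Five pending records and a cap of two per batch give three trigger firings and three micro-batches, which is exactly the trigger-vs-batch relationship described above.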
Does anyone know if there is a way to get anchor links working in Databricks notebooks so you can jump to sections in the same notebook without a full page refresh? I.e., something that works like the following HTML: <a href="#jump_to_target">Jump</a>...<p...
@hobrob_ex , yes, this is possible, but not the HTML way; instead, you will have to use the markdown rendering formats. Add # Heading 1, # Heading 2, and so on via the (+Text) button of the notebook. Once these headings/sections that you want are con...
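For example, a markdown (%md) cell shaped like the fragment below produces headings that show up in the notebook's table of contents, so readers can jump between sections without a page reload (the section names are placeholders):

```
%md
# Setup
Introductory text for the first section.

## Load data
...

# Results
...
```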
I'm trying to get a full list of Databricks workspace groups and their user memberships. I want to do this in two ways: as a queryable table or view (e.g., for audits, security reviews, app integration), and from within a Databricks App (Streamlit-style), u...
@discuss_darende - you could use the code below in a notebook. Please adjust it based on your needs. from databricks.sdk import AccountClient, WorkspaceClient # If env vars are set, this picks them up automatically a = WorkspaceClient() # List identities u...
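A fuller sketch of that approach: the flattening step below is plain Python, exercised with toy data, while the commented-out databricks-sdk calls show where a live workspace would supply real groups. `WorkspaceClient().groups.list()` and the `.display_name`/`.members` fields exist in the SDK; the table name and everything else is illustrative.

```python
# Flatten group -> member pairs into rows you can turn into a table or view.
def membership_rows(groups):
    """groups: iterable of (group_name, [member_name, ...]) pairs."""
    rows = []
    for group_name, members in groups:
        for member in members:
            rows.append({"group": group_name, "member": member})
    return rows

# With the SDK (requires auth, e.g. DATABRICKS_HOST / DATABRICKS_TOKEN):
# from databricks.sdk import WorkspaceClient
# w = WorkspaceClient()
# pairs = [(g.display_name, [m.display for m in (g.members or [])])
#          for g in w.groups.list()]
# rows = membership_rows(pairs)
# spark.createDataFrame(rows).write.saveAsTable("audit.group_membership")  # hypothetical target

rows = membership_rows([("admins", ["alice", "bob"]), ("readers", ["carol"])])
print(rows)
```

Once written as a table, the same rows are queryable from SQL and from a Databricks App alike.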
Hi, today I completed the test for Lakehouse Fundamentals with a score of 85%, but I still haven't received the badge through my email francis@intellectyx.com. Kindly let me know please! -Francis
Hi, I completed the test for Databricks Certified Data Engineer Associate on 17 December 2024, but I still haven't received the badge through my email sureshrocks.1984@hotmail.com. Kindly let me know please! SURESHK
One thing becomes very clear when you spend time in the Databricks community: AI is no longer an experiment. It is already part of how real teams build, ship, and operate data systems at scale. For a long time, many organizations treated data engineer...
Thanks @Louis_Frolio for your kind words. Happy to contribute here.
Hi, so far I cannot find a way to programmatically (SQL/Python) retrieve the subquery/sub-statement execution history records, shown in the Databricks UI Query History/Profile, that were executed during a task run of a job, as shown in [red boxes] on the atta...
Greetings @ADBricksExplore. Short answer: there isn’t a supported public API that returns the “Substatements / Subqueries” panel you see in the Query History or Profile UI. The GraphQL endpoints the UI relies on are internal and not stable or suppo...
I’m working on a data usage use case and want to understand the right way to get read bytes and written bytes per table in Databricks, especially for Unity Catalog tables. What I want: for each table, something like date, table name (catalog.schema.table)...
system.access.audit focuses on governance and admin/security events. It doesn’t capture per-table I/O metrics such as read_bytes or written_bytes. Use system.query.history for per-statement I/O metrics (read_bytes, written_bytes, read_rows, written_rows...
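Under those constraints, a minimal daily rollup over system.query.history might look like the sketch below. The read_bytes/written_bytes columns come straight from the reply above; the date arithmetic is an assumption, and note there is still no per-table attribution, only per-statement totals.

```sql
-- Daily I/O totals from per-statement history. No table-level attribution:
-- system.query.history is per statement, not per table.
SELECT
  DATE(start_time)   AS usage_date,
  SUM(read_bytes)    AS total_read_bytes,
  SUM(written_bytes) AS total_written_bytes
FROM system.query.history
WHERE start_time >= DATEADD(DAY, -7, CURRENT_DATE())
GROUP BY DATE(start_time)
ORDER BY usage_date;
```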