Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
Databricks One is a user interface designed for business users, giving them a single, intuitive entry point to interact with data and AI in Azure Databricks, without needing to navigate technical concepts such as clusters, queries, models, or noteboo...
Materialized views running on a SQL warehouse are very cost-efficient, and they are also a simple and powerful data engineering tool. Just make sure that Enzyme updates them incrementally.
Read more:
- https://databrickster.medium.com/sql-wa...
A small, hidden, but useful cluster setting. You can specify that no jobs are allowed on an all-purpose cluster. Or, vice versa, you can set up an all-purpose cluster that can be used only by jobs.
Read more:
- https://databrickster.medium.com/purpose-for-your-...
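The setting described in this post appears to correspond to the `workload_type` field of a cluster specification in the Databricks Clusters API, where `clients.jobs` and `clients.notebooks` control which kinds of workload a cluster accepts. The following is a minimal sketch under that assumption. The helper name `workload_type_for` is hypothetical, and the fragment it returns would still need to be merged into a full cluster spec.

```python
import json


def workload_type_for(allow_jobs: bool, allow_notebooks: bool) -> dict:
    """Build the cluster-spec fragment that restricts workload types.

    Assumes the Clusters API layout workload_type.clients.{jobs,notebooks}.
    """
    if not (allow_jobs or allow_notebooks):
        # A cluster that accepts neither workload type would be unusable.
        raise ValueError("at least one workload type must be allowed")
    return {
        "workload_type": {
            "clients": {"jobs": allow_jobs, "notebooks": allow_notebooks}
        }
    }


# All-purpose cluster on which no jobs are allowed.
interactive_only = workload_type_for(allow_jobs=False, allow_notebooks=True)

# All-purpose cluster that can be used only by jobs.
jobs_only = workload_type_for(allow_jobs=True, allow_notebooks=False)

print(json.dumps(interactive_only, indent=2))
print(json.dumps(jobs_only, indent=2))
```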
Hey everyone, I’m currently exploring machine learning model development and I’m interested in understanding how to effectively integrate ML workflows within Databricks. Specifically, I’d like to hear from the community about: How do you structure ML pi...
You can integrate machine learning model development into Databricks Workflows pretty smoothly using the platform’s native tools. The main idea is to treat your ML lifecycle (data prep → training → evaluation → deployment) as a series of tasks within...
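The lifecycle-as-tasks idea in this reply can be sketched in plain Python. This is not the Databricks Jobs API: the task names and the `depends_on` mapping are hypothetical, and the standard-library `graphlib` module stands in for a scheduler, only to show the order in which dependent tasks would run.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
depends_on = {
    "data_prep": set(),
    "training": {"data_prep"},
    "evaluation": {"training"},
    "deployment": {"evaluation"},
}


def run_order(dependencies: dict) -> list:
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(depends_on)
print(" -> ".join(order))
```

In a real workflow each name would map to a notebook, script, or pipeline task, and the dependency mapping would be expressed in the job definition rather than in code like this.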
We followed this document https://docs.databricks.com/aws/en/connect/streaming/kafka?language=Python#msk-aad to use the Kafka client to read events from our event hub for a feature. As part of the SFI, the guidance is to move away from client secret and u...
Currently, Databricks does not support using Managed Identities directly for Kafka client authentication (e.g., MSK IAM or Event Hubs Kafka endpoint) in Python Structured Streaming connections. However, there is a supported and secure alternative tha...
Hi Everyone, I am currently facing an issue in our Test environment where Databricks is not able to mount the storage account. We are using the same mount in our other environments (Dev, Preprod and Prod) and it works fine there witho...
This issue in your Test environment, where Databricks fails to mount an Azure Storage account with the error java.lang.Exception: 480, is most likely related to expired credentials or cached authentication tokens, even though the same configuration w...
Hi, I'm trying to execute the following code:
%sql
SELECT LSOA21CD, ST_X(ST_GeomFromWKB(Geom_Varbinary)) AS STX, ST_Y(ST_GeomFromWKB(Geom_Varbinary)) AS STY
FROM ordnance_survey_lsoas_december_2021_population_weighted_centroids
WHERE LSOA21CD ...
I am getting the error below when connecting to a Databricks instance using the JDBC driver.
ERROR: [08S01/500593] Can't connect to database - [Databricks][JDBCDriver](500593) Communication link failure. Failed to connect to server. Reason: HTTP Response code: 401, ...
I am trying to connect to Databricks from Mainframe z/OS using the JDBC driver and the IBM Java version below:
java version "11.0.26" 2025-01-21
IBM Semeru Runtime Certified Edition for z/OS 11.0.26.0 (build 11.0.26+4)
IBM J9 VM 11.0.26.0 (build z/OS-Release...
Hello Community! I am writing to you with a question and hope that you will help me find the right approach. I am building an AI enterprise system, and the organization stores its data on Databricks. To access the given data, you have to raise a request...
Ignore for now that you have an MCP Server.
The problem you are trying to solve:
1) An AI agent needs to access data inside Databricks
2) The agent needs to operate with the user's permissions
There are multiple paths:
1) Directly using OAuth/HTTP
https://docs.databric...
Hi, I would like to get some support in creating a Community User Group in Madrid, Spain. It would be nice to host events/meetings/discussions ... Regards, Ángel
Hi Ángel, I see your post is from quite some time ago, but I wanted to say that I’d also love to see a Databricks User Group here in Madrid. Although I’m not new to Databricks, I haven’t really taken much advantage of the community so far due to lack o...
I’m currently analyzing a large geospatial dataset focused on Michigan county boundaries and map data, and I’m using Apache Spark on Databricks to process and transform millions of records. Even though I’ve optimized basic things like repartitioning, ...
I do not have experience with geospatial data on Databricks. But I do know that, for a while now, Sedona can be installed on Databricks. Sedona was created for large-scale geospatial data processing. Sounds like something for you, no? https://sedona.apache....
I noticed that, unlike "Alter Table", there is no "Alter View" command to add a comment on a column in an existing view. This is a regular view created on tables (not a materialized view). If the underlying table column has a comment, then the view inh...
I’m building a dashboard in Power BI’s Pro Workspace, connecting data via Direct Query from Databricks (around 60 million rows from 15 combined tables), using a SQL Serverless (small size and 4 clusters). The problem is that the dashboard is taking to...
@nayan_wylde no, don't do that hehe. It was an example of an extreme approach. Usually you use catalogs to separate environments and, in enterprises, to separate divisions such as customer tower, marketing tower, finance tower, etc.
Let’s say we have a big data application where data loss is not an option. With GZRS (geo-zone-redundant storage) redundancy we would achieve zero data loss if the primary region is alive – writer is waiting for acks from two or more Azure availability zo...
Databricks is working on improvements and new functionality related to that. For now, the only solution is a DEEP CLONE. You can run it more frequently or implement your own replication based on a change data feed. You could use delta sharing for tha...
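As a rough illustration of the DEEP CLONE approach mentioned in this reply, the sketch below builds the Delta `CREATE OR REPLACE TABLE ... DEEP CLONE ...` statement for a source and target table. The helper `deep_clone_statement` and both table names are hypothetical. In a notebook the resulting string would typically be passed to `spark.sql`, which is left as a comment here so the example runs without a cluster.

```python
def deep_clone_statement(source: str, target: str) -> str:
    """Build a Delta DEEP CLONE statement copying `source` into `target`.

    Names are validated as dot-separated identifiers so that the helper
    does not interpolate arbitrary text into SQL.
    """
    for name in (source, target):
        if not all(part.isidentifier() for part in name.split(".")):
            raise ValueError(f"not a valid table name: {name!r}")
    return f"CREATE OR REPLACE TABLE {target} DEEP CLONE {source}"


# Hypothetical primary and backup tables.
stmt = deep_clone_statement("prod.sales.orders", "dr.sales.orders")
print(stmt)

# On Databricks, a scheduled job could then run:
# spark.sql(stmt)
```

How often such a job runs determines how much recent data could be missing from the copy, which is the trade-off the reply points to when it suggests running the clone more frequently.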