- 182 Views
- 1 replies
- 0 kudos
How to Optimize Spark Jobs in Databricks for Large-Scale Geospatial Data Processing?
I’m currently analyzing a large geospatial dataset focused on Michigan county boundaries and map data, and I’m using Apache Spark on Databricks to process and transform millions of records. Even though I’ve optimized basic things like repartitioning, ...
I do not have experience with geospatial data on Databricks. But I do know that, for a while now, Sedona can be installed on Databricks. Sedona is built for large-scale geospatial data processing. Sounds like something for you, no? https://sedona.apache....
- 12746 Views
- 16 replies
- 3 kudos
Is it possible to view Databricks cluster metrics using REST API
I am looking for help getting Databricks cluster metrics such as memory utilization, CPU utilization, memory swap utilization, and free file system space using the REST API. I am trying it in Postman using a Databricks token and with my Service Principal bear...
Has any solution been found for getting CPU and memory metrics for Hive metastore-backed workloads? We are not using UC, so we can't use system tables.
- 13235 Views
- 4 replies
- 0 kudos
Resolved! Unable to add column comment on a View. Any way to update comments on multiple columns in bulk?
I noticed that, unlike "Alter Table", there is no "Alter View" command to add a comment on a column in an existing view. This is a regular view created on tables (not a materialized view). If the underlying table column has a comment then the View inh...
Use COMMENT ON: COMMENT ON | Databricks on AWS
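For the bulk part of the question, one option is to generate one COMMENT ON statement per column and run them in a loop. The sketch below only builds the SQL strings. It assumes a runtime where `COMMENT ON COLUMN` is available and accepts a view as the relation, which should be checked against the documentation for your version. The view name, the column names, and the helper `build_comment_statements` are hypothetical, made up for illustration.

```python
from typing import Dict, List


def _escape_sql_string(text: str) -> str:
    """Escape backslashes and single quotes for a single-quoted SQL literal."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


def build_comment_statements(view_name: str, comments: Dict[str, str]) -> List[str]:
    """Build one COMMENT ON COLUMN statement per (column, comment) pair.

    Assumption: COMMENT ON COLUMN is supported for views on the target
    runtime. view_name and the column names are illustrative placeholders.
    """
    return [
        f"COMMENT ON COLUMN {view_name}.{column} IS '{_escape_sql_string(text)}'"
        for column, text in comments.items()
    ]


statements = build_comment_statements(
    "main.sales.v_orders",  # hypothetical view
    {"order_id": "Primary key", "cust": "Customer's id"},
)
for statement in statements:
    print(statement)
```

In a notebook, each generated string could then be passed to `spark.sql(statement)`.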
- 5809 Views
- 2 replies
- 0 kudos
Improve query performance of direct query with Databricks
I’m building a dashboard in Power BI’s Pro Workspace, connecting data via Direct Query from Databricks (around 60 million rows from 15 combined tables), using a SQL Serverless (small size and 4 clusters). The problem is that the dashboard is taking to...
@viniciuscini have you managed to get it working well for you?
- 962 Views
- 7 replies
- 15 kudos
Unity catalogues - What would you do
If you were creating Unity Catalogs again, what would you do differently based on your past experience?
@nayan_wylde no, don't do that, hehe. It was an example of an extreme approach. Usually, use catalogs to separate environments and, in enterprises, to separate divisions such as customer tower, marketing tower, finance tower, etc.
- 764 Views
- 3 replies
- 2 kudos
Resolved! How to reduce data loss for Delta Lake on Azure when failing from primary to secondary regions?
Let’s say we have a big data application where data loss is not an option. With GZRS (geo-zone-redundant storage) redundancy we would achieve zero data loss if the primary region is alive: the writer waits for acks from two or more Azure availability zo...
Databricks is working on improvements and new functionality related to that. For now, the only solution is a DEEP CLONE. You can run it more frequently or implement your own replication based on a change data feed. You could use delta sharing for tha...
- 468 Views
- 2 replies
- 1 kudos
Delta comparison architecture using flatMapGroupsWithState in Structured Streaming
I am designing a Structured Streaming job in Azure Databricks (using Scala) which will consume messages from two event hubs; let's call them source and target. I would like your feedback on the flow below, whether it will survive the production load and ...
It is hard to understand what the source is and what the target is. Some charts would be useful, as would information on how long the state is kept. My solution usually is:
- Use declarative Lakeflow pipelines (DLT) if possible
- If not, consider handlin...
- 1129 Views
- 6 replies
- 4 kudos
Resolved! Databricks partner Tech Summit FY26 access
I'm trying to access the recordings of Partner Tech Summit FY26, which happened a month ago. It says the lobby is closed. Is there any other way I can access the recordings? I have yet to watch the day 2 sessions.
Hi @saurabh18cs, check the link shared by @Advika. Make sure you are logged in using your partner account. Link: https://partner-academy.databricks.com/learn/catalog/view/168SS:
- 843 Views
- 4 replies
- 5 kudos
Resolved! serialized_dashboard
I have a dashboard.json file containing, for example: {select * from ${{var.table_name}}}. Do I need a serialized_dashboard section in job.yml, since my job runs in parallel with the dashboard? Can I use variables in databricks.yml if I define the table_variable variab...
I currently use the parameter inside IDENTIFIER(:schema || 'my_table') and the 'bundle scripts' feature to perform substitutions, but I hope for better support soon.
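A minimal sketch of the `IDENTIFIER(:schema || 'my_table')` pattern from this reply, driven from Python through the named-parameter form of `spark.sql` (available in PySpark 3.4 and later). The helper name `select_from_schema` and the example prefix are hypothetical. Because the SQL concatenates the parameter directly with the table name, the sketch assumes the prefix already ends with a dot.

```python
def select_from_schema(spark, schema_prefix: str):
    """Query my_table in a schema chosen at run time via a named parameter.

    Assumption: schema_prefix ends with '.', e.g. "main.dev.", because the
    SQL below concatenates it directly with the table name.
    `spark` is an active SparkSession (PySpark 3.4+ for the `args` keyword).
    """
    if not schema_prefix.endswith("."):
        raise ValueError("schema_prefix must end with '.'")
    query = "SELECT * FROM IDENTIFIER(:schema || 'my_table')"
    return spark.sql(query, args={"schema": schema_prefix})
```

In a notebook this could be called as `select_from_schema(spark, "main.dev.")`, where `main.dev.` stands in for whatever catalog and schema the bundle target resolves to.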
- 603 Views
- 4 replies
- 5 kudos
Resolved! Need help understanding Databricks
Hi, I come from a traditional ETL background and am having trouble understanding some of the cloud hyperscaler features and use cases. I understand Databricks is hosted on cloud providers. I see the cloud providers have their own tools for ETL, ML/A...
Thanks a lot, Gema, for the detailed and meticulous answers. I guess I have to unlearn and relearn everything starting today.
- 909 Views
- 5 replies
- 3 kudos
Resolved! Stateless streaming with aggregations on a DLT/Lakeflow pipeline
In a DLT pipeline I have a bronze table that ingests files using Autoloader, and a derived silver table that, for this example, just stores the number of rows for each file ingested into bronze. The basic code example: import dlt from pyspark.sql impo...
For scenarios in Databricks where lower latency is needed for Silver tables but continuous streaming pipelines are not feasible, using jobs or notebooks with foreachBatch running in Structured Streaming mode is a common and recommended approach. This...
- 610 Views
- 4 replies
- 0 kudos
Data analyst learning plan lab files
Hi all, I am very new to Databricks and to this community. I recently signed up for the data analyst learning plan and the data engineering one. The learning platform page seems like a confusing maze to navigate! In the course material for the data analy...
Hi, I managed to find the lab. It wasn't straightforward at all. It was part of another link and not in the learning path I had signed up for. The lab series I am trying to work on is this: https://partner-academy.databricks.com/learn/courses/3701/aibi-for-da...
- 2621 Views
- 2 replies
- 1 kudos
More than expected number of Jobs created in Databricks
Hi Databricks Gurus! I am trying to run a very simple snippet:
data_emp = [["1", "sarvan", "1"], ["2", "John", "2"], ["3", "Jose", "1"]]
emp_columns = ["EmpId", "Name", "Dept"]
df = spark.createDataFrame(data=data_emp, schema=emp_columns)
df.show()
--------
Based on a g...
- 730 Views
- 6 replies
- 6 kudos
Resolved! Cluster cannot find init script stored in Volume
I have created an init script stored in a Volume which I want to execute on a cluster with runtime 16.4 LTS. The cluster has policy = Unrestricted and Access mode = Standard. I have additionally added the init script to the allowlist. This should be ...
Hi @jimoskar, since you're using standard access mode you need to add the init script to the allowlist. Did you add your init script to the allowlist? If not, do the following:
- In your Databricks workspace, click Catalog.
- Click the gear icon.
- Click the metastore ...
- 391 Views
- 2 replies
- 3 kudos
Resolved! Delta sharing with Celonis
Is there any way, or are there any plans, for Databricks to use Delta Sharing to provide data access to Celonis?
Hi @cbhoga, Delta Sharing is an open protocol for secure data sharing. Databricks already supports it natively, so you can publish data using Delta Sharing. However, whether Celonis can directly consume that shared data depends on whether Celonis sup...