- 738 Views
- 1 replies
- 0 kudos
Resolved! Deduplication with rocksdb, should old state files be deleted manually (to manage storage size)?
Hi, I have the following streaming setup: I want to remove duplicates in streaming. 1) The deduplication strategy is defined by two fields: extraction_timestamp and hash (row-wise hash). 2) Watermark strategy: extraction_timestamp with a "10 seconds" interval --> R...
Found the solution. See the two parameters described here: https://kb.databricks.com/streaming/how-to-efficiently-manage-state-store-files-in-apache-spark-streaming-applications
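A minimal PySpark sketch of the setup described in the question, assuming `events` is a streaming DataFrame that already has `extraction_timestamp` and `hash` columns. The function name `dedupe_stream` is hypothetical, and the state-store parameters from the linked KB article are not shown here.

```python
def dedupe_stream(events):
    """Drop duplicate rows from a streaming DataFrame.

    `events` is assumed to be a PySpark streaming DataFrame with
    `extraction_timestamp` (timestamp) and `hash` (row-wise hash) columns.
    """
    return (
        events
        # State for rows older than the watermark can be evicted, which
        # bounds how much deduplication state is kept.
        .withWatermark("extraction_timestamp", "10 seconds")
        # Including the event-time column in the key lets Spark expire old keys.
        .dropDuplicates(["extraction_timestamp", "hash"])
    )
```

The watermark only limits how long keys stay in state; cleanup of old state files on storage is a separate concern and is what the KB article in the reply addresses.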
- 731 Views
- 6 replies
- 2 kudos
Disable exiting current cell when moving around with keyboard arrows
Is there any way to disable exiting the current cell when I move the cursor around with the arrow keys? When I press the up or down arrow, it exits the current cell and moves to another cell. Can that functionality be disabled so that when I hold the up or down arrow key, c...
Is there any place where I can submit this as a request?
- 5751 Views
- 4 replies
- 0 kudos
Adding NFS storage as external volume (Unity)
Can anyone share experience (or point me to another reference) that describes how to configure Azure Blob storage with NFS enabled as an external volume in Databricks? I've succeeded in adding SMB storage to Databricks but (if I understand prope...
Hi @phguk, could you share how you managed to create an external volume referencing an Azure file share? Are you using Unity Catalog for this? It was my understanding that this is not possible.
- 476 Views
- 0 replies
- 0 kudos
How to create a mount point to File share in Azure Storage account
Hello all, I have a requirement to create a mount point to a file share in an Azure Storage account. I followed the official documentation, but I could not create the mount point to the file share, and the documentation described the mount point creatio...
- 1221 Views
- 4 replies
- 0 kudos
Use Python notebook to read data from Databricks
I'm very new to Databricks. I hope this is the right place to ask this question. I want to use PySpark in a notebook to read data from a Databricks database with the code below. databricks_host = "adb-xxxx.azuredatabricks.net" http_path = "/sql/1.0/w...
I would try changing the query to something like the following. It should return the column names in the table, so you can see whether the JDBC call is actually returning the data correctly: SELECT * FROM wphub_poc.gold.v_d_building limit 10
- 3919 Views
- 8 replies
- 22 kudos
Turn Off Auto-reveal of Navigation Sidebar
I work with the navigation sidebar closed and use the stacked hamburgers symbol in the upper left to reveal it when I want. Now, if you mouse over the left edge of the browser window too slowly it will auto-reveal the navigation sidebar. I do not wan...
I've checked with the team, and there's no way to turn this off. However, they are making adjustments to improve the experience, and a fix to refine the sidebar functionality is on the way.
- 485 Views
- 1 replies
- 0 kudos
New Group in Denver
Can we create a new Group in Denver?
- 2780 Views
- 3 replies
- 0 kudos
Virtual Learning Festival Enrollment
Hi everyone, I tried to enroll in the Virtual Learning Festival: 9 April - 30 April, but upon clicking the Customers & Prospects link for LEARNING PATHWAY 1: ASSOCIATE DATA ENGINEERING I got an error (see the attached image). Thank you in advance for the hel...
Hi @Advika, please refer to the attached image. I thought it was attached to my question. I'm really new here.
- 253 Views
- 1 replies
- 0 kudos
Expectation in DLT using multiple columns
Is it possible to define an expectation in a DLT pipeline using multiple columns? For example, my source has two fields: Division and Material_Number. For division 20, the material number starts with 5; for division 30, the material number starts with 9. Can we have this ...
Hi @Master_DataBric, yes, it's possible. Here are the doc links:
- https://docs.databricks.com/aws/en/dlt/expectations?language=Python
- https://docs.databricks.com/aws/en/dlt/expectations?language=SQL
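As a rough sketch of what such a multi-column expectation could look like: a DLT expectation takes a SQL boolean expression, so both columns can appear in one constraint. The rule name, the source table name, and the choice to let other divisions pass are assumptions, as is `Material_Number` being stored as a string. The plain-Python function only mirrors the rule for local checks and is not part of DLT.

```python
# Hypothetical constraint; the column names come from the question.
MATERIAL_PREFIX_RULE = (
    "CASE "
    "WHEN Division = 20 THEN Material_Number LIKE '5%' "
    "WHEN Division = 30 THEN Material_Number LIKE '9%' "
    "ELSE true END"
)

# Inside a DLT pipeline notebook the rule would be attached roughly like this:
#
#   import dlt
#
#   @dlt.table
#   @dlt.expect("valid_material_prefix", MATERIAL_PREFIX_RULE)
#   def materials():
#       return spark.read.table("source_materials")  # hypothetical source table


def material_prefix_ok(division: int, material_number: str) -> bool:
    """Plain-Python mirror of MATERIAL_PREFIX_RULE for local sanity checks."""
    if division == 20:
        return material_number.startswith("5")
    if division == 30:
        return material_number.startswith("9")
    return True  # assumption: other divisions are unconstrained
```

If rows from other divisions should fail instead, the `ELSE true` branch (and the final `return True`) would change to false.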
- 4353 Views
- 2 replies
- 1 kudos
POC Comparison: Databricks vs AWS EMR
Hello, I need some assistance with a comparison between Databricks and AWS EMR. We've been evaluating the Databricks Data Intelligence platform for a client and found it to be significantly more expensive than AWS EMR. I understand the challenge in ma...
Databricks is highly optimized for Delta, which leverages columnar storage, indexing, and caching for better performance. Instead of processing the CSV files directly, convert them to Delta first, then perform the aggregations and joins, and see if this helps.
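A minimal sketch of the conversion step suggested in the reply, assuming an existing `spark` session, CSV files with a header row, and hypothetical input and output paths. The helper name `csv_to_delta` is made up for illustration.

```python
def csv_to_delta(spark, csv_path, delta_path):
    """Rewrite a CSV dataset as a Delta table stored at `delta_path`."""
    # Read the raw CSV files; assumes the first row holds column names.
    df = spark.read.option("header", "true").csv(csv_path)
    # Write the same rows out in Delta format, replacing any earlier copy.
    df.write.format("delta").mode("overwrite").save(delta_path)
    return delta_path
```

Aggregations and joins would then read from the Delta location, for example with `spark.read.format("delta").load(delta_path)`, rather than from the CSV files.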
- 260 Views
- 1 replies
- 1 kudos
Is it possible to concatenate two notebooks?
I don't think it's possible but I thought I would check. I need to combine notebooks. While developing I might have code in various notebooks. I read them in with "%run". Then when all looks good I combine many cells into fewer notebooks. Is there any...
Hi @397973, combining multiple notebooks into a single notebook isn't an out-of-the-box feature, but I would try combining %run commands and outputting them to see if it works, sort of like:
%run "/path/to/notebook1"
%run "/path/to/notebook2"
- 2298 Views
- 5 replies
- 2 kudos
In databricks deployment .py files getting converted to notebooks
A critical issue has arisen that is impacting our deployment planning for our client. We have encountered a challenge with our Azure CI/CD pipeline integration, specifically concerning the deployment of Python files (.py). Despite our best efforts, w...
Another option is Databricks Asset Bundles.
- 307 Views
- 2 replies
- 1 kudos
Databricks Lakehouse Monitoring
Hi, I am trying to implement Lakehouse Monitoring using an Inference profile for my inference data. I see that when I create the monitor, two tables get generated: profile and drift. I wanted to understand how are these two tables generating a...
When you create a Databricks Lakehouse Monitoring monitor with an Inference profile, the system automatically generates two metric tables: a profile metrics table and a drift metrics table. Here's how this process works: Background Processing When yo...
- 470 Views
- 2 replies
- 0 kudos
Liquid Clustering Key Change Question
If I already have cluster key key1 on an existing table and I want to change the cluster key to key2 using ALTER TABLE table CLUSTER BY (key2), then run OPTIMIZE table, then based on the Databricks documentation, existing files will not be rewritten (verified by my test as w...
@ShivangiB You're correct in your understanding. When you change a clustering key using ALTER TABLE followed by OPTIMIZE, it doesn't automatically recluster existing data. Let me explain why this happens and what options you have. In Delta Lake (which...
- 384 Views
- 1 replies
- 0 kudos
Unable to Access S3 from Serverless but Works on Cluster
Hi everyone, I am trying to access data from S3 using an access key and secret. When I run the code through Databricks clusters, it works fine. However, when I try to do the same from a serverless cluster, I am unable to access the data. I have alread...
Hello @HarryRichard08! It looks like this post duplicates the one you recently posted. A response has already been provided to the Original post. I recommend continuing the discussion in that thread to keep the conversation focused and organized.