Databricks Platform Discussions
Dive into comprehensive discussions covering various aspects of the Databricks platform. Join the conversation to deepen your understanding and maximize your usage of the Databricks platform.

Browse the Community

Data Engineering

Join discussions on data engineering best practices, architectures, and optimization strategies with...

10557 Posts

Data Governance

Join discussions on data governance practices, compliance, and security within the Databricks Commun...

447 Posts

Generative AI

Explore discussions on generative artificial intelligence techniques and applications within the Dat...

174 Posts

Machine Learning

Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithm...

910 Posts

Warehousing & Analytics

Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Communi...

577 Posts

Databricks Free Trial Help

Engage in discussions about the Databricks Free Trial within the Databricks Community. Share insight...

62 Posts

Activity in Databricks Platform Discussions

hao-uit
New Contributor
  • 13 Views
  • 0 replies
  • 0 kudos

Spark Streaming job gets stuck in the "Stream Initializing" stage

Hello all, I am having an issue with my Spark Streaming job: it is stuck at the "Stream Initializing" stage. I need your help to understand what is happening inside the "Stream Initializing" stage that is taking so long. Here are...

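For context on where that stage usually spends its time: "Stream Initializing" is typically where the query reads its checkpoint and recovers offsets and state, so a large or slow checkpoint location is a common culprit. A minimal sketch of a streaming query with an explicit checkpoint, assuming hypothetical source and target table names:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read from a streaming source (table name is hypothetical).
stream_df = spark.readStream.table("source_table")

query = (
    stream_df.writeStream
        .format("delta")
        .option("checkpointLocation", "/tmp/checkpoints/demo")  # initialization reads this
        .trigger(availableNow=True)   # bounded run; easier to observe initialization
        .toTable("target_table")      # hypothetical target
)
query.awaitTermination()
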
aswinvishnu
New Contributor II
  • 22 Views
  • 0 replies
  • 0 kudos

Avoiding metadata information when sending data to GCS

Hi all, I have a use case where I need to push table data to a GCS bucket: query = "${QUERY}" df = spark.sql(query) gcs_path = "${GCS_PATH}" df.write.option("maxRecordsPerFile", int("${MAX_RECORDS_PER_FILE}")).mode("${MODE}").json(gcs_path). This can ...

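If the unwanted files are the committer markers (_SUCCESS and friends), one hedged option is to disable the success marker before writing. A minimal sketch, keeping the post's templated placeholders and assuming the Hadoop FileOutputCommitter is what produces the markers:

# Suppress the _SUCCESS marker written by the Hadoop output committer.
spark.sparkContext._jsc.hadoopConfiguration().set(
    "mapreduce.fileoutputcommitter.marksuccessfuljobs", "false"
)

df = spark.sql("${QUERY}")  # templated placeholders are from the original post
(df.write
   .option("maxRecordsPerFile", int("${MAX_RECORDS_PER_FILE}"))
   .mode("${MODE}")
   .json("${GCS_PATH}"))
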
Nagarathna
New Contributor II
  • 145 Views
  • 3 replies
  • 0 kudos

How to write trillions of rows to a Unity Catalog table

Hi team, I have a DataFrame with 1,269,408,570,800 rows. I need to write this data to a Unity Catalog table. How can I upload such a huge quantity of data? I'm using Databricks Runtime 15.4 LTS with 4 workers, each worker of type i3.4xlarge, and a driver of type...

Data Engineering
data upload
Unity Catalog
Latest Reply
Isi
Contributor III
  • 0 kudos

Hey @Nagarathna @Lucas_TBrabo, I'd like to share my opinion and some tips that might help: 1. You should avoid filtering by spark_partition_id because you can create skewed partitions; use repartition() instead, and Spark can optimize t...

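A minimal sketch of the repartition() route the reply suggests, with hypothetical table names and partition count:

huge_df = spark.table("main.raw.huge_source")  # hypothetical source table

(huge_df
    .repartition(20000)       # even distribution instead of spark_partition_id filters
    .write
    .mode("append")           # write in several bounded runs rather than one giant commit
    .saveAsTable("main.analytics.target_table"))  # hypothetical Unity Catalog table
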
2 More Replies
chsoni12
New Contributor
  • 82 Views
  • 2 replies
  • 1 kudos

Impact of VACUUM Operations on Shallow Clones in Databricks

I performed a POC where I had to check whether we can create a new Delta table that contains only a particular version of a normal Delta table's data, without copying the data, and, if we make changes or perform any operation (insert/delete/truncate/records)...

Latest Reply
chsoni12
New Contributor
  • 1 kudos

Thanks, it really helps me a lot. But there is also an issue with shallow clone: we can only clone the full table data, or a particular Delta version's data using a timestamp/version, from the normal table using shallow clone, but we cannot clone the table data b...

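For reference, a minimal sketch of the version-pinned shallow clone being discussed (table names hypothetical); note that a shallow clone only references the source's data files, which is why VACUUM on the source can break it:

# Clone only version 5 of the source table, without copying data files.
spark.sql("""
    CREATE TABLE main.sandbox.orders_v5
    SHALLOW CLONE main.prod.orders VERSION AS OF 5
""")
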
1 More Reply
ep208
New Contributor
  • 92 Views
  • 1 reply
  • 0 kudos

How to resolve Location Overlap

Hi, I am trying to ingest abfss://datalake@datalakename.dfs.core.windows.net/Delta/Project1/sales_table, but when writing the table schema in the YAMLs, I incorrectly wrote this table into another Unity Catalog table: --- kind: SinkDeltaTable metadata: name:...

Latest Reply
Isi
Contributor III
  • 0 kudos

Hey @ep208, from the error message you're seeing (LOCATION_OVERLAP), it seems that Unity Catalog is still tracking a table or volume that points to the same path you're now trying to reuse: abfss://datalake@datalakename.dfs.core.windows.net/Delta/Proj...

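A minimal sketch for tracking down the stale registration, assuming your workspace exposes storage_path in system.information_schema.tables (catalog/schema names hypothetical; the abfss path is from the post):

# Find whatever Unity Catalog still registers under that path.
spark.sql("""
    SELECT table_catalog, table_schema, table_name, storage_path
    FROM system.information_schema.tables
    WHERE storage_path LIKE
      'abfss://datalake@datalakename.dfs.core.windows.net/Delta/Project1%'
""").show(truncate=False)

# Once identified, dropping the stale entry frees the location, e.g.:
# spark.sql("DROP TABLE IF EXISTS old_catalog.old_schema.sales_table")
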
utkarshamone
New Contributor
  • 237 Views
  • 2 replies
  • 0 kudos

Internal errors when running SQLs

We are running Databricks on GCP with a classic SQL warehouse on the current version (v2025.15). We have a pipeline that runs dbt on top of the SQL warehouse. Since the 9th of May, our queries have been failing intermittently with internal errors f...

(Three screenshots attached.)
Latest Reply
Isi
Contributor III
  • 0 kudos

Hi @utkarshamone, we faced a similar issue and I wanted to share our findings, which might help clarify what's going on. We're using a Classic SQL Warehouse, size L (v2025.15), and executing a dbt pipeline on top of it. Our dbt jobs started to fail with...

1 More Reply
letsaskme
New Contributor
  • 997 Views
  • 1 reply
  • 0 kudos

letsaskme-com-digital-marketing-free-paid-guest-posting-blog-post-websites-list-

Lets Ask Me: a list of 300+ quality marketing, business, SEO, tech & WordPress guest blogging sites that accept guest posts. https://letsaskme.com/digital-marketing/free-paid-guest-posting-blog-post-websites-list-2020/ #guestpost #blogger

Latest Reply
seo10
New Contributor
  • 0 kudos

I’ve tried Medium, YourStory, HackerNoon, ShoutMeLoud, SiteProNews, and BloggingCage. But Hikemytraffic helped me find targeted digital marketing guest post sites that actually bring traffic, backlinks, and real leads.

Ampal
New Contributor
  • 294 Views
  • 0 replies
  • 0 kudos

VOIP

Hello, I recently had AT&T VOIP installed with new data service. I am encountering an issue whereby family members who use an AT&T mobile can call the landline, but my T-Mobile phone can no longer reach this line. It was a traditional line prior to ...

KG_777
New Contributor
  • 232 Views
  • 1 reply
  • 0 kudos

Capturing deletes for SCD2 using apply changes or apply as delete decorator

We're looking to implement SCD2 for tables in our lakehouse, and we need to keep track of records that are being deleted in the source. Does anyone have a similar use case, and can they outline some of the challenges they faced and workarounds they imp...

Latest Reply
LRALVA
Honored Contributor
  • 0 kudos

Hi @KG_777, tracking deleted records in an SCD Type 2 implementation for a lakehouse architecture is indeed a challenging but common requirement. Here's an overview of approaches, challenges, and workarounds based on industry experience: Common Approach...

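As one concrete illustration of the apply-changes route named in the thread title, a minimal Delta Live Tables sketch (source view, key, and column names are hypothetical):

import dlt
from pyspark.sql.functions import expr

dlt.create_streaming_table("customers_scd2")

dlt.apply_changes(
    target="customers_scd2",
    source="customers_cdc_feed",                    # hypothetical CDC source view
    keys=["customer_id"],
    sequence_by="event_ts",
    apply_as_deletes=expr("operation = 'DELETE'"),  # end-date rows deleted upstream
    except_column_list=["operation", "event_ts"],
    stored_as_scd_type=2,                           # keep full history
)
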
eballinger
Contributor
  • 196 Views
  • 1 reply
  • 0 kudos

List all users groups and the actual users in them in sql

We have a bunch of cloud AD groups in Databricks, and I can see which users are in each group by using the user interface (Manage Account -> Users and groups -> Groups). I would like to be able to produce this full list in SQL. I have found the below code ...

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

Try this: SQL Query Options: Databricks SQL supports the SHOW GROUPS command to list all groups within a system. This command optionally allows filtering by specific user associations or regular expressions to identify desired groups. The statement sy...

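A minimal sketch of the SHOW GROUPS command mentioned above (the user email is hypothetical):

spark.sql("SHOW GROUPS").show()  # list all groups

# Groups a specific user belongs to:
spark.sql("SHOW GROUPS WITH USER `alice@example.com`").show()
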
Karthik_Karanm
New Contributor III
  • 1197 Views
  • 7 replies
  • 2 kudos

Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes

Hi Community, I'm currently working on a Retrieval-Augmented Generation (RAG) use case in Databricks. I've successfully implemented and served a model that uses a single Vector Search index, and everything works as expected. However, when I try to serv...

Latest Reply
Karthik_Karanm
New Contributor III
  • 2 kudos

Hello @LRALVA, thank you for your time. Please give me some clarification on this: the permission error occurred when we used multiple vector search indexes for a single model. During the model registration process in this scenario, we encountered the error. Ho...

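A minimal sketch of one plausible fix, assuming the error stems from the serving endpoint not being authorized for every index: declare each index as a model resource at logging time (index names and the pyfunc wrapper are hypothetical):

import mlflow
from mlflow.models.resources import DatabricksVectorSearchIndex

class MyRagModel(mlflow.pyfunc.PythonModel):  # stub standing in for the real RAG model
    def predict(self, context, model_input):
        return model_input

mlflow.pyfunc.log_model(
    artifact_path="rag_model",
    python_model=MyRagModel(),
    resources=[  # every index the model queries, so serving can be authorized for all
        DatabricksVectorSearchIndex(index_name="main.rag.docs_index_a"),
        DatabricksVectorSearchIndex(index_name="main.rag.docs_index_b"),
    ],
)
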
6 More Replies
thiagoawstest
Contributor
  • 6872 Views
  • 3 replies
  • 0 kudos

Save file to /tmp

Hello, I have Python code that collects data as JSON and sends it to an S3 bucket; everything works fine. But when there is a lot of data, it causes memory overflow. So I want to save locally, for example in /tmp or dbfs:/tmp, and afterwards send it to ...

Latest Reply
JimBiard
New Contributor
  • 0 kudos

I am experiencing the same problem. I create a file in /tmp and can verify that it exists. But when an attempt is made to open the file using pyspark, the file is not found. I noticed that the path I used to create the file is /tmp/foobar.parquet and...

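A minimal sketch of the path mix-up described in the reply: plain Python writes to the driver's local disk, while Spark resolves unqualified paths against DBFS, so qualifying the scheme makes them agree (file name hypothetical):

# Plain Python I/O lands on the driver's local filesystem.
with open("/tmp/sample.json", "w") as f:
    f.write('{"x": 1}\n')

# Spark would look for dbfs:/tmp/sample.json by default; point it at local disk:
df = spark.read.json("file:/tmp/sample.json")
df.show()
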
2 More Replies
DanielW
New Contributor
  • 299 Views
  • 10 replies
  • 2 kudos

Resolved! Databricks REST API Swagger definition not handling bigint or integer

I want to test creating a custom connector in a Power App that connects to a table in Databricks. The issue is if I have any columns like int or bigint: no matter what I define in the response in my Swagger definition (see below), it is not the correct type...

(Two screenshots attached.)
Latest Reply
DanielW
New Contributor
  • 2 kudos

Hi, probably one to pick up next week, but I attempted to parameterize my SQL statement and it was painful! parameters: - name: body in: body required: true schema: type: object properties: ...

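For reference, the standard OpenAPI/Swagger mapping for 64-bit columns is type: integer with format: int64; whether Power Apps honors the format hint is a separate question. A minimal sketch of a response schema as a Python dict (field names hypothetical):

response_schema = {
    "type": "object",
    "properties": {
        "row_count": {"type": "integer", "format": "int32"},  # Databricks int column
        "order_id":  {"type": "integer", "format": "int64"},  # Databricks bigint column
    },
}
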
9 More Replies
Phani1
Valued Contributor II
  • 242 Views
  • 2 replies
  • 0 kudos

Potential Challenges of Using Iceberg Format (Databricks + Iceberg)

Hi Team, what are the potential challenges of using the Iceberg format instead of Delta for saving data in Databricks? Regards, Phani

Latest Reply
LRALVA
Honored Contributor
  • 0 kudos

Hi @Phani1, using Apache Iceberg instead of Delta Lake for saving data in Databricks can unlock cross-platform compatibility, but it comes with several potential challenges, especially within the Databricks ecosystem, which is natively optimized for Delta L...

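One commonly cited middle ground is UniForm: keep writing Delta but expose Iceberg metadata for external readers. A minimal sketch, assuming current Databricks table-property names (table name hypothetical):

spark.sql("""
    CREATE TABLE main.lake.events (id BIGINT, ts TIMESTAMP)
    TBLPROPERTIES (
      'delta.enableIcebergCompatV2' = 'true',
      'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")
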
1 More Reply
Upendra_Dwivedi
New Contributor III
  • 127 Views
  • 2 replies
  • 0 kudos

How to enable Databricks Apps User Authorization?

Hi all, I am working on implementing user authorization in my Databricks app, but to enable user auth it is asking: "A workspace admin must enable this feature to be able to request additional scopes. The user's API downscoped access token is incl...

Latest Reply
SP_6721
New Contributor III
  • 0 kudos

Hi @Upendra_Dwivedi, to enable this feature you'll need to go to Apps in your workspace and turn on the On-Behalf-Of User Authorization option. After that, when you're creating or editing your app, make sure to select the necessary user API scopes, t...

1 More Reply