Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices.
Hello all, I am having an issue with my Spark Streaming job: it is stuck at the "Stream Initializing" stage. I need your help understanding what happens inside the "Stream Initializing" stage of a Spark Streaming job and why it is taking so long. Here are...
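A minimal diagnostic sketch for this situation, assuming a Databricks notebook where `spark` is in scope and `query` is the handle returned by `writeStream...start()` (both names are assumptions):

```python
import json

# List every active Structured Streaming query on the cluster.
for q in spark.streams.active:
    print(q.id, q.name)

# Current state of the stuck query; during "Stream Initializing" this
# typically reports something like "Initializing sources".
print(json.dumps(query.status, indent=2))

# Progress of the last completed micro-batch; stays None until the first
# batch finishes, which itself is a hint that initialization (e.g. the
# initial source listing or checkpoint recovery) hasn't completed yet.
print(json.dumps(query.lastProgress, indent=2))
```

Long "Stream Initializing" phases often come down to checkpoint recovery or an initial file listing over a very large source directory, so the status output is a good first clue.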
Hi all, I have a use case where I need to push table data to a GCS bucket:

```python
query = "${QUERY}"
df = spark.sql(query)
gcs_path = "${GCS_PATH}"
df.write.option("maxRecordsPerFile", int("${MAX_RECORDS_PER_FILE}")).mode("${MODE}").json(gcs_path)
```

This can ...
Hi team, I have a DataFrame with 1,269,408,570,800 rows. I need to write this data to a Unity Catalog table. How can I upload such a huge quantity of data? I'm using Databricks Runtime 15.4 LTS with 4 workers, each worker of type i3.4xlarge, and a driver of type...
Hey @Nagarathna, @Lucas_TBrabo, I'd like to share my opinion and some tips that might help: 1. You should try to avoid filtering by spark_partition_id, because you can create skewed partitions; use repartition() instead, and Spark can optimize t...
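A minimal sketch of the repartition-before-write tip from the reply above; the table names and partition count are assumptions chosen for illustration:

```python
# Stand-in for the huge DataFrame from the question.
df = spark.table("main.bronze.big_source")

# Repartition into evenly sized shuffle partitions instead of slicing by
# spark_partition_id, then write straight to the Unity Catalog target.
(df.repartition(4096)
   .write
   .mode("append")
   .saveAsTable("main.silver.big_target"))
```

For a table of this size it also usually pays to write in several append batches rather than one monolithic job, so a failure doesn't restart everything.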
I performed a POC where I had to check whether we can create a new Delta table that contains only a particular version of a normal Delta table's data, without copying the data, and what happens if we then make changes or perform any operation (insert/delete/truncate/records)...
Thanks, it really helps me a lot. But there is also an issue with shallow clone: we can clone the full table data, or a particular Delta version's data using a timestamp/version, from the normal table using shallow clone, but we cannot clone the table data b...
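For reference, a minimal sketch of the version-pinned shallow clone discussed in this thread; the table names and version number are assumptions:

```python
# A shallow clone references the source table's files at version 3
# instead of copying them; later writes to the clone don't touch the
# source table.
spark.sql("""
  CREATE TABLE main.poc.sales_v3_clone
  SHALLOW CLONE main.poc.sales VERSION AS OF 3
""")
```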
Hi, I am trying to ingest abfss://datalake@datalakename.dfs.core.windows.net/Delta/Project1/sales_table, but when writing the table schema in the YAMLs, I incorrectly wrote this table into another Unity Catalog table:

```yaml
---
kind: SinkDeltaTable
metadata:
  name: ...
```
Hey @ep208, from the error message you're seeing (LOCATION_OVERLAP), it seems that Unity Catalog is still tracking a table or volume that points to the same path you're now trying to reuse: abfss://datalake@datalakename.dfs.core.windows.net/Delta/Proj...
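A hedged way to confirm which object still claims the location; the candidate table name below is an assumption:

```python
# Show the registered storage location of a suspect table; compare the
# "Location" row against the abfss:// path from the error message.
spark.sql(
    "DESCRIBE EXTENDED some_catalog.some_schema.sales_table"
).show(truncate=False)

# If it turns out to be a stale registration, dropping it releases the
# path for reuse (verify first: dropping a managed table deletes data):
# spark.sql("DROP TABLE some_catalog.some_schema.sales_table")
```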
We are running Databricks on GCP with a classic SQL warehouse on the current version (v2025.15). We have a pipeline that runs dbt on top of the SQL warehouse. Since the 9th of May, our queries have been failing intermittently with internal errors f...
Hi @utkarshamone, we faced a similar issue and I wanted to share our findings, which might help clarify what's going on. We're using a classic SQL warehouse, size L (v2025.15), and executing a dbt pipeline on top of it. Our dbt jobs started to fail with...
We're looking to implement SCD2 for tables in our lakehouse, and we need to keep track of records that are deleted in the source. Does anyone have a similar use case, and can they outline some of the challenges they faced and the workarounds they imp...
Hi @KG_777, tracking deleted records in an SCD Type 2 implementation for a lakehouse architecture is indeed a challenging but common requirement. Here's an overview of approaches, challenges, and workarounds based on industry experience: Common approach...
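One common pattern replies like this allude to, sketched minimally: when a business key disappears from the source extract, close the open SCD2 row and flag it as deleted. All table and column names here are assumptions:

```python
# Close and flag SCD2 rows whose keys no longer appear in the source.
spark.sql("""
  MERGE INTO dim.customer_scd2 AS tgt
  USING (
    SELECT d.customer_id
    FROM dim.customer_scd2 AS d
    LEFT ANTI JOIN staging.customer_source AS s
      ON d.customer_id = s.customer_id
    WHERE d.is_current = true
  ) AS gone
  ON tgt.customer_id = gone.customer_id AND tgt.is_current = true
  WHEN MATCHED THEN UPDATE SET
    is_current = false,
    is_deleted = true,
    valid_to   = current_timestamp()
""")
```

Note this only works against full extracts; with incremental feeds you need CDC-style delete events from the source instead.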
We have a bunch of cloud AD groups in Databricks, and I can see which users are in each group by using the user interface: Manage Account -> Users and groups -> Groups. I would like to be able to produce this full list in SQL. I have found the below code ...
Try this. SQL query options: Databricks SQL supports the SHOW GROUPS command to list all groups in the system. This command optionally allows filtering by specific user associations or regular expressions to identify the desired groups. The statement sy...
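A short sketch of the SHOW GROUPS variants the reply describes; the user name and pattern are placeholders:

```python
# All groups visible to the current principal.
spark.sql("SHOW GROUPS").show()

# Groups that a specific user belongs to.
spark.sql("SHOW GROUPS WITH USER `alice@example.com`").show()

# Groups matching a name pattern.
spark.sql("SHOW GROUPS LIKE 'data*'").show()
```

Note that SHOW GROUPS returns group names only; enumerating every member of every group in one result set generally still requires the SCIM Groups API rather than a single SQL statement.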
Hi Community, I'm currently working on a Retrieval-Augmented Generation (RAG) use case in Databricks. I've successfully implemented and served a model that uses a single Vector Search index, and everything works as expected. However, when I try to serv...
Hello @LRALVA, thank you for your time. Please give me some clarification on this: the permission error occurred when we used multiple vector search indexes for a single model. During the model registration process in this scenario, we encountered the error. Ho...
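One hedged possibility worth checking, sketched under the assumption that the chain is logged as a pyfunc model with a recent MLflow: every index the model touches has to be declared as a resource at logging time, so the serving endpoint can be authorized for all of them. The index names below are made up:

```python
import mlflow
from mlflow.models.resources import DatabricksVectorSearchIndex

# Declare *all* Vector Search indexes the chain queries, not just one,
# so serving can mint credentials for each of them.
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="rag_chain",
        python_model=my_chain,  # assumed pyfunc-compatible chain object
        resources=[
            DatabricksVectorSearchIndex(index_name="main.rag.docs_index"),
            DatabricksVectorSearchIndex(index_name="main.rag.faq_index"),
        ],
    )
```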
Hello, I have Python code that collects data as JSON and sends it to an S3 bucket; everything works fine. But when there is a lot of data, it causes a memory overflow. So I want to save locally, for example in /tmp or dbfs:/tmp, and after sending it to ...
I am experiencing the same problem. I create a file in /tmp and can verify that it exists, but when I attempt to open the file using PySpark, the file is not found. I noticed that the path I used to create the file is /tmp/foobar.parquet and...
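A sketch of the usual explanation for this symptom: plain Python writes to the driver's local disk, while Spark resolves bare paths against DBFS, so the scheme must be explicit (the parquet file name is taken from the post above; the text file is a hypothetical example):

```python
# Plain Python I/O targets the driver's LOCAL filesystem:
with open("/tmp/marker.txt", "w") as f:
    f.write("written on the driver")

# Spark treats a bare "/tmp/..." as "dbfs:/tmp/..." -- a different
# filesystem -- hence "file not found". Use the file: scheme instead:
df = spark.read.parquet("file:///tmp/foobar.parquet")
```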
I want to test creating a custom connector in a Power App that connects to a table in Databricks. The issue is when I have any columns like int or bigint: no matter what I define in the response in my swagger definition (see below), it is not the correct type...
Hi, probably one to pick up next week, but I attempted to parameterise my SQL statement and it was painful!

```yaml
parameters:
  - name: body
    in: body
    required: true
    schema:
      type: object
      properties:
        ...
```
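For the original bigint question, a hedged guess at the response-schema shape (the property name is invented; `format: int64` is how Swagger 2.0 expresses a 64-bit integer):

```yaml
responses:
  '200':
    description: Query result
    schema:
      type: object
      properties:
        order_count:        # hypothetical bigint column
          type: integer
          format: int64
```

One thing worth verifying: if the connector calls the Databricks SQL Statement Execution API, the JSON payload carries cell values as strings regardless of the column type, so a schema declaring integer may simply not match the actual response.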
Hi team, what are the potential challenges of using the Iceberg format instead of Delta for saving data in Databricks? Regards, Phani
Hi @Phani1, using Apache Iceberg instead of Delta Lake for saving data in Databricks can unlock cross-platform compatibility, but it comes with several potential challenges, especially within the Databricks ecosystem, which is natively optimized for Delta L...
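One middle ground often raised in this trade-off is Delta UniForm: the table stays Delta (keeping the Databricks-native optimizations) while Iceberg metadata is generated for external readers. A minimal sketch with an assumed table name:

```python
# A Delta table with UniForm enabled, so Iceberg clients can read it.
spark.sql("""
  CREATE TABLE main.analytics.events (
    id BIGINT,
    ts TIMESTAMP
  )
  TBLPROPERTIES (
    'delta.enableIcebergCompatV2' = 'true',
    'delta.universalFormat.enabledFormats' = 'iceberg'
  )
""")
```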
Hi all, I am working on implementing user authorization in my Databricks app, but to enable user auth it says: "A workspace admin must enable this feature to be able to request additional scopes. The user's API downscoped access token is incl...
Hi @Upendra_Dwivedi, to enable this feature, you'll need to go to Apps in your workspace and turn on the On-Behalf-Of User Authorization option. After that, when you're creating or editing your app, make sure to select the necessary user API scopes, t...
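Once that option is on, the app receives the user's downscoped token in a forwarded request header. A minimal sketch assuming a Flask-based Databricks App:

```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/whoami")
def whoami():
    # Databricks Apps forward the on-behalf-of user token in this header
    # when user authorization is enabled for the app.
    user_token = request.headers.get("X-Forwarded-Access-Token")
    # Pass user_token to the Databricks SDK/REST API to act as the user.
    return {"has_user_token": user_token is not None}
```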