Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

5UDO
by New Contributor II
  • 2975 Views
  • 6 replies
  • 4 kudos

Databricks warehouse table optimization

Hi everyone, I just started using Databricks and wanted to evaluate read speeds when using the Databricks warehouse. So I've generated a dataset of 100M records, which contains name, surname, date of birth, phone number, and an address. Dat...

Latest Reply
5UDO
New Contributor II
  • 4 kudos

Hi Brahmareddy and AndrewN, thank you for your answers. I first need to apologize, as I mistakenly wrote that I got 270 ms by hashing the date of birth, surname, and name and then using Z-ordering. I actually achieved around 290 ms with hashing...
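The composite-hash approach discussed in this thread can be sketched in plain Python. The field names and the SHA-256 choice are illustrative assumptions, not the poster's exact code:

```python
import hashlib

def hash_key(name: str, surname: str, dob: str) -> str:
    """Derive a deterministic hash from the identity fields.

    Such a derived key can serve as a single high-cardinality
    ZORDER/clustering column instead of three separate ones.
    """
    raw = f"{name}|{surname}|{dob}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()

# Same inputs always produce the same 64-character hex key.
key = hash_key("Ada", "Lovelace", "1815-12-10")
```

The delimiter between fields avoids collisions between, e.g., ("ab", "c") and ("a", "bc").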

5 More Replies
jtjohnson
by New Contributor II
  • 1778 Views
  • 4 replies
  • 0 kudos

API Definition File

Hello. We are in the process of setting up Azure APIM in front of the Databricks REST APIs. Is there an official definition file available for download? Any help would be greatly appreciated.

Latest Reply
jtjohnson
New Contributor II
  • 0 kudos

Thank you for the feedback. The Postman collection would be ideal, but the link is no longer active.

3 More Replies
harika5991
by New Contributor II
  • 1652 Views
  • 1 reply
  • 0 kudos

Unable to create a metastore for Unity Catalog as I don't have Account Admin rights

Hello guys, I just started learning Databricks. I created a Databricks workspace via the Azure Portal using the Trial (Premium - 14-Days Free DBUs) plan. The workspace name is `easewithdata-adb`. However, I do not currently see the option to create a Un...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 0 kudos

Hi @harika5991, you're right about the root cause of your issue. Creating a Unity Catalog metastore requires Account Admin privileges, which is separate from just creating a workspace in Azure. These are options you can try: When you create a Databricks...

Louis_Frolio
by Databricks Employee
  • 7890 Views
  • 4 replies
  • 4 kudos

Resolved! What are your most impactful use cases for schema evolution in Databricks?

  Data Engineers, Share Your Experiences with Delta Lake Schema Evolution! We're calling on all data engineers to share their experiences with the powerful schema evolution feature in Delta Lake. This feature allows for seamless adaptation to changin...

Latest Reply
Louis_Frolio
Databricks Employee
  • 4 kudos

Outstanding!

3 More Replies
flashmav
by New Contributor II
  • 1143 Views
  • 1 reply
  • 0 kudos

Resolved! ConcurrentDeleteDeleteException in liquid cluster table

I am doing a merge into a table in parallel via 2 jobs. The table is a liquid clustered table with the following properties: delta.enableChangeDataFeed=true, delta.enableDeletionVectors=true, delta.enableRowTracking=true, delta.feature.changeDataFeed=supported...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hey @flashmav, keep in mind that operations in Delta Lake often occur at the file level rather than the row level. For example, if two sessions attempt to update data in the same file (even if they’re not updating the same row), you may encounter a...
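A common mitigation for file-level conflicts like this is to retry the merge with backoff. The sketch below is plain Python with a simulated merge; the exception class is a hypothetical stand-in, since in a real Databricks job the conflict error comes from the Delta runtime:

```python
import random
import time

class ConcurrentDeleteDeleteException(Exception):
    """Hypothetical stand-in for Delta's concurrent-modification error."""

def merge_with_retry(merge_fn, max_attempts=5, base_delay=0.01):
    """Retry a merge when two writers touch the same files.

    Delta resolves conflicts at the file level, so two jobs merging into
    the same table can collide even when they touch different rows;
    retrying with exponential backoff is the usual mitigation.
    """
    for attempt in range(max_attempts):
        try:
            return merge_fn()
        except ConcurrentDeleteDeleteException:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter before retrying the merge.
            time.sleep(base_delay * (2 ** attempt) * (1 + random.random()))

# Simulated merge that fails twice before succeeding.
attempts = {"n": 0}
def fake_merge():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConcurrentDeleteDeleteException()
    return "merged"
```

Partitioning or clustering the two jobs so they write disjoint key ranges reduces how often the retry path is hit at all.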

SoniSole
by New Contributor II
  • 9967 Views
  • 6 replies
  • 6 kudos

Issue with Docker Image connection

Hello, I have created and pushed a Docker image to Azure Container Registry. I used that image to start the cluster in Databricks. But the cluster doesn't start, and so when I try to run a Databricks job using that cluster, I get this error bel...

Latest Reply
jeremy98
Honored Contributor
  • 6 kudos

We have the same issue right now. What is the problem?

5 More Replies
vgupta
by New Contributor II
  • 10807 Views
  • 6 replies
  • 4 kudos

DLT | Cluster terminated by System-User | INTERNAL_ERROR: Communication lost with driver. Cluster 0312-140502-k9monrjc was not reachable for 120 seconds

Dear Community, hope you are doing well. For the last couple of days I have been seeing very strange issues with my DLT pipeline: every 60-70 minutes it fails in continuous mode with the error INTERNAL_ERROR: Communication lost with driver. Clu...

Latest Reply
Rahiman
Databricks Partner
  • 4 kudos

We had a similar error for one of our DLT pipelines. This can sometimes be caused by compute size; we increased the compute size in our DLT pipeline, but still saw this error while processing a very large file. We then added the below para...

5 More Replies
aswinvishnu
by New Contributor II
  • 1703 Views
  • 3 replies
  • 1 kudos

Exporting table to GCS bucket using job

Hi all. Use case: I want to send the result of a query to a GCS bucket location in JSON format. Approach: from my Java-based application I create a job, and that job runs a notebook. The notebook will have something like this: ```query = "SELECT * FR...

Latest Reply
LorelaiSpence
New Contributor II
  • 1 kudos

Consider using GCS signed URLs or access tokens for secure access.

2 More Replies
Maverick1
by Valued Contributor II
  • 5971 Views
  • 6 replies
  • 6 kudos

How to infer the online feature store table via an MLflow registered model deployed to a SageMaker endpoint?

Can an MLflow registered model automatically infer the online feature store table if that model is trained and logged via a Databricks Feature Store table and the table is pushed to an online feature store (like AWS RDS)?

Latest Reply
Janifer45
New Contributor II
  • 6 kudos

Thanks for this

5 More Replies
BrianLind
by New Contributor II
  • 1083 Views
  • 2 replies
  • 0 kudos

Need access to browse onprem SQL data

Our BI team has started using Databricks and would like to browse our local (on-prem) SQL database servers from within Databricks. I'm not sure if that's even possible. So far, I've set up Databricks Secure Cluster Connectivity (SCC), created a privat...

Latest Reply
Renu_
Valued Contributor II
  • 0 kudos

Hi, based on what you’ve shared, it seems you’ve already completed many of the necessary steps. Just a few things to double-check as you move forward: SQL Warehouses used for BI tools need to run in Pro mode, not serverless, since only Pro or Classic ...

1 More Replies
muano_makhokha
by New Contributor II
  • 1849 Views
  • 1 reply
  • 1 kudos

Resolved! Row filtering and Column masking not working even when the requirements are met

I have been trying to use the Row filtering and Column masking feature to redact columns and filter rows based on the group a user is in. I have all the necessary permissions and I've used clusters with version 15.4 and higher. When I run the fo...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Here are some things to consider/try:   The UnityCatalogServiceException error you are encountering, ABORTED.UC_DBR_TRUST_VERSION_TOO_OLD, generally indicates that the Databricks Runtime (DBR) version you are using no longer supports the operation, s...

meret
by Databricks Partner
  • 1576 Views
  • 1 reply
  • 0 kudos

Column Default Propagation

Hi, today I found a somewhat strange behavior when it comes to default values in columns. Apparently, column defaults are propagated to a new table when you select the column without any operation on it. This is a bit unexpected for me. Here a short...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

The behavior you described regarding the propagation of default column values is expected and is tied to the specific usage of the delta.feature.allowColumnDefaults table property in Delta Lake. Here’s an explanation: Default Propagation Without Tra...

Dinesh6351
by New Contributor II
  • 1006 Views
  • 2 replies
  • 3 kudos
Latest Reply
amos
New Contributor III
  • 3 kudos

This error occurs when your Azure account exceeds the regional quota of available cores, preventing the cluster from being created in Databricks. This means the cluster tried to use more resources than your region allows. 1 - Review the configuration of...

1 More Replies
jonhieb
by New Contributor III
  • 4548 Views
  • 6 replies
  • 3 kudos

Resolved! [Databricks Asset Bundles] Triggering Delta Live Tables

I would like to know how to schedule a DLT pipeline using DABs. I'm trying to trigger a Delta Live Tables pipeline using Databricks Asset Bundles. Below is my YAML code: resources: pipelines: data_quality_pipelines: name: data_quality_pipeline...

Latest Reply
Walter_C
Databricks Employee
  • 3 kudos

As of now, Databricks Asset Bundles do not support direct scheduling of DLT pipelines using cron expressions within the bundle configuration. Instead, you can achieve scheduling by creating a Databricks job that triggers the DLT pipeline and then sch...
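The workaround described above can be sketched in bundle YAML: wrap the pipeline in a job and schedule the job. The pipeline resource name comes from the thread; the job name, cron expression, and timezone below are assumptions for illustration:

```yaml
# Sketch only: a DAB job that triggers the DLT pipeline on a schedule,
# since pipelines themselves take no cron expression in the bundle config.
resources:
  pipelines:
    data_quality_pipelines:
      name: data_quality_pipelines
      # ... pipeline settings from the original bundle ...
  jobs:
    run_data_quality_pipeline:          # hypothetical job name
      name: run_data_quality_pipeline
      schedule:
        quartz_cron_expression: "0 0 6 * * ?"   # daily at 06:00 (assumed)
        timezone_id: "UTC"
      tasks:
        - task_key: refresh_dlt
          pipeline_task:
            pipeline_id: ${resources.pipelines.data_quality_pipelines.id}
```

After `databricks bundle deploy`, both resources are created and the job's schedule drives the pipeline refresh.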

5 More Replies
LasseL
by Databricks Partner
  • 4086 Views
  • 4 replies
  • 1 kudos

How to use change data feed when schema is changing between delta table versions?

How to use change data feed when the delta table schema changes between delta table versions? I tried to read the change data feed in parts (in the code snippet I read version 1372, because the 1371 and 1373 schema versions are different), but I'm getting the error Unsupporte...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 1 kudos

@LasseL When you read from the change data feed in batch mode, Delta Lake always uses a single schema: by default, it uses the latest table version’s schema, even if you’re only reading an older version. On Delta Runtime ≥ 12.2 LTS with column mapping e...

3 More Replies