Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

olivier-soucy
by Contributor
  • 2995 Views
  • 4 replies
  • 1 kudos

Resolved! Spark Streaming foreachBatch with Databricks connect

I'm trying to use the foreachBatch method of a Spark Streaming DataFrame with databricks-connect. Given that Spark Connect support was added to `foreachBatch` in 3.5.0, I was expecting this to work. Configuration: - DBR 15.4 (Spark 3.5.0) - databrick...

Latest Reply
daniel_sahal
Esteemed Contributor
  • 1 kudos

@olivier-soucy Are you sure that you're using DBR 15.4 and databricks-connect 15.4.2? I've seen this issue when using databricks-connect 15.4.x with DBR 14.3 LTS. Anyway, I've just tested that with the same versions you've provided and it works on my en...

3 More Replies
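For anyone landing on this thread later, here is a minimal sketch of the setup under discussion. The sink table name and trigger are hypothetical placeholders, and it assumes a databricks-connect version matching the cluster's DBR release, as noted in the reply above:

```python
# Minimal sketch of foreachBatch over Databricks Connect.
# Assumes databricks-connect matches the cluster's DBR version
# (e.g. 15.4.x against DBR 15.4). Table name is a placeholder.

def process_batch(batch_df, batch_id: int) -> None:
    # Runs on the cluster for each micro-batch; keep it self-contained
    # so it can be serialized over Spark Connect.
    (batch_df
     .withColumnRenamed("value", "event_id")
     .write.mode("append")
     .saveAsTable("main.default.events_sink"))

def run_stream() -> None:
    # Import inside the function so this sketch can be read/tested
    # without databricks-connect installed locally.
    from databricks.connect import DatabricksSession

    spark = DatabricksSession.builder.getOrCreate()
    query = (spark.readStream.format("rate").load()
             .writeStream
             .foreachBatch(process_batch)
             .trigger(availableNow=True)
             .start())
    query.awaitTermination()
```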
SharathE
by New Contributor III
  • 1667 Views
  • 2 replies
  • 0 kudos

Incremental refresh of materialized view in serverless DLT

Hello, every time that I run a Delta Live Tables materialized view in serverless, I get a log of "COMPLETE RECOMPUTE". How can I achieve incremental refresh in serverless DLT pipelines?

Latest Reply
drewipson
New Contributor III
  • 0 kudos

Make sure you are using the aggregates and SQL restrictions outlined in this article: https://docs.databricks.com/en/optimizations/incremental-refresh.html If a SQL function is non-deterministic (current_timestamp() is a common one) you will have a CO...

1 More Replies
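As a sketch of the restriction mentioned above, compare these two hypothetical materialized-view definitions: the first embeds a non-deterministic function and will force a full recompute, while the second derives its timestamp from the data and stays deterministic (the table and column names are placeholders):

```python
# Two ways to define the same materialized view. The first uses
# current_timestamp(), which is non-deterministic and triggers
# COMPLETE_RECOMPUTE; the second reads a load timestamp from the
# source data instead, keeping the view eligible for incremental
# refresh. Table/column names are hypothetical.

NON_INCREMENTAL_MV = """
CREATE MATERIALIZED VIEW daily_orders AS
SELECT order_date,
       count(*) AS orders,
       current_timestamp() AS refreshed_at  -- non-deterministic: full recompute
FROM orders
GROUP BY order_date
"""

INCREMENTAL_MV = """
CREATE MATERIALIZED VIEW daily_orders AS
SELECT order_date,
       count(*) AS orders,
       max(ingested_at) AS last_ingested_at  -- comes from the data: deterministic
FROM orders
GROUP BY order_date
"""
```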
deng_dev
by New Contributor III
  • 645 Views
  • 1 replies
  • 2 kudos

Autoloader File Notifications mode S3 Access Denied error

Hi everyone! We are reading JSON files from a cross-account S3 bucket using Auto Loader and decided to switch from directory listing mode to file notification mode. We have set up all the permissions mentioned here in our IAM role. But now the pipeline is fa...

Latest Reply
drewipson
New Contributor III
  • 2 kudos

You need to be sure you have an instance profile configured with PassRole permissions so that it can assume the cross-account role to access the bucket and file notification resources. I found this technical blog helpful: https://community.databricks...

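For reference, these are the Auto Loader options typically involved when switching to file notification mode against a cross-account bucket. The role ARN, region, and path below are hypothetical placeholders; the cluster's instance profile must be allowed to assume the role:

```python
# Auto Loader file-notification options for a cross-account S3 bucket.
# The ARN/region values are placeholders; the instance profile attached
# to the cluster must be able to sts:AssumeRole into cloudFiles.roleArn,
# which Auto Loader uses to manage the SNS topic and SQS queue.
AUTOLOADER_OPTIONS = {
    "cloudFiles.format": "json",
    "cloudFiles.useNotifications": "true",
    "cloudFiles.roleArn": "arn:aws:iam::123456789012:role/autoloader-xacct",
    "cloudFiles.region": "us-east-1",
}

def read_stream(spark, path: str):
    # Apply the options to a cloudFiles stream reader.
    reader = spark.readStream.format("cloudFiles")
    for key, value in AUTOLOADER_OPTIONS.items():
        reader = reader.option(key, value)
    return reader.load(path)
```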
AntonPera
by New Contributor
  • 558 Views
  • 1 replies
  • 0 kudos

Lakehouse Monitoring - change profile type

I recently started to experiment with Lakehouse Monitoring. I created a monitor based on the Time Series profile type. However, I want to change from Time Series to Snapshot. I have deleted the two previously created tables, drift_metrics and profile_metric...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @AntonPera, if the dropdown to change the profile type is disabled, you might need to create a new monitor from scratch. Here's how you can do it: go to the Lakehouse Monitoring section in Databricks, create a new monitor, and select the Snapshot p...

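Recreating the monitor can also be scripted. A sketch using the Databricks SDK for Python, assuming the `quality_monitors` API available in recent SDK releases; the table, schema, and directory names are placeholders:

```python
# Sketch: recreate a Lakehouse Monitoring monitor with a Snapshot
# profile via the Databricks SDK. Assumes a recent databricks-sdk
# (quality_monitors API) and a workspace with UC enabled. Names below
# are hypothetical placeholders.

def create_snapshot_monitor(table_name: str = "main.default.my_table"):
    # Imports kept inside the function so this sketch can be read
    # without databricks-sdk installed locally.
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.catalog import MonitorSnapshot

    w = WorkspaceClient()
    return w.quality_monitors.create(
        table_name=table_name,
        assets_dir=f"/Workspace/Shared/monitors/{table_name}",
        output_schema_name="main.monitoring",
        snapshot=MonitorSnapshot(),  # Snapshot profile instead of Time Series
    )
```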
Pnascima
by New Contributor
  • 607 Views
  • 1 replies
  • 0 kudos

Help - For Each Workflows Performance Use Case

Hey guys, I've been going through a performance problem in my current Workflow. Here's my use case: we have several Notebooks, each one responsible for calculating a specific metric (such as AOV, GMV, etc.). I made a pipeline that creates a datafram...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @Pnascima, when using the Serverless cluster, what was the T-shirt sizing? Looking at your issue with the dedicated cluster, it sounds to me like a resource issue (assuming no data volume changes). You would have to find a comparable size of the interact...

ShakirHossain
by New Contributor
  • 1753 Views
  • 1 replies
  • 0 kudos

curl: (35) error:0A000126:SSL routines::unexpected eof while reading

Hello, I am new to Databricks and have a new workspace created. I get this error message in my bash terminal even when I run the databricks --help command. What am I missing, and how should I configure it? Please let me know if any further details are needed.

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Are you referring to the databricks-cli? If so, are you perhaps sitting behind a firewall or proxy? If you are, you may need to export the proxy settings in your terminal (export HTTP_PROXY=$proxy; export HTTPS_PROXY=$proxy, with their cor...

Dp15
by Contributor
  • 682 Views
  • 1 replies
  • 0 kudos

Databricks JDBC Insert into Array field

Hi, I am trying to insert some data into a Databricks table which has Array<String> fields (field1 & field2). I am using JDBC for the connection and my POJO class looks like this: public class A { private Long id; private String[] field1; priv...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

The error you're encountering, [Databricks][JDBC](11500) Given type does not match given object: [Ljava.lang.String;@3e1346b0, indicates that the JDBC driver is not recognizing the Java String[] array as a valid SQL array type. This is a common issue...

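Since the (11500) error shows the driver rejecting a bound Java `String[]`, one common workaround (not necessarily what the truncated reply goes on to recommend) is to inline an `ARRAY(...)` literal into the INSERT statement instead of binding the array as a parameter. A sketch of that rendering, with a hypothetical table name:

```python
def sql_array_literal(values):
    # Render a Python list of strings as a Databricks SQL ARRAY literal,
    # escaping embedded single quotes.
    escaped = ", ".join("'" + v.replace("'", "\\'") + "'" for v in values)
    return f"ARRAY({escaped})"

def build_insert(id_, field1, field2):
    # Inline the array literals rather than binding String[] via
    # setObject, which the JDBC driver does not accept as a SQL array.
    return (
        "INSERT INTO my_table (id, field1, field2) "
        f"VALUES ({id_}, {sql_array_literal(field1)}, {sql_array_literal(field2)})"
    )
```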
Vivek_Singh
by New Contributor III
  • 512 Views
  • 1 replies
  • 0 kudos

Getting error :USER_DEFINED_FUNCTIONS.CORRELATED_REFERENCES_IN_SQL_UDF_CALLS_IN_DML_COMMANDS_NOT_IMP

Hello folks, I need help. I implemented row-level security in Unity Catalog and it is working as expected; however, while deleting a record I get the error detailed below: [USER_DEFINED_FUNCTIONS.CORRELATED_REFERENCES_IN_SQL_UDF_CALLS_IN_DML_COMMANDS_NOT_IM...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Correlated subqueries within SQL user-defined functions (UDFs) used for row-level security are currently not supported for DELETE operations in Unity Catalog. You will need to adjust your row_filter_countryid_source_table UDF to avoid correlated ...

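To illustrate the rewrite being suggested, compare a correlated row-filter UDF with an uncorrelated one. The function, table, and group names below are hypothetical placeholders, not the poster's actual `row_filter_countryid_source_table` definition:

```python
# Two hypothetical row-filter UDF definitions. The first embeds a
# correlated subquery, which blocks DELETE under Unity Catalog row-level
# security; the second checks group membership directly with
# is_account_group_member(), avoiding the correlation.

CORRELATED_FILTER = """
CREATE FUNCTION row_filter(country_id INT)
RETURN EXISTS (
  SELECT 1 FROM entitlements e          -- correlated subquery: blocks DELETE
  WHERE e.country_id = row_filter.country_id
    AND e.user_name = current_user()
)
"""

UNCORRELATED_FILTER = """
CREATE FUNCTION row_filter(country_id INT)
RETURN CASE
  WHEN is_account_group_member('admins') THEN TRUE
  WHEN is_account_group_member('emea') AND country_id IN (44, 49, 33) THEN TRUE
  ELSE FALSE
END
"""
```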
SankaraiahNaray
by New Contributor II
  • 4359 Views
  • 1 replies
  • 1 kudos

default auth: cannot configure default credentials

I'm trying to use dbutils from WorkspaceClient, and I tried to run this code from a Databricks notebook. But I get this error. Error: ValueError: default auth: cannot configure default credentials Code: from databricks.sdk import WorkspaceClient w = Workspac...

Latest Reply
VZLA
Databricks Employee
  • 1 kudos

To resolve the ValueError: default auth: cannot configure default credentials error when using dbutils from WorkspaceClient in a Databricks notebook, follow these steps: Ensure SDK Installation: Make sure the Databricks SDK for Python is installed. ...

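One of the steps usually involved here is passing credentials explicitly when the SDK cannot resolve default auth. A sketch, where the host and token are placeholders read from the environment:

```python
# Sketch: construct WorkspaceClient with explicit credentials instead of
# relying on default auth resolution. DATABRICKS_HOST / DATABRICKS_TOKEN
# are placeholder environment variables you would set yourself.

def make_client():
    # Imports inside the function so the sketch can be read/tested
    # without databricks-sdk installed.
    import os
    from databricks.sdk import WorkspaceClient

    # Explicit PAT auth; upgrading databricks-sdk may also let the
    # client pick up notebook-native credentials automatically.
    return WorkspaceClient(
        host=os.environ["DATABRICKS_HOST"],
        token=os.environ["DATABRICKS_TOKEN"],
    )
```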
SakuraDev1
by New Contributor II
  • 741 Views
  • 1 replies
  • 0 kudos

autoloader cache and buffer utilization error

Hey guys, I'm encountering an issue with a project that uses Auto Loader for data ingestion. The production cluster is shutting down due to the error: "The Driver restarted - possibly due to an OutOfMemoryError - and this stream has been stopped." I've i...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

The error message is sometimes generic ("possibly due to an OutOfMemoryError"). There is memory pressure indeed, but try to correlate those graph metrics with the Driver's STDOUT file content and check if the GC/full GCs are able to work properly and rec...

SakuraDev1
by New Contributor II
  • 550 Views
  • 1 replies
  • 0 kudos

autoloader cache and buffer utilization error (cross-post)

Link to post: autoloader cache and buffer utilization error, by SakuraDev1: https://community.databricks.com/t5/data-engineering/autoloader-cache-and-buffer-utilization-error/m-p/94927#M39000 Hey guys, I'm encountering an issue with a project that use...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

To address the resource scheduling and code-specific optimizations for your Auto Loader data ingestion pipeline, consider the following suggestions. Resource scheduling - dynamic allocation: enable dynamic allocation in your cluster configuration. Thi...

pesky_chris
by New Contributor III
  • 837 Views
  • 1 replies
  • 0 kudos

Resolved! Support of Dashboards in Databricks Asset Bundles

Hello Databricks & fellow users, I noticed that support for Dashboards in DABs is coming soon (per the recent Databricks CLI pull request). Does anyone know if there are additional features planned to enhance the dashboard lifecycle? Currently, Git Fo...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

I can see that the usage of Git/Repos in Lakeview Dashboards is already in development. There is no ETA yet for when this will be GA, but we can confirm it is in progress.

Dave_Nithio
by Contributor II
  • 1270 Views
  • 1 replies
  • 0 kudos

Production vs Development DLT Schema

My organization is currently ingesting data utilizing a Delta Live Table pipeline. This pipeline points to a production Storage location and Target schema. This means that whenever we make changes to this pipeline, it directly impacts the production ...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

To test changes to your Delta Live Table (DLT) pipeline without impacting production data, you can point to a different storage location and target schema. This does not require creating a completely separate DLT pipeline. Here are the steps: Create...

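The approach described above can be sketched as one pipeline definition with environment-specific target schema and storage location, so changes are validated in dev before touching production tables. The schema names and paths below are hypothetical placeholders:

```python
# Sketch: environment-specific settings to merge into a single DLT
# pipeline spec, so dev runs write to a separate schema and storage
# location. All names/paths are placeholders.

PIPELINE_SETTINGS = {
    "dev": {
        "target": "dev_schema",
        "storage": "abfss://dev@mystorageacct.dfs.core.windows.net/dlt",
        "development": True,   # relaxed retries, reusable cluster
    },
    "prod": {
        "target": "prod_schema",
        "storage": "abfss://prod@mystorageacct.dfs.core.windows.net/dlt",
        "development": False,
    },
}

def settings_for(env: str) -> dict:
    # Select the block to merge into the pipeline's JSON settings.
    return PIPELINE_SETTINGS[env]
```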
adhi_databricks
by New Contributor III
  • 531 Views
  • 1 replies
  • 0 kudos

DATABRICKS CLEANROOMS

Hi Team, I have a few questions regarding Databricks Cleanrooms: For onboarding first-party data, does the collaborator need a Databricks account with an enabled UC workspace? How is it useful for activating data for retargeting or prospecting use cases...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

For onboarding first-party data, the collaborator does need a Databricks account with an enabled Unity Catalog (UC) workspace. This is necessary to map system tables into its metastore and to observe non-UC governed assets. Activating data for retarg...

sanket-kelkar
by New Contributor II
  • 805 Views
  • 1 replies
  • 0 kudos

Auto OPTIMIZE causing a data discrepancy

I have a Delta table in Azure Databricks that gets MERGEd every 10 minutes. In the attached screenshot, in the version history of this table, I see a MERGE operation every 10 minutes, which is expected. Along with that, I see the OPTIMIZE operation aft...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Can you please provide more context about this, specifically with respect to the DBR Release and reproducibility of this scenario? Any metrics or plan change differences between both select statements, while the Optimize was in progress and after? Th...

