Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Livingstone
by New Contributor II
  • 684 Views
  • 1 reply
  • 1 kudos

Install Maven package on a serverless cluster

My task is to export data from CSV/SQL into Excel format with minimal latency. To achieve this, I used a serverless cluster. Since PySpark does not support saving in XLSX format, it is necessary to install the Maven package spark-excel_2.12. However, ...

Latest Reply
Nurota
New Contributor II
  • 1 kudos

I have a similar issue: how do I install a Maven package in a notebook when running on a serverless cluster? I need to install com.crealytics:spark-excel_2.12:3.4.2_0.20.3 in the notebook, the way PyPI libraries are installed in a notebook, e.g. %p...
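Serverless compute does not allow attaching Maven/JVM libraries the way classic clusters do, so a common workaround is to stay in Python: install a PyPI writer such as openpyxl with %pip and write the XLSX on the driver. A minimal sketch (the openpyxl dependency and the output path are assumptions, not confirmed by the thread):

```python
# In a serverless notebook cell (magic commands shown as comments):
#   %pip install openpyxl
#   dbutils.library.restartPython()

def write_xlsx(spark_df, path):
    """Sketch: collect a small Spark DataFrame and write XLSX via pandas.

    Assumes openpyxl is installed; only suitable for results that fit in
    driver memory, unlike the JVM-based spark-excel writer.
    """
    pdf = spark_df.toPandas()
    pdf.to_excel(path, index=False, engine="openpyxl")
```

This trades the distributed spark-excel writer for a driver-local one, which matches the original goal (CSV/SQL exports of modest size) but will not scale to very large results.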

AntonPera
by New Contributor
  • 264 Views
  • 1 reply
  • 0 kudos

Lakehouse Monitoring - change profile type

I recently started to experiment with Lakehouse Monitoring. I created a monitor based on the Time Series profile type. However, I want to change from Time Series to Snapshot. I have deleted the two previously created tables, drift_metrics and profile_metric...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @AntonPera, if the dropdown to change the profile type is disabled, you might need to create a new monitor from scratch. Here's how you can do it: go to the Lakehouse Monitoring section in Databricks. Create a new monitor and select the Snapshot p...

Pnascima
by New Contributor
  • 292 Views
  • 1 reply
  • 0 kudos

Help - For Each Workflows Performance Use Case

Hey guys, I've been going through a performance problem in my current workflow. Here's my use case: we have several notebooks, each one responsible for calculating a specific metric (such as AOV, GMV, etc.). I made a pipeline that creates a datafram...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @Pnascima, when using the serverless cluster, what was the T-shirt sizing? Looking at your issue with the dedicated cluster, it sounds to me like a resource issue (hoping there were no data volume changes). You would have to find a comparable size of the interact...
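For the for-each pattern described above, one lever independent of cluster sizing is how many metric notebooks run concurrently. A stdlib sketch of the fan-out (compute_metric stands in for a hypothetical dbutils.notebook.run call; names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def compute_metric(name):
    # Stand-in for something like:
    #   dbutils.notebook.run("metrics/" + name, 3600, {"metric": name})
    return f"{name}: ok"

metrics = ["AOV", "GMV"]
# Cap max_workers at what the cluster can actually schedule in parallel;
# oversubscribing just queues tasks and hides the real bottleneck.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(compute_metric, metrics))
```

If the tasks are already parallel in the Workflow's For Each task, the equivalent knob is its concurrency setting rather than thread count.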

ShakirHossain
by New Contributor
  • 726 Views
  • 1 reply
  • 0 kudos

curl: (35) error:0A000126:SSL routines::unexpected eof while reading

Hello, I am new to Databricks and have a new workspace created. I get this error message in my bash terminal even when I run the databricks --help command. What am I missing, and how should I configure it? Please let me know if any further details are needed.

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Are you referring to the databricks-cli? If so, are you perhaps sitting behind a firewall or proxy? If you are, you may need to export the proxy settings in your terminal (export HTTP_PROXY=$proxy; export HTTPS_PROXY=$proxy, with their cor...
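The exports the reply mentions can also be set from Python before shelling out to the CLI, since child processes inherit the parent's environment. The proxy URL below is a placeholder:

```python
import os

proxy = "http://proxy.example.com:8080"  # placeholder; use your proxy's address
os.environ["HTTP_PROXY"] = proxy
os.environ["HTTPS_PROXY"] = proxy

# Child processes inherit these variables, so the CLI picks the proxy up:
# subprocess.run(["databricks", "--help"], check=True)
```

Setting the variables in the shell profile (~/.bashrc or similar) makes the fix persistent across terminal sessions.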

Garrus990
by New Contributor
  • 400 Views
  • 1 reply
  • 0 kudos

How to run a python task that uses click for CLI operations

Hey, in my application I am using Click to facilitate CLI operations. It works in notebooks and when scripts are run locally, but it fails in Databricks. I defined a task that, as an entrypoint, accepts the file where the click-decorated functio...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

The SystemExit issue you’re seeing is typical with Click, as it’s designed for standalone CLI applications and automatically calls sys.exit() after running a command. This behavior can trigger SystemExit exceptions in non-CLI environments, like Datab...
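The SystemExit behavior is easy to reproduce and guard against with the standard library alone; with Click itself, invoking the command via cmd.main(args, standalone_mode=False) suppresses the exit call and returns the callback's result instead. A sketch:

```python
import sys

def cli_entry():
    # A Click command ends by calling sys.exit(), which raises SystemExit
    # inside a notebook or job task instead of terminating a shell process.
    sys.exit(0)

try:
    cli_entry()
except SystemExit as e:
    exit_code = e.code  # 0 on success; non-zero signals a CLI error

# With Click, prefer invoking the command directly:
#   result = my_command.main(args, standalone_mode=False)
```

Either approach (catching SystemExit at the task entrypoint, or standalone_mode=False) keeps a zero exit from being reported as a task failure.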

Dp15
by Contributor
  • 262 Views
  • 1 reply
  • 0 kudos

Databricks JDBC Insert into Array field

Hi, I am trying to insert some data into a Databricks table which has Array&lt;String&gt; fields (field1 & field2). I am using JDBC for the connection, and my POJO class looks like this: public class A { private Long id; private String[] field1; priv...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

The error you're encountering, [Databricks][JDBC](11500) Given type does not match given object: [Ljava.lang.String;@3e1346b0, indicates that the JDBC driver is not recognizing the Java String[] array as a valid SQL array type. This is a common issue...
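When a driver rejects a raw Java array binding, a common fallback is to inline a SQL array(...) literal in the statement rather than bind the parameter. A hedged sketch of the literal builder (escaping only single quotes; the table and column names are hypothetical):

```python
def array_literal(values):
    """Render a list of strings as a Databricks SQL array(...) literal."""
    escaped = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return f"array({escaped})"

vals = ["a", "b'c"]
# Hypothetical statement; my_table / field1 are illustrative names.
sql = f"INSERT INTO my_table (id, field1) VALUES (1, {array_literal(vals)})"
```

On the Java side, the standard JDBC route is java.sql.Connection.createArrayOf, though support for it varies by driver, which is why the literal fallback is worth knowing.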

Vivek_Singh
by New Contributor III
  • 217 Views
  • 1 reply
  • 0 kudos

Getting error: USER_DEFINED_FUNCTIONS.CORRELATED_REFERENCES_IN_SQL_UDF_CALLS_IN_DML_COMMANDS_NOT_IMP

Hello folks, I need help. I implemented row-level security in Unity Catalog, and it is working as expected. However, while deleting a record I get the error enclosed in detail: [USER_DEFINED_FUNCTIONS.CORRELATED_REFERENCES_IN_SQL_UDF_CALLS_IN_DML_COMMANDS_NOT_IM...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

The correlated subqueries within SQL User-Defined Functions (UDFs) used for row-level security are currently not supported for DELETE operations in Unity Catalog. You will need to adjust your row_filter_countryid_source_table UDF to avoid correlated ...
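As an illustration of removing the correlated reference, a row filter can often be rewritten to use only its own argument plus group-membership checks, with no subquery over the filtered table. The function name, group name, and country IDs below are hypothetical:

```sql
-- Sketch: a row filter with no correlated subquery, so it remains valid
-- when invoked by DML commands such as DELETE.
CREATE OR REPLACE FUNCTION row_filter_countryid(country_id INT)
RETURN is_account_group_member('emea_analysts') OR country_id IN (44, 49, 33);
```

If the allowed IDs must stay in a mapping table, materializing that mapping into the filter (or checking membership per group) avoids the correlated reference that the error names.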

SankaraiahNaray
by New Contributor II
  • 1207 Views
  • 1 reply
  • 0 kudos

default auth: cannot configure default credentials

I'm trying to use dbutils from WorkspaceClient, and I tried to run this code from a Databricks notebook, but I get this error. Error: ValueError: default auth: cannot configure default credentials. Code: from databricks.sdk import WorkspaceClient; w = Workspac...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

To resolve the ValueError: default auth: cannot configure default credentials error when using dbutils from WorkspaceClient in a Databricks notebook, follow these steps: Ensure SDK Installation: Make sure the Databricks SDK for Python is installed. ...
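The SDK's default credential chain reads environment variables such as DATABRICKS_HOST and DATABRICKS_TOKEN, so one quick check is to set them before constructing the client. The values below are placeholders; inside a Databricks notebook, WorkspaceClient() should normally resolve credentials without any of this:

```python
import os

# Placeholders only; substitute your workspace URL and a valid PAT.
os.environ["DATABRICKS_HOST"] = "https://example.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = "dapi-xxxx"

# With the variables set, default auth can resolve:
# from databricks.sdk import WorkspaceClient
# w = WorkspaceClient()
# w.dbutils.fs.ls("/")
```

Passing host= and token= directly to WorkspaceClient(...) is an equivalent, more explicit alternative to the environment variables.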

SakuraDev1
by New Contributor II
  • 314 Views
  • 1 reply
  • 0 kudos

autoloader cache and buffer utilization error

Hey guys, I'm encountering an issue with a project that uses Auto Loader for data ingestion. The production cluster is shutting down due to the error: The Driver restarted - possibly due to an OutOfMemoryError - and this stream has been stopped. I've i...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

The error message is sometimes generic ("possibly due to an OutOfMemoryError"). There is indeed memory pressure, but try to correlate those graph metrics with the Driver's STDOUT file content and check whether the GC/full GCs are able to work properly and rec...

SakuraDev1
by New Contributor II
  • 280 Views
  • 1 reply
  • 0 kudos

Re: autoloader cache and buffer utilization error

Link to post: autoloader cache and buffer utilization error, by SakuraDev1: https://community.databricks.com/t5/data-engineering/autoloader-cache-and-buffer-utilization-error/m-p/94927#M39000

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

To address resource scheduling and code-specific optimizations for your Auto Loader data ingestion pipeline, consider the following suggestions. Resource scheduling, dynamic allocation: enable dynamic allocation in your cluster configuration. Thi...
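The dynamic-allocation suggestion above translates to a handful of Spark conf entries on the cluster; a sketch with illustrative bounds (tune min/max and the idle timeout to your workload, and note that Databricks autoscaling is the managed alternative to raw dynamic allocation):

```python
# Illustrative Spark conf for dynamic allocation on a classic cluster.
spark_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "8",
    # Release executors that have been idle for a minute.
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}
```

These keys go into the cluster's Spark configuration (UI or API); the dict form above is just a convenient way to pass them through the Clusters API payload.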

pesky_chris
by New Contributor III
  • 414 Views
  • 1 reply
  • 0 kudos

Resolved! Support of Dashboards in Databricks Asset Bundles

Hello Databricks & fellow users, I noticed that support for Dashboards in DABs is coming soon (per the recent Databricks CLI pull request). Does anyone know if there are additional features planned to enhance the dashboard lifecycle? Currently, Git Fo...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

I can see that the use of Git/Repos in Lakeview Dashboards is already in development. There is no ETA yet for when this will be GA, but we can confirm it is in progress.

cool_cool_cool
by New Contributor II
  • 383 Views
  • 1 reply
  • 0 kudos

Databricks Workflow is stuck on the first task and doesn't do any work

Heya, I have a workflow in Databricks with two tasks. They are configured to run on the same job cluster, and the second task depends on the first. I have a weird behavior that has happened twice now: the job takes a long time (it usually finishes within 30...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Given the provided context, the suggestion is to capture thread dumps from both the Spark Driver and any Active Executor when the task seems to be hung. Ideally, you should also be able to find in the Spark logs for the active executor with the hung ...

Dave_Nithio
by Contributor
  • 374 Views
  • 1 reply
  • 0 kudos

Production vs Development DLT Schema

My organization is currently ingesting data utilizing a Delta Live Table pipeline. This pipeline points to a production Storage location and Target schema. This means that whenever we make changes to this pipeline, it directly impacts the production ...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

To test changes to your Delta Live Table (DLT) pipeline without impacting production data, you can point to a different storage location and target schema. This does not require creating a completely separate DLT pipeline. Here are the steps: Create...
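The steps above amount to overriding a couple of settings per environment on the same pipeline definition; a sketch of a dev-side settings fragment (the schema name and storage path are placeholders):

```python
# Hypothetical per-environment DLT pipeline settings (dev copy).
dev_pipeline_settings = {
    "name": "ingestion-pipeline-dev",
    "target": "dev_schema",  # dev target schema instead of production
    "storage": "abfss://dev@account.dfs.core.windows.net/dlt",  # dev storage
    "development": True,  # development mode: faster iteration on the cluster
}
```

Keeping the notebooks shared and varying only these fields means a change can be validated against the dev target before the same code runs with the production settings.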

adhi_databricks
by New Contributor III
  • 235 Views
  • 1 reply
  • 0 kudos

DATABRICKS CLEANROOMS

Hi Team, I have a few questions regarding Databricks Clean Rooms: For onboarding first-party data, does the collaborator need a Databricks account with a UC-enabled workspace? How is it useful for activating data for retargeting or prospecting use cases...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

For onboarding first-party data, the collaborator does need a Databricks account with an enabled Unity Catalog (UC) workspace. This is necessary to map system tables into its metastore and to observe non-UC governed assets. Activating data for retarg...

sanket-kelkar
by New Contributor II
  • 375 Views
  • 1 reply
  • 0 kudos

Auto OPTIMIZE causing a data discrepancy

I have a Delta table in Azure Databricks that gets MERGEd every 10 minutes. In the attached screenshot of the table's version history, I see a MERGE operation every 10 minutes, which is expected. Along with that, I see the OPTIMIZE operation aft...

Latest Reply
VZLA
Databricks Employee
  • 0 kudos

Can you please provide more context about this, specifically with respect to the DBR Release and reproducibility of this scenario? Any metrics or plan change differences between both select statements, while the Optimize was in progress and after? Th...


Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group