Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jeremy98
by Honored Contributor
  • 2979 Views
  • 11 replies
  • 0 kudos

Restarting an always-running cluster doesn't free the memory?

Hello community, I was working on optimising the driver memory, since some of the code is not optimised for Spark, and I was planning to restart the cluster temporarily to free up the memory. That could be a potential solution, since if the cluster i...

Latest Reply
iyashk-DB
Databricks Employee
  • 0 kudos

Hi @jeremy98, the collect() operation brings data to the driver, and yes, it can cause the memory issues that you are seeing, which can also leave the cluster hung or crashed if done enough times. You may confirm these instances from the cluster even...

10 More Replies
Aravind17
by New Contributor III
  • 18 Views
  • 1 reply
  • 0 kudos

Not received free voucher after completing Data Engineer Associate learning path

I have completed the Data Engineer Associate learning path, but I haven’t received the free certification voucher yet. I’ve already sent multiple emails to the concerned support team regarding this issue, but unfortunately, I haven’t received any resp...

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @Aravind17! This post appears to duplicate the one you recently posted. A response has already been provided to your recent post. I recommend continuing the discussion in that thread to keep the conversation focused and organised.

yzhang
by Contributor
  • 2241 Views
  • 6 replies
  • 2 kudos

iceberg with partitionedBy option

I am able to create a Unity Catalog Iceberg-format table: df.writeTo(full_table_name).using("iceberg").create() However, if I add the partitionedBy option I get an error: df.writeTo(full_table_name).using("iceberg").partitionedBy("ingest_dat...

Latest Reply
LazyGenius
New Contributor II
  • 2 kudos

I found weird behavior here while creating a table using SQL. If you are creating a new table and add the partition column at the end of the column mapping it won't work, but if you add it at the beginning it will work! For example, the below query will wo...

5 More Replies
amekojc
by New Contributor II
  • 68 Views
  • 1 reply
  • 0 kudos

How to hide tab headers when embedding a dashboard

When embedding the AI/BI dashboard, is there a way to hide the tabs and instead use our own UI tabs for navigation? Currently, there are two tab headers: one in the Databricks dashboard and another tab section in our embedding webp...

Latest Reply
mukul1409
New Contributor
  • 0 kudos

Hi @amekojc, at the moment Databricks AI/BI dashboards do not support hiding or disabling the native dashboard tabs when embedding. The embedded dashboard always renders with its own tab headers, and there is no configuration or API to control tab vi...

libpekin
by New Contributor II
  • 114 Views
  • 2 replies
  • 2 kudos

Resolved! Databricks Free Edition - Accessing files in S3

Hello, I'm attempting to read/write files from S3 but got the error below. I am on the Free Edition (serverless by default), using an access_key and secret_key. Has anyone done this successfully? Thanks! Directly accessing the underlying Spark driver JVM us...

Latest Reply
libpekin
New Contributor II
  • 2 kudos

Thanks @Sanjeeb2024, I was able to confirm as well.

1 More Replies
RyanHager
by Contributor
  • 79 Views
  • 0 replies
  • 1 kudos

Liquid Clustering and S3 Performance

Are there any performance concerns when using liquid clustering with AWS S3? I believe all the Parquet files go in the same folder (prefix, in AWS S3 terms) versus folders per partition when using "partition by". And there is this note on S3 performa...

Gaurav_784295
by New Contributor III
  • 3475 Views
  • 3 replies
  • 0 kudos

pyspark.sql.utils.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets

pyspark.sql.utils.AnalysisException: Non-time-based windows are not supported on streaming DataFrames/Datasets. Getting this error while writing; can anyone please tell me how to resolve it?

Latest Reply
preetmdata
New Contributor II
  • 0 kudos

Hi @Gaurav_784295, in streaming, please use a time-based column in the window function. In streaming we can't say "last 10 rows", "limit 10", etc., because a stream never ends. So when you use window, please don't use columns lik...

2 More Replies
espenol
by New Contributor III
  • 27350 Views
  • 11 replies
  • 13 kudos

input_file_name() not supported in Unity Catalog

Hey, so our notebooks reading a bunch of json files from storage typically use a input_file_name() when moving from raw to bronze, but after upgrading to Unity Catalog we get an error message:AnalysisException: [UC_COMMAND_NOT_SUPPORTED] input_file_n...

Latest Reply
ramanpreet
New Contributor
  • 13 kudos

The reason 'input_file_name' is not supported is that the function was only available in older versions of the Databricks Runtime; it was deprecated from Databricks Runtime 13.3 LTS onwards.

10 More Replies
mydefaultlogin
by New Contributor II
  • 859 Views
  • 2 replies
  • 0 kudos

Inconsistent PYTHONPATH, Git folders vs DAB

Hello Databricks Community, I'm encountering an issue related to Python paths when working with notebooks in Databricks. I have the following structure in my project: my_notebooks/my_notebook.py, my_package/__init__.py, my_package/hello.py, databricks.yml...

Latest Reply
kenny_hero
New Contributor
  • 0 kudos

I have a related question. I'm new to the Databricks platform and I struggle with the same PYTHONPATH issue the original poster raised. I understand that using sys.path.append(...) is one approach for a notebook. This is acceptable for an ad-hoc interactive session, but thi...
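One common workaround for the issue above is to append the project root to sys.path once, early in the notebook; a minimal sketch (the project root here is hypothetical; on Databricks it is often derived from the notebook path or the bundle root):

```python
import os
import sys

# Hypothetical project root containing my_package/; substitute the
# repo or bundle root in a real workspace.
project_root = os.getcwd()

# Append once, guarding against duplicates across notebook re-runs.
if project_root not in sys.path:
    sys.path.append(project_root)

# After this, `import my_package.hello` resolves relative to project_root.
```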

1 More Replies
bsr
by New Contributor II
  • 176 Views
  • 2 replies
  • 3 kudos

Resolved! DBR 17.3.3 introduced unexpected DEBUG logs from ThreadMonitor – how to disable?

After upgrading from DBR 17.3.2 to DBR 17.3.3, we started seeing a flood of DEBUG logs like this in job outputs:```DEBUG:ThreadMonitor:Logging python thread stack frames for MainThread and py4j threads: DEBUG:ThreadMonitor:Logging Thread-8 (run) stac...
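One way to suppress output like this, assuming the messages come through Python's standard logging module under the "ThreadMonitor" logger name shown in the log lines, is to raise that logger's threshold:

```python
import logging

# Raise the threshold for the noisy logger; DEBUG records are then dropped
# while WARNING and above still get through.
logging.getLogger("ThreadMonitor").setLevel(logging.WARNING)

monitor_logger = logging.getLogger("ThreadMonitor")
debug_enabled = monitor_logger.isEnabledFor(logging.DEBUG)
```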

Latest Reply
bsr
New Contributor II
  • 3 kudos

Thanks for the quick response!

1 More Replies
kALYAN5
by New Contributor
  • 150 Views
  • 4 replies
  • 3 kudos

Service Principal

Can two service principals have the same name but unique IDs?

Latest Reply
emma_s
Databricks Employee
  • 3 kudos

Hi @kALYAN5, here is an explanation of why service principals can share a name while their IDs stay unique. Names are for human readability: organizations use human-friendly names like "automation-batch-job" or "databricks-ci-cd" to make it easy for admins to re...

3 More Replies
Ligaya
by New Contributor II
  • 57349 Views
  • 7 replies
  • 2 kudos

ValueError: not enough values to unpack (expected 2, got 1)

Code: Writer.jdbc_writer("Economy", economy, conf=CONF.MSSQL.to_dict(), modified_by=JOB_ID['Economy']) The problem arises when I try to run the code in the specified Databricks notebook; an error of "ValueError: not enough values to unpack (expected 2, ...

Latest Reply
mukul1409
New Contributor
  • 2 kudos

The error happens because the function expects the table name to include both schema and table separated by a dot. Inside the function it splits the table name using a dot and tries to assign two values. When you pass only Economy, the split returns ...
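The failure mode described above can be reproduced in plain Python; parse_table_name below is a hypothetical stand-in for the split inside jdbc_writer:

```python
def parse_table_name(full_name: str) -> tuple[str, str]:
    """Split 'schema.table' into its two parts."""
    schema, table = full_name.split(".")  # expects exactly one dot
    return schema, table

# Works when both parts are present:
ok = parse_table_name("dbo.Economy")

# Passing only "Economy" yields a one-element list, so unpacking fails:
try:
    parse_table_name("Economy")
    failed = False
except ValueError:  # not enough values to unpack (expected 2, got 1)
    failed = True
```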

6 More Replies
ripa1
by New Contributor
  • 178 Views
  • 4 replies
  • 4 kudos

Has anyone got this working? Federating Snowflake-managed Iceberg tables into Azure Databricks

I'm federating Snowflake-managed Iceberg tables into Azure Databricks Unity Catalog to query the same data from both platforms without copying it. I am getting a weird error message when querying the table from Databricks, and I have tried to put it all nicely in...

Data Engineering
azure
Iceberg
snowflake
unity-catalog
Latest Reply
ripa1
New Contributor
  • 4 kudos

Thanks Hubert. I did check the Iceberg metadata location and Databricks can list the files, but the issue is that Snowflake’s Iceberg metadata.json contains paths like abfss://…@<acct>.blob.core.windows.net/..., and on UC Serverless Databricks then t...

3 More Replies
Askenm
by New Contributor
  • 1138 Views
  • 6 replies
  • 4 kudos

Docker tab missing in create compute

I am running Databricks Premium and looking to create a compute resource running conda. It seems that the best way to do this is to boot the compute from a Docker image. However, in ```create_compute > advanced``` I cannot see the Docker option, nor ca...

Data Engineering
conda
Docker
Latest Reply
mukul1409
New Contributor
  • 4 kudos

Hi @Askenm, in Databricks Premium the Docker option for custom images is not available on all compute types and is not controlled by user-level permissions. Custom Docker images are only supported on Databricks clusters that use the legacy VM-based c...

5 More Replies
