Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ushnish_18
by New Contributor
  • 527 Views
  • 1 reply
  • 0 kudos

Facing error while submitting lab assessment of Delivery Specialization: UC Upgrade

Hi, I attended the lab assessment of Delivery Specialization: UC Upgrade, and while submitting my answers the grade is not getting updated successfully, due to which the status is shown as failed, although all the checkpoints got validated succes...

Latest Reply
Arpita_S
Databricks Employee
  • 0 kudos

Hi @ushnish_18, Can you share the user ID or email you used to access the lab so the team can take a look? Alternatively, you can send all the details, including screenshots, here for the team to investigate in detail and guide you appropriately. Tha...

HarryRichard08
by New Contributor II
  • 1005 Views
  • 1 reply
  • 0 kudos

Access to S3 in AWS

My problem: My Databricks workspace (Serverless Compute) is in AWS Account A, but my S3 bucket is in AWS Account B. It works in Shared Compute because I am manually setting access_key and secret_key. It does NOT work in Serverless Compute.

Latest Reply
SP_6721
Honored Contributor
  • 0 kudos

Hi @HarryRichard08 ,I came across a similar thread of yours. Were you able to find a resolution for this?

minhhung0507
by Valued Contributor
  • 3020 Views
  • 10 replies
  • 5 kudos

Performance issue with Spark SQL when working with data from Unity Catalog

  We're encountering a performance issue with Spark SQL when working with data from Unity Catalog. Specifically, when I use Spark to read data from a Unity Catalog partition created by DLT and then create a view, the executor retrieval is very slow. ...

Latest Reply
-werners-
Esteemed Contributor III
  • 5 kudos

You can read the physical Parquet files with spark.read.parquet(). The trick is to know which files are the current ones.

9 More Replies
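The reply above points at the real difficulty: a Delta table's directory holds both live and tombstoned Parquet files, and only the transaction log knows which are current. As a rough illustration only (in practice the deltalake package or Spark handles checkpoints, deletion vectors, and so on), a minimal replay of the JSON commit log looks like this; the table layout is assumed to be a plain local Delta directory:

```python
import json
from pathlib import Path

def current_parquet_files(table_path: str) -> set[str]:
    """Replay the Delta transaction log to find the table's live data files.

    Simplified sketch: it ignores checkpoint files and other log actions,
    so it is only meant to show why "just read the parquet files" needs
    the log to filter out removed files.
    """
    log_dir = Path(table_path) / "_delta_log"
    live: set[str] = set()
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])      # file became part of the table
            elif "remove" in action:
                live.discard(action["remove"]["path"])  # file was tombstoned
    return live
```

Passing only the surviving paths to spark.read.parquet() then reads the table's current state, though for anything real, DESCRIBE DETAIL or the deltalake library is the safer route.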
NikosLoutas
by New Contributor III
  • 1179 Views
  • 3 replies
  • 2 kudos

Spark DataFrame Checkpoint

Good morning, I am having difficulty when trying to checkpoint a PySpark DataFrame. The DataFrame is not involved in a DLT pipeline, so I am using the df.checkpoint(eager=True) command to truncate the logical plan of df and materialize it as files wi...

Latest Reply
saurabh18cs
Honored Contributor II
  • 2 kudos

Your volume approach is also a good idea.

2 More Replies
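For context, the non-DLT checkpointing the poster describes needs a checkpoint directory set before df.checkpoint(eager=True) is called. A minimal sketch, assuming a running Spark session on Databricks; the paths and table name are hypothetical, and a UC Volume path works too, per the volume approach mentioned in the reply:

```python
# Sketch only: assumes a Databricks cluster with an active `spark` session.
spark.sparkContext.setCheckpointDir("dbfs:/tmp/checkpoints")  # hypothetical path
df = spark.read.table("main.default.events")                  # hypothetical table
df = df.checkpoint(eager=True)  # materializes df and truncates its logical plan
```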
libpekin
by New Contributor
  • 3508 Views
  • 1 reply
  • 0 kudos

Generating Multiple Excels for SQL Query

Hello, I am getting "OSError: [Errno 95] Operation not supported" for the code below. I have openpyxl 3.1.5 installed on the cluster and have imported all required modules. I am sure this is something small, but I can't put my finger on why this is...

Latest Reply
Renu_
Valued Contributor II
  • 0 kudos

Hi @libpekin, is filepath pointing to a DBFS path? Because writing directly to DBFS paths using to_excel() is not supported due to DBFS limitations with certain file operations, especially random writes. As a workaround, first save the Excel file to a...

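The workaround in the reply above follows a general pattern: build the file on the driver's local disk (where random writes are fine), then copy it once to the shared location. A hedged sketch of that pattern in plain Python; on Databricks the copy step would be dbutils.fs.cp(f"file:{local}", dest) and build_file would call df.to_excel, but shutil stands in here so the shape is runnable anywhere:

```python
import shutil
import tempfile
from pathlib import Path

def write_then_copy(build_file, dest_dir: str, name: str) -> Path:
    """Build a file in local scratch space, then copy it to its destination.

    build_file: callable that takes a local Path and writes the file there,
    e.g. lambda p: df.to_excel(str(p)) on a Databricks cluster.
    """
    dest = Path(dest_dir) / name
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / name
        build_file(local)         # random writes happen on local disk
        shutil.copy(local, dest)  # one sequential copy to the final location
    return dest
```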
jeremy98
by Honored Contributor
  • 3221 Views
  • 0 replies
  • 0 kudos

Hydra configuration and job parameters of DABs

Hello Community, I'm trying to create a job pipeline in Databricks that runs a spark_python_task, which executes a Python script configured with Hydra. The script's configuration file defines parameters, such as id. How can I pass this parameter at the...

nielsehlers
by New Contributor
  • 760 Views
  • 1 reply
  • 1 kudos

Visualizations ignore Timezones

Databricks inline visualizations (bar charts, line charts, etc.) ignore time zones and always display UTC time on the x-axis.

Latest Reply
Advika
Databricks Employee
  • 1 kudos

Hello @nielsehlers! As a workaround, you can use the from_utc_timestamp() function to convert UTC timestamps to your desired time zone before visualising: SELECT from_utc_timestamp(column_name, 'Asia/Kolkata') AS alias_name FROM table_name;

user1234567899
by New Contributor II
  • 1367 Views
  • 2 replies
  • 0 kudos

Lineage not visible for table created in DLT

Hello, I've been struggling for two days with missing lineage information for the silver layer table, and I'm unsure what I'm doing incorrectly. I have a DLT pipeline with DPM public preview enabled. Data is ingested from an S3 bucket into the bronze t...

Latest Reply
obitech01
New Contributor II
  • 0 kudos

I'm having the same exact issue as the poster above. I'm using a newly created DLT pipeline as of April 1st 2025 (Unity Catalog enabled, Serverless). I get lineage for all tables and views involved in my pipeline except for the streaming table. The st...

1 More Replies
aranjan99
by Contributor
  • 1240 Views
  • 4 replies
  • 0 kudos

Disabled billing system schema accidentally and now I cannot enable it again

We accidentally disabled the billing system table schema and are now getting this error on re-enabling it: "Error: billing system schema can only be enabled by Databricks." How can we resolve this? We have not purchased any of the support contracts and hence...

Latest Reply
aranjan99
Contributor
  • 0 kudos

00640919

3 More Replies
prashantjjain33
by New Contributor II
  • 1181 Views
  • 3 replies
  • 0 kudos

databricks_error_message:REQUEST_LIMIT_EXCEEDED:

A Databricks job failed unexpectedly with the below error. There were only 5 jobs running at that time and no major operations. What could be the root cause, and how can we avoid this in future? Cluster '0331-xxxxxx-zs8i8pcn' was terminated. Reason: INIT_SC...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Are you generating any Databricks tokens in this process? If yes, there is a limit of 600.

2 More Replies
kleanthis
by New Contributor III
  • 782 Views
  • 1 reply
  • 0 kudos

Resolved! dbutils.fs.cp() fails in Runtime 16.3 Beta when using abfss://

Hello, I am not sure if this is the right place to post this; however, I am reporting what seems to me a breaking issue with the 16.3 Beta Runtime when performing dbutils.fs.cp() operations between abfss:// paths. This is not a permissions issue — let's get that o...

Latest Reply
kleanthis
New Contributor III
  • 0 kudos

To close off my loop: the ClassCastException has been resolved in 16.3.

JangaReddy
by New Contributor
  • 750 Views
  • 1 reply
  • 0 kudos

Serverless Access

Hi Team, can you help us with how to restrict serverless access to only specific users/groups (through workspace admin/account admin)? Regards, Phani

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

You cannot at the moment. I suppose that is something that is coming soon (there have been many complaints about that). It is only possible to toggle serverless compute on/off.

lezwon
by Contributor
  • 1063 Views
  • 2 replies
  • 1 kudos

Resolved! At least 1 "file_arrival" blocks are required.

Hi folks, I'm trying to set up a Databricks Asset Bundle for a job to load some product data into Databricks. This job was created in Databricks and loads the data from a location hardcoded into the notebook (for now). It is supposed to run every 3 h...

Latest Reply
Tharani
New Contributor III
  • 1 kudos

I think since it is a scheduled job, you have to explicitly specify a cron-based schedule instead of using file_arrival in the trigger section of the YAML file.

1 More Replies
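To make the reply concrete: an every-3-hours job in a bundle is expressed with a Quartz cron schedule block rather than a file_arrival trigger. A sketch, with the job name a placeholder:

```yaml
# Hypothetical bundle fragment; the job name is a placeholder.
resources:
  jobs:
    product_data_load:
      schedule:
        quartz_cron_expression: "0 0 */3 * * ?"  # at minute 0 of every 3rd hour
        timezone_id: UTC
        pause_status: UNPAUSED
```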
Vittorio
by New Contributor II
  • 943 Views
  • 1 reply
  • 1 kudos

Month-on-month growth with pivot

I need to create pivot tables with data on revenue/costs presented monthly, and I need to show the month-on-month growth. It seems like mission impossible with Dashboards on a SQL warehouse, despite being quite obviously a very typical task. Pivot tabl...

Latest Reply
Brahmareddy
Esteemed Contributor
  • 1 kudos

Hi Vittorio, how are you doing today? As per my understanding, you're absolutely right: creating a proper pivot table with dynamic month-over-month (MoM) growth in Databricks SQL Dashboards is surprisingly tricky for such a common use case. The built...

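The usual SQL building block for month-on-month growth is a LAG window over the monthly totals, which Databricks SQL supports. Illustrated below with Python's built-in sqlite3 purely so the query shape is runnable anywhere (it needs SQLite >= 3.25 for window functions); the table and figures are made up:

```python
import sqlite3

# Toy monthly revenue table; in Databricks SQL the same LAG window applies.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE revenue (month TEXT, amount REAL)")
con.executemany("INSERT INTO revenue VALUES (?, ?)",
                [("2024-01", 100.0), ("2024-02", 120.0), ("2024-03", 90.0)])

# MoM growth = 100 * (this month - previous month) / previous month.
rows = con.execute("""
    SELECT month,
           amount,
           ROUND(100.0 * (amount - LAG(amount) OVER (ORDER BY month))
                 / LAG(amount) OVER (ORDER BY month), 1) AS mom_growth_pct
    FROM revenue
    ORDER BY month
""").fetchall()
# First month has no prior month, so its growth is NULL.
```

The same SELECT, pointed at the real revenue table, can feed a dashboard visualization or a subsequent PIVOT.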
standup1
by Contributor
  • 3546 Views
  • 3 replies
  • 1 kudos

Recover a deleted DLT pipeline

Hello, does anyone know how to recover a deleted DLT pipeline, or at least recover deleted tables that were managed by the DLT pipeline? We have a pipeline that stopped working and was throwing all kinds of errors, so we decided to create a new one and de...

Latest Reply
Nishair05
New Contributor II
  • 1 kudos

Found out a way to recover the tables. But it seems like we need to recover the pipeline as well. Any idea on how to recover the pipeline?

2 More Replies
