Data Engineering

Forum Posts

ksamborn
by New Contributor
  • 83 Views
  • 1 reply
  • 0 kudos

withColumnRenamed error on Unity Catalog 14.3 LTS

Hi - We are migrating to Unity Catalog 14.3 LTS and have seen a change in behavior using withColumnRenamed. There is a COLUMN_ALREADY_EXISTS error on the join key, even though the column being renamed is a different column. The joined DataFrame do...

Data Engineering
Data Lineage
Unity Catalog
Latest Reply
Palash01
New Contributor III
  • 0 kudos

Hey @ksamborn, I can think of 2 solutions. Rename the column in df_2 before joining:

df_1_alias = df_1.alias("t1")
df_2_alias = df_2.alias("t2")
join_df = df_1_alias.join(df_2_alias, df_1_alias.key == df_2_alias.key)
rename_df = join_df.withColumnRenam...

Brad
by New Contributor
  • 73 Views
  • 2 replies
  • 0 kudos

Dash in Databricks notebook directly

Hi team, is there a way to embed Plotly Dash directly inside a Databricks notebook? Thanks

Latest Reply
Brad
New Contributor
  • 0 kudos

Thanks Dave. Is it possible to embed Dash in a Databricks notebook cell result directly? Visualizations in Databricks notebooks can be published to dashboards. We expect to use Dash inside the Databricks notebook and publish it so that inside the dashboard people ...

1 More Replies
isaac_gritz
by Valued Contributor II
  • 5179 Views
  • 3 replies
  • 4 kudos

Using Plotly Dash with Databricks

How to use Plotly Dash with Databricks: we recommend checking out this article for the latest on building Dash applications on top of the Databricks Lakehouse. Let us know in the comments if you use Plotly and if you're planning on adopting the latest i...

Latest Reply
dave-at-plotly
New Contributor
  • 4 kudos

Hey all. Just wanted to make sure everyone had some up-to-date intel regarding leveraging Plotly Dash with Databricks. Most Dash app integrations with Databricks today leverage the Databricks Python SQL Connector. More technical details are available v...

2 More Replies
ss6
by New Contributor
  • 56 Views
  • 0 replies
  • 0 kudos

Liquid Cluster - SHOW CREATE TABLE error

We've got a table that had liquid clustering turned on at first, but then we switched it off with the command below: ALTER TABLE table_name CLUSTER BY NONE; Now our downstream process that usually runs "SHOW CREATE TABLE" is hitting a snag. It's throwing this e...
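For reference, the on/off sequence described above looks roughly like this (table and column names are hypothetical). If SHOW CREATE TABLE errors after clustering is disabled, DESCRIBE TABLE EXTENDED is an alternative way to inspect the definition while investigating; whether it avoids the same error will depend on the runtime version.

```sql
-- Hypothetical table: enable liquid clustering at creation, then disable it
CREATE TABLE sales (id BIGINT, ts TIMESTAMP) CLUSTER BY (id);
ALTER TABLE sales CLUSTER BY NONE;

-- Alternative inspection path while SHOW CREATE TABLE is failing
DESCRIBE TABLE EXTENDED sales;
```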

Brad
by New Contributor
  • 98 Views
  • 2 replies
  • 0 kudos

Is there a way to control the cluster runtime version for DLT

Hi team, when I create a DLT job, is there a way to control the cluster runtime version somewhere? E.g. I want to use 14.3 LTS. I tried to add `"spark_version": "14.3.x-scala2.12"` inside the cluster default label but it did not work. Thanks
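Worth noting: DLT pipeline clusters run a runtime managed by the pipeline itself, so `spark_version` in the cluster settings is not honored; the supported knob is the pipeline `channel` setting (CURRENT vs PREVIEW). A sketch of the relevant pipeline settings, with a hypothetical pipeline name:

```json
{
  "name": "my_pipeline",
  "channel": "CURRENT",
  "clusters": [
    {
      "label": "default",
      "num_workers": 2
    }
  ]
}
```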

Latest Reply
Brad
New Contributor
  • 0 kudos

Thanks. I mean when running the DLT pipeline itself, not when running a cell from the notebook that sources the DLT pipeline.

1 More Replies
Heisenberg
by New Contributor
  • 187 Views
  • 0 replies
  • 0 kudos

Migrate a workspace from one AWS account to another AWS account

Hi everyone, we have a Databricks workspace in an AWS account that we need to migrate to a new AWS account. The workspace has a lot of managed tables, workflows, saved queries, and notebooks which need to be migrated, so we are looking for an efficient approach t...

Data Engineering
AWS
Databricks Migration
migration
queries
Workflows
Dhruv_Sinha
by New Contributor
  • 124 Views
  • 2 replies
  • 1 kudos

Parallelizing processing of multiple spark dataframes

Hi all, I am trying to create a collection RDD that contains a list of Spark DataFrames. I want to parallelize the cleaning process for each of these DataFrames. Later on, I am sending each of these DataFrames to another method. However, when I parall...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Dhruv_Sinha, the issue you're encountering, where the Spark context (sc) cannot be accessed from worker nodes, is a common challenge when working with Spark. Let's explore why this happens and discuss potential workarounds. Spark Context and W...
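The usual workaround when sc is unreachable from workers is to parallelize on the driver with threads, since each thread can still submit independent Spark jobs. A minimal sketch using plain Python functions as stand-ins for the per-DataFrame cleaning steps (the names here are illustrative, not from the thread):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a per-DataFrame cleaning step; with Spark this would call
# DataFrame transformations, which must be issued from the driver, never
# from inside sc.parallelize(...) on the workers
def clean(name):
    return name.strip().lower()

tables = ["  Orders ", "CUSTOMERS", " Items"]

# Driver-side threads run the cleaning steps concurrently
with ThreadPoolExecutor(max_workers=4) as pool:
    cleaned = list(pool.map(clean, tables))

print(cleaned)  # ['orders', 'customers', 'items']
```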

1 More Replies
exilon
by New Contributor
  • 61 Views
  • 0 replies
  • 0 kudos

DLT streaming with sliding window missing last windows interval

Hello, I have a DLT pipeline where I want to calculate the rolling average of a column for the last 24 hours, updated every hour. I'm using the code below to achieve this:

@dlt.table()
def gold():
    df = dlt.read_stream("silver_table")...

Data Engineering
dlt
spark
streaming
window
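One plausible cause of the "missing last interval" symptom: in a streaming aggregation, the most recent sliding window is only emitted once the watermark passes the window's end, so the newest interval always lags until enough late-arrival slack has elapsed. The 24-hour window sliding hourly can be expressed like this (column and table names are hypothetical):

```sql
-- Hypothetical silver source with an event timestamp and a numeric reading;
-- each row lands in 24 overlapping one-hour-offset windows
SELECT
  window(event_ts, '24 hours', '1 hour') AS w,
  avg(reading) AS avg_reading
FROM silver_table
GROUP BY window(event_ts, '24 hours', '1 hour');
```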
LukeH_DE
by New Contributor
  • 62 Views
  • 1 reply
  • 1 kudos

Variable referencing in EXECUTE IMMEDIATE

Hi all, as part of an ongoing exercise to refactor existing T-SQL code into Databricks, we've stumbled into an issue that we can't seem to overcome through Spark SQL. Currently we use dynamic SQL to loop through a number of tables, where we use parame...

Data Engineering
sql
Variables
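For reference, the general shape of parameterized dynamic SQL in Databricks combines session variables, EXECUTE IMMEDIATE, and the IDENTIFIER clause with a named parameter marker. A hedged sketch (the variable and table names are made up); the exact parameter-marker support may vary by runtime version:

```sql
-- Hold the target table name in a session variable
DECLARE VARIABLE tbl STRING DEFAULT 'my_schema.my_table';

-- Splice it into dynamic SQL via a named parameter and IDENTIFIER()
EXECUTE IMMEDIATE
  'SELECT COUNT(*) FROM IDENTIFIER(:tab)'
  USING tbl AS tab;
```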
Latest Reply
LukeH_DE
New Contributor
  • 1 kudos

@SergeRielau  - appreciate you've been posting recently on EXECUTE IMMEDIATE - really insightful. Wonder if you'd be able to assist with the above!

xssdfd
by New Contributor II
  • 90 Views
  • 2 replies
  • 0 kudos

File arrival trigger customization

Hi all. I have a workflow which I would like to trigger when a new file arrives. The problem is that in my storage account there are a few different types of files. Let's assume that I have a big CSV file and a small XLSX mapping file. I would like to trigger the job, ...

Latest Reply
feiyun0112
New Contributor III
  • 0 kudos

Use the option pathGlobFilter or fileNamePattern: https://docs.databricks.com/en/ingestion/auto-loader/options.html
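pathGlobFilter-style options use shell glob syntax to admit only matching file names. Python's stdlib fnmatch mirrors that matching, so it is a quick way to check which of your files a pattern such as "*.csv" would select (the file names below are hypothetical):

```python
from fnmatch import fnmatch

# Mixed file types landing in the same storage path
files = ["big_data.csv", "mapping.xlsx", "notes.txt"]

# Only names matching the glob pattern would trigger ingestion
csv_only = [f for f in files if fnmatch(f, "*.csv")]

print(csv_only)  # ['big_data.csv']
```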

1 More Replies
giladba
by New Contributor III
  • 958 Views
  • 5 replies
  • 0 kudos

access to event_log TVF

Hi, according to the documentation (https://docs.databricks.com/en/delta-live-tables/observability.html): "The event_log TVF can be called only by the pipeline owner and a view created over the event_log TVF can be queried only by the pipeline owner. The...
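For context, the basic owner-side call looks like the sketch below; the pipeline ID is a hypothetical placeholder, and the column list is indicative rather than exhaustive. Per the documentation quoted above, only the pipeline owner can call the TVF, so permission errors for other principals are expected behavior rather than a misconfiguration.

```sql
-- Query a pipeline's event log by its pipeline ID (placeholder UUID)
SELECT timestamp, event_type, message
FROM event_log('11111111-2222-3333-4444-555555555555')
ORDER BY timestamp DESC;
```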

Latest Reply
neha_ayodhya
New Contributor II
  • 0 kudos

Hi, I am also facing the same issue. Even after following all the steps mentioned, I am not able to query the event logs. Any help will be greatly appreciated.

4 More Replies
Bharathi7
by New Contributor
  • 90 Views
  • 2 replies
  • 0 kudos

Python UDF fails with UNAVAILABLE: Channel shutdownNow invoked

I'm using a Python UDF to apply OCR to each row of a DataFrame which contains the URL to a PDF document. This is how I define my UDF:

def extract_text(url: str):
    ocr = MyOcr(url)
    extracted_text = ocr.get_text()
    return json.dumps(extracte...

Latest Reply
daniel_sahal
Honored Contributor III
  • 0 kudos

@Bharathi7 It's really hard to determine what's going on without knowing what the MyOcr function actually does. Maybe there's some kind of timeout on the service side? Too many parallel connections?
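One defensive pattern worth trying while debugging: catch per-row failures inside the UDF body and return an error marker instead of raising, so a single bad URL or service timeout doesn't tear down the whole UDF channel. A sketch with StubOcr standing in for the (unknown) MyOcr class:

```python
import json

class StubOcr:
    """Hypothetical stand-in for MyOcr so the wrapper can be exercised locally."""
    def __init__(self, url):
        self.url = url

    def get_text(self):
        if not self.url.startswith("http"):
            raise ValueError("bad url")
        return {"text": "hello"}

def extract_text(url: str) -> str:
    # Returning an error payload instead of raising keeps one bad row
    # from failing the whole batch
    try:
        ocr = StubOcr(url)
        return json.dumps(ocr.get_text())
    except Exception as e:
        return json.dumps({"error": str(e)})

print(extract_text("http://example.com/doc.pdf"))  # {"text": "hello"}
print(extract_text("not-a-url"))                   # {"error": "bad url"}
```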

1 More Replies
alxsbn
by New Contributor III
  • 34 Views
  • 0 replies
  • 0 kudos

Compute pool and AWS instance profiles

Hi everyone, we're looking at using the compute pool feature. Right now we mostly rely on all-purpose and job compute. On both of these we use instance profiles to let the clusters access our S3 buckets and more. We don't see anything related to insta...

JakeerDE
by New Contributor
  • 205 Views
  • 4 replies
  • 0 kudos

Resolved! Databricks SQL - Deduplication in DLT APPLY CHANGES INTO

Hi @Kaniz, we have a Kafka source appending data into a bronze table and a subsequent DLT APPLY CHANGES INTO to do the SCD handling. Finally, we have materialized views to create dims/facts. We are facing issues when we perform deduplication inside ...
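For reference, APPLY CHANGES INTO does its own per-key ordering via SEQUENCE BY, which is usually where deduplication belongs rather than in a separate upstream step. A hedged sketch of the documented shape (table, key, and sequence column names are made up):

```sql
-- Target streaming table for the SCD output
CREATE OR REFRESH STREAMING TABLE silver_customers;

-- SEQUENCE BY orders change events per key, so the latest version wins
APPLY CHANGES INTO live.silver_customers
FROM stream(live.bronze_customers)
KEYS (customer_id)
SEQUENCE BY event_ts
STORED AS SCD TYPE 2;
```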

Latest Reply
DavidHoyt
New Contributor
  • 0 kudos

The strategic inclusion of keywords, such as "improving quality and safety," ensures that individuals actively searching for insights in this domain can easily discover and benefit from this essay https://www.nursingpaper.com/examples/improving-quali...

3 More Replies