Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Michael_Appiah
by Contributor
  • 23488 Views
  • 1 replies
  • 0 kudos

Hashing Functions in PySpark

Hashes are commonly used in SCD2 merges to determine whether data has changed, by comparing the hashes of the new rows in the source with the hashes of the existing rows in the target table. PySpark offers several hashing functions, such as: MD5...
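A minimal sketch of the change-detection idea described in this post, written in plain Python with `hashlib` so it runs without a Spark session. The helper names (`row_hash`, `changed_keys`), the separator, the null token, and the sample rows are all assumptions made for illustration; in PySpark a comparable digest could be built with `sha2(concat_ws(...), 256)` from `pyspark.sql.functions`.

```python
import hashlib

def row_hash(row, columns, sep="||", null_token="<NULL>"):
    """Return a SHA-256 hex digest over the selected columns of one row."""
    # Replace None with an explicit token so a null and an empty string
    # do not produce the same digest.
    parts = [null_token if row.get(c) is None else str(row[c]) for c in columns]
    return hashlib.sha256(sep.join(parts).encode("utf-8")).hexdigest()

def changed_keys(source_rows, target_rows, key, columns):
    """Return keys present in both inputs whose tracked columns differ."""
    target_hashes = {r[key]: row_hash(r, columns) for r in target_rows}
    return [
        r[key]
        for r in source_rows
        if r[key] in target_hashes and row_hash(r, columns) != target_hashes[r[key]]
    ]

# Hypothetical sample data: the city of id 2 differs between source and target.
source = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Linus", "city": "Portland"},
]
target = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Linus", "city": "Helsinki"},
]
print(changed_keys(source, target, "id", ["name", "city"]))  # [2]
```

The separator and null token matter: without them, the column values ("ab", "c") and ("a", "bc") would concatenate to the same string and hash identically.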

Latest Reply
Michael_Appiah
Contributor
  • 0 kudos

Hi @Retired_mod, thank you for your comprehensive answer. What is your opinion on the trade-off between using a hash like xxHASH64, which returns a LongType column and thus would offer good performance when there is a need to join on the hash column v...
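One way to reason about the width side of this trade-off is the birthday-bound approximation for hash collisions. The sketch below assumes an ideal, uniformly distributed hash; the function name and the row count are hypothetical, and the 128-bit case simply stands for a wider digest such as MD5.

```python
import math

def collision_probability(n_rows, hash_bits):
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / (2 * 2**bits))."""
    exponent = n_rows * (n_rows - 1) / (2.0 * 2.0 ** hash_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)

# Hypothetical table size of one billion rows.
for bits in (32, 64, 128):
    print(bits, collision_probability(1_000_000_000, bits))
```

Under these assumptions, a 64-bit hash over a billion rows has a collision chance of a few percent, while a 128-bit digest makes a collision negligible at that scale.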

kaleighspitz
by New Contributor
  • 1449 Views
  • 0 replies
  • 0 kudos

Delta Live Tables saving as corrupt files

Hello, I am using Delta Live Tables to store data and then trying to save them to ADLS. I've specified the storage location of the Delta Live Tables in my Delta Live Tables pipeline. However, when I check the files that are saved in ADLS, they are cor...

Data Engineering
Delta Live Tables
jfarmer
by New Contributor II
  • 6428 Views
  • 3 replies
  • 1 kudos

PermissionError / Operation not Permitted with Files-in-Repos

I've been running a notebook using files-in-repo. Previously this has worked fine. I'm unsure what's changed (I was testing integration with DCS on older runtimes, but don't think I made any persistent changes)--but now it's throwing an error (always...

Latest Reply
_carleto_
New Contributor II
  • 1 kudos

Hi @jfarmer, did you solve this issue? I'm having exactly the same challenge. Thanks!

2 More Replies
Paval
by New Contributor
  • 1425 Views
  • 0 replies
  • 0 kudos

Failed to run the job on Databricks version LTS 9.x and 10.x (AWS)

Hi Team, when we tried to change the Databricks version from 7.3 to 9.x or 10.x, we got the error below. Caused by: java.lang.RuntimeException: MetaException(message:Unable to verify existence of default database: com.amazonaws.services.glue.model....

rp16
by New Contributor II
  • 2187 Views
  • 2 replies
  • 2 kudos

How can we create streaming tables as external delta tables ?

We would like to introduce DLT and streaming tables to our medallion architecture, but we are unable to create the streaming tables with the schemas concerned. STREAMING tables don't have an option to be stored with custom schemas. The requirement we have ...

Latest Reply
Faisal
Contributor
  • 2 kudos

If Unity Catalog is used, tables created under it are managed by default.

1 More Replies
nikhilkumawat
by New Contributor III
  • 4024 Views
  • 2 replies
  • 1 kudos

[INTERNAL_ERROR] Cannot generate code for expression: claimsconifer.default.decrypt_colA(

A column contains encrypted data at rest. I am trying to create a sql function which will decrypt the data if the user is a part of a particular group. Below is the function: %sql CREATE OR REPLACE FUNCTION test.default.decrypt_if_valid_user(col_a ST...

Latest Reply
nikhilkumawat
New Contributor III
  • 1 kudos

Hi @Retired_mod, after removing the "TABLE" keyword from the CREATE OR REPLACE statement, this function got registered as a builtin function. Just to verify, I displayed all the functions and I can see the function --> decrypt_if_valid_user: Now I am trying t...

1 More Replies
Oliver_Angelil
by Valued Contributor II
  • 3049 Views
  • 3 replies
  • 1 kudos

Resolved! Are data health check expectations available only on Delta Live tables?

I love the idea of "expectations" being available for Delta Live Tables: https://docs.databricks.com/delta-live-tables/expectations.html I'd like to know if they are also available for regular Delta tables. Thank you in advance!

Latest Reply
erigaud
Honored Contributor
  • 1 kudos

Hello @Oliver_Angelil, have you found a way to implement something resembling expectations for Delta tables outside of a DLT pipeline?

2 More Replies
invalidargument
by New Contributor III
  • 4227 Views
  • 1 replies
  • 1 kudos

How to display shap waterfall plot

Hi, I have managed to display a force plot for a single observation using the advice from this thread: Solved: How to display SHAP plots? - Databricks - 28315. But is there any way to display the newer "waterfall" plot? shap.plots.waterfall - SHAP latest doc...

Latest Reply
invalidargument
New Contributor III
  • 1 kudos

Thank you for the swift response. I made a minimal example and it does work as you said. However, when I try with my own model it does not work; the only output is <Figure size 576x468 with 3 Axes>. I tried to save the figure as a file and then I do get ...

rt-slowth
by Contributor
  • 1039 Views
  • 0 replies
  • 0 kudos

How to write test code in databricks

    from databricks.connect import DatabricksSession
    from data.dbx_conn_info import DbxConnInfo

    class SparkSessionManager:
        _instance = None
        _spark = None

        def __new__(cls):
            if cls._instance is None:
                cls._instance = s...

User16789201666
by Databricks Employee
  • 9078 Views
  • 3 replies
  • 4 kudos
Latest Reply
arun_pamulapati
Databricks Employee
  • 4 kudos

Use Lakehouse Monitoring: https://docs.databricks.com/en/lakehouse-monitoring/index.html Specifically: https://docs.databricks.com/en/lakehouse-monitoring/monitor-output.html#drift-metrics-table

2 More Replies
MarcinO
by New Contributor II
  • 5439 Views
  • 2 replies
  • 2 kudos

InputWidgetNotDefined exception when running a notebook as a job

I have a notebook that reads the value of a text input in a Scala command: var startTimeStr = dbutils.widgets.get("Run Date") What doesn't make any sense is that this notebook fails with an InputWidgetNotDefined error when scheduled as a job, but works j...

Latest Reply
berserkersap
Contributor
  • 2 kudos

Have you used dbutils.widgets.text() before dbutils.widgets.get()?
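The ordering suggested in this reply (define the widget, then read it) can be sketched in Python as follows. `FakeWidgets` is a hypothetical local stand-in so the example runs outside a notebook; it only imitates the `text` and `get` calls and is not the real `dbutils.widgets` implementation. `read_widget` and the default date are likewise assumptions.

```python
class FakeWidgets:
    """Local stand-in for dbutils.widgets (hypothetical, for this sketch only)."""

    def __init__(self):
        self._values = {}

    def text(self, name, defaultValue, label=None):
        # In this stand-in, defining a widget keeps any value already supplied.
        self._values.setdefault(name, defaultValue)

    def get(self, name):
        if name not in self._values:
            raise KeyError(f"InputWidgetNotDefined: {name}")
        return self._values[name]


def read_widget(widgets, name, default=""):
    """Define the text widget first, then read its value."""
    widgets.text(name, default, name)
    return widgets.get(name)


widgets = FakeWidgets()
print(read_widget(widgets, "Run Date", "2024-01-01"))  # 2024-01-01
```

In a notebook, `dbutils.widgets` would be passed in place of the stand-in.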

1 More Replies
samye760
by New Contributor II
  • 2128 Views
  • 0 replies
  • 1 kudos

Job Retry Wait Policy and Cluster Shutdown

Hi all, I have a Databricks Workflow job in which the final task makes an external API call. Sometimes this API will be overloaded and the call will fail. In the spirit of automation, I want this task to retry the call an hour later if it fails in the...
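For illustration, the retry-after-a-wait behavior described here could be handled inside the task code itself. This is a sketch under stated assumptions rather than the job-level retry policy the post asks about: `call_with_retry`, `flaky_api_call`, the attempt count, and the one-hour wait are all hypothetical.

```python
import time

def call_with_retry(func, max_attempts=3, wait_seconds=3600, sleep=time.sleep):
    """Call func(); on failure wait wait_seconds and retry, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == max_attempts:
                raise
            sleep(wait_seconds)

# Hypothetical API call that fails twice before succeeding.
attempts = {"count": 0}

def flaky_api_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("API overloaded")
    return "ok"

# Record the waits instead of actually sleeping, so the example finishes instantly.
waits = []
result = call_with_retry(flaky_api_call, sleep=waits.append)
print(result, waits)  # ok [3600, 3600]
```

Note that sleeping inside the task keeps the task, and therefore its compute, running for the whole wait, which is the cost to weigh against a retry configured at the job level.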

Data Engineering
clusters
jobs
retries
Workflows
js54123875
by New Contributor III
  • 11326 Views
  • 4 replies
  • 1 kudos

Connection to Azure SQL Server: ODBC Driver 18 for SQL Server

Task: Set up a connection to Azure SQL Server. A couple of things have changed... * We've started using Unity Catalog, so we need Unity Catalog-enabled clusters * Legacy init scripts have been deprecated, and this is how we had our pyodbc setup, etc. defined. Code...

Latest Reply
diego_poggioli
Contributor
  • 1 kudos

Hi @js54123875, did you manage to find a solution for this? I'm facing a similar problem. Thanks, Diego

3 More Replies
RobiTakToRobi
by New Contributor II
  • 1838 Views
  • 1 replies
  • 1 kudos

How to allow non-ASCII characters to be stored in the view definition?

I've tried to create a view with a simple conditional statement containing Polish characters. The view is created without errors, but a select on the view returns question marks in place of the non-ASCII characters. Why? How can I fix it? Below on screens ...

Latest Reply
andreas7891
New Contributor II
  • 1 kudos

Any solutions for this? We have the same problem with Greek characters.

YS1
by Contributor
  • 3270 Views
  • 2 replies
  • 2 kudos

Live dashboard

Hello, I have a streaming dataset (I used Delta Live Tables), and I want to create a live dashboard that shows changes instantly, without needing to query the table at set intervals or refresh it. What would be the best solution...

