
Spark structured streaming

nileshtiwaari
New Contributor

Hi,
Could someone please help me with this code?

The input parameter df is a Spark Structured Streaming DataFrame:
def apply_duplicacy_check(df, duplicate_check_columns):
    if len(duplicate_check_columns) == 0:
        return None, df

    valid_df = df.dropDuplicates(duplicate_check_columns)

    error_df = df.exceptAll(valid_df)

    return error_df, valid_df

I am getting this error:

Except on a streaming DataFrame/Dataset on the right is not supported;
Except All true
:- Project [page#54781.Name AS division_name#54786, page#54781.ShortName AS short_name#54787, page#54781.ExternalSystemCode AS external_system_code#54788, page#54781.AccountingCode AS division_number#54789, page#54781.ParentDivisionId AS parent_division_id#54790, page#54781.TimeZone AS timezone#54791, page#54781.DivisionType.Id AS division_type_id#54792, page#54781.DivisionType.Name AS division_type_name#54793, sourceExtractDatetime#54773 AS source_extract_datetime#54794, page#54781.Id AS division_id#54795]
: +- Project [Data#54772, sourceExtractDatetime#54773, page#54781]
: +- Generate explode(Data#54772.Page), true, [page#54781]
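
For context, the message points at the exceptAll call: valid_df is derived from the same stream as df, so the right-hand side of the except is itself a streaming DataFrame, which Spark rejects. One possible workaround, offered only as a rough sketch, is to run the same split inside foreachBatch, where each micro-batch arrives as a static DataFrame. The helper names (split_duplicates, make_batch_handler), the table names, the key column and the checkpoint path below are all hypothetical placeholders, and this approach only detects duplicates within a single micro-batch, not across batches.

```python
def split_duplicates(batch_df, duplicate_check_columns):
    """Split one static micro-batch into (error_df, valid_df)."""
    if len(duplicate_check_columns) == 0:
        return None, batch_df
    # Keep one row per key; which of the duplicate rows survives is not guaranteed.
    valid_df = batch_df.dropDuplicates(duplicate_check_columns)
    # Allowed here because batch_df is a static DataFrame, not a stream.
    error_df = batch_df.exceptAll(valid_df)
    return error_df, valid_df


def make_batch_handler(duplicate_check_columns, valid_table, error_table):
    """Build a foreachBatch callback; the table names are hypothetical."""
    def handle_batch(batch_df, batch_id):
        batch_df.persist()  # the batch is read more than once below
        try:
            error_df, valid_df = split_duplicates(batch_df, duplicate_check_columns)
            valid_df.write.mode("append").saveAsTable(valid_table)
            if error_df is not None:
                error_df.write.mode("append").saveAsTable(error_table)
        finally:
            batch_df.unpersist()
    return handle_batch


# Usage with the streaming DataFrame `df` from the question
# (table names, key column and checkpoint path are placeholders):
# query = (
#     df.writeStream
#     .foreachBatch(make_batch_handler(["division_id"], "valid_divisions", "duplicate_divisions"))
#     .option("checkpointLocation", "/tmp/checkpoints/divisions")
#     .start()
# )
```

If duplicates have to be caught across micro-batches as well, this sketch is not enough on its own; the rejected rows would need to be checked against what has already been written, or the stream would need stateful deduplication.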

