Data Engineering

Forum Posts

UmaMahesh1
by Honored Contributor III
  • 2849 Views
  • 2 replies
  • 15 kudos

Resolved! Pyspark dataframe column comparison

I have a string column whose values are elements concatenated with hyphens. Suppose 3 values from that column look like this:
Row 1 - A-B-C-D-E-F
Row 2 - A-B-G-C-D-E-F
Row 3 - A-B-G-D-E-F
I want to compare 2 consecutive rows and create a column ...

Latest Reply
NhatHoang
Valued Contributor II
  • 15 kudos

Hi, I think you can follow these steps:
1. Use a window function to create a new column by shifting; your df will then look like this:
id  value          lag
1   A-B-C-D-E-F    null
2   A-B-G-C-D-E-F  A-B-C-D-E-F
3   A-B-G-D-E-F    ...
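The shifting step in the reply can be sketched in plain Python as a stand-in for the Spark version. In PySpark this would typically be `lag` over a window ordered by `id`, which is an assumption here because the full reply is truncated. The helper name `add_lag` is hypothetical, and the sketch stops at the shift because the column the question wants to derive is not visible in the snippet.

```python
# Plain-Python stand-in for the "shift" step described in the reply.
# In Spark this would typically be pyspark.sql.functions.lag over a window
# ordered by id (an assumption; the original reply is truncated).
rows = [
    (1, "A-B-C-D-E-F"),
    (2, "A-B-G-C-D-E-F"),
    (3, "A-B-G-D-E-F"),
]


def add_lag(rows):
    """Return (id, value, lag) tuples, where lag is the previous row's value."""
    ordered = sorted(rows, key=lambda r: r[0])
    # The first row has no predecessor, so its lag is None (null in Spark).
    previous = [None] + [value for _, value in ordered[:-1]]
    return [(rid, value, prev) for (rid, value), prev in zip(ordered, previous)]


for row in add_lag(rows):
    print(row)
```

Once each row carries its predecessor's value, the two strings can be split on the hyphen and compared element by element.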

1 More Replies
irfanaziz
by Contributor II
  • 1039 Views
  • 3 replies
  • 3 kudos

How to make a string column with numeric and alphabet values use as partition?

So I have two partition columns defined for this Delta table. One is year ('GJHAR'), which contains year values, and the other is a string column ('BUKS') with around 124 unique values. However, there is one problem with the second partition column ('BUKS'): the values ...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @nafri A, we haven't heard from you since the last response from @Werner Stinckens, and I was checking back to see if his suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful to ot...

2 More Replies
Gopal_Sir
by New Contributor III
  • 19142 Views
  • 5 replies
  • 7 kudos

Resolved! How to convert a string column to an array of structs?

I have a nested struct where one of the fields is a string. It looks something like this: string = "[{\"to_loc\":\"6183\",\"to_loc_type\":\"S\",\"qty_allocated\":\"18\"},{\"to_loc\":\"6137\",\"to_loc_type\":\"S\",\"qty_allocated\":\"9\"},{\"to_lo...
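To make the target shape concrete, here is a minimal plain-Python sketch that parses the two complete elements visible in the snippet into a list of records. It is not the thread's accepted answer. In Spark, the usual route would be `from_json` with an array-of-struct schema, which is an assumption here since the replies are not shown.

```python
import json

# Only the two complete elements visible in the truncated snippet are used.
raw = (
    '[{"to_loc":"6183","to_loc_type":"S","qty_allocated":"18"},'
    '{"to_loc":"6137","to_loc_type":"S","qty_allocated":"9"}]'
)

# The string is a JSON array of objects, i.e. an "array of struct" once parsed.
records = json.loads(raw)

for rec in records:
    # Every field arrives as a string; cast explicitly where a number is needed.
    print(rec["to_loc"], rec["to_loc_type"], int(rec["qty_allocated"]))
```

All three fields are string-typed in the source data, so a Spark schema for this column would presumably declare them as strings as well.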

Latest Reply
-werners-
Esteemed Contributor III
  • 7 kudos

Can you mark the question as answered so others can find the solution?

4 More Replies
shan_chandra
by Honored Contributor III
  • 11546 Views
  • 1 replies
  • 3 kudos

Resolved! dataframe - cast string to decimal when encountering zeros returns 0E-16

The user is trying to cast a string to a decimal when encountering zeros. The cast function displays '0' as '0E-16'. Could you please let us know your thoughts on whether 0s can be displayed as 0s?
from pyspark.sql import functions as F
df = spark.s...

Latest Reply
shan_chandra
Honored Contributor III
  • 3 kudos

If the scale of the decimal type is greater than 6, scientific notation kicks in, hence the 0E-16.
This behavior is described in the existing OSS Spark issue: https://issues.apache.org/jira/browse/SPARK-25177
Kindly cast the column to a decimal type les...
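The scale threshold mentioned in this reply can be reproduced outside Spark. The sketch below uses Python's standard `decimal` module, which, as far as I know, follows the same to-string convention as the Java `BigDecimal` that Spark relies on, so treat it as an analogy and not as Spark's own code path. The helper name `zero_with_scale` is hypothetical.

```python
from decimal import Decimal


def zero_with_scale(scale):
    """Build a decimal zero with the given number of fractional digits."""
    # Tuple form: (sign, digits, exponent); an exponent of -scale sets the scale.
    return Decimal((0, (0,), -scale))


# With scale 6 the zero still prints in plain notation.
print(str(zero_with_scale(6)))

# Past scale 6 the default string form switches to scientific notation.
print(str(zero_with_scale(7)))
print(str(zero_with_scale(16)))

# Fixed-point formatting keeps the value readable at any scale.
print(format(zero_with_scale(16), "f"))
```

The value is zero in every case; only its string rendering changes, which is why a smaller scale or explicit fixed-point formatting changes what is displayed.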
