12-23-2021 01:12 PM
Consider the following input:
ID PrevID
-------- ---------
33 NULL
272 33
317 272
318 317
I need to somehow get the following result:
Result
--------
/33
/33/272
/33/272/317
/33/272/317/318
I need to do this in SQL; if that's REALLY not possible, then anything else goes. Databricks does not support recursive CTEs, so I am out of ideas. I have an SQL Fiddle here that illustrates the recursive CTE approach that doesn't work on Databricks: http://sqlfiddle.com/#!18/101df/3/0
Any help is appreciated, thanks
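For reference, the logic the recursive CTE expresses can be sketched in plain Python. This is a minimal illustration only (not Spark or SQL); the rows mirror the sample table above: walk each row's PrevID chain up to the root, then emit the ancestors root-first.

```python
# Build "/33/272/317/318"-style materialized paths by walking PrevID links.
rows = [(33, None), (272, 33), (317, 272), (318, 317)]  # (ID, PrevID)
parent = {i: p for i, p in rows}  # child -> parent lookup

def path(i):
    # Collect the chain from this node up to the root...
    chain = []
    while i is not None:
        chain.append(i)
        i = parent[i]
    # ...then print it root-first with a leading slash per segment.
    return "".join(f"/{x}" for x in reversed(chain))

paths = [path(i) for i, _ in rows]
# paths == ["/33", "/33/272", "/33/272/317", "/33/272/317/318"]
```

Each row costs one walk up its ancestor chain, so this is O(depth) per row; a set-based equivalent is what the recursive CTE would compute level by level.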
01-08-2022 08:49 AM
Hi @Shazal Gomes ,
I found this solution using PySpark:
from pyspark.sql import functions as F
from pyspark.sql.window import Window
from pyspark.sql import Row

ids = ["33", "272", "317", "318"]
df_ids = spark.createDataFrame([Row(ID=i) for i in ids])

display(
    df_ids.withColumn(
        'PREVIOUS_ID',
        # Running set of IDs seen so far (window ordered numerically),
        # sorted ascending and joined with '/'.
        F.concat_ws(
            '/',
            F.array_sort(
                F.collect_set(F.col('ID'))
                .over(Window.orderBy(F.col('ID').cast('integer')))
                .cast('array<integer>')
            )
        )
    )
)
Let me know if it meets your needs.
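One caveat worth noting: the snippet above derives each path from the numeric sort order of ID alone, which works here only because the chain's IDs happen to be increasing. A loop-based, level-by-level expansion (the usual substitute for a missing recursive CTE) follows PrevID instead. This is a plain-Python sketch of that idea under the sample data above, not Databricks code; in Spark it would become a self-join repeated inside a Python loop until no rows change.

```python
# Emulate a recursive CTE: seed with roots (PrevID is NULL), then
# extend paths one level per pass until no new rows are produced.
rows = [(33, None), (272, 33), (317, 272), (318, 317)]  # (ID, PrevID)

paths = {i: f"/{i}" for i, p in rows if p is None}  # anchor member
frontier = set(paths)
while frontier:  # recursive member, one tree level per iteration
    nxt = {i: paths[p] + f"/{i}" for i, p in rows if p in frontier}
    paths.update(nxt)
    frontier = set(nxt)

# sorted(paths.values()) ==
#   ["/33", "/33/272", "/33/272/317", "/33/272/317/318"]
```

The loop runs once per tree level, which is exactly how the recursive CTE in the SQL Fiddle would iterate, so it handles arbitrary ID values and branching trees, not just a single increasing chain.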
12-24-2021 03:05 AM
Hi @shaz! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will get back to you soon. Thanks.
12-24-2021 05:03 AM
Hi @Kaniz Fatma, thanks for following up! Great to meet you too.
06-28-2023 07:29 AM
Hi Kaniz:
I am reaching out for assistance using SQL/T-SQL for a string function to use on a concatenated string (i.e., DATABRICKS concatenated functionality is used to create unique identification names in the data table). I have read-only access and need to validate huge data sets. Thank you in advance!
Kind regards - Karen
01-10-2022 11:18 AM
Hi, thanks for this code, it works as needed! I really would've liked to have it in SQL, but this will do for now. Thank you!
02-12-2022 01:28 AM