Creating API links from a base URL and a list of IDs in a saved DataFrame
03-28-2023 08:11 AM
I have more than 106,000 API endpoints to call. Instead of calling them one by one, I would like to build the links in a loop. I already have the list of location IDs, which I pulled from the API's locations list. Each ID goes at the end of the URL to get more detail on that location, because the locations list itself only returns limited information.
For example, I want to generate all 106,000 API links from the 'IdColumn' of my loaded list:
www.apilink/24563 ....
Please see my code below. Any help would be much appreciated.
from pyspark.sql.types import StructField, StructType, StringType, DataType, Row
Idlist = spark.read.load("loadedfile.paquet")
locid = Idlist.select('IdColumn')
LookUppy = str('https://apilink/locations/') + str(Idlist['IdColumn'])
print(LookUppy)
I get this as the output =
03-29-2023 01:41 AM
@Kay Connolly
Please check the below example:
data = [
    {"ID": 1},
    {"ID": 2},
    {"ID": 3},
    {"ID": 4},
]
df = spark.createDataFrame(data)
# collect() brings every row back to the driver; append each row's ID to the base URL
for row in df.collect():
    print("https://apilink/locations/" + str(row["ID"]))
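If you want to keep the links rather than print them, the same loop can gather them into a list. This is only a minimal sketch: `build_urls` and `BASE_URL` are hypothetical names, and it assumes each row can be indexed by column name (as PySpark Row objects can). Plain dictionaries stand in for the collected rows here so it runs without Spark.

```python
BASE_URL = "https://apilink/locations/"  # placeholder base URL from the question


def build_urls(rows, id_field="ID", base_url=BASE_URL):
    """Return one URL per row by appending the row's ID to the base URL."""
    return [base_url + str(row[id_field]) for row in rows]


# Stand-in for df.collect(); real Row objects support row["ID"] the same way
rows = [{"ID": 1}, {"ID": 2}, {"ID": 3}, {"ID": 4}]
urls = build_urls(rows)

for url in urls:
    print(url)
```

With the real data you would pass `df.collect()` as `rows` and `"IdColumn"` as `id_field`.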
03-31-2023 06:48 PM
Hi @Kay Connolly
Hope everything is going great.
Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.
Cheers!
04-01-2023 10:33 PM
@Kay Connolly:
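It looks like you are calling str() on a Column object and concatenating the result with the URL. That does not return the ID values. It returns the column's text representation (something like Column<'IdColumn'>), so you end up with a single string instead of one URL per row. Instead, build the URL as a column expression so that Spark evaluates it for every row. Here's a modified code snippet that should work: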
from pyspark.sql.functions import concat_ws, lit
Idlist = spark.read.load("loadedfile.paquet")
locid = Idlist.select('IdColumn')
# Cast IdColumn to string and append it to the base URL.
# The base URL must be wrapped in lit(); a bare string would be read as a column name.
lookup_urls = locid.withColumn('url', concat_ws('', lit('https://apilink/locations/'), locid.IdColumn.cast('string')))
# Show the resulting URLs without truncating them
lookup_urls.show(truncate=False)
This should create a new column called url that contains the complete API links for each location ID in your dataframe. You can then use this column to make the API calls in a loop.
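As a rough illustration of that last step, here is one way the calls could be made from the driver once the URLs are in a Python list. Treat it as a sketch under assumptions, not a definitive implementation: `fetch_json` and `fetch_all` are hypothetical helper names, the API is assumed to return JSON and to need no authentication, and the thread count is an arbitrary starting point. It uses only the Python standard library, and failures are recorded per URL so that one bad ID does not stop the other calls.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_json(url, timeout=10):
    """Fetch one URL and parse the body as JSON (assumes the API returns JSON)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_all(urls, fetch=fetch_json, max_workers=8):
    """Call fetch on every URL using a thread pool.

    Returns (results, errors): two dicts keyed by URL, holding the parsed
    payloads and the exceptions raised, respectively.
    """
    results, errors = {}, {}

    def safe_fetch(url):
        # Catch per-URL failures so one bad call does not abort the batch
        try:
            return url, fetch(url), None
        except Exception as exc:
            return url, None, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for url, payload, exc in pool.map(safe_fetch, urls):
            if exc is None:
                results[url] = payload
            else:
                errors[url] = exc
    return results, errors


# Example wiring, left commented out because it needs Spark and a real API:
# urls = [row["url"] for row in lookup_urls.select("url").toLocalIterator()]
# results, errors = fetch_all(urls)
```

With 106,000 calls it is worth checking whether the API has a rate limit before raising `max_workers`, and rerunning only the URLs that ended up in `errors`.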