Creating API links from a URL and a list of IDs in a saved df
03-28-2023 08:11 AM - last edited on 03-21-2025 06:29 AM by Advika
I have 106,000+ API endpoints I need to call, so instead of calling them one by one I would like to create a loop. I have the list of location IDs, which I retrieved from their API's locations list. Each ID sits at the end of the URL to get more info on that location, as the locations list itself is limited.
E.g. I want it to bring back all 106,000 API links, built from the 'IdColumn' of my loaded list:
www.apilink/24563 ....
Please see the code below. Any help would be much appreciated.
from pyspark.sql.types import StructField, StructType, StringType, DataType, Row
Idlist = spark.read.load("loadedfile.paquet")
locid = Idlist.select('IdColumn')
LookUppy = str('https://apilink/locations/') + str(Idlist['IdColumn'])
print(LookUppy)
I get this as the output =
03-29-2023 01:41 AM
03-31-2023 06:48 PM
Hi @Kay Connolly
Hope everything is going great.
Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best so that other members can find the solution more quickly? If not, please tell us so we can help you.
Cheers!
04-01-2023 10:33 PM
@Kay Connolly :
It looks like you are trying to concatenate a string with a column object. Calling str() on a Column returns a description of the column (something like Column<'IdColumn'>), not the values in it, so you end up with a single string instead of one URL per row. Build the URL as a new DataFrame column instead. Here's a modified code snippet that should work:
from pyspark.sql.functions import col, concat_ws, lit
Idlist = spark.read.load("loadedfile.paquet")
locid = Idlist.select('IdColumn')
# Convert the IdColumn to string and concatenate with the URL.
# lit() marks the base URL as a literal value; a bare string passed to
# concat_ws would be interpreted as a column name.
lookup_urls = locid.withColumn('url', concat_ws('', lit('https://apilink/locations/'), col('IdColumn').cast('string')))
# Show the resulting URLs without truncating long values
lookup_urls.show(truncate=False)
This should create a new column called url that contains the complete API link for each location ID in your dataframe. You can then use this column to make the API calls in a loop.
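For the loop itself, here is a minimal sketch in plain Python. It assumes the IDs have already been pulled out of the dataframe into a Python list, that the endpoint returns JSON, and that a small thread pool is acceptable to the API provider. The names build_url, fetch_json, and fetch_all are hypothetical helpers, and https://apilink/locations/ is just the placeholder URL from the question.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Placeholder base URL taken from the question; replace with the real endpoint.
BASE_URL = "https://apilink/locations/"


def build_url(location_id, base_url=BASE_URL):
    """Append one location ID to the base URL."""
    return f"{base_url}{location_id}"


def fetch_json(url, timeout=10):
    """GET one URL and parse the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_all(location_ids, fetch=fetch_json, max_workers=8):
    """Call the API once per ID and return (results, errors), both keyed by ID."""
    results, errors = {}, {}

    def work(location_id):
        # Catch failures per ID so one bad call does not abort the whole run.
        try:
            return location_id, fetch(build_url(location_id)), None
        except Exception as exc:
            return location_id, None, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for location_id, data, exc in pool.map(work, location_ids):
            if exc is None:
                results[location_id] = data
            else:
                errors[location_id] = exc
    return results, errors
```

With the dataframe from the snippet above, the ID list could be collected with something like ids = [row.IdColumn for row in locid.collect()] and then passed to fetch_all(ids). The errors dictionary makes it easy to retry only the IDs that failed. With 106,000 calls it is worth checking the API's rate limits before raising max_workers.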