11-15-2021 02:28 AM
At the moment I'm just trying to preprocess my data in a quick and efficient way,
so there isn't any deep learning code yet.
I couldn't find whether or how I can train my model with a DataFrame as input, since DataFrames don't accept NumPy arrays as a column data type. (There are examples from Databricks for image DataFrames.)
I read the npz files from an S3 bucket as binary, then use a UDF that calls np.load on the binary content and splits the data into rows.
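To make that step concrete, here is a minimal sketch of the decoding logic only, without the Spark part. The function name `npz_bytes_to_rows`, the array name `features`, and the (name, row index, values) row layout are placeholders for illustration, not my actual code, and it assumes every array in the archive is at least 1-D. A function like this could be wrapped in a UDF or a flatMap over the binary content.

```python
import io

import numpy as np


def npz_bytes_to_rows(content):
    """Decode the raw bytes of one .npz file into (array_name, row_index, values) tuples."""
    rows = []
    # np.load accepts a file-like object, so the bytes are wrapped in BytesIO.
    with np.load(io.BytesIO(content), allow_pickle=False) as archive:
        for name in archive.files:
            arr = archive[name]
            # Split along the first axis; each row becomes a plain Python list,
            # which is what ends up stored in the DataFrame column.
            for i, row in enumerate(arr):
                rows.append((name, i, row.tolist()))
    return rows


# Build a small in-memory .npz to stand in for a file read from S3.
buf = io.BytesIO()
np.savez(buf, features=np.arange(6, dtype=np.float32).reshape(3, 2))
example_rows = npz_bytes_to_rows(buf.getvalue())
```

Here `example_rows` holds one tuple per row of the `features` array.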
When I try to get the NumPy arrays back from the DataFrame (where they are now stored as lists), I need to use np.stack and pd.tolist, which takes some time.
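A rough sketch of that conversion step, using a plain Python list of lists as a stand-in for the collected column (`list_rows` is a made-up example, not my data). The second half shows one possible alternative, which is only an assumption on my side and not something I have confirmed on Databricks: keeping each row as raw bytes and rebuilding the array with np.frombuffer, which skips the per-element conversion from Python objects. It only works if the dtype and row length are known in advance.

```python
import numpy as np

# Hypothetical stand-in for a collected list-typed column: one Python list per row.
list_rows = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]

# The step described above: each list is converted element by element,
# then the per-row arrays are stacked into one 2-D array.
stacked = np.stack([np.asarray(r, dtype=np.float32) for r in list_rows])

# Possible alternative (an assumption, not a confirmed solution):
# store each row as raw bytes instead of a list...
byte_rows = [np.asarray(r, dtype=np.float32).tobytes() for r in list_rows]

# ...and rebuild the whole batch from one joined buffer.
# Note: np.frombuffer returns a read-only array; call .copy() if it must be writable.
from_bytes = np.frombuffer(b"".join(byte_rows), dtype=np.float32).reshape(len(byte_rows), -1)
```

Both `stacked` and `from_bytes` contain the same values; the difference is only in how much Python-level work is needed to rebuild them.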
I'm trying to get the data in less than 1 second, for quick training and minimal I/O waste.