08-28-2025 07:06 AM
Problem I am solving:
Read the raw IPL sports data CSV → bronze layer
Clean and aggregate → silver layer
Summarize team stats → gold layer
Prepare ML-ready features and train a Random Forest classifier to predict match winners
I am getting this error when I run the code: [PARSE_SYNTAX_ERROR] Syntax error at or near end of input. SQLSTATE: 42601
08-28-2025 12:19 PM - edited 08-28-2025 12:20 PM
Hi @ManojkMohan,
This section here:
df_ml
.select("features", "label")
.limit(10000) # Optional: limit for performance
.collect()
I don't see anywhere before this code block where "df_ml" is actually created. Has that DataFrame been created before this point? If so, are you certain that both columns, "features" and "label", are present in it?
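A quick way to check the second point is to compare the DataFrame's column names against the ones you select. This is only a rough sketch: `missing_columns` and `FakeFrame` are hypothetical names I made up for illustration, and the column list is an assumption, not your real schema. It relies only on the `.columns` attribute, which both Spark and pandas DataFrames expose.

```python
def missing_columns(df, required):
    """Return the required column names that are absent from df.

    Works with any object exposing `.columns` as an iterable of names,
    which both Spark and pandas DataFrames do.
    """
    present = set(df.columns)
    return [name for name in required if name not in present]


class FakeFrame:
    """Hypothetical stand-in for a DataFrame, used only for this demo."""

    def __init__(self, columns):
        self.columns = columns


# Assumed columns for illustration; not the schema from the original notebook.
df_ml = FakeFrame(["team1", "team2", "label"])

missing = missing_columns(df_ml, ["features", "label"])
if missing:
    print(f"df_ml is missing columns: {missing}")
```

In a notebook you would pass the real `df_ml` instead of the stand-in; if the name itself is undefined, Python raises a `NameError` before the check even runs, which answers the first question.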
All the best,
BS
08-29-2025 08:07 AM
@BS_THE_ANALYST any framework recommendations for choosing which ML model to use based on the data? Here is how I have solved the problem for now:
Data is ingested and converted to a usable format.
Building Block: Data Source โ Pandas DataFrame
Value Added:
Building Block: Feature Engineering
Value Added:
Building Block: Data Quality Checks
Value Added:
Building Block: Model Validation Setup
Value Added:
Building Block: ML Model
Value Added:
Building Block: Model Evaluation
Value Added:
Output: Prediction Comparison
Building Block: Results Visualization / Reporting
Value Added:
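To illustrate what the Model Evaluation and Prediction Comparison blocks above compute, here is a minimal sketch of a confusion matrix and accuracy score in plain Python. It is not the code from the actual notebook: the function names are hypothetical and the team labels and predictions are made-up example data.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Count (actual, predicted) pairs; rows are actual, columns are predicted."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = Counter(zip(y_true, y_pred))
    matrix = [
        [counts[(actual, predicted)] for predicted in labels]
        for actual in labels
    ]
    return labels, matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the actual label."""
    correct = sum(1 for a, p in zip(y_true, y_pred) if a == p)
    return correct / len(y_true)


# Made-up match winners, purely for illustration.
actual = ["CSK", "MI", "CSK", "RCB", "MI"]
predicted = ["CSK", "CSK", "CSK", "RCB", "MI"]

labels, matrix = confusion_matrix(actual, predicted)
print(labels)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(actual, predicted))
```

Off-diagonal cells show which teams the model confuses with each other, which a single accuracy number hides.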
08-29-2025 08:08 AM
08-29-2025 09:53 AM
@ManojkMohan thanks for sharing this. I'm looking at starting an ML project in the coming weeks, and I might have to bring it forward. Feeling motivated by that confusion matrix in your output.
Congrats on getting it working!
All the best,
BS