Re: Silver to Gold Layer | Running ML - Debug Hel...

ManojkMohan · ‎08-29-2025

@BS_THE_ANALYST any framework recommendations for which ML model to choose based on the data? Here is how I have solved the problem for now:

Data is ingested and converted to a usable format.

Building Block: Data Source → Pandas DataFrame
Value Added:

Building Block: Feature Engineering
Value Added:

Selects numeric attributes (TotalRunsScored, MatchesPlayed, MaxMarginWon) as predictors.
Assigns the match winner (team1) as the target variable.
Ensures the ML model knows what to learn from and what to predict.
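A minimal sketch of the feature/target selection described above, assuming the data is already in a Pandas DataFrame. The column names come from this post; the rows are made-up placeholder values, not real match data.

```python
import pandas as pd

# Hypothetical toy rows: column names follow the post, values are invented.
df = pd.DataFrame({
    "team1": ["A", "B", "A", "C"],
    "TotalRunsScored": [1500, 1320, 1610, 1405],
    "MatchesPlayed": [10, 9, 11, 10],
    "MaxMarginWon": [45, 30, 60, 25],
})

# Numeric attributes used as predictors.
feature_cols = ["TotalRunsScored", "MatchesPlayed", "MaxMarginWon"]

X = df[feature_cols]  # what the model learns from
y = df["team1"]       # what the model should predict (match winner)

print(X.dtypes)
print(y.value_counts())
```

Printing `y.value_counts()` early is useful because very rare classes affect both stratified splitting and how meaningful accuracy is later on.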

Building Block: Data Quality Checks
Value Added:

Building Block: Model Validation Setup
Value Added:

Splits data into training (to learn patterns) and testing (to evaluate performance).
Supports checking generalization: a large gap between training and test performance signals overfitting.
Stratification maintains class proportions where possible.
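A sketch of the split step, assuming scikit-learn's `train_test_split` is used (the post does not say which library does the split). The data, the 25% test size, and the `random_state` are illustrative assumptions. The "where possible" part is handled by stratifying only when every class has at least two rows, since scikit-learn raises an error otherwise.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data: column names follow the post, values are invented.
X = pd.DataFrame({
    "TotalRunsScored": [1500, 1320, 1610, 1405, 1550, 1290, 1480, 1370],
    "MatchesPlayed": [10, 9, 11, 10, 10, 9, 11, 10],
    "MaxMarginWon": [45, 30, 60, 25, 50, 20, 40, 35],
})
y = pd.Series(["A", "B", "A", "B", "A", "B", "A", "B"], name="team1")

# Stratify only if every class has at least 2 rows; otherwise fall back
# to a plain random split (assumed fallback, not confirmed by the post).
stratify = y if y.value_counts().min() >= 2 else None

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=stratify
)

print(y_train.value_counts())
print(y_test.value_counts())
```

With very small datasets a single split gives a noisy estimate, so the test score should be read with that in mind.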

Building Block: ML Model
Value Added:

Building Block: Model Evaluation
Value Added:

Measures accuracy (the fraction of winners predicted correctly).
Confusion matrix shows true vs predicted class counts, giving insight into model behavior.
Ensures model performance is quantified before deployment.
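To illustrate the two metrics above, here is a small sketch using scikit-learn's `accuracy_score` and `confusion_matrix`. The true and predicted labels are made up for the example and are not output from the actual model.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels: actual winners vs. model predictions.
y_true = ["A", "B", "A", "C", "B", "A"]
y_pred = ["A", "B", "C", "C", "A", "A"]
labels = ["A", "B", "C"]  # fixes the row/column order of the matrix

# Fraction of predictions that match the true winner.
acc = accuracy_score(y_true, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=labels)

print(f"accuracy = {acc:.3f}")
print(cm)
```

Off-diagonal cells show which teams get confused with each other, which accuracy alone hides.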

Output: Prediction Comparison

Building Block: Results Visualization / Reporting
Value Added: