MODEL SELECTION & TRAINING

Models & Hyperparameter Tuning

Multiple Classifier Approach

Benchmarking several classification algorithms to identify the best performer for phishing detection

# Model candidates with hyperparameter grids
from sklearn.ensemble import (
    RandomForestClassifier,
    GradientBoostingClassifier,
    AdaBoostClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

models = {
    "Random Forest": RandomForestClassifier(verbose=1),
    "Decision Tree": DecisionTreeClassifier(),
    "Gradient Boosting": GradientBoostingClassifier(verbose=1),
    "Logistic Regression": LogisticRegression(verbose=1),
    "AdaBoost": AdaBoostClassifier(),
}

params = {
    "Decision Tree": {
        "criterion": ["gini", "entropy", "log_loss"],
    },
    "Random Forest": {
        "n_estimators": [8, 16, 32, 128, 256],
    },
    "Gradient Boosting": {
        "learning_rate": [0.1, 0.01, 0.05, 0.001],
        "subsample": [0.6, 0.7, 0.75, 0.85, 0.9],
        "n_estimators": [8, 16, 32, 64, 128, 256],
    },
    "AdaBoost": {
        "learning_rate": [0.1, 0.01, 0.001],
        "n_estimators": [8, 16, 32, 64, 128, 256],
    },
}
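Each candidate is then scored with a grid search over its parameter grid. A minimal sketch of such an evaluation loop is shown below; evaluate_models and the model_report dictionary it returns are illustrative names for this project's helper, not scikit-learn APIs, and accuracy is assumed as the test score.

from sklearn.model_selection import GridSearchCV

def evaluate_models(X_train, y_train, X_test, y_test, models, params):
    """Grid-search each candidate and return {model name: test score}."""
    model_report = {}
    for name, model in models.items():
        # Tune the candidate on its grid (models without a grid are fit as-is)
        grid = GridSearchCV(model, params.get(name, {}), cv=3, n_jobs=-1)
        grid.fit(X_train, y_train)

        # Refit on the full training set with the best parameters found
        model.set_params(**grid.best_params_)
        model.fit(X_train, y_train)

        model_report[name] = model.score(X_test, y_test)
    return model_report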

Best Model Selection

Automatic selection of the best-performing model based on test metrics

# Find the best model from the evaluation results
# model_report maps model name -> test score
best_model_score = max(model_report.values())

# Get the name of the best model from the report
best_model_name = list(model_report.keys())[
    list(model_report.values()).index(best_model_score)
]
best_model = models[best_model_name]

# Track the experiment with MLflow
self.track_mlflow(best_model, classification_metric)

Performance & Experiment Tracking

BEST MODEL

Random Forest

F1 Score: 0.96
Precision: 0.95
Recall: 0.97

MLflow Tracking

Experiment metrics and artifacts are logged with MLflow and DagsHub

# MLflow experiment tracking
import mlflow
import mlflow.sklearn

with mlflow.start_run():
    f1_score = classification_metric.f1_score
    precision_score = classification_metric.precision_score
    recall_score = classification_metric.recall_score

    mlflow.log_metric("f1_score", f1_score)
    mlflow.log_metric("precision", precision_score)
    mlflow.log_metric("recall_score", recall_score)
    mlflow.sklearn.log_model(best_model, "model")
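Since runs are tracked on DagsHub, MLflow is presumably pointed at a remote tracking server rather than the local mlruns directory. A minimal sketch, assuming a DagsHub-hosted MLflow endpoint; the URI and credentials below are placeholders.

import os
import mlflow

# Send runs to a remote MLflow tracking server (placeholder DagsHub-style URI)
os.environ["MLFLOW_TRACKING_USERNAME"] = "<dagshub-username>"
os.environ["MLFLOW_TRACKING_PASSWORD"] = "<dagshub-token>"
mlflow.set_tracking_uri("https://dagshub.com/<username>/<repo>.mlflow")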

Model Serialization

Persisting model and preprocessor for deployment

Artifacts: model.pkl, preprocessor.pkl, NetworkModel wrapper, AWS S3
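One way to wire these artifacts together is a thin wrapper that bundles the fitted preprocessor and classifier, pickled to disk and synced to S3. A sketch under the assumption that NetworkModel is a simple preprocess-then-predict wrapper; the bucket name and object keys are placeholders.

import pickle
import boto3

class NetworkModel:
    """Bundle the fitted preprocessor and classifier for inference."""

    def __init__(self, preprocessor, model):
        self.preprocessor = preprocessor
        self.model = model

    def predict(self, X):
        return self.model.predict(self.preprocessor.transform(X))

# Persist the artifacts locally
with open("preprocessor.pkl", "wb") as f:
    pickle.dump(preprocessor, f)
with open("model.pkl", "wb") as f:
    pickle.dump(NetworkModel(preprocessor, best_model), f)

# Sync to S3 for deployment (bucket name and keys are placeholders)
s3 = boto3.client("s3")
s3.upload_file("model.pkl", "<bucket-name>", "model.pkl")
s3.upload_file("preprocessor.pkl", "<bucket-name>", "preprocessor.pkl")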

Key Model Evaluation Metrics

F1 Score: 0.96 (harmonic mean of precision and recall)

Precision: 0.95 (true positives / (true positives + false positives))

Recall: 0.97 (true positives / (true positives + false negatives))
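For reference, these metrics can be computed directly with scikit-learn. The sketch below assumes X_test, y_test, and the selected best_model from the steps above; variable names are illustrative.

from sklearn.metrics import f1_score, precision_score, recall_score

y_pred = best_model.predict(X_test)
print("f1:", f1_score(y_test, y_pred))                 # 0.96 on this project's test set
print("precision:", precision_score(y_test, y_pred))   # 0.95
print("recall:", recall_score(y_test, y_pred))          # 0.97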

Best Algorithm: Random Forest (n_estimators=256, balanced class weights)
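The winning configuration can be reproduced directly; a minimal sketch, where random_state is an assumption added for reproducibility.

from sklearn.ensemble import RandomForestClassifier

# Best-performing configuration: 256 trees with balanced class weights
final_model = RandomForestClassifier(
    n_estimators=256,
    class_weight="balanced",
    random_state=42,  # assumption, for reproducibility
)
final_model.fit(X_train, y_train)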