Project Overview

Problem Statement

Understanding how student performance (test scores) is affected by:

  • Gender & Race/Ethnicity
  • Parental level of education
  • Lunch type (standard vs free/reduced)
  • Test preparation course completion

Dataset

Kaggle: Students Performance in Exams

1,000 records • 8 columns

Pipeline Components

Data Ingestion

CSV loading, train/test splitting, artifact storage
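
As an illustration, the ingestion step might look like the sketch below. It assumes pandas and scikit-learn. The function name `ingest`, the `artifacts` folder, the output file names, and the 80/20 split are hypothetical choices for this example, not details taken from the project.

```python
from pathlib import Path

import pandas as pd
from sklearn.model_selection import train_test_split


def ingest(csv_path, artifact_dir="artifacts", test_size=0.2, seed=42):
    """Load the raw CSV, split it, and save each piece as an artifact.

    All names and defaults here are illustrative assumptions.
    """
    df = pd.read_csv(csv_path)

    # Create the artifact folder if it does not exist yet.
    out = Path(artifact_dir)
    out.mkdir(parents=True, exist_ok=True)

    # A fixed seed keeps the split reproducible between runs.
    train_df, test_df = train_test_split(df, test_size=test_size, random_state=seed)

    # Store the raw copy plus both splits so later stages read from disk.
    df.to_csv(out / "raw.csv", index=False)
    train_df.to_csv(out / "train.csv", index=False)
    test_df.to_csv(out / "test.csv", index=False)
    return out / "train.csv", out / "test.csv"
```

Writing the splits to disk, rather than passing DataFrames around, lets the transformation and training stages be rerun without repeating ingestion.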

Data Transformation

Imputation, encoding, scaling via preprocessing pipeline
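
One plausible shape for the preprocessing pipeline is sketched below using scikit-learn's `ColumnTransformer`. The helper name `build_preprocessor`, the imputation strategies (median, most frequent), and the use of one-hot encoding are assumptions; the slide only says that imputation, encoding, and scaling take place.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols):
    """Return an unfitted transformer (hypothetical helper, assumed strategies)."""
    # Numeric columns: fill gaps with the median, then standardize.
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical columns: fill gaps with the most frequent value, then
    # one-hot encode; unseen categories at predict time are ignored.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])
```

The transformer would be fitted on the training split only and then applied to the test split, so that imputation and scaling statistics do not leak from test data.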

Model Training

Several regression models, each tuned with GridSearchCV
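
A minimal sketch of how several regressors could be tuned and compared with `GridSearchCV` follows. The candidate models, their parameter grids, the R² metric, and the function name `tune_and_compare` are all assumptions for illustration; the slide does not list the actual models or grids.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV

# Hypothetical candidates and grids; not taken from the project.
CANDIDATES = {
    "linear": (LinearRegression(), {}),
    "ridge": (Ridge(), {"alpha": [0.1, 1.0, 10.0]}),
    "random_forest": (
        RandomForestRegressor(random_state=42),
        {"n_estimators": [50, 100], "max_depth": [None, 5]},
    ),
}


def tune_and_compare(X_train, y_train, X_test, y_test, cv=3):
    """Grid-search each candidate, then score its best estimator on the test split."""
    report = {}
    for name, (model, grid) in CANDIDATES.items():
        # GridSearchCV refits the best parameter combination on all training data.
        search = GridSearchCV(model, grid, cv=cv, scoring="r2")
        search.fit(X_train, y_train)
        report[name] = {
            "best_params": search.best_params_,
            "test_r2": r2_score(y_test, search.predict(X_test)),
        }
    # Pick the model with the highest held-out R².
    best_name = max(report, key=lambda n: report[n]["test_r2"])
    return best_name, report
```

An empty grid, as used for `LinearRegression` here, makes `GridSearchCV` evaluate the default configuration once, so every candidate goes through the same cross-validation path.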

Azure Deployment

Docker containerization with CI/CD via GitHub Actions