CIFAR-10 Two-Layer Classifier
A two-layer neural network built from scratch in NumPy — cyclical learning rates, coarse-to-fine hyperparameter search, and analytic gradients verified against PyTorch to ~1e-16 error. 49.77% test accuracy.
Best test accuracy: 49.77%
Gradient max error vs. autograd: ~1e-16
Training images used in λ search: ~45k
Overview
No model.fit(). No autograd. For KTH's Deep Learning in Data Science course, I built a two-layer fully connected network from scratch in NumPy (forward pass, ReLU, softmax cross-entropy loss, and the full analytic backpropagation) and trained it on CIFAR-10.
Baseline test accuracy was 46.01%; the final configuration reached 49.77%, a gain of 3.76 percentage points and around the practical ceiling for a two-layer fully connected net on raw CIFAR-10 pixels.
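To make the forward pass and analytic backpropagation concrete, here is a minimal NumPy sketch. It is not the repository's code. The column-per-sample layout, the function names, and the cost convention (mean cross-entropy plus λ times the sum of squared weights) are assumptions made for illustration. The project checks its gradients against PyTorch autograd; this sketch swaps in centered finite differences so that it stays NumPy-only.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """X has shape (d, n): one sample per column (assumed layout)."""
    H = np.maximum(0.0, W1 @ X + b1)        # ReLU hidden layer, shape (m, n)
    S = W2 @ H + b2                         # class scores, shape (K, n)
    S = S - S.max(axis=0, keepdims=True)    # shift scores for a stable softmax
    P = np.exp(S)
    P /= P.sum(axis=0, keepdims=True)       # class probabilities per column
    return H, P

def cost(X, Y, params, lam):
    """Mean cross-entropy plus L2 penalty lam * (||W1||^2 + ||W2||^2)."""
    W1, b1, W2, b2 = params
    _, P = forward(X, W1, b1, W2, b2)
    cross_entropy = -np.sum(Y * np.log(P)) / X.shape[1]
    return cross_entropy + lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))

def gradients(X, Y, params, lam):
    """Analytic gradients of cost() with respect to W1, b1, W2, b2."""
    W1, b1, W2, b2 = params
    n = X.shape[1]
    H, P = forward(X, W1, b1, W2, b2)
    G = (P - Y) / n                         # gradient w.r.t. the scores
    gW2 = G @ H.T + 2 * lam * W2
    gb2 = G.sum(axis=1, keepdims=True)
    G = (W2.T @ G) * (H > 0)                # back through the ReLU
    gW1 = G @ X.T + 2 * lam * W1
    gb1 = G.sum(axis=1, keepdims=True)
    return [gW1, gb1, gW2, gb2]

def numerical_gradients(X, Y, params, lam, h=1e-5):
    """Centered finite differences; perturbs each parameter entry in place."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + h
            c_plus = cost(X, Y, params, lam)
            p[idx] = old - h
            c_minus = cost(X, Y, params, lam)
            p[idx] = old                    # restore the original value
            g[idx] = (c_plus - c_minus) / (2 * h)
        grads.append(g)
    return grads

# Tiny random problem (hypothetical sizes, far smaller than CIFAR-10).
rng = np.random.default_rng(0)
d, m, K, n = 20, 8, 10, 5
X = rng.standard_normal((d, n))
Y = np.eye(K)[:, rng.integers(0, K, n)]     # one-hot labels, shape (K, n)
params = [rng.standard_normal((m, d)) / np.sqrt(d), np.zeros((m, 1)),
          rng.standard_normal((K, m)) / np.sqrt(m), np.zeros((K, 1))]

analytic = gradients(X, Y, params, lam=0.01)
numeric = numerical_gradients(X, Y, params, lam=0.01)
max_err = max(np.max(np.abs(a - b)) for a, b in zip(analytic, numeric))
print(f"max abs difference: {max_err:.2e}")
```

A finite-difference check is limited by truncation and round-off error, so it will not reach the ~1e-16 agreement reported against autograd; it is only a cheap way to catch sign and indexing mistakes.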
The pipeline
- Per-dimension data normalization, mini-batch gradient descent with L2 regularization, and cyclical (triangular) learning rates following Smith (2015) with η ∈ [1e-5, 1e-1].
- Gradient verification to machine precision: every analytic gradient is checked against PyTorch autograd, with max error around 1e-16 — plus an overfit-a-tiny-subset sanity check before any real training run.
- Coarse-to-fine log-scale search for λ over ~45k training images: coarse sweep over λ ∈ [1e-5, 1e-1], fine sweep over [5e-4, 7e-3], settling on λ = 1.07e-3 for the final 3-cycle run.
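Two pieces of the pipeline above, the triangular learning-rate schedule and the log-scale λ search, can be sketched as follows. The η and λ ranges are the ones stated in the list. The half-cycle length `n_s`, the use of random log-uniform draws rather than a fixed grid, the number of candidates, and both function names are assumptions for illustration.

```python
import numpy as np

def triangular_lr(t, eta_min=1e-5, eta_max=1e-1, n_s=800):
    """Triangular cyclical learning rate at update step t.

    One full cycle spans 2 * n_s steps: a linear ramp from eta_min up to
    eta_max over n_s steps, then a linear ramp back down.
    n_s = 800 is a hypothetical value, not taken from the project.
    """
    pos = t % (2 * n_s)
    frac = pos / n_s if pos <= n_s else 2.0 - pos / n_s
    return eta_min + frac * (eta_max - eta_min)

def sample_lambdas(lo, hi, count, rng):
    """Draw `count` candidates uniformly in log10-space between lo and hi."""
    exponents = rng.uniform(np.log10(lo), np.log10(hi), size=count)
    return 10.0 ** exponents

# Learning rates for a 3-cycle run with the assumed half-cycle length.
n_s = 800
etas = [triangular_lr(t, n_s=n_s) for t in range(3 * 2 * n_s)]

# Coarse stage over the full range, then a fine stage over the narrowed range.
rng = np.random.default_rng(0)
coarse = sample_lambdas(1e-5, 1e-1, 8, rng)
fine = sample_lambdas(5e-4, 7e-3, 8, rng)
```

Sampling in log-space matters because the useful values of λ span several orders of magnitude; uniform sampling in linear space would place almost every candidate near the top of the range.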
Bonus findings
I extended the baseline with four cumulative improvements (a wider hidden layer with m = 256, horizontal flip augmentation, random translation, and dropout) and found something satisfying: augmentation consistently shifted the optimal λ smaller, empirical evidence that data augmentation and L2 regularization act as partial substitutes for each other.
All findings are documented in a LaTeX report covering gradient verification, cyclic LR dynamics across 1- and 3-cycle runs, and the top hyperparameter configurations from both search stages.
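The flip and translation augmentations mentioned above could look roughly like the sketch below. It assumes images are stored as flattened columns of shape (3072, n) in channel-major (3, 32, 32) order. The flip probability, the maximum shift of 3 pixels, the zero padding, and the function name are hypothetical choices; the report may use different ones.

```python
import numpy as np

def augment(X, rng, p_flip=0.5, max_shift=3):
    """Randomly flip and translate a batch of flattened CIFAR-10 images.

    X has shape (3072, n); each column is assumed to be one image stored as
    (channel, row, column) = (3, 32, 32). Returns a new array, same shape.
    """
    n = X.shape[1]
    imgs = X.T.reshape(n, 3, 32, 32).copy()

    # Horizontal flip: reverse the column axis for a random subset of images.
    flip = rng.random(n) < p_flip
    imgs[flip] = imgs[flip][..., ::-1]

    # Random translation by up to max_shift pixels; exposed pixels stay zero.
    out = np.zeros_like(imgs)
    shifts = rng.integers(-max_shift, max_shift + 1, size=(n, 2))
    for i, (dy, dx) in enumerate(shifts):
        src_y = slice(max(0, -dy), 32 - max(0, dy))
        dst_y = slice(max(0, dy), 32 - max(0, -dy))
        src_x = slice(max(0, -dx), 32 - max(0, dx))
        dst_x = slice(max(0, dx), 32 - max(0, -dx))
        out[i, :, dst_y, dst_x] = imgs[i, :, src_y, src_x]

    return out.reshape(n, -1).T

# Apply to a random stand-in batch (not real CIFAR-10 data).
rng = np.random.default_rng(0)
batch = rng.standard_normal((3072, 4))
augmented = augment(batch, rng)
```

Augmentation of this kind is normally applied afresh to each mini-batch during training only, so the validation and test images stay untouched.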
Source
Code and report: github.com/rippyboii/cifar10-twolayer-classifer