MACHINE LEARNING · SHIPPED

CIFAR-10 Two-Layer Classifier

A two-layer neural network built from scratch in NumPy — cyclical learning rates, coarse-to-fine hyperparameter search, and analytic gradients verified against PyTorch to ~1e-16 error. 49.77% test accuracy.

ROLE: Sole author
TEAM: Individual, KTH Deep Learning in Data Science
TIMELINE: Mar 2026 to Jun 2026
CATEGORY: Machine Learning

49.77%

BEST TEST ACCURACY

~1e-16

GRADIENT MAX ERROR VS AUTOGRAD

~45k

λ SEARCH IMAGES

Overview

No model.fit(). No autograd. For KTH's Deep Learning in Data Science course I built a two-layer fully connected network from scratch in NumPy — forward pass, softmax cross-entropy loss, ReLU, and the full analytic backpropagation — and trained it on CIFAR-10.
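As a rough illustration of what "from scratch" involves, the sketch below implements the forward pass, the L2-regularized softmax cross-entropy cost, and the analytic backward pass for a two-layer ReLU network in NumPy. It is not the project's code: the row-major data layout (one image per row), the function names, and the tiny random test problem are all assumptions made for this example. To stay NumPy-only it also swaps the PyTorch autograd comparison for a centered finite-difference check, which is a cruder test and will not reach the ~1e-16 agreement reported against autograd.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """X: (n, d). Returns hidden activations H (n, m) and probabilities P (n, K)."""
    H = np.maximum(0.0, X @ W1 + b1)        # ReLU hidden layer
    S = H @ W2 + b2                         # class scores (logits)
    S = S - S.max(axis=1, keepdims=True)    # shift for numerical stability
    E = np.exp(S)
    P = E / E.sum(axis=1, keepdims=True)    # softmax over classes
    return H, P

def cost(X, Y, params, lam):
    """Mean cross-entropy plus L2 penalty on the weight matrices. Y is one-hot (n, K)."""
    W1, b1, W2, b2 = params
    _, P = forward(X, W1, b1, W2, b2)
    cross_entropy = -np.sum(Y * np.log(P)) / X.shape[0]
    return cross_entropy + lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2))

def gradients(X, Y, params, lam):
    """Analytic gradients of cost() with respect to W1, b1, W2, b2."""
    W1, b1, W2, b2 = params
    n = X.shape[0]
    H, P = forward(X, W1, b1, W2, b2)
    G = (P - Y) / n                         # gradient w.r.t. the logits
    dW2 = H.T @ G + 2 * lam * W2
    db2 = G.sum(axis=0)
    G = (G @ W2.T) * (H > 0)                # back through the ReLU
    dW1 = X.T @ G + 2 * lam * W1
    db1 = G.sum(axis=0)
    return [dW1, db1, dW2, db2]

def numerical_gradients(X, Y, params, lam, h=1e-5):
    """Centered finite differences, one parameter entry at a time (slow, for checking only)."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + h
            c_plus = cost(X, Y, params, lam)
            p[idx] = old - h
            c_minus = cost(X, Y, params, lam)
            p[idx] = old                    # restore the original value
            g[idx] = (c_plus - c_minus) / (2 * h)
        grads.append(g)
    return grads

def max_gradient_error(seed=0, n=4, d=6, m=5, K=3, lam=0.01):
    """Largest absolute analytic-vs-numerical difference on a tiny random problem."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    Y = np.eye(K)[rng.integers(0, K, size=n)]
    params = [0.1 * rng.standard_normal((d, m)), np.zeros(m),
              0.1 * rng.standard_normal((m, K)), np.zeros(K)]
    analytic = gradients(X, Y, params, lam)
    numeric = numerical_gradients(X, Y, params, lam)
    return max(np.max(np.abs(a - b)) for a, b in zip(analytic, numeric))
```

Calling max_gradient_error() should return a small number if the backward pass matches the cost. Finite differences carry their own truncation and rounding error, so a comparison against autograd in float64, as used in the project, is the sharper test.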

Baseline accuracy was 46.01%; the final configuration reached 49.77% test accuracy, which is close to the practical ceiling for a two-layer fully connected network on raw CIFAR-10 pixels.

The pipeline

  • Per-dimension data normalization, mini-batch gradient descent with L2 regularization, and cyclical (triangular) learning rates following Smith (2015) with η ∈ [1e-5, 1e-1].
  • Gradient verification to machine precision: every analytic gradient is checked against PyTorch autograd, with max error around 1e-16 — plus an overfit-a-tiny-subset sanity check before any real training run.
  • Coarse-to-fine log-scale search for λ, training on ~45k images: a coarse sweep over λ ∈ [1e-5, 1e-1], then a fine sweep over [5e-4, 7e-3], settling on λ = 1.07e-3 for the final 3-cycle run.
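Two of the pieces listed above are easy to sketch in a few lines: the triangular learning-rate schedule and the log-uniform sampling of λ used in each search stage. The η and λ ranges come from the text; the half-cycle length step_size, the number of candidates per stage, and the function names are assumptions for illustration, and the training and validation step that would score each candidate is left out.

```python
import numpy as np

def triangular_lr(t, eta_min=1e-5, eta_max=1e-1, step_size=800):
    """Learning rate at update step t (0-based).

    The rate climbs linearly from eta_min to eta_max over step_size updates,
    then falls back over the next step_size updates. One full cycle is
    therefore 2 * step_size updates. step_size=800 is an assumed value.
    """
    pos = t % (2 * step_size)
    if pos <= step_size:
        frac = pos / step_size            # rising half of the cycle
    else:
        frac = 2.0 - pos / step_size      # falling half of the cycle
    return eta_min + frac * (eta_max - eta_min)

def sample_lambdas(lam_min, lam_max, n, rng):
    """Draw n regularization strengths uniformly in log10 space."""
    exponents = rng.uniform(np.log10(lam_min), np.log10(lam_max), size=n)
    return 10.0 ** exponents

rng = np.random.default_rng(0)
coarse = sample_lambdas(1e-5, 1e-1, 8, rng)   # coarse stage range from the text
fine = sample_lambdas(5e-4, 7e-3, 8, rng)     # fine stage range from the text
```

Sampling exponents rather than λ directly spreads candidates evenly across orders of magnitude, which is why the coarse range can span four decades with only a handful of trials. In a full search, each sampled λ would be trained for a fixed number of cycles and ranked by validation accuracy.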

Bonus findings

I extended the baseline with four cumulative improvements (a wider hidden layer with m = 256, horizontal flip augmentation, random translation, and dropout) and found something satisfying: augmentation consistently shifted the optimal λ toward smaller values, which supports the view that data augmentation and L2 regularization act as substitute forms of regularization.
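For readers unfamiliar with the two augmentations, here is a minimal sketch of how random horizontal flips and small random translations could be applied to a mini-batch in NumPy. The details are assumptions rather than the project's settings: images are taken to be flattened rows in channel-major (3, 32, 32) order, the flip probability is 0.5, shifts are at most 3 pixels, and uncovered pixels are filled with zeros.

```python
import numpy as np

def augment(X, rng, p_flip=0.5, max_shift=3):
    """Return an augmented copy of X, shape (n, 3072), leaving X untouched."""
    n = X.shape[0]
    imgs = X.reshape(n, 3, 32, 32).copy()

    # Mirror a random subset of images left to right.
    flip = rng.random(n) < p_flip
    imgs[flip] = imgs[flip, :, :, ::-1]

    # Shift each image by an independent (dy, dx) offset, zero-filling the border.
    out = np.zeros_like(imgs)
    shifts = rng.integers(-max_shift, max_shift + 1, size=(n, 2))
    for i, (dy, dx) in enumerate(shifts):
        src_y = slice(max(0, -dy), 32 - max(0, dy))
        dst_y = slice(max(0, dy), 32 - max(0, -dy))
        src_x = slice(max(0, -dx), 32 - max(0, dx))
        dst_x = slice(max(0, dx), 32 - max(0, -dx))
        out[i, :, dst_y, dst_x] = imgs[i, :, src_y, src_x]

    return out.reshape(n, -1)
```

Applying augment to each mini-batch with a fresh random draw means the network rarely sees the exact same input twice, which is one plausible reason the best λ moves toward smaller values once augmentation is switched on.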

All findings are documented in a LaTeX report covering gradient verification, cyclical learning-rate dynamics across 1- and 3-cycle runs, and the top hyperparameter configurations from both search stages.

Source

Code and report: github.com/rippyboii/cifar10-twolayer-classifer

APPENDIX A // MEDIA

FIG.01: CYCLICAL LEARNING-RATE SCHEDULE ACROSS 3 CYCLES

FIG.02: COARSE-TO-FINE λ SEARCH RESULTS