Physics-Informed vs Raw MLP Classifier
An interactive comparison of two learning philosophies: one model starts with physical structure, the other starts with raw features. They train side-by-side on the same projectile dataset so you can see the trade-off between inductive bias and representation power in real time.
Live Training & Visualization
Both models train simultaneously on matched batches. You can inspect trajectory predictions, decision fields, confusion matrices, and learning curves while changing hyperparameters.
The short story
This page is a controlled experiment about inductive bias. The question is simple: when we already know the governing physics, does encoding that structure improve sample efficiency and training stability compared to a generic MLP?
The experimental rule is strict: keep data generation and evaluation synchronized, and change only representation. That way, the performance gap is interpretable rather than anecdotal.
Before the details
This project isolates one variable on purpose: representation. Both models are trained on the same task, same labels, same data stream, and same evaluation rhythm. The only thing that changes is what we feed them before optimization starts.
In the physics-informed branch, I encode known ballistic structure into the feature itself. In the raw branch, I do not provide that shortcut — the MLP has to infer the geometry from gradients. This is why the comparison is meaningful: we can observe not only final accuracy, but also stability, sample efficiency, and boundary formation during training.
So the page is less about “which model wins” and more about learning dynamics under controlled conditions. If one branch converges faster or behaves more calmly, we can attribute that to inductive bias rather than dataset luck.
Quick glossary
MLP (Multi-Layer Perceptron): A feed-forward neural network that learns nonlinear decision boundaries from data.
Physics-informed: A model that receives explicit structure from known equations before training.
Feature: Any numeric quantity the model receives as input.
Inductive bias: Built-in structure or assumptions that guide learning.
Decision field: A visual map of predicted classes across input space.
Confusion matrix: A compact view of correct predictions versus class-specific mistakes.
Validation: Evaluation on held-out samples to estimate generalization.
What you are seeing first
The target is a 3-class projectile range classifier. Colors are fixed across all panels so geometric changes in the decision boundary can be interpreted immediately.
Blue marks short range, green medium range, and red long range. Keeping this mapping invariant across views is what makes the live comparison readable rather than merely cosmetic.
Problem setup
Given initial speed \(v\) and launch angle \(\theta\), the goal is to map the resulting range \(R\) to one of the three class intervals above.
The noiseless generator relation is:
\[
R = \frac{v^2 \sin(2\theta)}{g}
\]
This equation defines the target manifold that both models are implicitly trying to separate.
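To make the setup concrete, here is a minimal sketch of the label generator. The class thresholds (30 m / 60 m) and the sampling ranges for \(v\) and \(\theta\) are illustrative assumptions, not the demo's exact cut points:

```javascript
// Sketch of the noiseless generator: (v, theta) -> range R -> class label.
// Thresholds and sampling ranges below are assumptions for illustration.
const g = 9.81;

function rangeOf(v, theta) {
  // Ideal projectile range on flat ground: R = v^2 sin(2*theta) / g
  return (v * v * Math.sin(2 * theta)) / g;
}

function labelOf(R) {
  if (R < 30) return 0; // short  (blue)
  if (R < 60) return 1; // medium (green)
  return 2;             // long   (red)
}

function sample() {
  const v = 5 + Math.random() * 25;                        // speed in m/s
  const theta = ((5 + Math.random() * 80) * Math.PI) / 180; // angle in radians
  const R = rangeOf(v, theta);
  return { v, theta, R, y: labelOf(R) };
}
```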
Physics-informed model
The physics branch explicitly constructs \(\hat{R}\) and feeds \([1,\hat{R}]\) into a linear softmax layer. This deliberately low-capacity model is useful because it exposes exactly how far a well-chosen feature can go.
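A minimal sketch of this branch, assuming a 3×2 weight matrix over the two-element feature (the weight values themselves are placeholders, not the trained demo weights):

```javascript
// Physics branch sketch: engineered feature [1, R_hat] -> linear softmax.
const g = 9.81;

function physicsFeature(v, theta) {
  const rHat = (v * v * Math.sin(2 * theta)) / g; // known ballistic structure
  return [1, rHat];                               // bias term + R_hat
}

// Numerically stable softmax (repeated here so the sketch is self-contained).
function softmax(z) {
  const m = Math.max(...z);
  const e = z.map(x => Math.exp(x - m));
  const s = e.reduce((a, b) => a + b, 0);
  return e.map(x => x / s);
}

// W has one row per class; logits are a plain linear map of the feature.
function physicsForward(W, v, theta) {
  const f = physicsFeature(v, theta);
  return softmax(W.map(row => row[0] * f[0] + row[1] * f[1]));
}
```

With only three weight rows of two numbers each, the entire model is small enough to print and inspect, which is exactly the point of this branch.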
Raw MLP model
The raw model receives the feature vector
\[
x = [1,\; v/60,\; \sin\theta,\; \cos\theta]
\]
and passes this vector through a two-layer MLP (16 hidden units, Leaky-ReLU, Adam). In this branch, the geometric structure must be learned from gradients rather than injected analytically.
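The raw feature construction itself is a one-liner; the \(v/60\) term normalizes speed into a comparable scale with the trigonometric components:

```javascript
// Raw branch input, as described above: [1, v/60, sin(theta), cos(theta)].
// Noise is added elsewhere in the pipeline; this is the clean version.
function rawFeature(v, theta) {
  return [1, v / 60, Math.sin(theta), Math.cos(theta)];
}
```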
Fair comparison protocol
The protocol is synchronized by construction: matched mini-batches from a seeded generator, fixed 600-sample dataset split (70/30), and live evaluation on the same validation set each tick. A mild perturbation \(\sigma=0.02\) is added to the raw trigonometric inputs. The physics branch uses SGD-style updates while the MLP uses Adam, so architecture differs but the experimental stream does not.
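The synchronization hinges on seeding: both branches must see identical samples. Here is one way to sketch that, assuming a mulberry32 PRNG and a Box-Muller step for the \(\sigma=0.02\) perturbation (neither is claimed to be the demo's exact implementation):

```javascript
// Seeded PRNG so both branches consume an identical sample stream.
// mulberry32 + Box-Muller are assumptions, not the demo's exact code.
function mulberry32(seed) {
  return function () {
    seed |= 0; seed = (seed + 0x6D2B79F5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function gaussian(rand) {
  // Box-Muller transform: two uniforms -> one standard normal sample.
  const u = Math.max(rand(), 1e-12), v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

function noisyRawFeature(v, theta, rand, sigma = 0.02) {
  // Mild perturbation applied only to the trigonometric inputs.
  return [
    1,
    v / 60,
    Math.sin(theta) + sigma * gaussian(rand),
    Math.cos(theta) + sigma * gaussian(rand),
  ];
}
```

Two generators constructed with the same seed replay the same stream, which is what makes the side-by-side evaluation deterministic.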
What I usually observe
In early training, the physics-informed model typically stabilizes much faster. Its decision field quickly snaps into iso-range-like geometry because the core structure is already encoded in the feature map. The raw MLP can absolutely catch up, but it usually spends more updates discovering that same geometry from scratch.
The practical difference shows up in sensitivity too: the MLP is generally more reactive to learning rate and batch-size choices, while the physics-informed path is calmer and more predictable. By the end, both models can perform strongly, but the route they take to get there is meaningfully different.
Implementation notes
The host page uses Next.js, React, and TypeScript for layout and interaction flow, while animation transitions are handled by Framer Motion. The ML runtime itself is intentionally implemented in vanilla JavaScript inside the embedded visualization.
For this project I deliberately avoid TensorFlow.js or PyTorch. Forward passes, softmax, cross-entropy gradients, SGD updates (physics branch), and Adam updates (MLP branch) are implemented explicitly. That choice keeps every step inspectable and makes it easier to audit where performance differences come from.
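As an example of what "implemented explicitly" means here, this is a bias-corrected Adam update for a flat parameter array. The hyperparameters are the common defaults; the demo's exact values are not stated above, so treat them as assumptions:

```javascript
// One explicit Adam step. state carries the step counter and the first/second
// moment estimates: { t: 0, m: [...zeros], v: [...zeros] }.
// Hyperparameter defaults are assumptions (the usual lr=1e-3, b1=0.9, b2=0.999).
function adamStep(params, grads, state, lr = 0.001, b1 = 0.9, b2 = 0.999, eps = 1e-8) {
  state.t += 1;
  for (let i = 0; i < params.length; i++) {
    // Exponential moving averages of gradient and squared gradient
    state.m[i] = b1 * state.m[i] + (1 - b1) * grads[i];
    state.v[i] = b2 * state.v[i] + (1 - b2) * grads[i] * grads[i];
    // Bias correction for the zero-initialized moments
    const mHat = state.m[i] / (1 - Math.pow(b1, state.t));
    const vHat = state.v[i] / (1 - Math.pow(b2, state.t));
    params[i] -= (lr * mHat) / (Math.sqrt(vHat) + eps);
  }
}
```

Writing the update this way, rather than behind a framework call, is what makes it possible to diff the two branches' optimization behavior line by line.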
Feature design is the real axis of comparison: the raw branch ingests \([1,\; v/60,\; \sin\theta,\; \cos\theta]\), while the physics-informed branch uses \(\hat{R} = v^2 \sin(2\theta)/g\). Both are then trained under the same seeded stream and synchronized evaluation cycle.
Architecturally, this separation of concerns is important: the React layer focuses on narrative and orchestration, and the embedded runtime owns numerical updates plus rendering. That keeps UI responsiveness high without compromising deterministic, side-by-side comparability.
Here is the core snippet. Note that it is a deliberately simplified, standalone variant: plain ReLU, vanilla SGD, and a default of 8 hidden units, whereas the live MLP branch described above uses 16 Leaky-ReLU units trained with Adam.
```javascript
// Numerically stable softmax for converting logits -> probabilities
function softmax(z) {
  // Shift by max logit to avoid overflow in exp()
  const m = Math.max(...z);
  const e = z.map(v => Math.exp(v - m));
  const s = e.reduce((a, b) => a + b, 0);
  return e.map(v => v / s);
}

// Minimal 1-hidden-layer MLP used for the "raw feature" branch
class MLP {
  constructor(inputSize = 4, hiddenSize = 8, outputSize = 3) {
    this.inputSize = inputSize;
    this.hiddenSize = hiddenSize;
    this.outputSize = outputSize;
    // Xavier-like small random initialization for layer 1
    this.W1 = Array.from({ length: hiddenSize }, () =>
      Array.from({ length: inputSize }, () => (Math.random() - 0.5) * 0.1)
    );
    this.b1 = Array(hiddenSize).fill(0);
    // Same initialization style for output layer
    this.W2 = Array.from({ length: outputSize }, () =>
      Array.from({ length: hiddenSize }, () => (Math.random() - 0.5) * 0.1)
    );
    this.b2 = Array(outputSize).fill(0);
  }

  // Forward pass: ReLU hidden layer -> linear logits -> softmax probs
  forward(x) {
    const h = new Array(this.hiddenSize);
    for (let i = 0; i < this.hiddenSize; i++) {
      let s = this.b1[i];
      for (let j = 0; j < this.inputSize; j++) s += this.W1[i][j] * x[j];
      h[i] = Math.max(0, s); // ReLU activation
    }
    const z = new Array(this.outputSize);
    for (let c = 0; c < this.outputSize; c++) {
      let s = this.b2[c];
      for (let i = 0; i < this.hiddenSize; i++) s += this.W2[c][i] * h[i];
      z[c] = s;
    }
    return { h, p: softmax(z) };
  }

  // Argmax class prediction
  pred(x) {
    const { p } = this.forward(x);
    return p.indexOf(Math.max(...p));
  }
}

// One SGD training step with cross-entropy gradient
function trainStep(model, x, y, lr = 0.01) {
  // Forward pass
  const { h, p } = model.forward(x);
  // dL/dz for softmax + cross-entropy
  const dz = p.slice();
  dz[y] -= 1;
  // Backprop into the hidden layer BEFORE updating W2, so the gradient
  // uses the pre-update output weights
  const dh = Array(model.hiddenSize).fill(0);
  for (let i = 0; i < model.hiddenSize; i++) {
    for (let c = 0; c < model.outputSize; c++) {
      dh[i] += model.W2[c][i] * dz[c];
    }
  }
  // Update output layer params
  for (let c = 0; c < model.outputSize; c++) {
    for (let i = 0; i < model.hiddenSize; i++) {
      model.W2[c][i] -= lr * dz[c] * h[i];
    }
    model.b2[c] -= lr * dz[c];
  }
  // ReLU derivative + update first layer params
  for (let i = 0; i < model.hiddenSize; i++) {
    const grad = h[i] > 0 ? dh[i] : 0;
    for (let j = 0; j < model.inputSize; j++) {
      model.W1[i][j] -= lr * grad * x[j];
    }
    model.b1[i] -= lr * grad;
  }
}
```
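A useful sanity check on the snippet's key step is the identity it relies on: for softmax plus cross-entropy, \(\partial L / \partial z = p - \text{onehot}(y)\). This self-contained sketch verifies that analytically computed gradient against finite differences (softmax is repeated here so the sketch runs on its own):

```javascript
// Finite-difference check of dL/dz = p - onehot(y) for softmax + cross-entropy.
function softmax(z) {
  const m = Math.max(...z);
  const e = z.map(v => Math.exp(v - m));
  const s = e.reduce((a, b) => a + b, 0);
  return e.map(v => v / s);
}

// Cross-entropy loss of logits z against true class y
function loss(z, y) {
  return -Math.log(softmax(z)[y]);
}

// Returns the per-component absolute gap between numeric and analytic gradients
function checkGradient(z, y, eps = 1e-6) {
  const analytic = softmax(z).map((p, c) => p - (c === y ? 1 : 0));
  return z.map((_, c) => {
    const zp = z.slice(); zp[c] += eps;
    const zm = z.slice(); zm[c] -= eps;
    const numeric = (loss(zp, y) - loss(zm, y)) / (2 * eps);
    return Math.abs(numeric - analytic[c]);
  });
}
```

This kind of check is cheap insurance when every gradient is hand-written, as they are in this project.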