Interactive Neural Network: Draw → Process → Recognize
$$\text{Architecture: } 784 \text{ (28×28 pixels)} \rightarrow 128 \rightarrow 64 \rightarrow 32 \rightarrow 10 \text{ (digits 0-9)}$$
$$\text{Total Parameters: } 784 \times 128 + 128 + 128 \times 64 + 64 + 64 \times 32 + 32 + 32 \times 10 + 10 = 111,146$$
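The count can be checked layer by layer: each connection matrix contributes (inputs × outputs) weights plus one bias per output neuron. A minimal sketch in plain Python:

```python
# Layer sizes from the architecture above.
layers = [784, 128, 64, 32, 10]

# Each layer transition adds n_in * n_out weights plus n_out biases.
total = sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))
print(total)  # 111146
```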
Neuron Activation & Common Activation Functions
$$\text{Neuron Output: } a = f(w_1x_1 + w_2x_2 + \dots + w_nx_n + b)$$
$$\text{where } f \text{ is the activation function}$$
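Concretely, a neuron computes a dot product plus a bias and passes the result through f. A minimal sketch (the input values, weights, bias, and the choice of ReLU here are illustrative assumptions):

```python
import numpy as np

def neuron_output(x, w, b, f):
    """Weighted sum of inputs plus bias, passed through activation f."""
    return f(np.dot(w, x) + b)

relu = lambda z: np.maximum(0.0, z)

x = np.array([0.5, -1.0, 2.0])   # assumed example inputs
w = np.array([0.4, 0.3, -0.2])   # assumed example weights
b = 0.1

# 0.5*0.4 - 1.0*0.3 + 2.0*(-0.2) + 0.1 = -0.4, so relu(-0.4) = 0.0
print(neuron_output(x, w, b, relu))
```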
Interactive Neuron Visualization
Selectable activation functions: ReLU, Leaky ReLU, Sigmoid, Tanh, Softmax
Activation Function Comparison
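For reference, the definitions of the five functions being compared (α in Leaky ReLU is a small positive constant, commonly 0.01):

$$\text{ReLU}(z) = \max(0, z), \qquad \text{LeakyReLU}(z) = \begin{cases} z & z > 0 \\ \alpha z & z \le 0 \end{cases}$$

$$\sigma(z) = \frac{1}{1+e^{-z}}, \qquad \tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}, \qquad \text{softmax}(\mathbf{z})_i = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$$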
Focus on Individual Neurons: How Each One Learns Features
$$\text{Select a specific neuron to see what pattern it has learned to detect}$$
Layer 1: Edge Detection (128 neurons). Each neuron in this layer has learned to detect a specific edge pattern. Selecting a neuron in the interactive view shows the pattern it responds to, its weight statistics, and its current activation, i.e. how strongly it fires for the current input.
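A first-layer neuron's learned pattern is simply its 784 incoming weights, which can be reshaped to 28×28 and rendered as an image. A minimal sketch, assuming a trained weight matrix `W1` of shape (128, 784) is available (the variable names and the random placeholder are illustrative assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt

# W1: trained first-layer weight matrix, shape (128, 784).
# A random placeholder stands in here; substitute the real trained weights.
W1 = np.random.randn(128, 784)

neuron = 42                           # hypothetical neuron index to inspect
pattern = W1[neuron].reshape(28, 28)  # 784 incoming weights as a 28x28 image

plt.imshow(pattern, cmap="RdBu")      # red = negative, blue = positive weights
plt.title(f"Neuron {neuron}: learned input pattern")
plt.colorbar()
plt.show()
```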
Mathematical Foundations: Backpropagation & Gradient Descent
$$\text{Forward Pass: } \mathbf{a}^{(l+1)} = f(W^{(l+1)} \mathbf{a}^{(l)} + \mathbf{b}^{(l+1)})$$
$$\text{Backward Pass: } \frac{\partial C}{\partial W^{(l)}} = \boldsymbol{\delta}^{(l)} (\mathbf{a}^{(l-1)})^T, \quad \frac{\partial C}{\partial \mathbf{b}^{(l)}} = \boldsymbol{\delta}^{(l)}$$
Complete Backpropagation Process
$$\text{1. Cost Function: } C = \frac{1}{2}\sum_{i=1}^{10}(y_i - a_i^{(L)})^2$$
$$\text{2. Output Layer Error: } \boldsymbol{\delta}^{(L)} = \nabla_a C \odot f'(\mathbf{z}^{(L)})$$
$$\text{3. Hidden Layer Error: } \boldsymbol{\delta}^{(l)} = ((W^{(l+1)})^T \boldsymbol{\delta}^{(l+1)}) \odot f'(\mathbf{z}^{(l)})$$
$$\text{4. Weight Update: } W^{(l)} \leftarrow W^{(l)} - \eta \frac{\partial C}{\partial W^{(l)}}$$
$$\text{5. Bias Update: } \mathbf{b}^{(l)} \leftarrow \mathbf{b}^{(l)} - \eta \frac{\partial C}{\partial \mathbf{b}^{(l)}}$$
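Steps 1–5 map directly to code. Below is a minimal sketch of one training iteration for the 784→128→64→32→10 network, using sigmoid activations on every layer to match the quadratic cost above; the layer sizes come from the architecture, while the initialization, random input, and one-hot target are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

sizes = [784, 128, 64, 32, 10]
rng = np.random.default_rng(0)
Ws = [rng.normal(0.0, 0.1, (n_out, n_in)) for n_in, n_out in zip(sizes, sizes[1:])]
bs = [np.zeros((n_out, 1)) for n_out in sizes[1:]]

x = rng.random((784, 1))            # assumed input: a flattened 28x28 drawing
y = np.zeros((10, 1)); y[3] = 1.0   # assumed one-hot target (digit 3)
eta = 0.01                          # learning rate

# Forward pass: cache pre-activations z and activations a for every layer.
a, acts, zs = x, [x], []
for W, b in zip(Ws, bs):
    z = W @ a + b
    zs.append(z)
    a = sigmoid(z)
    acts.append(a)

cost = 0.5 * np.sum((y - a) ** 2)             # step 1: quadratic cost

delta = (a - y) * sigmoid_prime(zs[-1])       # step 2: output layer error
for l in range(len(Ws) - 1, -1, -1):
    dW = delta @ acts[l].T                    # dC/dW^(l) = delta^(l) (a^(l-1))^T
    db = delta                                # dC/db^(l) = delta^(l)
    if l > 0:                                 # step 3: propagate error back
        delta = (Ws[l].T @ delta) * sigmoid_prime(zs[l - 1])
    Ws[l] -= eta * dW                         # step 4: weight update
    bs[l] -= eta * db                         # step 5: bias update

print(f"cost for this example: {cost:.4f}")
```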
Learning rate: η = 0.01
Training Progress (live readout; initial state shown): Cost = 0.500, Gradient Norm = 0.000, Iteration = 0, Accuracy = 0.0%
Numerical Example (single weight; values update live during training):
Forward: z = wx + b, a = σ(z)
Backward: δ = (a − y) · σ′(z), ∂C/∂w = δ · x, w_new = w − η · ∂C/∂w
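Plugging concrete numbers into these lines makes one update step tangible; in this sketch every value (x, w, b, y, η) is an assumed illustration, not data from the app:

```python
import math

x, w, b = 0.5, 0.8, 0.1    # assumed input, weight, bias
y, eta = 1.0, 0.5          # assumed target and learning rate

z = w * x + b                     # 0.5
a = 1.0 / (1.0 + math.exp(-z))    # sigma(0.5) ~= 0.6225
delta = (a - y) * a * (1.0 - a)   # (a - y) * sigma'(z) ~= -0.0887
dC_dw = delta * x                 # ~= -0.0444
w_new = w - eta * dC_dw           # ~= 0.8222

print(f"z={z:.4f}, a={a:.4f}, delta={delta:.4f}, w_new={w_new:.4f}")
```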
Cost Function Landscape & Optimization Path
Interactive Backpropagation Network with Live Values
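The two panels above correspond to a standard picture: a cost surface C(w₁, w₂) and the sequence of points gradient descent visits on it. A minimal sketch tracing such a path on an assumed quadratic bowl (the surface, step count, and starting point are illustrative, not the app's actual cost function):

```python
import numpy as np

# Assumed two-parameter cost surface: an elongated quadratic bowl.
def cost(w):
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)

def grad(w):
    return np.array([w[0], 10.0 * w[1]])

w = np.array([4.0, 2.0])    # assumed starting point
eta = 0.05
path = [w.copy()]
for _ in range(50):
    w -= eta * grad(w)      # gradient descent step: w <- w - eta * dC/dw
    path.append(w.copy())

print(f"start cost: {cost(path[0]):.3f}, final cost: {cost(path[-1]):.6f}")
# Plotting path over contours of cost() reproduces the "optimization path"
# picture shown in the landscape panel.
```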