# What is a Neuron?

## Inspired by the Brain
The human brain contains ~86 billion neurons. Each neuron:
- Receives signals from other neurons (inputs)
- Adds them up (weighted sum)
- Fires or doesn’t fire based on the total (activation)
Artificial neural networks mimic this structure.
## The Artificial Neuron (Perceptron)

An artificial neuron computes:

$$\text{output} = \text{activation}\left(\sum_i w_i x_i + b\right)$$

Where:
- $x_i$ = inputs (features)
- $w_i$ = weights (how much each input matters)
- $b$ = bias (offset)
- activation = a function that transforms $z = \sum_i w_i x_i + b$ into the output
```python
import numpy as np

def neuron(inputs, weights, bias, activation="relu"):
    # Step 1: Weighted sum
    z = np.dot(inputs, weights) + bias
    # Step 2: Apply activation
    if activation == "relu":
        return max(0, z)
    elif activation == "sigmoid":
        return 1 / (1 + np.exp(-z))
    elif activation == "tanh":
        return np.tanh(z)
    else:
        return z  # linear
```
```python
# Example: detect if a tumor is malignant
# Input features: [cell_size, cell_shape, clump_thickness]
inputs = np.array([0.8, 0.7, 0.9])
weights = np.array([0.5, 0.3, 0.2])
bias = -0.4

z = np.dot(inputs, weights) + bias
output = 1 / (1 + np.exp(-z))  # sigmoid

print(f"z = {z:.4f}")
print(f"Output (probability of malignant): {output:.4f}")
```
## Activation Functions
The activation function is what makes neural networks powerful. Without it, a stack of neurons is just linear regression.
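This claim can be checked directly: with no non-linearity between them, two stacked layers collapse into a single linear map. A minimal sketch (shapes and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)

# Two "layers" with no activation in between
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
two_layers = W2 @ (W1 @ x + b1) + b2

# ...are equivalent to one linear layer
W = W2 @ W1
b = W2 @ b1 + b2
one_layer = W @ x + b

print(np.allclose(two_layers, one_layer))  # True
```

No matter how many linear layers you stack, the result is still linear in `x` — the activation is what breaks this collapse.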
### ReLU (Rectified Linear Unit) — Most Common
```python
x = np.linspace(-3, 3, 100)
relu = np.maximum(0, x)
# Use for hidden layers — simple, fast, avoids vanishing gradients
```
- Zero for negative inputs, linear for positive
- Computationally very cheap
- Used in almost all modern deep networks
### Sigmoid
- Output in (0, 1) — perfect for binary classification output
- Prone to vanishing gradients (problem in deep networks)
- Use in the final layer for binary classification
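For symmetry with the ReLU snippet above, a short sketch of the sigmoid (sample points are illustrative) showing the flat tails where gradients vanish:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid(z))
# Near 0 for large negative z, exactly 0.5 at z = 0, near 1 for large
# positive z — the nearly flat regions at both ends are where the
# gradient vanishes.
```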
### Softmax (for multi-class output)

```python
def softmax(x):
    e_x = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e_x / e_x.sum()

logits = np.array([2.0, 1.0, 0.5])  # raw outputs for 3 classes
probs = softmax(logits)
print(probs)  # [0.629, 0.231, 0.140] — sum = 1.0
# → 62.9% probability for class 0, 23.1% for class 1
```
### Comparison
| Activation | Range | Used in |
|---|---|---|
| ReLU | [0, ∞) | Hidden layers |
| Sigmoid | (0, 1) | Binary output |
| Softmax | (0, 1), sum=1 | Multi-class output |
| Tanh | (-1, 1) | RNNs, some hidden layers |
| Linear | (-∞, ∞) | Regression output |
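The ranges in the table can be verified numerically (a quick sanity check, not part of the original lesson):

```python
import numpy as np

z = np.linspace(-5, 5, 1001)

relu = np.maximum(0, z)
sig = 1 / (1 + np.exp(-z))
tanh = np.tanh(z)

print(relu.min())            # ReLU never goes below 0
print(sig.min(), sig.max())  # sigmoid stays strictly inside (0, 1)
print(tanh.min(), tanh.max())  # tanh stays strictly inside (-1, 1)
```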
## Weights and Learning
Initially, weights are random. During training, the network adjusts them to reduce error:
```python
# Visualize how weight affects output
import matplotlib.pyplot as plt

x = np.linspace(-3, 3, 100)

plt.figure(figsize=(12, 4))
for w in [-2.0, -0.5, 0.5, 2.0]:
    z = w * x  # single input, no bias
    output = 1 / (1 + np.exp(-z))  # sigmoid output
    plt.plot(x, output, label=f"w={w}")
plt.xlabel("Input x")
plt.ylabel("Output (sigmoid)")
plt.title("Effect of Weight on Neuron Output")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
```
A larger positive weight makes the neuron more sensitive to that input. A negative weight makes the neuron suppress that input.
## From One Neuron to a Network
A single neuron can only learn linearly separable patterns — a neuron with a sigmoid activation is exactly logistic regression. The real power comes from connecting many neurons in layers:
```
Input Layer     Hidden Layer 1    Hidden Layer 2    Output Layer

x₁ ──────┐        ○ ○ ○ ○            ○ ○ ○               ○
x₂ ──────┼──→     ○ ○ ○ ○      →     ○ ○ ○      →   (prediction)
x₃ ──────┘        ○ ○ ○ ○            ○ ○ ○
```
Each neuron in each layer takes all outputs from the previous layer as inputs. This creates a deep neural network — the key idea behind “deep learning.”
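A minimal sketch of that wiring in NumPy — layer sizes match the diagram (4, then 3 neurons, then 1 output), but the weights here are random and illustrative, not trained:

```python
import numpy as np

rng = np.random.default_rng(42)

def layer(x, W, b):
    # Each neuron: weighted sum over ALL previous-layer outputs, then ReLU
    return np.maximum(0, W @ x + b)

x = np.array([0.8, 0.7, 0.9])                    # 3 input features

W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # hidden layer 1: 4 neurons
W2, b2 = rng.normal(size=(3, 4)), np.zeros(3)    # hidden layer 2: 3 neurons
W3, b3 = rng.normal(size=(1, 3)), np.zeros(1)    # output layer: 1 neuron

h1 = layer(x, W1, b1)
h2 = layer(h1, W2, b2)
out = 1 / (1 + np.exp(-(W3 @ h2 + b3)))          # sigmoid on the output
print(out.shape)  # (1,)
```

Each `W @ x` computes every neuron's weighted sum in that layer at once — this is why neural networks are, at their core, chains of matrix multiplications.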
A neuron computes z = 0.5×2 + 0.3×3 + (-0.4) = 1.5. After applying ReLU, what is the output?
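You can check your answer in code (rounding to sidestep floating-point noise):

```python
z = 0.5 * 2 + 0.3 * 3 + (-0.4)
output = max(0.0, z)  # ReLU passes positive values through unchanged
print(round(z, 4), round(output, 4))  # 1.5 1.5
```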