3  Operators

Tensors store data. Operators transform data. Let’s implement the essential ones.

3.1 The Goal

By the end of this chapter, this code will run:

celsius = Tensor([[0.0], [100.0]])
w = Tensor([[1.8]])
b = Tensor([32.0])

fahrenheit = celsius @ w.T + b  # This will work!
print(fahrenheit)
# Tensor([[32.0], [212.0]])

We need three operators:

  1. Transpose (w.T) — Already done in Chapter 2
  2. Matrix Multiplication (@) — New
  3. Addition (+) — New

3.2 Matrix Multiplication

The @ operator performs matrix multiplication:

class Tensor:
    # ... previous methods ...

    def __matmul__(self, other):
        """Matrix multiplication: self @ other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data @ other.data)

Note

Code Reference: See src/tensorweaver/operators/ for all operator implementations.

3.2.1 How Matrix Multiplication Works

For matrices A (m×n) and B (n×p), result C is (m×p):

\[C_{ij} = \sum_{k=1}^{n} A_{ik} \cdot B_{kj}\]
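The sum above can be spelled out with explicit loops. This is a sketch for intuition only (the real implementation delegates to NumPy's @, which is far faster); `naive_matmul` is a throwaway name, not part of TensorWeaver:

```python
import numpy as np

def naive_matmul(A, B):
    """Compute C[i, j] = sum over k of A[i, k] * B[k, j] with explicit loops."""
    m, n = A.shape
    n2, p = B.shape
    assert n == n2, "inner dimensions must match"
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
assert np.allclose(naive_matmul(A, B), A @ B)  # same result as NumPy's @
```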

# celsius: (4, 1) @ w.T: (1, 1) = result: (4, 1)
celsius = Tensor([[0.0], [20.0], [37.0], [100.0]])  # (4, 1)
w = Tensor([[1.8]])                                  # (1, 1)

result = celsius @ w.T  # (4, 1) @ (1, 1) = (4, 1)
print(result)
# Tensor([[0.0], [36.0], [66.6], [180.0]])

3.3 Addition

The + operator adds tensors element-wise:

class Tensor:
    # ... previous methods ...

    def __add__(self, other):
        """Element-wise addition: self + other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data + other.data)

    def __radd__(self, other):
        """Handle: number + Tensor"""
        return self.__add__(other)
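To see why __radd__ matters: when Python evaluates 32.0 + t, the float's own __add__ doesn't know about Tensor, so Python falls back to the right operand's __radd__. A minimal stand-in class (just the addition protocol from this section) demonstrates it:

```python
import numpy as np

class Tensor:
    """Minimal stand-in with only the addition protocol from this section."""
    def __init__(self, data):
        self.data = np.asarray(data)

    def __add__(self, other):
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data + other.data)

    def __radd__(self, other):
        # Called when the left operand (a plain number) can't handle Tensor.
        return self.__add__(other)

t = Tensor([1.0, 2.0])
print((32.0 + t).data)  # [33. 34.] — dispatched through __radd__
```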

3.3.1 Broadcasting

NumPy’s broadcasting lets us add tensors of different shapes:

# result: (4, 1) + b: (1,) = fahrenheit: (4, 1)
result = Tensor([[0.0], [36.0], [66.6], [180.0]])  # (4, 1)
b = Tensor([32.0])                                  # (1,)

fahrenheit = result + b  # Broadcasting: (4, 1) + (1,) = (4, 1)
print(fahrenheit)
# Tensor([[32.0], [68.0], [98.6], [212.0]])

Broadcasting rules:

  1. Align shapes from the right
  2. Dimensions match if equal or one is 1
  3. Missing dimensions treated as 1

(4, 1)     result
   (1,)    b (broadcast to match)
-------
(4, 1)     output
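These rules can be checked directly in NumPy. np.broadcast_shapes (available since NumPy 1.20) applies the same right-alignment logic and raises an error when two dimensions are neither equal nor 1:

```python
import numpy as np

# Right-aligned comparison: (4, 1) vs (1,) -> (4, 1)
print(np.broadcast_shapes((4, 1), (1,)))

# A mismatch (2 vs 3, neither is 1) is rejected:
try:
    np.broadcast_shapes((4, 2), (3,))
except ValueError as e:
    print("incompatible:", e)
```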

3.4 More Arithmetic Operators

Let’s add the complete set:

class Tensor:
    # ... previous methods ...

    def __sub__(self, other):
        """Subtraction: self - other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data - other.data)

    def __rsub__(self, other):
        """Handle: number - Tensor"""
        return Tensor(other) - self

    def __mul__(self, other):
        """Element-wise multiplication: self * other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data * other.data)

    def __rmul__(self, other):
        """Handle: number * Tensor"""
        return self.__mul__(other)

    def __truediv__(self, other):
        """Division: self / other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data / other.data)

    def __rtruediv__(self, other):
        """Handle: number / Tensor"""
        return Tensor(other) / self

    def __pow__(self, power):
        """Power: self ** power"""
        if isinstance(power, Tensor):
            power = power.data
        return Tensor(self.data ** power)

    def __neg__(self):
        """Negation: -self"""
        return Tensor(-self.data)
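With subtraction and division in place, we can already invert the temperature formula: C = (F − 32) / 1.8. The sketch below uses a minimal stand-in class with only the two operators it needs, so it runs on its own:

```python
import numpy as np

class Tensor:
    """Minimal stand-in with only the operators used below."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def __sub__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        return Tensor(self.data - other.data)

    def __truediv__(self, other):
        other = other if isinstance(other, Tensor) else Tensor(other)
        return Tensor(self.data / other.data)

# Invert the conversion: C = (F - 32) / 1.8
f = Tensor([[32.0], [98.6], [212.0]])
c = (f - 32.0) / 1.8
print(c.data.flatten())  # ≈ [0. 37. 100.]
```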

3.5 Reduction Operations

Operations that reduce dimensions:

class Tensor:
    # ... previous methods ...

    def sum(self, axis=None, keepdims=False):
        """Sum elements along axis."""
        return Tensor(self.data.sum(axis=axis, keepdims=keepdims))

    def mean(self, axis=None, keepdims=False):
        """Mean of elements along axis."""
        return Tensor(self.data.mean(axis=axis, keepdims=keepdims))

    def max(self, axis=None, keepdims=False):
        """Maximum along axis."""
        return Tensor(self.data.max(axis=axis, keepdims=keepdims))

    def min(self, axis=None, keepdims=False):
        """Minimum along axis."""
        return Tensor(self.data.min(axis=axis, keepdims=keepdims))

Usage:

t = Tensor([[1.0, 2.0],
            [3.0, 4.0]])

print(f"Sum all: {t.sum()}")           # Tensor(10.0)
print(f"Sum axis 0: {t.sum(axis=0)}")  # Tensor([4.0, 6.0])
print(f"Sum axis 1: {t.sum(axis=1)}")  # Tensor([3.0, 7.0])
print(f"Mean: {t.mean()}")             # Tensor(2.5)

3.6 Complete Temperature Conversion

Now we can do the full computation:

from tensorweaver import Tensor

# Input: Celsius temperatures
celsius = Tensor([[0.0],
                  [20.0],
                  [37.0],
                  [100.0]])

# Model parameters
w = Tensor([[1.8]])
b = Tensor([32.0])

# Forward pass: F = C × 1.8 + 32
fahrenheit = celsius @ w.T + b

print("Celsius -> Fahrenheit:")
for c, f in zip(celsius.data.flatten(), fahrenheit.data.flatten()):
    print(f"  {c:.1f}°C = {f:.1f}°F")

Output:

Celsius -> Fahrenheit:
  0.0°C = 32.0°F    ✓ Freezing point
  20.0°C = 68.0°F   ✓ Room temperature
  37.0°C = 98.6°F   ✓ Body temperature
  100.0°C = 212.0°F ✓ Boiling point

3.7 Comparison Operators

Useful for masking and conditions:

class Tensor:
    # ... previous methods ...

    def __gt__(self, other):
        """Greater than: self > other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data > other.data)

    def __lt__(self, other):
        """Less than: self < other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data < other.data)

    def __eq__(self, other):
        """Equal: self == other"""
        if not isinstance(other, Tensor):
            other = Tensor(other)
        return Tensor(self.data == other.data)

Note

Defining __eq__ removes Python's default __hash__, so Tensor objects become unhashable (they can't be dict keys or set members). This matches the behavior of NumPy arrays.
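A typical use of these boolean results is masking. Here is a sketch in plain NumPy (what Tensor's comparisons wrap) that implements ReLU, zeroing out negative values, via a comparison mask:

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])

mask = x > 0                     # boolean array: [False False False True True]
relu = np.where(mask, x, 0.0)    # keep x where mask is True, else 0
print(relu)                      # [0.  0.  0.  1.5 3. ]
```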

3.8 Part I Complete!

Tip

Milestone: You’ve built a working forward pass!

fahrenheit = celsius @ w.T + b  # Works!

We can now:

  • Create tensors of any shape
  • Perform arithmetic operations (+, -, *, /, **)
  • Do matrix multiplication (@)
  • Reduce dimensions (sum, mean)

But we hardcoded w=1.8 and b=32. What if we didn’t know these values?

3.9 What’s Next

In Part II, we’ll learn how to find the right values of w and b automatically:

  1. Define a loss function (how wrong are we?)
  2. Build a computational graph (track operations)
  3. Implement backpropagation (compute gradients)
  4. Update parameters (learn!)
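As a preview of step 1: one common loss function is mean squared error, built entirely from the operators of this chapter (subtraction, power, mean). A sketch in NumPy, with made-up predictions for illustration:

```python
import numpy as np

def mse(pred, target):
    """Mean squared error: average of (pred - target)^2."""
    return ((pred - target) ** 2).mean()

# Hypothetical imperfect predictions vs. the true Fahrenheit values:
pred   = np.array([30.0, 70.0, 95.0, 210.0])
target = np.array([32.0, 68.0, 98.6, 212.0])
print(mse(pred, target))  # a single number measuring "how wrong"
```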

The journey from “forward only” to “learning” begins.