Basic Components of Deep Learning Frameworks
This guide focuses on the core programming concepts needed to understand and work with TensorWeaver, particularly the computational graph system that powers modern deep learning frameworks.
Computational Graphs
What is a Computational Graph?
A computational graph represents mathematical operations as a directed graph:
- Nodes represent operations (addition, multiplication, etc.) or variables
- Edges represent the flow of data between operations
- The graph structure enables automatic differentiation
In machine learning, a computational graph represents the flow of data through a neural network; in other words, it encodes the network's computational process. Essentially, the daily work of a machine learning engineer is to design, implement, and optimize computational graphs.
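To make this concrete, here is a minimal sketch of such a graph in plain Python. The `Node` class and its field names are illustrative assumptions for this guide, not TensorWeaver's actual API:

```python
# Minimal sketch of a computational graph in plain Python (illustrative
# only; `Node` and its fields are not TensorWeaver's actual API).

class Node:
    def __init__(self, op, inputs=(), value=None):
        self.op = op          # operation name, or "var" for a leaf variable
        self.inputs = inputs  # edges: the nodes whose outputs feed this one
        self.value = value    # concrete value, set up front for variables

# Graph for y = (a + b) * c
a = Node("var", value=2.0)
b = Node("var", value=3.0)
c = Node("var", value=4.0)
s = Node("add", inputs=(a, b))  # operation node: addition
y = Node("mul", inputs=(s, c))  # operation node: multiplication
```

Each `Node` is a vertex in the graph, and its `inputs` tuple forms the edges along which data flows.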
Forward and Backward Computation
Forward Pass
The forward pass computes the output of the neural network for a given input using the model's current (not yet perfect) state. Learning then compares that output with the expected output and computes the loss. Once the model is finalized, the forward pass is also called inference, where it predicts outputs for new inputs.
The steps of the forward pass are (see the sketch after this list):
- Computation flows from inputs to outputs
- Each node performs its operation and passes results forward
- Results are cached for use in the backward pass (if in training mode)
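Continuing the illustrative sketch above, a forward pass can evaluate nodes recursively from inputs to outputs, caching each result on its node. This `forward` function is a sketch of the idea, not TensorWeaver's real implementation:

```python
# Forward pass over the sketch above: evaluate from inputs to outputs
# and cache every intermediate result on its node for the backward pass.

def forward(node):
    if node.value is not None:                    # leaf variable or cached result
        return node.value
    args = [forward(inp) for inp in node.inputs]  # evaluate inputs first
    if node.op == "add":
        node.value = args[0] + args[1]
    elif node.op == "mul":
        node.value = args[0] * args[1]
    return node.value

print(forward(y))  # (2.0 + 3.0) * 4.0 = 20.0
```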
Backward Pass
Modern deep learning uses a technique called backpropagation to compute the gradients of the loss with respect to the parameters of the model. This requires the gradient of the loss with respect to the output of each node in the graph, and the backward pass computes these gradients efficiently (a sketch follows the list below):
- Gradients flow backwards through the graph
- Uses the chain rule (as explained in Mathematical Background)
- Each node computes its gradient with respect to its inputs
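Continuing the same sketch, the backward pass walks the graph in reverse, applying the chain rule at each node. The `backward` function below is an illustration of the idea rather than TensorWeaver's implementation, and assumes `forward` has already been run so node values are cached:

```python
# Backward pass over the sketch above: gradients flow from the output
# back to the leaves. `grad` is d(output)/d(this node), accumulated in
# case a node feeds several consumers.

def backward(node, grad=1.0):
    node.grad = getattr(node, "grad", 0.0) + grad
    if node.op == "add":
        # d(x + w)/dx = 1 and d(x + w)/dw = 1, so pass grad through unchanged
        backward(node.inputs[0], grad)
        backward(node.inputs[1], grad)
    elif node.op == "mul":
        # d(x * w)/dx = w and d(x * w)/dw = x; the chain rule multiplies by grad
        x, w = node.inputs
        backward(x, grad * w.value)
        backward(w, grad * x.value)

backward(y)
print(a.grad, b.grad, c.grad)  # 4.0 4.0 5.0: dy/da = c, dy/db = c, dy/dc = a + b
```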
Optimization
Optimization is the process of using the gradients to update the parameters of the model. Different optimization algorithms use different strategies to update the parameters; a minimal example follows.
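As one concrete strategy, plain stochastic gradient descent (SGD) moves each parameter a small step against its gradient. The `sgd_step` helper below is a hypothetical sketch built on the nodes above; real optimizers such as SGD with momentum or Adam refine this basic update:

```python
# One step of plain SGD over the leaf variables from the sketches above:
# move each parameter against its gradient, then clear the gradient.

def sgd_step(params, lr=0.01):
    for p in params:
        p.value -= lr * p.grad   # theta <- theta - lr * d(loss)/d(theta)
        p.grad = 0.0             # reset so the next backward pass starts fresh

sgd_step([a, b, c])
print(a.value, b.value, c.value)  # 1.96 2.96 3.95
```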