Chapter 2 · Lesson 2.3
Chain rule on a graph
An input can affect an output through more than one path. The chain rule tells us how those paths combine into one total effect.
Lesson 2.1 gave us a graph of dependencies. Lesson 2.2 gave each operation a local rule. This lesson puts those two pieces together.
The chain rule on a graph says:
along one path, multiply the local sensitivities
across multiple paths, add the path contributions
One input, one output, more than one path
Start with the small graph we already know:
a = Value(2.0)
b = Value(-3.0)
c = a * b
d = a + b
e = c + d
The full expression is:
$$e = (a \cdot b) + (a + b)$$
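To make the forward pass concrete, here is a minimal sketch that evaluates the graph at the current values. The `Value` class below is only a stand-in and an assumption; the class built in the earlier lessons may carry more state (for example, links to parent nodes).

```python
# Stand-in for the lesson's Value class (an assumption): it only stores a
# number and supports + and *, which is all the forward pass needs.
class Value:
    def __init__(self, data):
        self.data = data

    def __add__(self, other):
        return Value(self.data + other.data)

    def __mul__(self, other):
        return Value(self.data * other.data)


a = Value(2.0)
b = Value(-3.0)
c = a * b   # -6.0
d = a + b   # -1.0
e = c + d   # -7.0

print(c.data, d.data, e.data)  # -6.0 -1.0 -7.0
```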
Now ask a bigger question than Lesson 2.2 did: how does a affect e? Lesson 2.2 only asked about direct neighbors, such as how a affects c.
Now a and e are not directly connected. There are two paths from a to e, so one local derivative is no longer enough. We need the total effect of a on e:
a → c → e
a → d → e
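The two paths can also be found mechanically. The sketch below assumes the graph is stored as a plain dictionary mapping each node to the nodes it feeds into; the names `children` and `find_paths` are hypothetical and not part of the lesson's code.

```python
# Hypothetical adjacency map: each node lists the nodes it feeds into.
children = {
    "a": ["c", "d"],
    "b": ["c", "d"],
    "c": ["e"],
    "d": ["e"],
    "e": [],
}


def find_paths(graph, start, end):
    """Return every path from start to end as a list of node names."""
    if start == end:
        return [[end]]
    paths = []
    for nxt in graph[start]:
        # Extend each path that continues from the next node to the end.
        for tail in find_paths(graph, nxt, end):
            paths.append([start] + tail)
    return paths


paths_a_to_e = find_paths(children, "a", "e")
print(paths_a_to_e)  # [['a', 'c', 'e'], ['a', 'd', 'e']]
```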
First, do one path
Take the path a → c → e. We already know the local pieces:
$$\frac{\partial c}{\partial a} = b, \qquad \frac{\partial e}{\partial c} = 1$$
Multiply them to get the contribution from this path:
$$\frac{\partial e}{\partial c} \cdot \frac{\partial c}{\partial a} = 1 \cdot b = b$$
At our current values, b is -3, so this path contributes -3. This is the chain rule in its simplest graph form: if influence travels through a path, multiply the local sensitivities along that path.
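One way to sanity-check a single path's contribution is to nudge a and let the change flow only through c, while holding d at its original value. This is a sketch with plain floats rather than the lesson's `Value` objects, and the step size `h` is an arbitrary choice.

```python
a, b = 2.0, -3.0
h = 1e-6  # small nudge to a; the step size is an arbitrary choice

# Original forward pass.
c = a * b
d = a + b
e = c + d

# Nudge a, but let the change flow only through c; d keeps its old value.
c_nudged = (a + h) * b
e_via_c = c_nudged + d

path_c_estimate = (e_via_c - e) / h
print(path_c_estimate)  # close to b = -3.0
```

The estimate agrees with the product of the local sensitivities along a → c → e, up to floating-point rounding.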
Then, do the other path
Now take the second path a → d → e. Its local pieces are:
$$\frac{\partial d}{\partial a} = 1, \qquad \frac{\partial e}{\partial d} = 1$$
Its contribution is:
$$\frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial a} = 1 \cdot 1 = 1$$
We now have both path contributions from a to e:
Path contributions
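through c: $\frac{\partial e}{\partial c} \cdot \frac{\partial c}{\partial a} = 1 \cdot b = -3$
through d: $\frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial a} = 1 \cdot 1 = +1$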
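The same two numbers can be reproduced by storing each local derivative on its edge and multiplying along a path. The `local` dictionary and the `path_contribution` helper are hypothetical names used only for this sketch.

```python
import math

a, b = 2.0, -3.0

# Local derivatives keyed by (from_node, to_node), taken from the lesson:
# c = a * b gives dc/da = b; d = a + b gives dd/da = 1;
# e = c + d gives de/dc = 1 and de/dd = 1.
local = {
    ("a", "c"): b,
    ("a", "d"): 1.0,
    ("c", "e"): 1.0,
    ("d", "e"): 1.0,
}


def path_contribution(path):
    """Multiply the local derivatives along consecutive nodes of a path."""
    edges = zip(path, path[1:])
    return math.prod(local[edge] for edge in edges)


through_c = path_contribution(["a", "c", "e"])
through_d = path_contribution(["a", "d", "e"])
print(through_c, through_d)  # -3.0 1.0
```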
Total effect means add the path contributions
Both paths are real. Both are ways a changes e. So the total derivative is the sum of both contributions.
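As a rough check of the sum, the sketch below adds the two path contributions and compares the result with a finite-difference estimate in which the nudge to a flows through both paths at once. It uses plain floats; `e_of` is a hypothetical helper and the step size `h` is an arbitrary choice.

```python
def e_of(a, b):
    """The lesson's full expression: e = (a * b) + (a + b)."""
    return (a * b) + (a + b)


a, b = 2.0, -3.0

# Path contributions from the lesson: b through c, 1 through d.
through_c = b
through_d = 1.0
total = through_c + through_d
print(total)  # -2.0

# Numerical check: nudge a and let the change flow through both paths.
h = 1e-6  # arbitrary small step
estimate = (e_of(a + h, b) - e_of(a, b)) / h
print(abs(estimate - total) < 1e-6)  # True
```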