DifferentiationInterface/docs/src/dev/math.md
Lines changed: 15 additions & 15 deletions
@@ -10,34 +10,34 @@ It is inspired by
Consider a mathematical function $f(x, c, s) = y$ where
-- $x \in \mathcal{X}$ is the active argument (the one being differentiated)
-- $c \in \mathcal{C}$ is a constant argument (corresponds to [`Constant`](@ref) contexts)
-- $s \in \mathcal{S}$ is a scratch argument (corresponds to [`Cache`](@ref) contexts)
-- $y \in \mathcal{Y}$ is the output
+- $x \in \mathcal{X}$ is the active argument (the one being differentiated)
+- $c \in \mathcal{C}$ is a constant argument (corresponds to [`Constant`](@ref) contexts)
+- $s \in \mathcal{S}$ is a scratch argument (corresponds to [`Cache`](@ref) contexts)
+- $y \in \mathcal{Y}$ is the output
In Julia code, some of the input arguments might be mutated, while the output may be written to as well.
Therefore, the proper model is a function $\phi(x_0, c_0, s_0, y_0) = (x_1, c_1, s_1, y_1)$ where $a_0$ is the state of argument $a$ before $f$ is run, while $a_1$ is its state after $f$ is run.
DI makes the following hypotheses on the implementation of $f$ (aka the behavior of $\phi$):
-1. The active argument $x$ is not mutated, so $x_1 = x_0$
-2. The constant argument $c$ is not mutated, so $c_1 = c_0$
-3. The initial value of the scratch argument $s_0$ does not matter
-4. The initial value of the output $y_0$ does not matter
+1. The active argument $x$ is not mutated, so $x_1 = x_0$.
+2. The constant argument $c$ is not mutated, so $c_1 = c_0$.
+3. The initial value of the scratch argument $s_0$ does not matter. It does not affect any of the states $x_1$, $c_1$, $s_1$, $y_1$.
+4. The initial value of the output $y_0$ does not matter. It does not affect any of the states $x_1$, $c_1$, $s_1$, $y_1$.
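The four hypotheses can be checked by hand on a toy function. The sketch below is illustrative only: `f!` and the variables around it are hypothetical names that do not come from DifferentiationInterface, and it uses nothing beyond Base Julia. It runs the same function twice with different initial scratch and output contents and checks that the final states agree.

```julia
# Hypothetical in-place function computing y = c .* x.^2, with `s` as scratch space.
function f!(y, x, c, s)
    s .= x .^ 2     # scratch is overwritten before it is read (hypothesis 3)
    y .= c .* s     # output is overwritten and never read (hypothesis 4)
    return nothing  # x and c are never written to (hypotheses 1 and 2)
end

x, c = [1.0, 2.0], [3.0, 4.0]

# Two runs with different initial states s_0 and y_0.
y_a, s_a = zeros(2), zeros(2)
y_b, s_b = fill(NaN, 2), fill(42.0, 2)
f!(y_a, x, c, s_a)
f!(y_b, x, c, s_b)

@assert x == [1.0, 2.0] && c == [3.0, 4.0]  # x_1 = x_0 and c_1 = c_0
@assert s_a == s_b == [1.0, 4.0]            # s_1 does not depend on s_0 or y_0
@assert y_a == y_b == [3.0, 16.0]           # y_1 does not depend on s_0 or y_0
```

A function that accumulated into `y` (for instance `y .+= c .* s`) would break hypothesis 4, since $y_1$ would then depend on $y_0$.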
## Forward mode
We want to compute a Jacobian-Vector Product (JVP) $\dot{y} = \left(\frac{\partial f}{\partial x}\right) \dot{x}$ where $\dot{x} \in \mathcal{X}$ is an input tangent.
To do that, we run our AD backend on $\phi$ with input tangents $(\dot{x}_0, \dot{c}_0, \dot{s}_0, \dot{y}_0)$ and obtain $(\dot{x}_1, \dot{c}_1, \dot{s}_1, \dot{y}_1)$.
Thanks to our hypotheses 3 and 4 on the function's implementation, $\frac{\partial y_1}{\partial s_0} = 0$ and $\frac{\partial y_1}{\partial y_0} = 0$, so we are left with:
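Written out with the chain rule applied to $\phi$, the output tangent before and after using hypotheses 3 and 4 would be as follows. This is a reconstruction from the definitions above, not a quotation of the file:

```math
\begin{aligned}
\dot{y}_1 &= \frac{\partial y_1}{\partial x_0} \dot{x}_0 + \frac{\partial y_1}{\partial c_0} \dot{c}_0 + \frac{\partial y_1}{\partial s_0} \dot{s}_0 + \frac{\partial y_1}{\partial y_0} \dot{y}_0 \\
&= \frac{\partial y_1}{\partial x_0} \dot{x}_0 + \frac{\partial y_1}{\partial c_0} \dot{c}_0
\end{aligned}
```

Under this reading, $\dot{y}_1$ equals the desired JVP whenever $\dot{c}_0 = 0$. Similarly, hypothesis 2 gives $c_1 = c_0$, so $\frac{\partial c_1}{\partial c_0} = I$ while the other partial derivatives of $c_1$ vanish, which yields $\dot{c}_1 = \dot{c}_0$.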
The tangent of $c$ will always be preserved by differentiation.
@@ -47,14 +47,14 @@ The tangent of $c$ will always be preserved by differentiation.
We want to compute a Vector-Jacobian Product (VJP) $\bar{x} = \left(\frac{\partial f}{\partial x}\right)^* \bar{y}$ where $\bar{y} \in \mathcal{Y}$ is an output sensitivity.
To do that, we run our AD backend on $\phi$ with output sensitivities $(\bar{x}_1, \bar{c}_1, \bar{s}_1, \bar{y}_1)$ and obtain $(\bar{x}_0, \bar{c}_0, \bar{s}_0, \bar{y}_0)$.
-Thanks to our hypotheses 1 and 2 on the function's implementation, $\frac{\partial x_1}{\partial x_0} = I$ and $\frac{\partial c_1}{\partial x_0} = 0$, so we are left with:
+Thanks to our hypotheses 1 and 2 on the function's implementation, $\frac{\partial x_1}{\partial x_0} = I$ and $\frac{\partial c_1}{\partial x_0} = \frac{\partial c_0}{\partial x_0} = 0$, so we are left with:
-Thus, as long as $\bar{x}_1 = 0$ and $\bar{s}_1 = 0$, the input sensitivity $\bar{x}_0$ contains the correct VJP.
-Let us now look at $\bar{s}_0$ with the help of hypothesis 3:
+Thus, as long as we set $\bar{x}_1 = 0$ and $\bar{s}_1 = 0$, the input sensitivity $\bar{x}_0$ contains the correct VJP.
+Let us now look at $\bar{s}_0$ with the help of hypothesis 3, which tells us that $\frac{\partial x_1}{\partial s_0} = 0$, $\frac{\partial c_1}{\partial s_0} = 0$, $\frac{\partial s_1}{\partial s_0} = 0$, and $\frac{\partial y_1}{\partial s_0} = 0$:
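Both reverse-mode steps can be written out with the chain rule applied to $\phi$. The equations below are a reconstruction from the definitions and hypotheses above rather than text taken from the file. For the active argument, hypotheses 1 and 2 simplify the input sensitivity as follows:

```math
\begin{aligned}
\bar{x}_0 &= \left(\frac{\partial x_1}{\partial x_0}\right)^* \bar{x}_1 + \left(\frac{\partial c_1}{\partial x_0}\right)^* \bar{c}_1 + \left(\frac{\partial s_1}{\partial x_0}\right)^* \bar{s}_1 + \left(\frac{\partial y_1}{\partial x_0}\right)^* \bar{y}_1 \\
&= \bar{x}_1 + \left(\frac{\partial s_1}{\partial x_0}\right)^* \bar{s}_1 + \left(\frac{\partial y_1}{\partial x_0}\right)^* \bar{y}_1
\end{aligned}
```

Setting $\bar{x}_1 = 0$ and $\bar{s}_1 = 0$ leaves only the last term, which is the VJP. For the scratch argument, every partial derivative with respect to $s_0$ vanishes under hypothesis 3, so:

```math
\bar{s}_0 = \left(\frac{\partial x_1}{\partial s_0}\right)^* \bar{x}_1 + \left(\frac{\partial c_1}{\partial s_0}\right)^* \bar{c}_1 + \left(\frac{\partial s_1}{\partial s_0}\right)^* \bar{s}_1 + \left(\frac{\partial y_1}{\partial s_0}\right)^* \bar{y}_1 = 0
```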