
Commit bde8219

Minor tweaks to "Mathematical model" docs (#922)
1 parent bbc39fd commit bde8219

1 file changed

Lines changed: 15 additions & 15 deletions

DifferentiationInterface/docs/src/dev/math.md

@@ -10,34 +10,34 @@ It is inspired by

 Consider a mathematical function $f(x, c, s) = y$ where

-- $x \in \mathcal{X}$ is the active argument (the one being differentiated)
-- $c \in \mathcal{C}$ is a constant argument (corresponds to [`Constant`](@ref) contexts)
-- $s \in \mathcal{S}$ is a scratch argument (corresponds to [`Cache`](@ref) contexts)
-- $y \in \mathcal{Y}$ is the output
+- $x \in \mathcal{X}$ is the active argument (the one being differentiated)
+- $c \in \mathcal{C}$ is a constant argument (corresponds to [`Constant`](@ref) contexts)
+- $s \in \mathcal{S}$ is a scratch argument (corresponds to [`Cache`](@ref) contexts)
+- $y \in \mathcal{Y}$ is the output

 In Julia code, some of the input arguments might be mutated, while the output may be written to as well.
 Therefore, the proper model is a function $\phi(x_0, c_0, s_0, y_0) = (x_1, c_1, s_1, y_1)$ where $a_0$ is the state of argument $a$ before $f$ is run, while $a_1$ is its state after $f$ is run.

 DI makes the following hypotheses on the implementation of $f$ (aka the behavior of $\phi$):

-1. The active argument $x$ is not mutated, so $x_1 = x_0$
-2. The constant argument $c$ is not mutated, so $c_1 = c_0$
-3. The initial value of the scratch argument $s_0$ does not matter
-4. The initial value of the output $y_0$ does not matter
+1. The active argument $x$ is not mutated, so $x_1 = x_0$.
+2. The constant argument $c$ is not mutated, so $c_1 = c_0$.
+3. The initial value of the scratch argument $s_0$ does not matter. It does not affect any of the states $x_1$, $c_1$, $s_1$, $y_1$.
+4. The initial value of the output $y_0$ does not matter. It does not affect any of the states $x_1$, $c_1$, $s_1$, $y_1$.

 ## Forward mode

 We want to compute a Jacobian-Vector Product (JVP) $\dot{y} = \left(\frac{\partial f}{\partial x}\right) \dot{x}$ where $\dot{x} \in \mathcal{X}$ is an input tangent.

 To do that, we run our AD backend on $\phi$ with input tangents $(\dot{x}_0, \dot{c}_0, \dot{s}_0, \dot{y}_0)$ and obtain $(\dot{x}_1, \dot{c}_1, \dot{s}_1, \dot{y}_1)$.
-The interesting value is
+The value of interest is
 $$\dot{y}_1 = \frac{\partial y_1}{\partial x_0} \dot{x}_0 + \frac{\partial y_1}{\partial c_0} \dot{c}_0 + \frac{\partial y_1}{\partial s_0} \dot{s}_0 + \frac{\partial y_1}{\partial y_0} \dot{y}_0$$

 Thanks to our hypotheses 3 and 4 on the function's implementation, $\frac{\partial y_1}{\partial s_0} = 0$ and $\frac{\partial y_1}{\partial y_0} = 0$, so we are left with:
 $$\dot{y}_1 = \frac{\partial y_1}{\partial x_0} \dot{x}_0 + \frac{\partial y_1}{\partial c_0} \dot{c}_0$$

-Thus, as long as $\dot{c}_0 = 0$, the output tangent $\dot{y}_1$ contains the correct JVP.
-Let us now look at $\dot{s}_1$ with the help of hypothesis 2:
+Thus, as long as we set $\dot{c}_0 = 0$, the output tangent $\dot{y}_1$ contains the correct JVP.
+Let us now look at $\dot{c}_1$ with the help of hypothesis 2:
 $$\dot{c}_1 = \frac{\partial c_1}{\partial x_0} \dot{x}_0 + \frac{\partial c_1}{\partial c_0} \dot{c}_0 + \frac{\partial c_1}{\partial s_0} \dot{s}_0 + \frac{\partial c_1}{\partial y_0} \dot{y}_0 = \dot{c}_0$$

 The tangent of $c$ will always be preserved by differentiation.
@@ -47,14 +47,14 @@ The tangent of $c$ will always be preserved by differentiation.
 We want to compute a Vector-Jacobian Product (VJP) $\bar{x} = \left(\frac{\partial f}{\partial x}\right)^* \bar{y}$ where $\bar{y} \in \mathcal{Y}$ is an output sensitivity.

 To do that, we run our AD backend on $\phi$ with output sensitivities $(\bar{x}_1, \bar{c}_1, \bar{s}_1, \bar{y}_1)$ and obtain $(\bar{x}_0, \bar{c}_0, \bar{s}_0, \bar{y}_0)$.
-The interesting value is
+The value of interest is
 $$\bar{x}_0 = \left(\frac{\partial x_1}{\partial x_0}\right)^* \bar{x}_1 + \left(\frac{\partial c_1}{\partial x_0}\right)^* \bar{c}_1 + \left(\frac{\partial s_1}{\partial x_0}\right)^* \bar{s}_1 + \left(\frac{\partial y_1}{\partial x_0}\right)^* \bar{y}_1$$

-Thanks to our hypotheses 1 and 2 on the function's implementation, $\frac{\partial x_1}{\partial x_0} = I$ and $\frac{\partial c_1}{\partial x_0} = 0$, so we are left with:
+Thanks to our hypotheses 1 and 2 on the function's implementation, $\frac{\partial x_1}{\partial x_0} = I$ and $\frac{\partial c_1}{\partial x_0} = \frac{\partial c_0}{\partial x_0} = 0$, so we are left with:
 $$\bar{x}_0 = \bar{x}_1 + \left(\frac{\partial s_1}{\partial x_0}\right)^* \bar{s}_1 + \left(\frac{\partial y_1}{\partial x_0}\right)^* \bar{y}_1$$

-Thus, as long as $\bar{x}_1 = 0$ and $\bar{s}_1 = 0$, the input sensitivity $\bar{x}_0$ contains the correct VJP.
-Let us now look at $\bar{s}_0$ with the help of hypothesis 3:
+Thus, as long as we set $\bar{x}_1 = 0$ and $\bar{s}_1 = 0$, the input sensitivity $\bar{x}_0$ contains the correct VJP.
+Let us now look at $\bar{s}_0$ with the help of hypothesis 3, which tells us that $\frac{\partial x_1}{\partial s_0} = 0$, $\frac{\partial c_1}{\partial s_0} = 0$, $\frac{\partial s_1}{\partial s_0} = 0$, and $\frac{\partial y_1}{\partial s_0} = 0$:

 $$\bar{s}_0 = \left(\frac{\partial x_1}{\partial s_0}\right)^* \bar{x}_1 + \left(\frac{\partial c_1}{\partial s_0}\right)^* \bar{c}_1 + \left(\frac{\partial s_1}{\partial s_0}\right)^* \bar{s}_1 + \left(\frac{\partial y_1}{\partial s_0}\right)^* \bar{y}_1 = 0$$
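
As a sanity check on the forward- and reverse-mode arguments in the text changed above, here is a minimal Python sketch. It is not DI code: the toy `phi`, the helper `jacobian_columns`, and every numeric value are assumptions made up for illustration. The sketch stores the states $(x, c, s, y)$ in one flat list, approximates the Jacobian of $\phi$ by central finite differences, and then forms the output tangent $\dot{y}_1$ and the input sensitivity $\bar{x}_0$ with deliberately arbitrary values for $\dot{s}_0$, $\dot{y}_0$, and $\bar{c}_1$.

```python
# Toy check of the model. phi, jacobian_columns, and all numbers are
# illustrative assumptions, not part of DifferentiationInterface.

def phi(v):
    """Map the flat state (x0[2], c0[1], s0[2], y0[2]) to (x1, c1, s1, y1)."""
    xa, xb, c = v[0], v[1], v[2]   # s0 = v[3:5] and y0 = v[5:7] are never read
    s = [xa * xb, xb ** 2]         # scratch is fully overwritten (hypothesis 3)
    y = [c * s[0], xa + c * s[1]]  # output is fully overwritten (hypothesis 4)
    return [xa, xb, c, s[0], s[1], y[0], y[1]]  # x, c unchanged (hypotheses 1, 2)


def jacobian_columns(fun, v, h=1e-6):
    """Central finite differences: cols[j][i] approximates d out_i / d in_j."""
    cols = []
    for j in range(len(v)):
        plus, minus = list(v), list(v)
        plus[j] += h
        minus[j] -= h
        cols.append([(p - m) / (2 * h) for p, m in zip(fun(plus), fun(minus))])
    return cols


state0 = [1.5, -2.0, 3.0, 9.9, -9.9, 7.7, -7.7]  # x0, c0, junk s0, junk y0
cols = jacobian_columns(phi, state0)
n = len(state0)

# Forward mode: tangents (xdot0, cdot0 = 0, arbitrary sdot0, arbitrary ydot0).
tangent0 = [1.0, 0.5, 0.0, 4.0, -4.0, 8.0, -8.0]
tangent1 = [sum(cols[j][i] * tangent0[j] for j in range(n)) for i in range(n)]
cdot1, ydot1 = tangent1[2], tangent1[5:7]

# Reverse mode: sensitivities (xbar1 = 0, arbitrary cbar1, sbar1 = 0, ybar1).
sens1 = [0.0, 0.0, 5.0, 0.0, 0.0, 2.0, -1.0]
sens0 = [sum(cols[j][i] * sens1[i] for i in range(n)) for j in range(n)]
xbar0, sbar0 = sens0[0:2], sens0[3:5]

# Closed-form Jacobian of this toy f with respect to x, for comparison:
# [[c*xb, c*xa], [1, 2*c*xb]] = [[-6, 4.5], [1, -12]] at the chosen point.
print("ydot1:", ydot1)  # should approximate the JVP with xdot0 = (1.0, 0.5)
print("cdot1:", cdot1)  # should equal cdot0
print("xbar0:", xbar0)  # should approximate the VJP with ybar1 = (2.0, -1.0)
print("sbar0:", sbar0)  # should vanish
```

In this toy setting, the junk values in $s_0$, $y_0$, $\dot{s}_0$, $\dot{y}_0$, and $\bar{c}_1$ have no effect on $\dot{y}_1$ or $\bar{x}_0$, which is what hypotheses 1 to 4 predict. Finite differences are used only to keep the sketch dependency-free; an AD backend would compute these products directly.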
