README.md (1 addition, 1 deletion)
@@ -63,4 +63,4 @@ This open source book is made available under the Creative Commons Attribution-S
The sample and reference code within this open source book is made available under a modified MIT license. See the [LICENSE-SAMPLECODE](LICENSE-SAMPLECODE) file.
-[Chinese version](https://github.com/d2l-ai/d2l-zh) | [Discuss and report issues](https://discuss.d2l.ai/) | [Other Information](INFO.md)
+[Chinese version](https://github.com/d2l-ai/d2l-zh) | [Discuss and report issues](https://discuss.d2l.ai/) | [Code of conduct](CODE_OF_CONDUCT.md) | [Other Information](INFO.md)
chapter_appendix-mathematics-for-deep-learning/integral-calculus.md (18 additions, 18 deletions)
@@ -1,7 +1,7 @@
# Integral Calculus
:label:`sec_integral_calculus`
Differentiation only makes up half of the content of a traditional calculus education. The other pillar, integration, starts out seeming a rather disjoint question: "What is the area underneath this curve?" While seemingly unrelated, integration is tightly intertwined with differentiation via what is known as the *fundamental theorem of calculus*.
At the level of machine learning we discuss in this book, we will not need a deep understanding of integration. However, we will provide a brief introduction to lay the groundwork for any further applications we will encounter later on.
@@ -187,7 +187,7 @@ We will instead take a different approach. We will work intuitively with the no
## The Fundamental Theorem of Calculus
To dive deeper into the theory of integration, let us introduce a function
$$
F(x) = \int_0^x f(y) \; dy.
$$
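As a quick numerical sanity check (an illustrative sketch not found in the original text, with $f(y) = y^2$ chosen arbitrarily), we can approximate $F(x) = \int_0^x f(y)\;dy$ with a Riemann sum and verify that differentiating $F$ recovers $f$, as the fundamental theorem of calculus predicts:

```python
import numpy as np

# Approximate F(x) = ∫_0^x f(y) dy for f(y) = y**2 with a left Riemann sum,
# then check that the finite-difference derivative of F recovers f.
f = lambda y: y ** 2
eps = 1e-3
x = np.arange(0, 2, eps)
F = np.cumsum(f(x)) * eps   # F[i] ≈ ∫_0^{x[i]} f(y) dy
dF = np.diff(F) / eps       # finite-difference derivative of F

# The fundamental theorem of calculus says dF should match f.
print(np.max(np.abs(dF - f(x[1:]))))  # close to zero
print(F[-1])                          # ≈ ∫_0^2 y² dy = 8/3
```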
@@ -201,10 +201,10 @@ $$
This is a mathematical encoding of the fact that we can measure the area out to the far endpoint and then subtract off the area to the near endpoint, as indicated in :numref:`fig_area-subtract`.
:label:`fig_area-subtract`
Thus, we can figure out what the integral over any interval is by figuring out what $F(x)$ is.
To do so, let us consider an experiment. As we often do in calculus, let us imagine what happens when we shift the value by a tiny bit. From the comment above, we know that
@@ -259,7 +259,7 @@ First, suppose that we have a function which is itself an integral:
$$
F(x) = \int_0^x f(y) \; dy.
$$
Let us suppose that we want to know how this function looks when we compose it with another to obtain $F(u(x))$. By the chain rule, we know
For a more intuitive derivation, consider what happens when we take an integral of $f(u(x))$ between $x$ and $x+\epsilon$. For a small $\epsilon$, this integral is approximately $\epsilon f(u(x))$, the area of the associated rectangle. Now, let us compare this with the integral of $f(y)$ from $u(x)$ to $u(x+\epsilon)$. We know that $u(x+\epsilon) \approx u(x) + \epsilon \frac{du}{dx}(x)$, so the area of this rectangle is approximately $\epsilon \frac{du}{dx}(x)f(u(x))$. Thus, to make the areas of these two rectangles agree, we need to multiply the first one by $\frac{du}{dx}(x)$, as illustrated in :numref:`fig_rect-transform`.
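This rectangle-matching argument is exactly the change of variables formula $\int_a^b f(u(x))\frac{du}{dx}(x)\;dx = \int_{u(a)}^{u(b)} f(y)\;dy$. We can check it numerically with a hypothetical example not taken from the text, choosing $f(y)=\cos(y)$ and $u(x)=x^2$ on $[0,1]$, where both sides should equal $\sin(1)$:

```python
import numpy as np

# Verify  ∫_a^b f(u(x)) u'(x) dx  =  ∫_{u(a)}^{u(b)} f(y) dy
# for f(y) = cos(y), u(x) = x**2, on [a, b] = [0, 1]. Exact value: sin(1).
f = np.cos
u = lambda x: x ** 2
du = lambda x: 2 * x

n = 100_001
x = np.linspace(0, 1, n)
dx = x[1] - x[0]
lhs = np.sum(f(u(x)) * du(x)) * dx   # integrate in the x variable

y = np.linspace(u(0), u(1), n)
dy = y[1] - y[0]
rhs = np.sum(f(y)) * dy              # integrate in the y = u(x) variable

print(lhs, rhs, np.sin(1))           # all three ≈ 0.8415
```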
Consider the figure above where we have split the function into $\epsilon \times \epsilon$ squares which we will index with integer coordinates $i, j$. In this case, our integral is approximately
@@ -430,16 +430,16 @@ $$
$$
\sum_{j} \epsilon \left(\sum_{i} \epsilon f(\epsilon i, \epsilon j)\right).
$$
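The claim that we may sum over $i$ first or $j$ first and get the same answer is easy to check numerically. Here is a small sketch (the function $f(x, y) = x y^2$ is an assumed example, not from the text), discretizing the integral over $[0,1]^2$, whose exact value is $\frac{1}{2}\cdot\frac{1}{3} = \frac{1}{6}$:

```python
import numpy as np

# Discretize ∫∫ f over [0,1]^2 as Σ_j ε (Σ_i ε f(εi, εj)) and confirm that
# the order of summation does not matter, for f(x, y) = x * y**2.
eps = 1e-3
grid = np.arange(0, 1, eps)
x, y = np.meshgrid(grid, grid, indexing="ij")  # x varies along axis 0
f = x * y ** 2

i_first = np.sum(eps * np.sum(eps * f, axis=0))  # sum over i, then j
j_first = np.sum(eps * np.sum(eps * f, axis=1))  # sum over j, then i
print(i_first, j_first)                          # both ≈ 1/6
```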
:label:`fig_sum-order`
The sum on the inside is precisely the discretization of the integral
As with single variables in :eqref:`eq_change_var`, the ability to change variables inside a higher dimensional integral is a key tool. Let us summarize the result without derivation.
We need a function that reparameterizes our domain of integration. We can take this to be $\phi : \mathbb{R}^n \rightarrow \mathbb{R}^n$, that is, any function which takes in $n$ real variables and returns another $n$. To keep the expressions clean, we will assume that $\phi$ is *injective*, which is to say it never folds over itself ($\phi(\mathbf{x}) = \phi(\mathbf{y}) \implies \mathbf{x} = \mathbf{y}$).
Looking closely, we see that this is similar to the single variable chain rule :eqref:`eq_change_var`, except we have replaced the term $\frac{du}{dx}(x)$ with $\left|\det(D\phi(\mathbf{x}))\right|$. Let us see how to interpret this term. Recall that the $\frac{du}{dx}(x)$ term existed to say how much we stretched our $x$-axis by applying $u$. The same process in higher dimensions is to determine how much we stretch the area (or volume, or hyper-volume) of a little square (or little *hyper-cube*) by applying $\boldsymbol{\phi}$. If $\boldsymbol{\phi}$ were multiplication by a matrix, then we know the determinant already gives the answer.
With some work, one can show that the *Jacobian* provides the best approximation to a multivariable function $\boldsymbol{\phi}$ at a point by a matrix in the same way we could approximate by lines or planes with derivatives and gradients. Thus the determinant of the Jacobian exactly mirrors the scaling factor we identified in one dimension.
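As a concrete sketch of this scaling interpretation (an assumed example, not from the text), take the polar-coordinate map $\boldsymbol{\phi}(r, \theta) = (r\cos\theta, r\sin\theta)$, whose Jacobian determinant is the familiar factor $r$. We can recover it with a finite-difference Jacobian:

```python
import numpy as np

# For φ(r, θ) = (r cos θ, r sin θ), the Jacobian is
# [[cos θ, -r sin θ], [sin θ, r cos θ]] with determinant r: a tiny ε×ε square
# at (r, θ) maps to a region of area ≈ r · ε². We check |det Dφ| numerically.
def phi(v):
    r, t = v
    return np.array([r * np.cos(t), r * np.sin(t)])

def jacobian(fn, v, h=1e-6):
    # Numerical Jacobian via central differences, one input coordinate at a time.
    n = len(v)
    J = np.zeros((n, n))
    for k in range(n):
        e = np.zeros(n)
        e[k] = h
        J[:, k] = (fn(v + e) - fn(v - e)) / (2 * h)
    return J

v = np.array([2.0, 0.7])  # the point (r, θ) = (2, 0.7)
scale = abs(np.linalg.det(jacobian(phi, v)))
print(scale)              # ≈ r = 2, the known polar-coordinate Jacobian
```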