Update README.md

wgyhhhh · web-flow · commit e6034ea49264 · 2025-10-10T23:22:18.000+08:00
diff --git a/README.md b/README.md
@@ -1,3 +1,19 @@
 《强化学习中的数学原理》-个人笔记与思考总结
 
 正在持续更新: https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/
+
+## 📖 内容导航
+
+| 章节 | 关键内容 | 状态 |
+| --- | --- | --- |
+| [前言](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Preface1/) | 本笔记的缘起、背景及阅读建议 | ✅ |
+| [第一章 基本概念](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-1/intro/) | 强化学习的基本概念 | ✅ |
+| [第二章 状态值与贝尔曼方程](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-2/intro/) | 回报、状态值、Bellman方程 | ✅ |
+| [第三章 最优状态值与贝尔曼最优方程](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-3/intro/) | 最优状态值、最优策略、Bellman最优方程 | ✅ |
+| [第四章 值迭代与策略迭代](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-4/intro/) | 值迭代算法、策略迭代算法、阶段策略迭代算法 | ✅ |
+| [第五章 蒙特卡罗方法](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-5/intro/) | MC Basic、MC Exploring Starts、MC-Greedy | ✅ |
+| [第六章 随机近似算法](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-6/intro/) | Robbins-Monro算法、Dvoretzky定理、随机梯度下降 | ✅ |
+| [第七章 时序差分算法](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-7/intro/) | Sarsa、n步Sarsa、Q-learning、 Off-policy、On-policy| ✅ |
+| [第八章 值函数方法](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-8/intro/) |基于值函数的Sarsa、基于值函数的Q-learning | ✅ |
+| [第九章 策略梯度方法](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-9/intro/) | 策略梯度、REINFORCE | ✅ |
+| [第十章 演员-评论家算法](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-10/intro/) | 优势演员-评论家、异策略演员-评论家、确定性演员-评论家 | ✅ |