*Mathematical Foundations of Reinforcement Learning*: Personal Notes and Reflections

Continuously updated: https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/

## 📖 Contents

| Chapter | Key topics | Status |
| --- | --- | --- |
| [Preface](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Preface1/) | Origin and background of these notes, and reading suggestions | ✅ |
| [Chapter 1: Basic Concepts](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-1/intro/) | Basic concepts of reinforcement learning | ✅ |
| [Chapter 2: State Values and the Bellman Equation](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-2/intro/) | Return, state value, Bellman equation | ✅ |
| [Chapter 3: Optimal State Values and the Bellman Optimality Equation](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-3/intro/) | Optimal state values, optimal policies, Bellman optimality equation | ✅ |
| [Chapter 4: Value Iteration and Policy Iteration](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-4/intro/) | Value iteration, policy iteration, truncated policy iteration | ✅ |
| [Chapter 5: Monte Carlo Methods](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-5/intro/) | MC Basic, MC Exploring Starts, MC ε-Greedy | ✅ |
| [Chapter 6: Stochastic Approximation](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-6/intro/) | Robbins-Monro algorithm, Dvoretzky's theorem, stochastic gradient descent | ✅ |
| [Chapter 7: Temporal-Difference Methods](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-7/intro/) | Sarsa, n-step Sarsa, Q-learning, off-policy vs. on-policy learning | ✅ |
| [Chapter 8: Value Function Methods](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-8/intro/) | Sarsa with value function approximation, Q-learning with value function approximation | ✅ |
| [Chapter 9: Policy Gradient Methods](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-9/intro/) | Policy gradient, REINFORCE | ✅ |
| [Chapter 10: Actor-Critic Methods](https://wgyhhhh.github.io/Mathematical-Foundations-of-Reinforcement-Learning-Notes/Chapter-10/intro/) | Advantage actor-critic, off-policy actor-critic, deterministic actor-critic | ✅ |