Write a short note on Temporal Difference Learning.
1 Answer

Solution:

Temporal-Difference (TD) Learning:

  • combines ideas from Dynamic Programming (DP) and Monte Carlo (MC) methods.

  • updates estimates based in part on other learned estimates, i.e., it bootstraps (like DP methods); it does not require a model of the environment and learns directly from raw experience (like MC methods).

  • constitutes a basis for reinforcement learning.

  • Convergence to $\mathrm{V}^\pi$ is guaranteed (asymptotically, as in MC methods); the TD(0) update sketched below illustrates the basic mechanism.
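
For concreteness, the simplest TD method, tabular TD(0), updates the state-value estimate after every transition $(S_t, R_{t+1}, S_{t+1})$:

$$V(S_t) \leftarrow V(S_t) + \alpha \left[ R_{t+1} + \gamma V(S_{t+1}) - V(S_t) \right]$$

where $\alpha$ is the step size, $\gamma$ is the discount factor, and the bracketed term is the TD error.

Below is a minimal sketch of TD(0) prediction on a small random-walk task; the environment, constants, and the function name `td0_prediction` are illustrative assumptions, not part of the original answer.

```python
import random

# Illustrative setup: a 5-state random walk. Episodes start in the middle
# state, terminate off either end, and only the right terminal gives +1.
N_STATES = 5    # non-terminal states 0..4
ALPHA = 0.1     # step size
GAMMA = 1.0     # discount factor (undiscounted episodic task)

def td0_prediction(num_episodes=1000):
    V = [0.0] * N_STATES                        # value estimates for non-terminal states
    for _ in range(num_episodes):
        s = N_STATES // 2                       # start in the middle state
        while True:
            s_next = s + random.choice([-1, 1]) # equiprobable random policy
            if s_next < 0:                      # left terminal: reward 0
                r, v_next, done = 0.0, 0.0, True
            elif s_next >= N_STATES:            # right terminal: reward +1
                r, v_next, done = 1.0, 0.0, True
            else:
                r, v_next, done = 0.0, V[s_next], False
            # TD(0) update: bootstrap on the current estimate of the next state
            V[s] += ALPHA * (r + GAMMA * v_next - V[s])
            if done:
                break
            s = s_next
    return V

if __name__ == "__main__":
    print(td0_prediction())  # approaches the true values [1/6, 2/6, 3/6, 4/6, 5/6]
```

Note how the update uses the current estimate $V(S_{t+1})$ (bootstrapping, as in DP) but needs only sampled transitions rather than a model (as in MC).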
