diff --git a/source-code/Pseudocode/Value-Iteration/Value-Iteration.png b/source-code/Pseudocode/Value-Iteration/Value-Iteration.png
index f0736dd..24ee7b0 100644
Binary files a/source-code/Pseudocode/Value-Iteration/Value-Iteration.png and b/source-code/Pseudocode/Value-Iteration/Value-Iteration.png differ
diff --git a/source-code/Pseudocode/Value-Iteration/Value-Iteration.tex b/source-code/Pseudocode/Value-Iteration/Value-Iteration.tex
index c0b5895..047232f 100644
--- a/source-code/Pseudocode/Value-Iteration/Value-Iteration.tex
+++ b/source-code/Pseudocode/Value-Iteration/Value-Iteration.tex
@@ -23,7 +23,7 @@
  \Statex Actions $\mathcal{A} = \{1, \dots, n_a\},\qquad A: \mathcal{X} \Rightarrow \mathcal{A}$
  \Statex Cost function $g: \mathcal{X} \times \mathcal{A} \rightarrow \mathbb{R}$
  \Statex Transition probabilities $f$
- \Statex Learning rate $\alpha \in [0, 1]$, typically $\alpha = 0.1$
+ \Statex Discounting factor $\alpha \in (0, 1)$, typically $\alpha = 0.9$
  \Procedure{ValueIteration}{$\mathcal{X}$, $A$, $g$, $f$, $\alpha$}
  \State Initialize $J, J': \mathcal{X} \rightarrow \mathbb{R}_0^+$ arbitrarily
  \While{$J$ is not converged}