Reinforcement learning function approximation algorithm based on linear average
【Abstract】 A reinforcement learning algorithm based on the minimum linear average is proposed to address the non-convergence of function approximation in reinforcement learning over continuous state spaces. The algorithm builds on the gradient descent method and, following the contraction mapping principle, adopts the linear average as the performance measure for value function approximation, which turns the iterative value function update into a process that converges to a fixed point. The algorithm is verified on Mountain Car, a standard reinforcement learning benchmark problem. The simulation results indicate that the algorithm is effective and feasible and that it converges quickly to a stable value.
【Key words】 automatic control technology; reinforcement learning; linear average; function approximation; gradient descent method
- 【Source】 Journal of Jilin University (Engineering and Technology Edition) (吉林大学学报(工学版)), 2008, Issue 06
- 【Classification No.】 TP301.6
- 【Cited by】 3
- 【Downloads】 174
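
For readers unfamiliar with the benchmark named in the abstract, the following is a minimal sketch of the Mountain Car dynamics together with one gradient-descent update of a linear value estimate. The abstract does not give the linear-average criterion, so the sketch uses the ordinary semi-gradient Sarsa update as a stand-in; it is not the paper's algorithm. The constants follow the common textbook formulation of Mountain Car, and the function names, the one-hot grid features, and the step size are assumptions made for illustration only.

```python
import math

# Common textbook Mountain Car constants (assumed; not taken from the paper).
X_MIN, X_MAX = -1.2, 0.5
V_MIN, V_MAX = -0.07, 0.07
ACTIONS = (-1, 0, 1)  # full reverse, coast, full forward


def step(x, v, a):
    """Advance the car one time step; returns (x, v, reward, done)."""
    v = v + 0.001 * a - 0.0025 * math.cos(3 * x)
    v = min(max(v, V_MIN), V_MAX)   # clip velocity to its bounds
    x = x + v
    if x < X_MIN:                   # inelastic left wall: stop the car
        x, v = X_MIN, 0.0
    done = x >= X_MAX               # goal is the right-hand hilltop
    return x, v, -1.0, done


def features(x, v, a, n=8):
    """Hypothetical one-hot grid features over (position, velocity, action)."""
    i = min(int((x - X_MIN) / (X_MAX - X_MIN) * n), n - 1)
    j = min(int((v - V_MIN) / (V_MAX - V_MIN) * n), n - 1)
    k = ACTIONS.index(a)
    phi = [0.0] * (n * n * len(ACTIONS))
    phi[(i * n + j) * len(ACTIONS) + k] = 1.0
    return phi


def q_value(w, phi):
    """Linear value estimate: inner product of weights and features."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def sarsa_update(w, phi, reward, phi_next, done, alpha=0.1, gamma=1.0):
    """One semi-gradient Sarsa step: w <- w + alpha * td_error * grad_w Q.

    For a linear estimate, grad_w Q is simply the feature vector phi.
    """
    target = reward if done else reward + gamma * q_value(w, phi_next)
    td_error = target - q_value(w, phi)
    return [wi + alpha * td_error * pi for wi, pi in zip(w, phi)]
```

Updates of this plain form are not guaranteed to converge in general when combined with function approximation, which is the non-convergence problem the abstract refers to; the paper's contribution is to replace the performance measure so that the iteration becomes a contraction toward a fixed point.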