Search results for: stationary points

Number of results: 319338

Journal: Annales de l'Institut Henri Poincaré C, Analyse non linéaire, 2014

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence, 2021

Policy-based reinforcement learning (RL) can be considered as maximization of its objective. However, due to the inherent non-concavity of the objective, convergence of the policy gradient method to a first-order stationary point (FOSP) cannot guarantee a maximal point. A FOSP can be a minimal point or even a saddle point, which is undesirable for RL. It has been found that if all saddle points are strict, second-order stationary points (SOSP) are exactly ...

Journal: International Journal of Mathematics and Mathematical Sciences, 2001
