Markov Decision Process


A Markov decision process (MDP) is a discrete-time stochastic control process characterized by a set of states, a set of actions, and transition probability matrices that depend on the action chosen in a given state. MDPs are useful for studying a wide range of optimization problems whose solutions are found via dynamic programming and reinforcement learning. (Wikipedia: Markov decision process)
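The dynamic-programming solution mentioned above can be sketched with value iteration on a toy MDP. The two states, two actions, transition probabilities, rewards, and discount factor below are invented purely for illustration; they are not from any source.

```python
# Minimal sketch: value iteration (a dynamic-programming method) on a
# hypothetical 2-state, 2-action MDP. All numbers below are made up.

# P[a][s][t] = probability of moving from state s to state t under action a
P = [
    [[0.8, 0.2],   # action 0
     [0.1, 0.9]],
    [[0.5, 0.5],   # action 1
     [0.6, 0.4]],
]
# R[a][s] = expected immediate reward for taking action a in state s
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]
gamma = 0.9  # discount factor


def value_iteration(P, R, gamma, tol=1e-8):
    """Iterate the Bellman optimality backup until the values converge."""
    n_states = len(P[0])
    n_actions = len(P)
    V = [0.0] * n_states
    while True:
        V_new = [
            max(
                R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        if max(abs(V_new[s] - V[s]) for s in range(n_states)) < tol:
            return V_new
        V = V_new


V = value_iteration(P, R, gamma)
# Greedy policy: the best action in each state under the converged values
policy = [
    max(
        range(len(P)),
        key=lambda a: R[a][s]
        + gamma * sum(P[a][s][t] * V[t] for t in range(len(V))),
    )
    for s in range(len(V))
]
print(V, policy)
```

Because the Bellman backup is a contraction with factor `gamma`, the loop is guaranteed to converge; the returned values satisfy the Bellman optimality equation up to the tolerance.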

References

Bellman, R. E. (1957). Dynamic Programming. Princeton University Press, Princeton, NJ.

Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley.

Site:

MDP Toolbox for MATLAB - a tutorial and MATLAB toolbox for working with MDPs.