Multiple-Environment Markov Decision Processes: Efficient Analysis and Applications
ICAPS 2020
- K. Chatterjee, M. Chmelík, D. Karkhanis, P. Novotný, A. Royer
October 27-30th, 2020
Introducing MEMDPs
[Figure: example transition diagram; edge probabilities 0.8, 0.4, 0.9, 0.5, 0.5, 0.25]
[1] Multiple-Environment Markov Decision Processes, J.-F. Raskin and O. Sankur, 2014
[Figure: book recommendation example starting from state s0; node labels: Fantasy book, History book, (Fantasy, Fantasy) book, (Fantasy, Sci-fi) book, (Fantasy, History) book, (History, Fantasy) book, (History, History) book, ...; edge probabilities 0.25, 0.75, 0.5, 0.7, 0.2, 0.3, 0.7, 0.4, 0.6, 0.9, 0.01, 0.09, 0.4, 0.6]
[Figure: transition diagrams over states s0, s1, s2, s3; edge probabilities 0.25, 0.75, 0.5, 0.8, 0.5, 0.15, 0.9, 0.4]
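To make the MEMDP idea concrete: an MEMDP is a finite set of MDPs that share states and actions but differ in their transition probabilities [1], and the agent does not know which environment it is in. The following is a minimal sketch, assuming two hypothetical environments over four states and a single action. The transition values and the helper name `update_belief` are illustrative assumptions, not taken from the slides. It shows how a belief over environments can be updated by Bayes' rule after one observed transition.

```python
import numpy as np

# Hypothetical MEMDP: 2 environments, 1 action, 4 states (s0..s3).
# T[e, a, s, s_next] = probability of moving s -> s_next under action a
# in environment e. All numbers are illustrative assumptions and are
# not read off the slide's diagram.
T = np.array([
    [[[0.00, 0.25, 0.75, 0.00],
      [0.00, 0.00, 0.50, 0.50],
      [0.00, 0.00, 0.00, 1.00],
      [0.00, 0.00, 0.00, 1.00]]],
    [[[0.00, 0.80, 0.20, 0.00],
      [0.00, 0.00, 0.10, 0.90],
      [0.00, 0.00, 0.00, 1.00],
      [0.00, 0.00, 0.00, 1.00]]],
])


def update_belief(belief, T, s, a, s_next):
    """Bayes update of the belief over environments after seeing s -a-> s_next."""
    likelihood = T[:, a, s, s_next]      # one likelihood per environment
    posterior = belief * likelihood      # unnormalized posterior
    total = posterior.sum()
    if total == 0.0:
        raise ValueError("transition has zero probability under the current belief")
    return posterior / total


# Start from a uniform belief and observe the transition s0 -> s1.
belief = np.array([0.5, 0.5])
belief = update_belief(belief, T, s=0, a=0, s_next=1)
```

Because the environment never changes during an episode, only the belief over environments needs to be tracked, not a belief over states. In this toy instance the observed transition is more likely under the second environment, so the belief shifts toward it.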
[7] Exact and approximate algorithms for partially observable Markov decision processes, Cassandra, 1998
[3] Point-based value iteration: An anytime algorithm for POMDPs, Pineau et al, IJCAI 2003
[4] Monte-Carlo Planning in Large POMDPs, Silver and Veness, NeurIPS 2010
[5] An MDP-based Recommender System, Shani et al, JMLR 2005
Results (synthetic)
Methods: MDP, SPBVI, POMCP, POMCP-ex, PAMCP, PAMCP-ex
Accuracy: 0.12 ± 0.03, 0.77 ± 0.07, 0.68 ± 0.24, 0.75 ± 0.08, 0.96 ± 0.04, 0.85 ± 0.30, 0.94 ± 0.06
Runtime: 5 h 30 min, OOM, 9 min 36 s, 14 s, 14 s, 36 s
*: Environments are generated in a greedy manner, using perplexity as a metric
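The footnote names perplexity as the metric but does not spell out how it is computed. A minimal sketch, assuming the standard definition (the exponential of the average negative log-likelihood that a model assigns to the observed items); the function name `perplexity` and its input format are hypothetical, and the greedy generation procedure itself is not reproduced here.

```python
import math


def perplexity(probs):
    """Perplexity of a sequence, given the probability the model
    assigned to each item that was actually observed."""
    if not probs:
        raise ValueError("need at least one probability")
    if any(p <= 0.0 or p > 1.0 for p in probs):
        raise ValueError("probabilities must lie in (0, 1]")
    # Average negative log-likelihood, then exponentiate.
    avg_neg_log = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(avg_neg_log)


# A model that is uniform over 4 choices vs. a more confident model.
uniform = perplexity([0.25, 0.25, 0.25, 0.25])
confident = perplexity([0.9, 0.8, 0.9])
```

Lower perplexity means the model explains the observed sequence better, which is what makes it usable as a selection criterion.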