SLIDE 19 Experimental results
[Figure: eight plots comparing POMCP and D2NG-POMCP]
- (a), (c), (e), (g): Avg. Discounted Return vs. Number of Iterations (log scale) in RS[7, 8], RS[11, 11], RS[15, 15], and PocMan
- (b), (d), (f), (h): Avg. Discounted Return vs. Avg. Time Per Action (Seconds, log scale) in RS[7, 8], RS[11, 11], RS[15, 15], and PocMan
Figure 4: Performance of D2NG-POMCP in RockSample and PocMan
- A. Bai, F. Wu, Z. Zhang, and X. Chen
Thompson Sampling based Monte-Carlo Planning in POMDPs 19 / 22