

SLIDE 1

Incremental preference elicitation for SRMP models: Application for autonomous drones

Arwa Khannoussi, Alexandru-Liviu Olteanu, Catherine Dezan, Jean-Philippe Diguet, Christophe Labreuche, Jacques Petit-Frère, Patrick Meyer

DA2PL’2018

November 23, 2018


SLIDE 2

Outline

1. Context & Research questions
2. Simple Ranking Method using Reference Profiles (SRMP)
3. Heuristics for incremental learning of SRMP model
4. Experimental Results
5. Conclusion


SLIDE 3

Drone Context

1. Mission
  • Military or civilian
  • Specified by a set of waypoints and objectives

2. Decision Making
  • Make autonomous decisions
  • Take into account multiple criteria to achieve the mission's objectives

3. Confidence
  • Guarantee a level of trust in the drone's behavior
  • The drone's decisions should be consistent with the priorities of the operator


SLIDE 4

Research questions

1. Which MCDA model to integrate into the drone?
2. How to learn the parameters of the MCDA model?


SLIDE 5

RQ1: Which model?

Real-world constraints:
  • Decisions relate to high-level actions (land, loiter, skip a waypoint, ...) → choose one alternative among all possible ones
  • Guarantee a high level of trust in the drone's decision → integrate the preference model onboard the drone
  • The preference model and its consequences are presented to the operator in order to be validated
  • Heterogeneous evaluation scales of the criteria: quantitative (energy, ...) and qualitative (risk, ...)


SLIDE 6

RQ1: Which model?

Proposal:
  • Ranking or choosing in the outranking paradigm
  • Simple Ranking Method using Reference Profiles (SRMP) [Rolland 2013]
  • The ranking can be explained through a series of rules, understandable by the drone operator


SLIDE 7

RQ2: How to learn the parameters of the model?

Real-world constraints:
  • The cognitive effort of the operator should be minimized
  • The operator is not an expert in the decision model and is probably not able to set the parameters of the model "directly"
  • However, the operator is able to make a preferential choice between pairs of actions (in a simulated environment, with a given context)
  • Not all decision actions are possible (there is an existing database of possible pairs of actions, in a given context)


SLIDE 8

RQ2: How to learn the parameters of the model?

Proposal:
  • Learn the SRMP model from pairwise comparisons of alternatives (indirect elicitation)
  • Elicit the SRMP model incrementally
  • Confront the operator with pairs of possible actions selected from an existing database of possible pairs


SLIDE 9

Outline

1. Context & Research questions
2. Simple Ranking Method using Reference Profiles (SRMP)
3. Heuristics for incremental learning of SRMP model
4. Experimental Results
5. Conclusion


SLIDE 10

Simple Ranking Method using Reference Profiles (SRMP)

Preference parameters of the decision maker:

  • $k$ reference profiles $p^1, \ldots, p^h, \ldots, p^k$
  • a lexicographic order $\sigma$ on the reference profiles
  • criteria weights $w_1, \ldots, w_m$, with $w_j \geq 0$ and $\sum_{j \in M} w_j = 1$

Alternatives are compared indirectly via the reference profiles: $a \succ_{p^h} b$ iff $a$ outranks $p^h$ more strongly than $b$. This yields a pre-order of the alternatives.

[Figure: alternatives $a$ and $b$ plotted against reference profiles $p^1$ and $p^2$ on three criteria with weights $w_1$, $w_2$, $w_3$]
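To make the comparison rule concrete, here is a minimal Python sketch under the usual reading of [Rolland 2013]: the outranking strength of an alternative against a profile is the total weight of the criteria on which it is at least as good as that profile, and alternatives are compared lexicographically along $\sigma$. All names (`outranking`, `srmp_compare`) and the toy data are illustrative, not taken from the presentation.

```python
# Minimal sketch of an SRMP comparison (assumptions: evaluations are
# numeric and higher is better on every criterion).

def outranking(a, profile, weights):
    """Total weight of the criteria on which alternative `a` is
    at least as good as the reference profile."""
    return sum(w for x, p, w in zip(a, profile, weights) if x >= p)

def srmp_compare(a, b, profiles, sigma, weights):
    """Compare a and b through the profiles taken in the lexicographic
    order sigma; return 1 if a > b, -1 if b > a, 0 if indifferent."""
    for h in sigma:                       # sigma: indices into profiles
        sa = outranking(a, profiles[h], weights)
        sb = outranking(b, profiles[h], weights)
        if sa != sb:                      # first profile that discriminates
            return 1 if sa > sb else -1
    return 0                              # no profile discriminates them

# Toy example: 3 criteria, 2 profiles, weights summing to 1.
weights  = [0.5, 0.3, 0.2]
profiles = [[0.4, 0.4, 0.4], [0.7, 0.7, 0.7]]
sigma    = [0, 1]                         # apply p^1 first, then p^2
a, b = [0.8, 0.5, 0.3], [0.3, 0.6, 0.9]
print(srmp_compare(a, b, profiles, sigma, weights))   # -> 1 (a preferred)
```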


SLIDE 11

SRMP: Parameter elicitation

  • Preference elicitation: tune the parameters of the SRMP model
  • In our practical context: indirect elicitation
    - Determine the parameters from holistic judgements of the decision maker on pairs of alternatives (aPb / aIb)
    - Mixed Integer Program (MIP) [Olteanu et al. 2018]
    - Inputs: a set of pairwise comparisons of alternatives
    - Outputs: values of the parameters of the SRMP model

Our contribution:
  • Incremental learning to limit the number of learning examples
  • Heuristics to select the learning examples
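For concreteness, the input to the MIP is just a set of holistic judgements; a natural encoding (purely illustrative, not the paper's data format) is a list of triples:

```python
# Illustrative encoding of the MIP's input: holistic judgements on
# pairs of alternatives, each evaluated on m = 3 criteria here.
comparisons = [
    ([0.8, 0.5, 0.3], [0.3, 0.6, 0.9], 'P'),   # a P b: a strictly preferred
    ([0.4, 0.4, 0.7], [0.5, 0.3, 0.7], 'I'),   # a I b: indifference
]
# The MIP returns values for the SRMP parameters: the k reference
# profiles, the lexicographic order sigma, and the criteria weights.
```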


SLIDE 12

Outline

1. Context & Research questions
2. Simple Ranking Method using Reference Profiles (SRMP)
3. Heuristics for incremental learning of SRMP model
4. Experimental Results
5. Conclusion


SLIDE 13

Incremental learning of the SRMP model

[Flow diagram: a database of real pairs of alternatives feeds the pair selection heuristic; the selected pair is shown to the decision maker, whose binary comparisons (elicitation) feed the model inference MIP, which produces an updated SRMP model]


SLIDE 14

Incremental learning of the SRMP model

[Same flow diagram as slide 13, annotated with the options below; a sketch of the loop follows this list]

Model inference (MIP), three configurations:
  • First feasible model
  • Model closest to the previous one
  • Model centered inside the search space

Pair selection (heuristic), five heuristics:
  • Random pair
  • Pair of most similar alternatives
  • Pair of most dissimilar alternatives
  • Pair close to a previous model profile
  • Pair that requires the largest number of profiles to be discriminated by the previous model (indifference, k profiles, k−1 profiles, ..., 1 profile)
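The loop on these two slides can be summarized in a few lines of Python. This is a sketch only: `infer_srmp` stands in for the MIP of [Olteanu et al. 2018], `ask_dm` for the operator's answer, and `select_pair` for any of the five heuristics above; none of these names comes from the paper.

```python
import random

def incremental_elicitation(database, ask_dm, infer_srmp, select_pair,
                            budget=100):
    """Skeleton of the incremental loop: a heuristic picks a pair from
    the database, the decision maker compares it, and the MIP re-infers
    an SRMP model from all answers collected so far."""
    comparisons = []                         # (a, b, 'P' or 'I') triples
    model = None                             # no model before iteration 1
    for _ in range(budget):
        a, b = select_pair(database, model)  # one of the five heuristics
        relation = ask_dm(a, b)              # operator's holistic judgement
        comparisons.append((a, b, relation))
        model = infer_srmp(comparisons)      # stand-in for the MIP
    return model

# Simplest baseline heuristic from the list above: a random pair.
def random_pair(database, model):
    return random.choice(database)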


SLIDE 15

Heuristics: Experimental setting

  • DM simulation: a random SRMP model $M_{DM}$ (fixed $k$ profiles, $m$ criteria)
  • Artificial datasets (varying $n$ alternatives, $m$ criteria, $k$ profiles)
    - elicitation phase: 100 pairs of alternatives
    - test phase: 5000 alternatives
  • At each iteration:
    - a pair of alternatives is selected by the heuristic
    - $M_{DM}$'s answer on this pair adds new constraints, from which a new SRMP model $M_i$ is constructed
  • Rank the test data using $M_{DM}$ and $M_i$, then compare these rankings using Kendall's rank correlation coefficient $\tau$
  • Each SRMP model configuration ($k$ profiles, $m$ criteria) is run 100 times
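As a small illustration of the test-phase metric, Kendall's τ between two rankings can be computed with SciPy; the rank values below are made-up toy data, not results from the experiments.

```python
from scipy.stats import kendalltau

# Rank positions assigned to the same five test alternatives by the
# simulated decision maker M_DM and by a learned model M_i (toy data).
ranking_dm      = [1, 2, 3, 4, 5]
ranking_learned = [1, 3, 2, 4, 5]

tau, _ = kendalltau(ranking_dm, ranking_learned)
print(f"Kendall tau = {tau:.2f}")   # 0.80 here; 1.00 = identical rankings
```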


SLIDE 16

Outline

1. Context & Research questions
2. Simple Ranking Method using Reference Profiles (SRMP)
3. Heuristics for incremental learning of SRMP model
4. Experimental Results
5. Conclusion


SLIDE 17

Experimental Results

[Plot: mean Kendall τ, 3 criteria, 2 profiles, 100 experiments]


SLIDE 18

Experimental Results

[Plot: mean Kendall τ of 100 experiments, 2 profiles, 3 criteria]


SLIDE 19

Experimental Results


SLIDE 20

Experimental Results

Determination of the number of pairs necessary to achieve a “good” preference model


SLIDE 21

Outline

1. Context & Research questions
2. Simple Ranking Method using Reference Profiles (SRMP)
3. Heuristics for incremental learning of SRMP model
4. Experimental Results
5. Conclusion


SLIDE 22

Conclusion

  • The MaxProfiles heuristic dominates across the different MIP configurations
  • No clear difference between the MIP configurations → choose the least costly configuration in terms of computation time

Future work:
  • Experiment with new heuristics (Volume, MaxProfiles++, Alternate weights / profiles)
  • Input other preference information (typically the profiles)
  • Experiment with other data configurations ((2 profiles, 7 criteria), (3 profiles, 5 criteria), ...)
  • Integrate the learning algorithm into the drone simulator

Computation time for 100 experiments: one week for 2 profiles and 3 criteria with all combinations of heuristics and MIP configurations (200 CPUs, 300 GB RAM).


SLIDE 23

References

Olteanu, A.-L. et al. (2018). "Preference Elicitation for a Ranking Method based on Multiple Reference Profiles". Working paper or preprint. URL: https://hal.archives-ouvertes.fr/hal-01862334

Rolland, A. (2013). "Reference-based preferences aggregation procedures in multi-criteria decision making". European Journal of Operational Research 225.3, pp. 479–486. DOI: 10.1016/j.ejor.2012.10.013
