Natural Questions for Crowdsourcing Platforms Conrad Soon (RI), - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Generating Natural Questions for Crowdsourcing Platforms

Conrad Soon (RI), Sun Yiran (HCI)

slide-2
SLIDE 2

Outline

  • 1. Introduction
  • 2. Aim of Research and Literature Review
    • a. Ulam-Rényi game
  • 3. Methodology
    • a. Problem statement
    • b. Possible Solutions: Simulated Annealing and Exhaustive search
    • c. Final Solution: Heuristic Search
  • 4. Experimental Results
  • 5. Website Demonstration
  • 6. Conclusion and Future works
slide-3
SLIDE 3
  • 1. Introduction
slide-4
SLIDE 4

Crowd-Sourcing


Machine Learning Data-Labelling

slide-5
SLIDE 5

Problems

▪ Workers may make mistakes due to
  ▫ Lack of Motivation
  ▫ Lack of Expertise

Decrease the Accuracy of Crowdsourced Data!

slide-6
SLIDE 6
  • 2. Literature Review
slide-7
SLIDE 7

Existing Strategies

Non-feedback based: Code-matrix Approach
▪ Fewer questions, but less accurate.

Feedback-based: Ulam-Rényi Approach
▪ More questions, but more accurate.

one class

A worker’s decision rule

slide-8
SLIDE 8

Ulam-Rényi Approach

slide-9
SLIDE 9

Key Issues

  • 1. Workers may not be able to make class-based distinctions.
  • 2. Questions may end up being very complex and long.
  • 3. Needs to be shown that a real crowdsourcing platform using it is feasible.
slide-10
SLIDE 10

Aims

Key Research Aims

▪ Find a way to simplify questions asked
  ▫ Feature-based questions
  ▫ Reduce length of questions
▪ Demonstrate Ulam-Rényi approach is feasible
  ▫ Create a web demo

slide-11
SLIDE 11
  • 3. Methodology
slide-12
SLIDE 12

A Feature-based Approach

Questions Asked by Ulam-Rényi Heuristics (and various question generation strategies)
➔ class-based: “Is this dog a poodle or a beagle?”, “Is this building a structurally unsound building?”
➔ may be excessively long: “Is this dog a poodle, a husky, a beagle, a samoyed, or a bulldog?”

slide-13
SLIDE 13

Feature-based Questions
➔ “Does this dog have pointy ears?”, “Does this building have misaligned/tilted window frames or door frames?”

Advantages
➔ Concise
➔ More understandable: the presence or absence of features is more readily apparent

A Feature-based Approach

slide-14
SLIDE 14

Problem Formulation

▪ transform class-based Ulam-Rényi questions into concise feature-based questions
▪ while adhering to the constraints given by the Ulam-Rényi algorithm (to minimise the number of questions asked)

slide-15
SLIDE 15

Feature Matrix

“Characterisation”
▪ A combination of Boolean features connected by logical connectors "AND" (∧), "OR" (∨) and "NOT" (¬). It is itself a Boolean function.
▪ A set of classes is characterised by a characterisation if all classes in that set evaluate to true for that characterisation.

Reduced Task
▪ Find the shortest characterisation for a set of classes that satisfies the Ulam-Rényi constraint
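The definition of a characterisation can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the feature matrix, the class and feature names, and the helpers `feature`, `AND`, `OR`, `NOT` and `characterises` are all hypothetical.

```python
from typing import Callable, Dict, Iterable

# Hypothetical feature matrix: class name -> {feature name: is the feature present?}.
# The classes, features and values are invented for illustration.
FEATURE_MATRIX: Dict[str, Dict[str, bool]] = {
    "husky":  {"pointy_ears": True,  "short_legs": False},
    "corgi":  {"pointy_ears": True,  "short_legs": True},
    "beagle": {"pointy_ears": False, "short_legs": False},
}

Row = Dict[str, bool]
# A characterisation is itself a Boolean function of a class's feature row.
Characterisation = Callable[[Row], bool]


def feature(name: str) -> Characterisation:
    """Atomic characterisation: true when the named feature is present."""
    return lambda row: row[name]


def AND(a: Characterisation, b: Characterisation) -> Characterisation:
    return lambda row: a(row) and b(row)


def OR(a: Characterisation, b: Characterisation) -> Characterisation:
    return lambda row: a(row) or b(row)


def NOT(a: Characterisation) -> Characterisation:
    return lambda row: not a(row)


def characterises(c: Characterisation, classes: Iterable[str],
                  matrix: Dict[str, Row] = FEATURE_MATRIX) -> bool:
    """True if every class in `classes` evaluates to true under `c`."""
    return all(c(matrix[cls]) for cls in classes)
```

For instance, `AND(feature("pointy_ears"), NOT(feature("short_legs")))` corresponds to the question "Does this dog have pointy ears and not short legs?", which in this made-up matrix characterises the set containing only the husky.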

slide-16
SLIDE 16

Revisiting the constraint:

|T1∩A0| = 1, |T1∩A1| = 0

Coverage Vector
▪ Constraint vector: (1, 0)
▪ How many classes in each set can be characterised by the characterisation

Reduced Task
▪ Find a characterisation that is relatively short and has a coverage vector highly similar to the constraint
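One possible reading of the coverage vector is sketched below. The slide does not spell out which sets index the vector, so `groups` here is an assumption: an arbitrary ordered list of class sets, with the coverage vector counting how many classes in each one the characterisation is true for. The function names and the example data are hypothetical.

```python
import math
from typing import Sequence, Set, Tuple


def coverage_vector(covered: Set[str], groups: Sequence[Set[str]]) -> Tuple[int, ...]:
    """Count, for each group, how many of its classes the characterisation covers.

    `covered` is the set of classes for which the characterisation is true.
    """
    return tuple(len(covered & group) for group in groups)


def distance_to_constraint(coverage: Sequence[float], constraint: Sequence[float]) -> float:
    """Euclidean distance between a coverage vector and the constraint vector."""
    return math.dist(coverage, constraint)


# Made-up example: two groups of classes and a characterisation that is
# true for "husky" and "beagle".
groups = [{"husky", "corgi", "golden_retriever"}, {"beagle"}]
covered = {"husky", "beagle"}
coverage = coverage_vector(covered, groups)
gap = distance_to_constraint(coverage, (1, 0))
```

A distance of zero means the characterisation meets the constraint exactly; larger values mean it covers too many or too few classes in some group.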

slide-17
SLIDE 17
  • 1. Start from the shortest characterisations (one feature long).
  • 2. Check if there is a relatively good characterisation by calculating the Euclidean distance between the coverage vector of each characterisation and the constraint vector.
  • 3. If there is a good characterisation, terminate.
  • 4. If there is not, select the most promising characterisations and combine them using logical connectors to form next-generation characterisations.

Heuristic Search
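The four steps can be sketched as a small beam-style search. This is an illustrative sketch under stated assumptions, not the authors' code: the feature matrix is invented, negated single features are included in the first generation, and `tolerance`, `beam_width` and `max_generations` are hypothetical parameters standing in for "relatively good" and "most promising".

```python
import math
from itertools import combinations
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple

# A candidate is (question text, set of classes it is true for).
Candidate = Tuple[str, FrozenSet[str]]

# Hypothetical feature matrix, invented for illustration.
MATRIX: Dict[str, Dict[str, bool]] = {
    "husky":     {"pointy_ears": True,  "short_legs": False},
    "corgi":     {"pointy_ears": True,  "short_legs": True},
    "dachshund": {"pointy_ears": False, "short_legs": True},
    "beagle":    {"pointy_ears": False, "short_legs": False},
}


def coverage(covered: FrozenSet[str], groups: Sequence[Set[str]]) -> Tuple[int, ...]:
    return tuple(len(covered & g) for g in groups)


def heuristic_search(matrix: Dict[str, Dict[str, bool]],
                     groups: Sequence[Set[str]],
                     constraint: Sequence[int],
                     tolerance: float = 0.0,
                     beam_width: int = 4,
                     max_generations: int = 3) -> Tuple[str, float]:
    classes = frozenset(matrix)
    features = sorted({f for row in matrix.values() for f in row})
    if not features:
        raise ValueError("feature matrix has no features")

    def score(cand: Candidate) -> float:
        # Step 2: Euclidean distance between coverage and constraint vectors.
        return math.dist(coverage(cand[1], groups), constraint)

    # Step 1: one-feature characterisations (plus their negations, an assumption).
    pool: List[Candidate] = []
    for f in features:
        true_for = frozenset(c for c in classes if matrix[c][f])
        pool.append((f, true_for))
        pool.append((f"NOT {f}", classes - true_for))

    best = pool[0]
    for _ in range(max_generations):
        ranked = sorted(pool, key=score)
        if score(ranked[0]) < score(best):
            best = ranked[0]
        if score(best) <= tolerance:   # Step 3: good enough, terminate.
            break
        # Step 4: combine the most promising candidates with AND / OR.
        promising = ranked[:beam_width]
        pool = []
        for (text_a, set_a), (text_b, set_b) in combinations(promising, 2):
            pool.append((f"({text_a} AND {text_b})", set_a & set_b))
            pool.append((f"({text_a} OR {text_b})", set_a | set_b))
        if not pool:
            break
    return best[0], score(best)
```

Because shorter characterisations are examined first, the search returns the first generation that reaches the tolerance. With a narrow beam it can miss combinations that only become useful after several generations, which is the usual trade-off of this kind of heuristic against exhaustive search.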

slide-18
SLIDE 18

4. Experimental Results

slide-19
SLIDE 19

Experiments

▪ Performance Test of the Heuristic
▪ Performance Test of the Ulam-Rényi Strategy Integrated with the Heuristic
▪ Optimality Test

slide-20
SLIDE 20

Performance Test of the Heuristic

▪ Time needed to generate a question: negligible (<10^(-7) sec/question)
▪ Length of the question

slide-21
SLIDE 21

▪ How well does the generated question satisfy the constraint given by the Ulam-Rényi strategy?

slide-22
SLIDE 22

Performance Test of the Integrated Question Generation Strategy

slide-23
SLIDE 23

5. Website Demo

slide-24
SLIDE 24

Demonstration

slide-25
SLIDE 25

6. Conclusion and Future Works

slide-26
SLIDE 26

Conclusion

  • 1. Feature-based decomposition reduces complexity of questions.
  • 2. Proof-of-concept of a crowdsourcing platform using Ulam-Rényi approach.

slide-27
SLIDE 27

Future Extensions

Automatic generation of features given any task.

slide-28
SLIDE 28

Thank you!

Any questions?

slide-29
SLIDE 29

If we adopt a trait-oriented approach, in many cases we have to ask a series of questions about the object’s traits to pin down the class the object is in. How do we minimise the number of questions we ask?

slide-30
SLIDE 30

OUR PROCESS

▪ Minimise the number of questions
▪ Add user-friendly features
▪ Minimise the length of questions
slide-31
SLIDE 31

  • 1. Research on other discrete optimisation strategies
  • 2. Research on the possibility of converting it to a continuous optimisation problem
  • 3. What if the constraint |Ti ∩ Aj| gets fuzzy? How to utilise it?
  • 4. Run performance analysis

Minimise the Length of Questions

By the end of term 3

slide-32
SLIDE 32

  • 1. Python GUI (under construction)
  • 2. Automated Trait Generation
    • a. Use web-scraping and natural language processing techniques to generate traits and the trait matrix automatically instead of asking users for manual input.

Add User-friendly Features

By December

slide-33
SLIDE 33

Ulam-Rényi Approach

  • 1. Suppose there are N objects in a set 𝐓. A responder picks an object H and a questioner can ask him questions about it to find out what H is.
    • a. Questions are about membership of H in some subset of 𝐓.
  • 2. However, the responder can lie up to a maximum of e times.
  • 3. We can interpret a multi-class labelling problem as such a game.
    • a. 𝐓 becomes the set of all classes.
    • b. H becomes the hidden state of the image.
    • c. e is a parameter that can be varied based on how accurate our workers are.
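The game above can be tracked with a simple state update, sketched below. This follows the standard bookkeeping for searching with lies (count, for each candidate, how many answers would have to be lies if it were H, and discard it once the count exceeds e); it is not taken from the slides, and the function names and the reading of T_i as "candidates that currently imply i lies" are assumptions.

```python
from typing import Dict, Iterable, Set


def new_state(classes: Iterable[str]) -> Dict[str, int]:
    """Every class starts as a candidate for H with zero assumed lies."""
    return {cls: 0 for cls in classes}


def update(state: Dict[str, int], asked_subset: Set[str],
           answer_yes: bool, max_lies: int) -> Dict[str, int]:
    """Apply one answer to the question "Is H in asked_subset?".

    A candidate that disagrees with the answer can only be H if the
    responder lied, so its lie count goes up by one. Candidates needing
    more than `max_lies` lies are eliminated.
    """
    updated: Dict[str, int] = {}
    for cls, lies in state.items():
        consistent = (cls in asked_subset) == answer_yes
        lies_now = lies if consistent else lies + 1
        if lies_now <= max_lies:
            updated[cls] = lies_now
    return updated


def level_set(state: Dict[str, int], i: int) -> Set[str]:
    """Assumed meaning of T_i: candidates that currently imply exactly i lies."""
    return {cls for cls, lies in state.items() if lies == i}


# Example with e = 1: two "yes" answers from workers.
state = new_state(["husky", "corgi", "beagle"])
state = update(state, {"husky"}, answer_yes=True, max_lies=1)
state = update(state, {"husky", "corgi"}, answer_yes=True, max_lies=1)
```

The game ends when a single candidate remains. Raising `max_lies` tolerates less reliable workers at the cost of more questions.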
slide-34
SLIDE 34

  • 1. T0={Husky, Corgi, Golden Retriever}
  • 2. A0={Husky, Beagle}, A1={Samoyed, Corgi, Golden Retriever}
  • 3. Constraint: |T0∩A0| = 1, |T0∩A1| = 2

T1 T2

Ulam-Rényi Questions and Constraints
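The counts in this example can be checked directly with Python sets. The helper name `constraint_counts` is hypothetical; the sets are the ones given on the slide.

```python
from typing import Set, Tuple


def constraint_counts(t: Set[str], a0: Set[str], a1: Set[str]) -> Tuple[int, int]:
    """Return (|T ∩ A0|, |T ∩ A1|) for a state set T and a question split A0 / A1."""
    return len(t & a0), len(t & a1)


T0 = {"Husky", "Corgi", "Golden Retriever"}
A0 = {"Husky", "Beagle"}
A1 = {"Samoyed", "Corgi", "Golden Retriever"}

counts = constraint_counts(T0, A0, A1)  # (1, 2), matching the constraint above
```

Only Husky lies in both T0 and A0, while Corgi and Golden Retriever lie in both T0 and A1.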

slide-35
SLIDE 35

▪ How much does each individual question contribute to a correct answer?