Generating Natural Questions for Crowdsourcing Platforms
Conrad Soon (RI), Sun Yiran (HCI)
Outline
1. Introduction
2. Aim of Research and Literature Review
   a. Ulam-Rényi game
3. Methodology
   a. Problem statement
   b. Possible Solutions: Simulated Annealing and Exhaustive Search
   c. Final Solution: Heuristic Search
4. Experimental Results
5. Website Demonstration
6. Conclusion and Future Works
Crowd-Sourcing
Machine Learning Data-Labelling
Problems
▪ Workers may make mistakes due to
  ▫ Lack of Motivation
  ▫ Lack of Expertise
Decrease the Accuracy of Crowdsourced Data!
Existing Strategies
Non-feedback based Feedback-based
A worker’s decision rule
Code matrix Approach: fewer questions and less accurate.
Ulam-Rényi Approach: more questions and more accurate.
Ulam-Rényi Approach
Key Issues
1. Workers may not be able to make class-based distinctions.
2. Questions may end up being very complex and long.
3. It needs to be shown that a real crowdsourcing platform using it is feasible.
Key Research Aims
▪ Find a way to simplify questions asked
  ▫ Feature-based questions
  ▫ Reduce length of questions
▪ Demonstrate Ulam-Rényi approach is feasible
  ▫ Create a web demo
A Feature-based Approach
Questions Asked by Ulam-Rényi Heuristics (and various question generation strategies)
➔ class-based: “Is this dog a poodle or a beagle?”, “Is this building a structurally unsound building?”
➔ may be excessively long: “Is this dog a poodle, a husky, a beagle, a samoyed, or a bulldog?”
Feature-based Questions
➔ “Does this dog have pointy ears?”, “Does this building have misaligned/tilted window frames or door frames?”
Advantages
➔ Concise
➔ More understandable: the presence or absence of features is more readily apparent
Problem Formulation
▪ transform class-based Ulam-Rényi questions into feature-based questions
▪ while adhering to the constraints given by the Ulam-Rényi algorithm (to minimise the number of questions asked)
Feature Matrix
“Characterisation”
▪ A combination of Boolean features connected by the logical connectors "AND" (∧), "OR" (∨) and "NOT" (¬). It is itself a Boolean function.
▪ A set of classes is characterised by a characterisation if all classes in that set evaluate to true for that characterisation.
Reduced Task
▪ Find the shortest characterisation for a set of classes that satisfy the Ulam-Rényi constraint
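The definition of a characterisation can be illustrated with a minimal Python sketch. The feature matrix, feature names, and dog classes below are made-up values for illustration only, and modelling a characterisation as a plain Boolean function of a class's feature row is an assumption, not necessarily how the authors implemented it.

```python
# Hypothetical feature matrix (class -> feature -> bool); values are illustrative only.
FEATURES = {
    "Husky":  {"pointy_ears": True,  "short_legs": False},
    "Corgi":  {"pointy_ears": True,  "short_legs": True},
    "Beagle": {"pointy_ears": False, "short_legs": False},
}

def characterises(characterisation, classes, features=FEATURES):
    """Return True if every class in `classes` evaluates to True."""
    return all(characterisation(features[c]) for c in classes)

# Characterisations are Boolean functions of a class's feature row.
def pointy_or_short(row):          # pointy_ears OR short_legs
    return row["pointy_ears"] or row["short_legs"]

def pointy_and_not_short(row):     # pointy_ears AND NOT short_legs
    return row["pointy_ears"] and not row["short_legs"]

print(characterises(pointy_or_short, {"Husky", "Corgi"}))       # True
print(characterises(pointy_and_not_short, {"Husky", "Corgi"}))  # False: Corgi has short legs
```

In this toy matrix, {Husky, Corgi} is characterised by "pointy ears OR short legs" but not by "pointy ears AND NOT short legs".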
Revisiting the constraint:
|T1∩A0| = 1, |T1∩A1| = 0
Coverage Vector
▪ Constraint vector: (1, 0)
▪ How many classes in each set can be characterised by the characterisation
Reduced Task
▪ Find a characterisation that is relatively short and has a coverage vector highly similar to the constraint vector
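One possible reading of the coverage vector, sketched in Python: for each set A_j, count the classes for which the characterisation is true, then compare the result with the constraint vector by Euclidean distance. The function names and the example sets are assumptions made for illustration.

```python
import math

def coverage_vector(covered, state_sets):
    """Number of classes in each set A_j for which the characterisation is true."""
    return tuple(len(covered & a_j) for a_j in state_sets)

def distance_to_constraint(covered, state_sets, constraint):
    """Euclidean distance between the coverage vector and the constraint vector."""
    return math.dist(coverage_vector(covered, state_sets), tuple(constraint))

# Hypothetical sets; `covered` is the set of classes a characterisation is true for.
A0 = {"Husky", "Beagle"}
A1 = {"Samoyed", "Corgi"}
constraint = (1, 0)   # |T1∩A0| = 1, |T1∩A1| = 0

print(coverage_vector({"Husky", "Corgi"}, [A0, A1]))                     # (1, 1)
print(distance_to_constraint({"Husky", "Corgi"}, [A0, A1], constraint))  # 1.0
print(distance_to_constraint({"Beagle"}, [A0, A1], constraint))          # 0.0
```

A distance of 0 means the characterisation covers exactly as many classes in each set as the constraint asks for.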
2. Check if there is a relatively good characterisation by calculating the Euclidean distance between the coverage vector of each characterisation and the constraint vector.
3. Select the most promising characterisations and combine them using logical connectors to form next-generation characterisations.
Heuristic Search
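The select-and-combine loop can be sketched as a small beam-style search in Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the starting pool (each feature and its negation), the beam width, the use of label length as a proxy for question length, and the toy feature matrix are all hypothetical choices.

```python
import math
from itertools import combinations

def heuristic_search(features, state_sets, constraint, beam=6, generations=3):
    """Return (label, covered_classes, distance) of the best characterisation found."""
    classes = set(features)
    names = sorted(next(iter(features.values())))
    # Generation 0 (assumed): every single feature and its negation.
    pool = {}
    for f in names:
        positive = frozenset(c for c in classes if features[c][f])
        pool[f] = positive
        pool[f"NOT {f}"] = frozenset(classes - positive)

    def score(item):
        label, covered = item
        coverage = tuple(len(covered & a_j) for a_j in state_sets)
        # Primary key: distance to the constraint; tie-break: shorter label.
        return (math.dist(coverage, tuple(constraint)), len(label))

    best = min(pool.items(), key=score)
    for _ in range(generations):
        if score(best)[0] == 0:
            break
        top = sorted(pool.items(), key=score)[:beam]   # most promising candidates
        next_pool = dict(top)
        for (l1, c1), (l2, c2) in combinations(top, 2):
            next_pool[f"({l1} AND {l2})"] = c1 & c2    # AND = set intersection
            next_pool[f"({l1} OR {l2})"] = c1 | c2     # OR = set union
        pool = next_pool
        best = min(pool.items(), key=score)
    return best[0], set(best[1]), score(best)[0]

# Hypothetical feature matrix and sets, for illustration only.
DOGS = {
    "Husky":            {"pointy_ears": True,  "short_legs": False, "golden_coat": False},
    "Beagle":           {"pointy_ears": False, "short_legs": False, "golden_coat": False},
    "Samoyed":          {"pointy_ears": True,  "short_legs": False, "golden_coat": False},
    "Corgi":            {"pointy_ears": True,  "short_legs": True,  "golden_coat": False},
    "Golden Retriever": {"pointy_ears": False, "short_legs": False, "golden_coat": True},
}
A0 = {"Husky", "Beagle"}
A1 = {"Samoyed", "Corgi", "Golden Retriever"}

label, covered, dist = heuristic_search(DOGS, [A0, A1], constraint=(1, 0))
print(label, covered, dist)
```

With this toy data no single feature meets the constraint (1, 0), but combining two negated features with AND isolates one class of A0 and none of A1. A narrower beam can discard a candidate that is needed later, which is the usual trade-off of this kind of heuristic.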
4. Experimental Results
Experiments
▪ Performance Test of the Heuristic
▪ Performance Test of the Ulam-Rényi Strategy Integrated with the Heuristic
▪ Optimality Test
Performance Test of the Heuristic
▪ Time needed to generate a question: negligible (<10^(-7) sec/question)
▪ Length of the question
▪ How well does the generated question satisfy the constraint given by the Ulam-Rényi strategy?
Performance Test of the Integrated Question Generation Strategy
5. Website Demo
Demonstration
6. Conclusion and Future Works
Conclusion
complexity of questions.
using Ulam-Rényi approach.
Future Extensions
Automatic generation of features given any task.
Thank you!
Any questions?
If we adopt a trait-oriented approach, in many cases we have to ask serial questions about the object’s traits to pin down the class the object is in. How do we minimise the number of questions we ask?
OUR PROCESS
▪ Minimise the number of questions
▪ Add user-friendly features
▪ Minimise the length of questions
1. Research on other discrete optimisation strategies
2. Research on the possibility of converting it to a continuous optimisation problem
3. What if the constraint |Ti ∩ Aj| gets fuzzy? How to utilise it?
4. …ance analysis
Minimise the Length of Questions
By the end of term 3
1. … (under construction)
2. Automated Trait Generation
   a. Web scraping and natural language processing techniques to generate traits and the trait matrix automatically instead of asking users for manual input.
Add User-friendly Features
By December
Ulam-Rényi Approach
1. There are N objects in a set 𝐓; a responder picks an object H and a questioner can ask him questions about it to find out what H is.
   a. Questions are about membership of H in some subset of 𝐓
2. However, the responder can lie up to a maximum of e times.
3. Interpret a multi-class labelling problem as such a game.
   a. 𝐓 denotes the set of all classes.
   b. H denotes the hidden state of the image.
   c. e is a parameter that can be varied based on how accurate workers are.
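The game with up to e lies can be sketched in Python using the usual state representation for Ulam-Rényi games: A_j holds the classes that remain possible if exactly j of the answers so far were lies. The function name and the example dog classes are assumptions for illustration; the slides do not show an implementation.

```python
def update_state(state, question, answer):
    """Update [A_0, ..., A_e] after the reply `answer` to "Is H in `question`?".

    A class that agrees with the answer stays in its set A_j; a class that
    contradicts it moves to A_{j+1}; a class needing more than e lies is dropped.
    """
    e = len(state) - 1
    new_state = [set() for _ in state]
    for j, a_j in enumerate(state):
        for c in a_j:
            consistent = (c in question) == answer
            k = j if consistent else j + 1
            if k <= e:
                new_state[k].add(c)
    return new_state

# Hypothetical run with e = 1 (at most one wrong answer from the worker).
classes = {"Husky", "Beagle", "Samoyed", "Corgi", "Golden Retriever"}
state = [set(classes), set()]
state = update_state(state, {"Husky", "Corgi", "Golden Retriever"}, True)
state = update_state(state, {"Husky", "Beagle"}, True)
print(state)  # A_0 == {"Husky"}; A_1 == {"Corgi", "Golden Retriever", "Beagle"}
```

After the two "yes" answers, Samoyed would need two lies and is eliminated, while Husky is the only class consistent with both answers.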
1. T0 = {Husky, Corgi, Golden Retriever}
2. A0 = {Husky, Beagle}, A1 = {Samoyed, Corgi, Golden Retriever}
3. Constraint: |T0∩A0| = 1, |T0∩A1| = 2
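The intersection sizes in this example can be checked directly in Python, reading the first set as the question T0 and the second as A0 (an interpretation that matches the stated constraint values).

```python
T0 = {"Husky", "Corgi", "Golden Retriever"}    # classes named in the question
A0 = {"Husky", "Beagle"}                       # assumed label for the second set
A1 = {"Samoyed", "Corgi", "Golden Retriever"}

constraint_vector = (len(T0 & A0), len(T0 & A1))
print(constraint_vector)  # (1, 2)
```

T0 shares only Husky with A0, and Corgi and Golden Retriever with A1, which gives the constraint vector (1, 2).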
Ulam-Rényi Questions and Constraints
▪ How much does each individual question contribute to a correct answer?