Evaluation DEMMS: Evaluation of Multimedia


  1. Evaluation DEMMS: Evaluation of Multimedia
     Robert Villa, February 2008
     • What are the Evaluation lectures about:
       – When to evaluate systems
       – What kinds of evaluation are possible
         • Predictive evaluations
         • Traditional user experiments
         • Ethnographic style studies
       – A case study describing an example evaluation in detail
     • Today:
       – The role of evaluation within the larger development effort
       – Predictive evaluation: expert reviews, usage simulations
       – Traditional user experiments: collecting usage data
       – Ethnographic style techniques (very briefly)
     • Next week:
       – Lecture, Tuesday 12th Feb: evaluation case study
       – Tutorial, Tuesday 12th Feb: evaluation case study

  2. What is Evaluation? Kinds of Evaluation
     • Formative
       – Evaluation which occurs during the design of a product, to guide its development
       – The principal focus here
     • Summative
       – Evaluation which takes place after a product has been developed, and which judges the finished product

     Evaluation within the City Design Method
     • The City Design Method has been covered in previous lectures (Dr McGee-Lennon)
     • Prototype development with formative evaluation
     [Diagram: the City Design Method pipeline — user analysis, information analysis (information types) and requirements feed media selection, guided by media selection rules & patterns; outline scripting and design within media (task & attention guidelines) lead to interaction design, prototype development, product implementation, and evaluation]

  3. Evaluation in the development life cycle
     • User-centred process
     • Early design stages
       – Predict how well a design works
       – Test out ideas quickly
     • Later design stages
       – Identify user difficulties
       – Identify possible improvements
       – Can spend more time on more thorough evaluations

     Prototyping
     • Can use storyboards as prototypes for evaluation
     • Mock-ups (a few web pages, images, etc.)
     • Problems can occur with prototypes
       – False settings (e.g. ignoring bandwidth issues)

     Predictive evaluation
     • Does not involve user testing
       – Want to try and predict how something works
     • Why do it?
       – Quick
       – Cheap

     Expert reviews
     • A usability expert reviews the system for problems
       – The expert attempts to simulate the behaviour of beginners
     • Advantages
       – Efficient: one or two reviewers may identify many problems
       – Experts more forthcoming with information
     • Important that the reviewer is not involved with system development

  4. Heuristic evaluation
     • Like expert reviews, but inspection is guided by a set of heuristics
       – Heuristics focus on key usability concerns
       – Examples of heuristics:
         • Be consistent
         • Provide clearly marked exits
         • Speak the users' language
       – (Nielsen, 1992)

     Walkthroughs
     • Determine a task to be done, and the context of the task
       – An expert then “walks through” the task, reviewing the actions necessary
     • Similar to a review, but with more detailed predictions of what users do

     Simulations
     • Given a prototype, automatically simulate users' actions with it (see the sketch after this slide)
       – Requires prototype software
       – Enables a quick “what-if” analysis

     Predictive evaluation overview
     • Advantages:
       – Relatively fast and cheap (does not require users to test software)
       – Does not require fully working prototypes
       – Can provide a lot of feedback from experts
       – May be more appropriate at the start of prototyping and design
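The slides do not name a particular simulation technique. One classic way to predict task times without running user tests is a keystroke-level model (KLM); the sketch below is a minimal illustration under that assumption. The operator times are the commonly quoted textbook values, not measurements, and the two interface designs being compared are hypothetical.

```python
# Minimal KLM-style "what-if" simulation sketch (an assumption; the
# lecture does not specify which simulation technique is meant).
OPERATOR_TIME = {
    "K": 0.28,  # press a key (average typist)
    "P": 1.10,  # point with a mouse
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}

def predict_task_time(operators):
    """Sum per-operator times for a sequence such as 'MHPK'."""
    return sum(OPERATOR_TIME[op] for op in operators)

# Hypothetical comparison: design A uses a toolbar button,
# design B a two-key keyboard shortcut, for the same task.
design_a = "MHPK"  # think, move hand to mouse, point, click
design_b = "MKK"   # think, press two keys
print(f"Design A: {predict_task_time(design_a):.2f}s")
print(f"Design B: {predict_task_time(design_b):.2f}s")
```

Even this crude model supports the quick “what-if” analysis the slide describes: changing the operator sequence immediately changes the predicted time, with no users involved.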

  5. Predictive evaluation overview (cont)
     • Disadvantages:
       – The views of experts may not coincide with how your users actually behave
       – Simulations don't necessarily model users' behaviour correctly

     User Experiments
     • No matter what other kinds of evaluation are carried out, at some point you need to evaluate with real users
       – Traditional lab-based experiments
       – Participative evaluation/design
       – Ethnographic-style work
     • Quantitative/qualitative data

     Traditional experiments
     • Laboratory setting
     • Psychological research is the model
     • Generally:
       – Aim is for quantitative results (“hard” evidence)
       – Often relatively narrow domain

  6. Variables
     • Independent variables
       – What you manipulate
     • Dependent variables
       – Expected to be influenced by the independent variables

     Example
     • You develop a new type of video browsing interface X. You want to find out if users can browse videos more quickly than with an existing interface Y.
     • Independent variable:
       – The two different systems X and Y
         • X and Y are the two “levels” of the variable
     • Dependent variable:
       – Navigation time (an analysis sketch follows this slide)

     Experimental Design
     • Between subject
       – A user does only one condition
     • Within subject
       – Users do all conditions
     • Matched pairs
       – Users are matched in pairs based on some criteria

     Collecting usage data
     • Observing users
     • Think aloud protocol
     • Software logging
     • Interviews
     • Questionnaires
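To make the X vs. Y example concrete, here is a sketch of how the dependent variable (navigation time) might be compared across the two levels of the independent variable in a within-subject design. The data values and the choice of a paired t-test are illustrative assumptions, not part of the lecture.

```python
# Hypothetical within-subject analysis for the X vs. Y example.
from scipy import stats

# Navigation time in seconds, one value per participant per system;
# the same eight users tried both systems, so the scores are paired.
times_x = [41.2, 38.5, 45.1, 39.8, 42.3, 37.9, 44.0, 40.6]  # new interface X
times_y = [48.7, 44.2, 50.3, 46.1, 49.8, 43.5, 51.2, 45.9]  # existing interface Y

t_stat, p_value = stats.ttest_rel(times_x, times_y)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value would suggest the difference in mean navigation
# time between the two levels is unlikely to be due to chance.
```

In a between-subject design, where each user sees only one condition, an unpaired test (e.g. `stats.ttest_ind`) would be used instead.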

  7. Observing Users
     • Direct observation
       – Watch someone carry out specially devised or normal tasks
       – Obtrusive: Hawthorne effect (1939)
         • Behaviour and performance can be altered when you watch somebody who is aware of being watched

     Observing Users (2)
     • Indirect observation
       – E.g. video recording or screen recording software
       – Less obtrusive than direct monitoring
     • Problems:
       – Lots of data, which can be very difficult and time consuming to analyse

     Think aloud protocol
     • Encourage a user to say out loud what he/she is thinking while carrying out a task
       – Added strain on users (they have to talk about what they're doing as well as do it)
       – Can generate lots of feedback about an interface

     Software logging
     • Software is “instrumented” to generate a time-stamped log of actions
       – Much easier to analyse a log than video
         • E.g. “time on web page” can be calculated if a log contains time-stamped browse events (see the sketch after this slide)
       – Often requires software to be altered
     • Can get general purpose key loggers, browser loggers, etc.
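As a small illustration of the “time on web page” calculation mentioned above, the sketch below derives per-page times from a time-stamped browse log. The log format (timestamp, event, page) is an assumption made for the example; real instrumented software will define its own schema.

```python
# Sketch: computing time on page from time-stamped browse events.
from datetime import datetime

log = [  # hypothetical log entries: (timestamp, event, page)
    ("2008-02-12 10:00:05", "open",  "index.html"),
    ("2008-02-12 10:00:41", "open",  "search.html"),
    ("2008-02-12 10:02:13", "open",  "results.html"),
    ("2008-02-12 10:02:55", "close", "results.html"),
]

# Time on each page = gap between consecutive time-stamped events.
for (t1, _, page), (t2, _, _) in zip(log, log[1:]):
    start = datetime.strptime(t1, "%Y-%m-%d %H:%M:%S")
    end = datetime.strptime(t2, "%Y-%m-%d %H:%M:%S")
    print(f"{page}: {(end - start).total_seconds():.0f}s")
```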

  8. Interviews
     • Structured interviews
       – Predefined questions asked in a set way
         • E.g. public opinion surveys
       – Important if you want to generate statistics
         • E.g. “X% of people interviewed agreed with ...”
     • Flexible interviews
       – Set topics, but the interviewer is free to follow the interviewee's replies
       – Often used for requirements gathering, and sometimes after more formal evaluations

     Questionnaires
     • Can be given to a large number of people (e.g. put on the web)
     • Surprisingly difficult to do well
       – The emphasis is on creating unambiguous questions
     • Closed questions (multiple choice)
     • Open questions

     Questionnaires (cont)
     • Different scales can be used in closed questions:
       – Checklist options
         • E.g. yes/no/don't know
       – Multi-point rating
         • End points given (e.g. very useful/of no use)
       – Likert scale:
         • Multi-point scale where strength of agreement is measured (a scoring sketch follows this slide)
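As a sketch of how closed Likert-scale responses turn into the kind of statistics mentioned above (“X% of people interviewed agreed with ...”), the example below scores a hypothetical 5-point scale. The statement, scale labels, and responses are all made up for illustration.

```python
# Sketch: scoring hypothetical 5-point Likert-scale responses.
LIKERT = {  # strength of agreement mapped to a numeric score
    "strongly disagree": 1, "disagree": 2, "neutral": 3,
    "agree": 4, "strongly agree": 5,
}

# Responses to the closed question "The system was easy to use."
responses = ["agree", "strongly agree", "neutral", "agree",
             "disagree", "strongly agree", "agree", "agree"]

scores = [LIKERT[r] for r in responses]
mean = sum(scores) / len(scores)
agreed = sum(r in ("agree", "strongly agree") for r in responses)
print(f"Mean rating: {mean:.2f}/5")
print(f"{100 * agreed / len(responses):.0f}% of respondents agreed")
```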

  9. Standard questionnaires
     • Standard questionnaires have been developed which can be re-used
       – NASA-TLX
         • Measures the level of task load on a user (a scoring sketch follows this slide)
       – QUIS
         • “Questionnaire for User Interaction Satisfaction”
         • Assesses a user's subjective satisfaction with aspects of a user interface

     Common Style of Experiment
     • Often with multimedia/HCI experiments:
       – Purpose is to determine if a system or interface is “better” than an old one
       – Within subject designs
       – Independent variables:
         • Two or more “systems” or “interfaces”
         • One or more tasks (e.g. four different search tasks)
       – Dependent variables:
         • Time
         • Task performance (where it can be measured)

     Common Style of Experiment (cont)
     • Uses questionnaires:
       – Entry questionnaire:
         • General information about the user (gender, languages, etc.)
       – Post-task questionnaire:
         • User perception of the task/system/etc.
       – Exit questionnaire:
         • User perceptions of the different systems, etc.
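As a sketch of what a NASA-TLX score looks like: the questionnaire rates six workload subscales, and the simple “raw TLX” variant averages them without the pairwise weighting step of the full procedure. The ratings below are hypothetical values for one participant, not data from any study.

```python
# Sketch: raw (unweighted) NASA-TLX workload score for one participant.
SUBSCALES = ["mental demand", "physical demand", "temporal demand",
             "performance", "effort", "frustration"]

# Hypothetical post-task ratings on a 0-100 scale.
ratings = {"mental demand": 70, "physical demand": 20,
           "temporal demand": 55, "performance": 35,
           "effort": 60, "frustration": 40}

raw_tlx = sum(ratings[s] for s in SUBSCALES) / len(SUBSCALES)
print(f"Raw TLX workload: {raw_tlx:.1f}/100")
```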

  10. Next week ...
      • We'll go through an example case study

      Ethnographic style studies
      • Lab evaluations have been criticised:
        – The lab is not like the real world
        – No account of context
        – Artificial tasks
        – Not possible to control everything
      • In response, some argue for:
        – Ethnographic style studies, where researchers study the use of systems in situ

      Ethnographic style studies (cont)
      • In reality this generally means:
        – The experimenter must go into the work environment and observe users working
      • Issues:
        – Takes lots of time
        – Typically generates qualitative rather than quantitative data
