
Ink Analysis: Richard Anderson, CSE 481b, Winter 2007 - PDF document


  1. Announcements
     • VisualStudio Express
       • Free, Hobbyist version of VS 2005
     • Presentations, Tuesday Jan 23
       • 15 minute presentation + 3 minutes discussion
       • PowerPoint slides
       • Group order: A, B, C, D

     Ink Analysis
     Richard Anderson
     CSE 481b, Winter 2007

     Today’s lecture
     • Handwriting Recognition
     • Structure Recognition
     • Classification
     • Annotation
     • JNT Note Format

     Ink Analysis for Search
     • Output
       • Mapping of search results to source
       • Reflect underlying structure
     • Handle different types of search queries
       • Raw text
       • Boolean
       • Typed queries (“481 as a course number”)
       • Object queries (Course numbers)
       • Environment (List, Prose, Mathematics, ...)

     Handwriting Recognition: Identify the following words

     Ink Analysis Pipeline
     Filter, Structure, Recognize, Classify, Annotate
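The search goals on this page (raw text and Boolean queries, with results mapped back to their source) can be illustrated with a small inverted index. This is only a sketch in Python, not code from the lecture: it assumes recognition has already produced a list of words per page, and the names `build_index`, `raw_text_query`, and `boolean_and_query` are hypothetical.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercased word to the (page, word position) pairs where it occurs.

    `pages` is assumed to be a list of pages, each a list of recognized words.
    Keeping positions is what lets a hit be mapped back to its source ink.
    """
    index = defaultdict(set)
    for page_no, words in enumerate(pages):
        for word_no, word in enumerate(words):
            index[word.lower()].add((page_no, word_no))
    return index

def raw_text_query(index, term):
    """Raw text query: every (page, position) where the term was recognized."""
    return sorted(index.get(term.lower(), set()))

def boolean_and_query(index, terms):
    """Boolean AND query: page numbers that contain every term."""
    page_sets = [{page for page, _ in index.get(t.lower(), set())} for t in terms]
    return sorted(set.intersection(*page_sets)) if page_sets else []

# Hypothetical recognized text for three note pages.
pages = [
    ["CSE", "481", "meets", "Tuesday"],
    ["Room", "481", "is", "closed"],
    ["CSE", "lecture"],
]
index = build_index(pages)
hits = raw_text_query(index, "481")
both = boolean_and_query(index, ["CSE", "481"])
```

A typed query such as “481 as a course number” would need the classification step as well, since a plain index cannot tell the course number on the first page from the room number on the second.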

  2. Recognizer Architecture (slide from Jay Pittman, Microsoft)
     [Diagram not recoverable from the extracted text. Labels: Ink Segments, TDNN, Output Matrix, Lexicon, Beam Search, Top 10 List, Recognition results.]
     Top 10 list shown in the diagram: dog 68, clog 57, dug 51, doom 42, divvy 37, ooze 35, cloy 34, doxy 29, client 22, dozy 13

     Recognizer Training
     • Collect large set of training data
       • Samples of known inputs that can be used to set “weights” in reco engine
     • Needed to build a recognizer
       • Dictionary
       • Language samples
     • Commercial recognizers based on massive data sets

     Tablet PC Recognition API
     • Basic idea:
       • Ink In, Text Out

     Recognition Code I

         // Set up ink collection on the panel and pick the default recognizer.
         private Recognizers recognizers;
         private Recognizer recognizer;

         public Form1() {
             InitializeComponent();
             this.inkCollector = new InkCollector(this.inkPanel.Handle);
             this.inkCollector.Enabled = true;
             this.recognizers = new Recognizers();
             this.recognizer = recognizers.GetDefaultRecognizer();
         }

     Recognition Code II

         private void OnRecoClick(object sender, EventArgs e) {
             // Feed the collected strokes to a fresh recognizer context.
             // GetFactoid() is defined elsewhere (not shown on the slide).
             RecognizerContext recoContext = this.recognizer.CreateRecognizerContext();
             recoContext.Factoid = GetFactoid();
             recoContext.Strokes = this.inkCollector.Ink.Strokes;
             recoContext.EndInkInput();

             RecognitionStatus recoStatus;
             RecognitionResult recoResult = recoContext.Recognize(out recoStatus);

             if (recoStatus != RecognitionStatus.NoError)
                 return;

             // Best guess as a string, and as an alternate object.
             string result = recoResult.TopString;
             RecognitionAlternate topAlt = recoResult.TopAlternate;
         }
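To make the Output Matrix, Lexicon, and Beam Search labels on the Recognizer Architecture slide more concrete, here is a heavily simplified sketch in Python. It is not the Tablet PC recognizer's algorithm. It assumes exactly one character per ink segment, treats the output matrix as one score table per segment, and sums scores. The function name `beam_search` and the sample scores are hypothetical.

```python
def beam_search(output_matrix, lexicon, beam_width=3):
    """Return the best-scoring lexicon words for a sequence of ink segments.

    output_matrix: list with one dict per segment, mapping character -> score.
    lexicon: set of allowed words; partial strings that are not a prefix of
             some lexicon word are pruned immediately.
    """
    # Every prefix of every lexicon word, including the empty string.
    prefixes = {word[:i] for word in lexicon for i in range(len(word) + 1)}
    beam = [("", 0)]
    for column in output_matrix:
        candidates = []
        for prefix, score in beam:
            for ch, ch_score in column.items():
                extended = prefix + ch
                if extended in prefixes:
                    candidates.append((extended, score + ch_score))
        # Keep only the highest-scoring partial words (ties broken alphabetically).
        candidates.sort(key=lambda c: (-c[1], c[0]))
        beam = candidates[:beam_width]
    return [(word, score) for word, score in beam if word in lexicon]

# Hypothetical per-segment character scores for a three-segment word.
output_matrix = [
    {"d": 90, "c": 60},
    {"o": 80, "u": 70, "l": 40},
    {"g": 85, "t": 30},
]
lexicon = {"dog", "dug", "cog", "dot", "cut"}
top = beam_search(output_matrix, lexicon)
```

With these made-up scores the ranked list starts with "dog", followed by "dug" and "cog". It has the same shape as the slide's top 10 list, although the numbers are unrelated.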

  3. Factoids
     • Bias the recognizer towards certain types of content
       • DEFAULT
       • CURRENCY
       • NUMBER
       • TELEPHONE
       • EMAIL
       • UPPERCHAR

     Reading Journal Notes
     • Journal Reader to import .JNT
       • .JNT -> XML -> Custom Format
     • Journal format gives an initial parsing
     • You may want to undo this parsing and work with ink at the page level

     JNT Format
     • Journal Document
       • List of Journal Pages
     • Journal Page
       • List of Content
     • Content
       • Journal Drawing, Journal Paragraph, other stuff

     Journal Drawing
     • Uninterpreted Ink
     • Base64String

         if (childNode.Name.ToLower().Equals("inkobject")) {
             string base64Ink = childNode.InnerText;
             ink = new Ink();
             ink.Load(Convert.FromBase64String(base64Ink));
         }

     Text Structure
     • JournalParagraph
       • List of JournalLines
     • JournalLine
       • List of JournalInkWords
     • JournalInkWord
       • Alternate List
       • Uninterpreted Ink

     Shape recognition
     • Surprisingly challenging because of drawing artifacts
       • Open figures
       • Multiple strokes
       • Imprecise corners
       • Arrows
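The Journal Drawing snippet finds nodes named "inkobject" and decodes their Base64 text into raw ink bytes. The same traversal can be sketched in Python with only the standard library. This stops at the decoded bytes and does not load them into an ink object. Apart from `inkobject`, the element names in the sample XML are made up for illustration and are not the real Journal Reader schema.

```python
import base64
import xml.etree.ElementTree as ET

def extract_ink_blobs(xml_text):
    """Return the decoded bytes of every element whose tag is 'inkobject'.

    The tag comparison is case-insensitive, mirroring the
    Name.ToLower().Equals("inkobject") check in the slide's snippet.
    """
    root = ET.fromstring(xml_text)
    blobs = []
    for node in root.iter():
        if node.tag.lower() == "inkobject":
            blobs.append(base64.b64decode((node.text or "").strip()))
    return blobs

# Hypothetical document: the payload stands in for serialized ink.
payload = base64.b64encode(b"ink-bytes").decode("ascii")
sample = (
    "<JournalPage><Content>"
    f"<InkObject>{payload}</InkObject>"
    "</Content></JournalPage>"
)
blobs = extract_ink_blobs(sample)
```

In the C# version the decoded byte array is what gets passed to `ink.Load(...)`.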

  4. Basic approach for structure recognition
     • Grouping by rectangular region
     • Heuristics for separating regions
       • White space
       • Separating lines

     Structure recognition

     General approach to recognition/classification
     • Extraction of features
       • Objects become points in high dimensional space
     • Construct mapping from features to classes

     Clustering

     Learning

     Heuristic
     • Programmatically determine classification based on features
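One way to read "grouping by rectangular region" with white space as the separator: sort stroke bounding boxes from top to bottom and start a new region whenever the vertical gap to the previous region is large enough. The Python sketch below shows that heuristic only. The box format `(left, top, right, bottom)`, the `min_gap` threshold, and the function name are assumptions, not details from the lecture.

```python
def group_by_vertical_gaps(boxes, min_gap=10):
    """Group bounding boxes into regions separated by vertical white space.

    boxes: iterable of (left, top, right, bottom) with y growing downward.
    A box joins the current region when the gap between its top and the
    region's bottom edge is smaller than min_gap; otherwise it starts a
    new region.
    """
    regions = []
    for box in sorted(boxes, key=lambda b: b[1]):
        if regions and box[1] - regions[-1]["bottom"] < min_gap:
            regions[-1]["boxes"].append(box)
            regions[-1]["bottom"] = max(regions[-1]["bottom"], box[3])
        else:
            regions.append({"bottom": box[3], "boxes": [box]})
    return [region["boxes"] for region in regions]

# Hypothetical stroke bounding boxes: three near the top, one lower down.
boxes = [(0, 0, 50, 20), (60, 5, 100, 22), (0, 60, 80, 80), (10, 25, 40, 30)]
regions = group_by_vertical_gaps(boxes)
```

Here the three upper boxes end up in one region and the lower box in another. Drawn separating lines, the other heuristic on the slide, would need a stroke-level test rather than a gap test.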

  5. Classification
     • Identify different types of text
       • Mathematics
       • Prose
       • Lists
       • Brainstorming
       • Code
     • Domains
       • Chemistry, Physics, Algorithms,

     Annotation
     • Identify annotation marks
       • Highlighted text
       • Circles
       • Check marks
       • Cross out
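The classification step can be approached heuristically: extract a few features, then decide the class with hand-written rules. The Python sketch below does this for already-recognized text lines and covers only four of the listed types. The features, thresholds, and names (`extract_features`, `classify`) are illustrative guesses, not the lecture's method. A real system would also use ink layout, not just text.

```python
import re

def extract_features(line):
    """Turn a recognized text line into a small feature dictionary."""
    stripped = line.strip()
    length = max(len(stripped), 1)
    return {
        # Share of characters that are digits or arithmetic/relational symbols.
        "math_ratio": sum(ch in "+-*/=^<>" or ch.isdigit() for ch in stripped) / length,
        # True when the line starts with "1.", "2)", "-", "*", or a bullet.
        "list_marker": bool(re.match(r"^(\d+[.)]|[-*•])\s+", stripped)),
        # Count of punctuation that is typical of source code.
        "code_marks": sum(stripped.count(tok) for tok in (";", "{", "}", "()")),
    }

def classify(line):
    """Map features to a class with fixed, hand-picked rules."""
    features = extract_features(line)
    if features["list_marker"]:
        return "List"
    if features["code_marks"] >= 2:
        return "Code"
    if features["math_ratio"] > 0.3:
        return "Mathematics"
    return "Prose"

labels = [classify(text) for text in (
    "1. Buy milk",
    "x = 3 + 4 * 2",
    "foo(); return x;",
    "Meet with the group on Tuesday",
)]
```

The rule order matters: a line such as "- 3 + 4" is treated as a list item because the list check runs first. Ambiguities like that are the kind a learned mapping from features to classes is meant to resolve.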
