
The CEAS Programming Language



  1. The CEAS Programming Language
     Luis Alonso (lra2103), Hila Becker (hb2143), Kate McCarthy (km2302), Isa Muqattash (imm2104)

  2. Motivation
     • Webpages contain clutter: extraneous menus, links, corporate logos, banners, etc.
     • Major disadvantages for
       − visually disabled users
       − users of handheld devices with constrained screens

  3. Solution: Crunch
     • Collection of heuristic filters that recognize and remove "clutter"
     • Implementation
       − Web proxy
       − GUI
       − Document Object Model
     • Avoids the "one size fits all" pitfall by adapting to the target application
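
To make the idea of a heuristic filter concrete, here is a minimal Python sketch. It is not Crunch's implementation: Crunch works on a Document Object Model, while this sketch uses the standard library's streaming `html.parser` for brevity. The class name `ClutterFilter`, the helper `remove_clutter`, and the default choice of blocked tags are all assumptions made for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class ClutterFilter(HTMLParser):
    """Hypothetical filter: re-emits HTML while dropping blocked elements."""

    def __init__(self, blocked_tags):
        super().__init__(convert_charrefs=True)
        self.blocked = set(blocked_tags)
        self.skip_depth = 0  # > 0 while inside a blocked, non-void element
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.blocked:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if self.skip_depth:
            return  # nested inside a blocked element: drop it
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in self.blocked:
            if tag not in VOID_TAGS and self.skip_depth:
                self.skip_depth -= 1
            return
        if self.skip_depth or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def remove_clutter(html_text, blocked_tags=("script", "img", "iframe")):
    """Return html_text with the blocked elements (and their contents) removed."""
    parser = ClutterFilter(blocked_tags)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Comments and doctype declarations are silently dropped by this sketch. A real filter would also need heuristics beyond tag names (for example, recognizing link lists or empty tables), which are out of scope here.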

  4. CEAS
     • A programmatic interface to the Crunch system
     • Defines a simple and rich set of commands for content extraction
     • Features
       − Tabbed browsing
       − Offline browsing
       − User-defined filters via functions

  5. Language Characteristics
     • Program flow control
       − Proceeds from the first statement in a file to the last
     • Data types
       − int, boolean, String, void, Page, and List
     • Operators
       − +, -, *, /, %, >, <, >=, <=, ==, !=, !, &, |
     • Control structures
       − if, while, do while, for
     • Functions
       − createPage(), extract(), append(), title(), rank(), status(), length(), print(), println(), show(), savePage()
     • Extraction constants
       − IMAGES, ADS, FLASH, SCRIPTS, TXTLINKS, IMGLINKS, XTRNSTYLE, STYLES, FORMS, LINKLISTS, EMPTYTBLS, INPUT, META, BUTTON, and IFRAME

  6. Page Data Type
     • Creating a Page
         Page createPage("http://www.cnn.com/");
         Page createPage(webpage);
         Page createPage("http://www.nytimes.com/", "pages/world/index.html");
     • Manipulating a Page
         extract(webpage, IMAGES, ADS);
         append(webpage, "next");
         title(webpage, "My Webpage");
         rank(webpage, 2);
         status(webpage);
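
The two-argument form of `createPage` takes what looks like a base URL and a relative path. The slide does not say how the two are combined; a plausible reading is ordinary relative-URL resolution. The Python sketch below illustrates that reading with the standard `urllib.parse.urljoin`. The helper name `resolve_page_url` is hypothetical and is not part of CEAS.

```python
from urllib.parse import urljoin


def resolve_page_url(base, relative=None):
    """Hypothetical helper mirroring createPage(url) and createPage(base, path).

    Assumption: the second argument is resolved against the first the way a
    browser resolves a relative link.
    """
    if relative is None:
        return base
    return urljoin(base, relative)
```

Under this assumption, `resolve_page_url("http://www.nytimes.com/", "pages/world/index.html")` yields `http://www.nytimes.com/pages/world/index.html`.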

  7. Some Simple Examples
         Page blog1 = createPage("http://www.myblog.com/latestpost");
         extract(blog1, IMAGES);
         show(blog1);

         List pl[3];
         pl[0] = createPage("http://www.myblog.com/latestpost");
         pl[1] = createPage("http://www.otherblog.com/latest");
         pl[2] = createPage("http://www.newblog.com/");
         for (int i=0 : 3) {
           extract(pl[i], IMAGES);
         }
         show(pl);

  8. High-Level Implementation
     [Diagram: CEAS source → Lexer → Parser → AST → Semantic Analyzer → AST (if no errors) → Interpreter. The Interpreter uses Crunch on http://... pages and sends output as a file to disk or a file to the browser.]
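
The first stage in this pipeline is the lexer. The project's actual lexer grammar is not shown in the slides, so the following is only a Python sketch of how CEAS-like source text could be split into tokens. The token categories (`STRING`, `INT`, `IDENT`, `OP`, `PUNCT`) and the function name `tokenize` are assumptions; the operator set is taken from the Language Characteristics slide.

```python
import re

# Order matters: earlier alternatives win, and two-character operators
# are listed before their one-character prefixes.
TOKEN_SPEC = [
    ("STRING", r'"[^"\n]*"'),
    ("INT", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r">=|<=|==|!=|[-+*/%<>!&|=]"),
    ("PUNCT", r"[()\[\]{},;:]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split source into (kind, text) pairs, discarding whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

A parser would consume this token list to build the AST that the semantic analyzer and interpreter then walk.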

  9. Implementation Details
     • Semantic Analyzer
       − Uses simplified data type
       − Every line checked
     • Interpreter
       − Data types actually store values
       − Performs calculations
       − Uses Crunch to extract from web pages
       − Displays HTML using built-in browser
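
The split described here, a semantic analyzer that tracks only types and an interpreter that stores real values, can be illustrated with two walks over the same tree. The sketch below is in Python and uses a made-up tuple-based AST (`("int", 3)`, `("binop", op, left, right)`); the node shapes, the operator subset, and the function names `check`, `evaluate`, and `run` are assumptions, not the project's code.

```python
import operator

ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul}
COMPARE = {"<": operator.lt, ">": operator.gt, "==": operator.eq}


def check(node):
    """Semantic pass: return the node's type name without computing a value."""
    kind = node[0]
    if kind in ("int", "boolean"):
        return kind
    _, op, left, right = node
    left_type, right_type = check(left), check(right)
    if left_type == right_type == "int":
        if op in ARITH:
            return "int"
        if op in COMPARE:
            return "boolean"
    raise TypeError(f"cannot apply {op!r} to {left_type} and {right_type}")


def evaluate(node):
    """Interpreter pass: compute the actual value; assumes check() succeeded."""
    kind = node[0]
    if kind in ("int", "boolean"):
        return node[1]
    _, op, left, right = node
    fn = ARITH.get(op) or COMPARE[op]
    return fn(evaluate(left), evaluate(right))


def run(node):
    """Interpret only if the semantic pass reports no errors."""
    check(node)
    return evaluate(node)
```

For example, the tree for `1 + 2 < 5` checks as `boolean` and evaluates to `True`, while `1 + true` is rejected by `check` before any value is computed.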

  10. Implementation Details
     • Functions
       − Static symbol table for functions
       − Function body, as AST, stored in table
     • Variables
       − Nested symbol tables for each scope
       − Create new interpreter for included files
       − Page types are treated as Java-like objects
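
A common way to realize nested symbol tables is a chain of scopes, each pointing at its parent, alongside one shared table for functions. The Python sketch below shows that pattern; the names `Scope`, `FUNCTIONS`, and `define_function` are hypothetical, and the slides do not say whether the project's tables behave exactly this way (for instance, whether inner scopes may shadow outer names).

```python
class Scope:
    """One level of a nested symbol table; lookups fall back to the parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.symbols = {}

    def declare(self, name, value):
        if name in self.symbols:
            raise NameError(f"{name!r} is already declared in this scope")
        self.symbols[name] = value

    def lookup(self, name):
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"{name!r} is not defined")


# Single shared table for functions. Each entry keeps the parameter names
# and the function body as an AST (represented here by any Python object).
FUNCTIONS = {}


def define_function(name, params, body_ast):
    FUNCTIONS[name] = (params, body_ast)
```

Entering a block such as a `for` body would create `Scope(parent=current)`, and leaving the block simply discards it, so names declared inside never leak outward.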

  11. Testing
     • Challenges
       − LRM
       − Complexity
     • What to look for
       − Code that should compile
       − Code that should not compile
     • Unit Testing
       − Lexer/Parser
       − Overall Compiler

  12. Testing
     • Methodology
       − Overall process
       − Test harness
     • Types of Tests
       − False positives
       − False negatives
       − Visual cases (title, show, rank)
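
A harness for "should compile" and "should not compile" cases can be very small. The Python sketch below is one way to structure it and is not the team's harness. It assumes that a false positive means invalid code that was accepted and a false negative means valid code that was rejected; the slides do not define the terms. `toy_compiles` is a deliberately trivial stand-in for the real front end, and both function names are hypothetical.

```python
def run_suite(compiles, cases):
    """Run (source, should_compile) cases against a compiles(source) -> bool callable.

    Returns (false_positives, false_negatives) as lists of source strings.
    """
    false_positives = []  # accepted, but should have been rejected
    false_negatives = []  # rejected, but should have been accepted
    for source, should_compile in cases:
        accepted = compiles(source)
        if accepted and not should_compile:
            false_positives.append(source)
        elif not accepted and should_compile:
            false_negatives.append(source)
    return false_positives, false_negatives


def toy_compiles(source):
    """Stand-in checker (assumption): balanced parentheses and a trailing ';'."""
    depth = 0
    for ch in source:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and source.rstrip().endswith(";")


CASES = [
    ("show(blog1);", True),   # should compile
    ("show(blog1;", False),   # missing ')'
    ("show(blog1)", False),   # missing ';'
]
```

Visual cases such as `title`, `show`, and `rank` cannot be checked this way, since their results have to be inspected in the browser.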

  13. Lessons Learned
     • Good planning is extremely important
     • Really evaluate design before implementation
     • Keep everyone in the loop
     • Stick to your deadlines
     • Good communication is key
