
Search Engine for Shoah Foundation



  1. 9/23/2009

     Search Engine for Shoah Foundation
     Presented by Ali Khodaei (khodaei@usc.edu)

     Team Members
     • Ali Khodaei
     • Kaveh Shahabi
     • Sangeetha U Santharam

     Project Motivation
     • Existence of a huge set of useful data
       – Over 50,000 video testimonies
       – Each divided into one-minute segments
       – Each segment tagged with a set of keywords
     • Good amount of spatial and textual data
     • Lack of a location-based search engine
       – Lack of an interface to ask for spatial data
       – Lack of a ranking/scoring function to rank/score documents based on space and text simultaneously

     Project Definition
     • Robust, efficient and interactive search engine ranking testimonies based on a combination of
       – Textual (regular) keywords
       – Spatial keywords
     • This search engine finds and ranks the most textually and spatially relevant testimonies (segments) according to
       – query keywords
       – query location

     Input
     • Query Keywords
       – Set of keywords entered as text
     • Query Location
       – A region drawn on the map, OR
       – A spatial keyword entered as text

     Output
     [slide content not present in the extracted text]
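The Input slide above can be pictured as a small data model. The following is only a sketch under stated assumptions: the class and function names (`Region`, `Segment`, `Query`, `candidates`), the use of a bounding box to stand in for a region drawn on the map, and the sample segments are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Region:
    """Bounding box in degrees (assumed stand-in for a region drawn on the map)."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


@dataclass(frozen=True)
class Segment:
    """A one-minute segment of a testimony, tagged with keywords."""
    testimony_id: int
    minute: int
    keywords: frozenset
    spatial_keywords: frozenset
    lat: Optional[float] = None  # assumed to be filled in by geocoding
    lon: Optional[float] = None


@dataclass(frozen=True)
class Query:
    keywords: frozenset
    region: Optional[Region] = None        # a region drawn on the map, OR
    spatial_keyword: Optional[str] = None  # a place name entered as text


def spatially_matches(segment: Segment, query: Query) -> bool:
    """Check the query location against the segment (region takes precedence)."""
    if query.region is not None:
        return (segment.lat is not None and segment.lon is not None
                and query.region.contains(segment.lat, segment.lon))
    if query.spatial_keyword is not None:
        return query.spatial_keyword.lower() in segment.spatial_keywords
    return True  # no location given: do not filter on space


def candidates(segments, query):
    """Segments sharing at least one query keyword and matching the location."""
    return [s for s in segments
            if s.keywords & query.keywords and spatially_matches(s, query)]


# Made-up sample data, for illustration only.
segments = [
    Segment(1, 0, frozenset({"family", "school"}), frozenset({"krakow"}), 50.06, 19.94),
    Segment(1, 1, frozenset({"school"}), frozenset({"warsaw"}), 52.23, 21.01),
    Segment(2, 5, frozenset({"music"}), frozenset({"krakow"}), 50.06, 19.94),
]

by_region = candidates(
    segments, Query(frozenset({"school"}), region=Region(49.5, 19.0, 50.5, 20.5)))
by_name = candidates(
    segments, Query(frozenset({"school"}), spatial_keyword="Warsaw"))
```

This sketch only filters candidate segments; ordering them by relevance is the job of the ranking function listed under the research tasks.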

  2. 9/23/2009

     System Components
     [architecture diagram; recoverable labels: SHOA DB; Data Extraction and cleansing; RAW; Formatted DB; Video DB; Load Video; Index Structure; Creating index structure (one time); Index access; Readonly Web Service; GUI (Client Side); Web Application (Server Side); "Together handles sessions, user interactions, and events"; "Mid tier consists of all the core functionalities"]

     Tasks
     1- Data tier
     – Data Cleansing
       • Understand / format / standardize the data
     – Geocoding / GeoTagging
       • Find missing lat/long information for some of the spatial keywords
       • Assign appropriate geographical information to each testimony/segment
     – Index Construction
       • Create inverted files for regular keywords
       • Create inverted files for spatial keywords

     Tasks
     2- Middle tier
     – Intelligent web-services
       • Talk to interface
         – Receive input (query parameters)
         – Send output (query result)
       • Talk to data tier
         – Get data
         – Access index
         – Access video database
       • Perform necessary operations
         – Process data
         – Calculate scores
         – Format the results

     Tasks
     3- Interface (GUI)
     – User-friendly interface to receive input from the user
       • Textbox for textual keywords
       • Map interface to draw/show query location
         – A textbox can be used to input a location's name
     – Displays the result dynamically and interactively
       • Results should change on-the-fly based on map location
     – Provides a mechanism to show the testimonies from the interface
       • Show testimonies on the same page
       • Link to a new page for showing the testimonies

     Tasks
     4- Research/Algorithm
     – Hybrid index structure
       • Captures spatial and textual keywords (probably using inverted files) simultaneously and efficiently
     – Relevance ranking function
       • Formulas for spatial and textual scores
       • A combined scoring function with different weights for different features
     – Spatial representation of each segment and/or testimony's spatial data

     Break-down + Schedule
     • Data tier
       – Understand / format / cleanse (/geocode) / transfer the data
         • 4 weeks Sangy + Ali
       – Come up with index structure schema for the middle layer
         • 2 weeks Ali
       – Create/implement the actual index structure
         • 4 weeks Ali + Sangy
       – Integration/extra, ..
         • 1 week Ali
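The index-construction and ranking tasks above can be illustrated with a small sketch. The slides give no formulas, so everything here is an assumption made for illustration: the function names, the "fraction of query terms matched" score, the default weights, and the sample tags are hypothetical and are not the project's actual index structure or ranking function.

```python
from collections import defaultdict


def build_inverted_file(tagged):
    """Map each keyword to the set of segment ids tagged with it."""
    index = defaultdict(set)
    for seg_id, keywords in tagged.items():
        for kw in keywords:
            index[kw.lower()].add(seg_id)
    return index


def match_fraction(index, query_terms, seg_id):
    """Fraction of query terms whose posting set contains seg_id (assumed score)."""
    terms = [t.lower() for t in query_terms]
    if not terms:
        return 0.0
    hits = sum(1 for t in terms if seg_id in index.get(t, ()))
    return hits / len(terms)


def rank(text_index, spatial_index, keywords, places, w_text=0.5, w_space=0.5):
    """Rank segments by a weighted sum of a textual and a spatial score."""
    # Candidates: every segment that matches at least one query term.
    found = set()
    for t in keywords:
        found |= text_index.get(t.lower(), set())
    for p in places:
        found |= spatial_index.get(p.lower(), set())

    scored = [
        (seg_id,
         w_text * match_fraction(text_index, keywords, seg_id)
         + w_space * match_fraction(spatial_index, places, seg_id))
        for seg_id in found
    ]
    # Highest score first; ties broken by segment id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Made-up tags for three one-minute segments, for illustration only.
text_tags = {"t1-m00": ["family", "school"], "t1-m01": ["school"], "t2-m05": ["music"]}
place_tags = {"t1-m00": ["Krakow"], "t1-m01": ["Warsaw"], "t2-m05": ["Krakow"]}

text_index = build_inverted_file(text_tags)
spatial_index = build_inverted_file(place_tags)

results = rank(text_index, spatial_index, ["school"], ["Krakow"])
text_heavy = rank(text_index, spatial_index, ["school"], ["Krakow"],
                  w_text=0.8, w_space=0.2)
```

Keeping two separate inverted files mirrors the data-tier task list. A hybrid structure as described under Research/Algorithm would presumably merge them, but the slides do not say how.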

  3. 9/23/2009

     Break-down + Schedule
     • Research / Algorithm
       – Spatial representation of each segment and/or testimony's spatial data
         • 1.5 weeks Ali + Sangy
       – Relevance ranking function, formulas for spatial and textual scores
         • 2.5 weeks Ali

     Break-down + Schedule
     • Middle layer development
       – Creating prototypes / connectivity to the interface
         • 3 weeks Kaveh
       – [1.5 weeks wait for data tier]
       – Create code for ranking function
         • 2.5 weeks Kaveh
       – Create code for video
         • 2 weeks Kaveh
       – Integration/testing
         • 1 week Kaveh

     Break-down + Schedule
     • Web-development
       – Static/complete GUI (no functionality)
         • 3 weeks Sangy
       – Adding functionality
         • 2 weeks Sangy + Kaveh
       – Adding Ajax and dynamic features
         • 4 weeks Kaveh + Ali
       – Integration/test
         • 1 week Kaveh + Sangy + Ali

     Tasks for Sangy
     [Gantt chart; axes: Tasks vs. Time (ticks 2 to 12); tasks: Data understanding / cleansing; Data format / Geocode; Static/complete GUI; Middle layer functionality; Implement Spatial Index; Integration / Testing]

     Tasks for Kaveh
     [Gantt chart; axes: Tasks vs. Time (ticks 2 to 12); tasks: Prototyping mid tier; Adding functionality to middle layer/interface; Coding: searching & ranking; Coding: Video Functionality; Ajax/dynamic features; Integration / Testing]

     Tasks for Ali
     [Gantt chart; axes: Tasks vs. Time (ticks 2 to 12); tasks: Data understanding / cleansing / geo-tagging; Index structure schema for the middle layer; Relevance ranking function; Implement Spatial Index; Ajax/dynamic features; Integration / Testing]

  4. 9/23/2009

     Milestones and Deliverables
     [timeline chart; axes: Milestone vs. Time (ticks 2 to 12)]
     • 10/06/09 Prototype
     • 10/30/09 Working Model with full functionality including Indexing/Ranking
     • 11/18/09 Complete GUI with AJAX and Video embedding

     Deliverables
     1) Prototype of system having a static (non-functional) interface
        – 4th week
     2) System with actual ranking/index structure and end-to-end functionality
        – 9th week
     3) (2) + Ajax + video embedding
        – 11th week

     Resources
     • Data
       – Provided by Shoah Foundation
         • Data stored in Sybase tables
         • Needs to be cleansed, formatted and transferred
     • Software
       – MS Visual Studio .Net
       – Oracle 10g +
     • Hardware
       – Windows Server (+IIS)
