federated search diagram

Federated Search Diagram Solution 1: Federate Searching aka - PowerPoint PPT Presentation


  1. Thoughts on Federated and Aggregated Search Architectures
     Enterprise Search Summit 2010 NY
     Avi Rappoport, Search Tools Consulting
     www.searchtools.com / consult2@searchtools.com
     Avi Rappoport / Enterprise Search Summit NY / May 2010 / consult2@searchtools.com

     Information Scatter
     • Internal
       • Local and remote file shares
       • Email
       • CMS/DMS
       • Application portals
       • Knowledge bases
       • Multimedia, digital assets
     • External
       • Research papers and other gated content
       • Public-facing sites
       • The Web

     Federated Search Diagram

     Solution 1: Federate Searching
     • aka “MetaSearch”
     • Single Search Interface
     • Accepts queries and converts to various formats
     • Sends queries to multiple external search engines
     • Includes user authentication
     • Collects result lists
     • In external search relevance order
     • Collates and sorts by relevance
     • Single list or in panels
     • Can be dynamically updated
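The Solution 1 steps (send the query to several engines, collect their result lists, collate and sort) can be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the two source functions and their hits are hypothetical stand-ins, and reciprocal-rank scoring is an assumption chosen because external engines rarely share a common score scale.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for external search engines. Each takes a query
# string and returns hits in that engine's own relevance order.
def search_journals(query):
    return [{"url": "j/1", "title": f"{query} survey"},
            {"url": "j/2", "title": f"{query} methods"}]

def search_news(query):
    return [{"url": "n/1", "title": f"{query} today"}]

SOURCES = {"journals": search_journals, "news": search_news}

def federated_search(query, sources=SOURCES):
    # Send the query to every source in parallel and collect the result lists.
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, query) for name, fn in sources.items()}
        per_source = {name: future.result() for name, future in futures.items()}

    # Collate: use reciprocal rank (1 for a source's top hit, 1/2 for its
    # second, ...) as an assumed common score across sources.
    merged = []
    for name, hits in per_source.items():
        for rank, hit in enumerate(hits, start=1):
            merged.append({**hit, "source": name, "score": 1.0 / rank})

    # Sort into a single list; the sort is stable, so ties keep source order.
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged

results = federated_search("solar")
```

Showing results in panels instead of a single list would simply skip the merge and render `per_source` as is.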

  2. Federated Results: Apple Site and Federated Results: Science.gov (screenshots)
     Callouts: Special sources, Product info, Store Items, Main Results, Support pages, Dynamic facets from results clustering

     Federate at Search Time
     graphic by AJ Summers, some rights reserved

     Solution 2: Aggregate Indexing
     • aka “Unified Information Access”
     • Gather all possible data
     • Robot crawlers on intranets
     • RSS blog feeds
     • Automated connectors
     • Custom scripts
     • Store in a single index
     • Include access control information
     • Simple to search all at once
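To make "store in a single index" with "access control information" concrete, here is a small Python sketch of one in-memory inverted index fed from several sources. The class name, the sample documents, and the group-based ACL model are all assumptions for illustration; a real aggregating engine would add ranking, persistence, and connectors.

```python
from collections import defaultdict

class AggregateIndex:
    """One inverted index for content gathered from many sources."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.acl = {}                     # doc id -> groups allowed to see it
        self.source = {}                  # doc id -> where the document came from

    def add(self, doc_id, text, source, allowed_groups):
        # Index every word and store the access control info next to the document.
        for term in text.lower().split():
            self.postings[term].add(doc_id)
        self.acl[doc_id] = set(allowed_groups)
        self.source[doc_id] = source

    def search(self, query, user_groups):
        terms = query.lower().split()
        if not terms:
            return []
        # Documents that contain every query term, whatever their source.
        matches = set.intersection(*(self.postings.get(t, set()) for t in terms))
        # Keep only documents whose stored ACL overlaps the user's groups.
        groups = set(user_groups)
        return sorted(d for d in matches if self.acl[d] & groups)

index = AggregateIndex()
index.add("wiki/1", "travel expense policy", "intranet crawler", ["staff"])
index.add("hr/7", "executive expense report", "file share connector", ["hr"])
index.add("blog/3", "expense tips", "RSS feed", ["staff", "hr"])

print(index.search("expense", ["staff"]))
```

A single query reaches all three sources at once, which is the "simple to search all at once" point on the slide.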

  3. Aggregate At Index Time
     graphic by AJ Summers, some rights reserved

     Aggregated Search Diagram

     Aggregate Results: HP.com

     Sources of Content
     Federating
     • News feeds and archives
     • Legal: Westlaw, Lexis
     • Government Documents
     • Patents, Census
     • Multi-national materials
     • Academic journal portals
     • Large social networks
     • The Web
     Aggregating
     • Lotus Notes
     • Enterprise intranets
     • File servers
     • Sharepoint
     • CMS/DMS
     • usually have awful search
     • Data warehouses
     • Current CRM
     • Legal discovery

  4. Keeping Content Current
     Federating
     • Source internal updates
     • Near-real-time
     • Content may change
     • Between queries
     • No notification
     • Minimal bandwidth
     Aggregating
     • Depend on connectors
     • Frequent polling
     • Automated notification
     • Programmatic triggers
     • Open Archives Initiative
     • Re-crawling
     • Merging Updates
     • Scale issues
     • Content may change
     • Between index runs
     • Some notification

     Preparation Process
     Federating:
     • Analyze sources
     • Test search connectors
     • Store source information
     Aggregating:
     • Index
     • Match data connectors
     • Open each file or record
     • Tokenize, stem words
     • De-duplicate
     • Taxonomy
     • Store words and documents
     • Scale issues
     • Hardware and software
     • Bandwidth requirements
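The aggregating preparation steps (open each record, tokenize, stem words, de-duplicate, store words and documents) might look like this in Python. It is only a sketch under stated assumptions: the suffix-stripping `stem` function is a deliberately crude stand-in for a real stemmer, and exact-duplicate detection by hashing the normalized terms is one simple choice among many.

```python
import hashlib
import re

def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())

def stem(word):
    # Crude suffix stripping for illustration only, not a real stemming algorithm.
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def prepare(records):
    seen = set()    # fingerprints of documents already stored
    index = {}      # stemmed term -> ids of documents containing it
    documents = {}  # doc id -> original text
    for doc_id, text in records:
        terms = [stem(token) for token in tokenize(text)]
        # De-duplicate: skip a record whose normalized terms were already seen.
        fingerprint = hashlib.sha1(" ".join(terms).encode()).hexdigest()
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        documents[doc_id] = text
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index, documents

records = [
    ("a", "Indexing files"),
    ("b", "indexing FILES!"),   # same content as "a" after normalization
    ("c", "Search connectors"),
]
index, documents = prepare(records)
```

Because the fingerprint is computed after tokenizing and stemming, copies that differ only in case or punctuation collapse into one stored document.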
     Security & Access Control
     Federating
     • Send user credentials with query
     • Depends on source security
     • Automation can be hard
     • Always current
     Aggregating
     • Early-binding
       • Index and store ACL info
       • Update index on changes
     • Late-binding
       • For each result
       • Send authorization request
       • OK if item is allowed
       • Repeat until 10 items are allowed

     Search and Retrieval Process
     Federating
     • Convert & send query
     • Z39.50, RDW
     • HTTP, Web Services
     • OAI - Open Archives
     • Custom connectors
     • Network and source speed
     • Collect results
     • Standardized formats, XML
     • Screen-scraping
     • Cache frequent results
     Aggregating
     • Single syntax
     • No delay
     • Results in standard format
     • Cache frequent results
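The late-binding loop described above (for each result, send an authorization request, keep the item if it is allowed, repeat until 10 items are allowed) can be sketched in Python as below. The `is_allowed` callback and the in-memory `ACL` table are hypothetical; in practice each check would be a call to the source system's security service.

```python
def late_binding_filter(ranked_results, user, is_allowed, page_size=10):
    """Walk the ranked list, checking access per item, until a page is full."""
    allowed = []
    checks = 0
    for doc_id in ranked_results:
        checks += 1  # one authorization request per candidate result
        if is_allowed(user, doc_id):
            allowed.append(doc_id)
            if len(allowed) == page_size:
                break  # stop as soon as the first page is filled
    return allowed, checks

# Hypothetical access table: alice may see only even-numbered documents.
ACL = {f"doc{i}": {"alice"} if i % 2 == 0 else {"bob"} for i in range(30)}

def is_allowed(user, doc_id):
    return user in ACL.get(doc_id, set())

ranked = [f"doc{i}" for i in range(30)]
page, checks = late_binding_filter(ranked, "alice", is_allowed)
print(page, checks)
```

The `checks` counter shows the cost of this approach: filling one page of 10 can take many more than 10 authorization requests when the user lacks access to much of the content.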

  5. Relevance Ranking
     Federating
     • Combine results listings
     • Duplicate detection here
     • Overall source relevance
     • May re-rank
     • Source ranking is quirky
     • Based on metadata
     • User activities
     Aggregating
     • One results listing
     • Still may need de-duping
     • Get single relevance rank
     • Very fast
     • IDF: Inverse Document Frequency
     • Rare words in index
     • Boost for more matches

     Checklist Per Data Source
     • Federation
       • Source interaction at search time
       • External content, databases, channels
       • Always-current content and access control
       • Slower response time, tricky relevance
     • Aggregation
       • Source interaction at index time
       • Very large index files
       • Content and access control updates trickier
       • Fast response time, straightforward relevance

     Be open-minded, analyze the benefits of each approach for each data source.
     Aggregator
     Post or blog about your experiences
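The IDF idea on the Relevance Ranking slide (rare words in the index count for more, and documents matching more query words get a boost) can be illustrated with a short Python sketch. The formula `log(N / df)` is one common textbook form of inverse document frequency, used here as an assumption; the slides do not say which variant any particular engine uses, and the sample documents are made up.

```python
import math

def rank(query, docs):
    """Score each document by the summed IDF of the query terms it contains."""
    n = len(docs)
    words_by_doc = {doc_id: set(text.lower().split()) for doc_id, text in docs.items()}
    scores = {}
    # dict.fromkeys removes repeated query terms while keeping their order.
    for term in dict.fromkeys(query.lower().split()):
        df = sum(term in words for words in words_by_doc.values())
        if df == 0:
            continue  # term appears nowhere in the index
        weight = math.log(n / df)  # rare terms get a larger weight
        for doc_id, words in words_by_doc.items():
            if term in words:
                # Each additional matching term adds to the document's score.
                scores[doc_id] = scores.get(doc_id, 0.0) + weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

docs = {
    "d1": "federated search engines",
    "d2": "search index",
    "d3": "federated search zebra",
}
ranking = rank("federated zebra search", docs)
```

Here "search" occurs in every document and so contributes nothing, while the rare word "zebra" dominates the ranking. This only works because an aggregated index knows the document frequency of every term, which a federator combining external result lists generally does not.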
