
CAIM: Cerca i Anàlisi d’Informació Massiva. FIB, Grau en Enginyeria Informàtica



  1. CAIM: Cerca i Anàlisi d’Informació Massiva
     FIB, Grau en Enginyeria Informàtica
     Slides by Marta Arias, José Luis Balcázar, Ramon Ferrer-i-Cancho, Ricard Gavaldà
     Department of Computer Science, UPC
     Fall 2020
     http://www.cs.upc.edu/~caim

  2. 0. Presentation

  3. COVID-19
     ◮ Follow the instructions that FIB has sent you.
     ◮ Always sit in the same place.
     ◮ Write down your row and column somewhere so that you can remember them.

  4. Instructors
     ◮ Ramon Ferrer-i-Cancho (lectures + exercises 10 & 20; lab 12)
        ◮ rferrericancho@cs.upc.edu
        ◮ Omega S124, 93 413 4028
     ◮ Ignasi Gómez (lab 11, 21 & 22)
        ◮ ignasi.gomez@upc.edu
     ◮ Javier Béjar (lab 13)
        ◮ bejar@cs.upc.edu
        ◮ Omega 204, 93 413 7879

  5. Class Logistics
     ◮ Fridays, 12–14 (A6E01), 15–17 (A6E02)
        ◮ Theory and exercises. Often, exercises will be proposed in advance.
     ◮ Thursdays, lab sessions
        ◮ Guided lab activities; each session is expected to be complemented with about 2 additional hours of autonomous work on average.
        ◮ Some lab sessions will end with handing in a short written report; these reports count towards the course evaluation.

  6. Lab work: important rules
     ◮ Lab work is done in pairs. Exceptions require prior permission.
     ◮ This semester: keep the same partner for the whole semester (see the instructions on the Racó).
     ◮ Do not exchange information with others beyond general ideas; doing so will be considered plagiarism.

  7. Exercises
     ◮ In class, we will solve only some of the proposed exercises.
     ◮ You are strongly encouraged to try to solve the rest of the exercises.
     ◮ Self-study: one or more small topics will not be explained in class. They will appear in the exam.

  8. Evaluation
     ◮ Evaluation: as per the “Guia Docent” (course guide)
     ◮ Partial exam 1 (P1): November 5, 16:00–17:30 (during the partial exam week)
     ◮ Partial exam 2 (P2): January 11, 2021, 15:00–18:00
     ◮ On the day of partial exam 2 you may instead choose to take a final exam (F) covering the whole course
     ◮ Grade: 40% Lab + max(30% P1 + 30% P2, 60% F)
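As a worked illustration of the grading formula on the Evaluation slide, here is a minimal Python sketch. The function name `course_grade`, the 0–10 grading scale, and the treatment of an exam that was not taken as a score of 0 are assumptions made for this example; the “Guia Docent” remains the authoritative source.

```python
def course_grade(lab: float, p1: float, p2: float = 0.0, final: float = 0.0) -> float:
    """Combine marks as 40% Lab + max(30% P1 + 30% P2, 60% F).

    All marks are assumed to be on a 0-10 scale (an assumption, not stated
    on the slide). An exam that was not taken is assumed to count as 0.
    """
    # Exam component: the better of the two partial exams or the final exam.
    exams = max(0.3 * p1 + 0.3 * p2, 0.6 * final)
    return 0.4 * lab + exams


if __name__ == "__main__":
    # Student who takes both partial exams.
    print(round(course_grade(lab=8.0, p1=6.0, p2=7.0), 2))
    # Student who takes the final exam instead of the second partial exam.
    print(round(course_grade(lab=8.0, p1=4.0, final=7.0), 2))
```

Because of the max, a weak first partial exam can be offset by taking the final exam instead of the second partial exam, while the lab always contributes its fixed 40%.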

  9. Contents I
     First half (until midterm):
     ◮ Core Information Retrieval:
        ◮ Introduction: Concept. The IR process
        ◮ Information Retrieval Models
        ◮ Indexing and Searching, Implementation
        ◮ Information Retrieval Evaluation, Feedback Models
     ◮ Web Search:
        ◮ Link analysis: PageRank
        ◮ Crawling the web
        ◮ Architecture of a Web search system

  10. Contents II
     Second half:
     ◮ The “Big Data” Slogan
     ◮ Architecture of large-scale web search systems
     ◮ The Map-Reduce paradigm
     ◮ Introduction to NoSQL databases
     ◮ The Apache ecosystem for web search
     ◮ Social Network Analysis:
        ◮ Characterization of real complex networks
        ◮ Communities, influence, information diffusion
     ◮ Clustering and Locality Sensitive Hashing
     ◮ Recommender Systems

  11. Bibliography
     ◮ R. Baeza-Yates, B. Ribeiro-Neto: Modern Information Retrieval (2nd ed.). Addison Wesley, 2010.
     ◮ I.H. Witten, A. Moffat, T. Bell: Managing Gigabytes. Morgan Kaufmann, 1999.
     ◮ C.D. Manning, P. Raghavan, H. Schütze: Introduction to Information Retrieval. Cambridge, 2008.
     ◮ Z. Markov, D.T. Larose: Data Mining the Web. Wiley, 2007.
     ◮ M. Russell: Mining the Social Web: Analyzing Data from Facebook, Twitter, LinkedIn, and Other Social Media Sites. O’Reilly, 2011.
     ◮ ... There’s a whole web out there
