
Reinforcement Learning Lecture 18a
Gillian Hayes, 7th March 2007



1. Reinforcement Learning Lecture 18a. Gillian Hayes, 7th March 2007

2. Focussed Web Crawling Using RL
• Searching the web for pages relevant to a specific subject
• No organised directory of web pages
Web crawling: start at one root page, follow its links to other pages, follow their links to further pages, etc.
Focussed web crawling: crawl for a specific topic. Find the maximum set of relevant pages while traversing the minimum number of irrelevant pages.
Why try this?
• Less bandwidth, storage and time (an exhaustive crawl can take weeks – billions of web pages)
• Good for dynamic content – can do frequent updates
• Can get an index for a particular topic
Alexandros Grigoriadis, MSc AI, Edinburgh 2003, plus the CROSSMARC project – extracting multilingual information from the web in specific domains, e.g. laptop retail info, job adverts on companies' web pages

3. Web Crawler
[Diagram: pages are retrieved from the web and evaluated; relevant ones are stored in the Good Pages (base) set; links are extracted from each page, evaluated by the RL link scorer, and placed on the link queue]
• Link queue: the current set of links that still have to be visited. Fetch the link with the highest score on the queue
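As a rough illustration of the link-queue mechanics described on this slide (not the authors' code; the class and method names are invented), a minimal Python sketch of a score-ordered queue:

```python
import heapq

class LinkQueue:
    """Priority queue of links, ordered by estimated relevance score."""

    def __init__(self):
        self._heap = []     # entries are (-score, link) so heapq pops the best link first
        self._seen = set()  # avoid re-queueing links that were already added

    def push(self, link, score):
        if link not in self._seen:
            self._seen.add(link)
            heapq.heappush(self._heap, (-score, link))

    def pop_best(self):
        """Return the link with the highest score, or None if the queue is empty."""
        if not self._heap:
            return None
        neg_score, link = heapq.heappop(self._heap)
        return link
```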

4.
• Evaluate the page this link points to, based on a set of text/content attributes. If relevant, store it in Good Pages
• Get the links from the page
• Evaluate the links and add them to the link queue. Does the link point to a relevant page? Will it lead to relevant pages in future?
• Where can we use RL? In the link scorer

5. RL Crawling
• Reward when the crawler finds relevant pages
• Needs to recognise important attributes and follow the most promising links first
• Aim is to learn π*
• How to formulate the problem? What are the states? What are the actions? Alternatives:
  – State = a link, Action = {follow, don't follow}
  – State = web page, Actions = links
• Learn V? Must do a local search (lookahead) to get a policy
• Learn Q? More training examples needed, since Q(s,a) ranges over state–action pairs. But faster to use, since no lookahead is needed when crawling
• Choice: actions = links, and learn V using TD(λ)
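For reference, the standard tabular TD(λ) update behind this choice (notation is standard RL, not from the slides; the next slide replaces the table with a network and puts the trace on the weights):

$$
\delta_t = r_{t+1} + \gamma V(s_{t+1}) - V(s_t), \qquad
e_t(s) \leftarrow \gamma \lambda\, e_{t-1}(s) + \mathbf{1}[s = s_t], \qquad
V(s) \leftarrow V(s) + \alpha\, \delta_t\, e_t(s)
$$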

6. How to Characterise a State?
• Use a text analyser to come up with keywords for the domain – words that typically appear on web pages in this subject area
• Feature vector of 500 binary attributes: presence or absence of each keyword
• State space: 2^500 states ≈ 10^150 – too large for a table
• Use a neural network for function approximation to give V(s)
• Learn the weights of the network using temporal-difference learning
• Eligibility trace on the weights instead of on states
• Reward is 1/0 if the page is/is not relevant
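A minimal sketch of TD(λ) with an eligibility trace on the weights of a value function over the binary keyword features. This is not the authors' code: a linear approximator stands in for the lecture's neural network, and the hyperparameters are invented.

```python
import numpy as np

N_FEATURES = 500                      # binary keyword attributes per page
ALPHA, GAMMA, LAMBDA = 0.01, 0.9, 0.8  # hypothetical learning rate, discount, trace decay

weights = np.zeros(N_FEATURES)

def value(features):
    """Approximate V(s) for a binary feature vector describing a page."""
    return float(weights @ features)

def td_lambda_episode(pages, rewards):
    """One training episode: pages is the sequence of visited feature vectors,
    rewards[t] is 1 if the page reached at step t+1 is relevant, else 0."""
    global weights
    trace = np.zeros(N_FEATURES)               # eligibility trace on the weights
    for t in range(len(pages) - 1):
        s, s_next, r = pages[t], pages[t + 1], rewards[t]
        delta = r + GAMMA * value(s_next) - value(s)   # TD error
        trace = GAMMA * LAMBDA * trace + s             # for a linear V, the gradient w.r.t. weights is s
        weights += ALPHA * delta * trace
```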

7. State Values V
[Diagram: tabular case – state s indexes an entry V(s) in a table; function-approximation case – state s is encoded as a feature vector f(s), which a network maps to V(f(s))]

8. Learning Procedure
• Use a number of training sets of web pages, e.g. different companies' web sites containing some pages with job adverts, and start with a random policy
• Learn V^π; need to do GPI (generalised policy iteration) to get V*
• Then incorporate into a regular crawler: the RL neural net evaluates each page, and the V value is its score
• Which link to choose? Must do one-step lookahead – follow all links on the current page and evaluate the pages they lead to (see the sketch after this slide)
• Place new pages on the link queue according to score
• Follow the link at the front of the link queue, i.e. the one leading to the page with the highest likely relevance
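A sketch of the crawl loop with one-step lookahead. It is not the authors' implementation: it reuses the LinkQueue and value() sketches above and assumes hypothetical helpers fetch(url), extract_links(page), features(page) and is_relevant(page).

```python
def crawl(seed_url, max_pages=1000):
    """Focussed crawl: always follow the highest-scoring queued link,
    scoring each link by the learned V of the page it leads to."""
    queue = LinkQueue()
    queue.push(seed_url, score=0.0)
    good_pages = []

    for _ in range(max_pages):
        url = queue.pop_best()
        if url is None:
            break
        page = fetch(url)                      # assumed helper: retrieve the page
        if is_relevant(page):                  # assumed helper: text/content attribute test
            good_pages.append(url)

        # One-step lookahead: fetch the page each outgoing link leads to,
        # score it with the learned value function, and queue the link.
        for link in extract_links(page):       # assumed helper
            linked_page = fetch(link)
            score = value(features(linked_page))
            queue.push(link, score)

    return good_pages
```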

9. Performance
• Compared with the non-RL CROSSMARC web crawler, the RL crawler finds the relevant pages (when there is more than one) while following fewer links, but it fetches more pages because of the one-step lookahead
• Not so good at finding a single relevant page on a site
• Datasets: up to 2000 pages and 16,000 links, with a tiny number of relevant pages in each dataset; English and Greek; 1000 training episodes

10. Issues
Performance depends on:
• The graphical (link) structure of the pages
• The features chosen: many attributes were 0, so not discriminating enough
• Need to try bigger datasets
• The paper outlines alternative learning procedures
Andrew McCallum's CORA – searching computer science research papers
• Treated roughly as a bandit problem, learning Q(a); action a = a link on a web page together with the words in its neighbourhood
• Choose the link expected to give the highest future discounted reward
• 53,000 documents, half a million links; 3x increase in efficiency (number of links followed before 75% of the documents are found, vs. breadth-first search)
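A rough illustration of the bandit-style link scoring described for CORA (not McCallum's implementation): Q(a) is estimated from a bag-of-words vector over the words near the link, and the crawler greedily follows the highest-scoring link. The vocabulary, feature extraction and weights are all invented.

```python
import numpy as np

VOCAB = ["job", "vacancy", "career", "press", "contact"]   # hypothetical domain keywords

def link_features(neighbourhood_text):
    """Binary bag-of-words vector over the small hypothetical vocabulary."""
    words = neighbourhood_text.lower().split()
    return np.array([1.0 if w in words else 0.0 for w in VOCAB])

def q_value(weights, neighbourhood_text):
    """Estimated future discounted reward for following this link."""
    return float(weights @ link_features(neighbourhood_text))

def pick_link(weights, links):
    """links: list of (url, neighbourhood_text) pairs; return the URL with the best Q(a)."""
    return max(links, key=lambda item: q_value(weights, item[1]))[0]
```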

11. References
• Alexandros Grigoriadis, Georgios Paliouras: Focused crawling using temporal difference-learning. Proceedings of the Panhellenic Conference on Artificial Intelligence (SETN), Lecture Notes in Artificial Intelligence 3025, 142–153, Springer-Verlag, 2004.
• Andrew McCallum et al.: Building domain-specific search engines with ML techniques. Proceedings of the AAAI-99 Spring Symposium on Intelligent Agents in Cyberspace.
