SLIDE 1 INTRODUCTION TO AUTONOMOUS MOBILE ROBOTS
Lesson 5 – Low-level control and learning
Anders Lyhne Christensen, D6.05, anders.christensen@iscte.pt
SLIDE 2
Overview
Low-level control
Ad-hoc
Sense-think-act loop
Event-driven control
Learning and adaptation I
Types of learning
Issues in learning
Example: Q-learning applied to a real robot
(next time, we will discuss an interesting approach to learning called evolutionary robotics in more detail)
SLIDE 3
Low-level control
We will cover three types of low-level control:
Stream of instructions
Classic control loop
Event-driven languages
Other approaches such as logic programming exist, but we will not cover those in this course.
SLIDE 4 Stream of instructions
Example:
// Move forward for 2 seconds:
moveForward(speed = 10)
sleep(2000)
if (obstacleAhead()) {
    turnLeft(speed = 10)
    sleep(1000)
} else {
    …
}
Suitable for industrial, assembly-line robots:
Easy to describe a fixed, predefined task as a recipe
Little branching
SLIDE 5 Classic control loop
Sense → Think → Act
The loop usually has a fixed duration, e.g. 100 ms, and is executed repeatedly
SLIDE 7 Classic control loop
while (!Button.ESCAPE.isPressed()) {
    long startTime = System.currentTimeMillis();
    sense();   // read sensors
    think();   // plan next action
    act();     // do next action
    try {
        // Sleep for the remainder of the 100 ms period (skip if the loop body overran):
        long remaining = 100 - (System.currentTimeMillis() - startTime);
        if (remaining > 0) {
            Thread.sleep(remaining);
        }
    } catch (InterruptedException e) {}
}
SLIDE 8 Event-driven languages
URBI script – examples:
Ball tracking:
whenever (ball.visible) {
    headYaw.val   += camera.xfov * ball.x &
    headPitch.val += camera.yfov * ball.y
};
Interaction:
at (speech.hear("hello")) { voice.say("How are you?") & robot.standup(); }
SLIDE 9 Distributed and event-driven
[Diagram: right-motor, left-motor, and proximity-sensor microcontrollers, each connected to a shared event bus]
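To make the event-bus idea concrete, below is a minimal publish/subscribe sketch in Java. It is illustrative only: the EventBus class, topic names, and handler signatures are assumptions, not the API of any real robot middleware; on a real robot, each microcontroller would publish its sensor events and subscribe to actuation commands over such a bus.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Minimal publish/subscribe event bus (illustrative, not a real robot API):
class EventBus {
    private final Map<String, List<Consumer<Object>>> handlers = new HashMap<>();

    // Register a handler for a topic, e.g. "obstacle" or "motor.left":
    void subscribe(String topic, Consumer<Object> handler) {
        handlers.computeIfAbsent(topic, t -> new ArrayList<>()).add(handler);
    }

    // Deliver an event to every handler subscribed to its topic:
    void publish(String topic, Object payload) {
        for (Consumer<Object> h : handlers.getOrDefault(topic, new ArrayList<>())) {
            h.accept(payload);
        }
    }
}

// Example: the proximity-sensor controller publishes, a motor controller reacts:
//   bus.subscribe("obstacle", distance -> stopMotors());   // stopMotors() is hypothetical
//   bus.publish("obstacle", 0.12);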
SLIDE 10 LEARNING AND ADAPTATION
Based on slides from Prof. Lynn E. Parker
SLIDE 11
What is Learning/Adaptation?
Many definitions:
“Modification of behavioral tendency by experience.” (Webster 1984)
“A learning machine, broadly defined, is any device whose actions are influenced by past experiences.” (Nilsson 1965)
“Any change in a system that allows it to perform better the second time on repetition of the same task or on another task drawn from the same population.” (Simon 1983)
“An improvement in information processing ability that results from information processing activity.” (Tanimoto 1990)
Our operational definition:
Learning produces changes within an agent that over time enable it to perform more effectively within its environment.
SLIDE 12
What is the Relationship between Learning and Adaptation?
Evolutionary adaptation: Descendants change over long time scales based on the success or failure of their ancestors in the environment
Structural adaptation: Agents adapt their morphology with respect to the environment
Sensor adaptation: An agent’s perceptual system becomes more attuned to its environment
Behavioral adaptation: An agent’s individual behaviors are adjusted relative to one another
Learning: Essentially anything else that results in a more ecologically fit agent (can include adaptation).
SLIDE 13 Habituation and Sensitization
Adaptation may produce habituation
Habituation: An eventual decrease in or cessation of a behavioral response when a stimulus is presented numerous times
Useful for eliminating spurious or unnecessary responses
Generally associated with relatively insignificant stimuli, such as loud noise
Sensitization: The opposite – an increase in the probability of a behavioral response when a stimulus is repeated frequently
Generally associated with “dire” stimuli, like electrical shocks
[Figure: response curves illustrating sensitization and an example of habituation]
SLIDE 14
Learning
Learning, on the other hand, can improve performance in additional ways:
Introducing new knowledge (facts, behaviors, rules) into the system
Generalizing concepts from multiple examples
Specializing concepts for particular instances that are in some way different from the mainstream
Reorganizing the information within the system to be more efficient
Creating or discovering new concepts
Creating explanations of how things function
Reusing past experiences
SLIDE 15 AI Research has Generated Several Learning Approaches
- Reinforcement learning: Rewards and/or punishments are used to alter numeric values in a controller
- Evolutionary learning: Genetic operators such as crossover and mutation are applied over populations of controllers, leading to more efficient control strategies
- Neural networks: A form of learning that uses specialized architectures in which learning occurs as the result of alterations in synaptic weights
- Learning from experience:
  - Memory-based learning: Myriad individual records of past experiences are used to derive function approximators for control laws
  - Case-based learning: Specific experiences are organized and stored as a case structure, then retrieved and adapted as needed based on the current situational context
SLIDE 16
Learning Approaches (cont’d.)
Inductive learning: Specific training examples are used, each in turn, to generalize and/or specialize concepts or controllers
Explanation-based learning: Specific domain knowledge is used to guide the learning process
Multistrategy learning: Multiple learning methods compete and cooperate with each other, each specializing in what it does best
SLIDE 17
Challenges with Learning
Credit assignment problem: How is credit or blame assigned to a particular piece or pieces of knowledge in a large knowledge base, or to the components of a complex system, responsible for either the success or failure of an attempt to accomplish a task?
Saliency problem: What features in the available input stream are relevant to the learning task?
New term problem: When does a new representational construct (concept) need to be created to capture some useful feature effectively?
Indexing problem: How can a memory be efficiently organized to provide effective and timely recall to support learning and improved performance?
Utility problem: How does a learning system determine that the information it contains is still relevant and useful? When is it acceptable to forget things?
SLIDE 18 Example: Q-Learning Algorithm
Provides the ability to learn by determining which behavioral actions are most appropriate for a given situation
State-action table Q(x,a): one row per state x, one column per action a; each entry stores the learned utility of taking action a in state x. E(y) denotes the utility of the resulting next state y.
SLIDE 19 Update function for Q(x,a)
Q(x,a) ← Q(x,a) + β(r + λE(y) − Q(x,a))
Where:
- β is the learning rate parameter
- r is the payoff (reward or punishment)
- λ is a parameter, called the discount factor, ranging between 0 and 1
- E(y) is the utility of the state y that results from the action, computed by: E(y) = max(Q(y,a)) for all actions a
Rewards are propagated across states so that rewards from similar states can facilitate learning, too.
What is a “similar state”? One approach: weighted Hamming distance
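As a concrete illustration, a minimal Java sketch of this tabular update follows, assuming states and actions are indexed by integers; the names (QTable, beta, lambda) and the parameter values are illustrative assumptions, not taken from the original work.

// Minimal sketch of the tabular Q-update above (illustrative names and values):
class QTable {
    final double[][] q;            // q[state][action], initialized to 0
    final double beta = 0.5;       // learning rate (assumed value)
    final double lambda = 0.9;     // discount factor (assumed value)

    QTable(int numStates, int numActions) {
        q = new double[numStates][numActions];
    }

    // E(y) = max over all actions a of Q(y,a):
    double utility(int y) {
        double best = q[y][0];
        for (double v : q[y]) {
            best = Math.max(best, v);
        }
        return best;
    }

    // Q(x,a) <- Q(x,a) + beta * (r + lambda * E(y) - Q(x,a)):
    void update(int x, int a, double r, int y) {
        q[x][a] += beta * (r + lambda * utility(y) - q[x][a]);
    }
}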
SLIDE 20 Utility Function Used to Modify Robot’s Behavioral Responses
Initialize all Q(x,a) to 0.
Do Forever
- Determine current world state x via sensing
- 90% of the time, choose the action a that maximizes Q(x,a); else pick a random action
- Execute a
- Determine reward r
- Update Q(x,a) as described
- Update Q(x′,a) for all states x′ similar to x
End Do
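A Java sketch of this loop, reusing the QTable class from the sketch above; sense(), executeAction(), computeReward(), and similarStates() are placeholders for robot-specific code, not a real API.

import java.util.Random;

// Illustrative sketch of the learning loop (builds on the QTable sketch above):
class QLearner {
    final QTable table;
    final int numActions;
    final Random rng = new Random();

    QLearner(QTable table, int numActions) {
        this.table = table;
        this.numActions = numActions;
    }

    void step() {
        int x = sense();                          // current world state
        int a;
        if (rng.nextDouble() < 0.9) {             // 90%: pick the best-known action
            a = 0;
            for (int i = 1; i < numActions; i++) {
                if (table.q[x][i] > table.q[x][a]) { a = i; }
            }
        } else {                                  // 10%: pick a random action
            a = rng.nextInt(numActions);
        }
        executeAction(a);
        int y = sense();                          // resulting state
        double r = computeReward(x, a, y);
        table.update(x, a, r, y);                 // update Q(x,a)
        for (int xs : similarStates(x)) {         // propagate to similar states
            table.update(xs, a, r, y);
        }
    }

    // Robot-specific placeholders (assumptions for illustration only):
    int sense() { return 0; }
    void executeAction(int a) {}
    double computeReward(int x, int a, int y) { return 0.0; }
    int[] similarStates(int x) { return new int[0]; }
}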
SLIDE 21 Example of Using Q-Learning: Teaching Box-Pushing
8 sonars (4 looking forward, 2 looking right, 2 looking left)
Sonar readings quantized into two ranges: NEAR (9-18 inches) and FAR (18-30 inches)
Forward-looking infrared (IR): binary response at 4 inches, indicating when the robot is in the BUMP state
Current to the drive motors is monitored to determine whether the robot is STUCK (i.e., input current exceeds a threshold)
Total of 18 bits of sensor information available: 16 sonar bits (NEAR, FAR), plus two for BUMP and STUCK
Motor control outputs – five choices:
Moving forward
Turning left 22 degrees
Turning right 22 degrees
Turning more sharply left at 45 degrees
Turning more sharply right at 45 degrees
[Photo: Obelix robot and box, 1991]
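To see how these 18 bits form a single perceptual state, here is a small Java sketch that packs them into an int; the bit layout is an illustrative assumption, not necessarily the encoding used in the original work.

// Packing the 18-bit perceptual state into an int (bit layout is assumed):
// bits 0-7: NEAR flag per sonar, bits 8-15: FAR flag per sonar,
// bit 16: BUMP, bit 17: STUCK.
static int encodeState(boolean[] near, boolean[] far, boolean bump, boolean stuck) {
    int state = 0;
    for (int i = 0; i < 8; i++) {
        if (near[i]) { state |= 1 << i; }
        if (far[i])  { state |= 1 << (8 + i); }
    }
    if (bump)  { state |= 1 << 16; }
    if (stuck) { state |= 1 << 17; }
    return state;
}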
SLIDE 22 Robot’s Learning Problem
- Learning Problem:
- Deciding, for any of the approximately 250,000 perceptual states (2^18 = 262,144), which of the 5 possible actions will enable it to find and push boxes around a room without getting stuck
- 250,000 perceptual states × 5 actions = 1,250,000 state/action pairs to explore!
SLIDE 23 State Diagram of Behavior Transitions
[State diagram of behavior transitions among Finder, Pusher, and Unwedger: the Finder moves toward possible boxes; a BUMP resulting from finding a box activates the Pusher; STUCK activates the Unwedger, which frees the robot when the box is no longer pushable; transitions are triggered by BUMP, STUCK, BUMP + ∆t, STUCK + ∆t, and “anything else”]
SLIDE 24
Measurement of “State Nearness”
Use the 18-bit representation of state (16 bits for sonar, two for BUMP and STUCK)
Compute the Hamming distance between states
Recall: Hamming distance = number of bits in which the two states differ
For this example, states were considered “near” if Hamming distance < 3
SLIDE 25 Robotic Results
Q-learning strategy tested on the Obelix robot
Observations:
Using Q-learning substantially improved box pushing over a random agent
Performance was also compared to a hand-coded solution: the Q-learning approach was close to or better than the hand-coded solution
Importance of this work:
Its empirical demonstration of Q-learning’s feasibility as a useful approach to behavior-based robotic learning
SLIDE 26 Summary of Learning/Adaptation so far
Robots need to learn in order to adapt effectively to a changing and dynamic environment
Behavior-based robots can learn in a variety of ways:
They can learn entirely new behaviors
They can learn more effective responses
They can learn to associate more appropriate or broader stimuli with a particular response
They can learn new combinations of behaviors
They can learn more effective coordination of existing behaviors
Learning can be either continuous (on-line) or batch (off-line)
Q-learning is a form of reinforcement learning in which actions and states are evaluated together.
SLIDE 27
A Challenge: Getting RL to Work on Real Robots
When is learning appropriate?
When the task is originally under-specified or difficult to code exactly by hand
When the task has parameters that are likely to change over time in unpredictable ways
When the time taken to learn a control policy is less than that for hand-coding a comparable policy
When the learned policy can be executed more efficiently than a hand-coded one
SLIDE 28 Problems with RL on Robots
Huge number of states to explore, with a large number of possible actions in each state
E.g., 24 sonar sensors quantized into 3 range bands → 3^24 ≈ 282 billion possible states
If the possible actions in each state are to go forwards or backwards → more than 560 billion state-action combinations to try
Robot is physical, thus it takes time to perform an action
At 1 second per action → about 20,000 years to try each combination
During early learning, the robot’s actions may be dangerous
“Let’s try rolling down the stairwell to see what next state I end up in…”
One possible safeguard: give the robot reflexes that stop dangerous actions
SLIDE 29
Today’s task
Work on projects