Implementing Cross Entropy Method for TensorForce
Tom Brady

Dec 28, 2023

TensorForce*
● Open Source (Apache 2.0) Reinforcement Learning library
● Built on top of TensorFlow and compatible with Python 2.7 and >3.5
● Goal: clear APIs, readability and modularisation
● Differentiator:
  ○ “strict separation of environments, agents and update logic that facilitates usage in non-simulation environments”
  ○ Everything optionally configurable to be able to quickly experiment with new models.
● Integrates with OpenAI Gym API, OpenAI Universe, DeepMind Lab, ALE and Maze Explorer
* Find out more: https://github.com/reinforceio/tensorforce
Sample Usage
● Clear APIs
● Readable
● Modular
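To make the "strict separation of environments, agents and update logic" idea concrete, here is a minimal plain-Python sketch. It is not TensorForce's API: every class, function and parameter name below (`BanditEnvironment`, `GreedyAgent`, `running_mean_update`, `run`) is hypothetical and chosen only for illustration. The point is that the environment, the agent and the update rule are three separate pieces that only meet in the runner loop.

```python
import random


class BanditEnvironment:
    """Hypothetical one-step environment: arm 1 pays 1.0, arm 0 pays 0.0."""

    def reset(self):
        return 0  # single dummy state

    def execute(self, action):
        reward = 1.0 if action == 1 else 0.0
        return 0, reward, True  # next state, reward, terminal flag


def running_mean_update(values, counts, action, reward):
    """Update logic kept apart from the agent: incremental mean of rewards."""
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]


class GreedyAgent:
    """Hypothetical epsilon-greedy agent; the update rule is injected."""

    def __init__(self, n_actions, update_fn, epsilon=0.2, seed=0):
        self.values = [0.0] * n_actions
        self.counts = [0] * n_actions
        self._update_fn = update_fn
        self._epsilon = epsilon
        self._rng = random.Random(seed)

    def act(self, state):
        # Explore with probability epsilon, otherwise pick the best-valued arm.
        if self._rng.random() < self._epsilon:
            return self._rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def observe(self, action, reward):
        self._update_fn(self.values, self.counts, action, reward)


def run(agent, environment, episodes):
    """Runner: the only place where agent and environment interact."""
    total = 0.0
    for _ in range(episodes):
        state = environment.reset()
        terminal = False
        while not terminal:
            action = agent.act(state)
            state, reward, terminal = environment.execute(action)
            agent.observe(action, reward)
            total += reward
    return total


agent = GreedyAgent(n_actions=2, update_fn=running_mean_update)
total_reward = run(agent, BanditEnvironment(), episodes=200)
```

Because the agent never touches the environment directly, the same `GreedyAgent` could be driven by a non-simulated source of states and rewards, and a different update function could be swapped in without changing either class.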
Cross Entropy Method
● Probabilistic Stochastic Optimization Method
● Neural network parametrizes the distribution of solutions
● Intuition: Iteratively sampling and refining a distribution of solutions
● High Level Procedure:
  ○ Assume a distribution of the problem space (e.g. Gaussian, with specified mean and variance)
  ○ While not converged:
    ■ Sample domain by generating candidate solutions from distribution
    ■ Evaluate the generated candidates
    ■ Update distribution based on the better candidate solutions discovered, minimizing the cross entropy
● Open source implementations available (e.g. https://github.com/rll/rllab/blob/master/rllab/algos/cem.py)
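The high-level procedure above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the TensorForce or rllab implementation: it uses a diagonal Gaussian over a plain parameter vector (no neural network), a fixed iteration count in place of a convergence test, and hypothetical names (`cross_entropy_method`, `toy_score`, `n_elite`). For a Gaussian family, refitting the mean and standard deviation to the elite samples is the maximum-likelihood fit, which is the update that minimizes the cross entropy to the elite set.

```python
import random
import statistics


def cross_entropy_method(score, dim, n_samples=50, n_elite=10,
                         n_iters=30, init_std=5.0, seed=0):
    """Maximize `score` over R^dim with a diagonal-Gaussian CEM (sketch)."""
    rng = random.Random(seed)
    mean = [0.0] * dim        # assumed initial distribution: N(0, init_std^2)
    std = [init_std] * dim
    for _ in range(n_iters):  # fixed budget stands in for "while not converged"
        # Sample candidate solutions from the current distribution.
        candidates = [[rng.gauss(m, s) for m, s in zip(mean, std)]
                      for _ in range(n_samples)]
        # Evaluate the candidates and keep the best-scoring ones (the elite).
        elite = sorted(candidates, key=score, reverse=True)[:n_elite]
        # Refit the Gaussian to the elite, one dimension at a time.
        columns = list(zip(*elite))
        mean = [statistics.mean(c) for c in columns]
        std = [statistics.pstdev(c) + 1e-6 for c in columns]  # small floor
    return mean


def toy_score(x):
    """Hypothetical objective: negative squared distance to (3, -2)."""
    target = (3.0, -2.0)
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, target))


best = cross_entropy_method(toy_score, dim=2)
```

In a reinforcement learning setting, `score` would be the return of an episode run with the sampled policy parameters, which is why the method needs no gradients of the objective.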
Aim: Implement X-Entropy Method for TensorForce
● Goal: Implement the Cross Entropy Method in pure TensorFlow within the TensorForce architecture
  ○ Following TensorForce’s philosophy: clear APIs, readability and modularisation
  ○ Allow for experimentation with and deployment of RL models that use the cross-entropy method in TensorForce
● Validation: Run the cross-entropy method on a simple OpenAI Gym environment (e.g. CartPole)
  ○ Compare performance to other methods
Getting to the Goal
Goal: Implement the Cross Entropy Method in pure TensorFlow within the TensorForce architecture
Very little has been done so far, and very little is planned for the next week. From Monday onwards, I have a plan!
● Analysis
  ○ Reading about the Cross Entropy Method
  ○ Reading through the TensorForce source, familiarizing myself with the architecture
● Cross Entropy in TensorForce
● Test implementation on a simple OpenAI Gym environment (e.g. CartPole)
  ○ Compare performance to other methods
● Hopefully get a PR merged into TensorForce to give this functionality to users
Thank you. Questions?
