Branes with Brains
Reinforcement learning in the landscape of intersecting brane worlds
FABIAN RUEHLE (UNIVERSITY OF OXFORD) String_Data 2017, Boston 11/30/2017
Based on [work in progress] with Brent Nelson and Jim Halverson
Motivation
✦ Supervised learning: train the machine by telling it what to do
✦ Unsupervised learning: let the machine train without telling it what to do
✦ Reinforcement learning: based on behavioral psychology [Sutton, Barto ’98, ’17]
✦ Don’t tell the machine exactly what to do, but reward “good” and/or punish “bad” actions
✦ AI = reinforcement learning + deep learning (neural networks) [Silver ’16]
Reinforcement learning in the string landscape:
✦ States: the degrees of freedom parameterizing the string vacuum
✦ Actions change the state and are rewarded (the action led to a more realistic vacuum) or punished (the action led to a less realistic vacuum)
✦ The agent maximizes its reward; an episode ends when the agent reaches a realistic vacuum or gives up
Setup: intersecting D6-branes on orbifolds of toroidal orientifolds, wrapping cycles that preserve common supercharges. Realistic vacua are rare and hard to detect:
✦ Use symmetries to relate different vacua
✦ Combine consistency conditions to rule out combinations
⇒ “Interesting” solutions are extremely rare even in enormous random scans (frequency estimated at 1:10^9) [Blumenhagen, Gmeiner, Honecker, Lust, Weigand ’04 ’05; Douglas, Taylor ’07, …] [Ibanez, Uranga ’12]
[Figure: the three two-tori with coordinates (x1, y1), (x2, y2), (x3, y3)]
Orientifold: (x1, y1) → (x1, −y1) (plus something similar for the string itself)
Orbifold T² → T²/Z₂: (x1, y1) → (−x1, −y1)
Winding numbers (n, m) on each torus, e.g. (n, m) = (1, 0), (0, 1), (1, 2). Note: due to the orientifold, include the image brane (n, −m) along with (n, m).
[Figure: a brane wrapping the three two-tori] A stack of D6-branes wrapping a 3D cycle on T² × T² × T² ⇔ (N, n1, m1, n2, m2, n3, m3)
Special cases (D6-branes on top of each other or on top of the orientifold planes) change the gauge group; particles arise in the representations:
✦ U(N): fundamental N
✦ SO(2N), Sp(N): fundamental N
✦ Intersection of two stacks with N and M branes: bifundamental (N, M)(1,−1)
Target gauge group SU(3) × SU(2) × U(1)Y with matter spectrum:
$$3 \times (3, 2)_{1} + 3 \times (\bar{3}, 1)_{-4} + 3 \times (\bar{3}, 1)_{2} + 4 \times (1, 2)_{-3} + 1 \times (1, 2)_{3} + 3 \times (1, 1)_{6}$$
(quarks, leptons + Higgs; obtaining the massless hypercharge U(1)Y is more subtle, see the massless U(1) condition below)
[Figure: branes intersecting on T² × T² × T²] The number of chiral families is the product of the intersection numbers on the three tori, e.g. 3 · 1 · 1 = 3.
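As a minimal illustration, the family number of a pair of stacks can be computed directly. The per-torus formula $I^i_{ab} = n^a_i m^b_i - m^a_i n^b_i$ is the standard intersection number for branes on tori; it is not spelled out on the slide, and the example winding numbers below are invented:

```python
def intersection_number(stack_a, stack_b):
    """Total intersection number of two brane stacks on T^2 x T^2 x T^2.

    Stacks are tuples (N, n1, m1, n2, m2, n3, m3); the total intersection
    number is the product of the per-torus numbers n^a m^b - m^a n^b.
    """
    total = 1
    for i in range(3):
        na, ma = stack_a[1 + 2 * i], stack_a[2 + 2 * i]
        nb, mb = stack_b[1 + 2 * i], stack_b[2 + 2 * i]
        total *= na * mb - ma * nb
    return total

# Example: 3 intersections on the first torus, 1 on each of the others.
a = (3, 3, 1, 1, 0, 1, 0)   # illustrative winding numbers
b = (2, 0, 1, 1, 1, 1, 1)
print(intersection_number(a, b))   # 3 * 1 * 1 = 3
```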
Tadpole cancellation:
$$\sum_{a=1}^{\#\text{stacks}} \begin{pmatrix} N^a\, n_1^a n_2^a n_3^a \\ -N^a\, n_1^a m_2^a m_3^a \\ -N^a\, m_1^a n_2^a m_3^a \\ -N^a\, m_1^a m_2^a n_3^a \end{pmatrix} = \begin{pmatrix} 8 \\ 4 \\ 4 \\ 8 \end{pmatrix}$$
K-theory constraints:
$$\sum_{a=1}^{\#\text{stacks}} \begin{pmatrix} 2N^a\, m_1^a m_2^a m_3^a \\ -N^a\, m_1^a n_2^a n_3^a \\ -N^a\, n_1^a m_2^a n_3^a \\ -2N^a\, n_1^a n_2^a m_3^a \end{pmatrix} \equiv \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \mod \begin{pmatrix} 2 \\ 2 \\ 2 \\ 2 \end{pmatrix}$$
SUSY conditions, for all $a = 1, \ldots, \#\text{stacks}$ (with moduli-dependent parameters $j, k, \ell$):
$$m_1^a m_2^a m_3^a - j\, m_1^a n_2^a n_3^a - k\, n_1^a m_2^a n_3^a - \ell\, n_1^a n_2^a m_3^a = 0$$
$$n_1^a n_2^a n_3^a - j\, n_1^a m_2^a m_3^a - k\, m_1^a n_2^a m_3^a - \ell\, m_1^a m_2^a n_3^a > 0$$
Massless U(1) for SU(3) × SU(2) × U(1): the generator $T = (T_1, T_2, \ldots, T_k)$, with $k = \#U(N)$ stacks, must satisfy
$$\begin{pmatrix} 2N^1 m_1^1 & 2N^2 m_1^2 & \cdots & 2N^k m_1^k \\ 2N^1 m_2^1 & 2N^2 m_2^2 & \cdots & 2N^k m_2^k \\ 2N^1 m_3^1 & 2N^2 m_3^2 & \cdots & 2N^k m_3^k \end{pmatrix} \cdot \begin{pmatrix} T_1 \\ T_2 \\ \vdots \\ T_k \end{pmatrix} = 0$$
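A minimal sketch of these checks in Python, assuming stacks are encoded as tuples (N, n1, m1, n2, m2, n3, m3) as above; the function names and the fixed moduli parameters j, k, l are illustrative, not from the talk:

```python
def tadpole(stacks):
    """Four tadpole contributions, summed over all stacks
    (N, n1, m1, n2, m2, n3, m3); must equal (8, 4, 4, 8)."""
    t = [0, 0, 0, 0]
    for (N, n1, m1, n2, m2, n3, m3) in stacks:
        t[0] += N * n1 * n2 * n3
        t[1] -= N * n1 * m2 * m3
        t[2] -= N * m1 * n2 * m3
        t[3] -= N * m1 * m2 * n3
    return t

def k_theory_ok(stacks):
    """K-theory constraints: the four sums must vanish mod 2."""
    k = [0, 0, 0, 0]
    for (N, n1, m1, n2, m2, n3, m3) in stacks:
        k[0] += 2 * N * m1 * m2 * m3
        k[1] -= N * m1 * n2 * n3
        k[2] -= N * n1 * m2 * n3
        k[3] -= 2 * N * n1 * n2 * m3
    return all(x % 2 == 0 for x in k)

def susy_ok(stack, j, k, l):
    """SUSY equality and inequality for a single stack,
    given moduli-dependent parameters j, k, l."""
    _, n1, m1, n2, m2, n3, m3 = stack
    eq = m1 * m2 * m3 - j * m1 * n2 * n3 - k * n1 * m2 * n3 - l * n1 * n2 * m3
    ineq = n1 * n2 * n3 - j * n1 * m2 * m3 - k * m1 * n2 * m3 - l * m1 * m2 * n3
    return eq == 0 and ineq > 0

def consistent(stacks, j, k, l):
    return (tadpole(stacks) == [8, 4, 4, 8]
            and k_theory_ok(stacks)
            and all(susy_ok(s, j, k, l) for s in stacks))
```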
⇒ $\binom{N_B}{N_S}$ combinations (with winding numbers up to $w_{\max}$ and $N = 1, 2, 3, \ldots$) after symmetry reduction
✦ At time step $t$, the agent in state $s_t \in S_{\text{total}}$ chooses an action $a_t \in A$
✦ Policy: $\pi : S_{\text{total}} \to A$
✦ The action $a_t$ produces a reward $r_t \in \mathbb{R}$, given by a reward function $R : S_{\text{total}} \times A \to \mathbb{R}$, and a new state $s_{t+1}$
✦ Return, with discount factor $\gamma \in (0, 1]$: $G_t = \sum_{k=1}^{\infty} \gamma^k r_{t+k}$
✦ Value function $v(s)$: the expected return from state $s$
✦ Advantage $\mathrm{Adv} = r - v$ (“how much better than expected has the action turned out to be”)
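A toy numerical illustration of these definitions; the numbers are arbitrary, and using the truncated return in place of r in the advantage is one common choice, not necessarily the talk’s:

```python
# Toy illustration of return and advantage (all numbers arbitrary).
gamma = 0.9
rewards = [0.0, 1.0, 0.0, 5.0]   # r_{t+1}, r_{t+2}, ... (finite episode)

# G_t = sum_{k=1}^{inf} gamma^k r_{t+k}, truncated to the episode
G_t = sum(gamma ** (k + 1) * r for k, r in enumerate(rewards))

v_s = 2.0                        # value estimate v(s_t)
advantage = G_t - v_s            # "how much better than expected"
print(G_t, advantage)            # approx. 4.0905 and 2.0905
```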
Some RL algorithms [Sutton, Barto ’98]:
✦ Temporal difference learning
✦ SARSA
✦ Q-learning
✦ Deep Q-Network [Mnih et al ’15]
✦ Asynchronous advantage actor-critic (A3C) [Mnih et al ’16]
✦ Variations/extensions: Wolpertinger [Dulac-Arnold et al ’16], Rainbow [Hessel et al ’17]
→ more in my breakout group on Friday
⇒ A3C architecture: one global instance of the policy/value network, plus workers 1, …, n, each with its own policy/value network, input, and environment. The workers run simultaneously and asynchronously to estimate the value and optimize the policy.
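A minimal sketch of this asynchronous update pattern, for illustration only: a real A3C worker computes policy/value gradients from environment rollouts, whereas here a placeholder “gradient” stands in:

```python
import threading

# Shared "global instance" of the parameters (stand-in for the network).
global_params = [0.0] * 8
lock = threading.Lock()

def worker(seed, n_updates=1000):
    """Each worker copies the global parameters, computes a local update,
    and applies it asynchronously to the global instance."""
    for _ in range(n_updates):
        local = list(global_params)                    # sync a local copy
        grad = [1e-4 * (p + seed + 1) for p in local]  # placeholder gradient
        with lock:                                     # apply to global instance
            for i, g in enumerate(grad):
                global_params[i] -= g

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(global_params)
```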
Implementation: ChainerRL on top of OpenAI Gym [Brockman et al ’16]
✦ Environment (string landscape): methods make env, reset, step
✦ To choose: method (A3C, DQN, …), NN architecture (FF, LSTM, …), action space, observation (state) space
(Note: the latter two are binary, which makes it hard to define a distance)
Note: this only works if good states are “close by” in this sense…
State: $s_t = [(N^1, n_1^1, m_1^1, n_2^1, m_2^1, n_3^1, m_3^1), (N^2, n_1^2, \ldots), \ldots]$, with $|S_{\text{total}}| = N_{\max} \binom{N_B}{N_S}$
Actions: $A = \{N^a \to N^a \pm 1,\ \text{add stack } (N, n_1, \ldots),\ \text{remove stack } (N, n_1, \ldots)\}$
Reward $R$ for a state $s_t \in S_{\text{total}}$: scores how well the consistency conditions are satisfied, whether the gauge group SU(3) × SU(2) × U(1) and the spectrum (Q, u, d, L, Hu, Hd, e) are realized, and the number of stacks $N_S$.
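A minimal Gym-style environment skeleton for this search. This is a sketch under assumptions: the class name, the action encoding, and the simple tadpole-only reward shaping are illustrative, not the talk’s actual code; tadpole() is the checker sketched above:

```python
import random

class BraneEnv:
    """Gym-style environment: states are lists of brane stacks
    (N, n1, m1, n2, m2, n3, m3); follows the reset/step convention."""

    def __init__(self, w_max=2, n_max=8):
        self.w_max, self.n_max = w_max, n_max
        self.stacks = []

    def _random_stack(self):
        N = random.randint(1, self.n_max)
        windings = [random.randint(-self.w_max, self.w_max) for _ in range(6)]
        return tuple([N] + windings)

    def reset(self):
        self.stacks = [self._random_stack()]
        return list(self.stacks)

    def _reward(self):
        # Toy shaping: penalize the distance to the tadpole condition
        # (the full reward also scores K-theory, SUSY, gauge group, spectrum).
        t = tadpole(self.stacks)   # checker from the sketch above
        return -sum(abs(x - y) for x, y in zip(t, [8, 4, 4, 8]))

    def step(self, action):
        kind, payload = action
        if kind == "add":
            self.stacks.append(payload)
        elif kind == "remove" and payload in self.stacks:
            self.stacks.remove(payload)
        elif kind == "change_N":
            idx, delta = payload
            s = self.stacks[idx]
            self.stacks[idx] = (max(1, s[0] + delta),) + s[1:]
        reward = self._reward()
        done = reward == 0   # all tadpoles cancel
        return list(self.stacks), reward, done, {}
```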
✦ Feed-forward NN with 2 hidden Softmax layers with 200 nodes each
✦ RNN with a linear layer (200 nodes) and an LSTM layer (128 nodes)
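A sketch of the feed-forward variant in Chainer (the layer sizes come from the slide; everything else, including treating it as a plain policy head, is an assumption):

```python
import chainer
import chainer.functions as F
import chainer.links as L

class FFPolicy(chainer.Chain):
    """Feed-forward net with 2 hidden softmax layers of 200 nodes,
    as listed on the slide; the output head is an assumption."""

    def __init__(self, n_obs, n_actions, n_hidden=200):
        super().__init__()
        with self.init_scope():
            self.l1 = L.Linear(n_obs, n_hidden)
            self.l2 = L.Linear(n_hidden, n_hidden)
            self.out = L.Linear(n_hidden, n_actions)

    def __call__(self, x):
        h = F.softmax(self.l1(x))      # hidden softmax layer 1
        h = F.softmax(self.l2(h))      # hidden softmax layer 2
        return F.softmax(self.out(h))  # action probabilities
```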
Results:
[Plots: mean scores vs. number of steps; log(average number of steps to solution) vs. log(number of steps); tadpole violation $\sum_i |8 - \text{Tadpole}_i(s)|$ vs. number of steps. Training runs over a few million steps, comparable to what is needed for Atari games.]
→ With conventional methods, no Standard Model has been found so far.