An Overview of the AI Safety Landscape (PowerPoint presentation)



SLIDE 1

http://ea-foundation.org

An Overview of the AI Safety Landscape

Max Daniel Research Project Manager, Effective Altruism Foundation

Workshop on Reliable Artificial Intelligence 2017, ETH Zurich

SLIDE 2

https://blog.openai.com/faulty-reward-functions/

SLIDE 3

https://blog.openai.com/faulty-reward-functions/

SLIDE 4


Amodei, Olah et al. 2016

“[C]oncrete safety problems that are ready for experimentation today and relevant to the cutting edge of AI systems”

  • 1. Avoid negative side effects
  • 2. Avoid reward hacking
  • 3. Scalable oversight
  • 4. Safe exploration
  • 5. Robustness to distributional shift
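The reward-hacking problem in the list above can be illustrated with a minimal numeric sketch. This is a hypothetical three-behavior toy, loosely inspired by the boat-race example in the OpenAI post linked on slides 2 and 3; all behavior names and numbers are made up:

```python
import numpy as np

# Toy illustration of reward hacking: an agent that greedily maximizes a
# proxy reward picks a behavior with low true reward. Behaviors, proxy
# rewards, and true rewards are illustrative numbers, not measured data.
behaviors = ["finish_race", "loop_for_powerups", "idle"]
proxy_reward = np.array([10.0, 15.0, 0.0])   # score the agent observes
true_reward  = np.array([10.0,  1.0, 0.0])   # what the designer wanted

chosen = int(np.argmax(proxy_reward))        # greedy proxy maximizer
print(behaviors[chosen])                     # -> loop_for_powerups
print(true_reward[chosen])                   # -> 1.0, far below the best 10.0
```

The gap between the proxy and true reward of the chosen behavior is exactly the failure the "avoid reward hacking" problem targets.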

SLIDE 5


Ng and Russell (ICML 2000), Hadfield-Menell et al. (NIPS 2016)
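Ng and Russell frame inverse reinforcement learning: given observed expert behavior, recover a reward function under which that behavior is optimal. A minimal sketch under strong simplifying assumptions (a one-step decision, a linear reward over two hand-picked features, and a max-margin-style recovery; this is not code from either cited paper):

```python
import numpy as np

# Inverse RL in the simplest possible setting: one decision, linear reward
# r(a) = w . phi(a). The expert always chooses action 0; any w with
# w . (phi[0] - phi[1]) > 0 rationalizes that choice. Following the
# max-margin idea, pick the unit vector that maximizes that margin.
phi = np.array([[1.0, 0.0],    # features of the expert's action (toy)
                [0.0, 1.0]])   # features of the alternative (toy)

diff = phi[0] - phi[1]
w = diff / np.linalg.norm(diff)    # unit w maximizing w . diff

# Sanity check: under the recovered reward, the expert action is optimal.
rewards = phi @ w
print(rewards)                     # expert action gets the higher reward
```

The underdetermination visible here (many `w` rationalize the same behavior) is the core difficulty both papers address in richer settings.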

SLIDE 6


Christiano et al. 2017
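Christiano et al. (2017) learn a reward model from human comparisons between pairs of trajectories, fitting a Bradley-Terry preference model. A hedged numpy sketch with synthetic trajectory features and a synthetic stand-in for the human labeler (not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Trajectories as feature vectors; the synthetic "human" prefers whichever
# trajectory has the larger hidden score w_true . x.
w_true = np.array([2.0, -1.0])
pairs = rng.normal(size=(200, 2, 2))   # (pair, which-trajectory, feature)
prefs = (pairs[:, 0] @ w_true > pairs[:, 1] @ w_true).astype(float)

# Fit reward weights w by the Bradley-Terry logistic loss:
# P(traj0 preferred) = sigmoid(w . x0 - w . x1); plain gradient descent.
w = np.zeros(2)
for _ in range(500):
    d = pairs[:, 0] - pairs[:, 1]           # feature differences
    p = 1.0 / (1.0 + np.exp(-(d @ w)))      # predicted preference prob
    grad = d.T @ (p - prefs) / len(prefs)   # logistic-loss gradient
    w -= 0.5 * grad

# The learned reward should rank trajectories like the hidden score does.
agree = np.mean((pairs[:, 0] @ w > pairs[:, 1] @ w) == prefs.astype(bool))
print(agree)   # close to 1.0 on the training pairs
```

In the paper the reward model is a neural network and the comparisons come from real human raters; the logistic comparison loss is the common core.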

SLIDE 7

Security

Huang et al. 2017

SLIDE 8

Source: http://rll.berkeley.edu/adversarial/videos/pong_a3c_trpo_l-inf.mp4
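The perturbation in that video is an adversarial example. The core mechanism can be sketched with the fast gradient sign method on a toy linear score function (illustrative weights and input; this is not Huang et al.'s setup, where the attack targets a deep RL policy):

```python
import numpy as np

# Fast gradient sign method on a linear "policy": score(x) = w . x,
# decision = score > 0. A small L-infinity perturbation in the direction
# -eps * sign(w) flips the decision while barely changing the input.
w = np.array([0.5, -0.3, 0.8])       # toy model weights
x = np.array([0.2, -0.1, 0.15])      # clean input, score > 0

score = w @ x
eps = 0.2
x_adv = x - eps * np.sign(w)         # FGSM step against the decision
adv_score = w @ x_adv

print(score, adv_score)              # 0.25 -> -0.07: the decision flips
print(np.max(np.abs(x_adv - x)))     # perturbation bounded by eps
```

For a deep network the gradient sign is taken with respect to the input via backpropagation, but the L-infinity-bounded step is the same.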

SLIDE 9

Corrigibility

Soares et al. (AAAI 2015), Orseau and Armstrong (UAI 2016)
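Orseau and Armstrong show that off-policy learners such as Q-learning are safely interruptible: their learned values are unaffected by repeated human overrides, so the agent neither learns to avoid nor to seek interruption. A toy simulation under illustrative assumptions (a two-state chain, with an interruption mechanism that forces a "stay" action half the time; the setup is mine, not the paper's):

```python
import random

random.seed(0)

# Chain MDP: state s0 --GO--> terminal state (reward 1);
# STAY loops in s0 with reward 0. Optimal values: Q(GO)=1, Q(STAY)=0.9.
# An interruption overrides the agent's action with STAY half the time,
# but Q-learning's max-based update keeps the same fixed point.
GO, STAY = 0, 1
gamma, alpha, theta = 0.9, 0.1, 0.5
Q = [0.0, 0.0]                         # Q(s0, GO), Q(s0, STAY)

for _ in range(20000):
    a = random.choice([GO, STAY])      # exploratory behavior policy
    if random.random() < theta:        # interruption mechanism:
        a = STAY                       # the human override forces STAY
    if a == GO:                        # reach the terminal state, reward 1
        target = 1.0
    else:                              # loop in s0, reward 0
        target = 0.0 + gamma * max(Q)
    Q[a] += alpha * (target - Q[a])

print(Q)   # approx [1.0, 0.9]: same fixed point as with no interruptions
```

An on-policy learner such as SARSA would instead bake the interruptions into its values, which is why off-policy learning matters for this property.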

SLIDE 10

Privacy

Papernot et al. (ICLR 2017)
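Papernot et al.'s PATE framework trains an ensemble of "teacher" models on disjoint partitions of the private data, then labels a public "student" dataset by noisy aggregation of teacher votes. A sketch of the aggregation step only (toy votes; the Laplace scale is illustrative, and in the paper it is set by the privacy budget):

```python
import numpy as np

rng = np.random.default_rng(42)

# PATE-style noisy aggregation (sketch): each teacher votes for a class;
# Laplace noise on the vote counts yields a differentially private label
# for the student model, hiding any single training example's influence.
num_classes = 3
teacher_votes = np.array([0, 0, 0, 1, 0, 2, 0, 0, 1, 0])  # 10 toy teachers

counts = np.bincount(teacher_votes, minlength=num_classes)
noisy_counts = counts + rng.laplace(scale=2.0, size=num_classes)
private_label = int(np.argmax(noisy_counts))

print(counts)          # [7 2 1]: class 0 wins by a wide margin
print(private_label)   # almost surely 0, given that margin
```

When the teachers agree by a wide margin, the noise rarely changes the outcome, which is what lets PATE spend little privacy budget on confident queries.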

SLIDE 11

Soares and Fallenstein (2017 [2014])

“This technical agenda primarily covers topics that the authors believe are tractable, uncrowded, focused, and unable to be outsourced to forerunners of the target AI system.”

  • 1. Realistic World-Models
  • 2. Decision Theory
  • 3. Logical Uncertainty
  • 4. Vingean Reflection
SLIDE 12

  • 1) Research Goal
  • 2) Research Funding
  • 3) Science-Policy Link
  • 4) Research Culture
  • 5) Race Avoidance
  • 6) Safety
  • 7) Failure Transparency
  • 8) Judicial Transparency
  • 9) Responsibility
  • 10) Value Alignment
  • 11) Human Values
  • 12) Personal Privacy
  • 13) Liberty and Privacy
  • 14) Shared Benefit
  • 15) Shared Prosperity
  • 16) Human Control
  • 17) Non-subversion
  • 18) AI Arms Race
  • 19) Capability Caution
  • 20) Importance
  • 21) Risks
  • 22) Recursive Self-Improvement
  • 23) Common Good

Source: Asilomar AI Principles

SLIDE 13

Conclusion

  • Ensuring that AI agents do what we want is a nontrivial problem.
  • Technical AI safety is a thriving field in AI/ML research.
  • Several research agendas and concrete problems have been pursued.
  • Complements contributions from law, economics, policy, philosophy, social science, …


SLIDE 14


Thank you.

max.daniel@ea-foundation.org
