
SLIDE 1

DISTRIBUTION A. Approved for public release; distribution unlimited (88ABW-2013-3982, 09 Sep 2013)

Integrity  Service  Excellence

Embedded High Performance Computing (EHPC) and Neuromorphic Computing

November 18, 2014

Mr. Mark Barnell
Senior Computer Scientist, Information Directorate
Air Force Research Laboratory

SLIDE 2

Outline

  • Challenges
  • Approaches

      – Scalable computing & applications
      – Exploit 3D integration
      – New devices and models of computation
      – Move “C4ISR to the edge” with EHPC

SLIDE 3

Challenges in Big Data Analytics

  • Big data limits human performance in analysis and decision making
  • High-performance information technologies for massive analytics enable the journey from big data to information, knowledge and wisdom.
  • R&D needed to achieve trusted autonomous systems that are capable of learning, reasoning, inferencing and interacting with humans.
  • Advantage in labor power could convert into global competitive advantages
  • Ability to “do more, do better” with more intelligent power, less labor power, would be a “force multiplier”
  • Computing hardware technologies reach physical limits in area, power and performance
  • Three-dimensional integrated circuits and systems with optimized performance under size, weight and power (SWaP) constraints.
  • RDT&E for nano/quantum/neuro device and system technologies.

SLIDE 4

Multi-Tiered Approach to EHPC or HPEC Challenges

  • Fundamental Science
  • Trusted Architectures
  • Tech Development

SLIDE 5

DISTRIBUTION A. Approved for public release; distribution unlimited (88ABW-2011-3208, 06 Jun 2011)

The Condor Cluster

1716 Sony PlayStation 3s

  • STI Cell Broadband Engine
  • PowerPC PPE
  • 6 SPEs
  • 256 MB RAM

84 head nodes

  • 6 gateway access points
  • 78 compute nodes
  • Intel Xeon X5650 dual-socket hexa-core

  • (2) NVIDIA Tesla GPGPUs
  • 39 nodes – (78) C2050
  • 39 nodes – (78) C2070/5
  • 48 GB RAM

FY10 DHPI Key design considerations: Price/performance & Performance/Watt

SLIDE 6

DISTRIBUTION A. Approved for Public Release [88ABW-2010-6225] Distribution Unlimited

Hybrid Neuromorphic Model

  • Input: character images
  • Character level: auto-associative neural networks
  • Word level: confabulation algorithms based on knowledge base of weighted links among letters
  • Sentence level: confabulation algorithms based on knowledge base of weighted links among words and phrases
  • Output: “Text image”
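
The character-level stage names auto-associative neural networks without saying which kind. As an illustration only, the sketch below assumes a Hopfield-style network with Hebbian weights, which is one common auto-associative memory; the 8-pixel templates and the function names `train` and `recall` are hypothetical and do not come from the slides.

```python
def train(patterns):
    """Build a Hebbian weight matrix: sum of outer products, zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_steps=10):
    """Synchronously update a bipolar (+1/-1) state until it stops changing."""
    state = list(state)
    for _ in range(max_steps):
        new_state = []
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            # Keep the previous value when the field is exactly zero.
            new_state.append(state[i] if field == 0 else (1 if field > 0 else -1))
        if new_state == state:
            break
        state = new_state
    return state


# Two hypothetical 8-pixel "character" templates in bipolar encoding.
TEMPLATE_A = [1, 1, 1, 1, -1, -1, -1, -1]
TEMPLATE_B = [1, -1, 1, -1, 1, -1, 1, -1]
WEIGHTS = train([TEMPLATE_A, TEMPLATE_B])

# Corrupt one pixel of TEMPLATE_A and let the network clean it up.
noisy = [-1] + TEMPLATE_A[1:]
restored = recall(WEIGHTS, noisy)
print(restored == TEMPLATE_A)
```

In a pipeline like the one on this slide, the cleaned-up character candidates would then be passed to the word-level and sentence-level confabulation stages, which are not sketched here.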

SLIDE 7

RADAR Data Processing for High Resolution Images

SLIDE 8

  • R.U.D.I. Cluster

176 Jetson boards (60 TFLOPS @ 2.1 kW)
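
The cluster figures imply a performance-per-watt number that the slide does not state. A quick back-of-the-envelope calculation, assuming the slide's "60T/flops" means 60 TFLOPS aggregate peak and that the 2.1 kW covers all 176 boards evenly (both are assumptions):

```python
BOARDS = 176          # Jetson boards, from the slide
PEAK_TFLOPS = 60.0    # aggregate peak, assuming "60T/flops" means 60 TFLOPS
POWER_KW = 2.1        # aggregate power, from the slide

peak_gflops = PEAK_TFLOPS * 1e3
power_watts = POWER_KW * 1e3

gflops_per_watt = peak_gflops / power_watts
gflops_per_board = peak_gflops / BOARDS
watts_per_board = power_watts / BOARDS

print(f"{gflops_per_watt:.1f} GFLOPS/W")          # 28.6 GFLOPS/W
print(f"{gflops_per_board:.1f} GFLOPS per board")  # 340.9 GFLOPS per board
print(f"{watts_per_board:.1f} W per board")        # 11.9 W per board
```

These are derived peak figures only; sustained application performance is not given in the slides.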

SLIDE 9

3D Integration

  • Hybrid Memory Cube (HMC Consortium)
  • 3D NAND Flash Memory (Toshiba)

SLIDE 10

Bio-Inspired Computing Architecture

[Figure: block diagram; recoverable labels listed below]

  • Conventional Processing: General Purpose Processor, SRAM, I/O
  • Neuromorphic Computing Accelerators: four NCA blocks, each labeled with Arbiter and I/O Cfg Buffers; Bridge and ADC blocks
  • Neuromorphic Computing Accelerator (NCA): Input Vectors, Config., Output Buffer, Training Completed, Training Signal Generation, Error Detection, ST & Arbiter, R/W Control, Diff Std Patterns X(k), k=1...m
  • Crossbar-based Computation Core: Vin, Crossbar Array M+, Crossbar Array M-, Summing Amplifier & Comparator, V+(t), V-(t), V(t+1), Vout
  • Crossbar array of memristors
  • Neuromorphic models and algorithms (ANN, Inference, etc.)
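
The figure labels a pair of crossbar arrays (M+ and M-) feeding a summing amplifier and comparator. The slides do not give the circuit equations, so the following is only an idealized sketch under a common assumption: signed weights are split into two non-negative conductance matrices, each crossbar computes a weighted sum of the input voltages, and the comparator thresholds the difference. All names and values are hypothetical.

```python
def split_weights(weights):
    """Split a signed weight matrix into two non-negative matrices (M+, M-)."""
    g_plus = [[max(w, 0.0) for w in row] for row in weights]
    g_minus = [[max(-w, 0.0) for w in row] for row in weights]
    return g_plus, g_minus


def crossbar_output(conductances, v_in):
    """Idealized crossbar: each row sums conductance times input voltage."""
    return [sum(g * v for g, v in zip(row, v_in)) for row in conductances]


def core_step(g_plus, g_minus, v_in):
    """Summing amplifier takes V+ minus V-; comparator thresholds at zero."""
    v_plus = crossbar_output(g_plus, v_in)
    v_minus = crossbar_output(g_minus, v_in)
    diff = [p - m for p, m in zip(v_plus, v_minus)]
    v_next = [1 if d >= 0 else -1 for d in diff]
    return diff, v_next


# Hypothetical 2x2 signed weights and input vector.
WEIGHTS = [[1.0, -2.0], [-1.0, 0.5]]
V_IN = [1.0, 1.0]

G_PLUS, G_MINUS = split_weights(WEIGHTS)
DIFF, V_NEXT = core_step(G_PLUS, G_MINUS, V_IN)
print(DIFF, V_NEXT)  # [-1.0, -0.5] [-1, -1]
```

The differential pair is used here because a physical conductance cannot be negative; the subtraction recovers the signed matrix-vector product. Device non-idealities, the ADC, and the training path shown in the figure are left out.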

SLIDE 11

Moving ISR to the Tactical Edge

Versatile Intelligent Sensor

SLIDE 12

Summary

  • Future embedded systems are challenged to continue delivering extreme performance in a small space
  • But security places additional challenges for trust, agility and resilience
  • Powerful technology drivers are still at hand to meet the challenges
      – Computer architecture innovations
      – Nano and quantum advances
      – 3D stacking
      – Algorithm development and mapping to architectures