Deep Representation and Reinforcement Learning, Soumalya Sarkar, PhD

SLIDE 1

Soumalya Sarkar, PhD

Senior Research Scientist, UTRC

Deep Representation and Reinforcement Learning for Anomaly Detection and Control in Multi-modal Aerospace Applications

This document contains no technical data subject to the EAR or the ITAR.

May 9 @ GTC 2017

SLIDE 2

“UTRC is where you bring your toughest problems.”

Our business units

IIoT

SLIDE 3

A global leader

$56B 2015 UTC sales
$3.9B invested in R&D

[Chart: END MARKETS. Segments: Military Aerospace & Space, Commercial Aerospace, Commercial & Industrial. Shares shown: 52%, 36%, 12%.]

[Chart: SALES BY TYPE. Segments: Original Equipment Manufacturing, Aftermarket. Shares shown: 56%, 44%.]

[Chart: SALES BY GEOGRAPHY. Segments: Asia Pacific, Europe, United States, Other. Shares shown: 38%, 27%, 20%, 15%.]

SLIDE 4

Cork, Ireland

Established in 2010, focuses on energy, security and aerospace systems

Shanghai, China

Established in 1997, focuses on integrated buildings, fluid and mechanical systems

Rome, Italy

Joined UTC in 2012, focuses on model-based design and embedded systems engineering

East Hartford, CT

Founded in 1929, focuses on a broad range of system engineering, thermal, fluid, material, and informational sciences

Berkeley, CA

Established in 2009, focuses on cyber-physical systems and embedded intelligence

A global presence

SLIDE 5

Focused on performance

Physical Sciences
  • Advanced Materials
  • Applied Physics
  • Environmental Science
  • Materials Chemistry
  • Measurement Science
  • Solid Mechanics
  • Surface Mechanics

Systems
  • Advanced Laboratory for Embedded Systems
  • Control Systems
  • Cyber-Physical Systems
  • Decision Support & Machine Intelligence
  • Electromagnetics & Networks
  • Power Electronics
  • Software Systems
  • System Dynamics & Optimization

Thermal & Fluid Sciences
  • Acoustics
  • Aerodynamics
  • Aero-Thermal Testing
  • Combustion Science
  • Propulsion Technology
  • Thermo-Fluid Dynamics
  • Thermal Management

SLIDE 6

Topics

  • Deep Representation Learning - DAE
  • Big PHM (Prognostics & Health Monitoring)
  • SHM (Structural Health Monitoring)
  • Deep Reinforcement Learning (DRL) for additive manufacturing
SLIDE 7

Deep Auto-Encoder (DAE)

  • Parameter learning by stochastic gradient descent (Hinton & Salakhutdinov, 2006; Bengio et al., 2007)
  • Variants: de-noising (RBM), variational, etc.
  • Static DAE used instead of an LSTM AE for ease and speed of training

Multi-layer neural-network-based learner of a non-linear representation of the data

[Slide equations, in order: input; hidden representation (sigmoid connecting two layers); parameters; reverse mapping of the reconstruction layers (sigmoid), where $W' = W^T$; cost function for back-propagation]
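As a minimal sketch of the mapping outlined on this slide, the snippet below runs one encoder/decoder pair with sigmoid activations and tied weights (W' = W^T) and evaluates a reconstruction cost. The slide's own equations did not survive extraction, so the squared-error cost, the function names, and the layer sizes are assumptions chosen for illustration and not necessarily the author's exact formulation.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(x, W, b):
    """Hidden representation y = s(W x + b)."""
    return sigmoid(W @ x + b)

def decode(y, W, b_prime):
    """Reconstruction z = s(W' y + b') with tied weights W' = W^T."""
    return sigmoid(W.T @ y + b_prime)

def reconstruction_cost(x, z):
    """Assumed cost: mean squared reconstruction error."""
    return float(np.mean((x - z) ** 2))

rng = np.random.default_rng(0)
d_in, d_hidden = 8, 3                        # hypothetical layer sizes
W = rng.normal(scale=0.1, size=(d_hidden, d_in))
b = np.zeros(d_hidden)
b_prime = np.zeros(d_in)

x = rng.uniform(size=d_in)                   # one input vector in [0, 1]
y = encode(x, W, b)                          # hidden representation
z = decode(y, W, b_prime)                    # reconstruction
cost = reconstruction_cost(x, z)
```

In training, the gradient of this cost with respect to W, b, and b' would drive the stochastic gradient descent updates; a deep autoencoder stacks several such pairs.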

SLIDE 8

Topics

  • Deep Representation Learning - DAE
  • Big PHM (Prognostics & Health Monitoring)
  • SHM (Structural Health Monitoring)
  • Deep Reinforcement Learning (DRL) for additive manufacturing
SLIDE 9

Motivation

Big, multi-modal & heterogeneous data; unsupervised visualization

  • Data: ~100 sensors, ~200-dimensional condition data, size ~TB, zero or few labels
  • Problem: understanding and separating different missions or faults
  • Challenges: low-dimensional visualization, robust separation of faults (FDI) and missions, real-time application, and generalizability

SLIDE 10

FDI approaches and challenges

  • Lack of high-fidelity non-linear models
  • Tedious hand-crafting (domain knowledge) of fault features
  • Lack of scalability to large data
  • Insufficient robustness to noise
  • The presence of various operating modes
  • Presence of multi-modal sensors for fault disambiguation

Previous Work and Challenges

Methods of fault detection and identification

Model based
  • 1. Residual methods
  • 2. Parity based
  • 3. Kalman filter based

Data-driven
  • 1. Time, frequency, symbolic domain features
  • 2. SVM, k-NN, artificial neural net based learning systems

Hybrid
  • 1. Parity equation approach and wavelet based signal features
  • 2. PCA based system models

Deep Learning

SLIDE 11

Database* for Validation

Apparatus: A set of electromechanical actuators (EMA), constructed by Moog Corporation, was used by Balaban et al. (Balaban et al., 2009, 2015). To widen the range of available operating conditions, a flyable electromechanical actuator (FLEA) testbed was also constructed.

13 multi-modal sensors @ 100 Hz: Actuator Z Position, Measured Load, Motor Current X-Y-Z, Motor Voltage X-Y, Motor Temperature X-Y-Z, Nut X-Y Temperature, Ambient Temperature.

Baseline and 2 fault classes:
  • 1. A jam fault injected via a mechanism mounted on the return channel of the ball screw that can stop circulation of the bearing balls through the circuit.
  • 2. A spall fault injected by introducing cuts of various geometries via a precise electrostatic discharge process. The initial size and subsequent growth of these cuts were confirmed using an optical inspection and measurement system.

*Open database available at NASA DASHlink, collected by Balaban et al. (Balaban et al., 2009, 2015)

Fault Detection and Identification

SLIDE 12

DAE architecture

650 (50x13) -> 256 -> 196 -> 136 -> 76 -> 14 -> 76 -> 136 -> 196 -> 256 -> 650

Window size = 0.5 seconds, shifted by each time point

11-layer DAE
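The input size follows from the windowing: 0.5 s at 100 Hz is 50 samples, and 50 samples × 13 sensors gives 650 inputs. The sketch below builds such windows with a one-sample shift. The flattening order (time-major) and the function name `sliding_windows` are assumptions for illustration; the slide only states "50x13".

```python
import numpy as np

SAMPLE_RATE_HZ = 100
WINDOW_SECONDS = 0.5
N_SENSORS = 13
WINDOW = int(SAMPLE_RATE_HZ * WINDOW_SECONDS)     # 50 samples per window

# Layer widths as listed on the slide (11 layers, 14-dimensional bottleneck).
LAYER_SIZES = [650, 256, 196, 136, 76, 14, 76, 136, 196, 256, 650]

def sliding_windows(signals, window=WINDOW):
    """Turn a (time, sensors) array into flattened windows shifted by one sample."""
    n_steps, n_sensors = signals.shape
    n_windows = n_steps - window + 1
    # idx[i, k] is the time index of the k-th sample in the i-th window.
    idx = np.arange(n_windows)[:, None] + np.arange(window)[None, :]
    return signals[idx].reshape(n_windows, window * n_sensors)

# Synthetic stand-in for 3 seconds of 13-channel data.
signals = np.random.default_rng(0).normal(size=(300, N_SENSORS))
X = sliding_windows(signals)                      # one 650-dim row per window
```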

SLIDE 13

DAE Reconstruction Error

Actual signals and reconstructed signals for Motor X voltage, Motor Y temperature, and load sensors (from top to bottom) with a bottleneck layer of dimension 14

Multi-modal Reconstruction Error

Fault Detection and Identification

SLIDE 14

Training and Parameter Learning

Tuning the bottleneck layer

  • Variation of normalized RMS error at the reconstructed output layer with increasing dimension of the bottleneck
  • Individual sensor-wise reconstruction errors at the output layer for 3 different bottleneck layer dimensions

SLIDE 15

Fault Diagnostics

ROC and Precision-Recall curves

  • ROC curves obtained by varying the detection threshold on testing data, for different bottleneck dimensions of the 11-layer DAE and a few single-layer AE models
  • Precision-Recall curves for the same conditions, since the data is unbalanced
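A threshold sweep of the kind behind such curves can be sketched as follows: each window gets a scalar reconstruction error, a window is flagged as faulty when the error exceeds the threshold, and each threshold yields one ROC point (TPR, FPR) and one precision value. The error values, labels, and the function name `sweep_thresholds` are made up for illustration and are not the talk's data.

```python
import numpy as np

def sweep_thresholds(errors, is_fault, thresholds):
    """Return (threshold, TPR, FPR, precision) for each detection threshold."""
    errors = np.asarray(errors, dtype=float)
    is_fault = np.asarray(is_fault, dtype=bool)
    rows = []
    for thr in thresholds:
        flagged = errors > thr                    # detector decision
        tp = np.sum(flagged & is_fault)
        fp = np.sum(flagged & ~is_fault)
        fn = np.sum(~flagged & is_fault)
        tn = np.sum(~flagged & ~is_fault)
        tpr = tp / (tp + fn) if (tp + fn) else 0.0        # recall
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        # Convention: precision is 1.0 when nothing is flagged.
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        rows.append((float(thr), float(tpr), float(fpr), float(precision)))
    return rows

# Hypothetical reconstruction errors: three nominal windows, three faulty ones.
errors = [0.05, 0.08, 0.10, 0.40, 0.55, 0.90]
is_fault = [0, 0, 0, 1, 1, 1]
curve = sweep_thresholds(errors, is_fault, thresholds=[0.0, 0.2, 0.6, 1.0])
```

Precision-Recall is the more informative view when faulty windows are rare, because FPR stays small even when most alarms are false.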

SLIDE 16

Unsupervised Fault Disambiguation

Spider charts showing the NRMS error across different sensors during testing phase for nominal and fault scenarios

Disambiguation by Multi-dimensional Reconstruction Error
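A sensor-wise error profile of the kind plotted on a spider chart can be sketched as below. The transcript does not define the normalization behind "NRMS", so dividing each sensor's RMS error by that sensor's value range is an assumption, as are the function name and the synthetic data.

```python
import numpy as np

def sensorwise_nrms(actual, reconstructed, eps=1e-12):
    """Per-sensor RMS reconstruction error, normalized by each sensor's range.

    Both inputs have shape (time, sensors). The range normalization is an
    assumed choice; other normalizations (e.g. by standard deviation) exist.
    """
    actual = np.asarray(actual, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    rms = np.sqrt(np.mean((actual - reconstructed) ** 2, axis=0))
    value_range = actual.max(axis=0) - actual.min(axis=0)
    return rms / (value_range + eps)

rng = np.random.default_rng(1)
actual = rng.normal(size=(500, 13))          # synthetic 13-sensor signals
recon = actual.copy()
recon[:, 4] += 0.5                           # pretend sensor 4 reconstructs poorly
profile = sensorwise_nrms(actual, recon)     # one value per sensor
suspect_sensor = int(np.argmax(profile))
```

Different faults would be expected to raise the error on different subsets of sensors, which is what makes the 13-dimensional profile usable for disambiguation.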

SLIDE 17

Why Deep Architecture?

Spider charts of the average (over nominal and fault scenarios) NRMS error across different sensors during testing phase for (a) single hidden-layer AE with 512-dimensional bottleneck (b) proposed 11-layer DAE with 14-dimensional bottleneck

DAE Reconstruction Error increases fault separability with low over-fitting

SLIDE 18

Why Deep Architecture?

Clusters of two largest principal components obtained from PCA on 13-dimensional NRMS error distributions

Multi-dimensional NRMS from DAE increases inter-fault distance at low dimension

  • Single hidden-layer AE model with 512-dimensional bottleneck layer
  • Proposed 11-layer DAE with 14-dimensional bottleneck layer
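Projecting the 13-dimensional error vectors onto their two largest principal components can be sketched with an SVD of the centered data. The two synthetic clusters below are placeholders, not the flight data; they only show that a fault raising the error on a few sensors separates from the nominal cluster along the first component.

```python
import numpy as np

def top_two_principal_components(error_vectors):
    """Project rows (one error vector per test window) onto the first two PCs."""
    X = np.asarray(error_vectors, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

rng = np.random.default_rng(2)
nominal = rng.normal(loc=0.02, scale=0.005, size=(100, 13))
fault = rng.normal(loc=0.02, scale=0.005, size=(100, 13))
fault[:, [2, 7]] += 0.3                      # hypothetical fault signature on two sensors

projected = top_two_principal_components(np.vstack([nominal, fault]))
gap = abs(projected[:100, 0].mean() - projected[100:, 0].mean())
```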

SLIDE 19

Discussions

  • Trained directly on raw time series from heterogeneous sensors, without feature hand-crafting or extensive data preprocessing
  • A high fault detection rate (~97.8%) with zero false alarms on a large set of realistic data (available on NASA DASHlink)
  • Disambiguation among different types of faults with high confidence in an unsupervised way
  • Proposed DAE is more robust than a single-hidden-layer AE

Fault separation even at a low dimension in an unsupervised way

SLIDE 20

Topics

  • Deep Representation Learning - DAE
  • Big PHM (Prognostics & Health Monitoring)
  • SHM (Structural Health Monitoring)
  • Deep Reinforcement Learning (DRL) for additive manufacturing
SLIDE 21

Autonomous SHM from Imaging

Background

Vision-based SHM is required to get details of the damage such as size, configuration, shape, topological networking, and geometrical statistics for material characterization, damage prognostics, RUL analysis, etc.

  • Image processing has been an important tool in material/structural characterization for over three decades (Krakow, 1982; Duval et al., 2014; Robertson et al., 2011; Leach, 2013).
  • Texture analysis (Comer & Delp, 2000) and segmentation (Ruggiero, Ross, & Porter, 2015; Park, Huang, Ji, & Ding, 2013) are image processing techniques applied to SHM.
  • Pre-processing steps such as filtering and enhancement (Tomasi & Manduchi, 1998; Angulo & Velasco-Forero, 2013; Buades, Coll, & Morel, 2005) are used to denoise images and perform alignment and artifact correction.
  • Recent breakthroughs of deep learning are mostly in image processing because it models multiple levels of abstraction (low-level features to higher-order representations; Erhan, Courville, & Bengio, 2010).
  • Broad applications to medical imaging (similar to vision-based SHM from a computer vision perspective); recent application to video-based combustion PHM (Sarkar et al., 2015).

SLIDE 22

Why DAE for SHM?

Not explored enough yet!

Methodological challenges: Computational Challenges:

  • Extensive heuristics for parameter tuning in existing tools
  • Limited availability of annotation causing small number of training labels (no CNN)
  • Robustness issue in computer vision (segmentation) techniques
  • Seamless incorporation of domain expert in the loop
  • Lengthy and tedious process of manual annotation by domain experts on large number of samples
  • Human error (expert bias) in labelling the ground truth
  • Process changes significantly with new experimental setup and material
SLIDE 23

Use case: Damage characterization

Experimental setup: Variable load-induced cracks on a composite

(a) Scheme of coupon testing and (b) Representative damage pattern

  • Thick multi-layer composite sub-elements are used in numerous rotorcraft and aircraft applications.
  • They usually operate under multi-axial loading with a dominant influence of bending, generating complex patterns of internal damage.
  • Carbon fiber polymer-matrix 55-layer composite coupons (IM7/977-3 material with lay-up [+45_4 / -45_4 / 0_3]_2S [0_3 / -45_4 / +45_4], giving a thickness of 0.290 in) were considered under five-point bending.
  • The video starts with a straight coupon, which is slowly bent under a monotonically increasing displacement-controlled load until full fracture.

SLIDE 24

Framework for Damage Characterization

Image frame to crack-length distribution

[Figure: video frames with dynamic crack and non-linear bending, mapped to a distribution of crack lengths]

SLIDE 25

Patching DAE and Guided Segmentation

Modeling nominal surface and segmenting cracks from reconstruction error map

Patching DAE: learning the nominal surface via a reconstructing DAE (patch size 3 by 7 pixels)

[Diagram: frame with NO crack fed to input layer, multiple hidden layers, output layer]

Guided region-growing segmentation:

$S(x, y) = I(x, y) - c\,(1 - p(x, y))$

  • $S(x, y)$: similarity measure
  • $I(x, y)$: raw image intensity of pixel $(x, y)$
  • $p(x, y)$: probability (intensity of reconstruction error map) of pixel $(x, y)$ being a crack
  • $c$: mean intensity of current region

SLIDE 26

Performance metrics

A = true crack zone, B = detected crack zone, A ∩ B = correctly detected crack area

  • 1. Dice score = 2(A ∩ B)/(A ∪ B)
  • 2. Distance between histograms, d
  • 3. Average minimum distance between true crack and detected crack areas
  • 4. Number of cracks

For characterization, the number of cracks and d are the most significant metrics
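The overlap metric can be sketched on binary masks as below. The slide writes the denominator as A ∪ B, whereas the conventional Dice coefficient divides by |A| + |B|, which keeps the score in [0, 1]. The sketch computes the conventional form and, for comparison, the ratio exactly as written on the slide; which one the author used is not recoverable from the transcript. The masks and the function name are illustrative.

```python
import numpy as np

def overlap_scores(true_mask, detected_mask):
    """Return (conventional Dice, ratio as written on the slide) for two masks."""
    a = np.asarray(true_mask, dtype=bool)        # A: true crack zone
    b = np.asarray(detected_mask, dtype=bool)    # B: detected crack zone
    inter = np.sum(a & b)                        # |A ∩ B|
    union = np.sum(a | b)                        # |A ∪ B|
    total = np.sum(a) + np.sum(b)                # |A| + |B|
    dice = 2.0 * inter / total if total else 1.0
    slide_ratio = 2.0 * inter / union if union else 1.0
    return float(dice), float(slide_ratio)

true_mask = np.zeros((5, 5), dtype=bool)
true_mask[2, 1:4] = True                         # 3-pixel true crack
detected = np.zeros((5, 5), dtype=bool)
detected[2, 2:5] = True                          # 3-pixel detection, 2 pixels overlap
dice, slide_ratio = overlap_scores(true_mask, detected)
```

On this example the conventional Dice is 2·2/6 ≈ 0.67, while the ratio as written is 2·2/4 = 1.0 despite the imperfect overlap.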

SLIDE 27

Tuning thresholds (on intensity of reconstruction error map) on medium level load

Refining parameter

Chosen threshold

SLIDE 28

Tuning thresholds based on medium load = 0.55

Crack detection at medium load with the tuned parameter

SLIDE 29

Crack detection across various load levels

[Figure: actual vs. predicted cracks at medium, high, and low load levels]

Shows better robustness (to different load levels), even for cracks only 2-3 pixels thick, than sophisticated contour detectors, edge detectors, morphological segmentation, and single-step region-growing segmentation.

SLIDE 30

Discussions

  • High characterization accuracy and satisfactory performance over a wide range of loading conditions with a limited number of labeled training images.
  • Meets the required robustness (to different load levels) via the DAE error map, in comparison to other benchmarks.
  • This approach can be applied to field inspection and borescope inspection on complex surface structures.

Fewer heuristics, validation on real data, robustness to varying conditions

SLIDE 31

Topics

  • Deep Representation Learning - DAE
  • Big PHM (Prognostics & Health Monitoring)
  • SHM (Structural Health Monitoring)
  • Deep Reinforcement Learning (DRL) for additive manufacturing (AM)
SLIDE 32

DRL for AM

AM in aerospace domain: high precision standards & high variability of tasks
  • 3D printing
  • Cold Spray
  • Arc welding
  • Powder deposition

[Diagram labels: Cold Spray Robot, Scanner, observation $o_t$ (noisy robot state, surface imaging), nozzle trajectory]

  • Global feedback controller, no need to solve online MPC
  • No need for state estimation

  • Reduce reliance on expert (rather expert guided)
  • Self-learning/adaptation to optimal behavior
  • Closing the loop with real time perception

  • Cost reduction
  • Easier commissioning
  • Improved performance & scalability
  • High precision geometric and material property for printing and repair

SLIDE 33

DRL for AM

What are we trying to do?

[Diagram: Sensors, Perception / State Estimation, Policy / Control, Low Level Control, Actuator; loop labels "Sensing" and "Actions / Control"; annotations "Expert engineered on a case by case basis" and "Automate"]

Explore recent advances in deep learning to address
  • End-to-end sensing to learning/control
  • Incorporate expert/prior knowledge

SLIDE 34

Different Flavors

End-to-end perception to learning/control
  • End-to-end guided policy search (Levine et al., 2015)
  • Deep Q Learning (Mnih et al., 2013; Lillicrap et al., 2016)

Learning features/dynamics to use in control/RL
  • Deep Q Fitted Network (Lang et al., 2012)
  • Embed to control-iLQR (Watter et al., 2015)
  • Deep Dynamical Models-NMPC (Assael et al., 2015)

Learning value function/policy in RL
  • Guided Policy Search (Levine et al., 2014)

DRL for AM

SLIDE 35

DRL for AM

How it works?

[Diagram: Agent and Simulated/Real Environment exchanging action $u_t \sim \pi(\cdot \mid s_t)$, state $s_{t+1}$, and cost $c(s_t, u_t)$]

  • Policy: $\pi: S \to P(U)$
  • Value/Q function: $Q(s, u) = E_\pi[J(\tau) \mid s_1 = s, u_1 = u]$
  • Dynamics: $s_{t+1} = f(s_t, u_t)$
  • Objective: $\min_\pi E_\pi[J(\tau)]$, with $J(\tau) = \sum_{t=1}^{T} c(s_t, u_t)$ and $\tau = \{s_1, u_1, s_2, u_2, \dots, s_T, u_T\}$

Key advance: Stable training NN for function approximation

Deep Q Learning
  • NN to represent value function
  • Purely exploratory
  • No need for generative model
  • Gaming applications / simulated environments

Guided Policy Search*
  • NN to represent policy
  • Exploit trajectory optimization to guide search
  • Need (approx.) generative model
  • Robotics application

*(Levine et al. 2015)
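The objective on this slide, the expected trajectory cost under a stochastic policy, can be estimated by rolling the policy out and summing the per-step costs. The scalar dynamics, cost, and noisy linear policy below are toy stand-ins chosen only to make the sketch runnable; none of them come from the talk.

```python
import numpy as np

def rollout_cost(f, c, policy, s1, horizon, rng):
    """Sample one trajectory tau and return J(tau) = sum_t c(s_t, u_t)."""
    s, total, trajectory = s1, 0.0, []
    for _ in range(horizon):
        u = policy(s, rng)            # u_t ~ pi(. | s_t)
        total += c(s, u)              # accumulate cost c(s_t, u_t)
        trajectory.append((s, u))
        s = f(s, u)                   # s_{t+1} = f(s_t, u_t)
    return total, trajectory

# Toy problem (hypothetical): drive a scalar state toward zero.
def f(s, u):
    return s + 0.1 * u

def c(s, u):
    return s ** 2 + 0.01 * u ** 2

def policy(s, rng):
    return -2.0 * s + 0.05 * rng.normal()   # noisy linear feedback

rng = np.random.default_rng(0)
example_cost, example_traj = rollout_cost(f, c, policy, 1.0, 20, rng)
costs = [rollout_cost(f, c, policy, 1.0, 20, rng)[0] for _ in range(200)]
expected_cost = float(np.mean(costs))       # Monte Carlo estimate of E_pi[J(tau)]
```

With zero control the state would stay at 1 and the cost would be 20 over the same horizon, so the feedback policy's estimate should come out well below that.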

SLIDE 36

Guided Policy Search Problem Formulation

Policy Search: $\min_\theta E_{\pi_\theta}[J(\tau)]$

Guided Policy Search = Supervised Learning

Approach I: One Shot
  • “Demonstrator”
  • Trajectory optimizer to generate optimal guiding sample
  • NN trained to match samples

Approach II: Incremental
  • “Good teacher”
  • Provides training to NN in small steps
  • Modified cost for trajectory optimizer, so that it solves the control problem similar to how the student (fails to) solve it

[Diagram: Trajectory Optimization, guiding samples $q(u_t \mid s_t)$, Simulated/Real Environment, policy parameters $\theta$, policy $\pi_\theta(u_t \mid s_t)$]
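The "one shot" variant reduces policy learning to regression on guiding samples. The sketch below illustrates only that supervised step, with two simplifications that are assumptions of this example: the guiding distribution q(u_t | s_t) is a hypothetical linear-Gaussian controller rather than the output of a trajectory optimizer, and the policy is linear (fitted by least squares) instead of a neural network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Guiding samples (s_t, u_t) with u_t ~ q(u_t | s_t): a made-up linear-Gaussian
# controller stands in for the trajectory optimizer.
K_guide = np.array([[-1.5, 0.3]])
states = rng.normal(size=(500, 2))
actions = states @ K_guide.T + 0.01 * rng.normal(size=(500, 1))

# Supervised step: fit the policy mean theta^T [s, 1] to the guiding samples.
features = np.hstack([states, np.ones((len(states), 1))])
theta, *_ = np.linalg.lstsq(features, actions, rcond=None)

def policy_mean(s):
    """Mean action of the fitted policy pi_theta(u | s) for one state s."""
    return np.append(s, 1.0) @ theta

u_example = policy_mean(np.array([1.0, 0.0]))
```

The incremental variant would repeat this fit while the trajectory optimizer's cost is modified to keep its samples close to what the current policy can reproduce.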

DRL for AM

SLIDE 37

Deep Reinforcement Learning (DRL)

Guided Policy Search: Incremental Approach

[Flow diagram blocks: Simulator / Real System; Dynamic Model Approx.; Cost Function Approx.; Quadratic Approx.; Iterative LQR; Sample Trajectory; NN Training; Lagrange Multiplier Update]

Symbols in the diagram: $q_i(u_t \mid s_t)$; $\tau_{ij}$, $i = 1, \dots, M$, $j = 1, \dots, N$; $\tau_i^{I}$, $i = 1, \dots, M$; $\pi_{\theta,\Sigma}(u_t \mid s_t)$; $\mu(\cdot\,; \theta)$

SLIDE 38

DRL for AM: Cold Spray for high precision repair

Guided DRL for cold spray control

Challenges

Complex physics:

  • Coupled geometric/structural properties
  • High dimensional nonlinear system

Path planning:

  • Laborious manual process ~multiple hours per part
  • Difficult to assess frustration points
  • Part machined to simplify path planning

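Dynamics:

$s_{t+1} = f(s_t, u_t)$

$s_t = (D(r_1, t), \dots, D(r_N, t), x_s(t), \alpha(t))$

$u_t = (v_s, \omega_t)$

$\frac{\partial D(r, t)}{\partial t} = \phi\!\left(\frac{r - x_s(t)}{h_s - D(r, t)}\right) f\!\left(\frac{r - x_s(t) - (h_s - D(r, t))\,\frac{\partial}{\partial r} D(r, t)}{h_s - D(r, t) + (r - x_s(t))\,\frac{\partial}{\partial r} D(r, t)}\right)$

  • $\phi(\tan\alpha)$: distribution of particles in spray cone
  • $f(\cot\beta)$: deposit efficiency function
  • $D(r, t)$: total deposit in the point r during one run
  • $(x_s(t), h_s)$: position of the nozzle at time t

Objective: Compute $u_1, u_2, \dots, u_T$ to minimize

$J(\tau) = \frac{1}{2} \sum_{t=1}^{T} \left[ (s_t(1{:}N) - D_{ref})^{*} R\, (s_t(1{:}N) - D_{ref}) + u_t^{*} Q\, u_t \right]$

[Figure: spray-cone geometry with angles α, β, γ, deposit profile D(r, t), nozzle position x_s(t), surface point r, height h_s, and reference profile D_ref]

Goal: Automate cold spray control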
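The quadratic objective can be evaluated directly once a trajectory is available. In the sketch below the first N state entries are the deposit heights at the N surface points, matching the state layout on the slide; the weights R and Q, the reference profile, and the two-step trajectory are made-up numbers for illustration. For real-valued vectors the slide's `*` is an ordinary transpose.

```python
import numpy as np

def trajectory_cost(states, controls, d_ref, R, Q):
    """J(tau) = 1/2 * sum_t [(s_t[:N] - D_ref)^T R (s_t[:N] - D_ref) + u_t^T Q u_t]."""
    n = len(d_ref)
    total = 0.0
    for s_t, u_t in zip(states, controls):
        err = s_t[:n] - d_ref                 # deposit error at the N surface points
        total += err @ R @ err + u_t @ Q @ u_t
    return 0.5 * float(total)

# Hypothetical example: N = 3 surface points, state = (D_1, D_2, D_3, x_s, alpha).
d_ref = np.ones(3)
R = np.eye(3)
Q = 0.1 * np.eye(2)
states = [np.array([0.0, 0.0, 0.0, 0.0, 0.0]),
          np.array([0.5, 0.5, 0.5, 0.1, 0.0])]
controls = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]
J = trajectory_cost(states, controls, d_ref, R, Q)
```

Here the first step contributes 3 + 0.1 and the second 0.75 + 0.1, so J = 0.5 × 3.95 = 1.975.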

SLIDE 39

Results: GPS leads to the same optimal behavior as MPC

DRL for AM: Cold Spray for high precision repair

500 randomly generated surfaces

Much Faster!

Cost

Better Training!

slide-40
SLIDE 40

[Figure panels: machining, Constant Speed, DRL Based Control]

Generalizability to various surface topologies with limited training, benchmarked against constant-speed nozzle control (state of the art in industry)

DRL for AM: Cold Spray for high precision repair

Generalizability to various surface imperfection

SLIDE 41

DRL for AM: Conclusions

  • Cost reduction from time and material savings
  • Easier commissioning due to generalizability
  • Improved performance & scalability
  • High precision geometric and material property
SLIDE 42

Questions?

Explainability & Certification of these approaches for safety-critical systems…

SLIDE 43

Publications

  • Amit Surana, Soumalya Sarkar, and Kishore K. Reddy, “Guided Deep Reinforcement Learning for Additive Manufacturing Control Application”, NIPS 2016 Deep Reinforcement Learning Workshop, December 2016.
  • Soumalya Sarkar, Kishore K. Reddy, Michael Giering, and Mark Gurvich, “Deep Learning for Structural Health Monitoring: A Damage Characterization Application”, Conference of the Prognostics and Health Management Society, August 2016.
  • Kishore K. Reddy, Soumalya Sarkar, Vivek Venugopalan, and Michael Giering, “Anomaly Detection and Fault Disambiguation in Large Flight Data: A Multi-modal Deep Autoencoder Approach”, Conference of the Prognostics and Health Management Society, August 2016.