AI and Embodiment at the Edge: Leveraging Deep Learning, with OSS Project Watson-Intu (“Self”)
IBM WATSON CLOUD MARCH, 2018
What are we talking about when we say “AI”? Multiple uses of the term:
AI: “Artificial Intelligence” – replicating the human mind in artificial form
AI: “Augmented Intelligence” – augmenting human cognition with intelligent assistance
Distinction between AI and Deep Learning / NNs
AI, in both cases, may be brought about by multiple contributing methods/technologies:
Rule-based or “Expert” systems1
Knowledge Graphs2 / semantic representations
Machine Learning, including Neural Network-based Deep Learning (differentiable programming3)
1 “Rule-Based Expert Systems”, Crina Grosan and Ajith Abraham: https://link.springer.com/chapter/10.1007/978-3-642-21004-4_7
2 “Towards a Definition of Knowledge Graphs”, Lisa Ehrlinger and Wolfram Wöß, Institute for Application-Oriented Knowledge Processing, Johannes Kepler University Linz, Austria: http://ceur-ws.org/Vol-1695/paper4.pdf
[Slide images: Knowledge Graph for IBM Inventors (EKG); Rule-Based System; Deep Neural Network]
Cognitive systems:
Are able to learn their behavior through education
Support forms of expression which are more natural for human interaction
Have expertise as their primary value
Continue to evolve as they experience new information, new scenarios, and new responses
And do so at enormous scale
1 NVIDIA GTC 2016 Keynote, Rob High CTO IBM Watson Cloud: https://fudzilla.com/news/40407-ibm-cto-shows-off-gpu-accelerated-cognitive-computing
[Charts: Human-System Interaction1; Data produced per day, projections1]
Rapidly advancing capabilities:
1 “A Historical Perspective of Speech Recognition”, Xuedong Huang, James Baker, Raj Reddy; Communications of the ACM, Vol. 57 No. 1, Pages 94-103, 10.1145/2500887
2 “Fast human detection in crowded scenes by contour integration and local shape estimation”, Csaba Beleznai, Horst Bischof, in 2009 IEEE Conference on Computer Vision and Pattern Recognition
Jensen Huang, 2017 NVIDIA Keynote; Human Detection in Crowded Scenes2
Historical Perspective of Speech Processing1
IBM Watson Natural Language
An open-source project for embodied cognition
Embodied: an entity is in and of the world; it can sense, react, and act in that world. Cognitive: an entity can reason and learn. An embodied cognitive entity has an identity that distinguishes it from all other entities (and is to a degree aware of its own identity).
1 “Project Intu v1.0”, Grady Booch, IBM: https://ibm.box.com/v/IBMWatson-Intu-Self-Embodiment
Nao Robot and Hilton Concierge; SoftBank “Pepper”
Based on a cognitive architecture named “Self”.
Self is an agent-based architecture that combines connectionist and symbolic models of computation, using blackboards for opportunistic collaboration. Project Intu provides a framework for orchestrating cognitive services in a manner that brings higher level cognition to an embodied system.
1 “Project Intu v1.0”, Grady Booch, IBM: https://ibm.box.com/v/IBMWatson-Intu-Self-Embodiment
2 Image Credit: Global Digital Citizen: https://globaldigitalcitizen.org/category/global-digital-citizen
Credit: Global Digital Citizen2
Models Within
A knowledge graph is maintained of historical events, interactions, and representations
1 “Project Intu v1.0”, Grady Booch, IBM: https://ibm.box.com/v/IBMWatson-Intu-Self-Embodiment
Agent-based blackboard system
Sensors contribute data to topics published to the blackboard
Agents subscribe to topics (system health, question, identity, position, weather…) and act according to Goals, fulfilling Plans
Skills are defined to interact with the user, and Gestures are the manifestation of the interaction
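The sensor/agent flow above can be pictured as a minimal publish/subscribe loop. This is an illustrative Python sketch, not the actual Self codebase (which is C++); the class, method, and topic names here are assumptions.

```python
# Minimal sketch of an agent-based blackboard: sensors publish data to
# named topics, agents subscribe to topics and react to what appears.
from collections import defaultdict

class Blackboard:
    """Central store: topics map to lists of subscriber callbacks."""
    def __init__(self):
        self.subscribers = defaultdict(list)
        self.history = []  # running record of past events

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, payload):
        self.history.append((topic, payload))
        for cb in self.subscribers[topic]:
            cb(payload)

# An "agent" reacting to a sensor topic
heard = []
bb = Blackboard()
bb.subscribe("question", lambda text: heard.append(text))
bb.publish("question", "what is the weather?")  # e.g. posted by a speech sensor
```

An agent fulfilling a Plan would subscribe to the topics its Goals depend on and publish its own results back, so other agents can build on them.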
1 “Project Intu v1.0”, Grady Booch, IBM: https://ibm.box.com/v/IBMWatson-Intu-Self-Embodiment
Self Startup Sequence
The result of decades of foundational technologies, combined with emerging technologies:
Society of Mind
Agent-based Blackboard
Knowledge Graphs
Neural Networks
Deep Learning
Edge Compute
Distributed Systems
Containerized Microservices
GPU-accelerated Devices
Docker registry
Watson Services (Image Credit: https://www.ibm.com/cloud-computing/bluemix/sites/default/files/assets/page/feature-cognitive_0_0.png)
Ref: Docker Stack
* External repo for Self setup on NVIDIA Jetson TX2: https://github.com/chrod/self-jetsontx2/wiki/Getting-Started
Setup*:
Jetson TX2 hardware: keyboard, monitor, webcam, mic
Set up IBM Edge (see previous WIoTP steps)
Register the edge device in IBM Watson Cloud
Set up Watson services: Conversation
Conversation setup: Intu Dialog Starter
Copy Watson credentials to the local config folder on the Jetson
“Self” Embodiment on Jetson TX2
Out-of-the-box integration w/ Watson services
Custom “Edge” plugin w/ basic Agent (DL Agent)
Workload group:
Intu “Self”
Aural2: command words
Face-classification: emotion
Intu Self on Jetson TX2 (viewed from Mac)
[Architecture diagram: Watson-Intu, Face Classification AI, Aural2; camera devices /dev/video1 /dev/video5 /dev/video6; mic]
NVIDIA GTC 2018 Session 81037: “Doing what the User Wants, Before They Finish Speaking”
Sound state classifier for human speech; Long short-term memory (LSTM) model
Training: upload ten-second audio clips for labeling & subsequent training
Written using a TensorFlow compute graph, in Go
~30 times/second the model outputs the probabilities for each state of the world
Multiple vocabularies (one model trained per vocabulary)
Words the user says: recognizes the word spoken (“play”)
Intent (action the user wants performed): “play music” (as opposed to “I play ball”)
*Emotional state of the user
*Person who is currently talking (user or other)
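Aural2's per-frame output can be pictured as a softmax over the states of the active vocabulary, emitted roughly 30 times a second. The vocabulary and logits below are invented for illustration; only the shape of the computation is taken from the session description.

```python
import numpy as np

VOCAB = ["silence", "play", "stop", "other"]  # hypothetical vocabulary

def softmax(logits):
    """Numerically stable softmax over one frame's logits."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()

# Stub model output for a single ~33 ms frame; a real system would get
# these logits from the LSTM.
probs = softmax(np.array([0.1, 2.0, 0.3, 0.2]))
top_state = VOCAB[int(np.argmax(probs))]  # most probable state this frame
```

With one model trained per vocabulary, switching vocabularies just swaps which model (and which state list) is consulted each frame.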
Performance
Model size: 5.4 MB; ~10% CPU usage (800 MHz x86 laptop). Can run on a Pi 3, but slowly…
Training time: 2.5 min for 3000 mini-batches on 1 hr of audio (GTX 1060), i.e. you can train at home
Negative latency: Aural2 predicts the word/intent prior to the end of the word
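“Negative latency” falls out of the per-frame scoring: since the model emits a probability for every state on every frame, an intent can be acted on as soon as its probability crosses a confidence threshold, before the final frames of the word arrive. The threshold and frame data below are illustrative assumptions, not Aural2 internals.

```python
THRESHOLD = 0.9  # assumed confidence level for acting early

def first_confident_frame(frame_probs, intent):
    """Index of the first frame where the intent's probability crosses
    the threshold, or None if it never does."""
    for i, probs in enumerate(frame_probs):
        if probs.get(intent, 0.0) >= THRESHOLD:
            return i
    return None

# Made-up per-frame probabilities for "play" as the word is being spoken:
# confidence rises frame by frame, crossing the threshold before the
# word's final frame.
frames = [{"play": 0.2}, {"play": 0.6}, {"play": 0.93}, {"play": 0.97}]
fire_at = first_confident_frame(frames, "play")  # fires on frame 2 of 4
```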
Privacy
All data stays at the edge (inference and training on-prem with consumer-grade equipment)
GitHub repo for the open-source face classification build: https://github.com/open-horizon/cogwerx-face-classification-tx2/
Face classification codebase with Intu publish: https://github.com/ig0r/face_classification/tree/intu
Face Classification Workload:
Keras 2.x Convolutional Neural Network
TensorFlow 1.3; Computer Vision: OpenCV 3.1
Programming languages: Python 3, OpenCV native bindings
Demos: face emotion classification using a video, and webcam
Tie-in with Intu:
Service: publishes messages to blackboard topic Emotion (“Angry”, 1)
W.I.P.: Agent inspects topic messages, persists user emotional state
Deep Learning Models:
Adaptive Boosting frontal face detector by Rainer Lienhart (OpenCV) Keras Xception Model with 7 classes (angry, disgust, fear, happy, sad, surprise, neutral)
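A sketch of how the Xception model's 7-way softmax output might be reduced to the (label, confidence) pair published on the Emotion blackboard topic, e.g. (“Angry”, 1). The function name and rounding are assumptions for illustration; only the class list comes from the slide.

```python
import numpy as np

# The 7 emotion classes of the Keras Xception model, per the slide
EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def emotion_message(softmax_output):
    """Pick the top class and round its confidence for publishing to
    the Emotion blackboard topic."""
    idx = int(np.argmax(softmax_output))
    return EMOTIONS[idx], round(float(softmax_output[idx]))

# Stub softmax vector standing in for a real model prediction
msg = emotion_message(np.array([0.90, 0.02, 0.02, 0.02, 0.02, 0.01, 0.01]))
```

The W.I.P. agent mentioned above would consume these messages off the topic and persist the user's emotional state over time.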
Training:
Model trained with the Kaggle Facial Expression Recognition Challenge dataset (FER2013)
Training time on 2 × NVIDIA GRID K2 GPUs: 6-20 hours depending on training parameter settings
GRID K2: 2 × 2.2 TFLOPS (FP32)… GTX 1080: 1 × 9.0 TFLOPS (1080 Ti: 1 × 11.3 TFLOPS)
Face-Classification from video file, publishing to Intu Topic
Video: See 2018 NVIDIA/IBM Edge Webinar
Hardware
What you can do with IBM's Edge (Beta):
Deploy services to the edge
Custom-develop services
Collect insights
Docker registry
IBM is interested in your feedback: visit us on discourse.bluehorizon.network
Compute at the source of data (IoT and Beyond)
Supporting cognitive edge deployment and development
(Ubuntu, Debian, Raspbian)
Getting Started: open-horizon examples and wiki: https://github.com/open-horizon/examples/wiki
Cloud
[Diagram: Jetson TX2 → MQTT → WIoTP Portal; edge messages forwarded on to the WIoTP Cloud]
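For reference, WIoTP devices publish events over MQTT on topics of the form iot-2/evt/&lt;eventId&gt;/fmt/&lt;format&gt;, with the event data conventionally wrapped under a "d" key. A minimal sketch of building such a message; the event id and payload fields here are made up for illustration.

```python
import json

def wiotp_event(event_id, data, fmt="json"):
    """Build the MQTT topic and JSON payload for a WIoTP device event."""
    topic = "iot-2/evt/{}/fmt/{}".format(event_id, fmt)
    payload = json.dumps({"d": data})  # WIoTP wraps event data under "d"
    return topic, payload

# Example: an edge node reporting CPU load up to the WIoTP Portal
topic, payload = wiotp_event("status", {"cpu_load": 0.42})
```

In practice an MQTT client (e.g. paho-mqtt) would connect to the WIoTP broker with the device's credentials and publish this topic/payload pair.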
(may need to wait a few seconds if the edge node has just been enabled and is coming up...)
Getting started
Develop edge services
(video, mic audio, CPU load); exposure to raw data
Develop workloads
(analytics, deep learning inference, etc.)
Access to system hardware if not included at runtime
Develop patterns
Examples; GPU & Display access via Docker
Best practices for Docker container builds
Multi-container groups; on-device streams
/dev/video1 /dev/video5 /dev/video6 /dev/video7
*Edge agent creates Docker networks for patterned containers
Face Classification AI Application
*Image Attribution: https://success.docker.com/api/images/Docker_Reference_Architecture_Designing_Scalable,_Portable_Docker_Container_Networks%2F%2Fimages%2Fcnm.png
Face Classification
CUDA/ CUDNN
FROM <CUDA/CUDNN>
Project Intu
https://github.com/watson-intu/self
https://intu-team.slack.com
IBM Watson IoT Platform
Single URL with all of the above:
1 Email egrady@booch.com for an invitation
History 2016-’17: Project Blue Horizon
Integration w/ IBM Watson Cloud
IBM Watson IoT Platform: https://www.ibm.com/internet-of-things/spotlight/watson-iot-platform
http://bluehorizon.network