FROM RESEARCH TO INDUSTRY: MOBILE EDITION
(Or, How I Learned To Stop Worrying about Papers and Start Building Smartphone Apps)
Stephen Miller, Co-Founder and SVP Engineering, Fyusion, Inc
THIS TALK IS ABOUT…
- How to future-proof your research code
- How to transition from academia to industry
- Hard things I’ve learned growing from a team of ~4 to
a team of >50
- A few tricks for porting desktop code to mobile
- Just enough shameless company promotion to justify
reimbursing the flight
THIS TALK IS NOT ABOUT…
- In-depth technical details
- Any particular library
- Science
- Computer Vision, really, except insofar as it informs
certain challenges
YOU ARE…?
ROUGH OUTLINE
1: Context (~5 min)
- Who I am
- What I did before Fyusion
- What Fyusion is all about
2: Growing pains (~10 min)
- Problems with going from research -> industry
- Particular problems with going from desktop ->
mobile
3: Helpful tips (~10 min)
- General code organization
- Device-specific vs generic code
- Broader startup tips (git flow, process, management)
WHO AM I?
My current face
Robotics 3D Vision Open Source / Industry Collaboration
https://www.youtube.com/watch?v=5FGVgMsiv1s
ROBOTICS @ BERKELEY
(2007 - 2011)
GAP BETWEEN ACADEMIC + PUBLIC EXPECTATION
http://abcnews.go.com/Politics/oklahoma-sen-tom-coburn-report-shows-taxpayer-money/story?id=13689403
3D VISION @ STANFORD
(2011 - 2014-ish)
- Started PhD on 3D perception
- Use only low-cost ($100 or less) sensors
[Figure labels: Sensor 0, Sensor 1]
MAINTAINER @ PCL
(2011 - present…ish)
- World’s largest 3D Image Processing initiative
[Chart: developers over time, cumulative and month to month; x-axis 3/1/11 to 9/1/12, y-axis ticks at 250 and 500]
Stop wasting tax dollars and give me all the stuff I saw in the movies
PROBLEM: CONSUMERS DON’T CARE ABOUT TECH DEMOS
Technologists: Point clouds, Meshes, Octrees, …
Consumers: Slick, seamless, “magic”
THE CHALLENGE: MAKE IT SO STURDY THEY DON’T
KNOW IT’S RESEARCH
http://fyusion.com
UNDER THE HOOD: SCIENCE
TO THE USER: MAGIC
FASHION, E-COMMERCE
CUSTOM CAPTURE MODES
EXAMPLE INDUSTRY: AUTOMOTIVE
HOW DID WE SCALE?
REQUIRED BUSTING A LOT OF MYTHS
MYTH: DEMO CODE™
RECENT EXAMPLE: BOTTLENECK
foreach slice [start, end]:
    fire up decoder
    march to start
    fire up processor
    process to end
Slices were *nonoverlapping* and *consecutive*. This code is extremely wasteful.
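A minimal sketch of the cheaper alternative, assuming the slices are sorted, nonoverlapping, and given as inclusive frame ranges: start the decoder once and serve every slice in a single pass. `CountingDecoder`, `Slice`, and `processSlicesSinglePass` are hypothetical names for illustration, and the "processing" is just a sum of frame indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a sequential video decoder: it hands out frame
// indices 0..frameCount-1 and counts how many frames it has decoded.
class CountingDecoder {
public:
    explicit CountingDecoder(int frameCount) : frameCount_(frameCount) {}

    // Returns false once the stream is exhausted.
    bool nextFrame(int& frameIndex) {
        if (next_ >= frameCount_) return false;
        frameIndex = next_++;
        ++decoded_;
        return true;
    }

    int framesDecoded() const { return decoded_; }

private:
    int frameCount_;
    int next_ = 0;
    int decoded_ = 0;
};

// Inclusive frame range [start, end].
struct Slice {
    int start;
    int end;
};

// Processes sorted, nonoverlapping slices in one pass over the stream.
// Each result is a placeholder computation: the sum of the frame indices
// that fall inside the slice.
std::vector<long> processSlicesSinglePass(CountingDecoder& decoder,
                                          const std::vector<Slice>& slices) {
    std::vector<long> results(slices.size(), 0);
    std::size_t current = 0;
    int frame = 0;
    // Stop decoding as soon as the last slice has been served.
    while (current < slices.size() && decoder.nextFrame(frame)) {
        const Slice& s = slices[current];
        assert(s.start <= s.end);
        if (frame < s.start) continue;   // march to the slice start
        results[current] += frame;       // placeholder processing
        if (frame == s.end) ++current;   // slice finished, move to the next
    }
    return results;
}
```

For slices [0, 2] and [3, 5] this decodes six frames in total; restarting the decoder for each slice would decode at least nine for the same input, and the gap grows with the number of slices.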
GIT BLAME
ORIGINAL CODE
fire up decoder
fire up processor
process
“Just hack it!”
ORIGINAL CODE ++
fire up decoder
foreach slice [start, end]:
    march to start
    fire up processor
    process to end
“Just hack it!”
Bad code evolves over time
“JUST HACK IT” - INDUSTRY EDITION
THE POINT:
DO IT WELL ENOUGH THE FIRST TIME, BECAUSE…
MYTH: THE ENGINEERS™ WILL DO IT LATER
“THE ENGINEER” WILL
- Understand my hacky pseudocode
- Convert my O(n^3) algorithm into O(n) with
Optimization™
- Add helpful comments
- Rename obscure variables and functions
- Write unit tests for everything
“The QArchitician is half QA, half Software Architect, half Mathematician!”
THE ACTUAL ENGINEER WILL
- Be deeply frustrated to work with brittle
code as a starting point
- Be very conservative about accidentally
breaking something
- Require your input and review time, and in
all likelihood…
- Quarantine your code before they fix it
AND THEY’LL BE *RIGHT* TO QUARANTINE, BECAUSE…
TOO MANY COOKS SPOIL THE ALGORITHM
Old, hacky research code
% you know offhand: 100
New, “clean”, “optimized”, “refactored” code
% you know offhand: ~60
% they know offhand: ~60
% shared: 10
WHEN CRASHES ATTACK
“This looks nothing like my algorithm.”
“Don’t ask me, it’s not my algorithm.”
TAKEAWAY: SOMEONE SHOULD FULLY UNDERSTAND
A GIVEN CODE BLOCK.
IF IT’S COMPUTER VISION RELATED,
THAT SOMEONE IS PROBABLY YOU.
MYTH: THE PLATFORM IS IRRELEVANT
There Will Be Bugs
2016 ANDROID RECORDING BUG
- Scattered reports from users on a specific device by a European
manufacturer: “Suddenly, the camera just stops sensing motion”
- Difference in timestamps: 13107153, 13107234, 13107212, 13107128
- That is a 2^17 second offset (2^17 = 131072)
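One possible defensive measure against this class of bug, sketched under assumptions: flag any sensor timestamp that moves backwards or jumps implausibly far ahead of the previous one. The helper name `findSuspiciousTimestamps` and the threshold parameter are hypothetical, and this is not necessarily how the bug above was handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the indices whose timestamp moves backwards or jumps more than
// maxGapSeconds past the previous sample. maxGapSeconds is an arbitrary,
// hypothetical threshold; tune it to the real sensor rate.
std::vector<std::size_t> findSuspiciousTimestamps(const std::vector<double>& seconds,
                                                  double maxGapSeconds) {
    assert(maxGapSeconds > 0.0);
    std::vector<std::size_t> suspicious;
    for (std::size_t i = 1; i < seconds.size(); ++i) {
        const double delta = seconds[i] - seconds[i - 1];
        if (delta < 0.0 || delta > maxGapSeconds) suspicious.push_back(i);
    }
    return suspicious;
}
```

A stream that suddenly jumps by 2^17 seconds would be flagged at the first sample after the jump, which gives the app a chance to re-anchor its clock instead of silently losing motion data.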
2017 IPHONE 8 AND X RECORDING BUG
- “All of a sudden it just stops working” - iPhone 8, 8+, and X users
- Bonus points for anyone who can tell me what is mathematically interesting
about 768614.395 seconds…
2015 VIDEO DECODER BUG
- “It works fine for the first 204…then stops forever”
- Bug in Apple video decoder (lasted at least 3 years)
[Diagram labels: video200.mp4, video204.mp4, video_sym.mp4; Decoder1, Decoder204]
EXISTENTIAL QUESTION: HOW DO YOU TEST FOR THIS $@*(&?
- Not scalable to actually test for every possible case
- Irresponsible not to test for
every possible case
“Okay, so before a PR can be merged, we need to make sure we run it on this particular model of Alcatel phone for at least 3 weeks, and also the plus version, and the international version, and a version running Lollipop, and also bring it to that garage, and also…”
MYTH: I CAN JUST DO IT ON THE CLOUD™ INSTEAD
GOING TO THE CLOUD
- 1 minute of 1080p capture on an iPhone = 200MB
- In many cities, you’ll be lucky if that is done in 5
minutes — and that assumes Apple doesn’t kill you first
- If you want to scale globally (China, India), you can’t
assume an LTE connection
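The upload arithmetic behind these bullets can be sketched as follows. The function names are hypothetical, and the sketch assumes decimal units (1 MB = 8 Mbit) and ignores protocol overhead.

```cpp
#include <cassert>

// Seconds to upload `megabytes` over a sustained uplink of `megabitsPerSecond`.
// Assumes decimal units (1 MB = 8 Mbit) and ignores protocol overhead.
double uploadSeconds(double megabytes, double megabitsPerSecond) {
    assert(megabitsPerSecond > 0.0);
    return megabytes * 8.0 / megabitsPerSecond;
}

// Sustained uplink (Mbit/s) needed to move `megabytes` within `seconds`.
double requiredMegabitsPerSecond(double megabytes, double seconds) {
    assert(seconds > 0.0);
    return megabytes * 8.0 / seconds;
}
```

Under those assumptions, moving the 200MB capture in 5 minutes needs roughly 5.3 Mbit/s of sustained uplink for the whole transfer.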
Strong AI Achieved!
Just point at anything and our patented algorithm will do the rest
MYTH: I CAN JUST SAVE RAW DATA AND DO IT OFFLINE
DOING IT OFFLINE
- 1 minute of 1080p capture = 3600 frames
- If you have a simple per-frame step (e.g. 10 ms or less), you’re
still looking at roughly 30 seconds to run it (3600 × 10 ms = 36 s at the upper end)
- And that’s ignoring h264 decoding time!
- If you can do anything online, do it! Attention spans
are very low, even for cutting edge tech.
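A small sketch of the offline-versus-online arithmetic, with hypothetical helper names. The 60 frames per second used in the example is inferred from "1 minute = 3600 frames" and is otherwise an assumption.

```cpp
#include <cassert>

// Number of frames in a capture of the given length and frame rate.
int framesInCapture(double seconds, double framesPerSecond) {
    assert(seconds >= 0.0 && framesPerSecond > 0.0);
    return static_cast<int>(seconds * framesPerSecond + 0.5);
}

// Wall-clock seconds to run a per-frame step over the whole capture offline.
double offlineSeconds(int frames, double millisecondsPerFrame) {
    return frames * millisecondsPerFrame / 1000.0;
}

// True if the step fits inside one frame interval, i.e. it could run online
// while the user is still recording.
bool fitsOnlineBudget(double millisecondsPerFrame, double framesPerSecond) {
    assert(framesPerSecond > 0.0);
    return millisecondsPerFrame <= 1000.0 / framesPerSecond;
}
```

At 60 frames per second the per-frame budget is about 16.7 ms, so a 10 ms step can run during capture and cost the user no extra waiting, whereas running it afterwards costs about 36 seconds before decoding time is even counted.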
MYTH: IT’LL BE LIKE A LAB, WE CAN WING IT!
IN THE BEGINNING…
You and a handful of colleagues know every line of code. You move extremely quickly, and have no need for QA or code review.
“Meetings” make you think of Dilbert
EVENTUALLY
50+ people, multiple products, multiple deadlines, actual customers (and this is just one repository)
No way for one person to keep track of everything
REALIZATION: WE NEED A MORE FORMAL PROCESS
MYTH: PROCESS™ WILL SOLVE IT
PROCESS: A PRIMER
Step 1: Commit to an insane deadline
“Five days? I’ll do it in one!”
Step 2: Fail
Step 3: Blame Process™
“We need JIRA!” “Trello or die!” “Did you even read The Lean Startup?!!” “Scrum you fools! It’s been scrum the whole time!”
THE TRUTH
Process is useful, and necessary. But it isn’t a magic bullet.
ENOUGH MYTHS; WHAT SHOULD I DO?
TIP: BEWARE OF CVPR SYNDROME
- “What if it also used GANs to predict when the user
wants to stop recording?”
- “Surely superresolution can help this”
- “Why are we using JPEG? This latest compression
algorithm is *way* better”
- “Couldn’t a neural network handle all the
on-screen rendering too?”
- “This would be much better if I wrapped it
in Haskell”
TIP: BEWARE OF CVPR SYNDROME
https://www.youtube.com/watch?v=evUWersr7pc
IT WILL GET MORE COMPLICATED ON ITS OWN. DON’T PUSH IT.
- Customization per customer
- Exceptions for certain lighting conditions
- Optimizations for particular phones
- New battery constraints that force you to fork and
simplify
- Handling of about 10000 different edge cases
“BUT WAIT, THERE’S…”
TIP: DON’T REINVENT THE WHEEL
- Just because you can make it on your own
doesn’t mean you should.
- Many of the things you hate about a certain
codebase are exactly what will happen to
yours. But theirs will be tested.
- Best case scenario: you waste time.
- Worst case scenario: years of bugfixing
Your hard-coded quaternion arithmetic vs. Eigen::Quaternionf
TIP: KEEP ALGORITHMS CROSS PLATFORM
[Diagram: CVLib.git (CVLib Source in C++, Debugging Tools, Algorithmic Unit Tests) builds libCV.a, libCV.so, and CVLib.framework, which are consumed by FYiOS.git (iOS Source: ObjC, Swift), FYAndroid.git (Android Source: Java), FYServer.git, FYOculus.git, …]
TIP: I/O IS CRITICAL
Camera -> grayscale cv::Mat: 3ms
State Of The Art™ “Realtime” Algorithm: 25ms
cv::Mat -> jpeg: 30ms
- Always look for hardware accelerated options
- Minimize unnecessary data
modifications (resizing, color conversions, CPU <-> GPU)
- Color conversions in particular
are a constant source of bugs; settle on a bulletproof convention ASAP
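One way to make a color convention hard to get wrong is to carry the pixel format alongside the data, so a conversion never has to guess the channel order. A minimal sketch, assuming 4-byte pixels and BT.601 luma weights; `PixelFormat` and `toGray` are hypothetical names, not CVLib's API.

```cpp
#include <array>
#include <cstdint>

// The channel order travels with the call instead of living in a comment.
enum class PixelFormat { RGBA, BGRA };

// Converts one 4-byte pixel to 8-bit luma using BT.601 weights
// (0.299 R + 0.587 G + 0.114 B), in rounded integer arithmetic.
std::uint8_t toGray(const std::array<std::uint8_t, 4>& px, PixelFormat format) {
    const unsigned r = (format == PixelFormat::RGBA) ? px[0] : px[2];
    const unsigned g = px[1];
    const unsigned b = (format == PixelFormat::RGBA) ? px[2] : px[0];
    const unsigned luma = (299u * r + 587u * g + 114u * b + 500u) / 1000u;
    return static_cast<std::uint8_t>(luma);
}
```

Reading a pure red RGBA pixel with the wrong format turns a luma of 76 into 29, which is exactly the kind of silent error that otherwise surfaces much later as "the detector got worse on Android".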
CVLib.git (C++)
ImageIO.saveImage()
RandAccMovieIO.loadFrame(idx)
SeqMovieIO.loadNextFrame()
DataIO.saveEncrypted(blob)
FYiOS.git (ObjC)
@interface HEIFWriter : ImageIO
@interface MJPEGEncoder : RandAccMovieIO
@interface HVECEncoder : SyncMovieIO
@interface iOSCryptoWrapper : DataIO
FYAndroid.git (Java)
class JPEGWriter : ImageIO
class PNGDirIO : RandAccMovieIO
class H264Encoder : SyncMovieIO
class FastAndroidEncryptor : DataIO
[Diagram: CVLib.git (C++) holds CVImage (resize(), cvtColor(), toBitmap(), toGLBuffer(), …), Processor, and ImageIO; FYiOS.git (Obj C) holds iOSImageWrapper (MetalTexture, UIImage), IOSProcessingDelegate (ProcessingCB()), and HEIFWriter]
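The split above can be sketched in C++ as an abstract interface owned by the shared library, with each platform supplying its own implementation. `ImageIO` and `saveImage` come from the slides; the signature, `InMemoryWriter`, and the shape of `Processor` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Shared (cross-platform) side: algorithms only ever see this interface.
class ImageIO {
public:
    virtual ~ImageIO() = default;
    // Returns true on success. The signature is an assumption.
    virtual bool saveImage(const std::string& name,
                           const std::vector<std::uint8_t>& pixels) = 0;
};

// Stand-in for a platform writer (the slides name HEIFWriter on iOS and
// JPEGWriter on Android). This one just keeps the bytes in memory.
class InMemoryWriter : public ImageIO {
public:
    bool saveImage(const std::string& name,
                   const std::vector<std::uint8_t>& pixels) override {
        if (pixels.empty()) return false;
        saved_[name] = pixels;
        return true;
    }
    std::size_t savedCount() const { return saved_.size(); }

private:
    std::map<std::string, std::vector<std::uint8_t>> saved_;
};

// Shared algorithm code: depends on ImageIO, never on a concrete platform type.
class Processor {
public:
    explicit Processor(std::shared_ptr<ImageIO> io) : io_(std::move(io)) {}

    bool processAndSave(const std::string& name) {
        // Placeholder output: 16 mid-gray bytes.
        const std::vector<std::uint8_t> result(16u, std::uint8_t{128});
        return io_->saveImage(name, result);
    }

private:
    std::shared_ptr<ImageIO> io_;
};
```

The in-memory writer doubles as a test fixture, so algorithmic unit tests in the shared repository can run on a desktop without any device codec.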
DOWNSIDE: SHARED CODE = TOUGHER QA
TIP: AUTOMATE EVERYTHING
Grad school edition: Makefiles for everything
TIP: CONTINUOUS INTEGRATION IS A LIFESAVER
- Use Jenkins to run sanity checks on device simulators
- Build alone is an enormous timesaver, especially on device
- Keep track of multiple products simultaneously
(PARTICULARLY WITH THE CROSS PLATFORM SIDE)
[Diagram: CVLib Source (C++) builds libCV.a, CVLib.framework, and libCV.so]
- Build all targets (iOS, Android, Desktop) to ensure no API breakage.
- Alternative: your research engineers will never be able to push code without
having an iPhone, an Android phone, and extraordinary patience
TIP: (SOME) GIT FLOW IS IMPORTANT
- Lots of blogs take this to an extreme
(Feature -> sprint -> epic -> qa -> release -> …)
What works for us is more basic:
- Engineer makes PR onto dev
- dev PR gets code review + unit tested -> merged
- Once all features are in, PR opened from
dev -> release
- release PR gets more thorough QA,
manual testing -> merged
- Only release branch gets shipped to
iOS, Android, etc repositories
TIP: FIND THE RIGHT REVIEWING BALANCE
Overly Strict:
- Nothing gets merged unless
it is perfect
- Deadlines slip due to inactivity
/ unwillingness to take any chances
- Morale drops to 0
Overly Permissive:
- Everything gets merged
- Destabilizing crashes ruin every delivery
- Deadline rushes create unusable code
- Morale drops to 0
6/17/2018 Complex processing changes by sdmiller · Pull Request #1 · sdmiller/cvpr
https://github.com/sdmiller/cvpr/pull/1#pullrequestreview-129398515
Branches: master, addedProcessing
Commits: Trying out slicing (0a85920), Slicing (99472cc), 3 days of hard work (bbde897)
CVPR/Processor.m
+ - (NSArray<ProcessingResult*>*)processForSlices:(NSArray*)slices
+ {
+     NSMutableArray<ProcessingResult*>* resultsPerSlice = [NSMutableArray new];
+     for (Slice* slice in slices) {
+         ProcessingResult* result = [ProcessingResult new];
+         [self.decoder restart];
+         int frameCount = -1;
+         Frame* currentFrame;
+         while ( (currentFrame = [self.decoder getFrame]) ) {
+             ++frameCount;
+             if (frameCount < slice.start) {
+                 continue;
+             } else if (frameCount > slice.end) {
+                 break;
+             }
+             // BLAH BLAH DO MAGIC
+         }
+         [resultsPerSlice addObject:result];
  …
+ - (ProcessingResult*)process
+ {
+     ProcessingResult* result = [ProcessingResult new];
+     [self.decoder restart];
+     int frameCount = -1;
+     Frame* currentFrame;
  …
HEALTHY EXPECTATIONS IN RELATIONSHIPS
Dan Savage relationship advice: “There's no perfect person, no perfect person for you, no perfect match…etc. You'll meet a 0.64 or two if you're lucky—if you're really lucky you might even meet a 0.72—and it's your job to round that [person] up to one.”
HEALTHY EXPECTATIONS IN CODE
My engineering sanity advice: No one will write perfect code, with perfect variable names, perfect comment styles, a unit test for every method so thorough we will know it works simply if it runs. Like human relationships, if you want to ever have a chance at merging, round “pretty good!” up to perfect and move on with your life.
TIP: BUFFERS (THE HUMAN KIND) ARE CRITICAL
Feb 2016: I take over a project for a big client.
- Biweekly phone calls, incremental features.
- “Hacking is fun! I could do this in my sleep!”
- PRs every day or two with good, targeted improvements
By March…
- ~15 self-merge PRs a day, usually with curse words attached.
BURNOUT
By April…
[Photo: actual photo in China after 2 sleepless weeks; label: fever medicine]
HOW IT WORKS: ENGINEER <-> CUSTOMER
Engineer <-> Customer’s PM
Boss 1: “Keep Customer happy”
Boss 2: “PUSH PUSH PUSH”
Customer’s PM: “Is X possible?”
Engineer: “Yes everything is trivial”
Customer’s PM: “How long?”
Engineer: “I have a PhD 9 hours”
Customer’s PM: “What about Y?”
Engineer: “Idk, 7 hours?”
Customer’s PM: “Great talk to you in 16 hours with a new buglist!”
Engineer: “But what about sleep?”
Customer’s PM: “I don’t pay you to sleep!”
Engineer: “You don’t pay me at a…”
1 WEEK LATER
Boss 2: “Great pushing today, Jimbo!”
Boss 1: “Why is our codebase in Piglatin?”
HOW IT SHOULD WORK: PM <-> CUSTOMER
Product Manager <-> Customer’s PM
Boss 1: “Keep Customer happy”
Boss 2: “PUSH PUSH PUSH”
Customer’s PM: “Is X possible?”
Product Manager: “I don’t know I’d need to ask my Engineer.”
Customer’s PM: “How long?”
Product Manager: “I’ll ask tomorrow.”
Customer’s PM: “What about Y?”
Product Manager: “I’ll ask tomorrow.”
Customer’s PM: “So when will it be ready?”
Product Manager: “Engineers, right? I’ll text u tmrw”
Customer’s PM: “Okay put it in a giant spreadsheet”
Product Manager: “I love those!”
HOW IT SHOULD WORK: PM <-> ENGINEER
Product Manager <-> Engineer
Product Manager: “How was your weekend, champ?”
Engineer: “I went rock climbing and watched Westworld, per CS stereotype”
Product Manager: “Cool cool cool hey, can we touch base on X later?”
Engineer: “It’s trivial I have a PhD I could do that in 9 hours.”
Product Manager: “Can you do it by Friday?”
Engineer: “Are you even listening I have a Ph…”
Product Manager: “Friday! You’re a champ!”
1 WEEK LATER
Boss 2: “Great pushing today, Jimbo!”
Boss 1: “Friday deadline on track?”
LOW-LEVEL LIGHTNING ROUND
TIP: GUARD THE MAIN THREAD WITH YOUR LIFE
Viewer
- Separate viewing logic
(previsualization, computation) from actual rendering
- iOS: DisplayLink very useful here
- Bonus: manually decode JPEGs on a
background thread; don’t let viewer handle it on the fly
Camera
- Decouple online computation (detections, keypoint tracking, style transfer, etc) from preview rendering
- Don’t queue forever: know when to throttle, when to revert to offline
- Write flexibly; remember, what looks good on the iPhone X won’t look good on a 4s
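"Don't queue forever" can be sketched as a small bounded queue between the camera callback and the worker thread: when the worker falls behind, the oldest frame is dropped instead of letting the backlog grow. The class name, the drop-oldest policy, and the capacity are assumptions for illustration, not the talk's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Bounded hand-off between a producer (camera callback) and a consumer
// (processing thread). When full, the oldest frame is dropped.
template <typename Frame>
class ThrottledFrameQueue {
public:
    explicit ThrottledFrameQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Called from the camera side. Never waits for the consumer.
    void push(Frame frame) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.size() == capacity_) {
            frames_.pop_front();  // throttle: sacrifice the stalest frame
            ++dropped_;
        }
        frames_.push_back(std::move(frame));
    }

    // Called from the worker side. Returns false if nothing is waiting.
    bool tryPop(Frame& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (frames_.empty()) return false;
        out = std::move(frames_.front());
        frames_.pop_front();
        return true;
    }

    // How many frames were dropped so far.
    std::size_t dropped() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return dropped_;
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::deque<Frame> frames_;
    std::size_t dropped_ = 0;
};
```

A steadily climbing `dropped()` count is one possible signal that the device cannot keep up and that the work should revert to offline processing.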
TIP: MEMORY MATTERS
- iOS: @autoreleasepool, intercept
memory warnings
- Android: reason with the garbage collector in mind
(and the stutter it could cause)
- Trust design patterns for asynchronous
systems: they are *hard* to rederive from scratch
- Favor online approaches wherever
possible
TIP: MEMORY MATTERS
- Common mistake: bad delegate patterns
CameraViewController: @property (strong) ___ *detector
DGBGDetector: @property (strong) ___ *delegate
Circular reference = memory leak!
Corrected:
CameraViewController: @property (strong) ___ *detector
DGBGDetector: @property (weak) ___ *delegate
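The slides show this in Objective-C; the same ownership rule can be sketched in C++ with `std::shared_ptr` and `std::weak_ptr`. The struct names echo the slide, but their members and the helper `makeAndRelease` are assumptions for illustration.

```cpp
#include <memory>

struct Detector;

struct CameraController {
    std::shared_ptr<Detector> detector;  // strong: the controller owns its detector
};

struct Detector {
    std::weak_ptr<CameraController> delegate;  // weak: the back-reference must not own
};

// Wires controller and detector together, then lets the only strong reference
// go out of scope. The returned weak handle lets the caller observe whether
// the controller was actually destroyed.
std::weak_ptr<CameraController> makeAndRelease() {
    auto controller = std::make_shared<CameraController>();
    controller->detector = std::make_shared<Detector>();
    controller->detector->delegate = controller;
    return controller;  // converts to weak_ptr; the local shared_ptr dies on return
}
```

With the weak back-reference the returned handle is expired, meaning both objects were freed. If `delegate` were a `std::shared_ptr`, each object would keep the other alive and neither destructor would ever run.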
TIP: MEMORY MATTERS
- Even more nefarious: inline function blocks
self.successHandler = ^{ _statusButton.text = @"SUCCESS!"; };
self -> successHandler: @property (strong) void() *
successHandler -> self: @property (strong) Controller *
- Correct pattern: weak pointers
__weak typeof(self) weakSelf = self;
self.successHandler = ^{ weakSelf.statusButton.text = @"SUCCESS!"; };
self -> successHandler: @property (strong) void() *
successHandler -> weakSelf: @property (weak) Controller *
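The block-capture version of the same leak has a direct C++ analogue: a stored lambda that captures a strong pointer to its owner. A minimal sketch, assuming a hypothetical `Controller` with a `std::function` handler, where the lambda captures a `std::weak_ptr` and upgrades it only while it runs:

```cpp
#include <functional>
#include <memory>
#include <string>

struct Controller : std::enable_shared_from_this<Controller> {
    std::string statusText;
    std::function<void()> successHandler;

    // Must be called on an object that is already owned by a shared_ptr.
    void installHandler() {
        // Capturing shared_from_this() directly would make the handler own
        // the controller, and the controller owns the handler: a cycle.
        std::weak_ptr<Controller> weakSelf = shared_from_this();
        successHandler = [weakSelf] {
            // Upgrade only for the duration of the call; do nothing if the
            // controller is already gone.
            if (auto self = weakSelf.lock()) self->statusText = "SUCCESS!";
        };
    }
};
```

The handler still works while the controller is alive, and the controller is destroyed as soon as its last outside owner lets go.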
TIP: MEMORY MATTERS
- Be strict:
- Reuse large objects (e.g. neural networks, filters) as global singletons
- Clean up unneeded memory ASAP; don’t wait for the garbage collector to get around to it
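A minimal sketch of the "reuse large objects" advice, assuming a hypothetical `HeavyModel` that stands in for something expensive to build such as network weights. It uses a function-local static, which C++11 guarantees is initialized exactly once even when several threads race to it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an object that is expensive to construct.
class HeavyModel {
public:
    // Built on first use, then shared by every caller.
    static HeavyModel& shared() {
        static HeavyModel instance;
        return instance;
    }

    // How many times the expensive constructor has run.
    static int constructionCount() { return counter(); }

    std::size_t weightCount() const { return weights_.size(); }

    HeavyModel(const HeavyModel&) = delete;
    HeavyModel& operator=(const HeavyModel&) = delete;

private:
    HeavyModel() : weights_(1u << 20, 0.0f) { ++counter(); }

    static int& counter() {
        static int count = 0;
        return count;
    }

    std::vector<float> weights_;  // placeholder for the large payload
};
```

Every call site gets the same instance, so the allocation happens once per process rather than once per capture. The trade-off is that the memory is never returned, so this fits objects that are needed for the app's whole lifetime.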
THANKS!
- Check us out at http://fyusion.com
- Try our newly released Viewer SDK at http://developers.fyusion.com
- Tweet me @sdavidmiller, find me at http://sdavidmiller.com, or…
- Google “Stephen Miller”, and help PageRank devalue this guy: