


Deep Learning in Computer Vision (CSC2523) Reading List

Bid for papers: Tue, Jan 26, 2016, 11.59pm
Reviews due: every Monday (one day before class), 11.59pm

1 Bid on papers NOW

Below is a list of papers we’ll be reading in the course. You are expected to present one paper. We’ll have a bidding system. Please submit a ranked list of papers you’d like to present here:
https://docs.google.com/forms/d/1UqtYUESRonNjX5mjXF3xGb5bvbsmrXIrq6t0gMbCkhM/viewform?usp=send_form

Use the numbers from this document to refer to papers. Ranking more than one paper is better: if there is too much interest in one paper, I’ll just move further down your list. If you don’t submit a preference list, I’ll do a random assignment.

You are not expected to read all papers in order to bid. Just browse through them and decide which topics, types of approaches, etc., appeal to you more.

Note that the list contains a small subset of all available literature on deep learning, and new papers are constantly being published. If you know of an interesting or newer paper that is not on the list, please suggest it through the link above.

2 Paper presentation

Each presentation should be 10 to 20 minutes long, depending on the paper (some papers are easier to explain than others). Time your presentation so that you don’t go overtime. Each presentation will be followed by a 5 to 10 minute discussion by everyone in the class.

You can present with slides or by walking through the paper itself on the projector. You can use existing visualizations, or even a few existing slides (if available), as long as you reference them properly (include a reference on each slide where you borrowed text, a visualization, or the slide itself).

The structure of the presentation should be roughly as follows. You are free to choose your own flow if it is better suited to the paper.
• High-level overview, motivation, problem definition, contributions
• Overview of the technical approach
• Overview of the experimental evaluation
• Strengths/weaknesses of the paper (approach, evaluation)

Showing a demo (or some additional results) is a great addition (of course not possible for all papers).

3 Reviewing

A rough guideline for writing your paper reviews:

• Short summary of the paper
• List main contributions
• List positive and negative points with a short discussion
• How strong is the evaluation? Are there some experiments missing?

You do not need to write a novel; keep the review short and concise.

4 Reading List

Click on the title of the paper to access it.

1. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
   Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L Yuille
   ICLR, Nov 2015
   Project page: https://bitbucket.org/deeplab/deeplab-public/
   Topic(s): Semantic segmentation
   Presentation date: Jan 19, presenter: Shenlong

2. Highway Networks
   Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber
   (arXiv:1505.00387), Nov 2015
   Project page: http://people.idsia.ch/~rupesh/very_deep_learning/index.html
   Topic(s): Very deep networks
   Presentation date: Jan 26, presenter: Renjie

3. Deep Residual Learning for Image Recognition
   Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
   (arXiv:1512.03385), Dec 2015
   Topic(s): Very deep CNNs
   Presentation date: Jan 26, presenter: Renjie

4. Rich feature hierarchies for accurate object detection and semantic segmentation
   Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
   CVPR, 2014
   Project page: https://github.com/rbgirshick/rcnn
   Topic(s): Object detection
   Presentation date: Jan 26, presenter: Kaustav

5. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
   Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
   (arXiv:1506.01497), June 2015
   Code (Matlab): https://github.com/ShaoqingRen/faster_rcnn

   Code (Python): https://github.com/rbgirshick/py-faster-rcnn
   Topic(s): Object detection
   Presentation date: Jan 26, presenter: Kaustav

6. DeepFace: Closing the Gap to Human-Level Performance in Face Verification
   Yaniv Taigman, Ming Yang, Marc’Aurelio Ranzato, Lior Wolf
   CVPR, June 2014
   Topic(s): Face verification

7. PANDA: Pose Aligned Networks for Deep Attribute Modeling
   Ning Zhang, Manohar Paluri, Marc’Aurelio Ranzato, Trevor Darrell, Lubomir Bourdev
   CVPR, June 2014
   Code: https://github.com/facebook/pose-aligned-deep-networks
   Topic(s): Attribute prediction

8. Computing the Stereo Matching Cost with a Convolutional Neural Network
   Jure Žbontar, Yann LeCun
   (arXiv:1409.4326), Sep 2014
   Code: https://github.com/jzbontar/mc-cnn
   Topic(s): Stereo estimation

9. FlowNet: Learning Optical Flow with Convolutional Networks
   Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox
   (arXiv:1504.06852), April 2015
   Code: http://lmb.informatik.uni-freiburg.de/resources/software.php
   Topic(s): Flow estimation

10. Visual Tracking with Fully Convolutional Networks
    Lijun Wang, Wanli Ouyang, Xiaogang Wang, Huchuan Lu
    ICCV, 2015
    Topic(s): Tracking

11. Two-Stream Convolutional Networks for Action Recognition in Videos
    Karen Simonyan, Andrew Zisserman
    NIPS, Dec 2014
    Topic(s): Action recognition

12. Dense Optical Flow Prediction from a Static Image
    Jacob Walker, Abhinav Gupta, Martial Hebert
    ICCV, 2015
    Topic(s): Flow prediction from a monocular image

13. Designing Deep Networks for Surface Normal Estimation
    Xiaolong Wang, David Fouhey, Abhinav Gupta
    CVPR, 2015
    Topic(s): Surface estimation from a monocular image

14. Depth Map Prediction from a Single Image using a Multi-Scale Deep Network
    David Eigen, Christian Puhrsch, Rob Fergus
    NIPS, 2014
    Project page: http://www.cs.nyu.edu/~deigen/depth/

    Presented together with:
    Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture
    David Eigen, Rob Fergus
    ICCV, 2015
    Project page: http://www.cs.nyu.edu/~deigen/dnl/
    Topic(s): Depth estimation from a monocular image

15. Learning Rich Features from RGB-D Images for Object Detection and Segmentation
    Saurabh Gupta, Ross Girshick, Pablo Arbeláez, Jitendra Malik
    ECCV, 2014
    Project page: https://github.com/s-gupta/rcnn-depth
    Topic(s): Object detection and segmentation in RGB-D

16. Aligning 3D Models to RGB-D Images of Cluttered Scenes
    Saurabh Gupta, Pablo Arbeláez, Ross Girshick, Jitendra Malik
    CVPR, 2015
    Topic(s): Aligning CAD models in RGB-D

17. Monocular Object Instance Segmentation and Depth Ordering with CNNs
    Ziyu Zhang, Alexander G. Schwing, Sanja Fidler, Raquel Urtasun
    ICCV, 2015
    Topic(s): Class-instance segmentation

18. Where to Buy It: Matching Street Clothing Photos in Online Shops
    M. Hadi Kiapour, Xufeng Han, Svetlana Lazebnik, Alexander C. Berg, Tamara L. Berg
    ICCV, 2015
    Project page: http://www.tamaraberg.com/street2shop/
    Topic(s): Instance recognition

19. Where are they looking?
    Adria Recasens, Aditya Khosla, Carl Vondrick, Antonio Torralba
    NIPS, 2015
    Project page: http://gazefollow.csail.mit.edu/index.html
    Topic(s): Gaze prediction

20. DeepStereo: Learning to Predict New Views from the World’s Imagery
    John Flynn, Ivan Neulander, James Philbin, Noah Snavely
    (arXiv:1506.06825), June 2015
    Topic(s): View synthesis

21. Learning to Generate Chairs, Tables and Cars with Convolutional Networks
    Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, Thomas Brox
    (arXiv:1411.5928), Nov 2014
    Code: http://lmb.informatik.uni-freiburg.de/resources/software.php
    Topic(s): Image generation

22. Learning to Deblur
    Christian J. Schuler, Michael Hirsch, Stefan Harmeling, Bernhard Schölkopf
    (arXiv:1411.5928), Nov 2014
    Topic(s): De-blurring

23. Explaining and Harnessing Adversarial Examples
    Ian J. Goodfellow, Jonathon Shlens, Christian Szegedy
    ICLR, 2015
    Blog: http://karpathy.github.io/2015/03/30/breaking-convnets/
    https://codewords.recurse.com/issues/five/why-do-neural-networks-think-a-panda-is-a-vulture
    Read together with:
    Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images
    Anh Nguyen, Jason Yosinski, Jeff Clune
    CVPR, 2015
    Topic(s): Fooling neural nets, adversarial training

24. Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions
    Jimmy Ba, Kevin Swersky, Sanja Fidler, Ruslan Salakhutdinov
    ICCV, Dec 2015
    Topic(s): Zero-shot learning of visual models from text

25. A Neural Algorithm of Artistic Style
    Leon A. Gatys, Alexander S. Ecker, Matthias Bethge
    (arXiv:1508.06576), Aug 2015
    Code: https://github.com/jcjohnson/neural-style
    Topic(s): Changing the style of images

26. We Are Humor Beings: Understanding and Predicting Visual Humor
    Arjun Chandrasekaran, Ashwin K Vijayakumar, Stanislaw Antol, Mohit Bansal, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh
    (arXiv:1512.04407), Dec 2015
    Topic(s): Humor (doesn’t have much of an NN flavor, but fun)

27. Deep Karaoke: Extracting Vocals from Musical Mixtures Using a Convolutional Deep Neural Network
    Andrew J.R. Simpson, Gerard Roma, Mark D. Plumbley
    (arXiv:1504.04658), April 2015
    Topic(s): Blind-deconvolution with NNs

28. Fast Algorithms for Convolutional Neural Networks
    Andrew Lavin, Scott Gray
    (arXiv:1509.09308), Sep 2015
    Topic(s): Improving the speed of CNNs

29. Distributed Representations of Words and Phrases and their Compositionality
    Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean
    NIPS, 2013
    Code: https://code.google.com/p/word2vec/
    Topic(s): Word2vec (vector representation of words)

30. Skip-Thought Vectors
    Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler
