Workshop Program

The ChaLearn Looking at People Workshop 2015, to be held on December 12th in conjunction with ICCV 2015 in Santiago de Chile, Chile, is devoted to presenting the most recent and challenging techniques for Looking at People, such as human body pose recovery, human action/interaction, soft-biometric analysis, facial expression recognition, and cultural event recognition. The best workshop papers will be invited to the IJCV Special Issue, and the best one will be awarded an NVIDIA Titan X graphics card.

The NVIDIA ChaLearn LAP 2015 Best Paper Award has been awarded to "DEX: Deep EXpectation of apparent age from a single image", Rasmus Rothe, Radu Timofte, Luc Van Gool.


08:45h Opening: Presentation of the workshop, Sergio Escalera, UB-CVC 


09:00h Invited Speaker I: Rama Chellappa, University of Maryland
Session chair: Xavier Baró 

09:45h Session I: Challenge results presentation and award ceremony, Sergio Escalera, UB-CVC 
  • ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition datasets and results, Sergio Escalera, Junior Fabian, Pablo Pardo, Xavier Baró, Jordi Gonzàlez, Hugo Jair, Dusan Misevic, Ulrich Steiner, Isabelle Guyon 
Session chair: Xavier Baró 

10:00h Coffee Break 

10:30h Session II: Winners Apparent Age Estimation 
  • 1st: DEX: Deep EXpectation of apparent age from a single image, Rasmus Rothe, Radu Timofte, Luc Van Gool 
  • 2nd: AgeNet: Deeply Learned Regressor and Classifier for Robust Apparent Age Estimation, Xin Liu, Shaoxin Li, Meina Kan, Jie Zhang, Shuzhe Wu, Wenxian Liu, Hu Han, Shiguang Shan, Xilin Chen 
  • 3rd: A Study on Apparent Age Estimation, Yu Zhu, Yan Li, Guowang Mu, Guodong Guo 
Session chair: Pablo Pardo 

11:15h Session III: Winners Cultural Event Recognition 
  • 1st: Exploiting Feature Hierarchies with Convolutional Neural Networks for Cultural Event Recognition, Mengyi Liu, Xin Liu, Yan Li, Xilin Chen, Alexander Hauptmann, Shiguang Shan 
  • 2nd: Deep Spatial Pyramid Ensemble for Cultural Event Recognition, Xiu-Shen Wei, Bin-Bin Gao, Jianxin Wu 
  • 3rd: Better Exploiting OS-CNNs for Better Event Recognition in Images, Limin Wang, Zhe Wang, Sheng Guo, Qiao Yu 
  • 4th: DLDR: Deep Linear Discriminative Retrieval for cultural event classification from a single image, Rasmus Rothe, Radu Timofte, Luc Van Gool 
Session chair: Junior Fabian 


12:15h: Invited Speaker II: Jianxin Wu, "Deep ConvNet Meets Ordered Labels: Deep Label Distribution Learning", Nanjing University
Session Chair: Jordi Gonzàlez 

12:45h Lunch break 

13:45h: Session IV: Looking at Poses 
  • Moving Poselets: A Discriminative and Interpretable Skeletal Motion Representation for Action Recognition, Lingling Tao, René Vidal 
  • Skeleton-free body pose estimation from depth images for movement analysis, Ben Crabbe, Adeline Paiement, Sion Hannuna, Majid Mirmehdi 
  • Motion Recognition Employing Multiple Kernel Learning of Fisher Vectors using Local Skeleton Features, Yusuke Goutsu, Wataru Takano, Yoshihiko Nakamura
  • Person Attribute Recognition with a Jointly-trained Holistic CNN Model, Patrick Sudowe, Hannah Spitzer, Bastian Leibe 
Session Chair: Marc Oliu 


14:45h: Invited Speaker III: Shalini Gupta, "Multi-modal Dynamic Hand Gesture Recognition with CNNs", NVIDIA Research
Session Chair: Hugo Jair Escalante 

15:00h Coffee Break 

15:30h: Invited Speaker IV: Caroline Pantofaru, Google Research  
Session Chair: Hugo Jair Escalante 

16:00h: Session V: Age Estimation
  • 4th: Deeply Learned Rich Coding for Cross-Dataset Facial Age Estimation, Wei Zhang, Zhanghui Kuang, Chen Huang
  • 5th: Deep Label Distribution Learning For Apparent Age Estimation, Xu Yang, Bin-Bin Gao, Chao Xing, Zeng-Wei Huo, Xiu-Shen Wei, Ying Zhou, Jianxin Wu, Xin Geng 
  • 6th: Unconstrained Age Estimation with Deep Convolutional Neural Networks, Rajeev Ranjan, Sabrina Zhou, JunCheng Chen, Amit Kumar, Azadeh Alavi, Vishal Patel, Rama Chellappa 
Session Chair: Hugo Jair Escalante 

16:45h: Invited Speaker V: Larry Davis, University of Maryland 
Session Chair: Jordi Gonzàlez 

17:15h: Session VI: Looking at Faces 
  • An End-to-End System for Unconstrained Face Verification with Deep Convolutional Neural Networks, JunCheng Chen, Rajeev Ranjan, Amit Kumar, ChingHui Chen, Vishal Patel, Rama Chellappa 
  • Coordinated Local Metric Learning, Shreyas Saxena, Jakob Verbeek 
  • Facial Landmark Localization in Depth Images using Supervised Ridge Descent, Necati Cihan Camgoz, Vitomir Struc, Ahmet Alp Kindiroglu, Berk Gokberk, Lale Akarun 
  • When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition, Guosheng Hu, Yongxin Yang, Dong Yi, Josef Kittler, William Christmas, Stan Li, Timothy Hospedales 
Session Chair: Jordi Gonzàlez 

18:15h: The COST Action iV&L Net IC1307: Challenges in Computer Vision and Natural Language, Sergio Escalera (UB-CVC) 
Session Chair: Xavier Baró 

18:30h: Break

Session Chairs: Xavier Baró, Hugo Jair Escalante 

22:00h: End of Hackathon

Confirmed Invited Speakers:

Rama Chellappa, University of Maryland

Prof. Rama Chellappa is a Minta Martin Professor of Engineering and Chair of the ECE department at the University of Maryland. Prof. Chellappa received the K.S. Fu Prize from the International Association of Pattern Recognition (IAPR). He is a recipient of the Society, Technical Achievement and Meritorious Service Awards from the IEEE Signal Processing Society. He also received the Technical Achievement and Meritorious Service Awards from the IEEE Computer Society. At UMD, he received college and university level recognitions for research, teaching innovation and mentoring of undergraduate students. In 2010, he was recognized as an Outstanding ECE by Purdue University. Prof. Chellappa served as the Editor-in-Chief of PAMI. He is a Golden Core Member of the IEEE Computer Society, served as a Distinguished Lecturer of the IEEE Signal Processing Society and as the President of the IEEE Biometrics Council. He is a Fellow of IEEE, IAPR, OSA, AAAS, ACM and AAAI and holds four patents.

Jianxin Wu, Nanjing University

Jianxin Wu received his BS and MS degrees in computer science from Nanjing University, and his PhD degree in computer science from the Georgia Institute of Technology. He is currently a professor in the Department of Computer Science and Technology at Nanjing University, China, and is associated with the National Key Laboratory for Novel Software Technology, China. He was previously an assistant professor at Nanyang Technological University, Singapore. His research interests are computer vision and machine learning. He has served as an area chair for ICCV and ACCV.

Abstract: Convolutional Neural Networks (ConvNets) have successfully improved the recognition performance for various facial characteristics such as age, head pose, gender and identity. A large labeled training set is one of the most important factors for their success. However, it is difficult to collect sufficient and complete training images in some domains such as age and head pose estimation, where the image labels are usually ordered numbers. Fortunately, there is a correlation among neighboring labels, which makes these tasks different from traditional classification. Based on this fact, we convert the label of each image into a discrete label distribution. In this paper, we propose a novel deep label distribution learning (DLDL) framework by minimizing a Kullback-Leibler divergence between the predicted and ground-truth label distributions. DLDL effectively utilizes the correlation information among neighboring labels in both feature learning and classifier learning. Experimental results show that the proposed approach produces significantly better results than state-of-the-art methods for age estimation and head pose estimation, and is particularly suitable when the training set is small.
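
The two steps named in the abstract, converting an ordered label into a discrete label distribution and scoring a prediction by its Kullback-Leibler divergence from that distribution, can be illustrated with a small sketch. This is not the authors' implementation. The Gaussian shape of the target distribution, the `sigma` value, the 0 to 100 age range, and all function names are assumptions chosen only for illustration, and no network or training loop is included.

```python
import math


def label_distribution(true_label, labels, sigma=2.0):
    """Spread a single ordered label over its neighbours.

    Assumption: a discretised Gaussian centred on the true label; the
    abstract does not state which distribution shape is used.
    """
    weights = [math.exp(-((l - true_label) ** 2) / (2 * sigma ** 2)) for l in labels]
    total = sum(weights)
    return [w / total for w in weights]


def softmax(scores):
    """Turn raw scores (e.g. a network's last layer) into a distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), the quantity the abstract says is minimised."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


ages = list(range(0, 101))              # hypothetical label set
target = label_distribution(30, ages)   # image labelled with age 30

# Two hypothetical predictions: uninformative, and peaked one year off.
uniform_pred = softmax([0.0] * len(ages))
peaked_pred = softmax([-((a - 31) ** 2) / 8.0 for a in ages])

print(kl_divergence(target, uniform_pred))
print(kl_divergence(target, peaked_pred))
```

In this sketch a prediction that peaks one year away from the labelled age is penalised far less than an uninformative one, which is the kind of neighbouring-label correlation the abstract refers to; a one-hot classification target would treat both errors as equally wrong.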

Shalini Gupta, NVIDIA Research

Shalini Gupta has been a senior research scientist at NVIDIA since 2013, where she focuses on designing novel automotive interfaces using computer vision technology. Prior to NVIDIA, she worked at Texas Instruments and AT&T Laboratories. She obtained her master's and doctoral degrees in Electrical and Computer Engineering from the University of Texas at Austin in 2004 and 2008, respectively. Her primary research interests are in image, video and 3D scene understanding, machine learning and image processing.

Abstract: Touchless hand gesture recognition systems are becoming important in automotive user interfaces as they improve safety and comfort. Various computer vision algorithms have employed color and depth cameras for hand gesture recognition, but robust classification of gestures from different subjects performed under widely varying lighting conditions is still challenging. We propose an algorithm for dynamic hand gesture recognition that utilizes multiple sensors (depth, intensity and radar) with a 3D convolutional neural network (CNN)-based classifier. We interleave the multiple channels to build normalized spatio-temporal gesture volumes and train gesture classifiers on CUDA-capable NVIDIA GPUs using Theano. To reduce potential overfitting with limited training data and to improve the generalization capability of the classifier, we propose an effective online and offline spatio-temporal data augmentation method. Our algorithm achieves the best performance on two challenging multi-modal dynamic hand gesture datasets compared to existing approaches.

Caroline Pantofaru, Google Research

Her research focuses on computer vision and machine perception. She is interested in understanding the world from visual information, including understanding events, scenes and activities, detecting and tracking people, as well as object detection and segmentation. She is also interested in exploring how systems can use and improve their understanding by interacting with the world and by being user-focused. This has driven her work in robotics and human-robot interaction. She is currently a Senior Research Scientist at Google, Inc. Previously she was a Research Scientist at Willow Garage, Inc., a personal robotics research lab. She completed her doctoral work at The Robotics Institute, Carnegie Mellon University, and she also spent time at INRIA Rhône-Alpes.

Larry Davis, University of Maryland Institute for Advanced Computer Studies

Larry S. Davis is a professor of computer science and director of the Center for Automation Research (CfAR). His research focuses on object/action recognition/scene analysis, event modeling and recognition, image and video databases, tracking, human movement modeling, 3-D human motion capture, and camera networks. Davis is also affiliated with the Computer Vision Laboratory in CfAR. He served as chair of the Department of Computer Science from 1999 to 2012. He received his doctorate from the University of Maryland in 1976. He was named an IAPR Fellow, an IEEE Fellow, and an ACM Fellow.
