Workshop Program

The ChaLearn Looking at People Workshop 2015, to be held on December 12th in conjunction with ICCV 2015 in Santiago de Chile, Chile, will be devoted to the presentation of the most recent and challenging techniques for Looking at People, such as human body pose recovery, human action/interaction, soft-biometric analysis, facial expression recognition and cultural event recognition. The best workshop papers will be invited to the IJCV Special Issue, and the best one will be awarded an NVIDIA Titan X graphics card.

The NVIDIA ChaLearn LAP 2015 Best Paper Award has been awarded to "DEX: Deep EXpectation of apparent age from a single image", Rasmus Rothe, Radu Timofte, Luc Van Gool.

PROGRAM (TENTATIVE):

08:45h Opening: Presentation of the workshop, Sergio Escalera, UB-CVC

09:00h Invited Speaker I: Rama Chellappa, University of Maryland

Session chair: Xavier Baró

09:45h Session I: Challenge results presentation and award ceremony, Sergio Escalera, UB-CVC

ChaLearn Looking at People 2015: Apparent Age and Cultural Event Recognition datasets and results, Sergio Escalera, Junior Fabian, Pablo Pardo, Xavier Baró, Jordi Gonzàlez, Hugo Jair, Dusan Misevic, Ulrich Steiner, Isabelle Guyon

Session chair: Xavier Baró

10:00h Coffee Break

10:30h Session II: Winners Apparent Age Estimation

1st: DEX: Deep EXpectation of apparent age from a single image, Rasmus Rothe, Radu Timofte, Luc Van Gool
2nd: AgeNet: Deeply Learned Regressor and Classifier for Robust Apparent Age Estimation, Xin Liu, Shaoxin Li, Meina Kan, Jie Zhang, Shuzhe Wu, Wenxian Liu, Hu Han, Shiguang Shan, Xilin Chen
3rd: A Study on Apparent Age Estimation, Yu Zhu, Yan Li, Guowang Mu, Guodong Guo

Session chair: Pablo Pardo

11:15h Session III: Winners Cultural Event Recognition

1st: Exploiting Feature Hierarchies with Convolutional Neural Networks for Cultural Event Recognition, Mengyi Liu, Xin Liu, Yan Li, Xilin Chen, Alexander Hauptmann, Shiguang Shan
2nd: Deep Spatial Pyramid Ensemble for Cultural Event Recognition, Xiu-Shen Wei, Bin-Bin Gao, Jianxin Wu
3rd: Better Exploiting OS-CNNs for Better Event Recognition in Images, Limin Wang, Zhe Wang, Sheng Guo, Qiao Yu
4th: DLDR: Deep Linear Discriminative Retrieval for cultural event classification from a single image, Rasmus Rothe, Radu Timofte, Luc Van Gool

Session chair: Junior Fabian

12:15h: Invited Speaker II: Jianxin Wu, "Deep ConvNet Meets Ordered Labels: Deep Label Distribution Learning", Nanjing University

Session Chair: Jordi Gonzàlez

12:45h Lunch break

13:45h: Session IV: Looking at Poses

Moving Poselets: A Discriminative and Interpretable Skeletal Motion Representation for Action Recognition, Lingling Tao, René Vidal
Skeleton-free body pose estimation from depth images for movement analysis, Ben Crabbe, Adeline Paiement, Sion Hannuna, Majid Mirmehdi
Motion Recognition Employing Multiple Kernel Learning of Fisher Vectors using Local Skeleton Features, Yusuke Goutsu, Wataru Takano, Yoshihiko Nakamura
Person Attribute Recognition with a Jointly-trained Holistic CNN Model, Patrick Sudowe, Hannah Spitzer, Bastian Leibe

Session Chair: Marc Oliu

14:45h: Invited Speaker III: Shalini Gupta, "Multi-modal Dynamic Hand Gesture Recognition with CNNs", NVIDIA Research

Session Chair: Hugo Jair Escalante

15:00h Coffee Break

15:30h: Invited Speaker IV: Caroline Pantofaru, Google Research

Session Chair: Hugo Jair Escalante

16:00h: Session V: Age Estimation

4th: Deeply Learned Rich Coding for Cross-Dataset Facial Age Estimation, Wei Zhang, Zhanghui Kuang, Chen Huang
5th: Deep Label Distribution Learning For Apparent Age Estimation, Xu Yang, Bin-Bin Gao, Chao Xing, Zeng-Wei Huo, Xiu-Shen Wei, Ying Zhou, Jianxin Wu, Xin Geng
6th: Unconstrained Age Estimation with Deep Convolutional Neural Networks, Rajeev Ranjan, Sabrina Zhou, JunCheng Chen, Amit Kumar, Azadeh Alavi, Vishal Patel, Rama Chellappa

Session Chair: Hugo Jair Escalante

16:45h: Invited Speaker V: Larry Davis, University of Maryland

Session Chair: Jordi Gonzàlez

17:15h: Session VI: Looking at Faces

An End-to-End System for Unconstrained Face Verification with Deep Convolutional Neural Networks, JunCheng Chen, Rajeev Ranjan, Amit Kumar, ChingHui Chen, Vishal Patel, Rama Chellappa
Coordinated Local Metric Learning, Shreyas Saxena, Jakob Verbeek
Facial Landmark Localization in Depth Images using Supervised Ridge Descent, Necati Cihan Camgoz, Vitomir Struc, Ahmet Alp Kindiroglu, Berk Gokberk, Lale Akarun
When Face Recognition Meets with Deep Learning: an Evaluation of Convolutional Neural Networks for Face Recognition, Guosheng Hu, Yongxin Yang, Dong Yi, Josef Kittler, William Christmas, Stan Li, Timothy Hospedales

Session Chair: Jordi Gonzàlez

18:15h: The COST Action iV&L Net IC1307: Challenges in Computer Vision and Natural Language, Sergio Escalera (UB-CVC)

Session Chair: Xavier Baró

18:30h: Break

19:00h: NVIDIA GPU Hackathon and Demo Session on Deep Learning for Cultural Event and Apparent Age Recognition

Session Chairs: Xavier Baró, Hugo Jair Escalante

22:00h: End of Hackathon

Confirmed Invited Speakers:

Rama Chellappa, University of Maryland

Prof. Rama Chellappa is a Minta Martin Professor of Engineering and Chair of the ECE department at the University of Maryland. Prof. Chellappa received the K.S. Fu Prize from the International Association for Pattern Recognition (IAPR). He is a recipient of the Society, Technical Achievement and Meritorious Service Awards from the IEEE Signal Processing Society. He also received the Technical Achievement and Meritorious Service Awards from the IEEE Computer Society. At UMD, he received college and university level recognitions for research, teaching innovation and mentoring of undergraduate students. In 2010, he was recognized as an Outstanding ECE by Purdue University. Prof. Chellappa served as the Editor-in-Chief of PAMI. He is a Golden Core Member of the IEEE Computer Society, and served as a Distinguished Lecturer of the IEEE Signal Processing Society and as the President of the IEEE Biometrics Council. He is a Fellow of IEEE, IAPR, OSA, AAAS, ACM and AAAI and holds four patents.

Jianxin Wu, Nanjing University

Jianxin Wu received his BS and MS degrees in computer science from Nanjing University, and his PhD degree in computer science from the Georgia Institute of Technology. He is currently a professor in the Department of Computer Science and Technology at Nanjing University, China, and is associated with the National Key Laboratory for Novel Software Technology, China. He was previously an assistant professor at Nanyang Technological University, Singapore. His research interests are computer vision and machine learning. He has served as an area chair for ICCV and ACCV.

Abstract: Convolutional Neural Networks (ConvNets) have successfully improved the recognition performance for various facial characteristics such as age, head pose, gender and identity. A large labeled training set is one of the most important factors for their success. However, it is difficult to collect sufficient and complete training images in some domains such as age and head pose estimation, where the image labels are usually ordered numbers. Fortunately, there is a correlation among neighboring labels, which makes these tasks different from traditional classification. Based on this fact, we convert the label of each image into a discrete label distribution. In this paper, we propose a novel deep label distribution learning (DLDL) framework by minimizing a Kullback-Leibler divergence between the predicted and ground-truth label distributions. DLDL effectively utilizes the correlation information among neighboring labels in both feature learning and classifier learning. Experimental results show that the proposed approach produces significantly better results than state-of-the-art methods for age estimation and head pose estimation, and is particularly suitable when the training set is small.
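
Illustration (not from the talk): the idea in the abstract above, turning an ordered label into a discrete label distribution and scoring a prediction with the Kullback-Leibler divergence, might be sketched roughly as follows. The Gaussian shape, the sigma value, the 0-100 age range and all function names are assumptions made for this sketch; the abstract does not specify them, and no network is trained here.

```python
import math

def label_distribution(true_label, labels, sigma=2.0):
    """Assumed form: a discretized Gaussian centred on the true label."""
    weights = [math.exp(-((l - true_label) ** 2) / (2 * sigma ** 2)) for l in labels]
    total = sum(weights)
    return [w / total for w in weights]

def softmax(scores):
    """Turn raw scores (e.g. a network's outputs) into a distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), the quantity a DLDL-style loss would minimize."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0)

def expected_label(dist, labels):
    """Point estimate taken as the mean of the distribution."""
    return sum(p * l for p, l in zip(dist, labels))

ages = list(range(0, 101))              # assumed label range
target = label_distribution(30, ages)   # ground-truth age 30

# Hypothetical predictions: scores peaked near and far from the true age.
near = softmax([-abs(a - 32) for a in ages])
far = softmax([-abs(a - 60) for a in ages])

print(kl_divergence(target, near) < kl_divergence(target, far))
```

In this sketch a prediction that peaks two years away from the true age receives a smaller divergence than one that peaks thirty years away, which is the sense in which neighboring labels are treated as correlated rather than as unrelated classes.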

Shalini Gupta, NVIDIA Research

Shalini Gupta has been a senior research scientist at NVIDIA since 2013, where she focusses on designing novel automotive interfaces using computer vision technology. Prior to NVIDIA, she worked at Texas Instruments and AT&T Laboratories. She obtained her masters and doctoral degrees in Electrical and Computer Engineering from the University of Texas at Austin in 2004 and 2008, respectively. Her primary research interests are in image, video and 3D scene understanding, machine learning and image processing.

Abstract: Touchless hand gesture recognition systems are becoming important in automotive user interfaces as they improve safety and comfort. Various computer vision algorithms have employed color and depth cameras for hand gesture recognition, but robust classification of gestures from different subjects performed under widely varying lighting conditions is still challenging. We propose an algorithm for dynamic hand gesture recognition that utilizes multiple sensors (depth, intensity and radar) with a 3D convolutional neural network (CNN)-based classifier. We interleave the multiple channels to build normalized spatio-temporal gesture volumes and train gesture classifiers on CUDA-capable NVIDIA GPUs using Theano. To reduce potential overfitting with limited training data and to improve the generalization capability of the classifier, we propose an effective online and offline spatio-temporal data augmentation method. Our algorithm achieves the best performance on two challenging multi-modal dynamic hand gesture datasets compared to existing approaches.

Caroline Pantofaru, Google Research

Caroline Pantofaru's research focuses on computer vision and machine perception. She is interested in understanding the world from visual information, including understanding events, scenes and activities, detecting and tracking people, as well as object detection and segmentation. She is also interested in exploring how systems can use and improve their understanding by interacting with the world and by being user-focused. This has driven her work in robotics and human-robot interaction. She is currently a Senior Research Scientist at Google, Inc. Previously she was a Research Scientist at Willow Garage, Inc., a personal robotics research lab. She completed her doctoral work at The Robotics Institute, Carnegie Mellon University, and she also spent time at INRIA Rhône-Alpes.

Larry Davis, University of Maryland Institute for Advanced Computer Studies

Larry S. Davis is a professor of computer science and director of the Center for Automation Research (CfAR). His research focuses on object/action recognition and scene analysis, event modeling and recognition, image and video databases, tracking, human movement modeling, 3-D human motion capture, and camera networks. Davis is also affiliated with the Computer Vision Laboratory in CfAR. He served as chair of the Department of Computer Science from 1999 to 2012. He received his doctorate from the University of Maryland in 1976. He was named an IAPR Fellow, an IEEE Fellow, and an ACM Fellow.

Other relevant information:

You can use the following links to find more information on the workshop.

Challenges associated with the workshop
View submissions and schedule for important dates and submission instructions
IJCV special issue
Organizers and program committee
