The ChaLearn Looking at People Workshop 2015, to be held on December 12th in conjunction with ICCV 2015 in Santiago de Chile, Chile, will be devoted to the presentation of the most recent and challenging techniques for Looking at People, such as human body pose recovery, human action/interaction, soft-biometric analysis, facial expression recognition and cultural event recognition. The best workshop papers will be invited to the IJCV Special Issue, with the best one being awarded an NVIDIA Titan X graphics card. The NVIDIA ChaLearn LAP 2015 Best Paper Award has been awarded to "DEX: Deep EXpectation of apparent age from a single image" by Rasmus Rothe, Radu Timofte and Luc Van Gool.
PROGRAM (TENTATIVE):
09:00h Invited Speaker I: Rama Chellappa, University of Maryland
Session chair: Xavier Baró
09:45h Session I: Challenge results presentation and award ceremony, Sergio Escalera, UB-CVC
Session chair: Xavier Baró
10:00h Coffee Break
10:30h Session II: Winners Apparent Age Estimation
Session chair: Pablo Pardo
11:15h Session III: Winners Cultural Event Recognition
12:15h: Invited Speaker II: Jianxin Wu, "Deep ConvNet Meets Ordered Labels: Deep Label Distribution Learning", Nanjing University
Session Chair: Jordi Gonzàlez
12:45h Lunch break
13:45h: Session IV: Looking at Poses
Session Chair: Marc Oliu
14:45h: Invited Speaker III: Shalini Gupta, "Multi-modal Dynamic Hand Gesture Recognition with CNNs", NVIDIA Research
Session Chair: Hugo Jair Escalante
15:00h Coffee Break
16:00h: Session V: Age Estimation
Session Chair: Hugo Jair Escalante
16:45h: Invited Speaker V: Larry Davis, University of Maryland
Session Chair: Jordi Gonzàlez
17:15h: Session VI: Looking at Faces
Session Chair: Jordi Gonzàlez
18:15h: The COST Action iV&L Net IC1307: Challenges in Computer Vision and Natural Language, Sergio Escalera (UB-CVC)
Session Chair: Xavier Baró
18:30h: Break
Session Chair: Xavier Baró, Hugo Jair Escalante
22:00h: End Hackathon
Confirmed Invited Speakers:
Rama Chellappa, University of Maryland
Prof. Rama Chellappa is a Minta Martin Professor of Engineering and Chair of the ECE department at the University of Maryland. Prof. Chellappa received the K.S. Fu Prize from the International Association for Pattern Recognition (IAPR). He is a recipient of the Society, Technical Achievement and Meritorious Service Awards from the IEEE Signal Processing Society. He also received the Technical Achievement and Meritorious Service Awards from the IEEE Computer Society. At UMD, he received college and university level recognitions for research, teaching innovation and mentoring of undergraduate students. In 2010, he was recognized as an Outstanding ECE by Purdue University. Prof. Chellappa served as the Editor-in-Chief of PAMI. He is a Golden Core Member of the IEEE Computer Society, and served as a Distinguished Lecturer of the IEEE Signal Processing Society and as the President of the IEEE Biometrics Council. He is a Fellow of IEEE, IAPR, OSA, AAAS, ACM and AAAI and holds four patents.
Jianxin Wu, Nanjing University
Jianxin Wu received his BS and MS degrees in computer science from Nanjing University, and his PhD degree in computer science from the Georgia Institute of Technology. He is currently a professor in the Department of Computer Science and Technology at Nanjing University, China, and is associated with the National Key Laboratory for Novel Software Technology, China. He was previously an assistant professor at Nanyang Technological University, Singapore. His research interests are computer vision and machine learning. He has served as an area chair for ICCV and ACCV.
Abstract: Convolutional Neural Networks (ConvNets) have successfully improved the recognition performance for various facial characteristics such as age, head pose, gender and identity. A large labeled training set is one of the most important factors for their success. However, it is difficult to collect sufficient and complete training images in some domains such as age and head pose estimation, where the image labels are usually ordered numbers. Fortunately, there is a correlation among neighboring labels, which makes these tasks different from traditional classification. Based on this fact, we convert the label of each image into a discrete label distribution. In this paper, we propose a novel deep label distribution learning (DLDL) framework that minimizes the Kullback-Leibler divergence between the predicted and ground-truth label distributions. DLDL effectively utilizes the correlation information among neighboring labels in both feature learning and classifier learning. Experimental results show that the proposed approach produces significantly better results than state-of-the-art methods for age estimation and head pose estimation, and is particularly suitable when the training set is small.
Shalini Gupta, NVIDIA Research
Shalini Gupta has been a senior research scientist at NVIDIA since 2013, where she focuses on designing novel automotive interfaces using computer vision technology. Prior to NVIDIA, she worked at Texas Instruments and AT&T Laboratories. She obtained her master's and doctoral degrees in Electrical and Computer Engineering from the University of Texas at Austin in 2004 and 2008, respectively. Her primary research interests are in image, video and 3D scene understanding, machine learning and image processing.
Abstract: Touchless hand gesture recognition systems are becoming important in automotive user interfaces as they improve safety and comfort. Various computer vision algorithms have employed color and depth cameras for hand gesture recognition, but robust classification of gestures from different subjects performed under widely varying lighting conditions is still challenging. We propose an algorithm for dynamic hand gesture recognition that utilizes multiple sensors (depth, intensity and radar) with a 3D convolutional neural network (CNN)-based classifier. We interleave the multiple channels to build normalized spatio-temporal gesture volumes and train gesture classifiers on CUDA-capable NVIDIA GPUs using Theano. To reduce potential overfitting with limited training data and to improve the generalization capability of the classifier, we propose an effective online and offline spatio-temporal data augmentation method. Our algorithm achieves the best performance on two challenging multi-modal dynamic hand gesture datasets compared to existing approaches.
Larry Davis, University of Maryland Institute for Advanced Computer Studies
Larry S. Davis is a professor of computer science and director of the Center for Automation Research (CfAR). His research focuses on object and action recognition, scene analysis, event modeling and recognition, image and video databases, tracking, human movement modeling, 3-D human motion capture, and camera networks. Davis is also affiliated with the Computer Vision Laboratory in CfAR. He served as chair of the Department of Computer Science from 1999 to 2012. He received his doctorate from the University of Maryland in 1976. He was named an IAPR Fellow, an IEEE Fellow, and an ACM Fellow.
Other relevant information:
You can use the following links to find more information on the workshop.