ICPR 2016 contest website

Thank you for your participation; see the workshop page for photos and presentations.

Joint Contest on Multimedia Challenges Beyond Visual Analysis @ICPR2016


Workshop and Challenge on Multimedia Challenges

Cancún, México

December, 2016 (exact day TBA)

Research in computer vision and pattern recognition has led to tremendous advances across many problems and applications. As a result, several problems in visual analysis can be considered solved (e.g., face recognition), at least in certain scenarios and under specific circumstances. Despite these important advances, many open problems remain that receive much attention from the community because of their potential applications. We are organizing a contest around four such problems, which require not only effective visual analysis but also the handling of multimodal information (e.g., audio, RGB-D video, etc.). In addition, we focus on problems in which the aim is to recognize patterns that are not visually evident (e.g., personality traits). The contest is supported by three organizations with vast experience and prestige in organizing academic contests: ChaLearn, MediaEval, and ImageCLEF. The contest is also supported by the IAPR TC 12 on visual and multimedia information systems.

Workshop on Multimedia Challenges Beyond Visual Analysis:

Important dates (Workshop)

  • August 10th, 2016: Workshop paper submission deadline (for non-participants)
  • August 31st, 2016: Workshop paper submission deadline (for contest participants). Extended deadline!
  • September 10th, 2016: Notification of paper acceptance.
  • September 14th, 2016: Camera ready of workshop papers.
  • December, 2016: ICPR 2016 Joint Contest and Workshop on Multimedia Challenges Beyond Visual Analysis, challenge results, award ceremony.

Please prepare your paper according to the ICPR guidelines & template:


and submit your paper through the CMT system at:


Please refer to the workshop section for more details.

The contest comprises four tracks:

Track 1: First impressions challenge

We are organizing a challenge track on “first impressions”, in which participants will develop solutions for recognizing the personality traits of users in short video sequences. We will make available a large, newly collected data set, sponsored by Microsoft, of at least 10,000 15-second videos collected from YouTube and annotated with personality traits by AMT workers. The traits correspond to the “big five” personality traits used in psychology and well known to hiring managers using standardized personality profiling: extroversion, agreeableness, conscientiousness, neuroticism, and openness to experience. First impressions are highly important in many contexts, such as job interviews and human resourcing. This work could become very relevant to training young people to present themselves better by changing their behavior in simple ways. The participants who obtain the best results in the challenge will be invited to submit a paper to the workshop.



June 30: The Joint Contest on Multimedia Challenges Beyond Visual Analysis @ICPR16 has started, featuring four competitions.


Table 1. Sample videos that will be used for the first impression challenge.

Workshop/challenge participants will be invited to submit revised and extended versions of their papers to be considered for publication in a

Special Issue on Personality Analysis in the IEEE Transactions on Affective Computing

Please refer to the SI site for more details.

Track 2: Isolated gesture recognition track

We are organizing a track on isolated gesture recognition from RGB-D data, where the goal is to develop methods for recognizing the category of human gestures from segmented RGB-D video. A new data set, the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD), is considered for this track (sample images taken from this data set are shown in Figure 1). The database includes 47,933 RGB-D gesture videos (about 9 GB). Each RGB-D video depicts a single gesture, and there are 249 gesture categories performed by 21 different individuals. Methods will be judged by their recognition performance.

Track 3: Continuous gesture recognition track

We are also organizing a track, complementary to track 2, on continuous gesture recognition from RGB-D data. The goal in this track is to develop methods that can perform simultaneous segmentation and recognition of gesture categories from continuous RGB-D video. The newly created ChaLearn LAP RGB-D Continuous Gesture Dataset (ConGD) is considered for this track (sample images taken from this data set are shown in Figure 1). The data set comprises a total of 47,933 gestures in 22,535 RGB-D continuous videos (about 4 GB).

Each RGB-D video depicts one or more gestures, and there are 249 gesture categories performed by 21 different individuals. Methods will be judged by the Jaccard index.
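As a rough illustration, the Jaccard index for a gesture can be computed over sets of frame indices: the overlap between the predicted and ground-truth frames divided by their union, typically averaged over all gestures in the evaluation set. A minimal sketch (the frame ranges below are hypothetical, not taken from the data set):

```python
def jaccard_index(pred_frames, true_frames):
    """Jaccard index between predicted and ground-truth frame sets:
    |intersection| / |union|, in [0, 1]."""
    pred, true = set(pred_frames), set(true_frames)
    union = pred | true
    return len(pred & true) / len(union) if union else 0.0

# Hypothetical example: the ground-truth gesture spans frames 10-29,
# while the method predicts frames 15-34.
score = jaccard_index(range(15, 35), range(10, 30))  # 15/25 = 0.6
```

A perfect segmentation yields 1.0; a prediction with no overlap yields 0.0.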

Figure 1: Sample images taken from the depth video for a set of the considered gestures.

Track 4: Context of experience track

The aim is to explore the suitability of video content for watching in certain situations. Specifically, we look at the situation of watching movies on an airplane (see Figure 2). As a viewing context, airplanes are characterized by small screens and distracting viewing conditions. We assume that movies have properties that make them more or less suitable for this context. We are interested in developing systems that are able to reproduce a general judgment of viewers about whether a given movie is a good movie to watch during a flight. We provide a data set including a list of movies and human judgments concerning their suitability for airplanes. The goal of the task is to use movie metadata and audio-visual features extracted from movie trailers in order to automatically reproduce these judgments. The provided data set comprises a total of 318 movies, together with metadata and audio (MFCC), textual (tf-idf), and visual (HOG, CM, LBP, GLRLM) features. The videos are full HD (1080p), with lengths of around 2-5 minutes. The considered categories are: latest, recent, the collection, family, world, Dutch, and European. As evaluation measures we will use a combination of precision, recall, and F1.
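For reference, precision, recall, and F1 for this kind of binary "suitable for airplane viewing" judgment can be sketched as follows (the movie identifiers are hypothetical placeholders, not items from the data set):

```python
def precision_recall_f1(predicted, relevant):
    """Precision, recall, and F1 for a set of movies predicted as
    suitable, against the set judged suitable by human annotators."""
    predicted, relevant = set(predicted), set(relevant)
    tp = len(predicted & relevant)  # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: the system flags 4 movies as suitable;
# 3 of them appear among the 5 movies the annotators marked suitable.
p, r, f = precision_recall_f1({"m1", "m2", "m3", "m4"},
                              {"m1", "m2", "m3", "m5", "m6"})
# p = 0.75, r = 0.6
```

F1 is the harmonic mean of precision and recall, so it rewards systems that balance the two rather than maximizing only one.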

Figure 2. A set of conditions, including small screen and confined, crowded space, characterize the context of watching a movie on an airplane.


As in previous challenges organized by ChaLearn, all four tracks of the contest will be based on the CodaLab platform. Information about registration and participation can be found on the CodaLab sites of the tracks:

In broad terms, the challenge will proceed as follows (track 1 will adhere to a slightly different procedure):

  1. Participants register to the challenge.
  2. Development (labeled) and validation (unlabeled) data sets are made available to registered participants.
  3. Participants develop their methods using development and validation data.
  4. Validation labels are released, participants can tune their methods and use development+validation data to train their final models. ** Please note that for track 1, validation labels will not be released. **
  5. Final evaluation (test) data are released, participants make predictions for test samples and submit them via CodaLab.
  6. Participants submit fact sheets describing their methods.
  7. Organizers start the verification process and notify the final results.
  8. Top ranked participants with verified codes are eligible for awards.
  9. Top ranked participants are encouraged to submit a paper to the associated ICPR workshop.
  10. Winning certificates are awarded during the ICPR workshop at Cancun, Mexico.


The top three ranked participants of each track of the contest will receive an official certificate award, as well as travel awards as allowed by our budget.


In addition, top-ranked participants will be invited to follow the workshop submission guide for inclusion of a description of their system in the ICPR 2016 workshop proceedings. Participants of the workshop or challenge (with papers of relevant thematic content) will be invited to submit revised and extended versions of their papers to be considered for publication in a special issue on personality analysis in the IEEE Transactions on Affective Computing.

Important dates (quantitative challenge):

  • 30th June, 2016: Beginning of the quantitative competition, release of development (with labels) and validation data (without labels).
  • 8th August, 2016: Release of encrypted final evaluation data (without labels) and validation labels. Participants can start training their methods with the whole data set.
  • 10th August, 2016: Paper submission deadline for non-participants.
  • 12th August, 2016: Release of final evaluation data decryption key. Participants start predicting the results on the final evaluation data.
  • 16th August, 2016: End of the quantitative competition. Deadline for submission of predictions on the final evaluation data. Deadline for code submission. The organizers start the code verification by running it on the final evaluation data.
  • 17th August, 2016: Deadline for submitting the fact sheets.
  • 20th August, 2016: Release of the verification results to the participants for review. Participants are invited to follow the paper submission guide for submitting contest papers.
  • 24th August, 2016: Paper submission deadline for participants.
  • 24th August, 2016: Notification of acceptance for non-participants (papers submitted before August 10th).
  • 2nd September, 2016: Notification of paper acceptance.
  • 5th September, 2016: Camera ready of contest papers.
  • December 2016: ICPR 2016 Joint Contest on Multimedia Challenges Beyond Visual Analysis, challenge results, award ceremony.

We are very grateful to our sponsors:


Thanks to our ICPR 2016 sponsors: Microsoft Research, ChaLearn, University of Barcelona, INAOE, and Université Paris-Saclay, with more TBA. This research has been partially supported by projects TIN2012-39051 and TIN2013-43478-P.