– We went back to the original data and selected batches including a large background area in which no gesture was taking place.
– We visually inspected the training videos to identify a cropping area including every important gesture part. Once selected, the cropping size was fixed for the given batch. The aspect ratio was always 4:3 (width:height), similar to the challenge data.
– For every test video in the batch, using the same cropping size, a different horizontal translation was applied. This was done by visual inspection to make sure no important gesture part was cut off. No vertical translation was applied.
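The cropping protocol above can be sketched as follows. This is a minimal illustration, not the code actually used; the function name, the pixel sizes, and the use of NumPy arrays for frames are all assumptions made for the example.

```python
import numpy as np

def crop_with_translation(frame, crop_h, x_offset, y_top=0):
    """Crop a fixed-size 4:3 (width:height) box from a frame.

    crop_h   : crop height in pixels; width is 4/3 of the height,
               matching the fixed aspect ratio of the challenge data.
    x_offset : horizontal translation of the crop box (per video).
    y_top    : fixed vertical position (no vertical translation used).
    """
    crop_w = (4 * crop_h) // 3            # enforce the 4:3 aspect ratio
    h, w = frame.shape[:2]
    assert y_top + crop_h <= h and x_offset + crop_w <= w, "box out of bounds"
    return frame[y_top:y_top + crop_h, x_offset:x_offset + crop_w]

# Example on a dummy 240x320 grayscale frame (sizes are illustrative)
frame = np.zeros((240, 320), dtype=np.uint8)
crop = crop_with_translation(frame, crop_h=180, x_offset=40)
print(crop.shape)  # (180, 240)
```

The crop size is chosen once per batch; only `x_offset` varies from video to video.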
We selected 20 batches for these experiments, not coinciding with the sets of batches used for validation and final testing, because most of those batches included dynamic gestures covering the entire image area, therefore leaving no room for translations. The batches used are harder on average than those used for the challenge final evaluation, in particular because they include more static posture recognition. We ran experiments with the un-translated batches (utran) and with the translated batches (tran).
Additional experiments were also performed on data scaled in various ways following a similar protocol, using the same utran batches. Each video has its own scaling number; therefore, there are 47 numbers for each batch. Rows represent batches and columns represent videos (K1_K47). The scaling shown here is (size of cropping box / size of original frame). The aspect ratio of the cropping box is always preserved to be the same as the original.
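The per-video scaling number can be computed as sketched below. This is an assumed reading of "(size of cropping box / size of original frame)": since the aspect ratio is preserved, the width and height ratios coincide, so a single linear ratio is reported. The function name and the example sizes are illustrative, not from the original experiments.

```python
def scaling_factor(crop_size, frame_size):
    """Linear scaling ratio (cropping box / original frame).

    crop_size, frame_size: (width, height) tuples in pixels.
    Because the aspect ratio of the cropping box matches the frame,
    the width ratio equals the height ratio; either one is the factor.
    """
    cw, ch = crop_size
    fw, fh = frame_size
    rw, rh = cw / fw, ch / fh
    assert abs(rw - rh) < 1e-6, "aspect ratio must be preserved"
    return rw

# One row of the table: 47 per-video factors for one batch
# (dummy identical crop here; in the real table each video differs)
batch_row = [scaling_factor((240, 180), (320, 240)) for _ in range(47)]
print(len(batch_row), batch_row[0])  # 47 0.75
```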
The data are available for download:
The results of the experiments are shown in the table below. For comparison, we also give results on the validation and final evaluation data. The table also includes the approximate run time in seconds per batch.
The experiments were conducted by Pat Jangyodsuk.
Additional experiments on occluded data were performed by Jun Wan.