Full description
This dataset was collected to assess a crowd counting algorithm. It is a vision dataset captured on a QUT campus and contains three challenging viewpoints, referred to as Camera A, Camera B and Camera C. The sequences contain reflections, shadows and difficult lighting fluctuations, which make crowd counting difficult. Furthermore, Camera C is positioned at a particularly low camera angle, leading to stronger occlusion than is present in other datasets.
The QUT datasets are annotated at sparse intervals: every 100 frames for Cameras B and C, and every 200 frames for Camera A, as this is a longer sequence. Testing is then performed by comparing the crowd size estimate to the ground truth at these sparse intervals, rather than at every frame. This closely resembles the intended real-world application of this technology, where an operator may periodically ‘query’ the system for a crowd count.
Due to the difficulty of the environmental conditions in these scenes, the first 400-500 frames of each sequence are set aside for learning the background model.
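The sparse testing protocol described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the evaluation code used with the dataset: the function name `sparse_errors`, the dictionary layout of the ground truth, the zero-based frame indexing, the default of 500 background-learning frames and the choice of mean absolute and mean squared error are all assumptions made for the example.

```python
from typing import Dict, Sequence, Tuple

# Annotation spacing per camera, taken from the description above.
ANNOTATION_INTERVAL = {"A": 200, "B": 100, "C": 100}


def sparse_errors(
    estimates: Sequence[float],
    ground_truth: Dict[int, int],
    background_frames: int = 500,  # assumption: upper end of the 400-500 range
) -> Tuple[float, float]:
    """Return (MAE, MSE) computed only at annotated frames.

    estimates         -- one crowd size estimate per frame (hypothetical layout)
    ground_truth      -- maps an annotated frame index to its true count
    background_frames -- leading frames reserved for background learning,
                         which are excluded from testing
    """
    errors = []
    for frame, true_count in sorted(ground_truth.items()):
        if frame < background_frames or frame >= len(estimates):
            continue  # skip background-learning frames and missing estimates
        errors.append(estimates[frame] - true_count)
    if not errors:
        raise ValueError("no annotated frames available for testing")
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse


# Made-up numbers for a Camera B style sequence (annotated every 100 frames).
estimates = [10.0] * 1000
ground_truth = {400: 9, 500: 12, 600: 10, 700: 11}
mae, mse = sparse_errors(estimates, ground_truth)
```

Only frames 500, 600 and 700 contribute here, because frame 400 falls inside the assumed background-learning period.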
Subjects
Artificial intelligence and image processing
Computer vision
Crowd counting
Crowd monitoring
Density estimation
Engineering
Information and Computing Sciences
Image processing
Local features
Scene invariant
Signal processing
Identifiers
- DOI : 10.4225/09/5858BFB708148
- Local : 10378.3/8085/1018.15701