ViBe is a background subtraction algorithm that was presented at the IEEE ICASSP 2009 conference and refined in later publications. More precisely, it is a software module for extracting background information from moving images. It was developed by Olivier Barnich and Marc Van Droogenbroeck of the Montefiore Institute, University of Liège, Belgium.
ViBe is patented; the patent covers aspects of the method such as its stochastic replacement policy, spatial diffusion mechanism, and non-chronological model handling.
Pixel model and classification process
Many advanced techniques provide an estimate of the temporal probability density function (pdf) of a pixel x. ViBe's approach is different: it imposes that the influence of a value in the polychromatic space be limited to its local neighbourhood. In practice, ViBe does not estimate the pdf at all, but uses a set of previously observed sample values as the pixel model. To classify a value pt(x), it is compared to the stored samples: the pixel is classified as background when enough of its samples lie within a fixed distance of pt(x).
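The classification step can be sketched as follows. This is a minimal single-channel illustration, not the authors' implementation; the function name and the parameter values below (sample count, radius, minimum matches) are illustrative choices, not values prescribed by the text.

```python
import numpy as np

# Illustrative parameters (assumptions, not taken from the text above).
RADIUS = 20      # distance threshold in the value space
MIN_MATCHES = 2  # samples that must be close for a "background" decision

def classify_pixel(value, samples, radius=RADIUS, min_matches=MIN_MATCHES):
    """Return True (background) when at least `min_matches` of the
    pixel's stored samples lie within `radius` of the observed value."""
    distances = np.abs(np.asarray(samples, dtype=float) - float(value))
    return int(np.count_nonzero(distances < radius)) >= min_matches

# Example: a stable grey background around intensity 100.
model = [98, 101, 103, 97, 100]
print(classify_pixel(100, model))  # True: close to several samples
print(classify_pixel(200, model))  # False: far from every sample
```

Note that no pdf is ever built: the decision relies only on counting raw samples near the observed value.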
Model update: Sample values lifespan policy
ViBe ensures a smooth, exponentially decaying lifespan for the sample values that constitute the pixel models. This allows ViBe to deal successfully with concomitant events using a single model of reasonable size for each pixel. It is achieved by choosing at random which sample to replace when updating a pixel model; once the sample to be discarded has been chosen, the new value takes its place. Note that, because the value to be discarded is chosen at random, the pixel model that results from updating a given model with a given sample cannot be predicted in advance.
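A sketch of this replacement policy, under the assumption of a model of N samples per pixel (the helper name is ours): since each update discards a uniformly chosen sample, a given sample survives one update with probability 1 - 1/N, which is what yields the exponentially decaying lifespan.

```python
import random

def update_model(samples, new_value, rng=random):
    """Replace one sample, chosen uniformly at random, with the new
    value. Every stored sample has the same probability 1/N of being
    discarded, regardless of when it was inserted."""
    idx = rng.randrange(len(samples))
    samples[idx] = new_value

# Probability that a given sample survives k consecutive updates of a
# model of size N: a smooth exponential decay in k.
N, k = 20, 50
survival = (1 - 1 / N) ** k
```

The decay depends only on the update count, not on insertion order, which is why a single fixed-size model can cover events at several time scales at once.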
Model update: spatial consistency
To ensure the spatial consistency of the whole image model and to handle practical situations such as small camera movements or slowly evolving background objects, ViBe uses a technique similar to its updating process: it chooses at random and updates a pixel model in the neighbourhood of the current pixel. Denoting by NG(x) the spatial neighbourhood of a pixel x and by p(x) its value, and assuming that it was decided to update the set of samples of x by inserting p(x), ViBe additionally uses this value p(x) to update the set of samples of one pixel in the neighbourhood NG(x), chosen at random. As a result, ViBe produces spatially coherent results directly, without any post-processing.
Model initialization
Although the model could easily recover from any type of initialization, for example from a set of random values, it is desirable to obtain an accurate background estimate as soon as possible. Ideally, a segmentation algorithm should be able to segment a video sequence from its second frame onwards, the first frame being used to initialise the model. Since no temporal information is available before the second frame, ViBe populates the pixel models with values found in the spatial neighbourhood of each pixel; more precisely, it initialises the background model with values taken at random from each pixel's neighbourhood in the first frame. The background estimate is therefore valid from the second frame of a video sequence.
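The initialization described above can be sketched as follows, for a single-channel frame. The 8-connected neighbourhood (which here may include the pixel itself), the border clamping, and the sample count of 20 are our assumptions for illustration.

```python
import random
import numpy as np

def init_models(frame, n_samples=20, rng=random):
    """Fill each pixel's sample set with values drawn at random from
    its spatial neighbourhood in the first frame. Coordinates are
    clamped at the image borders (an assumption of this sketch)."""
    h, w = frame.shape
    models = np.empty((h, w, n_samples), dtype=frame.dtype)
    for y in range(h):
        for x in range(w):
            for s in range(n_samples):
                dx, dy = rng.randint(-1, 1), rng.randint(-1, 1)
                nx = min(max(x + dx, 0), w - 1)
                ny = min(max(y + dy, 0), h - 1)
                models[y, x, s] = frame[ny, nx]
    return models
```

Because every sample comes from the first frame's local neighbourhoods, the model is usable immediately, and classification can begin at the second frame.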