Video surveillance plays a vital role in observing activity across secured and unsecured environments. A surveillance system is deployed mainly to spot abnormalities in specific areas such as airports, military sites, forests, and other remote locations. This paper presents a new block-based strategy that identifies unusual circumstances by examining pixel-wise motion between frames instead of relying on standard object-based approaches. The density and the speed of the movement are extracted using optical flow, and unusual movement is recognized from discrete cosine transform coefficients. Our goal is a simple block-based Discrete Cosine Transform (DCT) strategy that supports real-time abnormality detection. The proposed approach is evaluated on an airport dataset, and the unusual events detected in it are reported.
Most existing video surveillance systems identify and track only the objects that are in motion. The actions that lead to an object's movement are not tracked, although tracking a person's movement is equally important for detecting abnormal activities. Moreover, most surveillance systems merely record activity, and abnormal events are classified manually by security personnel. This manual intervention should be removed: the surveillance system should be intelligent enough to recognize abnormal events on its own and report them to the concerned authorities automatically. In the first phase, the proposed system recognizes the regions that have changed. In the second phase, it computes the data describing those changes, namely the speed of motion, the acceleration, and the route of movement, and from these it provides a representation of the object's current state. In the last phase, the video is examined by comparing the state parameters with pre-standardized constraints. This provides the details of the unusual activities [
In [
Although various strategies and object-handling methods are used in practice to support tracking in crowded areas, tracking crowded scenes raises more difficulties than tracking short sequences. For instance, it is highly difficult to recognize a targeted object in a crowded area because of the object's size and because of effects such as occlusion and the relative movement of other objects. To overcome these difficulties, some researchers track each part of the targeted object, while others have proposed algorithms that remove the foreground. A plan for recognizing and observing temporal behavior in a crowded area is presented here. Initially, various features are used to recover the contents of every frame involved in the operation. Once every object is identified, the Gaussian Mixture Model (GMM) algorithm is used. In this section, we describe the recognition of unusual behavior in a wider sense, for instance, the unexpected actions of a person. Researchers have tried to extend several methods that are commonly used for video surveillance. Unexpected changes in the scene, such as changes in lighting or weather, and difficulties such as identifying the action are addressed using the GMM. Individual events are identified in the sequence by identifying the action of every person. The “vision.BlobAnalysis” object is then used to analyze the individual objects; blob analysis is instrumental in identifying them. Before blob analysis, the objects are segmented from the background using the GMM, and morphological operations are then applied to remove noise and to extract the boxes containing the connected components.
Even though several tracking-based techniques succeed in various applications, they have certain primary drawbacks. First, a pipelined process can result in a fragile design in which errors propagate through the subsequent phases. Second, the system tracks numerous objects and needs sophisticated, computationally intensive algorithms; as a result, multi-object tracking is not always effective in crowded regions. In this task, the feature and color data of surveillance footage are poor. Third, object tracking is effective for rigid objects such as vehicles and trains, but it is not suitable for formless motion such as waves on water or trees shaking in the wind. To overcome these disadvantages, some authors have recently proposed learning techniques based on features other than motion trajectories. In such cases, object tracking is not required, and pixel-level attributes are used instead. The primary objective of this paper is to monitor the actions of objects and to classify them effectively as normal or abnormal events.
To protect people, it is essential to examine their behavior and to determine whether their actions are usual or unusual. Research works use various keywords for the same notion (abnormal, exceptional, different, exciting, doubtful, and irregular). An abnormality can be defined as a measurement whose probability under normal conditions falls below a threshold value. Atypical behavior refers to an action that has not been observed earlier; such unusual actions are considered abnormal. Human activity becomes abnormal when it involves mugging, robbery, and other criminal activities. To differentiate abnormal from normal activity (for example, running), two definitions are given here. Definition-1 describes normal human activity, and Definition-2 describes abnormal human activity.
Definition-1: In normal activity, a human accelerates slowly and gradually from a starting speed to a higher speed over time. When the moving speed exceeds the normal speed only after a long interval of time, the behavior is defined as normal activity, and it is represented in
Definition-2: A change in the speed of a human from the normal state to the speedy state with high and sudden acceleration represents abnormal activity. When the speed rises above that of the normal state within a short interval of time, the behavior is defined as abnormal activity, which is depicted in
The normal state represents the walking state of the human, and the speedy state represents the running state. The running state can also correspond to riding a bicycle or driving a car, van, etc. V0 is the initial velocity, V_normal is the velocity of the walking state, V_speedy is the velocity of the running state, and t is the time taken by the walking or running state. The steeper the slope of velocity against time, the greater the chance of a potential anomaly.
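The two definitions can be illustrated with a minimal sketch that compares the average acceleration, that is, the slope (V - V0) / t, against a limit. The function names and the limit `max_normal_accel` are hypothetical and chosen only for illustration; the paper does not state a numeric value.

```python
def average_acceleration(v0, v, t):
    """Slope of the velocity-time line: change in velocity over elapsed time."""
    if t <= 0:
        raise ValueError("elapsed time t must be positive")
    return (v - v0) / t


def classify_motion(v0, v, t, max_normal_accel=1.0):
    """Label a speed change as 'normal' or 'abnormal'.

    max_normal_accel is an assumed, illustrative limit (velocity units per
    unit time); it is not a value taken from the paper.
    """
    slope = average_acceleration(v0, v, t)
    return "abnormal" if slope > max_normal_accel else "normal"


# Gradual speed-up over a long interval (Definition-1): slope 0.2.
print(classify_motion(v0=1.0, v=3.0, t=10.0))
# Sudden speed-up over a short interval (Definition-2): slope 2.5.
print(classify_motion(v0=1.0, v=6.0, t=2.0))
```

In practice the limit would have to be calibrated on footage of normal activity for the scene in question.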
A universal model as shown in
Unusual events are recognized and extracted by a motion estimation method based on motion characteristics. Within an image sequence, motion estimation aims to identify the regions corresponding to moving objects such as cars and human beings. The features are based on pixel-wise optical flow, which is one of the most common methods for capturing motion [
The Discrete Cosine Transform (DCT) separates an image into parts of differing importance with respect to visual quality. The DCT is analogous to the Fourier transform in that it represents the image in the frequency domain rather than in the spatial (or, for signals, time) domain. It is convenient to describe the DCT as a set of basis functions that can be computed in advance for a known input array size.
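As a rough illustration of how the transform concentrates a block's energy in a few coefficients, the following sketch computes a naive 2D DCT-II directly from the cosine basis functions. The orthonormal scaling and the function name `dct2` are assumptions made for this example; the paper does not specify which normalization it uses.

```python
import math


def dct2(block):
    """Naive orthonormal 2D DCT-II of a rectangular list-of-lists block."""
    n, m = len(block), len(block[0])
    out = [[0.0] * m for _ in range(n)]
    for u in range(n):
        # Scale factor for the vertical frequency index u.
        au = math.sqrt((1.0 if u == 0 else 2.0) / n)
        for v in range(m):
            # Scale factor for the horizontal frequency index v.
            av = math.sqrt((1.0 if v == 0 else 2.0) / m)
            total = 0.0
            for i in range(n):
                ci = math.cos((2 * i + 1) * u * math.pi / (2 * n))
                for j in range(m):
                    cj = math.cos((2 * j + 1) * v * math.pi / (2 * m))
                    total += block[i][j] * ci * cj
            out[u][v] = au * av * total
    return out


# A flat 4 x 4 block: all of its energy lands in the DC coefficient [0][0].
flat = [[10.0] * 4 for _ in range(4)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 6))
```

This direct form costs on the order of N²M² operations per block, so a real-time system would more likely rely on a fast library routine.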
The two-dimensional DCT of an N × M image is given by
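In its common orthonormal form, the 2D DCT-II of an N × M image f(i, j) can be written as follows. The normalization and the symbols f, F, α, and β are assumptions here, since conventions vary between texts.

$$
F(u, v) = \alpha(u)\,\beta(v) \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} f(i, j)\,
\cos\!\left[\frac{(2i+1)\,u\,\pi}{2N}\right]
\cos\!\left[\frac{(2j+1)\,v\,\pi}{2M}\right],
\qquad
\alpha(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u > 0 \end{cases}
\qquad
\beta(v) = \begin{cases} \sqrt{1/M}, & v = 0 \\ \sqrt{2/M}, & v > 0 \end{cases}
$$

Here u = 0, ..., N-1 and v = 0, ..., M-1 index the vertical and horizontal frequencies, and F(0, 0) is the DC coefficient.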
The DCT is applied to every segment because it gives a compact representation of the signal's energy. Finally, the DCT coefficients are computed in order to build the data structure [
N is the image size and p is the probability of the motion intensity value at a given pixel position. The number of histogram bins depends on the nature of the image; when the motion magnitude per pixel is stored as a grayscale image, 256 bins are used, matching the number of gray levels. To decide whether the motion is usual or unusual, the entropy value is compared with a threshold learned from earlier video. Entropy is a measure of the uncertainty associated with a random variable. The threshold is based on the median of the entropies computed over the initial 500 frames of the video. Here we assume that no unusual action occurs in the first 500 frames of the video sequence. This procedure is repeated regularly over the video sequence, with unusual actions filtered out by means of the median. The basic functioning of DCT is shown below:
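The entropy test described in this paragraph can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the base-2 logarithm, the function names, and the integer-valued input are assumptions.

```python
import math
from statistics import median


def shannon_entropy(values, bins=256):
    """Shannon entropy (in bits) of the histogram of integers in [0, bins)."""
    counts = [0] * bins
    for v in values:
        counts[v] += 1
    total = len(values)
    entropy = 0.0
    for c in counts:
        if c:  # empty bins contribute nothing
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


def baseline_threshold(entropies, warmup=500):
    """Median entropy of the first `warmup` frames, assumed free of anomalies."""
    return median(entropies[:warmup])


# Two equally likely gray levels carry exactly one bit of uncertainty.
print(shannon_entropy([0, 0, 1, 1]))
# A constant image carries none.
print(shannon_entropy([5, 5, 5, 5]))
```

The median is used for the baseline because it is less sensitive than the mean to a few unusually high entropy values in the warm-up frames.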
Median filtering is limited to certain periods in order to control the overall computational complexity. An unusual event is flagged when the entropy of the current frame exceeds, even slightly, the threshold specified for the block. A different strategy [
In earlier research works, individual objects are obtained after background subtraction and are then compared with the objects, or with the features of the objects, in the ground-truth images. In this paper, to increase accuracy, the entropy values of the objects are compared using the DCT block-based comparison, without any disturbance. The entropy is used here because it captures important information about the image. To identify and display the abnormal objects, a bounding box is drawn on the image around each object. Generally, every pixel is characterized by its intensity in the RGB color space, and the probability is computed from the observed values of every pixel. For every connected component, the bounding boxes are found, following each individual's motion, with the “vision.BlobAnalysis” object. Blob analysis detects the regions of connected pixels in an image. The bounding rectangle is created by following the motion of the individual; the location of every person throughout the motion is found from the bounding box output (BBOX), which can be obtained using the following pseudocode.
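The bounding-box step can be approximated with a small connected-component search over a binary foreground mask. The sketch below is a plain-Python stand-in and not the MATLAB “vision.BlobAnalysis” object used in the paper; the function name, the 8-connectivity, and the zero-based [x, y, width, height] layout are assumptions.

```python
from collections import deque


def bounding_boxes(mask):
    """Return [x, y, width, height] for each 8-connected blob of 1s in mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Zero-based top-left corner plus extent of the component.
            boxes.append([c0, r0, c1 - c0 + 1, r1 - r0 + 1])
    return boxes


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
print(bounding_boxes(mask))  # [[0, 0, 2, 2], [3, 2, 2, 2]]
```

The mask is assumed to come from the GMM foreground segmentation after morphological noise removal, so that each remaining blob corresponds to one moving object.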
In this research work, pedestrian surveillance videos are used in which various unusual circumstances are staged with the help of many volunteers. The unusual circumstances consist of volunteers suddenly dancing, running, and pushing in a crowded place. Overall, six kinds of unusual or abnormal circumstances take place in 12000 frames of video. The usual recording quality for video surveillance is 720 × 576 at 29 frames per second, which is the spatial resolution and frame rate of the original video. The frame rate is kept uniform throughout the experiment to ensure that there is no delay in processing frames. The experiment was conducted in outdoor environments under different conditions such as rain, dim light, and shade, and the performance of the system was evaluated.
The proposed strategy is compared with other approaches, and its output is evaluated comprehensively against contemporary works. In the current work, each frame is subdivided into four quadrants (segments); the number of segments per frame is customizable. The entropy of the DCT coefficients is computed for every segment, and the median over the first 500 frames is computed. In this analysis, the threshold is set to three times the median entropy in order to categorize abnormal events. If any anomaly or unusual event is present, an indicator is raised and an alert is generated.
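The block-wise decision rule can be sketched as follows, assuming the per-block entropy values have already been computed. The function names, the row-major block order, and the requirement that the frame size be divisible by the grid are assumptions made for this illustration.

```python
from statistics import median


def split_into_blocks(frame, rows=2, cols=2):
    """Split a list-of-lists frame into rows x cols blocks (row-major order).

    Assumes the frame height and width are divisible by rows and cols.
    """
    h, w = len(frame), len(frame[0])
    bh, bw = h // rows, w // cols
    blocks = []
    for br in range(rows):
        for bc in range(cols):
            block = [row[bc * bw:(bc + 1) * bw]
                     for row in frame[br * bh:(br + 1) * bh]]
            blocks.append(block)
    return blocks


def flag_blocks(history, current, factor=3.0):
    """Flag blocks whose entropy exceeds factor times their baseline median.

    history: one list of baseline entropies per block (e.g. first 500 frames).
    current: one entropy value per block for the frame under test.
    """
    return [value > factor * median(past)
            for past, value in zip(history, current)]


# A 4 x 4 frame holding the values 0..15, split into four 2 x 2 quadrants.
frame = [[4 * r + c for c in range(4)] for r in range(4)]
quadrants = split_into_blocks(frame)
print(quadrants[0], quadrants[3])

# Block 0 exceeds three times its median (1.0); block 1 does not (median 2.0).
print(flag_blocks([[1.0, 1.2, 0.8], [2.0, 2.0, 2.0]], [3.5, 5.9]))
```

Because each block is compared only against its own history, the blocks can be evaluated independently, which is what makes parallel processing possible.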
The number of frames extracted from a segmented video is 95. Initially, all images are assumed to be normal, and the process is started. For each frame, after object detection, the entropy is calculated from the DCT values and compared with the threshold levels that were computed from ground-truth images and stored in a database. The abnormal activities identified from the sequence of input frames are shown in
The performance of the proposed approach is also evaluated in terms of computation time and classification accuracy. The focus of the work is on improving classification accuracy rather than on speed. No major latency issues were observed in anomaly detection. Since the video frames are scanned one immediately after another, any occlusion observed in a frame is cleared in the next frame. To this end, the time taken for conversion at the various stages of the experiment was measured, and the obtained results are shown in
Classification result | 200 frames | 400 frames | 600 frames | 800 frames | 1000 frames | Total (3000 frames) | Percentage |
---|---|---|---|---|---|---|---|
Correctly classified | 184 | 343 | 532 | 713 | 853 | 2625 | 87.5 |
Incorrectly classified | 16 | 57 | 68 | 87 | 147 | 375 | 12.5 |
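The totals and percentages in the table can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are copied from the table.

```python
frames = [200, 400, 600, 800, 1000]
correct = [184, 343, 532, 713, 853]
incorrect = [16, 57, 68, 87, 147]

# Every subset should be fully accounted for by the two rows.
assert all(c + i == n for c, i, n in zip(correct, incorrect, frames))

total = sum(frames)                       # 3000 frames overall
accuracy = 100.0 * sum(correct) / total   # share of correctly classified frames
error = 100.0 * sum(incorrect) / total    # share of incorrectly classified frames
per_subset = [100.0 * c / n for c, n in zip(correct, frames)]

print(total, accuracy, error)  # 3000 87.5 12.5
print(per_subset)
```

The per-subset values show how the accuracy varies with the number of frames, while the reported 87.5% is the pooled figure over all 3000 frames.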
In this research work, a motion-structure-based algorithm is developed to identify unusual events in surveillance video recorded in public places. The major focus was on establishing features that are useful for classifying actions and on using an automatically updated threshold to recognize unusual events. The Discrete Cosine Transform (DCT) was used to analyze anomalous patterns. The entropy of the DCT of the motion magnitude proved to be a consistent measure for categorizing the current action in the video as usual or unusual. Since the proposed technique is block-based, the part of the frame in which the unusual event occurs can be localized. An additional benefit of using blocks is the opportunity for parallel processing in real-time execution, as every block can be processed independently. On the airport surveillance video, five abnormal events were identified while false alarms were avoided. The classification accuracy was 87.5%. From the obtained results, it can be observed that the proposed approach is suitable for anomaly detection and recognition.
This framework is generic and does not depend on the type of scene. As future work, further exploration of DCT-based features for identifying abnormalities is planned. Moreover, the approach can be tested in varying scenarios and its performance can be observed. It can also be scaled to larger volumes of data and applied in different domains to detect various types of anomalies. Since the time complexity is high and the accuracy is comparatively low, both should be improved by computing error models for the frames and the objects. Future work will also verify the results on more benchmark and custom datasets and integrate this model into a sophisticated framework that can be deployed for smart surveillance.
The authors wish to express their thanks to one and all who supported them during this work.