Many video surveillance professionals have come across terms such as artificial intelligence, machine learning and deep learning. But what do these terms mean, and how do they affect video surveillance?
Artificial intelligence, machine learning and deep learning
Artificial intelligence is a broad term for giving a computer program human-like intelligence, or allowing the program to learn over time so that it produces better results. Machine learning is a technique used to achieve artificial intelligence, and deep learning is an evolution of machine learning. In short, deep learning is a more advanced, more complex machine learning technique, and both are ways to achieve artificial intelligence.
Application in video surveillance
In video surveillance, video analysis uses machine learning and deep learning methods to identify objects, classify them, and determine their properties.
Every time we receive new information, our brains try to compare it with similar things we already know in order to make sense of it. Machine learning and deep learning algorithms rely on the same concept of comparison.
Machine learning and deep learning algorithms differ in how they are programmed to determine what constitutes a known object. Machine learning requires more human intervention by the programmer to establish desired parameters in order to achieve desired results. Deep learning identifies object properties independently and may consider features that programmers would not consider.
What do machine learning and deep learning mean for video analytics?
Both terms describe how a system is programmed to learn from data sets. In machine learning, the data attributes the system looks for are usually preset or calibrated by human programmers. For example, the system could be programmed to look for an object that is taller than it is wide, moves its limbs in a particular way and so on, and then mark that object as a person.
Deep learning is considered superior to machine learning in part because programmers may not recognize the most relevant criteria. With the rule-based approach just described, for example, a person sitting still may not be detected accurately.
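The hand-tuned approach can be sketched in a few lines of Python. This is only an illustration: the `TrackedObject` fields and both thresholds are assumptions made up for the example, not parameters of any real analytics product.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    """Hypothetical summary of one moving object in the scene."""
    width_px: float
    height_px: float
    limb_motion: float  # 0.0 (no limb movement) to 1.0 (strong walking motion)


# Hand-picked thresholds: the "preset parameters" a programmer has to tune.
MIN_ASPECT_RATIO = 1.5  # height divided by width
MIN_LIMB_MOTION = 0.3


def looks_like_person(obj: TrackedObject) -> bool:
    """Apply the fixed rules: taller than wide, and limbs are moving."""
    aspect_ratio = obj.height_px / obj.width_px
    return aspect_ratio >= MIN_ASPECT_RATIO and obj.limb_motion >= MIN_LIMB_MOTION


walking = TrackedObject(width_px=40, height_px=110, limb_motion=0.8)
sitting = TrackedObject(width_px=60, height_px=70, limb_motion=0.0)

print(looks_like_person(walking))  # True
print(looks_like_person(sitting))  # False: the seated person is missed
```

The seated person fails both rules, which is exactly the kind of case a programmer may not anticipate when choosing the criteria by hand.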
With deep learning, video analysis algorithms are given large data sets representing objects. This step is called training, and during it the algorithm trains itself to recognize a type of object. For example, the system is shown thousands of photos of people of different genders, dress styles and ethnic backgrounds, taken from different angles and so on.
The algorithm calculates which attributes the images share and which they do not, and determines how much weight to give each of these features. After analyzing thousands of images, the algorithm could calculate that most of the images include a triangular object near the top of the image, with two dark oval blobs near its bottom, which we would recognize as the nose on someone’s face. In fact, the algorithm may also have identified many other such features that we wouldn’t have thought of.
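The idea of weighting features from examples can be illustrated with a much simpler stand-in: logistic regression trained by stochastic gradient descent. The three input features and the six labeled samples below are invented for the sketch. A real deep learning system would learn its features from raw pixels rather than receive them ready-made.

```python
import math

# Toy training set (assumed values). Each row is
# [face_like_pattern_present, height_to_width_ratio, is_moving],
# labeled 1 for "person" and 0 for "not a person".
samples = [
    ([1.0, 2.5, 1.0], 1),
    ([1.0, 1.1, 0.0], 1),  # seated person: squat and still, but a face is visible
    ([1.0, 2.8, 0.0], 1),
    ([0.0, 2.6, 1.0], 0),  # tall moving object without a face
    ([0.0, 1.0, 1.0], 0),
    ([0.0, 0.6, 0.0], 0),
]


def predict(weights, bias, features):
    """Return the estimated probability that the features describe a person."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=2000, learning_rate=0.1):
    """Adjust one weight per feature so predictions move toward the labels."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in samples:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(samples)
print([round(w, 2) for w in weights])
```

No one tells the program which feature matters. Because only the face-like pattern separates the two classes in this toy data, training ends up giving it a positive weight on its own.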
Before users deploy AI software, developers train the system. This process requires a lot of computing power, far more than detecting and classifying objects in the field. The result of training is a file that the system references to determine whether a detected object matches a classification.
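The split between expensive training and cheap field use might look like the following sketch. The JSON layout, the labels and every number in `trained_model` are assumptions chosen for illustration. Real products ship their own model file formats.

```python
import json
import os
import tempfile

# Assumed result of an expensive, offline training run (values are made up).
trained_model = {
    "labels": ["person", "vehicle"],
    "weights": {"person": [2.0, -1.0], "vehicle": [-1.5, 2.5]},
    "bias": {"person": 0.1, "vehicle": -0.2},
}


def save_model(model, path):
    """Developer side: write the trained parameters to a file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f)


def load_model(path):
    """Field side: read the parameters back, with no training needed."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def classify(model, features):
    """Score the features against each label and return the best match."""
    scores = {
        label: model["bias"][label]
        + sum(w * x for w, x in zip(model["weights"][label], features))
        for label in model["labels"]
    }
    return max(scores, key=scores.get)


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "demo_model.json")
    save_model(trained_model, model_path)
    deployed = load_model(model_path)

print(classify(deployed, [1.0, 0.2]))  # person
print(classify(deployed, [0.1, 1.0]))  # vehicle
```

Classification in the field is only a handful of multiplications per object, which is why it needs far less computing power than producing the file in the first place.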
Because the deep learning process lets the machine determine object characteristics, it can provide more fine-grained classification. For example, older methods might be able to detect a person, but an analysis based on deep learning could detect whether the person is a man, woman or child. It can also detect relevant characteristics of individuals, as well as the type or make of a vehicle.
AI needs to learn over time
AI in video surveillance is trained at design time and, in some cases, does not gradually become “smarter” when used in the field. Deep learning and machine learning do have this capability, however, and video analytics built on deep learning technology can also learn over time.
A typical application might involve determining what is normal in a scene. For example, school corridors experience a rush of people during breaks, about every 45 minutes. During these peak periods, crowds are spread out rather than concentrated in any particular area.
Also, it’s unusual for everyone to move at very high speeds. If the system detects an unusual concentration of objects, it could indicate a fight. If everyone is running in the same direction outside the usual recess, it could indicate an emergency.
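One simple way to express "learning what is normal" is to build a statistical baseline per time slot and flag large deviations. The sketch below covers only people counts, not speed or direction, and the history values, slot names and threshold are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical history: people counted in a corridor, grouped by time slot.
history = {
    "lesson": [2, 3, 1, 4, 2, 3, 2],
    "break": [38, 42, 40, 45, 37, 41, 39],
}


def learn_baseline(history):
    """Summarize each time slot as (average count, standard deviation)."""
    return {slot: (mean(counts), stdev(counts)) for slot, counts in history.items()}


def is_unusual(baseline, slot, count, threshold=3.0):
    """Flag a count more than `threshold` standard deviations from the average."""
    average, spread = baseline[slot]
    return abs(count - average) > threshold * spread


baseline = learn_baseline(history)

print(is_unusual(baseline, "break", 43))   # False: a normal break-time rush
print(is_unusual(baseline, "lesson", 35))  # True: a crowd during lesson time
```

Appending new observations to the history and recomputing the baseline is what would let such a system adapt to a site over time.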
Smarter systems, better analytical results
Video surveillance systems generate a lot of data.
Monitoring and filtering such a large amount of information makes it harder than ever to identify security incidents quickly and to find evidence. Smart systems using deep learning can identify evidence in a more timely manner and analyze video in real time to alert system operators to suspicious events, providing better results for users' security plans.
EasyCVR, TSINGSEE's structured intelligent video analysis platform for security, has incorporated deep learning technology to enable intelligent analysis of incoming video surveillance images.
EasyCVR can collect audio and video from cameras and video source equipment of different manufacturers, protocols and models, and push the video streams to a cloud platform in a unified, standard video format and transmission protocol. This provides lightweight access to and distribution of massive urban surveillance video resources and interconnects equipment and platform, forming a comprehensive platform for sensing, storage, understanding and application.
Building on AI algorithms, deep learning, big data intelligent analysis, edge computing, 5G and other emerging technologies, we will enable more "video + AI" application scenarios and accelerate the adoption of video AI in more industries.