Author: Scott Pezanowski
Produced: April 10, 2019
This document serves as a general review and recommendation report on Hikvision AI Cloud for security video surveillance. It is not intended to investigate specific performance measures of the system compared to alternative systems but rather to give general recommendations for more in-depth future review based upon needs. First, the requirements considered for this report are listed. Next, an overview and description of Hikvision and its AI Cloud is included, followed by detailed requirements and features. General concerns about Hikvision and associated technologies are summarized before conclusions are drawn. Lastly, a background of related Artificial Intelligence (AI) technologies is provided as a reference.
A key component of modern video surveillance systems is the use of AI, and more specifically machine learning, to identify objects of interest within the large quantity of video that shows only normal daily activities. This relieves people of a repetitive, systematic, and tiring task that can lead to human error: computers identify the video segments of interest, allowing a trained security professional to focus on investigating them further. It is critical that these surveillance systems identify video segments of interest quickly and accurately. Also, given the sensitive nature of the video surveillance field, data security and privacy of such systems should be given high priority. The technology behind the surveillance camera hardware itself is also important, with on-camera computing resources and Internet capabilities a must in modern systems.
In this section, a brief overview of Hikvision is included followed by an overview of Hikvision AI Cloud.
Hikvision is a company based in Hangzhou, China. It was founded in 2001 and has its origins in academic technology research funded by the Chinese government. Hikvision is one of the world leaders in video surveillance. Its products range from security camera hardware to artificial intelligence applications for video surveillance through image recognition in videos and machine learning on related data.
AI use in video surveillance is a rapidly growing field because machine learning on images is one of the most robust and accurate applications of machine learning (and more specifically, deep machine learning). Video surveillance can produce far more video and information than a human can analyze and understand, so the success of machine learning in detecting images of interest is critical.
Because by nature video surveillance produces a large amount of data to analyze and it comes in the form of video and images, AI is a natural fit for Hikvision to improve its services.
Hikvision AI Cloud
Hikvision AI Cloud is a system for video surveillance that employs AI through machine learning to identify video footage of interest out of the vast amount of video footage that is likely to not be of interest (normal daily activities). The majority of security cameras will have a large amount of footage with normal daily activities without any security concerns. In addition to this discrepancy in footage, it is common for large organizations to have many security cameras in use at any given moment, thus making it a challenge for people to view and analyze all video footage.
In addition to detecting images that are of interest for further action by trained security professionals, modern state-of-the-art machine learning allows a computer to detect specific objects of interest and identify patterns that even trained professionals may not be able to find. Identification of objects of interest in images such as weapons or company equipment can be performed with high accuracy. Therefore, a surveillance system should not only alert professionals to security threats but also alert them to the type of threat.
Hikvision AI Cloud is an environment for video surveillance AI with a three-layered architecture:
1) A layer that contains the security sensors (primarily video surveillance cameras) that record video and do initial processing on the imagery to detect images and objects of interest.
2) A secondary layer at Hikvision containing computers that perform more sophisticated intelligent processing on the video data. The cameras are able to connect to this layer because they are Internet enabled.
3) A central cloud-based computing facility that provides high-performance intelligent big-data analysis and machine learning.
This three-layered architecture allows a pre-calculated, sophisticated machine learning model to reside on the cameras themselves and make predictions on video in real time. The model itself can be created on the larger, more powerful central cloud. As more video is accumulated and model predictions on items of interest are made, humans can confirm or deny these predictions, thereby allowing the model built on the central cloud to be improved.
Figure 1: Hikvision AI Cloud three-layered architecture.
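The division of labor between the camera layer and the central cloud can be sketched as a small Python simulation. This is an illustrative sketch only, not Hikvision's implementation: the class names (`Model`, `CloudTrainer`, `EdgeCamera`), the single-threshold "model", and the integer frame scores are all hypothetical, and the intermediate processing layer is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Model:
    """Stand-in for a trained model: flags a frame when its score >= threshold."""
    threshold: float

    def predict(self, score: int) -> bool:
        return score >= self.threshold


@dataclass
class CloudTrainer:
    """Central cloud layer: rebuilds the model from human-reviewed examples."""
    reviewed: List[Tuple[int, bool]] = field(default_factory=list)

    def add_feedback(self, score: int, is_threat: bool) -> None:
        self.reviewed.append((score, is_threat))

    def train(self) -> Model:
        # Toy "training": place the threshold midway between the highest-scoring
        # benign example and the lowest-scoring confirmed threat.
        benign = [s for s, threat in self.reviewed if not threat]
        threats = [s for s, threat in self.reviewed if threat]
        return Model(threshold=(max(benign) + min(threats)) / 2)


@dataclass
class EdgeCamera:
    """Camera layer: applies the pre-computed model to each frame."""
    model: Model

    def process(self, frame_scores: List[int]) -> List[int]:
        # Return the indices of frames that should raise an alert.
        return [i for i, s in enumerate(frame_scores) if self.model.predict(s)]


# Hypothetical reviewed examples: (score from 0 to 100, confirmed threat?).
cloud = CloudTrainer()
for score, is_threat in [(10, False), (30, False), (70, True), (90, True)]:
    cloud.add_feedback(score, is_threat)

camera = EdgeCamera(model=cloud.train())      # threshold is 50.0
alerts = camera.process([20, 55, 95, 40])     # frames 1 and 2 are flagged

# A reviewer rejects the alert on the frame scoring 55 as a false alarm.
# The cloud retrains and the updated model is pushed back to the camera.
cloud.add_feedback(55, False)
camera.model = cloud.train()                  # threshold is now 62.5
alerts_after = camera.process([20, 55, 95, 40])

print(alerts, alerts_after)
```

The point of the sketch is the loop: the expensive model building happens centrally, the camera only applies the result, and human confirmation feeds back into the next model.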
Below, discussions of Hikvision AI offerings, hardware technologies, performance aspects, and data security provide specific evaluation topics. A discussion of Hikvision’s general features of interest and potential benefits concludes this section. As mentioned above, a specific numerical evaluation is not intended in this document but should be considered once a specific need for a system is established.
Hikvision video surveillance systems provide facial recognition of people; detection and counting of people, animals, and cars; and detection of many other types of objects of interest such as mobile phones, weapons, and computer equipment. It is likely that items of interest can be customized to fit any organization’s specific needs. General use cases of people and object counting include determining high and low traffic areas and other potential areas of interest for security threats, while specific object detection can alert security to specific threats.
Although a precise comparison of the performance of video surveillance companies’ machine learning models is not feasible, Hikvision’s deep learning models are likely to be the industry’s most comprehensive and accurate. Here, comprehensive means that the models can detect a large number of different types of people’s faces and different types of objects of interest. Accurate means both that the models’ predictions of objects are correct (avoiding false alarms) and that the models do not fail to predict an object of interest (a missed security threat).
Hikvision cameras are Internet enabled and have computing resources on the camera itself. With an Internet connection to a large cloud-based infrastructure, video footage can be used to improve the accuracy of future models. Having computing resources on the camera allows a predictive model to run on the frontline of the security system. Hikvision cameras allow a machine learning model to give real-time predictions of objects in the video in order to alert personnel immediately of any concerns.
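To illustrate how detection counts could be turned into traffic information, the following minimal sketch tallies detected people per camera location. The record format, the location names, and the threshold of 3 are hypothetical and do not reflect any Hikvision data format.

```python
from collections import Counter

# Hypothetical detection records: (camera location, detected object type).
detections = [
    ("lobby", "person"), ("lobby", "person"), ("lobby", "person"),
    ("parking", "car"), ("parking", "person"),
    ("loading dock", "person"),
    ("lobby", "mobile phone"),
]

# Count the people seen at each location.
people_per_location = Counter(loc for loc, obj in detections if obj == "person")

# Label a location "high traffic" when its count reaches an assumed threshold.
HIGH_TRAFFIC_THRESHOLD = 3
high_traffic = sorted(
    loc for loc, n in people_per_location.items() if n >= HIGH_TRAFFIC_THRESHOLD
)

print(dict(people_per_location))
print(high_traffic)
```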
In general, training a machine learning model to make predictions is very computationally intensive. This is one of the reasons why Hikvision AI Cloud has a central high performance cloud-based computing layer. However, after a machine learning model has been created, using that model to make predictions on new video is not as computationally intensive and can be performed by the smaller computer on the camera. This fact allows Hikvision to have predictions performed on the security cameras themselves for real-time identification of objects of interest.
System General Performance
General performance concerns about Hikvision security systems stem from their dependence on an Internet connection with a cloud-based computing model. Since all cameras are Internet enabled and rely on cloud-based computers for improvements to machine learning models, having a dependable high-speed Internet connection is a must. Video footage can quickly amount to a large volume of data that then needs to be uploaded to Hikvision servers for processing. Machine learning on imagery is improved with higher resolution images, and this higher resolution grows data sizes substantially. Although concerns exist about the need for a quality Internet connection, these concerns can be overcome with investment in high quality Internet equipment, use of the most advanced video compression algorithms, and technical strategies allowing systems to continue to function well for short times during Internet outages.
A dependence on an Internet connection also brings into consideration general Internet data security issues, such as theft of the data at any transmission point on the Internet during the data transfer back and forth to Hikvision computers. Although data security is a concern, it can be addressed and mitigated using Internet best practices for data security, such as current state-of-the-art algorithms for encryption.
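As a rough worked example of how quickly footage accumulates, the sketch below estimates daily data volume and sustained upload bandwidth. The camera count and the per-camera bitrates are assumed values chosen for illustration, not Hikvision specifications.

```python
def daily_volume_gb(cameras: int, bitrate_mbps: float, hours: float = 24.0) -> float:
    """Data produced per day in gigabytes (8 bits per byte, 1 GB = 1000 MB)."""
    megabits = cameras * bitrate_mbps * hours * 3600
    return megabits / 8 / 1000


# Assumed scenario: 100 cameras recording continuously.
CAMERAS = 100
for label, mbps in [("lower resolution", 2.0), ("higher resolution", 8.0)]:
    volume = daily_volume_gb(CAMERAS, mbps)
    uplink = CAMERAS * mbps  # sustained upload in Mbit/s if all footage is sent
    print(f"{label}: {volume:,.0f} GB/day, {uplink:,.0f} Mbit/s uplink")
```

Under these assumptions, a fourfold increase in bitrate raises the daily volume from 2,160 GB to 8,640 GB, which is why compression and selective upload matter.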
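As one concrete instance of such best practices, the sketch below configures a client-side TLS context with Python's standard `ssl` module so that data in transit is encrypted and the server's certificate is verified. It illustrates the general practice only; how Hikvision devices actually secure their connections is not covered in this report.

```python
import ssl

# Default client context: verifies the server certificate and hostname.
context = ssl.create_default_context()

# Refuse protocol versions older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode == ssl.CERT_REQUIRED)  # certificate checking enabled
print(context.check_hostname)                    # hostname checking enabled
```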
In addition to these data security technical concerns, as discussed in the Privacy section below, the Hikvision architecture requires video footage and data to be sent to Hikvision company servers. This poses possible data security concerns from Hikvision and the Chinese government.
In general, governments such as those of the United States of America and Australia have expressed substantial security concerns with Chinese technology companies. These countries’ governments have also identified Hikvision and other Chinese video surveillance companies specifically, given the potentially sensitive nature of the field.
These concerns stem from the fact that the video footage captured by the cameras can be easily stored and analyzed by the Chinese government. The Chinese government directly and indirectly owns a large portion of Hikvision and has been directly involved throughout its research and development history. Any discussion with Hikvision about its services should include these concerns, and Hikvision should detail the steps taken to protect data security and privacy.
Along with the specific concerns above about Hikvision and its Chinese government ties, there are broader sociological and privacy concerns related to the tracking of people and the collection of their personal information. In any application of these surveillance and AI technologies, there should be strict company technical data security measures in place along with strict company security policies. The number of people with access to such data should be kept to a minimum. Again, these data security concerns can be mitigated by using proper industry standard data security practices.
Despite these concerns about Hikvision and AI, the potential benefits of AI applications in security are immense and therefore it should be considered for any major security system. For the majority of data security and privacy concerns related to AI and Internet technologies, solutions exist using industry standards and best practices to address them along with a strong relationship with the provider.
Features & Potential Extensions
Video surveillance merged with AI provides many potential benefits. Machine learning models can be trained to recognize faces, count people and objects, and recognize animals, cars, and other security objects of interest such as weapons.
Given that Hikvision security cameras capture video at known locations and their use of machine learning provides recognition of people, cars, objects, etc. within videos at these locations, this information can then be tied to a vast amount of other information for improved data mining and security knowledge discovery. The potential for a video surveillance system to be tied to satellite imagery, company attendance records, and company key performance measures to gain knowledge about company interests is immense and should be explored. Also, information from the cameras can be used to determine potential problem areas such as high traffic areas for vehicles and people.
This section is intended to give an overview of alternative systems to Hikvision AI Cloud. Multiple companies offer comparable systems for video surveillance that include AI capabilities. This review of Hikvision AI Cloud is not meant to include a comprehensive review of other systems. A more thorough evaluation of these alternatives along with a price comparison is recommended given a direct need for such a security system.
Axis, Bosch, Dahua, and Honeywell are all companies that provide comparable video surveillance systems and have AI machine learning capabilities. Costs for such systems are likely to vary widely and can be customized to a company’s own needs. Even so, it has been suggested that Hikvision’s price points in general are lower than those of its competitors. A brief selection and description of Hikvision alternative systems is listed below in no particular order.
In addition to these companies, other alternatives for video surveillance systems include Pelco by Schneider, Tyco, Panasonic, Samsung, and Sony, along with many other newer and smaller companies.
When deciding on a video surveillance system, each of these companies’ systems can be considered as alternatives to Hikvision and likely have close to comparable technologies.
Axis is a pioneer in surveillance cameras that are Internet-enabled. Axis is based in Lund, Sweden. Given its lengthy history of innovation, Axis is considered one of the industry leaders in employing AI and machine learning towards video surveillance. Axis is also a pioneer in including computing capabilities on cameras themselves. As discussed earlier, having computing resources on cameras themselves allows for real-time detection of objects of interest in video surveillance.
Bosch Security and Safety Systems
Bosch is located near Stuttgart, Germany and has a strong history in security and video surveillance cameras. Although it only more recently invested heavily in Internet-enabled cameras, its longstanding experience with video surveillance has quickly brought it to the top of the field.
Dahua Technology is a Chinese company with offerings similar to Hikvision’s. Dahua and Hikvision are the leading video surveillance companies based in China, and a consideration of video surveillance systems should include both.
Honeywell is based in Charlotte, NC, USA. Like Bosch, it has a strong history in video surveillance but was later to invest in Internet-enabled cameras. However, its current product offerings are advanced.
Hikvision is one of the industry leaders in video surveillance. Their Hikvision AI Cloud product and related technologies are used throughout China and the rest of the world. In addition, they have provided surveillance for events with major security threats, including the 2008 Beijing Olympics. Hikvision’s success is in large part due to their strong adoption of AI, primarily through machine learning applied to image recognition in security videos. The accuracy and comprehensiveness of facial recognition and object detection employed by Hikvision is likely to be unmatched in the private sector. Concerns exist mainly because Hikvision is a Chinese company with strong ties to the Chinese government, and China has far fewer restrictions on its government’s access to company information as compared to the United States, European countries, and Australia. Lastly, Hikvision’s prices are likely to be lower than those of comparable systems.
Given that Hikvision AI Cloud has such advanced technologies and their price points are likely to be lower than competitors, this is a strong candidate to enhance a company’s security system. In discussions with Hikvision, special consideration should be given to data and privacy issues.
Machine learning as applied to images and videos can be adapted for different purposes. First, machine learning can classify an image as showing a certain type of object, landform, color, etc. Second, machine learning can detect objects within images and identify each by type, such as a person, car, computer, weapon, etc. Image recognition in machine learning combines both of these: images are classified into categories such as normal routine images and security threats, and specific objects in the images are detected and reported. The example in Figure 2 shows object detection of an abandoned object in a public area.
A classification of this image may identify the image as a potential security threat to be investigated while object detection would specifically identify the abandoned bag.
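The distinction can be made concrete by looking at the shape of each result for the abandoned-object example. In this sketch the `Detection` type, the labels, the confidence values, and the bounding-box coordinates are invented for illustration; a real system would obtain them from a trained model.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Detection:
    """One detected object: what it is, how confident, and where it is."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def classify_image(detections: List[Detection], threat_labels: Set[str]) -> str:
    """Image-level classification: a single label for the whole image."""
    if any(d.label in threat_labels for d in detections):
        return "potential security threat"
    return "normal routine"


# Hypothetical detector output for one frame.
frame = [
    Detection("person", 0.97, (40, 60, 80, 200)),
    Detection("abandoned bag", 0.88, (310, 220, 60, 45)),
]
THREATS = {"abandoned bag", "weapon"}

print(classify_image(frame, THREATS))                      # whole-image label
print([d.box for d in frame if d.label in THREATS])        # where the object is
```

Here the image-level label is derived from the detections for simplicity; in practice a separate classification model could produce it.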
Figure 2: Detection of an abandoned object in a public area.
It is important that machine learning models are evaluated for their performance. As mentioned above, an important evaluation measure is the range of different objects a model can detect. There are many sophisticated evaluation metrics commonly used in machine learning, with a simple but powerful one being a confusion matrix. A confusion matrix compares the number of correct and incorrect predictions made by a model on a set of data where the answers are known. Figure 3 shows a typical confusion matrix where 50 images were correctly predicted to be the object, 10 images were incorrectly predicted to be the object when they are not, 5 images were incorrectly predicted to not be the object when they are, and 100 images were correctly predicted to not be the object.
Figure 3: Confusion matrix to evaluate a machine learning model.
In security, it is important to minimize both the number of images where the model incorrectly predicted an object that is not there (false alarms that use up valuable human resources to investigate) and the number of images that were incorrectly predicted to not be an object when they actually are (failing to identify a security threat). Performance measures of models against known answers are important to report in order to improve models and to convey to others the expected accuracy of models against new video footage.
Hikvision AI primarily utilizes deep learning. Deep learning (explained below) is a subfield of machine learning, which in turn is a subfield of AI. In machine learning, computers apply mathematical and statistical algorithms to data to find patterns and anomalies and then look for these same patterns in unseen data. Machine learning has been widely used in various sectors of society, with its popularity growing tremendously since the early 2000s. Some of the most successful applications of machine learning include identifying fraudulent bank card transactions, analyzing satellite imagery, and automated language translation, among many others.
Deep learning is a family of machine learning algorithms that are able to find patterns in data that non-deep algorithms cannot, because they perform many manipulations of data based upon errors found in their own previous calculations. Although deep learning has been in existence since the 1980s (and with roots even earlier), its use was limited because computer hardware was not capable of making the incredibly high number of mathematical calculations required (albeit often simple calculations). The discovery that high-performance graphics processing units (GPUs), traditionally used to display high resolution graphics on computer monitors, are well suited to performing this large number of simple calculations has made deep learning algorithms feasible and caused their increased popularity in the 2010s.
Although deep learning models serve a similar purpose to traditional machine learning, the primary advancements of deep learning are the increased accuracy and comprehensiveness of the models’ predictions. If a model used in security surveillance increases the accuracy of its predictions, this can easily mean the difference in preventing a security threat.
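Using the counts from Figure 3, the two error types discussed above can be summarized with standard metrics. The sketch below computes precision, recall, accuracy, and false alarm rate; the variable names are chosen here for illustration.

```python
# Counts from the confusion matrix in Figure 3.
tp = 50   # predicted to be the object, and is (correct alarm)
fp = 10   # predicted to be the object, but is not (false alarm)
fn = 5    # predicted to not be the object, but is (missed threat)
tn = 100  # predicted to not be the object, and is not

precision = tp / (tp + fp)                  # share of alarms that are real
recall = tp / (tp + fn)                     # share of real objects that are caught
accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all predictions that are right
false_alarm_rate = fp / (fp + tn)           # share of non-objects that raise an alarm

print(f"precision        = {precision:.3f}")         # 0.833
print(f"recall           = {recall:.3f}")            # 0.909
print(f"accuracy         = {accuracy:.3f}")          # 0.909
print(f"false alarm rate = {false_alarm_rate:.3f}")  # 0.091
```

Precision falls as false alarms rise, and recall falls as threats are missed, so reporting both captures the two concerns separately in a way that accuracy alone does not.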
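The idea of repeatedly adjusting a model based on the errors of its previous calculations can be illustrated with gradient descent on a single weight. This is a deliberately tiny sketch, not a deep network: the data, learning rate, and number of steps are assumptions chosen for illustration. Deep learning models apply the same kind of update to a very large number of weights, which is where the high calculation count comes from.

```python
# Training data following y = 2 * x; the model must discover the factor 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

weight = 0.0          # initial guess
learning_rate = 0.01  # assumed step size

for step in range(200):
    # Gradient of the mean squared error with respect to the weight.
    grad = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    # Move the weight in the direction that reduces the error just measured.
    weight -= learning_rate * grad

print(round(weight, 3))
```

Each pass is only a handful of multiplications and additions, which matches the observation above that the individual calculations are simple but extremely numerous.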
Hikvision, “Hikvision holds ‘shaping intelligence’ AI cloud world summit,” Hikvision Corporate News. [Online]. Available: https://www.hikvision.com/en/Press/Press-Releases/Corporate-News/Hikvision-Holds-Shaping-Intelligence-AI-Cloud-World-Summit
X. Yu, “Is the World’s Biggest Surveillance Camera Maker Sending Footage to China?” Voice of America. [Online]. Available: https://www.voanews.com/a/hikvision-surveillance-cameras-us-embassy-kabuk/3605715.html
Memoori Smart Business Research, “The competitive landscape of the world video surveillance business,” Memoori Smart Business Research Blog. [Online]. Available: https://www.memoori.com/competitive-landscape-world-video-surveillance-business/
Azati Corporation, “Image detection, recognition, and classification with machine learning,” Azati Software Blog. [Online]. Available: https://azati.com/image-detection-recognition-and-classification-with-machine-learning/
N. Bird, S. Atev, N. Caramelli, R. Martin, O. Masoud, and N. Papanikolopoulos, “Real time, online detection of abandoned objects in public areas,” in Proceedings 2006 IEEE International Conference on Robotics and Automation (ICRA 2006), May 2006, pp. 3775–3780.