These days, when you scan Internet resources or take part in discussions, you inevitably come across materials and comments related to the use of cloud technology in antivirus protection.
Opinions vary widely, from accusations that vendors are running blatant PR campaigns for an antivirus cloud that offers no real benefit, to assertions that these so-called clouds are a universal panacea. Internet users and security professionals alike are engaged in these discussions, and no one seems able to agree.
The objective of this article is to make an attempt to really get to the bottom of the situation. We will address only the real-time collaboration of personal antivirus products installed on user computers with the vendor’s cloud infrastructure. This article will not discuss SaaS/hosted services.
For the sake of simplicity, we will use the term “antivirus cloud” to refer to an antivirus company’s system used to process information obtained from user computers in order to identify new, as yet undetected threats. Anticipating objections to using the word “cloud” in this way, we point to the already standard practice of using the term in this context. Whether the name is appropriate is beyond the scope of this article.
This article will provide an answer to the question: what is an antivirus cloud, really, and what are its pros and cons? It is aimed primarily at readers who want to learn more about cloud-based antivirus protection, understand the general principles of how an antivirus cloud works, and see what it offers in terms of protection.
The pre-cloud era, or what led to cloud formation
Over the past 20 years, antivirus protection has primarily been based on signature analysis and heuristic analysis. This was quite sufficient to effectively counteract malicious content, since:
- new malicious programs appeared relatively infrequently, and even the few virus labs run by antivirus companies had no trouble keeping up with them;
- the response time provided by typical antivirus updates fully met requirements and was sufficient to block threats.
However, in 2003-2004, we saw the development of mass communication, a rapid growth in the number of Internet users, and the arrival of Internet business, which created attractive conditions for cybercriminals. At first malicious programs were created simply for the fun of it or to prove a virus writer’s skill. Later, as opportunities arose to make money from the virtual property of others, and to steal others’ funds, cybercriminals started proactively developing malware in order to make money.
Increase in the number of unique malicious files detected by Kaspersky Lab
In addition to the increase in the number of new malicious files, we also saw an upswing in the number of different ways used to steal money: cybercriminals developed even more effective techniques for conducting attacks.
Antivirus developers continued to improve heuristic methods to detect malware and introduced automatic systems and/or automatic detection features in their products. The latter led to a marked increase in the volume of updates, which approached a threshold at which downloading them was becoming a major inconvenience for users.
Annual increase in the number of antivirus updates in MB (incl. 2010 forecasts)
The ongoing battle between cybercriminals and antivirus companies has grown more intense, and each side has proactively examined the enemy’s tools and methods. In 2008-2009, the speed at which new malicious programs appeared reached a new level and typical update systems were no longer sufficient to counteract the threats. According to a study conducted in the second quarter of 2010 by NSS Labs, antivirus companies required anywhere from 4.62 to 92.48 hours to block Internet threats. Improving response time to threats using typical antivirus updates was impossible, since the time required to detect threats, analyze them, and test antivirus updates had already reached a minimum.
It would seem that response time could be improved by using heuristic detection methods, which help block threats as soon as they appear without waiting for the release of antivirus database updates. However, heuristic methods detect, on average, just 50-70% of threats, which means that 30-50% of all emergent threats are left undetected by heuristic methods.
As a result, the main questions that the antivirus industry has had to consider recently are:
- How can we automate protection to counteract the increasing flood of threats?
- How can we minimize the size of antivirus databases while retaining a high level of protection?
- How can we significantly improve response time to emergent threats?
These questions have forced antivirus developers to devote more attention to the development of alternative methods of detecting and blocking today’s threats. The use of antivirus cloud technologies is one such method.
How an antivirus cloud works
As stated above, this article uses the term “antivirus cloud” to refer to the infrastructure that an antivirus company uses to process information obtained from the computers of those who use a specific personal product, in order to identify new, as yet undetected threats and to perform a number of other tasks. The technologies used to store and process user data remain in the background. The antivirus program sends a request to the cloud to see if there is any information available about a particular program, activity, link, or resource. The response will be either “yes, there is information” or “no, there is no information.”
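The request and response exchange described above can be illustrated with a minimal sketch. It assumes an in-memory dictionary standing in for the vendor's cloud; the names `CLOUD_KNOWLEDGE`, `CloudResponse`, and `query_cloud` are hypothetical and do not correspond to any real product API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory stand-in for the vendor's cloud knowledge base.
# Keys identify an object (a file hash or a link); values are verdicts.
CLOUD_KNOWLEDGE = {
    "hash:aaaa": "malicious",
    "url:http://bad.example/payload": "malicious",
    "hash:bbbb": "clean",
}


@dataclass
class CloudResponse:
    has_information: bool          # "yes, there is information" / "no, there is none"
    verdict: Optional[str] = None  # only set when information is available


def query_cloud(object_id: str) -> CloudResponse:
    """Return what the (simulated) cloud knows about an object."""
    verdict = CLOUD_KNOWLEDGE.get(object_id)
    if verdict is None:
        return CloudResponse(has_information=False)
    return CloudResponse(has_information=True, verdict=verdict)


known = query_cloud("hash:aaaa")
unknown = query_cloud("hash:zzzz")
```

Here `known` carries a verdict, while `unknown` simply reports that the cloud has no information about the object.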
How does the cloud differ from antivirus updates?
Options for user communication with antivirus infrastructure
An update system assumes one-way interaction between the antivirus company and the user: from the AV vendor to the user. There is no feedback from the user, which is why it is not possible to promptly identify suspicious activity, or to obtain information about a spreading threat or its sources. Antivirus companies often face delays because they have to obtain this kind of data via additional channels.
In contrast, the cloud approach is bilateral. A number of computers connected to the cloud via a central server inform the cloud of sources of infection and of any suspicious activity that has been detected. Once this information has been processed, it becomes accessible to other computers that are connected to the cloud. In effect, users are able to share information about attacks launched against them and the sources of those attacks via the antivirus company’s infrastructure (not directly with one another!). The result is an integrated, distributed, intelligent antivirus network which functions as a single whole.
The main difference between the cloud and existing antivirus technologies is the object being detected. While earlier generations of technologies (such as signatures) worked with objects in the form of files, an antivirus cloud works with metadata. Consider an example to understand what metadata is. Suppose we have a file: it is an object. Information about that file is metadata, which includes the file’s unique identifier (a hash value computed from its contents), data about how the file came to be in the system, how it behaved, etc. New threats are identified in the cloud using metadata even though the files themselves are not actually transmitted to the cloud for initial analysis. This approach facilitates real-time collection of data from tens of millions of voluntary participants in a distributed antivirus network in order to identify as yet undetected malware.
For example, if an antivirus user opts in to the Kaspersky Security Network (KSN), the product will start to send two different kinds of metadata to Kaspersky Lab:
- data about infections or attacks;
- data about executable files’ suspicious activity.
It should be stressed that this information is only transferred with the user’s consent.
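As a rough illustration of what such metadata might look like, the sketch below builds a record from a file's hash, its origin, and its observed behavior, and prepares it for sending only if the user has opted in. The field names and the `FileMetadata`, `build_metadata`, and `prepare_report` helpers are assumptions made for this example; they do not describe the actual KSN data format.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class FileMetadata:
    sha256: str                   # unique identifier: hash of the file contents
    origin: str                   # how the file came to be in the system
    observed_behavior: List[str]  # what the file did after launch


def build_metadata(content: bytes, origin: str, behavior: List[str]) -> FileMetadata:
    """Describe a file without including the file itself."""
    digest = hashlib.sha256(content).hexdigest()
    return FileMetadata(sha256=digest, origin=origin, observed_behavior=list(behavior))


def prepare_report(meta: FileMetadata, user_opted_in: bool) -> Optional[str]:
    """Serialize metadata for sending, but only with the user's consent."""
    if not user_opted_in:
        return None
    return json.dumps(asdict(meta), sort_keys=True)


meta = build_metadata(b"abc", "downloaded by browser", ["modifies autorun settings"])
report = prepare_report(meta, user_opted_in=True)
no_report = prepare_report(meta, user_opted_in=False)
```

Note that the report contains only the hash and descriptive fields, never the file contents, which mirrors the point that files themselves are not transmitted for initial analysis.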
The expert system identifies threats and checks for decision making errors, then looks for the sources spreading the threat. The sources, once located, also undergo automatic checks, in order to rule out any false positives. The data obtained by the expert system about the newly emergent threats and their sources is then promptly made available to all product users.
Metadata about infections is used to train expert systems, which consequently respond quickly to the latest malware and cybercriminal techniques by automatically identifying active threats on users’ computers. Information used by the system for self-learning includes verdicts received from signature and heuristic detection. It should be emphasized that the most effective user protection is achieved by using a combined approach meshing an antivirus cloud with other technologies already used to counteract threats.
By gathering and processing data about suspicious activity from each participant in the network, the cloud is, essentially, a powerful expert system designed to analyze cybercriminal activity. Data needed to block attacks is provided to all participants in the cloud network, which helps prevent subsequent infection.
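The idea of pooling reports from many participants and sharing the result with everyone can be sketched as a toy aggregator. This is a deliberately simplified assumption, not the logic of any real expert system: the `ExpertSystem` class, the report threshold, and the allowlist check are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class ExpertSystem:
    """Toy aggregator: blocks an object once enough distinct participants report it."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold    # hypothetical cut-off, not a real product value
        self.allowlist: Set[str] = set()  # known-good objects, to rule out false positives
        self.blocklist: Set[str] = set()  # verdicts shared with all participants
        self._reporters: Dict[str, Set[str]] = defaultdict(set)

    def report_suspicious(self, participant_id: str, object_hash: str) -> None:
        """Record one participant's report about a suspicious object."""
        if object_hash in self.allowlist:
            return  # known-good object: ignore the report
        self._reporters[object_hash].add(participant_id)
        if len(self._reporters[object_hash]) >= self.threshold:
            self.blocklist.add(object_hash)

    def is_blocked(self, object_hash: str) -> bool:
        """Any participant can ask whether an object has been flagged."""
        return object_hash in self.blocklist


system = ExpertSystem(threshold=2)
system.report_suspicious("user-1", "hash-x")
system.report_suspicious("user-1", "hash-x")  # repeat reports from one user count once
blocked_after_one_user = system.is_blocked("hash-x")
system.report_suspicious("user-2", "hash-x")
blocked_after_two_users = system.is_blocked("hash-x")
```

Counting distinct reporters rather than raw reports is one simple way to keep a single noisy participant from triggering a block on their own.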
Pros and cons of the cloud
- Response time. This is one of the key advantages to cloud protection. The speed at which a threat can be identified and blocked significantly exceeds the speed at which standard antivirus updates provide protection. While a signature update may require several hours, cloud technology can identify and detect new threats in just minutes.
The longest stage of the process is the analysis of the data obtained from user metadata in order to identify unknown malicious programs — however, even this process takes just a few minutes.
- Hidden logic in the decision-making process. Since metadata analysis takes place on antivirus company servers, the algorithms used to identify malicious content can’t be analyzed by cybercriminals for their own ends. Thanks to this feature, the system’s decision-making process remains highly efficient over a long period of time. This distinguishes cloud protection from signature-based and heuristic detection methods, which must constantly be brought up to date in order to maintain high detection rates: once an update is released, virus writers analyze it in order to develop the next version of their program, which in turn requires the release of yet another update.
- Identification of new, as yet undetected threats, as well as their sources. This approach helps prevent users from visiting resources that are being used to spread malicious content. Considering that the sources of threats are often updated with new malicious programs, some of this malware may not be detected. Blocking both the threats and the sources themselves automatically solves that problem.
- Integrity of threat data. By collecting data in real time from participants in a distributed antivirus network that spans the globe, the expert system helps maintain a more complete database of threats than signature-based detection alone can provide. The cloud possesses complete data: when an attack was launched, the threat used in the attack, and the scale of the attack.
- Minimizing false positives. Even some professionals have said that the use of clouds increases the probability of false positives (i.e., the erroneous detection of legitimate files). This is untrue. Practice has shown that the level of false positives with cloud-based detection is at least 100 times lower than with typical signature-based detection. This is because at the heart of the expert protection system is a multi-tier verification process designed to prevent and promptly identify such errors. Furthermore, if a false positive does occur, cloud technology works much faster to identify and correct it.
- Ease of automating the detection process. The way in which the cloud identifies as yet unknown threats lends itself well to automation, exceeding the performance of signature-based and heuristic detection methods.
- Smaller antivirus databases. Using cloud protection helps minimize the volume of antivirus databases downloaded by users, because cloud databases are not delivered to the user’s computer. However, it is again worth emphasizing that access to a cloud infrastructure depends entirely on the user’s computer being constantly connected to the network. This, of course, also holds true for traditional updates, which require a regular connection for downloads. But once a user has successfully downloaded an update, it continues to protect the user even if the connection is interrupted, whereas cloud protection is unavailable for as long as the connection is down.
The drawbacks most often attributed to cloud protection are as follows:
- Detection based solely on hash functions. In the first incarnations of cloud infrastructure, detection was based solely on hash functions (used as unique file identifiers). As it became clear that this approach was insufficient, companies began introducing other approaches that make it easier for a cloud signature to identify entire threat families (including polymorphic families). The cloud will essentially cease to be reactive and will provide proactive detection, a long-awaited development.
- Problems with traffic on limited bandwidth connections (dial-up / GPRS / etc.). Again, this problem was inherent in the first incarnations of cloud systems. The introduction of adaptive approaches to traffic management is successfully solving this problem.
- Works only with executable files. It is true that the current technology is aimed at identifying threats in executable files only. However, new achievements have already been made in detecting other types of threats. As a result, this flaw will be remedied in the near future.
- The network is unreliable. This is, without a doubt, a serious flaw. The very concept behind the cloud system presumes that interaction with the user will take place using network channels. Consequently, if there is no network connection to the cloud infrastructure, there will be no protection. However, as cloud protection is not seen as something independent from existing security technologies, the signature method is still up and running in the event that there is no active connection, and the computer will not be left unprotected.
- Absence of authentication or verification of the accuracy of the data that is received. This is also a problem identified in the earliest versions of the cloud infrastructure. In order to resolve this issue, all one has to do is verify the legitimacy of the data source.
As a result, the only real drawback that cannot currently be resolved is the dependency of user protection on the existence of a stable connection. Kaspersky Security Network will have resolved all of the other issues in the next version of its cloud protection.
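The fallback behavior described in the point about network unreliability, where signature detection keeps working when the cloud cannot be reached, might look roughly like the following sketch. The `LOCAL_SIGNATURES` set, the `check_object` function, and the two lookup stubs are hypothetical names introduced for illustration only.

```python
from typing import Callable, Optional, Set

# Hypothetical locally stored signature database (already downloaded).
LOCAL_SIGNATURES: Set[str] = {"hash:known-bad"}


def check_object(object_id: str, cloud_lookup: Callable[[str], Optional[str]]) -> str:
    """Ask the cloud first; fall back to local signatures if it cannot help."""
    try:
        verdict = cloud_lookup(object_id)
    except ConnectionError:
        verdict = None  # no connection: the cloud is unavailable
    if verdict is not None:
        return verdict
    # Local signatures still protect the user without a connection.
    return "malicious" if object_id in LOCAL_SIGNATURES else "unknown"


def offline_lookup(object_id: str) -> Optional[str]:
    """Simulates a machine with no network connection."""
    raise ConnectionError("no route to cloud")


def online_lookup(object_id: str) -> Optional[str]:
    """Simulates a cloud that knows about one brand-new threat."""
    return "malicious" if object_id == "hash:new-threat" else None


offline_known = check_object("hash:known-bad", offline_lookup)
offline_new = check_object("hash:new-threat", offline_lookup)
online_new = check_object("hash:new-threat", online_lookup)
```

In this sketch the new threat is caught only while the connection is up, whereas the threat covered by local signatures is caught either way, which is the trade-off the text describes.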
Some controversial aspects
There is yet another category of issues with cloud protection that is often discussed on the Internet and is viewed by those involved in the discussion as drawbacks. However, these points are not, in fact, flaws. We would like to address these issues and explain why they should not be seen as weak areas.
- The user might receive a response created by a cybercriminal posing as an antivirus company. Including a digital signature with the data that is sent solves this problem.
- The cloud cannot provide user protection during OnDemand scanning (object scans that are not performed in real time, but at the request of a user) due to the high number of scan requests submitted to the cloud server. Cloud protection can detect threats without any problems during OnDemand scanning. However, this leads to another question: is it advisable to use OnDemand scanning to protect users against active threats? As experience has shown, OnDemand scanning does a rather poor job assisting in the battle against active threats. If the OnAccess protection system is enabled (i.e., the system to identify threats in real time when an infected object is accessed), then this will be the component to provide the user with initial protection. If an OnDemand scan finds something while the OnAccess system is active, it will most likely turn out to be a latent malicious program that, one way or another, will still be blocked as soon as it is launched or if other applications send it requests. In other words, the use of OnDemand scanning in a cloud security environment is possible, but it isn’t going to be a very effective means of user protection against active threats.
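To illustrate the first point in the list above, the sketch below shows a client rejecting a response whose authentication tag does not match. Note the substitution: a true digital signature uses an asymmetric key pair, which Python's standard library does not provide, so this example uses an HMAC with a shared secret as a stand-in for the same verify-before-trust pattern. The function names and the key are assumptions for demonstration only.

```python
import hashlib
import hmac

# Stand-in for a digital signature: HMAC-SHA256 with a shared secret.
# A real deployment would use asymmetric signatures so that clients
# hold only a public key; that is outside the standard library.


def sign_response(payload: bytes, key: bytes) -> str:
    """Compute an authentication tag for a cloud response."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_authentic(payload: bytes, tag: str, key: bytes) -> bool:
    """Accept a response only if its tag matches (constant-time comparison)."""
    expected = sign_response(payload, key)
    return hmac.compare_digest(expected, tag)


demo_key = b"demo-secret-for-illustration-only"
genuine = b'{"object": "hash:aaaa", "verdict": "malicious"}'
tag = sign_response(genuine, demo_key)

forged = b'{"object": "hash:aaaa", "verdict": "clean"}'
genuine_ok = is_authentic(genuine, tag, demo_key)
forged_ok = is_authentic(forged, tag, demo_key)
```

A forged response that reuses the tag of a genuine one fails the check, because the tag is bound to the exact payload.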
Is the cloud approach a silver bullet, or just a fad?
We have reviewed the circumstances that led to the creation of antivirus clouds, and addressed in brief how cloud protection works, and its pros and cons.
Where does the cloud fit within today’s antivirus industry? Is there any true benefit to using cloud technology, and does it offer anything fundamentally new?
The cloud approach is certainly no silver bullet against cybercriminals. But cloud protection has already proven itself to have a number of major advantages: it identifies and blocks new threats at a high speed, and it doesn’t only block threats — it also blocks the sources spreading them. This helps us envisage a new direction of development in the antivirus industry. Furthermore, all of these advantages can be automated using an expert system which offers a low rate of false positives.
The cloud is not just a fad — it is an effective user protection technology. As these technologies develop, their role and significance within the antivirus industry will continue to grow.
However, we should not see an antivirus cloud as merely a separate user protection technology. Without a doubt, cloud systems can function completely automatically, without using any of the rich experience the industry has accumulated in threat detection. However, the effectiveness of this kind of approach is far from ideal. Maximum protection can be achieved by combining the security technologies we have already mastered with antivirus cloud systems. The result of this combined approach is superior to using only one or the other: it offers the rapid response time of cloud systems to as yet unknown threats, while retaining a high level of detection and proactivity, a low margin of error, and offering complete threat data.
If you have any questions, send them to Yury.Mashevsky (at) Kaspersky (dot) com, and they will be addressed in future articles.
For questions in languages other than Russian or English, please send an email to your local Kaspersky Lab office, where we will translate your question before passing it on to the author. Thank you!