Global Research on Data Leaks in 2009
InfoWatch presents its latest analytical study of reported confidential data leak incidents. The study is aimed at security experts. The report is based on the leak database maintained by the InfoWatch analytical center since 2004. The database includes global data on all leaks reported by media, blogs, web forums, and other public sources worldwide.
The total number of registered data leaks increased in H1 2009 compared with H1 2008, averaging 2.3 leaks per day. In H2 2009, however, the number started to decrease, and the overall figure for 2009 was 735 leaks / 365 days = 2.0 leaks per day.
We attribute this minor but recognizable fluctuation to declining mass media attention to the issue. Data leaks are no longer as hot a news topic as they were two years ago, so many leaks are ignored by the media.
InfoWatch analysts predict a gradual decrease in the number of incidents, thanks to the implementation of DLP systems and other data protection measures. This factor is probably already at work, though it is not yet statistically distinguishable against the stronger influence of media attention.
In absolute numbers, the total number of leaks in 2009 increased compared with 2008: 735 against 530, or 39% growth.
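The two headline figures above are simple ratios. As a minimal sketch (the function names are illustrative and not part of the InfoWatch methodology), they can be reproduced in Python from the counts quoted in the text:

```python
def leaks_per_day(total_leaks: int, days: int = 365) -> float:
    """Average number of reported leaks per calendar day."""
    return total_leaks / days


def growth_percent(current: int, previous: int) -> float:
    """Year-over-year growth as a percentage of the previous year's count."""
    return (current - previous) / previous * 100


# Counts quoted in the report: 735 leaks in 2009, 530 in 2008.
rate_2009 = leaks_per_day(735)
growth = growth_percent(735, 530)

print(f"{rate_2009:.1f} leaks per day, {growth:.0f}% growth")
# prints: 2.0 leaks per day, 39% growth
```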
The total number of reported data leaks has continued to grow, but the growth has slowed down. Possible causes include the financial crisis, greater latency, and mass implementation of DLP systems.
The data leak issue has lost some popularity in the US, but it has become more critical in other countries. In 2009, the UK adopted several regulations that force personal data operators to report any data leak (similar to current US laws). As a result, the number of incident reports from the UK increased in 2009 (125 reports, compared with 54 in 2008). Similarly, Russia introduced its personal data law, which remained a hot public topic throughout the year, so the number of data leaks reported in Russia predictably increased (30 leaks in 2009, compared with 6 in 2008). No strong statistical trend was found in any other country.
The average number of personal data records affected by a leak is 754,000, although some incidents affected only 1, 2, or 5 personal records.
In previous years, this figure was noticeably lower:
- 2008 – approx. 405 000;
- 2007 – approx. 608 000;
- 2006 – approx. 177 000.
What do these figures mean? Personal data storage becomes more centralized each year. More people join the Internet, and more bank cards are issued each year. While small enterprises often choose not to keep their clients' personal data, bigger ones usually do, but their level of confidentiality does not increase as fast as their businesses grow, owing to a lack of ability, budget, or desire to protect customers' personal data.
It is appropriate here to cite Lenin's point on the concentration of capital. Personal data is generated by users, so users are the new capital of the Web 2.0 age, and valuable information tends to concentrate.
Accidental and intentional leaks
The percentage of intentional leaks continues to increase.
Table 1: Percentage of accidental and intentional leaks
Picture 1, Leak distribution by intent
The explanation is self-evident: DLP systems and other protection measures prevent nearly every accidental leak, but they are less effective against intentional ones. Therefore, as DLP systems continue to grow in popularity, the percentage of accidental leaks should drop further.
In the near future, the percentage of intentional leaks will grow steadily.
In addition, the cost of information (including personal data) grows steadily, and so does the centralization of personal data storage (discussed above). Potential intruders therefore face ever-growing temptation, which is another factor behind the increasing percentage of intentional violations.
All enterprises and organizations that suffer publicly reported leaks are classified by InfoWatch experts into three categories: governmental agencies, commercial enterprises, and educational institutions together with public non-profit organizations. Educational institutions can be either commercial or non-profit, but InfoWatch experts place them in a separate category because their procedures for processing students' personal data differ significantly from those of traditional commercial businesses such as banks, clinics, and supermarkets.
|Type of organization||2009, leaks||2009, %||2008, leaks||2008, %|
|Educational / non-profit||98||13.3%||127||24.0%|
Table 2A: Leak sources, distributed per organization type
Picture 2, Leak distribution by organization type
In 2009, the percentage of "commercial" leaks continued to grow while the two other categories shrank. This trend was mentioned in our previous report, and we fully expect it to continue over the next few years. The phenomenon can be explained by the fact that governmental bodies are generally characterized by greater formalization and stricter rules regulating data processing, so the introduction of counter-leak measures and procedures yields better results there. (This document mainly deals with the governmental agencies of Western countries.) As for the sharply decreased percentage of leaks in educational institutions, our experts are of the opinion that these institutions were until recently in absolute chaos and have now begun to improve the situation.
Another cause of the increase in commercial sector leaks is the global financial crisis. As we stated in the H1 2009 report, commercial competition intensifies during a crisis and mass layoffs occur, both of which contribute to the growing number of data leaks. Educational institutions are immune to this effect: students' personal data cannot be used for unethical competition, so the financial crisis offers no additional incentive to steal it.
Now we will examine the leak distribution across the three sectors, separately for accidental and intentional leaks.
|Type of organization||Intentional, leaks||Intentional, %||Accidental, leaks||Accidental, %|
|Educational / non-profit||42||11.2%||51||15.9%|
Table 2B: Leak sources, distributed per organization type
The percentage values for intentional and accidental leaks are reasonably similar, which suggests that enterprises across the three sectors are similarly equipped with data protection systems and that their employee rules on data security are essentially the same.
From this we can deduce that there is little reason for data security developers to offer different product versions for governmental and private information systems. We can also hazard a conjecture that military (as well as intelligence, law-enforcement, etc.) data security measures are similar to civil ones in terms of data protection quality.
Personal data are still in the lead
The vast majority of the leaks reported in 2009 involved personal data: 89.8%, which is only slightly below the 2008 level. This dominance is easy to explain. There is an overwhelming mass of personal data worldwide, and it is in constant use. Thousands of organizations have to process personal data, while commercial secrets are much less common and state secrets are rarer still.
|Type of confidential data||Incidents||%|
|Commercial secrets, know-how||26||3.5%|
|State and military secrets||13||1.8%|
|Other confidential data||32||4.4%|
Table 3: Leak sources, distributed per confidential data type
Picture 3, Leak distribution by data type
The global scale of personal data processing is a troubling sign for analysts at regulatory bodies. In particular, projects to declassify certain personal data are being promoted, such as public disclosure of personal income and taxes (implemented in Norway, attempted in Italy) or ending the use of the social security number as a semi-official personal ID (USA). As of today, these projects have for the most part not yet been implemented, although declassification of personal data should significantly reduce operating costs in many sectors.
InfoWatch analysts consider personal data protection to be one of the most cost-intensive items in the information risk cost structure, along with personal data leaks.
By contrast, German businesses are troubled by the expected increase in costs due to the comprehensive amendments to the Federal Data Protection Act that entered into force on September 1, 2009. According to the experts, the additional regulation of data protection issues (including marketing, security breach notification, service provider contracts, and protection of employee data), together with new powers for data protection authorities and increased fines for violations of data protection law, will increase operating costs for nearly every manufacturing enterprise in the country but will not reduce losses due to fraud.
Leak channels and technologies
"Leak channel" is defined as medium or data carrier that was used to transfer the classified data over the secured perimeter or to deliver it to a non-authorized user. Statistical analysis of leak channels allows estimating risk reduction resulting from implementation of various security measures.
|Leak channel||2009, leaks||2009, %||2008, leaks||2008, %|
|Mobile computers (notebooks, PDAs)||98||13.3||103||19.4|
|Mobile data carriers (flash drives, CD, DVD, etc)||34||4.6||30||5.7|
|Desktop computers, servers, HDD||107||14.6||40||7.5|
Table 4: Major data leak channels
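The percentages in Table 4 appear to be each channel's count divided by the yearly total of reported leaks (735 for 2009 and 530 for 2008, as stated earlier). The sketch below checks this for the rows that survive in the table. The column order (2009 first, then 2008) is an assumption, supported by the drop from 19% to 13% that the text quotes for mobile computers, and `share_percent` is a hypothetical helper name.

```python
# Yearly totals of reported leaks, as quoted earlier in the report.
TOTAL_LEAKS = {2009: 735, 2008: 530}

# channel: (2009 count, 2008 count); column order is an assumption.
CHANNEL_COUNTS = {
    "Mobile computers": (98, 103),
    "Mobile data carriers": (34, 30),
    "Desktop computers, servers, HDD": (107, 40),
}


def share_percent(count: int, total: int) -> float:
    """Share of one channel in the yearly total, rounded to one decimal."""
    return round(count / total * 100, 1)


shares = {
    channel: (
        share_percent(c2009, TOTAL_LEAKS[2009]),
        share_percent(c2008, TOTAL_LEAKS[2008]),
    )
    for channel, (c2009, c2008) in CHANNEL_COUNTS.items()
}

for channel, (s2009, s2008) in shares.items():
    print(f"{channel}: {s2009}% in 2009, {s2008}% in 2008")
```

Under these assumptions the computed shares match the percentages printed in the table rows.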
We will now discuss the three most interesting points in the above statistics.
First, the percentage of mobile computers among leak channels fell significantly. Generally, mobile computers (notebooks, PDAs, communicators) become involved in data leaks through theft. In the majority of cases we have good reason to believe that the computers themselves are the targets of theft, not the information stored on them. Global statistics on mobile computer theft and loss are staggering: some sources cite figures as high as millions of units stolen per year. Because many lost or stolen computers hold confidential information, owners need to encrypt the stored data. Encryption has long been extremely unpopular among mobile PC owners, however, and the number of data leak incidents involving mobile computers has been decreasing only very slowly, despite an abundance of data encryption software (including freeware) on the market. In 2009 users of mobile PCs finally began to adopt encryption, and the share of such incidents dropped from 19% to 13%. Our experts expect this percentage to decrease further next year.
Encryption proves to be an important measure for protecting valuable information in today's situation of increased loss of physical data carriers, such as laptops or mass-storage devices.
Second, the percentage of data leaks through paper documents increased sharply, as we already noted in the H1 2009 report. The cause is self-evident. Efficient technical measures can block accidental leaks via electronic channels, but paper documents cannot be blocked this way. Once out of the printer, hard copies can be controlled only through personnel security measures, which are more complicated and less efficient. Moreover, increased public interest in data leaks leads to general concern over improper disposal of confidential documents. Discarded documents containing confidential data are often retrieved from disposal bins and delivered to the mass media. It should be noted that the percentage of accidental leaks via the "paper" channel is very high (115 out of 146, or 79%, compared with 43%, the average percentage of accidental leaks across all channels). This is a clear symptom of the mass implementation of automated leak prevention systems, such as DLP, which are most effective against accidental leaks via every channel except paper.
Third, the percentage of leaks through archive (backup, reserve) media has increased, as this potential leak channel is generally overlooked at present. The majority of security systems control only the simpler channels (the Internet, flash drive access, etc.), while more complicated channels, such as archive media or printers, are beyond their control. Full-featured DLP systems monitor all possible leak channels and protocols, but they are much more expensive and difficult to use, so the complicated channels often remain unsecured. The creation of archive copies is one of these. For security reasons, it is generally recommended to create archive copies on a removable medium, which should then be removed from the archived system and transferred to secure storage. The transfer is an extremely vulnerable process and is often subject to data leaks. Unfortunately, many archiving systems have no archive encryption functionality, which would reduce the leak statistics significantly, as the loss of an encrypted archive is not considered a leak.
InfoWatch experts expect encryption of archives to become a major trend.
In summary, trends seen in 2009 led us to the following conclusions:
Users have finally started to embrace notebook data encryption. As it is the simplest and most efficient method of securing the most popular leak channel, the number of related incidents has begun to decrease.
The increased percentage of paper leaks is strong evidence that managerial security procedures are being underrated in favor of electronic data protection systems, which efficiently block computer-related leak channels but are powerless against paper leaks.
Protection of archive (backup) data copies is also commonly overlooked by data security staff. Archive copies are often unencrypted.
One more important point: the percentage of email leaks is reasonably low. According to the statistics, protecting this channel against potential data leaks covers at most 5% of total threats. From its many years of DLP implementation experience, InfoWatch has good reason to believe that most clients start protecting their corporate information systems with email monitoring. This is definitely important, but it is only the first step in creating an effective corporate data security system. Full-scale DLP systems are expensive and difficult to use, but their efficiency across all possible leak channels is high.
InfoWatch analysts expect a further decrease in the percentage of email leaks, as this channel is the easiest to monitor and secure.
Now we will discuss separate distribution of accidental and intentional leaks per media type. Statistics for both incident types are given below.
|Leak channel||Intentional leaks||Intentional, %||Accidental leaks||Accidental, %|
|Mobile computers (notebooks, PDAs)||59||15.7||31||9.7|
|Mobile data carriers (flash drives, CD, DVD, etc)||8||2.1||24||7.5|
|Desktop computers, servers, HDD||78||20.8||26||8.1|
Table 5: Major channels for intentional and accidental data leaks
Picture 4, Intentional leaks distribution by leak channels
Picture 5, Accidental leaks distribution by leak channels
Electronic protection measures are often overrated by the employees responsible for data security, while traditional (mostly accidental) paper leaks are overlooked, although only managerial security procedures can prevent them efficiently.
The most prominent gap between accidental and intentional leaks is observed in the "desktop computers, servers" and "paper documents" categories.
Desktop computers and their hard disks are generally targeted by malicious insiders; accidental leaks are very rare here.
By contrast, confidential documents usually end up in paper bins without any malicious intent. This channel is unpopular for intentional data leaks.
In the other groups, the balance of accidental and intentional leaks is reasonably uniform.
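The "gap" discussed here can be read as the difference between a channel's share of intentional leaks and its share of accidental leaks. The following sketch computes that difference for the rows that survive in Table 5 (the paper channel is omitted because its row is not available). The ranking by absolute gap is an illustrative choice, not the report's stated method.

```python
# channel: (share of intentional leaks, %, share of accidental leaks, %)
TABLE_5_PERCENT = {
    "Mobile computers": (15.7, 9.7),
    "Mobile data carriers": (2.1, 7.5),
    "Desktop computers, servers, HDD": (20.8, 8.1),
}

# Positive gap: the channel is skewed toward intentional leaks.
# Negative gap: the channel is skewed toward accidental leaks.
gaps = {
    channel: round(intentional - accidental, 1)
    for channel, (intentional, accidental) in TABLE_5_PERCENT.items()
}

# Rank channels by the size of the gap, largest first.
ranked = sorted(gaps, key=lambda channel: abs(gaps[channel]), reverse=True)

for channel in ranked:
    print(f"{channel}: {gaps[channel]:+.1f} percentage points")
```

Among these three rows, desktop computers and servers show the largest gap, which agrees with the observation in the text.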
Traditional security measures, which include DLP systems and managerial procedures, have lately been complemented by other methods of confidential data protection. First of all, government initiatives should be mentioned: mandatory data protection measures, standardization and certification of data leak countermeasures, legal sanctions against employees responsible for leak incidents, and mandatory notification of citizens whose personal data was leaked, lost, or disclosed. It should be noted that the most leak-vulnerable countries have shifted their major efforts from leak prevention to mitigating the effects of leaks. This additional line of defense offers greater data security, which becomes more critical every year as the value of data steadily increases and the struggle for data grows fiercer.
If a company lacks the funds to secure all possible leak channels, it is recommended to start with notebook data encryption and Internet traffic filtering, as these are the most probable leak channels. However, it should not be overlooked that this protection is mostly effective against accidental leaks (43.5% of all leaks) but will not stop a persistent malicious user.
Leak distribution per country
The leak database maintained by InfoWatch includes data on incidents reported by public media sources. However, leaks do not always receive media coverage, even in countries where public reporting is required by law. Still, by comparing the available data per country we can estimate incident latency and roughly gauge the number of unreported incidents.
Incident statistics per country are given in the table below. The rightmost column contains the number of leaks per 1 million of the country's population.
|Country||Number of leaks||%||Leaks per 1 mln of population|
Table 6: Data leaks, as distributed per country
It should be noted that highly developed countries today each suffer at least 2 leaks per year per million of population. As an average personal data leak encompasses approximately 750 thousand records, we can reasonably state that personal data leaks are a global problem.
The rightmost column, incidents per 1 mln of population, is the most descriptive parameter of leak intensity and leak reporting per country.
In past years, the peak value of this parameter was about 1, i.e. one leak per year per million of population. This peak value was reached in the US. As the above table shows, today the former peak is exceeded by 4 countries. The USA and the UK demonstrate the lowest incident latencies, owing to their strict laws on mandatory notification of leak subjects. Therefore, these two countries can be used as a baseline. The total number of leaks (both reported and latent) is evidently increasing.
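The per-capita parameter discussed above is the number of reported leaks divided by the population in millions. A minimal sketch follows, using the 125 UK reports quoted earlier in this report. The population of roughly 62 million is an approximation assumed here for illustration, so the result is not necessarily the value printed in Table 6.

```python
def leaks_per_million(leaks: int, population: float) -> float:
    """Reported leaks per one million inhabitants."""
    return leaks / (population / 1_000_000)


# 125 UK reports in 2009 (from the report); population is an assumed
# round figure of about 62 million, not a value taken from Table 6.
uk_rate = leaks_per_million(125, 62_000_000)

print(f"UK: {uk_rate:.1f} leaks per million of population")
```

With these inputs the UK rate comes out at about 2 leaks per million, in line with the "at least 2 leaks per year per million" figure cited for highly developed countries.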
Examples of the largest incidents of 2009 (sorted by amount of personal records leaked) are given in the table below.
|Number of records leaked||Country||Brief description|
|76 mln||USA||At the National Archives and Records Administration (NARA), a hard drive storing part of a database with the personal information of 76 million veterans was sent back to a contractor for repair without being wiped.|
|62 mln||UK||Nine employees of the Department for Work and Pensions (DWP) attempted unauthorized access to the department database holding personal data of 62 million citizens (the vast majority of the UK population), including 12 million minors.|
|32.6 mln||USA||More than 32 mln personal accounts were compromised (via SQL injection) in a hacking attack on RockYou.com, a website offering services to social networks such as Facebook and MySpace.|
|7.5 mln||Germany||A vulnerability was found in the social network StayFriends GmbH (www.stayfriends.de), which allowed unauthorized access to personal data of all network members.|
|6 mln||UK||A range of companies (including Castrol) based their ad campaigns on personal data of car owners illegally obtained from the governmental car database.|
|2.5 mln||UK||Malicious users gained access to the database of the National Health Service (NHS) with personal data of all patients.|
|1.5 mln||Japan||An unauthorized employee retrieved personal data of company clients from the corporate database and sold some of it.|
|807 th.||USA||Archive tapes containing 12 years of personal data on police suspects, including social security numbers, were lost.|
To reiterate the major points of the report:
- The total number of reported data leaks has continued to grow, but the growth has slowed down. Possible causes include the financial crisis, greater latency, and mass implementation of DLP systems.
- In the near future, the percentage of intentional leaks will grow steadily.
- InfoWatch analysts consider personal data protection to be one of the most cost-intensive items in the information risk cost structure, along with personal data leaks.
- Electronic protection measures are often overrated by the employees responsible for data security, while traditional (mostly accidental) paper leaks are overlooked, although only managerial security procedures can prevent them efficiently.
- Highly developed countries today each suffer at least 2 leaks per year per million of population. As an average personal data leak encompasses approximately 750 thousand records, we can reasonably state that personal data leaks are a global problem.
- Users have finally started to embrace notebook data encryption. As it is the simplest and most efficient method of securing the most popular leak channel, the number of related incidents has begun to decrease.
- Protection of archive (backup) data copies is also commonly overlooked by data security staff. Archive copies are often unencrypted.