Analysis of user password strength

Research

18 Jun 2024

Alexey Antonov

The processing power of computers keeps growing, helping users solve increasingly complex problems faster. A side effect is that passwords that were impossible to guess just a few years ago can be cracked within seconds in 2024. For example, an RTX 4090 GPU can guess an eight-character password consisting of same-case English letters and digits (an alphabet of 36 characters) in just 17 seconds.

Our study of resistance to brute-force attacks found that a large percentage of passwords (59%) can be cracked in under one hour.

How passwords are typically stored

To be able to authenticate users, websites need a way to store login-password pairs and use these to verify data entered by the user. In most cases, passwords are stored as hashes, rather than plaintext, so that attackers cannot use them in the event of a leak. To prevent the password from being guessed with the help of rainbow tables, a salt is added before hashing.
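As a minimal sketch of the storage scheme described above, the following Python snippet salts and hashes a password, then verifies a login attempt against the stored values. The function names `store_password` and `verify_password` are illustrative. MD5 appears only because it is the baseline used later in this study, not because it is a recommended choice.

```python
import hashlib
import hmac
import os


def store_password(password: str) -> tuple[bytes, str]:
    """Return (salt, digest) to keep in the database instead of the password."""
    salt = os.urandom(16)  # fresh random salt for every user
    digest = hashlib.md5(salt + password.encode("utf-8")).hexdigest()
    return salt, digest


def verify_password(attempt: str, salt: bytes, digest: str) -> bool:
    """Re-hash the attempt with the stored salt and compare the digests."""
    candidate = hashlib.md5(salt + attempt.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, digest)


salt, digest = store_password("correct horse")
print(verify_password("correct horse", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))    # False
```

Because the salt is random, two users with the same password end up with different digests, which is what defeats precomputed rainbow tables.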

Although hashes are inherently irreversible, an attacker with access to a leaked database can try to guess the passwords. They would have an unlimited number of attempts, as the database itself has no protection against brute-forcing whatsoever. Ready-made password-guessing tools, such as hashcat, can be found online.

Methodology

Our study looked at 193 million passwords found freely accessible on various dark web sites. Kaspersky does not collect or store user passwords. More details are available here and here.

We estimated the time it takes to guess a password from a hash using brute force and various advanced algorithms, such as dictionary attacks and/or enumeration of common character combinations. By "dictionary" we mean a list of character combinations frequently used in passwords. These include, but are not limited to, real English words.

Brute force attacks

The brute-force method is still one of the simplest and most straightforward: the computer tries every possible password option until one works. It is not a one-size-fits-all approach, however: plain enumeration gives no priority to dictionary passwords, and because the search space grows exponentially with length, it is noticeably worse at guessing longer passwords than shorter ones.

We analyzed the brute-forcing speed as applied to the database under review. For clarity, we have divided the passwords in the sample into patterns according to the types of characters they contain.

a: the password contains only lowercase or only uppercase letters.
aA: the password contains both lowercase and uppercase letters.
0: the password contains digits.
!: the password contains special characters.
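The pattern labels above can be derived mechanically. The following Python sketch is one possible reading of them: it assumes ASCII letters and digits and treats every other character as special, which the study does not spell out, and `classify` is an illustrative name.

```python
import string


def classify(password: str) -> str:
    """Return the character-class pattern of a password, e.g. 'aA0!'."""
    has_lower = any(c in string.ascii_lowercase for c in password)
    has_upper = any(c in string.ascii_uppercase for c in password)
    has_digit = any(c in string.digits for c in password)
    # Assumption: anything that is not an ASCII letter or digit is "special".
    has_special = any(
        c not in string.ascii_letters + string.digits for c in password
    )

    pattern = ""
    if has_lower and has_upper:
        pattern += "aA"
    elif has_lower or has_upper:
        pattern += "a"  # single-case letters, whichever case it is
    if has_digit:
        pattern += "0"
    if has_special:
        pattern += "!"
    return pattern


print(classify("Passw0rd!"))  # aA0!
print(classify("qwerty12"))   # a0
```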

The time it takes to crack a password using the brute-force method depends on the length and the number of character types. The results in the table are calculated for the RTX 4090 GPU and the MD5 hashing algorithm with a salt. The speed of enumeration in this configuration is 164 billion hashes per second. The percentages in the table are rounded.

Password pattern	Share of passwords of this type in the dataset, %	Brute-forced in < 60 s, %	Brute-forced in 60 s to 60 min, %	Brute-forced in 60 min to 24 h, %	Brute-forced in 24 h to 30 d, %	Brute-forced in 30 d to 365 d, %	Brute-forced in > 365 d, %	Max length, 24 h to 30 d (characters)	Max length, 30 d to 365 d (characters)	Max length, > 365 d (characters)
aA0!	28	0.2	0.4	5	0	9	85	—	9	10
a0	26	28	13	15	11	10	24	11	12	13
aA0	24	3	16	11	0	15	55	—	10	11
a0!	7	2	9	0	14	15	59	9	10	11
0	6	94	4	2	0	0	0	—	—	—
a	6	45	13	10	9	6	17	12	13	14
aA	2	15	22	11	14	0	38	10	—	11
a!	1	6	9	11	0	11	62	—	10	11
aA!	0.7	3	2	12	10	0	73	9	—	10
0!	0.5	10	27	0	18	13	32	10	11	12
!	0.006	50	9	10	5	6	19	11	12	13

The most popular type of passwords (28%) includes lowercase and uppercase letters, special characters and digits. Most of these passwords in the sample under review are difficult to brute-force. About 5% can be guessed within a day, but 85% of passwords of this type take more than a year to work out. The crack time depends on the length: a password of nine characters can be guessed within a year, but one that contains 10 characters takes more than a year.
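The crack times in the table follow from simple arithmetic: the number of candidate passwords divided by the hashing speed. The Python sketch below reproduces that calculation at the 164 billion hashes per second quoted above. The alphabet sizes are assumptions, in particular the 32 special characters taken from Python's `string.punctuation`, and only passwords of exactly the given length are counted.

```python
import string

HASHES_PER_SECOND = 164e9  # RTX 4090, salted MD5 (figure from the study)

# Alphabet size per pattern. The 32 special characters are an assumption
# (the ASCII punctuation set in string.punctuation).
ALPHABET = {
    "0": 10,
    "a": 26,
    "a0": 36,
    "aA0": 62,
    "aA0!": 62 + len(string.punctuation),
}


def worst_case_seconds(pattern: str, length: int) -> float:
    """Seconds needed to try every password of this pattern and length."""
    return ALPHABET[pattern] ** length / HASHES_PER_SECOND


for pattern, length in [("a0", 8), ("aA0!", 9), ("aA0!", 10)]:
    seconds = worst_case_seconds(pattern, length)
    print(f"{pattern:>4}, {length} characters: "
          f"{seconds:,.0f} s ({seconds / 86_400:,.1f} days)")
```

Under these assumptions an eight-character a0 password takes about 17 seconds, a nine-character aA0! password about 40 days, and a ten-character one more than 10 years, which agrees with the length thresholds in the table.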

Passwords that are least resistant to brute-force attacks are the ones that consist of only letters, only digits or only special characters. The sample contained 14% of these. Most of them can be cracked within less than a day. Strong letter-only passwords start at 11 characters. There were no strong digit-only passwords in the sample.

Smart brute-force attacks

As mentioned above, brute force is a suboptimal password-guessing algorithm. Passwords often consist of certain character combinations: words, names, dates, sequences (“12345” or “qwerty”). If you make your brute-force algorithm consider this, you can speed up the process:

bruteforce_corr is an optimized version of the brute-force method. You can use a large sample to measure the frequency of a certain password pattern. Next, you can allocate to each variety a percentage of computational time that corresponds to its real-life frequency. Thus, if there are three patterns, and the first one is used in 50% of cases, and the second and third in 25%, then per minute our computer will spend 30 seconds enumerating pattern one, and 15 seconds enumerating patterns two and three each.
zxcvbn is an advanced algorithm for gauging password strength. The algorithm identifies the pattern the password belongs to, such as “word, three digits” or “special character, dictionary word, digit sequence”. Next, it calculates the number of iterations required for enumerating each element in the pattern. So, if the password contains a dictionary word, finding it will take a number of iterations equal to the size of the dictionary. If a part of the pattern is random, it will have to be brute-forced. You can calculate the total complexity of cracking the password if you know the time it takes to guess each component of the pattern. This method has a limitation: successful enumeration requires specifying a password or assuming a pattern. However, you can find the popularity of patterns by using stolen samples. Then, as with the brute-force option, allocate to the pattern an amount of computational time proportional to its occurrence. We designate this algorithm as “zxcvbn_corr”.
unogram is the simplest language algorithm. Rather than requiring a password pattern, it relies on the frequency of each character, calculated from a sample of passwords. The algorithm prioritizes the most popular characters when enumerating. So, to estimate the crack time, it is enough to calculate the probability of the characters appearing in the password.
3gram_seq, ngram_seq are algorithms that calculate the probability of the next character depending on n-1 previous ones. The proposed algorithm starts enumerating one character, and then sequentially adds the next one, while starting with the longest and most frequently occurring n-grams. In the study, we used n-grams ranging from 1 to 10 characters that appear more than 50 times in the password database. The 3gram_seq algorithm is limited to n-grams up to and including three characters long.
3gram_opt_corr, ngram_opt_corr are optimized versions of the n-gram algorithms. The previous algorithms generate the password from the beginning, adding one character at a time. However, in some cases, enumeration goes faster if you start from the end, from the middle or from several positions simultaneously. The *_opt_* algorithms check these variants for a specific password and select the best one. However, in this case, we need a password pattern that allows us to determine where to start generating from. When adjusted for different patterns, these algorithms are generally slower. Still, they can provide a significant advantage for specific passwords.
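Two of the ideas in this list reduce to short calculations. The Python sketch below illustrates the proportional time budget behind the *_corr variants and a simplified pattern-based guess count in the spirit of zxcvbn. Both functions are illustrative: treating the cost of a pattern as the plain product of its component search spaces is a simplifying assumption, and the dictionary size of 10,000 is a made-up figure.

```python
import math


def allocate_seconds(pattern_shares: dict[str, float],
                     budget_seconds: float) -> dict[str, float]:
    """Split a time budget across patterns in proportion to their frequency."""
    total = sum(pattern_shares.values())
    return {
        pattern: budget_seconds * share / total
        for pattern, share in pattern_shares.items()
    }


def pattern_guesses(component_sizes: list[int]) -> int:
    """Worst-case guesses for a pattern: product of component search spaces."""
    return math.prod(component_sizes)


# Three patterns seen in 50%, 25% and 25% of cases, one minute of compute.
print(allocate_seconds({"one": 0.50, "two": 0.25, "three": 0.25}, 60))
# {'one': 30.0, 'two': 15.0, 'three': 15.0}

# "word, three digits": hypothetical 10,000-word dictionary times 10**3 digits.
print(pattern_guesses([10_000, 10**3]))  # 10000000
```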

Also, for each password, we calculated a best value: the best crack time among all the algorithms used. This is a hypothetical ideal case. To implement it, you will need to “guess” an appropriate algorithm or simultaneously run each of the aforementioned algorithms on a GPU of its own.
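The best value can be read as a minimum over per-algorithm estimates. The Python sketch below shows the idea with two toy estimators: a character-frequency model in the spirit of unogram and plain brute force. The study does not give its formulas, so taking the inverse of the password's probability as a rough guess count is an assumption, as are the add-one smoothing and the four-password sample.

```python
from collections import Counter


def unigram_guesses(password: str, sample: list[str]) -> float:
    """Rough guess count: inverse of the password's probability under a
    character-frequency model trained on the sample (an assumption)."""
    counts = Counter("".join(sample))
    total = sum(counts.values())
    probability = 1.0
    for ch in password:
        # Add-one smoothing so unseen characters do not give zero probability.
        probability *= (counts[ch] + 1) / (total + len(counts) + 1)
    return 1.0 / probability


def brute_force_guesses(password: str, alphabet_size: int) -> float:
    """Worst-case guess count when every combination is enumerated."""
    return float(alphabet_size ** len(password))


sample = ["password", "passw0rd", "pass1234", "sunshine"]  # toy data
target = "passpass"

estimates = {
    "unogram": unigram_guesses(target, sample),
    "bruteforce": brute_force_guesses(target, 26),
}
best = min(estimates, key=estimates.get)
print(best, f"{estimates[best]:.3g}")
```

For this toy target the frequency model needs far fewer guesses than enumeration, so it supplies the best value.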

Below are the results of gauging password strength by running the algorithms on an RTX 4090 GPU for MD5 with a salt.

Percentage of brute-forceable passwords by algorithm and crack time
Crack time	ngram_seq	3gram_seq	unogram	ngram_opt_corr	3gram_opt_corr	zxcvbn_corr	bruteforce_corr	Best
< 60 s	41%	29%	12%	23%	10%	27%	10%	45%
60 s to 60 min	14%	16%	12%	15%	12%	15%	10%	14%
60 min to 24 h	9%	11%	12%	11%	12%	9%	6%	8%
24 h to 30 d	7%	9%	11%	10%	11%	9%	9%	6%
30 d to 365 d	4%	5%	7%	6%	8%	6%	10%	4%
> 365 d	25%	30%	47%	35%	47%	35%	54%	23%

The bottom line is, when using the most efficient algorithm, 45% of passwords in the sample under review can be guessed within one minute, 59% within one hour, and 73% within a month. Only 23% of passwords take more than one year to crack.
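These headline figures are running totals of the Best column in the table above. A short Python check, with the column values copied from the table:

```python
from itertools import accumulate

buckets = [
    "< 60 s",
    "60 s to 60 min",
    "60 min to 24 h",
    "24 h to 30 d",
    "30 d to 365 d",
    "> 365 d",
]
best = [45, 14, 8, 6, 4, 23]  # "Best" column, in percent

# Running total: share of passwords cracked by the end of each time bucket.
cumulative = list(accumulate(best))
for label, share in zip(buckets, cumulative):
    print(f"up to and including '{label}': {share}%")
```

The running totals are 45%, 59%, 67%, 73%, 77% and 100%, which gives the one-minute, one-hour and one-month figures quoted above.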

Importantly, guessing all the passwords in the database will take almost as much time as guessing one of them. During the attack, the hacker checks the database for the hash obtained in the current iteration. If the hash is in the database, the password is marked as cracked, and the algorithm moves on to working on the others.
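The lookup step described above can be sketched in Python as follows. This toy example enumerates short lowercase candidates and checks each digest against the whole leak with a set lookup. It assumes unsalted MD5 hashes so that a single digest can be compared with every entry at once; `crack_all` and the three-password leak are illustrative only.

```python
import hashlib
import string
from itertools import product


def crack_all(leaked_hashes: set[str], max_length: int) -> dict[str, str]:
    """Enumerate lowercase candidates once, matching each against the leak."""
    remaining = set(leaked_hashes)
    cracked: dict[str, str] = {}
    for length in range(1, max_length + 1):
        for letters in product(string.ascii_lowercase, repeat=length):
            candidate = "".join(letters)
            digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
            if digest in remaining:  # constant-time set lookup per candidate
                cracked[digest] = candidate
                remaining.discard(digest)
                if not remaining:  # every hash cracked, stop early
                    return cracked
    return cracked


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


leak = {md5_hex("abc"), md5_hex("dog"), md5_hex("zz")}
print(sorted(crack_all(leak, 3).values()))  # ['abc', 'dog', 'zz']
```

Each candidate is hashed once regardless of how many entries the leak contains, which is why cracking the whole set costs little more than cracking its hardest member.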

The use of dictionary words reduces password strength

To find which password patterns are most resistant to hacking, we calculated the best value for an expanded set of criteria. For this purpose, we created a dictionary of frequently used combinations of four or more characters, and added these to the password pattern list.

dict: the password contains one or more dictionary words.
dict_only: the password contains only dictionary words.
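One way to assign these extended labels is sketched below in Python. It assumes that the suffix after `dict_` describes the character classes of the whole password, not just the non-dictionary remainder. The study does not state this explicitly, although the shares in the two tables are consistent with that reading. The dictionary is a toy list, and the function names are illustrative.

```python
import string


def char_pattern(password: str) -> str:
    """Character-class pattern of a password, e.g. 'aA0!' (ASCII assumed)."""
    lower = any(c in string.ascii_lowercase for c in password)
    upper = any(c in string.ascii_uppercase for c in password)
    digit = any(c in string.digits for c in password)
    special = any(c not in string.ascii_letters + string.digits for c in password)
    letters = "aA" if lower and upper else "a" if lower or upper else ""
    return letters + ("0" if digit else "") + ("!" if special else "")


def classify_with_dictionary(password: str, dictionary: set[str]) -> str:
    """Return 'dict_only', 'dict_<pattern>' or the plain character pattern."""
    lowered = password.lower()
    covered = [False] * len(password)
    for word in dictionary:
        start = lowered.find(word)
        while start != -1:  # mark every occurrence of the word
            for i in range(start, start + len(word)):
                covered[i] = True
            start = lowered.find(word, start + 1)
    if not any(covered):
        return char_pattern(password)
    if all(covered):
        return "dict_only"
    return "dict_" + char_pattern(password)


toy_dictionary = {"password", "qwerty", "12345", "love"}  # 4+ characters each
print(classify_with_dictionary("password1", toy_dictionary))    # dict_a0
print(classify_with_dictionary("qwerty12345", toy_dictionary))  # dict_only
print(classify_with_dictionary("Xk9#mQ2z", toy_dictionary))     # aA0!
```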

Password pattern	Share of passwords, %	Cracked in < 60 s, %	Cracked in 60 s to 60 min, %	Cracked in 60 min to 24 h, %	Cracked in 24 h to 30 d, %	Cracked in 30 d to 365 d, %	Cracked in > 365 d, %	Max length, 24 h to 30 d (characters)	Max length, 30 d to 365 d (characters)	Max length, > 365 d (characters)
dict_a0	17	63	15	8	5	3	7	10	11	12
aA0!	14	5	6	5	5	3	76	6	7	8
dict_aA0	14	51	17	10	7	4	11	9	10	11
dict_aA0!	14	34	18	12	10	6	20	7	8	8
a0	10	59	22	6	6	1.8	6	10	11	12
aA0	10	19	13	13	6	7	42	9	10	11
0	6	92	5	1.5	1.3	0	0	15	—	—
dict_a0!	5	44	16	10	8	5	17	9	9	10
dict_a	4	69	12	6	4	2	6	11	12	13
a0!	2	31	19	13	9	5	23	9	9	10
a	1.2	76	7	6	3	3	6	11	12	13
dict_aA	1.2	56	15	8	6	3	11	9	10	10
dict_a!	0.8	38	16	10	8	5	23	8	9	10
aA	0.7	26	10	28	7	2	27	9	10	10
dict_aA!	0.5	31	17	11	10	6	26	8	9	9
0!	0.4	53	15	8	7	5	13	9	10	11
dict_only	0.2	99.99	0.01	0.0002	0.0002	0	0	18	—	—
dict_0	0.2	89	6	2	2	0	0	15	—	—
aA!	0.2	11	8	10	16	3	52	8	9	9
a!	0.1	35	16	10	9	5	25	8	9	10
dict_0!	0.06	52	13	7	6	4	17	9	10	11
!	0.006	50	10	6	8	4	20	8	9	10

The majority (57%) of the passwords reviewed contained a dictionary word, which significantly reduced their strength. Half of these can be cracked in less than a minute, and 67% within one hour. Only 12% of dictionary passwords are strong enough and take more than a year to guess. Even when using all recommended character types (uppercase and lowercase letters, digits and special characters), only 20% of these passwords proved resistant to brute-forcing.

It is possible to distinguish several groups among the most popular dictionary sequences found in passwords.

Names: “ahmed”, “nguyen”, “kumar”, “kevin”, “daniel”;
Popular words: “forever”, “love”, “google”, “hacker”, “gamer”;
Standard passwords: “password”, “qwerty12345”, “admin”, “12345”, “team”.

Non-dictionary passwords comprised 43% of the sample. Some were weak, such as those consisting of same-case letters and digits (10%) or digits only (6%). However, adding all recommended character types (the aA0! pattern) makes 76% of these passwords strong enough.

Takeaways

Modern GPUs are capable of cracking user passwords at a tremendous speed. The simplest brute-force algorithm can crack any password up to eight characters long in less than a day. Smart hacking algorithms can quickly guess even long passwords. These use dictionaries and take into account character substitution (“e” to “3”, “1” to “!” or “a” to “@”) and popular combinations (“qwerty”, “12345”, “asdfg”).
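From the defender's side, the same substitutions can be undone before checking a password against known sequences. The Python sketch below reverses the three substitutions mentioned above and looks for popular combinations. The word list and function names are illustrative, and a real checker would use a far larger dictionary and substitution table.

```python
# Reverse of the substitutions named in the text: e -> 3, 1 -> !, a -> @.
REVERSE = str.maketrans({"3": "e", "!": "1", "@": "a"})

COMMON = {"qwerty", "12345", "asdfg", "password"}  # toy list


def normalize(password: str) -> str:
    """Lowercase the password and undo the known character substitutions."""
    return password.lower().translate(REVERSE)


def contains_common_sequence(password: str, common: set[str] = COMMON) -> bool:
    """True if the raw or the normalized password contains a known sequence."""
    # Check both forms: undoing "3" -> "e" would otherwise hide "12345".
    forms = (password.lower(), normalize(password))
    return any(seq in form for seq in common for form in forms)


print(normalize("P@ssword!"))                 # password1
print(contains_common_sequence("P@ssword!"))  # True
print(contains_common_sequence("H3llo"))      # False
```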

This study lets us draw the following conclusions about password strength:

Many user passwords are not strong enough: 59% can be guessed within one hour.
Using meaningful words, names and standard character combinations significantly reduces the time it takes to guess the password.
The least secure password is one that consists entirely of digits or words.

To protect your accounts from hacking:

Remember that the best password is a random, computer-generated one. Many password managers are capable of generating passwords.
Use mnemonic, rather than meaningful, phrases.
Check your password for resistance to hacking. You can do this with the help of Password Checker, Kaspersky Password Manager or zxcvbn.
Make sure your passwords are not contained in any leaked databases by going to haveibeenpwned. Use security solutions that alert users about password leaks.
Avoid using the same password for multiple websites. If your passwords are unique, cracking one of them would cause less damage.
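For readers who want to see what "random, computer-generated" means in practice, here is a minimal Python sketch using the standard `secrets` module. The 16-character default and the choice of alphabet are assumptions made for illustration, not recommendations taken from the study.

```python
import secrets
import string

# Letters, digits and ASCII punctuation: the aA0! pattern from the study.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 16) -> str:
    """Return a password drawn uniformly at random from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

`secrets` draws from the operating system's cryptographically secure random source, unlike the general-purpose `random` module.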

john fuller

Posted on June 21, 2024. 5:49 pm

Password!

Jeff

Posted on June 23, 2024. 8:29 pm

There is an assumption here that the service you’re accessing is using MD5 to hash passwords, which if it is at all serious (e.g. a bank) it absolutely shouldn’t be doing. Any vaguely respectable system would not have been using MD5 in the last decade – they should be using something like Bcrypt which creates much more expensive hashes.

Still the best advice is there at the end – use a password manager and have different passwords for each site – especially the important ones like your bank.

1. Alexey Antonov
  
  Posted on June 24, 2024. 9:57 am
  
  This is an essential point. Bcrypt would be much more difficult to crack.
  Unfortunately, in the real world, as we see, many services still use weak hashes. Moreover, in 2024 we still sometimes find password databases in clear text.
  So in this research we’ve used MD5 as a baseline.
  
James Sterling

Posted on July 26, 2024. 1:39 pm

Brute-forcing a hash is not “password guessing”.

1. Alexey Antonov
  
  Posted on August 5, 2024. 9:04 am
  
  Hi James!
  
  Apart from being a term for a specific MITRE subtechnique, “password guessing” may be used as a general definition of password cracking/brute-forcing, as it is used by Wikipedia: https://en.wikipedia.org/wiki/Password_cracking
  It is also worth noting that MITRE’s page for “password cracking” features the wording “guess the passwords” as well:
  
  “Techniques to systematically guess the passwords used to compute hashes are available, or the adversary may use a pre-computed rainbow table to crack hashes.”
  https://attack.mitre.org/techniques/T1110/002/
  
