The use of embedded HTML documents in phishing e-mails is a standard technique employed by cybercriminals. It does away with the need to put links in the e-mail body, which antispam engines and e-mail antiviruses usually detect with ease. HTML offers more possibilities than e-mail for camouflaging phishing content.
There are two main types of HTML attachments that cybercriminals use: HTML files with a link to a fake website or a full-fledged phishing page. In the first case, the attackers can not only hide a link in the file, but also automatically redirect the user to the fraudulent site when they open this file. The second type of HTML attachment makes it possible to skip creating the website altogether and save on hosting costs: the phishing form and the script that harvests the data are embedded directly in the attachment. In addition, an HTML file, like an e-mail, can be modified according to the intended victim and attack vector, allowing for more personalized phishing content.
Fig.1. Example e-mail with an HTML attachment
Structure of phishing HTML attachments
Fig. 2. Phishing HTML page and its source code
Typically, the HTML page sends data to a malicious URL specified in the script. Some attachments consist entirely (or mostly) of a JS script.
In the e-mail source code, the HTML attachment looks like plain text, usually Base64-encoded.
Fig. 3. HTML attachment in e-mail source code
If a file contains malicious scripts or links in plaintext, the security software can quickly parse and block it. To avoid this, cybercriminals resort to various tricks.
For example, opening the HTML attachment in the phishing e-mail supposedly from HSBC Bank (see Fig. 1) in a text editor, we see some pretty confusing JS code, which, it would seem, hints neither at opening a link nor at any other meaningful action.
Fig. 4. Example of obfuscation in an HTML attachment
However, it actually is an obfuscated script that redirects the user to a phishing site. To disguise the phishing link, the attackers used a ready-made tool, allowing us to easily deobfuscate the script.
Fig. 5. Deobfuscated script from an attachment in an e-mail seemingly from HSBC Bank: link for redirecting the user
If a script, link, or HTML page is obfuscated manually, it is much harder to restore the original code. To detect phishing content in such a file, dynamic analysis may be required, which involves running and debugging the code.
Fig. 6. HTML file using the unescape() method — the source code of the file contains only five lines, one of which is empty
Fig. 7. Phishing page in the HTML attachment
The cybercriminals used an interesting trick that involves the deprecated JS method unescape(). This method substitutes the “%xx” character sequences with their ASCII equivalents in the string that is passed to it. Running the script and viewing the source code of the resulting page, we see plain HTML.
Fig. 8. The resulting HTML file
In the first four months of 2022, Kaspersky security solutions detected nearly 2 million e-mails containing malicious HTML attachments. Nearly half of them (851,328) were detected and blocked in March. January was the calmest month, with our antispam solutions detecting 299,859 e-mails with phishing HTML attachments.
Number of detected e-mails with malicious HTML attachments, January–April 2022 (download)
Phishers deploy a variety of tricks to bypass e-mail blocking and lure as many users as possible to their fraudulent sites. A common technique is HTML attachments with partially or fully obfuscated code. HTML files allow attackers to use scripts, obfuscate malicious content to make it harder to detect, and send phishing pages as attachments instead of links.
Kaspersky security solutions detect HTML attachments containing scripts regardless of obfuscation.