As penetration testers, we are frequently engaged to do penetration tests for PCI compliance. As a part of these penetration tests, we look for cardholder data (Card Numbers, CVV, etc.) in files, network traffic, databases, and anywhere else we might be able to catch it. Often times, we will find hashes of credit card numbers along with the first six and/or last four numbers of the credit card number. Given that credit card numbers are a fixed length, this limits the keyspace that we need to use to brute force the hashes.
The language in the PCI DSS is a little vague about how cardholder data needs to be hashed, but there is information in requirement 3.4 that helps.
“Render PAN unreadable anywhere it is stored (including on portable digital media, backup media, and in logs) by using any of the following approaches:
- One-way hashes based on strong cryptography (hash must be of the entire PAN)
- Truncation (hashing cannot be used to replace the truncated segment of PAN)
- Index tokens and pads (pads must be securely stored)
- Strong cryptography with associated key-management processes and procedure”
While this information is good, it does not ensure that the implementer of the hashing function is doing things correctly. “Strong cryptography” can be interpreted a number of different ways. One could argue that SHA256 is a strong hashing algorithm, therefore meeting the requirements. It does not take a significant amount of effort for us to try and brute force SHA256 hashes, so the strength of the algorithm is a moot point.
This type of attack is actually called out as a footnote in the requirement.
“Note: It is a relatively trivial effort for a malicious individual to reconstruct original PAN data if they have access to both the truncated and hashed version of a PAN. Where hashed and truncated versions of the same PAN are present in an entity‘s environment, additional controls should be in place to ensure that the hashed and truncated versions cannot be correlated to reconstruct the original PAN.”
These “additional controls” could include salts for the hashes (frequently stored with the hash) or encrypting the truncated versions. There are a number of other potential controls that we could talk about, but that would be enough info for another post. Even with proper additional controls on the PAN data (truncated and hashed), the root of the issue is still the length of the card number and the limited keyspace that is needed for guessing the number.
Given a (potentially) sixteen digit card number, the first six digits, and the last four digits, we are able to easily iterate through the remaining six digits in a matter of minutes. There are only a fixed number of IIN or BIN prefixes for cards (think the first 4-6 numbers of a card). These numbers are available online and pretty easy to find. Given a list of these numbers, we are able to reduce our cracking efforts for situations where hashes are only stored with the last four digits of the card number. Factoring in that credit card numbers are Luhn valid, this reduces the amount of effort that we have to go through to hash the credit card number guesses.
Example Card Format:
For this example, we will use the TCF Debit Card BIN. With a million potential card numbers (000000 to 999999 for the middle digits) with a last four digits of 1234, there are 100,000 potential Luhn valid card numbers in this space. So as you can see in this case, the Luhn check cuts the cracking space down by ninety percent. As it turns out, this will be the case with any of the credit card numbers that you are brute forcing. Since the last (or check) digit can only be one of ten numbers (0-9), you are limiting the number of valid Luhn check numbers to one of the ten numbers. Simply put, this works because you are not brute forcing the check digit.
Time wise, it takes about 30 minutes to get through this keyspace (for the example number above) on a 2.80 Ghz Intel Core i7 processor. I also ran this test with several other programs open, so your results may vary. In general practice, I’ve seen most hashes crack within two minutes.
The code for cracking these hashes is actually quite simple. Read in the input file, iterate through the numbers that need guessing, and hash the Luhn valid numbers. If the guess hash matches your input hash, it will write out your results to your output file. There’s also a small block in here to read in a list of IIN/BINS to use when you need to do guessing on the first 4-6 card numbers. You will have to provide your own list of these.
Below is some sample output with the full card numbers and hashes redacted. Each period represents a Luhn valid card number and is used to show the cracking status.
I’ve put the code out on the NetSPI GitHub for those who are interested: http://netspi.github.io/PS_CC_Checker/
The PCI DSS allows merchants to store partial card numbers, but not the full number. While the card number may not be stored in full, storing the hash of the card along with some of the digits allows an attacker to make educated guesses about the card number. This basically renders the hashing useless. While this is directly called out in requirement 3.4 of the DSS, we have found instances of hashes being stored with the truncated PAN data. Even without the truncated PAN data, the cracking effort for a card number hash is still reduced by the static IIN/BIN numbers associated with the card issuer.