Troy Hunt is such a treasure. And for us web application developers, there is no excuse for not having protection against credential stuffing! While the best defense is likely two-factor [1], checking against Hunt's hashed password database is also very good and requires no extra work for users!
I don't have anything to back this up, but my guess is that the vast majority of compromised user accounts come from credential stuffing/password re-use. It's really surprising to me when I hear that huge companies don't do this check.[2] It's simple and takes about a day to set up.
If you're a young CTO or early-stage engineer working on a web app and have never been targeted with a credential stuffing attack, let me tell you: It's coming! It's just a matter of time before it's 1AM and your phone blows up; your site is getting hammered; you think it's a DDoS, but then realize most of the hits are on your login page, and then realize with a horrible feeling that some % of those hits are getting through the login page. You'll be up all night dealing with it, and then you have to make breach notifications, and that really sucks.
Troy Hunt's free database will save you that heartache (probably). Just do it.
1. https://cheatsheetseries.owasp.org/cheatsheets/Credential_St...
2. Like 23andMe. https://news.ycombinator.com/item?id=37794379
About a decade back, I was at an event that had an FBI employee presenting. During his presentation, he mentioned a story of a sys admin who had been arrested for taking a hashed PW database in his company, comparing the hashes against known compromised ones (perhaps from haveibeenpwned?), forcing a password reset for everyone who had reused a password that had separately been compromised, and sending an email to each employee explaining this.
One of the employees was apoplectic at the actions of the sys admin and had accused him of violating her privacy by doing this. While I do not recall which party initiated legal action against the sys admin that led to his arrest (i.e. the employee or the company), the bottom line of the story was that the FBI employee (and, by extension, whichever judge was involved in adjudicating the case) considered the act of a sys admin accessing password hashes placed under his care to be a criminal breach of privacy regardless of his intent being to improve his company's security against password stuffing attacks.
Assuming the FBI employee didn't just make the whole thing up (which I have no reason to believe - there are a lot of tech-stupid judges and, especially a decade ago, tech-stupid FBI employees), it might be prudent to pass this by your legal team before checking whether your employees' password hashes appear in haveibeenpwned.
I would love to get a proper source on this. Seems a bit crazy, and wouldn't this be thrown out on appeal?
Unfortunately I have no source to give. The FBI employee was just giving an example of illegal behavior he knew of. He didn't cite jurisdiction or the names of people involved. Hell - even if he did, I likely wouldn't have remembered it as this was roughly 8 years ago I was in the audience for this (I know I said roughly a decade ago in my prior post - but I checked a receipt for the event and it was in 2015).
Quite likely Randal Schwartz.
"In July 1995, Schwartz was prosecuted in the case of State of Oregon vs. Randal Schwartz, which dealt with compromised computer security during his time as a system administrator for Intel. In the process of performing penetration testing, he cracked a number of passwords on Intel's systems. Schwartz was originally convicted on three felony counts, with one reduced to a misdemeanor, but on February 1, 2007, his arrest and conviction records were sealed through an official expungement, and he is legally no longer a felon." -- https://en.wikipedia.org/wiki/Randal_L._Schwartz
Important aspect: he had been fired and cracked passwords while no longer an employee, to try to get rehired:
"Rather ill-advisedly, the Perl-programming guru (who's written several books on the subject) tried to prove his worth by running a password cracking package after he'd left in order to produce evidence that security practices had deteriorated since his departure. Instead of re-hiring Schwartz, as he hoped, Intel called in the police and he was charged with hacking offences."
https://www.theregister.com/2007/03/05/intel_hacker_charges_...
Wikipedia is a little light on the details. How much time did he end up serving, and were there any repercussions for the other parties involved?
Really hard to believe without anything else to go by. This sounds like an old wives' tale, like people who add disclaimers saying they aren't lawyers when they comment on the internet because someone once told them they heard someone got in trouble.
Does it sound that unbelievable for the 2010s? There was quite a discrepancy between how the internet/computers were generally being used and the legality.
Like https://www.eff.org/deeplinks/2016/07/ever-use-someone-elses... > Last week, the Ninth Circuit Court of Appeals, in a case called United States v. Nosal, held 2-1 that using someone else’s password, even with their knowledge and permission, is a federal criminal offense.
Also, the courts only just legalized white-hat hacking last year. Before that violating the terms of service was also potentially a federal crime. https://www.spiceworks.com/it-security/security-general/news...
Jobs can ask if you have ever been arrested outside of CA. (Note: not convicted of a crime).
Also you are going to spend a long time being arrested before the appeal goes out.
"In California, a criminal appeal can take several months to several years. The length of time depends on the complexity of the case and how quickly it moves through the appeals process."
Parent commenter never mentioned anything about comparing stored password hashes. What you do is block bad passwords at password set time by hashing the prospective password and comparing with HIBP. A prospective password you haven't accepted or stored or transmitted off the application server - common sense says that's not a privacy violation - and many giant companies including my employer do this routinely.
[Edit] Oh yea I remember HIBP has an online API. Don't use this. Take the HIBP dumps that they make freely available and compare locally. If not for reasons of privacy, for reasons of simplicity and removing an unnecessary external business/legal/software dependency.
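A minimal sketch of the local check described above, in Python. It assumes the downloaded dump has one `SHA1HASH:COUNT` entry per line in uppercase hex; the function names and the tiny in-memory sample are made up for illustration, and a real dump is far too large to hold in a set like this.

```python
import hashlib
from typing import Iterable


def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1 of the UTF-8 password (the form assumed for the dump)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def load_breached_hashes(lines: Iterable[str]) -> set[str]:
    """Parse 'HASH:COUNT' lines into a set of hashes (illustrative only)."""
    hashes = set()
    for line in lines:
        line = line.strip()
        if line:
            hashes.add(line.split(":", 1)[0].upper())
    return hashes


def is_breached(password: str, breached: set[str]) -> bool:
    """True if the prospective password's SHA-1 appears in the local set."""
    return sha1_hex(password) in breached


# Stand-in for a few lines of the real dump file (counts are invented).
sample_dump = [f"{sha1_hex('password')}:1000\n", f"{sha1_hex('123456')}:2000\n"]
breached = load_breached_hashes(sample_dump)
```

At password-set time the application would call `is_breached(candidate, breached)` before accepting or storing anything, so the candidate never leaves the application server.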
Ideally, but what if you're a new hire and the passwords already exist?
Be satisfied with fixing the new passwords going forward. Or gracefully force a new password for everyone, if circumstances permit that (circumstances including decision making authority; if you are the new CTO or CISO, and you're paranoid about reviewing the existing hashes, you should strongly consider the batched graceful forced reset!)
You can set a flag on login to use the password in memory rather than stored.
That's how you get the whole company to love you as a new CTO - force everyone to change their password, including people who have a strong non-reused password.
Your job as a CTO isn't to be loved by the entire company.
We’re evaluating different options in this thread. The right move is based on the circumstances and your judgement. I would support a new leader with the courage to close a security hole, maybe respect them even if I don’t love them.
By the way, I don’t feel paranoid to flag bad passwords on login (perhaps triggering an email OTP and forcing a password reset), personally. I responded to this thread because a commenter made an unfounded implication about using HIBP data to reduce vulnerability to credential stuffing.
That's not the greatest advice IMO. The API gets updated data more frequently, doesn't require that you transmit the password or a useable hashed form, and it's dead simple to consume. I'd argue that it's more effort to maintain an internal store and synchronization infrastructure, and you're less likely to accidentally breach anonymity and leak a weak hash by using the API than you are rolling your own query against the raw data.
It's also used by hundreds of bigcorps and government agencies who have way more pedantic lawyers than you're likely to have. If they couldn't find a good reason not to use it I doubt yours will.
Those are good arguments for using an online service. But your conclusion is premature and certainly cannot be made blanket like that in favor of using the API.
Just as many arguments can be made for an offline check. Or against an online check. From added latency via required uptime to added dependencies.
My point being: no. "It depends"
The FBI feeds data into Troy Hunt's database and FBI Director Christopher Wray gave Troy Hunt a medal for his work [1].
The Open Web Application Security Project's Application Security Verification Standard recommends that you do a hashed password check [2].
For bigger companies, sure, go talk to legal, but for young startups, my feeling is it's not worth the $200 or whatever your counsel will charge to say it's ok. I personally did not ask anyone (am cto), I just added the check.
1. https://twitter.com/troyhunt/status/1674132801837477888
2. See OWASP ASVS 4.0 2.1.7 https://github.com/OWASP/ASVS/blob/master/4.0/en/0x11-V2-Aut...
The whole situation did seem pretty exceptional when I heard it and I felt like I was being exposed to an alternate reality where lawyers made security worse for everyone.
That said I struggle to believe the sys admin had competent representation.
They forced a password reset. You can use HIBP data in a way that's less disruptive.
not a crime
Tell it to the judge.
It is worth it. That $200 gives you a lot of credibility to stand on if something should arise and you need to prove diligence, which is not at all uncommon in these cases, if legal recourse is ever sought (unlikely if you do it from day 1, I think, but nevertheless).
Well this unlocked a new fear I didn't know I needed to have. I suppose this is the massive drawback to allowing dinosaurs to spearhead policy and govern laws.
For what it's worth, the average tech-smarts in the legal realm and within the FBI are significantly improved compared to 8 years ago. This is just from my personal observation.
That said, there are still tremendous gaps yet to be bridged in the understanding of many prosecutors and lawyers, as well as weird applications of the law that aren't intuitive to people whose life is technology.
For example (and I caveat this with IANAL): Did you know the physical medium you get Internet to your house determines what laws and processes the government can use to monitor your Internet traffic?
I did not! Do you have / know a good explanation of the details?
From my (non-lawyer) understanding, if you have a coax cable connected to a cable modem providing Internet to your residence, your privacy is governed by https://www.law.cornell.edu/uscode/text/47/551
Other means of Internet getting to your residence is covered by Title 3 of the ECPA which, historically, Feds have played fast and loose with getting data from.
That I did know, only because I was dumb enough to hitch my wagon to Comcast/Xfinity as a headend tech for years. Just affirmed the idea that all ISPs should be community owned.
Besides legal, I think it's important to realize that there is a very emotional response to discovering that your password is not good.
I know a company that started doing quarterly brute-forcing of passwords as a security check and the reaction to finding out that your password is not strong enough is....all sorts of emotions.
If you have a 10-12 character password that may have been strong at one point but now is not and your IT team is informing you, your reaction is NEVER, oh thank you for helping me out. It's not stupidity, it's human nature to feel attacked.
When a 12-character password gets bruteforced, my initial reaction is to blame the system for allowing so many password attempts!
Like imagine how many failed attempts must've happened for a 12 character password to get bruteforced. Alarms should have been raised way before it became an issue.
what if it was a crappy 12 character password like 123456789012 and got bruteforced in 2 tries?
also, at one point it was popular to use l33t speak for passwords so there are many crappy 12+ char l33t passwords floating around that are trivial to guess, no brute forcing required.
As part of fixing security problems 20+ years ago we put together a migration process that included cracking passwords. First off we created an interface for updating your password, and that interface essentially ran through all the tests that the cracking software did, to better ensure you'd picked something good. Passwords were expired every 90 days (remember, this was 2001). The migration first set the expiration date so that people got used to the process and then, on occasion, we'd run the passwords through a brute force attack. To your point, the users were most unhappy when their password would get cracked and expired, but that's life. 2FA, keys, etc. are really an improvement over what we've had for such a long time.
Why are good deeds punished so much by authorities?
This is a good way to disincentivize prosocial behavior.
The problem can be "who defines good deeds?" There are so many things which seem good when presented one way, but can be harmful when viewed another way. Obviously, as presented above this seems like "an obvious good", but context matters, and clearly you don't get the whole context from a one paragraph summary.
Ultimately we have civil structures (government at every level) that try to codify "good" and "bad". Life is seldom that clean though, so inevitably every regulation and law is good for some, bad for others.
So, to answer your question, because "good" and "prosocial" are not universally true.
Accessing password hashes already in a DB is not the same as preventing, during account creation, the reuse of a password known to be compromised.
If I'm not mistaken it's all done using cryptographic schemes that leak neither the password nor the hash.
This is true. The story as written probably didn't happen with HIBP's database. Troy Hunt's database only includes SHA-1 hashes, and passwords in your own database will be hashed with a stronger algorithm (hopefully) and salted (hopefully), so you can't do a simple hash-to-hash comparison. The way to do a HIBP check is, when a user signs in, you hash their password in the way HIBP expects, and check that against either their API or against a local copy of HIBP's database, and if a hit is returned, you give them a nice message and direct them to the password reset flow. There's no easy way to use HIBP's data to identify users with compromised passwords until users actually try to log in.
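A rough sketch of that sign-in flow in Python. PBKDF2 stands in for whatever salted hash a site actually uses, and the `User` class, the `must_reset` flag, and the sample data are hypothetical names for illustration, not anything HIBP prescribes.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass
class User:
    salt: bytes
    stored_hash: bytes
    must_reset: bool = False


def slow_hash(password: str, salt: bytes) -> bytes:
    # PBKDF2 is only a stand-in for the site's real salted, slow hash.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def login(user: User, password: str, breached_sha1: set[str]) -> bool:
    """Verify the password, then flag the account if the password is known-breached."""
    if not hmac.compare_digest(slow_hash(password, user.salt), user.stored_hash):
        return False
    # Only here, with the plaintext in memory, can the unsalted SHA-1 be computed.
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    if sha1 in breached_sha1:
        user.must_reset = True  # the caller would send the user to the reset flow
    return True


salt = os.urandom(16)
alice = User(salt=salt, stored_hash=slow_hash("hunter2", salt))
breached_sha1 = {hashlib.sha1(b"hunter2").hexdigest().upper()}  # invented sample
ok = login(alice, "hunter2", breached_sha1)
```

The stored salted hash is never compared with the breach data; the SHA-1 is derived from the plaintext the user just typed and then discarded.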
Would it matter which hash function was used to create the password database?
But there's more than just the issue of discovering the password itself.
What about the issue of discovering that a particular password hash comes from an employee at a certain company?
As I understand it, Troy Hunt downloads dumps of stolen passwords. He does not share the dumps. Instead he collects queries, like a search engine. Until people start sending him queries of hashes to check he does not necessarily know the locations of the people whose passwords were stolen.
However if he gets a series of hashes sent from some IP address belonging to a particular corporation, then arguably he now knows these are likely to be passwords belonging to employees at that corporation.
The API doesn't require the full hash, just a short prefix. They don't have enough information for your scenario to work.
https://www.troyhunt.com/understanding-have-i-been-pwneds-us...
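A sketch of the prefix-only lookup in Python, as I understand the public Pwned Passwords range endpoint: the client sends the first 5 hex characters of the SHA-1 and gets back `SUFFIX:COUNT` lines to match locally. The helper names are made up, `fetch_range` is untested against the live service, and the demo uses a hand-built response body rather than a real one.

```python
import hashlib
import urllib.request


def split_hash(password: str) -> tuple[str, str]:
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, body: str) -> int:
    """Look for our suffix in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


def fetch_range(prefix: str) -> str:
    """Network call: only the 5-character prefix leaves the server."""
    url = f"https://api.pwnedpasswords.com/range/{prefix}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


# Offline demonstration with a fabricated response body.
prefix, suffix = split_hash("password")
fake_body = f"{'0' * 35}:3\r\n{suffix}:42\r\n"
hits = count_in_range(suffix, fake_body)
```

Because many different hashes share each prefix, the service sees neither the password nor its full hash; the match against the suffix happens on your side.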
Hmm. Interesting. Shitty outcome if true, but AD/Azure AD has an extension (3rd party if I recall) that automatically checks for breached passwords and lets the user know and forces them to change their password.
I certainly believe a user was upset by it. We've gotten support tickets before from users accusing us of "snooping on their local machine" to find passwords... Like no, it was just in a breach, relax.
They're often now upset they've been called to task so it's just hard all around.
I'm pretty sure the password manager in Safari also checks this db, as I've been warned that some passwords have been discovered in breaches (even going back to the LinkedIn breach).
I‘m flabbergasted how broken the system is.
It sounds like it was made up, should not be so hard to find the verdict.
Sorry, I don't understand the procedure. If the database contains hashed passwords (I haven't seen or downloaded the database), how can you know you're using the same salt and method as the one in the database?
For example, let's say Tumblr was hacked and with it my password `hunter2`. Tumblr used some naive HMAC-MD5 method with a salt, but my site uses argon2 with (obviously) a different salt. Even though my password is the same (`hunter2`) the resulting hashed passwords will be different. How is this at all effective at preventing credential stuffing?
One can only implement a HIBP check when one has access to the user's unhashed password. So, at login, registration, and password reset.
Yes, exactly, so that's why I was asking, you mentioned the database was of hashed passwords. The database then contains the source passwords? And you're preventing the user from using one of those passwords?
Sorry, I still don't understand the procedure you mentioned and I'm genuinely curious.
Oh, I see the issue. The HIBP database is SHA-1 hashed with no salt. It was created from unhashed passwords. You can't download the unhashed version (you could of course compute it, if you really wanted to; but there's no need).
So, the procedure you need to implement is, on login/registration/pw reset, you SHA-1 hash the user's unhashed password and do an indexed lookup on your copy of HIBP's database. Or if you don't want to maintain that copy, you can use HIBP's API to do something similar.
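One way the indexed lookup could look, sketched in Python with SQLite. The table name `pwned`, its columns, and the one-line sample are assumptions for illustration; the dump is assumed to be `HASH:COUNT` lines of uppercase SHA-1 hex.

```python
import hashlib
import sqlite3


def build_index(conn: sqlite3.Connection, dump_lines) -> None:
    """Load 'HASH:COUNT' lines into a table whose primary key is the hash."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pwned (sha1 TEXT PRIMARY KEY, seen INTEGER NOT NULL)"
    )
    rows = []
    for line in dump_lines:
        sha1, _, seen = line.strip().partition(":")
        if sha1:
            rows.append((sha1.upper(), int(seen or 0)))
    conn.executemany("INSERT OR REPLACE INTO pwned VALUES (?, ?)", rows)
    conn.commit()


def times_seen(conn: sqlite3.Connection, password: str) -> int:
    """SHA-1 the plaintext and do a primary-key lookup; 0 means not found."""
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    row = conn.execute("SELECT seen FROM pwned WHERE sha1 = ?", (sha1,)).fetchone()
    return row[0] if row else 0


conn = sqlite3.connect(":memory:")
sample = [hashlib.sha1(b"hunter2").hexdigest().upper() + ":17"]  # invented count
build_index(conn, sample)
```

The primary key gives the index, so each check is a single keyed read rather than a scan of the whole dump.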
Ah! Thanks a lot, it now makes sense. So at some point HIBP has the unhashed passwords, they obviously don’t make those public, good trick. How do you handle this from a UX perspective? Just tell the user that password is “not strong enough”?
Password managers that have HIBP integration are open about it - one says "this password appears in a list of compromised passwords"
The HIBP database only stores hashes of leaked passwords, but the source material is often (always?) plaintext passwords. If the hash of a password is in the HIBP database, the plaintext password is out there somewhere in a database of a malicious actor.
My understanding is that this isn't true. These leaks are often just the password hashes.
There are some leaks where passwords are cracked and included in plaintext and there are some leaks where passwords are not cracked and included only as hashes. If the leak includes cracked passwords in plaintext then they will be added to HIBP and can be checked, otherwise they are not included and cannot be checked.
The alternative is the exact same scenario, except that the percentage is several orders of magnitude lower, right?
The small subset of your users that explicitly opted-out of 2-factor authentication (if you allow that) and who try to choose "Password1!" with a second exclamation point when your site said "Error, your password has been seen 83,000 times in password dumps, please use a unique password" will still get hacked.
Or is your expectation that no one will attack every user on your webapp with a credential stuffing attempt if they see that the probability of success is 0.001% instead of 1%?
Wait, a thousand fold decrease is not worth it?
Your numbers literally turn a scenario where 200,000 accounts are hacked into one where 200 are exposed. Or one where 30 hacked accounts turn into 0 hacked accounts.
There is a point where a difference in quantity becomes a difference in quality. I far prefer the latter scenarios.
Anybody (like GP) that doesn’t understand that this is entirely the nature of security work, should not be making any material decisions about security.
The number of times I’ve seen DEVELOPERS neglect to implement materially useful security measures because “they’re not technically perfect!” is astounding.
The number of times I’ve seen purported security practitioners dismiss materially useful security measures because of some theoretical attack that nobody has ever seen in the wild in recorded history outside of stunt-hacking at Defcon is…probably higher
The bad feeling comes from knowing you could have reasonably done something to mitigate the harm. Don't let perfect be the enemy of good.
Remember that "identity theft" is marketing fluff. In a credential stuffing attack your business is the victim of fraud.
Yes, same scenario, but far fewer logins are successful. 3 orders of magnitude sounds right, but I don't know precise numbers. (Can others shed light?) Three orders of magnitude is a lot!
Besides 2FA, rate limiting your login endpoint (both by IP address and username) is a much more robust protection against this attack. Especially if you include temporary bans (e.g. “20 failed login attempts with the same IP, and/or same username, in the past minute = 15 minute ban for that IP and/or username”). A lot of API gateways, K8s ingresses, etc. make this dead simple, and if not it’s also super easy to add with a few lines of code and something like Redis to store counts of recent login attempts.
I do think checking against the HIBP DB is a good call too, but it doesn’t stop this attack overly well, rate limiting is a much better way to stop it.
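A small sketch of that temporary-ban rule in Python. In-memory dicts stand in for Redis, the class and parameter names are invented, and the thresholds simply mirror the 20 failures / 1 minute / 15 minute ban example.

```python
import time
from collections import defaultdict, deque


class LoginThrottle:
    """Ban a key (IP or username) after too many failures in a sliding window."""

    def __init__(self, max_failures=20, window=60.0, ban=900.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window = window
        self.ban = ban
        self.clock = clock
        self.failures = defaultdict(deque)  # key -> timestamps of recent failures
        self.banned_until = {}              # key -> time at which the ban expires

    def is_blocked(self, key: str) -> bool:
        return self.clock() < self.banned_until.get(key, 0.0)

    def record_failure(self, key: str) -> None:
        now = self.clock()
        recent = self.failures[key]
        recent.append(now)
        # Drop failures that have aged out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.banned_until[key] = now + self.ban
            recent.clear()


# Fake clock so the demo does not need to sleep.
now = [0.0]
throttle = LoginThrottle(clock=lambda: now[0])
for _ in range(20):
    throttle.record_failure("203.0.113.7")
blocked_after_20 = throttle.is_blocked("203.0.113.7")
now[0] += 901
blocked_after_ban = throttle.is_blocked("203.0.113.7")
```

A login handler would call `is_blocked` for both the client IP and the submitted username before checking the password, and `record_failure` for both on a failed attempt.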
Rate limiting definitely helps against credential stuffing in the form of trying a bunch of common passwords against random accounts.
But there's also "stuffing" with known breached username+password combinations – in which case it still helps, but I don't think as much? In the latter the attack is much more likely to succeed and there's a much smaller number of values being attempted, so the threshold of detection + blocking would have to be much lower...
The threshold is lower but in reality it still makes considerably more login attempts, many of them failed, than a normal client ever would. Credential stuffing attacks don't really limit themselves to a single account, even if it worked.
If you're working on a greenfield login/auth, please don't accept and store passwords in a database! Setup social OAuth, SSO, or magic link emails and make it someone else's problem.
If you do go down this route though, be sure to read up on what you're deploying, and understand what your libraries are doing (and more importantly, not doing).
You don't want to end up with a naive implementation of OAuth2 (like some big names had recently) which fails to check the audience parameter, and therefore lets any other service using the same SSO gain access to your users' accounts.
Recent HN post on this - https://news.ycombinator.com/item?id=38009291
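To make the audience check concrete, here is a hedged Python sketch. It assumes the "audience parameter" means the `aud` claim of a token whose signature, issuer, and expiry were already verified elsewhere; the function name and client ID are hypothetical.

```python
from typing import Any, Mapping


def audience_matches(claims: Mapping[str, Any], expected_client_id: str) -> bool:
    """True only if the token was issued for this application.

    `claims` is assumed to be the already-verified token payload.
    """
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == expected_client_id
    if isinstance(aud, (list, tuple)):
        return expected_client_id in aud
    return False  # missing or malformed audience: reject


MY_CLIENT_ID = "my-app-client-id"  # hypothetical value
token_for_us = {"sub": "user-1", "aud": "my-app-client-id"}
token_for_other_app = {"sub": "user-1", "aud": "some-other-app"}
```

Without this check, a valid token that the same identity provider issued to some other application would be accepted as a login to yours.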
I agree, and thanks for pointing that out, but between the two security failures, I'd rather have an incorrect OAuth2 implementation, which can be quickly fixed with no impact on existing customers, than credential stuffing, where I need to email customers apologizing for why I needed to reset their passwords.
(Non sarcastic), why would you feel bad for users using 1234 as their passwords? Unless your website is aimed at vulnerable people, I consider this to be their responsibility.
As other comments have said these users will probably go the easiest route (1234websitename) to fix the error.
Any restriction you put on your password field reduces entropy, and safety for everyone (even if marginally so).
Because anyone that has ever been responsible for anything knows that there’s a difference between something being your fault and something being your problem.
Breach notification etc legislation in some jurisdictions will also require that you report successful widespread credential stuffing.
Even AWS with their “shared responsibility model” works with GitHub etc to ensure that programmatic access credentials aren’t accidentally exposed via public repositories. This isn’t credential stuffing, but it’s a blindingly accurate demonstration of the fact that drawing a line in the sand and saying “users, work it out from here!” and attempting to wash your hands of the situation is nothing more than the ill-informed pipe dream of someone that’s never had to deal with this stuff in reality.
Have you ever operated an online business? Poor password choice is practically harmful to business. Marginal reduction of entropy by blocking breached passwords, what's the practical harm from that?
1234websitename is objectively better than 1234.
I'll go with NIST on this one (yes, and have a minimum length too):
https://pages.nist.gov/800-63-3/sp800-63b.html#memsecret
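Roughly what that policy looks like as code, sketched in Python: a minimum length (8 characters is the SP 800-63B floor for user-chosen secrets), a compromised-password lookup, and no composition rules. The function name, messages, and the tiny `known_bad` set are placeholders for illustration.

```python
from typing import Callable, Optional

MIN_LENGTH = 8  # SP 800-63B minimum for user-chosen secrets


def reject_reason(password: str, is_breached: Callable[[str], bool]) -> Optional[str]:
    """Return a user-facing reason to reject the password, or None to accept it."""
    if len(password) < MIN_LENGTH:
        return f"Use at least {MIN_LENGTH} characters."
    if is_breached(password):
        return "This password appears in a list of compromised passwords."
    return None  # deliberately no composition rules (digits, symbols, ...)


known_bad = {"Password1!", "12345678"}  # stand-in for a real breach lookup


def check(pw: str) -> bool:
    return pw in known_bad
```

`reject_reason("1234", check)` fails on length, `reject_reason("Password1!", check)` fails on the breach lookup, and a long unique passphrase passes both.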