Yeah, you really shouldn’t save users’ passwords in plain text, or encrypted in a way that could be decrypted. Instead you save a (salted) hash of the password. When a user logs in, you compare the hash of the password they typed in against the hash stored in your database.
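To make the store-and-compare flow concrete, here is a minimal sketch in Python. It assumes PBKDF2-HMAC-SHA256 from the standard library as the hash; the function names, the 16-byte salt, and the iteration count are illustrative choices of mine, not something from this thread.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) to store; the plain password is never stored."""
    salt = os.urandom(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare digests."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(candidate, stored_digest)
```

At signup you would store the salt and digest next to the username; at login you call `verify_password` with whatever the user typed.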
What is a hash? Think of it like the crossfoot (digit sum) of a number:
Let’s say you have the number 69: its crossfoot is 6 + 9 = 15. But if someone steals this crossfoot, they can’t know which original number it came from. It could just as well be 78 or 87.
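As a quick sketch, the crossfoot from this example can be written in a few lines of Python (the function name `crossfoot` is just my label for it):

```python
def crossfoot(n: int) -> int:
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(digit) for digit in str(n))


print(crossfoot(69))                 # 15
print(crossfoot(78), crossfoot(87))  # 15 15
```

Going from 69 to 15 is trivial, but 15 alone does not tell you which number it started from.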
Dumb question: isn’t it irrelevant for the malicious party whether it’s 78 or 87 in your example, because the login only checks the hash anyway? Won’t both numbers successfully log in?
It’s actually a really good question. What you’re describing is called a collision: if a different input produces the same hash, it will log in successfully too.
This is why some standard hash functions get deprecated and replaced once someone finds a collision. MD5, which was used a lot to hash passwords and files, is considered insecure because collisions can be produced in practice.
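Here is a toy illustration of that collision problem, reusing the crossfoot as a stand-in hash. The `login` function and the stored value are hypothetical, only there to show that a check on the hash cannot tell colliding inputs apart:

```python
def crossfoot(n: int) -> int:
    """Toy 'hash': sum of the decimal digits."""
    return sum(int(digit) for digit in str(n))


# The site only keeps the hash of the real password (69), not 69 itself.
STORED_HASH = crossfoot(69)


def login(attempt: int) -> bool:
    """Accept any attempt whose hash matches the stored hash."""
    return crossfoot(attempt) == STORED_HASH


# Every two-digit number that collides with 69 gets in as well.
colliding = [n for n in range(100) if login(n)]
print(colliding)  # [69, 78, 87, 96]
```

A real hash function has a vastly larger output space, so stumbling on a colliding input like this is not practical unless the function is broken.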
In the example, yes.
In the real world, finding an input that produces the right hash output isn’t easy. And because a lot of users reuse passwords (don’t do it, but people do), a list of emails and passwords gives you an incredibly lazy and easy way to compromise accounts on other sites.
Reminds me of a funny moment in my IT internship: ahead of an audit, one of the sysadmins came over and said “yeah so I pulled all of the department password hashes to check for weak/compromised accounts and noticed one person has the same sysadmin and user password hash” and my boss went “wait everyone doesn’t do that?” After realizing they had outed themselves, they turned bright red and changed their admin password.
In addition to what others have said: the “salted” part is very relevant for storage.
There aren’t soooo many different hashing algorithms in use. So, let’s simplify the hashing again to the crossfoot example.
Let’s say 60% of websites use this one algorithm (crossfoot) for storing your password, and someone steals the password “hashes” (and the login / email). I could run a program that creates a list of the crossfoots of all numbers from 1 to 100000.
This would give me an easy lookup table for finding the “real” number behind those hashes. (Those tables exist. Look up “rainbow tables”)
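A rough sketch of that precomputed table for the crossfoot example could look like this. Note this is a plain hash-to-inputs lookup; real rainbow tables use a more compact structure, and the names here are my own:

```python
from collections import defaultdict


def crossfoot(n: int) -> int:
    """Toy 'hash': sum of the decimal digits."""
    return sum(int(digit) for digit in str(n))


def build_lookup_table(limit: int) -> dict[int, list[int]]:
    """Map every crossfoot value to the numbers in 1..limit producing it."""
    table = defaultdict(list)
    for n in range(1, limit + 1):
        table[crossfoot(n)].append(n)
    return dict(table)


# Computed once, then reusable against every site using the same algorithm.
table = build_lookup_table(100_000)

stolen_hash = 15
candidates = table[stolen_hash]
print(candidates[:4])  # [69, 78, 87, 96]
```

The expensive part happens once up front; after that, "cracking" a stolen hash is a single dictionary lookup.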
Buuuut what if I use a little bit of salt (and pepper pepper pepper) before doing my hashing / crossfooting?
Let’s use the pw “69” again and use a salt with a random number “420” and add them all together:
6 + 9 + 420 = 435
This hash wouldn’t be in my previously mentioned lookup table. Use a different salt for every user and at least the lookup problem isn’t such a big problem anymore.
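The salted version of the toy example can be sketched like this. It follows the 6 + 9 + 420 arithmetic above; the per-user salt range and the user names are made up for illustration:

```python
import secrets


def crossfoot(n: int) -> int:
    """Toy 'hash': sum of the decimal digits."""
    return sum(int(digit) for digit in str(n))


def salted_crossfoot(password: int, salt: int) -> int:
    """Toy salted hash: digit sum of the password plus the salt."""
    return crossfoot(password) + salt


# Every hash an unsalted lookup table for 1..100000 could contain.
unsalted_hashes = {crossfoot(n) for n in range(1, 100_001)}

print(salted_crossfoot(69, 420))  # 435
print(435 in unsalted_hashes)     # False

# Hypothetical users who share the password 69 but get their own salt.
stored = {}
for name in ("alice", "bob"):
    salt = secrets.randbelow(10_000) + 100  # assumed salt range, illustration only
    stored[name] = (salt, salted_crossfoot(69, salt))
```

The salt is stored next to the hash, so the site can still verify logins, but an attacker would have to rebuild the table for each salt.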
This was super helpful 🙏🏼 sent me down a whole other rabbit hole of learning
With a hash it’s difficult to find an input that results in this specific hashed password. Think of it like this: you have a biiig prime number and you multiply it by another. Now, that’s easy, but it’s way harder to do it backwards and factorize a large composite number (this is just for illustration). Similarly, trying to find a password that works based on the hashed one is way more difficult than hashing the password in the first place.
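To put a number on that asymmetry, here is a small sketch. It assumes SHA-256 as the hash and a deliberately tiny search space of 4-digit PINs (both are my choices for illustration): hashing is one call, while going backwards means guessing until something matches.

```python
import hashlib
from typing import Optional


def sha256_hex(text: str) -> str:
    """Forward direction: a single cheap call."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def brute_force_pin(target_hash: str) -> tuple[Optional[str], int]:
    """Backward direction: try every 4-digit PIN, counting the attempts."""
    for attempts, i in enumerate(range(10_000), start=1):
        candidate = f"{i:04d}"
        if sha256_hex(candidate) == target_hash:
            return candidate, attempts
    return None, 10_000


target = sha256_hex("7316")  # one hash computation
pin, attempts = brute_force_pin(target)
print(pin, attempts)  # 7316 7317
```

With only 10,000 possible inputs the search is instant, but the number of guesses grows with the size of the password space, which is why long, random passwords are so much harder to recover from a hash.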
i was more wondering why a length limit implies anything about how they’re storing the password. once they receive the password they’re free to hash it any which way they want
random memory—yahoo back in the day used to hash the password in the browser before sending it to the server, but TLS made that unnecessary i guess