One afternoon I was happily coding at work when I got a phone call. It was my girlfriend.

After hanging up, I quickly logged in to 12306 and changed my password. Fortunately, I use a different password on every site, which protects me well against credential stuffing.

After work, as soon as I got home, my girlfriend came straight to me and insisted that I explain to her the background behind the 12306 data leak.

Plaintext passwords

A plaintext password is one that can be read directly, such as 123456 or admin, as opposed to one displayed as **** after encryption; the encrypted form is known as ciphertext. For example, suppose ABC stands for 123. If you are given ABC without being told the decoding rule, you cannot recover the real password 123.

Many websites offer registration and login. If the user name and password are saved to the database without any processing when a user registers, the password is stored in plaintext.

Saving the user's plaintext password directly is very convenient for development: when users log in, the account and password can be matched directly against the database. However, it also hides a serious danger: once the database is leaked, attackers obtain every user name and password.

The alternative is to store only an encrypted form of the password. For example, suppose the user's plaintext password is HelloWorld and its encrypted ciphertext is xxeerrqq.

User registration:

HelloWorld -> encrypt -> xxeerrqq -> save xxeerrqq to database

User login:

HelloWorld -> encrypt -> xxeerrqq -> use xxeerrqq to match the password in the database
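
The two flows above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the UserStore class, its in-memory map standing in for the database, and the choice of SHA-256 as the "encrypt" step are all hypothetical and not taken from any real site.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

final class UserStore {
    // Hypothetical stand-in for the database table: user name -> transformed password.
    private final Map<String, String> table = new HashMap<>();

    // Stand-in for the "encrypt" step (assumption: SHA-256, rendered as hex).
    static String transform(String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    // Registration: only the transformed password is saved, never the plaintext.
    void register(String userName, String password) {
        table.put(userName, transform(password));
    }

    // Login: transform the submitted password and compare it with the stored value.
    boolean login(String userName, String password) {
        String stored = table.get(userName);
        return stored != null && stored.equals(transform(password));
    }

    public static void main(String[] args) {
        UserStore store = new UserStore();
        store.register("alice", "HelloWorld");
        System.out.println(store.login("alice", "HelloWorld")); // true
        System.out.println(store.login("alice", "helloworld")); // false
    }
}
```

The point of the sketch is that the plaintext never reaches the table; how strong the stored value is depends entirely on which transform is chosen.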

Password encryption has matured over many years of development, and a number of well-established schemes now exist. Here is a brief introduction to a few of them.

Symmetric encryption

Symmetric encryption refers to encryption algorithms that require the same key for encryption and decryption.

The simplest symmetric encryption algorithm stores the password by shifting its ASCII codes, for example converting ABCDE to BCDEF by adding one to every code. Here the shift amount is the key.

A defining property of this kind of algorithm is that, given the key, the plaintext can be recovered from the ciphertext.
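
The shift described above can be sketched as follows. The ShiftCipher class and its method names are made up for this illustration; it is a toy to show the symmetric property, not something to protect real passwords with.

```java
final class ShiftCipher {
    // Encrypt by adding the key to every character code.
    static String encrypt(String plaintext, int key) {
        StringBuilder out = new StringBuilder();
        for (char c : plaintext.toCharArray()) {
            out.append((char) (c + key));
        }
        return out.toString();
    }

    // Decrypt by subtracting the same key, which is what makes the scheme symmetric.
    static String decrypt(String ciphertext, int key) {
        return encrypt(ciphertext, -key);
    }

    public static void main(String[] args) {
        String ciphertext = encrypt("ABCDE", 1);
        System.out.println(ciphertext);             // BCDEF
        System.out.println(decrypt(ciphertext, 1)); // ABCDE
    }
}
```

Anyone who learns the key (here, the number 1) can run decrypt on every stored ciphertext, which is exactly the weakness of storing passwords with reversible encryption.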



However, few websites use this kind of algorithm any more. Although there are now many ways to store the key separately, an attacker who can break into a website and obtain users' ciphertexts may well be able to obtain the key too.

Commonly used symmetric encryption algorithms include DES, 3DES (also called TDEA), Blowfish, RC2, RC4, RC5, IDEA, and SKIPJACK.

One-way hash algorithm

A one-way hash algorithm, also known as a hash function, is a function that turns an input message of arbitrary length into an output string of fixed length. It is generally used to generate message digests, derive keys, and so on.

Its key property is that the original password cannot feasibly be recovered from the hash value by computation.
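
As a small illustration of the fixed-length property, the sketch below hashes inputs of different lengths with SHA-256 from the JDK's MessageDigest. The OneWayHash class and the sha256Hex helper are names invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class OneWayHash {
    // Hash the input with SHA-256 and render the 32-byte digest as 64 hex characters.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Same input always gives the same digest.
        System.out.println(sha256Hex("abc"));
        // ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

        // Output length is fixed no matter how long the input is.
        System.out.println(sha256Hex("abc").length());                                 // 64
        System.out.println(sha256Hex("a much longer input than three letters").length()); // 64
    }
}
```

There is no corresponding "unhash" call: the only way back from a digest to the input is to guess inputs and hash them.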



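Many Internet companies have stored passwords this way, and for a time it was considered a relatively secure approach.

Common hash functions include Message Digest Algorithm 5 (MD5) and the Secure Hash Algorithm (SHA) family. Related constructions include the Message Authentication Code (MAC), which additionally takes a key, and the Cyclic Redundancy Check (CRC), which is an error-detecting checksum rather than a cryptographic hash.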

Rainbow table

A rainbow table is a precomputed table for reversing a cryptographic hash function, most often used to crack password hashes. Such tables are typically used to recover plaintext passwords of bounded length drawn from a limited character set. It is a classic space-for-time trade-off: it uses less computation and more storage than a brute-force attack that computes a hash on every attempt, but less storage and more computation than a simple lookup table with one entry per input.

Normally, hashing a field (for example with MD5) produces a hash value from which the original field cannot be recovered by any specific algorithm. But in some cases, for example with a large rainbow table, the original content behind a hash can be found in a very short time simply by looking up the MD5 value in the table.
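
The lookup idea can be sketched with a plain precomputed map from MD5 hash to password. Note the assumption: this is the "simple lookup table" mentioned above, not a real rainbow table, which would compress the table using hash chains and reduction functions. The HashLookupTable class and its candidate list are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HashLookupTable {
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    // Precompute hash -> password once for a list of candidate passwords.
    static Map<String, String> build(List<String> candidates) {
        Map<String, String> table = new HashMap<>();
        for (String candidate : candidates) {
            table.put(md5Hex(candidate), candidate);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> table = build(Arrays.asList("123456", "admin", "password"));

        // Stands in for a hash taken from a leaked database.
        String leakedHash = md5Hex("admin");
        System.out.println(table.get(leakedHash));           // admin

        // A password that was never precomputed is not found.
        System.out.println(table.get(md5Hex("HelloWorld"))); // null
    }
}
```

The expensive hashing happens once, when the table is built; after that, each leaked hash costs only a map lookup.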

Hash algorithm with salt

In cryptography, a salt is a specific string inserted at a fixed position in the content to be hashed (for example, a password) before hashing. Adding a string to the input in this way is called “salting.” The effect is that the salted hash differs from the unsalted hash, which adds security in a variety of application scenarios.

A salted hash greatly reduces the risk of password exposure when user data is stolen. Even if an attacker uses a rainbow table to find an input that produces the stored hash, the inserted salt string has scrambled the real password, so the probability of recovering the real password drops sharply.



With a hash algorithm that uses a “fixed” salt, the salt itself must be protected from leaking, which brings back the same problem as protecting a symmetric key: once the salt leaks, an attacker can rebuild rainbow tables for that salt.
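
Salting can be sketched as below. One assumption goes beyond the fixed-salt case just discussed: the example generates a fresh random salt per password and expects it to be stored next to the hash, so nothing secret has to be protected. The SaltedHash class, the 16-byte salt length, and the choice of prepending the salt are illustrative choices only.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

final class SaltedHash {
    // Generate a fresh random salt (assumption: 16 bytes, one per stored password).
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Hash salt || password with SHA-256; the salt is inserted before the password.
    static byte[] hash(byte[] salt, String password) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(salt);
            return md.digest(password.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] saltA = newSalt();
        byte[] saltB = newSalt();

        byte[] hashA = hash(saltA, "HelloWorld");
        byte[] hashB = hash(saltB, "HelloWorld");

        // Same password, different salts: the stored hashes differ
        // (false, unless the two random salts happen to collide).
        System.out.println(MessageDigest.isEqual(hashA, hashB));

        // Verification re-hashes the candidate with the salt stored for that user.
        System.out.println(MessageDigest.isEqual(hashA, hash(saltA, "HelloWorld"))); // true
    }
}
```

Because each user gets a different salt, a table precomputed for one salt is useless against every other user's hash.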

PBKDF2 algorithm

PBKDF2 stands for Password-Based Key Derivation Function 2. Put simply, PBKDF2 repeats the salted hash computation many times, and the number of iterations is configurable.

The principle of this algorithm is roughly equivalent to adding a random salt to the hash algorithm and performing the hash operation many times. The random salt greatly increases the difficulty of building a rainbow table, while the repeated hashing greatly increases the cost of both building a rainbow table and cracking by brute force.
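
A minimal sketch using the JDK's built-in PBKDF2 support is shown below. The Pbkdf2Demo class, the 100,000 iteration count, the 16-byte salt, and the 256-bit output length are assumptions chosen for illustration, not recommendations from the article.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.spec.InvalidKeySpecException;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class Pbkdf2Demo {
    static final int KEY_LENGTH_BITS = 256; // assumption: 32-byte derived value

    // Derive a value from the password by running the salted HMAC-SHA256 `iterations` times.
    static byte[] derive(char[] password, byte[] salt, int iterations) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, iterations, KEY_LENGTH_BITS);
            SecretKeyFactory factory = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return factory.generateSecret(spec).getEncoded();
        } catch (NoSuchAlgorithmException | InvalidKeySpecException e) {
            throw new IllegalStateException("PBKDF2WithHmacSHA256 is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt); // random per-user salt, stored with the hash

        int iterations = 100_000; // hypothetical work factor; raise it as hardware gets faster
        byte[] stored = derive("HelloWorld".toCharArray(), salt, iterations);

        // Verification repeats the derivation with the stored salt and iteration count.
        byte[] candidate = derive("HelloWorld".toCharArray(), salt, iterations);
        System.out.println(Arrays.equals(stored, candidate)); // true
        System.out.println(stored.length);                    // 32
    }
}
```

The salt and the iteration count are not secret; they are stored with the derived value so that the same computation can be repeated at login.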

If a single computation takes 1 microsecond, then 1 million iterations take 1 second. If attacking a password requires a rainbow table with 10 million entries, building that table takes about 10,000,000 seconds, or roughly 115 days. That is a cost high enough to deter most attackers.



U.S. government agencies have standardized this approach and use it in several government and military systems. The biggest advantage of this scheme is that it is standardized, easy to implement and uses the time-tested SHA algorithm.

There are also many algorithms that can effectively defend against rainbow tables, such as Bcrypt and Scrypt.

bcrypt

Bcrypt is an algorithm designed specifically for password storage. It is based on the Blowfish cipher and was presented by Niels Provos and David Mazières at USENIX in 1999.

Bcrypt incorporates a salt to defend against rainbow table attacks. It is also an adaptive function: the number of iterations can be increased over time, so that it stays resistant to brute-force cracking as computing power grows.

There is also a file-encryption utility named bcrypt, likewise built on Blowfish. Files it encrypts are portable across all supported operating systems and processors. Its passphrase must be 8 to 56 characters long and is converted internally to a 448-bit key, and every character supplied is significant. The stronger the passphrase, the more secure your data is.

Bcrypt has been carefully analyzed by many security experts and is used in the famously secure OpenBSD. It is generally considered better able than PBKDF2 to withstand the risks that come with increased computing power. Bcrypt also has extensive library support, so storing passwords this way is recommended.

Using bcrypt in Java

The source code of a Java implementation, jBCrypt, is available at http://www.mindrot.org/projects/jBCrypt/. In Java, you can hash and verify a password directly as follows:

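import org.mindrot.jbcrypt.BCrypt; // package name as published in the jBCrypt Maven artifact

public class BCryptDemo {
    public static void main(String[] args) {
        String originalPassword = "Talk programming";

        // gensalt(12) creates a random salt with a work factor (log2 of the rounds) of 12
        String generatedSecuredPasswordHash = BCrypt.hashpw(originalPassword, BCrypt.gensalt(12));
        System.out.println(generatedSecuredPasswordHash);

        // checkpw re-hashes the candidate using the salt and cost embedded in the stored hash
        boolean matched = BCrypt.checkpw(originalPassword, generatedSecuredPasswordHash);
        System.out.println(matched); // true
    }
}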

scrypt

Scrypt was developed by noted FreeBSD hacker Colin Percival for his backup service Tarsnap.

It was designed with large-scale custom hardware attacks in mind, so it deliberately requires a lot of memory. Scrypt needs this memory because it generates a large amount of pseudorandom data as the basis of its computation. Once the data is generated, the algorithm reads it in a pseudorandom order to produce the result. The most direct way to run the algorithm is therefore to keep all of that data in memory.

Not only does scrypt take a long time to compute, it also uses a lot of memory, which makes computing many digests in parallel extremely difficult and makes rainbow table and brute-force attacks even harder. Scrypt is not widely used in production environments and lacks careful scrutiny and extensive library support. However, as long as no flaw is found in the algorithm itself, it should be more secure than PBKDF2 and bcrypt.

Using scrypt in Java

A Java implementation of scrypt is available as a library (https://github.com/wg/scrypt) and can be used directly. Usage is also relatively simple:

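import com.lambdaworks.crypto.SCryptUtil;

public class SCryptDemo {
    public static void main(String[] args) {
        String originalPassword = "Talk programming";

        // Arguments are N (CPU/memory cost, a power of two), r (block size) and p (parallelization).
        // 16384, 8, 1 are commonly cited values; a small N such as 16 gives very little protection.
        String generatedSecuredPasswordHash = SCryptUtil.scrypt(originalPassword, 16384, 8, 1);
        System.out.println(generatedSecuredPasswordHash);

        // The correct password matches the stored hash
        boolean matched = SCryptUtil.check("Talk programming", generatedSecuredPasswordHash);
        System.out.println(matched); // true

        // A different password does not match
        matched = SCryptUtil.check("Comic book programming", generatedSecuredPasswordHash);
        System.out.println(matched); // false
    }
}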