Integrity
While data encryption focuses on keeping prying eyes from seeing the original data, data integrity focuses on assuring that the data is accurate and consistent. That assurance must hold through every stage of the data lifecycle: transporting, storing, retrieving, and processing data.
Several tools can be used to ensure data integrity, including hashing algorithms, digital signatures, and file integrity monitoring (FIM).
Hashing Algorithms
A hashing algorithm is a mathematical function that, when applied to data, returns a fixed-size result (the hash) that should be effectively unique to that data. Unlike encryption, whose output can be decrypted back to the original format, hashing is one-way: there is no practical way to recover the original data from the hash. The purpose of a hash isn’t to hide or encrypt the data, but rather to ensure that the data you have received matches the original.
Consider a situation in which you receive a database with sensitive information. Your organization is going to use this information to help make some critical decisions on future products. You received this data from a trusted third-party source, but how can you be certain that a “bad actor” didn’t intercept the data and inject false information?
Your third-party source could run the data through a hashing algorithm and send the resulting hash separately. You could then apply the same hashing algorithm to the data you received and compare the result with the hash from the third party. If they match, and the hash itself reached you through a channel the bad actor could not alter, you know you have unaltered data.
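The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: it assumes SHA-256 (a member of the SHA-2 family) as the agreed algorithm, and the sample data and the function names sha256_hex and matches_published_hash are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_hash(data: bytes, published_hex: str) -> bool:
    """Recompute the hash locally and compare it with the hash sent separately."""
    # compare_digest performs a constant-time comparison of the two strings
    return hmac.compare_digest(sha256_hex(data), published_hex.strip().lower())


# Hypothetical sample data standing in for the database from the third party
original = b"product_id,forecast\n1001,2500\n"
published = sha256_hex(original)  # the value the sender would transmit separately

# The same data after a bad actor has changed one figure in transit
tampered = b"product_id,forecast\n1001,9500\n"

print(matches_published_hash(original, published))  # True
print(matches_published_hash(tampered, published))  # False
```

Changing even one character of the data produces a different hash, which is why the second comparison fails.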
There are many different types of hashing algorithms. Each has specific advantages and disadvantages, but for the CompTIA Cloud+ certification exam, you should be familiar with the names of these algorithms:
MD5
SHA-1
SHA-2
SHA-3
RIPEMD-160
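To make the names above more concrete, the following sketch hashes the same input with one representative of each algorithm using Python's standard hashlib module. The choice of SHA-256 and SHA3-256 as representatives of the SHA-2 and SHA-3 families is an assumption made for illustration. RIPEMD-160 depends on the underlying OpenSSL build, so the sketch skips any algorithm the local installation does not provide.

```python
import hashlib

data = b"abc"

# Names as hashlib spells them. SHA-2 and SHA-3 are families of algorithms,
# so one member of each (an illustrative choice) is shown here.
candidates = ["md5", "sha1", "sha256", "sha3_256", "ripemd160"]

digests = {}
for name in candidates:
    if name not in hashlib.algorithms_available:
        continue  # not offered by this Python/OpenSSL build
    try:
        digests[name] = hashlib.new(name, data).hexdigest()
    except ValueError:
        continue  # listed, but disabled by the local OpenSSL configuration

for name, hex_digest in digests.items():
    # Each hexadecimal character encodes 4 bits of the hash
    print(f"{name:10s} {len(hex_digest) * 4:4d} bits  {hex_digest}")
```

Each algorithm returns a fixed-size result regardless of how much data goes in; only the size and the internal design differ between them.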
Digital Signatures
Suppose a friend sends you a letter. How would you know that it really came from that person? One method is to have your friend add a signature to the bottom of the letter. If you recognize the signature, you can be more certain that it came from your friend.
Digital signatures serve the same purpose but are a bit more complicated to implement. They make use of asymmetric cryptography: the sender computes a hash of the data and encrypts that hash with the private key of the individual or organization, and the result is the signature. The matching public key is made well known through another means. A signature created with the private key can be decrypted only with the matching public key, so if the recipient decrypts the signature and the result matches a hash of the received data, the data came from the correct source and was not altered along the way.
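The private-key and public-key roles can be illustrated with a toy, textbook-style RSA sketch. Everything here is an assumption made for illustration: the key is built from two small, well-known Mersenne primes, there is no padding scheme, and the names sign and verify are hypothetical. It is nowhere near secure; real systems rely on a vetted cryptographic library. The sketch signs a SHA-256 hash of the message rather than the message itself, which is the common practice.

```python
import hashlib

# Toy key pair built from two known Mersenne primes. Far too small and too
# predictable for real use; it only makes the arithmetic visible.
P = 2**127 - 1
Q = 2**107 - 1
N = P * Q                          # modulus, part of both keys
E = 65537                          # public exponent (public key is E and N)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)


def digest_as_int(message: bytes) -> int:
    """Hash the message and map the hash to an integer smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: transform the message hash with the PRIVATE exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Recipient: undo the transform with the PUBLIC exponent and compare."""
    return pow(signature, E, N) == digest_as_int(message)


letter = b"Meet me at noon. -- Your friend"
signature = sign(letter)

print(verify(letter, signature))                                 # genuine letter
print(verify(b"Meet me at midnight. -- Your friend", signature)) # altered letter
```

Only the holder of D can produce a signature that verifies under E, which is what ties the message to its source.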
File Integrity Monitoring (FIM)
In some cases, it is important to determine if data within a file has changed. The process that handles this determination is called file integrity monitoring. With FIM, a checksum is created while the file is in a known good state, called a baseline. This checksum is a value based on the file’s contents at that time and, in some cases, additional file attributes, such as the file owner and permissions.
To determine if a file or a file attribute has been changed, you can take another checksum at a later time and compare it with the original. If the two checksums match, the current file is the same as the original; if they differ, the contents or one of the monitored attributes has changed. This technique can be used to determine if someone has tampered with a key operating system file or a file that has been downloaded from a remote server.
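A baseline-and-recheck cycle might look like the following sketch. It is a simplified illustration rather than how any particular FIM product works: the fingerprint function, the temporary file name, and the choice to fold the mode, owner, and group into a SHA-256 checksum are all assumptions.

```python
import hashlib
import os
import tempfile


def fingerprint(path: str, include_attrs: bool = True) -> str:
    """Checksum of a file's contents and, optionally, selected attributes."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files do not have to fit in memory
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    if include_attrs:
        st = os.stat(path)
        # Fold permissions and ownership into the checksum as well
        h.update(f"|mode={st.st_mode}|uid={st.st_uid}|gid={st.st_gid}".encode())
    return h.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hosts.conf")  # hypothetical monitored file
    with open(path, "w") as f:
        f.write("allow 10.0.0.5\n")

    baseline = fingerprint(path)             # taken while in a known good state

    unchanged = fingerprint(path) == baseline  # later check, nothing modified

    with open(path, "a") as f:               # simulate tampering
        f.write("allow 203.0.113.9\n")
    changed = fingerprint(path) != baseline    # later check after modification

print(unchanged, changed)  # True True
```

In practice the baseline itself must be stored somewhere the attacker cannot modify; otherwise the checksum could be replaced along with the file.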