It is important that an organization can trust that its sensitive data and systems have not been altered. Security teams must plan to check and verify file integrity. A quick way to verify file integrity is to validate a cryptographic checksum: run a hash algorithm against the sensitive file and compare the result with a known-good value.

In this lab, you will learn how to use built-in hashing tools to verify file integrity.

Part 1: Launch an AWS EC2 Instance

Amazon Elastic Compute Cloud (EC2) is a web service that provides easy, resizable compute capacity. It provides you with complete control of your computing resources and lets you run on Amazon’s robust computing environment.

Cloud computing allows you to provision and boot a new server in seconds, which means that you can quickly scale capacity up or down as your computing requirements change. It also allows you to pay only for the capacity that you actually use.

Amazon will bill you for any applicable AWS resources and time used that are not covered in the AWS Free Tier.

  1. Visit Amazon Web Services (AWS). Click on My Account then AWS Management Console. Log in to your account.
  2. In the AWS Management Console page, click Services then EC2. This is the Amazon EC2 dashboard. Click on Launch Instance then Launch Instance.
  3. In Choose an Amazon Machine Image (AMI), scroll through and review the 40 available default AMIs. An Amazon Machine Image (AMI) provides the template information required to launch an instance, which is a virtual server in the cloud. However, not all AMIs are eligible for free tier. Select the free tier eligible Amazon Linux 2 AMI.
  4. In Choose an Instance Type, ensure that the instance type is set to t2.micro, which is Free tier eligible. Click Next: Configure Instance Details.
  5. Review the settings in Configure Instance Details. Note that changing some of these settings may incur additional charges. Click Next: Add Storage.
  6. In Add Storage, launch the instance with the default 8 GB volume. Click Next: Add Tags.
  7. It is best practice to tag your instances. In Add Tags, click Add Tag. Enter “Name” as the Key and “web-server” as the Value. Click Add another tag. Enter “Owner” as the Key and your initials as the Value. Click Next: Configure Security Group.
  8. Click Review and Launch.
  9. In Review Instance Launch, click Launch. Click Choose an existing key pair or create a new one. DO NOT forget to download the key. Click Launch instances.
  10. Click View Instances. The instance will take a few moments to boot and initialize.
  11. Click Connect. Choose a connection method to get access to the instance.

Part 2: Use Built-in Hashing Tools

  1. Create a file called system_file.txt.
    echo "This is a file" > system_file.txt
    
  2. You should now see system_file.txt in your home folder.
    cat system_file.txt
    

    This file will have the contents that we echoed in.

    This is a file

  3. See the MD5 hash of the file.
    md5sum < system_file.txt
    

    The MD5 hash is a fixed-length output of 128 bits. It should be a86e2699931d4f3d1456e79383749e43, which will remain the same as long as the file is unchanged.

  4. See the SHA1 hash of the file.
    sha1sum < system_file.txt
    

    The SHA1 hash is also fixed-length, but notice that at 160 bits it is 4 bytes (8 hexadecimal characters) longer than the MD5 hash. It should be 799c11e348d39f1704022b8354502e2f81f3c037, which will remain the same as long as the file is unchanged.

  5. Open the file in a text editor (for example, nano system_file.txt) and change its contents by adding a period to the end.

    This is a file.

  6. Use Ctrl+X then enter y to close and save the file.
  7. View the new MD5 hash of the edited file.
    md5sum < system_file.txt
    

    It should now be 98475036dc73d318982805bf4b16e8b2.

  8. Get the new SHA1 hash.
    sha1sum < system_file.txt
    

    It should now be d7dff2b1ef48b9c20c23d7b3a08b557957cec3c9.

  9. Change the file back by repeating steps 5 and 6, but this time remove the period you added at the end.

    This is a file

  10. View the newest MD5 hash of the reverted file.
    md5sum < system_file.txt
    

    It should now be the original a86e2699931d4f3d1456e79383749e43.
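
The behavior observed in this part can also be reproduced with Python's standard hashlib module. The sketch below is illustrative and not part of the lab steps. It assumes the file contents are exactly the bytes shown (echo appends a trailing newline), and the helper name describe is hypothetical. It prints the digest lengths and checks that a one-character edit changes both hashes.

```python
import hashlib

def describe(data):
    """Return (md5_hex, sha1_hex) for a byte string (hypothetical helper)."""
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()

# Assumed file contents: echo appends a trailing newline.
original = b"This is a file\n"
edited = b"This is a file.\n"

md5_a, sha1_a = describe(original)
md5_b, sha1_b = describe(edited)

# Each hex character encodes 4 bits of the digest.
print("MD5 hex length:  {}".format(len(md5_a)))   # 32 characters = 128 bits
print("SHA1 hex length: {}".format(len(sha1_a)))  # 40 characters = 160 bits

# A single added character produces a different digest.
print("MD5 changed:  {}".format(md5_a != md5_b))
print("SHA1 changed: {}".format(sha1_a != sha1_b))
```

The digest length depends only on the algorithm, never on the size of the input, which is why the hashes stay the same length before and after the edit.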

Part 3: Automate File Integrity Validation

  1. Create a backup of the system_file.txt from Part 2.
    cp system_file.txt system_file.bak
    
  2. Confirm that you have Python 2.7 installed. It should return Python 2.7.18 or similar.
    python -V
    
  3. Create a hash_checker.py Python file.
    nano hash_checker.py
    
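  4. Enter the Python code below. Review what the script does.
    import hashlib

    def md5(data):
        """Return the hex MD5 digest of a byte string."""
        m = hashlib.md5()
        m.update(data)
        return m.hexdigest()

    def file_md5(path):
        """Return the hex MD5 digest of the file at path."""
        # Open in binary mode so the exact bytes on disk are hashed.
        with open(path, 'rb') as f:
            return md5(f.read())

    f1 = 'system_file.bak'

    print(file_md5(f1))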
    
  5. Save the Python file.
  6. In the terminal, run the script. It should print the same 128-bit MD5 hash as before, a86e2699931d4f3d1456e79383749e43, which will remain the same as long as the file is unchanged.
    python hash_checker.py
    
  7. Alter the script to check the system_file.txt file. Running it should still produce the same MD5 hash.
  8. Edit the system_file.txt file.
    nano system_file.txt
    
  9. Change the file’s contents by adding a period to the end.

    This is a file.

  10. Close and save the file. Use Ctrl+X then enter y to confirm that yes, you do want to save the modifications.
  11. Run the script again. Because the file has changed, the MD5 hash should no longer match the original.
  12. Run the diff command to find the difference between the two files. This confirms that the contents of the two files differ.
    diff system_file.bak system_file.txt
    
  13. Alter the script to compare the MD5 hashes of system_file.txt and system_file.bak. The script should print a message stating whether or not the files match.
  14. Please write up a paragraph answering the following questions.
    1. Which of the two hash functions used today is more likely to produce a collision? Why?
    2. How could a security professional use hashing as a form of file integrity validation for important system files?
    3. What are some other practical uses for hashlib? Be specific and descriptive.
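
As a starting point for thinking about automated validation, here is a minimal sketch of one possible approach, not a reference solution. The function names file_digest and files_match, the chunk size, and the choice of SHA-256 as the default algorithm are all assumptions made for illustration. Reading in chunks avoids loading a large file into memory at once.

```python
import hashlib

def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in fixed-size chunks.

    The default algorithm and chunk size are illustrative assumptions.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def files_match(path_a, path_b, algorithm="sha256"):
    """Return True when both files produce the same digest."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

For example, files_match("system_file.bak", "system_file.txt") would return False while the period is still present in system_file.txt, and passing algorithm="md5" would compare MD5 digests instead.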

More Info