Hack This Site - Realistic - ToxiCo Industrial Chemicals

Wed, May 12, 21

Welcome back to my walkthrough of hackthissite.org’s CTF missions. I will be going through my thought process of how I solved these missions, and therefore also giving away the solutions. If you came across this to give you hints, watch out for spoilers! Good luck, have fun.

Our next realistic mission is to decode an encrypted message in which the encryption algorithm used is the ‘XECryption algorithm’. The first thing that I will do is save the encrypted text onto my computer so that I can easily analyze it. The first thing that I instantly notice when looking at the ciphertext is that there is an equal number of ‘.’ as there are numbers. Secondly, it is noteable that each number is a 3 digit number. Thirdly, there seems to be a range in which the 3 digit numbers fall into. A simple python script can output the min & max values that are found within this ciphertext, which gives a range of [210, 350]. Keeping these observations in mind, we will move onto playing around with the cipher to try and determine the algorithm.

Using no ‘encryption password’ as they designate it, a plaintext of a results with the ciphertext of .6.26.65. Something that instantly stands out is that 6+26+65=97 and 97 is the decimal designation for the a ASCII character. A plaintext of b results with the ciphertext of .43.41.14 which, when added, results in the decimal representation of b as an ASCII character. Another thing to note is that repeatedly encrypting the same character would give different results, but their sum would always match with the decimal representation of that character in ASCII.

Now, encrypting a with the encryption password 1 results in the following ciphertext: .53.36.57, the sum of which is 146. It is important to note that the decimal representation of the ASCII character 1 is 49, and since a is 97 we can make the conclusion that the encryption algorithm is potentially:

ciphertext = ""
for char in plaintext:
    ciphertext += split_in_three(ASCII_decimal_rep(char) + sum_ASCII_decimal_rep(encryption password))

To double check this hypothesis, encrypting a with the password 12 results with the ciphertext .43.45.108, the sum of which is 196. The ASCII decimal representation of 2 is 50, and the sum of the decimal representation of {a, 1, 2} is indeed 196 as expected by our hypothesis. Lastly, encrypting ab with the password 12 results with the ciphertext .73.72.51.95.66.36. Splitting this into two sections of 3, gives the two sums of: 196 and 197. Thus, with these test cases passing, we can conlude that the pseudocode above is indeed the encryption algorithm.

Now, we must try and figure out how we would decipher ciphertext without the correct encryption password. The simplest would be to discover the encryption password that was used and simply decrpt the message as normal. It is important to note that for each plaintext character, the offset for it’s ASCII decimal representation is the same across all characters. Here are some possible ways to approach this:

  1. Brute Force through all possible encryption passwords.
    • we know that the original plaintext would most likely fall within the decimal representation of [32,127]. So we would have a small possible search space.
  2. Statistical Analysis
    • we know that the letter e is the most common letter within the english language. Thus, we can do a count of the most common triplet sum and figure out the offset to the letter e and decrypt using that offset. Otherwise we know that the ‘space’ is a very common character as well.
  3. Turn our ciphertext into their respective triplet sum and look for the largest difference between the min & max & try to match to the difference between the possible ASCII characters.

All of these would eventually get you the correct answer. I will do the brute force option as it is a small search space, where for each possibility I will write the decrypted text to disk, but, for each one, I will do a find for the word ‘the’ and keep that in memory and output any that had that word. This will make looking through these possible plaintext files a lot simpler as ‘the’ is the most common word in the english language and I expect it to be within the text at least once.

The brute force code that I came up with can be seen below:

import os

def ciphertextToDec():
  '''
  Returns a list where each index is the decimal rep 
  of a ciphered ASCII character
  '''
  # read in ciphertext to memory
  with open("ciphertext.txt", 'r', encoding="utf-8") as f:
      ciph = f.read()
  # turn into in list
  ciph.replace('\n', '')
  arr = [int(num) for num in ciph.split('.')[1:]]
  # compute shifted value of each char
  i = 0
  res = []
  while i in range(len(arr)):
      res.append(arr[i] + arr[i+1] + arr[i+2])
      i += 3
  return res

def bruteforce(decCipher):
  '''
  Brute forces the possible search space.
  Returns list of tries that could possibly match (expect only 1)
  '''
  # create a directory for the output files
  if not os.path.exists("bruteforce"):
      os.makedirs("bruteforce")  
 
  """
  Take some middle index, we assume that it should be within the range 
  [32,127] as these are the printable characters.
  We then minus lower & upper range of printable chars to get possible
  shifts that were used on the original plaintext.
  """
  midInd = len(decCipher)//2
  rng = range(decCipher[midInd] - 127, decCipher[midInd] - 32 + 1)
  possibleKeys = []
  # for each possibly key, create candidate plaintext & write to file
  for key in rng:
      plntxt = ""
      for num in decCipher:
          if (num-key) in range(0, 0x10ffff):
              plntxt += chr(num - key)
      with open(f'bruteforce/key{key}.txt', 'w') as f:
          f.write(plntxt)
      # check if text has ' the ' to limit manual search later
      theIndex = plntxt.find(" the ")
      if theIndex != -1:
          possibleKeys.append(key)
  return possibleKeys
     

if __name__ == "__main__":
   # get array of decimal rep. of each ciphertext char
   decCipher = ciphertextToDec()
   # brute force & get output of array w/ tries that had the word 'the'
   possibleKeys = bruteforce(decCipher)
   print(possibleKeys)

This indeed did gives me the shift that the encryption password sums too: 762. The text that is required to submit can easily be copied from the created key762.txt file that my brute forcer created. Thus, we have completed another challenge!