Computer-Related Encodings¶
This section introduces some computer-related encodings.
Alphabet Encoding¶
- A-Z/a-z correspond to 1-26 or 0-25
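As a rough illustration of this mapping, the sketch below converts letters to alphabet positions and back. The helper names `letters_to_numbers` and `numbers_to_letters` and the `start` parameter are made up for this example; `start=1` gives 1-26 and `start=0` gives 0-25.

```python
def letters_to_numbers(text, start=1):
    """Map each ASCII letter to its alphabet position (case-insensitive)."""
    return [ord(c.lower()) - ord('a') + start
            for c in text if c.isascii() and c.isalpha()]


def numbers_to_letters(numbers, start=1):
    """Inverse mapping; always returns lowercase letters."""
    return ''.join(chr(n - start + ord('a')) for n in numbers)


print(letters_to_numbers("Flag"))            # [6, 12, 1, 7]
print(numbers_to_letters([5, 11, 0, 6], 0))  # flag
```

Case is lost in the round trip, so a challenge that distinguishes upper and lower case needs extra handling.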
ASCII Encoding¶

Characteristics¶
In practice, we mostly work with the visible (printable) ASCII characters, mainly the following ranges of code values:
- 0-9, 48-57
- A-Z, 65-90
- a-z, 97-122
Variations¶
Binary Encoding¶
Convert the ASCII code values to their binary representation.
- Contains only 0 and 1
- No more than 8 bits per character; 7 bits is also common, since ASCII code values only go up to 127.
- This is essentially another form of ASCII encoding.
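The conversion can be sketched in a few lines of Python. The function names `to_binary` and `from_binary` and the `width` parameter are assumptions for this illustration; `width=7` covers the 7-bit variant mentioned above.

```python
def to_binary(text, width=8):
    """Encode each character as a fixed-width binary string of its ASCII value."""
    return ' '.join(format(ord(c), f'0{width}b') for c in text)


def from_binary(bits, width=8):
    """Decode a string of 0s and 1s (whitespace ignored) in groups of `width` bits."""
    bits = ''.join(bits.split())
    return ''.join(chr(int(bits[i:i + width], 2))
                   for i in range(0, len(bits), width))


print(to_binary("Hi"))                  # 01001000 01101001
print(to_binary("Hi", width=7))         # 1001000 1101001
print(from_binary("0110011001101100"))  # fl
```

When a challenge gives an unbroken bit string, trying both widths is usually the quickest way to find out which one was used.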
Hexadecimal Encoding¶
Convert the ASCII code values to their hexadecimal representation.
- A-Z→0x41~0x5A
- a-z→0x61~0x7A
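A minimal Python sketch of the hexadecimal form, using only built-in `bytes` methods. The wrapper names `to_hex` and `from_hex` are hypothetical, and the input is assumed to be plain ASCII.

```python
def to_hex(text):
    """Return the hexadecimal representation of an ASCII string."""
    return text.encode('ascii').hex()


def from_hex(hex_string):
    """Decode hex digits back to text; '0x' prefixes and spaces are ignored."""
    cleaned = hex_string.replace('0x', '').replace(' ', '')
    return bytes.fromhex(cleaned).decode('ascii')


print(to_hex("AZaz"))                   # 415a617a
print(from_hex("0x66 0x6c 0x61 0x67"))  # flag
```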
Tools¶
- jpk, ascii to number, number to ascii
- http://www.ab126.com/goju/1711.html
Examples¶

2018 DEFCON Quals ghettohackers: Throwback¶
The challenge description is as follows:
Anyo!e!howouldsacrificepo!icyforexecu!!onspeedthink!securityisacomm!ditytop!urintoasy!tem!
The first instinct is to fill in the content corresponding to the exclamation marks to get the flag, but after filling them in it doesn't work. So we can split the source string by !, where a string length of 1 corresponds to the letter a, length 2 corresponds to letter b, and so on:
ori = 'Anyo!e!howouldsacrificepo!icyforexecu!!onspeedthink!securityisacomm!ditytop!urintoasy!tem!'
sp = ori.split('!')
# A segment of length n maps to the n-th lowercase letter (1 -> 'a'); an empty segment yields '`'
print(repr(''.join(chr(97 + len(s) - 1) for s in sp)))
This gives us the result. The empty segments (length 0) come out as a backtick, and we need to assume that they represent a space, since this makes the original text readable:
dark logic
Challenges¶
- Jarvis-basic - German Military Cipher
Base Encoding¶
The "xx" in base xx is the number of characters in the encoding alphabet. Base64, for example, encodes data with 64 printable characters. Since 2 to the power of 6 equals 64, every 6 bits form a unit that maps to one printable character. 3 bytes have 24 bits, which correspond to 4 Base64 units, so every 3 bytes are represented by 4 printable characters. Base64 can be used as a transfer encoding for email. Its alphabet consists of the letters A-Z and a-z and the digits 0-9, 62 characters in total, plus two additional printable symbols that vary across different systems.

For more details, see Base64 - Wikipedia.
Encoding "man"
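The 6-bit grouping can be reproduced by hand and checked against Python's standard `base64` module. This is only a sketch: the function name `b64encode_manual` is invented for the illustration, and the alphabet assumed here is the standard one ending in `+` and `/`.

```python
import base64

B64_ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "abcdefghijklmnopqrstuvwxyz"
                "0123456789+/")


def b64encode_manual(data: bytes) -> str:
    """Encode bytes by splitting the bit stream into 6-bit groups."""
    bits = ''.join(format(byte, '08b') for byte in data)
    # Pad with zero bits so the length is a multiple of 6.
    bits += '0' * (-len(bits) % 6)
    encoded = ''.join(B64_ALPHABET[int(bits[i:i + 6], 2)]
                      for i in range(0, len(bits), 6))
    # Pad with '=' so the output length is a multiple of 4.
    return encoded + '=' * (-len(encoded) % 4)


print(b64encode_manual(b"man"))  # bWFu
assert b64encode_manual(b"man") == base64.b64encode(b"man").decode()
```

The three bytes of "man" give the bit groups 011011, 010110, 000101 and 101110, that is, the indices 27, 22, 5 and 46, which map to `b`, `W`, `F` and `u`.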

If the number of bytes to be encoded is not divisible by 3, there will be 1 or 2 extra bytes remaining. These can be handled as follows: first pad the end with zero values to make it divisible by 3, then perform base64 encoding. One or two = signs are appended after the encoded base64 text to represent the number of padded bytes. That is, when there is one remaining byte, the last 6-bit base64 block has four zero-value bits, and two equals signs are appended; when there are two remaining bytes, the last 6-bit base64 block has two zero-value bits, and one equals sign is appended. Refer to the table below:

Since the padded zeros do not participate in the computation during decoding, information can be hidden at these positions.
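A small check of this property with Python's `base64` module: the encoder always writes zeros into the unused bits, but the default (non-strict) decoder discards whatever is there, so two different strings decode to the same byte. The helper `hidden_bits` is a hypothetical name used only in this sketch, and the standard alphabet is assumed.

```python
import base64

B64_ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "abcdefghijklmnopqrstuvwxyz"
                "0123456789+/")


def hidden_bits(encoded: str) -> str:
    """Return the unused low bits of the last symbol before the '=' padding."""
    pad = encoded.count('=')
    if pad == 0:
        return ''
    last = encoded.rstrip('=')[-1]
    bits = format(B64_ALPHABET.index(last), '06b')
    # '==' leaves 4 spare bits, '=' leaves 2.
    return bits[-2 * pad:]


print(base64.b64encode(b"M"))    # b'TQ=='
print(base64.b64decode("TQ=="))  # b'M'
print(base64.b64decode("TR=="))  # b'M' as well
print(hidden_bits("TQ=="), hidden_bits("TR=="))  # 0000 0001
```

Each padded line can therefore carry 2 or 4 hidden bits, which is why such challenges usually consist of many short base64 lines.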
Similar to base64, base32 uses 32 visible characters for encoding. Since 2 to the power of 5 equals 32, every 5 bits form one group. 5 bytes equal 40 bits, which correspond to 8 base32 groups, so every 5 bytes are represented by 8 base32 characters. If fewer than 5 bytes remain, the last group that does not have a full 5 bits is padded with zeros to complete 5 bits, and the rest of the 8-character block is filled with "=". Therefore, base32 output can end with at most 6 equals signs. For example:
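The padding counts can be checked with `base64.b32encode` from the Python standard library. The helper name `b32_padding_count` is an assumption made for this sketch.

```python
import base64


def b32_padding_count(data: bytes) -> int:
    """Number of '=' characters that base32 appends for this input."""
    return base64.b32encode(data).decode().count('=')


# Inputs of 1 to 5 bytes cover every possible padding length.
for n in range(1, 6):
    data = b"a" * n
    print(n, base64.b32encode(data).decode(), b32_padding_count(data))
```

A single byte yields `ME======`, the maximum of 6 equals signs, while a full 5-byte block needs no padding.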

Characteristics¶
- base64 may end with = signs, but at most 2
- base32 may end with = signs, but at most 6
- The character set varies depending on the base type
- You may need to add equals signs yourself
- = may also appear as 3D, its hexadecimal ASCII value (for example %3D in URL encoding)
- For more details, see base rfc
Tools¶
- http://www1.tc711.com/tool/BASE64.htm
- Python library functions
- Script for reading steganographic information
Examples¶
For the challenge description, see the data.txt file in the misc category base64-stego directory of ctf-challenge.
Use the script to read the steganographic information.
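def deStego(stegoFile):
    b64table = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    message = ""
    with open(stegoFile, 'r') as stegoText:
        for line in stegoText:
            line = line.strip()
            try:
                # Keep the last data character plus the trailing '=' padding.
                text = line[line.index("=") - 1:]
            except ValueError:
                # No padding on this line, so it carries no hidden bits.
                continue
            # '==' leaves 4 unused bits in the last character, '=' leaves 2.
            bits = bin(b64table.find(text[0]))[2:].zfill(6)
            message += bits[2:] if text.count('=') == 2 else bits[4:]
    # Regroup the collected bits into 8-bit characters.
    return "".join([chr(int(message[i:i+8], 2)) for i in range(0, len(message), 8)])

print(deStego("text.txt"))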
Output:
flag{BASE64_i5_amaz1ng}
Challenges¶
Huffman Coding¶
See Huffman Coding.
XXencoding¶
XXencode encodes input text in units of three bytes. If the remaining data at the end is less than three bytes, the missing part is padded with zeros. These three bytes have 24 bits in total, which are divided into 4 groups of 6 bits each. The decimal value of each group falls between 0 and 63. Each value is replaced by the character at the corresponding position.
          1         2         3         4         5         6
0123456789012345678901234567890123456789012345678901234567890123
|         |         |         |         |         |         |
+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
For more information, see Wikipedia
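The 3-byte to 4-character mapping described above can be sketched as follows. This covers only the character mapping: a complete XXencoded file also carries a length character at the start of each line and begin/end lines, which are left out here. The names `xx_encode_body` and `xx_decode_body` are hypothetical.

```python
XX_ALPHABET = "+-0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def xx_encode_body(data: bytes) -> str:
    """Map each 3-byte block to 4 XXencode characters (no line-length prefix)."""
    # Pad with zero bytes so the length is a multiple of 3.
    data += b"\x00" * (-len(data) % 3)
    out = []
    for i in range(0, len(data), 3):
        block = int.from_bytes(data[i:i + 3], "big")  # 24-bit integer
        for shift in (18, 12, 6, 0):
            out.append(XX_ALPHABET[(block >> shift) & 0x3F])
    return "".join(out)


def xx_decode_body(text: str) -> bytes:
    """Inverse mapping; assumes len(text) is a multiple of 4 and keeps zero padding."""
    out = bytearray()
    for i in range(0, len(text), 4):
        block = 0
        for ch in text[i:i + 4]:
            block = (block << 6) | XX_ALPHABET.index(ch)
        out += block.to_bytes(3, "big")
    return bytes(out)


print(xx_encode_body(b"Cat"))  # Eq3o
print(xx_decode_body("Eq3o"))  # b'Cat'
```

Because the decoder keeps the zero padding bytes, the original length has to come from elsewhere, which is the job of the per-line length character in the full format.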
Characteristics¶
- Contains only digits, uppercase and lowercase letters, plus signs, and minus signs
Tools¶
Challenges¶
URL Encoding¶
Characteristics¶
- A large number of percent signs
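Percent-encoded strings can be produced and decoded with `urllib.parse` from the Python standard library. The helper `percent_encode_all` is a made-up name for this sketch; it encodes every byte, which is how challenge text often looks, whereas `quote` leaves letters and digits untouched.

```python
from urllib.parse import quote, unquote


def percent_encode_all(text: str) -> str:
    """Percent-encode every byte, even characters that are normally left as-is."""
    return ''.join(f'%{b:02X}' for b in text.encode('utf-8'))


print(quote("a b&c=d"))            # a%20b%26c%3Dd
print(unquote("%66%6C%61%67"))     # flag
print(percent_encode_all("flag"))  # %66%6C%61%67
```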
Tools¶
Challenges¶
Unicode Encoding¶
See Unicode - Wikipedia.
Note that it has four representation forms.
Examples¶
Source text: The
&#x [Hex]: &#x0054;&#x0068;&#x0065;
&# [Decimal]: &#84;&#104;&#101;
\U [Hex]: \U0054\U0068\U0065
\U+ [Hex]: \U+0054\U+0068\U+0065
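The four forms can be generated from the code points with a short Python sketch. The function name `unicode_forms` and the 4-digit zero padding are assumptions chosen to match the `\U` examples above; `html.unescape` from the standard library decodes the two `&#` forms.

```python
import html


def unicode_forms(text):
    """Build the four representation forms from each character's code point."""
    return {
        "&#x": "".join(f"&#x{ord(c):04X};" for c in text),
        "&#": "".join(f"&#{ord(c)};" for c in text),
        "\\U": "".join(f"\\U{ord(c):04X}" for c in text),
        "\\U+": "".join(f"\\U+{ord(c):04X}" for c in text),
    }


for name, value in unicode_forms("The").items():
    print(name, value)

# The HTML entity forms decode back to the source text.
print(html.unescape("&#x0054;&#x0068;&#x0065;"))  # The
print(html.unescape("&#84;&#104;&#101;"))         # The
```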