XORing Strings in the Twenty-First Century
Statically encoding strings with runtime decryption
String-reliant detection systems, whether anti-cheat engines or just the novice
reverse engineer, can often be sidestepped by XORing strings. This is far from a
new technique and truly only serves to defeat naïve static analysis; not only
can the ciphertext be trivially decoded by hand, but the plaintext sits as clear
as day in memory after first use. Rather, this is a way to escape signature
scans of, e.g., .rodata.
Such techniques have been employed for quite a while, mainly in unsavory programs such as malware and game cheats. Traditionally, this involved a pseudo-preprocessor build step that would replace wrapped strings with encoded equivalents and include a runtime decoding function. With the advent of C++11 and modern template intricacies, however, it is now possible to carry out this encoding process at compile time without any external tooling.
First, we can generate ourselves a single-byte key to XOR against. Here, it
would be best to rely on a variable, somewhat random source. I have encountered
implementations that seed using bytes of the __TIME__ preprocessor macro, but
for a little more work we can do a bit better. Using CMake, we are able to
generate a string of random hexadecimal digits and feed it into the
preprocessor. We now have, in our header file,
const uint8_t xor_key = _XOR_KEY_;
And in CMakeLists.txt,
string(RANDOM LENGTH 2 ALPHABET 0123456789ABCDEF XOR_KEY)
add_definitions(-D_XOR_KEY_=0x${XOR_KEY})
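For comparison, the __TIME__-seeded variant mentioned above could look something like the following sketch. The helper name time_seeded_key and the choice to fold seconds-since-midnight into one byte are assumptions made for illustration, not taken from any particular implementation.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the original header): folds the
// "hh:mm:ss" text of __TIME__ into a single byte at compile time.
constexpr std::uint8_t time_seeded_key() {
    return static_cast<std::uint8_t>(
        ((__TIME__[0] - '0') * 10 + (__TIME__[1] - '0')) * 3600 +  // hours
        ((__TIME__[3] - '0') * 10 + (__TIME__[4] - '0')) * 60 +    // minutes
        ((__TIME__[6] - '0') * 10 + (__TIME__[7] - '0')));         // seconds
}

// Evaluated by the compiler; the value changes whenever this translation
// unit is rebuilt at a different second.
constexpr std::uint8_t time_key = time_seeded_key();
```

The obvious weakness is that the key is tied to the build timestamp, which is why a randomly generated definition is the nicer option.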
Regardless of how the key is chosen, we must craft a container for our encoded
data. In order to be able to iterate over a string and apply a XOR operation to
each character at compile time, we need some sort of incrementing index. This
can be achieved by emulating a loop using the C++11 feature of parameter
packs. For a null-terminated string of byte length n+1, n integer template
parameters can be passed, starting at 0 and incrementing to n-1. Prior to
C++14, this had to be done in an even more ridiculous manner, but the
std::index_sequence template now eases our pain. We can then use
std::make_index_sequence to produce such a sequence to match the length of our
string.
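To see what such a sequence expands to, here is a small standalone sketch. The helper indices_of is hypothetical and exists only to make the pack visible; it assumes a C++14 compiler.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// Hypothetical helper: materializes the indices carried by an
// index_sequence so the pack expansion can be inspected.
template <std::size_t... Is>
constexpr std::array<std::size_t, sizeof...(Is)>
indices_of(std::index_sequence<Is...>) {
    return {{Is...}};
}

// "abc" occupies 4 bytes including the terminator, so n = sizeof("abc") - 1.
constexpr auto idx = indices_of(std::make_index_sequence<sizeof("abc") - 1>{});

static_assert(idx.size() == 3, "one index per character");
static_assert(idx[0] == 0 && idx[1] == 1 && idx[2] == 2, "runs from 0 to n-1");
```

The xor_string class that follows takes the same kind of sequence as its template argument.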
template <typename Is>
class xor_string;
template <size_t... Is>
class xor_string<std::index_sequence<Is...>> {
For efficiency, we can store the ciphertext directly in a field and perform the decoding process right in the very same memory. Not only does this speed up future access attempts, but no excess memory is allocated either.
bool decrypted = false;
char ciphertext[sizeof...(Is)+1];
The class constructor is the magic responsible for actually XORing the string
at compilation time. This is made possible once again through parameter
packs, this time by expanding the pack of indices. Together with list
initialization, ciphertext can be populated with the “encrypted” bytes as
if done at runtime inside of an explicit loop. Static initialization only
introduces a minor binary overhead compared to traditional string literal
storage.
public:
constexpr xor_string(char const * const str) noexcept
: ciphertext{ static_cast<char>(str[Is] ^ (xor_key + Is))... } {}
Finally, we have the meat of the approach: the decoding/decryption process. There’s nothing underhanded about this, and if anything, the lack of dense modern C++ functionality makes this routine stand out. If the message has already been decoded, then it is returned from memory. Otherwise, the XOR is carried out in place on the ciphertext.
char const *decrypt() {
if (decrypted) {
return ciphertext;
}
for (size_t i = 0; i < sizeof...(Is); i++) {
// + binds tighter than ^, so each byte is XORed with (xor_key + i)
ciphertext[i] = static_cast<char>(ciphertext[i] ^ (xor_key + i));
}
ciphertext[sizeof...(Is)] = '\0';
decrypted = true;
return ciphertext;
}
};
Unfortunately, this all leaves us with a snake of an expression we have to
wrangle every time we use a string. Not only does an instance of xor_string
have to be created, but we must also create a std::index_sequence to match
the length of the unterminated string. To keep all of that nasty template
hacking out of sight, we can rely on the good ol' preprocessor.
#define $(str) xor_string<std::make_index_sequence<sizeof(str) - 1>>(str).decrypt()
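Putting the fragments together, a minimal self-contained sketch might look as follows. It assumes a fixed key of 0x55 in place of the CMake-generated one, and it names the macro XS rather than $ (a hypothetical substitute, since $ in identifiers is a compiler extension rather than standard C++). The functions greeting and check_round_trip are likewise illustrative names. Note that decrypt() returns a pointer into a temporary object, so the sketch copies the text into a std::string before the full expression ends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Assumption: a fixed key stands in for the CMake-generated _XOR_KEY_.
constexpr std::uint8_t xor_key = 0x55;

template <typename Is>
class xor_string;

template <std::size_t... Is>
class xor_string<std::index_sequence<Is...>> {
    bool decrypted = false;
    char ciphertext[sizeof...(Is) + 1];

public:
    // Pack expansion emits one XORed initializer per character; the final
    // array element has no initializer and is therefore zero.
    constexpr xor_string(char const *const str) noexcept
        : ciphertext{static_cast<char>(str[Is] ^ (xor_key + Is))...} {}

    // Decodes in place on first use and returns the same buffer afterwards.
    char const *decrypt() {
        if (!decrypted) {
            for (std::size_t i = 0; i < sizeof...(Is); i++) {
                ciphertext[i] = static_cast<char>(ciphertext[i] ^ (xor_key + i));
            }
            ciphertext[sizeof...(Is)] = '\0';
            decrypted = true;
        }
        return ciphertext;
    }
};

// XS is a hypothetical stand-in for the $ macro.
#define XS(str) \
    xor_string<std::make_index_sequence<sizeof(str) - 1>>(str).decrypt()

// The temporary xor_string dies at the end of the full expression, so the
// decoded text is copied out before that happens.
inline std::string greeting() {
    return std::string(XS("Hello, world!"));
}

// Usage: the decoded copy matches the original literal.
inline void check_round_trip() {
    assert(greeting() == "Hello, world!");
}
```

Storing the raw pointer returned by the macro in a variable and reading it in a later statement would leave it dangling, which is one reason to pass the result straight into whatever consumes it.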
It can also prove useful to combine this macro with a string formatting utility such as fmtlib. Squint a little bit, and we have ourselves one of those fancy .NET features.
#define $(str, ...) fmt::format(xor_string<std::make_index_sequence<sizeof(str) - 1>>(str).decrypt(), __VA_ARGS__)
// ...
std::cout << $("{} is {}", "foo", 12) << std::endl; // "foo is 12"
So, what’s the output look like? After all, our efforts would be in vain if the compiler still sneaks a copy of our string inside somewhere. Let’s compile a snippet under MSVC to verify:
std::cout << $("Hello, world!") << std::endl;
Luckily, our Release mode binary[1] is clear of any signs of our original
message, holding only our compile-time encoded string and some inlined
code from our decrypt method. Keeping in mind that we weren’t intending to
fool any human adversaries, the simplicity of the decryption routine should
contribute a negligible overhead, especially when strings are utilized only
once.
mov DWORD PTR [ebp], 0x3b331d00 ; load our encoded string
mov DWORD PTR [ebp+4], 0x7b763634
mov DWORD PTR [ebp+8], 0x332c322b
mov WORD PTR [ebp+12], 0x4004
mov BYTE PTR [ebp+14], 0x00
xor eax, eax ; eax ← 0
loop:
lea ecx, DWORD PTR [eax+0x55] ; ecx ← 0x55 + eax
xor BYTE PTR [ebp+eax+001], cl ; ebp[eax+1] ← ebp[eax+1] ^ ecx
inc eax ; eax ← eax + 1
cmp eax, 0xD ; if eax < 13:
jb SHORT loop ; goto loop
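As a sanity check on the listing above, the immediates can be reproduced by hand. The lea instruction suggests that this particular build used a key of 0x55; under that assumption, XORing byte i of "Hello, world!" with 0x55 + i should give the bytes stored from [ebp+1] onward (the byte at [ebp] being the decrypted flag). The helpers encode and encode_impl are hypothetical and simply mirror the constructor's arithmetic.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper: XORs byte i of the message with (key + i),
// truncated to 8 bits, via the same pack expansion the class uses.
template <std::size_t N, std::size_t... Is>
constexpr std::array<std::uint8_t, sizeof...(Is)>
encode_impl(char const (&msg)[N], std::uint8_t key, std::index_sequence<Is...>) {
    return {{static_cast<std::uint8_t>(msg[Is] ^ (key + Is))...}};
}

// Encodes every character except the terminating null.
template <std::size_t N>
constexpr std::array<std::uint8_t, N - 1>
encode(char const (&msg)[N], std::uint8_t key) {
    return encode_impl(msg, key, std::make_index_sequence<N - 1>{});
}

constexpr auto bytes = encode("Hello, world!", 0x55);

// 0x3b331d00 is little-endian: 00 (flag), then 1d 33 3b.
static_assert(bytes[0] == 0x1d && bytes[1] == 0x33 && bytes[2] == 0x3b,
              "matches the first DWORD");
// 0x7b763634 -> 34 36 76 7b
static_assert(bytes[3] == 0x34 && bytes[4] == 0x36 &&
              bytes[5] == 0x76 && bytes[6] == 0x7b,
              "matches the second DWORD");
// 0x332c322b -> 2b 32 2c 33
static_assert(bytes[7] == 0x2b && bytes[8] == 0x32 &&
              bytes[9] == 0x2c && bytes[10] == 0x33,
              "matches the third DWORD");
// 0x4004 -> 04 40
static_assert(bytes[11] == 0x04 && bytes[12] == 0x40, "matches the WORD");
```

All thirteen encoded bytes line up with the mov instructions, and the trailing 0x00 at [ebp+14] is the terminator slot.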
[1] This is a must: compiling with debug symbols attached may leave our original string intact to ease the debugging process.