Unpacking RC4 Encrypted Malware - REvil ransomware
REvil sample | https://bazaar.abuse.ch/sample/329983dc2a23bd951b24780947cb9a6ae3fb80d5ef546e8538dfd9459b176483/ |
SHA256 | 329983dc2a23bd951b24780947cb9a6ae3fb80d5ef546e8538dfd9459b176483 |
The sample contains the REvil malware, also known as Sodinokibi. REvil was a Russia-based or Russian-speaking private ransomware-as-a-service (RaaS) operation.
Our objectives today are to unpack the sample (if needed) and find its configuration data, so let’s debug everything!
Is there a packer here?
First, we need to do a basic analysis of the revil.bin file to see whether anything is out of the ordinary with:
Sections
Okay, there is a very big encrypted section here: enc. You can see it through PE-BEAR.
Library
Next, the lack of imported libraries can also be a sign of packing:
Entropy
Using Detect It Easy, we find that the enc section is flagged as packed because of its very high entropy.
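To make the entropy check less of a black box, here is a minimal sketch of the Shannon entropy calculation on a byte buffer. It is not the code Detect It Easy uses; the function name `shannon_entropy` is my own, and I am only assuming that values close to 8 bits per byte are what such tools treat as packed or encrypted.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how many times each byte value appears
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Run on the exported enc section, the result should be close to 8, while a normal code section usually scores noticeably lower.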
There is definitely something strange here. The enc section is clearly encrypted, and there are 2 ways to find the algorithm behind it: manually and with CAPA!
REvil is a very well-known malware, so we know that it uses RC4 encryption on the enc section.
A little about RC4
The RC4 algorithm has 3 main parts:
- KSA → Key-Scheduling-Algorithm
- PRGA → Pseudo-Random Generation Algorithm
- XOR Stage
To find a trace of the RC4 algorithm in a malware, we need to search for the magic number 0x100 → 256. This number shows up in both the KSA and PRGA stages because RC4 builds a 256-byte substitution box (KSA), then keeps swapping its entries modulo 256 while generating the keystream (PRGA).
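The three stages can be sketched in a few lines of Python. This is a textbook RC4, not code lifted from the sample; the names `rc4_ksa` and `rc4_crypt` are mine. Notice how 256 (0x100) appears in both functions, which is exactly what we will hunt for in the disassembly.

```python
def rc4_ksa(key: bytes) -> list:
    """KSA: build the 256-entry substitution box from the key."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    return s


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """PRGA + XOR: generate the keystream and XOR it with the data."""
    s = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        keystream_byte = s[(s[i] + s[j]) % 256]
        out.append(byte ^ keystream_byte)  # the XOR stage
    return bytes(out)
```

Because the last stage is a plain XOR, the same function encrypts and decrypts: calling `rc4_crypt` twice with the same key gives back the original data.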
Deep dive into the manual search
I’m going to use IDA Free here without a decompiler (because I haven’t installed the plugin yet).
As we know, the magic number to search for is 0x100, so after launching IDA and loading the binary, press ALT + I and search for all the occurrences of 0x100.
Two functions are found: sub_40100E and sub_40110B!
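If you don’t have IDA at hand, a rough equivalent of this search can be sketched in Python by looking for 0x100 encoded as a little-endian 32-bit immediate. This is only an approximation of what IDA does: it works on raw bytes, returns file offsets rather than virtual addresses, and can report false positives. The helper name `find_imm32` is hypothetical.

```python
import struct


def find_imm32(data: bytes, value: int = 0x100) -> list:
    """Return every offset where `value` appears as a little-endian dword."""
    needle = struct.pack("<I", value)  # 0x100 -> 00 01 00 00
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits
```

Each hit is the offset of the immediate itself, so the opcode of an instruction like mov edx, 100h sits one byte earlier.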
Next, if we look at the start of the program, we see that the first function called is sub_40110B!
Press ENTER on sub_40110B. At 1, we see 100h being moved into the EDX register. It will be used in sub_401000 (2) to create the substitution table. Let’s rename sub_401000 to KSA!
In 3, we have 2 memory locations that hold the parts of the key! Because of the endianness, we need to convert them before using them later. In sub_40110B, let’s rename xmm0 to RC4_key!
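The endianness conversion can be sketched as follows, assuming the key constants are read from the disassembly as little-endian 32-bit integers. The two dword values in the usage note are derived by me from the first 8 characters of the key shown later in this post, not copied from IDA, and `dwords_to_key` is a hypothetical helper.

```python
import struct


def dwords_to_key(dwords) -> bytes:
    """Pack each integer as a little-endian dword and concatenate the bytes."""
    return b"".join(struct.pack("<I", d) for d in dwords)
```

For example, `dwords_to_key([0x586C5A6B, 0x6F336E6A])` gives `b"kZlXjn3o"`: the least significant byte 0x6B ("k") is stored first in memory, so it comes first in the key.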
4 is the PRGA. After pressing ENTER, we find the last occurrence of 100h, used inside a scrambling loop.
So far, we have found an S-box and a key, but where is the encrypted data? We know that the enc section is involved, but we don’t yet know how from an assembly perspective. Let’s dive into sub_4010A3!
Finally, in the last stage, the XOR instruction accesses the memory at the address 0x403000 + ecx and performs the encryption or decryption. Another interesting thing here is the length of the data being modified: 1CE00h → 118272 bytes.
Remember in the basic analysis part with PE-Bear:
Decryption party
Export the enc section with PE-Bear, Sections>enc>Section: [enc]>Save the content as
At this point, load the revil.bin[enc] file in CyberChef and use the RC4 recipe with the key we found earlier: kZlXjn3o373483wb6ne1LIBNWD3KWBEK
WOW, “MZ”: we just found a new executable!! Let’s download it.
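Spotting “MZ” by eye works, but the check can also be scripted. Below is a minimal sketch that verifies the DOS magic and then follows the e_lfanew field at offset 0x3C to the “PE\0\0” signature. The function name `looks_like_pe` is my own, and this is only a sanity check, not a full PE parser.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if `data` starts with a DOS header pointing to a PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: file offset of the PE header, stored at 0x3C in the DOS header
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
```

If this returns True on the CyberChef output, the decryption key and the exported section are very likely correct.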
Second stage
Open IDA and load the new decrypted file. In parallel, we are going to use CAPA to find the offset of any encryption methods.
capa decryptedREvil.exe -v
In IDA, jump to the address: press G and enter 0x406159. We’re starting to see familiar things that bring us back to the RC4 stages, here KSA and PRGA.
To find interesting things, let’s hit X on sub_406159 to find cross-references.
Then, I rename sub_406159 to RC4_KSA_PRGA. Before analyzing sub_40646A, we need to understand the parameters passed to the RC4 routine, so we are going to look at the function call and analyze the push operations.
Like before, press X on sub_40646A:
Wow, we clearly have interesting information here. First, we observe something that could be used as a key!
Follow unk_412000 and convert the data by pressing A. We possibly have the RC4 key! The 20h argument pushed just before is in fact the length of our RC4 key (you can verify this by copying the key into CyberChef).
After that, let’s focus on the edi register and dword_412024. If this follows the same logic, dword_412024 is the length (1C0Ch → 7180) and edi points to the data (maybe the encrypted data).
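The addresses above suggest a simple layout, which can be sketched in Python. I am assuming the blob is dumped starting at unk_412000, so the key sits at offset 0x00 (20h bytes), the length dword at offset 0x24 and the data at offset 0x28. The 4 bytes at offset 0x20 are not covered in this post, so the sketch simply skips them. `split_config_blob` and the constant names are hypothetical.

```python
import struct

KEY_LEN = 0x20      # the 20h pushed before the call
LEN_OFFSET = 0x24   # dword_412024 - unk_412000
DATA_OFFSET = 0x28  # unk_412028 - unk_412000


def split_config_blob(blob: bytes):
    """Split a dump starting at unk_412000 into (rc4_key, encrypted_data)."""
    key = blob[:KEY_LEN]
    (length,) = struct.unpack_from("<I", blob, LEN_OFFSET)  # little-endian dword
    data = blob[DATA_OFFSET:DATA_OFFSET + length]
    if len(data) != length:
        raise ValueError("blob is shorter than the declared length")
    return key, data
```

With the real dump, the length read here should come out as 0x1C0C (7180) if the layout assumption holds.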
Now, to find the data behind edi, we need to go back a bit in the graph to the mov operation on unk_412028.
After following unk_412028, extract all the data it contains (SHIFT + E) and load it in CyberChef with the RC4 recipe. Next, use the previous RC4 key: OxyJfFnJoNseUwjQiex1jbfGMI9fNzqB
DONE, we found the configuration data used by the REvil ransomware!!! I will stop here because the purpose of this post is to understand RC4, identify it and find interesting data! Possibly, in another part, I will cover AES or ChaCha in REvil.
As a bonus, you can decode the base64 in the nbody field to retrieve the README text file used by the attackers to demand money:
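The same decoding can be sketched outside CyberChef. The sketch assumes the decrypted configuration is valid JSON with an nbody field, and that the decoded bytes are UTF-16LE text; that encoding is an assumption on my side, so it is exposed as a parameter. `decode_nbody` is a hypothetical helper name.

```python
import base64
import json


def decode_nbody(config_json: str, encoding: str = "utf-16-le") -> str:
    """Extract and decode the base64 ransom note stored in the nbody field."""
    config = json.loads(config_json)
    raw = base64.b64decode(config["nbody"])
    # The text encoding is an assumption; try "utf-8" if the result looks garbled.
    return raw.decode(encoding, errors="replace")
```

If `json.loads` fails on the decrypted data, check for trailing null bytes left over from the dump and strip them first.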