Identifying UPX packed ELF, decompressing, fixing, and analysing Linux malware

We'll take a look at analysing a piece of Linux malware. This sample is an ELF file containing a UPX-packed binary, capable of port scanning, SSH brute-forcing, deploying XMRig, and self-replicating.

Download the sample here;

We'll first check what the file is identified as:

$ file malware-sample
malware-sample: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, no section header
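
What file reports can be cross-checked by decoding the 52-byte ELF32 header by hand. The following is a minimal sketch, not a general ELF parser: the header bytes are hard-coded from the start of this sample (the same bytes appear in the xxd output), and the function name describe_elf32 is made up for illustration. It only handles 32-bit little-endian files.

```python
import struct

# First 52 bytes of the sample, copied from the xxd output of this file.
HEADER = bytes.fromhex(
    "7f454c46010101030000000000000000"
    "0200030001000000b8d2020134000000"
    "00000000000000003400200002002800"
    "00000000"
)


def describe_elf32(header: bytes) -> dict:
    """Decode the fields of an ELF32 little-endian header that matter here."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 1 or header[5] != 1:
        raise ValueError("this sketch only handles 32-bit little-endian ELF")
    (_ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack("<16sHHIIIIIHHHHHH", header[:52])
    return {
        "type": e_type,        # 2 = executable
        "machine": e_machine,  # 3 = Intel 80386
        "entry": e_entry,
        "phoff": e_phoff,
        "phnum": e_phnum,
        "shoff": e_shoff,
        "shnum": e_shnum,
    }


info = describe_elf32(HEADER)
print(f"machine={info['machine']} entry={info['entry']:#x} "
      f"phnum={info['phnum']} shnum={info['shnum']}")
```

The section header offset and count both decode to zero, which is consistent with the "no section header" note in the file output.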

Using xxd, we can have a quick look at the raw bytes of the header.

$ xxd malware-sample | head
00000000: 7f45 4c46 0101 0103 0000 0000 0000 0000  .ELF............
00000010: 0200 0300 0100 0000 b8d2 0201 3400 0000  ............4...
00000020: 0000 0000 0000 0000 3400 2000 0200 2800  ........4. ...(.
00000030: 0000 0000 0100 0000 0000 0000 0010 c000  ................
00000040: 0010 c000 7fca 4200 7fca 4200 0500 0000  ......B...B.....
00000050: 0010 0000 0100 0000 7801 0000 7861 8208  ........x...xa..
00000060: 7861 8208 0000 0000 0000 0000 0600 0000  xa..............
00000070: 0010 0000 dd93 0689 5550 5821 d007 0d0c  ........UPX!....
00000080: 0000 0000 0000 0000 0000 0000 f400 0000  ................
00000090: 8300 0000 0800 0000 5f7b b2f9 7f45 4c46  ........_{...ELF

The UPX! marker at offset 0x78 indicates that this ELF binary contains UPX-packed data. What is UPX?

UPX is a free, secure, portable, extendable, high-performance executable packer for several executable formats.

UPX is an advanced executable file compressor.
UPX will typically reduce the file size of programs and DLLs by around 50%-70%, thus reducing disk space, network load times, download times and other distribution and storage costs.

OK, so we're going to have to unpack the file. We can use upx, which is already installed in Kali. Seems simple enough:

$ upx -d malware-sample -o malware-sample-decompressed.elf

                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2020
UPX 3.96        Markus Oberhumer, Laszlo Molnar & John Reiser   Jan 23rd 2020

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
upx: malware-sample: CantUnpackException: p_info corrupted

Unpacked 1 file: 0 ok, 1 error.

Of course it wouldn't be that easy 😆

Before we continue, let's have a look at this file (in its compressed form) using strings and IDA.


The strings output is fairly useless, other than giving us an indication that the file is packed with UPX (confirming what we already identified above).

Loading the binary into IDA gives a stronger indication that something isn't right. sp-analysis failures, red markers, and a lack of subroutines can indicate either that there's not a whole lot happening (which we know isn't the case, given the nature of the sample) or that IDA can't disassemble these routines.

Back to our corrupt UPX archive. I came across this article, which was helpful.

In summary: there are two places in the original binary which we can use to repair the p_info header so that it is no longer corrupt. We'll start with the footer, which we know is 8 bytes prior to the end of the file, and recover the file size value from it.

Then we go back to the top of the file, find the UPX! header. We notice that the section after the UPX! marker is empty.
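
To make "empty" concrete, here is a rough sketch that decodes the bytes around the UPX! marker. The bytes are copied from offsets 0x74 to 0x8b of the hex dump earlier in this post. The field names follow the l_info and p_info structures as I understand them from the UPX source, so treat the layout as an assumption that has only been checked against this one sample; read_upx_headers is a made-up helper name.

```python
import struct

# Bytes 0x74..0x8b of the packed sample, copied from the xxd output.
UPX_HEADERS = bytes.fromhex(
    "dd930689"   # l_checksum
    "55505821"   # l_magic: "UPX!"
    "d0070d0c"   # l_lsize (2 bytes), l_version, l_format
    "00000000"   # p_progid
    "00000000"   # p_filesize
    "00000000"   # p_blocksize
)


def read_upx_headers(data: bytes) -> dict:
    """Decode l_info and p_info around the first UPX! marker (assumed layout)."""
    magic_at = data.find(b"UPX!")
    if magic_at < 4:
        raise ValueError("no UPX! marker with a preceding checksum found")
    start = magic_at - 4  # l_info begins with a 4-byte checksum
    _checksum, _magic, l_lsize, l_version, l_format = struct.unpack_from(
        "<I4sHBB", data, start)
    # p_info follows the 12-byte l_info directly.
    p_progid, p_filesize, p_blocksize = struct.unpack_from(
        "<III", data, start + 12)
    return {
        "l_lsize": l_lsize,
        "l_version": l_version,
        "l_format": l_format,
        "p_progid": p_progid,
        "p_filesize": p_filesize,
        "p_blocksize": p_blocksize,
        # Zeroed size fields are what makes upx -d give up on this sample.
        "zeroed": p_filesize == 0 and p_blocksize == 0,
    }


headers = read_upx_headers(UPX_HEADERS)
print(headers)
```

For this sample both size fields come back as zero, which lines up with the "p_info corrupted" error from upx.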

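8 bytes after the end of the UPX! marker, we need to write the file size value (F8 BF 7B 00) which we recovered above. We write that value twice, back to back, overwriting the zeroed bytes that are already there rather than inserting new ones, so that nothing else in the file shifts. Read as a little-endian integer, F8 BF 7B 00 is 0x007BBFF8, or 8110072 bytes.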
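
If you'd rather script the fix than do it in a hex editor, something like the sketch below should work. It is only an illustration of the step described above: patch_p_info and fix_file are hypothetical helper names, the size value is the one recovered from this sample's footer, and the code assumes the first UPX! marker in the file is the one we want.

```python
import struct
from pathlib import Path


def patch_p_info(data: bytes, size: int) -> bytes:
    """Write `size` twice, starting 8 bytes after the end of the UPX! marker."""
    magic_at = data.find(b"UPX!")
    if magic_at < 0:
        raise ValueError("no UPX! marker found")
    field_at = magic_at + 4 + 8  # skip the 4-byte marker, then 8 more bytes
    if len(data) < field_at + 8:
        raise ValueError("file too short to hold the two size fields")
    patched = bytearray(data)
    # Overwrite in place so the file length and all other offsets are unchanged.
    struct.pack_into("<II", patched, field_at, size, size)
    return bytes(patched)


def fix_file(src: str, dst: str, size: int) -> None:
    """Read src, patch the size fields, and write the result to dst."""
    Path(dst).write_bytes(patch_p_info(Path(src).read_bytes(), size))


# The value recovered from the footer of this sample, stored little-endian.
size = int.from_bytes(bytes.fromhex("f8bf7b00"), "little")
print(size)

# Example (not run here): fix_file("malware-sample", "malware-sample-fixed", size)
```

Always write to a new file and keep the original sample untouched, so you can start over if the patch lands in the wrong place.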

Save this file and then use UPX to unpack it:

$ upx -d malware-sample-fixed -o malware.elf
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2020
UPX 3.96        Markus Oberhumer, Laszlo Molnar & John Reiser   Jan 23rd 2020

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
   8110072 <-   4379004   53.99%   linux/i386    malware.elf

Unpacked 1 file.

Now we can open it in IDA (or Ghidra). We can see there are a lot more unpacked functions, some with interesting names, and there's obviously a lot of data here to analyse.

From there, you can analyse the file, understand what it does, and maybe even write detection rules.
