Modifying Embedded Filesystems in ARM Linux zImages

Ever run binwalk on an embedded Linux device’s kernel image and find its entire filesystem contained inside? Ever want to change one little line inside to enable root shell on that device that’s just mocking you with its lack of boot security, only to be thwarted by a bit of compressed data entangled in machine code?

The mechanism for this built-in filesystem is known as the initial ramdisk,¹ which often takes the form of a CPIO archive (initramfs). The initial ramdisk is embedded in the kernel binary proper (vmlinux), which is in turn compressed and packed into a wrapper program (vmlinuz, zImage, bzImage). The wrapper performs initial setup, decompresses vmlinux, and then jumps into it.² The compressed vmlinux blob tends to be referred to as the “piggy” in Linux boot code.

While it’s quite easy to run extract-vmlinux or binwalk on these kernel images and unleash a flood of shell scripts, config files, and programs that one might have many reasons to want to modify, figuring out how to package these files back up into an image fit for execution is not so straightforward.

This article will demonstrate how to replace the piggy in a 32-bit ARM zImage without worrying about size constraints or finding the right toolchain and exact configuration options necessary to recompile the vmlinuz wrapper code. It’s not intended as a universal solution, but rather as a guide that should provide enough understanding to make whatever tweaks are needed for a specific use case.

While this information is intended to help a neighbor make modifications to proprietary kernel images that they can’t simply rebuild from source, I’ve decided to use the 32-bit ARM virt build of OpenWRT for demonstration purposes. May a thin layer of obfuscation by compression never get in the way of your path to proofs of concept again!

Setup

First, download the OpenWRT ARM virt zImage-initramfs image:

$ wget -q https://downloads.openwrt.org/releases/17.01.0/targets/armvirt/generic/lede-17.01.0-r3205-59508e3-armvirt-zImage-initramfs -O zImage-initramfs
$ sha256sum zImage-initramfs
5ad269e95b2db16aea3794dd0e97dabb6f9712184d79b0764bb10a810f8d7639  zImage-initramfs

We can boot this in qemu with:

$ qemu-system-arm -serial stdio -M virt -m 1024 -kernel zImage-initramfs

After booting, press enter to activate the console. The following banner is displayed:

BusyBox v1.25.1 () built-in shell (ash)

     _________
    /        /\      _    ___ ___  ___
   /  LE    /  \    | |  | __|   \| __|
  /    DE  /    \   | |__| _|| |) | _|
 /________/  LE  \  |____|___|___/|___|                      lede-project.org
 \        \   DE /
  \    LE  \    /  -----------------------------------------------------------
   \  DE    \  /    Reboot (17.01.0, r3205-59508e3)
    \________\/    -----------------------------------------------------------

=== WARNING! =====================================
There is no root password defined on this device!
Use the "passwd" command to set up a new password
in order to prevent unauthorized SSH logins.
--------------------------------------------------
root@LEDE:/#

This looks like a good target for a proof of concept modification. Let’s use the shell to check the base Linux kernel version:

root@LEDE:/# uname -a
Linux LEDE 4.4.50 #0 SMP Mon Feb 20 17:13:44 2017 armv7l GNU/Linux

The core pieces of the zImage wrapper code are unlikely to change much from the original, if at all, so we can look up the assembly source of the wrapper for that version of Linux. Bootlin’s Elixir Cross Referencer provides a nice interface for browsing Linux source code across different versions. Open a browser tab and navigate to https://elixir.bootlin.com/linux/v4.4.50/source/, or clone the linux repo to search through the code.

Most of the files we’re interested in can be found in the arch/arm/boot/compressed directory.

Piggy Extraction

We need to extract the piggy before we can modify it. The well-known extract-vmlinux script³ performs a brute-force search for the magic bytes of commonly used compression schemes, runs the associated decompressor program on them, and checks if the output is an ELF.

For this image it fails – we’ll see why in a minute. binwalk identifies XZ compressed data:

$ binwalk zImage-initramfs

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             Linux kernel ARM boot executable zImage (little-endian)
15400         0x3C28          xz compressed data
15632         0x3D10          xz compressed data

There’s clearly an XZ stream header magic at 0x3d10:

00003D10   FD 37 7A 58  5A 00 00 01  69 22 DE 36  02 01 07 00  .7zXZ...i".6....
00003D20   21 01 1A 00  AF BB 14 35  E2 84 87 EF  FF 5D 00 20  !......5.....].

As well as an XZ stream footer magic (59 5A, “YZ”) at 0x2bf10e:

002BF100   70 AB A0 CD  9B E3 51 40  03 00 00 00  00 01 59 5A  p.....Q@......YZ
002BF110   28 CE 8A 00  00 00 00 00  00 00 00 00  00 00 00 00  (...............
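The same header and footer can also be located programmatically. The following is a minimal sketch, not the method binwalk uses: it assumes the image is already in memory, tries to decode from every occurrence of the header magic, and relies on Python's lzma module to report where each stream stops. The function name find_xz_streams is made up for this illustration.

```python
import lzma

XZ_HEADER_MAGIC = b"\xfd7zXZ\x00"


def find_xz_streams(data: bytes):
    """Return (start, end, inflated_len) for each decodable XZ stream in data.

    `end` is the offset one past the stream footer (the byte after "YZ").
    """
    streams = []
    pos = data.find(XZ_HEADER_MAGIC)
    while pos != -1:
        decomp = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
        try:
            inflated = decomp.decompress(data[pos:])
        except lzma.LZMAError:
            inflated = None  # magic bytes that do not start a real stream
        if inflated is not None and decomp.eof:
            # unused_data holds whatever followed the stream footer
            end = len(data) - len(decomp.unused_data)
            streams.append((pos, end, len(inflated)))
        pos = data.find(XZ_HEADER_MAGIC, pos + 1)
    return streams
```

A false positive like the first binwalk hit would be expected to raise an LZMAError here and get skipped, though I have not run this against every kind of stray magic.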

binwalk can successfully extract the compressed vmlinux, but what we now need to know for certain is exactly which start and end offsets the decompressor code uses to delineate the piggy.

Notice that there are several piggy.*.S files in arch/arm/boot/compressed. These include the content of the piggy as a binary blob and store its start and end offsets in the globals input_data and input_data_end:

	.section .piggydata,#alloc
	.globl	input_data
input_data:
	.incbin	"arch/arm/boot/compressed/piggy.xzkern"
	.globl	input_data_end
input_data_end:

These globals are referenced in the decompress_kernel function in arch/arm/boot/compressed/misc.c, which prints a telltale string before calling do_decompress with the piggy start offset and length as arguments:

putstr("Uncompressing Linux...");
ret = do_decompress(input_data, input_data_end - input_data,
    output_data, error);
if (ret)
    error("decompressor returned an error");
else
    putstr(" done, booting the kernel.\n");

We can use cross-references to the “Uncompressing Linux…” string in the disassembled zImage-initramfs binary to locate the call to do_decompress, which will point us to the values of input_data and input_data_end.

I loaded the image up as a raw little endian⁴ ARM binary in Ghidra to find this call in the disassembly. Ghidra detects that 0x3d10 and 0x2bf114 are loaded from the Global Offset Table (GOT) section at the end of the zImage to set up the first two registers for the call to do_decompress.

These addresses match up with the XZ stream header and stream footer bytes seen above, but there is an extra word just after the stream footer. A bit more digging into the source code confirms that this word holds the uncompressed size of the XZ data, and that it’s expected to break decompression with the normal unxz command.⁵

Let’s carve out the piggy:

$ dd if=zImage-initramfs of=vmlinux.xz ibs=1 skip=$[0x3d10] count=$[0x2bf114-0x3d10]
2864132+0 records in
5594+1 records out
2864132 bytes (2.9 MB, 2.7 MiB) copied, 2.85465 s, 1.0 MB/s

Use the --single-stream option to avoid the “Unexpected end of input” error when decompressing it:

$ unxz --verbose --single-stream vmlinux.xz
vmlinux.xz (1/1)
  100 %   2,797.0 KiB / 8,883.5 KiB = 0.315

We now know the exact size and location of the piggy within the zImage. It’s 2864132 (0x2bb404) bytes long, located at 0x3d10 - 0x2bf114.
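As a cross-check on those offsets, the carve and the trailing size word can be verified in a few lines of Python. This is a sketch under the assumption that the piggy is exactly "XZ stream followed by one little-endian 32-bit size word", as described above; split_piggy and check_piggy are names invented for the example.

```python
import lzma
import struct


def split_piggy(piggy: bytes):
    """Split a carved piggy into (xz_stream, inflated_size)."""
    xz_stream, size_word = piggy[:-4], piggy[-4:]
    (inflated_size,) = struct.unpack("<I", size_word)  # little-endian u32
    return xz_stream, inflated_size


def check_piggy(piggy: bytes) -> bytes:
    """Decompress the piggy and confirm the trailing word matches its size."""
    xz_stream, inflated_size = split_piggy(piggy)
    vmlinux = lzma.decompress(xz_stream, format=lzma.FORMAT_XZ)
    if len(vmlinux) != inflated_size:
        raise ValueError(
            f"size word says {inflated_size}, stream inflates to {len(vmlinux)}"
        )
    return vmlinux
```

For this image the input would be image[0x3d10:0x2bf114], and the size word 28 CE 8A 00 from the hex dump reads as 0x008ace28.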

Modification

Direct replacement

For the test modification, I will modify the bytes after the WARNING! string in the banner. These bytes show up within the initramfs section of the decompressed vmlinux binary, which consists of an uncompressed CPIO archive with no checksums. It’s simple enough to directly edit with a hex editor:

0076AC30   61 74 20 3C  3C 20 45 4F  46 0A 3D 3D  3D 20 57 41  at << EOF.=== WA
0076AC40   52 4E 49 4E  47 21 20 3D  4D 6F 64 69  66 69 65 64  RNING! =Modified
0076AC50   21 20 68 65  6C 6C 6F 20  6E 65 69 67  68 62 6F 72  ! hello neighbor
0076AC60   73 3D 3D 3D  3D 3D 3D 3D  3D 3D 3D 3D  0A 54 68 65  s===========.The

If we try to naively recompress the modified image, it comes out significantly larger than the original piggy. Trying all of the compression presets -0 through -9, even with the --extreme flag, results in at best a 2994568 byte output. In fact, even if we just recompress the original vmlinux unchanged, it ends up at 2994540 bytes in the best case. That’s 130408 bytes larger!

Digging around the Linux boot files we can find the xz options used to compress the original piggy. The command is in xz_wrap.sh:

xz --check=crc32 --arm --lzma2=$LZMA2OPTS,dict=32MiB

Let’s try with those options:

$ xz --check=crc32 --arm --lzma2=,dict=32MiB < vmlinux-mod-warning > /tmp/vmlinux-mod-warntest.xz
$ wc -c /tmp/vmlinux-mod-warntest.xz
2864204 /tmp/vmlinux-mod-warntest.xz

Close, but it’s still 76 bytes too large (including the four extra bytes needed for the inflated size word). After digging through xz’s man page and experimenting with compression options, I found a useful setting that resulted in a smaller output:

$ xz --check=crc32 --arm --lzma2=,dict=32MiB,nice=128 < vmlinux-mod-warning > /tmp/vmlinux-mod-warntest.xz
$ wc -c /tmp/vmlinux-mod-warntest.xz
2863580 /tmp/vmlinux-mod-warntest.xz

Here’s the description of the nice option from the xz man page:

Specify what is considered to be a nice length for a match. Once a match of at least nice bytes is found, the algorithm stops looking for possibly better matches. Nice can be 2-273 bytes. Higher values tend to give better compression ratio at the expense of speed. The default depends on the preset.
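For scripting, roughly the same filter chain can be built with Python's lzma module. This is a sketch, assuming the default preset stands in for the empty $LZMA2OPTS; compress_piggy is a hypothetical helper, and the output is not guaranteed to be byte-identical to (or the same size as) what the xz command line produces.

```python
import lzma


def compress_piggy(vmlinux: bytes, nice_len=None, dict_size=32 << 20) -> bytes:
    """Compress with an ARM BCJ filter followed by LZMA2 and a CRC32 check.

    dict_size defaults to 32 MiB to mirror dict=32MiB; nice_len mirrors
    the nice= sub-option and is left at the preset default when None.
    """
    lzma2 = {"id": lzma.FILTER_LZMA2, "dict_size": dict_size}
    if nice_len is not None:
        lzma2["nice_len"] = nice_len
    filters = [{"id": lzma.FILTER_ARM}, lzma2]
    return lzma.compress(
        vmlinux,
        format=lzma.FORMAT_XZ,
        check=lzma.CHECK_CRC32,
        filters=filters,
    )
```

Comparing len(compress_piggy(data)) against len(compress_piggy(data, nice_len=128)) gives a quick way to see whether the option helps for a particular kernel. Note that a 32 MiB dictionary makes the encoder allocate a few hundred megabytes.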

A nice option indeed. Now that we have a smaller output, we can append the inflated vmlinux size to the new piggy and try to replace the original piggy with it.

Make a copy of the kernel image and zero out the piggy area:

$ cp zImage-initramfs zImage-initramfs-warnmod
$ dd if=/dev/zero of=zImage-initramfs-warnmod bs=1 seek=$[0x3d10] count=$[0x2bf114-0x3d10] conv=notrunc
2864132+0 records in
2864132+0 records out
2864132 bytes (2.9 MB, 2.7 MiB) copied, 7.53578 s, 380 kB/s

The size of the uncompressed vmlinux is still the same (9096744 bytes), so append that to the end of the new piggy as a little endian 32 bit integer (28 ce 8a 00). Then copy the new piggy into the piggy area:

$ echo -en "\x28\xce\x8a\x00" >> vmlinux-mod-warning.xz
$ dd if=vmlinux-mod-warning.xz of=zImage-initramfs-warnmod bs=1 seek=$[0x3d10] conv=notrunc
2864044+0 records in
2864044+0 records out
2864044 bytes (2.9 MB, 2.7 MiB) copied, 7.86938 s, 364 kB/s

Update the input_data_end word in the GOT near the end of the image (at 0x2bf124). The piggy now ends at 0x2beef0.

002BF110   00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
002BF120   50 F1 2B 00  F0 EE 2B 00  68 F5 2B 00  10 3D 00 00  P.+...+.h.+..=..
002BF130   64 F5 2B 00  64 F1 2B 00  54 F1 2B 00  40 09 00 00  d.+.d.+.T.+.@...
002BF140   5C F1 2B 00  60 F1 2B 00  58 F1 2B 00  00 00 00 00  \.+.`.+.X.+.....
002BF150

Then attempt to load it in qemu:

$ qemu-system-arm -serial stdio -M virt -m 1024 -kernel zImage-initramfs-warnmod

It doesn’t work! With a quick debugging session and another look at the code, it’s clear the decompressor is still checking for the inflated piggy size at its original location, 0x2bf110. Because we zeroed out the original piggy area, the decompressor will read zero as the inflated size of the piggy.

The location of the inflated size word shows up in a block of addresses in the main assembly code for the decompressor:⁶

LC0:	.word	LC0			@ r1
		.word	__bss_start		@ r2
		.word	_end			@ r3
		.word	_edata			@ r6
		.word	input_data_end - 4	@ r10 (inflated size location)
		.word	_got_start		@ r11
		.word	_got_end		@ ip
		.word	.L_user_stack_end	@ sp
		.word	_end - restart + 16384 + 1024*1024
		.size	LC0, . - LC0

Easy enough to fix with the hex editor again. 0x002bf110 is at offset 0x258 in the image and we can update it to the new inflated size word location, 0x2beef0 - 4 = 0x2beeec. Now it boots:

BusyBox v1.25.1 () built-in shell (ash)

     _________
    /        /\      _    ___ ___  ___
   /  LE    /  \    | |  | __|   \| __|
  /    DE  /    \   | |__| _|| |) | _|
 /________/  LE  \  |____|___|___/|___|                      lede-project.org
 \        \   DE /
  \    LE  \    /  -----------------------------------------------------------
   \  DE    \  /    Reboot (17.01.0, r3205-59508e3)
    \________\/    -----------------------------------------------------------

=== WARNING! =Modified! hello neighbors===========
There is no root password defined on this device!
Use the "passwd" command to set up a new password
in order to prevent unauthorized SSH logins.
--------------------------------------------------
root@LEDE:/#
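The whole in-place replacement, including both pointer fixes, can be condensed into one function. This is only a sketch of the steps above, not the script linked at the end of the article; the function name is made up, and the two slot offsets are passed in as parameters because the values used here (0x2bf124 for the GOT entry and 0x258 for the LC0 entry) are specific to this image.

```python
import struct


def replace_piggy_in_place(image: bytes, new_xz: bytes, inflated_size: int,
                           piggy_start: int, piggy_end: int,
                           got_end_slot: int, lc0_size_slot: int) -> bytes:
    """Overwrite the piggy without growing the image, then fix both pointers."""
    # piggy = XZ stream + little-endian inflated size word
    new_piggy = new_xz + struct.pack("<I", inflated_size)
    if len(new_piggy) > piggy_end - piggy_start:
        raise ValueError("new piggy does not fit in the original piggy area")

    out = bytearray(image)
    out[piggy_start:piggy_end] = bytes(piggy_end - piggy_start)  # zero the area
    out[piggy_start:piggy_start + len(new_piggy)] = new_piggy

    new_end = piggy_start + len(new_piggy)
    struct.pack_into("<I", out, got_end_slot, new_end)        # input_data_end
    struct.pack_into("<I", out, lc0_size_slot, new_end - 4)   # size word location
    return bytes(out)
```

With the numbers from this walkthrough, a 2863580-byte stream plus the 4-byte size word starting at 0x3d10 ends at 0x2beef0, which matches the value patched into the GOT by hand.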

It turns out we could’ve made things simpler by leaving the inflated size word in its original location. The extra zeros at the end of the XZ data don’t bother the XZ decompressor, and it would also save us from needing to update the location of the size word in LC0.

This approach works as long as we can recompress the modified vmlinux to a size equal to or smaller than the original. Some scripts such as repack-zImage.sh will perform modifications along these lines and attempt to optimize compression, but can’t repack an initramfs once modifications increase its compressed size.⁷

Extending the image

But what if no amount of compressor option tuning can save us? What if we must increase the size of the piggy? To figure out what needs to change if we move the end of the piggy to a higher address, we can use the layout of the image as described in the linker script arch/arm/boot/compressed/vmlinux.lds.S.⁸

The GOT at the end of the image needs to be moved up, along with the bss section address. Nearly all of the references to the locations of these sections are baked into the LC0 object shown above.

We’ll need to update the addresses of anything that’s located after the start of the piggy, including:

Addresses in the LC0 object:
- __bss_start
- _end: end of program (including bss)
- _edata: end of image
- inflated piggy size location (input_data_end - 4)
- _got_start
- _got_end
- .L_user_stack_end
- _end - restart + 16384 + 1024*1024
Any entries in the GOT that come after input_data.

Figure: Mapping of LC0 pointers to vmlinuz locations. Anything pointing to a location after .piggydata must be updated.
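The update rule for these tables is simple enough to check by hand. Here is a small worked example using the LC0 and GOT values that the script prints later in this article, with a size increase of 0x1fdd4; shift_pointers is a name chosen for this sketch.

```python
def shift_pointers(words, piggy_start, delta):
    """Add delta to every word that points past the start of the piggy."""
    return [w + delta if w > piggy_start else w for w in words]


PIGGY_START = 0x3D10
DELTA = 0x1FDD4

# Values taken from the script output shown later in the article.
LC0 = [0x00000248, 0x002BF150, 0x002BF56C, 0x002BF150, 0x002BF110,
       0x002BF120, 0x002BF14C, 0x002C0570, 0x003C34A4]
GOT = [0x002BF150, 0x002BF114, 0x002BF568, 0x00003D10, 0x002BF564,
       0x002BF164, 0x002BF154, 0x00000940, 0x002BF15C, 0x002BF160,
       0x002BF158]

for name, table in (("LC0", LC0), ("GOT", GOT)):
    shifted = shift_pointers(table, PIGGY_START, DELTA)
    print(name, " ".join(f"0x{w:08x}" for w in shifted))
```

The entries for LC0 itself (0x248), input_data (0x3d10), and the one GOT entry at 0x940 sit at or before the piggy start, so they are left alone.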

At this point I started using a Python script to automate the editing. The LC0 object is easy to locate dynamically because it starts with its own address (e.g., for this zImage the word 0x00000248 is at offset 0x248). We can pull the GOT location from LC0 and use it to get input_data and input_data_end (i.e., piggy start and end).⁹ For each value in the LC0 and GOT that’s greater than the piggy start offset, we increase it by the amount we’re increasing the size of the image. Then we can extend the image and insert the new larger piggy over the original one.
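Locating LC0 and reading the GOT through it could look something like the following. It is a simplified sketch rather than an excerpt from my script: it assumes LC0 has the nine words shown earlier, that the image is little endian, and it adds a sanity check on the GOT bounds to weed out words that equal their own offset by coincidence. The names find_lc0 and read_got are hypothetical.

```python
import struct

LC0_WORDS = 9  # number of .word entries in LC0 for this kernel version


def find_lc0(image: bytes):
    """Return (offset, words) for the first plausible LC0 in the image."""
    for off in range(0, len(image) - 4 * LC0_WORDS + 1, 4):
        (first,) = struct.unpack_from("<I", image, off)
        if first != off:
            continue  # LC0 starts with its own address
        words = struct.unpack_from(f"<{LC0_WORDS}I", image, off)
        got_start, got_end = words[5], words[6]
        # The GOT should sit after LC0, inside the file, and be word-sized.
        if off < got_start < got_end <= len(image) and (got_end - got_start) % 4 == 0:
            return off, words
    raise ValueError("LC0 not found")


def read_got(image: bytes, lc0_words):
    """Read every GOT entry using _got_start and _got_end from LC0."""
    got_start, got_end = lc0_words[5], lc0_words[6]
    count = (got_end - got_start) // 4
    return list(struct.unpack_from(f"<{count}I", image, got_start))
```

In this image input_data and input_data_end turn out to be GOT entries 3 and 1 respectively; those indices are not guaranteed to hold for other builds.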

One more thing to fix up is the _magic_end value near the beginning of the image: it matches the size of the zImage file. (This didn’t have any effect on whether qemu booted the image.)
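A sketch of that fix-up is below. It assumes the usual ARM zImage header layout, with the magic number 0x016f2818 at offset 0x24 followed by the start and end words at 0x28 and 0x2c; those offsets come from my reading of head.S rather than anything printed above, so verify them against the image before trusting it. Both function names are invented here.

```python
import struct

ZIMAGE_MAGIC_OFFSET = 0x24   # assumed header layout: magic, start, end
ZIMAGE_MAGIC = 0x016F2818


def read_zimage_magic(image):
    """Return (_magic_start, _magic_end) from the zImage header."""
    magic, start, end = struct.unpack_from("<3I", image, ZIMAGE_MAGIC_OFFSET)
    if magic != ZIMAGE_MAGIC:
        raise ValueError("not an ARM zImage (magic mismatch)")
    return start, end


def bump_magic_end(image: bytearray, delta: int) -> None:
    """Grow _magic_end by delta so it matches the extended file size."""
    _, end = read_zimage_magic(image)
    struct.pack_into("<I", image, ZIMAGE_MAGIC_OFFSET + 8, end + delta)
```

For this image the end word goes from 0x002bf150 to 0x002def24 when the file grows by 0x1fdd4.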

Does it work yet? Nope! Another debugging session shows that the arguments for do_decompress are wrong: the input location, length, and error function pointer are all zero. Notice that these are all values in the GOT.

What’s happening here is that the handful of functions compiled from C code (misc.c and so on) have offset tables appended to them which are used to locate entries in the GOT. The first offset in the table is a PC-relative offset to the GOT itself. The subsequent offsets locate specific entries within it. Those entries contain a fixed up pointer to their global symbol.

input_data and input_data_end are globals referenced in the GOT, so we’ll have to fix this. Luckily we only have to fix the base GOT offset for each function.

There are some simple constraints we can use to implement a quick and dirty search and update routine for these values:

- These words should only exist in between LC0 and the piggy.
- The rough minimum possible GOT base offset is from where the code ends to where the GOT starts: got_start - piggy_start.
- The rough maximum offset is from the beginning of the code after LC0 to the GOT start: got_start - lc0_end.
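Those three constraints translate almost directly into a word scan. The following is a simplified sketch of the idea, assuming word-aligned little-endian values; like the constraints themselves it is a heuristic, so an ordinary constant that happens to fall in the range would be picked up too. The function names are made up for the example.

```python
import struct


def find_got_offset_words(image, lc0_end, piggy_start, got_start):
    """Return (offset, value) for words that look like GOT base offsets."""
    lowest = got_start - piggy_start   # reference from the very end of the code
    highest = got_start - lc0_end      # reference from right after LC0
    hits = []
    for off in range(lc0_end, piggy_start - 3, 4):
        (value,) = struct.unpack_from("<I", image, off)
        if lowest <= value <= highest:
            hits.append((off, value))
    return hits


def apply_delta(image: bytearray, hits, delta: int) -> None:
    """Add the size increase to each candidate word in place."""
    for off, value in hits:
        struct.pack_into("<I", image, off, value + delta)
```

For this image the bounds work out to 0x2bb410 and 0x2beeb4 (with got_start = 0x2bf120, piggy_start = 0x3d10, and lc0_end = 0x26c), and all eight candidates in the output below fall inside that window.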

Updating each of these words with the size increase delta works and fixes the extended image! Here’s a demonstration:

$ cp vmlinux vmlinux-mod-big
$ # changing /etc/banner...
$ xz --check=crc32 < vmlinux-mod-big > vmlinux-mod-big.xz
$ # add inflated size to end of XZ data
$ echo -en "\x28\xce\x8a\x00" >> vmlinux-mod-big.xz
$ wc -c vmlinux-mod-big.xz
2994648 vmlinux-mod-big.xz
$ # that is 130516 (0x1fdd4) bytes larger than original
$ ./arm_zimg_extend.py zImage-initramfs bigpig --replace vmlinux-mod-big.xz
LC0 @ 0x0248 - 0x026c
  0x00: 0x00000248
  0x01: 0x002bf150
  0x02: 0x002bf56c
  0x03: 0x002bf150
  0x04: 0x002bf110
  0x05: 0x002bf120
  0x06: 0x002bf14c
  0x07: 0x002c0570
  0x08: 0x003c34a4
GOT @ 0x002bf120 - 0x002bf14c
  0x00: 0x002bf150
  0x01: 0x002bf114
  0x02: 0x002bf568
  0x03: 0x00003d10
  0x04: 0x002bf564
  0x05: 0x002bf164
  0x06: 0x002bf154
  0x07: 0x00000940
  0x08: 0x002bf15c
  0x09: 0x002bf160
  0x0a: 0x002bf158
piggy data @ 0x00003d10 - 0x002bf114
piggy compressed size: 0x002bb404
piggy inflated size @ 0x002bf110
piggy inflated size: 0x008ace28
piggy new compressed size: 0x002db1d8
extending image by 0x0001fdd4
LC0 extended:
  0x00: 0x00000248
  0x01: 0x002def24
  0x02: 0x002df340
  0x03: 0x002def24
  0x04: 0x002deee4
  0x05: 0x002deef4
  0x06: 0x002def20
  0x07: 0x002e0344
  0x08: 0x003e3278
GOT extended:
  0x00: 0x002def24
  0x01: 0x002deee8
  0x02: 0x002df33c
  0x03: 0x00003d10
  0x04: 0x002df338
  0x05: 0x002def38
  0x06: 0x002def28
  0x07: 0x00000940
  0x08: 0x002def30
  0x09: 0x002def34
  0x0a: 0x002def2c
Searching for GOT offsets...
Candidate GOT offset @ 0x09d8: 0x002be750
Candidate GOT offset @ 0x0ac0: 0x002be6f0
Candidate GOT offset @ 0x0b78: 0x002be618
Candidate GOT offset @ 0x0cc4: 0x002be498
Candidate GOT offset @ 0x1080: 0x002be24c
Candidate GOT offset @ 0x1184: 0x002be074
Candidate GOT offset @ 0x3360: 0x002bcbb4
Candidate GOT offset @ 0x350c: 0x002bbd84
magic start: 0x00000000
magic end: 0x002bf150
magic end updated: 0x002def24
wrote new image
$ qemu-system-arm -serial stdio -M virt -m 1024 -kernel bigpig
[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.4.50 (buildbot@builds-02.infra.lede-project.org) (gcc version 5.4.0 (LEDE GCC 5.4.0 r3101-bce140e) ) #0 SMP Mon Feb 20 17:13:44 2017
...
BusyBox v1.25.1 () built-in shell (ash)

         ^,    ,^
        /  ----  \
       / _\    /_ \  Ful
       |  / __ \  |
       |   /oo\   |            ,-.
       |   \__/   |____________.:'
       \   .__.   /            \ '
        '.______.'              \
            \                   |
             |  /____...-----\  |
             |  |            |  |
             |^^|            |^^|
 Big piggy mod!

=== WARNING! =====================================
There is no root password defined on this device!
Use the "passwd" command to set up a new password
in order to prevent unauthorized SSH logins.
--------------------------------------------------
root@LEDE:/#

The source code for this script can be found at https://gist.github.com/jamchamb/243e6973aeb5c9a2e302a4d4f57f16e1.

¹ In the context of PCs, the initial ramdisk is a more limited filesystem used for an intermediary “early userspace” stage. These images can still contain interesting programs, such as those used to get a disk decryption key from the user or TPM.
² https://people.kernel.org/linusw/how-the-arm32-linux-kernel-decompresses
³ https://github.com/torvalds/linux/blob/master/scripts/extract-vmlinux
⁴ If binwalk hadn’t already told us the image is little endian, the magic endianness value of 0x04030201 would. It’s stored as 01 02 03 04 near the beginning of the image, which tells us it’s stored little endian.
⁵ https://elixir.bootlin.com/linux/v4.4.50/source/scripts/Makefile.lib#L374
⁶ https://elixir.bootlin.com/linux/v4.4.50/source/arch/arm/boot/compressed/head.S#L576
⁷ https://forum.xda-developers.com/t/script-repack-zimage-sh-unpack-and-repack-a-zimage-without-kernel-source-v-5.901152/
⁸ https://elixir.bootlin.com/linux/v4.4.50/source/arch/arm/boot/compressed/vmlinux.lds.S
⁹ I’ve used the known indices of these values in my code, but some smarts could be added to automatically detect the right entries based on compression header magic (piggy start) and greatest offset before the GOT (piggy end).