The Floundering Zygote: execv


Friday, 14 October 2011

How to copy arguments from kernel buffers into a new address space

One of the major complexities of execv is copying the arguments from kernel buffers back out into the new user address space.

The first question you have to ask is: where do I copy them to? Well, the only address you have that is valid in that new address space is the stack pointer.

What you are going to want to do is use the stack to store the arguments and then give the user program a modified stack pointer.

You are going to need to count the total number of strings you bring in as well as the total number of characters (including the null terminators).

Take the total size of what you need to store, (num_strings * size of a char pointer) + (num_of_chars), and subtract this from the stack pointer (the stack grows down). If the pointer list has to end in a NULL pointer, as argv conventionally does, make sure that extra pointer slot is counted in num_strings. Now that you have reserved the space you need, you can write to it. Remember that indexing into an array moves upwards in memory, so write the first pointer of your list at stackptr - total size.

Interestingly, you want to give the user program the new stack pointer so it doesn't overwrite your data, and you also need to give it a pointer (in userspace) to the argument list. These are the same thing!

Here is a picture of the stack that should make all of this more clear.

Remember the stack pointer must be aligned to 4 bytes, so round the total size up to the next multiple of 4 before you subtract it.
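To make the arithmetic concrete, here is a rough sketch that you can run in ordinary userspace. It is not OS161 code: the "stack" is just a byte buffer, user addresses are plain 32-bit integers, and layout_args is a name I made up for illustration. In the real kernel you would write into the new address space with something like copyout instead of memcpy. The sketch assumes the pointer list ends in a NULL entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PTR_SIZE 4u /* size of a user-space char * on 32-bit MIPS */

/*
 * Hypothetical helper. mem[0] stands for user address `base`, and
 * `stacktop` is the initial user stack pointer. Writes the pointer
 * list followed by the string bytes just below stacktop and returns
 * the new stack pointer, which is also the user address of argv.
 * Pointers are stored in host byte order, which is fine for a simulation.
 */
uint32_t
layout_args(uint8_t *mem, uint32_t base, uint32_t stacktop,
            int argc, const char *const *kargs)
{
    uint32_t nslots = (uint32_t)argc + 1; /* argc pointers + NULL */
    uint32_t nchars = 0;

    for (int i = 0; i < argc; i++) {
        nchars += (uint32_t)strlen(kargs[i]) + 1; /* include the '\0' */
    }

    uint32_t total = nslots * PTR_SIZE + nchars;
    total = (total + 3u) & ~3u; /* round size up to a multiple of 4 */

    uint32_t sp = stacktop - total; /* the stack grows down */
    assert(sp >= base && sp <= stacktop);

    /* Strings live right after the pointer list. */
    uint32_t strpos = sp + nslots * PTR_SIZE;
    for (int i = 0; i < argc; i++) {
        uint32_t len = (uint32_t)strlen(kargs[i]) + 1;
        /* argv[i] = user address of string i */
        memcpy(mem + (sp + (uint32_t)i * PTR_SIZE - base), &strpos, PTR_SIZE);
        /* the string bytes themselves */
        memcpy(mem + (strpos - base), kargs[i], len);
        strpos += len;
    }

    /* argv[argc] = NULL */
    uint32_t null_ptr = 0;
    memcpy(mem + (sp + (uint32_t)argc * PTR_SIZE - base), &null_ptr, PTR_SIZE);

    return sp;
}
```

The value returned here plays both roles described above: it is the stack pointer you hand to the user program and the userspace address of the argument list.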

Wrote this one quick. If you have any questions, leave a comment. There may be other ways of doing this, but beware: DumbVM does some fairly dumb things, so make sure you fully understand what is going on. If you are unintentionally reusing physical memory, your values may work now but not once you implement paging.

- FlounderingZ

Wednesday, 5 October 2011

OS161 Execv Part 2

Continuing from part 1, what exactly do we have left to do and what do we have accomplished?

First of all we have added the syscall to the kernel but it is currently empty. So we have a call to execv(const char *progname, char **args); from the user side which means that the arguments we get on the kernel side are as above.

All we really need to do now is recreate runprogram() with some very small tweaks.

Let us go over what runprogram() does and then we will see what tweaks need to be made.

runprogram

The first thing it does is look for the file name provided. Next we create a new address space and activate it. Activation involves the TLB (Translation Lookaside Buffer), which is how the hardware caches virtual-to-physical address translations and is a part of the MIPS ISA (Instruction Set Architecture).

Now that we have our address space set up we can call load_elf which handles the semantics of the ELF format and loading into the correct segment of our address space for us.

Now that we have the binary in memory we can close that file and define our usermode stack.

runprogram() now calls md_usermode which warps it into usermode to start executing at entrypoint which is the location of (most likely) the start symbol in the binary. entrypoint is returned from load_elf, yet another thing we don't have to trouble ourselves with.

So runprogram is done, what is different for execv?

execv

For execv we need to get the arguments through the exception handler and into the kernel syscall, do some work on them, and then pass them to md_usermode.

Getting them through the exception handler is done for you and is detailed in part 1, and passing them to md_usermode is trivial.

The question is what work has to be done on them before you pass them to md_usermode?

All you really have to do is count how many arguments you have received in the char **args array and ensure that the last entry in the list is NULL. You can also check that the first argument matches the filename, but this is only a convention so you shouldn't be strict about it. You may also want to check that each argument is null terminated, since they are strings. Once you have counted the arguments, just pass argv and argc into md_usermode.
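The counting step might look something like the sketch below. The names count_args and MAX_ARGS are made up for illustration, and the limit of 64 is an arbitrary assumption. This version walks an array that is already readable; inside the kernel each pointer would first have to be fetched from userspace (for example with copyin) rather than dereferenced directly.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ARGS 64 /* hypothetical upper bound, pick your own */

/*
 * Returns the number of strings before the terminating NULL,
 * or -1 if no NULL shows up within the first MAX_ARGS + 1 entries.
 */
int
count_args(char *const *args)
{
    assert(args != NULL);

    for (int n = 0; n <= MAX_ARGS; n++) {
        if (args[n] == NULL) {
            return n; /* found the terminator, n strings came before it */
        }
    }
    return -1; /* too many arguments or a missing NULL */
}
```

Bounding the loop matters because the array comes from a user program, which may have forgotten the NULL entirely.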

Remember that you are going to need to copy the arguments into kernel space from userspace and then back out to userspace. You will probably want to put the kernel copies on the heap, using kmalloc, so that they survive while the old address space is replaced by the new one. Keep in mind that usermode cannot read kernel heap memory directly, so the strings and the pointers to them have to be copied out into the new address space before you go to usermode.
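Here is a userspace sketch of the "copy in" half, just to show the shape of the data structure. It is only a stand-in: malloc and calloc play the role of kmalloc, memcpy plays the role of copyinstr, and copy_args and free_args are names I invented. Real kernel code also has to deal with bad user pointers, which this does not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated array produced by copy_args. */
void
free_args(char **kargs)
{
    if (kargs == NULL) {
        return;
    }
    for (int i = 0; kargs[i] != NULL; i++) {
        free(kargs[i]);
    }
    free(kargs);
}

/*
 * Deep-copies argc strings into freshly allocated memory.
 * Returns a NULL-terminated array, or NULL if an allocation fails.
 */
char **
copy_args(int argc, char *const *args)
{
    assert(argc >= 0 && args != NULL);

    /* calloc zeroes the slots, so a partial copy is still NULL-terminated */
    char **kargs = calloc((size_t)argc + 1, sizeof(char *));
    if (kargs == NULL) {
        return NULL;
    }

    for (int i = 0; i < argc; i++) {
        size_t len = strlen(args[i]) + 1; /* include the '\0' */
        kargs[i] = malloc(len);
        if (kargs[i] == NULL) {
            free_args(kargs); /* clean up whatever we copied so far */
            return NULL;
        }
        memcpy(kargs[i], args[i], len);
    }
    kargs[argc] = NULL;
    return kargs;
}
```

Once the copies exist they no longer depend on the caller's memory, which is the point: the original strings disappear along with the old address space.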

-FlounderingZ

Saturday, 1 October 2011

OS161 execv Part 1

The function prototype

So the first thing we will do is decipher exactly what the prototype in unistd.h means.

int execv(const char *prog, char *const *args);

The syscall takes a pointer to constant characters for the program name. This could be a string literal “testbin/progname” or a string you have created. The second argument is more interesting although not complicated. It is simply an array of constant pointers to char. In a little more depth, this means that the variable args itself may be changed, i.e.

args = new_args_array;

Each pointer in the array that args points to, however, cannot be modified, i.e.

args[0] = "Hello"; // no go: args[0] is a constant pointer

All of this, however, is kind of irrelevant because we will be passing these values through a syscall. The const rules are only enforced by gcc on the user-side code; the kernel just receives raw addresses, so it does not have to respect these rules. The prototype may, however, give you a useful warning if you try something silly.
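If you want to see the rules for yourself, here is a tiny sketch you can compile. The function name poke_args is made up. The forbidden assignment is left in a comment because it would not compile. Note one thing the prototype does allow: the characters the pointers point at are not const, so they can still be changed.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the first character of the first string after editing it. */
char
poke_args(char *const *args, char *const *other)
{
    assert(args != NULL && other != NULL);

    args = other;         /* fine: args itself is not const */
    /* args[0] = "Hello";    error: args[0] is a char *const */
    args[0][0] = 'H';     /* fine: the characters are not const */

    return args[0][0];
}
```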

Dropping into the Kernel

This part is all done for you. Here is a checklist of how to add a syscall, just in case you forgot:

Add the value to callno.h in kern/include/kern (already done)
Add the user syscall prototype to include/unistd.h (already done)
Add the prototype of the kernel version of the syscall to kern/include/syscall.h
Add a case to the switch statement in kern/arch/mips/mips/syscall.c
Add a call to the kernel version of the syscall to the case you just added. (You may not know all of the semantics you need just yet, but just add a //TODO and if you have problems later do a "grep -r 'TODO' *" from your top level directory.)

But what about our arguments?

The comment at the top of kern/arch/mips/mips/syscall.c is very informative on this matter.

When the user program calls into libc, libc uses the syscall instruction to drop into the kernel.

That instruction jumps to the exception symbol in exception.S in kern/arch/mips/mips. This does some setup, builds the trapframe and then jumps to mips_trap in kern/arch/mips/mips/trap.c, which in turn hands off execution to mips_syscall.

During the setup of the trapframe before the call to mips_trap, our arguments from userland were saved into the trapframe. The arguments are available at tf->tf_a0, tf->tf_a1, tf->tf_a2, tf->tf_a3, although we only need a0 and a1.
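To tie the checklist and the trapframe together, here is a mocked-up version of the dispatch that runs in userspace. Everything with "mock" in the name is invented for this sketch, including the syscall number and the error value; the real struct trapframe has many more fields and the real numbers live in callno.h. The only thing taken from OS161 is the idea: the syscall number arrives in v0 and the first two arguments in a0 and a1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-in for the registers saved in the trapframe. */
struct mock_trapframe {
    uint32_t tf_v0; /* syscall number on entry */
    uint32_t tf_a0; /* first argument: progname (a user address) */
    uint32_t tf_a1; /* second argument: args (a user address) */
    uint32_t tf_a2;
    uint32_t tf_a3;
};

#define MOCK_SYS_EXECV 2 /* hypothetical number, see callno.h for the real one */
#define MOCK_ENOSYS    1 /* hypothetical "no such syscall" error */

/* Records what the kernel side of execv was handed. */
uint32_t seen_prog, seen_args;

int
mock_sys_execv(uint32_t progname, uint32_t args)
{
    /* TODO: the real work goes here */
    seen_prog = progname;
    seen_args = args;
    return 0;
}

int
mock_syscall(struct mock_trapframe *tf)
{
    assert(tf != NULL);

    switch (tf->tf_v0) {
    case MOCK_SYS_EXECV:
        /* pull the two arguments out of the saved registers */
        return mock_sys_execv(tf->tf_a0, tf->tf_a1);
    default:
        return MOCK_ENOSYS;
    }
}
```

Notice that both arguments are still just user addresses at this point. Nothing has been copied into the kernel yet.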

So now we have our arguments… but we haven’t done anything yet, have we? Not really, but maybe there will be a part 2?

- FlounderingZ