The Floundering Zygote: 2011

Wednesday, 26 October 2011

Troubleshooting Tips for Lab 3.0

Okay so this lab is fairly difficult, but it is also fairly straight forward. Much of the difficulty comes from debugging problems most of us have no experience with.

I have given an overview of the Lab, which I may do an update to if I have time; a thing I am currently fairly short of, and I will now give some tips for debugging I have learned to hopefully help minimize your debugging time. I could give a more in depth overview of coremap, pages table, etc. but the lecture content already does a very good job.

Debugging Tips

Multiple Entries in the TLB

This could be caused by 1 of 2 major reasons. The first of which is a programming error in which you could read an incorrect entry from your page table causing you to write the same entry to the TLB a second time.

A second reason could be that you forgot to to OR in the VALID bit in your TLB entry.

Asserts failing in subpage_kmalloc()

Check all of your programming. This is most likely cause by a double free! Check all of your logic. Try commenting out frees until your problem goes away. However if you comment out all the frees in your program and this does not fix the problem then it is obviously not a double free.

Another cause could be freeing the wrong memory and reusing it somewhere else. If this memory belongs to one of your kernel structures i.e. a page table or the core map, this could cause major problems and very confusing errors.

These were two time consuming errors so hopefully this will help. If you have any other problems that most importantly you can articulate well, I can try and help.

I have been told that people have commented but they do not seem to be appearing. Checked the spam inbox; nothing. So you are welcome to email me at dixon_171 at hotmail dot com. This is not a primary email so you can feel free to spam it I will just ignore it.

-FlounderingZ

Saturday, 22 October 2011

Kernel Space does not go through the TLB

This confused me for a bit but now I have got it all sorted out.

Any virtual address greater than 0x80000000 does NOT go through the TLB. So all you have to do in allock_pages is find an open slot in your coremap and return paddr + 0x80000000. No need to write into your TLB or any processes page tables.

Hope this clears up any confusion between User Mode virtual addresses and Kernel Mode virtual addresses

- FlounderingZ

Friday, 21 October 2011

Coremaps and On-Demand Paging

Welcome to part 1 of Assignment #3, ‘the hardest assignment of your university career’ – Guest lecturer on Tuesday (sorry I don’t recall you name).

The entire VM system is fairly complicated but I will try and focus on just what you need for part 1. I may mention some things you will want to think about when designing your components that will not come until later parts so if I don’t explain them that is probably why. I would also like to remind everyone reading this that Prof. Lie has broken down this assignment very well already and his Lecture slides are very informative so please use them to your advantage to fill in any holes I leave.

Also I don’t have an internet connection so I may be guessing at the semantics of some things. Feel free to correct me. Then again, you are free to change anything in the system if you feel your method makes more sense or is more efficient.

Coremaps

A coremap is a mixture between a bitmap and some sort of inverted page table, although it is NOT a replacement for an IPT. I’m sure it could be, but a coremap is an abstraction over managing physical memory whereas an IPT is an abstraction over accessing physical memory.

Please refer to the lecture as it explains exactly what a coremap is (structure wise), I will explain how this construct applies to the assignment and how to use it as an abstraction.

The coremap will be what is doing the heavy lifting for getppages, alloc_kpages, and free_kpages. I will go over what each of these should do.

getppages – This function should simply get the next available physical page and return it. If there are no pages available you could return 0 so that alloc_kpages will attempt to swap or otherwise free a page, or you could do that within this function. This is the main interface to the Coremap. This is where you take in a VPN find an available PFN and pass it back to alloc_kpages. Remember that your Coremap is a hashtable but with some special properties. You CANNOT have more VPN in the hashtable than PFN unless some address spaces are sharing physical memory. This differs from a normal hash table in that chaining as a method of collision resolution becomes complicated. You must point back into your array of nodes rather than pointing to a linked list. What you store in your Coremap is really up to you. You could store ASIDs and PIDs, you could throw pointers to thread structures in there etc. Although I would suggest PIDs over thread pointers since you already build a level of abstraction for finding threads by PID for waitpid.

alloc_kpages – This function is called by kmalloc, please go check out the malloc implementation, see where this is used, etc. This function will be returning a vaddr back to kmalloc. Where does this vaddr come from? Well it will be coming from the coremap but you are also going to want to add all the page information into your page table however you decide to implement that. You may want to build some sort of abstraction over allocating multiple pages at once to store some information about memory allocations and allowing you to free multiple pages at a time. Involved with this you may want to make getppages attempt to get as many pages in a row as possible and if this is less than the total number you need then just try again. I would not suggest swapping things out to make room for a complete piece of memory however, also since swap doesn’t exist yet that causes another problem Winking smile

free_kpages – This function will be called by kfree. This function will most likely be very simple if you have made some sort of abstraction over memory allocation of multiple pages. This function will be interfacing with the core map because for each page it frees it has to look it up in the coremap and invalidate it in both the coremap as well as in the owner thread’s page table. Copy-on-write semantics may complicate this a little but I think you could just remove the thread from the list of threads sharing that page since the “free” is technically writing changes to the page so it should “copy” it and then “free” it aka don’t do either just delete you reference from the coremap as well as your page table.

On-Demand memory allocation

This section of the assignment has a lot to with the implementation of your address space but also involves your coremap. This section will be responsible for defining the as_*** functions and how vm_fault works. For the address space functions you need to understand what each one does.

as_create – This funciton allocates space, in the kernel, for a structure that does the bookkeeping for a single address space. It does NOT allocate space for the stack, the program binary, etc., just the structure that hold information about the address space.

as_prepare_load – this function is called just before the text or data sections are loaded into memory. They will just get the required pages for those sections. You may not need this anymore since you should only be allocating pages in vm_fault and you should be modifying load_elf to do on-demand loading of the program rather than loading everything at once.

as_define_region – This function is called in load_elf and simply sets up the bookkeeping for the data and text regions. You will use this data to recognize different types of page faults.

as_activate – This function activates a given address space as the currently in use one. Currently this just invalidates the entire TLB on a context switch but you can give address spaces unique identifiers and then use these in the TLB to differentiate between which address space is currently in use to save on overhead during a context switch. If you do that then this function will just set a global variable or something to the address space ID you wish to be the one you search in the TLB for.

as_destroy – This will actually do something now!!! Throw some asserts in to ensure that everything is how you expect it to be. You may need to free all the pages inside your page table, or maybe this is done earlier and you should assert that the table is empty. Then free the bookkeeping structure and continue on your way.

vm_fault

This guy is responsible for all the cool stuff. When the hardware decides it cannot find an address if jumps to the TLB exception handler. This calls vm_fault so this is where you should make the magic happen. First decide what kind of fault it is i.e. a read or a write. If it is a read then if you hit a page fault that is bad news and you should probably kill that program. If it is a write and you have a page fault then you should check if is allowed. If the fault is because someone is writing to just below the last page of their stack, give them a new one that is how the stack is supposed to work. If it is someone using the heap they should have used sbrk so again kill the program, or return some sort of error if possible. To check if you have a page fault you will be checking to see if there is an entry in the page table. If not and you have a legal page fault then you will want to get a pages from the coremap, add it to the page table, write the entry into the TLB and restart the instruction. If you do find it in the page table then simply write it into the TLB and restart the instruction. The only way for a load or store instruction to complete is if there is a mapping in the TLB. Our software is simply responsible for adding the required mappings in.

Splitting up the work

This is extra important for this assignment since we are a little short on time. There are 2 good ways in my opinion of breaking up this assignment. The first is to have one partner do part 1 and the other do part 2. This will work well for a group with a good understanding of how the entire VM system works. One person implements getting pages from memory and the other implements when and how a process should get that back. Part 2 is an abstraction built on top of part 1 so to do this simultaneously you will need good communication and a good understanding of the whole system.

A second method of doing this would be pairs programming or very heavy code review. Pairs programming involves programming at the same time together. This is a popular style of programming as you catch bugs before they have a chance to happen. Code review means nothing gets checked into the code repository until someone has reviewed it. This means you can work on this simultaneously but you do not compile and test until you have done code review on each others work.

With good communication between group members this assignment should be difficult be very manageable.

One last thing. The minimum you need to implement to test this will be the coremap, page tables, address space structure, as_* functions and vm_fault, wait that’s everything. You can skip some optimizations like the one involving the program file until after you have tested but almost everything is defendant on the vm.

Also, your vm should be one of the first things you bootstrap.

Questions can be left in the comments

- FlounderingZ

Friday, 14 October 2011

How to copy arguments from kernel buffers into a new address space

One of the major complexities to execv is copying the arguments from kernel buffers back into the new user space.

The first question you have to ask is, where do I copy them to? Well the only address you have that is valid in that userspace is the stack pointer.

What you are going to want to do is use the stack to store the arguments and then give the user program a modified stack pointer.

You are going to need to count the total number of strings you bring in as well as the total number of character (including the null terminators).

Take the total size of what you need to store, (num_strings * size of a char pointer) + (num_of_chars) and subtract this from the stack pointer (stack grows down) . Now that you have reserved the size you need you can write to it. Remember that pointer access to an array grows upwards so write the first pointer of your list to stackptr - total size.

Interestingly you want to give the user program the new stack pointer so it doesn't over write your data and you also need to give it a pointer (in userspace) to the argument list. These are the same thing!

Here is a picture of the stack that should make all of this more clear.

Remember the stack pointer must be aligned to 4 bytes so round up to the next highest multiple of 4.

Wrote this one quick. If you have any questions leave a comment. There may be other ways of doing this but beware DumbVM does some fairly dumb things so make sure you fully understand what is going on because if you are reusing physical memory unintentionally your values may work now but not when you implement paging.

- FlounderingZ

Friday, 7 October 2011

OS161 waitpid

The waitpid system call in OS161 has implications in more places than just the files involved with adding a system call. The design of waitpid will have implications in the way you create and destroy threads, and how you exit threads cleanly.

Let us first go over the high level concepts of what the entire waitpid system will need to do.

PID Management

I will run on the assumption that you have created a way of getting the next available PID in your system along with the getpid system call. I will refer to the getting of the next PID as a function called get_new_pid.

So what does this system of allowing processes to wait on each other need to do.

First of all we need a way of getting the thread structure of a process based on it's PID. We need to do this because the semantics of waitpid state that we need to return some information about the process so the simplest way (although not the only) would be to have public members of the thread structure for waitpid's use. Additionally, the scheduler operates on thread structures and we will most likely need similar information in the scheduler and waitpid mechanisms so the thread structure seems like the simplest option.

So how can we get a thread structure from a pid? Data Structures are your friends! Hash Tables, Linked Lists, Binary Trees, Resizing Arrays, etc. Be creative, use something you are comfortable with and you know will have a stable implementation. For the purposes of this post I will refer to however you created this mechanism as get_thread_by_pid(). Along with this you will have things such as add_thread_by_pid(), remove_thread_by_pid(), etc. These are just standard methods for adding to your chosen data structure and removing

You are going to want to add_thread_by_pid() after get_new_pid() and when a thread is being destroyed you are going to need to remove_thread_by_pid() and allow the pid to be reclaimed.

Once this is implemented you will be able to get a thread structure based on a PID. The semantics of waitpid state that the parent must receive the child's exit code. A way of doing this would be to, during the exit system call, save the exit code into the thread structure and then get_thread_by_pid() and read from the thread structure in the thread that is waiting.

waitpid

Now lets talk about actually implementing waitpid. This call should not be that complicated, most of the code is just book keeping. You could implement this by recreating another piece of code and just sleeping on the address of the child thread and then having the child thread call thread_wakeupone (assuming you implemented something like this) when it exits. You could also choose to use a condition variable instead of recreating the CV code. You would probably want to store the CV object in the thread structure. Also remember that if you want to add something to the thread structure be sure to initialize it in thread_create.

One last thing that you need to worry about is a race condition when the thread exits. You will be trying to get_thread_by_pid at the same time that exorcise is going to be trying to delete that thread structure. One way you can get around this would be to add a new thread state that would represent a thread that has exited but is not yet a zombie because someone has been waiting on it. To do this you would also need to know if someone is waiting on you. You may also want to have a list of who is waiting on you, notify them in a FIFO order etc. You must conform to the semantics detailed in the man pages for OS161 and anything else is just going to be bonus. You would also need to modify thread_exit so that if it has any waiters it won't add the thread to the zombies array, waking all waiters first. This is the part that involves the biggest design decision. I hope that this gives you guidance on what exactly needs to be done but there is no way to be more specific without giving a solution. Just remember that you need to return your exit code to any waiters, handle memory correctly, and eliminate a race condition with exorcise.

This is a preliminary analysis from my standpoint so if there are any specific implementation problems people are having I can try and answer them in the comments.

To those who celebrate a very Happy Thanksgiving and to those who don't enjoy your long weekend! (Although I hope those who celebrate enjoy their weekends as well!)

- FlounderingZ

Wednesday, 5 October 2011

OS161 Execv Part 2

Continuing from part 1, what exactly do we have left to do and what do we have accomplished?

First of all we have added the syscall to the kernel but it is currently empty. So we have a call to execv(const char *progname, char **args); from the user side which means that the arguments we get on the kernel side are as above.

All we really need to do now is recreate runprogram() with some very small tweaks.

Let us go over what runprogram() does and then we will see what tweaks need to be made.

runprogram

The first thing it does is look for the file name provided. Next we create a new address space and activate it in the TLB (Translation Lookaside Buffer) which is how the hardware caches memory access and is a part of the MIPS ISA (Instruction Set Architechture).

Now that we have our address space set up we can call load_elf which handles the semantics of the ELF format and loading into the correct segment of our address space for us.

Now that we have the binary in memory we can close that file and define our usermode stack.

runprogram() now calls md_usermode which warps it into usermode to start executing at entrypoint which is the location of (most likely) the start symbol in the binary. entrypoint is returned from load_elf, yet another thing we don't have to trouble ourselves with.

So runprogram is done, what is different for execv?

execv

For execv we need to pass the arguments to the program through the exception handler, into the kernel syscall, do some work, pass them to md_usermode.

So getting them through the exception handler is done for you and is detailed in part 1, and passing the to md_usermode is quite obviously trivial.

The question is what work has to be done on them before you pass them to md_usermode?

All you really have to do is count how many arguments you have received in the char **args array and ensure that the last argument in the list is NULL. You can also check that the first argument matches the filename but this is a convention so you shouldn't be strict about it. Once you have counted the number of arguments, you may want to check that each is NULL terminated as well since they are strings, just pass argv and argc into md_usermode.

Remember that you are going to need to copy the arguments into kernel space from userspace and then back out to userspace. You will probably want to put them on the heap, using kmalloc, since the only thing you need to ensure is on the stack when going to usermode are the arguments which are just pointers. By having the memory on the heap we can still access it from usermode.

-FlounderingZ

Tuesday, 4 October 2011

OS161 Fork - Major Difficulties

A lot of people, including myself, have been having troubles with getting fork() to work as expected. Things like TLB misses on load or write, Bus errors, and other panic codes you never want to see.

What I am going to do is go over are Major Areas that can cause problems that are very difficult to find.

sys_fork

You must copy the address space BEFORE you call thread fork other wise you have a synchronization issue.
You must also copy the trapframe BEFORE you call thread fork. The is however another small issue with this. The trapframe must be on the heap not the stack so the space must be allocated using kmalloc

md_forkentry

All you have to do is assign the pointer to the address space you passed in, into the thread structure. Don't worry about anything else.
You now have to copy the trapframe from the heap onto the stack of the current thread which means this time you DO NOT want to use kmalloc just declare a struct NOT a pointer and copy into that location.
Last you will want to increment your epc, set your retrun value to zero and make sure userland will see the return as a success (please see comment at the beginning and code at the end of mips_syscall)

That is it. There are really very few 'tricks' to fork() just a couple of hard to debug memory issues.

Feel free to put comments below I will try and answer them if they are appropriate

- FlounderingZ

Monday, 3 October 2011

OS161 Fork

Lets make this one quick. It is late but I know lots of people will want this in the morning. Apologies for any oversights or errors. Also the formatting will probably suck.

How exactly do you “Fork”.

As the professor has said a lot has already been implemented for us.

We need to do 2 things. Make a sys_fork function that will be called from mips_syscall, make md_forkentry which is the first function that the newly forked thread will call.

sys_fork

Before we create a new thread me must get a pid for it.

EDIT: This can also be done in thread_fork. Including associating the thread with your tracking system.

Once that is done we can create a new thread, attach that thread structure to whatever tracking system you have involving pids, and return that pid to the parent. This may seem complicated but all it actually means is returning the value from your sys_fork function and letting mips_syscall take care of the rest. mips_syscall will put the return value into the parent threads trapframe, advance the PC and return. Execution then goes back into exception.S does some loading from the TF and returns to usermode.

What else do we do before we create a new thread? Copy the trapframe of course. The trapframe holds all of the information that was saved when we dropped into the kernel. Just do a memcpy into a temporary struct pointer. What else do we have to do? Copy the address space. Linux does this in a very smart way to make forking very light and only copies memory when a thread tries to write to it. This is actually fairly simple because this functionality is built into the VM and they just set some flags on the pages and it is all good. In our case we will just use as_copy and laugh as it does nothing exceptionally useful for now =D.

Okay so now we have done some stuff… right? What is next. We are going to call thread_fork and the argument we want to pass is the trapframe copy we made earlier. The function we would like it to call is md_forkentry (that name makes a little more sense now doesn’t it).

SIDENOTE: You have a decision to make. The new thread can attach its PID to your tracking system instead of the parent. This could be a smarter way of doing things and all you need to do is modify md_forkentry to take the pid as the second argument. This will depend on how your waitpid system is designed.

md_forkentry

Now this is where the magic happens. We have 2 threads with identical trapframes, same instruction spaces and interrupts are on. As stated above, the parent is simple. Just take the PID you got from get_new_pid (or whatever you called it) and return that from sys_fork. As for the child, they are in md_forkentry which is empy right now. What should it be doing?

This is where we need to do a couple of child specific things. We need to set our return value in the copied trapframe to 0 since we are the child which is done by modifying the v0 register. We also need the copy of the address space. The address space needs to be copied in the parent before thread_fork other wise it could have changed and not be what we want anymore. The question is how do we get its value to the new thread. It could be passed into thread_fork as the second argument to the function (an unsigned long, this seems a little dirty). Another option is to use the trapframe creatively. In the parent after a copy trapframe has been made we can use a0-a3 since we know that fork takes no arguments. This seems a little dirty as well.

That should be all of it. Of course there are some things left out but it should answer the big questions or a least point you on the right track.

100% there is a different way of approaching this, maybe in a more efficient way, hopefully it helped!
-FlounderingZ

Saturday, 1 October 2011

OS161 execv Part 1

The function prototype

So the first thing we will do is decipher exactly what the prototype in unistd.h means.

int execv(const char *prog, char *const *args);

The syscall takes a constant character pointer to the program name. This could be a string literal “testbin/progname” or a string you have created. The second argument is more interesting although not complicated. It is simply an array of constant character pointers. A little more in depth this means that the variable args may be changed i.e.

args = new_args_array;

Each pointer in the array of constant character pointers that args points to however cannot be modified i.e.

args[0] = "Hello"; //"Hello" is a const char * but no go.

All of this however is kind of irrelevant because we will be passing these values through a syscall so gcc wont be generating any code and we do not have to respect these rules. The prototype may however give you a useful warning if you try something silly.

 Dropping into the Kernel

This part is all done for you. Here is a checklist of how to add a syscall just in case you forgot

Add the value to callno.h in kern/include/kern (already done)
Add the user syscall prototype to include/unistd.h (already done)
Add the prototype of the kernel version of the syscall to kern/include/syscall.h
Add a case to the switch statement in kern/arch/mips/mips/syscall.c
Add a call to the kernel version of the syscall to the case you just added. (you may not know all of the semantics you need just yet but just add a //TODO and if you have problems later do a “grep –r ‘TODO’ *” from your top level directoy.)

But what about our arguments?

The comment at the top of kern/arch/mips/mips/syscall.c is very informative on this matter.

When the user call into libc uses the syscall instruction to drop into the kernel.

That instruction jumps to the exception symbol in exception.S in kern/arch/mips/mips. This does some setup, sets up the trapframe and then jumps to mips_trap in kern/arch/mips/mips/trap.c and then it is going to hand off execution to mips_syscall.

During the setup of the trapframe before the call to mips_trap our arguments from userland were saved into the trap frame. The arguments are available at tf->a0, tf->a1, tf->a2, tf->a3, although we only need a0 and a1.

So now we have our arguments… but we haven’t done anything yet, have we? Not really, but maybe there will be a part 2?

- FlounderingZ

OS161 PID Allocation

The quickest way to find a solution to essentially any problem you come across while working on OS161 is to look at the Linux kernel source or other similar resources.
Some useful sites include :

The IBM Technical Library

An IBM article on Linux Process Management

Some of the Linux kernel read with a text-to-speech synthesizer (including text source)

Linux Kernel Repository with tagged symbols

Stack Overflow

How The Linux Kernel Works
Now obviously the Linux Kernel does a lot of optimizations and has many more features than we currently have to deal with in OS161 so I will keep the details light.

For our implementation each process will have a PID. In Linux each process may have many process IDs, thread group IDs, process group IDs, and session IDs, and all of these are included in the task_struct structure. Linux keeps all of these IDs tucked away in hashed doubly linked lists and iterates over them in very weird, very (assumingly) optimized ways.

We do not need a lot of these implementation details. What most of us are looking for is how does Linux keep track of PIDs of any type, how does it reclaim them, etc.

What the Linux kernel does is make an array of bitmaps (explanation to follow) that are the same size as memory pages. It uses these in combination with test and set operations to create a management system that does not use locks, is allocated page-by-page and thus has a low memory over head, and is scalable up to 4 million PIDs (the size of an unsigned int).

So, what is a bitmap. If you have heard of bitfields, same concept. The Linux kernel will initially allocate a single page of memory (probably 4KB) and use each bit to represent a PID. This allows you to determine if PIDs 0-4095 are in use using only 4KB. This is extensible because if the PID is greater than any value on the current page, allocate a new one. If the PID is greater than your defined max_pid, wrap around. To keep track of available PIDs per page, Linux has a counter associated with each page of how many PIDs are available. This is atomically incremented and decremented appropriately. This allows you to quickly identify if there are any PIDs left on a given page.

Once a process has a PID Linux does a few more things with it. It gets attached to the task, added to hashed-doubly-linked lists, etc. Since we do not expose threads to the user, only processes, we don’t have much more to worry about in our allocation scheme. You can even use the existing thread structure and assign the PID to the thread ID. You may need to be able to look up a thread structure (if you use it) by PID for things like parent-child relationships, waiting until a process reaches a certain state. These things can be accomplished other ways but you may want to throw all your processes into a hash table or something.

What should I do for OS161?
For OS161 you are not required to implement exactly what the Linux kernel does but understanding what a large system is useful information. Using this you could decide to make this on a smaller scale. If you only need 4096 PIDs, don’t bother with pages just use a single bit map. Feel free to think of other methods of implementing this, you could create a linked list of reclaimed PIDs and only increment your max PID when this is empty. Remember for an assignment like this simplicity can be your friend as school deadlines are not very flexible.

Hopefully this sheds some lights on the mystery of PID allocation
- FlounderingZ

Edit : Prof. Lie has read this content and feels it lies well within the guidelines of Academic Integrity at U of T. Feel free to share with your friends and check back for more.
Edit 2: Never edit a post after uploading from Windows Live Writer, wow!