KoizOS - Processes and Scheduling

Last post we finally made it to user space, or ring 3, and allowed transition back to kernel space, or ring 0. Naturally, the logical next step would be to develop user applications. As most modern operating systems allow multiple processes to run, it would be beneficial to have some sort of scheduling scheme as well. Thus we have a few components needed for the processes and scheduling piece:

Building a simple user program
Loading and running the user program
Adding scheduling

Building a simple user program

As nice as it would be to try to get ELF format working, I opted for hard coding a sample binary program directly in the filesystem. Said differently, whenever the filesystem loads, this program will be part of the initial filesystem.

int fs_load_starting_files()
{
    /* Write the test program to the new file system */
    {
        uint8_t* file_name = 
            (uint8_t*)"test.bin";
        uint8_t file_contents[] =
            {
                /* Code */
                0x66, 0xbf, 0x1c, /* mov edi, msg */
                0x00, 0x00, 0x00, 

                0x66, 0xb8, 0x01, /* mov eax, 1 */
                0x00, 0x00, 0x00, 

                0xcd, 0x33,       /* int 0x33 */

                0x66, 0xbf, 0x00, /* mov edi, 0 */
                0x00, 0x00, 0x00, 

                0x66, 0xb8, 0x02, /* mov eax, 2 */
                0x00, 0x00, 0x00, 

                0xcd, 0x33,       /* int 0x33 */

                /* Data */
                0x48, 0x65, 0x6c, 0x6c, /* "Hello from test program!\n\0" */
                0x6f, 0x20, 0x66, 0x72, 
                0x6f, 0x6d, 0x20, 0x74, 
                0x65, 0x73, 0x74, 0x20, 
                0x70, 0x72, 0x6f, 0x67,
                0x72, 0x61, 0x6d, 0x21, 
                0x0a, 0x00
            };

        int res = ramdisk_fat16_file_write(file_name, file_contents, 
            sizeof(file_contents));

        if(res != 0)
            panic("Failed to write starting file test.bin!");
    }

    printf("Loaded starting files to disk!\n");
    return 0;
}

To make it easier to read, I added the corresponding assembly code in comments next to the associated hex code. The first part pushes the message "Hello from test program!" to the first argument and 1 (the system call number) to the second argument. It then calls interrupt 0x33 which we designated as our system call interrrupt. Moving on, the second part pushes the return code 0 as the first argument, and 2 as the system call number. Like before, we call 0x33 again.

The system call numbers correspond to very specific behavior in how we handle them in KoisOZ:

System call 1 (print): This prints the text given from the first argument
System call 2 (exit): This exits the application with the return code given from the first argument.

Loading and running the user program

Now that we have a sample user program, it's time to working on loading and running it in our operating system. An operating system usually keeps a list of processes with a unique process id (PID):

void process_init()
{
    /* Zero out the process list */
    uint32_t i = 0;
    for(i = 0; i < MAX_PROCESSES; i++)
    {
        processes[i].pid = i;
        processes[i].state = UNUSED;
        processes[i].process_memory_start = NULL;
    }

    /* First process (PID 0) is always the idle process */
    uint8_t idle_process_name[] = "Idle";
    process_init_process(idle_process_name, 0);
    processes[0].state = RUNNABLE;
}

PID 0 is special case for KoisOZ. It's an idle process that runs whenever there are no other processes running. For actually adding our process to the list, we need to: (1) set it up in our process list, (2) load it from disk to memory, and (3) schedule it to be ran.

First, is setting up the process itself. Notice we zero out the registers and allocate a block of physical ram for it to reside in.

void process_init_process(uint8_t* process_name, int pid)
{
    strcpy(processes[pid].name, process_name, PROCESS_NAME_SIZE);

    processes[pid].current_priority = 0;
    processes[pid].cpu_time_ms = 0;
    processes[pid].killed = 0;

    memset(&(processes[pid].registers), 0, 
        sizeof(processes[pid].registers));

    processes[pid].process_memory_start = pmem_alloc();
    processes[pid].process_memory_size = PHYS_BLOCK_SIZE;
}

Second is loading it from disk to memory. The code here actually contains a function call to process_init_process above and loads the program into the allocated memory provided by it.


int process_execve(uint8_t* file_name, 
    uint8_t *argv[], uint8_t *envp[])
{
    // Check if the program exists first
    if(!fs_file_exists(file_name))
    {
        printf("Error! Program %s not found!\n", file_name);
        return 0;
    }
    
    // Set up the process
    int pid = process_get_next_free_pid();
    process_init_process(file_name, pid);

    // Load the program into memory
    int res = fs_file_read(file_name, 
        processes[pid].process_memory_start, 0);
    printf("Loaded process %s to memory with pid %d\n",
        file_name, pid);
        
    // LASTLY mark as runnable. 
    processes[pid].state = RUNNABLE;
    sched_addpid(pid);
    
    return 1;
}

Last is marking the process as runnable and adding it to our scheduler to be ran.

Scheduling

The last ingredient for everything is the scheduler. This is responsible for choosing which process runs at any given point in time. The simplest version is round-robin, but I elected for a slightly more complicated scheduling algorithm called MLFQ (as stated in the popular OSTEP book). Processes start at the highest level of priority, but as they consume more CPU time, they move down in priority. This keeps CPU-heavy tasks from bogging down the system too much while highly interactive apps that yield frequently stay in high priority for user responsiveness.

Let's first take a look at what happens when we add our process to the scheduler:

void sched_addpid(int pid)
{
    struct process* ptr = process_get(pid);

    proc_list[pid].ptr = ptr;
    proc_list[pid].priority = 0;

    proc_list[pid].tickets_remaining = STARTING_TICKETS;
    
    /* Push it in front of the priority list */
    proc_list[pid].next_process = proc_priority_list[0];
    proc_priority_list[0] = &(proc_list[pid]);

    printf("Schduler: Added new PID %d to scheduler. Name: %s \n", 
        pid, ptr->name);
}

We set our new process to the highest priority (priority 0), give it a set number of tickets (which are used to track how much CPU time it uses), and set it next on our priority list. Processing through the scheduler is done with sched_update(), which implements the MLFQ algorithm itself. The output of this is simply which process to run next. I added an abridge version of the algorithm below:


void sched_update(void)
{

    /* Scenerio 1: No current process exists. 
        In this case, simply find one in the highest priority queue */
    if(current_process == NULL) {
        int i = 0;
        for(i = 0; i < MAX_PRIORITY_QUEUES; i++) {
            if(proc_priority_list[i] != NULL) {
                current_process = proc_priority_list[i];
                break;
            }
        }
        return;
    } 

    /* Decrement the current pcoess by number of tickets and decrement priority if needed */
    current_process->tickets_remaining -= FULL_TIMESLICE_COST;
    if(current_process->tickets_remaining < 0) {
        current_process->tickets_remaining = 0;

        /* Bump process down the queue if we can */
        int cprior = current_process->priority;
        if(cprior < MAX_PRIORITY_QUEUES-1) {

            /* Decrement priority */
            int newprior = current_process->priority + 1;
            
            /* Add to lower priority queue */
            current_process->priority = newprior;
            current_process->tickets_remaining = STARTING_TICKETS;
            current_process->next_process = proc_priority_list[newprior];
            proc_priority_list[newprior] = current_process;
        }
    }
    
    /* Scenerio 2: Current process is NOT highest priority.
        Switch to a higher priority process (if possible) */
    if(current_process->priority != 0) {
        struct mlfq_entry* new_process = current_process;
        int i = 0;
        for(i = 0; i < MAX_PRIORITY_QUEUES; i++) {
            if(proc_priority_list[i] != NULL) {
                new_process = proc_priority_list[i];
                break;
            }
        }

        if(new_process->priority < current_process->priority) {
            current_process = new_process;
            return;
        }
    }

    /* Scenerio 3: Current process is the highest priority.
        Switch to the next process in line (if possible) */
    int cpr = current_process->priority;
    if(current_process->next_process != NULL) {
        current_process = current_process->next_process;
    } else if(proc_priority_list[cpr] != NULL) {
        current_process = proc_priority_list[cpr];
    }

    /* Scenerio 4: We're the only highest priority process.
        Continue as is! */
}

Putting it all together

By now, we have a test program, a process framework, and a scheduler. From here, we can demonstrate the following behaviors of how it all works:

We can use "ls" to see our user program in the filesystem.
We use a new "execbg" command which loads and executes a program in the background. For the sake of the demo, we use our test program, which will run infinitely versus terminate.
We can use the new "sched" and "ps" commands to see them added to process list and ran.

And that's it! It's been a multi-year journey of on-and-off work on this project, but I'm genuinely happy with where it finally ended up. I might revisit it again, or maybe improve on it someday.