As you have seen in homework assignment 2, the kernel maintains the state for each process and records that state in the state field of the task_struct of the process. The state indicates whether the process is runnable or running (TASK_RUNNING), sleeping (TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE), stopped (__TASK_STOPPED), dead (TASK_DEAD), etc. When a process is dead, the exit_state field of the task_struct of the process indicates whether the process is zombied (EXIT_ZOMBIE) or really dead (EXIT_DEAD). For this homework, you will need to trace the state changes for processes and record them in a ring buffer. Then you will need to write a synchronization mechanism based on the ring buffer. Other than changes required in existing files in the kernel source code, your code should be implemented in a file pstrace.c in the kernel directory, i.e. kernel/pstrace.c, and a file pstrace.h in the kernel include directory, i.e. include/linux/pstrace.h.
Trace the state change of processes in a ring buffer. Write a system call that enables the tracing of a process and another system call that disables the tracing. The interfaces of these system calls are:
#define PSTRACE_BUF_SIZE 500 /* The maximum size of the ring buffer */
/*
* Syscall No. 441
* Enable the tracing for @pid. If -1 is given, trace all processes.
*/
long pstrace_enable(pid_t pid);
/*
* Syscall No. 442
* Disable tracing.
*/
long pstrace_disable();
In other words, your tracing system will either trace a process or trace all processes. Tracing a process includes tracing all threads that share the same pid. You should return an appropriate error if the pid does not exist. If tracing is already enabled, pstrace_enable will replace the set of processes being traced with what is specified in the newer pstrace_enable call.
In addition to the global ring buffer, you should maintain a global data structure to track what processes are being traced. You may not modify the task_struct for this assignment.
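The enable/disable semantics described above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the names model_enable, model_disable, model_is_traced, and PSTRACE_ALL are hypothetical, the check that the pid exists is omitted, and a kernel version would protect this state with a lock.

```c
#include <stdbool.h>
#include <sys/types.h>

#define PSTRACE_ALL (-1)   /* hypothetical name for the "trace all" pid value */

/* Hypothetical global tracking state (kept outside task_struct). */
static bool  trace_enabled;   /* false until an enable call succeeds */
static pid_t traced_pid;      /* a single pid, or PSTRACE_ALL */

/* Model of pstrace_enable: a newer call replaces the previous target. */
static void model_enable(pid_t pid)
{
    traced_pid = pid;
    trace_enabled = true;
}

/* Model of pstrace_disable: stop tracing everything. */
static void model_disable(void)
{
    trace_enabled = false;
}

/*
 * tgid is the value getpid() returns, shared by all threads of a process,
 * so matching on it covers every thread of the traced process.
 */
static bool model_is_traced(pid_t tgid)
{
    if (!trace_enabled)
        return false;
    return traced_pid == PSTRACE_ALL || traced_pid == tgid;
}
```

Matching on the thread-group id rather than the per-thread id is what makes "tracing a process" include all of its threads.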
You should implement a function that will record state changes for the respective processes and call this function from various places in the kernel to capture those state changes. The function should record the state change in a ring buffer. The interface of the function is:
/* The data structure used to save the traced process. */
struct pstrace {
char comm[16]; /* name of the process */
long state; /* state of the process */
pid_t pid; /* pid of the process, ie returned by getpid */
pid_t tid; /* tid of the thread, ie returned by gettid */
};
/* Add a record of the state change into the ring buffer. */
void pstrace_add(struct task_struct *p, long state);
You should trace the following seven states and record them in pstrace.state:
TASK_RUNNING
TASK_RUNNABLE (TASK_RUNNING state but not running on a CPU)
TASK_INTERRUPTIBLE
TASK_UNINTERRUPTIBLE
__TASK_STOPPED
EXIT_ZOMBIE
EXIT_DEAD
For example, if a process’s state changes from TASK_RUNNING to TASK_INTERRUPTIBLE, you should add a record with state TASK_INTERRUPTIBLE indicating that the state has changed to TASK_INTERRUPTIBLE. This does not necessarily mean that you need to record every instance when the state field in the task_struct is modified. For example, if the Linux code changes state from TASK_RUNNING to TASK_INTERRUPTIBLE to TASK_RUNNING all without actually running another task, the process’s state did not really change from TASK_RUNNING. A key part of this assignment is figuring out where a process’s state actually changes and recording those events. You should carefully consider the discussion in class regarding the lifecycle of a process in Linux. Note that we are also asking you to trace when a process switches from being on the run queue to actually running on the CPU, even though Linux denotes both of those states as TASK_RUNNING. To do this, you should introduce a TASK_RUNNABLE state for tracing purposes only (i.e. TASK_RUNNABLE is not stored in the actual state field of the task_struct). You should define TASK_RUNNABLE to have a value of 3.
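When printing records in a test program, a helper that maps the recorded value to a name can be handy. The sketch below is a user-space illustration only: the helper name pstrace_state_name is hypothetical, and the numeric values other than TASK_RUNNABLE are assumed to match include/linux/sched.h in a Linux 5.x tree, so check them against your own kernel source.

```c
/* Assumed to match include/linux/sched.h (Linux 5.x); verify in your tree. */
#define TASK_RUNNING          0x0000
#define TASK_INTERRUPTIBLE    0x0001
#define TASK_UNINTERRUPTIBLE  0x0002
#define TASK_RUNNABLE         3        /* tracing-only value from the assignment */
#define __TASK_STOPPED        0x0004
#define EXIT_DEAD             0x0010
#define EXIT_ZOMBIE           0x0020

/* Hypothetical helper: name of a recorded pstrace.state value. */
static const char *pstrace_state_name(long state)
{
    /* Compare by equality: 3 is also the bitwise OR of the two sleep states. */
    switch (state) {
    case TASK_RUNNING:         return "TASK_RUNNING";
    case TASK_RUNNABLE:        return "TASK_RUNNABLE";
    case TASK_INTERRUPTIBLE:   return "TASK_INTERRUPTIBLE";
    case TASK_UNINTERRUPTIBLE: return "TASK_UNINTERRUPTIBLE";
    case __TASK_STOPPED:       return "__TASK_STOPPED";
    case EXIT_ZOMBIE:          return "EXIT_ZOMBIE";
    case EXIT_DEAD:            return "EXIT_DEAD";
    default:                   return "UNKNOWN";
    }
}
```

Because 3 equals TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE, recorded states should be compared as whole values rather than tested as bit masks.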
Since the ring buffer is shared by all CPUs, you should properly use locks to protect the ring buffer from race conditions. You should also maintain a buffer counter, which is a persistent count of the number of records that have been recorded to the ring buffer. You may find it helpful to define your own data structure for each record in the ring buffer that contains more information than the pstrace structure.
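The relationship between the ring buffer and the persistent buffer counter can be sketched in user space as follows. This is a minimal model, not kernel code: struct pstrace_ring, model_add, and model_oldest are hypothetical names, locking is left out, and in the kernel the add path would run under a lock that is safe to take from scheduler context.

```c
#include <string.h>
#include <sys/types.h>

#define PSTRACE_BUF_SIZE 500

struct pstrace {
    char  comm[16];   /* name of the process */
    long  state;      /* state of the process */
    pid_t pid;        /* pid of the process */
    pid_t tid;        /* tid of the thread */
};

/* Hypothetical ring: record number n (1-based) lives in slot (n - 1) % size. */
struct pstrace_ring {
    struct pstrace buf[PSTRACE_BUF_SIZE];
    long counter;     /* total records ever added; never reset */
};

static struct pstrace_ring ring;

/* Model of pstrace_add: overwrite the oldest slot once the ring is full. */
static void model_add(const char *comm, pid_t pid, pid_t tid, long state)
{
    struct pstrace *rec = &ring.buf[ring.counter % PSTRACE_BUF_SIZE];

    strncpy(rec->comm, comm, sizeof(rec->comm) - 1);
    rec->comm[sizeof(rec->comm) - 1] = '\0';
    rec->state = state;
    rec->pid = pid;
    rec->tid = tid;
    ring.counter++;
}

/* Number of the oldest record still stored (1 while nothing was overwritten). */
static long model_oldest(void)
{
    if (ring.counter > PSTRACE_BUF_SIZE)
        return ring.counter - PSTRACE_BUF_SIZE + 1;
    return 1;
}
```

Deriving the slot from the counter means no separate head or tail index has to be kept consistent with it.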
Copy the tracing buffer into user space. You should write a system call that can copy the records in the ring buffer to user space. The interface of the system call is:
/*
* Syscall No. 443
*
* Copy the pstrace ring buffer into @buf.
* If @counter > 0, the caller process will wait until a full buffer can
* be returned after record @counter (i.e. return record @counter + 1 to
* @counter + PSTRACE_BUF_SIZE), otherwise, return immediately.
*
* Returns the number of records copied.
*/
long pstrace_get(struct pstrace *buf, long *counter);
Note that if @counter is positive, your system call should sleep until a full buffer can be returned. A full buffer is when the buffer counter is equal to @counter plus PSTRACE_BUF_SIZE. Your system call should copy the records into @buf in chronological order such that the first entry is the entry corresponding to buffer counter @counter + 1 and the last entry is @counter + PSTRACE_BUF_SIZE, and should return in @counter the value of the buffer counter corresponding to the last record copied. For example, if pstrace_get is called with @counter=1000, it should not return until the ring buffer counter has reached 1500, and when it returns it should return the relevant buffer records from buffer counter 1001 to 1500 with @counter updated to 1500. If the buffer provided for copying into is not large enough to hold the records requested, you should return an error. If there are no records to copy, you should still return with the current value of the buffer counter in @counter.
You should have a synchronization mechanism such that when the buffer is not full, pstrace_get should wait if the counter is positive; when the buffer is full for a waiting pstrace_get, the process calling this system call should be woken up. You may NOT let the system call spin when the ring buffer is not full.
You should also ensure that you account for the fact that there may be some time that elapses between when the process is woken up and when the system call gets to complete and return the records to the calling process.
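The wake-up condition and the effect of a late copy can be worked through with a little arithmetic. The sketch below is one possible reading of the requirement, not a prescribed design: window_ready and window_available are hypothetical helpers, and it assumes that records overwritten between the wake-up and the copy are simply no longer available (an implementation could instead snapshot them at wake-up time).

```c
#include <stdbool.h>

#define PSTRACE_BUF_SIZE 500

/* Hypothetical description of the record numbers that can be copied. */
struct pstrace_window {
    long first;   /* first record number to copy */
    long last;    /* last record number to copy; last < first means none */
};

/* A caller that passed @counter > 0 may stop waiting once this holds. */
static bool window_ready(long buf_counter, long counter)
{
    return buf_counter >= counter + PSTRACE_BUF_SIZE;
}

/*
 * Part of the requested window (counter + 1 .. counter + PSTRACE_BUF_SIZE)
 * that is still stored when the caller finally copies. buf_counter may have
 * moved on since the wake-up, overwriting the oldest requested records.
 */
static struct pstrace_window window_available(long buf_counter, long counter)
{
    struct pstrace_window w;
    long oldest = 1;

    if (buf_counter > PSTRACE_BUF_SIZE)
        oldest = buf_counter - PSTRACE_BUF_SIZE + 1;

    w.first = counter + 1;
    w.last = counter + PSTRACE_BUF_SIZE;
    if (w.first < oldest)
        w.first = oldest;
    if (w.last > buf_counter)
        w.last = buf_counter;
    return w;
}
```

With @counter=1000, the window is complete at buffer counter 1500 (records 1001 to 1500). If the buffer counter has already reached 1600 by the time of the copy, only records 1101 to 1500 of that window remain in the ring under this model.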
You should also handle interruption by signals.
You should have another system call that clears the ring buffer.
/*
* Syscall No. 444
*
* Clear the pstrace buffer. Cleared records should
* never be returned to pstrace_get. Clear does not
* reset the value of the buffer counter.
*/
long pstrace_clear();
The system call should also wake up all processes waiting on the pstrace_get. The processes that are woken up should copy the relevant records in the buffer and return as opposed to waiting for their respective buffer full conditions to be met.
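One way to clear the buffer without resetting the buffer counter is to remember the counter value at the time of the clear and treat every record at or below it as gone. The model below only illustrates that bookkeeping for callers arriving after the clear; buf_counter, clear_mark, model_clear, and model_visible are hypothetical names, and the handling of waiters woken by the clear (which still copy their relevant records) is not shown.

```c
#define PSTRACE_BUF_SIZE 500

static long buf_counter;   /* persistent count of records ever added */
static long clear_mark;    /* number of the newest record discarded by a clear */

/* Model of pstrace_clear: drop everything recorded so far, keep the counter. */
static void model_clear(void)
{
    clear_mark = buf_counter;
}

/* Records a later, non-waiting pstrace_get could still return. */
static long model_visible(void)
{
    long n = buf_counter - clear_mark;

    return n > PSTRACE_BUF_SIZE ? PSTRACE_BUF_SIZE : n;
}
```

Since the counter keeps increasing across clears, record numbers stay unique and callers can still order results from successive calls.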
Test your pstrace. You should write a program that calls pstrace_get repeatedly to return the records in the buffer over time. Show how you can use the counter value so that successive calls to pstrace_get return a chronological ordering of all records, and explain any circumstances in which this may not be completely accurate. The program should be in the test branch of your team repo, and your makefile should generate an executable named test. For testing purposes, you should also write another program that changes its state between running and sleeping a certain number of times and then exits. Use the first program to trace the process of the second program. This should be done by modifying test so that it forks a child process that execs the second program. In other words, we should be able to see the results of your second program by simply doing:
make
./test
You should be able to observe how the second program turns from running to sleeping and finally, to zombie and exits. Your testing should generate at least one record for each of the seven distinct process states we have asked you to record, and you should include the resulting output in your submission in a file pstrace_output.txt.
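The core of the second program can be as small as a loop that alternates CPU work with a sleep. The function below is a hedged sketch: run_and_sleep and its parameters are hypothetical, the amount of busy work is arbitrary, and it assumes nanosleep puts the caller into an interruptible sleep, which you should confirm in your trace output.

```c
#include <stddef.h>
#include <time.h>

/*
 * Alternate between running and sleeping `rounds` times.
 * Returns 0 on success, -1 if a sleep fails or is interrupted.
 */
static int run_and_sleep(int rounds, long sleep_ms)
{
    struct timespec ts;
    volatile unsigned long sink = 0;   /* volatile keeps the busy loop alive */

    ts.tv_sec = sleep_ms / 1000;
    ts.tv_nsec = (sleep_ms % 1000) * 1000000L;

    for (int i = 0; i < rounds; i++) {
        /* Running phase: burn some CPU time. */
        for (unsigned long j = 0; j < 1000000UL; j++)
            sink += j;

        /* Sleeping phase: block in the kernel until the timer expires. */
        if (nanosleep(&ts, NULL) != 0)
            return -1;
    }
    return 0;
}
```

A main function that calls, for example, run_and_sleep(5, 100) and then returns would give the tracer several running and sleeping transitions followed by the exit records.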