CS4118 OS multi-server

Multi-Server
Submission
We will be using GitHub for distributing and collecting your assignments. At this point you should have already set up your repository. If you have not yet done so, please follow the instructions sent to you over the class listserv. Note that if you submitted the group form late, you will not have a repo yet – please reach out to the TA listserv in that case. You will have to wait until a little after the assignment has been released, but please feel free to work on it independently.
To obtain the skeleton files that we have provided for you, you need to clone your private repository to your VM. Your repository page should have a button titled “< > Code”. Click on it, and under the “Clone” section, select “SSH” and copy the link from there. For example:
For each individual part, please create a new directory. For example, when you work on part 3, you should create a new directory named part3. In each directory, you should include a Makefile with a default target that builds an executable named multi-server and compiles with the -Wall and -Werror compiler flags.
The TAs will use scripts to download, extract, build, and display your code. It is essential that you DO NOT change the names of the skeleton files provided to you. If you deviate from the requirements, the grading scripts will fail and you will not receive credit for your work.
You need to have at least 5 git commits total, but we encourage you to have many more. All your work should be committed to the master branch. If you have not used git before, this tutorial can get you started.
With group assignments, we recommend that you push work to branches first, and then merge back into master once your group members have reviewed the code. You can read more about them here.
To hand in your assignment, you will create and push a tag:
You should verify that you are able to see your final commit and your hw3handin tag on your GitHub repository page for this assignment.
If you made a mistake and wish to resubmit your assignment, you may do the following to delete your submission tag:
You may then repeat the submission process. You are encouraged to resubmit as often as necessary to perfect your submission. As always, your submission should not contain any binary files.
At a minimum, README.txt should contain the following info:
Name & UNI for all group members
Homework assignment number
Description for each part
The description should indicate whether your solution for the part is working or not. You may also want to include anything else you would like to communicate to the grader, such as extra functionality you implemented or how you tried to fix your non-working code.
Answers to written questions, if applicable, must be added to the skeleton file we have provided.
Part 0: Basic single-process / single-threaded web server (Not graded)
Required tasks
Read the skeleton code provided ( multi-server.c ). Make sure you understand the code completely. Test multi-server.c using netcat ( nc ).
We recommend the OpenBSD variant of netcat. To install this, run:
Recommended tasks
Measure the performance of the basic web server
Use an HTTP traffic generator, such as Siege, to measure how many requests the web server can handle in a second. Use a sizable file (a hi-res image or a short movie) for testing so that it takes a measurable time for a request to complete.
Note that we are not suggesting that you conduct a serious performance measurement study. Measuring performance correctly and accurately is not an easy thing to do – many researchers build their careers around it. The actual numbers from your measurements don’t mean much. Your goal here is twofold:
a. To test your server to make sure you implemented it correctly.
b. To gain a deeper understanding of server architectures by comparing performance characteristics of different strategies.
Performance measurements are optional and will not be graded, but they are recommended. We will be using benchmarking tools to test your implementations.
Deliverables
Part 1: Multi-process web server
The basic version of multi-server has a limitation: it can handle only one connection at a time. This is a serious limitation because a malicious client could take advantage of this weakness and prevent the server from processing additional requests by sending an incomplete HTTP request. In this part we improve the situation by creating additional processes to handle requests.
The easiest way (from a programmer’s point of view) to handle multiple connections simultaneously is to create additional child processes with the fork() system call. Each time a new connection is accepted, instead of processing the request within the same process, we create a new child process by calling fork() and let it handle the request.
The child process inherits the open client socket and processes the request, generating a response. After the response has been sent, the child process terminates.
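The fork-per-connection flow described above can be sketched as follows. This is a minimal illustration, not the skeleton's code: serve_in_child() and send_ok() are hypothetical names, and the handler parameter stands in for whatever request-handling routine your server uses. The descriptor handling shown is one possible arrangement; the hints below ask you to reason about it yourself.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the real request handler (hypothetical): sends two bytes. */
void send_ok(int clntSock)
{
    if (write(clntSock, "ok", 2) < 0)
        perror("write");
}

/* Hypothetical helper: fork a child to serve one accepted connection.
 * Returns the child's pid in the parent, or -1 if fork() failed. */
pid_t serve_in_child(int servSock, int clntSock, void (*handler)(int))
{
    pid_t pid = fork();
    if (pid < 0) {
        close(clntSock);
        return -1;
    }
    if (pid == 0) {
        /* Child: it never calls accept(), so it drops the listening socket. */
        if (servSock >= 0)
            close(servSock);
        handler(clntSock);
        close(clntSock);
        _exit(0);
    }
    /* Parent: the child holds its own copy of the client socket. */
    close(clntSock);
    return pid;
}
```

In an accept loop, the parent would call serve_in_child(servSock, clntSock, handler) right after accept() returns and then go straight back to accepting.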
Required tasks
Modify the skeleton code (part 0) so that the web server forks after it accepts a new client connection, and the child process handles the request and terminates afterwards.
Test this implementation by connecting to it from multiple netcat clients simultaneously.
Recommended tasks
Do performance testing. Do you see any difference from part 0?
Requirements (and hints)
1. Note that the two socket descriptors (the server socket and the new connected client socket) are duplicated when the server forks. Make sure to close anything you don’t need as early as possible. Think about these:
Does the parent process need the client socket? Should it close it? If so, when? If the parent closes it, should the child close it again?
Does the child process need the server socket? Should it close it? What would happen if it doesn’t close it?
2. Don’t let your children become zombies… At least not for too long. Make sure the parent process calls waitpid() immediately after one or more child processes have terminated.
How do you do this? Can you call waitpid() inside the main for (;;) loop? Obviously we cannot let waitpid() block until a child process terminates – we’d be back to where we started then. You will need to call waitpid() in a non-blocking way. (Hint: look into WNOHANG flag.) But even if you make it non-blocking, can you make your parent process call it immediately after a child process terminates? What if the parent process is being blocked on accept() ?
3. Modify the logging so that it includes the process id of the child process that handled the request. Assuming the logging happens in the child process, you can replace part0’s fprintf() call with the following:
4. You don’t have to worry about leaking memory when you terminate with Ctrl-C . However, while your server is running, there should not be any memory leaks – your memory usage should not increase as you run. This requirement applies to all parts of this assignment.
Deliverables
Makefile, multi-server.c, and other source files, under part1/
Part 2: Interprocess communication through shared memory
Reading assignment
APUE 14.8: read pages 525-527, skim or skip the rest
APUE 15.9: skim or skip pages 571-575, read pages 576-578
APUE 15.10
Required tasks
Adapt your code from part 1 so that the web server keeps request statistics. The web server should respond to a special admin URL /statistics with a statistics page that looks something like this:
Feel free to beautify the output.
Requirements, hints, and recommended order of tasks
1. Since multiple child processes will need to update the stats, you need to keep them in a shared memory segment. Use anonymous memory mapping described in APUE 15.9.
2. You should count the /statistics request itself in the 2xx count when you serve the page.
3. Perform the hit test from part 1 and see if your code keeps accurate stats. The request counts may or may not be correct due to race conditions.
4. Now use a POSIX semaphore as described in APUE 15.10 to synchronize access to the stats. A few things to think about:
POSIX semaphores can be named or unnamed. Which is a better choice here?
Where should you put the sem_t structure?
Are we using it as a counting semaphore or a binary semaphore?
Are any of the semaphore functions you are calling a “slow” system call? If so, what do you need to handle?
5. Repeat the performance test and verify that the stats are accurate.
Deliverables
Makefile, multi-server.c, and other source files, under part2/
Part 3: Directory listing
The skeleton multi-server.c does not handle directory listing. When a requested URL is a directory, it simply responds with 403 Forbidden .
Adapt your code from part 2 so that it will provide directory listings.
Requirements and hints
Run /bin/ls -al on the requested directory and send out the result. You can format it in HTML if you wish, but the raw output is fine too.
In order to take the output of the ls command, you need to call pipe , fork , and exec . (Note: there may be multiple ways of achieving this functionality, but for this assignment you are required to use the aforementioned functions.) Arrange the file descriptors so that the ls output comes through the pipe.
Make sure you do not lose the multi-processing capability; that is, you still need to be able to serve multiple requests (whether they are files or directory listings) simultaneously.
Be diligent in closing the file descriptors that you don’t need as early as possible.
If ls encounters an error, it will print things to stderr . Make sure that the result you send to the browser includes them.
Deliverables
Makefile, multi-server.c, and other source files, under part3/
Part 4: Directory listing without running /bin/ls (0 points)
This part is optional and will not be graded.
Adapt your code from part 3 so it serves a directory listing without running /bin/ls.
Requirements and hints
This part is easy. Instead of fork ing and exec ing /bin/ls , just use opendir() and readdir() functions. See APUE 1.4 for an example.
You don’t have to mimic the output of ls -al. Just the list of filenames is fine, i.e., mimic the output of ls -a.
Deliverables (optional)
Makefile, multi-server.c, and other source files, under part4/
Part 5: Multi-threaded web server
POSIX threads provide a light-weight alternative to child processes. Instead of creating child processes to handle multiple HTTP requests simultaneously, we will create a new POSIX thread for each HTTP request.
Required tasks
1. Modify the original skeleton code (part 0) so that the web server creates a new POSIX thread after it accepts a new client connection, and the new thread handles the request and terminates afterwards.
2. Two library functions used by the skeleton multi-server.c are not thread-safe. You must replace them with their thread-safe counterparts in your code.
In your README.txt , identify the two functions and describe how you fixed them.
Note that exit() is also not thread-safe, but do not consider it one of the two functions that you list.
3. Test this implementation by connecting to it from multiple netcat clients simultaneously.
Recommended tasks
Perform benchmark measurements as you did in part 0 and part 1. Do you see any improvement over the skeleton version (part 0)? Any improvement over the multi-process version from part 1?
Requirements and hints
Call pthread_create() to create a new thread, passing the client socket descriptor as an argument to the thread start function.
Make sure that the newly created threads do not remain as thread zombies when they are done. You can prevent a thread from remaining as a thread zombie by either joining with it from another thread (i.e., call pthread_join() ) or making it a detached thread (i.e., call pthread_detach(pthread_self()) ). Which method makes more sense in this situation?
Note that malloc() is not async-signal-safe, but is thread-safe.
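A minimal sketch of the thread-per-connection pattern from these hints follows. It assumes the detached-thread option and passes the descriptor by value through the void * argument. The names connection_thread() and spawn_connection_thread() are hypothetical, and the one-line write stands in for real request handling.

```c
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>

/* Thread start function: the client socket arrives packed into the pointer. */
static void *connection_thread(void *arg)
{
    int clntSock = (int)(intptr_t)arg;

    /* Detach so no other thread has to join with this one. */
    pthread_detach(pthread_self());

    if (write(clntSock, "hello\n", 6) < 0) {
        /* stand-in for request handling; errors ignored in this sketch */
    }
    close(clntSock);
    return NULL;
}

/* Create one thread for an accepted connection.
 * Returns 0 on success or the pthread_create() error number. */
int spawn_connection_thread(int clntSock)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, connection_thread,
                             (void *)(intptr_t)clntSock);
    if (err != 0)
        close(clntSock);
    return err;
}
```

Passing the descriptor by value avoids handing the new thread a pointer to a variable that the accept loop is about to overwrite.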
Deliverables
Makefile, multi-server.c, and other source files, under part5/
How you fixed the two non-thread-safe function calls, in README.txt
Part 6: Pre-created pool of threads
Read the following Q&A at StackOverflow.com:
Calling accept() from multiple threads
In parts 6 & 7, we will implement the two methods described in the article.
Required tasks
1. Adapt your code from part 5. Instead of creating a new thread for each new client connection, pre-create a fixed number of worker threads in the beginning. Each of the pre-created worker threads will act like the original skeleton web server, i.e., each thread will be in a for(;;) loop, repeatedly calling accept().
2. Test this implementation by connecting to it from multiple netcat clients simultaneously.
Recommended tasks
Perform benchmark testing. How does it compare with part 5?
Requirements and hints
You can use a global array of pthread_t like this:
After creating N_THREADS worker threads, make sure your main thread does not exit. One way to do this is to call pthread_join() .
Assume that the server runs forever. You can stop the server by pressing Ctrl-C, which will kill all the threads. You are not expected to clean up allocated resources upon termination.
Deliverables
Makefile, multi-server.c, and other source files, under part6/
Part 7: Blocking queue
Required tasks
1. Adapt your code from part 6 so that only the main thread calls accept() . The main thread puts the client socket descriptor into a blocking queue, and wakes up the worker threads which have been blocked waiting for client requests to handle.
After the main thread puts a client socket descriptor into the blocking queue, should it call pthread_cond_signal() or pthread_cond_broadcast() ? Or will the server behave correctly both ways (assuming everything else is correct)?
2. Test this implementation by connecting to it from multiple netcat clients simultaneously.
Recommended tasks
Perform benchmark testing. How does it compare with part 6?
Requirements and hints
You must use the following structures for your blocking queue:
Implement the following queue API:
The members of struct queue should be accessed ONLY using the four API functions.
The queue API should be independent of memory management for the queue struct itself. That is, the queue API should assume memory management for q is done by the user. Do NOT malloc() or free() q in queue_init / queue_destroy . The user should be able to allocate their queue struct however they’d like (e.g. statically, heap, stack).
Even though you are not expected to clean up allocated resources upon termination of the process, you must implement queue_destroy() . The best way to test this function is to create a separate test program that initializes your queue, tests put and
get, and then destroys the queue.
You can assume that no other threads will access the queue before it is fully initialized or after/while it is destroyed.
The main thread is in a for(;;) loop, accept()ing and putting the client socket into the queue.
The worker threads are in a for(;;) loop, taking out a socket descriptor from the queue and handling the connection.
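To illustrate the general shape of a blocking queue built on a mutex and a condition variable, here is a hedged sketch. The required struct definitions and API signatures are not reproduced in this text, so everything below is hypothetical: the linked-list layout, the member names, and the names queue_put() and queue_get(). Only queue_init and queue_destroy are named in the hints. Use the structures and signatures given in the assignment, not these.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical layout: a singly linked list of socket descriptors. */
struct node {
    int sock;
    struct node *next;
};

struct queue {
    struct node *head, *tail;
    pthread_mutex_t mutex;
    pthread_cond_t nonempty;
};

/* The caller owns the memory for q; only its contents are set up here. */
void queue_init(struct queue *q)
{
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->nonempty, NULL);
}

/* Frees any leftover nodes but never q itself. */
void queue_destroy(struct queue *q)
{
    struct node *n = q->head;
    while (n != NULL) {
        struct node *next = n->next;
        free(n);
        n = next;
    }
    q->head = q->tail = NULL;
    pthread_mutex_destroy(&q->mutex);
    pthread_cond_destroy(&q->nonempty);
}

/* Append a descriptor and wake a waiter. Returns 0, or -1 if malloc fails. */
int queue_put(struct queue *q, int sock)
{
    struct node *n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->sock = sock;
    n->next = NULL;

    pthread_mutex_lock(&q->mutex);
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->mutex);
    return 0;
}

/* Block until a descriptor is available, then remove and return it. */
int queue_get(struct queue *q)
{
    pthread_mutex_lock(&q->mutex);
    while (q->head == NULL)         /* loop guards against spurious wakeups */
        pthread_cond_wait(&q->nonempty, &q->mutex);
    struct node *n = q->head;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->mutex);

    int sock = n->sock;
    free(n);
    return sock;
}
```

A separate test program can declare a struct queue on the stack, call queue_init(), put and get a few values, and finish with queue_destroy().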
Deliverables
Makefile, multi-server.c, and other source files, under part7/
Part 8: Listening on multiple ports
Required tasks
1. Adapt your code from part 7 so that the web server takes not just one, but multiple port numbers as command line arguments (followed by the web root as the last argument). The web server will bind and listen on all of the ports.
2. Test this implementation by connecting to it from multiple netcat clients simultaneously to different ports.
Recommended tasks
Perform benchmark testing, hitting multiple ports. Compared to part 7, the performance penalty should be negligible, if there is any at all.
Requirements and hints
Here is a piece of code you can use in main() :
The code will create server sockets for all of the ports specified in the command line, up to 31 of them. The servSocks array is initially filled with -1 so that we can tell where the list of socket descriptors ends.
BTW, do you understand how memset(servSocks, -1, sizeof(servSocks)) fills an array of ints with -1 when it is supposed to fill the memory byte-by-byte?
In your main thread, before you call accept() , you need to find out which server sockets currently have a client pending so that you can call accept() knowing that it won’t block.
You can accomplish that task using the select() system call. You pass a read set containing all your server socket descriptors.
When select() returns, you can go through the server socket descriptors, calling accept() on only those descriptors that are ready for reading.
Note that select() is special in that, even if the SA_RESTART option is specified, the select() function is not restarted under most UNIX systems. Make sure you handle this behavior properly.
Deliverables
Makefile, multi-server.c, and other source files, under part8/
Part 9: Nonblocking accept() (0 points)
This part is optional and will not be graded.
Part 8 has a flaw. Between select() and accept() , there is a chance that the client connection gets reset. If that happens, in some
systems, accept() may block. In order to handle that case, we need to make the server socket nonblocking.
Note that this behavior depends on the version of the system you are running the server on, and may be difficult to reproduce. You are
not expected to test this behavior.
1. Adapt your code from part 8 so that createServerSocket() sets the server socket into a nonblocking mode. You can use fcntl() to turn on nonblocking right after you create a server socket with a socket() call.
2. Now accept() will never block. In those cases where it might have blocked, it will now fail with certain errno values. Read the man page to find out which errno values you need to handle.
Also don’t forget to handle interruption by signals.
Deliverables (optional)
Makefile, multi-server.c, and other source files, under part9/
Part 10: Printing request statistics on SIGUSR1
Recall part 2, where we implemented a special admin URL /statistics to fetch a web server request statistics page. In this part, we will implement an alternate mechanism to print statistics.
For this part, we have to go back and start from our part 3 code, which is the last version of multi-server with multiple processes (before we switched to multi-threading in part 5.)
1. Adapt your code from part 3 so that when the web server receives a SIGUSR1 signal, it will print the statistics at that time to standard error.
2. Test it by sending the signal with the kill command while the web server is blocked on an accept() call. Make sure the web server prints out the stats immediately, not when it receives the next HTTP request.
3. Test it by sending the signal with the kill command to a child process while the child process is in the middle of receiving an HTTP request. Describe what happens and explain why.
You are not expected to do anything special about the child processes in this part, which means that the children will inherit the parent’s signal handler. This is not the right design. (Think about why.) You don’t have to fix the behavior. Just explain what happens and why.
Requirements and hints
Use sigaction() to install a handler for SIGUSR1 . You need to decide if you should set SA_RESTART flag or not.
Note that what you can do inside a signal handler is very limited. For example, you can’t call fprintf because it is not an async-signal-safe function.
Don’t forget to lock the semaphore when you access the stats. Again, you can’t lock stuff in signal handlers because you can then
The web server should respond immediately to SIGUSR1 when it’s blocked on accept() . If a SIGUSR1 signal comes during the
short period of time between two accept() calls, it will miss it. You don’t have to handle this case.
Deliverables
Makefile, multi-server.c, and other source files, under part10/
Explanation for task #3, in README.txt
Part 11: Server-side bash scripts (0 points)
This part is optional and will not be graded. You may skip to part 12.
This part is a challenge for those of you hackers, who are complaining that this assignment has been too easy so far.
In this part, we will enable server-side bash scripts. When a requested URL is an executable script, the