ECE391 MP2

ECE391: Computer Systems Engineering Spring 2022 Machine Problem 2 Checkpoint 1 due in Gitlab repo by 5:59 PM CST on Monday 28 February
Checkpoint 2 due in Gitlab repo by 5:59 PM CDT on Monday 07 March
Device, Data, and Timing Abstractions
Read the whole document before you begin, or you may miss points on some requirements.
In this machine problem, you will extend a video game consisting of about 4,000 lines of code with additional graphical features and a serial port device. The code for the game is reasonably well-documented, and you will need to read and understand the code in order to succeed, thus building your ability to explore and comprehend existing software systems. Most code that you will encounter is neither as small nor as well documented—take a look at some of the Linux sources for comparison—but this assignment should help you start to build the skills necessary to extend more realistic systems. As your effort must span the kernel/user boundary, this assignment will also expose you to some of the mechanisms used to manage these interactions, many of which we will study in more detail later in the course. Before discussing the tasks for the assignment, let’s discuss the skills and knowledge that we want you to gain:
• Learn to write code that interacts directly with devices.
• Learn to abstract devices with system software.
• Learn to manipulate bits and to transform data from one format into another. • Learn the basic structure of an event loop.
• Learn how to use the pthread API.
Device Protocols: We want you to have some experience writing software that interacts directly with devices and must adhere to the protocols specified by those devices. Similar problems arise when one must meet software interface specifications, but you need experience with both in order to recognize the similarities and differences. Unfortunately, most of the devices accessible from within QEMU have fully developed drivers within Linux. The video card, however, is usually managed directly from user-level so as to improve performance, thus most of the code is in other software packages (e.g., XFree86).
We are fortunate to have a second device designed by and , two previous staff members. The Tux Controller is that funny little game controller attached to each of the machines in the lab.
The Tux Controller connects to the USB port of the lab machine. An FTDI “Virtual Com Port” (VCP) driver makes the USB port appear to software as a standard (old fashioned) RS232 serial port. We can then set up QEMU so that one of the emulated serial ports on the virtual machine maps to the emulated serial port connected to the Tux Controller. In this assignment, you will write code that interacts directly with both the (emulated) video card and the game controller board.
Device Abstraction: Most devices implement only part of the functionality that a typical user might associate with them. For example, disk drives provide only a simple interface through which bits can be stored and retrieved in fixed-size blocks of several kB. All other functionality, including everything from logical partitions and directories to variable-length files and file-sharing semantics, is supported by software, most of which resides in the operating system. In this machine problem, you will abstract some of the functionality provided by the Tux controller board.
Format Interchange: This machine problem gives you several opportunities for working with data layout in memory and for transforming data from one form to another. Most of these tasks relate to graphics, and involve taking bit vectors or pixel data laid out in a form convenient to C programmers and changing it into a form easily used by the Video Graphics Array (VGA) operating in mode X. Although the details of the VGA mode X layout are not particularly relevant today, they do represent the end product of good engineers working to push the limits on video technology. If
—-PAGE—-
you work with cutting-edge systems, you are likely to encounter situations in which data formats have been contorted for performance or to meet standards, and you are likely to have to develop systems to transform from one format to another.
Event Loops: The idea of an event loop is central to a wide range of software systems, ranging from video games and discrete event simulators to graphical user interfaces and web servers. An event loop is not much different than a state machine implemented in software, and structuring a software system around an event loop can help you to structure your thoughts and the design of the system.
Threading: Since the advent of faster processors, it became possible to exploit code parallelism by creating multiple threads. Multiple threads of execution also allow logical separate tasks to be executed using synchronous operations. If one thread blocks waiting for an operation to complete, other threads are still free to work. Note that threads are different from processes. A thread is the smallest unit of processing that can be scheduled by an operating system. Multiple threads can exist within the same process and share resources such as memory. Different processes cannot share these resources. On a single processor, multithreading generally occurs by having the processor switch between different threads, which is known as context switching. Since context switching happens frequently, the user perceives the threads as running concurrently.
In this machine problem, we illustrate the basic concepts by using a separate thread to get updates from the keyboard. You will need to synchronize your code in the main thread with this helper thread using a Posix mutex. You will also need to add a new thread to take input from the tux controller. This thread may also need synchronization to guarantee correctness.
You may want to read the class notes on Posix threads to help you understand these interactions. Later classes will assume knowledge of this material.
Software examples and test strategies: In addition to these five learning objectives for the assignment, this machine problem provides you with examples of software structure as well as testing strategies for software components.
Two of the files include main functions that are only included when a defined value is changed at the top of the file. By setting this value to one and compiling the file by itself, you create an executable that tests the functionality provided by the file in the absence of other code. When you design software interfaces, you should do so in a way that allows you to test components separately in this manner and thus to deal with bugs as soon as possible and in as small a body of code as possible. Individual function tests and walking through each function in a debugger are also worthwhile, but hard to justify in an academic setting. The input control file allows you to develop and test your Tux controller code without using the game interface. Since the game changes the display to mode X, testing without the hassle of changing back to text mode can be quite helpful.
The modex.c file is compiled by the Makefile to create a separate executable that returns your system to text mode. We also made use of a technique known as a memory fence to check some of the more error-prone activities in the file; read the code to understand what a memory fence is and what it does for you.
The Pieces Provided
You are given a working but not fully-functional copy of the source tree for a maze game along with a skeletal kernel module for the Tux controller. The Tux controller boards are attached to each machine in the lab.
The table below explains the contents of the source files.
—-PAGE—-
assert.c Support for assertions and cleanups as design and debugging aids. blocks.s Graphic block images of the maze, the player, fruits, etc.
input.c Input control. Provides for initialization and shutdown of the input controller. The version provided to you supports keyboard control. You must write the support for the Tux controller. This file is not compiled with the rest of the game and should be used only to test your inputs separately from the game. Can be compiled stand-alone to test the input device.
maze.c Maze generation and logic. Creates the maze; creates, checks for, and consumes fruit; checks for win conditions. Transforms the block image data in blocks.s into pixelized graphic images of maze lines and columns for use by modex.c in drawing parts of the logical view. Can be compiled stand-alone to test maze generation.
mazegame.c Multithreaded main process of the mazegame. There is a thread that controls the game logic, includ- ing the main event loop, level set up, all timing logic, display control logic (motion and scrolling). Includes a computer-controlled player as well as one using the regular input control. Another thread is created to handle keyboard control. You will eventually need to add another thread for the Tux Controller control.
modex.c Functions to support use of VGA mode X. Includes things like switching from text mode to mode X and back, and clearing the screens. Provides a logical view window abstraction that is drawn in normal memory and copied into video memory using a double-buffering strategy. When the logical view is shifted, data still on the screen are retained, thus only those portions of the logical view that were not previously visible must be drawn. Finally, supports mapping from pixelized graphics in formats convenient to C into drawing a screen at a certain logical position. Relies on maze.c to provide vertical and horizontal lines from the maze. Is also compiled stand-alone to create the tr text mode restoration program.
text.c Text font data and conversion from ASCII strings to pixelized graphic images of text (you must write the latter).
We have also included two stripped binaries to illustrate your end goal. The mazegame-demo program is a fully working version of the game that allows both keyboard and Tux controller input. The input-demo program is a stand-alone compilation of input.c, again allowing both forms of input.
The tr Program
The make file provided to you builds both the game and a text-mode-restoration program called tr. The latter program is provided to help you with debugging. One difficulty involved with debugging code that makes use of the video hardware is that the display may be left in an unusable (i.e., non-text-mode) state when the program crashes, hangs, or hits a breakpoint. In order to force the display back into text mode for debugging purposes (or, if you are not running the program in a debugger, to regain control of your shell), you can run the tr program. Unless you are fairly confident in your ability to type without visual feedback, we recommend that you keep a second virtual console (CTRL-ALT-F1 through F6) logged in with the command to execute the text restoration program pre-typed, allowing you to simply switch into that console and press Enter. Using this program is substantially easier than rebooting your machine to put it back into text mode.
You should also look at the cleanup handlers provided by the assert module (assert.h and assert.c). These cleanup handlers provide support for fatal exceptions, putting the machine back into a usable state when your program crashes. However, there may be instances and problems not covered by the handlers, and GDB can stop the program before the handlers are invoked, leaving the machine in an unusable state until you restore text mode with tr.
—-PAGE—-
Mode X and Graphic Images
Mode X is a 256-color graphics mode with 320×200 resolution. It was not supported by the standard graphics routines that came with the original VGAs, but was supported by the VGA hardware itself, and was quickly adopted as the stan- dard for video games at the time because of certain technical advantages over the documented 256-color mode (mode 13h, where the ‘h’ stands for hexadecimal). In particular, mode X supports multiple video pages, allowing a program to switch the display between two screens, drawing only to the screen not currently displayed, and thereby avoiding the annoying flicker effects associated with showing a partially-drawn image. This technique is called double-buffering.
Each pixel in mode X is represented by a one-byte color value. Although only 256 colors are available in mode X, the actual color space is 18-bit, with 6-bit depth for red, green, and blue saturation. A level of indirection is used to map one-byte pixel color values into this space. The table used in this mapping is called a palette, as it is analogous to a painter’s palette, which in theory holds an infinite range of colors, but can only hold a few at any one time. Palettes are often used to reduce the amount of memory necessary to describe an image, and are thus useful on embedded devices even today. For your final projects, some of you may want to play with graphic techniques such as palette color selection to best represent a more detailed image and dithering to map a more detailed image into a given palette.
The mapping between screen pixels and video memory in mode X is a little contorted, but is not as bad as some of the older modes. The screen is first separated into blocks four pixels wide and one pixel high. Each block corresponds to a single memory address from the processor’s point of view. You may at this point wonder how four bytes of data get crammed into a single memory address (one byte in a byte-addressable memory). The answer, of course, is that they don’t. For performance reasons, the VGA memory was 32-bit, and the interface between the processor and the VGA is used to determine how the processor’s 8-bit reads and writes are mapped into the VGA’s 32-bit words. For example, when the processor writes to a memory address, four bits of a specific VGA register are used to enable (1 bits) or disable (0 bits) writes to 8-bit groups of the corresponding word in VGA memory. When drawing rectangles of one color, this mask reduces the work for the processor by a factor of four. When drawing images, however, it poses a problem in that adjacent pixels must be written using different mask register settings. As changing a VGA register value is relatively slow, the right way to use mode X is to split the image data into four groups of interleaved pixels, set the mask for each group, and write each group as a whole using the x86 string instructions (you won’t need to write any of these, although you may want to look at how it is done in the existing code or in the x86 manual).
address 0xA1284 (0xA1234 + 80)
address 0xA12D4 (0xA1234 + 160)
0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 …
address 0xA1234
address 0xA1235
address 0xA1236
address 0xA1237
(all at 0xA1236) plane 3
Mode X video memory is memory-mapped and runs from (physical) memory address 0xA0000 to 0xAFFFF, for a total of 216 addresses and 256 kB of addressable memory (counting all four planes). The addresses used in your program are virtual rather than physical addresses, a concept to be discussed later in the course; don’t be surprised if they are not the same as the physical addresses, though. A full screen occupies 320 × 200/4 = 16, 000 addresses, so four fit into the full memory, with a little extra room. The VGA can be directed to start the display from any address in its memory; addresses wrap around if necessary. In the figure shown above, for example, the screen starts at address 0xA1234.
Due to the timing behavior of emulated interactions with video memory, the code provided to you does not use a traditional double-buffering model in which the off-screen image is drawn directly within video memory. As with double-buffering, we do maintain two regions within the video memory for screen images. However, we use regular memory to maintain an image of the screen and to update that image in place. When a new image is ready for display, we copy it into one of two regions of the video memory (the one not being displayed) and then change the VGA registers to display the new image. Copying an entire screen into video memory using one x86 string instruction seems to take about as long as writing a small number of bytes to video memory under QEMU, thus our image display is actually faster than trying to draw a vertical line in video memory, which requires one MOV instruction per vertical pixel.
—-PAGE—-
Only the modex.c file relies on mode X. The rest of the code should use a graphics format more convenient to C. In particular, an image that is X pixels wide and Y pixels high should be placed in an array of dimensions [Y ][X]. The type of the array depends on the color depth of the image, and in our case is an unsigned char to store the 8-bit color index for a pixel. The block images in blocks.s already use this format, as do the line generation routines in maze.c. When you write code to generate graphic images from text, as described in a later section, you should use the same format.
Graphical Mapping in the Game
The game defines a two-dimensional world of pixels, and the screen at any time shows a region of that world. The
maze world is never fully constructed in memory; rather, it is constructed dynamically as necessary to fill the screen.
This facet is useful for games in which drawing the entire virtual world at once requires too much memory.
(show_x,show_y)
size of scrolling portion of video screen
build buffer
plane 3 of logical view
plane 2 of logical view
plane 1 of logical view
plane 0 of logical view
mapped into build buffer using maze coordinates for mode X planes
planes shift in the build buffer as the logical view moves within the maze world
maze world pixels
logical view window
We use a build buffer to keep the pixel data relevant to the screen organized in a form convenient for moving into video memory in mode X. However, in order to avoid having to move the data around a lot in the build buffer (or redraw the whole screen each time we move the logical view window by one pixel), we break the maze world into 4×1 pixel chunks using the mode X mapping illustrated in the previous section. The address of the logical view window in the maze world is used to decide where to place the image planes within the build buffer, and moves around within the build buffer as the logical window moves in the maze world, as shown in the figure below.
this address is in the window for
this address is in plane 0 the window for
planes 1, 2, and 3
The mapping that we have described has a subtle detail: the address range used for the different planes within the logical view window may not be the same. Consider the case shown above, in which show x & 3 == 1. As we move the logical view window around in the maze world, we need to keep each address block at a fixed place in the build buffer (again to avoid having to copy data around). If we were to keep the planes in the order 0 to 3 and not put any extra space between them, the image of plane 0 would in this case overlap with the image of plane 1 in the build buffer. By reversing the order, we can avoid this problem (alternatively, we could have used a one-byte buffer zone between consecutive planes).
The next problem is mapping the build buffer into the video memory. We use two buffers in video memory and map into the non-displayed buffer, then change the VGA register to show the new screen. You can read about this double- buffering technique in the code and see how it works. The complexity with regard to the plane mapping is that we must map the build buffer planes, which are defined in terms of the maze world coordinates, into the video planes, which are defined in terms of the screen coordinates. The picture below illustrates this problem. In general, a cyclic shift of the planes suffices for the mapping.
logical view window

—-PAGE—-
logical view windo