Assignment 4 – File Systems
Due: Apr 2, at 10 p.m. (Do not leave this assignment until the last few days, or you will NOT be able to complete it!)
Introduction
In this assignment, you will explore the implementation of a particular file system, ext2, and will write tools to modify ext2-format virtual disks. To do this work, you will need to be comfortable working with binary data and will need to learn about the ext2 filesystem.
You can work in pairs for this assignment. MarkUs will only create the appropriate A4 directory in your repository when you log into MarkUs and either invite a partner, or declare that you will work alone. As usual, please log into MarkUs well before the deadline to make sure you can access your repository. (Do not create an A4 the directory in svn, otherwise MarkUs won’t know about it and we won’t be able to see your work.)
This assignment contains some bonus features. Implementing a bonus will compensate for any possible marks lost in another section of the assignment, but can also give you more than 100% if implemented correctly. For implementing any additional functionality which is not specified in the handout, or if you are unsure whether to handle some inode parameters in specific cases (after having read the documentation!), please ask on the discussion board.
Requirements
Your task is to write a set of programs (in C) that operate on an ext2 formatted virtual disk. The executables must be named exactly as listed below, and must take the specified arguments.
ext2_cp: This program takes three command line arguments. The first is the name of an ext2 formatted virtual disk. The second is the path to a file on your native operating system, and the third is an absolute path on your ext2 formatted disk. The program should work like cp, copying the file on your native file system onto the specified location on the disk. If the specified file or target location does not exist, then your program should return the appropriate error (ENOENT). Please read the specifications of ext2 carefully, some things you will not need to worry about (like permissions, gid, uid, etc.), while setting other information in the inodes may be important (e.g., i_dtime).
ext2_mkdir: This program takes two command line arguments. The first is the name of an ext2 formatted virtual disk. The second is an absolute path on your ext2 formatted disk. The program should work like mkdir, creating the final directory on the specified path on the disk. If any component on the path to the location where the final directory is to be created does not exist or if the specified directory already exists, then your program should return the appropriate error (ENOENT or EEXIST). Again, please read the specifications to make sure you’re implementing everything correctly (e.g., directory entries should be aligned to 4B, entry names are not null- terminated, etc.).
ext2_ln: This program takes three command line arguments. The first is the name of an ext2 formatted virtual disk. The other two are absolute paths on your ext2 formatted disk. The program should work like ln, creating a link from the first specified file to the second specified path. This program should handle any exceptional circumstances, for example: if the source file does not exist (ENOENT), if the link name already exists (EEXIST), if a hardlink refers to a directory (EISDIR), etc. then your program should return the appropriate error code. Additionally, this command may take a “-s” flag, after the disk image argument. When this flag is used, your program must create a symlink instead (other arguments remain the same). If in doubt about correct operation of links, use the ext2 specs and ask on the discussion board.
ext2_rm: This program takes two command line arguments. The first is the name of an ext2 formatted virtual disk, and the second is an absolute path to a file or link (not a directory) on that disk. The program should work like rm, removing the specified file from the disk. If the file does not exist or if it is a directory, then your program should return the appropriate error. Once again, please read the specifications of ext2 carefully, to figure out what needs to actually happen when a file or link is removed (e.g., no need to zero out data blocks, must set i_dtime in the inode, removing a directory entry need not shift the directory entries after the one being deleted, etc.).
BONUS: Implement an additional “-r” flag (after the disk image argument), which allows removing directories as well. In this case, you will have to recursively remove all the contents of the directory specified in the last argument. If “-r” is used with a regular file or link, then it should be ignored (the ext2_rm operation should be carried out as if the flag had not been entered). If you decide to do the bonus, make sure first that your ext2_rm works, then create a new copy of it and rename it to ext2_rm_bonus.c, and implement the additional functionality in this separate source file.
ext2_restore: This program takes two command line arguments. The first is the name of an ext2 formatted virtual disk, and the second is an absolute path to a file or link (not a directory!) on that disk. The program should be the exact opposite of rm, restoring the specified file that has been previous removed. If the file does not exist (it may have been overwritten), or if it is a directory, then your program should return the appropriate error.
Hint: The file to be restored will not appear in the directory entries of the parent directory, unless you search the “gaps” left when files get removed. The directory entry structure is the key to finding out these gaps and searching for the removed file.
Note: If the directory entry for the file has not been overwritten, you will still need to make sure that the inode has not been reused, and that none of its data blocks have been reallocated. You may assume that the bitmaps are reliable indicators of such fact. If the file cannot be fully restored, your program should terminate with ENOENT, indicating that the operation was unsuccessful.
Note(2): For testing, you should focus primarily on restoring files that you’ve removed using your ext2_rm implementation, since ext2_restore should undo the exact changes made by ext2_rm. While there are some removed entries already present in some of the image files provided, the respective files have been removed on a non-ext2 file system, which is not doing the removal the same way that ext2 would. In ext2, when you do “rm”, the inode’s i_blocks do not get zeroed, and you can do full recovery, as stated in the assignment (which deals solely with ext2 images, hence why you only have to worry about this type of (simpler) recovery). In other FSs things work differently. In ext3, when you rm a file, the data block indexes from its inode do get zeroed, so recovery is not as trivial. For example, there are some removed files in deletedfile.img, which have their blocks zero-ed out (due to how these images were created). In such cases, your code should still work, but simply recover a file as an empty file (with no data blocks). However, for the most part, try to recover files that you’ve ext2_rm-ed yourself, to make sure that you can restore
data blocks as well.
Note(3): We will not try to recover files that had hardlinks at the time of removal. This is because when trying to restore a file, if its inode is already in use, there are two options: the file we’re trying to restore previously had other hardlinks (and hence its inode never really got invalidated), _or_ its inode has been re-allocated to a completely new file. Since there is no way to tell between these 2 possibilities, recovery in this case should not be attempted.
BONUS: Implement an additional “-r” flag (after the disk image argument), which allows restoring directories as well. In this case, you will have to recursively restore all the contents of the directory specified in the last argument. If “-r” is used with a regular file or link, then it should be ignored (the restore operation should be carried out as if the flag had not been entered). If you decide to do the bonus, make sure first that your ext2_restore works, then create a new copy of it and rename it to ext2_restore_bonus.c, and implement the additional functionality in this separate source file.
ext2_checker: This program takes only one command line argument: the name of an ext2 formatted virtual disk. The program should implement a lightweight file system checker, which detects a small subset of possible file system inconsistencies and takes appropriate actions to fix them (as well as counts the number of fixes), as follows:
a. the superblock and block group counters for free blocks and free inodes must match the number of free inodes and data blocks as indicated in the respective bitmaps. If an inconsistency is detected, the checker will trust the bitmaps and update the counters. Once such an inconsistency is fixed, your program should output the following message: “Fixed: X’s Y counter was off by Z compared to the bitmap”, where X stands for either “superblock” or “block group”, Y is either “free blocks” or “free inodes”, and Z is the difference (in absolute value). The Z values should be added to the total number of fixes.
b. for each file, directory, or symlink, you must check if its inode’s i_mode matches the directory entry file_type. If it does not, then you shall trust the inode’s i_mode and fix the file_type to match. Once such an inconsistency is repaired, your program should output the following message: “Fixed: Entry type vs inode mismatch: inode [I]”, where I is the inode number for the respective file system object. Each inconsistency counts towards to total number of fixes.
c. for each file, directory or symlink, you must check that its inode is marked as allocated in the inode bitmap. If it isn’t, then the inode bitmap must be updated to indicate that the inode is in use. You should also update the corresponding counters in the block group and superblock (they should be consistent with the bitmap at this point). Once such an inconsistency is repaired, your program should output the following message: “Fixed: inode [I] not marked as in-use”, where I is the inode number. Each inconsistency counts towards to total number of fixes.
d. for each file, directory, or symlink, you must check that its inode’s i_dtime is set to 0. If it isn’t, you must reset (to 0), to indicate that the file should not be marked for removal. Once such an inconsistency is repaired, your program should output the following message: “Fixed: valid inode marked for deletion: [I]”, where I is the inode number. Each inconsistency counts towards to total number of fixes.
e. for each file, directory, or symlink, you must check that all its data blocks are allocated in the data bitmap. If any of its blocks is not allocated, you must fix this by updating the data bitmap. You should also update the corresponding counters in the block group and superblock, (they should be consistent with the bitmap at this point). Once such an inconsistency is fixed, your program should output the following message: “Fixed: D in-use data blocks not marked in data bitmap for inode: [I]”, where D is the number of data blocks fixed, and I is the inode number. Each inconsistency counts towards to total number of fixes.
Code Help
Your program must count all the fixed inconsistencies, and produce one last message: either “N file system inconsistencies repaired!”, where N is the number of fixes made, or “No file system inconsistencies detected!”.
You may limit your consistency checks to only regular files, directories and symlinks.
Hint: You might want to fix the counters based on the bitmaps, as a one-time step before attempting to fix any other type of inconsistency. Even if initially trusting the bitmaps may not be the way to go (since they could be corrupted), the counters should get readjusted in the later steps anyway, whenever the bitmaps get updated. The Z values from point a) should be added to the tally of fixes, but do not include any further superblock or block group counter adjustments from points c) and e) (since technically these may be just correcting the adjustments made in point a)).
All of these programs should be minimalist. Don’t implement what isn’t specified: only provide the required functionality and specified errors. (For example, don’t implement wildcards. Also, can’t delete directories? Too bad! Unless you want the bonus! 🙂
You will find it very useful for these programs to share code. You will want a function that performs a path walk, for example. You will also want a function that opens a specific directory entry and writes to it.
To help you visualize your file system, we are giving you an already built tool, called ext2_ls (ext2_ls). This program takes two command line arguments.
– The first is the name of an ext2 formatted virtual disk.
– The second is an absolute path on the ext2 formatted disk.
The program works like ls -1a (that’s number one “1”, not lowercase letter “L”): it prints one line for every directory entry (including “.” and “..”) from the directory specified by the absolute path. If the path does not exist, it prints “No such file or directory”. If the path is a file or link, the tool simply prints the full path on a single line.
We are also giving you a tool called ext2_dump (ext2_dump), which dumps all the raw information about the image contents. This is very similar to the readimage program that you have to implement for the tutorial exercises that give you practice with ext2 images. Once again, we encourage you to work on the tutorial exercises first, to gain experience with extracting various bits of information from an ext2 image.
We are also giving you a tool called ext2_corruptor (ext2_corruptor), which corrupts file system images, introducing various inconsistencies like the ones that you have to fix. This tool has limited capabilities, and is solely to help you with basic testing. You are welcome to develop your own corruptor tool as well.
Finally, to help you in determine some basic correctness, we are giving you some sample test cases: running a set of commands and their expected outputs (image dumps). These self-tester test cases are found on the teaching labs under: /u/csc369h/winter/pub/public/A4-self-test
Learning about the Filesystem
Here are several sample virtual disk images:
emptydisk (./images/emptydisk.img): An empty virtual disk.
onefile (./images/onefile.img): A single text file has been added to emptydisk.
deletedfile (./images/deletedfile.img): The file from onefile has been removed.
onedirectory (./images/onedirectory.img): A single directory containing a text file has been added to emptydisk.
hardlink (./images/hardlink.img): A hard link to the textfile in onedirectory was added. deleteddirectory (./images/deleteddirectory.img): A recursive remove was used to remove the
directory and file from onedirectory.
twolevel (./images/twolevel.img): The root directory contains a directory called level1 and a file called afile. level1 contains a directory called level2, and level2 contains a file called bfile.
twolevel-corrupt (./images/twolevel-corrupt.img): Same as twolevel, except that the image contains file system inconsistencies that you will have to repair with the checker. twolevel-norestore-afile (./images/twolevel-norestore-afile.img): Same as twolevel, except that /afile has been removed, and its inode number has been reused. This image can be used for testing the case when a file cannot be restored.
largefile (./images/largefile.img): A file larger than 13KB (13440 bytes) is in the root directory. This file requires the single indirect block in the inode.
These disks were each created and formatted in the same way (on an ubuntu virtual machine):
% dd if=/dev/zero of=~/DISKNAME.img bs=1024 count=128
% mke2fs -N 32 DISKNAME.img
% sudo mount -o loop ~/DISKNAME.img /home/bogdan/mntpoint
% cd /home/bogdan/mntpoint
% …… normal linux commands to add/remove files/directories/links …..
% umount /home/bogdan/mntpoint
Since we are creating images with mke2fs, the disks are formatted with the ext2 file system (http://en.wikipedia.org/wiki/Ext2). You may wish to read about this system before doing some exploration. The wikipedia page for ext2 (http://en.wikipedia.org/wiki/Ext2) provides a good overview, but the Ext2 wiki (http://wiki.osdev.org/Ext2) and Dave Poirer’s Second Extended File System (http://www.nongnu.org/ext2- doc/index.html) article provide more detail on how the system places data onto a disk. It’s a good reference to keep on hand as you explore.
We are restricting ourselves to some simple parameters, so you can make the following assumptions when you write your code:
A disk is 128 blocks where the block size is 1024 bytes.
There is only one block group.
There are 32 inodes.
You do not have to worry about permissions or modified time fields in the inodes. You should set the type (in i_mode), i_size, i_links_count, i_blocks(disk sectors), and the i_block array.
We will not test your code on anything other than disk images that follow this specification, or on corrupted disk images.
Other tips:
Inode and disk block numbering starts at 1 instead of 0.
The root inode is inode number 2 (at index 1)
The first 11 inodes are reserved.
There is always a lost+found directory in the root directory.
Disk sectors are 512 bytes. (This is relevant for the i_blocks field of the inode.) You should be able to handle directories that require more than one block. You should be able to handle a file that needs a single indirection
Although you can construct your own structs from the information in the documentation above, you are welcome to use the ext2.h (./ext2.h) file that I used for the test code. I took out a bunch of
程序代写 CS代考 加微信: cstutorcs
components that we aren’t using, but there are still quite a few fields that are irrelevant for our purposes.
However, you will probably also want to explore the disk images to get an intuitive sense of how they are structured. (The next three exercises will also help you explore the disk images and get started on the assignment.)
There are two good ways to interface with these images. The first way is to interact with it like a user by mounting the file system so that you can use standard commands (mkdir, cp, rm, ln) to interact with it. Details of how to do this are below. The second way is to interact with the disk as if it is a flat binary file. Use xxd to create hex dumps, diff to look for differences between the dumps, and your favorite text editor to view the diffs. For example (YMMV):
% diff <(xxd emptydisk.img) <(xxd onefile.img) > empty-onefile.diff
% vimdiff empty-onefile.diff
You should be able to use a combination of these techniques to understand how files are placed on disk and how they are removed. For example, you can create a new disk image, use mount to place files of various sizes on it, unmount it, and then use xxd and diff to see how the image differs from the other images you have.
Mounting a file system
If you have root access on a Linux machine (or Linux virtual machine), you can use mount to mount the disk into your file system and to peruse its contents. (Note: this requires sudo, so you will need to do this on a machine (or virtual machine) that you administer.
On the teaching labs, you can use a tool called FUSE that allows you to mount a file system at user-level (from your regular account). It may not work on an NFS mounted file system, so this will only work on the teaching labs workstations (it will not work via ssh-ing remotely).
Note:
# create a directory in /tmp and go there
mkdir -m 700 /tmp/
cd /tmp/
# to create your own disk image
dd if=/dev/zero of=DISKNAME.img bs=1024 count=128
/sbin/mke2fs -N 32 -F DISKNAME.img
# create a mount point and mount the image
# CWD is /tmp/
fuseext2 -o rw+ DISKNAME.img mnt
# check to see if it is mounted
# now you can use the mounted file system, for example
mkdir mnt/test
Programming Help
# unmount the image
fusermount -u mnt
You can use the same strategy to mount one of the images provided above.
Marking scheme
ext2_cp: 12%
ext2_mkdir: 12%
ext2_ln: 12%
ext2_rm: 14% (+5% bonus) ext2_restore: 20% (+5% bonus) ext2_checker: 20%
Coding Style: 10% Submission
The assignment should be submitted to an A4 directory in your svn repository. Don’t forget to add and commit all of the code for the required programs. Please also provide a Makefile that will create your programs. Your Makefile should use -Wall, and produce no warnings. Please make sure that your Makefile includes the following separate targets:
ext2_cp: compiles and produces the ext2_cp executable
ext2_mkdir: compiles and produces the ext2_mkdir executable
ext2_ln: compiles and produces the ext2_ln executable
ext2_rm: compiles and produces the ext2_rm executable
ext2_rm_bonus (optional) compiles and produces the ext2_rm_bonus executable (optional, for bonus)
ext2_restore: compiles and produces the ext2_restore executable
ext2_restore_bonus (optional): compiles and produces the ext2_restore_bonus executable (optional, for bonus)
ext2_checker: compiles and produces the ext2_checker executable
Additionally, invoking make without arguments must compile all the targets.
Please consider separating the bonus parts as indicated above, in order to make it easier for us to determine if you did the bonus or not. The bonus source files and their counterparts may include large portions of the same code. This will make it easier for you as well, to make sure that at the very least the baseline implementation works, even if the bonus part may have problems or crash your baseline (non- bonus) implementation.
Additionally, you must submit an INFO.txt file, which contains as the first 3 lines the following:
your name(s)
your UtorID(s)
the svn revision number for your last submission. As a general rule, we will always take the last revision before the deadline (or after, if you decide to use grace tokens), so this is simply a sanity check for us that we did not miss a revision when we retrieve your code via MarkUs.
Aside from this, please feel free to describe problems you’ve encountered, what isn’t fully implemented (or doesn’t work fully), any special design decisions you’ve taken, etc. Feel free to explain what is not
implemented and describe what features you have completed. You may receive partial credit for functionality that is implemented but that does not complete one of the five required programs successfully.
Final (and Very Important) notes:
– Assignments missing a Makefile will receive a 0, as if the code did not compile!
– You must make sure that your Makefile compiles all of your files, including possible helper files, depending on your design, and that you have included all the necessary targets specified above.
– You must make sure that the source files that are mandatory are named exactly as indicated in the handout, and that the Makefile produces executables with the same name excluding the .c extension (for example: compiling ext2_cp.c should generate an executable named ext2_cp – do not submit the executables though).
– Missing files due to submission mistakes (forgot to add files, forgot to commit last version, etc.), will not be considered!
– It is your responsibility to ensure that your code works exactly as you expect it to, on the teaching labs!
ý 2017 Bogdan Simion
powered by Jekyll-Bootstrap-3 (http://jekyll-bootstrap-3.github.io/preview/#/theme/Bootstrap) and Twitter Bootstrap 3.0.3 (http://getbootstrap.com)