CSC369 Assignment 5 FUSE File Systems
Due Dec 10 by 11:59p.m. Points 9 Available after Nov 23 at 12p.m.
Introduction
You will be implementing a version of the Very Simple File System (VSFS) from the OSTEP text and lectures. We will be using FUSE to interact with your file system. FUSE allows you to implement a file system in user space by implementing the callback functions that the libfuse library will call. The Tutorial 9 Exercise (https://q.utoronto.ca/courses/278937/quizzes/275729) should give you some practice with using FUSE.
Using FUSE
Refer to the Tutorial 9 Exercise (https://q.utoronto.ca/courses/278937/quizzes/275729) handout for instructions on getting started with FUSE.
Additional Setup
Unlike the passthrough file system of the tutorial exercise, your VSFS file system will operate on a disk image file. A disk image (https://en.wikipedia.org/wiki/Disk_image) is simply an ordinary file that holds the content of a disk partition or storage device.
To allow you to test your file system operations independently of your file system format code (mkfs), we have provided some simple VSFS-formatted disk images in the course pub directory on teach.cs at /u/csc369h/fall/pub/a5:
vsfs-empty.disk – Small, empty file system (64 inodes, 1 MB size). Contains just root directory with ‘.’ and ‘..’ entries.
vsfs-maxfs.disk – Maximum size VSFS file system (512 inodes, 128 MB size). Contains just root directory with ‘.’ and ‘..’ entries.
vsfs-1file.disk – Small file system (64 inodes, 1 MB size) containing a single small file (only 1 data block) in the root directory.
vsfs-3file.disk – Medium file system (128 inodes, 16 MB size) containing 3 files (small – only direct blocks, medium – some indirect blocks, and maximum VSFS file size).
vsfs-many.disk – Small file system (256 inodes, 2 MB size) containing lots of small files (root directory inode uses indirect block pointer).
You will need to make your own copies of these disk images to use them, since you will need to be able to write to them.
https://q.utoronto.ca/courses/278937/assignments/901111
2022/12/3 17:59 FUSE File Systems
You will also need to create your own empty disk images that you can format using your mkfs program. To do so, you will run the following commands:
truncate -s
./mkfs.vsfs -i
The truncate command will create the image file if it doesn’t exist and will set its size; mkfs.vsfs will format it into your vsfs file system (after you have completed the implementation).
Once you have a formatted vsfs disk image (one of ours, or your own) the next step is to mount your file system. We assume that you will be using /tmp/userid as in the Tutorial 9 exercise as the mountpoint, and that you will want to keep it running in the foreground so that you can see your output as it runs:
./vsfs
The image file is the disk image formatted by mkfs.vsfs. Not only does vsfs mount the disk image into the local file system, it also sets up callbacks and then calls fuse_main() so that FUSE can do its work. Both vsfs and mkfs.vsfs have additional options – run them with -h to see their descriptions.
After the file system is mounted, you can access it using standard tools (ls, cp, rm, etc.). To unmount the file system, run:
fusermount -u
Note that you should be able to unmount the file system after any sequence of operations, such that when it is mounted again, it has the same contents.
Simplifying Assumptions
For this assignment, we make a number of simplifying assumptions:
VSFS file systems are always small enough that they can be entirely mmap’d into the vsfs process’s virtual address space.
The underlying operating system will handle all write back of dirty pages to the vsfs disk image.
If the file system crashes, the disk image may be inconsistent. Your code should not crash, but it does not need to take any special effort to maintain crash consistency.
There is a flat namespace. All files are located in the root directory and there are no subdirectories. You do not need to implement mkdir or rmdir.
All paths are absolute (they all start with ‘/’). If you see a path that is not absolute, or that has more than one component, you can return an error.
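The flat-namespace assumption can be checked once, up front, by a small path helper. The sketch below is only an illustration: the function name, the SKETCH_NAME_MAX constant, and the specific error codes are assumptions made for this example. The "Errors" comments in the starter code are the authority on which codes each callback must return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit for this sketch; the real one is VSFS_NAME_MAX. */
#define SKETCH_NAME_MAX 252

/*
 * Validate a path under the flat-namespace assumption.
 * On success, returns 0 and points *name at the single path component
 * (an empty string when the path is the root directory "/").
 * Returns -ENOTDIR for relative or multi-component paths and
 * -ENAMETOOLONG when the component does not fit in SKETCH_NAME_MAX bytes
 * including the null terminator. These error codes are placeholders.
 */
static int sketch_parse_path(const char *path, const char **name)
{
    assert(path != NULL && name != NULL);

    if (path[0] != '/')
        return -ENOTDIR;              /* not an absolute path */

    const char *component = path + 1;
    if (strchr(component, '/') != NULL)
        return -ENOTDIR;              /* more than one component */

    if (strlen(component) >= SKETCH_NAME_MAX)
        return -ENAMETOOLONG;

    *name = component;
    return 0;
}
```

A lookup helper could call this first and then search the root directory entries for the returned name.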
Understanding the starter code
First read through all of the starter code to understand how it fits together, and which files contain helper functions that will be useful in your implementation.
mkfs.c – contains the program to format your disk image. You need to write part of this program. You will also find it helpful to read the code to see how we access parts of the file system after using mmap() to map the entire disk image into the process virtual address space.
vsfs.h – contains the data structure definitions and constants needed for the file system. You may add other definitions or constants that you find useful, but you should not change the file system metadata. That is, do not add or modify fields in the superblock, inode, or direntry structures and do not change the existing definitions.
vsfs.c – contains the program used to mount your file system. This includes the callback functions that will implement the underlying file system operations. Each function that you will implement is preceded by detailed comments and has a “TODO” in it. Please read this file carefully.
NOTE: It is very important to return the correct error codes (or 0 on success) from all the FUSE callback functions, according to the “Errors” section in the comment above the function. The FUSE library, the kernel, and the user-space tools used to access the file system all rely on these return codes for correctness of operation.
Note: You will see many lines whose only purpose is to prevent the compiler from warning about unused variables. You should delete these lines as you make use of the variables.
and – The struct contains runtime state of your mounted file system. Any time you think you need a global variable, it should go in this struct instead. We have cached some useful global state in this structure already (e.g. pointers to superblock, bitmaps, and inode table), but you may find there is additional state that you want to add, instead of recomputing it on every operation.
and – contain the map_file() function used by vsfs and mkfs.vsfs to map the image file into memory and determine its size. You should not need to change anything here, or make any additional calls to the function beyond what is in the starter code.
and – contain the code to parse command line arguments for the vsfs program. You should not need to change anything here, or make any additional calls to the function beyond what is in the starter code.
util.h – contains some handy functions.
and – contain code to initialize bitmaps, and to allocate or free items tracked by the bitmaps. You will use these to allocate and free inodes and data blocks, so make sure you read the functions and understand how to use them. You may notice that the bitmap_alloc function can be slow, since it always starts the search for a 0 bit from the start of the bitmap. You are free to improve on this if you wish, but you do not need to.
You are welcome to put some of the helper functions in separate files instead of keeping everything in vsfs.c. Make sure to update the Makefile to compile those files and add/commit/push them to your git repository.
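Because the whole image is mapped into memory, locating a block is plain pointer arithmetic on the base address of the mapping. The following is a minimal sketch of that idea, not code from the starter files: the helper names and the SKETCH_BLOCK_SIZE constant are made up for illustration (the real constant is VSFS_BLOCK_SIZE), and the base pointer is assumed to be whatever the starter code's mapping returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 4096u  /* assumed equal to the 4096-byte VSFS block size */

/* Return a pointer to block number blk inside an image mapped at 'image'. */
static void *sketch_block_ptr(void *image, size_t image_size, uint32_t blk)
{
    /* Refuse block numbers that fall outside the mapped image. */
    assert((size_t)blk < image_size / SKETCH_BLOCK_SIZE);
    return (unsigned char *)image + (size_t)blk * SKETCH_BLOCK_SIZE;
}

/* Copy len bytes, starting at byte offset off within block blk, into buf. */
static void sketch_read_block(void *image, size_t image_size, uint32_t blk,
                              size_t off, void *buf, size_t len)
{
    /* The requested range must stay inside a single block. */
    assert(off <= SKETCH_BLOCK_SIZE && len <= SKETCH_BLOCK_SIZE - off);
    const unsigned char *src = sketch_block_ptr(image, image_size, blk);
    memcpy(buf, src + off, len);
}
```

Writing is the mirror image: copy from the caller's buffer into the mapped block, and the operating system writes the dirty pages back to the image file.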
Recommended progression of your work
You should tackle this project in stages so that you can be confident that each piece works before moving on to the next step. The creation of a new file system (mkfs.c) and operations on a formatted file system (vsfs.c) can be handled independently, however, so you can do Steps A and B in either order.
Step A1: Write enough of mkfs.c so that you can mount the file system and see the root directory.
Step A2: Implement vsfs_statfs() and test it on the root directory of the mounted file system to check that the superblock is initialized correctly by your mkfs.
Step B1: Write vsfs_getattr(). You have probably seen from the tutorial exercise that FUSE calls it a lot. Implementing this function is the key to the rest of the operations. You will want to write a helper function that takes a path and returns a pointer to the inode (or the inode number) for the last component in the path. Remember that you only need to handle paths that are of the form “/” or “/somefile” – all paths are absolute and there are no subdirectories in our vsfs file systems.
Step B2: Write vsfs_readdir() so that you can list the root directory when the root directory entries fit within a single data block. You should be able to mount vsfs-empty.disk, vsfs-maxfs.disk, vsfs-1file.disk and vsfs-3file.disk and list their root directories on completion of this step.
Step B3: Add the ability to create empty files by implementing vsfs_create(). On completion of this step, you should be able to mount vsfs-empty.disk and create a new empty file. The new file should be visible and the mode and timestamps should be appropriate when you ‘ls -l’ on the root directory.
Step B4: Add the ability to write to, and read from, small files, first where the data is stored in a single data block, and then when the data can be stored using only the direct block pointers in the inode. Implement vsfs_truncate() first, then vsfs_write() and vsfs_read().
Step B5: Add the ability to remove small files (where the file data uses only the direct block pointers in the inode).
Step B6: Enhance your implementation of vsfs_readdir() to list large directories, where the directory inode’s indirect block is needed to read all of the directory data blocks. You should be able to mount vsfs-many.disk and list its root directory on completion of this step.
Step B7: Enhance your implementations of vsfs_truncate(), vsfs_write(), vsfs_read(), and vsfs_unlink() to support large files, where the indirect block in the file’s inode is used to locate some of the file’s data blocks.
Tip: Comment your code well. It will help you keep track of what is implemented and your understanding of how things work. Refactor your code during development (not after) and keep your functions short and well-structured.
Tip: Check that there is enough space before making any changes to the file system. This will save you from having to roll back changes if you discover that an operation cannot be completed due to lack of space.
Tip: Remember to update fields in the superblock (e.g. free_inodes, free_blocks) as you operate on the file system.
Testing and debugging recommendations
You can use standard Unix tools to manipulate directories and files in a mounted vsfs image in order to test your implementation. System call tracing with strace can help understand what syscalls they invoke to access the file system. You can, in general, use the behaviour of the host file system (ext4) as a reference – your vsfs should have the same observable behaviour for operations that vsfs needs to support. You can also write your own C programs that invoke relevant syscalls directly.
You will find it useful to run vsfs under gdb:
gdb --args ./vsfs
You can then run file system operations in a separate terminal window. You can set breakpoints at the start of your FUSE callback functions (e.g. break vsfs_getattr) to help you understand what callbacks are invoked when you execute a file system operation (e.g. ls), in what order, and with what arguments. The debugger is also helpful in investigating crashes (e.g., segfaults) and stepping through the execution of the callback functions so that you can check the state of the file system as the operations execute. Off-by-one errors are common but can be catastrophic when they lead to accessing the wrong block of file system metadata.
You might also find it useful to view the binary contents of your vsfs image files using xxd. See the xxd manual page for documentation.
To avoid errors when mounting the file system, make sure that the mount point is not in use (e.g. by a previous vsfs mount that didn’t finish cleanly). If fusermount fails to unmount because the mount point directory is “busy”, you can use the lsof command to identify the process that keeps it open.
One common error message you might see when running operations on the mounted file system is “transport endpoint is not connected”. This error usually means that the file system is still mounted, but the vsfs program has terminated (e.g. crashed). In this case you need to manually unmount it with fusermount, as described above.
One of the most common errors you might see at the early stages of the implementation is a directory listing that reports an “I/O error” and displays “???” entries. This error usually means that the callback that fills in the stat structure returns invalid data and/or an invalid return value.
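To make the last point concrete, here is a hedged sketch of filling a struct stat from an inode. The struct sketch_inode type and its field names are invented for this example (the real inode definition is in the starter code), and whether the block count should include an indirect block is a decision the sketch does not make for you.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical stand-in for the on-disk inode. */
struct sketch_inode {
    mode_t          mode;    /* file type and permission bits */
    uint32_t        links;   /* number of hard links */
    uint64_t        size;    /* file size in bytes */
    uint32_t        blocks;  /* number of 4096-byte blocks in use */
    struct timespec mtime;   /* last modification time */
};

/* Fill *st from *ino, zeroing every field that is not set explicitly. */
static void sketch_fill_stat(const struct sketch_inode *ino, struct stat *st)
{
    assert(ino != NULL && st != NULL);
    memset(st, 0, sizeof(*st));
    st->st_mode   = ino->mode;
    st->st_nlink  = ino->links;
    st->st_size   = (off_t)ino->size;
    /* On Linux, st_blocks is counted in 512-byte units, not file system blocks. */
    st->st_blocks = (blkcnt_t)ino->blocks * (4096 / 512);
    st->st_mtim   = ino->mtime;
}
```

Zeroing the structure first avoids handing uninitialized fields back to the kernel, which is one way to end up with the “???” entries described above.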
In order to test reads and writes at an offset, you can either use the tail command (its -c option; see man 1 tail), or write your own C programs that use pread() and pwrite().
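A small test program along these lines could look like the sketch below. It is only an illustration: the function name and the test pattern are arbitrary, and the path is assumed to name an existing file on your mounted vsfs (it works on any file system, so you can compare against the host).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write a short pattern at byte offset 'off' in the existing file 'path',
 * read it back from the same offset, and compare.
 * Returns 0 if the data round-trips, -1 on any error or mismatch.
 */
static int sketch_check_rw_at(const char *path, off_t off)
{
    static const char pattern[] = "vsfs-offset-test";
    char readback[sizeof(pattern)];

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    int ok = pwrite(fd, pattern, sizeof(pattern), off) == (ssize_t)sizeof(pattern)
          && pread(fd, readback, sizeof(readback), off) == (ssize_t)sizeof(readback)
          && memcmp(pattern, readback, sizeof(pattern)) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```

Choosing offsets just before and just after a block boundary, and offsets past the current end of the file, exercises the cases where off-by-one errors tend to hide.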
Limits and details
The maximum number of inodes in the system is a parameter to mkfs.vsfs, the image size is also known to it, and the block size is VSFS_BLOCK_SIZE (4096 bytes – declared in vsfs.h). Many parameters of your file system can be computed from these three values.
We will not test your code on an image smaller than 64 KiB (16 blocks) with 4 inodes. You should be able to fit the root directory and a non-empty file in an image of this size and configuration. You shouldn’t pre-allocate additional space for metadata (beyond the fixed metadata defined for VSFS, the space needed to store the inode table and the root directory) in your implementation. Indirect blocks should only be allocated on demand, when a file or directory grows large enough to need it.
The maximum path component length is VSFS_NAME_MAX (252 bytes including the null terminator). This value is chosen to fit the directory entry structure into 256 bytes (see vsfs.h). Names stored in directory entries are null-terminated strings so that you can use standard C string functions on them. The maximum full path length is _POSIX_PATH_MAX (4096 bytes including the null terminator). This allows you to use fixed-size buffers for operations like splitting a path into a directory name and a file name.
The maximum file size is dictated by the number of direct block pointers in a vsfs inode (VSFS_NUM_DIRECT) and the number of block pointers in an indirect block (VSFS_BLOCK_SIZE / sizeof(vsfs_blk_t)).
The number of directory entries is limited by the maximum number of directory entry data blocks (same as the limit on file blocks).
The number of blocks in your file system is limited by the number of bits in a single VSFS block, since we use only 1 block for the data bitmap.
You can assume that read and write operations are performed one block at a time. Each read and write call your file system receives will only cover a range within a single block. NOTE: this does not apply to truncate() – a single call needs to be able to extend or shrink a file by an arbitrary number of blocks.
Sample disk configurations that must work include:
64KiB size and 4 inodes
64KiB size and 16 inodes
1MiB size and 64 inodes
128MiB size and 512 inodes
We will not be testing your code under extreme circumstances so don’t get carried away by thinking about corner cases. However, we do expect you to properly handle “out of space” conditions in your code. Any operation that cannot be completed because there are not enough free blocks or inodes must be cleanly aborted – no blocks or inodes can “leak” in the process. The simplest way to ensure this is to check that there is enough space to complete the operation before modifying any file system metadata. The formatting program (mkfs) must also check that the image file is large enough to accommodate the requested number of inodes.
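The file size and image size limits described above reduce to simple arithmetic. The sketch below works them out with the number of direct pointers and the pointer size passed in as parameters, because their actual values come from vsfs.h and are not assumed here; the function names and SKETCH_BLOCK_SIZE are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 4096u  /* assumed equal to the 4096-byte VSFS block size */

/*
 * Largest file size in bytes for an inode with num_direct direct pointers
 * and one indirect block holding pointers of ptr_size bytes each.
 */
static uint64_t sketch_max_file_size(uint32_t num_direct, uint32_t ptr_size)
{
    assert(ptr_size > 0);
    uint64_t max_blocks = (uint64_t)num_direct + SKETCH_BLOCK_SIZE / ptr_size;
    return max_blocks * SKETCH_BLOCK_SIZE;
}

/*
 * Largest image size in bytes: a single data bitmap block holds
 * SKETCH_BLOCK_SIZE * 8 bits, and each bit tracks one block.
 */
static uint64_t sketch_max_image_size(void)
{
    return (uint64_t)SKETCH_BLOCK_SIZE * 8u * SKETCH_BLOCK_SIZE;
}
```

With 4096-byte blocks the bitmap limit is 32768 blocks, or 128 MiB, which agrees with the largest sample configuration listed above.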
Other implementation notes:
Although the “.” and “..” directory entries can be manually listed by the vsfs_readdir() callback (as in the starter code), you should create actual entries for these when you initialize the root directory in mkfs().
The only timestamp you need to store for each file and directory is the modification time (mtime) – you don’t need to store the access time (atime) or the change time (ctime). You can use the touch command to set the modification timestamp of a file or directory to the current time.
Any data and metadata blocks (other than the fixed metadata) should only be allocated on demand.
Read and write I/O should be performed by reading/writing the virtual memory where the disk image is mmap’d. It should NOT be performed byte-by-byte (which is very inefficient); copy each range in one operation, for example with memcpy().
Your implementation shouldn’t use any floating point arithmetic. See the helper functions in util.h – if you need other, similar, functions (like floor), they can also be easily implemented using integer arithmetic.
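As an illustration of the integer-only approach, rounding up a division needs no call to ceil(). The helpers below are a sketch with made-up names; the starter util.h may already provide equivalents, in which case use those instead.

```c
#include <assert.h>
#include <stdint.h>

/* Integer division rounded up, without floating point and without overflow in n + d. */
static uint32_t sketch_div_round_up(uint32_t n, uint32_t d)
{
    assert(d > 0);
    return n / d + (n % d != 0);
}

/* Number of 4096-byte blocks needed to hold 'size' bytes. */
static uint32_t sketch_blocks_for_size(uint32_t size)
{
    return sketch_div_round_up(size, 4096u);
}
```

Rounding down (floor) for non-negative integers is just the ordinary / operator.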
Documentation
It is recommended that you include a README.txt file that describes any aspects of your code that do not work well. Code that works well and implements a subset of the functionality will get a higher mark than code that attempts to implement more functionality but doesn’t work.
What to submit
Add all the starter files from MarkUs to your a5 repository. Also add to your repository all additional source code files that you create as part of your implementation.
Your a5 repository must contain all the files necessary to compile and run mkfs.vsfs and vsfs. It may include a README file as described above. Do NOT add and commit virtual machine images, executables, .o or .d files, disk image files, or any other unnecessary files – you will lose code style marks if you do submit those. You are welcome to commit test code and other text files. You should use a .gitignore file to help ensure you only commit and push files you should.
Useful links
libfuse GitHub repository: https://github.com/libfuse/libfuse
FUSE API header file for the version we’re using: https://github.com/libfuse/libfuse/blob/fuse_2_9_bugfix/include/fuse.h
FUSE wiki: https://github.com/libfuse/libfuse/wiki