COMP1521 23T2 — Assignment 2 a file synchroniser

Assignment 2: a file synchroniser
version: 1.1.1 last updated: 2023-07-23 17:00:00

You may find the Assignment 2 overview video to be a helpful resource to help you get started.

The prerequisite knowledge for this assignment has been covered in all lectures up to the Week 7 Monday

lecture, as well as this bonus lecture which covers stat, chmod and directories. You will also need mkdir in

Subset 5, which is covered at the start of this video.

An additional help video for subset 2 onwards of the assignment is available here. We recommend

watching this video after you complete subset 1.

Getting Started

Handling Errors

Reference implementation

Helper utilities

Formats of basin indices

Type A Basin Index format

Type B Basin Index format

Type C Basin Index format

basin vs rsync

Assumptions and Clarifications

Assessment

Submission

Assessment Scheme

Intermediate Versions of Work

Assignment Conditions

Change Log

building a concrete understanding of file system objects;

practising C, including bitwise operations and robust error handling;

understanding file operations, including input-output operations on binary data

The rsync utility is a useful and popular tool which efficiently transfers files between computers. In this assignment
you will be implementing basin, which is a simplified version of rsync.

To copy a file from a sending computer to a receiver, it would theoretically be sufficient to just naïvely send over the

entire contents (and possibly metadata) of the file.

However, if the receiver already has an older version of the file which is very close to the sender’s version (or even an

identical copy!), then a large amount of redundant data is being transmitted. With slow networks or large file sizes

this can translate to unnecessary waiting and cost.

Both the real rsync utility and the basin utility that you’ll be implementing in this assignment avoid unnecessary data

transfer by only sending the chunks of a file which differ between sender and receiver. The basin algorithm takes

place over four stages:

1. Stage 1: the sender constructs a Type A Basin Index (TABI) file containing a record for each file the sender
wants to send. Each record contains metadata about the file, as well as a hash for each block in the file (see the
subset 1 description for more information).

2. Stage 2: the receiver uses the TABI file to construct a Type B Basin Index (TBBI) file containing a record for
each TABI record. The TBBI file contains information about which blocks the receiver already has an up-to-date
copy of (see the subset 2 description for more information).

3. Stage 3: the sender uses the TBBI file to construct a Type C Basin Index (TCBI) file containing a
record for each TBBI record. The TCBI file contains the contents of the blocks which the receiver did not have
an up-to-date copy of (see the subset 3 description for more information).

4. Stage 4: the receiver uses the TCBI file to reconstruct an up-to-date copy of the files it is receiving (see the
subset 4 description for more information).

The first four subsets of this assignment correspond to implementing each of these stages for a given list of files.

The fifth subset involves adding support for directories.

The real rsync utility is able to transfer files over a network to a remote computer, where the sender would be

one computer and the receiver would be another. It can also transfer files locally, where the ‘sender’ and ‘receiver’

are two different directories on the same computer. In this assignment, you will only be implementing the local

version of basin, where the sender and receiver are two different directories on the same computer.

Getting Started
Create a new directory for this assignment, change to this directory, and fetch the provided code by running

$ mkdir -m 700 basin
$ cd basin
$ 1521 fetch basin

If you’re not working at CSE, you can download the provided files as a zip file or a tar file.

This will give you the following files:

basin.c is the only file you need to change: it contains partial definitions of four functions,
stage_1, stage_2, stage_3, and stage_4, to which you need to add code to complete the

assignment. You can also add your own functions to this file.

basin_main.c contains a main, which has code to parse the command line arguments, and which
then calls one of stage_1, stage_2, stage_3, and stage_4, depending on the command

line arguments given to basin. Do not change this file.

https://en.wikipedia.org/wiki/Rsync
https://cgi.cse.unsw.edu.au/~cs1521/23T2/assignments/ass2/provided.zip
https://cgi.cse.unsw.edu.au/~cs1521/23T2/assignments/ass2/provided.tar

You can run make to compile the provided code; and you should be able to run the result.

dcc basin.c basin_main.c basin_provided.c -o basin
Usage: ./basin [--stage-1|--stage-2|--stage-3|--stage-4]

You may optionally create extra .c or .h files. You can modify the provided Makefile fragment if you choose to do so.

You should run 1521 basin-examples to get a directory called examples/ full of test files and example Basin
Index files to test your program against.

$ 1521 basin-examples
$ ls examples
aaa bbb ccc tabi tbbi tcbi

To complete subset 1, you need to complete the provided stage_1 function.

The stage_1 function should create a TABI file at the specified output path, based on a given array of filenames.

The TABI file should contain the appropriate header, as outlined in the format of the TABI file section below.

It should then produce a TABI record for each file in the given array of in_filenames.

basin.h contains shared function declarations and some useful constant definitions. Do not
change this file.

basin_provided.c contains the hash_block function; you should call this function to calculate hashes for
subset 1. Do not change this file.

basin.mk contains a Makefile fragment for basin.

basin_hash_block.c contains the source code for the 1521 basin-hash-block helper utility which we
have provided you. You may find it useful to look at this code to better understand how

the hash_block function can be used. Do not change, attempt to compile with, or

submit this file.

https://manpages.debian.org/jump?q=make.1

$ 1521 basin-examples
$ cd examples/aaa
emojis.txt empty fizz fractal_bin little_endian_shorts long_path lyrics.txt short.txt
$ ../../basin --stage-1 ../out.tabi emojis.txt empty
$ 1521 basin-show ../out.tabi
Field name Offset Bytes ASCII/Numeric
-----------------------------------------------------------------------
magic 0x00000000 54 41 42 49 chr TABI
num records 0x00000004 02 dec 2
============================= Record 0 ==============================
pathname len 0x00000005 0a 00 dec 10
pathname 0x00000007 65 6d 6f 6a 69 73 2e 74 chr emojis.t
0x0000000f 78 74 chr xt
num blocks 0x00000011 03 00 00 dec 3
hashes[0] 0x00000014 90 30 e3 14 6e e7 0a 90 chr .0..n...
hashes[1] 0x0000001c 91 90 5c 46 fc 07 b3 93 chr ..\F....
hashes[2] 0x00000024 8c ec 01 86 4c dc 63 af chr ....L.c.
============================= Record 1 ==============================
pathname len 0x0000002c 05 00 dec 5
pathname 0x0000002e 65 6d 70 74 79 chr empty
num blocks 0x00000033 00 00 00 dec 0

Use fopen to create the TABI file for writing. You should overwrite the file if it already exists.

Use fputc and/or fwrite to write bytes to the TABI file.

Use fgetc and/or fread to read bytes from the input files.

Use stat to get the size of each input file. In particular, you may find the st_size field of the
struct stat useful. You may find inode to be a useful source of documentation for the struct stat
fields – note that filesystem blocks are not relevant to this assignment, and shouldn’t be confused with the

blocks in a TABI record.

The provided number_of_blocks_in_file function will determine the number of blocks required for a
TABI record, given the size of the file in bytes.

Use C bitwise operations such as <<, & and | to combine bytes into little endian integers. You may find it useful to write a helper function to do this, as you will need to do this in later subsets.

Make sure you understand the format of the TABI file.

To compute the hashes field, you will need to open and read from the file, and for each block use the provided hash_block function.

Think carefully about the functions you can construct to avoid repeated code.

TABI files do not necessarily end with .tabi. This has been done with the provided example files purely as a convenience.

You may assume any paths in in_filenames are either regular files or do not exist.

http://man7.org/linux/man-pages/man3/fopen.3.html
http://man7.org/linux/man-pages/man3/fputc.3.html
http://man7.org/linux/man-pages/man3/fwrite.3.html
http://man7.org/linux/man-pages/man3/fgetc.3.html
http://man7.org/linux/man-pages/man3/fread.3.html
http://man7.org/linux/man-pages/man2/stat.2.html
http://man7.org/linux/man-pages/man7/inode.7.html

To complete subset 2, you need to complete the provided stage_2 function.

The stage_2 function receives a path to an input TABI file and a path to an output TBBI file.

The TBBI file should contain the appropriate header, as outlined in the format of the TBBI file section below.

It should then produce a TBBI record for each file in the given TABI file.
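Stage 2 has to turn the multi-byte fields of each TABI record back into integers. The following is a minimal sketch of that decoding step, not the reference implementation's approach. The helper name read_little_endian is hypothetical and is not part of the provided code.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical helper (not part of the provided code): read `num_bytes`
// bytes (1 to 8) from `f` and combine them into one little-endian integer.
// Returns false if the file ends early, which a caller could report as a
// truncated or invalid index file.
bool read_little_endian(FILE *f, int num_bytes, uint64_t *value) {
    uint64_t result = 0;
    for (int i = 0; i < num_bytes; i++) {
        int byte = fgetc(f);
        if (byte == EOF) {
            return false;
        }
        // Byte i of the file is the i-th least significant byte.
        result |= (uint64_t)byte << (8 * i);
    }
    *value = result;
    return true;
}
```

With this sketch, the eight bytes 90 30 e3 14 6e e7 0a 90 from the subset 1 example decode to 0x900ae76e14e33090. Returning a success flag rather than the value itself keeps the short-read check in one place.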
$ # [continued from subset 1 example]
$ cd ../bbb
$ ../../basin --stage-2 ../out.tbbi ../out.tabi
$ 1521 basin-show ../out.tbbi
Field name Offset Bytes ASCII/Numeric
-----------------------------------------------------------------------
magic 0x00000000 54 42 42 49 chr TBBI
num records 0x00000004 02 dec 2
============================= Record 0 ==============================
pathname len 0x00000005 0a 00 dec 10
pathname 0x00000007 65 6d 6f 6a 69 73 2e 74 chr emojis.t
0x0000000f 78 74 chr xt
num blocks 0x00000011 03 00 00 dec 3
matches[0] 0x00000014 a0 bin 10100000
============================= Record 1 ==============================
pathname len 0x00000015 05 00 dec 5
pathname 0x00000017 65 6d 70 74 79 chr empty
num blocks 0x0000001c 00 00 00 dec 0

Remember that stage 2 will typically be invoked in a different working directory to the directory in which stage 1 was invoked.

You will need to detect invalid TABI files being supplied to stage 2, and handle them appropriately. You may find it handy to refer to the section on error handling below.

Use C bitwise operations such as << and | to construct the matches field.

You may find the provided num_tbbi_match_bytes function to be helpful.

In subset 3, you will need to complete the provided stage_3 function, which must produce a TCBI file given a TBBI file as input.

The TCBI file should contain the appropriate header, as outlined in the format of the TCBI file section below.

It should also contain a TCBI record for each file in the given TBBI file, containing the data for the blocks the receiver didn't already have an up-to-date copy of.
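Stage 2 builds the matches field one bit per block, and stage 3 reads those bits back to decide which blocks to send. The sketch below assumes a bit layout inferred from the example output (three blocks, matches byte a0, binary 10100000): block 0 is the most significant bit of the first byte. The TBBI format section and the reference implementation remain the authority. All three function names are hypothetical, and match_bytes_needed is only a guess at what the provided num_tbbi_match_bytes computes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Assumed layout, inferred from the example output: block i lives in
// byte i / 8, and block 0 is the most significant bit of its byte.
bool block_matches(const uint8_t *matches, size_t block) {
    return (matches[block / 8] >> (7 - block % 8)) & 1;
}

// Mark `block` as already up to date on the receiver.
void set_block_match(uint8_t *matches, size_t block) {
    matches[block / 8] |= (uint8_t)(1u << (7 - block % 8));
}

// Bytes needed to hold one bit per block, rounding up. Hypothetical
// stand-in for the provided num_tbbi_match_bytes function.
size_t match_bytes_needed(size_t num_blocks) {
    return (num_blocks + 7) / 8;
}
```

Under this assumption, marking blocks 0 and 2 of a three-block file as matching gives the byte 0xa0, which agrees with the example output.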
$ # [continued from subset 2 example]
$ cd ../aaa
$ ../../basin --stage-3 ../out.tcbi ../out.tbbi
$ 1521 basin-show ../out.tcbi
Field name Offset Bytes ASCII/Numeric
-----------------------------------------------------------------------
magic 0x00000000 54 43 42 49 chr TCBI
num records 0x00000004 02 dec 2
============================= Record 0 ==============================
pathname len 0x00000005 0a 00 dec 10
pathname 0x00000007 65 6d 6f 6a 69 73 2e 74 chr emojis.t
0x0000000f 78 74 chr xt
file type 0x00000011 2d chr -
owner perms 0x00000012 72 77 2d chr rw-
group perms 0x00000015 72 2d 2d chr r--
other perms 0x00000018 2d 2d 2d chr ---
file size 0x0000001b 01 02 00 00 dec 513
num updates 0x0000001f 01 00 00 dec 1
(0) block num 0x00000022 01 00 00 dec 1
(0) update len 0x00000025 00 01 dec 256
(0) update data 0x00000027 54 68 65 20 73 65 63 6f chr The seco
0x0000002f 6e 64 20 62 6c 6f 63 6b chr nd block
[... omitted for brevity ...]
0x00000117 73 20 61 73 74 65 72 69 chr s asteri
0x0000011f 73 6b 20 2d 2d 3e 20 2a chr sk --> *
============================= Record 1 ==============================
pathname len 0x00000127 05 00 dec 5
pathname 0x00000129 65 6d 70 74 79 chr empty
file type 0x0000012e 2d chr -
owner perms 0x0000012f 72 77 2d chr rw-
group perms 0x00000132 72 2d 2d chr r--
other perms 0x00000135 2d 2d 2d chr ---
file size 0x00000138 00 00 00 00 dec 0
num updates 0x0000013c 00 00 00 dec 0

You may find the stat system call to be useful here – in particular, the st_mode field of the struct stat.

So far, we’ve created several types of basin index files in order to communicate the current state of the receiver’s

files to the sender, and to communicate updated blocks from the sender to the receiver. In this subset, you will need

to complete the provided stage_4 function, which will be invoked with a TCBI file as input. You will then need to
apply the changes described in the TCBI file to the receiver’s files. This includes updating the contents of the

receiver’s files, and creating any new files that are required. You will also need to update the mode of the receiver’s

files such that the permissions match those described in the TCBI file.

http://man7.org/linux/man-pages/man2/stat.2.html

$ # [continued from subset 3 example]
$ cd ../bbb
$ ../../basin --stage-4 ../out.tcbi
$ diff ../aaa/empty ../bbb/empty # identical
$ diff ../aaa/emojis.txt ../bbb/emojis.txt # identical
$ # we have now synchronised `empty` and `emojis.txt` from aaa/ to bbb/

You may find chmod and fseek to be useful here.
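Two pieces of stage 4 lend themselves to small helpers: turning the nine permission characters back into mode bits (ready to pass to chmod), and seeking to a block's offset before overwriting it. This is a sketch under stated assumptions, not the reference implementation's method. Both function names are hypothetical, and SKETCH_BLOCK_SIZE is assumed to equal the BLOCK_SIZE constant from basin.h.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

#define SKETCH_BLOCK_SIZE 256  // assumed equal to BLOCK_SIZE in basin.h

// Convert nine permission characters such as "rw-r-----" into mode bits.
// Returns false if a character is neither '-' nor the letter expected at
// that position, so the caller can treat the record as invalid.
bool perms_to_mode(const char perms[9], mode_t *mode) {
    mode_t result = 0;
    for (int i = 0; i < 9; i++) {
        result <<= 1;
        if (perms[i] == "rwxrwxrwx"[i]) {
            result |= 1;
        } else if (perms[i] != '-') {
            return false;
        }
    }
    *mode = result;
    return true;
}

// Overwrite block `block_num` of an open file with `len` bytes of data.
bool write_block(FILE *f, long block_num, const unsigned char *data, size_t len) {
    if (fseek(f, block_num * SKETCH_BLOCK_SIZE, SEEK_SET) != 0) {
        return false;
    }
    return fwrite(data, 1, len, f) == len;
}
```

The resulting mode could then be applied with chmod(path, mode), checking its return value like any other file operation.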

Subset 5 requires you to add support for directories. You will need to update your stage_1, stage_2, stage_3
and stage_4 implementations to complete subset 5:

In stage_1, if the value of num_in_filenames is zero, then you should create a TABI file containing the
contents of the entire current working directory. When num_in_filenames is non-zero you can still make the
assumption that all paths in in_filenames are either regular files or don’t exist.

When creating a TABI file for the current directory, you should include a record for every directory, as well as

every file. Records for directories should have their number of blocks as zero. The record for a parent directory

should be placed in the TABI file before any records for files or sub-directories in that parent directory. Apart

from that restriction, you may choose any order for records in the generated TABI file.

In stage_2 , a record with a path which is a directory for the receiver should result in all match bits being set to

In stage_3, a record with a path which is a directory for the sender should be treated as an empty file. That is,
the number of blocks should be checked to be zero, and a record with no updates should be generated. Note
that the file type of the mode should be a d rather than a -. The file size in the TCBI record should be
obtained from the st_size field reported by stat.

In stage_4, you should create directories for directory records if they do not exist. You should also set the
correct permissions for directories. If a record for a file has the path of an existing directory, or a record for a
directory has the path of an existing file, then you should output an appropriate error message and exit with status 1.

Additionally, you must add checks in stage_2 , stage_3 and stage_4 to detect if any paths referenced in the
input basin indices reference files outside the current working directory. When that occurs, you should output an

appropriate error message and exit with status 1. In real code, it is important that untrusted user input such as paths

cannot be used to do damage to the wider system. You may assume that if any initial segment of the path exits the

current working directory then the whole path will exit the current working directory.
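One way to apply the initial-segment rule above is to walk the path one component at a time while tracking depth relative to the starting directory. The sketch below is a hedged illustration only. The helper name path_escapes_cwd is hypothetical, treating absolute paths as escaping is an assumption, and symbolic links are ignored. Check edge cases against the reference implementation.

```c
#include <stdbool.h>
#include <stddef.h>

// Returns true if following `path` (length `len`, not nul-terminated)
// component by component would ever step above the starting directory.
bool path_escapes_cwd(const char *path, size_t len) {
    if (len > 0 && path[0] == '/') {
        return true;  // assumption: absolute paths count as escaping
    }
    int depth = 0;
    size_t start = 0;
    for (size_t i = 0; i <= len; i++) {
        if (i < len && path[i] != '/') {
            continue;  // still inside the current component
        }
        size_t comp_len = i - start;
        if (comp_len == 2 && path[start] == '.' && path[start + 1] == '.') {
            depth--;
            if (depth < 0) {
                return true;  // this initial segment left the directory
            }
        } else if (comp_len > 0 && !(comp_len == 1 && path[start] == '.')) {
            depth++;  // an ordinary name descends one level
        }
        // Empty components and "." leave the depth unchanged.
        start = i + 1;
    }
    return false;
}
```

For example, a/b/../c stays inside the directory, while a/../../x is rejected at its second .. component.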

You are encouraged to use the reference implementation to check that your understanding of the above subset 5
requirements is correct.

You may find opendir, readdir, closedir to be useful here.
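A recursive traversal that reports each directory before its contents satisfies the parent-before-children ordering described for subset 5. The sketch below writes one path per line to a stream so the order is easy to inspect. The function name walk_directory, the 4096-byte path buffer, the use of stat (which follows symbolic links), and the path style without a leading ./ are all assumptions for illustration.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

// Write the path of every entry under `dir` to `out`, one per line, with
// each directory listed before anything inside it. Returns 0 on success.
int walk_directory(const char *dir, FILE *out) {
    DIR *d = opendir(dir);
    if (d == NULL) {
        perror(dir);
        return -1;
    }
    int status = 0;
    struct dirent *entry;
    while (status == 0 && (entry = readdir(d)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) {
            continue;  // skip self and parent links
        }
        char path[4096];  // assumed to be large enough for this sketch
        int n = strcmp(dir, ".") == 0
            ? snprintf(path, sizeof path, "%s", entry->d_name)
            : snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path) {
            fprintf(stderr, "path too long\n");
            status = -1;
            break;
        }
        struct stat st;
        if (stat(path, &st) != 0) {
            perror(path);
            status = -1;
            break;
        }
        fprintf(out, "%s\n", path);  // parent is emitted first
        if (S_ISDIR(st.st_mode)) {
            status = walk_directory(path, out);
        }
    }
    closedir(d);
    return status;
}
```

Calling walk_directory(".", stdout) would list the current working directory. In stage_1 the fprintf call would be replaced by whatever writes a TABI record.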

Error handling
Error checking is an important part of this assignment. Automarking will test error handling.

http://man7.org/linux/man-pages/man2/chmod.2.html
http://man7.org/linux/man-pages/man3/fseek.3.html
http://man7.org/linux/man-pages/man2/stat.2.html
https://en.wikipedia.org/wiki/Directory_traversal_attack
http://man7.org/linux/man-pages/man3/opendir.3.html
http://man7.org/linux/man-pages/man3/readdir.3.html
http://man7.org/linux/man-pages/man3/closedir.3.html

Error messages should be one line (only) and be written to stderr (not stdout).

basin should exit with status 1 after an error.

You do not have to free memory or close files before exiting in the event of an error.

basin should check all file operations for errors.

As much as possible, match the reference implementation's error messages exactly.

The reference implementation uses perror to report errors from file operations and other system calls.

It is not necessary to remove files and directories already created or partially created when an error occurs.

You may leave any created basin indices in an indeterminate state.

Where multiple error messages could be produced, basin may produce any one of the error messages.

In stages 2, 3, and 4 you cannot assume that the input basin indices are in a valid format. If your program is given an

invalid Basin Index file, you must output an appropriate error message to stderr and exit with status 1.
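The rules above suggest wrapping file operations in small checked helpers. The following is a minimal sketch, assuming hypothetical helper names open_or_die and has_magic. The exact wording of the reference implementation's error messages is not reproduced here and should be matched by experiment.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Open a file, or report the failure with perror and exit with status 1.
FILE *open_or_die(const char *path, const char *mode) {
    FILE *f = fopen(path, mode);
    if (f == NULL) {
        perror(path);
        exit(1);
    }
    return f;
}

// Check that the next four bytes of `f` equal `magic` (for example
// "TABI"). Returns false on a short read or a mismatch, so the caller
// can print a one-line error to stderr and exit with status 1.
bool has_magic(FILE *f, const char magic[4]) {
    unsigned char buf[4];
    if (fread(buf, 1, 4, f) != 4) {
        return false;
    }
    return memcmp(buf, magic, 4) == 0;
}
```

Treating a short read the same way as a wrong value means truncated index files are rejected by the same path as corrupted ones.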

Reference implementation
A reference implementation is a common, efficient, and effective method to provide or define an operational

specification; and it’s something you will likely work with after you leave UNSW.

We’ve provided a reference implementation, 1521 basin , which you can use to find the correct outputs and
behaviours for any input:

$ 1521 basin-examples
$ cd examples
$ 1521 basin --stage-1 ../out.tabi short.txt
$ 1521 basin-show ../out.tabi
Field name Offset Bytes ASCII/Numeric
-----------------------------------------------------------------------
magic 0x00000000 54 41 42 49 chr TABI
num records 0x00000004 01 dec 1
============================= Record 0 ==============================
pathname len 0x00000005 09 00 dec 9
pathname 0x00000007 73 68 6f 72 74 2e 74 78 chr short.tx
0x0000000f 74 chr t
num blocks 0x00000010 01 00 00 dec 1
hashes[0] 0x00000013 15 b8 4c 98 fe c3 b7 d6 chr ..L.....

Every concrete example shown below is runnable using the helper utilities; run 1521 basin instead of ./basin .

The command 1521 basin-show displays the contents of TABI, TBBI and TCBI files in a human-readable
format. It is useful for understanding the output of both the reference implementation and your own

implementation.

Where any aspect of this assignment is undefined in this specification, you should match the behaviour exhibited by

the reference implementation. Discovering and matching the reference implementation’s behaviour is deliberately a

part of this assignment.

If you discover what you believe to be a bug in the reference implementation, please report it in the class forum. If it is

a bug, we may fix the bug; or otherwise indicate that you do not need to match the reference implementation’s

behaviour in that specific case.

Helper utilities
Alongside 1521 basin-show, which was used above, we have also provided you with two additional utilities:
1521 basin-dump-blocks and 1521 basin-hash-block. These utilities have been provided to assist you in
understanding the requirements of the assignment, and to help you debug your program.

http://man7.org/linux/man-pages/man3/perror.3.html

1521 basin-dump-blocks takes a file as input, splits it into 256-byte (BLOCK_SIZE) blocks, and outputs them to
stdout either in hex format or as raw bytes. This is useful for ensuring that your program is correctly splitting files into blocks.

$ 1521 basin-dump-blocks --raw examples/aaa/emojis.txt
=== block 0 ===
This file should be broken up by your program into three blocks: the
first 256 bytes spans lines one to four (and includes the newline on line
four), the second 256 bytes is from line 5 to the asterisk (inclusive), and
the final block is only 1 byte long!

=== block 1 ===
The second block started on this line. Now for an assortment of emoji:
✨ ✨ ✨ 1️⃣ 5️⃣ 2️⃣ 1️⃣  ✨ ✨ ✨
📚 🎓 📈 📈 💾 💽 💿 🖥 💻 🚀 🌌 🤯 🎉 🥳
The last character of this block is this asterisk --> *
=== block 2 ===
[… no newline after output …]
$ 1521 basin-dump-blocks --hex examples/aaa/emojis.txt
=== block 0 ===
54 68 69 73 20 66 69 6c 65 20 73 68 6f 75 6c 64
20 62 65 20 62 72 6f 6b 65 6e 20 75 70 20 62 79
20 79 6f 75 72 20 70 72 6f 67 72 61 6d 20 69 6e
74 6f 20 74 68 72 65 65 20 62 6c 6f 63 6b 73 3a
20 74 68 65 0a 66 69 72 73 74 20 32 35 36 20 62
79 74 65 73 20 73 70 61 6e 73 20 6c 69 6e 65 73
20 6f 6e 65 20 74 6f 20 66 6f 75 72 20 28 61 6e
64 20 69 6e 63 6c 75 64 65 73 20 74 68 65 20 6e
65 77 6c 69 6e 65 20 6f 6e 20 6c 69 6e 65 0a 66
6f 75 72 29 2c 20 74 68 65 20 73 65 63 6f 6e 64
20 32 35 36 20 62 79 74 65 73 20 69 73 20 66 72
6f 6d 20 6c 69 6e 65 20 35 20 74 6f 20 74 68 65
20 61 73 74 65 72 69 73 6b 20 28 69 6e 63 6c 75
73 69 76 65 29 2c 20 61 6e 64 0a 74 68 65 20 66
69 6e 61 6c 20 62 6c 6f 63 6b 20 69 73 20 6f 6e
6c 79 20 31 20 62 79 74 65 20 6c 6f 6e 67 21 0a

=== block 1 ===
54 68 65 20 73 65 63 6f 6e 64 20 62 6c 6f 63 6b
20 73 74 61 72 74 65 64 20 6f 6e 20 74 68 69 73
20 6c 69 6e 65 2e 20 4e 6f 77 20 66 6f 72 20 61
6e 20 61 73 73 6f 72 74 6d 65 6e 74 20 6f 66 20
65 6d 6f 6a 69 3a 0a e2 9c a8 20 e2 9c a8 20 e2
9c a8 20 31 ef b8 8f e2 83 a3 20 35 ef b8 8f e2
83 a3 20 32 ef b8 8f e2 83 a3 20 31 ef b8 8f e2
83 a3 20 20 e2 9c a8 20 e2 9c a8 20 e2 9c a8 0a
f0 9f 93 9a 20 f0 9f 8e 93 20 f0 9f 93 88 20 f0
9f 93 88 20 f0 9f 92 be 20 f0 9f 92 bd 20 f0 9f
92 bf 20 f0 9f 96 a5 ef b8 8f 20 f0 9f 92 bb 20
f0 9f 9a 80 20 f0 9f 8c 8c 20 f0 9f a4 af 20 f0
9f 8e 89 20 f0 9f a5 b3 0a 54 68 65 20 6c 61 73
74 20 63 68 61 72 61 63 74 65 72 20 6f 66 20 74
68 69 73 20 62 6c 6f 63 6b 20 69 73 20 74 68 69
73 20 61 73 74 65 72 69 73 6b 20 2d 2d 3e 20 2a

=== block 2 ===

Additionally, 1521 basin-dump-blocks is able to output only a single block, specified by the --index option. For
example, to only output the first block of the file examples/aaa/emojis.txt as hex, you would run:

$ 1521 basin-dump-blocks --index 0 --hex basin/examples/aaa/emojis.txt
=== block 0 ===
54 68 69 73 20 66 69 6c 65 20 73 68 6f 75 6c 64
20 62 65 20 62 72 6f 6b 65 6e 20 75 70 20 62 79
20 79 6f 75 72 20 70 72 6f 67 72 61 6d 20 69 6e
74 6f 20 74 68 72 65 65 20 62 6c 6f 63 6b 73 3a
20 74 68 65 0a 66 69 72 73 74 20 32 35 36 20 62
79 74 65 73 20 73 70 61 6e 73 20 6c 69 6e 65 73
20 6f 6e 65 20 74 6f 20 66 6f 75 72 20 28 61 6e
64 20 69 6e 63 6c 75 64 65 73 20 74 68 65 20 6e
65 77 6c 69 6e 65 20 6f 6e 20 6c 69 6e 65 0a 66
6f 75 72 29 2c 20 74 68 65 20 73 65 63 6f 6e 64
20 32 35 36 20 62 79 74 65 73 20 69 73 20 66 72
6f 6d 20 6c 69 6e 65 20 35 20 74 6f 20 74 68 65
20 61 73 74 65 72 69 73 6b 20 28 69 6e 63 6c 75
73 69 76 65 29 2c 20 61 6e 64 0a 74 68 65 20 66
69 6e 61 6c 20 62 6c 6f 63 6b 20 69 73 20 6f 6e
6c 79 20 31 20 62 79 74 65 20 6c 6f 6e 67 21 0a

We have also provided a 1521 basin-hash-block command that reads up to 256 ( BLOCK_SIZE ) bytes from
standard input and outputs the 64-bit hash of the data as a hex string, using the same hash_block function as
provided for the assignment. We’ve also provided you the source code for this command in basin_hash_block.c for

your reference.

You can combine these commands to check the hash of any given block of an input file, for example:

$ 1521 basin-dump-blocks --index 0 --raw examples/aaa/emojis.txt | 1521 basin-hash-block
900ae76e14e33090

It is important to use the --raw option and specify a block index in order to produce the expected hash for that block.

Formats of basin indices
The basin indices emitted by your implementation must follow the exact format produced by the reference

implementation.

Type A Basin Index format
When a sender wants to send files, it first creates a TABI file. This file contains a record for each file that is going to

be sent. In each record is the pathname of the file, the number of blocks in the file (computed by

number_of_blocks_in_file ), and the hash of each block in the file.

A TABI file consists of a header, followed by 0 or more records. The format of the header is:

name          length  type             description
magic number  4 B     characters       The magic number for TABI files, which is the sequence of bytes 0x54, 0x41, 0x42, 0x49 (ASCII TABI).
num records   1 B     unsigned, 8-bit  The number of records in this TABI file.

The TABI header is followed by the specified number of records. Each TABI record has the following format:

name             length            type                                                  description
pathname length  2 B               unsigned, 16-bit, little-endian                       The length of the pathname of this record.
pathname         pathname-length   character sequence                                    The pathname of the file of this record. It is not nul-terminated.
num blocks       3 B               unsigned, 24-bit, little-endian                       The number of 256-byte blocks in the sender’s version of the file (the final block may be shorter than 256 bytes).
hashes           8 B × num-blocks  sequence of unsigned, 64-bit, little-endian integers  The hashes the sender has computed for their version of the file (using the hash_block function), with one 64-bit hash for each block.
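Every multi-byte field in the record format above is little-endian, so one helper can write the 2-byte, 3-byte and 8-byte fields alike. A minimal sketch, assuming a hypothetical helper named write_little_endian (it is not part of the provided code):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical helper: write the low `num_bytes` bytes (1 to 8) of
// `value` to `f`, least significant byte first. Returns false if a
// write fails so the caller can report the error.
bool write_little_endian(FILE *f, uint64_t value, int num_bytes) {
    for (int i = 0; i < num_bytes; i++) {
        // Shift the i-th byte down to the bottom and mask off the rest.
        int byte = (int)((value >> (8 * i)) & 0xff);
        if (fputc(byte, f) == EOF) {
            return false;
        }
    }
    return true;
}
```

For instance, write_little_endian(f, 3, 3) emits the bytes 03 00 00, the encoding of a three-block count in this format.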

An example TABI file, displayed using 1521 basin-show :

$ 1521 basin-examples
$ cd examples
$ 1521 basin-show tabi/my_text_files.tabi
Field name Offset Bytes ASCII/Numeric
-----------------------------------------------------------------------
magic 0x00000000 54 41 42 49 chr TABI
num records 0x00000004 03 dec 3
============================= Record 0 ==============================
filename len 0x00000005 09 00 dec 9
filename 0x00000007 73 68 6f 72 74 2e 74 78 chr short.tx
0x0000000f 74 chr t
num blocks 0x00000010 01 00 00 dec 1
hashes[0] 0x00000013 15 b8 4c 98 fe c3 b7 d6 chr ..L.....
============================= Record 1 ==============================
filename len 0x0000001b 0a 00 dec 10
filename 0x0000001d 65 6d 6f 6a 69 73 2e 74 chr emojis.t
0x00000025 78 74 chr xt
num blocks 0x00000027 03 00 00 dec 3
hashes[0] 0x0000002a 90 30 e3 14 6e e7 0a 90 chr .0..n...
hashes[1] 0x00000032 91 90 5c 46 fc 07 b3 93 chr ..\F....
hashes[2] 0x0000003a 8c ec 01 86 4c dc 63 af chr ....L.c.
============================= Record 2 ==============================
filename len 0x00000042 05 00 dec 5
filename 0x00000044 65 6d 70 74 79 chr empty
num blocks 0x00000049 00 00 00 dec 0

The above example shows that the sender is sending three files: short.txt, emojis.txt, and empty. The file
short.txt has one block of data (so its length must be between 1 and 256 bytes), and that block has a hash of
0xd6b7c3fe984cb815.

The second file, emojis.txt, has 3 blocks, so its length must be between 513 and 768 bytes. The first block (bytes at
indices 0..255) hashes to 0x900ae76e14e33090, the second block (bytes at indices 256..511) has a hash of
0x93b307fc465c9091, and the final block (bytes from index 512 to the end of the file) has a hash of
0xaf63dc4c8601ec8c.

The final record is for the file named empty . Since it has zero blocks it must be, as its name suggests, empty.
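The size ranges quoted above follow from rounding the file size up to a whole number of 256-byte blocks. The sketch below shows that arithmetic. The provided number_of_blocks_in_file function should be used in the assignment itself. The names here are hypothetical, and SKETCH_BLOCK_SIZE is assumed to equal BLOCK_SIZE from basin.h.

```c
#include <stddef.h>

#define SKETCH_BLOCK_SIZE 256  // assumed equal to BLOCK_SIZE in basin.h

// Number of blocks needed for a file of `file_size` bytes, rounding up
// so that a partial final block still counts as a block.
size_t blocks_needed(size_t file_size) {
    return (file_size + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}

// Length in bytes of block number `block`; only the final block of a
// file can be shorter than SKETCH_BLOCK_SIZE. Returns 0 if the block
// lies past the end of the file.
size_t block_length(size_t file_size, size_t block) {
    size_t start = block * SKETCH_BLOCK_SIZE;
    if (start >= file_size) {
        return 0;
    }
    size_t remaining = file_size - start;
    return remaining < SKETCH_BLOCK_SIZE ? remaining : SKETCH_BLOCK_SIZE;
}
```

A 513-byte file such as emojis.txt therefore needs three blocks, and its final block holds a single byte.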

Type B Basin Index format
After a rece