CS 61B Spring 2024

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 1/15

Projects / Project 2B: Ngordnet (Wordnet)

Each assignment will have an FAQ linked at the top. You can also access it by adding “/faq” to

the end of the URL. The FAQ for Project 2B is located here.

In this project, youʼll complete your implementation of the NGordnet tool.

As this is a quite new project, there may be occasional bugs or confusion with the spec. If you

notice anything of this sort, please post on Ed.

IMPORTANT NOTE: After you read the 2B spec, you may be tempted to start coding.

Donʼt do this!

Before implementing ANY code for 2B, please read the 2C spec, as your design may

change depending on 2C. You can find it here.

Then, complete the Project 2B/C: Checkpoint and Design Document before starting

When designing your project, think about all of the requirements in advance. Planning ahead

will ensure you donʼt need to rewrite all of your code when you get to certain points in the

If you find yourself copy-pasting or repeating a lot of the same code youʼve already written,

there is probably an opportunity to reuse it directly, or slightly modify it so you donʼt have to

repeat yourself as often.

Project 2B: Ngordnet (Wordnet)

Checkpoint & Design Doc Due 03/15/2024

Coding Due 04/01/2024

Design Notes

Project Setup

https://sp24.datastructur.es/
https://sp24.datastructur.es/projects/
https://sp24.datastructur.es/projects/proj2b/faq/
https://sp24.datastructur.es/projects/proj2c/
https://www.gradescope.com/courses/708063/assignments/4133684
https://www.gradescope.com/courses/708063/assignments/4187810
https://sp24.datastructur.es/projects/proj2b/faq/

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 2/15

THE SETUP FOR THIS PROJECT IS DIFFERENT THAN THE OTHER LABS / PROJECTS.

PLEASE DO NOT SKIP THIS STEP!

Once you are done, your proj2b directory should look like this:

IMPORTANT NOTE: You should really complete Project 2B/C: Checkpoint first before

starting coding, or even designing your project. We think this would be helpful for your

understanding of the project. We will also require you to submit a design document to

Gradescope. More details about the design document can be found in Deliverables and

Complete Project 2B/C: Checkpoint

After finishing the checkpoint, complete Design Document

This part of the project is designed for you to come up with efficient and correct design for

your implementation. The design you come up with will be very important to handle these

cases. Please read 2B & 2C spec carefully before starting your design document.

Skeleton Setup

Similar to other assignments in this class, run git pull skeleton main to get the skeleton

code for this project.

Download the data files for this project using this link and move them into your proj2b

folder on the same level as src .

│ ├── ngrams

│ └── wordnet

├── static

Getting Started

https://www.gradescope.com/courses/708063/assignments/4133684
https://docs.google.com/document/d/1Vx7QAz4HFN0rEFFEt5rocY2X5AWVcIFFpRmD8vhegOM/edit?usp=sharing
https://www.gradescope.com/courses/708063/assignments/4133684
https://www.gradescope.com/courses/708063/assignments/4187810
https://drive.google.com/file/d/160iHOqwR4FAghGshbnSNMwd0idrjZxTR/view?usp=sharing

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 3/15

The course staff has created a couple of introductory videos to the project and the starter

code available here. Bear in mind we have changed the structure of the project so some

information might be outdated!

Weʼve also created two wonderful tools that you can (and should!) use to explore the dataset,

see how the staff solution behaves for specific inputs, and get expected outputs for your unit

tests (see Testing Your Code). Weʼll link them here, as well as in other relevant parts of the

Wordnet Visualizer: Useful for visually understanding how synsets and hyponyms work and

testing different words/lists of words for potential test case inputs. Click on the “?” bubbles

to learn how to use the various features of this tool!

Staff Solution Webpage: Useful for generating expected outputs for different test case

inputs. Use this to write your unit tests!

Read through entire 2B/C spec and complete Project 2B/C: Checkpoint

Read through entire 2B/C spec and complete Design Document

Before we can incorporate WordNet into our project, we first need to understand the WordNet

WordNet is a “semantic lexicon for the English language” that is used extensively by

computational linguists and cognitive scientists; for example, it was a key component in IBM s̓

Watson. WordNet groups words into sets of synonyms called synsets and describes semantic

relationships between them. One such relationship is the is-a relationship, which connects a

hyponym (more specific synset) to a hypernym (more general synset). For example, “change”

is a hypernym of “demotion”, since “demotion” is-a (type of) “change”. “change” is in turn a

hyponym of “action”, since “change” is-a (type of) “action”. A visual depiction of some

hyponym relationships in English is given below:

Using the WordNet Dataset

https://www.qxbytes.com/wordnet/
https://ngordnet.datastructur.es/
https://www.gradescope.com/courses/708063/assignments/4133684
https://www.gradescope.com/courses/708063/assignments/4187810
http://en.wikipedia.org/wiki/WordNet

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 4/15

Each node in the graph above is a synset. Synsets consist of one or more words in English that

all have the same meaning. For example, one synset is “jump, parachuting” , which represents

the act of descending to the ground with a parachute. “jump, parachuting” is a hyponym of

“descent”, since “jump, parachuting” is-a “descent”.

Words in English may belong to multiple synsets. This is just another way of saying words may

have multiple meanings. For example, the word “jump” also belongs to the synset “jump, leap”

, which represents the more figurative notion of jumping (e.g. a jump in attendance) rather the

literal meaning of jump from the other synset (e.g. a jump over a puddle). The hypernym of the

synset “jump, leap” is “increase”, since “jump, leap” is-an “increase”. Of course, there are other

ways to “increase” something: for example, we can increase something through

“augmentation,” and thus it is no surprise that we have an arrow pointing downwards from

“increase” to “augmentation” in the diagram above.

Synsets may include not just words, but also what are known as collocations. You can think of

these as single words that occur next to each other so often that they are considered a single

word, e.g. nasal_decongestant . To avoid ambiguity, we will represent the constituent words of

collocations as being separated with an underscore _ instead of the usual convention in

English of separating them with spaces. For simplicity, we will refer to collocations as simply

“words” throughout this document.

A synset may be a hyponym of multiple synsets. For example, “actifed” is a hyponym of both

“antihistamine” and “ nasal_decongestant”, since “actifed” is both of these things.

If youʼre curious, you can browse the Wordnet database by using the web interface ,

though this is not necessary for this project.

In Project 2B, your primary task is to implement this button, which will require reading in a

different type of dataset and synthesizing the results with the dataset from Project 2A. Unlike

2A, it will be entirely up to you to decide what classes you need to support this task.

Hyponyms (Basic Case)

Setting up a HyponymsHandler

In your web browser, open the ngordnet.html file in the static folder. As a refresher, you

can find how to do that here in bullet point 1. Youʼll see that there is a new button:

“Hyponyms”. Note that there is also a new input box called k .

Try clicking the Hyponyms button. Youʼll see nothing happens (and if you open the

developer tools feature of your web browser, youʼll see that your browser shows an error).

http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=jump&i=6&h=000010000000000000000000#c
http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=jump&i=2&h=100000000000000000000000#c
http://en.wikipedia.org/wiki/Collocation
http://wordnetweb.princeton.edu/perl/webwn?s=nasal+decongestant+&sub=Search+WordNet&o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&h=
http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&r=1&s=sturgeon&i=3&h=1000#c
https://sp24.datastructur.es/projects/proj2a/#historytexthandler

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 5/15

If you see some error like “Could not load file some_file_here.txt ”, it probably means

that your project is not set up correctly. Be sure that you have the same structure as stated

in the Project Setup section.

Next, youʼll create a partial implementation of the Hyponyms button. For now, this button

Assume that the “words” entered is only a single word.

Ignore startYear, endYear, and k.

Return a string representation of a list of the hyponyms of the single word, including the

word itself. The list should be in alphabetical order, with no repeated words.

For example, suppose the WordNet dataset looks like the diagram below (given to you as the

input files synsets11.txt and hyponyms11.txt ). Suppose that the user enters “descent” and

clicks on the Hyponyms button.

In this case, the output of your handler should be the string representation of a list containing

“descent”, “jump” and “ parachuting”, i.e [descent, jump, parachuting] . Note that the words

are in alphabetical order.

Edit the file called HyponymsHandler to simply return the word “Hello!” when the user

clicks the Hyponyms button in the browser. Youʼll need to make the HyponymsHandler

class extend the NgordnetQueryHandler class. See your other Handler classes for

examples. Make sure when you register your handler that you use the string “hyponyms” as

the first argument to the register method, and not “hyponym”.

Start by opening your ngordnet.main.Main.java file.2

Once youʼve modified Main so that your new handler is registered to handle hyponyms

requests, start up Main and try clicking the Hyponyms button in your web browser again.

You should see text appear that says “Hello”.

Hyponyms Handler (Basic Case)

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 6/15

As another example, suppose weʼre using a bigger dataset such as the one below (given to you

as the input files synsets16.txt and hyponyms16.txt ):

Suppose the user enters “change” and clicks on the Hyponyms button. In this case, the

hyponyms are all the words in the blue nodes in the diagram below:

That is the output is [alteration, change, demotion, increase, jump, leap,

modification, saltation, transition, variation] . Note that even though “change”

belongs to two different synsets, it only appears once.

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 7/15

Note: Try not to overthink this. Specifically, observe that the output does not include:

Synonyms of synonyms (e.g. does not include “adjustment” )

Hyponyms of synonyms (e.g. does not include “conversion” )

Hyponyms of other definitions of hyponyms (e.g. does not include “flashback” , which is a

hyponym of another definition of “transition” )

Implement HyponymsHandler.java and any helper classes.

Note: Please read the tips below, since you shouldnʼt be writing all of your code in this

To complete this task, youʼll need to decide what classes you need to create to support the

HyponymsHandler . DO NOT DO ALL THE WORK IN HYPONYMS HANDLER. Instead, you

should have helper classes. For example, in 2A, to handle the “History” button, we created

an NGramMap class. Youʼll want to do something similar in 2B, with your own classes.

Youʼll also need to understand the input format of the WordNet dataset. This description is

given in the section below.

For this part, you may NOT import any existing graph library into your code. That is you

canʼt import, for example, the graph implementations from the optional Princeton

algorithms textbook. Instead, you should build your own graph class or classes.

Just like 2A̓s NGramMap, youʼll want your helper classes to only parse the input files once,

in the constructor. DO NOT CREATE METHODS WHICH HAVE TO READ THE ENTIRE

INPUT FILE EVERY TIME THEY ARE CALLED. This will be too slow!

We strongly recommend creating at least two classes for this part of the project as follows:

One which implements the idea of a directed graph. One which reads in the WordNet

dataset and constructs an instance of the directed graph class. This second class should

also be able to take a word and return its hyponyms. You may also want additional helper

classes that represent the idea of a traversal but this is not required – you can implement

your traversal within your graph class as well.

Donʼt worry about writing Truth tests yet, weʼll talk about how to do that later in the spec.

Simply use the web front end to check the two input examples (“descent” and “change”)

from the diagrams above for synsets16.txt and hyponyms16.txt .

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 8/15

While you can (and should) write unit tests for the helper classes/methods that you create

for this project, another good way to test and see what s̓ going on with your code is to

simply run Main.java , open ngordnet.html , enter some inputs into the boxes, and click

the “Hyponyms” button. You may find visual debugging can lead to some useful discoveries

in this project.

We now describe the two types of data files that store the WordNet dataset. These files are in

comma separated format, meaning that each line contains a sequence of fields, separated by

File Type #1: List of noun synsets. The file synsets.txt (and other smaller files with

synset in the name) lists all the synsets in WordNet. The first field is the synset id (an

integer), the second field is the synonym set (or synset), and the third field is its dictionary

definition. For example, the line

means that the synset { Goofy } has an id number of 6829, and its definition is “a cartoon

character created by Walt Disney”. The individual nouns that comprise a synset are

separated by spaces (and a synset element is not permitted to contain a space). The S

synset ids are numbered 0 through S − 1; the id numbers will appear consecutively in the

synset file. The id numbers are useful because they also appear in the hyponym files,

described as file type #2.

File Type #2: List of hyponyms. The file hyponyms.txt (and other smaller files with

hyponym in the name) contains the hyponym relationships: The first field is a synset id;

subsequent fields are the id numbers of the synset s̓ direct hyponyms. For example, the

following line

means that the synset 79537 (“viceroy vicereine”) has two hyponyms: 38611 (“exarch”) and

9007 (“Khedive”), representing that exarchs and Khedives are both types of viceroys (or

vicereine). The synsets are obtained from the corresponding lines in the file synsets.txt :

WordNet File Format

6829,Goofy,a cartoon character created by Walt Disney Copy

79537,38611,9007 Copy

79537,viceroy vicereine,governor of a country or province who rules…

38611,exarch,a viceroy who governed a large province in the Roman Empire

9007,Khedive,one of the Turkish viceroys who ruled Egypt between…

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 9/15

There may be more than one line that starts with the same synset ID. For example, in

hyponyms16.txt , we have

This indicates that both synsets 12 and 13 are direct hyponyms of synset 11. These two

could also have been combined on to one line, i.e. the line below would have the exact

same meaning, namely that synsets 12 and 13 are direct hyponyms of synset 11.

You might ask why there are two ways of specifying the same thing. Real world data is often

messy, and we have to deal with it.

To get the “Hyponyms” button working youʼll need to:

Develop a graph class. If you arenʼt familiar with this data structure, take a look at lectures

21 and 22. You should test this with operations that are independent of the given data files.

For example, our tests evaluated that our createNode and addEdge functions yielded

appropriate graphs by using our graph classes s̓ getNodes and neighbors functions. For

inspiration, you can check out Lecture 22 and 23.

Write code that converts the WordNet dataset files into a graph. This could be part of

your graph class, or it could be a class that uses your graph class.

Write code that takes a word, and uses a graph traversal to find all hyponyms of that word

in the given graph.

We strongly recommended writing tests that evaluate queries on the examples above (for

example, you might look at the hyponyms of “descent” in synsets11/hypernyms11, or the

hyponyms of “change” in synsets16/hypernyms16).

Tests should be written at a level of abstraction appropriate to what theyʼre evaluating. For

example, we have a class called TestGraph that evaluates various aspects of our Graph

Or as another example, our code has a class called TestWordNet containing the function

11,12,13 Copy

Suggested Steps to Take

public void testHyponymsSimple(){

https://docs.google.com/presentation/d/1Gyke5ZMrgcMuBTa7qHAoklWZB3yf_yqQAymK2nV-Nqs/edit
https://docs.google.com/presentation/d/1NKWnXSJ8pUn1E2Cw6zidxMRuXf_aINKCpW7XpEkyz6c/edit

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 10/15

Note that your WordNet class may not have the same functions as ours so the test shown will

probably not work verbatim with your code. Note that our test does NOT use an NGramMap

anywhere, nor is it using a HyponymsHandler , nor is it directly invoking an object of type

Graph . It is specifically tailored to testing the WordNet class. Relying on only browser tests

will be incredibly frustrating (and slow!). Use your JUnit skills to build confidence in the

foundational abstractions that you build (e.g. Graph, WordNet, etc.).

This project involves having to do all sorts of different lookups, graph operations, and data

processing operations. There is no one right way to do this.

Some example lookups that you might need to perform:

Given a word (e.g. “change”), what nodes contain that word?

Example in synsets16.txt: change is in synsets 2 and 8

Given an integer, what node goes with that index?

Necessary for processing hyponyms.txt. For example in hyponyms16.txt, we know that

the node with synset 8 points at synsets 9 and 10, so we need to be able to find node 8

to get its neighbors.

Given a node, what words are in that node?

Example in synsets16.txt: synset 11 contains alteration, modification, and adjustment

Some example graph operations you might need to perform:

Creating a node, e.g. each line of synsets16.txt contains the information for a node.

Adding an edge to a node, e.g. each line of hyponyms16.txt contains one or more edges

that should be added to the corresponding node.

Finding reachable vertices, e.g. the vertices reachable from vertex #7 in hyponyms16.txt are

7, 8, 9, 10.

Your life will be a lot easier if you select instance variables and/or data structures for your

classes that naturally help solve all six of the problems above.

Some example data processing operations:

Given a collection of things, how do you find all non-duplicate items? (Hint: There is a data

structure that makes this very easy and efficient). Donʼt be afraid to also Google

WordNet wn=new WordNet(“./data/wordnet/synsets11.txt”,”./data/wordnet/hyponyms1

assertThat(wn.hyponyms(“antihistamine”)).isEqualTo(Set.of(“antihistamine”,”acti

Design Tips

2024/3/29 11:00 Project 2B: Ngordnet (Wordnet) | CS 61B Spring 2024

https://sp24.datastructur.es/projects/proj2b/ 11/15

documentation for the data structure that you choose (e.g. if you choose to use a TreeMap

for whatever reason, feel free to look up “TreeMap methods java”, “Map methods java”, or

“Collection methods java”, etc).

Given a collection of things, how do you sort them? (Hint: Google how to sort the collection

that youʼre using)

Also, a reminder from proj2a: Deeply nested generics are a warning sign that you are doing

something too complicated. Either find a simpler way or create a helper class to help

manage the complexity. For example, if you find yourself trying to use something like

Map