COMP2511 23T2 Assignment III: Tributary
Due: Week 10 Sunday, 5pm (6th August)
No late penalty will be applied until Week 11 Tuesday, 5pm (8th August).
Value: 10% – Bonus
This assignment can be used to make up lost marks in other assignments. It does not make up for marks
lost in labs or the final exam.
Getting Setup
This assignment is completed in pairs or individually.
You will need to fork this repo in order to complete the assignment: Assignment III Starter Code.
If you are completing the assignment individually, fork it into your personal COMP2511 GitLab Group
(COMP2511 / 23T2 / z123456).
If you are completing the assignment in your Assignment II pair, fork it into your pair GitLab Group
(COMP2511 / 23T2 / Teams / M09A_ABBA).
If you are completing the assignment in a pair which is different to your Assignment II pair, post on the
forum and weʼll create the group for you to fork the repo into.
What is this assignment aiming to achieve?
Having completed Assignments I and II and gained familiarity and experience with the course
concepts, it's time to turn this into expertise. This assignment aims to get you working with a real-
world problem that forms a big part of modern software architecture – event streaming and event
pipelines.
This assignment is designed for students aiming to achieve a high mark in the course – you can still
get up to 100% in the course overall without completing this assignment.
The aims of this assignment can be broken down into four major themes:
1. Synthesising Complex and Abstract Requirements. This problem is a bit different from those
you've encountered before, and may be difficult to understand at first. However, like many
similar problems, once you understand what you're working with the solution is more
straightforward.
2. Building a Java API. Using your skills in Design by Contract, you'll need to design a library that
could be used as a platform to build event-driven systems.
3. Design and Testing Methodologies. You'll be completing a preliminary design to set the
trajectory before diving in, and iterating on your design as you go. You'll also be planning out
and fully architecting and writing a test suite for the system.
4. Event-Driven & Asynchronous Design. In developing this system, you'll learn about how event
streaming works and how event-driven design can be used in software systems. You'll also need
to consider the implications of working asynchronously, and how to manage concurrency.
1.0 Overview
Event-Driven Architecture makes up much of the backbone of modern software. With a move towards
decentralised microservice systems, there is a need for scalable asynchronous communication between
software components.
In this assignment you will be writing a Java API that another engineer could use to build an event-driven
system using the principles discussed in lectures.
This library is based on a heavily simplified version of the event streaming infrastructure Apache Kafka
(https://kafka.apache.org/), which you can read more about for your own interest. No understanding of
Kafka is required to complete this assignment.
2.0 Engineering Requirements
The fundamental premise on which Event-Driven Architecture rests is the ability of producer and
consumer entities in the system to share data asynchronously via a stream-like channel.
However, our library will allow for more complex interactions than simply that of a single channel.
A Tributary Cluster contains a series of topics. A topic contains events which are logically grouped
together. For example, a cluster could contain two topics: one for images-related events and one for
video-related events. You can think of them like a table in a database or a folder in a file system.
Within each topic, there are a series of partitions – a collection of a sequence of messages which can be
appended to.
You can complete this assignment in pairs or individually. You will be marked differently depending
on whether you complete the assignment as a pair or individually. You can continue with the same
pair as you had in Assignment II, or form a new pair if you like.
A unit of data within a Tributary is a message, also referred to as a record or event. For example, to update
their profile a user may send a message to Partition 1 in Topic A, and this message will be appended to
Partition 1 in Topic A. Each partition is a queue where messages are appended at the end. Each message
has an optional key to indicate which partition it should be appended to.
A topic can be related to “user profiles” and each message relates to requesting an update to a specific
profile. However, considering there can be many such requests at a given time, the system divides the
incoming requests into multiple partitions. There can be multiple consumers consuming messages at the
same time (concurrently). However, each partition is handled by only one consumer. Multiple consumers
will allow us to effectively utilise the underlying hardware with multiple cores.
In the context of the library you are building, topics are parameterised on a generic type; all event
payloads within that topic must be of the specified type.
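As a rough illustration of this parameterisation (not a prescribed API – the class and method names below are hypothetical, and Partition<T> is assumed to be another class in your design), a topic generic on its event payload type might be sketched as:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: a topic parameterised on its event payload type.
public class Topic<T> {
    private final String id;
    private final List<Partition<T>> partitions = new ArrayList<>();

    public Topic(String id) {
        this.id = id;
    }

    public String getId() {
        return id;
    }

    public void addPartition(Partition<T> partition) {
        partitions.add(partition);
    }

    public List<Partition<T>> getPartitions() {
        return List.copyOf(partitions);
    }
}
```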
2.1 Message Lifecycle: A Simple Example
Let us take the example of a user updating their profile. This results in an event being generated by the
producer for a topic “user profiles” with the updated profile information. This event is now delivered to
the Tributary, which assigns the event to one of the partitions. The producer indicates whether the
message is randomly allocated to a partition, or provides a key specifying which partition to append the
message to.
A consumer processes one or more partitions by sequentially processing (consuming) events in the
allocated partitions.
2.2 Message Structure
Individual messages contain the following information:
Datetime created;
Payload Type;
Value. The value is an object containing relevant information for a topic. Considering information
required for different topics may change, you should consider using a generic type here.
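A minimal sketch of such a message is shown below, using a generic type parameter for the value; all names are illustrative rather than a required design.

```java
import java.time.LocalDateTime;

// Hypothetical sketch only: a message whose value is generic on the topic's type.
public class Message<T> {
    private final LocalDateTime createdAt; // datetime created
    private final String payloadType;      // payload type
    private final T value;                 // value containing information relevant to the topic

    public Message(LocalDateTime createdAt, String payloadType, T value) {
        this.createdAt = createdAt;
        this.payloadType = payloadType;
        this.value = value;
    }

    public LocalDateTime getCreatedAt() {
        return createdAt;
    }

    public String getPayloadType() {
        return payloadType;
    }

    public T getValue() {
        return value;
    }
}
```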
2.3 Producers
A Producer is responsible for sending messages to the Tributary system. As shown in the diagram above, a
message contains information including the datetime created and the source producer. Messages may have
a key indicating the id of the partition the message should be sent to. Alternatively, a message is
randomly assigned to a partition by the system.
2.3.1 Allocation of Messages to Partitions
Producers can indicate whether to send a message to a particular partition by providing the relevant key
or requesting random allocation. There are two types of producers:
Random Producers – the producer requests the Tributary system to randomly assign a message to a partition.
Manual Producers – the producer requests the Tributary system to assign a message to a particular
partition by providing a specific key.
A producer object can be created to represent one of the two producers listed above. However, it cannot
change its message allocation method once it has been created. Your implementation should allow
different message allocation methods to be added in the future.
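One common way to keep the allocation method fixed per producer while leaving it open to extension is the Strategy pattern. The sketch below is one possible shape for this, assuming Partition<T> exposes a getId() method; it is illustrative only, not a required design.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch only: pluggable message-allocation strategies for producers.
public interface AllocationStrategy<T> {
    Partition<T> choosePartition(List<Partition<T>> partitions, String key);
}

// Random allocation: ignore the key and pick any partition.
class RandomAllocation<T> implements AllocationStrategy<T> {
    private final Random random = new Random();

    @Override
    public Partition<T> choosePartition(List<Partition<T>> partitions, String key) {
        return partitions.get(random.nextInt(partitions.size()));
    }
}

// Manual allocation: the key names the partition the message should be appended to.
class ManualAllocation<T> implements AllocationStrategy<T> {
    @Override
    public Partition<T> choosePartition(List<Partition<T>> partitions, String key) {
        return partitions.stream()
                .filter(p -> p.getId().equals(key))
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("Unknown partition: " + key));
    }
}
```

A producer constructed with one of these strategies cannot change it afterwards, while new strategies can be added later without modifying existing producers.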
2.4 Consumers and Consumer Groups
2.4.1 Consumers
Consumers are responsible for consuming (processing) messages stored in partition queues. Consumers
operate as part of a Consumer Group. The Consumer reads the messages in the order that they were
produced. The consumer keeps track of the messages that have been consumed.
2.4.2 Consumer Groups
There can be multiple consumers consuming (processing) messages from one or more partitions in a
topic. Note that each partition can be consumed by only one consumer. A consumer group consists of
one or more consumers, capable of consuming all partitions in a topic.
Each topic can have multiple consumer groups. Each consumer group contains one or more consumers
that are together capable of consuming all partitions in the topic. While different consumer groups may
contain different numbers of consumers, the number of partitions being consumed from always remains
the same.
For example, in the image below Topic D is consumed by Consumer Group A, which has 3 consumers capable
of handling 5 partitions. Topic D is also consumed by Consumer Group B, which has 4 consumers capable of
handling 5 partitions.
2.4.3 Consumer Rebalancing
When a new consumer is added to a consumer group or an existing consumer is removed from the
consumer group, the partitions are reassigned based on one of the following two strategies:
Range – depending on the number of partitions and consumers at a given time, allocate partitions to
consumers as shown below.
In the above example, Partitions 0, 1, 2, 3 are allocated to Consumer I and Partitions 4, 5 and 6 are
allocated to Consumer II. The partitions are divided up evenly and allocated to the consumers. If there
is an odd number of partitions, the first consumer takes one extra.
Round Robin – depending on the number of partitions and consumers, allocate partitions in turns to
consumers (in a round robin fashion) as shown below.
In the above example, Partitions 0, 2, 4 and 6 are allocated to Consumer I, and Partitions 1, 3 and 5 are
allocated to Consumer II. The partitions are allocated like cards being dealt out.
A system should be able to dynamically change the rebalancing strategy between the above two.
If a partition is assigned a new consumer after rebalancing, the consumer should pick up from where the
previous consumer left off in that partition.
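As a sketch of the allocation arithmetic only (one possible reading of the rules above, with hypothetical names), the two strategies could be computed as follows; both match the seven-partition, two-consumer examples.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: map each consumer index to the partition indices it handles.
public class RebalancingSketch {
    // Range: contiguous blocks; earlier consumers take any leftover partitions.
    public static Map<Integer, List<Integer>> range(int numPartitions, int numConsumers) {
        Map<Integer, List<Integer>> allocation = new HashMap<>();
        int base = numPartitions / numConsumers;
        int extra = numPartitions % numConsumers;
        int next = 0;
        for (int c = 0; c < numConsumers; c++) {
            List<Integer> block = new ArrayList<>();
            int take = base + (c < extra ? 1 : 0);
            for (int i = 0; i < take; i++) {
                block.add(next++);
            }
            allocation.put(c, block);
        }
        return allocation;
    }

    // Round robin: deal partitions out to consumers in turn, like cards.
    public static Map<Integer, List<Integer>> roundRobin(int numPartitions, int numConsumers) {
        Map<Integer, List<Integer>> allocation = new HashMap<>();
        for (int c = 0; c < numConsumers; c++) {
            allocation.put(c, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            allocation.get(p % numConsumers).add(p);
        }
        return allocation;
    }
}
```

Because the strategy must be switchable at runtime and a new consumer must pick up from the previous consumer's position, it is usually helpful to keep the allocation of partitions separate from the per-partition consumption offsets.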
2.5 Replay Messages
One of the most powerful aspects of event streaming is the ability to replay messages that are stored in
the queue.
The first way this can occur is if a new consumer group is created – for each consumer in the group,
consumption of messages starts from the beginning.
The second way in which this can occur is via a controlled replay, which is done from a message
offset in a partition. Messages from that point onwards are streamed through the pipeline, until the
most recent message at the latest offset is reached.
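A controlled replay is essentially a read of a partition's log from a given offset through to its end. A minimal sketch is shown below (names hypothetical, assuming the Message<T> type sketched earlier):

```java
import java.util.List;

// Hypothetical sketch only: stream every message from the given offset up to the
// most recent message in the partition's log.
public class ReplaySketch {
    public static <T> List<Message<T>> replayFrom(List<Message<T>> partitionLog, int offset) {
        int start = Math.max(0, Math.min(offset, partitionLog.size()));
        return partitionLog.subList(start, partitionLog.size());
    }
}
```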
2.6 Design Considerations
Two design considerations you will need to think about in your solution are as follows:
Concurrency – since Producers and Consumers are all running in parallel, operating on a shared
resource, how will you ensure correctness?
Generics – How will you ensure that an object of any type can be used as an event payload in a
tributary topic?
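For example (a minimal sketch only, assuming the Message<T> type above; synchronising every method is one simple approach, and the java.util.concurrent collections are an alternative), a generic partition might guard its shared log like this:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: a generic partition whose shared log is guarded so that
// concurrent producers and consumers see a consistent view.
public class Partition<T> {
    private final String id;
    private final List<Message<T>> log = new ArrayList<>();

    public Partition(String id) {
        this.id = id;
    }

    public String getId() {
        return id;
    }

    // Producers append to the end of the partition.
    public synchronized void append(Message<T> message) {
        log.add(message);
    }

    // Consumers read by offset, so messages are retained for replay.
    public synchronized Message<T> read(int offset) {
        return offset < log.size() ? log.get(offset) : null;
    }
}
```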
3.0 Interface
3.1 Java API
Since you are building a library on which other engineers can develop event-driven systems, you will need
to have some classes which are available for others to use in their implementations – just as you do when
you import any Java library.
You will need to determine which classes are part of the interface (API which other developers can use),
and these classes will need to be documented with JavaDoc and go inside a folder called api . All other
classes are considered part of the internal workings of the system (black box) and do not need JavaDoc.
These classes should go inside a folder called core .
In your public-facing API specification, you will need to adopt Design by Contract and make clear any
preconditions, postconditions and exceptions of your interface.
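For instance, a documented interface in the api folder might spell out its contract like the sketch below (the interface, method and exception choices here are hypothetical, not a required design):

```java
// Hypothetical sketch only: a public api interface documented with Design by Contract.
public interface TributaryConsumer<T> {
    /**
     * Consumes the next message from the given partition.
     *
     * Precondition: this consumer is currently allocated to the given partition.
     * Postcondition: this consumer's offset for the partition is advanced by one,
     * and the returned message is the message that was at the previous offset.
     *
     * @param partitionId the id of the partition to consume from
     * @return the consumed message, or null if there are no new messages
     * @throws IllegalArgumentException if this consumer is not allocated to the partition
     */
    Message<T> consume(String partitionId);
}
```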
3.2 Philosophy and Usage
The way we refer to API here is a little bit different to how you might be used to it from previous courses.
In something like COMP1531, the term “API” usually refers to a web-based service that you can call upon
via an endpoint, to perform some of the work required by your application. In that scenario, components
that make up the API are abstracted away from your application.
However, when we refer to an API here, we are describing a library rather than a service. By using
this library, we can construct various different domain specific applications by bringing together the
components provided by the library. The Command Line Interface application described below is one
specific instance of a domain-specific application that we are using our library to build, but it is not the
behaviour of the API in itself. That is to say, the functionality that we want from the CLI isnʼt abstracted
behind the API as a service that facilitates requests based on user input, but rather we use the
components provided by the library and coordinate them for this specific use case.
We should be able to take this library and create some other domain specific application that isnʼt tied to
this specific Command Line Interface program. Additionally, the components provided by your library like
the producer and the consumer should be able to be extended by the user of your library to define new
types of producers and consumers for their specific application, building upon the mechanisms and
functionality provided by your library.
Although the library exposes a bunch of components publicly (the stuff that goes in the api folder), there will
also be things that the library uses internally that aren't exposed to the user (the stuff that goes in the
core folder). Hence the user only interacts with the public API of the library and uses the library and its
components in the way the API defines.
You can think of all of this very similarly to the Java API. It is a library that provides a bunch of components
that we can use to build our applications. The components that are available and the way we are able to
use them are defined by the public API. We can also extend things provided by the library. For example, if I
really wanted to, I could extend the ArrayList class, or maybe more practically, I can implement
provided interfaces in my own custom components.
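As an illustration of that extensibility (everything here is hypothetical – RandomProducer, Topic and produce are stand-ins for whatever your api actually exposes), a user of the library might write:

```java
// Hypothetical sketch only: an application-specific producer built on top of the
// library's public api by extending one of its producer types.
public class ProfileUpdateProducer extends RandomProducer<String> {
    public ProfileUpdateProducer(String id) {
        super(id);
    }

    // Domain-specific convenience method that delegates to the library.
    public void produceProfileUpdate(Topic<String> profilesTopic, String updatedProfileJson) {
        produce(profilesTopic, updatedProfileJson);
    }
}
```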
3.3 Command Line Interface
In order to run usability tests on your solution you will need to develop a way to interact with tributaries,
producers and consumers via command line.
In a separate package to the system, write a wrapper class called TributaryCLI that allows the user to
enter the following commands.
Command: create topic
Description: Creates a new topic in the tributary. id is the topic's identifier. type is the type of event that goes through the topic. While this can be any type in Java, for the purposes of the CLI it can be either Integer or String.
Output: A message showing the id, type and other relevant information about the topic, confirming its creation.
Command: create partition
Description: Creates a new partition in the given topic. id is the partition's identifier.
Output: A message confirming the partition's creation.
Command: create consumer group
Description: Creates a new consumer group with the given identifier. The consumer group is subscribed to a topic and is given an initial rebalancing method, one of Range or RoundRobin.
Output: A message confirming the consumer group's creation.
Command: create consumer
Description: Creates a new consumer within a consumer group.
Output: A message confirming the consumer's creation.
Command: delete consumer
Description: Deletes the given consumer.
Output: A message confirming the consumer's deletion, and an output of the rebalanced consumer group that the consumer was previously in.
Command: create producer
Description: Creates a new producer which produces events of the given type. The allocation method is one of Random or Manual, determining which method of partition selection is used for publishing events.
Output: A message confirming the producer's creation.
Command: produce event
Description: Produces a new event from the given producer to the given topic. How you represent the event is up to you. We recommend using a JSON structure to represent the different parts of an event, so that the event parameter to this command is the filename of a JSON file with the event content. The partition parameter is used only if the producer publishes events to a manually specified partition.
Output: The event id and the id of the partition it is currently in.
Command: consume event
Description: The given consumer consumes an event from the given partition. Precondition: the consumer is already allocated to the given partition.
Output: The id and contents of the event, showing that the consumer has received it.
Command: consume events
Description: As above, but the given consumer consumes multiple events from the given partition.
Output: As above, for each event consumed.
Command: show topic
Description: Shows the given topic.
Output: A display of the topic, including all of the events currently in each partition.
Command: show consumer group
Description: Shows the given consumer group.
Output: A display of the consumer group, showing which partitions each consumer is receiving events from.
Command: parallel produce
Description: Produces a series of events in parallel. This is purely for demonstrating that your tributary can cope with multiple producers publishing events simultaneously.
Output: For each event, the id of the partition it is currently in.
Command: parallel consume
Description: Consumes a series of events in parallel. This is purely for demonstrating that your tributary can cope with multiple consumers receiving events simultaneously.
Output: For each event consumed, the contents of the event and its id.
Command: set consumer group rebalancing
Description: Updates the rebalancing method of the given consumer group to be one of Range or RoundRobin.
Output: A message confirming the new rebalancing method.
Command: playback
Description: Plays back events for a given consumer from the given offset.
Output: The id and contents of each event received, in order.
The above table exists purely to give you some structure to develop a CLI so that your solution
can be usability tested – you are welcome to follow it as much or as little as you like, so long as
you are able to showcase an implementation of the Engineering Requirements discussed in
Section 2.
Feel free to add other CRUD operations as you see fit.
4.0 Assessment
In this assignment, you will need to design and implement the Tributary system outlined in the
specification.
Task 1) Preliminary Design (5 marks)
Before you begin work you will need to complete a preliminary, high-level design up front. In a blog post,
write an engineering specification which includes:
An analysis of the engineering requirements;
A list of usability tests – a "checklist" of scenarios which someone testing your system via the command
line interface could use to verify that your system works according to the requirements;
An initial UML diagram showing the entities, relationships between entities and key methods/fields in
each entity (does not have to be a complete list, it just needs to be a first-version API)
Your design for a Java API by which someone could use your solution to set up an event-driven system;
A testing plan which explains how you are going to structure your tests for the system. You will need a
mix of unit tests on individual components as well as integration tests to check that components work
together. Your usability tests will need to be incorporated into this plan as well.
When it comes to implementation, there are two ways you can go about implementing a solution:
Feature-driven – essentially, working through the table in Section 3.3 and getting one section working
at a time. This approach is easier to usability test incrementally, harder to incorporate with multiple
people, and can result in less cohesive software design and test design (more of an Incremental
Design Approach)
Component-driven – creating each component and writing unit tests individually, before bringing the
pieces together and usability testing at the end. This approach is harder to usability test incrementally,
easier to incorporate with multiple people but can result in more cohesive software and test design
(more of a Big Design Up Front Approach)
You will need to pick one approach and justify your decision.
All of this should occur in a blog post before you start coding. Youʼll be able to iterate on the design as you
go, this is just the preliminary design.
No marks will be awarded for a preliminary design with no implementation.
Task 2) Tributary Implementation (20 marks)
Implement the system according to the engineering requirements, using your chosen approach.
To demonstrate the correctness of your solution, you will need to record a short video of yourself walking
through each scenario in your usability test "checklist" and showing that your solution satisfies each of the
requirements.
Your video should be 5 minutes or less. Videos longer than 5 minutes won't be marked;
You can help your marker by indexing the video for specific features;
Upload your video to Google Drive or YouTube as an unlisted video and link to it in your blog post.
The breakdown of marks in this section is as follows.
Criterion: Pass (8 marks pair, 10 marks solo)
Requirements: The system functions with a single producer, a single topic, a single partition, a single
consumer group and a single consumer such that a message lifecycle can be completed. Events only have
to be of a single type for a pass.
Criterion: Credit (12 marks pair, 15 marks solo)
Requirements: All requirements for a pass, as well as support for multiple types of events and
demonstration of producers and consumers working in parallel.
Criterion: Distinction (16 marks pair, 20 marks solo)
Requirements: All requirements for a credit, as well as implementation of consumer groups with no
rebalancing, and implementation of producer message allocation options.
Criterion: High Distinction (20 marks pair)
Requirements: All requirements for a credit, as well as a full consumer rebalancing implementation and
support for replaying of events.
When it comes to testing your solution, you will need to implement your testing plan. Your testing will be
assessed in the final design task. You should aim for at least 80% coverage of your final solution as part of
having good test design.
Task 3) Final Design (15 marks)
In a second blog post you will need to write, in essence, a report on your final solution. Your blog post will
need to contain:
Your final testing plan and your final list of usability tests;
An overview of the Design Patterns used in your solution;
Explanation of how you accommodated for the design considerations in your solution;
Your final UML diagram; and
A brief reflection on the assignment, including on the challenges you faced and whether you changed
your development approach.
We will use this blog post and your code to assess the overall design of your solution.
4.2 Submission
To submit you will need to push all your code to master, and submit the following form.
Form Link Coming Soon.
Make sure you push to master. We will take your most recent commit on master as your submission
during marking. Any commits on master after the deadline will be taken as a late submission.
If your blog is not linked in your repo when you submit, it will not be marked. Markers will not be chasing
up blog posts.
4.3 Late Penalties
For one week, there is a late penalty of 5% per day of the maximum assessment mark. This means, for
example, if you submit one day late and would have received an on-time assessment mark of 100% your
final assessment mark would be 95%. If you submit one day late and would have received an on-time
assessment mark of 90% then your mark is unchanged.
Any submissions after Friday, Week 11 at 5pm will be awarded 0 marks.
4.4 Extenuating Circumstances
Extensions are only granted in extenuating circumstances and must be approved through either Special
Consideration, which needs to be submitted prior to the assignment deadline, or pre-arranged through an
Equitable Learning Plan with Equitable Learning Services and the Course Authority. In all cases please
Pairs where either student has Equitable Learning Provisions for group work are granted a 1-week
extension by emailing
If the student is completing the assignment individually and has an Equitable Learning Plan they can
receive an extension of up to one week (no more).
Where a student in a pair has received Special Consideration, the pair will be given an extension. Individuals
completing the assignment who receive Special Consideration will be given an extension.
There is no subsequent late penalty on extended deadlines; any submissions made after Sunday Week 11,
5pm will be awarded 0 marks.
4.5 Other Expectations
The following segment applies to people completing the assignment in pairs.
While it is up to you as a pair to decide how work is distributed between you, for the purpose of
assessment there are certain key criteria all partners must attain:
Code contribution;
Non-code contribution;
Usage of Git/GitLab; and
Academic conduct.
The details of each of these are below.
While, in general, both students will receive the same mark for the assignment, if you as an individual fail
to meet these criteria your final assignment mark will be reduced.
If you believe your partner is not contributing as they should, you must inform your tutor at the end of the
corresponding week. For example, if your partner has not contributed in Week 9, you need to report this
before the end of Week 9. You must not wait beyond this. If you fail to report in time, we may not be able
to address the issue and/or apply a redistribution of marks.
4.6 Plagiarism
The work you submit must be your own work. Submission of work partially or completely derived from any
other person or jointly written with any other person is not permitted. The penalties for such an offence
may include negative marks, automatic failure of the course and possibly other academic discipline.
Assignment submissions will be examined both automatically and manually for such submissions.
The use of code synthesis tools, such as GitHub Copilot, is not permitted on this project.
Relevant scholarship authorities will be informed if students holding scholarships are involved in an
incident of plagiarism or other misconduct.
Do not provide or show your project work to any other person, except for your group and the teaching
staff of COMP2511. If you knowingly provide or show your assignment work to another person for any
reason, and work derived from it is submitted, you may be penalised, even if the work was submitted
without your knowledge or consent. This may apply even if your work is submitted by a third party
unknown to you.
Note, you will not be penalised if your work has the potential to be taken without your consent or
knowledge.
5.0 Credits
Credit for the engineering requirements in this specification goes to the Nine Technology Apache Kafka
Training Course.