COMP2511 23T2 Confluence

COMP2511 23T2 Assignment III: Tributary

Due: Week 10 Sunday, 5pm (6th August)

No late penalty will be applied until Week 11 Tuesday, 5pm (8th August).

Value: 10% – Bonus

This assignment can be used to make up lost marks in other assignments. It does not make up for marks

lost in labs or the final exam.

Getting Set Up
This assignment is completed in pairs or individually.

You will need to fork this repo in order to complete the assignment: Assignment III Starter Code.

If you are completing the assignment individually, fork it into your personal COMP2511 GitLab Group

(COMP2511 / 23T2 / z123456).

If you are completing the assignment in your Assignment II pair, fork it into your pair GitLab Group

(COMP2511 / 23T2 / Teams / M09A_ABBA).

If you are completing the assignment in a pair which is different to your Assignment II pair, post on the

forum and weʼll create the group for you to fork the repo into.

What is this assignment aiming to achieve?

Having completed Assignments I and II and gained familiarity and experience with the course concepts, it's time to turn this into expertise. This assignment aims to get you working with a real-world problem that forms a big part of modern software architecture – event-streaming and event pipelines.

This assignment is designed for students aiming to achieve a high mark in the course – you can still

get up to 100% in the course overall without completing this assignment.

The aims of this assignment can be broken down into four major themes:

1. Synthesising Complex and Abstract Requirements. This problem is a bit different from those

youʼve encountered before, and may be difficult to understand at first. However, like many

similar problems, once you understand what youʼre working with the solution is more

straightforward.

2. Building a Java API. Using your skills in Design by Contract, youʼll need to design a library that

could be used as a platform to build event-driven systems.

3. Design and Testing Methodologies. Youʼll be completing a preliminary design to set the

trajectory before diving in, and iterating on your design as you go. Youʼll also be planning out

and fully architecting and writing a test suite for the system.

4. Event-Driven & Asynchronous Design. In developing this system, youʼll learn about how event streaming works and how event-driven design can be used in software systems. Youʼll also need to consider the implications of working asynchronously, and how to manage concurrency.

1.0 Overview
Event-Driven Architecture makes up much of the backbone of modern software. With a move towards

decentralised microservice systems, there is a need for scalable asynchronous communication between

software components.

In this assignment you will be writing a Java API that another engineer could use to build an event-driven

system using the principles discussed in lectures.

This library is based on a heavily simplified version of the event streaming infrastructure Apache Kafka (https://kafka.apache.org/), which you can read more about for your own interest. No understanding of Kafka is required to complete this assignment.


2.0 Engineering Requirements
The fundamental premise on which Event-Driven Architecture rests is the ability of producer and

consumer entities in the system to share data asynchronously via a stream-like channel.

However, our library will allow for more complex interactions than simply that of a single channel.

A Tributary Cluster contains a series of topics. A topic contains events which are logically grouped

together. For example, a cluster could contain two topics: one for images-related events and one for

video-related events. You can think of them like a table in a database or a folder in a file system.

Within each topic, there are a series of partitions – each an ordered sequence of messages which can be appended to.

You can complete this assignment in pairs or individually. You will be marked differently depending

on whether you complete the assignment as a pair or individually. You can continue with the same

pair as you had in Assignment II, or form a new pair if you like.


A unit of data within a Tributary is a message, also called a record or event. For example, to update their profile a user may send a message to Partition 1 in Topic A, and this message will be appended to Partition 1 in Topic A. Each partition is a queue where messages are appended at the end. Each message has an optional key to indicate which partition it should be appended to.

For example, a topic might relate to “user profiles”, with each message requesting an update to a specific profile. Since there can be many such requests at a given time, the system divides the incoming requests into multiple partitions. There can be multiple consumers consuming messages at the

same time (concurrently). However, each partition is handled by only one consumer. Multiple consumers

will allow us to effectively utilise the underlying hardware with multiple cores.

In the context of the library you are building, topics are parameterised on a generic type; all event

payloads within that topic must be of the specified type.
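As a rough sketch of how this structure might be modelled (class and method names here are illustrative assumptions, not part of the spec):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a topic is parameterised on a generic payload
// type T, so every partition within it holds payloads of that type.
class Partition<T> {
    private final String id;
    private final List<T> messages = new ArrayList<>(); // append-only sequence

    Partition(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    void append(T payload) {
        messages.add(payload); // messages are appended at the end
    }

    List<T> getMessages() {
        return messages;
    }
}

class Topic<T> {
    private final String id;
    private final List<Partition<T>> partitions = new ArrayList<>();

    Topic(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }

    void addPartition(Partition<T> partition) {
        partitions.add(partition);
    }

    List<Partition<T>> getPartitions() {
        return partitions;
    }
}
```

Because the payload type is fixed per topic, the compiler rejects attempts to append, say, a video event to an images topic.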

2.1 Message Lifecycle: A Simple Example

Let us take the example of a user updating their profile. This results in an event being generated by the

producer for a topic “user profiles” with the updated profile information. This event is now delivered to

the Tributary, which assigns the event to one of the partitions. The producer indicates whether the

message is randomly allocated to a partition, or provides a key specifying which partition to append the

message to.

A consumer processes one or more partitions by sequentially processing (consuming) events in the

allocated partitions.

2.2 Message Structure

Individual messages contain the following information:

Datetime created;

Payload Type;

Value. The value is an object containing relevant information for a topic. Considering information

required for different topics may change, you should consider using a generic type here.
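A minimal sketch of such a message type, assuming a generic value (all field and class names here are illustrative, not prescribed):

```java
import java.time.LocalDateTime;

// Illustrative sketch: the value is generically typed so that different
// topics can carry different payloads.
class Message<T> {
    private final LocalDateTime created; // datetime created
    private final String payloadType;    // payload type descriptor
    private final T value;               // topic-specific payload

    Message(LocalDateTime created, String payloadType, T value) {
        this.created = created;
        this.payloadType = payloadType;
        this.value = value;
    }

    LocalDateTime getCreated() {
        return created;
    }

    String getPayloadType() {
        return payloadType;
    }

    T getValue() {
        return value;
    }
}
```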

2.3 Producers

A Producer is responsible for sending messages to the Tributary system. A message carries information including the datetime created and the source producer. Messages may have a key which indicates the partition id, sending the message to that particular partition. Alternatively, a message is randomly assigned to a partition by the system.

2.3.1 Allocation of Messages to Partitions

Producers can indicate whether to send a message to a particular partition by providing the relevant key

or requesting random allocation. There are two types of producers:

Random Producers – the producer requests the Tributary system to randomly assign a message to a partition.

Manual Producers – the producer requests the Tributary system to assign a message to a particular

partition by providing a specific key.

A producer object can be created to represent one of the two producers listed above. However, it cannot

change its message allocation method once it has been created. Your implementation should allow

different message allocation methods to be added in the future.
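One way (not the only way) to satisfy the requirement that the allocation method is fixed at creation yet extensible later is the Strategy pattern; everything named below is an illustrative assumption:

```java
import java.util.Random;

// Strategy interface: adding a new allocation method later only requires a
// new implementation of this interface.
interface AllocationStrategy {
    int choosePartition(String key, int partitionCount);
}

// Random producers ask the system to pick a partition at random.
class RandomAllocation implements AllocationStrategy {
    private final Random random = new Random();

    @Override
    public int choosePartition(String key, int partitionCount) {
        return random.nextInt(partitionCount);
    }
}

// Manual producers supply a key that determines the partition; here the key
// is hashed onto an index, though it could equally name a partition directly.
class ManualAllocation implements AllocationStrategy {
    @Override
    public int choosePartition(String key, int partitionCount) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }
}

// The strategy is fixed at construction time and cannot be swapped later.
class Producer {
    private final AllocationStrategy strategy;

    Producer(AllocationStrategy strategy) {
        this.strategy = strategy;
    }

    int allocate(String key, int partitionCount) {
        return strategy.choosePartition(key, partitionCount);
    }
}
```

Because the `Producer` holds the strategy in a `final` field, the allocation method genuinely cannot change after creation, while new methods can be added without touching existing producer code.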

2.4 Consumers and Consumer Groups

2.4.1 Consumers

Consumers are responsible for consuming (processing) messages stored in partition queues. Consumers

operate as part of a Consumer Group. The Consumer reads the messages in the order that they were

produced. The consumer keeps track of the messages that have been consumed.

2.4.2 Consumer Groups

There can be multiple consumers consuming (processing) messages from one or more partitions in a

topic. Note that each partition can be consumed by only one consumer. A consumer group consists of

one or more consumers, capable of consuming all partitions in a topic.

Each consumer group contains one or more consumers, together capable of consuming all partitions in a topic. Each topic can have multiple consumer groups; while each consumer group may contain a different number of consumers, the number of partitions being consumed from will always remain the same.

For example, Topic D might be consumed by Consumer Group A, which has 3 consumers capable of handling 5 partitions, and also by Consumer Group B, which has 4 consumers capable of handling 5 partitions.

2.4.3 Consumer Rebalancing

When a new consumer is added to a consumer group or an existing consumer is removed from the

consumer group, the partitions are reassigned based on one of the following two strategies:

Range – depending on the number of partitions and consumers at a given time, allocate partitions to

consumers as shown below.

For example, with 7 partitions and 2 consumers, Partitions 0, 1, 2 and 3 are allocated to Consumer I and Partitions 4, 5 and 6 are allocated to Consumer II. The partitions are divided up evenly and allocated to the consumers. If there is an odd number of partitions, the first consumer takes one extra.

Round Robin – depending on the number of partitions and consumers, allocate partitions in turns to

consumers (in a round robin fashion) as shown below.

For example, with 7 partitions and 2 consumers, Partitions 0, 2, 4 and 6 are allocated to Consumer I, and Partitions 1, 3 and 5 are allocated to Consumer II. The partitions are allocated like cards being dealt out.

The system should be able to dynamically change between the above two rebalancing strategies.

If a partition is assigned a new consumer after rebalancing, the consumer should pick up from where the

previous consumer left off in that partition.
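The two strategies can be sketched as pure allocation functions (an illustration, not a required design); with 7 partitions and 2 consumers they reproduce the allocations described above:

```java
import java.util.ArrayList;
import java.util.List;

// Both strategies return, for each consumer, the list of partition indices
// it is assigned. Names here are illustrative, not prescribed.
class Rebalancing {
    // Range: contiguous, even split; earlier consumers take any extras.
    static List<List<Integer>> range(int partitions, int consumers) {
        List<List<Integer>> result = new ArrayList<>();
        int base = partitions / consumers;
        int extra = partitions % consumers;
        int next = 0;
        for (int c = 0; c < consumers; c++) {
            int size = base + (c < extra ? 1 : 0);
            List<Integer> assigned = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                assigned.add(next++);
            }
            result.add(assigned);
        }
        return result;
    }

    // Round robin: deal partitions out like cards.
    static List<List<Integer>> roundRobin(int partitions, int consumers) {
        List<List<Integer>> result = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            result.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            result.get(p % consumers).add(p);
        }
        return result;
    }
}
```

Because both methods share a signature, switching strategy dynamically is a matter of swapping which function (or strategy object) the consumer group calls on rebalance.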

2.5 Replay Messages

One of the most powerful aspects of event streaming is the ability to replay messages that are stored in

the queue.

The first way this can occur is when a new consumer group is created – for each consumer in the group, consumption of messages starts from the beginning.

The second way in which this can occur is via a controlled replay, which is done from a message

offset in a partition. Messages from that point onwards are streamed through the pipeline, until the

most recent message at the latest offset is reached.
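A controlled replay can be thought of as streaming a slice of a partition from a chosen offset up to the latest message; a minimal sketch, assuming messages are retained in an ordered list (names illustrative):

```java
import java.util.List;

// Sketch: replay every retained message from the given offset up to, and
// including, the most recent message.
class Replay {
    static <T> List<T> fromOffset(List<T> partitionMessages, int offset) {
        return partitionMessages.subList(offset, partitionMessages.size());
    }
}
```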

2.6 Design Considerations

Two design considerations you will need to think about in your solution are as follows:

Concurrency – since Producers and Consumers are all running in parallel, operating on a shared

resource, how will you ensure correctness?

Generics – How will you ensure that an object of any type can be used as an event payload in a

tributary topic?
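For the concurrency consideration, one illustrative option is to back each partition with a thread-safe queue from `java.util.concurrent`. Note that this simple sketch removes messages on consumption; a solution that also supports replay would instead retain messages and track per-consumer offsets.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative sketch: concurrent producers can append while a consumer
// polls, without explicit locks in the calling code.
class SafePartition<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();

    void append(T message) {
        queue.offer(message);
    }

    // Returns null if there is currently nothing to consume.
    T consume() {
        return queue.poll();
    }
}
```

Alternatives include `synchronized` methods or explicit `ReentrantLock`s per partition; the key point is that the shared resource is guarded at a single, well-defined place.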

3.0 Interface

3.1 Java API

Since you are building a library on which other engineers can develop event-driven systems, you will need

to have some classes which are available for others to use in their implementations – just as you do when

you import any Java library.

You will need to determine which classes are part of the interface (API which other developers can use),

and these classes will need to be documented with JavaDoc and go inside a folder called api . All other

classes are considered part of the internal workings of the system (black box) and do not need JavaDoc.

These classes should go inside a folder called core .

In your public-facing API specification, you will need to adopt Design by Contract and make clear any

preconditions, postconditions and exceptions of your interface.
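For example, a public API method might state its contract in JavaDoc like this (the interface, method and exception choice are illustrative assumptions, with a throwaway stub to show the contract being enforced):

```java
// Illustrative only: how a class in the api folder might document its
// contract. Names are assumptions, not requirements.
interface EventConsumer<T> {
    /**
     * Consumes the next unconsumed event from the given partition.
     *
     * <p>Precondition: this consumer is currently allocated to the
     * partition with id {@code partitionId}.
     *
     * <p>Postcondition: this consumer's offset within that partition has
     * advanced past the returned event.
     *
     * @param partitionId the id of the partition to consume from
     * @return the payload of the consumed event
     * @throws IllegalStateException if this consumer is not allocated to
     *         the given partition
     */
    T consume(String partitionId);
}

// Throwaway stub purely to show the contract being enforced.
class SingleValueConsumer implements EventConsumer<String> {
    private final String allocatedPartition;
    private final String nextEvent;

    SingleValueConsumer(String allocatedPartition, String nextEvent) {
        this.allocatedPartition = allocatedPartition;
        this.nextEvent = nextEvent;
    }

    @Override
    public String consume(String partitionId) {
        if (!allocatedPartition.equals(partitionId)) {
            throw new IllegalStateException("not allocated to " + partitionId);
        }
        return nextEvent;
    }
}
```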

3.2 Philosophy and Usage

The way we refer to API here is a little bit different to how you might be used to it from previous courses.

In something like COMP1531, the term “API” usually refers to a web-based service that you can call upon

via an endpoint, to perform some of the work required by your application. In that scenario, components

that make up the API are abstracted away from your application.

However, when we refer to an API here, we are describing a library rather than a service. By using this library, we can construct various domain-specific applications by bringing together the

components provided by the library. The Command Line Interface application described below is one

specific instance of a domain-specific application that we are using our library to build, but it is not the

behaviour of the API in itself. That is to say, the functionality that we want from the CLI isnʼt abstracted

behind the API as a service that facilitates requests based on user input, but rather we use the

components provided by the library and coordinate them for this specific use case.

We should be able to take this library and create some other domain specific application that isnʼt tied to

this specific Command Line Interface program. Additionally, the components provided by your library like

the producer and the consumer should be able to be extended by the user of your library to define new

types of producers and consumers for their specific application, building upon the mechanisms and

functionality provided by your library.

Although the library exposes a bunch of components publicly (the stuff that goes in api folder), there will

also be things that the library uses internally that arenʼt exposed to the user (the stuff that goes in the

core folder). Hence the user only interacts with the public API of the library and uses the library and its

components in the way the API defines.

You can think of all of this very similarly to the Java API. It is a library that provides a bunch of components

that we can use to build our applications. The components that are available and the way we are able to

use them are defined by the public API. We can also extend things provided by the library. For example, if I

really wanted to, I could extend the ArrayList class, or maybe more practically, I can implement

provided interfaces in my own custom components.

3.3 Command Line Interface

In order to run usability tests on your solution you will need to develop a way to interact with tributaries,

producers and consumers via command line.

In a separate package to the system, write a wrapper class called TributaryCLI that allows the user to

enter the following commands.

Each command below is listed with its description and expected output.

create topic <id> <type>
    Description: Creates a new topic in the tributary. <id> is the topicʼs
    identifier. <type> is the type of event that goes through the topic.
    While this can be any type in Java, for the purposes of the CLI it can
    either be Integer or String.
    Output: A message showing the id, type and other relevant information
    about the topic, confirming its creation.

create partition <topic> <id>
    Description: Creates a new partition in the topic with id <topic>.
    <id> is the partitionʼs identifier.
    Output: A message confirming the partitionʼs creation.

create consumer group <id> <topic> <rebalancing>
    Description: Creates a new consumer group with the given identifier.
    <topic> is the topic the consumer group is subscribed to.
    <rebalancing> is the consumer groupʼs initial rebalancing method, one
    of Range or RoundRobin.
    Output: A message confirming the consumer groupʼs creation.

create consumer <group> <id>
    Description: Creates a new consumer within a consumer group.
    Output: A message confirming the consumerʼs creation.

delete consumer <consumer>
    Description: Deletes the consumer with the given id.
    Output: A message confirming the consumerʼs deletion, and an output of
    the rebalanced consumer group that the consumer was previously in.

create producer <id> <type> <allocation>
    Description: Creates a new producer which produces events of the given
    type. <allocation> is either Random or Manual, determining which
    method of partition selection is used for publishing events.
    Output: A message confirming the producerʼs creation.

produce event <producer> <topic> <event> <partition>
    Description: Produces a new event from the given producer to the given
    topic. How you represent the event is up to you. We recommend using a
    JSON structure to represent the different parts of an event, where the
    event parameter to this command is a filename of a JSON file with the
    event content. <partition> is an optional parameter, used only if the
    producer publishes events to a manually specified partition.
    Output: The event id and the id of the partition it is currently in.

consume event <consumer> <partition>
    Description: The given consumer consumes an event from the given
    partition. Precondition: the consumer is already allocated to the
    given partition.
    Output: The id and contents of the event, showing that the consumer
    has received the event.
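Putting the commands into practice, a minimal skeleton of how TributaryCLI might tokenise and dispatch input (the parsing approach and output strings are illustrative assumptions, not prescribed):

```java
// Illustrative skeleton: tokenise a line and dispatch on the leading words,
// delegating the real work to your library's API.
class TributaryCLI {
    static String handle(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length == 4 && tokens[0].equals("create") && tokens[1].equals("topic")) {
            // e.g. "create topic images String" would delegate to your library
            return "created topic " + tokens[2] + " of type " + tokens[3];
        }
        // ...the remaining commands from the table dispatch similarly...
        return "unknown command: " + line;
    }

    public static void main(String[] args) {
        java.util.Scanner scanner = new java.util.Scanner(System.in);
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            if (line.equals("exit")) {
                break;
            }
            System.out.println(handle(line));
        }
    }
}
```

Keeping the parsing in a wrapper class like this leaves the library itself free of any CLI concerns, which matters for the "library, not service" philosophy in Section 3.2.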

4.0 Assessment
In this assignment, you will need to design and implement the Tributary system outlined in the

specification.

Task 1) Preliminary Design (5 marks)

Before you begin work you will need to complete a preliminary, high-level design up front. In a blog post,

write an engineering specification which includes:

An analysis of the engineering requirements;

A list of usability tests – a “checklist” of scenarios which someone testing your system via the

command line interface could use to verify that your system works according to the requirements.

consume events <consumer> <partition> <number of events>
    Description: The given consumer consumes multiple events from the
    given partition. Precondition: the consumer is already allocated to
    the given partition.
    Output: The id and contents of each event, showing that the consumer
    has received them.

show topic <topic>
    Description: Prints a visual display of the given topic, including all
    partitions and all of the events currently in each partition.

show consumer group <group>
    Description: Shows all consumers in the consumer group, and which
    partitions each consumer is receiving events from.

parallel produce (<producer>, <topic>, <event>), ...
    Description: Produces a series of events in parallel. This is purely
    for demonstrating that your tributary can cope with multiple producers
    publishing events simultaneously.
    Output: For each event, the id of the partition it is currently in.

parallel consume (<consumer>, <partition>), ...
    Description: Consumes a series of events in parallel. This is purely
    for demonstrating that your tributary can cope with multiple consumers
    receiving events simultaneously.
    Output: For each event consumed, the contents of the event and its id.

set consumer group rebalancing <group> <rebalancing>
    Description: Updates the rebalancing method of consumer group <group>
    to be one of Range or RoundRobin.
    Output: A message confirming the new rebalancing method.

playback <consumer> <partition> <offset>
    Description: Plays back events for a given consumer from the given
    offset.
    Output: The id and contents of each event received, in order.

The above table exists purely to give you some structure to develop a CLI so that your solution can be usability tested – you are welcome to follow it as much or as little as you like, so long as

you are able to showcase an implementation of the Engineering Requirements discussed in

Section 2.

Feel free to add other CRUD operations as you see fit.

An initial UML diagram showing the entities, relationships between entities and key methods/fields in

each entity (does not have to be a complete list, it just needs to be a first-version API)

Your design for a Java API by which someone could use your solution to set up an event-driven system;

A testing plan which explains how you are going to structure your tests for the system. You will need a

mix of unit tests on individual components as well as integration tests to check that components work

together. Your usability tests will need to be incorporated into this plan as well.

When it comes to implementation, there are two ways you can go about implementing a solution:

Feature-driven – essentially, working through the table in Section 3.3 and getting one section working

at a time. This approach is easier to usability test incrementally, harder to incorporate with multiple

people, and can result in less cohesive software design and test design (more of an Incremental

Design Approach)

Component-driven – creating each component and writing unit tests individually, before bringing the

pieces together and usability testing at the end. This approach is harder to usability test incrementally,

easier to incorporate with multiple people but can result in more cohesive software and test design

(more of a Big Design Up Front Approach)

You will need to pick one approach and justify your decision.

All of this should occur in a blog post before you start coding. Youʼll be able to iterate on the design as you go; this is just the preliminary design.

No marks will be awarded for a preliminary design with no implementation.

Task 2) Tributary Implementation (20 marks)

Implement the system according to the engineering requirements, using your chosen approach.

To demonstrate the correctness of your solution, you will need to record a short video of you walking through each scenario in your usability test “checklist” and showing that your solution satisfies each requirement.
The breakdown of marks in this section is as follows.

Your video should be 5 minutes or less. Videos longer than 5 minutes wonʼt be marked;

You can help your marker by indexing the video for specific features;

Upload your video to Google Drive or YouTube as an unlisted video and link to it in your blog post.

Pass (8 marks pair, 10 marks solo): The system functions with a single producer, a single topic, a single partition, a single consumer group and a single consumer, such that a message lifecycle can be completed. Events only have to be of a single type for a pass.

Credit (12 marks pair, 15 marks solo): All requirements for a pass, as well as support for multiple types of events and a demonstration of producers and consumers working in parallel.

Distinction (16 marks pair, 20 marks solo): All requirements for a credit, as well as implementation of consumer groups with no rebalancing, and implementation of producer message allocation options.

High Distinction (20 marks pair): All requirements for a distinction, as well as a full consumer rebalancing implementation and support for replaying of events.

Task 3) Final Design (15 marks)

In a second blog post you will need to write, in essence, a report on your final solution. Your blog post will

need to contain:

Your final testing plan and your final list of usability tests;

An overview of the Design Patterns used in your solution;

Explanation of how you accommodated for the design considerations in your solution;

Your final UML diagram; and

A brief reflection on the assignment, including on the challenges you faced and whether you changed

your development approach.

We will use this blog post and your code to assess the overall design of your solution.

4.2 Submission

To submit you will need to push all your code to master, and submit the following form.

Form Link Coming Soon.

Make sure you push to master . We will take your most recent commit on master as your submission

during marking. Any commits on master after the deadline will be taken as a late submission.

4.3 Late Penalties

For one week, there is a late penalty of 5% per day of the maximum assessment mark. This means, for

example, if you submit one day late and would have received an on-time assessment mark of 100% your

final assessment mark would be 95%. If you submit one day late and would have received an on-time

assessment mark of 90% then your mark is unchanged.

Any submissions after Friday, Week 11 at 5pm will be awarded 0 marks.

4.4 Extenuating Circumstances

Extensions are only granted in extenuating circumstances and must be approved through either Special

Consideration, which needs to be submitted prior to the assignment deadline, or pre-arranged through an

Equitable Learning Plan with Equitable Learning Services and the Course Authority. In all cases:

Pairs where either student has Equitable Learning Provisions for group work are granted a 1-week extension.

If the student is completing the assignment individually and has an Equitable Learning Plan they can

receive an extension of up to one week (no more).

Pairs where a student has received Special Consideration will be given an extension. Individuals completing the assignment who receive Special Consideration will also be given an extension.

There is no subsequent late penalty on extended deadlines; any submissions made after Sunday Week 11,

5pm will be awarded 0 marks.


When it comes to testing your solution, you will need to implement your testing plan.

Your testing will be assessed in the final design task. You should aim for at least 80% coverage of

your final solution as part of having good test design.

If your blog is not linked in your repo when you submit, it will not be marked. Markers will not

be chasing up blog posts.

4.5 Other Expectations

The following segment applies to people completing the assignment in pairs.

While it is up to you as a pair to decide how work is distributed between you, for the purpose of

assessment there are certain key criteria all partners must attain:

Code contribution;

Non-code contribution;

Usage of Git/GitLab; and

Academic conduct.

The details of each of these is below.

While, in general, both students will receive the same mark for the assignment, if you as an individual fail

to meet these criteria your final assignment mark will be reduced.

4.6 Plagiarism

The work you submit must be your own work. Submission of work partially or completely derived from any

other person or jointly written with any other person is not permitted. The penalties for such an offence

may include negative marks, automatic failure of the course and possibly other academic discipline.

Assignment submissions will be examined both automatically and manually for such submissions.

The use of code synthesis tools, such as GitHub Copilot, is not permitted on this project.

Relevant scholarship authorities will be informed if students holding scholarships are involved in an

incident of plagiarism or other misconduct.

Do not provide or show your project work to any other person, except for your group and the teaching

staff of COMP2511. If you knowingly provide or show your assignment work to another person for any

reason, and work derived from it is submitted you may be penalised, even if the work was submitted

without your knowledge or consent. This may apply even if your work is submitted by a third party

unknown to you.

Note, you will not be penalised if your work has the potential to be taken without your consent or

knowledge.

5.0 Credits
Credit for the engineering requirements in this specification goes to Nine Technology Apache Kafka

Training Course.

If you believe your partner is not contributing as they should, you must inform your tutor at the end of the corresponding week.

For example, if your partner has not contributed in Week 5, you need to report this before the end

of Week 9. You must not wait beyond this. If you fail to report in time, we may not be able to

address the issue and/or apply redistribution of marks.
