Assign4 (MSA)
1 Introduction
1.1 Contents
This assignment is meant to combine what you have learned so far during the MSA part of the module. You will build upon some of the exercises from previous assignments, optimizing your design in terms of time-area trade off. Additionally, you will design a general purpose processor with a simple Instruction Set Architecture (ISA).
During this assignment, you will need all tooling from the previous assignments.
1.3 Assessment
The assessment for this assignment will be done through a graded assignment on Canvas. You are expected to turn in your Clash code (.hs), as well as a report (.pdf) including relevant RTL schematics. For exercise 1 and 2 you can score 4 and 6 points respectively, for a total of 10 points.
Your grade is calculated using the following formula, where 𝐺 is your grade and 𝑝 the total number of points: 𝐺 = 𝑚𝑎𝑥(1, 𝑝)
Deadline: 2023-06-02 at 23:55
Exercise 1 – Min-Max Optimized for Area
Recall Exercise 3 from Assign3 (MSA): Min-Max.
By now, you have most likely concluded that your solution from the previous assignment would not scale in practice, since the length of the combinational path grows linearly with the number of inputs. In practice, hardware designers (especially in Embedded Systems) have strict area and power consumption requirements to fulfill.
Your task in this exercise is to implement a hardware design with functionality similar to the Min-Max exercise from the previous assignment, but using only two compare elements. Note that this is a resource-constrained scenario.
Also note that the behaviour is slightly different, since you have to supply a new data input and control signal each clock cycle, instead of all at once.
Hint: use a Mealy machine in your design!
The idea is to make a machine that keeps track of the minimum and maximum value of an input stream. The control signal indicates what the output should be, so the circuit either outputs the maximum value or the minimum value of the numbers that have already streamed in.
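The streaming behaviour described above can be sketched as a software reference model in plain Haskell (not Clash). This is only an illustration under stated assumptions: the names `Sel`, `step` and `simulateMinMax` are hypothetical, the data type is `Int` rather than a sized hardware type, the encoding of the control signal is not taken from the handout, and the output is assumed to already include the input of the current cycle. The step function has the shape state -> input -> (new state, output), which is the shape a Mealy machine description takes.

```haskell
import Data.List (mapAccumL)

-- Hypothetical control type: which of the two tracked values to output.
data Sel = SelMin | SelMax deriving (Show, Eq)

-- State of the machine: (running minimum, running maximum).
type MinMax = (Int, Int)

-- One clock cycle: update both running values, then select the output.
-- Only two comparisons are made per cycle (one min, one max).
step :: MinMax -> (Sel, Int) -> (MinMax, Int)
step (lo, hi) (sel, x) = ((lo', hi'), out)
  where
    lo' = min lo x
    hi' = max hi x
    out = case sel of
      SelMin -> lo'
      SelMax -> hi'

-- Simulate a finite number of cycles. The initial state is chosen so that
-- the first input always replaces both running values (an assumption).
simulateMinMax :: [(Sel, Int)] -> [Int]
simulateMinMax = snd . mapAccumL step (maxBound, minBound)
```

For example, `simulateMinMax [(SelMin, 3), (SelMax, 1), (SelMin, 2)]` evaluates to `[3,3,1]` in this model. A model like this can serve as a cross-check for the simulation output of the actual hardware description.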
1. Test your design by simulation and make sure it has the correct functionality as mentioned above! Additionally, try it with the following input signals and put the output in your report: [2pt]
Control signal: [0,0,0,0,1,1,1,1]
Data in: [4,2,5,3,7,1,0,10]
2. Once you’re satisfied with the design, let Quartus synthesize it and show the RTL schematic. Explain how many input pins, registers and comparators there are in your design. Additionally, explain how many clock cycles it takes for the correct output to appear. [2pt]
Exercise 2 – Stack Processor
In this exercise, you will design a simple stack-based processor. The processor has a stack as its memory (state) and receives an instruction (input) every clock cycle. As its output, it should give the result of the current instruction. For example: Push x results in x at the output during the same clock cycle, Add results in stack[sp-1] + stack[sp-2] etc.
Your task is to implement the following instructions:
• Push – Pushes a value onto the stack.
• Add – Adds the two values on the top of the stack, removing the old values and pushing their sum onto the stack.
• Mul – Same as Add, but multiplies the top 2 values.
• Sub – Same as Add, but subtracts the top value from the other value.
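The instruction semantics listed above can be sketched as a list-based reference model in plain Haskell (not Clash, and separate from the framework the assignment provides). The names `exec`, `binop` and `run` are hypothetical, values are modelled as `Int`, the head of the list stands for the top of the stack, and the behaviour on an underflowing stack is an assumption, since the handout does not specify it.

```haskell
import Data.List (mapAccumL)

-- Instruction names follow the list above; Int stands in for the value type.
data Instr = Push Int | Add | Mul | Sub deriving (Show, Eq)

-- Execute one instruction: returns the new stack and the output of this cycle.
exec :: [Int] -> Instr -> ([Int], Int)
exec st (Push x) = (x : st, x)
exec st Add      = binop (+) st
exec st Mul      = binop (*) st
exec st Sub      = binop (-) st

-- Pop two values, push the result. The top value is the right operand,
-- so Sub computes (other value) - (top value).
binop :: (Int -> Int -> Int) -> [Int] -> ([Int], Int)
binop f (top : next : rest) = (r : rest, r)
  where r = f next top
binop _ st = (st, 0)  -- assumption: on underflow keep the stack, output 0

-- Run a program starting from an empty stack, one output per instruction.
run :: [Instr] -> [Int]
run = snd . mapAccumL exec []
```

In this model, `run [Push 7, Push 3, Sub]` evaluates to `[7,3,4]`. A fixed-size stack with a stack pointer behaves differently at its limits, so the model is only useful for checking the arithmetic and the operand order.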
The code fragment below gives you some pre-defined data types to work with, as well as an operator that you can use to easily write to the stack.
PLEASE NOTE: the NOINLINE directive is used for some functions. This tells Clash to generate separate VHDL files for these functions when compiling, which is useful to abstract the RTL view a bit in case of larger hardware structures.
import Clash.Prelude

type Value = Signed 16
type Stack = Vec 8 Value

-- operator to edit the stack, usage: stack <~ (index, value)
-- returns a new stack with its value at `index` changed to `value`
(<~) :: Stack -> (Value, Value) -> Stack
xs <~ (i,a) = replace i a xs
{-# NOINLINE (<~) #-}

-- operator to fetch the i'th element from a stack
(~>) :: Stack -> Value -> Value
xs ~> i = xs!!i
{-# NOINLINE (~>) #-}

-- Instruction datatype
data Instr = Push Value  -- Pushes a value onto the stack
           | Add         -- Adds the values at the top of the stack
           | Mul         -- Multiplies ""
           | Sub         -- Subtracts ""
  deriving (Show, Generic, ShowX, NFDataX)
1. First, design your system using the given framework. Verify your design by providing a test program, its expected outputs and the actual outputs in your report. Also include the results of the following test program:
[Push 45, Push 2, Add, Push 3, Mul, Push 100, Sub, Push 0, Push 1, Sub]
Additionally, show the RTL schematic of your processor. [2pt]
2. You might have noticed when looking at the RTL view of your processor that some rather large multiplexer logic is created, especially inside the non-inlined blocks for the ~> and <~ operators. Implement your own version of replace (the function used by <~) using higher-order functions and with the following type signature:
myReplace :: (KnownNat n) => Int -> a -> Vec n a -> Vec n a
And test it using the following test (show output in your report):
myReplace 2 5 $ generate d4 (*2) 2
Explain using your implementation of myReplace why a solution such as this one synthesizes to a relatively large piece of hardware. (You don’t have to synthesize it.) [1pt]
3. So far, we’ve seen that the solution using replace will give us the correct output, but is not optimal in terms of hardware size. Think of a way to optimize your solution in terms of hardware size. Implement it in Clash and show the RTL schematic. Explain what you changed and how this affects the synthesized hardware with respect to your previous solution. [3pt]
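As background for the question about replace in item 2 above, index-based replacement can be illustrated on ordinary Haskell lists with a higher-order function. This is a sketch only: `replaceAt` is a hypothetical name, it works on lists rather than on `Vec`, and it does not have the type signature required for myReplace.

```haskell
-- Replace the element at position i with a, leaving all other elements as is.
-- Every position compares its own index with i and then selects one of two
-- values, so the amount of comparing and selecting grows with the length.
replaceAt :: Int -> a -> [a] -> [a]
replaceAt i a = zipWith pick [0 ..]
  where
    pick j x = if j == i then a else x
```

For example, `replaceAt 1 9 [10, 20, 30]` evaluates to `[10,9,30]`. An index outside the list leaves it unchanged in this sketch.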