Implementation Details
Stage 1 – Data Retrieval
In Stage 1, you will implement the basic functionality for a dictionary allowing the lookup
of data by key (trading_name).
Your Makefile should produce an executable program called dict1. This program should
take three command line arguments:
1. The first argument will be the stage. For this part, the value will always be 1 (this will vary in Assignment 2).
2. The second argument will be the filename of the data file.
3. The third argument will be the filename of the output file.
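The argument handling above could be sketched as follows. This is only an illustration: struct arguments, parse_arguments and EXPECTED_ARGC are hypothetical names that do not come from the assignment, and your own main function may organise this differently.

```c
#include <stdlib.h>

/* Program name plus stage, data file and output file. */
#define EXPECTED_ARGC 4
#define DECIMAL_BASE 10

/* Hypothetical container for the three command line arguments. */
struct arguments {
    int stage;
    const char *data_filename;
    const char *output_filename;
};

/* Fills *args from argv. Returns 1 on success, or 0 if the argument count
 * is wrong or the stage is not a whole number. */
int parse_arguments(int argc, char **argv, struct arguments *args) {
    char *end = NULL;
    long stage = 0;

    if (argc != EXPECTED_ARGC) {
        return 0;
    }

    stage = strtol(argv[1], &end, DECIMAL_BASE);
    if (end == argv[1] || *end != '\0') {
        return 0;
    }

    args->stage = (int) stage;
    args->data_filename = argv[2];
    args->output_filename = argv[3];
    return 1;
}
```

Checking the arguments before opening any files follows the "fail fast" advice in the style guide.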
Your dict1 program should:
Read the data from the data file specified in the second command line argument. The data from the CSV should be stored in a linked list of pointers to structs for the data. Datatypes for each field should be consistent with those in the *Dataset* slide. Each record (row) should be stored in a separate node.
Accept trading_names from stdin, search the list for all matching records and print them to the output file. If you like, you may assume that all queries will be terminated by a new line, which allows entries with empty trading_names to be queried. If no matches for the query are found, your program should output NOTFOUND. trading_name will be the value at index 7 (0-based) in the row, i.e. the eighth column.
In addition to outputting the record(s) to the output file, the number of records found should be output to stdout.
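One possible shape for the linked list described above is sketched here. The record is deliberately reduced to two fields for brevity; a real struct would have one member per column, typed as in the *Dataset* slide. All names (struct record, struct node, struct list, list_create, list_append, list_free) are hypothetical.

```c
#include <stdlib.h>

/* Reduced record for illustration only; the full struct would hold
 * every column of the CSV. */
struct record {
    char *trading_name;
    int number_of_seats;
};

/* Each node holds a pointer to the record for one CSV row. */
struct node {
    struct record *data;
    struct node *next;
};

/* Keeping a tail pointer lets rows be appended in file order
 * without walking the list each time. */
struct list {
    struct node *head;
    struct node *tail;
    int length;
};

/* Returns an empty list, or NULL if allocation fails. */
struct list *list_create(void) {
    struct list *list = malloc(sizeof(*list));
    if (list != NULL) {
        list->head = NULL;
        list->tail = NULL;
        list->length = 0;
    }
    return list;
}

/* Appends a record pointer. Returns 1 on success, 0 if malloc fails. */
int list_append(struct list *list, struct record *data) {
    struct node *new_node = malloc(sizeof(*new_node));
    if (new_node == NULL) {
        return 0;
    }
    new_node->data = data;
    new_node->next = NULL;

    if (list->tail == NULL) {
        list->head = new_node;
    } else {
        list->tail->next = new_node;
    }
    list->tail = new_node;
    list->length++;
    return 1;
}

/* Frees the nodes and the list itself; the records are left to the
 * caller in this sketch. */
void list_free(struct list *list) {
    struct node *current = list->head;
    while (current != NULL) {
        struct node *next = current->next;
        free(current);
        current = next;
    }
    free(list);
}
```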
For modularity reasons, to assist with Assignment 2, your approach should ideally:
Store information about the result of the search itself independently from the structure of the list, e.g. your approach should make it simple to add additional search-related information to the result of the query.
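As a rough sketch of this separation, the search below returns its findings in a dedicated result struct instead of handing back list nodes. The names struct search_result, list_search, result_create, result_add and result_free are assumptions made for this example, and the record is cut down to a single field. Extra search-related information could later be added as new members of struct search_result without touching the list code.

```c
#include <stdlib.h>
#include <string.h>

#define INITIAL_CAPACITY 4
#define GROWTH_FACTOR 2

/* Minimal stand-ins for the list types, so the sketch is self-contained. */
struct record {
    char *trading_name;
};

struct node {
    struct record *data;
    struct node *next;
};

/* Everything known about one query lives here, not in the list. */
struct search_result {
    struct record **matches;
    int num_matches;
    int capacity;
};

/* Returns an empty result, or NULL if allocation fails. */
struct search_result *result_create(void) {
    struct search_result *result = malloc(sizeof(*result));
    if (result != NULL) {
        result->matches = NULL;
        result->num_matches = 0;
        result->capacity = 0;
    }
    return result;
}

void result_free(struct search_result *result) {
    free(result->matches);
    free(result);
}

/* Records one match, growing the array when full. Returns 0 on failure. */
int result_add(struct search_result *result, struct record *match) {
    if (result->num_matches == result->capacity) {
        int new_capacity = result->capacity == 0
            ? INITIAL_CAPACITY : result->capacity * GROWTH_FACTOR;
        struct record **grown = realloc(result->matches,
            (size_t) new_capacity * sizeof(struct record *));
        if (grown == NULL) {
            return 0;
        }
        result->matches = grown;
        result->capacity = new_capacity;
    }
    result->matches[result->num_matches++] = match;
    return 1;
}

/* Collects every record whose trading_name equals key, in list order.
 * Returns NULL only if memory runs out. */
struct search_result *list_search(struct node *head, const char *key) {
    struct search_result *result = result_create();
    struct node *current = head;

    if (result == NULL) {
        return NULL;
    }
    while (current != NULL) {
        if (strcmp(current->data->trading_name, key) == 0
            && !result_add(result, current->data)) {
            result_free(result);
            return NULL;
        }
        current = current->next;
    }
    return result;
}
```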
For testing, it may be convenient to create a file of keys to be searched, one per line, and redirect the input from this file. Use the UNIX operator < to redirect input from a file.
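Whether keys are typed or redirected from a file, the program sees them on stdin one line at a time. The sketch below shows one way to read a single query and strip its line ending; read_query and MAX_QUERY_LENGTH are hypothetical, and the length limit is an assumption rather than something the assignment specifies.

```c
#include <stdio.h>
#include <string.h>

/* Assumed upper bound on the length of one query line. */
#define MAX_QUERY_LENGTH 512

/* Reads one line from input into query and removes the trailing
 * newline (and carriage return, if present). Returns 1 if a line
 * was read, 0 at end of input. An empty line yields an empty query. */
int read_query(FILE *input, char *query, size_t size) {
    if (fgets(query, (int) size, input) == NULL) {
        return 0;
    }
    query[strcspn(query, "\r\n")] = '\0';
    return 1;
}
```

A typical caller would loop with while (read_query(stdin, query, sizeof(query))) and run one search per iteration. Lines longer than the buffer would be split across calls in this sketch.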
Example Execution
An example execution of the program might be:
make -B dict1
# ./dict1 stage datafile outputfile
./dict1 1 dataset_1000.csv output.txt
followed by typing in the keys, line by line, and then ending input once all keys are entered or:
make -B dict1
# ./dict1 stage datafile outputfile
./dict1 1 dataset_1000.csv output.txt < queryfile
Example Output
This is an example of what might be output to the output file after two queries:
McDonald's
--> census_year: 2020 || block_id: 710 || property_id: 108624
|| base_property_id: 108624 || building_address: 407A St Kilda
Road MELBOURNE VIC 3004 || clue_small_area: Melbourne
(Remainder) || business_address: 407A St Kilda Road MELBOURNE
VIC 3004 || trading_name: McDonald's || industry_code: 4511 ||
industry_description: Cafes and Restaurants || seating_type:
Seats - Outdoor || number_of_seats: 10 || longitude: 144.97618
|| latitude: -37.83615 ||
--> census_year: 2020 || block_id: 710 || property_id: 108624
|| base_property_id: 108624 || building_address: 407A St Kilda
Road MELBOURNE VIC 3004 || clue_small_area: Melbourne
(Remainder) || business_address: 407A St Kilda Road MELBOURNE
VIC 3004 || trading_name: McDonald's || industry_code: 4511 ||
industry_description: Cafes and Restaurants || seating_type:
Seats - Indoor || number_of_seats: 34 || longitude: 144.97618
|| latitude: -37.83615 ||
--> census_year: 2020 || block_id: 731 || property_id: 110377
|| base_property_id: 110377 || building_address: 305-331 City
Road SOUTHBANK VIC 3006 || clue_small_area: Southbank ||
business_address: 305-331 City Road SOUTHBANK VIC 3006 ||
trading_name: McDonald's || industry_code: 4511 ||
industry_description: Cafes and Restaurants || seating_type:
Seats - Indoor || number_of_seats: 50 || longitude: 144.95791
|| latitude: -37.82824 ||
Standing Room
With the following output to stdout:
McDonald's --> 3
Standing Room --> NOTFOUND
To process a CSV file, it is usually easiest to read the entire line and then read through the characters one at a time. To help with this, a simple state machine has been provided: arrows show state changes and the conditions on those changes, diamonds show actions and squares show where the process starts. You might find it useful to have your program track whether it is in the pink state or the blue state. Your program isn't required to follow this pattern, but it may allow your code to be simpler.
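The character-by-character approach to CSV lines can be sketched with a two-state machine. The state diagram itself is not reproduced in this text, so the mapping of its pink and blue states onto the OUTSIDE_QUOTES and INSIDE_QUOTES states below is an assumption, as are the function name split_csv_line and the convention that a doubled quote inside a quoted field stands for one literal quote.

```c
#include <stddef.h>

/* The two states: reading a plain field, or reading inside "quotes". */
enum parse_state { OUTSIDE_QUOTES, INSIDE_QUOTES };

/* Splits line in place into at most max_fields fields. Commas inside
 * double quotes do not end a field, the surrounding quotes are dropped,
 * and "" inside a quoted field becomes a single ". Parsing stops at the
 * end of the line. Returns the number of fields stored in fields. */
int split_csv_line(char *line, char **fields, int max_fields) {
    enum parse_state state = OUTSIDE_QUOTES;
    char *read = line;
    char *write = line;
    int num_fields = 0;

    if (max_fields < 1) {
        return 0;
    }
    fields[num_fields++] = write;

    while (*read != '\0' && *read != '\n' && *read != '\r') {
        if (state == OUTSIDE_QUOTES) {
            if (*read == '"') {
                state = INSIDE_QUOTES;
            } else if (*read == ',') {
                /* End the current field and start the next one. */
                *write++ = '\0';
                if (num_fields == max_fields) {
                    return num_fields;
                }
                fields[num_fields++] = write;
            } else {
                *write++ = *read;
            }
        } else {
            if (*read == '"' && read[1] == '"') {
                /* Escaped quote: keep one and skip the other. */
                *write++ = '"';
                read++;
            } else if (*read == '"') {
                state = OUTSIDE_QUOTES;
            } else {
                *write++ = *read;
            }
        }
        read++;
    }
    *write = '\0';
    return num_fields;
}
```

Because the output is never longer than the input, the fields can be written back into the same buffer. Each fields[i] then points into line, so numeric columns would still need converting and strings would still need copying into the record.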
Requirements
The following implementation requirements must be adhered to:
You must write your implementation in the C programming language.
You must write your code in a modular way, so that your implementation could be used in another program without extensive rewriting or copying. This means that the dictionary operations are kept together in a separate .c file, with its own header (.h) file, separate from the main program, and other distinct modules are similarly separated into their own sections, requiring as little knowledge of the internal details of each other as possible.
Your code should be easily extensible to multiple dictionaries. This means that the functions for interacting with your dictionary should take as arguments not only the values needed to perform the operation, but also a pointer to a particular dictionary, e.g. search(dictionary, value).
Your implementation must read the input file once only.
Your program should store strings in a space-efficient manner. If you are using malloc() to create the space for a string, remember to allow space for the final end-of-string character, '\0' (NUL).
A full Makefile is not provided for you. The Makefile should direct the compilation of your program. To use the Makefile, make sure it is in the same directory as your code, and type make dict1 to make the program. You must submit your Makefile with your assignment.
• If you haven’t used make before, try it on simple programs first. If it doesn’t work, read the error messages carefully. A common problem in compiling multifile executables is in the included header files. Note also that the whitespace before the command is a tab, and not multiple spaces.
• It is not a good idea to code your program as a single file and then try to break it down into multiple files. Start by using multiple files, with minimal content, and make sure they are communicating with each other before starting more serious coding.
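To make two of the requirements above concrete, here is a small sketch in which every dictionary function takes a pointer to the dictionary it operates on, and each stored string occupies exactly strlen + 1 bytes. It is an illustration only: struct dictionary, struct entry, copy_string, dict_create, dict_insert, dict_count and dict_free are hypothetical names, and the entry holds just a key. In a real submission the declarations would live in a header (.h) file and the definitions in the matching .c file.

```c
#include <stdlib.h>
#include <string.h>

struct entry {
    char *key;
    struct entry *next;
};

struct dictionary {
    struct entry *head;
};

/* Copies source into a buffer of exactly strlen(source) + 1 bytes;
 * the extra byte holds the terminating '\0'. */
char *copy_string(const char *source) {
    size_t length = strlen(source);
    char *copy = malloc(length + 1);
    if (copy != NULL) {
        memcpy(copy, source, length + 1);
    }
    return copy;
}

/* Returns an empty dictionary, or NULL if allocation fails. */
struct dictionary *dict_create(void) {
    struct dictionary *dict = malloc(sizeof(*dict));
    if (dict != NULL) {
        dict->head = NULL;
    }
    return dict;
}

/* Stores a private copy of key in dict. Inserts at the head for brevity,
 * so this sketch does not preserve insertion order. Returns 0 on failure. */
int dict_insert(struct dictionary *dict, const char *key) {
    struct entry *new_entry = malloc(sizeof(*new_entry));
    if (new_entry == NULL) {
        return 0;
    }
    new_entry->key = copy_string(key);
    if (new_entry->key == NULL) {
        free(new_entry);
        return 0;
    }
    new_entry->next = dict->head;
    dict->head = new_entry;
    return 1;
}

/* Counts the entries in dict whose key equals key. */
int dict_count(const struct dictionary *dict, const char *key) {
    const struct entry *current = dict->head;
    int count = 0;
    while (current != NULL) {
        if (strcmp(current->key, key) == 0) {
            count++;
        }
        current = current->next;
    }
    return count;
}

/* Frees every entry, its key and the dictionary itself. */
void dict_free(struct dictionary *dict) {
    struct entry *current = dict->head;
    while (current != NULL) {
        struct entry *next = current->next;
        free(current->key);
        free(current);
        current = next;
    }
    free(dict);
}
```

Because no function relies on a global list, two dictionaries can exist side by side without interfering with each other.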
Programming Style
Below is a style guide which assignments are evaluated against. For this subject, the 80 character limit is a guideline rather than a rule — if your code exceeds this limit, you should consider whether your code would be more readable if you instead rearranged it.
/** ***********************
* C Programming Style for Engineering Computation
* Created by Aidan Nagorcka-Smith 13/03/2011
*
* Definitions and includes
* Definitions are in UPPER_CASE
* Includes go before definitions
* Space between includes, definitions and the main function.
* Use definitions for any constants in your program, do not just write them
* in.
*
* Tabs may be set to 4-spaces or 8-spaces, depending on your editor. The code
* below is "gnu" style. If your editor has "bsd" it will follow the 8-space
* style. Both are very standard.
*/

/* GOOD: */
#include <stdio.h>
#include <stdlib.h>

#define MAX_STRING_SIZE 1000
#define DEBUG 0

int main(int argc, char **argv) {
    ...
}

/* BAD: */
/* Definitions and includes are mixed up */
#include <stdlib.h>
#define MAX_STRING_SIZE 1000
/* Definitions are given names like variables */
#define debug 0
#include <stdio.h>
/* No spacing between includes, definitions and main function */
int main(int argc, char **argv) {
    ...
}

/** *****************************
* Variables
* Give them useful lower_case names or camelCase. Either is fine,
* as long as you are consistent and always apply the same style.
* Initialise them to something that makes sense.
*/

/* GOOD: lower_case */
int main(int argc, char **argv) {

    int i = 0;
    int num_fifties = 0;
    int num_twenties = 0;
    int num_tens = 0;
    ...
}

/* GOOD: camelCase */
int main(int argc, char **argv) {

    int i = 0;
    int numFifties = 0;
    int numTwenties = 0;
    int numTens = 0;
    ...
}

/* BAD: */
int main(int argc, char **argv) {

    /* Variable not initialised - causes a bug because we didn't remember to
     * set it before the loop */
    int i;
    /* Variable in all caps - we'll get confused between this and constants */
    int NUM_FIFTIES = 0;
    /* Overly abbreviated variable names make things hard. */
    int nt = 0;

    while (i < 10) {
        ...
    }
}

/** ********************
* Spacing:
* Space intelligently, vertically to group blocks of code that are doing a
* specific operation, or to separate variable declarations from other code.
* One tab of indentation within either a function or a loop.
* Spaces after commas.
* Space between ) and {.
* No space between the ** and the argv in the definition of the main
* function.
* When declaring a pointer variable or argument, you may place the asterisk
* adjacent to either the type or to the variable name.
* Lines at most 80 characters long.
* Closing brace goes on its own line
*/

/* GOOD: */
int main(int argc, char **argv) {

    int i = 0;

    for(i = 100; i >= 0; i--) {
        if (i > 0) {
            printf("%d bottles of beer, take one down and pass it around,"
                   " %d bottles of beer.\n", i, i - 1);
        } else {
            printf("%d bottles of beer, take one down and pass it around."
                   " We're empty.\n", i);
        }
    }

    return 0;
}

/* BAD: */
/* No space after commas
 * Space between the ** and argv in the main function definition
 * No space between the ) and { at the start of a function */
int main(int argc,char ** argv){
int i = 0;
/* No space between variable declarations and the rest of the function.
 * No spaces around the operators */
for(i=100;i>=0;i--) {
/* No indentation */
if (i > 0) {
/* Line too long */
printf("%d bottles of beer, take one down and pass it around, %d bottles of beer.\n", i, i - 1);
} else {

/* Spacing for no good reason. */

printf("%d bottles of beer, take one down and pass it around."
" We're empty.\n", i);

}
}
/* Closing brace not on its own line */
return 0;}

/** ****************
* Braces:
* Opening braces go on the same line as the loop or function name
* Closing braces go on their own line
* Closing braces go at the same indentation level as the thing they are
* closing
*/

/* GOOD: */
int main(int argc, char **argv) {
    ...
    for(...) {
        ...
    }

    return 0;
}

/* BAD: */
int main(int argc, char **argv) {
    ...
    /* Opening brace on a different line to the for loop open */
    for(...)
    {
        ...
    /* Closing brace at a different indentation to the thing it's closing */
            }

    /* Closing brace not on its own line. */
    return 0;}

/** **************
* Commenting:
* Each program should have a comment explaining what it does and who created
* it.
* Also comment how to run the program, including optional command line
* parameters.
* Any interesting code should have a comment to explain itself.
* We should not comment obvious things - write code that documents itself
*/

/* GOOD: */
/* change.c
*
* Created by Aidan Nagorcka-Smith 13/03/2011
*
* Print the number of each coin that would be needed to make up some change
* that is input by the user
*
* To run the program type:
* ./coins --num_coins 5 --shape_coins trapezoid --output blabla.txt
*
* To see all the input parameters, type:
* ./coins --help
* Options::
*   --help             Show help message
*   --num_coins arg    Input number of coins
*   --shape_coins arg  Input coins shape
*   --bound arg (=1)   Max bound on xxx, default value 1
*   --output arg       Output solution file
*/
int main(int argc, char **argv) {

    int input_change = 0;

    printf("Please input the value of the change (0-99 cents inclusive):\n");
    scanf("%d", &input_change);
    printf("\n");

    // Valid change values are 0-99 inclusive.
    if(input_change < 0 || input_change > 99) {
        printf("Input not in the range 0-99.\n");
    }
    ...
}

/* BAD: */
/* No explanation of what the program is doing */
int main(int argc, char **argv) {

    /* Commenting obvious things */
    /* Create a int variable called input_change to store the input from the
     * user. */
    int input_change;
    ...
}

/** ****************
* Code structure:
* Fail fast - input checks should happen first, then do the computation.
* Structure the code so that all error handling happens in an easy to read
* location
*/

/* GOOD: */
if (input_is_bad) {
    printf("Error: Input was not valid. Exiting.\n");
    exit(EXIT_FAILURE);
}

/* Do computations here */
...

/* BAD: */
if (input_is_good) {
    /* lots of computation here, pushing the else part off the screen. */
    ...
} else {
    fprintf(stderr, "Error: Input was not valid. Exiting.\n");
    exit(EXIT_FAILURE);
}

Some automatic evaluations of your code style may be performed where they are reliable. Determining whether these style-related issues are occurring sometimes involves non-trivial (and sometimes even undecidable) calculations, so a simpler and more error-prone (but highly successful) solution is used. You may need to add a comment to identify incorrectly flagged cases, so check any failing test outputs for instructions on how to resolve them.
Submission
Your C code files (including your Makefile and any other files needed to run your code) should be submitted through Ed to this assignment. Your programs must compile and run correctly on Ed. You may have developed your program in another environment, but it still must run on Ed at submission time. For this reason, and because there are often small, but significant, differences between compilers, it is suggested that if you are working in a different environment, you upload and test your code on Ed at reasonably frequent intervals.
Programming Help
A common reason for programs not to compile is that a file has been inadvertently omitted from the submission. Please check your submission, and resubmit all files if necessary.
Assessment & Ungrading
There are a total of 10 marks given for this assignment.
Your C program will be marked on the basis of accuracy, readability, and good C programming structure, safety and style, including documentation (2 marks). Safety refers to checking whether opening a file succeeded, whether each malloc() call returned usable memory, etc. The documentation should explain all major design decisions, and should be formatted so that it does not interfere with reading the code. As much as possible, try to make your code self-documenting by choosing descriptive variable names.
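As an illustration of the kind of safety checks meant here, the helpers below wrap malloc() and fopen() so that a failure is never silently ignored. The names checked_malloc and open_or_report are hypothetical, and whether to exit immediately or return an error to the caller is a design choice left to you.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocates size bytes, or prints an error and exits if malloc fails. */
void *checked_malloc(size_t size) {
    void *memory = malloc(size);
    if (memory == NULL) {
        fprintf(stderr, "Error: could not allocate %lu bytes. Exiting.\n",
                (unsigned long) size);
        exit(EXIT_FAILURE);
    }
    return memory;
}

/* Opens filename in the given mode. On failure, reports the reason on
 * stderr and returns NULL so the caller can clean up before exiting. */
FILE *open_or_report(const char *filename, const char *mode) {
    FILE *file = fopen(filename, mode);
    if (file == NULL) {
        perror(filename);
    }
    return file;
}
```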
It is common to lose marks for modularity and efficiency because the *Requirements* slide was not adequately read. Make sure to double check the requirements at intervals while writing your program, and check Ed regularly. You will likely find some requirements easier to understand once you have finished your initial program plan and made a start on programming it.
One mark this semester is for your reflections on your learning.
The remainder of the marks will be based on the correct functioning of your submission.
Note that these marks for correct functioning will be based on passing various tests. If your program passes these tests without addressing the learning outcomes (e.g. if you fully hard-code solutions or otherwise deliberately exploit the test cases), you may receive fewer marks than the tests suggest, but your marks will otherwise be determined by test cases.
Note that code style will be manually marked to provide you with the most meaningful feedback for the second assignment.
Self-evaluation & Submission Certificate are a hurdle. Not providing them will earn 0 points, regardless of the quality of your solution.