COMP2041 9044 23T2 Assignment 2 Sheepy

Assignment 2: Sheepy
version: 1.4 last updated: 2023-07-22 18:00

This assignment aims to give you

practice in Python programming generally

experience in translating between complex formats with Python

a clearer understanding of Shell syntax & semantics

an introduction to Python syntax & semantics

Introduction
Your task in this assignment is to write a POSIX Shell Transpiler.

Generally, compilers take a high-level language as input and output assembler, which can then be directly executed.

A Transpiler (or Source-to-Source Compiler) takes a high-level language as input and outputs a different high-level language.

Your transpiler will take Shell scripts as input and output Python.

Such a translation is useful because programmers sometimes convert Shell scripts to Python.

Most commonly this is done because extra functionality is needed, e.g. a GUI.

And this functionality is much easier to implement in Python.

Your task in this assignment is to automate this conversion.

You must write a Python program that takes as input a Shell script and outputs an equivalent Python program.

The translation of some POSIX Shell code to Python is straightforward.

The translation of other Shell code is difficult or infeasible.

So your program will not be able to translate all Shell code to Python.

But a tool that performs only a partial translation of shell to Python could still be very useful.

You should assume the Python code output by your program will be subsequently read and modified by humans.

In other words, you have to output readable Python code.

For example, you should aim to preserve variable names and comments.

Your compiler must be written in Python.

You must call your Python program sheepy.py .
It will be given a single command line argument: the path to a Shell script.

It should output, to standard output, the equivalent Python code.

For example:

$ cat gcc.sh
#!/bin/dash

for c_file in *.c
do
    gcc -c $c_file
done
$ ./sheepy.py gcc.sh
#!/usr/bin/python3 -u

import glob, subprocess

for c_file in sorted(glob.glob("*.c")):
    subprocess.run(["gcc", "-c", c_file])

If you look carefully at the example above you will notice the Python code does not have exactly the same semantics as the shell code.

If there are no .c files in the current directory, the for loop in the shell program executes once and tries to compile a non-existent file named *.c , whereas the Python for loop does not execute.

And if the file name contains spaces, the shell code will pass it as multiple arguments to gcc but the Python code will pass it as a single argument. In other words, the shell breaks but the Python works.
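The two differences described above can be observed directly in Python. The following is a small sketch, not part of the assignment; the temporary directory and the file name `my file.c` are made up for illustration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    # No .c files exist yet, so glob.glob() returns an empty list and a for
    # loop over it never runs (the Shell loop would run once with "*.c").
    no_matches = glob.glob(os.path.join(directory, "*.c"))

    # A file name containing a space stays a single list element in Python,
    # so it would reach gcc as a single argument.
    open(os.path.join(directory, "my file.c"), "w").close()
    names = [os.path.basename(path)
             for path in glob.glob(os.path.join(directory, "*.c"))]

print(no_matches)
print(names)
```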

This is a general issue with translating Shell to Python.

In many cases, the natural translation of the shell code will have slightly different semantics.

For some purposes, it might be desirable to produce more complex Python code that matches the semantics exactly.

For example:

#!/usr/bin/python3 -u

import glob, subprocess

if glob.glob("*.c"):
    for c_file in sorted(glob.glob("*.c")):
        subprocess.run(["gcc", "-c"] + c_file.split())
else:
    subprocess.run(["gcc", "-c", "*.c"])

This is not desirable for our purposes.

Our goal is to produce the clearest, most human-readable code, so the first (simpler) translation is more desirable.

The shell features you need to implement are described below as a series of subsets.

It is suggested you tackle the subsets in the order listed, but this is not required.

The echo builtin is used to output a string to stdout.
For example:

Shell:

#!/bin/dash

echo hello world
echo 42 is the meaning of life, the universe, and everything
echo To be or not to be: that is the question

Possible Python translation:

#!/usr/bin/python3 -u

print("hello world")
print("42 is the meaning of life, the universe, and everything")
print("To be or not to be: that is the question")

The echo builtin outputs each argument separated by a single space.

echo Hello      World

is output as

Hello World
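One way to get this behaviour is to split the line on whitespace and rejoin the arguments with single spaces. This is a minimal sketch for plain words only (no variables, quotes or globs); the function name `translate_echo` is hypothetical, and `json.dumps` is used here only as a convenient way to produce a double-quoted string literal.

```python
import json

def translate_echo(line):
    """Translate a plain `echo` line (no variables, quotes or globs) to print()."""
    arguments = line.split()[1:]   # split() discards runs of whitespace; drop "echo"
    text = " ".join(arguments)     # echo joins its arguments with single spaces
    return f"print({json.dumps(text)})"

print(translate_echo("echo Hello      World"))
```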

The = operator is used to assign a value to a variable.
For example:

Shell:

#!/bin/dash

foo=hello
bar=world
course_code=COMP2041
AssignmentName=Sheepy

Possible Python translation:

#!/usr/bin/python3 -u

foo = "hello"
bar = "world"
course_code = "COMP2041"
AssignmentName = "Sheepy"

Remember that the Shell assignment operator does not allow spaces around the = , whereas Python allows spaces around the = operator.

As you should aim to produce readable Python code you should include spaces around the = operator.

Remember that Shell doesn’t have integers, so all values are strings.

The $ operator is used to access the value of a variable.
For example:

Shell:

#!/bin/dash

theAnswer=42
echo The meaning of life, the universe, and everything is $theAnswer

name=COMP2041
echo I hope you are enjoying $name this semester

H=Hello
W=World
echo $H, $W

P1=race
P2=car
palindrome=$P1$P2
echo $palindrome

Possible Python translation:

#!/usr/bin/python3 -u

theAnswer = "42"
print(f"The meaning of life, the universe, and everything is {theAnswer}")

name = "COMP2041"
print(f"I hope you are enjoying {name} this semester")

H = "Hello"
W = "World"
print(f"{H}, {W}")

P1 = "race"
P2 = "car"
palindrome = f"{P1}{P2}"
print(f"{palindrome}")

There may be multiple correct ways to translate a Shell script.

Again: remember that Shell doesn’t have integers, so all values are strings.

This will simplify your translation immensely.

Care needs to be taken not to redefine Python functions and keywords.

The shell code

print=Oops

could result in Python code that makes the print function inaccessible.

Similarly, a shell assignment to a variable named pass would result in a syntax error, as pass is a Python keyword and cannot be redefined.
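A transpiler can detect both kinds of clash with the standard `keyword` and `builtins` modules. The sketch below is one possible approach; the function name `safe_name` and the choice of appending an underscore are assumptions for illustration, not something the assignment prescribes.

```python
import builtins
import keyword

def safe_name(name):
    """Return a Python variable name to use for a Shell variable name."""
    # keyword.iskeyword() catches names such as pass; hasattr(builtins, ...)
    # catches names such as print that would shadow a builtin function.
    if keyword.iskeyword(name) or hasattr(builtins, name):
        return name + "_"   # assumed renaming convention
    return name

print(safe_name("pass"), safe_name("print"), safe_name("course_code"))
```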

The # operator is used to start a comment.
For example:

Shell:

#!/bin/dash

# This is a comment

echo hello world # This is also a comment

Possible Python translation:

#!/usr/bin/python3 -u

# This is a comment

print("hello world") # This is also a comment

As comments do not affect the output of your program, your program will work correctly if you simply remove all comments.

However, this will be penalized during manual marking; your generated Python program should contain all the comments from the input shell script.

Remember that Shell comments start with a # character and continue to the end of the line.
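Because a comment always runs to the end of the line, it can be split off before the rest of the line is translated and then re-attached to the generated Python. The following is a minimal sketch that ignores quoting, so it is only adequate for the earliest subsets; the function name `split_comment` is hypothetical. It treats a # as starting a comment only at the start of the line or after a space, so that a later construct such as $# is left alone.

```python
def split_comment(line):
    """Split one Shell line into (code, comment); either part may be empty."""
    if line.lstrip().startswith("#"):
        return "", line.strip()                  # the whole line is a comment
    code, separator, comment = line.partition(" #")
    return code.rstrip(), ("#" + comment if separator else "")

print(split_comment("echo hello world # This is also a comment"))
```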

The * , ? , [ , and ] characters are used in globbing.
For example:

Shell:

#!/bin/dash

echo *

C_files=*.[ch]
echo $C_files

echo all of the single letter Python files are: ?.py

Possible Python translation:

#!/usr/bin/python3 -u

import glob

print(" ".join(sorted(glob.glob("*"))))

C_files = "*.[ch]"
print(" ".join(sorted(glob.glob(C_files))))

print("all of the single letter Python files are: " + " ".join(sorted(glob.glob("?.py"))))

Extra whitespace has been added to the Python code so that the position of each line matches the Shell code

You do not need to add extra whitespace to your Python code.

In this subset globbing characters will only be used for globbing or in a comment.

That is, if you see a * , ? , [ , or ] character it is a globbing character (unless it is in a comment).

glob.glob() can be used to perform globbing in Python.

As shown in the example above, globbing characters aren't expanded when they are assigned to a variable, only when they are used.

This may be difficult to implement, and in most cases expanding globbing characters when they are assigned to a variable is acceptable.
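Since every * , ? , [ , or ] outside a comment is a globbing character in this subset, a simple character check is enough to decide how to translate a word. This is a sketch under that assumption; the function name `word_to_python` is hypothetical, and it returns Python source text rather than performing the glob itself.

```python
import re

GLOB_CHARACTERS = re.compile(r"[*?\[\]]")

def word_to_python(word):
    """Return Python source text for one unquoted Shell word used by echo."""
    if GLOB_CHARACTERS.search(word):
        # Expand at the point of use, sorted to match the Shell's ordering.
        return f'" ".join(sorted(glob.glob("{word}")))'
    return f'"{word}"'

print(word_to_python("*.c"))
print(word_to_python("hello"))
```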

The for , in , do , and done keywords are used to start and end for loops.
For example:

Shell:

#!/bin/dash

for i in 1 2 3
do
    echo $i
done

for word in this is a string
do
    echo $word
done

for file in *.c
do
    echo $file
done

Possible Python translation:

#!/usr/bin/python3 -u

import glob

for i in ["1", "2", "3"]:
    print(i)

for word in ["this", "is", "a", "string"]:
    print(word)

for file in sorted(glob.glob("*.c")):
    print(file)

Extra whitespace has been added to the Python code so that the position of each line matches the Shell code

You do not need to add extra whitespace to your Python code.

In this subset you do not need to handle nested for loops.

The exit builtin is used to exit the shell.
For example:

Shell:

#!/bin/dash

echo hello world
exit
echo this will not be printed
exit 0
echo this will double not be printed
exit 3

Possible Python translation:

#!/usr/bin/python3 -u

import sys

print("hello world")
sys.exit()
print("this will not be printed")
sys.exit(0)
print("this will double not be printed")
sys.exit(3)

sys.exit() can be used to exit the Python program.

The cd builtin is used to change the current working directory.
For example:

Shell:

#!/bin/dash

echo *
cd /tmp
echo *
cd ..
echo *

Possible Python translation:

#!/usr/bin/python3 -u

import glob, os

print(" ".join(sorted(glob.glob("*"))))
os.chdir("/tmp")
print(" ".join(sorted(glob.glob("*"))))
os.chdir("..")
print(" ".join(sorted(glob.glob("*"))))

os.chdir() can be used to change the current working directory.

The read builtin is used to read a line from stdin.
For example:

Shell:

#!/bin/dash

echo What is your name:
read name

echo What is your quest:
read quest

echo What is your favourite colour:
read colour

echo What is the airspeed velocity of an unladen swallow:
read velocity

echo Hello $name, my favourite colour is $colour too.

Possible Python translation:

#!/usr/bin/python3 -u

print("What is your name:")
name = input()

print("What is your quest:")
quest = input()

print("What is your favourite colour:")
colour = input()

print("What is the airspeed velocity of an unladen swallow:")
velocity = input()

print(f"Hello {name}, my favourite colour is {colour} too.")

input() can be used to read a line from stdin.

External Commands
Any line that is not a known builtin, keyword, or other shell syntax should be treated as an external command.

For example:

Shell:

#!/bin/dash

touch test_file.txt
ls -l test_file.txt

for course in COMP1511 COMP1521 COMP2511 COMP2521 # keyword
do
    echo $course
    mkdir $course # external command
    chmod 700 $course # external command
done

Possible Python translation:

#!/usr/bin/python3 -u

import subprocess

subprocess.run(["touch", "test_file.txt"])
subprocess.run(["ls", "-l", "test_file.txt"])

for course in ["COMP1511", "COMP1521", "COMP2511", "COMP2521"]:
    print(f"{course}")
    subprocess.run(["mkdir", course])
    subprocess.run(["chmod", "700", course])

Extra whitespace has been added to the Python code so that the position of each line matches the Shell code

You do not need to add extra whitespace to your Python code.

subprocess.run() can be used to run external commands.
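Generating the argument list for subprocess.run() amounts to translating each word of the command separately. The sketch below handles only literal words and whole-word variables such as $course; the function name `translate_external` is hypothetical, and `json.dumps` is used only to produce double-quoted string literals.

```python
import json

def translate_external(line):
    """Translate a simple external command line into a subprocess.run() call."""
    arguments = []
    for word in line.split():
        if word.startswith("$") and word[1:].isidentifier():
            arguments.append(word[1:])          # variable: pass its value
        else:
            arguments.append(json.dumps(word))  # literal word: quoted string
    return f"subprocess.run([{', '.join(arguments)}])"

print(translate_external("chmod 700 $course"))
```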

Command Line Arguments
The $0 , $1 , $2 , etc. variables are used to access the command line arguments.
For example:

Shell:

#!/bin/dash

echo This program is: $0

file_name=$2
number_of_lines=$5

echo going to print the first $number_of_lines lines of $file_name

Possible Python translation:

#!/usr/bin/python3 -u

import sys

print(f"This program is: {sys.argv[0]}")

file_name = sys.argv[2]
number_of_lines = sys.argv[5]

print(f"going to print the first {number_of_lines} lines of {file_name}")

Only the numbers 0-9 can be used with the $ operator.

Numbers 10 and above require the ${} syntax (see the next section).

The ${} operator is used to access the value of a variable.
For example:

Shell:

#!/bin/dash

string=BAR
echo FOO${string}BAZ

Possible Python translation:

#!/usr/bin/python3 -u

string = "BAR"
print(f"FOO{string}BAZ")

This is similar to the $ operator but allows for strings to be appended immediately after the variable name.

Only simple expressions will be used. Not ${var:default} , ${var#word} , ${var%%word} , etc
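Both $var and ${var} map naturally onto f-string replacement fields. The following is a sketch of one way to do the rewrite with a single regular expression; the function name `to_fstring` is hypothetical, and it deliberately ignores positional parameters, quotes and backslashes.

```python
import re

# ${name}, $name, or a literal brace that must be doubled inside an f-string.
TOKEN = re.compile(r"\$\{(\w+)\}|\$([A-Za-z_]\w*)|([{}])")

def to_fstring(text):
    """Turn Shell text containing $var or ${var} into f-string source text."""
    def replace(match):
        if match.group(3):
            return match.group(3) * 2   # keep a literal { or } in the output
        return "{" + (match.group(1) or match.group(2)) + "}"
    return 'f"' + TOKEN.sub(replace, text) + '"'

print(to_fstring("FOO${string}BAZ"))
print(to_fstring("$H, $W"))
```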

The test builtin is used to test a condition.

os.access() and os.path will be useful for implementing test .

In this subset it is only used as the condition of if statements and while loops.

You do have to handle all test operators, such as = , -eq , -z , -r , -d , etc., except for test's -ef and -t operators, as they don't have a simple translation to Python.
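A table-driven translation keeps the handling of the many test operators manageable. The sketch below covers only a handful of operators and assumes the operands have already been translated into Python source text; the names `UNARY`, `STRING_OPERATORS`, `INTEGER_OPERATORS` and `translate_test` are hypothetical. Integer comparisons need int() because all Shell values are strings.

```python
UNARY = {
    "-r": "os.access({0}, os.R_OK)",
    "-w": "os.access({0}, os.W_OK)",
    "-x": "os.access({0}, os.X_OK)",
    "-d": "os.path.isdir({0})",
    "-f": "os.path.isfile({0})",
    "-e": "os.path.exists({0})",
    "-z": "len({0}) == 0",
    "-n": "len({0}) != 0",
}
STRING_OPERATORS = {"=": "==", "!=": "!="}
INTEGER_OPERATORS = {"-eq": "==", "-ne": "!=", "-lt": "<",
                     "-le": "<=", "-gt": ">", "-ge": ">="}

def translate_test(arguments):
    """Translate the (already translated) arguments of test to a condition."""
    if len(arguments) == 2 and arguments[0] in UNARY:
        return UNARY[arguments[0]].format(arguments[1])
    if len(arguments) == 3 and arguments[1] in STRING_OPERATORS:
        left, operator, right = arguments
        return f"{left} {STRING_OPERATORS[operator]} {right}"
    if len(arguments) == 3 and arguments[1] in INTEGER_OPERATORS:
        left, operator, right = arguments
        return f"int({left}) {INTEGER_OPERATORS[operator]} int({right})"
    raise ValueError(f"unsupported test expression: {arguments}")

print(translate_test(["-w", '"/dev/null"']))
```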

The if , then , elif , else , and fi keywords are used to start and end if statements.
For example:

Shell:

#!/bin/dash

if test -w /dev/null
then
    echo /dev/null is writeable
fi

Possible Python translation:

#!/usr/bin/python3 -u

import os

if os.access("/dev/null", os.W_OK):
    print("/dev/null is writeable")

In this subset you do not need to handle nested if statements.

In this subset the if condition must be a single test expression.

The while , do , and done keywords are used to start and end while loops.
For example:

Shell:

#!/bin/dash

row=1
while test $row != 11111111111
do
    echo $row
    row=1$row
done

Possible Python translation:

#!/usr/bin/python3 -u

row = "1"
while row != "11111111111":
    print(row)
    row = f"1{row}"

In this subset you do not need to handle nested while statements.

In this subset the while condition must be a single test expression.

As we do not have the $(()) operator yet (see subset 4), we cannot increment a loop counter.

Instead we are using string concatenation to “expand” the loop counter.

Single Quotes
The ' character is used to start and end a single-quoted string.
For example:

#!/bin/dash

echo 'hello world'

echo 'This is not a $variable'

echo 'This is not a glob *.sh'

Strings inside single-quotes do not have variables expanded,

do not have globbing characters expanded,

and do not have whitespace removed.

Double Quotes
The " character is used to start and end a double-quoted string.
For example:

#!/bin/dash

echo "hello world"

echo "This is still a $variable"

echo "This is not a glob *.sh"

Strings inside double-quotes do have variables expanded,

but do not have globbing characters expanded,

and do not have whitespace removed.

A command substitution can be started and ended with a ` character (backtick).

For example:

#!/bin/dash

date=`date +%Y-%m-%d`

echo Hello `whoami`, today is $date

echo "command substitution still works in double quotes: `hostname`"

echo 'command substitution does not work in single quotes: `not a command`'

Backticks will not be nested.

Backticks are the original syntax for command substitution.

The new and better syntax for command substitution $() is in subset 4.
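In Python, the output of a command can be captured with subprocess.run() using its capture_output and text parameters. The helper below is a sketch of what generated code might call; the name `capture` is hypothetical. It strips trailing newlines, as Shell command substitution does, and runs the Python interpreter itself as the example command so that it works on any system.

```python
import subprocess
import sys

def capture(arguments):
    """Run a command and return its stdout with trailing newlines removed."""
    result = subprocess.run(arguments, capture_output=True, text=True)
    return result.stdout.rstrip("\n")

# Stand-in for something like `whoami`; sys.executable is used for portability.
greeting = capture([sys.executable, "-c", "print('hello')"])
print(f"captured: {greeting}")
```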

The -n flag for echo tells it not to print a newline at the end of the output.
For example:

#!/bin/dash

echo -n "How many? "

echo only accepts a single flag, -n , and it must be the first argument.

-n at any other location, or any other flag, should be treated as a normal string.
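Python's print() takes an end parameter, which gives a direct translation of the flag. A minimal sketch, assuming the arguments have already been split and unquoted; the function name `translate_echo_arguments` is hypothetical.

```python
import json

def translate_echo_arguments(arguments):
    """Translate echo's argument list, honouring -n only as the first argument."""
    if arguments and arguments[0] == "-n":
        text = " ".join(arguments[1:])
        return f'print({json.dumps(text)}, end="")'
    return f"print({json.dumps(' '.join(arguments))})"

print(translate_echo_arguments(["-n", "How many? "]))
print(translate_echo_arguments(["hello", "-n"]))
```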

number of command line arguments
The $# variable is used to access the number of command line arguments.
For example:

#!/bin/dash

echo I have $# arguments

command line argument lists
The $@ variable is used to access all the command line arguments.
For example:

#!/bin/dash

echo "My arguments are $@"

Make sure you translate $@ appropriately both when quoted and when not quoted, as the behaviour of $@ differs depending on this context.
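The difference can be modelled as two list operations on the arguments (sys.argv[1:] in the generated code). This is a sketch that ignores globbing of unquoted results; the function names are hypothetical.

```python
def expand_quoted(arguments):
    """Model "$@": each argument stays exactly one word."""
    return list(arguments)

def expand_unquoted(arguments):
    """Model unquoted $@: each argument is split again on whitespace."""
    return [word for argument in arguments for word in argument.split()]

arguments = ["a b", "c"]
print(expand_quoted(arguments))
print(expand_unquoted(arguments))
```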

if/while/for nesting
In this subset, if , while , and for statements can now be nested inside themselves and each other.
For example:

#!/bin/dash

while test $i != '!!!!!!'
do
    while test $j != '!!!!!!'
    do
        echo -n ". "
    done
done

for file in *.txt
do
    if test -f "$file"
    then
        dos2unix "$file"
    fi
done

The [ and ] characters are used to start and end a test expression.
The [ builtin is the same as the test builtin, except that it requires a ] as its last argument.

The case , in , | , ) , esac , and ;; keywords are used to start and end case statements.
For example:

#!/bin/dash

case $# in
echo no arguments
echo one argument
echo some arguments
echo many arguments

Each condition (the expression before the ) ) in a case statement is a glob pattern.

Multiple conditions can be OR'ed together using | .

Apart from in strings and comments, | , ) , and ; will only be used in case statements, not for their other purposes in Shell scripts.
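Because case conditions are glob patterns, the standard fnmatch module is one natural way to translate them, with each arm becoming an if/elif branch. The sketch below uses a made-up case statement on a file name rather than the example above; `matches` and `classify` are hypothetical names.

```python
import fnmatch

def matches(value, patterns):
    """True if value matches any of the |-separated glob patterns of one arm."""
    return any(fnmatch.fnmatchcase(value, pattern)
               for pattern in patterns.split("|"))

def classify(name):
    # Hypothetical hand translation of:  case $name in  *.c|*.h) ... ;;  *.py) ... ;;  *) ... ;;
    if matches(name, "*.c|*.h"):
        return "C source"
    elif matches(name, "*.py"):
        return "Python source"
    else:                      # the catch-all arm *)
        return "unknown"

print(classify("main.c"), classify("sheepy.py"), classify("README"))
```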

The $() operator is used for command substitution.
It is the same as the ` operator, except that the $() operator may be nested.
For example:

#!/bin/dash

date=$(date +%Y-%m-%d)

echo Hello $(whoami), today is $date

echo "command substitution still works in double quotes: $(hostname)"

echo 'command substitution does not work in single quotes: $(not a command)'

echo "The groups I am part of are $(groups $(whoami))"

The $(()) operator is used to evaluate an arithmetic expression.
For example:

#!/bin/dash

echo $((x + y))
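Since all Shell values are strings, a translation of an arithmetic expression has to convert its operands to integers and, if the result is stored, convert it back to a string. A minimal sketch with made-up values for x and y:

```python
x = "3"
y = "4"

# Possible translation of: echo $((x + y))
print(int(x) + int(y))

# Possible translation of: total=$((x + y))
# The result is kept as a string, like every other Shell value.
total = str(int(x) + int(y))
```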

<, >, and >>
The < operator redirects stdin; the > and >> operators redirect stdout, with > overwriting the file and >> appending to it.
For example:

#!/bin/dash

echo hello >file
echo world >> file

You do not need to implement stderr redirection. (i.e. 2> )

You do not need to implement redirection to a file descriptor. (i.e. >&2 )

Input and output redirection will only be used after the command and its arguments.
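For the echo builtin, output redirection can be translated by opening the file in the matching mode and passing it to print() via its file parameter. The sketch below writes into a temporary directory so that it is safe to run; for external commands the open file could instead be passed to subprocess.run() through its stdout or stdin parameter.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "file")

    with open(path, "w") as stream:   # echo hello >file   (overwrite)
        print("hello", file=stream)

    with open(path, "a") as stream:   # echo world >> file (append)
        print("world", file=stream)

    with open(path) as stream:
        contents = stream.read()

print(contents, end="")
```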