GT CS 6035: Introduction to Information Security Project
Binary Exploitation!
Learning Goals of this Project:
Students will learn introductory-level concepts about binary exploitation. This lab develops understanding of control flow hijacking through different tasks/challenges designed to show certain vulnerabilities or weaknesses in a C program. A Python library, pwntools, will be used to demonstrate some exploitation techniques and automation to successfully hack a program.
The final deliverables:
A single JSON-formatted file will be submitted to Gradescope. This file should be named projectbinexp.json. A template can be found in the Home directory.
See Submission Details for more information.
Important Reference Material:
If you're an absolute beginner with no Linux or C/assembly experience:
This Writeup explaining some rudimentary basics of Computer Architecture
This Lecture I'm developing on how the Stack and Function Calls work in C
This Website may be able to help with some more basics from an external source
This Intro to pwntools/pwndbg video showing how to automate some exploits and use our exploit framework on the VM
INTERACTIVE TUTORIAL I made which details a lot of gdb debugging and pwntools techniques
pwntools Documentation
GDB command cheat sheet
Submission:
Gradescope (autograded); see Submission Details.
Virtual Machine:
Same VM as APISec. If you have the VM named CS6035Summerv6.ova, you do not need to redownload it!
VM Download
Username: cs6035, Password: nevermind1991
IMPORTANT FIRST STEP:
To get the correct flags for this project, you need to run a command that pulls all the correct data first! Otherwise you will get no points from the autograder.
wget https://cs6035.s3.amazonaws.com/binexp/summer2023/binexp.sh
chmod 777 binexp.sh
./binexp.sh
00intro
Step 1: Open a terminal and cd into the project directory projectctf00intro:
cd binexp/00intro
Open the exploit Python script named e.py and modify it with your GTID (the 9-digit numeric school ID number that looks like 901234567).
Then execute the script (run ./e.py) to get your first flag!
Your output will look like this. Copy this submission hash and place it in the JSON file in your home directory (projectbinexp.json).
SUBMIT YOUR FIRST FLAG TO MAKE SURE IT WORKS BEFORE CONTINUING
Also, it is a very good idea to submit each flag you get to make sure it works before moving on, in case of any issues
Applicable for all flags: If for whatever reason you don't get a flag and you're positive you should, try running the exploit once or twice more. The flag generator can have some unexpected behaviors. When in doubt, make a post in Ed Discussion to All Instructors and we will assist you if possible.
01basicoverflow1
Please watch the intro video first. If you want to try the experimental instruction program BoxxY (for Intel/AMD chips), see the Appendix for details.
This task is a very simple buffer overflow that reads a buffer inside the unsafe function. The scanf function that is called will read in any amount of data, and with enough characters the input will overflow into nearby variables like makemenotzero.
Don't overthink this one; just try putting in a bunch of data and see what happens! Try running the binary itself on the command line.
Note: you are free to use GDB if you need to for any of the tasks in this project, but you need to run the program on the command line (i.e. ./e.py) in order to get the real flag for submission and submit it!
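To make the idea concrete, here is a minimal Python sketch that models a stack frame as a flat byte array. The 16-byte buffer size, the frame layout, and the helper name unsafe_read are assumptions made for illustration only; the real sizes and layout are in the task's C file.

```python
import struct

BUF_SIZE = 16  # assumed buffer size, NOT taken from the real binary


def unsafe_read(user_input: bytes) -> int:
    """Copy input into a fake frame with no bounds check on the buffer."""
    # Assumed layout: [ buffer: 16 bytes ][ neighbouring 4-byte int ]
    frame = bytearray(BUF_SIZE + 4)
    # Only clamp to the size of the whole model frame, not to the buffer,
    # so long input spills into the neighbouring variable.
    data = user_input[:len(frame)]
    frame[:len(data)] = data
    # Read the neighbouring variable back as a little-endian int.
    (neighbour,) = struct.unpack_from("<i", frame, BUF_SIZE)
    return neighbour


short_value = unsafe_read(b"A" * 8)   # fits inside the buffer
long_value = unsafe_read(b"A" * 20)   # runs 4 bytes past the buffer
print(short_value, hex(long_value))
```

With short input the neighbouring variable stays zero; once the input is longer than the buffer, the extra characters land in it.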
input some characters below!
01basicoverflow2
In this task you will learn details about binaries compiled from C code, and how some basic things can be exploited, such as process redirection or control flow hijacking. The steps in this flag are discussed in depth in the intro video.
In this directory you have an executable binary named flag which is vulnerable to a buffer overflow in one of its functions. We will be using an exploitation library called pwntools to automate some of the overflow techniques and get the binary to call a function it otherwise wouldn't have. This function, called callme, generates a key using your Gradescope User ID to get a valid flag that will pass the autograder.
Now we will run the executable just to see what the program is doing.
We see the binary is asking for a string. Input any text you want or just press enter, and you'll see that the program does nothing and just exits. That's just to simplify the code so we can focus on the exploit.
The binary is statically linked to a shared object which has a lot of methods that construct the key and has a simple function called callme which will print out your key.
This is where we will start learning about binary file formats. Without going into a deep dive about program structure, operating systems, compilers, assembly language, machine code, etc., you will still be able to understand that there are two aspects that are key in binary exploitation: data and addresses.
Data is simple enough: it is just any collection of bits that represent some kind of data element like an ASCII character, integer value, pointer, etc.
Addresses: at this scope we can just think of addresses as fully unique identifiers of specific data elements. These are logical locations the computer understands.
cd binexp/01basicoverflow2
A buffer overflow occurs when too much data is fed into an unprotected or poorly protected data buffer. In 64-bit C programs, a small number of bytes past the local variables in a function's stack frame, the function's return address is stored. When the function returns, that value is loaded into the Instruction Pointer, the register that points to the currently executing instruction. If we overwrite the stored return address with a valid address, we can manipulate the control flow of the program and, with a well-formed attack, have it execute arbitrary or otherwise unintended code. Starting off easy, we are going to modify e.py and learn a few basics of the pwntools library.
Open e.py with your favorite text editor and analyze the content and comments.
Once you understand what they do, proceed to fill in the cyclic size (this number is up to you, based on your understanding of the program and what would break it) to get a segmentation fault message by running the script in debug mode (./e.py dbg).
This will open up a gdb terminal with a breakpoint set at main.
Type c to continue from the breakpoint (you sometimes need to press c twice if you don't see the error; this is an issue with how gdb attaches to processes).
We see the program received a signal for a SEGMENTATION FAULT (SIGSEGV), or an invalid access to memory. This happens when the program tries to access memory at a location that it either isn't allowed to access or that doesn't exist. In this case the return address for the function was overwritten by cyclic's data in the form of long strings of characters. Pay attention to the bottom of the screenshot, where the instruction pointer is currently trying to ret (return) to 0x6561……616b, which is just a string of ASCII characters in hexadecimal form.
Now that we know how to break the binary, let's figure out how to break it purposefully. Using a pwntools method called cyclic_find, we enter the bottom 32 bits (4 bytes) of the return value (in this example, 0x6561616b), which gives the number of characters that precede that value in the cyclic pattern. This is important because we are now going to reach our first step of control flow hijacking: overflowing exactly enough data that we can place a value of our choosing and change the course of the program's normal path.
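The idea behind cyclic and cyclic_find can be sketched with the standard library alone. The code below is a stand-in written for illustration, not pwntools' own implementation: it builds a de Bruijn sequence over the lowercase alphabet, in which every 4-character window is unique, so the 4 bytes found in the crashed return address identify exactly one offset. The offset of 72 used in the demo is a made-up value.

```python
import string


def de_bruijn(alphabet: str, n: int) -> str:
    """Return a de Bruijn sequence: every length-n substring occurs once."""
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


PATTERN = de_bruijn(string.ascii_lowercase, 4)


def cyclic(length: int) -> bytes:
    """Stand-in for pwntools' cyclic: the first `length` pattern bytes."""
    return PATTERN[:length].encode("ascii")


def cyclic_find(value: int) -> int:
    """Stand-in for pwntools' cyclic_find for a 32-bit little-endian value."""
    chunk = value.to_bytes(4, "little").decode("latin-1")
    return PATTERN.find(chunk)


# Demo: pretend the saved return address sits 72 bytes into the input.
payload = cyclic(200)
crashed_value = int.from_bytes(payload[72:76], "little")
offset = cyclic_find(crashed_value)
print(hex(crashed_value), offset)
```

Note the little-endian conversion: the value shown in the register is the 4 pattern characters in reverse byte order, which is why the lookup converts the integer back to bytes before searching.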
In e.py, on the commented line below your cyclic command, we are now going to use cyclic_find, which will automate our buffer length calculation, and feed that number into cyclic. Place in your 4 character bytes preceded by 0x, like 0x6561616b. Uncomment either of the lines beneath our original cyclic call (one uses the hex value and the other uses the ASCII values), and fill in the hex or ASCII value described above (see image below).
Then uncomment the line below this block. This will append to the payload a recognizable hex number typically used in debugging/exploiting, 0xdeadbeef. This uses a function you will get very familiar with throughout this project, which is linked here: p64.
uncomment this line after you have filled in the cyclic_find line
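As a rough illustration of what p64 produces, the sketch below packs an integer into 8 little-endian bytes with the standard library's struct module and appends it to some filler. The local p64 helper is a stand-in for the pwntools function, and the OFFSET value is hypothetical; use the number that cyclic_find gives you.

```python
import struct


def p64(value: int) -> bytes:
    """Stand-in for pwntools' p64: pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


OFFSET = 72  # hypothetical result of cyclic_find; yours will differ

payload = b"A" * OFFSET      # filler up to the saved return address
payload += p64(0xdeadbeef)   # dummy return address, easy to spot in gdb
print(payload[-8:].hex())
```

The bytes come out reversed relative to how the number is written because x86-64 stores integers least-significant byte first.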
After you have done that, rerun the script in debug mode
and hit c to continue.
If done correctly, you should see something like this screenshot above, where if you check the ret instruction, we are now failing on an invalid access to our dummy address.
Stepping away from the pwntools library for a moment, we now need to find something usable within the binary that will allow us to actually call a function or do something other than just crashing the program.
Now we will use a Linux command, objdump, which takes a binary file and outputs a dump containing some key information about it. The -D flag outputs the binary addresses, machine code, and assembly code of the binary, which we redirect into a file.
objdump -D flag > flag.asm
Then open flag.asm.
You will see a bunch of likely confusing information that, at a high level, translates to the code you can see in the flag.c file. You aren't going to have to go through this file in any great depth (unless you want to); we are just going to focus on finding the address within the binary file that holds the machine code responsible for making a function call to callme.
Search for the string callme in flag.asm and keep looking until you find the assembly instruction:
For Intel/AMD CPUs:
call <some address> <callme>
Note down the highlighted address showing the call (it will be different in your binary):
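If you would rather search the dump with a script than by eye, a sketch like the following works on objdump-style text. The SAMPLE listing is fabricated for illustration (only the 0x401be8 address echoes the example in this writeup), and the regular expression assumes the usual "address: bytes call target <symbol>" layout; adjust it if your dump looks different.

```python
import re

# Fabricated objdump-style excerpt; a real flag.asm is far longer.
SAMPLE = """\
0000000000401b00 <callme>:
  401b00:\t55                   \tpush   %rbp
0000000000401bd0 <main>:
  401be8:\te8 13 ff ff ff       \tcall   401b00 <callme>
  401bed:\tc3                   \tret
"""

# Capture the address at the start of any line that calls <callme>.
CALL_RE = re.compile(
    r"^\s*([0-9a-f]+):.*\bcallq?\s+[0-9a-f]+\s+<callme>",
    re.MULTILINE,
)


def find_call_sites(listing: str) -> list:
    """Return the addresses of instructions that call callme."""
    return [int(m.group(1), 16) for m in CALL_RE.finditer(listing)]


call_sites = find_call_sites(SAMPLE)
print([hex(addr) for addr in call_sites])
```

The match is on the address of the call instruction itself, not on the line that defines the callme function.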
Now open e.py and adjust the line (see the commented useful commands section):
payload += p64(0xdeadbeef)
Replace 0xdeadbeef with the hexadecimal value of the address above (prepend 0x to the value highlighted; in the screenshot above it would look like 0x401be8).
Now run ./e.py again from the command line (without dbg) and check the terminal output.
Did you get it? Awesome! Submit your first flag to Gradescope (follow the APPENDIX for more details). If not, retrace your steps in this task, and make sure you used the address of the call callme instruction from the earlier step and not the address of the actual function callme.
02assembletheassembly
This task will get you to determine which assembly instructions will properly construct a call using the address of the callme function (the actual address of the function, as opposed to task 1, which needed the address of a call to the function). Analyze the different instructions and look up their usage/behavior to figure out which ones will construct the address.
You can use objdump or gdb to find the address of callme and figure out how you calculate it.
For debugging, I highly recommend using gdb, setting a breakpoint on the gadget function, and stepping through the options once you think you know the correct path to get to the function call.
FYI: you don't have to use pwntools for this one, just run ./flag once you understand the sequence!
02badrando
This program very conveniently leaks out part of the libc base address.
This address is randomized via ASLR, so it will change a little bit every time the program is launched.
Run the program a few times and notice which bytes are different and which ones aren't.
The next step is analyzing the C file to see what we are comparing against in order to get to callme.
system is a libc function; use GDB to get the address of system with p system.
Fortunately there's only one byte missing from our formula, so we can do some scripting in Python to try out the remaining values.
pwntools has a family of functions (recvline, recvuntil, recvall) that let us do some manipulation of the string returned before we send the payload, and use it to shape the input we send in.
The recv functions return a BYTES object, so you will need to do some clever manipulation of the returned data. This will probably take a few iterations and permutations to get the value into the right format.
Note that the C file is using scanf to read in a hexadecimal number, meaning you don't need to use p64. You are sending in the STRING REPRESENTATION of a hex number, WITHOUT the 0x at the beginning, just as you would type it on the command line, like ffaabbccdd or f701234abcd etc!
Your task is going to be:
get the value leaked from the program
modify it with the offset of the system function
fill in the remaining byte with a random value
send to the process
repeat until you get a flag
Note: I recommend using recvall after you send in each payload, and writing your loop logic around the output (see other flags for what kind of string output you can expect to see if you got the right value)!
Run ./flag multiple times. It will ask you for input, and your goal is to guess an address. Put in any random guess and try it a few times to see if you can notice a pattern between what is leaked and what is being expected.
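The string handling in the steps above can be sketched without pwntools. Everything specific here is an assumption for illustration: the format of the leaked line, the SYSTEM_OFFSET value, and which byte is unknown (byte_index) all have to come from your own reading of the C file and GDB. In a real script each guess would be sent to a fresh process and the output read back with recvall.

```python
import re

SYSTEM_OFFSET = 0x50d60  # hypothetical; derive the real one using gdb


def parse_leak(line: bytes) -> int:
    """Extract the first 0x-prefixed hex number from a bytes line."""
    match = re.search(rb"0x([0-9a-fA-F]+)", line)
    if match is None:
        raise ValueError("no hex value found in %r" % line)
    return int(match.group(1), 16)


def candidates(leaked: int, byte_index: int):
    """Yield 256 guesses, one for each value of the unknown byte."""
    mask = 0xFF << (8 * byte_index)
    base = leaked & ~mask  # clear the byte we have to guess
    for guess in range(256):
        address = (base | (guess << (8 * byte_index))) + SYSTEM_OFFSET
        # scanf reads plain hex here, so format without a leading 0x.
        yield format(address, "x")


# Made-up leak line standing in for what recvline would return.
leak = parse_leak(b"here is a leak: 0x7f3a12000000\n")
guesses = list(candidates(leak, byte_index=1))
print(guesses[:3])
```

Trying all 256 values in order is a simple alternative to picking random ones; either way the loop stops as soon as the output contains a flag.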
02p4s5w0rd
Now it's time to learn a really useful technique to find all the available strings in a program.
And by strings, we mean any collection of printable characters that exist in the binary. So things like variable names, hardcoded paths, debug messages, or eeeeevenn…. passwords? Hopefully not in a real program, but you would be surprised.
This binary has zero debugging information and you do not have the source code available, but guess what? The program is written terribly and is very unsafe, with passwords stored in plain text that can easily be dumped/searched in the binary!
I would recommend running the program once or twice to see what it's doing (checking a series of responses to questions). If you get every question right, you will get the flag!
To get the strings for the program, run the command:
strings flag
This will output it all to the terminal, which isn't super helpful, so I would suggest redirecting the output to a file like:
strings flag > flagstr
Now you will be able to grep/search/navigate the file in a new terminal and will hopefully be able to figure out what the correct responses would be for the given questions.
Hint: strings are stored in the binary in the order that they're written in the C code, so it might be a good idea to search for the questions being asked; it should be pretty easy to determine the answers from there!
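For a feel of what the strings command is doing, here is a minimal Python approximation: it scans raw bytes for runs of printable ASCII of at least four characters. The blob and its contents are made up for the demo, and the real tool has more options and slightly different rules about which characters count.

```python
import re


def printable_strings(blob: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(blob)]


# Fabricated bytes: a question and its answer separated by NUL bytes,
# surrounded by non-printable data, as they might sit in a binary.
blob = (b"\x7fELF\x02\x01\x00"
        b"What is the password?\x00hunter2!\x00"
        b"\x01\x02ab\x00")

found = printable_strings(blob)
print(found)
```

Because the runs come out in file order, an answer stored next to its question shows up right beside it in the output.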
Good luck!
02theserverclientone
This flag shows a communication between a server and a client. The client binary flag will send data to the server, and the server appends some very conveniently structured data to that message and sends it back to the client. Your goal for this task is to have the server return the ideal data to overwrite the instruction pointer with the data that is returned from the server.
Follow the same steps as in the previous task (basicoverflow2): break the program in gdb, figure out your buffer size, and then try to fill in the response to correctly hit this function call!
If you use the pwntools e.py file, it will start the server for you so there is no need to explicitly start the server.
If you are running the program on the command line to experiment, then you must start the server each time you run the binary. You can either open a new terminal, and run
Or in the same terminal, each time you run the binary, run
Your task is to figure out the breaking point, and heavily inspect the last bytes that are returned from the server in order to get the right return and get the flag!
03XORbius
Time to rev up those Reverse Engineering motors, because you need to unravel the logic that this program is checking against in order to get to the callme function!
No buffer overflow this time, you just simply need to input the right values that will correctly decode the logic and pass the checks.
If you're unfamiliar with C operators, this TUTORIAL has all the necessary operations detailed.
Suggest pen and paper for this one to work through the logic by hand, or do a ton of experimentation to get the right value!
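One property worth having in hand for this kind of logic is that XOR undoes itself: if a check computes value ^ KEY and compares it with a target, then target ^ KEY gives back the required value. The constants and the check function below are hypothetical and are not taken from the task's program.

```python
KEY = 0x5A     # hypothetical constant the program XORs with
TARGET = 0x3C  # hypothetical value the program compares against


def check(value: int) -> bool:
    """A made-up check of the form (input XOR key) == target."""
    return (value ^ KEY) == TARGET


# XOR is its own inverse, so applying the key to the target recovers
# the input that passes the check.
answer = TARGET ^ KEY
print(hex(answer), check(answer))
```

The same trick applies step by step when several operations are chained: undo them in reverse order.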
03pointypointypoint
We see there is an unsafe function which has some checks for different local variables. The positioning of these variables is important because they are declared before the input buffer which means that a buffer overflow will cause data to be overwritten.
You will find additional details on this flag in the readme file of the folder. This program is a Buffer Overflow, however you will not be changing the control flow to a specific binary address, rather you will need to enter in the right values to trick the pointer arithmetic logic and get to the callme function.
Psst, the math is easy. Don't overthink it, it's just addition.
03huntthenrop
You've made it! You are now on your final task. In this directory is the entire contents of /usr/bin, a collection of binary files that make up a lot of common Linux utilities. One of these files has been overwritten by a vulnerable program. It is your task to figure out which one. You are given a list of checksum values that are known good, so your first task will be determining the sha1 hash of all of the files in this directory, and then finding the one that does not match. You are free to do this however you would like. NOTE: in your scripting method, ignore the files checksums and user.txt. They will likely report a mismatch, but you can be certain that neither is the file in question.
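One possible way to script the comparison is sketched below with hashlib. It assumes the checksum list uses the "hash filename" layout that sha1sum prints; check the real file and adjust load_known if it differs. The function names are made up for this sketch, and the demo runs against a throwaway directory with two fake files so that it is self-contained.

```python
import hashlib
import tempfile
from pathlib import Path

IGNORED = {"checksums", "user.txt"}  # files the writeup says to skip


def sha1_of(path: Path) -> str:
    """Hash a file in chunks so large binaries do not fill memory."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_known(checksum_file: Path) -> dict:
    """Parse lines of the assumed form '<sha1> <filename>'."""
    known = {}
    for line in checksum_file.read_text().splitlines():
        parts = line.split()
        if len(parts) == 2:
            known[parts[1]] = parts[0]
    return known


def find_mismatches(directory: Path, known: dict) -> list:
    """Return names of files whose hash differs from the known-good one."""
    bad = []
    for path in sorted(directory.iterdir()):
        if path.name in IGNORED or not path.is_file():
            continue
        if known.get(path.name) != sha1_of(path):
            bad.append(path.name)
    return bad


# Demo on a temporary directory: record hashes, then tamper with one file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ls").write_bytes(b"original ls")
    (root / "cat").write_bytes(b"original cat")
    (root / "checksums").write_text(
        "".join(f"{sha1_of(root / n)}  {n}\n" for n in ("ls", "cat")))
    (root / "cat").write_bytes(b"tampered")
    mismatches = find_mismatches(root, load_known(root / "checksums"))

print(mismatches)
```

Pointing find_mismatches at the task directory instead of the temporary one should leave a single name in the list.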
Once you find the file, it is time to begin our exploit of that file. This is a bit more complex than the other flags and will require a full ROP (return-oriented programming) exploit to chain calls together. We will also need a new tool called Ropper to find a gadget in order to supply a function argument and pass a specific check.
In 64-bit programs, functions receive their arguments through registers; on Intel/AMD (x86-64) Linux, the RDI register supplies the first function argument.
So we need to find a gadget (a piece of code that we can overwrite the instruction pointer with, which performs a certain action and then continues with the control flow hijack) that will pop a value from the stack into the RDI register.
Let's use ropper like this:
ropper --file flag | grep pop
This will give you all gadgets within the binary that contain the keyword pop (spoiler: there's a LOT of them). An objective for this task is to figure out which gadget will likely work best to get the required argument passed into the function you are trying to call. This Writeup is a helpful reference to understand how the calling convention works for x86-64 CPUs.
Note the addresses that are output for each gadget. Once you find a gadget you think will work, we will need its address as our first overwrite value in pwntools.
Pictorially, this is what our crafted exploit needs to look like (remember, the stack grows down):
Now we will need to supply the argument, which will be on the stack immediately after our pop gadget. Figure out what that value needs to be, and add it as p64(value) after the pop gadget.
Then we need to put the address of the function as the next call. Use objdump or gdb to find the addresses (you should probably get the second function address while you're at it). Our pop gadget will ret and then hit this function call to enter one of the unsafe functions.
Finally, we need to finish our execution chain by calling the second function, which will allow for exploitation. Append that address to your chain and see if you get a flag!
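Put together, the chain described above is just a sequence of 8-byte values laid out after the filler. The sketch below assembles one with the standard library; every number in it (the offset, the gadget address, the argument, and both function addresses) is a placeholder, and the local p64 helper stands in for the pwntools function of the same name.

```python
import struct


def p64(value: int) -> bytes:
    """Stand-in for pwntools' p64: 8 little-endian bytes."""
    return struct.pack("<Q", value)


# All values below are placeholders; find the real ones with
# cyclic_find, ropper, and objdump or gdb.
OFFSET = 40             # bytes of filler before the saved return address
POP_RDI_RET = 0x401234  # address of a "pop rdi; ret" gadget
ARGUMENT = 0x1337       # value the first function checks for
FIRST_FUNC = 0x401300   # function that takes the argument
SECOND_FUNC = 0x401400  # function called to finish the chain

chain = b"A" * OFFSET
chain += p64(POP_RDI_RET)  # ret lands here; gadget pops the next value
chain += p64(ARGUMENT)     # popped into RDI as the first argument
chain += p64(FIRST_FUNC)   # gadget's ret jumps to the first function
chain += p64(SECOND_FUNC)  # first function's ret jumps here
print(len(chain), chain[OFFSET:OFFSET + 8].hex())
```

Reading the chain top to bottom matches the order in which the program consumes it: each ret pops the next 8-byte entry as its destination.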
This project is worth 18% of your grade.
There are a total of 110 points for this project. If you complete all flags and get all 110 points, you get an extra 10% of the project applied to your grade.
If you complete all flags, you will get an effective extra credit of 1.8% applied to your final course grade.
01basicoverflow1
01basicoverflow2
02assembletheassembly
02badrando
02p4s5w0rd
02theserverclientone
03huntthenrop
03pointypointypoint
Total Possible
Submission Details
File submission instructions:
The contents of the submission file should be the following. There is a projectbinexp.json file in your VM with a template set up, or you can copy/paste this into a newly created projectbinexp.json file elsewhere and replace the placeholders with the flags you retrieve from each relevant task. (The name of the file doesn't matter; it is just named that for clarity.)
Note: You can use TextEdit or Vim to create and edit this file. Do not use LibreOffice or any Word Document editor. It must be in proper JSON format with no special characters in order to pass the autograder and these Word Document editors are likely to introduce special characters.
If you can't find the file in the VM, just copy this format below:
{
    "00intro": "<copy flag here>",
    "01basicoverflow1": "<copy flag here>",
    "01basicoverflow2": "<copy flag here>",
    "02assembletheassembly": "<copy flag here>",
    "02badrando": "<copy flag here>",
    "02p4s5w0rd": "<copy flag here>",
    "02theserverclientone": "<copy flag here>",
    "03huntthenrop": "<copy flag here>",
    "03pointypointypoint": "<copy flag here>",
    "03XORbius": "<copy flag here>"
}
An example of what the submitted file content should look like:
{
    "00intro": "4ec60c3e084d8387f0f33916e9b08b99d5264a486c29130dd4a5a530b958c5c0f1faeaca2ce30b478281ec546a4729f629b531a86cb27d86c089f0c542",
    "01bufferoverflow1": "f496d9514c01e8019cd2bc21edfeb8e33f4a29af14a8bf92f7b3c14b5e06c5c0f1faeaca2ce30b478281ec546a4729f629b531a86cb27d86c089f0c442",
    "01bufferoverflow2": "b621bba0bb535f2f7a222bd32994d3875bcfcad651160c543de0a01dbe2e0c5c0f1faeaca2ce30b478281ec546a4729f629b531a86cb27d86cf0c49542",
    ...
}
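Since the autograder needs strictly valid JSON, one option is to let Python's json module write the file instead of typing it by hand. This is only a sketch: the task names are copied from this writeup as they appear here, so check them against the template in your VM, and build_submission is a made-up helper name.

```python
import json

# Key names as they appear in this writeup; verify against your template.
TASKS = [
    "00intro",
    "01basicoverflow1",
    "01basicoverflow2",
    "02assembletheassembly",
    "02badrando",
    "02p4s5w0rd",
    "02theserverclientone",
    "03huntthenrop",
    "03pointypointypoint",
    "03XORbius",
]


def build_submission(flags: dict) -> str:
    """Return JSON text with a placeholder for any flag not yet found."""
    submission = {task: flags.get(task, "<copy flag here>") for task in TASKS}
    return json.dumps(submission, indent=4)


text = build_submission({"00intro": "example-hash"})
parsed = json.loads(text)  # round trip proves the text is valid JSON
print(text)
```

Writing the returned text to your submission file and re-running json.loads on it is a quick check before uploading.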
There is an experimental instructional tool I've created to guide you through the flow of some of these flags and give you an introduction to GDB and pwntools in a way that allows you to learn by doing!
You can start the program by running:
wget https://cs6035.s3.amazonaws.com/binexp/summer2023/boxxy.zip
unzip boxxy.zip
Make sure your window at a minimum will show the symbols on the top/bottom/left/right. If you have the resolution, feel free to make it bigger, as it should enhance the readability.
./calibrate.py
Once your terminal is set up, run
You will be asked which lesson you want to start. I would suggest going through all of them, but you are free to start wherever you want. Note: once you start a section there isn't currently an easy way to navigate to the other sections; that is still being developed. Thanks for your patience, and I appreciate feedback on the lesson material!
So long for now