CS 6340 Software Analysis Lab 5 CBI
Lab 5: Cooperative Bug Isolation (CBI)
Fall 2022 Semester
Due: 7 November, 8:00 am Eastern Time
Corresponding Lecture: Lesson 9 (Statistical Debugging)
● Enumerating basic blocks and instructions in a function:
    o http://releases.llvm.org/8.0.0/docs/ProgrammersManual.html#basic-inspection-and-traversal-routines
● Instrumenting LLVM IR:
    o http://releases.llvm.org/8.0.0/docs/ProgrammersManual.html#creating-and-inserting-new-instructions
● Important classes:
    o http://releases.llvm.org/8.0.0/docs/ProgrammersManual.html#the-function-class
    o https://cs6340.cc.gatech.edu/LLVM8Doxygen/classllvm_1_1CallInst.html
    o https://cs6340.cc.gatech.edu/LLVM8Doxygen/classllvm_1_1DebugLoc.html
    o https://cs6340.cc.gatech.edu/LLVM8Doxygen/classllvm_1_1BranchInst.html
In this lab, you will implement cooperative bug isolation (CBI) to statistically localize errors. You will write an LLVM pass that instruments each conditional branch and each function call so that the program reports branch conditions and return values as it runs. By running the instrumented program on a large number of inputs, you will collect this data at runtime and compute several scores that help users find bugs.
You should find this lab familiar. The basic structure of the first part will be very similar to Lab 1 (Fuzzing). If you find you’re having trouble implementing the instrumentation, we recommend you refer back to that lab.
Setup
Download cbi.zip from Canvas. When uncompressed, it will produce the directory cbi/.
This lab builds off the work you did in Lab 1 (Fuzzing). We have provided a reference solution in the form of a binary, reference/fuzzer, and reference/Instrument.so. If you prefer to use your own solution from the previous lab, copy Instrument.so and fuzzer from LAB1/build to LAB5/reference.
The following commands set up the lab (from the lab’s directory):
$ mkdir build && cd build
$ cmake ..
$ make
You should now see CBIInstrumentPass.so and cbi in the current directory.
The cbi tool performs statistical debugging for a program using a feedback profile (which you will generate) for successful and erroneous program runs. To help generate program runs that pass and fail, you will use your fuzzer:
$ mkdir fuzz_output
$ timeout 10 ../build/fuzzer ./fuzz1 fuzz_input fuzz_output 10
$ ../build/cbi ./fuzz1 fuzz_output
You should see the sample output as follows:
== S(P) ==
== F(P) ==
== Failure(P) ==
== Context(P) ==
== Increase(P) ==
Lab Instructions
In this lab, you will need to edit the file src/CBIInstrument.cpp to implement cooperative bug isolation, which will instrument branch and return instructions with code to extract and monitor predicates.
lib/runtime.c contains functions that you will use in your lab:
// Append predicate information as "branch,line,col,cond"
void __cbi_branch__(int line, int col, int cond) { /**/ }
// Append predicate information as "return,line,col,rv"
void __cbi_return__(int line, int col, int rv) { /**/ }
Like you did in Lab 1, your LLVM pass should instrument the code with these functions.
Your pass should instrument each conditional branch with code that records whether the branch condition is true or false on execution. Likewise, instrument each integer-returning call instruction with code to record the return value. This will create a feedback profile that you will use to perform statistical debugging and generate a feedback report.
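A branch condition in LLVM IR is a 1-bit integer, while __cbi_branch__ takes an int, so the condition has to be widened before it is passed to the call. The standalone sketch below (plain C++, no LLVM needed; zeroExtend and signExtend are hypothetical helper names used only for illustration) mimics the two ways of widening an n-bit value and shows why zero extension is the one that turns a true condition into 1. In an LLVM pass, the corresponding IRBuilder methods are CreateZExt and CreateSExt.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension: keep the low `bits` bits and fill the upper bits with 0.
std::int32_t zeroExtend(std::uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 31);
    const std::uint32_t mask = (1u << bits) - 1u;
    return static_cast<std::int32_t>(value & mask);
}

// Sign extension: copy the top bit of the `bits`-wide value into the upper bits.
std::int32_t signExtend(std::uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 31);
    const std::uint32_t mask = (1u << bits) - 1u;
    const std::int64_t signBit = std::int64_t{1} << (bits - 1);
    const std::int64_t low = static_cast<std::int64_t>(value & mask);
    // (low ^ signBit) - signBit is the usual branch-free sign-extension identity.
    return static_cast<std::int32_t>((low ^ signBit) - signBit);
}
```

For a 1-bit true value, zeroExtend(1, 1) yields 1 whereas signExtend(1, 1) yields -1, because the single bit is also the sign bit. Both helpers map a false condition to 0.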
In short, the lab consists of the following tasks:
1. Implement the instrumentCBIBranches function to insert a __cbi_branch__ call for a predicate (conditional). Note: __cbi_branch__ takes a 32-bit integer as a parameter for condition. You’ll need to convert the boolean (1-bit integer) condition to a 32-bit integer. True == 1 (not -1) and False == 0.
2. Modify runOnFunction to instrument all conditional branching instructions with the predicate recording logic (from #1).
3. Implement the instrumentCBIReturns function to insert a __cbi_return__ call for a return value.
4. Modify runOnFunction to instrument all 32-bit integer-returning call instructions with the return recording logic (from #3).
5. Using the feedback profile you construct in 1-4, modify generateReport to implement statistical debugging. You should compute F(P), S(P), Failure(P), Context(P), and Increase(P), which should be stored in the corresponding data structures in include/Utils.h.
Revisiting Instrumentation. By now you should feel comfortable working with the LLVM compiler infrastructure, but for a refresher, consult Lab 0 and Lab 1.
CBI File Infrastructure. The cbi executable will execute the input program on each of the trace input files from a fuzzer output directory. This includes both successful program runs (fuzz_output/success) and erroneous program runs (fuzz_output/failure). Each run will generate an analogous feedback profile for each input file. The resulting directory tree will look like this:
fuzz_output/
    success/
        input1.cbi
        input2.cbi
    failure/
        input1.cbi
You will use these .cbi files to generate the feedback report.
Generating the feedback report. During the lesson, you saw how we use several metrics to help determine which predicates correlate with bugs. F(P) is the number of failing runs in which P is observed to be true, and S(P) is the number of successful runs in which P is observed to be true. Failure(P) measures how often a run fails given that predicate P is true. Context(P) measures the background chance of failure when predicate P is observed, that is, when execution reaches P’s location regardless of whether P is true. Finally, Increase(P) measures how much P being true raises the chance of failure above that background.
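A minimal sketch of how the three ratios can be computed from raw counts, using the usual definitions from the statistical debugging lesson: Failure(P) = F(P) / (S(P) + F(P)), Context(P) = F(P observed) / (S(P observed) + F(P observed)), and Increase(P) = Failure(P) - Context(P). The types Counts and Scores and the helpers safeDiv and score are hypothetical names, not part of the lab skeleton. The zero-denominator rule follows the lab's instruction to report 0 in that case; how Increase(P) should behave when one of its terms hit a zero denominator is not spelled out in this handout, so the plain subtraction below is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Raw counts for one predicate P (hypothetical helper type).
struct Counts {
    double s;     // S(P): successful runs in which P was true
    double f;     // F(P): failing runs in which P was true
    double sObs;  // S(P observed): successful runs that reached P's location
    double fObs;  // F(P observed): failing runs that reached P's location
};

struct Scores {
    double failure;
    double context;
    double increase;
};

// Division that yields 0 instead of dividing by zero.
double safeDiv(double num, double den) {
    return den == 0.0 ? 0.0 : num / den;
}

Scores score(const Counts &c) {
    Scores r{};
    r.failure = safeDiv(c.f, c.s + c.f);
    r.context = safeDiv(c.fObs, c.sObs + c.fObs);
    r.increase = r.failure - r.context;  // assumption: plain difference
    return r;
}
```

For example, with S(P) = 10, F(P) = 30, and P's location reached in 60 successful and 40 failing runs, score gives Failure(P) = 0.75, Context(P) = 0.4, and Increase(P) of about 0.35. A predicate whose location is never reached gets 0 for all three.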
The function generateLogFiles() will be called for you in CBI.cpp. This function will iterate through the generated .cbi files and populate the lists SuccessLogs and FailureLogs with the filenames of each. You’ll need to iterate through each file in these data structures to calculate these values. Please use these data structures instead of traversing the directory yourself, as the grader will use a different directory during grading.
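One possible way to turn a single run's .cbi log into the set of predicates that were true in that run, assuming each line has the form written by the runtime (branch,line,col,cond or return,line,col,rv). The State type below is a stand-in defined only for this sketch: the Return* enumerators, the Predicate alias, the double counter type, and the helpers parseRun and countRun are all assumptions, and the real definitions in the lab's Utils.h may differ. A std::set is used per run so that a predicate hit many times in one run is still counted once, matching the definition of F(P) and S(P) as numbers of runs.

```cpp
#include <cassert>
#include <cstdio>
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <tuple>

// Stand-in for the lab's State type; the Return* names are assumptions.
enum class State { BranchTrue, BranchFalse, ReturnNeg, ReturnZero, ReturnPos };

using Predicate = std::tuple<int, int, State>;  // (line, column, state)

// Parse one run's log and return every predicate that was true at least once.
std::set<Predicate> parseRun(std::istream &in) {
    std::set<Predicate> seen;
    std::string row;
    while (std::getline(in, row)) {
        char kind[16];
        int line = 0, col = 0, val = 0;
        if (std::sscanf(row.c_str(), "%15[^,],%d,%d,%d",
                        kind, &line, &col, &val) != 4) {
            continue;  // skip malformed rows
        }
        const std::string k(kind);
        if (k == "branch") {
            seen.emplace(line, col,
                         val != 0 ? State::BranchTrue : State::BranchFalse);
        } else if (k == "return") {
            const State s = val < 0 ? State::ReturnNeg
                          : val == 0 ? State::ReturnZero
                                     : State::ReturnPos;
            seen.emplace(line, col, s);
        }
    }
    return seen;
}

// Add one run's predicates to a per-predicate run counter (S or F).
void countRun(const std::set<Predicate> &run,
              std::map<Predicate, double> &counter) {
    for (const Predicate &p : run) {
        counter[p] += 1;
    }
}
```

In generateReport, a loop over SuccessLogs would open each file with std::ifstream, call parseRun on it, and pass the result to countRun with the S map; the loop over FailureLogs would do the same with the F map.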
We have defined maps F, S, Failure, Context, and Increase in LAB5/include/Utils.h that you should populate in generateReport located in LAB5/CBI.cpp. Notice each is a mapping from a std::tuple to the value of the corresponding metric. Here, the tuple represents a predicate, which consists of a line, column, and a State datatype that encodes the possible predicates a branch or return has.
Note that your instrumentation will record where a branch or return occurs and its result, but you need to encode that into a predicate. For example, if we encounter if (p == 10) { … } in the code, we need to store two predicates, (p == 10) and (p != 10), which you would represent as State::BranchTrue and State::BranchFalse.
When doing your calculations for Failure(P), Context(P), and Increase(P), if you encounter a divide-by-zero error, please enter 0 for the result.
The skeleton code will go through and print out your maps via printReport.
Example Input and Output
Your statistical debugger should run on any C code that compiles to LLVM IR. As we demonstrated in the Setup section, we will compile code to LLVM and instrument the code with the fuzzer and cbi passes.
Make sure to re-run the make command in the build directory if you make code changes to your files in the src directory so it will pick up the changes.
$ clang -emit-llvm -S -fno-discard-value-names -c fuzz1.c -g
$ opt -load ../build/InstrumentPass.so -Instrument -S fuzz1.ll -o fuzz1.instrumented.ll
$ opt -load ../build/CBIInstrumentPass.so -CBIInstrument -S fuzz1.instrumented.ll -o fuzz1.cbi.instrumented.ll
$ clang -o fuzz1 ../lib/runtime.c fuzz1.cbi.instrumented.ll
The Makefile provided in the test directory will run these commands for all of the test programs at once.
Afterward, we will run the fuzzer to generate a set of passing and failing inputs for use with the cbi tool:
$ mkdir fuzz_output
$ timeout 10 ../build/fuzzer ./fuzz1 fuzz_input fuzz_output 10
$ ../build/cbi ./fuzz1 fuzz_output
You should expect to generate output similar to (but not exactly the same as) the following:
== S(P) ==
Line 33, Col 7, BranchTrue: 0
Line 33, Col 7, BranchFalse: 188
Line 37, Col 7, BranchTrue: 5
Line 37, Col 7, BranchFalse: 183
== F(P) ==
Line 33, Col 7, BranchTrue: 129
Line 33, Col 7, BranchFalse: 0
Line 37, Col 7, BranchTrue: 0
Line 37, Col 7, BranchFalse: 0
== Failure(P) ==
Line 33, Col 7, BranchTrue: 1
Line 33, Col 7, BranchFalse: 0
Line 37, Col 7, BranchTrue: 0
Line 37, Col 7, BranchFalse: 0
== Context(P) ==
Line 33, Col 7, BranchTrue: 0.40694
Line 33, Col 7, BranchFalse: 0.40694
Line 37, Col 7, BranchTrue: 0
Line 37, Col 7, BranchFalse: 0
== Increase(P) ==
Line 33, Col 7, BranchTrue: 0.59306
Line 33, Col 7, BranchFalse: -0.40694
Line 37, Col 7, BranchTrue: 0
Line 37, Col 7, BranchFalse: 0
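As a sanity check, the Line 33, Col 7 rows of the sample report are consistent with the usual definitions from the lesson, assuming the branch at that location was reached in all 188 successful runs and all 129 failing runs (which is what the S(P) and F(P) rows suggest). For P = BranchTrue at Line 33, Col 7:

```latex
\begin{align*}
\mathrm{Failure}(P) &= \frac{F(P)}{S(P) + F(P)} = \frac{129}{0 + 129} = 1 \\
\mathrm{Context}(P) &= \frac{F(P\ \text{observed})}{S(P\ \text{observed}) + F(P\ \text{observed})}
                     = \frac{129}{188 + 129} \approx 0.40694 \\
\mathrm{Increase}(P) &= \mathrm{Failure}(P) - \mathrm{Context}(P) \approx 1 - 0.40694 = 0.59306
\end{align*}
```

The BranchFalse row at the same location shares the same Context(P) but has Failure(P) = 0 / (188 + 0) = 0, which gives the reported Increase(P) of -0.40694.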
How Do We Use It?
Dr. Liblit, the inventor of the CBI technique, says that CBI “is mostly a research topic, but it is out there in the “real” world in two ways:
1. The public deployment of the Cooperative Bug Isolation Project hunts for bugs in various Open Source programs running under Fedora Linux. You can download pre-instrumented packages and every time you use them you’re feeding us data to help us find bugs.
2. Microsoft has released Holmes, an implementation of statistical debugging for .NET. It’s nicely integrated into Visual Studio and should be a very easy way for you to use statistical debugging to help find your own bugs in your own code. I’ve worked closely with Microsoft Research on Holmes, and these are good smart people who know how to put out high-quality tools.”
While we do not have concrete examples that we can share with the class, CBI is the technique that we teach where we have had the most past students contact us and tell us they are using either the CBI implementation from Dr. Liblit’s dissertation or their own variation on the technique. Typically, if they are writing C/C++, they use Dr. Liblit’s, and if they are using another language, they write their own. After doing this lab, you will be equipped to implement the technique on any code that you can process through the LLVM toolchain.
Your solution will be tested against the provided test cases as well as some additional hidden test cases that are similar to the provided ones.
Items to Submit
For this lab, we will be using Gradescope to submit and grade your assignment. For full credit, the file must be properly named and there must not be any subfolders.
o Submit file CBIInstrument.cpp (40 points)
o Submit file (60 points)
Do not submit anything else. Make sure all of your code modifications are in the files listed above, as your other files will not be submitted. In particular, past students have modified header files to import additional headers, leading to a compile error in their submission.
Gradescope will have immediate feedback from the autograder as to whether the submission was structured correctly as well as whether your project compiles successfully. As long as Gradescope doesn’t report any errors, you can be confident that your project will run and be able to receive a grade. If, however, Gradescope reports a compilation error, then you will likely receive a 0 on your assignment.
Tips for submitting your assignment with Gradescope
● When you go to the Canvas Assignment, you will see a blank Gradescope submission window embedded in the Canvas Page. Follow the on-screen prompt to submit your files.
● If there are multiple files in this lab, you can zip the files together and upload the single zip file to the submittal page, or you may select each file individually on the page. If you select the files individually, they must all be uploaded in the same submission.
● A correct submission should look like this image (ignoring size and order):
● You may resubmit your assignment as many times as you want up to the assignment due date. The Active submission will be the final grade. By default, every time you upload a new submission, it will become the Active submission, but you can change the Active submission in your Submission History window if needed.
● If you resubmit your assignment, you must include all files as part of the new submission.
● We provide comments as output of your submission. Use these to make sure your submission is as complete as possible. For example, if you see a compile error, you will not receive any points for that part of the assignment.