CS 124 Final exam: Spring 2022
Your name: Instructions:
• Spend at most 180 minutes on the exam, from the time you first see a problem to the last time you write anything in your solutions. (Time spent uploading doesn’t count against your time.)
Time started: Time ended:
• Do not communicate with anyone but the course staff about the contents of the exam.
• Do not use a computer or other electronic aids during the exam for anything except for:
1. (optionally) typesetting your solutions,
2. (optionally) accessing offline notes prepared before the exam, without using a search function, 3. (optionally) asking clarification questions of the instructors on Ed.
• (You may use your notes and/or books prepared before the exam.)
• After you finish the exam, type or sign your name in affirmation of the following statement:
I, , pledge my honor that I have obeyed all instructions on this page in taking this exam.
Reminder for extension students:
After you finish, upload your exam to Gradescope as if it were a problem set.
• You have 180 minutes to earn a maximum of 180 points. Do not spend too much time on any single problem: we recommend roughly n minutes on an n point problem. Read all the problems first, and attack them in the order that allows you to make the most progress.
• You can ask clarification questions of the instructors on Ed/in person.
• Be neat and write legibly and clearly. You will be graded not only on the correctness of your answers, but also on the clarity with which you express them.
• Do not waste time re-deriving results presented in class, psets, or sections. You may simply state and cite them.
1 True or False (18 points)
For each of the following statements, say whether the statement is TRUE or FALSE. Each correct answer
is worth 2 points. Incorrect answers are worth 0 points. No explanations are needed.
1. On a connected, undirected graph G, if a breadth-first search starting from a vertex s finds a path of
length 124 to a vertex t, then every path from s to t has length at most 124. True False
A breadth-first search finds the shortest path, but there may be a longer one: for instance, the graph that’s a path of length 125 edges from s to t, plus an edge from s to the third vertex of the path, is a counterexample.
2. If every problem in NP can be solved in polynomial time, then there is a polynomial-time algorithm to break RSA encryption.
True False
Factoring is in NP. If you can factor the public key n, you can break RSA.
3. If we insert at least mk elements into an initially-empty Bloom filter with m ≥ 1 bits of memory and k ≥ 1 hash functions and then look up an element (not necessarily one of the ones we inserted), then it is possible for the lookup to be a false negative.
True False
Bloom filters don’t have false negatives, only false positives.
4. If we insert 0 elements into an initially-empty Bloom filter with m ≥ 1 bits of memory and k ≥ 1 hash functions and then look up an element (necessarily not one we inserted) and then look up an element, then it is possible for the lookup to be a false negative.
True False
Bloom filters don’t have false negatives, only false positives.
5. Let ODD be the function that takes an integer as input and returns True iff x is odd. Then, for every decision problem f solvable in polynomial time, there’s a polynomial-time reduction from f to ODD.
True False
As a polynomial-time reduction algorithm from f , we can first solve f itself in polynomial time and
get a yes/no answer, then output 0 (if no) or 1 (if yes) as the input to ODD.
6. Let G be a directed graph with positive even integer edge weights, and let s and t be two vertices of
G. There is necessarily a maximum-valued flow from s to t in which the flow on every edge is even. True False
Solution 1: As long as all capacities in the (residual) graph are even Ford-Fulkerson only ever finds even bottlenecks, so all capacities in the residual graph remain even. So, when Ford-Fulkerson ends, it finds a flow with every edge even.
Solution 2: Divide all capacities by 2, getting integers. We proved that there’s an integer-valued max flow. Multiply by 2 again.
7. Let G be a directed graph with positive odd integer edge weights, and let s and t be two vertices of G. There is necessarily a maximum-valued flow from s to t in which the flow on every edge is odd.
True False
Any graph with odd edge weights in which t isn’t reachable from s (so, max flow 0) is a counterex- ample. Alternately, make a graph with edges (s,u) of weight 1, (s,v) of weight 1, (v,u) of weight 1, and (u,t) of weight 3; then the max flow is 2, with 2 flow on the edge (u,t).
8. In the simulated annealing algorithm for Number Partition, if we choose the temperature (probability of moving to a worse neighbor) to be 0, the resulting algorithm is equivalent to the hill-climbing algorithm—that is, they have the same asymptotic runtime and distribution of outputs.
True False
Simulated annealing is just hill climbing with some probability of moving to a worse neighbor.
9. Assuming P ̸= NP, there is a polynomial-time algorithm that, given a directed graph, vertices s and t and positive integer edge capacities (as in a maximum flow problem), finds the maximum-valued flow from s to t that puts flow at least 1 on every edge.
True False
E.g. you can add the constraints that the flow on each edge is at least 1 to the usual linear-programming solution to max flow.
2 False (18 points)
Each of the following statements is FALSE. For each of them, prove it’s false by providing a counterexam-
ple. No explanation is necessary; only the example itself will be graded.
1. For every undirected graph G with |V (G)| > 1, if G has a vertex v whose degree is greater than every other vertex’s degree, then no maximum independent set of G contains v.
The maximum independent set contains the middle vertex and the four outside vertices.
2. For every undirected graph G with |V (G)| > 1, if G has a vertex v whose degree is less than every other vertex’s degree, then no minimum vertex cover of G contains v.
x7 x8 x9 x10
The optimal is taking 1,4,5,6. You must take 4,5,6, because they are connected to all nodes (except 1), making the degree of all nodes ≥ 3. If you leave one of them out, you must take 2,3,7,8,9,10. So, taking 1 is optimal, because the only edges left uncovered are (x1,x2) and (x1,x3)
3. For every undirected graph G with positive integer edge weights, if G has an edge e whose weight is greater than every other edge’s weight, and G has a Hamiltonian cycle (i.e. a Traveling Salesperson Tour) that doesn’t include e, then the shortest Hamiltonian cycle (TSP tour) doesn’t include e.
Make one outside edge weight 124, and the inner cross be weight 121. Any tour that takes one edge in the cross must take both edges, so the cheapest tour is the outside cycle, which avoids both but contains the edge of weight 124.
3 A Sneak Preview of this Exam’s Questions (18 points)
In the year 2124, CS124 will pilot an exam-hints system: each student will be allowed to ask one TF (before the exam) for a hint about what will be on the final exam. Each TF knows some fraction of the contents of each problem: for instance, Ashley might know 100% of problem 1, 50% of problem 2, and 200% of problem 3 (that is, she knows two solutions to problem 3), and Lawrence might know 100% of problem 1, 20% of problem 2, and 0% of problem 3.
You can ask for fractional hints from multiple TFs, as long as you get a total of only one hint: for instance, you could ask for 60% of a hint from Ashley and 40% of a hint from Lawrence, and you’d learn about 100% of problem 1, 38% of problem 2, and 120% of problem 3. You only care about knowing up to 100% of a solution to each problem, so even though you know 2.58 solutions, you only know and care about 2.38.
1. (9points)You’dliketomaximizethenumberofsolutionsyouknowandcareabout(like2.38above), given the level of knowledge ci, j by each TF i of each problem j. Give a reduction from this problem to a linear program. Your reduction should run in polynomial time, but you don’t need to prove this. You don’t need to prove the correctness of your reduction.
Solution: We’ll have a variable xi for each TF i representing how much we ask them, a variable y j for each solution j representing how much we know and care about of its solution, constraints ∑i xi ≤ 1, ∀i:xi ≥0,∀j:yj ≤1,∀j:yj ≤∑ici,jxi,andanobjectivetomaximize∑jyj.. Notethatthere’sa decision how to model “get a total of only one hint”: if you say ∑i xi = 1, the solution is the same in almost all cases, since you’d want to ask a TF if you can—but if there are no TFs, you should be able to just learn 0 solutions, not declare the universe unfeasible because you can’t ask for a hint.
2. (9points)EachTFisresponsibleforcommunicatingwiththeotherTFswheneverastudentasksthem for a hint, but they’re not reliable at doing so. Instead, each TF only informs their friends (friends go both ways) of the tips they’ve distributed. (Their friends don’t pass that information along recursively.) You’d like to take advantage of this to cheat and ask for more than 1 total hint, but if any TF realizes that you’ve asked for more than 1 total hint, you will be sent to the Ad Board!
For instance, suppose Tarun and Evan are friends, Evan and Lavanya are friends, and Lavanya and Phyllis are friends, but no other pairs of TFs are friends. If you ask Tarun and Lavanya for .6 hints each, then Evan will realize that you’ve asked for 1.2 hints and report you. If you instead ask both Tarun and Phyllis for 1 hint each, you’ll get away with it.
Again, you’d like to maximize the number of solutions you know and care about, given the level of knowledge ci, j by each TF i of each problem j and the list of what pairs of TFs are friends.
Give a reduction from this problem to a linear program. Your reduction should run in polynomial time, but you don’t need to prove this. You don’t need to prove the correctness of your reduction.
Solution: As above, we’ll have a variable xi for each TF i representing how much we ask them, a variable yj for each solution j representing how much we know and care about of its solution, constraints ∀i : xi ≥ 0, ∀j : yj ≤ 1, ∀j : yj ≤ ∑i ci,jxi, and an objective to maximize ∑j yj. Instead of the constraint ∑i xi ≤ 1, we have a constraint that for each TF i with neighborhood N(i) (i and i’s friends), ∑k∈N(i) xk ≤ 1.
Calculations (48 points)
1. (12 points) Find an inverse of 124 modulo 1249 (that is, an integer x such that 124x ≡ 1 (mod 1249)),
and show the steps of whatever algorithm you use to do so.
Answer: 554 (or anything that’s 554 mod 1249), which can be found by the extended Euclidean algorithm to find x in an (x, y) pair such that 124x + 1249y = 1, as in the following table where we can first calculate the first three columns down, then work our way back up:
abkxy 1249 124 10 -55 554 124 9 13 4 -55
9 7 1 -3 4 7 2 3 1 -3 21201 10210
2. (10 points) Suppose I search for a solution to the 2SAT formula (x∨¬y)∧(y∨¬z)∧(¬x∨z)
by choosing an assignment of the variables (uniformly at random from the 8 possible assignments), then repeatedly choosing an unsatisfied clause (uniformly at random from all unsatisfied clauses) and a random variable in it (uniformly at random from its two variables) and flipping the value of that variable. (If it happens that I guess a satisfying assignment correctly to start, I do 0 variable flips.) Compute the expected number of variable flips I need to find a satisfying assignment.
Answer: 3/2. There are 2 satisfying assignments: all True and all False. From every other assignment, the only nonsatisfied clause(s) include the odd variable out, so you have a 1/2 chance of moving to a satisfying assignment and a 1/2 chance of moving to another assignment where the situation is the same. So the expected number of steps x from a nonsatisfying asssignment satisfies x = 1 + 12 x, that is, x = 2. You start on a satisfying assignment with probability 14 , for a total expected number of steps of 41 0 + 34 2 = 32 .
3. (11 points) I attempted to implement a hash table with chaining and table doubling. The table started at size 22 and contained 0 elements. If, after adding an element x to the table, the number of elements in the table equaled the size of the table, I intended to rebuild a table of size 2k for the next biggest value of k, and rehash everything into it. Unfortunately, due to a typo, I actually rebuilt a table of size k2 for the next biggest value of k, and rehashed everything into it. Calculate the amortized time per insert operation for the first n insert operations. (Give an asymptotic answer, like O(log n).) (Assume that computing one hash value takes time O(1).)
Answer: O( n). When we don’t rebuild the table, the time to compute the hashes is O(1) and the expected time to insert them is O(1), since the number of entries is less than the table size. In the first n operations, we rebuild the table at sizes 1, 4, 9, …, ⌊√n⌋2, and pay the same order of magnitude
in rebuilding time, for a total of 1 + 4 + 9 + ··· + ⌊√n⌋2 = O(n32 ). Hence the amortized time per 1
operation is O(n2 ).
4. (15 points) Suppose I run the Edmonds-Karp algorithm to find a maximum flow from s to t in the graph below. In the breadth-first search that Edmonds-Karp uses to select paths to push flow along, break ties to explore alphabetically-earlier vertices before alphabetically-later ones. What paths does Edmonds-Karp push flow along, in order, and how much flow does it push along each? (Your answer should be of a form like “First, 3 flow on S-A-B-T, then 3 flow on S-B-A-T, then . . . ”.)
Answer: 7 SAT, 5 SBT, 1 SBAT, 1 SDACT, 1 SDABCT.
5 Non-adjacent subset sum (15 points)
Give an algorithm that takes as input an array of n integers and returns the maximum sum of a subset of them that doesn’t contain any three consecutive elements of the array. For instance, on input [4, 4, 5, 2, 3, −1, −3, 5], your algorithm should return 18 (because 18 = 4 + 4 + 2 + 3 + 5). Your algorithm should run in time O(n). Prove its correctness and analyze its runtime.
Solution: Let the elements of the array be a0 , a1 , . . . , an−1 . We’ll find by dynamic programming the maximum sum D[i] of a subset of the elements up to i that doesn’t contain any three consecutive elements; then our final answer is D[n−1]. Base cases are D[−1] = 0, D[0] = max(0,a0) and D[1] = max(0,a0)+ max(0,a1). For greater i, you can either not take element i, or take it but be sure to exclude i−1, or take both but be sure to exclude i−2, that is, D[i] = max(D[i−1],ai +D[i−2],ai +ai−1 +D[i−3]).
We solve n subproblems where each takes O(1) time for arithmetic, so the runtime is O(n).
The values of D[i] we calculate are at most the maximum sum D[i] of a subset of the elements up to i that doesn’t contain any three consecutive elements because each case in the maximum for D[i] corresponds to some appropriate subset of A. Each appropriate subset of the first i elements of A corresponds to some choice in the maximum we calculate for D[i], so the values of D[i] we calculate are at least the maximum sum D[i] of a subset of the elements up to i that doesn’t contain any three consecutive elements.
CS Help, Email: tutorcs@163.com
6 Almost a Majority (15 points)
Give an algorithm that returns True if any element in an input list of n integers appears more than n/124 times, and False otherwise. For full credit, your algorithm should run in worst-case linear time. For up to 12 points of partial credit, your algorithm can run in expected linear time. You don’t need to prove your algorithm’s runtime. Prove the correctness of your algorithm.
Solution: Run the linear-time selection algorithm from lecture to find the n/124th-smallest, 2n/124th-
smallest, …, 123n/124th-smallest elements of the input list. For each of them, read through the list and
count the number of occurrences; return yes if any of those is at least n , and no otherwise. 124
We only return yes if we find an element that occurs at least n/124 times, so there are no false positives. If there’s an element that occurs at least n/124 times, it’s one of the n/124th-smallest, 2n/124th-smallest, . . ., 123n/124th-smallest elements of the input list, so we find it correctly.
(For partial credit, there’s a solution that hashes each element, checks whether any hash value appears at least n/124 times, and double-checks the answer. That’s only expected linear time from what we’ve seen in class, although it’s possible to make an advanced modification to a hash table data structure to make its worst-case time O(1) if we know the number of elements in advance, as we do here.)
7 Orders under permutations (12 points)
Recall that the resemblance R(A, B) of two sets A and B is defined as R(A, B) = |A∩B| . |A∪B|
SupposethatforsomepositiveintegerN,AandBaresubsetsof[N](i.e. {0,1,…,N−1})with|A|≥2 and |B| ≥ 2. Prove or disprove: for every such A and B, if we choose a uniformly random permutation π:[N]→[N],theprobabilitythatsecond-largest({π(a):a∈A})=second-largest({π(b):b∈B})isR(A,B). (Given a set X, second-largest(X) is the second-largest element of X.)
One possible answer: take A = {0, 1, 2, 3, 4}, B = {0, 1, 2}. These sets have resemblance 35 . second-largest({π(a) :
a ∈ A}) is always 3. second-largest({π(b) : b ∈ B}) is only 3 if the permutation sends two of 0, 1, and 2 to
3 and 4 and the last one anywhere else, which happens with probability 6 · 1 · 1 = 3 , which isn’t 3 . 5410 5
浙大学霸代写 加微信 cstutorcs
8 Martian utilities (36 points)
Elon is planning to bring power and water to his Martian colonies! A colony consists of nm buildings arranged in an n × m grid, and the building at position (i, j) has a positive integer height ci, j .
1. (16 points) To bring water to the buildings, Elon plans to build pumps at some of them. The cost to build a pump at height h is $h. A pump provides water to its own building and the buildings immediately north, south, east, and west of it (up to 4 buildings, but fewer at the boundaries of the colony). The goal is to bring water to all buildings with the minimal cost. Give a 5-approximation algorithm for the problem: that is, an algorithm that returns a set of generators whose total cost is within a factor of 5 of the smallest possible total cost. For full credit, your algorithm should run in O(nm) time, assuming all relevant integers are small enough that integer arithmetic operations take O(1) time, but slower (polynomial-time) correct algorithms will receive some credit. You don’t need to prove a bound on your algorithm’s runtime. Prove the correctness of your algorithm.
(Hint: if each pump could only supply water to its own building or one of its four neighbors, but you could build multiple pumps in the same place, what would the optimal solution be?)
Solution: For each building, take the minimum of its cost and its neighbors’ costs, and mark that as a place to build a pump. Build pumps at all the marked places. This waters all the buildings, because each building has a pump at it or a neighbor. Call the cost of this solution Agreed. Call the cost of the optimal solution Aopt. Define Bgreed and Bopt similarly for the hinted problem—the greedy solution builds, for each building, the cheapest pump that waters that building.
Agreed ≤ Bgreed because the greedy solution to B builds one pump at a building per mark instead of one at each marked building and is otherwise the same.
Bgreed = Bopt because pumps don’t interact with each other in the hinted problem; we can’t do better than building the cheapest pump for each building.
Bopt ≤ 5Aopt because one possible solution to the hinted problem (possibly worse than Bopt is to build 5 pumps at each building where the optimal solution builds 1 and have them supply water north, south, east, west, and center: then we supply each of the buildings with water from a pump dedicated to it.
Put all these together, and Agreed ≤ 5Aopt, as desired.
2. (16points)Tobringelectricitytothecolonies,Elonplanstobuildwindturbinesatthemand/orpower
lines between them. The cost to build a wind turbine at height h is $h2. The cost to build a power line
between two buildings is the distance (in three dimensions) between them. A wind turbine provides
power to its own building and all buildings connected to it by (chains of) power lines. For instance, if
n=1,m=3,andtheheightsofthebuildingsarec1,1 =1,c1,2 =2,andc1,3 =3,thenthecheapestway
to power the colony is to build a wind turbine at (1,1,1) (for $1) and power lines connecting (1,1,1) to
(1,2,2) and (1,2,2) to (1,3,3) (for $ 2 each).
Give an algorithm to find the set of locations to build wind turbines and power lines that minimizes the total cost. For full credit, your algorithm should run in O(n2m2) time, assuming all relevant integers are small enough that integer arithmetic operations take O(1) time, but slower (polynomial-time) correct algorithms will receive some credit. You don’t need to prove a bound on your algorithm’s runtime. Prove the correctness of your algorithm.
Solution: Make a complete graph whose vertices are the buildings plus one vertex labeled “wind”. Label each edge between buildings with the squared distance between them (so we never have to take any square roots and worry about nonintegers). Label each edge to “wind” from a vertex at height h with $h4. Run Prim’s algorithm, using an array for the priority queue. (Since the graph has O(mn) vertices and O(m2n2) edges, this runs in time O(m2n2).) Build a wind turbine at every vertex with an edge to “wind”, and build power lines for every other edge.
Every set of wind turbines and power lines that provides power to all the buildings in the original problem corresponds to a set of edges that connects the graph we’ve defined and vice versa. If there isn’t any power line e that connects buildings that would already both be powered without e, then there aren’t any edges whose removal leaves the graph connected, so the set of edges is minimal, that is, it’s a spanning tree. So we want the spanning tree for which the sum of (the square roots of) the edge weights is minimal.
Prim’s algorithm greedily picks the cheapest edge at each step, which is also the edge whose square root is cheapest, so it picks the same set of edges whether we use the squared-distance weights (in- tegers we can compare by integer arithmetic) or their square roots (distances whose value defines the cost in the original problem). Hence it returns an optimal solution.
3. (4 points) In the electricity-provision problem above, modern computer scientists do not know any polynomial-time algorithm to determine, given an integer $k, whether it’s possible to provide power to all the colonies for a cost of at most $k. Explain how this is consistent with the existence of an algorithm as above.
Solution: Although we can find a set of edges that minimizes the cost in polynomial time (the algo- rithm above is still polynomial time if integer arithmetic operations on k-bit integers take time O(k2), and runtimes are measured as a function of the length of the whole input, which includes the input integers), and so we can represent the minimum cost of a solution as a sum of square roots of integers, we don’t have a way to compare a sum of square roots of integers to a target integer k in polynomial time: e.g. if you represent the square roots as floating points, the amount of precision you need to compare their sum to a given integer might be more than polynomial. (Floating-point approximations are usually good enough and correct in practice, but do sometimes lead to rare and hard-to-diagnose bugs.)
Extra page for problem #
Extra page for problem #
Computer Science Tutoring
Extra page for problem #