CS 124 Midterm: Spring 2022
Your name: Instructions:
• Spend at most 75 minutes on the exam, from the time you first see a problem to the last time you write anything in your solutions. (Time spent uploading doesn’t count against your time.)
Time started: Time ended:
• Do not communicate with anyone but the course staff about the contents of the exam.
• Do not use a computer or other electronic aids during the exam for anything except for:
1. (optionally) typesetting your solutions,
2. (optionally) asking clarification questions of the instructors on Ed.
3. (optionally) looking at pre-prepared notes, using no electronic features not also possible on paper (in particular, no search function)
• (You may use your own prepared notes.)
• After you finish the exam, type your name in affirmation of the following statement:
I pledge my honor that I have obeyed all instructions on this page in taking this exam.
After you finish, upload your exam to Gradescope as if it were a problem set.
• You have 75 minutes to earn a maximum of 75 points. Do not spend too much time on any single problem: we recommend roughly n minutes on an n point problem. Read all the problems first, and attack them in the order that allows you to make the most progress.
• You can ask clarification questions of the instructors on Ed.
• Be neat and write legibly and clearly. You will be graded not only on the correctness of your
answers, but also on the clarity with which you express them.
• Do not waste time re-deriving results presented in class, psets, or sections. You may simply state and cite them.
Programming Help, Add QQ: 749389476
1 True or False (18 points)
For each of the following statements, say whether the statement is TRUE or FALSE. Each correct answer is worth 2 points. Incorrect answers are worth 0 points. No explanations are needed.
1. 5log∗ n = O(n5)
True False 5log∗n =o(5log5n),5log5n =n=o(n5). Alternately: logm≥log∗m. Choosem=logn,andyougetloglogn≥log∗logn=−1+log∗nbydefinitionoflog∗. So 5log∗n ≤5∗5loglogn =O(logn)=O(n5).
2. No directed graph with 124 vertices and 124 edges has a valid topological sort.
True False Label the vertices as 0,…,123, and add edges (i,i+1) for i ∈ [123] and edge (0, 2).
3. The following encoding is prefix-free:
A→1 B→2 …
Y → 25 Z → 26
True False
4. There is an O(n1000) algorithm to sort a list of size n.
True False Bubble sort runs in O(n2) ⊂ O(n1000) time.
5. Given a biased coin that shows heads with probability 1, is there an algorithm to generate unbiased coin flips from flips of this coin (and no other source of randomness) that generates, in expectation, at least 81 unbiased flips per flip of the biased coin?
True False Notice that the final result of the algorithm is deterministic, so we cannot generate a random flip. One can also do an entropy-based argument.
6. For every directed graph G, we can find a topological sort of G by running a DFS and ordering the vertices in decreasing order of their post-order numbers.
True False G may have a cycle, in which case there is no topological sort.
7. For every directed graph G, the number of strongly connected components of G is strictly
less than the number of vertices of G.
True False Consider a graph with no edges. It has the same number of SCCs
as vertices. In fact, every directed acyclic graph with n vertices has n SCCs.
e.g. the encoding of B is a prefix of the encoding of Y
8. Suppose we Huffman-encode a set of items of which one item i occurs with probability pi = 1/2. Then, the Huffman encoding of i is always 1 bit long.
True False In the Huffman encoding, we will aggregate every other term except for i first, so i will be at the top of the tree. An item with probability 12 has greater probability than any proper subset of the remaining items, so it can’t be one of the cheapest two until all other items are merged.
9. For every undirected graph with all positive edge weights, there exists a vertex v such that the tree of shortest paths from v is a minimum spanning tree of the graph.
True False The graph might not be connected and so not have a minimum spanning tree. There are also connected counterexamples: e.g. a complete graph on at least 4 vertices where a cycle of edges has weight 2 and all other edges have weight 3. Every shortest path tree uses the diagonal edges, but the MST does not.
CS Help, Email: tutorcs@163.com
2 Edges In and Out of MSTs (8 points)
For each of the graphs below, mark an “x” on each of two edges that cannot be part of any minimum spanning tree, and an “o” on each of two edges that must be in every minimum spanning tree. Note that there may be more than two possibilities, but you should identify just two of each. No justification is needed.
To do this systematically, we can use the following two sufficient conditions for edges in an MST:
1. The heaviest edge in a cycle is never in an MST,
2. The lightest edge crossing a cut is always in an MST.
3 Even Paths (12 points)
Give an algorithm that, given a directed acyclic graph G = (V(G),E(G)) with positive integer edge weights, and also given vertices s, t ∈ V (G), finds the length of the shortest path from s to t that uses an even number of edges (or returns ∞ if no such path exists). Your algorithm should run in time and space O(n + m) on a graph with n vertices and m edges, but you don’t need to analyze the runtime or space or prove that upper bound. Prove the correctness of your algorithm.
One possible solution uses the DAG shortest paths algorithm as a black box. Make a new graph G′ with 2|V (G)| vertices, labeled (v, odd) and (v, even) for each vertex v ∈ G. Give G′ edges from (u, odd) to (v, even) and from (u, even) to (v, odd) for each edge (u, v) of G, and make those edges the same weight as (u,v). Then every even-length path from s to v in G corresponds to a path from (s,even) to (v,even) of the same weight in G′ (by alternating between “even” and “odd” vertices) and vice versa (every even-length path in G′ ends in a vertex labeled “even”), so the shortest path to (t, even) in G′ is the shortest even path to t in G. So, run the DAG shortest paths algorithm on G′ and return the distance to (t, even).
Another solution is by dynamic programming: First run a topological sort so the vertices are labeled 0, 1, 2, . . . , n − 1 in order. Next, let dp[v][0] be the shortest even-length path from s to v, and dp[v][[1] be the shortest-odd length path. We want dp[t][0], and we know that dp[v][e] = ∞ for all v < s, dp[s][0] = 0 and dp[s][1] = ∞.
For every path of even length (s,...,u,v), the path (s,...,u) is an odd length path and vice versa. Hence,
dp[v][0]= min dp[u][1]+wuv,dp[v][1]= min dp[u][0]+wuv. (u,v)∈E (u,v)∈E
Thus the DP is complete. Since we look at each vertex and edge once, we use O(n + m) time and O(n) space.
4 Union before Find (8 points)
Suppose we have an initially-empty union find data structure (complete with the union-by-rank and path-compression heuristics), on which we call MAKESET n times, then UNION m times, then FIND l times, in that order. Prove that the total time necessary to do all those operations is O(n+mlog∗ n+l).
For up to 5 points’ worth of partial credit, instead prove that the total time necessary to do all those operations is O(n log∗ n + m log∗ n + l).
(As a reminder that you may use without proof, the total time to run MAKESET n times, UNION m times, and FIND l times, in possibly-mixed order is O(n + m log∗ n + l log∗ n).)
Common mistake: it’s not true that the time to run the FINDs is O(l)—even one FIND might take log n time).
Partial credit: Each FIND after the first on a given item takes O(1) time (for a total of at most O(l)) and doesn’t affect the structure, since the first FIND path-compresses it to the root and there are no later UNIONs. The MAKESETS, UNIONs, and remaining n FINDS take O(nlog∗n + mlog∗n) time by the “reminder”.
Solution 1: At most 2m elements are unioned together. Each FIND on an item that’s never unioned, and each FIND after the first on a given item, takes O(1) time (for a total of at most O(l)) and doesn’t change the structure. The MAKESETS, UNIONs, and remaining ≤ 2m FINDS take O(n + mlog∗n + mlog∗n) time by the “reminder”, for a total of O(n + mlog∗n + l))
Solution 2: During the FINDs, each element’s parent changes at most once (to the root of its tree), so at most n parent changes happen. During a FIND, every link that’s followed either changes a parent (which happens at most O(n) times) or is a link to the root (which happens at most once per find), for a total of O(n + l) work. The MAKESETs and FINDs take time O(n + m log∗ n) by the reminder, for a total of O(n + m log∗ n + l)
程序代写 CS代考 加QQ: 749389476
5 Martian Colonization (17 points)
Elon is hoping to set up colonies on Mars. There are n potential locations for colonies, and some of the locations have natural roads that allow colonists to travel between them. The set of natural roads forms a tree on the graph whose vertices are the potential colonies. Each location l for a colony has some (not necessarily positive) integer quality ql. However, to survive on Mars, a colony must have at least one colony neighboring it (by a natural road). A colony may have multiple neighbors. Elon wishes to found the set of colonies whose total value is maximal.
1. (4 points) A greedy algorithm calculates, for each edge (u, v), the sum qu +qv of the qualities of its endpoints, and calls that edge “good” if the sum of the qualities is positive. Then it founds colonies at every site that’s an endpoint of at least one good edge. Give an example instance of this problem (a tree and qualities of its vertices) for which this greedy algorithm does not solve the problem. (No proof necessary.)
Consider the tree with 3 vertices, edges (1, 2), (1, 3) and qualities q1 = 2, q2 = −1, q3 = −1. The optimal solution colonies 1 and 2, while the greedy solution colonies all 3.
2. (13 points) Give an algorithm to maximize the sum of the qualities of the colonies. Your algorithm should run in time O(n124) and space O(n), but you don’t need to analyze the runtime or space or prove that upper bound. You don’t need to prove its correctness, but an explanation may be considered for partial credit if your algorithm’s wrong or incomplete.
First give the tree a root r. Let dp[v][f1] be the optimal way to colonize the subtree rooted at v, with a few potential modification:
(a) If f1 = 1 then we have to build a colony at v, and v is free from the “colony must have a neighbor” condition.
(b) If f1 = 2 then we are free to build a colony at v, even if none of its children are colonies.
We want dp[r][0]. For each leave f, dp[v] = [0,qv,max(qv,0)]. For a vertex v, let V denote its set of children. If neither modification is true we can either not build a colony there, or build a colony at v and force one of its children. So,
! dp[v][0] = max qv + min(dp[w][1] − dp[w][2]) + X dp[w][2], X dp[w][0] .
If we have to build a colony there, then all its children are free to be colonies. So,
dp[v][1] = qv + X dp[w][2]. w∈S
If we are free to build a colony, then we can either build one or not. So,
! dp[v][2] = max qv + X dp[w][2], X dp[w][0] .
The runtime is O(n) since we consider each child O(1) times, and so is the space complexity.
6 Exam Security (12 points)
In order to keep the 2022 CS 124 Midterm as secret as possible, Adam has broken the answer key into n parts {a1,...,an} and given each part ai to one of the n TFs, ti. Each TF, eager to know whether or not they can even score higher than the median on this exam, makes promises with other TFs—“I promise to give you all parts of the exam that I know.” Let the set of promises be P, and let element (ti,tj) ∈ P indicate a promise from ti to tj.
For example, if t1 = John extends a promise to t2 = Phyllis and t2 = Phyllis extends a promise to t3 = Tarun, then, after all promises are fulfilled John will have a1; Phyllis will have a1 and a2; Tarun will have a1,a2, and a3 (since Phyllis will give Tarun everything she knows, including her part and John’s part). The TFs may communicate multiple times to fulfill their promises: e.g. if t3 = Tarun also made a promise to t1 = John, then after all communication, all three TFs would know all three pieces of the answer key.
You are a CS124 student, worried about this exact moment – reading an overly wordy question on the midterm. So, after the TFs have finished communicating, you decide to steal a TF’s laptop, containing all their pieces of the answer key! You can only steal one laptop before you get caught. You would like the full answer key. Design an algorithm that, given n and P, will return the list of all TFs whose laptops you can steal to obtain the full answer key (or the empty list if no such TFs exist). If there are m promises made, your algorithm should run in O(n + m) time and space, but you don’t need to analyze the runtime or space or prove that upper bound. Prove the correctness of your algorithm.
Run DFS on GR and use the results as an order to run DFS on G to construct an SCC graph, as proven in class. Topologically sort the SCC graph. For each vertex of the SCC graph, check whether it’s a sink by checking whether it has no outgoing edges. Then, if the SCC graph has a unique sink, return every vertex in that sink. Otherwise, return the empty list.
Finding all sinks takes O(n + m) time and so does DFS, so the algorithm is O(n + m) time. For correctness:
1. No one outside a sink SCC knows that SCC’s answers, so any TFs who know all the answers must be in every sink SCC (in particular, there must be only one).
2. Suppose a TF in SCC i doesn’t know all the answers. Let j a TF whose answers i doesn’t know. Then every SCC reachable from j also doesn’t send its answers to i, so the latest such SCC (in the topological order) is a sink (because it’s the latest) and doesn’t contain i. That is, if a TF doesn’t know all the answers, there’s a sink SCC not containing them.
Extra page for problem #
Extra page for problem #