CS 124 Homework 3: Spring 2023
Your name: Collaborators:
No. of late days used on previous psets:
No. of late days used after including this pset:
Homework is due Wednesday March 1 at 11:59pm ET. You are allowed up to twelve (college)/forty (extension school) late days in total, but the number of late days you take on each assignment must be a nonnegative integer at most two (college)/four (extension school).
Try to make your answers as clear and concise as possible; style may count in your grades. Assignments must be submitted in pdf format on Gradescope. If you do assignments by hand, you will need to scan your papers to turn them in.
You may not write anything that you will submit within one hour of collaborating with other students, language models, the internet, etc., or using notes from such sources. That is, whatever you submit must first have been in your head alone, or notes produced by you alone, for an hour. Subject to that, you can collaborate with other students, e.g. in brainstorming and thinking through approaches to problem-solving.
For all homework problems where you are asked to give an algorithm, you must prove the correctness of your algorithm and establish the best upper bound that you can give for the running time. Generally better running times will get better credit; generally exponential-time algorithms (unless specifically asked for) will receive no or little credit. You should always write a clear informal description of your algorithm in English. You may also write pseudocode if you feel your informal explanation requires more precision and detail, but keep in mind pseudocode does NOT substitute for an explanation. Answers that consist solely of pseudocode will receive little or no credit. Again, try to make your answers clear and concise.
1. Meeting Minimization: Harvard University’s new President has received a shock on her first day in office. Every one of the n = 2400 faculty members wants to meet with her to discuss their list of complaints. She doesn’t have 2400 meeting slots available and would like to minimize the number of meetings. She learns that if she gets two faculty members to meet they can learn each other’s complaints and merge their list of complaints, so meeting either one will suffice to satisfy both. Indeed this satisfaction is recursive: If A meets B and then B meets C and then C meets the president then all of A, B and C are satisfied. However, faculty won’t meet with each other unless there are some inducements. Specifically it costs the university C(X, Y) dollars to get faculty members X and Y to meet. The university however has a limited budget B and can only pay for a sequence of meetings of total cost at most B.
Design an algorithm that takes as input an n×n matrix C(X, Y ) and a bound B and outputs a sequence of meetings of the form “X meets Y”, “Y meets Z”, “Y meets the president” etc.,
while minimizing the number of meetings with the president such that (1) The total cost of the meetings not involving the president is at most B and (2) The meetings with the president satisfy all faculty members.
Warning: Note that the order of meetings does matter. For instance, a sequence of meetings A-B, B-C, A-D would end up with A and D knowing the complaints of {A, B, D} and B and C knowing the complaints of {A, B, C}. But if we were to order the meetings as B-C, A-B, A-D, then A and D would know the complaints of all four.
Solution 1 (worse runtime):
We construct a graph where each faculty member is a vertex and, between every pair of vertices X, Y, there is an undirected edge (X, Y) with weight $w_{X,Y} = C(X, Y)$. We then run Kruskal's algorithm, stopping as soon as adding the next component-merging edge would take the total cost above B.
We then add a new vertex P for the president, connect P to one vertex of each tree in the resulting forest, and run BFS from P. As we process each edge (a, b), we append the meeting "a meets b" to a list. We then return the reversed list of meetings.
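The following Python sketch illustrates this solution; the function and variable names are our own, it assumes C is given as an n × n list of lists of costs, and it returns meetings as pairs rather than sentences:

```python
# Sketch of Solution 1: Kruskal's within budget B, then reversed BFS order.
from collections import deque

def meeting_plan(C, B):
    n = len(C)
    parent = list(range(n))

    def find(x):  # union-find representative, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's: take cheapest useful edges until one would exceed budget B.
    adj = {v: [] for v in range(n)}
    spent = 0
    for w, x, y in sorted((C[x][y], x, y)
                          for x in range(n) for y in range(x + 1, n)):
        rx, ry = find(x), find(y)
        if rx == ry:
            continue          # would create a cycle; skip
        if spent + w > B:
            break             # next useful edge would blow the budget
        spent += w
        parent[rx] = ry
        adj[x].append(y)
        adj[y].append(x)

    # Add the president P, linked to one representative of each tree.
    P = 'president'
    adj[P] = [v for v in range(n) if find(v) == v]
    for v in adj[P]:
        adj[v].append(P)

    # BFS from P, recording each tree edge; reversing the list makes
    # information flow from the deepest vertices up toward P.
    meetings, seen, queue = [], {P}, deque([P])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                meetings.append((v, u))  # "v meets u"
                queue.append(v)
    return list(reversed(meetings))
```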
Time complexity:
• Constructing the graph takes $O(n^2)$ time, as there are $O(n^2)$ edges and $O(n)$ vertices.
• Running Kruskal's takes at most $O(n^2 \log n)$ time, since $m = O(n^2)$.
• Connecting all the trees of the forest to the president takes $O(n)$ time, as there are at most $O(n)$ trees in the forest.
• Running BFS takes $O(n)$ time, as the resulting graph has $O(n)$ edges and $O(n)$ vertices.
Thus the overall runtime of this algorithm is $O(n^2 \log n)$.
Solution 2 (better runtime): We construct the same graph. We run Prim's algorithm to find a minimum spanning tree, using an unsorted array for our priority queue. We sort the edges of the resulting MST by weight and throw out the heaviest edges one by one until the total weight is at most B. Then we add a vertex for the president and run BFS as in the previous solution.
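Here is a sketch of the Prim's phase under the same assumptions; the "unsorted array" priority queue is simulated with a linear-scan delete-min, and the names are ours:

```python
# Sketch of Solution 2's Prim phase: O(n) delete-min by scanning an
# "unsorted array", O(1) decrease-key by overwriting an array slot.
import math

def prim_edges_within_budget(C, B):
    n = len(C)
    in_tree = [False] * n
    dist = [math.inf] * n      # cheapest known edge from v into the tree
    parent = [None] * n
    dist[0] = 0
    mst = []
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: dist[v])
        in_tree[u] = True
        if parent[u] is not None:
            mst.append((dist[u], parent[u], u))
        for v in range(n):
            if not in_tree[v] and C[u][v] < dist[v]:
                dist[v], parent[v] = C[u][v], u

    # Keep the lightest MST edges while the total is at most B
    # (equivalent to throwing out the heaviest edges one by one).
    kept, total = [], 0
    for w, u, v in sorted(mst):
        if total + w > B:
            break
        total += w
        kept.append((u, v))
    return kept
```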
Time complexity:
• Constructing the graph takes $O(n^2)$ time, as there are $O(n^2)$ edges and $O(n)$ vertices.
• Running Prim's takes $O(n^2)$ time: $O(1)$ each for $m = O(n^2)$ inserts and $O(n)$ each for $O(n)$ delete-mins.
• Connecting all the trees of the forest to the president takes $O(n)$ time, as there are at most $O(n)$ trees in the forest.
• Running BFS takes $O(n)$ time, as the resulting graph has $O(n)$ edges and $O(n)$ vertices.
Thus the overall runtime of this algorithm is $O(n^2)$.
Proof of correctness:
• Both of these algorithms find a maximum-size subset X of edges of some MST T ⊇ X with total weight at most B. We need to prove that such an X is also a set of edges of total weight at most B with the fewest connected components (since each meeting with the president satisfies at most one connected component's worth of faculty).
Suppose for contradiction that some set of edges of total weight at most B has fewer connected components than X; let Y be the minimum-weight such set. If Y has any cycles, remove one edge from one of them: the result has the same number of connected components as Y but smaller weight, contradicting the choice of Y. So Y has no cycles. The number of connected components of Y is n − |Y| and the number of connected components of X is n − |X|, since each edge added to an empty graph on n vertices reduces the number of connected components by 1. Since Y has fewer components, n − |Y| < n − |X|, that is, |Y| > |X|.
Since T is a spanning tree, it connects the whole graph. Add to Y whatever edges of T are necessary to connect the whole graph (adding only edges that merge components, so no cycles form); call the added edges Z. This gives a spanning tree of the graph whose weight is w(Y) + w(Z). Now w(Z) is at most the total weight of the heaviest n − 1 − |Y| edges of T (since |Z| = n − 1 − |Y| and each edge of Z comes from T), and w(Y) is less than the total weight of the lightest |Y| edges of T (because we defined X to be a maximal subset of edges of T with weight at most B, so any |Y| > |X| of the lightest edges of T would have total weight exceeding B ≥ w(Y)). So we have a spanning tree of the graph with weight less than w(T), contradicting the fact that T was an MST. Hence a maximum-size subset X of edges of an MST with weight at most B is also a set of edges of total weight at most B with the fewest connected components, as claimed.
• The chain of meetings satisfies all faculty members, since by construction there is a path from every vertex to the president (we added an edge from the president to a vertex of every tree in the forest).
• Since we return the reversed list of edges, our meetings are ordered from the vertices at the greatest depth to those at the least depth below the president. To show that this satisfies all faculty members, every faculty member's complaints must reach someone who meets the President. Suppose for contradiction that some faculty member v is not satisfied by this ordering. Since our graph is a tree, there is a unique path (P, a), (a, b), …, (t, u), (u, v) from the President to v, and each vertex along it sits one level deeper than the previous one. Since our algorithm returns the edges in reverse BFS order (and therefore reverse depth order), the meeting (u, v) happens before (t, u), so u has v's complaints before meeting t, and so on up the path, so a has v's complaints when meeting the President. Hence v is satisfied, a contradiction.
2. Network Construction Minimization:
Major shipping company Nile, headquartered in Seeyatill, henceforth s, wants to connect its n warehouses to the source warehouse at s, by building a series of "links". Specifically, after the network is established, for every warehouse i there is a path via a series of links from s to i. Links can only be built between certain pairs (i, j) of warehouses. For each such pair, there's a positive integer distance $d_{i,j}$ between them. Building a link from warehouse i to warehouse j requires buying $d_{i,j}$ units of construction materials for a cost of $1 each and then shipping them from s to warehouse i, but shipping construction materials can only be done along existing links, and shipping the complete set of construction materials for one link a distance of 1 costs $1. That is, if there's a path $s \to i_1 \to \cdots \to i_t \to i$ which connects s to i, then a link (i, j) can be built for a cost of $d_{i,j} + C(i)$, where $C(i) = d_{s,i_1} + d_{i_1,i_2} + \cdots + d_{i_t,i}$. (We can only add a link (i, j) if i is previously connected to the source s.)
Your goal in this problem is to give an algorithm that outputs a sequence of links to be constructed such that they connect all the warehouses as cheaply as possible. (I.e., if your sequence is $(i_1, j_1), \ldots, (i_t, j_t)$ then you should minimize $\sum_{l=1}^{t} \mathrm{cost}(i_l, j_l)$.)
Consider a graph representation G of the problem where each warehouse is a vertex. Create an undirected edge of weight $d_{i,j}$ between i and j if a link can be built between warehouses i and j. Run Dijkstra's algorithm on G using s as the source vertex, generating the Dijkstra (shortest-path) tree T. We claim that the edges in T are the links that should be constructed. Run a DFS on T to output the sequence of links in the proper order (parents before children, so each link's first endpoint is already connected when the link is built).
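A minimal sketch of this algorithm, assuming graph[u] is a list of (v, d_uv) pairs; we use Python's heapq (a binary heap) in place of the Fibonacci heap named in the runtime analysis below, which changes the runtime to $O((n + m)\log n)$ but not the output:

```python
# Sketch: Dijkstra from s, then DFS over the shortest-path tree so that
# each link (u, v) is built only after u is already connected to s.
import heapq

INF = float('inf')

def link_order(graph, s):
    dist = {s: 0}
    parent = {}
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, INF):
            continue                      # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, INF):
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(pq, (d + w, v))

    # Emit tree edges parent-first via an explicit-stack DFS from s.
    children = {}
    for v, u in parent.items():
        children.setdefault(u, []).append(v)
    order, stack = [], [s]
    while stack:
        u = stack.pop()
        for v in children.get(u, []):
            order.append((u, v))          # u is connected before this point
            stack.append(v)
    return order
```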
We prove the correctness of this algorithm. For each vertex v ≠ s in G, suppose that we first connect v to s by building a link e = (u, v) of cost $d_{u,v} + C(u)$. C(u) is the length of some path from s to u, so the cost to build link e is at least the length of the shortest path from s to v via u, which is in turn at least the length of the shortest path from s to v. Therefore any solution pays at least $\sum_{v \neq s} \mathrm{dist}(s, v)$ in total. Our algorithm achieves exactly this bound: for every edge (u, v) of the shortest-path tree, the DFS ordering guarantees that u is connected before the link is built, the materials can be shipped along the tree path to u of length $\mathrm{dist}(s, u) = C(u)$, and so the link costs $d_{u,v} + \mathrm{dist}(s, u) = \mathrm{dist}(s, v)$.
Let m be the number of pairs (i, j) of warehouses between which links can be built. Graph construction takes $O(m + n)$ time and running Dijkstra with a Fibonacci heap takes $O(n \log n + m)$ time. Outputting the edges from the Dijkstra tree then takes $O(n)$ time, as we are running a DFS on a tree with only n − 1 edges. Therefore, the algorithm's overall runtime is $O(n \log n + m)$.
3. Explain how to solve the following two problems using heaps. (No credit if you’re not using heaps!)
(a) (12 points) Give an O(n log k) algorithm to merge k sorted lists with n total elements into one sorted list.
Consider the following algorithm:
i. Create a list of size n to store our final output.
ii. Create a min-heap over the k lists, where each heap node points to the first element of its list and uses that element as its key.
iii. Pop the first element off the root's list and store it in the first empty slot of the output list. Then, if that list is not empty, insert the remainder of the list back into the heap, keyed by its new first element. If the list is now empty, simply remove the node from the heap, re-heapify, and continue.
iv. Repeat step iii n times.
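A compact Python version of these steps, assuming heapq as the min-heap; each heap entry is a (value, list index, position) triple standing in for the "pointer to the first element":

```python
# Sketch of the k-way merge: the heap holds one entry per nonempty list.
import heapq

def merge_k_sorted(lists):
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)                        # O(k)
    out = []
    while heap:
        val, i, pos = heapq.heappop(heap)      # O(log k)
        out.append(val)
        if pos + 1 < len(lists[i]):            # list i still nonempty
            heapq.heappush(heap, (lists[i][pos + 1], i, pos + 1))
    return out

# Example: merge_k_sorted([[1, 4, 9], [2, 3], [5]]) == [1, 2, 3, 4, 5, 9]
```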
Correctness:
Lemma: The smallest remaining element of each list is in the heap at every step of the loop.
Proof by induction:
• Base case: initially we add the first element of each list to the heap. Since each list is sorted, each of these elements is the smallest remaining element of its list, so the lemma holds initially.
• Inductive hypothesis: after t iterations, the smallest remaining element of each list is in the heap
• Inductive step: Since we know that after t iterations the smallest remaining element of each list is on the heap, on the (t + 1)-th step of the algorithm we pop an element off the heap and then add the next element from the list that the popped element came from. After this step, k − 1 of the heap entries are unchanged: all the other lists still have the same element on the heap, which by the inductive hypothesis was the smallest remaining element of each of those lists, so these k − 1 entries are correct. For the list that had its element popped from the heap, if it is now empty, nothing is added to the heap (the heap is simply re-heapified); since that list has no remaining elements, it is vacuously true that its smallest remaining element is in the heap. If it still has remaining elements, we add its next element to the heap; since the list is sorted, that next element must be the smallest remaining element of the list, so the smallest remaining element of the list is now on the heap and the lemma holds.
Now we know that at every step of our algorithm, the smallest remaining element of each list is on the heap. Since the heap is a min-heap, each time we pop an element from the heap we remove the smallest element on the heap, which is the minimum over all remaining elements, as it is the smallest of the smallest remaining elements of each list. Thus we correctly remove the next smallest overall element at each step. Since we loop n times, this correctly returns the overall sorted list of length n.
Time complexity:
• Initialize an array of size n in O(n) time.
• Create a min-heap of size k in O(k) time.
• Repeat the following n times:
– Read the minimum element at the root of the min-heap, which takes $O(1)$ time.
– Store that element in the next empty slot of the output array in $O(1)$ time.
– Reinsert the root's list back into the heap with a pointer to its next element (takes $O(1)$ time). We then have to run an iteration of MIN-HEAPIFY at the root to ensure that our heap remains intact, which takes $O(\log k)$ time since the heap has at most k elements.
Overall runtime: $O(n \log k)$, as required.
(b) (12 points) Say that a list of numbers is k-close to sorted if each number in the list is less than k positions from its actual place in the sorted order. (Hence, a list that is 1-close to sorted is actually sorted.) Give an O(n log k) algorithm for sorting a list of n numbers that is k-close to sorted.
Consider the following algorithm:
• Build a min heap using the first k elements of the list.
• Initialize an empty output array.
• Iterate n times:
– Pop an element from the min-heap and store it in the next position of the output array.
– If the input list has elements remaining, insert its next element into the heap.
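A short Python sketch of this algorithm, again assuming heapq (the function name is ours):

```python
# Sketch of the k-close sort: the heap never exceeds k elements.
import heapq

def sort_k_close(a, k):
    heap = a[:k]
    heapq.heapify(heap)             # O(k)
    out = []
    for nxt in a[k:]:               # pop the min, slide the window right
        out.append(heapq.heappop(heap))
        heapq.heappush(heap, nxt)
    while heap:                     # drain the final k elements
        out.append(heapq.heappop(heap))
    return out
```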
Correctness:
Proof by induction:
• Base case: n = 1. The heap is built with just this one element, so when we pop it we place exactly that element in our output array (and we insert nothing back into the heap, since nothing is left in the list). The output array then contains just the one element, which is trivially sorted, so the algorithm is correct in the base case.
• Inductive hypothesis: the algorithm works correctly for lists of length n = c.
• Inductive step: n = c + 1. Since we know from our inductive hypothesis that the algorithm works for n = c, all we need to do is show that it correctly handles the first element; then our inductive hypothesis ensures the rest of the list is correctly sorted.
• Since the list is k-close to sorted, when we initialize our heap with the first k elements of the list, the heap must contain the smallest element of the list (the smallest element belongs in position 1 and is less than k positions from its sorted place, so it lies within the first k positions). Thus the minimum value v0 is on our heap at the start. Since this is a min-heap, the first pop must return v0, which we correctly place in the first position of the output array. We then add the next element of the list to the heap. At this point we have dealt with one element, and the remaining c elements are all still less than k positions from their correct places, so by our inductive hypothesis the algorithm correctly sorts the rest of the list. Therefore we correctly place the first element in the first position and correctly sort the rest, giving a correctly sorted list of size n = c + 1, and the proof is complete.
Time complexity:
It takes O(k) time to build the heap and O(n) time to create the output list. In section we proved that removing the top element from a min-heap takes $O(\log k)$ time, and inserting an element also takes $O(\log k)$ time; therefore the loop takes $O(n \log k)$ time in total. Thus the overall running time of the algorithm is $O(n \log k)$, as required.
4. (0 points, optional)1 Consider the following generalization of binary heaps, called d-heaps: instead of each vertex having up to two children, each vertex has up to d children, for some integer d ≥ 2. What’s the running time of each of the following operations, in terms of d and the size n of the heap?
(a) delete-max()
(b) insert(value)
(c) promote(x, newvalue)
The last operation, promote(x, newvalue), updates the value of x, an element in the heap, to newvalue, which is guaranteed to be greater than x's old value. (Alternately, if it's less, the operation has no effect.)

1We won't use this question for grades. Try it if you're interested. It may be used for recommendations/TF hiring.
Solution Link:
All parts are in the Week 2 staff section notes Heap Exercises solutions, corresponding to parts 3, 4, and 5 respectively.2

2This exercise was added independently to the problem set and to the section notes. It was added to the section notes by a TF who'd seen it at MIT, at https://courses.csail.mit.edu/6.046/fall01/handouts/ps2sol.pdf.
5. (5 points) Give a family of set cover problems where the set to be covered has n elements, the minimum set cover is size k = 3, and the greedy algorithm returns a cover of size Ω(log n). That is, you should give a description of a set cover problem that works for a set of values of n that grows to infinity – you might begin, for example, by saying, “Consider the set X = {1, 2, 3, …, $2^b$} for any b ≥ 10, and consider subsets of X of the form…”, and finish by saying “We have shown that for the example above, the set cover returned by the greedy algorithm is of size b = Ω(log n).” (Your actual wording may differ substantially, of course, but this is the sort of thing we're looking for.) Explain briefly how to generalize your construction for other (constant) values of k. (You need not give a complete proof of your generalization, but explain the types of changes needed from the case of k = 3.)
Consider the set X = {1, 2, …, $2^b$} for any b ≥ 10, and consider the following subsets of X. We have three subsets of X, each containing roughly a third of X, constructed as follows: the subsets contain the values whose residue mod 3 is 0, 1, and 2 respectively. For example, the first few elements of each would be {3, 6, 9, 12, …}, {1, 4, 7, 10, …}, and {2, 5, 8, 11, …}.
Then we also define the following Ω(log n) subsets: for every 0 ≤ i < b, the subset $\{2^i + 1, \ldots, 2^{i+1}\}$, along with the subset {1}. For example, the first few subsets defined this way would be {1}, {2}, {3, 4}, {5, …, 8}, {9, …, 16}. I claim that this set X along with the specified subsets gives a set cover problem where the optimal set cover has size k = 3, but the greedy algorithm returns a cover of size Ω(log n).
Let us prove this by running the greedy algorithm on a generic instance of this construction. The largest subset is $\{2^{b-1} + 1, \ldots, 2^b\}$, which has size $2^b - 2^{b-1} = 2^b/2$, in other words half the set, so the greedy algorithm picks it first. This pick eliminates about half of the uncovered elements of each of the "thirds", so the next largest subset is the dyadic subset covering a quarter of the entire set. We continue in this way: the uncovered portions of the thirds keep shrinking so that they are never chosen, while the number of elements not covered by our set cover keeps halving. Since we continually halve, we select Ω(log n) sets in total. However, we could have selected the three "third" sets to get a set cover of size k = 3. We have shown that for the example above, the set cover returned by the greedy algorithm is of size Ω(log n).
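As a sanity check, here is a small Python demo of the construction and of the standard greedy rule (pick the set covering the most uncovered elements); the helper names are ours:

```python
# Builds the instance for a given b and runs greedy set cover on it.
def build_instance(b):
    X = set(range(1, 2 ** b + 1))
    thirds = [{x for x in X if x % 3 == r} for r in range(3)]
    dyadic = [{1}] + [set(range(2 ** i + 1, 2 ** (i + 1) + 1))
                      for i in range(b)]
    return X, thirds + dyadic

def greedy_cover(X, subsets):
    uncovered, cover = set(X), []
    while uncovered:
        best = max(subsets, key=lambda s: len(s & uncovered))
        cover.append(best)
        uncovered -= best
    return cover

X, subsets = build_instance(10)
print(len(greedy_cover(X, subsets)))   # prints 11 (= b + 1), vs. optimal 3
```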
If we wanted to extend this to other values of k, we could construct the initial subsets by choosing every kth element of X rather than every third element of X.
6. “Hey Upper East Siders, Gossip Girl here.” You are secretly Gossip Girl, an anonymous gossip blogger who keeps track of friendships at Constance Billard High School. You publish an up-to-date map of the friendships at Constance on your website.3 You maintain this map by a stream of distinct tips from anonymous followers of the form “A is now friends with B.”

3There are no ethical concerns here, because you're a character in a highly-rated teen drama.
(a) (5 points) You call some groups of people a “squad”: each person is in the same squad as all their friends, and every member of a squad has some chain of friendships to every other member. For example, if Dan is friends with Serena, Serena is friends with Blair, and Alice is friends with Donald, then Dan, Serena, and Blair are a squad (You make up the name “The Gossip Girl Fan Club”) and Alice and Donald are another squad (“The Constance Constants”). Give an algorithm that takes in a stream of (a) tips and (b) requests for a specified person’s squad name. You should answer requests that come in between tips consistently—if you make up the name “The Billard billiards players” for Dan’s squad, and you’re asked for Serena’s squad’s name before any new tips come in, you should report that it’s “The Billard billiards players”.
Hints: Which algorithm from lecture can we use to help solve this problem? How can we represent the squads as a graph problem?
In a union-find data structure: for each person, run MAKESET. For each tip that comes in, run UNION. For each request for a specified person’s squad name, run FIND (which returns the name of a root vertex in a tree, that is, a name of one of the students—we choose students’ names as squad names).
Correctness: Two people are in the same squad if and only if there’s a chain of friendships between them, that is, if and only if UNIONs put them in the same component. FIND(u) = FIND(v) if and only if u and v are in the same component, so we return the same name exactly for people in the same squad.
Runtime: Union-find for n vertices, m UNIONs, and l FINDs has run time O((m + n + l) log∗ n).
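A minimal sketch of this data structure in Python (the class and method names are ours), assuming union by rank with path compression:

```python
# Sketch: the root student's own name doubles as the squad name.
class Squads:
    def __init__(self, people):
        self.parent = {p: p for p in people}   # MAKESET for everyone
        self.rank = {p: 0 for p in people}

    def find(self, p):                         # FIND with path compression
        if self.parent[p] != p:
            self.parent[p] = self.find(self.parent[p])
        return self.parent[p]

    def tip(self, a, b):                       # UNION by rank
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1

    def squad_name(self, p):
        return "Squad of " + self.find(p)
```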
(b) (10 points) A “circular squad” is defined to be a squad such that there is some pair of friends within the group that have both a friendship and a chain of friendships of length more than 1. In the example above, if Dan and Blair also became friends, then the group would be a circular squad. If Dan and Donald also became friends, they would all be in one circular squad. Modify your algorithm from the previous part so that you report names that contain the word “circle” for all circular squads (and not for any other squads).
Modify the algorithm from the previous part so that whenever a tip comes in for two people U and V already in the same component, we prepend “circle” to the name of the root vertex, and whenever we link a root y containing “circle” into a root x, we prepend “circle” to the name of x. (Also, preprocess everyone’s names to remove “circle” from them, in case of https://xkcd.com/327/ .)
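A sketch of this tweak, extending the Squads class from the part (a) sketch above; the per-root boolean flag is our representation of the "circle" prefix:

```python
# Sketch: track, per root, whether the component contains a cycle.
class CircularSquads(Squads):
    def __init__(self, people):
        super().__init__(people)
        self.circular = {p: False for p in people}

    def tip(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            self.circular[ra] = True             # duplicate chain => cycle
            return
        had_cycle = self.circular[ra] or self.circular[rb]
        super().tip(a, b)
        self.circular[self.find(a)] = had_cycle  # propagate to the new root

    def squad_name(self, p):
        root = self.find(p)
        prefix = "circle " if self.circular[root] else ""
        return prefix + "Squad of " + root
```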
Correctness: We claim that circular squads are exactly connected components of the friendship graph containing a cycle: if u and v are consecutive vertices of a cycle, then
people U and V are both friends and have a chain of friendships (around the cycle the other way), and vice versa. Therefore, we only need to track connected components containing a cycle, which is exactly what our algorithm does.
Runtime: O((m + n + l) log∗ n) because we are still using union-find, just adding an additional check for each union.
7. Consider the following scheduling problem: we have two machines, and a set of jobs j1, j2, j3, . . . , jn that we have to process. To process a job, we need to assign it to one of the two machines; each machine can only process one job at a time. Each job ji has an associated running time ri. The load on the machine is the sum of the running times of the jobs assigned to it. The goal is to minimize the completion time, which is the maximum of the load of the two machines.
Suppose we adopt a greedy algorithm: for every i, job ji is assigned to the machine with the minimum load after the first i − 1 jobs. (Ties can be broken arbitrarily.)
(a) (5 points) For all n > 3, give an instance of this problem for which the completion time of the assignment of the greedy algorithm is a factor of 3/2 away from the best possible assignment of jobs.
Solution: Let the n jobs have running times $r_0 = n - 2$, $r_1 = r_2 = \cdots = r_{n-2} = 1$, and $r_{n-1} = 2n - 4$. Taking these jobs in order, the greedy algorithm will assign $j_0$ to one machine (call it $M_0$), the next n − 2 jobs to the other machine (call it $M_1$), putting a load of n − 2 on both machines, and the last job to one of the machines, giving it a load of 3n − 6. However, if the first n − 1 jobs were put on one machine and the remaining job on the other, both machines would have a load of 2n − 4, so the greedy assignment is a factor of $\frac{3n-6}{2n-4} = 3/2$ away from optimal.
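A quick Python check of this instance (a sketch; ties are broken toward machine 0, matching the trace above):

```python
# Greedy: each job goes to the machine with the smaller current load.
def greedy_completion(times):
    loads = [0, 0]
    for r in times:
        loads[loads.index(min(loads))] += r
    return max(loads)

n = 10
jobs = [n - 2] + [1] * (n - 2) + [2 * n - 4]
print(greedy_completion(jobs))   # greedy: 3n - 6 = 24
print(2 * n - 4)                 # optimal completion time: 2n - 4 = 16
```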
(b) (15 points) Prove that the greedy algorithm always yields a completion time within a factor of 3/2 of the best possible placement of jobs. (Hint: Think of the best possible placement of jobs. Even for the best placement, the completion time is at least as big as the biggest job, and at least as big as half the sum of the jobs. You may want to use both of these facts.)
This follows as a special case of part (c) with m = 2, but we can also prove it explicitly here. Let our jobs be $j_1, \ldots, j_n$ with running times $r_1, \ldots, r_n$. Let our optimal completion time be $r_{opt}$ and our greedy completion time be $r_{greedy}$. Note that whatever our ultimate arrangement is, the total work shared among both machines equals $\sum_{i=1}^{n} r_i$, as all jobs need to be run. As mentioned in the hint, we have $r_{opt} \ge \frac{1}{2}\sum_{i=1}^{n} r_i$, since the total amount of work done by the two machines is at most twice the amount of time the longer-running one takes. Also, $r_{opt} \ge r_{max}$, where $r_{max}$ is the running time of the longest job, because the longest job needs to run.
Let M be the machine that takes the longest to finish under the greedy algorithm's schedule; the running time of M is thus the completion time $r_{greedy}$ of the greedy schedule. Let $j_k \in \{j_1, \ldots, j_n\}$ be the last job to finish on that machine (with running time $r_k$). Consider the starting time $s_k$ of $j_k$: it is at most $\frac{1}{2}\sum_{i=1}^{k-1} r_i$, since this is the time at which $j_k$ would start if both machines took the same time to finish the first k − 1 jobs; if the first k − 1 jobs did not finish at the same time on both machines, one machine would finish even earlier, and our greedy algorithm would schedule $j_k$ on that machine. Hence, we have
$$r_{greedy} = s_k + r_k \le \frac{1}{2}\sum_{i=1}^{k-1} r_i + r_k = \frac{1}{2}\sum_{i=1}^{k} r_i + \frac{1}{2} r_k \le \frac{1}{2}\sum_{i=1}^{n} r_i + \frac{1}{2} r_{max} \le r_{opt} + \frac{1}{2} r_{opt} = \frac{3}{2} r_{opt},$$
using the facts that $\sum_{i=1}^{k} r_i \le \sum_{i=1}^{n} r_i$ and $r_k \le r_{max}$, and finally our previous inequalities involving $r_{opt}$. Since our choice of jobs and running times was arbitrary, this implies that our greedy algorithm always yields a completion time within a factor of 3/2 of the optimal completion time, as desired.
(c) (10 points) Suppose now instead of 2 machines we have m machines and the completion time is defined as the maximum load over all the m machines. Prove the best upper bound you can on the ratio of the performance of the greedy solution to the optimal solution, as a function of m?
We claim that the completion time of a greedy schedule is within a factor of $2 - \frac{1}{m}$ of the completion time of an optimal schedule, for any set of jobs. The proof presented here is simply a generalization of the one in part (b). Let our jobs again be $j_1, \ldots, j_n$ with running times $r_1, \ldots, r_n$, let our optimal completion time be $r_{opt}$, and let our greedy completion time be $r_{greedy}$ (which is the running time of the machine that takes the longest to run). As above, since the machines together run for a total time of $\sum_{i=1}^{n} r_i$, we have $r_{opt} \ge \frac{1}{m}\sum_{i=1}^{n} r_i$, since the total runtime over all m machines is at most m times the time any one machine runs. Also, $r_{opt} \ge r_{max}$, where $r_{max}$ is the maximum running time among all jobs, since the longest job needs to run. Consider the machine M that finishes last under the greedy schedule, so that the running time of M equals $r_{greedy}$, and consider the last job $j_k \in \{j_1, \ldots, j_n\}$ to run on M; let the start time of $j_k$ be $s_k$. Again, $s_k \le \frac{1}{m}\sum_{i=1}^{k-1} r_i$, with equality when the first k − 1 jobs take the same total time on all m machines. Thus
$$r_{greedy} = s_k + r_k \le \frac{1}{m}\sum_{i=1}^{k-1} r_i + r_k = \frac{1}{m}\sum_{i=1}^{k} r_i + \left(1 - \frac{1}{m}\right) r_k \le \frac{1}{m}\sum_{i=1}^{n} r_i + \left(1 - \frac{1}{m}\right) r_{max} \le r_{opt} + \left(1 - \frac{1}{m}\right) r_{opt} = \left(2 - \frac{1}{m}\right) r_{opt},$$
using the same facts as in part (b). Hence the greedy schedule's completion time is within a factor of $2 - \frac{1}{m}$ of optimal, as claimed.