CS 124 Homework 1: Spring 2023
Your name: Collaborators:
No. of late days used on previous psets:
No. of late days used after including this pset:
Homework is due Wednesday Feb 1 at 11:59pm ET. You are allowed up to twelve (college)/forty (extension school), but the number of late days you take on each assignment must be a nonnegative integer at most two (college)/four (extension school).
Try to make your answers as clear and concise as possible; style may count in your grades. Assign- ments must be submitted in pdf format on Gradescope. If you do assignments by hand, you will need to scan your papers to turn them in.
You may not write anything that you will submit within one hour of collaborating with other students, language models, the internet, etc., or using notes from such sources. That is, whatever you submit must first have been in your head alone, or notes produced by you alone, for an hour. Subject to that, you can collaborate with other students, e.g. in brainstorming and thinking through approaches to problem-solving.
For all homework problems where you are asked to give an algorithm, you must prove the correctness of your algorithm and establish the best upper bound that you can give for the running time. Generally better running times will get better credit; generally exponential-time algorithms (unless specifically asked for) will receive no or little credit. You should always write a clear informal description of your algorithm in English. You may also write pseudocode if you feel your informal explanation requires more precision and detail, but keep in mind pseudocode does NOT substitute for an explanation. Answers that consist solely of pseudocode will receive little or no credit. Again, try to make your answers clear and concise.
There is a (short) programming problem on this assignment; you DO NOT CODE with others on this problem like you will for the “major” programming assignments later in the course. (You may talk about the problem, as you can for other problems.)
1. Suppose you are given a four-sided (“tetrahedral”) die that might be biased in an unknown way. So the die rolls a number in the set {1, 2, 3, 4} with (unknown) probabilities p1, p2, p3, and p4, where pi is the probability of rolling i.
(a) (10 points) Explain how to use rolls of that die to generate unbiased coin flips. Using your scheme, determine the expected number of die rolls until one coin flip is generated, in terms of the (unknown) probabilities p1 . . . , p4. Your scheme should work as long as at least two of the pi are positive, but doesn’t need to be maximally efficient. Solution 1.a:
Non-multi-level strategy:
i Procedure: We can generate coin flips by examining pairs of rolls, executing two rolls at a time (so we need an even number of rolls to generate a flip). If the first roll has a higher value, we record a head; if the second roll has a higher value, we record a tail. Otherwise, we ignore the result.
ii Claimed bound: Let r be the expected number of rolls until one coin flip is generated. 2
r = 1 − P4i=1 p2i
iii Proof of claimed bound: For any i, j ∈ {1, 2, 3, 4} with i ̸= j, the pairs of rolls (i, j) and (j,i) both occur with probability pipj, as die rolls are independent. In our setup, one outcome is assigned heads and the other tails. Since this holds for all i, j, we have that among successful die rolls, there is an equal probability of generating heads and tails (which we can work out to be P1≤i
! Q(l)=Y Xp2kj
If P(l) is our probability that our strategy takes exactly l die rolls to produce a flip, thenP(l)=0forloddandP(l)=Q(l−2)−Q(l)forleven. Thenbydefinition, the expected number of die rolls until a coin flip is generated is
x = X P (l) · l = X (Q(l − 2) − Q(l)) · l = 2 X Q(l) l≥2, l even l≥2, l even l≥2, l even
Using the telescoping sum argument and simplification from class, we get
Solution Notes: Because we don’t know the probabilities of rolling each number {1,2,3,4}, it is not correct to simply match two die results to heads and two die re- sults to tails, e.g. the idea of assigning heads to rolling 1, or 2, and assigning tails to the other roll results. For instance, if p1 = p2 = 1/2 (such that the die doesn’t generate other numbers at all), the aforementioned example would never generate a tail.
One other idea would be to interpret the heads and tails generated above as biased heads and tails, i.e. 1 and 2 generate a biased head and the other results generate a biased tail. The die rolls would then simulate a biased coin with probability p1 + p2 of generating heads; from class, we know how to use the results of that biased coin flip to generate unbiased coin flips. But as the example above shows, this strategy doesn’t always work either, since in the example we’d only ever get biased heads and no tails.
(b) (10 points) Now suppose you want to generate unbiased rolls of a six-sided die given your potentially-biased tetrahedral die. Explain how to do this, and again determine the expected number of biased die rolls until one unbiased die roll is generated. Your scheme should work as long as at least three of the pi are positive, but doesn’t need to be maximally efficient.
Solution 1.b:
Coin Flip Solution: Use the strategy from part (a) to generate 3 fair coin flips. Assign HHH to 1, HHT to 2, HTH to 3, HTT to 4, THH to 5, and THT to 6; if you see TTH or TTT, throw them out and try again. Then we generate a roll with probability 3/4, so in expectation we need to repeat the generation of 3 flips 4/3 times, i.e. generate 4 flips, so the expected number of biased flips needed is 4 times the result from part (a).
Ordering Solution: We use a similar strategy as in part (a). This time, we consider 3
x = 2 P Q(l) = 2 Q 1 + P4 p2k l≥2, l even k≥1 i=1 i
triples of die rolls, (x, y, z). If x = y = z, we ignore the triple and roll again; otherwise, if x < y < z, we get a 1; if x < z < y, we get a 2; if y < x < z, we get a 3; if y < z < x, we get a 4; if z < x < y, we get a 5; and if z < y < x, we get a 6.
In the case that exactly two of (x,y,z) are equal, we consider the pair (1,2) if x is not repeated, (3, 4) if y is not repeated, and (5, 6) if z is not repeated. We use our strategy in part (a) to then generate an unbiased coin flip; we generate the first number in the pair if the flip gives Heads, and the second number otherwise. This works because any permutation of three rolls x, y, z is equally likely to occur by symmetry, in the case where they are all distinct as well as in the case where there are only 2 distinct numbers.
The probability of not generating an unbiased roll from a triple of biased rolls is β = p31 + p32 + p3 + p34, where all rolls in the triple are the same. The probability that our triple of biased rolls has two distinct numbers is α = 3(p21(1 − p1) + · · · + p24(1 − p4)). Then, letting X be the expected number of biased rolls needed until an unbiased roll is generated and Y be the expected number of biased rolls needed until a coin flip is generated from part (a), we have the equation X = (1−α−β)·3+α(3+Y )+β(X +3), where we used the expected value computation from part (a). Solving for X, we have
Similarly as in the previous part, we may make this more efficient by employing a Multi- Level strategy extension, considering 32, 33, . . . rolls as well (e.g., we consider a (1, 1, 1) as a Level 1 roll with value 1. Then, for example, the Level 0 rolls (1, 1, 1), (2, 2, 2), (3, 3, 3) would produce a 1, which is fair since any permutation of those 3 triples is equally likely by symmetry). We do not pursue expected value calculations here.
For both problems, clearly separate (i) the procedure you use to produce the unbiased die or coin, (ii) your claimed bound on the expected number of die rolls to generate the coin flip (or die roll) (iii) Your proof of the claimed bound.
2. On a platform of your choice, implement the three different methods for computing the Fi- bonacci numbers (recursive, iterative, and matrix) discussed in lecture. Use integer variables. (You do not need to submit your source code with your assignment.)
Note: Part (a) and (c) will depend on what language you choose to implement you algorithm in, so different students might get different answers.
(a) (15 points) How fast does each method appear to be? (This is deliberately open-ended; part of the problem is to decide what constitutes a reasonable answer.) Include precise timings if possible—you will need to figure out how to time processes on the system you are using, if you do not already know.
Solution 2.a: Student answers will likely have precise timings. Some other possible things to look for:
• Asymptotics: the recursive algorithm takes exponential time; the matrix and itera- tive methods both take time growing like n2.
• Specific runtimes for big input values. 4
• The recursive method egregiously slow: n > 50 should already take the computer a super long time to compute.
• Iterative and matrix methods similar empirical runtimes.
(This question is open-ended; students may talk about other things.)
(b) (4 points) What’s the first Fibonacci number that’s at least 231? (If you’re using C longs, this is where you hit integer overflow.)
Solution 2.b: The 47th Fibonacci number, i.e. 2,971,215,073.
(c) (15 points) Since you should reach “integer overflow” with the faster methods quite quickly, modify your programs so that they return the Fibonacci numbers modulo 216. (In other words, make all of your arithmetic modulo 216—this will avoid overflow! You must do this regardless of whether or not your system overflows.) For each method, what is the largest value of k such that you can compute the kth Fibonacci number (modulo 65536) in one minute of machine time? If that value of k would be too big to handle (e.g. if you’d get integer overflow on k itself) but you can still calculate Fk quickly, you may report the largest value kmax of k you can handle and the amount of time the calculation of Fkmax takes.
Solution 2.c: [TF’s empirical results from last year using Python. Other solutions may differ slightly from this:]
• Recursive method: k = 40
• Iterative method: k ≈ 2.5 ∗ 108 • Matrix method: k ≈ 109,000,000
Note: As expected and consistent with (a), the recursive method is worst and the matrix method is best by far, especially as the Fibonacci numbers get larger and larger.
3. (a) (10 points) Make all true statements of the form fi ∈ o(fj), fi ∈ O(fj), fi ∈ ω(fj), and fi ∈ Ω(fj) that hold for i ≤ j, where i,j ∈ {1,2,3,4} for the following functions. No proof is necessary. All logs are base 2 unless otherwise specified.
i. f1 =(logn)n. ii. f2 = n1.5
iii. f3 = n(log n)5
iv. f4 =nlogn.
Solution 3.a: Ordering from smallest to largest, we have
f3 = n(logn)5,f2 = n1.5,f4 = nlogn,f1 = (logn)n,
where each function is o of all functions to its right, and O of itself and all functions to its right. Note that this is because
• limn→∞ f3 f2
• limn→∞ f2 f4
• limx→∞ f4 f1
→ 0 (using L’Hopital’s rule)
→ 0 (comparing the exponents on f2 and f4)
→ 0 (by rewriting f1 = (logn)n = 2n·loglogn and f4 = nlogn = 2logn·logn
and noting limn→∞ (log n)2 nloglogn
n → ∞ and thus n log log n f4 < f1)
= 0 by L’Hopital’s, so n grows faster than (log n)2 as certainly must grow faster than (log n)2, and hence
Programming Help
The above thus gives the relations: • For i = 1:
– f1 ∈ O(f1), f1 ∈ Ω(f1) (trivial) – f1 ∈ ω(f2), f1 ∈ Ω(f2)
– f1 ∈ ω(f3), f1 ∈ Ω(f3)
– f1 ∈ ω(f4), f1 ∈ Ω(f4)
• For i = 2:
– f2 ∈ O(f2), f2 ∈ Ω(f2) (trivial) – f2 ∈ ω(f3), f2 ∈ Ω(f3)
– f2 ∈ o(f4), f2 ∈ O(f4)
• For i = 3:
– f3 ∈ O(f3), f3 ∈ Ω(f3) (trivial) – f3 ∈ o(f4), f3 ∈ O(f4)
• For i = 4:
– f4 ∈ O(f4), f4 ∈ Ω(f4) (trivial)
(b) (5 points) Give an example of a function f5 : N → R+ for which none of the four statements fi ∈ o(f5), fi ∈ O(f5), fi ∈ ω(f5), and fi ∈ Ω(f5) is true for any i ∈ {1, 2, 3, 4}.
Solution 3.b: f5(n) = nn(1 + (−1)n) + 1 is one possibility. When n is even, that’s 2nn + 1, which grows faster than any of fi. When n is odd, that function is 1, which grows slower than all the given functions.
4. In each of the problems below, all functions map positive integers to positive integers.
(a) (5 points) Find (with proof) a function f1 such that f1(n2) ∈ O(f1(n)).
Solution 4.a: Let f1 : N → N where f1(x) = 1, i.e. f1(x) is just a constant function. Then, we wish to show that f1(n2) ∈ O(f1(n)), which by definition means we must show there exists c,N such that f(n) ≤ c·g(n) for all n ≥ N. Then, if we take c = 2 and N = 1, then we have that
1=f1(n2) ≤2=2·1=c·f1(n) which is certainly true and hence f1 satisfies f1(n2) ∈ O(f1(n)).
(b) (5 points) Find (with proof) a function f2 such that f2(n2) ̸∈ O(f2(n)).
Solution 4.b: Let f2 : N → N where f2(x) = x. Suppose for sake of contradiction f2(n2) ∈ O(f2(n)), which means there exist positive constants c,N such that ∀n ≥ N,
f2(n2)=n2 ≤c·n=cf(n). However, consider any n > c, say n = ⌈c⌉ + 1; it is clear that
f(n2)=n2 >cn=cf(n). Thus, we have a contradiction, implying f2(n) ̸∈ O(f2(n)).
程序代写 CS代考 加QQ: 749389476
(c) (10 points) Prove that there does not exist any function f such that f(n2) ∈ o(f(n)). Solution 4.c: Assume for sake of contradiction that there does exist some function f such that f(n2) ∈ o(f(n)). Then, by definition, this means that for all c > 0, there exists N such that for all n > N, |f(n)| ≤ c|g(n)|, or simply
f(n) ≤ c · g(n)
here since we know from the problem statement that all outputs must be positive integers.
In particular, the above holds for c = .9. For this setting, the above statement tells us that there exists n0 such that for every n ≥ n0, we have
f(n2) ≤ .9f(n) < f(n).
Let n1 = n0 and ni = n2i−1. Note that every ni ≥ n0 and so applying the above inequality
forn=ni wegetf(ni+1)
which contradicts the fact that the longest chain of strictly decreasing positive integers starting with the integer f(n1) has length f(n1).
(Concretely, f(n2) is at least 1 less than f(n1), and f(n3) is at least 1 less than f(n2), which tells us that f(n1) must be at least 2 greater than f(n3), and in general
f(nk+1) ≤ f(n1) − k ∀ k.
For k = f(n1) this leads to the contradictory assertion 1 ≤ f(nk+1) ≤ f(n1)−f(n1) = 0.)
5. WindowSort iterates the following process three times on input an Array A[1..n], where n is a nonnegative integer: (a) Recursively sort the left half, i.e., Compute WindowSort(A[1, .., n/2]). (b) Recursively sort the middle half, i.e., Compute WindowSort(A[n/4 + 1, .., 3n/4]). (c) Re- cursively sort the right half, i.e., Compute WindowSort(A[n/2 + 1, .., n]). (So WindowSort iterates through (a)-(c) three times!)
(a) (5 points) We didn’t specify what WindowSort does if the number of items to be sorted
is not divisible by 4. Specify what WindowSort does in those cases in such a way that
WindowSort terminates and correctly sorts.
Solution 5.a: Suppose the number of items to be sorted is n. If n = 0 or n = 1, we
do nothing and return the list. If n = 2, we look to see if the list is already sorted. If
so, we return the list. Otherwise, we swap the two elements and then return the list. If
n = 3, we sort the first two elements and then the last two elements. For larger n, if n
is not divisible by 4, then WindowSort should round the number such that in step (b)
it recursively sorts A[⌊n⌋+1,…,⌈3n⌉]. We know for n ≥ 2 that ⌈n⌉ < n, so the process 442
recursively sorts sub-lists of smaller size and doesn’t get stuck in a loop.
程序代写 CS代考 加微信: cstutorcs
(b) (15 points) Prove rigorously that WindowSort correctly sorts. You may assume all numbers to be sorted are distinct.
Solution 5.b:
We prove by (strong) induction on n that WindowSort correctly sorts all arrays of size k. As base cases, if the array size is 0, 1, 2, or 3, the algorithm is equivalent to a brute-force sort, which is correct.
We now proceed by strong induction; suppose n > 3 and we know that WindowSort
correctly sorts all arrays of size m < n. We will use this to show as an inductive step
that Windowsort also correctly sorts an arbitrary input array A of size n.
Since n > 2, n > n ≥ 0, so we can assume by the inductive hypothesis that Window- 2
Sort correctly sorts half of A. Since WindowSort simply calls this procedure nine times, it terminates.
Let S denote the set of the largest n/4 of elements of A. Initially, some of S may be located in the first half of A. In the first iteration of (a), the first half of the array is sorted, so the elements of S are now between indices n/4 + 1 and n/2. Next in step (b), elements of S are sorted to be between n/2 + 1 and 3n/4. In step (c), the last half of A is sorted so the elements of S are now sorted and occupy A[3n/4 + 1, n]. Thus after the first iteration of steps (a)-(b)-(c) we have that the top n/4 elements of A now occupy the top 1/4 of the indices of A. In all future iterations of (a) and (b) these elements are not even examined and future iterations of (c) will retain these elements in place since they are the top n/4 elements.
Next, consider the next-largest n/4 elements of A (after S); call that set B. In the second iteration of (a), elements of B are sorted and end up between indices n/4 + 1 and n/2. Next in step (b), elements of B are sorted to be between n/2 + 1 and 3n/4. Step (c) does not change the ordering because A[3n/4 + 1, n] were sorted in the previous iteration. Now we have that the last two quarters of the array are sorted. We also know that the next iteration of (a)-(b)-(c) will keep elements and A and B in place.
Finally in the third iteration of (a), the first half of the array is sorted. Since the last half of the array was already sorted, steps (b) and (c) make no difference.
(c) (5 points) Give a recurrence describing WindowSort’s running time, and, using that recurrence, give the asymptotic running time of WindowSort.
Solution 5.c: On an input of length n, WindowSort recursively calls itself inputs of length ⌈n2⌉ nine times, and does a constant1 amount of work to calculate n4 and make the recursive calls. This gives us a recurrence relation
T(n) = 9T lnm + Θ(1) 2
for the runtime T(n). We see that this recurrence relation is in the form T(n) = aT(n/b)+cnk wherea=9,b=2,k=0.2 Sincea=9>(2)0 =bk,bytheMas-
ter Theorem we can conclude that WindowSort’s running time is
1An implementation of WindowSort that allocates new arrays for the recursive calls would instead do Θ(n) work, but the runtime would end up asymptotically the same either way.
2The ceilings don’t make a difference for the Master Theorem. 8
Θ nlog2 9 .
(d) (Optional, 0 points)1 Suggest an improvement to WindowSort that improves the runtime. The resulting algorithm should still involve O(1) recursive calls on arrays of half the input length, and O(1) other work.
Solution 5.d: We note that we do not have to go through all three steps (a)-(b)-(c) in each of the iterations. Instead we can simply do steps (a)-(b)-(c)-(a)-(b)-(a). This leads to only six recursive calls on arrays of length ⌈n/2⌉ , and thus the recurrence
T(n) = 6T lnm + Θ(1), 2
which by the Master Theorem has running time Θ(nlog2 6).
6. (0 points, optional)3 InsertionSort is a simple sorting algorithm that works as follows on
input A[0], …, A[n−1].
Algorithm 1 InsertionSort Input: A
for i = 1 to n − 1 do j=i
while j > 0 and A[j − 1] > A[j] do swap A[j] and A[j − 1]
end while end for
Show that for every function T (n) ∈ Ω(n) ∩ O(n2) there is an infinite sequence of inputs {Ak}∞k=1 such that Ak is an array of length k, and if t(n) is the running time of InsertionSort on An, then t(n) ∈ Θ(T (n)).
Solution 6: InsertionSort works by growing a list of sorted elements from the start of the array. At each step in the for loop, a new element is considered; while this element is smaller than the element to the left, it moves left. For an array of length n, the best-case running time of InsertionSort is Ω(n), since it just checks all elements of the array. The worst-case running time of InsertionSort is O(n2), since for each of the n elements it must compare and swap the element with all elements to the left of it, giving Pni=1 i = n(n + 1)/2 = O(n2) operations.
We want to show that given any T (n), we can construct arrays A1, A2, . . . with lengths 1, 2, etc. such that the running time of InsertionSort on An, t(n), is Θ(T(n)). A1 is trivial, as it is a single-element sorted list; any T (1) ∈ Ω(1) ∩ O(1) must be constant, which matches with the constant running time of InsertionSort. Now we prove that given n ≥ 2 and any values v between n and n(n + 1)/2, we can construct an array of length n with values 1, . . . , n such that it takes v operations for InsertionSort to sort the array (so v = Θ(t(n))).
Fix n. Start with a sorted array {1, . . . , n}; this represents the best-case running time of InsertionSort, n operations. To turn this into an array which requires n + 1 operations, swap 1 and 2 to produce the array {2,1,3,…,n}; to turn this into an array which requires n+2
3This question will not be used for grades, but try it if you’re interested. It may be used for recommendations or TF hiring.
operations, swap 1 and 3 to produce the array {2, 3, 1, . . . , n}, and so on. This process can be continued until we get an array {2, 3, . . . , n, 1}, which requires n+(n−1) = 2n−1 operations. This process generates arrays on which the number of operations taken by InsertionSort vary from n and 2n − 1, i.e. t(n) goes from Θ(n) to Θ(2n − 1).
We can continue this process, repeating the same steps on the subarray containing the first n− 1 elements, “pushing” the number 2 to create the arrays {3,2,4,…,n,1}, {3,4,2,…,n,1}, …, {3,4,…,n,2,1}. These steps generate a series of arrays on which InsertionSort needs between n and n+(n−1)+(n−2) = 3n−3 operations (t(n) goes from Θ(2n) to Θ(3n−3)). We can then do the same with the subarray with the first n−2 elements, pushing 3 iteratively, producing arrays on which InsertionSort needs between 3n − 2 and 4n − 6 operations (so t(n) goes from Θ(3n − 2) to Θ(4n − 6)), and so on.
This process ends when the subarray simply becomes the first two elements of the array; the last two arrays produced are {n−1,n,n−2,…,2,1} and {n,n−1,…,2,1}, the completely reversed (and worst-case) array. InsertionSort takes n+2+3+···+n−1 = n(n+1)/2−1 operations on the former, and n+1+···+n−1 = n(n+1)/2 operations on the latter. t(n) on these last two arrays are thus Θ(n(n + 1)/2 − 1) and Θ(n(n + 1)/2). So we can construct arrays with length n with runtimes ranging from Θ(n) = Ω(n) to Θ(n(n + 1)/2) = O(n2), finishing our proof. Since our choice of n was arbitrary, this also holds for all n.
For any function T(n) ∈ Ω(n) ∩ O(n2), we are guaranteed that for any specific n, T(n) must be Θ(n) and O(n2), i.e. T(n) is proportional to some v as described above. But from the above, we have seen that we can always construct an array such that t(n) = Θ(v), so T(n) = Θ(t(n)) ⇐⇒ t(n) = Θ(T(n)). Since our choice of n was arbitrary, this holds for all n, meaning we can indeed generate a sequence of arrays {Ak}∞k=1 in this manner.