The current implementation for polynomials is a vector where the term of degree k is stored at position k + 1. This is good for dense polynomials (those with mostly nonzero coefficients), but our factoring algorithms rely heavily on manipulating sparse polynomials such as x^n − 1.
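For illustration, a minimal sketch of the dense layout (the concrete polynomial here is hypothetical):

```julia
# Dense representation sketch: 3 + 2x^2 is stored as a coefficient vector,
# with the coefficient of the degree-k term at position k + 1.
coeffs = [3, 0, 2]   # degrees 0, 1, 2
coeffs[2 + 1]        # == 2, the coefficient of x^2
```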
Therefore:
• Create a new type called PolynomialSparse that stores only nonzero terms using data structures as described below. PolynomialSparse should have identical functionality to Polynomial in terms of the functions and methods the latter provides. That is, anything you can do with a Polynomial you should be able to do with PolynomialSparse. (There is no need to support operations that involve both types.)
• Create tests for PolynomialSparse and associated functions and ensure your implementation passes them.
• Rename the supplied Polynomial type to be called PolynomialDense and rename all the methods that use it. Make sure the original supplied tests and scripts work for the renamed type PolynomialDense.
• Modify your solution to Task 1 so that example_script.jl is for PolynomialDense and example_script_2.jl is for PolynomialSparse.
• Provide a short discussion contrasting the pros and cons of the dense and sparse representations. In particular, determine which operations are efficient (both in terms of memory and time) with the original PolynomialDense type and which operations are efficient with your new PolynomialSparse.
Note: You may wish to create an abstract base type, say Polynomial, for PolynomialSparse and PolynomialDense. However, this is not required.
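If you do take that route, a minimal sketch of the idea:

```julia
# Optional design sketch: a shared abstract supertype so generic methods
# can be written once for both concrete representations.
abstract type Polynomial end

# struct PolynomialDense  <: Polynomial ... end   # vector-backed
# struct PolynomialSparse <: Polynomial ... end   # linked-list + dict backed
```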
Data structures for PolynomialSparse
Your PolynomialSparse should have a non-trivial data structure which involves both a dictionary and a linked list. The dictionary can be the in-built Dict() of Julia. The linked list can be via DataStructures.jl.
• Linked lists are like arrays, yet they allow more efficient insertion and deletion of terms. Their drawback is that lookup is not O(1) as for an array, but rather requires traversing the list.
• To make lookup quicker, maintain a dictionary whose keys are exponents and whose values point at the entries of the linked list. The idea here is that every term of the polynomial is represented both in the linked list and in the dictionary. Searching, for example, for a term of power 32 means checking whether 32 appears as a key in the dictionary, then using the dictionary value to locate the term in the linked list. Similarly, iterating over all terms means traversing the linked list, etc.
• With every operation, a PolynomialSparse object needs to remain consistent in that each term of the polynomial is represented both in the linked list and in the dictionary. Further, there must be no superfluous entries in either the linked list or the dictionary.
• Note that as you develop this task you may do it in two phases. First implement the linked list, run all the tests, and check that everything works. Then add the dictionary and again check that everything works.
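To make the intended layout concrete, here is a minimal sketch. The Term type, the field names, and the get_term helper are illustrative assumptions, and ListNode is internal (unexported) API of DataStructures.jl:

```julia
using DataStructures

# Illustrative term type: a single coefficient-degree pair.
struct Term
    coeff::Int
    degree::Int
end

struct PolynomialSparse
    terms::MutableLinkedList{Term}                  # nonzero terms, ordered by degree
    dict::Dict{Int,DataStructures.ListNode{Term}}   # degree -> node in the list
end

# O(1) average-case lookup of the term of a given degree via the dictionary;
# returns nothing if the polynomial has no such term.
function get_term(p::PolynomialSparse, degree::Int)
    haskey(p.dict, degree) ? p.dict[degree].data : nothing
end
```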
From now on we focus on the sparse representation.
Exact computation requires polynomials with (truly) integer coefficients, and thereby we should use BigInt rather than (the finitely many integers comprising) Int. Nevertheless, in this project we’ll simply replace Int by Int128.
• Refactor PolynomialSparse to have Int128 coefficients instead of Int. Do this by introducing a new type PolynomialSparse128, which is shorthand for “Polynomial Sparse 128-bit Integer”.
• Duplicate the functionality (functions and methods) of PolynomialSparse for PolynomialSparse128. Note that PolynomialSparse128 should have a sparse implementation just like PolynomialSparse.
• Create tests for PolynomialSparse128 and associated functions and ensure your implementation passes them.
• Find examples where PolynomialSparse (the one with Int coefficients) overflows and yields a wrong result whereas your new PolynomialSparse128 does not (see the overflow sketch after this list). The test code for PolynomialSparse128 should include such cases.
• Empirically compare the run-time performance of polynomial multiplication for PolynomialSparse and PolynomialSparse128. Do this for cases where PolynomialSparse does not overflow (i.e. returns the correct result). In general, your results should show that PolynomialSparse is faster, yet may overflow.
Note: Ideally we would use parametric types in Julia but for simplicity we simply introduce the new type PolynomialSparse128. You may (if you wish) use parametric types for PolynomialSparse, PolynomialDense, and PolynomialSparse128 but it is not required.
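To see the kind of failure PolynomialSparse can exhibit, here is plain integer (not polynomial) arithmetic illustrating wrap-around overflow:

```julia
# Int (64-bit on most machines) silently wraps around on overflow,
# whereas Int128 still has room for the same product.
a = 2^62                            # 4611686018427387904, fits in Int64
a * 4                               # wraps around to 0 -- wrong
Int128(a) * 4                       # 18446744073709551616 -- correct
typemax(Int) + 1 == typemin(Int)    # true: wrap-around behavior
```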
Notice that the PolynomialDense, PolynomialSparse, and PolynomialSparse128 data-types have integer coefficients, whereas operations such as divide, gcd, and factor require the coefficient arithmetic to be done modulo p. Consequently, the aforementioned functions must do this reduction themselves, and worse yet, can only do so once a prime p is provided. It is for this reason that divide, gcd, and factor return a function that yields a polynomial once a p is provided. It would be much cleaner to have a polynomial datatype where the “mod p” behavior is “baked in” and automated.
• Create a type PolynomialModP which has a polynomial (of type PolynomialSparse) and a prime number (Int) as fields. PolynomialModP should support the same arithmetic operations as PolynomialSparse, but with coefficient arithmetic over ℤp rather than over the integers. PolynomialModP should be easier to work with when carrying out division, gcd, and factorization. So, for example, f1 ÷ f2, where f1 and f2 are of type PolynomialModP, directly yields a new PolynomialModP which is the result mod p. Similarly for the other operations.
• Create tests for PolynomialModP and associated functions and ensure your implementation passes them.
Your new datatype should rely on the existing functionality of PolynomialSparse but restrict the coefficients to 0, 1, …, p − 1.
Once you complete this task, certain operations on PolynomialSparse128, like p1 ÷ p2, are now better done via PolynomialModP. That is, while you may use f1 ÷ f2 for f1 and f2 of type PolynomialSparse128, it is better to use ÷ only when f1 and f2 are of type PolynomialModP, because you get back an instance of PolynomialModP rather than a function.
• Implement ^ for PolynomialModP and update the pow_mod function to use it.
• Update the show method for PolynomialModP to include an indication that it is “mod p”.
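A minimal sketch of this type and of one operation with the reduction baked in; it assumes PolynomialSparse supports + and a coefficient-wise mod(poly, prime) helper (both assumptions here, named for illustration):

```julia
struct PolynomialModP
    polynomial::PolynomialSparse   # the underlying polynomial
    prime::Int                     # the modulus p
end

# Addition with the "mod p" reduction baked in; both operands must share p.
function Base.:+(a::PolynomialModP, b::PolynomialModP)
    @assert a.prime == b.prime "operands must have the same modulus"
    PolynomialModP(mod(a.polynomial + b.polynomial, a.prime), a.prime)
end
```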
The current multiplication implementation is not efficient for PolynomialSparse128 because big-integer arithmetic may be expensive. In this task we remedy this problem by using the CRT to compute products of PolynomialSparse128 via cheaper products of PolynomialModP.
• Implement multiplication for PolynomialModP (if you haven’t done so yet in Task 4).
• Implement multiplication using the Chinese Remainder Theorem (CRT) for PolynomialSparse128 as presented in the lecture. Here the product of two PolynomialSparse128s will be computed by doing sufficiently many multiplications of their PolynomialModP counterparts.
• Create tests for PolynomialSparse128 CRT multiplication and ensure your implementation passes them.
• Benchmark PolynomialSparse128 CRT multiplication and contrast your results against the old multiplication (it is recommended to keep the old form of multiplication in another function or method). Find examples that demonstrate the CRT-based approach is faster.
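For reference, a sketch of the per-coefficient CRT reconstruction step; invmod is from Julia’s Base, and the use of exactly two moduli is illustrative (in practice you keep multiplying in further primes until their product bounds the largest possible coefficient):

```julia
# Given c1 = c mod p1 and c2 = c mod p2 with gcd(p1, p2) == 1, recover
# c mod (p1 * p2). CRT multiplication applies this coefficient-wise.
function crt(c1::Int128, p1::Int128, c2::Int128, p2::Int128)
    v = mod((c2 - c1) * invmod(p1, p2), p2)   # solve c1 + p1*v ≡ c2 (mod p2)
    return c1 + p1 * v
end

crt(Int128(3), Int128(5), Int128(4), Int128(7))   # == 18 (18 ≡ 3 mod 5, 18 ≡ 4 mod 7)
```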
At this point you should have efficient multiplication and power for PolynomialSparse128 as well as PolynomialModP and its associated functions and methods.
As discussed in class, if you have a polynomial f and raise it to the power 2^n, then instead of doing about 2^n multiplications you can do about n multiplications using repeated squaring.
• Refactor pow_mod and ^ into something more efficient by using repeated squaring. Your implementation should work for both PolynomialSparse128 and PolynomialModP.
• Create tests for pow_mod and ^ for both implementations and ensure your implementation passes them.
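A generic repeated-squaring sketch; it assumes one(p) yields the multiplicative identity (the constant polynomial 1) for the polynomial type in question:

```julia
# Compute p^n in O(log n) multiplications instead of n - 1.
function pow_by_squaring(p, n::Integer)
    n >= 0 || throw(ArgumentError("negative powers are not supported"))
    result = one(p)
    base = p
    while n > 0
        isodd(n) && (result *= base)   # fold in the current bit of n
        base *= base                   # square for the next bit
        n >>= 1
    end
    return result
end
```

For PolynomialModP this has the added benefit that coefficients are reduced mod p at every squaring, so they never grow large.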
• Adapt the supplied polynomial factorization code to incorporate the new methods and types you have built up to this point. The current supplied code works for limited cases but may get stuck on certain examples.
• Create tests for factorization and ensure your implementation passes them. It is a good idea to pass your factorization algorithm a product of polynomials so that the factorization is predictable (see the sketch after this list). Note that your factorization algorithm may fail on small primes due to bad luck; it is therefore best to choose large primes.
• Benchmark your factorization algorithm by factoring a database of problems (e.g. generated by you via rand). Do not use more than 120 seconds of local compute time.
• Using a compute budget of approximately 120 seconds, “stress test” your factorization implementation by factoring something substantial.
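As one possible shape for such a predictable test, a hypothetical sketch; rand_poly and the (factor, multiplicity) return format of factor are assumptions about your own code, not supplied functions:

```julia
# Hypothetical predictable-factorization test: multiply known random
# factors, then check that factoring the product recovers them.
p = 7919                               # a reasonably large prime
f1, f2 = rand_poly(p), rand_poly(p)    # hypothetical random polynomial generator
f = f1 * f2
factorization = factor(f)              # assumed to return (factor, multiplicity) pairs
reconstructed = prod(g^m for (g, m) in factorization)
@assert reconstructed == f
```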