Homework 4
Large Scale Computing on the Cloud
Homework 4: Spam Filtering Using Spark MLlib

Learning Goal: use Spark MLlib to implement spam filtering, following the example in the lecture 4 notes, pages 50-53.

You need to complete the following steps:

1. Collect 20 spam text samples and 20 non-spam text samples (one potential source is your own email), …
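
As a rough illustration of what the collected samples from step 1 might look like once prepared for a classifier, the sketch below labels each text (1.0 for spam, 0.0 for non-spam) and turns it into hashed term-frequency counts. This is plain Python, not Spark: MLlib provides its own `HashingTF` for this kind of feature extraction, and whether the lecture example uses it is an assumption here. The names `hashed_term_frequencies`, `build_labeled_dataset`, and `NUM_FEATURES`, the vector size of 1000, and the whitespace tokenization are all hypothetical choices made for illustration.

```python
import zlib
from collections import Counter

# Hypothetical feature-vector size; the assignment does not specify one.
NUM_FEATURES = 1000


def hashed_term_frequencies(text, num_features=NUM_FEATURES):
    """Count words per hash bucket (a plain-Python stand-in, not MLlib's HashingTF)."""
    counts = Counter()
    for word in text.lower().split():
        # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(word.encode("utf-8")) % num_features
        counts[index] += 1
    return dict(counts)


def build_labeled_dataset(spam_texts, non_spam_texts, expected_per_class=20):
    """Return (label, features) pairs: 1.0 for spam, 0.0 for non-spam."""
    if len(spam_texts) != expected_per_class or len(non_spam_texts) != expected_per_class:
        raise ValueError(
            "expected %d samples per class, got %d spam and %d non-spam"
            % (expected_per_class, len(spam_texts), len(non_spam_texts))
        )
    labeled = [(1.0, hashed_term_frequencies(t)) for t in spam_texts]
    labeled += [(0.0, hashed_term_frequencies(t)) for t in non_spam_texts]
    return labeled


if __name__ == "__main__":
    # Placeholder texts only; real samples would come from step 1.
    spam = ["placeholder spam message %d" % i for i in range(20)]
    non_spam = ["placeholder regular message %d" % i for i in range(20)]
    dataset = build_labeled_dataset(spam, non_spam)
    print(len(dataset), "labeled samples")
```

The count check mirrors the 20/20 requirement in step 1, so a missing or extra sample is caught before any training is attempted.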