MTHM501 Coursework 1
Data wrangling
Load in the file "indicator hiv estimated prevalence% 15-49.csv". This file contains the estimated HIV prevalence in people aged 15-49 in different countries over time. Prevalence is defined here as the estimated number of people living with HIV per 100 population.
1. Produce a tidy data set called gp_hiv using the tidyverse tools that we introduced in the week 3 practical. The dataset needs to run from 1991 onwards (there is too much missing data prior to that), and we want to end up with three variables (i.e. columns): Country, year and prevalence. [Note that a couple of the years have no values in the data set, and by default R reads these columns in as character columns. Hence when you gather() the data to create a prevalence column, all the numbers will be converted into characters. One way to deal with this is to convert the column back into numbers using as.numeric().]
2. Once you have this tidy dataset, run the following code and produce a table of the output:
  gp_hiv %>%
    group_by(Country) %>%
    summarise(MeanPrevalence = mean(prevalence))
Your report should consist of the code used to create 'gp_hiv' (with explanatory comments; aim for one comment per line of code) and the table of the output. You may use any word processor you choose, but save the report as a PDF for submission. As a guideline, your code is not expected to be longer than one side of A4 (and could be considerably shorter).
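The note in step 1 about character columns can be illustrated on a small made-up data set. The sketch below is only an illustration: the tibble `toy` and its values are hypothetical and are not taken from the coursework file, and the empty 1990 column is deliberately stored as character to mimic the behaviour the note describes.

```r
library(tidyr)
library(dplyr)
library(tibble)

# Hypothetical toy data: the 1990 column is empty and stored as character,
# mimicking the empty year columns described in the note (an assumption).
toy <- tibble(
  Country = c("A", "B"),
  `1990`  = c(NA_character_, NA_character_),
  `1991`  = c(0.1, 0.5),
  `1992`  = c(0.2, 0.6)
)

# Gathering a mix of character and numeric columns coerces every value
# in the new prevalence column to character.
long <- toy %>%
  gather(year, prevalence, -Country)
class(long$prevalence)

# Convert both new columns back to numbers, then drop missing values
# and keep only the years after 1990.
tidy <- long %>%
  mutate(year = as.numeric(year),
         prevalence = as.numeric(prevalence)) %>%
  filter(!is.na(prevalence), year > 1990)
tidy
```

In current versions of tidyr, gather() is superseded by pivot_longer(), which could be used here in the same way; gather() is kept in the sketch because it is the function named in the question.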
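library(tidyverse)  # read_csv(), gather() and the dplyr verbs
library(magrittr)   # pipe operator %>% (also made available by tidyverse)
library(knitr)      # kable() for formatting the output table

gp_hiv <- read_csv("Data/indicator hiv estimated prevalence% 15-49.csv") %>%  # read the wide-format data
  rename(Country = `Estimated HIV Prevalence% – (Ages 15-49)`) %>%  # give the first column a short name
  gather(year, prevalence, -Country) %>%  # reshape to one row per country and year
  mutate(year = as.numeric(year)) %>%  # year labels were column names, so convert to numbers
  mutate(prevalence = as.numeric(prevalence)) %>%  # undo the character coercion from gather()
  filter(!is.na(Country)) %>%  # drop rows without a country name
  filter(!is.na(prevalence)) %>%  # drop missing prevalence values
  filter(year > 1990)  # keep 1991 onwards

out <- gp_hiv %>%
  group_by(Country) %>%  # one group per country
  summarise(MeanPrevalence = round(mean(prevalence), 2))  # mean prevalence, rounded to 2 d.p.

kable(out)  # print the summary as a table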
Country                     MeanPrevalence
Afghanistan                           0.06
Algeria                               0.07
Angola                                1.75
Argentina                             0.36
Armenia                               0.10
Australia                             0.11
Austria                               0.16
Azerbaijan                            0.07
Bahamas                               3.27
Bangladesh                            0.06
Barbados                              0.56
Belarus                               0.15
Belgium                               0.19
Belize                                1.97
Benin                                 1.16
Bhutan                                0.11
Bolivia                               0.21
Botswana                             21.25
Brazil                                0.39
Bulgaria                              0.08
Burkina Faso                          2.31
Burundi                               4.40
Cambodia                              1.00
Cameroon                              4.59
Canada                                0.21
Cape Verde                            1.00
Central African Republic              7.48
Chad Chile
China Colombia Comoros Congo, Rep. Costa Rica Cote d'Ivoire Croatia Cuba
Czech Republic Denmark
Dominican Republic Ecuador
El Salvador Equatorial Guinea Eritrea
Guatemala Guinea Guinea-Bissau Guyana
Honduras Hungary
Kazakhstan Kenya
Kyrgyz Republic Lao
MeanPrevalence
0.06 0.64 0.07 3.97 0.18 5.40 0.06 0.08 0.06 0.12 2.31 0.84 0.45 0.06 0.59 2.21 0.94 0.64 1.57 0.07 0.08 0.34 4.35 0.75 0.09 0.10 1.76 0.11 0.48 1.59 1.70 1.63 2.54 1.21 0.08 0.20 0.32 0.10 0.13 0.18 0.14 0.31 1.94 0.06 0.09 7.90 0.11 0.11 0.37 0.12
19.41 2.28
Country                     MeanPrevalence
Lithuania                             0.07
Luxembourg                            0.23
Madagascar                            0.21
Malawi                               12.44
Malaysia                              0.37
Maldives                              0.06
Mali                                  1.39
Malta                                 0.09
Mauritania                            0.60
Mauritius                             0.46
Mexico                                0.30
Moldova                               0.32
Mongolia                              0.06
Morocco                               0.10
Mozambique                            8.00
Myanmar                               0.63
Namibia                              11.81
Nepal                                 0.41
Netherlands                           0.18
New Zealand                           0.09
Nicaragua                             0.13
Niger                                 0.75
Nigeria                               3.67
Norway                                0.10
Oman                                  0.06
Pakistan                              0.08
Panama                                1.08
Papua New Guinea                      0.46
Paraguay                              0.26
Peru                                  0.46
Philippines                           0.06
Poland                                0.10
Portugal                              0.43
Qatar                                 0.06
Romania                               0.10
Russia                                0.41
Rwanda                                3.83
Sao Tome and Principe                 1.00
Sierra Leone Singapore Slovak Republic Slovenia Somalia
South Africa South Korea South Sudan Spain
Sri Lanka Sudan Suriname
0.57 0.09 0.94 0.10 0.06 0.06 0.33
13.05 0.06 3.10 0.41 0.06 0.39 0.87
Switzerland Tajikistan
Trinidad and Tobago Tunisia
United Kingdom United States Uruguay Uzbekistan Venezuela Vietnam
Yemen Zambia Zimbabwe
MeanPrevalence
19.17 0.11 0.32 0.13 6.72 1.66 2.98 1.10 0.06 0.06 7.80 0.74 0.16 0.53 0.35 0.07 0.50 0.26 0.20 14.07 20.43