unirank issues
https://gitlab.erc.monash.edu.au/oscar.lane/unirank/-/issues
2021-09-03T12:05:56+10:00

https://gitlab.erc.monash.edu.au/oscar.lane/unirank/-/issues/7
Compute total scores for ARWU subject rankings
2021-09-03T12:05:56+10:00
Oscar Lane

Total scores will need to be implemented separately for each subject, as per the weighting tables here: http://www.shanghairanking.com/Shanghairanking-Subject-Rankings-2017/Methodology-for-ShanghaiRanking-Global-Ranking-of-Academic-Subjects-2017.html

https://gitlab.erc.monash.edu.au/oscar.lane/unirank/-/issues/10
Extend ur_scrape_arwu() to scrape 501-1000
2019-07-25T16:53:51+10:00
Hung Vo

Currently, the `ur_scrape_arwu()` function scrapes only the Top 500 ranked universities. The following code was run to verify this:
```{r}
> # retrieve 2018 rankings
> arwu_ranks_2018 <- ur_scrape_arwu(2018)
>
> # count rows
> nrow(arwu_ranks_2018)
[1] 500
```
Could the scraper be modified to optionally scrape the universities ranked 501-1000 as well?
This list is located on the second tab of http://www.shanghairanking.com/ARWU2018.html.

https://gitlab.erc.monash.edu.au/oscar.lane/unirank/-/issues/6
Automatically parse Times data
2018-11-12T11:18:44+11:00
Oscar Lane

At the moment, most of the Times data is held in character vectors. Could we use `readr::parse_guess()` on all columns to fix this?
```
> ur_data_times
# A tibble: 5,945 x 28
rank_order rank name scores_overall scores_overall_… scores_teaching scores_teaching… scores_internat…
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 1 Harv… 96.1 1 99.7 1 72.4
2 2 2 Cali… 96.0 2 97.7 4 54.6
3 3 3 Mass… 95.6 3 97.8 3 82.3
4 4 4 Stan… 94.3 4 98.3 2 29.5
5 5 5 Prin… 94.2 5 90.9 6 70.3
6 6 6 Univ… 91.2 7 88.2 9 77.2
7 6 6 Univ… 91.2 6 90.5 7 77.7
8 8 8 Univ… 91.1 8 84.2 11 39.6
9 9 9 Impe… 90.6 9 89.2 8 90.0
10 10 10 Yale… 89.5 10 92.1 5 59.2
# ... with 5,935 more rows, and 20 more variables: scores_international_outlook_rank <chr>,
# scores_industry_income <chr>, scores_industry_income_rank <chr>, scores_research <chr>,
# scores_research_rank <chr>, scores_citations <chr>, scores_citations_rank <chr>, record_type <chr>,
# member_level <chr>, url <chr>, nid <int>, location <chr>, aliases <chr>, subjects_offered <chr>,
# year <int>, apply_link <chr>, stats_number_students <chr>, stats_student_staff_ratio <chr>,
# stats_pc_intl_students <chr>, stats_female_male_ratio <chr>
```

https://gitlab.erc.monash.edu.au/oscar.lane/unirank/-/issues/5
Clean up institution names that differ between years
2018-10-08T14:44:15+11:00
Oscar Lane

I have noticed this issue for a few institutions. One example is the University of Melbourne, whose name changes from "University of Melbourne" to "The University of Melbourne" in 2014:
```
# A tibble: 14 x 12
Rank University `Total Score` Alumni Award HiCi `N&S` PUB PCP `Computed Score` `Computed Rank` Year
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
1 82 University of Melbourne 26 15.4 14.4 22.2 18.7 53.5 19.9 25.3 82 2005
2 78 University of Melbourne 26.7 14.8 14.1 23.1 18.1 54.8 25.2 26.0 78 2006
3 79 University of Melbourne 26.6 14.4 14.1 22.2 18.4 55.1 25 25.9 79 2007
4 73 University of Melbourne 27.7 13.7 14.1 23.1 19.6 58.1 26.7 27.0 73 2008
5 75 University of Melbourne 27.2 13.4 14.1 22.9 17.2 58.5 26.7 26.6 75 2009
6 62 University of Melbourne 29.3 19.9 14.1 22.8 18.7 63.1 27 28.4 62 2010
7 60 University of Melbourne 30 19.5 14.1 25 21.1 62.1 26.8 29.1 60 2011
8 57 University of Melbourne 30.1 18 13.7 24 22.7 63 27.1 29.2 57 2012
9 54 University of Melbourne 30.2 17.7 13.4 24 24.4 62.5 27.1 29.3 55 2013
10 44 The University of Melbourne 32.6 17.5 13.3 29.3 26.7 65.9 29.7 31.8 44 2014
11 44 The University of Melbourne 32.3 17 13.3 28.6 25.3 66.9 30.2 31.5 44 2015
12 40 The University of Melbourne 33.9 17 13.3 35.5 24.8 67.9 32.2 33.2 40 2016
13 39 The University of Melbourne 35.9 16.8 13.1 45 22.8 69.7 33.9 35.2 39 2017
14 38 The University of Melbourne 35.7 16.8 13.1 42.8 21.4 72.3 33.6 35.0 39 2018
```
When I get some time I will try and fix this up. If you notice any other discrepancies for other institutions, please report them here and I will try to fix them at the same time.
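
One possible starting point for the cleanup, as a rough sketch only: build a normalised key from the name and group on that, leaving the original `University` column untouched. The helper `normalise_university_name()` and the `name_key` column are hypothetical and are not part of unirank. The rule shown (drop a leading "The", collapse whitespace) is an assumption based on the Melbourne example above; other institutions may need an explicit lookup table instead.

```r
# Hypothetical helper, not part of unirank: builds a comparison key
# so that the same institution matches across years.
normalise_university_name <- function(x) {
  x <- trimws(gsub("[[:space:]]+", " ", x))  # collapse repeated whitespace
  sub("^The ", "", x)                        # drop a leading "The "
}

# Toy data mirroring the discrepancy in the table above
arwu <- data.frame(
  University = c("University of Melbourne", "The University of Melbourne"),
  Year = c(2013L, 2014L),
  stringsAsFactors = FALSE
)

# Keep the scraped name and add a key column for grouping and joining
arwu$name_key <- normalise_university_name(arwu$University)

unique(arwu$name_key)
# [1] "University of Melbourne"
```

Keeping the key in a separate column means the scraped names are still available if the rule turns out to be too aggressive for some institution.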