
Read orderings from .soc, .soi, .toc or .toi files, which store ordinal preference data in the format defined by {PrefLib}: A Library for Preferences, into a preferences object.

Usage

read_preflib(
  file,
  from_preflib = FALSE,
  preflib_url = "https://www.preflib.org/static/data"
)

Arguments

file

A preferential data file, conventionally with extension .soc, .soi, .toc or .toi according to data type.

from_preflib

A logical which, when TRUE, attempts to source the file from PrefLib by prepending the database HTTP prefix.

preflib_url

The URL which will be prepended to file if from_preflib is TRUE.
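As a rough illustration of how file, from_preflib and preflib_url combine, the sketch below builds the address that would be requested. The helper preflib_path is hypothetical and not part of the package, and it assumes the prefix and file are joined with a single "/"; the package may construct the URL differently.

```r
# Hypothetical helper (not part of the package): returns the location that
# would be read, assuming prefix and file are joined by a single "/".
preflib_path <- function(file,
                         from_preflib = FALSE,
                         preflib_url = "https://www.preflib.org/static/data") {
  if (!from_preflib) {
    return(file)  # local path, used as given
  }
  # Drop any trailing/leading slashes so exactly one "/" separates the parts
  paste0(sub("/+$", "", preflib_url), "/", sub("^/+", "", file))
}

preflib_path("netflix/00004-00000138.soc", from_preflib = TRUE)
#> [1] "https://www.preflib.org/static/data/netflix/00004-00000138.soc"
```

With from_preflib = FALSE the file argument is returned unchanged, mirroring the default of reading a local file.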

Value

An aggregated_preferences object containing the PrefLib data.

Details

Note that PrefLib refers to the items being ordered as "alternatives".

The supported file types are:

.soc

Strict Orders - Complete List

.soi

Strict Orders - Incomplete List

.toc

Orders with Ties - Complete List

.toi

Orders with Ties - Incomplete List
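The mapping from extension to data type can be expressed as a simple lookup. The sketch below uses only base R and the tools package; preflib_types and preflib_type are hypothetical names introduced for illustration, and the package does not necessarily dispatch on the extension in this way.

```r
# Lookup table of the four extensions described above
preflib_types <- c(
  soc = "Strict Orders - Complete List",
  soi = "Strict Orders - Incomplete List",
  toc = "Orders with Ties - Complete List",
  toi = "Orders with Ties - Incomplete List"
)

# Hypothetical helper: describe a file by its extension
preflib_type <- function(file) {
  ext <- tolower(tools::file_ext(file))
  if (!ext %in% names(preflib_types)) {
    stop("Unsupported extension: '", ext, "'")
  }
  unname(preflib_types[ext])
}

preflib_type("cities/00034-00000001.soi")
#> [1] "Strict Orders - Incomplete List"
```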

The numerically coded orderings and their frequencies are read into a data frame, storing the item names as an attribute. The as.aggregated_preferences method converts these to an aggregated_preferences object with the items labelled by name.

A PrefLib file may be corrupt, in the sense that the ordered alternatives do not match their names. In this case, the file can be read in as a data frame (with a warning), but as.aggregated_preferences will throw an error.
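One way such a mismatch can arise is when an ordering uses a numeric code that has no corresponding item name. The base R sketch below checks for this on made-up data; the objects item_names and orderings and the helper unknown_codes are hypothetical, and the layout shown is an assumption rather than the exact structure the package produces.

```r
# Made-up item names, keyed by their numeric code
item_names <- c("1" = "Mean Girls",
                "2" = "Beverly Hills Cop",
                "3" = "The Mummy Returns")

# Made-up numerically coded orderings, one per row; code 4 has no name
orderings <- rbind(c(2, 1, 3),
                   c(1, 2, 4))

# Hypothetical helper: codes used in the orderings that lack a name
unknown_codes <- function(orderings, item_names) {
  codes <- unique(as.vector(orderings))
  codes <- codes[!is.na(codes)]  # NA would mark an unranked position
  sort(setdiff(codes, as.integer(names(item_names))))
}

unknown_codes(orderings, item_names)
#> [1] 4
```

A non-empty result indicates that the orderings cannot all be labelled by name, which is the kind of inconsistency that would be expected to prevent conversion.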

Note

The Netflix and cities datasets used in the examples are from Bennett and Lanning (2007) and Caragiannis et al. (2017) respectively. These datasets require a citation for re-use.

References

Mattei, N. and Walsh, T. (2013) PrefLib: A Library of Preference Data. Proceedings of the Third International Conference on Algorithmic Decision Theory (ADT 2013). Lecture Notes in Artificial Intelligence, Springer.

Bennett, J. and Lanning, S. (2007) The Netflix Prize. Proceedings of The KDD Cup and Workshops.

Examples


# Can take a little while depending on speed of internet connection

# \donttest{
# strict complete orderings of four films on Netflix
netflix <- read_preflib("netflix/00004-00000138.soc", from_preflib = TRUE)
head(netflix)
#>                                preferences frequencies
#> 1 [Beverly Hills Cop > Mean Girls > M ...]          68
#> 2 [Mean Girls > Beverly Hills Cop > M ...]          53
#> 3 [Beverly Hills Cop > Mean Girls > T ...]          49
#> 4 [Mean Girls > Beverly Hills Cop > T ...]          44
#> 5 [Beverly Hills Cop > Mission: Impos ...]          39
#> 6 [The Mummy Returns > Beverly Hills  ...]          37
names(netflix$preferences)
#> [1] "Mean Girls"             "Beverly Hills Cop"      "The Mummy Returns"     
#> [4] "Mission: Impossible II"

# strict incomplete orderings of 6 random cities from 36 in total
cities <- read_preflib("cities/00034-00000001.soi", from_preflib = TRUE)
# }