Read orderings from .soc, .soi, .toc or .toi files, which store ordinal preference data in the format defined by {PrefLib}: A Library for Preferences, into a preferences object.
Usage
read_preflib(
file,
from_preflib = FALSE,
preflib_url = "https://www.preflib.org/static/data"
)
Arguments
- file
A preferential data file, conventionally with extension .soc, .soi, .toc or .toi according to data type.
- from_preflib
A logical which, when TRUE, will attempt to source the file from PrefLib by adding the database HTTP prefix.
- preflib_url
The URL which will be prepended to file, if from_preflib is TRUE.
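The interaction between file and preflib_url can be pictured with a minimal sketch. It assumes the prefix and the relative path are simply joined with a "/"; this is an illustration of the idea, not the package's actual code, and the variable full_url is introduced only for this example.

```r
# Assumption: the URL prefix and the relative file path are joined with "/".
# This mirrors the documented behaviour but is not taken from the package source.
preflib_url <- "https://www.preflib.org/static/data"
file <- "netflix/00004-00000138.soc"

# full_url is a hypothetical name used only in this sketch
full_url <- paste(preflib_url, file, sep = "/")
print(full_url)
```

Under that assumption, passing a different preflib_url would point the same relative path at a mirror or a local copy of the database.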
Value
An aggregated_preferences object containing the PrefLib data.
Details
Note that PrefLib refers to the items being ordered as "alternatives".
The file types supported are
- .soc
Strict Orders - Complete List
- .soi
Strict Orders - Incomplete List
- .toc
Orders with Ties - Complete List
- .toi
Orders with Ties - Incomplete List
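As a rough illustration of the list above, the extension of a file name can be mapped to its data type in base R. The helper describe_preflib_file() and the lookup vector preflib_types are hypothetical names invented for this sketch; they are not part of the package.

```r
# Lookup table built from the four file types listed above.
preflib_types <- c(
  soc = "Strict Orders - Complete List",
  soi = "Strict Orders - Incomplete List",
  toc = "Orders with Ties - Complete List",
  toi = "Orders with Ties - Incomplete List"
)

# Hypothetical helper (not a package function): report the data type
# implied by a file's extension, or stop if the extension is unsupported.
describe_preflib_file <- function(file) {
  ext <- tolower(tools::file_ext(file))
  if (!ext %in% names(preflib_types)) {
    stop("Unsupported PrefLib extension: '", ext, "'")
  }
  unname(preflib_types[ext])
}

describe_preflib_file("netflix/00004-00000138.soc")
```

A check like this may be useful before downloading, since the extension is only a convention and says nothing about whether the file contents are valid.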
The numerically coded orderings and their frequencies are read into a data frame, storing the item names as an attribute. The as.aggregated_preferences method converts these to an aggregated_preferences object with the items labelled by name.
A PrefLib file may be corrupt, in the sense that the ordered alternatives do
not match their names. In this case, the file can be read in as a data
frame (with a warning), but as.aggregated_preferences
will throw an error.
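The kind of mismatch described here can be sketched with a toy data frame in base R. The column names, the "item_names" attribute name and the layout are all assumptions made for illustration; the intermediate data frame produced by the package may be organised differently.

```r
# Toy stand-in for the intermediate data frame (hypothetical layout):
# each row is one ordering, coded by item number, plus its frequency.
orderings <- data.frame(
  rank1 = c(1, 2, 1),
  rank2 = c(2, 1, 3),
  rank3 = c(3, 3, 5),  # code 5 has no matching item name below
  frequency = c(68, 53, 49)
)

# "item_names" is an assumed attribute name used only in this sketch.
attr(orderings, "item_names") <- c(
  "1" = "Mean Girls",
  "2" = "Beverly Hills Cop",
  "3" = "The Mummy Returns"
)

# Collect every item code that appears in an ordering, then find the
# codes that have no corresponding name.
codes_used <- unique(unlist(orderings[c("rank1", "rank2", "rank3")]))
unnamed_codes <- setdiff(
  as.character(codes_used),
  names(attr(orderings, "item_names"))
)

if (length(unnamed_codes) > 0) {
  message("Ordered alternatives without a name: ",
          paste(unnamed_codes, collapse = ", "))
}
```

In this toy case the orderings can still be stored as numbers, but they cannot be labelled by name, which is the situation in which a conversion to named items would be expected to fail.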
Note
The Netflix and cities datasets used in the examples are from Bennett and Lanning (2007) and Caragiannis et al. (2017), respectively. These datasets require a citation for re-use.
References
Mattei, N. and Walsh, T. (2013) PrefLib: A Library of Preference Data. Proceedings of the Third International Conference on Algorithmic Decision Theory (ADT 2013). Lecture Notes in Artificial Intelligence, Springer.
Bennett, J. and Lanning, S. (2007) The Netflix Prize. Proceedings of The KDD Cup and Workshops.
Examples
# Can take a little while depending on speed of internet connection
# \donttest{
# strict complete orderings of four films on Netflix
netflix <- read_preflib("netflix/00004-00000138.soc", from_preflib = TRUE)
head(netflix)
#> preferences frequencies
#> 1 [Beverly Hills Cop > Mean Girls > M ...] 68
#> 2 [Mean Girls > Beverly Hills Cop > M ...] 53
#> 3 [Beverly Hills Cop > Mean Girls > T ...] 49
#> 4 [Mean Girls > Beverly Hills Cop > T ...] 44
#> 5 [Beverly Hills Cop > Mission: Impos ...] 39
#> 6 [The Mummy Returns > Beverly Hills ...] 37
names(netflix$preferences)
#> [1] "Mean Girls" "Beverly Hills Cop" "The Mummy Returns"
#> [4] "Mission: Impossible II"
# strict incomplete orderings of 6 random cities from 36 in total
cities <- read_preflib("cities/00034-00000001.soi", from_preflib = TRUE)
# }