Evaluating Ensemble Strategies for Recommender Systems
under Metadata Reduction
Lassion Laique Bomfim de Souza Santana
Federal University of Bahia, Department of Computer Science
Av. Ademar de Barros, 500, Ondina, Salvador, Bahia, Brazil 40170-010
lassionlaique@dcc.ufba.br

Alesson Bruno Santos Souza
Federal University of Bahia, Department of Computer Science
Av. Ademar de Barros, 500, Ondina, Salvador, Bahia, Brazil 40170-110
alessonbruno@dcc.ufba.br

Diego Lima Santana
Federal University of Bahia, Department of Computer Science
Av. Ademar de Barros, 500, Ondina, Salvador, Bahia, Brazil 40170-110
diegolsantana@dcc.ufba.br

Wendel Araújo Dourado
Federal University of Bahia, Department of Computer Science
Av. Ademar de Barros, 500, Ondina, Salvador, Bahia, Brazil 40170-110
wendelad@dcc.ufba.br

Frederico Araújo Durão
Federal University of Bahia, Department of Computer Science
Av. Ademar de Barros, 500, Ondina, Salvador, Bahia, Brazil 40170-110
freddurao@dcc.ufba.br
ABSTRACT
Recommender systems are information filtering tools that aspire to predict accurate ratings for users and items, with the ultimate goal of providing users with personalized and relevant recommendations. Recommender systems that rely on the combination of quality metadata, i.e., all descriptive information about an item, are likely to be successful in the process of finding what is relevant for a target user. The problem arises when either data is sparse or important metadata is not available, making it hard for recommender systems to predict proper user-item ratings. In particular, this study investigates how our proposed collaborative-filtering recommender performs when important metadata is reduced from a dataset. To evaluate our approach, we use the HetRec 2011 MovieLens 2k dataset with five different types of movie metadata (genres, tags, directors, actors and countries). By applying our approach of metadata reduction, we provide a comprehensive analysis of how mean average precision is affected as important metadata becomes unavailable.
1 INTRODUCTION
The amount of information available on the Web is growing at an
accelerated rate. Consequently, it has become difficult for users to
find things that appeal to their different interests as dealing with
such a vast number of options can be a laborious process. This
problem is known as Information Overload. Recommender systems
are software tools that seek to suggest relevant items to users based
on their past preferences. The suggestions provided are aimed at
supporting their users in various decision-making processes, such
as what items to buy, what music to listen to, or what news to read. These systems appear as valuable means for online users to cope with information overload by filtering, prioritizing, and efficiently delivering relevant information.

Recommender systems usually rely on items' metadata to perform their predictions. Therefore, it is important for them to have a good amount of metadata in order to generate good suggestions. For instance, knowing the cast of a movie and the name of its director, a recommender system can suggest movies to users more effectively than by considering its genre alone. Hence, employing and combining metadata can increase the possibilities of recommender systems to outdo themselves.

The problem of working with a lot of metadata, however, is that not all of it is always available, and sometimes one piece of metadata is more frequently available than the others. For example, genre and main cast information are more frequently associated with a movie than casting director or songwriter information. Although some related works [3, 7] show that the combination of metadata results in better recommendations, it is necessary to assess the quality of the recommendations given when some of that information is not present. Furthermore, it is important to understand whether or not all of that information is needed, which parts are actually relevant, and which prevail over the others. As future work we will also evaluate how performance relates to the reduction of metadata; this was not contemplated here because the initial objective was to analyze precision.

Based on the problem described above, this article investigates the reduction of the amount of metadata used by recommender systems in order to observe the behavior of the resulting predictions. We evaluate our approach in the context of movies by reducing the amount of movie metadata (genres, actors, directors, countries, tags) used by the recommendation algorithm MABPR. We then compare its performance against other algorithms, using the mean average precision as metric. We also show that the metadata with the greatest diversity of information, in this case the tag metadata, has a very significant impact on the algorithm, as the largest drop in prediction performance occurred after its withdrawal.

This work is structured as follows: Section 2 introduces and describes related work; Section 3 describes the models considered in this evaluation; Section 4 introduces the details of the proposed ensemble framework and strategies; Sections 5 and 6 depict how the metadata is extracted, along with the assessment of different metadata reductions applied to the different strategy algorithms, and present the results; finally, Section 7 presents the conclusions, final remarks, and future work.
2 RELATED WORK
Various approaches consider different metadata to recommend items. An ensemble approach combines the predictions of different algorithms, or of the same algorithm with different parameters, to obtain a final prediction. Ensemble algorithms have been used successfully, for instance, in the Netflix Prize contest, where they formed the core of the majority of the top performing solutions [22, 28].

Most of the related works in the literature point out that ensemble learning has been used in recommender systems as a way of combining the predictions of multiple algorithms (heterogeneous ensemble) to create a stronger ranking [13], in a technique known as Blending. Ensembles have also been used with a single collaborative filtering algorithm (single-model or homogeneous ensemble), with methods such as Bagging and Boosting [2]. However, those solutions do not consider the multiple metadata present in the items, and are often not practical to implement because of their computational cost and complexity. In the case of a heterogeneous ensemble, it is necessary to train all models in parallel and treat the ensemble as one big model, but unfortunately training 100+ models in parallel and tuning all parameters simultaneously is computationally not feasible [28]. In contrast, a homogeneous ensemble demands that the same model be trained multiple times, using methods such as Boosting. Beltrão et al. [3] tried a different approach and combined multiple metadata by concatenating them, obtaining a modest performance increase.

Leo Breiman proposed the Bagging method [5, 6]. Bagging stands for "Bootstrap Aggregation" and uses the composite outputs of several machine learning techniques to boost the performance and stability of predictions. The model used is a special form of the mean method. Bumping stands for "Bootstrap Umbrella of the Model Parameter" and only considers the model with the lowest percentage of errors [16, 27]. The results show that Bumping surpasses the performance of Bagging.

In comparison to the aforementioned approaches, our method uses three different ensemble strategies to combine distinct metadata, with the advantage that it does not require the algorithm to be modified or to be trained multiple times with the same dataset. Therefore, it can be used with any current recommender system. This is possible because our method relies only on the user predictions, which is the least possible information in any recommender system. Our approach involves two voting strategies and a weighted strategy in which the parameters are optimized using a Genetic Algorithm.
3 CONSIDERED MODEL
In this section we describe in more detail the models used to study and compare the different types of metadata considered in this paper. In the next two subsections, we present a set of metadata-aware algorithms which use MABPR [10] to personalize a ranking of items using only implicit feedback. These techniques will be considered in our evaluation in the context of movie recommendation.

3.1 Notation
Following the same notation as in [14, 17], we use special indexing letters to distinguish users, items and metadata: a user is indicated as u, an item is referred to as i, j, k, and an item's metadata as g. The notation r_ui is used to refer to explicit or implicit feedback from a user u about an item i. In the first case, it is an integer provided by the user, indicating how much they liked the content; in the second, it is just a boolean indicating whether or not the user consumed or visited the content. The system's prediction about the preference of user u for item i is represented by r̂_ui, which is a floating point value calculated by the recommender algorithm. The set of pairs (u, i) for which r_ui is known is represented by the set K = {(u, i) | r_ui is known}.

Additional sets used in this paper are: N(u), to indicate the set of items for which user u provided implicit feedback, and N̄(u), to indicate the set of items that are unknown to user u.

3.2 MABPR
BPR-Linear [10] is an algorithm based on the Bayesian Personalized Ranking (BPR) framework, which uses items' metadata in a linear mapping for score estimation. One disadvantage of the BPR
algorithms is that they are not able to infer any conclusion when the
items i and j are both known (or both are unknown). In other words,
if an item has been viewed by the user, it is possible to conclude
that this content is preferred over all other unknown items, as it
aroused a particular interest to them. On the other hand, when both
items are known (or both are unknown), it is not possible to infer
which one is preferred over the other because the system only has
the positive/negative feedback from the user. Consequently, those
pairs which belong to the same class (positive or negative) will not
be able to be ranked accordingly, as the model will be learned only
by using the specific case where one item is known and the other
is not.
Manzato et al. [18] proposed an extension to the BPR technique,
named MABPR, which also considers items’ metadata in order to
infer the relative importance of two items.
It starts by redefining the set D_K, which contains the data used during training, to D'_K := {(u, i, j) | i ∈ N(u) & j ∈ N̄(u), or i ∈ N(u) & j ∈ N(u) ∪ N̄(u) & |G(i)| > 0 & |G(j)| > 0}, in order to consider the metadata available in the specified case, while also considering items without descriptions.

Figure 1 shows how the proposed extension affects the relationship between items i and j with respect to the preferences of user u. Because items i2, i4 and i5 are known, the system has to analyze their metadata to infer which one is preferred over the other. This is the role of the function δ(i, j), which is defined as:
    δ(i, j) = + if φ(u, i) > φ(u, j); − if φ(u, i) < φ(u, j); ? otherwise,   (1)

where φ(u, ·) is defined as:

    φ(u, ·) = (1 / |G(·)|) · Σ_{g ∈ G(·)} w_ug,   (2)

and w_ug is a weight indicating how much u likes a description g ∈ G(·).

Figure 1: As an extension to the approach of Rendle et al., Manzato et al. consider the metadata describing items i and j when both are known (i ∈ N(u) & j ∈ N(u)). The function δ(i, j) returns positive if user u prefers the description of item i over the description of item j, and negative otherwise.

This approach enhances the BPR algorithm with further insight about the user's preferences by considering his personal opinions about particular descriptions of items. Such metadata can be of any type: genres of movies/music, keywords, list of actors, authors, etc. The mechanism used to infer such opinions w_ug by analyzing only the training data is accomplished by adopting the same linear attribute-to-feature mapping.
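To make the role of equations (1) and (2) concrete, the following minimal Python sketch scores two known items for a user and compares their descriptions. The weight dictionary and description sets are hypothetical stand-ins: in the actual model the weights w_ug are learned through the attribute-to-feature mapping mentioned above, not hard-coded.

    # Minimal sketch of the comparison function delta(i, j) from Eqs. (1)-(2).
    # The weights w_ug are assumed to have been learned beforehand; here they
    # are hard-coded for illustration only.

    def phi(user_weights, item_descriptions):
        """Average preference of a user for an item's descriptions (Eq. 2)."""
        if not item_descriptions:
            return 0.0
        return sum(user_weights.get(g, 0.0) for g in item_descriptions) / len(item_descriptions)

    def delta(user_weights, desc_i, desc_j):
        """Returns '+', '-' or '?' depending on which description set is preferred (Eq. 1)."""
        pi, pj = phi(user_weights, desc_i), phi(user_weights, desc_j)
        if pi > pj:
            return '+'
        if pi < pj:
            return '-'
        return '?'

    # Hypothetical example: user u likes "horror" more than "comedy".
    w_u = {"horror": 0.9, "comedy": 0.2, "thriller": 0.6}
    print(delta(w_u, {"horror", "thriller"}, {"comedy"}))  # '+'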
4 ENSEMBLE ALGORITHMS
The algorithm presented in Section 3 supports only one metadata attribute per item. This is a point of improvement, as it is common for an item to have multiple attributes. A related work [3] points out this problem and uses multiple metadata by concatenating the different types of attributes into a single metadata x item list. However, the performance improvement observed was moderate. In this paper, the proposed ensemble framework consists of training the recommender system for each different piece of metadata and combining the resulting predictions with one of the three ensemble strategies presented next.

The strategies elicited here were inspired by group decision-making methods that combine several users' preferences to aggregate item-ranking lists. According to Senot et al. [25], there are three categories of strategies: majority-based strategies, which strengthen the "most popular" choice among the group, e.g. Borda Count and Plurality Voting; consensus-based strategies, which average all the available choices in some way, e.g. Additive Utilitarian and Average without Misery; and borderline strategies, also known as role-based strategies, which only consider a subset of choices based on user roles or any other relevant criterion, e.g. Dictatorship, Least Misery and Most Pleasure.

Before introducing the ensemble algorithms, we need to recall that our recommenders produce a ranking of items. To generate recommendations, this ranking-oriented recommender receives as input a dataset of ratings as tuples ⟨u, i, r⟩, and outputs a matrix M_UI, where U is the set of all users and I is the set of all items known by the recommender system. Each row of the matrix M is composed of a vector of tuples ⟨i, r̂_ui⟩, ordered by the item score prediction r̂_ui for the user u. The ensemble algorithms proposed in this paper can be formally defined as a function f : M^K → M, where the input is a vector of K predictions and the output is a matrix of the combined predictions.

In Subsection 4.1 we present the Most Pleasure strategy, a straightforward ensemble strategy that combines predictions based on score. In Subsection 4.2, we describe the Best of All strategy, which determines a preferred metadata attribute for a user and uses it to create the ensemble. Finally, in Subsection 4.3 the Weighting strategy is presented; it uses multiple metadata and weighs them with a Genetic Algorithm that optimizes the Mean Average Precision (MAP) metric.

4.1 Most Pleasure Strategy
Figure 2: Most Pleasure strategy.
The Most Pleasure strategy is a classic aggregation method, often
used for combining individual ratings for group rating [19]. It takes
the maximum of individual ratings for a specific item and creates
a unified rank. Figure 2 illustrates the Most Pleasure strategy, in which the output comprises a ranked list of movies with the highest ratings from two distinct input sets.
Algorithm 1 shows that it only needs the generated prediction
vector as an input. This vector is composed of the predictions from
the recommender algorithm trained with one of the metadata attributes. For each user, a new prediction is created, selecting the
highest score of an item among all the individually-trained algorithms.
The idea behind this strategy is that differently trained algorithms have a distinct knowledge about the user’s preferences, and
the predicted score can be considered an indicator of the algorithm’s
confidence. So the created ensemble is a list of items which the
distinct algorithms have more confidence to recommend.
Input: Vector of predictions, P
Output: Predictions ensemble M
for u = 1, ..., #Users do
    for i = 1, ..., #Items do
        Select the highest r̂_ui for item i among the K predictions for user u
        M_ui ← (i, r̂_ui)   // store the highest score
    end
    Sort M_u by r̂_ui
end
Algorithm 1: Most Pleasure algorithm.
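As a concrete illustration of Algorithm 1, the sketch below re-expresses the Most Pleasure combination in Python. The prediction dictionaries are hypothetical toy structures; the actual implementation operates on the ranking matrices produced by the per-metadata MABPR recommenders.

    # Sketch of the Most Pleasure ensemble: for each (user, item), keep the
    # highest score among the K per-metadata prediction dictionaries, then
    # rank each user's items by that combined score.

    def most_pleasure(predictions):
        """predictions: list of dicts mapping (user, item) -> predicted score."""
        ensemble = {}
        for pred in predictions:
            for (u, i), score in pred.items():
                if score > ensemble.get((u, i), float("-inf")):
                    ensemble[(u, i)] = score
        # Build a per-user ranking, ordered by the combined score.
        ranking = {}
        for (u, i), score in ensemble.items():
            ranking.setdefault(u, []).append((i, score))
        for u in ranking:
            ranking[u].sort(key=lambda pair: pair[1], reverse=True)
        return ranking

    # Hypothetical example with two algorithms trained on different metadata.
    genre_pred = {("u1", "m1"): 0.8, ("u1", "m2"): 0.3}
    tag_pred   = {("u1", "m1"): 0.5, ("u1", "m2"): 0.9}
    print(most_pleasure([genre_pred, tag_pred]))  # {'u1': [('m2', 0.9), ('m1', 0.8)]}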
4.2 Best of All Strategy
The Most Pleasure strategy gives the same weight for different types
of metadata. However, it is natural to assume that different types
of metadata can affect users differently. In contrast, the Best of All
strategy considers the recommendation algorithm that provides the
best results for a specific user, and uses this algorithm to provide
future predictions as illustrated in Figure 3.
Figure 3: Best of All Strategy.
Algorithm 2 requires as input i) the recommendation algorithm, ii) a training dataset, iii) a probe dataset, and iv) the vector of metadata. Differently from the Most Pleasure strategy, this one requires a probe run to determine which is the best performing algorithm. Therefore, the dataset is divided into training and probe sets. The algorithm is first trained using each piece of metadata individually. Then, for each user, a probe is made to determine the metadata with the highest performance. This performance is indicated by the Mean Average Precision (MAP) metric [12], often used for ranked recommendations. Finally, the algorithms are retrained using all data (including the probe set), and the final ensemble is the result of the combination of predictions using, for each user, the prediction from the algorithm with the highest performance in the probe test.
The idea behind this algorithm is that a single metadata attribute
can greatly influence the user’s preferences, and this should be used
for future predictions. For instance, if User A enjoys films from a
particular genre such as “horror”, and User B enjoys films of some
specific theme, which is represented by a tag, such as “bloody films”,
the ensemble will contain predictions from the recommendation
algorithm trained with both: genre metadata for User A, i.e. “horror”,
and tag metadata for User B, i.e. “bloody”.
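The per-user selection step described above can be sketched as follows. This is illustrative Python with hypothetical data structures, not the paper's implementation; the per-user MAP computed on the probe set plays the role it has in Algorithm 2 below.

    # Sketch of the Best of All selection step: for each user, evaluate every
    # per-metadata recommender on the probe set and keep the one with the best
    # MAP (here, average precision per user) to provide that user's predictions.

    def average_precision(ranked_items, relevant_items):
        hits, score = 0, 0.0
        for rank, item in enumerate(ranked_items, start=1):
            if item in relevant_items:
                hits += 1
                score += hits / rank
        return score / len(relevant_items) if relevant_items else 0.0

    def best_of_all(per_metadata_rankings, probe_relevant):
        """per_metadata_rankings: {metadata_name: {user: [items, best first]}}.
        probe_relevant: {user: set of relevant items in the probe set}.
        Returns {user: metadata_name} with the best-performing metadata per user."""
        best = {}
        for user, relevant in probe_relevant.items():
            scored = {name: average_precision(rankings.get(user, []), relevant)
                      for name, rankings in per_metadata_rankings.items()}
            best[user] = max(scored, key=scored.get)
        return best

    # Hypothetical probe: for u1 the tag-trained recommender ranks the relevant item higher.
    per_metadata = {"genre": {"u1": ["m2", "m1"]}, "tag": {"u1": ["m1", "m2"]}}
    relevant = {"u1": {"m1"}}
    print(best_of_all(per_metadata, relevant))  # {'u1': 'tag'}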
Input: T - Training dataset of ratings ⟨U, I, R⟩
Input: P - Probe dataset of ratings ⟨U, I, R⟩
Input: A - Vector of metadata
Input: PredAlg - the base prediction algorithm
Output: Predictions ensemble M
for m = 1, ..., #Metadata do
    K_m ← PredAlg trained with the T dataset and A_m
end
for u = 1, ..., #Users do
    Evaluate all K algorithms against the P dataset and select the one with the highest MAP for user u as highest_u
end
for m = 1, ..., #Metadata do
    K_m ← PredAlg trained with the T+P dataset and A_m
end
for u = 1, ..., #Users do
    r̂_u ← K_highest_u(u)
    M_u ← r̂_u
end
Algorithm 2: Best of All algorithm.

4.3 Weighting Strategy
One drawback of the Best of All strategy is that it considers that only one metadata attribute influences the user preference. However, it is natural to assume that the interests of a user may be influenced by more than one attribute, and at different levels. The Weighting strategy considers all available metadata, assigning different weights to each prediction, as illustrated in Figure 4.

Figure 4: Weighting Strategy.
Similarly to the previous strategy, Algorithm 3 requires as input i) the recommendation algorithm, ii) a training and a probe dataset, and iii) the vector of metadata. After training the algorithm using each piece of metadata individually, a probe run is also needed; however, the objective here is to determine the optimal weights for each user. This is an optimization problem and was solved using a Genetic Algorithm (GA). GA is particularly appealing for this type of problem due to its ability to handle multi-objective problems. In addition, the parallelism of GA allows the search space to be covered with less likelihood of returning local extremes [20].

The probe part consists of running the GA to find the optimal weights. We implemented our algorithm using the GA Framework proposed by Newcombe [20], in which the weights are the chromosomes and the fitness function is the MAP score against the probe dataset. Other GA characteristics include the use of 5% Elitism, Double Point crossover, and Binary Mutations. Finally, the algorithms are retrained using all data (including the probe set), and the final ensemble uses, as the item score, the sum of the individual predictions multiplied by the weights found in the probe phase and divided by the total number of metadata.
Input: T - Training dataset of ratings ⟨U, I, R⟩
Input: P - Probe dataset of ratings ⟨U, I, R⟩
Input: A - Vector of metadata
Input: PredAlg - the base prediction algorithm
Output: Predictions ensemble M
for m = 1, ..., #Metadata do
    K_m ← PredAlg trained with the T dataset and A_m
end
for u = 1, ..., #Users do
    Get weights w_u for all K algorithms against the P_u dataset for user u using a Genetic Algorithm, where MAP is the fitness function
end
for m = 1, ..., #Metadata do
    K_m ← PredAlg trained with the T+P dataset and A_m
end
for u = 1, ..., #Users do
    r̂_ui ← ( Σ_{m=1}^{#Metadata} w_um · K_m ) / #Metadata
    M_ui ← r̂_ui
end
Algorithm 3: Weighting algorithm.
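A minimal sketch of the final combination step of Algorithm 3 is shown below, assuming the per-user weights have already been found during the probe phase. The score lists and weights are hypothetical values used only for illustration.

    # Sketch of the Weighting combination step: the final score of item i for
    # user u is the weight-scaled sum of the per-metadata predictions, divided
    # by the number of metadata types (last loop of Algorithm 3). The weights
    # are assumed to come from the GA probe phase described above.

    def weighting_score(per_metadata_scores, weights):
        """per_metadata_scores: list of K scores r̂_ui, one per metadata type.
        weights: list of K user-specific weights found in the probe phase."""
        k = len(per_metadata_scores)
        return sum(w * s for w, s in zip(weights, per_metadata_scores)) / k

    # Hypothetical user: genre and tag predictions matter more than country ones.
    scores_u_i = [0.7, 0.4, 0.9]          # e.g., genre, country, tag recommenders
    weights_u  = [0.8, 0.1, 0.9]
    print(weighting_score(scores_u_i, weights_u))  # 0.47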
The idea behind it is that the different types of metadata influence the user's preference differently. Still in the context of movies, let us consider two users: User A, who enjoys films from a certain set of genres but does not care about the production country, and User B, who cares about neither film genre nor country of production. For User A, the ensemble should give a higher weight to the film genre and a lower weight to the production country. In contrast, for User B, the ensemble should distribute the weights equally between those metadata.

The technique used does not undergo any modification for the analysis of metadata reduction that will be presented in an upcoming section, maintaining the coherence and the integrity of the results.
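To illustrate the probe phase, the sketch below shows a deliberately simplified genetic algorithm that searches for a weight vector maximizing a given fitness function (in this work, the MAP score on the probe set). It is not the Newcombe GA Framework [20] used in the paper: elitism, crossover and mutation here are basic single-point/uniform variants, and only the population size, number of generations, and crossover and mutation probabilities follow the values reported in Section 5.1.

    # Simplified GA sketch for the Weighting probe phase (illustrative only).
    import random

    def evolve_weights(fitness, n_weights, pop_size=40, generations=90,
                       crossover_p=0.8, mutation_p=0.08, seed=42):
        """fitness: callable mapping a weight vector to a score (e.g., probe MAP)."""
        rnd = random.Random(seed)
        population = [[rnd.random() for _ in range(n_weights)] for _ in range(pop_size)]
        for _ in range(generations):
            scored = sorted(population, key=fitness, reverse=True)
            next_gen = scored[:2]                             # simple elitism: keep the 2 best
            while len(next_gen) < pop_size:
                a, b = rnd.sample(scored[:pop_size // 2], 2)  # select parents from the better half
                child = a[:]
                if rnd.random() < crossover_p:                # single-point crossover
                    cut = rnd.randrange(1, n_weights) if n_weights > 1 else 0
                    child = a[:cut] + b[cut:]
                child = [w if rnd.random() > mutation_p else rnd.random() for w in child]
                next_gen.append(child)
            population = next_gen
        return max(population, key=fitness)

    # Hypothetical fitness: pretend the ideal weights are [0.8, 0.1, 0.9].
    target = [0.8, 0.1, 0.9]
    fit = lambda w: -sum((wi - ti) ** 2 for wi, ti in zip(w, target))
    print([round(w, 2) for w in evolve_weights(fit, 3)])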
5 EVALUATING METADATA REDUCTION
The goal of this evaluation is to assess metadata reduction under the three ensemble strategies using the MABPR algorithm. We used the MABPR recommendation algorithm available in the MyMediaLite library [11], presented in Section 3, with 5 different types of metadata (genres, actors, directors, tags and countries). The Mean Average Precision (MAP) metric was used to measure the accuracy of the recommendations.

5.1 Methodology
For generating recommendations we used 10 latent factors; in a preliminary run, MAP was the most significant metric, showing the largest percentage decrease in performance upon the withdrawal of metadata. The Genetic Algorithm (GA) uses a population of size 40, with 90 generations, a crossover probability of 80%, and a mutation probability of 8%. Usually a higher number of generations is used for convergence; however, due to the size of our dataset, a moderate number was used.

We split the dataset randomly into two sets, with an 80:20 ratio, and used them for training and evaluation, respectively. Furthermore, due to the need for a probe run in some of the ensemble strategies presented in Section 4, 25% of the training dataset was split again for the probe run, resulting in a 60:20:20 division. It is important to note that, during the evaluation, the algorithm is trained with the full training dataset. To summarize, the ensemble was created with an algorithm trained with the 60% dataset and evaluated with the 20% probe dataset; later, with the ensemble created, the algorithm was trained again, this time with the full 80% training dataset, and evaluated with the evaluation dataset.

A total of four evaluations were carried out and, at each evaluation, one metadata attribute was withdrawn. For each scenario with reduced metadata, the three ensemble strategies were tested. As the literature offers no specific guidance on grouping or reducing metadata, the criterion adopted for the configuration of each scenario was the popularity of each type of metadata in datasets used in evaluations of recommender systems.
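The protocol above can be summarized in the following illustrative Python sketch of the 60:20:20 split and of the MAP metric. It is a simplified re-implementation for clarity, not the MyMediaLite evaluation code, and the example data are hypothetical.

    # Sketch of the evaluation protocol: an 80:20 train/test split with 25% of
    # the training portion held out as a probe set (60:20:20 overall), and Mean
    # Average Precision (MAP) over the users' ranked lists.
    import random

    def three_way_split(ratings, seed=42):
        """ratings: list of (user, item, rating) tuples -> (train, probe, test)."""
        rnd = random.Random(seed)
        shuffled = ratings[:]
        rnd.shuffle(shuffled)
        n = len(shuffled)
        train_end, probe_end = int(0.6 * n), int(0.8 * n)
        return shuffled[:train_end], shuffled[train_end:probe_end], shuffled[probe_end:]

    def average_precision(ranked_items, relevant_items):
        """AP for one user: precision accumulated at each rank holding a relevant item."""
        hits, score = 0, 0.0
        for rank, item in enumerate(ranked_items, start=1):
            if item in relevant_items:
                hits += 1
                score += hits / rank
        return score / len(relevant_items) if relevant_items else 0.0

    def mean_average_precision(rankings, relevant):
        """rankings: {user: [items ranked best-first]}, relevant: {user: set(items)}."""
        users = [u for u in rankings if relevant.get(u)]
        return sum(average_precision(rankings[u], relevant[u]) for u in users) / len(users)

    # Hypothetical example with two users.
    rankings = {"u1": ["m1", "m2", "m3"], "u2": ["m4", "m5"]}
    relevant = {"u1": {"m1", "m3"}, "u2": {"m5"}}
    print(mean_average_precision(rankings, relevant))  # ≈ 0.667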
5.2 Dataset and Metadata
All tests were executed with the HetRec 2011 MovieLens 2k dataset [8], an extension of the MovieLens 10M dataset, which contains personal ratings and tags about movies. In the dataset, MovieLens movies are linked to the Internet Movie Database (IMDb, http://www.imdb.com) and Rotten Tomatoes (RT, http://www.rottentomatoes.com) movie review systems. Each movie has its IMDb and RT identifiers, English and Spanish titles, picture URLs, genres, directors, actors (ordered by "popularity"), countries, filming locations, and RT audience's and experts' ratings and scores. The dataset was composed of 2113 users with 855598 ratings on 10197 movies, including the relation between 20 movie genres, 4060 directors, 95321 actors, 72 countries and 13222 tags.

6 EVALUATION RESULTS
We compare the MAP results in the various scenarios generated by the reduction of metadata. As can be seen in the charts below, there is a decline in the value of the metric as the amount of metadata is reduced.

Our results point to a significant decrease of quality as the number of metadata attributes is gradually reduced from five to one. For the ensemble strategy algorithms used (the last three bars of the graphs), we see a decrease of up to 1.4% in relation to the Most Pleasure strategy, a relatively low difference, but this strategy has more relevant results in simpler, weaker algorithms, depending on the scenario used.

The Best of All strategy considers the best results generated by the recommendation algorithm for a specific user. Removing the tag metadata generated a reduction of 3.3% in the evaluation. That happens due to the fact that tags usually represent a more diverse set of information and, sometimes, they can even be used to simulate a combination of metadata. The idea behind this algorithm is that a single piece of metadata can greatly influence a user's
preferences, and this should be used for future predictions. For example, if a user A likes movies of a particular genre, such as "action", and another user B likes films of a specific theme, which is represented by tags, such as "police movies", the set of predictions generated by the recommendation algorithm will contain forecasts based on each of those attributes: genre metadata for user A, that is, "action", and tag metadata for user B, that is, "police movies".

Figure 5: MAP score results using the MABPR algorithm. The first five bars are the results for the MABPR recommender algorithm using five types of metadata, whereas the last three bars are the results for the proposed ensemble algorithms.

Figure 6: MAP score results using the MABPR algorithm. The first four bars are the results for the MABPR recommender algorithm using only four types of metadata, whereas the last three bars are the results for the proposed ensemble algorithms.

Figure 7: MAP score results using the MABPR algorithm. The first three bars are the results for the MABPR recommender algorithm using only three types of metadata, whereas the last three bars are the results for the proposed ensemble algorithms.

Figure 8: MAP score results using the MABPR algorithm. The first two bars are the results for the MABPR recommender algorithm using only two types of metadata, whereas the last three bars are the results for the proposed ensemble algorithms.
Furthermore, the most significant result was obtained using the Weighting strategy, with which the quality decrease reached approximately 10% between five and one metadata attributes used (Figures 5 and 9): the MAP results are 0.16717 for one attribute and 0.18369 for five. This can be explained by the fact that the Weighting strategy uses metadata both to make predictions and to assign relevance weights according to an individual user's taste. This behavior can also be pointed to as the cause of the small difference between the results of our first reduction, from five metadata attributes (Figure 5) to four (Figure 6). In that case, the decrease of quality was of only 0.01%, considered irrelevant because the metadata removed, countries, carries a lower weight, since most of the films in the dataset were made in the same country (USA). As was the case with the Best of All strategy, removing the tag metadata had the most impact on the decrease of the quality of predictions, with a difference of about 6% from when it was being used, for the reasons already mentioned in this paper.

Figure 9: MAP score results using the MABPR algorithm. The first bar is the result for the MABPR recommender algorithm using only one type of metadata, whereas the last three bars are the results for the proposed ensemble algorithms.

Therefore, we conclude that the combination of metadata allows the tested algorithm to produce better results than when using only one of the metadata attributes. Based on the results, we observe that the Weighting strategy was the one that suffered the greatest drop in prediction quality with the reduction of the amount of metadata, and that tag metadata is the most important in any strategy used, precisely for having a greater diversity of information.

6.1 Discussion
Recommender systems are trained according to each user's (or group of users') information and are based on their preferences. An important challenge is to accurately identify the user's context and organize information so that the predictions match their tastes.

In this article a method for reducing the amount of metadata used is proposed in order to analyze the behavior of the generated recommendations. After the analysis, we observed a decrease in the quality of predictions, measured using the MAP metric.

From five metadata attributes (Figure 5) to two (Figure 8), the tag metadata is the most relevant one, possibly because it has more information aggregated in it [3]. The actor metadata produces results as good as the director one. A possible reason for that is the importance that these two key attributes have within a movie context.

The Weighting strategy has shown good results, but the other two strategies can also be used depending on the scenario. For example, the Most Pleasure strategy was weak, but the cost of using it is very low. Following the same path, the Best of All strategy has produced a much greater increase in performance; however, it requires a probe run and also the use of GA weight optimization, an expensive step in the process of generating recommendations. For one piece of metadata we understand that no ensemble algorithm should be used, as they perform similarly to the non-ensemble approaches.

Considering different metadata in another domain, such as music, according to [4], "Possible metadata includes editorial information, social tags, and user listening/consumption behavior in form of listening statistics, such as playcounts and artist charts, sell histories, and user ratings." Other domains may suffer the same loss of accuracy when metadata is reduced, and experiments and tests should be performed to analyze which metadata are the most influential. To apply the algorithm to other domains, the metadata must be selected and combined in order to calculate the accuracy of each combination.
7 CONCLUSION
In this paper we analyzed the behavior of recommendation algorithms after reducing the amount of metadata they take into account. We observed that metadata reduction did not contribute to an improvement of the generated recommendations; given the observed results, it became evident that metadata are an important source of information. In order to generate significant recommendations of items, closely related to the users' interests, a good amount of information is required.

From the reduction of metadata and the analysis of the behavior of the recommendations, we conclude that the combination of metadata allows recommendation algorithms to generate better results than when using just one metadata attribute. We also conclude that users' information and preferences are essential to generate better predictions. Each time a reduction was made, more user information was being removed; accordingly, we observed that the quality of the resulting recommendations decreased. However, this strategy has more relevant results in simpler, weaker algorithms, and depends on the scenario in which it is being used.

Future work includes evaluating the individual relevance of each piece of metadata. It would be a good strategy to use a more diverse mix of metadata. Five attributes result in 5! combinations, that is, a total of 120 scenarios to which all proposed algorithms could be applied, allowing us to know how they
would behave when certain metadata are removed at any given
moment.
As we do not want to be limited to movies, other domains can also be tested, such as games or books, using a vast amount of metadata and applying the same method with other metrics, in order to evaluate how the algorithms behave with respect to certain metadata and to observe their relevance to good accuracy.
ACKNOWLEDGMENTS
The authors would like to thank the financial support from CNPq-UFBA.

REFERENCES
[1] Gediminas Adomavicius and Alexander Tuzhilin. 2005. Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible
Extensions. IEEE Transactions on Knowledge and Data Engineering 17, 6 (June
2005), 734–749. https://doi.org/10.1109/TKDE.2005.99
[2] Ariel Bar, Lior Rokach, Guy Shani, Bracha Shapira, and Alon Schclar. 2013.
Improving simple collaborative filtering models using ensemble methods. In
Multiple Classifier Systems. Springer Berlin Heidelberg, Berlin, Heidelberg, 1–12.
https://doi.org/10.1007/978-3-642-38067-9_1
[3] Renato Dompieri Beltrão, Marcelo Garcia Manzato, Bruno Souza Cabral, and
Frederico Araújo Durão. Personalized ranking of movies: Evaluating different
metadata types and recommendation strategies using multiple metadata. In
BRACIS, in press.
[4] Dmitry Bogdanov and Perfecto Herrera. 2011. How Much Metadata Do We Need
in Music Recommendation? A Subjective Evaluation Using Preference Sets. In
ISMIR. 97–102.
[5] Leo Breiman. 1994. Heuristics of instability in model selection. Technique Report.
Statistics Department. University of California at Berkeley (1994).
[6] Leo Breiman. 1996. Bagging predictors. Machine learning 24, 2 (1996), 123–140.
[7] Bruno Souza Cabral, Renato Dompieri Beltrão, Marcelo Garcia Manzato, and
Frederico Araújo Durão. 2014. Combining Multiple Metadata Types in Movies
Recommendation Using Ensemble Algorithms. In Proceedings of the 20th Brazilian
Symposium on Multimedia and the Web (WebMedia ’14). ACM, New York, NY,
USA, 231–238. https://doi.org/10.1145/2664551.2664569
[8] Ivan Cantador, Peter Brusilovsky, and Tsvi Kuflik. 2011. Second Workshop on
Information Heterogeneity and Fusion in Recommender Systems (HetRec2011).
In Proceedings of the Fifth ACM Conference on Recommender Systems (RecSys ’11).
ACM, New York, NY, USA, 387–388. https://doi.org/10.1145/2043932.2044016
[9] Michael D. Ekstrand, John T. Riedl, and Joseph A. Konstan. 2011. Collaborative
Filtering Recommender Systems. Found. Trends Hum.-Comput. Interact. 4, 2 (Feb.
2011), 81–173. https://doi.org/10.1561/1100000009
[10] Zeno Gantner, Lucas Drumond, Christoph Freudenthaler, Steffen Rendle, and
Lars Schmidt-Thieme. 2010. Learning Attribute-to-Feature Mappings for Cold-Start Recommendations. In Proceedings of the 2010 IEEE International Conference
on Data Mining (ICDM ’10). IEEE Computer Society, Washington, DC, USA,
176–185. https://doi.org/10.1109/ICDM.2010.129
[11] Zeno Gantner, Steffen Rendle, Christoph Freudenthaler, and Lars Schmidt-Thieme. 2011. MyMediaLite: A Free Recommender System Library. In Proceedings
of the Fifth ACM Conference on Recommender Systems (RecSys ’11). ACM, New
York, NY, USA, 305–308. https://doi.org/10.1145/2043932.2043989
[12] Abby Goodrum. 2000. Image Information Retrieval: An Overview of Current Research. Informing Science 3 (2000), 63–66.
[13] Michael Jahrer, Andreas Töscher, and Robert Legenstein. 2010. Combining Predictions for Accurate Recommender Systems. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '10). ACM, New York, NY, USA, 693–702. https://doi.org/10.1145/1835804.1835893
[14] Yehuda Koren. 2010. Factor in the Neighbors: Scalable and Accurate Collaborative Filtering. ACM Trans. Knowl. Discov. Data 4, 1, Article 1 (Jan. 2010), 24 pages. https://doi.org/10.1145/1644873.1644874
[15] Miklós Kurucz, András A. Benczúr, and Balázs Torma. 2007. Methods for large scale SVD with missing values. In KDD Cup and Workshop 2007.
[16] Y. Lee and S. Kwak. 1999. A Study on Training Ensembles of Neural Networks - A Case of Stock Price Prediction. Journal of Intelligence and Information Systems 5 (1999), 95–101.
[17] Marcelo Garcia Manzato. 2013. gSVD++: Supporting Implicit Feedback on Recommender Systems with Metadata Awareness. In Proceedings of the 28th Annual ACM Symposium on Applied Computing (SAC '13). ACM, New York, NY, USA, 908–913. https://doi.org/10.1145/2480362.2480536
[18] Marcelo G. Manzato, Marcos A. Domingues, and Solange O. Rezende. 2014. Optimizing Personalized Ranking in Recommender Systems with Metadata Awareness. In Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on (WI-IAT '14), Vol. 1. IEEE Computer Society, Washington, DC, USA, 191–197. https://doi.org/10.1109/WI-IAT.2014.33
[19] Judith Masthoff. 2011. Group Recommender Systems: Combining Individual Models. In Recommender Systems Handbook. Springer US, Boston, MA, 677–702. https://doi.org/10.1007/978-0-387-85820-3_21
[20] J. Newcombe. 2013. Intelligent radio: An evolutionary approach to general coverage radio receiver control. Master's thesis. De Montfort University, UK.
[21] Arkadiusz Paterek. 2007. Improving regularized singular value decomposition for collaborative filtering. In Proc. KDD Cup Workshop at SIGKDD'07, 13th ACM Int. Conf. on Knowledge Discovery and Data Mining. 39–42.
[22] Martin Piotte and Martin Chabbert. 2009. The Pragmatic Theory Solution to the Netflix Grand Prize. Netflix prize documentation. (2009).
[23] Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian Personalized Ranking from Implicit Feedback. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI '09). AUAI Press, Arlington, Virginia, United States, 452–461. http://dl.acm.org/citation.cfm?id=1795114.1795167
[24] Badrul M. Sarwar, George Karypis, Joseph A. Konstan, and John T. Riedl. 2000. Application of Dimensionality Reduction in Recommender System - A Case Study. In Proceedings of ACM SIGKDD Conference on Knowledge Discovery in Databases. Boston, MA, USA.
[25] Christophe Senot, Dimitre Kostadinov, Makram Bouzid, Jérôme Picault, Armen Aghasaryan, and Cédric Bernier. 2010. Analysis of Strategies for Building Group Profiles. In Proceedings of the 18th International Conference on User Modeling, Adaptation, and Personalization (UMAP '10). Springer-Verlag, Berlin, Heidelberg, 40–51. https://doi.org/10.1007/978-3-642-13470-8_6
[26] Harald Steck. 2011. Item Popularity and Recommendation Accuracy. In Proceedings of the Fifth ACM Conference on Recommender Systems (RecSys '11). ACM, New York, NY, USA, 125–132. https://doi.org/10.1145/2043932.2043957
[27] Robert Tibshirani and Keith Knight. 1996. Model search and inference by bootstrap "bumping". Technical Report, Department of Statistics, University of Toronto. Presented at the Joint Statistical Meetings, Chicago.
[28] Andreas Töscher, Michael Jahrer, and Robert M. Bell. 2009. The BigChaos Solution to the Netflix Grand Prize. Netflix prize documentation. (2009).