Stability of Explainable Recommendation

Explainable recommendation has been gaining attention in industry and academia over the last few years. Explanations provided alongside recommendations serve many purposes: in particular, they convey why a suggestion was made and how well an item aligns with a user's personalized preferences. Explanations can therefore play a major role in influencing users to purchase products. However, the reliability of these explanations under varying scenarios has not been rigorously verified from an empirical perspective. Unreliable explanations can have serious consequences: for example, attackers can leverage them to manipulate users into purchasing target items that the attackers want to promote. In this paper, we study the vulnerability of existing feature-oriented explainable recommenders, analyzing their performance under different levels of external noise added to the model parameters. We conduct experiments on three important state-of-the-art (SOTA) explainable recommenders trained on two widely used e-commerce recommendation datasets of different scales. We observe that all the explainable models are vulnerable to increased noise levels. The experimental results verify our hypothesis that the ability to explain recommendations decreases as noise levels increase, and that adversarial noise in particular causes a much stronger decrease. Our study presents an empirical verification of the topic of robust explanations in recommender systems, and it can be extended to other types of explainable recommenders in RS.


INTRODUCTION
Explainability of Recommender Systems (RS) is an important field which studies methods that learn why a recommendation is suggested by a model for a user [14,15,39]. The explanations improve the transparency of the system by revealing more about the predicted outcome, i.e., how the model learns personalized preferences for every user [37,45]. Moreover, explanations provided within an RS framework can directly appeal to a user and even influence them to purchase an item if it is well explained, with detailed information on why it was recommended [26]. Additionally, explanations can be leveraged for detecting anomalies in certain systems [6], such as in graph neural networks [13]. Thus, explanations provided along with recommendations must be reliable and stable under varying scenarios; however, current explainable recommenders are typically vulnerable to external attacks and hence provide unstable explanations.
While much work has been done on improving the explainability of models, little attention has been paid to the robustness of the explanations a recommender provides under varying circumstances [14,15]. Explainable systems that are prone to attacks provide an easy outlet for attackers with malicious intent to achieve their objectives. For example, in Figure 1 we illustrate this consequence with a cellphone recommendation scenario on an e-commerce website, where an attacker (say, a particular mobile brand's manufacturer) can promote a target phone of that brand (the item in red) by deliberately manipulating the explanations associated with a user's personalized recommendations. This manipulation can be done by adjusting the item's feature scores (battery and/or screen quality), which are used as explanations, to align more closely with the current user's preferences. This can attract the user to interact with the target item and draw their interest away from the more relevant items (the items in green). The manipulated explanation thus misrepresents the original characteristics of the target item, tricking consumers into purchasing it and achieving the attacker's objective.
In this work, we present an empirical study of existing explainable recommender models by testing the widely held claim that similar inputs must yield similar interpretations [1,4,33,44]. Our study verifies the robustness of explainable models strictly within the realm of RS.

PROBLEM STATEMENT

Feature-aware Recommender System
Let $\mathcal{U}$ be the set of users and $\mathcal{V}$ be the set of items of a dataset $\mathcal{D}$. Following the sentiment-analysis-based extraction method of [47], which uses a tool called Sentires, (feature, opinion, sentiment) triples $(f, o, s)$ are extracted from all user-item reviews in $\mathcal{D}$. The extracted features form the aspect set $\mathcal{F}$ from which explanations are derived. The opinions are the adjectives used to qualify the aspect, while the predicted sentiment is binary, expressed as either positive or negative, i.e., $s \in \{-1, +1\}$. Using these triples, we create the user-aspect matrix $X \in \mathbb{R}^{m \times p}$ and, similarly, the item-aspect matrix $Y \in \mathbb{R}^{n \times p}$, where $m = |\mathcal{U}|$, $n = |\mathcal{V}|$, and $p = |\mathcal{F}|$. We adopt the same construction technique as [8,20,35,38,46]:

$$X_{u,f} = \begin{cases} 0, & \text{if user } u \text{ never mentions feature } f \\ 1 + (N-1)\left(\frac{2}{1 + e^{-t_{u,f}}} - 1\right), & \text{otherwise} \end{cases}$$

$$Y_{i,f} = \begin{cases} 0, & \text{if item } i \text{ is never reviewed on feature } f \\ 1 + \frac{N-1}{1 + e^{-t_{i,f} \cdot s_{i,f}}}, & \text{otherwise} \end{cases}$$

where $N$ is the maximum rating scale of the reviews (typically 5), $t_{u,f}$ is the number of mentions of feature $f \in \mathcal{F}$ by user $u \in \mathcal{U}$, $t_{i,f}$ is the number of mentions of item $i \in \mathcal{V}$ using feature $f \in \mathcal{F}$, and $s_{i,f}$ is the average sentiment polarity of all the $(i, f)$ mentions.
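To make this construction concrete, the following is a minimal Python sketch, assuming integer-indexed users, items, and features; the input dictionaries (`mentions_user`, `mentions_item`, `avg_sentiment`) are our own hypothetical names, not from the cited implementations:

```python
import numpy as np

def build_aspect_matrices(mentions_user, mentions_item, avg_sentiment, m, n, p, N=5):
    """Construct the user-aspect matrix X (m x p) and item-aspect matrix Y (n x p)
    from review-mention counts, following the construction sketched above.

    mentions_user[(u, f)] -> t_{u,f}: number of mentions of feature f by user u
    mentions_item[(i, f)] -> t_{i,f}: number of mentions of item i using feature f
    avg_sentiment[(i, f)] -> s_{i,f}: average sentiment polarity of (i, f) mentions
    N is the maximum rating scale of the reviews (typically 5).
    """
    X = np.zeros((m, p))
    Y = np.zeros((n, p))
    for (u, f), t in mentions_user.items():
        # Rescale mention frequency into the rating range [1, N] via a sigmoid.
        X[u, f] = 1 + (N - 1) * (2.0 / (1.0 + np.exp(-t)) - 1.0)
    for (i, f), t in mentions_item.items():
        s = avg_sentiment[(i, f)]
        # Items mentioned often with positive sentiment score close to N.
        Y[i, f] = 1 + (N - 1) / (1.0 + np.exp(-t * s))
    return X, Y
```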
We use these matrices as inputs for learning a black-box recommender with trainable model weights $\Theta$, which predicts the matching score $\hat{r}_{u,i} = f(x_u, y_i \mid \Theta; \mathcal{D})$ for a user $u \in \mathcal{U}$ and item $i \in \mathcal{V}$, where $x_u$ is the user-feature vector corresponding to $u$ in $X$ and $y_i$ is the item-feature vector corresponding to $i$ in $Y$. To obtain the top-$K$ recommendation list for a user $u$, we select the $K$ items with the highest matching scores $\hat{r}_{u,i}$ according to the recommender.
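The following is a minimal sketch of such a black-box scorer and the top-$K$ selection, assuming PyTorch; the class and the hidden size are illustrative placeholders, not the exact architecture of any model studied here:

```python
import torch

class BlackBoxRecommender(torch.nn.Module):
    """A two-hidden-layer scorer f(x_u, y_i | Theta) over feature vectors."""
    def __init__(self, p, hidden=256):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2 * p, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, x_u, y_i):
        # Concatenate the user-feature and item-feature vectors and score the pair.
        return self.net(torch.cat([x_u, y_i], dim=-1)).squeeze(-1)

def top_k(model, x_u, Y, k=5):
    """Score user u against every item and return the indices of the k highest scores."""
    scores = model(x_u.expand(Y.shape[0], -1), Y)
    return torch.topk(scores, k).indices
```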

Explanation under Noises
Let $E$ be the explainability capability of the recommender $f(\Theta; \mathcal{D})$ trained on $\mathcal{D}$ under normal conditions. In this paper, we hypothesize that when we perturb the model parameters by $\Delta$, constrained by the max-norm constraint

$$\|\Delta\| \le \epsilon, \quad (1)$$

the explainability measure changes to $E'$. In this study, we characterize the difference $E - E'$ under different noise levels $\epsilon$. Our hypothesis is that the difference $E - E'$ increases as we increase the noise level $\epsilon$. In addition, we expect adversarial noise (FGSM) to cause a bigger difference than its random-noise counterpart, since FGSM noise is learned directly against the objective optimized by the recommender.
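As an illustration, the max-norm constraint of Eq. 1 can be enforced by a simple projection onto the $\epsilon$-ball; this is a sketch, with the helper name our own:

```python
import torch

def clip_to_max_norm(delta: torch.Tensor, eps: float) -> torch.Tensor:
    """Project a perturbation back into the ball ||delta|| <= eps (Eq. 1)."""
    norm = delta.norm(p=2)
    if norm > eps:
        delta = delta * (eps / norm)  # rescale so the l2 norm equals eps
    return delta
```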

EXPERIMENTS

Datasets & Preprocessing
We chose two datasets of different scales for our experiments:

• Amazon Electronics: an e-commerce dataset containing user-provided reviews of electronics purchased on the Amazon platform.

• Yelp: user reviews of various businesses (restaurants, salons, travel agencies, hotels, etc.) across different locations in the world.
To improve the density of both datasets, we preprocessed them by retaining users with at least 20 reviews for the Yelp dataset and at least 10 reviews for the Amazon dataset. Following previous works [16,35], we created the testing set as follows: for each user, we hold out the last 5 interacted items by time as positives and randomly sample 100 items the user never interacted with as negatives. Table 1 presents the final dataset statistics.
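A minimal sketch of this split procedure, assuming a per-user chronological interaction log; the names `interactions` and `all_items` are hypothetical:

```python
import random

def build_test_set(interactions, all_items, n_pos=5, n_neg=100, seed=0):
    """For each user, hold out the last n_pos chronologically interacted items
    as positives and sample n_neg never-interacted items as negatives.

    interactions: dict user -> list of (timestamp, item) pairs
    all_items:    set of every item id in the dataset
    """
    rng = random.Random(seed)
    test = {}
    for user, events in interactions.items():
        ranked = [item for _, item in sorted(events)]  # chronological order
        positives = ranked[-n_pos:]                    # last 5 interactions
        seen = set(ranked)
        candidates = list(all_items - seen)            # items the user never touched
        negatives = rng.sample(candidates, n_neg)
        test[user] = (positives, negatives)
    return test
```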

White-box Perturbation Attacks
To attack the RS with model-based perturbation methods, we chose two simple attack strategies:

• Random Noise: Gaussian noise drawn from the normal distribution $\mathcal{N}(0, 1)$. We normalize the added noise by its $\ell_2$ norm so that it can easily be scaled to a global noise level $\epsilon$.

• Adversarial Noise: Let $\mathcal{L}(\mathcal{D}; \Theta)$ be the loss function of the recommender (typically combining the utility and the explainability of the recommender). The original model parameters are optimized as $\hat{\Theta} = \arg\min_{\Theta} \mathcal{L}(\mathcal{D}; \Theta)$. The adversarial noise $\Delta$ added as a perturbation to the model is learned after the original model weights $\hat{\Theta}$ are obtained. This noise follows the max-norm constraint of Eq. 1, where $\epsilon$ is the total magnitude of the adversarial perturbation and $\|\cdot\|$ is the $\ell_2$ norm. Since exactly maximizing the loss of a recommender is intractable in general, we follow the optimization technique of [19,36,43] and learn the noise in a manner inspired by the Fast Gradient Sign Method (FGSM) [18]:

$$\Delta = \epsilon \cdot \frac{\Gamma}{\|\Gamma\|}, \quad \text{where } \Gamma = \frac{\partial \mathcal{L}(\mathcal{D}; \hat{\Theta} + \Delta)}{\partial \Delta}. \quad (2)$$
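The two noise types could be generated as in the sketch below, assuming PyTorch; `random_noise` and `fgsm_noise` are our own helper names, and the FGSM variant assumes the loss was computed with `theta` participating in the autograd graph:

```python
import torch

def random_noise(theta: torch.Tensor, eps: float) -> torch.Tensor:
    """Gaussian noise drawn from N(0, 1), l2-normalized and scaled to level eps."""
    noise = torch.randn_like(theta)
    return eps * noise / noise.norm(p=2)

def fgsm_noise(loss: torch.Tensor, theta: torch.Tensor, eps: float) -> torch.Tensor:
    """FGSM-style adversarial noise per Eq. 2: step along the loss gradient with
    respect to the trained parameters, rescaled to the max-norm budget eps."""
    grad, = torch.autograd.grad(loss, theta)
    return eps * grad / grad.norm(p=2)

# Usage sketch: perturb trained weights in place before evaluating explanations.
# theta.data += fgsm_noise(loss, theta, eps)
```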

Models & Training Setting
To verify our hypothesis that existing explainable models are vulnerable to external attacks, we pick SOTA feature-based explainable recommenders chosen based on the presence and involvement of explicit and/or hidden factors in both the recommendation and explanation procedures. The models are described as follows (a sketch of the perturbation step appears after this list):

• CER: Counterfactual Explainable Recommendation [35]. The top recommendations are learned by a black-box neural network with two hidden layers, taking $X$ and $Y$ as inputs. The explanation for a top-ranked item of a user is then the minimal counterfactual change to the item's feature space that would remove the item from the user's top-$K$ list. We chose to perturb all the hidden layers of the recommender network, since these are responsible for learning from the features to predict recommendations. We also provide personalized explanations per user, which explain only from the features the user mentioned across all their reviews.
• A2CF: Aspect-Aware Collaborative Filtering [8]. This model predicts the missing user-feature and item-feature values in $X$ and $Y$ with a residual neural network that learns user, item, and feature embeddings. We remove the item-item similarity learning so that the model can explain all user-item pairs. Since we mainly wanted to study the impact of the direct explicit factors responsible for both recommendation and explanation in this model, we chose to perturb all three embeddings: user, item, and aspect. In addition, we perturbed the projection weight used for predicting the matching scores (via Bayesian Personalized Ranking) of user-item pairs.
• EFM: Explicit Factor Modeling [46]. This method is based on matrix factorization, decomposing three matrices (the user-item interaction matrix, the user-feature matrix $X$, and the item-feature matrix $Y$) into lower-rank matrices learned with both explicit and hidden factors. To ensure an even-handed influence of explicit and hidden factors, we set the same hidden dimension for both. For this model, we perturbed only the explicit factor matrices $U_1$, $U_2$, and $V$, since they are the only factors responsible for both recommendation and explanation. Additionally, since this method has a closed-form solution for finding the optimal parameters (unlike the other models considered, which employ gradient descent), we perturbed it only with the random-noise method.
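Putting the three models together, the perturbation step can be summarized as in the following sketch, reusing the `random_noise` helper above; the per-model attribute names (`hidden_layers`, `user_emb`, `U1`, etc.) are purely illustrative placeholders, not the authors' actual code:

```python
import torch

# Hypothetical mapping from each recommender to the parameter tensors we perturb.
PERTURB_TARGETS = {
    "CER":  lambda m: [layer.weight for layer in m.hidden_layers],   # all hidden layers
    "A2CF": lambda m: [m.user_emb.weight, m.item_emb.weight,
                       m.aspect_emb.weight, m.proj.weight],          # embeddings + projection
    "EFM":  lambda m: [m.U1, m.U2, m.V],                             # explicit factor matrices
}

def attack(model, name, noise_fn, eps):
    """Add noise of level eps to each targeted tensor in place (random-noise case;
    an FGSM variant would additionally thread the model's loss through noise_fn)."""
    with torch.no_grad():
        for theta in PERTURB_TARGETS[name](model):
            theta += noise_fn(theta, eps)
```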
Training setting: In all cases, the models are trained until convergence with a batch size of 32. We chose the best hyperparameters for each model by grid search. We set the learning rate to 0.001 and used Stochastic Gradient Descent to optimize all the gradient-descent-based models. The FGSM-based attack models were trained under the same conditions as the vanilla models. We first produce the top-$K$ = 5 recommendation list for each user and then explain it using the top 5 features. We chose $\epsilon$ from [0, 1] for Yelp and [0, 2] for Electronics.

Evaluation
To evaluate recommendation quality, we chose the most common metric, Normalized Discounted Cumulative Gain (NDCG). To gauge feature-level explanations, we use the feature-level Precision, Recall, and F1 (the harmonic mean of Precision and Recall) of the explanations, comparing them against the ground-truth features that are mentioned with positive sentiment in the review of the user-item interaction, as suggested in [34,35,42,48]. We evaluate all explained samples for which a review actually exists (positive interactions by a user) and report the average metric scores across all such samples.
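For a single explained user-item pair, these feature-level metrics reduce to a set comparison, as in this minimal sketch (helper name ours):

```python
def feature_prf(explained: list, ground_truth: set):
    """Feature-level Precision/Recall/F1 for one explained user-item pair.

    explained:    top features output as the explanation
    ground_truth: features mentioned with positive sentiment in the actual review
    """
    hits = len(set(explained) & ground_truth)
    precision = hits / len(explained) if explained else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1
```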

RESULTS
The results clearly show that the explanation performance drops heavily with increasing noise for almost all the models, verifying our hypothesis that feature-aware explainable recommenders are vulnerable and hence that explanation methods within RS are prone to attacks. We can also conclude that adversarial attacks are far more potent than random attacks (see Figure 2). This is because the FGSM-based adversarial attacks learn model perturbations by optimizing against the original objective of the recommender (moving in the direction opposite to the training gradients), which inflicts a stronger behavioral change on the model and exposes more of its vulnerabilities.

Lack of Generalization in the Explanations
Based on the observed results, we infer that the explanations are neither generalized nor robust across changing scenarios for all the recommenders. This is because the recommenders do not generalize well in capturing the exact features that correspond to a user's preferences for a particular recommendation. As noise is added to the model, the predicted outcome becomes increasingly incorrect compared to the original vanilla model, leading to wrong identification of the features that contribute most to explaining the outcome. This implies that the explanations provided by all the models are, in general, unreliable, indicating a vulnerability within the models. The recommendation quality drops as the noise increases, causing incorrect features to be used for explaining the recommendations and hence an eventual drop in the global explainability of the system. We observe this vulnerability in both datasets (see Tables 2 and 3), implying that the attacked models lose, on average, the capability to provide stable explanations for any suggested item, and this trend (Figure 2) only deteriorates with increasing $\epsilon$.

Model architecture: Explicit vs hidden factors?
The main reasons we suspect for this vulnerability are the respective impacts of the hidden factors and the explicit factors involved in the recommendation and explanation tasks of the recommenders. We deduce that the impact of the hidden and explicit factors depends on the model architecture and the scale of the dataset.

CONCLUSIONS & FUTURE WORK
In this study, we explored the new problem of the vulnerability of existing feature-aware explainable recommenders, conducting extensive experiments on different feature-aware explainable RS under varying noise levels by perturbing the models with two kinds of noise: random and adversarial.
From our experiments, we conclude that all the models are vulnerable to external attacks, which implies worse identification of the reasons for the predicted outcome and thus decreased explainability. We also attributed the observed behavior to the presence of explicit and hidden factors, and we discussed how these factors play a larger role as the size of the dataset increases.
While we present this new empirical study of various explainable methods in RS, we note that there is much scope for future work. The most important extension would be developing new explainer methods in RS that generate explanations which are robust in general and can be relied upon by consumers. We also highlight that other types of stability, such as generalized explainability across different domains, could be analyzed beyond the security-based stability studied here. Robust explanations are a necessity in RS, since unstable explanations have serious consequences for both consumers and developers. Finally, we strongly emphasize that the field of robust explanations deserves a much broader study within explainable AI, and particularly in RS.

Table 2:
Vulnerability of Explainable Recommenders on Electronics (NDCG@100, Pr@5,5, Re@5,5, and F1@5,5 under Random and FGSM noise).