Private Matrix Factorization with Public Item Features

We consider the problem of training private recommendation models with access to public item features. Training with Differential Privacy (DP) offers strong privacy guarantees, at the expense of loss in recommendation quality. We show that incorporating public item features during training can help mitigate this loss in quality. We propose a general approach based on collective matrix factorization (CMF), that works by simultaneously factorizing two matrices: the user feedback matrix (representing sensitive data) and an item feature matrix that encodes publicly available (non-sensitive) item information. The method is conceptually simple, easy to tune, and highly scalable. It can be applied to different types of public item data, including: (1) categorical item features; (2) item-item similarities learned from public sources; and (3) publicly available user feedback. Furthermore, these data modalities can be collectively utilized to fully leverage public data. Evaluating our method on a standard DP recommendation benchmark, we find that using public item features significantly narrows the quality gap between private models and their non-private counterparts. As privacy constraints become more stringent, models rely more heavily on public side features for recommendation. This results in a smooth transition from collaborative filtering to item-based contextual recommendations.


Introduction
Recommender systems trained on private user feedback present the risk of leaking sensitive information about users' activity or preferences (Zhang et al., 2021; Calandrino et al., 2011), and thus, providing formal privacy protections is increasingly important. Differential privacy (DP) (Dwork et al., 2014a) has emerged as the de facto standard for formalizing and quantifying privacy protections. These DP guarantees often come at the expense of some degradation in model quality, as DP training involves adding noise to quantities derived from user data (for example, adding noise to the gradients (Abadi et al., 2016)). Recent progress in private recommendation algorithms (Jain et al., 2018; Chien et al., 2021; Krichene et al., 2023) has significantly improved the privacy/utility trade-offs, but there still remains a large quality gap between private models and their non-private counterparts.
It was recently shown (Krichene et al., 2023; Chien et al., 2021) that these quality losses are due to degradation in item representations as a result of the noise added to ensure DP (particularly for tail items, which have fewer ratings and are more impacted by the noise). Making item embeddings robust to the added noise may be the key to narrowing the quality gap between private and non-private models. One promising direction is to utilize public item features to improve item representations while maintaining strict user-privacy guarantees.
In this work, we investigate methods to utilize such item features to improve the quality of privacy-preserving recommenders. We take inspiration from the literature on Collective Matrix Factorization (CMF) (Singh and Gordon, 2008), which learns shared embeddings from collections of related matrices, rather than a single matrix. Throughout the paper, we will distinguish between private user feedback, which is sensitive and needs to be protected, and public item features, which represent non-sensitive, publicly available information that does not need privacy protection.

Contributions
• Formulation: We model both public item features and sensitive user-item feedback as matrices. Two low-rank factorizations are learned simultaneously: one approximates the user feedback matrix, and the other approximates the item feature matrix. Importantly, the item representation is shared between the two factorizations, which enables item embeddings to benefit from public features. This setup is versatile, as it can encode various modalities of public information. For instance, features can represent public item metadata. The setup can also encode pairwise item similarity derived from public data, where the 'features' correspond to items and represent similarity scores. Finally, we can encode user feedback, for instance from users who choose to make their ratings or reviews publicly available; here, the 'features' are users and represent the affinity between a user and an item.
• Method: To provide DP guarantees, we propose Differentially Private Collective Matrix Factorization (DP-CMF), which extends the recently proposed DPALS algorithm (Chien et al., 2021) to the CMF formulation. DP-CMF works by adding noise to the sufficient statistics derived from sensitive data, while using exact statistics derived from public data.
• Evaluation: We evaluate DP-CMF on the same private recommendation benchmark used in (Jain et al., 2018; Chien et al., 2021; Krichene et al., 2023). We find that incorporating public item features significantly narrows the quality gap between private and non-private models, particularly so when privacy requirements are high. This study offers a promising direction for improving privacy-utility trade-offs in recommender systems by leveraging public data sources while preserving user privacy.

Related Work
Differential privacy in recommender systems. The importance of privacy in recommender systems has been recognized for a long time (Narayanan and Shmatikov, 2008), and some early attempts were made (McSherry and Mironov, 2009; Kapralov and Talwar, 2013) to develop differentially private algorithms that offer strong protection, but this usually required significant losses in model quality. Recent work (Jain et al., 2018; Chien et al., 2021; Krichene et al., 2023) developed new algorithms that narrowed this quality gap, by using alternating minimization (Chien et al., 2021; Jain et al., 2021) and by developing methods to adaptively allocate privacy budgets (Krichene et al., 2023). Our proposed algorithm builds on these recent improvements by extending the DPALS technique (Chien et al., 2021) to incorporate public item data. While utilizing public data to improve DP models has been explored in other domains (as described below), our work is the first to carry out a systematic study for private recommenders.
Using side features in recommenders. User and item side information are commonly employed to address the "cold-start" problem for users and items with limited or no interaction data (Gantner et al., 2010; Saveski and Mantrach, 2014; Kula, 2015; Deldjoo et al., 2019; Cortes, 2018). Furthermore, side information can tackle fairness concerns and mitigate popularity bias in recommendations (Shi et al., 2014). Side features can be integrated into MF models through Collective Matrix Factorization (CMF) (Singh and Gordon, 2008; Shi et al., 2014; Dong et al., 2017; Liang et al., 2016; Jenatton et al., 2012), also known as Joint Matrix Factorization (Zhu et al., 2007), which originated in the Statistical Relational Learning literature (Getoor and Taskar, 2007). Our work leverages the CMF approach and extends it to private recommendations. While in recommender systems both user and item side information can be useful, in the privacy context it is more natural to consider only item side information, as it generally represents non-sensitive data, while user side information (such as demographic features) is sensitive and would require privacy protection. Our paper will hence focus on item features.
Using public data to improve private models. Leveraging public information to enhance privacy/utility trade-offs has been explored in various contexts. Existing approaches fall into two broad categories. The first is public pre-training followed by private fine-tuning. Empirically, this approach is effective in domains with abundant public data, such as natural language processing (Li et al., 2021; Yu et al., 2021a; Behnia et al., 2022) and vision (Golatkar et al., 2022; Xu et al., 2022). The second is to directly incorporate public data into the private learning process. These techniques are based either on projecting private gradients onto a low-dimensional subspace estimated from public gradients (Kairouz et al., 2021; Yu et al., 2021b; Zhou et al., 2021), or on utilizing public data to modify the objective function (Bassily et al., 2020; Amid et al., 2022; Li et al., 2022). For an extensive review, see (Cummings et al., 2023). These approaches often make the restrictive assumption that public and private data come from the same distribution (Kairouz et al., 2021; Amid et al., 2022; Wang and Zhou, 2020; Zhou et al., 2021) (so that public and private gradients lie on the same subspace). Our approach can work even if the public data comes from a different distribution: access to item metadata can be informative about item similarity, even if this data is of an entirely different nature than user feedback. Another notable difference is that existing work focuses on gradient-based methods, while ours is, to the best of our knowledge, the first to explore the benefits of public data on second-order methods (Alternating Least Squares).

Preliminaries 2.1 Setup & Notation
Throughout, M ∈ R^{m×n} denotes the user-item feedback matrix, and S ∈ R^{s×n} the item-feature matrix, where m, n, s are the number of users, items, and features, respectively. We denote by Ω a subset of [m] × [n] representing the indices of the observed entries in M. We define Ω_{i:} := {j ∈ [n] : (i, j) ∈ Ω}, the set of items rated by user i, and Ω_{:j} := {i ∈ [m] : (i, j) ∈ Ω}, the set of users that rated item j. Further, we denote by Ω′ a subset of [s] × [n] representing the observed entries of the item-feature matrix, with Ω′_{k:} and Ω′_{:j} defined analogously. For instance, (k, j) ∈ Ω′ if item j has corresponding public feature token k (e.g., j ≡ Titanic, k ≡ director: James Cameron). The goal of CMF is to learn two low-rank factorizations: M_Ω ≈ U V^⊤, approximating the user feedback matrix, and S_{Ω′} ≈ F V^⊤, approximating the item feature matrix, where U ∈ R^{m×d}, V ∈ R^{n×d} and F ∈ R^{s×d} are d-dimensional embeddings corresponding to users, items and features, respectively. The notation M_Ω means that approximate equality is desired only with respect to the entries M_ij for (i, j) ∈ Ω.
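To make the notation concrete, the following sketch (toy sizes, random values, and index patterns are all illustrative) builds M_Ω and S_{Ω′} as sparse matrices and derives the index sets Ω_{i:} and Ω_{:j}:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy instance of the notation: m users, n items, s feature tokens, rank d.
m, n, s, d = 4, 5, 3, 2
rng = np.random.default_rng(0)

# Observed ratings M_Omega as a sparse matrix; its nonzero pattern is Omega.
rows = np.array([0, 0, 1, 2, 3])
cols = np.array([1, 4, 0, 2, 4])
M = csr_matrix((rng.uniform(1, 5, size=5), (rows, cols)), shape=(m, n))

# Omega_{i:} = items rated by user i ; Omega_{:j} = users who rated item j.
Omega_i = [M.indices[M.indptr[i]:M.indptr[i + 1]] for i in range(m)]
Omega_j = [M.getcol(j).nonzero()[0] for j in range(n)]

# Public binary feature-item matrix S_Omega' (feature tokens x items).
S = csr_matrix((np.ones(4), ([0, 0, 1, 2], [1, 2, 2, 3])), shape=(s, n))

# Embeddings: U (users), V (items), F (features), one d-dim row each.
U, V, F = rng.normal(size=(m, d)), rng.normal(size=(n, d)), rng.normal(size=(s, d))
```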
For a vector v ∈ R^d, ∥·∥ denotes the usual Euclidean ℓ2 norm. For two vectors u, v ∈ R^d, ⟨u, v⟩ and u ⊗ v denote the inner and the outer product, respectively. By Π_PSD(·), we denote the projection onto the set of positive semidefinite matrices. By ∥·∥_frob, we denote the Frobenius norm of a matrix. For a matrix U, u_i denotes its i-th row. Finally, we use N_d to denote the standard multivariate normal distribution and N_{d×d} to denote the distribution of symmetric d × d matrices whose upper triangular entries are i.i.d. standard normal.

Privacy considerations
Following (Jain et al., 2018; Chien et al., 2021; Jain et al., 2021), we adopt the notion of user-level DP (Dwork et al., 2014a), where the goal is to protect all of the ratings of a user. Intuitively, the user-level DP guarantee limits the impact that any user can have on the algorithm's output. More formally, let D = {d_1, d_2, . . . , d_n} be a set of inputs corresponding to the n users, and let A : D^n → Y be a randomized algorithm that produces an output y ∈ Y. In our case, d_i are the ratings associated with user i, and y is the set of all item embeddings V and feature embeddings F. Denote by D_{−i} the inputs for all users except i. Two sets of inputs D, D′ are said to be adjacent if they differ in at most one user, i.e. D_{−i} = D′_{−i} for some i (Kearns et al., 2014). An algorithm A satisfies user-level (ε, δ)-DP if for all adjacent data sets D and D′, and any measurable set of outputs E ⊂ Y, the following holds:

Pr[A(D) ∈ E] ≤ e^ε Pr[A(D′) ∈ E] + δ.

Intuitively, (ε, δ) are privacy parameters that control the "indistinguishability" between the outputs of the algorithm when it processes two datasets that differ in a single user's data. The smaller the values of ε and δ, the stronger the privacy guarantee provided by the algorithm. The parameter δ is typically taken to be ≤ 1/n (where n is the number of users). The values of ε depend on the domain; studies typically report values ranging from ε = 0.1 (high privacy regime) to ε = 10.
Remark 2.2 (User-level vs. rating-level Differential Privacy). Some prior techniques (Dwork et al., 2014b; Kapralov and Talwar, 2013) provide rating-level DP guarantees, meaning that neighboring datasets are allowed to differ in at most a single rating. In other words, rating-level DP limits the risk of leakage from each individual rating, but this offers a much weaker protection at the user level (since users typically have many ratings, and the leakage risk compounds with the number of ratings). In contrast, user-level DP (Kearns et al., 2014; Jain et al., 2018) ensures that a user's full set of ratings is protected. This makes user-level DP more challenging to achieve, but also more practically significant and relevant in terms of protecting a user's data.

Differentially Private Collective Matrix Factorization
We now introduce the DP-CMF algorithm for private recommendations with public item features. We first recall the Alternating Least Squares (ALS) algorithm for (non-private) CMF, then introduce the necessary modifications to satisfy user-level DP.

ALS for (non-private) CMF
CMF jointly optimizes the following weighted loss function to find low-rank approximations of M and S:

min_{U,V,F}  ∑_{(i,j)∈Ω} W_ij (M_ij − ⟨u_i, v_j⟩)² + α ∑_{(k,j)∈Ω′} (S_kj − ⟨f_k, v_j⟩)² + λ(∥U∥²_frob + ∥V∥²_frob) + λ′ ∥F∥²_frob ,    (1)

where W_ij is the weight associated with the contribution of user i's rating of item j to the loss function, λ is the regularization weight for user and item embeddings, and λ′ is the regularization weight for feature embeddings. Finally, α is a hyper-parameter that controls the relative importance of fitting the public versus the private data. A small α means that item embeddings V will primarily depend on user-item feedback, whereas a large α means that item embeddings will depend more on the item-feature matrix. Although the loss is not jointly convex, for fixed item embeddings V it is a convex quadratic in (U, F), and vice-versa. ALS takes advantage of this fact, and alternates between updating (U, F) and updating V, as follows, ∀i ∈ [m], ∀k ∈ [s] and ∀j ∈ [n], respectively:

u_i^t = ( ∑_{j∈Ω_{i:}} W_ij v_j^{t−1} ⊗ v_j^{t−1} + λI )^{−1} ∑_{j∈Ω_{i:}} W_ij M_ij v_j^{t−1} ;    (2)

f_k^t = ( α ∑_{j∈Ω′_{k:}} v_j^{t−1} ⊗ v_j^{t−1} + λ′I )^{−1} α ∑_{j∈Ω′_{k:}} S_kj v_j^{t−1} ;    (3)

v_j^t = ( ∑_{i∈Ω_{:j}} W_ij u_i^t ⊗ u_i^t + α ∑_{k∈Ω′_{:j}} f_k^t ⊗ f_k^t + λI )^{−1} ( ∑_{i∈Ω_{:j}} W_ij M_ij u_i^t + α ∑_{k∈Ω′_{:j}} S_kj f_k^t ) .    (4)

The ALS updates for user and feature embeddings (Eqs. (2) and (3)) are decoupled and can happen simultaneously. In essence, item features (e.g. genre:comedy) can be treated as "fictitious users".
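The three ALS updates can be sketched in NumPy as follows (dense toy matrices, with the nonzero patterns of W and S standing in for Ω and Ω′; hyper-parameter names are illustrative):

```python
import numpy as np

def cmf_als_step(M, W, S, U, V, F, alpha, lam, lam_p):
    """One (non-private) ALS sweep for CMF, sketching the updates derived
    from the joint loss. M, W: dense m x n rating and weight matrices
    (W is 0 where unobserved); S: dense s x n feature matrix."""
    m, d = U.shape
    s, n = F.shape[0], V.shape[0]
    # User update: regularized least squares over the items each user rated.
    for i in range(m):
        js = np.nonzero(W[i])[0]
        A = (W[i, js, None] * V[js]).T @ V[js] + lam * np.eye(d)
        b = V[js].T @ (W[i, js] * M[i, js])
        U[i] = np.linalg.solve(A, b)
    # Feature update: decoupled from users ("fictitious users").
    for k in range(s):
        js = np.nonzero(S[k])[0]
        A = alpha * V[js].T @ V[js] + lam_p * np.eye(d)
        b = alpha * V[js].T @ S[k, js]
        F[k] = np.linalg.solve(A, b)
    # Item update: fits both user feedback and public features.
    for j in range(n):
        is_ = np.nonzero(W[:, j])[0]
        ks = np.nonzero(S[:, j])[0]
        A = ((W[is_, j, None] * U[is_]).T @ U[is_]
             + alpha * F[ks].T @ F[ks] + lam * np.eye(d))
        b = U[is_].T @ (W[is_, j] * M[is_, j]) + alpha * F[ks].T @ S[ks, j]
        V[j] = np.linalg.solve(A, b)
    return U, V, F
```

Each block update exactly minimizes the joint objective in the corresponding variable, so the loss is non-increasing across sweeps.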
Remark 3.1 (Implicit feedback and binary features). When the user feedback is implicit (e.g. clicks, views), or when the public item features are categorical, we use the implicit ALS formulation (Hu et al., 2008), which penalizes non-zero predictions outside of the observation sets Ω and Ω′ by adding terms proportional to ∥U V^⊤∥²_frob and ∥F V^⊤∥²_frob to the optimization objective in Eq. (1). This results in changes to the update equations that are standard in the literature.
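A minimal sketch of the resulting implicit-feedback user update, assuming binary observations and an illustrative uniform weight `w0` on unobserved entries (the exact weighting scheme follows Hu et al., 2008):

```python
import numpy as np

def implicit_user_update(V, js, lam, w0=0.1):
    """Sketch of an implicit-feedback user update: the penalty on
    predictions outside Omega adds a global Gramian V^T V to the normal
    equations, so unobserved entries need not be enumerated explicitly.
    V: n x d item embeddings; js: indices of this user's observed items;
    w0 is an illustrative weight on the zero targets."""
    d = V.shape[1]
    Vi = V[js]                                  # observed (positive) items
    A = w0 * (V.T @ V) + Vi.T @ Vi + lam * np.eye(d)
    b = Vi.sum(axis=0)                          # binary targets: sum of item vectors
    return np.linalg.solve(A, b)
```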

Differentially Private CMF
To ensure user-level DP, we introduce DP-CMF (see Algorithm 1), which extends the DPALS procedure (Chien et al., 2021) to CMF with public features. DP-CMF computes and releases the item and feature embeddings (V^t, F^t) with DP protection on a trusted centralized platform (server-side), while each user i independently updates their embedding u_i^t on their own device (client-side).

Algorithm 1 Differentially Private CMF with Public Item Features

Input: User-item matrix M; feature-item matrix S; feature weight α; weight matrix W; initial item embeddings V^0; number of steps T; clipping parameters Γ_M, Γ_U; regularization parameters λ, λ′.
for t = 1 to T do
  1. Broadcast V^{t−1} to all users.
  2. Each user i ∈ [m] updates (client-side): u_i^t as in Eq. (2) with W_ij = 1.
  3. Update feature embeddings (server-side): f_k^t as in Eq. (3), for all k ∈ [s].
  4. Update item embeddings (server-side), for all j ∈ [n]:
  5.   For each user i ∈ Ω_{:j}:
  6.     Clip ratings: M̂_ij = clip(M_ij, Γ_M).
  7.     Clip user embeddings: û_i^t = u_i^t · min(1, Γ_U/∥u_i^t∥).
  8.   Compute noisy statistics:
  9.     Â_j = Π_PSD( ∑_{i∈Ω_{:j}} W_ij û_i^t ⊗ û_i^t + Gaussian noise ~ N_{d×d} ), b̂_j = ∑_{i∈Ω_{:j}} W_ij M̂_ij û_i^t + Gaussian noise ~ N_d.
  10.    Compute exact public statistics: A′_j = ∑_{k∈Ω′_{:j}} f_k^t ⊗ f_k^t, b′_j = ∑_{k∈Ω′_{:j}} S_kj f_k^t.
  11.    Set v_j^t = (Â_j + α A′_j + λI)^{−1} (b̂_j + α b′_j).

As a result, the user embedding update (step 2 of Algorithm 1) is identical to the non-private update in Eq. (2), with the additional assumption that the update is unweighted (i.e. W_ij = 1). Furthermore, the feature embedding update (step 3) only depends on S (public data) and V^{t−1} (which is DP-protected); hence, by the DP post-processing property, it requires no additional noise and can be computed as in Eq. (3).
On the other hand, the item embedding update (step 4) depends on the private data M and on U^t, and must be modified to guarantee DP. This requires two modifications. The first is to limit the impact of each user on the item embeddings; this is done by clipping the magnitude of individual ratings (step 6), clipping the user embedding norm (step 7), and weighting the ratings of each user with appropriately chosen weights W (step 9). The second is to add noise to the sufficient statistics (steps 8-11) via the Gaussian mechanism (Vu and Slavkovic, 2009; Foulds et al., 2016; Wang, 2018). Note that the statistics Â_j, b̂_j (step 9) depend on sensitive data and are protected via noise, while the statistics A′_j, b′_j (step 10) depend only on public data and are computed exactly.
Step 11 intuitively highlights the potential benefit of using item features: the item embedding is the solution of a linear system Ax = b with A = Â_j + αA′_j and b = b̂_j + αb′_j, where Â_j, b̂_j are noisy quantities derived from user feedback, while A′_j, b′_j are derived from public features and are exact. A larger α makes the solution more robust to the noise, but favors fitting the item features. When the item features are informative (e.g., they accurately capture item-item similarity), this can improve the item representation compared to only using noisy user feedback (α = 0).
Remark 3.3 (Computational cost of DP-CMF). One step of DPALS (Chien et al., 2021) consists of forming the sufficient statistics (a cost of O(|Ω|d²)) then solving m + n linear systems (a cost of O((m + n)d³)). In DP-CMF (Algorithm 1), the sufficient statistics computation cost increases to O((|Ω| + |Ω′|)d²), and the linear system cost increases to O((m + n + s)d³). Hence, the added cost of using public features remains reasonable if |Ω′| is comparable in size to |Ω|, and the total number of features s is smaller than or comparable to m + n.
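The following sketch illustrates the private item update for a single item j (the noise scaling shown is illustrative; the exact calibration depends on the privacy accounting, and all parameter names are assumptions of this sketch):

```python
import numpy as np

def dp_item_update(U, ratings, user_ids, W_col, Fj, Sj, alpha, lam,
                   gamma_m, gamma_u, sigma, rng):
    """Sketch of the private item-embedding update for one item j.
    gamma_m / gamma_u: rating / embedding clipping norms; sigma: noise
    multiplier; W_col: weights W_ij for the users in user_ids; Fj, Sj:
    feature embeddings and feature values touching item j."""
    d = U.shape[1]
    # Clip ratings and user embedding norms to bound user sensitivity.
    r = np.clip(ratings, -gamma_m, gamma_m)
    norms = np.linalg.norm(U[user_ids], axis=1, keepdims=True)
    Uc = U[user_ids] * np.minimum(1.0, gamma_u / np.maximum(norms, 1e-12))
    # Noisy sufficient statistics from sensitive data (Gaussian mechanism):
    # symmetric matrix noise on A_hat, vector noise on b_hat.
    G = rng.normal(size=(d, d)); G = np.triu(G) + np.triu(G, 1).T
    A_hat = (W_col[:, None] * Uc).T @ Uc + sigma * gamma_u**2 * G
    b_hat = Uc.T @ (W_col * r) + sigma * gamma_m * gamma_u * rng.normal(size=d)
    # Project A_hat onto the PSD cone (Pi_PSD in the paper's notation).
    w, Q = np.linalg.eigh((A_hat + A_hat.T) / 2)
    A_hat = (Q * np.maximum(w, 0)) @ Q.T
    # Exact statistics from public features -- no noise needed.
    A_pub, b_pub = Fj.T @ Fj, Fj.T @ Sj
    # Solve the combined regularized linear system.
    return np.linalg.solve(A_hat + alpha * A_pub + lam * np.eye(d),
                           b_hat + alpha * b_pub)
```

With sigma = 0 and inactive clipping, this reduces exactly to the non-private regularized item update, which makes the role of the noise terms easy to isolate.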
Remark 3.4 (Threat model). Observe that in this model, the recommendation platform broadcasts the item embeddings V and the feature embeddings F. The user embeddings U are never published. Rather, each user i can compute her own embedding u_i (by solving a least-squares problem involving her own ratings along with the published item embeddings V, see Eq. (1)), then use it to generate recommendations by computing scores u_i^⊤ V. This captures a very strong notion of privacy, as it protects user i even against potential collusion of the remaining n − 1 users (i.e. an adversary with access to V, F and D_{−i}), while allowing the user to take full advantage of her data to generate recommendations. Importantly, the platform hosting the recommendation system is a trusted entity (it has access to the raw user ratings and user embeddings when computing the noisy sufficient statistics). The goal is to protect against privacy attacks from malicious users or external agents, not from the recommender system itself. However, if the recommender itself is considered untrusted, these algorithms (DPALS and DP-CMF) can potentially be implemented using secure aggregation (Bonawitz et al., 2017), although this comes at an increased computational cost.
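The client-side step of this threat model can be sketched as follows (the user solves her own least-squares problem locally from her raw ratings and the published item embeddings, then scores all items with u_i^⊤ V; function and parameter names are illustrative):

```python
import numpy as np

def client_side_recommend(my_item_ids, my_ratings, V, lam, top_k=10):
    """Sketch of on-device recommendation: nothing private leaves the
    device. V: published (DP-protected) item embeddings, n x d."""
    d = V.shape[1]
    Vi = V[my_item_ids]
    # Solve the user's own ridge-regularized least-squares problem.
    u = np.linalg.solve(Vi.T @ Vi + lam * np.eye(d), Vi.T @ my_ratings)
    scores = V @ u                      # u_i^T V for every item
    scores[my_item_ids] = -np.inf       # do not re-recommend rated items
    return np.argsort(-scores)[:top_k]
```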
Proposition 3.1 (Privacy Guarantee). For all ε > 0, δ ∈ (0, 1), if the inputs to Algorithm 1 satisfy the weight condition described in Remark 3.5, then the algorithm is (ε, δ) user-level DP.
Remark 3.5. The weights W are used to control a user's impact on the model. The simplest way to generate weights that satisfy the condition of the proposition is to assign a uniform weight to each user. Specifically, given a desired privacy level (ε, δ), let β = ε² / (4T(log(1/δ) + ε)); then setting W_ij = β/|Ω_{i:}| satisfies the inequality (notice that a user with more ratings, i.e. a larger |Ω_{i:}|, will have lower weights, to limit the solution's sensitivity to that user's data). A more sophisticated method, which adapts to item frequencies by putting more weight on infrequent items, was developed in (Krichene et al., 2023); we use the latter in our experiments.
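The uniform weighting rule of Remark 3.5 can be sketched as:

```python
import numpy as np

def uniform_dp_weights(num_ratings_per_user, epsilon, delta, T):
    """Sketch of the uniform weighting rule: beta = eps^2 / (4T(log(1/delta)
    + eps)), and each of user i's ratings gets weight beta / |Omega_i:|,
    so heavier raters contribute less per rating."""
    beta = epsilon**2 / (4 * T * (np.log(1.0 / delta) + epsilon))
    return beta / np.asarray(num_ratings_per_user, dtype=float)
```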
Proof. First, we argue that it suffices to prove the result for α = 0. Indeed, by step 10, the statistics A′_j, b′_j only depend on the public feature matrix S and on the feature embeddings F^t, which in turn only depend on S and V^{t−1} (by step 3). Since V^{t−1} is released with DP protection, there is no additional privacy cost for computing A′_j, b′_j (by the post-processing property of DP (Dwork et al., 2014a, Proposition 2.1)). Therefore the privacy guarantees of the algorithm with α = 0 and α > 0 are identical. When α = 0 (no features), the algorithm becomes identical to DPALS, and the guarantee is proved in (Krichene et al., 2023, Theorem 3.3).

Empirical Evaluation
We evaluate DP-CMF on a standard DP recommendation benchmark used in (Jain et al., 2018; Chien et al., 2021; Krichene et al., 2023), based on the MovieLens datasets (Harper and Konstan, 2015). The benchmark considers a rating prediction task on the MovieLens 10M (ML10M) dataset, which records over 10 million ratings ranging from 1 to 5 for m = 69878 users and n = 10677 movies. For the feature-item matrix we consider three sources of public data.
Item metadata. We construct a categorical feature dataset by cross-referencing movie IMDb identifiers with data available on Wikidata.org. For each movie in the ML10M dataset, we collect genre, topic, and cast information. We construct a binary feature matrix S, where each row corresponds to a feature token (e.g., the first row is labeled as director:James Cameron, and non-zero entries in this row correspond to movies directed by James Cameron). The metadata dataset comprises s = 12637 feature tokens with an overall feature density of 0.13%.
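Assembling such a binary feature-item matrix from metadata can be sketched as follows (the token format and ids are illustrative):

```python
from scipy.sparse import csr_matrix

def build_feature_matrix(item_features, n_items):
    """Sketch of building the binary feature-item matrix S. item_features
    maps item id -> list of feature tokens such as 'director:James Cameron'
    (tokens and ids here are illustrative). Returns S (s x n_items) and the
    token -> row-index vocabulary."""
    vocab, rows, cols = {}, [], []
    for item, tokens in item_features.items():
        for tok in tokens:
            k = vocab.setdefault(tok, len(vocab))  # assign row per token
            rows.append(k); cols.append(item)
    data = [1.0] * len(rows)
    return csr_matrix((data, (rows, cols)), shape=(len(vocab), n_items)), vocab
```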
Item-to-item similarity scores. We create an item-to-item similarity dataset from a non-private recommendation model trained on a variant of the ML20M dataset, as proposed in (Liang et al., 2018). This dataset is commonly used for benchmarking recommender performance on implicit feedback, as the training data is a binary matrix corresponding to ratings ≥ 4. We first train a Matrix Factorization model on the dataset and use the resulting item embeddings to identify, for each movie, the k most similar movies based on similarity scores (we experiment with inner product and cosine similarity). Each row in the feature matrix S corresponds to a movie, with non-zero values S_ij indicating similarity between movies i and j. Finally, we consider both actual and binarized scores in the S feature matrix.
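Building the similarity-based feature matrix from pre-trained item embeddings can be sketched as follows (k, the metric, and binarization are the knobs discussed above; the function name is illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix

def item_similarity_features(E, k, metric="inner", binarize=False):
    """Sketch of the item-to-item similarity feature matrix: each row of S
    lists an item's k nearest neighbors under inner-product or cosine
    similarity of pre-trained embeddings E (n_items x d)."""
    X = E / np.linalg.norm(E, axis=1, keepdims=True) if metric == "cosine" else E
    sims = X @ X.T
    np.fill_diagonal(sims, -np.inf)            # exclude self-similarity
    nbrs = np.argpartition(-sims, k, axis=1)[:, :k]
    rows = np.repeat(np.arange(E.shape[0]), k)
    cols = nbrs.ravel()
    vals = np.ones(len(rows)) if binarize else sims[rows, cols]
    return csr_matrix((vals, (rows, cols)), shape=(E.shape[0], E.shape[0]))
```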
Public user ratings. We use the ML20M dataset and select the observations corresponding to 70152 users that do not overlap with the ML10M users. We consider this set of user-item feedback as public item side data. More specifically, each user whose data is considered public plays the same functional role as a feature token. We observe that in this case, the public data is of the same semantic type as the private data. This setup is closest to the common assumption in the literature that private and public data come from the same or similar distributions. In experiments, we use subsets of public users of various sizes, ranging from very small (s = 100) to the full available data (s = 70152). Finally, we consider both raw ratings (from the original ML20M) and binarized ratings.

Experimental procedure
We follow the procedure of (Lee et al., 2016) to partition ML10M into train, test and validation datasets. For the privacy parameters, following (Jain et al., 2018; Chien et al., 2021; Krichene et al., 2023), we consider a range of ε ∈ {1, 5, 10, 20} and fix δ = 10^{−5}. For each privacy setting we use the optimal hyper-parameters (number of ALS iterations T, regularization weight λ, clipping norm Γ_U) tuned by (Krichene et al., 2023) for the same task without side features. With these pre-tuned hyper-parameters, for each ε and each public data source, we tune only the hyper-parameters corresponding to side features: the side feature weight α and the side feature regularization λ′. We select hyper-parameters based on performance on the validation set, and report performance on the test set. Performance is measured using the Root Mean Squared Error (RMSE) between true and predicted ratings:

RMSE = √( (1/|Ω_test|) ∑_{(i,j)∈Ω_test} (M_ij − ⟨u_i, v_j⟩)² ).

It is important to note that, although the training loss considers side feature-item data and the learned feature embeddings play a crucial role in updating item embeddings, the final performance is measured solely on rating data (user feedback). In addition, we report performance metrics sliced by item popularity, to gain a deeper understanding of how public item information impacts the quality of private models across frequent and infrequent items.
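The evaluation metric can be sketched as:

```python
import numpy as np

def rmse(U, V, test_triples):
    """Sketch of the evaluation metric: RMSE over held-out (user, item,
    rating) triples, using only rating data (no side features)."""
    err = [(r - U[i] @ V[j]) ** 2 for i, j, r in test_triples]
    return float(np.sqrt(np.mean(err)))
```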

Results
In Figure 1a, we compare DP-CMF's performance to the DPALS algorithm without side features from (Chien et al., 2021; Krichene et al., 2023) (blue curve), which is the current state of the art on the ML10M benchmark. We also report for reference the non-private ALS baseline (dashed line). Incorporating public item information significantly narrows the existing quality gap between private and non-private models. The relative improvement depends on the public data source. Public user rating data (red curve) consistently outperforms other sources, as expected, since it comes from a closely related distribution. However, even tangentially related public item data, such as item metadata from Wikidata, substantially improves model quality. Furthermore, the different public item data modalities are composable, leading to compounded accuracy improvements (purple curve). The gap between private and non-private models is largest for high privacy requirements (low ε), with side features closing up to 60% of the performance gap.

Tail performance
Figure 1b shows performance across four popularity buckets for models trained under privacy parameter ε = 1, with each bucket containing roughly 2500 items. Due to the skewed distribution of ratings, the buckets hold 86.6%, 9.4%, 3%, and 1% of all ratings, respectively. Thus, popular items have a greater impact on overall performance. The performance ordering of head items (bucket 1) matches the global ordering. However, for tail items (buckets 2 through 4), the order is reversed, with Wikidata features showing the most improvement for tail items.
We posit that Wikidata movie metadata outperforms on tail items due to its less pronounced bias towards popular items. As Fig. 2a illustrates, 90% of feature-item observations from ML20M public users correspond to popular items, while only 37% of Wikidata feature matrix entries do so. However, feature density alone doesn't fully account for this performance difference, as item-to-item similarity doesn't perform as well on tail items, despite being perfectly balanced (by construction, we select the same number of neighbors for all movies). One hypothesis is that side features are most beneficial for tail items when they can transfer information from popular items. Fig. 2b supports this: the figure shows, for each feature, the fraction of top-bucket movies for that feature (so a fraction of r means that among all occurrences of that feature, an r fraction fall in the top bucket while the rest fall in other buckets). We observe that while public users primarily rate popular movies, Wikidata features, though less frequent, are more balanced across popular and tail items. This may explain why Wikidata outperforms other public data sources on tail items.

Performance across public data sources
We find that cast information alone captures most of the performance lift achieved by Wiki Metadata features (Figure 3). Cast information is the most effective side feature for both head and tail items. This is potentially explained by the fact that cast information is very granular and plausibly correlates with user preferences.
In Figure 4 we consider variants of pairwise item similarity. We find that performance improves with the number of similar items for both cosine and inner-product similarity scoring. Inner-product scores generally outperform cosine similarity scores, in part because they take into account the magnitudes of the two vectors, not just their angle. Given that higher magnitudes typically correspond to more popular items, this leads to popularity bias, reflected in the comparatively weaker performance of dot-product similarities on tail items.
Finally, in Figure 5 we consider public item data derived from public ratings. Increasing the number of features (in this case, users with public ratings) significantly enhances model performance. This improvement is more pronounced when using non-binarized ratings, with the private model's performance approaching that of the non-private model even under strict privacy settings. While the largest accuracy gains are achieved with large amounts of public data (s = 50000), even modestly sized sources of in-distribution data (s = 1000) yield performance improvements comparable to the best gains achieved with Wiki Metadata.

Discussion
In this work, we introduce DP-CMF, a method aimed at improving the privacy-accuracy trade-off of private recommendation models. Our technique incorporates public item feature data into private recommendations that satisfy (ε, δ)-DP. The approach is simple to implement, easy to tune, and highly scalable. DP-CMF allows for the integration of public side item information, pairwise item similarities, and public rating data, all within the same formulation and without requiring any changes to privacy accounting. Our experimental results demonstrate practical improvements in the privacy-accuracy trade-off from utilizing public item features.
Identifying public features that align with user interests and enhance recommendation performance remains a challenge. This task is domain-dependent. In general, access to high-quality annotations is beneficial, and these may be harder to obtain in some domains, for instance when content creation is cheap and annotations are relatively more expensive. In such cases, another potential source is learning unsupervised, content-based similarity (Jansen et al., 2018). Future work includes comparing DP-CMF with pre-training approaches and extending our methodology beyond CMF. This could involve exploring other models that utilize item side features, such as Inductive Matrix Completion (Gantner et al., 2010; Xu et al., 2013; Chiang et al., 2015; Jain and Dhillon, 2013; Goldberg et al., 2010), which enjoys favorable theoretical guarantees.
Figure 1: Impact of public item features on private recommendation accuracy. Wiki Metadata corresponds to categorical genre, topic and cast features; Item-Item Similarity considers k = 100 similar items according to dot product; ML20M Public Users considers binary observations for ratings ≥ 4.

Figure 2: Popularity bias of Wikidata metadata, item-item similarity and ML20M public users' data. (a) Feature matrix density across popularity levels. (b) Share of popular items vs. feature prevalence.

Figure 4: Comparing DP-CMF performance across varying similarity functions and numbers of selected similar items.