Stability and Efficiency of Personalised Cultural Markets

This work is concerned with the dynamics of online cultural markets, namely, the allocation of attention by many users across a set of digital goods with infinite supply. Such dynamics are important in shaping processes and outcomes in society, from trending items in entertainment and collective knowledge creation to election outcomes. The outcomes of online cultural markets are susceptible to intricate social influence dynamics, particularly so when the community comprises consumers with heterogeneous interests. This has made formal analysis of these markets difficult. In this paper, we remedy this by establishing robust connections between influence dynamics and optimization processes, in trial-offer markets where the consumer preferences are modelled by multinomial logit. Among other results, we show that the proportional-response-esque influence dynamic is equivalent to stochastic mirror descent on a convex objective function, thus leading to a stable and predictable outcome. When all consumers are homogeneous, the objective function has a natural interpretation as a weighted sum of efficiency and diversity of the cultural market. In simulations driven by real-world preferences collected from a large-scale recommender system, we observe that ranking strategies aligned with the underlying heterogeneous preferences are more stable, and achieve higher efficiency and diversity.


Introduction
Online content platforms are major sources of everyday entertainment, and the attention allocation within a platform provides a market setting of interest. There are strong social and economic motivations for the stakeholders to understand these markets. The platform providers ask how they can improve user experience and raise profits. They may also want to enforce diversity of the cultural products, so as to achieve a sustainable business model. The content producers are interested in strategies to improve their products, so as to gain more popularity and raise revenues. Regulatory bodies and the user population seek to understand the market dynamic and its drivers, with the goal of making attention markets more transparent, accountable and fair.
However, understanding such markets is challenging, since their outcomes are susceptible to intricate social influence dynamics among customers, which in turn are affected by the recommender systems of the platform providers. While various analyses have been done for the classical markets (e.g., Arrow-Debreu/exchange markets, Fisher markets), online cultural markets (or more generally, attention economies) differ from classical markets in several key aspects, so it is not clear whether known analyses directly apply. In classical markets, goods are scarce. Their prices act as the coordination signals to balance demand and supply. Typically, there exist moderate prices that lead to equilibrium. In online cultural markets, however, the digital goods can be reproduced at essentially no cost, so their supplies are unlimited. Users' attention is the scarce commodity that the producers compete for. Typically, the influence dynamics and recommender systems tend to cascade, i.e., to promote goods that have already received much attention. This is known to lead to polarizing and unpredictable outcomes.
Since the seminal empirical work by Salganik et al. [49], now dubbed MusicLab, several mathematical models have been proposed to describe it [42,1,45], but all of them assume that user preferences are homogeneous. This is in stark contrast to the rich literature and widespread practice of recommender systems that are focused on estimating and catering to heterogeneous preferences in user populations. On the other hand, recent work on classical markets (especially Fisher markets) offers a range of results to understand equilibria from an algorithmic and optimization perspective [57,9,19]. One may wonder: can recent results on Fisher markets be extended to describe the implicit computations in cultural markets? Specifically for the MusicLab model [42], customer behaviors are characterized by a two-step trial-offer (T-O) process: first, they select a product to try; second, they decide whether or not to purchase. The first step is a stochastic process where the randomness depends on the intrinsic appeal of the products, and also on the history of past purchases by customers, creating a feedback loop. The second step is random, depending on the intrinsic quality of the product to the customer.
Figure 1: A core contribution of this paper is to provide an optimization view (Top) of cultural markets (Bottom), which affords new results on stability, efficiency and equilibrium behavior. (Top) Objective functions (9), (14). (Bottom) An illustration of a cultural market with several types of users interacting with a few items (color similarity between users and items indicates differing matches in preferences). Users allocate attention to the items based on the proportional-response-esque dynamic (see Equation (2)).

Our contributions
The main themes of this work are to establish robust connections between stochastic T-O markets and optimization, and to use these connections to rigorously show that the influence dynamics in these markets are stable. For the homogeneous markets, we discover two objective functions for which the equilibrium of a T-O market is a maximiser. The first objective is the "total utility" of the market. The second objective is of particular interest due to its natural interpretation. It is a weighted sum of the efficiency of the market and the diversity of its market shares, as measured by the Shannon entropy. While efficiency is a natural benchmark, diversity in a cultural market is also important for the healthy development of the platform. Diversity not only broadens the customer base, but also provides financial support to the less popular producers, keeping them in the cultural industry. Thus, it is in the platform providers' interest to strike a balance between efficiency and diversity.
Interestingly, we show that the influence dynamic is indeed equivalent to stochastic mirror descent on the second objective. This suggests that the dynamic is implicitly optimizing the natural objective in the market. A significant consequence is that this allows us to present a new proof of a result of Maldonado et al. [45] that the dynamic converges to an equilibrium of the market almost surely.
For heterogeneous markets, we show that the equilibrium is optimizing an ex-post version of Nash social welfare. In a classical Fisher market, Nash social welfare is the product of users' utilities, where each utility is raised to the power of the user's budget. In our case, the power is the budget times the efficiency for that user at the equilibrium. Then we turn our focus to two interesting sub-classes of heterogeneous markets, namely (i) the users have the same appeals on the items, but they perceive the qualities of the items differently; (ii) the users perceive the same qualities of the items, but they have different appeals on the items. For (i), we show that it is equivalent to a homogeneous market. For (ii), we design a new objective function, where the influence dynamic is equivalent to stochastic mirror descent on the objective. Again, this allows us to show that the dynamic converges to an equilibrium almost surely.
The robust connection between the dynamics and optimization processes echoes the self-reinforced efficiency of some economic systems, for which there exist natural dynamics or algorithms that can attain equilibrium, while the equilibrium optimizes a popular efficiency measure such as social welfare. See Related work below for more relevant discussions.
We perform simulations using user preferences from the well-known MovieLens-100K dataset [37]. We observe that accounting for heterogeneous user preferences improves efficiency in cultural markets while preserving stability. We examine the (user-centric) efficiency and (item- or producer-centric) diversity measures across three ranking strategies: random, quality-driven, and popularity-driven. Results confirm that quality ranking is more efficient and more diverse than popularity ranking, which was implemented in MusicLab [49] and is known to be unstable.
The rest of this paper is organized as follows. After describing the models in Section 2, we present our main results formally in Section 3. In Section 4, we provide an overview of the techniques we employ for proving the main results. This is followed by a discussion of empirical observations in Section 5.

Related work
As early as 1971, Simon [50] pointed out that in an information-rich world, attention becomes the new scarcity that information consumes. Examples of attention economies include entertainment such as music, film and television [49,5], political campaigns and votes [10], scientific publications and researchers [34]. Since Simon's visionary statement, the research community has formulated economic questions about attention in a number of different ways, such as articulating the phenomenon of attention scarcity in corporate life [27], diagnostic criteria for attention scarcity and solving it as (one-off) allocation problems [32,33], or connecting attention allocation to advertising revenue [31]. A recent study by Vosoughi et al. [52] showed that false news spreads faster online, suggesting that besides quality, the appeal of a digital good (e.g., the novelty of the false news and the emotion it stimulates) is crucial in social influence. More broadly, the web research community has measured attention to items by individual users [51], by a large set of users [55], and attention among a network of items [56].
The concept of self-reinforced efficiency of economic systems can be traced back to the "invisible hand" metaphor of Adam Smith. One of the first analytical confirmations of the concept is the famous First Welfare Theorem, which states that a market equilibrium of any complete market is Pareto efficient [43,3,28]. Furthermore, in a broad class of markets called Eisenberg-Gale markets, market equilibrium optimizes a popular efficiency measure called Nash social welfare [30,29,38]. On the other hand, in combinatorial auctions, any Walrasian equilibrium (if it exists) optimizes the social welfare [8]. In many of these economic systems, there are natural adaptive price/bidding dynamics (e.g., tâtonnement [53,4,23,24,18,17,16,20], proportional response [54,44,57,9,19,12,20,35,13,21,22]) or auction algorithms (e.g., ascending-price auctions [41,47]) that attain efficient equilibria. As we shall see, the influence dynamics we study are indeed a stochastic version of proportional response.

Model: The Trial-Offer Market with Heterogeneous User Types
First, we describe the stochastic trial-offer (T-O) market, in which users come to the platform one by one to try and purchase the items. We introduce measures of efficiency and diversity. Then we describe a continuous and deterministic analogue of the stochastic model which will be useful for analysis. In this work, we use purchase to denote a user completing a transaction on an item, where the resource a user spends is attention. One may think of it as a unit amount of time. Without loss of generality, we assume that each user has the same budget of attention, and that each item costs one unit of attention. This model generalises the cultural markets specified by Krumme et al. [42] and Maldonado et al. [45] to heterogeneous types of users.

Stochastic Trial-offer (T-O) Market
Let $U$ denote the set of user types and $I$ denote the set of items. The fraction of Type-$i$ users is denoted by $w_i$, with $\sum_{i \in U} w_i = 1$; if there is only one user type we say the market is homogeneous, otherwise it is heterogeneous. The dynamic starts at time $t = 0$. At each time $t \ge 1$, a random user comes to the platform and tries an item, and then she decides whether or not to purchase the item. Let $d^t_j$ denote the number of purchases of item $j$ up to time $t$. To ensure that all items have a positive probability of being tried in the initial rounds, we assume that each item was purchased at least once before the dynamic starts, i.e., $d^0_j \ge 1$ for every item $j$. The market share of item $j$ at time $t$ is simply the fraction of all purchases that goes to item $j$:
$$\phi^t_j = \frac{d^t_j}{\sum_{k=1}^{|I|} d^t_k}.$$
The possible market shares lie on a simplex, denoted by $\Delta$:
$$\Delta = \Big\{ \phi \in \mathbb{R}^{|I|} : \phi_j \ge 0 \text{ for all } j, \ \sum_{j=1}^{|I|} \phi_j = 1 \Big\}.$$
If the user at round $t$ is of Type-$i$, the probability that she will try item $j$ is modelled as a multinomial logit, a common type of discrete choice model [36]:
$$\frac{v_{ij} (\phi^{t-1}_j)^{r_i}}{\sum_{k=1}^{|I|} v_{ik} (\phi^{t-1}_k)^{r_i}}. \qquad (1)$$
Here $v_{ij} \ge 0$ is a parameter that depicts the visibility of item $j$ to Type-$i$ users, which depends on the appeal of the item itself, and also on how the item is promoted or ranked with respect to other items. $r_i > 0$ is a parameter called the feedback exponent, which depicts the strength of the feedback signal for Type-$i$ users.¹
After a Type-$i$ user tries an item $j$, they purchase the item with probability $q_{ij} \in [0, 1]$, which intuitively reflects the quality of an item that may be unknown before it is tried.
In a homogeneous market, there is only one user type, so we drop all indices $i$ from the notation, resulting in $v_j$, $q_j$, $r$, and clearly $w_1 = 1$.
The dynamic is the result of two interacting factors. The first is the user-specific visibility and quality factors $v_{ij}$ and $q_{ij}$, which generalise recent models that analyse homogeneous attention markets with feedback loops [42,39]. The second is a social feedback signal $(\phi^{t-1}_j)^{r_i}$ based on the overall popularity of the item, such as the one implemented by the original MusicLab experiment [49], or the number of downloads and likes on myriads of internet platforms. This feedback dynamic is also similar to proportional response in Fisher markets [19], which we will exploit for obtaining key results in Section 3.
¹ $r_i = 0$ means no social feedback signal from the current market share, whereas $r_i \to \infty$ means only the most popular item will be chosen in the next round. If the denominator $\sum_{k=1}^{|I|} v_{ik} (\phi^{t-1}_k)^{r_i} = 0$, then this probability is defined as 0.
A ranked list is one of the most popular forms of presenting a set of items to users, and a salient factor affecting the visibility of an item is its position in such a list [26]. If the positions are fixed throughout the attention dynamic, $v_{ij}$ remains constant. Our theoretical results focus on this case. In Section 5, we empirically explore how strategies by which the platform dynamically positions the items affect the outcome. We compute the probability of item $j$ being the next purchase by combining the trial and purchase probabilities.
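Lemma 2.1. In a stochastic T-O market defined above, the probability that the next purchase is for item $j$, denoted by $p_j(\phi)$, is a function of the current market share $\phi$, given by
$$p_j(\phi) = \frac{y_j(\phi)}{\sum_{k=1}^{|I|} y_k(\phi)}, \qquad y_j(\phi) = \sum_{i \in U} w_i \, \frac{v_{ij} \phi_j^{r_i}}{\sum_{k=1}^{|I|} v_{ik} \phi_k^{r_i}} \, q_{ij}, \qquad (2)$$
where $y_j(\phi)$ represents the probability that item $j$ is tried and then purchased by any user group. In particular, for the homogeneous case, the probability that the next purchase is for item $j$ is
$$p_j(\phi) = \frac{v_j q_j \phi_j^{r}}{\sum_{k=1}^{|I|} v_k q_k \phi_k^{r}}. \qquad (3)$$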
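To make the lemma concrete, the following minimal Python sketch computes the next-purchase distribution by combining the multinomial-logit trial step with the purchase step. It assumes that $y_j(\phi)$ is the population-weighted product of trial and purchase probabilities, as described in the lemma; the function names and all numbers are hypothetical and chosen only for illustration.

```python
def trial_probs(phi, v_i, r_i):
    """Trial probabilities of one user type: v_ij * phi_j**r_i, normalised over items."""
    weights = [v * p ** r_i for v, p in zip(v_i, phi)]
    total = sum(weights)
    if total == 0:
        # The probability is defined as 0 when the denominator vanishes.
        return [0.0] * len(phi)
    return [x / total for x in weights]


def next_purchase_probs(phi, w, v, q, r):
    """p_j(phi) = y_j(phi) / sum_k y_k(phi) for a market with several user types."""
    y = [0.0] * len(phi)
    for w_i, v_i, q_i, r_i in zip(w, v, q, r):
        trial = trial_probs(phi, v_i, r_i)
        for j, t_j in enumerate(trial):
            y[j] += w_i * t_j * q_i[j]  # tried by a Type-i user, then purchased
    total = sum(y)
    return [y_j / total for y_j in y]


# Hypothetical two-type, three-item market (illustrative numbers only).
phi = [0.5, 0.3, 0.2]
w = [0.6, 0.4]
v = [[1.0, 2.0, 1.0], [2.0, 1.0, 1.0]]
q = [[0.9, 0.5, 0.7], [0.4, 0.8, 0.6]]
r = [0.5, 0.5]
p = next_purchase_probs(phi, w, v, q, r)
print(p, sum(p))
```

With a single user type, the same function reduces to weights proportional to $v_j q_j \phi_j^r$, since the common trial normaliser cancels.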

Trial-Offer Market Equilibrium
For $\phi$ to be a stationary point in this stochastic process, it must satisfy $p_j(\phi) = \phi_j$ for all items $j$. This motivates the following equilibrium notion.
Definition 2.2. For any T-O market, we say a market share $\phi$ is a trial-offer market equilibrium (TOME) if $p(\phi) = \phi$. We say $\phi$ is an interior TOME if it is a TOME with $\phi_j > 0$ for all items $j$.
The following theorem establishes that a TOME exists in a heterogeneous T-O market under mild conditions. It extends previous results on homogeneous markets [45]. The proof, which uses Brouwer's fixed-point theorem, is presented in Appendix A.
Theorem 2.3. [Existence of TOME.] If for every user type $i$ the population fraction $w_i > 0$ and the feedback exponent $0 < r_i < 1$, then the T-O market must have a TOME $\phi^*$, in which $\phi^*_j > 0$ for any item $j$ with $v_{ij} q_{ij} > 0$ for at least one user type $i$.
If there is an item $j$ with $v_{ij} q_{ij} = 0$ for all user types $i$, then its equilibrium market share must be zero, leading to $\phi^*$ on the relative boundary of the simplex, so we may ignore the item in the analysis. In the rest of this paper, we assume there is no such item in the markets.
In the homogeneous case, we can explicitly compute the unique TOME $\phi^*$ if $0 < r < 1$:
$$\phi^*_j = \frac{(v_j q_j)^{1/(1-r)}}{\sum_{k=1}^{|I|} (v_k q_k)^{1/(1-r)}}, \qquad (4)$$
provided that $v_k q_k > 0$ for some item $k$. It is easy to verify that $\phi^*$ is a stationary point by plugging it into Definition 2.2. Section 3 specifies how to obtain $\phi^*$ and argues for its uniqueness in the interior of the simplex $\Delta$.
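The stationarity claim can be checked numerically. The sketch below assumes the homogeneous next-purchase probability is proportional to $v_j q_j \phi_j^r$, so that solving $p_j(\phi) = \phi_j$ gives $\phi^*_j \propto (v_j q_j)^{1/(1-r)}$; function names and parameter values are hypothetical.

```python
def homogeneous_tome(v, q, r):
    """Closed-form interior equilibrium: phi_j proportional to (v_j q_j)**(1/(1-r))."""
    if not 0 < r < 1:
        raise ValueError("the closed form is used here only for 0 < r < 1")
    powers = [(v_j * q_j) ** (1.0 / (1.0 - r)) for v_j, q_j in zip(v, q)]
    total = sum(powers)
    return [x / total for x in powers]


def next_purchase_probs(phi, v, q, r):
    """Homogeneous next-purchase probabilities, proportional to v_j q_j phi_j**r."""
    weights = [v_j * q_j * p_j ** r for v_j, q_j, p_j in zip(v, q, phi)]
    total = sum(weights)
    return [x / total for x in weights]


# Illustrative parameters (not taken from the paper).
v = [1.0, 2.0, 3.0]
q = [0.9, 0.5, 0.2]
r = 0.5
phi_star = homogeneous_tome(v, q, r)
residual = max(abs(a - b)
               for a, b in zip(next_purchase_probs(phi_star, v, q, r), phi_star))
print(phi_star, residual)  # the residual should be at floating-point level
```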

Efficiency and Diversity Measures
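An online platform may be interested in maximising the probability of a successful transaction among all items, which is
$$\sum_{j=1}^{|I|} y_j(\phi) = \sum_{i \in U} w_i \, \frac{\sum_{j=1}^{|I|} v_{ij} q_{ij} \phi_j^{r_i}}{\sum_{k=1}^{|I|} v_{ik} \phi_k^{r_i}}.$$
We refer to this quantity as the efficiency of the market. A platform may also be interested in promoting diversity among items. A natural measure of diversity is the Shannon entropy, which is the standard measure of uncertainty of a probability distribution in information theory [25]. Given a market share $\phi \in \Delta$, its Shannon entropy is
$$H(\phi) = -\sum_{j=1}^{|I|} \phi_j \log \phi_j. \qquad (5)$$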
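Both measures are cheap to evaluate for a given market share. The following sketch assumes that efficiency is the population-weighted probability that a trial ends in a purchase, and uses the usual convention $0 \log 0 = 0$ for the entropy; names and numbers are hypothetical.

```python
import math


def efficiency(phi, w, v, q, r):
    """Probability that a round ends in a purchase, averaged over user types."""
    total = 0.0
    for w_i, v_i, q_i, r_i in zip(w, v, q, r):
        weights = [v_ij * p ** r_i for v_ij, p in zip(v_i, phi)]
        norm = sum(weights)
        if norm > 0:
            total += w_i * sum(x * q_ij for x, q_ij in zip(weights, q_i)) / norm
    return total


def entropy(phi):
    """Shannon entropy of a market share, with 0 * log 0 treated as 0."""
    return -sum(p * math.log(p) for p in phi if p > 0)


# Illustrative two-type market.
phi = [0.5, 0.3, 0.2]
w = [0.6, 0.4]
v = [[1.0, 2.0, 1.0], [2.0, 1.0, 1.0]]
q = [[0.9, 0.5, 0.7], [0.4, 0.8, 0.6]]
r = [0.5, 0.5]
print(efficiency(phi, w, v, q, r), entropy(phi))
```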

The Deterministic T-O Market Dynamic
This model is a deterministic analogue of the stochastic T-O model, and it will be useful for analysis. There is one user of each type $i$, whose budget is $w_i$. The budget $w_i$ corresponds to the maximum amount of attention the buyer can afford on the platform. At each time $t \ge 0$, each buyer $i$ spends an amount $b^t_{ij}$ on item $j$, subject to the budget constraint; we refer to the resulting update of the spendings as the dynamic (6). Crucially, there is a natural correspondence between the TOME of a stochastic T-O market and the fixed points of the dynamic (6), summarized in Lemma 2.5: any fixed point of (6) corresponds to a TOME. Its proof can be found in Appendix A.

Comparison with Classical Fisher Market and Proportional Response
For readers who are familiar with Fisher market dynamics, the deterministic T-O market dynamic (6) is reminiscent of the proportional response (PR) dynamic in Fisher markets [19]. There is one crucial difference, though. The PR dynamic in Fisher markets has the same form as dynamic (6), but there the total spending on item $k$ in the previous round is viewed as the price of item $k$; a higher price in PR drives down the buyers' spending on that item. In contrast, in T-O markets a higher total spending on an item in the previous round acts as a positive social feedback signal for that item. In the homogeneous case, the updates in T-O markets are mirror descent steps for one of the objectives. Section 3.3 presents the objective functions for heterogeneous markets.

TOME maximises regularised utilities
First, we establish a robust connection between the TOME of a homogeneous T-O market and optimization. For notational simplicity, let $\bar{q}_j = q_j v_j$, noting that the quality and visibility factors $q_j v_j$ are coupled in both (3) and (4). We consider the following two constrained optimization problems:
$$\max_{\phi \in \Delta} \ \sum_{j=1}^{|I|} \bar{q}_j \phi_j^{r} \qquad (7)$$
and
$$\max_{\phi \in \Delta} \ \Psi(\phi) := \sum_{j=1}^{|I|} \phi_j \log \bar{q}_j - (1 - r) \sum_{j=1}^{|I|} \phi_j \log \phi_j. \qquad (8)$$
We establish the equivalence between the equilibria and the maximisers of the above problems in the following theorem. The proofs for both simply invoke Lagrangian multipliers, and they are presented in Appendix B.
Theorem 3.1. If $\phi^*$ is a maximiser of problem (7) or problem (8), then it is a TOME.
We can view the objective function (7) as the "total utility" since the choice probability of item $j$ is proportional to the "utility" associated with it, which is $\bar{q}_j \phi_j^r$. The objective function (8) can be decomposed into two sums, namely $\sum_{j=1}^{|I|} \phi_j \log \bar{q}_j$ and $(1 - r) \sum_{j=1}^{|I|} -\phi_j \log \phi_j$. The first sum can be viewed as an alternative measure of total utility, with the utility of item $j$ being $\log \bar{q}_j$ weighted by its market share $\phi_j$. The second sum is $(1 - r)$ times the Shannon entropy of the market share. When $r = 1$, the entropy term disappears, so the optimization problem (8) becomes trivial: the optimal solution sets $\phi_j = 1$ for the highest-utility item $j = \arg\max_k \bar{q}_k$. As $r$ decreases from 1, i.e., as the strength of the feedback signal reduces, the entropy term becomes more significant, which encourages diversity in the optimal solution. For $0 < r < 1$, the objective functions in (7) and (8) are both strictly concave in $\phi$, and therefore each has a unique maximiser. A crucial advantage of (8) over (7) is that mirror descent on (8) provides insight into the convergence of the stochastic influence dynamics in the T-O market as specified in (1) and (3). To formally describe this discovery, we need several concepts in optimization theory, which are discussed next.
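As a sanity check on the decomposition just described, the sketch below evaluates the second objective as the sum $\sum_j \phi_j \log \bar{q}_j$ plus $(1 - r)$ times the Shannon entropy, and compares its value at the candidate maximiser $\phi_j \propto \bar{q}_j^{1/(1-r)}$ (obtained from the Lagrangian conditions) against randomly sampled points of the simplex. The sampling scheme, names and numbers are illustrative assumptions.

```python
import math
import random


def psi(phi, q_bar, r):
    """Utility term plus (1 - r) times the Shannon entropy of the market share."""
    utility = sum(p * math.log(qb) for p, qb in zip(phi, q_bar))
    entropy = -sum(p * math.log(p) for p in phi if p > 0)
    return utility + (1.0 - r) * entropy


def candidate_maximiser(q_bar, r):
    """Stationary point of the Lagrangian: phi_j proportional to q_bar_j**(1/(1-r))."""
    powers = [qb ** (1.0 / (1.0 - r)) for qb in q_bar]
    total = sum(powers)
    return [x / total for x in powers]


def random_simplex_point(n, rng):
    """Sample a point of the simplex by normalising exponential draws."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


q_bar = [0.9, 1.0, 0.6]  # illustrative values of q_j * v_j
r = 0.5
phi_star = candidate_maximiser(q_bar, r)
rng = random.Random(0)
best_random = max(psi(random_simplex_point(3, rng), q_bar, r) for _ in range(1000))
print(psi(phi_star, q_bar, r), best_random)
```

Lowering `r` in this sketch puts more weight on the entropy term, so the maximiser moves towards the uniform market share.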

T-O update as mirror descent, and TOME convergence for homogeneous markets
Background: Bregman Divergence and Mirror Descent. Consider a general constrained convex optimization problem of minimizing a smooth convex function $f(x)$, subject to the constraint $x \in C$ for some compact and convex set $C$.
Definition 3.2. Let $C$ be a compact and convex set, and let $h$ be a differentiable convex function on $C$. The Bregman divergence w.r.t. $h$, denoted by $d_h$, is defined as
$$d_h(x, y) = h(x) - h(y) - \langle \nabla h(y), x - y \rangle.$$
The widely used Kullback-Leibler (KL) divergence is a special case of Bregman divergence, generated by the function $h(x) = \sum_j (x_j \log x_j - x_j)$.
Given a Bregman divergence $d_h$, the corresponding mirror descent update rule is
$$x^{t+1} = \arg\min_{x \in C} \Big\{ \langle \nabla f(x^t), x - x^t \rangle + \frac{1}{\alpha} d_h(x, x^t) \Big\}, \qquad (9)$$
where $\alpha$ is the step-size of the update rule, which may depend on $t$ in general.

New Result: T-O update as Mirror Descent
A key conceptual message of this paper is the equivalence of the influence dynamic and mirror descent. To illuminate this, we first focus on the deterministic and homogeneous T-O market.
Lemma 3.3. Let $\phi^t$ be the market share at time $t$ in a homogeneous T-O market, and let the function $p(\phi)$ be as defined in (3). The update rule
$$\phi^t = p(\phi^{t-1}) \qquad (10)$$
is equivalent to the mirror descent update rule (9) for the optimization problem (8), in which $d_h$ is taken as the KL divergence, and $\alpha = 1$.
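The equivalence can be checked numerically. The sketch below assumes that the objective being minimised is $f = -\Psi$ with $\Psi(\phi) = \sum_j \phi_j \log \bar{q}_j - (1-r) \sum_j \phi_j \log \phi_j$, and uses the standard closed form of a KL mirror descent step on the simplex (multiply by $\exp(-\alpha \nabla f)$ and renormalise). It then compares one such step with the market update $\phi_j \propto \bar{q}_j \phi_j^r$. Function names and numbers are hypothetical.

```python
import math


def grad_neg_psi(phi, q_bar, r):
    """Gradient of f = -Psi: -log q_bar_j + (1 - r) * (log phi_j + 1)."""
    return [-math.log(qb) + (1.0 - r) * (math.log(p) + 1.0)
            for p, qb in zip(phi, q_bar)]


def kl_mirror_step(phi, grad, alpha=1.0):
    """KL mirror descent on the simplex: exponentiated-gradient update."""
    weights = [p * math.exp(-alpha * g) for p, g in zip(phi, grad)]
    total = sum(weights)
    return [x / total for x in weights]


def t_o_update(phi, q_bar, r):
    """Homogeneous T-O update: next market share proportional to q_bar_j * phi_j**r."""
    weights = [qb * p ** r for p, qb in zip(phi, q_bar)]
    total = sum(weights)
    return [x / total for x in weights]


q_bar = [0.9, 1.0, 0.6]  # illustrative values
r = 0.4
phi = [0.2, 0.5, 0.3]
md = kl_mirror_step(phi, grad_neg_psi(phi, q_bar, r), alpha=1.0)
to = t_o_update(phi, q_bar, r)
print(md)
print(to)
```

The two printed vectors agree because $\phi_j \exp(\log \bar{q}_j - (1-r)(\log \phi_j + 1))$ is proportional to $\bar{q}_j \phi_j^r$; the constant factor $e^{-(1-r)}$ disappears in the normalisation.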
The proof of the above lemma is presented in Appendix B. Once the equivalence is established, the convergence to TOME of the deterministic dynamic (10) becomes intuitive; we will provide the formal argument in Section 4. To show that the convergence extends to the stochastic setting, we follow Maldonado et al. [45] and rewrite the stochastic influence dynamic as a Robbins-Monro algorithm (RMA) of the deterministic dynamic (10). Precisely, the RMA is of the form $\phi^t = \phi^{t-1} + \frac{1}{t} \cdot u^t$, where $u^t$ is a random vector with $E[u^t] = p(\phi^{t-1}) - \phi^{t-1}$. For comparison, note that we can rewrite (10) as $\phi^t = \phi^{t-1} + (p(\phi^{t-1}) - \phi^{t-1})$. This enables us to apply stochastic approximation [6,11] to establish the convergence of the stochastic dynamic. We summarize our main result for the homogeneous market in the theorem below, and leave the discussions of RMA and stochastic approximation to Section 4, and the full proof to Appendix D.
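A small simulation illustrates the stochastic dynamic described above. This is a minimal sketch of the homogeneous trial-offer process, assuming one initial purchase per item, trial probabilities proportional to $v_j \phi_j^r$, and purchase probability $q_j$; it compares the simulated market share with the closed-form fixed point $\phi_j \propto (v_j q_j)^{1/(1-r)}$. The function names, the seed, the number of rounds and the parameter values are arbitrary illustrative choices.

```python
import random


def simulate_homogeneous_market(v, q, r, rounds, seed=0):
    """Simulate trial-offer rounds and return the final market share."""
    rng = random.Random(seed)
    n = len(v)
    d = [1] * n  # each item is assumed purchased once before the dynamic starts
    for _ in range(rounds):
        total = sum(d)
        # Trial step: multinomial logit with social feedback phi_j**r.
        weights = [v[j] * (d[j] / total) ** r for j in range(n)]
        j = rng.choices(range(n), weights=weights)[0]
        # Purchase step: the tried item is bought with probability q_j.
        if rng.random() < q[j]:
            d[j] += 1
    total = sum(d)
    return [d_j / total for d_j in d]


def closed_form_tome(v, q, r):
    """Fixed point of the homogeneous update for 0 < r < 1."""
    powers = [(v_j * q_j) ** (1.0 / (1.0 - r)) for v_j, q_j in zip(v, q)]
    total = sum(powers)
    return [x / total for x in powers]


v = [1.0, 2.0, 1.0]
q = [0.9, 0.5, 0.7]
r = 0.3
phi_sim = simulate_homogeneous_market(v, q, r, rounds=30000, seed=1)
phi_star = closed_form_tome(v, q, r)
print(phi_sim)
print(phi_star)
```

The simulated share is random, so only approximate agreement with the fixed point should be expected at any finite horizon.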
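Theorem 3.4. In any homogeneous T-O market with $r < 1$, if $\phi^0 > 0$, then with probability 1,
$$\lim_{t \to \infty} \phi^t = \phi^*, \qquad (11)$$
where $\phi^*$ is the unique interior TOME of the market. When $r \in [0, 1]$, with probability 1,
$$\lim_{t \to \infty} \Psi(\phi^t) = \Psi^*, \qquad (12)$$
where $\Psi$ is the objective function of the optimization problem (8), and $\Psi^*$ is the maximum value of (8).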

TOME for heterogeneous markets
For the heterogeneous case, we first show the following proposition, which states that the TOME optimizes an ex-post version of a convex objective.
Proposition 3.5. Given a heterogeneous T-O market with $0 < r_i < 1$ for all $i \in U$, its TOME $\phi^*$ is the optimal solution of the optimization problem (13), whose coefficients $a^*_i$ are evaluated at the TOME itself.
We present the proof of the above proposition in Appendix C. The objective function of (13) takes the form of a product of utilities, or a sum of log-utilities after taking the logarithm. Known as Nash social welfare [40], this objective was found to strike a good balance between fairness and efficiency in the resulting allocations [7,14]. A proper exposition of this connection is outside the scope of this paper. Once the $a^*_i$ are known (hence ex-post), (13) is a convex optimization problem. However, if the $a^*_i$ are unknown, it is non-convex in general. We raise the properties of its optimal solution (e.g., is the optimal solution a TOME?) as an open problem.
Then we turn our focus to two interesting special cases of the heterogeneous market, where $r_i = r$ for all types $i$, and:
• the trial randomness is the same across all user types (i.e., $v_{ij}$ are the same for all types $i$), but the purchase randomness can be different (i.e., $q_{ij}$ can be different for various types $i$);
• the trial randomness can be different across types (i.e., $v_{ij}$ can be different for various types $i$), but the purchase randomness is the same across all types (i.e., $q_{ij}$ are the same for all types $i$).
It is easy to reduce the first case to a homogeneous setting. For the second case, we design a new optimization problem and show that (6) is indeed mirror descent for the problem. The driving variables of the dynamic (6) are $b_{ij}$ for user types $i$ and items $j$. We let $b_j := \sum_{i=1}^{|U|} b_{ij}$, and set $q_j$ to be the common value of $q_{ij}$ for all $i$.
By applying the variable transformation $x_{ij} = b_{ij}/q_j$ to this problem, we obtain an equivalent transformed optimization problem (14) where the $x_{ij}$'s are the driving variables, which is needed for the key lemma below. The proof is presented in Appendix C.
Lemma 3.6. The dynamic (6) is equivalent to mirror descent w.r.t. KL divergence on the transformed optimization problem (14).
Recall from Lemma 2.5 that any fixed point of (6) corresponds to a TOME. Lemma 3.6 thus implies that if the dynamic (6) converges to the optimal solution of (14), the optimal solution corresponds to a TOME. Then we use RMA and stochastic approximation again to establish convergence of the corresponding stochastic influence dynamics in heterogeneous markets. The proof is presented in Appendix D.
Theorem 3.7. In any heterogeneous T-O market with $r_i = r < 1$ for all $i \in U$, if $\phi^0 > 0$, and one of the following conditions
1. $v_{ij} = v_j$ for all $i \in U$, $j \in I$
2. $q_{ij} = q_j$ for all $i \in U$, $j \in I$
is satisfied, then with probability 1,
$$\lim_{t \to \infty} \phi^t = \phi^*,$$
where $\phi^*$ is the unique interior TOME of the market. When $r \in [0, 1]$, with probability 1,
$$\lim_{t \to \infty} \Gamma(b^t) = \Gamma^*,$$
where $b^t = (b^t_{ij})$ is the vector with $b^t_{ij}$ defined on $\phi^{t-1}$ as in (6), $\Gamma$ is the objective function of the optimization problem (14), and $\Gamma^*$ is the maximum value of (14).
A Remark. In the settings of Theorem 3.4 and Theorem 3.7, there indeed exist multiple TOMEs in the simplex $\Delta$. However, there is only one interior TOME. The uniqueness of the limit point of the dynamic depends on the choice of initial point and the social influence parameter $r$:
• When $r < 1$ and the initial market share $\phi^0$ is in the relative interior of $\Delta$, the limit point of the dynamic is unique, and it is the interior TOME, for both homogeneous and heterogeneous cases according to Theorems 3.4 and 3.7.
• When $\phi^0$ contains zero initial market shares for some items, it is equivalent to consider a market with those items eliminated. In other words, items with zero initial market share do not gain market share in subsequent iterations. The limit point of the dynamic is still unique in these cases.
• When $r = 1$, the objective functions in optimization problems (8) and (13) are not strictly concave. There is a level set containing multiple equilibria. The dynamic will converge to the level set which optimizes these objective functions.

Analysis
We follow the same approach in formally proving two of our main results, Theorem 3.4 and Theorem 3.7. The approach comprises two main steps. First, we show that the evolution of market share $\phi$ can be cast as a stochastic RMA. Second, we show the convergence of such RMA by establishing their equivalence to mirror descent on convex functions.

Influence Dynamic as RMA
A Robbins-Monro algorithm (RMA) is a discrete-time stochastic process $\{z^t\}$ of the form
$$z^{t} = z^{t-1} + \gamma^{t} \big( F(z^{t-1}) + U^{t} \big),$$
where $z^t \in \mathbb{R}^n$ for some $n \ge 1$, $F : \mathbb{R}^n \to \mathbb{R}^n$ is a deterministic continuous vector field, $\gamma^t$ is deterministic and satisfies $\gamma^t > 0$, $\sum_{t \ge 1} \gamma^t = +\infty$ and $\lim_{t \to \infty} \gamma^t = 0$, and $E[U^t \mid \mathcal{F}^{t-1}] = 0$, where $\mathcal{F}^{t-1}$ is the natural filtration on the entire process. The corresponding ordinary differential equation (ODE) system of the RMA is $\dot{z} = F(z)$.
Note that the market share $\phi$ changes only when there is a purchase. Thus, Maldonado et al. [45] modify the time schedule to count only those times at which a purchase occurs, and show the following lemma.
Lemma 4.2. In the stochastic T-O market, the update of market share follows the following RMA w.r.t. the modified time schedule:
$$\phi^{t} = \phi^{t-1} + \gamma^{t} \big( p(\phi^{t-1}) - \phi^{t-1} + U^{t} \big), \qquad \gamma^t = \frac{1}{\sum_{j} d^t_j},$$
where $U^t$ is the random variable defined as follows. Let $e^t$ denote the random unit vector whose $j$-th entry is 1 if item $j$ is purchased at time $t$. Then $U^t = e^t - E[e^t \mid \mathcal{F}^{t-1}]$. (Recall that $e^t_j = 1$ with probability $p_j(\phi^{t-1})$.)
The proof of this lemma can be found in [45] (for the homogeneous setting) and Appendix D (for the heterogeneous setting). With the lemma in hand, we can apply the seminal results of Benaïm [6] to show that the RMA trajectory is an asymptotic pseudotrajectory of the mirror descent update (10). By using the mirror descent convergence theorem established in [15,19], we show that both dynamics converge to the global maximisers of (8). This allows us to present a new proof of Theorem 3.4, which was first shown in [45].
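For intuition, the RMA form can be recovered directly from the definition of market share. The short derivation below is a sketch under the assumption that, on the modified time schedule, exactly one purchase occurs per step; the symbol $D^t$ for the total number of purchases is introduced here only for illustration.

```latex
% Assumption: one purchase per step, so D^t := \sum_j d^t_j = D^{t-1} + 1
% and d^t = d^{t-1} + e^t, where e^t is the unit vector of the purchased item.
\begin{align*}
\phi^{t}
  &= \frac{d^{t-1} + e^{t}}{D^{t-1} + 1}
   = \frac{D^{t-1}\,\phi^{t-1} + e^{t}}{D^{t}}
   = \phi^{t-1} + \frac{1}{D^{t}}\,\bigl(e^{t} - \phi^{t-1}\bigr) \\
  &= \phi^{t-1} + \frac{1}{D^{t}}\,
     \Bigl( \underbrace{p(\phi^{t-1}) - \phi^{t-1}}_{F(\phi^{t-1})}
          + \underbrace{e^{t} - p(\phi^{t-1})}_{U^{t}} \Bigr),
\end{align*}
% Since E[e^t | F^{t-1}] = p(\phi^{t-1}), the noise term U^t has zero conditional mean.
% The step size 1/D^t = 1/(D^0 + t) is positive, tends to 0, and is not summable.
```

The drift term $F(\phi) = p(\phi) - \phi$ is exactly the difference between one step of the deterministic update and the current market share.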

Convergence of Mirror Descent
Recall that a function $f$ is $L$-Bregman-convex with respect to the Bregman divergence $d_h$ if, for all $x, y \in C$, $f(x) + \langle \nabla f(x), y - x \rangle \le f(y) \le f(x) + \langle \nabla f(x), y - x \rangle + L \cdot d_h(y, x)$. Given an $L$-Bregman-convex function $f$ with respect to the Bregman divergence $d_h$, the mirror descent rule with respect to $d_h$ is given by $x^{t+1} = g(x^t)$, where
$$g(x) = \arg\min_{y \in C} \big\{ f(x) + \langle \nabla f(x), y - x \rangle + L \cdot d_h(y, x) \big\}. \qquad (17)$$
The update in (17) is the same as that of a general mirror descent (9), with step-size $\alpha = 1/L$. It enables us to use the following theorem to bound the difference to the optimal value.
Theorem 4.4. [Chen and Teboulle [15]] Suppose $f$ is an $L$-Bregman-convex function with respect to $d_h$, and $x^t$ is the point reached after $t$ applications of the mirror descent update rule $x^{t+1} = g(x^t)$, where $g$ is as defined in (17). Then for all $t \ge 1$,
$$f(x^t) - f(x^*) \le \frac{L \cdot d_h(x^*, x^0)}{t},$$
where $x^*$ is a minimiser of $f$ over $C$.
Thankfully, the objective functions for both the homogeneous (8) and heterogeneous (14) cases are Bregman convex; the proofs are presented in Appendices B and C respectively. Finally, we use the theorem below to complete the proof. Note that what we have just shown is the convergence of discrete-time mirror descent updates of the form $x^t = g(x^{t-1})$, but the theorem requires conditions that guarantee convergence of the continuous-time ODE system $\dot{x} = g(x) - x$. To apply the theorem, we need to convert the discrete-time convergence to its ODE analogue. The conversion is simple, and it is presented in Appendix D.
Figure caption (continued): The lines denote the median of 50 simulations with different random initialisations. (Right) Efficiency over the entropy of market shares in the heterogeneous setting. The lines denote the median of efficiency over 50 simulations with different random initialisations, markers denote iterations 1000, 100,000, 200,000 and 300,000, and error bars represent the 25th to 75th percentile range in both efficiency and entropy.
Theorem 4.5. Consider an ODE $\dot{x} = h(x)$. Suppose there is a continuously differentiable function $f : \mathbb{R}^d \to \mathbb{R}$ such that (i) $\lim_{\|x\| \to \infty} f(x) = +\infty$; (ii) the set of minimum points of $f$, $X^*$, is non-empty; and (iii) $\langle \nabla f(x), h(x) \rangle \le 0$ for all $x$, with equality if and only if $x \in X^*$. Then almost surely, the Robbins-Monro algorithm of the ODE converges to a non-empty subset of $X^*$.
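The convergence bound for mirror descent can be illustrated on the homogeneous objective. The sketch below makes several assumptions that should be read as such: the objective is $f = -\Psi$ with $\Psi(\phi) = \sum_j \phi_j \log \bar{q}_j - (1-r) \sum_j \phi_j \log \phi_j$; the Bregman divergence is the KL divergence; the Bregman-convexity constant is $L = 1$ (matching step-size $\alpha = 1$); and the bound has the standard form $f(x^t) - f(x^*) \le L \cdot d_h(x^*, x^0)/t$. Names and numbers are hypothetical.

```python
import math


def t_o_update(phi, q_bar, r):
    """Deterministic homogeneous update: phi_j proportional to q_bar_j * phi_j**r."""
    weights = [qb * p ** r for p, qb in zip(phi, q_bar)]
    total = sum(weights)
    return [x / total for x in weights]


def neg_psi(phi, q_bar, r):
    """f = -Psi, the function that mirror descent minimises in this sketch."""
    utility = sum(p * math.log(qb) for p, qb in zip(phi, q_bar))
    neg_entropy = sum(p * math.log(p) for p in phi if p > 0)
    return -utility + (1.0 - r) * neg_entropy


def kl(x, y):
    """KL divergence between two points of the simplex."""
    return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)


def minimiser(q_bar, r):
    """Minimiser of f on the simplex: phi_j proportional to q_bar_j**(1/(1-r))."""
    powers = [qb ** (1.0 / (1.0 - r)) for qb in q_bar]
    total = sum(powers)
    return [x / total for x in powers]


q_bar = [0.9, 1.0, 0.6]  # illustrative values
r = 0.5
phi_star = minimiser(q_bar, r)
f_star = neg_psi(phi_star, q_bar, r)
phi = [1.0 / 3.0] * 3
d0 = kl(phi_star, phi)  # divergence from the starting point, with L = 1

gaps_and_bounds = []
for t in range(1, 21):
    phi = t_o_update(phi, q_bar, r)
    gaps_and_bounds.append((neg_psi(phi, q_bar, r) - f_star, d0 / t))
print(gaps_and_bounds[:3])
```

In this example the optimality gap shrinks much faster than the $1/t$ bound requires, which is consistent with the bound being a worst-case guarantee.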

Empirical observations
We simulate cultural markets using real-world preferences from the well-known MovieLens dataset [37], in order to explore the efficiency and diversity of the market in homogeneous and heterogeneous settings, and under different ranking strategies 2 .he simulations aim to answer key questions such as whether heterogeneous T-O market is more efficient, whether T-O market is stable as prescribed in Section 3, and whether stability sacrifices diversity.
We set up the simulation using the MovieLens-100K dataset [37].This dataset consists of 100,000 ratings (valued 1-5) from 943 users on 1682 movies, where each user has rated at least 20 movies.We performed matrix completion using incomplete SVD [5] via the Surprise python package 3 , yielding a preference matrix Γ = [γ ij ] ∈ R 943×1682 for each (user, movie) pair.With γ ij ∈ [0, 1] denoting the normalised preference of user i ∈ U and item (movie) j ∈ I. Denote O as the set consisting of all indices (i, j) with γ i,j observed (in the MovieLens-100K dataset), and O as the set of unobserved indices (entries estimated with incomplete SVD).
To simulate heterogeneous preference types, we divide the users into M non-overlapping subgroups based on user attributes.Let M i=1 U i = U, where U i denote the set of users in group i, and hence the weights w i = |U i |/|U|.We first calculate the visibility factor vij by averaging over the set of observed entries generated by user group i using (18).We also calculate the quality factor q ij by averaging over the set of unobserved entries generated by user group i using (19).This choice reflects the intuition that in a T-O market, the probability of purchase depends on a quality factor that is often unknown to the platform a priori before a user tries an item.Other strategies for estimating q ij and vij are left for future work.
We cluster users into 100 groups using K-means on the rows of $\Gamma$. This grouping is used to compare market efficiency and diversity in homogeneous (all users have the same preferences) and heterogeneous (100 user groups) settings. Results for other group compositions are in the online supplement [58, Appendix E] and are qualitatively similar. We also account for position bias in ranked lists [26] in order to construct the actual visibility factor $v_{ij}$, which can vary over time and is then denoted $v^t_{ij}$. Prior MusicLab models have included position bias parameters [42,45] for the top-50 items, with zero visibility assigned to all other items. This results in a list of fixed weights $\iota_k$, $k = 1, \ldots, 1682$, where $\iota_1, \ldots, \iota_{50} > 0$ and $\iota_{51} = \cdots = \iota_{1682} = 0$. We adopt the separable click-through rate (CTR) model commonly used in modelling auctions [2], which simply multiplies the estimated visibility term $\bar{v}_{ij}$ by a ranking factor $\eta^t_{ij}$ of item $j$ presented to user $i$ at time $t$. Different ranking strategies produce different $\eta^t_{ij}$ to modulate $\bar{v}_{ij}$, which in turn results in different probability distributions in the trial phase (equation (1)).
We introduce the following three ranking strategies, which define the relationship between the fixed visibility weights $\iota$ and the ranking factors $\eta^t_{i,:}$ for user $i$ at time $t$ over all items. In all strategies $\eta^t_{ij} = \iota_k$, with the index $k$ assigned randomly, by popularity, or by quality, respectively.
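• Random-ranking. Upon each simulation round, the visibility term is $\eta^t_{ij} = \iota_k$, with $k$ taken from a random permutation of the item positions and looked up at item $j$, where $[j]$ denotes array indexing. This ranking changes at every simulation step. One expects such a strategy to promote diversity while preserving some information on item appeal through $\bar{v}_{ij}$.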
• Popularity-ranking. This strategy sorts the items by descending market share $\phi^t$, with $k = \operatorname{arg\,sort}_{\mathrm{desc}}\{\phi^t\}[j]$. This ranking changes over the simulation steps and is analogous to the original MusicLab experimental setting [49]. One expects this strategy to be unstable because of randomness early in the simulation: it can accidentally promote items that users do not like to the top, leaving high-quality items buried.
• Quality-ranking. Denote the descending sorting rank of item $j$ within user group $i$ as $k = \operatorname{arg\,sort}_{\mathrm{desc}}\{q_{i,:}\}[j]$, where $q_{i,:}$ is the one-dimensional array of quality factors in group $i$. This ranking does not change over the simulation steps, since both $q_{ij}$ and its sorted order remain fixed. One expects this strategy to best align visibility with the underlying quality metrics (unobserved before trying), since it has oracle access to $q_{ij}$, and it should therefore yield high efficiency.
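The three strategies differ only in how the rank position of each item is chosen. A minimal sketch, assuming the position weights are stored in a list `iota` indexed by rank (0-based) and that ties are broken by item index; the function name `ranking_factors` and the toy weights are hypothetical:

```python
import random


def ranking_factors(strategy, iota, market_share=None, quality=None, rng=None):
    """Return eta for one user type, with eta[j] = iota[k] and k the rank position of item j."""
    n = len(iota)
    if strategy == "random":
        rng = rng or random.Random()
        order = list(range(n))
        rng.shuffle(order)                                   # fresh permutation each call
    elif strategy == "popularity":
        order = sorted(range(n), key=lambda j: -market_share[j])   # descending market share
    elif strategy == "quality":
        order = sorted(range(n), key=lambda j: -quality[j])        # descending quality
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    eta = [0.0] * n
    for k, j in enumerate(order):       # order[k] is the item shown at position k
        eta[j] = iota[k]
    return eta


# Hypothetical example with 4 items, of which only the top 2 positions are visible.
iota = [1.0, 0.5, 0.0, 0.0]
eta_quality = ranking_factors("quality", iota, quality=[0.2, 0.9, 0.5, 0.1])
eta_popular = ranking_factors("popularity", iota, market_share=[0.4, 0.1, 0.3, 0.2])
eta_random = ranking_factors("random", iota, rng=random.Random(0))
```

Under the separable CTR model, the time-varying visibility would then be the elementwise product of the estimated visibility and `eta`.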
In each round of the simulation, one new user arrives at the market and chooses an item for a trial according to the multinomial logit model (equation (1)). The user then decides whether to purchase this item by flipping a biased coin parameterised by $q_{ij}$. Note that these new users are generalisations of the $M$ groups of user populations in MovieLens via the attributes $q_{ij}$ and $\bar{v}_{ij}$, rather than subsets (or samples) of the original 943 users. This setting is consistent with other theoretical and simulation studies of cultural markets and recommender systems [42,39]. We report market efficiency as the empirical version of Definition 2.4, namely, the fraction of users who made a purchase. We also measure diversity among the set of items by computing the Shannon entropy of the market share $\phi$ (Equation (5)) at each time step. We explore the relationship between these two metrics.
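A single simulation run can be sketched as below. Equation (1) is not reproduced in this section, so the trial probability is assumed here to be proportional to $v_{ij}\,\phi_j^{r}$; the uniform pseudo-counts that keep every market share positive, the fixed visibility matrix, and the function name `simulate` are likewise assumptions made for illustration.

```python
import random


def simulate(weights, visibility, quality, r, rounds, seed=0):
    """weights[i]: share of user type i; visibility[i][j] and quality[i][j] lie in [0, 1].
    Returns (efficiency, market_shares)."""
    rng = random.Random(seed)
    n_types, n_items = len(weights), len(visibility[0])
    purchases = [1.0] * n_items          # assumed pseudo-counts so that phi_j > 0
    bought = 0
    for _ in range(rounds):
        i = rng.choices(range(n_types), weights=weights)[0]     # type of the arriving user
        total = sum(purchases)
        phi = [c / total for c in purchases]                    # current market shares
        # Assumed stand-in for equation (1): appeal proportional to v_ij * phi_j^r.
        appeal = [visibility[i][j] * phi[j] ** r for j in range(n_items)]
        j = rng.choices(range(n_items), weights=appeal)[0]      # trial phase
        if rng.random() < quality[i][j]:                        # offer phase: biased coin
            purchases[j] += 1.0
            bought += 1
    total = sum(purchases)
    return bought / rounds, [c / total for c in purchases]


# Hypothetical market with two user types and three items.
efficiency, shares = simulate(
    weights=[0.5, 0.5],
    visibility=[[1.0, 0.5, 0.2], [0.2, 0.5, 1.0]],
    quality=[[0.9, 0.3, 0.1], [0.1, 0.3, 0.9]],
    r=0.5,
    rounds=2000,
)
```

`efficiency` is the empirical fraction of arriving users who made a purchase, matching the metric reported in the text.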
Figure 2 summarises the trends of efficiency and diversity over the two settings (homogeneous and heterogeneous) and the three ranking strategies (random, popularity and quality). In Figure 2 (left), the quality-ranking oracle has the highest efficiency among the three ranking strategies, followed by popularity ranking, and random ranking has the lowest efficiency once there is a sufficient number of users. Taking heterogeneous user preferences into account improves efficiency in both the quality and popularity ranking settings. We also notice that, in the MovieLens-100K dataset, the gaps between the ranking strategies are larger than the gap between the homogeneous and heterogeneous settings.
Figure 2 (right) compares both efficiency and diversity (measured by the entropy of market shares) at different iterations for different ranking strategies in the heterogeneous setting. For popularity and quality rankings, item diversity decreases over time (curves moving leftwards) while efficiency increases (curves moving slightly up). Comparing the two, quality ranking yields more diversity across the items (higher entropy) and is more stable (smaller spread on both dimensions). This observation corroborates Proposition 3.5, which states that the heterogeneous MusicLab objective is ex-post concave. Popularity ranking results in larger variations in both efficiency and entropy, confirming observations in the original MusicLab experiment [49] that market allocation is unstable due to random initialisations and results in market dominance by a few popular items. As a control group, random ranking yields the lowest efficiency and shows no apparent difference between homogeneous and heterogeneous user preferences. In this setting, efficiency still improves slightly due to the joint effect of the visibility and quality terms, but item diversity stays close to the theoretical maximum in entropy ($\ln(1682) \approx 7.4$ nats) over time, indicating that the random ranking with cut-off at the top 50 items plays a larger role in user choice than the signals present in the visibility and quality terms.
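For reference, the entropy scale of the diversity axis can be checked directly. The sketch below computes the Shannon entropy (in nats) of a uniform market share over the 1682 MovieLens-100K items, which is the largest value the diversity metric can take, and contrasts it with a hypothetical market share concentrated uniformly on 50 items; the function name `shannon_entropy` is an illustrative choice.

```python
import math


def shannon_entropy(phi):
    """Shannon entropy in nats; items with zero market share contribute nothing."""
    return -sum(p * math.log(p) for p in phi if p > 0)


n_items = 1682
uniform = [1.0 / n_items] * n_items
max_entropy = shannon_entropy(uniform)            # equals ln(1682)

top50 = [1.0 / 50] * 50 + [0.0] * (n_items - 50)  # hypothetical share held by 50 items only
top50_entropy = shannon_entropy(top50)            # equals ln(50)
```

The uniform case gives about 7.43 nats, while a market confined to 50 equally popular items cannot exceed about 3.91 nats.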

Conclusion
This paper views the dynamics of cultural markets under an optimization lens. We identify new objective functions for trial-offer markets, and establish robust connections between social feedback signals and optimization processes. Our results narrow the gap between the theory and practice of recommender systems. In particular, they make the analysis of recommender systems more versatile by incorporating user-specific preferences, and offer a holistic view of market stability and efficiency beyond individual clicks and views. Simulations using real-world user preferences confirm that markets with heterogeneous preferences are more stable and more efficient.
Our work leads to several open research questions, such as convergence rates of the stochastic T-O markets, analysis of general heterogeneous T-O settings, fairness properties of market equilibria, and describing markets that are also learning a recommender system in the loop [46]. More generally, we hope the current work opens up new ways of asking and answering a set of research questions at the intersection of classical markets and online attention.

A Properties of TOME
In this section, we present the proofs of Theorem 2.3 and Lemma 2.5.

A.1 Proof of Theorem 2.3
By the definition of $y_j(\phi)$ in (2), the following inequality holds for any $\phi \in \Delta$ and $j \in I$: where $i^*(j)$ is a user type with $v_{i^*(j),j}\, q_{i^*(j),j} > 0$, which exists due to condition (ii). If $\phi_j \ge (c_j)^{1/(1-r_{i^*(j)})}$, then On the other hand, recall that $p_j(\phi) = y_j(\phi) / \sum_{k \in I} y_k(\phi)$. Let $S$ denote the set $\phi \in \Delta$ : hence $S$ is non-empty. Moreover, $S$ is compact and convex. Due to (21) and (22), $p$ is a continuous function that maps $S$ to a subset of $S$. By Brouwer's fixed point theorem, $p$ has a fixed point in $S$, which is a TOME of the market.
A.2 Proof of Lemma 2.5
(i) Suppose $\phi^* = (\phi^*_j)$ is a TOME. By the definition of TOME, there exists a real number $c$ such that for any $j \in I$, Recall that we set b Now, suppose that $b^{t-1} = b^*$. By (6), (23) and (24), Thus, if $b^{t-1} = b^*$, then $b^t = b^*$ too. We conclude that $b^*$ is a fixed point of the dynamic (6).
(ii) Note that Thus, for any item $j$ we have $\phi^*$, which implies $\phi^*$ is a TOME by definition.

B Homogeneous Markets
B.1 Proof of Theorem 3.1
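Optimization Problem (7). When $r \ge 1$, since $\phi_j \in [0, 1]$, we have $\phi_j^r \le \phi_j$. Let $\bar{q} = \max_j \{\bar{q}_j : j \in I\}$. We have $\sum_j \bar{q}_j \phi_j^r \le \sum_j \bar{q}_j \phi_j \le \bar{q} \sum_j \phi_j = \bar{q}$. If $r > 1$, the above inequalities hold with equality only if $\phi_j = 1$ for an item $j$ satisfying $\bar{q}_j = \bar{q}$, and $\phi_k = 0$ for all $k \ne j$.
It is easy to verify that every such $\phi$ is a TOME (see Definition 2.2). If $r = 1$, the above inequalities hold with equality only if $\phi_j > 0 \Rightarrow \bar{q}_j = \bar{q}$. Again, it is easy to verify that every such $\phi$ is a TOME.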
When $0 < r < 1$, (7) is a convex program, so we can employ the standard convex analysis tools of Lagrangian multipliers and the Karush-Kuhn-Tucker (KKT) theorem to characterize the optimal solution. We first transform the maximisation problem into a minimisation problem for simplicity: minimise $-\sum_{j \in I} \bar{q}_j \phi_j^r$ subject to $\sum_{j \in I} \phi_j = 1$ and $\phi_j \ge 0$ for all $j \in I$. The Lagrangian, with dual variables $\lambda \in \mathbb{R}$ and $\eta_j \ge 0$ for any $j \in I$, is $L_1 = -\sum_{j \in I} \bar{q}_j \phi_j^r + \lambda \big( \sum_{j \in I} \phi_j - 1 \big) - \sum_{j \in I} \eta_j \phi_j$.
Then for any $j \in I$, $\partial L_1 / \partial \phi_j = -r \bar{q}_j \phi_j^{r-1} + \lambda - \eta_j = 0$. Since $r - 1 < 0$, for the above equality to hold, $\phi_j = 0$ is impossible if $\bar{q}_j > 0$. By the KKT theorem, at the optimum $\eta_j = 0$ for any $j$ with $\bar{q}_j > 0$, and hence $-r \bar{q}_j \phi_j^{r-1} + \lambda = 0$. Thus, $\phi_j^{1-r} / \bar{q}_j$ is the same for all $j$ with $\bar{q}_j > 0$ (with common value $r/\lambda$), i.e., $\phi_j^{1-r} \propto \bar{q}_j$. On the other hand, if $\bar{q}_j = 0$, then clearly $\phi_j = 0$ at the optimum. Together with the constraint $\sum_{j=1}^{|I|} \phi_j = 1$, we can solve for the optimal solution analytically, which coincides with the TOME specified in (4).
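The KKT characterisation can be sanity-checked numerically. The sketch below assumes the objective of (7) is $\sum_j \bar{q}_j \phi_j^r$ over the simplex, as used in the derivation above, and compares the closed form $\phi_j \propto \bar{q}_j^{1/(1-r)}$ against randomly drawn points of the simplex; the coefficient values and function names are hypothetical.

```python
import random


def objective(phi, q, r):
    """Assumed objective of problem (7): sum_j q_j * phi_j^r."""
    return sum(qj * p ** r for qj, p in zip(q, phi))


def closed_form(q, r):
    """KKT solution for 0 < r < 1: phi_j proportional to q_j^(1 / (1 - r))."""
    raw = [qj ** (1.0 / (1.0 - r)) for qj in q]
    total = sum(raw)
    return [x / total for x in raw]


def smallest_gap(q, r, trials=1000, seed=0):
    """Smallest value of objective(closed form) - objective(random simplex point)."""
    rng = random.Random(seed)
    best = objective(closed_form(q, r), q, r)
    gap = float("inf")
    for _ in range(trials):
        draw = [rng.random() for _ in q]
        s = sum(draw)
        candidate = [d / s for d in draw]          # random point on the simplex
        gap = min(gap, best - objective(candidate, q, r))
    return gap


q, r = [0.9, 0.5, 0.2, 0.7], 0.5
phi_star = closed_form(q, r)
gap = smallest_gap(q, r)
ratios = [p ** (1.0 - r) / qj for p, qj in zip(phi_star, q)]   # constant across items
```

A non-negative `gap` is consistent with the closed form being the maximiser, and the entries of `ratios` coincide as the proof requires.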
Optimization Problem (8). Note that the objective function can be decomposed into two parts, $\sum_j \phi_j \log \bar{q}_j$ and $(1 - r)$ times the Shannon entropy. When $r > 1$, we have $1 - r < 0$, so the second part is maximized when the market share is pure, i.e., $\phi_j = 1$ for some $j$ and $\phi_k = 0$ for all $k \ne j$. The first part is maximized when $\phi_j = 1$ for some $j$ satisfying $\bar{q}_j = \bar{q} = \max_k \bar{q}_k$, and $\phi_k = 0$ for all $k \ne j$. Note that this condition is stronger than the one for the second part, and that it is the same optimality condition as the one we presented for (7).
When $r = 1$, the optimality condition is again the same as what we presented for (7): $\phi_j > 0 \Rightarrow \bar{q}_j = \bar{q}$. When $0 < r < 1$, the objective function is concave, so (8) is a convex program. The Lagrangian is Then for any $j \in I$, Since $\log \phi_j \to -\infty$ as $\phi_j \to 0$, $\phi_j = 0$ is impossible. By the KKT theorem, at the optimum $\eta_j = 0$ for all $j$.
Thus, $\log(\phi_j^{1-r} / \bar{q}_j)$ is the same for all $j$ (with common value $r - 1 - \lambda$), i.e., $\phi_j^{1-r} \propto \bar{q}_j$. This is the same optimality condition as the one we presented for (7).
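For reference, the Lagrangian step for problem (8) can be written out as follows. This is a sketch that assumes, following the decomposition described above, that the minimisation form of (8) is $-\sum_j \phi_j \log \bar{q}_j + (1 - r) \sum_j \phi_j \log \phi_j$ over the simplex; the name $L_2$ and the symbol $\bar{q}_j$ for the per-item coefficient are notational assumptions.

```latex
\begin{align*}
L_2(\phi, \lambda, \eta)
  &= -\sum_{j \in I} \phi_j \log \bar{q}_j
     + (1 - r) \sum_{j \in I} \phi_j \log \phi_j
     + \lambda \Big( \sum_{j \in I} \phi_j - 1 \Big)
     - \sum_{j \in I} \eta_j \phi_j, \\
\frac{\partial L_2}{\partial \phi_j}
  &= -\log \bar{q}_j + (1 - r)\,(\log \phi_j + 1) + \lambda - \eta_j = 0, \\
\eta_j = 0 \;\Longrightarrow\;
\log \frac{\phi_j^{1-r}}{\bar{q}_j}
  &= r - 1 - \lambda .
\end{align*}
```

Under these assumptions the right-hand side of the last line does not depend on $j$, which gives $\phi_j^{1-r} \propto \bar{q}_j$.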

B.2 Proof of Lemma 3.3
Mirror descent is used to minimize a function, so to use the mirror descent optimization tools more conveniently, in the proofs below we let $\Psi$ be the negative of the objective function in (8). Note that Thus, the mirror descent update rule is⁴ Let the function in the arg min be $g$. Then To optimize $g$ in the simplex $\Delta$, by the KKT conditions it suffices that the above partial derivative is the same for all $j$, i.e., $\phi_j / (\bar{q}_j (\phi^{t-1}_j)^r)$ is the same for all $j$. Together with the constraint $\phi \in \Delta$, we can solve for $\phi$ analytically, which gives $p(\phi^{t-1})$ as in (3).
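The equivalence can be checked numerically. The sketch below assumes $\Psi(\phi) = -\sum_j \phi_j \log \bar{q}_j + (1 - r) \sum_j \phi_j \log \phi_j$ and a unit step size, so that the mirror descent step with the KL divergence has the usual multiplicative form $\phi_j \propto \phi^{t-1}_j \exp(-\partial \Psi / \partial \phi_j)$; it then compares this step with the update $\phi_j \propto \bar{q}_j (\phi^{t-1}_j)^r$. Function names and numerical values are hypothetical.

```python
import math


def normalise(raw):
    total = sum(raw)
    return [x / total for x in raw]


def grad_psi(phi, q, r):
    """Gradient of the assumed Psi: -log q_j + (1 - r) * (log phi_j + 1)."""
    return [-math.log(qj) + (1.0 - r) * (math.log(p) + 1.0) for qj, p in zip(q, phi)]


def mirror_step(phi, q, r):
    """Mirror descent step w.r.t. the KL divergence with unit step size."""
    g = grad_psi(phi, q, r)
    return normalise([p * math.exp(-gj) for p, gj in zip(phi, g)])


def proportional_step(phi, q, r):
    """Influence dynamic: phi_j proportional to q_j * phi_j^r."""
    return normalise([qj * p ** r for qj, p in zip(q, phi)])


q, r = [0.9, 0.5, 0.2, 0.7], 0.5
phi = [0.25, 0.25, 0.25, 0.25]
step_gap = max(abs(a - b) for a, b in zip(mirror_step(phi, q, r), proportional_step(phi, q, r)))

for _ in range(200):                        # iterate the dynamic from the uniform start
    phi = proportional_step(phi, q, r)
target = normalise([qj ** (1.0 / (1.0 - r)) for qj in q])    # phi_j^(1-r) proportional to q_j
fixed_point_gap = max(abs(a - b) for a, b in zip(phi, target))
```

Both gaps are at the level of floating-point error in this example: the two update rules agree, and the iteration settles at the point with $\phi_j^{1-r} \propto \bar{q}_j$.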

B.3 Convergence in Deterministic Homogeneous T-O Market
As an intermediate conceptual step towards our final target of showing convergence in the stochastic T-O market, we show the analogous result in the corresponding deterministic T-O market. Having proved Lemma 3.3 above, we use the approach laid out in Section 4.2 to show that the update rule (10) converges to the TOME. To apply Theorem 4.4, we need to show that $\Psi$ (again, the negative of the objective function of (8)) is 1-Bregman-convex w.r.t. the KL divergence. For any $\phi, \phi' \in \Delta$,

C Heterogeneous Markets
C.1 Proof of Proposition 3.5
Let $f$ be the logarithm of the objective function; then the partial derivatives are given by We note that if $\phi^*_j = 0$ for some $j \in I$, then this component clearly satisfies the equilibrium equation. Therefore, it suffices to consider the set $Q \subset I$ which includes all item indices $j$ with $\phi^*_j > 0$. We have $\sum_{j \in Q} \phi^*_j = 1$. By the KKT theorem, we have for all $j \in Q$. Multiplying (25) by $\phi_j$ and summing over all $j \in Q$ gives Plugging this into (25) we get , when setting $\phi = \phi^*$ in the RHS of the above formula, it fulfills the definition of TOME (recall Lemma 2.1 and Definition 2.2).

C.2 Reduction to Homogeneous Market for
Recall that in a T-O market each user tries an item according to a stochastic process, followed by a random decision to purchase that item or not. In the heterogeneous setting, if for each item $j$ we have $v_{ij} = v_j$ for all $i \in U$, then all users follow the same stochastic process when choosing which item to try. The model is then equivalent to one in which all users belong to the same type (i.e., homogeneous), but the probability $\bar{q}_j$ of purchasing an item $j$ after trying it is a weighted average of the $q_{ij}$ over all $i \in U$. The following proposition formally summarises this idea.
Proposition C.1. Suppose that $r_i = r$ for all $i \in U$. Also, for every item $j \in I$, we have $v_{ij} = v_j$ for all $i \in U$.
Then the TOME $\phi^*$ can be written as $\phi^*_j \propto (\bar{q}_j v_j)^{1/(1-r)}$, where $\bar{q}_j = \sum_{i=1}^{|U|} w_i q_{ij}$.
Proof: When $r_i = r$ and $v_{ij} = v_j$, the function $y_j(\phi)$ in (2) can be written as The RHS is the same as $y_j(\phi)$ for a homogeneous market with $w_1 = 1$, the same values of the $v_j$, and with $\bar{q}_j = \sum_{i=1}^{|U|} w_i q_{ij}$.
C.3 Proof of Lemma 3.6
Note that $\sum_{i,j} x_{ij} = \sum_i w_i = 1$ due to the constraints. We first show that the objective function is 1-Bregman-convex w.r.t. the KL divergence on the variables $X$. Note that $\partial \Gamma / \partial x_{ij} = -r \log(x_j q_j) + \log(x_{ij} / v_{ij}) + 1 - r$. Then for any $X, X'$ in the domain, a direct calculation shows that $\Gamma(X') - \Gamma(X) - \langle \nabla \Gamma(X), X' - X \rangle = \mathrm{KL}(X', X) - r \cdot \mathrm{KL}(y', y)$, where $y' = (x'_1, x'_2, \ldots, x'_{|I|})$ and $y = (x_1, x_2, \ldots, x_{|I|})$. Since $X', X$ are refinements of $y', y$ respectively, $0 \le \mathrm{KL}(y', y) \le \mathrm{KL}(X', X)$. Since $0 < r < 1$, $0 \le \Gamma(X') - \Gamma(X) - \langle \nabla \Gamma(X), X' - X \rangle \le \mathrm{KL}(X', X)$, which demonstrates that $\Gamma$ is 1-Bregman-convex w.r.t. the KL divergence.
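Next, we compute the mirror descent update rule derived from (17), which is $X^t = \arg\min_X \big\{ \sum_{i,j} \big[ -r \log(x^{t-1}_j q_j) + \log(x^{t-1}_{ij} / v_{ij}) + 1 - r \big] x_{ij} + \mathrm{KL}(X, X^{t-1}) \big\}$. Let the function in the arg min be $g$. Then $\partial g / \partial x_{ij} = -r \log(x^{t-1}_j q_j) + \log(x_{ij} / v_{ij}) + 1 - r = \log \frac{x_{ij}}{v_{ij} (q_j x^{t-1}_j)^r} + 1 - r$. To optimize $g$ subject to the constraints in the optimization problem, by the KKT conditions it suffices that for each user type $i$, the above partial derivative is the same for all $j$, i.e., $x_{ij} / (v_{ij} (q_j x^{t-1}_j)^r)$ is the same for all $j$. Together with the constraint $\sum_j x_{ij} = w_i$, we can derive the optimal solution, which is $x^t_{ij} = w_i \, v_{ij} (q_j x^{t-1}_j)^r / \sum_{k} v_{ik} (q_k x^{t-1}_k)^r$.
Finally, recall that we have adopted the variable substitution $x_{ij} = b_{ij}/q_j$. Converting the above update rule back to the domain with variables $b_{ij}$, and writing $b_j = \sum_i b_{ij}$, we have $b^t_{ij} = w_i \, q_j v_{ij} (b^{t-1}_j)^r / \sum_{k} v_{ik} (b^{t-1}_k)^r$, which matches (6).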
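The update in the $x$ variables can be sketched in a few lines. The rule implemented below is the one implied by the KKT condition above ($x_{ij}$ proportional to $v_{ij}(q_j x^{t-1}_j)^r$ within each user type, normalised to $w_i$); the function name `mirror_update` and all numerical values are hypothetical.

```python
def mirror_update(x_prev, weights, visibility, q, r):
    """One step of x_ij = w_i * v_ij * (q_j x_j)^r / sum_k v_ik * (q_k x_k)^r,
    where x_j = sum_i x_prev[i][j] aggregates over user types."""
    n_types, n_items = len(x_prev), len(q)
    agg = [sum(x_prev[i][j] for i in range(n_types)) for j in range(n_items)]
    push = [(q[j] * agg[j]) ** r for j in range(n_items)]
    x_new = []
    for i in range(n_types):
        raw = [visibility[i][j] * push[j] for j in range(n_items)]
        total = sum(raw)
        x_new.append([weights[i] * v / total for v in raw])   # row i sums to w_i
    return x_new


# Hypothetical market: two user types, three items, common exponent r.
weights = [0.6, 0.4]
visibility = [[1.0, 0.5, 0.2], [0.2, 0.5, 1.0]]
q = [0.8, 0.6, 0.9]
x = [[w / 3.0] * 3 for w in weights]          # start from a uniform split per type
for _ in range(100):
    x = mirror_update(x, weights, visibility, q, r=0.5)

b = [[q[j] * x[i][j] for j in range(3)] for i in range(2)]    # back-substitute b_ij = q_j * x_ij
```

Each row of `x` keeps summing to the corresponding weight $w_i$ after every step, so the iterates stay feasible for the transformed problem.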

Definition 4.1 ([6, 48]). A Robbins-Monro algorithm (RMA) is a discrete-time stochastic process $z_t$ whose general structure is specified by z
Let $d_h(\cdot, \cdot)$ denote the Bregman divergence in Definition 3.2. We assume that the function $h$ is strictly convex. Consequently, $d_h(\cdot, \cdot)$ is strictly convex in its first parameter, and $d_h(x, y) = 0$ if and only if $x = y$.
Definition 4.3. A function $f$ is $L$-Bregman-convex with respect to the Bregman divergence $d_h$ if for any $y \in \mathrm{rint}(C)$ and $x \in C$, $f(y) + \langle \nabla f(y), x - y \rangle \le f(x) \le f(y) + \langle \nabla f(y), x - y \rangle + L \cdot d_h(x, y)$.

Figure 2: Simulation results on the MovieLens dataset. Each simulation is run for 300,000 time steps (each introducing a new user), and a measurement is taken after every 1000 time steps. (Left) Market efficiency over time, comparing the homogeneous and heterogeneous settings under three ranking strategies (random/popularity/quality). The lines denote the median of 50 simulations with different random initialisations. (Right) Efficiency over the entropy of market shares in the heterogeneous setting. The lines denote the median of efficiency over 50 simulations with different random initialisations, markers denote iterations 1000, 100,000, 200,000 and 300,000, and error bars represent the 25th to 75th percentile range in both efficiency and entropy.

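First, we write down the transformed optimization problem explicitly. Recall that $\sum_{i=1}^{|U|} w_i = 1$ and $x_{ij} = b_{ij}/q_j$. Let $x_j = \sum_{i=1}^{|U|} x_{ij}$. The problem is to minimise $\Gamma(X) = -r \sum_{j=1}^{|I|} x_j \log(x_j q_j) + \sum_{i,j} x_{ij} \log(x_{ij} / v_{ij})$ subject to $\sum_j x_{ij} = w_i$ for all $i$, and $x_{ij} \ge 0$ for all $i, j$.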

Figure 3: Simulation results on MovieLens dataset with 50 user groups.