Informational Diversity and Affinity Bias in Team Growth Dynamics

Prior work has provided strong evidence that, within organizational settings, teams that bring a diversity of information and perspectives to a task are more effective than teams that do not. If this form of informational diversity confers performance advantages, why do we often see largely homogeneous teams in practice? One canonical argument is that the benefits of informational diversity are in tension with affinity bias. To better understand the impact of this tension on the makeup of teams, we analyze a sequential model of team formation in which individuals care about their team's performance (captured in terms of accurately predicting some future outcome based on a set of features) but experience a cost as a result of interacting with teammates who use different approaches to the prediction task. Our analysis of this simple model reveals a set of subtle behaviors that team-growth dynamics can exhibit: (i) from certain initial team compositions, they can make progress toward better performance but then get stuck partway to optimally diverse teams; while (ii) from other initial compositions, they can also move away from this optimal balance as the majority group tries to crowd out the opinions of the minority. The initial composition of the team can determine whether the dynamics will move toward or away from performance optimality, painting a path-dependent picture of inefficiencies in team compositions. Our results formalize a fundamental limitation of utility-based motivations to drive informational diversity in organizations and hint at interventions that may improve informational diversity and performance simultaneously.


Introduction
A long line of work in the social sciences has argued that, within organizational settings, groups that bring a diversity of perspectives to a task can be more effective than groups that do not. The combination of distinct perspectives makes more information available to a group, and can enable productive synergies among these sources of information, improving a team's performance (Page, 2008; Burt, 2004). Along with observations of this phenomenon in practice, a set of mathematical models has sought to formalize these types of informational advantages in abstract settings in which a group of agents engage in collective problem-solving (Hong and Page, 2004).
If this form of informational diversity confers performance advantages on teams within organizations, why do we so often see teams that are largely homogeneous in practice? A canonical argument is that the benefits of informational diversity are in tension with affinity bias, a human behavioral phenomenon in which people prefer to interact with others who have similar perspectives. This tendency is well-documented by prior work in organizational psychology (Huang et al., 2019; McCormick, 2015; Oberai and Anand, 2018).
It is an aggregate effect that can stem from a range of different underlying mechanisms: for example, it may arise because people have an inherent preference for others with similar perspectives, or because they have difficulty evaluating others with different perspectives, or because they prefer teams with fewer disagreements or those whose aggregate view is closer to their own. All of these would produce a version of affinity bias as an outcome. For purposes of our discussion here, we will focus on the observable effects of these mechanisms in the form of affinity bias without restricting ourselves to a specific underlying mechanism.
The tension between informational diversity and affinity bias is the basis for a number of empirical results establishing that informationally diverse teams can simultaneously produce higher-quality solutions and lower group cohesion (Phillips et al., 2009; Milliken and Martins, 1996; Watson et al., 1993). Such findings highlight the challenge in building informationally diverse teams: expanding a team by adding members with dramatically different perspectives has the potential to improve the team's performance but also to reduce the subjective value of the experience for participants due to their affinity bias. We are interested in understanding the fundamental phenomena that emerge from this conflict between informational diversity and affinity bias. In particular, what are the implications of this tension for the composition of teams that form as new members are brought on and the team grows in size?
The present work: modeling informational diversity with affinity bias. In this paper, we develop a model for team formation in the presence of both informational diversity and affinity bias. Whereas earlier models of informational diversity formulated agent-level objective functions in such a way that the agents should always favor greater diversity (see, e.g., Lamberson and Page, 2012), our class of models explicitly captures the tension between these forces in agents' utilities. In particular, in our model, agents forming a team are faced with a prediction task: they see instances of a prediction problem encoded by features, and they must make a prediction about some future outcome for each instance. A team of policymakers trying to predict the effect of policy interventions, a team of investors trying to predict which start-up companies will be successful, a team of doctors faced with complex medical diagnosis, or a team of data scientists participating in web-based competitions such as the Netflix Prize and Kaggle competitions are all among the types of scenarios captured by this framework.
While prior work assumes that agents aim to minimize the overall predictive error of their teams (Lamberson and Page, 2012), our approach uses these earlier formalisms as building blocks to produce a more general model in which each agent's objective function is the sum of two terms (capturing the two forces we are considering): one term is the error rate of the team, and the other term is the agent's level of dissimilarity to other team members. A one-dimensional parameter controls the relative weight of these two terms in the objective function; this general form for the objective function allows us to study extremes in which agents care primarily about team performance or primarily about team homogeneity.
We are particularly interested in the process by which teams grow over time, as they decide sequentially which new members to add. For any configuration of a team, we can ask which types of agents the team would be willing to add, where the criterion for adding a new member is that it improves the aggregate utility of the team members (via the weighted sum of team performance and individual disagreement with others). Our main takeaway from the analysis of this model is a path-dependent characterization of inefficiencies in team compositions formed through the above sequential growth dynamics. This characterization hints at organizational interventions that may improve informational diversity and performance simultaneously, including those that help reduce the impact of affinity bias on team formation dynamics, or those that initiate teams from a more informationally diverse composition (thereby beginning the dynamics at more favorable initial conditions).

Model Overview
We consider a setting in which a team consisting of multiple members is tasked with making complex, nonroutine decisions based on the members' collective predictions about some future outcome. Importantly, the task is complex and not further decomposable into specialized sub-tasks that can be accomplished independently. We have mentioned several examples of real-world settings in which these conditions are approximately met. As an example, consider aggregating the diverse forecasts of individual members of a marketing team to predict a new product's expected sales (Lamberson and Page, 2012).
Team problem-solving mechanism. Team members have predictive models of the world (e.g., predicting the expected sales of a product). Given a new case, each member uses their model to predict the outcome of the case. (In different contexts, this model can either be an abstraction of the team member's mental model of the domain, or it could be an actual implementation of a computational model that they have built. Our model operates at a level of generality designed to address both these scenarios.) For simplicity, we assume individuals belong to one of two opinion groups or types, with those belonging to the same type holding similar predictive models. More precisely, an agent belonging to a given type has access to a noisy version of that type's predictive model. The team aggregates its members' opinions into a collective prediction/decision using an aggregation mechanism, such as simple averaging.
Team's (dis)utility (λ). The team's cost function combines two factors: (1) the expected error rate of the team's predictions; and (2) the dissimilarity among team members' predictors. In particular, the team aims to minimize:
$$(1 - \lambda) \times (\text{expected error of the team's predictions}) + \lambda \times (\text{dissimilarity among team members}),$$
where the dissimilarity between two teammates is captured by the expected level of disagreement in their predictions for a randomly drawn case. Note that various choices for λ reflect different team preferences. For example, λ = 0 corresponds to a team which is solely concerned with improving accuracy, whereas λ = 1 corresponds to a team which only cares about minimizing internal disagreement. Intermediate λ values (e.g., λ = 0.5) capture teams that weigh both accuracy and similarity.
The effect of team size (β). To capture the relationship between team size and level of dissimilarity among team members, we introduce a parameter β ∈ [0, 1], which at a high level captures the psychological costs of cooperation (Boroş et al., 2010) and managing conflicts (Higashi and Yamamura, 1993) as the team grows in size. In particular, as described in Section 3, for a team of size n, we assume that the sum of pairwise disagreements among team members is normalized by $1/n^{1+\beta}$. This means that when β = 0 the psychological cost of disagreement grows linearly in the number of teammates who hold conflicting opinions, whereas when β = 1 the cost depends only on the fraction of agents with conflicting opinions, independent of team size. Values of β strictly between 0 and 1 interpolate between these extremes.
A parametric class of aggregation mechanisms (α). While prior work takes simple averaging as the team's approach to aggregating predictions, to better understand the effect of the aggregation mechanism on a team's composition, we consider a natural class of aggregation functions parameterized by α ≥ 0. This parametric family is akin to Tullock's contest success function (Jia et al., 2013; Skaperdas, 1996): a homogeneous subgroup of size m in the team has its opinion weighted by $m^{\alpha}$ in the overall team aggregation. In the case of only two opinion types, α = 1 corresponds to the uniform average, and the limit α → ∞ corresponds to the majority rule (or the median).
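The weighting just described can be sketched in a few lines. This is a minimal illustration, assuming each type's "opinion" is its members' mean prediction and that the two weights are normalized to sum to one; the function name `type_weights` is hypothetical and not taken from the paper.

```python
from typing import Tuple


def type_weights(n_a: int, n_b: int, alpha: float) -> Tuple[float, float]:
    """Return (weight on type A's mean prediction, weight on type B's).

    Assumption: a subgroup of size m receives weight proportional to m**alpha,
    and the two weights are normalized to sum to 1.
    """
    if n_a < 0 or n_b < 0 or n_a + n_b == 0:
        raise ValueError("team must be non-empty with non-negative counts")
    if n_a == 0:
        return 0.0, 1.0
    if n_b == 0:
        return 1.0, 0.0
    # Use the minority/majority ratio (<= 1) so a large alpha cannot overflow.
    ratio = (min(n_a, n_b) / max(n_a, n_b)) ** alpha
    majority = 1.0 / (1.0 + ratio)
    if n_a >= n_b:
        return majority, 1.0 - majority
    return 1.0 - majority, majority


if __name__ == "__main__":
    # A 3-vs-1 team: alpha = 1 is the uniform average over members,
    # while a large alpha lets the majority type dominate.
    for alpha in (0, 1, 5, 50):
        w_a, w_b = type_weights(3, 1, alpha)
        print(f"alpha={alpha:>2}: w_A={w_a:.4f}, w_B={w_b:.4f}")
```

With α = 1 the 3-vs-1 team puts weight 3/4 on type A, which is exactly what a uniform average over the four members would do; as α grows the minority's weight shrinks toward zero, matching the majority-rule limit.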

Team Growth Dynamics
We assume λ, β, and α are fixed throughout. At each time step, a new agent arrives and the current team considers whether to bring them on as a new member. We assume this decision is made by assessing whether the addition of the new agent to the team would reduce its disutility. Note that adding more than one member of each type may be desirable to the team for two distinct reasons: (a) since each agent's prediction is a noisy version of its type, adding multiple members of the same type leads to noise reduction; (b) having more than one member of each type might be necessary for the team to achieve the accuracy-optimal balance between types. (For example, if the accuracy-optimal composition consists of twice as many agents of type A compared to type B, a team starting with 1 member of each type may find it beneficial to hire another member of type A.)
For simplicity, in the basic version of our model, we assume that teams can precisely measure both their internal levels of dissimilarity and their predictive errors. In particular, we make the simplifying assumption that a team can perfectly estimate its current accuracy as well as changes in accuracy as a result of bringing on a new member. This assumption is common in prior work (see, e.g., Lamberson and Page, 2012; Hong and Page, 2020). Our key observation is that even with this optimistic assumption in place, teams fail to raise sufficient informational diversity to optimize accuracy. As discussed in Section 5, if teams underestimate the accuracy gains of increased informational diversity, the incentive to bring on diverse team members would only be further hampered. (We will further demonstrate the effect of both the under- and over-estimation of accuracy gains on team growth dynamics in Section 5.)
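To make reason (a) concrete, here is a brief worked calculation. It assumes, purely for illustration, that each of m same-type members predicts a common type-level signal s plus an independent, zero-mean noise term $\epsilon_i$ with variance $\sigma^2$ (the symbols s, m, and $\epsilon_i$ are introduced here for the example):

```latex
\operatorname{Var}\!\left(\frac{1}{m}\sum_{i=1}^{m}\bigl(s + \epsilon_i\bigr)\right)
  = \frac{1}{m^{2}}\sum_{i=1}^{m}\operatorname{Var}(\epsilon_i)
  = \frac{\sigma^{2}}{m}.
```

Under this assumption, averaging over four members of a type instead of one cuts the noise variance of that type's contribution to a quarter, even though no new information about the features is added.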
Figure 1: Preview of team growth dynamics. The x-axis specifies the number of type A members in the team and the y-axis, the number of type B members. Arrows point in the direction of disutility reduction. The eventual composition is highly path-dependent and often inefficient.

Insights from the Analysis
Our analysis characterizes the kind of teams that form as the result of the interplay between predictive accuracy and affinity bias, depending on the three primary parameters of the model:
• λ, or the relative impact of affinity bias vs. accuracy on team growth dynamics: In the extreme cases of λ = 0 and λ = 1, the team formation dynamics behave as one may expect: When the team only cares about accuracy (λ = 0), it reaches the accuracy (= utility) optimal composition regardless of its initial makeup. If the team solely cares about reducing disagreement (λ = 1), only the initial majority type can bring on more members of its own. For the intermediate values of λ, however, it is not a priori clear how inefficiency in the team's accuracy emerges as λ grows. One may expect a tipping point phenomenon, where λ has to be larger than a certain threshold to hinder the formation of accuracy-optimal teams. With relatively few assumptions, our analysis shows that the pattern is, in fact, markedly different: For any intermediate value of λ, team formation dynamics get stuck in local utility optima, failing to achieve accuracy-optimal compositions.
• α, or the mechanism by which individual predictions get aggregated into a team prediction: As α grows, the team's rule for arriving at a collective prediction varies smoothly from pure averaging to a median or majority rule type of function. In the process, the majority type's prediction has an increasingly dominant effect on the collective prediction. This leads to a dynamic in which the prospect of adding new members from the less-represented type produces negligible accuracy gains but nontrivial disagreement cost; as a result, new members from the less-represented group will not be added unless their relative size on the current team is already substantial enough.
• β, or the impact of team size on perceptions of within-team disagreements: As β becomes smaller, team size will play a more dominant role in affecting perceptions of dissimilarity/disagreement. As a result, the team will never add more than a certain number of the less-represented type. Depending on the initial composition of the team, the majority type may find it beneficial to continue adding new members of its own to drown out the predictions of the other type, and thereby drive down the cost arising from dissimilarity. When within-type disagreements are non-zero (which can be the case due to the noise in the predictions made by agents of the same type), the majority type may stop expanding itself to avoid increasing within-type disagreements.
Taken together, these principles suggest that team-growth dynamics can exhibit a set of subtle behaviors: (i) from certain initial team compositions, they can make progress toward better performance but then get stuck partway to optimally diverse teams; but (ii) from other initial compositions, they can also move away from this optimal balance as the majority group tries to crowd out the opinions of the minority. The initial composition of the team can determine whether the dynamics will move toward or away from performance optimality.
It is natural to visualize this process geometrically as taking place in a Cartesian plane where the point $(n_A, n_B)$ represents a team with $n_A$ members of group A and $n_B$ members of group B, and arrows initiating from point $(n_A, n_B)$ point in the direction in which growing the team would improve its utility. Team growth dynamics then correspond to a walk through this space, and the destination that this walk heads to depends on the point it starts from. Figure 1 provides an example for how this analysis operates on a specific instance of the problem. In the figure, the diagonal line shows the optimal team composition, and arrows starting from a point $(n_A, n_B)$ on the plot indicate the direction in which a team consisting of $n_A$ members of group A and $n_B$ members of group B grows. (The specific instance in the figure is described by parameter values α = 5, β = 0.1, λ = 0.025 using the notation from earlier; and both opinion types, A and B, have the same error rate of 0.1 ($L_A = L_B = 0.1$). As will be described in Section 3, in our model, we assume these similar error rates are achieved using different predictive attributes; therefore, teams consisting of both types achieve higher accuracy than homogeneous ones.)
The figure illustrates in a concrete example the set of underlying principles that are formalized by our results. Specifically, looking at how the arrows for team growth point in different parts of the plane, we see that the space decomposes into a set of different regions with distinct behaviors. There is a valley near the diagonal: a subset of points close to the accuracy optimum from which growth dynamics move the team toward that optimum. Some of these, like point $Q_1$ in the figure, will iterate all the way to accuracy optimality, while others, like point $Q_2$ in the figure, will move partway to the optimum and then get stuck. There is also a downslope near each axis: points like $Q_3$ that are sufficiently close to the axis will actually move away from the accuracy optimum and further out along the axis, corresponding to teams that add more of the majority type to reduce average dissimilarity. The expansion of the majority group through the growth dynamics may stop if within-team disagreements become non-negligible (see, e.g., the dynamics initiating at point $Q_4$). Finally, there is a ridge that separates the central valley from the outer downslope; which side of the ridge a point is on determines whether it iterates in the direction of optimality or away from it. Points that actually lie on this ridge, like $Q_5$, do not move at all. Thus, our results suggest that there can be a critical level of team heterogeneity in the process: once the team passes this level of heterogeneity, the growth dynamics will improve its performance; but if it falls short of this level, the growth dynamics may cause it to unravel toward greater homogeneity and lower performance.
The type of analysis outlined above, while stylized in the context of our model, suggests several broader insights that can be actionable. First, the positive impact of diversity on a team's performance alone will not incentivize high-performing teams to form. Second, the analysis highlights some of the levers available to the planner to influence the team growth dynamics. Some of these are visible in Figure 1, like the choice of initial team composition. Others are implicit in the choices of parameters, for example, in the choice of aggregation mechanism (corresponding to α) for resolving conflicts of opinion among team members.

Related Work
Optimal forecasting teams. Combining multiple predictors to achieve better predictive accuracy is a common and well-studied approach in machine learning, operations research, and economics (Clemen, 1989; Armstrong, 2001). Prior work has shown that combining a diverse set of predictors often improves performance (Batchelor and Dua, 1995; Hong and Page, 2004; Lobo and Nair, 1990). Motivated by the empirical evidence, prior work has proposed formal models of forecast-aggregating teams (Lamberson and Page, 2012; Davis-Stober et al., 2015). For example, Lamberson and Page (2012) focus on the role of team size in determining its optimal composition for making predictions. While our model closely follows Lamberson and Page (2012), the question we are interested in is fundamentally different. It is also worth noting that the team formation process in our model can be viewed as a variant of ensemble learning in machine learning, with the key difference that there exists a cost associated with combining diverse models.
Comparison with Lamberson and Page (2012). Our work extends the model proposed by Lamberson and Page, who study the optimal composition of teams making combined forecasts. Similar to our model, accuracy serves as a proxy for teams' problem-solving abilities; the aggregation mechanism is fixed ahead of time; agents belong to one of the two predictive types, A and B; and a positive and fixed covariance exists between the errors made by any two agents of the same type. The key question is "what composition of types minimizes the team's mean squared error?". Lamberson and Page's key finding is that for large teams, the optimal composition consists mainly of the type with the lowest error covariance, even if the type is not the most accurate. In contrast, in small groups, the highest-accuracy type will be in the majority. Our major point of departure from Lamberson and Page (2012) is the team's objective function: Instead of assuming teams solely aim to maximize accuracy, we also account for the effect of affinity bias. Additionally, while Lamberson and Page's analysis focuses on uniform averaging of forecasts across team members, we study a richer class of aggregation mechanisms. Finally, unlike the prior contribution, which investigates the effect of within-type error covariance on accuracy-optimal compositions, we fix the error covariance of types and instead focus on teams' growth dynamics as the tensions between accuracy and affinity bias play out.
Diversity in team performance and dynamics. A substantial body of empirical and theoretical research has investigated the impact of diversity on teams' performance and dynamics. A significant part of this literature studies diversity with respect to demographic characteristics such as race, gender, and age/generation (Guillaume et al., 2017; Pelled, 1996; Elsass and Graves, 1997). Other scholars have focused on diversity in job-related characteristics such as education level or tenure (Sessa and Jackson, 1995; Milliken and Martins, 1996; Pelled et al., 1999). Our work is closer to the latter category of diversity. Importantly, our contributions do not directly apply to demographic diversity in organizations.
Empirical work has investigated the impact of diversity on group performance and effectiveness. Some of the prior work argues that diversity can be a "double-edged sword," meaning that it can lead to higher-quality solutions, while reducing group cohesion (Phillips et al., 2009; Milliken and Martins, 1996; Lauretta McLeod and Lobel, 1992; Watson et al., 1993; O'Reilly III et al., 1989). The goal of our analysis is to understand why and under what conditions diversity acts this way.
Aside from performance, empirical studies have established that groups consisting of dissimilar individuals exhibit less attraction and trust among peers (Chattopadhyay, 1999), less frequent communication (Zenger and Lawrence, 1989), lower group commitment and psychological attachment (Tsui et al., 1992), lower task contributions (Kirchmeyer, 1993; Kirchmeyer and Cohen, 1992), and lower perceptions of organizational fairness and inclusiveness (Mor Barak et al., 1998). Compared to homogeneous groups, heterogeneous groups are found to have reduced cohesiveness (Terborg et al., 1975) and more conflicts and misunderstandings (Jehn et al., 1997), which, in turn, lower members' satisfaction, decrease cooperation (Chatman and Flynn, 2001), and increase turnover (Jackson et al., 1991). These empirical findings are reflected through an inherent taste for agreement among team members in our model.
Affinity bias. Affinity bias or homophily is the tendency of individuals to gravitate toward or associate with others whom they consider similar to themselves. The similarity could be in terms of demographic characteristics (such as race, ethnicity, age, or gender), social status (e.g., job title), values (e.g., political affiliation), or beliefs. A substantial body of empirical work has established the existence of homophily in social networks (McPherson et al., 2001). Affinity bias in organizational processes has been documented and discussed extensively (Huang et al., 2019; McCormick, 2015; Oberai and Anand, 2018).
Hedonic games. Hedonic games model the formation of coalitions (or teams) of players in settings where players have preferences over coalitions (Bogomolnaia and Jackson, 2002). Existing work in the area focuses on the stability of game outcomes (e.g., by evaluating whether the outcome of the game belongs to the core). Similar to hedonic games, in our setting, each agent's payoff depends on the other members of her team. However, unlike hedonic games, we are not interested in how society partitions itself into disjoint coalitions. Instead, we study a team that evolves sequentially when current members get to decide who joins next.
Wisdom of crowds and prediction markets. The wisdom of crowds (Surowiecki, 2005; Mannes et al., 2012) captures the idea that groups of people often perform better at prediction tasks compared to individuals. This idea has been the basis of "prediction markets" where agents can buy or sell securities whose payoffs correspond to future events. The market prices can indicate the crowd's collective belief about the probability of the event of interest (Wolfers and Zitzewitz, 2004; Arrow et al., 2008). In prediction markets, traders do not generally form groups or coalitions; rather, they bet against each other. A trader receives the highest possible payoffs only if their prediction about the future state of the world is correct and only a small subset of other traders have made their bet according to the correct prediction. Unlike prediction markets, our model assumes individual members of a team are concerned with the overall performance of their team. Additionally, we set aside incentive considerations to focus on the interplay between accuracy and affinity bias in team growth dynamics.

The Basic Model
Let $\mathcal{X}$ denote the set of all possible states of the world, distributed according to a probability distribution $P$. We assume each state of the world is described by a feature vector $x = (x_1, \cdots, x_r)$ whose features are uncorrelated, that is, $\mathrm{cov}(x_i, x_j) = 0$ for all $j \neq i$. Each state of the world, $x$, leads to an outcome $y \in \mathcal{Y}$. We assume there exists a true outcome function $f^*$, such that for any $x \in \mathcal{X}$, $y = f^*(x)$ is the true outcome of $x$. For simplicity and unless otherwise specified, we assume $f^*$ is deterministic, $\mathcal{X} = \mathbb{R}^r$, and $\mathcal{Y} = \mathbb{R}$.
Consider a set of agents all capable of making predictions about the true outcome given the state of the world. An agent $i$ has a fixed predictive model of the world, denoted by $f_i : \mathcal{X} \to \mathcal{Y}$, which maps each possible state of the world, $x \in \mathcal{X}$, to a predicted outcome, $\hat{y}_i = f_i(x)$. We will use $L_i$ to denote the accuracy loss of $i$'s predictions. More precisely, given a loss function $\ell : \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$,
$$L_i = \mathbb{E}_{x \sim P}\big[\ell(f_i(x), f^*(x))\big].$$
As an example, $\ell$ can be the squared error, that is, $\ell(\hat{y}_i, y) = (\hat{y}_i - y)^2$.
For any two predictive models $f_i, f_j$, we define the level of disagreement between them through a distance metric, $\delta : \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$, as the expected distance between their predictions, $\mathbb{E}_{x \sim P}\big[\delta(f_i(x), f_j(x))\big]$. As an example, $\delta$ can be the squared $L_2$ norm, that is, $\delta(f_i(x), f_j(x)) = (f_i(x) - f_j(x))^2$. To simplify the analysis, we will first focus on simple quadratic loss ($\ell$) and distance ($\delta$) functions, i.e., $\ell(y, y') = \delta(y, y') = (y - y')^2$. Later in Section 5, we show that our results extend to a larger family of distance metrics and loss functions.
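As an illustration of these two quantities, the sketch below estimates an accuracy loss and a pairwise disagreement by Monte Carlo under the quadratic loss and distance. The toy instance (two independent standard-normal features, a true function 2·x₁ + x₂, and two predictors that each see one feature) and all function names are assumptions made for this example only.

```python
import random
from typing import Callable, Sequence

State = Sequence[float]
Predictor = Callable[[State], float]


def squared_gap(u: float, v: float) -> float:
    """Quadratic loss/distance: (u - v)^2."""
    return (u - v) ** 2


def expected_gap(f: Predictor, g: Predictor,
                 sample_state: Callable[[random.Random], State],
                 n_samples: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of E_x[(f(x) - g(x))^2] over sampled states."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = sample_state(rng)
        total += squared_gap(f(x), g(x))
    return total / n_samples


# Hypothetical toy instance: two uncorrelated, zero-mean features.
def sample_state(rng: random.Random) -> State:
    return (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def f_true(x: State) -> float:   # true outcome function
    return 2.0 * x[0] + 1.0 * x[1]


def f_i(x: State) -> float:      # agent i only uses the first feature
    return 2.0 * x[0]


def f_j(x: State) -> float:      # agent j only uses the second feature
    return 1.0 * x[1]


loss_i = expected_gap(f_i, f_true, sample_state)        # theory: E[x2^2] = 1
loss_j = expected_gap(f_j, f_true, sample_state)        # theory: E[(2*x1)^2] = 4
disagreement_ij = expected_gap(f_i, f_j, sample_state)  # theory: 4 + 1 = 5

if __name__ == "__main__":
    print(f"L_i ~ {loss_i:.3f}, L_j ~ {loss_j:.3f}, disagreement ~ {disagreement_ij:.3f}")
```

In this toy instance each agent's loss equals the variance contributed by the feature it ignores, and the disagreement between the two agents equals the sum of their losses, because the features are uncorrelated and zero-mean.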
Agent types. For simplicity and following prior work (Lamberson and Page, 2012), we assume there are two types of agents, A and B, each with a type-specific predictive model of the world. In particular, individuals of each type base their predictions on a type-specific subset of features, and their predictions are a noisy version of the highest-accuracy predictor on those features. (The assumption of individuals utilizing the highest-accuracy predictor available to them is common in prior work. See, e.g., Hong and Page, 2020.) Without loss of generality, suppose an individual of type A only takes features $x_1, \cdots, x_k$ into consideration, whereas an individual of type B utilizes features $x_{k+1}, \cdots, x_r$ to make a prediction about $x$.
We assume the true function $f^*$ is linear in the feature vector $x$. Therefore, it can be decomposed into two components, corresponding to the two types' feature sets. In particular, $\forall x = (x_1, \cdots, x_r) \in \mathcal{X}$:
$$f^*(x) = \theta^*_{1,\cdots,k} \cdot (x_1, \cdots, x_k) + \theta^*_{k+1,\cdots,r} \cdot (x_{k+1}, \cdots, x_r).$$
We will refer to $\theta^*_{1,\cdots,k}$ as $\theta_A$ (since it is the accuracy-optimal weights on A's features) and $\theta^*_{k+1,\cdots,r}$ as $\theta_B$, so that $f^*(x) = \theta_A \cdot x + \theta_B \cdot x$. Given an instance $x$, individuals of each type can produce a noisy prediction according to their type's highest-accuracy predictive model. More specifically, an individual of type A predicts $f_A(x) = \theta_A \cdot x + \theta^A_0 + \epsilon_A$ for $x$, where $\theta^A_0$ is an intercept and $\epsilon_A$ is i.i.d. noise sampled from a mean-zero Gaussian distribution with variance $\sigma_A^2$; an individual of type B predicts analogously, with $\theta_B$, intercept $\theta^B_0$, and noise $\epsilon_B$ of variance $\sigma_B^2$. To avoid having to carry the dot-product notation, we define the shorthand functions $\theta_A(x) := \theta_A \cdot x$ and $\theta_B(x) := \theta_B \cdot x$. We will also use $L_A, L_B$ to refer to the (noise-less) accuracy loss of each type's predictive model (i.e., $L_c = \mathbb{E}_{x \sim P}[\ell(\theta_c \cdot x, y)]$ for $c \in \{A, B\}$). Additionally, for simplicity, we assume $\mathbb{E}[x] = 0$ so that the above intercepts are both 0. (Note that since $\mathbb{E}[x] = 0$, and $f^*$, $f_A$, and $f_B$ are all linear in $x$, we have that $\mathbb{E}[f^*(x)] = \mathbb{E}[f_A(x)] = \mathbb{E}[f_B(x)] = 0$.)
The aggregation mechanism. A team $T$ is a set of agents who combine their predictions according to a given aggregation function $G_T$. For any $x \in \mathcal{X}$, the aggregation function $G_T$ receives the predictions made for input $x$ by all members of $T$, and outputs a collective/team prediction for $x$. Given the team $T = \{1, 2, \cdots, |T|\}$, we use $G_T(x)$ to refer to the aggregated prediction of team members for state $x$. More precisely,
$$G_T(x) = G\big(f_1(x), \cdots, f_{|T|}(x)\big).$$
Note that we assume the functional form of $G_T$ (denoted by $G$ in the above equation) is exogenously chosen and fixed throughout, and is independent of the team's composition. As a concrete example, $G$ can be the mean prediction across all team members, that is:
$$G(\hat{y}_1, \cdots, \hat{y}_{|T|}) = \frac{1}{|T|} \sum_{i \in T} \hat{y}_i.$$
(This particular form of $G$ is reasonable to assume in environments where the only acceptable aggregation function is giving all team members' opinions equal weight.) We will consider a general class of aggregation functions inspired by Tullock's contest success function (Jia et al., 2013; Skaperdas, 1996), defined as follows: Given a team consisting of $n_A$ individuals of type A and $n_B$ individuals of type B, we define the following parametric class of aggregation functions:
$$G_{\alpha, n_A, n_B}(x) = \frac{n_A^{\alpha}\, \bar{f}_A(x) + n_B^{\alpha}\, \bar{f}_B(x)}{n_A^{\alpha} + n_B^{\alpha}}, \qquad (3)$$
where $\bar{f}_c(x)$ denotes the average prediction of the team's type-$c$ members for $c \in \{A, B\}$. When $n_A, n_B$ are clear from the context, we drop the subscript and use $G_\alpha$ to refer to the aggregation function. Note that if α = 1, the above simplifies to the simple average, and in the limit of α → ∞, $G_\alpha$ approaches the median.
Disutility and cost. Individual agents can be added to a team to reduce the team's overall disutility or cost. The cost an agent $i$ incurs as a member of team $T$ is defined as a combination of (a) their level of disagreement with other team members, and (b) the team's overall accuracy loss. More precisely,
$$c_i(T) = \lambda \cdot \frac{1}{|T|^{\beta}} \sum_{j \in T,\, j \neq i} \mathbb{E}_{x \sim P}\big[\delta(f_j(x), f_i(x))\big] + (1 - \lambda) \cdot \mathbb{E}_{x \sim P}\big[\ell(G_T(x), y)\big].$$
Note that the parameter β ∈ [0, 1] specifies how the level of disagreement experienced by individual $i$ is impacted by the size of the team. In particular, it captures how perceptions of disagreement scale with absolute vs. relative size of the opposing type. As an example to illustrate this, consider a hypothetical case where $\mathbb{E}_{x \sim P}[\delta(f_j(x), f_i(x))]$ is fixed to some constant value $\delta$ for all $j \neq i$. When β = 0, $i$'s experienced disagreement with team members is equal to $(|T| - 1)\delta$, and it grows linearly with team size $|T|$. That is, a larger team increases $i$'s perception of disagreement with teammates. For β = 1, though, $i$'s experienced disagreement level, $(|T| - 1)\delta / |T|$, remains roughly constant.
Growth dynamics. We assume a team $T$ would be willing to accept a new member $i'$ if doing so reduces the team's average cost across its current members:
$$\frac{1}{|T|} \sum_{i \in T} c_i\big(T \cup \{i'\}\big) < \frac{1}{|T|} \sum_{i \in T} c_i(T). \qquad (4)$$
The team growth dynamics proceed as follows: Team $T$ initially consists of $n_A$ individuals of type A and $n_B$ individuals of type B.
New individual candidates arrive one at a time over steps $t = 1, 2, \dots$. Let us refer to the $t$'th individual as $i_t$, and denote their type by $s_t \in \{A, B\}$. The team brings on $i_t$ if and only if doing so reduces the current team's disutility (4). As will be discussed in Section 4, there are two distinct reasons for the team to bring on more than one member of a given type:
• When the accuracy-optimal team composition favors one type, hiring more than one member of each type might be necessary to achieve the accuracy-optimal balance.
• When each agent's prediction is a noisy version of its type's prediction, multiple members of the same type reduce noise.
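The growth dynamics described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation, and it rests on assumptions that are not spelled out in the text here: noise-free predictions ($\sigma_A = \sigma_B = 0$), independent feature sets with squared loss and squared disagreement (so two members of different types disagree by $L_A + L_B$ in expectation), Tullock-style weights $n_c^\alpha / (n_A^\alpha + n_B^\alpha)$ on each type's prediction, and a team cost equal to $\lambda$ times the average experienced disagreement plus $(1 - \lambda)$ times the accuracy loss. All function names are hypothetical.

```python
def team_loss(n_a, n_b, L_a, L_b, alpha=1.0):
    """Assumed noise-free accuracy loss of the aggregated prediction."""
    w_a = n_a**alpha / (n_a**alpha + n_b**alpha)  # assumed Tullock-style weight
    w_b = 1.0 - w_a
    # The residual y - G(x) equals w_b * theta_A(x) + w_a * theta_B(x); under the
    # independence assumption, E[theta_A(x)^2] = L_b and E[theta_B(x)^2] = L_a.
    return w_b**2 * L_b + w_a**2 * L_a


def team_disagreement(n_a, n_b, L_a, L_b, beta=0.0):
    """Assumed disagreement averaged over members, scaled by |T|**beta."""
    size = n_a + n_b
    cross_pairs = 2 * n_a * n_b  # ordered (i, j) pairs of different types
    return cross_pairs * (L_a + L_b) / (size * size**beta)


def team_cost(n_a, n_b, L_a, L_b, lam, alpha=1.0, beta=0.0):
    """Assumed lambda-weighted sum of disagreement and accuracy loss."""
    return (lam * team_disagreement(n_a, n_b, L_a, L_b, beta)
            + (1.0 - lam) * team_loss(n_a, n_b, L_a, L_b, alpha))


def grow(n_a, n_b, arrivals, **params):
    """Process candidates in order; admit one only if the team cost strictly drops."""
    for kind in arrivals:
        cand = (n_a + 1, n_b) if kind == "A" else (n_a, n_b + 1)
        if team_cost(*cand, **params) < team_cost(n_a, n_b, **params):
            n_a, n_b = cand
    return n_a, n_b


# Start from an all-A team and let candidates of both types alternate.
final = grow(10, 0, "AB" * 20, L_a=0.1, L_b=0.1, lam=0.02, alpha=1.0, beta=0.0)
print(final)
```

Under these assumptions the run stops at a composition with fewer type B members than the balanced team that would minimize the accuracy loss alone, which is the kind of "stuck partway" behavior the analysis section studies.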

Analysis
In this section, we describe the dynamics of team formation under three separate regimes of λ: one in which λ = 0 (only accuracy matters); another in which λ is close to 1 (the importance of accuracy is negligible), and finally settings in between where 0 < λ < 1.

Minimum λ value (λ = 0)
When $\lambda = 0$, the team adds a new member if and only if that member reduces the team's mean squared error. To see under what conditions a new member will be brought on, suppose the current team consists of $n_A$ individuals of type A and $n_B$ individuals of type B. Recall that the team's collective predictive model can be written as
It is easy to show that the team's mean squared error can be decomposed into bias and variance terms.
Lemma 1 (Team's Error Decomposition). Consider a team with a composition of $n_A$ members of type A and $n_B$ members of type B. Then:
where the expectation is with respect to $(x, y) \sim P$ and $\epsilon_c \sim \mathcal{N}(0, \sigma_c^2)$ for $c \in \{A, B\}$.
Proof. We can write:
Note that since $\sigma_B > 0$ and $(1 - \beta) > 0$, the above is a quadratic polynomial in $n_B$ with a positive leading coefficient (i.e., $2\sigma_B^2(1 - \beta)$). Let $n_{\text{lower}}, n_{\text{upper}}$ denote the roots of this polynomial. Since the leading coefficient is positive, for any $n_B \in [n_{\text{lower}}, n_{\text{upper}}]$, the derivative of the disagreement term is negative, indicating that adding new members of type B will reduce the disagreement. Similarly, outside this range, the derivative is positive, indicating that new type B members will only worsen the team's disagreement. This finishes the proof.
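The bias-variance decomposition referred to in Lemma 1 can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not the lemma's exact statement: it takes simple averaging ($\alpha = 1$), a noise-free label $y = f^*(x)$, two independent standard-normal features (one visible to each type), and hypothetical function names. Under those assumptions the squared-bias term is $(n_B/n)^2 L_B + (n_A/n)^2 L_A$ and the variance term is $(n_A \sigma_A^2 + n_B \sigma_B^2)/n^2$ with $n = n_A + n_B$.

```python
import random


def team_mse_closed_form(n_a, n_b, L_a, L_b, var_a, var_b):
    """Assumed squared-bias plus variance of the simple-average team prediction."""
    n = n_a + n_b
    bias_sq = (n_b / n) ** 2 * L_b + (n_a / n) ** 2 * L_a
    variance = (n_a * var_a + n_b * var_b) / n**2
    return bias_sq + variance


def team_mse_monte_carlo(n_a, n_b, a, b, sd_a, sd_b, trials=20000, seed=0):
    """Simulate y = a*x1 + b*x2 with independent standard-normal features."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        y = a * x1 + b * x2
        # Type A members only see x1, type B members only see x2.
        preds = [a * x1 + rng.gauss(0, sd_a) for _ in range(n_a)]
        preds += [b * x2 + rng.gauss(0, sd_b) for _ in range(n_b)]
        total += (sum(preds) / len(preds) - y) ** 2
    return total / trials


a, b = 0.6, 0.8          # illustrative coefficients on the two features
L_a, L_b = b**2, a**2    # each type's loss is the signal it cannot see
closed = team_mse_closed_form(3, 2, L_a, L_b, 0.3**2, 0.5**2)
simulated = team_mse_monte_carlo(3, 2, a, b, 0.3, 0.5)
print(closed, simulated)
```

The two printed values should agree up to Monte Carlo error. The closed form also makes the second motivation for hiring several members of one type visible: the variance term shrinks as same-type members are added.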

Intermediate λ values
For $0 < \lambda < 1$, the team's disutility can be written as:
Next, we address the following question: for a given $0 < \lambda < 1$, how and to what extent does a team with initial composition $(n_A, 0)$ grow? And how does the resulting composition compare with the accuracy-optimal team? We observe that for any strictly positive value of $\lambda$, the team fails to add the appropriate number of type B members, leading to accuracy inefficiencies. For ease of exposition, throughout this section we assume $\sigma_A = \sigma_B = 0$.
Outline of the analysis. Our theoretical analysis focuses on deriving closed-form solutions for the edge cases of $\alpha = 1$ and $\beta \in \{0, 1\}$. These particular settings are natural to study because $\alpha = 1$ corresponds to a meaningful, common aggregation mechanism (i.e., simple averaging, which is often utilized in practice and has been advocated as a good rule of thumb (Makridakis and Winkler, 1983; Clemen and Winkler, 1986; Clemen, 1989; Armstrong, 2001)), while the values $\beta \in \{0, 1\}$ capture whether perceptions of conflict within the team depend on the relative or the absolute size of the types. The analysis of these extremes offers several non-trivial observations, as will be stated shortly. For other values of $\alpha$ and $\beta$, we provide simulation results (Figure 5) showing that the effects observed at the edge cases continue to hold, but they interact with each other in potentially interesting ways.
Theorem 1 (Utility-optimal composition for $\alpha = 1$, $\beta = 0$). Consider a team with an initial composition of $n_A > 0$ members of type A and no member of type B. Fix $\lambda$ for the team. The optimal number of type B members whose addition maximizes the team's utility is equal to
Proof. Recall that when $\sigma_A^2 = \sigma_B^2 = 0$, the disagreement term in the team's objective (9) simplifies to:
Taking the derivative of the right-hand side with respect to $n_B$, we obtain:
Note that the above is always positive, and decreasing in $n_B$. So if the derivative of the cost function (the $\lambda$-weighted sum of disagreement and loss) has a zero, it must be before the zero of the accuracy derivative, that is, before $n_A \frac{L_A}{L_B}$. To see where precisely the zero lies, we can write:
Figure 3 shows the team growth dynamics for $\lambda = 0.02$ when $\alpha = 1$, $\beta = 0$, and under several different regimes of $(L_A, L_B)$. Note that team formation dynamics can get stuck in local utility optima, failing to achieve accuracy-optimal compositions. Additionally, the final team composition is highly sensitive to the initial composition, which can be thought of as a form of path dependence, as formalized through the corollary below.
Corollary 1 (Convergence of team growth dynamics for $\alpha = 1$, $\beta = 0$). Consider a team with the initial composition of $(n_A, n_B)$. Without loss of generality, we assume
Regardless of the order in which agents of each type arrive, the team growth dynamics converge to $(n_A, n_B)$ where
Theorem 2 (Utility-optimal composition for $\alpha = 1$, $\beta = 1$). Consider a team with an initial composition of $n_A > 0$ members of type A and no member of type B. Fix $\lambda$ for the team. The optimal number of type B members whose addition maximizes the team's utility is equal to
Proof. Recall that when $\sigma_A^2 = \sigma_B^2 = 0$, the disagreement term in the team's objective (9) simplifies to
Taking the derivative of this function with respect to $n_B$, we obtain:
Note that the above is always positive, and decreasing in $n_B$, as long as $n_B \le n_A$. So if the derivative of the cost function (the $\lambda$-weighted sum of disagreement and loss) has a zero, it must be before $n_A \frac{L_A}{L_B}$. To see where precisely the zero lies, we can write:
Figure 4 illustrates the team growth dynamics for $\lambda = 0.02$ when $\alpha = 1$, $\beta = 1$, and under several different regimes of $(L_A, L_B)$. Team formation dynamics continue to converge to utility optima, failing to achieve accuracy-optimal compositions. Note, however, the different patterns of inefficiency in the cases of $\beta = 0$ and $\beta = 1$.
Corollary 2 (Convergence of team growth dynamics for $\alpha = 1$, $\beta = 1$). Consider a team with the initial composition of $(n_A, n_B)$. Regardless of the order in which agents of each type arrive, the team growth dynamics converge to $(n_A, n_B)$ where
Figure 5 illustrates the team growth dynamics for $\lambda = 0.05$ when $\alpha$ is high. In general, we observe that high $\alpha$ induces a lower bound on the number of members of the less-represented type needed before increasing that type's representation in the team becomes beneficial. Additionally, larger $\beta$ values encourage a dominant majority to bring on more members of its own type to reduce disagreement, even if that comes at the cost of accuracy.
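The gap between the utility-optimal and accuracy-optimal number of type B members can be explored with a brute-force search. The sketch below is illustrative only and does not reproduce the closed forms of Theorems 1 and 2. It assumes $\alpha = 1$, $\sigma_A = \sigma_B = 0$, an accuracy loss of $(n_B^2 L_B + n_A^2 L_A)/(n_A + n_B)^2$, and an average disagreement of $2 n_A n_B (L_A + L_B)/(n_A + n_B)^{1+\beta}$, which is one form consistent with the derivative properties stated in the proofs. The function names and the parameter values in the second example are hypothetical.

```python
def cost(n_a, n_b, L_a, L_b, lam, beta):
    """Assumed team cost for alpha = 1 and noise-free predictions."""
    n = n_a + n_b
    loss = (n_b**2 * L_b + n_a**2 * L_a) / n**2
    disagreement = 2 * n_a * n_b * (L_a + L_b) / n ** (1 + beta)
    return lam * disagreement + (1 - lam) * loss


def utility_optimal_n_b(n_a, L_a, L_b, lam, beta, max_n_b=1000):
    """Brute-force search over integer n_B for the lowest-cost composition."""
    return min(range(max_n_b + 1),
               key=lambda n_b: cost(n_a, n_b, L_a, L_b, lam, beta))


def accuracy_optimal_n_b(n_a, L_a, L_b):
    """Zero of the accuracy derivative, n_A * L_A / L_B."""
    return n_a * L_a / L_b


n_a = 10
beta0 = utility_optimal_n_b(n_a, 0.1, 0.1, lam=0.02, beta=0)
beta1 = utility_optimal_n_b(n_a, 0.1, 0.2, lam=0.2, beta=1)
print("beta=0:", beta0, "vs accuracy-optimal", accuracy_optimal_n_b(n_a, 0.1, 0.1))
print("beta=1:", beta1, "vs accuracy-optimal", accuracy_optimal_n_b(n_a, 0.1, 0.2))
```

In both illustrative settings the search returns fewer type B members than $n_A L_A / L_B$, in line with the argument that the zero of the cost derivative lies before the zero of the accuracy derivative.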

Takeaways from the Analysis
The path-dependent nature of inefficiencies. Through the analysis in this section, we observe that the initial composition of the team plays an important role in its eventual composition. As an illustrative example of the different effects at work, consider Figure 5 (b). The initial composition of the team dictates whether (a) the team remains at its initial makeup, (b) it adds members of the less-represented type to move toward greater accuracy, or (c) it continues adding to the more-represented type in a way that overpowers the less-represented type. While the exact dynamics are specific to our model, this general family of observations has important implications for teams in organizations more generally: the initial composition can have a significant effect on the direction in which the team grows.
The role of the aggregation mechanism. It is also interesting to note the ways in which varying the aggregation parameter $\alpha$ affects the team growth dynamics and the incentives for the team to add members of each group. This suggests more generally some of the mechanisms whereby aggregation can influence decisions about group composition. There are interesting analogies to other contexts that exhibit a link between aggregation mechanisms and the dynamics of new membership. For example, while legislative bodies are distinct from problem-solving teams, there is an interesting analogy to issues such as the way in which the prospect of statehood for entities like the District of Columbia and Puerto Rico plays out differently in the U.S. House of Representatives, where aggregation is done proportionally to population, and the U.S. Senate, where aggregation is done uniformly across states. This is precisely a case of the difference in aggregation mechanism implying differences in the politics of new membership (in this case via statehood).

Extensions
Alternative notions of distance and accuracy loss. Throughout our analysis, we assumed that the distance and the loss functions take on simple quadratic forms. It is easy to show that our main result (that team composition initially trends toward improving accuracy but stops short of achieving the optimal performance) holds for more generic functional forms. Consider a distance metric $\delta(\cdot, \cdot)$ capturing disagreements between team members, and a loss function $\ell(\cdot, \cdot)$ capturing the team's predictive loss. For simplicity, let's assume both $\ell$ and $\delta$ are continuous and differentiable. Let's define the following pieces of notation for convenience:
With similar reasoning as that presented in the proof of Proposition 1, we can show the following:
Proposition 3 (informal statement). Consider a team with an initial composition of $n_A > 0$ members of type A and no member of type B. Fix $\lambda$ for the team. Suppose $\delta(\cdot, \cdot)$ and $\tilde{\ell}(\cdot, \cdot)$ are both differentiable, and the following conditions hold:
1. $\delta(n_A, \cdot)$ is concave and increasing in the number of the less-represented group members, $n_B$.
2. $\tilde{\ell}(n_A, \cdot)$ is initially decreasing and convex, but becomes and remains increasing thereafter.
Then the optimal number of type B members whose addition maximizes the team's utility is strictly less than $n_A$.
Proof sketch. Note that the first condition implies $\frac{\partial}{\partial n_B} \delta(n_A, \cdot)$ is positive and decreasing (with $c \ge 0$ as its potential asymptote). Additionally, the second condition implies that $\frac{\partial}{\partial n_B} \tilde{\ell}(n_A, \cdot)$ is initially negative, but reaches zero at some point and remains positive thereafter. The derivative of the sum is equal to the sum of derivatives, so the derivative of the team's objective function with respect to $n_B$ is equal to $\lambda \frac{\partial}{\partial n_B} \delta(n_A, \cdot) + (1 - \lambda) \frac{\partial}{\partial n_B} \tilde{\ell}(n_A, \cdot)$. Therefore, if the above derivative has a zero, it must lie before the zero of the derivative of the accuracy loss term.
We remark that with the appropriate choice of $\delta$ and $\tilde{\ell}$, the utility-maximizing number of type B members can be arbitrarily close to the number needed for optimizing accuracy.
More than two predictive types. In the analysis in Section 4, we assumed that agents belong to one of two types: A or B. As we show in Appendix A.1, our derivations readily generalize to three types (A, B, and C), where $f^*(x) = \theta_A(x) + \theta_B(x) + \theta_C(x)$. Figure 6 visualizes the team formation dynamics for three types; the trends are similar to those for two types.
The role of biased accuracy-gain assessments. We assumed throughout that teams could perfectly (i.e., without bias and noise) estimate the accuracy gains of adding a new member of each type. While this is a common assumption in prior work, in reality, such assessments may be biased (e.g., optimistic or pessimistic) and noisy. Here, let's consider the biased case, as demonstrated via the example in Figure 7. (Figure 11 in the Appendix illustrates the case where assessments are both biased and noisy.) In the absence of bias (Figure 7, (b)), we observed that once teams reach an accuracy-optimal composition, they cease to grow any further. Additionally, adding a new member of type A and a new member of type B could not simultaneously improve the team's utility. When accuracy gain assessments are over-estimated (Figure 7, (c)), however, the same teams may continue to grow beyond accuracy-optimal compositions, and they may find themselves in situations where adding a new member of any type is beneficial. As an example, compare the dynamics at (40, 40). Conversely, when accuracy gain assessments are under-estimated (Figure 7, (a)), team dynamics get stuck in accuracy-suboptimal compositions.
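The effect of biased accuracy-gain assessments can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the setup behind Figure 7: it uses simple averaging ($\alpha = 1$), noise-free predictions, and switches off the affinity term ($\lambda = 0$) to isolate the bias. The additive bias follows the sign convention of the figure captions (a positive value under-estimates gains, a negative value over-estimates them), and the function names and the arrival sequence are hypothetical.

```python
def accuracy_loss(n_a, n_b, L_a, L_b):
    """Assumed noise-free loss of the simple-average team (alpha = 1)."""
    w_a = n_a / (n_a + n_b)
    w_b = 1.0 - w_a
    return w_b**2 * L_b + w_a**2 * L_a


def grow_with_biased_assessment(n_a, n_b, arrivals, L_a, L_b, bias):
    """Admit a candidate only if the assessed accuracy gain is positive.

    assessed gain = true gain - bias, so bias > 0 under-estimates gains and
    bias < 0 over-estimates them.
    """
    for kind in arrivals:
        cand = (n_a + 1, n_b) if kind == "A" else (n_a, n_b + 1)
        true_gain = accuracy_loss(n_a, n_b, L_a, L_b) - accuracy_loss(*cand, L_a, L_b)
        if true_gain - bias > 0:
            n_a, n_b = cand
    return n_a, n_b


# Same starting team and arrival order, three different assessment biases.
results = {bias: grow_with_biased_assessment(10, 0, "AB" * 30, 0.1, 0.1, bias)
           for bias in (0.003, 0.0, -0.003)}
print(results)
```

With an unbiased assessment the team stops at the accuracy-optimal balance. With under-estimated gains it stops short of that balance, and with over-estimated gains it keeps admitting candidates of both types past it.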

Conclusion
This work offered a stylized model of team growth dynamics in the presence of a tension between informational diversity and affinity bias. Our analysis provides several key observations about the effect of affinity bias on team composition inefficiencies (even an arbitrarily small positive weight on affinity bias leads to inefficiency) and the moderating role of the aggregation mechanism and team size. It also shows how the growth dynamics of a team can lead toward optimality for some starting compositions and away from it for others. Our findings present several actionable insights to improve team growth dynamics. In particular, they show that awareness of the positive impact of diversity on a team's performance alone will not incentivize high-performing teams to form. But the social planner can positively influence team growth dynamics by adjusting the initial team composition or the aggregation mechanism used to resolve conflicts of opinion.
We conclude with a discussion of limitations and an outline of important directions for future work.
Strategic considerations. Our model considers the incentives of the overall team to improve total utility, but does not account for other kinds of incentives, including individual ones. The decision of an individual agent considering whether to join a team may be impacted by the proportion of current team members of the same type. For instance, a type B agent may refuse to join if the current number of type B members of the team is below a certain threshold. Agents who already belong to a team may have an incentive to exaggerate their opinions in anticipation of those opinions being aggregated. Under such circumstances, it may be beneficial to utilize non-uniform (weighted) voting schemes both to improve the team's accuracy and to promote truthfulness. We leave the exploration of such incentives as an important direction for future work.
Opinion formation processes. Our model does not provide a micro-foundation of opinion dynamics and consensus formation as a function of team members communicating with one another and deliberating. While some of the aggregation functions we study (e.g., uniform average) can be thought of as the outcome of simple opinion formation dynamics, we leave the integration of a more detailed account of opinion evolution in teams for future work.
Last but not least, our analysis relies on a range of additional simplifying abstractions of the team formation process, including (1) the restriction to independent predictive models of the world across the different types of agents; (2) taking accuracy as an appropriate measure of team performance; (3) assuming that teams can accurately estimate the performance gains of increased diversity; and (4) assuming that $\lambda$, $\alpha$, and $\beta$ are fixed across types and team compositions. While it would be interesting to see future work that relaxes some of these assumptions, the simplicity of our model enables it to serve as a useful conceptual metaphor capturing an inherent limitation of utility-centric motivations for improved informational diversity: for diverse teams to form and thrive, acknowledging the performance gains of informational diversity alone will not carry the day.
where in the last line we used the fact that $\mathbb{E}[x_i^2] = 1$ for all $i$. With a similar logic, we obtain that:
Combining (10), (11), and (12), we obtain:
(15)

Figure 2: Visualization of team formation dynamics for $\lambda = 1$ when $\sigma_A = 0$, $\sigma_B > 0$. Adding a new member of type B is only beneficial in a specific range of $n_B$ ($n_B \in [n_{\text{lower}}, n_{\text{upper}}]$) determined by the roots of a degree-two polynomial.

Figure 4: Visualization of team formation dynamics for an intermediate value of $\lambda$ ($\lambda = 0.12$), $\alpha = 1$, $\beta = 1$, $L_A = 0.1$, $L_B \in \{0.05, 0.1, 0.2\}$, and $\mathrm{var}(\epsilon) = 0$. Team formation dynamics converge to utility optima, failing to achieve accuracy-optimal compositions.

Figure 5: Visualization of team formation dynamics for an intermediate value of $\lambda$ ($\lambda = 0.025$), $\alpha = 5$, $\beta \in \{0, 0.2, 1\}$, $L_A = 0.1$, $L_B = 0.1$, and $\sigma_A = \sigma_B = 0$. When $\alpha$ is high, adding a member of the less-represented type is only beneficial if it has a non-negligible impact on accuracy, hence the lower bound on the number of that type's members required before their addition starts. (a) Since $\beta = 0$, there is an upper bound on the number of less-represented group members. (b, c) For larger $\beta$ values, if the majority is sufficiently dominant, hiring more majority members reduces the disagreement, which is beneficial (even if it degrades the accuracy).

Figure 6: Visualization of team formation dynamics for three types. Trends are similar to those for two types.

Figure 7: Visualization of team growth dynamics when assessments of utility gains are biased, as captured by an additive bias term equal to: (a) 0.12 (under-estimation of gains); (b) 0 (unbiased estimate); (c) −0.12 (over-estimation of gains). (a) The team may not reach accuracy optimality, or (c) it may grow beyond it.

Figure 8: Visualization of team formation dynamics for $\lambda = 0$ when $\alpha = 1$ and $\sigma_A = \sigma_B = 0$. (Note that in this setting, $n_B^* = \frac{L_A}{L_B} n_A$.)

Figure 9: Visualization of team formation dynamics for $\lambda = 0$ when $\alpha = 1$, $\sigma_A = 0$, and $\sigma_B \in \{0.1, 0.2, 0.3\}$. Even though individual members of each type are noisy, it is never simultaneously beneficial to add a new member of type A and a new member of type B.

Figure 10: Visualization of team formation dynamics when $\lambda = 1$ for several $\beta$ values (trends are similar for other values of $L_A$, $L_B$). Adding a new team member of the less-represented type is never beneficial. Adding a new member of the majority type only improves the team's disutility if $\beta$ is sufficiently large.

Figure 11: Visualization of team growth dynamics when assessments of accuracy gains are biased. When deciding whether to add a new member, the team's assessment of accuracy gains is corrupted by a bias term equal to (a) 0.003 (under-estimation of accuracy gains); (b) 0 (unbiased estimation of accuracy gains); (c) −0.003 (over-estimation of accuracy gains). When the estimation of accuracy gains is biased, the team may grow beyond accuracy-optimal compositions, and adding a member of any type may be beneficial in certain compositions.