Comparative Analysis of Discussion Intensity and Semantic Diversity in Early vs. Late Engagers: A Study of Japanese Tweets about ChatGPT

This study investigates engagement patterns related to OpenAI's ChatGPT on Japanese Twitter, focusing on two distinct user groups, early and late engagers, inspired by the Innovation Theory. Early engagers are defined as individuals who initiated conversations about ChatGPT during its early stages, whereas late engagers are those who began participating at a later date. To examine the nature of the conversations, we employ a dual methodology encompassing both quantitative and qualitative analyses. The quantitative analysis reveals that early engagers often engage with more forward-looking and speculative topics, emphasizing the technological advancements and potential transformative impact of ChatGPT. Conversely, late engagers interact more with contemporary topics, focusing on the optimization of existing AI capabilities and considering their inherent limitations. For the qualitative analysis, we propose a method to measure the proportion of shared or unique viewpoints within topics across both groups. We find that early engagers generally concentrate on a more limited range of perspectives, whereas late engagers exhibit a wider range of viewpoints. Interestingly, only a weak correlation was found between the volume of tweets and the diversity of discussed topics in both groups. These findings underscore the importance of identifying semantic diversity, rather than relying solely on the volume of tweets, for understanding differences in communication styles between groups within a given topic. Moreover, our versatile dual methodology holds potential for broader applications, such as studying online discourse patterns within different user groups or in contexts beyond ChatGPT.


INTRODUCTION
ChatGPT, a chatbot released by OpenAI in November 2022, has significantly changed the digital landscape and gained notable attention, eliciting diverse responses from users across different communities [11]. Understanding these user engagement patterns and interests towards emerging technology is crucial for addressing concerns about filter bubbles, which refer to the tendency of individuals to be exposed to information and perspectives that align with their existing beliefs and preferences while being isolated from diverse viewpoints.
In this study, we aim to analyze, from a bird's eye view, the varying patterns of engagement with ChatGPT on Twitter among Japanese users, focusing on two distinct groups, namely early engagers and late engagers. Early engagers are defined as users who began interacting with ChatGPT shortly after its launch, while late engagers are users who initiated their interactions in a later phase. The categorization into these groups is inspired by the Innovation Theory [17], which posits that early adopters often constitute a risk-taking, socially influential group, whereas the later majority are typically more cautious and less socially active, often avoiding risk.
Existing research on social interactions on Twitter has mainly relied on quantitative analysis, focusing on proportions derived from topic modeling and evaluating sentiments and stances among specific groups. However, these studies do not take into account the semantic structure within the topics discussed by different groups. In other words, even if two groups share the same volume of tweets, the manner in which information spreads within these groups may vary. For instance, while one group may predominantly retweet the same tweets, the other group may create and share more original content, resulting in a situation where the latter group disseminates more diverse information. To address this gap, our paper conducts a qualitative analysis that extracts the overlapping and unique parts of what each group was talking about, aiming to gain a comprehensive understanding of their perspectives.
To investigate these engagement patterns, we have developed a novel methodology that enables both quantitative and qualitative comparative analyses of the discussion topics. The quantitative examination seeks to identify dominant topics for each group, discerning intriguing thematic differences. For instance, our analysis revealed that early engagers tend to focus on technology-centric subjects, while late engagers lean towards socio-cultural dialogues. Moreover, our in-depth analysis of the qualitative differences between early and late engagers revealed that early engagers tend to focus on a more limited range of perspectives, whereas late engagers exhibit a broader range of viewpoints within the same topic. Furthermore, our analysis identified that a high volume of discussion does not necessarily result in semantic diversity within a topic. This underscores the importance of diving beneath the surface level of engagement numbers to understand the actual content and nuances of the discussions happening among users.
In conclusion, our research offers valuable insights into the public discourse surrounding ChatGPT, shedding light on the unique conversational inclinations of early and late engagers. Additionally, the hybrid approach of quantitative and qualitative methods is versatile enough for broader applications in studying engagement patterns within other user groups and contexts beyond this specific case.

RELATED WORK
The advent of ChatGPT has led to an increase in research aimed at exploring public perceptions towards it, with Twitter data as the primary source of information. Haque et al. [9] employed Latent Dirichlet Allocation (LDA) for topic modeling to identify popular topics in ChatGPT-related tweets. Subsequently, they performed sentiment analysis based on these topics. Taecharungroj [18] also made use of LDA topic modeling, but on a larger dataset comprising over 200,000 tweets. Their focus was more on identifying the strengths and limitations of ChatGPT than on examining public opinions. In a more encompassing study, Leiter et al. [12] carried out sentiment analysis and classified tweets into 19 predefined topics. This was facilitated by a model built upon RoBERTa that was fine-tuned explicitly for tweet topic classification [2]. They expanded their analysis to incorporate differences across various languages and over time. Unlike the aforementioned studies, our research is geared towards a comparative study between groups. We also offer a distinctive approach by calculating the proportion of shared or unique content between groups. This method provides insight into mutual understanding and polarization within groups.
To the best of our knowledge, no studies have combined traditional topic modeling techniques with bias analysis of semantic vectors as our approach does, even for other tweet analysis topics such as COVID-19 and the US election. Studies in the field of fairness in machine learning sometimes investigate the bias of semantic vectors between sensitive attributes such as gender or race. Our methodology draws inspiration from Bolukbasi et al. [5], who examined biases in word embeddings between genders. They aimed to measure the gender bias of word representations in Word2Vec and GloVe by calculating the projections onto the principal components of the differences between the embeddings of a list of male and female word pairs, a concept they termed "gender direction". Our approach is similar in that it maps tweets onto a dimension that separates early and late engagers and analyzes the distributional bias concerning these user groups.
Twitter was selected as the primary data source for this study due to its widespread popularity and utilization as a platform for social interaction. We originally collected public Japanese tweets that were posted between November 30, 2022, and February 20, 2023, and mentioned the term "ChatGPT". This extraction was performed through the Twitter API v2 with Academic Research access. Table 1 provides a statistical summary of the assembled dataset.

Differentiation between Early and Late Engagers
We aimed to apply the principles of innovation theory in order to distinguish between early and late engagers of ChatGPT. This theory helps categorize users into several distinct groups based on the timeline of their adoption or involvement with a new product, service, or technology. Specifically, it designates about 2.5% of the total population as innovators, around 13.5% as early adopters, 34% as early majority, another 34% as late majority, and roughly 16% as laggards. Figure 1 illustrates the number of users posting their first tweet about ChatGPT, with the x-axis representing the timeline and the y-axis indicating the number of users. The ongoing evolution and growth of the ChatGPT topic, as depicted in this figure, suggest that these precise percentages may not be suitable. In response, we examine the graph of user volumes, noting an initial spike in user engagement and a subsequent calming period around December, followed by a reacceleration in mid-January when a large number of users began to participate. We interpret this inflection point as a chasm [15], representing a critical hurdle that must be overcome for a product to transition from the enthusiasm of early adopters to widespread acceptance among mainstream customers. According to innovation theory, bridging this gap is essential for the success of advanced technological products in the mass market. Following this notion, we divided the users into groups at this specific juncture.
To set the specific date for classification, we calculated, for each day, the average number of new users over the preceding seven days and selected the day with the lowest average as the threshold. As a result, users who first tweeted before December 31, 2022 were classified as early engagers, while those who started tweeting after this date were categorized as late engagers. Table 2 also provides a statistical summary of the assembled dataset on or after December 31, 2022.
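The threshold selection described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' code: the function name and the daily counts are hypothetical, and the window is taken to be the seven days ending on the day in question.

```python
def lowest_trailing_average(daily_new_users, window=7):
    """Return (index, mean) of the day whose trailing `window`-day mean is lowest."""
    if len(daily_new_users) < window:
        raise ValueError("need at least `window` days of counts")
    best_index, best_mean = None, float("inf")
    for i in range(window - 1, len(daily_new_users)):
        mean = sum(daily_new_users[i - window + 1 : i + 1]) / window
        if mean < best_mean:  # strict '<' keeps the earliest day on ties
            best_index, best_mean = i, mean
    return best_index, best_mean


# Hypothetical daily counts: an initial spike, a lull, then a reacceleration.
counts = [50, 40, 30, 20, 10, 5, 5, 5, 5, 5, 5, 5, 30, 60, 90]
threshold_day, threshold_mean = lowest_trailing_average(counts)
```

Users whose first tweet falls before the returned day would be treated as early engagers and the rest as late engagers.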

TOPIC MODELING APPROACH
Our study aims to compare engagement patterns between two groups. For this purpose, we first apply topic modeling to categorize each tweet according to specific themes. Topic modeling can be generally classified into two main types: Bayesian probabilistic topic models (BPTMs), such as Latent Dirichlet Allocation (LDA) [4], and clustering-based topic models (CBTMs), including models like BERTopic [8]. Recent studies indicate that CBTMs outperform BPTMs [20], producing more coherent and diverse topics while requiring fewer computational resources and less time. Our work also adopts CBTMs, and the subsequent subsections outline the details of our implementation.

Extracting Semantic Features via Text Embeddings
The first step involves extracting semantic features from the tweets using text embeddings. We filter out retweets and preprocess the text by removing mentions and URLs. OpenAI's text-embedding-ada-002 is then utilized to transform the cleaned text of each tweet into a 1536-dimensional sentence embedding.
To mitigate the Curse of Dimensionality [1], which could adversely affect clustering results, we reduce the dimensions of the sentence embeddings. Based on the recommendations in [20], we employ Uniform Manifold Approximation and Projection (UMAP) [14] to bring down the dimensions from 1536 to 5.

Density-Based Clustering and Hyperparameter Tuning
We use Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) [13] for clustering tweets into topics. HDBSCAN is chosen for its efficacy in grouping similar topics and its ability to discern noise clusters, which contain posts that are not closely related to any particular topic.
For hyperparameter tuning, we use the Density-Based Clustering Validation (DBCV) index [16] to measure clustering quality. The DBCV index evaluates the density connectivity between pairs of data points. We aim to optimize two important parameters of HDBSCAN as implemented by scikit-learn: min_cluster_size and min_samples. The former specifies the smallest size that HDBSCAN recognizes as a cluster, whereas the latter determines the stringency in defining clusters. We experimented with the values 10, 25, 50, 100, and 200 for both parameters.
After optimization, the hyperparameters were set to min_samples = 10 and min_cluster_size = 25, resulting in 390 distinct topics, along with one noise cluster containing tweets not closely associated with any specific topic.

Topic Filtering
Once the HDBSCAN algorithm has been used to extract topics, we turn our attention to refining the dataset. We observe that some topics are heavily dominated by a small subset of users. We decide to exclude such topics, as they may not be representative of the broader community. To identify these topics, we calculate a metric termed "user half line", inspired by [3], which measures the proportion of users contributing to 50% of the content within a topic. This metric provides a quantitative indication of the concentration of content generation, which may hint at dominant voices or coordinated content dissemination.
Figure 2 visually presents our analysis. The x-axis represents the proportion of users contributing to half of the content, while the y-axis shows the number of tweets in each topic. We use the Interquartile Range (IQR) method with a multiplier of 1.5 to pinpoint outlier topics. Consequently, topics with a user half line of 32.8% or less are excluded, leaving us with a refined dataset containing 351 topics.
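As an illustration of the filtering step, the sketch below computes a user half line for one topic and flags low outliers with an IQR fence. It is an assumption-laden reading of the text: we take the metric to be the smallest share of users (most active first) whose tweets cover at least half of the topic, and the quartile convention is that of Python's `statistics.quantiles`, which may differ from the one used in the paper.

```python
import statistics


def user_half_line(tweets_per_user):
    """Smallest share of users (most active first) covering half of a topic's tweets."""
    counts = sorted(tweets_per_user, reverse=True)
    half = sum(counts) / 2
    covered = 0
    for n_users, count in enumerate(counts, start=1):
        covered += count
        if covered >= half:
            return n_users / len(counts)
    raise ValueError("topic has no tweets")


def low_outliers(values, multiplier=1.5):
    """Flag values at or below the lower IQR fence Q1 - multiplier * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    fence = q1 - multiplier * (q3 - q1)
    return [v <= fence for v in values]


# Hypothetical topics: one user dominates the first, the second is balanced.
dominated = user_half_line([10] + [1] * 10)
balanced = user_half_line([1, 1, 1, 1])
flags = low_outliers([0.40, 0.42, 0.44, 0.45, 0.46, 0.48, 0.50, 0.10])
```

Topics flagged `True` would be the ones dropped before the comparative analysis.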

QUANTITATIVE ANALYSIS IN TOPIC BETWEEN EARLY AND LATE ENGAGERS
In this section, our objective is to address the research question RQ1: Which topics are prominently discussed by early and late engagers? Figure 3 presents a scatter plot where each point represents a topic, and its coordinates indicate the number of tweets by early and late engagers. Points located above the line $y = x$ signify topics that late engagers discuss more frequently. Our analysis reveals that 88% of the topics have a higher number of tweets by early engagers compared to late engagers.
Additionally, we included a line defined by the equation $y = \frac{166349}{36736} x$ in the scatter plot, where the denominator represents the number of early engagers, while the numerator represents the number of late engagers. Points situated below this line indicate that the average number of tweets per person is greater among early engagers than late engagers. Notably, 80% of the topics fall below this line, suggesting that individuals within the early engagers' group have higher engagement levels per topic.
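The per-person comparison encoded by this line can be checked topic by topic. The sketch below uses the two group sizes quoted in the text; the topic counts are hypothetical and the function name is ours.

```python
N_EARLY, N_LATE = 36736, 166349  # group sizes quoted in the text


def early_more_active_per_person(early_tweets, late_tweets):
    """True when a topic lies below the line y = (N_LATE / N_EARLY) * x."""
    return late_tweets / N_LATE < early_tweets / N_EARLY


# Hypothetical (early, late) tweet counts for three topics.
topics = [(120, 300), (80, 500), (40, 30)]
share = sum(early_more_active_per_person(e, l) for e, l in topics) / len(topics)
```

Here `share` is the fraction of topics on which early engagers post more tweets per person, the quantity reported as 80% in the text.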
To identify specific topics that were particularly prominent in each group, we normalize the volume of discussion for each topic based on the number of tweets made by both early and late engagers. We then compute, for each topic, the difference in the proportion of tweets between the two groups from three quantities: $p_t$, the probability of early engagers discussing topic $t$; $q_t$, the probability of late engagers discussing the same topic; and $m_t$, the average topic probability between early and late engagers. This metric is a component of the Jensen-Shannon Divergence [7].
Due to limited space, we highlight and analyze 16 distinctive topics that are identified as outliers using the IQR method with a multiplier of 4. A summary of these topics, generated with the assistance of ChatGPT, is provided in Table 3.
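For intuition, one way to obtain a per-topic score from the Jensen-Shannon Divergence is sketched below. This is an assumption on our part, not the paper's exact formula: we take each topic's summand of the divergence and attach a sign according to which group gives the topic the larger share, so that positive values lean early and negative values lean late.

```python
import math


def signed_jsd_terms(early_counts, late_counts):
    """Per-topic Jensen-Shannon summands, signed by the group with the larger share."""
    n_early, n_late = sum(early_counts), sum(late_counts)
    terms = []
    for e, l in zip(early_counts, late_counts):
        p, q = e / n_early, l / n_late  # topic probabilities in each group
        m = (p + q) / 2                 # average topic probability
        term = 0.0
        if p > 0:
            term += 0.5 * p * math.log(p / m)
        if q > 0:
            term += 0.5 * q * math.log(q / m)
        # The sign convention is an illustrative assumption.
        terms.append(term if p >= q else -term)
    return terms


# Hypothetical tweet counts over three topics.
scores = signed_jsd_terms([60, 30, 10], [20, 30, 50])
```

The absolute values of the scores sum to the Jensen-Shannon Divergence (in nats) between the two topic distributions.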
The early engagers appear more invested in the technical potential and future prospects of AI and its development. They are interested in the upcoming versions, possible integrations, and anticipated enhancements in AI. This includes looking at the merging of AI capabilities for improved output, plans of future releases by major tech players, and breakthroughs in integrating AI into different environments, such as IDEs. This perspective indicates an anticipatory stance towards technology, focusing on its development, potential, and its ability to disrupt existing paradigms. The late engagers, however, seem to focus on the current practical applications, real-world usage, and the observable implications of AI technologies. This includes discussions about their integration into existing platforms for various applications, the pros and cons of these tools in their current state, and their impacts on different fields such as academic writing, content creation, and software development. The late group also emphasizes user experiences, pragmatic evaluations, and potential cautionary aspects of AI tools.
Thus, the early engagers embody a more forward-looking, speculative viewpoint, focusing on technological progression and its potential transformative impact. The late engagers offer a more contemporary and application-focused perspective, centered on maximizing the current capabilities of AI technologies while navigating their limitations. In conclusion, this experimental result resonates well with the innovation theory proposition that early engagers tend to focus on technology-centric subjects, while late engagers lean towards socio-cultural dialogues. The complementarity of these viewpoints is crucial in driving both the evolution of AI technologies and their effective integration into practical applications.

SEMANTIC DIVERSITY WITHIN TOPICS
In the previous section, we observed that early and late engagers discuss various topics in disparate proportions. This finding prompted an examination of whether these two groups also interact differently within the same topics.
In the following, we aim to answer these sub-research questions.
• RQ2-1. Which group, early or late engagers, speaks more diversely within a topic?
• RQ2-2. Is there a relationship between the tweet volume and semantic diversity?
• RQ2-3. Controlling for the bias associated with tweet volume, which group speaks more broadly, the early or late engagers?
Figure 4: Overview of our proposed method for comparing semantic diversity

Semantic Venn Diagram Comparing Two Groups
In our study, we focus on understanding the differences in the content discussed by two groups within each topic. To this end, we have employed a four-step methodology to create a "semantic Venn diagram" comparing the two groups. This methodology allows us to quantify the extent of divergence in discussions and the areas of overlap or uniqueness. Figure 4 provides a graphical representation of the steps involved in this procedure.
(1) Text Embedding: The first step involves encoding the textual content into a representation more suitable for analysis. By transforming the text into embeddings, as detailed in Section 4.1, we can leverage the rich information they capture for further analysis.
(2) Linear Discriminant Analysis (LDA): Once the text is represented as embeddings, we apply Linear Discriminant Analysis (LDA) [10] to map the embeddings of each topic into a single dimension. In this section, LDA refers to Linear Discriminant Analysis rather than Latent Dirichlet Allocation. LDA is particularly advantageous because it seeks to maximize the variance between groups while minimizing the variance within groups. In the context of our study, this effectively translates to identifying perspectives or dimensions where the discussions by the two groups are most divergent. This process is similar to the gender direction mapping of a previous study [5], which measured embedding bias between men and women using the principal components of the differences between the embeddings of a list of male and female word pairs.
(3) Kernel Density Estimation (KDE): After obtaining the one-dimensional embeddings for each topic through LDA, we employ Kernel Density Estimation (KDE) [6] to estimate the probability density function of each group within the topic. KDE helps in understanding the distribution of discussions for each group across the identified dimension. For this, we set a hyperparameter at 95% to define the regions for calculating the areas $E_t$ and $L_t$, where $E$ and $L$ represent the areas of early and late engagers respectively, and $t$ denotes the topic id. The bandwidth for KDE is selected through cross-validation [19].
(4) Finally, we calculate the areas where both groups have discussions ($E_t \cap L_t$), areas where discussions are exclusive to early engagers ($E_t \setminus L_t$), and areas where discussions are exclusive to late engagers ($L_t \setminus E_t$). This step allows us to quantify the extent of commonality and divergence in the content discussed by the two groups within each topic.
By executing these steps, our methodology provides a systematic and quantitative approach to analyze the differences and similarities in the semantic content of discussions between two groups.This helps in gaining insights into the nature of their communication and identifying the unique and shared aspects of their discussions across different topics.
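The last two steps can be sketched in plain Python, starting from one-dimensional projections that are assumed to have been produced already by the discriminant step. Several choices here are our assumptions rather than details given in the text: the 95% region is taken to be the highest-density region holding 95% of the estimated mass, the bandwidth is passed in by hand instead of being chosen by cross-validation, and areas are approximated by counting grid cells.

```python
import math


def gaussian_kde(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at every grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
        for g in grid
    ]


def high_density_cells(density, mass=0.95):
    """Flag the highest-density grid cells that together hold `mass` of the total."""
    target = mass * sum(density)
    flags = [False] * len(density)
    covered = 0.0
    for i in sorted(range(len(density)), key=lambda k: density[k], reverse=True):
        if covered >= target:
            break
        flags[i] = True
        covered += density[i]
    return flags


def semantic_venn(early, late, bandwidth, n_grid=400, mass=0.95):
    """Compare the regions occupied by two groups along a 1-D discriminant axis."""
    lo = min(early + late) - 3 * bandwidth
    hi = max(early + late) + 3 * bandwidth
    grid = [lo + i * (hi - lo) / (n_grid - 1) for i in range(n_grid)]
    early_cells = high_density_cells(gaussian_kde(early, grid, bandwidth), mass)
    late_cells = high_density_cells(gaussian_kde(late, grid, bandwidth), mass)
    both = sum(a and b for a, b in zip(early_cells, late_cells))
    only_early = sum(a and not b for a, b in zip(early_cells, late_cells))
    only_late = sum(b and not a for a, b in zip(early_cells, late_cells))
    union = both + only_early + only_late
    return {
        "early_share": (both + only_early) / union,  # E / (E ∪ L)
        "late_share": (both + only_late) / union,    # L / (E ∪ L)
        "overlap": both / union,                     # (E ∩ L) / (E ∪ L)
    }


# Hypothetical projections: early tweets are tightly grouped, late tweets spread out.
result = semantic_venn([0.0, 0.1, -0.1], [-2.0, -1.0, 0.0, 1.0, 2.0], bandwidth=0.3)
```

In this toy case the late group occupies a larger share of the union than the early group, which is the kind of pattern the method is meant to surface.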

Evaluating Semantic Diversity in Early vs. Late Engagers
To answer sub-research question RQ2-1 (Which group, early or late engagers, speaks more diversely within a topic?), we examined the range of content covered by each group within each topic. In Figure 5, the x-axis represents the proportion of the area covered by the early engagers in each topic ($\frac{E_t}{E_t \cup L_t}$), while the y-axis shows the same for the late engagers ($\frac{L_t}{E_t \cup L_t}$). This scatter plot allows for a direct comparison of the semantic diversity of both groups.
The mean value of $\frac{E_t}{E_t \cup L_t}$ was found to be 0.6, while the mean value of $\frac{L_t}{E_t \cup L_t}$ was 0.82. A Welch's t-test was conducted on these sets of data, and the results confirmed a significant difference between the two ($p < 0.0001$). This indicates that late engagers generally encompass more diverse viewpoints within the topics identified in Section 4. Conversely, early engagers seem to be more focused or limited in their discussion within the topics.
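For readers unfamiliar with Welch's t-test, the sketch below computes its test statistic and the Welch-Satterthwaite degrees of freedom from two samples. The p-value lookup against the t distribution is omitted, and the sample values are hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-topic coverage values for two groups.
t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```

Unlike Student's t-test, this variant does not assume that the two groups have equal variances, which suits coverage ratios whose spread may differ between groups.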

Relationship between the Tweet Volume and the Semantic Diversity
In the previous section, we observed that the late engagers had a more diverse coverage of each topic compared to the early engagers. One of the potential explanations for this could be seen in Figure 2, which illustrates that late engagers also had a higher volume of tweets per topic than the early engagers. To investigate whether there is a correlation between the semantic diversity and the tweet volume, a comparative analysis was conducted. Sub-research question RQ2-2 (Is there a relationship between the tweet volume and semantic diversity?) was explored by plotting the relationship between them in Figures 6a and 6b. The x-axis of Figure 6a shows the ratio of the tweet volume of the early engagers per topic, while the y-axis represents the ratio of the semantically covered area by the early engagers ($\frac{E_t}{E_t \cup L_t}$). With a correlation coefficient of 0.48, a relatively weak positive correlation exists between the two factors. Similarly, Figure 6b represents the late engagers' data. The correlation coefficient for this set is slightly lower at 0.37, again suggesting a weak correlation.
The identified weak correlations suggest that the findings in Section 6.2 cannot be solely attributed to the large number of tweets. The content they share is also a significant factor.
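The text does not name the correlation coefficient used; assuming it is Pearson's r, it can be computed as in the following sketch, where the paired values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical (volume ratio, coverage ratio) values for three topics.
r = pearson_r([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

A value near 1 would mean that coverage grows almost linearly with volume; values such as those reported above leave much of the variation in coverage unexplained by volume.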

Stratified Analysis of Semantic Diversity by Early and Late Engagers
In the previous subsection, we discovered that it is essential to minimize the bias caused by differences in tweet volume in order to determine whether early or late engagers participate more in discussions with a broad range of meaning. This subsection answers RQ2-3: Controlling for the bias associated with tweet volume, which group speaks more broadly, the early or late engagers?
To achieve this, we employ a stratified analysis technique, allowing us to observe the groups within partitions that are more homogenous in terms of the ratio of tweet volume.This approach ensures a more unbiased comparison between the two groups.
Upon manual inspection of Figure 6a and Figure 6b, the dataset was divided into three strata based on the percentage of tweets each group accounted for in a given topic. The strata were formed as follows: up to 1/3, between 1/3 and 2/3, and above 2/3 of the ratio, which are colored in Figure 6.
The results of this stratified analysis are depicted in Figure 7 (a) through (c). In these figures, the x-axis represents the percentage of tweets that each group accounted for, while the y-axis shows the distribution of $\frac{E_t}{E_t \cup L_t}$ and $\frac{L_t}{E_t \cup L_t}$ for topics assigned to each group.
A key observation from the results is that, across all strata, the average value of $\frac{L_t}{E_t \cup L_t}$ is consistently higher than the average value of $\frac{E_t}{E_t \cup L_t}$. Furthermore, statistical tests reveal that the differences are significant across all strata ($p < 0.0001$). This implies that, even when controlling for tweet volume, late engagers still exhibit a broader range of discussion within the topics. Our findings highlight the need to consider the semantic nuances in tweet content. Therefore, future research should go beyond tweet volumes and explore the semantic aspects and biases of the content.
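The stratification can be sketched as a simple binning of topics by volume ratio followed by a per-stratum mean of the coverage ratio. The stratum labels, the treatment of the boundary values, and the sample rows are our assumptions for illustration.

```python
def stratum(volume_ratio):
    """Assign a topic to one of three strata by a group's share of its tweets."""
    if volume_ratio <= 1 / 3:
        return "low"
    if volume_ratio <= 2 / 3:
        return "mid"
    return "high"


def stratified_means(rows):
    """Map each stratum to the mean coverage of the (volume_ratio, coverage) rows in it."""
    buckets = {}
    for volume_ratio, coverage in rows:
        buckets.setdefault(stratum(volume_ratio), []).append(coverage)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}


# Hypothetical (volume ratio, coverage ratio) rows for one group.
means = stratified_means([(0.2, 0.5), (0.3, 0.7), (0.5, 0.6), (0.9, 0.8)])
```

Running the same function on each group's rows and comparing the means stratum by stratum mirrors the comparison reported above.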

CONCLUSION
This study explored the discourse about ChatGPT within the Japanese Twitter community, focusing on the tweet volume and semantic diversity of early and late engagers. The research proposed a dual methodology, incorporating both quantitative and qualitative analyses to understand the nature of the discussions.
The quantitative analysis revealed distinct conversational focuses between early and late engagers. Early engagers emphasized forward-looking and speculative topics, highlighting the technological advancements and potential transformative impact of ChatGPT. In contrast, late engagers engaged more with contemporary topics, focusing on optimizing existing AI capabilities and considering their limitations.
The qualitative analysis delved deeper into the discussions and measured the breadth of perspectives within topics between the two groups. A weak correlation was found between the volume of tweets and the range of discussed topics in both groups. We also found that early engagers tended to concentrate on a more limited range of perspectives, while late engagers exhibited a broader range of viewpoints, even after controlling for the bias of tweet volume. This finding emphasizes the importance of identifying semantic diversity and understanding the content and nuances of discussions beyond the volume of tweets.
The dual quantitative and qualitative methodology employed in this study is versatile and applicable to studying online discourse patterns within other user groups or beyond the context of ChatGPT. The insights gained from this research contribute to a better understanding of public discourse surrounding emerging technologies and the communication styles of different user groups.
Future studies can extend this research in several ways. Firstly, by utilizing the proposed methodology, we will be able to extract topics where semantic polarization is occurring and determine the existence of conflicting opinions within those topics. Secondly, exploring engagement patterns with other community clustering methods, such as network-based clustering, can shed light on the influence of community structures on discussion dynamics. Additionally, grouping tweets based on their popularity, distinguishing between well-retweeted and rarely retweeted tweets, may reveal how only some perspectives are seen by many people. Furthermore, tracking the evolution of these engagement patterns over time can offer valuable insights into the development and maturation of user interactions and public discourse surrounding emerging technologies. By addressing these areas in future research, we can achieve a more comprehensive understanding of the complex dynamics involved in discussions about emerging technologies and their societal impact.

Figure 1: Trends in the daily number of new users who made their first post containing ChatGPT. Early and late engagers are defined based on the threshold set on December 31, corresponding to the minimum value of the seven-day average represented by the red dotted line.

Figure 2: This scatter plot shows topics as individual points, with the y-axis representing the size of the topic and the x-axis indicating the proportion of users responsible for creating half of the content. Topics colored in orange are identified as noisy majority topics using the IQR method and are excluded from further analysis.

Figure 3: This scatter plot represents the tweet activity per person, with each point indicating a specific topic. The x-axis represents the number of tweets per person of early engagers, while the y-axis represents that of the late engagers. The scatter plot further highlights distinct topics, denoted by red and blue markers, which exhibit notable differences in terms of topic proportions between the two groups.

Figure 5: Scatter plot in which each point represents a specific topic. The x-axis depicts the magnitude of early engagers, labeled as $\frac{E_t}{E_t \cup L_t}$, while the y-axis portrays the magnitude of late engagers, labeled as $\frac{L_t}{E_t \cup L_t}$.

Figure 6: Scatter plots in which each point represents a specific topic. The x-axis depicts the ratio of tweet volume of early (a) and late (b) engagers, while the y-axis portrays $\frac{E_t}{E_t \cup L_t}$ (a) and $\frac{L_t}{E_t \cup L_t}$ (b).

Table 1: Statistical description of the tweet dataset

Table 3: Summaries of topics with particularly different percentages spoken about in early and late engagers
Automatic Care Report Generator" on Google Sheets, integrating ChatGPT into Google Docs and Sheets for research automation, information gathering, and streamlining administrative tasks.
late: The integration of AI with Notion tool generates infinite articles, providing capabilities like idea generation with ChatGPT, and fact-checking is essential for AI-generated content.
late: Various discussions about the capabilities and applications of ChatGPT occur, with differing user experiences and opinions shared online.
late: A collection of tweets and headlines discuss a variety of topics, including cryptocurrency news, AI digital tokens, and the potential role of ChatGPT in the cryptocurrency market.
late: While ChatGPT struggles with accurate numerical answers, it is useful for those weak in IT and can assist in expressing desires and thoughts, highlighting the evolving role and limitations of AI.
late: The potential of ChatGPT to replace search engines is discussed, exploring its use in content creation and potential impact on the demand for web articles.
late: Various uses and limitations of ChatGPT are discussed on Twitter, including language learning, blogging, coding, and information gathering.
late: People are exploring ChatGPT's utility for diverse purposes including joke making, prospecting, cold emailing, and more, indicating the AI tool's broad applications and growing popularity.