From Guest to Family: An Innovative Framework for Enhancing Memorable Experiences in the Hotel Industry

This paper presents an innovative framework developed to identify, analyze, and generate memorable experiences in the hotel industry. People prefer memorable experiences over traditional services or products in today's ever-changing consumer world. As a result, the hospitality industry has shifted its focus toward creating unique and unforgettable experiences rather than just providing essential services. Despite the inherent subjectivity and difficulties in quantifying experiences, the quest to capture and understand these critical elements in the hospitality context has persisted. However, traditional methods have proven inadequate due to their reliance on objective surveys or limited social media data, resulting in a lack of diversity and potential bias. Our framework addresses these issues, offering a holistic solution that effectively identifies and extracts memorable experiences from online customer reviews, discerns trends on a monthly or yearly basis, and utilizes a local LLM to generate potential, unexplored experiences. As the first successfully deployed, fast, and accurate product of its kind in the industry, this framework significantly contributes to the hotel industry's efforts to enhance services and create compelling, personalized experiences for its customers.


I. INTRODUCTION
In 1998, Pine and Gilmore introduced the "experience economy" concept to describe the emerging consumer demand for experiences over products and services [45]. They observed that customers were no longer satisfied with simply buying services; they sought to buy unique, memorable experiences. Hence, companies in today's competitive market should go beyond function and upgrade their offerings to provide an experience.
Csikszentmihalyi emphasized the significance of experiences in generating a profound sense of pleasure that leads to favorable memories [14]. Creating a memorable experience (ME) is taking root in the hospitality industry. The industry is customer-centric; therefore, its products are highly experience-oriented [35]. The shift from service delivery to experience creation began as researchers realized that strategies focusing solely on service, quality, and price are no longer the primary drivers of competitive advantage. Staging an experience so that it becomes a ME has become increasingly important among the core capabilities of hotels, for instance, and plays a crucial role in determining whether guests will choose to revisit [24], [39], [40]. Today, destination managers and tourism businesses must treat memorable experiences as a new standard to meet, since travelers now desire authentic and meaningful experiences that cater to their leisure and spiritual desires [32].
Experiences, as Pine and Gilmore [45] stated, are "inherently personal, existing only in the mind of an individual who has been engaged on an emotional, physical, intellectual, or even spiritual level." This mix of distinctly subjective factors makes experience hard to quantify or measure accurately [12], [16]. However, researchers in hospitality and tourism have attempted to define the experience construct in the context of their field (tourist experience, traveler experience, guest experience) and to explore the link between tourism experiences and memory by applying different methods to measure it. For example, Kim and Chen [31] characterize tourist experiences as intangible, unique, ongoing, and highly subjective occurrences that can be understood from two perspectives: the immediate, moment-to-moment encounters and the overall assessment of the experience. Vada et al. [64] define a memorable tourism experience (MTE) as a positive encounter that is retained and remembered by individuals even after the actual event has taken place. Seyfi et al. [51] suggest that the quality of the experience is a stronger predictor of creating MEs than the quality of service. Identifying MEs is a challenging task, but these criteria can help in capturing customers' memorable experiences.
The traditional methods used to measure tourists' MEs in hospitality and tourism include questionnaire surveys, self-reported diaries, interviews, observant participation, and the experiential sampling method [28]. More recently, memorable experience research has expanded to incorporate innovative techniques such as social media analytics [54]. However, current efforts have been limited to objective surveys or a small portion of social media data, making their results hard to generalize due to bias and lack of diversity.
To address the previously discussed challenges and, most importantly, to efficiently deploy a framework that can identify memorable experiences within the hotel industry on a real-world, large-scale platform, we propose From Guest To Family (G2F). In this paper, we detail the development of our platform and its deployment as a product that hotel industry management can use to enhance hotel services and stay ahead of the industry. The development of G2F consists of three main steps: (1) efficiently identifying the reviews that include positive or negative MEs, using the K-Means algorithm to cluster the reviews based on their sentiment scores and user ratings; (2) extracting the representative keywords that are trending for MEs in a monthly or yearly pattern, using an advanced keyword extraction algorithm; (3) generating novel and unexplored review texts that include MEs, using a local Large Language Model (LLM) and the extracted keywords. To the best of our knowledge, the proposed G2F is the first successfully deployed, fast, and accurate product for capturing, analyzing, and generating MEs in the hotel industry. Our contributions are summarized as follows:
• We have created a comprehensive platform capable of distinguishing and extracting both positive and negative MEs from online customer reviews within the hotel industry.
• Our platform also provides an analytical tool that uncovers trends in MEs on a monthly or yearly basis, thereby enabling hotel management to identify unique or recurrent key terms to improve their services.
• Lastly, we have built a platform that leverages a local LLM and the extracted key terms to generate potential yet unexplored MEs, assisting hotel management in anticipating and preparing for future scenarios.

A. Memorable experiences in Hospitality
Tourism, often seen as a journey of experiences, involves various elements like accommodation, local interaction, transportation, attractions, and culinary experiences [23]. Studies by [63] and [32] focused on the link between tourism experiences and memory, the elements that make experiences memorable, and the conceptualization of the term "memorable tourism experience." They identified key dimensions and developed a scale encompassing hedonism, refreshment, local culture, meaningfulness, knowledge, involvement, and novelty. While these focused primarily on positive experiences, more recent studies have started considering negative experiences as potentially memorable components [56].
Despite these developments, there is still a lack of consensus on theories and measurement of the concept. The scales used are often seen as inadequate in capturing the true essence of what makes a tourism experience memorable. Most studies have utilized close-ended surveys, interviews, open-ended questionnaires, and travel blog narrative analysis, with few drawing on content analysis [55]. The need for more comprehensive and updated research on the topic is palpable, as several scholars call for further studies to deepen our understanding of memorable tourism experiences [23], [32], [55].

B. Keywords Extraction
Automatic Keyword Extraction (AKE) is designed to quickly and efficiently identify a small yet representative set of words that accurately reflect the key topics in a text document without requiring time-consuming manual annotation by experts [66]. Different terminologies are used to describe the most significant information extracted from a text, such as key phrases, key segments, key terms, or keywords. However, they all serve the same purpose [5]. Past keyword extraction (KE) techniques are mainly supervised or unsupervised. Supervised methods typically require a substantial labeled training dataset to achieve high performance, which often limits them to specific domains. Consequently, unsupervised methods such as TextRank [41], Yake [9], EmbedRank [6], SIFRank [59], AttentionRank [15], and MDERank [71] have emerged as widely adopted and robust alternatives. Starting with TextRank [41], a graph-based approach, KE algorithms have continuously evolved to address the limitations of previous methods, resulting in improved accuracy, efficiency, and relevance of extracted keywords. To the best of our knowledge of the current state-of-the-art (SOTA) works, both SIFRank [59] and MDERank [71] demonstrate notable strengths in terms of robustness, efficiency, and keyword relevance.
KE is widely used across multiple domains and industries for purposes such as tracking research trends [52], analyzing pandemic trends [65], or improving educational methods [17]. In the hospitality and tourism domain, Le Huy et al. [36] suggested a KE approach based on BiLSTM-CRF combined with BERT for effectively extracting key phrases related to information and search methods in the field of tourism. A different study [37] developed a tool called VisTravel that used TextRank [41] to identify and extract essential words from travel reviews, enabling the tourism management team to gain insights into customers' opinions. Additionally, [68] introduced an online hotel review analysis using KE based on the TF-IDF algorithm to extract the top 20 keywords that reflect the factors of greatest concern to hotel consumers. Chang et al. [10] also developed a visual analytics framework for exploring insights from hotel ratings and reviews. They incorporated their keyword extraction method by integrating a sentiment-based model learned through SVM. While keyword extraction techniques have been applied in the tourism and hotel industries, previous studies have not used high-accuracy or SOTA keyword extraction methods, which offer more relevant and accurate keywords and perform efficiently and robustly across keyphrases of various lengths [71]. Consequently, those results are affected by the limitations of low-accuracy KE methods.

C. Large Language Models for Text Generation
Text generation and text summarization share the common goal of producing coherent and comprehensible texts tailored to individual users' needs. Text summarization creates concise summaries of longer documents or texts. There are two types of summaries: extractive, which assembles summaries from the source text [29], [48], [72], and abstractive, which generates summaries that contain novel words to simulate human summaries [20], [44], [50]. Various methods have been proposed to help travelers choose hotels. Hu et al. [25] used extractive summarization to identify informative sentences based on author reliability, review time, usefulness, and conflicting opinions. Tsai et al. [62] created high-quality summaries by identifying helpful reviews and categorizing sentences into location, sleep quality, room, service, value, and cleanliness. Nathania et al. [22] developed a tool to generate paragraph and phrase-based summaries and analyze annual sentiment trends. However, these methods may lack coherence and novelty and be limited to the original text.

III. METHODOLOGY
The development of G2F consists of three main parts, as illustrated in Figure 1. The first step is identifying the most positive and negative MEs from the reviews through sentiment analysis and K-Means clustering. The second step is extracting the most representative yearly and monthly keywords of the positive and negative MEs using an advanced keyword extraction algorithm. This step enables G2F to conduct a yearly or monthly analysis of the positive and negative MEs by computing the frequency distribution of the extracted keywords. In the third step, given the yearly/monthly extracted words, we customize four prompts to be inputted into a local open-source text generation LLM (Vicuna) to obtain different and unexplored positive and negative MEs in the hotel industry. Each prompt contains a concatenation of the top 20 extracted keywords, 500 random extracted keywords, and a positive/negative customized prompt. G2F details are discussed in the following subsections.

A. Memorable Experiences Identification from Reviews
The main goal of G2F is to extract MEs from customer reviews. Most hotel research (see Section II) considers only positive MEs, such as wedding days at hotels or great local food with a fantastic view. However, MEs can also be highly negative, like a clogged toilet that fills the room with a pungent odor while the hotel management does nothing about it. Each positive or negative experience has a long-term effect, such as revisiting or never booking there again. G2F considers both positive and negative experiences by extracting them from the most positive and most negative hotel reviews.
Relying only on the reviewers' rating scores, from 1 (the lowest) to 5 (the highest), is not sufficient or reliable for distinguishing the most positive or most negative reviews. For example, a customer who is having a bad day may give a rating of 1 to a hotel that does not offer discounts or cash. On the other hand, a rating of 5 can be given by a biased customer who compliments the hotel's cleanliness because he knows someone there. Therefore, we corroborated the rating score of each review with sentiment analysis scores to obtain more reliable scores for the most positive and negative reviews. We employ VADER [27] for sentiment analysis, a lexicon comprised of human-annotated phrase-emotion pairs. VADER returns three emotion scores (positive, negative, neutral) ranging from 0 to 1 and a compound score ranging from −1 to 1 for a given text input.
G2F feeds the reviews' sentiment analysis scores and ratings to the K-Means clustering algorithm to group the reviews with the most positive and most negative sentiment scores and ratings. In this work, we employed K-Means over other clustering methods, such as hierarchical clustering, DBSCAN, or GMM, because K-Means is computationally more efficient on a large-scale dataset. In addition, methods such as DBSCAN operate on the concept of density-based clustering, where a point that does not belong to the density neighborhood of any cluster can be regarded as noise. This is not ideal for us, as we are interested in including the extreme instances in our cluster-based analysis. Consequently, we can focus on the top positive and negative clusters to extract MEs. The K-Means clustering algorithm divides N reviews in attr dimensions into K clusters by minimizing the sum of squared distances between each review and the centroid of its cluster. Given the set of N reviews represented as $R = \{r_1, r_2, \dots, r_N\}$, with each $r_j \in \mathbb{R}^{attr}$, we optimize the sum of squares from each review to the centroid of its cluster. The problem is formulated as follows:

$$\underset{C}{\arg\min} \sum_{i=1}^{K} \sum_{r \in C_i} \lVert r - \mu_i \rVert^2$$

where $C = \{C_1, \dots, C_K\}$ is the set of clusters whose points are the reviews, each represented by a vector whose elements are the attributes (rating, neutral, positive, negative, compound). The size of the cluster $C_i$ is $|C_i|$, the $L_2$ norm is represented by $\lVert \cdot \rVert$, and $\mu_i$ is the centroid of the points in $C_i$ such that

$$\mu_i = \frac{1}{|C_i|} \sum_{r \in C_i} r$$

We then find the optimized K clusters and label each review with the cluster it is grouped into. The top positive cluster and the most negative cluster are identified as the reviews that contain memorable experiences. Finally, the labeled reviews are used to find the most representative keywords in the next step.
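The clustering step above can be sketched as follows. This is a minimal, standard-library illustration of Lloyd's K-Means algorithm on the five attributes (rating, neutral, positive, negative, compound), not the deployed implementation. The toy feature values, the function names, and the rule of picking the extreme clusters by the centroid's compound score are assumptions made for illustration only.

```python
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean (L2) distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points: Sequence[Vector]) -> Vector:
    """Centroid mu_i: the component-wise mean of the points in one cluster."""
    return tuple(sum(column) / len(points) for column in zip(*points))


def kmeans(points: Sequence[Vector], k: int, max_iter: int = 100,
           seed: int = 0) -> Tuple[List[int], List[Vector]]:
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(list(points), k)  # start from k distinct reviews
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[i] = mean_vector(members)
    return labels, centroids


def extreme_clusters(centroids: Sequence[Vector],
                     compound_index: int = 4) -> Tuple[int, int]:
    """Assumed selection rule: rank clusters by the centroid's compound score."""
    order = sorted(range(len(centroids)),
                   key=lambda i: centroids[i][compound_index])
    return order[-1], order[0]  # (top positive cluster, most negative cluster)


# Hypothetical feature vectors: (rating, neutral, positive, negative, compound).
REVIEW_FEATURES: List[Vector] = [
    (5, 0.50, 0.50, 0.00, 0.95),
    (5, 0.55, 0.45, 0.00, 0.90),
    (4, 0.60, 0.40, 0.00, 0.85),
    (1, 0.50, 0.00, 0.50, -0.95),
    (1, 0.55, 0.00, 0.45, -0.90),
    (2, 0.60, 0.00, 0.40, -0.80),
]

labels, centroids = kmeans(REVIEW_FEATURES, k=2)
positive_cluster, negative_cluster = extreme_clusters(centroids)
```

On a dataset of HotelRec's size, a vectorized library implementation would be the practical choice; the pure-Python loop is only meant to make the objective and the centroid update explicit.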

B. Trend Analysis of Memorable Experiences Using Keyword Extraction
Up to this point, we have all the reviews from HotelRec accompanied by their cluster labels based on the previously discussed technique. We apply a state-of-the-art KE algorithm to the top positive and most negative cluster reviews to extract the most representative keywords from each cluster. MDERank [71] is a good candidate for the task because it outperforms all previous KE algorithms in terms of F1 score and has code available for implementation. However, MDERank is slower than SIFRank and outperforms it by an average of only 1.8 F1 points. Consequently, we use SIFRank to extract the keywords.
In SIFRank [59], a hotel review from HotelRec undergoes tokenization and part-of-speech tagging. Subsequently, noun phrases (NPs) are extracted using a pattern-based NP-chunker that relies on the part-of-speech tags, and these NPs are considered candidate keywords. The document's tokens are then fed into a pre-trained language model to acquire token representations, which may include multi-layer word embeddings. Using a sentence embedding model, the NPs and the entire document are transformed into NP embeddings and a document embedding, respectively, ensuring they share the same number of layers and dimensions. The similarity between candidate keyphrases and the document's topic is assessed using the cosine similarity between the NP embeddings and the document embedding. Finally, the top-N most similar candidate keywords are chosen as the final keywords for the hotel review.
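The final ranking step of this pipeline can be illustrated with a small sketch. It is not SIFRank itself: the pre-trained language model and sentence embedding model are replaced by a plain bag-of-words count vector (an assumption made so the example stays self-contained), and the candidate noun phrases are assumed to have been produced already by the NP-chunker. Only the idea of scoring candidates by cosine similarity to the document and keeping the top N is shown; the function names and sample text are hypothetical.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Stand-in embedding: word counts instead of language-model vectors."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_keyphrases(document: str, candidates: List[str], n: int) -> List[str]:
    """Rank candidate phrases by similarity to the whole document; keep the top n."""
    doc_vector = embed(document)
    ranked = sorted(
        candidates,
        key=lambda phrase: cosine_similarity(embed(phrase), doc_vector),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical review text and candidate noun phrases.
REVIEW = ("the staff was friendly and the room was spotlessly clean "
          "friendly staff made the stay memorable")
CANDIDATES = ["airport shuttle", "clean room", "friendly staff"]

keywords = top_keyphrases(REVIEW, CANDIDATES, n=2)
```

With real embeddings the scores would reflect semantic closeness rather than word overlap, but the ranking logic stays the same.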
The first step in integrating SIFRank into G2F involves preprocessing the reviews in the HotelRec dataset. This text preprocessing includes tokenization, stop word removal, and special character removal. Then, given a preprocessed review from HotelRec, SIFRank is applied to extract its keywords. Once extracted, the keywords play a pivotal role in analyzing the trends of MEs over the years and months. The analysis focuses on the top positive and most negative clusters. The reviews within each cluster are organized by year and further categorized into monthly sub-groups. Within these groups and sub-groups, we calculate the frequency distribution of words, which indicates their significance in both the yearly and monthly contexts. This process allows us to identify patterns in the data and to follow how trends in MEs evolve over time.
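The grouping and counting described above can be sketched as follows. This is a minimal illustration, assuming each review has already been reduced to a record of (year, month, cluster label, extracted keywords); the record layout, the function name, and the sample rows are hypothetical and do not reflect the HotelRec schema.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# (year, month, cluster label, keywords extracted from one review)
Row = Tuple[int, int, str, List[str]]


def keyword_trends(rows: Iterable[Row]):
    """Count keyword frequencies per (cluster, year) and per (cluster, year, month)."""
    yearly: Dict[Tuple[str, int], Counter] = defaultdict(Counter)
    monthly: Dict[Tuple[str, int, int], Counter] = defaultdict(Counter)
    for year, month, cluster, keywords in rows:
        yearly[(cluster, year)].update(keywords)
        monthly[(cluster, year, month)].update(keywords)
    return yearly, monthly


# Hypothetical rows for illustration.
ROWS: List[Row] = [
    (2018, 7, "positive", ["friendly staff", "world cup themed"]),
    (2018, 7, "positive", ["friendly staff", "clean room"]),
    (2018, 8, "positive", ["clean room"]),
    (2018, 7, "negative", ["dirty bathroom"]),
]

yearly, monthly = keyword_trends(ROWS)
top_july = monthly[("positive", 2018, 7)].most_common(1)
```

Keywords that rank highly in a single month but not across the year are the candidates for event-driven trends, while keywords that recur every month point to stable fundamentals.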

C. Creating Novel Memorable Experiences with Local LLM Text Generation
Given the representative ME keywords extracted from all the reviews, grouped by year and month according to their cluster (either the most positive or the most negative reviews), we use those keywords to create novel and unexplored memorable customer experiences for future hotel stays. To generate the texts for this step, we run a local text generation LLM, Vicuna [13]. Vicuna-13B [13] is an open-source chatbot trained by fine-tuning LLaMA. It uses dialogues from ShareGPT, a platform where users share their conversations, as its training data. We did not use ChatGPT [8] in order to avoid privacy issues. According to ChatGPT's privacy policy (footnote 1), one of the sources from which ChatGPT gathers information is the conversations or prompts typed into the chatbot itself. Furthermore, an initial assessment using GPT-4 as a judge indicates that Vicuna-13B achieves more than 90% of the quality of established models like OpenAI's ChatGPT and Google Bard, and outperforms other models such as LLaMA and Stanford Alpaca in over 90% of instances [13].
To generate a future and unexplored customer ME for a hotel stay, a prompt is customized and then inputted into Vicuna as illustrated in Algorithm 2. The prompts to generate a positive ME differ from the negative ones. Each prompt is generated by the Prompt(CustomizedText + keywords) function (see line 13 of Algorithm 1). An example of the unchanged part of the prompt (CustomizedText) for a positive one is "Write a positive memorable experience hotel review from the following keywords:". We aimed to make the unchanged part of the prompt as short as possible to simplify it for Vicuna. Then, we concatenate the top 20 extracted keywords and 500 random ones from the same group to CustomizedText. The top 20 extracted keywords help ensure that the review will be about positive fundamental hotel concepts. The 500 random keywords make the text generation unique because they differ each time. Here is an explained instance of the pseudo-code structure in Algorithm 2: to create a positive prompt for July 2018, we concatenate the positive CustomizedText, the top 20 keywords, denoted as $TopHTA_c^d$ where d is the date and c is the cluster, and another 500 random extracted keywords, denoted as $RandHTA_c^d$, from the grouped reviews of the exact date and cluster. We then input the concatenated prompt into Vicuna to get the final output.

Algorithm 2 (prompt structure):
    PROMPT_OBJECT {
        "PosPromptText": ( "CustomizedText": "Write a positive ..", "Top-20-Keywords": "w1, w2, ..., w20", "Rand-500-Keywords": "w1, w2, ..., w500" ),
        "NegPromptText": ( "CustomizedText": "Write a negative ..", "Top-20-Keywords": "w1, w2, ..., w20", "Rand-500-Keywords": "w1, w2, ..., w500" )
    }
    Output {
        "Pos_Rev": GenText(PosPromptText),
        "Neg_Rev": GenText(NegPromptText)
    }
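The prompt assembly described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the deployed code: the function name `build_prompt`, the comma-separated layout, and the choice to draw the random keywords from those outside the top 20 are hypothetical. Only the positive CustomizedText is taken from the paper; the call to Vicuna is omitted.

```python
import random
from typing import List, Optional

# Positive CustomizedText as quoted in the paper.
POSITIVE_TEXT = ("Write a positive memorable experience hotel review "
                 "from the following keywords:")


def build_prompt(customized_text: str, ranked_keywords: List[str],
                 top_n: int = 20, rand_n: int = 500,
                 seed: Optional[int] = None) -> str:
    """Concatenate the fixed text, the top keywords, and a random keyword sample."""
    rng = random.Random(seed)
    top = ranked_keywords[:top_n]
    remaining = ranked_keywords[top_n:]
    # Assumption: the random keywords are sampled without replacement from
    # the keywords of the same group that are not already in the top list.
    sampled = rng.sample(remaining, min(rand_n, len(remaining)))
    return customized_text + " " + ", ".join(top + sampled)


# Hypothetical keyword pool, already ranked by frequency for one (date, cluster) group.
KEYWORD_POOL = [f"kw{i}" for i in range(1000)]

prompt = build_prompt(POSITIVE_TEXT, KEYWORD_POOL, seed=1)
```

Leaving `seed` unset gives a different random sample on each call, which matches the intent that every generated review differs; fixing it makes a prompt reproducible for evaluation.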

IV. EXPERIMENT AND RESULTS
In this section, we first introduce the dataset and then describe the machine-based and human-based evaluation metrics applied to keyword extraction for memorable hotel experiences and to text generation. We then present the experimental results in a series of evaluations. In addition, a case study is provided to demonstrate the objectives of the G2F framework in practice.

A. Dataset
We conduct our study on HotelRec [2], a dataset collected exclusively for hotels. HotelRec is a comprehensive repository of hotel reviews collected from TripAdvisor (footnote 2). According to the third quarterly report in November 2019, available on the U.S. Securities and Exchange Commission website, TripAdvisor is established as the world's preeminent online travel platform, featuring roughly 1.4 million hotels (footnote 3). The dataset spans nineteen years, from February 1, 2001, to May 14, 2019, and stores 50,264,531 worldwide hotel reviews. These reviews are provided by a user base totaling 21,891,294 individuals. HotelRec's analysis of user contribution reveals a diverse distribution, with 67.55% of users writing a single review and 90.73% contributing fewer than five. The average review count per user is 2.24, with the median at one review. User evaluations, represented through obligatory overall ratings, signify the collective hotel experience. Each review in the dataset includes the user profile, the hotel URL, the overall rating, the summary, the user-written text, the date, and detailed sub-ratings on different hotel aspects when applicable. As it stands, HotelRec is unrivaled in size as a public dataset in the hotel industry and is the largest text-based recommendation dataset in any single domain.

B. Experiment Settings and Evaluation Metrics
The K-Means clustering algorithm is simple and efficient. However, a major challenge in this technique is determining the number of clusters K that should be chosen to group the data. To determine K, we first use a statistical technique called the elbow method. The method calculates the sum of squared distances from each data point to its assigned center point, or centroid, for each run of the K-Means algorithm. Each run is carried out with a different number of clusters. The result is displayed in the lower right chart in Figure 2. The chart suggests that K = 4 is the optimal and also the most efficient number of clusters. K = 5 is also a plausible choice, although it is less time-efficient. However, the elbow method can be highly ambiguous when the curve does not contain a definite elbow [30] and is considered unreliable in some cases. Therefore, we also use the silhouette method to find the optimal number of clusters K. The silhouette coefficient of a point signifies how well the point aligns with other data in its cluster and how poorly it aligns with data from the nearest cluster, specifically, the cluster whose average distance from the data point is the smallest [47]. The silhouette value ranges from −1 to 1, and the closer the value is to 1, the better the K clusters are. The four charts in Figure 2 show the average silhouette scores when K = {2, 3, 4, 5}. We see that K = 4 and K = 5 are the closest to 1, with scores of 0.83 and 0.85, respectively. Therefore, looking at the elbow method and silhouette scores together, 5 is the optimal K.
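For readers who want to see what the silhouette score measures, the following standard-library sketch computes the average silhouette coefficient for a given labeling. For each point, a is the mean distance to the other points in its own cluster, b is the smallest mean distance to the points of any other cluster, and the coefficient is (b − a) / max(a, b). The function name and the toy points are hypothetical; a library routine would normally be used on data of realistic size.

```python
import math
from typing import List, Sequence


def silhouette_score(points: Sequence[Sequence[float]],
                     labels: Sequence[int]) -> float:
    """Average silhouette coefficient over all points (Euclidean distance)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores: List[float] = []
    for i, (p, own_label) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster
        own = [math.dist(p, q)
               for j, (q, lab) in enumerate(zip(points, labels))
               if lab == own_label and j != i]
        if not own:  # a singleton cluster contributes a coefficient of 0
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / sum(1 for lab in labels if lab == c)
            for c in clusters if c != own_label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical 2-D points forming two well-separated groups.
POINTS = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(POINTS, [0, 0, 1, 1])  # groups match the labels
bad = silhouette_score(POINTS, [0, 1, 0, 1])   # labels cut across the groups
```

Repeating this computation for each candidate K and comparing the averages is the procedure summarized by the charts referenced above.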
The dataset we are using is not high-dimensional, as it has only five features, but it may still challenge the algorithm to group the clusters optimally. In Figure 2, we plot the t-SNE projection to visualize the clusters and corroborate the optimal K. Although the separation among the most negative clusters is not well-established, it is evident from the chart that our proposed method successfully extracts the top positive and most negative clusters, so we can use these clusters for KE of positive and negative MEs.
2 https://www.tripadvisor.com/
3 https://www.sec.gov/ix?doc=/Archives/
Our framework contains two key aspects for evaluation in the hotel industry: 1) memorable experience-driven keyword extraction, and 2) creating novel memorable experiences with local LLM text generation.
A. Machine-based evaluation of memorable experience-driven keyword extraction: We evaluate the use of SIFRank to extract memorable experience-related keywords based on Precision, Recall, and F1 value. SOTA keyword extraction methods are typically evaluated on several datasets from different domains, such as Inspec [26], SemEval2017 [4], or DUC2001 [67]. It is therefore necessary to evaluate the accuracy of memorable experience-related keyword extraction on the HotelRec dataset as well. To accomplish this, an industry expert annotated 100 reviews from HotelRec containing positive or negative memorable experiences, extracting representative keywords from each review. We conduct the evaluation by comparing the annotated keywords to those extracted by SIFRank.
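The comparison between annotated and extracted keywords can be expressed compactly. The sketch below computes precision, recall, and F1 for a single review, assuming exact matching after lowercasing and whitespace trimming; the matching rule, the function name, and the sample keyword lists are assumptions for illustration, since the paper does not specify how partial matches are handled.

```python
from typing import Iterable, Tuple


def keyword_prf(extracted: Iterable[str],
                annotated: Iterable[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 of extracted keywords against annotated ones."""
    ext = {k.strip().lower() for k in extracted}
    gold = {k.strip().lower() for k in annotated}
    true_positives = len(ext & gold)  # keywords found in both sets
    precision = true_positives / len(ext) if ext else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical keyword lists for one review.
EXTRACTED = ["clean room", "Friendly Staff", "pool"]
ANNOTATED = ["clean room", "friendly staff", "great view", "free breakfast"]

precision, recall, f1 = keyword_prf(EXTRACTED, ANNOTATED)
```

Under exact matching, an annotated four-word keyphrase can never be matched by an extractor that emits phrases of at most three words, so the choice of matching rule directly affects recall.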
B.1. Machine-based evaluation of creating novel memorable experiences with local LLM text generation. We evaluate the texts (positive/negative ME reviews) generated by a local Vicuna using two measures: ROUGE (ROUGE-1, ROUGE-2, ROUGE-L) and BLEU (BLEU-1, BLEU-2, BLEU-3, BLEU-4). First, we generated 30 prompts following the same standards as in Algorithm 2. The prompts were given to an industry expert, who wrote memorable experiences based on them. Similarly, those prompts were inputted into Vicuna to generate the corresponding reviews. Relying only on ROUGE and BLEU scores is not reliable enough, because these metrics penalize generated texts that deliver the same content with different wording. Therefore, we also perform a human evaluation based on Likert scale scoring [43]. The prevalent technique involves assigning ratings to a generated text (the review generated by Vicuna) based on a source document (the review written by the expert). This takes the form of an independent assessment, where each generated text is evaluated on its own rather than compared directly with others. The evaluative criteria include consistency, fluency, informativeness, and relevance. Each generated text is scored on a scale from 1, the poorest, to 5, the best. Two anonymous annotators were given the same task to conduct the evaluation.
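To make the two metric families concrete, the sketch below computes the clipped n-gram overlap that underlies ROUGE-N recall and the modified n-gram precision used inside BLEU. It is a simplified illustration: the full BLEU score additionally combines several n-gram orders and applies a brevity penalty, and ROUGE-L (longest common subsequence) is not covered. Whitespace tokenization, the function names, and the sample sentences are assumptions.

```python
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_overlap(candidate: str, reference: str, n: int) -> int:
    """Number of n-grams shared by both texts, clipped by the smaller count."""
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)
    return sum((cand & ref).values())


def rouge_n_recall(candidate: str, reference: str, n: int) -> float:
    """Share of the reference n-grams that also appear in the candidate."""
    total = sum(ngram_counts(reference.lower().split(), n).values())
    return clipped_overlap(candidate, reference, n) / total if total else 0.0


def bleu_n_precision(candidate: str, reference: str, n: int) -> float:
    """Share of the candidate n-grams that also appear in the reference."""
    total = sum(ngram_counts(candidate.lower().split(), n).values())
    return clipped_overlap(candidate, reference, n) / total if total else 0.0


# Hypothetical expert-written reference and model-generated candidate.
REFERENCE = "the staff was friendly and helpful"
CANDIDATE = "the staff was very friendly"

rouge_1 = rouge_n_recall(CANDIDATE, REFERENCE, 1)
bleu_1 = bleu_n_precision(CANDIDATE, REFERENCE, 1)
```

Both quantities count only surface matches, so a paraphrase of the reference scores low even when its meaning is the same.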

C. Results and Analysis
The machine-based evaluation results of the SIFRank method for extracting memorable experience-related keywords reveal an interesting trade-off between precision and recall. As seen in Table I, the precision of 0.86 indicates that when the method identifies keywords, it is highly accurate, with only 14% of the extracted keywords being false positives. This suggests that the method is proficient at selecting relevant and appropriate keywords, making it a valuable tool for identifying representative keywords of MEs. However, the recall score of 0.54 indicates that the method misses 46% of the annotated keywords present in the text. One reason for the low recall score is the difference in length between the human-annotated keywords and those extracted by SIFRank. For example, SIFRank extracts keyphrases of one to three words, while a keyphrase chosen by the annotator can consist of four words. Nonetheless, the overall F1 score of 0.63 demonstrates a reasonably balanced performance, indicating that the method strikes a fair compromise between precision and recall. Better annotation of human-generated keywords from HotelRec could potentially improve recall without sacrificing precision, leading to a more effective keyword extraction approach.
The evaluation results in Table II are quite promising. For ROUGE-1, the score of 0.5559 suggests that more than half of the unigrams (individual words) are shared between the generated reviews and the reference reviews. Similarly, for ROUGE-2, with a score of 0.3374, the model demonstrates reasonable success in reproducing meaningful two-word sequences from the reference text. Moreover, the ROUGE-L score of 0.5201 indicates that the LLM performs well in preserving the reviews' overall linguistic structure and continuity. The ROUGE-L metric considers the longest common subsequence between the generated and reference texts, indicating that the LLM can produce reviews that capture the essence and context of the reference reviews reasonably well. For BLEU-1, the score of 0.5741 indicates that over 57% of the unigrams in the generated reviews match those in the reference reviews. This suggests that the LLM is reasonably successful in producing words that align with the reference reviews. For BLEU-2, the score of 0.4226 represents the similarity in bigrams (sequences of two words) between the generated and reference texts. The model's ability to reproduce meaningful two-word sequences is evident, though there is still room for improvement. The BLEU-3 and BLEU-4 scores, 0.3230 and 0.2471, respectively, account for trigrams (sequences of three words) and 4-grams (sequences of four words). These scores show that the LLM's performance in generating longer sequences of words is lower than for unigrams and bigrams. To summarize, Vicuna showed its effectiveness in generating text reviews. It not only generated reviews from the exact keywords but also showed its ability to write novel and unexplored reviews that describe positive and negative MEs.
Table III shows the results of the human-based evaluation of the generated customer ME reviews according to the Likert scale. The generated reviews received high scores across all categories from the first and second annotators. They received average scores of 4.625 and 4.2625 for consistency, indicating that our method of creating the prompt helped Vicuna produce text that is highly coherent and does not contradict itself. The fluency scores of 4.5 and 4.25 show that the generated texts were rated very highly in terms of grammar, sentence structure, and readability. The writing flowed smoothly and read like human-written text. Regarding informativeness, scores of 4.757 and 4.375 indicate that the generated reviews were highly insightful and valuable. Lastly, the relevance scores of 4.5 and 4.125 suggest that the framework's outputs were highly pertinent to the given prompt or task. According to the annotators' evaluation, these scores show that Vicuna, when given the prompts created by our proposed framework, demonstrated a high level of proficiency in generating positive and negative ME reviews.

D. Case Study
In this section, we perform case studies to demonstrate the capabilities of G2F in the hotel industry. The first case study demonstrates how G2F can identify trending keywords related to MEs for hotel stays, providing valuable insights for hotel owners and managers looking to improve the quality of their service. The second case study showcases the ability of G2F to combine with local LLMs to generate new and unique experiences that can help the hotel industry stay ahead of the curve in terms of customer satisfaction and service innovation.
To showcase the usefulness of our proposed G2F, we present a qualitative analysis of a real-world case study from July 2018. According to the Gensler Hospitality Index report, several fundamental factors play a crucial role in creating a good hotel experience: cleanliness, safety, quality/value, and friendly and hospitable staff are statistically significant drivers [21]. Figure 3 shows that the framework captured representative keywords for the fundamental factors that make a hotel stay a good experience. For example, in the cleanliness domain, some of the extracted keywords were "spotlessly clean," "clean bed," and "clean room." In the staff domain, keywords such as "friendly," "helpful," and "welcoming" were extracted and frequently mentioned. All the primary domain keywords were frequently mentioned not just in one month but in every month. This shows that G2F successfully identified the fundamental factors.
Moreover, G2F captured representative keywords for the defining terms of a ME mentioned by Kim [32], such as local culture and novelty. G2F identified local events associated with hotel stays, such as the Soccer World Cup in Russia in the summer of 2018. The unique keywords extracted from customer reviews were "Russian channels," "world cup themed," "repainted walls," and "Russian tour." Hotel management can use this analysis to prepare for such events, for example by offering the TV channels that broadcast the event or by repainting the walls to create a themed atmosphere. The last capability of G2F is creating novel and unexplored experiences. Figure 3 shows an example review text created by G2F that covers the fundamental aspects of a ME. Creating such reviews can help hotel management envision ideal future MEs and raise their level of service by preparing for unexplored scenarios.
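The two observations in this case study, keywords that recur every month and keywords unique to a single month, can be illustrated with a short Python sketch. It assumes that keywords have already been extracted per review and that each review carries a date; the function names and the records below are hypothetical and are not taken from HotelRec or from the G2F implementation.

```python
from collections import Counter, defaultdict


def monthly_keyword_counts(records):
    """Count, per 'YYYY-MM' month, how many reviews mention each keyword.

    `records` is an iterable of (date_string 'YYYY-MM-DD', list_of_keywords).
    """
    counts = defaultdict(Counter)
    for date, keywords in records:
        month = date[:7]
        counts[month].update(set(keywords))  # count a keyword once per review
    return counts


def recurring_keywords(counts):
    """Keywords mentioned in every month (candidate 'fundamental' factors)."""
    months = list(counts.values())
    if not months:
        return []
    common = set(months[0])
    for month_counter in months[1:]:
        common &= set(month_counter)
    return sorted(common)


def unique_to_month(counts, month):
    """Keywords mentioned in `month` and in no other month (candidate events)."""
    others = set()
    for other_month, month_counter in counts.items():
        if other_month != month:
            others.update(month_counter)
    return sorted(k for k in counts.get(month, {}) if k not in others)


# Hypothetical extracted keywords, for illustration only.
records = [
    ("2018-06-03", ["friendly", "clean room"]),
    ("2018-06-19", ["helpful", "clean room"]),
    ("2018-07-02", ["friendly", "world cup themed"]),
    ("2018-07-15", ["clean room", "russian channels", "friendly"]),
    ("2018-08-09", ["clean room", "friendly", "welcoming"]),
]

counts = monthly_keyword_counts(records)
print("Recurring:", recurring_keywords(counts))
print("Unique to July 2018:", unique_to_month(counts, "2018-07"))
```

Counting each keyword once per review keeps a single long review from dominating a month; whether G2F counts mentions this way is not stated in the paper.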

Algorithm 1: Proposed Method of G2F Framework
Input: HotelRec Data $HTA^{attr}_{N}$
Output: Hotels-Memorable-Experiences Data $HME^{M}_{O}$
Initialize: K-clusters optimal value k = 0, i = 0
1: while i < N do
2:     ApplySentimentAnalysis($HTA^{Text}_{i}$)
3:     AppendSentimentScores($HTA^{pos,neg,neu,com}_{i}$)
4: end while
5: procedure K-MEANS($HTA^{pos,neg,neu,com,rating}_{i}$)
6:     Find and assign optimal value k for clusters
7:     Assign and append cluster labels to each row $HTA^{clu}_{N}$
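The steps of Algorithm 1 that are visible here (score each review's sentiment, then cluster on the sentiment scores plus the rating) can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the authors' implementation: the toy word lists stand in for whatever sentiment analyzer G2F uses, the rating is divided by 5 so that it does not dominate the distance, and the k-means routine is a minimal Lloyd's iteration. All names are hypothetical.

```python
import random

# Toy lexicons (assumption): a stand-in for a real sentiment analyzer.
POSITIVE = {"clean", "friendly", "helpful", "great"}
NEGATIVE = {"dirty", "rude", "noisy", "broken"}


def sentiment_scores(text):
    """Return (pos, neg, neu, com) for one review using the toy lexicons."""
    tokens = text.lower().split()
    n = len(tokens) or 1
    pos = sum(t in POSITIVE for t in tokens) / n
    neg = sum(t in NEGATIVE for t in tokens) / n
    return pos, neg, 1.0 - pos - neg, pos - neg


def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Component-wise mean of a non-empty list of points, as a tuple."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
            for p in points]


def kmeans(points, k, iters=50, seed=0):
    """Minimal Lloyd's k-means; returns (labels, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    inertia = sum(dist2(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, inertia


# Hypothetical (text, rating) rows, for illustration only.
reviews = [
    ("friendly staff and clean room", 5),
    ("great location helpful staff", 5),
    ("dirty room and rude staff", 1),
    ("noisy room broken shower", 2),
]

# Steps 1-4: append sentiment scores; rating scaled to [0, 1] (assumption).
features = [sentiment_scores(text) + (rating / 5.0,) for text, rating in reviews]

# Steps 5-7: cluster and attach a label to each row.
labels, inertia = kmeans(features, k=2)
print(list(zip([text for text, _ in reviews], labels)))
```

Running `kmeans` for several values of k and comparing the returned inertia gives the elbow curve; a silhouette analysis would need an additional scoring function that is not shown here.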
Given $r \in HTA$, where $HTA^{attr}_{N}$ denotes the HotelRec dataset, and a set of selected candidate keywords $W = \{w_1, \dots, w_i, \dots, w_m\}$, where a candidate $w_i$ consists of one or multiple tokens, $w_i = w_i^{1}, \dots, w_i^{l}$, and $m \le n$, SIFRank's task is to select $C$ candidates from $W$, where $C \le m$. The candidates in $C$ are scored and ranked from most important to least. The scores range from 0 to 1, with higher values indicating greater relevance of the candidate keyword to the review's topic and lower values indicating that the keyword is less relevant to the topic.
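The score-and-rank step can be illustrated with a small Python sketch. It assumes that each candidate's relevance is the cosine similarity between a candidate embedding and a review embedding; the hand-made three-dimensional vectors below are hypothetical placeholders for real embeddings, and the function names are not part of SIFRank's API. With non-negative vectors such as these, the cosine similarity falls in the range 0 to 1.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(review_vec, candidate_vecs, top_c):
    """Score every candidate against the review and keep the top C by score."""
    scored = [(word, cosine(vec, review_vec)) for word, vec in candidate_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_c]


# Hypothetical toy embeddings, for illustration only.
review_vec = (1.0, 1.0, 0.0)
candidate_vecs = {
    "friendly staff": (1.0, 0.0, 0.0),
    "clean room": (1.0, 1.0, 0.0),
    "airport": (0.0, 0.0, 1.0),
}

ranked = rank_candidates(review_vec, candidate_vecs, top_c=2)
for word, score in ranked:
    print(f"{word}: {score:.4f}")
```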

Fig. 2: Choosing optimal K for K-Means. The top four figures show the Silhouette scores for 2, 3, 4, and 5 clusters. The bottom left is the t-SNE figure. The lower-right figures show the Elbow Test for K.

Fig. 3: Example of the case study from G2F for July 2018

TABLE I: Machine-based evaluation results of SIFRank performance on HotelRec compared to extracted keywords by a hotel industry expert

TABLE II: Machine-based evaluation results comparing the generated text by Vicuna to generated text by a hotel expert