Large Language Model Powered Agents in the Web

Web applications serve as vital interfaces for users to access information, perform various tasks, and engage with content. Traditional web designs have predominantly focused on user interfaces and static experiences. With the advent of large language models (LLMs), there's a paradigm shift as we integrate LLM-powered agents into these platforms. These agents bring forth crucial human capabilities like memory and planning to make them behave like humans in completing various tasks, effectively enhancing user engagement and offering tailored interactions in web applications. In this tutorial, we delve into the cutting-edge techniques of LLM-powered agents across various web applications, such as web mining, social networks, recommender systems, and conversational systems. We will also explore the prevailing challenges in seamlessly incorporating these agents and hint at prospective research avenues that can revolutionize the way we interact with web platforms.

topics at top venues such as WWW, SIGIR, CIKM, ACL, EMNLP, TKDE, and TOIS. He has rich experience in organizing tutorials at top conferences, including ACL 2023 and SIGIR-AP 2023. Contact via email ydeng@nus.edu.sg.
An Zhang is a Postdoctoral Research Fellow at the NExT++ research center. She earned her Ph.D. from the National University of Singapore. Dr. Zhang's research is primarily focused on large language models (LLMs) and recommender systems, with a specific interest in the deployment of LLM-driven multimodal generative agents for recommendation simulation. She has published more than 20 papers at top-tier conferences and journals, including WWW, KDD, NeurIPS, ICLR, TPAMI, and TOIS. Contact via email an_zhang@nus.edu.sg.
Yankai Lin is an assistant professor at Gaoling School of Artificial Intelligence, Renmin University of China. He received his Ph.D. degree from Tsinghua University. His research interests lie in large language models (LLMs), especially LLM-based tool learning and LLM-based agents. He has published more than 50 papers at top-tier AI and NLP conferences, including ACL, EMNLP, IJCAI, AAAI, and NeurIPS, with over 10,000 Google Scholar citations. He was selected as a Most Cited Chinese Researcher by Elsevier from 2020 to 2023. He has served as area chair for EMNLP and ACL ARR. Contact via email yankailin@ruc.edu.cn.
Xu Chen is a tenure-track associate professor at Gaoling School of Artificial Intelligence, Renmin University of China. Before joining Renmin University of China, he was a research fellow at University College London, UK. Xu Chen obtained his PhD degree from Tsinghua University. His research interests lie in large language models, recommender systems, causal inference, and reinforcement learning. He has published more than 70 papers at top-tier conferences and in journals such as WWW, AIJ, NeurIPS, TKDE, SIGIR, WSDM, and TOIS. He has organized many workshops and tutorials at top-tier conferences, including SIGIR 2021, SIGIR 2020, SIGIR 2019, WSDM 2021, and WSDM 2018. Contact via email xu.chen@ruc.edu.cn.
Ji-Rong Wen is a full professor, the dean of the School of Information, and the executive dean of Gaoling School of Artificial Intelligence at Renmin University of China. He has been working in the big data and AI areas for many years and has published extensively in prestigious international conferences and journals.

OVERVIEW OF TUTORIAL TOPICS
According to the representative set of papers listed in the selected bibliography, about 25% of the work covered in this tutorial involves at least one of the six presenters. The rest of the tutorial will give a comprehensive overview of the topic by discussing related work from other researchers as broadly as possible.

Background of LLM-powered Agents
Autonomous AI agents have long been regarded as stepping stones towards artificial general intelligence (AGI), with capabilities for self-guided task execution. Traditional approaches employed heuristic policy functions, which often lacked human-level adeptness in open-domain scenarios, largely due to heuristic limitations and constrained training data. Recently, LLMs have shown impressive strides towards human-like intelligence [32]. This advancement has spurred a growing trend in integrating LLMs as central components in developing autonomous AI agents [34,35,37,52].
• LLM-based Agent's Architecture. The architectures of existing LLM-based AI agents can be distilled into a consolidated framework, extensively covered in recent survey literature on AI agents [39]. This unified structure comprises four primary modules: profiling, memory, planning, and action. The profiling module determines the agent's role, while the memory and planning modules situate the agent in a dynamic environment, enabling it to recall past behavior and plan future actions. The action module then converts decisions into concrete outputs. Notably, the profiling module influences both the memory and planning modules, which in turn guide the action module.
• LLM-based Tool Learning. LLM-based tool learning seeks to combine the strengths of specialized tools and LLMs, enabling LLM-based agents to use external tools and thereby improving autonomous problem-solving. Recent studies highlight foundation models' adeptness in tool utilization, such as web search automation [31], online shopping [45], neural model integration [37], computer task execution [23], and embodied robotic learning [2,20].
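The four-module structure described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class names, the prompt layout, and `stub_llm`, which stands in for a real model call, are hypothetical and not taken from the survey [39]. The sketch only shows the direction of influence: the profile and recalled memory condition the plan, and the action module turns the plan into an output that is written back to memory.

```python
from dataclasses import dataclass, field
from typing import Callable, List

LLM = Callable[[str], str]  # hypothetical text-in, text-out model interface


@dataclass
class Profile:
    """Profiling module: fixes the agent's role."""
    role: str


@dataclass
class Memory:
    """Memory module: stores observations and recalls the most recent ones."""
    records: List[str] = field(default_factory=list)

    def write(self, observation: str) -> None:
        self.records.append(observation)

    def recall(self, k: int = 3) -> List[str]:
        return self.records[-k:]


@dataclass
class Agent:
    profile: Profile
    llm: LLM
    memory: Memory = field(default_factory=Memory)

    def plan(self, task: str) -> str:
        """Planning module: the role and recalled memory condition the plan."""
        context = "; ".join(self.memory.recall())
        prompt = f"Role: {self.profile.role}\nMemory: {context}\nTask: {task}\nPlan:"
        return self.llm(prompt)

    def act(self, task: str) -> str:
        """Action module: turns the plan into an output and records it."""
        plan = self.plan(task)
        output = f"[{self.profile.role}] {plan}"
        self.memory.write(f"task={task} -> {output}")
        return output


def stub_llm(prompt: str) -> str:
    """Placeholder model (assumption): returns a canned plan naming the task."""
    task_line = [ln for ln in prompt.splitlines() if ln.startswith("Task: ")][0]
    return "do: " + task_line[len("Task: "):]


agent = Agent(profile=Profile(role="shopper"), llm=stub_llm)
print(agent.act("find a laptop"))  # prints "[shopper] do: find a laptop"
```

In a tool-learning setting, the action module would dispatch to an external tool (for example a search or shopping API) instead of formatting a string; that dispatch is omitted here to keep the sketch small.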

LLM-powered Agents in Social Network
Social networks connect people by allowing them to share opinions and exchange information. In recent years, many AI techniques have been applied to social network problems such as user connection prediction [47] and social information propagation [4], where the key challenge lies in understanding intrinsic human cognitive processes and behavior patterns. Recently, by learning from huge amounts of web knowledge, LLMs have made remarkable progress towards human-level intelligence. This sheds new light on solving social network problems, and several attempts have been made to incorporate LLM-based agents into this field.
• Social Network Simulation with LLM-based Agents. Social network simulation is a fundamental problem. If a social network can be simulated accurately, its underlying mechanisms and operating rules can be more easily understood and utilized. However, due to the intrinsic nature of human minds, it is quite hard to predict how people may behave in social networks. Recently, there have been several attempts [16,26,33] to leverage LLM-based agents to solve this problem. The key idea of these papers is to use LLMs as user brains and to design profile, memory, and planning modules that make LLMs act like humans.
• Social Network Problem Solving with LLM-based Agents. Another line of research on combining LLM-based agents with social networks addresses specific problems. Researchers have leveraged agents to discover social system dynamics [17], analyze social principles between different agents [3], and so on. This direction is still growing rapidly, and we expect much more promising work in the future.

LLM-powered Agents in Recommendation
Recommender systems play a pivotal role in contemporary information dissemination, actively shaping individual preferences [25]. With the recent advancements in LLMs, LLM-powered agents demonstrate remarkable achievements in autonomous interaction and user preference understanding [29]. This impressive capability can, on one hand, be harnessed to simulate authentic human behavior within recommender systems at both individual and population levels by scaling their deployment. On the other hand, it opens the potential for leveraging LLM-powered agents in the construction of a new paradigm of personalized recommenders [44].
• User Behavior Simulation with LLM-powered Agents. Simulating user behavior in recommender systems is a complex endeavor that requires a deep understanding of human preference and behavior patterns [5,40,49]. Bridging this gap necessitates not only the incorporation of agent modules that are tailored for recommendation contexts but also accommodation of the multimodal nature of such environments [27,48]. Hence, agents driven by LLMs must be equipped with, and further fine-tuned for, multimodal comprehension to approximate the fidelity of real-world user interactions.
• Recommender Agents. While contemporary recommender systems are proficient in predicting domain-specific recommendations from user behavioral data, they typically lack capabilities for explaining their recommendations, engaging in user conversations, and integrating rich user data [21]. To create a dynamic and interactive recommender system, LLMs serve as the 'brain', with the recommender model acting as a tool [30,42]. This research direction is dedicated to developing user-oriented recommender agents for the recommendation ecosystem [38].
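The "LLM as brain, recommender model as tool" pattern described above can be sketched as follows. This is a toy illustration under stated assumptions, not the design of any cited system: `CATALOGUE`, `recommend`, `stub_brain`, and `recommender_agent` are hypothetical names, the catalogue replaces a trained recommender model, and keyword matching replaces the LLM's tool-selection step.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy catalogue standing in for a trained recommender model.
CATALOGUE: Dict[str, List[str]] = {
    "sci-fi": ["Dune", "Arrival"],
    "comedy": ["Paddington"],
}


def recommend(genre: str) -> List[str]:
    """Recommender model exposed as a tool: returns items for a genre."""
    return CATALOGUE.get(genre, [])


TOOLS: Dict[str, Callable[[str], List[str]]] = {"recommend": recommend}


def stub_brain(utterance: str) -> Tuple[str, str]:
    """Stand-in for the LLM 'brain' (assumption): pick a tool and its argument.

    A real agent would prompt an LLM here; keyword matching is a placeholder.
    """
    for genre in CATALOGUE:
        if genre in utterance.lower():
            return "recommend", genre
    return "none", ""


def recommender_agent(utterance: str) -> str:
    """Route the user's utterance through the brain, then call the chosen tool."""
    tool, argument = stub_brain(utterance)
    if tool not in TOOLS:
        # No tool applies, so the agent keeps the conversation going instead.
        return "Could you tell me which genre you are in the mood for?"
    items = TOOLS[tool](argument)
    # The reply carries a short explanation alongside the recommended items.
    return f"Because you asked for {argument}, you might like: {', '.join(items)}."


print(recommender_agent("Any good sci-fi tonight?"))
```

Keeping the recommender behind a plain function interface means the brain can be swapped from a stub to a real LLM without touching the underlying model.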

LLM-powered Conversational Agents
LLM-powered conversational agents [13] not only redefine user interaction but also introduce innovative functionalities that push the boundaries of traditional web engagements.
• LLM-powered Conversational Agents for User Simulation. Building user simulators [36,51] has emerged as an effective and efficient technique for evaluating conversational systems, thereby mitigating the high cost of interacting with real users. Inspired by the recent success of leveraging LLMs for role-play scenarios, researchers design LLM-powered conversational agents, which can be flexibly adapted to different dialogue evaluations, including open-domain dialogues [24], task-oriented dialogues [18], and conversational recommendation [41].
• LLM-powered Proactive Conversational Agents. Despite the exceptional proficiency in context understanding and response generation in various dialogue problems, LLM-based conversational agents typically prioritize accommodating users' intentions as LLMs are trained to passively follow users' instructions. Therefore, LLM-powered conversational agents often face challenges in handling proactive dialogue problems that require the conversational agent to strategically take the initiative to steer the conversation towards an anticipated goal [10]. To this end, recent works investigate prompt-based policy planning methods that prompt an actor LLM to either conduct self-thinking of strategy planning for each turn [11,50] or generate AI feedback given the whole dialogue history to iteratively improve the dialogue policy planning for proactive dialogues [12,14,46].
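The user-simulation idea above, in which a simulated user replaces a real one when evaluating a conversational system, can be sketched as a small evaluation loop. All names here (`SimulatedUser`, `ToySystem`, `evaluate`) and the `slot=value` utterance format are assumptions for illustration; in the cited work the simulator would be an LLM role-playing a user rather than a scripted object.

```python
from typing import Dict, List, Optional, Tuple

Goal = Dict[str, str]


class SimulatedUser:
    """Stand-in for an LLM role-playing a user with a fixed goal (assumption)."""

    def __init__(self, goal: Goal) -> None:
        self.pending = list(goal.items())

    def speak(self) -> Optional[str]:
        """Reveal one goal slot per turn; return None when nothing is left."""
        if not self.pending:
            return None
        slot, value = self.pending.pop(0)
        return f"{slot}={value}"


class ToySystem:
    """Toy task-oriented system under evaluation: collects slots, then books."""

    def __init__(self, required: List[str]) -> None:
        self.required = required
        self.state: Dict[str, str] = {}

    def respond(self, utterance: str) -> str:
        slot, value = utterance.split("=", 1)
        self.state[slot] = value
        missing = [s for s in self.required if s not in self.state]
        return "BOOKED" if not missing else f"need {missing[0]}"


def evaluate(goal: Goal, system: ToySystem, max_turns: int = 10) -> Tuple[bool, int]:
    """Run one simulated dialogue; return (task success, user turns taken)."""
    user = SimulatedUser(goal)
    for turn in range(1, max_turns + 1):
        utterance = user.speak()
        if utterance is None:
            return False, turn - 1  # user ran out of things to say
        if system.respond(utterance) == "BOOKED":
            return True, turn
    return False, max_turns


print(evaluate({"cuisine": "thai", "area": "north"}, ToySystem(["cuisine", "area"])))
# prints (True, 2)
```

Success rate and turn count over many simulated goals give a cheap proxy for the metrics one would otherwise collect from real users.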

Open Challenges and Beyond
In the last part, we will discuss the main open challenges in developing LLM-powered agents in web applications and several potential research directions for future studies.
• Trustworthy and Reliable Web Agents. While LLM-powered web agents are designed to be accurate, hallucination and inconsistency issues [6,22] can lead to incorrect or inappropriate responses. Ensuring that these agents are both trustworthy (data privacy and ethical considerations [7]) and reliable (consistent and accurate performance) remains a pressing challenge.
• Multi-agent Collaboration and Competition. As the web ecosystem grows in complexity, there is a foreseeable future where multiple LLM-powered agents will need to interact with each other, either collaboratively to achieve common goals or competitively. Designing agents that can effectively collaborate requires addressing challenges in communication [43], shared knowledge bases [52], and synchronizing actions in real-time [1]. On the other hand, competitive scenarios [14] necessitate agents that can strategize, negotiate, and adapt to dynamic conditions.
This topic is receiving notably increasing attention from both academia and industry. In academia, The Web Conference recognizes Large Language Models as a new research topic for various research tracks this year due to the revolutionary techniques based on LLMs. In industry, recent years have witnessed many successful web applications that are empowered by the integration of LLMs. For example, Microsoft released a new version of Bing that integrates ChatGPT. Several tutorials about the integration of LLMs in specific web applications have been given at related top-tier conferences, including but not limited to 1) Tutorial on Large Language Models for Recommendation at RecSys 2023 [19], 2) Proactive Conversational Agents in the Post-ChatGPT World at SIGIR 2023 [28], and 3) Goal Awareness for Conversational AI: Proactivity, Non-collaborativity, and Beyond at ACL 2023 [9]. However, these tutorials mainly introduce advanced designs for building specific web applications with the assistance of LLMs. In our tutorial, we aim to provide a comprehensive introduction to cutting-edge research on LLM-powered agents across multiple important web applications.

DETAILED SCHEDULE
The following summarizes the detailed schedule of the tutorial:
(1) Introduction [10 min]
(2) Background of LLM-powered Agents [35 min]
    (a) Agent Architecture
    (b) Tool Learning
(3) LLM-powered Agents in Social Network [35 min]
    (a) Social Network Simulation with LLM-based Agents
    (b) Social Network Problem Solving with LLM-based Agents
(4) LLM-powered Agents in Recommendation [35 min]
    (a) User Behavior Simulation with LLM-powered Agents
    (b) Recommender Agents
(5) LLM-powered Conversational Agents [35 min]
    (a) LLM for User Simulation in Conversations
    (b) Proactive Conversational Agents
(6) Open Challenges and Beyond [20 min]
    (a) Trustworthy and Reliable Web Agents
    (b) Multi-agent Collaboration and Competition
(7) Summary and Outlook [10 min]

STYLE AND INTENDED AUDIENCE
This is a lecture-style tutorial. The target audience is researchers and practitioners who are interested in web mining, information retrieval, natural language processing, and human-computer interaction. No specific prerequisite knowledge or skill is required. The audience will learn about state-of-the-art research in web mining, information retrieval, and natural language processing, as well as cutting-edge designs of autonomous AI agents powered by large language models in various web applications.

SUPPORTING MATERIALS
(1) Slides will be made publicly available; (2) A survey [39] accompanies this tutorial; (3) A video teaser is provided for public promotion.

Ji-Rong Wen served as Program Chair of SIGIR 2020 and is an Associate Editor of TOIS and TKDE. He has previously served as a senior researcher at Microsoft Research Asia and the group manager of the Web Search and Mining Group. He was elected as a National Distinguished Professor in 2013 and Beijing's Distinguished Young Scientist in 2018. He is a Chief Scientist at the Beijing Academy of Artificial Intelligence. Contact via email jrwen@ruc.edu.cn.