Self-Adaptation in Industry: A Survey

Computing systems form the backbone of many areas in our society, from manufacturing to traffic control, healthcare, and financial systems. When software plays a vital role in the design, construction, and operation, these systems are referred to as software-intensive systems. Self-adaptation equips a software-intensive system with a feedback loop that either automates tasks that otherwise need to be performed by human operators or deals with uncertain conditions. Such feedback loops have found their way to a variety of practical applications; typical examples are an elastic cloud to adapt computing resources and automated server management to respond quickly to business needs. To gain insight into the motivations for applying self-adaptation in practice, the problems solved using self-adaptation and how these problems are solved, and the difficulties and risks that industry faces in adopting self-adaptation, we performed a large-scale survey. We received 184 valid responses from practitioners spread over 21 countries. Based on the analysis of the survey data, we provide an empirically grounded overview of the state of the practice in the application of self-adaptation. From that, we derive insights for researchers to check their current research against industrial needs, and for practitioners to compare their current practice in applying self-adaptation. These insights also provide opportunities for the application of self-adaptation in practice and pave the way for future industry-research collaborations.


INTRODUCTION
Computing systems form the backbone of our factories, traffic control systems, healthcare, telecommunication, financial systems, and so forth. When software plays a vital role in their design, construction, and operation, these systems are often referred to as software-intensive systems [21]. The trustworthiness and sustainability of these systems are vital for our society [5,32]. Yet, building and maintaining trustworthy and sustainable systems is challenging due to the complexity that arises from the growing demands on these systems, their continued integration, the uncertain operating conditions they face, the fast pace of technological progress, etc. These challenges have been a continuous driver for new and innovative approaches to design, develop, and operate software-intensive systems. One common approach today is so-called DevOps, in which development and operation are blended, allowing system components to be easily evolved and redeployed without impacting their operation [7].
A classic approach to address the increasing complexity of software-intensive systems is transferring control from humans [27] to software components by equipping systems with feedback loops that automate tasks that otherwise need to be performed by human operators. These feedback loops monitor the system and its environment, reason about the system behaviour and its goals, and adapt the system to ensure its goals under changing conditions, or gracefully degrade if necessary. Such goals can be very diverse, ranging from ensuring a required level of performance under uncertain workload conditions, to dealing with errors caused by external services that are difficult to predict, to defending the system against malicious attacks and the problems they may cause. A typical example is a feedback loop deployed in a cloud environment that expands or decreases computing resources to meet changing demands while minimising the cost of operation. Another example is a container framework that performs autoscaling in a microservice deployment.
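To make the cloud example concrete, the following minimal Python sketch shows one iteration of such a feedback loop: it monitors utilisation, decides how many servers are needed, and adapts the pool. It is illustrative only. The names ManagedSystem, monitor, analyse_and_plan, execute, and control_step, the 70% utilisation target, and the server limits are assumptions introduced here; they do not come from the survey data or from any specific product.

```python
import math
from dataclasses import dataclass


@dataclass
class ManagedSystem:
    """Hypothetical managed system: a pool of identical servers."""
    servers: int
    capacity_per_server: float  # requests per second one server can handle


def monitor(workload: float, system: ManagedSystem) -> float:
    """Observe utilisation: offered load divided by total capacity."""
    return workload / (system.servers * system.capacity_per_server)


def analyse_and_plan(workload: float, system: ManagedSystem,
                     target: float = 0.7,
                     min_servers: int = 1, max_servers: int = 20) -> int:
    """Pick the smallest server count that keeps utilisation at or below
    the (assumed) target, clamped to the allowed range to bound cost."""
    needed = math.ceil(workload / (system.capacity_per_server * target))
    return max(min_servers, min(max_servers, needed))


def execute(system: ManagedSystem, desired: int) -> None:
    """Adapt the managed system by resizing the server pool."""
    system.servers = desired


def control_step(workload: float, system: ManagedSystem) -> tuple[float, int]:
    """One pass through the loop: monitor, analyse/plan, execute.
    Returns the utilisation observed before adaptation and the new pool size."""
    utilisation = monitor(workload, system)
    desired = analyse_and_plan(workload, system)
    if desired != system.servers:
        execute(system, desired)
    return utilisation, system.servers
```

Calling control_step repeatedly with fresh workload measurements scales the pool out under load and back in when demand drops, which is the cost-versus-demand trade-off described above. A production autoscaler would typically add cooldown periods and smoothing, which this sketch omits.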
The principles of applying feedback control to software-intensive systems have been the subject of active study in academia. Back in 1998, Oreizy et al. [33] presented a seminal paper at the International Conference on Software Engineering (ICSE) in which the authors introduced the notion of self-adaptation as comprising two simultaneous processes: system adaptation, which is concerned with detecting and handling changing circumstances, and system evolution, which is concerned with the consistent application of change over time. A few years later, Garlan et al. [15] stated the crucial role of architectural models as first-class citizens that enable a system to reason about system-wide change and adapt itself accordingly to achieve or maintain its goals. Blair et al. [4] consolidated and elaborated on these principles in what is now generally known as "models at runtime." In 2007, Kramer and Magee [25] stated the crucial role of software architecture in the realisation of self-adaptive systems, distinguishing adaptation management from goal management. Over the last decade, the research community has developed a vast body of knowledge and know-how on principles, see e.g., [2,4,13,37], models and languages [23,31,43,52], processes and methods [1,6,8], patterns [26,35,51], and frameworks [10,15,36] to engineer self-adaptive systems. Researchers have documented a substantial number of literature reviews and surveys on various topics in self-adaptive systems, such as the benefits of self-adaptation [50], requirements for self-adaptive systems [54], approaches to realise self-adaptation [26,28,30,39], the use of formal methods in self-adaptive systems [48], self-protection [55], the notion of uncertainty [20,29], and the use of machine learning in the realisation of self-adaptation [17], among others. Basic research works in the field of self-adaptation are, for example, [7,9,22,38,44].
In parallel, the principles of feedback control have been studied and applied in industry. For example, about two decades ago, IBM launched its legendary initiative on autonomic computing [24]. Inspired by the autonomic nervous system of the human body, the central idea of autonomic computing was to enable computing systems to manage themselves based on high-level goals. Four classic goals are self-optimisation, self-healing, self-protection, and self-configuration. Autonomic computing delegates the complexity of system operation to the machine, aiming to reduce the time required by operators to resolve system difficulties and other maintenance tasks such as software updates. Over the years, industrial solutions based on feedback loops have found their way to practical applications, for instance in the domain of elastic cloud to adapt computing resources and automated management of server parks to deal with changing business needs, e.g., [3,40].
While the output of academic research is documented in research articles, journal volumes, and books, the current practice of self-adaptation in industry has never been systematically described.

Objective and Research Questions
Our general objective is to better understand the state of practice of self-adaptation in industry. To that end, we perform a large-scale survey with active practitioners. Concretely, this survey aims at shining a light on what motivates practitioners to apply self-adaptation, what kind of problems they solve using self-adaptation, how practitioners design and develop self-adaptive systems, whether they follow any established practices, what difficulties and risks they face in adopting self-adaptation, and what future opportunities industry sees for the application of self-adaptation.
To the best of our knowledge, no systematic study has been done that investigates these issues. Hence, there is no clear and documented view of why and how the principles of self-adaptation are applied in practice, and what challenges practitioners face when realising self-adaptation. Investigating industrial practice on self-adaptation and answering the questions targeted by this study will help narrow the gap between industry and academia. The study aims at helping researchers in academia to get a better picture of how self-adaptation is applied in practice, the industrial needs in realising self-adaptation, and what problems practitioners face. We conjecture that having a better picture of industry practice will help the research community to position their efforts with respect to industrial needs and make well-informed decisions to set future research objectives, both fundamental and applied. On the other hand, drawing a picture of the state of the practice can also benefit industry by sharing the motivations and potential benefits of self-adaptation, directing practitioners towards relevant sources of information such as best practices, and identifying opportunities for collaboration with researchers to address the problems they face.
We aim to answer the following concrete research questions:
RQ1: What drives practitioners to apply self-adaptation in software-intensive systems?
RQ2: How do practitioners characterise self-adaptation?
RQ3: How do practitioners apply self-adaptation in industrial software-intensive systems?
RQ4: What are the experiences of practitioners with applying self-adaptation and do they see opportunities for how and where to apply self-adaptation?
With RQ1, we want to investigate the motivations of practitioners for applying self-adaptation, the kinds of industrial systems for which self-adaptation is applied, and the types of problems they solve using self-adaptation. In academic research, self-adaptation has been proposed for two main complementary problems [44]: 1) to automate the management of complex software-intensive systems based on high-level goals provided by operators, and 2) to deal with operating conditions that are hard to predict before deployment and need to be resolved during operation (i.e., mitigating uncertainties). Key management tasks for self-adaptation are self-healing, self-optimisation, self-protection, and self-configuration. We want to understand whether industry uses the principles of self-adaptation to deal with the same or different problems, and whether and how they relate to the classic system and software management tasks. Answering RQ1 will shine a light on application areas, motivations, and concrete problems for which self-adaptation is applied by practitioners or could be applied by practitioners who currently do not use self-adaptation. This may provide academics with insights into relevant areas to drive and validate research results on self-adaptation. The results may also indicate applications and problems that are not yet explored in industry and may benefit both academia and industry.
With RQ2, we aim to investigate the perception of practitioners on the concept of self-adaptation. We are particularly interested in how practitioners characterise self-adaptation as a property that enables a system to adapt itself at runtime. To that end, we will elicit concrete examples of what they understand by self-adaptation. This will give us a better understanding of whether and how practitioners understand the concept of self-adaptation, what terminology they use, whether there are any differences in the viewpoints on what constitutes self-adaptation, and whether they consider self-adaptation altogether useful. This may also shine a light on whether there are any (emerging) industrial standard practices, e.g., a technology stack or tools. Answering RQ2 will help researchers to get a better picture of how practitioners understand the concept of self-adaptation. On the other hand, the insights may reveal potential opportunities for practitioners to benefit from the expertise of other practitioners as well as knowledge developed by researchers.
With RQ3, we aim at examining how self-adaptation has been realised and used in industry. We are particularly interested in mechanisms, tools, benchmarks, and processes employed in industry to engineer self-adaptive solutions. We will pay attention to the degree of automation and the role of humans in runtime adaptation, as this is commonly considered important for trust in software-intensive systems, see e.g., [49]. Furthermore, we are interested in comparing industrial practices with solutions developed by academics, such as modelling techniques, frameworks, and verification techniques. We also want to understand how practitioners obtain trust in the self-adaptive solutions they employ. Answering RQ3 will provide insights into best practices on how practitioners realise self-adaptation. It will highlight the criteria that practitioners use to apply and realise self-adaptation solutions and may shine a light on the extent to which solutions from the research community have been adopted in industry. These insights will open opportunities for both academia and industry to steer future research and improve practical applications.
Finally, with RQ4, we want to understand the difficulties and risks, if any, that practitioners experience in the design, implementation, and other engineering activities of self-adaptive systems. We will also probe whether practitioners face problems for which they would appreciate support from researchers. In addition, we elicit opportunities that practitioners see for applying self-adaptation that are not yet exploited. Answering RQ4 may help to fill the gap between academia and industry. Furthermore, identifying problems and risks may trigger new collaborative studies to investigate and address these challenges. Such studies are likely to bridge the gap and result in more targeted research and improved industrial applications of self-adaptive systems.

Contributions
By drawing a landscape of the use of self-adaptation in industry, the survey results benefit both researchers and practitioners. Concretely, the contributions of this study are:
• An empirically grounded overview of the state of the practice in the application of self-adaptation;
• Insights for researchers to assess their current research in relation to industrial needs;
• Insights for practitioners to assess the level of their current practice in applying self-adaptation;
• Additional prospects for applying self-adaptation in practice and opportunities for industry-research collaborations.
Preliminary results of this study were reported in [46]. That paper only considered a small subset of questions and reported initial results based on one batch of data.

Outline
In Section 2, we present the study design with the survey questions and analysis methods used. Section 3 presents the results and key insights for each research question. In Section 4, we derive insights from the study results for researchers and practitioners. Section 5 discusses threats to validity. Finally, we wrap up and conclude in Section 6.

RESEARCH METHOD
In this study, we use a survey as the research method [18]. In what follows, we discuss the population and sample, the questionnaire, and the data analysis methods we used.

Population and Sampling
Our target population consists of practitioners who are actively involved in the engineering of industrial software-intensive systems in any domain. This includes architects, designers, developers, testers, maintainers, operators, and other people who have technical expertise and are actively involved in the development and maintenance of these software systems.
Concretely, we contacted 355 practitioners from a wide variety of companies (footnote 1) via the networks of the researchers involved in this study (i.e., the authors of this paper) to complete the survey. We used two criteria to invite people: (1) participants should be active in different domains that are representative of software-intensive systems, and (2) participants have the required expertise to answer the questions. The invited practitioners were spread over 21 countries in total. The invitations were sent by personalised emails in different batches during the period from November 30, 2020 until July 31, 2022. We sent reminders according to a predefined schedule of one, two, and six weeks after the invitation.

Survey Instrument
The survey used a questionnaire to collect data based on a set of predefined questions [18]. Because practitioners are not necessarily familiar with the term self-adaptation, the survey started with a gentle introduction of the core idea of what constitutes a self-adaptive system using basic terminology commonly used in industry, and illustrated this with a few characteristic examples to make it concrete. We used both closed and open questions. Closed questions have a predefined set of answers, such as yes/no, or multiple choice. We also allowed participants to add extra options for answering several closed questions using a text field. Open questions provide a space that participants can use to provide an answer. While closed questions allow acquiring a clear view on a particular topic using basic statistics, open questions allow acquiring in-depth insights using qualitative analysis. We provide a replication package with all study materials, including the study protocol, the questionnaire, the raw data, and the analysis results.
For this study we used a self-administered anonymous online questionnaire (Survey & Report hosted by Linnaeus University, Sweden). The main motivation to use an online questionnaire is to involve a large set of participants at relatively low cost (both time-wise and financially). We created an initial list of survey questions that were directly derived from the research questions of this study. The initial list of questions was composed by two members of the research team and then crosschecked by the other team members.
Footnote 1: Almost all practitioners we contacted were from different companies, and the few that were from the same company had different roles within the company. The participants were asked to answer from their own perspective.
We validated the questionnaire in a pilot with eight randomly selected participants from the target population. For this pilot, we added meta-questions to the questionnaire about clarity of terminology and questions, relevance of the questions, scope of the questions, and the time required to complete the survey. For both clarity of terminology and clarity of the questions we obtained an average score of 4.38 on a scale from 1 (Not clear at all) to 5 (Very clear). None of the participants indicated that questions should be removed or modified. Six participants indicated that no important aspects were missing. One participant hinted that we may also probe whether the use of self-adaptation requires a specialised team in the company or alternatively infrastructure to share knowledge. Another participant suggested adding a question about scalability of solutions for self-adaptation. One participant stated that the example system we used to introduce self-adaptation may create some bias, and further that answers to questions may differ depending on roles in the engineering teams. The average reported time to complete the survey was 24 minutes. Based on the feedback, we adjusted the introductory part of the questionnaire. We did not revise the questions as they were perceived as clear and well scoped. The finalised questionnaire was then distributed to the participants as explained above.
The first part of the questionnaire (Table 1) solicited whether the participant applies self-adaptation and collected general demographic information. This allowed us to check whether the participant had experience with self-adaptation (Q0.1), confirm a good coverage of kinds of software-intensive systems across participants (Q0.2), the size of the companies of participants (Q0.3), as well as a confirmation of the participant's role (Q0.4) and years of experience (Q0.5). The second part of the questionnaire covered questions related to RQ1, collecting data about the problems for which the participants apply self-adaptation (Q1.1), the main business motivations for using self-adaptation (Q1.2), and the benefits obtained from applying self-adaptation (Q1.3) (see Table 2). The first two questions had multiple options. The third part of the questionnaire covered a question related to RQ2 on how practitioners characterise self-adaptation (see Table 3). This part included only one question that asked participants to describe a concrete self-adaptive system they had worked with (Q2.1).
Table 3 (question for RQ2). Q2.1: Think of a concrete self-adaptive system you worked with. Name this system and briefly explain its purpose (please use this system to answer the following three questions). Answer type: free text.
The fourth part of the questionnaire addressed RQ3 on how practitioners apply self-adaptation in their practice (see Table 4). The first three questions investigated the mechanisms that participants use to monitor (Q3.1) and analyse (Q3.2) the system during operation, and change the system when needed (Q3.3). The next question investigated the degree of automation of self-adaptation (Q3.4). The next three questions investigated reuse of solutions (Q3.5-Q3.7). The last question of this part of the questionnaire probed whether and how practitioners establish trust in the self-adaptation solutions they build (Q3.8). Finally, the fifth part of the questionnaire addressed RQ4 on difficulties, risks, and opportunities of applying self-adaptation in practice (see Table 5). The first two questions investigated difficulties (Q4.1 and Q4.2); the next three questions focused on risks and risk mitigation (Q4.3-Q4.5). The next two questions probed the interest of practitioners in getting support from researchers for solving problems with self-adaptation (Q4.6 and Q4.7). The last two questions investigated opportunities for applying self-adaptation beyond the current practice (Q4.8 and Q4.9).
The questionnaire concluded with a question (Q5.1) about how confident participants were in general about the answers they gave to the survey questions. The possible answers were: Very confident; Confident; Sufficiently confident; Neutral; Somewhat unconfident; Not confident; Not confident at all.

Data Analysis
To analyse closed questions, we used descriptive statistics and quantitative data analysis. Accordingly, we mostly report frequencies of answers, percentages relative to the respective number of responses, and relationships between answers to questions based on contingency matrices (based on the categorisation of answers). We only report relationships that led to relevant insights.
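As an illustration of the kind of descriptive statistics meant here, the following minimal Python sketch computes answer frequencies with percentages and a simple contingency matrix for two closed questions. The function names frequencies and contingency and all answer values in the usage note are hypothetical placeholders introduced for this example; they are not survey data and do not describe the tooling actually used in the study.

```python
from collections import Counter


def frequencies(answers):
    """Map each answer option to (count, percentage of all responses).
    Percentages are relative to the number of responses and rounded
    to one decimal place."""
    counts = Counter(answers)
    total = len(answers)
    return {option: (n, round(100 * n / total, 1))
            for option, n in counts.items()}


def contingency(answers_a, answers_b):
    """Cross-tabulate paired answers to two closed questions.
    The i-th entries of both lists must belong to the same participant.
    Returns a Counter keyed by (answer_a, answer_b)."""
    if len(answers_a) != len(answers_b):
        raise ValueError("answer lists must have the same length")
    return Counter(zip(answers_a, answers_b))
```

For example, with the invented answers ["Yes", "Yes", "No", "Yes"], frequencies reports three "Yes" answers (75.0%) and one "No" answer (25.0%). A Counter is used for the contingency matrix so that combinations that never occur simply read as zero.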
To analyse comments to open questions, we used qualitative data analysis. In particular, we used inductive reasoning to construct codes and infer categories from the data by labelling occurrences of codes and grouping them into categories [41]. Similar to others (e.g., Prechelt et al. [34]), we tried to keep coding simple. We did not have a pre-defined coding schema or a pre-defined granularity or semantic style for the codes. However, we interpreted comments in the context of the question for which they were given. We used a simple version of open coding [42]. Similar to Mendez Fernandez et al. [12], we used open coding to add codes to small coherent fragments of the comments. We then categorised the developed concepts in a hierarchy of categories as an abstraction of the codes. In total, we coded 886 comments on 12 open questions in sub-teams of two or three coders.
Coding was first done individually and then consolidated in the sub-team. Two other researchers crosschecked the consolidated coding. Where necessary, the coding was adjusted in consensus between the sub-team and the researchers. We excluded some comments from coding, e.g., if they did not provide any additional insights or if they were too generic, e.g., a participant answering "Always" to a closed question and stating "This is how we work" in the comments. Also, we did not map the answers to a closed question to comments for that question. For example, a participant may have answered that they never reused solutions for self-adaptation, but in their comments indicated reasons that they "might" do so (i.e., one comment may cover several concepts, which may not necessarily match the answer to the closed question). When reporting example quotes from comments in Section 3, we use verbatim excerpts, including spelling and punctuation errors.
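The step from codes to categories can be sketched as follows. This minimal Python example counts, for each category, how many distinct participants mentioned at least one of its codes. The mapping CATEGORY_OF and the function participants_per_category are hypothetical names introduced for illustration, the code and category labels are simplified placeholders, and the participant identifiers in the usage note are invented; none of this reproduces the actual coding scheme or data of the study.

```python
from collections import defaultdict

# Hypothetical two-level hierarchy: each code belongs to one category.
CATEGORY_OF = {
    "robustness": "improved utility",
    "performance": "improved utility",
    "costs": "savings",
    "resources": "savings",
}


def participants_per_category(coded_comments):
    """Count distinct participants per category.

    coded_comments is an iterable of (participant_id, code) pairs, one
    pair per coded fragment. A participant who mentions several codes
    of the same category is counted once for that category."""
    seen = defaultdict(set)
    for participant_id, code in coded_comments:
        seen[CATEGORY_OF[code]].add(participant_id)
    return {category: len(ids) for category, ids in seen.items()}
```

With the invented input [("p1", "robustness"), ("p1", "performance"), ("p2", "costs"), ("p3", "performance")], the function counts two participants for "improved utility" and one for "savings". Counting participants rather than code occurrences is a design choice of this sketch; counting occurrences would only require replacing the sets with counters.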

Demographic Information
In total, 184 participants completed the survey from 355 invitations, i.e., a response rate of 51.8%.
Based on the answers to the first question (Q0.1), we split the answers to the other demographic questions into two groups: those provided by all participants and those provided by participants that worked with concrete self-adaptive systems.
3.1.1 Experience with self-adaptation (Q0.1): Of the 184 participants that provided valid data, 100 (54.4%) reported having worked with concrete self-adaptive systems.
3.1.2 Software systems built by organisations (Q0.2): Almost all participants (181, 98.4%) provided a valid description of the kind of systems they build. Based on the analysis of the data we could classify the answers along two axes: the types of software systems built by the organisations, and the focus of the software systems. The type refers to the domain, while the focus refers to the activities on which the organisation concentrates within the domain, for example, automation (focus) within manufacturing (type). Note that the domain may be abstract, e.g., embedded systems or communication and networks. Focus, on the other hand, may refer to purpose, such as analytics, but also to specific technologies or methods, such as machine learning. Figure 1 summarises the types of systems we identified. The most frequent types are web/mobile; embedded, cyber-physical, and IoT systems; data management; and cloud (together these four types represent 52.5% of all systems). Sixteen participants (8.8%) built various types of systems. The results show that the percentages of the types of systems of all participants and those that worked with self-adaptation are similar.
In addition to the types of systems, 104 participants (56.5%) also provided insights into the focus of the systems they build. Among the 100 participants that worked with self-adaptation, 60 provided a description of the focus. Figure 2 provides an overview of the results. The dominant focus is monitoring, analytics and control, representing 27.4% of the foci described by the participants. Other key foci are services (21.7% of the participants that described the focus of the systems they built) and quality and security (14.2%). Overall, the variety in the types of systems built by the participants and the different foci in activities underpins the representativeness of the data collected during the survey.
3.1.3 Software engineers working at companies (Q0.3): Figure 3 summarises the results of the number of software engineers that work at the companies of the participants. About half of the companies have more than 100 employees who work as software engineers. The other half is about equally divided over four categories of companies with between 1 and 100 software engineers. The results are similar for all participants and those that have worked with self-adaptive systems. The numbers show that we collected data from participants of companies with a significant number of software engineers, i.e., people dedicated to building software-intensive systems (because our study is interested in the engineering of software-intensive systems, we collected the number of software engineers at the companies and not the total number of employees as a measure for size).
3.1.4 Roles of participants in their organisation (Q0.4): The role(s) that participants have in their company are summarised in Figure 4.
Of 184 participants, 129 indicated that they have one role in their organisation. The other participants indicated that they have two or more roles. Overall, the participants reported on average 1.6 roles in their company. The participants that worked with self-adaptation reported on average 1.5 roles. The most frequent roles are programmer and project manager/coordinator, each representing over 40% of the participants. About one in three participants works as a designer or architect. The representation of the other roles is lower in the sample. The relative numbers for the roles of all participants and those that work with self-adaptive systems are again similar. One exception is researcher: 9 of the 10 participants that work as researcher have worked with self-adaptive systems. The results show that we collected data from participants with a broad range of key software engineering roles in industry.
3.1.5 Experience of participants (Q0.5): Figure 5 summarises the years of experience of participants as software engineers. A majority of the participants have at least 9 years of experience as a software engineer, i.e., 69.6% of the total sample and 76.0% of the practitioners that worked with self-adaptation. The distributions for all the participants and those that worked with self-adaptation are similar. The numbers show that most participants of the survey are experienced software engineers.

Drivers for Applying Self-Adaptation (RQ1)
We now analyse the data that we collected for answering RQ1. This research question focuses on the drivers of practitioners for applying self-adaptation and the types of problems they solve using self-adaptation. Note that the data used to answer RQ1 comes from the 100 participants that have experience with concrete self-adaptive systems (i.e., the participants that answered "Yes" to Q0.1).

3.2.1 For which problems do you apply self-adaptation? (Q1.1). Figure 6 summarises the results. On average, the participants applied self-adaptation for 3.6 types of problems from the predefined list (with seven options). The results show that practitioners apply self-adaptation to deal with a variety of problems. Optimising performance and automating tasks are the main problems addressed by self-adaptation in industry. On the other hand, dealing with changes in business goals is less frequently solved using self-adaptation. 'Others' include for example support for testing and evolution.

3.2.2 What are the main business motivations to apply self-adaptation? (Q1.2). Figure 7 summarises the results. On average, the participants provided 2.1 business motivations to apply self-adaptation. Improving user satisfaction, reducing costs, and mitigating risks are the most selected motivations for using self-adaptation. Opening up new application opportunities was selected less often (21 participants). Examples of 'Others' are improving utility and managing complexity.

3.2.3 What could be benefits of applying self-adaptation? (Q1.3). Ninety-two participants provided meaningful descriptions of benefits of self-adaptation, an average of 1.8 benefits per participant.

Analysis of comments:
We summarise the findings in Table 6. For each category (bold font) and code, we include how often it appeared and we provide a few example quotes. The dominating benefits of applying self-adaptation are improved utility (61 participants), savings in costs and resources (38 participants), and improved human interaction (37 participants).

Key insight(s) from RQ1:
(1) Self-adaptation is widely applied in industry across a wide variety of domains.
(2) Practitioners primarily apply self-adaptation to optimise performance, automate tasks, and deal with changes in the deployment environment.
(3) The dominating business motives to apply self-adaptation in industry are primarily improving user satisfaction and reducing costs, and secondarily mitigating risks.
(4) The main benefits of applying self-adaptation are improved utility (in robustness and performance), savings (costs and resources), improved human interaction (user experience and support for engineers), and handling dynamics (in the context and system load).
3.3 RQ2: Characterisation of Self-adaptation
3.3.1 Explain a concrete self-adaptive system you worked with (Q2.1). Except for one, all participants with experience in self-adaptation provided a concrete description of a system they worked with.
Analysis of comments: Tables 7 and 8 summarise the findings. We focused on characteristics of self-adaptive systems and identified three categories: subject, type, and trigger of adaptation. Subject refers to the system or part of it that is adapted, type refers to the kind of adaptation that is applied, and trigger refers to the source that initiates adaptation.
Ninety-nine participants described the subject of adaptation in the systems they work with. The top results are system, which occurred 28 times, followed by module (i.e., a part of a system) with 22 occurrences. Platform layer (infrastructure, execution platform, etc.) was mentioned 11 times and application layer 11 times.
Of the participants that worked with self-adaptation, 86 described in total 101 instances of the types of adaptation they apply (i.e., an average of 1.17). Auto-scaling with 33 occurrences and auto-tuning with 28 are the most frequent types of adaptation applied by the participants. Twenty-two participants focus on monitoring and analysis only (they may use the human in the loop for other adaptation functions).
Finally, 62 participants explained in total 78 triggers of adaptation in their work (i.e., an average of 1.21 triggers). The main triggers originate from system properties with 27 occurrences and environment properties with 18 occurrences. Changes in the system load, events, and user actions are the other types of triggers for adaptation.

Key insight(s) from RQ2:
(1) Self-adaptation is applied at different levels of industrial software-intensive systems: from a complete system to parts of a system and support systems.
(2) The dominating types of adaptations applied in industry are auto-scaling, auto-tuning, and monitoring/analysis.
(3) Adaptations in industrial software-intensive systems are triggered by changes in properties of systems and their environments, dynamics in system load, relevant events, and user actions.
(4) Technologies such as elastic cloud and auto-scalers are key enablers for the realisation of self-adaptation in practice.
3.4 RQ3: Application of Self-adaptation
3.4.1 What mechanisms or tools does the self-adaptive system you worked with use to monitor a managed system during operation? (Q3.1). The participants provided a total of 146 instances of mechanisms or tools they used for monitoring in a self-adaptive system they worked with, i.e., on average, 1.5 mechanisms/tools per participant.
Analysis of comments: Table 9 summarises the findings. The participants focused on both "what" is being monitored and "how" monitoring is done. Based on this, we identified three categories: monitoring metrics, monitoring mechanisms, and monitoring tools. Of the 100 answers, we marked 14 as unclear.
The participants mentioned in total 75 metrics they use for monitoring. Resource usage with 23 occurrences, system load with 18, and reliability with 13 are the most frequently mentioned metrics.
The participants described in total 20 monitoring mechanisms. Environment sensors occurred nine times and system sensors four times. Six participants described different logging mechanisms, and in one system, a human is involved in monitoring. Finally, the participants provided in total 34 tools they use for monitoring. The most prominent tools are Kubernetes monitoring and Prometheus, which each occurred nine times, followed by AWS monitoring with eight occurrences.

"We use kubernetes which provides notification callbacks on any event such as host/pod not available, based on these events we auto mark the node was inactive and do not use those nodes for further write or read operations"; "Auto Scaling an EMR cluster in AWS based on incoming event data"
User actions 7 "[adapt] cache warm up strategy based on user interactions"; "scammers ... To decide the users that are most likely to be a scammer, the system tracks the past performance of models responsible for flagging potential scammers."

3.4.2 What mechanisms or tools does the self-adaptive system you worked with use to analyse conditions of a managed system during operation? (Q3.2). The participants provided a total of 115 instances of mechanisms or tools they used for analysing conditions of a self-adaptive system they worked with, i.e., on average 1.5 mechanisms/tools per participant.
Analysis of comments: Table 10 summarises the findings. We identified two categories: analysis mechanisms and analysis tools. Out of the 100 valid answers, 21 were marked as unclear or not applicable (such as "Fairy simple algoritms" or "The tech stack we use is proprietary and the tools we use are built in house"). The rest of the participants mentioned in total 73 mechanisms they use for analysis. The most frequently mentioned mechanisms are data analysis methods (such as interference, statistical data analysis, what-if analysis, and search-based methods) with 18 occurrences, comparison to threshold with 16 occurrences, and metric(s) calculation and learning (mostly machine learning) with 12 occurrences. The participants provided in total 23 tools they use for analysis. AWS analysis tools occurred nine times, followed by Kubernetes stack with seven, and Dynatrace with two occurrences. Other tools mentioned by the participants include Splunk, JMX, Jasmina, Azure, Openshift, and Kibana.

Analysis of comments: Table 11 summarises the findings. Out of the 100 valid answers, 23 were marked as unclear or not applicable. We identified two categories: change mechanisms and change enacting tools. In total, 83 instances of mechanisms for change were reported. Scaling mechanisms with 36 occurrences and reconfiguration (changing the adaptation logic, network reconfiguration, parameter adjusting, load balancing) with 25 occurrences are the most frequently mentioned change mechanisms. Twelve participants used non-automated mechanisms that refer to notifications and change tasks done by humans. The participants mentioned 19 tools they use for enacting change. Kubernetes occurred nine times, AWS seven times, and other tools, including Openshift and Dynatrace, three times.
Table 11. Analysis of comments - Mechanisms or tools used to change a managed system or parts of it (Q3.3).

Categories and codes # Example quotes
Change mechanism 83
Scaling mechanisms 36 "The server-side system has a load balancer. Hence we increase the number of workers behind the load balancer to decrease the average response time for the users."; "It adjusts the number of worker nodes."; "Adding a completely similar server / serverless Lambda instance"
Reconfiguration 25 "The adaptation directly adjusts the period between the packet send events, as well as the number of packets allowed during each send event. [...]"; "Depending on context, controlled variables are managed through different automation systems."; "reconfiguration of the management entity ... to support a larger (or smaller) scale distributed system"; "load balancer/director that may support controlling the exposure facade towards the system environment."
Non-automated 12 "To effect change on the managed system, the results from the tool need to be approved by an engineer, and are then acted on by the mining and plant teams. These processes are for the most part not automated [...]."; "Generating alerts and expecting humans to resolve the error manually based on suggestions."; "Did not do this [...]. Based on safety protocols this could not be secured"
Restarting/deploying 7 "Mostly just restarting the managed subsystems. In the case of Kubernetes HPA, its the horizontal scaling (up/down) of the Pods"; "Generally restarts the unhealthy workload, but in the case of autoscaling can also be used to add or remove replicas"; "... our pipelines use simple bash scripts to deploy previous versions when new versions fail."
Migration 3 "Once the control process informs the control plane, it starts a workflow what we call as instance warming workflow which will dump items that supposed to go to that node from another replica and fills them."; "virtual machine (VM) migration or creation."
Change enacting tool 19
Kubernetes 9 "Mostly just restarting the managed subsystems. In the case of Kubernetes HPA, its the horizontal scaling (up/down) of the Pods"; "... to change topology we simply use K8S api to add/remove worker pods"
AWS 7 "AWS based in-built auto scaling capabilities "; "Use the AWS ElasticLoadBalancer and also trigger actions via AWS Lamda functions when required."
Other 3 "IBM ITM, Log Analyzer, TCAM"; "UC4 Automation Engine workflows that orchestrate kubernetes clusters"; "Build-in Openshift mechanisms"

3.4.4 What is the degree of automation of the majority of the self-adaptive solutions you work with in your organization? (Q3.4). All 100 participants provided an answer to this question; Figure 8 summarises the findings. Forty-seven participants reported mixed automation in their projects (both semi and fully automated), while 31 indicated semi automation and 19 indicated full automation. Three participants selected other; two of them mentioned that there is no automation, the third stated "fully-automated till first incident."
3.4.5 Do you reuse solutions to realise self-adaptation in systems you work with? (Q3.5). All 100 participants provided answers to this question, which are summarised in Figure 9. A majority of 71 participants reuse solutions in self-adaptive systems at least sometimes (44 of them reuse solutions frequently to always). The other 29 participants rarely, very rarely, or never reuse solutions.

Analysis of comments: Table 12 summarises the findings. We focused on the subjects of reuse and identified five categories: code, design artifacts, specifications, IT infrastructure, and procedures. The 67 participants provided in total 91 objects of reuse in adaptation, i.e., an average of 1.4. Code occurred 33 times, with modules as the top subject of reuse (18 instances). Design artifacts were mentioned 22 times, with patterns and architecture as main subjects of reuse (each with seven instances). Specifications were mentioned 18 times as objects of reuse, IT infrastructure 11 times, and procedures seven times. The results demonstrate that reuse in self-adaptation is common practice, although the use of patterns (a topic that gets increasing attention in research) is limited.

"a framework for monitoring metrics that allows labels to be given to properties, the time-series data to be tracked in a database, and then hooks to visualization database and alert systems."
Tools 4 "Use the same tools AWS provides for all our different product deployments."
Procedures 7
Processes 3 "Writing "watchdog" processes for systems that aren't deployed to kubernetes"
Pipelines 2 "pipeline (Application -Datadog -custom logic -AWS API) is replicated with different settings for different use-cases."
Schedules 2 "Most of the approaches we use for digital twins share some history ... An example of that is in the scheduling space, where schedules need to adapt to changes in resources or the inclusion and removal of tasks."

3.4.7 Why do you not often reuse solutions when realising self-adaptive systems? What hinders their reuse, please provide a short answer? (Q3.7). This was a conditional question that was only asked to the participants that answered never or very rarely to Q3.5 (which asked whether participants reuse solutions to realise self-adaptation). Twenty-three participants provided such an answer to Q3.5.
Analysis of comments: Table 13 summarises the findings. From 18 participants that provided valid answers, we identified 19 reuse hurdles, i.e., an average of 1.1. The main hurdle, reported by 11 participants, is the difference in problems, which hampers easy reuse. Other hurdles are lack of experience or maturity in applying self-adaptation within the company (4 occurrences), and the complexity of the system and organisational concerns (each with 2 occurrences).

Analysis of comments: Table 14 summarises the findings. The participants provided in total 152 instances of techniques for ensuring trust in the self-adaptive systems they build, i.e., on average 1.7 techniques per participant. We grouped the techniques in three categories: testing and verification, stakeholder-centred techniques, and online techniques. Testing and verification was mentioned 71 times, with extensive testing being the main technique (58 occurrences), followed by benchmarking (10 occurrences) and verification (three occurrences). Stakeholder-centred techniques were mentioned 45 times. In this category, human supervision (22 occurrences) and rigorous design and development (10 occurrences) were the main reported techniques. Finally, online techniques were mentioned 36 times, with runtime monitoring and alerting as the main reported technique (27 occurrences). In contrast to an important focus of research in self-adaptation, (formal) verification as a technique to ensure trust was only mentioned three times.

Key insight(s) from RQ3:
(1) Resource usage and system load are the main types of monitoring metrics used in practice. These metrics are primarily tracked by sensors in the environment and the system.
(2) Practitioners use various mechanisms for analysis in realising self-adaptation, with data analysis methods and comparison to thresholds as main mechanisms.
(3) A wide range of mechanisms are used to enact self-adaptation in industrial systems, with auto-scaling and reconfiguration as top mechanisms.
(4) Practitioners extensively rely on tools such as Kubernetes and AWS to support the realisation of different functions of self-adaptation.
(5) Industrial systems apply a mix of semi and fully automated adaptation.
(6) A majority of practitioners reuse solutions when applying self-adaptation, mainly in the form of code, design artifacts, and specifications.
(7) Ensuring trust in industrial self-adaptive systems is mainly achieved through extensive testing, runtime monitoring and alerting, and human supervision.
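
To make the combination of the most reported mechanisms concrete (monitoring resource usage, analysis by comparison to thresholds, and enacting change by scaling), the following minimal Python sketch may help. It is an illustration under our own assumptions, not a description of any participant's system: the names ScalingPolicy, analyse, plan, and run_loop, as well as all threshold values and worker limits, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy; all values are illustrative assumptions."""
    scale_out_above: float = 0.80  # utilisation above which a worker is added
    scale_in_below: float = 0.30   # utilisation below which a worker is removed
    min_workers: int = 1           # lower bound on the number of workers
    max_workers: int = 10          # upper bound on the number of workers


def analyse(utilisation: float, policy: ScalingPolicy) -> int:
    """Compare the monitored metric to the thresholds; return the desired change."""
    if utilisation > policy.scale_out_above:
        return 1
    if utilisation < policy.scale_in_below:
        return -1
    return 0


def plan(current_workers: int, delta: int, policy: ScalingPolicy) -> int:
    """Apply the change, clamped to the configured worker bounds."""
    return max(policy.min_workers, min(policy.max_workers, current_workers + delta))


def run_loop(samples, workers, policy):
    """Run one adaptation cycle per monitored sample; return the worker history."""
    history = []
    for utilisation in samples:
        workers = plan(workers, analyse(utilisation, policy), policy)
        history.append(workers)
    return history


if __name__ == "__main__":
    # Simulated utilisation samples standing in for a real monitoring tool.
    print(run_loop([0.9, 0.95, 0.5, 0.2, 0.1], workers=2, policy=ScalingPolicy()))
```

In a real deployment, the samples would come from a monitoring tool and the planned worker count would be handed to the platform's scaling interface; both are left out here on purpose.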
Table 14. Analysis of comments - Techniques for ensuring trust in self-adaptive solutions (Q3.8).

Categories and codes # Example quotes
Testing and verification 71
Extensive testing 58 "We use extensive testing (unit, module, system)"; "We have extensive testing on test k8s clusters, provisioned for these purposes."; "We have countless amount of testing and verification code built as part of the OpenJDK to ensure the quality of the product is appropriate."
Benchmarking 10 "As a lot of the self adaptation logic involves optimization opportunities, we also regularly run many benchmarks and immediately report regressions"; "We do testing of the machine learing models, but we also have pilot factories where we test our methods and design to see if all station perform as itended."
Verification 3 "expert testing, supervision, verification when applicable"; "Testing, but also some human verification as part of the Cloud Operations team."
Stakeholder-centred techniques 45
Human supervision 22 "Human supervision until confident."; "Extensive system testing and gradual release of human supervision levels upon system going live."
Rigorous design and development 10 ""; "virtual training to ensure operators understand and are comfortable with the conditions in which the safety system will engage."
Trust in third-party software 8 "for features like auto-scaling compute ... we use trusted vendors and deploy these features mainly for analytics use cases which are not business-critical."
Operational constraints 5 "the concrete actions that are taken by the system are defined by the user. so there is never a surprise. the system only decides if and when to apply these actions."; "Our autotuning algorithms never fail for particular (exactly specified) set of systems. If the system fulfils these assumptions, it works always."
Online techniques 36
Runtime monitoring and alerting 27 "In cases where an existing system is not being replaced but rather new capability is being added, results will be tracked over time to ensure accuracy."; "we have deployed some alert to track the high-level properties of the system."
Continuous testing during operation 6 "there is gradual canary testing in the real production system."; "Automated test scripts, automated "synthetic transactions" in production, model performance validation"
Mitigation strategies 3 "This automation can provide alter with all the steps and rollback automatically if there is any issue."

3.5 RQ4: Difficulties, Problem Support, and Opportunities
3.5.1 Did you encounter particular difficulties when engineering or maintaining self-adaptive systems you worked with? (Q4.1). Figure 10 summarises the findings. Forty-one of 100 participants report that they sometimes face difficulties with applying self-adaptation. Thirty encounter difficulties frequently or very frequently, while 17 rarely or very rarely have difficulties. Four participants reported that they always have problems, while eight reported that they never face difficulties.
Seventy-four participants reported in total 140 difficulties, i.e., on average 1.9 difficulties per participant. Table 15 summarises the findings.

Analysis of comments: We identified four categories of difficulties: design issues, lifecycle issues, runtime issues, and people and process issues. The most frequently reported difficulties, 43 in total, relate to the design of self-adaptation, in particular reliable/optimal design (26 occurrences) and design complexity (17 occurrences). Lifecycle issues were reported 42 times, in particular tuning/debugging (19 occurrences) and limitations of tools and methods (13 occurrences). Difficulties with runtime aspects of self-adaptive systems were reported 30 times, with runtime uncertainty mentioned 17 times, and difficulties related to people and process occurred 25 times, with skills and experience occurring 14 times.
3.5.3 Did you face any risks when engineering self-adaptive systems? (Q4.3). Figure 11 summarises the findings. Thirty-four of 100 participants report that they sometimes face risks when engineering self-adaptive systems. Eighteen report that they frequently to always encounter risks, while 48 rarely to never face risks.
The participants provided a total of 60 responses containing 66 instances of risks faced when engineering self-adaptive systems. On average, the participants reported 1.3 risks.
Analysis of comments: Tables 16 and 17 summarise the findings. Out of the 60 valid answers, 11 were marked as unclear or not applicable. We identified four categories of risks: faults, difficulties with development/operation, impact on qualities, and impact on business. The most frequently mentioned risks, 20 in total, relate to faults, in particular incorrect functionality (7 occurrences), wrong results and misconfiguration (4 occurrences each), and network failure (2 occurrences). Difficulties with development/operation relate to difficulties to manage environment uncertainty (6 occurrences), and difficulties to test and build systems (4 occurrences each). Participants also mentioned the risk of having several qualities impacted; performance degradation is the most frequent with 5 occurrences, followed by reduced availability and safety and security threats with 4 occurrences each. Finally, negative impact on the business in terms of increased cost (5 occurrences) and losing control and trust (4 occurrences) are also reported as important risks when applying self-adaptation.

"We face a risk of underestimating environment variability."; "Legacy monitoring solutions don't cope well with environments that scale back."; "Risk may be encountered if the incoming event stream is completely unpredictable and have huge spike differences in data for a considerable period of timr"
Difficult to test 4 "if the executed actions that will be done by the self-adopting system are not tested before, it might introduce some risks"; "It is also difficult to do reliable performance testing in non-production environments."
Difficult to build 4 "implementing and designing self-adaptive systems may initially seem to take longer time -hence the risk of not being allowed to implement it as good as it can be done"; "Costs of building own (self-hosted) environment [...]"
Other 2 "life updates (no downtime)"; "There is always a lingering concern of quis custodiet ipsos custodes -or 'who watches the watchmen'."

Table 17. Analysis of comments II - Risks faced when engineering self-adaptive systems (Q4.4).

Categories and codes # Example quotes
Impact on qualities 16
Performance degradation 5 "[...] risk of degrading the performance instead of improving it, and degrading the user experience as a result."; "Performance impact on the running system when applying auto-scaling (e.g. scaling down)"; "sometimes a sequence of perfectly acceptable self-adaptive automatic actions can lead to outages worse than the root cause"
Reduced availability 4 "If the system did not behave properly this could result in an outage [...]"; "Availability of the system during the auto-scaling rules being applied"
Safety and security threats 4 "If a system is self-adaptive, how can we secure that it is safe during production (some parts can be powered for self test during assembly and we need to know it is safe)? If we use machine learning on a self-adaptive system, how do we secure safety?"; "There is a risk of misconfiguration that can lead to lost nodes and applications, security exposures etc. There are also security risks involved with the base building components, such as docker images from untrusted sources [...]"
Extra resource consumption 2 "Risk of all resources being eaten up by a self-adaptive process."; "[...] it may use up too many unnecessary hardware and software resources"
Reliability issues 1 "Reliability issues in case of non-converging oscillations or plain wrong output due to prolonged failures in the metrics collection pipelines or simply wrong algorithms"
Impact on business 14
Increased cost 5 "Regarding autoscaling, the main issue was to fail and so increasing the infra cost of the users due to bugs in the system."; "Lost control over system size. This also impacted the approx. total cost agreed with the customer."
Losing trust and control 4 "Trust. Because flexible manufacturing systems have some kind of autonomous behavior with tasks that have been done manually, our clients are initially very sceptial and to not trust the systems initally"; "risk of losing (manual) control of the system for the sake of automation"
Harder to understand/fix 3 "the whole system becomes more complex, hence fewer people understand all details of its behaviour."; "More difficult troubleshooting for a self-adapting, distributed system."
Not useful 2 "The self-adaptive system might not perform better than the baseline when dealing with dynamic shapes, as the cost model might not be generic enough to predict the performance."

3.5.5 How did you mitigate the risks that you faced? (Q4.5). The participants provided 51 responses containing 66 instances of risk mitigating techniques when engineering self-adaptive systems, i.e., on average 1.3 techniques per participant.
Analysis of comments: Table 18 summarises the findings. Out of the 51 valid answers, 13 were marked as unclear or not applicable. The other participants mentioned a variety of risk mitigation mechanisms, which we grouped into three categories. Stakeholder-centred techniques are the largest category with 25 occurrences, followed by offline techniques and online techniques with 18 and 9 occurrences, respectively. Within stakeholder-centred techniques, rigorous design and development (8 occurrences), code review (4 occurrences), and human supervision (4 occurrences) are the most popular risk mitigation techniques. Extensive testing with 15 occurrences is the most mentioned offline technique, while runtime monitoring and analysis with 6 occurrences is the most mentioned online technique to mitigate risks.
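
Several of the reported mitigation techniques can be combined in a single guard around an adaptation action. The Python sketch below is only an illustration under our own assumptions: the function guarded_apply and its parameters are hypothetical, and the approve and healthy callbacks stand in for a human operator and a runtime health check, respectively.

```python
from typing import Callable


def guarded_apply(
    current: int,
    proposed: int,
    max_allowed: int,
    approve: Callable[[int, int], bool],
    healthy: Callable[[int], bool],
) -> int:
    """Return the setting in effect after a guarded adaptation attempt.

    The proposed setting is adopted only if it stays within the operational
    boundary, a human approves it, and the health check passes afterwards.
    Otherwise the current setting is kept, which models a rollback.
    """
    if proposed > max_allowed:
        return current  # operational boundary exceeded: reject the change
    if not approve(current, proposed):
        return current  # human supervision: the operator declined
    if not healthy(proposed):
        return current  # runtime monitoring: roll back to the old setting
    return proposed


if __name__ == "__main__":
    # Illustrative callbacks: an operator who approves everything and a
    # health check that accepts any setting below 6.
    print(guarded_apply(2, 4, 8, lambda cur, new: True, lambda value: value < 6))
```

The order of the checks reflects one possible design choice: cheap, static checks (boundaries) come first, and the human is only asked about changes that could be applied at all.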

3.5.6 Have you faced or seen any problems of self-adaptation for which you would appreciate support from researchers? (Q4.6). Figure 12 summarises the findings for Q4.6. Thirty-three of 184 participants (17.9%) frequently to always experience problems with self-adaptation for which they would appreciate support from researchers, while 43 participants (23.4%) sometimes face such problems. On the other hand, 108 of the participants (58.7%) never to rarely experience problems for which they would appreciate support from researchers. In summary, almost half of the participants believe that they would benefit from support of researchers to address some of the problems they face with engineering self-adaptive systems.
3.5.7 For which problems of self-adaptation would you appreciate support from researchers? Please briefly explain one or two such problems (Q4.7). Sixty-five participants described in total 113 problems for which they would appreciate support from researchers. Tables 19 and 20 summarise the findings.

Analysis of comments: We grouped the problems in four categories: engineering, guarantees, data, and user interaction. Forty-eight of the reported problems (42.5% of the reported problems for which practitioners would appreciate support from researchers) relate to the engineering of self-adaptive systems. The main problems in this category relate to architecture and reuse (16 occurrences) and the adoption of self-adaptation (10 occurrences). Adoption refers to problems within a company with introducing self-adaptation, which can be related to technical aspects, expertise, or organisational aspects. Twenty-five of the reported problems (22.1% of all reported problems) relate to guarantees, in particular providing trustworthiness (20 occurrences) and dealing with unknowns (five occurrences). Problems related to data were reported by 21 practitioners (18.6%) and include data governance and data access (both eight occurrences), and machine learning (five occurrences). The remaining 19 problems (16.8%) relate to user interaction, namely automation (nine times) and user experience (seven times).
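
The percentages above follow directly from the category counts and the total of 113 reported problems. As a quick cross-check, the short Python sketch below recomputes them; the variable names are ours, and the counts are taken from the paragraph above.

```python
# Category counts for Q4.7 as reported in the text.
counts = {
    "engineering": 48,
    "guarantees": 25,
    "data": 21,
    "user interaction": 19,
}

total = sum(counts.values())  # 113 reported problems in total

# Share of each category in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

if __name__ == "__main__":
    for name, share in shares.items():
        print(f"{name}: {counts[name]} of {total} ({share}%)")
```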

3.5.8 In your organisation or in industry in general, do you see application opportunities for self-adaptation that are currently not exploited? (Q4.8). Of the 184 participants, 101 (54.9%) highlight new opportunities for applying self-adaptation, while 83 do not report any. The number of participants within these two groups is almost equally split among participants who have worked with concrete self-adaptive systems and those who have not (see Q0.1) (in particular, 58 participants that worked with self-adaptive systems report opportunities, while 42 do not).

Table 18. Analysis of comments - Techniques to mitigate risks when engineering self-adaptive systems (Q4.5).

Stakeholder-centered techniques 25
Rigorous design and development 8 "careful engineering so that there are open doors for manual intervention, when necessary, without lost of system availability nor hindering the automation mechanisms"; "We try to have design sessions [...] and possibly enhance the design in the early phases of development"; "Engineering analysis, testing, controlled deployment, ... "
Code review 4 "As always, planning, design reviews, code reviews, testing on several levels, monitoring the production."; "Each incident is taken into consideration and rules are always reviewed."
Human supervision 4 "The responsibility was left to a human operator."; "Mainly by performing tests and human supervision (monitoring resource utilization)"
Outsource 3 "Outsource the cloud operation to a specialized provider (RedHat, AWS) where possible. In other cases, customers had to hire experienced administrators/go through extensive period of testing to gain the necessary experience."
Other (post mortem analysis, hiring experts, work in pairs, documentation) 6 "When we hit a problem years after the fact, we perform a detailed post-mortem and try to think about other possible failures we may have missed."; "We hired (multiple) external consultancy firms to tap into their experience in deploying such a system."; "Work in pairs, Document architectural decisions"
Offline techniques 18
Extensive testing 15 "test each action in isolation before it is provided to the system"; "Automated and human testing. In addition for complex algorithms, we run parallel, correlated analysis."; "With automated and manual testing while injecting non-determinism to the test suite"; "Extensive testing at the customers factory and fine tuning of the models."
Set operational boundaries 2 "Defined max-amount of resources a system functionality/component is allowed to consume."; "Thresholds and some manual monitoring"
"We run our processes during the night, when there is less chance of interference with business critical (customer facing) systems."

20 "Formal verification of the algoritmic behaviour of the overall system (correctness)"; "validate my algorithms"; "Safety protocols for Machine learnign in self-adapting systems"; "What are the mechanisms should be integrated into self-adapting system to identify malicious input?"
Unknowns 5 "We normally capture this using some form of process based models, but these struggle with thin[g]s like unknowns."; "not just anomaly detection, but actually responding appropriately to the anomalies (what is appropriate?)."
3.5.9 Please describe or give examples of the application opportunities for self-adaptation that are currently not exploited (Q4.9). Eighty-five participants described in total 147 unexploited opportunities for applying self-adaptation, i.e., an average of 1.7 opportunities per participant.
Analysis of comments: Tables 21 and 22 summarise the findings. We grouped the opportunities in four categories: system activity, system property, engineering activity, and human involvement. Seventy-two of the reported opportunities (i.e., 49% of all) are related to system activity. The opportunities in this category relate to the autonomous operation behaviour of self-adaptive systems (37 occurrences), data management and machine learning (26 occurrences), and auto-scaling (nine occurrences). Forty-seven opportunities (32%) are related to system properties. In this category, the opportunities are related to quality improvement (26 occurrences), security improvement (10 occurrences), and cost effectiveness (eight occurrences). Twenty-one of the reported opportunities (14.3%) relate to engineering activities, in particular maintenance and reuse (15 occurrences), and patterns and libraries (six occurrences). Finally, seven opportunities (4.8%) relate to human involvement, in particular personalisation (four occurrences) and human-machine interaction (three occurrences).

"most of the problems that we faced are related to help the customer to understand the benefits of self-adaptative systems."; "Autoscaling should become commodity products ... As users, the complexity should be abstracted away"
User involvement 3 "User response can also be used for adaption (E.G. if a user constantly overrides the managed systems settings there managing system should 'learn' from the user and adapt the control algorithm for that specific user)"

Key insight(s) from RQ4:
(1) A majority of participants face difficulties when engineering or maintaining self-adaptive systems, mainly with reliable/optimal design, design complexity, and tuning/debugging.
(2) About half of the participants encounter risks when using self-adaptation. The main risks relate to incorrect functionality and difficulty to manage environment uncertainty, as well as degraded performance and increased cost.
(3) About half of the practitioners report that they would appreciate support from researchers to deal with problems they face, in particular problems related to the engineering of self-adaptive systems, guarantees, and management of data.
(4) About half of the participants see future opportunities for applying self-adaptation, in particular in relation to autonomous operation, data management and machine learning.

26 "Based on the alarm certain counter actions could be initiated in order to deal with the faulty behaviour and reach a stable system state."; "Congestion prognosis"; "fault tolerance"; "Power consumption"; "resource optimization"; "There are many opportunities to split up [current monolithic systems] and then make them scalable such that outages are more contained. E.g. screens on trains."
Security improvement 10 "Security of e.g., mobile devices that adapts based on locally identified threats as well as knowledge of risks in the environment."; "Automating changes in Security levels based on threat levels"; "Detecting in-vehicle threats, detecting a system being compromised"; "react to attack patterns"
Cost effectiveness 8 "IT cost reduction (e.g. software asset mgmt)"; "The question really is: How do you do these things on the cheap (with non Silicon Valley billion dollar funding) and in contexts where mistakes might be extremely critical?"

Confidence
Figure 13 shows how confident participants were, in general, in the answers they gave to the survey questions. The results show that almost all participants have confidence in the answers they provided. The numbers for all participants and for those that have worked with self-adaptation are similar.

DISCUSSION
We start the discussion by highlighting a number of observations that we derived from the data analysis. Then we perform a number of additional analyses based on cross-analyses of selected answers to different questions. With these cross-analyses we aim to gain further insights into three topics of interest: benefits of applying self-adaptation in practice, difficulties and risks with engineering self-adaptation in practice, and research support to address problems in practice.

Observations
The problems addressed by industry are in general similar to those studied by academics. Yet, one particular difference is the lack of emphasis of practitioners on the use of self-adaptation to mitigate uncertainties, which has been a key focus in research [11,14,20,52]. A possible explanation is that practitioners avoid the term uncertainty, which may be perceived as "doubt," "not clearly defined," or "not under control." Instead, they refer to uncertainty indirectly by using a different vocabulary, such as "conditions are not always obvious" and "available metrics are not always fully transparent."
While practitioners apply self-adaptation to deal with a variety of problems, changes in business goals are less frequently solved by using self-adaptation. One possible explanation may be that business goals are usually about higher-level requirements, while self-adaptation often targets "lower level" technical problems. In addition, there is the challenge of mapping between business goals and technical/system metrics, which touches the line of work on dynamic software product lines [19]. Yet another explanation may be that self-adaptation has not yet been fully utilised in industry to deal with bigger system changes. We hypothesise that the latter is the case, but further study is needed to obtain deeper insight.
The four classic management tasks of self-adaptation studied by researchers (self-healing, self-optimising, self-protecting, and self-configuring) are also relevant to practitioners. Yet, differently from academics, practitioners also emphasise the importance of improving user satisfaction, reducing costs, and mitigating risks.
Practitioners make extensive use of tools and infrastructures to realise the different functions of self-adaptation. This points to the need for more emphasis on tools and supporting infrastructure in research. Related to that is the need for reusing solutions, for instance in the form of reference architectures and patterns. While some research efforts have been undertaken in these directions, these issues deserve more attention. An interesting step in this direction is the development of industry-relevant artifacts as outlined in [47].
Self-adaptation in software-intensive systems is often not completely automated. Humans remain involved in adaptation, either to provide parts of functions or just to supervise the system. On the one hand, for some companies this is the first step towards further automation; on the other hand, practitioners often express the need for involving humans to ensure trust by overseeing the system and taking action when something unexpected happens. As such, we expect the role of humans in self-adaptation to remain important also for future industry-relevant research in self-adaptation.
It is remarkable that more than 50% of the participants report that they at least sometimes face risks when applying self-adaptation. At the same time, about half of the practitioners express that they would appreciate support from researchers to deal with the problems they face. This suggests that the engineering of efficient and trustworthy self-adaptive systems is a challenge in practice and that practitioners believe that support from research could help them deal with these challenges. This opens opportunities for joint efforts between industry and academics.

Benefits of Applying Self-Adaptation in Practice
When we crosscheck adaptation problems (Q1.1) versus kinds of systems (Q0.2), we observe that most adaptation problems are applied to all kinds of systems, while each adaptation problem is applied in one or two champion kinds of systems. The three most frequently addressed adaptation problems are applied by all kinds of systems. Specifically, the problem "to optimise system performance" is applied to all kinds of systems except transportation, where "to detect and resolve errors" is the main adaptation problem (six occurrences), finance, where "to deal with changes in the environment" is the main problem (five occurrences), and manufacturing, where "to automate tasks" is the main problem (seven occurrences). Table 23 summarises the top occurrences, i.e., types of adaptation problems solved (top occurrences) versus the kind of system for which that adaptation problem is applied (top kind of systems).
We now look at the problems for which self-adaptation is applied (Q1.1) versus the benefits of using self-adaptation (Q1.2). Table 24 shows the contingency matrix. The results show that "improving user satisfaction" and "reducing costs" are by far the most frequently perceived benefits across all types of problems solved with self-adaptation. In particular, these two benefits are mentioned approximately 70% (+/-4%) on average across all problems, while "mitigating risks" and "opening up new application opportunities" are respectively mentioned 53% (+/-11%) and 28% (+/-5%) on average across all problems solved with self-adaptation.
Finally, we look at the potential benefits of reuse using the data of the kind of software systems built by organisations (Q0.2) versus reuse when applying self-adaptation (Q3.5-3.7). The top domains where solutions are frequently reused are data management with 11 occurrences and embedded/cyber-physical/IoT systems with seven occurrences. Manufacturing is the top domain where practitioners very frequently reuse solutions, with seven occurrences. The most frequent type of reused artifact is module with 11 occurrences, with embedded/cyber-physical/IoT as the top domain with four occurrences used for monitoring/analytics/control. Overall, there is no specific artifact that is reused more than others, and no domain that clearly reuses more or fewer artifacts. Only five participants mention the reuse of patterns when engineering solutions for self-adaptation.
Summary for Benefits of Applying Self-adaptation in Practice. Optimising performance and dealing with changes in the environment are the main reported problems solved using self-adaptation in the domain of embedded/cyber-physical/IoT. Not surprisingly, self-adaptation in the cloud is primarily used to automate tasks and reconfigure the system. Reuse of self-adaptation solutions is mostly applied in the domains of manufacturing, data management, and embedded/cyber-physical/IoT systems. The main artifact of reuse is system module.
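To illustrate the kind of cross-analysis used in this subsection, the sketch below builds a small contingency matrix of problems (Q1.1) versus benefits (Q1.2) from multi-select answers. The responses are hypothetical and are not taken from the survey data, and the function names `contingency` and `share` are assumptions introduced for illustration; the exact procedure behind Table 24 may differ.

```python
from collections import Counter

# Hypothetical multi-select answers; NOT data from the survey.
responses = [
    {"problems": {"optimise performance", "automate tasks"},
     "benefits": {"reducing costs", "improving user satisfaction"}},
    {"problems": {"optimise performance"},
     "benefits": {"improving user satisfaction"}},
    {"problems": {"automate tasks"},
     "benefits": {"reducing costs", "mitigating risks"}},
]


def contingency(responses):
    """Count respondents per problem and per (problem, benefit) pair."""
    pair_counts = Counter()
    problem_counts = Counter()
    for answer in responses:
        for problem in answer["problems"]:
            problem_counts[problem] += 1
            for benefit in answer["benefits"]:
                pair_counts[(problem, benefit)] += 1
    return pair_counts, problem_counts


def share(pair_counts, problem_counts, problem, benefit):
    """Percentage of respondents with `problem` who also mention `benefit`."""
    return 100 * pair_counts[(problem, benefit)] / problem_counts[problem]


pair_counts, problem_counts = contingency(responses)
for (problem, benefit), n in sorted(pair_counts.items()):
    pct = share(pair_counts, problem_counts, problem, benefit)
    print(f"{problem} x {benefit}: {n} ({pct:.0f}%)")
```

Normalising each cell by the number of respondents who selected the problem makes rows comparable even when some problems are selected far more often than others.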

Difficulties and Risks of Applying Self-Adaptation in Practice
Large and small/medium organisations (Q0.3) are equally concerned about difficulties with design (Q4.1-4.2).Both types of companies are also concerned about tool support, but in different ways:

External Validity
A potential threat to validity is the generalisation of the study results. Core to this threat is the selection of the sample of the target population. If this sample is not representative, the study results may be imprecise and hence not generalisable. Since we used a non-probabilistic sampling method, there is a potential risk that the sample used to conduct the survey is biased and not representative of the target population. To mitigate this validity threat, we mainly reached out to practitioners from our networks with industry. To ensure that participants have the required experience, we included questions asking about personal experience with engineering self-adaptive systems in practice. The results of the demographics of our sample show that the participants were active practitioners with sufficient expertise in various roles across companies of different sizes. In addition, we worked in total with eight teams from different areas that contacted practitioners from all over the world. This ensured a well-balanced population on a global scale. Because several of the researchers involved in this study are active in the field of engineering self-adaptive systems, the practitioners invited from our networks may have been biased and inclined to apply self-adaptation more often. To anticipate this threat, we did not particularly focus on practitioners that we have worked with in projects, but rather invited practitioners in various software engineering roles that are active across a wide range of domains.

Reliability
Data analysis, in particular qualitative analysis (coding of answers with free text), is a creative task that is to some extent subjective. Performing this task may be influenced by the experience (and even opinions) of the coders [12]. To mitigate a potential interpretation bias, we followed a thorough coding scheme. The coding tasks were distributed among teams of two authors (one team of three). The authors of each team performed coding of the data independently and discussed where needed until an agreement was reached. All coding tasks were then distributed again among two authors. These authors repeated the coding independently from the initial coding. The results were then compared with the initial coding by these two authors. Any discrepancies were discussed among the two authors until consensus was reached. The coding was finally crosschecked with the authors that did the original coding to reach consensus. Finally, all material of the survey, including the raw data and the coding, is publicly available (see footnote 12).
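The comparison step of the coding scheme, in which independently assigned codes are compared and discrepancies are discussed, can be sketched as follows. This is a minimal illustration under assumed data structures (one code per answer, keyed by a hypothetical answer id); it is not the tooling used in the study.

```python
def discrepancies(coder_a, coder_b):
    """Return the ids of answers whose codes differ between two coders.

    Answers coded by only one of the two coders are also reported,
    since they need to be discussed as well.
    """
    return sorted(
        answer_id
        for answer_id in coder_a.keys() | coder_b.keys()
        if coder_a.get(answer_id) != coder_b.get(answer_id)
    )


# Hypothetical codes assigned independently by two coders.
coder_a = {"r1": "design complexity", "r2": "tuning/debugging", "r3": "runtime uncertainty"}
coder_b = {"r1": "design complexity", "r2": "limitations of tools", "r3": "runtime uncertainty"}

to_discuss = discrepancies(coder_a, coder_b)
print(to_discuss)
```

Only the answers returned by `discrepancies` need to go to a consensus discussion; all other codes already agree.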

CONCLUSIONS
In this paper, we studied the application of self-adaptation in industry. To that end, we conducted a questionnaire-based survey with practitioners from all over the world. We received valid responses from 184 participants, 100 of them with experience in engineering self-adaptive systems.
By analysing the data, we contributed an empirically grounded overview of state-of-the-practice in the application of self-adaptation. A selection of key observations includes: i) self-adaptation is extensively applied in industry across a wide variety of domains, ii) the dominating types of adaptations applied in industry are auto-scaling, auto-tuning, and monitoring/analysis, iii) practitioners rely extensively on tools and infrastructure to realise the different functions of self-adaptation, iv) human supervision is important to ensure trust in industrial self-adaptive systems, v) about half of the participants encounter risks with applying self-adaptation, vi) on the other hand, about half of the practitioners would appreciate support from researchers to deal with problems they face. Figure 14 summarises the main findings.
The results offer insights for researchers that enable them to compare the focus of their current research with industrial needs. A selection of related key insights includes: i) different from academics that study adaptation for mitigating uncertainty of classic maintenance tasks (self-*), practitioners also emphasise the importance of improving user satisfaction, reducing costs, and mitigating risks, ii) practitioners (in particular those of small and medium sized companies) rely on tools and infrastructure to realise self-adaptation, iii) ensuring trust in industrial self-adaptive systems is mainly achieved through extensive testing, runtime monitoring and alerting, and human supervision, iv) risks with self-adaptation in practice relate mainly to incorrect functionality, difficulty to manage environment uncertainty, degraded performance and increased cost.

Fig. 14. Summary of the main findings of the survey.
Self-adaptation is widely applied in industry across a broad variety of systems developed by companies of all sizes.
Who? The key people concerned with realising self-adaptation are programmers, project leads, architects and designers.
Why? Practitioners primarily apply self-adaptation to optimise performance, automate tasks, and deal with changes in deployment environments. This results in improved utility and user satisfaction, and reduced costs.
What? Self-adaptation is applied at the level of the system or any part of it. The main types of adaptation are auto-scaling and auto-tuning, triggered by changing properties of systems and the environment, and dynamics in load.
How?
Monitoring: top mechanisms are resource usage and load on the system tracked by sensors in the system and environment.
Analysis: main approaches are data analysis methods and comparison to thresholds.
Change: top mechanisms are auto-scaling and reconfiguration.
Degree of automation: mix of semi and fully automated adaptation.
Trust: achieved through extensive testing, runtime monitoring, and human supervision.
Practitioners make extensive use of tools and infrastructure to realise self-adaptation.
Challenges: Reliable/optimal design of self-adaptation, design complexity, and tuning/debugging are important difficulties in practice. Risks include incorrect functionality, difficulty to manage environment uncertainty, and degraded performance.
Prospects: Practitioners appreciate support from researchers to help solve problems with realising self-adaptation, providing guarantees, and management of data. Opportunities lie in autonomous operation, data management and using machine learning.
The results also offer insights for practitioners to assess the level of their current practice in applying self-adaptation. A selection of related key insights includes: i) practitioners broadly confirm that the use of self-adaptation improves robustness and performance while reducing costs and required resources, and improves user experience while reducing the burden on engineers, ii) a wide range of mechanisms are used to enact self-adaptation in industrial systems, iii) tools and infrastructure, such as auto-scaling and container-orchestration platforms, are available and commonly used to support the realisation of self-adaptation in practice, iv) important challenges when engineering self-adaptation in practice are reliable/optimal design, design complexity, and tuning/debugging, v) there is a relevant match between industrial practice in realising self-adaptation and the body of work performed by the research community of self-adaptation.
The survey results provide prospects for applying self-adaptation in practice and opportunities for industry-research collaborations in this area. The prospects include: i) realising full autonomous operation, ii) exploiting machine learning, iii) improving quality and security, and iv) applying self-adaptation for maintenance. Key opportunities for industry-research collaborations are in: i) consolidating best practices (architecture, patterns, and reuse), ii) modelling paths for the adoption of self-adaptation in industry, iii) supporting advanced features to realise self-adaptation such as dealing with the evolution of self-adaptive systems, iv) rigorous methods for ensuring trustworthiness of self-adaptive systems, v) governance of data, and vi) moving the human in the loop (performing adaptation functions) to the human on the loop (overseeing the system to ensure trust).
We hope that the results of this survey will propel industry-relevant research in the field of self-adaptive systems and enhance the application of self-adaptation in practice, paving the way for self-adaptation to reach full maturity as a discipline.

Fig. 8. Degree of automation of the self-adaptive solutions the participant has worked with (Q3.4).

Fig. 9. Do you reuse solutions to realise self-adaptation? (Q3.5)

3.4.6 Please provide a concrete example of reuse you used to realise self-adaptation? (Q3.6). Sixty-seven participants provided examples of reuse in the realisation of the self-adaptive systems. Analysis of comments: Table 12 summarises the findings. We focused on the subjects of reuse and identified five categories: code, design artifacts, specifications, IT infrastructure, and procedures. The 67 participants provided in total 91 objects of reuse in adaptation, i.e., an average of 1.4. Code occurred 33 times, with modules as the top subject of reuse (18 instances). Design artifacts were mentioned 22 times with patterns and architecture as main subjects of reuse (each with seven instances). Specification was mentioned 18 times as objects of reuse, IT infrastructure 11 times, and procedures seven times. The results demonstrate that reuse in self-adaptation is common practice, although the use of patterns (a topic that gets increasing attention in research) is limited.

Fig. 10. Did you encounter difficulties when engineering or maintaining self-adaptive systems? (Q4.1)

3.5.2 Please give one or two examples of the difficulties that you encountered when engineering or maintaining self-adaptive systems. (Q4.2). Seventy-four participants reported in total 140 difficulties, i.e., on average 1.9 difficulties per participant. Table 15 summarises the findings. Analysis of comments: We identified four categories of difficulties: design issues, lifecycle issues, runtime issues, and people and process issues. The most frequently reported difficulties, 43 in total, relate to the design of self-adaptation, in particular reliable/optimal design (26 occurrences) and design complexity (17 occurrences). Lifecycle issues were reported 42 times, in particular tuning/debugging (19 occurrences) and limitations of tools and methods (13 occurrences). Difficulties with runtime aspects of self-adaptive systems were reported 30 times, with runtime uncertainty mentioned 17 times, and difficulties related to people and process occurred 25 times, with skills and experience occurring 14 times.

Fig. 12. Have you faced or seen any problems of self-adaptation for which you would appreciate support from researchers?(Q4.6)

Table 2. Questionnaire: Drivers for applying self-adaptation in industrial software-intensive systems (RQ1)

Table 4. Questionnaire: How self-adaptation is applied in industrial software-intensive systems (RQ3)

Table 5. Questionnaire: Risks, challenges, and opportunities when applying self-adaptation in practice (RQ4)

"Keep Telco network in optimal condition so that QoS and user experience is maximized, and churn minimized"; "better user satisfaction because of prompt website responses"
Engineers support (18): "removes most of the optimization burden from programmers, so they can be more productive"; "Reduce workload on human operators; make (the results of) certain actions [...] repeatable and predictable"
"Each machine is unique and its optimal operational parameters change over time due to ware, location, task and seasonal factor."
Other improvements (16)
Various (16): "In case of spikes in incoming events the system is able to adapt [...] avoiding bottlenecks."; "Easier and faster market integration"; "It's fundamental in huge infrastructure systems otherwise we can't make it happen."

Table 7. Analysis of comments I - Explain a concrete self-adaptive system you worked with (Q2.1)

"Our company develops safety critical systems for railway. Systems architecture is often with redundancy - e.g. 2 out of 3 system, where is automatic reconfiguration implemented. Purpose is high safety and availability"; "A flexible manufacturing system ... the system and the individual station within the system can "sense" what kind of work piece it has in front of itself and what it or another machine should do with it in the next step."
Module (22): "Environment compensation system for capacitive touch interface. Such system is influenced by envirenmental change (for example temperature)"; "We manage the memory usage of the process. Once memory usage over a limit (i.e. 90%), we start throttling the workload."
tomation, as well as internal development process support (build servers, logging, etc.)."
CI/CD pipeline (3): "Sacling up and down our infrastructure (CI/CD) chain to build and integrate the truck software."

Table 8. Analysis of comments II - Explain a concrete self-adaptive system you worked with (Q2.1)

"Auto-scaling functionality of an Azure Service Fabric cluster running a transformation load for processing AGV statistical and playback data."; "Realtime focused data streaming protocol ... must take care to avoid exhausting the network resources and thus incurring packet loss and latency spikes, which are very noticeable in games."
Environment properties (18): "An IoT system running in Kubernetes and used to monitor water leaking for household insurance."; "A flexible manufacturing system ... can "sense" what kind of work piece it has in front of itself and what it or another machine should do with it in the next step."
System load (14): "Kubernetes, for handling load intensive periods for scaling up, and self recover from crashes."; "Autoscaling of SaaS applications in function of load on AWS and Azure clouds."
Events (12)

Table 9. Analysis of comments - Mechanisms or tools used to monitor a managed system (Q3.1).

"Based on external information (external sensors like Lidar, Camera, GPS, ...) making sure no accident were to happen"; "Exteroceptive are aggregated to create a snapshot of the world's state. These are LIDAR and Image sensors. We use Proprioceptive sensors to determine the robot's state. These are

Table 10. Analysis of comments - Mechanisms or tools used to analyze conditions of a managed system (Q3.2).

Data analysis methods (18): "I think it uses some rolling average or some similar algorithm to estimate whether to scale up or down."; "simple statistical inferences based on metrics and simple rules encoded by developers."; "statistical analysis of data"
Comparison to threshold (16): "Comparing the error rate with constant/dynamic thresholds."; "Hard coded critical boundaries like min max values which lead to switching over to emergency modes [...]"; "when it falls below Service Level Agreements this indicates a need for auto-scaling"
Metric(s) calculation (12): "Failure rate is used to measure quality of adaptation parameters."; "Capturing performance of each node."; "Measurement of traffic load, CPU utilization, and general availability metrics (reachability, status, ...)"
Learning (12): "Each station has a kind of edge computing component that performs some analysis based on machine learning results."; "It tracks both the internal working conditions (load) of itself as a serving component, and learns about overall serving conditions."; "The system uses biosensory feedback to determine the riders' happiness [...]"

3.4.3 What mechanisms or tools does the self-adaptive system you worked with use to change a managed system or parts of it during operation? (Q3.3). The participants provided 126 instances of mechanisms or tools they have used for applying changes, i.e., 1.3 mechanism/tool per participant.

3.4.8 How do you ensure that you can trust the self-adaptive solutions you build? (Q3.8). Ninety-one of the 100 participants that worked with self-adaptation provided valid answers.

Table 15. Analysis of comments - Difficulties with engineering or maintaining self-adaptive systems (Q4.2)

"Complexity in defining the adaptation rules. Conditions are not always obvious."; "Self-adaptiveness or resilience have to be taken into consideration at each stage of the ... workflow. This is really a challenge as more often than not these are concepts that are completely obscure to the average programmer/devop mind."
"If the functionality is not designed in from the beginning then it is a huge amount of work to implement later."; "System architecture over lifetime (nee features to be added...)"

Table 19. Analysis of comments I - Problems for which support of researchers would be appreciated (Q4.7)

Table 20. Analysis of comments II - Problems for which support of researchers would be appreciated (Q4.7).

Table 21. Analysis of comments I - Opportunities for self-adaptation that are not exploited yet (Q4.9)

"Methods to automatically handle changes in the machine learning models and to efficiently deploy them to the edge. There is still lots of manual fine tuning that delays a timely new release."; "The query optimizer of database (i.e. MySQL) could utilize self-adaptation technic."
Autoscaling (9): "The "managed service", which is a stateful service/ data store, is provisioned for the peak capacity, which means resources are idle most of the time. If we can build reliable and efficient system that can automatically scale stateful services based on the demand, we can reduce the cost."; "Our microservices do not dynamically scale"

Table 22. Analysis of comments II - Opportunities for self-adaptation that are not exploited yet (Q4.9).

that the biggest opportunities are found within the Human Machine Interaction or Building Machine Interaction. There will be a future in which talking to a device that can modify the environment (e.g. a robot but not a phone) will be as natural as talking to a person, or seeing a machine interacting with another machine (e.g. robot taking the elevator)"