New Design Potentials of Non-mimetic Sonification in Human-Robot Interaction

With the increasing use and complexity of robotic devices, the requirements for the design of human-robot interfaces are rapidly changing and call for new means of interaction and information transfer. On that scope, the discussed project, developed by the Hybrid Things Lab at the University of Applied Sciences Augsburg and the Design Research Lab at Bauhaus-Universität Weimar, takes a first step in characterizing a novel field of research, exploring the design potentials of non-mimetic sonification in the context of human-robot interaction (HRI). The setup features an industrial 7-axis manipulator and collects multiple data streams during manipulation (for instance, the position of the end effector, joint positions, and forces); these data sets are used to create a novel augmented audible presence and thus allow new forms of interaction. As such, this paper considers


INTRODUCTION
With the growing complexity and autonomy of robots, the question arises how we can improve the coexistence between human and machine. Machines are increasingly equipped with accurate sensors that outpace human senses; they can access data from multiple sources and make decisions based on this information. These new levels of complexity pose radically new questions regarding trust, usability, and interaction.
From a post-phenomenological perspective, following Don Ihde [13], we are connected to the world by technology, which mediates our relationship to the world surrounding us and thus changes how we approach and understand the world. Technology is, in this sense, mostly used intentionally and is fundamentally blurring the boundaries between human beings and technological artefacts. In this, Ihde describes different types of relation between human, technology, and world. Among the other relations he mentions, one type is the alterity relation. In an alterity relation, the human interacts with technology itself, not with the world behind a certain technology. In this kind of relation, technology is perceived to be operating autonomously, making its own decisions, which makes this perspective especially interesting when discussing human-robot interaction.
This new type of proactive automatism fundamentally changes how we interact with these machines. The cooperative robot is neither a passive tool, nor a mere body extension, nor a fully automatic machine. It is designed to be efficient, durable, useful, human-friendly, and approachable. We rather "co-operate" and "inter-act" by establishing a dialog-like interplay between autonomous and controlled behavior that is more akin to social interaction than to operating a machine. In fact, how these machines are designed to interact greatly influences the subtle qualities of this relationship.
Since sound is an important communication medium and source of information for humans, which is deeply rooted in our perception, the question arises how sound can influence the subtle qualities of the relationship between human and industrial manipulator.
We see the thingness of a robot, with its material qualities, its movement, and its behavior, as a framework. Our approach takes the data representation of this framework and uses it as input for sound generation. We hope for a better understanding of how to design the qualities of interacting with an autonomous tool, beyond social robotics.
We use the term mimesis, which describes imitation, representation, and mimicry. Mimesis is also a term used in ancient Greece to describe the ability to cause an impression by a gesture [22]. Our aim is to create sounds in a non-mimetic way and therefore to avoid anthropomorphic or zoomorphic design strategies in our sonification approaches. We use sound as a carrier of information, which is represented by modulations of sound parameters. However, we are aware that effects like pareidolia also occur in sound, and we are probably biased in our design decisions. In this way, our endeavor explores new design strategies and applications for non-mimetic human-robot interaction and, as such, focuses on auditory methods and techniques. In particular, we seek approaches and insights that enable designers (and related disciplines) to consciously create a meaningful auditory environment for human and machine as co-inhabitants. In our designerly approach of creating an audible presence through sonification, we specifically demonstrate an alternative approach of creating an auditory display, moving beyond alarms, warnings, and process monitoring sounds, while aiming for a holistic augmented auditory presence generated by data. As such, we use information generated during the operation of a seven-axis industrial manipulator and map it to specific control parameters of a software instrument.
This allows us to include distinct physical properties in auditory dimensions, as well as abstract information that may not be directly apparent from the physical appearance of the machine. By leaving behind mimetic characteristics, we open new design spaces for uncharted auditory design strategies and applications.
In this article, we first provide a brief overview of sound and sonification in HRI contexts, followed by the identification of key research parameters for our exemplary sonification scenarios. Next, we present our approach and setup and give a detailed description of the mapping process and the corresponding explorations. We conclude with a summary of the results and give a short outlook regarding future steps of integrating these findings into a systemic, unifying human-robot interaction perspective.
It is important to understand that the goal is not to quantify or to improve specific task parameters but rather to provide a qualitative understanding of how real-time data sonification can be designed to affect collaborative human-robot tasks. We combine well-known principles of sonification in a data sonification approach with a distinct scenario for collaborative human-robot tasks and acquire the results by means of an exploratory, action design research (ADR) process.

Sound Techniques in HRI
As robots and humans increasingly expand their space of action, it is important for safe operation that humans are informed, for instance, about future movements, changes in operation, or machine failures. This applies not only to collaborative environments in industrial contexts but also to other applications where automated machinery makes autonomous decisions, such as (semi-)autonomous transportation systems. There are several ways of generating signals, such as light, sound, motion, and gaze, that can help people to detect the intentions of machines [3]. In addition, robotic devices have recently been ascribed social properties [10], which raises questions of how to communicate emotions and social intentions [15]. Since sound is an important modality for information transfer, not only between humans but also between humans and animals or objects, it is claimed that sound can enhance human-robot interaction and has distinct advantages over other signaling mechanisms [3, 14]. For example, automatic speech recognition and speech synthesis have radically advanced in the last few years, greatly influencing the possibilities for human-robot interaction through natural spoken language [9]. This approach is particularly useful for anthropomorphic robots, while, at the same time, it raises questions about how these artificial properties retro-act with humans [7]. An approach working with artificial sounds is the use of non-linguistic utterances, which focuses on the design of audio cues like beeps and blips, similar to robot sounds in games or movies [25, 27]. The growing interest and importance of sound design for robotic devices is also evidenced by other research on sound, such as the effect of motor sounds on the user [24] or the question of how a robot's consequential sound should be pitched [31].
As we continuously use the incoming data to generate sound, we put the listener in an action-perception loop created by the parameters of the robot, which has implications for interaction [20]. There are also strong indications that these connections between sound and action are deeply rooted in the human brain [1], and recent research shows that these stimuli could also be applicable to non-biological movement [2].
Jørgensen and Christiansen follow a sonification approach to find an engaging and appropriate sound for their soft robot SON Ō in a social human-robot interaction setting [18]. In three sonification scenarios, they tested different sound typologies, coming to the conclusion that their sonification approach had no statistically significant effect on people's perception of social attributes of the used soft robots. However, the interpretation of sound seems to correlate with the robot type and the manipulation case itself.
In this regard, Schwenk and Arras generate phonemes of a virtual language for their mildly anthropomorphic robotic platform Daryl [28]. They pick up on the human/animal characteristics of Daryl and demonstrate different scenarios, modifying the robotic utterances by movement parameters of the head, eyes, and ears. Another demonstration shows reactive sonic feedback, using perceptual input to change the robot's voice.
In sum, these projects share an explicit interest in the development and integration of novel forms of sonification for human-robot interaction, including, for example, distinct emotional, social, and perceptual aspects. Our focus, however, is not only to explore new potentials of sonification in general but also to specifically advance non-mimetic concepts, corresponding parameters, and components to create a meaningful auditory presence.

Motion and Action Sonification Techniques in Other Domains
In this section, we briefly discuss motion and action sonification techniques in other domains, with a focus on real-time and interactive scenarios, as they are related to our approach.
In medical applications, sonification is used for spatial orientation in real time in surgical scenarios. Jovanov et al. [17] show different sonification scenarios in which a surgeon is guided by sonic events, such as beat interference, pitch, discrete clicks, spatial audio, or the modification of wave tables. Another article [16] shows the sonification of electroencephalogram and magnetoencephalogram data in real time. The authors modulate the pitch, volume, or balance of selected sound patterns according to the data stream. They see benefits in receiving additional information hands-free and without looking toward an additional screen but criticize the lack of obvious design solutions for mapping parameters to certain sound attributes in this given application.
An example of assistance devices for visually impaired people is a laser range finder that transfers depth information to audio [23]. The group maps the depth information to pitch, volume, or both in real time. The user can choose between two modes, either mapping the data proportionally to sound or mapping the change between two consecutive measurements. Their device generates a Musical Instrument Digital Interface note for each measurement, which is played by a QuickTime Music Architecture software synthesizer. Another example is the In-Situ Audio Services project [8], also addressing visually impaired people. The project employs spatialized audio rendering to convey the relevant content, which may include information about the immediate surroundings, such as restaurants, cultural sites, public transportation locations, and other points of interest. The user carries a mobile device equipped with a Global Positioning System receiver and a compass, which can locate the user and determine the orientation of the device. On a virtual map, points of interest are stored and associated with spearcons. Depending on the operation mode and the position and orientation of the user, the spearcons for points of interest are played back with different real-time effects, such as reverb for distant locations.
Another project [26] aims to facilitate the work of air traffic controllers with a holistic audio gestalt of the inbound and outbound traffic of air traffic control sectors. The authors use synth string sounds for each individual airplane, focusing on airplanes that are important for the controller, and modify pitch and harmonies for each aircraft, depending on inbound and outbound traffic, to indicate planes that need attention. Their approach seems to support controllers in situations with low to medium traffic, when attention and concentration decrease; in high-traffic situations, however, the sonification interferes with radio calls and distracts.
These selected examples illustrate the various ways and approaches for real-time motion and action sonification scenarios. It is noticeable that there is no prevailing design strategy for sonification scenarios, but as Dubus and Bresin point out [6], there seem to be more and less popular mappings.

RESEARCH PARAMETERS AND SETUP
Our research rationale is based on an ADR methodology [21, 29]. This implies the assumption that action research can be a valuable basis of knowledge generation, where context-specific research-through-design studies [32] foster the formulation of distinct (and yet generalizable) design principles and experiences. For this reason, we prefer to talk about (explorative) "experiences" rather than (explanative) "experiments." We therefore focus on a series of incremental studies, featuring variable design and process parameters, and their validation through a comparative qualitative analysis [4].

Sound
Our research is based on specific parameters and strategies to manage and perform complex, non-mimetic robotic sonification processes in a real-world, one-to-one exploration. Here, we have identified three general sound categories:
• Pitch: Pitch describes the fundamental frequency of a sound and is usually measured in hertz. When talking about sounds, it often refers to the perceived highness or lowness of a sound.
• Volume: Volume is the intensity of a sound, which we often refer to as loudness. Volume is usually measured in decibels.
• Timbre: Timbre describes the qualities of a sound that let us distinguish between two instruments playing the same pitch and volume, for example a trumpet and a piano. From a physical perspective, we refer to overtones and wave forms.
These categories, or auditory dimensions, are the basic parameters describing sound in physical terms, while playing a fundamental role in cognitive processes [12]. Since there are many ways to describe and analyze sound in different auditory dimensions, the chosen dimensions represent only a fraction of all possibilities; however, they are most commonly used in auditory displays. In turn, they enable us to systematically explore non-mimetic sonification (as outlined in Section 4) and to address respective theoretical, practical, and methodological challenges of the relationship between physical and auditory representation in human-robot interaction.
It is important to understand that there is no universal sonification strategy in this scenario. As the industrial manipulator is a versatile tool, suitable for a wide range of tasks, there is no distinct or absolute physical model [6] on which we can rely. We rather see the robotic arm as a generic apparatus whose sounds can be designed in many ways, fostering different non-deterministic qualities and auditory potentials in human-machine interaction.

Setup
The robotic arm with which we conducted our explorations is a Panda by Franka Emika, equipped with a mechanical two-finger parallel gripper at the end effector (EE). The Panda is an industrial manipulator with seven degrees of freedom, torque sensors in all axes, and a maximum reach of 855 mm. In the case of an impact, the robot immediately stops, being sensitive enough not to hurt a human. Given an exemplary human-robot assembly task (as described in Section 3.3), our setup retrieves the following data from the machine during manipulation:
• position and rotation of the end effector
• angle of each joint
• force applied by each joint
From these data sources, we derive additional datasets, such as the velocity of the end effector or the joint rotation velocities. Our explorations use the position of the end effector, the summed force of all joints, and the angle of each joint. We chose the position of the end effector as a data source because it is an apparent and important property for our task, which also has a clear visual representation. We also see potential in transporting the robot's internal data, which does not necessarily have an externally recognizable visual representation. For example, the position of the end effector has a visual representation, while the summed-up force over all joints is an abstract value not readable from the physical appearance of the machine. We hope for insight into how well such parameters can be addressed. The angle of each joint represents the unique pose of the robot and is therefore a complete, yet complex, data source representing the whole physical state of the machine.
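As a rough illustration of such a derived dataset, the end-effector speed can be computed from two consecutive position samples. The following SuperCollider sketch uses illustrative values and a 10-ms sampling interval; in our setup this derivation happens upstream of the sound engine.

```supercollider
(
// Minimal sketch: end-effector speed from two consecutive [x, y, z] samples.
// Positions in millimeters and the 10 ms interval are illustrative values.
var dt = 0.01;  // sampling interval in seconds
var speed = { |prev, curr| (curr - prev).squared.sum.sqrt / dt };
speed.([100, 200, 300], [101, 202, 300]).postln;  // -> ~223.6 mm/s
)
```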
The data are read by OSCTool, a software tool developed for these explorations to acquire, prepare, and forward information via the Open Sound Control (OSC) protocol. We fetch the data via HTTP request from the robot's control unit at a fixed interval of 10 ms. We normalize the position of the end effector from millimeters to a range from 0.0 to 1.0, representing the maximum reach of the given axis. We also normalize the joint angles from 0.0 to 1.0, based on their leftmost and rightmost rotation limits. Where an exploration treats the data differently, we go into further detail below. As sound generator we use SuperCollider, an open source platform for audio synthesis and algorithmic composition. SuperCollider is based on the OSC protocol, which targets the control of electronic musical instruments, sequencers, and show equipment such as spotlights, moving heads, and fog machines. OSCTool prepares the data for the specific exploration and controls a SuperCollider instance via OSC. The mapping process (which data source controls which part of the sound) is done in OSCTool. SuperCollider provides the sound generators and synth definitions.
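To make the normalization step concrete, the following is a minimal sketch in SuperCollider syntax; the axis and joint limits shown are placeholders, not the Panda's actual bounds, and in our setup this step is performed by OSCTool before the data are sent via OSC.

```supercollider
(
// Minimal sketch of the linear normalization described above.
// The limits are placeholders, not the robot's actual workspace or joint bounds.
var normalize = { |value, min, max| value.linlin(min, max, 0.0, 1.0) };
normalize.(427.5, -855, 855).postln;   // end-effector x in mm  -> 0.75
normalize.(0.0, -2.9, 2.9).postln;     // joint angle in rad    -> 0.5
)
```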
We conducted our explorations with two different synths, programmed in SuperCollider: the SimpleSawSynth and the TimbreSynth. All of our explorations regarding timbre use the TimbreSynth, and all other explorations (pitch and volume) make use of the SimpleSawSynth.
The SimpleSawSynth. The SimpleSawSynth (see Figure 1) consists of a single saw-wave oscillator with parameters to control pitch and volume in real time. Both parameters are filtered to smooth out (discrete) audible steps for large data changes (pitch: 0.2 s, volume: 0.05 s). The signal of the single oscillator is mixed up to a stereo signal.

The TimbreSynth. The TimbreSynth (see Figure 2) is a simple subtractive synth. The pulse width of a square-wave oscillator is modulated by a sine-wave oscillator, with a factor of 0.01 at an attenuation of 0.1, resulting in a slow, subtle pulsing sound. The square-wave oscillator is then filtered by a resonant low-pass filter, and the resulting signal is multiplied by a volume parameter and mixed up to a stereo signal. In terms of parameters, we control the frequency of the square-wave oscillator, the overall volume, the filter frequency, the q-factor of the filter, and the speed of the sine-wave oscillator. All parameters are smoothed over 0.2 s.
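The following SuperCollider sketch illustrates how the two synths could be defined. Only the overall structure follows the description above; default values, parameter ranges, and the exact modulation constants are assumptions rather than our actual synth definitions.

```supercollider
(
// Hedged sketches of the two synths described above.
SynthDef(\simpleSaw, { |freq = 200, amp = 0.1|
    var sig = Saw.ar(Lag.kr(freq, 0.2));      // pitch smoothed over 0.2 s
    sig = sig * Lag.kr(amp, 0.05);            // volume smoothed over 0.05 s
    Out.ar(0, sig ! 2);                       // single oscillator mixed up to stereo
}).add;

SynthDef(\timbre, { |freq = 80, amp = 0.3, ffreq = 600, rq = 0.3, lfoSpeed = 1|
    // pulse width slowly modulated by a sine LFO for a subtle pulsing sound
    var width = 0.5 + (SinOsc.kr(Lag.kr(lfoSpeed, 0.2) * 0.01) * 0.1);
    var sig = Pulse.ar(Lag.kr(freq, 0.2), width);
    sig = RLPF.ar(sig, Lag.kr(ffreq, 0.2), Lag.kr(rq, 0.2)); // resonant low-pass (rq = 1/Q)
    sig = sig * Lag.kr(amp, 0.2);
    Out.ar(0, sig ! 2);
}).add;
)
```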

Task
To explore new potentials of non-mimetic sonification in human-robot interaction within a real-world and representative setting, we opted for an exemplary collaborative assembly task between a human and a robot (the collaboration is described in Section 3.4), acknowledging that collaborative assembly and build-up of discrete components represents an established task in robotics and automation [11]. Here, the robot's task is to configure a pile of three discrete cubes into a Towers of Hanoi puzzle using a sequential assembly logic (see Figure 3). The cubes are marked with a visual symbol for recognition and have the same size, featuring an edge length of 65 mm. They are made from 3-mm-thick laser-cut and glued fiberboard elements. The weight of one cube is 65 g. During assembly, one marked cube does not match the other cubes and needs to be exchanged manually during the stacking process by human intervention. The exchange of the cube simulates a situation the robot cannot solve without a human collaborator; the robot itself follows a programmed linear sequence. To change the cube, the robot hands the "false" cube to a human collaborator, who replaces it with a correct one. After this interaction, the robot proceeds with the pre-programmed task. When designing the task, we included smaller movements (stacking from one place to another at a distance of about 30 cm) and larger movements (handing the cube to the human collaborator). We also assigned two axes with considerably more movement (left/right and up/down) and one axis with less movement (forward/backward).

Interaction
Our exploration implements a simple human-robot collaboration task to emphasize real-world physical interaction modalities in addition to status and control monitoring. As outlined above, one cube is given to a human collaborator by the robot and is exchanged for another cube. While the stacking takes place at one side of the robot's table, the human interaction takes place at the other side, requiring long-distance movements of the robotic arm and thus initiating significant changes in the sonification. Our setup is similar to a closed-loop sonification approach, with the important difference that the robotic manipulator neither senses nor reacts to the sound and follows a linear sequence, while the human observes the workflow and setup. Sonification therefore has an interactive and mediating role [30] between human and machine.

EXPLORATIONS

Structure and Mappings
In our explorations, we orient ourselves by de Campo's [5] sonification design space map, following a continuous data representation approach. In this, our explorations are structured in a sequential order and, as such, are organized within a coherent matrix (see Table 1). De Campo's Data Sonification Design Space Map can help to understand different sonification approaches and to gain a better conceptual understanding of sonification strategies. It is a three-dimensional map representing the properties of the data representation in each dimension as follows: • x: The number of data points needed for Gestalt perception.
• y: The number of data properties.
• z: The number of streams.
De Campo also situates regions of parameter combinations in his map, resulting in areas named Model-Based Sonification, Continuous Sonification, and Discrete-Point Sonification. For our explorations, we treat the robot's data as data points in real time, starting from the x-axis. This is limited by our polling rate and therefore settles at 100 data points per second. As this is, in our approach, strongly linked to the movement of the robot, which includes the dynamics of the movement over time, we use a parameter mapping strategy to create a meaningful auditory space. On the y-axis, we map only one data property for each exploration in E1-E3 (pitch) and E4-E6 (volume) to a SimpleSaw synth. For E7-E9, we map three data properties linearly to a Timbre synth.
Since we are focusing on different motion parameters of the industrial manipulator, we create a new synth for each new parameter. This results in an increasing number of streams along the z-axis. In E1, E4, and E7, we have separate streams for the three movement axes, resulting in three streams and therefore three synths per exploration (see Figure 4). With an increasing number of movement parameters (rows in Table 1), the number of synths increases. As such, we use four streams in E2, E5, and E8, and seven streams in E3, E6, and E9.
During our explorations, we designed two additional sonification scenarios (E10 and E11), described in the Excursus. Due to the exploratory approach of this research, we took the opportunity to test these sonification scenarios as well. In E10, we have only one stream and one data property (volume), while in E11 we also reduce the streams to one but treat the movement axes as data properties, which moves us in the map a little toward a model-based sonification strategy.
We treat the incoming robot data as quasi-analog data, using a dedicated SuperCollider synth for each dataset and property. We control the specific parameter (for example, volume) of each synth (or data property) by sending the corresponding data via OSC with OSCTool. In what follows, we provide a detailed description of the parameters and mappings for each row and column of our matrix.

EE x, y, z. Positions were recalculated to a value from 0.0 to 1.0, defined by the minimum and maximum reach of the robot's end effector on a given axis. In our explorations, we focused on the translation of the end-effector position and left aside additional rotational components. For the pitch explorations, we created one synth for each axis, mapped its position to pitch, and distributed a fixed volume evenly over all three synths. In our volume explorations, we predefined the pitch for each axis with notes of, for example, a spread C major chord (x: C2, y: E3, z: G4), mapping the volume for each synth to the axis from 0 to one-third of the maximum volume. The timbre exploration also creates three synths (one for each axis) and maps the data of each axis to the synth's timbre-related parameters: speed of the pulse-width modulation, filter frequency, and filter q-factor. The volume is preset at one-third for each synth, and the frequencies of the square-wave oscillators are set to 80, 85, and 90 Hz, resulting in a drone-like soundscape.
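As an illustration of the pitch mapping for the end-effector axes (E1), the following sketch creates one SimpleSaw-style synth per axis and updates its frequency from incoming OSC messages. The OSC address and message layout are assumptions, the \simpleSaw definition refers to the sketch in the Setup section, and in our actual setup the mapping itself is performed in OSCTool.

```supercollider
(
// Sketch of the E1-style mapping: one synth per end-effector axis,
// normalized position mapped linearly to 0-400 Hz.
var synths = 3.collect { Synth(\simpleSaw, [\amp, 0.1]) };
OSCdef(\eeToPitch, { |msg|
    // msg: ['/robot/ee', x, y, z], each axis normalized to 0.0..1.0 (assumed layout)
    3.do { |i| synths[i].set(\freq, msg[i + 1].linlin(0, 1, 0, 400)) };
}, '/robot/ee');
)
```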
The summed force. The summed force represents the summed value of the forces of all joints, normalized from 0 (no force applied) to the maximum force applied in our task overall. We added the summed force as a fourth parameter to all EE x, y, z examples and mapped it in the same parameter range. For the volume exploration, we preset the pitch with the note B4, resulting in a C major 7 chord.
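As a minimal illustration (the actual force signals and their normalization are handled in OSCTool; the values below are placeholders), the summed-force parameter can be derived as follows:

```supercollider
(
// Sketch: sum the joint forces (here, absolute values) and normalize by the
// maximum summed force observed over the whole task (placeholder numbers).
var maxSum = 120;                                        // assumed task-wide maximum
var jointForces = [3.2, 1.5, 0.8, 2.1, 0.4, 0.9, 0.3];   // placeholder readings
var summed = jointForces.abs.sum.clip(0, maxSum) / maxSum;
summed.postln;   // -> value in 0.0..1.0, fed to the fourth synth
)
```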
The joint angles. The joint angles are the measured angles of each individual joint, normalized from 0.0 to 1.0. We did not correct the positions for three joints that differ from the other joints in their position limits but treated them the same in our sonification. As a consequence, there are joints that cannot reach the same value as others. We use one synth for each joint. In our pitch exploration E3, the tuning differs from the other pitch explorations to provide another range to compare with.
Table 1 gives an overview of the parameters mapped and the synths used in each exploration. Where we also changed the tuning, this information is given in the description of the respective exploration.

Sonification Studies
The following sonification descriptions include a first-person account of authors 1 and 2. These cover their listening experience of, and interaction with, the sonification (through exchanging the cube as described in Section 3.3). We strongly recommend watching our linked videos as you read along. Our first-person, qualitative analysis is based on iterative and incremental listening passes, including 22 full-length passes (E1-E11) and 12 selected passes (with a specific focus on E3, E9, E10, and E11). In each iteration, the person interacting with the robot changed, so that both could experience the passive spectator role as well as the interactive experience. Both participants took the interacting and the spectator role for all explorations. As the design process of these exploration scenarios follows a quasi-experimental methodology, both participants were involved in the setup and listening process, and they were familiar with the design decisions, the overall validation, and the gestalt of the explorations.
The headings link to a video of the corresponding exploration for convenience. We recommend quality speakers or headphones. In case the links do not work in your document, the written URL leads to the same video. For further reference, we provide spectrograms for each exploration.

Pitch.

E1: EE x, y, z - Pitch (see Figure 5(a)), Video: http://sisound.naphausen.info/E1. The parallel glissandi of the synths result mainly in a non-musical interplay of different frequencies, leading to an unpleasant sound experience. In fact, there are moments when the robot does not move, while all three synths settle at a certain frequency. If the frequencies happened to be in a harmonic ratio to each other, the pose gave the impression of being intentionally designed, which was not the case in our explorations. Subtle movements also became discernible in the sound pattern, as did the difference between slow and fast movement, small and large movement, as well as standstill and movement. Above all, it was possible to separate single oscillators while listening and to correlate axis movements to them.
Number of Synths: 3 × SimpleSaw synth; Pitch: 0-400 Hz; Volume: all synths with same volume.

E2: EE x, y, z, summed force - Pitch (see Figure 5(b)), Video: http://sisound.naphausen.info/E2. The added force parameter draws attention by quickly and unpredictably changing its frequency in comparison to the axis parameters. In the listening experience, it appears as a dominant, almost whining-like sound. It was impossible to hear a difference between the robot moving with or without a cube. There were no moments where all synths came together in a moment of harmonic interaction, as the synth corresponding to the force is constantly changing. The moments with musical correlations still occurred for the synths of the x-, y-, and z-axes but were overlaid by the sound of the force synth. It was possible to perceptually separate the fast-changing synth representing the force parameter.
Number of Synths: 4 × SimpleSaw synth; Pitch: 0-400 Hz; Volume: all synths with same volume.

E3: Joint Angle - Pitch (see Figure 5(c)), Video: http://sisound.naphausen.info/E3. The perceived sound did not allow any conclusion about which joint specifically changes the sound in which dimension. Individual synths mixed and interfered with each other, which made the sound more of a morphing sound body. Neither subtle movements nor velocity could be identified in the sound representation. Sound representations of the regions of the task could be identified, such as "near the initial stack" or "near the target stack", when the machine was not moving.

Volume.
E4: EE x, y, z - Volume (see Figure 6(a)), Video: http://sisound.naphausen.info/E4. Mapping the end-effector position to volume requires a different frequency for each synth to achieve audible differences in sound. The organization of the frequencies in a single chord results in a more static and musical listening experience. As the individual notes of the chord get louder and softer, the impression of a morphing body of sound is created. It was possible to follow larger movements in the listening experience, while smaller movements had less audible effect, as did the speed of movement. Larger movements could be distinguished and revealed the corresponding mapping between the note of the chord and the movement axis. It was feasible to identify regions, similarly to E3, but the regions were larger.
Number of Synths: 3 × SimpleSaw synth; Tuning: C2, E3, G4; Volume: 0-0.3 per synth.

E5: EE x, y, z, summed force - Volume (see Figure 6(b)), Video: http://sisound.naphausen.info/E5. By introducing the B note to the C major chord, the musical appearance changes to a C major 7 chord. Similarly to the corresponding pitch exploration, the new note was easily found by its quick changes in volume and could be identified as a separate parameter. By arranging the frequencies in a chord, the new note blended pleasantly into the listening experience. Small changes in the force parameter were not perceived in such detail and could rather be categorized into a "low," "mid," and "high" range.
Number of Synths: 4 × SimpleSaw synth; Tuning: C2, E3, G4, B4; Volume: 0-0.25 per synth.

E6: Joint Angle - Volume (see Figure 6(c)), Video: http://sisound.naphausen.info/E6. The C major chord, spread over three octaves and seven sound generators, provides a dense, pulsing sound through interacting frequencies. A link among different regions, motions, velocities, or other properties of the robot and the sound could not be established. When skipping through our recordings, it appeared that the sound changed, but in the real exploration it was not distinguishable whether the sound change was real or imagined.

Timbre.

E7: EE x, y, z - Timbre (see Figure 7(a)), Video: http://sisound.naphausen.info/E7. The timbre synth introduces a lively, complex sound structure by using a pulse wave as primary oscillator and filtering it close to self-resonance. Compared to our previous explorations, this synth has an independent time-based modulation, resulting in a slightly pulsing sound. Like in all other explorations, we use one synth for each parameter, but by filtering and modulating we achieve a harmonically rich sound. It was difficult to separate the single synths from each other, so the sound appeared more like a soundscape. Subtle changes were inaudible, as were differences in movement speed. As our TimbreSynth uses a resonant filter, the resonating frequencies became very dominant in the sonification. It was also possible to identify areas in the movement that belonged to certain sound qualities, but they were not as clearly defined as in E3.

Number of Synths: 3 × Timbre synth; Tuning: 80, 85, 90 Hz; Volume: all synths with same volume.

E8: EE x, y, z, summed force - Timbre. Similarly to E2 and E5, the newly added parameter was apparent by its fast and unpredictable changes. It blended subtly into the soundscape and was perceived as part of a lively sound structure, which did not lead to an unpleasant experience as with E2. The TimbreSynth and the consequential mapping broke the logical link between a higher force and, e.g., a higher pitch (E2) or a louder volume (E5). It rather introduced a change in multiple auditory dimensions without a predefined direction.
Number of Synths: 4 × Timbre synth; Tuning: 80, 85, 90, 95 Hz; Volume: all synths with same volume.

E9: Joint Angle - Timbre (see Figure 7(c)), Video: http://sisound.naphausen.info/E9. The seven synths, all modulated at the same time, gave the impression of colorful noise rather than a harmonic listening experience. Still, it was a pleasant sound. It was possible to identify regions in the sound with vague boundaries, but the changes were more perceivable while the robot was moving. The speed of movements was not perceivable. Subtle movements, however, could be identified in the audible representation.
Number of Synths: 7 × Timbre synth; Tuning: 80, 85, 90, 95, 100, 105, 110 Hz; Volume: all synths with same volume.

Excursus. In addition to the nine examples provided by our matrix, we identified two other sonification scenarios. To fully exploit the scope of our research approach, we also tested these two scenarios:
• E10 Speed: Only the movement speed is mapped to volume.
• E11 Model: Instead of creating three distinct synths as in E7, we map EE x, y, z to three parameters of a single synth.

E10: Speed - Volume (see Figure 8), Video: http://sisound.naphausen.info/E10. In this exploration, we gain a new input parameter by calculating the movement speed of the robot. The speed is normalized over the whole task between 0 and 1 and then mapped to the volume of a single SimpleSawSynth with a fixed pitch of 90 Hz. Although it is a minimalist listening experience, the transmitted information is very precise and diverse. It was feasible to hear the smallest movements and to make assessments about the length and size of a movement as it progressed. The connection between the speed of a movement and volume seemed natural, as physical movement tends to be louder with increasing speed. This mapping thus shows approaches of physical modelling. The reduction to one parameter also helps to understand the intended mapping.

Number of Synths: 1 × SimpleSaw synth; Tuning: 90 Hz; Volume: 0-1.

E11: EE x, y, z - Model (see Figure 9), Video: http://sisound.naphausen.info/E11. This exploration is similar to E7, but instead of creating a new synth for each motion axis, we map each axis to one parameter of the synth. Compared to the other explorations, this approach is fundamentally different: we shift from a continuous data representation to a model-based representation. The TimbreSynth explorations demonstrate the sonification possibilities even of a simple synth, which motivated us to experiment with different parameters and mappings. In this exploration, we map the x-axis to the speed of the pulse-width modulation, the y-axis to the q-factor, and the z-axis to the filter frequency. The individual mappings between the audible dimensions and the axes of motion were quite opaque, but the sound changes were more pronounced and appeared to be more closely associated with the movements of the machine. Similarly to E10, this exploration benefits from the reduction of the number of synthesizers, as it allows us to focus on more distinct changes in the sound.
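A sketch of this model-style mapping could look as follows, with the three end-effector axes controlling three parameters of a single TimbreSynth-style synth (the \timbre definition from the Setup sketch). The OSC address, message layout, and parameter ranges are assumptions.

```supercollider
(
// Sketch of the E11 mapping: x -> speed of pulse-width modulation,
// y -> filter q-factor, z -> filter frequency, all on one synth.
var synth = Synth(\timbre, [\freq, 80, \amp, 0.3]);
OSCdef(\eeToModel, { |msg|
    var x = msg[1], y = msg[2], z = msg[3];    // normalized to 0.0..1.0 (assumed layout)
    synth.set(
        \lfoSpeed, x.linlin(0, 1, 0.1, 8),     // pulse-width modulation speed
        \rq,       y.linlin(0, 1, 1.0, 0.05),  // smaller rq = more resonant
        \ffreq,    z.linexp(0, 1, 100, 4000)   // filter cutoff in Hz
    );
}, '/robot/ee');
)
```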

Discussion of Results
In the following, we present the results of our explorations, structured according to our key research parameters (as outlined in Section 3). The overall conclusions are summarized in Section 5, outlining, for example, the overall design challenges of integrating single qualitative (non-mimetic) sonification approaches into a unified auditory presence.
Pitch. As we use a glissando change for our pitch explorations (E1, E2, and E3), the continuous change in pitch and the resulting non-musical frequency compositions are not pleasant to listen to and, in fact, leave a fatiguing impression. Remarkable moments occur when the oscillators meet at points where a harmonic interpretation is possible, creating a moment of relaxation in the listening experience. Individual changes in pitch are clearly recognizable, even with up to seven synths playing. Here, the association between single parameters and movement axes becomes weaker with an increasing parameter count, shifting the experience from a "detuned chord" toward a "morphing soundscape." Running the exploration repeatedly, it becomes possible to identify specific poses of the robot, based on the time sequence, and thus to follow the manipulation task blindly, only through listening.
Volume. In contrast to our pitch-related explorations, we have full control over the oscillators' pitch, fading only the volume up and down. Since we choose chords or notes at harmonically appropriate intervals, the listening experience is very familiar and pleasant. The changes in volume directly result in changing the sound space in all explorations, but remain quite subtle. In turn, as the number of parameters increases, the individual changes are no longer audible. This occurs particularly in situations where the robot is moving. When the robot is not moving, we can identify regions or spaces that we remember and estimate the position of the robot. When a single parameter is isolated (for example, as explored in E10), the volume is a precise, meaningful sound (and manipulation) parameter.

Timbre. Our TimbreSynth creates a dense sound even with only one active synth and is therefore perceived as a soundscape rather than a single instrument. The listening experience is pleasant, even with up to seven synths playing in parallel, creating a sound shifting toward a noisy composition. The changes are audible during motion, but the axes of motion are difficult to identify or interpret. When the robot is not moving, we can identify specific regions in the soundscape, for example, the position when handing over the cube. In turn, we achieve a higher mapping transparency by reducing the number of synths and mapping different data properties to different parameters of the same synth (as provided in E11).
Overall, we principally succeeded in generating (and using) non-mimetic sonification in a human-machine assembly scenario. As such, we were able to systematically analyze individual data and ranges of key sound parameters, and their influence on the specific machinic and auditory presence. Explorations such as E10 question the mimetic complexity of other existing sonification approaches (as outlined in Section 2), showing the general potential of mapping and manipulating individual, yet fundamental, sound parameters (for example, pitch, volume, timbre) in human-robot interaction scenarios. However, since our mapping relies on a number of different synths, adding another synth, and thus further expanding auditory complexity, could potentially lead to indistinguishable configurations or results. In this, our explorations contain a considerably large number of variables, and so does our mapping. The outcome of the explorations is therefore limited in terms of objective comparability. At the same time, we are aware that the provided scenario and setup (including end effector and materials, polling rate, computation time, and information transfer) considerably affect the auditory immediacy and immersion of our explorations, as well as the human-robot interaction. Ultimately, describing non-mimetic sonification using characteristics such as pitch, volume, and timbre does not necessarily match the (individually) perceived sound. Changes of one parameter (e.g., pitch) can significantly affect other parameters (e.g., amplitude) [12] or even the overall perceptual experience.

CONCLUSIONS AND FUTURE WORK
In conclusion, non-mimetic sonification of automated machinery, in which external information is retrieved, reconfigured, and represented, extends the traditional spectrum of human-robot interaction and, as such, creates new avenues for human-machine collaboration in using industrial manipulators. It fosters a substantially greater level of auditory presence, moving beyond traditional concepts of visual or auditory representation and interaction, while outlining new (alterity) relations among humans, technology, and the world. We thus tie in with philosophical theories, such as Don Ihde's, and with design principles of human-robot interaction. On that scope, the presented research provides a first proof of concept, outlining essential research parameters, components, and attributes, and demonstrates the general ability of using non-mimetic sonification for robotic material manipulation. However, like other auditory experiments and respective auditory displays, the presented approach is subject to the general ambiguity of sound perception, and additional, more refined, design and validation methods have to be developed and tested.

At the same time, we see a loss of detail within the listening experience when increasing the number of synths and assigning a multitude of parameters. In future experiments and explorations, robot parameters must represent critical variables that are specifically important for a certain interaction task and/or a certain auditory experience. Other considerations relate to a model-based approach, such as working with a more complex synthesizer that allows for more profound timbral changes. Based on our observations, we suspect that a combined parameter approach might prove useful in future explorations. Against this background, we also see a major difference in designing for a moving or a non-moving machine state, which ties into issues of task- and interaction-specific sonification. Also, all our explorations follow a linear mapping, which could be useful to change when it comes to more specific tasks. To be able to pursue future research, the question arises as to how the described auditory experiences could be systematically validated and contextualized. Ultimately, for non-mimetic sonification to succeed in real-world scenarios, further strategies of integrating different parameter sets and sound effects into a unified, and in this sense holistic, auditory experience need to be explored.

Fig. 3. Collaborative assembly task featuring an industrial seven-axis manipulator and a human collaborator. Photo composition of intermediate steps.

Fig. 4. Location of the explorations in the sonification design space map.

Fig. 8. Speed - Volume.

Table 1. Explorations 1-11 (E1-E11): all explorations are listed with the type and number of synths used and the synth parameters changed by our sonification.