WAVE: Anticipatory Movement Visualization for VR Dancing

Dance games are one of the most popular game genres in Virtual Reality (VR), and active dance communities have emerged on social VR platforms such as VRChat. However, effective instruction of dancing in VR or through other computerized means remains an unsolved human-computer interaction problem. Existing approaches either only instruct movements partially, abstracting away nuances, or require learning and memorizing symbolic notation. In contrast, we investigate how realistic, full-body movements designed by a professional choreographer can be instructed on the fly, without prior learning or memorization. Towards this end, we describe the design and evaluation of WAVE, a novel anticipatory movement visualization technique where the user joins a group of dancers performing the choreography with different time offsets, similar to


INTRODUCTION
Dance and rhythm games have emerged as one of the most popular genres in consumer Virtual Reality (VR). A prime example is Beat Saber [14], the best-selling VR title of all time [5], in which the player wields dual lightsabers to slice targets in rhythm. Dance has also organically emerged on social VR platforms such as VRChat [48] in various user-driven communities, where people enjoy social dancing and performance [34]. More generally, dancing has multiple health benefits [10,42] and is studied across multiple research domains. In Human-Computer Interaction, dance research ranges from interactive performance technology to improvisation tools, multisensory perception, and movement quality analysis [53].
Despite the success of VR dancing, a fundamental problem remains: dancing can be difficult, and it is challenging to instruct dancing in VR without an expert human teacher. In this paper, we propose and evaluate a new solution to this problem, focusing on a setting where the user's goal is to follow a predefined choreography with predefined timing. This goal is common in both dance games and dance classes organized in VR or in real life. Note, however, that our goal is not to provide practice for a performance happening later. Instead, we aim to communicate and instruct dance on-the-fly, in a way that allows both dancers and non-dancers to simply start dancing and follow a choreography they have never seen before, enjoying the music and the flow of movement. Achieving this goal could make dancing more approachable, especially to non-dancers, and provide new opportunities for dance game design.
The design challenge is how to present body movement trajectories in a way the user can easily understand and follow. Miller's work on dance games [30] describes the problem as: "we don't currently have a mechanism for streaming kinesthetic data into the human proprioceptive system in the same way that we stream audio and visual content" (p. 103).
Expecting a user to simply copy a model's movements is not effective, as the user's reaction time is limited and the human body has considerable inertia, an issue that Miller calls kinesthetic lag [30, p. 121]. Rather, skilled motor control generally requires some way of anticipating the required movements ahead of time [2,40,49].
Prior work typically uses some symbolic abstraction of the movement that allows for an anticipatory timeline presentation, e.g., as the timeline of arrows indicating dance pad footstep sequences in Dance Dance Revolution [22] and the sparse key poses of Just Dance [46]. In VR, the same has been implemented as the timeline of footstep targets of Dance Dash [38] and the sword swing targets of Beat Saber [14]. In some cases, such symbolic presentation is augmented with a model dancer demonstrating full-body movements for the player to mimic (e.g., Just Dance and Dance Central [43]), but as discussed above, this is of little help in guiding users not already familiar with the choreography. To instruct realistic choreography created by professional dancers [1], Dance Central and Dance Central VR [44] additionally include a practice mode that teaches the player to read the symbolic notation and execute the signified movements. However, dance game players may find such practice modes tedious or prefer not to switch modes during a social dance game session [30]. What is missing in dance games and more generally in VR dancing is a way to instruct realistic, non-abstracted choreography in real time, on-the-fly, without a separate practice mode.
Contribution: We propose and evaluate WAVE, a new solution to the on-the-fly dance instruction problem. WAVE is a novel anticipatory movement visualization technique, illustrated in Figure 1. The core idea is that the user becomes part of a crowd of virtual dancers performing the instructed choreography with different time offsets, similar to spectators making waves in sports events. In dance pedagogy terms, we use the choreographic device of "canon" [16] to enhance the mimetic method of learners copying a teacher's movements, also known as the "see and do" approach [35]. Our evaluation data (N = 36) indicates that WAVE allows users to anticipate movements propagated through multiple virtual dancers, improving users' accuracy in following the choreography in comparison to following a single model dancer. To help others experiment with and extend WAVE, the source code and Unity 3D project are published at https://github.com/CarouselDancing/WAVE.

BACKGROUND AND RELATED WORK
Today, there are multiple different solutions for instructing movement and dance in virtual environments [8]. Below, we review both dance games and non-game dance learning applications, dividing the discussion into non-VR and VR approaches. For a broader overview of HCI in the context of dance performance and related creative processes, we refer the reader to the review by Zhou et al. [53].

Non-VR Dance Instruction and Visualization
A popular approach for using computers to instruct dance is to adapt the mimetic method, an established dance-teaching approach [35]. Most applications using the mimetic method focus on teaching individual moves rather than whole choreographic pieces. Here, we have focused on studies using technologies similar to those available in Oculus Quest 2, the platform used in our application. These technologies include sound, visuals, and motion tracking of the user's hands and head.
Chan et al. [7] use a motion capture suit for dance training. The student receives three types of feedback. Firstly, the user's pose, captured by the motion capture suit, is shown in real time next to an animated 3D model performing the desired movements. Secondly, a report displays the joints in which the player's movements were incorrect. Lastly, a slow-motion replay allows the user to review their performance.
When focused on teaching specific genres of dance, rather than dance movement generally, studies have tended to follow a similar pattern [18,19,51]. Teachers' movements are recorded using motion capture, users try to copy those movements, and then the system evaluates their performance. Some studies of this kind have had promising results using Microsoft Kinect for real-time evaluation of full-body movement [4,37] and gamified movement instruction using Labanotation [36].
The teaching approaches above rely heavily on information the user gets after performing; in effect, it is assumed that the user will learn gradually through multiple repetitions. In contrast, we strive to provide foresight of the desired movements so that players have a possibility of succeeding on the first try.
Optimizing instruction for the first try is also what commercial dance games appear to aim at. This is reasonable because optimizing the first-time user experience is of high importance in games [29,33] and players have been found eager to skip tutorials [9]. However, most games simplify the instruction problem by specifying choreography only partially, abstracting away nuances. For instance, Dance Dance Revolution's [22] arrows only specify footsteps and the key poses of Just Dance [46] do not indicate how to exactly transition between them. While there is evidence that dance games can teach dance skills [27], instructing realistic and nuanced dancing has remained non-trivial, requiring added complexity like the separate practice mode of Dance Central [43].

Dance Instruction in VR
VR has proven to be useful for instructing dance. For example, hip-hop students appreciated the way VR dance materials simplified movements and made them clear and easy to follow [47]. Similarly, Eaves et al. [12] found that information provided to users should not be too detailed. Feedback based on only four tracked joints worked better than twelve, in that users were unable to extract the relevant information when they were presented with too much data. There is also work on how to identify correct dance poses, as in the approach of Kyan et al. [23] to ballet dance training. Some research indicates that learning dance with a partner can be beneficial, elevating users' interest in learning dance [51] and improving performance [19]. It is no surprise that virtual dance partners are used in multiple VR dance studies. Kirakosian et al. [21] had a user leading a virtual partner in pair dance that was responsive to the user's movement. The study did not measure how effective the method was for learning, but the users rated their enjoyment as high and most of them anticipated being more confident to lead someone in real life. Senecal et al. [41] similarly used virtual partners for salsa dance instruction and found that the movement patterns of users without prior dance experience became more similar to those of users with dance experience after using their system. They measured movement patterns using a number of features, including several specifically designed to capture core technical elements of salsa. Studies have also explored using multiple virtual model dancers to support dance instruction, as in the work of Kico et al. [20]. Here, we extend their work by having the virtual model dancers perform in canon instead of unison to provide anticipation of the next movements.

DESIGN
The WAVE prototype evaluated in this study has three lines of dancers, including dancers positioned to the left and right of the user, as shown in Figure 1. Naturally, this is only one of many possible configurations of dancers, and Figure 2 shows alternatives tested during development (see Section 3.4). Below, we explain our design process.

Problem Definition
Based on our review of related work, its limitations, and the problem areas discussed in the introduction, we defined two key requirements:
(1) The system can instruct choreography to the same level of full-body detail as can be instructed outside of VR (i.e., in naturalistic dance settings like the studio, stage or street), instead of relying on symbolic and/or abstracted dance notation.
(2) The user can follow the instructions on-the-fly, instead of having to first engage in a separate learning or memorization phase.

Design Principles
We derived design principles to help us satisfy the above requirements. Regarding the first requirement, we hypothesized that we should focus on instructing movements through demonstration. Demonstration is prevalent in dance teaching and even most non-dancers have engaged with mimicking demonstrated movements at least occasionally, e.g., during childhood. From this point of view, it is natural to focus on using the moving body to instruct the moving body, i.e., using animated dancer characters as a core visualization element. With earlier screen-based systems, choreography was typically limited by the user needing to face forward to see the screen. However, dance choreography generally involves moving and facing in multiple directions, which called for placing model dancers in multiple positions, rather than only immediately in front of the user.
To meet the second requirement, the user should be provided with a capability to anticipate/predict the upcoming movements. Executing the movement and timing demonstrated by a model in real time is impossible; human reaction time is limited and the body has considerable inertia, so movements need to be planned and initiated ahead of time.
These design principles quite naturally lead to the core WAVE design idea of multiple dancers performing at different time offsets, which provides access to full-fidelity demonstrations of complex full-body movements with enough time for users to anticipate and then execute those movements at the target times.
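The time-offset idea can be illustrated with a minimal sketch. It assumes the choreography is stored as timestamped keyframes (here, flat lists of floats) and that each row of dancers further from the user plays the same animation a fixed delay ahead. The function names, the data layout, and the linear interpolation are illustrative assumptions, not taken from the published Unity project.

```python
from bisect import bisect_right

def sample_pose(times, poses, t):
    """Linearly interpolate a pose (list of floats) at time t, clamping at both ends."""
    if t <= times[0]:
        return list(poses[0])
    if t >= times[-1]:
        return list(poses[-1])
    i = bisect_right(times, t) - 1          # last keyframe at or before t
    w = (t - times[i]) / (times[i + 1] - times[i])
    return [a + w * (b - a) for a, b in zip(poses[i], poses[i + 1])]

def canon_poses(times, poses, user_time, n_rows, row_delay):
    """Return one pose per row of dancers.

    Row 0 shows the user's own target timing; row k runs k * row_delay
    seconds ahead, so rows further away preview later movements.
    (Hypothetical layout; row_delay is a free parameter.)
    """
    return [sample_pose(times, poses, user_time + k * row_delay)
            for k in range(n_rows)]
```

For example, `canon_poses(times, poses, user_time=0.5, n_rows=3, row_delay=0.7)` samples the same choreography at 0.5 s, 1.2 s, and 1.9 s, so the user sees each movement performed by the more distant rows before it is their turn.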

Dance Style and Content
Our study used an 84-second contemporary dance choreography, designed for beginners. The choreography was designed for us by a professional contemporary dance teacher with over 20 years of experience teaching students of different levels and creating choreography for them. We chose contemporary dance as our focus, as it is relatively underexplored in dance games, compared to styles like hip-hop or other forms of street dance. The choreography was recorded using an Xsens motion capture system [32].

Formations of Dancers
Aiming to find an appropriate formation for the virtual dancers, we ran an exploratory informal study seeking feedback on potential variant formations. In collaboration with our choreographer, we designed the five potential WAVE formations in Figure 2.
While our goal was to create something enjoyable for both dancers and non-dancers, we specifically chose to seek design-phase feedback from experienced contemporary dancers. Our reasoning was that these dancers might be able to articulate their feedback better than novices, grounded on their experience of the dance style and being instructed in it.
We visited a contemporary dance class of 10 dance students (7 women, 3 men) with a median of 15.5 years of practice. We focused on subjective differences between the visualization variants. Every participant tested the variants in a prescribed order, at their own pace, using two different choreographies. By pressing a button, they could loop through the variants as many times as they felt they needed to explain their thoughts and rank the variants.
We aimed to select the best variant among the options, based on the rankings, justifications, and positive and negative remarks made while testing. However, none of the variants was clearly better than the others; the participants had different and often opposing opinions on which variants were easiest to follow.
We considered the formation using straight parallel lines most promising for two main reasons. First, with straight parallel lines to the left and right, dancers could see the upcoming moves, even when turned sideways. Second, with the formations using curved lines, some dancers noticed themselves accidentally following movements too early, possibly because the far-future dancers are more directly visible. However, we cannot make strong claims about the superiority of the chosen formation based on our data, and it may be worthwhile exploring other formations in future work.
Note that our choreographies have the user mostly facing forward and only occasionally turning around and sideways. We do not expect our chosen formation of virtual dancers to be effective for choreography in which the dancer turns to face the back; future work will need to address this limitation, perhaps having a wave coming towards the user from each direction. However, even our present design is more flexible than traditional dance visualizations requiring the user to face a screen.

EVALUATION
We conducted a quantitative evaluation (N = 36) of our WAVE prototype, comparing against a baseline visualization with a single model dancer showing the movements in real time. The two compared visualizations are shown in Figure 3.

Study design
We used a within-subjects design with two experimental conditions (WAVE and baseline), with the visualization type as the single categorical independent variable. Each participant danced the same 84-second choreography twice, i.e., once in each condition. The order of experimental conditions was counterbalanced to mitigate the inevitable order effect caused by the participants remembering at least parts of the choreography.

Hypotheses
We tested two hypotheses about the suitability of the proposed WAVE visualization technique for instructing dance using VR:
H1: WAVE allows players to perform choreography more accurately than the baseline. As discussed above, following choreographed movements requires the user to be able to anticipate upcoming movements, which WAVE is designed to facilitate.
H2: WAVE elicits higher subjective assessment of being able to perform the choreography correctly.

Sample Size
Since our hypotheses are directional, we used single-tailed tests. A priori power analysis using G*Power 3 [13] was used to determine the total sample size necessary. For single-tailed paired-samples t-tests, a sample of 27 participants is required to detect a medium effect size (Cohen's d = .50) with type I error rate 0.05 and 80% power.
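As a rough plausibility check of this sample size, the following sketch uses a closed-form normal approximation rather than the exact noncentral-t computation that G*Power performs. The function name is hypothetical, and the added term `z_alpha**2 / 2` is a commonly used small-sample correction for one-sided t-tests, included here as an assumption about how to approximate the exact result.

```python
from math import ceil
from statistics import NormalDist

def approx_paired_t_sample_size(d, alpha=0.05, power=0.80):
    """Approximate n for a one-tailed paired-samples t-test.

    Returns (n_normal, n_corrected): the plain normal approximation and
    the same value with a small-sample correction term added.
    This is an approximation, not G*Power's exact algorithm.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # one-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    n_normal = ((z_alpha + z_beta) / d) ** 2
    n_corrected = n_normal + z_alpha ** 2 / 2
    return ceil(n_normal), ceil(n_corrected)

print(approx_paired_t_sample_size(0.5))  # (25, 27)
```

The uncorrected normal approximation slightly underestimates the requirement, while the corrected value agrees with the 27 participants reported above.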

Participants
36 adult volunteers were recruited among the students and staff of Aalto University, using social media and by having a testing stand on campus. 17 participants were men, 18 women, and 1 preferred to not specify their gender. Mean participant age was 26 (SD = 5.5, min 20, max 43). The participants were somewhat experienced with VR (29 had tried VR before and 5 owned a VR device of their own).
The participants were required to be comfortable with light exercise and to believe they had sufficient vision for using the headset. The choreography was designed for dancers without any movement disabilities. Only a few participants had dance experience (19 had no experience, 11 had less than 5 years of experience, and 6 had 5 or more years of experience) or experience playing dance games (20 had 0 hours of experience, 9 had less than 15 hours of experience, and 7 had 15 or more hours of experience).

Procedure
The participant first filled in the demographics questionnaire on a laptop. On the same laptop, the participant was then presented with the instruction: "In this study, your goal is to dance the same choreography with two visualizations." The participant was then shown two video clips, each demonstrating one condition of the experiment. These clips were shown in the order the participant would perform them. Similar to the view shown in Figure 1, the clips presented each condition in third-person with a human model demonstrating how the user should perform. Each clip showed approximately 10 seconds of the choreography that the user would then perform in the next phases of the experiment. For the baseline, the clip was shown with "Try to mimic the model dancer in front of you." For the WAVE condition, the clip was shown with "You join a queue of dancers 'making waves'. Try to move when it's your turn at the end of the queue." After viewing the clips, the user was prompted with "Did you understand how the visualizations are different? (Ask if you have any questions.)" All participants agreed. After the video instructions, the participants put on the VR headset.
During the experiment, the facilitator watched the user's view on the laptop, allowing the facilitator to help the user get into position, if needed. The VR software prompted the participant to input an ID provided by the facilitator (this ID was not input by the facilitator to avoid having to switch the headset between persons, for hygienic reasons). The participant was then asked to calibrate their height by standing straight and clicking on a virtual button; the height was used to scale the virtual dancers to make the visualizations more appropriate for each participant's body. The system displayed a text instruction to click a "Done" virtual button after completing the calibration.
The participant then danced in both experimental conditions. At the start of each condition, the system instructed the participant to move to a marked position. Once the user was in the correct position, the system prompted the user to click a virtual button to start the choreography. After performing the choreography, the participant filled in the per-condition questionnaire.
After completing both experimental conditions, the participant removed the VR headset and filled in the final questionnaire.

Data Collection
The following data was collected:
• Demographics: age, gender, VR experience (has used before? owns a device?), dance experience (years of practice?), experience with dance games (total estimated hours played? which games?).
• During dancing: the rotation and translation of the player's head and hands for each game frame were tracked using the VR headset and hand trackers. This data was collected to allow for quantitative comparison between the player's movements and the desired choreography (see Section 4.7).
• At the end of each experimental condition: Users were instructed to indicate how they felt about their performance using two sliders: "I felt I was able to perform the choreography correctly" and "I felt I was able to time my movements correctly". The sliders used a range from 0% to 100% and the order of the two items was randomized for each participant.
• Final questionnaire: The participants were asked which of the two game versions was their favourite and to give justification for their choice. They were also asked for any additional comments or feedback.
Our primary interest in this study was to test whether anticipatory visualizations support users in accurately following the model choreography. While building the prototype, we observed that slow movements are relatively easy to follow, even without extra visual aids. The first part of the choreography used in this experiment only included slow movements, which are less appropriate for testing our hypothesis. Further, first-time users may need time to get used to the visualization and position themselves. For these reasons, we excluded the very slow start of the choreography from our analyses. Specifically, we excluded the first 37 seconds, after which the movements become faster and more challenging to follow (the authors' subjective assessment). The first included movement is when one quickly bends down and then lifts their arms up (video figure at 42s). After this exclusion, 47 seconds of data remained for each participant.

Methods: How to Measure Movement Accuracy?
Our goal was for users' movements to accurately reflect the provided choreography, so we considered high error between the target movement and the user's actual movement as indicating low accuracy. We measured error in two different ways:
• Position-based movement error, defined as the mean Euclidean distance in meters between the tracked head and hand positions and their choreographed target positions, measured every frame. In the WAVE condition, the user's goal is to move as the last dancer of the middle line, as shown in Figure 1 (communicated to the user as described in Section 4.5). Thus, the target timing corresponds to the dancers on the user's left and right. In the baseline condition, the target timing corresponds to that of the single model dancer.
• Direction-based movement error, defined as the mean cosine distance between tracked and choreographed body-part velocity vectors, measured every frame, and scaled to the range [0,1], where 0 indicates the user moves perfectly in the correct direction, and 1 indicates the opposite direction. This error measure focuses on the "gist" of the choreography and does not penalize doing smaller or larger movements than instructed, which we consider desirable to accommodate users with different movement skills and abilities.
4.8.4 Preferred Visualization. In the final questionnaire, participants were asked which approach they prefer. 20 participants preferred the WAVE approach while 16 preferred the baseline. We observed a clear order effect: 78% of the users preferred the approach they tested later.
Figure 5: Mean and standard deviation of movement direction error when shifting the reference choreography in time. In the baseline condition, the error is minimized at a shift of approximately 0.5 seconds, i.e., the participants performed the choreography half a second late, on average. In the WAVE condition, the participants moved slightly ahead of the ideal time.
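The position-based and direction-based movement error measures defined in this section can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes tracks are stored as `[frame][body_part] -> (x, y, z)` in meters, that velocities are estimated by finite differences with a fixed frame time `dt`, and that near-stationary frames (speed below a hypothetical threshold `eps`) are skipped because a direction is undefined for them.

```python
from math import sqrt

def position_error(user, target):
    """Mean Euclidean distance (meters) over all frames and tracked body parts."""
    dists = [sqrt(sum((u - t) ** 2 for u, t in zip(up, tp)))
             for uf, tf in zip(user, target)
             for up, tp in zip(uf, tf)]
    return sum(dists) / len(dists)

def velocities(track, dt):
    """Finite-difference velocity of every body part between consecutive frames."""
    return [[tuple((b - a) / dt for a, b in zip(p0, p1))
             for p0, p1 in zip(f0, f1)]
            for f0, f1 in zip(track, track[1:])]

def direction_error(user, target, dt, eps=1e-6):
    """Mean cosine distance between velocity vectors, scaled to [0, 1].

    0 means moving in exactly the instructed direction, 1 the opposite.
    Pairs where either speed is below eps are skipped (an assumption).
    """
    errs = []
    for uf, tf in zip(velocities(user, dt), velocities(target, dt)):
        for uv, tv in zip(uf, tf):
            nu = sqrt(sum(c * c for c in uv))
            nt = sqrt(sum(c * c for c in tv))
            if nu < eps or nt < eps:
                continue
            cos = sum(a * b for a, b in zip(uv, tv)) / (nu * nt)
            cos = max(-1.0, min(1.0, cos))   # guard against rounding
            errs.append((1.0 - cos) / 2.0)
    if not errs:
        raise ValueError("no frames with measurable movement")
    return sum(errs) / len(errs)
```

Because the cosine distance normalizes both velocity vectors, a user who performs a movement in the right direction but with a smaller or larger amplitude receives the same direction error, which matches the stated intent of the measure.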

Summary of Results
Our results suggest that WAVE provides a potentially useful visualization approach for VR dance designers. Supporting H1, both the position-based and direction-based movement error analyses indicate that users can match the choreography better when using WAVE than when using the baseline visualization (Figure 4). The effect is small for position-based movement error, but large for direction-based movement error. The majority of participants (20) also preferred WAVE over the baseline. The subjective performance ratings are inconclusive, however, providing no support for H2. This should be investigated in future work, although it may be that the subjective data is simply more noisy than the objective movement-based measures.

Dancing Ahead of Time
Fig. 5 clearly shows that users are late in following choreography with the baseline visualization, as expected. More surprisingly, with WAVE, the users perform the choreography slightly ahead of time.
We hypothesize two explanations for this. First, in both the user study and the initial testing of different dancer configurations, we noticed that users occasionally tried to follow the "future" dancers instead of the dancers closest to them, which affects Fig. 5 to some degree. We hypothesize that this is an artefact of the user study focusing on first-time use; in our own experience, one may at first instinctively copy the "future" dancers when they make larger and faster movements that steal one's attention.
Second, it may be that at least some users synchronize their movements with the dancer directly in front of them, instead of the dancers to the left and right, which only become the focus of attention when the choreography requires one to turn sideways. We did not explicitly ask our participants to synchronize with the dancers to the left and right, or to add a small delay in relation to the dancer in front of them. In the tested WAVE version, the correct delay would be 0.7 seconds.
To address the above, future work might remove the delay between the last two rows of dancers in the WAVE formation. This way, the users could be simply instructed to copy the movements of the dancers closest to them, without an additional delay. The delay and spacing between the dancers might also be provided as user-adjustable parameters, and one might use lighting and transparency to ensure that the user mostly pays attention to the closest dancers.

Designing Dance Games Based on WAVE
We contribute a novel anticipatory visualization technique for full-body movements, paving the way towards dance games that include a broader range of movements and allow for choreography that changes direction and incorporates complex body trajectories. Here, it should be noted that different choreographies and movements might require different formations of virtual dancers, e.g., in all directions around the user.
To extend our prototype towards a full game, one would need to add clear real-time feedback about how well the user is dancing.
To achieve this, it should be possible to incorporate common dance game elements such as verbal guidance and encouragement and visual indicators for dancing accuracy. For accurate scoring and guidance, one would ideally need to track the user's full body, which could be done using current VR hardware such as HTC Vive trackers or emerging solutions that may require less complex hardware [3,50,52]. As a limitation, WAVE occupies a large virtual space and thus places a constraint on the spatial design, but the same can be said of more traditional timeline visualizations like the one used by Beat Saber.
Making users feel competent is important to facilitate enjoyment and intrinsic motivation in both physical activity and games [6,31,39,45]. In addition to providing encouraging feedback, another way to support competence could be by manipulating the user's perception of their own movements so that they appear more capable, e.g., through exaggerated jump height and flexibility [15,17,28]. In our system, the user does not have a visual avatar except for small indicators of their current hand positions. In future work, an avatar could be visible in a mirror, which would reflect the real-life experience of many dance studios.

Wider Applicability
Presently, WAVE is designed for a single user. However, we could imagine applying WAVE in a setting like social VR, allowing dancers to emit their movements as waves that other users can try to follow. This might also mitigate the latency problems inherent in social VR dancing, for example, by matching the wave propagation time between two users to one musical bar, so that even though the "follower" is delayed with respect to the "leader", the movements of both would feel right with the music. Beyond dancing, our approach could potentially be used to instruct other complex movements such as Tai Chi.
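The bar-matching idea can be made concrete with a small calculation. The sketch below assumes a constant tempo and a fixed number of intermediate dancers between leader and follower; the function names and the even split of the delay across hops are hypothetical choices made for illustration.

```python
def bar_duration(bpm, beats_per_bar=4):
    """Length of one musical bar in seconds at a constant tempo."""
    return 60.0 / bpm * beats_per_bar

def per_hop_delay(bpm, n_hops, beats_per_bar=4):
    """Delay between adjacent dancers so the wave takes exactly one bar
    to travel from the leader to the follower (n_hops steps apart)."""
    if n_hops < 1:
        raise ValueError("n_hops must be at least 1")
    return bar_duration(bpm, beats_per_bar) / n_hops
```

At 120 BPM in 4/4 time, one bar lasts 2.0 seconds, so a wave passing through four hops would use a 0.5-second delay per hop. Under this scheme, network latency would only need to stay below the bar duration for the follower's view to remain aligned with the music.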

Methodological Limitations
We acknowledge that our choice of baseline only allows us to conclude that the WAVE visualization helps in timing and performing movements compared to not using any assistive visualizations at all. It does not allow determining whether WAVE is better than some other visualization technique. Nevertheless, our experiment provides evidence that WAVE works and is worth considering if in need of an anticipatory dance visualization approach.
With only one model avatar in the baseline condition, one cannot be sure how much the results are due to observing the upcoming movements in canon as opposed to having multiple avatars to observe. We believe the latter to be of minor importance, as movement science has firmly established the limits of human reaction time and the importance of anticipating required movements ahead of time in skilled motor performance [2,40,49]. Without some form of anticipatory display, it is not even theoretically possible to follow an unfamiliar choreography.
Both our WAVE and baseline visualizations might also work if displayed on a screen outside VR. Testing this is left as future work, as adding such conditions to our experiment would require an additional motion tracking setup and also risk participants becoming fatigued. Furthermore, parts of our choreography require the user to face sideways, making it cumbersome to look at a single screen or requiring a complex setup with multiple screens. The freedom of facing direction motivates our focus on VR. A benefit of screens is that they typically cause less motion sickness than VR. However, although we did not specifically measure motion sickness, our participants did not express experiencing it, and our setup avoids things known to cause a visual-vestibular conflict such as virtual movement or camera manipulations [11,25,26].
We also tested WAVE with only one choreography, in one specific style of dance. In our own opinion, WAVE works best for relatively slow and continuous movements, whereas the fastest parts of our choreography feel less easy to follow. Hence, it may be that WAVE does not work for some other dance styles, though we hypothesize that careful timing of the wave propagation may support faster movements and should be explored in future work.

CONCLUSION
We have proposed and evaluated WAVE, a new VR movement visualization technique aimed at solving the on-the-fly dance instruction problem. We build on a metaphor of the user being part of a crowd making a wave in a sports event: we use multiple model dancers with different time offsets, allowing the player to both mimic the movements of a model dancer close to them and anticipate future movements through seeing other dancers perform those movements ahead of time. To minimize visual occlusion and allow the use of peripheral vision, we render multiple lines of dancers at different locations.
Our study comparing WAVE against a baseline (N = 36) provided evidence that WAVE helps users anticipate upcoming movements and perform choreography more accurately, particularly in terms of more closely matching the velocities of the head and hands as choreographed (e.g., direction-based movement error in Section 4.7). In future work, it should also be possible to extend WAVE to multi-user social VR dancing, e.g., by allowing dancers to emit their own movements as waves for other dancers to follow.

Figure 2: Dancer configurations we tested during the project. The screenshots are captured from a Meta Quest 2 VR headset. A) The final configuration with 3 lines, B) Radial lines of dancers, C) Radial lines with extra far-away dancers for less occlusion of future movements, D,E) Curved lines.

Figure 3: Screenshots of the two visualizations compared in the user study, taken from the user's perspective using an Oculus Quest 2 VR headset. A) WAVE with three lines of dancers. B) Baseline with a single model dancer whose movements the player should copy.

4.8.5 Additional analyses. The effect of WAVE on anticipating upcoming movements is visualized in Figure 5. The figure shows how the direction-based movement error changes when the choreography is shifted in time. With the baseline condition, error is minimized with a shift of 0.5 seconds, indicating that the users follow the choreography 0.5 seconds late, on average. With WAVE, users perform slightly ahead of the target time, on average.
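A time-shift analysis of this kind can be sketched as follows, under stated assumptions: velocities are given as one 3D vector per frame for a single body part, shifts are whole frames, and pairs with near-zero speed are skipped. The function names and the sign convention (positive shift means the user is late) are illustrative choices, not the authors' implementation.

```python
from math import sqrt

def direction_error(user_vel, ref_vel, eps=1e-6):
    """Mean cosine distance between paired velocity vectors, scaled to [0, 1].
    Returns inf when no pair has measurable movement, so that an empty
    overlap can never be selected as the best shift."""
    errs = []
    for uv, rv in zip(user_vel, ref_vel):
        nu = sqrt(sum(c * c for c in uv))
        nr = sqrt(sum(c * c for c in rv))
        if nu < eps or nr < eps:
            continue
        cos = sum(a * b for a, b in zip(uv, rv)) / (nu * nr)
        errs.append((1.0 - max(-1.0, min(1.0, cos))) / 2.0)
    return sum(errs) / len(errs) if errs else float("inf")

def error_by_shift(user_vel, ref_vel, max_shift):
    """Direction error for each frame shift s in [-max_shift, max_shift].

    For shift s, user frame i is compared with reference frame i - s,
    so a positive s means the user performs the reference s frames late.
    """
    result = {}
    for s in range(-max_shift, max_shift + 1):
        pairs = [(user_vel[i], ref_vel[i - s])
                 for i in range(len(user_vel))
                 if 0 <= i - s < len(ref_vel)]
        result[s] = direction_error([p[0] for p in pairs],
                                    [p[1] for p in pairs])
    return result

def best_shift_seconds(user_vel, ref_vel, max_shift, dt):
    """Shift (in seconds) that minimizes the direction error."""
    errors = error_by_shift(user_vel, ref_vel, max_shift)
    return min(errors, key=errors.get) * dt
```

With this convention, a user who lags the reference yields a positive best shift, and a user who anticipates the reference yields a negative one.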

Figure 4: Boxplots of (A) position-based movement error in meters and (B) direction-based movement error when compared to the reference choreography.