Ultrasonic Mid-Air Haptics on the Face: Effects of Lateral Modulation Frequency and Amplitude on Users’ Responses

Ultrasonic mid-air haptics (UMH) has emerged as a promising technology for facial haptic applications, offering the advantage of contactless and high-resolution feedback. Despite this, previous studies have fallen short in thoroughly investigating individuals’ responses to UMH on the face. To bridge this gap, this study compares UMH feedback on various facial sites using the lateral modulation (LM) method. This method allows us to explore the impact of two LM parameters, frequency and amplitude, on both perceptual (intensity) and emotional (valence and arousal) responses. With 24 participants, positive relationships between LM amplitude and both perceived intensity and arousal were observed, and the effect of LM frequency varied across facial sites. These findings not only contribute to the development of design guidelines and potential applications for UMH on the face, but also provide insights aimed at enhancing the effectiveness and overall user experience of haptic interactions across diverse facial sites.


INTRODUCTION
Haptic feedback is a widely recognized and intuitive feedback method in human-computer interaction (HCI). Over the past decades, numerous haptic devices have been developed to enhance user experience and improve task efficiency [20,41,63]. More recently, the advent of high-fidelity extended reality technology has ignited a fresh wave of exploration in haptic feedback research [68]. While earlier studies primarily focused on investigating haptic feedback for the hand, there has been a notable shift in current HCI research towards the face as a relatively unexplored haptic target.
The face holds promise as a haptic interface for several reasons. Firstly, the face is densely populated with mechanoreceptors and is highly sensitive to tactile sensations [9,61]. Secondly, the facial skin offers a sizable surface area, which, when coupled with the high resolution of ultrasonic mid-air haptics (UMH), allows for a diverse range of information to be conveyed [19]. Lastly, the face remains consistently visible and unobstructed, making it a practical interface for real-life applications such as UMH feedback in automotive user interfaces [28]. In fact, numerous interactive systems have explored the use of the face as a haptic interface across multiple domains, including virtual reality (VR) [13,33,51,84], augmented reality (AR) [37], and universal design (e.g., blind navigation) [24,29]. Despite obtaining positive results and promising implications for specific face haptic applications in HCI, a major limitation of previous face haptic designs has been the reliance on contact-based devices. These devices involve physical contact between the facial skin and relatively bulky and heavy haptic actuators (e.g., in Nakamura et al. [49], ten servo motors were used to provide haptic feedback on the cheek), rendering them impractical for real-life scenarios. The mechanical structures of contact-based haptic actuators constrain the spatial resolution, confining the interaction area to a small region (e.g., on the cheek [49,81]).
In this research, we sought to investigate the feasibility of utilizing UMH in the facial area as a contactless alternative for future face haptic research and design. Compared to other contactless approaches based on vortex rings [22,62], air jets [73], electric arcs [70], or lasers [31], UMH has emerged as a practical contactless technique for three reasons. Firstly, it can alter the user's perception of haptic feedback by adjusting the modulation parameters [17,44,50]. Secondly, it delivers haptic feedback with high spatial and temporal resolution. Lastly, it can modulate stimuli in a relatively large spatial area.
Unlike traditional contact-based haptic techniques that utilize vibrotactile tactors [8,33,78] or electrical stimulation [38,39,76], UMH employs focused ultrasound waves to stimulate the skin [56]. However, our current understanding of how humans perceive haptic feedback generated by ultrasound on the face is limited, presenting a challenge in the selection of appropriate modulation parameters for haptic researchers and designers. To address this gap, this study examines the effects of basic UMH modulation parameters on users' perception and emotional responses. Specifically, it investigates the independent effects of lateral modulation (LM) parameters on users' responses to UMH on the face, including perceived intensity, valence, and arousal. To achieve this, we engaged 24 participants, subjecting them to various combinations of LM frequencies and LM amplitudes across three distinct facial areas. The Absolute Magnitude Estimation method [23] was used to measure perceived intensity, while the Self-Assessment Manikin (SAM) [2] was used to measure valence and arousal.
Results showed a significant difference in perceived intensity between the glabrous skin (i.e., the lip) and hairy skin (i.e., cheek and nose) within the facial area. This suggests that the difference in UMH perception between glabrous and hairy skin, previously observed on the hand and arm, extends to facial skin. The emotional responses elicited from different facial areas reveal that the lip stimulates the most active responses, while the nose helps users relax. Furthermore, our findings indicate that a larger LM amplitude corresponds to stimuli with higher perceived intensity and arousal. We identified an optimal LM frequency of approximately 40Hz for hairy skin on the face and of around 70Hz for glabrous skin. These findings provide valuable insights, establishing a fundamental design guideline for incorporating UMH feedback on the face. Moreover, we propose several design concepts that aim to inspire the development of more innovative applications for face-haptic interactions.

RELATED WORK
2.1 Ultrasonic Mid-air Haptics Modulation Techniques
UMH utilizes arrays of ultrasound transducers to create a focal point by synchronizing the phase of each transducer. This focal point generates a pressure that induces movement in the propagation medium [71]. When the focal point targets the skin, the mechanoreceptors are stimulated, resulting in tactile sensations. However, the ultrasound that forms the focal point typically has a frequency of 40kHz, which exceeds the human vibrotactile perception range (approximately 5-500Hz [7]).
To enable perception by the human skin, modulation techniques are necessary for the focal points. There are three main modulation techniques to create perceivable UMH focal points. One technique is amplitude modulation (AM), which modulates the amplitude of the ultrasound waveform to correspond to the vibrotactile stimulation waveform [21]. AM is the conventional UMH presentation method and has been widely employed to create haptic feedback on the palm [25,82]. It renders different frequencies by varying the intensity of the ultrasound waves, while its focal point remains stationary (Figure 1(a)). Another technique is LM [74], which keeps the intensity constant and produces haptic feedback by quickly moving the focal point along a short trajectory on the skin (Figure 1(b)). The LM trajectory typically spans only several millimeters and is traversed at high speed, so it is perceived as a point of vibrotactile stimulation rather than as spatial movement [74]. On the same device, LM can outperform AM by making maximum use of the available power. The third technique commonly employed in UMH research is spatiotemporal modulation (STM) (Figure 1(c)), specifically designed to generate more intricate patterns [16]. STM achieves this by dynamically repositioning the focal point along two-dimensional trajectories while maintaining a consistent focal intensity, allowing for the creation of diverse shapes. In terms of rendering principles, STM bears a resemblance to LM. However, there exists a significant difference between the two techniques, primarily concerning the size of the focal trajectory. In the case of STM, the trajectory aligns with the rendered pattern itself, spanning up to 200mm in circle perimeter, whereas LM operates on a much smaller scale, typically within the range of several millimeters [16,18]. Given that this study centers around the perception of the focal point, which serves as the fundamental unit of UMH, the decision has been made to employ LM rather than STM.
LM has two major parameters that were found to affect users' responses to UMH: the LM frequency (the frequency of the cyclic movement of the focal point) [45] and the LM amplitude (the size of the focal trajectory) [74]. Different combinations of UMH frequency and amplitude influence users' perceived intensity of the stimuli as well as their emotional responses. Obrist et al. [50] investigated the effect of modulation frequencies and different locations on the hand in mediating the users' emotions. Ablart et al. [1] showed that a low frequency and large size of the UMH feedback result in a lower valence of emotional experiences.
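The two LM parameters can be made concrete with a small sketch of a circular focal-point trajectory. This is an illustration under stated assumptions, not the device's rendering code: the LM amplitude is treated as the diameter of the circle (the text only calls it "the size of the focal trajectory"), and the update rate, function names, and example values are hypothetical.

```python
import math


def lm_circle_points(center_xy, amplitude_mm, lm_freq_hz, update_rate_hz, duration_s):
    """Sample focal-point positions (in mm) along a circular LM trajectory.

    Assumption: amplitude_mm is interpreted as the circle diameter.
    update_rate_hz is a hypothetical rate at which the focal point is repositioned.
    """
    radius = amplitude_mm / 2.0
    n_samples = int(round(update_rate_hz * duration_s))
    cx, cy = center_xy
    points = []
    for i in range(n_samples):
        t = i / update_rate_hz
        # The focal point completes lm_freq_hz full cycles per second.
        phase = 2.0 * math.pi * lm_freq_hz * t
        points.append((cx + radius * math.cos(phase), cy + radius * math.sin(phase)))
    return points


def focal_speed_mm_per_s(amplitude_mm, lm_freq_hz):
    """Speed of the focal point: circumference (pi * diameter) times cycles per second."""
    return math.pi * amplitude_mm * lm_freq_hz


if __name__ == "__main__":
    pts = lm_circle_points((0.0, 0.0), amplitude_mm=6.0, lm_freq_hz=40.0,
                           update_rate_hz=4000.0, duration_s=1.0)
    print(len(pts), "samples; first point:", pts[0])
    print("focal speed (mm/s):", focal_speed_mm_per_s(6.0, 40.0))
```

Under this reading, raising either the LM frequency or the LM amplitude increases the speed at which the focal point travels, which is one reason the two parameters are worth studying jointly.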

UMH on the Face
Previous studies on UMH have primarily focused on the glabrous skin of the hand and have yielded numerous research findings regarding UMH design for hand-based interactions [34,36,52,60,83]. However, the hand's practicality for delivering "passive" content, such as presenting alarming information or providing informative content, is limited as it is often occupied with other tasks [28,29]. Consequently, this study sets out to explore the face as a promising alternative to the hand for receiving UMH feedback. While previous research has extensively examined the human response to various UMH parameters on the hand, it is important to note that the conclusions derived from hand studies may not be directly applicable to the facial area. This discrepancy arises primarily from the substantial differences in skin characteristics between the face and the hand. Recognizing and addressing these distinctions is imperative to understand how UMH interacts with facial skin, emphasizing the need for dedicated exploration in this context.
The facial area contains four types of mechanoreceptors: Ruffini corpuscles, Meissner corpuscles, Merkel cell disks, and hair receptors [69], while the Pacinian corpuscles (PC), the major mechanoreceptors responsible for vibration detection (∼250Hz) in the hand, are absent in facial skin [30]. Therefore, the optimal UMH frequency for the face is likely to differ from that of the hand. UMH feedback on the face normally targets Meissner corpuscles, which are more responsive to lower vibration frequencies. The consequences of improper selection of modulation parameters are evident in a reduced perception of haptic cues [47,57,60]. Despite this, there has been limited exploration of UMH on facial skin. In the existing studies, vibration frequencies were either borrowed from hand studies (e.g., 200Hz) or assumed based on the presumed most sensitive frequency from contact-based haptic research (e.g., 40Hz). Three studies focused on the lip, employing either LM or AM at 40Hz [27,28] or 50Hz [65]. Their findings indicated that the lip exhibits greater sensitivity to UMH feedback at lower frequencies than at higher frequencies (e.g., 200Hz). Two other studies examined other facial areas using AM at 40Hz [19] and LM at 200Hz [43]. The inappropriately selected modulation methods or parameters in these studies resulted in experimental setups that were excessively large and power-consuming (e.g., 996 transducers in Jingu et al. [28] and Mizutani et al. [43], and 1,494 transducers in Jingu et al. [27]). This renders them less feasible for translation into consumer-friendly products. Additionally, with these settings, participants had difficulty perceiving UMH feedback from more compact UMH arrays. For instance, in Gil et al. [19], some UMH cues had only a 54% detection rate with 256 transducers. Shen et al. [65] also highlighted that approximately 16% of participants could not perceive the UMH feedback using their prototype with 64 transducers.
Despite some previous attempts at applying UMH on the face, there is a lack of understanding and guidance on how to design effective facial haptics using UMH. To fill this gap, this study aims to examine how different UMH parameters (i.e., LM frequency and LM amplitude) influence users' perceptual (i.e., perceived intensity) and emotional responses (i.e., valence and arousal) across different facial sites (i.e., lip, cheek, and nose).

CONTROLLED EXPERIMENT
3.1 Experiment Design
To keep the experiment short and avoid user fatigue, we limited the number of conditions tested. This study set the LM trajectory as a circular motion to provide a stronger sensation [74]. Sixteen LM stimuli were created by crossing four levels of LM amplitude with four levels of LM frequency. The four amplitude values were 3, 6, 9, and 12 millimeters. The pilot study demonstrated that most participants could not perceive an amplitude lower than 3mm, and an amplitude larger than 12mm was sometimes perceived as a circle instead of a point. The LM frequencies were set to 10Hz, 40Hz, 70Hz, and 100Hz by controlling the drawing speed of the focal point. These values were chosen to span the range from the lower to the upper bound of perceivable frequencies on the face.
As for the feedback positions, three areas were selected to yield a representative picture of the whole face (Figure 2). The first area is the lip, which is the only glabrous skin on the face and has higher tactile sensitivity than other facial parts [69]. A preliminary study of UMH on the lip demonstrated that the center of the lip has the lowest perception threshold [27]. Therefore, the center of the lip was selected as a representative position. The second position is the cheek beside the nose; this location represents most facial areas, as no significant difference was found between the forehead, cheek, and jaw under the same UMH stimuli [43]. Finally, this study included an unexplored location, the nose tip. The nose is regarded as one of the most prominent facial features due to its unique geometry and central location on the face. The decision to investigate the nose tip stems from the hypothesis that its unique characteristics may result in differing perceptual responses compared to other facial regions.
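The factorial design described above can be enumerated in a few lines. The sketch below only restates the levels given in the text (four amplitudes, four frequencies, three sites); the function and field names are hypothetical.

```python
from itertools import product

AMPLITUDES_MM = (3, 6, 9, 12)       # LM amplitude levels from the text
FREQUENCIES_HZ = (10, 40, 70, 100)  # LM frequency levels from the text
SITES = ("lip", "cheek", "nose")    # feedback sites from the text


def build_stimuli():
    """Cross every amplitude with every frequency: 4 x 4 = 16 LM stimuli."""
    return [{"amplitude_mm": a, "frequency_hz": f}
            for a, f in product(AMPLITUDES_MM, FREQUENCIES_HZ)]


def build_conditions():
    """Present every stimulus on every site: 3 x 16 = 48 conditions."""
    return [dict(site=site, **stimulus)
            for site in SITES
            for stimulus in build_stimuli()]


if __name__ == "__main__":
    print(len(build_stimuli()), "stimuli,", len(build_conditions()), "site-stimulus conditions")
```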

Experiment Setup
The experiment setup is shown in Figure 3. The UMH stimuli were provided with the mid-air haptic device STRATOS Explore (https://www.ultraleap.com/haptics/). The device consists of a 16×16 array of ultrasonic transducers that can create UMH feedback across a 60-degree field of view and at a distance of up to 800mm in front of the transducers. The V3 SDK was employed for a stronger haptic intensity. The sensations used in the experiment were coded in C#.
A depth camera (Intel RealSense D435) tracked the participants' faces at a short-range distance (minimum 300mm). The 3D coordinates of facial locations were calculated in two steps. Firstly, the image from the RGB camera was processed by the "MediaPipe Face Landmarker" [40] to obtain the 2D coordinates of three facial points (i.e., ID = 12 for the lip, ID = 1 for the nose, and ID = 423 for the cheek). Secondly, the 2D coordinates were combined with the depth data, and the "rs2_deproject_pixel_to_point" function in the RealSense Library was used to calculate the 3D coordinates. The relative position between the depth camera and the transducer array was considered, transforming the 3D coordinates for accurate UMH feedback on the face.
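The two-step localization can be sketched without the camera libraries. The code below is a simplified stand-in, not the study's implementation: it applies the pinhole deprojection that "rs2_deproject_pixel_to_point" performs when no lens distortion is modeled, followed by a rigid transform into the transducer-array frame. The intrinsics (fx, fy, ppx, ppy), the rotation, and the translation used in the example are hypothetical values chosen for illustration.

```python
def deproject_pixel(u, v, depth_mm, fx, fy, ppx, ppy):
    """Map a pixel (u, v) with a depth reading to a 3D point in the camera frame.

    Pinhole model without lens distortion (an assumption of this sketch).
    fx, fy are focal lengths in pixels; ppx, ppy is the principal point.
    """
    x = (u - ppx) / fx
    y = (v - ppy) / fy
    return (depth_mm * x, depth_mm * y, depth_mm)


def camera_to_array(point, rotation, translation):
    """Apply a rigid transform p' = R p + t to move a point into the array frame."""
    return tuple(
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    )


if __name__ == "__main__":
    # Hypothetical intrinsics and camera-to-array offset, for illustration only.
    fx = fy = 600.0
    ppx, ppy = 320.0, 240.0
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    offset_mm = (0.0, -50.0, 0.0)

    landmark_px = (380.0, 240.0)  # e.g. a 2D landmark returned by a face tracker
    point_cam = deproject_pixel(*landmark_px, depth_mm=330.0,
                                fx=fx, fy=fy, ppx=ppx, ppy=ppy)
    print("camera frame:", point_cam)
    print("array frame:", camera_to_array(point_cam, identity, offset_mm))
```

In practice the rotation and translation would come from measuring or calibrating the mounting of the camera relative to the array.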
The depth camera and the transducer array were affixed to a vertical support frame, positioned 330mm in front of the participant's head, for accurate tracking and optimal UMH strength. To enhance tracking reliability and mitigate possible latency, participants rested their heads on a chin rest during the experiment. The chin rest was individually adjusted to align participants' noses with the top of the UMH device and the center of the depth camera image, ensuring the three facial locations fell within the optimal region for both devices.
All chosen facial locations in this study were aligned to remain perpendicular to the ear, preventing its direct exposure to UMH. Participants were equipped with in-ear headphones playing white noise and a 3M audio isolation headset to avert system malfunction and minimize ambient noise. Additionally, protective glasses were worn to physically shield the eyes from any unexpected UMH stimuli.

Experiment Procedure
A total of 24 participants (10 females, mean age: 23±2.75) were recruited for the experiment. Upon arrival at the lab, each participant was greeted and briefed on the procedure. After being fully informed, they signed the consent form. Ethical approval was obtained from the University ethics committee prior to the commencement of the experiment.
Following the training, the formal test began. Each participant encountered the 16 test stimuli twice across three facial sites, resulting in a total of 96 stimuli. The order of presentation of these stimulus blocks was counterbalanced across participants, with six possible orders in total. Each order was completed by 4 participants, and the stimuli within each block were pseudo-randomized to avoid bias. During each trial, participants rested their heads on a chin rest, and stimuli were delivered 2 seconds after the head stabilization. Each stimulus lasted for 1 second and was presented three times with a 1-second interval between repetitions. Participants could move their heads and enter their perceived intensity, valence, and arousal after perceiving three identical stimuli. If no sensation was felt, the trial was marked as "no sensation," and repeat stimuli were available if needed.
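The counterbalancing scheme can be sketched as follows. The six site orders are the permutations of the three sites, and assigning them by participant index modulo six gives each order to four of the 24 participants. The text does not specify the pseudo-randomization procedure, so the seeded shuffle within each block is an assumption, as are the function names.

```python
import random
from itertools import permutations, product

SITES = ("lip", "cheek", "nose")
AMPLITUDES_MM = (3, 6, 9, 12)
FREQUENCIES_HZ = (10, 40, 70, 100)
REPETITIONS = 2  # each stimulus is presented twice per site


def site_order_for(participant_index):
    """Pick one of the 3! = 6 site orders, cycling through them by participant."""
    orders = list(permutations(SITES))
    return orders[participant_index % len(orders)]


def trial_list(participant_index, seed=0):
    """Build the 96-trial sequence: three site blocks of 32 shuffled trials each.

    Assumption: a plain seeded shuffle stands in for the pseudo-randomization.
    """
    rng = random.Random(seed + participant_index)
    trials = []
    for site in site_order_for(participant_index):
        block = [(site, a, f)
                 for a, f in product(AMPLITUDES_MM, FREQUENCIES_HZ)] * REPETITIONS
        rng.shuffle(block)
        trials.extend(block)
    return trials


if __name__ == "__main__":
    for participant in range(3):
        print(participant, site_order_for(participant), len(trial_list(participant)))
```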
The participants used the Absolute Magnitude Estimation method to rate the perceived intensity of the stimuli [23,57,66], defining the scale themselves from 0 to infinity. For valence and arousal of the stimuli, participants used the SAM slider, ranging from unpleasant/calming to pleasant/activating (Figure 4). Icons representing corresponding emotions were presented above the slider to aid the rating process [1,2,66].
After responding to a stimulus, participants proceeded to the next, and the same procedure was repeated until the end of the experiment. A 3-minute rest period was provided after every 32 trials when switching to a new feedback site. The entire experiment lasted approximately 50 minutes. As a token of appreciation for their participation, each participant received a gift worth 10 dollars at the end of the experiment.

ANALYSIS AND RESULTS
A total of 2,304 ratings were gathered for each of perceived intensity, valence, and arousal, with 24 participants providing two ratings for each of the 48 stimuli. The average response for each stimulus was calculated by aggregating the ratings from all participants. To standardize perceived intensity ratings, each participant's rating was normalized to a range between 0 and 1 by dividing it by their highest rating [17,66]. Similarly, ratings for emotional responses were normalized between 0 and 1 by dividing them by the highest value of the slider [66].
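The normalization step can be written out as a short sketch. Intensity ratings are divided by each participant's own highest rating, and slider ratings by the slider maximum. The function names and the example slider maximum of 100 are hypothetical, and the handling of "no sensation" trials is not specified in the text, so it is not modeled here.

```python
def normalize_intensity(participant_ratings):
    """Scale one participant's magnitude estimates to [0, 1] by their own maximum."""
    peak = max(participant_ratings)
    if peak <= 0:
        # Guard for a participant whose ratings are all zero (assumption of this sketch).
        return [0.0 for _ in participant_ratings]
    return [rating / peak for rating in participant_ratings]


def normalize_slider(ratings, slider_max):
    """Scale slider ratings (valence or arousal) to [0, 1] by the slider's top value."""
    return [rating / slider_max for rating in ratings]


def mean_rating(values):
    """Average the normalized ratings collected for one stimulus."""
    return sum(values) / len(values)


if __name__ == "__main__":
    print(normalize_intensity([5, 10, 20]))              # participant using a 0-20 range
    print(normalize_intensity([100, 400]))               # participant using a 0-400 range
    print(normalize_slider([0, 50, 100], slider_max=100))
```

Dividing by each participant's own maximum removes the arbitrary scale that Absolute Magnitude Estimation allows, so ratings from participants who chose very different numeric ranges become comparable.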
The data collected in this study were analyzed using three-way repeated-measures ANOVAs with site, amplitude, and frequency as factors and users' responses (perceived intensity, valence, and arousal) as dependent variables. Mauchly's test of sphericity was conducted for each analysis, and Greenhouse-Geisser sphericity corrections were applied if the assumption of sphericity was violated. Post-hoc analysis involved pairwise comparisons with Bonferroni correction. Figure 5 displays the mean values of perceived intensity, valence, and arousal for each stimulus (detailed results with standard deviation can be found in Tables 1, 2, and 3 in the supplementary material).
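As a reading aid for the F statistics, the degrees of freedom of a fully within-subject design follow directly from the number of factor levels and participants. The sketch below computes them for a 3 (site) × 4 (amplitude) × 4 (frequency) design with 24 participants, and shows how a Greenhouse-Geisser epsilon and a Bonferroni correction rescale the values. It is not the analysis pipeline used in the study; the function names and the example epsilon and p-values are hypothetical.

```python
def rm_anova_df(levels, n_participants):
    """Degrees of freedom for a within-subject effect.

    levels: number of levels of each factor involved in the effect,
            e.g. [3] for a site main effect or [3, 4, 4] for the three-way interaction.
    """
    effect_df = 1
    for k in levels:
        effect_df *= (k - 1)
    error_df = effect_df * (n_participants - 1)
    return effect_df, error_df


def greenhouse_geisser_df(effect_df, error_df, epsilon):
    """Both degrees of freedom are multiplied by the sphericity estimate epsilon."""
    return effect_df * epsilon, error_df * epsilon


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    print("site main effect:", rm_anova_df([3], 24))
    print("site x frequency:", rm_anova_df([3, 4], 24))
    print("three-way interaction:", rm_anova_df([3, 4, 4], 24))
    print("corrected df (example epsilon 0.5):", greenhouse_geisser_df(2, 46, 0.5))
    print("adjusted p-values:", bonferroni([0.004, 0.02, 0.5]))
```

Fractional degrees of freedom in a reported F statistic therefore indicate that a sphericity correction was applied to that effect.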

Effects of UMH on Perceived Intensity in the Face
The feedback sites exerted a significant main effect on perceived intensity (F(2, 46) = 78.93, p < 0.001, η² = 0.77). Pairwise comparison revealed significant differences between the lip and cheek (p < 0.001), and between the lip and nose (p < 0.001). Additionally, a three-way interaction emerged among the sites, amplitude, and frequency (F(18, 414) = 3.057, p < 0.001, η² = 0.117). Consequently, the impact of the two LM parameters on perceived intensity was analyzed on a site-by-site basis.
The impact of LM parameters on the cheek is illustrated in Figure 6 (..., η² = 0.212). The highest perceived intensity was recorded at 40Hz with an LM amplitude of 12mm (intensity = 0.427). Regarding frequency, significant differences were found between 40Hz and 10Hz (p < 0.001). As for amplitude, significant differences were noted between 12mm and 6mm (p < 0.001) and between 12mm and 3mm (p < 0.001).

Effect of UMH on Affective Response
No significant main effects or interactions were found for valence. However, concerning arousal, a significant main effect of feedback sites was identified (F(1.365, 31.394) = 41.045, p = 0.037, η² = 0.141) (Figure 7(a)). Pairwise comparisons revealed significant differences between the lip and cheek (p < 0.001), between the lip and nose (p < 0.001), and between the cheek and nose (p = 0.012). These findings underscore that the perception of UMH differs across the three facial areas. Additionally, a two-way interaction was observed between facial area and frequency (F(6, 138) = 6.703, p < 0.001, [...]). When UMH was presented on different sites, 70Hz was more activating than 40Hz on the lip, while on the cheek and nose, 40Hz was more activating than 70Hz. When modulated at different amplitudes, 70Hz was more activating than 40Hz at small amplitudes (3mm and 6mm), and 40Hz became more activating at larger amplitudes (9mm and 12mm). Detailed results are available in Tables 4 and 5 in the supplementary material.

DISCUSSION AND DESIGN IMPLICATIONS
5.1 Effect of Feedback Site and LM Parameters on Users' Response
The findings of this study unveiled the influence of feedback sites on facial perception. Notably, stimuli on the lip were perceived as significantly more intense than those on the cheek and nose. These results suggest a divergence in the perception of UMH between glabrous skin (e.g., lip) and hairy skin (e.g., cheek and nose). This aligns with previous research that compared glabrous and hairy skin on the palm and arm [74]. Moreover, the observed distinction suggests that the influence of skin type may transcend specific mechanoreceptor types, challenging previous research that predominantly focused on PC receptors. In contrast, our study homed in on Meissner corpuscles, hinting that the observed difference in perception is not limited to a particular mechanoreceptor type. Beyond skin types, the difference in perceived intensity could also be ascribed to the complex geometry of the face. The lip, with its valley-shaped (or concave) structure, facilitates the reflection of ultrasound waves between the lips, creating a concentrated focal point [27]. In contrast, the nose, a prominent facial landmark, scatters sound waves upon contact with its convex structure, possibly contributing to its slightly lower perceived intensity compared to the cheek, whose surface is relatively flat. Additionally, the results showed varying arousal levels across different facial sites, with the nose evoking a more calming response and the lip generating heightened excitement.
For designers and researchers delving into UMH, these findings highlight the importance of carefully selecting the feedback site, as the same feedback patterns for different sites may elicit different user responses.
The study pioneers an exploration into the impact of LM amplitude on facial perception. The results reveal that the LM amplitude significantly influences perceived intensity across varying modulation frequencies, extending its impact not only to glabrous skin but also to hairy skin on the face. This aligns with previous studies on the hand and arm, where the effect of LM amplitude can be explained by the spatial summation phenomenon at the receptor level [53]. In essence, larger stimulated areas lead to a stronger perception of intensity for UMH feedback. However, it is important to note that the perceived intensity plateaus once the skin is saturated [7]. For the facial areas, no significant differences were observed once the amplitude reached 9mm, suggesting that the optimal LM amplitude for perceived intensity lies between 9mm and 12mm. A similar trend was observed for arousal across all locations, with users reporting the highest arousal at an amplitude of 9mm or 12mm. This parallels observations of UMH applied to the hand using STM, where perceived intensity correlates with the size of tactile patterns within the frequency range of 2Hz to 100Hz [1,17]. These findings suggest that the spatial summation effect might be confined to a specific size range, warranting further investigation into human perception of pattern size from LM to STM. Introducing LM amplitude as a control parameter opens up new avenues for generating diverse emotional responses on the face beyond adjusting transducer power [28]. This approach enables the creation of more complex and impactful haptic cues using smaller, energy-efficient UMH devices.
The optimal frequency for perceiving LM differs between glabrous skin (i.e., the lip) and hairy skin (i.e., the nose and cheek) on the face. The lip exhibits optimal LM perception at 70Hz, while the cheek and nose achieve the strongest perception at 40Hz. These results align with previous findings indicating that frequencies below 80Hz excite non-PC receptors [42]. The presence of hair receptors on the cheek and nose, absent on the lip, could explain their heightened sensitivity to lower LM frequencies. Previous studies on tactile sensation on hairy skin on the head found that vibration at 32Hz (i.e., lower frequency) led to better perception performance than 63Hz [48]. Differences in the optimal frequency across facial sites may also be associated with different skin elastic properties [15]. Prior investigations into skin properties have indicated that the lip possesses greater elasticity than the cheek and nose [11,64,75]. Given that a higher LM frequency leaves a shorter time interval for the skin to completely relax, skin with higher elasticity is more adept at accommodating higher LM frequencies. As with LM amplitude, users' emotional responses can be influenced by LM frequency. Frequencies ranging from 40Hz to 70Hz notably enhance arousal, facilitating more engaging interactions without affecting the valence.
The current study revealed no significant effect on valence, consistent with the findings from Shen et al. [66] for the palm and Pittera et al. [53] for the forearm. Notably, Ablart et al. [1] demonstrated a potential correlation between frequency and valence, suggesting a subtle relationship. The absence of a clear valence effect could be attributed to the inherent challenges participants faced in accurately rating such emotions [50]. During the emotion rating session, participants expressed difficulties providing precise valence ratings, accompanied by insightful comments on their emotional experiences across different feedback sites. Therefore, future research is recommended to consider employing qualitative methods, such as micro-phenomenology interviews [54], to gain a more in-depth understanding of the user experience of UMH on the face.

Design Implications
Our results indicate the efficacy of manipulating LM amplitude and frequency on different facial sites to control users' perceptual and emotional responses. This facilitates collaboration among HCI researchers and designers with diverse backgrounds, providing a standardized hardware framework for designing and evaluating facial haptic applications. The inherent contactless attributes of UMH feedback enhance its versatility, allowing seamless integration with pre-existing prototypes. Building upon the results of our experiments, this section delves into fundamental design paradigms and showcases several potential application examples to highlight their impact on enhancing utility and user experience.
5.2.1 Design Paradigms.
When utilizing the face as an interface for delivering informative feedback [67], careful consideration should be given to the selection of feedback sites and modulation parameters to ensure a smooth and effective user experience.
UMH Feedback Site Selection: The choice of feedback site should align with the intended level of urgency [4,35]. For frequent and minor actions, modest UMH feedback directed toward the nose or cheek is recommended. Conversely, for infrequent and significant actions, the lip proves more suitable for delivering salient feedback.
LM Frequency Selection: To achieve optimal perceived intensity, we propose a frequency of 70Hz on the lip and 40Hz on the cheek and nose. Other frequencies can also be employed to convey additional information. Generally, the further the selected frequency deviates from the optimal frequency, the lower the perceived intensity and arousal level.
LM Amplitude Selection: For the strongest perception across all three sites, an amplitude of 12mm is recommended. When two intensity levels are required, an amplitude of 6mm can serve as the weaker level.
Emotion Mediation: Similar to findings in Obrist et al. [50], UMH locations and parameters can influence the user's emotional state. In the case of face haptic feedback, targeting the lip promotes a more activating experience, while the cheek is conducive to neutral feelings. Researchers and designers are advised to focus on the nose when a calming effect is desired.
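The site, frequency, and amplitude recommendations above can be collected into a small lookup that a prototype might consult. This is a sketch under stated assumptions: the numeric values come from the guidelines in this section, while the function names, the dictionary layout, and the reduction of "urgency" to a boolean are hypothetical simplifications.

```python
# Values taken from the design guidelines in this section.
RECOMMENDED_LM_FREQUENCY_HZ = {"lip": 70, "cheek": 40, "nose": 40}
STRONG_AMPLITUDE_MM = 12  # strongest perception across all three sites
WEAK_AMPLITUDE_MM = 6     # second, weaker intensity level


def suggest_sites(salient):
    """Salient (infrequent, significant) cues go to the lip; modest cues to nose or cheek."""
    return ("lip",) if salient else ("nose", "cheek")


def suggest_lm_parameters(site, strong=True):
    """Return the recommended LM frequency and amplitude for a facial site."""
    if site not in RECOMMENDED_LM_FREQUENCY_HZ:
        raise ValueError(f"unknown facial site: {site!r}")
    return {
        "site": site,
        "frequency_hz": RECOMMENDED_LM_FREQUENCY_HZ[site],
        "amplitude_mm": STRONG_AMPLITUDE_MM if strong else WEAK_AMPLITUDE_MM,
    }


if __name__ == "__main__":
    for site in suggest_sites(salient=True):
        print(suggest_lm_parameters(site))
    for site in suggest_sites(salient=False):
        print(suggest_lm_parameters(site, strong=False))
```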

Design Opportunities.
Autonomous Vehicle User Interface (Figure 8 (a)): With the rise of autonomous driving, drivers are anticipated to engage in non-driving-related activities such as reading, working, or eating [77]. In this context, UMH on the face holds advantages over hand-based interactions within autonomous vehicles. Facial haptic technology provides a platform to implement various categories of haptic information within automotive user interfaces [3]. For instance, UMH feedback on the lip can quickly alert the driver to take control during emergencies, and haptic feedback on the nose could serve as a soothing confirmation signal while using voice control to adjust the in-vehicle temperature. Similar concepts can be extended to public transportation, where integrating UMH feedback into the inflight entertainment system could regulate and influence passengers' emotions during long-haul journeys.
Inclusive Design (Figure 8 (b)): Face haptics using UMH offers a range of exciting possibilities for inclusive designers. By adjusting the parameters of the LM, designers can create diverse haptic feedback cues with a limited number of transducers. Incorporating a flexible UMH array [32] further opens the potential to integrate the transducer array into wearable devices such as hats. Such devices can prove invaluable for individuals with visual or hearing impairments who face challenges in perceiving their surroundings in their daily lives. Engineers and designers can empower individuals to sense their environment by projecting real-time haptic cues onto their faces without disrupting other tasks. Distance and location information can be conveyed on corresponding facial areas using varying intensity levels.
Immersive Experiences in VR/AR (Figure 8 (c)): Over the last decades, the concept of the "Metaverse" and the advancement of virtual reality technology have shown great development prospects in areas such as social interaction, gaming, and training [55,72]. By seamlessly integrating high-resolution UMH feedback with visual and auditory cues, a more encompassing and engrossing virtual experience can be achieved [65,68]. For example, in a virtual environment where individuals meet and engage in conversations, their emotions can be simultaneously conveyed by adjusting UMH parameters. This has the potential to enhance empathy and elevate the user experience in remote communication [6,59,79].

Limitations and Future Work
Considering the potential applications of UMH on the face, it is imperative to address various technical challenges and safety issues before practical implementation in real-world scenarios. Firstly, integrating real-time and accurate face detection technology into the UMH feedback system is essential. While the depth camera utilized in our study and previous research has shown efficacy in controlled experimental settings, it is impractical for wearable devices due to its bulky nature and limited capability to calculate depth information within a short range. Furthermore, privacy concerns associated with face recognition are of utmost importance, particularly when deployed in public spaces [46,58,85]. Future investigations should focus on either enhancing facial data protection or exploring alternative detection techniques such as thermal imaging [26]. Secondly, advancements in the miniaturization of the UMH array in terms of size and weight are crucial for wearable applications. Shen et al. [65] demonstrated promising energy control when integrating the UMH array with a VR headset. However, the addition of the UMH array substantially increases the overall weight and size of the product. Therefore, HCI researchers and designers need more compact and lightweight transducers with improved energy efficiency to ensure optimal user experience. Lastly, more research is needed to address safety concerns associated with the application of UMH on the face. While previous studies have shown that exposure to high-intensity airborne ultrasound has no significant impact on hearing sensitivity [5,14,18], its long-term effects remain uncertain. Moreover, the close proximity of the transducers to the head in the novel wearable form raises safety concerns regarding the generation of UMH feedback in the vicinity of the head. These safety concerns must be thoroughly investigated and mitigated to ensure the well-being of users.
Throughout the experiment, a majority of participants reported perceiving a sensation akin to wind in response to certain stimuli. This wind sensation resulted from the acoustic streaming of focused ultrasound, generating temperature variations on the skin and influencing users' emotional responses [53]. Post-experiment informal interviews revealed divergent experiences among participants, with some finding the wind sensation annoying while others derived enjoyment from it. Consequently, it would be beneficial for future studies to map out the sensations and experiences associated with facial stimuli and compare them with findings from hand-related studies [12], which could further inform UMH designers and researchers. In addition, incorporating objective physiological measurements such as electroencephalography signals [80] and heart rate responses [10] could help validate the subjective feedback collected during the experiment.

CONCLUSION
This study investigated the effects of LM frequency and LM amplitude on users' perceptual and emotional responses across different facial sites. Notably, our findings mirrored the difference observed between glabrous and hairy skin on the hand and arm, extending these insights to facial skin. Specifically, the lip exhibited increased sensitivity to UMH feedback compared to the cheek and the nose. This establishes a valuable connection, allowing insights and lessons garnered from previous studies on the hand and arm to be transferred to facial skin. Our results underscored the influence of both LM parameters on users' perceived intensity and emotional states. In particular, larger LM amplitudes were found to generate stimuli with elevated intensity and arousal. For glabrous facial skin (the lip), the optimal LM frequency was identified as 70 Hz, contrasting with 40 Hz on hairy facial skin (the cheek and nose). These findings not only enhance our understanding of how humans perceive UMH feedback on the face but also contribute to the advancement of UMH design, opening up a more expansive array of applications.
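As a minimal sketch of how a designer might encode these findings, the following Python snippet looks up the LM frequency reported as optimal for each facial site and selects an LM amplitude according to the desired arousal level. The function names, the dictionary, and the idea of choosing from a list of device-supported amplitudes are illustrative assumptions and are not part of the study's software; only the 70 Hz and 40 Hz values and the direction of the amplitude effect come from the results above.

```python
# Illustrative sketch only: all names here are hypothetical, not the study's code.

# LM frequencies (Hz) identified as optimal in this study.
OPTIMAL_LM_FREQUENCY_HZ = {
    "lip": 70,    # glabrous facial skin
    "cheek": 40,  # hairy facial skin
    "nose": 40,   # hairy facial skin
}


def recommend_lm_frequency(site: str) -> int:
    """Return the LM frequency (Hz) reported as optimal for a facial site."""
    key = site.strip().lower()
    if key not in OPTIMAL_LM_FREQUENCY_HZ:
        raise ValueError(f"No recommendation for facial site: {site!r}")
    return OPTIMAL_LM_FREQUENCY_HZ[key]


def pick_lm_amplitude(available_amplitudes, want_high_arousal: bool):
    """Choose an LM amplitude from the values a device supports (assumed input).

    Larger LM amplitudes were associated with higher perceived intensity and
    arousal, so pick the largest for alerting cues and the smallest for calm cues.
    """
    if not available_amplitudes:
        raise ValueError("available_amplitudes must not be empty")
    if want_high_arousal:
        return max(available_amplitudes)
    return min(available_amplitudes)
```

For example, an alerting cue on the lip would use `recommend_lm_frequency("lip")` together with `pick_lm_amplitude(amplitudes, want_high_arousal=True)`, where `amplitudes` is whatever set of LM amplitudes the target device supports.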

Figure 1: UMH modulation techniques, showing the different control of the focal point (above) and the evolution of focal pressure over time (below): a) Amplitude modulation (AM); b) Lateral modulation (LM); c) Spatiotemporal modulation (STM).

Figure 2: Three facial sites selected for the experiment.

Figure 3: Overview of the experiment setup.

Figure 4: Self-Assessment Manikins, as shown on the rating tablet, to aid users' rating of valence and arousal.

Figure 5: Overview of all participants' responses to each LM UMH stimulus.

Figure 6: Summary of the results on perceived intensity according to LM frequency and LM amplitude on different facial sites. (a) Intensity on the lip; (b) Intensity on the cheek; (c) Intensity on the nose. Bars represent the mean; error bars represent the 95% confidence intervals.

Figure 7: Summary of the results on arousal from repeated measures ANOVA. (a) Arousal on three facial sites; (b) Arousal on three sites with different LM amplitudes; (c) Arousal for different LM amplitudes and LM frequencies. Bars represent the mean; error bars represent the 95% confidence intervals.

Figure 8: Three potential applications of UMH on the face. (a) Autonomous vehicle user interface; (b) Design for individuals with visual or hearing impairments; (c) Design for immersive experiences.