Looking Beyond the Screen: Natural Eye Contact as a Key to Relatedness in Teleconferences

Non-verbal cues are essential in communication, and eye contact and gaze awareness are crucial components of these cues. However, video-based communication lacks many of these non-verbal cues. To address this issue, we have developed a low-cost prototype allowing natural eye contact during one-to-one video calls. In a study focused on telepresence in remote psychotherapy, we demonstrate that our prototype enhances the perception of genuine and realistic communication with higher physical and social presence compared to classical video meeting setups with the camera placed on the screen.


Introduction
Eye contact has received a lot of research interest over the years as a means of non-verbal communication. In a recent meta-analysis, the authors find indications in many studies that direct gaze increases affective arousal and evokes a positively valenced affective reaction [3]. However, much of this potential is lost when meetings take place on the internet. In video conferences, the perception of eye contact is flawed because people look at the video of the person talking and, therefore, not directly into the camera lens. This leaves the impression of looking past someone, which is considered awkward or impolite in direct interaction in many cultures.
Our project presents an affordable experimental setup for one-on-one video conferencing that facilitates natural eye contact. This setup is particularly relevant in contexts like psychotherapy, where remote sessions have become common, especially during the COVID-19 pandemic. In our interdisciplinary experiment, we (i) developed the prototype described in this paper and (ii) investigated the perceived genuineness and realism of interaction in conversations in a preliminary study. We compare our experimental setup with a traditional video conferencing arrangement, providing insights into the impact of improved eye contact in digital communications.

Related Work
The negative effects of video meetings on users have become apparent with the advent of home office and the extended use of video communication software. Seitz et al. [7] conduct a meta-analysis of 78 published studies predominantly focusing on professional communication and its negative effects on users. They identify the need for further research outside the professional, i.e., work context, and the need to investigate the positive impact on psychological user states.
The GAZE-2 system [8] utilizes multiple cameras to achieve better gaze awareness during video conferences. The cameras are hidden behind a semi-transparent mirror, which reflects a video screen from the bottom but lets the cameras observe the users figuratively from behind the screen. GAZE-2 uses an eye tracker to switch between cameras to direct the gaze to individual participants. The authors focus on group video conferences and the perception of eye contact but do not investigate whether perceived eye contact and gaze awareness affect the perceived social presence or genuineness.
The Multiview system [5] provides a group-to-group video conferencing setup, which aims at providing non-verbal cues, including gaze awareness. Like GAZE-2, it uses multiple cameras and semi-transparent mirrors. The authors evaluated their system in a video conference setup between two groups of two or three participants. Their results indicate partial gaze awareness but, again, leave aside the potential positive effects on the perceived telepresence.
Microsoft, Apple, and NVIDIA have presented software systems for so-called gaze correction, which should be able to maintain eye contact in one-on-one video conferencing. Microsoft added the gaze correction feature to their Surface devices around 2020, and around the same time, Apple introduced the magic eye contact feature for FaceTime with iOS 14. NVIDIA later included a similar software-based approach as the eye contact feature in its NVIDIA Broadcast app. All three methods are AI-powered, and each of them has been criticized on social media as creepy. For psychotherapy sessions, AI-powered approaches that alter the gaze direction have a significant drawback: when a user avoids the other person's gaze, the gaze is still corrected, which creates the impression that lingering eye contact is being maintained. So, while traditional setups do not allow eye contact to be identified, AI-powered gaze correction overcompensates and often fails in situations where users look away without moving their heads.

Implementation
Our approach is similar to that of GAZE-2, as we also use a semi-transparent mirror. However, we decided to install the screen in an upright position to avoid potential heat venting issues. Most computer screens come with heat vents on the upper side of the body and air inlets on the bottom, which allows for passive cooling. If a screen is operated with the display facing upwards, heat dispersion is impaired, and the screen may eventually become inoperable. The screen is installed at the back of a black box. A standard webcam is installed at the bottom of the box. The camera can be moved as indicated in Fig. 1 to align it with the height of the user's eyes. The mirror is installed in the box at an angle of 45° to reflect the user's gaze into the camera.
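The geometry of the 45° mirror can be checked with a small calculation. The following is a minimal sketch in a 2D side view, not part of the prototype software; the coordinate convention (x pointing from the user toward the screen, y pointing up) and the function names are assumptions made for illustration. It shows that a horizontal line of sight is folded straight down toward a camera at the bottom of the box, and that a mirror mounted a few degrees off deflects the ray by twice that error.

```python
import math

def mirror_normal(tilt_deg):
    """User-facing unit normal of a mirror tilted by tilt_deg from the
    horizontal viewing axis, with its top edge leaning toward the user.
    Side view: x points from user to screen, y points up (assumed convention)."""
    t = math.radians(tilt_deg)
    return (-math.sin(t), -math.cos(t))

def reflect(direction, normal):
    """Mirror a 2D direction vector at a surface with the given unit normal."""
    dx, dy = direction
    nx, ny = normal
    dot = dx * nx + dy * ny
    return (dx - 2 * dot * nx, dy - 2 * dot * ny)

def deviation_from_vertical_deg(tilt_deg):
    """Angle between the reflected ray and the straight-down direction
    for light travelling horizontally from the user toward the screen."""
    rx, ry = reflect((1.0, 0.0), mirror_normal(tilt_deg))
    return math.degrees(math.atan2(rx, -ry))

if __name__ == "__main__":
    print(reflect((1.0, 0.0), mirror_normal(45.0)))  # approximately (0, -1)
    print(deviation_from_vertical_deg(40.0))         # approximately 10 degrees
```

Because the deflection error is twice the mounting error, a movable camera, as used in the prototype, is a convenient way to compensate for small inaccuracies in the mirror angle.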
During the initial pre-study tests, we encountered several issues related to lighting, screen contrast, and audio. The semi-transparent mirrors in use allowed light to pass through from the monitor and also reflected the light from the user to the camera. As a result, light in both directions was filtered and dimmed, leading to poor picture quality and color noise in the low-light setting of the camera, which affected video call quality. The video conference content shown on the screen was also dimmed by the mirror. To address these problems, we installed ambient lights on both sides of the box, as shown in the prototype photos in Fig. 2. This led to better lighting of the meeting participants and far better video quality without dazzling the participants. We further replaced the computer screens with newer, brighter models to reduce the perceived effect of the mirror on the brightness.
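The dimming in both directions can be illustrated with a simple light budget. This is a rough sketch only: the optical properties of the mirror are not given in this paper, so the 50 % transmittance and 50 % reflectance below are hypothetical values, and the function names are chosen for illustration.

```python
# Hypothetical optical properties; the paper does not specify the mirror.
ASSUMED_TRANSMITTANCE = 0.5  # fraction of screen light reaching the user
ASSUMED_REFLECTANCE = 0.5    # fraction of light from the user reaching the camera

def remaining_light(source_level, fraction):
    """Light level left after passing through or reflecting off the mirror."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return source_level * fraction

def required_boost(fraction):
    """Factor by which a source must be brightened to compensate the loss."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return 1.0 / fraction

if __name__ == "__main__":
    # A screen set to 300 (arbitrary units) appears as 150 behind the assumed mirror.
    print(remaining_light(300.0, ASSUMED_TRANSMITTANCE))
    # Under the same assumption, the user lighting would need to be doubled.
    print(required_boost(ASSUMED_REFLECTANCE))
```

Under these assumed values, both the screen and the lighting of the user would need to be roughly twice as bright to match a setup without a mirror, which is in line with the two measures taken above: brighter screens and additional ambient lights.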
In our setup, the webcam's microphone wasn't suitable for video conferences as it caused muffled sound due to the camera's position within the box.For the control group in our study, we attached an additional camera on top of the box to mimic traditional setups without eye contact.To ensure uniformity between the test and control groups and to improve audio quality, we used high-quality podcast microphones for both groups as shown in Fig. 2. Additionally, we required participants to wear headphones to avoid possible audio issues during the study.

In Table 1, you can find an overview of the estimated costs of upgrading a single existing seat with a screen and computer to our eye contact prototype. It is worth noting that we used two seats for our experiment as we focused on one-to-one communication. All in all, at the time of writing, upgrading one seat costs around 320 EUR, not including assembly work and duct tape. We created two prototypes for our study, as shown in Fig. 2.
The study was run by an interdisciplinary team of psychology and computer science researchers. The overall goal was to conduct a pre-study for a larger study on the influence of eye contact in remote psychotherapy. We recruited the participants from the pool of psychology students at our university, who benefit from participating in studies as part of their curriculum. In a randomized controlled trial, we randomly assigned N = 38 student participants to either a standard video setting with a webcam placed above the screen or our prototype, where the webcam was perceived as being behind the screen. We placed participants in two separate rooms connected through the department's computer network infrastructure. We used Skype on Windows PCs to facilitate a video call for the experiment. For each run, we had two participants and one operator in the video call. The moderator guided the participants through the study:
• informed consent: 5 minutes
• setup: 5 minutes
• discussion on controversial topics: 40 minutes
• questionnaires: 10 minutes
After participants gave their consent, the moderator helped set up each seat by adjusting the lighting, headphones, and microphones, as well as the camera position and chair height, to ensure that the eyes were centered in the camera image. The moderator then guided the experiment via Skype, using audio only, while the two study participants discussed controversial topics and worked toward a conclusion. Example topics were:
• Do we need state educational training for parents, i.e., a parent driving license?
• Should smoking be banned in public?
• Should the state pension be abolished?
• Do we need a real name requirement on the Internet?
The participants subsequently completed the telepresence scale for psychotherapy delivered in videoconference (TVS; [1]) and the real relationship inventory (RRI; [4]), which served as dependent variables. The three subscales of the TVS assess physical presence, social presence, and absorption. The instrument shows acceptable reliability (alpha .80) and validity [1]. The RRI assesses the perceived genuineness and realism of the interaction. It shows good internal consistency (.95) and retest reliability [4]. To control for covariables, the participants also indicated age and gender and filled out a measure of mentalizing ability (MZQ; [2]) and a measure of their general comfort with communication via the phone, video, or in person (DCCS; [6]).
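The random assignment to the two camera conditions could be scripted as follows. This is a hedged sketch and not the procedure actually used in the study: it assumes that assignment happens per call, so that both participants of a dyad share one condition, and the function name, condition labels, and participant identifiers are hypothetical.

```python
import random

def assign_dyads(participant_ids, seed=None):
    """Randomly pair participants into dyads and assign each dyad to a condition.

    Assumption (not stated in the paper): the condition is assigned per dyad.
    Group sizes differ by at most one dyad.
    """
    ids = list(participant_ids)
    if len(ids) % 2 != 0:
        raise ValueError("an even number of participants is required")
    rng = random.Random(seed)
    rng.shuffle(ids)
    dyads = [(ids[i], ids[i + 1]) for i in range(0, len(ids), 2)]
    conditions = ["prototype", "standard"]  # hypothetical labels
    rng.shuffle(conditions)  # randomize which condition gets the extra dyad
    return [(dyad, conditions[i % 2]) for i, dyad in enumerate(dyads)]

if __name__ == "__main__":
    plan = assign_dyads(range(1, 39), seed=2024)  # 38 participants
    for dyad, condition in plan:
        print(dyad, condition)
```

With 38 participants, this yields 19 dyads, so one condition necessarily receives one dyad more than the other. Fixing the seed makes the assignment reproducible for documentation purposes.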
A multiple variance analysis was calculated to obtain effect sizes for the impact of the camera condition on perceptions of telepresence and the real relationship. Despite the small sample and the cost-effective hardware and software, we observed large effects in the expected direction on all scales of the TVS and RRI (Table 2). This preliminary study emphasizes the importance of exploring new communication technologies in remote psychotherapy, as enhancing conventional technologies can support the perception of presence in the communication dyad.
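For readers less familiar with the effect size reported in Table 2, partial eta squared relates the variance attributed to an effect to the sum of that variance and the error variance. The sketch below shows the standard definition and the equivalent computation from an F statistic; the numbers in the usage example are made up for illustration and are not results of this study.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared from sums of squares: SS_effect / (SS_effect + SS_error)."""
    if ss_effect < 0 or ss_error < 0 or ss_effect + ss_error == 0:
        raise ValueError("sums of squares must be non-negative and not both zero")
    return ss_effect / (ss_effect + ss_error)

def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Equivalent form using the F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)

if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    ss_effect, ss_error = 14.0, 86.0
    df_effect, df_error = 1, 32
    f_value = (ss_effect / df_effect) / (ss_error / df_error)
    print(partial_eta_squared(ss_effect, ss_error))
    print(partial_eta_squared_from_f(f_value, df_effect, df_error))
```

Both functions return the same value for consistent inputs. As an often-cited rule of thumb, values around .01, .06, and .14 are read as small, medium, and large effects, respectively.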

Conclusion
In this paper, we present a low-cost prototype to improve eye contact during one-to-one video calls. Our preliminary study shows that this approach has a significant impact on the perceived telepresence and the real relationship between participants, leading to a more engaging and authentic conversation.
From an engineering perspective, our prototype faces lighting issues, including the subject's illumination and the screen's brightness. The semi-transparent mirror used in our prototype does not provide perfect reflection or light passthrough, which deteriorates the quality of the video. In that sense, it is even more surprising that our low-tech approach led to this significant impact.
For future work, we plan to include other conditions in our study. Firstly, we aim to test the gaze correction and magic eye contact software to determine whether it has a similar effect to our prototype. Additionally, we plan to explore virtual reality and mixed reality conditions, with the ultimate goal of enhancing the experience of remote psychotherapy and remote therapy in general. Finally, our system can be utilized to generate a dataset for training gaze correction models. By recording both camera streams, one from the camera located on top of the screen and the other from the camera perceived to be on the screen, our prototype has the potential to create low-cost training data.
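One practical step in building such a dataset would be aligning the frames of the two recordings. The following sketch pairs frames by nearest capture timestamp; it is an assumption about how this could be done, not an existing part of our system, and the function name, the timestamp lists, and the 20 ms tolerance are hypothetical.

```python
from bisect import bisect_left

def pair_frames(top_timestamps, box_timestamps, max_gap=0.02):
    """Pair each top-camera frame with the nearest in-box camera frame.

    Both inputs are lists of capture times in seconds; box_timestamps must
    be sorted in ascending order. Pairs further apart than max_gap are dropped.
    Returns a list of (top_index, box_index) tuples.
    """
    pairs = []
    for i, t in enumerate(top_timestamps):
        j = bisect_left(box_timestamps, t)
        # Only the neighbours left and right of the insertion point can be nearest.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(box_timestamps)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(box_timestamps[k] - t))
        if abs(box_timestamps[best] - t) <= max_gap:
            pairs.append((i, best))
    return pairs

if __name__ == "__main__":
    top = [0.00, 0.04, 0.08]   # illustrative timestamps
    box = [0.01, 0.05, 0.20]
    print(pair_frames(top, box))
```

Each resulting pair would show the same moment once without and once with apparent eye contact. Note that this simple approach may pair one in-box frame with several top-camera frames if the frame rates differ.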

Figure 1. Setup of our experiment: a user looks through a black box onto a screen at the other end. Within the black box, a semi-transparent mirror is installed to reflect the view of the user to a camera at the bottom of the box.

Figure 2. Photos of the prototype showing the ambient lighting devices to the left and right of the box, the microphone, and the webcam on top for the control group.

Table 1. Cost per seat for our prototype. Note that for a face-to-face setup, two seats are needed, and we assume that a computer and a screen are available per seat.

Table 2. Effect sizes (partial eta squared) controlled for age, gender, mentalizing ability, and comfort with telecommunication.