Sound and Colour: Evaluating Auditory-Visual Tests in Virtual Reality and Traditional Desktop Settings

In this pilot study, we used a balanced A/B testing method to evaluate how an environment—a Virtual Reality (VR) head-mounted display (HMD) versus a traditional desktop—affects the performance of 20 participants on a musical pitch-colour association test. We aimed to discern the influence of testing environments on musical pitch-colour associations. Our findings revealed no significant difference in performance between the VR HMD and the traditional desktop conditions. Previous studies highlight VR’s potential, but the results of our investigation—focusing on limited immersive and presence qualities—suggest that working in VR as opposed to a standard desktop environment has no significant influence on musical pitch-colour association test results. This outcome prompts a further investigation into other inherent VR characteristics, such as spatial audio, natural environment simulations, and interactive object manipulation, which could potentially enhance the effectiveness of auditory-visual association tests. This study is the first in a series of studies to explore how VR technologies can be exploited to augment multi-modal testing, with a fully immersive musical pitch-colour association test envisioned.


INTRODUCTION
Virtual Reality (VR) has increasingly been embraced as an effective tool in research due to its ability to create immersive environments that simulate real-world experiences [20,26,28]. One area where VR holds substantial promise is in exploring auditory-visual associations [19] and presenting synaesthesia-like experiences (see Section 2.1) to help understand connections between the senses [5]. Previously, pen and paper and desktop-based tests have been developed to determine if a person has synaesthesia [1,4]. At the same time, these tests are also used to highlight the consistency of the general population in associating between their senses [36,37].
With VR and its immersive nature, a more intuitive and engaging context for these assessments could be created, allowing non-synaesthetes to experience multi-modal stimuli in a way that more closely mimics the experiences of synaesthetes and possibly leads to enhanced consistency in auditory-visual associations within the test. However, the literature on the influence of VR on auditory-visual association tests is sparse and inconclusive.
This paper presents an experimental study designed to start filling this gap in the research. Utilising a balanced A/B testing methodology, we administered a musical pitch-colour association test to 20 participants using The Synesthesia Battery, an online platform designed for such assessments. Each participant completed the online test in two environments: a traditional desktop setting and a VR Head Mounted Display (HMD).
The research question (RQ1) was: Does environment (Desktop vs VR HMD) significantly affect how participants perform in a musical pitch-colour association test?
If the null hypothesis, stating that no significant difference exists between the test scores in both the desktop and VR HMD environments, cannot be rejected, we speculate that other elements unique to VR, such as spatial audio, natural environment simulations, and interactive object manipulation, might be the key factors in improving consistency in musical pitch-colour tests.
This pilot experiment will be the first step toward a broader understanding of how VR can be used to enhance multi-modal testing. By examining the impact of VR on auditory-visual association test performance, we aim to set a foundation for future research in this promising field.

RELATED WORK
In this section we review previous work on auditory-visual crossmodal associations, sound-colour synaesthesia, current auditory-visual association tests and related VR research.
The aim is to contextualise our exploration of VR's potential to enhance understanding and assessment of auditory-visual associations.
Figure 1: Testing Environments: The desktop testing environment is shown on the left, the participant seating area for the VR test is in the middle, and the VR testing environment using the Oculus Quest 2 headset is on the right.

Auditory-Visual Cross-Modal Associations
Auditory-visual association research has shown interesting connections between the auditory and visual senses. Lower-pitched sounds have been found to associate with darker colours, while higher-pitched sounds usually associate with lighter colours [5,11,38]. Other musical features also have associations with the visual sense, such as musical key-brightness, timbre-saturation, loudness-brightness and pitch-size [24,33].
Emotions significantly influence how we perceive associations between the senses. Research on music-emotion and colour-emotion has revealed intriguing connections. Hevner's emotional model and Plutchik's wheel of emotions demonstrate the intricate relationships between emotion-music, and emotion-colour [14,27]. The emotion mediation hypothesis further suggests that individuals tend to select colours with emotional content similar to the music they listen to [24]. Understanding how emotions impact sensory associations is crucial for designing innovative systems, especially in immersive environments like VR.
Research into synaesthesia has found that there are some similarities with cross-modal associations used in the general population [17,38,40]. Synaesthesia is a rare condition that affects approximately 2-4% of the population, with sound-colour synaesthesia affecting 0.2% of people [5,9]. It is a phenomenon that occurs when the stimulation of one sense modality gives rise to sensations in another sense modality [5]. Some people will associate the sound's pitch with a colour, shape, or size, but other aspects of music, such as timbre and musical key, can also associate with the visual sense [3,5,6,32].
With sound-colour synaesthesia, the experience is individual to each person, and there is no single correct match between an auditory and a visual stimulus. These matches are perceived either in the person's "mind's eye" or projected into the external space in front of them [5,29].

Current Tests on Auditory-Visual Associations
Research on synaesthesia and auditory-visual associations has led to the development of numerous tests designed to determine whether an individual experiences a specific type of synaesthetic response or, more generally, how strong a person's associations are between the two senses. The current tests are The Synesthesia Battery [8], Test of Genuineness (TOG-R) [1,16], NeCoSyn method [37], and the Implicit Association Test (IAT) [17,25]. The Synesthesia Battery was selected for this study, as it has a musical pitch-colour association test within its online battery of tests and has a researcher portal for easy management of participant scores [8]. The Synesthesia Battery has been independently tested and validated, and is one of the most used tests in synaesthesia research to date [4]. While it is used to identify certain synaesthetic abilities in participants, it also has the potential to test non-synaesthetes to understand the strength of their associations between the auditory and visual senses [23,36].

Similar Virtual Reality Research
Although there is limited research on VR studies specifically focusing on auditory-visual association tests, related areas have utilised VR environments for similar tests called Stroop tests [12,28,41].
The Stroop Test, a psychological assessment, illustrates the Stroop Effect, showing prolonged response times in colour naming compared to word reading. These tests found a higher reduction of stress in the VR version of the Stroop word-colour test compared to a regular desktop [28], with a follow-up study finding a 30-40% improvement in some metrics when using the VR version of the test [12]. An overview on multi-modality in VR [19] highlights that 85% of multi-modality research in VR reports a positive impact, with only 1% reporting a negative impact. Martin et al. state the potential for more research in cross-modality in VR, with limited research presented in the paper [19]. In one study, it was found that crossmodal interactions between auditory and visual stimuli influence subjective experiences and assessments, including room evaluations in both real and virtual environments [18].
Music, art, and synaesthesia have been sparse research topics when it comes to VR and cross-modal associations. However, research into immersion and synaesthesia highlights seven potentially immersive features, with audiovisual synchronisation and impressive graphics included [30]. Other VR research shows musical timbre visual mappings [31], programs for learning piano [35], understanding of harmony in music [34], and creative expression [15]. In the next section, an experimental study will be presented on comparing two environments when taking a musical pitch-colour association test.

METHODOLOGY
A pilot study was conducted to assess whether merely being present in a VR environment (rather than a conventional desktop setup) affects the way in which participants form auditory-visual associations. The Synesthesia Battery online platform was used to conduct musical pitch-colour association tests in the two environments (for details, see [8]).
The two environments being investigated used an online browser-based test, one in the more standard testing environment using a desktop computer and monitor, and the other being conducted on a VR Head Mounted Display (HMD). Details on experiment design, participants, data collection, and analysis will be provided in the next sections.

Experimental Design
This study investigates musical pitch-colour associations using consistency tests on standard desktop and VR HMD setups. For the desktop, a PC with Windows 10 and a Philips 246V5 monitor were used, set to full brightness and a 6500 K colour temperature. The VR setup employed an Oculus Quest 2 HMD with Oculus Touch controllers, with the display manually calibrated to match the PC monitor's brightness. Both displays used LCD technology, with the monitor in sRGB and the HMD in Rec. 709 colour space [21,22]. Participants used Sony MDR-CD580 headphones, adjusting audio volume to their comfort. Spatial sound was not feasible due to limitations in the predefined Synesthesia Battery test environment.
The experimental design employed a within-subject A/B test. Half of the participants took the desktop test first, followed by the VR test, while the other half followed the reverse order. Participants heard 13 musical notes, each presented 3 times in random order for a total of 39 auditory stimuli, covering a piano scale from middle C (C4, 261.63 Hz) to one octave above (C5, 523.25 Hz). They selected colours associated with the auditory stimuli using a colour picker on the test website (https://synesthete.ircn.jp/home; see Figure 1). A five-minute break with an auditory-visual pattern video helped mitigate potential memory effects from the first test [1]. Participants completed surveys before the first test and after the second test. The experimental procedure, depicted in Figure 2, was followed by all participants, lasting approximately 30-40 minutes.
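The stimulus set and counterbalancing described here can be sketched in a few lines of Python. This is an illustrative sketch only, not the Synesthesia Battery's own implementation: the function names, the use of twelve-tone equal temperament with A4 = 440 Hz, the chromatic reading of the 13 notes, and the alternating order assignment are all assumptions.

```python
import random

A4_HZ = 440.0   # assumed tuning reference
A4_MIDI = 69    # MIDI note number of A4


def note_frequency(midi_note: int) -> float:
    """Equal-temperament frequency of a MIDI note, in Hz."""
    return A4_HZ * 2 ** ((midi_note - A4_MIDI) / 12)


def stimulus_schedule(rng: random.Random, repeats: int = 3) -> list[int]:
    """13 chromatic notes from C4 (MIDI 60) to C5 (MIDI 72),
    each repeated `repeats` times, in shuffled order."""
    notes = list(range(60, 73)) * repeats
    rng.shuffle(notes)
    return notes


def counterbalanced_orders(n_participants: int) -> list[tuple[str, str]]:
    """Alternate desktop-first and VR-first so that each order is used by
    half of an even-sized sample (hypothetical assignment rule)."""
    orders = [("desktop", "vr"), ("vr", "desktop")]
    return [orders[i % 2] for i in range(n_participants)]


schedule = stimulus_schedule(random.Random(0))
orders = counterbalanced_orders(20)
print(len(schedule), round(note_frequency(60), 2), round(note_frequency(72), 2))
# prints: 39 261.63 523.25
```

Under these assumptions the computed endpoints match the C4 and C5 frequencies quoted in the text, and each of the two orders is assigned to 10 of the 20 participants.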

Participants
Twenty participants took part in the study: 80% were male, 15% were female, and 5% identified as non-binary. Of all participants, 70% had little to no VR experience. Eleven participants were in the 25-34 age group, while the 18-24, 35-44, and 45-54 age groups had 4, 3, and 2 participants, respectively. No participants reported having any issues with motion sickness before starting the study. Two participants stated that they had synaesthesia, but not the musical pitch-colour synaesthesia tested in this study. Ethics approval was granted by the UCC Ethics Review Board, and the study was conducted in accordance with the Declaration of Helsinki.

Data Collection
Google Forms collected general demographics and insights on VR experience and testing environments. Data from musical pitch-colour tests were obtained via The Synesthesia Battery website, with a dedicated researcher's portal allowing data export as CSV files. To limit personal data collection, a single account was used to collect all participants' tests, with participants providing their test IDs in the survey form to keep track of which environment was associated with which test. CSV files contain the average score used to measure sound-colour associations (scores below 1.0 are deemed synaesthetic, and scores above 1.0 non-synaesthetic).
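As a rough illustration of how exported scores could be linked back to the testing environment and compared against the 1.0 cut-off, consider the sketch below. The column names `test_id` and `score`, the sample rows, and the helper names are hypothetical placeholders; the actual layout of the Synesthesia Battery export is not described here, and the values are not data from this study.

```python
import csv
import io

SYNAESTHETE_THRESHOLD = 1.0  # cut-off described in the text


def classify(score: float) -> str:
    """Label a score using the 1.0 cut-off. Treating exactly 1.0 as
    non-synaesthetic is an assumption made for this sketch."""
    return "synaesthete" if score < SYNAESTHETE_THRESHOLD else "non-synaesthete"


def scores_by_environment(export_csv: str,
                          environment_by_test_id: dict[str, str]) -> dict[str, list[float]]:
    """Group exported scores by the environment each test ID was taken in,
    using the ID-to-environment mapping gathered in the survey form."""
    grouped: dict[str, list[float]] = {"desktop": [], "vr": []}
    for row in csv.DictReader(io.StringIO(export_csv)):
        environment = environment_by_test_id[row["test_id"]]
        grouped[environment].append(float(row["score"]))
    return grouped


# Placeholder export and survey mapping (made-up values, not study data).
SAMPLE_EXPORT = "test_id,score\nT001,0.82\nT002,1.47\n"
SAMPLE_MAPPING = {"T001": "desktop", "T002": "vr"}

grouped = scores_by_environment(SAMPLE_EXPORT, SAMPLE_MAPPING)
print(grouped, classify(0.82), classify(1.47))
```

Keeping the ID-to-environment mapping separate from the scores mirrors the single-account setup described above, in which the survey form is the only link between a test and its environment.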

Data Analysis
SPSS was used to analyse RQ1 with a Paired T-Test. Based on the insights participants shared during the study, a few further questions were analysed to help develop future studies.
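For readers less familiar with the paired design, the statistic behind a paired t-test can be written out directly. The sketch below is a minimal standard-library illustration, not the SPSS procedure used in the study; the function name, the desktop-minus-VR sign convention, and the sample values are assumptions, and the p-value (which requires the t-distribution) is deliberately left to a statistics package.

```python
import math
import statistics


def paired_t_statistic(desktop: list[float], vr: list[float]) -> tuple[float, float, int]:
    """Return (mean difference, t statistic, degrees of freedom) for
    paired scores, with differences taken as desktop minus VR."""
    if len(desktop) != len(vr) or len(desktop) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [d - v for d, v in zip(desktop, vr)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)      # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    standard_error = sd_diff / math.sqrt(len(diffs))
    return mean_diff, mean_diff / standard_error, len(diffs) - 1


# Placeholder scores for four hypothetical participants (not study data).
mean_diff, t_stat, dof = paired_t_statistic([1.0, 2.0, 3.0, 4.0],
                                            [1.5, 2.0, 2.5, 5.0])
print(mean_diff, round(t_stat, 4), dof)
# prints: -0.25 -0.7746 3
```

Because each participant contributes one difference score, between-participant variability in overall consistency drops out of the comparison, which is the main reason a within-subject design suits a small pilot sample.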
A Fisher's exact test for both VR experience and sound-colour association difficulty was performed to gain further insights, as a chi-squared test might not be useful with sample sizes under 50 [13]. Analysing participants' feedback will help design future tests and VR applications providing associations between auditory and visual senses.
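Fisher's exact test is well suited to small samples because it enumerates every table with the same margins rather than relying on a large-sample approximation. The sketch below shows a two-sided version for a 2x2 table using only the standard library. It assumes the responses have been dichotomised into a 2x2 table (the table dimensions actually analysed in SPSS are not stated), and the counts are placeholders, not study data.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x: int) -> float:
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    p_value = sum(
        p for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)   # small tolerance for float ties
    )
    return min(1.0, p_value)


# Placeholder 2x2 table, e.g. rows = prior VR experience (yes / no),
# columns = preferred environment (VR / desktop). Not study data.
p = fisher_exact_2x2(3, 1, 1, 3)
print(round(p, 4))
# prints: 0.4857
```

For tables larger than 2x2, the same idea applies but the enumeration is more involved, so an established statistics package is the more practical choice.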

RESULTS
To answer RQ1, stated in Section 1, a Paired T-Test was used. Results from the Paired T-Test show a mean difference of -0.06051650 between the desktop and VR test scores, with a significance level of 0.605 (non-significance) on the two-sided p-value. This result does not allow us to reject the null hypothesis, meaning that no significant difference could be found in the test scores between the desktop test environment and the VR test environment.
For additional insights, a Fisher's exact test was conducted to see, firstly, if participants' VR experience and, secondly, their difficulty associating sound to colours had any significant effect on participants' preferred testing environment. For the VR experience, Fisher's exact test shows a significance of 0.010, which indicates that VR experience plays a role in the participants' choice of the preferred testing environment. For the difficulty associating sound and colour stimuli, Fisher's exact test shows a significance of 0.358, which reveals no significant association between the preferred testing environment and the difficulty of the task performed in the test.

DISCUSSION
The Paired T-Test results are consistent with the null hypothesis for RQ1: no significant difference in test scores was found between the two environments. This study represents an initial exploratory step towards understanding the impact of VR on auditory-visual associations. While 12 out of 20 participants felt more comfortable associating the musical notes with colours in the VR environment, the results show no significant differences between the two testing environments in this A/B study design. Given that simple presence in a VR environment (with no interaction beyond moving one's head and using the controllers, and without immersive features such as spatial audio) did not measurably affect musical pitch-colour associations, we now have the opportunity to investigate other immersive features unique to VR. These further explorations may help enhance our understanding of the interplay between the two senses.
To explore participants' perception of their performance in the two environments, and to see whether their preferred environment was connected to how difficult they found associating sound and colour stimuli or to their VR experience, a Fisher's exact test was used. As reported in Section 4, the test shows that there was no significant association between participants' choice of the preferred test environment and how difficult they found it to associate between their auditory and visual senses. Although participants expressed a preference for the Virtual Reality environment, RQ1 revealed no significant difference in performance between the testing environments. Interestingly, participants' perceptions of their performance were not aligned with the difficulty they experienced in associating sound and colour stimuli. This suggests that self-reported measures can be unreliable: they offer interesting insights, but these need to be backed up by rigorously tested methods [1,8].
Analysis using Fisher's exact test highlights a significant association between participants' prior VR experience and their preferred testing environment. Interestingly, 70% of participants had minimal to no VR experience before this study, potentially influencing their environment choice. This suggests that the novelty of VR immersion may sway participants' preferences. However, as VR becomes more ubiquitous, the initial novelty effect wanes, although VR remains a promising tool for training and testing [39]. In the context of auditory-visual association tests, the standard desktop test lacks realism in sound perception. Placing participants in immersive VR environments with high-quality spatial audio could foster more natural sensory associations.

Implications & Limitations
The study reported here found no significant difference between using a standard desktop and a VR HMD when participants were tested on how they form associations between their auditory and visual senses. It provides no support for the hypothesis that using a basic VR environment rather than a desktop environment affects auditory-visual associations and paves the way to investigate other features of VR and how they can be used to improve and strengthen cross-modal associations between our two senses. Further investigation into features such as spatial audio, realistic environments, interactions with objects and the ability to manipulate the size and shape of objects could provide fruitful insights. Research has shown that people associate not just colour brightness, but other visual aspects like saturation and size with auditory features, such as timbre, loudness, and pitch [33]. These aspects will be the focus of our future studies in auditory-visual associations using VR.
Having participants complete both testing environments in the same session is not ideal; although some previous studies have done this, it can lead to memory effects between the two tests. To mitigate the potential memory effects mentioned in Section 3.1, participants watched a video during the break; however, holding separate sessions weeks or months apart would address this more fully. All tests were conducted in a single session to increase participation numbers.
Assessing sound-colour associations in different environments, particularly VR, presents challenges due to device-specific colour management limitations [21,22]. Full colour calibration was impractical on the VR headset, so other steps were taken to keep colour presentation consistent, including using a colour chart on both displays. Both devices used LCD screens with standard colour temperature settings. Prior research underscores the imperative for standardised colour management in VR environments [2,7,10].

CONCLUSION
This pilot study delved into the potential impact of testing environments, namely a VR head-mounted display (HMD) and a traditional desktop setup, on the performance outcomes of a musical pitch-colour association test. A balanced A/B testing approach involving 20 participants was employed for the investigation. Our primary research question, "Does environment (Desktop vs VR HMD) significantly affect how participants perform in a musical pitch-colour association test?" found its answer leaning towards the null hypothesis. The outcomes revealed no significant difference in the performance of participants under the VR and desktop conditions.
Given these findings, our study could not confirm the alternative hypothesis, demonstrating no significant difference in test scores between the two testing environments. These outcomes direct our future investigations toward a more immersive VR application. Recognizing VR's transformative potential in similar research domains (see Section 2.3), it becomes crucial to understand the factors contributing to the effectiveness of auditory-visual association tests in VR settings.

ACKNOWLEDGMENTS
This publication has emanated from research conducted with the financial support of Science Foundation Ireland under Grant number 18/CRT/6222. For the purpose of Open Access, the author has applied a CC BY public copyright licence to any Author Accepted Manuscript version arising from this submission.

Figure 2: Participants followed the steps in a clockwise sequence starting from the top left. The testing procedure lasted 30-40 minutes per participant.