Real-time Slow-motion: A Framework for Slow-motion Without Deviating from Real-time

Real-time and slow-motion are incompatible. This is a fundamental incompatibility based on the relationship between recording and playback. We propose Real-time Slow-motion, a framework that allows for the coexistence of both. In a see-through system consisting of a camera and a head-mounted display (HMD), this framework provides temporal editing between camera and display within the short width of the subjective present (i.e., now): the camera inputs, divided into small durations, are alternately distributed on two timelines, stretched to twice their length on those timelines, and the resulting layers are assigned to the left and right eye displays respectively. The hypothesis is that, under certain conditions, the brain integrates the images from both eyes, resulting in a slow-motion experience that does not deviate from real-time (i.e., users experience real-time events in slow-motion mode). This paper describes the implementation, psychological evaluation experiments, and interviews about the realized experience. The framework of Real-time Slow-motion is then generalized, and its applications and limitations are discussed. This research suggests a new field of "Temporal Editing to Humans."


INTRODUCTION
Real-time and slow-motion are incompatible. This is a fundamental incompatibility in the relationship between recording and playback. The recording-playback process is an invention to create the illusion of motion in past events by sampling a continuous motion at one interval and presenting it at another. If the presentation interval matches the recording interval, the image is displayed at its original speed. Conversely, if the presentation interval is set longer than the recording interval, the continuous motion from the original scene appears slower than it was when recorded, resulting in slow-motion. This is one of the most natural methods of editing that makes use of the relationship between recording and playback, and through it we discover that the recording-playback process is not just a reproduction of movement but can be a technology that reveals aspects of movement undiscoverable through everyday vision, a "lens for movement" so to speak.
However, if two records with different presentation intervals (i.e., different playback speeds) are played back simultaneously, the playback heads (i.e., the times at which the sampling data currently being played back were recorded) of the two records quickly diverge, and the divergence continues to increase (Figure 2). Real-time playback, a system that instantly presents on a display what a sensor has recorded, is a type of playback that presents sampling data at the same time interval as it was recorded, so this discrepancy exists between slow-motion playback and real-time playback. As long as playback is real-time, it cannot be slow-motion, and vice versa. Slow-motion playback is only possible when playing back what was recorded in the past; it is incompatible with real-time in principle. On the other hand, if we move away from these procedural definitions of slow-motion and real-time in terms of recording and playback and grasp the essence of both, can we find a way to resolve the incompatibility between the two? We rewrite the definitions of slow-motion and real-time in the following way.
• "A playback is in real-time" means that the playback head is within a short duration that is perceptually interpreted as "now" in the playback.
• "A playback is in slow-motion" means that the motion in the illusion of motion produced by the playback is slower than it was when it was recorded.
This definition reveals the strategy of fitting slow-motion into the duration of the subjective present.
We propose Real-time Slow-motion as a framework that allows both to coexist. In this framework, we attempt to realize slow-motion that does not diverge from real-time by dividing the input within a short duration into multiple layers, stretching the content on each layer, and presenting the layers collectively to the user. In this research, we implemented the above framework in a video see-through system using cameras and binocular displays. First, a quantitative evaluation using recorded video confirmed that the proposed method can realize a situation in which real-time and slow-motion can be said to coexist. Next, we conducted a user study by applying the proposed method to an actual real-time experience. The interview analysis made clear that the experience of slow-motion, which continues to follow one's real-time motion, partially augmented motion perception, with a strange sensation represented by a user's comment, "a feeling that time has exceeded saturation."
These findings demonstrate a methodology for adding temporal intervention to the real-time experience by partially overcoming structural constraints of recording and playback, using the integrative capacity and temporal properties of cognition, and suggest a new field of "Temporal Editing to Humans."

RELATED WORKS

Subjective present
The practically cognized present is no knife-edge, but a saddle-back, with a certain breadth of its own.
(William James [8]) Although the "present" is often represented in timelines as a point with no width, belonging neither to the future nor to the past, it is known that the present which humans subjectively consider to be the "now" appears as something with width [2][18]. Various criteria have been proposed for this width, depending on how the "now" is interpreted. For example, there is the simultaneity range, within which one cannot distinguish the timing of multiple events, i.e., cannot succeed in a synchrony judgment (SJ) task, and the ordinal range, within which one cannot grasp the ordinal relation of events, i.e., cannot succeed in a temporal order judgment (TOJ) task [17][18]. The simultaneity range is said to be 3 to 5 ms for audition, 30 to 40 ms for vision, and about 10 ms for touch, varying from one modality to another. The ordinal range is said to be 20 to 40 ms regardless of modality [7]. In addition, many cases have been found in these time ranges where not only features such as non-synchrony and order "cannot be recognized," but also where an order that differs from reality "is created." For example, several studies have reported that stimuli to which attention is paid are perceived before stimuli to which attention is not paid, even when presented simultaneously; Titchener (1908) named this phenomenon the law of prior entry [8][22][19]. Furthermore, such illusions of an order that does not actually exist have been studied; famous examples include the "line motion illusion [6]," "reversed apparent motion [5]," and the "flash-lag effect [15]."

Slow-motion in real-time experience
In principle, slow-motion playback is applied to recorded scenes, and countless beautiful practices have taken place in audio-visual arts, such as slow-motion scenes in cinema [3]. Figure 3-(2) illustrates the structure of slow-motion playback. Meanwhile, in various fields, methods relating to slow-motion in real-time experience have been explored. We review these examples one by one, showing the temporal structure of the implemented methods (Figure 3), and then highlight the originality of our method.
Slow-motion replay. Possibly the simplest and most familiar method is scene playback in live-streaming. For example, in live-streaming of sports, a scene is often replayed in slow-motion so that viewers can check it precisely. Such a method can be considered one approach to mixing real-time and slow-motion. On the other hand, Figure 3-(3) shows that the incompatibility between slow-motion and real-time, already mentioned, manifests as an increase in delay during slow-motion replay and a time leap at reset.
Event-driven reset of slow-motion. Several studies and applications implement event-driven speed manipulation in real-time playback. For example, when an event (e.g., speech) is detected, it is immediately played back slowly; when no event is detected, the replay returns to real-time mode [10,14] (Figure 3-(4), (5)). One advantage of this approach is that the experiences of real-time and slow-motion are seamlessly connected, which enables a sequential slow-motion experience related to the current event.
However, the actual deviation from real-time during slow-motion is not resolved, even if the approach can make the user unaware of the deviation.
Additive slow-motion. Some studies introduce slow-motion by adding it on top of normal real-time playback [16,21] (Figure 3-(6)). This enables users to see the image moving at the same speed as in real-time and, in parallel, to see the slow-motion of the motion being focused on. In contrast to such studies, which prepare two separate layers, one for slow-motion and one for real-time, our study is designed as Real-time Slow-motion, in which slow-motion and real-time are not separated.
Augmented motion-perception. There are examples that do not actually stretch the speed of the image but instead intervene in the elements involved in motion perception to create a slow-motion-like sensation: [11] extend our dynamic vision by making stop-motion wearable, and [1] modulate our self-motion perception by editing the optic flow.
Slow-motion effect. During a life-threatening event, we sometimes get the feeling that the world has switched to slow-motion [4,20]. It has also been reported that when one suddenly looks at the hands of a clock, it feels as if the clock has stopped [12,27]. The existence of such fluctuations in time perception itself cannot be ignored when engineering slow-motion into the present.

Binocular integration/rivalry
In human vision, visual information from the retinas of the right and left eyes is integrated to form a unified perception. Binocular integration refers to the process by which the human visual system combines the two slightly different images perceived by each eye into a single, coherent image [25][24][9]. This integration occurs in the brain and is crucial for depth perception and three-dimensional understanding of the environment. It relies on the slight differences between the images from each eye, known as binocular disparity, to gauge the relative distances of objects. When binocular fusion is effective, it provides a richer, more detailed perception of the world; however, if the images from each eye are too different, it can lead to issues like double vision, a conflict referred to as binocular rivalry [23][26][13].
Although the mechanism is still under debate, what we want to emphasize here is that our visual perception system can integrate two different streams of visual information into a unified perception under certain conditions. We expect that this or a similar kind of unifying function will be established with the dichoptic visual information generated by the specific method we describe below.

SYSTEM
We have designed a unique framework for the coexistence of real-time and slow-motion. In addition, using a camera and an HMD, we developed a video see-through system that temporally edits the video captured by the camera and displays it on the display according to the proposed framework. In this section, we describe the framework of this temporal editing and its specific implementation procedure.

Framework
The framework of Real-time Slow-motion using binocular integration consists of the following three steps (Fig. 4).
(1) Distribution: Inputs from the environment are separated on the timeline by a certain segment duration and distributed in sequence on two parallel layers.
(2) Stretching: The distributed input is stretched to be 1/2 times slower in each layer, with the head fixed.
(3) Composition: Assign one layer to the right eye and the other to the left eye. Here, we expect the visual stimuli from both eyes to be integrated in the user's perceptual and cognitive processes.
With this three-step framework, the user sees short slow-motion sequences that are slightly delayed and then repeatedly reset, with the timing of the resets alternating between the left and right eyes. Each sequence is in slow-motion at 0.5x, and the delay from real-time remains within the segment duration.
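As a concrete sketch of these three steps, the mapping from an output frame index to the source frame index shown on each eye can be written as follows (a minimal illustration assuming a frame-indexed stream; `source_frame` and the 12-frame segment length are our own illustrative choices, not the system's actual parameters):

```python
def source_frame(t_out, layer, segment=12):
    """Source frame shown at output frame t_out on the given layer
    (0 = one eye, 1 = the other), for 2 layers at 0.5x speed.

    Each layer holds one segment of `segment` frames stretched to
    twice its length, and the two layers' resets are offset by one
    segment duration.
    """
    period = 2 * segment                 # a layer replays a segment for 2x its length
    start = ((t_out - layer * segment) // period) * period + layer * segment
    return start + (t_out - start) // 2  # 0.5x playback within the segment

# Each eye shows 0.5x slow-motion that is periodically reset to
# real-time, and the delay never exceeds one segment duration.
for t in range(120):
    for eye in (0, 1):
        assert 0 <= t - source_frame(t, eye) <= 12
```

At the reset instants (e.g., output frame 24 on layer 0, frame 12 on layer 1) the source frame equals the output frame, i.e., the layer momentarily returns to real-time.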

Implementation
The implementation consists of an HMD, a camera capable of video capture, and a computer (Fig. 5). Based on our proposed framework, the system performs the following processing. First, the real-time camera image is input into a buffer on the computer. The buffered images are then processed and displayed on the HMD according to the framework. Specifically, the captured image is first split into two layers, with the layer into which the image is split changing every 200 ms (segment duration). Secondly, the segment of video in each layer is stretched to 2.0x its length (i.e., 0.5x speed). This length corresponds to the time until the next segment is input into the layer. Finally, one layer is assigned to the right eye display and the other to the left eye display. In other words, the display for each eye shows 0.5x slow-motion, which is reset to a real-time image every 2.0x the segment duration (400 ms). In addition, there is always an offset of one segment duration (200 ms) between the reset timings of the images on the two displays.
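The delay bookkeeping of this paragraph can be checked numerically; the sketch below (`display_delay` is our own helper, and millisecond sampling is chosen for illustration) reproduces the maximum 200 ms / average 100 ms figures:

```python
SEGMENT_MS = 200          # segment duration used in the implementation
STRETCH = 2.0             # 2.0x length = 0.5x speed

def display_delay(t_out, layer_offset=0.0):
    """Delay (ms) between one eye's displayed content and real-time,
    at output time t_out (ms). layer_offset shifts the reset timing."""
    period = SEGMENT_MS * STRETCH                    # 400 ms between resets
    phase = (t_out - layer_offset) % period          # time since last reset
    return phase / STRETCH                           # content lags by phase/2

delays = [display_delay(t) for t in range(0, 4000)]
assert max(delays) < SEGMENT_MS                      # never reaches 200 ms
avg = sum(delays) / len(delays)
assert abs(avg - SEGMENT_MS / 2) < 5                 # ~100 ms on average
```

The second eye is simply `display_delay(t, layer_offset=SEGMENT_MS)`, so when one display is at its maximum delay, the other is exactly halfway through its cycle.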
In this system, the video delay caused by slow-motion on each display is kept within a short time (maximum 200 ms; average 100 ms), so it can be said that real-time video output is sufficiently realized. Therefore, we define that when the visual experience

PRELIMINARY EXPERIMENT
We conducted an experiment using the method of adjustment to see how participants judge the speed of video modulated by our proposed method. To evaluate the subjective slowness that participants feel from the images during the experience of Real-time Slow-motion, it would be natural to use normal slow-motion playback as a control condition; however, because real-time and slow-motion are incompatible, as we have already described, this control condition cannot be created under real-time conditions. Therefore, in order to perform a quantitative evaluation of the system, we designed an experiment using video recordings instead of real-time camera inputs.
Participants observed edited videos, including ones edited with the method used in the proposed system. Then, they adjusted the speed of the original video to match the speed they felt in the edited video. Our analysis involved a two-way repeated-measures analysis of variance (ANOVA).

Participants
We recruited ten naïve participants (7 males and 3 females; aged 21-24, mean age 21.7 years old, SD = 0.90). No participant had a physical disability. The study protocol was performed in accordance with the Declaration of Helsinki. All participants signed a letter of consent after being provided with an overview of the user study.

Task design
In a two-way repeated-measures design, we manipulated two types of video and four editing methods, encompassing eight conditions in total. Each condition consisted of six trials, in which the participant observed the edited video and then adjusted the speed of the original video to match the speed of the edited video.
We adopted the following two original videos as the types of video conditions (fps: 60, resolution: 1080 × 1080):
(1) video 1: An abstract video in which a green circle rotates around the center point of a grid (Fig. 6 left)
(2) video 2: A photoreal video in which a hand repeatedly closes and opens on a wooden desk (Fig. 6 right)
The reference playback speed (i.e., 1.0x speed) of each video was defined as follows: the speed at which the green circle completes a revolution in six seconds in video 1, and at which the hand closes and opens once in 4 seconds in video 2. We also adopted the following four editing methods:
(1) 1.0x: Original video at 1.0x speed (the same images were presented to both eyes)
(2) 0.5x: Original video at 0.5x speed (the same images were presented to both eyes)
(3) RS: Binocular type of Real-time Slow-motion (different images were presented to the left and right eyes)
(4) RS-blend: Blend type of Real-time Slow-motion (the same images were presented to both eyes)
Here, RS is the proposed method described at the beginning of this section. RS-blend is another editing method using the Real-time Slow-motion framework, in which the two layers were blended with equal transparency in the video. 1.0x and 0.5x are conditions that simply modified the playback speed of the original video. Thus, in all conditions except RS, the same images were presented to both eyes.
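The relation between the RS and RS-blend stimuli can be sketched as follows (an illustrative numpy sketch with hypothetical frame counts; `layer_sources` reproduces the two-layer mapping of the framework):

```python
import numpy as np

def layer_sources(t_out, segment=12, n_layers=2):
    """Source frame referenced by each layer at output frame t_out
    (segments dealt cyclically to the layers, each played at 1/n speed)."""
    period = n_layers * segment
    sources = []
    for layer in range(n_layers):
        start = ((t_out - layer * segment) // period) * period + layer * segment
        sources.append(start + (t_out - start) // n_layers)
    return sources

# Dummy footage: frame i is a uniform image with value i.
frames = np.stack([np.full((8, 8), i, dtype=float) for i in range(200)])

left_src, right_src = layer_sources(100)
rs_left, rs_right = frames[left_src], frames[right_src]      # RS: one layer per eye
rs_blend = 0.5 * frames[left_src] + 0.5 * frames[right_src]  # RS-blend: same mix to both eyes
```

Under RS the two eyes receive frames from different past moments (here frames 98 and 92), while RS-blend shows their equal-transparency mixture to both eyes.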

Procedure
Before the experiment, participants were briefed on the protocol of the task. Then, participants performed two sessions, one for each type of video condition (all participants conducted the video 1 session first). Each session consisted of four blocks, one per editing-method condition, and each block involved six trials, totaling 24 trials per session. The order of the blocks was randomized per participant. Participants took a 10-minute break between sessions.
Participants carried out each trial in the following procedure. Participants observed the edited video for 20 seconds, using the editing method determined for each block. Then, the original video, whose playback speed could be changed, was displayed, and the participants were asked to adjust the speed of the video with the mouse wheel so that it matched the speed of the video they had just observed. The initial speed of this video was set alternately to 0.3x and 1.2x (the order was counterbalanced between participants). After the participants finished adjusting the speed, they took a 5-second break and started the subsequent trial.

Results and Discussion
We collected 48 samples of subjective playback speed per participant. 4 of the 480 samples in total (i.e., 48 samples × 10 participants) were excluded from the analysis because of operational errors by the participants. The average of the samples per condition was recorded as the subjective playback speed of the participant in that condition. The average subjective playback speed across participants differed between the four editing methods (Fig. 7). We tested normality using the Shapiro-Wilk test for all eight conditions. Non-normality was not indicated in any condition (two types of video × four editing methods; p > .05 in all conditions). Then, we conducted a two-way repeated-measures ANOVA with the within-participant factors of video type and editing method. The main effect of editing method was followed up with pairwise comparisons using Shaffer's modified sequentially rejective Bonferroni procedure (MSRB procedure). This post hoc analysis showed significant differences between all levels (0.5x < RS < RS-blend < 1.0x, Figure 7).
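The analysis pipeline can be sketched on synthetic data (everything below is illustrative: the response values are fabricated for the sketch, and Holm's step-down correction stands in for Shaffer's MSRB procedure, which has no common off-the-shelf implementation):

```python
import numpy as np
from scipy.stats import shapiro, ttest_rel

rng = np.random.default_rng(0)
methods = ["1.0x", "0.5x", "RS", "RS-blend"]
true_speed = {"1.0x": 1.0, "0.5x": 0.5, "RS": 0.7, "RS-blend": 0.85}

# 10 participants x 4 editing methods (one video type, for brevity)
data = {m: true_speed[m] + rng.normal(0, 0.05, size=10) for m in methods}

# Normality check per condition (Shapiro-Wilk)
normality_p = {m: shapiro(data[m]).pvalue for m in methods}

# Pairwise comparisons between editing methods, Holm step-down corrected
pairs = [(a, b) for i, a in enumerate(methods) for b in methods[i + 1:]]
pvals = [ttest_rel(data[a], data[b]).pvalue for a, b in pairs]
significant, still_testing = {}, True
for rank, idx in enumerate(np.argsort(pvals)):
    adjusted = pvals[idx] * (len(pairs) - rank)   # Holm's adjusted p-value
    still_testing = still_testing and adjusted < 0.05
    significant[pairs[idx]] = still_testing       # step-down: stop at first failure
```

With well-separated condition means, all six pairwise comparisons come out significant, mirroring the ordering 0.5x < RS < RS-blend < 1.0x reported above.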
From the preliminary experiment, we found that our proposed framework provided participants with a slow-motion experience while keeping the delay within a short range of the original temporal position. The delay of the output of the proposed system with respect to the input is only 250 ms at most. Nevertheless, we confirmed that participants perceived the conditions in which Real-time Slow-motion was applied as significantly slower than the condition played back at 1.0x speed. These results suggest that the proposed system has the potential to provide users with a sense of slow-motion, even though it presents information with little delay from the real-time input.
In our experiment, we compared two different editing methods adopting the Real-time Slow-motion framework (i.e., RS and RS-blend).As a result, we found that participants felt RS was significantly slower than RS-blend in most cases.

WORKSHOP STUDY
This section describes an interview-based evaluation of the experience of viewing physical reality rather than recordings through Real-time Slow-motion.
We conducted this workshop study with the same participants as the preliminary experiment (7 males and 3 females; aged 21-24, mean age 21.7 years old, SD = 0.90). No participant had a physical disability. The study protocol was performed in accordance with the Declaration of Helsinki. All participants signed a letter of consent after being provided with an overview of the user study.

Task design and procedure
The participant sat in a chair facing a table and was free to observe their hands and the objects on the table, such as cups, tissues, and apples, through the Real-time Slow-motion system. Throughout the experiment, participants could adjust the segment duration (SD) using a knob on the table. Each participant experienced the system for at least 5 minutes and could continue for up to 15 minutes until they were satisfied.

Results and Discussion
In this section, we present notable comments that were collected during the interviews, grouped according to common themes. The participants are referred to as P1-P10; they experienced the Real-time Slow-motion system separately.
Coexistence of Real-time and Slow-motion
All participants mentioned with surprise the coexistence of real-time and slow-motion, or more precisely, the phenomenon of movement in front of them that appeared slow but did not deviate from real-time.
P2, P6: "This experience is real-time and slow-motion." P3: "The slowed-down movement of my body keeps catching up with the real-time movement of my body, and I am being tricked." P3: "Physical environment is being slowed down while keeping to be the physical environment." Besides the comments referring to the respective characteristics of real-time and slow-motion, some participants also reported a certain unique sensation that cannot be reduced to either property.
P8: "I felt a contradictory sensation as if something that was originally too big to fit into the room had been put into the room." P3: "The phenomenon of slowing down all the time but not deviating from reality reminded me of the "Shepard scale" (which sounds like it keeps rising in pitch but stays in the audible range all the time)." P10: "I got a feeling that time has exceeded saturation." These comments share a common feeling, accompanied by the contradictory nuance of "an amount that should not be there," somehow caused by the coexistence of real-time and slow-motion, which do not usually coexist. Further investigation is required to determine the nature of this confusion. It could be conceptual confusion, as slow-motion should diverge from real-time but does not in this experience. Alternatively, it may be cognitive confusion caused by the continued betrayal of positional predictions due to the difference between the instantaneous velocity and the velocity over a time span wider than the segment duration.

Augmentation in terms of capability
Many participants reported experiencing sensory augmentation when slow-motion was introduced into their physical environment.
P2: "I am now able to recognize movement on (temporal) scales that are normally invisible." P9: "The temporal resolution was improved in that it was easier to see high-speed movements, but I also felt that the spatial resolution was improved, as the sensitivity to the movement of a small portion of tissue or the thin tip of a finger was also improved." In addition to real-time sensory enhancements, some participants reported improvements in their memory of the experience.
P5: "A denser memory than usual was formed about the movement." P4: "It felt like normal slow-motion (the slow-motion we experience daily in films) in my memory." On the other hand, limitations of the proposed system were also pointed out. P1: "When I move my arms quickly, the image appears to double and the number of fingers appears to increase." P3: "There is a small delay, which is not a problem for everyday tasks, but it is doubtful that this system will lead to good results in sports and other situations where fast reactions are required." These comments suggest that the proposed method allows for the perception of motor information that is not accessible through normal vision. However, it is unclear to what extent the resulting sensory augmentation can enhance the user's motor abilities, due to limiting factors of the algorithm.
Adaptation to the system
Several participants commented on their adaptation to the system. P3: "At the beginning of the experience, it was hard, like wearing strong glasses with thick lenses, but I gradually got used to it." P2: "I felt a slight headache, but it eased progressively." During the initial minutes of the experience, participants tended to feel uncomfortable with the misalignment of images between their two eyes and the coexistence of real-time and slow-motion. However, they gradually became accustomed to these conditions.
One participant reported a slight decrease in the slow-motion effect with adaptation. This is an interesting phenomenon, as it may indicate an acclimation process to an edited perceptual timeline, or it could be a limitation of the system.
P3: "I feel that the slow-motion effect is somewhat diminished as I adapt."
Difference in experience by Segment Duration
Once familiar with the system, many participants enjoyed the dynamic slowing down of the perceived speed by setting the SD to a larger value of 1.5 or more.
P10: "I rather like the larger SD. The images seem to double when you are aware of them, but when you focus vaguely on the motion, you can perceive them as a single coherent perception of the motion." P1: "When looking at an image very close to the eye, the left and right images are very different, but one can also forget the difference. The experience when the duration is wide is similar to that." In our preliminary experiments, we selected a small SD to avoid significant differences between the left and right images, as we expected them to form a single perceptual image, similar to stereopsis. However, these comments suggest that even when there are noticeable differences between the left and right images, conscious observers can still accept a bold temporal edit by focusing on the motion. Further examination is required to determine whether the brain fully integrates the images from both eyes or selectively uses information from each eye.

GENERAL DISCUSSION
The above two experiments showed that the proposed system can generate a slow-motion experience without deviating from real-time, and suggested that when the system is introduced into an actual real-time experience, a strange sensation of real-time and slow-motion coexisting is generated, which contributes to a qualitative extension of motion perception.
In this section, we generalize the idea of Real-time Slow-motion as a framework and show how the generalized framework can be deployed in various implementations. Then, we explain the limitations of this paper.

Generalization of Framework
Up to this point, we have considered the framework of Real-time Slow-motion, which folds slow-motion within the width of the subjective present, limited to systems that use binocular integration. Here, we generalize this idea and propose it again. The framework of Real-time Slow-motion already presented can be generalized as follows.
Real-time Slow-motion has the following framework (Figure 8):
(1) Distribution: Inputs from the environment are separated on the timeline by a certain segment duration and distributed in sequence on N parallel layers.
(2) Stretching: The distributed input is stretched to be 1/N times slower in each layer, with the head fixed.
(3) Composition: All layers are juxtaposed and output simultaneously so that each of the layers is equally perceptible. Here, we expect the stimuli from all layers to be integrated in the user's perceptual and cognitive processes.
On the other hand, this framework can be rewritten as the following bijective function. An event at time t appears in the l-th layer at output time f(t):

f(t) = kS + N(t − kS)

where N is the number of layers, S is the segment duration, and t is written t = kS + τ (0 ≤ τ < S) with the layer index determined by cyclicity, l ≡ k (mod N). In another expression, at time t, the l-th layer refers to the input time kS + (t − kS)/N, so that

Real-time Slow-motion : ℝ → ℝ^N.

The generalized framework of Real-time Slow-motion has the following characteristics:
Slow-motion property: All stimuli in each layer are modified to 1/N speed.
Real-time property: The slowed content on each layer is re-initialized to real-time every NS. Therefore, the temporal distance d between the input and output stimulus is kept within 0 ≤ d < S(N − 1), and its average is S(N − 1)/2. If we focus on the layer with the lowest delay, the delay is kept within 0 ≤ d < S(1 − 1/N).
Bijectivity: All inputs are assigned to exactly one of the layers.
Continuity: Although there is a break when the segment of a layer is reset (i.e., the output returns to real-time sensor data), the reset timings are displaced from each other. Namely, when one layer is reset, the outputs of the other (N − 1) layers are always continuous.
Equal-interval: The N parallel layers at a certain moment correspond to input sampled at the equal interval S(N − 1)/N.
Symmetry: The layers are symmetrical in the framework. The rank in delay from the original temporal position is also sequentially switched, and the average delay of every layer is equal.
Cyclicity: If we focus on the time difference between the present and the output of each layer, the framework can be represented as changing periodically with period NS. In addition, considering the above symmetry, the period of the framework as a whole is S.

Expandability of framework
Humans have the innate ability to integrate multiple layers and perceive them as one thing, with the multiple layers sharing their roles in a single experience. Such multi-layered integration processes exist in various modalities and at various levels. The implementation of our framework has variations depending on which integration process we focus on.
6.2.1 Modality. This paper described implementations for visual experiences, but the framework may apply to other sensory modalities (e.g., audio, tactile, and haptic sensations), and cross-modal implementation using inter-sensory integration may also be possible. However, to provide an experience consistent with the Real-time Slow-motion concept (i.e., a slow-motion experience that does not deviate from real-time), the modality must have sufficiently high temporal resolution to perceive temporal changes in stimuli. In addition, to perceive multiple layers simultaneously, the modality needs to be capable of multi-channel perception.
The interface specification has similar requirements. Our framework splits temporally continuous input into multiple layers, edits them, and outputs the multiple layers simultaneously in a multi-channel or spatially aligned format. Therefore, the implementation of Real-time Slow-motion requires a sensor with sufficient temporal resolution and a display capable of outputting multiple channels.

Presenting methods.
There are variations in the presenting methods, depending on which level of integration capability is used. As representative examples, we show two directions below.

(1) Integration at the perceptual level: The first direction is to utilize perceptual-level integration of inputs from multiple sensory organs within the same modality. For example, the implementation using binocular integration described in this paper, and an auditory version using both ears instead of both eyes, belong to this direction. These types have the potential to establish a single image at the perceptual level and can present a perceptual state close to the user's daily experience, making the system more transparent.

(2) Integration at the cognitive level: The second direction is not to hide the explicitness of multiple layers at the perceptual stage, but to achieve their integration at the cognitive level. For example, the prototype shown in Figure 9 belongs to this category. Here the number of layers is chosen to be 7. In the central window, seven layers of Real-time Slow-motion are overlaid with 1/7 transparency each. The seven windows around it are composed of the same seven layers arranged in a circular pattern. It is clear that there are multiple layers of images in both the central and the outer windows, but a user who is conscious of one continuous motion can obtain a cohesive Real-time Slow-motion experience from them. Unlike the implementation using integration at the perceptual level, this implementation with explicitly non-ordinary screens cannot give the user the impression that "only the speed has changed." However, since it does not depend on the number of sensory organs or their temporal characteristics, parameters such as L and S in the framework can be set to be large. In addition, due to the equal-interval characteristic, the layers displayed simultaneously represent equidistant moments in the past. Thus, this method can also be regarded as representing the movement from the past to the present in a single frame, just like a cartoon.
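The layer scheduling common to both presentation directions can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes discrete input frames, a round-robin segment assignment, and frame repetition as the stretching method, and it does not model when each stretched segment begins playing relative to capture.

```python
def schedule_layers(num_frames, num_layers, segment_len):
    """Distribute input frames into segments round-robin across layers,
    stretching each segment by the layer count (each frame repeated),
    so every layer plays at 1/num_layers of the original speed while
    each layer's total output duration equals the input duration."""
    layers = [[] for _ in range(num_layers)]
    for start in range(0, num_frames, segment_len):
        target = (start // segment_len) % num_layers
        for frame in range(start, min(start + segment_len, num_frames)):
            layers[target].extend([frame] * num_layers)  # stretch
    return layers

# Two layers, segments of two frames, eight input frames
# (the binocular case: one layer per eye):
layers = schedule_layers(8, 2, 2)
# layers[0] -> [0, 0, 1, 1, 4, 4, 5, 5]
# layers[1] -> [2, 2, 3, 3, 6, 6, 7, 7]
```

Note that each layer's output is exactly as long as the input, which is the sense in which the slow-motion playback does not deviate from real-time; the seven-layer cognitive-level prototype corresponds to `schedule_layers(..., num_layers=7, ...)`.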

Limitation
This section outlines the limitations of the study in three layers: the evaluation methodology of the psychological experiments, the implementation using binocular integration, and the Real-time Slow-motion framework itself.

Limitation of evaluation methodology.
In this study, we acknowledge a limitation in the evaluation methodology for the proposed method. In the quantitative evaluation conducted in the preliminary experiment, we chose the task of reproducing, in the playback of recordings, the same 'speed' as in the experience of the proposed method; however, there were individual differences in how the quantity referred to here as 'speed' is perceived. Speed is a sensation that can arise from a variety of sources, ranging from direct perception in the lower visual cortex to detection by inference from changes in position. Therefore, the subjective identification of speed depends on which sources of information participants unconsciously or consciously focus on. Specifically, the instantaneous moving speed of each visual stimulus is 0.5 times the original speed, while the average speed of the stimulus in perception is the same as the original speed. Therefore, individual differences in what participants rely on when identifying speed may affect the evaluation of the system. This is possibly the reason why the average subjective speed was distributed around the middle between the instantaneous and the global speed. In order to clarify the detailed impact of the proposed method on human perception of time and motion, and the mechanism of that impact, it is necessary to consider an index that is less susceptible to individual differences in speed interpretation.
On the other hand, the term 'time estimation' refers to the inference of time over a longer span (more than a few seconds), as opposed to time perception. In the preliminary experiment, some participants commented that they felt the elapsed time was longer during their experience with the proposed method, suggesting that the proposed system may affect not only time perception but also time estimation. This aspect will also be investigated further.
Limitation of using binocular integration.
The use of binocular integration, or integration of paired senses, in Real-time Slow-motion has limitations. This approach expects the integration of two images into one perceptual image, but this may not always be possible, depending on what the user sees through the system. For instance, when users perceived rapid movement, they reported that the perceptual image explicitly became double. Additionally, for the images of both eyes to be integrated, the differences between the images must be small, and the time difference between the two images must also be small. It should be noted that further experimentation is required to determine whether this integration occurs at the same level as binocular integration in stereoscopic vision. To be precise, this is an interesting open question to be tackled, rather than a limitation of the method. Future studies would reveal whether integration occurs in low-level visual systems or at higher levels. In either case, to understand the phenomena, we need to study the relationship with known visual phenomena. This limitation restricts the presentation time of successive slow-motion sequences and, consequently, the slow-motion effects, compared to the implementation using cognitive-level integration. However, it is worth noting that in the workshop study, several participants enjoyed the proposed method even when the images of the two eyes were explicitly different.

Limitations of the framework.
Our framework is a stationary system that does not depend on input contents, but whether it works effectively for the user depends on the content. For example, if the input stimulus changes dramatically in time, the original continuity is lost and the information becomes incomprehensible. One solution to this limitation could be to adjust the internal parameters of the system to the input stimuli. However, this solution would compromise the critical characteristics of the framework (e.g., equal-interval, symmetry, and cyclicity). These problems may be alleviated if the user adapts sufficiently to the situations generated by our system through long-term use. The two parameters, L and S, each involve a trade-off. According to the slow-motion property, because the playback speed of each layer is 1/L, the larger L is, the slower the experience that can be provided to the user. However, as L is also the number of layers the user has to integrate at a time, it may impose more load on them. Also, while S does not affect the playback speed of each layer itself, it is related to the duration for which continuous slow-motion is presented in each layer. Therefore, too short an S may compromise the user's slow-motion experience. Conversely, a larger S makes a larger offset between the real-time environment and the system output, and the output involves a collapse of the temporal order of events over a more extended period.
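These two trade-offs can be made concrete with a back-of-envelope sketch. The helpers below are hypothetical, using the parameters L and S named earlier as the layer count and the segment duration, and assuming that a segment of duration S stretched L times overruns its real-time slot by (L - 1) * S; the numeric values are illustrative only.

```python
def layer_playback_speed(num_layers):
    # slow-motion property: each layer plays at 1/L of the original speed
    return 1.0 / num_layers

def segment_overrun(num_layers, segment_duration):
    # a segment of duration S stretched L times lasts L * S,
    # overrunning its real-time slot by (L - 1) * S
    return (num_layers - 1) * segment_duration

# Binocular implementation: L = 2 layers, e.g. S = 0.5 s segments
assert layer_playback_speed(2) == 0.5   # half speed per layer
assert segment_overrun(2, 0.5) == 0.5   # 0.5 s offset from real time
# Larger L: slower experience, but more layers to integrate at once
assert layer_playback_speed(7) == 1 / 7
# Larger S: longer continuous slow-motion, but a larger offset
assert segment_overrun(2, 2.0) == 2.0
```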

CONCLUSIONS
In this paper, we proposed Real-time Slow-motion, a framework that realizes a slow-motion experience without deviating from real-time. First, we proposed a specific method and implementation using the integration of paired eyes. Second, we conducted psychological evaluation experiments and a workshop study to investigate what user experience the proposed method provides. These investigations suggested that the proposed method realizes a slow-motion experience without deviating from real-time, and that it leads to a unique sensation combining real-time and slow-motion, which are normally incompatible. Finally, we extended the framework of Real-time Slow-motion and confirmed its applicability to implementations with various modalities and presentation methods. We hope that this research will suggest the perspective of "Temporal Editing to Humans" and help make engineering approaches to time more vivid.

Figure 2: Slow-motion playback deviates from real-time playback; Real-time and slow-motion are incompatible.

Figure 4: Real-time Slow-motion using binocular integration

Figure 5: The hardware construction of the see-through goggles for Real-time Slow-motion

Figure 6: The snapshots of the videos used in the evaluation.

Figure 7: Results of the preliminary experiment. Left: Box plot of the subjective speed in each two-way condition. The blue triangle represents the mean subjective speed across participants. The y-axis represents the playback speed adjusted by the participant in the task, divided by the reference playback speed of each video. Right: Bar plot of the subjective speed for each editing method. Error bars represent the standard error. The subjective speed was slowest for the 0.5x condition, followed by RS, RS-blend, and 1.0x in that order. *p < .05, **p < .01, ***p < .001.

Figure 8: The generalized framework of Real-time Slow-motion.

Figure 9: An example of Real-time Slow-motion by integration on the cognitive level