* Copyright Disclaimer: paper preprints in this page are provided only for personal academic uses, and not for redistribution.
Journal and Conference Papers
Background: Social anxiety disorder (SAD) is a fear of social situations where a person anticipates being evaluated negatively. Changes in autonomic response patterns are related to the expression of anxiety symptoms. Virtual reality (VR) sickness can inhibit VR experiences. Objective: This study predicts the severity of specific anxiety symptoms and VR sickness in patients with SAD using machine learning based on in-situ autonomic physiological signals (heart rate and galvanic skin response) during VR treatment sessions. Methods: This study had 32 participants with SAD taking part in six VR sessions. During each VR session, all participants’ heart rate and galvanic skin response were measured in real-time. We assessed specific anxiety symptoms using the Internalized Shame Scale (ISS) and the Post-Event Rumination Scale (PERS), and VR sickness using the Simulator Sickness Questionnaire (SSQ), during four VR sessions (#1, #2, #4 and #6). Logistic regression, random forest, and naïve Bayes classifiers were used to classify and predict the severity groups in the ISS, PERS, and SSQ subdomains based on in-situ autonomic physiological signal data. Results: The severity of social anxiety disorder was predicted with three machine learning models. According to the F1 score, the highest prediction performance within each domain for severity was as follows: The F1 score of the ISS mistake anxiety subdomain was 0.8421 using the logistic regression model, the PERS positive subdomain was 0.7619 using the naïve Bayes classifier, and the total VR sickness was 0.7059 using the random forest model. Conclusions: This study could predict specific anxiety symptoms and VR sickness during VR intervention by autonomic physiological signals alone in real-time. Machine learning models predict individuals' severe and non-severe psychological states based on in-situ physiological signal data during VR intervention for real-time interactive services. 
These models support the diagnosis of specific anxiety symptoms and VR sickness with minimal participant bias. Clinical Trial: CRIS Registration Number-KCT0003854.
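The F1 scores reported above combine precision and recall for a binary severity label. As a minimal sketch of how such a score is computed, assuming labels coded as 1 (severe) and 0 (non-severe), the function below is a generic illustration and not the study's evaluation code; the function name and the label coding are assumptions.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class `positive`. Label coding is an assumption."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because F1 ignores true negatives, it is a common choice when the severe group is the class of interest.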
Background: Although it has been well demonstrated that the efficacy of VR therapies for social anxiety disorder (SAD) is comparable to traditional cognitive-behavioral therapy, little is known about the effect of VR on the pathological self-referential processes in SAD. Objective: This study aims to determine the changes in self-referential processing and their neural mechanisms following VR treatment. Methods: We obtained scans from 25 participants with a primary diagnosis of SAD. Then, the subjects received VR-based exposure treatment starting immediately after the baseline MRI scan and clinical assessments and continuing for six sessions. Eventually, 21 SAD subjects completed follow-up scans after the sixth session of VR therapy in which the subjects were asked to judge whether a series of words (positive, negative, neutral) was relevant to themselves. Twenty-two age-, sex-, and handedness-matched controls also underwent baseline clinical assessments and fMRI scans. Results: The whole-brain analysis revealed that compared with the controls, the SAD group had increased neural responses during positive self-referential processing in the medial temporal and frontal cortexes. This group also showed increased left insular activation and decreased right middle frontal gyrus activation during negative self-referential processing. After undergoing VR-based therapy, the subjects with SAD rated negative words as less relevant (P = .066) and positive words as more relevant (P = .064) to themselves at the postintervention session than at baseline. Their overall symptoms, as measured with the Social Phobia Scale (SPS) and Post-Event Rumination Scale (PERS), were reduced accordingly. We also found that these subjects displayed greater activity in a group of brain regions responsible for self-referential and autobiographical memory processes while viewing positive words at the postintervention fMRI scan. 
Compared with that at baseline, higher activation was found within broad somatosensory areas of the subjects with SAD during negative self-referential processing following VR therapy. Conclusions: The current fMRI findings reflect the enhanced physiological and cognitive processing of individuals with SAD in response to self-referential information. They also provide neural evidence of the effect of VR exposure therapy on social anxiety and self-derogation. Clinical Trial: CRIS Registration Number-KCT0003854
The single-pair all-shortest-path problem is to find all possible shortest paths, given a single source-destination pair in a graph. Due to the lack of efficient algorithms for the single-pair all-shortest-path problem, many applications have used diverse modifications of existing shortest-path algorithms such as Dijkstra’s algorithm. Such approaches can facilitate the analysis of medium-sized static networks, but their heavy computational cost impedes their use for massive and dynamic real-world networks. In this paper, we present a novel single-pair all-shortest-path algorithm, which performs well on massive networks as well as dynamic networks. The efficiency of our algorithm stems from novel 2-hop label-based query processing on large networks. For dynamic networks, we also demonstrate how to incrementally maintain all shortest paths in 2-hop labels, which allows our algorithm to handle topological changes of dynamic networks such as the insertion or deletion of edges. We carried out experiments on large real-world datasets, and the results confirm the effectiveness of our algorithms for the single-pair all-shortest-path computation and the incremental maintenance of 2-hop labels.
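To make the problem statement concrete, the sketch below enumerates all shortest paths between one source-destination pair with a plain breadth-first search that records every predecessor lying on a shortest path. It is the kind of traversal-based baseline the abstract contrasts against, not the 2-hop label-based algorithm of the paper, and it assumes an unweighted graph given as a hypothetical adjacency dictionary.

```python
from collections import deque


def all_shortest_paths(adj, source, target):
    """Enumerate every shortest path from source to target in an
    unweighted graph (adjacency dict: node -> iterable of neighbors)."""
    dist = {source: 0}
    preds = {source: []}          # predecessors on some shortest path
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:                 # first time v is reached
                dist[v] = dist[u] + 1
                preds[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:      # another equally short route
                preds[v].append(u)
    if target not in dist:
        return []
    # Walk the predecessor lists backwards from the target.
    paths = []
    stack = [(target, [target])]
    while stack:
        node, suffix = stack.pop()
        if node == source:
            paths.append(suffix)
            continue
        for p in preds[node]:
            stack.append((p, [p] + suffix))
    return sorted(paths)


if __name__ == "__main__":
    # Hypothetical diamond-shaped graph with two shortest a-d paths.
    graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
    print(all_shortest_paths(graph, "a", "d"))
```

Each query here re-traverses the graph, which is the cost that a label-based index is meant to avoid.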
Background: Social anxiety disorder (SAD) is characterized by excessive fear of negative evaluation and humiliation in social interactions and situations. Virtual reality (VR) treatment is a promising intervention option for SAD. Objective: The purpose of this study was to create a participatory and interactive VR intervention for SAD. Treatment progress, including the severity of symptoms and the cognitive and emotional aspects of SAD, was analyzed to evaluate the effectiveness of the intervention. Methods: In total, 32 individuals with SAD and 34 healthy control participants were enrolled in the study through advertisements on online bulletin boards at universities. A VR intervention was designed consisting of three stages (introduction, core, and finishing) and three difficulty levels (easy, medium, and hard) that could be selected by the participants. The core stage was the exposure intervention in which participants engaged in social situations. The effectiveness of treatment was assessed through the Beck Anxiety Inventory (BAI), State-Trait Anxiety Inventory (STAI), Internalized Shame Scale (ISS), Post-Event Rumination Scale (PERS), Social Phobia Scale (SPS), Social Interaction Anxiety Scale (SIAS), Brief-Fear of Negative Evaluation Scale (BFNE), and Liebowitz Social Anxiety Scale (LSAS). Results: In the SAD group, scores on the BAI (F=4.616, P=.009), STAI-Trait (F=4.670, P=.004), ISS (F=6.924, P=.001), PERS-negative (F=1.008, P<.001), SPS (F=8.456, P<.001), BFNE (F=6.117, P=.004), KSAD (F=13.259, P<.001), and LSAS (F=4.103, P=.009) significantly improved over the treatment process. Compared with the healthy control group before treatment, the SAD group showed significantly higher scores on all scales (P<.001), and these significant differences persisted even after treatment (P<.001). In the comparison between the VR treatment responder and nonresponder subgroups, there was no significant difference across the course of the VR session. 
Conclusions: These findings indicated that a participatory and interactive VR intervention had a significant effect on alleviation of the clinical symptoms of SAD, confirming the usefulness of VR for the treatment of SAD. VR treatment is expected to be one of various beneficial therapeutic approaches in the future. Trial Registration: Clinical Research Information Service (CRIS) KCT0003854.
With proper guidance, virtual reality (VR) can provide psychiatric therapeutic strategies within a simulated environment. The visuo-haptic-based multimodal feedback VR solution has been developed to improve anxiety symptoms through immersive experience and feedback. A proof-of-concept study was performed to investigate this VR solution. Nine subjects recently diagnosed with panic disorder were recruited, and seven of them eventually completed the trial. Two VR sessions were provided to each subject. Depression, anxiety, and VR sickness were evaluated before and after each session. Although there was no significant effect of the VR sessions on psychiatric symptoms, we could observe a trend of improvement in depression, anxiety, and VR sickness. The VR solution was effective in relieving subjective anxiety, especially in panic disorder without comorbidity. VR sickness decreased over time. This study is a new proof-of-concept trial to evaluate the therapeutic effect of VR solutions on anxiety symptoms using visuo-haptic-based multimodal feedback simultaneously.
In this paper, we propose the first end-to-end convolutional neural network (CNN) architecture, Defocus Map Estimation Network (DMENet), for spatially varying defocus map estimation. To train the network, we produce a novel depth-of-field (DOF) dataset, SYNDOF, where each image is synthetically blurred with a ground-truth depth map. Due to the synthetic nature of SYNDOF, the feature characteristics of images in SYNDOF can differ from those of real defocused photos. To address this gap, we use domain adaptation that transfers the features of real defocused photos into those of synthetically blurred ones. Our DMENet consists of four subnetworks: blur estimation, domain adaptation, content preservation, and sharpness calibration networks. The subnetworks are connected to each other and jointly trained with their corresponding supervisions in an end-to-end manner. Our method is evaluated on publicly available blur detection and blur estimation datasets, and the results show state-of-the-art performance.
Smartphone users often want to customize the positions and functions of physical buttons to accommodate their own usage patterns; however, this is unfeasible for electronic mobile devices based on COTS (Commercial Off-The-Shelf) due to high production costs and hardware design constraints. In this letter, we present the design and implementation of customized virtual buttons that are localized using only common built-in sensors of electronic mobile devices. We develop sophisticated strategies firstly to detect when a user taps one of the virtual buttons, and secondly to locate the position of the tapped virtual button. The virtual-button scheme is implemented and demonstrated in a COTS-based smartphone. The feasibility study shows that, with up to nine virtual buttons on five different sides of the smartphone, the proposed virtual buttons can operate with greater than 90% accuracy.
Augmented reality (AR) overlays virtual information on the real-world medium and is emerging as an important type of information visualization technique. As such, the visibility and readability of the augmented information must be as high as possible amidst the dynamically changing real-world surroundings and background. In this work, we present a technique based on image saliency analysis to improve the conspicuity of the foreground augmentation against the background real-world medium by adjusting the local brightness contrast. The proposed technique is implemented on a mobile platform, considering the usage nature of AR. The saliency computation is carried out for the augmented object’s representative color rather than for all the pixels, and the search and adjustment cover only a discrete number of brightness levels to find the one that produces the highest contrast saliency, thereby making real-time computation possible. While the resulting imagery may not be optimal due to such a simplification, our tests showed that the visibility was still significantly improved without much difference from the optimal ground truth in terms of correctly perceiving and recognizing the augmented information. In addition, we also present another experiment that explores how the proposed algorithm can be applied in actual AR applications. The results suggested that the users clearly preferred the automatic contrast modulation upon large movements in the scenery.
Abnormal messages propagated from faulty operations in a vehicular system may severely harm the system, but they cannot be easily detected when their information is not known in advance. To support efficient detection of faulty message patterns propagated in the in-vehicle network, this paper presents a novel graph pattern matching framework built upon message log-driven graph modeling. Our framework models the unknown condition as a query graph and the reference database of normal operations as data graphs. The analysis of faulty message propagation requires considering the sequence of events in the distance measure, and thus, the conventional graph distance measures cannot be directly used for our purpose. We hence propose a novel distance metric based on the maximum common subgraph (MCS) between two graphs and the sequence numbers of messages, which works robustly even for abnormal faulty patterns and can avoid false negatives in large databases. Since the problem of MCS computation is NP-hard, we also propose two efficient filtering techniques, one based on the lower bound of the MCS distance for a polynomial-time approximation and the other based on edge pruning. Experiments performed on real and synthetic datasets to assess our framework show that it significantly outperforms previously existing methods in terms of both performance and accuracy of query responses.
Cellular internet-of-things (CIoT) systems were recently developed by the third-generation partnership project (3GPP) to support internet-of-things (IoT) services over the conventional mobile-communication infrastructures. The CIoT systems allow a large number of IoT devices to be connected through the random-access procedure, but the concurrent accesses of the massive devices make this procedure heavily competitive. In this article, we present an effective time-division random-access scheme built upon the coverage levels (CLs), where each CIoT device is assigned a CL and categorized based on its radio-channel quality. In our scheme, the random-access loads of device groups having different CLs are distributed into different time periods, which greatly relaxes instantaneous contention and improves random-access performance. To assess the performance of our scheme, we also introduce a mathematical model that expresses and analyzes the states and behaviors of CIoT devices using a Markov chain. Mathematical analysis and simulation results show that our scheme significantly outperforms the conventional scheme (without time-division control) in terms of collision probability, succeeding access rate, and access-blocking probability.
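The intuition for why spreading device groups over separate time periods relaxes contention can be illustrated with a simple closed-form estimate. The sketch below is not the Markov-chain model of the article; it assumes a single access attempt per device, uniform and independent preamble selection, and no retransmission or backoff, and the device and preamble counts are hypothetical.

```python
def collision_probability(num_devices, num_preambles):
    """Probability that a given device picks the same preamble as at
    least one other device, under uniform independent selection."""
    if num_devices <= 1:
        return 0.0
    return 1.0 - (1.0 - 1.0 / num_preambles) ** (num_devices - 1)


def time_divided_collision_probability(group_sizes, num_preambles):
    """Device-weighted average collision probability when each group
    contends only within its own time period."""
    total = sum(group_sizes)
    if total == 0:
        return 0.0
    weighted = sum(n * collision_probability(n, num_preambles) for n in group_sizes)
    return weighted / total


if __name__ == "__main__":
    # Hypothetical numbers: 90 devices, 54 preambles, three equal groups.
    print(collision_probability(90, 54))
    print(time_divided_collision_probability([30, 30, 30], 54))
```

Under these assumptions the per-device collision probability depends only on how many devices contend in the same period, so splitting the load lowers it at the cost of longer waiting for some groups.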
In order to facilitate low-cost network connection of many devices, machine-type communication (MTC) has evolved to low-cost MTC (LC-MTC) in the third-generation partnership project (3GPP) standard. LC-MTC should be able to effectively handle intensive accesses through multiple narrow-band (NB) random-access channels (RACHs) assigned within the bandwidth of a long-term evolution (LTE) system. As the number of MTC devices and their congestion rapidly increase, the random-access scheme for LC-MTC RACH needs to be improved. This paper presents a novel random-access scheme that introduces virtual preambles of LC-MTC devices and associates them with RACH indices to effectively discern LC-MTC devices. In comparison to the sole use of preambles, our scheme allows an LC-MTC device to better choose a unique virtual preamble. Thereby, the probability of successful accesses of LC-MTC devices increases in contention-based random-access environments. We experimentally assessed our scheme and the results show that our scheme performs better than the existing preamble-based scheme in terms of collision probability, access delay, and access blocking probability.
Many visual tasks in modern personal devices such as smartphones rely heavily on graphics processing units (GPUs) for a fluent user experience. Because most GPUs for embedded systems are nonpreemptive by nature, it is important to schedule GPU resources efficiently across multiple GPU tasks. We present a novel spatial resource sharing (SRS) technique for GPU tasks, called budget-reservation spatial resource sharing (BR-SRS) scheduling, which limits the number of GPU processing cores for a job based on the priority of the job. Such a priority-driven resource assignment can prevent a high-priority foreground GPU task from being delayed by background GPU tasks. The BR-SRS scheduler is invoked only twice, at the arrival and completion of jobs, and thus, the scheduling overhead is minimized as well. We evaluated the performance of our scheduling scheme on an Android-based smartphone, and found that the proposed technique significantly improved the performance of high-priority tasks in comparison to the previous temporal budget-based multi-task scheduling.
This article evaluates the usability of motion sensing-based interaction on a mobile platform using image browsing as a representative task. Three types of interfaces, a physical button interface, a motion-sensing interface using a high-precision commercial 3D motion tracker, and a motion-sensing interface using an in-house low-cost 3D motion tracker, are compared in terms of task performance and subjective preference. Participants were provided with prolonged training over 20 days, in order to compensate for the participants’ unfamiliarity with the motion-sensing interfaces. Experimental results showed that the participants’ task performance and subjective preference for the two motion-sensing interfaces were initially low, but they rapidly improved with training and soon approached the level of the button interface. Furthermore, a recall test, which was conducted 4 weeks later, demonstrated that the usability gains were well retained in spite of the long time gap between uses. Overall, these findings highlight the potential of motion-based interaction as an intuitive interface for mobile devices.
While hand-held computing devices are capable of rendering advanced 3D graphics and processing multimedia data, they are not designed to provide and induce a sufficient sense of immersion and presence for virtual reality. In this paper, we propose minimal requirements for realizing VR on a hand-held device. Furthermore, based on the proposed requirements, we have designed and implemented a low-cost hand-held VR platform by adding multimodal sensors and display components to a hand-held PC. The platform enables a motion-based interface, an essential part of realizing VR on a small hand-held device, and provides outputs in three modalities (visual, aural, and tactile/haptic) for a reasonable sensory experience. We showcase our platform and demonstrate the possibilities of hand-held VR through three VR applications: a typical virtual walkthrough, a 3D multimedia contents browser, and a motion-based racing game.
Presence is one of the goals of many virtual reality systems. Historically, in the context of virtual reality, the concept of presence has been closely associated with spatial perception (a bottom-up process), as its informal definition of "feeling of being there" suggests. However, recent studies in presence have challenged this view and attempted to widen the concept to include psychological immersion, thus linking more high-level elements (processed in a top-down fashion) to presence, such as story and plots, flow, attention and focus, identification with the characters, emotion, etc. In this paper, we experimentally studied the relationship between two content elements, each representing one of the two axes of the presence dichotomy: perceptual cues for spatial presence and sustained attention for (psychological) immersion. Our belief was that spatial perception or presence and a top-down processed concept such as voluntary attention have only a very weak relationship; thus, our experimental hypothesis was that sustained attention would positively affect spatial presence in a virtual environment with impoverished perceptual cues, but have no effect in an environment rich in them. In order to confirm the existence of the sustained attention in the experiment, fMRI scans of the subjects were taken and analyzed as well. The experimental results showed that attention had no effect on spatial presence, even in the environment with impoverished spatial cues.
Spatial presence, among the many aspects of presence, is the sense of physical and concrete space, often dubbed the sense of "being there." This paper theorizes on how "spatial" presence is formed by various types of artificial cues in a virtual environment, form or content. We believe that spatial presence is a product of an unconscious effort to correctly register oneself into the virtual environment in a consistent manner. We hypothesize that this process is perceptual and bottom-up in nature, and rooted in the reflexive and adaptive behavior to react to and resolve the mismatch in the spatial cues between the physical space where the user is and the virtual space that the user looks at, hears from, and interacts with. Prompted by the fact that our brain has two major paths for processing sensory input, the "where" path for determining object locations and the "what" path for identifying objects, we categorize the sensory stimulation cues in the virtual environment accordingly and investigate how they affect the user in adaptively registering oneself into the virtual environment, thus creating spatial presence. Based on the results of a series of our experiments and other bodies of research, we postulate that while low-level and perceptual spatial cues are sufficient for creating spatial presence, they can be affected and modulated by the spatial (whether form or content) factors. These results provide important insights into constructing a model of spatial presence, its measurement, and guidelines for configuring location-based virtual reality applications.
Conference Posters, Talks, and WIPs
Lens flare, comprising diffraction patterns of direct lights and ghosts of an aperture, is one of the artistic artifacts in optical systems. Far-field diffraction patterns have commonly been generated using the Fourier transform of the iris apertures. While such outcomes are physically faithful, more flexible and intuitive editing of diffraction patterns has not been explored so far. In this poster, we present a novel scheme of diffraction synthesis, which additively integrates diffraction elements. We decompose the apertures into curved edges and a circular core so that they abstract non-symmetric streaks and circular core highlights, respectively. We then apply the Fourier transform to each, rotate them, and finally composite them into a single output image. In this way, we can easily generate diffraction patterns similar to those of the source aperture, as well as more exaggerated ones.
This paper presents a real-time framework for computationally tracking objects visually attended by the user while navigating in interactive virtual environments. In addition to the conventional bottom-up (stimulus-driven) features, the framework also uses top-down (goal-directed) contexts to predict the human gaze. The framework first builds feature maps using preattentive features such as luminance, hue, depth, size, and motion. The feature maps are then integrated into a single saliency map using the center-surround difference operation. This pixel-level bottom-up saliency map is converted to an object-level saliency map using the item buffer. Finally, the top-down contexts are inferred from the user’s spatial and temporal behaviors during interactive navigation and used to select the most plausibly attended object among candidates produced in the object saliency map. The computational framework was implemented using the GPU and exhibited extremely fast computing performance (5.68 msec for a 256x256 saliency map), substantiating its adequacy for interactive virtual environments. A user experiment was also conducted to evaluate the prediction accuracy of the visual attention tracking framework with respect to actual human gaze data. The attained accuracy level was well supported by the theory of human cognition for visually identifying single and multiple attentive targets, especially due to the addition of top-down contextual information. The framework can be effectively used for perceptually based rendering without employing an expensive eye tracker, such as providing the depth-of-field effects and managing the level-of-detail in virtual environments.
Disclosed is a method, performed by a device, of processing an image, the method including: for an original image at a particular time point among a plurality of original images having a sequential relationship in terms of time, determining a cumulative value due to an afterimage of another original image before the particular time point; based on the determined cumulative value and the plurality of original images, obtaining a plurality of blur compensation images for removing a blur caused by the afterimage; and outputting the obtained plurality of blur compensation images.
Provided are a method for rearranging webcomic content and a device therefor. A method for rearranging webcomic content according to one embodiment of the present invention comprises the steps of: obtaining first content including a plurality of image cuts composed of a plurality of elements; extracting the plurality of elements included in the first content; generating a plurality of image cut layers by reconstructing the plurality of extracted elements; and arranging the plurality of generated image cut layers in a specified arrangement so as to generate second content.
Disclosed is a method for rearranging image cuts of cartoon content according to various embodiments. This method for rearranging cartoon content is performed by a computing device and includes the steps of: loading first content in which a plurality of image cuts are arrayed two-dimensionally; extracting a plurality of cut areas, in which the plurality of image cuts from the first content are positioned, respectively; determining the arrayed order of the plurality of image cuts; and generating second content by rearranging the plurality of cut areas according to the arrayed order.
According to the present invention, a lens flare generation method and apparatus are provided that may simulate lens flare effects through paraxial approximation-based linear approximation to generate a lens flare utilizing physical characteristics of a lens system while generating a lens flare at remarkably high speed as compared with the conventional art. Further, according to an embodiment of the present invention, a non-linear effect may be added to a linear pattern-based lens flare effect, generating an actual lens flare reflecting most of physical characteristics generated from the lens system. Further, use of a pre-recorded non-linear pattern allows for generation of a lens flare having a similar quality to the existing light tracking-based simulation at higher speed as compared with the conventional art without speed reduction.
A method for performing occlusion queries is disclosed. The method includes steps of: (a) a graphics processing unit (GPU) using a first depth buffer of a first frame to thereby predict a second depth buffer of a second frame; and (b) the GPU performing occlusion queries for the second frame by using the predicted second depth buffer, wherein the first frame is a frame predating the second frame. In accordance with the present invention, a configuration for classifying the objects into occluders and occludees is not required, and the occlusion queries for the predicted second frame are acquired in advance at the end of the first frame or the start of the second frame.
A method and device for efficiently simulating lens flares produced by an optical system is provided. The method comprises the steps of: simulating paths of rays from a light source through the optical system, the rays representing light; and estimating, for points in a sensor plane, an irradiance, based on intersections of the simulated paths with the sensor plane.
This dissertation presents a GPU-based rendering algorithm for real-time defocus blur and bokeh effects, which significantly improve the perceptual realism of synthetic images and can direct the user’s attention. The defocus blur algorithm combines three distinctive techniques: (1) adaptive discrete geometric level of detail (LOD), made popping-free by blending visibility samples across the two adjacent geometric levels; (2) adaptive visibility/shading sampling via sample reuse; (3) visibility supersampling via height-field ray casting. All three techniques are seamlessly integrated to lower the rendering cost of smooth defocus blur with high visibility sampling rates, while maintaining most of the quality of brute-force accumulation buffering. Also, the author presents a novel parametric model to include expressive chromatic aberrations in defocus blur rendering and its effective implementation using accumulation buffering. The model modifies the thin-lens model to adopt the axial and lateral chromatic aberrations, which allows us to easily extend them with nonlinear and artistic appearances beyond physical limits. For the dispersion to be continuous, we employ a novel unified 3D sampling scheme, involving both the lens and spectrum. Further, the author shows a spectral equalizer to emphasize particular dispersion ranges. As a consequence, our approach enables more intuitive and explicit control of chromatic aberrations, unlike the previous physically-based rendering methods. Finally, the dissertation presents an efficient bokeh rendering technique that splats pre-computed sprites but takes dynamic visibilities and appearances into account at runtime. To achieve an alias-free look without excessive sampling resulting from strong highlights, the author efficiently samples visibilities using rasterization from highlight sources. Our splatting uses a single precomputed 2D texture, which encodes radial aberrations against object depths. 
To further integrate dynamic appearances, the author also proposes an effective parameter sampling scheme for focal distance, radial distortion, optical vignetting, and spectral dispersion. The method allows us to render complex appearances of bokeh efficiently, which greatly improves the photorealism of defocus blur.
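For readers unfamiliar with the thin-lens model that the defocus blur work builds on, the standard relation for the circle-of-confusion diameter can be sketched as follows. This is the textbook thin-lens formula only, not the dissertation's chromatic-aberration model or its sampling scheme; the function name, parameter names, and example values are assumptions, and all distances are taken in the same unit (meters here).

```python
def circle_of_confusion(focal_length, f_number, focus_distance, object_distance):
    """Diameter of the blur circle on the sensor under the thin-lens model.

    c = A * f * |d - d_f| / (d * (d_f - f)), with aperture diameter A = f / N,
    focal length f, focus distance d_f, and object distance d.
    """
    if focus_distance <= focal_length:
        raise ValueError("focus_distance must exceed focal_length")
    if object_distance <= 0:
        raise ValueError("object_distance must be positive")
    aperture = focal_length / f_number
    return (aperture * focal_length * abs(object_distance - focus_distance)
            / (object_distance * (focus_distance - focal_length)))


if __name__ == "__main__":
    # Hypothetical 50 mm lens at f/2 focused at 2 m, object at 4 m.
    print(circle_of_confusion(0.05, 2.0, 2.0, 4.0))
```

An object at the focus distance yields a zero-diameter blur circle, and the diameter grows as the object moves away from the focal plane, which is what a defocus blur renderer approximates per pixel or per sample.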
This dissertation presents a real-time perceptual rendering framework based on computational visual attention tracking in a virtual environment (VE). The visual attention tracking identifies the most plausibly attended objects using top-down (goal-driven) contexts inferred from a user’s navigation behaviors as well as a conventional bottom-up (feature-driven) saliency map. A human experiment was conducted to evaluate the prediction accuracy of the framework by comparing objects regarded as attended to with human gazes collected with an eye tracker. The experimental results indicate that the accuracy is at a level well supported by human cognition theories. The attention tracking framework is then applied to depth-of-field (DOF) rendering and level-of-detail (LOD) management, which are representative techniques to improve perceptual quality and rendering performance, respectively. Prior to applying the attention tracking to DOF rendering, we propose two GPU-based real-time DOF rendering methods, since there have been few methods suitable for interactive VEs. One method extends the previous mipmap-based approach, and the other extends the previous layered and scatter approaches. Both DOF rendering methods achieve real-time performance without the major artifacts present in previous methods. With the DOF rendering methods, we demonstrate attention-guided DOF rendering and LOD management, which use the depths and the levels of attention of attended objects as focal depths and fidelity levels, respectively. The attention-guided DOF rendering can simulate an interactive lens blur effect without an eye tracker, and the attention-guided LOD management can significantly improve rendering performance with little perceptual degradation.