* Paper copies provided on this page are the authors' preprints. Only personal/academic use is allowed.
Journal and Conference Papers
Abstract: In this paper, we propose the first end-to-end convolutional neural network (CNN) architecture, Defocus Map Estimation Network (DMENet), for spatially varying defocus map estimation. To train the network, we produce a novel depth-of-field (DOF) dataset, SYNDOF, where each image is synthetically blurred with a ground-truth depth map. Due to the synthetic nature of SYNDOF, the feature characteristics of images in SYNDOF can differ from those of real defocused photos. To address this gap, we use domain adaptation that transfers the features of real defocused photos into those of synthetically blurred ones. Our DMENet consists of four subnetworks: blur estimation, domain adaptation, content preservation, and sharpness calibration networks. The subnetworks are connected to each other and jointly trained with their corresponding supervisions in an end-to-end manner. Our method is evaluated on publicly available blur detection and blur estimation datasets, and the results show state-of-the-art performance.
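As a rough illustration of how a depth map can determine a spatially varying amount of defocus, the sketch below evaluates the textbook thin-lens circle-of-confusion relation per pixel. It is not DMENet and is not claimed to be how SYNDOF was generated; the function names and the lens parameters in the usage note are hypothetical.

```python
def coc_diameter(depth, focus_dist, focal_len, aperture):
    """Circle-of-confusion diameter on the sensor for a thin lens.

    All arguments share one length unit. `depth` is the object distance,
    `focus_dist` the distance that is in perfect focus, `focal_len` the
    focal length, and `aperture` the aperture diameter.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    if focus_dist <= focal_len:
        raise ValueError("focus distance must exceed the focal length")
    magnification = focal_len / (focus_dist - focal_len)
    return aperture * abs(depth - focus_dist) / depth * magnification


def defocus_map(depth_map, focus_dist, focal_len, aperture):
    """Apply coc_diameter to every pixel of a depth map (list of rows)."""
    return [
        [coc_diameter(d, focus_dist, focal_len, aperture) for d in row]
        for row in depth_map
    ]
```

For example, `defocus_map([[2.0, 4.0]], focus_dist=2.0, focal_len=0.05, aperture=0.025)` gives zero blur for the pixel at the focus distance and a nonzero blur diameter for the farther one.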
Abstract: Smartphone users often want to customize the positions and functions of physical buttons to accommodate their own usage patterns; however, this is infeasible for electronic mobile devices based on COTS (Commercial Off-The-Shelf) due to high production costs and hardware design constraints. In this letter, we present the design and implementation of customized virtual buttons that are localized using only common built-in sensors of electronic mobile devices. We develop sophisticated strategies firstly to detect when a user taps one of the virtual buttons, and secondly to locate the position of the tapped virtual button. The virtual-button scheme is implemented and demonstrated in a COTS-based smartphone. The feasibility study shows that, with up to nine virtual buttons on five different sides of the smartphone, the proposed virtual buttons can operate with greater than 90% accuracy.
Abstract: Augmented reality (AR) augments virtual information over the real-world medium and is emerging as an important type of information visualization technique. As such, the visibility and readability of the augmented information must be as high as possible amidst the dynamically changing real-world surroundings and background. In this work, we present a technique based on image saliency analysis to improve the conspicuity of the foreground augmentation to the background real-world medium by adjusting the local brightness contrast. The proposed technique is implemented on a mobile platform considering the usage nature of AR. The saliency computation is carried out for the augmented object’s representative color rather than for all the pixels, and the search and adjustment cover only a discrete number of brightness levels to produce the highest contrast saliency, thereby making real-time computation possible. While the resulting imagery may not be optimal due to such a simplification, our tests showed that the visibility was still significantly improved without much difference to the optimal ground truth in terms of correctly perceiving and recognizing the augmented information. In addition, we present another experiment that explores in what fashion the proposed algorithm can be applied in actual AR applications. The results suggested that the users clearly preferred the automatic contrast modulation upon large movements in the scenery.
Abstract: Abnormal messages propagated from faulty operations in a vehicular system may severely harm the system, but they cannot be easily detected when their information is not known in advance. To support efficient detection of faulty message patterns propagated in the in-vehicle network, this paper presents a novel graph pattern matching framework built upon message log-driven graph modeling. Our framework models the unknown condition as a query graph and the reference database of normal operations as data graphs. The analysis of the faulty message propagation requires considering the sequence of events in the distance measure, and thus, the conventional graph distance measures cannot be directly used for our purpose. We hence propose a novel distance metric based on the maximum common subgraph (MCS) between two graphs and the sequence numbers of messages, which works robustly even for the abnormal faulty patterns and can avoid false negatives in large databases. Since the problem of MCS computation is NP-hard, we also propose two efficient filtering techniques, one based on the lower bound of MCS distance for a polynomial-time approximation and the other based on edge pruning. Experiments performed on real and synthetic datasets to assess our framework show that it significantly outperforms the previously existing methods in terms of both performance and accuracy of query responses.
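For background, one widely used MCS-based graph distance (due to Bunke and Shearer) is shown below. It is given only as an example of a conventional measure; it ignores message sequence numbers and is not the metric proposed in the paper. Here $|G|$ denotes the size of a graph and $\mathrm{mcs}(G_1, G_2)$ the maximum common subgraph.

```latex
d_{\mathrm{MCS}}(G_1, G_2) \;=\; 1 - \frac{\lvert \mathrm{mcs}(G_1, G_2) \rvert}{\max\bigl(\lvert G_1 \rvert,\ \lvert G_2 \rvert\bigr)}
```

The distance is 0 when the two graphs are isomorphic and 1 when they share no common subgraph.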
Abstract: Cellular internet-of-things (CIoT) systems were recently developed by the third-generation partnership project (3GPP) to support internet-of-things (IoT) services over the conventional mobile-communication infrastructures. The CIoT systems allow a large number of IoT devices to be connected through the random-access procedure, but the concurrent accesses of the massive devices make this procedure heavily competitive. In this article, we present an effective time-division random-access scheme built upon the coverage levels (CLs), where each CIoT device is assigned a CL and categorized based on its radio-channel quality. In our scheme, the random-access loads of device groups having different CLs are distributed into different time periods, which greatly relaxes instantaneous contention and improves random-access performance. To assess the performance of our scheme, we also introduce a mathematical model that expresses and analyzes the states and behaviors of CIoT devices using a Markov chain. Mathematical analysis and simulation results show that our scheme significantly outperforms the conventional scheme (without time-division control) in terms of collision probability, access success rate, and access-blocking probability.
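The intuition that spreading device groups over separate time periods relaxes contention can be sketched with a deliberately simplified model: each contending device picks one of a fixed number of preambles uniformly at random, and a device collides if any other device in the same period picks the same preamble. This is not the Markov-chain model of the article; the function names and the numbers in the usage note are illustrative assumptions.

```python
def collision_probability(num_devices, num_preambles):
    """Probability that a given device shares its preamble with at least
    one other device, assuming independent uniform preamble choices."""
    if num_devices < 1 or num_preambles < 1:
        raise ValueError("need at least one device and one preamble")
    return 1.0 - (1.0 - 1.0 / num_preambles) ** (num_devices - 1)


def time_division_collision(devices_per_cl, num_preambles):
    """Average per-device collision probability when each coverage-level
    group contends only within its own time period."""
    total = sum(devices_per_cl)
    if total == 0:
        raise ValueError("no devices")
    weighted = sum(
        n * collision_probability(n, num_preambles)
        for n in devices_per_cl
        if n > 0
    )
    return weighted / total
```

With, say, 120 devices and 54 preambles, `time_division_collision([40, 40, 40], 54)` is lower than `collision_probability(120, 54)`. The sketch ignores that each group then gets fewer access opportunities per unit time, which a full analysis would have to account for.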
Abstract: In order to facilitate low-cost network connection of many devices, machine-type communication (MTC) has evolved to low-cost MTC (LC-MTC) in the third-generation partnership project (3GPP) standard. LC-MTC should be able to effectively handle intensive accesses through multiple narrow-band (NB) random-access channels (RACHs) assigned within the bandwidth of a long-term evolution (LTE) system. As the number of MTC devices and their congestion rapidly increase, the random-access scheme for LC-MTC RACH needs to be improved. This paper presents a novel random-access scheme that introduces virtual preambles of LC-MTC devices and associates them with RACH indices to effectively discern LC-MTC devices. In comparison to the sole use of preambles, our scheme allows an LC-MTC device to better choose a unique virtual preamble. Thereby, the probability of successful accesses of LC-MTC devices increases in contention-based random-access environments. We experimentally assessed our scheme and the results show that our scheme performs better than the existing preamble-based scheme in terms of collision probability, access delay, and access blocking probability.
Abstract: Many visual tasks in modern personal devices such as smartphones rely heavily on graphics processing units (GPUs) for their fluent user experiences. Because most GPUs for embedded systems are nonpreemptive by nature, it is important to schedule GPU resources efficiently across multiple GPU tasks. We present a novel spatial resource sharing (SRS) technique for GPU tasks, called budget-reservation spatial resource sharing (BR-SRS) scheduling, which limits the number of GPU processing cores for a job based on the priority of the job. Such a priority-driven resource assignment can prevent a high-priority foreground GPU task from being delayed by background GPU tasks. The BR-SRS scheduler is invoked only twice, at the arrival and completion of jobs, and thus, the scheduling overhead is minimized as well. We evaluated the performance of our scheduling scheme in an Android-based smartphone, and found that the proposed technique significantly improved the performance of high-priority tasks in comparison to the previous temporal budget-based multi-task scheduling.
Abstract: A virtualized system generally suffers from low I/O performance, mainly caused by its inherent abstraction overhead and frequent CPU transitions between the guest and hypervisor modes. Recent research on polling-based I/O virtualization has partly solved the problem, but excessive polling trades intensive CPU usage for higher performance. This article presents a power-efficient and high-performance block I/O framework for a virtual machine, which can be used even with a limited number of CPU cores in mobile or embedded systems. Our framework monitors system status and dynamically switches the I/O process mode between the exit and polling modes, depending on the amount of current I/O requests and CPU utilization. It also dynamically controls the polling interval to reduce redundant polling. The highly dynamic nature of our framework leads to improvements in I/O performance with lower CPU usage as well. Our experiments showed that our framework achieved 10.8% higher I/O throughput than the existing exit-based mechanisms while keeping CPU usage similar, with an increase of only 3.1%. In comparison to the systems solely based on the polling mechanism, ours reduced the CPU usage roughly down to 10.0% with no or negligible performance loss.
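The switching idea described above can be sketched as a small policy object. Everything concrete here is an assumption made for illustration: the thresholds, the class and method names, and the halve-or-double rule for the polling interval are hypothetical and are not taken from the article.

```python
from dataclasses import dataclass


@dataclass
class IoModePolicy:
    # All default values are hypothetical tuning knobs.
    busy_requests: int = 8       # queue depth at which polling pays off
    cpu_ceiling: float = 0.8     # above this utilization, avoid polling
    min_interval_us: int = 10
    max_interval_us: int = 1000

    def choose_mode(self, pending_requests, cpu_utilization):
        """Poll only when I/O is heavy and the CPU has headroom."""
        heavy_io = pending_requests >= self.busy_requests
        cpu_headroom = cpu_utilization < self.cpu_ceiling
        return "polling" if heavy_io and cpu_headroom else "exit"

    def next_interval(self, current_us, found_work):
        """Shorten the polling interval when the last poll found work,
        lengthen it otherwise, clamped to the configured bounds."""
        if found_work:
            return max(self.min_interval_us, current_us // 2)
        return min(self.max_interval_us, current_us * 2)
```

A monitoring loop would call `choose_mode` with fresh measurements and, while in polling mode, feed each poll's outcome to `next_interval` so that idle periods do not burn CPU cycles on redundant polls.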
Abstract: This article evaluates the usability of motion sensing-based interaction on a mobile platform using image browsing as a representative task. Three types of interfaces, a physical button interface, a motion-sensing interface using a high-precision commercial 3D motion tracker, and a motion-sensing interface using an in-house low-cost 3D motion tracker, are compared in terms of task performance and subjective preference. Participants were provided with prolonged training over 20 days, in order to compensate for the participants’ unfamiliarity with the motion-sensing interfaces. Experimental results showed that the participants’ task performance and subjective preference for the two motion-sensing interfaces were initially low, but they rapidly improved with training and soon approached the level of the button interface. Furthermore, a recall test, which was conducted 4 weeks later, demonstrated that the usability gains were well retained in spite of the long time gap between uses. Overall, these findings highlight the potential of motion-based interaction as an intuitive interface for mobile devices.
Abstract: While hand-held computing devices are capable of rendering advanced 3D graphics and processing multimedia data, they are not designed to provide and induce a sufficient sense of immersion and presence for virtual reality. In this paper, we propose minimal requirements for realizing VR on a hand-held device. Furthermore, based on the proposed requirements, we have designed and implemented a low-cost hand-held VR platform by adding multimodal sensors and display components to a hand-held PC. The platform enables a motion-based interface, an essential part of realizing VR on a small hand-held device, and provides outputs in three modalities, visual, aural, and tactile/haptic, for a reasonable sensory experience. We showcase our platform and demonstrate the possibilities of hand-held VR through three VR applications: a typical virtual walkthrough, a 3D multimedia contents browser, and a motion-based racing game.
Abstract: Presence is one of the goals of many virtual reality systems. Historically, in the context of virtual reality, the concept of presence has been associated much with spatial perception (a bottom-up process), as its informal definition of "feeling of being there" suggests. However, recent studies in presence have challenged this view and attempted to widen the concept to include psychological immersion, thus linking more high-level elements (processed in a top-down fashion) to presence, such as story and plots, flow, attention and focus, identification with the characters, emotion, etc. In this paper, we experimentally studied the relationship between two content elements, each representing one of the two axes of the presence dichotomy: perceptual cues for spatial presence and sustained attention for (psychological) immersion. Our belief was that spatial perception or presence and a top-down processed concept such as voluntary attention have only a very weak relationship; thus, our experimental hypothesis was that sustained attention would positively affect spatial presence in a virtual environment with impoverished perceptual cues, but have no effect in an environment rich in them. In order to confirm the existence of the sustained attention in the experiment, fMRI scans of the subjects were taken and analyzed as well. The experimental results showed that attention had no effect on spatial presence, even in the environment with impoverished spatial cues.
Abstract: Spatial presence, among the many aspects of presence, is the sense of physical and concrete space, often dubbed the sense of "being there." This paper theorizes on how "spatial" presence is formed by various types of artificial cues in a virtual environment, form or content. We believe that spatial presence is a product of an unconscious effort to correctly register oneself into the virtual environment in a consistent manner. We hypothesize that this process is perceptual and bottom-up in nature, and rooted in the reflexive and adaptive behavior to react to and resolve the mismatch in the spatial cues between the physical space where the user is and the virtual space that the user looks at, hears from, and interacts with. Motivated by the fact that our brain has two major paths for processing sensory input, the "where" path for determining object locations and the "what" path for identifying objects, we categorize the sensory stimulation cues in the virtual environment accordingly and investigate how they affect the user in adaptively registering oneself into the virtual environment, thus creating spatial presence. Based on the results of a series of our experiments and other bodies of research, we postulate that while low-level and perceptual spatial cues are sufficient for creating spatial presence, they can be affected and modulated by the spatial (whether form or content) factors. These results provide important insights into constructing a model of spatial presence, its measurement, and guidelines for configuring location-based virtual reality applications.
Abstract: Lens flare, comprising diffraction patterns of direct lights and ghosts of an aperture, is one of the artistic artifacts in optical systems. The generation of far-field diffraction patterns has commonly used the Fourier transform of the iris apertures. While such outcomes are physically faithful, more flexible and intuitive editing of diffraction patterns has not been explored so far. In this poster, we present a novel scheme of diffraction synthesis, which additively integrates diffraction elements. We decompose the apertures into curved edges and a circular core so that they abstract non-symmetric streaks and circular core highlights, respectively. We then apply the Fourier transform to each, rotate them, and finally composite them into a single output image. In this way, we can easily generate diffraction patterns similar to those of the source aperture, as well as more exaggerated ones.
Abstract: This paper presents a real-time framework for computationally tracking objects visually attended by the user while navigating in interactive virtual environments. In addition to the conventional bottom-up (stimulus-driven) features, the framework also uses top-down (goal-directed) contexts to predict the human gaze. The framework first builds feature maps using preattentive features such as luminance, hue, depth, size, and motion. The feature maps are then integrated into a single saliency map using the center-surround difference operation. This pixel-level bottom-up saliency map is converted to an object-level saliency map using the item buffer. Finally, the top-down contexts are inferred from the user’s spatial and temporal behaviors during interactive navigation and used to select the most plausibly attended object among candidates produced in the object saliency map. The computational framework was implemented using the GPU and exhibited extremely fast computing performance (5.68 ms for a 256x256 saliency map), substantiating its adequacy for interactive virtual environments. A user experiment was also conducted to evaluate the prediction accuracy of the visual attention tracking framework with respect to actual human gaze data. The attained accuracy level was well supported by the theory of human cognition for visually identifying single and multiple attentive targets, especially due to the addition of top-down contextual information. The framework can be effectively used for perceptually based rendering without employing an expensive eye tracker, such as providing the depth-of-field effects and managing the level-of-detail in virtual environments.
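The step that turns a pixel-level saliency map into an object-level one can be sketched on the CPU as follows. The item buffer is taken to be an image of per-pixel object IDs; averaging the saliency over each object's pixels is an assumption made here for illustration (the paper's exact aggregation and its GPU implementation are not reproduced), and the function names are hypothetical.

```python
from collections import defaultdict


def object_saliency(saliency_map, item_buffer, background_id=0):
    """Average pixel saliency per object ID.

    `saliency_map` and `item_buffer` are equally sized lists of rows;
    pixels whose ID equals `background_id` are ignored.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sal_row, id_row in zip(saliency_map, item_buffer):
        for sal, obj in zip(sal_row, id_row):
            if obj == background_id:
                continue
            totals[obj] += sal
            counts[obj] += 1
    return {obj: totals[obj] / counts[obj] for obj in totals}


def most_salient_object(saliency_map, item_buffer, background_id=0):
    """Return the ID with the highest average saliency, or None."""
    scores = object_saliency(saliency_map, item_buffer, background_id)
    return max(scores, key=scores.get) if scores else None
```

In the framework described above, the bottom-up scores would only produce candidates; the final choice also depends on top-down contexts, which this sketch leaves out.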
Abstract: According to the present invention, a lens flare generation method and apparatus are provided that may simulate lens flare effects through paraxial approximation-based linear approximation to generate a lens flare utilizing physical characteristics of a lens system while generating a lens flare at remarkably high speed as compared with the conventional art. Further, according to an embodiment of the present invention, a non-linear effect may be added to a linear pattern-based lens flare effect, generating an actual lens flare reflecting most of physical characteristics generated from the lens system. Further, use of a pre-recorded non-linear pattern allows for generation of a lens flare having a similar quality to the existing light tracking-based simulation at higher speed as compared with the conventional art without speed reduction.
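To illustrate why a paraxial approximation makes a lens system linear, the sketch below uses textbook 2x2 ray transfer matrices, where a ray is described by its height and angle relative to the optical axis. It is not claimed to be the method of the invention: reflections that produce flare ghosts and any non-linear effects are left out, and all function names are hypothetical.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0],
         a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0],
         a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def free_space(distance):
    """Propagation over `distance` along the optical axis."""
    return ((1.0, distance), (0.0, 1.0))


def thin_lens(focal_len):
    """Refraction by an ideal thin lens of focal length `focal_len`."""
    return ((1.0, 0.0), (-1.0 / focal_len, 1.0))


def system_matrix(elements):
    """Compose elements listed in the order the light meets them."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for element in elements:
        m = matmul2(element, m)
    return m


def trace(matrix, height, angle):
    """Map an input ray (height, angle) to the output ray."""
    return (
        matrix[0][0] * height + matrix[0][1] * angle,
        matrix[1][0] * height + matrix[1][1] * angle,
    )
```

For instance, with `m = system_matrix([thin_lens(50.0), free_space(50.0)])`, a ray entering parallel to the axis, `trace(m, 10.0, 0.0)`, lands on the axis at the focal plane. Because the whole system collapses into one matrix, many rays can be mapped to the sensor with a single multiply each.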
Abstract: A method for performing occlusion queries is disclosed. The method includes steps of: (a) a graphics processing unit (GPU) using a first depth buffer of a first frame to thereby predict a second depth buffer of a second frame; and (b) the GPU performing occlusion queries for the second frame by using the predicted second depth buffer, wherein the first frame is a frame predating the second frame. In accordance with the present invention, a configuration for classifying the objects into the occluders and the occludees is not required, and the occlusion queries for the predicted second frame are acquired in advance at the end of the first frame or the beginning of the second frame.
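A CPU-side sketch of the two steps may help fix ideas. The abstract does not say how the second depth buffer is predicted, so the prediction below simply copies the previous frame's depth buffer, which is an assumption that only holds for a static camera and scene. The query itself is an ordinary depth test in which smaller values are nearer; all names are hypothetical.

```python
def predict_depth(prev_depth):
    """Placeholder prediction: reuse the previous frame's depth buffer.
    (Assumption for illustration; a real predictor would account for
    camera and object motion.)"""
    return [row[:] for row in prev_depth]


def is_visible(predicted_depth, fragments):
    """Occlusion query: the object is visible if any of its fragments
    (x, y, depth) is nearer than the predicted depth at that pixel."""
    return any(
        depth < predicted_depth[y][x]
        for x, y, depth in fragments
    )
```

Given `predicted = predict_depth([[0.5, 1.0], [0.5, 1.0]])`, an object covering only the left column at depths 0.8 and 0.9 fails the query, while an object with a fragment at depth 0.7 in the right column passes.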
Abstract: A method and device for efficiently simulating lens flares produced by an optical system are provided. The method comprises the steps of: simulating paths of rays from a light source through the optical system, the rays representing light; and estimating, for points in a sensor plane, an irradiance, based on intersections of the simulated paths with the sensor plane.