Kihwan Kim (김기환)

    VP at Samsung Electronics

    Head of XR and Immersive SW
    Ph.D. in Computer Science

    Georgia Institute of Technology

    Computational Perception Lab

    NVIDIA Research

   Contact :

    email)  kihwan23 dot kim at gmail

I am currently a corporate vice president in the Mobile Communications division (MX/무선사업부) at Samsung Electronics. I have led several teams in computer vision, deep learning (on-device inference), camera pipeline, XR (AR/VR), and Avatar for Samsung's flagship Galaxy phones and various form factors.
Formerly, I was a principal research scientist at NVIDIA Research (2012--2020), where I worked on 3D computer vision and scene perception problems for autonomous driving, robotics, and AR/VR.
[Curriculum Vitae](CV)/[Resume], [Google Scholar], [Linkedin] [Joins]

News, recently released code, talks and dataset

2020 to 2022 (at Samsung) : Links for flagship models and other commercialization topics and projects were added
Flagship Galaxy Phones (S/Note/Fold/Flip) [S22] , [S21], [Note20] [Z Fold2] , [Z Fold3] , [Z Fold4] [Z Flip2] , [Z Flip3] , [Z Flip4]
: System (game/rendering), camera, imaging, ML, vision, AR/VR (XR Platforms), and Avatar solutions (mid to low-end models and other form factors as well)
[Galaxy Avatar: New SDK] (SDC2022) : The history and the strategy of Galaxy avatar with the new features in OneUI 5.0 introduced in SDC2022.
[AI filters for mobile phones](PDF) : Extracting Vignetting and Grain Filter Effects from Photos (used for S21/S22), *WACV 2022.
[AI Nightography in S22] , [DL-based AI Night photography in S21] , [Night portrait] [AI Selfie portrait] New AI solutions for 2021/2022 models
[UDC (Under Display Camera)] , [Avatar (AR Emoji)] : Avatar for phones, TV, and watch (Watch 4/5) (More XR and metaverse projects to be introduced)
[Invited talks] "Computer Vision for On-device AI: Trends and State-of-the-arts" [POSTECH] [SNU] [KAIST]

2020 : Codebases (Github) and new papers for 2020 projects are updated.
[Online Mesh Adaptation](TBD) : Online Adaptation for Consistent Mesh Reconstruction in the Wild, *NeurIPS 2020.
[DeepGMR](Github) : DeepGMR: Learning Latent Gaussian Mixture Models for Registration (s/o), ECCV 2020.
[SS3D] (Github) : Self-supervised Single-view 3D Reconstruction via Semantic Consistency (s/o), ECCV 2020.
[Dynamic View Synthesis] : View Synthesis of Dynamic Scenes with Globally Coherent Depths, CVPR 2020.
[Two-shot SVBRDF Estimation] (Github) : Two-shot Spatially-varying BRDF and Shape Estimation, CVPR 2020.
[Bi3D Stereo Estimation] (Github) : Bi3D: Stereo Depth Estimation via Binary Classifications, CVPR 2020.

[Neural Inverse Rendering] : Inverse Rendering of an indoor scene from a single RGB image, ICCV 2019.
[PlaneRCNN] (GitHub) : Plane detection and reconstruction from single RGB image, CVPR 2019 (Oral).
[Neural RGB->D Sensor] (GitHub) : Depth estimation from an RGB video, CVPR 2019 (Oral).
*CVPR 2019 Best paper finalist.
[3D Human affordance (TBD)]: Putting human in a scene: Human affordance for 3D scene reasoning, CVPR 2019.
[Competitive Collaboration] (GitHub) : Joint unsupervised learning of depth, motion and flow, CVPR 2019.
[Intrinsic3D] (GitHub): 3D Reconstruction with a joint optimization from appearance, geometry and lighting.
[3D Vision and beyond] (slide) : My Stanford SCIEN talk about state-of-the-art 3D Computer vision techniques.
2017 -- 2018
[HGMM and HGMR](TBD) (will be released with ISAAC SDK): Point cloud registration, CVPR 2016 (s/o), ECCV 2018.
[Learning rigidity] (GitHub): Learning rigidity for 3D Scene flow estimation, ECCV 2018.
[3D Scene flow and rigidity] (slide) My GTC talk about scene flow and learning rigidity. ECCV 2018.
[GeoMapNet] (Github) Learning-based 6DOF camera pose estimation, CVPR 2018 (Spotlight Oral).
[LearningBRDF] (NVR): Dataset for learning based reflectance estimation, ICCV 2017 (Oral).
[Dynamic Hand Gesture] (NVR) Dataset for online gesture recognition with R3DCNN, CVPR 2016.
[DTSLAM] (GitHub) : SLAM, Camera pose estimation and mapping, 3DV 2015.
*See more details about old projects and their code below.

Main projects

[SDC 2022]

  Samsung Galaxy Avatar: new SDK announcements

AR Emoji: Your avatar, your experience
In Samsung Developer Conference 2022 [SDC Session page]
2021 (S22) Avatar / Emoji announcement

During SDC 2022, we announced the new Avatar SDK (AR Emoji SDK 2022) for phones, watches, and TVs, and showcased the latest update of Galaxy Avatar in the Samsung Galaxy ecosystem.
with Jongju Kim, Jinho Lim, and many members in HQ, SR, SRUK, SRIB, SAIC etc.


  Reconstructing a temporally consistent non-rigid object instances

Online Adaptation for Consistent Mesh Reconstruction in the Wild
In NeurIPS 2020 [PDF] [Project page]

This paper presents an algorithm to reconstruct temporally consistent 3D meshes of deformable object instances from videos in the wild. Without requiring annotations of 3D mesh, 2D keypoints, or camera pose for each video frame, we pose video based reconstruction as a self-supervised online adaptation problem for videos.

with Xueting Li, Sifei Liu, Shalini Gupta, Xiaolong Wang, Ming-Hsuan Yang, and Jan Kautz


 Learning Latent GMM for 3D Registration

DeepGMR: Learning Latent Gaussian Mixture Models for Registration
In ECCV 2020 [PDF] [Project page] [Dataset] [Video]

We introduce Deep Gaussian Mixture Registration (DeepGMR), the first learning-based registration method that explicitly leverages a probabilistic registration paradigm by formulating registration as the minimization of KL-divergence between two probability distributions modeled as mixtures of Gaussians.

with Wentao Yuan, Ben Eckart, Dieter Fox, and Jan Kautz


 Single view 3D reconstruction with semantic consistency

Self-supervised Single-view 3D Reconstruction via Semantic Consistency
In ECCV 2020 [PDF] [Project page] [Video]

We learn a self-supervised, single-view 3D reconstruction model that predicts the 3D mesh shape, texture and camera pose of a target object with a collection of 2D images and silhouettes. The key insight of our work is that objects can be represented as a collection of deformable parts, and each part is semantically coherent across different instances of the same category (e.g., wings on birds and wheels on cars).

with Xueting Li, Sifei Liu, Shalini Gupta, Varun Jampani, Ming-Hsuan Yang, and Jan Kautz


 Novel View Synthesis for Dynamic Scenes

View Synthesis of Dynamic Scenes with Globally Coherent Depths
In CVPR 2020 [PDF](TBD) [Project page] [Video]

This paper presents a new method to synthesize an image from an arbitrary view and time given a collection of images of a dynamic scene. A key challenge for the synthesis arises from dynamic scene reconstruction, where epipolar geometry does not apply to the local motion of dynamic contents. We cast this problem as learning to correct the scale of depth estimates, and to refine each depth with locally consistent motions between views to form a coherent depth estimation.

with Jae Shin Yoon, Orazio Gallo, Hyunsoo Park, and Jan Kautz


 Two-shot SVBRDF Estimation

Two-shot Spatially-varying BRDF and Shape Estimation
In CVPR 2020 [PDF] [Project page] [Video]

Capturing the shape and spatially-varying appearance (SVBRDF) of an object from images is a challenging task that has applications in both computer vision and graphics. We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF. The previous predictions guide each estimation, and a joint refinement network later refines both SVBRDF and shape. Both our two-shot image capture and network inference can run on mobile hardware.

with Mark Boss, Varun Jampani, Hendrik P.A. Lensch, and Jan Kautz


 Stereo Depth with binary classifications

Bi3D: Stereo Depth Estimation via Binary Classifications
In CVPR 2020 [PDF](TBD) [Project page](TBD)

We present Bi3D, a method that estimates depth via a series of binary classifications. Rather than testing if objects are at a particular depth D, as existing stereo methods do, it classifies them as being closer or farther than D. This property offers a powerful mechanism to balance accuracy and latency.

with Abhishek Badki, Orazio Gallo, Alejandro Troccoli, Pradeep Sen, and Jan Kautz
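
The closer-or-farther idea can be illustrated with a toy sketch. Here a hypothetical oracle `is_closer_than(d)` stands in for the binary classifier (in the paper this role is played by a network operating on stereo images; in this sketch it is only a comparison against a known depth), and a bisection loop narrows the depth interval. The function names, the depth range, and the bisection strategy are illustrative assumptions, not the authors' implementation.

```python
def estimate_depth(is_closer_than, d_min, d_max, num_queries):
    """Narrow a depth interval with a fixed budget of binary queries.

    is_closer_than(d) -> bool is a hypothetical stand-in for a learned
    binary classifier answering "is the surface closer than depth d?".
    """
    lo, hi = d_min, d_max
    for _ in range(num_queries):
        mid = 0.5 * (lo + hi)
        if is_closer_than(mid):
            hi = mid  # surface lies in [lo, mid)
        else:
            lo = mid  # surface lies in [mid, hi]
    # Return the interval midpoint and the remaining interval width.
    return 0.5 * (lo + hi), hi - lo


true_depth = 7.3  # toy ground truth, in metres (assumed for illustration)


def oracle(d):
    return true_depth < d


coarse, coarse_width = estimate_depth(oracle, 0.0, 16.0, num_queries=2)
fine, fine_width = estimate_depth(oracle, 0.0, 16.0, num_queries=10)
```

Fewer queries give a coarse answer quickly, and more queries shrink the interval, which is one way to picture the accuracy/latency trade-off described above.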


 Non-rigid Multiview Stereo

NRMVS: Non-Rigid Multi-view Stereo
In WACV 2020 [PDF] [Project page] [Video]

In this paper, we open up a new challenging direction: Dense 3D reconstruction of scenes with non-rigid changes observed from a small number of images sparsely captured from different views with a single monocular camera. We formulate this problem as a joint optimization of deformation and depth estimation, using deformation graphs as the underlying representation.

with Matthias Innmann, Jinwei Gu, Charles Loop, Matthias Niessner, and Jan Kautz


 Inverse Rendering of an Indoor Scene

Neural Inverse Rendering of an Indoor Scene from a Single Image
In ICCV 2019 [PDF] [Project page] [Code (TBD)]

Inverse rendering aims to estimate physical attributes of a scene, e.g., reflectance, geometry, and lighting, from image(s). This paper proposes the first learning based approach that jointly estimates albedo, normals, and lighting of an indoor scene from a single image. The key contribution is the Residual Appearance Renderer (RAR), which can be trained to synthesize complex appearance effects (e.g., inter-reflection, cast shadows, near-field illumination, and realistic shading).

with Soumyadip Sengupta, Jinwei Gu, Guilin Liu, David W. Jacobs, and Jan Kautz

[CVPR19] Oral

 Plane detection, segmentation and 3D reconstruction

PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image
In CVPR 2019 [PDF] [Video] [Project page] [Code]

This paper proposes a deep neural architecture, PlaneRCNN, that detects and reconstructs piecewise planar surfaces from a single RGB image. PlaneRCNN employs a variant of Mask R-CNN to detect planes with their plane parameters and segmentation masks. PlaneRCNN then jointly refines all the segmentation masks with a novel loss enforcing the consistency with a nearby view during training.

with Chen Liu, Jinwei Gu, Yasutaka Furukawa, and Jan Kautz

[CVPR19] Oral *Best paper finalist.

  Neural RGB-D Sensing: Depth estimation from a video

Neural RGB-D Sensing: Depth estimation from a video
In CVPR 2019 [PDF] [Video] [Project page] [Code]

In this paper, we propose a deep learning (DL) method to estimate per-pixel depth and its uncertainty continuously from a monocular video stream, with the goal of effectively turning an RGB camera into an RGB-D camera. Unlike prior DL-based methods, we estimate a depth probability distribution for each pixel rather than a single depth value, leading to an estimate of a 3D depth probability volume for each input frame.

with Chao Liu, Jinwei Gu, Srinivasa Narasimhan, and Jan Kautz
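
To make the per-pixel distribution concrete, the following is a minimal sketch of how a depth value and an uncertainty could be read out of one pixel's probability vector over candidate depths. The candidate depths, the use of the mean and standard deviation as summaries, and the function name `depth_stats` are assumptions for illustration; the paper's actual readout may differ.

```python
import math


def depth_stats(probabilities, depth_candidates):
    """Return (expected depth, standard deviation) for one pixel."""
    total = sum(probabilities)
    probs = [p / total for p in probabilities]  # normalise defensively
    mean = sum(p * d for p, d in zip(probs, depth_candidates))
    var = sum(p * (d - mean) ** 2 for p, d in zip(probs, depth_candidates))
    return mean, math.sqrt(var)


depth_candidates = [1.0, 2.0, 3.0, 4.0]  # hypothetical depth bins, in metres

# A peaked distribution: low uncertainty.
confident = depth_stats([0.0, 1.0, 0.0, 0.0], depth_candidates)

# A bimodal distribution: the mean sits between the modes, uncertainty is high.
ambiguous = depth_stats([0.5, 0.0, 0.0, 0.5], depth_candidates)
```

A single depth value would hide the difference between these two pixels; keeping the distribution exposes it.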


  Putting Human in a Scene: 3D Human Affordance

Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments
In CVPR 2019 [PDF] [Video] [Project page] [Code (TBD)]

In this paper, we aim to predict affordances of 3D indoor scenes, specifically what human poses are afforded by a given indoor environment, such as sitting on a chair or standing on the floor. We build a fully automatic 3D pose synthesizer that fuses semantic knowledge from a large number of 2D poses extracted from TV shows as well as 3D geometric knowledge from voxel representations of indoor scenes.

with Xueting Li, Sifei Liu, Xiaolong Wang, Ming-Hsuan Yang, and Jan Kautz


  Unsupervised Joint Learning of Depth, Pose, Flow and Motion

Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation
In CVPR 2019 [PDF] [Project page] [Code]

Single view depth prediction, camera motion estimation, optical flow, and segmentation of a video into the static scene and moving regions are challenging but coupled problems. Our key insight is that these four fundamental vision problems are coupled through geometric constraints. Thus, we introduce Competitive Collaboration, a framework that facilitates the coordinated training of multiple specialized neural networks to solve complex problems.

with Anurag Ranjan, Varun Jampani, Deqing Sun, Jonas Wulff, and Michael Black


  Learning Rigidity for 3D Scene Flow Estimation

Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation
In ECCV 2018 [PDF] [Talk slide] [Video] [Project page] [Code]

In a dynamic scene, the main challenge is the disambiguation of the camera motion from scene motion, which becomes more difficult as the amount of rigidity observed decreases. In this paper we propose to learn the rigidity of a scene in a supervised manner from a large collection of dynamic scene data, and directly infer a rigidity mask from two sequential images with depths.

with Zhaoyang Lv, Alejandro Troccoli, Deqing Sun, James M. Rehg, and Jan Kautz


  Hierarchical GMM for 3D Point Cloud Registration

HGMR: Hierarchical Gaussian Mixtures for Adaptive 3D Registration
In ECCV 2018 [PDF] [Video] [Project page]

We present a new registration algorithm that is able to achieve state-of-the-art speed and accuracy through its use of an adaptive hierarchical Gaussian Mixture Model (GMM) representation. Our method constructs a top-down multi-scale representation of point cloud data by recursively running many small-scale data likelihood segmentations in parallel on a GPU. It performs a pointwise data association in logarithmic-time while dynamically adjusting the level of detail to best match the complexity and spatial distribution of geometry.

with Ben Eckart and Jan Kautz

[CVPR18] Spotlight Oral

  Learning-based Camera Localization (MapNet)

Geometry-Aware Learning of Maps for Camera Localization (MapNet)
In CVPR 2018 [PDF] [Video] [Project page] [Code]

We propose to represent a map as a deep neural net called MapNet, which enables learning a data-driven map representation. Geometric constraints expressed by additional sensory inputs such as visual odometry and GPS, which have traditionally been used in bundle adjustment or pose-graph optimization, are formulated as loss terms in MapNet training and also used during inference.

with Samarth Brahmbhatt, Jinwei Gu, James Hays, and Jan Kautz

[ICCV17] Oral

  Deep Learning-based Reflectance Estimation On-the-fly

A Lightweight Approach for On-the-Fly Reflectance Estimation
In ICCV 2017 [PDF] [Talk slides] [Video] [Project page] [Dataset]

with Jinwei Gu, Stephen Tyree, Pavlo Molchanov, Matthias Niessner, and Jan Kautz


  Joint Optimization of Geometry, Color and Lighting for 3D Reconstruction

Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization with Spatially-Varying Lighting
In ICCV 2017 [PDF] [Talk slides] [Project page] [Dataset] [Code]

We introduce a novel method to obtain high-quality 3D reconstructions from consumer RGB-D sensors. We simultaneously optimize a geometry encoded in a signed distance field, textures from automatically selected keyframes, and their camera poses along with material and scene lighting estimated from spatially-varying spherical harmonics (SVSH) from subvolumes of the reconstructed scene.

with Robert Maier, Daniel Cremers, Jan Kautz, and Matthias Niessner

[3DV17] Oral

  Multi-frame 3D Scene Flow Estimation

Multiframe Scene Flow with Piecewise Rigid Motion
In IEEE 3DV 2017 [PDF] [Talk slides] [Video] [Project page]

We introduce a novel multiframe scene flow approach that jointly optimizes the consistency of the patch appearances and their local rigid motions from RGB-D image sequences. We formulate scene flow recovery as a global non-linear least squares problem which is iteratively solved by a damped Gauss-Newton approach. As a result, we obtain a qualitatively new level of accuracy in RGB-D based scene flow estimation which can potentially run in real-time.

with Vladislav Golyanik, Robert Maier, Matthias Niessner, and Jan Kautz
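
For readers unfamiliar with the solver named above, here is a minimal sketch of a damped Gauss-Newton iteration on a toy one-parameter least-squares problem (fitting `a` in `y = exp(a * x)`). The toy model, the fixed damping value, and the function name `fit_exponent` are assumptions chosen for illustration; the scene flow energy in the paper is far larger and is not reproduced here.

```python
import math


def fit_exponent(xs, ys, a0=0.0, damping=1e-3, iterations=50):
    """Fit a in y = exp(a * x) by damped Gauss-Newton on squared residuals."""
    a = a0
    for _ in range(iterations):
        residuals = [math.exp(a * x) - y for x, y in zip(xs, ys)]
        jacobian = [x * math.exp(a * x) for x in xs]  # d residual / d a
        jtj = sum(j * j for j in jacobian)
        jtr = sum(j * r for j, r in zip(jacobian, residuals))
        # The damping term keeps the step bounded when jtj is small.
        a -= jtr / (jtj + damping)
    return a


xs = [0.0, 1.0, 2.0, 3.0]
ys = [math.exp(0.5 * x) for x in xs]  # noise-free toy data, true a = 0.5
a_hat = fit_exponent(xs, ys)
```

Each iteration linearizes the residuals around the current estimate and solves the resulting small linear system, here a scalar division.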

[CVPR16] Oral

  Accelerated Generative model (GMM) for 3D Vision

Accelerated Generative Models for 3D Point Cloud Data
In IEEE CVPR 2016 [PDF] [Talk slides] [Video] [Project page]

In this paper we introduce a method for constructing compact generative representations of point cloud data at multiple levels of detail using a hierarchical Gaussian Mixture Model (hGMM). As opposed to deterministic structures such as voxel grids or octrees, we propose probabilistic subdivisions of the data through local mixture modeling, and show how these subdivisions can provide a maximum likelihood segmentation of the data.

with Benjamin Eckart, Alejandro Troccoli, Alonzo Kelly, and Jan Kautz


  Online classification of Dynamic Hand Gestures with R3DCNN

Online Detection and Classification of Dynamic Hand Gestures with Recurrent 3D Convolutional Neural Networks
In IEEE CVPR 2016 [PDF] [Video] [Project page]

Automatic detection and classification of dynamic hand gestures is challenging because: 1) there is a large diversity in how people perform gestures, making detection and classification difficult; 2) the system must work online in order to avoid noticeable lags between performing a gesture and its classification. We address these challenges with a recurrent three-dimensional convolutional neural network that outperforms state-of-the-art methods.

with Pavlo Molchanov, Shalini Gupta, Xiaodong Yang, Stephen Tyree, and Jan Kautz


  VirtualEye: Real-time 3D Reconstruction for Fast Free View Video

NVIDIA VirtualEye: Real-time Fast Free View Video
In DARPA Wait What: A Future Technology Forum 2015, [Media] [Video]

We introduce a live, real-time full HD visualization of scenes with both dynamic non-rigid objects and rigid static background structure, using commodity depth and stereo cameras. This demo was introduced at DARPA's Wait What: A Future Technology Forum 2015. The project aims at real-time (30+ fps) visualization of free view video streams from multiple cameras. The whole pipeline (preprocessing, capturing, fusion, and meshification) is accelerated with NVIDIA's CUDA.

with Alejandro Troccoli, Xiaodong Yang, Natesh Srinivasan, and Jan Kautz

[3DV15] Oral

  Fast and accurate PCD registration with GMM

MLMD: Maximum Likelihood Mixture Decoupling for Fast and Accurate Point Cloud Registration
In IEEE 3D Vision (3DV 2015) [PDF] [Video]

We introduce a PCD registration algorithm that utilizes Gaussian Mixture Models (GMM) and a novel dual-mode parameter optimization technique which we call mixture decoupling. We show how this decoupling technique facilitates both faster and more robust registration by first optimizing over the mixture parameters (decoupling the mixture weights, means, and covariances from the points) before optimizing over the 6DOF registration parameters.

with Benjamin Eckart, Alejandro Troccoli, Alonzo Kelly, and Jan Kautz

[EGSR15] Oral

  Physically-based Rendering for Mixed and Augmented Reality

Filtering Environment Illumination for Interactive Physically-Based Rendering in Mixed Reality
In Eurographics Symposium on Rendering (EGSR) 2015 [PDF] [Supp] [Video]

We propose a photo-realistic augmented and mixed reality system that runs at interactive rates. Our primary contribution is an axis-aligned filtering scheme that preserves the frequency content of the illumination. We then demonstrate a novel two-mode path tracing approach that allows ray-tracing a scene with image-based real geometry (captured from a commodity depth camera) and mesh-based virtual geometry.

with Soham Mehta, Dawid Pajak, Kari Pulli, Jan Kautz, and Ravi Ramamoorthi

[CVPRW15] Oral

  3D CNN for Dynamic Hand Gesture Recognition

Hand Gesture Recognition with 3D Convolutional Neural Networks
In IEEE CVPR 2015 Workshop on Hand gesture recognition
Winner of the first HANDS challenge competition 2015. [PDF]

We propose an algorithm for drivers' hand gesture recognition from challenging depth and intensity data using 3D convolutional neural networks. Our solution combines information from multiple spatial scales for the final prediction. It also employs spatio-temporal data augmentation for more effective training and to reduce potential overfitting. Our method achieves a correct classification rate of 77.5% on the VIVA challenge dataset.

with Pavlo Molchanov, Shalini Gupta, and Jan Kautz

[FG15] [RIDARCon15]

  Multi-sensor Deep Learning architecture for Gesture recognition

Multi-sensor System for Driver's Hand-Gesture Recognition
In IEEE Automatic Face and Gesture Recognition (FG 2015) [PDF]

Short-Range FMCW Monopulse Radar for Hand-Gesture Sensing
In IEEE International Radar conference 2015 [PDF]

We propose a novel multi-sensor system for accurate and power-efficient dynamic car-driver hand-gesture recognition, using a short-range radar, a color camera, and a depth camera, which together make the system robust against variable lighting conditions.

with Pavlo Molchanov, Shalini Gupta, and Kari Pulli

[3DV14] Oral

  DT-SLAM: Robust SLAM with Adaptive Triangulation for Rotation

DT-SLAM: Deferred Triangulation for Robust SLAM
IEEE 3D Vision Conference (3DV 2014) [PDF] [Video] [Code]

We introduce a real-time visual SLAM system that incrementally tracks individual 2D features, and estimates camera pose by using matched 2D features, regardless of the length of the baseline. Triangulating 2D features into 3D points is deferred until keyframes with sufficient baseline for the features are available. Our method can also deal with pure rotational motions, and fuse the two types of measurements in a bundle adjustment step.

with Daniel C. Herrera and Kari Pulli
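
One simple way to picture the "defer until the baseline is sufficient" decision is a parallax test between the two viewing rays of a matched feature, expressed in a common world frame. The sketch below is an assumption-laden illustration: the parallax-angle criterion, the 2-degree threshold, and the function names are hypothetical and are not taken from the DT-SLAM code.

```python
import math


def ray_angle_deg(ray_a, ray_b):
    """Angle in degrees between two world-frame bearing vectors."""
    dot = sum(a * b for a, b in zip(ray_a, ray_b))
    norm_a = math.sqrt(sum(a * a for a in ray_a))
    norm_b = math.sqrt(sum(b * b for b in ray_b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_angle))


def should_triangulate(ray_a, ray_b, min_angle_deg=2.0):
    """Triangulate only when the rays show enough parallax (hypothetical rule)."""
    return ray_angle_deg(ray_a, ray_b) >= min_angle_deg


# Pure rotation: both views see the feature along the same world direction.
same_direction = should_triangulate((0.0, 0.0, 1.0), (0.0, 0.0, 2.0))

# Translation between views: the rays differ by 5 degrees.
five_deg = math.radians(5.0)
with_parallax = should_triangulate(
    (0.0, 0.0, 1.0), (math.sin(five_deg), 0.0, math.cos(five_deg))
)
```

Under this rule a feature seen only under pure rotation stays a 2D feature, while one observed with enough parallax becomes a candidate for a 3D point.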


  WYSIWYG Viewfinder: Real-time Segmentation and Editing

WYSIWYG Computational Photography via Viewfinder Editing
ACM Transactions on Graphics, SIGGRAPH Asia 2013 [PDF] [Video] [Project page]

We introduce WYSIWYG viewfinder editing, which makes the viewfinder more accurately reflect the final image the user intends to create. We allow the user to alter the local or global appearance (tone, color, or focus) via stroke-based input, and propagate the edits spatiotemporally. The system then delivers a real-time visualization of these modifications to the user, and drives the camera control routines to select better capture parameters.

with Jongmin Baek, Dawid Pajak, Kari Pulli, and Marc Levoy


  Prediction of Camera Motions with Gaussian Process Regression

Detecting Regions of Interest in Dynamic Scenes with Camera Motions
In IEEE CVPR 2012 [PDF] [Video] [Project page]

We use stochastic fields for predicting important future regions of interest as the scene evolves dynamically. We evaluate our approach on a variety of videos of team sports. We show that our approach can detect where to move the camera based on observations in the scene, and we compare the detected/predicted regions of interest to the camera motion generated by actual camera operators.

with Dongryeol Lee and Irfan Essa


  Gaussian Process Regression Flow (GPRF)

Gaussian Process Regression Flow for Analysis of Motion Trajectories
In IEEE ICCV 2011 [PDF] [Video] [Project page]

In this paper, we introduce a new representation specifically aimed at matching motion trajectories. We model a trajectory as a continuous dense flow field from a sparse set of vector sequences using Gaussian Process Regression. Our approach works well on various types of complete and incomplete trajectories from a variety of video data sets with different frame rates.

with Dongryeol Lee and Irfan Essa


  Global Motion Prediction for Automated Broadcasting System

Motion Fields to Predict Play Evolution in Dynamic Sports Scenes
In IEEE CVPR 2010 [PDF] [Video] [Project page]

Player actions and interactions in dynamic sports scenes are complex as they are driven by many factors, such as the short-term goals of the individual player, the overall team strategy, the rules of the sport, and the current context of the game. We show that such constrained multi-agent events can be analyzed, and even predicted, by estimating the global movements of all players in the scene at any time, and that these estimates can be used to predict play evolution.

with Matthias Grundmann, Ariel Shamir, Iain Matthews, Jessica Hodgins and Irfan Essa


  Player Tracking and Localization with Multiple Cameras

Player Localization using Multiple Static Cameras for Sports
In IEEE CVPR 2010 [PDF] [Video] [Project page]

We model the problem of fusing corresponding players' positional information as finding minimum-weight K-length cycles in complete K-partite graphs. We use our proposed class of algorithms in an end-to-end sports visualization framework, and demonstrate its robustness by presenting results over 60,000 frames of real soccer footage captured over five different illumination conditions, play types, and team attire.

with Raffay Hamid, Ram Krishan Kumar, Matthias Grundmann, Jessica Hodgins and Irfan Essa
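
The graph formulation can be illustrated with a small brute-force sketch: each camera contributes one partition of candidate ground-plane detections, and a fused player corresponds to the cheapest cycle that takes one detection from every camera. This is an exhaustive search written for clarity, not the algorithm proposed in the paper; the Euclidean edge weights, the sample coordinates, and the function name `min_weight_cycle` are assumptions.

```python
import itertools
import math


def min_weight_cycle(detections_per_camera):
    """Pick one detection per camera minimising the closed-loop distance.

    detections_per_camera: list of K lists of (x, y) ground-plane points.
    The cycle visits cameras in list order and returns to the first, which
    covers every cycle for K = 3; larger K would also need to consider
    other visiting orders.
    """
    best_cost, best_choice = math.inf, None
    for choice in itertools.product(*detections_per_camera):
        cost = sum(
            math.dist(choice[i], choice[(i + 1) % len(choice)])
            for i in range(len(choice))
        )
        if cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_choice, best_cost


# Three cameras, each reporting two players (hypothetical coordinates).
cameras = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(0.1, 0.0), (10.0, 10.2)],
    [(9.9, 10.0), (0.0, 0.1)],
]
choice, cost = min_weight_cycle(cameras)
```

The cheapest cycle groups the three detections near the origin, i.e. the three cameras' views of the same player. Exhaustive search grows exponentially with the number of cameras, which is why a dedicated algorithm is needed in practice.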

[ISMAR09] Oral

  Augmenting Earth-Maps with Dynamic Information

Augmenting Aerial Earth Maps with Dynamic Information
In ISMAR 2009, Journal of Virtual Reality 2011 [PDF] [Video] [Slide] [Project page]

with Irfan Essa, Sangmin Oh and Jeonggyu Lee [Media] : CNN, New Scientist, Popular Science, Discovery Channel, MIT Tech Review, Engadget, Vizworld, Revolution Magazine, etc.


  Real-time Transparent-Colored Shadow Volume

A Shadow Volume Algorithm for Opaque and Transparent Non-Manifold Casters
Journal of Graphics Tools 2008 [PDF] [Video]

We provide a novel shadow volume algorithm that extends to general non-manifold meshes, with an additional extension to shadows of transparent casters. To achieve these, we first introduce a generalization of an object’s silhouette to non-manifold meshes. We then compute the light intensity arriving at the receiver fragments after the light has traveled through multiple colored transparent receiver surfaces.

with Byungmoon Kim, Greg Turk

[ISWC08] Oral

  GPSRay: 3D Reconstruction of Urban Scenes using GPS

Localization and 3D Reconstruction of Urban Scenes Using GPS
IEEE ISWC 2008 [PDF] [Video] [Slide]

Using off-the-shelf Global Positioning System (GPS) units, we reconstruct buildings in 3D by exploiting the reduction in signal to noise ratio (SNR) that occurs when the buildings obstruct the line-of-sight between the moving units and the orbiting satellites. We measure the size and height of skyscrapers and automatically construct a density map representing the location of multiple buildings in an urban landscape.

with Jay Summet, Thad Starner, Mrunal Kapade, Daniel Ashbrook, and Irfan Essa


  Video based Non-Photorealistic Rendering

Video based Non-Photorealistic Rendering
Samsung STAR/SAIT 2008 [PDF] [Video1] [Video2]

We build a non-photorealistic rendering (NPR) system using a global gradient field obtained from radial basis interpolation, together with dispersion filters (water-colorization). For temporal coherence, we adopt Michael Black's piecewise-smooth flow fields (robust regularization). The dispersion filter is designed to mimic pigment dispersion in water.

with Irfan Essa

[ACMMM 06]

  Interactive Mosaic Generation for Video Navigation

Interactive Mosaic Generation for Video Navigation
ACM Multimedia (ACMMM) 2006 [PDF] [Project page]

We introduce a novel mosaicing algorithm based on multi-scale tiling. The method allows users to create a mosaic from a collection of videos and to navigate and edit the video scenes. In the matching process, we used the annotated information from Family Video

with Irfan Essa, Gregory Abowd


  Face Recognition using Generalized SVD

Face Recognition using Generalized Singular Value Decomposition
Tech report [Project page]

We propose a face recognition algorithm using GSVD. We use Linear Discriminant Analysis with Generalized Singular Value Decomposition (GSVD), which effectively reduces the dimension of the input images while maintaining better classification performance.

with Sangmin Lee, James M. Rehg, and Haesun Park


  Real-time Face Detection

Face Detection with Adaboost and Morphology Operators
Tech report [Project page]

We demonstrate the implementation of two state-of-the-art face detection algorithms. We first demonstrate Viola's Adaboost-based approach, then show Han's morphology-based approach, and finally show how we fuse both methods, with various evaluations.

Research and Development at Samsung SDS IT R&D Center



  ViaFace: Face Identification System

Face Detection with Adaboost and Morphology Operators
Tech report Details

Samsung IT R&D Center released a face recognition system (ViaFace) in 2002 after 3 years of research. It was demonstrated at Comdex 2001 in Las Vegas. Later, it was deployed in various fields and industries, including the well-known Korean apartment franchise 'Raemian' and an airport in Mexico. The system covers both verification and identification.


  Syncbiz: Real-time Collaboration System

Samsung Real-time Collaboration System
Tech report Details

Syncbiz is a real-time collaboration system that includes an application sharing module, text chat, a video/audio conferencing module, a shared virtual directory module, a multiuser whiteboard module, and a real-time agenda scheduler. A single Syncbiz local server can sustain 50 concurrent sessions.
This project is a winner of 2003 Samsung Best Solution Award.


  IP-STB Framework : LivingWise CS (LWCS)

LivingWiseCS (LWCS) Samsung's Smart City STB Framework
Tech report Details

LWCS is a framework for IP set-top boxes used in Smart City projects by Samsung Electronics and KT (Korea Telecom). It manages overall I/O and controllers on top of the Microsoft Windows CE environment.

with Taesoo Jun, Hanchoel Kim and Joonsung Park.