Interpretability via Region Grouping

Yin’s Group, UW-Madison | Sept 2018 - May 2020

This research project aimed to build an interpretable deep model without sacrificing accuracy. Interpretability comes from our plug-in interpretable unit, inspired by perceptual grouping and the attention mechanism. With a novel prior as regularization, our model demonstrates superior interpretability both qualitatively and quantitatively, without hurting the discriminative power of the baseline or using any extra supervision. The paper was accepted to CVPR 2020 as an oral presentation.

Sparsity-invariant Network for Depth Completion

Research Center, SenseTime | Feb 2018 - June 2018

We developed several novel and effective CNN modules for handling spatially irregular inputs. These modules were then combined into a multi-scale network, which was tested on the depth completion task. At the time of submission, our method ranked first on the KITTI depth completion benchmark. This work led to a paper published in IEEE Transactions on Image Processing (TIP).

Visual Odometry aided Video Segmentation

Autonomous Driving Group, SenseTime | July 2017 - Sept 2017

This work aimed to improve video semantic segmentation with visual odometry (VO) algorithms, which explicitly model rich temporal information. We designed a model that merged sparse depth maps and object trajectories from the VO algorithms into the semantic segmentation backbone. The model was evaluated on large datasets and achieved state-of-the-art results.