Yin’s Group, UW-Madison | Sept 2018 - May 2020
This research project aimed to build an interpretable deep model without sacrificing accuracy. Interpretability is achieved with our plug-in interpretable unit, inspired by perceptual grouping and attention mechanisms. With a novel prior as regularization, our model demonstrates superior interpretability both qualitatively and quantitatively, without hurting the discriminative power of the baseline or using any extra supervision. The paper was accepted to CVPR 2020 as an oral presentation.
Research Center, Sensetime | Feb 2018 - June 2018
We developed several novel and effective CNN modules for handling spatially irregular inputs. These modules were then combined into a multi-scale network, which was tested on the depth completion task. At the time of submission, our method ranked first on the KITTI depth completion benchmark. This work led to a paper published in IEEE Transactions on Image Processing (TIP).
Autonomous Driving Group, Sensetime | July 2017 - Sept 2017
This work aimed to improve video semantic segmentation using visual odometry (VO) algorithms, which explicitly model rich temporal information. We designed a model that merged sparse depth maps and object trajectories from the VO algorithms into the semantic segmentation backbone. The model was evaluated on large datasets and achieved state-of-the-art results.
Exploring Visual Concepts within Adversarial Trained CNN, 2019: Compared the visual concepts learned by adversarially trained CNNs with those learned by CNNs trained on clean data, using visualization tools such as Network Dissection and other specially designed statistics. Found that adversarially trained networks had roughly twice as many interpretable visual concepts.
Exploring the Robustness of Different Modules in CNN, 2019: Examined the adversarial robustness of different CNN modules such as ReLU and batch normalization (BN). Found that BN could harm the robustness of a CNN by a large margin, and that a small modification to ReLU could substantially increase robustness.
Lightweight Deep Learning Framework in CUDA/C++, 2019: Implemented a lightweight deep learning framework that supports several common layers. Achieved training speed and accuracy comparable to popular frameworks such as PyTorch and TensorFlow.
Object Tracking in Indoor Surveillance Video, 2017: Conducted a literature review on object tracking and implemented several state-of-the-art algorithms in MATLAB. Improved their performance on indoor surveillance videos using global search and recovery algorithms.
Implementing Classical Vision Algorithms, 2016: Implemented several classical vision algorithms, such as SIFT and the Hough transform, from scratch in C++.