How Does PointInfinity Work?
PointInfinity is a conditional point diffusion model that generates point clouds conditioned on RGB-D images. During training, PointInfinity learns to denoise low-resolution point clouds; at inference time, it generates point clouds of much higher resolution. The figure below gives an overview of PointInfinity's setup.
The core of PointInfinity is the two-stream block. It is the main building block of our denoiser, and it decouples the underlying surface representation from the raw point cloud representation. As shown below, the two-stream block consists of 1) a fixed-size latent surface stream, 2) a variable-size raw data stream, and 3) lightweight read/write modules that exchange information between the two streams. The read module loads information from the data stream into the latent surface stream, where the main computation is performed. The per-point noise prediction is then obtained by writing the result back to the data stream. The key idea is to execute most of the computation in the resolution-invariant latent space, which makes the denoiser robust to changes in resolution.
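The data flow through a two-stream block can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the PointInfinity implementation: it assumes the read and write modules behave like single-head cross-attention without learned projections, and it stands in for the latent-stream computation with one residual layer. The names `cross_attend`, `two_stream_block`, and `w_latent`, as well as all sizes, are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, context):
    # Assumed stand-in for a read/write module: each query row gathers
    # a softmax-weighted average of the context rows.
    scores = queries @ context.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ context

def two_stream_block(latent, data, w_latent):
    # Read: the fixed-size latent stream pulls information from the
    # variable-size data stream.
    latent = latent + cross_attend(latent, data)
    # Main computation happens in the latent stream; this step does not
    # depend on the number of points (placeholder residual layer).
    latent = latent + np.tanh(latent @ w_latent)
    # Write: every point queries the latent stream to get its own update,
    # which plays the role of the per-point prediction.
    data = data + cross_attend(data, latent)
    return latent, data

rng = np.random.default_rng(0)
dim, num_latents = 16, 8  # hypothetical sizes
latent0 = rng.normal(size=(num_latents, dim))
w_latent = rng.normal(size=(dim, dim)) / np.sqrt(dim)

# The same block and the same weights handle different point counts.
for num_points in (64, 1024):
    data = rng.normal(size=(num_points, dim))
    latent, out = two_stream_block(latent0, data, w_latent)
    print(num_points, latent.shape, out.shape)
```

In this sketch the latent stream keeps its shape of `(num_latents, dim)` regardless of how many points are fed in, while the output of the data stream always has one row per input point. Only the read and write steps scale with the number of points.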