📝 Publications

CVPR 2026

Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views

Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers

[arXiv]

  • We present HairGuard to capture, model, and reconstruct fine-grained soft boundary details in 3D vision tasks, achieving state-of-the-art performance on monocular depth estimation, stereo conversion, and novel view synthesis.
  • We leverage image matting datasets for training, enabling HairGuard to automatically identify and fix soft boundaries without relying on manually crafted cues like trimaps. A plug-and-play depth fixer is proposed for precise refinement, alongside a color fuser for high-quality view synthesis.
SIGGRAPH 2025

High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion

Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers

[Website] [Paper] [arXiv] [Supp] [Video]

  • We introduce SplatDiff, a pixel-splatting-guided video diffusion model for synthesizing novel views with consistent geometry and high-fidelity texture from a single image.
  • SplatDiff excels in single-view novel view synthesis, sparse-view novel view synthesis, and stereo video conversion, demonstrating remarkable cross-domain and cross-task performance.
NeurIPS 2024

BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation

Xiang Zhang, Bingxin Ke, Hayko Riemenschneider, Nando Metzger, Anton Obukhov, Markus Gross, Konrad Schindler, Christopher Schroers

[Website] [arXiv] [Poster]

  • We propose BetterDepth to boost zero-shot MDE methods with plug-and-play diffusion refiners, achieving robust affine-invariant MDE performance with fine-grained details.
  • We design global pre-alignment and local patch masking strategies to enable learning detail refinement from small-scale synthetic datasets while preserving rich prior knowledge from pre-trained MDE models for zero-shot transfer.
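The global pre-alignment step builds on the standard trick for affine-invariant depth: a least-squares scale-and-shift fit between prediction and reference. A minimal sketch of that alignment (my own illustration, not BetterDepth's exact procedure):

```python
import numpy as np

def align_scale_shift(pred, target):
    """Least-squares scale s and shift t so that s * pred + t ~= target.

    This is the usual alignment step for affine-invariant monocular depth,
    solved in closed form via the normal equations.
    """
    pred = pred.ravel().astype(np.float64)
    target = target.ravel().astype(np.float64)
    A = np.stack([pred, np.ones_like(pred)], axis=1)  # (N, 2) design matrix
    (s, t), *_ = np.linalg.lstsq(A, target, rcond=None)
    return s * pred + t, s, t

# Example: a prediction that differs from the reference by an affine map
gt = np.linspace(1.0, 10.0, 100)
pred = 0.5 * gt - 2.0                 # arbitrary scale/shift ambiguity
aligned, s, t = align_scale_shift(pred, gt)
print(np.allclose(aligned, gt))       # True: affine ambiguity removed
```

After this coarse alignment, only the residual local detail remains for the refiner to learn, which is what makes training on small synthetic datasets feasible.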
ECCV 2024

HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

Xiang Zhang, Yulun Zhang, Fisher Yu

ECCV 2024 Oral Presentation

[Code] [Supp] [Video]

  • We propose a simple yet effective strategy (HiT-SR) to convert popular transformer-based SR methods to our hierarchical transformers, boosting SR performance by exploiting multi-scale features and long-range dependencies.
  • We design a spatial-channel correlation method to efficiently leverage spatial and channel features with computational complexity linear in window size, enabling the use of large hierarchical windows, e.g., 64×64 windows.
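The complexity argument can be illustrated with a rough multiply-add count (my own sketch, not HiT-SR's actual layers): for N = w² tokens with C channels per window, spatial self-attention builds an N×N attention map (quadratic in window area), whereas a channel-wise correlation builds a C×C map whose cost grows only linearly with N.

```python
def spatial_attention_flops(n_tokens, channels):
    # N x N attention map: O(N^2 * C) multiply-adds (QK^T plus weighting V)
    return 2 * n_tokens * n_tokens * channels

def channel_correlation_flops(n_tokens, channels):
    # C x C correlation map: O(N * C^2), linear in the number of tokens
    return 2 * channels * channels * n_tokens

C = 64
for w in (8, 16, 64):                # window side length
    N = w * w                        # tokens per window
    print(w, spatial_attention_flops(N, C) / channel_correlation_flops(N, C))
# The cost ratio N / C grows with window size, so large windows
# (e.g. 64x64) only stay affordable with the channel-wise formulation.
```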
ICCV 2023

Generalizing Event-Based Motion Deblurring in Real-World Scenarios

Xiang Zhang, Lei Yu, Wen Yang, Jianzhuang Liu, Gui-Song Xia

[Code] [Dataset] [YouTube]

  • A scale-aware network is designed to allow flexible setups of input spatial resolutions and enable learning from different temporal scales of motion blur.
  • A self-supervised learning framework is proposed for model training with real-world data and performance generalization in spatial and temporal domains.
  • A multi-scale real-world blurry dataset (MS-RBD) is constructed to facilitate the evaluation of deblurring performance in real-world scenarios.
TPAMI 2022

Learning to See Through with Events

Lei Yu, Xiang Zhang, Wei Liao, Wen Yang, Gui-Song Xia

[Code] [Dataset] [Bilibili]

  • We provide further analysis of the E-SAI framework, including additional details on the components of triggered events and the corresponding epipolar geometry.
  • We design a spatial transformer network to automatically refocus the events collected by a moving event camera with fronto-parallel uniform motion, relaxing the dependence on prior information such as camera velocity and target depth.
CVPR 2022

Unifying Motion Deblurring and Frame Interpolation with Events

Xiang Zhang, Lei Yu

[Code] [YouTube]

  • We present a unified framework for event-based video deblurring and interpolation (EVDI) that generates arbitrarily high frame-rate sharp videos from blurry inputs.
  • By utilizing the constraints between cross-modal frames and events, a fully self-supervised learning method is proposed to enable network training with real-world data without requiring ground-truth images.
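One classic cross-modal constraint of this kind is the event double-integral relation: a blurry frame is the temporal average of the latent sharp frames, and events encode the log-intensity changes that connect them. A toy single-pixel sketch of this consistency (my illustration of the general relation, not the EVDI loss itself):

```python
import numpy as np

# Latent log-intensity trajectory of one pixel over the exposure window
t = np.linspace(0.0, 1.0, 1000)
log_L = 0.5 * np.sin(2 * np.pi * t)       # ground-truth latent signal
L = np.exp(log_L)

# Idealized "events": log-intensity increments between samples
# (a real sensor quantizes these by a contrast threshold)
events = np.diff(log_L)

# Any latent frame = the first frame warped by the accumulated events
L_rec = L[0] * np.exp(np.concatenate([[0.0], np.cumsum(events)]))

# Blur consistency: the blurry frame is the temporal mean of the latents,
# so events plus one latent frame pin down the whole sharp sequence
blur = L.mean()
print(np.allclose(L_rec, L))              # True
print(np.isclose(L_rec.mean(), blur))     # True
```

Because both directions of this relation can be checked against real blurry frames and events, no ground-truth sharp images are needed for supervision.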
TSP 2022

Spiking Sparse Recovery with Non-convex Penalties

Xiang Zhang, Lei Yu, Gang Zheng, Yonina C. Eldar

[Paper]

  • We present an adaptive sparse spiking recovery (A-SSR) algorithm to solve a class of non-convex regularized SR problems with spiking neural networks.
  • When implemented on the neuromorphic Loihi chip, our A-SSR can solve sparse recovery problems with approximately 1% of the power consumption of the fast iterative shrinkage-thresholding algorithm (FISTA).
CVPR 2021

Event-based Synthetic Aperture Imaging with a Hybrid Network

Xiang Zhang*, Wei Liao*, Lei Yu, Wen Yang, Gui-Song Xia

CVPR 2021 Best Paper Candidate and Oral Presentation

[Code] [Dataset] [YouTube]

  • An event-based synthetic aperture imaging (E-SAI) algorithm is proposed to see through dense occlusions even under extreme lighting conditions.
  • A hybrid network composed of a spiking encoder and a convolutional decoder is designed to mitigate the disturbances from occlusions and guarantee the overall reconstruction quality.

* denotes equal contribution.