📝 Publications

High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion
Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross†, Christopher Schroers†
[arXiv]
- We introduce SplatDiff, a pixel-splatting-guided video diffusion model for synthesizing novel views with consistent geometry and high-fidelity texture from a single image.
- SplatDiff excels in single-view novel view synthesis, sparse-view novel view synthesis, and stereo video conversion, demonstrating remarkable cross-domain and cross-task performance.

BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation
Xiang Zhang, Bingxin Ke, Hayko Riemenschneider, Nando Metzger, Anton Obukhov, Markus Gross†, Konrad Schindler, Christopher Schroers†
- We propose BetterDepth to boost zero-shot MDE methods with plug-and-play diffusion refiners, achieving robust affine-invariant MDE performance with fine-grained details.
- We design global pre-alignment and local patch masking strategies to enable learning detail refinement from small-scale synthetic datasets while preserving rich prior knowledge from pre-trained MDE models for zero-shot transfer.

HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution
Xiang Zhang, Yulun Zhang, Fisher Yu
- We propose a simple yet effective strategy (HiT-SR) to convert popular transformer-based SR methods to our hierarchical transformers, boosting SR performance by exploiting multi-scale features and long-range dependencies.
- We design a spatial-channel correlation method that efficiently leverages spatial and channel features with computational complexity linear in the window size, enabling the use of large hierarchical windows, e.g., $64\times64$ windows.

Generalizing Event-Based Motion Deblurring in Real-World Scenarios
Xiang Zhang, Lei Yu†, Wen Yang, Jianzhuang Liu, Gui-Song Xia
- A scale-aware network is designed to allow flexible setups of input spatial resolutions and enable learning from different temporal scales of motion blur.
- A self-supervised learning framework is proposed for model training with real-world data and performance generalization in spatial and temporal domains.
- A multi-scale real-world blurry dataset (MS-RBD) is constructed to facilitate the evaluation of deblurring performance in real-world scenarios.

Learning to See Through with Events
Lei Yu†, Xiang Zhang, Wei Liao, Wen Yang, Gui-Song Xia
- An event-based synthetic aperture imaging (E-SAI) algorithm is proposed to see through dense occlusions even under extreme lighting conditions.
- A hybrid network composed of a spiking encoder and a convolutional decoder is designed to mitigate the disturbances from occlusions and guarantee the overall reconstruction performance.

Unifying Motion Deblurring and Frame Interpolation with Events
Xiang Zhang, Lei Yu†
- We present a unified framework for event-based video deblurring and interpolation (EVDI).
- By utilizing the constraints between cross-modal frames and events, a fully self-supervised learning method is proposed to enable network training with real-world data without requiring ground-truth images.

- IJCV 2025: Self-Supervised Shutter Unrolling with Events, Mingyuan Lin*, Yangguang Wang*, Xiang Zhang, Boxin Shi, Wen Yang, Chu He, Gui-Song Xia, Lei Yu. | [Website] [Code] [Dataset]
- TPAMI 2024: CrossZoom: Simultaneous Motion Deblurring and Event Super-Resolving, Chi Zhang, Xiang Zhang, Mingyuan Lin, Cheng Li, Chu He, Wen Yang, Gui-Song Xia, Lei Yu. | [Website]
- TIP 2024: Neuromorphic Synergy for Video Binarization, Shijie Lin, Xiang Zhang, Lei Yang, Lei Yu, Bin Zhou, Xiaowei Luo, Wenping Wang, Jia Pan. | [Code&Dataset] [Youtube] [Bilibili]
- TPAMI 2023: Learning to Super-Resolve Blurry Images with Events, Lei Yu†, Bishan Wang, Xiang Zhang, Haijian Zhang, Wen Yang, Jianzhuang Liu, Gui-Song Xia. | [Code]
- TSP 2022: Spiking Sparse Recovery with Non-convex Penalties, Xiang Zhang, Lei Yu†, Gang Zheng, Yonina C. Eldar.
- CVPR 2022: Synthetic Aperture Imaging with Events and Frames, Wei Liao*, Xiang Zhang*, Lei Yu†, Shijie Lin, Wen Yang, Ning Qiao. | [Code] [Dataset]
- CVPR 2021: Event-based Synthetic Aperture Imaging with a Hybrid Network, Xiang Zhang*, Wei Liao*, Lei Yu†, Wen Yang, Gui-Song Xia. (Oral, Best Paper Candidate) | [Code] [Dataset] [Youtube]

* means equal contribution and † indicates my supervisor.