👋 Hi there!

I am currently a joint PhD student at the Computer Graphics Lab of ETH Zürich and DisneyResearch|Studios, supervised by Prof. Markus Gross and Dr. Christopher Schroers. I also spent 4 months working on low-level vision at the Computer Vision Lab, ETH Zürich, supervised by Prof. Yulun Zhang. Before that, I received my M.E. and B.E. degrees in 2023 and 2020, respectively, from the Electronic Information School of Wuhan University, where I worked closely with Prof. Lei Yu on event-based vision. I am a lifelong learner with broad interests, including 3D vision, low-level vision, signal processing, and neuromorphic computing. I am particularly interested in achieving robust perception in complex environments for real-world applications.

🔥 News

2025.03: 🎉 SplatDiff is accepted by SIGGRAPH 2025!

2025.01: 🎉 SelfUnroll is accepted by Springer IJCV 2025! Congrats to Mingyuan Lin and Yangguang Wang!

2024.09: 🎉 BetterDepth is accepted by NeurIPS 2024!

2024.07: 🎉 HiT-SR is accepted by ECCV 2024 (Oral)! Code, models, and results are released!

2024.05: 🎉 CrossZoom is accepted by IEEE TPAMI 2024! Congrats to Chi Zhang!

2024.02: 🎉 EBR is accepted by IEEE TIP 2024! Congrats to Shijie Lin!

2023.08: 🛠️ The code of our GEM is released.

2023.07: 🎉 GEM is accepted by ICCV 2023!

2023.01: 🎉 eSL-Net++ is accepted by IEEE TPAMI 2023! Congrats to Bishan Wang!

2022.12: 🎉 A-SSR is accepted by IEEE TSP 2022!

2022.12: 🎉 Extended E-SAI is accepted by IEEE TPAMI 2022!

2022.04: 🛠️ The code of our EVDI is released.

2022.03: 🛠️ The code of our EF-SAI is released.

2022.03: 🎉 EVDI and EF-SAI are accepted by CVPR 2022! Congrats to Wei Liao!

2021.07: 🛠️ The code of our E-SAI is released.

2021.06: 🍾 E-SAI is selected as one of the best paper candidates by CVPR 2021!

2021.03: 🎉 E-SAI is accepted by CVPR 2021 (Oral)!

📝 Publications

SIGGRAPH 2025

High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion

Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross^†, Christopher Schroers^†

[Website] [Paper] [arXiv] [Supp] [Video]

We introduce SplatDiff, a pixel-splatting-guided video diffusion model for synthesizing novel views with consistent geometry and high-fidelity texture from a single image.
SplatDiff excels in single-view novel view synthesis, sparse-view novel view synthesis, and stereo video conversion, demonstrating remarkable cross-domain and cross-task performance.

NeurIPS 2024

BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation

Xiang Zhang, Bingxin Ke, Hayko Riemenschneider, Nando Metzger, Anton Obukhov, Markus Gross^†, Konrad Schindler, Christopher Schroers^†

[Website] [arXiv] [Poster]

We propose BetterDepth to boost zero-shot MDE methods with plug-and-play diffusion refiners, achieving robust affine-invariant MDE performance with fine-grained details.
We design global pre-alignment and local patch masking strategies to enable learning detail refinement from small-scale synthetic datasets while preserving rich prior knowledge from pre-trained MDE models for zero-shot transfer.

ECCV 2024 - Oral

HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

Xiang Zhang, Yulun Zhang, Fisher Yu

[Code] [Supp] [Video]

We propose a simple yet effective strategy (HiT-SR) to convert popular transformer-based SR methods to our hierarchical transformers, boosting SR performance by exploiting multi-scale features and long-range dependencies.
We design a spatial-channel correlation method that efficiently leverages spatial and channel features with computational complexity linear in the window size, enabling the use of large hierarchical windows, e.g., $64\times64$ windows.

ICCV 2023

Generalizing Event-Based Motion Deblurring in Real-World Scenarios

Xiang Zhang, Lei Yu^†, Wen Yang, Jianzhuang Liu, Gui-Song Xia

[Code] [Dataset] [Youtube]

A scale-aware network is designed to allow flexible setups of input spatial resolutions and enable learning from different temporal scales of motion blur.
A self-supervised learning framework is proposed for model training with real-world data and performance generalization in spatial and temporal domains.
A multi-scale real-world blurry dataset (MS-RBD) is constructed to facilitate the evaluation of deblurring performance in real-world scenarios.

TPAMI 2022

Learning to See Through with Events

Lei Yu^†, Xiang Zhang, Wei Liao, Wen Yang, Gui-Song Xia

[Code] [Dataset] [Bilibili]

An event-based synthetic aperture imaging (E-SAI) algorithm is proposed to see through dense occlusions even under extreme lighting conditions.
A hybrid network composed of a spiking encoder and a convolutional decoder is designed to mitigate the disturbances from occlusions and guarantee the overall reconstruction performance.

CVPR 2022

Unifying Motion Deblurring and Frame Interpolation with Events

Xiang Zhang, Lei Yu^†

[Code] [Youtube]

We present a unified framework for event-based video deblurring and interpolation (EVDI).
By utilizing the constraints between cross-modal frames and events, a fully self-supervised learning method is proposed to enable network training with real-world data without requiring ground-truth images.

IJCV 2025 Self-Supervised Shutter Unrolling with Events, Mingyuan Lin^*, Yangguang Wang^*, Xiang Zhang, Boxin Shi, Wen Yang, Chu He, Gui-Song Xia, Lei Yu. | [Website] [Code] [Dataset]
TPAMI 2024 CrossZoom: Simultaneous Motion Deblurring and Event Super-Resolving, Chi Zhang, Xiang Zhang, Mingyuan Lin, Cheng Li, Chu He, Wen Yang, Gui-Song Xia, Lei Yu. | [Website]
TIP 2024 Neuromorphic Synergy for Video Binarization, Shijie Lin, Xiang Zhang, Lei Yang, Lei Yu, Bin Zhou, Xiaowei Luo, Wenping Wang, Jia Pan. | [Code&Dataset] [Youtube] [Bilibili]
TPAMI 2023 Learning to Super-Resolve Blurry Images with Events, Lei Yu^†, Bishan Wang, Xiang Zhang, Haijian Zhang, Wen Yang, Jianzhuang Liu, Gui-Song Xia. | [Code]
TSP 2022 Spiking Sparse Recovery with Non-convex Penalties, Xiang Zhang, Lei Yu^†, Gang Zheng, Yonina C. Eldar.
CVPR 2022 Synthetic Aperture Imaging with Events and Frames, Wei Liao^*, Xiang Zhang^*, Lei Yu^†, Shijie Lin, Wen Yang, Ning Qiao. | [Code] [Dataset]
CVPR 2021 Event-based Synthetic Aperture Imaging with a Hybrid Network, Xiang Zhang^*, Wei Liao^*, Lei Yu^†, Wen Yang, Gui-Song Xia. (Oral, Best Paper Candidate) | [Code] [Dataset] [Youtube]

^* means equal contribution and ^† indicates my supervisor.

💻 Services

Conference Reviewer

Computer Vision and Pattern Recognition (CVPR)
International Conference on Computer Vision (ICCV)
European Conference on Computer Vision (ECCV)
Advances in Neural Information Processing Systems (NeurIPS)
International Conference on Learning Representations (ICLR)
International Conference on Machine Learning (ICML)

Journal Reviewer

Teaching

252-0206-00L Visual Computing - Teaching Assistant (Fall 2025)
263-5704-00L Artificial Intelligence for Digital Characters - Teaching Assistant (Spring 2025)
401-0131-00L Lineare Algebra - Teaching Assistant (Fall 2024)

🍹 Misc

🎸 I love rock and hip-hop music. Mayday, Jay Chou, and Pharaoh are my favorites.
📖 I enjoy reading all kinds of books. Echo’s story inspired me about love and traveling. My recent favorite is the sci-fi novel The Three-Body Problem by Cixin Liu.
🎮 I often relax by playing games, including roguelike games like Slay the Spire and card games like Hearthstone. I recently played Split Fiction and It Takes Two with my girlfriend, and they were super fun!
🛫 Experiencing new cultures in new places always excites me. Here are some photos I took during my trips.