👋 Hi there!

I am currently a joint PhD student at the Computer Graphics Lab of ETH Zürich and DisneyResearch|Studios, supervised by Prof. Markus Gross and Dr. Christopher Schroers. I also spent 4 months working on low-level vision at the Computer Vision Lab, ETH Zürich. Before that, I received my M.E. and B.E. degrees in 2023 and 2020, respectively, from the Electronic Information School of Wuhan University, where I worked closely with Prof. Lei Yu on event-based vision. I am a lifelong learner with broad interests, including computer vision, signal processing, and neuromorphic computing. I am particularly interested in achieving robust perception in complex environments for real-world applications.

🔥 News

  • 2024.09: 🎉 One paper is accepted by NeurIPS 2024!
  • 2024.07: 🎉 One paper is accepted by ECCV 2024 (Oral)! Code, models, and results of HiT-SR are released!
  • 2024.05: 🎉 One paper is accepted by IEEE TPAMI 2024! Congrats to Chi Zhang!
  • 2024.02: 🎉 One paper is accepted by IEEE TIP 2024! Congrats to Shijie Lin!
  • 2023.08: 🛠️ The code of our GEM is released.
  • 2023.07: 🎉 One paper is accepted by ICCV 2023!
  • 2023.01: 🎉 One paper is accepted by IEEE TPAMI 2023! Congrats to Bishan Wang!
  • 2022.12: 🎉 One paper is accepted by IEEE TSP 2022!
  • 2022.12: 🎉 One paper is accepted by IEEE TPAMI 2022!
  • 2022.04: 🛠️ The code of our EVDI is released.
  • 2022.03: 🛠️ The code of our EF-SAI is released.
  • 2022.03: 🎉 Two papers are accepted by CVPR 2022! Congrats to Wei Liao!
  • 2021.07: 🛠️ The code of our E-SAI is released.
  • 2021.06: 🍾 Our work is selected as one of the best paper candidates by CVPR 2021!
  • 2021.03: 🎉 One paper is accepted by CVPR 2021 (Oral)!

📝 Publications

    NeurIPS 2024

    BetterDepth: Plug-and-Play Diffusion Refiner for Zero-Shot Monocular Depth Estimation

    Xiang Zhang, Bingxin Ke, Hayko Riemenschneider, Nando Metzger, Anton Obukhov, Markus Gross†, Konrad Schindler, Christopher Schroers†

    [arXiv]

    • We propose BetterDepth to boost zero-shot MDE methods with plug-and-play diffusion refiners, achieving robust affine-invariant MDE performance with fine-grained details.
    • We design global pre-alignment and local patch masking strategies to enable learning detail refinement from small-scale synthetic datasets while preserving rich prior knowledge from pre-trained MDE models for zero-shot transfer.
    ECCV 2024 - Oral

    HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

    Xiang Zhang, Yulun Zhang, Fisher Yu

    [Code] [Supp] [Video]

    • We propose a simple yet effective strategy (HiT-SR) to convert popular transformer-based SR methods to our hierarchical transformers, boosting SR performance by exploiting multi-scale features and long-range dependencies.
    • We design a spatial-channel correlation method that efficiently leverages spatial and channel features with computational complexity linear in the window size, enabling the use of large hierarchical windows, e.g., $64\times64$ windows.
    ICCV 2023

    Generalizing Event-Based Motion Deblurring in Real-World Scenarios

    Xiang Zhang, Lei Yu†, Wen Yang, Jianzhuang Liu, Gui-Song Xia

    [Code] [Dataset] [Youtube]

    • A scale-aware network is designed to allow flexible setups of input spatial resolutions and enable learning from different temporal scales of motion blur.
    • A self-supervised learning framework is proposed for model training with real-world data and performance generalization in spatial and temporal domains.
    • A multi-scale real-world blurry dataset (MS-RBD) is constructed to facilitate the evaluation of deblurring performance in real-world scenarios.
    TPAMI 2022

    Learning to See Through with Events

    Lei Yu†, Xiang Zhang, Wei Liao, Wen Yang, Gui-Song Xia

    [Code] [Dataset] [Bilibili]

    • An event-based synthetic aperture imaging (E-SAI) algorithm is proposed to see through dense occlusions even under extreme lighting conditions.
    • A hybrid network composed of a spiking encoder and a convolutional decoder is designed to mitigate the disturbances from occlusions and guarantee the overall reconstruction performance.
    CVPR 2022

    Unifying Motion Deblurring and Frame Interpolation with Events

    Xiang Zhang, Lei Yu†

    [Code] [Youtube]

    • We present a unified framework for event-based video deblurring and interpolation (EVDI).
    • By utilizing the constraints between cross-modal frames and events, a fully self-supervised learning method is proposed to enable network training with real-world data without requiring ground-truth images.

    * means equal contribution and † indicates my supervisor.

    💬 Invited Talks

    • 2021 & 2022, Introduction to Spiking Neural Networks, Wuhan University
    • 2021, Event-based Synthetic Aperture Imaging with a Hybrid Network, VALSE Webinar | [Bilibili]
    • 2021, Event-based Synthetic Aperture Imaging with a Hybrid Network, CSIG-3DV Student Forum

    💻 Services

    • Conference Review: CVPR, ICCV, ECCV, NeurIPS, ICLR
    • Journal Review: Springer IJCV, Springer MIR

    🍹 Misc

    • 🎸 I love rock and hip-hop music. Mayday, Jay Chou, and Pharaoh are my favorites.
    • 📖 I enjoy reading all kinds of books. Echo’s story inspired my views on love and traveling. My recent favorite is the sci-fi novel The Three-Body Problem by Cixin Liu.
    • 🎮 I often relax by playing games, including roguelike games like Slay the Spire and card games like Hearthstone. I recently played It Takes Two with my girlfriend, and it was super fun!
    • 🛫 Experiencing new cultures in new places always excites me. Here are some photos I took during my trips.