Tensor Holography: Towards Real-time Photorealistic
3D Holography with Deep Neural Networks

Nature 2021

Liang Shi1,2,✉     Beichen Li1,2     Changil Kim1,2     Petr Kellnhofer1,2     Wojciech Matusik1,2,✉
1MIT CSAIL      2MIT EECS      ✉ Corresponding Author
Tensor Holography synthesizes a 3D hologram with per-pixel depth from a single RGB-D image in real time. This video shows a live capture from a holographic near-eye display (using a HOLOEYE PLUTO SLM) with 3D holograms synthesized in real time. The camera is focused on the bunny's eyes; the background trees are optically (not computationally) blurred due to camera defocus.


The ability to present three-dimensional (3D) scenes with continuous depth sensation has a profound impact on virtual and augmented reality (AR/VR), human-computer interaction, education, and training. Computer-generated holography (CGH) enables high spatio-angular resolution 3D projection via numerical simulation of diffraction and interference. Yet, existing physically based methods fail to produce holograms with both per-pixel focal control and accurate occlusion. The computationally taxing Fresnel diffraction simulation further imposes an explicit trade-off between image quality and runtime, making dynamic holography far from practical. Here, we demonstrate the first deep learning-based CGH pipeline capable of synthesizing a photorealistic color 3D hologram from a single RGB-Depth (RGB-D) image in real time. Our convolutional neural network (CNN) is extremely memory-efficient (below 620 KB) and runs at 60 Hz for a resolution of 1920×1080 pixels on a single consumer-grade graphics processing unit (GPU). Leveraging low-power on-device artificial intelligence (AI) acceleration chips, our CNN also runs interactively on mobile (iPhone 11 Pro at 1.1 Hz) and edge (Google Edge TPU at 2 Hz) devices, promising real-time performance in future-generation AR/VR mobile headsets. We enable this pipeline by introducing the first large-scale CGH dataset (MIT-CGH-4K) with 4,000 pairs of RGB-D images and corresponding 3D holograms. Our CNN is trained with differentiable wave-based loss functions and physically approximates Fresnel diffraction. With an anti-aliasing phase-only encoding method, we experimentally demonstrate speckle-free, natural-looking high-resolution 3D holograms. Our learning-based approach and the first Fresnel hologram dataset will help unlock the full potential of holography and enable new applications in metasurface design, optical and acoustic tweezer-based microscopic manipulation, holographic microscopy, and single-exposure volumetric 3D printing.
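To make the "numerical simulation of diffraction" concrete, below is a minimal NumPy sketch of scalar free-space propagation using the angular spectrum method with the paraxial (Fresnel) transfer function. The function name, sampling assumptions, and parameter choices are illustrative only and are not the paper's implementation, which uses its own wave-propagation model and loss functions.

```python
import numpy as np

def fresnel_propagate(field, wavelength, pitch, distance):
    """Propagate a complex wavefront by `distance` (m) using the angular
    spectrum method with the Fresnel-approximated transfer function.

    field:      2D complex array, sampled on a grid with spacing `pitch` (m)
    wavelength: light wavelength (m)
    """
    ny, nx = field.shape
    # Spatial-frequency grids (cycles per meter) matching the FFT layout
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    # Paraxial transfer function: exp(ikz) * exp(-i*pi*lambda*z*(fx^2 + fy^2))
    k = 2 * np.pi / wavelength
    H = np.exp(1j * k * distance) * np.exp(
        -1j * np.pi * wavelength * distance * (FX**2 + FY**2))
    # Filter in the frequency domain and transform back
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

Because the transfer function has unit modulus, propagation conserves total energy; evaluating it for every depth layer of a dense 3D scene is what makes brute-force Fresnel simulation so costly relative to a single CNN inference pass.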


Towards Real-time Photorealistic 3D Holography with Deep Neural Networks
Liang Shi, Beichen Li, Changil Kim, Petr Kellnhofer, Wojciech Matusik
Nature 2021
[Paper]  [Code]  [Dataset]  [BibTeX]

Authors' Correction: In Figure 3(a) of the paper, the beam splitter in the visualization of the rendered setup is rotated 90 degrees from its correct orientation; please refer to the new rendering and video below. The Δp in Eq. (5) is the grating pitch (twice the pixel pitch), not the pixel pitch. The paper will be corrected soon.
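A small numerical illustration of the corrected quantity: Δp is twice the SLM pixel pitch. The values below assume the HOLOEYE PLUTO SLM's 8.0 µm pixel pitch and a 532 nm green laser, and the diffraction-angle computation is our own illustrative use of the grating equation, not a restatement of Eq. (5).

```python
import numpy as np

# Assumed hardware parameters (HOLOEYE PLUTO SLM, green laser)
pixel_pitch = 8.0e-6            # SLM pixel pitch (m)
grating_pitch = 2 * pixel_pitch  # corrected Δp: twice the pixel pitch (m)
wavelength = 532e-9             # laser wavelength (m)

# First-order diffraction angle from the grating equation sin(θ) = λ / Δp
theta_deg = np.degrees(np.arcsin(wavelength / grating_pitch))
print(f"Δp = {grating_pitch * 1e6:.1f} um, first-order angle ≈ {theta_deg:.2f}°")
```

Using the pixel pitch instead of Δp here would double the computed diffraction angle, which is why the distinction matters.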


Exemplar Dataset Images