This figure illustrates the Hybrid Tours workflow. The upper-left section shows the input video capture step: on the left, a person holding a camera represents the camera operator, and on the right, several curly arrows and a stack of images represent the candidate clips. In the top-middle section, an arrow pointing right, labeled "Registration and Low-Fidelity Reconstruction", represents the initial 3D registration and reconstruction process. The right section shows the Hybrid Tours editing interface, with numbers and text explaining its components. In the lower-middle section, an arrow pointing left, labeled "Hi-Fidelity Reconstruction and Rendering", represents the final video rendering process. The bottom-left section shows a curve ending in an arrowhead whose color alternates between yellow and blue; it represents the final camera trajectory, which contains both real footage (yellow) and virtual frames (blue). The curve is drawn on top of a set of faded curved arrows that represent the original candidate clips.

We present Hybrid Tours, a tool for creating long-take touring shots from short hand-captured video clips. Users start by capturing candidate clips (top-left) that approximate different segments of potential touring camera trajectories. Then, a coarse subset of the captured frames is used to reconstruct a low-cost, high-speed pre-visualization of the scene for path planning. Our editing interface (right) then lets users design longer camera trajectories by filtering, combining, and re-timing candidate clips. Finally, once the user is satisfied with the pre-visualized video, we optimize an additional reconstruction of the scene around the chosen camera trajectory to render a final high-quality hybrid long-take touring video (bottom-left).
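To make the idea of a hybrid trajectory concrete, here is a minimal sketch of how captured clips might be joined into one path that mixes real and virtual frames. It is an illustration only, not the Hybrid Tours implementation: the `Frame` type, the `join_clips` function, the `max_step` parameter, and the use of straight-line interpolation between clip endpoints are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Frame:
    position: Vec3  # camera position (hypothetical, simplified pose)
    is_real: bool   # True: captured footage; False: virtual (rendered) frame


def lerp(a: Vec3, b: Vec3, t: float) -> Vec3:
    """Linearly interpolate between two positions."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def join_clips(clips: List[List[Vec3]], max_step: float) -> List[Frame]:
    """Concatenate clips, bridging each gap with evenly spaced virtual frames.

    Assumption: a gap is bridged along a straight line, with enough virtual
    frames that consecutive cameras are at most `max_step` apart.
    """
    path: List[Frame] = []
    for clip in clips:
        if path and clip:
            start, end = path[-1].position, clip[0]
            gap = math.dist(start, end)
            n_virtual = max(0, math.ceil(gap / max_step) - 1)
            for i in range(1, n_virtual + 1):
                path.append(Frame(lerp(start, end, i / (n_virtual + 1)), False))
        path.extend(Frame(p, True) for p in clip)
    return path


# Two short clips separated by a gap of 3 units along the x-axis.
clips = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(4.0, 0.0, 0.0), (5.0, 0.0, 0.0)]]
tour = join_clips(clips, max_step=1.0)
print("".join("R" if f.is_real else "V" for f in tour))
```

In this toy example the real frames play the role of the yellow segments in the figure and the interpolated frames play the role of the blue ones; an actual system would interpolate full camera poses and render the virtual frames from the scene reconstruction.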


Abstract


Long-take touring (LTT) shots are characterized by smooth camera motion over a long distance that seamlessly connects different views of the captured scene. These shots offer a compelling way to visualize 3D spaces. However, filming LTT shots directly is very difficult, and rendering them based on a virtual reconstruction of a scene is resource-intensive and prone to many visual artifacts. We propose Hybrid Tours, a hybrid approach to creating LTT shots that combines the capture of short clips representing potential tour segments with a custom interactive application that lets users filter and combine these segments into longer camera trajectories. We show that Hybrid Tours makes capturing LTT shots much easier than the traditional single-take approach, and that clip-based authoring and reconstruction leads to higher-fidelity results at a lower cost than common image-based rendering workflows.

Video


Note: The result videos on this webpage are compressed, which affects their quality. If you want to watch the videos in their original quality, please download them here from Google Drive (608.3 MB). If you would like to access the raw data and create your own videos, please check our GitHub repository at the top of this page.

Results - Tabletop



Results - Library



Results - Stairway



Results - Garage



Results - Floor




BibTeX Citation

@article{10.1145/3731423,
author = {Liu, Xinrui and Deng, Longxiulin and Davis, Abe},
title = {Hybrid Tours: A Clip-based System for Authoring Long-take Touring Shots},
year = {2025},
issue_date = {August 2025},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {44},
number = {4},
issn = {0730-0301},
url = {https://doi.org/10.1145/3731423},
doi = {10.1145/3731423},
abstract = {Long-take touring (LTT) shots are characterized by smooth camera motion over a long distance that seamlessly connects different views of the captured scene. These shots offer a compelling way to visualize 3D spaces. However, filming LTT shots directly is very difficult, and rendering them based on a virtual reconstruction of a scene is resource-intensive and prone to many visual artifacts. We propose Hybrid Tours, a hybrid approach to creating LTT shots that combines the capture of short clips representing potential tour segments with a custom interactive application that lets users filter and combine these segments into longer camera trajectories. We show that Hybrid Tours makes capturing LTT shots much easier than the traditional single-take approach, and that clip-based authoring and reconstruction leads to higher-fidelity results at a lower cost than common image-based rendering workflows.},
journal = {ACM Trans. Graph.},
month = jul,
articleno = {36},
numpages = {13},
keywords = {long-take shots, 3D gaussian splatting, video editing}
}