We present a scalable multi-view stereo method that reconstructs accurate 3D models from hundreds of high-resolution input images. Local fusion of disparity maps obtained with semi-global matching enables the reconstruction of large scenes that do not fit into main memory. Since disparity maps may vary widely in quality and resolution, careful modeling of the 3D errors is crucial. We derive a sound stereo error model based on disparity uncertainty, which can vary spatially from tenths of a pixel to several pixels. We introduce a feature based on total variation that allows pixel-wise classification of disparities into different error classes. For each class, we learn a disparity error distribution from ground-truth data using expectation maximization. We present a novel method for stochastic fusion of data with varying quality by adapting a multi-resolution volumetric fusion process that uses our error classes as a prior and models surface probabilities via an octree of voxels. Conflicts during surface extraction are resolved using visibility constraints and a preference for voxels at higher resolutions. Experimental results on several challenging large-scale datasets demonstrate that our method improves on prior results both qualitatively and quantitatively.
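
To make the link between disparity uncertainty and 3D error concrete, the following is a minimal sketch of the standard first-order error propagation for a rectified stereo pair, where depth is z = f·b/d. It is not the error model derived in this work; the function names, parameter names, and numeric values are illustrative assumptions.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point in a rectified stereo pair: z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_sigma(focal_px: float, baseline_m: float,
                disparity_px: float, sigma_d_px: float) -> float:
    """First-order depth uncertainty from disparity uncertainty.

    Differentiating z = f * b / d gives |dz/dd| = z^2 / (f * b), so
    sigma_z is approximately z^2 / (f * b) * sigma_d.
    """
    z = depth_from_disparity(focal_px, baseline_m, disparity_px)
    return z * z / (focal_px * baseline_m) * sigma_d_px


# Hypothetical camera: 1000 px focal length, 0.5 m baseline.
z = depth_from_disparity(1000.0, 0.5, 50.0)
sigma_small = depth_sigma(1000.0, 0.5, 50.0, 0.1)  # disparity error of a tenth of a pixel
sigma_large = depth_sigma(1000.0, 0.5, 50.0, 2.0)  # disparity error of several pixels
```

Because the depth error grows with the square of the depth and linearly with the disparity error, a disparity uncertainty that ranges from tenths of a pixel to several pixels translates into 3D errors spanning more than an order of magnitude at the same depth, which is why a per-pixel error class is useful as a prior during fusion.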