This method takes a collection of images of an object that has also been scanned with a 3D triangulation scanner, calibrates the camera positions, and then uses that information to build a high-resolution texture map for the surface. The effects of lighting are factored out so that the resulting texture map represents the diffuse reflectance of the surface, independent of the lighting in the original images, which is what realistic rendering requires.
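To make "factoring out the lighting" concrete, here is a minimal sketch for a single surface point. It is not the procedure from the thesis. It assumes a purely diffuse (Lambertian) surface, a known surface normal from the scan, and a known directional light for each image. The function name estimate_albedo, the observation tuple layout, and the min_shading cutoff are all hypothetical, chosen only for illustration. Each observed intensity is modeled as reflectance times shading, and the reflectance is recovered by a least-squares fit over all usable views.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def estimate_albedo(normal, observations, min_shading=0.05):
    """Least-squares diffuse reflectance for one surface point.

    Assumed model (not taken from the source): I_k = albedo * s_k, where
    s_k = light_intensity_k * max(0, n . l_k).

    observations: list of (pixel_intensity, light_direction, light_intensity).
    Returns None if no view lights the point well enough.
    """
    n = normalize(normal)
    num = 0.0
    den = 0.0
    for intensity, light_dir, light_intensity in observations:
        shading = light_intensity * max(0.0, dot(n, normalize(light_dir)))
        if shading < min_shading:
            continue  # skip back-facing or grazing views: too noisy to trust
        num += intensity * shading
        den += shading * shading
    return num / den if den > 0.0 else None
```

Weighting each view by its shading term means that views lit head-on count more than views lit at a grazing angle, where dividing by a small shading value would amplify noise. A real system would also have to handle shadows, specular highlights, and visibility from each camera, none of which this sketch addresses.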
My original work on this problem is described in Chapter 4 of my thesis, which also deals with how to parameterize an object with a suitable collection of texture maps. We later used essentially the same technique in the face modeling work I did at Microsoft Research.