Microsoft Research has been hard at work doing impossible things again. This time it's a library of code that converts a 2D video or still image into a 3D depth image. This is all about taking a simple 2D image and working out how far away from you each of the objects in it is. The algorithm attempts to construct a depth map of the sort the Kinect creates, but without using a Kinect. Once you have a depth map you can apply it to the single image and create a stereo pair, which can be viewed as a stereoscopic image.

You might at first think that all of this is an impossible task, because you need stereo vision, and hence a pair of stereo photos, to work out depth. After all, we have two eyes to be able to judge depth. However, try looking at a scene with one eye closed: you will still be able to judge distances and, even more impressively, you can judge distances in a single photo.

Single frames converted to depth maps - darker is closer

The new software does the job by keeping a database of objects with known depths that it can recognize in the photo. It then estimates warping functions, which indicate how the object is different in the target photo. The matched object is assumed to be at the same depth as the library object. A label-based smoothing procedure is then used to improve the depth estimates. If the input is a video, then motion flow is also used to improve the depth estimates - basically, pixels that are in motion have to be in front of background pixels.

The method was trained and tested using a set of videos that were gathered in stereo, using a Kinect to measure depth where possible. You can see the arrangement used in the photo above.
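To get a feel for the label-based smoothing idea, here is a minimal sketch in Python/NumPy. It assumes you already have a depth estimate and a label map assigning each pixel to a region (the article doesn't specify the actual procedure, so this simply forces each labelled region to a single median depth - a deliberately crude stand-in):

```python
import numpy as np

def smooth_depth_by_label(depth, labels):
    """Replace each pixel's depth with the median depth of its label region.

    depth:  (H, W) float array of per-pixel depth estimates.
    labels: (H, W) int array, same shape, giving a region label per pixel.
    A hypothetical simplification of label-based smoothing: every region
    gets one consistent depth, removing per-pixel noise inside objects.
    """
    smoothed = depth.copy()
    for lab in np.unique(labels):
        mask = labels == lab
        smoothed[mask] = np.median(depth[mask])
    return smoothed
```

Real systems would smooth more gently (and respect edges within a region), but the principle is the same: pixels sharing a label are probably part of one object and should have coherent depth.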
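The motion-flow cue - moving pixels must be in front of the static background - can be sketched as follows. This is not the paper's method: a real implementation would use optical flow, whereas here a simple frame-difference threshold stands in as the motion estimate, and the function names are made up for illustration:

```python
import numpy as np

def refine_depth_with_motion(depth, prev_frame, frame, threshold=10, pull=0.5):
    """Pull moving pixels toward the camera in a depth map.

    depth: (H, W) float array in [0, 1], where larger means closer.
    prev_frame, frame: (H, W) grayscale uint8 frames from the video.
    Assumption: frame differencing is a good-enough motion detector.
    """
    # Crude motion mask: pixels whose intensity changed noticeably.
    motion = np.abs(frame.astype(int) - prev_frame.astype(int)) > threshold
    refined = depth.copy()
    # Moving pixels are assumed to be in front of the background, so
    # blend their depth part-way toward the nearest possible value (1.0).
    refined[motion] = refined[motion] + pull * (1.0 - refined[motion])
    return refined
```

For example, a pixel estimated at depth 0.2 that is detected as moving gets pulled to 0.6 with the default `pull=0.5`, while static pixels keep their original estimate.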
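Finally, the step of turning a depth map plus a single image into a stereo pair can be sketched with a basic pixel-shift (depth-image-based rendering) scheme. This is a simplified illustration, not the library's actual renderer: it assumes a normalized depth map where larger values mean closer, and it leaves disocclusion holes black rather than in-painting them:

```python
import numpy as np

def depth_to_stereo(image, depth, max_shift=8):
    """Create a left/right stereo pair from one image plus a depth map.

    image: (H, W, 3) uint8 array; depth: (H, W) float array in [0, 1],
    where larger values mean closer, so closer pixels shift further.
    """
    h, w = depth.shape
    # Disparity in pixels: nearer objects get a larger horizontal shift.
    disparity = np.rint(depth * max_shift).astype(int)
    cols = np.arange(w)
    left = np.zeros_like(image)
    right = np.zeros_like(image)
    for y in range(h):
        # Shift each pixel half the disparity left or right, clipped at
        # the frame edges; uncovered background stays black (a "hole"
        # that production converters fill by in-painting).
        lx = np.clip(cols - disparity[y] // 2, 0, w - 1)
        rx = np.clip(cols + disparity[y] // 2, 0, w - 1)
        left[y, lx] = image[y]
        right[y, rx] = image[y]
    return left, right
```

Viewing the `left`/`right` pair on a stereoscopic display (or as a cross-eyed pair) gives the 3D effect described above; with an all-zero depth map both views are identical to the input, as you would expect.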