Given a pair of images, a stereoscopic display must perform image plane calibration, depth map estimation, and multi-view rendering to show a 3-D picture. Of these signal processing stages, depth estimation is the most critical. This paper presents a real-time depth map estimation algorithm together with a parallel software architecture based on a GPU. The experimental results show that the proposed algorithm produces smooth depth maps and runs up to 42 times faster than on a conventional modern personal computer, which meets the real-time requirement of stereoscopic displays.