Reconstructing a 3D map of a scene is one of the most important problems in computer vision, and extracting depth information is usually its main challenge. In a stereo vision system, the different viewing angles of the cameras cause a relative displacement of objects between the two images. The human visual system uses these displacements to estimate depth. The proposed method pursues the same goal: as a first step, a Convolutional Neural Network computes the initial matching cost. The matching cost of each pixel in the disparity map is then aggregated as a weighted sum over a neighborhood surrounding that pixel, where the weights are similarity measures computed with the Fuzzy C-Means method. To obtain the initial disparity map, we apply the Winner-Takes-All strategy to the disparity costs of each pixel. A left–right consistency check then detects the locations of invalid disparities and occlusions, and is followed by a fill-in process that replaces the occluded and invalid pixels in the disparity map with values suggested by the K-Nearest Neighbor method. Finally, a median filter is applied to the disparity map to remove possible salt-and-pepper noise. The proposed algorithm is evaluated on the Middlebury stereo evaluation v3.0. The results show an average normalized runtime of 1.85 seconds and an average bad-pixel rate (bad 2.0) of 22.0.
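The Winner-Takes-All and left–right consistency steps described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `winner_takes_all` and `left_right_check`, the `cost_volume[d][y][x]` layout, the convention that a lower cost means a better match, and the tolerance `tol` are all assumptions made for illustration only.

```python
def winner_takes_all(cost_volume):
    """Pick, for every pixel, the disparity with the lowest cost.

    Assumption: cost_volume[d][y][x] is the aggregated matching cost of
    assigning disparity d to pixel (y, x), and lower means a better match.
    """
    num_disp = len(cost_volume)
    height = len(cost_volume[0])
    width = len(cost_volume[0][0])
    return [
        [min(range(num_disp), key=lambda d: cost_volume[d][y][x])
         for x in range(width)]
        for y in range(height)
    ]


def left_right_check(disp_left, disp_right, tol=1):
    """Mark left-image pixels whose disparity agrees with the right image.

    A left pixel (y, x) with disparity d corresponds to the right pixel
    (y, x - d). It is flagged valid only if that pixel lies inside the
    image and the two disparities differ by at most tol (hypothetical
    threshold). Everything else is treated as occluded or invalid.
    """
    height, width = len(disp_left), len(disp_left[0])
    valid = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            x_right = x - disp_left[y][x]
            if 0 <= x_right < width:
                diff = abs(disp_left[y][x] - disp_right[y][x_right])
                valid[y][x] = diff <= tol
    return valid


# Toy example: one scanline, three disparity candidates, disparity 1 is cheapest.
cost = [[[1, 1, 1, 1]], [[0, 0, 0, 0]], [[1, 1, 1, 1]]]
disp_left = winner_takes_all(cost)
mask = left_right_check(disp_left, [[1, 3, 1, 1]])
```

Pixels flagged `False` in `mask` are the ones a fill-in step would replace; the K-Nearest Neighbor fill-in and the final median filter are not shown in this sketch.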