Omnidirectional cameras are widely used in applications such as surveillance systems and endoscopy. An omnidirectional camera uses a single camera and a reflective mirror to capture an elliptic omnidirectional image, which is then transformed into a panoramic image. To accelerate this transformation, this paper proposes a hierarchical parallelism scheme that combines data parallelism and task parallelism on graphics processing units (GPUs). The data parallelism accelerates the mapping of pixels from the elliptic omnidirectional image to the panoramic image by using multiple threads simultaneously, while the task parallelism runs deep pipelines across multiple streams. We have implemented the proposed algorithm using CUDA on NVIDIA GPUs. The experimental results show that the proposed hierarchical parallelism on GPUs achieves a 6.33-fold speedup over its CPU counterpart.
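To illustrate why the pixel mapping lends itself to data parallelism, the following is a minimal sequential sketch in Python, not the CUDA implementation described above. It assumes a simple geometry that the abstract does not specify: the omnidirectional image is an ellipse centred at `(cx, cy)` with semi-axes `a` and `b`, panoramic columns correspond to azimuth, panoramic rows correspond to a radial fraction, and sampling is nearest-neighbour. The function name `unwrap_elliptic` and all of its parameters are hypothetical.

```python
import math

def unwrap_elliptic(src, cx, cy, a, b, out_w, out_h):
    """Map an elliptic omnidirectional image to a panoramic image.

    Assumed geometry (illustrative only): ellipse centred at (cx, cy) with
    semi-axes a (horizontal) and b (vertical), nearest-neighbour sampling.
    `src` is a list of rows; the result is a list of out_h rows of out_w pixels.
    """
    src_h, src_w = len(src), len(src[0])
    pano = [[0] * out_w for _ in range(out_h)]
    for v in range(out_h):
        # Panoramic row -> fraction of the way from the centre to the ellipse edge.
        rho = (v + 0.5) / out_h
        for u in range(out_w):
            # Panoramic column -> azimuth angle around the centre.
            theta = 2.0 * math.pi * u / out_w
            x = int(math.floor(cx + rho * a * math.cos(theta) + 0.5))
            y = int(math.floor(cy + rho * b * math.sin(theta) + 0.5))
            # Source coordinates outside the image leave the pixel at 0.
            if 0 <= x < src_w and 0 <= y < src_h:
                pano[v][u] = src[y][x]
    return pano
```

Each output pixel `(u, v)` is computed independently of every other, so the body of the two loops is the kind of work that could be assigned to one GPU thread per pixel. Under the same assumptions, the panoramic image could also be split into bands of rows that are transferred and processed on separate streams, which is one plausible reading of the task-level pipelining mentioned above.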