This paper presents a novel hardware architecture for fast principal component analysis (PCA). The architecture is based on the generalized Hebbian algorithm (GHA). In the architecture, the updating of the different synaptic weight vectors is divided into a number of stages. The results of preceding stages are reused in the computation of subsequent stages, which speeds up training and lowers the area cost. The proposed architecture has been embedded in a system-on-programmable-chip (SOPC) platform for physical performance measurement. Experimental results show that the proposed architecture is an effective alternative for fast PCA, attaining both high performance and low computation time.
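The staged update can be illustrated in software. The following is a minimal sketch of the standard GHA rule (Sanger's rule), written so that the residual computed for one weight vector is reused for the next; it is not the hardware design itself. The function name `gha_step`, the row-per-vector layout of `W`, and the learning rate `eta` are assumptions made for illustration.

```python
import numpy as np

def gha_step(W, x, eta):
    """One GHA update for a single input vector x.

    W has shape (p, m): row j is the j-th synaptic weight vector (assumed layout).
    Returns the updated weight matrix; W itself is not modified.
    """
    y = W @ x                      # outputs y_j = w_j . x, using the old weights
    residual = x.astype(float)     # z_0 = x (astype returns a copy)
    W_new = np.empty_like(W, dtype=float)
    for j in range(W.shape[0]):
        # Stage j reuses the residual left by stage j-1:
        # z_j = z_{j-1} - y_j * w_j
        residual = residual - y[j] * W[j]
        # w_j <- w_j + eta * y_j * z_j
        W_new[j] = W[j] + eta * y[j] * residual
    return W_new
```

Each stage needs only one extra scaled-vector subtraction instead of recomputing the full sum over all earlier weight vectors, which is the kind of reuse between stages that the abstract describes.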