This paper presents a novel pipelined architecture of the competitive learning (CL) algorithm with κ-winners-take-all activation. The architecture employs a codeword swapping scheme so that neurons failing the competition for a training vector are immediately available for the competitions for the subsequent training vectors. An efficient pipeline architecture is then designed based on the codeword swapping scheme for enhancing the throughput. The CPU time of the NIOS processor executing the CL training with the proposed architecture as an accelerator is measured. Experiment results show that the CPU time is lower than that of other hardware or software implementations running the CL training program with or without the support of custom hardware.