A 32-bit AES implementation is proposed for a small Xilinx FPGA (Spartan-3 XC3S200). It uses 148 slices and 11 Block RAMs (BRAMs) and achieves a throughput of 647 megabits per second (Mbps) at a working frequency of 278 MHz. Compared with the best known similar design, it achieves a 3-fold improvement in throughput and a 3.4-fold improvement in throughput per area, while occupying 8% fewer slices. A 128-bit AES implementation on a larger FPGA (Virtex-II XC2VP20), built from four of the above 32-bit AES cores operating in parallel, is also presented. Comparison with state-of-the-art AES cores indicates that the proposed folded design achieves 4780 Mbps using 410 slices, outperforming the most recent works by 200% in throughput while requiring 20% less reconfigurable area, which yields an improvement of over 250% in the throughput/slice metric.
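
As an illustration of the throughput/slice metric quoted above, the following minimal sketch computes it from the figures reported in this abstract. The function name `throughput_per_slice` and the dictionary labels are hypothetical, chosen only for this example, and the calculation assumes the metric is simply throughput divided by slice count, with BRAM usage not included.

```python
def throughput_per_slice(throughput_mbps: float, slices: int) -> float:
    """Return throughput per slice in Mbps/slice (assumed: throughput / slice count)."""
    if slices <= 0:
        raise ValueError("slice count must be positive")
    return throughput_mbps / slices


# (throughput in Mbps, slice count) as reported in the abstract.
designs = {
    "32-bit AES (Spartan-3 XC3S200)": (647.0, 148),
    "128-bit AES (Virtex-II XC2VP20)": (4780.0, 410),
}

ratios = {
    name: throughput_per_slice(mbps, slices)
    for name, (mbps, slices) in designs.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} Mbps/slice")
# 32-bit AES (Spartan-3 XC3S200): 4.37 Mbps/slice
# 128-bit AES (Virtex-II XC2VP20): 11.66 Mbps/slice
```

Under this assumption the 32-bit core delivers about 4.37 Mbps per slice and the 128-bit core about 11.66 Mbps per slice. The two figures come from different device families, so they are best compared against other designs on the same family rather than against each other.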