This paper presents a vision-based infant surveillance system built on infant facial expression recognition. A video camera mounted above the crib captures sequences of the infant's expressions and sends them to the surveillance system. The infant face region is segmented using skin colour information. Three types of moments, namely Hu, R, and Zernike moments, are then calculated from the segmented face region. Each type comprises several individual moments, so for a given fifteen-frame sequence the correlation coefficients between pairs of moments of the same type form the attribute vector of the facial expression. Fifteen infant facial expression classes are defined in this study. Three decision trees, one for each type of moment, are constructed to classify these facial expressions. The experimental results show that the proposed method is robust and efficient. The properties of the different types of moments are also analyzed and discussed.
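
As an illustration of the attribute vector described above, the following minimal sketch computes the pairwise correlation coefficients between the moments of one type over a fifteen-frame sequence. It assumes the moments have already been extracted into an array with one row per frame and one column per moment. The function name `correlation_attributes`, the use of Pearson correlation, the ordering of the pairs, and the random placeholder data are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def correlation_attributes(moments):
    """Return the pairwise correlation coefficients between moment series.

    moments: array of shape (n_frames, n_moments) holding the values of one
    moment type (for example, the seven Hu moments) for each frame.
    The result has n_moments * (n_moments - 1) / 2 entries, one per
    unordered pair of moments (hypothetical ordering: row-major upper triangle).
    """
    moments = np.asarray(moments, dtype=float)
    if moments.ndim != 2 or moments.shape[0] < 2 or moments.shape[1] < 2:
        raise ValueError("expected at least 2 frames and 2 moments")
    # Columns are the variables (moments); rows are the observations (frames).
    # Note: a moment that is constant across all frames yields NaN entries.
    corr = np.corrcoef(moments, rowvar=False)
    i, j = np.triu_indices(moments.shape[1], k=1)
    return corr[i, j]


# Placeholder data standing in for 7 Hu moments over a 15-frame sequence.
rng = np.random.default_rng(0)
hu_sequence = rng.normal(size=(15, 7))
attributes = correlation_attributes(hu_sequence)
print(attributes.shape)  # (21,)
```

Under these assumptions, seven moments give a 21-element attribute vector per sequence; one such vector per moment type could then be passed to the corresponding decision tree.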