In recent years, deep neural networks for action recognition have attracted extensive attention because of their wide range of applications, such as anomalous behavior detection in smart surveillance systems. Among the proposed deep learning models, 3DCNN performs very well on large action classification datasets, including UCF-101, HMDB-51, and Kinetics. However, current action recognition models still need improvement on the classification of fine-grained actions. A fine-grained action is one that differs only slightly from a normal action and occurs over an extremely short time, which makes it difficult to distinguish. For example, a foul in a basketball game is a fine-grained action. Foul recognition is very challenging because fouls in basketball games are always instantaneous and very similar to normal actions. In this paper, we propose a lightweight fine-grained action recognition model for basketball foul detection. Compared with other action recognition models, such as the two-stream model and 3DCNN, our proposed network performs better on this subtle classification task and has fewer parameters. The visualized foul feature distribution is concentrated in a few frames, which supports our initial hypothesis that fouls always happen instantaneously. Finally, the output of this research can be used to assist in training basketball referees.