This study developed and validated a Chinese pseudo-character/non-character producing system (CPN system) that assists researchers in creating experimental materials using Chinese characters. Based on a large-scale dataset of 6097 characters, the CPN system provides researchers with precise Chinese orthographic information (structures and positions, radical frequency, number of strokes, number of radical-sharing neighbors, and position-based regularity) to create three types of experimental stimuli: pseudo-characters, semi non-characters, and whole non-characters. Featuring the position-based regularity of 446 radicals, the CPN system helps researchers manipulate, or control for, the orthographic characteristics of radicals when studying Chinese lexical processing. In two empirical validations of stimuli created by the system, Chinese-as-a-second-language learners (n = 79) and first-language users (n = 41), respectively, completed a Chinese orthographic choice task in which they compared two artificial characters and chose the one that more closely resembled a real Chinese character. Both validations demonstrate that highly proficient Chinese readers are better able to identify pseudo-characters, suggesting that a radical’s position-based information affects Chinese character identification to different extents. With this empirical support for the created stimuli, the system further provides auto-generated output as downloadable images and Excel sheets for creating customized stimuli, making material selection easy and efficient. The CPN system is the first large-scale, data-driven tool freely available to researchers interested in studying written Chinese. It should benefit research on Chinese orthographic processing, Chinese instruction, and cross-linguistic comparisons, and it provides a useful tool for studying Chinese lexical processing.