Human speech production is aimed at communicating comprehensible information to recipients. Segmenting speech into small units helps to build a sensible and interpretable discourse structure. In real-life communication, such processing units may be based on semantic, syntactic, or prosodic structure. Previous studies have proposed various theories of speech segmentation, mainly based on qualitative analyses. The present study uses corpus-based quantitative data to examine how conversational speech in French and Mandarin is structured in terms of three different processing units, and how these units interact with one another. The location of unit completion was identified by semantic structure (discourse unit), prosodic pattern (prosodic unit), and sequences of parts of speech (chunk). Quantitative analyses of both languages were carried out using comparable processing procedures. This article presents our efforts to establish a dataset for two typologically diverse languages and to carry out quantitative comparative studies of processing units in face-to-face conversation.