Project Details
Description
Major depressive disorder has a significant impact on society and the labor force and has attracted growing attention. This project aims to identify the query intention types of user questions posted on a medical social media platform; the results can help clarify the information needs of patients. We propose three types of features. The first uses word embedding vectors to generate correlation features between the vector dimensions. The second is the similarity between each word and a set of predefined medical concept keywords. The third is an embedding vector of each word's part of speech. We then propose two CNN-based learning frameworks. The first, the CNN Joint Model, concatenates the CNN outputs for the different feature types to learn the intention types. The second, the Ensemble CNN Model, uses each feature type independently to predict a degree value for each intention type, then learns ensemble parameters that weight each feature type when combining the predictions. Experiments show that when the medical concept keyword features and the word vector dimension correlation features are combined as input, the intention type of a question can be predicted with a high F1 measure. Adding the traditional word embedding vectors or the part-of-speech embedding vectors as input at the same time improves the prediction further. Across the experiments, the best intention type prediction is achieved when the intention types whose predicted degree value is greater than a threshold of 0.3 are selected, giving an F1 measure of at least 0.75.
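The combination and thresholding step of the Ensemble CNN Model can be illustrated with a minimal sketch. It assumes that each feature-specific CNN has already produced one degree value per intention type and that the ensemble is a normalized weighted sum. The description does not give the exact combination rule, the learned weights, or the list of intention types, so the weights, scores, and label names below are hypothetical placeholders. Only the 0.3 threshold comes from the description.

```python
from typing import Dict, List

THRESHOLD = 0.3  # threshold value reported in the project description


def ensemble_scores(per_feature_scores: Dict[str, List[float]],
                    weights: Dict[str, float]) -> List[float]:
    """Combine per-intent degree values from several feature-specific models.

    Assumption: a normalized weighted sum. In the project the weights are
    learned; here they are passed in as fixed numbers.
    """
    total = sum(weights[name] for name in per_feature_scores)
    n_intents = len(next(iter(per_feature_scores.values())))
    combined = [0.0] * n_intents
    for name, scores in per_feature_scores.items():
        for i, score in enumerate(scores):
            combined[i] += weights[name] * score / total
    return combined


def predict_intents(scores: List[float], labels: List[str],
                    threshold: float = THRESHOLD) -> List[str]:
    """Return every intention type whose degree value exceeds the threshold."""
    return [label for label, s in zip(labels, scores) if s > threshold]


# Hypothetical intention types and model outputs, for illustration only.
labels = ["symptom", "treatment", "medication"]
per_feature_scores = {
    "keyword_similarity": [0.8, 0.3, 0.1],
    "dimension_correlation": [0.6, 0.5, 0.1],
    "part_of_speech": [0.4, 0.4, 0.2],
}
weights = {  # hypothetical values standing in for learned ensemble parameters
    "keyword_similarity": 0.5,
    "dimension_correlation": 0.3,
    "part_of_speech": 0.2,
}

combined = ensemble_scores(per_feature_scores, weights)
predicted = predict_intents(combined, labels)
print(predicted)
```

Because the threshold is applied to each intention type separately, a single question can receive more than one intention type, or none if every degree value falls at or below 0.3.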
Status | Finished |
---|---|
Effective start/end date | 2017/08/01 → 2018/10/31 |
Keywords
- intention types classification
- medical concept keyword feature
- CNN