Cluster validity indices for mixture hazards regression models

Yi Wen Chang, Kang Ping Lu, Shao Tung Chang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


In the analysis of survival data, competing risks arise frequently in medical applications where individuals may fail from any of several causes. Semiparametric mixture regression models have become a prominent approach to competing risks analysis because they are flexible and yield easily interpreted estimates. The literature presents several semiparametric methods for estimating mixture Cox proportional hazards models, but fewer works address the determination of the number of model components or the estimation of baseline hazard functions using kernel approaches. These two issues are important because both an incorrect number of components and inappropriate baseline functions can lead to inadequate estimates of mixture Cox hazard models. This research therefore proposes four validity indices for selecting the optimal number of model components, based on the posterior probabilities and residuals obtained by applying an EM-based algorithm to a mixture Cox regression model. We also introduce a kernel approach that produces a smooth estimate of the baseline hazard function in a mixture model. The effectiveness of the proposed cluster validity indices, and the preferred choice among them, are demonstrated through a simulation study. An analysis of a prostate cancer dataset illustrates the practical use of the proposed method.
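
The abstract does not define the four proposed indices, so the following is only a generic illustration of how a validity index can be computed from posterior membership probabilities. It uses two classic fuzzy-clustering measures, Bezdek's partition coefficient and partition entropy, which are assumptions chosen for illustration and are not claimed to be the indices proposed in the article. The function names and the example matrices are hypothetical.

```python
import numpy as np


def partition_coefficient(post):
    """Mean over subjects of the sum of squared posterior probabilities.

    Ranges from 1/K (maximally ambiguous) to 1 (crisp assignment),
    where K is the number of mixture components.
    """
    post = np.asarray(post, dtype=float)
    return float(np.mean(np.sum(post ** 2, axis=1)))


def partition_entropy(post, eps=1e-12):
    """Mean over subjects of the Shannon entropy of the posterior row.

    Ranges from 0 (crisp assignment) to log(K) (maximally ambiguous).
    Probabilities are clipped inside the logarithm to avoid log(0).
    """
    post = np.asarray(post, dtype=float)
    logs = np.log(np.clip(post, eps, 1.0))
    return float(-np.mean(np.sum(post * logs, axis=1)))


# Hypothetical posterior matrices: rows are subjects, columns are components.
crisp = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
fuzzy = np.full((4, 2), 0.5)

pc_crisp = partition_coefficient(crisp)
pc_fuzzy = partition_coefficient(fuzzy)
pe_crisp = partition_entropy(crisp)
pe_fuzzy = partition_entropy(fuzzy)
```

In a model-selection setting, one would typically fit the mixture for several candidate numbers of components and compare such index values across fits. How the article combines posterior probabilities with residuals is not stated in the abstract and is not reproduced here.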

Original language: English
Pages (from-to): 1616-1636
Number of pages: 21
Journal: Mathematical Biosciences and Engineering
Issue number: 2
Publication status: Published - 2020


Keywords

  • Cox proportional hazards model
  • EM-algorithm
  • Kernel estimator
  • Mixture regression model
  • Validity indices

ASJC Scopus subject areas

  • Modelling and Simulation
  • General Agricultural and Biological Sciences
  • Computational Mathematics
  • Applied Mathematics

