Chemoprevention of phytoestrogens on women hormone-related cancers by integrating text mining and data mining approaches

  • Sheng I. Chen
  • Guan Jun Lin
  • Yi Nung Tsao
  • Chia Chien Hsieh*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Breast, endometrial, and ovarian cancers are the most common cancers in women. Phytoestrogens, such as coumestrol, daidzein, equol, genistein, lignan, and resveratrol, are natural compounds from plants with multiple bioactivities, including chemoprevention. However, studies on their effects have yielded conflicting results. This study developed a framework to assess the chemopreventive effects of phytoestrogens using text and data mining approaches. Natural language processing parsed research papers and classified them into categories such as phytoestrogens, cancers, experiments, and results. Our dataset comprised 1682 data points from 937 PubMed-indexed papers published between 2000 and 2020; statistical and data mining analyses then identified relationships between phytoestrogens and cancer development. Chi-square analysis showed that phytoestrogens had positive effects of 91 %, 79 %, and 72 % on ovarian, breast, and endometrial cancers, respectively (χ² = 20.9, p < 0.0001). Lignan and daidzein exhibited the highest (94 %) and lowest (68 %) positive effects, respectively. Decision tree classification revealed that lignan and resveratrol had significantly stronger positive effects than coumestrol, daidzein, equol, and genistein (p = 0.046), with few conflicting results. Association rule analysis further confirmed that lignan and resveratrol had high confidence (∼90 %) and positive lift (>1.0), signaling their beneficial role in cancer prevention. Conversely, daidzein and equol slightly harmed endometrial and breast cancer cells. These results suggest that lignan and resveratrol may benefit women with hormone-related cancers, whereas daidzein, equol, genistein, and coumestrol showed limited chemopreventive effects. This is the first study integrating text and data mining to analyze nutraceuticals, demonstrating their potential in advancing precision nutrition research.
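The confidence and lift figures quoted in the abstract follow the standard association rule definitions: for a rule "compound → outcome", confidence is the share of the compound's records showing that outcome, and lift is that confidence divided by the outcome's overall share. The sketch below illustrates the arithmetic only. It is not the authors' pipeline; the function name, the record layout, and the toy records (labelled compound_A and compound_B) are hypothetical and are not taken from the study's dataset.

```python
def confidence_and_lift(records, antecedent, consequent):
    """Return (confidence, lift) for the rule antecedent -> consequent.

    records: list of (compound, outcome) pairs, one per data point.
    This layout is an assumption made for illustration.
    """
    n = len(records)
    n_a = sum(1 for c, _ in records if c == antecedent)
    n_c = sum(1 for _, o in records if o == consequent)
    n_ac = sum(1 for c, o in records if c == antecedent and o == consequent)
    if n == 0 or n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must both occur in records")

    confidence = n_ac / n_a           # P(outcome | compound)
    lift = confidence / (n_c / n)     # P(outcome | compound) / P(outcome)
    return confidence, lift


# Hypothetical toy records, NOT data from the study.
toy_records = (
    [("compound_A", "positive")] * 9
    + [("compound_A", "negative")] * 1
    + [("compound_B", "positive")] * 6
    + [("compound_B", "negative")] * 4
)

conf, lift = confidence_and_lift(toy_records, "compound_A", "positive")
print(f"confidence={conf:.2f}, lift={lift:.2f}")
```

In this toy case, 9 of 10 compound_A records are positive (confidence 0.90) against 15 of 20 records overall (0.75), so the lift is about 1.2. A lift above 1.0 means the outcome is more frequent for that compound than across the dataset as a whole.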

Original language: English
Article number: 106368
Journal: Food Bioscience
Volume: 68
DOIs
Publication status: Published - 2025 Jun
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Association rule
  • Breast cancer
  • Chi-square
  • Decision tree
  • Endometrial cancer
  • Natural language processing
  • Ovarian cancer

ASJC Scopus subject areas

  • Food Science
  • Biochemistry
