TY - GEN
T1 - An approximate approach for mining recently frequent itemsets from data streams
AU - Koh, Jia Ling
AU - Shin, Shu Ning
PY - 2006
Y1 - 2006
AB - Data streams, unbounded sequences of data elements generated at a rapid rate, provide a dynamic environment for collecting data. The knowledge embedded in a data stream is likely to change quickly over time, so capturing recent trends is an important issue when mining frequent itemsets from data streams. Although the sliding window model offers a good solution to this problem, the traditional approach must maintain complete occurrence information for every pattern within the sliding window. In this paper, two data structures are proposed to estimate the approximate supports of patterns within the current sliding window: one maintains the average time stamps of patterns and the other maintains their frequency changing points. Experimental results show that the approach significantly reduces run-time memory usage. Moreover, the proposed FCP algorithm achieves high accuracy in its mining results and guarantees that no false dismissals occur.
UR - http://www.scopus.com/inward/record.url?scp=33751372595&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33751372595&partnerID=8YFLogxK
U2 - 10.1007/11823728_34
DO - 10.1007/11823728_34
M3 - Conference contribution
AN - SCOPUS:33751372595
SN - 3540377360
SN - 9783540377368
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 352
EP - 362
BT - Data Warehousing and Knowledge Discovery - 8th International Conference, DaWaK 2006, Proceedings
PB - Springer Verlag
T2 - 8th International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2006
Y2 - 4 September 2006 through 8 September 2006
ER -