written 2.7 years ago by | • modified 2.7 years ago |
Discuss sampling techniques to extract frequent itemsets from a stream.
written 2.7 years ago by |
We assume that stream elements are baskets of items.
The simplest approach to maintaining a current estimate of the frequent itemsets in a stream is to collect some number of baskets and store them as a file.
Then run one of the frequent-itemset algorithms on this file. While it runs, either ignore the stream elements that arrive or store them as another file to be analyzed later.
When the frequent-itemsets algorithm finishes, we have an estimate of the frequent itemsets in the stream.
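As a rough illustration of the steps above, here is a minimal Python sketch. It assumes the "file" is simply an in-memory list, and it uses naive counting of itemsets up to a small size in place of a real frequent-itemset algorithm. The names `collect_baskets`, `frequent_itemsets`, `support_fraction` and `max_size` are made up for this example, and the support threshold is expressed as a fraction of the sample size, which is an assumption of the sketch.

```python
from collections import Counter
from itertools import combinations, islice


def collect_baskets(stream, n):
    """Take the next n baskets from the stream and keep them as a 'file' (a list here)."""
    return [frozenset(basket) for basket in islice(stream, n)]


def frequent_itemsets(baskets, support_fraction, max_size=2):
    """Naive counting (hypothetical stand-in for a real frequent-itemset algorithm).

    Returns every itemset of size <= max_size that appears in at least
    support_fraction of the sampled baskets, with its count.
    """
    threshold = support_fraction * len(baskets)
    counts = Counter()
    for basket in baskets:
        items = sorted(basket)  # sorted so each itemset has one canonical tuple
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


# Toy stream of baskets; the fifth basket arrives after the sample is taken
# and is simply ignored in this sketch.
stream = iter([
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "bread"},
    {"milk"},
])

sample = collect_baskets(stream, 4)
estimate = frequent_itemsets(sample, support_fraction=0.5)
```

With this toy data, `estimate` contains the pair `("bread", "milk")`, which occurs in three of the four sampled baskets, but not `("eggs", "milk")`, which occurs in only one.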
We can use this collection of frequent itemsets for the application, but start running another iteration of the chosen frequent-itemset algorithm immediately. This next iteration can either:
1. Use the file that was collected while the first iteration of the algorithm was running. At the same time, collect yet another file to be used at another iteration of the algorithm, when this current iteration finishes.
2. Start collecting another file of baskets, and run the algorithm once an adequate number of baskets has been collected.
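The repeated iterations can be sketched along the following lines. This is a single-threaded simplification that matches the second option most closely: it gathers a fixed number of baskets, mines them, and then starts gathering again. The first option would differ mainly in that collection and mining overlap in time, which would need a background thread or process and is not shown here. The function names, `batch_size`, and the stand-in miner that only finds frequent single items are all assumptions made for this example.

```python
from collections import Counter
from itertools import islice


def frequent_items(baskets, support_fraction):
    """Hypothetical stand-in miner: single items frequent within the given baskets."""
    counts = Counter(item for basket in baskets for item in set(basket))
    threshold = support_fraction * len(baskets)
    return {item for item, c in counts.items() if c >= threshold}


def rolling_estimates(stream, batch_size, support_fraction, mine=frequent_items):
    """Yield one fresh estimate per collected file of batch_size baskets."""
    stream = iter(stream)
    while True:
        batch = list(islice(stream, batch_size))
        if len(batch) < batch_size:
            return  # not enough baskets collected for another iteration
        yield mine(batch, support_fraction)


# Toy stream in which the popular item changes from "a" to "c".
baskets = [{"a", "b"}, {"a"}, {"a", "c"}, {"c"}, {"c", "b"}, {"c"}, {"a"}]
estimates = list(rolling_estimates(baskets, batch_size=3, support_fraction=0.6))
```

Here `estimates` holds two results, one per complete batch of three baskets. The application would always use the most recent one, so the estimate follows the stream as the popular items change.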