05-12-2012, 04:46 PM
Predicting Missing Items in Shopping Carts
predicting missing item in shopping cart.pdf (Size: 2.71 MB / Downloads: 33)
INTRODUCTION
THE primary task of association mining is to detect
frequently co-occurring groups of items in transactional
databases. The intention is to use this knowledge for
prediction purposes: if bread, butter, and milk often
appear in the same transactions, then the presence of
butter and milk in a shopping cart suggests that the
customer may also buy bread. More generally, knowing
which items a shopping cart contains, we want to predict
other items that the customer is likely to add before
proceeding to the checkout counter.
This paradigm can be exploited in diverse applications.
For example, in the domain discussed in [1], each
“shopping cart” contained a set of hyperlinks pointing to
a Web page [1]; in medical applications, the shopping cart
may contain a patient’s symptoms, results of lab tests, and
diagnoses; in a financial domain, the cart may contain
companies held in the same portfolio; and Bollmann-Sdorra
et al. [2] proposed a framework that employs frequent
itemsets in the field of information retrieval.
PROBLEM STATEMENT
Let I ¼ fi1;. . . ;ing be a set of distinct items and let a database
consist of transactions T1;. . . ;TN such that Ti I; 8i. An
itemset, X, is a group of items, i.e., X I. The support of
itemset X is the number, or the percentage, of transactions
that subsume X. An itemset that satisfies a user-specified
minimum support value is referred to as a frequent itemset or
a high support itemset.
An association rule has the form rðaÞ ) rðcÞ, where rðaÞ and
rðcÞ are itemsets. The former, rðaÞ, is the rule’s antecedent and
the latter, rðcÞ, its consequent. The rule reads: if all items from
rðaÞ are present in a transaction, then all items from rðcÞ are
also present in the same transaction. The rule does not have
to be absolutely reliable. The probabilistic confidence in the
rule rðaÞ ) rðcÞ can be defined with the help of supports
(relative frequencies) of the antecedent and consequent as
the percentage of transactions that contain rðcÞ among those
transactions that contain rðaÞ:
EARLIER WORK
Association mining systems that have been developed with
classification purposes in mind are sometimes dubbed
classification rule mining. Some of these techniques can be
adapted to our needs.
Take, for instance, the approach proposed in [12]. If ij
is the item whose absence or presence is to be predicted,
the technique can be used to generate all rules that have
the form rðaÞ ) Ij, where rðaÞ ðI n fijgÞ and Ij is the
binary class label (ij ¼ present or ij ¼ absent). For a given
itemset s, the technique identifies among the rules with
antecedents subsumed by s those that have the highest
precedence according to the reliability of the rules—this
reliability is assessed based on the rules’ confidence and
support values. The rule is then used for the prediction of
ij. The method suffers from three shortcomings. First, it is
clearly not suitable in domains with many distinct items
ij. Second, the consequent is predicted based on the
“testimony” of a single rule, ignoring the simple fact that
rules with the same antecedent can imply different
consequents—a method to combine these rules is needed.
Third, the system may be sensitive to the subjective userspecified
support and confidence thresholds.
BAYESIAN APPROACH
We are not aware of any attempt to apply to our task the
Bayesian approach, a technique that has been around
since before World War II [16], [17]. The mathematically
“clean” version is known to be computationally expensive
in domains where many independent variables are present.
Fortunately, this difficulty can be sidestepped by the socalled
Naive-Bayes principle that assumes that all variables
are mutually pairwise conditionally independent [18].
THE PROPOSED APPROACH
Association rule mining (ARM) in its original form finds all
the rules that satisfy the minimum support and minimum
confidence constraints. Many later papers tried to integrate
classification and ARM. The goal was to build a classifier
using so-called class association rules. In classification rule
mining, there is one and only one predetermined target, the
class label. Most of the time, classification rule mining is
applied to databases in a “table” format, with a predefined
set of attributes and a class label. Attributes usually take a
value out of a finite set of values (although missing values
are often permitted).
Some papers, such as [11] and [14], demonstrated
encouraging results by incorporating DS theoretic notions
with class association rules. But most of these methods
were designed for data sets with limited number of
attributes (or data sets with small number of distinct items)
and one class label. In our task, we do not have a
predefined class label. In fact, all items in the shopping
cart become attributes and the presence/absence of the
other items has to be predicted. What is needed is a feasible
rule generation algorithm and an effective method to use to
this end the generated rules.
For the prediction of all missing items in a shopping cart,
our algorithm speeds up the computation by the use of the
itemset trees (IT-trees) and then uses DS theoretic notions to
combine the generated rules. The flowchart in Fig. 1 shows
an outline of our proposed system.