19-03-2012, 01:09 PM
Mining Knowledge-Sharing Sites for Viral Marketing
viral marketing paper.pdf (Size: 143.66 KB / Downloads: 31)
ABSTRACT
Viral marketing takes advantage of networks of influence among
customers to inexpensively achieve large changes in behavior.
Our research seeks to put it on a firmer footing by mining these
networks from data, building probabilistic models of them, and
using these models to choose the best viral marketing plan.
Knowledge-sharing sites, where customers review products and
advise each other, are a fertile source for this type of data mining.
In this paper we extend our previous techniques, achieving a large
reduction in computational cost, and apply them to data from a
knowledge-sharing site. We optimize the amount of marketing
funds spent on each customer, rather than just making a binary
decision on whether to market to him. We take into account the
fact that knowledge of the network is partial, and that gathering
that knowledge can itself have a cost. Our results show the robustness
and utility of our approach.
1. INTRODUCTION
Marketing has been one of the major applications of data mining
since the field emerged. Typically, the decision of whether or not
to market to a particular person is based solely on their characteristics
(direct marketing), or those of the population segment to
which they belong (mass marketing). This often leads to suboptimal
marketing decisions by not taking into account the effect
that members of a market have on each other’s purchasing decisions.
In many markets, customers are strongly influenced by the
opinions of their peers. Viral marketing takes advantage of this to
inexpensively promote a product by marketing primarily to those
with the strongest influence in the market.
THE MODEL
Consider a set of n potential customers, and let Xi be a Boolean
variable that takes the value 1 if customer i buys the product being
marketed, and 0 otherwise. Let the neighbors of Xi be the customers
who directly influence Xi: Ni={Xi,1,…,Xi,ni} Í X-{Xi}, where
X={X1,…,Xn}. The product is described by a set of attributes
Y={Y1,…,Ym}. Let Mi be the marketing action that is taken for
customer i. For example, Mi could be a Boolean variable, with
Mi=1 if the customer is (say) offered a discount, and Mi=0 otherwise.
Alternatively, Mi could be a continuous variable indicating
the size of the discount offered, or a nominal variable indicating
which of several possible actions is taken. Let M={M1,…,Mn} be
the marketing plan.
MINING KNOWLEDGE-SHARING SITES
Internet use has exploded over the past decade. Millions of people
interact with each other online, and, in many instances, those
social interactions are recorded in archives that reach back twenty
years or more1. As a result, there are many online opportunities to
mine social networks for the purposes of viral marketing. UseNet
newsgroups, IRC, instant messaging, online forums, and email
mailing lists are examples of possible sources.
Speed
The linear model introduced in this paper has tremendous speed
advantages over a non-linear model such as that introduced in our
previous work. Because of the independence that linearity provides,
we are able to simultaneously calculate the network value
for all customers. The network value is independent of the marketing
actions being performed on others, which allows us to find the
optimal marketing plan7 without performing a heuristic search
over plans. It would take approximately 100 hours to perform the
single-pass search (the fastest of the heuristic search methods
introduced in our previous work) with this model, or about 10-15
minutes if we make approximations in the inference. In contrast,
the linear model takes 1.05 seconds to find the optimal marketing
plan. At these speeds, our model could be used to find optimal
marketing plans for markets involving hundreds of millions of
customers in just hours.
Continuous Marketing Actions
Continuous-valued marketing actions (MiÎ[0,1]) allow the marketer
to better optimize the marketing plan – tailoring the action
for each person specifically to his characteristics. Our framework
allows for any function to be used to model P0( Xi | Y, Mi ), as
long as it is differentiable in Mi. As in the Boolean case, we have
chosen to model the effect of marketing as a multiplicative factor
on the internal probability of purchasing: