Rick L. Andrews, Imran S. Currim
Over the past two decades, marketing scientists in academia and industry have employed consumer choice models calibrated using supermarket scanner data to assess the impact of price and promotion on consumer choice, and they continue to do so today in order to guide managerial decisions regarding price and promotion strategies.
In its raw form, scanner panel data for a product category often contains information on the purchases of hundreds of Stock Keeping Units (SKUs), representing many brands, sizes, product forms, and formulations, by thousands of consumers. Typically, some of these brands, sizes, product forms, and formulations are judged to be less significant in terms of market share and influence on consumer purchase behavior, and they are sometimes eliminated from the dataset to improve parameter estimates and reduce computing time. In the marketing literature, there is no standard practice as to how these brands, sizes, product forms, and formulations should be removed from the dataset.
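To make the pruning step concrete, the following is a minimal sketch of one possible rule: dropping items whose share of purchase occasions falls below a threshold. The record layout, the function name `prune_low_share_items`, the use of purchase counts rather than volume or revenue as the share measure, and the threshold value are all illustrative assumptions, not a procedure taken from any particular study.

```python
from collections import Counter


def prune_low_share_items(purchases, min_share=0.01):
    """Drop purchases of items whose share of all purchases is below min_share.

    `purchases` is a list of (household_id, item) tuples. The layout and the
    default threshold are hypothetical choices made for illustration only.
    """
    counts = Counter(item for _, item in purchases)
    total = sum(counts.values())
    # Share here is the fraction of purchase occasions, an assumed definition.
    kept_items = {item for item, n in counts.items() if n / total >= min_share}
    kept = [(hh, item) for hh, item in purchases if item in kept_items]
    return kept, kept_items


# Toy data: item "C" accounts for 1 of 10 purchases.
purchases = [(1, "A")] * 5 + [(2, "B")] * 4 + [(3, "C")]
kept, kept_items = prune_low_share_items(purchases, min_share=0.2)
```

In this toy example, removing the low-share item also removes household 3 from the data entirely, since that household bought nothing else. Pruning alternatives can therefore change which consumers remain in the sample, not only which alternatives are modeled.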
Likewise, raw scanner panel data may contain purchases from some panelists who do not make enough purchases over a two-year period to provide insight into consideration set composition and loyalty and variety-seeking behaviors, and so these panelists are sometimes eliminated from the dataset. On the other hand, such exclusion may produce bias in estimated parameters, since heavier users are more price sensitive and have more sharply defined preferences for national brands than lighter users (Kim & Rossi, 1994). Again, there is no standard practice as to how purchases should be removed from the dataset. Some studies sample households, including the entire purchase history of each selected household purchasing only from the selected brands, while others sample purchases of the selected brands and omit purchases of other brands, possibly resulting