Ted O’Donoghue
Department of Economics
Cornell University
and
Matthew Rabin
Department of Economics
University of California, Berkeley
March 6, 2006
Abstract
We investigate ‘‘sin taxes’’ on unhealthy items, such as fatty foods, that people may (by their own reckoning) consume too much of. We employ a standard optimal-taxation framework, but replace the standard assumption that all consumers have 100% self control with an assumption that some consumers may have some degree of self-control problems. We show that imposing taxes on unhealthy items and returning the proceeds to consumers can generally improve total social surplus. Because such taxes counteract over-consumption by consumers with self-control problems while at the same time they naturally redistribute income to consumers with no self-control problems (who consume less), such taxes can even create Pareto improvements. Finally, we demonstrate with some simple numerical examples that even if the population exhibits relatively few self-control problems, optimal taxes can still be large.
Acknowledgments: We thank Robert Hall and Robert Barro for helpful discussions of a related shorter paper presented at the
AEA meetings in January 2003. For helpful comments, we thank Jonathan Gruber and an anonymous referee, and also Steve Coate,
Botond Koszegi, Emmanuel Saez, and seminar participants at Harvard, Yale, Stanford, Cornell, Vanderbilt, North Carolina State, the
USC Behavioral Public Finance Conference, the 2003 Association of Public Economic Theory Conference at Duke, the 2004 ASSA meetings, and the Cornell-LSE-MIT Conference on Behavioral Public and Development Economics. For research assistance, we thank Christoph Vanberg and Chris Cotton. For financial support, we thank the National Science Foundation (grants SES-0214043 and SES-0214147), and Rabin thanks the Russell Sage and MacArthur Foundations.
Mail: Ted O’Donoghue / Department of Economics / Cornell University / 414 Uris Hall / Ithaca, NY 14853-7601. Matthew Rabin
/ Department of Economics / 549 Evans Hall #3880 / University of California, Berkeley / Berkeley, CA 94720-3880.
E-mail: edo1@cornell.edu and rabin@econ.berkeley.edu. CB handles: ‘‘Puck Boy’’ and ‘‘Game Boy’’.
Web pages: http://www.people.cornell.edu/pages/edo1/ and http://emlab.berkeley.edu/users/rabin/.
‘‘Never eat more than you can lift.’’ – Miss Piggy
1. Introduction
We investigate the welfare effects of ‘‘sin taxes’’ on unhealthy items, such as fatty foods, that people may (by their own reckoning) consume too much of. The standard economic approach to taxation a priori assumes that there is no such ‘‘over-consumption’’, and hence the only reasons to tax commodities are to raise revenue, to correct externalities, or to redistribute wealth. If, however, people do exhibit such over-consumption — due to self-control problems or some other error — then the standard calculus of optimal taxation does not necessarily hold.
1
We employ a standard framework to study optimal commodity taxation, but we replace the standard assumption that all consumers have 100% self control with an assumption that some consumers may have some degree of self-control problems — formalized as a time-inconsistent preference for immediate gratification. Using a social-welfare function that puts equal weight on all individuals, we show that imposing taxes on unhealthy items and returning the proceeds to consumers can generally improve social surplus. Moreover, we find that such taxes can even create Pareto improvements — in other words, such taxes need not involve helping people with self-control problems to the detriment of fully rational people. Finally, we demonstrate with some simple numerical examples that even if the prevalence of self-control problems in the population is relatively small, optimal sin taxes can still be large.
2
In Section 2, we develop a framework to analyze optimal commodity taxation when agents might have self-control problems. We focus on a simple quasi-linear economy where, in addition to a composite good, there is a ‘‘sin good’’ — which we refer to as potato chips — that is enjoyable to consume but creates negative health consequences in the future.
3
Two basic results are immediate. First, self-control problems lead to over-consumption of potato chips. Intuitively, current consumption imposes a negative externality on one’s future self, dubbed a ‘‘negative internality’’ by Herrnstein, Loewenstein, Prelec, and Vaughan (1993). Whereas people without self-control problems fully internalize this externality, people with self-control problems only partially internalize it. In particular, whereas optimal behavior gives full weight to any future health cost, the current self gives it only partial weight. This negative-externality intuition immediately leads to our second basic result: With homogeneous consumers, a simple Pigouvian tax-and-transfer scheme (Pigou, 1920) can induce people to consume optimally.
1
Over the years, some researchers (e.g., Musgrave (1959) and Besley (1988)) have studied ‘‘merit goods,’’ for which the government has a different notion of individuals’ optimal consumption than the individuals themselves have.
2
Some of the ideas presented here appear in more preliminary form in O’Donoghue and Rabin (2003). Three other recent papers also study the welfare effects of sin taxes. Gruber and Koszegi (2001) also suggest that a Pigouvian tax could be used to counteract over-consumption due to self-control problems, and they conduct some simulations to assess the relevant magnitudes for optimal cigarette taxes. They do not address optimal taxation in a heterogeneous population, however, which is our primary focus. Gruber and Koszegi (2004) also study cigarette taxation in the presence of self-control problems, and they calibrate tax incidence for different income groups. But they do not derive optimal taxes, and once again they do not focus on population heterogeneity in self-control problems. In a somewhat different approach, Gruber and Mullainathan (2005) use survey data from Canada to provide empirical evidence that higher local cigarette taxes lead to increased happiness. For another example of tax analysis when taxpayers are boundedly rational, see Sheshinski (2002).
3
Our analysis applies fully to chips both with and without ridges, and we believe our qualitative results extend to potato sticks, tater tots, and french fries.
Our primary interest, however, is the case in which there is population heterogeneity in both people’s tastes and in their degree of self-control problems. Even in this case, the first-best outcome can be implemented with individual-specific taxes and transfers. But since such schemes are unrealistic, we investigate constrained-optimal taxation when the government is limited to using linear taxes and lump-sum transfers that are the same for all consumers.
4
In Section 3, we analyze optimal taxation using a conventional social-welfare function that puts equal weight on all people. If there are no self-control problems in the population, then it would be optimal not to tax potato chips, because such taxes would merely distort the otherwise optimal potato-chip consumption. If instead some people have self-control problems, then it is indeed optimal to tax potato chips. Although taxes create consumption distortions for fully self-controlled people, such distortions are second-order relative to the benefits from reducing over-consumption by people with self-control problems.
In Section 4, we characterize Pareto-efficient taxation. Specifically, we investigate a constrained
Pareto efficiency in which the government is restricted to uniform taxes and transfers. In this context, potato-chip taxes and the associated transfers have two effects on individuals’ welfare: they affect the efficiency of potato-chip consumption, but they also redistribute income from those with high potato-chip consumption to those with low potato-chip consumption. Two results follow.
First, if there is any taste heterogeneity, then even when there are no self-control problems in the population, small taxes are not Pareto-inefficient. Intuitively, for fully self-controlled individuals, small taxes create only second-order consumption distortions, while at the same time there are first-order income redistributions from people with a strong taste for potato chips to people with a weak taste for potato chips. Second, when there are self-control problems in the population, potato-chip taxes may, in fact, create Pareto improvements. People with self-control problems benefit because the taxes counteract their over-consumption. At the same time, because people with self-control problems consume more potato chips than people without self-control problems, income is naturally redistributed from people with self-control problems to people without self-control problems. Indeed, we show that, under quite reasonable conditions, it is possible to tax potato chips in a way that on average helps both people with self-control problems and those without. Under the less reasonable assumption that the variation in tastes for potato chips is sufficiently small, such taxes could help every single person.
4
Our analysis is similar in spirit to analyses (following Diamond (1973)) of uniform taxation to correct non-uniform externalities.
Our formal results in Sections 3 and 4 are based on marginal arguments: we show that, when there are self-control problems in the population, infinitesimal departures from the policy that would be optimal given 100% rationality always increase average welfare and can sometimes make everyone better off. Yet we suspect that these results are not of marginal interest to economists. To highlight the potential importance of these results, in Section 5 we demonstrate with some simple numerical examples that, even if the prevalence of self-control problems in the population is relatively small, optimal taxes can still be significant.
Our formal analysis focuses on over-consumption due to a time-inconsistent preference for immediate gratification. In Section 6, we explore to what extent our conclusions would hold in other behavioral models of sin-good consumption. Finally, we conclude in Section 7 by discussing the broader implications of our analysis, and also its limitations.
2. Model and Basic Results
Consider a simple consumption model of the form introduced by Pigou (1920). There are two goods, potato chips and a composite good. Both goods are produced with constant returns to scale, and we normalize units so that they have identical marginal costs. Moreover, we assume that both markets are competitive, and we normalize the price of the composite good to be one, which implies the marginal cost of each good is also one.
5
We assume that time is discrete, and that consumption
(and production) occur in all periods.
5
Given these assumptions, there are no distortions from mispriced goods: absent taxes, goods are priced at their constant marginal cost.
The essential feature of ‘‘sin goods’’ such as potato chips is that current consumption generates immediate enjoyment but future health costs or other negative consequences. To incorporate this feature, we assume that the person’s instantaneous utility in period $t$ takes the quasi-linear form $u_t \equiv v(x_t; \rho) - c(x_{t-1}; \gamma) + z_t$, where $x_t$ and $z_t$ denote, respectively, an individual’s period-$t$ consumption of potato chips and the composite good. The function $v(x_t; \rho)$ represents the immediate benefits from current potato-chip consumption, while $c(x_{t-1}; \gamma)$ represents the negative health consequences from past potato-chip consumption.
6
We assume $v_x > 0$ and $v_{xx} < 0$, so that there are decreasing marginal benefits to consumption. We also assume $c_x > 0$, but we allow that there might be increasing, constant, or decreasing marginal health costs; that is, we allow $c_{xx} > 0$, $c_{xx} = 0$, or $c_{xx} < 0$.
7
The parameters $\rho$ and $\gamma$ capture population heterogeneity in tastes. In particular, we assume that $v_{x\rho} > 0$, so that a higher $\rho$ corresponds to a higher marginal benefit from consumption; and we assume that $c_{x\gamma} > 0$, so that a higher $\gamma$ corresponds to a higher marginal health cost from consumption.
We assume throughout that people cannot borrow or save. This assumption helps to clarify the basic logic of our results. In particular, it permits us to isolate the intratemporal distortion in potato-chip consumption without any confounding effects from intertemporal distortions in savings behavior. In a more general model, there might be interaction effects wherein distortions in saving behavior influence potato-chip consumption, or distortions in potato-chip consumption influence savings behavior. We suspect, however, that as long as potato-chip expenditures are small relative to people’s income, such effects should be small.
How a person trades off the immediate benefits of potato-chip consumption against the future health costs depends on her intertemporal preferences. We assume that people might exhibit a tendency to pursue immediate gratification in a way that they themselves disapprove of in the long run. 8
6
As will become clear, the fact that the negative consequences go only one period forward is not essential. Indeed, $c(x_t; \gamma)$ can be interpreted as the (exponentially) discounted sum of the future health costs due to period-$t$ consumption.
7
The combination of concave benefits and concave costs can of course create problems. Hence, whenever $c_{xx} < 0$ we impose the additional assumption that $v_{xx} - c_{xx} < 0$, which guarantees that behavior is well-behaved.
8
For evidence that most humans do, indeed, have such a tendency, see Ainslie (1991, 1992), Ainslie and Haslam (1992a, 1992b), Loewenstein and Prelec (1992), Thaler (1991), and Thaler and Loewenstein (1992). For a recent overview, see Frederick, Loewenstein, and O’Donoghue (2002). This tendency is often referred to as ‘‘hyperbolic discounting’’.
Beginning with Laibson (1997), recent research on such present-biased preferences uses a simple and convenient functional form: A person’s intertemporal preferences at time $t$ are given by
$$U^t(u_t, \ldots, u_T) \equiv u_t + \beta \sum_{\tau=t+1}^{T} \delta^{\tau-t} u_\tau,$$
where $u_\tau$ is her instantaneous utility in period $\tau$. This two-parameter model is a simple modification of the standard one-parameter, exponential-discounting model. The parameter $\delta$ represents standard time-consistent impatience; for $\beta = 1$ these preferences reduce to exponential discounting. The parameter $\beta$ represents a time-inconsistent preference for immediate gratification, where $\beta < 1$ implies an extra bias for now over the future. To simplify our analysis, we assume throughout that there is no time-consistent discounting, or $\delta = 1$.
9
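The present-bias formula can be made concrete with a small sketch. The function name and the utility flows below are hypothetical and chosen only for illustration: eating chips is assumed to yield 1.0 now and to cost 1.5 one period later. The same choice is evaluated in the period of consumption and one period in advance.

```python
def present_biased_utility(flows, beta, delta=1.0):
    """U^t = u_t + beta * sum_{k>=1} delta**k * u_{t+k} for a list of flows."""
    now, later = flows[0], flows[1:]
    return now + beta * sum(delta ** k * u for k, u in enumerate(later, start=1))

# Hypothetical flows: benefit 1.0 in the period of eating, cost 1.5 afterwards.
eat_today = [1.0, -1.5]           # evaluated in the period of consumption
eat_tomorrow = [0.0, 1.0, -1.5]   # the same choice, evaluated one period earlier

for beta in (1.0, 0.5):
    print(
        f"beta={beta}: value now {present_biased_utility(eat_today, beta):+.2f}, "
        f"value in advance {present_biased_utility(eat_tomorrow, beta):+.2f}"
    )
```

With these assumed numbers, a person with β = 0.5 wants to eat when the moment arrives but disapproves of the same choice when viewing it in advance, which is the time inconsistency the text describes; with β = 1 the two evaluations agree.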
Because we assume that the benefits and costs from period-$t$ consumption are additively separable from the benefits and costs from consumption in any other period, the person effectively faces a series of independent decisions. In particular, in every period the person will choose her current consumption $(x, z)$ to maximize $u^*(x, z) \equiv v(x; \rho) - \beta c(x; \gamma) + z$, subject to the budget constraint that we discuss below.
10
We and other researchers often refer to $\beta < 1$ as representing a ‘‘self-control problem’’ because it reflects a short-term desire or propensity that the person disapproves of at every other moment in her life. Our welfare analysis therefore treats this preference for immediate gratification as an error, although the main points apply for essentially any welfare criterion. Specifically, we shall measure a person’s welfare as a function of her choice by her long-run utility $u^{**}(x, z) \equiv v(x; \rho) - c(x; \gamma) + z$.
11
The crucial feature that drives our results is that a person’s behavior may not maximize her own welfare. This feature is quite common in the behavioral-economics literature, which often examines ‘‘errors’’ in utility maximization. Indeed, to highlight this feature, Kahneman (1994) makes an explicit distinction between a person’s ‘‘decision utility’’, which is the utility function that explains a person’s choices, and a person’s ‘‘experienced utility’’, which is the utility function that reflects her welfare. In our model, $u^*(x, z)$ is the person’s decision utility, whereas $u^{**}(x, z)$ is the person’s experienced utility, and these differ when $\beta < 1$. Although our formal analysis focuses on this one specific source of decision utility deviating from experienced utility, we discuss in Section 6 the extent to which our conclusions hold in other behavioral models.
9
This model was originally developed by Phelps and Pollak (1968) in the context of intergenerational altruism. It has been used in recent years by numerous authors, including Laibson (1997, 1998), Laibson, Repetto, and Tobacman (1998), Angeletos, Laibson, Repetto, Tobacman and Weinberg (2001), O’Donoghue and Rabin (1999a, 2001), Fischer (1999), Carrillo and Mariotti (2000), and Benabou and Tirole (2002).
10
The behavior of people with time-inconsistent preferences often depends on whether they are aware vs. unaware of their future self-control problems, that is, on whether they are ‘‘sophisticated’’ vs. ‘‘naive’’. For our analysis in this paper, however, this distinction is irrelevant because there is no intertemporal link between decisions; that is, one’s optimal behavior now is independent of her beliefs about her future behavior.
11
In other words, we are using the long-run perspective for an agent’s welfare function. While researchers sometimes worry about the appropriate welfare function for time-inconsistent agents, here there should be no controversy: $u^{**}$ is appropriate under essentially any perspective. Perhaps most importantly, for any tax policy that takes effect in the future, under the $\beta$-$\delta$ model, the agent agrees that $u^{**}$ is the appropriate welfare function. Alternatively, note that, if the person consumes a bundle $(x', z')$ in all periods (as she does in our model), then her instantaneous utility will be exactly $u^{**}(x', z')$ in all periods except period 1. Hence, measuring welfare using $u^{**}(x', z')$ means we are also equating welfare to the person’s per-period utility flow.
Consider ideal versus actual behavior for an individual with per-period income $I$, where $I$ is ‘‘large’’ relative to potato-chip consumption. The first-best optimal allocation for this individual, which we denote by $(x^{**}, z^{**})$, maximizes long-run utility $u^{**}(x, z)$ subject to the resource constraint $x + z \leq I$. Hence, $x^{**}$ satisfies $v_x(x^{**}; \rho) - c_x(x^{**}; \gamma) - 1 = 0$ and $z^{**} = I - x^{**}$.
The person’s actual behavior depends on taxes. Without taxes, the market price of potato chips will equal their marginal cost (which we have normalized to be 1), or $p_x = 1$. If the government imposes a per-unit tax $t$ on potato chips, then the market price will be $p_x = 1 + t$. If in addition the person receives a lump-sum transfer $\ell$ from the government, her (per-period) budget constraint will be $(1 + t)x + z \leq I + \ell$. The person will choose her consumption allocation to maximize $u^*(x, z)$ subject to this budget constraint. Hence, her consumption of potato chips, which we denote by $x^*(t)$, satisfies $v_x(x^*(t); \rho) - \beta c_x(x^*(t); \gamma) - (1 + t) = 0$, and her consumption of the composite good is $z^*(t, \ell) = I + \ell - (1 + t)x^*(t)$.
12
From these conditions, it is straightforward to derive our first basic result: In the absence of taxes (when $t = 0$), self-control problems lead to over-consumption of potato chips. In other words, for all $\rho$ and $\gamma$, whereas actual potato-chip consumption $x^*(0)$ is identical to first-best potato-chip consumption $x^{**}$ for people with $\beta = 1$, actual potato-chip consumption $x^*(0)$ is larger than first-best potato-chip consumption $x^{**}$ for people with $\beta < 1$. As we discuss in Section 1, this result can be interpreted in standard externality terms. Current consumption of potato chips imposes a negative externality on future selves. Whereas optimality requires giving full weight to the future cost, the current self only gives it weight $\beta < 1$, and so ignores proportion $1 - \beta$ of the future cost.
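The over-consumption result can be illustrated numerically. The sketch below assumes the functional forms that the paper reports for its Section 5 example, $v(x; \rho) = \rho x^{1-r}/(1-r)$ and $c(x; \gamma) = \gamma x$, for which the first-order condition has the closed form $x^*(t) = (\rho/(1 + t + \beta\gamma))^{1/r}$. The parameter values and the function name are hypothetical.

```python
def chips_demand(t, rho, gamma, beta, r):
    """x*(t) solving rho * x**(-r) - beta * gamma - (1 + t) = 0."""
    return (rho / (1.0 + t + beta * gamma)) ** (1.0 / r)

# Hypothetical taste parameters, used only for illustration.
rho, gamma, r = 1.0, 1.0, 0.5

# First-best consumption x** coincides with untaxed demand when beta = 1.
first_best = chips_demand(0.0, rho, gamma, 1.0, r)

overconsumption = {
    beta: chips_demand(0.0, rho, gamma, beta, r) - first_best
    for beta in (1.0, 0.9, 0.7, 0.5)
}
for beta, excess in overconsumption.items():
    print(f"beta={beta:.1f}: untaxed consumption exceeds first best by {excess:.4f}")
```

Under these assumptions the excess is zero for β = 1 and grows as β falls, since a larger share 1 − β of the health cost is ignored.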
12
Given our assumption that $I$ is ‘‘large’’, potato-chip consumption is independent of $\ell$. Also, note that, for notational simplicity, we suppress the arguments $\rho$, $\gamma$, and $\beta$ in the expressions $x^{**}$, $z^{**}$, $x^*(t)$, and $z^*(t, \ell)$.
This negative-externality intuition generates our second basic result: In a population of homogeneous consumers with self-control problems, a simple Pigouvian tax-and-transfer scheme can be used to correct this over-consumption. In particular, consider the case in which there are many identical consumers and the government imposes a per-unit tax $t$ on potato chips and returns the proceeds to consumers via a uniform (per-capita) lump sum $\ell$. Since the lump sum is independent of each consumer’s own behavior, it is easy to see that a tax rate $t^{**} = (1-\beta)c_x(x^{**})$ will implement the first-best outcome; that is, it will induce all consumers to choose $x^*(t^{**}) = x^{**}$.
13
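A quick numerical check of the Pigouvian result, again as a sketch under assumed functional forms: $v(x; \rho) = \rho x^{1-r}/(1-r)$ and $c(x; \gamma) = \gamma x$ (the forms the paper attributes to its Section 5 example), so that $c_x = \gamma$ and the corrective tax reduces to $t^{**} = (1-\beta)\gamma$. The parameter values and function names are hypothetical.

```python
def chips_demand(t, rho, gamma, beta, r):
    """x*(t) solving rho * x**(-r) - beta * gamma - (1 + t) = 0."""
    return (rho / (1.0 + t + beta * gamma)) ** (1.0 / r)

def pigouvian_tax(gamma, beta):
    """t** = (1 - beta) * c_x(x**); with linear health costs, c_x = gamma."""
    return (1.0 - beta) * gamma

# Hypothetical homogeneous population with a self-control problem.
rho, gamma, beta, r = 1.0, 1.0, 0.5, 0.5

first_best = chips_demand(0.0, rho, gamma, 1.0, r)   # x**: what beta = 1 would choose
untaxed = chips_demand(0.0, rho, gamma, beta, r)     # x*(0)
corrected = chips_demand(pigouvian_tax(gamma, beta), rho, gamma, beta, r)  # x*(t**)

print(f"first best {first_best:.4f}, untaxed {untaxed:.4f}, with t** {corrected:.4f}")
```

Because the lump sum does not enter the first-order condition, the tax alone determines consumption; the transfer only returns the revenue.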
While these basic results are straightforward, the heart of our analysis will focus on the case in which there is population heterogeneity in tastes, as captured by heterogeneity in $\rho$ and $\gamma$, and (more importantly) also in the degree of self-control problems, as captured by heterogeneity in $\beta$. In this case, implementing the first-best outcome would require individual-specific taxes and lump-sum transfers. Such schemes are presumably unrealistic, because of informational constraints, implementation costs, arbitrage opportunities, and so forth. We will therefore assume that the government is limited to using linear taxes and lump-sum transfers that are the same for all consumers. Formally, we assume that the population distribution of parameters is given by a cumulative distribution $F(\rho, \gamma, \beta)$. In addition, we assume that the distribution of tastes is independent from the distribution of self-control problems; that is, $F(\rho, \gamma, \beta)$ can be written as $G(\rho, \gamma)H(\beta)$. Given that an individual’s demand for potato chips is $x^*(t)$, the aggregate demand for potato chips (in per capita terms) is $X^*(t) = E_F[x^*(t)]$. Hence, a tax rate $t$ will raise (per-capita) revenue $tX^*(t)$, and if the government returns all tax proceeds to consumers, the (per-capita) lump-sum transfer will be $\ell(t) = tX^*(t)$. The next two sections analyze optimal taxation given this heterogeneity.
3. Optimal Taxes
In this section, we analyze optimal taxation given a specific social-welfare function, as in Diamond and Mirrlees (1971a, 1971b) and the subsequent optimal-taxation literature. Specifically, we use a social-welfare function that puts ‘‘equal weight’’ on all people, although the basic ideas will clearly hold for other weights as well.
Recall that an individual’s welfare function is $u^{**}(x, z) \equiv v(x; \rho) - c(x; \gamma) + z$. Given a tax $t$ and a lump sum $\ell(t)$, the person will choose consumption bundle $(x^*(t), z^*(t, \ell(t)))$, and so the person’s welfare as a function of $t$ will be $u^{**}(x^*(t), z^*(t, \ell(t)))$. To put equal weight on all people, the social-welfare function will be the expectation of individual welfare; that is,
$$\Omega(t) \equiv E_F\left[u^{**}(x^*(t), z^*(t, \ell(t)))\right] = E_F\left[v(x^*(t); \rho) - c(x^*(t); \gamma) + I + \ell(t) - (1 + t)x^*(t)\right].$$
(Note that we have substituted $z^*(t, \ell) = I + \ell(t) - (1 + t)x^*(t)$.) Finally, because the tax payments and lump-sum transfers sum up to zero, that is, because $\ell(t) = E_F[t x^*(t)]$, we can simplify this equation to
$$\Omega(t) = E_F\left[v(x^*(t); \rho) - c(x^*(t); \gamma) - x^*(t) + I\right].$$
13
‘‘Proof’’: Given tax $t = (1-\beta)c_x(x^{**}; \gamma)$, $x^*(t)$ satisfies $v_x(x^*(t); \rho) - \beta c_x(x^*(t); \gamma) - (1 + (1-\beta)c_x(x^{**}; \gamma)) = 0$, which can be rewritten as $v_x(x^*(t); \rho) - c_x(x^{**}; \gamma) - 1 + \beta\left[c_x(x^{**}; \gamma) - c_x(x^*(t); \gamma)\right] = 0$, which is satisfied if and only if $x^*(t) = x^{**}$.
For each individual, $\hat{u}(t) \equiv v(x^*(t); \rho) - c(x^*(t); \gamma) - x^*(t)$ reflects the efficiency of potato-chip consumption. For any $(\rho, \gamma, \beta)$, $\hat{u}(t)$ is maximized when actual consumption $x^*(t)$ is equal to first-best consumption $x^{**}$, and otherwise there are consumption distortions. Hence, if policymakers put equal weight on all people, they will be concerned exclusively with minimizing the average distortion in potato-chip consumption. Proposition 1 characterizes optimal taxation given this social-welfare function. (All proofs are in the Appendix.)
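The simplified welfare function lends itself to a direct numerical search for the welfare-maximizing tax. The sketch below is illustrative only: it assumes $v = \rho x^{1-r}/(1-r)$ and $c = \gamma x$ with $r = 0.5$, hypothetical two-type populations, an arbitrary income level, and a coarse grid search rather than any method used in the paper.

```python
def demand(t, rho, gamma, beta, r=0.5):
    """x*(t) solving rho * x**(-r) - beta * gamma - (1 + t) = 0."""
    return (rho / (1.0 + t + beta * gamma)) ** (1.0 / r)

def efficiency(x, rho, gamma, r=0.5):
    """u_hat = v(x) - c(x) - x for the assumed functional forms."""
    return rho * x ** (1.0 - r) / (1.0 - r) - gamma * x - x

def social_welfare(t, population, income=10.0):
    """Omega(t) = E_F[u_hat(t)] + I for a list of (share, rho, gamma, beta)."""
    return income + sum(
        share * efficiency(demand(t, rho, gamma, beta), rho, gamma)
        for share, rho, gamma, beta in population
    )

def optimal_tax(population):
    """Grid search over t in [-0.5, 1.0] in steps of 0.001."""
    grid = [k / 1000.0 for k in range(-500, 1001)]
    return max(grid, key=lambda t: social_welfare(t, population))

# Hypothetical populations with identical tastes (rho = gamma = 1).
all_self_controlled = [(1.0, 1.0, 1.0, 1.0)]
mixed = [(0.5, 1.0, 1.0, 1.0), (0.5, 1.0, 1.0, 0.5)]

tax_all_self_controlled = optimal_tax(all_self_controlled)
tax_mixed = optimal_tax(mixed)
print(f"optimal tax, everyone beta = 1: {tax_all_self_controlled:.3f}")
print(f"optimal tax, half with beta = 0.5: {tax_mixed:.3f}")
```

Under these assumptions the search returns a zero tax when everyone has β = 1 and a strictly positive tax for the mixed population. Income only shifts the level of welfare and does not affect the maximizer.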
Proposition 1. Suppose policymakers maximize $\Omega(t)$. For any distribution of tastes $G(\rho, \gamma)$:
(1) If everyone in the population has $\beta = 1$, then the optimal tax $t^* = 0\%$; and
(2) If everyone in the population has $\beta \leq 1$ and some people have $\beta < 1$, then the optimal tax $t^* > 0\%$.
Part 1 establishes that if there are no self-control problems in the population then we should not tax potato chips. In the absence of taxes, people without self-control problems consume optimally.
Hence, potato-chip taxes would merely distort consumption away from the first best. Part 2 establishes that, if instead some people have self-control problems, then it is always optimal to tax potato chips. Potato-chip taxes still create consumption distortions for people without self-control problems; however, because such people consume optimally when t = 0%, these distortions are second-order. At the same time, because people with self-control problems over-consume when t = 0%, potato-chip taxes create first-order reductions in consumption distortions for people with self-control problems.
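The first-order versus second-order logic can be seen by differentiating the simplified welfare function. The following is a sketch rather than the Appendix proof; it assumes $x^*(t)$ is differentiable and suppresses the arguments $\rho$ and $\gamma$. The second line substitutes the consumer's first-order condition $v_x = \beta c_x + 1 + t$.

```latex
\begin{align*}
\frac{d\Omega}{dt}
  &= E_F\!\left[\bigl(v_x(x^*(t)) - c_x(x^*(t)) - 1\bigr)\,\frac{dx^*}{dt}\right] \\
  &= E_F\!\left[\bigl(t - (1-\beta)\,c_x(x^*(t))\bigr)\,\frac{dx^*}{dt}\right],
  \qquad \frac{dx^*}{dt} = \frac{1}{v_{xx} - \beta c_{xx}} < 0, \\
\left.\frac{d\Omega}{dt}\right|_{t=0}
  &= E_F\!\left[-(1-\beta)\,c_x(x^*(0))\,\frac{dx^*}{dt}\right] \;\geq\; 0 .
\end{align*}
```

At $t = 0$ the term inside the expectation is zero for every type with $\beta = 1$ and strictly positive for every type with $\beta < 1$, so a small tax raises $\Omega$ whenever some people have self-control problems.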
Proposition 1 focuses on the case in which no one has a preference for future gratification; that is, no one has $\beta > 1$. More generally, because people with $\beta > 1$ under-consume when $t = 0\%$, the existence of such people would militate against taxing potato chips. (Indeed, if no one has $\beta < 1$ and some people have $\beta > 1$, the optimal tax would be $t^* < 0\%$; that is, it would be optimal to subsidize potato-chip consumption.) Even so, as long as the predominant error is a preference for immediate gratification rather than a preference for future gratification (as the evidence suggests), it will be optimal to tax sin goods.
14
While Proposition 1 addresses whether it is optimal to tax potato chips, it does not address how the optimal tax depends on the prevalence of self-control problems in the population. A natural conjecture is that the more prevalent self-control problems are, the larger the optimal potato-chip tax. While this conjecture is correct for some specifications, it does not hold in general; Proposition
2 addresses when it is likely to hold.
Proposition 2. Suppose policymakers maximize $\Omega(t)$. For a fixed distribution of tastes $G(\rho, \gamma)$, let $t_0^*$ and $t_1^*$ be the optimal taxes given distributions of self-control problems $H_0(\beta)$ and $H_1(\beta)$, respectively.
(1) Suppose that $H_1(\beta) \geq H_0(\beta)$ for all $\beta$. If for all $(\rho, \gamma)$ and $t \leq t_0^*$, $d\hat{u}/dt = (d\hat{u}/dx)(dx^*/dt)$ is larger for smaller $\beta$, then $t_1^* > t_0^*$; and
(2) Suppose that $H_1(\beta) \geq H_0(\beta)$ for all $\beta$ and that $H_1(\beta \mid \beta \geq \beta_0) = H_0(\beta)$ for all $\beta \geq \beta_0$, where $\beta_0 \equiv \sup\{\beta \mid H_0(\beta) = 0\}$. If, for all $(\rho, \gamma)$, $x^*(t_0^*) > x^{**}$ for $\beta = \beta_0$, then $t_1^* > t_0^*$.
The simple intuition behind the conjecture is that people with larger self-control problems have an increased propensity to over-consume and hence are more likely to be helped by taxes. This simple intuition is not entirely correct, however, because it does not account for people’s sensitivity to tax changes. More precisely, the fact that people with larger self-control problems have an increased propensity to over-consume implies that $|d\hat{u}/dx|$ is larger for smaller $\beta$. Part 1 points out, however, that in order to conclude that any increase in the prevalence of self-control problems (in the sense of first-order stochastic dominance) implies an increase in the optimal tax, it must be that $d\hat{u}/dt$ is larger for smaller $\beta$, where $d\hat{u}/dt = (d\hat{u}/dx)(dx^*/dt)$. While this condition holds for some specifications, including our example in Section 5, it does not hold in general.
15
Part 2 establishes that the conjecture holds more broadly if we impose more structure on the way in which we increase the prevalence of self-control problems. Specifically, suppose we add to the population only people with larger self-control problems than currently exist; i.e., initially, everyone has $\beta \geq \beta_0$, and we add people with $\beta < \beta_0$. If these new people all over-consume under the initial optimal tax $t_0^*$ (they have $x^*(t_0^*) > x^{**}$), then the new optimal tax $t_1^*$ will be larger. Intuitively, amongst the initial population the marginal effect of a tax change is zero, because $t_0^*$ is optimal, while at the same time increasing the tax above $t_0^*$ helps every new member of the population. 16
14
For instance, one can show that, if $v(x; \rho) = \rho \ln x$ and $c(x; \gamma) = \gamma \ln x$, it is optimal to tax potato chips whenever the average $\beta$ in the population is less than 1.
15
For our example in Section 5, which assumes $v(x; \rho) = \rho x^{1-r}/(1-r)$ and $c(x; \gamma) = \gamma x$, this condition holds as long as $t_0^* < (1 + \gamma_{\min})r$, where $\gamma_{\min}$ is the minimum $\gamma$ in the population.
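How the optimal tax responds to the prevalence of self-control problems can be explored with a small experiment. This is a sketch under assumptions, not a reproduction of the paper's Section 5 numbers: it uses $v = \rho x^{1-r}/(1-r)$ and $c = \gamma x$ with hypothetical values $\rho = \gamma = 1$ and $r = 0.5$, identical tastes, two values of β, and a grid search for the optimum.

```python
def demand(t, beta, rho=1.0, gamma=1.0, r=0.5):
    """x*(t) solving rho * x**(-r) - beta * gamma - (1 + t) = 0."""
    return (rho / (1.0 + t + beta * gamma)) ** (1.0 / r)

def efficiency(x, rho=1.0, gamma=1.0, r=0.5):
    """u_hat = v(x) - c(x) - x for the assumed functional forms."""
    return rho * x ** (1.0 - r) / (1.0 - r) - gamma * x - x

def optimal_tax(share_present_biased, beta_low=0.5):
    """Welfare-maximizing tax when the given share has beta_low and the rest beta = 1."""
    def average_efficiency(t):
        return (
            (1.0 - share_present_biased) * efficiency(demand(t, 1.0))
            + share_present_biased * efficiency(demand(t, beta_low))
        )
    grid = [k / 1000.0 for k in range(0, 1001)]
    return max(grid, key=average_efficiency)

shares = [0.10, 0.25, 0.50]
taxes = [optimal_tax(s) for s in shares]
for s, t in zip(shares, taxes):
    print(f"share with beta = 0.5: {s:.0%} -> optimal tax {t:.3f}")
```

In this specification the optimal tax rises with the share of present-biased consumers, and it stays below $(1 + \gamma)r = 1$, the bound under which the paper's footnote says the Part 1 condition holds for these functional forms.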
Thus far, our analysis has assumed that all revenue from potato-chip taxes is fully returned to consumers. An alternative — perhaps more realistic — assumption is that this revenue is used to reduce distortionary taxes elsewhere in the economy. If, for instance, the government raises revenue by taxing other commodities (in the spirit of Ramsey (1927)), such taxes create consumption distortions, and so taxing potato chips and using the proceeds to reduce taxes on other goods reduces consumption distortions.
17
To illustrate, let $D(R)$ denote the reduction in distortions elsewhere when we raise (per-capita) revenue $R$ from potato-chip taxes, in which case our social-welfare function becomes
$$\hat{\Omega}(t) \equiv E_F\left[v(x^*(t); \rho) - c(x^*(t); \gamma) + I - (1 + t)x^*(t)\right] + D(tX^*(t)) = \Omega(t) + \left[D(tX^*(t)) - tX^*(t)\right].$$
Our analysis above assumes $D'(R) = 1$, which reflects no distortions elsewhere in the sense that reducing (per-capita) tax collections elsewhere by \$1 is equivalent to giving everyone \$1. When there are distortions elsewhere, however, $D'(R) > 1$, which says that reducing taxes elsewhere by \$1 is better than giving everyone \$1. In this case, potato-chip taxes have the added benefit of reducing distortions elsewhere; indeed, because of this second effect, it becomes optimal to tax potato chips even when everyone has $\beta = 1$.
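The claim that a positive tax becomes optimal even with $\beta = 1$ follows from a one-line calculation. As a sketch, assume $D$ and $X^*$ are differentiable and write $R(t) = tX^*(t)$ for per-capita revenue:

```latex
\begin{align*}
\frac{d\hat{\Omega}}{dt}
  &= \frac{d\Omega}{dt} + \bigl[D'(R(t)) - 1\bigr]\frac{dR}{dt},
  \qquad \frac{dR}{dt} = X^*(t) + t\,\frac{dX^*}{dt}, \\
\left.\frac{d\hat{\Omega}}{dt}\right|_{t=0}
  &= \left.\frac{d\Omega}{dt}\right|_{t=0} + \bigl[D'(0) - 1\bigr]X^*(0).
\end{align*}
```

If everyone has $\beta = 1$, then $\Omega$ is maximized at $t = 0$ and the first term vanishes, so the derivative at zero is $[D'(0) - 1]X^*(0)$, which is strictly positive whenever $D'(0) > 1$ and aggregate consumption is positive.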
A natural extension of Proposition 1 is that, for whatever the optimal potato-chip tax might be under an assumption that everyone is fully self-controlled, if we recognize that there are self-control problems in the population, it becomes optimal to impose an even larger tax on potato chips.
18
After all, such taxes still have the additional benefit of reducing over-consumption. It turns out, however, that this conclusion does not hold in general for much the same reason that our result in Proposition 2 does not hold in general: in addition to affecting the degree of over-consumption, self-control problems also affect one’s sensitivity to taxes. Even so, for most examples that we have worked out, this conclusion does hold.
16
The condition that all new people over-consume given tax $t_0^*$ will hold as long as the distribution of tastes is tight enough; indeed, it necessarily holds if everyone has the same $(\rho, \gamma)$.
17
In O’Donoghue and Rabin (2003), we in fact analyze a Ramsey framework. We show for a specific functional form that the existence of self-control problems implies that we should raise taxes on potato chips and reduce taxes on other goods relative to the Ramsey taxes that would be optimal if everyone were fully self-controlled.
18
While the simple framework used in the text highlights some basic ideas, it is inappropriate for a more formal analysis because it ignores the fact that $D(R)$ likely depends on the distribution of self-control problems in the population.
4. Pareto-Efficient Taxes
In this section, we characterize Pareto-efficient taxes. As should be clear from our basic results in Section 2, if the government can use individual-specific taxes and lump-sum transfers, then it can implement first-best Pareto efficiency. Because such schemes are unrealistic, however, we investigate a constrained Pareto efficiency in which the government is restricted to uniform taxes and lump-sum transfers. In other words, because a potato-chip tax $t$ implies that the (per-capita) lump-sum transfer will be $\ell(t) = tX^*(t)$, the choice set for policymakers is $\{(t, \ell(t)) \mid t \in [-1, \infty)\}$, and we analyze the set of constrained Pareto-efficient taxes within this set.
19
To build intuition, consider how taxes affect an individual’s long-run utility. For any $t$, the long-run utility for type (ρ, γ, β) is
$$\hat{u}^{**}(t) \equiv v(x^{*}(t); \rho) - c(x^{*}(t); \gamma) + I + \ell(t) - (1+t)\,x^{*}(t)$$
$$= \left[ v(x^{*}(t); \rho) - c(x^{*}(t); \gamma) - x^{*}(t) \right] + \left[ I + \ell(t) - t\,x^{*}(t) \right].$$
The latter equation reveals that taxes have two effects on the individual’s long-run utility. First, taxes affect the efficiency of potato-chip consumption, as reflected by $v(x^{*}(t); \rho) - c(x^{*}(t); \gamma) - x^{*}(t)$. Second, taxes and the associated lump-sum transfers redistribute income, as reflected by $I + \ell(t) - t\,x^{*}(t)$. Because $\ell(t) = tX^{*}(t)$, under a positive tax anyone whose consumption $x^{*}(t)$ is smaller than average consumption $X^{*}(t)$ on net receives income, while anyone whose consumption $x^{*}(t)$ is larger than average consumption $X^{*}(t)$ on net loses income. Whether an individual is better off under tax $t$ vs. $t'$ depends on the combination of these two effects.
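The redistribution term can be restated so that the net transfer appears directly. This one-line rearrangement uses only the identity $\ell(t) = tX^{*}(t)$ from the text:

```latex
I + \ell(t) - t\,x^{*}(t) \;=\; I + t\,\bigl[\,X^{*}(t) - x^{*}(t)\,\bigr]
```

For a positive tax, the net transfer is therefore positive exactly when $x^{*}(t) < X^{*}(t)$, and the sign reverses under a subsidy ($t < 0$).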
The set of Pareto-efficient taxes depends on the support of preferences. Recall that G(ρ, γ) and
H(β) represent, respectively, the population distributions of tastes and of self-control problems.
We let Γ and B denote the supports of G and H, respectively. Our analysis focuses on two types of
Pareto comparisons:
Definition 1. Given population distributions G(ρ, γ) and H(β):
(1) A tax $t$ is Pareto-superior to a tax $t'$ if $\hat{u}^{**}(t) \ge \hat{u}^{**}(t')$ for all (ρ, γ) ∈ Γ and β ∈ B and $\hat{u}^{**}(t) > \hat{u}^{**}(t')$ for some (ρ, γ) ∈ Γ and β ∈ B; and a tax $t$ is Pareto-efficient if there does not exist any $t'$ that is Pareto-superior to $t$.
(2) A tax $t$ is quasi-Pareto-superior to a tax $t'$ if $E_G[\hat{u}^{**}(t)] \ge E_G[\hat{u}^{**}(t')]$ for all β ∈ B and $E_G[\hat{u}^{**}(t)] > E_G[\hat{u}^{**}(t')]$ for some β ∈ B; and a tax $t$ is quasi-Pareto-efficient if there does not exist any $t'$ that is quasi-Pareto-superior to $t$.
19
Unless there are limits on free disposal, any tax t < −1 would lead people to demand infinite amounts merely to collect the net subsidy, which is why we bound taxes below at −1. This bound is not relevant for our analysis.
Hence, in addition to standard Pareto comparisons, we also consider a kind of group Pareto comparison, which we call a quasi-Pareto comparison, where we group people by their self-control problems. Specifically, we say that one tax is quasi-Pareto-superior to another if and only if every subpopulation with a fixed β is on average at least as well off under that tax, and at least one such subpopulation is on average strictly better off. This second criterion permits us to address a common worry that policies designed to combat “errors” involve a trade-off between helping people who make errors and hurting people who are fully rational. This criterion allows us to assess whether, on average, fully rational people (people with β = 1) are hurt when we implement taxes.
Proposition 3 characterizes Pareto-efficient taxes when there are no self-control problems in the population.
Proposition 3. Suppose that everyone has β = 1.
(1) If there is no heterogeneity in (ρ, γ), then t = 0% is the unique Pareto-efficient tax; and
(2) If there is heterogeneity in (ρ, γ), then there exist $t' > 0\% > t''$ such that all taxes $t \in (t'', t')$ are Pareto-efficient.
Part 1 says that if there is no heterogeneity in tastes (as reflected by ρ and γ), then the unique Pareto-efficient policy is not to tax potato chips. With a homogeneous population, Pareto efficiency merely requires maximizing the sum of surplus, which, as we saw in Section 3, occurs for $t^{*} = 0\%$.
Part 2, however, says that for the more interesting case of population heterogeneity in tastes, it is not Pareto-inefficient to tax potato chips as long as this tax is not too large. Although a potato-chip tax distorts everyone’s consumption, these distortions are second-order near t = 0% because everyone is fully self-controlled. At the same time, the potato-chip tax and the associated lump-sum transfer have first-order income-redistribution effects. In particular, although everyone whose potato-chip consumption is larger than average ($x^{*}(0) > X^{*}(0)$) is hurt, everyone whose potato-chip consumption is smaller than average ($x^{*}(0) < X^{*}(0)$) is helped. Hence, a small-enough potato-chip tax is a movement along the Pareto frontier. (Analogously, a small-enough potato-chip subsidy is also a movement along the Pareto frontier.)
20
Proposition 4 characterizes Pareto-efficient taxes when some people have self-control problems, and in particular establishes that in this case a potato-chip tax can yield Pareto improvements relative to a zero tax — that is, t = 0% may be Pareto-inefficient:
Proposition 4. Suppose that everyone has β ≤ 1 and some people have β < 1. Suppose further that for all (ρ, γ) ∈ Γ and β ∈ B,
$$v_{xxx} - \beta c_{xxx} \;\ge\; \frac{2c_{xx}}{c_x}\,(v_{xx} - \beta c_{xx}) \quad \text{for all } x. \tag{A}$$
(1) If there is no heterogeneity in (ρ, γ), then there exists $t' > 0\%$ such that all taxes $t \in (0\%, t')$ are Pareto-superior to t = 0%; and
(2) If there is heterogeneity in (ρ, γ), then there exists $t' > 0\%$ such that all taxes $t \in (0\%, t')$ are quasi-Pareto-superior to t = 0%; and if $\max_{(\rho,\gamma)\in\Gamma,\ \beta=1} x^{*}(0) < X^{*}(0)$, then there exists $t' > 0\%$ such that all taxes $t \in (0\%, t')$ are Pareto-superior to t = 0%.
Part 1 considers the case in which there is no heterogeneity in tastes, and so the only heterogeneity is in the degree of self-control problems. In this case, potato-chip consumption is larger for people with larger self-control problems, and thus income is naturally redistributed from people with large self-control problems to people with small or no self-control problems. Because small taxes create only second-order consumption distortions for fully rational people, it immediately follows that small taxes make fully rational people better off (relative to t = 0%). It does not immediately follow that small taxes make everyone better off, because for people with self-control problems it must be that the benefits from reduced consumption distortions are larger than the costs of a negative net income transfer. In particular, if their consumption responds very little to a tax increase, then the main effect of the tax increase will be the negative net income transfer, in which case they will be worse off. By guaranteeing that people with self-control problems have a sufficiently strong responsiveness to taxes, condition (A) is sufficient to conclude that small taxes create
Pareto improvements.
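The trade-off described in Part 1 can be made explicit by differentiating long-run utility with respect to the tax. The following is a sketch rather than the paper’s own proof: it assumes differentiability, uses the first-order condition $v_x - \beta c_x = 1 + t$ and the identity $\ell(t) = tX^{*}(t)$, and suppresses function arguments.

```latex
\begin{aligned}
\frac{d\hat{u}^{**}}{dt}
  &= (v_x - c_x)\,\frac{dx^{*}}{dt} + \frac{d\ell}{dt} - x^{*} - (1+t)\,\frac{dx^{*}}{dt} \\
  &= -(1-\beta)\,c_x\,\frac{dx^{*}}{dt} \;+\; \bigl[X^{*} - x^{*}\bigr] \;+\; t\,\frac{dX^{*}}{dt}, \\[4pt]
\left.\frac{d\hat{u}^{**}}{dt}\right|_{t=0}
  &= \underbrace{-(1-\beta)\,c_x\,\frac{dx^{*}}{dt}}_{\text{reduced over-consumption},\ \ge 0}
   \;+\; \underbrace{X^{*}(0) - x^{*}(0)}_{\text{net transfer}} .
\end{aligned}
```

For β = 1 the first term vanishes, so a small tax helps exactly those fully rational people who consume less than average. For β < 1 the first term is positive and grows with the responsiveness $|dx^{*}/dt|$, which is why the text needs people with self-control problems to be sufficiently tax-sensitive.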
Part 2 considers the case in which there is heterogeneity in both tastes and self-control problems. In this case, we cannot directly apply the logic above, because some people with self-control problems might consume less than some fully rational people (if they have weaker tastes for potato chips), and hence some fully rational people may have a negative net income transfer. Even so, the logic applies on average: on average fully rational people receive positive net income transfers, and therefore small taxes make fully rational people on average better off. Similarly, condition (A) guarantees that, for people with self-control problems, on average the benefits from reduced consumption distortions are larger than the costs of the negative net income transfer. Hence, condition (A) is sufficient to conclude that small taxes create quasi-Pareto improvements. Finally, whether small taxes can yield full Pareto improvements depends on whether the fully rational person that most likes potato chips is made better off or worse off, which in turn depends on whether that person’s potato-chip consumption is smaller or larger than average.
20
Trivially, when everyone has β = 1, quasi-Pareto efficiency becomes equivalent to maximizing Ω(t), and so t = 0% is the unique quasi-Pareto-efficient policy even when there is taste heterogeneity.
The question remains how restrictive condition (A) is. While it is a bit unclear how to give a general assessment, it holds in essentially all examples that we have considered. For instance, when the costs are linear or convex quadratic (that is, when $c_{xx} \ge 0$ and $c_{xxx} = 0$), condition (A) holds as long as $v_{xxx} \ge 0$, which in turn holds for most commonly used functional forms. For instance, $v_{xxx} \ge 0$ for any benefit function that has non-increasing absolute risk aversion, including the CRRA and CARA functional forms, and also for a quadratic benefit function. In more intuitive terms, the requirement is that people with self-control problems are sensitive to tax changes. Whether they are is an empirical question. But if they respond very little or not at all to a tax increase, then the tax would merely redistribute income away from these types without any corresponding benefits.
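As an illustration of how undemanding condition (A) is with linear costs, here is a short check for CRRA benefits and linear costs, the specification used in the example of Section 5. The check assumes ρ > 0, r > 0, γ > 0 and x > 0; the derivatives are computed here rather than quoted from the paper.

```latex
\begin{aligned}
v(x;\rho) &= \frac{\rho\,x^{1-r}}{1-r}: &
v_x &= \rho\,x^{-r}, &
v_{xx} &= -r\rho\,x^{-r-1}, &
v_{xxx} &= r(r+1)\rho\,x^{-r-2} > 0, \\
c(x;\gamma) &= \gamma x: &
c_x &= \gamma, &
c_{xx} &= 0, &
c_{xxx} &= 0, \\[4pt]
\text{(A):}\quad &
\underbrace{v_{xxx} - \beta c_{xxx}}_{=\,r(r+1)\rho\,x^{-r-2}\;>\;0}
\;\ge\;
\underbrace{\frac{2c_{xx}}{c_x}\,(v_{xx} - \beta c_{xx})}_{=\,0} .
\end{aligned}
```

More generally, whenever $c_x > 0$, $c_{xx} \ge 0$ and $v_{xx} - \beta c_{xx} < 0$, the right-hand side of (A) is non-positive, so $v_{xxx} \ge 0$ together with $c_{xxx} = 0$ is enough.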
Our analysis in this section demonstrates that paternalistic policies — by which we mean policies aimed at helping people overcome their errors — need not take the form of helping the irrational to the detriment of the rational. Rather, in some instances, paternalistic policies can make everyone better off, or at least can help people while making fully rational people on average better off. The intuition is straightforward: If a policy can make irrational people strictly better off, then there is scope to make fully rational people better off as well by reallocating resources from irrational people to rational people. This intuition is clearly quite general. What is somewhat special to taxes on sin goods is that the same policy that provides help to irrational people can lead naturally to the compensating reallocation of resources to rational people.
5. An Example
Our formal results in Sections 3 and 4 are based on marginal arguments. To give a sense for the potential calibrational importance of these results, in this section we consider a specific example for which we can perform some back-of-the-envelope numerical calculations of the optimal taxes.
These calculations reveal that, even if the prevalence of self-control problems in the population is relatively small, optimal taxes can still be significant.
Specifically, we assume that the future costs from consumption are linear in the amount consumed; that is, we assume $c(x; \gamma) = \gamma x$. The assumption of linearity will aid in interpreting our numerical calculations below. In particular, with linear costs, γ represents the magnitude of the future health cost relative to the cost of production. We also assume that the benefits from consumption take the CRRA functional form; that is, we assume $v(x; \rho) = \rho\,x^{1-r}/(1-r)$. We assume for simplicity that all consumers have the same r; however, we choose r to match a reasonable market elasticity of demand, as we discuss below.
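With CRRA benefits and linear costs, an individual’s demand becomes
$$x^{*}(t) = \frac{\rho^{1/r}}{(\beta\gamma + 1 + t)^{1/r}},$$
and the social-welfare function Ω(t) from Section 3 becomes:
$$\Omega(t) = E_F\!\left[\rho^{1/r}\right] E_F\!\left[ \frac{1}{1-r}\left(\frac{1}{\beta\gamma + 1 + t}\right)^{\frac{1-r}{r}} - (\gamma + 1)\left(\frac{1}{\beta\gamma + 1 + t}\right)^{\frac{1}{r}} \right] + I.$$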
As this formula reveals, the combination of CRRA benefits and linear costs implies that the distribution of ρ is irrelevant for the optimal tax that maximizes Ω(t). In other words, the optimal tax will depend exclusively on the distributions of γ and β, and on r.
21
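For readers who want the intermediate steps, the following sketch derives the demand function and the slope of Ω(t) under CRRA benefits and linear costs. It assumes, as in the text, that ρ is independent of (β, γ); the shorthand $a \equiv \beta\gamma + 1 + t$ is introduced here for convenience and is not the paper’s notation.

```latex
\begin{aligned}
&\rho\,(x^{*})^{-r} - \beta\gamma - (1+t) = 0
  \;\Longrightarrow\;
  x^{*}(t) = \rho^{1/r}\,a^{-1/r},
  \qquad a \equiv \beta\gamma + 1 + t, \\[4pt]
&\hat{u}(t) = \frac{\rho\,(x^{*})^{1-r}}{1-r} - (\gamma + 1)\,x^{*}
  = \rho^{1/r}\left[\frac{1}{1-r}\,a^{-\frac{1-r}{r}} - (\gamma+1)\,a^{-\frac{1}{r}}\right], \\[4pt]
&\frac{d\Omega}{dt}
  = E_F\!\left[\rho^{1/r}\right]\cdot\frac{1}{r}\cdot
    E_F\!\left[\bigl((1-\beta)\gamma - t\bigr)\,a^{-\frac{1+r}{r}}\right].
\end{aligned}
```

Setting the last expectation to zero shows that the optimal tax is a weighted average of the individual wedges (1 − β)γ, with weights that do not involve ρ.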
To perform our numerical calculations, we assume specific distributions and then calculate the optimal tax numerically. Since our goal is to investigate how the optimal tax depends on the distribution of self-control problems in the population, we consider various distributions of β. We are particularly interested in whether “small” self-control problems can have a significant impact on optimal taxes. Hence, we consider values of β that are relatively close to one: specifically, we permit people to have β ∈ {1, .99, .95, .90}, and consider several distributions over these four values.
21
This conclusion requires our assumption that ρ is not correlated with β or γ. For a derivation of Ω(t), see the Appendix.
With linear health costs, recall that γ reflects the magnitude of health costs relative to the costs of production. For simplicity, we assume that the future health costs are the same for everyone.
But we still must choose a reasonable value for γ. Unfortunately, we are unaware of any estimates in the literature for the future health costs from snack foods. As an admittedly somewhat arbitrary benchmark, we use the numbers from Gruber and Koszegi (2004) for cigarette consumption. Their back-of-the-envelope calculation for the cost in terms of life-years lost per pack of cigarettes is
$35.64. This cost is on the order of ten times the production costs for cigarettes, suggesting γ = 10.
Although our model is not really applicable to addictive products such as cigarettes, we believe that
Gruber and Koszegi’s conclusion that the health costs could be an order of magnitude larger than production costs is equally plausible for snack foods such as potato chips. Hence, in our numerical calculations below, we start by assuming γ = 10. But to highlight the importance of γ for the magnitude of the optimal tax, we also consider γ = 2.
The remaining parameter is r, which reflects the curvature of the benefits function. We choose this parameter to yield a reasonable market elasticity of demand. For this target, we use two estimates in the literature for the elasticity for potato chips. Kuchler, Tegene, and Harris (2005) use the AC Nielsen Homescan panel data, in which households scan their purchases at home, and estimate an elasticity for potato chips of −0.45. This elasticity is quite similar to the usual estimated elasticities for cigarettes and alcohol. Katchova, Sheldon, and Miranda (2005) use aggregate price and quantity data, and estimate an elasticity for potato chips of −1.07. Hence, in our numerical calculations, we use two target elasticities, $\varepsilon_D = -0.5$ and $\varepsilon_D = -1.0$.
Table 1 presents our numerical calculations. For each γ and distribution of β, we choose r to yield the target elasticity of demand when t = 0% (the elasticity of demand depends on the tax). We then fix that r, and solve numerically for the optimal tax $t^{*}$, which we report in Column 7 of Table 1. See the Appendix for more details about these calculations.
22
It is straightforward to confirm that, for the relevant range of taxes, the condition from Part 1 of Proposition 2 holds. Hence, throughout Table 1, an increase in the prevalence of self-control problems (in the sense of first-order stochastic dominance) leads to a larger optimal tax.
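The optimal-tax calculation can be sketched in a few lines of Python. This is a minimal illustration, not the authors’ code: it takes r as given (the rounded values reported in Table 1) instead of recalibrating it to the target elasticity, it assumes a common γ and a ρ that is independent of β, and it finds the root of the first-order condition $\sum_\beta h(\beta)\,[(1-\beta)\gamma - t]\,(\beta\gamma + 1 + t)^{-(1+r)/r} = 0$ by bisection, assuming a single sign change. The function names and the dictionary representation of the β distribution are hypothetical.

```python
from typing import Dict


def welfare_slope(t: float, gamma: float, r: float, dist: Dict[float, float]) -> float:
    """Expression with the same sign as dOmega/dt under CRRA benefits and linear costs.

    dist maps each beta to its population share h(beta). Positive factors that
    do not depend on t (E[rho^(1/r)] and 1/r) are dropped.
    """
    k = (1.0 + r) / r
    return sum(
        h * ((1.0 - beta) * gamma - t) * (beta * gamma + 1.0 + t) ** (-k)
        for beta, h in dist.items()
    )


def optimal_tax(gamma: float, r: float, dist: Dict[float, float], tol: float = 1e-10) -> float:
    """Bisection on the first-order condition (assumes a single sign change).

    The slope is positive at t = 0 when someone has beta < 1 and non-positive
    at the largest individual wedge (1 - beta) * gamma, so the root lies between.
    """
    lo = 0.0
    hi = max((1.0 - beta) * gamma for beta in dist)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if welfare_slope(mid, gamma, r, dist) > 0.0:
            lo = mid  # welfare still rising: the optimum lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # First row of Table 1: gamma = 10, r = 0.18, half beta = 1, half beta = .99.
    first_row = {1.0: 0.5, 0.99: 0.5}
    print(f"t* = {optimal_tax(10.0, 0.18, first_row):.2%}")
```

With the rounded r from the first row of Table 1, the sketch returns a tax close to the 5.15% reported in Column 7. Small differences from other rows are to be expected because the table’s r values are rounded.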
Table 1: Optimal Taxes for Different Populations
Health Cost               Proportion of Population With:
& Elasticity              β = 1    β = .99   β = .95   β = .9     r       $t^{*}$    $\underline{t}$
γ = 10 & εD = −0.5        1/2      1/2       0         0          0.18    5.15%      5.00%
                          1/2      0         1/2       0          0.19    28.53%     24.81%
                          1/2      0         0         1/2        0.19    63.71%     48.48%
                          1/4      1/4       1/4       1/4        0.19    49.26%     39.26%
                          1/2      1/4       1/8       1/8        0.18    29.26%     23.20%
γ = 10 & εD = −1.0        1/2      1/2       0         0          0.09    5.28%      4.99%
                          1/2      0         1/2       0          0.09    31.68%     24.34%
                          1/2      0         0         1/2        0.10    72.72%     45.83%
                          1/4      1/4       1/4       1/4        0.10    56.41%     38.01%
                          1/2      1/4       1/8       1/8        0.09    37.58%     24.55%
γ = 2 & εD = −0.5         1/2      1/2       0         0          0.67    1.01%      1.00%
                          1/2      0         1/2       0          0.68    5.21%      5.00%
                          1/2      0         0         1/2        0.69    10.82%     9.97%
                          1/4      1/4       1/4       1/4        0.69    8.52%      7.98%
                          1/2      1/4       1/8       1/8        0.68    4.65%      4.36%
γ = 2 & εD = −1.0         1/2      1/2       0         0          0.33    1.02%      1.00%
                          1/2      0         1/2       0          0.33    5.34%      4.99%
                          1/2      0         0         1/2        0.34    11.31%     9.93%
                          1/4      1/4       1/4       1/4        0.34    8.84%      7.96%
                          1/2      1/4       1/8       1/8        0.34    4.90%      4.43%
Note: Column 7 ($t^{*}$) is the optimal tax; Column 8 ($\underline{t}$) is the minimum quasi-Pareto-efficient tax.
Table 1 demonstrates that the existence of self-control problems can have dramatic implications for optimal taxation. For instance, the first five rows apply for γ = 10 and $\varepsilon_D = -0.5$. If half the population is fully self-controlled while the other half of the population has a very small present bias of β = .99, then the optimal tax is 5.15%. If instead the half of the population with self-control problems has a somewhat larger present bias of β = .90, which is still a smaller present bias (larger β) than often discussed in the literature, the optimal tax is 63.71%. Table 1 also reveals the role of the future health costs and the role of the elasticity of demand for optimal taxes. The magnitude of the optimal tax is quite sensitive to the magnitude of the health costs. When we compare γ = 10 to γ = 2, the optimal tax is roughly 5-7 times larger when γ = 10. In contrast, the magnitude of the optimal tax is not much influenced by the elasticity of demand: for both γ = 10 and γ = 2, changing the elasticity of demand from $\varepsilon_D = -0.5$ to $\varepsilon_D = -1.0$ has a comparatively small impact on the optimal tax. While it is an open empirical question exactly by how much the existence of self-control problems would alter optimal taxes, these numerical examples highlight that we should not presume the effect to be small.
We can also use this example to address the magnitude of Pareto-efficient taxes. With CRRA benefits and linear costs, condition (A) from Proposition 4 is satisfied. Hence, it follows from Proposition 4 that, if there is no heterogeneity in (ρ, γ), the minimum Pareto-efficient tax is larger than 0%; and even if there is heterogeneity in (ρ, γ), the minimum quasi-Pareto-efficient tax is larger than 0%. Our interest here is how much larger these taxes are. Because, for standard Pareto efficiency, the distribution of ρ plays a crucial role, and because we don’t see a natural way to choose this distribution, we focus on quasi-Pareto efficiency. Recall from Definition 1 that, if we let G denote the distribution of (ρ, γ) and B denote the set of β’s in the population, a tax $t$ is quasi-Pareto-superior to a tax $t'$ if $E_G[\hat{u}^{**}(t)] \ge E_G[\hat{u}^{**}(t')]$ for all β ∈ B and $E_G[\hat{u}^{**}(t)] > E_G[\hat{u}^{**}(t')]$ for some β ∈ B. In our example here,
$$E_G[\hat{u}^{**}(t)] = E_G\!\left[ \frac{\rho\,(x^{*}(t))^{1-r}}{1-r} - \gamma\,x^{*}(t) - (1+t)\,x^{*}(t) + \ell(t) + I \right].$$
Much as for the optimal tax that maximizes Ω(t), the combination of CRRA benefits and linear costs implies that the distribution of ρ is irrelevant for quasi-Pareto efficiency (for details, see the Appendix). Hence, for each case in Table 1, we can use this formula to solve numerically for the minimum quasi-Pareto-efficient tax $\underline{t}$, which we present in Column 8 of Table 1. For instance, consider again the case where γ = 10 and $\varepsilon_D = -0.5$. When half the population has β = 1 while the other half has β = .99, any tax smaller than 5.00% is quasi-Pareto-inefficient. In other words, for any tax t < 5.00%, increasing the tax to 5.00% on average helps the people with β = .99 and also on average helps the people with β = 1. Similarly, when half the population has β = 1 while the other half has β = .9, any tax smaller than 48.48% is quasi-Pareto-inefficient: for any tax t < 48.48%, increasing the tax to 48.48% on average helps the people with β = .9 and also on average helps the people with β = 1.
Table 1 implies that, even if we mostly care about the average welfare of the fully self-controlled, the existence of small self-control problems in a portion of the population can have a significant impact on optimal taxes. In particular, if all we cared about is maximizing the average welfare of the fully self-controlled, then we would choose to implement the minimum quasi-Pareto-efficient tax $\underline{t}$. And much as for the optimal tax $t^{*}$ that maximizes Ω(t), in our numerical examples, $\underline{t}$ can be significantly larger than 0%.
23
There is also a maximum quasi-Pareto-efficient tax, which is determined by how taxes affect those with the smallest β in the population. But since our goal is to demonstrate that the existence of small self-control problems in a portion of the population can have a significant impact on optimal taxes even if we mostly care about the average welfare of the fully self-controlled, we do not investigate the maximum quasi-Pareto-efficient tax.
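The minimum quasi-Pareto-efficient tax can be sketched the same way, as the tax that maximizes the average long-run utility of the β = 1 group. This is an illustration under stated assumptions rather than the authors’ procedure: γ is common, ρ is independent of β, 0 < r < 1, and r is taken from Table 1. Substituting the CRRA demand into the expression for $E_G[\hat{u}^{**}(t)]$ with β = 1 gives, up to the positive factor $E[\rho^{1/r}]$ and the constant I, the objective $\frac{r}{1-r}(\gamma+1+t)^{-(1-r)/r} + t\sum_\beta h(\beta)(\beta\gamma+1+t)^{-1/r}$; the code bisects on its derivative, assuming a single sign change. Function names are hypothetical.

```python
from typing import Dict


def rational_welfare_slope(t: float, gamma: float, r: float, dist: Dict[float, float]) -> float:
    """Derivative in t of the beta = 1 group's average long-run utility.

    Scaled by the positive constant E[rho^(1/r)]; dist maps beta to its share.
    """
    s = 1.0 / r
    own_term = -(gamma + 1.0 + t) ** (-s)  # d/dt of (r/(1-r)) * (gamma+1+t)^(-(1-r)/r)
    avg_demand = sum(h * (b * gamma + 1.0 + t) ** (-s) for b, h in dist.items())
    avg_demand_slope = sum(
        h * (-s) * (b * gamma + 1.0 + t) ** (-s - 1.0) for b, h in dist.items()
    )
    # The transfer is l(t) = t * X(t), so dl/dt = X(t) + t * X'(t).
    return own_term + avg_demand + t * avg_demand_slope


def min_quasi_pareto_tax(gamma: float, r: float, dist: Dict[float, float], tol: float = 1e-10) -> float:
    """Tax that maximizes the fully self-controlled group's average welfare."""
    if rational_welfare_slope(0.0, gamma, r, dist) <= 0.0:
        return 0.0  # no one over-consumes, so the beta = 1 group gains nothing from a tax
    lo, hi = 0.0, 1.0
    while rational_welfare_slope(hi, gamma, r, dist) > 0.0 and hi < 1e6:
        hi *= 2.0  # widen the bracket until the slope turns negative
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rational_welfare_slope(mid, gamma, r, dist) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    first_row = {1.0: 0.5, 0.99: 0.5}
    print(f"minimum quasi-Pareto-efficient tax = {min_quasi_pareto_tax(10.0, 0.18, first_row):.2%}")
```

For the first row of Table 1 the sketch returns a tax close to the 5.00% reported in Column 8, and for the row with half the population at β = .9 it returns a value close to 48.48%.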
6. Alternative Models
Our main conclusions are driven by two crucial features of our model: (i) a person’s behavior ($x^{*}$) does not maximize her welfare ($u^{**}$), and in particular the person consumes more potato chips than she herself would like; and (ii) the person’s consumption of potato chips is sensitive to the market price, and hence a potato-chip tax can help to counteract her over-consumption. In our model, these features are driven by a time-inconsistent preference for immediate gratification. In this section, we discuss to what extent these features could arise from other behavioral models.
We first note that our model in this paper can literally be reinterpreted in terms of other sources of over-consumption. For instance, a person might underappreciate the severity of future health costs, in which case the β < 1 would reflect the extent of this underappreciation. Alternatively, a person might have an irrational (incorrect) optimism that the negative health consequences won’t occur for her, in which case the β < 1 would reflect the extent of this optimism.
The two crucial features of our model can also arise in other behavioral models of intertemporal choice. Consider, for instance, the models of cue-triggered visceral factors (Loewenstein (1996),
Bernheim and Rangel (2004)) or dual motivations (Loewenstein and O’Donoghue (2005)). In such models, behavior depends on both ‘‘cognitive’’ motivations and ‘‘visceral’’ motivations — variously labeled cognitive vs. emotional, deliberative vs. affective, cold vs. hot, and so forth. For many sin goods, a natural assumption is that visceral motivations are primarily influenced by the consumption benefits. If so, and if we take cognitive motivations to reflect welfare, then such models generate exactly the type of over-consumption that has been our focus.
24
To illustrate (using our notation), it might be that cognitive motivations (and welfare) are reflected by our welfare function $u^{**}(x, z) = v(x; \rho) - c(x; \gamma) + z$, but, due to the visceral focus on consumption benefits, behavior is derived from decision utility $(1 + \phi)\,v(x; \rho) - c(x; \gamma) + z$, where $\phi$ reflects the magnitude of the visceral motivations. Although this approach reflects an overweighting of immediate benefits as opposed to an underweighting of future costs, the policy conclusions would be much the same.
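To see why the policy conclusions carry over, compare the first-order conditions for behavior and for welfare under the visceral-motivation specification. This is a sketch, not a result stated in the paper: it assumes the same budget constraint as in the main model (a unit production cost and a per-unit tax t, with the lump-sum transfer taken as given by the individual) and asks which individual-specific tax would align behavior with welfare.

```latex
\begin{aligned}
\text{Behavior:}\quad & (1+\phi)\,v_x(x^{*};\rho) - c_x(x^{*};\gamma) = 1 + t, \\
\text{Welfare optimum:}\quad & v_x(x^{**};\rho) - c_x(x^{**};\gamma) = 1, \\
x^{*} = x^{**} \;\Longrightarrow\quad & t = \phi\,v_x(x^{**};\rho) > 0 .
\end{aligned}
```

The analogous calculation in the present-bias model, where behavior satisfies $v_x - \beta c_x = 1 + t$, gives $t = (1-\beta)\,c_x(x^{**};\gamma)$. In both cases the corrective tax equals the wedge between decision utility and welfare at the margin.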
There is, however, an important sense in which a dual-motivations approach might yield different conclusions. Once again, not only do our conclusions require over-consumption, but they also require that consumption is sensitive to the market price. For over-consumption that is driven by visceral motivations, there may be reasons to believe that consumption is not very price sensitive. Indeed, Bernheim and Rangel (2004, forthcoming) argue that, for many addictive goods, consumption is driven by a kind of short-circuiting of rational decision-making, where visceral motivations (the “hot state”) take over and are not price-sensitive at all. If so, then sin taxes may not be optimal, because they might merely make addicts pay a higher price without changing their consumption. Bernheim and Rangel in fact show that, because of such effects, in some circumstances it could conceivably be optimal to subsidize an addictive good.
24
Because they interpret visceral motivations (the “hot state”) as a short-circuiting of rational decision-making, Bernheim and Rangel (2004) indeed argue that only cognitive motivations (the “cold state”) are relevant for welfare. Loewenstein and O’Donoghue (2005) discuss reasons why visceral motivations might also be (to some extent) relevant.
Gul and Pesendorfer (2001) develop an alternative model of self-control problems that does not yield over-consumption, and hence yields different conclusions. In their model, self-control problems are driven by temptation disutility: specifically, when making consumption decisions, people experience disutility when they forgo the most tempting option currently available. Gul and Pesendorfer motivate this model as an alternative explanation for people making ex-ante commitments: from a prior perspective, commitments can be valuable if they alter the most tempting option that will be available when it is time to consume. Gul and Pesendorfer (forthcoming) extend this model to addiction, and they conclude that a tax can only reduce welfare. To illustrate their logic (using our notation), a person’s behavior might be derived from decision utility $u^{**}(x, z) - [v(\hat{x}; \rho) - v(x; \rho)]$, where $\hat{x}$ is the most tempting level of potato-chip consumption and the bracketed term reflects the disutility from forgoing this option. Gul and Pesendorfer further assume that the person’s welfare function corresponds to this decision utility. Hence, under the plausible assumption that the most tempting level of potato-chip consumption $\hat{x}$ is not determined by what one can afford but rather by some satiation point, a tax on potato chips merely distorts potato-chip consumption without any benefits.
25
Although Gul and Pesendorfer reach a different conclusion from ours, this conclusion requires the assumption that temptation disutility merits full normative weight. If instead the temptation disutility were given less-than-full normative weight (e.g., if the person’s welfare function were $u^{**}(x, z) - \alpha\,[v(\hat{x}; \rho) - v(x; \rho)]$ for some α ∈ [0, 1)), then our conclusions would once again hold. In particular, anyone with α < 1 would over-consume (by her own reckoning), and as long as the person was price-sensitive, sin taxes could improve her welfare.
25
Formally, the assumption about $\hat{x}$ means that $v(x; \rho)$ is maximized for some finite $\hat{x}$. And on the distortion, our tax-and-lump-sum-transfer scheme will merely reduce $u^{**}(x, z) + v(x; \rho)$ without changing $v(\hat{x}; \rho)$.
7. Discussion
In this section, we discuss the broader implications of our analysis, and also its limitations. This paper is part of a very recent literature that addresses public-policy implications of research in behavioral economics. Because much of the behavioral-economics literature describes the ways in which people make errors that lead them not to behave in their own best interests, it suggests the possible desirability of designing paternalistic policies that help people make better choices. But opening this door raises a number of concerns.
Economists (and others) often equate “paternalism” with restrictions on choices. We do not. By “paternalism”, we mean that we are concerned that people might not be behaving in their own best interests and we are designing policy with an eye towards how that policy might help people make better choices. The taxes that we discuss are no more a limit on choices than are any traditional taxes. Because the prescribed taxes change relative prices, they change choice sets relative to the no-tax case, but do not reduce choice sets. Moreover, the more sophisticated schemes we discuss below involve the expansion of choice sets, illustrating how in some instances the best way to help consumers make better choices is to make new options available.
26
A major worry with regard to paternalism is that most adults in most situations make better choices for themselves than the government or others would make for them. Most behavioral economists, ourselves included, agree. As a result, there has been considerable emphasis in the literature on searching for minimally interventionist policies that help people who make errors while having little effect on those who are fully rational.
27
While the focus on minimal interventions is a natural place to start, we believe economists should study “optimal paternalism” using the standard methods of economic theory: Write down assumptions about the distribution of rational and irrational types of agents, about the available policy instruments, and about the government’s information about agents, and then investigate which policies achieve the “best” outcomes. In other words, economists ought to treat the analysis of optimal paternalism as a mechanism-design problem when some agents might be boundedly rational. Our analysis in this paper illustrates the value of this approach. While heavy taxes may appear to be more heavy-handed and invasive than other cautiously paternalistic policies that we and others have advocated, our analysis reveals that in fact even relatively large taxes are unlikely to cause much harm to fully self-controlled agents. Hence, even when we believe only a small proportion of the population makes errors, optimal policy might involve seemingly large deviations from the policy that would be optimal if everyone were fully rational. Furthermore, our analysis of Pareto-efficient taxes reveals that imposing taxes may not even involve trading off benefits for people who make errors against costs for fully rational people. In some instances, everyone can benefit.
26
This point is certainly implicit in the literature on self-control problems, where it is often discussed how the creation of commitment technologies can make people better off (see for instance Laibson (1997)).
27
See for instance O’Donoghue and Rabin (1999b, 2001), who discuss “cautious paternalism”; Camerer, Issacharoff, Loewenstein, O’Donoghue, and Rabin (2003), who explore “asymmetric paternalism”; Sunstein and Thaler (2003a, 2003b), who investigate “libertarian paternalism”; and Choi, Laibson, Madrian, and Metrick (2003), who discuss “benign paternalism”.
To what extent should the government get involved in providing commitment devices to counteract self-control problems — after all, why couldn’t the private market provide any needed commitment devices? There are reasons to believe that, in fact, the government may play a very special role. One reason to be cautious in presuming that the private market will solve self-control problems is that people may be unaware of their own need for commitment; it may be hard to sell people a service they don’t think they need. It may also simply be impossible for the private market to provide the needed commitment devices. The same consumer who wants a commitment device to apply for some future decision may also want to get around that commitment device when that future decision arrives. If it’s profitable for firms to provide ways to get around earlier commitments, then the earlier commitments will never be taken in the first place. Imagine if our example of taxes were left to the private market. In principle, a person might sign a contract with a firm that says the firm will charge her a price above cost for potato chips. When she is craving potato chips, however, nothing stops another firm from offering her potato chips at cost. The special role of the government is that a government-imposed per-unit tax requires all firms to charge the higher price. 28
Our analysis has numerous limitations. For instance, we have ignored the possibility of substitute goods. If taxes are imposed on potato chips, people might substitute out of potato chips and into Twinkies. If such goods are taxable (and carefully taxed at the appropriate rate), our analysis extends in a straightforward way.
29
But if such goods are not taxable (for instance, if we start increasing taxes on alcohol, people might substitute into marijuana), a problem arises. If policymakers fully recognize their existence, then substitute but non-taxable sins merely put limits on the effectiveness of policy. In our framework, for instance, if $u^{*} = v(x + w; \rho) - c(x + w; \gamma) + z$, where $w$ is a non-taxable sin good with market price $p_w$, then a constraint on policy would be that we must have $1 + t_x \le p_w$, because otherwise people would buy $w$ rather than $x$. If instead policymakers naively ignore the existence of substitute and non-taxable sins, then imposing taxes may inadvertently do more harm than good. This is especially a concern if substitute sins have larger health costs; for instance, cigarette taxes might lead people to substitute into black-market, unfiltered cigarettes.
28
More generally, as an alternative to loose intuitions for how markets might deal with self-control problems, explicit models can allow economists to study carefully how market reactions compare to government intervention. Indeed, some researchers have begun this process: for instance, DellaVigna and Malmendier (2004) and Koszegi (forthcoming) explore more systematically the types of situations in which the market is likely to be able or unable to provide commitment devices.
29
There is of course a substantial practical issue of whether real-world governments can accurately assess which goods should be taxed and which should not.
We have also limited attention to uniform linear taxes. Especially because such taxes generally cannot implement the first best, it is natural to consider whether more sophisticated schemes can do better. In particular, we might take advantage of the fact that people with self-control problems would like to behave themselves in the future. For instance, a policy might attempt to sort types via tax menus wherein each consumer chooses in advance her per-unit tax and the associated lump-sum transfer. For some initial thoughts on such mechanisms, see O’Donoghue and Rabin (2003, forthcoming).
A closely related issue is whether there exist superior policy instruments besides taxes. Given that the problem is over-consumption, perhaps a superior policy instrument would be to impose quantity restrictions, that is, a maximum quantity that people are permitted to consume. In some instances, quantity restrictions could be effective; e.g., if we knew everyone’s ideal consumption, we could just set the maximum quantity equal to it. But more generally, price commitments (taxes) have a major advantage relative to quantity commitments. Specifically, if people experience day-to-day variation in their tastes, there is value to having some flexibility to react to this day-to-day variation. When there is a commitment to a higher price (as with a tax), the person can still buy more when her tastes are high and buy less when her tastes are low. Under a quantity restriction, only the latter flexibility is possible.
Despite these limitations, we hope that the insights from our analysis in this simplified environment can be an early step to a more general analysis of optimal taxation when not all consumers are 100% self-controlled.
Appendix: Proofs and Derivations
Preliminary Results: We first derive some basic results that will be useful throughout the proofs.
Recall that our propositions assume $\beta \le 1$, and we assume throughout that $v_{xx} - c_{xx} < 0$ (see footnote 7). From the text, $\Omega(t) = E_F[\hat u(t)] + I$, where $\hat u(t) \equiv v(x^*(t);\rho) - c(x^*(t);\gamma) - x^*(t)$.
As long as $v$ and $c$ are thrice differentiable, $\Omega$ is continuous and twice differentiable, where
$$\frac{d\Omega}{dt} = E_F\left[\frac{d\hat u}{dt}\right] = E_F\left[\frac{d\hat u}{dx}\frac{dx^*}{dt}\right].$$
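In addition, for each $(\rho, \gamma, \beta)$, $x^*(t)$ satisfies $v_x(x^*(t);\rho) - \beta c_x(x^*(t);\gamma) - (1+t) = 0$, from which one can derive:
$$\frac{dx^*}{dt} = \frac{-1}{-[v_{xx}(x^*(t);\rho) - \beta c_{xx}(x^*(t);\gamma)]} < 0$$
$$\frac{dx^*}{d\beta} = \frac{-c_x(x^*(t);\gamma)}{-[v_{xx}(x^*(t);\rho) - \beta c_{xx}(x^*(t);\gamma)]} < 0$$
$$\frac{dx^*}{d\rho} = \frac{v_{x\rho}(x^*(t);\rho)}{-[v_{xx}(x^*(t);\rho) - \beta c_{xx}(x^*(t);\gamma)]} > 0$$
$$\frac{dx^*}{d\gamma} = \frac{-\beta c_{x\gamma}(x^*(t);\gamma)}{-[v_{xx}(x^*(t);\rho) - \beta c_{xx}(x^*(t);\gamma)]} < 0$$
$$\frac{d\hat u}{dx} = v_x(x^*(t);\rho) - c_x(x^*(t);\gamma) - 1 = t - (1-\beta)c_x(x^*(t);\gamma)$$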
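As a sanity check on these preliminary results, the sketch below evaluates them for one convenient parametrization. The quadratic forms $v(x;\rho) = \rho x - x^2/2$ and $c(x;\gamma) = \gamma x^2/2$ are hypothetical choices made only for illustration (they are not the forms used in the paper); they give a closed-form $x^*(t)$ and satisfy $v_{xx} - c_{xx} < 0$. The function names and parameter values are likewise assumptions.

```python
# Hypothetical functional forms (an assumption, not taken from the paper):
#   v(x; rho)   = rho*x - x**2/2   -> v_x = rho - x,    v_xx = -1
#   c(x; gamma) = gamma*x**2/2     -> c_x = gamma*x,    c_xx = gamma

def x_star(t, rho, gamma, beta):
    """Solve the first-order condition v_x - beta*c_x - (1 + t) = 0 for x."""
    return (rho - 1.0 - t) / (1.0 + beta * gamma)

def dx_dt_formula(gamma, beta):
    """dx*/dt = 1 / (v_xx - beta*c_xx), which is negative here."""
    return 1.0 / (-1.0 - beta * gamma)

def du_dx(t, rho, gamma, beta):
    """v_x - c_x - 1 evaluated at x*(t)."""
    x = x_star(t, rho, gamma, beta)
    return (rho - x) - gamma * x - 1.0

# Illustrative parameter values (assumed).
rho, gamma, beta, t, h = 3.0, 0.5, 0.8, 0.1, 1e-6

# Central finite difference of x*(t) versus the analytic slope.
fd = (x_star(t + h, rho, gamma, beta) - x_star(t - h, rho, gamma, beta)) / (2 * h)

# Right-hand side of the last identity: t - (1 - beta) * c_x(x*(t)).
rhs = t - (1.0 - beta) * gamma * x_star(t, rho, gamma, beta)

print(fd, dx_dt_formula(gamma, beta))
print(du_dx(t, rho, gamma, beta), rhs)
```

Under these assumed forms the two printed pairs should agree up to rounding, which is all the check is meant to show.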
Proof of Proposition 1: (1) If everyone has $\beta = 1$, then everyone has $d\hat u/dx = t$, and thus
$$\frac{d\Omega}{dt} = E_F\left(t \cdot \frac{dx^*}{dt}\right) = t \cdot E_F\left(\frac{dx^*}{dt}\right).$$
Because $E_F(dx^*/dt) < 0$ for all $t$, $\Omega$ is quasiconcave in $t$. In particular, $d\Omega/dt > 0$ for $t < 0\%$, $d\Omega/dt = 0$ for $t = 0\%$, and $d\Omega/dt < 0$ for $t > 0\%$, and so the optimal tax $t^* = 0\%$.
(2) Suppose instead that everyone has $\beta \le 1$ and some people have $\beta < 1$. As in part 1, everyone with $\beta = 1$ has $d\hat u/dt = t \cdot (dx^*/dt) \ge 0$ for all $t \le 0\%$. Everyone with $\beta < 1$ has $d\hat u/dt = [t - (1-\beta)c_x(x^*(t);\gamma)] \cdot (dx^*/dt)$. By assumption, $c_x(x^*(t);\gamma) > 0$, which implies $[t - (1-\beta)c_x(x^*(t);\gamma)] < 0$ for any $t \le 0\%$, which in turn implies $d\hat u/dt > 0$ for any $t \le 0\%$.
Because $d\Omega/dt = E_F[d\hat u/dt]$, it follows that $d\Omega/dt > 0$ for all $t \le 0\%$, and thus the optimal tax $t^* > 0\%$.
(Note: When some people have $\beta < 1$, $\Omega$ is not necessarily quasiconcave in $t$. But our proof establishes that, while there may be multiple local maxima, all local maxima have $t > 0\%$, and therefore any global maximum has $t > 0\%$.)
Proof of Proposition 2: To clarify notation, define $F^0(\rho,\gamma,\beta) \equiv G(\rho,\gamma)H_0(\beta)$ and $F^1(\rho,\gamma,\beta) \equiv G(\rho,\gamma)H_1(\beta)$, and then $\Omega^0(t) = E_{F^0}[\hat u(t)] + I$ and $\Omega^1(t) = E_{F^1}[\hat u(t)] + I$.
(1) For any $(\rho,\gamma)$ and $t$, if $d\hat u/dt$ is larger for smaller $\beta$, then $H_1(\beta) \ge H_0(\beta)$ for all $\beta$ implies
$$E_{H_0}[d\hat u/dt] < E_{H_1}[d\hat u/dt].$$
For any $t$, if, for all $(\rho,\gamma)$, $d\hat u/dt$ is larger for smaller $\beta$, then
$$E_G\left(E_{H_0}[d\hat u/dt]\right) < E_G\left(E_{H_1}[d\hat u/dt]\right) \iff E_{F^0}[d\hat u/dt] < E_{F^1}[d\hat u/dt] \iff d\Omega^0/dt < d\Omega^1/dt.$$
Hence, if for all $(\rho,\gamma)$ and $t \le t_0^*$, $d\hat u/dt$ is larger for smaller $\beta$, then $d\Omega^1/dt > d\Omega^0/dt$ for all $t \le t_0^*$. It follows that $t_1^* > t_0^*$.
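(2) Because $H_1(\beta \mid \beta \ge \beta_0) = H_0(\beta)$ for all $\beta \ge \beta_0$,
$$\Omega^0(t) = E_{H_0}[E_G(\hat u(t))] + I = E_{H_1 \mid \beta \ge \beta_0}[E_G(\hat u(t))] + I.$$
Hence,
$$\Omega^1(t) = \Pr_{H_1}(\beta \ge \beta_0)\, E_{H_1 \mid \beta \ge \beta_0}[E_G(\hat u(t))] + \Pr_{H_1}(\beta < \beta_0)\, E_{H_1 \mid \beta < \beta_0}[E_G(\hat u(t))] + I.$$
[…] $> x^{**}$ for all $\beta < \beta_0$. Moreover, because, for any $t$, $x^*(t) > x^{**}$ implies $v_x(x^*(t);\rho) - c_x(x^*(t);\gamma) - 1 < 0$, and because $dx^*/dt < 0$, $x^*(t_0^*) > x^{**}$ implies
$$\frac{d\hat u}{dt} = [v_x(x^*(t);\rho) - c_x(x^*(t);\gamma) - 1]\frac{dx^*}{dt} > 0 \quad \text{for all } t \le t_0^*.$$
It follows that for all $(\rho,\gamma)$ and $\beta < \beta_0$, $\hat u(t_0^*) > \hat u(t)$ for all $t < t_0^*$. And thus $\Omega^1(t_0^*) \ge \Omega^1(t)$ for any $t < t_0^*$.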
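We next prove that $d\Omega^1/dt|_{t=t_0^*} > 0$.
$$\frac{d\Omega^1}{dt} = \Pr_{H_1}(\beta \ge \beta_0)\frac{d\Omega^0}{dt} + \Pr_{H_1}(\beta < \beta_0)\, E_{H_1 \mid \beta < \beta_0}\left[E_G\left(\frac{d\hat u}{dt}\right)\right].$$
[…] $> x^{**}$ implies $d\hat u/dt|_{t=t_0^*} > 0$. It follows that $d\Omega^1/dt|_{t=t_0^*} > 0$.
Finally, the combination of $\Omega^1(t_0^*) \ge \Omega^1(t)$ for any $t < t_0^*$ and $d\Omega^1/dt|_{t=t_0^*} > 0$ implies that $t_1^* > t_0^*$.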
Proof of Proposition 3: From the text,
$$\hat u^{**}(t) = [v(x^*(t);\rho) - c(x^*(t);\gamma) - x^*(t)] + [I + \ell(t) - t x^*(t)],$$
where $\ell(t) \equiv t \cdot X^*(t) = t \cdot E_F[x^*(t)]$.
(1) If there is no heterogeneity in $(\rho,\gamma)$ or in $\beta$, then $x^*(t)$ is the same for everyone, and so $\ell(t) - t x^*(t) = 0$ for everyone. Hence, everyone has the same $\hat u^{**}(t) = v(x^*(t);\rho) - c(x^*(t);\gamma) - x^*(t) + I$, and, given $\beta = 1$, this $\hat u^{**}(t)$ is maximized at $t = 0\%$. It follows that $t = 0\%$ is the unique Pareto-efficient tax.
(2) Let $\rho_{\max}$ and $\rho_{\min}$ be the maximum and minimum $\rho$ in the population, and let $\gamma_{\max}$ and $\gamma_{\min}$ be the maximum and minimum $\gamma$ in the population. We refer to a person who has $\rho = \rho_{\max}$ and $\gamma = \gamma_{\min}$ as a type (A) person, and we refer to a person who has $\rho = \rho_{\min}$ and $\gamma = \gamma_{\max}$ as a type (B) person. Because $dx^*/d\rho > 0$ and $dx^*/d\gamma < 0$, for any $t$, type (A) has the largest $x^*(t)$ in the population, and therefore has $x^*(t) > X^*(t)$. Analogously, type (B) has the smallest $x^*(t)$ in the population, and therefore has $x^*(t) < X^*(t)$.
We can rewrite
$$\hat u^{**}(t) = [v(x^*(t);\rho) - c(x^*(t);\gamma) - x^*(t)] + [I + t(X^*(t) - x^*(t))].$$
Because, given $\beta = 1$, $v(x^*(t);\rho) - c(x^*(t);\gamma) - x^*(t)$ is maximized at $t = 0$, type (A) has $\hat u^{**}(0) > \hat u^{**}(t)$ for any $t > 0$, while type (B) has $\hat u^{**}(0) > \hat u^{**}(t)$ for any $t < 0$. Moreover, for any $(\rho,\gamma)$, $\beta = 1$ implies $v_x(x^*(0);\rho) - c_x(x^*(0);\gamma) - 1 = 0$, and thus
$$\left.\frac{d\hat u^{**}}{dt}\right|_{t=0\%} = X^*(0) - x^*(0).$$
It follows that there exists a $t'' < 0\%$ such that type (A) has $d\hat u^{**}/dt < 0$ for all $t \in (t'', 0\%]$, and there exists a $t' > 0\%$ such that type (B) has $d\hat u^{**}/dt > 0$ for all $t \in [0\%, t')$.
Consider some $t_0 \in [0\%, t')$. Any $t < t_0$ cannot be Pareto-superior to $t_0$ because type (B) has $\hat u^{**}(t) < \hat u^{**}(t_0)$. For any $t > t_0$, all types have
$$v(x^*(t);\rho) - c(x^*(t);\gamma) - x^*(t) < v(x^*(t_0);\rho) - c(x^*(t_0);\gamma) - x^*(t_0).$$
At the same time, because $X^*(t) = E_G[x^*(t)]$, it is impossible for all types to have $X^*(t) - x^*(t) > X^*(t_0) - x^*(t_0)$. Hence, there must be some type that has $\hat u^{**}(t) < \hat u^{**}(t_0)$, and so $t$ cannot be Pareto-superior to $t_0$. It follows that any $t_0 \in [0\%, t')$ is Pareto-efficient.
An analogous argument (using type (A)) establishes that any $t_0 \in (t'', 0\%]$ is Pareto-efficient.
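Proof of Proposition 4: We first prove that condition (A) implies that $d\hat u^{**}/dt|_{t=0\%}$ is larger for smaller $\beta$.
$$\left.\frac{d\hat u^{**}}{dt}\right|_{t=0\%} = [v_x(x^*(0);\rho) - c_x(x^*(0);\gamma) - 1]\frac{dx^*}{dt} + X^*(0) - x^*(0).$$
To ease notation in what follows, we suppress the arguments in the derivatives of $v$ and $c$.
$$\frac{d}{d\beta}\left[\left.\frac{d\hat u^{**}}{dt}\right|_{t=0\%}\right] = [v_{xx} - c_{xx}]\frac{dx^*}{dt}\frac{dx^*}{d\beta} + [v_x - c_x - 1]\frac{d}{d\beta}\left[\frac{dx^*}{dt}\right] - \frac{dx^*}{d\beta}.$$
From our preliminary results, $dx^*/dt = 1/(v_{xx} - \beta c_{xx})$, $dx^*/d\beta = c_x/(v_{xx} - \beta c_{xx})$, and $v_x - c_x - 1 = t - (1-\beta)c_x = -(1-\beta)c_x$ at $t = 0\%$. Differentiating $dx^*/dt$,
$$\frac{d}{d\beta}\left[\frac{dx^*}{dt}\right] = \frac{-\left[(v_{xxx} - \beta c_{xxx})\frac{dx^*}{d\beta} - c_{xx}\right]}{(v_{xx} - \beta c_{xx})^2} = \frac{dx^*}{dt}\frac{dx^*}{d\beta}\left[\frac{c_{xx}}{c_x} - \frac{(v_{xxx} - \beta c_{xxx})}{(v_{xx} - \beta c_{xx})}\right].$$
Hence,
$$\begin{aligned}
\frac{d}{d\beta}\left[\left.\frac{d\hat u^{**}}{dt}\right|_{t=0\%}\right]
&= \frac{dx^*}{dt}\frac{dx^*}{d\beta}\left[(v_{xx} - c_{xx}) - \left((1-\beta)c_x\left[\frac{c_{xx}}{c_x} - \frac{(v_{xxx} - \beta c_{xxx})}{(v_{xx} - \beta c_{xx})}\right]\right) - (v_{xx} - \beta c_{xx})\right] \\
&= \frac{dx^*}{dt}\frac{dx^*}{d\beta}(1-\beta)\left[\frac{c_x(v_{xxx} - \beta c_{xxx})}{(v_{xx} - \beta c_{xx})} - 2c_{xx}\right].
\end{aligned}$$
Because $dx^*/d\beta < 0$ and $dx^*/dt < 0$, it follows that $d\hat u^{**}/dt|_{t=0\%}$ is larger for smaller $\beta$ if $v_{xxx} - \beta c_{xxx} \ge \frac{2c_{xx}}{c_x}(v_{xx} - \beta c_{xx})$ for all $x$.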
(1) If there is no heterogeneity in $(\rho,\gamma)$, then $dx^*/d\beta < 0$ implies that $x^*(0)$ is smallest for people with $\beta = 1$, and so people with $\beta = 1$ have $x^*(0) < X^*(0)$. Since people with $\beta = 1$ also have $v_x(x^*(0);\rho) - c_x(x^*(0);\gamma) - 1 = 0$, they have $d\hat u^{**}/dt|_{t=0\%} > 0$. Moreover, since $d\hat u^{**}/dt|_{t=0\%}$ is larger for smaller $\beta$, everyone with $\beta < 1$ also has $d\hat u^{**}/dt|_{t=0\%} > 0$. Hence, everyone in the population has $d\hat u^{**}/dt|_{t=0\%} > 0$, and so there exists $t' > 0\%$ such that all taxes $t \in (0\%, t')$ are Pareto-superior to $t = 0\%$.
(2) We first prove the latter statement. Much as in the proof for part 1, if $\max_{(\rho,\gamma)\in\Gamma,\ \beta=1} x^*(0) < X^*(0)$, then for all $(\rho,\gamma)$, people with $\beta = 1$ have $d\hat u^{**}/dt|_{t=0\%} > 0$. And since $d\hat u^{**}/dt|_{t=0\%}$ is larger for smaller $\beta$, for all $(\rho,\gamma)$, everyone with $\beta < 1$ also has $d\hat u^{**}/dt|_{t=0\%} > 0$. Again, since everyone in the population has $d\hat u^{**}/dt|_{t=0\%} > 0$, there exists $t' > 0\%$ such that all taxes $t \in (0\%, t')$ are Pareto-superior to $t = 0\%$.
To prove the former statement, we prove that, for all $\beta$, $d[E_G[\hat u^{**}(t)]]/dt|_{t=0\%} > 0$. Note that
$$\left.\frac{d[E_G[\hat u^{**}(t)]]}{dt}\right|_{t=0\%} = E_G\left[\left.\frac{d\hat u^{**}}{dt}\right|_{t=0\%}\right].$$
People with $\beta = 1$ have $d\hat u^{**}/dt|_{t=0\%} = X^*(0) - x^*(0)$, and so $E_G[d\hat u^{**}/dt|_{t=0\%}] = X^*(0) - E_G[x^*(0)]$. Because $dx^*/d\beta < 0$, $d[E_G[x^*(0)]]/d\beta = E_G[d[x^*(0)]/d\beta] < 0$. Hence, $E_G[x^*(0)]$ is smallest for $\beta = 1$, and so $X^*(0) > E_G[x^*(0)]$. It follows that $d[E_G[\hat u^{**}(t)]]/dt|_{t=0\%} > 0$ for $\beta = 1$.
Consider any $\beta < 1$. Because, for all $(\rho,\gamma)$, $d\hat u^{**}/dt|_{t=0\%}$ is larger for smaller $\beta$, it follows that $d[E_G[\hat u^{**}(t)]]/dt|_{t=0\%}$ is larger for smaller $\beta$. Given $d[E_G[\hat u^{**}(t)]]/dt|_{t=0\%} > 0$ for $\beta = 1$, we have $d[E_G[\hat u^{**}(t)]]/dt|_{t=0\%} > 0$ for all $\beta$. The result follows.
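To make part (1) concrete, the following sketch evaluates $d\hat u^{**}/dt|_{t=0\%} = -(1-\beta)c_x \, dx^*/dt + X^*(0) - x^*(0)$ for each type, assuming the CRRA-benefit and linear-cost forms of Section 5, for which $c_x = \gamma$ and demand is $x^*(t) = (\rho/(\beta\gamma+1+t))^{1/r}$. With linear costs $c_{xx} = 0$ and with CRRA benefits $v_{xxx} > 0$, so condition (A) holds. The two-type population, the parameter values, and the function names are illustrative assumptions only.

```python
def x_star(t, rho, gamma, beta, r):
    """CRRA/linear demand: x*(t) = (rho / (beta*gamma + 1 + t))**(1/r)."""
    return (rho / (beta * gamma + 1.0 + t)) ** (1.0 / r)

def marginal_gain_at_zero(types, r, rho=1.0, gamma=1.0):
    """Return {beta: d u**/dt at t = 0} for a population with common (rho, gamma).

    `types` is a list of (population weight, beta) pairs; weights sum to one.
    """
    # Average consumption X*(0), which determines the lump-sum transfer.
    X0 = sum(w * x_star(0.0, rho, gamma, b, r) for w, b in types)
    gains = {}
    for _, b in types:
        x0 = x_star(0.0, rho, gamma, b, r)
        # Slope of CRRA demand at t = 0: dx*/dt = -x* / (r * (beta*gamma + 1)).
        dx_dt = -x0 / (r * (b * gamma + 1.0))
        # [v_x - c_x - 1] = -(1 - beta)*gamma at t = 0, plus the net transfer term.
        gains[b] = -(1.0 - b) * gamma * dx_dt + X0 - x0
    return gains

# Half the population has full self-control, half has beta = 0.8 (assumed values).
gains = marginal_gain_at_zero([(0.5, 1.0), (0.5, 0.8)], r=0.5)
for b, g in sorted(gains.items()):
    print(f"beta={b}: d u**/dt at t=0 is {g:.6f}")
```

In this assumed example both types gain from a small tax, and the gain is larger for the type with the smaller $\beta$, in line with the argument above.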
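Details for Numerical Calculations in Section 5:
First, we derive analytically an equation for $d\Omega(t)/dt$. It is straightforward to derive that, with CRRA benefits and linear costs, an individual’s demand becomes
$$x^*(t) = \frac{\rho^{1/r}}{(\beta\gamma + 1 + t)^{1/r}}.$$
Hence, the social-welfare function $\Omega(t)$ from Section 3 becomes:
$$\begin{aligned}
\Omega(t) &= E_F\left[\rho\,(x^*(t))^{1-r}/(1-r) - \gamma x^*(t) - x^*(t) + I\right] \\
&= E_F\left[\frac{\rho}{1-r}\left(\frac{\rho^{1/r}}{(\beta\gamma+1+t)^{1/r}}\right)^{1-r} - (\gamma+1)\left(\frac{\rho^{1/r}}{(\beta\gamma+1+t)^{1/r}}\right)\right] + I \\
&= E_F\left[\rho^{1/r}\right] E_F\left[\frac{1}{1-r}\left(\frac{1}{\beta\gamma+1+t}\right)^{\frac{1-r}{r}} - (\gamma+1)\left(\frac{1}{\beta\gamma+1+t}\right)^{\frac{1}{r}}\right] + I.
\end{aligned}$$
Note that maximizing $\Omega(t)$ is equivalent to maximizing $\hat\Omega(t) \equiv \Omega(t)/E_F\left[\rho^{1/r}\right]$, and:
$$\frac{d\hat\Omega(t)}{dt} = E_F\left[\frac{-1}{r}\left(\frac{1}{\beta\gamma+1+t}\right)^{1/r} + \frac{(\gamma+1)}{r}\left(\frac{1}{\beta\gamma+1+t}\right)^{1/r+1}\right] = E_F\left[\frac{(1-\beta)\gamma - t}{r \cdot (\beta\gamma+1+t)^{1/r+1}}\right].$$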
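The closed form for $d\hat\Omega(t)/dt$ can be checked numerically. The sketch below codes $\hat\Omega(t)$ (dropping the additive constant $I/E_F[\rho^{1/r}]$, which does not affect the derivative) and compares a finite-difference slope with the analytic expression. The discrete population, given as a list of `(weight, beta, gamma)` tuples, and the value of `r` are assumptions chosen for illustration.

```python
def omega_hat(t, types, r):
    """Normalized welfare, up to an additive constant.

    `types` is a list of (weight, beta, gamma) tuples whose weights sum to one.
    """
    total = 0.0
    for w, b, g in types:
        a = b * g + 1.0 + t
        total += w * (a ** (-(1.0 - r) / r) / (1.0 - r) - (g + 1.0) * a ** (-1.0 / r))
    return total

def d_omega_hat(t, types, r):
    """Analytic derivative: E[((1 - beta)*gamma - t) / (r * (beta*gamma + 1 + t)**(1/r + 1))]."""
    return sum(
        w * ((1.0 - b) * g - t) / (r * (b * g + 1.0 + t) ** (1.0 / r + 1.0))
        for w, b, g in types
    )

# Assumed example: 70% of consumers have beta = 1, 30% have beta = 0.9; gamma = 1.
types = [(0.7, 1.0, 1.0), (0.3, 0.9, 1.0)]
r, t, h = 0.5, 0.05, 1e-6

finite_diff = (omega_hat(t + h, types, r) - omega_hat(t - h, types, r)) / (2 * h)
print(finite_diff, d_omega_hat(t, types, r))
```

The two printed numbers should coincide to several decimal places; the analytic form is the one used when searching for the optimal tax.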
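Next, we derive analytically an equation for the elasticity of demand. Let $q \equiv 1 + t$ denote the market price, and so $x^*(t) = \rho^{1/r}/(\beta\gamma + q)^{1/r}$. Market demand is thus
$$X^* = E_F[x^*(t)] = E_F\left[\rho^{1/r}\right] \cdot E_F\left[(\beta\gamma+q)^{-1/r}\right],$$
and therefore the elasticity of demand is
$$\varepsilon_D = \frac{dX^*}{dq}\frac{q}{X^*} = \frac{E_F\left[\rho^{1/r}\right] \cdot E_F\left[(-1/r)(\beta\gamma+q)^{-1/r-1}\right] \cdot q}{E_F\left[\rho^{1/r}\right] \cdot E_F\left[(\beta\gamma+q)^{-1/r}\right]} = \left(\frac{-q}{r}\right)\frac{E_F\left[(\beta\gamma+q)^{-1/r-1}\right]}{E_F\left[(\beta\gamma+q)^{-1/r}\right]}.$$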
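A minimal sketch of how the elasticity formula might be used to calibrate $r$: since $\rho$ cancels, $\varepsilon_D$ depends only on $r$, $q$, and the distribution of $(\beta, \gamma)$. The bisection routine `calibrate_r`, its bracket `[r_lo, r_hi]`, and the example population are assumptions for illustration; the text does not say which root-finder was used, and the sketch simply requires the target to be bracketed.

```python
def elasticity(types, r, q=1.0):
    """Demand elasticity (-q/r) * E[(beta*gamma + q)**(-1/r - 1)] / E[(beta*gamma + q)**(-1/r)].

    `types` is a list of (weight, beta, gamma) tuples whose weights sum to one.
    """
    num = sum(w * (b * g + q) ** (-1.0 / r - 1.0) for w, b, g in types)
    den = sum(w * (b * g + q) ** (-1.0 / r) for w, b, g in types)
    return (-q / r) * num / den

def calibrate_r(types, target, r_lo=0.05, r_hi=20.0, tol=1e-12):
    """Bisection on r so that the elasticity at t = 0 (q = 1) hits `target`."""
    f_lo = elasticity(types, r_lo) - target
    f_hi = elasticity(types, r_hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("target elasticity is not bracketed by [r_lo, r_hi]")
    while r_hi - r_lo > tol:
        mid = 0.5 * (r_lo + r_hi)
        f_mid = elasticity(types, mid) - target
        if f_lo * f_mid <= 0:
            r_hi = mid            # root lies in the lower half
        else:
            r_lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (r_lo + r_hi)

# Homogeneous check: with beta = gamma = 1 and q = 1 the formula reduces to -1/(2r).
homogeneous = [(1.0, 1.0, 1.0)]
print(elasticity(homogeneous, r=0.5))
print(calibrate_r(homogeneous, target=-1.0))
```

For the homogeneous check the formula reduces to $-1/(2r)$, so a target elasticity of $-1$ should return $r$ close to $0.5$.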
We can now conduct our numerical calculations for the optimal tax. Specifically, we fix a $\gamma$ (that is the same for everyone) and a distribution of $\beta$. We then set $t = 0\%$, or $q = 1$, and choose $r$ so that $\varepsilon_D$ is equal to its target value. Finally, we fix that $r$, and find numerically the $t^*$ such that $d\hat\Omega(t)/dt > 0$ for all $t \le t^*$ and $d\hat\Omega(t)/dt < 0$ for $t > t^*$.$^{30}$
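The last step of this procedure can be sketched as a one-dimensional root search on $d\hat\Omega(t)/dt$. The bisection below is an assumed implementation, not necessarily the routine used for Table 1. It relies on two facts that follow from the closed form: every term is nonnegative at $t = 0$, and every term is nonpositive once $t \ge \max(1-\beta)\gamma$, so a sign change lies in that interval. It also assumes the sign changes only once there, which footnote 30 argues for the Table 1 cases but which is an assumption elsewhere.

```python
def d_omega_hat(t, types, r):
    """E[((1 - beta)*gamma - t) / (r * (beta*gamma + 1 + t)**(1/r + 1))].

    `types` is a list of (weight, beta, gamma) tuples whose weights sum to one.
    """
    return sum(
        w * ((1.0 - b) * g - t) / (r * (b * g + 1.0 + t) ** (1.0 / r + 1.0))
        for w, b, g in types
    )

def optimal_tax(types, r, tol=1e-12):
    """Bisection for the tax at which d_omega_hat switches from positive to nonpositive."""
    lo = 0.0
    hi = max((1.0 - b) * g for _, b, g in types)  # derivative is <= 0 from here on
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_omega_hat(mid, types, r) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed examples with gamma = 1 and r = 0.5.
print(optimal_tax([(1.0, 1.0, 1.0)], r=0.5))                   # everyone has beta = 1
print(optimal_tax([(1.0, 0.9, 1.0)], r=0.5))                   # everyone has beta = 0.9
print(optimal_tax([(0.5, 1.0, 1.0), (0.5, 0.9, 1.0)], r=0.5))  # mixed population
```

In the homogeneous cases the search returns $t = (1-\beta)\gamma$, the tax that exactly offsets the ignored marginal health cost; in the mixed case the result lies strictly between the two homogeneous values.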
Finally, we search for the minimum quasi-Pareto-efficient tax. To do so, note that
$$\begin{aligned}
E_G[\hat u^{**}(t)] &= E_G\left[\rho\,(x^*(t))^{1-r}/(1-r) - \gamma x^*(t) - (1+t)x^*(t) + \ell(t) + I\right] \\
&= E_G\left[\rho^{1/r}\right]\left[\frac{1}{1-r}\left(\frac{1}{\beta\gamma+1+t}\right)^{(1-r)/r} - (\gamma+1+t)\left(\frac{1}{\beta\gamma+1+t}\right)^{1/r}\right] + \ell(t) + I.
\end{aligned}$$
Because $\ell(t) = t\,E_F[x^*(t)] = t\,E_G\left[\rho^{1/r}\right] E_F\left[(\beta\gamma+1+t)^{-1/r}\right]$, and because $\gamma$ is the same for everyone, it follows that $E_G[\hat u^{**}(t)] \ge E_G[\hat u^{**}(t')]$ if and only if $\hat u^{***}(t) \ge \hat u^{***}(t')$, where
$$\hat u^{***}(t) \equiv \frac{1}{1-r}(\beta\gamma+1+t)^{-1/r+1} - (\gamma+1+t)(\beta\gamma+1+t)^{-1/r} + t\,E_F\left[(\beta\gamma+1+t)^{-1/r}\right].$$
Hence, we merely compute numerically $\hat u^{***}(t)$ for $\beta = 1$, and search for the minimum $t$ such that $\hat u^{***}(t) > \hat u^{***}(t + .0001)$.$^{31}$
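The grid search just described can be sketched as follows. The step of $.0001$ comes from the text; the upper bound `t_max`, the function names, and the example populations are assumptions for illustration, and the sketch requires $r \ne 1$.

```python
def u_triple_star(t, beta, types, r, gamma):
    """Normalized utility of a type with self-control `beta`.

    `types` is a list of (weight, beta) pairs describing the population; gamma is common.
    """
    a = beta * gamma + 1.0 + t
    # Per-capita transfer term: t * E[(beta*gamma + 1 + t)**(-1/r)].
    transfer = t * sum(w * (b * gamma + 1.0 + t) ** (-1.0 / r) for w, b in types)
    return a ** (1.0 - 1.0 / r) / (1.0 - r) - (gamma + 1.0 + t) * a ** (-1.0 / r) + transfer

def min_quasi_pareto_tax(types, r, gamma, step=1e-4, t_max=5.0):
    """Smallest grid tax at which the beta = 1 type stops gaining from a further increase."""
    k = 0
    while k * step <= t_max:
        t = k * step
        if u_triple_star(t, 1.0, types, r, gamma) > u_triple_star(t + step, 1.0, types, r, gamma):
            return t
        k += 1
    return None  # no such tax found on the grid up to t_max

# Assumed examples with gamma = 1 and r = 0.5.
print(min_quasi_pareto_tax([(1.0, 1.0)], r=0.5, gamma=1.0))              # no self-control problems
print(min_quasi_pareto_tax([(0.5, 1.0), (0.5, 0.8)], r=0.5, gamma=1.0))  # half have beta = 0.8
```

When nobody has self-control problems the search stops at $t = 0$, because any positive tax lowers the utility of the $\beta = 1$ type. When some consumers have $\beta < 1$, the $\beta = 1$ type initially gains from the transfer, so the search stops at a strictly positive tax.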
30 One can show that $\hat\Omega(t)$ is concave for all $t < (1+\gamma)r$. For all cases in Table 1, $t^*$ is well within this range. Moreover, for the cases in Table 1, one can also show that $d\hat\Omega(t)/dt < 0$ for all $t > (1+\gamma)r$. It follows that this technique does indeed identify the global optimum.
31 One can show that, in general, $d\hat u^{***}/dt$ is smaller for larger $\beta$. Hence, if $\hat u^{***}(t + .0001) > \hat u^{***}(t)$ for the $\beta = 1$ type, then $\hat u^{***}(t + .0001) > \hat u^{***}(t)$ for all types with $\beta < 1$ as well. Moreover, one can also show that, for all cases in Table 1, $\hat u^{***}(t)$ is quasi-concave. It follows that this technique does indeed identify the minimum quasi-Pareto-efficient tax.
References
Ainslie, George (1991). ‘‘Derivation of ‘Rational’ Economic Behavior from Hyperbolic Discount
Curves.’’ American Economic Review, 81(2), 334-340.
Ainslie, George (1992). Picoeconomics: The strategic interaction of successive motivational states within the person. New York: Cambridge University Press.
Ainslie, George and Nick Haslam (1992a). ‘‘Self-Control,’’ in George Loewenstein and Jon Elster, eds., Choice Over Time. New York: Russell Sage Foundation, 177-209.
Ainslie, George and Nick Haslam (1992b). ‘‘Hyperbolic Discounting,’’ in George Loewenstein and Jon Elster, eds., Choice Over Time. New York: Russell Sage Foundation, 57-92.
Angeletos, George-Marios, David Laibson, Andrea Repetto, Jeremy Tobacman, and Stephen Weinberg (2001). ‘‘The Hyperbolic Buffer Stock Model: Calibration, Simulation, and Empirical
Evaluation.’’ Journal of Economic Perspectives, 15(3), 47-68.
Benabou, Roland and Jean Tirole (2002). ‘‘Self-Confidence and Personal Motivation.’’ Quarterly
Journal of Economics, 117(3), 871-915.
Bernheim, B. Douglas and Antonio Rangel (2004). ‘‘Addiction and Cue-Triggered Decision Processes.’’
American Economic Review, 94(5), 1558-1590.
Bernheim, B. Douglas and Antonio Rangel (forthcoming). ‘‘From Neuroscience to Public Policy:
A New Economic View of Addiction.’’ Swedish Economic Policy Review.
Besley, Timothy (1988). ‘‘A Simple Model for Merit Good Arguments.’’ Journal of Public Economics, 35, 371-383.
Camerer, Colin, Samuel Issacharoff, George Loewenstein, Ted O’Donoghue, and Matthew Rabin
(2003). ‘‘Regulation for Conservatives: Behavioral Economics and the Case for ‘Asymmetric
Paternalism’.’’ University of Pennsylvania Law Review, 151(3), 1211-1254.
Carrillo, Juan and Thomas Mariotti (2000). ‘‘Strategic Ignorance as a Self-Disciplining Device.’’
Review of Economic Studies, 67, 529-544.
Choi, James, David Laibson, Brigitte Madrian, and Andrew Metrick (2003). ‘‘Optimal Defaults.’’
American Economic Review (Papers and Proceedings), 93(2), 180-185.
DellaVigna, Stefano and Ulrike Malmendier (2004). ‘‘Contract Design and Self-Control: Theory and Evidence.’’ Quarterly Journal of Economics, 119, 353-402.
Diamond, Peter A. (1973). ‘‘Consumption Externalities and Imperfect Corrective Pricing.’’ Bell
Journal of Economics, 4, 526-538.
Diamond, Peter A. and James A. Mirrlees (1971a). ‘‘Optimal Taxation and Public Production, Part I: Production Efficiency.’’ American Economic Review, 61(1), 8-27.
Diamond, Peter A. and James A. Mirrlees (1971b). ‘‘Optimal Taxation and Public Production, Part II: Tax Rules.’’ American Economic Review, 61(3), 261-278.
Fischer, Carolyn (1999). ‘‘Read This Paper Even Later: Procrastination with Time-Inconsistent
Preferences.’’ Resources for the Future Discussion Paper 99-20.
Frederick, Shane, George Loewenstein, and Ted O’Donoghue (2002). ‘‘Time Discounting and Time
Preference: A Critical Review.’’ Journal of Economic Literature, 40(2), 351-401.
Gruber, Jonathan and Botond Koszegi (2001). ‘‘Is Addiction ‘Rational’? Theory and Evidence.’’
Quarterly Journal of Economics, 116, 1261-1303.
Gruber, Jonathan and Botond Koszegi (2004). ‘‘Tax Incidence When Individuals are Time-Inconsistent: The Case of Cigarette Excise Taxes.’’ Journal of Public Economics, 88, 1959-1987.
Gruber, Jonathan and Sendhil Mullainathan (2005). ‘‘Do Cigarette Taxes Make Smokers Happier?’’
B.E. Journals: Advances in Economic Analysis & Policy, 5, Article 4.
Gul, Faruk and Wolfgang Pesendorfer (2001). ‘‘Temptation and Self-Control.’’ Econometrica, 69,
1403-1435.
Gul, Faruk and Wolfgang Pesendorfer (forthcoming). ‘‘Harmful Addiction.’’ Review of Economic
Studies.
Harrison, Glenn, Morten Lau, and Melanie Williams (2002). ‘‘Estimating Individual Discount
Rates in Denmark: A Field Experiment.’’ American Economic Review, 92, 1606-1617.
Herrnstein, Richard, George Loewenstein, Drazen Prelec, and William Vaughan, Jr. (1993). ‘‘Utility Maximization and Melioration: Internalities in Individual Choice.’’ Journal of Behavioral
Decision Making, 6, 149-185.
Kahneman, Daniel (1994). ‘‘New Challenges to the Rationality Assumption.’’ Journal of Institutional and Theoretical Economics, 150, 18-36.
Katchova, Ani, Ian Sheldon, and Mario Miranda (2005). ‘‘A Dynamic Model of Oligopoly and
Oligopsony in the U.S. Potato-Processing Industry.’’ Agribusiness, 21(3), 409-428.
Koszegi, Botond (forthcoming). ‘‘On the Feasibility of Market Solutions to Self-Control Problems.’’ Swedish Economic Policy Review.
Kuchler, Fred, Abebayehu Tegene, and J. Michael Harris (2005). ‘‘Taxing Snack Foods: Manipulating Diet Quality or Financing Information Programs?’’ Review of Agricultural Economics,
27(1), 4-20.
Laibson, David (1997). ‘‘Hyperbolic Discounting and Golden Eggs.’’ Quarterly Journal of Economics, 112(2), 443-477.
Laibson, David (1998). ‘‘Life-Cycle Consumption and Hyperbolic Discount Functions.’’ European
Economic Review, 42, 861-871.
Laibson, David, Andrea Repetto, and Jeremy Tobacman (1998). ‘‘Self-Control and Saving for Retirement.’’ Brookings Papers on Economic Activity, 1, 91-196.
Loewenstein, George (1996). ‘‘Out of Control: Visceral Influences on Behavior.’’ Organizational
Behavior and Human Decision Processes, 65, 272-92.
Loewenstein, George and Ted O’Donoghue (2005). ‘‘Animal Spirits: Affective and Deliberative
Processes in Economic Behavior.’’ Mimeo, Cornell University.
Loewenstein, George and Drazen Prelec (1992). ‘‘Anomalies in Intertemporal Choice: Evidence and an Interpretation.’’ Quarterly Journal of Economics, 107(2), 573-597.
Musgrave, Richard A. (1959). The Theory of Public Finance. New York: McGraw-Hill.
O’Donoghue, Ted and Matthew Rabin (1999a). ‘‘Doing It Now or Later.’’ American Economic
Review, 89(1), 103-124.
O’Donoghue, Ted and Matthew Rabin (1999b). ‘‘Procrastination in Preparing for Retirement,’’ in Henry Aaron, ed., Behavioral Dimensions of Retirement Economics. Washington DC and New York: Brookings Institution Press and Russell Sage Foundation, 125-156.
O’Donoghue, Ted and Matthew Rabin (2001). ‘‘Choice and Procrastination.’’ Quarterly Journal of Economics, 116(1), 121-160.
O’Donoghue, Ted and Matthew Rabin (2003). ‘‘Studying Optimal Paternalism, Illustrated by a
Model of Sin Taxes.’’ American Economic Review (Papers and Proceedings), 93(2), 186-191.
O’Donoghue, Ted and Matthew Rabin (forthcoming). ‘‘Optimal Taxes for Sin Goods.’’ Mimeo,
Cornell University. Swedish Economic Policy Review.
Phelps, E. S. and Robert A. Pollak (1968). ‘‘On Second-Best National Saving and Game-Equilibrium
Growth.’’ Review of Economic Studies, 35, 185-199.
Pigou, Arthur C. (1920). The Economics of Welfare. London: Macmillan and Company.
Ramsey, Frank P. (1927). ‘‘A Contribution to the Theory of Taxation.’’ Economic Journal, 37(145),
47-61.
Sheshinski, Eytan (2002). ‘‘Bounded Rationality and Socially Optimal Limits on Choice in a Self-Selection Model.’’ Mimeo, Hebrew University.
Sunstein, Cass and Richard H. Thaler (2003a). ‘‘Libertarian Paternalism.’’ American Economic Review (Papers and Proceedings), 93(2), 175-179.
Sunstein, Cass and Richard H. Thaler (2003b). ‘‘Libertarian Paternalism Is Not An Oxymoron.’’ University of Chicago Law Review, 70, 1159-1202.
Thaler, Richard H. (1991). ‘‘Some Empirical Evidence on Dynamic Inconsistency,’’ in Quasi Rational Economics. New York: Russell Sage Foundation, 127-133.
Thaler, Richard H. and George Loewenstein (1992). ‘‘Intertemporal Choice,’’ in Richard H. Thaler, ed., The Winner’s Curse: Paradoxes and Anomalies of Economic Life. New York: Free Press, 92-106.