An Empirical Study Using Panel Data
Abstract
This paper examines how public housing affects homelessness, conditional on other relevant factors. Five ordinary least squares (OLS) regression models, including bivariate, multivariate, and fixed effects regressions, are used to estimate the relationship between public housing and homelessness based on yearly state-level panel data from 2007 to 2013. The results indicate that public housing plays a significant role in reducing homelessness.
I. Introduction and Background
Public housing is a commonly used solution to the problem of homelessness, aiming to provide affordable housing at lower rents to people who have difficulty buying or renting a home at market prices. At present, according to estimates by the Department of Housing and Urban Development (HUD), approximately 1.2 million households live in public housing units in the U.S. Designed to directly address homelessness, public housing is often assumed to reduce the size of the homeless population. With an adequate amount of public housing available, the number of homeless people is expected to drop substantially.
However, does public housing actually work as it is supposed to?
The purpose of this study is to answer the following question: Has public housing reduced the problem of homelessness to any important degree when other relevant factors are accounted for? We do not attempt to explain the nature of homelessness; instead, we focus specifically on the empirical relationship between public housing and homelessness. Because the development of public housing programs often requires a large amount of funding, understanding the role of public housing in alleviating homelessness helps policy makers assess the effectiveness of subsidized housing policies.
II. Literature Review
Only a few pieces of research specifically explore the relationship between public housing and homelessness. Some researchers illustrate the topic based on qualitative methods (e.g. participant observation) (Vakili-Zad, 2004), while others look at the role public housing plays in reducing homelessness from the perspective of statistical and econometric analysis.
Quigley (1990) used four multivariate OLS regression models based on the data from 50 U.S. cities in 1984 provided by HUD to estimate the effect public housing has on homelessness. The estimate of the number of the homeless divided by the population of the city is used as a dependent variable and the number of public housing units per capita is used as an independent variable. Other explanatory variables include poverty rate, unemployment rate, average temperature, vacancy rate, rent control, population growth rate, and average rent (Quigley, 1990; 90).
Similarly, Troutman, Jackson and Ekelund (1999) used the same data set, the 1984 HUD survey, and selected 40 of the cities to determine whether housing conditions are a leading cause of homelessness. They built three regression models: an OLS model, a sample selection model, and a 3SLS selection model. The dependent variable is measured as the estimated number of the homeless population per capita. The number of public housing units, however, is not used as an independent variable; the sum of state-level per capita expenditures on public housing programs is included instead as a measure of public housing. Troutman and his colleagues examined many explanatory variables. In addition to all the independent variables incorporated in Quigley’s study except the population growth rate, they also introduced household size, average rainfall, state-level per capita expenditures on alcohol, drug, and mental health, direct federal housing assistance payments to individuals per capita, median house price, personal income per capita, non-metropolitan population, crime rate, Medicaid payments per capita, institutionalized mental population, and percent of dwelling units occupied by renters (Troutman, Jackson and Ekelund, 1999; 200). All the variables (independent and dependent) are converted into logarithms.
Quigley (1990) and Troutman, Jackson and Ekelund (1999) both found that the coefficient on the public housing term is not statistically significant in most of their regression models. According to their results, the presence of public housing is unrelated to the extent of homelessness. They therefore concluded that public housing programs failed to reduce homelessness.
Early (1998) reaches a similar conclusion using a different approach. Early chose to use individual-level data from 15 U.S. cities rather than city- or state-level data to estimate the role of public housing in reducing homelessness. He combined micro-data on the homeless from a 1987 study by the Urban Institute with micro-data on the housed population from the American Housing Survey (AHS) in 1985-1988. A logit regression model is used to estimate the probability of being homeless, which is the dependent variable, as a function of household characteristics and the characteristics of the city in which the household resides (Early, 1998; 689). The explanatory variables include real monthly income, cash assistance received, number of persons in the household, gender, race, age, CES-D score (a measure of depression), mental health spending, drug abuse spending, temperature, lowest quality of housing available, quality of homeless shelters, availability of homeless shelters, vacancy rate for low-rent units, and relative price of substandard housing. The fitted values from the logit regression suggest that less than 5 percent of the subsidized population in the 15 cities would be homeless in the absence of the subsidies. Early concluded that expanding the current public housing programs cannot be expected to have much effect on homelessness.
The limitations of the existing research leave room for new studies on the same topic. One limitation is that these studies try to find a model that explains as much as possible of the causes of homelessness rather than specifically exploring the relationship between public housing and homelessness. Some factors that have no clear association with public housing, such as temperature, are included as explanatory variables. Some factors that affect both public housing and homelessness (e.g. average rent of public housing), however, are excluded from those studies, causing potential omitted variable bias. Another limitation is that the data used by these studies, which were collected in the 1980s, are out of date and can hardly represent the current situation.
III. Data and Methodology
In order to explore the potential causality between public housing and homelessness, this study applies regression analysis to yearly state-level panel data from 50 U.S. states and the District of Columbia in the years 2007-2013. A series of OLS regression models (including bivariate regression, multivariate regression, and fixed effects regression) are used to estimate the size of the homeless population as a function of public housing and other explanatory variables. The dependent variable is measured as the count of sheltered and unsheltered homeless persons on a single night of a given year provided by the Point-In-Time (PIT) survey conducted by HUD. The number of public housing units under contract for federal subsidy and available for occupancy, provided by HUD’s Picture of Subsidized Households data set, is used as a measure of the level of public housing and is the key independent variable. The first and simplest regression model used in this study is:
$$\text{homeless} = \beta_0 + \beta_1(\text{public housing}) + \varepsilon \tag{1}$$
This function provides a rough and direct estimate of the relationship between public housing and homelessness. However, the bivariate regression model cannot guarantee internal validity. A major reason is that additional factors may be responsible for the observed association between public housing and homelessness, which generates omitted variable bias. To deal with this problem, a multiple regression model allows us to estimate the effect of public housing on homelessness while controlling for factors that might confound the relationship between the two. Control variables should affect both the independent variable (public housing) and the dependent variable (homelessness). Population is an especially important example: the number of the homeless as well as the level of public housing can be expected to grow with the size of the population. So the first multivariate regression model is:
$$\text{homeless} = \beta_0 + \beta_1(\text{public housing}) + \beta_2(\text{population}) + \varepsilon \tag{2}$$
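The omitted variable problem that motivates model (2) can be illustrated with a small simulation. The sketch below uses entirely synthetic data, not the HUD panel: the data-generating coefficients, the noise levels, and the helper name `ols` are assumptions chosen only so that population drives both public housing and homelessness. Under those assumptions, the bivariate slope on public housing comes out positive even though the simulated effect is negative, and adding population recovers the negative sign.

```python
import numpy as np

def ols(y, *regressors):
    """Return OLS coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 350  # similar in size to the panel, but every value below is simulated

# Hypothetical data-generating process (an assumption, not the paper's data):
# population raises both public housing and homelessness.
population = rng.uniform(500, 13000, n)                      # thousands of people
public_housing = 0.004 * population + rng.normal(0, 5, n)    # thousands of units
homeless = 2.0 * population - 150.0 * public_housing + rng.normal(0, 1000, n)

b_bivariate = ols(homeless, public_housing)               # analogue of model (1)
b_multi = ols(homeless, public_housing, population)       # analogue of model (2)

print("Model (1) slope on public housing:", round(b_bivariate[1], 1))
print("Model (2) slope on public housing:", round(b_multi[1], 1))
```

In this simulated setting the sign flip arises purely because population is omitted from the first regression, which is the mechanism the multivariate specification is meant to address.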
Additionally, other factors correlated with public housing may also be determinants of homelessness and cannot be ignored. For instance, an increase in government expenditures on public welfare will likely raise the level of public housing and reduce the size of the homeless population. A similar example is the number of available units across all subsidized housing programs, which is also expected to be positively related to public housing and negatively related to homelessness. It is entirely possible that government expenditures on mental health or drug abuse, or the available units of other assisted housing programs, are more strongly associated with homelessness. Further examples follow. Poverty and unemployment are likely to affect both levels of homelessness and government efforts to develop subsidized housing programs. Federal spending on public housing programs helps state and local governments build more public housing and thus reduces homelessness. Lower public housing rent is expected to mitigate homelessness by making public housing more affordable for people who have difficulty finding a place to live at market prices, although its effect on the level of public housing is not clear. One possibility is that a higher fraction of the budget for assisted housing programs is required to guarantee lower rents, which in turn reduces the ability of governments to build more public housing. In order to correctly identify the relationship between public housing and homelessness, it is necessary to include these characteristics in the multiple regression function as control variables. The descriptions and summary statistics of all the variables used in the second multivariate regression model are given in Table 1 and Table 2.
Table 1. Description of Variables

| Variable | Definition |
|---|---|
| Homeless | Count of sheltered and unsheltered homeless persons on a single night of a given year |
| Public housing | Number of public housing units under contract for federal subsidy and available for occupancy (in thousands) |
| Population | Annual estimate of the population (in thousands) |
| Rent | Average household contribution towards public housing rent per year (in U.S. dollars) |
| Spending | Average federal spending per public housing unit per year (in U.S. dollars) |
| Expenditures | Amount of annual expenditures on public welfare of state government (in thousands of U.S. dollars) |
| All programs | Number of units under contract for federal subsidy and available for occupancy for all assisted housing programs (in thousands) |
| Poverty | Ratio of the number of people who fall below the poverty line to the total population (in percent) |
| Unemployment | Total unemployment as a percentage of the civilian labor force (in percent) |
Table 2. Summary Statistics

| Variable | Observations | Mean | St. Dev. | Min. | Max. |
|---|---|---|---|---|---|
| Homeless | 350 | 7093.96 | 5632.08 | 515 | 23379 |
| Public housing | 350 | 17.08 | 16.64 | 0.71 | 67.98 |
| Population | 350 | 4352.95 | 3291.05 | 513 | 12910 |
| Rent | 350 | 3012.80 | 687.31 | 828 | 5765 |
| Spending | 350 | 5391.37 | 1824.19 | 2206 | 18384 |
| Expenditures | 336 | 6084.63 | 4855.18 | 576 | 23707 |
| All programs | 350 | 162.59 | 131.66 | 11.60 | 539.66 |
| Poverty | 343 | 13.06 | 3.43 | 5.4 | 23.1 |
| Unemployment | 343 | 6.68 | 2.35 | 2.5 | 13.8 |
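Therefore, the second multivariate regression model is:
$$\text{homeless} = \beta_0 + \beta_1(\text{public housing}) + \beta_2(\text{population}) + \beta_3(\text{rent}) + \beta_4(\text{spending}) + \beta_5(\text{expenditures}) + \beta_6(\text{all programs}) + \beta_7(\text{poverty}) + \beta_8(\text{unemployment}) + \varepsilon \tag{3}$$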
Although the multiple regression models are expected to mitigate omitted variable bias to some extent, a variety of related variables are still not included. Since many factors that affect public housing and homelessness are difficult to measure or collect, it is nearly impossible to control for every factor that may cause omitted variable bias. Another way to alleviate this problem is the fixed effects method, which allows us to hold some factors constant in panel data analysis even if we cannot observe them. Specifically, state fixed effects control for differences between states that do not vary over time, while time fixed effects control for variables that change over time but are the same across states in a given year. This study first uses state fixed effects alone and then uses both state and time fixed effects to further examine the relationship between public housing and homelessness.
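The two fixed effects regression models are:
$$\text{homeless} = \beta_0 + \beta_1(\text{public housing}) + \beta_2(\text{population}) + \beta_3(\text{rent}) + \beta_4(\text{spending}) + \beta_5(\text{expenditures}) + \beta_6(\text{all programs}) + \beta_7(\text{poverty}) + \beta_8(\text{unemployment}) + \text{FE}(\text{state}) + \varepsilon \tag{4}$$
$$\text{homeless} = \beta_0 + \beta_1(\text{public housing}) + \beta_2(\text{population}) + \beta_3(\text{rent}) + \beta_4(\text{spending}) + \beta_5(\text{expenditures}) + \beta_6(\text{all programs}) + \beta_7(\text{poverty}) + \beta_8(\text{unemployment}) + \text{FE}(\text{state}) + \text{FE}(\text{time}) + \varepsilon \tag{5}$$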
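One common way to estimate models with state and time fixed effects is to add a dummy variable for each state and each year (dropping one of each) to an ordinary least squares regression. The sketch below illustrates this on synthetic data only; the effect sizes, the single regressor `x`, and the value of `true_beta` are assumptions made for illustration and are not taken from the study's Stata output. The simulated state and year effects are built to be correlated with the regressor, so the pooled estimate is biased while the two-way fixed effects estimate should land close to the simulated coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)
n_states, n_years = 50, 7
n_obs = n_states * n_years
state = np.repeat(np.arange(n_states), n_years)   # state index per observation
year = np.tile(np.arange(n_years), n_states)      # year index per observation

# Hypothetical unobserved effects (assumptions for illustration only).
state_effect = rng.normal(0, 3000, n_states)[state]   # constant within a state
year_effect = rng.normal(0, 500, n_years)[year]       # common to all states in a year

# The regressor is correlated with both unobserved effects.
x = 0.005 * state_effect - 0.002 * year_effect + rng.normal(0, 2, n_obs)
true_beta = -190.0
y = true_beta * x + state_effect + year_effect + rng.normal(0, 200, n_obs)

def slope_on_x(y, X):
    """OLS coefficient on the first column of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[0]

const = np.ones(n_obs)
# Dummy columns; the first state and first year are the omitted reference groups.
state_dummies = (state[:, None] == np.arange(1, n_states)).astype(float)
year_dummies = (year[:, None] == np.arange(1, n_years)).astype(float)

pooled = slope_on_x(y, np.column_stack([x, const]))
state_fe = slope_on_x(y, np.column_stack([x, const, state_dummies]))
two_way_fe = slope_on_x(y, np.column_stack([x, const, state_dummies, year_dummies]))

print("pooled OLS:        ", round(pooled, 1))
print("state FE only:     ", round(state_fe, 1))
print("state and time FE: ", round(two_way_fe, 1))
```

In this simulation, state dummies alone remove the bias from time-invariant state differences but not from year-specific shocks shared by all states, which is the role the time fixed effects play in model (5).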
IV. Results
Table 3. Regression Results

Cells report the coefficient with the t-statistic in parentheses. Blank cells indicate that the variable is not included in the model.

| Independent variable | (1) Model 1 | (2) Model 2 | (3) Model 3 | (4) Model 4 | (5) Model 5 |
|---|---|---|---|---|---|
| Public housing | 289.8 (9.46)*** | -169.6 (-4.93)*** | -142.0 (-2.76)** | -186.4 (-1.64) | -194.0 (-3.51)** |
| Population | | 2.174 (11.04)*** | 2.945 (8.85)*** | 0.722 (0.70) | 0.889 (0.85) |
| Rent | | | 1.685 (2.86)** | 0.000 (0.00) | 0.857 (0.77) |
| Spending | | | 0.127 (0.66) | 0.363 (2.68)** | 0.385 (2.45)* |
| Expenditures | | | -0.774 (-3.39)** | -0.661 (-3.74)*** | -0.736 (-3.75)*** |
| All programs | | | 4.044 (0.40) | -2.652 (-0.33) | -0.599 (-0.07) |
| Poverty | | | -10.74 (-0.14) | -57.89 (-0.76) | -58.22 (-0.72) |
| Unemployment | | | 213.5 (1.87) | 44.61 (0.61) | -79.43 (-0.61) |
| State Fixed Effects | No | No | No | Yes | Yes |
| Time Fixed Effects | No | No | No | No | Yes |
| Observations | 350 | 350 | 336 | 336 | 336 |
| R² | 0.283 | 0.751 | 0.782 | 0.976 | 0.977 |

Note: *Significant at the 0.05 level. **Significant at the 0.01 level. ***Significant at the 0.001 level.
The coefficients on all explanatory variables and their t-statistics are reported in Table 3. The public housing variable is statistically significant in Model 1, Model 2, Model 3, and Model 5.
In Model 1, the simplest bivariate regression model, the coefficient on public housing is highly significant and has a positive sign. Because Model 1 does not control for population size, this is not surprising: larger states tend to have higher levels of both homelessness and public housing than states with smaller populations.
In Model 2, when the population variable is added, the coefficient on public housing turns negative and is highly significant. With 1,000 more units of public housing available, the number of homeless persons is expected to decrease by roughly 170, holding population constant.
After more variables are included in Model 3, public housing still has a negative and significant coefficient. Yet the significance level and the absolute value of the coefficient drop slightly, suggesting that Model 2 overestimates the impact of public housing on reducing homelessness. Now a 1,000-unit increase in the level of public housing is associated with 142 fewer homeless persons, conditional on all control variables. Among the seven control variables, population, public housing rent, and state government expenditures on public welfare are significant.
In Model 4, differences between states such as climate conditions, geographic locations, and demographic characteristics are controlled for through the state fixed effects. This model shows that a 1,000-unit within-state increase in public housing is associated with a within-state decrease of approximately 186 in the number of homeless persons, holding all other explanatory variables constant. However, the coefficient on public housing is not statistically significant at the 0.05 level, although it still has a negative sign.
In addition to differences across states, Model 5 also controls for factors that differ across time periods. The 2009 recession, for instance, had a nationwide influence on the relationship we are trying to estimate. On one hand, the economic downturn reduced average wages and raised the unemployment rate, making more people homeless. On the other hand, governments with tight budgets had fewer resources for the development of public housing programs. After controlling for factors such as the changes the Great Recession brought to all states, the coefficient on public housing turns significant again. The public housing variable now has a coefficient of -194.0, which means that a 1,000-unit within-state increase in the available units of public housing is associated with a within-state decrease of roughly 194 in the number of homeless persons, holding all control variables and yearly changes in nationwide factors constant. In both fixed effects models, average federal spending per public housing unit and state government expenditures on public welfare are the only two significant control variables. Additionally, the public welfare expenditures variable is the only variable that remains significant in Model 3, Model 4, and Model 5, suggesting that increased government expenditures on public welfare are likely to reduce homelessness significantly.
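Because Table 3 reports t-statistics rather than standard errors, the implied standard error of each public housing coefficient can be backed out as the coefficient divided by its t-statistic. The sketch below does this and forms rough 95 percent intervals. The critical value of 1.96 is an assumption (a large-sample normal approximation); the degrees of freedom and the type of standard errors used in the original estimation are not stated, so these intervals are only indicative.

```python
# Coefficient and t-statistic on public housing, copied from Table 3.
public_housing = {
    "Model 1": (289.8, 9.46),
    "Model 2": (-169.6, -4.93),
    "Model 3": (-142.0, -2.76),
    "Model 4": (-186.4, -1.64),
    "Model 5": (-194.0, -3.51),
}

Z_95 = 1.96  # assumed normal critical value, not taken from the paper

def approx_interval(coef, t_stat, z=Z_95):
    """Return (implied standard error, lower bound, upper bound)."""
    se = abs(coef / t_stat)
    return se, coef - z * se, coef + z * se

intervals = {name: approx_interval(c, t) for name, (c, t) in public_housing.items()}

for name, (se, lower, upper) in intervals.items():
    print(f"{name}: implied SE {se:.1f}, approx. 95% interval [{lower:.1f}, {upper:.1f}]")
```

Under this approximation, the interval for Model 4 includes zero while the interval for Model 5 lies entirely below zero, which matches the significance pattern reported in Table 3.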
V. Discussion
Since the signs and the significance of the coefficients on public housing follow expectations in all the multivariate and fixed effects models except Model 4, the results indicate that public housing does have a significant effect on reducing homelessness. Model 5 is the most compelling of the models used in this study because it best isolates the role of public housing that we are interested in. Through the inclusion of relevant explanatory variables and state and time fixed effects, the internal validity of the results is greatly improved, and the likely causality between the available units of public housing and the size of the homeless population is more convincing. Unfortunately, because of several remaining limitations of this study, it is still inappropriate to conclude that public housing has a clear causal relationship with homelessness.
Weakness of fixed effects methods: We can only make causal inferences in the absence of omitted variable bias. Yet the fixed effects model fails to control for differences between states that vary over time, which can still create bias. For example, the 2009 recession might have affected different states differently: the economies of some states may have been weakened more severely than those of others. If the size of the homeless population changed at the same time as states changed other policies, we might instead be measuring the impact of those other policies. There will always be other unobserved factors that could explain the relationship found in this study.
Weakness of data used: In order to assess the role of public housing in reducing homelessness, levels of the homeless population should be measured accurately. The HUD estimates of the number of the homeless, which are used as the dependent variable in this study, have been criticized as undercounts of the true number of the homeless (Appelbaum et al., 1991). If the sample can hardly represent the population, the validity of the results is in doubt. In addition, it would be more appropriate to use city-level data, if possible, instead of state-level data, because the vast majority of the homeless population is concentrated in urban areas, especially in large cities. Moreover, sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. A sample size of 336 cannot give us strong statistical power.
Policy implications: The results of this study suggest that public housing programs do mitigate homelessness to some extent. However, for policy makers dealing with real-world problems, the negative and significant relationship between public housing and homelessness does not necessarily imply that expanding the current public housing programs will solve the problem of homelessness efficiently. To assess the effectiveness of public housing programs, it is important not only to focus on the significance level but also to learn about the specific costs and benefits of the programs. According to the regression results, an increase of 1,000 public housing units would reduce the number of homeless persons by less than 200. Are the benefits worth the costs? Is public housing more effective at controlling the size of the homeless population than other programs? Answers to these kinds of questions are needed to identify the right policies to end homelessness.
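A rough sense of the cost side can be obtained by combining the Model 5 coefficient from Table 3 with the mean of the Spending variable from Table 2. The calculation below is only a back-of-the-envelope sketch under strong assumptions: it treats average federal spending per unit per year as the full annual cost of an additional unit (ignoring construction, state and local spending, and tenant rent), and it treats the Model 5 coefficient as if it were a causal effect. All variable names are illustrative.

```python
# Inputs taken from the paper's tables.
coef_model5 = -194.0               # Table 3, Model 5: change in homeless count per 1,000 units
mean_spending_per_unit = 5391.37   # Table 2: mean federal spending per unit per year (USD)

# Hypothetical policy experiment (assumption): add 1,000 public housing units.
units_added = 1_000

fewer_homeless = -coef_model5 * (units_added / 1_000)
annual_federal_cost = units_added * mean_spending_per_unit
cost_per_person = annual_federal_cost / fewer_homeless

print(f"Estimated reduction in homeless count: {fewer_homeless:.0f}")
print(f"Annual federal spending on added units: ${annual_federal_cost:,.0f}")
print(f"Implied annual spending per person no longer homeless: ${cost_per_person:,.0f}")
```

A figure of this kind would still need to be compared with the corresponding cost of alternative programs before drawing any conclusion about relative effectiveness.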
Appendix
Part 1. Graphs
Part 2. Stata Outputs
References
Appelbaum, Richard P., Michael Dolny, Peter Dreier and John I. Gilderbloom. (1991). Scapegoating Rent Control: Masking the Causes of Homelessness. Journal of the American Planning Association, No. 57, pp. 153-164
Early, Dirk W. (1998, Autumn). The Role of Subsidized Housing in Reducing Homelessness: An Empirical Investigation Using Micro-Data. Journal of Policy Analysis and Management, Vol. 17, No. 4, pp. 687-696
Quigley, John M. (1990, Winter). Does Rent Control Cause Homelessness? Taking the Claim Seriously. Journal of Policy Analysis and Management, Vol. 9, No. 1, pp. 89-93
Troutman, William Harris, John D. Jackson and Robert B. Ekelund Jr. (1999, January). Public Policy, Perverse Incentives, and the Homeless Problem. Public Choice, Vol. 98, No. 1/2, pp. 165-212
Vakili-Zad, Cyrus. (2004, Fall). Housing or Dehousing? The Public Housing Waiting List, Eviction, and the Homeless in Toronto, Canada. Journal of Affordable Housing & Community Development Law, Vol. 14, No. 1, pp. 63-81