Petra L. Klumb and Thomas Lampert
Berlin University of Technology, Ernst-Reuter-Platz 1, H 8, TU, Berlin 10587, Germany
Available online 21 November 2003.
Abstract
In this research synthesis, we summarize 161 measures of the effects of women's employment on well-being reported between 1950 and 2000. Variations in the conceptualization and measurement of employment and health outcomes and the difficulty in distinguishing social selection from social causation limit the inferences that can be drawn from the evidence. Therefore, we distinguish two types of studies. Longitudinal studies measuring relevant covariates at the first measurement occasion and statistically controlling them in multivariate analyses providing effect-size information are classified as Type II studies. The remaining studies are classified as Type I studies. The main findings were that (1) results from methodologically sound Type II studies confirm the cross-sectional finding that paid employment has no adverse effects on women; (2) the outcome groups psychological distress, subjective health, cardiovascular risks and disease, and mortality do not converge completely.
Author Keywords: Germany; Review; Women's health; Gender; Employment; Well-being
Article Outline
• Introduction
• Conceptual and methodological issues
• Choice of predictors
• Choice of outcome measures
• Potential moderators
• Confounded measures
• Causation vs. selection
• Method
• Literature searches
• Criteria for inclusion of Type I studies
• Recorded variables
• Study characteristics
• Criteria for identification of Type II studies
• Summary
• Results
• The impact of employment on psychological distress
• The impact of employment on general physical morbidity
• The impact of employment on cardiovascular risk and disease
• The impact of employment on mortality
• Discussion
• Methodological implications
• Theoretical implications
• Acknowledgements
• References
• Further Reading
Introduction
In many industrialized countries, such as the US, the participation of women in the labor force increased markedly across the 20th century ([Bureau of Labor Statistics (2000)]). In other countries, such as Germany, the participation rate of women remained fairly constant across the 20th century, but women from younger cohorts became more likely to work full-time and to return to the labor market between or after childbirth ([Lennon & Rosenfield (1992)]). These historical changes provided a unique opportunity for researchers to investigate the effects of multiple roles on women's well-being. Consequently, there was a growing interest in the impact of (the additional role of) paid employment after World War II. Despite numerous efforts, however, the understanding of the underlying processes is still rather limited. This review attempts to summarize 50 years of research on the impact of employment on women's physical and mental health. Because of their importance in evaluating the evidence, issues of conceptualization, measurement, and study design will be discussed first.
Conceptual and methodological issues
Two competing models guide research on the effects of multiple roles: the stress hypothesis (also known as role strain or scarcity; [Goode (1960)]) and the enhancement hypothesis (also known as role expansion or accumulation; [Maas, Borchelt, & Mayer (1999)]; [Seeman & McEwen (1996)]). Because social roles are associated with daily demands and hassles as well as with major life events, the stress hypothesis holds that an additional role reduces an individual's well-being. At the same time, role involvement provides access to resources that may be instrumental in coping with demands. According to the enhancement hypothesis, therefore, individuals with an additional role should be in better health than individuals without this role, despite the additional demands and stressors it entails ([Gove & Tudor (1973)]; [Karasek & Theorell (1990)]; [Ross and Mirowsky (1995)]). Other models encompass further properties of the living situation, individual expectations, and societal norms (job-stress model, person-environment fit model (P-E fit), role quality model, gender model). However, the majority of researchers have based their investigations on role theory and have chosen the variables of interest accordingly.
Choice of predictors
Researchers usually operationalize the exposure or treatment variable as employment status. Women who are employed at the time of the study (sometimes including those looking for a job) are compared to those who are not in paid employment. As many researchers have pointed out, this dichotomy lacks reliability, i.e., it is temporally unstable, and may lack validity, i.e., it does not capture interindividual differences in labor market participation very well ([Pampel & Zimmer (1989)]; [Reitzes & Mutran (1994)]). The comparison category may be made up of such different groups as permanent homemakers, women on maternity leave, women who are temporarily or permanently ill or disabled, unemployed women looking for jobs, and students. Because of the fragmentation of female labor market participation, characterized by moves in and out of the labor force as well as by changes of occupation or career paths, assessing employment status at a single point in time seems to be a suboptimal way of capturing women's work experiences (but see [Watson & Pennebaker (1989)], who reported no substantial gain “by knowing whether a worker in 1981 was also in that category in 1971”). Therefore, some researchers draw on time-use information (daily, weekly, study-period, or life-time level) to quantify role involvement ([Adelmann, Antonucci, Crohan, & Coleman (1990)]; [Bird & Fremont (1991)]; [Hall (1992)]; [Klumb and Perrez (under review)]; [Martikainen (1995)]; [Pampel & Zimmer (1989)]). Alternatively, the number of enacted roles may be used as a predictor. This predictor has been represented by different kinds and numbers of categories (e.g., [Pavalko & Smith (1999)] distinguished 5 roles; [Reisine, Fifield and Winkelman (1998)] distinguished 12 roles). Some perspectives focus on subjective experiences and expectations. For example, Barnett and Baruch's concept of role quality is operationalized via the balance of rewards and concerns afforded by a role ([Barnett & Baruch (1985)]), while in the P-E fit approach, expectations and preferences are compared to actual role involvement ([Verbrugge (1989)]; [Weatherall, Joshi, & Macran (1994)]). Objective working conditions are rarely considered in this area (but see [Haertel, Heiss, Filipiak, & Doering (1992)]).
Choice of outcome measures
Different research traditions tend to prefer specific outcome measures for conceptual, methodological, and historical reasons. In the initial studies on the effects of women's employment, measures of psychological distress were used as dependent variables (see [Warr and Parry (1982a)], for a classification). This was a result of the attention attracted by the sex differential in depression, which was ascribed to differences in the rewards provided by the traditional sex roles ([Pietromonaco, Manis, & Frohardt-Lane (1986)]; [Rook (1984)]; [Ross and Mirowsky (1995)]).
Large surveys usually include ratings of respondents’ general physical health or complaints ([Bartley, Sacker, Firth, & Fitzpatrick (1999)]; [Verbrugge (1989)]). Researchers in the tradition of the job-stress model usually prefer biological outcome variables such as neuroendocrine changes or cardiovascular risk factors; in large surveys, however, they have to rely on self-reports ([Emslie, Hunt, & Macintyre (1999)]; [Kotler & Wingard (1989)]). Researchers with a background in demography or social epidemiology have related employment to prospectively collected mortality information ([Marks (1977)]). Convergence or dissociations within or between these groups of measures have rarely been established in primary research (but see, e.g., [Bekker, de Jong, Zijlstra, & van Landeghem (2000)]). Nevertheless, a comparison of different outcomes and their time courses is of theoretical as well as practical importance.
Potential moderators
Rather than asking whether employment is adverse or beneficial per se (stress vs. enhancement), the question seems to be under which conditions it becomes adverse or beneficial. The same roles can have different effects depending on the prevailing conditions or the context ([Waldron and Jacobs (1989)]). The role-substitution hypothesis suggests that roles yielding similar resources may have a substitutive relationship, such as that of employment and marriage. Taking on an additional substitutive role will be of little additional benefit to health. The role-complementation hypothesis suggests that specific role combinations, such as that of parenthood and marriage, have complementary or synergistic effects. In these combinations, the benefit from one role is enhanced by the other.
Beyond the influences of employment, marital, and parental status, exposure to stressors and the availability of resources vary as a function of age, occupation, and/or occupational status ([Moen, Dempster-McClain, & Williams (1992)]).
Conflicts between roles change over the life course, as do the underlying demands and resources. Middle-aged women, blue-collar workers, and those at the lower end of the status hierarchy are thought to be most exposed to stressors while having few resources available ([Arber, Gilbert, & Dale (1985)]). Consequently, it may be that only middle-class women experience a positive relationship between employment and well-being. Alternatively, working-class women may benefit more from employment because their non-occupational environment is less favorable than that of middle-class women ([Warr and Parry (1982a)]).
Confounded measures
If all the variables of interest are assessed via self-report, the association between predictors and criteria may be based in part on shared method variance or other biases ([Warr & Parry (1982b)]). Partialling the prior from the subsequent measure of an outcome variable has been proposed as a way of controlling for the effects of stable third variables such as negative affectivity ([Sieber (1974)]). Another way of controlling biases is to use multiple measures of predictors or outcomes.
Causation vs. selection
Since the seminal studies by [Hibbard & Pope (1991)], most researchers have seemed to assume a causal link between employment and well-being.
However, the statistical associations reported in the literature (mostly based on posttest-only designs with nonequivalent groups) do not allow causal inferences to be drawn because alternative explanations cannot be ruled out ([Cook & Campbell (1979)]). Prior interindividual differences in health may lead to differential acquisition or relinquishment of roles. This phenomenon is known as reverse causation or the healthy-worker effect ([Jennings, Mazaik, & McKinlay (1984)]; [Waldron & Herold (1986)]; [Waldron, Herold, & Dunn (1982a)]). Employment status and health may also be the result of indirect selective processes, the antecedents of which (e.g., education) may or may not have been measured. This phenomenon is referred to as gravitation or drift. Most likely, there is a reciprocal relationship, i.e., selective and causative processes operate simultaneously and interactively ([Waldron, Herold, & Dunn (1982a)]).
The impact of employment has been investigated with many different study designs and analytical strategies. A small number of studies have employed macro-level data, i.e., data with some geopolitical entity as the unit of analysis ([Moen & Yu (1999)]). The usual design, however, is an observational study based on individuals. Observational studies may be cross-sectional or longitudinal; most take a cross-sectional approach.
Most authors seem to implicitly assume that longitudinal studies would confirm their correlational findings. Some authors have been able to demonstrate that beneficial effects persist above and beyond selection ([Vagerö and Lahelma (1998)]; [Verbrugge (1983)]; [Weatherall, Joshi, & Macran (1994)]), whereas others have not ([Jahoda, Lazarsfeld, & Zeisel (1975)]). The latter have argued that the higher prevalence of chronic health conditions among homemakers can be accounted for almost completely by the proportion of women who report ill health as a reason for not being employed.
For causal inferences to be warranted, a longitudinal design alone is not sufficient. The minimum requirements for a rigorous test of the causal effect of employment are as follows: (a) observing employment status (or better: employment history), (b) assessing all relevant confounding factors (related to both employment and health), (c) assessing subsequent change in health, and (d) controlling the confounding factors statistically when regressing change in health on employment status (for elaborations of the causation issue, see [Cook & Campbell (1979)]; [White (1959)]). Because very few studies fulfill all of these requirements, it remains to be demonstrated whether methodologically sound longitudinal studies confirm the evidence generated by less rigorous approaches.
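Requirements (a) to (d) can be illustrated with a minimal Python sketch. It simulates a purely hypothetical cohort in which healthier women are more likely to be employed (selection) and employment has a small assumed effect on later health. All names, the sample size, and the coefficients are illustrative assumptions, not values taken from any reviewed study. The sketch contrasts a posttest-only group comparison with a regression of change in health on employment status that controls for baseline health.

```python
import random
from statistics import fmean

TRUE_EFFECT = 0.2  # assumed causal effect of employment (illustrative only)


def simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line y = a + b*x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(y, x):
    """Part of y that is not linearly explained by x."""
    slope, intercept = simple_ols(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def simulate(n=5000, seed=0):
    """Synthetic cohort: baseline health drives both employment and later health."""
    rng = random.Random(seed)
    baseline = [rng.gauss(0, 1) for _ in range(n)]
    # Selection: women in better baseline health are more likely to be employed.
    employed = [1.0 if b + rng.gauss(0, 1) > 0 else 0.0 for b in baseline]
    followup = [0.8 * b + TRUE_EFFECT * e + rng.gauss(0, 1)
                for b, e in zip(baseline, employed)]
    return baseline, employed, followup


def naive_difference(employed, followup):
    """Posttest-only comparison of employed vs. non-employed women."""
    emp = [f for e, f in zip(employed, followup) if e == 1.0]
    non = [f for e, f in zip(employed, followup) if e == 0.0]
    return fmean(emp) - fmean(non)


def adjusted_effect(baseline, employed, followup):
    """Coefficient of employment when change in health is regressed on
    employment and baseline health (computed by partialling out baseline)."""
    change = [f - b for f, b in zip(followup, baseline)]
    change_res = residuals(change, baseline)
    employed_res = residuals(employed, baseline)
    slope, _ = simple_ols(employed_res, change_res)
    return slope


baseline, employed, followup = simulate()
naive = naive_difference(employed, followup)
adjusted = adjusted_effect(baseline, employed, followup)
print(f"posttest-only difference: {naive:.2f}")
print(f"adjusted for baseline:    {adjusted:.2f}")
```

Under these assumptions, the unadjusted comparison overstates the employment effect because it absorbs prior health differences, whereas the adjusted coefficient should lie close to the assumed value. Confounders that are not measured at baseline would still bias the adjusted estimate.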
The above reasoning can be condensed into the following questions: (a) Do methodologically sound longitudinal studies confirm the cross-sectional findings? (b) Do different outcome measures yield converging results? In order to answer these questions, for each group of outcome measures we summarized the methodologically sound longitudinal studies that conformed to the minimum requirements (“Type II studies”) and compared their results to those of the remaining studies (“Type I studies”).
Method
Literature searches
The aim of the literature search was to identify all relevant empirical studies on the impact of employment on women's health conducted between January 1950 and December 2000. In addition to published studies, we also tried to locate unpublished documents. We aimed to conduct as precise a search as possible, with “precision” referring to the ratio of relevant documents found to all those retrieved ([Wethington & Kessler (1989)]). Our search activities comprised three main strategies. The first was a computer-based search of major abstract databases (Medline, PsycINFO, and Sociofile) in which we combined title keywords from three domains: (1) gender, sex, women, female, wife, housewife, homemaker, mother, or maternal; (2) employment, work, occupation, or job; and (3) health, well-being, depression, disease, illness, morbidity, mortality, cardiovascular, or endocrine. The second strategy consisted of following up references in journal articles, book chapters, and books on employment and women's health or related topics. Again, we searched for documents with at least one keyword from each domain in the title. The third strategy was to browse through journals and books in order to retrieve relevant documents not detected by the previous two strategies. We looked through journals considered to be particularly relevant: the American Journal of Epidemiology, the American Journal of Public Health, the Journal of Health and Social Behavior, the Journal of Personality and Social Psychology, Social Science & Medicine, and Women & Health. In addition, we contacted researchers active in the relevant areas and inquired about unpublished work.
To find further unpublished research papers and reports, we used additional resources provided by various institutions and organizations; e.g., the Abstract Newsletter published by the National Technical Information Service (NTIS), an agency of the US Department of Commerce, the Social Science Literature Information System (SOLIS) produced by the German Informationszentrum Sozialwissenschaften (Social Science Information Center), and the Direct Access to Information (DATRIX II), an online retrieval system containing citations to dissertations and masters’ theses. We located a total of about 400 documents representing the pool of studies from which those to be included in our analyses were then selected.
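The three-domain title search and the notion of precision described above can be sketched as follows. The keyword lists are taken from the text; the Boolean query syntax, the function names, and the whole-word matching rule are assumptions made for illustration and do not reproduce the syntax of any particular database.

```python
import re

# Keyword domains as listed in the text.
DOMAINS = {
    "gender": ["gender", "sex", "women", "female", "wife", "housewife",
               "homemaker", "mother", "maternal"],
    "employment": ["employment", "work", "occupation", "job"],
    "health": ["health", "well-being", "depression", "disease", "illness",
               "morbidity", "mortality", "cardiovascular", "endocrine"],
}


def build_title_query(domains):
    """Combine terms with OR within a domain and AND across domains
    (hypothetical generic syntax)."""
    groups = []
    for terms in domains.values():
        groups.append("(" + " OR ".join(f'"{t}"' for t in terms) + ")")
    return " AND ".join(groups)


def title_matches(title, domains):
    """True if the title contains at least one keyword from every domain.
    Matching on whole lowercase words is a simplifying assumption."""
    words = set(re.findall(r"[a-z]+(?:-[a-z]+)*", title.lower()))
    return all(any(term in words for term in terms)
               for terms in domains.values())


def precision(n_relevant_retrieved, n_retrieved):
    """Ratio of relevant documents found to all documents retrieved."""
    if n_retrieved == 0:
        raise ValueError("no documents retrieved")
    return n_relevant_retrieved / n_retrieved


print(build_title_query(DOMAINS))
print(title_matches("Women, work, and well-being", DOMAINS))
```

Requiring one keyword from each domain narrows the result set, which tends to raise precision at the cost of missing studies whose titles omit one of the domains; the reference follow-up and journal browsing strategies compensate for that.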
Criteria for inclusion of Type I studies
In order to be included in the synthesis, a study had to (a) be a primary study with empirical findings based on quantitative methods of data collection and analysis, (b) include employment indicators as predictors and mental or physical health variables as criteria, (c) observe women aged between 16 and 68 years (studies on both sexes were only taken into account if the data were reported separately for women and men; results on older women were considered if information on middle-aged women was also available), (d) report an effect size and/or provide information about the statistical significance of the results (in the absence of an effect size, information about statistical significance in combination with the direction of the effect enabled us to apply vote-counting procedures). In the case of non-independence of studies, i.e., several reports based on the same sample population, we included only non-redundant independent and outcome measures. Study quality was not evaluated in this step. Overall, 140 articles reporting 153 studies met our criteria for inclusion.
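The inclusion criteria (a) to (d) and the vote-counting fallback can be expressed as a small filter. The following Python sketch is only an illustration: the `Study` record, its field names, and the three vote labels are hypothetical and are not the coding scheme actually used in the synthesis.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Study:
    # Hypothetical coding of one study; field names are assumptions.
    is_primary_quantitative: bool            # criterion (a)
    has_employment_predictor: bool           # criterion (b), predictor side
    has_health_outcome: bool                 # criterion (b), outcome side
    women_16_to_68_reported_separately: bool  # criterion (c)
    effect_size: Optional[float] = None      # criterion (d)
    significant: Optional[bool] = None       # criterion (d), fallback
    direction: Optional[int] = None          # +1 beneficial, -1 adverse


def meets_inclusion_criteria(s):
    """Apply criteria (a) to (d); study quality is not evaluated here."""
    return (s.is_primary_quantitative
            and s.has_employment_predictor
            and s.has_health_outcome
            and s.women_16_to_68_reported_separately
            and (s.effect_size is not None or s.significant is not None))


def vote(s):
    """Classify a study for vote counting from significance and direction.
    Returns None when the reported information is insufficient for a vote."""
    if s.significant is None:
        return None
    if not s.significant:
        return "null"
    if s.direction is None or s.direction == 0:
        return None
    return "beneficial" if s.direction > 0 else "adverse"


def vote_count(studies):
    """Tally votes over all studies that meet the inclusion criteria."""
    votes = (vote(s) for s in studies if meets_inclusion_criteria(s))
    return Counter(v for v in votes if v is not None)
```

A study that reports only an effect size is included but casts no vote in this sketch, since vote counting relies on significance combined with the direction of the effect.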
Recorded variables
Study characteristics
We recorded the database, the measurement occasions, and the country in which the data were collected. Furthermore, we coded the type of employment indicator and the type of outcome variables used (psychological distress, general physical morbidity, neuroendocrine reactivity, cardiovascular risk and disease, and mortality), participants’ age, sex, marital and parental status, as well as their education and occupational status, if available. In addition, we recorded the sampling procedure and response rate, study design, and statistical analysis performed. For longitudinal studies, we recorded whether confounding variables were controlled for and whether alternative hypotheses such as reverse causation were tested. Moreover, the year and type of publication (journal article, book chapter, monograph, research report), the disciplinary background, and the sex predominant in the author group were coded. Finally, we recorded whether the main hypotheses were deduced from theoretical models, and which models were concerned (role theory, person-environment fit, job-stress model, gender model).
Most of the journal articles were published in public health journals (85, or 57%), followed by psychological and sociological journals (19, or 13%, and 26, or 17%, respectively). Few studies were published before 1980, but the number of relevant publications grew rapidly during the 1980s, peaking in 1989 and 1992. After these peaks, the number of articles and book chapters began to decline again.
Criteria for identification of Type II studies
In a first step, we selected all studies with more than one measurement occasion (including prospective mortality analyses; N=32). For a study to be included in the Type II sample, outcome and relevant covariates also had to be measured at the first measurement occasion and statistically controlled in multivariate analyses providing effect-size information. It emerged that 27 of the longitudinal studies controlled for the outcome assessed at the first measurement occasion; 21 of these also controlled in some way for other confounding variables. After excluding studies that (a) were redundant (N=2), (b) reported exclusively effects of change in employment status on health (N=3), (c) were restricted to a specific illness (N=1), and (d) did not report effect sizes (N=1), 13 studies remained that met the Type II criteria. With such small numbers of studies, central tendencies lack robustness because of large confidence intervals. Therefore, we decided to display each result separately instead of reporting central tendencies. Both authors assessed the 17 effect sizes of the Type II studies (intra-class correlation coefficient = 0.99) and discussed discrepancies until consensus was reached.
Summary
All 13 Type II studies were published in 1982 or later and 11 of them presented findings from the United States. The non-response rate was low in all studies. Four of the studies concentrated exclusively on married women. Information with regard to education, occupational status, and the number and age of children was scarce. Table 1 displays the synopsis of study characteristics.
Table 1. Description of the Type II studies (N=13)