Moral Hazard, Adverse Selection and Health Expenditures: A Semiparametric Analysis

Health Expenditure Analysis: Moral Hazard & Adverse Selection

Document information

Author: Patrick Bajari
Instructor/editor: Amy Finkelstein
School: National Bureau of Economic Research (NBER), University of Minnesota, Duke University
Subject/major: Economics, Health Economics
Place: Cambridge, MA
Document type: Working Paper
Language: English
Format: PDF
Size: 1.35 MB

Summary

I. Abstract

This research empirically investigates the effects of asymmetric information on health insurance markets, focusing on the distinction between adverse selection and moral hazard. Using data from the Health and Retirement Study (HRS) and a two-step semiparametric estimation strategy, the study finds significant evidence of moral hazard but not of adverse selection in the demand for health insurance and medical care.

1. Theoretical Predictions and Empirical Challenges

The abstract begins by highlighting the predictions of theoretical models regarding asymmetric information in health insurance markets. These models suggest that adverse selection and moral hazard can lead to inefficient outcomes. A key challenge emphasized is the difficulty previous empirical research has faced in disentangling the effects of adverse selection from those of moral hazard within the healthcare system. This inherent difficulty in separating these two influential factors underscores the novelty of the current research, which aims to address this significant methodological hurdle.

2. Empirical Approach and Data Source

The core methodology of the study is introduced: an empirical investigation using data from the Health and Retirement Study (HRS). This data source is crucial because it provides the information needed to construct a structural model of demand for health insurance and medical care. The study uses a two-step semiparametric estimation strategy designed to analyze the interplay of factors in health insurance markets. This approach allows for a more nuanced understanding than previous methods, which struggled to distinguish between adverse selection and moral hazard.

3. Key Findings

The abstract concludes by presenting the study's main findings. Applying the two-step semiparametric estimation strategy to the HRS data, the research finds significant evidence of moral hazard. However, a notable and potentially counterintuitive result is the lack of evidence of adverse selection. This outcome suggests a more complex dynamic in health insurance markets than some existing theoretical models imply, and it opens avenues for further research.

II. Introduction

A substantial theoretical literature predicts inefficient outcomes in insurance markets due to adverse selection and moral hazard. However, empirically distinguishing these effects has proven challenging. This paper proposes a new approach using a structural model to analyze the demand for health insurance and medical utilization, allowing for the simultaneous presence of both adverse selection and moral hazard.

1. Existing Theoretical Framework and its Limitations

The introduction establishes the foundation of the research by referencing a substantial body of theoretical literature that predicts market inefficiencies due to asymmetric information. This literature posits that adverse selection and moral hazard are key drivers of these inefficiencies. The introduction acknowledges the extensive work done by several prominent researchers (Arrow, Pauly, Akerlof, Zeckhauser, Spence, Rothschild, Stiglitz, Wilson) who have contributed to this theoretical framework, highlighting their specific contributions to the understanding of adverse selection (e.g., Rothschild and Stiglitz's demonstration of inefficient rationing by insurers) and moral hazard (e.g., Pauly's explanation of how consumers' lack of full cost-bearing can lead to moral hazard). The inherent complexity of insurance markets, and the difficulty of isolating the relative importance of these two factors, are clearly established as central challenges within the existing theoretical landscape.

2. The Need for Empirical Investigation and Existing Methodological Shortcomings

The introduction emphasizes the critical need for empirical work to assess the relative importance of adverse selection versus moral hazard. It argues that theoretical models often oversimplify the situation by focusing on one distortion at the expense of the other. This simplification has significant policy implications, as optimal policy will differ depending on whether moral hazard or adverse selection is the dominant force. The introduction then discusses existing empirical approaches for detecting asymmetric information, pointing out their limitations. A common method of examining the correlation between risk outcomes and contract generosity is noted, but its inability to distinguish between adverse selection and moral hazard is highlighted, citing the work of Chiappori and Salanié (2003). Other approaches, like exploiting the dynamic consequences of experience rating (Abbring et al., 2003) or using observable characteristics unutilized by insurers (Finkelstein and Poterba, 2006) are mentioned, but their limitations within the context of US health insurance regulation are also acknowledged.

3. Proposed Alternative Approach: A Structural Model

The introduction concludes by presenting the paper's proposed solution: a novel approach based on estimating a structural model of consumer demand for health insurance and medical utilization. This structural model incorporates both adverse selection and moral hazard. The paper's method is explicitly contrasted with existing approaches, highlighting its advantages. It overcomes the limitations of previous empirical strategies by using a semiparametric approach, allowing for the simultaneous estimation of both adverse selection and moral hazard while explicitly accounting for the endogeneity of health insurance plan choices and prices. This innovative methodology promises to significantly advance the understanding of these phenomena in health insurance markets.

III. Data

The study uses data from Wave 3 (1996) of the Health and Retirement Study (HRS), a nationally representative sample of Americans born between 1931 and 1941. The analysis includes information on health insurance plans, medical expenditures, reimbursements, respondent location, and household income. After cleaning and trimming, the dataset contains 4,412 observations. For those under 65, the insurance categories are employer-provided, Veterans Administration/Champus, self-employed, privately purchased, and uninsured; for those 65 and older, they include Medicare with or without Medigap.

1. Data Source: The Health and Retirement Study (HRS)

The data for this study come from Wave 3 (1996) of the Health and Retirement Study (HRS). The HRS is described as a nationally representative sample of men and women born between 1931 and 1941, and their spouses or partners. The initial 1992 wave included 12,652 individuals from 7,607 households, with oversampling of Black, Hispanic, and Florida residents. Wave 3 was chosen for its data on out-of-pocket and total medical expenditures. The publicly available data are supplemented with confidential data on location of residence, which are used to construct instrumental variables. The specific variables used are detailed in Appendix I, with summary statistics presented in Table 1. The HRS is a key strength of the study: it provides a nationally representative dataset for analyzing health insurance choices and the associated expenditure patterns.

2. Defining the Insurance Choice Set

The study carefully defines the set of health insurance choices (D) available to individuals, acknowledging the structure of the US health insurance market and the significant role of Medicare for those 65 or older. For those under 65, the choice set comprises employer-provided insurance, Veterans Administration/Champus insurance, self-employed insurance, privately purchased insurance, and the option of being uninsured. For those 65 and over, the choices include employer-provided, VA/Champus, self-employed insurance, and Medicare (with or without supplemental Medigap insurance). The study clarifies that the terms ‘insurance plan’ and ‘insurance category’ are used interchangeably, acknowledging the existence of various plans within each category. However, data limitations prevent a more fine-grained analysis of the specific plans themselves. This careful delineation of insurance categories is critical for the analysis to accurately capture the variation in insurance options within the US system.

3. Data Cleaning and Sample Selection

The data underwent significant cleaning and trimming. Observations with missing data on household income, insurance premiums, or out-of-pocket medical expenditures were removed. Observations where out-of-pocket medical expenditure exceeded total medical expenditure were also dropped. Further exclusions were made for cases where household income was less than the sum of the household insurance premium and out-of-pocket medical expenditure. Finally, observations where the insurance category was Medicaid were eliminated. These trimming steps resulted in a final sample of 4,412 observations, approximately 3% fewer than the initial sample. The process removes internally inconsistent observations that could bias the econometric analysis. The exclusion of Medicaid cases should also be noted, as it may limit the generalizability of the results.
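The four trimming rules described above can be sketched as a simple filter. This is only an illustration under stated assumptions: the field names (`income`, `premium`, `oop`, `total`, `category`) and the sample records are hypothetical and are not HRS variable names or HRS data.

```python
# Field names and records below are hypothetical placeholders, not HRS variables.

def keep_observation(obs):
    """Return True if a record survives the four trimming rules in the text."""
    # Rule 1: drop records missing income, premium, or out-of-pocket spending.
    if any(obs.get(key) is None for key in ("income", "premium", "oop")):
        return False
    # Rule 2: out-of-pocket spending cannot exceed total spending.
    # (This sketch assumes total spending is always recorded.)
    if obs["oop"] > obs["total"]:
        return False
    # Rule 3: income must cover the premium plus out-of-pocket spending.
    if obs["income"] < obs["premium"] + obs["oop"]:
        return False
    # Rule 4: Medicaid records are excluded.
    if obs["category"] == "medicaid":
        return False
    return True


sample = [
    {"income": 40000, "premium": 2000, "oop": 500, "total": 3000, "category": "employer"},
    {"income": None, "premium": 1500, "oop": 200, "total": 900, "category": "private"},
    {"income": 30000, "premium": 1000, "oop": 900, "total": 400, "category": "private"},
    {"income": 1000, "premium": 800, "oop": 500, "total": 2000, "category": "uninsured"},
    {"income": 25000, "premium": 0, "oop": 100, "total": 5000, "category": "medicaid"},
]

cleaned = [obs for obs in sample if keep_observation(obs)]
```

In this toy sample only the first record passes; each of the other four violates exactly one rule, in the order the rules are listed.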

IV. Estimation

A two-step semiparametric estimator is developed to recover the parameters of the utility function. The first step nonparametrically identifies the co-payment schedule using a local linear estimator. The second step uses instrumental variables (including state housing price index, county malpractice insurance costs, and county establishment counts) to estimate utility function parameters and the distribution of latent health status. The identification strategy relies on the validity of utility maximization but not on parametric assumptions about the reimbursement schedule or the distribution of latent health status.

1. Two-Step Semiparametric Estimation Strategy

The estimation strategy is a two-step semiparametric approach designed to estimate the parameters of a utility function without relying on strong parametric assumptions about the distribution of latent health. The key is the non-parametric identification of both the co-payment rate and the distribution of health status using the consumer's optimality condition. This allows for a flexible modeling of the relationship between health utilization and insurance plan characteristics. The first step uses a local linear estimator to nonparametrically estimate the co-payment schedule, and the second step utilizes instrumental variables to estimate the utility function parameters and recover the distribution of latent health shocks. The semiparametric nature is emphasized as an advantage, minimizing a priori restrictions while maintaining a parametric convergence rate.
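As a rough illustration of the first step, the sketch below fits a local linear regression at a single point: it returns both the estimated level of the schedule and its gradient there. The Gaussian kernel, the bandwidth, and the toy concave data are all assumptions made for illustration; the summary does not state which kernel or bandwidth the authors use.

```python
import math


def local_linear(m_obs, y_obs, m0, bandwidth):
    """Local linear fit at m0: returns (level, slope).

    Minimizes sum_i K((m_i - m0) / h) * (y_i - a - b * (m_i - m0))**2
    with a Gaussian kernel K (an assumption of this sketch), using the
    closed-form solution of the 2x2 weighted least squares system.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for m, y in zip(m_obs, y_obs):
        d = m - m0
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # kernel weight
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    level = (s2 * t0 - s1 * t1) / det  # estimate of z(m0)
    slope = (s0 * t1 - s1 * t0) / det  # estimate of z'(m0)
    return level, slope


# Toy data: total spending m and a concave out-of-pocket schedule (hypothetical).
spending = [float(m) for m in range(1, 21)]
out_of_pocket = [2.0 * math.sqrt(m) for m in spending]

level, slope = local_linear(spending, out_of_pocket, m0=10.0, bandwidth=3.0)
```

A local linear fit is convenient here because the slope coefficient is a direct estimate of the gradient of the schedule, which is the marginal co-payment rate that enters the consumer's optimality condition.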

2. Identification Strategy and Instrumental Variables

The identification strategy hinges on the specification of the utility function and the assumption of utility maximization. It does not depend on specific statistical assumptions regarding the reimbursement schedule or the distribution of health status. Instrumental variables are used to address endogeneity and to provide exogenous variation in the characteristics of health insurance plans. The instruments are geographic variation in the state-level housing price index, county-level malpractice insurance costs (from the Medicare Payment Advisory Commission's Geographic Practice Cost Index, GPCI), and the number of establishments in a county. The crucial identifying assumption is that these instruments are independent of the distribution of latent health status. Under this assumption, the instruments capture variation in the price of healthcare relative to other goods without being correlated with unobserved health differences among individuals. The instruments are justified by their potential to reflect county-level variation in insurance costs, possibly including effects such as defensive medicine.
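The role of the independence assumption can be seen in a textbook linear example. The sketch below is not the paper's estimator (which uses nonlinear moment conditions); it is a single-instrument IV calculation on constructed data in which an unobserved variable, standing in for latent health, affects both the price regressor and the outcome. All variable names and numbers are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b)) / len(a)


# Constructed data: the instrument is uncorrelated with latent health by design.
instrument = [-1.0, -1.0, 1.0, 1.0]      # e.g. a local cost shifter
latent_health = [-1.0, 1.0, -1.0, 1.0]   # unobserved by the econometrician
price = [z + u for z, u in zip(instrument, latent_health)]

TRUE_EFFECT = 2.0
outcome = [TRUE_EFFECT * p + 3.0 * u for p, u in zip(price, latent_health)]

# OLS slope is contaminated by the correlation between price and latent health.
ols_estimate = covariance(price, outcome) / covariance(price, price)

# IV slope uses only the variation in price that is driven by the instrument.
iv_estimate = covariance(instrument, outcome) / covariance(instrument, price)
```

On this constructed data the OLS slope is 3.5 while the IV slope recovers the true effect of 2.0, because the instrument is orthogonal to the unobserved term by construction.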

3. Addressing Data Limitations and Alternative Estimation Methods

The estimation procedure acknowledges the presence of zero out-of-pocket medical expenditures in the data, indicating that some individuals did not use medical care. To address this, a median-based moment condition is used instead of a mean-based condition, offering robustness to censoring at the upper and lower tails of the conditional distribution. This median-based approach, along with the two-step method of moments, avoids the need for an explicit selection equation for health insurance plan choice. The estimation strategy is compared to conventional two-stage least squares estimation: rather than relying on a reduced-form linear specification, it uses the optimality condition for a risk-averse consumer to derive the functional form of the demand equation. The authors highlight that their semiparametric approach is a significant departure from previous research, which often relied on ad hoc parametric assumptions about the distribution of private information.

V. Estimates of Model Parameters

The estimated coefficient of relative risk aversion is 0.85 for aggregate consumption and 1.52 for health status, indicating greater risk aversion regarding health. The utility weight on health status relative to aggregate consumption is 1.37. The median latent health shock is $3,994 (1996 dollars). A new measure of moral hazard, calculated as the elasticity of total medical expenditure with respect to the co-payment rate, has a median of -0.21, consistent with the RAND HIE study: a 1% increase in the co-payment rate is associated with a 0.21% decrease in total medical expenditure. However, elasticities vary substantially across individuals and insurance categories. For example, self-employed individuals show higher elasticity than privately insured individuals.

1. First-Stage Results: Co-payment Schedule

The section begins by presenting the results from the first stage of the two-step semiparametric estimation. This stage is the local linear regression of the co-payment schedule, z(m). Summary statistics from Table 2 are discussed, showing the average out-of-pocket expenditure for different insurance categories. Privately purchased insurance serves as the omitted category, with an average out-of-pocket expenditure of $1,676 (1996 dollars). The analysis finds that uninsured individuals have higher average out-of-pocket expenses ($1,333) than those with employer-provided insurance. Figures 1 and 2 graphically illustrate the estimated co-payment schedule and its gradient. The concave shape of the co-payment schedule and the non-negative gradient are highlighted as important for the subsequent estimation procedure. The non-linearity of the co-payment schedule supports the decision to use a nonparametric framework for flexible modeling of agents' co-payment rates.

2. Second-Stage Results: Utility Parameters and Risk Aversion

The second stage of the estimation focuses on the utility parameters, providing estimates of risk aversion and of the utility weight on health relative to aggregate consumption. Table 3 presents these key findings. The study reports being the first to estimate the coefficient of relative risk aversion separately for health and aggregate consumption. The coefficient of relative risk aversion is estimated to be 0.85 for aggregate consumption and 1.52 for health, indicating that individuals are considerably more risk-averse concerning their health status. The utility weight on health status is 1.37, showing that individuals value health more than aggregate consumption. The monetary value of the median latent health shock is estimated to be $3,994 (1996 dollars). These results are compared to existing estimates in the consumption literature (Zeldes, Shea, Hansen and Singleton, Gourinchas and Parker) and are found to be in line with them, which lends further support to the estimated model.

3. Moral Hazard Measurement and Comparison with Existing Literature

This subsection introduces a new measure of moral hazard, calculated as the elasticity of total medical expenditure with respect to changes in co-payment rates. This measure is presented as more general than those previously used in the literature (Manning et al., 1987). Using the estimated utility parameters and the distribution of latent health status, the study computes this elasticity for each individual. Table 4 summarizes these elasticities and reports a median of -0.21: a 1% increase in co-payment rates is associated with a 0.21% drop in total medical expenditure. This finding is compared with results from the RAND Health Insurance Experiment (RAND HIE) (Manning et al., 1987; Newhouse, 1993) and is consistent with them, which increases confidence in the study's results and methodology.
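The interpretation of the median elasticity can be made concrete with a short calculation. The sketch below applies the usual first-order approximation (percent change in spending equals elasticity times percent change in the co-payment rate). The baseline spending of $4,000 and the 10% co-payment change are hypothetical inputs chosen for illustration; only the -0.21 figure comes from the text.

```python
MEDIAN_ELASTICITY = -0.21  # median value reported in the text


def pct_change_in_spending(elasticity, pct_change_in_copay):
    """First-order approximation: pct change in spending = elasticity * pct change in co-pay."""
    return elasticity * pct_change_in_copay


def spending_after_change(spending, elasticity, pct_change_in_copay):
    """Apply the approximate percent change to a baseline spending level."""
    pct = pct_change_in_spending(elasticity, pct_change_in_copay)
    return spending * (1.0 + pct / 100.0)


# Hypothetical illustration: $4,000 baseline spending, co-payment rate up by 10%.
new_spending = spending_after_change(4000.0, MEDIAN_ELASTICITY, 10.0)
```

Under these assumed inputs, spending falls by about 2.1%, from $4,000 to roughly $3,916. The approximation is only reliable for small changes, and the text notes that individual elasticities vary substantially around the median.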

VI. Discussion of Adverse Selection

The study analyzes the distribution of latent health shocks across insurance categories using a Kolmogorov-Smirnov test. While descriptive statistics suggest some differences in health status across categories (e.g., uninsured individuals appear healthier), the test fails to reject the null hypothesis of equal distributions. This finding challenges the notion of a separating equilibrium in the health insurance market, where individuals sort themselves into plans based on their private information about their health status. The lack of significant sorting based on adverse selection is partially explained by the fact that at high levels of expenditure, insured consumers receive similar reimbursements irrespective of plan choice.

1. Measuring Latent Health Shocks and their Distribution

This section describes how the latent health shock (θ) for each individual is recovered using equation 5. A higher value of θ, measured in 1996 dollars, indicates poorer health. The authors highlight this as a novel contribution: quantifying latent health types within a health insurance contract model. Table 7 presents the distribution of latent health status overall and conditional on insurance category. The overall median health shock is reported as a benchmark for comparison across insurance categories. The discussion then turns to the distribution of latent health shocks across insurance categories, noting that the median level and the risk (measured by the interquartile range) vary considerably. For instance, uninsured individuals exhibit the best median health status and lowest risk, while those on VA/Champus plans show the worst median health status and highest risk. Substantial variation within each insurance category is also noted.

2. Testing for Adverse Selection: Results and Interpretation

The core of this section is the test for adverse selection using the nonparametrically estimated distributions of latent health shocks. A Kolmogorov-Smirnov test is employed to assess whether the distributions differ significantly across insurance categories. Despite some suggestive descriptive statistics (e.g., uninsured individuals appearing healthier, those in VA/Champus appearing less healthy), the test results, summarized in Table 8, do not reject the null hypothesis of equal distributions across categories. Figures 5 and 6 show these distributions graphically (densities and cumulative distribution functions, respectively). The figures show substantial variation in health shocks within each insurance category, and some categories seem to have a higher proportion of individuals with poor health, but the differences are not statistically significant and so do not confirm adverse selection. This finding contradicts models in which health status drives the choice of insurance plan.
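For readers unfamiliar with the test, the sketch below computes a two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical distribution functions) and an asymptotic p-value from the Kolmogorov series. It is a generic illustration, not the authors' code: the paper's exact test variant is not described in this summary, and the two samples are made-up numbers standing in for latent health shocks in two insurance categories.

```python
import bisect
import math


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # Both ECDFs are step functions, so the supremum is attained at a data point.
    for x in a_sorted + b_sorted:
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_pvalue_asymptotic(d, n, m, max_terms=100000):
    """Asymptotic p-value: 2 * sum_{k>=1} (-1)**(k-1) * exp(-2 * k**2 * lam**2)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam == 0.0:
        return 1.0
    total = 0.0
    for k in range(1, max_terms + 1):
        term = math.exp(-2.0 * k * k * lam * lam)
        total += term if k % 2 == 1 else -term
        if term < 1e-15:  # remaining terms are negligible
            break
    return min(1.0, max(0.0, 2.0 * total))


# Made-up samples standing in for latent health shocks in two categories.
category_a = [1200.0, 2500.0, 3100.0, 3994.0, 4800.0, 6100.0, 7500.0, 9000.0]
category_b = [900.0, 2700.0, 3300.0, 4100.0, 5200.0, 5900.0, 8100.0, 9400.0]

d_stat = ks_statistic(category_a, category_b)
p_value = ks_pvalue_asymptotic(d_stat, len(category_a), len(category_b))
```

A large p-value means the null hypothesis of equal distributions is not rejected, which is the pattern the text reports for Table 8. The asymptotic formula is a rough approximation for samples as small as the toy ones here; a library routine with exact small-sample handling would be preferable in practice.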

3. Explaining the Lack of Evidence for Adverse Selection

The section concludes by discussing the implications of the lack of significant adverse selection. It explores the relationship between observable characteristics (age, race, self-reported health status, education) and the latent health shocks (Table 9). Although older and white individuals and those with self-reported poor health are associated with larger health shocks, the observable characteristics only explain 5% of the latent health status variation. This limited explanatory power suggests substantial unobservable heterogeneity, which might be expected to drive adverse selection. However, the authors show that there is little evidence to suggest sorting between plans based on health. The authors propose that high health expenditures, when health insurance is most crucial, lead to comparable levels of reimbursement across plans, thus reducing incentives for individuals to sort based on their health status. This explanation suggests a limitation of existing theoretical models that assume a separating equilibrium driven primarily by adverse selection.

VII. Conclusion

The study finds evidence of moral hazard but not adverse selection in the health insurance market. The results do not support a model of separating equilibrium, highlighting the importance of further theoretical research to explain the nonlinearities in co-payment schedules and the observed consumer behavior. The lack of significant evidence for adverse selection suggests that consumers do not sort themselves into plans as efficiently as predicted by some existing theoretical models. This challenges conventional wisdom regarding the role of adverse selection in health insurance markets.

1. Summary of Findings: Moral Hazard vs. Adverse Selection

The conclusion succinctly summarizes the key findings of the study. A semiparametric estimation procedure was used to analyze a model of health insurance demand and medical utilization, accounting for both adverse selection and moral hazard due to asymmetric information. The analysis provides strong evidence for the presence of moral hazard in health insurance plans. However, the study's results reveal a lack of evidence supporting the existence of adverse selection in the examined health insurance contracts. This absence of evidence for adverse selection is a significant finding, contrasting with the expectations of many theoretical models of insurance markets. The authors emphasize that this result indicates a considerable gap between theoretical models and actual observed consumer behavior in the context of health insurance markets.

2. Implications for the Separating Equilibrium Model

The conclusion directly addresses the implications of the findings for the theoretical model of separating equilibrium in insurance markets. The research's results do not support this model, where consumers are expected to sort themselves into different insurance plans based on their private information about their health status. This failure to find support for the separating equilibrium model suggests that the typical assumptions of these models may not accurately reflect reality in the health insurance market. The authors suggest that this discrepancy between the theoretical prediction and the empirical evidence necessitates further research into the factors driving consumer choices and plan design.

3. Future Research Directions

The conclusion highlights potential avenues for future research stemming from the study's findings. One important area highlighted is the need to explore and explain the non-linearity observed in co-payment schedules. This non-linearity suggests a more complex relationship between consumer behavior and the structure of insurance plans than many existing models account for. Additionally, the lack of support for the separating equilibrium model invites further theoretical work to refine and improve our understanding of how asymmetric information affects health insurance markets and consumer decision-making. The authors acknowledge limitations in the present study, such as the inability to incorporate heterogeneity in risk preferences due to data constraints, suggesting this as another direction for future research to explore.