Measuring Health Preference
This page provides a brief overview of measuring Quality-Adjusted Life Years (QALYs) and the VR-36/VR-12. For more complete information on preference measurement, see the HERC Guidebook: Preference Measurement in Economic Analysis (2007) or the HERC cyber seminar "Introduction to effectiveness, patient preferences, and utilities" from March 30, 2022.
To compare the value of one treatment with another, an outcome measure that works across different health states is needed. This rules out disease-specific quality of life measures as they only reflect a specific disease or illness.
In addition, the outcome measure must work as a measure of preference. A preference measure captures not only health status but also how the individual values their current health state. The valuation usually spans the spectrum from full health to death.
To date, the quality-adjusted life year (QALY) is the preferred metric for estimating health effects (Gold et al., 1996; Neumann et al., 2016). QALYs are estimated by multiplying each life year gained with an intervention by a quality-weighting factor that reflects the individual's quality of life in the health state for that year. Utilities, measured on a scale from zero (death) to one (perfect health), can be used as quality-weighting factors. Values below zero are possible for states considered worse than death. Neumann et al. (2016) provide further details on the QALY.
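As a minimal sketch of the calculation described above, the following Python snippet sums quality-weighted years for two hypothetical utility profiles. The function name `total_qalys`, the durations, and the utility values are illustrative assumptions, not figures from any study, and discounting of future years is deliberately left out.

```python
def total_qalys(profile):
    """Sum years * utility over a list of (years, utility) pairs.

    Utilities above 1 are rejected; negative utilities are allowed
    because some states may be valued as worse than death.
    """
    total = 0.0
    for years, utility in profile:
        if years < 0:
            raise ValueError("years must be non-negative")
        if utility > 1:
            raise ValueError("utility cannot exceed 1 (perfect health)")
        total += years * utility
    return total


# Hypothetical profiles: (years spent in state, utility of that state)
usual_care = [(2.0, 0.60), (1.0, 0.55)]
intervention = [(2.0, 0.75), (2.0, 0.70)]

# Incremental QALYs of the intervention relative to usual care
qalys_gained = total_qalys(intervention) - total_qalys(usual_care)
print(round(qalys_gained, 2))
```

In this sketch the intervention adds both length of life (four years instead of three) and quality of life (higher utilities), and the QALY difference reflects both at once.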
There are different ways to derive quality weights. The easiest is to use published reports and league tables. Besides Medline, a great resource for this is the Tufts Cost-Effectiveness Analysis Registry.
If existing utility weights do not meet your needs, you may need to collect weights. In doing so, sampling issues should be considered carefully (see Brazier et al., 2016 and Gold et al., 1996). The estimation of the QALY weights for a given period (i) and treatment (k) requires the successful completion of two tasks:
- Measuring the impact of an intervention on the distribution of health states. This task requires that the health states influenced by the treatment are completely characterized.
- Assessing the preferences (utilities) for these alternative health states.
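One common way to combine the two tasks above is to weight each health state's utility by the share of people in that state during the period. The Python sketch below assumes this probability-weighted approach; the function name, the state labels, and all numbers are hypothetical.

```python
def expected_quality_weight(state_shares, state_utilities):
    """Probability-weighted mean utility for one period and one treatment.

    state_shares: dict mapping health state -> share of people in it (sums to 1)
    state_utilities: dict mapping health state -> utility of that state
    """
    if abs(sum(state_shares.values()) - 1.0) > 1e-9:
        raise ValueError("state shares must sum to 1")
    return sum(share * state_utilities[state]
               for state, share in state_shares.items())


# Task 1 (hypothetical): distribution of health states under a treatment
shares = {"well": 0.50, "moderate": 0.30, "severe": 0.15, "dead": 0.05}
# Task 2 (hypothetical): preferences (utilities) for those states
utilities = {"well": 1.0, "moderate": 0.7, "severe": 0.4, "dead": 0.0}

weight = expected_quality_weight(shares, utilities)
print(round(weight, 2))
```

Repeating the calculation for each period and each treatment gives the set of quality weights needed to compare treatments.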
This two-step estimation process can be done with different methods:
- Indirect utility assessment
- Direct utility assessment, using one of the following:
  - Rating scales
  - Standard gamble (SG)
  - Time tradeoff (TTO)
A key distinction between the methods is how they handle risk. The standard gamble makes the respondent consider the risk of death. The time tradeoff method asks the person to consider a tradeoff with years of life. Some researchers regard the TTO as cognitively easier than the SG, although this remains unsettled. The preferred mode of administration for SG and TTO methods is the in-person interview, but considerable work is under way to develop and test computer and Internet administration methods (Lipman, 2021). Given the logistical complexities, many researchers turn to rating scales. Rating scales, however, do not require respondents to consider risk. There is controversy over how important it is to factor risk into the valuation of health states (see Gold et al., 1996, p. 118).
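To make the difference between the two direct methods concrete, the Python sketch below shows how a utility is conventionally read off a respondent's indifference point. It assumes the simplest textbook case, a chronic health state that the respondent prefers to death; the function names and example values are hypothetical, and real elicitation protocols involve iterative questioning that is not shown here.

```python
def tto_utility(years_full_health, years_in_state):
    """Time tradeoff: the respondent is indifferent between living
    `years_in_state` in the health state and a shorter
    `years_full_health` in full health."""
    if years_in_state <= 0:
        raise ValueError("years_in_state must be positive")
    if not 0 <= years_full_health <= years_in_state:
        raise ValueError("years_full_health must lie between 0 and years_in_state")
    return years_full_health / years_in_state


def sg_utility(prob_full_health):
    """Standard gamble: the respondent is indifferent between the health
    state for certain and a gamble giving full health with probability
    `prob_full_health` and death otherwise."""
    if not 0 <= prob_full_health <= 1:
        raise ValueError("probability must lie between 0 and 1")
    return prob_full_health


# Hypothetical respondent: 7 healthy years judged equal to 10 years in the
# state, and a gamble accepted only at an 85% chance of full health
print(tto_utility(7, 10), sg_utility(0.85))
```

The two values differ for the same hypothetical respondent, which is consistent with the observation that the methods usually yield different utility weights.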
It is also important to note that these methods usually yield different utility weights. This has led some people to use multiple methods.
The "Veterans RAND 36-Item Health Survey" (VR-36, formerly the Veterans SF-36) was developed from the original RAND version of the 36-Item Health Survey version 1.0 (also known as the MOS SF-36), created at the RAND Corporation as part of the Medical Outcomes Study. The VR-12 ("Veterans RAND 12-Item Health Survey," formerly the Veterans SF-12) was derived from the VR-36. While the names of these assessment tools have changed, the content of the instruments has not.
There is no cost to use the VR-36 and VR-12. To request access to the VR-36 and VR-12, visit the Boston University School of Public Health website and click on the link for 'Request Access Now.'
Is any one assessment tool better than the others?
VR-36/VR-12 vs. SF-36/SF-12
The VR instruments use five-point response choices for seven items in the VR-36 and four items in the VR-12. Response choices that were originally dichotomous (a two-point yes/no choice) are now five-point response choices: "no, none of the time", "yes, a little of the time", "yes, some of the time", "yes, most of the time" and "yes, all of the time". These answers then contribute to the scales for role limitations due to physical and emotional problems. Expanding these scales in the VR instruments has resulted in a reduction of floor and ceiling effects, with important gains in the scales’ distributional properties and increases in reliability and validity. The VR-36 and VR-12 also include two additional items to assess physical and emotional health change, in contrast to the single general change item in the SF-36.
VR-12/SF-12 vs. VR-36/SF-36
The SF-12 is a shorter alternative to the SF-36, but it reproduces the eight-scale profile with fewer levels than the SF-36 and produces less precise scores. Because confidence intervals for group averages in health scores are largely determined by sample size, these differences are less important for large group studies. The SF-12 improves efficiency and lowers cost for both profiles and summary scales, and is most appropriate for use in large samples of general and specific populations as well as large longitudinal studies of health outcomes. Selim and colleagues (2011) have also developed the VR-6D algorithm, which computes health state utilities from the VR-12. Utilities, or preference-based scores, reflect the values placed on health states and are essential for cost-effectiveness analysis.
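The point about sample size can be illustrated with a short Python sketch that computes the half-width of a 95% confidence interval for a group mean under a normal approximation. The function name and the two standard deviations (a slightly larger one standing in for the shorter, less precise form) are hypothetical assumptions chosen only for illustration.

```python
import math


def ci_half_width(score_sd, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a group
    mean, using the normal approximation z * sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * score_sd / math.sqrt(n)


# Hypothetical score standard deviations for a longer and a shorter form
sd_long_form = 10.0
sd_short_form = 11.0

for n in (50, 500, 5000):
    long_hw = ci_half_width(sd_long_form, n)
    short_hw = ci_half_width(sd_short_form, n)
    # The absolute gap between the two forms shrinks as n grows
    print(n, round(long_hw, 2), round(short_hw, 2), round(short_hw - long_hw, 2))
```

Under these assumptions both intervals narrow with the square root of the sample size, so the absolute precision penalty of the shorter form becomes small in large studies.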
For a list of studies in which the VR-36 or VR-12 was used to measure health-related quality of life, please refer to the 'Published Articles' section here. Iqbal et al. (2009) also provide an overview of the VR-12 and a summary of earlier articles using this measure.
There are some limitations in using these assessment tools. Researchers interested in understanding these limitations should consult Kazis (2004a), Kazis (2004b), Keller (1999), Rose (2008), Selim (2006), and Wilson (2000). Researchers interested in exploring assessment tools that aim to capture general wellbeing, as opposed to just health, may want to read more about the ICEpop CAPability measure (ICECAP) from the University of Birmingham.
Brazier J, Ratcliffe J, Salomon J, Tsuchiya A. Measuring and valuing health benefits for economic evaluation. 2nd Ed. Oxford: Oxford University Press; 2016.
Gold MR, Siegel JE, Russell LB, Weinstein MC. Cost-effectiveness in health and medicine. New York: Oxford University Press; 1996. See p. 285 et seq.
Gyrd-Hansen D, Sogaard J. Discounting life-years: whither time preference? Health Econ 1998; 7:121-7.
Iqbal SU, Rogers W, Selim A, Qian S, Lee A, Ren XS, Rothendler JD, Miller D, Kazis LE. The Veterans RAND 12 Item Health Survey (VR-12): what it is and how it is used. Center for Health Quality, Outcomes, and Economic Research, A Health Services Research and Development Center of Excellence, VA Medical Center, Bedford, MA, USA.
Kamlet MS. A framework for cost-utility analysis of government healthcare programs: Office of Disease Prevention and Health Promotion, Public Health Service, U.S. Department of Health and Human Services; 1992.
Kazis LE, Miller DR, Clark JA, Skinner KM, Lee A, Ren XS, et al. Improving the response choices on the veterans SF-36 health survey role functioning scales: Results from the Veterans Health Study. J Ambul Care Manage. 2004 Jul-Sep;27(3):263-80.
Kazis LE, Lee A, Spiro A 3rd, Rogers W, Ren XS, Miller DR, et al. Measurement comparisons of the medical outcomes study and veterans SF-36 health survey. Health Care Financing Review. 2004 Summer;25(4):43-58.
Kazis LE, Selim A, Rogers W, Ren XS, Lee A, Miller DR. Veterans RAND 12-Item Health Survey (VR-12): A White Paper Summary. Unpublished manuscript. https://www.researchgate.net/publication/237314426_Veterans_RAND_12_Item_Health_Survey_VR12_A_White_Paper_Summary.
Keller SD, Ware JE, Jr., Hatoum HT, Kong SX. The SF-36 Arthritis-Specific Health Index (ASHI): II. Tests of validity in four clinical trials. Med Care. 1999 May;37(5 Suppl):MS51-60.
Lipman SA. Time for tele-TTO? Lessons learned from digital interviewer-assisted time trade-off data collection. Patient. 2021;14:459-469.
Neumann PJ, Sanders GD, Russell LB, Siegel JE, Ganiats TG. Cost-effectiveness in health and medicine. 2nd Ed. New York: Oxford University Press; 2016.
Selim AJ, Berlowitz D, Fincke G, et al. Use of risk-adjusted change in health status to assess the performance of integrated service networks in the Veterans Health Administration. Int J Qual Health Care. 2006 Feb;18(1):43-50.
Selim AJ, Rogers W, Fleishman JA, et al. Updated U.S. population standard for the Veterans RAND 12-item Health Survey (VR-12). Qual Life Res. 2009 Feb;18(1):43-52.
Selim AJ, Rogers W, Qian SX, Brazier J, Kazis LE. A preference-based measure of health: the VR-6D derived from the veterans RAND 12-Item Health Survey. Qual Life Res. 2011 Oct;20(8):1337-47.
Sintonen H. The 15D instrument of health-related quality of life: properties and applications. Ann Med. 2001 Jul;33(5):328-36.
The SF-12: An Even Shorter Health Survey: Version 2.0. SF-36.org Web Site.
Torrance GW, Feeny D. Utilities and quality-adjusted life years. Int J Technol Assess Health Care. 1989;5(4):559-75.
Wilson D, Parsons J, Tucker G. The SF-36 summary scales: problems and solutions. Soz Praventivmed. 2000;45(6):239-46.
Last updated: May 12, 2023