British Medical Journal (BMJ), 2004, issue 5
Comparability of self rated health: cross sectional multi-country survey using anchoring vignettes
     1 Department of Population and International Health, Center for Population and Development Studies, Harvard School of Public Health, 9 Bow Street, Cambridge, MA 02138, USA, 2 Harvard University Global Health Initiative, 104 Mt. Auburn Street, Cambridge, MA 02138, USA

    Correspondence to: J A Salomon jsalomon@hsph.harvard.edu

    Introduction

    Valid, reliable, and comparable measures of health are critical components of the evidence base for clinical practice and health policy. Clinical trials and national surveys rely heavily on self reported measures of health,1-5 but interpretation of these measures is complicated by incomparability when different people understand and respond to a given question in different ways. Paradoxical findings have been reported in many analyses of population health surveys, suggesting that self reported measures may be misleading without adjustment for these differences.6-9

    Distinguishing between differences in self ratings due to actual health differences and differences due to varying norms or expectations for health is a key challenge in interpreting self reported measures of health.10 11 We may conceptualise different dimensions of health—for example, mobility, cognition, vision—as continuous but unobserved scales. Each available response to a categorical question corresponds to a range of values on the scale that may vary across individuals (fig 1). Differing expectations for health can lead to differences in the levels at which people change from using one response category to the next—that is, differences in response category cut points. For example, a 90 year old man who struggles to climb the stairs might characterise himself as having "mild difficulties" in moving around, but a 40 year old man with the same mobility might describe himself as having "moderate difficulties." These responses are incomparable because the individuals have different response category cut points for questions about mobility.

    Fig 1 Self assessment: how much difficulty do you have in moving around? The problems of interpersonal or cross population comparability may be conceptualised in terms of shifts in response category cut points. Different people (A, B, and C) might translate levels on an unobserved, continuous mobility scale into categorical responses in different ways, depending on the location of their cut points. Cut points define thresholds on the unobservable scale at which individuals move from one response category to another
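    The idea of person specific cut points in fig 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 0-100 "difficulty" scale, the numeric thresholds, and the function name `categorical_response` are hypothetical and do not come from the survey.

```python
from bisect import bisect_right

CATEGORIES = ["none", "mild", "moderate", "severe", "extreme"]

def categorical_response(latent_difficulty, cut_points):
    """Map a value on an unobserved difficulty scale to a response category.

    cut_points holds four increasing thresholds that separate the five
    categories; a person switches category when the latent value crosses one.
    """
    if len(cut_points) != len(CATEGORIES) - 1 or list(cut_points) != sorted(cut_points):
        raise ValueError("expected four increasing cut points")
    return CATEGORIES[bisect_right(cut_points, latent_difficulty)]

# Hypothetical cut points on an arbitrary 0-100 scale (assumed, not survey data).
person_a = [20, 40, 60, 80]
person_b = [30, 50, 70, 90]  # more lenient: reports difficulty only at higher levels

same_level = 45  # identical underlying difficulty for both people
print(categorical_response(same_level, person_a))  # moderate
print(categorical_response(same_level, person_b))  # mild
```

    The two people have the same latent level but give different answers, which is the incomparability described above.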

    Strategies for making self reported measures of health more comparable may require new tools for both collecting and analysing survey data.12 Standard models for ordinal data—such as the ordered probit model—do not allow for variation in response category cut points, although these models can be adapted to allow for systematic cut point shifts in relation to covariates such as country, age, and sex.13-16 Anchoring vignettes are a new component of survey instruments that can be used in conjunction with the extended statistical models to position self reported responses on a common interpersonally comparable scale. We describe an application of this strategy from a series of pilot studies for the World Health Survey.17 We give examples of how anchoring vignettes may be used to understand variation in expectations for health and discuss the implications for interpreting self ratings of health.
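    As a rough sketch of what "systematic cut point shifts" mean in an ordered probit setting, the code below computes response probabilities when every cut point is moved by a covariate dependent amount. It is a simplified illustration, not the specification used in the cited models: the latent variable is assumed to be a difficulty score with unit variance, all cut points shift in parallel, and the numeric values and the name `response_probabilities` are hypothetical.

```python
from math import erf, inf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def response_probabilities(mu, base_cut_points, shift=0.0):
    """Ordered probit probabilities for each response category.

    The latent value is assumed Normal(mu, 1). Category k is chosen when the
    latent value lies between cut points k-1 and k. `shift` moves every cut
    point by the same amount, standing in for a covariate effect such as age.
    """
    taus = [-inf] + [t + shift for t in base_cut_points] + [inf]
    return [normal_cdf(hi - mu) - normal_cdf(lo - mu)
            for lo, hi in zip(taus, taus[1:])]

# Hypothetical values: same latent difficulty, different cut points.
BASE_CUT_POINTS = [-1.0, 0.0, 1.0, 2.0]
p_reference = response_probabilities(0.5, BASE_CUT_POINTS)
p_shifted = response_probabilities(0.5, BASE_CUT_POINTS, shift=0.5)

for label, p_ref, p_shift in zip(
        ["none", "mild", "moderate", "severe", "extreme"], p_reference, p_shifted):
    print(f"{label:9s} {p_ref:.3f} {p_shift:.3f}")
```

    With the cut points shifted upwards, the same latent level produces more "none" responses. A standard ordered probit fixes `shift` at zero for everyone, so it would attribute that difference to health itself.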

    Results

    A total of 3012 respondents completed the health survey. The mean age was 41 (standard deviation 15), with a range across countries from 33 (10) in the United Arab Emirates to 49 (15) in China. A total of 1837 (61%) respondents were younger than 45, and 478 (26%) had had less than 6 years of education (table 1). Self assessed mobility ratings varied considerably between countries: the proportion of respondents reporting no difficulties moving around ranged from 45% (249/555) in Sri Lanka to 85% (431/510) in the United Arab Emirates. Of the 3012 respondents, 406 (13.5%) completed the version of the questionnaire that included mobility vignettes.

    Table 1 Distribution of sample used in pilot study of health module for the World Health Survey by age, sex, years of schooling, and country

    Evidence on the consistency of vignette orderings across respondents, and on the internal consistency of each individual's vignette ratings on the two mobility questions, suggests that comprehension of the vignette rating task is good across all sites and that a similar understanding of the levels described in the vignettes prevails (fig 2 and table 2). For the two global comparisons and the internal comparison, about three quarters of responses were completely consistent, and a further 18% to 22% had only one or two rank inconsistencies in each case.

    Fig 2 Distribution of respondents by number of rank inconsistencies in vignette ratings compared with global ordering and internal comparisons between two mobility questions. (Results shown for five vignettes common to all study sites. One rank inconsistency refers to cases in which the ranks of a pair of vignettes are inverted. For example, if vignettes are numbered according to the global ordering, then 12435 would be characterised by one rank inconsistency. Two rank inconsistencies would include cases in which one vignette shifts by two ranks (and displaces two adjacent vignettes accordingly)—for example, 14235—or cases in which two pairs of vignettes have inverted ranks, for example 21354. Because complete orderings may be unobserved when respondents rate more than one vignette in the same response category, we resolved ties in favour of the consistent ordering)
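    The counting rule in the caption to fig 2 can be read as counting inverted pairs relative to the global ordering. The sketch below follows that reading; it is an interpretation of the caption rather than the authors' code, and the function name `rank_inconsistencies` is hypothetical. Ratings are listed in the global vignette order, and tied ratings are not counted, which corresponds to resolving ties in favour of the consistent ordering.

```python
from itertools import combinations

def rank_inconsistencies(ratings_in_global_order):
    """Count pairs of vignettes whose ratings are inverted.

    ratings_in_global_order[i] is the respondent's rating (or rank) for the
    vignette in position i of the global ordering. A pair is inconsistent
    when the earlier vignette gets a strictly higher value than the later
    one. Ties are treated as consistent.
    """
    return sum(1 for earlier, later in combinations(ratings_in_global_order, 2)
               if earlier > later)

# The three orderings used as examples in the caption.
print(rank_inconsistencies([1, 2, 4, 3, 5]))  # 1
print(rank_inconsistencies([1, 4, 2, 3, 5]))  # 2
print(rank_inconsistencies([2, 1, 3, 5, 4]))  # 2

# Categorical ratings with ties (1 = none ... 5 = extreme) count as consistent.
print(rank_inconsistencies([1, 1, 2, 2, 5]))  # 0
```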

    Table 2 Consistency of vignette orderings and average rank correlation coefficients by country. Results are shown for the five vignettes common to all six countries

    Mobility questions in the World Health Survey pilot study

    (Q1) Overall in the last 30 days, how much difficulty did you have with moving around? (a) none; (b) mild; (c) moderate; (d) severe; (e) extreme

    (Q2) In the past 30 days, how much difficulty did you have in vigorous activities, such as running 3 km or cycling? (a) none; (b) mild; (c) moderate; (d) severe; (e) extreme

    Mobility vignettes

    Paul is an active athlete who runs long distance races of 20 km twice a week and plays soccer with no problems

    Mary has no problems with walking; running; or using her hands, arms, and legs. She jogs 4 km twice a week

    Adriana is quite active and does sports twice a week, such as tennis or swimming. Once a month, however, she is too tired for sports so takes a 3 km walk instead

    Rob is able to walk distances of up to 200 m without any problems, but feels tired after walking 1 km or climbing more than one flight of stairs. He has no problems with day to day physical activities, such as carrying food from the market

    Philip goes walking every day for half an hour, 1 km or 2 km. He does not practise any strenuous sports as he feels out of breath when he walks very quickly or runs

    Nathan has attacks of anxiety when he goes out of his house. So he leaves his home only once a week, and never by himself

    Anton does not exercise. He cannot climb stairs or do other physical activities because he is obese. He is able to carry the groceries and do some light household work

    Margaret feels chest pain and gets breathless after walking distances of up to 200 m, but is able to do so without assistance. Bending and lifting objects such as groceries also cause chest pain

    Rina has had a stiff neck for the last 10 days and it makes her move around slowly as any sudden movement causes pain

    Jenny is an adult with an intellectual impairment and she is also obese. She struggles to get out of a chair and moves very slowly

    Louis is able to move his arms and legs, but requires assistance in standing up from a chair or walking around the house. Any bending is painful, and lifting is impossible

    Vincent has a lot of swelling in his legs due to his health condition. He has to make an effort to walk around his home as his legs feel heavy

    Sid suffers from a mental illness and spends his days rocking in a chair. He never moves out of his chair except when physically assisted by another person

    David is paralysed from the neck down. He is confined to bed and must be fed and bathed by somebody else

    Gemma has a brain condition that makes her unable to move. She cannot even move her mouth to speak or smile. She can only blink her eyelids

    Names are included as examples only. Each site developed separate sets of locally appropriate male and female names, and interviewers presented the set of names matched to each respondent's gender.

    The primary purpose of including anchoring vignettes linked to self assessments is to detect and then adjust for differences in response category cut points to make categorical self reports more comparable. As an example of how vignette ratings can reveal differences in cut points that may relate to varying norms and expectations for health, fig 3 shows the distribution of ratings for one mobility vignette in different age groups for the three countries that included this vignette (Myanmar, Pakistan, and Turkey). The Kolmogorov-Smirnov test for equality of distributions confirms significant differences between the youngest and oldest age groups (P = 0.001). This example suggests that older individuals use a more lenient interpretation of the same set of response categories in describing mobility levels, which is consistent with the notion of shifting norms for health over the life course.

    Fig 3 Variation in vignette ratings across age groups in three countries (Myanmar, Pakistan, and Turkey) (N=211). Responses are shown for the question, "[Name] is able to walk distances of up to 200 m without any problems but feels tired after walking 1 km or climbing up more than one flight of stairs. He has no problems with day to day physical activities, such as carrying food from the market. Overall, how much difficulty does [name] have with moving around?"
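    The Kolmogorov-Smirnov comparison of two age groups rests on the largest gap between their cumulative distributions of ratings. The sketch below computes only that statistic. The ratings are invented for illustration (1 = none, 2 = mild, 3 = moderate) and are not the survey data, and the function name `ks_statistic` is hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute difference between the empirical
    cumulative distribution functions of the two samples.
    """
    def ecdf(sample, x):
        return sum(value <= x for value in sample) / len(sample)

    support = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in support)

# Hypothetical ratings of one vignette by two age groups (assumed data).
younger = [1] * 5 + [2] * 10 + [3] * 5
older = [1] * 12 + [2] * 6 + [3] * 2

print(round(ks_statistic(younger, older), 2))  # 0.35
```

    Turning the statistic into a P value needs a reference distribution, and because categorical ratings are heavily tied the usual continuous-data approximation is only approximate. A library routine such as `scipy.stats.ks_2samp` could be used for that step.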

    When survey respondents rate a series of vignettes on a domain, we can summarise the responses in different groups using stacked bar diagrams. For example, fig 4 compares ratings for five mobility vignettes from the samples in China and Sri Lanka. Each stacked bar shows the categorical responses for one vignette, with the vignettes ordered from higher to lower mobility levels based on average categorical scores. In these samples, respondents from Sri Lanka tend to give less favourable ratings than those from China, conditional on the fixed level of mobility described in a vignette. The differences in self rated mobility in the two samples, shown in the top bars of fig 4, may arise from a combination of variation in health experiences and variation in expectations. Given the older sample in China and the results in fig 3, part of the variation in both self assessments and vignette ratings may be explained by age related health norms. Results in these non-probabilistic samples will not necessarily be generalisable to the entire populations in each country but nevertheless provide a useful illustration of the way that ratings of anchoring vignettes can show differences in cut points across populations.

    Fig 4 Mobility ratings for self assessment and selected vignettes, China and Sri Lanka (N=1061 for self ratings, N=151 for vignettes). The survey asked, "How much difficulty did you have with moving around?" The vignettes shown, from left to right, are those labelled as Adriana, Anton, Margaret, Louis, and Gemma in the box
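    The summaries behind a stacked bar diagram such as fig 4 are the share of responses in each category for every vignette, with vignettes ordered by their average categorical score. A minimal sketch follows; the vignette labels and ratings are made up for illustration (1 = none ... 5 = extreme), and the function names are hypothetical.

```python
from collections import Counter

def category_shares(ratings, n_categories=5):
    """Proportion of responses in each category, from 1 to n_categories."""
    counts = Counter(ratings)
    return [counts.get(k, 0) / len(ratings) for k in range(1, n_categories + 1)]

def order_by_mean_score(ratings_by_vignette):
    """Vignette names sorted from lowest to highest average rating."""
    def mean_score(name):
        ratings = ratings_by_vignette[name]
        return sum(ratings) / len(ratings)
    return sorted(ratings_by_vignette, key=mean_score)

# Hypothetical ratings for three vignettes in one sample (assumed data).
ratings_by_vignette = {
    "vignette_x": [1, 1, 2, 2],
    "vignette_y": [3, 4, 4, 5],
    "vignette_z": [2, 2, 3, 3],
}

for name in order_by_mean_score(ratings_by_vignette):
    print(name, category_shares(ratings_by_vignette[name]))
```

    Each printed list is one stacked bar. Computing the same summaries separately for two samples gives the side by side comparison described above.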

    In addition to comparisons within and between countries, vignette ratings may show how cut points for the same person change over time, where longitudinal data are available, or may place the cut points for several questions on the same domain on a common scale. For example, fig 5 shows the ratings for an array of 10 vignettes on the two mobility questions. The figure shows three things. Firstly, the second question is "more difficult" in the sense that it taps a higher level of mobility than the first. Secondly, individuals rate themselves favourably on mobility but recognise, on average, that the top two vignettes describe higher levels than their own. Thirdly, respondents use the available categories similarly in self ratings and vignette ratings: on both, they respond to the second question in a way that accords with its tapping a higher level of difficulty.

    Fig 5 Self assessments and vignette ratings for two mobility questions (Q1: How much difficulty did you have with moving around? Q2: How much difficulty did you have in vigorous activities?). Pooled results are shown from six countries (China, Myanmar, Pakistan, Sri Lanka, Turkey, and United Arab Emirates) (N=3012 for self ratings, N=406 for vignettes). The vignettes shown, from left to right, are those labelled as Paul, Mary, Adriana, Rob, Anton, Margaret, Rina, Louis, Vincent, and David in the box.

    References

    1. Testa MA, Simonson DC. Assessment of quality-of-life outcomes. N Engl J Med 1996;334: 835-40.

    2. Kind P, Dolan P, Gudex C, Williams A. Variations in population health status: results from a United Kingdom national questionnaire survey. BMJ 1998;316: 736-41.

    3. Fischer D, Stewart AL, Bloch DA, Lorig K, Laurent D, Holman H. Capturing the patient's view of change as a clinical outcome measure. JAMA 1999;282: 1157-62.

    4. Shibuya K, Hashimoto H, Yano E. Individual income, income distribution, and self rated health in Japan: cross sectional analysis of nationally representative sample. BMJ 2002;324: 16-9.

    5. Garratt A, Schmidt L, Mackintosh A, Fitzpatrick R. Quality of life measurement: bibliographic study of patient assessed health outcome measures. BMJ 2002;324: 1417.

    6. Murray CJL, Chen LC. Understanding morbidity change. Popul Dev Rev 1992;18: 481-503.

    7. Mathers CD, Douglas RM. Measuring progress in population health and well-being. In: Eckersley R, ed. Measuring progress: is life getting better. Collingwood: CSIRO, 1998: 125-55.

    8. Sen A. Health: perception versus observation. BMJ 2002;324: 860-1.

    9. Sadana R, Mathers CD, Lopez AD, Murray CJL, Iburg KM. Comparative analysis of more than 50 household surveys of health status. In: Murray CJL, Salomon JA, Mathers CD, Lopez AD, eds. Summary measures of population health: concepts, ethics, measurement and applications. Geneva: World Health Organization, 2002.

    10. Carr AJ, Gibson B, Robinson PG. Measuring quality of life: is quality of life determined by expectations or experience? BMJ 2001;322: 1240-3.

    11. Freedman VA, Martin LG. Understanding trends in functional limitations among older Americans. Am J Public Health 1998;88: 1457-62.

    12. Murray CJL, Tandon A, Salomon JA, Mathers CD, Sadana R. Cross-population comparability of evidence for health policy. In: Murray CJL, Salomon JA, Mathers CD, Lopez AD, eds. Summary measures of population health: concepts, ethics, measurement and applications. Geneva: World Health Organization, 2002.

    13. Groot W. Adaptation and scale of reference bias in self-assessments of quality of life. J Health Econ 2000;19: 403-20.

    14. Wolfe R, Firth D. Modelling subjective use of an ordinal response scale in a many period crossover experiment. Appl Stat 2002;51: 245-55.

    15. Tandon A, Murray CJL, Salomon JA, King G. Statistical models for enhancing cross-population comparability. In: Murray CJL, Evans DB, eds. Health systems performance assessment: debates, methods and empiricism. Geneva: World Health Organization, 2003: 727-46.

    16. King G, Murray CJL, Salomon JA, Tandon A. Enhancing the validity and cross-population comparability of measurement in survey research. Am Polit Sci Rev 2004; 98. (In press.)

    17. World Health Organization. World Health Survey. www.who.int/whs (accessed 6 Jan 2004).

    18. Herskovits MJ. The hypothetical situation: a technique of field research. Southwest J Anthropol 1950;6: 32-40.

    19. Anderson HH, Anderson GL. An introduction to projective techniques and other devices for understanding human behavior. Englewood Cliffs: Prentice Hall, 1951.

    20. Walster E. Assignment of responsibility for an accident. J Pers Soc Psychol 1966;3: 73-9.

    21. Rossi PH, Nock SL. Measuring social judgments: the factorial survey approach. Beverly Hills: Sage, 1982.

    22. Koedoot CG, De Haes JC, Heisterkamp SH, Bakker PJ, De Graeff A, De Haan RJ. Palliative chemotherapy or watchful waiting? A vignettes study among oncologists. J Clin Oncol 2002;20: 3658-64.

    23. Goldie J, Schwartz L, McConnachie A, Morrison J. The impact of three years' ethics teaching, in an integrated medical curriculum, on students' proposed behaviour on meeting ethical dilemmas. Med Educ 2002;36: 489-97.

    24. Kelly WF, Eliasson AH, Stocker DJ, Hnatiuk OW. Do specialists differ on do-not-resuscitate decisions? Chest 2002;121: 957-63.

    25. Hughes R, Huby M. The application of vignettes in social and nursing research. J Adv Nurs 2002;37: 382-6.

    26. Tversky A, Kahneman D. The framing of decisions and the psychology of choice. Science 1981;211: 453-8.

    (Joshua A Salomon, assista)