Medical Outcomes Study Short Form 36

The SF-36 is a generic health survey created to assess health status in the general population as part of the Medical Outcomes Study (Ware & Sherbourne 1992). It comprises 36 items drawn from the original 245 items generated by that study (McHorney et al. 1993; Ware & Sherbourne 1992).

Items are organized into eight dimensions or subscales: physical functioning, role limitations due to physical problems, role limitations due to emotional problems, bodily pain, social functioning, general mental health, vitality, and general health perceptions. It also includes one question intended to estimate change in health status over the past year. This question remains separate from the eight subscales and is not scored. With the exception of the general change in health status question, subjects are asked to respond with reference to the past four weeks. An acute version of the SF-36 refers to problems in the past week only (McDowell & Newell 1996).

The recommended scoring system uses a weighted Likert system for each item. Items within subscales are summed to provide a total score for each subscale or dimension. Each of the eight summed scores is linearly transformed onto a scale from 0 to 100 to provide a score for each scale. In addition, a physical component and mental component score can be derived from the scale items. Standardized population data for several countries are available for the SF-36 (McDowell & Newell 1996). The component scores have also been standardized with a mean of 50 and standard deviation of 10 (Finch et al. 2002).
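The linear transformation and the standardization described above can be illustrated with a minimal sketch. It assumes that item responses have already been recoded to their final item values. The function names, the 10-item example scale, and the norm mean and standard deviation are hypothetical and are not taken from the SF-36 manual. The sketch shows only the standardization step (mean of 50, standard deviation of 10) and does not reproduce the published procedure for deriving the physical and mental component scores.

```python
def scale_score(item_scores, lowest_raw, highest_raw):
    """Sum recoded item values and rescale the sum linearly onto 0-100.

    lowest_raw and highest_raw are the lowest and highest sums that the
    scale can produce (hypothetical values in the example below).
    """
    raw = sum(item_scores)
    if not lowest_raw <= raw <= highest_raw:
        raise ValueError("raw score lies outside the possible range")
    return 100.0 * (raw - lowest_raw) / (highest_raw - lowest_raw)


def standardized_score(score, norm_mean, norm_sd):
    """Express a score relative to population norms (mean 50, SD 10)."""
    return 50 + 10 * (score - norm_mean) / norm_sd


# Hypothetical 10-item scale, each item scored 1 to 3 (possible sums 10-30).
responses = [3, 3, 2, 2, 3, 1, 2, 3, 3, 3]
transformed = scale_score(responses, lowest_raw=10, highest_raw=30)
print(transformed)  # 75.0

# Hypothetical norm values, for illustration only.
print(standardized_score(transformed, norm_mean=80.0, norm_sd=20.0))  # 47.5
```

In this example, a raw sum of 25 on a scale ranging from 10 to 30 becomes 75 on the 0 to 100 scale, and a score half a standard deviation below the assumed norm mean becomes 47.5.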

The SF-36 questionnaire can be self-completed or administered in person or over the telephone by a trained interviewer. It is considered simple to administer and takes less than 10 minutes to complete (Andresen & Meyers 2000). Permission to use the instrument should be obtained from the Medical Outcomes Trust, which oversees the standardized administration of the SF-36 and will provide updates on administration and scoring (McDowell & Newell 1996). Various computer applications are available to assist in scoring the SF-36, including free Excel templates that can be downloaded from the internet (Callahan et al. 2005).

Table: Characteristics of the Medical Outcomes Study Short Form 36

Advantages. The SF-36 is simple to administer. Both forms (i.e., self-completed or interview) take less than 10 minutes to complete (Hartley et al. 1995). As a self-completed, mailed questionnaire, it has been shown to have reasonably high response rates: 83% was reported by Brazier et al. (1992) and by O'Mahony and Rodgers (1998), and 75%-83% by Dorman et al. (1998). Dorman et al. (1999) reported a response rate of 85%, and Walters et al. (2001) reported 82% overall and 69% for those over age 85.

Callahan et al. (2005) found that the SF-36 was appropriate for longitudinal serial assessment of recovery in a mixed group of patients suffering from a cerebrovascular accident, TBI, or spinal cord dysfunction. The instrument has been shown to be valid and reliable in the adult TBI population and appears to be sensitive to the wide spectrum of health issues faced by this group (Emanuelson et al. 2003; Findler et al. 2001).

Limitations. Higher rates of missing data have been reported among older patients when using a self-completed form of administration (Brazier et al. 1992; Brazier et al. 1996; Hayes et al. 1995). O'Mahony and Rodgers (1998) found item completion rates to range from 66% to 96%. At the scale level, complete data collection (the amount required to compute a scale score) ranged from 67% (role limitations-emotional) to 97% (social functioning). Walters et al. (2001) reported scale completion rates among community-dwelling older adults ranging from 86.4% to 97.7%, with all eight scales being calculable for 72% of respondents. Dorman et al. (1999) reported a proportion of missing data at the scale level ranging from 2% (social functioning) to 16% (role limitations-emotional). Given this lack of data completeness, postal administration of the SF-36 may not be appropriate for use among older adults. However, low completion rates may not be limited to self-completion or postal administration. Andresen et al. (1999) administered the SF-36 to nursing home residents by face-to-face interview and reported that only 1 in 5 residents was able to complete it.

It has been suggested that data completeness may be indicative of respondent acceptance and understanding of the survey as relevant to them (Andresen et al. 1999; O'Mahony & Rodgers 1998). Hayes et al. (1995) noted that the most common items missing on the self-completed questionnaire referred to work or vigorous activity. Older respondents identified these questions as pertinent for much younger people and not relevant to their own situation. The authors suggested modifications to some of the questions, which may increase acceptability to older populations. In a qualitative assessment of the physical functioning and general health perceptions dimensions of the SF-36, Mallinson (2002) noted that the participants, who were all over the age of 65, tended to display signs of disengagement from the interview process and some participants expressed concern relating to the relevance of the questions. There was also considerable variation noted in subjective interpretation of items and most subjects used qualifying, contextual information to clarify their responses to the interviewer. As Mallinson (2002) pointed out, individual issues of subjective meaning and context are lost when the questionnaire is scored.

The SF-36 does not lend itself to the generation of an overall summary score. In scales using summed Likert items, information contained within individual responses is lost in the total scale score, in that any given total score can be achieved in a variety of ways from individual item responses (Dorman et al. 1999). Hobart et al. (2002) examined the use of the 2-component model, which consists of a mental component summary (MCS) and a physical component summary (PCS). These two scales can account for only 60% of the variance in SF-36 scores, suggesting a significant loss of information when the 2-component model is used.
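The point that a given total score can arise from many different response patterns can be made concrete by counting those patterns. The sketch below is illustrative only: the function name and the example scales, with items scored 1 to 3, are hypothetical and do not represent actual SF-36 item coding.

```python
from itertools import product


def patterns_for_total(n_items, levels, total):
    """Count the distinct item-response patterns whose values sum to total."""
    return sum(
        1
        for pattern in product(levels, repeat=n_items)
        if sum(pattern) == total
    )


# Hypothetical 3-item scale, each item scored 1 to 3: a total of 6 can come
# from (2, 2, 2) or from any ordering of (1, 2, 3).
print(patterns_for_total(3, (1, 2, 3), 6))  # 7

# The same count for a hypothetical 10-item scale at its midpoint total.
print(patterns_for_total(10, (1, 2, 3), 20))
```

Even on the 3-item scale, seven different response patterns produce the same total, and the number grows quickly as items are added. The summed score alone cannot distinguish between them.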

It has been suggested that the SF-36 may be more sensitive to the health difficulties of patients with mild TBI than to those of patients with moderate/severe TBI, as it was unable to differentiate between the severity levels (Emanuelson et al. 2003). One study found initial differences between these groups, but once depression was controlled for, these differences were less visible, suggesting that depression may account for the differences between TBI groups on the SF-36 (Findler et al. 2001). MacKenzie et al. (2002) suggest that adding a cognitive component to the SF-36 would make the instrument a more useful outcome measure in a head trauma population, as the tool is likely to underestimate the extent of disability in this group.

The level of test re-test reliability reported in stroke populations indicates that the SF-36 may not be adequate for serial comparisons of individual patients, but rather should be used for large group comparisons only (Dorman et al. 1998). Weinberger et al. (1996) also questioned the usefulness of the SF-36 in serial evaluation of individuals, given the large absolute differences reported in SF-36 scores obtained via common modes of administration (face-to-face interview, self-administration and telephone interview) over short testing intervals.

Dikmen et al. (2001) emphasized that the SF-36 was designed to be self-administered; a disadvantage is therefore that it cannot be used to assess patients who are too impaired to complete the questionnaire on their own. While the use of a proxy may be the only means by which to include data from more severely affected TBI patients, reported disagreement between patient and proxy assessments has been considerable. In an adolescent TBI population, moderate rates of agreement were reported between proxy and patient ratings for items related to physical health. However, on more subjective items, agreement was very low (Ocampo et al. 1997). It has been suggested that clinicians should not substitute proxy data for patient responses, given the subjective nature of many SF-36 items (Ocampo et al. 1997).

 

Summary: Medical Outcomes Study Short Form 36

Interpretability: Use of scale scores and summary component scores represents a loss of information and decreases potential clinical interpretability. Standardized norms for several countries are available for the SF-36. 

Acceptability: Completion times are approximately 10 minutes for either self-completed or interview-administered questionnaires. Some items have been questioned for their relevance to elderly populations. The SF-36 has been studied for use by proxy, but agreement rates are low and the reliability of the test decreases when proxy respondents complete assessments.

Feasibility: The SF-36 questionnaire can be self-completed or administered by interview (either by telephone or in person). It has been used as a mail survey with reasonably high completion rates reported. However, the data obtained are more complete when interview administration is used. Permission to use the instrument and additional information regarding its administration and scoring should be obtained from the Medical Outcomes Trust.

Table: Short Form 36 Evaluation Summary

Reliability            Validity             Responsiveness
Rigor    Results       Rigor    Results     Rigor    Results    Floor/ceiling
+++      ++ (TR)       +++      +++         ++       +++        +
         ++ (IC)

NOTE: +++=Excellent; ++=Adequate; +=Poor; N/A=insufficient information; TR=Test re-test; IC=Internal Consistency; IO=Interobserver; Varied (re. floor/ceiling effects; mixed results).