Association of measured physical performance and demographic and health characteristics with self-reported physical function: implications for the interpretation of self-reported limitations

Background: Self-reported limitations in physical function often have only weak associations with measured performance on physical tests, suggesting that factors other than performance commonly influence self-reports. We tested if personal or health characteristics influenced self-reported limitations in three tasks, controlling for measured performance on these tasks.
Methods: We used cross-sectional data on adults aged ≥ 60 years (N = 5396) from the Third National Health and Nutrition Examination Survey to examine the association between the repeated chair rise test and self-reported difficulty rising from a chair. We then tested if personal characteristics, health indicators, body composition, and performance on unrelated tasks were associated with self-reported limitations in this task. We used the same approach to examine associations between personal and health characteristics and self-reported difficulty walking between rooms, controlling for timed 8-foot walk, and self-reported difficulty getting out of bed, controlling for repeated chair rise test results.
Results: In multivariate analyses, participants who performed worse on the repeated chair rise test were more likely to report difficulty with chair rise. However, older age, lower education level, lower serum albumin, comorbidities, knee pain, and being underweight were also significantly associated with self-reported limitations with chair rise. Results were similar for difficulty walking between rooms and getting out of bed.
Conclusions: Self-reports of limitations in physical function are influenced by personal and health characteristics that reflect frailty, and should not be interpreted solely as measured difficulty performing the task.

A gold standard method to measure physical functioning does not exist. Self-report questionnaires have been adopted as easily administered instruments that can capture limitations in a wide spectrum of tasks [11,12]. However, self-report is subjective and may be influenced by mood, misjudgment of usual ability, or misinterpretation by the respondent. Despite these potential limitations, self-report questionnaires of physical functioning have face and construct validity [2]. An approach commonly used to test the construct validity of self-reported measures of functioning is to compare responses on these measures with directly-observed or measured performance on similar tasks. For example, self-reported difficulty in rising from a chair is tested for correlations with measured ability to rise from a chair on a timed test.
Although self-reported functioning and performance on objective physical tests are correlated, these associations are generally weak [13-17]. These studies often examined associations between multi-item questionnaires and physical performance test batteries, which averaged measures of performance over several domains of functioning [14,15,18-20]. One explanation for the weak correlations in these studies may be the problem of compensability: multi-item or summary measures do not identify which functions are most limited, the mean result might compensate for isolated limitations, and good performance on some measures might confound the association between other performance measures and their corresponding self-reported functions. To use physical performance tests to assess the construct validity of self-reported measures, it would be more appropriate to compare highly specific pairings of physical performance tests and self-reported physical function; that is, how self-reported limitations compare to measured performance on a corresponding test of the same task.
An alternative explanation for the modest association between self-report and physical performance tests may be that factors other than performance affect self-reports of limitations. Personal and health characteristics may influence how different patients appraise the limitations they have.
Despite extensive literature on the use of self-reports to measure physical functioning, few studies to date have examined whether factors other than measured performance on the same task influence self-reports. Our primary objective was to determine if self-reported limitations in physical functions were associated with personal and health characteristics, after accounting for measured performance on the same task. To address the problem of compensability, we analyzed three self-report tasks (rising from a chair, getting in or out of bed, and walking between rooms) for which there were corresponding physical performance tests (timed repeated chair rise and 8-foot walk). This design provided a unique method with which to assess influences on, and the meaning of, self-reported limitations.

Data source and study sample
We analyzed data from the Third National Health and Nutrition Examination Survey (NHANES III), a national population-based sample of non-institutionalized individuals in the United States [21]. In this cross-sectional study, we included participants aged ≥ 60 years because only these persons were eligible for an assessment of physical function. Among this subset, we excluded from the study individuals (n = 441) who lacked assessment of physical function at the mobile examination center (92.3%) or their home (7.7%) by one of the following three physical performance tests: repeated chair rise, 8-foot walk, or lock and key test. Our final study sample included 5396 persons. Participants completed the Household Questionnaire, which included questions on physical functioning, before they had the physical examination and performance testing.

Analytic framework
To maximize the specificity of the association between physical performance tests and self-reported physical function, we studied performance tests in relation to their corresponding self-reported functions: 1) repeated chair rise test and its relationship with self-reported difficulty rising from a chair; 2) repeated chair rise test and its relationship with self-reported difficulty getting in or out of bed; and 3) 8-foot walk test and its relationship with self-reported difficulty walking between rooms on the same level. Limitations on eight additional physical functions were asked, but were not included in the analysis because they did not have a corresponding physical performance test administered in NHANES III.

Dependent variables
Ability to perform the three self-reported physical functions of interest was assessed by the following: "Please tell me if you have no difficulty, some difficulty, much difficulty or are unable to do these activities at all when you are by yourself and without the use of aids." 1) "Standing up from an armless straight chair?", 2) "Getting in or out of bed?", and 3) "Walking from one room to another on the same level?".

Independent variables
Data on six physical performance tests were collected in NHANES III by trained assessors: repeated chair rise test, 8-foot walk, lock and key test, shoulder range of motion, active hip and knee flexion, and timed tandem stand test [16,22-25]. The repeated chair rise test, an assessment of lower extremity motor function and postural control, was a timed test of five consecutive rises from an armless straight chair. The 8-foot walk test, an evaluation of gait, was a timed test of usual speed to walk 8 feet. Time to complete the test (in seconds) was represented as gait speed (in meters per second) by first converting feet to meters and then dividing by the time in seconds needed to complete the test. We categorized performance on both tests into quartiles, with the best-performing quartile as the reference group.
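The gait speed conversion and quartile grouping described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the quartile cut points are computed from the unweighted values, whereas the survey analysis may have used sample weights.

```python
import statistics

FEET_TO_METERS = 0.3048  # exact definition of the international foot


def gait_speed_m_per_s(seconds, distance_feet=8.0):
    """Convert a timed walk into gait speed in meters per second."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return distance_feet * FEET_TO_METERS / seconds


def performance_quartiles(values, higher_is_better):
    """Label each value Q1 (best) to Q4 (worst) using unweighted cut points.

    Assumption: plain sample quartiles; survey weights are ignored here.
    """
    cuts = statistics.quantiles(values, n=4)  # three cut points
    labels = []
    for v in values:
        rank = 1 + sum(v > c for c in cuts)  # 1 = lowest values, 4 = highest
        # For speeds, the highest values are best; for times, the lowest are best.
        labels.append(5 - rank if higher_is_better else rank)
    return labels
```

For gait speed, `higher_is_better=True` places the fastest walkers in Q1; for chair rise times, `higher_is_better=False` places the shortest times in Q1, so that Q1 is the reference group in both cases.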
The lock and key test, a test of eye-hand coordination and fine motor skills, was a timed test of unlocking a lock with a key. Internal and external rotations of both shoulders were scored as full, partial, or unable to perform. Hip and knee flexion were scored similarly. For analysis, results on the lock and key test were categorized into quartiles, and range of motion of the shoulders and flexion of the hips and knees were dichotomized as either full or not, with full as the reference group. We did not include the timed tandem stand test because almost all participants attained the maximum allotted time. Reliability of these physical performance tests has been reported to be good [16,22-25].
We included covariates available in the data set and known to be associated with physical function. Demographic characteristics included age, gender, race-ethnicity, and education level. We categorized age into five groups (60-64, 65-69, 70-74, 75-79, and 80 years and older) to allow for non-linear relationships, and categorized education level, recorded as highest grade attained, into three groups (0-8, 9-12, and 13-17 years).
The health indicators were current cigarette smoking, hemoglobin level, serum albumin concentration, knee pain, and comorbidities. We included current cigarette smoking, hemoglobin, and serum albumin because they are indicators of general health [26][27][28][29]. Hemoglobin and serum albumin were used as continuous variables in the regression models, with associated odds ratios representing change per 1 gram per deciliter. Knee pain was included because the functions we studied involved the lower extremities, and pain may affect physical function. Knee pain, recorded as tenderness on palpation or pain with passive motion during the physical examination, was coded as absent, present in one knee, or present in both knees. We included comorbidities that may impact physical function: arthritis, stroke, diabetes mellitus, chronic bronchitis, emphysema, asthma, myocardial infarction, congestive heart failure, and cancer (excluding skin cancer). These were collected by self-report.
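Because hemoglobin and serum albumin entered the models as continuous variables, each odds ratio refers to a 1 g/dL difference. In a logistic model the log odds are linear in the predictor, so an odds ratio for any other difference follows by raising it to the corresponding power. The sketch below illustrates this; the function name and the example value are hypothetical and are not results from this study.

```python
import math


def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the odds ratio per 1 unit.

    Relies on the log odds being linear in the predictor, as in logistic models.
    """
    if or_per_unit <= 0:
        raise ValueError("odds ratio must be positive")
    return math.exp(math.log(or_per_unit) * units)


# Hypothetical example: an odds ratio of 0.5 per 1 g/dL of serum albumin
# corresponds to an odds ratio of about 0.71 per 0.5 g/dL.
example = rescale_odds_ratio(0.5, 0.5)
```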
Body composition was assessed by body mass index (BMI) and skeletal muscle mass, which are prognostic indicators of physical function. BMI, measured as weight in kilograms/height in meters squared, was grouped using World Health Organization categories of underweight (< 18.5 kg/m²), normal weight (18.5-24.9 kg/m²), overweight (25.0-29.9 kg/m²), and obesity (≥ 30.0 kg/m²) because of its non-linear relationship with physical function [30,31]. Skeletal muscle mass was determined from a prediction equation based on bioelectrical impedance analysis resistance, age, gender, and height. Following Janssen, we expressed skeletal muscle mass as skeletal muscle index (SMI) to account for differences in non-skeletal muscle mass, where SMI = (skeletal muscle mass/body mass) × 100 [32].
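The two body composition measures can be computed as in the following sketch. The function names are hypothetical. Skeletal muscle mass is taken as an input because the prediction equation itself is not reproduced in the text. Values between 24.9 and 25.0 kg/m² (and between 29.9 and 30.0 kg/m²) are assigned to the lower category here, which is an assumption about rounding.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(bmi_value):
    """World Health Organization BMI categories as listed in the text."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:  # assumption: 24.9-25.0 counted as normal weight
        return "normal weight"
    if bmi_value < 30.0:  # assumption: 29.9-30.0 counted as overweight
        return "overweight"
    return "obesity"


def skeletal_muscle_index(muscle_mass_kg, body_mass_kg):
    """SMI = (skeletal muscle mass / body mass) x 100."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return muscle_mass_kg / body_mass_kg * 100
```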

Statistical analysis
Analyses were performed using methods that accounted for the multistage, clustered sampling of NHANES III. We used ordinal logistic regression models to examine the association between specific physical performance tests and self-reported limitations. In unadjusted models, the degree of self-reported functional limitation was the dependent variable and the corresponding physical performance test was the independent variable. In adjusted models, we included age, gender, race-ethnicity, education level, arthritis, stroke, diabetes mellitus, chronic bronchitis, emphysema, asthma, myocardial infarction, congestive heart failure, cancer, smoking, hemoglobin level, serum albumin concentration, knee pain, BMI, and SMI. To determine if other physical performance tests were associated with any of the three self-reported functional limitations, we then added the lock and key test, shoulder range of motion, hip and knee flexion, and either chair rise test or 8-foot walk test as independent variables to each model.
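An ordinal logistic model of this kind yields one odds ratio per predictor, which is a fair summary only if the odds ratio is roughly the same at every cutpoint of the four-level response (no, some, much difficulty, unable). The sketch below computes unadjusted, unweighted odds ratios at each cutpoint for two groups. The counts are hypothetical and serve only to show the calculation; the study's own estimates came from adjusted models that accounted for the survey design.

```python
def cumulative_odds_ratios(reference_counts, comparison_counts):
    """Odds ratio of a worse response at each cutpoint of an ordinal outcome.

    Counts are ordered from the best response level to the worst.
    Returns one odds ratio per cutpoint (number of levels minus one).
    """
    if len(reference_counts) != len(comparison_counts):
        raise ValueError("both groups need the same number of response levels")
    ratios = []
    for cut in range(1, len(reference_counts)):
        ref_better = sum(reference_counts[:cut])
        ref_worse = sum(reference_counts[cut:])
        cmp_better = sum(comparison_counts[:cut])
        cmp_worse = sum(comparison_counts[cut:])
        ratios.append((cmp_worse / cmp_better) / (ref_worse / ref_better))
    return ratios


# Hypothetical counts: no, some, much difficulty, unable.
best_quartile = [900, 70, 20, 10]
worst_quartile = [700, 200, 60, 40]
ratios = cumulative_odds_ratios(best_quartile, worst_quartile)
```

With these illustrative counts the three odds ratios are close to one another, which is the pattern that would justify reporting a single odds ratio.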
To assess the validity of the proportional odds assumption in the ordinal logistic regression models, we examined qualitatively the similarity of odds ratios for contrasts between each level of the dependent variable [33]. We represented the associations with a single odds ratio, since odds ratios for different contrasts were found to be similar. Data were missing for education level in 0.7% of cases, height in 0.2%, weight in 0.3%, hemoglobin in 5.9%, serum albumin in 7.8%, bioelectrical impedance analysis resistance in 17.3%, chair rise test in 10.3%, 8-foot walk test in 7.1%, lock and key test in 4.5%, shoulder rotation in 0.3%, and hip and knee flexion in 6.3%. Data were missing for different reasons. During evaluation of physical functioning, participants who made no attempt to perform a specific maneuver because of severe physical limitations were coded as "blank". We assigned these participants to the worst-performing quartile. Participants who attempted the task but failed to complete it were also assigned to the worst-performing quartile. On the other hand, participants who made no attempt to perform a specific maneuver for reasons unrelated to physical limitations (e.g., time constraints) were coded as "blank but applicable". After extensive review by survey analysts, data believed to be extreme or illogical and viewed as virtually impossible were also coded as "blank but applicable". We treated data coded as "blank but applicable" as missing at random. We used the multiple imputation method with the Markov Chain Monte Carlo algorithm to impute missing values [34]. This allowed us to retain all participants in the analyses, and provides estimates that are less biased than those of a complete-case analysis [35]. Analyses were performed using SAS version 9.2 (SAS Institute Inc, Cary, NC).
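The rules for handling incomplete performance tests can be summarized as a small decision function. The status labels below are hypothetical stand-ins for the survey's own codes, and the function is an illustrative sketch of the rules described above rather than the authors' implementation.

```python
WORST_QUARTILE = 4  # Q4 = worst-performing quartile

# Hypothetical status labels; NHANES III uses its own coding scheme.
ASSIGN_WORST = {
    "not_attempted_physical_limitation",  # coded "blank" in the survey
    "attempted_not_completed",
}
TREAT_AS_MISSING = {
    "not_attempted_other_reason",  # coded "blank but applicable"
    "implausible_value",           # also coded "blank but applicable"
}


def analysis_quartile(status, observed_quartile=None):
    """Return the quartile used in analysis, or None if the value is imputed."""
    if status == "completed":
        if observed_quartile not in (1, 2, 3, 4):
            raise ValueError("completed tests need an observed quartile of 1-4")
        return observed_quartile
    if status in ASSIGN_WORST:
        return WORST_QUARTILE
    if status in TREAT_AS_MISSING:
        return None  # left missing, to be filled in by multiple imputation
    raise ValueError(f"unknown status: {status}")
```

Returning `None` for the second group keeps the distinction explicit: only values treated as missing at random are passed to the imputation step.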

Participant characteristics
Participants had a mean (± standard error of the mean) age of 70.7 ± 0.2 years (Table 1). Arthritis was the most common comorbid condition (44.7%), while 12.6% reported having diabetes mellitus, and 11.4% reported having had a myocardial infarction. At least some difficulty rising from an armless straight chair was reported by 20.5%; 14.9% reported at least some difficulty getting in or out of bed, and 8.2% reported at least some difficulty walking between rooms on the same level.

Association of physical performance tests with self-reported functional limitations
In the first set of analyses that tested the association of the repeated chair rise test and self-reported limitations rising from a chair, worse performance on the chair rise test was significantly associated with higher odds of reporting worse limitations (Table 2). In adjusted models, age was a significant correlate of functional limitation, independent of performance on the repeated chair rise test, with progressively higher adjusted odds ratios beginning with 70-74 year-olds. Lower education level, arthritis, stroke, congestive heart failure, and cancer were associated with higher odds of worse self-reported limitation, while a higher level of serum albumin was associated with lower odds of worse self-reported limitation. Participants with bilateral knee pain and those who were underweight or obese were more likely to report worse limitations. Gender, current smoking, hemoglobin level, and SMI were not associated with self-reported limitation in rising from a chair in this model.
Results of models predicting self-reported ability to get in or out of bed were similar (Table 3). Participants in the 3rd and 4th quartiles on the chair rise test were more likely to report worse limitations than those in the best-performing quartile. In the adjusted model, older age, lower education level, arthritis, stroke, congestive heart failure, cancer, lower serum albumin level, bilateral knee pain, and lower BMI were also significantly associated with an increased odds of worse self-reported functioning, independent of measured performance on the chair rise test.
In the third set of models examining self-reported ability to walk between rooms, worse performance on the 8-foot walk test was significantly associated with higher odds of a worse level of self-reported limitation (Table 4). In the adjusted model, older age, arthritis, stroke, chronic bronchitis, congestive heart failure, cancer, and lower serum albumin level were significantly associated with self-reported limitation in walking between rooms, independent of measured performance on the 8-foot walk test.

Association with other physical performance tests
Self-reported limitation in rising from a chair was associated not only with performance on the chair rise test, but also with performance on the 8-foot walk and with limitation in hip and knee flexion, when these physical performance tests were included in the model (Table 5). Associations with self-reported limitations getting in or out of bed were similar. Poor performance on the 8-foot walk test as well as limitations in hip and knee flexion were significantly associated with self-reported difficulty walking between rooms. These findings indicate that performance tests were not uniquely specific in explaining variation in corresponding self-reported functional limitations.

Complete case analysis
Results of complete case analysis were similar to those of the main analysis that used multiple imputation of missing values. In the complete case analyses, self-reported limitations rising from a chair were associated not only with worse performance on the repeated chair rise test, but also with older age, current cigarette smoking, arthritis, stroke, myocardial infarction, congestive heart failure, lower BMI, knee pain, and lower serum albumin level. Self-reported limitations getting in or out of bed were associated with worse performance on the chair rise test, arthritis, stroke, chronic bronchitis, congestive heart failure, and bilateral knee pain. Worse performance on the 8-foot walk test was associated with higher odds of self-reported limitations walking between rooms. Additional significant covariates included arthritis, stroke, chronic bronchitis, congestive heart failure, and knee pain. These results indicate that self-reports were influenced by personal and health characteristics and not exclusively by the measured difficulty in performing the task.

Discussion
Our findings indicate that self-reported limitations in physical function were associated with measured performance on the task being assessed. Nonetheless, self-reported physical functioning was also influenced by personal and health characteristics and not solely by the measured difficulty in performing the task. These findings indicate that self-report captures information above and beyond performance on the specific task itself. Functional limitations were strongly associated with physical performance tests, particularly for participants in the worst-performing quartiles. Despite the importance of physical performance tests, other factors were independently associated with self-reported physical functioning. Advanced age showed strong graded associations with limitations in each of the three functions. Participants with comorbid conditions were more likely to report worse limitations, consistent with prior reports [36,37]. For all three functions, serum albumin level was an important indicator of worse self-reported functioning, beyond the information on disease burden provided by the presence of comorbidities. Low serum albumin level has been associated with an increased odds of functional limitations in earlier studies [28,29]. Underweight participants had a higher risk of self-reported limitations than their normal weight counterparts, consistent with previous reports that low BMI is associated with functional limitations [31,38]. Pain in both knees was significantly associated with an increased odds of limitations rising from a chair and getting in or out of bed. These findings suggest that self-reports of functional limitations represent global perceptions of frailty, rather than solely an appraisal of limitations on the task being asked. Despite extensive literature on this topic, the nature of the association between physical performance tests and self-reported limitations has remained incompletely characterized.
Most prior studies compared a group of physical performance tests (typically a performance battery) with multi-item self-report functions [14,15,18-20,39]. For example, Reuben and colleagues found weak associations between physical function questionnaires and a battery of physical performance tests in 83 older adults [14]. Myers and colleagues found good correspondence (defined as > 80% agreement) between a set of 14 physical performance tests and a set of corresponding self-reported limitations in only one-third of participants [13]. We similarly found that physical performance tests did not correspond exclusively to self-reported limitations. Kempen et al. studied the relationship of sociodemographic characteristics, performance tests, personality measures, and cognitive and affective functioning and self-reported limitations in 753 older adults [17]. They found that associations between physical performance tests and corresponding self-reported limitations were weak, and that some of the discrepancy was explained by depressive symptoms and self-efficacy. These results support our findings in suggesting that factors other than performance can impact self-report. However, while they accounted for cognitive and affective symptoms, we found that less well-recognized physical factors, such as serum albumin level, pain, and BMI were associated with a higher odds of self-reported limitations. Physical performance tests that were not specifically paired to the physical functions studied were also significantly associated with self-reported limitations. For example, poor performance on the 8-foot walk test, a test of gait, was associated with worse self-reported functioning in rising from a chair and getting in or out of bed, which are measures of changes in body position.
These associations demonstrate that performance tests were not exclusive correlates of their specifically paired self-reported limitations, and that tests of other lower extremity functions (but not of upper extremity functions) also influence self-reports.
In contrast to many previous studies, we used highly specific pairings to provide a more valid test of the relationship between physical performance tests and self-reported physical function, thereby minimizing problems of compensability and the risk of confounding across different domains of physical functioning. We also tested a broad set of personal and health characteristics as correlates. Moreover, the population-based national sample increases the generalizability of our results.
Our study also has some limitations. Because we wanted to examine self-reported functions for which there were corresponding physical performance tests, we were able to examine only two physical performance tests. Although we do not know if our results apply to other physical performance tests, the consistency of results suggests that the findings may be relevant for other physical functions. We did not have data on depressive symptoms, personality measures, fatigue, and cognitive functioning, each of which can affect physical functioning [15]. However, our objective was not to identify all factors that may impact self-reported physical functioning, but rather to test the relationship between physical performance and patient factors and self-reported functioning. Although physical performance tests were related to self-reported limitations, measurement error and the effort-dependence of performance measures may have led to an underestimate of these associations. Our sample included persons aged ≥ 60 years, and we do not know if the associations are generalizable to younger individuals.
Table footnote: Odds ratio (OR) adjusted for age, gender, education, arthritis, stroke, diabetes mellitus, chronic bronchitis, emphysema, asthma, myocardial infarction, congestive heart failure, cancer (excluding skin cancer), smoking, hemoglobin, serum albumin, knee pain, body mass index, skeletal muscle index, and physical performance test. ‡ Q1, 2, 3, 4 represent 1st through 4th quartiles, from best performance (Q1) to worst performance (Q4). § Full shoulder rotation versus any limitation in shoulder rotation. ¶ Full hip and knee flexion versus any limitation in hip and knee flexion.

Conclusions
Our results support the validity of self-reported physical function by finding associations between self-reports and measured performance on similar tasks. More important, however, were our findings that these associations were neither specific nor exclusive. Personal and health characteristics of respondents also influenced self-reported physical function. Our findings caution against a narrow or strict interpretation of self-reported limitations in individual tasks. Self-reported limitations represent a gestalt rather than an appraisal isolated from its context. Self-reports capture information beyond task difficulty.