Evaluation of the late life disability instrument in the lifestyle interventions and independence for elders pilot (LIFE-P) study

Background The late life disability instrument (LLDI) was developed to assess limitations in instrumental and management roles using a small and restricted sample. In this paper we examine the measurement properties of the LLDI using data from the Lifestyle Interventions and Independence for Elders Pilot (LIFE-P) study. Methods LIFE-P participants, aged 70-89 years, were at elevated risk of disability. The 424 participants were enrolled at the Cooper Institute, Stanford University, University of Pittsburgh, and Wake Forest University. Physical activity and successful aging health education interventions were compared after 12 months of follow-up. Using factor analysis, we determined whether the LLDI's factor structure was comparable with that reported previously. We further examined how each item related to measured disability using item response theory (IRT). Results The factor structure for the limitation domain within the LLDI in the LIFE-P study did not corroborate previous findings. However, the factor structure using the abbreviated version was supported. Social and personal role factors were identified. IRT analysis revealed that each item in the social role factor provided a similar level of information, whereas the items in the personal role factor tended to provide different levels of information. Conclusions Within the context of community-based clinical intervention research in aged populations, an abbreviated version of the LLDI performed better than the full 16-item version. In addition, the personal subscale would benefit from additional research using IRT. Trial registration The protocol of LIFE-P is consistent with the principles of the Declaration of Helsinki and is registered at http://www.ClinicalTrials.gov (registration # NCT00116194).


Background
Disability is a major focus for intervention research in aging due to the social, personal, and economic costs associated with the loss of independence [1]. The magnitude of this problem will intensify with the aging of the 'baby boom' generation. Consistent with the International Classification of Functioning, Disability, and Health (ICF) framework [2], disability is now conceptualized as a rubric for capturing impairments, functional limitations, and activity restrictions. Jette and his colleagues [3] have noted that most existing instruments focus on assessing discrete functional tasks to the exclusion of performance on socially defined tasks expected of an individual within a typical sociocultural and physical environment. Thus, they developed the Late Life Disability Instrument (LLDI), a 16-item measure to assess limitations and frequency of performing life roles and activities [3].
The Lifestyle Interventions and Independence for Elders Pilot (LIFE-P) study was a single-blind, four-center randomized controlled trial of a 12-month physical activity (PA) intervention compared to a successful aging (SA) intervention in sedentary older adults. The LLDI was used to measure change in disability within randomized groups of LIFE-P. Because the original LLDI was developed on a small, restricted sample, prior to measuring change in the LLDI within LIFE-P, we undertook an investigation to re-examine the measurement properties of the instrument. The longitudinal design of LIFE-P enabled us to examine the stability of the factor structure of the LLDI as disability changed over time and to evaluate the quality of individual items.
We initially use confirmatory factor analysis to investigate whether the factor structure for the limitation domain of the LLDI, as applied to baseline and follow-up data obtained from LIFE-P participants, was compatible with the original publication. Furthermore, because McAuley and colleagues [4] published an abbreviated version of the LLDI consisting of 8 items that had superior psychometric qualities as compared to the original instrument, we examine the fit of their measurement model within the LIFE-P data. Finally, to further elucidate how individual items play a role in measuring disability, we present results from item response theory (IRT) for evaluating the relationship between disability and item responses at month 12.

Study Sample
In LIFE-P, at baseline and at 6 and 12 months, comprehensive standard assessments were conducted by trained research staff blinded to intervention assignment [5][6][7]. The study was approved by the NIH and local institutional review boards at the four clinic sites and all study participants gave written informed consent. Between May 2004 and February 2005, 424 participants at elevated risk of disability were enrolled. Participants were aged 70-89 years and able to complete a 400-meter walk in 15 minutes. Major exclusion criteria included presence of severe heart failure, uncontrolled angina, and other severe illnesses that might interfere with physical activity. Detailed inclusion/exclusion criteria and a flow diagram regarding the specific numbers of individuals screened and reasons for exclusion can be found in an earlier publication [7].

Instrument
The Late Life Disability questionnaire includes items for a wide variety of life tasks, such as personal maintenance; mobility and travel; exchange of information; social, community, and civic activities; home life; paid or volunteer work; and involvement in economic activities [3]. It was designed to assess meaningful concepts of disability in terms of frequency and limitation in performance of 16 life tasks, and was originally developed on a sample of 150 community-dwelling older adults aged 60 and older. In this study, we focused on the limitation domain only. The limitation dimension describes capability of performing these life tasks. It includes both personal (health, physical, or mental energy) and environmental (transportation, accessibility, or socioeconomic) factors. Limitation questions are phrased, "to what extent do you feel limited in doing a particular task?" with response options of "not at all," "a little," "somewhat," "a lot," and "completely." Jette et al. [3] identified two disability domains, instrumental and management, within the limitation dimension for the 16 items. McAuley et al. [4] identified two domains, social and personal roles, using the abbreviated version with 8 items only.

Participant Characteristics
We obtained data on participants' age, gender, race/ethnicity, education, marital status, and living arrangements using a structured personal interview. Prevalence of clinical conditions, including heart condition, chronic pulmonary condition, anxiety/depression, stroke, diabetes, high blood pressure, hip fracture, liver disease, and cancer, was determined using self-reported physician-diagnosed disease information [5]. The mean disability limitation total scaled score was calculated as described by Jette et al. [3].

Statistical analysis
Participant characteristics in the LLDI developmental sample and in LIFE-P at baseline were compared. Percentages are presented for categorical variables and means for continuous variables.

Factor structure evaluation
We compared our LIFE-P factor solutions with those from Jette et al. [3] and McAuley et al. [4] using the 16 items and 8 items, respectively. Exploratory Factor Analysis (EFA) with principal extraction and orthogonal rotation was used at baseline, 6 months, and 12 months to determine the factor structure from LIFE-P. One- and two-factor solutions were selected to allow for comparisons to the solutions published previously. A varimax rotation was used to obtain a set of independent and most interpretable factors. The factors were interpreted based on the factor loadings, which relate the items to putative underlying factors. The analysis was performed after combining the two intervention groups and also stratified by the two groups.
Subsequently, we applied Confirmatory Factor Analysis (CFA) at baseline, 6 months, and 12 months to check whether the factor structure for the limitation domain from the LLDI was compatible with the original publications [3,4]. Maximum likelihood estimation in SAS 9.1 (Cary, NC) was used; this approach has been shown to yield accurate fit indices with ordered categorical data [8]. The chi-square goodness-of-fit test was performed first. For large samples, this test is very sensitive and tends to reject the null hypothesis that the model fits the data.
Additional indicators, including the comparative fit index (CFI) [9], non-normed fit index (NNFI) [10], normed fit index (NFI) [10], and root mean square error of approximation (RMSEA) [11], were also investigated. Values approximating 0.90 for CFI, NNFI, and NFI are indicative of good model fit to the data. An RMSEA value of less than or equal to 0.1 corresponds to an "acceptable" fit, and 0.05 or lower indicates a "good" fit.
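As an illustration of the rotation step only, the following minimal sketch applies a raw (non-normalized) varimax rotation to a two-factor loading matrix using the closed-form rotation angle for the two-factor case, and then flags salient loadings. It is not the software used in LIFE-P. The unrotated loadings are hypothetical, the function names are ours, and the 0.45 cutoff is the loading criterion applied in the Results. Whether Kaiser normalization was used in the original analysis is not stated, so none is applied here.

```python
import math

def varimax_criterion(loadings):
    """Raw varimax criterion: summed spread of squared loadings within each factor."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squared = [row[j] ** 2 for row in loadings]
        total += sum(s * s for s in squared) - sum(squared) ** 2 / p
    return total

def rotate(loadings, phi):
    """Orthogonal rotation of a two-column loading matrix by angle phi (radians)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[x * c + y * s, -x * s + y * c] for x, y in loadings]

def varimax_two_factor(loadings):
    """Closed-form raw varimax rotation for exactly two factors."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2 * x * y for x, y in loadings]
    a_sum, b_sum = sum(u), sum(v)
    c_sum = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d_sum = 2 * sum(ui * vi for ui, vi in zip(u, v))
    phi = 0.25 * math.atan2(d_sum - 2 * a_sum * b_sum / p,
                            c_sum - (a_sum ** 2 - b_sum ** 2) / p)
    return rotate(loadings, phi), phi

def salient(loadings, cutoff=0.45):
    """Flag loadings whose absolute value meets the cutoff."""
    return [[abs(value) >= cutoff for value in row] for row in loadings]

# Hypothetical unrotated loadings for six items on two factors (not LIFE-P data).
unrotated = [
    [0.70, 0.30],
    [0.65, 0.35],
    [0.72, 0.25],
    [0.60, -0.40],
    [0.62, -0.38],
    [0.58, -0.45],
]
rotated, phi = varimax_two_factor(unrotated)
flags = salient(rotated)
```

Because the rotation is orthogonal, each item's communality (the sum of its squared loadings) is unchanged; only the distribution of loadings across the two factors differs.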
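For readers unfamiliar with these indices, the sketch below computes them from the chi-square statistics of the fitted model and of the null (independence) model using their standard textbook definitions, and applies the RMSEA thresholds given above. It is an illustration, not output from SAS; the numeric inputs are hypothetical, the function names are ours, and SAS may use N rather than N - 1 in the RMSEA denominator.

```python
import math

def fit_indices(chi2_model, df_model, chi2_null, df_null, n):
    """Standard definitions of NFI, NNFI, CFI, and RMSEA from chi-square statistics."""
    nfi = (chi2_null - chi2_model) / chi2_null
    ratio_null = chi2_null / df_null
    ratio_model = chi2_model / df_model
    nnfi = (ratio_null - ratio_model) / (ratio_null - 1.0)
    # Noncentrality estimates, truncated at zero.
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    cfi = 1.0 - d_model / d_null if d_null > 0 else 1.0
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))
    return {"NFI": nfi, "NNFI": nnfi, "CFI": cfi, "RMSEA": rmsea}

def rmsea_label(rmsea):
    """Thresholds as stated in the text: <= 0.05 good, <= 0.1 acceptable."""
    if rmsea <= 0.05:
        return "good"
    if rmsea <= 0.1:
        return "acceptable"
    return "poor"

# Hypothetical chi-square values for a two-factor, 16-item model with n = 424.
indices = fit_indices(chi2_model=250.0, df_model=103,
                      chi2_null=2500.0, df_null=120, n=424)
label = rmsea_label(indices["RMSEA"])
```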

Item-level analysis
As an item-level exploration, we applied IRT analysis within each factor for the 12-month data. The month 12 visit was selected because at that visit participants exhibited wider variation in level of disability and we reasoned that data from this visit might more closely resemble the samples used in previous publications. For ease of interpretation, we divided the scale for each limitation item into the following two groups: the "less limitation" classification included responses of "not at all," "a little," and "somewhat", whereas the "a lot of limitation" classification included responses of "a lot," and "completely". Item parameters were generated including difficulty (location) and discrimination (slope or correlation) [12]. It is assumed that the behavior of the items is invariant to the sample to which the items are applied. Item characteristic curves were generated to display the probability of a positive response to each item as a function of disability. In addition, a second graph, the item information function, was generated to indicate the effectiveness of an item in measuring different levels of disability. The Multilog program Version 7.0 (Assessment Systems Corporation, St. Paul, MN) was used for analysis.
Table 1 contains the participant characteristics in LIFE-P at baseline and the LLD developmental sample. The sample size in LIFE-P (424) is larger than that in the LLD developmental sample (150). The majority of LIFE-P participants were aged 70-79 (72.9%). In contrast, the LLD developmental sample ranged in age from 60 years to more than 90 years, with 40.7% of the LLD developmental sample aged 70-79. Both studies had a large percentage of women. In the LIFE-P sample, 18.2% self-reported their race as black compared with 7.3% for LLD. The LIFE-P participants reported a higher level of attained education compared to the LLD developmental sample.
A slightly greater percentage of LIFE-P participants reported currently living with their spouse. The mean disability limitation total scaled score was slightly higher in LIFE-P. Within LIFE-P, the scaled scores were slightly lower at baseline than at months 6 and 12. This suggests that the participants may have been more likely to participate in life tasks at the follow-up visits in LIFE-P and that LIFE-P participants may have been more capable of participating in life tasks compared to the LLD developmental sample. In general, the LIFE-P participants reported a greater burden of comorbidities, including a higher prevalence of anxiety/depression, diabetes, and cancer.

Results
The study design, recruitment, and participant characteristics of the McAuley et al. study have been described in detail elsewhere [4]. Briefly, 250 black (32.4%) and white (67.6%) women were recruited to participate in a 24-month prospective study of women's health behaviors. Their mean age (68.1 ± 6.1) was 8.7 years younger than that of LIFE-P participants. Most (91.5%) were high school graduates. This sample reported less cardiovascular disease (8.8%) and more pulmonary disease (15.6%) compared to the other two study samples. The percentages of diabetes (12.4%) and cancer (6%) were higher than in the LLD developmental sample and lower than in the LIFE-P sample (data not shown).

Factor structure evaluation
There were few missing LLDI items in the LIFE-P study; the rates of missing items were 1% or lower for all items except one ("work at a volunteer job" at baseline), for which the rate was 2%. Results from EFA are presented in Table 2. To allow a comparison with the original factor analysis performed by Jette et al., the items and factor loadings for one- and two-factor models are shown. Concentrating first on the two-factor solution, and using the 0.45 loading criterion, we found that five items ("visit friends", "go out to public places", "keep in touch with others", "participate in social activities", "take care of local errands") loaded on the factors differently at the three time points. With the exception of these items, the remaining items consistently loaded on these factors across time. When comparing the two-factor solution at month 12 to that reported by Jette et al. [3], seven of the items that loaded on the first factor were among the twelve items that loaded on the first factor reported by Jette et al.; two of the items that loaded on the second factor were among the four items that loaded on the second factor reported by Jette et al.; and seven of the items had inconsistent loadings. The one-factor model was slightly more consistent across time (α = 0.89, 0.91, and 0.91 for baseline, month 6, and month 12, respectively). The results stratified by intervention groups were similar; thus, we present only the overall results.
Since the result of our factor analysis was not comparable to that reported by Jette et al., we further applied EFA to the eight items (the abbreviated version) reported by McAuley et al. [4]. Adopting the same factor names that were used by McAuley et al. [4] ("social role" and "personal role"), we found that four items ("visit friends", "travel out of town", "go out to public places", and "invite family and friends into home") loaded highly on limitations in capabilities to perform social tasks and four items ("provide meals", "take care of personal care needs", "take care of local errands", and "take care of household business") loaded highly on limitations for personal tasks (Table 3). The result was consistent with McAuley et al. [4].
Results from the CFA for the limitation domain of the LLDI from LIFE-P are provided in Table 3. Initially, we tested the fit of one- and two-factor models for the 16-item limitation domain using baseline data. The one-factor model did not present a good fit to the data. The two-factor model performed better for these baseline data; however, as described above, the result was difficult to interpret. Subsequently, we applied similar confirmatory factor analyses to the data collected at the 6-month and 12-month visits. Results were similar across visits, with fit statistics indicating a slight improvement in fit for both one- and two-factor solutions at these two visits. Across all visits, the two-factor solution consistently outperformed the one-factor solution; however, as described above, the two-factor solution was also difficult to interpret. Moreover, results from CFA using the abbreviated version showed a reasonable fit to the data. The two-factor model performed better compared to the one-factor model at the different time points (Table 3).
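The α values quoted above are not defined in the text; assuming they denote Cronbach's alpha (an internal-consistency coefficient), the computation can be sketched as follows. The response data are hypothetical and coded 1 ("not at all") to 5 ("completely"); the function name is ours.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses from five participants to four items (not LIFE-P data).
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
```

Alpha approaches 1 as items covary more strongly relative to their individual variances, which is why a single-factor scale with highly intercorrelated items yields values near 0.9.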

Item-level analysis
IRT was subsequently used to empirically assess the relation between the factor and each of the four items (abbreviated version) that loaded highly on the specific factor at month 12 in the LIFE-P participants. Results from this analysis are presented in Figures 1 and 2. The IRT analysis revealed that the level of information provided by each of the four items in the social role factor was consistent (Figure 1), and items in the personal role factor tended to provide different levels of information (Figure 2). For example, the item "take care of local errands" provided high discriminating power and a high level of information at a moderate level of disability, whereas the other three items did not appear to be highly informative across disability levels.
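To make the link between discrimination and information concrete, the sketch below assumes a two-parameter logistic (2PL) model, which is consistent with the difficulty and discrimination parameters described in the Methods but is our assumption; the exact parameterization used by Multilog (for example, any scaling constant) is not stated. It also shows the dichotomization of the five response options. Item parameters are hypothetical and the function names are ours.

```python
import math

LESS_LIMITATION = {"not at all", "a little", "somewhat"}
A_LOT_OF_LIMITATION = {"a lot", "completely"}

def dichotomize(response):
    """Collapse the five response options into 0 (less) or 1 (a lot of limitation)."""
    if response in A_LOT_OF_LIMITATION:
        return 1
    if response in LESS_LIMITATION:
        return 0
    raise ValueError(f"unexpected response: {response!r}")

def icc(theta, a, b):
    """2PL item characteristic curve: P(positive response | disability level theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """2PL item information function: a^2 * P * (1 - P)."""
    p = icc(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical items: one highly discriminating, one weakly discriminating.
high_disc = {"a": 2.0, "b": 0.5}
low_disc = {"a": 0.6, "b": 0.5}
grid = [t / 10.0 for t in range(-30, 31)]  # disability levels from -3 to 3
peak_high = max(item_information(t, **high_disc) for t in grid)
peak_low = max(item_information(t, **low_disc) for t in grid)
```

Under this model an item's information peaks at its difficulty parameter b, with height a²/4, so a highly discriminating item is informative over a narrow band of disability while a weakly discriminating item contributes little information at any level.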

Discussion
The factor structure for the limitation domain using the 16 items within the LLDI in the LIFE-P study did not corroborate the findings reported by Jette et al. [3]. The two-factor solution was not ideal and was difficult to interpret. However, the factor structure using the eight items, the abbreviated version proposed by McAuley et al. [4], was supported by the LIFE-P data. Although only older women were recruited in McAuley et al. [4], the abbreviated version was still applicable in a study that included both older men and women, such as LIFE-P. Two factors, social and personal roles, were identified using the abbreviated version. One of the attractive features of the short form is that it retains the ideas originally put forth by Jette et al. [3], yet reduces participant burden. Moreover, the IRT analysis revealed that the level of information provided by each item in the social role factor was consistent, but the items in the personal role factor provided different levels of information.
There are several possible reasons why we were unable to confirm the originally published factor structure of the LLDI. First, because the sample size from the LLD developmental sample was small, those results may be unstable. Ideally, the LLDI should be evaluated in large, population-based samples. Second, LIFE-P was a community-based clinical trial and the study participants may not be representative of the LLDI developmental sample. For example, from Table 1, it is clear that LIFE-P participants are well-educated and not as healthy as those in the original study published by Jette et al. However, it is worth noting that the range and severity of disability in the two samples were quite similar. And third, responses to the individual items may differ between the two samples due to external factors. For example, time of year may be a confounder for certain items. Specifically, people may keep in touch more with others around the holidays than at other times of the year. This confounder may also contribute to why we did not observe consistent factor loadings across the three time points.
The item-level analysis indicates that the level of information for social roles provided by each of the four items was consistent, showing that the stated activities are of equal importance in capturing late life activities. However, items on the second factor, personal role, tend to provide different levels of information. For example, most participants seem to be able to take care of essential household business, as reflected in the low difficulty item parameter and low information of the household business item. However, participants may not have the capacity or willingness to perform non-essential local errands.
So what is the take-home message and where should research with the LLDI go from here? First, we see no advantage of using the long form over the short form and would suggest that investigators use the brief 8-item LLDI in future research. Second, application of item response theory to the LLDI short form offered support for the content of the social subscale, but support was mixed for items making up the personal subscale. Future research is needed with the personal subscale in populations that have greater difficulty with basic activities of daily living (ADLs). In particular, even though the physical functioning of LIFE-P participants was compromised somewhat, these individuals did live independently in the community. The personal subscale may be more appropriate for studies conducted within senior living communities in which older adults often have difficulty with one or more basic ADL. This also raises the more general issue of using the LLDI in both large epidemiological studies and smaller controlled trials. Unless the population of interest involves older adults who either have or are likely to experience deficits in functioning that compromise very basic social and personal activities, the LLDI should not be used. Third, LIFE-P collected the LLDI at three different time points: application of factor analysis to each time point may not be the most efficient way (from a statistical analysis point of view) to evaluate the properties of the questionnaire. Accordingly, it is crucial for methodologists to develop methods that can incorporate the factor data at different time points while considering the possible different factor structure at each time point.

Conclusions
In summary, we contrasted LLDI results from LIFE-P and two other studies [3,4]. The abbreviated version using eight items performed better in our study sample and we would recommend it for use in future research. Given the item content of the LLDI and the results of our analyses, we would conclude that this instrument is best used with older adults who have or are likely to develop impairments that influence very basic social and personal activities. In addition, the personal subscale would benefit from additional research using IRT in these target populations.