To recruit patients with osteoarthritis about to have primary total knee replacement (TKR), questionnaires were sent out to 125 consecutive patients on the waiting list at the Department of Orthopedics at Lund University Hospital in Lund, Sweden. Patients were recruited from December 1999 to April 2001. Of these 125 patients, 20 were excluded: ten underwent other operative procedures, eight were not operated on during the study period, and two had rheumatoid arthritis. Thus, pre-operative data were available for 105 patients with knee osteoarthritis.
All questionnaires were mailed to the patients and returned by mail in a pre-paid envelope. In addition to the KOOS, which includes the WOMAC, patients were also sent the SF-36 and questions regarding background data. The Swedish version LK 1.0 of the KOOS, including the Swedish version LK 1.0 of the WOMAC, and the Acute Swedish version of the SF-36 were used. Literacy of the subjects was not assessed.
The Knee injury and Osteoarthritis Outcome Score (KOOS) is an extension of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC). KOOS was developed for, and has been validated in, several cohorts of younger and/or more active patients with knee injury and/or knee osteoarthritis [6, 7, 9]. KOOS is a 42-item self-administered, self-explanatory questionnaire that covers five patient-relevant dimensions: Pain, Other Disease-Specific Symptoms, ADL Function, Sport and Recreation Function, and knee-related Quality of Life. The WOMAC pain questions are included in the subscale Pain, the WOMAC stiffness questions are included in the subscale Other Disease-Specific Symptoms, and the WOMAC subscale Function is equivalent to the KOOS subscale ADL. The questionnaire, scoring manual and user's guide can be downloaded from http://www.koos.nu.
KOOS Score Calculation
The KOOS's five patient-relevant dimensions are scored separately: Pain (nine items); Symptoms (seven items); ADL Function (17 items); Sport and Recreation Function (five items); Quality of Life (four items). A Likert scale is used, and all items have five possible answer options scored from 0 (No Problems) to 4 (Extreme Problems). Each of the five scores is calculated as the sum of the items included. Scores are then transformed to a 0–100 scale, with zero representing extreme knee problems and 100 representing no knee problems, as is common in orthopaedic scales [13, 14] and generic measures. Scores between 0 and 100 represent the percentage of total possible score achieved. An aggregate score was not calculated, since it was regarded as desirable to analyze and interpret the five dimensions separately.
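The transformation described above can be sketched as follows. This is a minimal illustration that assumes complete responses coded 0 to 4; the function name `koos_subscale_score` is hypothetical and is not taken from the KOOS scoring manual.

```python
def koos_subscale_score(item_responses):
    """Transform raw item responses (each 0..4) into a 0-100 subscale score.

    0 = extreme knee problems, 100 = no knee problems.
    Assumes every item in the subscale was answered.
    """
    if not item_responses:
        raise ValueError("a subscale needs at least one item")
    if any(r not in (0, 1, 2, 3, 4) for r in item_responses):
        raise ValueError("item responses must be integers from 0 to 4")

    raw_sum = sum(item_responses)          # sum of the items included
    max_sum = 4 * len(item_responses)      # worst possible raw sum
    # Reverse the direction so that higher scores mean fewer problems.
    return 100.0 - 100.0 * raw_sum / max_sum


# Hypothetical Quality of Life subscale (four items).
qol_score = koos_subscale_score([2, 3, 1, 2])
```

Because each subscale is normalized by its own maximum raw sum, subscales with different numbers of items end up on the same 0–100 scale.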
Since it was believed a priori that functions such as running, jumping, squatting, kneeling and pivoting were not applicable to all patients undergoing total knee replacement, a sixth answer option (not applicable) was given for the five items included in the subscale Sport and Recreation Function. If the box "not applicable" was marked, the item was treated as missing data.
Missing data. If a mark was placed outside a box, the closest box was used. If two boxes were marked, that which indicated the more severe problems was chosen. Missing data were treated as such; one or two missing values were substituted with the average value for that subscale. If more than two items were omitted, the response was considered invalid and no subscale score was calculated.
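The missing-data rules above can be sketched as follows. This is an illustration only: it assumes that "the average value for that subscale" means the mean of the items the same patient did answer, and that "not applicable" responses have already been recoded as `None`. The function names are hypothetical.

```python
def resolve_double_mark(marked_options):
    """If two boxes were marked, keep the one indicating more severe problems."""
    return max(marked_options)


def score_with_missing(item_responses, max_missing=2):
    """Return the 0-100 subscale score, or None if the response is invalid.

    item_responses: list of 0..4 values, with None for missing items
    (including Sport and Recreation items marked "not applicable").
    """
    answered = [r for r in item_responses if r is not None]
    n_missing = len(item_responses) - len(answered)
    if n_missing > max_missing or not answered:
        return None  # more than two items omitted: no score is calculated

    # Assumption: substitute each missing item with the mean of answered items.
    mean_answered = sum(answered) / len(answered)
    imputed = [mean_answered if r is None else r for r in item_responses]
    return 100.0 - 100.0 * sum(imputed) / (4 * len(imputed))


# One missing item out of four is substituted; three missing out of five is invalid.
valid_score = score_with_missing([2, 2, None, 2])
invalid_score = score_with_missing([1, None, None, None, 2])
```

Under this assumption, substituting the mean of the answered items gives the same result as scoring the answered items alone.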
The SF-36 is a widely used measure of general health status which comprises eight subscales: Physical Functioning, Role-Physical, Bodily Pain, General Health, Vitality, Social Functioning, Role-Emotional and Mental Health [11, 15]. The SF-36 is self-explanatory and takes about 10 minutes to complete. The SF-36 is scored from 0 to 100, 0 indicating extreme problems and 100 indicating no problems.
In addition to demographic data, patients were asked to report co-morbid conditions. Patients were asked if they were currently treated by a doctor, or had been treated during the last year, for any of the following 11 conditions: Back problems, Lung disease, High blood pressure, Heart disease, Impaired circulation in the lower extremity, Neurologic disease, Diabetes, Cancer, Ulcer, Kidney disease, Impaired vision or eye disease.
To assess test-retest stability, questionnaires were sent out one week apart on two separate occasions (pre-operatively and at the 6-month follow-up) to two different randomly selected subsets of patients. Wilcoxon's signed-rank test was used to determine whether any significant changes occurred between the test and retest administrations of the questionnaire. Intraclass correlation coefficients (ICC 2,1) were calculated for all patients together and for the pre-operative and post-operative assessments separately. According to the method suggested by Bland and Altman, the difference between the two assessments was plotted against the mean of the two assessments for each subject. Ninety-five percent of the differences were expected to lie within two standard deviations of the mean difference.
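The two agreement statistics named above can be sketched in plain Python. The ICC function uses the standard two-way random effects, absolute agreement, single-measure formula for ICC(2,1); the function names and the example scores are hypothetical and do not reproduce the study's data or software.

```python
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1) from a table with one row per subject, one column per administration."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of administrations (2 for test-retest)
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def bland_altman_limits(test, retest):
    """Return (mean difference, lower limit, upper limit) using +/- 2 SD."""
    diffs = [a - b for a, b in zip(test, retest)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample standard deviation of the differences
    return mean_diff, mean_diff - 2 * sd_diff, mean_diff + 2 * sd_diff


# Hypothetical test and retest subscale scores for four patients.
test_scores = [50, 60, 70, 80]
retest_scores = [52, 58, 72, 78]
icc = icc_2_1([list(pair) for pair in zip(test_scores, retest_scores)])
mean_diff, lower, upper = bland_altman_limits(test_scores, retest_scores)
```

For a Bland and Altman plot, each subject's difference would be plotted against the mean of that subject's two scores, with horizontal lines drawn at the three returned values.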
Content validity was assessed at baseline by asking the patients to rate the importance of improvement in each of the five KOOS subscales on a 5-point Likert scale as extremely important, very important, moderately important, somewhat important, or not important at all. For each subscale, examples of included questions were given.
Convergent and divergent construct validity was determined by comparison of the pre-operative administrations of the KOOS and the SF-36. The SF-36 subscale Physical Functioning measures limitations in the ability to perform general physical activities, a construct corresponding to what the ADL and Sport scales of the KOOS are intended to measure. SF-36 Bodily Pain measures pain/ache and disturbances in normal activities, a construct similar to knee pain, which the KOOS Pain scale is designed to measure. We expected the highest correlations when comparing the scales that are supposed to measure the same or similar constructs. Further, the eight subscales of the SF-36 have been shown to produce valid indices of Physical Health and Mental Health. Since the KOOS is designed to measure physical health rather than mental health, we expected to observe higher correlations between the KOOS subscales and the SF-36 subscales of Physical Functioning, Bodily Pain, and Role-Physical (convergent construct validity) than between the KOOS subscales and the SF-36 subscales of Mental Health, Vitality, Role-Emotional, Social Functioning, and General Health (divergent construct validity). However, based on previous methodological studies of the KOOS, we expected the correlations with the SF-36 subscale Role-Physical to be lower than the correlations with the subscales Physical Functioning and Bodily Pain [6, 9].
We expected that total knee replacement would induce a change in patients' perception of symptoms and function that could be measured by the questionnaires. Responsiveness was calculated as effect size, standardized response mean (SRM) and relative efficiency. Effect size is defined as mean score change divided by the standard deviation of the pre-operative score. Effect sizes >0.8 are considered large. Standardized response mean is defined as mean score change divided by the standard deviation of the change score. Relative efficiency was computed by squaring the ratio of the z-statistics.
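The three responsiveness measures defined above can be sketched as follows. The function names and the example scores are hypothetical. The text does not state which test produced the z-statistics, so `relative_efficiency` simply takes the two z-statistics being compared as inputs.

```python
from statistics import mean, stdev


def effect_size(pre, post):
    """Mean score change divided by the SD of the pre-operative scores."""
    changes = [b - a for a, b in zip(pre, post)]
    return mean(changes) / stdev(pre)


def standardized_response_mean(pre, post):
    """Mean score change divided by the SD of the change scores."""
    changes = [b - a for a, b in zip(pre, post)]
    return mean(changes) / stdev(changes)


def relative_efficiency(z_a, z_b):
    """Squared ratio of the z-statistics of two measures (a relative to b)."""
    return (z_a / z_b) ** 2


# Hypothetical pre- and post-operative subscale scores for three patients.
pre_scores = [40, 50, 60]
post_scores = [60, 80, 70]
es = effect_size(pre_scores, post_scores)
srm = standardized_response_mean(pre_scores, post_scores)
```

The effect size and the SRM share a numerator and differ only in the denominator, so they diverge when the spread of baseline scores differs from the spread of change scores.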
In part, the ability to respond to change can be assessed in terms of the proportion of patients at the floor (i.e. the worst score) or the ceiling (i.e. the best score) of each scale. To assess the ability to respond to change, the floor and ceiling effects were determined pre-operatively and at 6 and 12 months. For comparative reasons, the WOMAC was examined in the same way.
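Floor and ceiling effects as described above reduce to two proportions per scale and time point. The sketch below assumes scores already transformed to the 0–100 scale, with `None` marking an invalid (unscored) response; the function name is hypothetical.

```python
def floor_ceiling_proportions(scores, worst=0.0, best=100.0):
    """Return (proportion at floor, proportion at ceiling) among valid scores."""
    valid = [s for s in scores if s is not None]
    if not valid:
        raise ValueError("no valid scores to evaluate")
    n = len(valid)
    at_floor = sum(1 for s in valid if s == worst) / n
    at_ceiling = sum(1 for s in valid if s == best) / n
    return at_floor, at_ceiling


# Hypothetical subscale scores at one time point; one response was invalid.
floor_prop, ceiling_prop = floor_ceiling_proportions([0.0, 0.0, 50.0, 100.0, None])
```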