Development and validation of quality of life scale of nasopharyngeal carcinoma patients: the QOL-NPC (version 2)

Background The aim was to develop and validate the quality of life scale for nasopharyngeal carcinoma (NPC) patients, the QOL-NPC (version 2), a specific instrument to measure quality of life (QOL) for NPC patients. Methods The QOL-NPC was developed and validated according to standard procedures. The patients were assessed using the QOL-NPC, FACT-G, and FACT-H&N. Classical test theory was used to evaluate the reliability, validity, and responsiveness of the QOL-NPC. Results A total of 487 patients (97.4 %) completed the questionnaire. The QOL-NPC comprised four domains, as follows: physical function (eight items); psychological function (five items); social function (five items); and side effects (eight items). All of the items had a low proportion of missing data. Cronbach's alpha values of the domains ranged from 0.72 to 0.84. The split-half reliability coefficients ranged from 0.77 to 0.84. All of the intra-class correlation coefficients were >0.8. The normed fit index, non-normed fit index, and comparative fit index were 0.89, 0.89, and 0.90, respectively. The root mean square error of approximation was 0.097, with a 90 % confidence interval (0.093, 0.100). The domain scores of the QOL-NPC were significantly correlated with the FACT-G and FACT-H&N (P < 0.05). All of the domain scores of patients receiving different radiotherapy methods were significantly different (P < 0.001). All domain scores decreased at the completion of radiotherapy, with effect sizes ranging from −0.82 to −0.22. Conclusions The QOL-NPC is valid for measuring QOL with good reliability, validity, and responsiveness. The QOL-NPC is recommended to measure the QOL for Chinese NPC patients.

The QOL-NPC was widely used to evaluate the QOL of Chinese NPC patients. Its application revealed some problems. (1) The items of the QOL-NPC (V1) were rated on a 0-10 numeric visual analogue scale (VAS), which some patients found difficult to understand. For example, some patients with poorer reading skills were not able to distinguish a score of 5 from a score of 6. Most studies have reported that VAS and Likert responses differ little in reliability and responsiveness, and are highly correlated [15][16][17]. Because Likert responses are easier to administer, compute, and interpret, they are most often applied [15][16][17][18]. (2) Some items had problems. Patients had previously reported that they were worried about the infection of the disease due to a lack of medical knowledge. Therefore, the item "worried about the infection of the disease" was applied in V1. Some important symptoms were missing in V1, such as pain in the throat and cough when swallowing food.
The purpose of this study was to develop and assess the QOL-NPC (version 2 [V2]) according to a set of standardized procedures of instrument development.

Development of the QOL-NPC
The standard development and validation procedures were followed to develop and validate the QOL-NPC [19][20][21][22][23]. The procedures are shown in Fig. 1, which included construct definition, item generation, language testing and content validity, pilot study, and validation study.

Construct definition and item generation
The QOL-NPC (V1) contains 30 items in four domains: physical function (PH, seven items); psychological function (PS, six items); social function (SO, five items); and side effect (SE, 12 items).
The domains of the QOL-NPC (V2) were sourced from V1. The items of the V1 were carefully discussed and revised by five experts. For example, the item "worried about the infection of the disease" was revised to "worried about the inheritance of the disease." According to suggestions from NPC patients and clinical professionals, the following three items were added: have a pain in your throat (PH domain); cough when swallowing food (PH domain); and feel difficult to communicate with your family and friends (SO domain). Finally, a total of 33 items were generated. The VAS scale of the item was revised into a 1-5 Likert scale. The 1-5 Likert scales were expressed as not at all (excellent), a little bit (very good), moderate (good), quite a bit (fair), and extreme (poor).

Language testing and content validity
All of the items were tested in a convenience sample of 20 NPC patients from different educational levels. The patients were asked whether or not they could understand the meaning of the items. Problematic items were revised according to the comments of the patients. Eight experts were asked to assess the content validity. Expert consulting was available to evaluate whether or not the items of the QOL-NPC could represent the most relevant and important aspects of NPC patients [24]. Minor revisions and rewording of some items were performed until content validity was achieved.
Fig. 1 Steps towards development and validation procedure

Pilot testing
A cross-sectional study (pilot testing) was conducted to select the items. A total of 181 NPC patients were enrolled. The Research Ethics Committee of the Cancer Center at Sun Yat-Sen University provided ethical approval. The sample size was 5-10 times the item number for the pilot test. The items were screened and selected using the floor and ceiling method, coefficient of variation, correlation analysis, internal consistency coefficients, and confirmatory factor analysis. According to item selection, seven items were deleted, five of which were deleted from the SE domain. Due to the popularity of intensity-modulated radiotherapy (IMRT), the NPC patients had fewer side effects, such as dysphonia, alopecia (hair loss), dizziness, and decreased vision due to RT.
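Two of the screening statistics named above, the floor and ceiling proportions and the coefficient of variation, can be illustrated with a minimal Python sketch. The function name `item_screening_stats` and the example responses are hypothetical, and no cut-off values are applied because the text does not state the thresholds used for item selection.

```python
from statistics import mean, stdev


def item_screening_stats(responses, low=1, high=5):
    """Floor %, ceiling %, and coefficient of variation for one item.

    responses: 1-5 Likert answers to a single item, one per patient.
    The function name and the returned keys are illustrative assumptions.
    """
    n = len(responses)
    if n < 2:
        raise ValueError("at least two responses are needed")
    # Share of patients choosing the lowest / highest response option.
    floor_pct = 100.0 * sum(r == low for r in responses) / n
    ceiling_pct = 100.0 * sum(r == high for r in responses) / n
    # Coefficient of variation: sample SD relative to the mean.
    cv = stdev(responses) / mean(responses)
    return {"floor_pct": floor_pct, "ceiling_pct": ceiling_pct, "cv": cv}


# Hypothetical answers from ten patients to one item.
example = item_screening_stats([1, 1, 2, 3, 5, 5, 5, 4, 3, 5])
```

An item with a very high floor or ceiling percentage, or a very small coefficient of variation, discriminates poorly between patients and would be a candidate for deletion under whatever thresholds the investigators chose.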
After the item selection, the QOL-NPC (V2) contained 26 items in four domains: PH (eight items); PS (five items); SO (five items); and SE (eight items). Each item was scored from 1 to 5 points. Each domain score was transformed into a 0-100 score. A higher score indicated a better QOL. The scale was self-administered by the patients. The scale is shown in Appendix 1.
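The text does not give the formula used for the 0-100 transformation. The sketch below assumes the common linear rescaling of the raw domain sum, and assumes that item points have already been oriented so that higher points mean better QOL. The function name `domain_score` is hypothetical.

```python
def domain_score(item_points, low=1, high=5):
    """Rescale the raw sum of one domain's item points to 0-100.

    Assumption (not stated in the paper): linear transformation
    100 * (raw - minimum possible) / (maximum possible - minimum possible),
    with item points already oriented so that higher = better QOL.
    """
    n = len(item_points)
    if n == 0:
        raise ValueError("a domain needs at least one item")
    if any(p < low or p > high for p in item_points):
        raise ValueError("item points must lie between low and high")
    raw = sum(item_points)
    return 100.0 * (raw - n * low) / (n * (high - low))


# Hypothetical PS-domain answers (five items), all at the midpoint.
ps_score = domain_score([3, 3, 3, 3, 3])
```

Under this assumption a patient answering 1 on every item of a domain scores 0, and a patient answering 5 on every item scores 100, regardless of how many items the domain contains.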

Validation study
A cross-sectional study (validation study) was conducted to assess the psychometric properties of the QOL-NPC V2. The Research Ethics Committee of the Cancer Center at Sun Yat-Sen University provided ethical approval. The study was conducted between 1 July 2013 and 31 May 2014. Eligibility criteria included the following: (1) pathologically-proven NPC in the Cancer Center of Sun Yat-Sen University; (2) ≥16 years of age; and (3) able to provide informed consent to participate. The patients were excluded if they had been diagnosed with another cancer, had an NPC relapse, or were unconscious, confused, or cognitively impaired. Cognitive impairment was diagnosed by a psychologist.
The investigators included two medical post-graduates and three physicians, who were trained before the survey. The investigators explained the aim of our study before obtaining informed consent from the patients. If the patients agreed to participate in the survey, a questionnaire was given to them. The questionnaire included a socio-demographic sheet, QOL-NPC V2, FACT-G, and FACT-H&N. The socio-demographic sheet covered gender, educational degree, marriage status, dialect (Cantonese, Hakka, Chaoshanese, and others), pathologic type, Union for International Cancer Control (UICC) stage, methods of RT, RT stage, and other disease. The patients completed the questionnaire without assistance. If the patients did not understand the items on the questionnaire, the investigators explained them. If the questionnaire had missing data, the questionnaire was immediately returned to the patient for completion.
Terwee et al. considered a sample size of at least 50 patients to be adequate for the assessment of retest reliability and responsiveness [25]. Eighty inpatients were asked to complete the QOL-NPC V2 again within 2-3 days, and these data were used for the test-retest analysis. A short interval (2-3 days) was chosen for the following reasons: (1) The NPC in-patients were all treated with RT, which had an obvious influence on QOL, especially over a long interval. (2) Marx et al. reported no significant differences for the test-retest reliability of 2 day and 2 week intervals [26]. The newly-diagnosed patients (60 patients), who had not been previously treated by RT, were asked to complete the QOL-NPC after 50 ± 2 days of RT treatment. The data were used for the responsiveness test. These patients completed the scale by themselves in the retest and responsiveness tests. The investigators, the setting environment, and the investigation procedure were the same as in the first test.

Data analysis
Classical test theory (CTT) was used to assess the scale. SPSS 21.0 (Chicago, IL, USA) and Lisrel software (version 8.7) were used [27]. The percentage of missing data and the time to complete the instrument were calculated. Internal consistency reliability and split-half reliability were assessed using Cronbach's alpha value and Pearson's correlation coefficients between two halves of the items, respectively. Test-retest reliability was evaluated using an intra-class correlation coefficient (ICC) and the 95 % confidence interval (CI) of the two scores within 2-3 days. The correlation coefficients of the item-own domain (the item and its own domain) and the item-other domains (the item and other domains) were calculated. Construct validity was evaluated by the normed fit index (NFI), non-normed fit index (NNFI), comparative fit index (CFI), and root mean square error of approximation (RMSEA) based on confirmatory factor analysis (CFA) [28][29][30]. The correlation coefficients between the QOL-NPC and the FACT (FACT-G and FACT-H&N) were calculated to assess criterion validity. Discriminant validity was assessed by comparing the domain scores of the patients among different RT stages and different RT methods (analysis of variance). A paired samples t-test was used to analyze the score changes over time. Effect size was calculated as the change in scores divided by the standard deviation of the baseline score [31].
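As an illustration of two of the statistics described above, the following Python sketch computes Cronbach's alpha for one domain and the effect size as defined in the text (change in scores divided by the standard deviation of the baseline score). The analyses in the study were run in SPSS and Lisrel; this is only a hand-written sketch using the standard textbook formula for alpha, and the function names and example data are hypothetical.

```python
from statistics import mean, stdev, variance


def cronbach_alpha(items):
    """Cronbach's alpha for one domain.

    items: list of k lists, one per item, each holding that item's
    points for the same patients in the same order.
    Formula: k / (k - 1) * (1 - sum of item variances / variance of totals).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total domain score per patient (sum across items).
    totals = [sum(patient) for patient in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def effect_size(baseline, follow_up):
    """Mean change (follow-up minus baseline) divided by the baseline SD."""
    change = mean(follow_up) - mean(baseline)
    return change / stdev(baseline)


# Hypothetical domain scores for three patients before and after RT.
es = effect_size([50, 60, 70], [40, 50, 60])
```

A negative effect size, as in this made-up example, corresponds to a drop in the domain score between baseline and follow-up.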

Results
A sample of 500 patients was enrolled in the study. Thirteen patients (2.6 %) did not complete the questionnaire. Thus, 487 patients were included in the analysis (Table 1). The mean age was 47.0 ± 11.1 years (range, 16.1-78.1 years). There were 341 male and 146 female patients. Of the patients, 93.0 % were married, 68.8 % were Cantonese, 88.9 % had the undifferentiated type, 46.6 % were stage III, and 79.9 % did not have another disease.
The average time to complete the instrument was 8.4 ± 4.6 min, ranging from 3.8 to 16.3 min. Ten patients did not understand certain items, such as the item "mental stress". They completed the items with the help of the investigators. The scores of all the items ranged from 1 to 5 (Table 2). Item SE8 scored the highest (3.90), while item PS2 scored the lowest (2.46). Item SE8 had 3.7 % missing data. Other items had a lower proportion of missing data.
The mean score of the SE domain was the highest (64.5), and the mean score of the PS domain was the lowest (50.3; Table 3). The Cronbach's alpha values of the domains ranged from 0.72 to 0.84. The split-half coefficients of the domains ranged from 0.77 to 0.84. The SE domain had the highest Cronbach's alpha value and split-half coefficient. All of the ICCs were >0.8, and all of the coefficients were statistically significant.
All items correlated more strongly with their own domain than the other domains (Table 4). For example, the correlation coefficients of the items in the PH domain and PH ranged from 0.47 to 0.77, which were greater than the other domains.
The results of the CFA showed that the RMSEA was equal to 0.097 with a 90 % CI (0.093, 0.100). Both the NFI and NNFI were equal to 0.89. The CFI was equal to 0.90. The factor loadings of the CFA are shown in Table 4. The minimum factor loading was 0.47 (SO1 and SO2). The structure diagram is shown in Fig. 2.
The PH, PS, and SO domains of the QOL-NPC had a positive correlation with physical, emotional, and social/family well-being of the FACT-G with coefficients of 0.71, 0.63, and 0.56, respectively. The correlation coefficient between the SE domain of the QOL-NPC and FACT-H&N was 0.51, which was statistically significant.
The PH, PS, and SE domain scores of patients in different RT stages were significantly different (P <0.001; Table 5). The patients who were receiving RT had the lowest scores in the PH, PS, and SE domains. The patients before RT and >5 years after RT had the highest scores. The SO domain scores of patients in different RT stages were not significantly different (P >0.05). All the domain scores of patients using different RT methods were significantly different (P <0.001; Table 5). The patients who did not receive RT had the highest scores, followed by those receiving intensity-modulated radiotherapy (IMRT).
Sixty patients before RT were enrolled to test the responsiveness of the QOL-NPC over time. At the

Discussion
The specific instruments most commonly used to assess the QOL of NPC patients include the QLQ-C30, the QLQ-H&N35 [4,32,33], and the FACT-NP, which consists of the FACT-G and an NPC subscale [11]; however, due to cultural differences, we developed the QOL-NPC to assess the QOL of Chinese NPC patients.
The QOL-NPC (V2) had good content validity according to the suggestions of experts. The QOL-NPC was broadly defined as the endpoint directly derived from the patient, which included symptoms, health status, adherence, and side effect [34]. The QOL-NPC included the most important aspects characterizing NPC patients and is structurally made up of physical function, psychological function, social function, and side effects. For example, physical function included feeling tired, losing weight, having a headache, nasal tampon or nasal bleeding, satisfaction with appearance, and coughing when swallowing food. Side effects included dry mouth (xerostomia), pain in the throat, difficulty in opening the mouth, memory decline, skin injuries in the head and neck, and damaged teeth due to RT. It is well accepted that dry mouth is the most significant morbidity during and following RT, which causes serious disorders in tasting, chewing, and swallowing, as well as sleeping disorders [35].
The QOL-NPC (V2) had good reliability and validity based on the results of CTT. All of the domains had moderate or high Cronbach's alpha coefficients (0.72-0.84) and split-half reliability coefficients (0.77-0.84). Internal consistency is rated positively when Cronbach's alpha is >0.70 [25]. All of the domains had high intra-class correlation coefficients (0.82-0.88), which indicated that the QOL-NPC (V2) can reliably evaluate the QOL of patients. Based on the results of CFA, RMSEA was equal to 0.097 with a 90 % CI (0.093, 0.100).
The PH, PS, and SE domains of the QOL-NPC were sensitive enough to discriminate the QOL of NPC patients in different RT stages. The QOL of the NPC patients during RT was the lowest. These results were consistent with our hypothesis. It is known that RT has a serious impact on the health status of the patients [4,36]. All the domain scores of patients using different RT methods were significantly different. The patients who did not receive RT had the highest scores. The patients who received IMRT had higher scores than those receiving other RT methods. Our results were consistent with other studies, which showed that IMRT played a significant role in improving the QOL of NPC patients [37,38].
The QOL-NPC (V2) had good responsiveness based on the results of CTT. After the newly diagnosed NPC patients received RT treatment, they had numerous side effects, especially head and neck symptoms, such as pain in the mouth and throat, dry mouth, and difficulties in speaking. Therefore, the domain scores of the QOL-NPC decreased. The effect sizes of these domains ranged from −0.82 to −0.22.
There were some study limitations. (1) All of the patients in the study were enrolled from the Cancer Center of Sun Yat-Sen University. The QOL-NPC (V2) should be further evaluated using data from other centers. (2) Only 60 NPC patients were used to test the responsiveness of the QOL-NPC (V2). The responsiveness of the scale should be further assessed in a larger sample of patients. (4) Some patients completed the QOL-NPC with the help of the investigators. This is a limitation of the study because item explanation by a third party can introduce bias.

Conclusions
The QOL-NPC (V2) is valid for measuring QOL with good reliability, validity, and responsiveness. We recommend the application of the QOL-NPC (V2) for measuring QOL in Chinese NPC patients.
Table footnotes: a Change scores, the score at the end of radiotherapy minus the baseline, −100 (maximum worsening) to +100 (maximum improvement). b Effect size, calculated as the change in scores divided by the SD of the baseline score.

Appendix 1
Quality of life scale of nasopharyngeal carcinoma patients: The QOL-NPC (version 2). The Quality of life scale of nasopharyngeal carcinoma patients (version 2) includes 26 items, each rated on a five-point scale. Please respond to each item according to how you have felt in the past 2 weeks. The following items are about your physical function, psychological function and social function related to nasopharyngeal carcinoma.
The following items are about the side effects due to radiotherapy.

Competing interests
The authors declare that they have no competing interests.
Authors' contributions
YS was involved in the study design, construct definition, item generation and validation study; and drafted the manuscript. CWM was involved in language testing and content validity, and drafted the manuscript. WQC and LW collected the data and drafted the manuscript. QX analyzed the data and discussed the results. ZCW and ZLW interpreted the data. LZL was involved in the recruitment of patients. XLC