Psychometric analysis of the Brazilian-version Kidscreen-27 questionnaire

Background: The objective of this study was to verify the reliability, discriminatory power and construct validity of the Kidscreen-27 questionnaire in Brazilian adolescents. Methods: The sample comprised adolescents who participated in the pilot study (210 adolescents; 52.9% boys; 13.7 years old) and in the baseline (816 participants; 52.7% girls; 13.1 years old) of the Movimente Project in 2016/2017. This project was carried out in six public schools in the city of Florianópolis, Santa Catarina, Brazil. Test–retest reproducibility was assessed by the intraclass correlation coefficient (ICC) and the Gwet coefficient; internal consistency by McDonald's omega; the scale's discriminatory power by Hankins' Delta G coefficient; and construct validity by confirmatory factor analysis. Results: Reproducibility values ranged from 0.71 to 0.78 for the dimensions (ICC) and from 0.60 to 0.83 for the items (Gwet). McDonald's omega for internal consistency ranged from 0.82 to 0.91. Discriminatory power ranged from 0.94 for the dimension Social Support and Peers to 0.98 for Psychological Well-being. The factor loadings were > 0.40, except for item 19 (0.36). The fit quality indicators of the model were adequate (χ²[df] = 1022.89 [311], p < 0.001; RMSEA = 0.053 (0.049–0.087); CFI = 0.988; TLI = 0.987), confirming the five-factor structure originally proposed. Conclusions: The Brazilian-version Kidscreen-27 achieved good levels of reproducibility, internal consistency, discriminatory power and construct validity. Its use is adequate to measure the health-related quality of life of adolescents in the Brazilian context.


Introduction
Health-related quality of life (HRQoL) is a perceived and multidimensional health model that describes aspects of well-being and physical, emotional, mental, social and behavioral functions, as identified by the individuals themselves and by others [1]. Its evaluation in children and adolescents is considered an important health indicator, as it is during this period of life that cognitive, physical, psychosocial, emotional and behavioral changes occur and can affect health and well-being [2]. Studies show different associations of HRQoL with biological (sex, age, biological maturation) [2,3] and behavioral characteristics (physical activity, sedentary behavior, diet, sleep, smoking and alcohol consumption) [2,4], as well as with diseases such as asthma, diabetes, obesity and rare diseases [5].
da Silveira et al. Health Qual Life Outcomes (2021) 19:185
As a proposal for HRQoL measurement, Kidscreen emerged from a project promoted by a group from the European Union, with the participation of thirteen countries, with the objective of producing a cross-cultural self-assessment measure for children and adolescents who are healthy and/or have chronic diseases [6]. The first instrument developed by the group was the 52-item version (Kidscreen 52), covering 10 dimensions of HRQoL [1]. Then, in order to provide an adequate tool for large epidemiological and clinical studies, a version with 27 items, covering five dimensions, was derived from the 52-item version [1]. Finally, a 10-item version, derived from the 27-item version, was created to summarize the dimension scores in a single global index value [1].
Since its development, all versions have been used in a variety of settings and study designs in different parts of the world [7–11]. In Brazil, the 52-item version was translated and evaluated for exploratory factor structure and internal consistency [12]. A second study, using the same translated version, evaluated the 27-item version for reproducibility, internal consistency and construct validity through face-to-face interviews [13]. Although adequate validation parameters were observed, the authors highlight that the results refer to a specific context, and that sociocultural differences among Brazilian regions should be considered when using the instrument. The present study intends to advance on three points: (1) to analyze whether the psychometric parameters hold in a sample from the south region of Brazil; (2) to use the collective administration procedure rather than the face-to-face interview [13], considering that this procedure is more usual in school-based research; (3) to perform statistical analyses more appropriate for ordinal categorical data, the type of item used in the Kidscreen instrument, unlike the previous study [13]. Therefore, this study aimed to assess the reliability, discriminatory power and construct validity of the version translated into Brazilian Portuguese, available on the Kidscreen website.

Design and participants
To address the objectives of this study, two stages of the "Movimente Program" (www.movimente.ufsc.br), conducted in 2016/2017 in schools in the city of Florianópolis, state of Santa Catarina, southern Brazil [14], were considered: the pilot study (May-July 2016) and the baseline (March 2017). The Program, a school-based, cluster-randomized controlled intervention, was registered in Clinical Trials (NCT02944318) and conducted during one school year (March to December 2017).
In the present study, the sample size was calculated in two steps. To estimate reproducibility, the calculation considered an intraclass correlation coefficient (ICC) ≥ 0.20, two applications of the questionnaire, a type I error of 5%, a type II error of 20% (power of 80%) and a 30% increase for losses and refusals. Following these criteria, a sample of 193 adolescents was necessary. To estimate the other parameters (internal consistency, discriminatory power and construct validity), the calculation considered a ratio of 20 individuals for each item of the instrument [15]. Considering the 27 items, a total sample of 540 subjects was estimated. For these analyses, only individuals who completed all 27 items were considered.
The pilot study data (all classes from the 7th to the 9th grade of one school) provided the elements to evaluate reproducibility. In this phase, the questionnaire was applied on two occasions, one week apart. The literature is not unanimous on this point, but it recommends that the interval be neither too long (e.g. 2 months), to avoid possible changes in the phenomenon, nor too short (e.g. 1 day), to avoid contamination of the results by the recall effect [16].
The baseline data (all classes from the 7th to the 9th grade of six schools) were used to examine internal consistency, discriminatory power and construct validity. The adolescents involved in the present study signed the assent form and were authorized by their respective guardians, who signed the consent form. All adolescents, of both sexes, regularly enrolled in the selected schools and attending the first 2 weeks of class (data collection period) were eligible. This study was approved by the Research Ethics Committee of the Federal University of Santa Catarina (CAAE protocol number: 49462015.0.0000.0121) and by the Florianópolis Municipal Department of Education. The copyright of the Kidscreen instrument used in the research belongs to the Kidscreen Group, under the responsibility of Prof. Ulrike Ravens-Sieberer, MPH. A formal collaboration form was signed to use the instrument.

Kidscreen 27: health-related quality of life
Kidscreen 27 consists of five dimensions: (1) Physical Well-being (5 items); (2) Psychological Well-being (7 items); (3) Autonomy and Parent relation (7 items); (4) Social Support and Peers (4 items) and (5) School environment (4 items). The 27 items have five response options according to intensity (nothing, little, moderately, very, totally) or frequency (never, rarely, sometimes, often, always), with a 1-week recall period. Scores are coded from 1 to 5, and negatively formulated items (1, 9, 10 and 11) are inverted to follow the same direction as the positively formulated items [1]. The questionnaire was administered in the same way at the two collection times (pilot and baseline): in the classroom during school hours, completed by the students themselves, with an average duration of 10 min. The database was built by optical reading of the questionnaires, using the SPHYNX software (Sphynx®, Software Solution Incorporation, USA). Possible erasures or errors made by the respondents were checked manually by the team members, who had previously been trained to handle the equipment.
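The recoding rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the scoring routine used in the study: the function name and the sample responses are hypothetical, and the sketch covers only the raw 1 to 5 coding with inversion of items 1, 9, 10 and 11 (on a 1 to 5 scale, an inverted score equals 6 minus the raw score).

```python
# Items the text lists as inverted so that all items point in the same direction.
REVERSED_ITEMS = {1, 9, 10, 11}


def recode_responses(raw):
    """Return a new {item_number: score} dict with the reversed items inverted.

    Scores must be integers from 1 to 5; an inverted score is 6 - raw score.
    """
    recoded = {}
    for item, score in raw.items():
        if score not in (1, 2, 3, 4, 5):
            raise ValueError(f"item {item}: score {score} is outside 1-5")
        recoded[item] = 6 - score if item in REVERSED_ITEMS else score
    return recoded


# Hypothetical responses to three items (items 1 and 9 are inverted, item 2 is not).
example = recode_responses({1: 2, 2: 4, 9: 5})
print(example)
```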

Statistical analysis
To estimate reproducibility, the Intraclass Correlation Coefficient (ICC) [17] and Gwet's coefficient [18] were used in the analyses of the dimension scales and of the items, respectively. The Gwet Agreement Coefficient was proposed as an alternative parameter to assess the agreement of categorical variables that overcomes known biases related to Cohen's Kappa [19]. For the ICC, values below 0.50 are considered weak, between 0.50 and 0.75 moderate, between 0.75 and 0.90 good, and values > 0.90 excellent [17]. For the Gwet coefficient, values < 0.20 are considered of low agreement, between 0.21 and 0.40 mild, 0.41-0.60 moderate, 0.61-0.80 good, and values > 0.80 are considered of excellent agreement [20].
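To make the item-level agreement statistic concrete, the sketch below computes Gwet's unweighted first-order coefficient (AC1) for two applications of one item and maps the result onto the agreement bands quoted above. It is an illustration under stated assumptions rather than the study's code: the function names and the ratings are hypothetical, and the weighted variant (AC2), which adds ordinal weights, is not implemented here.

```python
from collections import Counter


def gwet_ac1(first, second, categories):
    """Gwet's AC1 for two ratings per subject (e.g. test and retest of one item)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(first)
    q = len(categories)
    if q < 2:
        raise ValueError("need at least two response categories")
    # Observed agreement: share of subjects giving the same answer twice.
    p_a = sum(a == b for a, b in zip(first, second)) / n
    # Average proportion of each category across the two applications.
    counts = Counter(first) + Counter(second)
    pi = [counts[c] / (2 * n) for c in categories]
    # Chance agreement under Gwet's model.
    p_e = sum(p * (1 - p) for p in pi) / (q - 1)
    return (p_a - p_e) / (1 - p_e)


def agreement_label(value):
    """Map a coefficient onto the bands quoted in the text."""
    if value <= 0.20:
        return "low"
    if value <= 0.40:
        return "mild"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "good"
    return "excellent"


# Hypothetical test and retest answers of eight respondents to one item.
test = [1, 2, 3, 4, 5, 3, 3, 2]
retest = [1, 2, 3, 4, 5, 3, 2, 2]
ac1 = gwet_ac1(test, retest, categories=[1, 2, 3, 4, 5])
print(round(ac1, 3), agreement_label(ac1))
```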
To assess internal consistency, the omega coefficient [21] was used. It has been pointed out as an alternative to Cronbach's alpha, is considered a more sensitive measure, and is appropriate to estimate reliability, especially for multidimensional instruments whose items differ in scale and factor loadings [22]. The omega coefficient was calculated on the basis of the weighted least squares mean and variance adjusted estimator (WLSMV), highlighted in the literature as more suitable for modeling latent variables with ordinal categorical indicators, due to its robustness to modest violations of the underlying normality and its good performance in larger samples [23]. Values from 0.80 to 1.00 are considered desirable; values between 0.70 and 0.79 are considered recommended; and values between 0.60 and 0.69 should be accepted for research use only (clinical use is not recommended). Values below 0.60 suggest unreliability of the instrument [21]. For comparison with other studies of the psychometric properties of Kidscreen, Cronbach's alpha is also presented.
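As an illustration of what the omega coefficient measures, the sketch below applies the usual single-factor formula to standardized loadings: the squared sum of the loadings divided by that quantity plus the summed residual variances. Several assumptions apply and are not taken from the study: each residual variance is set to 1 minus the squared loading, errors are uncorrelated, the loadings are invented, and the function names are hypothetical. Estimates obtained from a WLSMV model for ordinal items may be computed differently.

```python
def mcdonald_omega(loadings):
    """Omega for one dimension from standardized loadings (uncorrelated errors)."""
    if not loadings:
        raise ValueError("at least one loading is required")
    common = sum(loadings) ** 2
    # Assumption: standardized solution, so residual variance = 1 - loading^2.
    residual = sum(1 - value ** 2 for value in loadings)
    return common / (common + residual)


def omega_label(value):
    """Map omega onto the interpretation bands quoted in the text."""
    if value >= 0.80:
        return "desirable"
    if value >= 0.70:
        return "recommended"
    if value >= 0.60:
        return "research use only"
    return "unreliable"


# Invented loadings for a four-item dimension.
omega = mcdonald_omega([0.7, 0.7, 0.7, 0.7])
print(round(omega, 3), omega_label(omega))
```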
Discriminatory power was determined by Hankins' Delta G, a statistic suited to scales of dichotomous items and also to polytomous items, typically five-point or Likert-type scales [24]. Possible values range from 0 (all individuals have the same score, no variability) to 1 (individuals are distributed evenly along the scale). A value of 0.70 or more is reported as acceptable [24].
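The behavior of the coefficient at its two extremes can be checked with a short sketch. It uses the generalized form of Ferguson's delta, with k items, m response options per item, n respondents and f_i the frequency of each observed total score: delta_G = (1 + k(m - 1))(n² - Σf_i²) / (n² k(m - 1)). The function name and the score lists are hypothetical, and the sketch is not the routine used in the study.

```python
from collections import Counter


def hankins_delta_g(total_scores, n_items, n_options):
    """Generalized Ferguson's delta computed from raw total scores."""
    n = len(total_scores)
    if n == 0:
        raise ValueError("at least one total score is required")
    span = n_items * (n_options - 1)  # k(m - 1): number of possible scores minus one
    if span <= 0:
        raise ValueError("need at least one item with two or more options")
    sum_sq = sum(freq * freq for freq in Counter(total_scores).values())
    return (1 + span) * (n * n - sum_sq) / (n * n * span)


# One five-option item: identical scores give 0, evenly spread scores give 1.
print(hankins_delta_g([3, 3, 3, 3, 3], n_items=1, n_options=5))
print(hankins_delta_g([1, 2, 3, 4, 5], n_items=1, n_options=5))
```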
Confirmatory factor analysis, estimated by WLSMV, was used to assess the quality of fit of the model and to compare competing models (construct validity). Model fit was analyzed considering: chi-square (χ²[df]) with values of p ≤ 0.05; RMSEA (Root Mean Square Error of Approximation), with values ≤ 0.05 suggesting a good fit; CFI (Comparative Fit Index) and TLI (Tucker-Lewis Index), with values > 0.95 indicating proper fit; and factor loadings (FL) of the items, with values ≥ 0.40 considered acceptable [25]. All analyses were conducted in R version 3.6.1 for Windows, using the lavaan package version 0.6-7.
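For readers who want to relate the chi-square statistic to the RMSEA, the sketch below uses the common textbook point estimate, the square root of max(χ² - df, 0) / (df (N - 1)). It is offered as an approximation only: software may divide by N instead of N - 1, and robust or scaled variants used with WLSMV can differ. The function name is hypothetical. The inputs are the values reported in the abstract for the baseline sample (χ² = 1022.89, df = 311, N = 816).

```python
import math


def rmsea_point_estimate(chi_square, df, n):
    """Textbook RMSEA point estimate from a chi-square statistic."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n greater than 1")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values reported in the abstract for the baseline sample.
estimate = rmsea_point_estimate(1022.89, 311, 816)
print(round(estimate, 3))
```

Under these assumptions the estimate rounds to the reported RMSEA of 0.053.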

Results
Of the 251 students who participated in the pilot study, 210 were considered for the reproducibility analyses, as they filled out the questionnaire at both collection times (83.7% response rate). For the baseline sample, 921 students completed the questionnaire and, of these, 816 were included in the analyses because they had complete data on the 27 items of Kidscreen (88.6% response rate). Boys predominated in the pilot study (52.9%) and girls at the baseline (52.7%). The mean age of the pilot study sample was 13.7 ± 1.0 years, while that of the baseline sample was 13.1 ± 1.1 years. The mean HRQoL scores (T-scores: 50 ± 10) ranged from 43.6 points in the dimension Physical Well-being (pilot) to 49.8 points in the dimension Social Support and Peers (baseline), being proportionally higher at the baseline in all five dimensions (Table 1).
Regarding reproducibility, the intraclass correlation coefficients ranged from 0.71 (School environment) to 0.78 (Physical Well-being), and the Gwet AC2 coefficients ranged from 0.60 for item 3 "Have you been physically active (e.g. running, climbing, biking)?" to 0.83 for item 1 "In general, how would you say your health is?", both from the dimension Physical Well-being (Table 2).
The measure of internal consistency ranged from 0.82 in the dimension School Environment to 0.91 in the dimension Psychological Well-being (Table 3). The floor effects ranged from 0.1 to 0.5%, while the ceiling effects ranged from 2.7 to 16.0%. The scale's discriminatory power ranged from 0.94 in the Social Support and Peers dimension to 0.98 in the Psychological Well-being dimension.
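The text does not state how the floor and ceiling effects were computed. A common definition, assumed here, is the percentage of respondents who obtain the lowest or the highest possible score in a dimension. The sketch below applies that definition to invented raw scores for a four-item dimension (possible range 4 to 20); the function name is hypothetical.

```python
def floor_ceiling_effects(scores, minimum, maximum):
    """Percentage of respondents at the lowest and at the highest possible score."""
    if not scores:
        raise ValueError("at least one score is required")
    n = len(scores)
    floor = 100 * sum(score == minimum for score in scores) / n
    ceiling = 100 * sum(score == maximum for score in scores) / n
    return floor, ceiling


# Invented raw scores of five respondents on a four-item, five-option dimension.
floor_pct, ceiling_pct = floor_ceiling_effects([4, 10, 20, 20, 12], minimum=4, maximum=20)
print(floor_pct, ceiling_pct)
```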
The results showed that the factor loadings were greater than 0.40, except for item 19 "Have you had enough money for your expenses?" of the "Autonomy and Parent relation" dimension (loading = 0.36). The other loadings ranged from 0.43 for item 18 "Have you had enough money to do the same things as your friends?", from the dimension "Autonomy and Parent relation", to 0.93 for item 21 "Have you had fun with your friends?", from the "Social Support and Peers" dimension. The correlations between dimensions ranged from 0.46 between Social Support and Peers and School Environment to 0.77 between Psychological Well-being and Autonomy and Parent relation (Fig. 1).
The fit quality indicators of the model without error covariances showed that the factorial structure (

Discussion
This study demonstrated that the Kidscreen 27 instrument reached good levels of reproducibility, internal consistency, discriminatory power and construct validity, being adequate to measure the HRQoL of adolescents in the Brazilian context.
A large body of psychometric results from international research involving the Kidscreen-27 instrument allows a direct comparison with our results [1,7,9,13,26]. As for HRQoL scores, some previous studies have reported similar values [7,26], with means of around 49.8 to 53.9, and others higher values [9,13], of around 52.1-85.7. Explanations for these differences may include the cultural, socioeconomic and methodological context of the research, factors that can influence HRQoL scores both positively and negatively in the different dimensions of the instrument.
Regarding reproducibility, the ICC values in this study (0.71 to 0.78) were higher than those of the study by Ravens-Sieberer et al. [26], with data from 13 European countries, and similar to those of the studies by Andersen et al. [7] (ICC: 0.71 to 0.81) and Nezu et al. [8] (ICC: 0.73 to 0.79). Higher values were found in the studies by Quintero et al. [27] (ICC: 0.87 to 0.99), Ng et al. [28] (ICC: 0.78 to 0.86) and Farias Júnior et al. [13] (0.70-0.96). In the present study, the second application of the questionnaire took place seven days after the first. Virtually all studies used an interval of seven days or more between applications. Some caution is needed when comparing ICC values, given differences in the questionnaire application procedure (face-to-face interview, telephone interview, proxy interview with a parent or guardian, and self-report) and in the age range of the respondents [29]. Specifically regarding administration, Kidscreen was originally developed to be answered by self-report or by parents/guardians [1]. Administration by interview was used in the studies by Farias Júnior et al. [13] and Quintero et al. [27], which may explain their slightly higher ICC values in relation to those found in this study. Moderate to good agreement was observed by the Gwet coefficient among the items of the five dimensions, with most items classified as "good agreement" according to the criteria of Landis & Koch [20]. Item 1 "In general, how would you say your health is?" was the item with the highest agreement (0.83), followed by item 25 "Have you got on well at school?" (0.81). The other items had values ranging from 0.60 to 0.73. On this issue, the literature has highlighted the frequent use of the Kappa coefficient instead of the Gwet coefficient [30] to analyze the stability of categorical variables. Although Kilem Li Gwet showed in 2002 [19] the superiority of the Gwet coefficient over Cohen's Kappa, few researchers use it as a statistical tool, and some are not even aware of its existence [30].
Internal consistency in this study was assessed by McDonald's omega coefficient, an estimator that considers the different standardized factor loadings of each item of the instrument. The omega values in the present study ranged from 0.82 for the School Environment dimension to 0.91 for the Psychological Well-being dimension, classified as desirable [21].
The choice of this estimator departs from most validation studies of Kidscreen, which chose to present Cronbach's alpha [31], including that of the Kidscreen Group [1]. In view of this, and because no psychometric study of the Kidscreen instrument using the omega coefficient for internal consistency analysis was found in the literature, it was decided to present the alpha values as well. Cronbach's alpha values in the present study ranged from 0.  [9]. Following Cronbach's criteria [31], it can be said that the scale has acceptable internal consistency.
The scale's discriminatory power ranged from 0.94 to 0.98. These results are similar to those found by the Kidscreen Group (0.81-0.99). This analysis, although recommended [1], has not yet been reported by any other study analyzing the psychometric properties of Kidscreen. In this study, we opted for the Hankins Delta G coefficient [24], as it is theoretically more appropriate for the type of items in Kidscreen, which are polytomous with a graded response scale.
The CFA supported the five dimensions found in the original study and in other studies [7,13], demonstrating that the instrument is in accordance with the conceptual and theoretical considerations on the measurement of HRQoL. The loadings of the items, as well as the correlations between the dimensions, were considered good, and the multidimensional structure was confirmed. The final model showed an acceptable fit, especially when the modification indices were taken into account. The WLSMV estimator used in this analysis relies on the matrix of polychoric correlations between the items. Compared with Pearson's coefficient, correlations of this nature tend to give a more consistent estimate of the true linear relationship between ordinal variables [23]. Other Kidscreen validity studies have used maximum likelihood as the estimator for exploratory and confirmatory factor analyses [9,13].
In the model, three error covariances were added, in this order: i18-i19, i9-i10 and i10-i11. Conceptually, the covariance of these items makes sense. Item 18 "Have you had enough money to do the same things as your friends?" and item 19 "Have you had enough money for your expenses?", both from the dimension "Autonomy and Parent relation", are related to financial issues, unlike the other items (13, 14, 15, 16 and 17), which are related to autonomy, relationship with parents and life at home. Although parents are probably the source of financial support at this age, this seems to be independent of the adolescents' relationships with them [28].
The other covariances, of item 9 "Have you felt sad?" with item 10 "Have you felt so bad that you didn't want to do anything?", and of item 10 with item 11 "Have you felt lonely?", are related to mood and emotion, unlike the other items of the dimension (6, 7, 8 and 12), which are related to psychological characteristics (items 6, 7 and 8) and self-perception (item 12). Andersen et al. [7], when evaluating the psychometric properties of the Norwegian version of Kidscreen 27, draw attention to another detail: items 9, 10 and 11 are formulated "negatively", unlike the other items, which are written "positively", and this can influence the contribution of the items to the dimension in question.
In contrast to these findings, two studies confirmed other dimensional structures for Kidscreen in its 27-item version. Ng et al. [28] [27] performed exploratory factor analysis and confirmed a seven-dimensional version. When item 1 "In general, how would you say your health is?" was excluded, six dimensions remained. In the end, the confirmatory factor analysis ratified the six dimensions (RMSEA = 0.097; CFI = 0.754; NFI = 0.699; GFI = 0.754; AGFI = 0.701).
This study had strengths. Theoretical development and validation of instruments for measuring quality of life in adolescents has become relevant in different contexts in the health field, as it is a recognized way to understand the needs in health services and guide decision making for the allocation of financial resources for health programs [32]. In addition, methodological rigor of processing and analysis of variables was followed as recommended by the Kidscreen Group, which allows external comparisons.
As limitations, convergent validity and sensitivity to change were not tested, owing to the objectives and cross-sectional design of this study. An important fact to be highlighted is that, in the Kidscreen 27 version of the instrument, the Self-Perception dimension is represented only by item 12 "Have you been happy with the way you are?" Originally, the instrument was built from a generic proposal, intended to measure the HRQoL of children and adolescents who are healthy and/or have chronic diseases. The sample of this study did not include adolescents with physical limitations, a group in which it would be interesting to test the psychometric properties.

Conclusions
Since its origin, Kidscreen 27 has been considered a generic and cross-cultural instrument for measuring HRQoL. In this assessment, good levels of reliability were achieved, as assessed by test-retest reproducibility and internal consistency; the scale had high discriminatory power; and its five dimensions were confirmed by the construct validity analysis. The instrument is therefore indicated to measure health-related quality of life in Brazilian adolescents. Future studies should evaluate whether the items related to mood and emotion should compose a dimension separate from the items related to psychological characteristics and self-perception, since the direction of the items and the results were quite different.

Abbreviations
HRQoL: Health-related quality of life; ICC: Intraclass correlation coefficient; WLSMV: Weighted least squares mean and variance adjusted estimator; RMSEA: Root mean square error of approximation; CFI: Comparative Fit Index; TLI: Tucker-Lewis Index; FL: Factor loadings.