The analyses of the domain structure of the VFQ-25 did not fully reproduce the proposed structure of the instrument. They provided support for the conceptual model, but there were also some differences of note. The ‘worry’ item (Q 3) did not load onto the mental health domain in the new model, but instead formed a separate single-item factor. The general driving question was associated with social functioning, whereas the items on driving in difficult conditions and driving at night formed a separate domain. The social functioning domain seemed to disappear, with both of its items being redistributed to other domains. The near vision item that refers to seeing things on a crowded shelf aligned with the distance vision domain rather than with the near vision domain of the original model. The results were similar when using baseline data alone and when pooling baseline data with follow-up data at Week 54. However, variable clustering methodology showed that none of the eight established multi-item scales met the criterion for further splitting.
The analyses of the instrument’s reliability were generally supportive. However, the assessment of internal consistency is often very restricted because many domains include only a single item. This lack of sufficient numbers of items is an important limitation in the design of the VFQ-25. It is better practice to have multiple items per domain, for example between four and ten, as this provides a much more thorough assessment of the dimension. The overall measurement properties of the VFQ-25 (i.e. not just reliability) would be improved if there were more items per domain, or perhaps fewer domains.
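To illustrate why the number of items per domain matters, the following minimal Python sketch computes Cronbach's alpha, a commonly used index of internal consistency, and applies the Spearman-Brown formula to project reliability for a lengthened scale. It is an illustration under stated assumptions rather than the analysis code used in this study: the function names and the example scores are hypothetical, and the choice of Cronbach's alpha is assumed here for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for one domain.

    `items` is a list of item-score lists, one list per item, with
    respondents in the same order in every list (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        # Alpha needs at least two items, so it cannot be estimated
        # for a single-item domain.
        raise ValueError("alpha is undefined for a single-item domain")
    n = len(items[0])
    if any(len(scores) != n for scores in items):
        raise ValueError("every item needs the same number of respondents")
    # Total domain score per respondent.
    totals = [sum(row) for row in zip(*items)]
    sum_item_variances = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


def spearman_brown(reliability, length_factor):
    """Projected reliability if the scale were `length_factor` times
    as long, assuming the added items behave like the existing ones."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


if __name__ == "__main__":
    # Made-up scores for a two-item domain, five respondents.
    domain = [[25, 50, 75, 50, 100], [50, 50, 100, 25, 75]]
    print(round(cronbach_alpha(domain), 3))
    # Projected reliability if a scale with reliability 0.5 were doubled.
    print(round(spearman_brown(0.5, 2), 3))
```

The sketch raises an error for a single-item domain because alpha cannot be estimated from one item, which is the practical consequence of the design limitation described above. The Spearman-Brown projection holds only under the assumption that any added items are comparable to the existing ones.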
The evidence to support concurrent and construct (discriminant and convergent) validity was mixed. Significant correlations were found between VFQ-25 and EQ-5D VAS scores, but these were generally low. Regression analyses found that different domains of the VFQ-25 were significantly predicted by variables such as ECOG, HbA1c, and ETDRS, but this pattern of findings was not consistent. Indeed, it was not clear to the authors what pattern of results would necessarily be predicted. It is interesting to note that, based upon very similar patterns of correlations between visual acuity and VFQ-25 scores in patients with diabetic retinopathy, Matza et al. concluded that the instrument “demonstrated construct validity”. The interpretation of whether an instrument is psychometrically valid is subjective and reflects different points of view.
The analysis of known-groups validity did demonstrate that the measure could differentiate between patients based on ETDRS letter score. However, the exploration of the validity of the VFQ-25 was complicated by the wide conceptual scope of the instrument. The VFQ-25 includes domains that reflect central visual function, such as near activities, distance activities, and driving; these domains should be related to visual acuity in DME patients. The instrument also includes more general domains, such as role difficulties and social functioning, which will be influenced by many factors other than central vision. There are therefore a priori reasons why these more general domains would not be related to ETDRS score. In reality, however, the association between domain scores and visual acuity was no greater for vision-related domains than it was for the more general domains. This indicates that much of what the VFQ-25 is measuring is not related to visual functioning. This argument is supported when the correlations with EQ-5D VAS are considered.
Other authors have raised fundamental concerns regarding the performance of the VFQ-25. Marella et al. report a study which employed Rasch analysis to explore the performance of the VFQ-25 in patients with low vision and concluded that the 12-factor structure of the VFQ-25 has no psychometric validity. They proposed a simpler two-factor solution reflecting visual and socio-emotional functioning. Some of the items do not fit this structure, and the authors suggest that these be dropped (when the instrument is used in patients with low vision). In addition, the Rasch analysis indicated that many of the items could effectively be reduced to a dichotomous outcome. This study, and others [37, 38], have also highlighted limitations with the VFQ-25. On balance, the VFQ-25 has design flaws and failings related to its psychometric performance; nonetheless, it was able to measure various aspects of HRQoL in patients with DME.
There are some important limitations to our study. The analyses were undertaken retrospectively on clinical trial data and, consequently, the data available for the analyses were limited. Quite detailed ophthalmology data were available, but there were limited data from other measures of HRQoL. This might be expected in a clinical trial, as participants may already be overburdened with the clinical measurements required by the protocol. HRQoL assessment was restricted to the EQ-5D, and for these analyses only the VAS was used. The VAS is an assessment of general health status and as such was probably influenced by many factors. It would be useful to gather more detailed information, perhaps using a generic profile measure of HRQoL and, in addition, an alternative measure of vision-specific HRQoL. In our analyses of construct validity we relied on the EQ-5D VAS as the benchmark for the VFQ-25 data. We believe this is a limitation of our study because the EQ-5D VAS is a single-item scale which assesses self-rated health. Therefore, while the EQ-5D VAS may be able to provide an overview of health status, it is not specific to the domains that the VFQ-25 is measuring. This makes it difficult to interpret the importance of the correlations. It would have been preferable to benchmark against another profile measure. In addition, we used the EQ-5D VAS to benchmark the MID of vision-specific functioning and HRQoL. While estimates of MID on these measures have been previously reported, it should be remembered that the EQ-5D VAS assesses overall health status. It is therefore not the most appropriate anchor for assessing the MID of vision-specific functioning and HRQoL as measured by the VFQ-25. It would have been preferable to use anchors that are much more specific to the domain being assessed, but such data were not captured in the trial.
It should also be noted that, because of the nature of the condition, the data were collected using a validated telephone interview script rather than through self-report or face-to-face interviews, except in India, where telephone interviews were not possible. This administration method was not specifically examined as a predictor variable in the psychometric analyses.
The present study suggests that the VFQ-25 has some validity as a measure of HRQoL in patients with DME. It performs as well in DME as in the assessment of diabetic retinopathy and age-related macular degeneration. However, consistent with other studies, the VFQ-25 has some important limitations. To improve measurement of vision-related quality of life in DME patients, a new or modified instrument should be developed.