The development of the TRIM Diabetes/Device followed draft FDA guidelines for the development of new PRO measures [1]. Ethics/IRB approval was obtained for both the item development and validation phases of the process.
Item Development
Item Generation
The development of the item content for the TRIM Diabetes/Device began in 2002 with the development of the TRIAD Measures (the Diabetes Symptom Measure (DSM), the Diabetes Productivity Measure (DPM), and the Diabetes Medication Satisfaction Measure (DiabMedSat)) for oral agents and injectable treatments (syringe and pen) for type 1 and type 2 diabetes [2]. This knowledge was supplemented in 2006 with information on inhaled insulin and in 2008 with information on insulin pumps and GLP-1 pens [3]. To develop the TRIM-Diabetes, previous data from the development of the Diabetes TRIAD Measures were qualitatively re-examined and re-analyzed along with the newly collected information regarding inhaled and pump-delivered insulin, thereby forming the basis for the TRIM-Diabetes/Device development project.
Information regarding the methodology for the collection of patient interview data from the TRIAD measures has been previously published [2]. Therefore, only information on the data collected since 2006 is presented here. These data included: (1) telephone or in-person interviews of diabetes experts, defined as endocrinologists or internists; and (2) telephone or in-person individual interviews and focus groups of type 1 and type 2 diabetes patients who had used inhaled and pump-delivered insulin in either the U.S. or Australia. These interviews followed a semi-structured interview guide which included open-ended questions regarding the perceived impact of treatment on the social, physical, and psychological aspects of life, treatment satisfaction issues, and the specific variables that act as moderators (i.e., factors that either help or hinder the impact of treatment). Expert and individual patient telephone interviews each lasted approximately one hour. Patient focus groups lasted approximately two hours. Completed interviews were used to guide and inform subsequent interviews. Thus, issues raised previously by experts and patients were further explored and either confirmed or rejected, thereby ensuring high content validity. The number of interviews and focus groups needed to ensure content validity was determined by the 'point of saturation' (i.e., no new information appeared during the last interview/focus group). All interviews and focus groups were conducted by the first author, who is a mental health clinician and trained group facilitator. All inhaled insulin patients were recruited for the interviews by a physician who had treated them for their diabetes with inhaled insulin either currently or in the past three months. Current insulin pump patients were recruited through a professional medical marketing group from their volunteer panel.
Both clinical experts and patients received an honorarium for their participation in the interviews.
Data from all interviews were coded and hand sorted and qualitatively analyzed to identify common themes and concepts. This analysis was then considered, along with the previously collected TRIAD focus group data analyses, to create a conceptual model of the multifaceted impact of diabetes medication across the spectrum of delivery systems. Based on this model, the preliminary items for the TRIM-Diabetes/Device were then generated to reflect the model domains. Domains (expected to become subscales of the final measure) were named to reflect the item content for that domain.
Cognitive Debriefing
Cognitive debriefing of the preliminary TRIM-Diabetes/Device measure, based on pre-defined item definitions, was conducted in an independent sample of persons with type 1 and type 2 diabetes. Each method of diabetes medication as well as administration type was represented (three participants each were currently on oral medication, insulin by syringe, insulin by pen, insulin by pump, or GLP-1 pen). It was not possible to include patients using inhaled insulin as it was no longer commercially available at the time of the debriefing.
Participants were mailed (or e-mailed) the TRIM-Diabetes/Device in advance and were asked to complete it prior to a prearranged individual telephone interview to assess comprehension, wording, formatting, clarity, and relevance of items. During this interview, for each item respondents were asked: 1) "What did the question mean to you?"; 2) "Was the question worded in a way that made sense to you?"; 3) "Was the question in any way offensive or objectionable to you?"; and 4) "Was the question about something which is important or relevant to you regarding your diabetes medication?" Respondents were then asked overall: 1) "Were the instructions and formatting clear?"; 2) "Did the response choices make sense?"; 3) "Does a two-week recall time frame seem appropriate considering what the questions are about?"; 4) "When you completed the questionnaire, did you have any difficulty accurately remembering your experiences over the past two weeks?"; 5) "Is there anything we forgot to ask?"; and 6) "Is there anything else you would like to comment on regarding the survey?"
After the first five participants were interviewed, findings were reviewed and a decision was made as to whether any changes to the measures were necessary. This process continued in blocks of five participants (one from each treatment/administration type group) until a determination was made that readability and relevance were acceptable based on consensus agreement between respondents in an entire block.
Validation Study
Procedures
An online validation study was conducted to collect data to assess the measurement and psychometric properties of the TRIM-Diabetes. To be eligible for the study, the subject was required to be over the age of eighteen, currently on their diabetes treatment, and able to read and comprehend English. The sample selection process created the sampling frame of targeted persons with diabetes who went through a healthcare profiler and self-reported that they had either type 1 or type 2 diabetes diagnosed by a physician. To avoid potential bias associated with panel recruitment from a single source or single methodology, a multi-sourced panel recruitment strategy was employed, including permission e-mails, affiliate networks, and web site advertising. A stratified sampling procedure was employed using invitation selection criteria to account for disproportional response rates between stratification categories. Stratification variables were age, ethnicity, income, and primary method/type of diabetes medication (oral agents, insulin syringe, insulin pen, insulin pump, or GLP-1 pen).
Measures
The following measures were administered in a validation survey battery:
The TRIM-Diabetes/Device Preliminary Version
A 60-item self-report questionnaire assessing six hypothesized domains: Productivity (Daily Activities), Productivity (Work), Psychological, Device Satisfaction, Efficacy and Burden. The five-point Likert-like response options for all items range from Not at all/Never to Extremely/Almost always, Always or from Extremely dissatisfied/inconvenient to Extremely satisfied/convenient, depending upon the item stem, and are scored so that a higher score indicates a better health state.
Problem Areas in Diabetes (PAID)
A 20-item self-report scale developed to assess the current level of diabetes-related emotional distress both in type 1 and type 2 diabetes. PAID items contain commonly expressed negative emotions related to living with diabetes (e.g., worrying about hypoglycemia, feeling burned out by the daily efforts to manage the diabetes, feeling worried about the future and complications) that are rated on a five-point Likert scale ranging from 0 (not a problem) to 4 (a serious problem); scores are summed and standardized to a 0-100 scale, with higher scores indicating higher emotional distress [4].
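The PAID scoring rule described above (sum 20 items rated 0 to 4, then rescale to 0-100) can be sketched as follows. This is a minimal illustration, not the instrument's official scoring code; the function name `paid_score` is hypothetical, and the sketch assumes complete responses because the handling of missing items is not described here.

```python
def paid_score(item_responses):
    """Return a 0-100 PAID score from 20 item responses coded 0-4.

    Assumes all 20 items were answered; missing-data handling is
    not specified in the text and is deliberately left out.
    """
    if len(item_responses) != 20:
        raise ValueError("PAID expects exactly 20 item responses")
    if any(r not in (0, 1, 2, 3, 4) for r in item_responses):
        raise ValueError("each response must be an integer from 0 to 4")
    raw = sum(item_responses)   # raw sum ranges from 0 to 80
    return raw * 100 / 80       # rescale to 0-100 (equivalent to x 1.25)


print(paid_score([2] * 20))  # 50.0
```

A respondent who rates every item at the scale midpoint (2) therefore receives a standardized score of 50.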
Activity Impairment Assessment (AIA)
A five-item questionnaire assessing the amount of time that an individual's work or regular activities have been impaired as a result of their condition. Patients respond to AIA items on a five-point-type scale and are given a total score, where a higher score indicates greater impairment [5].
Insulin Treatment Satisfaction Questionnaire (ITSQ)
A 22-item questionnaire assessing treatment satisfaction for diabetic patients on insulin. In addition to a total score (sum of all domains), the items make up five domains: inconvenience of regimen, lifestyle flexibility, glycemic control, hypoglycemic control, and insulin delivery device satisfaction. All items are rated on a seven-point Likert scale, with the higher score (for the total score and for each subscale) indicating better treatment satisfaction. Only the inconvenience of regimen domain was used in this study [6].
Treatment Satisfaction Questionnaire for Medication (TSQM)
A 14-item generic questionnaire that measures a patient's satisfaction with medication. Items are rated on a five- or seven-point scale according to patients' experience with the medication in terms of satisfaction, bother/interference with side effects, ease of use and confidence, with a higher score indicating greater satisfaction [7].
Medication Compliance Scale (MCS)
A six-item unvalidated measure assessing how often a patient thinks about postponing or skipping doses, or has actually postponed or missed doses over the past two weeks. Items are scored on a six-point Likert scale, from 0 (never) to 5 (always). The total score is calculated by summing item values, with higher scores indicating greater compliance problems [8].
Diabetes Medication Satisfaction (DiabMedSat)
A 21-item measure consisting of three sub-scales: burden, efficacy and symptoms that was developed to measure diabetes treatment satisfaction and is applicable to a wide range of diabetes therapies. Items are rated on a five- or seven-point scale according to patients' experience with the medication, with a higher score indicating greater satisfaction [2].
Quality of Life Enjoyment and Satisfaction Questionnaire (Q-LES-Q) (Short Form)
A 16-item questionnaire developed to assess the degree of enjoyment and satisfaction experienced in eight areas (physical health, subjective feelings of well-being, work, household duties, school, leisure, social relationships, and general life quality). Each item is rated on a five-point Likert scale. Scores are aggregated, with higher scores indicative of greater enjoyment or satisfaction in each domain [9].
Center for Epidemiologic Studies Depression Scale (CES-D)
A 20-item measure comprising six scales reflecting major dimensions of depression: depressed mood, feelings of guilt and worthlessness, feelings of helplessness and hopelessness, psychomotor retardation, loss of appetite, and sleep disturbance experienced in the past week. Response categories indicate the frequency of occurrence of each item, and are scored on a four-point scale. Higher scores (both item and total scores) indicate more depressive symptoms. A score of 16 or higher has been used extensively as the cut-off point for high depressive symptoms on this scale [10].
Diabetes Fear of Injecting and Self-Testing Questionnaire Fear of Self Injecting subscale (D-FISQ)
A 15-item quality-of-life subscale that measures fear of self-injecting in adult diabetics. Subjects rate the items on a four-point Likert scale. Scores are summed, so that a higher score indicates greater fear [11].
Statistical Methods
Validation procedures were conducted according to an a priori developed statistical analysis plan (SAP). First, item level psychometric and conceptual criteria were used to refine and reduce the preliminary item pool and reduce redundancy between items. Next, factor analysis to identify structural domains was performed. Reliability and validity testing was then performed. It is the intention of the developers that the TRIM-Diabetes/Device may be used either as a total score or that each domain can stand alone as a separate measure. Therefore, all reliability and validity tests were performed on both the total scores and for each domain.
Item Characteristics and Measurement Model (Scaling)
For item reduction, both item psychometric properties and conceptual importance were taken into consideration in making retention/deletion decisions for the initial potential pool of 60 items. Items were considered for deletion based on the following psychometric criteria: if the item had missing data (i.e., no response) >5% of the time; if ceiling effects were present (>50% optimal response); or if item-to-item correlations within the total item pool were high, thus indicating redundancy between items (Pearson's correlation coefficient >0.70) [12]. Items that did not perform well psychometrically could be considered for retention if conceptually important and/or unique.
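The three psychometric screening criteria can be illustrated with a short sketch. The function names, the toy data, and the flag labels are hypothetical. The sketch also assumes that the ceiling rate is computed among respondents who answered the item and that correlations use pairwise-complete responses; neither detail is stated in the text. Flagged items are only candidates for deletion, since conceptual importance also informs the decision.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variance: treat as uncorrelated for screening
    return sxy / sqrt(sxx * syy)


def screen_items(responses, best_response=5):
    """responses: dict of item name -> list of answers (None = missing).

    Returns a dict of item name -> list of screening flags.
    """
    flags = {name: [] for name in responses}
    for name, values in responses.items():
        answered = [v for v in values if v is not None]
        if (len(values) - len(answered)) / len(values) > 0.05:
            flags[name].append("missing >5%")
        top = sum(v == best_response for v in answered)
        if answered and top / len(answered) > 0.50:
            flags[name].append("ceiling >50%")
    names = list(responses)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairs = [(x, y) for x, y in zip(responses[a], responses[b])
                     if x is not None and y is not None]
            if len(pairs) < 2:
                continue
            r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
            if r > 0.70:
                flags[a].append(f"r>0.70 with {b}")
                flags[b].append(f"r>0.70 with {a}")
    return flags


# Toy data for illustration only; these are not study data.
data = {
    "q1": [1, 2, 3, 4, 5, 3, 2, 4, 1, 5],
    "q2": [1, 2, 3, 4, 5, 3, 2, 4, 2, 5],     # nearly duplicates q1
    "q3": [5, 5, 5, 5, 5, 5, 4, 3, 5, 2],     # 70% give the best response
    "q4": [3, None, 2, 4, 1, 5, 2, 3, 4, 1],  # 10% missing
}
for item, reasons in screen_items(data).items():
    print(item, reasons)
```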
The factor structure was determined by an exploratory principal component factor analysis using Varimax orthogonal rotation with Kaiser normalization. Although a priori conceptual domains were developed, the number of factors in the analysis was not specified so as not to force an inappropriate solution. A scree plot was examined to confirm the final factor solution. Item-to-total scale correlations were assessed using Pearson's correlation between individual item scores and the total subscale score for the associated subscale. Correlation coefficients <0.40 were considered evidence of poor association [13].
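The item-to-total check can be sketched as below. As described in the text, each item is correlated with the total of its own subscale; the sketch assumes the total includes the item itself (an uncorrected item-total correlation), which the text does not state explicitly. The names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlations(subscale):
    """subscale: dict of item name -> complete responses, same respondent order."""
    n = len(next(iter(subscale.values())))
    totals = [sum(values[i] for values in subscale.values()) for i in range(n)]
    return {name: pearson(values, totals) for name, values in subscale.items()}


def weak_items(subscale, threshold=0.40):
    """Items whose item-to-total correlation falls below the threshold."""
    correlations = item_total_correlations(subscale)
    return [name for name, r in correlations.items() if r < threshold]


# Toy subscale for illustration only; these are not study data.
burden = {
    "b1": [1, 2, 3, 4, 5, 2],
    "b2": [2, 2, 3, 5, 5, 1],
    "b3": [3, 3, 3, 3, 2, 3],  # does not track the other two items
}
print(weak_items(burden))  # ['b3']
```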
Test for Reliability
The internal consistency reliability was assessed using Cronbach's alpha. This statistic is used to analyze additive scales to determine to what degree the items within the scale are associated. A high internal consistency suggests that the scale or subscale is measuring a single construct. Alpha values range from 0.00 to 1.00; however, a minimum alpha of 0.70 is preferred to claim the instrument is internally consistent [14].
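For reference, Cronbach's alpha for a scale of k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following is a minimal sketch of that standard formula; the function name and the data are hypothetical, and complete responses are assumed.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: list of item columns, each a list of scores for the
    same respondents in the same order (no missing values)."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent total
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Toy data for illustration only; these are not study data.
subscale = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(subscale)
print(round(alpha, 2), alpha >= 0.70)  # 0.95 True
```

Items that move together inflate the variance of the total relative to the sum of the item variances, which is what pushes alpha toward 1.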
The test-retest reliability was assessed at approximately two weeks post initial completion of the battery. To be eligible for the retest, participants had to respond "No" to the questions: "Have you experienced any major life events since you filled out the previous questionnaire approximately 2 weeks ago (e.g., moving, divorce, losing job)?" and "Has the past 2 weeks been an unusually stressful period for you?" and respond "Yes" to the question: "Have you been taking the same diabetes medication over the past 2 weeks?" An alpha of >0.70 was considered evidence of acceptable test-retest reliability.
Tests for Validity
The validation of the TRIM-Diabetes/Device followed the analyses as specified in the SAP. However, since the factor analyses yielded slight differences from the hypothesized domains, some of the a priori defined hypotheses for the validation had to be altered to fit the new measurement model. These new hypotheses were formulated after finalizing the factor structure and before examining the data for validity and reliability, and have therefore been considered a priori hypotheses.
The convergent validity was evaluated by testing the following a priori hypotheses using a two-tailed Pearson's correlation coefficient with significance at the p < 0.05 level. When more than one hypothesis per domain is proposed, the minimum threshold of at least one hypothesis had to be met to claim convergent validity. Correlation coefficients >0.40 were considered acceptable evidence of moderate to strong associations [13].
H01: Total score: TRIM-Diabetes total will be significantly related to generic treatment satisfaction (TSQM) and/or an overall self-report total impact item.
H02: Treatment Burden subscale: TRIM-Diabetes Treatment Burden will be significantly related to burden (burden subscale of the DiabMedSat) and/or an overall burden self-report item.
H03: Daily Life subscale: TRIM-Diabetes Daily Life will be significantly related to restrictions in daily activities (AIA) and/or an overall daily life self-report item.
H04: Diabetes Management: TRIM-Diabetes Management will be significantly related to self-reported efficacy (Efficacy subscale of the DiabMedSat and TSQM efficacy) and/or an overall diabetes control self-report item.
H05: Psychological Health subscale: TRIM-Diabetes Psychological Health will be significantly related to self-reported problems with diabetes (PAID) and/or an overall emotional self-report item.
H06: Compliance subscale: TRIM-Diabetes Compliance will be significantly related to assessed compliance (MCS).
H07: Total score: TRIM-Diabetes Device total and the domains of Device Function and Device Bother subscales will be significantly related to self-reported device satisfaction (subscale of the TSQM and ITSQ) and an overall burden of medication self-report item.
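The decision rule for the "and/or" hypotheses above can be sketched as follows. The sketch assumes that a comparison counts as met when it is both significant (p < 0.05) and at least moderate in size (absolute r > 0.40); combining the two thresholds, and taking the absolute value because some comparators are scored in the opposite direction, are assumptions made here for illustration. The (r, p) pairs would come from a statistics package, and the numbers shown are invented.

```python
def convergent_validity_met(results, alpha=0.05, min_abs_r=0.40):
    """results: list of (r, p) pairs, one per hypothesized comparator.

    Returns True if at least one comparison is significant and at
    least moderate in size (assumed reading of the decision rule).
    """
    return any(p < alpha and abs(r) > min_abs_r for r, p in results)


# Invented values for illustration only; these are not study results.
burden_results = [(0.62, 0.001), (0.18, 0.21)]
print(convergent_validity_met(burden_results))   # True
print(convergent_validity_met([(0.18, 0.21)]))   # False
```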
The known-groups validity, or the ability of a PRO to distinguish between groups known to differ on characteristics which are expected to impact the PRO assessment, was evaluated by assessing the following a priori hypotheses. The TRIM-Diabetes scores of the known groups were compared using one-way ANOVA with groups as a fixed factor, with p-values at the p < 0.05 level as evidence of a significant difference between known groups. For domains with two hypotheses, at least one had to be met as the minimal threshold to claim known-groups validity.
H08: Total score: TRIM-Diabetes total will be significantly greater for those willing to switch to another medication (coded as not at all, slightly or moderately, extremely interested) or not recommend to others and/or as compliance improves.
H09: Treatment Burden subscale: TRIM-Diabetes Treatment Burden will significantly increase as number of daily injections increases and/or the type of treatment becomes more burdensome (would be less for orals/tablet group).
H10: Daily Life subscale: TRIM-Diabetes Daily Life will significantly increase as life satisfaction increases (Q-LES-Q) (coded as poor/fair/good) and/or, for those who work, greater satisfaction for those who lost fewer days from work due to diabetes (<1 day/1-2 days/3+ days).
H11: Diabetes Management subscale: TRIM-Diabetes Management score will significantly increase as: A1c levels improve (coded as <6.8/6.8 to 8.0/>8.0), the number of medical visits decreases (coded as none/1/2+), change in diabetes treatment plans due to low blood sugar decreases and/or as self-reported diabetes control increases.
H12: Psychological Health subscale: TRIM-Diabetes Psychological Health will significantly increase as depression (CES-D) decreases and/or level of family and friends support of diabetes management efforts increases.
H13: Compliance subscale: TRIM-Diabetes Compliance will be significantly greater for those patients only taking oral medications, lower for those using either a pump, syringe, or pen.
H14: Device Satisfaction: TRIM-Diabetes Device total and device Function and Bother will significantly increase as fear of injections (D-FISQ) decreases (for those on any injectable treatment).
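The one-way ANOVA used for the known-groups comparisons reduces to an F ratio of between-group to within-group mean squares. The sketch below computes that ratio and its degrees of freedom with the standard library only; the p-value would then be read from the F distribution with a statistics package. The function name and the group data are hypothetical.

```python
def one_way_anova_f(groups):
    """groups: list of lists of scores, one list per known group.

    Returns (F, df_between, df_within). Assumes at least two groups
    and some within-group variability.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Toy scores for three groups; illustration only, not study data.
f_ratio, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 2), df1, df2)  # 13.0 2 6
```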
Interpretability: Minimally Important Difference
Since we did not have longitudinal data to examine the minimally important difference (MID) using a change score, self-report items also included in the battery, one per domain of the TRIM-Diabetes/Device, were used as anchors to approximate the MID. This analysis was considered exploratory and is meant to provide preliminary estimates of differences established using an anchor-based approach. To calculate the MID, the relationship and magnitude of change between these self-report "overall" items and the scores of each TRIM-Diabetes domain were examined. As specified in the SAP, the MID considered changes in scores of TRIM-Diabetes domains between responses of roughly "Slightly" and "Somewhat" as the minimally important interval. For example, the difference in the mean TRIM-Diabetes Burden domain score between those who responded "Slightly burdensome" and those who responded "Somewhat burdensome" on the independent item "Overall, how burdensome do you think that your insulin/diabetes medication has been?" was calculated. One-half standard deviations were calculated as the threshold for the difference to assess the MID [15].
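The anchor-based calculation can be sketched as below: the difference in mean domain score between two adjacent anchor categories is set beside a half standard deviation threshold. The sketch assumes the standard deviation is taken over the domain scores of the whole sample, which the text does not specify; the function name, anchor labels, and data are hypothetical.

```python
from statistics import mean, stdev


def mid_estimates(domain_scores, anchor_responses,
                  lower="Slightly", upper="Somewhat"):
    """Return (anchor_difference, half_sd) for one domain.

    domain_scores and anchor_responses are parallel lists, one entry
    per respondent. half_sd uses the whole-sample SD (an assumption).
    """
    low = [s for s, a in zip(domain_scores, anchor_responses) if a == lower]
    high = [s for s, a in zip(domain_scores, anchor_responses) if a == upper]
    anchor_difference = abs(mean(low) - mean(high))
    half_sd = 0.5 * stdev(domain_scores)
    return anchor_difference, half_sd


# Toy data for illustration only; these are not study data.
scores = [90, 80, 70, 60, 50, 40]
anchors = ["Not at all", "Slightly", "Slightly",
           "Somewhat", "Somewhat", "Extremely"]
difference, half_sd = mid_estimates(scores, anchors)
print(difference, round(half_sd, 2))  # 20 9.35
```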