Development of a proxy-reported pulmonary outcome scale for preterm infants with bronchopulmonary dysplasia



To develop an accurate, proxy-reported bedside measurement tool for assessment of the severity of bronchopulmonary dysplasia (also called chronic lung disease) in preterm infants to supplement providers' current biometric measurements of the disease.


We adapted Patient-Reported Outcomes Measurement Information System (PROMIS) methodology to develop the Proxy-Reported Pulmonary Outcomes Scale (PRPOS). A multidisciplinary group of registered nurses, nurse practitioners, neonatologists, developmental specialists, and feeding specialists at five academic medical centers participated in the PRPOS development, which included five phases: (1) identification of domains, items, and responses; (2) item classification and selection using a modified Delphi process; (3) focus group exploration of items and response options; (4) cognitive interviews on a preliminary scale; and (5) final revision before field testing.


Each phase of the process helped us to identify, classify, review, and revise possible domains, questions, and response options. The final items for field testing include 26 questions or observations that a nurse assesses before, during, and after routine care time and feeding.


We successfully created a prototype scale using modified PROMIS methodology. This process can serve as a model for the development of proxy-reported outcomes scales in other pediatric populations.


Bronchopulmonary dysplasia (BPD), or chronic lung disease (CLD), is one of the most common sequelae of preterm birth [1], and its severity is an important predictor of long-term outcomes in premature infants [2]. The infants most vulnerable to BPD are those born before the 28th week of gestation (extremely low gestational age newborns, ELGANs). Compared to their peers without lung disease, ELGANs with BPD have increased mortality [2, 3]. Those who survive with BPD have prolonged initial hospitalizations [4] and an increased risk of neurodevelopmental impairment such as mental retardation and cerebral palsy [5–7]. These BPD-associated morbidities lead to increased family stress, economic hardship, and increased health care costs throughout childhood [4, 8, 9].

The most common definitions of BPD include the receipt of oxygen at 36 weeks post-menstrual age, with or without a physiologic test of oxygen dependency [10, 11], and the National Institutes of Health (NIH) consensus categorization of "none," "mild," "moderate," and "severe," which is based on the duration of oxygen therapy and the amount of oxygen received at 36 weeks [12]. These NIH categories help determine the effect of therapies designed to reduce the incidence of BPD in a clinical trial, but they are not useful to providers who are attempting to examine the day-to-day pulmonary function of an infant, and this oxygen-based categorization does not capture the nuances of disease-related functional limitations.

A valid bedside assessment tool of pulmonary function will give clinicians and researchers a more effective way to test therapies by reliably identifying subtle effects on infant pulmonary function or by identifying subgroups of infants who respond to therapies such as diuretics or bronchodilators. Our goal was to develop a scale to assess the effects of lung disease on functional outcomes using proxy-reported measures. We adapted Patient-Reported Outcomes Measurement Information System (PROMIS) methodology, a widely recognized system of instrument item selection and refinement for patient-reported outcomes [13–18], to develop a parsimonious Proxy-Reported Pulmonary Outcomes Scale (PRPOS). Our most significant adaptation of current PROMIS methods is our complete reliance on proxy-reported measures, because neonates cannot report on their own condition.

The ultimate goal of PRPOS is to provide clinicians with a set of items and responses in various functional domains that can discriminate between infants with differing degrees of BPD severity. Our secondary goal is to present a model instrument development process that might be replicated for use in diseases of infancy. This paper describes the first five of six steps in the scale development process: (1) identification of domains, items, and responses; (2) item classification and selection using a modified Delphi process; (3) focus group exploration of items and response options; (4) cognitive interviews of proxy reporters on a preliminary scale; (5) final revision before field testing; and (6) reliability testing (for which analysis is ongoing).


Methods

We developed PRPOS in the five phases illustrated in Figure 1.

Figure 1 PRPOS development phases. Phases of development of the Proxy-Reported Pulmonary Outcomes Scale, from November 2009 to June 2010.

Phase 1: Identification of domains, items, and responses

We identified an appropriate set of activity domains and assessments for inclusion in the scale using face-to-face interviews with experienced neonatologists, nurses, and neonatal nurse practitioners at two academic medical centers (The University of North Carolina at Chapel Hill [UNC] and Duke University) and input from a panel of national experts in neonatology, pediatric pulmonology, feeding, and development.

We conducted interviews individually or in small groups using a "brainstorming" format. We asked respondents to use their clinical experience to identify characteristics of an infant diagnosed with BPD [CLD] at 36 weeks and any activities that precipitated these characteristics. During this phase of the process, items were included if at least two participants agreed on their discriminative utility, with the goal of identifying a complete set of potential items. The resulting set of activity domains and assessments, which grew in the course of the discussions from nine original "assessments and domains" to what began to be called 15 "qualities and conditions," was used in the next phase of the development process.

Phase 2: Item classification and selection

We used a modified Delphi process, a method of obtaining consensus on a subject matter from experts in the field through anonymous solicitation or polling of their opinions [19], to identify, classify, review, and revise possible items and domains. Modified Delphi process participants included experienced neonatologists, nurses, and neonatal nurse practitioners, developmental specialists, and feeding specialists at five academic medical centers (UNC, Duke University, Stanford University, University of Alabama at Birmingham [UAB], and University of Iowa [Iowa]).

Our modified Delphi process included three steps: (1) a survey, (2) working group meetings, and (3) a second survey reflecting areas where consensus had not yet been achieved. The surveys were designed and administered using the web-based survey software Qualtrics (Provo, UT), and each respondent received a unique URL to the surveys. The entire process took place from December 2009 to February 2010.

We invited 59 clinicians from five academic medical centers to participate in the two surveys (Table 1); in addition, we asked our eight expert panel members to take the second survey.

Table 1 Demographic information on participants in the modified Delphi process

The first survey (step one) had three parts. In part one, respondents described how certain qualities or conditions (alertness; tone of the back/trunk, lower body, and upper body; eye appearance; eyebrow appearance; desaturations; presence of tachypnea; recovery time from tachypnea; retractions; and heart rate) appear in infants at four levels of BPD [CLD] severity (none, mild, moderate, and severe) in three situations (at baseline before care, during care time, and during the first five minutes of feeding). Table 2 presents the scenarios used to describe level of CLD severity. Respondents also described the appearance of three feeding cues: opening the mouth, dropping the tongue, and the position of the chin. The survey provided three "other" categories where respondents could fill in additional characteristics they thought were important and describe the appearance of those characteristics in infants at each of the disease states.

Table 2 Scenarios to describe level of CLD severity

In part two of the survey, respondents rated how well each of the observation domains and feeding cues would discriminate levels of CLD severity using a scale of 1 to 9, where 1 = not at all well and 9 = extremely well.
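The part-two ratings lend themselves to a simple summary by respondent role. The following is a minimal sketch of how such ratings might be averaged, not the authors' analysis: the function name `mean_rating_by_role`, the tuple layout, and the example values are all hypothetical and do not come from the study data.

```python
from collections import defaultdict
from statistics import mean


def mean_rating_by_role(ratings):
    """Average 1-9 discrimination ratings per (role, item) pair.

    `ratings` is an iterable of (role, item, score) tuples.
    The layout is an assumption made for illustration only.
    """
    grouped = defaultdict(list)
    for role, item, score in ratings:
        # The survey scale runs from 1 (not at all well) to 9 (extremely well).
        if not 1 <= score <= 9:
            raise ValueError(f"score {score} is outside the 1-9 scale")
        grouped[(role, item)].append(score)
    return {key: mean(scores) for key, scores in grouped.items()}


# Made-up example values, not study data.
example = [
    ("nurse", "retractions", 9),
    ("nurse", "retractions", 8),
    ("neonatologist", "retractions", 8),
]
summary = mean_rating_by_role(example)
```

Grouping by role as well as by item keeps differences between professions visible instead of averaging them away.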

In part three, respondents provided open-ended feedback on the types of things that should be recorded before the assessment (e.g., whether a retinopathy of prematurity exam had taken place that day, or the timing of a furosemide dose) and made comments on other things we should consider in developing the scale.

Following the survey, we conducted three multidisciplinary workgroups (step two of the modified Delphi process) at UNC and Duke. At the start of each workgroup, we asked participants to score how well a set of items (quality of sleep; alertness, arousability, facial expression; disorganization; difficulty in calming; color change; tone; and feeding mechanics) reflects the severity of CLD in an infant during five states (sleep, transition, awake state, care time, and feeding) using a five-point scale (0 = no; 1 = some; 2 = moderately; 3 = pretty closely; and 4 = yes, very much). We then held guided discussions in which we asked participants to help refine our set of domains, narrow similar terms to a single, best descriptor, and clarify and simplify complex items. At the end of each workgroup, participants completed the score card again, and we determined whether discussion had changed preferences.

The feedback we received from the working groups contributed to development of our second survey (step three), in which respondents estimated at what severity of lung disease they might observe a particular behavior or action and how well those items discriminate levels of CLD severity. Table 3 lists the five behavior domains. We also asked whether the following terms were familiar and useful in describing breathing: intercostal, subcostal, and substernal retractions; head bobbing; and nasal flaring. The survey included space for respondents to provide additional comments. At the conclusion of the modified Delphi process, we developed a preliminary scale.

Table 3 Domains and behaviors used in survey 2

Phase 3: Focus groups

In February 2010, we conducted two focus groups of bedside nurses, a physical therapist, and a developmental specialist to clarify domains, confirm item definitions, and refine the wording of potential scale items and corresponding response options [13, 20]. An experienced focus group moderator conducted both focus groups, and members of the research team observed the discussions and provided background and clarification when necessary. The moderator used a semi-structured interview guide to elicit group participation and discussion on specific topic areas. We audio-recorded the focus group sessions and compared and collated the notes taken by investigators in the group with the moderator's notes from the transcripts.

Each focus group was presented with the same scenario describing the clinical course of a premature infant at 36 weeks and was then asked to think about the infant in four disease states: no CLD and mild, moderate, and severe CLD (see Additional File 1, Box S1). The focus group moderator instructed the participants to refer to the scenario throughout the discussion. Questions during the discussion centered on nine areas (Table 4).

Table 4 Sample focus group questions from nine domains

Phase 4: Cognitive interviews

Following the focus groups, we conducted semi-structured cognitive interviews to obtain information about what items actually meant to potential respondents in terms of their comprehension of individual questions (i.e., the question intent and meaning of terms), the sense of the questions overall, retrieval from memory of relevant information (i.e., recallability of information and recall strategy), decision processes, response processes, and instructions for using the tool [13, 18, 21, 22].

The cognitive interviews were approved by the Institutional Review Board at UNC, and all interviewees gave their informed consent prior to the interview. The interviews took place in April and May 2010 and included bedside nurses from three academic medical centers (UNC, Stanford, and Iowa), chosen to elucidate possible regional differences in response to terms. In our cognitive interview process, a bedside nurse used the scale on an infant and then participated in a cognitive interview. The experienced cognitive interviewer followed a semi-structured interview guide with questions about each item, the overall scale, and the directions.

Examples of the cognitive interview questions include:

  • On a scale of 1 to 5, with 1 being easiest and 5 being hardest, how easy or hard was it to choose an answer?

  • How sure are you of your answer? -or- How sure are you that it is [X]?

  • Would it be easier for you if you could choose from fewer options? (If yes, probe: what response options would you eliminate?)

  • Would it be easier for you if you could choose from more options? (If yes, probe: what other response options would you like to see here?)

  • Is there another response that should be added that would more fully describe what you observe?

  • Why do you say [X]? -or- Tell me why you chose [answer] instead of some other answer on the list.

After the first three interviews, we assessed each nurse's feedback and revised items and response options in the scale that respondents had thought were unclear. We then conducted three more interviews and made minor changes to the scale after each one.

Phase 5: Final scale revision

We used the results of the focus groups and cognitive interviews to develop a prototype PRPOS and prepare it for field testing in five geographically dispersed academic centers with varying rates of BPD.


Results

Phase 1: Identification of domains, items, and responses

During the brainstorming phase, 15 experienced clinicians identified an initial item pool of nine activity domains and nine assessments (Table 5). The national expert panel included two neonatologists, two pediatric pulmonologists, two infant feeding experts, and two neurodevelopmental specialists (seven from the United States and one from Canada). They confirmed that these domains and assessments were comprehensive, observable, and related to CLD at age 36 weeks adjusted gestational age. However, the expert panel raised a potential concern about assessing feeding behaviors because of the interaction of immaturity, respiratory disease, and feeder skill. Based on this input, we modified the feeding assessment to include only the initial period of feeding.

Table 5 Initial set of activity domains and assessments

Using input from the face-to-face interviews and expert panel, we arrived at a set of 15 activity domains and assessments, or "qualities and conditions," to be included in the next phase of the development process.

Phase 2: Item classification and selection (modified Delphi and workgroups)

We received 38 responses to the first survey (response rate = 64%) and 43 responses to the second survey (response rate = 64%). Seventeen people took part in the working groups: ten from UNC, including nurses and a feeding specialist, and seven from Duke, including developmental/family specialists, researchers, and a nurse.
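The two response rates can be checked with simple arithmetic. The sketch below assumes denominators inferred from the invitation description rather than stated explicitly in the text: 59 invited clinicians for the first survey, and 59 clinicians plus the eight expert panel members (67) for the second. The function name `response_rate_pct` is illustrative.

```python
def response_rate_pct(responses: int, invited: int) -> int:
    """Return the response rate as a whole-number percentage."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return round(100 * responses / invited)


# Assumed denominators: 59 invited clinicians for survey 1;
# 59 clinicians + 8 expert panel members = 67 for survey 2.
survey1_rate = response_rate_pct(38, 59)
survey2_rate = response_rate_pct(43, 59 + 8)
```

Under these assumptions both surveys round to 64%, which matches the reported figures.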

First Survey

The open-ended responses to the first survey provided us with user-generated, specific terms and phrases with which respondents could describe an infant's appearance at the four levels of BPD severity. Nurses and neonatal nurse practitioners provided more detailed descriptions than did neonatologists, and the feeding and developmental specialists provided more nuanced responses about feeding and development.

Table 6 shows that, on average, registered nurses, nurse practitioners, neonatologists, and developmental and feeding specialists scored alertness, tone, eyes, eyebrows, and feeding cues mid-range (4-6) on the scale. Desaturation, tachypnea over baseline, time to recover from tachypnea, and retractions received high scores (8 or 9). Nurses and specialists were more likely than physicians to rate aspects of tone and feeding as valuable discriminators of levels of CLD severity.

Table 6 Survey 1 results of average ratings of appropriateness of CLD observation

Respondents reported that pre-assessment data should include information on the clinical environment (e.g., parent visits, room noise), administration and timing of medications (e.g., timing of last steroid course, dose of caffeine/aminophylline), procedures and tests (e.g., laboratory tests, immunizations, radiology visit), and respiratory support (e.g., type and magnitude of support).

Workgroup Feedback

The workgroup participants assisted in narrowing multiple terms to a single, best term for 12 items. For example, eyebrow descriptors "furrowed," "scrunched," "contracted," and "tense" were narrowed to "furrowed." In addition, participants clarified, defined, or distinguished similar descriptions for eight items. For instance, participants helped discriminate between eyes closed due to stress, described by the term "eyes tightly closed," and eye closure that does not indicate distress, denoted by "closed and sleepy" eyes. In three cases, workgroup participants simplified terms; for example, we reduced descriptions of musculoskeletal tone from four to three because of clinicians' inability to discriminate accurately between four different levels.

Participants also highlighted areas of uncertainty, expressing concern that some of our feeding items (mouth/tongue position; rooting/feeding cues) might be influenced by the feeder's technique and level of experience or the infant's development and feeding skills, rather than by the infant's level of CLD severity. The groups also noted that it is difficult to decipher whether "raised" and "furrowed" eyebrows signal distress related to the infant's CLD.

When we asked workgroup members to rescore after discussion, their responses did not change significantly from what they reported before discussion. Overall, most items scored as "moderately" or "pretty closely" reflecting severity of CLD in infants.

Second Survey

Results from the second survey of the modified Delphi process suggested that we had a range of behaviors and actions that would indicate different levels of CLD severity for each domain (see Additional File 2, Table S1). For five of the domains (tone and desaturations during the first five minutes of feeding, respiratory rate with feeding, and calming and desaturations during care time), we did not have a descriptive behavior or action that would reflect the absence of disease, or "no CLD". Thus, we added a descriptor that reflected no CLD more clearly. For five domains (sleep, arousal/transition, general state during care time, color change, and feeding cues), we had descriptive behaviors or actions that showed overlap between moderate and severe disease. Most respondents (81%) reported that intercostal, subcostal, and substernal retractions, head bobbing, and nasal flaring were familiar and/or useful terms to describe breathing. A few respondents (16%) noted other degrees to consider between "barely visible" and "pronounced," and a few others (9%) did not find the term "head bob" familiar or useful.

We chose eleven areas for further discussion, expansion, and clarification using focus groups. We eliminated four potential assessment domains (sleep, rooting/feeding cues, mouth/tongue position, and tone during first five minutes of feeding) because of difficulty in defining an appropriate scale (sleep) or low scores on the CLD discrimination question. We also added two areas (retractions and nasal flaring) for inclusion on the tool, but we determined that we did not need to explore these further during the focus groups.

Phase 3: Focus Groups

Eighteen bedside nurses and specialists participated in the two focus groups, with nine participants in each group. All participants had at least three years of experience in the neonatal intensive care unit. The focus group discussions helped us to confirm response options for our items and determine the scale endpoints from no disease to severe CLD. Focus groups also helped us discover which terms should not be used as response options (e.g., "mottled" to describe the infant's color, and "floppy" or "hypotonic" to describe the infant's tone). As we note above, we began by presenting the focus groups with eleven areas (arousal, general state during care time, calming, eyes, eyebrows, color, tone, desaturations during feeding, respiratory rate during feeding, desaturations, and tachypnea) and asked group members to discuss transition/arousal, calming, agitation and energy/activity level, eye appearance, color change, tone, desaturations, and respiratory rate. We also asked focus group members to think about descriptors of general state (mainly calm or quiet, restless, agitated or irritable, distressed, and frantic) and of the ability to calm (self-calms, calms with containment, voice soothing, irritable, not easily calmed, and frantic/inconsolable). Based on the focus group discussions, we chose to eliminate the questions about color, tone, and eyebrows, to retain the questions on eyes, and to add questions about respiratory rate and desaturation during both care time and feeding.

Phase 4: Cognitive Interviews

Six bedside nurses from three academic medical centers (UNC, n = 3; Stanford University, n = 2; and the University of Iowa, n = 1) participated in one-hour cognitive interviews.

Overall, the nurses reported that the questions were easy to answer. Interview respondents found that the tool's instructions were understandable for the overall assessment and the care time portion of it, but they found the instructions less clear for the feeding portion of the assessment. At least one respondent suggested wording changes to the response options of 12 of 20 questions, but half or more of the respondents suggested changes to the response options for only these four questions: (1) How would you describe the infant's general state?; (2) How would you describe the infant's tone?; (3) How do the infant's eyes appear as you begin care?; and (4) How would you describe the infant's endurance during care time?
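The selection rule described here (revise a question's response options when half or more of the respondents suggested changes) can be sketched as a simple threshold check. The per-question counts below are hypothetical placeholders, since the text does not report them, and the function name `questions_needing_revision` is illustrative.

```python
def questions_needing_revision(suggestion_counts, n_respondents):
    """Return questions for which at least half of respondents suggested changes.

    `suggestion_counts` maps a question label to the number of respondents
    who suggested a change to its response options.
    """
    return [
        question
        for question, count in suggestion_counts.items()
        # "Half or more" is expressed without division: 2 * count >= n.
        if 2 * count >= n_respondents
    ]


# Hypothetical counts for six interviewees, not study data.
counts = {"Q1": 4, "Q2": 3, "Q3": 1, "Q4": 2}
flagged = questions_needing_revision(counts, n_respondents=6)
```

With six interviewees, the threshold works out to three or more respondents per question.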

In response to these cognitive interview results, we changed the response options in four cases about which at least half the respondents had suggestions. The old and new responses to the questions are presented in Table 7. To illustrate the evolving refinement of responses, we initially included two additional response options to the general state question: "sleeping" and "tired." After testing this twice, we realized that the question should actually be divided into two questions--one on "general state" and one on "general status."

Table 7 Response option rewording after cognitive interviews

Phase 5: Final item revision

We refined the directions for using the scale, particularly for the feeding assessment section. We defined "desaturation" as an oxygen saturation of less than 80%, and we defined "increased respiratory rate" as a respiratory rate above 60 or, if the infant's baseline respiratory rate was already above 60, as a respiratory rate above that baseline. We provided instructions for how to calculate the baseline respiratory rate (count for 30 seconds, then multiply by 2), and we revised other question wording and response options, examples of which can be seen in Table 8.
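The operational definitions in this paragraph can be written out as a short sketch. The thresholds (saturation below 80%, respiratory rate above 60, baseline counted over 30 seconds and doubled) come from the text; the function and constant names are illustrative assumptions and are not part of the PRPOS itself.

```python
DESATURATION_THRESHOLD_PCT = 80  # from the text: saturation below 80%
RESP_RATE_CUTOFF = 60            # from the text: respiratory rate above 60


def baseline_resp_rate(breaths_in_30_seconds: int) -> int:
    """Count breaths for 30 seconds, then multiply by 2."""
    return breaths_in_30_seconds * 2


def is_desaturation(spo2_pct: float) -> bool:
    """True when oxygen saturation is below 80%."""
    return spo2_pct < DESATURATION_THRESHOLD_PCT


def is_increased_resp_rate(current_rate: float, baseline_rate: float) -> bool:
    """True when the rate exceeds 60 or, if baseline is already above 60,
    when it exceeds the baseline."""
    threshold = max(RESP_RATE_CUTOFF, baseline_rate)
    return current_rate > threshold


# Example: 27 breaths counted in 30 seconds gives a baseline of 54.
baseline = baseline_resp_rate(27)
```

Taking the larger of 60 and the baseline as the threshold covers both branches of the definition in a single comparison.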

Table 8 Examples of question and response option wording changes to the PRPOS


The use of the PROMIS methodology in PRPOS's development assures us that the creation of the instrument has been both transparent and replicable. Expert clinical judgment from registered nurses, neonatal nurse practitioners, neonatologists, and developmental and feeding specialists has informed all phases of the development process. We continually refined the scale's potential set of items and response options with the goal of achieving a parsimonious set of items going into the cognitive interviews. We did not have to remove any items during the final scale revision. The prototype scale includes 26 questions about the infant that a nurse assesses before, during, and after a routine care time and feeding, and takes less than 2 minutes to complete.

Our scale development process was similar to, but more broadly inclusive and iterative than, the development of the Premature Infant Pain Profile [23, 24] because of our use of modified Delphi surveys, workgroups, focus groups, and cognitive interviews. We used the more extensive and rigorous modified PROMIS methodology in an attempt to overcome some of the inherent limitations of proxy measures and to accomplish much of the work of establishing valid and reliable items prospectively, rather than depending entirely on retrospective testing of measures. Each phase of the development process produced uniquely valuable information. The initial consultation with expert providers helped us explore and define the domains we needed to measure. The modified Delphi Process, including the two surveys interrupted by workgroup discussion, gave us enormous insight into shared--and unshared--conceptual underpinnings to common terms. The focus groups of end-users--the bedside neonatal intensive care unit nurses who care for infants with BPD--reassured us that we had succeeded in narrowing the domains to the minimum number that adequately describes BPD infants' disease state, to decrease the burden of administration. Finally, the cognitive interviewing gave us an exceptional opportunity to query users' experience with the instrument itself: "Was it understandable? Easy to complete? Effective? Did response categories mean to users what we intended them to mean?" We expect that completion of all these steps will enhance the usefulness of each individual item and enhance the usability of these assessment items across different clinical settings.

No single instrument development phase could have led to a successful product on its own, but no phase was dispensable, and, taken together, they have generated a set of items ready for quantitative assessment. Our development process is limited by the fact that it was performed only in academic medical centers, although it is reasonable to assume that most non-academic neonatal intensive care units share many features of the academic medical center environment. Our focus groups were conducted at only two neonatal intensive care units, both located in a single state, which raises the possibility of limitations due to region or practice culture. Our more geographically dispersed cognitive interviewing and field testing should help us identify any such problems.

The PRPOS is currently undergoing field testing at five academic medical centers, where bedside nurses are applying the assessment tool to a cohort of 150-200 neonates (25-40 per institution) between 23 and 30-6 weeks gestational age at birth (excluding infants with chromosomal abnormalities) and between 36-0 and 36-6 weeks postmenstrual age. At the conclusion of field testing, we will perform psychometric analyses of the data to test item validity and reliability, for the purpose of further scale refinement.


We expect that use of the PRPOS to assess observable, functional domains will greatly enhance the current unidimensional assessment of BPD severity based on oxygen use alone. For example, the PRPOS might allow clinicians and researchers to test therapies for BPD more effectively by accurately identifying subtle effects on lung function. In addition, refinement in the definition of BPD may allow more accurate prediction of important outcomes such as hospital length of stay and re-hospitalization after discharge, and further refine the relationship between BPD and neurodevelopmental outcome.

Use of a structured approach modelled on the rigorous PROMIS methodology helped us develop and refine a proxy-reported measurement instrument over a short period of time, while maintaining precision, clarity, discrimination, and comprehensiveness balanced with parsimony. This approach will serve as a useful model for others interested in developing proxy-reported outcomes measures.



BPD: bronchopulmonary dysplasia

CLD: chronic lung disease

ELGAN: extremely low gestational age newborn

PRPOS: Proxy-Reported Pulmonary Outcomes Scale


  1. Schmidt B, Asztalos EV, Roberts RS, Robertson CMT, Sauve RS, Whitfield MF, Trial of Indomethacin Prophylaxis in Preterms (TIPP) Investigators: Impact of bronchopulmonary dysplasia, brain injury, and severe retinopathy on the outcome of extremely low-birth-weight infants at 18 months: results from the trial of indomethacin prophylaxis in preterms. JAMA 2003,289(9):1124–9. 10.1001/jama.289.9.1124

  2. Ehrenkranz RA, Walsh MC, Vohr BR, Jobe AH, Wright LL, Fanaroff AA, Wrage LA, Poole K, National Institutes of Child Health and Human Development Neonatal Research Network: Validation of the National Institutes of Health consensus definition of bronchopulmonary dysplasia. Pediatrics 2005,116(6):1353–60. 10.1542/peds.2005-0249

  3. Ambalavanan N, Van Meurs KP, Perritt R, Carlo WA, Ehrenkranz RA, Stevenson DK, Lemons JA, Poole WK, Higgins RD, NICHD Neonatal Research Network: Predictors of death or bronchopulmonary dysplasia in preterm infants with respiratory failure. J Perinatol 2008,28(6):420–6. 10.1038/jp.2008.18

  4. Cotten CM, Oh W, McDonald S, Carlo W, Fanaroff AA, Duara S, Stoll B, Laptook A, Poole K, Wright LL, Goldberg RN: Prolonged hospital stay for extremely premature infants: risk factors, center differences, and the impact of mortality on selecting a best-performing center. J Perinatol 2005,25(10):650–5. 10.1038/

  5. Vohr BR, Wright LL, Dusick AM, Mele L, Verter J, Steichen JJ, Simon NP, Wilson DC, Broyles S, Bauer CR, Delaney-Black V, Yolton KA, Fleisher BE, Papile LA, Kaplan MD: Neurodevelopmental and functional outcomes of extremely low birth weight infants in the National Institute of Child Health and Human Development Neonatal Research Network, 1993–1994. Pediatrics 2000,105(6):1216–26. 10.1542/peds.105.6.1216

  6. Wood NS, Costeloe K, Gibson AT, Hennessy EM, Marlow N, Wilkinson AR, EPICure Study Group: The EPICure study: associations and antecedents of neurological and developmental disability at 30 months of age following extremely preterm birth. Arch Dis Child Fetal Neonatal Ed 2005,90(2):F134–40. 10.1136/adc.2004.052407

  7. Fily A, Pierrat V, Delporte V, Breart G, Truffert P, EPIPAGE Nord-Pas-de-Calais Study Group: Factors associated with neurodevelopmental outcome at 2 years after very preterm birth: the population-based Nord-Pas-de-Calais EPIPAGE cohort. Pediatrics 2006,117(2):357–66. 10.1542/peds.2005-0236

  8. Katz-Salamon M, Gerner EM, Jonsson B, Lagercrantz H: Early motor and mental development in very preterm infants with chronic lung disease. Arch Dis Child Fetal Neonatal Ed 2000,83(1):F1–6. 10.1136/fn.83.1.F1

  9. McAleese KA, Knapp MA, Rhodes TT: Financial and emotional cost of bronchopulmonary dysplasia. Clin Pediatr (Phila) 1993,32(7):393–400. 10.1177/000992289303200702

  10. Walsh MC, Yao Q, Gettner P, Hale E, Collins M, Hensman A, Everette R, Peters N, Miller N, Muran G, Auten K, Newman N, Rowan G, Grisby C, Arnell K, Miller L, Ball B, McDavid G, National Institute of Child Health and Human Development Neonatal Research Network: Impact of a physiologic definition on bronchopulmonary dysplasia rates. Pediatrics 2004,114(5):1305–11. 10.1542/peds.2004-0204

  11. Walsh MC, Wilson-Costello D, Zadell A, Newman N, Fanaroff A: Safety, reliability, and validity of a physiologic definition of bronchopulmonary dysplasia. J Perinatol 2003,23(6):451–6. 10.1038/

  12. Jobe AH, Bancalari E: Bronchopulmonary dysplasia. Am J Respir Crit Care Med 2001,163(7):1723–9.

  13. DeWalt DA, Rothrock N, Yount S, Stone AA, on behalf of the PROMIS Cooperative Group: Evaluation of Item Candidates: The PROMIS Qualitative Item Review. Med Care 2007,45(5 Suppl 1):S12–21.

  14. Reeve BB, Hays RD, Bjorner JB, Cook KF, Crane PK, Teresi JA, Thissen D, Revicki DA, Weiss DJ, Hambleton RK, Liu H, Gershon R, Reise SP, Lai JS, Cella D: Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS). Med Care 2007,45(5 Suppl 1):S22–31.

  15. Castel LD, Williams KA, Bosworth HB, Eisen SV, Hahn EA, Irwin DE, Kelly MA, Morse J, Stover A, DeWalt DA, DeVellis RF: Content validity in the PROMIS social-health domain: a qualitative analysis of focus-group data. Qual Life Res 2008,17(5):737–49. 10.1007/s11136-008-9352-3

  16. Walsh TR, Irwin DE, Meier A, Varni JW, DeWalt DA: The use of focus groups in the development of the PROMIS pediatrics item bank. Qual Life Res 2008,17(5):725–35. 10.1007/s11136-008-9338-1

  17. Christodoulou C, Junghaenel DU, DeWalt DA, Rothrock N, Stone AA: Cognitive interviewing in the evaluation of fatigue items: results from the patient-reported outcomes measurement information system (PROMIS). Qual Life Res 2008,17(10):1239–46. 10.1007/s11136-008-9402-x

  18. Irwin DE, Varni JW, Yeatts K, DeWalt DA: Cognitive interviewing methodology in the development of a pediatric item bank: a patient reported outcomes measurement information system (PROMIS) study. Health Qual Life Outcomes 2009,7:3.

  19. Fitch K, Bernstein SJ, Aguilar MS, Burnand B, LaCalle JR, Lazaro P, van het Loo M, McDonnell J, Vader J, Kahan JP: The RAND/UCLA Appropriateness Method User's Manual. Santa Monica, CA: RAND; 2001.

  20. Walsh TR, Irwin DE, Meier A, Varni JW, DeWalt DA: The use of focus groups in the development of the PROMIS pediatrics item bank. Qual Life Res 2008,17(5):725–35. 10.1007/s11136-008-9338-1

  21. Willis GB: Cognitive Interviewing: A "how to" guide. 1999.

  22. Christodoulou C, Junghaenel DU, DeWalt DA, Rothrock N, Stone AA: Cognitive interviewing in the evaluation of fatigue items: results from the patient-reported outcomes measurement information system (PROMIS). Qual Life Res 2008,17(10):1239–46. 10.1007/s11136-008-9402-x

  23. Ballantyne M, Stevens B, McAllister M, Dionne K, Jack A: Validation of the premature infant pain profile in the clinical setting. Clin J Pain 1999,15(4):297–303. 10.1097/00002508-199912000-00006

  24. Stevens B, Johnston C, Petryshen P, Taddio A: Premature Infant Pain Profile: development and initial validation. Clin J Pain 1996,12(1):13–22. 10.1097/00002508-199603000-00004



We would like to acknowledge the contributions of our expert panel (Steven H. Abman, MD, University of Colorado; Carl L. Bose, MD, University of North Carolina at Chapel Hill; Robert G. Castile, MD, Ohio State University and Nationwide Children's Hospital; Richard A. Ehrenkranz, MD, Yale University School of Medicine; Gail C. McCain, PhD, University of Miami School of Nursing and Health Studies; Michael E. Msall, MD, University of Chicago Medical Center; Rita H. Pickler, PhD, RN, Virginia Commonwealth University Medical Center; and Peter Rosenbaum, MD, McMaster University) and the physicians, nurse practitioners, registered nurses, developmental specialists, and feeding specialists from Duke University, Stanford University, University of Alabama-Birmingham, and University of Iowa who participated in our surveys, working groups, focus groups, and cognitive interviews. We also thank our focus group moderator, Diane Bloom, PhD, as well as Jeanne Snodgrass and Teresa Edwards for their assistance with cognitive interviewing.

This work was funded by the National Center for Research Resources and the Eunice Kennedy Shriver National Institute of Child Health and Human Development as a UNC Clinical and Translational Science Award Administrative Supplement, award number 3UL1RR025747-02S3.

Author information

Corresponding author

Correspondence to Wayne A Price.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

Research question: WAP, MML; Study conceptualization and design: WAP, MML, STR, DAD, SEM; Data collection: WAP, SEM, LMP; Data analysis and interpretation: WAP, SEM, STR, DAD; Initial draft and revisions of manuscript: SEM, WAP, STR; Manuscript revision: DAD, MML, LMP. All authors read and approved the final manuscript.

Electronic supplementary material


Additional file 1: Box S1. Focus Group Scenario. This file presents the scenario used in the focus group discussions. (DOC 21 KB)


Additional file 2: Table S1. Survey 2 results for CLD severity classification of behaviors and actions in each domain. This file shows a table of the domains and behaviors/actions used in the second survey, with an indication of whether the behavior/action was classified as being characteristic of no, mild, moderate, or severe lung disease. (DOC 66 KB)

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Massie, S.E., Tolleson-Rinehart, S., DeWalt, D.A. et al. Development of a proxy-reported pulmonary outcome scale for preterm infants with bronchopulmonary dysplasia. Health Qual Life Outcomes 9, 55 (2011).
