To translate and validate the psychometric characteristics of a Turkish version of the Obstetric Quality-of-Recovery score 11 tool used to measure post-cesarean delivery recovery in Turkish-speaking patients.
After the original English version of the Obstetric Quality-of-Recovery score 11 tool was translated into Turkish, it was psychometrically validated to assess the post-cesarean delivery quality of recovery. Validity, reliability, and feasibility were investigated. The Obstetric Quality-of-Recovery score 11 tool was administered to Turkish-speaking patients on postoperative day 1. On the same day, a global health visual analog scale was used to assess the patient's perceived global recovery.
One hundred and eighty-six patients completed their questionnaires, giving a completion rate of 97.38%. The Spearman rho (ρ) correlation coefficient between the Obstetric Quality-of-Recovery score and the global health visual analog scale (0–100 points) was 0.850 on postoperative day 1 (P < 0.001). Internal consistency, measured using Cronbach’s alpha, was 0.822. The split-half coefficient was 0.708. The Obstetric Quality-of-Recovery score differed significantly between the emergency and elective cesarean delivery groups (80 [41–104] vs. 83.3 [51–102]; P < 0.05). The test–retest reliability of the Obstetric Quality-of-Recovery score was more than 0.6 in 82% of items, indicating good repeatability and reliability.
The Obstetric Quality-of-Recovery score 11 is a valid and reliable tool to measure the post-cesarean quality of recovery in Turkish-speaking patients. The psychometric properties of the Turkish version of the scale to measure the post-cesarean quality of recovery were similar to those of the seminal English version.
Recovery after cesarean delivery (CD) is a multidimensional and complex process influenced by a variety of factors, such as patient, obstetric, and anesthetic characteristics. The majority of research on CD recovery has focused on physiological parameters, including pain, nausea/vomiting, recovery of bowel function, length of hospital stay, recovery timeframes, and the occurrence of adverse events such as poor outcome and mortality [1, 2]. There is an increasing focus on the patient-perceived quality of recovery (QoR). Patient-reported outcome measures (PROMs) can be used to assess the patient's perspective.
The Quality of Recovery-40 (QoR-40) score was developed in 2000 and is now widely used. It has also been successfully translated and validated in the Turkish language. The Obstetric Quality-of-Recovery (ObsQoR-11) score, derived from the QoR-40 scale, was developed to assess recovery in the first 24 h after CD. The ObsQoR-11 scale is a composite patient-reported outcome measurement of the quality of recovery that evaluates four underlying factors as follows. Physical comfort and pain are represented by Factor 1. Factor 2 reflects both physical independence and mental well-being. Factor 3 represents physical independence, whereas Factor 4 supplements Factor 1. The ObsQoR-11 scale provides a score ranging from 0 to 110, with a high score indicating a good recovery.
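As a minimal sketch of the scoring just described, the total score can be obtained by summing the 11 item scores. The function name `obsqor11_total` and the example responses are hypothetical, and the sketch assumes that every item has already been oriented so that 10 is the most favorable response.

```python
def obsqor11_total(item_scores):
    """Sum 11 item scores (each 0-10) into a total between 0 and 110.

    Assumption: items are already oriented so that a higher value
    means a better recovery.
    """
    if len(item_scores) != 11:
        raise ValueError("the ObsQoR-11 has exactly 11 items")
    if any(not 0 <= s <= 10 for s in item_scores):
        raise ValueError("each item score must lie between 0 and 10")
    return sum(item_scores)


# Hypothetical responses from one patient (not study data).
example = [8, 9, 7, 10, 10, 6, 8, 9, 9, 7, 8]
print(obsqor11_total(example))  # 91
```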
The ObsQoR-11 tool evolved into the ObsQoR-10 questionnaire in 2020 [6, 7]. One item was created by combining the items for moderate and severe pain. The ObsQoR-10 tool had not been published when our study was in the design, planning, and protocol approval phases.
This study aimed to develop the Turkish version of the ObsQoR-11 (ObsQoR-11T) through a translation and cultural adaptation process, and to evaluate the validity and reliability of the ObsQoR-11T for Turkish women who had an elective or emergency cesarean delivery. The authors hypothesized that the ObsQoR-11T would have comprehensive validity and reliability, similar to the original English version.
This prospective observational cohort study of term women undergoing CD was approved by the Ethics Committee of Gülhane Education and Research Hospital, Turkey (No. 2021/506), and registered with ClinicalTrials.gov (NCT04744311, February 8, 2021). Written informed consent was obtained from all the participants. The study was conducted in line with the principles of the Declaration of Helsinki. All methods were carried out following the Strengthening the Reporting of Observational Studies in Epidemiology guideline. Patients who underwent surgery at the hospital between January 2021 and August 2021 were enrolled.
Women aged ≥ 18 years who underwent CD at ≥ 37 weeks of pregnancy and were able to read and speak Turkish were included in the study. Patients who lacked fluency in Turkish, were unable to read or understand written Turkish, or could not give written informed consent because of neuropsychiatric disorders such as schizophrenia, mental retardation, seizures with eclampsia, or addiction were excluded from the study, because these conditions may bias the ObsQoR-11T measurements.
Development of the ObsQoR-11T
Permission was received from the author of the original English-language version of the ObsQoR-11 scale. The translation was performed as per the recommendations of Beaton and Bullinger. First, two authors (UK and MEİ) translated the ObsQoR-11 into Turkish with reference to the validated Turkish version of the QoR-40 (QoR-40T). A provisional Turkish version of the ObsQoR-11 was agreed upon, which was then back-translated by a third person (co-author SŞ; healthcare experience in the USA and Turkey). Subsequently, a consensus was reached regarding the ObsQoR-11T. The ObsQoR-11T was then tested on a cohort of ten nurses selected by simple random sampling from a daily working list. All ObsQoR-11T questions were confirmed to be comprehensible. The final ObsQoR-11T is shown in Fig. 1.
Written informed consent was obtained from each patient before surgery. Demographic characteristics were recorded preoperatively. Intraoperative features were obtained from electronic and written patient records. On the morning of a scheduled elective CD, the ObsQoR-11T scale was explained to the patient and consent was obtained. Before an emergency CD, consent was obtained while the patient was in the preoperative room. The ObsQoR-11T was administered at the 24th hour following CD. At the 25th hour, a computer-assisted randomization program (random.org) was used to select a random subset of 20 patients to complete the ObsQoR-11T scale once again. The researchers, who were members of the perioperative care team, were on hand to assist the patients with the ObsQoR-11T. The time required to complete each ObsQoR-11T scale was recorded. General well-being was measured with a 100-mm global health visual analog scale (VAS) administered together with the ObsQoR-11T questionnaire. The VAS ranges from 0 to 100 mm, indicating the poorest to the best possible recovery. The ObsQoR-11T scale and the VAS were self-administered, with assistance as required.
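The random selection of a retest subset can be illustrated with a short sketch. The study used random.org; the code below substitutes Python's `random` module as a stand-in, and the function name `pick_retest_subset`, the patient identifiers 1 to 186, and the seed are all hypothetical.

```python
import random


def pick_retest_subset(patient_ids, k=20, seed=None):
    """Draw k distinct patients by simple random sampling.

    Assumption: Python's pseudo-random generator replaces the
    random.org service that the study actually used.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(patient_ids, k))


# Hypothetical identifiers for the patients who completed the scale.
patient_ids = list(range(1, 187))
retest_ids = pick_retest_subset(patient_ids, k=20, seed=2021)
print(len(retest_ids), "patients selected for the 25th-hour retest")
```

Passing a seed only makes the illustration reproducible; it is not part of the procedure described in the text.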
Our institutional neuraxial anesthesia regimen includes intrathecal administration of hyperbaric bupivacaine 12–15 mg with fentanyl 15 mcg via a single spinal injection. There is no routine approach for general anesthesia; the technique depends on the anesthesiologist. For postoperative analgesia, patients regularly receive paracetamol 1 g four times daily unless contraindicated. Patients are also routinely prescribed intramuscular diclofenac 75 mg as required after surgery. Intravenous ondansetron 4 mg as required is also prescribed unless contraindicated. Generally, patients are encouraged to mobilize 6 h after spinal anesthesia, and a trial without a urinary catheter is attempted 8 h after.
Psychometric evaluation of the ObsQoR-11T
To measure the convergent validity, the correlation between ObsQoR-11T score and global health VAS score was evaluated. Discriminant validity was tested by comparing the ObsQoR-11T score in two groups divided by the VAS (≥ 70 mm [good] vs. < 70 mm [poor]). To measure the construct validity, the correlation of continuous parameters with the ObsQoR-11T was evaluated. The ObsQoR-11T scores were compared in terms of education level, presence of comorbidities, parity groups, history of cesarean section, need for elective or emergency cesarean section, emergency category for emergency cesarean section, and type of anesthesia.
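The convergent validity check rests on a rank correlation between two scores per patient. The following is a minimal sketch of Spearman's ρ computed as the Pearson correlation of average ranks. It is not the SPSS routine used in the study, the helper names are hypothetical, and the paired scores are invented for illustration.

```python
import math
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tie group
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented ObsQoR-11T totals and VAS scores (mm), not study data.
obsqor = [62, 75, 81, 90, 95, 70, 88]
vas = [50, 70, 65, 85, 90, 60, 95]
print(round(spearman_rho(obsqor, vas), 3))
```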
Cronbach's alpha, split-half reliability, and test–retest reliability were used to measure reliability. The test–retest reliability was analyzed in a subgroup of women who were asked to complete the questionnaire again 60 min later (at 25 h); these results were correlated with the 24-h results. The intra-class correlation coefficient was used to assess test–retest reliability. The floor and ceiling effects were calculated by determining whether 15% of respondents achieved the highest or lowest possible score. The recruitment rate, completion rate, and time taken to complete the scale (measured by the investigator) were used to assess acceptability and feasibility.
The normal distribution of the continuous variables was tested using the Kolmogorov–Smirnov test. Continuous data are presented as mean ± standard deviation (SD) or median (min–max), and categorical data are presented as number and percentage (%). Differences in the distribution of continuous data were analysed with the Kruskal–Wallis test and the Mann–Whitney U-test. Differences in the distribution of categorical data were analysed with Fisher’s exact test and the Chi-square test.
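The two internal-consistency statistics named above can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure used in the study: the split is odd versus even items (the software's actual split is not described in the text), the function names are hypothetical, and the response matrix is invented.

```python
import math
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_spearman_brown(rows):
    """Correlate odd-item and even-item half scores, then apply the
    Spearman-Brown adjustment 2r / (1 + r).

    Assumption: an odd/even split; other splits give other values.
    """
    odd_half = [sum(row[0::2]) for row in rows]
    even_half = [sum(row[1::2]) for row in rows]
    r = pearson(odd_half, even_half)
    return 2 * r / (1 + r)


# Invented responses: 5 respondents x 4 items (not study data).
responses = [
    [8, 7, 9, 8],
    [5, 6, 5, 4],
    [9, 9, 8, 9],
    [3, 4, 2, 3],
    [7, 6, 7, 8],
]
print(round(cronbach_alpha(responses), 3))
print(round(split_half_spearman_brown(responses), 3))
```

With 11 items the two halves are necessarily unequal in length (6 and 5 items under an odd/even split), so the simple Spearman-Brown formula is only an approximation in that case.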
To achieve structural validity, confirmatory factor analysis (CFA) was performed.
It is suggested that the sample size should be at least 10–15 times the number of items [11, 12]. According to Iacobucci, 50 can be sufficient as a minimum sample size and 100 as a maximum sample size. The ObsQoR-11 scale consists of 4 dimensions and 11 items. The sample size of this study (160) can therefore be considered sufficient for CFA. CFA was performed using the Mplus 7 program. The robust maximum likelihood (MLR) estimation method was used to estimate the CFA model parameters.
Correlations between the ObsQoR-11T items and VAS scores were measured using the Spearman rank (ρ) correlation coefficient. Internal consistency was measured using Cronbach’s α and split-half reliability. Test–retest reliability was measured using the intraclass correlation coefficient. All statistical analyses were performed using IBM SPSS Statistics for Windows, version 25.0 (IBM Corp., Armonk, NY, USA). Differences were considered statistically significant when the P-value was < 0.05.
A total of 203 patients were screened for eligibility. Of these, 12 did not meet the inclusion criteria. The final sample consisted of 191 patients, and there were no refusals (recruitment rate: 94%). After recruitment, five patients were excluded before the postoperative follow-up. A total of 186 patients completed the ObsQoR-11T after CD (completion rate: 97.38%). The mean time taken to complete the postoperative ObsQoR-11T scale was 123 ± 45 s for all patients, 121 ± 41 s for patients with elective CD, and 125 ± 43 s for patients with emergency CD (P = 0.173). Patient demographic characteristics are summarized in Table 1; medical characteristics and obstetric indications for CD are summarized in Table 2.
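The recruitment and completion rates follow directly from the patient counts in this paragraph. The short check below reproduces the arithmetic; the variable names are hypothetical.

```python
# Patient flow as reported in the text.
screened = 203
ineligible = 12
excluded_before_follow_up = 5

recruited = screened - ineligible                  # 191
completed = recruited - excluded_before_follow_up  # 186

recruitment_rate = 100 * recruited / screened
completion_rate = 100 * completed / recruited

print(f"recruitment rate: {recruitment_rate:.2f}%")  # 94.09%
print(f"completion rate: {completion_rate:.2f}%")    # 97.38%
```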
The construct validity of the ObsQoR-11T scale was determined via CFA. The factor loadings and t values were examined, and all factor loadings were significant at the 0.05 level. The CFA model is shown in Fig. 2. As seen in Fig. 2, all factor loadings were positive. The model-fit indexes are given in Table 3.
To determine model-data fit, the chi-square (χ2) test should be examined first. A χ2 significance level greater than 0.05 indicates that model-data fit is achieved. As seen in Table 3, the significance of the χ2 value was lower than 0.05, which would suggest that model-data fit was not achieved. However, the χ2 test is sensitive to sample size. Therefore, other general goodness-of-fit indices (e.g., CFI and TLI) should be examined alongside the χ2 test. In this study, in addition to the χ2 statistic, the CFI (comparative fit index), TLI (Tucker-Lewis index), SRMR (standardized root mean square residual), and RMSEA (root mean square error of approximation) values were examined. An SRMR below 0.08 and an RMSEA between 0.05 and 0.08 indicate an acceptable fit. Moreover, CFI and TLI values between 0.90 and 0.95 indicate an acceptable fit. As seen in Table 3, the CFI was 0.975, the TLI was 0.961, the SRMR was 0.041, and the RMSEA was 0.078. All model fit indices met or exceeded the thresholds for an acceptable fit. As a result, the construct validity of the ObsQoR-11T scale can be considered established.
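The threshold comparison in this paragraph can be written out as a small check. The function name `fit_checks` is hypothetical, and the sketch assumes that values better than the quoted acceptable bands (CFI or TLI above 0.95, RMSEA below 0.05) also count as at least acceptable.

```python
def fit_checks(cfi, tli, srmr, rmsea):
    """Compare fit indices with the cut-offs quoted in the text.

    Assumption: values beyond the acceptable band in the favorable
    direction are treated as at least acceptable.
    """
    return {
        "CFI": cfi >= 0.90,
        "TLI": tli >= 0.90,
        "SRMR": srmr < 0.08,
        "RMSEA": rmsea <= 0.08,
    }


# Values reported in Table 3 of the study.
result = fit_checks(cfi=0.975, tli=0.961, srmr=0.041, rmsea=0.078)
print(result)
print("all indices acceptable:", all(result.values()))
```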
To assess convergent validity, we evaluated the correlation between the ObsQoR-11T and the VAS for recovery. At the postoperative 24th hour, the Spearman rho (ρ) correlation coefficient was 0.850 (95% CI 0.805–0.885) for all patients, 0.728 (95% CI 0.615–0.811) for patients with elective CD, and 0.868 (95% CI 0.808–0.910) for patients with emergency CD (P < 0.0001). There was a strong correlation between the ObsQoR-11T and the VAS (correlation > 0.70). The individual item correlations with the VAS score are shown in Table 4. Patients with a good or poor postoperative recovery, as indicated by a global health VAS ≥ 70 or < 70 mm, respectively, were compared to establish discriminant validity. The median [IQR] ObsQoR-11T score was significantly different between these groups (90 [83.1–94] vs. 71 [66–78.5]; P < 0.0001). No statistically significant correlations were found between the ObsQoR-11T score and the continuous variables (Table 5). In the comparison of the ObsQoR-11T score across categorical variables, only the difference between the emergency and elective CD groups (80 [41–104] vs. 83.3 [51–102]) was significant (P < 0.05) (Table 6).
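The text does not state how the confidence intervals around ρ were obtained. One common approximation is the Fisher z transformation with standard error 1/√(n − 3); the sketch below applies it to the overall coefficient, assuming n = 186. The function name `fisher_ci` is hypothetical, and other methods (for example, a corrected standard error for rank correlations) would give slightly different limits.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation coefficient
    using the Fisher z transformation (an assumed method, not
    necessarily the one used in the study)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


low, high = fisher_ci(0.850, 186)
print(f"{low:.3f} to {high:.3f}")
```

Under these assumptions the limits land within about 0.001 of the reported interval for all patients.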
Internal consistency measured using Cronbach’s alpha was 0.822 for all patients, 0.821 in patients delivering by elective CD, and 0.814 in those delivering by emergency CD. The inter-item correlation matrix for the ObsQoR-11T is outlined in Table 7. Inter-item correlations were mostly r > 0.15 (82% for all patients, 85% for patients with elective CD, and 75% for patients with emergency CD), a good indicator of consistency. Split-half reliability with the Spearman-Brown adjustment (which measures the extent to which all parts of the test contribute equally to the desired measurement) was 0.708 for all patients, 0.697 for patients with elective CD, and 0.703 for patients with emergency CD, implying an equal contribution from all items. The test–retest reliability of the ObsQoR-11T items was r > 0.6 in 82% of items and ≥ 0.45 in the remaining items (no. 4 and 5) for all patients, r > 0.6 in 64% of items for patients with elective CD, and r > 0.6 in 82% of items for patients with emergency CD, suggesting adequate repeatability and reliability (Table 8). The percentage of women who achieved the highest or lowest possible ObsQoR-11T score at 24 h was 0% (n = 0/186). Therefore, no floor or ceiling effects of the scoring tool were demonstrated. The ObsQoR-11T scores were negatively skewed. The level of skewness at 24 h postoperatively was −0.515 for all patients, −0.610 for patients with elective CD, and −0.458 for patients with emergency CD, indicating that the majority of the ObsQoR-11T scores were greater than 55 points.
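The floor and ceiling check amounts to counting how many respondents reach the lowest or highest possible total. The sketch below assumes the 0 to 110 score range and a 15% cut-off; the function name `floor_ceiling` and the example scores are hypothetical.

```python
def floor_ceiling(scores, min_score=0, max_score=110, threshold=0.15):
    """Share of respondents at the lowest and highest possible score,
    flagged as an effect when the share reaches the threshold."""
    n = len(scores)
    floor_share = sum(s == min_score for s in scores) / n
    ceiling_share = sum(s == max_score for s in scores) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share >= threshold,
        "ceiling_effect": ceiling_share >= threshold,
    }


# Invented totals (not study data): nobody scores 0 or 110.
print(floor_ceiling([41, 66, 71, 80, 83, 90, 94, 104]))
```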
The results of our study showed that the ObsQoR-11T is a valid, reliable, clinically convenient, and suitable scale for measuring the quality of postoperative recovery after both elective and non-elective CD in the Turkish-speaking population. The global health VAS is the most frequently used scale and the gold standard, which makes it an ideal comparator for evaluating the convergent validity of the ObsQoR-11T. The QoR-40 scale, which is the source of the ObsQoR-11 scale, lacks content validity for obstetric recovery because it does not include items pertaining to care of a baby. At the postoperative 24th hour, a strong correlation was found between the ObsQoR-11T scores and VAS scores (r = 0.850). This exceeded the > 0.6 criterion for health rating scales, demonstrating that the ObsQoR-11T has excellent convergent validity, stronger than that of the original ObsQoR-11 (r = 0.53). The QoR-40T scale was evaluated for convergent validity on the 3rd postoperative day across surgical types including general surgery, orthopedics, and otolaryngological surgeries, and the result was r = 0.468. As such, the ObsQoR-11T performs better than the QoR-40T in terms of convergent validity.
The discriminant validity of the ObsQoR-11T was confirmed by comparing women with good or poor postoperative recovery, as indicated by the global health VAS score. In the original study after elective CD, the median scores for good versus poor recovery according to the VAS score were 100 versus 87; in the validation study after non-elective CD, they were 97 versus 64; and with the ObsQoR-11T in our study, they were 90 versus 71. While discriminant validity was achieved in all three studies, the difference between the scores can be attributed to differences in the postoperative recovery procedures of the centers. Moreover, overall floor and ceiling effects were absent. Hence, it is feasible to use the ObsQoR-11T after CD.
The construct validity was determined by conducting CFA on the data collected from 186 participants, using the original dimensions and items. The original structure of the ObsQoR-11 scale was confirmed for the Turkish adaptation, and structural validity was ensured.
In our study, a statistically significant difference was observed between the global ObsQoR-11T scores (74 vs. 82) after emergency and elective CD. Whereas women scheduled for elective CD were able to prepare themselves and their expectations psychologically, women who underwent emergency CD did not have any time for preparation. This may explain the difference in the ObsQoR-11T scores during the recovery period. Moreover, complications are more likely to occur in emergency CD, and this may have decreased the quality of postoperative recovery. In our study, no correlations were found between the other recorded demographic and surgical data and the ObsQoR-11T. Future studies designed specifically to evaluate these variables should assess the ObsQoR-11T score. Unlike in the QoR-40T study, correlations of the ObsQoR-11T with these variables could not be shown. This suggests that postoperative recovery after CD is a more distinctive and complex process than recovery after other surgeries.
Cronbach's alpha and split-half reliability were 0.82 and 0.70, respectively, comparable with those reported for the original ObsQoR-11 and the QoR-40T [4, 5]. Cronbach's alpha was more than 0.7, which is above the recommended criterion. Internal consistency was also tested by inter-item correlation, with high values indicating strong item correlation within the instrument. The correlation coefficients between the items and the global ObsQoR-11T scores were between 0.41 and 0.75; the lowest coefficient was for the 6th item and the highest for the 10th item. In the original study, the lowest values were found in items 8 and 9, while the highest coefficient was found in item 1. In both the original study and our study, no negative correlation values were obtained in the inter-item correlation matrix. These results were enough to confirm that the ObsQoR-11T possesses adequate reliability.
There is no consensus on the timing of test repetition in QoR studies [21, 22]. To set an interval long enough for patients not to remember their answers from 24 h, but short enough for their health status not to deviate significantly from that at 24 h, we performed the retest at the 25th hour, similar to previous studies. The test–retest reliability was excellent.
The presence of questions containing both negative and positive expressions in the same questionnaire causes difficulties in the psychometric evaluation process. While in the first five questions of the questionnaire, the 11-point Likert scale starts from 10, it starts from 0 in the second section consisting of six questions. This sudden change resulted in confusion. The person administering the questionnaire may need to provide guidance with a proactive attitude to prevent confusion.
While we typically define 0 as no pain and 10 as the most severe pain in postoperative pain assessment using the VAS or NRS, 10 expresses the most pain-free situation in the pain questions, which are the first two questions of the ObsQoR-11 scale. Having two separate questions for pain assessment also caused confusion. Our patients described moderate pain as “tolerable pain” and severe, agonizing pain as “terrible pain”.
There are some limitations to our study. Patients with the most severe illnesses, who would likely have received the worst scores but were unable to consent or complete the survey within 24 h, as well as those from other cultures who needed assistance understanding written Turkish and those with less education, could have been excluded from the study. To measure responsiveness in previous validation studies of quality-of-recovery scales, the same questionnaire was applied both preoperatively and postoperatively, and Cohen effect size and standardized response mean were calculated [23, 24]. Because the questionnaire contains items such as “I can hold my baby without help” and “I can breastfeed my baby without help”, a preoperative ObsQoR-11 evaluation is not completely objective. Moreover, unlike in the original study, the preoperative questionnaire was not applied because non-elective cases were also included in our study. Another limitation of our study was that it was single-centered. In addition, if the baby is taken to the neonatal ICU (NICU) after birth, questions 8 and 9 about the baby may not be possible to answer; therefore, the ObsQoR-11T measurement will result in a different score if the baby is admitted to the NICU.
In conclusion, the current study evaluated the Turkish version of the ObsQoR-11 scoring tool to measure QoR on the first postoperative day after CD in a single centre. In terms of validity, reliability, clinical acceptability, and feasibility, the ObsQoR-11T performed well. The questionnaire may be used to assess postoperative recovery after CD as a standardized patient-reported outcome measure. More research is needed to validate this tool in spontaneous or assisted vaginal deliveries, as well as in patients with babies admitted to the NICU, taking their ability to nurse/feed/hold the baby into account.
Availability of data and materials
Data are presented in the manuscript.
Deniau B, Bouhadjari N, Faitot V, et al. Evaluation of a continuous improvement programme of enhanced recovery after caesarean delivery under neuraxial anaesthesia. Anaesth Crit Care Pain Med. 2016;35:395–9. https://doi.org/10.1016/j.accpm.2015.11.009.
Acikel A, Ozturk T, Goker A, Hayran GG, Keles GT. Comparison of patient satisfaction between general and spinal anaesthesia in emergency caesarean deliveries. Turk J Anesth Reanim. 2017;45:41–6. https://doi.org/10.5152/TJAR.2017.38159.
Ciechanowicz S, Setty T, Robson E, et al. Development and evaluation of an Obstetric Quality-of-Recovery score (ObsQoR-11) after elective Caesarean delivery. Br J Anaesth. 2019;122:69–78. https://doi.org/10.1016/j.bja.2018.06.011.
Sultan P, Kormendy F, Nishimura S, et al. Comparison of spontaneous versus operative vaginal delivery using obstetric quality of recovery-10 (ObsQoR-10): an observational cohort study. J Clin Anesth. 2020;63:109781. https://doi.org/10.1016/j.jclinane.2020.109781.
Sultan P, Kamath N, Carvalho B, et al. Evaluation of inpatient postpartum recovery using the obstetric quality of recovery-10 patient reported outcome measure: a single-center observational study. Am J Obstet Gynecol MFM. 2020;2:100202. https://doi.org/10.1016/j.ajogmf.2020.100202.
World Medical Association. World Medical Association declaration of Helsinki: ethical principles for medical research involving human subjects. JAMA. 2013;310:2191–4. https://doi.org/10.1001/jama.2013.281053.
von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. STROBE initiative, the strengthening the reporting of observational studies in epidemiology (STROBE) Statement: guidelines for reporting observational studies. Int J Surg Lond Engl. 2014;12:1495–9. https://doi.org/10.1016/j.ijsu.2014.07.013.
Ciechanowicz S, Howle R, Heppolette C, Nakhjavani B, Carvalho B, Sultan P. Evaluation of the Obstetric Quality-of-Recovery score (ObsQoR-11) following non-elective caesarean delivery. Int J Obstet Anesth. 2019;39:51–9. https://doi.org/10.1016/j.ijoa.2019.01.010.
GÖ, UK, MEİ conceived the study, analyzed data, wrote and revised the manuscript. UK and MEİ analyzed data, wrote and revised the manuscript. ÖO and MU collected data and revised the manuscript. GÖ and SS conceived the study and revised the manuscript. All authors read and approved the final manuscript.
Approved by the Ethics Committee of Gülhane Education and Research Hospital, Turkey (No. 2021/506), and registered with ClinicalTrials.gov (NCT04744311, February 8, 2021). Written informed consent was obtained from all the participants.
Consent for publication
The authors declare that they have no competing interests.
Ozkan, G., Kara, U., Ince, M.E. et al. Validation of the Turkish version of the Obstetric Quality-of-Recovery score 11 (ObsQoR-11T) after cesarean delivery.
Health Qual Life Outcomes 20, 155 (2022). https://doi.org/10.1186/s12955-022-02073-y