Perspective

1 March 2016

Improving the relevance and consistency of outcomes in comparative effectiveness research

Authors: Sean R Tunis, Mike Clarke, Sarah L Gorst, Elizabeth Gargon, Jane M Blazeby, Douglas G Altman, and Paula R Williamson

Publication: J. Comp. Eff. Res.

Volume 5, Number 2

https://doi.org/10.2217/cer-2015-0007

Abstract

Policy makers have clearly indicated – through heavy investment in the Patient Centered Outcomes Research Institute – that reporting outcomes that are meaningful to patients is crucial for improvement in healthcare delivery and cost reduction. Better interpretation and generalizability of clinical research results that incorporate patient-centered outcomes research can be achieved by accelerating the development and uptake of core outcome sets (COS). COS provide a standardized minimum set of the outcomes that should be measured and reported in all clinical trials of a specific condition. The level of activity around COS has increased significantly over the past decade, with substantial progress in several clinical domains. However, there are many important clinical conditions for which high-quality COS have not been developed and there are limited resources and capacity with which to develop them. We believe that meaningful progress toward the goals behind the significant investments in patient-centered outcomes research and comparative effectiveness research will depend on a serious effort to address these issues.

**Figure 1.** Year of first publication of each core outcome sets study (n = 227).
COS: Core outcome sets.

**Figure 2.** Number of core outcome sets developed in each disease category (n = 305).
^†Studies we are aware have been published since December 2014.
COS: Core outcome sets; N/A: Not applicable.

First draft submitted: 16 October 2015; Accepted for publication: 7 January 2016; Published online: 1 March 2016

In August 2015, the Patient-Centered Outcomes Research Institute (PCORI) announced that they had surpassed the milestone of US$1 billion in funding for patient-centered outcomes research (PCOR) and comparative effectiveness research (CER). The goal of this investment is to conduct research that will improve decision making by patients, clinicians, payers and other stakeholders, resulting ultimately in improved health outcomes and reduced healthcare spending [1]. One of the fundamental features distinguishing CER and PCOR from traditional clinical research is an emphasis on measuring and reporting outcomes that are more meaningful to patients, and that better reflect the decision-making needs of clinicians, payers and policymakers [2,3]. Systematic reviews of clinical research have consistently observed serious problems with the outcomes reported in published studies, not only with respect to the relevance of those outcomes, but also with respect to significant variation in which outcomes are reported and in the instruments used to report them, and biases from reporting some, but not all, of the outcomes that were collected in the trials. Because of this, the ability to use clinical studies to make reliable comparisons of the effectiveness of therapies is limited, providing suboptimal returns on the substantial investments made to support these studies.

In this paper, we provide an overview of the current state of outcomes reporting in clinical research and identify a number of specific initiatives that are working to improve outcome reporting in future studies. The paper focuses most heavily on the increasing activity in the development of Core Outcomes Sets (COS), and the potential for this work to substantially improve the quality and relevance of outcomes measured and reported in clinical research. Finally, the paper identifies a number of gaps in this work and proposes a series of activities necessary to accelerate the development and use of more relevant, consistent and patient-centered outcomes. The activities currently underway are not adequate in scope or magnitude to address the critical problems with research outcomes described in this paper. We believe that rapid and meaningful progress toward the goals behind the significant investments in PCOR and CER will depend on implementation of an intensive, coordinated and sustained effort to develop, measure and report relevant, reliable, patient-centered and standardized health outcomes.

Current problems with the outcomes in clinical trials

A lack of adequate attention to the choice of outcomes in clinical trials has led to avoidable waste in both the production and reporting of research. Currently, there are five major problems with outcomes reported in clinical trials: failure to collect outcomes that are most meaningful to patients; a high degree of variability across trials in the outcomes reported; variation in the outcome measurement instruments used; lack of information on the measurement properties of the instruments; and biased reporting of outcomes in published trials.

A number of reports have observed that the outcomes included in clinical research have not always been those that patients regard as most important or relevant [4]. For example, clinical guidelines issued by the American College of Physicians noted that the ability to provide strong recommendations was limited by the absence of measures of cognitive function that are commonly used in clinical care [5]. Medicare declined to provide reimbursement for cervical artificial discs in part because they viewed the outcomes reported in the trials as uninformative for key aspects of patient functional abilities [6]. The primary outcome reported for most trials of drugs for psoriasis, which is also the one that is required for regulatory approval in the USA, is based on a clinician's judgment of the extent of disease, while payers, clinicians and patients view the distribution of the plaques and impact on functioning as most meaningful for their quality of life [7]. In their 2012 methodology committee report, PCORI provides a standard for patient outcomes that instructs researchers to, ‘measure outcomes that people representing the population of interest notice and care about’, to be identified with input from patients and decision makers through meetings, surveys or published studies [8]. While this standard should raise awareness of the importance of patient-relevant outcomes, relying on individual research teams to select these outcomes independently and informally is unlikely to reduce the problems described further below.

The second major problem with outcomes in trials is variability in measurement and reporting. Evidence-based decision making often depends on the ability to aggregate results from multiple studies or to make treatment comparisons indirectly by looking at results of separate studies. These efforts depend heavily on the degree of consistency in the outcomes that are measured and reported across studies. At present, many studies that explore the effects of the same intervention on a specific health condition measure or report different outcomes, whether they be patient-reported or clinician-reported. This makes it difficult to compare, contrast or combine the findings of these studies when making decisions and setting policies, causing problems for people trying to use the output from healthcare research. For example, a survey of trials involving people with schizophrenia found that 2194 different scales had been used in 10,000 controlled trials: on an average, a new instrument had been introduced for every fifth trial [9]. In other research, it has been shown that more than 25,000 outcomes appeared only once or twice in oncology trials [10].

Consensus is also needed on how selected outcomes should be defined and measured. A variety of instruments is often used to measure the same outcome, with great variability in how outcomes are measured, scored and reported. Incomparable scores from different instruments further hamper evidence synthesis in systematic reviews. There is also great variability in the quality (reliability and validity) of the measures used, and there is a need for guidance on how to select the best instrument for a given outcome.

Difficulties caused by heterogeneity in outcome measurement are well known to systematic reviewers [11] and hamper efforts to present guideline developers with succinct information on the most important outcomes. For example, Summary of Findings (SoF) tables were developed by the Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group, to provide a summary of the evidence for important outcomes, along with the quality of this evidence [12]. They allow for the inclusion of up to seven important reported outcomes, providing a way to present the main findings of a review in a simple and transparent format. They have been shown to improve readers' understanding and speed of retrieval of the findings of systematic reviews [13], but they will only be effective for decision-making if they include the most relevant outcomes for that purpose. However, a recent review found that although there has been an increased inclusion of SoF tables in Cochrane Reviews since they were introduced in 2008, they were still absent from nearly half of the reviews published for the first time in 2013 [14].

The fifth major problem with outcomes in clinical trials is reporting bias. Outcome reporting bias (ORB) occurs as a consequence of the selection for publication of a subset of the originally collected outcomes on the basis of the results. This form of bias has been identified as a threat to evidence-based medicine because clinical trial outcomes with statistically significant results are more likely to be published [15]. The current CONSORT statement for reporting trials recommends that completely defined primary and secondary outcome measures should be prespecified and any changes to trial outcomes after the trial commenced should be documented with reasons [16]. It goes on to recommend that the results for each outcome should be reported for each group, along with the estimated effect size and its precision. Despite this guidance, empirical research has shown that statistically significant outcomes were more likely to be fully reported compared with nonsignificant outcomes (range of odds ratios: 2.2–4.7). When comparing trial publications with protocols, it was found that 40–62% of studies had at least one primary outcome that was changed, introduced or omitted in the time period between the production of these documents describing what the researchers planned to do and what they eventually did [17]. While it is not uncommon for investigators to modify study protocols, this high frequency of changes to primary outcomes highlights the potential for significant outcome reporting bias, and the potential benefit of widely agreed, standardized outcomes. Previous work, examining two separate cohorts of Cochrane systematic reviews, has shown that 55% (157/283) of reviews in the first cohort could not include data for the review on the primary outcome from all eligible studies [13], and 86% (79/92) of the second cohort could not include full data from the main harm outcome of interest for all studies [18].

Definition & examples of core outcome sets

These issues of relevance, inconsistency and outcome reporting bias could be improved with the development and application of agreed standardized sets of outcomes, known as core outcome sets (COS), to be measured and reported for specific areas of health [19]. The outcomes included in a COS could include patient-reported outcomes, clinician-reported outcomes and other patient-relevant outcomes. These sets are intended to represent the minimum that should be measured and reported in all clinical trials of a specific condition.

The existence or use of a COS does not imply that outcomes in a particular trial should be restricted to those in the relevant set. But, the expectation is that the core outcomes will always be collected and reported, with researchers including additional outcomes of particular relevance or interest to their specific study if they wish. The use of COS will make it easier for the results of trials to be compared, contrasted and combined, thereby reducing waste in research [20]. Their use would greatly reduce heterogeneity between trials because all trials would measure and report the agreed important outcomes, and lead to research that is more likely to have measured patient-relevant outcomes. Importantly, their use would enhance the value of evidence synthesis by reducing the risk of outcome reporting bias and ensuring that all trials contribute usable information to a review and meta-analysis.

COS serve a role analogous to that of a defined set of quality measures that are measured and reported consistently for all public reporting and pay-for-performance programs, such as those developed through the National Quality Forum. These measure sets have the benefit of being endorsed through a robust, transparent, multistakeholder process to ensure that they are relevant, reliable and efficient. They also allow for accurate comparisons across providers, and can be aggregated across multiple providers to generate meaningful information on trends in process and outcomes, including patient-reported outcomes, across larger health systems and regions.

An early example of an attempt to standardize outcomes was an initiative by the World Health Organization in the 1970s, relating to reporting results of cancer treatment trials [21]. More than 30 representatives from groups undertaking trials in cancer came together, the result of which was a WHO handbook recommending the minimum requirements for data collection in cancer trials. This data set included the minimum data that should be made available about the patient, the tumor, toxicity and effects of therapy including response, recurrence and disease-free survival.

Since then, particularly notable work relating to outcome standardization has been undertaken by the OMERACT (Outcome Measures in Rheumatology) initiative [22], which has advocated the use of COS, designed using consensus techniques, in clinical trials in rheumatology since their first conference in 1992 [23]. OMERACT has served a critical role in the development and validation of clinical and radiographic outcome measures in rheumatoid arthritis (RA), osteoarthritis, psoriatic arthritis, fibromyalgia and other rheumatic diseases. As an example, the COS recommended for trials of medicinal products in the treatment of RA included the following seven outcomes: tender joints, swollen joints, pain, physician global assessment, patient global assessment, physical disability and acute phase reactants. There are currently 20 groups working on separate COS within the field of rheumatology, all coordinated under the umbrella of OMERACT.
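The idea that a COS is a minimum rather than a ceiling can be made concrete with a small sketch. The Python below treats the seven RA outcomes listed above as a set and checks whether the outcomes reported by a trial cover it, while allowing additional outcomes. The function names, the use of plain-text outcome labels and the example trial are illustrative assumptions; they do not correspond to any OMERACT or COMET tool.

```python
# Illustrative sketch only: function names and the example trial are hypothetical.

# The seven RA core outcomes as listed in the text.
RA_CORE_OUTCOME_SET = frozenset({
    "tender joints",
    "swollen joints",
    "pain",
    "physician global assessment",
    "patient global assessment",
    "physical disability",
    "acute phase reactants",
})


def missing_core_outcomes(reported, core_set=RA_CORE_OUTCOME_SET):
    """Return the core outcomes that a trial did not report, sorted by name."""
    # Normalize labels so that case and stray whitespace do not cause misses.
    normalized = {outcome.strip().lower() for outcome in reported}
    return sorted(core_set - normalized)


def covers_core_set(reported, core_set=RA_CORE_OUTCOME_SET):
    """True when every core outcome is reported; extra outcomes are allowed."""
    return not missing_core_outcomes(reported, core_set)


# Hypothetical trial: reports five core outcomes plus one additional outcome.
example_trial = [
    "Tender joints", "Swollen joints", "Pain",
    "Patient global assessment", "Physical disability",
    "Radiographic progression",
]
example_missing = missing_core_outcomes(example_trial)
```

In this sketch a trial passes the check only if nothing from the core set is missing, which mirrors the expectation that core outcomes are always collected and reported while study-specific outcomes remain optional. In practice, matching outcome names across trials would need an agreed vocabulary rather than simple string comparison.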

The first evaluation of the uptake of a COS related to recommendations made for clinical trials of symptom-modifying antirheumatic drugs (SMARDS) in the treatment of RA, ratified in 1994 by the WHO and International League of Associations for Rheumatology (ILAR), and also included in guidance issued by the US FDA and EMA. This study demonstrated that nearly 70% of trialists reporting trials in RA are now measuring the COS [24], and 90% of the trialists contacted said they would consider using the COS if they were to lead a new trial in RA. Clearly, COS have the potential to improve the evidence base for healthcare, but additional work is needed to develop strategies to ensure that they are disseminated and used by clinical researchers.

Since OMERACT, there have been other examples of similar COS initiatives to develop recommendations about the outcomes that should be measured in clinical trials. These include the IMMPACT initiative [25], whose aim is to develop consensus reviews and recommendations for improving the design, execution and interpretation of clinical trials of treatments for pain. Including the first IMMPACT meeting in 2002, there have been 17 consensus meetings on clinical trials of treatments for acute and chronic pain in adults and children. Additional examples are the Harmonizing Outcome Measures for Eczema (HOME [26]) Initiative and the International Dermatology Outcomes Measures (IDEOM, [27]), which are international groups developing core outcomes to include in trials of the treatment of skin conditions.

The COMET initiative

Stimulated in part by the success of OMERACT and IMMPACT, as well as the increase in awareness of problems with outcomes collected and reported in clinical trials, interest and activity in COS have been increasing rapidly over the past 5 years. The COMET Initiative [52] was established to encourage and support the process of developing and implementing COS [28]. It was launched in 2010 with the following aims: to raise awareness of current problems with outcomes in clinical trials, to encourage COS development and uptake, to promote patient and public involvement in COS development, to provide resources to facilitate these aims, to avoid unnecessary duplication of effort and to encourage evidence-based COS development.

COMET aims to collate and stimulate the development, application and promotion of COS, by including data on relevant individual studies in a publicly available internet-based resource. This database includes publications of previous COS development projects, as well as planned and ongoing work. A systematic review was undertaken in 2013 that provided a first comprehensive search for COS in health research [29]. It identified 198 relevant studies that determined which outcomes or domains should be measured in clinical trials for a specific health condition. The review revealed wide variation in the methods used to develop COS, and work is needed to assess the implications of these different methods for both minimizing bias and maximizing efficiency in the development of COS, and for ensuring uptake. As an example, although benefits have been shown for involving patients in trial design, only 16% of the published COS reported that there was input from patients in their development [30]. The review highlighted the need for methodological guidance, including how to engage key stakeholder groups, particularly members of the public, in the development and implementation of COS.

This review was updated to the end of 2014, and a further 29 new COS studies were identified [31]. There has been a general increase in the number of COS over the years, with a consistently higher number of COS published annually in recent years than in most years before 2010 (Figure 1). The studies identified have been added to the COMET database in order to provide an up-to-date, comprehensive database of COS. In addition, to reduce unnecessary duplication of effort, ongoing COS studies are also registered in the database. Figure 2 shows the number of COS developed, or in development, according to health category. Taken together, this work has identified many health areas where a COS has been developed and highlighted important gaps. For example, there has been substantial work on COS for rheumatology and neurology, while mental health conditions have not received significant attention.

Awareness of the need for COS continues to grow and knowledge of the COMET Initiative continues to increase, as reflected in the website and database usage figures [20,32]. In 2014, the website received more than 16,500 visits (36% increase over 2013) and 9780 new visitors (43% increase). By December 2014, a total of 6588 searches had been completed of the COMET database, with 2383 in that year alone (11% increase). An online survey in May–June 2015 was answered by 206 (52%) of the 396 visitors to the website, and revealed that the most common reasons for searches were ‘I am thinking about developing a core outcome set’ and ‘I am planning a clinical trial’.

Methods for developing COS

While standardized methods for the development of COS have not yet been widely adopted, a fairly well-defined set of issues to consider is increasingly being addressed [19]. A detailed handbook with step-by-step instructions on the COS development process has been developed by OMERACT, and is regularly updated with insights generated through ongoing COS work [33]. At a high level, the first stage in the development of a COS is most frequently a decision on what outcomes or outcome domains to measure, followed by agreement on how those outcomes should be defined and measured in order to provide the necessary data, adopting the outcome specification model previously described [34]. COS may include both patient-reported outcomes (PROs) and clinician-reported outcomes (ClinROs). Several existing initiatives, described in the next section, are relevant to the process of determining how an outcome should be defined and measured. In addition, COMET and COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments) have recently collaborated on the development of a guideline for instrument selection [35].

As noted above, COS have been developed using a variety of methods [29,31]. These include the use of a literature/systematic review as an early step, which rose from 33% (66/198) of the studies in the original review to 72% (21/29) in the update. A variety of methods have been used to assess and develop consensus, with the Delphi technique used for 15% (29/198) of the COS in the original review and 31% (9/29) in the update.

The stakeholder groups regarded as key to developing a COS vary between health areas. Clinical experts have been involved in all studies, but there has been a recent shift towards greater involvement of patient and public representatives. This increased, where reported, from 18% (31/174) in the original review, to 59% (13/22) in the update, and 89% (66/74) among the ongoing studies that are registered in the COMET database.
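The proportions quoted in this section can be recomputed directly from the counts given in the text. The short Python sketch below does so; the counts come from the two reviews and the COMET database figures cited above, while the helper function, the dictionary labels and the choice to round to whole percentages are illustrative assumptions.

```python
# Counts are taken from the text; the helper and the labels are illustrative.

def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


reported_counts = {
    "literature/systematic review, original review": (66, 198),
    "literature/systematic review, update": (21, 29),
    "Delphi technique, original review": (29, 198),
    "Delphi technique, update": (9, 29),
    "patient involvement, original review": (31, 174),
    "patient involvement, update": (13, 22),
    "patient involvement, ongoing studies": (66, 74),
}

percentages = {
    label: percent(n, d) for label, (n, d) in reported_counts.items()
}

for label, value in percentages.items():
    n, d = reported_counts[label]
    print(f"{label}: {n}/{d} = {value}%")
```

Note that the patient involvement figures use a denominator of studies that reported on stakeholder involvement (174 and 22), not all studies in each review (198 and 29), which is why they differ slightly from proportions calculated over every published COS.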

Involving the public in research can bring challenges and, in response to this need, COMET has established the PoPPIE (People and Patient Participation, Involvement and Engagement) Working Group to develop resources, including the development of plain language summaries in partnership with patients [36], and pursue a research agenda. Research is needed on how to: involve patients as research partners in design of COS studies; identify and meet information needs of patients as both research partners and participants; identify appropriate methods for eliciting consensus among patient groups; generate appropriate questions for patients taking part in a COS study; access and engage patients in COS studies; ensure hard to reach communities are involved; bring different stakeholder groups' views together; and evaluate the stakeholder experience of taking part.

Other initiatives involving standardized outcomes

A number of programs and initiatives in the US and elsewhere overlap to some degree with the expanding work on COS, and present opportunities for coordination and collaboration, as well as the need to avoid duplication of effort.

PROMIS [37] is an NIH initiative, implemented through a number of leading academic centers, which provides an extensive system of measures of patient-reported health status for physical, mental and social health and can be used in studies across a range of medical conditions. A primary objective of PROMIS is to assemble a set of questions to assess the most common dimensions of patient-reported outcomes for a wide range of chronic diseases. These include items to measure pain, fatigue, psychological distress, physical function and overall health. The PROMIS database of measures focuses on patient-reported outcomes (PROs), while COS generally include both patient-reported and clinician-reported outcomes, as well as other outcomes such as laboratory results. PROMIS focuses on how to measure items that are patient-reported, providing an item question bank. How to measure an outcome is typically the second stage of a COS development process, after first determining which outcomes are most important to measure. This two-stage approach to COS development has the advantage that it can highlight gaps in outcome measurement warranting further research, for example the identification of an important construct or domain for which no suitable measurement instrument currently exists. PROMIS does not seek to develop agreement around a defined set of minimum outcomes to measure and report in a standard fashion across studies and, by itself, would therefore not address the fundamental problems of relevance, consistency or reporting bias discussed above.

The NIH has supported a related initiative to encourage the use of common data elements (CDEs), including outcomes, in NIH-supported research projects and registries. The NIH provides a resource portal [38] that includes databases and repositories of data elements and case report forms that might help investigators in identifying and selecting data elements for use in their projects. The NIH CDE program is most relevant to researchers once they have determined ‘what’ to measure, offering resources on ‘how’ to measure both particular outcomes and other relevant data items.

The International Consortium for Health Outcomes Measurement (ICHOM [39]) organizes global teams of physician leaders, outcome researchers and patient advocates to define core sets of outcomes for use for specific health conditions to assess the quality of clinical practice. This initiative was established in 2012 with the aim of providing a structured process to achieve consensus on health outcomes to be reported for the purpose of comparing the performance of competing healthcare providers. By the end of 2015, ICHOM had completed 12 standard sets, with seven more in progress and the goal of completing 50 sets by 2017. A list of completed sets, sets in progress and conditions under consideration is available at [40].

ICHOM is focusing on the development of core data sets to evaluate the quality and efficiency of clinical care, rather than for use in clinical trials. The methods used by ICHOM differ substantially from those used in the development of COS for clinical trials (even taking account of the variability in methods used for the latter), as discussed in more detail below. It is unclear whether the outcomes sets developed by ICHOM for clinical care would also be useful for clinical research, although there would be value in developing standards that could be used for both purposes, making it possible to reuse data collected for either reason.

As an example, ICHOM recently published their consensus measures for patients treated for prostate cancer, providing limited detail on the specific methods used to reach agreement on the recommended measures. It would be valuable to have more information on those methods, such as how the number of patients and other stakeholders was decided; how patients were identified, selected and involved in the process; and how final decisions on the inclusion of an outcome were made [41]. In addition, it is important to know how the measuring tools were selected. For instance, the ICHOM standards require a record of the date of recurrence of an abnormal PSA, but the variety of definitions of PSA recurrence (either within a treatment or across treatments) could render comparisons difficult and problematic, particularly for clinical trials [42].

The FDA Clinical Outcome Assessments (COA) Staff aim to encourage the development and application of patient-focused end point measures in medical product development to highlight clinical benefit in labeling. They engage with stakeholders to improve clinical outcome measurement standards and policy development, by providing guidance on COA development, validation, and interpretation of clinical benefit end points in clinical trials. Unmet medical needs are addressed through the Clinical Outcomes Assessment Qualification Program. COA qualification is dependent on appraisal of the evidence to support the conclusion that the COA is a well-defined and reliable assessment of a particular concept of interest for application in studies used to support drug marketing authorization. The EMA is working along similar lines to increase the use of well validated patient-relevant outcomes in their regulatory process.

The main focus of the FDA and EMA work is understandably on the end points to be used in product approvals and labeling, and has not been concerned to date with the assessment of effectiveness involving outcomes that may be useful for nonregulatory decisions by patients, clinicians, payers and others. This may be changing as a result of greater attention to ‘patient-focused drug development’, and may ultimately lead the FDA to greater consideration of outcomes that are relevant to and informed by patients [43]. For example, work on Core Symptom Measures for cancer trials has been promoted in prostate, head and neck and ovarian cancers [44].

In considering the measurement instruments to use, several may exist for any given outcome, usually with varying psychometric properties (i.e., reliability and validity). Systematic reviews of measurement instruments provide one way to select a measurement instrument for an outcome within a COS. The COSMIN initiative collates systematic reviews of studies of measurement properties of existing measurement instruments that intend to measure (aspects of) health status or (health-related) quality of life. An overview of these reviews and guidelines for performing such reviews can be found at [45].

However, the quality of these reviews varies widely, and there is a lack of reviews of outcome measurement instruments for many outcomes in many disease areas [46]. More high-quality reviews are needed and the methodology of performing such reviews needs to be further developed and implemented.

Key challenges & moving forward

While it is encouraging that the level of activity around COS has been increasing, substantial work remains to be done, and there are a number of key challenges that must be addressed to accelerate progress. To inform how best to move forward, the Center for Medical Technology Policy and the COMET initiative hosted a one-day COS workshop in April 2014, supported with funding from PCORI, the European Union and the UK Medical Research Council. The meeting opened with presentations by ten North American experts currently developing COS for use in clinical trials, PCOR, systematic reviews, quality improvement and other contexts. Each presentation was followed by discussion and feedback from representatives of federal agencies and national organizations with an interest in condition-specific, standardized health outcomes, including most of those described above. The discussions underscored the need to expand capacity to develop high-quality COS, and identified several issues requiring attention to promote further progress. These challenges and potential strategies to address them are summarized below.

Better understanding of the gaps in COS

The number and quality of COS remain limited, despite the recent increase in activity in this field. While a systematic review of gaps has not yet been done, an initial attempt has been made to map the content of the COS database to the most prevalent acute and chronic conditions. For example, no COS have been published regarding trials of drug therapy for Type 2 diabetes or interventions for the management of chronic wounds. For many other conditions, the available COS have used informal consensus methods rather than structured approaches, which might undermine their acceptability for a wide range of decision makers. An initial informal review of conditions identified by the WHO as being responsible for the highest global burden of disease [47] has identified key gaps. A more systematic assessment of these gaps is now underway. Even where existing COS have been identified, further work is needed to determine whether they are of adequate quality to promote them broadly.

An important next step would be to conduct a systematic assessment to identify those high-prevalence, high-burden conditions for which high-quality COS do not yet exist. This could be done by using work that has already been done to identify and rank burden of illness globally or nationally. For example, in the USA, the Agency for Healthcare Research and Quality (AHRQ) has developed priorities for CER, as have the Institute of Medicine and PCORI. Having identified the priority conditions, published COS in those clinical domains could be found relatively simply through the COMET database. Finally, these COS would need to be assessed for quality, of both their methods and reporting. As noted below, work is underway to develop quality assessment and reporting tools for COS, but the initial quality screening could look for a few basic indicators of quality, such as the inclusion of patients or consumers in the development process.
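The screening steps described above can be sketched in a few lines of code. This is a minimal illustration only: the condition names and COS records are invented placeholders rather than data from the COMET database, the `CosRecord` type and `classify_gaps` function are hypothetical, and the single quality indicator (patient or consumer involvement) is simply the example indicator mentioned above.

```python
from dataclasses import dataclass


@dataclass
class CosRecord:
    """One published COS (hypothetical record structure)."""
    condition: str
    patients_involved: bool  # basic quality indicator used for screening


def classify_gaps(priority_conditions, cos_records):
    """Label each priority condition by whether a COS exists and passes the basic screen."""
    status = {}
    for condition in priority_conditions:
        matches = [r for r in cos_records if r.condition == condition]
        if not matches:
            status[condition] = "no COS found"
        elif any(r.patients_involved for r in matches):
            status[condition] = "COS passes basic screen"
        else:
            status[condition] = "COS found, fails basic screen"
    return status


# Placeholder inputs: a ranked list of priority conditions and the COS located for them.
priority = ["condition A", "condition B", "condition C"]
records = [
    CosRecord("condition A", patients_involved=True),
    CosRecord("condition B", patients_involved=False),
]

result = classify_gaps(priority, records)
for condition, label in result.items():
    print(f"{condition}: {label}")
```

A real assessment would draw the priority list from existing burden-of-illness rankings and would replace the single indicator with a fuller quality tool once one is available.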

Expanded capacity to produce COS

As priority areas for COS are identified, mechanisms need to be in place to support their development and implementation. This will require research to identify best practices for COS development and the production of a reporting guideline to facilitate clear reporting of the COS and the processes used to develop it (see below). This would likely be accelerated by increased interest among research funding agencies in supporting the development of high-priority COS, and of the methods and tools necessary to support groups involved in this work. The nature of this work requires multistakeholder collaboration, ideally at the national or international level. Ideally, many of these initiatives would include leadership from the patient advocacy organizations relevant to each topic. In this way, the benefits of COS for reducing waste in research will not be offset by wasteful practices in their development [48].

As part of its effort to support COS developers, the COMET database includes previous work that might help the development of new COS, alongside the reports of COS themselves. For example, an initial step in COS development is usually a review of outcomes measured in previous clinical trials and work to identify outcomes felt to be important by patients. With this in mind, the database includes more than 120 reviews of outcomes measured in trials and 52 studies of patients' perspectives on outcomes to be measured in their condition.

Improving the quality of COS

The aforementioned systematic review and its update revealed wide variation in the methods used to develop COS, with no clear consensus on best practice ([29]; Gorst et al., in preparation). Although key issues to consider when developing a COS have been described [19], there is little or no guidance on how to choose and involve stakeholders, develop consensus, achieve geographical representation or undertake many other aspects of the process.

One of the urgent areas of need is to expand engagement of patients and the public in the development of COS. As noted above, only a minority of publications reported any patient or public involvement in the process. There is anecdotal evidence to suggest that this has started to shift, and the forthcoming updated systematic review of COS will provide more recent empirical data. Given the rapidly increasing recognition of the need to develop, validate and report outcomes that are meaningful to patients, it is a critical priority to fund research to develop and validate formal qualitative methods to effectively engage patients and the public in this process. The absence of empirically based best practices should not prevent the development of initial consensus around key principles and techniques. Once these approaches have been documented and standardized to some extent, it will be possible to do empirical work to evaluate the effectiveness of alternative approaches.

When best practices have been developed, it will also be possible and useful to develop a quality assessment tool. Although nearly 230 published COS studies have already been identified, the lack of an assessment tool means that there has been no formal quality assessment of these. Defining the quality of a COS is not straightforward. Ultimately, a ‘good’ COS would be one that leads to improved health outcomes, but this might be far downstream of the development process and difficult to measure. It is also unlikely to be a feature of any report describing the development of a COS. Instead, a tool is needed to assess how the COS developers minimized biases that would otherwise undermine the ability of the COS to have a positive impact on patient care and outcomes. This would be analogous to the Cochrane Risk of Bias tool for assessing the studies to include in systematic reviews [49].

Along with best practice guidance and metrics to assess the quality of COS, it will be valuable to develop a standardized reporting tool for publications and reports of COS. Although a preliminary checklist was proposed for the reporting of Delphi surveys used in the development of COS [50], this is not sufficient to address the wider aspects of COS studies. COS studies are still not well reported in terms of the scope and methods used in their development. Information that might help users decide whether to adopt a COS or develop a new one is often lacking. To try to redress this, work is underway to develop a COS reporting guideline, using an international consensus process [51]. The initial areas under consideration include the rationale for and scope of the COS, methods of development, stakeholder involvement, sources of information, consensus process, limitations and plans for implementation and updating.

Increasing uptake of high-quality COS

Where high-quality COS exist or are developed, they will need to be widely accepted and implemented by the majority of researchers if their benefits for health are to be realized. Publication and broad dissemination of the COS is a basic requirement, and simplified access to relevant reports and papers is now possible as a result of COMET's work. Stronger incentives are likely to be necessary for high levels of uptake; for example, the recognition or endorsement of COS by research funding organizations, journal editors, clinical guideline developers and payer/HTA organizations. Researchers should be considerably more likely to give serious consideration to COS when that decision has the potential to impact the funding of their research or the publication of their results, or to influence clinical policy or reimbursement.

With respect to research funders, the National Institute for Health Research (NIHR) in the UK provides the following text in their guidance notes for applicants: “Where established Core Outcomes exist they should be included amongst the list of outcomes unless there is good reason to do otherwise. Please see The COMET Initiative website at [52] to identify whether Core Outcomes have been established.” There are not yet examples of US funders making explicit reference to COS, but the NIH, AHRQ, PCORI and others may wish to consider under what circumstances it would be reasonable to do so.

Some journal editors have begun to encourage consideration of COS by researchers. The SPIRIT guidance for reporting protocols of clinical trials [53] encourages investigators to ascertain whether there is a COS relevant to their trial and, if so, to include those outcomes in their trial. “Existence of a common set of outcomes does not preclude inclusion of additional relevant outcomes for a given trial.” Finally, in obstetrics and gynecology, the CROWN (Core Outcomes in Women's Health) Initiative [54] is a consortium of more than 70 journal editors that will “strongly encourage the reporting of results for COS. Facilitate embedding of COS in research practice, working closely with researchers, reviewers, funders and guideline makers.”

There is considerable potential for increasing uptake of COS through the mechanism of recognition by clinical guideline developers, HTA organizations and payers – as these groups all carry substantial weight through their direct influence on the evidence-based reimbursement of products and services. The manual for guideline development by NICE recommends the COMET database as a source of information to be considered (NICE, 2012). The Green Park Collaborative (GPC) in the USA [55] develops recommendations for research in specific therapeutic areas, through a multistakeholder collaborative process that includes payers, guideline developers and HTA organizations, as well as patients, clinicians and other key stakeholders. A number of GPC projects have attempted to increase the measurement and reporting of patient-relevant outcomes [56]. Efforts are also underway to leverage the broad range of decision making authority represented by the membership of this collaborative to promote the uptake of high-quality COS. Ultimately, it is likely that explicit recognition or formal endorsement of COS will be necessary to ensure the level of consistent use that will achieve the original objectives of this work.

Conclusion

If CER is to have an impact on practice and, thereby, the health of patients and the public, it needs to ensure that the outcomes that matter most to the people making decisions and choices are developed, measured and reported. COS can facilitate this, especially when their development and implementation are integrated with other key initiatives to improve research and practice.

In addition to standardized outcomes in clinical trials, consistent reporting of outcomes is also important for studies of the onset, course and consequences of disease. To this end, clinical cohorts, registries and observational databases have been established; and routinely collected data from within electronic medical record (EMR) systems is likely to become increasingly available in the future to support research. To achieve the full potential of such an approach, it is vital that similar attention is paid to the choice and definition of outcomes that should be collected within such systems. It has also been proposed that prospective registries of clinical trials should place increased emphasis on COS [57].

There are several serious challenges to overcome. Some useful work is already underway as discussed in this paper. However, much remains to be done, and the effort will need to be focused, coherent and sustained. Health areas in particular need of COS have to be identified, the development of high-quality COS to meet this need has to be encouraged and supported, and these COS have to be implemented in research and practice. Ultimately, the failure to consistently develop, measure and report the outcomes most meaningful to patients in research is both wasteful and harmful, to research participants and to patients.

Executive summary

Current problems with the outcomes in clinical trials

Outcomes included in clinical trials and comparative effectiveness research have not always been those that patients, clinicians and payers regard as the most important or relevant.

Many studies which explore the effects of the same intervention on a specific health condition measure or report different outcomes.

Difficulties caused by heterogeneity in outcome measurement create uncertainty that interferes with clinical and health policy decision making.

We believe that progress toward the goals behind the significant investments in patient-centered outcomes research and comparative effectiveness research will depend on expanded efforts to develop, measure and report standardized health outcomes in this research.

Definition & examples of core outcome sets

A core outcome set (COS) has been defined as an agreed standardized set of outcomes that should be measured and reported, as a minimum, in all clinical trials in specific areas of health or healthcare.

COS may include both patient-reported outcomes (PROs) and clinician-reported outcomes (ClinROs).

The OMERACT initiative has served a critical role in the development and validation of outcome measures in clinical trials in rheumatology, since their first conference in 1992.

There are a few hundred other examples of COS initiatives, including IMMPACT and the HOME Initiative.

The COMET initiative

The COMET Initiative was launched in 2010 to encourage and support the process of developing and implementing COS.

The first comprehensive search for COS in health research highlighted the need for methodological guidance, including how to engage key stakeholder groups, particularly members of the public, in the development and implementation of COS.

The COMET database highlights numerous clinical domains in which high-quality COS have not yet been developed.

Methods for developing COS

Standardized methods for the development of COS have not yet been widely adopted, but a set of issues to consider is increasingly being addressed.

The first stage in the development of a COS is most frequently a decision on what outcomes or outcome domains to measure, followed by agreement on how those outcomes should be defined and measured.

Clinical experts have been involved in all studies identified for the COMET database, and there has been a recent shift toward greater involvement of patient and public representatives.

Other initiatives involving standardized outcomes

A number of initiatives are underway to improve the measurement and reporting of outcomes in research and clinical care, with a variety of objectives and approaches.

PROMIS is an NIH initiative, which provides an extensive system of measures of patient-reported health status for physical, mental, and social health and can be used in studies across a range of medical conditions.

The US FDA Clinical Outcome Assessments program encourages the development and application of patient-focused end point measures for use in the regulatory context.

The COSMIN initiative collates systematic reviews of studies of measurement properties of existing outcome measurement instruments.

Key challenges & moving forward

Despite a recent increase in published COS, there is still considerable room for improvement in the number and quality of COS.

There is a need for a better understanding of the gaps in COS – identifying those high-prevalence, high-burden conditions for which high-quality COS do not yet exist.

Research is required to identify best practices for COS development.

It is a critical priority to fund research to develop and validate formal methods to effectively engage patients and the public in this process.

It would be useful to develop a quality assessment tool for COS.

Work is underway to develop a reporting guideline to facilitate clear reporting of the COS and the processes used to develop it.

Increasing uptake of high-quality COS

Where high-quality COS exist or are developed, they will need to be widely accepted and implemented by researchers if their benefits for health are to be realized.

Publication and broad dissemination of the COS is a basic requirement, and COMET has made it much easier to access relevant reports and papers.

Stronger incentives are likely to be necessary for high levels of uptake; for example, the recognition or endorsement of COS by research funding organizations, journal editors, clinical guideline developers and payers.

Acknowledgements

The authors would like to thank J Gerson from the Patient Centered Outcomes Research Institute for his insights and support for this work.

Financial & competing interests disclosure

This work was supported in part by a Eugene Washington Engagement Award from the Patient Centered Outcomes Research Institute. This work was also funded in part by the MRC MRP (Medical Research Council Methodology Research Panel), grant number MR/J004847/1; and European Union Seventh Framework Programme ([FP7/2007-2013] [FP7/2007-2011]) under grant agreement number 305081. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

No writing assistance was utilized in the production of this manuscript.

References

Orszag PR, Ellis P. Addressing rising health care costs – a view from the Congressional Budget Office. N. Engl. J. Med. 357(19), 1885–1887 (2007).