Special Report

14 December 2020

The terminology conflict on efficacy and effectiveness in healthcare

Authors: Franz Porzsolt https://orcid.org/0000-0003-3554-2902, Felicitas Wiedemann, Meret Phlippen, Christel Weiss, Manfred Weiss https://orcid.org/0000-0003-4851-6439, Karen Schmaling, and Robert M Kaplan

Publication: Journal of Comparative Effectiveness Research

Volume 9, Number 17

https://doi.org/10.2217/cer-2020-0149

Abstract

Designers and architects created the rule ‘form follows function (FFF)’ for their own profession. Our paper demonstrates that this FFF rule applies equally well to the designers of clinical studies. Four points are presented: disregarding the FFF rule causes inconsistent terminology for differentiating between efficacy and effectiveness; inconsistent differentiation of efficacy and effectiveness interferes with the consistent interpretation of the results of clinical studies; inconsistent interpretation of clinical studies results in an unexpected variance of recommendations in clinical guidelines; and the fusion of the FFF designer rule with the demands of Cochrane and Bradford Hill (‘can it work?’, ‘does it work?’ and ‘is it worth it?’) avoids the terminology problem and its misleading consequences. This strategy is presented.

The NIH Collaboratory asserted “If we want more evidence-based practice we need more practice-based evidence” [1]. ‘Evidence-based’ practice suggests that day-to-day healthcare decisions are based on scientific data. For many, ‘scientific data’ is limited to results from randomized controlled trials (RCTs).

Unfortunately, the term ‘randomized’ may have two meanings. Some investigators consider random allocation only a characteristic of experimental or explanatory studies that describe efficacy. According to these investigators the random allocation is not a characteristic of pragmatic or observational study designs that describe effectiveness [2–4]. Others believe that both types of studies should be randomized [5,6].

This paper addresses the variability in the use of these terms. Inconsistent use of terminology can confuse the interpretation of clinical study results and the recommendations in evidence-based clinical guidelines. Borrowing from other disciplines, we offer suggestions for a uniform terminology to improve communication in clinical research and healthcare delivery.

Description of the terminology problem

Study question

The aim of this study is to examine the variability in the terms used to describe explanatory and pragmatic trials and to consider how this variability affects clinical guidelines. Two contradictory concepts frame the terminology problem. The first is a functional dichotomy between explanatory and pragmatic trials: explanatory trials describe efficacy, in other words, the proof of principle under experimental conditions, while pragmatic trials are expected to describe effectiveness under nonexperimental real-world conditions. Our arguments for this concept are derived from papers by three research groups [2–4]. The second is an assumed continuum between explanatory and pragmatic trials. In this concept, trials are characterized by their form: both explanatory and pragmatic trials are randomized but differ gradually by having either more experimental or more pragmatic characteristics. The arguments for this concept are listed in the CONSORT 2008 statement [7] and the PRECIS-2 wheel [8].

Method

To answer our study question, we followed the strategy of designers and architects ‘form follows function’ [9] and identified publications that used terms to describe either the function or the form of a trial. These examples had to be identified by hand searching as there are neither search engines that identify terminology problems nor standardized tools such as PRISMA or GRADE that fit this purpose. Using PubMed and Google Scholar, we began by searching for titles that combined the term ‘terminology’ with at least one of the terms that described either the function or the form of the study. Ten terms were used in the searches: ‘experimental’, ‘observational’, ‘explanatory’, ‘pragmatic’, ‘efficacy’, ‘effectiveness’, ‘randomized’, ‘nonrandomized’, ‘analytical’ and ‘descriptive’. We excluded studies that used these terms but did not define them. Clinical examples were insufficient. Even this broad search strategy identified too few studies for the analysis of the ten functional or formal terms. Duplicate publications from the same research group were excluded when similar results were reported in another paper that was already included. Papers were included that had pairs of terms (e.g., randomized and nonrandomized) or matched single terms that were not supplemented by the corresponding term, for example, mentioning and explaining the term ‘explanatory’ but not the term ‘pragmatic’. The co-occurrence of the five pairs of terms was extracted from each study.
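The title search described above can be sketched as a small query generator. This is only an illustration: the exact query strings are not reported in the text, so the `[Title]` field tag (PubMed's syntax for restricting a term to the title) and the helper name `title_query` are assumptions.

```python
# The ten functional or formal terms listed in the Method section.
SEARCH_TERMS = [
    "experimental", "observational",
    "explanatory", "pragmatic",
    "efficacy", "effectiveness",
    "randomized", "nonrandomized",
    "analytical", "descriptive",
]


def title_query(term: str) -> str:
    """Pair 'terminology' with one term, both restricted to the title.

    The [Title] tag is an assumed PubMed-style field restriction; the
    original search strings are not given in the paper.
    """
    return f"terminology[Title] AND {term}[Title]"


# One query per term, to be run by hand in the search interface.
queries = [title_query(term) for term in SEARCH_TERMS]

for query in queries:
    print(query)
```

Each generated string covers one of the ten term combinations; the screening steps that follow (excluding undefined terms and duplicate publications) remain manual.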

Results

The initial search identified 40 studies. In 13 studies ‘terminology’ was combined with the term ‘analytical’, in nine studies with either ‘experimental’ or ‘effectiveness’, in seven with ‘randomized’, in two with ‘pragmatic’ and in none with either ‘explanatory’ or ‘nonrandomized’ or ‘observational’ or ‘descriptive’ or ‘efficacy’. Overall, two types of publications were identified: papers that assumed a dichotomy between randomized and nonrandomized or explanatory and pragmatic trials, and papers that assumed a continuum between randomized and nonrandomized or explanatory and pragmatic trials.

The most frequently used terms in methodological publications on randomized/nonrandomized or explanatory/pragmatic studies were the five pairs of terms: randomized/nonrandomized, explanatory/pragmatic, experimental/observational, analytical/descriptive and efficacy/effectiveness. Table 1 shows which of these terms each of the 11 review articles mentions and explains.

Table 1. Terminology in eleven methodological review articles on clinical trials.

Year	Authors	Randomized	Nonrandomized	Explanatory	Pragmatic	Experimental	Observational	Analytical	Descriptive	Efficacy	Effectiveness	N of explained terms/paper	Ref.
1967	Schwartz & Lellouch	N	N	+	+	N	N	N	N	+	N	3	[2]
1998	Roland & Torgerson	+	+	+	+	N	N	N	N	+	+	6	[10]
2002	Grimes & Schulz	+	+	N	N	+	+	+	+	N	N	6	[3]
2006	Gartlehner	+	+	+	+	N	+	N	N	+	+	7	[11]
2008	Zwarenstein et al.	+	+	+	+	N	N	N	N	(+)	(+)	5	[7]
2009	Thorpe et al.	+	+	+	+	+	N	N	+	(+)	(+)	6	[12]
2014	Thiese	+	+	N	N	+	+	+	+	+	(+)	7	[4]
2015	Loudon et al.	+	+	+	+	N	N	N	N	+	N	5	[8]
2016	Ford & Norrie	+	+	+	+	N	N	N	N	+	+	6	[13]
2016	Raine et al.	+	+	+	+	+	N	+	+	+	+	9	[14]
2017	Oude Rengerink et al.	+	+	+	+	N	N	N	N	+	+	6	[15]
	# of 11 reviews that explain the meaning of the used term	10	10	9	9	4	3	3	4	8	5

Review articles advocating the dichotomy of randomized and nonrandomized or explanatory and pragmatic trials are marked yellow. All other articles advocate randomization for both explanatory and pragmatic trials. The reviews marked in light blue meet the criteria proposed by Roland & Torgerson. The reviews marked in dark blue fit the definitions of the Roland & Torgerson review only partially and advocate a continuum between explanatory and pragmatic trials.

N: This term is not mentioned in this review.

+: This term is mentioned and is explained in the review.

(+): This term is mentioned but not explained in this review.

The columns of the table show that none of the ten terms was included in all 11 reviews, and the rows show that none of the 11 reviews included all ten terms. The most frequently described pairs of terms were randomized/nonrandomized and explanatory/pragmatic, while the least commonly mentioned pairs were experimental/observational and analytical/descriptive. Raine [14], Thiese [4] and Gartlehner [11] included the most terms in their reviews, while Schwartz & Lellouch [2], Zwarenstein [7] and Loudon [8] included the fewest.

This ‘Swiss-cheese pattern’ with small and big holes suggests low congruence in the terminology of study functions but does not provide sufficient explanatory information.
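The tallies in Table 1 can be recomputed from the table codes. The sketch below is an illustration, not part of the original analysis: the cell codes are transcribed from Table 1 as printed, and the names `TABLE_1`, `explained_per_review` and `explained_per_term` are hypothetical.

```python
# Column order of Table 1.
TERMS = [
    "Randomized", "Nonrandomized", "Explanatory", "Pragmatic",
    "Experimental", "Observational", "Analytical", "Descriptive",
    "Efficacy", "Effectiveness",
]

# Codes transcribed from Table 1:
# "+" explained, "(+)" mentioned but not explained, "N" not mentioned.
TABLE_1 = {
    "Schwartz & Lellouch 1967":   "N N + + N N N N + N",
    "Roland & Torgerson 1998":    "+ + + + N N N N + +",
    "Grimes & Schulz 2002":       "+ + N N + + + + N N",
    "Gartlehner 2006":            "+ + + + N + N N + +",
    "Zwarenstein et al. 2008":    "+ + + + N N N N (+) (+)",
    "Thorpe et al. 2009":         "+ + + + + N N + (+) (+)",
    "Thiese 2014":                "+ + N N + + + + + (+)",
    "Loudon et al. 2015":         "+ + + + N N N N + N",
    "Ford & Norrie 2016":         "+ + + + N N N N + +",
    "Raine et al. 2016":          "+ + + + + N + + + +",
    "Oude Rengerink et al. 2017": "+ + + + N N N N + +",
}


def explained_per_review(table):
    """Count the terms each review explains (exact '+' cells only)."""
    return {review: codes.split().count("+") for review, codes in table.items()}


def explained_per_term(table):
    """Count the reviews that explain each term (exact '+' cells only)."""
    rows = [codes.split() for codes in table.values()]
    return {
        term: sum(row[i] == "+" for row in rows)
        for i, term in enumerate(TERMS)
    }


per_review = explained_per_review(TABLE_1)
per_term = explained_per_term(TABLE_1)

for term, count in per_term.items():
    print(f"{term:14s} explained in {count} of {len(TABLE_1)} reviews")
```

On this transcription the per-term counts reproduce the bottom row of Table 1. The per-review counts agree with the ‘N of explained terms/paper’ column except for Zwarenstein et al., where counting only ‘+’ cells gives 4 and the table lists 5.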

Discussion

We used 11 methodological publications to demonstrate inconsistencies in the use of terminology in research that supports clinical care. The inconsistent terminology is not only an academic problem; it can have important implications for medical communication. Additional evidence suggests that inconsistent use of terminology undermines both basic and applied research. Mixing up efficacy and effectiveness research can hinder the interpretation of clinical studies [16,17] and the meaningfulness of clinical guidelines [18–22]. These guidelines will be influenced by efficacy and effectiveness data, but the direction of influence is not predictable when terms are not used consistently.

The RCT is an experimental method that should be applied only under experimental study conditions (ESC). The pragmatic controlled trial (PCT) is an observational method that should be applied only in a pragmatic or nonexperimental study under real-world conditions (RWC). To avoid confusion, experimental study results should not be generalized to nonexperimental conditions and vice versa. Experimental and nonexperimental studies can be differentiated by seven criteria [23,24]. In summary, a clear separation of the definitions of all facets of efficacy and effectiveness is essential for any further decisions and developments.

The PCT can assess outcomes of interventions under day-to-day RWC [25–27]. The final chapter of our 2006 book ‘Optimizing Health’ emphasized the importance of quantifying the value of healthcare to patients in day-to-day practice [28]. We now turn to theoretical frameworks for establishing the value of care for patients.

Valuing healthcare

Our current proposal is based on combining two important theoretical frameworks.

Cochrane–Bradford Hill questions

More than 70 years ago, Archie Cochrane and Austin Bradford Hill distinguished three questions that must be considered in the evaluation of clinical interventions: ‘can it work?’, ‘does it work?’ and ‘is it worth it?’ [29].

BAUHAUS Dessau & the Hochschule für Gestaltung Ulm

In addition to the Cochrane–Bradford Hill questions, architects have been considering the concept of ‘form follows function’ for over 100 years. In Figure 1 the three Cochrane–Hill questions are combined with the two dimensions defined by the architects and designers. These two dimensions are the functional and the formal or structural characteristics of a (research) project.

Figure 1. This figure connects two types of information: the information provided by designers & the information provided by the answers to the three Cochrane–Hill questions.
As a result of this fusion of concepts from architecture and epidemiology the emerging ‘syntopy’ enables the differentiation of the traditional terminology into a ‘functional terminology’ and a ‘formal terminology’. The resulting differentiated terminology is a possible solution of the existing terminology conflict as shown in this figure.
ESC: Experimental study condition; RWC: Real world condition.

The first of the Cochrane–Hill questions addresses the ‘proof of principle’. Many RCTs performed under ideal circumstances can be used as proof of principle. They establish ‘efficacy’ [2–4]. The confirmation of ‘efficacy’ requires only a few studies: in principle, the answer can be provided by a single experiment that has been confirmed once or twice. Importantly, efficacy trials are often conducted under conditions quite dissimilar to everyday practice. Exclusion criteria dismiss patients with many of the most common comorbidities, and active run-in periods exclude participants who are not perfectly adherent.

An RCT cannot be used for assessment of effectiveness because the RWCs need to be significantly modified before an RCT can be performed. The application of an RCT and the measurement of effectiveness are, therefore, incompatible. The challenge we had to solve was the development of a method that can render the ‘natural chaos’ under RWC into an evaluable order without changing the characteristics of the RWC. Characteristics under RWC are:

• Under RWC the allocation is always made by the practitioner (ideally together with the patient) but never by the investigator;

• Under RWC exclusion criteria cannot be applied. All patients who meet the inclusion criteria of a PCT may be offered care independent of comorbidities, cotreatments or other confounding factors;

• Under RWC it is the responsibility of the attending doctor to find the best therapy for each patient;

• Under RWC answers to the second and third question of the Cochrane–Hill strategy will be provided.

We suggest that a solution of the terminology problem combines the Cochrane–Hill strategy with the recommendation of architects and designers, in other words, ‘form follows function’ [9].

Albert Einstein is often quoted as saying that “problems cannot be solved with the same mindset that created them”. Health services researchers seem to agree with the three Cochrane–Hill questions and with the three most important outcome dimensions: efficacy, effectiveness and value. But there is little recognition that different tools are required to answer the three different questions.

Is efficacy/effectiveness a continuum?

It is often assumed that there is a continuum between efficacy and effectiveness. However, we believe the distinction between efficacy and effectiveness is better represented as a dichotomy than as a continuum. A continuum between efficacy and effectiveness would require an intermediate stage between the two. We have not been able to identify such an intermediate stage. Clear outcome dimensions can be assessed in efficacy and effectiveness trials: the proof of principle and the real-world effectiveness. It is not possible to define the outcome dimension that would be assessed in a trial that is neither explanatory nor pragmatic.

According to our dichotomous model, outcomes can be assessed either under ESC or under RWC but not under conditions between ESC and RWC. Our model presumes that efficacy can be assessed under ESC only while effectiveness and value can be assessed under RWC only. In other words, our dichotomous model clearly differentiates two functions and two forms.
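The dichotomous model can be written down as a small lookup structure. The sketch below is a minimal illustration under stated assumptions: the pairing of questions, outcomes and conditions follows the text, while the class `Assessment`, the tuple `MODEL` and the function `condition_for` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    question: str   # Cochrane-Hill question
    outcome: str    # outcome dimension (the function)
    condition: str  # study condition (the form): "ESC" or "RWC"


# Efficacy is assessed under ESC only; effectiveness and value under RWC only.
MODEL = (
    Assessment("can it work?", "efficacy", "ESC"),
    Assessment("does it work?", "effectiveness", "RWC"),
    Assessment("is it worth it?", "value", "RWC"),
)


def condition_for(outcome: str) -> str:
    """Return the only study condition under which an outcome can be assessed."""
    for assessment in MODEL:
        if assessment.outcome == outcome:
            return assessment.condition
    # The model defines no intermediate stage, so anything else is rejected.
    raise ValueError(f"no outcome dimension named {outcome!r} in the model")


for assessment in MODEL:
    print(f"{assessment.question:16s} {assessment.outcome:14s} {assessment.condition}")
```

Because the condition is a lookup with exactly two possible values, a request for an outcome between efficacy and effectiveness has no entry and raises an error, which mirrors the claim that no intermediate stage exists.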

The disadvantage of our model is that it requires a new tool for valid assessment of effects under RWC. No gold standard has yet been defined for this purpose. The development of our model has been described in several publications [9,24,27,30] and is summarized in Figure 2.

Figure 2. The three questions posed by Archie Cochrane & Austin Bradford Hill were used to propose the described strategy & answers to the three questions: the ‘Proof of Principle’, in other words, the efficacy, the real world effectiveness & the subjective value.
The three answers can be assessed either under ideal study conditions or under real world conditions from the perspectives of clinical research, health services research or economic research. The three different answers provided under different conditions represent different perspectives, require three different types of studies and three different tools for assessment of either efficacy or effectiveness or value.

Future perspective

Over the next 5–10 years, we expect an evolution toward greater acceptance of studies conducted under real-world conditions. Although we will continue to ask whether new treatments can work (efficacy), there will be greater attention to whether they do work in clinical practice (effectiveness) and whether they are valuable from the patient’s perspective.

Clarification of the terminology around efficacy and effectiveness will be required to achieve this transition. Further, dependence on the traditional RCT as the primary method of evaluation will give way to a mix of RCTs conducted under experimental conditions and pragmatic trials conducted under real-world circumstances. The research toolbox will be expanded to include advances in clinical research, health services research and economic analysis. We hope that clinical guidelines will require that the evidence has relevance to the circumstances in which providers practice and patients receive care. Regulators and insurers should also attend to a broader range of evidence for approval decisions.

Executive summary

• Designers and architects created the rule ‘form follows function’ for their own profession. When we tried to apply this rule to the 3D-assessment of outcomes in healthcare, we observed an existing conflict between form (structure) and function in some of the studies. The 3D-assessment is based on recommendations by Archie Cochrane and Austin Bradford Hill. It includes the assessment of three different outcome dimensions (efficacy, effectiveness and value) and three different tools for assessment of these outcomes. The important aspects are:

◦ The form (structure) of randomized controlled trials is well defined by two essential characteristics, the definition of ‘exclusion criteria’ and the ‘random-based allocation’.

◦ The scientific literature describes two different functions of clinical studies. One function is the description of efficacy, in other words, the confirmation of the proof of principle. The second function is the description of outcomes following care under real-world conditions, in other words, the confirmation of real-world effectiveness.

◦ Some authors use randomized controlled trials to confirm both efficacy and effectiveness. This means that the same structure of a clinical trial is expected to confirm two different functions.

• To support our observation, we present the inconsistent terminology for differentiation of efficacy and effectiveness and report the consequences of this conflict.

• We propose a strategy and a method to avoid this conflict. The strategy is based on the demands of A Cochrane and AB Hill (‘can it work?’, ‘does it work?’ and ‘is it worth it?’).

Acknowledgments

This work would not have been possible without the support of colleagues and students who contributed to this publication. Cooperating colleagues: V Balassone (Tor Vergata University, Rome/Italy), M Eisemann (Arctic University of Norway, Tromsø/Norway), A Ghosh & K Ghosh (Mayo Clinic, Rochester/MN, USA), GO Kamga Wambo (ICE e.V, Ulm/Germany), F Lobmeyer (University of Ulm/Germany), PCM Mayer (UNICEUMA, Imperatriz/MA/Brazil), P Rosati (Ospedale Pediatrico Bambino Gesu, Rome/Italy), TG Thomaz (UFF, Niteroi/RJ/Brazil), C Weiß (Medical Statistics Mannheim Hospital/University of Heidelberg, Mannheim/Germany) and M Weiss (University Hospital Ulm, Ulm/Germany). Doctoral fellows: M Phlippen, C Miller, M Schroeder, G Tzoulaki and F Wiedemann (Medical Faculty, University of Ulm, Ulm/Germany).

Financial & competing interests disclosure

The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

No writing assistance was utilized in the production of this manuscript.

References

Papers of special note have been highlighted as: • of interest; •• of considerable interest

1. Green LW. Public health asks of systems science: to advance our evidence-based practice, can you help us get more practice-based evidence? Am. J. Public Health 96, 406–409 (2006).