Original Article
Scaled rectangle diagrams can be used to visualize clinical and epidemiological data

https://doi.org/10.1016/j.jclinepi.2005.01.018

Abstract

Objective

To illustrate scaled rectangle diagrams as a method for displaying clinical and epidemiological attributes (such as symptoms, signs, results of marker tests, disease, or risk factors). These are quantitative Venn diagrams that use rectangles instead of circles.

Study Design and Setting

The method is illustrated through examples from various data sets with different types of clinical information.

Results

Examples drawing on studies of lung disease, rheumatic fever, blood pressure, lipid levels, sudden infant death syndrome, and low birth weight illustrate the different types of relationships between variables that the scaled rectangle approach can reveal (e.g., high- and low-risk groups; dependent, independent, or co-occurring attributes; effects from choice of cutoff; cumulative distributions; and case-control attributes).

Conclusion

Scaled rectangle diagrams are a novel way to display clinical data. They show clearly the relative frequency of clinical attributes and the extent to which they are shared characteristics. Features are revealed that might otherwise not have been appreciated.

Introduction

Venn diagrams have often been suggested in clinical research to symbolically show how features such as clinical signs, symptoms, tests, and staging coincide. Feinstein [1] pioneered their use. Some authors have tried scaling Venn diagrams so that the circles and their intersections are drawn in proportion to frequencies of occurrence. Feinstein [1], however, noted the technical difficulty in creating such proportional Venn diagrams and observed that, although they may demonstrate quantitative relationships, the qualitative aspects of the relationships may be less clear than with a purely symbolic nonproportional Venn diagram. In an earlier publication [2], I showed that the technical difficulty is reduced somewhat by using rectangles instead of circles. The resulting scaled rectangle diagrams often show both qualitative and quantitative information reasonably well.

Here, examples of the analysis of clinical and epidemiological data will illustrate the use of scaled rectangle diagrams. With examples used as case studies, some of the ideas I had earlier presented [2] will be advanced in a number of ways. For example, despite my original suggestion that four attributes may represent a limit [2], it is sometimes possible to represent more. New criteria for fitting scaled rectangle diagrams use a penalizing function to avoid narrow rectangles with a high length-to-breadth ratio. Color or shading can be used to represent the intensity of an additional variable, and scaled rectangle diagrams can be used to represent combinations of attributes and high- and low-risk subgroups. In some examples, scaled rectangle diagrams will be compared with Feinstein's original Venn diagrams. Other examples draw on studies of hypertension, lipids, sudden infant death syndrome, and low birth weight to illustrate the method.

Section snippets

Theory and methods

Suppose there exist attributes (e.g., signs, symptoms, or tests) that are measurable on patients and there is a data set with recorded presence of attributes in a sample of patients. The relative frequency, or prevalence, of each attribute can be computed and rectangles drawn with area proportional to prevalence. A scaled rectangle diagram is formed by positioning these rectangles so that not only are the areas of the rectangles proportional to attribute prevalence, but also with areas of
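As a rough illustration of the area-to-prevalence idea described above, the following Python sketch computes attribute prevalences from presence/absence records and sizes a rectangle whose area equals a given prevalence. It is not the author's fitting procedure: the attribute names, the sample records, and the helper functions `prevalence`, `joint_prevalence`, and `rectangle_for` are all hypothetical, and the sketch does not attempt the harder step of positioning rectangles so that their overlaps are also in proportion.

```python
from math import sqrt


def prevalence(records, attribute):
    """Fraction of patients in whom the attribute is recorded as present."""
    return sum(1 for r in records if r[attribute]) / len(records)


def joint_prevalence(records, attributes):
    """Fraction of patients in whom every listed attribute is present."""
    return sum(1 for r in records if all(r[a] for a in attributes)) / len(records)


def rectangle_for(prev, aspect=1.0, unit_area=1.0):
    """Return (width, height) with area prev * unit_area and width/height = aspect."""
    area = prev * unit_area
    height = sqrt(area / aspect)
    width = aspect * height
    return width, height


# Hypothetical presence/absence data for four patients (illustration only).
records = [
    {"cough": True, "wheeze": True},
    {"cough": True, "wheeze": False},
    {"cough": False, "wheeze": False},
    {"cough": True, "wheeze": True},
]

p_cough = prevalence(records, "cough")
p_wheeze = prevalence(records, "wheeze")
p_both = joint_prevalence(records, ["cough", "wheeze"])

# A rectangle for "cough" that is twice as wide as it is tall.
width, height = rectangle_for(p_cough, aspect=2.0)
```

The `aspect` parameter is an assumption of this sketch; it simply shows that many rectangles share the same area, which is the freedom a fitting procedure can exploit when arranging the overlaps.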

Feinstein's lung disease example

Feinstein [1] describes a sample of 175 patients with lung disease. Patients were classified with the characteristics emphysema (EMPH), mucous gland hyperplasia (MPH), ‘blue and bloated’ (BB), ‘pink and puffy’ (PP), and with neither BB nor PP (NN). Feinstein presented two attempts to draw a quantitative Venn diagram; the better of these is reproduced in Fig. 1a. Figure 1b is a scaled rectangle diagram for these data. It shows the relative sizes of the subgroups more accurately than Feinstein's

Discussion

Clinical and epidemiological researchers are often interested in the extent to which signs, symptoms, results of tests, risk factors, and other clinical observations co-occur. Often an appreciation of the actual extent of co-occurrence is lost by the reduction to summary measures of association, which are usually intended to convey departure from statistical independence; however, although independence is a fundamental concept, it is instructive to actually visualize the co-occurrence of
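The point about summary measures can be made concrete with a small, made-up example. The Python sketch below builds two hypothetical 2x2 tables that share the same odds ratio yet differ greatly in how often the two attributes actually occur together. The counts and the helper `summarize` are assumptions chosen for illustration, not data from the article.

```python
def summarize(a, b, c, d):
    """Summarize a 2x2 table.

    a = both attributes present, b = only the first,
    c = only the second, d = neither.
    """
    n = a + b + c + d
    return {
        "odds_ratio": (a * d) / (b * c),
        "both": a / n,          # proportion with both attributes
        "first": (a + b) / n,   # prevalence of the first attribute
        "second": (a + c) / n,  # prevalence of the second attribute
    }


# Hypothetical counts: two common attributes versus two rare ones.
common = summarize(40, 10, 10, 40)
rare = summarize(4, 6, 6, 144)
```

Both tables give an odds ratio of 16, but the attributes co-occur in 40% of the first sample and in only 2.5% of the second, a difference that an area-based display would make visible and the odds ratio alone would not.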

References (16)

  • K.M. Johnson. The two by two diagram: a graphical truth table. J Clin Epidemiol (1999)
  • J.J. Foster. The influence of shape on apparent area: a new demonstration. Acta Psychol (1976)
  • A.R. Feinstein. Clinical judgment. (1967)
  • R.J. Marshall. Displaying clinical data relationships with scaled rectangle diagrams. Stat Med (2001)
  • R.J. Marshall. Search Partition Analysis. SPAN software home page [Internet]. 1999–2001. Available at:...
  • C. Bullen et al. Cardiovascular disease risk factors in 65–84 year old men and women: results from the Auckland University Heart and Health Study. N Z Med J (1998)
  • E.A. Mitchell et al. Results from the first year of the New Zealand cot death study. N Z Med J (1991)
  • R.J. Marshall. Determining and visualising at-risk groups in case-control data. J Epidemiol Biostat (2001)
There are more references available in the full text version of this article.
