Purpose: Students will work in groups to develop a visual model of how coronavirus spreads and will discuss how such models might be used to inform decisions about public health.

Procedural overview: After reading the Science News article “COVID-19 case clusters offer lessons and warnings for reopening” and exploring the data, students will analyze data about one set of coronavirus clusters displayed in three different ways. Students will create a visual model to show how another set of virus clusters formed and expanded. Based on the models, students will discuss how data and their presentation can inform public health officials who are developing and evaluating plans to restrict or reopen communities.

Approximate class time: 1 to 2 class periods

Visual Models for How a Virus Spreads student worksheet
Computer with access to the interactive bubble diagram in the Science News article “COVID-19 case clusters offer lessons and warnings for reopening”
Interactive meeting and screen-sharing application for virtual learning (optional)

Directions for teachers:


After students read the Science News article “COVID-19 case clusters offer lessons and warnings for reopening,” introduce them to the core concepts of epidemiological methodology, including identifying how viruses spread, identifying outbreak clusters and tracing contacts between people. A print version of the article, “Lessons from COVID case clusters,” appears in the August 15, 2020 issue of Science News.

An outbreak cluster, or viral cluster, is a group of cases that are related by location and time. By studying clusters, scientists and health officials can construct a timeline or series that identifies the index case, or first person infected, and the people to whom the index case may have spread the disease. Clusters can be used to understand how infections spread. The coronavirus spreads when people are in close enough contact that an uninfected person can breathe in viral particles exhaled by an infected person. The virus can also be transmitted when someone touches surfaces contaminated with exhaled viral particles and then touches their face, especially their mouth, nose or eyes. Contact tracing is the process of identifying people who have come in contact with someone with an infectious disease and thus may have been exposed to the virus. Contact tracing allows health officials to identify specific activities or locations in which the virus is likely to spread, as well as to interrupt the cycle of infection by isolating potentially infectious people until they have cleared the virus from their system and are no longer contagious.

After students have read the article and explored the data, answer questions No. 1 and 2 as a class. If students will be doing this activity virtually, instruct them on the platform they should use for discussion (Zoom, Skype or another suitable chat program) and for data sharing (e-mail, Google Docs, Sheets or Slides).

1. How do analyzing clusters and contact tracing help identify the processes and factors involved in viral spread?

Students should think about how index cases are identified and how secondary and tertiary transmissions are mapped. By tracing contacts and movements of the infected people in a cluster and constructing a timeline, scientists are able to identify who spread the virus to whom and gain some clues to how. Analyzing clusters allows scientists to identify settings and conditions under which the virus is most likely to infect the most people. Scientists can then analyze the correlations and identify the most likely methods by which the virus spreads, whether that is by direct human contact; through food, water or waste; or by contact with nonhuman animals.

2. How did studying the restaurant cluster in Guangzhou, China, help scientists learn about how the virus spreads?

Students should discuss how investigators identified the index case and modeled air flow within the restaurant. By identifying the most likely infected person based on her movements and symptoms, researchers were able to identify the most likely source of infection. By modeling the air flow caused by the air conditioner, they were able to see how viral particles moved through the restaurant, which explained the pattern of infected vs. noninfected people at the restaurant.

Now guide the class as they analyze the interactive cluster bubble model included in the Science News article “COVID-19 case clusters offer lessons and warnings for reopening.” In this model, a “cluster” is a set of cases linked to a specific time and place. Students should explore the interactive bubble model for all settings and for a few specific settings. Encourage students to explore settings that show different patterns and to describe each setting in terms of the following characteristics:

  • Is it indoor or outdoor/confined or open?
  • Is it densely populated? How many people is each individual in that situation likely to encounter or interact with?
  • Is it a unique event or setting, or is it somewhere people return to repeatedly?
  • Are people close together or far apart? Are they likely to touch one another?
  • How long will people be in contact in each setting?
  • Is physical exertion likely to be high or low? (Will people breathe heavily?)
  • Are people easily able or likely to wear masks, wash their hands and/or practice social distancing?
  • Were viral clusters in each setting likely to be large or small?
  • Can you identify any correlations between the characteristics of the settings and the average size of the viral clusters?

Answer questions No. 3–6 as a class.

3. What do the known coronavirus clusters reveal about the environmental factors most likely to lead to larger clusters and outbreaks?

The clusters reveal that indoor settings are particularly problematic for the spread of coronavirus. Environments where people are packed together, breathing heavily and interacting closely and frequently are also likely to promote the spread of infection. Clusters in households are common, but the number of people infected is likely to be small. Factories, cruise ships, jails and eldercare facilities, where a lot of people are in close contact for extended periods of time, can promote larger clusters.

4. What information is presented in the bubble diagrams to describe and compare outbreak clusters? What information is not included in that model?

The bubble diagram in the article describes outbreak clusters by identifying the size of each outbreak in specific settings and whether the event occurred indoors or outdoors. You have to look at each cluster individually to find specific information, such as the number of people in the setting, the number of people infected and information about related clusters in the same set. The model does not indicate a timeline for infections within the cluster or identify the index case or who was infected by that individual. The bubble diagram also does not show how different clusters may have been connected to other clusters.

5. What are the benefits and limitations of using the bubble diagram? Why do you think the authors chose to use that cluster model?

Encourage students to think about what information is being presented and what information has been left out of the diagram. Guide students toward a discussion of relevancy for the intended audience and how much information is enough to be useful to a specific discussion without becoming too much and causing confusion or derailing a conversation.

6. Why is the way data are presented important to scientific understanding of a phenomenon?

The way data are presented highlights different information from which trends, correlations between variables and cause-and-effect relationships can be understood. The data provide the evidence around which hypotheses and theories can be developed and policies can be implemented. When data are presented in different ways, different relationships are highlighted or exposed, which changes the questions scientists ask and the conclusions they draw.

Data analysis

Students will now work in small groups to analyze three different visual displays of clusters of coronavirus cases. Students will start by analyzing the set of Cheonan fitness class clusters in the bubble diagram from the Science News article “COVID-19 case clusters offer lessons and warnings for reopening.” Then, groups should review the article “Cluster of coronavirus disease associated with fitness dance classes, South Korea,” and study the Figure from the main article and Table 2 in the linked Appendix. To support collaboration, consider having students use a Google Doc, a Zoom breakout room or some other platform that will allow members of each group to work simultaneously and to give one another immediate feedback.

Students should answer questions No. 7–15 in their groups.

Data analysis: Science News bubble diagram

7. Locate the Sport setting category in the interactive bubble diagram presented in the Science News article. How would you describe the clusters in the Sport setting in terms of average size? What other conclusions can you draw about the spread of the coronavirus in the Sport setting?

Generally, the Sport clusters are relatively small. Most of the transmission in Sport settings took place indoors, where people are likely to be close together and there is limited air flow. The data also suggest that when people breathe heavily, as they do when exercising, they may be more likely to spread or contract the virus.

8. Hover over each bubble in the Sport setting to identify the clusters in Cheonan, South Korea. What does this display of data communicate about how the disease was spread within the Cheonan clusters, and what information is left out of this diagram?

The Sport setting category includes information about each individual cluster in the set, but it does not make connections between the clusters in the set. In addition, the clusters are not organized in any sort of chronological order or in order of size of cluster or by which instructor was the index case in each cluster.

9. Who was the index case in each cluster in Cheonan, South Korea? About how many people total were infected during fitness classes in this set of clusters?

The index case in each cluster was the instructor. By adding up the “Total cases in cluster” for all of the bubbles that represent clusters in Cheonan, South Korea, you find that nearly 60 people (57) were infected within this set of clusters.

Data analysis: Table 2

10. Read the abstract of the article “Cluster of coronavirus disease associated with fitness dance classes, South Korea.” Review Table 2 in the linked Appendix, which describes the “attack rate,” or the percentage of people exposed to infection who contracted the disease. Was the disease transmitted at the same rate in all of the classes and by all of the instructors? What was the overall infection rate within the cluster set?

The overall infection rate, or attack rate, was 26.3 percent. The rate of transmission was not consistent between classes or between instructors. Classes ranged from 5 percent infection to 70 percent infection. Instructor A had an attack rate of about 20 percent, which was about half of the attack rate caused by Instructor B.
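For teachers who want students to check the arithmetic behind an attack rate, the calculation can be sketched in a few lines of Python. This is only an illustration: the class names and counts below are hypothetical and are not the values from Table 2. The sketch assumes the definition given in question No. 10 (infected divided by exposed, as a percentage) and shows that the overall rate comes from the pooled totals, not from averaging the per-class rates.

```python
def attack_rate(infected: int, exposed: int) -> float:
    """Return the percentage of exposed people who became infected."""
    if exposed <= 0:
        raise ValueError("exposed must be a positive count")
    return 100.0 * infected / exposed


# HYPOTHETICAL classes for illustration only (not Table 2 data):
# class name -> (number exposed, number infected)
classes = {
    "Class 1": (20, 1),
    "Class 2": (10, 7),
    "Class 3": (30, 6),
}

# Attack rate for each class on its own
per_class = {name: attack_rate(inf, exp) for name, (exp, inf) in classes.items()}

# Overall attack rate uses pooled totals across all classes
total_exposed = sum(exp for exp, _ in classes.values())
total_infected = sum(inf for _, inf in classes.values())
overall = attack_rate(total_infected, total_exposed)

# A simple average of the class rates is generally a different number
mean_of_rates = sum(per_class.values()) / len(per_class)

for name, rate in per_class.items():
    print(f"{name}: {rate:.1f} percent")
print(f"Overall: {overall:.1f} percent (mean of class rates: {mean_of_rates:.1f})")
```

Students can replace the hypothetical counts with the exposed and infected numbers they read from Table 2 and compare their result with the overall rate reported there.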

11. Based on Table 2, what information can you easily identify about how the disease spread within the clusters? How does this differ from the information you could identify in the bubble diagram?

I could easily identify the total number of people exposed and the total number infected within the clusters together and within each cluster. I can see which instructors were index cases for each cluster and how many people were infected during each class. This same information is included in the bubble diagram, but it is much easier to identify when looking at the table. A data table describing the data for “All Settings” in the bubble diagram would not be easy to read or to use to compare sizes of clusters or how the virus spread within indoor versus outdoor settings.

Data analysis: Article figure

12. Analyze the Figure in the main article. This diagram is a case map of the Cheonan clusters. It organizes the clusters by the index case (instructor shown as a red square) and the specific fitness class (cluster) as a vertical bar. It includes both a timeline for exposure and the relationship of the infected person to the index case and to other infected people. What information does this display have in common with the other two models?

Like the other two models, the case map identifies clusters within the set by index case and by the class, but it is much easier to see that all of the clusters in the set originated with a single instructor. It is organized vertically in the same order as the data table, and it includes information about the total number of exposures and the total number of infections connected with each instructor and each class. However, you have to count the yellow squares in each segment to determine the total number of infections in each of the clusters.

13. What information is included in this graphic that makes it different from the others?

This display is much more detailed than the other two models. This diagram provides a timeline, which shows the dates when students were infected. It also shows additional data about related clusters that formed when instructors or students infected family members, coworkers or acquaintances. So, this diagram shows how a set of clusters in one setting (Sport) can be connected to clusters in other settings (such as Household or Work). The bubble diagram does not include information about the transmission across settings.

Model analysis

14. Which of the three models did you find most interesting? Which did you find easiest to interpret? Why?

Answers will vary, and students may or may not choose the same model as their answer to both parts of the question. Students should discuss the visual appeal of the different models, as well as the amount of information provided in each model and the ease with which students could identify key information.

15. Did your preferred model have not enough information, just enough information or too much information? How could you improve that model to make it more informative?

Encourage students to discuss what defines “enough” information for the model. The amount of information that is sufficient for a model or visual display of data depends on the purpose of the model and its audience. One model may be better for one purpose or audience but may be lacking for another purpose or audience. Students may discuss ways to better organize or distinguish data within the models, such as using a stroke or outline color in the bubble diagram to show clusters within a setting that occurred in the same location, or they may want to put rules between clusters in the case map diagram or to differentiate between the total number exposed versus infected in each cluster of the case map so they don’t have to count boxes.

Model construction

Next, students will review the information provided in the Science News bubble diagram related to a set of clusters in Codogno, Lombardy, Italy. You may need to help students identify the clusters in the interactive bubble diagram. Because students have already explored the Sport setting, have students start there to identify the Italy cluster (this is the only “outdoor” cluster in the Sport setting). The text provided with that cluster explains that additional clusters can be found in the Party, Household and Hospital settings. Students will use that information to identify related clusters. Once they have identified the connected clusters in each setting, they should record information about cluster size and transmission within and between clusters.

Once students have identified and recorded all available information about this set of clusters, students will analyze the data and work in their groups to develop a new visual model to show how the infection spread in these clusters. To guide their development of the new model, students will answer questions No. 16–19 in their groups.

16. How many clusters are in the Codogno, Lombardy, Italy cluster set? What settings did these clusters occur in? How many people were infected?

There are four clusters in this cluster set. These clusters are represented by one cluster each in the Sport, Party, Household and Hospital settings. In the Sport setting, the initial patient infected one other person. In the Party setting, the initial patient infected three other people. In the Household setting, the initial patient infected one other person. In the Hospital setting, the patient infected eight other people, although it is unclear from the information how many of those infections were primary transmission directly from the index patient and how many were secondary or tertiary transmissions.

17. How well does the interactive bubble diagram illustrate how the infection spread across clusters? How could you create a visual model of these clusters that displays more clearly how the virus was transmitted?

The interactive bubble diagram divides the clusters by setting and so doesn’t visually connect them. A visual model that more clearly displays how the virus was transmitted by a single person to people in multiple settings could be a matrix or a flow chart or a case map like the one presented for the Cheonan fitness class clusters.

18. Is there any information missing that is necessary to construct a complete visual model of this set of clusters?

The bubble diagram does not include information about the total number of people exposed in each setting, so the attack rate for each cluster cannot be determined. It also does not include a timeline of exposure like the case map for the Cheonan cluster set did, but that is not completely necessary in order to construct a model that illustrates how the Italy clusters are connected.

19. Construct a new visual model to describe how the virus spread within the Italy cluster set.

Students should construct a visual model of the viral clusters. This could be a table, a flow chart, a matrix, a wheel and spoke diagram, a case map, or any other visual model that indicates the setting and the magnitude of each cluster and shows how they are connected.
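Groups that prefer to build their model on a computer could start from a small data structure like the one sketched below. It is one possible approach, not a required format. The counts are the numbers of other people infected in each setting as given in the sample answer to question No. 16; the names `italy_clusters` and `spoke_lines` are made up for this example, and the sketch assumes that no person is counted in more than one setting, which the bubble diagram does not confirm.

```python
# Number of other people the initial patient infected in each setting,
# taken from the sample answer to question No. 16.
italy_clusters = {"Sport": 1, "Party": 3, "Household": 1, "Hospital": 8}


def spoke_lines(index_label, clusters):
    """Build a text wheel-and-spoke view: one line per setting, largest first."""
    lines = [index_label]
    ordered = sorted(clusters.items(), key=lambda item: item[1], reverse=True)
    for setting, count in ordered:
        # One '#' per infected person shows the magnitude of each cluster
        lines.append(f"  -> {setting:<10} {'#' * count} ({count})")
    return lines


# Total assumes the four settings do not share any infected people
total_secondary = sum(italy_clusters.values())

print("\n".join(spoke_lines("Index patient", italy_clusters)))
print(f"Total other people infected: {total_secondary}")
```

The same dictionary could feed a hand-drawn flow chart or a spreadsheet chart. As the answer to question No. 16 notes, the data do not show how many Hospital infections came directly from the initial patient, so any model built this way cannot distinguish primary from secondary transmission in that setting.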

Application of the model

Allow each group time to present its model of the Italy clusters to you and to at least one other group to receive feedback. Then provide groups with a short time to revise their models if necessary. Then, act as a facilitator as students answer questions No. 20–23 as a class.

Use this time to formatively assess individual and group progress. Students should demonstrate an understanding of how data and their presentation can inform and affect public health decisions.

20. How did developing a model of viral transmission within a set of clusters affect your understanding of the modes of transmission?

Students should discuss how analyzing the existing diagrams and constructing their own model allowed them to find correlations between the characteristics of different settings and infection rates. They should also discuss how many different settings a single person routinely encounters when going about their normal life. When people move freely between settings, they are at a greater risk of contracting and spreading an infectious disease to others. The activity should also have helped students understand how virus clusters happen and how clusters can be interrelated.

21. What are the benefits and limitations of the different types of visual models you used or developed for displaying data about an infection’s spread?

Guide students to talk about each model separately and to compare the different models. Ask students: How does the presentation of data affect the way they are received or interpreted? How do the benefits and limitations of a model depend in part on its purpose and its audience? Some models are easier to extract information from at a glance. Other models may contain more information or organize data more effectively, but they may be more complex and more difficult to read.

22. How do scientists and public officials use data about a particular virus to inform public health decisions?

Students should think about how science and engineering influence the development of responses to social problems. Governments and health officials use information about how a virus spreads to determine which settings and activities involve the highest and lowest risk of infection. Then, they make decisions about how to protect the most people. Students may wish to discuss how public health decisions must take into account the safety of individuals and the entire population while also maintaining the stability of the economy and social and cultural needs of the population.

23. Which types of model do you think would be most informative for public health officials who are developing and evaluating plans to restrict or reopen communities? Support your selection with evidence and scientific reasoning.

Some questions to ask students to guide the discussion include: How can the same data be used to support different proposals for enacting or retracting public health directives? How might someone change the way they present data in order to make specific trends and patterns clearer to their audience? And how might the way data are presented influence public officials or administrators as they make policy decisions? Students should discuss how the most informative model depends on the context in which it is being used. Some public health decisions are complex and require analysis of complex data. If data are presented in a way that is oversimplified or lacks important details, decision-makers may not have the information they need to make the best decisions. However, including too much information may make the data confusing or tedious, likewise rendering a decision less than optimal. Students should conclude that the model that is most informative is the one that includes the most information that is relevant to the specific decision being made in a simple-to-digest format.

Possible extension

Another set of clusters that crossed settings was based in Chicago, Ill. This set of clusters is represented in the Meals setting and the Funerals setting in the Science News bubble diagram. Advanced students or students interested in further exploring the data could construct a new model to visually display that set of clusters. Data and graphics describing that cluster can be found in the linked article “Community transmission of SARS-CoV-2 at two family gatherings — Chicago, Illinois, February–March 2020.”

Additional resources

If you want additional resources for the discussion or to provide resources for student groups, check out the links below.

World Health Organization
Video press conferences on the ongoing coronavirus outbreak
Q&A on the same outbreak

U.S. Government
Guidelines for Opening Up America Again
CDC: Implementation of Mitigation Strategies for Communities with Local COVID-19 Transmission
CDC: COVID-19 Mathematical Modeling

Additional Science News articles
Here’s what we’ve learned in six months of COVID-19—and what we still don’t know
The Coronavirus Outbreak