Overcooked & Outplayed

This exercise is a part of Educator Guide: AI Influence and Organism Observations
A human (left) and an AI (right) collaborate to cook soup containing tomatoes (red and green objects) and/or onions (beige objects). In this case, the AI, but not the human, knows that the duo will receive a bonus if the human serves the soup. The second half of the video shows the result of a new training method in which an AI learns how to influence human behavior. Here, the AI has figured out that if it places a dish (white circle) next to the stove, the human will use it to deliver the soup at the bottom of the screen.

Directions for teachers:

To engage students before reading the article, have them answer the “Before Reading” questions as a warmup in class. Then, instruct students to read the online Science News article “AI learned how to sway humans by watching a cooperative cooking game.” Afterward, have them answer the “During Reading” questions.

As an optional extension, instruct students to answer the “After Reading” questions as a class discussion or as homework.

This article also appears in the April 20 issue of Science News. Science News Explores offers another version of the same article written at a middle-school reading level. Post this set of questions without answers for your students using this link.

Directions for students:

Read the online Science News article “AI learned how to sway humans by watching a cooperative cooking game.” Then answer the following questions as directed by your teacher.

Before Reading

1. Artificial intelligence, or AI, has developed quickly over the last few years. List three examples of AI you may see in daily life. Consider how AI technology might advance over the next five years. Do you feel more optimistic or more worried about how AI might change our lives? Explain your answer.

Answers will vary, but listed technologies may include Alexa/Siri, generative text models such as ChatGPT, or chatbots. Answers will vary regarding feelings about the advancement of such technology.

2. Come up with a one-sentence definition for the word “learn.” Based on your definition, do you think a machine—such as a computer—can learn? Explain why a machine could or could not meet the requirements of your definition.

Answers will vary regarding whether a machine can learn. But many responses may define “learn” as gaining new skills or information.

During Reading

1. This article says that people in the future will probably work closely with AI. What “crucial and pertinent problem” must researchers tackle to prepare for such a future?

Researchers must determine the extent to which AI can learn how to influence people’s behavior.

2. Name at least one method researchers have used in the past to train AI. What are some drawbacks to using that method?

Researchers have used reinforcement learning, where an AI interacts with an environment, to train AI. This method is not very efficient and can waste a lot of time.

3. Describe the four training methods researchers used in the study to train the AI.

In the first method, the AI imitated how humans played the game. In the second method, it imitated the best human performances. In the third method, it practiced with another AI and didn’t acknowledge the human performances. In the fourth method, the AI used offline reinforcement learning (RL) methods to predict the best behaviors to follow in order to achieve the highest score.

4. How did the “human-deliver” version of the game differ from the “tomato-bonus” version?

In the “human-deliver” version, the team earned double points if the human partner delivered the food order. In the “tomato-bonus” version, the team earned double points if they delivered the soup orders with tomato and no onion.

5. In this study, the human players were not told about the “double-point rules,” but the AIs knew about them. What does this setup require the AIs to do in order for the team to win?

The AI must find a way to influence its human teammate to follow these “hidden rules.”

6. The offline RL-trained AI outperformed AIs trained using the other three methods. Contrast, in percentages, the performance of the offline RL-trained AI teams with that of teams using the next-best training method.

Offline RL-trained AI teams scored about 50 percent more points than teams with AI that trained using the next-best method.

7. How did offline RL-trained AIs influence their human partners in the “human-deliver” games?

The offline RL-trained AI placed the dishes-to-be-served right next to its human partner.

8. What were the goals of the researchers’ follow-up study? What were the results?

The research team wanted to investigate whether AI could influence a human’s overall game strategy that involved multiple steps. In this new experiment, some AIs were instructed to consider their human partners’ gameplay patterns and then use that information to devise a strategy to win. The results showed that the AI that could work out its partner’s pattern scored about 50 percent more points than the AI that was not instructed to consider its partner’s strategy.

After Reading

1. Imagine a world in which AI technology can influence human behavior. What might be a positive implication of this development? What is one potential negative implication of this development? Discuss.

Responses will vary.

2. Refer to your answer to Question 1 in Before Reading. If you answered this question again, would you answer the same way or differently? Explain why you would or would not revise your original answer. Refer to findings from this article to support your answer.

Responses will vary.