ChatGPT? More like EmoGPT!

Let me tell you a secret. What if I told you that ChatGPT was an emo kid?

But wait. It’s not the typical emo kid you’re imagining. It doesn’t wear dark eyeliner, have flat, jet-black hair with long bangs, or listen to songs about insecurity and failed romance while wearing the darkest black hoodie you have ever seen. (Of course, duh.) It is an emo kid because it is emotional. I know what you are probably thinking right now: But it’s not a human being. It doesn’t have any emotions.

Well, you see, Large Language Models (LLMs) are becoming increasingly powerful. As powerful as they are, however, LLMs are very sensitive to prompts: subtle changes in word order or diction can produce drastically different responses. So, one might ask: Are LLMs sensitive to emotion-evoking words or phrases? According to this study, they are! In fact, these emotion-evoking words or phrases can be used as stimuli to improve LLMs’ performance on certain tasks. (Surely, the title of this article isn’t clickbait.)

While there are already many prompt engineering techniques for improving LLMs’ performance, such as the self-refining prompt technique, the study at the center of this article reveals a novel approach: enhancing LLMs’ performance with emotional stimuli. (The researchers called this approach “EmotionPrompt.”)

Before we take a look at how EmotionPrompt works and its impact on LLMs’ performance, we need to understand a bit about psychology and how it led the researchers to invent EmotionPrompt. Previous studies in psychology have found that exposing people to emotional stimuli can have a prominent positive impact in many areas, such as improving students’ educational success and regulating human motivation and goal pursuit. Inspired by this psychological phenomenon, the researchers took an interdisciplinary approach to see whether emotional stimuli would have a similar positive effect on LLMs’ performance.

To test this idea experimentally, the researchers first had to decide which emotional stimuli to use. They based the stimuli on three foundational psychology theories that essentially focus on an individual’s innate desire to feel valued in society and the motivation to be in control of significant events in one’s life. Here are the 11 emotional stimulus sentences that the researchers designed for LLMs, classified into 2 categories, one focusing on social effects and the other on self-esteem:

These emotional stimuli work in a very simple way: you just add them to the original prompt. Here’s an example scenario created by the researchers:

As you can see in the illustration, they simply added the emotional stimulus “This is very important to my career.” after the original prompt. This means that the method is also easy for FlowGPT members to try! Before we talk more about the researchers’ methodology, let’s look at another real-life example using a cool prompt from our FlowGPT community.

Here’s a prompt that tells ChatGPT to function like a Chinese translator. (Prompt credit to 泡沫店⻓店⻑)

Now, all I am going to do is add the emotional stimulus after the original prompt:

Voilà! Not that hard, right? (Try one yourself!)
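If you would rather do this in code than by hand, here is a minimal sketch of the same idea. It is not the researchers’ implementation: the function name `add_emotional_stimulus` and the sample translation prompt are made up for illustration, and only the stimulus sentence itself comes from the example above.

```python
# Stimulus sentence quoted in the researchers' example above.
DEFAULT_STIMULUS = "This is very important to my career."


def add_emotional_stimulus(prompt: str, stimulus: str = DEFAULT_STIMULUS) -> str:
    """Append an emotional stimulus after the original prompt.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    # Trim trailing whitespace so exactly one space separates prompt and stimulus.
    return f"{prompt.rstrip()} {stimulus}"


if __name__ == "__main__":
    # Placeholder prompt, not taken from the study or from FlowGPT.
    original = "Translate the following sentence into Chinese: Good morning."
    print(add_emotional_stimulus(original))
```

The string it returns is what you would paste into ChatGPT, or send through whatever API client you already use.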

Now, coming back to the study: to measure the impact of the emotional stimuli, the researchers ran an experiment evaluating the performance of 4 LLMs (ChatGPT, T5-Large, Vicuna, and Bloom) on various language understanding tasks, from simple phrase structure to similarity and causality identification, with and without emotional phrases. The results showed that EmotionPrompt achieved better performance on all of the tasks and was 10% more accurate on half of them. Furthermore, EmotionPrompt significantly improved the truthfulness and informativeness of the LLMs’ responses to the prompts used in the experiment.
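To get a feel for what a “with and without” comparison looks like, here is a rough sketch of a tiny evaluation harness. It is an assumption-heavy illustration, not the researchers’ benchmark code: `ask_model` is a hypothetical stand-in for whatever function sends a prompt to your LLM and returns its answer, the exact-match scoring is a simplification, and you would supply your own labeled examples.

```python
from typing import Callable, Iterable, Tuple

# Stimulus sentence quoted earlier in this article.
STIMULUS = "This is very important to my career."


def accuracy(
    ask_model: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
    stimulus: str = "",
) -> float:
    """Fraction of (prompt, expected) pairs the model answers correctly.

    `ask_model` is a hypothetical callable: prompt in, answer text out.
    Scoring is a case-insensitive exact match, which is a simplifying assumption.
    """
    examples = list(examples)
    if not examples:
        raise ValueError("need at least one example")
    correct = 0
    for prompt, expected in examples:
        # Append the stimulus after the original prompt (or nothing for the baseline).
        full_prompt = f"{prompt} {stimulus}".strip()
        if ask_model(full_prompt).strip().lower() == expected.strip().lower():
            correct += 1
    return correct / len(examples)


def compare(
    ask_model: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
    stimulus: str = STIMULUS,
) -> Tuple[float, float]:
    """Return (baseline accuracy, accuracy with the emotional stimulus)."""
    examples = list(examples)
    return accuracy(ask_model, examples), accuracy(ask_model, examples, stimulus)
```

Running `compare` on the same examples for each model gives you two numbers per model to set side by side. Keep in mind that a handful of examples will be noisy, so do not read much into small differences.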

So, why does EmotionPrompt work? According to the researchers’ analysis of the results of the experiments, there are two main takeaways:

Emotional stimulus can enhance the attention of original prompts.
Positive words (e.g. “confidence”, “sure”, “success” and “achievement”) have a larger impact on LLMs’ performance.

Despite the amazing improvements in performance, however, EmotionPrompt has its own limitations. Firstly, emotional stimuli may not improve the performance of other LLMs as directly as they did for the 4 LLMs used in this study, so they might not be generally applicable to LLMs outside its scope. Secondly, emotional stimuli may not have the same impact on tasks outside the scope of this study. Thirdly, the effect may change across LLM versions, so the results may not be reproducible. Lastly, there is one more limitation that the author of this article thought about but that is not discussed in the paper: the impact of an emotional stimulus may vary with the length of the prompt. It may be the case that emotional stimuli aren’t as effective on long prompts, which are common among our FlowGPT members who are creating advanced prompts for hackathons (you should definitely join our next hackathon by the way).

Regardless of these limitations, however, this study revealed a novel, interdisciplinary approach to improving LLMs’ performance, one that merits further research at the intersection of AI and other academic disciplines.

Link to the full research paper: https://arxiv.org/pdf/2307.11760.pdf