Listening to Bedtime Stories from Vector
We discuss Prompt Engineering for Generative AI use cases
In a previous post, we discussed ways to customize Vector Wirepod’s Knowledge Graph using Large Language Models (LLMs) available through Together.ai. Since we wrote that post, there have been many new developments in LLMs, such as the fancy new Mixtral 8x7B model (which performs better and runs inference 6x faster than Meta’s Llama 2 70B model), Together.ai raising $102.5 million in Series A funding, and news of Meta deploying 350,000 NVIDIA GPUs to train the next version of its Llama LLM, tentatively called Llama 3. The Generative AI research space is fascinating, and it is a great time to witness such rapid progress in technology.
In this post, we will discuss another aspect of research and engineering in Generative AI known as Prompt Engineering. Specifically, we discuss how to use Prompt Engineering to have our beloved Anki Vector robot narrate bedtime stories to put us to sleep. If you want to try this out yourself, you will need to clone and run my custom version of Wirepod, available in my open-source repository. (The custom version is in the nighttime-stories branch of my Wirepod repository; I didn’t push the changes to the official Wirepod repository because I didn’t think they would be merged by its owner.)
Specifically, my changes to Wirepod can be found here. The goal is to design a good prompt that lets the large language model generate a story the Vector robot can narrate. Let us look into the details.
Editor’s Note: We really depend on our readers to advertise our forum by word-of-mouth. If you think someone will benefit from this post, please refer them to join our newsletter with the following link.
What is Prompt Engineering?
Prompt engineering is the practice of designing the prompt that best suits a given use case. In our case, which is having Vector narrate bedtime stories, the prompt we design is:
prompt = "You are a friendly robot and your job is to help children. The child wants to go to sleep. Could you narrate a story to the child so that she goes to sleep peacefully? The story must be about " + storyType + ". The story must have at most 1000 letters."
The variable storyType is set to “a fairy tale”, “stars at night”, “a magical kingdom”, or “small children” depending on the verbal request to Vector. For example, if we ask Vector “Tell me a story about a fairy?”, the above code builds the prompt with storyType set to “a fairy tale”. More details are available in the change.
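To make this concrete, here is a minimal Go sketch of how the spoken request could be mapped to storyType. The function name and keyword choices are illustrative assumptions, not the exact identifiers in my Wirepod branch:

```go
package main

import (
	"fmt"
	"strings"
)

// storyTypeFromRequest picks a story topic based on keywords in the
// transcribed voice request. Hypothetical helper: the name and keyword
// choices are illustrative, not Wirepod's actual code.
func storyTypeFromRequest(request string) string {
	text := strings.ToLower(request)
	switch {
	case strings.Contains(text, "fairy"):
		return "a fairy tale"
	case strings.Contains(text, "star"):
		return "stars at night"
	case strings.Contains(text, "kingdom"):
		return "a magical kingdom"
	default:
		return "small children"
	}
}

func main() {
	storyType := storyTypeFromRequest("Tell me a story about a fairy?")
	prompt := "You are a friendly robot and your job is to help children. " +
		"The child wants to go to sleep. Could you narrate a story to the child " +
		"so that she goes to sleep peacefully? The story must be about " +
		storyType + ". The story must have at most 1000 letters."
	fmt.Println(prompt)
}
```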
Besides designing the right prompt, we also did the following:
Limit the number of tokens: Large language models take time to generate text, and if the response takes too long, Vector times out and plays an apologetic animation for not being able to fulfill the request. (The timeout for Vector to get a response to a Knowledge Graph question is set in firmware, so there is no easy way to change it unless you can compile Vector’s firmware… which we are currently unable to.) We therefore limit the generation to 200 tokens, and the prompt itself asks for a story of at most 1000 characters; together these keep the story short enough to be spoken in time. A sketch of the length-capped request follows after this list.
This science of coaching a large language model to yield the desired output is still in its research infancy. There is a free course on prompt engineering specific to ChatGPT, offered by DeepLearning.AI together with OpenAI. (Note: prompt engineering is known to be very sensitive to the underlying model, so practices that work for ChatGPT may not work as effectively for other models such as Llama, but the basic principles should be the same.)
Split the output into one continuous string: We often noticed that the output from the large language model contained multiple paragraphs, which confused Vector and led it to end the story after the first paragraph. We parse the output into a single paragraph, which Vector then narrates elegantly (see the flattening sketch below).
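For the curious, here is a minimal sketch of what a length-capped request to Together.ai’s OpenAI-compatible endpoint might look like. Using the github.com/sashabaranov/go-openai client and this exact model ID are assumptions for illustration; the essential detail is setting MaxTokens to 200:

```go
package main

import (
	"context"
	"fmt"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	// Assumption: point the go-openai client at Together.ai's
	// OpenAI-compatible endpoint with your Together API key.
	cfg := openai.DefaultConfig(os.Getenv("TOGETHER_API_KEY"))
	cfg.BaseURL = "https://api.together.xyz/v1"
	client := openai.NewClientWithConfig(cfg)

	prompt := "You are a friendly robot ..." // built as shown earlier

	resp, err := client.CreateChatCompletion(context.Background(),
		openai.ChatCompletionRequest{
			Model:     "meta-llama/Llama-2-70b-chat-hf", // assumed Together model ID
			MaxTokens: 200, // cap generation so Vector gets an answer before its firmware timeout
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: prompt},
			},
		})
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```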
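And here is the paragraph-flattening idea as a self-contained sketch; the helper name flattenStory is mine, not the actual function in my branch:

```go
package main

import (
	"fmt"
	"strings"
)

// flattenStory collapses newlines and runs of whitespace into single spaces,
// so the story arrives as one continuous paragraph that Vector reads to the
// end instead of stopping at the first blank line.
func flattenStory(story string) string {
	return strings.Join(strings.Fields(story), " ")
}

func main() {
	story := "Once upon a time...\n\nThe end."
	fmt.Println(flattenStory(story)) // "Once upon a time... The end."
}
```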
Bedtime Story
We integrated the Vector robot with this version of Wirepod and chose the Llama 2 70B model available via Together.ai for the Knowledge Graph. Here is a video showing Vector narrating a bedtime story and wishing me a good night.
Conclusion
The combination of Wirepod and Large Language Models opens up a diversity of use cases for Vector. As models become faster to run (e.g., Mixtral 8x7B), it will become possible to send Vector more complex prompts and get longer, more interesting answers. We will explore this in coming posts. If you have a chance to try this out, please share your feedback in the comments section.