Wirepod uses Large Language Models (LLMs) in streaming mode
Have the Vector robot deliver long, nuanced answers
Wirepod, now the de facto server for handling the Vector robot, has seen many updates in the last two months, including language support for Turkish and Russian, and the ability to process streaming responses from the knowledge graph backed by a Large Language Model (LLM) provider such as OpenAI or Together AI. This article focuses on streaming responses.
Vector’s Knowledge Graph
In a previous post, we explored how Vector’s Knowledge Graph is facilitated by Wirepod using services from OpenAI or Together AI. However, one challenge with Wirepod was that Vector could only deliver short answers. Vector’s built-in firmware (which cannot be changed, although project vic-yocto is working on that) imposes a timeout by which an answer must be received from the Knowledge Graph. If the response doesn’t arrive in time, Vector simply shrugs it off, saying that it cannot answer the question. The LLMs provided by OpenAI and Together AI take a fair bit of time to formulate a long response, and Vector can easily time out while waiting.
Streaming Mode
Both OpenAI and Together AI also offer a streaming mode. Think of streaming mode as a straw from which you can continuously sip water. You open a connection to the service, and tokens are streamed back to you as they are generated on the fly. You can listen on the connection and retrieve the tokens as they arrive, instead of waiting to consume the entire generated text in one shot. This lets you process the tokens, such as by reading them aloud, as soon as they are available.
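To make this concrete, here is a minimal Go sketch of streaming from OpenAI using the community go-openai client (github.com/sashabaranov/go-openai). The model name and prompt are placeholders, and this is not wirepod’s actual code, just an illustration of the streaming pattern.

package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	client := openai.NewClient(os.Getenv("OPENAI_API_KEY"))

	// Stream: true asks the service to send tokens as they are generated.
	stream, err := client.CreateChatCompletionStream(context.Background(),
		openai.ChatCompletionRequest{
			Model: openai.GPT3Dot5Turbo,
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "What is the capital of France?"},
			},
			Stream: true,
		})
	if err != nil {
		panic(err)
	}
	defer stream.Close()

	// Each Recv returns only the newly generated tokens (the "delta").
	for {
		resp, err := stream.Recv()
		if errors.Is(err, io.EOF) {
			break // generation finished
		}
		if err != nil {
			panic(err)
		}
		fmt.Print(resp.Choices[0].Delta.Content)
	}
	fmt.Println()
}

Notice that the caller can act on each delta immediately, long before the full answer is complete. That property is exactly what keeps Vector inside its firmware timeout.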
In Vector’s case, the streaming feature gives Vector a continuous feed of data to talk about, rather than waiting and timing out. Wirepod leverages the streaming mode of OpenAI and Together AI in a clever way: it listens to the stream and copies the received text (tokens) into a list from which Vector keeps reading and narrating. This creates the seamless effect of Vector narrating long passages of text. If you read Go, you can check the git commit that delivers this feature, and see for yourself how much Vector can tell us about the capital of France. A simplified sketch of this pattern follows.
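Wirepod’s real implementation lives in its Go source; the following is a hypothetical, simplified sketch of the same producer/consumer idea: one goroutine appends streamed tokens to a shared list while another drains that list and hands text to the robot.

package main

import (
	"fmt"
	"sync"
	"time"
)

// textBuffer is a hypothetical stand-in for wirepod's token list:
// the producer appends streamed text, the consumer drains it.
type textBuffer struct {
	mu    sync.Mutex
	parts []string
	done  bool
}

func (b *textBuffer) append(s string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.parts = append(b.parts, s)
}

func (b *textBuffer) finish() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.done = true
}

// next returns any buffered text and whether the stream has ended.
func (b *textBuffer) next() (string, bool) {
	b.mu.Lock()
	defer b.mu.Unlock()
	out := ""
	for _, p := range b.parts {
		out += p
	}
	b.parts = b.parts[:0]
	return out, b.done && out == ""
}

func main() {
	buf := &textBuffer{}

	// Producer: pretend these are tokens arriving from the LLM stream.
	go func() {
		for _, tok := range []string{"Paris ", "is ", "the ", "capital ", "of ", "France."} {
			buf.append(tok)
			time.Sleep(100 * time.Millisecond)
		}
		buf.finish()
	}()

	// Consumer: Vector would narrate each chunk as soon as it is available.
	for {
		chunk, finished := buf.next()
		if chunk != "" {
			fmt.Print(chunk) // stand-in for sending text to Vector's TTS
		}
		if finished {
			break
		}
		time.Sleep(50 * time.Millisecond)
	}
	fmt.Println()
}

In wirepod, the consumer side hands each chunk to Vector’s text-to-speech, so narration begins while the LLM is still generating.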
Intent Graph Modifications
If you are familiar with Vector and watched the video carefully, you might notice that I didn’t have to say the word “Question” to access the Knowledge Graph. Wirepod’s intent graph implementation supports looking up a Large Language Model when the intent behind a person’s spoken request doesn’t match any of the known supported intents. In this case, Wirepod forwarded my request to the Knowledge Graph because my question about the capital of France did not match any of the intents (such as lifting the cube) supported by Wirepod. You can enable this behavior, forwarding unmatched requests to an LLM, via the intent graph option in the Knowledge Graph Setup. Here is a screenshot from my Wirepod:
Please note that enabling the intent graph will send more requests to the LLM provider, meaning that you will either run out of your free credits quickly or pay more. So, please use this feature wisely. Conceptually, the fallback behaves like the sketch below.
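The intent phrases, matching logic, and function names in this sketch are hypothetical simplifications, not wirepod’s real intent matcher; it only illustrates the routing idea of falling back to an LLM when no intent matches.

package main

import (
	"fmt"
	"strings"
)

// knownIntents is an illustrative list; wirepod's real intent matching
// is more sophisticated than simple substring checks.
var knownIntents = map[string]string{
	"lift the cube":   "intent_cube_lift",
	"come here":       "intent_come_here",
	"what time is it": "intent_clock_time",
}

// matchIntent returns the matched intent name, or "" if none matched.
func matchIntent(transcript string) string {
	t := strings.ToLower(transcript)
	for phrase, intent := range knownIntents {
		if strings.Contains(t, phrase) {
			return intent
		}
	}
	return ""
}

// handleRequest routes to a known intent, or falls back to the LLM
// when the intent graph option is enabled.
func handleRequest(transcript string, intentGraphEnabled bool) string {
	if intent := matchIntent(transcript); intent != "" {
		return "executing " + intent
	}
	if intentGraphEnabled {
		return "forwarding to LLM: " + transcript // hypothetical fallback
	}
	return "sorry, I don't understand"
}

func main() {
	fmt.Println(handleRequest("please lift the cube", true))
	fmt.Println(handleRequest("what is the capital of France", true))
}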
Conclusion
Wirepod has a fast-growing community of contributors, and features are being developed quickly. It’s a lot of fun watching Vector’s behavior emerge and seeing Vector live up to its potential. If you have a Vector, use Wirepod and try some of these new features for yourself.
I have also received many questions on how to update an existing Wirepod installation. Assuming you have access to the code on the server where Wirepod has been configured and deployed, here are the steps.
cd to the wirepod directory and stop the existing wirepod server. Then pull the latest code from the Wirepod git repository and restart the server using the following commands.
% git pull origin main
% sudo chipper/start.sh
This starts the wirepod server. Once the server has started, you will see messages such as these:
Configuration page: http://192.168.31.139:8080
Starting chipper server at port 443
wire-pod started successfully!
The wire-pod server has now started and is ready to handle Vector. You can point your browser at the configuration page (http://192.168.31.139:8080 in my case) and configure the server as well as Vector. Good luck!
Reference: a good discussion in this YouTube video: https://youtu.be/y_eu_Vp5_94