Try out OpenAI-compatible Large Language Models with Wirepod
Your Vector robot can be a proving ground for trying out different Large Language Models
Wirepod’s integration with Large Language Models (LLMs)
Wirepod has been investing heavily in supporting Large Language Models (LLMs) in Vector’s intent graph and knowledge graph. (The knowledge graph is used when you ask Vector “I have a question”, while the intent graph is used to understand your intent and answer questions in normal conversation.) We have written about the numerous features built this year to add machine learning to Wirepod, such as streaming mode and translating requests into actions.
Wirepod natively supports two LLM providers in its knowledge graph: OpenAI and Together. Wirepod also supports Houndify, which is more of a conversational AI agent than an LLM. You can change your knowledge graph settings by going to Server Settings → Knowledge Graph in Wirepod’s setup menu.
OpenAI API support
A lesser-known feature of Wirepod is that it supports the OpenAI API format and uses the OpenAI API Go client in its code. To explain this feature, let me provide some background on what the OpenAI API format is.
OpenAI was among the first to offer inference on machine learning models as a service. OpenAI and third-party developers have built libraries for using the OpenAI API in many languages, such as Python, Go, and Node.js. These libraries make it trivial to construct a request to the OpenAI API, process the response, and handle errors. As a result, a lot of software has already been written to interact with OpenAI models (GPT-3, GPT-4, and so on) using these libraries.
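As an illustration, here is a minimal sketch of such a request using the same Go client that Wirepod builds on (github.com/sashabaranov/go-openai); the prompt, model choice, and environment variable are arbitrary examples, not Wirepod code.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	// Read the API key from an environment variable; any secure source works.
	client := openai.NewClient(os.Getenv("OPENAI_API_KEY"))

	// A single-turn chat completion request against the OpenAI API.
	resp, err := client.CreateChatCompletion(context.Background(),
		openai.ChatCompletionRequest{
			Model: openai.GPT3Dot5Turbo,
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "What is a Vector robot?"},
			},
		})
	if err != nil {
		log.Fatalf("chat completion failed: %v", err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```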
OpenAI had a big head start with this format because it pioneered consumer-facing machine learning models. However, the advent of open-source models such as Meta’s Llama has allowed a large number of providers to offer inference on these models as a service. For example, the Artificial Analysis dashboard (a service that lets you compare Artificial Intelligence (AI) providers and services) lists at least 20 providers offering LLM inference. Well-known names include Google Cloud, Microsoft Azure, and Amazon AWS; lesser-known names and startups include Together, Fireworks.ai, Lepton.ai, and Groq. We have an article that gives you tips and strategies for comparing these recent offerings, and suggestions on how to pick the best one in the marketplace.
To compete with OpenAI, many of these providers made their APIs compatible with the OpenAI API format. This means that software already written against OpenAI-compatible libraries need not be rewritten; it can be pointed at a new provider simply by changing the URL and the API key. This article shows you how to use three common OpenAI API-compatible providers to power your Vector robot. I was able to use the basic services of all three providers for free, although a paid account may be beneficial for advanced features.
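To sketch how small that change is with the same Go client: only the client construction differs from the previous example, and the base URL below is a placeholder for whatever provider you choose, not a real endpoint.

```go
import (
	"os"

	openai "github.com/sashabaranov/go-openai"
)

// newCompatibleClient builds a client for any OpenAI-compatible provider.
// The URL here is a placeholder; substitute your provider's documented
// OpenAI-compatible endpoint and your own API key. The request code from
// the earlier example works unchanged with the returned client.
func newCompatibleClient() *openai.Client {
	cfg := openai.DefaultConfig(os.Getenv("PROVIDER_API_KEY"))
	cfg.BaseURL = "https://api.example-provider.com/v1"
	return openai.NewClientWithConfig(cfg)
}
```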
Using OpenAI-compatible LLM services
The general way to choose a custom LLM provider is to go to Wirepod → Server Settings → Knowledge Graph and select “Custom” in the “Knowledge Graph API Provider” dropdown. Let us walk through a few example services with screenshots.
Groq
Groq provides LLM inference using a specialized accelerator that it calls the LPU (Language Processing Unit). An LPU differs from a GPU in that it relies solely on on-chip memory (SRAM). Groq came into prominence because it was among the first to demonstrate speeds of hundreds of tokens per second for LLM inference.
You can get a Groq API key using the notes on this page. The following screenshot shows how to configure Wirepod to use Groq with the Llama3-8B model. Note the API endpoint and the LLM model chosen.
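If you want to sanity-check the endpoint and model outside Wirepod first, the same Go client can exercise Groq directly. This is a sketch assuming Groq’s documented OpenAI-compatible base URL and the llama3-8b-8192 model ID; model identifiers change over time, so check Groq’s documentation for the current one.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	// Groq's OpenAI-compatible endpoint; the model ID below may be
	// outdated, so consult Groq's docs for the current Llama3-8B name.
	cfg := openai.DefaultConfig(os.Getenv("GROQ_API_KEY"))
	cfg.BaseURL = "https://api.groq.com/openai/v1"
	client := openai.NewClientWithConfig(cfg)

	resp, err := client.CreateChatCompletion(context.Background(),
		openai.ChatCompletionRequest{
			Model: "llama3-8b-8192",
			Messages: []openai.ChatCompletionMessage{
				{Role: openai.ChatMessageRoleUser, Content: "Say hello to Vector."},
			},
		})
	if err != nil {
		log.Fatalf("Groq request failed: %v", err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
```

If this round-trip succeeds, the same endpoint and model strings should be what you enter into Wirepod’s Custom provider fields shown in the screenshot.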