Learn With A Robot

Integrate Vector with Meta's Llama multimodal models

Llama 3.2 90B Vision is pretty cool

Amitabha Banerjee
Nov 18, 2024
∙ Paid

Introduction

In a previous post, we described how Vector can be integrated with GPT-4o via Wirepod. GPT-4o is a multimodal model from OpenAI that can reason across audio, vision, and text in real time ("multimodal" means supporting multiple modes of input: text, images, and speech in this case). With help from GPT-4o, Vector can understand the details in a picture it takes, which can potentially help Vector make sense of its surroundings. We previously posted a video demonstrating Vector describing its surroundings with help from GPT-4o.
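For readers who want to experiment outside of Wirepod, the request shape is simple. Below is a minimal sketch, using the official `openai` Python SDK; the prompt text and the image file name are our own placeholders, not anything Wirepod itself uses:

```python
import base64


def build_vision_messages(image_b64: str, prompt: str) -> list:
    """Pair a text prompt with a base64-encoded JPEG in the standard
    multimodal message shape accepted by the OpenAI Chat Completions API."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ]


# Example: encode a frame saved from Vector's camera (file name is hypothetical),
# then send it to GPT-4o. The actual call needs an OpenAI API key, so it is
# shown commented out:
#
# with open("vector_frame.jpg", "rb") as f:
#     image_b64 = base64.b64encode(f.read()).decode("ascii")
# messages = build_vision_messages(image_b64, "Describe what Vector is looking at.")
#
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-4o", messages=messages)
# print(reply.choices[0].message.content)
```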

While GPT-4o is great, it is also a pricey model: OpenAI currently charges about $10 per 1 million output tokens for it. There is a smaller version called GPT-4o mini, for which OpenAI charges about 60 cents per 1 million output tokens. We have not tried GPT-4o mini yet, but we intend to in the near future. Either way, these options from OpenAI mean that you must shell out a few bucks on your credit card if …
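To put those prices in perspective, here is a back-of-the-envelope estimate. The usage figures (50 descriptions a day, ~200 output tokens each) are our own assumptions, not measured values; only the per-million-token prices come from the text above:

```python
def monthly_output_cost(descriptions_per_day: int,
                        tokens_per_description: int,
                        usd_per_million_tokens: float,
                        days: int = 30) -> float:
    """Estimated monthly spend on output tokens alone."""
    tokens = descriptions_per_day * tokens_per_description * days
    return tokens / 1_000_000 * usd_per_million_tokens


# Assumed workload: Vector describes its surroundings 50 times a day,
# ~200 output tokens per description.
gpt4o = monthly_output_cost(50, 200, 10.00)      # GPT-4o: $10 / 1M output tokens
gpt4o_mini = monthly_output_cost(50, 200, 0.60)  # GPT-4o mini: $0.60 / 1M

print(f"GPT-4o: ${gpt4o:.2f}/mo, GPT-4o mini: ${gpt4o_mini:.2f}/mo")
# → GPT-4o: $3.00/mo, GPT-4o mini: $0.18/mo
```

Even under these modest assumptions the price gap is roughly 17x, which is why the cheaper alternatives matter for a hobby robot.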

This post is for paid subscribers

© 2025 Learn With A Robot