New Course - Machine Learning from Scratch to Production

Learn how to build and deploy models with Roboflow

Dec 23, 2022

This year, I have been working on a new course that teaches the lifecycle of managing Machine Learning Operations. I posted a few chapters earlier this year as Chapter 1, Chapter 2, and Chapter 3. Many thanks to all of you who reviewed the material and provided feedback. Today, I am happy to finally release it as a complete course. Annual paying subscribers at LearnWithARobot.com will be able to enroll at no cost. (Please send me a note so that I can add you to the course website on Thinkific for free.)

From Scratch to Production

Why do this course?

In my previous course, Learn AI With A Robot, I taught the basic concepts of Machine Learning (ML), including how to train a Machine Learning model on a prepared dataset using Google Colab. However, training a model is only the first prerequisite for running a Machine Learning model in a production system. An effective Machine Learning solution requires iterative steps of improving the model: identifying situations where the model fails or is inaccurate → debugging the root cause of those inaccuracies → fixing the dataset with which the model was trained → retraining the model → examining whether the new model is better than the old model with respect to some Key Performance Indicators (KPIs). More and more companies are discovering that their success in Machine Learning/Artificial Intelligence depends on how quickly and effectively they can work through these iterations. Hence the need for a strong and effective Machine Learning pipeline.
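The last step of that loop, deciding whether a retrained model actually beats the old one, can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the KPI is taken to be the F1 score computed from precision and recall, and the names `EvalResult`, `f1`, `should_promote`, and the `min_gain` threshold are hypothetical, not part of the course material or of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Evaluation of one model version on a fixed test set (hypothetical structure)."""
    name: str
    precision: float
    recall: float


def f1(result: EvalResult) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    denom = result.precision + result.recall
    if denom == 0:
        return 0.0
    return 2 * result.precision * result.recall / denom


def should_promote(old: EvalResult, new: EvalResult, min_gain: float = 0.01) -> bool:
    """Promote the new model only if its F1 beats the old one by at least min_gain."""
    return f1(new) - f1(old) >= min_gain


if __name__ == "__main__":
    # Illustrative numbers only, not measured results.
    old_model = EvalResult("v1", precision=0.80, recall=0.70)
    new_model = EvalResult("v2", precision=0.85, recall=0.78)
    print("Promote new model:", should_promote(old_model, new_model))
```

Both models should be evaluated on the same held-out test set; otherwise the comparison says more about the data than about the models.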

There are few courses that teach the art of Machine Learning Operations (MLOps). In this course, I attempt to teach the fundamental concepts of MLOps in a concise 60-minute video: shorter than a soccer/football match. But first, let's check if you are ready for this course.

Prerequisites?

The course is designed to be technically light, as it is made for the average user who may not be familiar with computer science, system design, or machine learning. A high school degree is the basic requirement, along with an eagerness to learn fundamental technologies that are about to change our lives. The course makes use of a fun project: teaching a Vector robot to detect the presence of another Vector robot via object detection. However, the Vector robot is just an example. You could use the concepts you learn in this course to do any kind of machine learning using the datasets available at Roboflow. As an example, you could do a fun project of mounting an outdoor camera and counting the number of birds or squirrels that visit your backyard. Some additional hobby projects are here. So, to be clear, you do not need to own any Vector robots to take this course.

Some additional knowledge certainly helps, such as:

i) Experience training a Machine Learning model in a Jupyter notebook: Chapters 5 and 6 of my previous course, Learn AI With A Robot, discuss the important concepts.

ii) Experience with Python Programming is a plus. Many of the examples and demonstrations use Python, so prior experience with programming in Python definitely helps.

But just to reiterate, beyond these prerequisites, all you need is an eagerness to learn.

Why should you do the course?

Here are a few reasons to enroll in this course:

You want to learn why some companies succeed while many others fail in Machine Learning endeavors.
You want to learn how to use commercial Machine Learning data management tools such as Roboflow.
You want to learn the various ways in which you can deploy a Machine Learning Model in production and manage it.
You want to do your own hobby project using the free public datasets available at Roboflow.

Why is a Machine Learning pipeline important?

For motivation, here is a tweet from Andrej Karpathy, previously Director of AI at Tesla Motors.

Andrej Karpathy @karpathy

Potentially nitpicky but competitive advantage in AI goes not so much to those with data but those with a data engine: iterated data aquisition, re-training, evaluation, deployment, telemetry. And whoever can spin it fastest. Slide from Tesla to ~illustrate but concept is general

This slide effectively shows how hard it is to deploy a Machine Learning model in practice. Companies such as Tesla, Uber, Meta and Google have succeeded in Machine Learning because they made substantial investments in designing effective Machine Learning pipelines. Many of these companies have described their processes publicly. At the same time, other companies fail spectacularly with their Machine Learning models because they couldn’t build a Machine Learning pipeline. According to Chip Huyen, a leading designer of Machine Learning systems, 9 out of 10 Machine Learning models never get deployed to production. This course aims to collect the domain knowledge from both successes and failures in Machine Learning deployments, and to present it in a way that is effective and easy to consume.


Chapters

Here is an overview of the course chapters:

Chapter 1 provides an overview of how to design an ML solution from scratch and deliver it in production. Drawing from experience, I will walk you through all the steps that you need to take to build a working ML solution. We will consider an interesting problem to solve: training our favorite Vector robot to recognize another Vector robot in its proximity. We will then identify the steps needed for a solution that works efficiently and can also be managed in production. We will use some common ML software and services, specifically Roboflow and ModelDB. The ML lifecycle that we will introduce is described in the following figure.
The Machine Learning Operations involved in deploying a model which allows Vector to detect another Vector robot.
Chapter 2 discusses how to create a great computer vision dataset with the help of Roboflow. We start from how to collect images specific to your environment, such as a home robot in my case. We discuss the different aspects we need to keep in mind to create a diversified and representative dataset. We visit the different strategies to label images, including using ML to label with the help of Roboflow label assist. We discuss ways to post-process and augment the dataset while checking on the dataset health at the same time.
Chapter 3 focuses on how to train an ML model with your dataset, either using cloud services such as AWS SageMaker or Roboflow, or using your own deployment. We discuss the pros and cons of both approaches. Specifically for Roboflow, we discuss an interesting option of transfer learning, where we start from an existing trained model (a model pretrained on the COCO dataset) and adapt it to recognize our Vector robots. While Roboflow offers a great service to train a model, it does not offer the flexibility to choose your own model. Using your own deployment, either via AWS SageMaker or something that you have deployed yourself, will give you more control over the model and the code that you use to train it.
Chapter 4 discusses how to iteratively improve your ML model in production. The key to success in ML is to quickly identify the weak points in your model and continuously improve the model. We discuss how to accomplish this with a simple example.
Chapter 5 concludes and gives you plenty of extra reading material and references.
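
To give a flavor of what "identifying the weak points" in Chapter 4 can look like, here is a minimal sketch of one common heuristic: flag the images where the detector found nothing, or where its best detection had low confidence, and send those back for labeling and retraining. The function name `select_for_review`, the input format (a dictionary from image name to a list of confidence scores), and the 0.5 threshold are assumptions made for illustration; they are not taken from the course or from Roboflow's API.

```python
from typing import Dict, List


def select_for_review(predictions: Dict[str, List[float]], threshold: float = 0.5) -> List[str]:
    """Return image names whose best detection confidence is below the threshold.

    predictions maps an image name to the confidence scores of the objects
    detected in that image (an empty list means nothing was detected).
    """
    flagged = []
    for image, scores in predictions.items():
        # No detections at all, or only weak ones: a candidate weak point.
        if not scores or max(scores) < threshold:
            flagged.append(image)
    return sorted(flagged)


if __name__ == "__main__":
    # Made-up confidence scores for three camera frames.
    frames = {
        "frame_001.jpg": [0.92],
        "frame_002.jpg": [0.31, 0.44],
        "frame_003.jpg": [],
    }
    print(select_for_review(frames))
```

Images flagged this way are not necessarily errors, but reviewing them first is usually a cheaper way to find dataset gaps than inspecting every frame.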

In summary

Once again, I invite you to enroll at Machine Learning - From Scratch to Production. I promise you will learn a lot in a short span of 60 minutes.


Learn With A Robot
