AWS Sagemaker: The Definitive Guide

What is AWS Sagemaker

AWS Sagemaker is a machine learning platform that helps machine learning engineers and data scientists explore, build, and deploy machine learning models to production.

Why use AWS Sagemaker

AWS Sagemaker fills the gaps in a typical ML workflow and provides a coherent structure for running ML projects with productivity in mind. Let's talk about a common current workflow and how it changes when using AWS Sagemaker.

If you think about how ML teams are created, they all start small. The team wants to see whether ML adds any extra benefit to the product and how it can improve the overall customer experience.

Basic steps in putting a model into production.

1. Start by aggregating the data needed to solve a business problem and making sure the quality of the data is good.
2. Open up a Jupyter Notebook or RStudio and perform exploratory data analysis.
   - If there are any issues with the data, go back to step 1 and fix the data.
3. Explore a bunch of models and different parameters through grid search, randomized search, or more sophisticated hyperparameter optimization techniques.
4. Finalize a model once you have evaluated it offline and are confident that it is doing the right thing.
   - When exploring different models, keep track of which model and parameters perform best.
   - For analysis, you maintain the features used, statistics about them, the parameters used, and the algorithm in a spreadsheet or some other doc.
5. Store the trained model in S3 or some other object store and expose an HTTP endpoint to make predictions.
6. Run A/B tests and call them in favor of the model because it performed really well.

Hurray! Now you are ready to start iterating on the model, or perhaps by this point you have convinced management that ML is useful and they want to use it in most of the features in your product.
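
The exploration and bookkeeping steps above can be sketched with nothing but the standard library. This is only an illustration under stated assumptions: `toy_evaluate` is a hypothetical stand-in for real offline evaluation, the parameter names are made up, and the CSV text plays the role of the spreadsheet.

```python
import csv
import io
import itertools


def grid(param_grid):
    """Yield every combination in the parameter grid as a dict."""
    keys = sorted(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))


def run_search(param_grid, evaluate):
    """Score every combination and return the rows sorted best-first."""
    rows = [{**params, "score": evaluate(params)} for params in grid(param_grid)]
    return sorted(rows, key=lambda row: row["score"], reverse=True)


def to_csv(rows):
    """Render the tracked runs as CSV text (the 'spreadsheet')."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def toy_evaluate(params):
    """Hypothetical scoring function; a real one would train and validate a model."""
    penalty = abs(params["max_depth"] - 5) * 0.05 + abs(params["learning_rate"] - 0.1)
    return round(1.0 - penalty, 4)


results = run_search(
    {"max_depth": [3, 5, 7], "learning_rate": [0.01, 0.1]},
    toy_evaluate,
)
best = results[0]  # highest-scoring parameter combination
print(to_csv(results))
```

Keeping this record by hand is manageable for one project, but it is exactly the kind of bookkeeping that gets lost as the number of projects grows.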

The steps above work well when there are a small number of projects and a few ML engineers working together. As the team grows, a process that does not adapt becomes frustrating for ML engineers and decreases their productivity.

Depending on what stage your company is in, you might allocate a separate team to build an internal tool to manage the ML workflows or, if you don't have enough resources, stick with the existing process and accept a slower iteration speed. Everything comes with a tradeoff, and the final outcome depends on what you think is good for the team in the long run.

If the above seems daunting, you can at least be happy that you are not alone. That is where AWS Sagemaker shines.

How AWS Sagemaker Supports ML Workflows

Let's look at an ML model lifecycle and see what tools AWS Sagemaker provides.

Exploration

Every project starts with exploring the data we have at hand. That could be through Jupyter Notebooks, RStudio, or any other tool for that matter.

AWS Sagemaker provides managed notebooks that support Python or R environments. At this point you may be wondering what is so different about this. Couldn't we just install a notebook locally on a laptop? Yes, you can, and that works well with smaller datasets and when the algorithms are not CPU heavy.

AWS Sagemaker lets you start the notebook on many different instance types. That means you can start with a smaller dataset and, once the model is ready, change the instance type and rebuild the model with a larger dataset. This is not possible on your laptop: you would need to work with your infra team to get a bigger instance, manage dependencies, and deploy and train the model there. Each of those steps slows you down.
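
As a rough sketch of what changing the instance type can look like programmatically, the function below uses the notebook instance operations of the boto3 SageMaker client. The notebook name and instance type in the usage comment are placeholders, and the function assumes `client` behaves like `boto3.client("sagemaker")`. Treat it as an outline rather than a hardened script, since the update runs asynchronously and real code may need extra status checks.

```python
def resize_notebook_instance(client, name, instance_type):
    """Stop a notebook instance, change its instance type, and start it again.

    `client` is assumed to behave like boto3.client("sagemaker").
    """
    # The instance type can only be changed while the instance is stopped.
    client.stop_notebook_instance(NotebookInstanceName=name)
    stopped = client.get_waiter("notebook_instance_stopped")
    stopped.wait(NotebookInstanceName=name)

    client.update_notebook_instance(
        NotebookInstanceName=name,
        InstanceType=instance_type,
    )
    # Wait for the update to finish and the instance to report "Stopped" again.
    stopped.wait(NotebookInstanceName=name)

    client.start_notebook_instance(NotebookInstanceName=name)
    in_service = client.get_waiter("notebook_instance_in_service")
    in_service.wait(NotebookInstanceName=name)


# Usage (requires boto3 and AWS credentials; both names are placeholders):
#   import boto3
#   resize_notebook_instance(boto3.client("sagemaker"), "my-notebook", "ml.m5.4xlarge")
```

The same change can be made from the console by stopping the notebook and editing its settings.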

Sagemaker also comes with many machine learning frameworks preinstalled, so you don't need to worry about managing those dependencies.

Everything is good so far, but you still want some custom package that you love. There are ways to install those packages. Every notebook has a virtual environment backing it, so you can run

!pip install awesome-package

in a Jupyter cell, or hook into the lifecycle configuration, which lets you install packages at notebook creation time.
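
A lifecycle configuration is essentially a shell script that Sagemaker runs for you. The sketch below builds the request for the boto3 `create_notebook_instance_lifecycle_config` call, which expects the script base64-encoded. The configuration name, the `python3` environment name, and the `source activate` step are assumptions that depend on the notebook image, so adjust them to your setup.

```python
import base64

# Shell script to run when the notebook instance is created.
# The environment name "python3" is an assumption; it varies by notebook image.
ON_CREATE_SCRIPT = """#!/bin/bash
set -e
# Lifecycle scripts run as root, so switch to the notebook user first.
sudo -u ec2-user -i <<'EOF'
source activate python3
pip install awesome-package
EOF
"""


def lifecycle_config_request(config_name, script):
    """Build keyword arguments for create_notebook_instance_lifecycle_config."""
    encoded = base64.b64encode(script.encode("utf-8")).decode("ascii")
    return {
        "NotebookInstanceLifecycleConfigName": config_name,
        "OnCreate": [{"Content": encoded}],
    }


request = lifecycle_config_request("install-awesome-package", ON_CREATE_SCRIPT)

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("sagemaker").create_notebook_instance_lifecycle_config(**request)
```

The API also accepts an `OnStart` script with the same shape, which runs every time the instance starts rather than only at creation.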

Model Exploration

Amazon Prebuilt Models
- In the next stage you are looking for a way to build a model. AWS Sagemaker provides many algorithms that are already built, so all you need to do is provide data and train the model. More on this here: Built in Algorithms
Script Mode
- In this method you write your own training script and pass it to the AWS Sagemaker abstraction, which lets you customize how the model works. This is the best method if you want to stick with existing machine learning libraries. More on this here
Bring your own model
- This gives the most control but also needs the most work.
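
To make script mode more concrete, here is a minimal sketch of what a training script can look like. It assumes the usual script mode conventions: hyperparameters arrive as command-line arguments, and the container exposes the input channel and model output locations through the `SM_CHANNEL_TRAIN` and `SM_MODEL_DIR` environment variables. The "model" (a column mean), the `train.csv` file name, and the `--target-column` hyperparameter are hypothetical stand-ins for a real library and dataset.

```python
import argparse
import csv
import json
import os
import tempfile


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    # Hyperparameters are passed to the script as command-line arguments.
    parser.add_argument("--target-column", default="label")
    # Data and model locations default to the paths the container provides.
    parser.add_argument(
        "--train", default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train")
    )
    parser.add_argument(
        "--model-dir", default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model")
    )
    return parser.parse_args(argv)


def train(train_dir, model_dir, target_column):
    """Toy 'model': the mean of the target column in train.csv."""
    with open(os.path.join(train_dir, "train.csv"), newline="") as f:
        values = [float(row[target_column]) for row in csv.DictReader(f)]
    model = {"mean": sum(values) / len(values)}
    # Anything written to the model directory is what gets saved as the artifact.
    os.makedirs(model_dir, exist_ok=True)
    with open(os.path.join(model_dir, "model.json"), "w") as f:
        json.dump(model, f)
    return model


def main(argv=None):
    args = parse_args(argv)
    return train(args.train, args.model_dir, args.target_column)


# Local dry run: temporary directories stand in for the container paths.
# A real training script would end with `if __name__ == "__main__": main()`.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "train.csv"), "w", newline="") as f:
        f.write("label\n1\n2\n3\n")
    model = main(["--train", tmp, "--model-dir", os.path.join(tmp, "model")])
print(model)
```

Because the script only depends on arguments and environment variables, the same file can be tested locally and then handed to a framework estimator as its entry point.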

Metric and Artifact Management

AWS Sagemaker has a component called Sagemaker Experiments that lets you track metadata for a given training run. The metadata can include:

Parameters
Artifacts, such as statistics about the features
Data inputs and the paths where they are stored
Metrics for every run. These can be AUC, log loss, or any other metric that can be parsed from the training logs with a regex. More on that: Sagemaker Experiments
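
The regex-based metrics deserve a small illustration. Training jobs accept metric definitions as name and regex pairs (the `metric_definitions` argument in the Sagemaker Python SDK), and each regex is applied to the training logs. The sketch below mimics that matching locally; the metric names and the log line format are made up for the example.

```python
import re

# Each definition pairs a metric name with a regex whose first group is the value.
METRIC_DEFINITIONS = [
    {"Name": "validation:auc", "Regex": r"validation-auc=([0-9\.]+)"},
    {"Name": "validation:logloss", "Regex": r"validation-logloss=([0-9\.]+)"},
]


def extract_metrics(log_lines, metric_definitions):
    """Return {metric name: [values]} by applying each regex to each log line."""
    found = {definition["Name"]: [] for definition in metric_definitions}
    for line in log_lines:
        for definition in metric_definitions:
            match = re.search(definition["Regex"], line)
            if match:
                found[definition["Name"]].append(float(match.group(1)))
    return found


# Hypothetical log lines printed by a training script.
logs = [
    "epoch=1 validation-auc=0.81 validation-logloss=0.52",
    "epoch=2 validation-auc=0.86 validation-logloss=0.44",
]
metrics = extract_metrics(logs, METRIC_DEFINITIONS)
print(metrics)
```

Testing the regexes against sample log lines like this before launching a job is a cheap way to avoid runs that finish with no metrics recorded.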

Model Deployment and AutoScaling

Building a model is one part of the puzzle, but for it to be useful it needs to be deployed, made scalable, and managed. As you can guess, AWS Sagemaker provides a simpler way to deploy a model without worrying about how to do it.
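
For the autoscaling part, one option is target tracking through the Application Auto Scaling API. The sketch below only builds the two requests involved, so it can be inspected without touching AWS. The endpoint name, variant name, capacity limits, and target value are placeholders, and the policy name is an arbitrary choice for the example.

```python
def autoscaling_requests(endpoint_name, variant_name, min_instances,
                         max_instances, invocations_per_instance):
    """Build the requests that make an endpoint variant scale on invocations."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    # Request for register_scalable_target: the allowed instance count range.
    target = {**common, "MinCapacity": min_instances, "MaxCapacity": max_instances}
    # Request for put_scaling_policy: keep invocations per instance near the target.
    policy = {
        **common,
        "PolicyName": f"{endpoint_name}-invocations-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }
    return target, policy


def apply_autoscaling(client, target, policy):
    """`client` is assumed to behave like boto3.client("application-autoscaling")."""
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)


# Placeholder values for illustration only.
target, policy = autoscaling_requests("my-endpoint", "AllTraffic", 1, 4, 100)

# Usage (requires boto3 and AWS credentials):
#   import boto3
#   apply_autoscaling(boto3.client("application-autoscaling"), target, policy)
```

Separating request construction from the API calls keeps the scaling settings easy to review and test before they are applied to a live endpoint.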