AWS SageMaker has been an industry leader, offering many services that cover the entire machine learning lifecycle, from data preparation to model deployment.
In this blog, you will learn how to do Modeling with AWS Machine Learning.
This blog covers:
- Machine Learning Life Cycle
- Linear Learner and XGBoost
- K-Nearest Neighbours and Random Cut Forest
- Fundamentals of Hyperparameters
Modeling With AWS Machine Learning: Life Cycle
Once you have identified that machine learning is the right fit for your business problem, your next step is to identify the right machine learning type, which helps you find the right model for Modeling with AWS Machine Learning.
Machine learning has three learning types: Supervised Learning, Unsupervised Learning, and Reinforcement Learning.
Note : Do Read Our Blog Post On Data Engineering With AWS Machine Learning.
1) Supervised Learning
- Supervised learning is similar to a child learning under the guidance of a supervisor or a teacher.
- It uses what we call labeled data, which means the outcome, the correct answer, is already known. It tries to model a relationship between the inputs and the output from the training data so that it can predict outcomes for new test data that you may feed in later.
- Types of supervised learning are:
- Regression
- Linear Regression
- Multiple Regression
- Classification
- Binary Classification
- Multiclass Classification
2) Unsupervised Learning
- Unsupervised learning is similar to a child trying to figure things out all by itself, without any guidance or supervision.
- In this technique, you allow the model to work on its own to discover information.
- Unsupervised learning uses unlabeled data and tries to discover unknown patterns in the data.
- Types of unsupervised learning are:
- Dimensionality Reduction
- Clustering
Do Check : Our Blog On Amazon Comprehend.
3) Reinforcement Learning
- Imagine that every time your kid exhibits good behavior, you reward or incentivize the kid to strengthen or reinforce that specific behavior. Reinforcement learning uses the same strategy, and there is no labeled data.
- The model studies the problem and repeatedly adjusts itself, based on the rewards it receives, in order to improve.
Know More : About aws dms ( Amazon Database Migration Service )
Linear Learner and XGBoost
As you prepare for the machine learning specialty exam, modeling with AWS Machine Learning is very important, and it is essential to have a good understanding of the built-in algorithms offered by Amazon SageMaker.
1) Linear Learner
- Implementation of Linear Learner involves three steps: preprocessing, training, and validation.
- Preprocess: You can perform the normalization step manually, or you can let the algorithm do it for you. If the normalization option is turned on, the algorithm studies a sample of the data, learns each feature's mean and standard deviation, and calibrates each feature to have a mean of zero and unit standard deviation. To get good results, you need to ensure that the data is shuffled properly.
- Training: It uses stochastic gradient descent (SGD) during the training phase, and you can also choose from optimization algorithms such as AdaGrad and Adam. You can also optimize multiple models in parallel, with each one of them having a different objective.
- Validate: The models are trained in parallel, and each model's error is evaluated against the validation set. The optimal model is selected by comparing the candidates on the appropriate metric.
- Linear Learner is a supervised learning algorithm; although the name makes it sound like a regression algorithm, it can be used for both classification and regression.
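The three steps above can be sketched in plain NumPy. This is a simplified stand-in for what Linear Learner does internally, not SageMaker's actual implementation: normalize the features, train a linear model with SGD, then validate on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = 3*x1 - 2*x2 + noise, with features on different scales
X = rng.normal(size=(200, 2)) * [5.0, 0.5]
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Preprocess: calibrate each feature to zero mean and unit standard deviation
mean, std = X.mean(axis=0), X.std(axis=0)
Xn = (X - mean) / std

# Shuffle, then split into training and validation sets
idx = rng.permutation(len(Xn))
train, val = idx[:160], idx[160:]

# Train: plain SGD on squared error
w = np.zeros(2)
b = 0.0
lr = 0.05
for epoch in range(50):
    for i in rng.permutation(train):
        err = w @ Xn[i] + b - y[i]
        w -= lr * err * Xn[i]
        b -= lr * err

# Validate: mean squared error on the held-out rows
val_mse = float(np.mean((Xn[val] @ w + b - y[val]) ** 2))
print(round(val_mse, 3))
```

In practice you would also train several such models in parallel with different learning rates or objectives and keep the one with the best validation metric, which is what Linear Learner's parallel-training step automates.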
Also Read : Our Blog Post On Amazon Rekognition.
2) XGBoost
- XGBoost is an efficient open-source implementation of the Gradient Boosting algorithm.
- It is a supervised learning algorithm that can be used effectively to handle both classification and regression problems.
- XGBoost reads input data in CSV and libsvm formats in both the training and inference phases.
- Amazon recommends using CPUs, not GPUs, for the training phase, as the algorithm is memory-intensive rather than compute-intensive.
- The XGBoost algorithm computes metrics like accuracy, area under the curve, F1 score, mean absolute error, mean average precision, mean squared error, and root mean squared error during the training process.
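XGBoost's core idea, gradient boosting, can be illustrated with a minimal NumPy sketch: each round fits a tiny tree (here a depth-1 "stump") to the residuals of the current ensemble. This illustrates the technique only; it is not the XGBoost library itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D regression problem
x = rng.uniform(0, 10, size=300)
y = np.sin(x) + rng.normal(scale=0.1, size=300)

def fit_stump(x, r):
    """Find the threshold split that minimizes squared error on residuals r."""
    best = None
    for t in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left, right = r[x <= t], r[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]  # threshold, left-leaf value, right-leaf value

# Boosting: start from the mean, then add shrunken stumps fit to residuals
pred = np.full_like(y, y.mean())
lr = 0.3  # shrinkage (eta in XGBoost)
for _ in range(100):
    t, lv, rv = fit_stump(x, y - pred)
    pred += lr * np.where(x <= t, lv, rv)

rmse = float(np.sqrt(np.mean((y - pred) ** 2)))
print(round(rmse, 3))
```

After 100 rounds the training RMSE approaches the noise level of the data, which is why boosting needs regularization (shrinkage, tree limits) to avoid overfitting.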
Also Check : Our Blog Post On “AWS Certified Machine Learning Specialty“.
K-Nearest Neighbours And Random Cut Forest
1) KNN
- K-Nearest Neighbours is also called KNN. Training in KNN runs in three phases.
- Sampling: In sampling, the size of the initial dataset is optimized so that it fits in memory.
- Dimensionality Reduction: In dimensionality reduction, the algorithm tries to remove the noise around the features, using techniques such as random projection, and reduce the footprint of the model in memory.
- Index Building: Index building enables efficient look-up of the distances between sample points and their k nearest neighbors. Three different types of index are provided: a flat index, an inverted index, and an inverted index with product quantization.
- KNN can be used to model both classification and regression problems.
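The prediction step of KNN classification is simple enough to sketch directly in NumPy (a brute-force "flat index" look-up, without the sampling and index optimizations SageMaker adds):

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Classify each query point by majority vote of its k nearest neighbours."""
    preds = []
    for q in X_query:
        dist = np.linalg.norm(X_train - q, axis=1)   # Euclidean distance to every sample
        nearest = y_train[np.argsort(dist)[:k]]      # labels of the k closest points
        preds.append(np.bincount(nearest).argmax())  # majority vote
    return np.array(preds)

# Two well-separated clusters, labeled 0 and 1
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(3, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

print(knn_predict(X, y, np.array([[0.1, 0.2], [2.9, 3.1]]), k=5))
```

For regression, the same neighbour look-up would average the neighbours' target values instead of taking a majority vote.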
2) Random Cut Forest
- The random cut forest is an algorithm for anomaly detection.
- It is an unsupervised learning algorithm. This algorithm looks for outliers or anomalies in the data, such as unexpected spikes, breaks in periodicity, and unclassifiable data points.
- The first step is to fetch a random sample of data and a technique called reservoir sampling is used for this purpose.
- The next step in the training process is to slice the data into a number of equal partitions. Then each partition is sent to an individual tree and the tree recursively organizes its partition into a binary tree.
- The third step is to choose the hyperparameters num_trees and num_samples_per_tree. The recommendation is to begin with 100 trees as a balance between anomaly-score noise and model complexity. Anomaly detection supports both train and test data channels.
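The first step above, reservoir sampling, keeps a uniform random sample of fixed size from a stream without ever holding the whole stream in memory. A minimal sketch of the technique (illustrative, not the Random Cut Forest implementation):

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of k items from an arbitrarily long stream."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)   # fill the reservoir with the first k items
        else:
            j = rng.randint(0, i)    # item i survives with probability k/(i+1)
            if j < k:
                reservoir[j] = item  # evict a random current occupant
    return reservoir

sample = reservoir_sample(range(10_000), k=5)
print(len(sample))
```

The sampled points are then partitioned across the trees of the forest, as described in the second step.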
Note : Do Checkout Our Blog Post On Amazon Lex.
To know more about Amazon SageMaker Built-in Algorithms, click here.
Modeling With AWS Machine Learning: Fundamentals Of Hyperparameters
Firstly, we will understand what a parameter is, and then what a hyperparameter is. Performing hyperparameter optimization in Modeling with AWS Machine Learning requires a basic understanding of hyperparameters.
1) Parameter
- A model parameter is internal to the model. It can be visualized as a configuration variable whose value is estimated, or derived, from the data that we feed in.
- These values are not set manually by the model developer, but they are required by the model when making predictions.
- The estimated values are saved along with the trained model, and the accuracy of these values determines the quality of your model's predictions.
- Examples of model parameters are:
- Weights in an ANN
- Support vectors in SVM
- Coefficients in a linear regression
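For instance, the coefficients of a linear regression are estimated from the data rather than chosen by the developer. A small NumPy least-squares sketch:

```python
import numpy as np

# Data generated from y = 2x + 1 with a little noise
rng = np.random.default_rng(3)
x = rng.uniform(0, 5, 100)
y = 2 * x + 1 + rng.normal(scale=0.05, size=100)

# The model parameters (slope and intercept) are derived from the data,
# not set by hand; they are what gets saved with the trained model.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
print(round(float(slope), 2), round(float(intercept), 2))
```

The fitted slope and intercept recover the values used to generate the data, which is exactly what "estimated from the data" means for a parameter.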
Also Check : Features of AWS Certificate Manager
2) Hyperparameter
- Hyperparameters are external to the model and the values of hyperparameters are set before starting the training process.
- They are independent of the data being trained on, and their values do not change during the training process. Since these values are not part of the final model, they are not saved along with the model.
- Examples of model hyperparameters are:
- K values in KNN
- Learning rate for training a Neural network
- Lambda in lasso regression
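The distinction is easy to see in a toy gradient-descent run: the learning rate below is a hyperparameter, chosen before training starts and never changed by the training loop, while the weight w is a parameter learned from the loss.

```python
def train(lr, steps=100):
    """Minimize f(w) = (w - 3)^2 by gradient descent."""
    w = 0.0                  # model parameter: updated during training
    for _ in range(steps):
        grad = 2 * (w - 3)
        w -= lr * grad       # lr is fixed for the whole run
    return w

# The same model trained with two different hyperparameter choices
print(round(train(lr=0.1), 4))    # converges close to the optimum at 3
print(round(train(lr=0.001), 4))  # step too small: still far from 3
```

Hyperparameter optimization is simply a search over such choices, comparing the resulting trained models on a validation metric.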
Also Check : Our Blog Post On Data Engineering With AWS Machine Learning.
In this blog, you have learned about the Machine Learning Life Cycle, the built-in algorithms offered by SageMaker, and the fundamentals of hyperparameters. These topics are covered in the Modeling with AWS Machine Learning section of the AWS Certified Machine Learning Specialty course.
Do Checkout: Our Blog Post on Deep Learning On AWS , For More Information.
Related References
- AWS Certified Machine Learning Specialty: All You Need To Know
- Introduction To Amazon SageMaker Built-in Algorithms
- Amazon Rekognition | Computer Vision On AWS
- AWS Database Services – Amazon RDS, Aurora, DynamoDB, ElastiCache
- Amazon Kinesis Overview, Features, and Benefits
Next Task For You
If you are also interested and want to know more about the AWS Certified Machine Learning Specialty, then join the Waitlist.