Tuesday, 7 November 2017

Machine learning through an example.

What is machine learning?

The purpose of this blog post is to explain machine learning as simply as possible, using a simple example. Our aim is to create a system that answers the question: is a given drink wine or beer? The question-answering system we are going to build is called a model, and the model is created through a process called training.

What is Training?

In machine learning, the goal of training is to create an accurate model that answers our question correctly most of the time. To train a model, we first need to collect data. There are many aspects of drinks that we could collect data on. For simplicity, we will collect just two: colour and alcohol percentage. We hope we can separate the two types of drink based on these two factors alone. We call these factors features. The quality and quantity of the data you gather directly determine how good your predictive model can be. At this point we can collect some training data and put it in a table with three columns: colour, alcohol %, and the label (beer or wine).
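As a rough illustration of the three-column table described above, here is a minimal Python sketch. The rows are invented for this example and are not real measurements; the name `training_data` is just a placeholder.

```python
# Each row: (colour, alcohol %, label). All values are made up for illustration.
training_data = [
    ("pale", 4.5, "beer"),
    ("amber", 5.2, "beer"),
    ("red", 13.5, "wine"),
    ("pale", 12.0, "wine"),
]

# Print the rows as a simple three-column table.
print(f"{'colour':<8} {'alcohol %':>9}  beer or wine")
for colour, alcohol, label in training_data:
    print(f"{colour:<8} {alcohol:>9.1f}  {label}")
```

The first two columns are the features, and the last column is the answer we want the model to learn.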

Data preparation.

The next step is data preparation. We load our data into a suitable place and prepare it for use. We can use visualization techniques to check for data imbalance or to find anomalies in the data. For example, if we have collected many more data points for beer than for wine, our model can end up heavily biased towards beer. Make sure the order of the data is random. We also need to split our data into two parts, for example 80-20. We will use the first part for training and the second part for evaluating our model. In this step we may also have to apply other data preparation techniques to clean our data, such as normalization, duplicate detection, outlier detection, and converting text values to their numeric equivalents (some algorithms accept only numeric values).
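A few of these preparation steps can be sketched in plain Python. This is only one possible way to do it: the rows, the `COLOUR_CODES` mapping, and the helper names are all assumptions made up for this example.

```python
import random
from collections import Counter

# Made-up rows: (colour, alcohol %, label).
raw_rows = [
    ("pale", 4.5, "beer"), ("amber", 5.0, "beer"), ("amber", 5.5, "beer"),
    ("pale", 4.8, "beer"), ("dark", 6.0, "beer"),
    ("red", 12.5, "wine"), ("red", 13.0, "wine"), ("pale", 12.0, "wine"),
    ("red", 14.0, "wine"), ("pale", 11.5, "wine"),
]

# Hypothetical mapping from a colour name to its number equivalent.
COLOUR_CODES = {"pale": 0, "amber": 1, "dark": 2, "red": 3}


def encode(rows):
    """Replace the text colour with a number, since some algorithms need numbers."""
    return [(COLOUR_CODES[colour], alcohol, label) for colour, alcohol, label in rows]


def label_counts(rows):
    """Count rows per label so an imbalance between beer and wine is easy to spot."""
    return Counter(label for _, _, label in rows)


def shuffle_and_split(rows, train_fraction=0.8, seed=0):
    """Randomize the order, then cut the rows into a training and an evaluation part."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


rows = encode(raw_rows)
counts = label_counts(rows)
train_rows, test_rows = shuffle_and_split(rows)
print(counts)
print(len(train_rows), "training rows,", len(test_rows), "evaluation rows")
```

The fixed `seed` only makes the shuffle repeatable; any value would do.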

Selecting an appropriate model.

The next step is choosing a model. Researchers have created many models over the years. Some are suited to image data, some to numerical data, and some to text-based data. In our case we have just two features, so we can use a simple linear model.
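To make "simple linear model" a little more concrete, here is a small Python sketch. It assumes colour has already been turned into a number, and the weights and bias are hand-picked for illustration, not learned from data.

```python
def linear_score(weights, bias, colour, alcohol):
    """A linear model with two features: w1 * colour + w2 * alcohol + bias."""
    return weights[0] * colour + weights[1] * alcohol + bias


def predict(weights, bias, colour, alcohol):
    """Points on the positive side of the line are called wine, the rest beer."""
    return "wine" if linear_score(weights, bias, colour, alcohol) > 0 else "beer"


# Hand-picked, hypothetical parameters: ignore colour, call anything above 9 % wine.
weights, bias = (0.0, 1.0), -9.0
print(predict(weights, bias, colour=0.3, alcohol=5.0))
print(predict(weights, bias, colour=0.8, alcohol=13.0))
```

The line where the score equals zero is the boundary between the two drinks. Training is about finding good values for the weights and the bias instead of guessing them.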

Training the model.

Now we can move on to training. In this step we incrementally use our training data to improve the ability of our model to predict whether a given drink is beer or wine. When training starts, the model draws a random line through the data. Then, as training progresses, the line moves step by step closer to the ideal separation between wine and beer. Once training is complete, it is time to evaluate the model. Evaluation allows us to test our model against data that has never been used for training. Once you are done with evaluation, you may find that you can further improve your model. We can do this by tuning some of the parameters that we implicitly assumed during training. One example of such a parameter is the number of iterations.
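The training and evaluation loop described above could look roughly like the following Python sketch. It uses a perceptron-style update rule, which is one simple way to move a line towards a separation and is an assumption of this example, not necessarily what a real library does. The data is made up, and `iterations` and `learning_rate` are the kind of parameters one might tune.

```python
import random

# Made-up rows: (colour on a 0-1 scale, alcohol %, label).
train_rows = [
    (0.20, 4.5, "beer"), (0.35, 5.0, "beer"), (0.30, 5.5, "beer"), (0.40, 6.0, "beer"),
    (0.80, 12.5, "wine"), (0.75, 13.0, "wine"), (0.90, 14.0, "wine"), (0.70, 11.5, "wine"),
]
# Held-out rows that are never used for training.
test_rows = [(0.25, 4.8, "beer"), (0.85, 13.5, "wine")]


def predict(weights, bias, features):
    """Return "wine" if the point is on the positive side of the line, else "beer"."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "wine" if score > 0 else "beer"


def train(rows, iterations=5000, learning_rate=0.1, seed=0):
    """Start from a random line and nudge it after every wrong prediction."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    bias = rng.uniform(-1, 1)
    for _ in range(iterations):
        mistakes = 0
        for colour, alcohol, label in rows:
            features = (colour, alcohol)
            if predict(weights, bias, features) != label:
                # Move the line towards the side the point should be on.
                direction = 1 if label == "wine" else -1
                weights = [w + learning_rate * direction * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * direction
                mistakes += 1
        if mistakes == 0:  # the line already separates the training rows
            break
    return weights, bias


def accuracy(weights, bias, rows):
    """Fraction of rows for which the model gives the right answer."""
    correct = sum(predict(weights, bias, (c, a)) == label for c, a, label in rows)
    return correct / len(rows)


weights, bias = train(train_rows)
print("training accuracy:", accuracy(weights, bias, train_rows))
print("evaluation accuracy:", accuracy(weights, bias, test_rows))
```

Lowering `iterations` to a very small number is an easy way to see what an under-trained line does to the evaluation accuracy.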

Deploy the model.

The final step is to deploy our model. We can finally use our model to predict whether a given drink is beer or wine.
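One simple way to picture deployment is to save the learned parameters and load them wherever predictions are needed. The sketch below uses JSON for that. The parameter values are hypothetical stand-ins for the result of training, and `classify_drink` is a name invented for this example.

```python
import json

# Hypothetical parameters, standing in for whatever training produced.
model = {"weights": [0.0, 1.0], "bias": -9.0}

# Serialize the model; this string could be written to a file or sent to a server.
saved = json.dumps(model)

# Later, possibly on another machine, load the parameters again.
loaded = json.loads(saved)


def classify_drink(params, colour, alcohol):
    """Use the stored linear model to answer: beer or wine?"""
    w_colour, w_alcohol = params["weights"]
    score = w_colour * colour + w_alcohol * alcohol + params["bias"]
    return "wine" if score > 0 else "beer"


print(classify_drink(loaded, colour=0.3, alcohol=5.0))
print(classify_drink(loaded, colour=0.8, alcohol=13.0))
```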

Cheers
Hope this is helpful.

