So, how do we make machines learn?
In previous posts we covered why we teach machines and what machine learning is.
If you haven't read them, take a good look before jumping in here!
Machine learning begins with data. Data is the most important ingredient in ML, and we need a lot of it to train a machine learning model. Usually the data is divided into two parts, but Google suggests dividing it into three!
So, why divide the data into three different parts? The first part is used to train the model, the second to validate it while you are still building it, and the last is held back as a final test before production!
- Training data is about 70% of your data, and it is what the model learns from. During training, the learning algorithm adjusts the model so that it can make predictions by itself.
- Validation data, about 25% of the data in this split, is used to check whether the model gives good predictions on data it was not trained on. It tells us about the accuracy of the model and helps us tune it.
- Test data, the remaining portion, is the ultimate test of our ML model before it goes into production. The model never sees it during training or tuning, so a good result here suggests the model is ready for real-world data.
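To make the three-way split concrete, here is a minimal sketch using only Python's standard library. The function name `three_way_split`, the variable names, and the list of 1000 numbers standing in for real examples are all made up for illustration. `dev_rows` is the 25% you check the model against while building it, and `holdout_rows` is the leftover part you keep aside for the final check before production.

```python
import random

def three_way_split(rows, train_frac=0.70, dev_frac=0.25, seed=0):
    """Shuffle rows, then cut them into three non-overlapping parts."""
    if train_frac <= 0 or dev_frac <= 0 or train_frac + dev_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a holdout part")
    shuffled = list(rows)                  # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_dev = round(len(shuffled) * dev_frac)
    train_rows = shuffled[:n_train]
    dev_rows = shuffled[n_train:n_train + n_dev]
    holdout_rows = shuffled[n_train + n_dev:]  # whatever is left over
    return train_rows, dev_rows, holdout_rows

rows = list(range(1000))  # stand-in for 1000 real examples (assumption)
train_rows, dev_rows, holdout_rows = three_way_split(rows)
print(len(train_rows), len(dev_rows), len(holdout_rows))  # 700 250 50
```

Shuffling before cutting matters: if the data is sorted in some way (by date, by price), slicing it unshuffled would give the three parts very different examples.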
Many models perform great on the data they were trained and tuned on but fail on the final held-out data because of overfitting! An overfitted (or over-trained) model has learned its training data too closely, noise and quirks included, instead of the general pattern. That ties the model to that data, so it can't make good predictions on new data!
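Here is a deliberately extreme toy example of that failure, again in plain Python. Everything in it is invented for illustration: the task (deciding whether a number is 50 or more), the `MemorizingModel` that just stores its training examples, and the `ThresholdModel` that learns one simple rule. Real over-trained models are subtler, but the symptom is the same: a great score on the training data and a poor one on data the model has never seen.

```python
class MemorizingModel:
    """Stores every training example; answers False for anything unseen."""
    def fit(self, inputs, labels):
        self.memory = dict(zip(inputs, labels))
        return self

    def predict(self, x):
        return self.memory.get(x, False)

class ThresholdModel:
    """Learns a single cut-off halfway between the two class averages."""
    def fit(self, inputs, labels):
        low = [x for x, y in zip(inputs, labels) if not y]
        high = [x for x, y in zip(inputs, labels) if y]
        self.threshold = (sum(low) / len(low) + sum(high) / len(high)) / 2
        return self

    def predict(self, x):
        return x > self.threshold

def accuracy(model, inputs, labels):
    """Fraction of examples the model gets right."""
    hits = sum(model.predict(x) == y for x, y in zip(inputs, labels))
    return hits / len(inputs)

def is_big(x):
    return x >= 50  # the "true" rule the models should discover

# Train on multiples of 3, then check on numbers the models never saw.
train_inputs = [x for x in range(100) if x % 3 == 0]
new_inputs = [x for x in range(100) if x % 3 != 0]
train_labels = [is_big(x) for x in train_inputs]
new_labels = [is_big(x) for x in new_inputs]

memorizer = MemorizingModel().fit(train_inputs, train_labels)
rule = ThresholdModel().fit(train_inputs, train_labels)

for name, model in [("memorizer", memorizer), ("threshold rule", rule)]:
    print(name,
          "train:", accuracy(model, train_inputs, train_labels),
          "new:", accuracy(model, new_inputs, new_labels))
```

Both models look perfect on the training data. Only the score on unseen data shows that the memorizer learned nothing general, which is exactly why part of the data is held back.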
In data we have two parts: features and labels.
- Features are the input data that we provide to the machine, and on the basis of which it'll predict the outcome.
- The label is the outcome, which your ML model will predict using the features.
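In the usual convention, the features are the input columns and the label is the single column the model should predict. A small sketch of pulling the two apart, assuming each example is a Python dict. The student records, the column names, and the helper `split_features_and_label` are all made up for illustration:

```python
def split_features_and_label(rows, label_name):
    """Separate each row into its input columns and the value to predict."""
    features = [
        {name: value for name, value in row.items() if name != label_name}
        for row in rows
    ]
    labels = [row[label_name] for row in rows]
    return features, labels

# Invented records: we want to predict exam_score from the other two columns.
students = [
    {"hours_studied": 2, "hours_slept": 8, "exam_score": 55},
    {"hours_studied": 5, "hours_slept": 7, "exam_score": 72},
    {"hours_studied": 9, "hours_slept": 6, "exam_score": 90},
]

features, labels = split_features_and_label(students, "exam_score")
print(features[0])  # {'hours_studied': 2, 'hours_slept': 8}
print(labels[0])    # 55
```

During training the model gets both the features and the labels. When it makes a prediction, it gets only the features and has to produce the label itself.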
I have a question for you! What would the label and the features be for an ML model that is used to predict a house price?