Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
We use machine learning many times a day, often without realizing it.
A feature is an input variable: the x variable in simple linear regression. A simple machine learning project might use a single feature, while a more sophisticated one could use millions of features, written as x1, x2, ..., xn.
In the spam detector example, the features could include the following:
A label is the thing we're predicting: the y variable in simple linear regression. The label could be the future price of wheat, the kind of animal shown in a picture, the meaning of an audio clip, or just about anything.
Here's an example:
housingMedianAge (feature) | totalRooms (feature) | totalBedrooms (feature) | medianHouseValue (label) |
---|---|---|---|
15 | 5612 | 1283 | 66900 |
19 | 7650 | 1901 | 80100 |
17 | 720 | 174 | 85700 |
14 | 1501 | 337 | 73400 |
20 | 1454 | 326 | 65500 |
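To make the split between features and label concrete, here is a minimal sketch that stores the five rows from the table and separates them into the x values and the y value. The variable names (`COLUMNS`, `ROWS`, `features`, `labels`) are illustrative choices, not part of any particular library.

```python
# Rows copied from the table above. All variable names here are illustrative.
COLUMNS = ["housingMedianAge", "totalRooms", "totalBedrooms", "medianHouseValue"]
ROWS = [
    [15, 5612, 1283, 66900],
    [19, 7650, 1901, 80100],
    [17, 720, 174, 85700],
    [14, 1501, 337, 73400],
    [20, 1454, 326, 65500],
]

LABEL = "medianHouseValue"
label_index = COLUMNS.index(LABEL)

# Features: every column except the label (the x values).
features = [[v for i, v in enumerate(row) if i != label_index] for row in ROWS]

# Labels: the value we want to predict for each example (the y values).
labels = [row[label_index] for row in ROWS]

for x, y in zip(features, labels):
    print(f"features={x} label={y}")
```

Each row is one labeled example: a list of three feature values paired with one label.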
A model defines the relationship between features and label. For example, a spam detection model might associate certain features strongly with "spam". Let's highlight two phases of a model's life:
Training means creating or learning the model. That is, you show the model labeled examples and enable the model to gradually learn the relationships between features and label.
Inference means applying the trained model to unlabeled examples. That is, you use the trained model to make useful predictions (y'). For example, during inference, you can predict medianHouseValue for new unlabeled examples.
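The two phases can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses a single feature (housingMedianAge) from the table, fits a straight line with closed-form least squares rather than the gradual, iterative learning most real systems use, and the function names `train` and `predict` as well as the new example's age of 16 are hypothetical. Five rows are far too few for a meaningful model.

```python
def train(xs, ys):
    """Training: learn (slope, intercept) of y = slope * x + intercept
    from labeled examples, using closed-form least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Inference: apply the trained model to an unlabeled example to get y'."""
    slope, intercept = model
    return slope * x + intercept


# Labeled examples from the table: one feature and the label.
housing_median_age = [15, 19, 17, 14, 20]
median_house_value = [66900, 80100, 85700, 73400, 65500]

# Phase 1: training on labeled examples.
model = train(housing_median_age, median_house_value)

# Phase 2: inference on a new, unlabeled example (hypothetical value).
new_age = 16
prediction = predict(model, new_age)
print(f"Predicted medianHouseValue for housingMedianAge={new_age}: {prediction:.2f}")
```

Note that `train` needs both features and labels, while `predict` needs only the feature value; that asymmetry is the difference between the two phases.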