In machine learning, data is usually split into three sets:

Training Data

  • Training Data is used to train our model.
  • This is the data that our model actually sees (both input and output) and learns from.

Validation Data

  • Validation data is used to evaluate the model frequently while it is being fit on the training data, and to tune its hyperparameters.
  • This data comes into play during training, but the model does not learn its parameters from it directly.
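
As a rough illustration of how validation data can guide hyperparameter choice, here is a minimal sketch. The toy 1-D k-nearest-neighbours classifier, the data points, and the names `knn_predict`, `accuracy` and `best_k` are all assumptions made up for this example, not part of any particular library.

```python
# Hypothetical toy example: pick the hyperparameter k for a 1-D
# k-nearest-neighbours classifier by comparing validation accuracy.

def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(train, data, k):
    """Fraction of examples in `data` whose label is predicted correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in data)
    return correct / len(data)

# (input, output) pairs. The label is 1 for inputs above 5,
# except for one deliberately noisy training point at x = 4.
train_data = [(1, 0), (2, 0), (3, 0), (4, 1), (6, 1), (7, 1), (8, 1), (9, 1)]
val_data = [(2.5, 0), (3.6, 0), (5.5, 1), (6.5, 1), (8.5, 1)]

# The model only ever "learns" from train_data; val_data is used purely
# to compare candidate hyperparameter values against each other.
candidates = [1, 3, 5]
best_k = max(candidates, key=lambda k: accuracy(train_data, val_data, k))
print("validation accuracy per k:",
      {k: accuracy(train_data, val_data, k) for k in candidates})
print("chosen k:", best_k)
```

Because the validation set influences which hyperparameter is kept, its score is no longer a fully independent estimate of performance, which is why a separate testing set is held back.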

Testing Data

  • Once our model is completely trained, testing data provides an unbiased evaluation of how it performs on data it has never seen.
  • We feed the model only the inputs of the testing data, and it predicts values without seeing the actual outputs.
  • After prediction, we evaluate our model by comparing its predictions with the actual outputs present in the testing data.
  • This is how we measure how much our model has learned from the examples fed to it as training data.
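
The three-way split and the final test evaluation described above can be sketched as follows. This is only an illustration under stated assumptions: the 70/15/15 proportions, the synthetic data, the toy threshold "model", and the helper names `split_data`, `fit_threshold` and `evaluate` are all hypothetical choices for this example.

```python
import random

def split_data(examples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the examples and split them into train, validation and test lists."""
    shuffled = examples[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)       # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]           # whatever remains is the test set
    return train, val, test

def fit_threshold(train):
    """'Train' a toy model: the midpoint between the mean input of each class."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def evaluate(threshold, data):
    """Predict from inputs only, then compare with the actual outputs."""
    correct = sum((1 if x > threshold else 0) == y for x, y in data)
    return correct / len(data)

# Synthetic (input, output) pairs: the label is 1 when the input is 50 or more.
examples = [(x, 1 if x >= 50 else 0) for x in range(100)]

train, val, test = split_data(examples)
threshold = fit_threshold(train)            # the model sees inputs and outputs here
test_accuracy = evaluate(threshold, test)   # test outputs are used only for scoring
print(len(train), len(val), len(test), test_accuracy)
```

In this sketch the validation list is returned but left unused, since the toy model has no hyperparameters to tune; in practice it would be consulted during training while the test list stays untouched until the very end.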