How to split your dataset to train and test datasets using SciKit Learn
Data Science, by Sunny Srinidhi - July 27, 2018 / November 5, 2019

When you're working on a model and want to train it, you obviously have a dataset. But after training, you have to test the model on a test dataset. For this, you'll need a dataset that is different from the training set you used earlier. It might not always be possible to have that much data during the development phase. In such cases, the obvious solution is to split the dataset you have into two sets, one for training and the other for testing, and to do this before you start training your model. But the question is, how do you split the data? Splitting the dataset into two by hand isn't practical. And you also have to make sure you split