Data Science

ColumnTransformer in SciKit for LabelEncoding and OneHotEncoding in Machine Learning

In a very old post - Label Encoder vs. One Hot Encoder in Machine Learning - I had demonstrated how to use label encoding and one hot encoding to separate out categorical text data into numbers and different columns. But the SciKit library has come a long way since I wrote that post, and it has made life a lot more easier. The developers of the library might have realised that people use LabelEncoding and OneHotEncoding very frequently. So they decided to come up with a new library called the ColumnTransformer, which will basically combine LabelEncoding and OneHotEncoding into just one line of code. And the result is exactly the same. In this post, we'll quickly take a look at how we can do that with some code snippets. The Code First, as usual, we need to import the required li...

Read More