When you're starting your machine learning journey, you'll come across the null hypothesis and the p-value. At a certain point in...
We have seen methods such as fit(), transform(), and fit_transform() in many of scikit-learn's classes. And almost all tutorials,...
In a very old post, Label Encoder vs. One Hot Encoder in Machine Learning, I demonstrated how...
In the previous post, we tried to understand the basics of Apache Kafka Streams. In this post, we'll build on...
If you work with streams of big data that have to be collected, transformed, and analysed, you surely would...
In the last post, we saw how to query data from S3 using Amazon Athena in the AWS Console. But...
Amazon Athena is defined as "an interactive query service that makes it easy to analyze data directly in Amazon Simple...
If you are in the big data or data science or BI space, you might have heard about Apache Spark....
Not a lot of people have heard of Apache Drill. That is because Drill caters to very specific use cases,...
If you’ve worked with Spark SQL, you might have come across the concept of User Defined Functions (UDFs). As the...