In this post, we'll see how we can query tables that reside in Hive using a Spring Boot application. As...
JanusGraph is a graph processing tool that can process graphs stored on clusters with multiple nodes. JanusGraph is designed for...
When you're starting your machine learning journey, you'll come across the null hypothesis and the p-value. At a certain point in...
We have seen methods such as fit(), transform(), and fit_transform() in many of scikit-learn's classes. And almost all tutorials,...
In the previous post, we tried to understand the basics of Apache Kafka Streams. In this post, we'll build on...
If you work with streams of big data that have to be collected, transformed, and analysed, you would surely...
In the last post, we saw how to query data from S3 using Amazon Athena in the AWS Console. But...
Amazon Athena is defined as "an interactive query service that makes it easy to analyze data directly in Amazon Simple...
If you are in the big data, data science, or BI space, you might have heard about Apache Spark....
If you’ve worked with Spark SQL, you might have come across the concept of User Defined Functions (UDFs). As the...