Search Results for "hadoop"

Installing Hadoop on the new M1 Pro and M1 Max MacBook Pro

We’ll see how to install and configure Hadoop and its components on macOS running on Apple’s new M1 Pro and M1 Max chips.

Installing Hadoop on Windows 11 with WSL2

We’ll see how to install and configure Hadoop and its components on Windows 11 running a Linux distro using WSL 1 or 2.

Understanding Apache Hive LLAP

In this post, I try to explain what LLAP is in Apache Hive and how it can help us reduce query latency.

Installing Zsh and Oh-my-zsh on Windows 11 with WSL2

In this post, which is part of a series on setting up Windows 11 and WSL2 for big data work, I install Zsh and Oh-my-zsh and set up aliases.

Querying Hive Tables From a Spring Boot App

In this post, we’ll see how to connect to a Hive database and run queries on that database from a Spring Boot application.

How To Generate Parquet Files in Java

The Parquet file format has become very popular lately. In this post, we’ll see what it is, and how to create Parquet files in Java using Spring Boot.

Getting Started with Apache Drill and MongoDB

Not a lot of people have heard of Apache Drill. That’s because Drill caters to very specific use cases; it’s very niche. But when used, it can make a significant difference to the way you interact with data. First, let’s see what Apache Drill is, and then how we can connect our MongoDB data source to Drill and easily query data. What is Apache Drill? According to their website, Apache Drill is a "Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage." That’s pretty much self-explanatory. So, Drill is a tool to query Hadoop, MongoDB, and other NoSQL databases. You can write simple SQL queries that run on the data stored in other databases, and you get the result in a row-column format…
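To make that concrete, here’s a minimal sketch of querying a MongoDB collection through Drill over JDBC from Java. The connection URL, the `mongo` storage plugin name, and the `inventory.products` database/collection are placeholders for illustration, and the sketch assumes the Drill JDBC driver is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DrillMongoExample {

    public static void main(String[] args) throws Exception {
        // Load Drill's JDBC driver.
        Class.forName("org.apache.drill.jdbc.Driver");

        // Connect to a Drillbit on this machine; adjust the URL
        // (e.g. jdbc:drill:zk=<zookeeper>) to match your setup.
        try (Connection conn = DriverManager.getConnection("jdbc:drill:drillbit=localhost");
             Statement stmt = conn.createStatement();
             // Drill exposes the MongoDB storage plugin as a schema:
             // <plugin name>.<database>.<collection>
             ResultSet rs = stmt.executeQuery(
                     "SELECT name, price FROM mongo.inventory.products LIMIT 10")) {

            // The MongoDB documents come back as ordinary rows and columns.
            while (rs.next()) {
                System.out.println(rs.getString("name") + " -> " + rs.getDouble("price"));
            }
        }
    }
}
```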

Connect Apache Spark to your HBase database (Spark-HBase Connector)

There will be times when you’ll need the data in your HBase database to be brought into Apache Spark for processing. Usually, you’ll query the database, get the data in whatever format you fancy, and then load that into Spark, maybe using the `parallelize()` function. This works just fine. But depending on the size of the data, this could cause delays. At least it did for our application. So after some research, we stumbled upon a Spark-HBase connector in the Hortonworks repository. Now, what is this connector and why should you be considering it? The Spark-HBase Connector (shc-core) The SHC is a tool provided by Hortonworks to connect your HBase database to Apache Spark so that you can tell your Spark context to pick up the…
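As a rough sketch of the idea, the connector lets Spark scan HBase straight into a DataFrame instead of going through a manual query plus `parallelize()`. The `employees` table, its column families, and the catalog below are made-up examples, and the `shc-core` dependency from the Hortonworks repository is assumed to be on the classpath.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HBaseToSparkExample {

    // SHC catalog: maps a (hypothetical) HBase table and its column
    // families/qualifiers to Spark SQL column names and types.
    private static final String CATALOG = "{"
            + "\"table\":{\"namespace\":\"default\", \"name\":\"employees\"},"
            + "\"rowkey\":\"key\","
            + "\"columns\":{"
            + "  \"id\":{\"cf\":\"rowkey\", \"col\":\"key\", \"type\":\"string\"},"
            + "  \"name\":{\"cf\":\"personal\", \"col\":\"name\", \"type\":\"string\"},"
            + "  \"salary\":{\"cf\":\"work\", \"col\":\"salary\", \"type\":\"double\"}"
            + "}}";

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hbase-to-spark")
                .getOrCreate();

        // The connector reads the HBase table directly into a DataFrame,
        // so there is no intermediate collection to parallelize().
        Dataset<Row> employees = spark.read()
                .option("catalog", CATALOG) // "catalog" is the option key shc-core expects
                .format("org.apache.spark.sql.execution.datasources.hbase")
                .load();

        employees.filter("salary > 50000").show();
    }
}
```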

About Me

Connect with me on: Twitter | LinkedIn | Medium Products Links Links is a simple bookmarking service which allows you to bookmark your favorite websites from your Android device, or from the Chrome browser. The service also lets you organise your bookmarks into various folders so that it’s easy to keep track of your bookmarks. Your bookmarks are synced between your Chrome browser and your Android device. So no matter if you’re on a desktop, a laptop, an Android smartphone, or an Android tablet, your bookmarks are available. You can have a look at the web interface and register, which will let you use the Chrome extension and the Android app. Nothing Pro As the name suggests, this app does absolutely nothing. It just has a label which says, well,…
