
Installing Hadoop on the new M1 Pro and M1 Max MacBook Pro


We’ll see how to install and configure Hadoop and its components on macOS running on the new M1 Pro and M1 Max chips by Apple.

Installing Hadoop on Windows 11 with WSL2


We’ll see how to install and configure Hadoop and its components on Windows 11 running a Linux distro using WSL 1 or 2.

Understanding Apache Hive LLAP


In this post, I explain what LLAP is in Apache Hive and how it can help reduce query latency.

Installing Zsh and Oh-my-zsh on Windows 11 with WSL2


In this post, part of a series on setting up Windows 11 and WSL2 for big data work, I install Zsh and Oh-my-zsh and set up aliases.

Querying Hive Tables From a Spring Boot App

In this post, we’ll see how to connect to a Hive database and run queries on that database from a Spring Boot application.

How To Generate Parquet Files in Java


The Parquet file format has become very popular lately. In this post, we’ll see what it is, and how to create Parquet files in Java using Spring Boot.

Getting Started with Apache Drill and MongoDB

Not a lot of people have heard of Apache Drill. That is because Drill caters to very specific use cases; it's very niche. But when used, it can make a significant difference in the way you interact with data. First, let's see what Apache Drill is, and then how we can connect our MongoDB data source to Drill and easily query data. What is Apache Drill? According to their website, Apache Drill is a "Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage." That's pretty much self-explanatory. So, Drill is a tool to query Hadoop, MongoDB, and other NoSQL databases. You can write simple SQL queries that run on the data stored in other databases, and you get the result in a row-column format. The
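To make that concrete, a Drill query over a MongoDB collection is plain SQL addressed through Drill's storage plugin namespace; this is a minimal sketch, where `mongo` is the assumed storage plugin name and `blog.users` is a hypothetical database and collection:

```sql
-- Query a MongoDB collection through Drill's mongo storage plugin.
-- "mongo" is the plugin name; "blog" and "users" are a hypothetical
-- database and collection used for illustration.
SELECT name, email
FROM mongo.blog.users
WHERE age > 30
LIMIT 10;
```

Drill infers the schema from the documents at query time, which is why no table definition is needed beforehand.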

Connect Apache Spark to your HBase database (Spark-HBase Connector)


There will be times when you’ll need the data in your HBase database to be brought into Apache Spark for processing. Usually, you’ll query the database, get the data in whatever format you fancy, and then load that into Spark, maybe using the `parallelize()` function. This works just fine. But depending on the size of the data, this could cause delays. At least it did for our application. So after some research, we stumbled upon a Spark-HBase connector in the Hortonworks repository. Now, what is this connector and why should you be considering it? The Spark-HBase Connector (shc-core) The SHC is a tool provided by Hortonworks to connect your HBase database to Apache Spark so that you can tell your Spark context to pick up the

