Why you should switch to Signal or Telegram from WhatsApp, Today
Tech | by Sunny Srinidhi | November 23, 2018, updated December 19, 2019
When we think of communicating with someone today, we mostly think of sending them a text message or a voice note on WhatsApp. Others, who are less concerned about their online privacy, think of Facebook Messenger. But not all of these users know what happens to the messages they exchange on these platforms. Let's take a look at that. Before we start, let me admit that I am by no means an expert on online security and privacy. But I have done enough research over the last couple of years to make me switch from Google's Chrome browser and search to Firefox and DuckDuckGo (with a lot of customized preferences on both). I've made a lot of other such switches

Simple Apache Kafka Producer and Consumer using Spring Boot
Tech | by Sunny Srinidhi | November 23, 2018, updated March 2, 2020
Originally published here: https://medium.com/@contactsunny/simple-apache-kafka-producer-and-consumer-using-spring-boot-41be672f4e2b
Before I even start talking about Apache Kafka here, let me answer the question you probably have after reading the title: aren’t there enough posts and guides about this topic already? Yes, there are plenty of reference documents and how-to posts about creating Kafka producers and consumers in a Spring Boot application. Then why am I writing another post about this? Well, in the future, I’ll be talking about some advanced topics in the data science space, and Apache Kafka is one of the most used tools in this space. So it is important to know how to work with Apache Kafka in a real-world application. This is an introductory post to the technology, which we’ll be

Keystroke Dynamics, What Is It?
Tech | by Sunny Srinidhi | November 16, 2018
For decades, we have been using a two-pronged key system for securing our electronic data and services. The two-pronged key we're talking about is the username/password combination. There are variations of this, of course. For example, instead of a username, you might be using your email address, or something called a user ID. But the concept remains the same. The username/password combination for security is over 50 years old. To be more precise, it was first implemented in 1961 at the Massachusetts Institute of Technology (MIT). We have been using this security method for all kinds of data and services online, including but not limited to email, banking, and gaming services. But it's also true that it's been proved a lot many

What is multicollinearity?
Data Science | by Sunny Srinidhi | August 8, 2018, updated January 30, 2020
Multicollinearity is a term we often come across when we’re working with multiple regression models. But do we actually know what it means?

Overfitting and Underfitting models in Machine Learning
Data Science | by Sunny Srinidhi | August 2, 2018
In most of our posts about machine learning, we've talked about overfitting and underfitting. But most of us don't yet know what those two terms mean. What does it actually mean when a model is overfit, or underfit? Why are they considered bad? And how do they affect the accuracy of our model's predictions? These are some of the basic but important questions we need to ask and get answers to. So let's discuss these two today. The datasets we use for training and testing our models play a huge role in the efficiency of our models. It's equally important to understand the data we're working with. The quantity and the quality of the data also matter, obviously. When the data

Different types of Validations in Machine Learning (Cross Validation)
Data Science | by Sunny Srinidhi | August 1, 2018
Now that we know what feature selection is and how to do it, let's move our focus to validating the efficiency of our model. This is known as validation or cross validation, depending on what kind of validation method you're using. But before that, let's try to understand why we need to validate our models.
Validation, or Evaluation of Residuals
Once you are done with fitting your model to your training data, and you've also tested it with your test data, you can't just assume that it's going to work well on data that it has not seen before. In other words, you can't be sure that the model will have the desired accuracy and variance in your production environment. You need

Different methods of feature selection
Data Science | by Sunny Srinidhi | July 31, 2018, updated November 6, 2019
In our previous post, we discussed what feature selection is and why we need it. In this post, we're going to look at the different methods used in feature selection. There are three main classes of feature selection methods: Filter Methods, Wrapper Methods, and Embedded Methods. We'll look at all of them individually.
Filter Methods
Filter methods are learning-algorithm-agnostic, which means they can be employed no matter which learning algorithm you're using. They're generally used as data pre-processors. In filter methods, each individual feature in the dataset is scored on its correlation with the dependent variable. A variety of statistical tests can be used to calculate this correlation score. Based on this score, it will be decided whether to

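The filter-method scoring described in the excerpt above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses the Pearson correlation coefficient as the statistical test, and the names `pearson`, `filter_features`, the `threshold` of 0.5, and the sample data are all hypothetical, not taken from the original post.

```python
from math import sqrt
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as unrelated.
        return 0.0
    return cov / sqrt(var_x * var_y)


def filter_features(features, target, threshold=0.5):
    """Score every feature against the target and keep the strongly correlated ones.

    `threshold` is an arbitrary cut-off chosen for this sketch.
    """
    scores = {name: pearson(column, target) for name, column in features.items()}
    kept = [name for name, score in scores.items() if abs(score) >= threshold]
    return scores, kept


# Hypothetical dataset: "size" drives the target, "noise" mostly does not.
features = {
    "size": [50, 60, 80, 100, 120],
    "noise": [3, 1, 4, 1, 5],
}
price = [150, 180, 240, 300, 360]

scores, kept = filter_features(features, price)
print(scores)
print(kept)
```

Because the scoring never touches a learning algorithm, the same function can run as a pre-processing step in front of any model, which is the property the excerpt calls learning-algorithm-agnostic.
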
What is Feature Selection and why do we need it in Machine Learning?
Data Science | by Sunny Srinidhi | July 31, 2018, updated November 11, 2019
If you've come across a dataset with more than one feature in your machine learning endeavors, you have probably also heard of a concept called Feature Selection. Today, we're going to find out what it is and why we need it. When a dataset has too many features, it would not be ideal to include all of them in our machine learning model. Some features may be irrelevant to the dependent variable. For example, if you are going to predict how much it would cost to crush a car, and the features you're given are: the dimensions of the car; whether the car will be delivered to the crusher or the company has to go pick it up; if the car

Linear Regression in Python using SciKit Learn
Data Science | by Sunny Srinidhi | July 30, 2018
Today we'll be looking at a simple Linear Regression example in Python, and as always, we'll be using the SciKit Learn library. If you haven't yet looked into my posts about data pre-processing, which is required before you can fit a model, check out how you can encode your data to make sure it doesn't contain any text, and then how you can handle missing data in your dataset. After that, you have to make sure all your features are in the same range so that one feature does not dominate the whole output; for this, you need feature scaling. Finally, split your data into training and testing sets. Once you're done with all that, you're ready to start your

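As a rough illustration of the last two steps in the excerpt above (splitting the data, then fitting a linear model), here is a minimal sketch in plain Python. It does not use the SciKit Learn API the post is about; it fits a one-feature ordinary least squares line by hand. The helper names and the sample data are hypothetical, and the split simply holds out the last rows so the result is deterministic, whereas a real split would normally shuffle first.

```python
from statistics import fmean


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when every x value is identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, xs):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


# Hypothetical data: years of experience vs. salary in thousands.
experience = [1, 2, 3, 4, 5, 6, 7, 8]
salary = [35, 40, 45, 50, 55, 60, 65, 70]

# Hold out the last quarter of the rows as the test set (no shuffling here).
split = int(len(experience) * 0.75)
x_train, x_test = experience[:split], experience[split:]
y_train, y_test = salary[:split], salary[split:]

slope, intercept = fit_simple_linear_regression(x_train, y_train)
predictions = predict(slope, intercept, x_test)

# Compare predictions with the held-out values.
for x, predicted, actual in zip(x_test, predictions, y_test):
    print(x, round(predicted, 2), actual)
```
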
Why do we need feature scaling in Machine Learning and how to do it using SciKit Learn?
Data Science | by Sunny Srinidhi | July 27, 2018, updated November 5, 2019
When you're working with a learning model, it is important to scale the features to a range which is centered around zero. This is done so that the variances of the features are in the same range. If a feature's variance is orders of magnitude more than the variance of other features, that particular feature might dominate the other features in the dataset, which is not something we want happening in our model. The aim here is to achieve a Gaussian with zero mean and unit variance. There are many ways of doing this; the two most popular are standardisation and normalisation. No matter which method you choose, the SciKit Learn library provides a class to easily scale our data. We can use the StandardScaler

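To make the standardisation mentioned in the excerpt above concrete, here is a minimal sketch in plain Python of the per-feature calculation: subtract the mean, then divide by the standard deviation. The function name `standardise` and the sample columns are hypothetical. To my understanding, the StandardScaler class performs the same calculation per column using the population standard deviation, but treat that as an assumption and check the library documentation.

```python
from statistics import fmean, pstdev


def standardise(column):
    """Rescale one feature column to zero mean and unit variance."""
    mean = fmean(column)
    std = pstdev(column)  # population standard deviation
    if std == 0:
        # A constant feature has no spread to rescale; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mean) / std for value in column]


# Hypothetical feature columns on very different scales.
ages = [25, 32, 47, 51, 62]
salaries = [30_000, 48_000, 52_000, 75_000, 110_000]

scaled_ages = standardise(ages)
scaled_salaries = standardise(salaries)

# After scaling, both columns have the same centre and spread,
# so neither dominates purely because of its units.
print(fmean(scaled_ages), pstdev(scaled_ages))
print(fmean(scaled_salaries), pstdev(scaled_salaries))
```
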