Understanding Word N-grams and N-gram Probability in Natural Language Processing Data Science by Sunny Srinidhi - November 26, 2019December 19, 20192 More in The fastText Series. N-gram is probably the easiest concept to understand in the whole machine learning space, I guess. An N-gram means a sequence of N words. So for example, “Medium blog” is a 2-gram (a bigram), “A Medium blog post” is a 4-gram, and “Write on Medium” is a 3-gram (trigram). Well, that wasn’t very interesting or exciting. True, but we still have to look at the probability used with n-grams, which is quite interesting. Why N-gram though? Before we move on to the probability stuff, let’s answer this question first. Why is it that we need to learn n-gram and the related probability? Well, in Natural Language Processing, or NLP for short, n-grams are used for a variety of things.