
Mastering NLP: Create Powerful Language Models with Python


Natural Language Processing (NLP) has revolutionized the way we interact with computers, enabling them to understand and interpret natural human language in ways previously thought impossible. Whether it’s virtual assistants, language translation, or speech recognition, NLP is powering the next generation of intelligent applications.

In this article, I will show you how to create powerful language models with Python, and take your NLP skills to the next level. By the end of this tutorial, you will have the knowledge and tools to create your own language models. So, let’s get started and master NLP together!

Let’s start by exploring the concept of a language model and building one with Python.

What is a Language Model?

A language model is a probability distribution over sequences of words.

In simpler terms, it is a model that learns how likely a given sequence of words is, and therefore which word is likely to come next.

Let’s look at an example. Which of these two word sequences is more likely to appear in English?

A. John likes to play

B. Play John likes

The first sequence follows the usual English word order, Subject-Verb-Object (SVO).

The second doesn’t.

The correct answer is A: a good language model assigns sequence A a higher probability than sequence B.
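To make the comparison concrete, here is a minimal sketch of a bigram model that scores both sequences. The three-sentence corpus is hypothetical, and add-one smoothing is an assumption chosen so that unseen word pairs do not get zero probability.

```python
from collections import Counter
import math

# Hypothetical toy corpus; a real model would be trained on far more text.
corpus = [
    "john likes to play",
    "mary likes to read",
    "john likes to read",
]

def tokens(sentence):
    # Lowercase, split on whitespace, and add sentence boundary markers.
    return ["<s>"] + sentence.lower().split() + ["</s>"]

unigram_counts = Counter()
bigram_counts = Counter()
for line in corpus:
    words = tokens(line)
    unigram_counts.update(words[:-1])            # every token that can act as a context
    bigram_counts.update(zip(words, words[1:]))  # adjacent word pairs

vocab = {w for line in corpus for w in tokens(line)}

def log_prob(sentence):
    # Sum of log P(word | previous word) with add-one (Laplace) smoothing.
    words = tokens(sentence)
    total = 0.0
    for prev, word in zip(words, words[1:]):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = unigram_counts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total

score_a = log_prob("John likes to play")
score_b = log_prob("Play John likes")
print(score_a > score_b)  # True: sequence A is the more probable one
```

Even with so little data, the model prefers sequence A because its word pairs were seen in the corpus.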

Language modeling is used in many natural language processing applications, such as machine translation, auto-complete, auto-correct, and speech recognition systems.

Types of Language Models

Rule-based models

Rule-based models are language models that use a set of hand-crafted rules to generate and interpret natural language. These models can be effective for simple tasks but are often limited by their reliance on explicit rules.

Statistical Language Models

Statistical language models estimate the probability of a sequence of words from counts observed in a training corpus.

Examples are N-gram models and Hidden Markov Models (HMMs).

Neural Language Models

Neural language models use neural networks to learn the probability of word sequences from data. These models achieve state-of-the-art results on many language tasks.

Neural language models are often more complex than statistical models, and they require large amounts of training data.

Examples include Recurrent Neural Networks (RNNs). RNNs read a sentence one word at a time and carry context forward in a hidden state. Plain RNNs struggle with long-term dependencies between words; gated variants such as LSTM and GRU were designed to handle them better.

Transformer Models: Transformer models use self-attention mechanisms to process sequential data.
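As a rough illustration of self-attention, the sketch below computes scaled dot-product attention in plain Python. It is a simplification: the queries, keys, and values are the input vectors themselves, whereas a real transformer first applies learned projection matrices. The two-dimensional embeddings are made up for the example.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    # Simplification: queries, keys and values are the input vectors themselves.
    d = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this token to every token, scaled by sqrt(dimension).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in vectors]
        weights = softmax(scores)
        # Each output is a weighted average of all token vectors.
        mixed = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(d)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional embeddings for a three-token sentence.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(embeddings)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights sums to 1 and shows how much one token attends to every token in the sequence, including itself.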

Examples of transformer models are BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pretrained Transformer).

Hybrid Models

Hybrid language models combine multiple approaches, such as rule-based, statistical, and neural models.

Knowledge-based Models

Knowledge-based models use structured data, such as ontologies and semantic networks, to analyze and generate natural language. These models are effective for tasks that require a deep understanding of language semantics.

Let’s jump right into it with a few examples using Python.

Unlocking the Power of Language: Building an N-Gram Language Model with Python

What are N-grams?

N-grams refer to a series or sequence of N consecutive tokens or words.

There are several types of N-grams based on the number of tokens or words in the sequence:

  1. Unigrams: These are N-grams with a single token or word.
  2. Bigrams: These are N-grams with two tokens or words.
  3. Trigrams: These are N-grams with three tokens or words.
  4. 4-grams (Quadgrams): These are N-grams with four tokens or words.
  5. 5-grams (Pentagrams): These are N-grams with five tokens or words.
  6. N-grams with higher values of N, such as 6-grams (Hexagrams), 7-grams (Heptagrams), and so on.

The choice of N in N-grams depends on the application and the complexity of the language. For example, bigrams and trigrams are commonly used in language modeling tasks, while higher-order N-grams may be used for more complex language analysis.

For example, consider the following sentence:

"The big brown fox jumped over the fence"

  1. Unigrams: "The", "big", "brown", "fox", "jumped", "over", "the", "fence"
  2. Bigrams: "The big", "big brown", "brown fox", "fox jumped", "jumped over", "over the", "the fence"
  3. Trigrams: "The big brown", "big brown fox", "brown fox jumped", "fox jumped over", "jumped over the", "over the fence"
  4. 4-grams (Quadgrams): "The big brown fox", "big brown fox jumped", "brown fox jumped over", "fox jumped over the", "jumped over the fence"
  5. 5-grams (Pentagrams): "The big brown fox jumped", "big brown fox jumped over", "brown fox jumped over the", "fox jumped over the fence"
  6. 6-grams (Hexagrams): "The big brown fox jumped over", "big brown fox jumped over the", "brown fox jumped over the fence"
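The lists above can be reproduced with a few lines of plain Python. This is a minimal sketch; make_ngrams is a name chosen for illustration, and it splits on whitespace only.

```python
def make_ngrams(text, n):
    # Split on whitespace and slide a window of n tokens across the sentence.
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

sentence = "The big brown fox jumped over the fence"
bigrams = make_ngrams(sentence, 2)
trigrams = make_ngrams(sentence, 3)
print(bigrams)
print(trigrams)
```

A sentence of 8 words yields 8 - N + 1 N-grams, so 7 bigrams and 6 trigrams here.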

Example: Predict the next word


To predict the next word in a sentence, we can use a trigram model (N=3).

This model evaluates the likelihood of every potential next word based on the two previous words. This is achieved by calculating the frequency of each trigram in a training corpus and subsequently estimating the probability of each trigram.
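Before turning to NLTK, here is a small sketch of that estimate on a made-up corpus. The probability of a word given the two previous words is taken as the count of the trigram divided by the count of all trigrams that share the same two-word context (a maximum-likelihood estimate, with no smoothing).

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for a real training corpus.
text = "we are happy . we are here . we are happy today . they are here ."
words = text.split()

trigram_counts = Counter(zip(words, words[1:], words[2:]))

# Group the counts by their two-word context.
followers = defaultdict(Counter)
for (w1, w2, w3), count in trigram_counts.items():
    followers[(w1, w2)][w3] += count

def next_word_probs(w1, w2):
    # Maximum-likelihood estimate: count(w1 w2 w3) / count(w1 w2 *).
    counts = followers[(w1, w2)]
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}

probs = next_word_probs("we", "are")
print(probs)
```

In this corpus "we are" is followed by "happy" twice and "here" once, so the estimates are 2/3 and 1/3.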

Now that we understand what N-grams are, let’s move on to implementing N-gram models with Python.

Install NLTK using pip

pip install nltk

We will be using the Reuters corpus, which is a collection of news documents.

Download the necessary data and build the trigram counts:

import nltk

from nltk.corpus import reuters
from nltk import ngrams, FreqDist

# Download the Reuters corpus (only needed the first time)
nltk.download('reuters')

# Load the Reuters corpus as a sequence of words
corpus = reuters.words()

# Slide a window of n words over the corpus to produce trigrams
n = 3
trigrams = ngrams(corpus, n)

# Count the frequency of each trigram
fdist = FreqDist(trigrams)

To begin, we load the Reuters corpus using the reuters.words() function, which returns the words in the corpus.

Next, we use the ngrams() function to create trigrams from the corpus. The function accepts two arguments: the sequence of words and N (in this case, 3 for trigrams).

Finally, we count the frequency of each trigram using the FreqDist() function.

With the frequency distribution of the trigrams, we can calculate probabilities and make predictions.

# Define the context (the two previous words) we want to continue
context = ('we', 'are')

# Collect every word that follows the context, most frequent first
next_words = [trigram[2] for trigram, count in fdist.most_common() if trigram[:2] == context]

# Print the five most likely next words
print(next_words[:5])

Building a Neural Language Model (RNN) with the Keras library in Python

Let’s train a recurrent neural network (RNN) language model using the Keras library in Python to predict the next word in a sentence.

First, we need to prepare the training data. We will use a text corpus to train our language model. For this example, let’s use a small text corpus consisting of four sentences.

# Import the necessary libraries
from keras.preprocessing.text import Tokenizer
from keras.utils import to_categorical, pad_sequences
import numpy as np

# Define the text corpus
text_corpus = [
    'She drinks coffee every morning.',
    'She drinks tea in the afternoon.',
    'She drinks water all day long.',
    'She drinks wine in the evening.'
]

# Build the vocabulary, then convert each sentence to a sequence of word indices
tokenizer = Tokenizer()
tokenizer.fit_on_texts(text_corpus)
sequences = tokenizer.texts_to_sequences(text_corpus)

# Pad the sequences to a fixed length
max_sequence_length = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_sequence_length, padding='pre')

# Inputs are all words but the last; the target is the last word of each sentence
X = padded_sequences[:, :-1]
y = to_categorical(padded_sequences[:, -1], num_classes=len(tokenizer.word_index)+1)
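To see what the preparation step produces, here is a dependency-free sketch of word indexing and pre-padding on two of the sentences. It is a stand-in for illustration, not the Keras implementation: the indices here follow order of first appearance, while the Keras Tokenizer orders words by frequency.

```python
import re

sentences = [
    "She drinks coffee every morning.",
    "She drinks tea in the afternoon.",
]

def words_of(sentence):
    # Lowercase and keep alphabetic runs only, dropping punctuation.
    return re.findall(r"[a-z]+", sentence.lower())

# Give each distinct word an integer index starting at 1 (0 is reserved for padding).
word_index = {}
for sentence in sentences:
    for word in words_of(sentence):
        word_index.setdefault(word, len(word_index) + 1)

sequences = [[word_index[w] for w in words_of(s)] for s in sentences]
max_len = max(len(seq) for seq in sequences)

# Pre-padding: add zeros on the left so every sequence has max_len entries.
padded = [[0] * (max_len - len(seq)) + seq for seq in sequences]

# Inputs are all tokens but the last; the target is the last token.
X = [row[:-1] for row in padded]
y = [row[-1] for row in padded]
print(padded)  # [[0, 1, 2, 3, 4, 5], [1, 2, 6, 7, 8, 9]]
print(X, y)
```

The shorter sentence gets one leading zero, and each target is the index of the final word ("morning" and "afternoon").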

The next step is to define our recurrent neural network language model. We will use an LSTM (Long Short-Term Memory) layer to learn the temporal dependencies between words.

# Import the necessary libraries
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

# Define the model: embedding -> LSTM -> softmax over the vocabulary
model = Sequential()
model.add(Embedding(input_dim=len(tokenizer.word_index)+1, output_dim=50, input_length=max_sequence_length-1))
model.add(LSTM(100))  # 100 units is an illustrative choice
model.add(Dense(len(tokenizer.word_index)+1, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model
model.fit(X, y, epochs=50, verbose=1)

Finally, we can use the trained model to predict the next word in the sentence "she drinks coffee in the __".

We will first encode the sentence as a sequence of word indices using the tokenizer, and then use the model to predict the next word in the sequence.

# Encode the input sentence as a sequence of word indices
input_sentence = 'she drinks coffee in the'
input_sequence = tokenizer.texts_to_sequences([input_sentence])[0]

# Pad the input sequence to the length the model expects
input_sequence = pad_sequences([input_sequence], maxlen=max_sequence_length-1, padding='pre')

# Use the model to predict the index of the next word
predicted_index = int(np.argmax(model.predict(input_sequence, verbose=0), axis=-1)[0])

# Map the index back to a word (index 0 is reserved for padding)
predicted_word = tokenizer.index_word.get(predicted_index, '<pad>')

# Print the predicted word
print(predicted_word)

Note: this example uses a very small text dataset. Try replicating it with a larger dataset, and split the data into training and validation sets.

Transformer Model

Transformer models are currently eating the world with the widespread adoption of GPT (Generative Pretrained Transformer) and BERT models.


For this example, we will use transfer learning to build a transformer model that predicts the next word in a sentence.

The first step is to prepare the dataset:

Pre-trained language models like GPT-2 or GPT-3 are large-scale language models trained on massive amounts of text data.

To fine-tune one, we’ll need a dataset of text inputs and corresponding labels for the “next word” suggestions. You can use any text corpus, such as the Gutenberg Corpus or Wikipedia. You can then preprocess the dataset by tokenizing the text and splitting it into training and validation sets.

Load a pre-trained transformer model:

You can use a pre-trained transformer model, such as GPT-2 or BERT (Bidirectional Encoder Representations from Transformers), as the base model for the project. GPT-2 is trained to predict the next token, so it fits this task directly; BERT is trained to fill in masked words, so it needs more adaptation. You can load the pre-trained model using the Hugging Face Transformers library and extract the necessary layers for your task.

Define the model:

You can define your model as a sequence of layers, starting with an input layer, followed by a transformer layer, and ending with an output layer. You can use the pre-trained transformer layer as the main layer of your model and add additional layers for fine-tuning.

Compile the model:

You can compile the model with a suitable loss function and optimizer. For example, you can use the categorical cross-entropy loss function and the Adam optimizer.

Train the model:

You can train the model on your training data and validate it on your validation data. You can use techniques such as early stopping and learning rate scheduling to optimize the training process.

Test and evaluate the model:

You can evaluate the model on a held-out test set and measure its performance using metrics such as next-word accuracy or perplexity.
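For language models specifically, perplexity is a commonly reported metric: it is the exponential of the average negative log-probability the model assigns to the correct next tokens, and lower is better. The sketch below computes it from a list of probabilities; the two lists are made-up numbers, not results from a real model.

```python
import math

def perplexity(token_probs):
    # token_probs: probability the model assigned to each correct next token.
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)

# Hypothetical probabilities from two imaginary models on the same four tokens.
confident_model = [0.5, 0.4, 0.6, 0.5]
uncertain_model = [0.1, 0.05, 0.2, 0.1]

print(perplexity(confident_model))
print(perplexity(uncertain_model))
```

A model that assigns probability 0.25 to every correct token has a perplexity of 4, which can be read as being as uncertain as a uniform choice among four words.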

Deploy the model:

Once we are satisfied with the performance of the model, we can deploy it as a web service or integrate it into an existing application. We can use frameworks like Flask or Django to build a RESTful API that exposes the functionality. We can also use tools like TensorFlow Serving or TorchServe to deploy the model in a scalable and efficient way.

This is a simple example that uses only the pre-trained GPT-2 transformer model, without fine-tuning:

from transformers import TFAutoModelForCausalLM, AutoTokenizer

# Load the pre-trained model and tokenizer
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForCausalLM.from_pretrained(model_name)

# Encode the prompt as token IDs
input_text = "She loves to drink"
input_ids = tokenizer.encode(input_text, return_tensors='tf')

# Generate new text using the language model
# (max_length=20 is an illustrative value; tune the generation settings as needed)
output_ids = model.generate(input_ids, max_length=20)

# Decode the output text and print it
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(output_text)

Further Reading on Natural Language Processing

  1. Get started with Natural Language Processing
  2. Morphological segmentation
  3. Word segmentation
  4. Parsing
  5. Parts of speech tagging
  6. Sentence breaking
  7. Named entity recognition (NER)
  8. Natural language generation
  9. Word sense disambiguation
  10. Deep Learning (Recurrent Neural Networks)
  11. WordNet
  12. Language Modeling

Interested in learning how to build for production? Check out my publication TreapAI.com

Connect to MongoDB from Python


Connect to MongoDB from Python 3

Follow these steps to connect to your MongoDB database from your Python code.

1. Install PyMongo

PyMongo supports CPython 3.7+ and PyPy3.7+.

The official PyMongo documentation recommends installing it with pip:

$ python3 -m pip install pymongo
Collecting pymongo
  Downloading pymongo-4.3.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (501 kB)
     |████████████████████████████████| 501 kB 346 kB/s
Requirement already satisfied: dnspython<3.0.0,>=1.16.0 in /home/jennifer/anaconda3/lib/python3.8/site-packages (from pymongo) (2.2.1)
Installing collected packages: pymongo
Successfully installed pymongo-4.3.3

If you already have PyMongo installed and want to upgrade, run the line below:

$ python3 -m pip install --upgrade pymongo

2. Import MongoClient and create a connection

from pymongo import MongoClient

client = MongoClient()

You can specify the host address and port:

from pymongo import MongoClient
host = 'localhost'
port = 27017
client = MongoClient(host, port)

You can also use the MongoDB URI format:

from pymongo import MongoClient
URI = 'mongodb://localhost:27017'
client = MongoClient(URI)

MongoDB URI with username and password


If the username or password contains special characters, escape them according to RFC 3986 using urllib.parse.quote_plus:

from pymongo import MongoClient
from urllib.parse import quote_plus as urlquote

user = 'mongouser'
password = 'mongo@strong!pasword'
host = 'localhost'
port = 27017
URI = f'mongodb://{urlquote(user)}:{urlquote(password)}@{host}:{port}'
client = MongoClient(URI)
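To see what the escaping does, the sketch below builds the same URI with only the standard library and parses it back. It does not connect to a database; urlsplit is used here just to show that the escaped URI has an unambiguous host part.

```python
from urllib.parse import quote_plus, unquote_plus, urlsplit

user = "mongouser"
password = "mongo@strong!pasword"  # same sample password as above

escaped_password = quote_plus(password)
uri = f"mongodb://{quote_plus(user)}:{escaped_password}@localhost:27017"
print(uri)  # mongodb://mongouser:mongo%40strong%21pasword@localhost:27017

# After escaping, the only literal "@" separates the credentials from the host.
parts = urlsplit(uri)
print(parts.hostname, parts.port)
print(unquote_plus(parts.password))
```

The "@" and "!" in the password become %40 and %21, so they can no longer be mistaken for URI delimiters.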

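3. Close Connection

Call client.close() when you are finished with the client to release its connections and other resources.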

Keras vs TensorFlow vs PyTorch | Which is Better or Easier?


Keras, TensorFlow, and PyTorch are some of the most popular machine learning and deep learning frameworks being used by professionals and newbies alike.

Deep learning is a subset of machine learning that uses neural networks to train models on large datasets. This article compares three popular deep learning frameworks: Keras, TensorFlow, and PyTorch. It covers the key differences between them to help you decide which is best for you.

What is TensorFlow?


TensorFlow is an open-source software library for machine learning research and development. It provides a set of tools for numerical computation using data flow graphs.

TensorFlow was originally developed by the Google Brain team and released as open-source software in November 2015.

It is used for machine learning applications, including speech recognition, image recognition, predictive analytics, natural language processing, and other more specialized tasks. See TensorFlow Lite for Android.

TensorFlow was originally developed to support the development of machine learning models, but the scope of TensorFlow has since been expanded to include other types of modeling and data processing.

The core TensorFlow framework provides APIs for expressing parallel computations, training models and executing them on both CPUs and GPUs.

What is Keras?

Keras is an open-source, high-level neural network API written in Python. It was developed to allow for fast experimentation.

Going from idea to result with the least possible delay allows for faster iteration during the development process, which leads to better models.

Early versions of Keras could run on top of TensorFlow, Theano, or CNTK. Keras 3 supports TensorFlow, JAX, and PyTorch as backends. Because the same Keras code runs on any supported backend, developers can experiment with different deep learning frameworks without rewriting their code.

Keras has the following key features:

  • Support for convolutional neural networks (CNN) for computer vision applications, recurrent neural networks (RNN) for sequence processing applications, and any combination.
  • High level of customizability through user-defined callbacks and hooks.
  • Support for arbitrary network architectures: multi-input or multi-output models are easily expressed in just a few lines of code.

Keras has two main components:

The first is a high-level API to build and train deep learning models. This API makes it easy to quickly prototype new ideas without getting bogged down in the details of building neural networks. The second is a set of pre-built models that can be used for common tasks such as classification, regression, clustering, and more.

What is PyTorch?


PyTorch is a deep learning framework that provides GPU acceleration and support for both Python and C++. It is one of the most popular frameworks in the deep learning space, with an active community of developers.

PyTorch was developed by Meta’s artificial intelligence research group, which also created Caffe2, a machine learning framework.

The first public release of PyTorch came in 2016.

Keras vs TensorFlow vs PyTorch what is the difference?

These three frameworks have a lot in common, although they are all slightly different.

Comparison (TensorFlow | Keras | PyTorch):

1. Architecture: (not given) | simpler and more readable architecture | complex architecture
2. API: provides both high- and low-level APIs | high level | low level
3. Speed: fast | comparatively slower | super fast
4. Backend: no backend needed | backend support includes TensorFlow, Theano, CNTK | no backend needed
5. Dataset: can crunch large datasets with high performance | suitable for small datasets | can crunch large datasets with high performance
TensorFlow vs Keras vs PyTorch Which is Easier for Beginners?

These frameworks have different learning curves. Because of Keras’s simplicity, it’s easier to understand.

Keras vs TensorFlow vs PyTorch Which is Better?

Keras is perfect for programming quick prototypes and things that need to be created without a lot of data.

PyTorch is most suitable for building large models with big data and high performance.

Keras vs TensorFlow vs PyTorch Which is Faster?

PyTorch is comparatively faster than Keras.

PyTorch vs Keras vs TensorFlow Which is more popular?

According to Quora and Kaggle, Keras seems to be the most popular deep learning framework among data scientists for its simplicity, while PyTorch is favored by academic and industrial research teams for its flexibility.

NLTK WordNet: Synonyms, Antonyms, Hypernyms [Python Examples]


What is WordNet?

WordNet is a large lexical database of English words. It provides synonyms, definitions, and other useful relations between words.

WordNet groups English words into sets of synonyms called synsets. It can be seen as a semantic database used to find synonyms, antonyms, part-of-speech information, and related words in English.

WordNet has been used extensively in computational linguistics, cognitive psychology, and machine translation research.

What are Synonyms in NLP?

Synonyms are words that have the same or similar meaning. They may suit different contexts or carry different shades of meaning, but they share an underlying sense. Synonyms are a great way to find new words if you don’t know how to express what you want to say.

Synonymous words are not always interchangeable, even though they share part of their meaning.

For example:

The word “regretful” is listed as a synonym of “bad”, but it can’t replace “bad” in the sentence below.

“A bad motor caused the ship to sink”

WordNet Python Examples using NLTK: Find Synonyms and Antonyms from NLTK

The purpose of this tutorial is to show you how you can use NLTK WordNet in Python to find synonyms and antonyms.

NLTK is a natural language processing library that includes an interface to the WordNet database. In this tutorial, we will write simple Python code that finds synonyms for any given word using NLTK WordNet.

Import WordNet

from nltk.corpus import wordnet

A Synset is a WordNet synonym set. It is a collection of words that have the same meaning, and are grouped together in a sense as they are related to each other.

Synset python example:

from nltk.corpus import wordnet
synset = wordnet.synsets("cunning")
print(synset)



From the results, we can see that the word “cunning” is synonymous with “craft”, “crafty” and “clever”.

To see more words in the synset, we can expand and loop through the synset lists for more words.

for syns in synset:
    for word in syns.lemmas():
        print(word.name())



What are Antonyms in NLP?

Antonyms are words that have opposite or nearly opposite meanings. For example, “close” is an antonym of “open”.

A word and its antonym are often called an antonym pair.

Some examples of antonyms are:

  • Hot and cold
  • Loud and quiet
  • Up and down
  • In and out
  • Night and day

How to find the Antonym of a word

synset = wordnet.synsets("fear")
for syns in synset:
    for word in syns.lemmas():
        if word.antonyms():
            print(word.antonyms()[0].name())



We can write a Python function that outputs both the synonyms and antonyms of the words in a given sentence:

from nltk.corpus import wordnet
from nltk.tokenize import word_tokenize

def synonym_antonyms_parser(sentence):
    # Split the sentence into word tokens
    tokens = word_tokenize(sentence)
    for token in tokens:
        for syns in wordnet.synsets(token):
            for word in syns.lemmas():
                synonym = word.name()
                # Only print lemmas that have at least one antonym
                if word.antonyms():
                    antonym = word.antonyms()[0].name()
                    print("Word: {} Synonym: {} Antonym: {}".format(token, synonym, antonym))

sentence = "good is bad"
synonym_antonyms_parser(sentence)



There are many words in the English language that do not have antonyms. Proper Nouns, Material Nouns, Proper Adjectives, Adjectives of Number are some categories that can never have antonyms. Words like “clothe” or “American” can’t be opposed by an antonym.

What are Hypernyms in NLP?

A hypernym is a word with a broader meaning that encompasses more specific words, which are called its hyponyms. For example, “animal” is a hypernym of “dog” and “cat”, and “dog” and “cat” are hyponyms of “animal”.

Another example of a hypernym is the word “book”, which covers more specific words like “novel”, “textbook”, and “dictionary”.

print ("Hypernym for 'cat': ", wordnet.synset('cat.n.01').hypernyms()) 

From the output, we can see that “cat” is a hyponym of “feline”. Feline is a family of mammals in the order Carnivora, informally referred to as cats.

print ("Hypernym for 'fish': ", wordnet.synset('fish.n.01').hypernyms()) 

What is the hypernym of fish? From the output, we can see that the hypernym of fish is “aquatic vertebrate”.

Hypernymy and hyponymy are useful in NLP for analyzing content so that a question answering system can return the most relevant response.

What are Holonyms in NLP?

A holonym is a word that names the whole of which another word names a part.

For example, “tree” is a holonym of “trunk” and “branch”, and “car” is a holonym of “wheel” and “engine”. Holonymy is the inverse of meronymy.

What are Meronyms in NLP?

A meronym is a word that names a part of something larger.

For example, “wheels”, “doors”, and “engine” are meronyms of “car”.

In the phrase “the trunk of a tree”, the word “trunk” is a meronym of “tree”.

Other noun meronyms include arm, leg, and wing, which name parts of a body.

car = wordnet.synset('car.n.01')
print(car.part_meronyms())


Is WordNet a taxonomy?

Yes, WordNet can be considered a taxonomy or a type of lexical database that organizes words and their meanings into a network of interrelated concepts.

Words are organized into synsets (sets of synonyms) based on their meanings and relationships. Each synset represents a distinct concept or idea, and is linked to other synsets through semantic relationships such as hyponymy and hypernymy (the two directions of is-a), meronymy (part-of), and holonymy (has-part).

This hierarchical structure of synsets and their relationships can be seen as a taxonomy of concepts and their interrelationships. For example, the synset for “dog” is a hyponym of the synset for “animal”, which is a hyponym of the synset for “living thing”, and so on.
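The idea of a taxonomy can be sketched with a small hand-written dictionary of is-a links. This is a toy stand-in for illustration, not the WordNet data or the NLTK API; the three links below are the ones named in the example above.

```python
# Hypothetical hand-written hypernym links; WordNet stores many thousands of these.
hypernym_of = {
    "dog": "animal",
    "cat": "animal",
    "animal": "living thing",
}

def hypernym_chain(word):
    # Follow is-a links upward until a word has no recorded hypernym.
    chain = [word]
    while chain[-1] in hypernym_of:
        chain.append(hypernym_of[chain[-1]])
    return chain

def is_a(word, ancestor):
    # True when ancestor appears somewhere above word in the hierarchy.
    return ancestor in hypernym_chain(word)[1:]

print(hypernym_chain("dog"))        # ['dog', 'animal', 'living thing']
print(is_a("cat", "living thing"))  # True
```

Walking the chain upward is how a system can conclude that a cat is a living thing even though that fact is never stored directly.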

In addition to its use as a taxonomy, WordNet is also widely used for natural language processing tasks such as word sense disambiguation, information retrieval, and text classification.


Neural Style Transfer: Create Mardi Gras Art with Python and TF Hub


Neural style transfer is an algorithm that takes two images, a content image and a style reference image, and blends them to produce an image that keeps the content of the first while taking on the style of the second.

In this tutorial, we will use a pre-trained deep learning model (a Convolutional Neural Network) to create an image in the style of another one.

TensorFlow Hub is a centralized repository of pre-trained machine learning models. It provides a one-stop shop for developers who want to use machine learning in their applications without building and training their models from scratch.

Mardi Gras is a celebration of the last day before Lent. It’s a festival of food, music, and dance.

It is one of the most famous festivals in the world and it takes place in New Orleans.

We will use these Mardi Gras and painting stock photos for the content and reference images.

Style Transfer Environment Setup with TensorFlow on Colab

We will use the Google Colab environment to run our deep learning model.

Code reproduced from Tensorflow.org

Create a new notebook


import os
import tensorflow as tf
# Load compressed models from tensorflow_hub
os.environ['TFHUB_MODEL_LOAD_FORMAT'] = 'COMPRESSED'

import numpy as np
import IPython.display as display

import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (12, 12)
mpl.rcParams['axes.grid'] = False

import PIL.Image
import time
import functools

Create a function to convert a tensor into an image

def tensor_to_image(tensor):
  tensor = tensor*255
  tensor = np.array(tensor, dtype=np.uint8)
  if np.ndim(tensor)>3:
    assert tensor.shape[0] == 1
    tensor = tensor[0]
  return PIL.Image.fromarray(tensor)

Import the Mardi Gras and Painting Images to Colab

content_path_1 = 'content.jpg'
content_path_2 = 'content2.jpg'
content_path_3 = 'content3.jpg'
style_path_1 = 'reference.jpg'
style_path_2 = 'reference_style.jpg'
style_path_3 = 'reference_style3.jpg'

def load_img(path_to_img):
  max_dim = 512
  img = tf.io.read_file(path_to_img)
  img = tf.image.decode_image(img, channels=3)
  img = tf.image.convert_image_dtype(img, tf.float32)

  shape = tf.cast(tf.shape(img)[:-1], tf.float32)
  long_dim = max(shape)
  scale = max_dim / long_dim

  new_shape = tf.cast(shape * scale, tf.int32)

  img = tf.image.resize(img, new_shape)
  img = img[tf.newaxis, :]
  return img

def imshow(image, title=None):
  # Drop the batch dimension before plotting
  if len(image.shape) > 3:
    image = tf.squeeze(image, axis=0)

  plt.imshow(image)
  if title:
    plt.title(title)

content_image = load_img(content_path_1)
style_image = load_img(style_path_1)

plt.subplot(1, 2, 1)
imshow(content_image, 'Content Image')

plt.subplot(1, 2, 2)
imshow(style_image, 'Style Image')

Import TensorFlow Hub to use a pre-trained model

import tensorflow_hub as hub
hub_model = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')
stylized_image = hub_model(tf.constant(content_image), tf.constant(style_image))[0]
# Convert the result to an image so the notebook displays it
tensor_to_image(stylized_image)

content_image = load_img(content_path_2)
style_image = load_img(style_path_2)

plt.subplot(1, 2, 1)
imshow(content_image, 'Content Image')

plt.subplot(1, 2, 2)
imshow(style_image, 'Style Image')

# Reuse the hub_model loaded earlier
stylized_image = hub_model(tf.constant(content_image), tf.constant(style_image))[0]
tensor_to_image(stylized_image)

content_image = load_img(content_path_3)
style_image = load_img(style_path_3)

plt.subplot(1, 2, 1)
imshow(content_image, 'Content Image')

plt.subplot(1, 2, 2)
imshow(style_image, 'Style Image')

# Reuse the hub_model loaded earlier
stylized_image = hub_model(tf.constant(content_image), tf.constant(style_image))[0]
tensor_to_image(stylized_image)

Final Images – A Masterpiece

Meta is working on AI features for the Metaverse


Meta is working on several artificial intelligence features for the Metaverse. In a livestream broadcast, Zuckerberg demonstrated a virtual world created by an AI feature called Builder Bot.

Builder Bot follows a voice command and creates a new environment in the Metaverse. 

It can create and also import objects from the real world into the metaverse.

Builder Bot is still in its early stages. Zuckerberg demonstrated how a virtual space is created with the help of Builder Bot. He started with simple commands like “let’s go to the beach,” which prompted a 3D landscape of sand and water to surround him. He describes this as “all AI-generated.”

His demo with another Meta employee used voice commands to create more objects like a picnic table, clouds and added sound effects of ocean waves and seagulls.

Meta is one of many companies experimenting with AI to create virtual environments.

Meta AI also made other AI announcements at the event, including a plan for a universal language translator and a new version of its conversational AI system, called Project CAIRaoke.

These should make communication in the Metaverse more accessible.


10 Best Open-source Machine Learning Libraries [2022]


Machine learning libraries and frameworks make it easier to write code for machine learning without knowing the underlying mathematics behind the algorithms or building from scratch. With libraries, we can write code faster to train models.

In picking a machine learning framework to use, carefully consider these:

1. Learning curve of a machine learning library

Some machine learning libraries are easy to learn and implement, while others require more technical expertise.

2. User and organization’s adoption

It is essential to know what tool organizations use in production if you intend to apply for machine learning jobs or build your product. 

3. Project scope

Machine learning libraries focus on different goals. Find a library that is useful to your project scope. 

If your project scope focuses on image data, you should choose better-optimized frameworks for images.

10 best machine learning libraries and frameworks.

1. PyTorch


PyTorch is an open-source machine learning framework developed by Facebook’s AI Research lab (FAIR).

Written in: Python, CUDA, C++.

PyTorch is used both for research and production in building state-of-the-art products. It broadly supports the development of projects in computer vision, natural language processing, reinforcement learning, and more.

It has a robust ecosystem and is supported on major cloud platforms.

Learning curve: Medium

PyTorch is often considered easier to learn than many other deep learning frameworks.

Adoption level: High

1900+ contributors on Github. Used by over 83,000 repositories on Github.

Who is using PyTorch?

Salesforce, Stanford University, Udacity

Where to learn PyTorch

Learn the basics of PyTorch following these tutorials.

2. TensorFlow (TF)


TensorFlow is an open-source platform for machine learning developed by Google. TensorFlow was released to the public in November 2015. The core of TensorFlow is written in Python, C++, and CUDA.

TF is used in both research and production environments.

Although Python is the most widely used language for TensorFlow, it is also available in R and JavaScript.

TF is popularly used for numerical computations. It has inbuilt machine learning and statistical tools. It is used to build projects on regression, classification, neural networks, and more.

In production, TensorFlow Extended (TFX) is used to build production pipelines. It is designed for large-scale deployments.

Read more on MLOps.

TensorFlow has a visualization toolkit called TensorBoard. TensorBoard provides an interactive web-based dashboard for visualization.

TensorFlow can run computations on CPU, GPU, and TPU.

Learning curve: Medium

Adoption level: High

TensorFlow is widely used in production.

3,036+ Contributors; used by over 146,000 repositories on Github.


TensorFlow has a step-by-step example on its website.

Who is using TensorFlow?

Some of the companies using TensorFlow include:

Airbnb, Google, Coca-Cola, DeepMind, GE Healthcare, Intel, Twitter, Dropbox, eBay, Lenovo, LinkedIn, Nvidia, PayPal, Snapchat, Bloomberg, musical.ly, Kakao, AMD

Where to learn?

I recommend you start with TensorFlow tutorials here.

3. Scikit-learn


Scikit-learn is a popular open-source machine learning library developed by David Cournapeau and initially released in June 2007.

Written in Python, C, C++, Cython.

Scikit-learn is not built to run across clusters. It is mainly used in experimentation.

Learning curve: Easy

Adoption level: High

2000+ contributors on Github. Used by over 238,000 repositories on Github.

4. Spark MLlib


Apache Spark itself is a unified analytics engine for large-scale data processing. It is used for many things, including creating and managing data pipelines, data ingestion, data streams, and machine learning modeling.

Spark MLlib is built on top of Spark. It is widely used in production because it integrates easily with other Spark components such as Spark SQL and Spark Streaming.

Learning curve: Medium

Adoption level: High

1600+ contributors on Github. Used by over 604 repositories on Github.

5. spaCy


spaCy is an open-source library developed by Explosion AI for natural language processing and written in Python and Cython.

spaCy is an excellent library for feature engineering and extracting information on text data. spaCy is built for production use, and it can handle large volumes of text data.

Learning curve: Medium

Adoption level: Medium

542+ contributors on Github. Used by over 26,000 repositories on Github.

6. Natural language toolkit (NLTK)


NLTK is an open-source library for natural language processing. It was initially developed by Steven Bird, Edward Loper, and Ewan Klein.

Written in Python.

NLTK is very useful in preprocessing text data.

Learning curve: Easy

Adoption level: High

340 contributors on Github. Used by over 107,000 repositories on Github.

7. Numpy


NumPy is an open-source Python library that offers an extensive collection of mathematical functions. NumPy helps us work with arrays to perform various mathematical operations.

Written in Python and C.

NumPy grew out of the Numeric library, which Jim Hugunin created and first released in 1995. Travis Oliphant later created NumPy itself by merging Numeric with the competing Numarray library.

Learning curve: Easy

Adoption level: High

1,169+ contributors on Github. Used by over 736,000 repositories on Github.

8. Pandas


Pandas is an open-source data analysis library. It is an excellent tool for data analysis and manipulation.

Pandas was created by Wes McKinney and released on 11 January 2008.

Written in Python, C, and Cython.

Learning curve: Easy

Adoption level: High

2380+ contributors on Github. Used by over 469,000 repositories on Github.

9. Matplotlib


Matplotlib is one of the most popular plotting open-source libraries for the Python programming language. It is used to plot a graphical representation of data.

Learning curve: Easy

Adoption level: High

1,097+ contributors, used by over 387,000 repositories on Github.

10. Keras


Keras is an open-source library built on top of TensorFlow for creating artificial neural networks.

François Chollet originally developed it, and it was released in March 2015.

Learning curve: Medium

Adoption level: High

910+ contributors on Github.


These machine learning libraries and frameworks power a good number of machine learning products. They are widely used in state-of-the-art machine learning research and production.

They are:

  1. PyTorch
  2. TensorFlow
  3. Scikit-learn
  4. Spark MLlib
  5. spaCy
  6. NLTK
  7. Numpy
  8. Pandas
  9. Matplotlib
  10. Keras

Introduction to MLOps | Machine learning


Introduction to MLOps answers the question: how do you deploy a stable model?

In deploying machine learning models to the production environment, it is important to consider performance in the real world. Machine learning operations (MLOps) involve model testing, versioning, continuous integration and deployment (CI/CD), availability, and monitoring.


How does a machine learning system work in production?

Explore these archives to grasp the what, why, and how to build machine learning systems. 

These steps below are essential in building an excellent end-to-end machine learning system.

  1. Project scoping. In a typical MLE project, it is crucial first to define the scope of the project. A well-defined project scope shapes the outcome of the project.
  2. Data engineering. This phase defines the methods and techniques used to collect, organize, and store big data, as well as the ways to clean and preprocess it.
  3. Model building. At this phase, you already have the data defined in your scope documentation. Different machine learning algorithms are applied in training and testing to arrive at a good model.
  4. Model deployment. Deploy models to connect with new or existing applications, either natively or through application programming interfaces (APIs).
  5. Model monitoring and maintenance. It is crucial to monitor models in production: track performance over time and retrain when new information calls for it.

Machine Learning & Data Science Communities in the World


Find a directory of machine learning and data science communities in the world.

To add your community here, please send me an email or a pull request on Github.

Online Communities

North America

Machine learning communities in United States

  1. Women in Machine Learning and Data science Atlanta. Meetup Page
  2. Austin Women in Machine and Data science. Meetup page
  3. Natural language processing. New York. see group
  4. NYC Data Science, Big Data, ML, Blockchain, Web Dev
  5. Data & AI and App Development Meetup

Machine learning communities in Canada




Machine learning communities in United Kingdom

  1. AI for Good
  2. London Data, Analytics & AI Geeks. Visit page
  3. Remote ML paper Club. visit page
  4. ODSC London AIx. see page
  5. Data science on AWS UK. see group
  6. Social Science and Data science book club. see group
  7. Data science network (DSNet). London Chapter. visit group
  8. Artificial Intelligence for Healthcare. see group
  9. Data Engineering for Data science (DE4DS). see group











Machine learning communities in Nigeria

  1. Abuja Women in Machine learning and Data science. visit page
  2. Lagos Women in machine learning and data science. visit page
  3. AI saturdays lagos. see page
  4. AI saturdays abuja. see page
  5. Data science Nigeria DSN. see page




What is Machine Learning


Fundamentals of Machine Learning

Machine learning (ML) is more than an internet buzzword, and the possibilities it promises sound like fantasy and science fiction.
By the end of this article, my objective is to open you up to the world of machine learning.
We will start with some definitions, take a short trip down machine learning memory lane, and explore some exciting research.
From there, we will peek into who is using ML in production and see the prerequisites required to get started.
We will also discuss the different approaches to solving ML problems.
We will look at the platforms and frameworks used in coding for ML.
Lastly, we will chat about what comes next.
This article is a long read, and you won’t be writing code. Grab something to drink or eat, sit back and enjoy 🙂

What is Machine Learning?

I like to define Machine Learning (ML) as the science of getting a computer to learn to solve a problem from experience, just like humans do, without explicitly instructing it. Where humans learn from experience, computers learn from data.

Machine learning is a sub-field of Artificial Intelligence (AI) and Computer Science that focuses on how computer algorithms automatically learn and improve their accuracy without being explicitly programmed.

The goal is for computers to learn with no human intervention.
ML is used in diverse applications and has proved helpful in almost every field: medicine and health, transportation, education, entertainment, and finance.

ML thrives at solving problems that are extremely difficult for conventional programming logic, such as image classification.

Take the example below: classifying whether an image shows a dog or a cat.

cats vs. dogs image classification

Writing code to recognize and differentiate images can be complicated. Even if we find a way to perfectly describe the appearance of the dog in item A to the computer, what happens when we show the picture in item B to the same algorithm?


Not all dogs have a long tail. Some dogs are hairy, and some are not.
If we try building the algorithm with traditional programming, it will be thousands of if-else statements, and it still won’t be accurate.


In image classification, the goal is to accurately identify images of a particular object. Using traditional programming, some photos of dogs will be falsely classified as cats and vice versa.
Machine learning solves this using a different approach. Instead of instructing the computer on every step, we feed the computer many example images of dogs and cats.
Over time, the algorithm learns the characteristics that genuinely describe a dog and a cat.


The goal of a machine learning model is not to memorize the data but to generalize. In machine learning, algorithms train on sample data known as training data. The resulting model is then evaluated on a separate testing dataset.
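To make the training/testing idea concrete, here is a minimal sketch in plain Python. The toy measurements, the 25% hold-out fraction, and the choice of a 1-nearest-neighbour classifier are all my own assumptions for illustration. Real image classifiers work on pixels and use far richer models.

```python
import random

# Hypothetical toy data: (weight_kg, ear_length_cm) -> label.
# The numbers are invented purely for illustration.
DATA = [
    ((4.0, 6.5), "cat"), ((3.5, 6.0), "cat"), ((5.0, 7.0), "cat"),
    ((4.5, 6.8), "cat"), ((3.8, 6.2), "cat"), ((4.2, 6.4), "cat"),
    ((20.0, 11.0), "dog"), ((30.0, 12.5), "dog"), ((25.0, 10.5), "dog"),
    ((18.0, 11.5), "dog"), ((28.0, 12.0), "dog"), ((22.0, 10.0), "dog"),
]


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict(train_rows, features):
    """1-nearest-neighbour: return the label of the closest training example."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(train_rows, key=lambda row: sq_dist(row[0], features))
    return label


def accuracy(train_rows, test_rows):
    """Fraction of held-out examples whose label is predicted correctly."""
    correct = sum(predict(train_rows, x) == y for x, y in test_rows)
    return correct / len(test_rows)


train_rows, test_rows = train_test_split(DATA)
print("test accuracy:", accuracy(train_rows, test_rows))
```

The model never sees the held-out rows while "training", so the accuracy on them is an estimate of how well it generalizes rather than how well it memorized.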

ML is closely related to fields like Data Mining and Statistics. Some parts of ML focus on problems like predictive analysis, which is a statistical problem, but not all of machine learning is statistics. Machine learning requires training and testing data, so knowing how to mine the correct data is a plus for ML engineers.
Other related fields include Linguistics, Data Engineering, and Software Engineering.

Why is Machine Learning Important?

ML has been used to solve everything from simple, basic tasks to challenging scientific problems. The ability to solve complex problems is one of the main attractions of machine learning.
The availability of high computational power and large volumes of data is the reason behind the recent wave of ML-powered products.

ML is used in business operations to segment and understand customers’ data. It is also used to optimize and automate routine tasks, mitigate risks, distinguish fraudulent from non-fraudulent transactions, predict financial and market trends, recommend products to customers, and improve tedious existing processes. Businesses have been able to make intelligent decisions and stay ahead of the competition because of machine learning.

Machine Learning History

Just like every other scientific field, ML has a rich history: from the popular Turing test proposed by Alan Turing in 1950, John McCarthy and the Dartmouth conference in the summer of 1956, and the dreadful AI winter of the 1980s, to AlphaGo, the AI developed by DeepMind that beat one of the world’s best Go players in 2016, and, more recently, GPT-3, an autoregressive language model designed by OpenAI.

Alan Turing ~ image credit Wikipedia

In 1950, Alan Turing proposed the famous “Turing Test.”
For a machine to pass the test, it has to convince a human that it is another human and not a computer.

Arthur Samuel: Image credit Google arts and culture

Arthur Lee Samuel, a pioneering researcher in computer gaming and artificial intelligence, designed in 1952 one of the first computer programs that learned as it played the game of checkers. In 1959, he coined the term “machine learning.”

The perceptron, one of the earliest artificial neural networks, was designed in 1958 by Frank Rosenblatt.
In the 1960s, Nils J. Nilsson’s book “Learning Machines” theorized about the possibilities of pattern recognition and pattern classification.
The late 1980s and 1990s witnessed a massive collapse in the field. Research funding dried up, and big corporations that financed AI research withdrew their interest.

Garry Kasparov and IBM’s Deep Blue. Image credit: AFP

However, things bounced back, and there was a big win in 1997 when IBM’s Deep Blue, a chess-playing AI, beat the world champion, Garry Kasparov, in a chess match.
More and more groundbreaking research has since sprung up.

Some Interesting research in Machine learning

Since the end of the AI winter, we have seen more exciting research, thanks to high computing power and the internet, which makes big data available.

AlexNet (2012)

AlexNet is a deep convolutional neural network that won the 2012 ImageNet Large Scale Visual Recognition Challenge.
The network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 training set into 1000 different classes. It has about 650,000 neurons arranged in eight learned layers: five convolutional layers and three fully-connected layers.

Read the paper here by Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton.

See code implementation.


GPT-3 (2020)

GPT-3, released at the peak of the COVID-19 pandemic lockdown, was one of the highlights of 2020. GPT-3, short for Generative Pre-trained Transformer 3, is a language model that produces human-like text. It succeeded GPT-2 and was created by the OpenAI research laboratory.

OpenAI mentioned on its website that over 300 applications are currently powered by GPT-3, delivering search, conversation, text completion, and other advanced AI features through its API.

Microsoft ImageBert

ImageBert is a pre-trained model that combines natural language processing (NLP) and computer vision (CV) to learn joint representations of images and text. It was created by researchers on the Bing Multimedia Team at Microsoft: Di Qi, Lin Su, Jia Song, Edward Cui, Taroon Bharti, and Arun Sacheti.

Read the paper here ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data.

Machine learning in action

Away from research, who is using ML in production?

It is most likely you are consuming ML products already. Big and small organizations are incorporating ML into their products.

Google uses Machine learning to sort out spam messages.

Facebook uses ML in ads placement.

Grammarly is a writing assistant that helps you compose writings with better grammar and checks for spelling errors.

CopyAI is an excellent tool for copywriting. If you have writer’s block or stale copy, you can type a few words about the content you are working on and it will generate text for you.

Try CopyAI

Machine learning prerequisite


Statistics is the discipline concerned with the collection, organization, analysis, interpretation, and presentation of empirical data.

Machine learning largely depends on data, and statistics is a discipline that focuses on data.
Some tools and concepts used in statistics can be handy for machine learning.
For example, descriptive statistics are used to transform raw data into useful information.
These essential concepts include mean, mode, median, skewness, standard deviation, and outliers.
Having a good basic knowledge of statistics will help you understand data and extract information from data.
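As a quick illustration of descriptive statistics, the sketch below uses Python's built-in `statistics` module. The sample values are made up, and the "more than two standard deviations from the mean" outlier rule is just one common rule of thumb that I am assuming here, not the only definition.

```python
import statistics

# Hypothetical sample: daily page views, with one unusually large value.
views = [12, 15, 11, 15, 14, 13, 15, 90]

mean = statistics.mean(views)      # arithmetic average
median = statistics.median(views)  # middle value of the sorted data
mode = statistics.mode(views)      # most frequent value
stdev = statistics.stdev(views)    # sample standard deviation

# Assumed rule of thumb: flag values more than 2 standard deviations
# away from the mean as outliers.
outliers = [v for v in views if abs(v - mean) > 2 * stdev]

print(f"mean={mean}, median={median}, mode={mode}, stdev={stdev:.2f}")
print("outliers:", outliers)
```

Notice how the single large value pulls the mean well above the median. Spotting that gap is often the first hint that the data is skewed or contains outliers.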

Mathematics: Linear Algebra, Calculus, Probability

Mathematics helps to understand the underlying fundamentals of machine learning algorithms.


Calculus plays an essential role in designing ML algorithms.

Some topics include:

  • multivariate calculus
  • Derivative
  • Gradient and Gradient descent
  • Chain rule
  • Vector calculus etc.
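To show how derivatives and gradient descent fit together, here is a minimal sketch that minimizes a one-variable function. The function f(w) = (w - 3)^2, the learning rate, and the number of steps are assumptions chosen to keep the example small.

```python
def loss(w):
    """Toy function to minimize: f(w) = (w - 3)^2, smallest at w = 3."""
    return (w - 3.0) ** 2


def loss_gradient(w):
    """Derivative of the toy function: f'(w) = 2 * (w - 3)."""
    return 2.0 * (w - 3.0)


def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to move toward a minimum."""
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


w_min = gradient_descent(loss_gradient, start=0.0)
print("w at minimum:", round(w_min, 6), "loss:", loss(w_min))
```

Training a neural network follows the same idea, except that `w` is millions of numbers and the chain rule is used to compute the gradient.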


Probability describes the likelihood of an event occurring.

Some topics include:

  • Probability distribution
  • Rules of probability
  • Random Variables


Linear Algebra deals with vectors, matrices, and linear transformations.

Some topics include:

  • Variables, functions and coefficients
  • Logarithms
  • Linear equations
  • Matrix multiplication
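Matrix multiplication is worth seeing once in plain Python before reaching for a library. This is a small sketch with made-up matrices. In practice you would use NumPy, which does the same thing much faster.

```python
def matmul(a, b):
    """Multiply matrix a (n x m) by matrix b (m x p), both lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must equal rows of b")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]

C = matmul(A, B)
print(C)  # [[19, 22], [43, 50]]

# A matrix applied to a column vector is a linear transformation.
v = [[1], [1]]
print(matmul(A, v))  # [[3], [7]]
```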


Programming languages like Python, R, Scala, Matlab, and JavaScript can be used to write machine learning code. Python and R are the most popular and widely used.

To learn Python, check out my free tutorials on DailyCodingDev.

Prerequisite for ML in Production

Data Modelling and Data Analysis

If your focus is on applied machine learning and building ML models for production, this should be one of your primary prerequisites.

If your goal is to build and deploy models in production, you don’t need to be an expert in calculus and math.

You need a good knowledge of collecting, cleaning, aggregating, exploring, and visualizing data.

You will spend a lot of time preparing and exploring your data.

Some topics include:

  • Data collection
  • Exploratory data analysis
  • Data cleaning
  • Modelling
  • Data visualization

However, there is no need to be an expert on these prerequisites to start machine learning.

Machine Learning Approaches

Supervised learning

Supervised learning is a learning approach that models the relationship between the input variables (features) and the output variable (target/labels) of a given dataset, so that it can accurately predict output labels for new inputs based on what it learned.
The goal is to learn a function f that maps an input value (X) to the desired output value (Y):
Y = f(X)

The job of the learning algorithm is to generalize and not to memorize the training data.
Generalizing helps the algorithm to perform well in an unseen environment.
Supervised learning problems can be grouped into classification and regression problems.

Some supervised learning algorithms include:

  • Support vector machines
  • Linear regression
  • Logistic regression
  • Decision trees
  • Naïve Bayes
  • K-nearest neighbor
  • Linear discriminant analysis (LDA)
  • Similarity learning
  • Neural Networks
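As a small illustration of learning Y = f(X), the sketch below fits a straight line with ordinary least squares, which is the simplest form of linear regression from the list above. The hours-versus-score data is invented for the example, and the helper names are my own.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: hours studied (X) and exam score (Y).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(hours, scores)


def predict_score(x):
    """The learned function f: maps a new input X to a predicted Y."""
    return slope * x + intercept


print("f(6.0) =", predict_score(6.0))
```

The labeled pairs play the role of the training data, and `predict_score` is the learned f that we can apply to inputs the model has never seen.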

Unsupervised learning

An unsupervised learning approach is used when there is no output label. The learning algorithm models the distribution of the data in order to learn about the data. Here the algorithm learns patterns.
Unsupervised learning problems can be grouped into association and clustering problems.

Some algorithms include:

  • K-means
  • Apriori algorithm
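Here is a minimal sketch of K-means on a handful of 2D points. The points, the choice of two clusters, and the starting centroids are assumptions made for the example. Library implementations usually pick the starting centroids more carefully and stop when the assignments no longer change.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]

centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print("centroids:", centroids)
print("clusters:", clusters)
```

No labels were given, yet the algorithm still separates the two groups. That is the "learning patterns" part of unsupervised learning.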

Semi-supervised learning

The semi-supervised learning approach tackles problems with the combination of techniques from supervised and unsupervised learning.
For example, when there is not enough labeled data, the unsupervised technique can learn the structure of the unlabeled data, while the supervised method models the relationship in the labeled examples.

Reinforcement learning

Reinforcement learning aims at training a model to make a sequence of decisions. This approach employs trial and error. An intelligent agent is tasked with performing a specific activity. It is either punished for failing or rewarded for succeeding. The reinforcement learning approach is robust and is used in creating some sophisticated machine learning products, e.g., self-driving cars, robotics, and game-playing AI.
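To give a feel for trial and error with rewards and punishments, here is a small sketch of tabular Q-learning, one well-known reinforcement learning algorithm. The corridor environment, the reward values, and the hyperparameters are all assumptions I made up for the illustration.

```python
import random

# Hypothetical environment: positions 0..5 on a line.
# The agent starts at 2, fails at 0 (pit) and succeeds at 5 (goal).
N_STATES = 6
START, PIT, GOAL = 2, 0, 5
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right


def step(state, action):
    """Move the agent and return (next_state, reward, done)."""
    nxt = state + action
    if nxt == GOAL:
        return nxt, 1.0, True   # rewarded for succeeding
    if nxt == PIT:
        return nxt, -1.0, True  # punished for failing
    return nxt, 0.0, False


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2,
          max_steps=100, seed=0):
    """Learn a table of action values by repeated trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            a = choose_action(q, state, epsilon, rng)
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-terminal positions 1..4.
policy = ["right" if q[1] > q[0] else "left" for q in q_table[1:GOAL]]
print("learned policy for positions 1-4:", policy)
```

Nobody tells the agent that the goal is on the right. It discovers that by bumping into the reward and the punishment, then propagating those values back through the table.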

Summary of ML approaches

We learned about the approaches to tackle machine learning problems.

They are:

  • Supervised learning
  • Unsupervised learning
  • Semi-supervised learning
  • Reinforcement learning

Machine learning frameworks and platforms

Some of the platforms and frameworks used in building for machine learning include:

  • PyTorch – Open source machine learning library developed by Facebook AI lab. visit here
  • TensorFlow – Open source library for ML. visit here
  • Scikit-learn – Open source ML library for Python programming. visit here
  • spaCy – Open source library for Natural language processing
  • AfriLang – Open source library for Natural language processing
  • Python, Scala, R – Programming languages
  • Theano – Python library for numerical computation
  • AWS SageMaker – A cloud ML platform
  • Google ML toolkit – Cloud ML platform
  • Anaconda – ML and data science package management
  • IBM Watson Studio – Cloud ML platform. visit here
  • Google Colab – Cloud Jupyter notebook environment

Get started learning in ML

Communities and online learning have decentralized machine learning education. Even without a master’s degree in computer science, people have learned, built products, and researched.

Where to learn online

  • My own machine learning series.
  • Stanford Machine learning course on Coursera by Andrew Ng.
  • edx ML course.
  • Pluralsight machine learning courses

I lead an AI community, and I have learned more in meetups and group discussions.

Also, check out the list of machine learning and data science communities across the world.

Bonus: Some Machine learning terms and definitions

  • Training, validation and test datasets: Data used to train, validate and test a model.
  • Data cleaning: The process of detecting and correcting errors, filtering noise and inaccuracies in a dataset.
  • Overfitting: When a model memorizes a training dataset by capturing noise.
  • Underfitting: When a model is unable to capture the true properties of a dataset.
  • Regularization: Used to prevent a model from overfitting.
  • Ground Truth: Used to check the accuracy of a model against the real world.
  • Neural Networks:
  • Deep Learning: Neural networks with multiple layers.
  • Computer Vision: Computer Vision is an interdisciplinary scientific field that deals with how computers understand images.
  • Natural Language Processing: NLP, Natural Language Processing is an interdisciplinary scientific field that deals with the interaction between computers and the human natural language.
  • Artificial General Intelligence:
  • Bias
  • Noise
  • Active learning
  • Perceptron
  • Classification: Used in predicting discrete values, e.g., spam or not spam.
  • Regression: Used in predicting continuous values, e.g., weather prediction.
  • Curse of dimensionality
  • Random forest
  • Clustering
  • Association rules

See full Glossary for more terms and definitions.


Hey! You made it to the end. 🎉

In conclusion, we discussed the definition of machine learning and why ML is essential. We took a short trip through ML history and returned to recent research and products in production. We also looked at prerequisites to get started and approaches to solving problems. Lastly, we looked at a list of some popular frameworks and how to start learning.
I hope you enjoyed this as much as I did.
You can subscribe to my newsletter to get weekly exclusive machine learning tips.

Follow my Machine learning series here.

To follow topics on Natural language processing, start here.

Stay safe