There’s something about Markov models to me that is *cool *yet very weird. They are one of the first types of examples of probability theory I stumbled on in around 2013/2014 and this was before I even knew what probability theory or a stochastic model meant or actually did. It’s only when I dived a bit further into the possible applications of them that I frankly found them so much **fun. **

Yes, that’s right Markov Chain models are a guilty confession of mine and as shown later in this post I sometimes tinker with them to create Frankenstein-esque applications in Python. Maybe its their – albeit very limited – capability to generate text and predict the future that keeps me entertained, who knows.

This blog post covers the *basics* of the markov chain process. I’ll readily state upfront that I certainly will not be able to cover over one hundred years of history/application of it in this one post. Instead my intention is to attempt to teach what they are, how they can be used and also how you might be able to have some fun with them 🙂

## So, what the Markov?

A **Markov chain** is “a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.”^{[1]}

Markov Chains, Wikipedia

The markov process itself was named after the Russian mathematician Andrey Markov in the early 20th century when he published his first paper on the topic in 1906.

What I find particularly interesting about markov chains is that they essentially do not hold any kind of ‘attention’ or memory. The next step is only based on what state was attained previously.

As an example, consider the below table:

Markov Chains Explained, Tech Effigy Tutorials.

If it is cloudy, there is a 10% probability of the next state being cloudy, 50% of the next state being rain and 40% of the next state being sunny. Therefore it is most likely that the next day will be rainy and we move to the rainy state. The next day following is also likely to be rainy, based on a 60% probability.

Because this is a stochastic model and because of simply how probability theory actually works there is no way to say that this is what *will happen.* It’s just highly likely that the predictions from this table of data are that rainy days will follow cloudy days and that in turn rainy days are more likely to follow rainy days. *One of the key things to remember here is that the prediction of the day’s state only has a dependency on the previous day, there are no other dependencies or form of memory/attention.*

A good way to represent this table of data is also in a state diagram or a transition matrix that Joseph Armstrong has explained so well on the Tech Effigy pages:

Markov Chains Explained, Tech Effigy Tutorials.

## Applications of a Markov Chain Model

The applications of a markov chain model are varied and there is also a number of derivatives of them. For example did you know that some of Google’s page rank algorithm uses markov chains in its prediction of what a random surfer of the search engine will do? Other examples of applications include:

- Chemical reactions and engineering – Out of scope of this post but you only have to Google this to see a great deal of papers on the subject
- Text prediction – Yes, your iPhone or Android (other phones are available) most likely uses markov chains in its text prediction based on what you have typed before.
- Financial applications – Used for credit risk and predicting market trends
- Text generation – Lots of applications here, however, some include:
- Subreddit simulation – An automated process that uses markov chains to generate random submissions, titles and comments using markov chains.
- Quote generation – You’ll see how this can be possibly achieved below

- Google page rank

Admittedly, some of the above examples have limited scope (i..e. text generation) but some are also quite powerful and well used.

In a later post I’ll possibly also look to cover derivatives such as monte carlo markov chains (MCMC) & Hidden Markov Models (HMM).

MCMCs are widely used to calculate numerical approximations of multi-dimensional integrals and come from a large class of algorithms that focus on sampling of a probability distribution. Where as HMMs are a derivative of the standard markov chain that have some states that are hidden or unobserved.

## Having fun with Markov chains

So, the fun bit. At least in my eyes!

It’s relatively easy (< 50-60 lines of code) to generate sentences using markov chains in Python from a given corpus of text. Shout out to Digitators.com who’s code I have used and modified for my purpose.

In the example I show below I’m looking to generate some new inspirational quotes based on around 350 existing classic quotes. Admittedly some of the quotes that I generated are a bit of a train crash, but then many others are hilarious and some of them are even quite profound! I have selected and included a prize few quotes below the code, I DID NOT WRITE THESE MYSELF!

### The code to generate new quotes

The overall process to create quotes revolves around:

- Getting data (in this case existing quotes) from a source file
- Ensuring data is in a correct format to create the pseudo-markov model, in this case a Python dictionary of words with associated values for each word.
- Generate a list of words of a maximum size that starts with a random START word, appends it to the list and then searches the dictionary for the next potential words from the START word.
- The loop finishes when either the max length is reached or an END word is selected.

# The only Python library we need for this work!
import random

Now to create the dictionary of START, END and associated words

# Create a dict to store our key words and associated values
model = {}
for line in open('markovdata.txt', 'r', encoding="utf8") :
line = line.lower().split()
for i, word in enumerate(line):
if i == len(line)-1:
model['END'] = model.get('END', []) + [word]
else:
if i == 0:
model['START'] = model.get('START', []) + [word]
model[word] = model.get(word, []) + [line[i+1]];

And finally to create a function that will create a quote of max length or finishing with an END word.

def quotegen():
generated = []
while len(generated)< 30 :
if not generated:
words = model['START']
elif generated[-1] in model['END']:
break
else:
words = model[generated[-1]]
generated.append(random.choice(words))
print( ' '.join(generated))

That’s it! I’m sure the Python could be made more Pythonic but it allows you to see how you can easily use ANY text data to generate new sentences of data.

### Example quotes:

“hope is not something ready made. it cannot be silenced.”

“fall seven times and that you go of my success is showing up.”

“challenges are empty.”

“each monday is during our possibilities become limitless.”

“to love what you feel inferior without your heart.”

“dream big and methods. third, adjust all your children, but people give up creativity. the dark, the light.”

“dream big and stand up someone else will forget what we must do.”

“change your whole life”

“a voice within yourself–the invisible battles inside all success.”

“the mind is afraid of success should be happy monday!”

I could go on but hopefully you get an impression of some of the zany and sometimes profound quotes that can be generated with less than 60 lines of code and approximately 350 quotes!

Until next time, thanks for reading 🙂