Big O Notation? More like Big O-M-G

1007

There are a lot of ways to write algorithms for a specific task. Since there are so many variations, how does someone know which algorithm to use in which situation? One way to look at the problem is to ask yourself: what limitations are present and how can I check if the algorithm fits within these constraints? Big O is used to help identify which algorithms fit inside these constraints and limitations.

Big O notation is a mathematical representation of how an algorithm scales with an increasing number of inputs. In other words: the length of time it takes for a certain algorithm to run or how much space it will consume during computations.

By using Big O notation, developers and engineers can construct the best algorithm that fits within their constraints.

How does Big O work?

So, Big O is used to describe the trend of an algorithm, but what does the language surrounding Big O sound like? In conversation, you would talk about Big O like this: “The time complexity of [a certain algorithm] is O(n) — “O of n” (both “O” and “n” are pronounced like the letter). The reason I am using the phrase time complexity is because Big O notations are most commonly referred to as time complexities. “n” is used in Big O to refer to the number of inputs. There is another side to Big O called space complexity but I’ll talk about that later in the blog.

Now that you know how to talk about Big O and what it’s used for, let’s look at some examples to help you understand how Big O is calculated. The following examples will go over some of the more common Big O notations you will see (in order): O(n)O(log(n))O(n^2)O(1).


Time complexity: O(n)

1*g5D_dsEo__lYJIkEc_sEmg.jpeg

“Could you not?”

Let’s say that someone stole the last cookie out of your cookie jar (“how rude!”), and you really wanted to eat that cookie, so now you have to go out and find it. One way you could go about to find the cookie is to ask your 5 friends, one by one, if they took your cookie. We could represent this as a function in Python3:

>>> def who_has_my_cookie(my_friends):
...     for friend in my_friends:
...         if friend.get('has_cookie'):
...             return friend.get('name')

In this function, you are going to go ask each of your friends if they have your cookie. In this case, the length of my_friends is your n. This function’s run time is dependent on how many friends were present during the abduction. For that reason, this function would have a time complexity of O(n).

Now let’s feed the function some input, and see who took the cookie:

>>> my_friends = [
...     {'name':'Kristen', 'has_cookie':False},
...     {'name':'Robert', 'has_cookie':False},
...     {'name':'Jesse', 'has_cookie':False},
...     {'name':'Naomi', 'has_cookie':False},
...     {'name':'Thomas', 'has_cookie':True}
... ]
>>> cookie_taker = who_has_my_cookie(my_friends)
>>> print(cookie_taker)
Thomas

You might wonder, “But what if the first person you ask has the cookie?” Great question! In that case, the time complexity would be O(1), also known as constant time. Even so, the time complexity for this problem would still be O(n).

Why is the correct notation O(n)? In most cases, you won’t get the best case scenario — i.e. finding the cookie on the first person. This means it’s best to think of the worst case scenario when dealing with algorithms.

Summary of O(n): A loop that could iterate the same amount of times as the max number of input. In the case above, the loop goes through 5 iterations, of which there were 5 friends (my_friends) before finding who stole your cookie.


Time Complexity: O(log(n))

In this next scenario, lets say that you had all of your friends line up side by side. You then find the middle person and ask them if the cookie is on their left side or right side. You then find the middle person on that side and ask them the same thing — using the original middle person as a boundary to search smaller portions of the line. This keeps happening until the person who has the cookie is found. This can be written in Python3:

>>> def search_for_cookie(my_friends, first, last):
...     middle = (last - first) // 2 + first
...     middle_response = my_friends[middle].get('where_is_cookie')
...     if middle_response is 'I have it':
...         return my_friends[middle].get('name')
...     elif middle_response is 'to the right':
...         # Search on the right half
...         return search_for_cookie(my_friends, middle + 1, last)
...     else:
...         # Search on the left half
...         return search_for_cookie(my_friends, first, middle - 1)

In this function, first refers to the first person in the section being looked at — the first time, the section you’ll be looking at is the whole line. last refers to the last person in the section being looked at and middle refers to the person in the middle in between first and last.

This function would keep going until the person that has your cookie has been singled down. Looking at the worst case scenario, this function wouldn’t find out who has the cookie until the person has been singled out on the last run. This means the function has a time complexity of O(log(n)).

Let’s feed this function some input and see who took your cookie:

>>> my_friends = [
...     {'name':'David', 'where_is_cookie':'to the right'},
...     {'name':'Jacob', 'where_is_cookie':'to the right'},
...     {'name':'Julien', 'where_is_cookie':'to the right'},
...     {'name':'Dimitrios', 'where_is_cookie':'I have it'},
...     {'name':'Sue', 'where_is_cookie':'to the left'}
... ]
>>> cookie_taker = search_for_cookie(my_friends, 0, len(my_friends) ... - 1)
>>> print(cookie_taker)
Dimitrios

Summary of O(log(n))When a function cuts the section you’re looking at by half every iteration, you can suspect an O(log(n)) algorithm. Spotting this kind of behavior along with incrementing using i = i * 2 for each iteration allows for simple diagnosis. This way of narrowing down who has the cookie is typical of recursive functions like quick sort, merge sort and binary searches.


Time Complexity: O(n^2)

In this next scenario, let’s say that only one person knows who took your cookie and only by name. The person who took the cookie won’t fess up to it either, so you have to name off each other friend to the friend you are talking to. This can be written in Python3:

>>> def who_has_my_cookie(my_friends):
...     for friend in my_friends:
...         for their_friend in friend.get('friends'):
...             if their_friend.get('has_cookie'):
...                 return their_friend.get('name')

If you imagine yourself in this situation you can see that it’s going to take a lot longer than any of the other situations because you need to keep reciting everyone’s name every time you go to the next friend.Worst case scenario, you need to cycle through each friend and ask them about every other friend. If you look at the loops, you can see that the outer loop has a time complexity of O(n), and the inner loop also has a time complexity of O(n). This means the function will have a time complexity of O(n^2) because you multiply the time complexities together O(n * n).

Let’s feed this function some input and see who took your cookie:

>>> my_friends = [
...     {'name':'Steven', 'friends':[
...         {'name':'Steven', 'has_cookie':False},
...         {'name':'Binita', 'has_cookie':False},
...         {'name':'Linds', 'has_cookie':False},
...         {'name':'Raman', 'has_cookie':False},
...         {'name':'Elaine', 'has_cookie':False}
...         ]
...     },
...     {'name':'Binita', 'friends':[
...         {'name':'Steven', 'has_cookie':False},
...         {'name':'Binita', 'has_cookie':False},
...         {'name':'Linds', 'has_cookie':False},
...         {'name':'Raman', 'has_cookie':False},
...         {'name':'Elaine', 'has_cookie':False}
...         ]
...     },
...     {'name':'Linds', 'friends':[
...         {'name':'Steven', 'has_cookie':False},
...         {'name':'Binita', 'has_cookie':False},
...         {'name':'Linds', 'has_cookie':False},
...         {'name':'Raman', 'has_cookie':False},
...         {'name':'Elaine', 'has_cookie':False}
...         ]
...     },
...     {'name':'Raman', 'friends':[
...         {'name':'Steven', 'has_cookie':False},
...         {'name':'Binita', 'has_cookie':False},
...         {'name':'Linds', 'has_cookie':False},
...         {'name':'Raman', 'has_cookie':False},
...         {'name':'Elaine', 'has_cookie':False}
...         ]
...     },
...     {'name':'Elaine', 'friends':[
...         {'name':'Steven', 'has_cookie':False},
...         {'name':'Binita', 'has_cookie':False},
...         {'name':'Linds', 'has_cookie':False},
...         {'name':'Raman', 'has_cookie':True},
...         {'name':'Elaine', 'has_cookie':False}
...         ]
...     }
... ]
>>> cookie_taker = who_has_my_cookie(my_friends)
>>> print(cookie_taker)
Raman

Summary of O(n^2): There is a loop within a loop, so you could instantly think that it would be O(n^2). That’s a correct assumption, but to make sure, double check and make sure the increments are still being incremented ntimes. Some loops within loops turn out to be O(nlog(n)).


Time Complexity: O(1)

In this next scenario, let’s assume that you have a security camera that is watching your jar of cookies at all times. Once you noticed that your cookie was missing, you check the tapes and know exactly who took your cookie. For this reason, you can immediately go to that person and get the cookie back. This can be written in Python3:

>>> def who_has_my_cookie(my_friends, security_camera):
...     return my_friends[security_camera].get('name')

This would be considered a time complexity of O(1) because no matter who took your cookie, it would always result in the same amount of time being spent getting it back.

Let’s feed this function some input and expose the cookie thief:

>>> my_friends = [
...     {'name':'Sid', 'has_cookie':False},
...     {'name':'Guillaume', 'has_cookie':False},
...     {'name':'Lee', 'has_cookie':False},
...     {'name':'Liz', 'has_cookie':False},
...     {'name':'Kevin', 'has_cookie':True}
... ]
>>> cookie_taker = who_has_my_cookie(my_friends, 4)
>>> print(cookie_taker)
Kevin

Summary of O(1): If a function has no loops, and only performs one thing based off of the input you are giving, it’s usually going to be an O(1) time complexity.


There’s one more thing about Big O notations that you need to know. Constants in Big O are thrown out. The reasoning behind this is that once the number of inputs become large enough, the constants don’t really give any meaningful insights into the time complexity anymore.

Nested Loops vs Non-Nested Loops

When there is a nested loop, you multiply them together. In the O(n^2)example above, there was a loop within a loop giving the time complexity O(n^2). If there is a loop right after another loop, (not nested), you would add them together rather than multiply. This would give a time complexity of O(n + n) which can be combined to produce O(2n). You know now that Big O notations get rid of any constants, so that would reduce down into O(n).

But what if there was another set of inputs that aren’t as long or short as n? You can represent that set of inputs as k. Say you have a nested loop with nas the outer loop and k as the inner loop. This would boil down to O(n*k). You can’t really simplify it to be O(n^2) since k is not the same as n, so O(n*k) is the correct notation. Same goes for non-nested loops: O(n + k).


If you are a visual learner, I have included a graph to help you spot which algorithm is taking more time than other algorithms. You can clearly see that any of the polynomial notations ( n!2^nn^2 ) are generally going to take a lot longer than say, log(n).

 
1*lLbL3VCKkKbf-Y_dVbB5-Q.png

N is computation time and n is number of input elements

Some of the other types of time complexities that exist, but are not described in this blog are:

  1. O(n!) — O of n factorial
  2. O(2^n) — O of 2 to the n
  3. O(nlog(n)) — O of n log n: Example
  4. O(sqrt(n)) — O of square root of n

Feel free to look up these more complex Big O notations.

Space Complexity

There is also another side to Big O. Instead of focusing on time only, a different aspect can also be inspected — Space complexity. Space complexity is a measure of the amount of working storage an algorithm needs during run time.

In this scenario, let’s say that all your friends are in different rooms and you need to keep track of who you’ve talked to so you don’t ask the same person twice. This can be written in Python3:

>>> def who_took_my_cookie(my_friends):
...     spoken_to = []  
...     for friend in my_friends:
...         if friend not in spoken_to:
...             if friend.get('has_cookie'):
...                 print(spoken_to)
...                 return friend.get('name')
...             else:
...                 spoken_to.append(friend.get('name')

In this example, the spoken_to list could be as long as there are total friends. This means that the space complexity is O(n).

Let’s feed this function some input and see who took your cookie:

>>> my_friends = [
...     {'name':'Kristen', 'has_cookie':False},
...     {'name':'Robert', 'has_cookie':False},
...     {'name':'Jesse', 'has_cookie':False},
...     {'name':'Naomi', 'has_cookie':False},
...     {'name':'Thomas', 'has_cookie':True}
... ]
>>> cookie_taker = who_has_my_cookie(my_friends)
['Kristen', 'Robert', 'Jesse', 'Naomi']
>>> print(cookie_taker)
Thomas

Summary of Space Complexity: Space complexity can have as many notations as time complexity. This means that the O(n^2), etc. also exists and can be calculated using the same general principles. Be careful though, time and space complexities may be different than each other for a given function.

By combining time complexity with space complexity, you can begin to compare algorithms and find which one will work best for your constraints.

Battle of the Algorithms

1*-0kWkkgpkBi4VywaovVwqA.jpeg

“Which algorithm will win?!”

Now that you know what time and space complexities are and how to calculate them, let’s talk about some cases where certain algorithms may be better than others.

A simple example to clearly define some constraints would be to examine a self-driving car. The amount of storage in the car is finite — meaning that there is only so much memory that can be used. Because of this, algorithms which favor low space complexity are often favored. But in this case, fast time complexities are also favored, which is why self-driving cars are so difficult ot implement.

Now let’s examine something like a cellphone. The same storage limitations are in place, but because the cellphone isn’t driving around on the street, some sacrifices in time complexity can be made in favor of less space complexity.


Now that you have the tools to start computing the time and space complexity of the algorithms you read and write, determine which algorithm is best for your constraints. Determining which algorithm to use must take into consideration the space available for working memory, the amount of input to be parsed, and what your function is doing. Happy Coding!