Algorithm efficiency Algorithm Efficiency and Big-O Notation

Table of Contents


1. Introduction: Algorithm Efficiency (and why we care)

Processing power continues increasing as time moves forward. Many things process so quickly that we barely notice it, if at all. That doesn't mean we can just write code as inefficiently as possible and just let it run because "computers are fast enough, right?" - as technology evolves, we crunch more and more data, and as "big data" becomes a bigger field, we need to work with this data efficiently - because it doesn't matter how powerful a machine is, processing large quantities of data with an inefficient algorithm can still take a long time.

And some computers today are slow. Not usually the computers that the average person is using (Even though it might feel that way if they're running Windows). Some computers that handle systems are pretty basic, or they're small, and don't have the luxury of having great specs. Fitness trackers, dedicated GPS devices, a thermostat, etc. For systems like these, processing efficiency is important.

Let's say our algorithm's time to execute increases quadratically. That means, as we increase the amount of items to process (\(n\)), the time to process that data goes up quadratically.

c4_u12_AlgorithmEfficiency_quadratic.png

For processing 1 piece of data, it takes 1 unit of time (we aren't focused so much on the actual unit of time, since each machine runs at different speeds) - that's fine. Processing 2 pieces of data? 4 units of time. 8 pieces of data? 64 units of time. We don't just double the amount of time it takes when we double the data - we square it.

How much data do you think is generated every day on a social media website that has millions of users? Now imagine the processing time for an algorithm with a quadratic increase…

Data \(n\) Time units
1 1
2 4
3 9
4 16
5 25
100 10,000
1,000 1,000,000
10,000 100,000,000
1,000,000 1,000,000,000,000

From a design standpoint we also need to know what the efficiency of different algorithms are (such as the algorithm to find data in a Linked List or the algorithm to resize a dynamic array) in order to make design decisions on what is the best option for our software.


2. Example: Fibonacci sequence, iterative and recursive

As an example, you shouldn't use a recursive algorithm to generate a Fibonacci sequence (Each number \(F_n\) in the sequence is equal to the sum of the two previous items: \(F_n = F_{n-1} + F_{n-2}\)). It's just not as efficient! Let's look at generating the string of numbers:

1, 1, 2, 3, 5, 8, 13, 21, 34, 55

The most efficient way to do this would be with an iterative approach using a loop. However, we could also figure it out with a recursive approach, though the recursive approach is much less efficient.

Fibonacci sequence, iterative:

int GetFib_Iterative(int n)
{
  if (n == 0 || n == 1) { return 1; }

  int n_prev = 1;
  int n_prevprev = 1;
  int result = 0;

  for (int i = 2; i <= n; ++i)
    {
      result = n_prev + n_prevprev;
      n_prevprev = n_prev;
      n_prev = result;
    }

  return result;
}

This algorithm has a simple loop that iterates \(n\) times to find a given number of the Fibonacci sequence. The amount of time the loop iterates based on \(n\) is:

n 3 4 5 6 7 8 9 10 20
iterations 2 3 4 5 6 7 8 9 19

Fibonacci sequence, recursive:

int GetFib_Recursive(int n)
{
  if (n == 0 || n == 1) { return 1; }

  return GetFib_Recursive(n - 1) + GetFib_Recursive(n - 2);
}

The way this version is written, any time we call this function with any value \(n\), it has to compute the Fibonacci number for \(Fib(n-1)\), \(Fib(n-2)\), \(Fib(n-3)\), … \(Fib(1)\), \(Fib(0)\) … twice. This produces duplicate work b ecause it effectively doesn't "remember" what \(Fib(3)\) was after it computes it, so for \(Fib(n)\) it has to recompute \(Fib(n-1)\) and \(Fib(n-2)\) each time…

n 3 4 5 6 7 8 9 10 20
iterations 2 4 7 12 20 33 54 88 10,945

The growth rate of the iterative version ends up being linear: As \(n\) rises linearly, the amount of iterations also goes up linearly.

The growth rate of the recursive function is exponential: As \(n\) rises linearly, the amount of iterations goes up exponentially.

c4_u12_AlgorithmEfficiency_fib.png

  • Red/solid: Recursive growth rate
  • Blue/dashed: Iterative growth rate
  • Y scale from 0 to 25 (the 0th, to 25th Fibonacci number to generate).

For small amounts this might not be too bad. However, if we were generating a Fibonacci number further in the list, it would continue getting even slower…

c4_u12_AlgorithmEfficiency_fib2.png

  • Red/solid: Recursive growth rate
  • Blue/dashed: Iterative growth rate
  • Y scale from 0 to 11,000 (the 11,000th Fibonacci number to generate).

If we are trying to generate the 11,000th item in the sequence, the iterative approach requires 11,000 iterations in a loop. That linear increase is still much faster than each \(Fib(n)\) calling \(Fib(n-1)\) and \(Fib(n-2)\) recursively.


3. Big-O Notation and Growth Rates

For the most part, we don't care much about the exact amount of times a loop runs. After all, while the program is running, \(n\) could be changing depending on user input, or how much data is stored, or other scenarios. Instead, we are more interested in looking at the big picture: the generalized growth rate of the algorithm.

We use something called "Big-O notation" to indicate the growth rate of an algorithm. Some of the rates we care about are:

\(O(1)\) \(O(log(n))\) \(n\) \(n^2\) \(n^3\) \(2^n\)
Constant Logarithmic Linear Quadratic Cubic Exponential
Constant time, \(O(1)\)

We think of any single command as being constant time. The operation \[ a = b + c \] will take the same amount of computing time no matter what the values of \(a\), \(b\), and \(c\) are.

int F(int x)
{
  return 3 * x + 2;
}

c4_u12_AlgorithmEfficiency_constant.png

Logarithmic time, \(O(log(n))\)

Having an algorithm that halves its range each iteration of the loop will result in a logarithmic growth rate.

int Search(int l, int r,
           int search, int arr[])
{
  while ( l <= r )
    {
      int m = l + (r-l) / 2;
      if ( arr[m] == search )
        { return m; }
      else if ( arr[m] < search )
        { l = m+1; }
      else if ( arr[m] > search )
        { r = m-1; }
    }
  return -1;
}

c4_u12_AlgorithmEfficiency_log.png

Linear time, \(O(n)\)

Having a single loop that iterates from start to end from 0 to \(n\) with no interruptions (like breaks) will be a linear time.

int Sum( int n )
{
  int sum = 0;
  for (int i=1; i<=n; i++)
    {
      sum += n;
    }
  return sum;
}

c4_u12_AlgorithmEfficiency_linear.png

If there is some scenario that causes the loop to halve its range each time, then its growth rate would be less than linear: as \(O(log(n))\).

Quadratic time, \(O(n^2)\)

Quadratic time comes into play when you have one loop iterating \(n\) times nested within an other loop that also iterates \(n\) times. For example, if we were writing out times tables from 1 to 10 (\(n = 10\)), then we would need 10 rows and 10 columns, giving us \(10^2 = 100\) cells.

void TimesTables(int n)
{
  for (int y=1; y<=n; y++)
    {
      for (int x=1; x<=n; x++)
        {
          cout << x << "*"
               << y << "="
               << x*y << "\t";
        }
      cout << endl;
    }
}

c4_u12_AlgorithmEfficiency_quadraticB.png

Cubic time, \(O(n^3)\)

Just like nesting two loops iterating \(n\) times each gives us a \(n^2\) result, having three nested loops iterating \(n\) times each gives us a \(O(n^3)\) growth rate.

void CubicThing(int n)
{
  for (int z=1; z<=n; z++)
    {
      for (int y=1; y<=n; y++)
        {
          for (int x=1; x<=n; x++)
            {
              cout << x << " "
                   << y << " "
                   << z << endl;
            }
        }
    }
}

c4_u12_AlgorithmEfficiency_cubic.png

Exponential time, \(O(2^n)\)

With an exponential function, each step increases the complexity of the operation. The Fibonacci example is a good illustration: Figuring out Fib(0) and Fib(1) are constant, but Fib(2) requires calling Fib(0) and Fib(1), and Fib(3) calls Fib(1) and Fib(2), with Fib(2) calling Fib(0) and Fib(1), and each iteration adds that many more operations.

int Fib(int n)
{
  if (n == 0 || n == 1)
    return 1;

  return
    Exponential_Fib(n-2)
    + Exponential_Fib(n-1);
}

c4_u12_AlgorithmEfficiency_exponential.png

Growth rate comparisons:

c4_u12_AlgorithmEfficiency_comparisons.png


Author: Rachel Wil Sha Singh

Created: 2023-10-30 Mon 12:23

Validate