1
00:00:02,150 --> 00:00:08,500
When we have multiple available solutions for a given problem, we of course want to find the best one

2
00:00:08,510 --> 00:00:10,900
and for this, we need to compare the solutions

3
00:00:10,910 --> 00:00:13,740
but how do we compare different algorithms?

4
00:00:14,000 --> 00:00:17,620
In the end, it typically comes down to the runtime performance.

5
00:00:17,630 --> 00:00:23,240
We wonder how long does an operation take and of course, we want to pick the solution which takes the

6
00:00:23,240 --> 00:00:27,360
less amount of time, so which is fastest to execute.

7
00:00:27,380 --> 00:00:34,010
Now the problem just is, of course we could use a stopwatch and measure how long our different algorithms

8
00:00:34,010 --> 00:00:39,290
take for different inputs but this would not really be a good idea,

9
00:00:39,320 --> 00:00:44,780
it depends way too much on the hardware we're running on, which other processes might be running in the

10
00:00:44,780 --> 00:00:49,320
background and for small data sets, for small amounts of data,

11
00:00:49,340 --> 00:00:56,150
there might be a lot of variation. So indeed we try to measure and express this in a generalized form

12
00:00:56,420 --> 00:01:02,840
and not in time units, not in seconds and so on because that would be really hard to measure,

13
00:01:02,840 --> 00:01:09,800
instead we use something which is called time complexity and there, you could simply imagine a chart

14
00:01:10,160 --> 00:01:17,360
where you have two axis and one is the time axis and one is the number of items or the size of the input

15
00:01:17,360 --> 00:01:24,620
you're feeding into your algorithm and then you have a certain relation between the number of items

16
00:01:24,620 --> 00:01:30,710
you're passing into your algorithm and the time it takes to execute and this is expressed with something

17
00:01:30,710 --> 00:01:35,130
which is called the Big O notation, which we write like this.

18
00:01:35,330 --> 00:01:41,720
Now that's very abstract, so let me explain what exactly this means, what the Big O notation is and how

19
00:01:41,720 --> 00:01:48,370
we could measure this for our two example algorithms. Back to the two algorithms we wrote,

20
00:01:48,470 --> 00:01:53,480
how long does this approach take and how long does this solution take?

21
00:01:53,480 --> 00:01:55,390
Well let's start with getMin,

22
00:01:55,400 --> 00:02:03,410
how can we find out how long it takes? For this, we in the end start by counting the number of executions

23
00:02:03,470 --> 00:02:11,060
of the different parts of our algorithm, so you could say we count how often each line in this function

24
00:02:11,090 --> 00:02:14,000
is reached for a given input.

25
00:02:14,000 --> 00:02:18,170
Now let's say here, our input is 3, 1, 2,

26
00:02:18,170 --> 00:02:22,000
so we're feeding in an array of three elements.

27
00:02:22,040 --> 00:02:26,570
How often will this block execute then? Exactly once,

28
00:02:26,570 --> 00:02:28,410
this is not inside of a loop,

29
00:02:28,430 --> 00:02:30,320
this is just in the function itself,

30
00:02:30,320 --> 00:02:34,730
we call that function once no matter how big our array is,

31
00:02:34,730 --> 00:02:39,410
so this part here executes exactly once

32
00:02:39,410 --> 00:02:44,780
so one execution and that's true for the entire if statement, so including the code which is inside

33
00:02:44,890 --> 00:02:51,380
of the if statement, of course this part technically doesn't always execute, this will only execute if this condition

34
00:02:51,380 --> 00:02:55,070
is true but therefore at most it executes once,

35
00:02:55,100 --> 00:02:57,860
sometimes this line will not execute at all,

36
00:02:57,890 --> 00:03:01,200
the if statement overall is executed once.

37
00:03:01,460 --> 00:03:03,380
So what about this line here?

38
00:03:03,440 --> 00:03:10,020
It's also outside of the for loop, so it also only executes once, we have one execution here because

39
00:03:10,020 --> 00:03:13,160
this only gets executed once when the function gets called.

40
00:03:13,160 --> 00:03:18,740
Well actually to be precise, of course if we would feed in an empty array or not an array at all

41
00:03:18,740 --> 00:03:22,430
so that our first if statement catches this and throws an error,

42
00:03:22,430 --> 00:03:25,820
this line and all the other code in this function would not execute at all.

43
00:03:26,060 --> 00:03:28,110
So it's not always running once,

44
00:03:28,160 --> 00:03:30,290
sometimes it doesn't run at all

45
00:03:30,290 --> 00:03:33,740
but that would be different cases for this algorithm,

46
00:03:33,740 --> 00:03:40,010
so in the best case from a performance perspective, we get invalid data and the algorithm doesn't run

47
00:03:40,010 --> 00:03:46,040
once. That of course is great for performance, not so much for our application logic though. We typically

48
00:03:46,040 --> 00:03:51,860
look at something which is called the average case though, which simply means on average we expect

49
00:03:51,860 --> 00:03:56,170
valid data and therefore on average, this line of course will execute once,

50
00:03:56,180 --> 00:04:01,870
so that's just important to understand how you work with these if statements. You can't participate

51
00:04:01,880 --> 00:04:03,950
which concrete values you get,

52
00:04:03,950 --> 00:04:06,380
you don't know the exact data you get,

53
00:04:06,380 --> 00:04:13,920
so we simply take the common case, the typical case we would expect. Now the return statement here also

54
00:04:14,100 --> 00:04:18,840
is executed once because we only return once in this entire function.

55
00:04:18,870 --> 00:04:20,810
So now what about the for loop?

56
00:04:20,820 --> 00:04:28,740
Well of course the loop itself so that we dive into it executes once but then the code in the loop executes

57
00:04:28,740 --> 00:04:31,590
more often, so we could say this gets executed once,

58
00:04:31,590 --> 00:04:34,360
this exact line but now inside of the loop,

59
00:04:34,380 --> 00:04:36,310
of course we run more often.

60
00:04:36,390 --> 00:04:37,710
What about this line here?

61
00:04:37,710 --> 00:04:40,380
How often do we run this if check?

62
00:04:40,380 --> 00:04:46,670
Well we have three items in the array but we start the loop at item two.

63
00:04:46,820 --> 00:04:53,290
So this loop, for an array of three items, will run twice because it doesn't run for the very first item,

64
00:04:53,330 --> 00:04:55,730
it starts at the second item.

65
00:04:55,730 --> 00:05:01,640
So here we should have two executions of this line here and then in here,

66
00:05:01,640 --> 00:05:02,860
this depends,

67
00:05:02,900 --> 00:05:05,460
this depends on the concrete data we get.

68
00:05:05,540 --> 00:05:11,660
It might never execute if we have an array as an input where the first element is the smallest because

69
00:05:11,690 --> 00:05:17,180
then this condition will never be met but if we have an array where we have a smaller item after the

70
00:05:17,180 --> 00:05:20,630
first element, then this will at least run once,

71
00:05:20,630 --> 00:05:28,340
so this here could be anything between zero to two executions in our case here. For this concrete array,

72
00:05:28,370 --> 00:05:34,730
we know this would execute once, it would execute once because for the second element which is the smallest

73
00:05:34,730 --> 00:05:35,180
one,

74
00:05:35,180 --> 00:05:38,270
this condition would be fulfilled and therefore we would run it,

75
00:05:38,300 --> 00:05:45,160
so we have one execution here for our concrete example. Now in case this is not entirely clear,

76
00:05:45,170 --> 00:05:52,940
you can of course also throw in some console log statements, for example here and there we could log

77
00:05:53,030 --> 00:06:05,760
execution for and add a couple of other breakpoints, for example execution init and then here execution

78
00:06:06,120 --> 00:06:06,870
return

79
00:06:06,870 --> 00:06:09,210
and then you could of course simply count this.

80
00:06:09,210 --> 00:06:15,600
So here you see execution init runs once, execution return runs once, execution for runs twice,

81
00:06:15,630 --> 00:06:19,400
that's exactly what I wrote here, 2, 1, 1.

82
00:06:19,470 --> 00:06:24,150
So that's also something you can do if you're not sure whether you counted correctly,

83
00:06:24,150 --> 00:06:26,560
so here it looks like we did count correctly.

84
00:06:26,580 --> 00:06:32,970
Now overall we can therefore also express this in mathematical terms you could say. We can group this

85
00:06:32,970 --> 00:06:40,160
into three main blocks, the initialization block here because these lines here all run equally often,

86
00:06:40,410 --> 00:06:45,800
then the part where we're done and then also the dynamic part here

87
00:06:45,810 --> 00:06:48,330
and of course it depends on your algorithm,

88
00:06:48,340 --> 00:06:52,410
how many different blocks of code you can identify.

89
00:06:52,410 --> 00:06:55,530
So here I would say we have these three main blocks - initialization,

90
00:06:55,740 --> 00:06:59,490
then our loop logic and then the return statement.

91
00:06:59,490 --> 00:07:06,600
So if we want to derive the time and with that, I mean the number of executions, then we could say that

92
00:07:06,600 --> 00:07:15,600
we have three blocks. We have one execution here for this first block and one execution for this last

93
00:07:15,600 --> 00:07:16,280
block

94
00:07:16,380 --> 00:07:20,130
and in between, we have two executions

95
00:07:20,130 --> 00:07:26,070
and with that I don't mean the number of lines that get executed but simply how often these three blocks

96
00:07:26,160 --> 00:07:28,530
get executed at most here.

97
00:07:28,530 --> 00:07:30,790
Now why am I writing it like this?

98
00:07:30,870 --> 00:07:38,370
Because if we take a closer look at this function here, we actually see that this block and this block

99
00:07:38,610 --> 00:07:47,760
are executed in a constant way, which means these two blocks don't depend on the input size, on the number

100
00:07:47,760 --> 00:07:49,830
of items in the array in this example,

101
00:07:49,830 --> 00:07:56,290
no matter if we have an array with three items or with a million items, this will only run once,

102
00:07:56,310 --> 00:07:59,280
this part here and this will also only run once.

103
00:07:59,280 --> 00:08:00,520
So this is constant, so

104
00:08:00,540 --> 00:08:05,040
we could have c1 and c3 here,

105
00:08:05,070 --> 00:08:08,140
so the first and the third block are in the end constant,

106
00:08:08,150 --> 00:08:11,080
so we don't care about the concrete values.

107
00:08:11,100 --> 00:08:13,290
Now what about this for loop here?

108
00:08:13,350 --> 00:08:17,220
Now here, the execution depends on the length of the array,

109
00:08:17,250 --> 00:08:22,860
it's basically executing n-1 amount of times you could say. So for an array with three elements,

110
00:08:22,860 --> 00:08:28,860
this runs twice, for an array with a million elements, this would run 990,999

111
00:08:28,860 --> 00:08:35,880
times. Now since we initialize our current minimum with the first number in the array

112
00:08:35,880 --> 00:08:39,630
here and therefore this also depends a bit on the array,

113
00:08:39,630 --> 00:08:40,180
right,

114
00:08:40,200 --> 00:08:42,890
this here also accesses our array,

115
00:08:42,960 --> 00:08:51,000
we could therefore also simplify this as to just n and we could say this runs n * c2 times, what

116
00:08:51,000 --> 00:08:59,070
is c2? c2 was our constant number of executions we have inside of the for loop, this here runs two

117
00:08:59,070 --> 00:09:04,470
times but it only runs once per loop iteration you could say right, so this if loop is executed twice

118
00:09:04,500 --> 00:09:10,020
because the for loop runs twice because we have an array of three elements but this if statement of course

119
00:09:10,050 --> 00:09:13,290
only runs once per loop iteration.

120
00:09:13,350 --> 00:09:19,040
So we have that constant value of 1 inside of the loop, for every loop iteration,

121
00:09:19,050 --> 00:09:24,450
this code here happens to run once. So why am I doing it like this?

122
00:09:24,450 --> 00:09:30,690
Because what hopefully gets obvious here is that the constants don't really matter to us,

123
00:09:30,870 --> 00:09:36,870
it's not interesting whether we have some code that runs exactly once at the beginning and the end independent

124
00:09:36,870 --> 00:09:42,350
from the input size, we want to measure how long this algorithm takes,

125
00:09:42,390 --> 00:09:49,680
so how many operations it executes based on the number of items we feed into it.

126
00:09:49,770 --> 00:09:56,130
So if here we feed in one million items and this line still only executes once, then this line is simply

127
00:09:56,130 --> 00:09:57,560
not interesting for us,

128
00:09:57,600 --> 00:10:01,830
this does not make the algorithm any better or worse,

129
00:10:01,890 --> 00:10:08,670
instead what matters to us are the parts in the algorithm that depend on the input size because that

130
00:10:08,670 --> 00:10:11,480
is what will scale up or down.

131
00:10:11,490 --> 00:10:16,760
So here, we can simplify this to just n in the end because this loop

132
00:10:16,770 --> 00:10:26,070
you could say combined with this line runs n times, if we feed in an array of three elements, this here

133
00:10:26,070 --> 00:10:32,430
in the end goes through all the items in the array, so it runs three times you could say. Of course the

134
00:10:32,430 --> 00:10:38,430
loop technically runs n-1 times but for a large array of a million items, the -1 really

135
00:10:38,430 --> 00:10:39,910
is not interesting.

136
00:10:39,960 --> 00:10:46,070
So this simply depends linearly on the length of the array,

137
00:10:46,080 --> 00:10:54,810
so here we would say we have linear time complexity and we typically express this in this big O notation

138
00:10:54,810 --> 00:11:00,020
which means we write a big O and then in brackets, the input size,

139
00:11:00,120 --> 00:11:04,740
so n. So that simply means our algorithm here depends

140
00:11:04,740 --> 00:11:12,300
in a linear way on the number of inputs. If we feed in more items, it will take longer but it grows in

141
00:11:12,300 --> 00:11:13,690
a linear way,

142
00:11:13,740 --> 00:11:21,390
so feeding in 1000 elements takes around 1000 times as long as it takes for one item.

143
00:11:21,390 --> 00:11:24,130
It does not grow exponentially or slower,

144
00:11:24,240 --> 00:11:30,330
it grows at a linear array, so the more items we feed in, the longer it will take but it will grow in a

145
00:11:30,330 --> 00:11:31,830
linear way.

146
00:11:31,830 --> 00:11:38,370
Now let's maybe have a look at the second approach here to see a different example, a different time

147
00:11:38,370 --> 00:11:42,450
complexity and how this notation helps us compare algorithms.
