1
00:00:00,040 --> 00:00:04,760
Well, just before we bring up our whizzy interface, let me just show you a less whizzy one.

2
00:00:04,880 --> 00:00:08,880
Uh, what's broken with day three's notebook that we looked at before?

3
00:00:09,280 --> 00:00:15,280
You remember that we had our very nice, elegant, simple five-line question answering that did just

4
00:00:15,280 --> 00:00:17,520
fine, and we brought up a nice UI on it.

5
00:00:17,520 --> 00:00:18,480
And here it is.

6
00:00:18,520 --> 00:00:21,040
And we could say, who is Avery?

7
00:00:21,440 --> 00:00:23,400
And it's gonna do just fine.

8
00:00:23,400 --> 00:00:25,560
It's gonna, of course, do a RAG lookup.

9
00:00:25,560 --> 00:00:27,560
It's going to give us a great answer.

10
00:00:28,040 --> 00:00:29,320
Well, check this out.

11
00:00:29,320 --> 00:00:33,440
If I say how much is her salary?

12
00:00:34,000 --> 00:00:37,120
We are going to get something odd back.

13
00:00:37,600 --> 00:00:40,920
Samantha Green's current salary is 70,000.

14
00:00:41,000 --> 00:00:41,960
What is that about?

15
00:00:42,000 --> 00:00:44,640
Two things have gone on that are a problem here.

16
00:00:44,640 --> 00:00:49,680
The two things that I showed you we fixed in the proper module implementation are the

17
00:00:49,680 --> 00:00:52,840
two problems with this simplistic version.

18
00:00:52,840 --> 00:00:56,640
First of all, it wasn't even looking at the history; that didn't get passed to the LLM.

19
00:00:56,640 --> 00:01:02,440
So when I say "her" there, it has no idea that I'm still referring to Avery. But even if it had known,

20
00:01:02,440 --> 00:01:08,740
it would still have had a problem, because this is the text that's being used for the RAG lookup.

21
00:01:08,740 --> 00:01:12,740
And because of that, it's just picking some random document with salary in it.

22
00:01:12,740 --> 00:01:15,380
The first one happens to be Samantha Green.

23
00:01:15,380 --> 00:01:21,540
And that's why we're getting a red herring of an answer, something that seems

24
00:01:21,540 --> 00:01:23,340
to be a completely different topic.

25
00:01:23,580 --> 00:01:25,020
So those are the two problems.

26
00:01:25,020 --> 00:01:30,860
We're not maintaining conversation history when we call the LLM, but also we're looking up context based

27
00:01:30,860 --> 00:01:31,620
off this.

28
00:01:31,620 --> 00:01:36,540
So, of the two fixes that I just showed you in that longer version, the first is to build up the proper

29
00:01:36,540 --> 00:01:40,740
history and pass that into LangChain using that LangChain conversion function.

30
00:01:40,740 --> 00:01:44,060
And secondly, I'm not just looking up the context on this.

31
00:01:44,100 --> 00:01:50,420
I'm looking up the context on the combination of all of this text so that it would in fact include Avery

32
00:01:50,420 --> 00:01:51,020
in there.

33
00:01:51,180 --> 00:01:52,140
That's why I did it.

34
00:01:52,140 --> 00:01:53,580
Now you see it broken.

35
00:01:53,620 --> 00:01:56,900
Let's see how it looks in the new Gradio UI.

36
00:01:56,940 --> 00:02:01,860
And so I bring up a new terminal with Ctrl+Shift+Backtick.

37
00:02:02,340 --> 00:02:03,140
Where is it?

38
00:02:03,740 --> 00:02:04,540
There it is.

39
00:02:04,540 --> 00:02:05,100
Okay.

40
00:02:05,140 --> 00:02:13,450
And now I go into week five and now I do uv run app.py, and it's going to spring up a new window,

41
00:02:13,770 --> 00:02:14,450
I hope.

42
00:02:14,490 --> 00:02:19,290
After a small pause while Gradio chugs into gear, there it is.

43
00:02:19,290 --> 00:02:19,930
Bam!

44
00:02:19,970 --> 00:02:20,850
Here we go!

45
00:02:20,890 --> 00:02:24,130
The Insurellm expert assistant.

46
00:02:24,530 --> 00:02:29,730
A nice little UI, and we've got a conversation on the left, and "retrieved

47
00:02:29,730 --> 00:02:31,730
context will appear here" on the right.

48
00:02:31,730 --> 00:02:32,530
Isn't that cool?

49
00:02:32,810 --> 00:02:40,610
All right, so obviously I could say, who is Avery? And we'll see how it does. Okay.

50
00:02:40,650 --> 00:02:41,330
What have we got?

51
00:02:41,330 --> 00:02:42,290
So we get a good answer.

52
00:02:42,330 --> 00:02:43,530
Of course we know this answer.

53
00:02:43,530 --> 00:02:50,850
And on the right here, under relevant context, we have the information coming back, along with its source,

54
00:02:50,970 --> 00:02:52,450
the place that it came from.

55
00:02:52,570 --> 00:02:57,730
And this white text is what was actually provided in the context to the LLM.

56
00:02:57,930 --> 00:03:03,450
The orange is just a note of the metadata showing us the source, where it came from.

57
00:03:03,450 --> 00:03:07,330
But this was not provided to the LLM, just these white blobs.

58
00:03:07,330 --> 00:03:08,410
Here it all is.

59
00:03:08,490 --> 00:03:11,230
This is all the information that was passed about Avery.

60
00:03:11,270 --> 00:03:12,310
How many chunks do we have?

61
00:03:12,310 --> 00:03:13,550
We have one, two.

62
00:03:13,590 --> 00:03:14,270
Three.

63
00:03:14,310 --> 00:03:14,830
Four.

64
00:03:15,230 --> 00:03:16,110
Five.

65
00:03:16,430 --> 00:03:17,190
Five.

66
00:03:17,310 --> 00:03:20,630
Five chunks were retrieved because we set K to five.

67
00:03:20,750 --> 00:03:21,550
Beautiful.

68
00:03:21,950 --> 00:03:24,630
And now I can come back here and I can say.

69
00:03:24,990 --> 00:03:27,870
And what is her salary?

70
00:03:28,310 --> 00:03:31,470
And we hope that we will see a combination of things.

71
00:03:31,790 --> 00:03:34,310
Avery Lancaster's current salary is 2.25.

72
00:03:34,350 --> 00:03:39,190
We see that the relevant context it's retrieved is related to this text and this text.

73
00:03:39,190 --> 00:03:41,190
So it's still all about Avery.

74
00:03:41,230 --> 00:03:44,030
The current salary is right up there in the second chunk.

75
00:03:44,190 --> 00:03:52,070
And so the chat was also sent with history to GPT-4.1 nano.

76
00:03:52,070 --> 00:03:55,070
And it was able to say we're talking about Avery for sure.

77
00:03:55,230 --> 00:03:56,550
And I've got the context.

78
00:03:56,550 --> 00:03:57,870
I know her salary.

79
00:03:57,870 --> 00:03:59,310
That's how it responds.

80
00:03:59,510 --> 00:04:03,110
This is RAG with history, with lookup.

81
00:04:03,190 --> 00:04:04,190
Fantastic.

82
00:04:04,350 --> 00:04:05,150
It's working.

83
00:04:05,150 --> 00:04:06,990
Now let me go back into Cursor.

84
00:04:07,110 --> 00:04:14,970
I just want to go into the knowledge base, into the employees, and I want to find one employee document

85
00:04:14,970 --> 00:04:15,570
in particular.

86
00:04:15,570 --> 00:04:18,050
I want to find Maxine Thompson.

87
00:04:18,410 --> 00:04:19,290
Here she is.

88
00:04:19,290 --> 00:04:20,410
Here is Maxine Thompson.

89
00:04:20,410 --> 00:04:22,650
Let's bring that up in a nice Markdown style.

90
00:04:22,650 --> 00:04:23,410
Let's have a look.

91
00:04:23,450 --> 00:04:25,130
This is her HR record.

92
00:04:25,130 --> 00:04:26,850
We'll get the terminal out of the way there.

93
00:04:27,090 --> 00:04:34,610
If we look around here, you'll see that she was recognized as the Insurellm Innovator of the Year

94
00:04:34,610 --> 00:04:43,930
in 2023, receiving a prestigious IIOTY award, the Insurellm Innovator of the Year award in 2023.

95
00:04:43,970 --> 00:04:45,090
Good for Maxine.

96
00:04:45,130 --> 00:04:52,130
Now, I wonder whether or not our RAG pipeline is going to be able to retrieve that.

97
00:04:52,370 --> 00:04:54,090
That is a challenging one.

98
00:04:54,290 --> 00:04:55,730
Let's find out.

99
00:04:55,930 --> 00:04:58,530
Let's go back to the terminal.

100
00:04:58,570 --> 00:05:01,050
Let's start our Gradio app up.

101
00:05:01,410 --> 00:05:02,410
Here we go.

102
00:05:05,050 --> 00:05:07,450
And then we're going to ask the question.

103
00:05:09,490 --> 00:05:18,150
We are going to say, okay then, tell us who won the prestigious IIOTY award.

104
00:05:19,110 --> 00:05:20,150
Give it that.

105
00:05:20,190 --> 00:05:21,630
Let's see what it can do.

106
00:05:21,830 --> 00:05:24,630
It says Maxine was recognized.

107
00:05:24,670 --> 00:05:26,110
It gets that right.

108
00:05:26,150 --> 00:05:32,430
And you can see over on the right what's going on, and we'll see right there that

109
00:05:32,430 --> 00:05:35,790
it found the details in here somewhere.

110
00:05:35,990 --> 00:05:36,590
There we go.

111
00:05:36,630 --> 00:05:38,670
It's because she received the prestigious award.

112
00:05:38,670 --> 00:05:43,230
It was also mentioned in her promotion, and it's got it right there with her name.

113
00:05:43,870 --> 00:05:47,590
But you may notice something that's not perfect about this.

114
00:05:47,710 --> 00:05:52,670
You may notice that it says Maxine, but it doesn't say Maxine Thompson.

115
00:05:52,670 --> 00:05:54,830
It doesn't realize her full name.

116
00:05:54,830 --> 00:06:02,510
And that's this slightly tiresome thing: this chunk that it found here mentions that

117
00:06:02,510 --> 00:06:07,990
she received that award in this rather nice small chunk, but it doesn't have her full name in that

118
00:06:07,990 --> 00:06:12,830
chunk, because her full name was at the top of the HR document and not in the middle.

119
00:06:12,830 --> 00:06:19,620
And this is an example of a nasty little problem with RAG, with chunking: the chunk might answer

120
00:06:19,620 --> 00:06:25,300
the question, but it might not contain some crucial context from elsewhere in the document, meaning

121
00:06:25,300 --> 00:06:28,580
that you get back an okay but not perfect answer.

122
00:06:28,580 --> 00:06:31,060
So that's an interesting one for us to observe.

123
00:06:31,260 --> 00:06:34,340
It's okay, but it's not as complete as it could be.

124
00:06:34,380 --> 00:06:35,780
And there are other problems too.

125
00:06:35,820 --> 00:06:41,300
There are some problems with the way that I did that history lookup, in that slightly hacky way.

126
00:06:41,500 --> 00:06:45,660
Suppose I start a new conversation and I say, who is Avery?

127
00:06:47,300 --> 00:06:48,700
Like this.

128
00:06:48,900 --> 00:06:50,060
Who is Avery?

129
00:06:51,180 --> 00:06:58,940
And we get back, on the right, the context, and an answer, and then say, and what did Avery do before?

130
00:06:59,860 --> 00:07:04,900
We'll get back some more conversation about Avery: before founding Insurellm, she worked as a senior

131
00:07:04,900 --> 00:07:07,900
product manager, all coming from similar chunks, working well.

132
00:07:07,900 --> 00:07:11,260
And now we'll say, who won the prestigious

133
00:07:13,300 --> 00:07:16,100
award, and see how it does.

134
00:07:16,900 --> 00:07:19,680
And strangely, it hasn't got the answer.

135
00:07:19,680 --> 00:07:20,600
It doesn't know.

136
00:07:20,960 --> 00:07:22,240
Why has that happened?

137
00:07:22,680 --> 00:07:26,400
Well, that's because I made that change.

138
00:07:26,400 --> 00:07:32,320
So that when I'm retrieving context, I combine this and this and this.

139
00:07:32,720 --> 00:07:40,160
And now the thing that I'm looking up in my data store, it's got too much other stuff other than IIOTY.

140
00:07:40,200 --> 00:07:41,840
It's got too much about Avery.

141
00:07:41,960 --> 00:07:45,760
And so if we look over on the right, it's still finding chunks about Avery.

142
00:07:45,800 --> 00:07:47,480
That's not very satisfactory.

143
00:07:47,480 --> 00:07:48,680
So we have a problem.

144
00:07:48,840 --> 00:07:54,800
The change that I made has made it better at looking up context that relates to the whole conversation,

145
00:07:54,800 --> 00:07:57,520
but in doing so, it's made it poorer

146
00:07:57,680 --> 00:08:03,440
if you then change the topic and want to talk about something completely different: it's still dredging

147
00:08:03,440 --> 00:08:07,320
up context from earlier in the conversation and that's no good.

148
00:08:08,000 --> 00:08:09,440
So how would you fix that?

149
00:08:09,480 --> 00:08:14,520
There are many ways to fix it, but some ways are better than others, and we will work on

150
00:08:14,520 --> 00:08:17,600
problems like this over the course of the next couple of days.

151
00:08:17,880 --> 00:08:24,220
But the real takeaway I have for you is that RAG is something that can be quite hit and miss.

152
00:08:24,220 --> 00:08:26,260
And we're going to really double down on this tomorrow.

153
00:08:26,260 --> 00:08:34,100
But it's important to recognize that RAG is this very experimental, really quite hacky approach

154
00:08:34,100 --> 00:08:36,460
of trying to look up relevant context.

155
00:08:36,460 --> 00:08:38,460
And it can frequently go wrong.

156
00:08:38,620 --> 00:08:41,740
You can get good examples of RAG working really, really well.

157
00:08:41,740 --> 00:08:46,180
Like it's surfacing that Maxine won the award.

158
00:08:46,460 --> 00:08:49,300
But you can also see that it didn't quite get her full name.

159
00:08:49,340 --> 00:08:50,620
It wasn't complete.

160
00:08:50,620 --> 00:08:56,740
And then after we've talked about Avery for a bit, it's not able to find out who won the IIOTY award.

161
00:08:56,740 --> 00:08:59,460
So it's quite easy to get it off track.

162
00:08:59,660 --> 00:09:06,140
So the right way to approach RAG is to treat each problem like this as a new problem

163
00:09:06,140 --> 00:09:06,900
to be solved.

164
00:09:06,900 --> 00:09:11,460
And it's a bit like whack-a-mole: you find something and then you fix it, and then you find another

165
00:09:11,460 --> 00:09:12,540
problem and you fix that.

166
00:09:12,540 --> 00:09:17,980
And sometimes in fixing one problem, you cause another problem, as we just have: as I fixed that

167
00:09:17,980 --> 00:09:21,100
history, I then created this new problem.

168
00:09:21,100 --> 00:09:28,080
So a lot about RAG is getting after these kinds of puzzles and making your RAG system better and better

169
00:09:28,080 --> 00:09:30,440
through a series of different techniques.

170
00:09:30,440 --> 00:09:32,360
That's what RAG is all about.

171
00:09:32,400 --> 00:09:37,440
And there's a technique to allow us to do that in a more scientific way.

172
00:09:37,560 --> 00:09:39,600
And that's what we'll be covering tomorrow.

173
00:09:39,640 --> 00:09:45,160
And in the meantime, before I leave the screen, the mission for you before we go to tomorrow is to

174
00:09:45,200 --> 00:09:46,480
bring up this user interface.

175
00:09:46,480 --> 00:09:47,880
uv run app.py.

176
00:09:47,920 --> 00:09:53,160
After doing a cd implementation, uv run ingest.py to populate your data store.

177
00:09:53,600 --> 00:09:58,880
Then do a uv run app.py and experiment and find the rough edges.

178
00:09:58,880 --> 00:10:01,760
Find what it's good at, find what it's bad at.

179
00:10:01,800 --> 00:10:03,400
Look at the chunks on the right.

180
00:10:03,440 --> 00:10:05,840
Understand why chunks are getting surfaced.

181
00:10:05,880 --> 00:10:13,480
Try and get really great examples of RAG doing an amazing job at showing deep expertise about information

182
00:10:13,480 --> 00:10:19,640
buried in the documents, but then also find cases where it foolishly is unable to answer something

183
00:10:19,640 --> 00:10:24,120
that's clearly there in the documents, and get a really good sense for what's working well.

184
00:10:24,280 --> 00:10:25,880
And what are the blind spots?

185
00:10:25,880 --> 00:10:30,470
What are the rough edges? Have that on our list, because we're going to get after them and fix them

186
00:10:30,470 --> 00:10:35,590
as we make our RAG better and better, starting tomorrow. And before tomorrow,

187
00:10:35,630 --> 00:10:40,990
I do hope that you're going to take a moment. If you're a wine drinker, then

188
00:10:40,990 --> 00:10:43,350
I'd say it deserves a nice glass of wine.

189
00:10:43,510 --> 00:10:45,230
Maybe even a glass of champagne.

190
00:10:45,590 --> 00:10:48,190
You have built a RAG product.

191
00:10:48,190 --> 00:10:48,990
You can do this.

192
00:10:48,990 --> 00:10:51,270
You can now write your own ingest.py.

193
00:10:51,390 --> 00:10:58,830
You can write your own app.py and answer.py and have something which is able to chunk,

194
00:10:58,870 --> 00:11:03,150
vectorize, and store a bunch of documents, and then you can ask questions on them.

195
00:11:03,150 --> 00:11:08,750
And it apparently has expertise across your entire knowledge base, and only you,

196
00:11:08,750 --> 00:11:13,190
you secretly know that it doesn't at any one time have knowledge of the whole knowledge base.

197
00:11:13,230 --> 00:11:20,430
It's doing this trick of using vector similarity to do a semantic search over your data, pluck out

198
00:11:20,470 --> 00:11:26,590
a few chunks of relevant context, shove them in the prompt, and get back an answer based on that knowledge

199
00:11:26,590 --> 00:11:27,390
each time.

200
00:11:27,590 --> 00:11:29,030
It's a trick.

201
00:11:29,070 --> 00:11:30,370
It's a conjuring trick.

202
00:11:30,410 --> 00:11:35,130
It works really well, and it gives the impression that you've got something that's an expert on your

203
00:11:35,130 --> 00:11:37,850
entire knowledge base, but it has some rough edges.

204
00:11:38,090 --> 00:11:42,850
So as a recap, you can now generate code, generate text with frontier models,

205
00:11:42,850 --> 00:11:44,970
with open source models. You can use tools.

206
00:11:44,970 --> 00:11:50,050
You can confidently choose the right model for your projects. And, added to the list,

207
00:11:50,090 --> 00:11:56,090
you can implement a RAG expert question answering system. And the secret, the dark secret,

208
00:11:56,130 --> 00:11:56,330
Just.

209
00:11:56,330 --> 00:12:01,290
just between you and me, is that it's actually quite easy.

210
00:12:01,290 --> 00:12:03,730
Next time though, we're going to get more disciplined.

211
00:12:03,770 --> 00:12:07,050
You'll be able to explain the importance of evaluations, evals.

212
00:12:07,050 --> 00:12:11,890
So critical. You're going to be able to measure the effectiveness of your RAG system.

213
00:12:12,130 --> 00:12:18,090
And we'll be able to experiment with things like encoders and chunking strategies and use our metrics

214
00:12:18,090 --> 00:12:23,730
use our evaluations to be able to gauge how much difference we're making and to be able to

215
00:12:23,770 --> 00:12:26,650
make decisions about different solutions we want to build.

216
00:12:26,890 --> 00:12:27,410
All right.

217
00:12:27,450 --> 00:12:28,130
Lots of stuff.

218
00:12:28,130 --> 00:12:29,810
Tomorrow it's going to be really, really great.

219
00:12:29,970 --> 00:12:31,370
Very much looking forward to it.

220
00:12:31,370 --> 00:12:32,410
I hope you are too.

221
00:12:32,450 --> 00:12:33,370
I will see you then.