1
00:00:00,040 --> 00:00:04,800
Okay, so we're asking who went to Manchester University and we're going to retrieve 20 documents and
好的，我们要询问谁去了曼彻斯特大学，我们将检索 20 份文件

2
00:00:04,800 --> 00:00:05,680
see how we do.
看看我们怎么做。

3
00:00:06,000 --> 00:00:12,200
And I do believe if you do this before we do the semantic chunking and you try it with the prior version,
我相信如果你在我们进行语义分块之前这样做并且你尝试使用之前的版本，

4
00:00:12,200 --> 00:00:15,800
it doesn't get it right and it's lower than than ten, the top ten that we pull.
它做得不对，而且低于我们拉的前十名。

5
00:00:15,840 --> 00:00:19,040
But as of now, because we've done the semantic chunking, it doesn't do terribly.
但到目前为止，因为我们已经完成了语义分块，所以效果还不是很好。

6
00:00:19,080 --> 00:00:20,320
It does bring it back.
它确实把它带回来了。

7
00:00:20,360 --> 00:00:22,520
It's the fourth document along.
这是第四份文件。

8
00:00:22,720 --> 00:00:24,680
And in fact that is that is index four.
事实上，这就是索引四。

9
00:00:24,720 --> 00:00:30,360
It actually means it's the fifth document along because uh, zero is the first document.
它实际上意味着它是第五个文档，因为呃，零是第一个文档。

10
00:00:30,560 --> 00:00:35,600
So the fifth document down in the stack contained the word Manchester.
因此，堆栈中的第五个文档包含单词曼彻斯特。

11
00:00:35,760 --> 00:00:39,920
Uh, so, uh, it did find it eventually, but it was quite far down.
呃,所以呃,最终还是找到了,但是还差得很远.

12
00:00:39,960 --> 00:00:44,320
Now, what happens if we rerank we call the rerank to reorder things.
现在，如果我们重新排序会发生什么，我们称之为重新排序以重新排序。

13
00:00:44,320 --> 00:00:47,400
You can see that it's put the fifth document at the top now.
您可以看到它现在将第五个文档放在顶部。

14
00:00:47,560 --> 00:00:50,320
So now if we get it to say where is Manchester?
那么现在如果我们得到它说曼彻斯特在哪里？

15
00:00:50,360 --> 00:00:54,040
You'll see that Manchester is in the very top spot.
您会发现曼彻斯特位居榜首。

16
00:00:54,240 --> 00:01:00,120
It was able to sift all of these documents and shove Manchester right up to the top.
它能够筛选所有这些文件，并将曼彻斯特推上榜首。

17
00:01:00,240 --> 00:01:04,000
And so let's see what is in the content of this first document.
让我们看看第一个文档的内容是什么。

18
00:01:04,160 --> 00:01:07,010
You can see that this is about Jessica Liu.
你可以看到这是关于Jessica Liu的。

19
00:01:07,370 --> 00:01:08,890
I think Jessica Yu is Jessica Liu.
我认为余洁西卡就是刘洁西卡。

20
00:01:08,930 --> 00:01:09,450
That's right.
这是正确的。

21
00:01:09,450 --> 00:01:16,410
Is the name of the person who was, uh, described as going to University of Manchester.
是那个被描述为去曼彻斯特大学的人的名字。

22
00:01:16,570 --> 00:01:18,610
So that's working very nicely.
所以效果非常好。

23
00:01:18,810 --> 00:01:19,330
Okay.
好的。

24
00:01:19,690 --> 00:01:23,090
Uh, then everything else is just going to be so easy.
呃，那么其他一切都会变得很容易。

25
00:01:23,130 --> 00:01:28,890
Fetch context is just something which will take the unranked version and then rerank it.
获取上下文只是获取未排名版本然后重新排名的东西。

26
00:01:28,890 --> 00:01:32,530
And it's designed to sound just like the fetch context that we had before.
它的设计听起来就像我们之前的获取上下文一样。

27
00:01:32,930 --> 00:01:34,690
And now we're really in the end game.
现在我们真的进入了最后的阶段。

28
00:01:34,690 --> 00:01:38,250
It's time to to answer the question system prompt.
是时候回答系统提示的问题了。

29
00:01:38,290 --> 00:01:41,330
You're a knowledgeable, friendly assistant representing the company.
你是代表公司的知识渊博、友善的助理。

30
00:01:41,330 --> 00:01:41,930
And sure.
当然。

31
00:01:41,930 --> 00:01:44,850
Um, you're chatting about ensure Elm now.
嗯，您现在正在谈论确保 Elm。

32
00:01:45,050 --> 00:01:49,090
I hope you remember one of the points was about improving your prompts.
我希望您记住其中一点是关于改进您的提示。

33
00:01:49,090 --> 00:01:52,050
And so I thought, well, I should do my own take my own advice.
所以我想，好吧，我应该采取我自己的建议。

34
00:01:52,050 --> 00:01:53,250
So I put in here.
所以我就放在这里了。

35
00:01:53,250 --> 00:01:57,450
Your answer will be evaluated for accuracy, relevance and completeness.
我们将评估您的答案的准确性、相关性和完整性。

36
00:01:57,450 --> 00:02:01,610
So make sure it answers the question and fully answers it and blah blah blah.
因此，请确保它回答了问题并完全回答了问题等等。

37
00:02:01,850 --> 00:02:08,910
Like, I'm really trying to hammer at the LMS so that it does a good job in all three dimensions, and
就像，我真的在努力改进 LMS，以便它在所有三个维度上都做得很好，并且

38
00:02:08,910 --> 00:02:13,510
telling it that it's going to be evaluated is like, uh, you know, hopefully really, really makes
告诉它它将被评估就像，呃，你知道，希望真的，真的让

39
00:02:13,510 --> 00:02:17,230
it focus, pay attention to, to to those objectives.
它聚焦、关注、关注这些目标。

40
00:02:17,590 --> 00:02:19,550
And then we give it some context.
然后我们给它一些背景信息。

41
00:02:19,590 --> 00:02:21,750
With this context, please answer these as question.
在此背景下，请回答这些问题。

42
00:02:21,790 --> 00:02:24,230
Be accurate, relevant and complete.
准确、相关且完整。

43
00:02:24,270 --> 00:02:27,790
There's no harm in being repetitive in system prompts as well as in courses.
在系统提示和课程中重复并没有什么坏处。

44
00:02:28,030 --> 00:02:28,910
Uh uh.
呃呃。

45
00:02:29,110 --> 00:02:33,030
So and you'll notice that this otherwise is very similar format to when we use lang chain.
因此，您会注意到，这与我们使用 lang 链时的格式非常相似。

46
00:02:33,030 --> 00:02:34,030
Same same kind of thing.
一样的东西。

47
00:02:34,030 --> 00:02:35,430
I've got like a little tag in here.
我这里有一个小标签。

48
00:02:35,430 --> 00:02:36,670
That's my system prompt.
这是我的系统提示。

49
00:02:37,030 --> 00:02:39,790
And, and this, this is the, the extent of it.
而且，这，这就是它的范围。

50
00:02:39,790 --> 00:02:42,870
This is the, the thing that's going to turn this into messages.
这就是将其转化为消息的东西。

51
00:02:43,030 --> 00:02:47,270
Um, I take a question, I take the history and I take some chunks.
嗯，我提出一个问题，了解历史，并了解一些内容。

52
00:02:47,590 --> 00:02:50,030
I build my context.
我构建我的背景。

53
00:02:50,070 --> 00:02:52,350
It's just joining together all the different chunks.
它只是将所有不同的块连接在一起。

54
00:02:52,550 --> 00:03:00,070
I call my system prompt format changing, flipping this context field here for the context that's passed
我称我的系统提示格式更改为传递的上下文，在此翻转此上下文字段

55
00:03:00,070 --> 00:03:00,750
in here.
在这里。

56
00:03:01,350 --> 00:03:04,510
And then I return role system for the system prompt.
然后我返回角色系统以获取系统提示。

57
00:03:04,710 --> 00:03:08,350
I add in the history and the user for the user's question.
我添加了用户问题的历史记录和用户。

58
00:03:08,390 --> 00:03:12,000
That makes our messages for our Rag Pipeline.
这就是我们为 Rag Pipeline 提供的信息。

59
00:03:12,520 --> 00:03:13,400
Super easy.
超级简单。

60
00:03:13,840 --> 00:03:17,960
Final rag trick to do is query rewriting.
最后要做的技巧是查询重写。

61
00:03:17,960 --> 00:03:21,840
One more thing to do, which is one where we take the user's question.
还有一件事要做，这就是我们回答用户问题的地方。

62
00:03:21,840 --> 00:03:28,600
We don't just take the question as it is, we try rewriting it first to see whether our LLM can do better.
我们不会直接接受问题本身，我们会先尝试重写它，看看我们的法学硕士是否可以做得更好。

63
00:03:28,840 --> 00:03:32,960
And so here we have query rewriting Onshow.
所以这里我们有查询重写 Onshow。

64
00:03:33,080 --> 00:03:37,040
And now you can see something that you're hopefully super used to.
现在您可以看到一些您希望超级习惯的东西。

65
00:03:37,080 --> 00:03:40,560
Now this is just an example of a function that calls an LLM.
现在这只是调用 LLM 的函数的示例。

66
00:03:40,600 --> 00:03:41,840
There's no structured outputs.
没有结构化的输出。

67
00:03:41,840 --> 00:03:44,240
This time it could not be easier.
这一次再简单不过了。

68
00:03:44,240 --> 00:03:48,400
We describe a message you're in conversation with the user.
我们描述您正在与用户对话的消息。

69
00:03:48,400 --> 00:03:50,280
This is the history of your conversation.
这是你们谈话的历史记录。

70
00:03:50,280 --> 00:03:51,800
This is the user's question.
这是用户的问题。

71
00:03:52,080 --> 00:03:58,160
Respond with a single refined question that will be used to search the knowledge base.
回答一个将用于搜索知识库的精致问题。

72
00:03:58,520 --> 00:03:59,120
Important.
重要的。

73
00:03:59,120 --> 00:04:00,920
Respond only with the query.
仅响应查询。

74
00:04:00,960 --> 00:04:02,800
Nothing else you can imagine.
没有其他你能想象的了。

75
00:04:02,800 --> 00:04:07,440
I had to do this a few times because it kept wanting to tell me more, ask more, follow on questions
我不得不这样做几次，因为它一直想告诉我更多，问更多，跟进问题

76
00:04:07,440 --> 00:04:08,760
and things, which is not the point.
和事情，这不是重点。

77
00:04:08,760 --> 00:04:12,600
So you have to practice and experiment with these prompts and try each one separately.
因此，您必须练习和试验这些提示，并分别尝试每一个提示。

78
00:04:12,880 --> 00:04:19,210
And then the code is so simple I'm using light LM again, which is similar to chat completions.
然后代码非常简单，我再次使用轻型 LM，这类似于聊天完成。

79
00:04:19,210 --> 00:04:21,290
Create model is the model.
创建模型就是模型。

80
00:04:21,290 --> 00:04:22,570
That's the message.
这就是信息。

81
00:04:22,570 --> 00:04:27,210
And then I return response zero dot message content.
然后我返回响应零点消息内容。

82
00:04:27,370 --> 00:04:28,450
That's it.
就是这样。

83
00:04:28,690 --> 00:04:31,090
So that is rewrite query.
这就是重写查询。

84
00:04:31,130 --> 00:04:32,570
It takes a question.
这需要一个问题。

85
00:04:32,570 --> 00:04:34,290
It optionally takes history.
它可以选择获取历史记录。

86
00:04:34,290 --> 00:04:37,250
And then it responds with the new query.
然后它会响应新的查询。

87
00:04:37,250 --> 00:04:41,250
So what happens if we call that with who won the Ieti award.
那么，如果我们将其称为“Ieti 奖”的获得者，会发生什么呢？

88
00:04:41,250 --> 00:04:42,730
We get back something pretty similar.
我们得到了非常相似的东西。

89
00:04:42,730 --> 00:04:45,170
Who was the recipient of the Ieti award?
Ieti 奖的获得者是谁？

90
00:04:45,170 --> 00:04:47,410
It's just changed one to recipient.
它只是将一位收件人更改为收件人。

91
00:04:47,410 --> 00:04:48,610
But you know, whatever.
但你知道，无论如何。

92
00:04:48,610 --> 00:04:49,770
Sometimes it can do better.
有时它可以做得更好。

93
00:04:49,770 --> 00:04:52,730
And particularly if there's history, it can do better.
尤其是如果有历史，它可以做得更好。

94
00:04:53,210 --> 00:04:57,650
So now we can turn this into an answer question function.
现在我们可以把它变成一个回答问题的函数。

95
00:04:57,650 --> 00:04:59,850
This is the other key function we had to write.
这是我们必须编写的另一个关键函数。

96
00:04:59,850 --> 00:05:01,170
We had to write answer question.
我们必须写出答案问题。

97
00:05:01,170 --> 00:05:04,050
And we had to write uh fetch content.
我们必须编写呃获取内容。

98
00:05:04,050 --> 00:05:06,010
Those were fetch context.
这些是获取上下文。

99
00:05:06,050 --> 00:05:09,010
Those were the two functions that we implemented before.
这是我们之前实现的两个功能。

100
00:05:09,170 --> 00:05:11,690
So this is the answer question function.
所以这就是回答问题的函数。

101
00:05:11,690 --> 00:05:13,970
It takes a question and it takes the history.
这需要一个问题，也需要历史。

102
00:05:14,250 --> 00:05:15,530
And look at how simple it is.
看看它是多么简单。

103
00:05:15,570 --> 00:05:17,970
We uh we rewrite the query.
我们呃我们重写查询。

104
00:05:18,630 --> 00:05:20,270
We then get the chunks.
然后我们得到块。

105
00:05:20,270 --> 00:05:27,390
We make the rag messages, we call completion for the model and the messages, and we return the response.
我们制作碎布消息，调用模型和消息的完成，然后返回响应。

106
00:05:27,750 --> 00:05:28,470
That's it.
就是这样。

107
00:05:28,630 --> 00:05:32,390
That is rag with no lang chain that nothing more than this.
那是没有郎链的破布，仅此而已。

108
00:05:32,390 --> 00:05:36,790
If you're expecting something that was hundreds of lines of code and lots of rag stuff.
如果您期望的东西是数百行代码和大量破烂的东西。

109
00:05:36,830 --> 00:05:37,470
It's not.
它不是。

110
00:05:37,510 --> 00:05:39,030
It's just a few lines of code.
这只是几行代码。

111
00:05:39,030 --> 00:05:41,390
Rewrite the query, get the chunks.
重写查询，获取块。

112
00:05:41,430 --> 00:05:42,710
Get the messages.
获取消息。

113
00:05:42,750 --> 00:05:45,390
Call OpenAI, return the response.
调用OpenAI，返回响应。

114
00:05:45,830 --> 00:05:46,670
There we go.
我们开始吧。

115
00:05:46,790 --> 00:05:47,790
Let's ask it.
我们来问问吧。

116
00:05:47,830 --> 00:05:50,590
Who won the I o t award?
谁获得了 I o t 奖？

117
00:05:51,630 --> 00:05:52,550
And there we go.
我们就这样吧。

118
00:05:52,590 --> 00:05:57,950
We get Maxine Thompson won the I o t award in 2023.
Maxine Thompson 荣获 2023 年物联网奖。

119
00:05:57,990 --> 00:06:01,670
It's a response you can see there that happens to be accurate.
您可以在那里看到一个恰好准确的响应。

120
00:06:01,710 --> 00:06:03,070
It happens to be complete.
它恰好是完整的。

121
00:06:03,110 --> 00:06:05,750
It's got the last name and it happens to be relevant.
它有姓氏并且恰好是相关的。

122
00:06:05,750 --> 00:06:08,150
It hasn't given us any spurious information.
它没有给我们任何虚假信息。

123
00:06:08,150 --> 00:06:11,630
And you can see that there was some nice reranking that went on.
您可以看到正在进行一些不错的重新排名。

124
00:06:11,790 --> 00:06:16,190
Uh, and uh, yeah, these were all the bits of relevant context.
呃，呃，是的，这些都是相关背景的内容。

125
00:06:16,230 --> 00:06:20,270
You can see one other thing that we're going to do when we turn this into code, which is once we've
当我们把它变成代码时，你可以看到我们要做的另一件事，那就是一旦我们

126
00:06:20,270 --> 00:06:25,960
done this reranking, we don't need to pass all of these chunks to the final LM, because now we can
完成此重新排名后，我们不需要将所有这些块传递给最终的 LM，因为现在我们可以

127
00:06:25,960 --> 00:06:29,520
be pretty sure that the bottom ten are going to be less relevant.
可以肯定的是，后十名的相关性会降低。

128
00:06:29,600 --> 00:06:33,800
So once we've done this, we might as well chop it off at some point and be more efficient with our
因此，一旦我们做到了这一点，我们不妨在某个时候将其砍掉，并提高我们的效率。

129
00:06:33,800 --> 00:06:34,080
LM.
LM。

130
00:06:34,120 --> 00:06:37,000
That's one of the cool things about using Reranking.
这是使用重新排名的最酷的事情之一。

131
00:06:37,320 --> 00:06:40,640
Okay, let's, uh, we're almost ready for showtime.
好吧，让我们，呃，我们快要准备好表演了。

132
00:06:40,760 --> 00:06:41,800
Let's ask the question.
让我们问这个问题。

133
00:06:41,800 --> 00:06:44,040
Who went to Manchester University?
谁去了曼彻斯特大学？

134
00:06:44,240 --> 00:06:47,280
Uh, and, uh, we get back the answer.
呃，呃，我们得到了答案。

135
00:06:47,320 --> 00:06:50,320
Jessica Liu attended the University of Manchester.
杰西卡刘就读于曼彻斯特大学。

136
00:06:50,480 --> 00:06:51,600
It works.
有用。

137
00:06:51,600 --> 00:06:53,040
It works nicely.
效果很好。

138
00:06:53,200 --> 00:06:56,400
And, uh, you you can check that it doesn't work before.
而且，呃，你可以检查一下它以前是否不起作用。

139
00:06:56,440 --> 00:06:57,800
At least it didn't work for me.
至少对我来说没用。

140
00:06:58,040 --> 00:07:00,560
Now, one thing I'll tell you is that this doesn't always work.
现在，我要告诉你的一件事是，这并不总是有效。

141
00:07:00,720 --> 00:07:02,400
I discovered initially that it.
我最初发现它。

142
00:07:02,400 --> 00:07:03,200
That it didn't work.
那是行不通的。

143
00:07:03,200 --> 00:07:04,920
It did a bad job of working.
它的工作做得很糟糕。

144
00:07:04,960 --> 00:07:11,160
And to my great disappointment, I discovered the reason that it sometimes doesn't work is because of
令我非常失望的是，我发现它有时不起作用的原因是

145
00:07:11,160 --> 00:07:12,840
query rewriting.
查询重写。

146
00:07:13,000 --> 00:07:18,920
Query rewriting sometimes counts against us because when we call query rewriting, it tends to like
查询重写有时对我们不利，因为当我们调用查询重写时，它往往喜欢

147
00:07:18,920 --> 00:07:21,360
to add the word ensure elm in there.
在其中添加“确保榆树”一词。

148
00:07:21,400 --> 00:07:25,940
It tends to like say saying who from ensure elm blah blah blah blah blah blah blah.
它往往喜欢说谁来自确保榆树等等等等。

149
00:07:26,340 --> 00:07:30,980
Even though I added in here, don't mention the company name unless blah blah blah.
即使我在这里添加了，也不要提及公司名称，除非等等。

150
00:07:31,220 --> 00:07:35,500
I had to put that in there and it stopped saying that, but it still does sometimes.
我不得不把它放在那里，它就不再这么说了，但有时它仍然如此。

151
00:07:35,540 --> 00:07:41,540
And when it does that, the rag retrieval is less reliable because it surfaces tons of documents that
当这样做时，抹布检索的可靠性就会降低，因为它会显示大量文档，而这些文档

152
00:07:41,540 --> 00:07:48,180
are generic documents about ensure elm and not and Jessica's document like goes goes all the way down
是关于确保 elm 和 not 的通用文档，杰西卡的文档就像一直向下

153
00:07:48,180 --> 00:07:50,380
and sometimes it doesn't get reranked up.
有时它不会被重新排名。

154
00:07:50,500 --> 00:07:55,700
And so it becomes because of reranking it becomes a bit less reliable.
因此，由于重新排名，它变得不太可靠。

155
00:07:55,820 --> 00:08:00,180
Sorry, not because of Reranking because of rewriting, it becomes a bit less reliable.
抱歉，不是因为Reranking，因为重写，变得不太可靠。

156
00:08:00,220 --> 00:08:01,380
I hope you're following me here.
我希望你能在这里关注我。

157
00:08:01,420 --> 00:08:06,500
Rewriting the query, unfortunately, has added in more words like ensure elm.
不幸的是，重写查询添加了更多单词，例如“ensure elm”。

158
00:08:06,500 --> 00:08:12,820
That ends up diluting the rag retrieval and meaning that the relevant context doesn't get surfaced.
这最终会削弱抹布检索，并意味着相关上下文不会浮出水面。

159
00:08:12,980 --> 00:08:14,060
That's a problem.
这是一个问题。

160
00:08:14,060 --> 00:08:21,460
And that that final problem we will also solve in the last step of this, which is now we're going to
最后一个问题我们也将在最后一步中解决，现在我们要

161
00:08:21,460 --> 00:08:27,900
take all of this and we're going to implement it in Python modules in an implementation directory.
完成所有这些，我们将在实现目录中的 Python 模块中实现它。

162
00:08:27,900 --> 00:08:30,140
And then we're going to take it for a test drive.
然后我们将对其进行试驾。