1
00:00:00,080 --> 00:00:01,040
Okay, so here we are.

2
00:00:01,040 --> 00:00:05,080
We're going to write a better chat instead of returning bananas.

3
00:00:05,080 --> 00:00:11,600
It's actually going to call OpenAI or whichever model you like and return an answer from the LLM.

4
00:00:11,600 --> 00:00:13,360
But it's the same as far as Gradio is concerned.

5
00:00:13,360 --> 00:00:14,440
It's the same callback.

6
00:00:14,440 --> 00:00:15,240
It's calling.

7
00:00:15,280 --> 00:00:17,240
It doesn't know that we're calling an LLM.

8
00:00:17,280 --> 00:00:17,800
Okay.

9
00:00:18,040 --> 00:00:22,240
So first up, there's just a little line here that is not necessary.

10
00:00:22,240 --> 00:00:23,400
If you're using OpenAI.

11
00:00:23,440 --> 00:00:24,760
It doesn't make any difference.

12
00:00:24,760 --> 00:00:29,360
But it means that if you're using Gemini or some of the other models then it will definitely work.

13
00:00:29,520 --> 00:00:30,360
And it is.

14
00:00:30,360 --> 00:00:36,240
Do you remember I mentioned that that history variable that I showed you, it looked very similar to

15
00:00:36,280 --> 00:00:39,280
OpenAI's format, but it also had some extra fields in there.

16
00:00:39,280 --> 00:00:40,520
It had metadata.

17
00:00:40,520 --> 00:00:45,440
And that's OpenAI doesn't care if you add in extra fields, but some of the models do.

18
00:00:45,640 --> 00:00:52,960
So I'm just doing this one line here, which just replaces history with a new version of history that

19
00:00:52,960 --> 00:00:58,280
only has role and content in it and clears out metadata or anything else.

20
00:00:58,280 --> 00:01:00,040
And this makes no difference for OpenAI.

21
00:01:00,080 --> 00:01:04,950
But if you're using Gemini, it means it will work rather than give you an error that it doesn't like

22
00:01:04,950 --> 00:01:05,910
metadata in there.

23
00:01:06,390 --> 00:01:08,150
So that's just a small point.

24
00:01:08,190 --> 00:01:08,950
Look out for that.

25
00:01:08,950 --> 00:01:12,310
I have it all over the place just so that it'll always work on Gemini too.

26
00:01:12,630 --> 00:01:13,470
All right.

27
00:01:13,470 --> 00:01:15,230
And now we should know this.

28
00:01:15,230 --> 00:01:21,230
Well, we first need to to package everything together into the messages messages that we're going to

29
00:01:21,270 --> 00:01:22,790
want to send to OpenAI.

30
00:01:23,070 --> 00:01:24,310
And it's very simple.

31
00:01:24,350 --> 00:01:24,990
I press tab.

32
00:01:25,030 --> 00:01:26,350
This curse has done it for me.

33
00:01:26,550 --> 00:01:31,030
Uh, basically we just want this list of dictionaries.

34
00:01:31,030 --> 00:01:37,070
We want to start with role system content and the system message, because Gradio is just passing in

35
00:01:37,070 --> 00:01:38,390
the contents of the UI.

36
00:01:38,550 --> 00:01:40,790
So Gradio doesn't know that we have a system message.

37
00:01:40,790 --> 00:01:46,310
So we need to put that in to start with and then put in all of the history that Gradio sent in, and

38
00:01:46,310 --> 00:01:52,270
then end with the current user's prompt as role user content.

39
00:01:52,270 --> 00:01:56,430
This message, you see, we take all the history and then we shove in one more message, which is the

40
00:01:56,430 --> 00:01:58,110
user's message at the end.

41
00:01:58,430 --> 00:02:01,500
And so this is now got all of the messages in it.

42
00:02:01,540 --> 00:02:02,420
Great.

43
00:02:02,420 --> 00:02:08,180
And now we can say response is OpenAI create model is model messages.

44
00:02:08,180 --> 00:02:10,020
Is messages perfect.

45
00:02:10,300 --> 00:02:19,140
And then finally we return response dot choices zero dot message dot content the chat completions API.

46
00:02:19,420 --> 00:02:20,380
There we have it.

47
00:02:20,420 --> 00:02:21,300
Let's run.

48
00:02:21,300 --> 00:02:22,980
This is our new callback.

49
00:02:23,020 --> 00:02:24,420
It looks great.

50
00:02:24,540 --> 00:02:25,700
New update available.

51
00:02:25,700 --> 00:02:26,460
I'll get that later.

52
00:02:26,460 --> 00:02:26,740
Thank you.

53
00:02:26,780 --> 00:02:27,260
Cursor.

54
00:02:27,460 --> 00:02:37,660
Um, and now we will do um, chat interface that that same chat interface again and we'll see whether

55
00:02:37,660 --> 00:02:40,060
now we get the same user interface.

56
00:02:40,060 --> 00:02:41,820
Just just as easy as before.

57
00:02:41,860 --> 00:02:45,580
Gradio doesn't know that anything has changed, except we've got this, this new callback.

58
00:02:45,860 --> 00:02:51,140
And when we type in messages, we're hoping to see something better than bananas as the answer.

59
00:02:51,140 --> 00:02:51,740
Here we go.

60
00:02:51,780 --> 00:02:52,580
Are you ready for this?

61
00:02:53,020 --> 00:02:56,700
I run it, up comes the chat and I'm gonna say hi there.

62
00:02:57,060 --> 00:02:57,820
Drumroll.

63
00:02:58,380 --> 00:02:59,020
Hello.

64
00:02:59,060 --> 00:03:00,180
How can I assist you today?

65
00:03:00,340 --> 00:03:02,090
That of course, coming back from GPT.

66
00:03:02,410 --> 00:03:03,130
Hi there.

67
00:03:03,530 --> 00:03:05,530
My name is editor.

68
00:03:06,610 --> 00:03:07,290
Hello, editor.

69
00:03:07,330 --> 00:03:07,970
Nice to meet you.

70
00:03:08,010 --> 00:03:08,890
How can I help you today?

71
00:03:08,890 --> 00:03:09,730
What's my name?

72
00:03:12,170 --> 00:03:13,050
Your name is Ed.

73
00:03:13,090 --> 00:03:14,170
How can I assist you further?

74
00:03:14,170 --> 00:03:14,490
Ed?

75
00:03:14,850 --> 00:03:17,410
All right, so I imagine you get it.

76
00:03:17,410 --> 00:03:18,010
You're with me.

77
00:03:18,010 --> 00:03:20,770
But just a really press this point home.

78
00:03:20,810 --> 00:03:24,250
We asked Gradio to make a chat interface for us.

79
00:03:24,250 --> 00:03:26,610
We gave it this callback function.

80
00:03:26,890 --> 00:03:32,890
And so as a result, every time that the user presses something in here, like what's my name?

81
00:03:33,130 --> 00:03:36,010
Uh, that is considered the message.

82
00:03:36,210 --> 00:03:38,730
Everything so far is considered the history.

83
00:03:38,770 --> 00:03:44,930
Gradio takes those two bits of information and it automatically calls the callback chat, passing in

84
00:03:44,930 --> 00:03:46,930
the message passing in the history.

85
00:03:46,930 --> 00:03:50,210
And this is in OpenAI format or very close to it.

86
00:03:50,290 --> 00:03:55,370
We scrub the format to make it just right, only needed for Gemini and a couple of others for grok as

87
00:03:55,370 --> 00:03:55,970
well, I think.

88
00:03:56,290 --> 00:04:02,000
And then messages, uh, we construct exactly the right messages, including shoving the system prompt

89
00:04:02,000 --> 00:04:03,920
at the top, we call OpenAI.

90
00:04:03,960 --> 00:04:06,440
We return the response from OpenAI.

91
00:04:06,480 --> 00:04:11,400
So instead of bananas, what comes back and appears in the screen is actually the response from the

92
00:04:11,400 --> 00:04:11,920
model.

93
00:04:12,360 --> 00:04:15,080
I know you've got it already, but I like to talk.

94
00:04:15,440 --> 00:04:17,040
All right, all right, let's move on.

95
00:04:17,040 --> 00:04:17,760
Let's do more.

96
00:04:17,960 --> 00:04:18,480
Okay.

97
00:04:18,520 --> 00:04:24,480
So I'm I'm now going to update that chat callback to do streaming as well.

98
00:04:24,640 --> 00:04:27,600
And so this time it's exactly the same as before.

99
00:04:27,600 --> 00:04:33,400
But I've got OpenAI ActionScript and I'm passing in stream equals true.

100
00:04:33,760 --> 00:04:38,640
And then I can just simply put here for chunk in stream.

101
00:04:38,640 --> 00:04:41,280
Remember that we're just going to iterate over what comes back.

102
00:04:41,680 --> 00:04:47,720
And then we're going to get the next chunk of what comes back, accumulate it in this variable response

103
00:04:47,720 --> 00:04:53,360
and yield response each time so that gradio it will see that this isn't a function anymore.

104
00:04:53,360 --> 00:04:56,320
It's a generator, a special kind of function.

105
00:04:56,320 --> 00:04:58,480
It's a function that you can iterate over.

106
00:04:58,480 --> 00:05:04,030
And as a result, Gradio will automatically not just wait until it's finished and show the results,

107
00:05:04,030 --> 00:05:10,270
but iterate over it and surely, surely slowly show the response as it comes.

108
00:05:10,630 --> 00:05:11,270
Surely.

109
00:05:11,470 --> 00:05:12,750
Uh, all right.

110
00:05:13,030 --> 00:05:16,950
Uh, then let's just bring this up and see what we get.

111
00:05:16,950 --> 00:05:17,710
Here it is.

112
00:05:17,830 --> 00:05:19,910
Uh, so now I can say hi there again.

113
00:05:19,910 --> 00:05:20,590
Let's try.

114
00:05:20,790 --> 00:05:21,790
Hi there.

115
00:05:23,030 --> 00:05:24,590
And you can see that it came streaming.

116
00:05:24,590 --> 00:05:33,310
And let's make it something, uh, explain the pros and cons of a genetic, uh, architecture, uh,

117
00:05:33,510 --> 00:05:36,950
genetic AI, uh, for commercial purposes.

118
00:05:37,510 --> 00:05:40,870
Let's do that and see what we get.

119
00:05:40,910 --> 00:05:45,310
Is this something that we're having going off on a on a big old diatribe?

120
00:05:45,590 --> 00:05:46,110
There you go.

121
00:05:46,150 --> 00:05:47,830
And you can see it streaming away.

122
00:05:47,830 --> 00:05:54,550
And it's even gradio nicely handles, uh, markdown coming back out of the box, uh, without needing

123
00:05:54,590 --> 00:05:56,790
to use markdowns or anything this time.

124
00:05:56,830 --> 00:05:57,830
Looks great.

125
00:05:57,990 --> 00:05:59,430
Uh, isn't that easy?

126
00:05:59,470 --> 00:05:59,830
Isn't it?

127
00:05:59,870 --> 00:06:02,630
Wasn't it easy to build like an interactive chatbot?