33-Neural Nets and the Learning Function.srt
1
00:00:01,069 --> 00:00:03,194
the following content is provided under
2
00:00:03,199 --> 00:00:05,774
a Creative Commons license your support
3
00:00:05,779 --> 00:00:08,024
will help MIT OpenCourseWare continue to
4
00:00:08,029 --> 00:00:09,855
offer high quality educational resources
5
00:00:09,860 --> 00:00:10,935
for free
6
00:00:10,940 --> 00:00:13,125
to make a donation or to view additional
7
00:00:13,130 --> 00:00:15,165
materials from hundreds of MIT courses
8
00:00:15,170 --> 00:00:21,574
visit MIT OpenCourseWare at ocw.mit.edu
9
00:00:21,579 --> 00:00:27,225
ok so actually I have a I know where
10
00:00:27,230 --> 00:00:29,344
people are working on projects and
11
00:00:29,349 --> 00:00:32,445
you're not responsible for any material
12
00:00:32,450 --> 00:00:36,015
new material in the lectures thank you
13
00:00:36,020 --> 00:00:40,755
for coming but I do have something in an
14
00:00:40,760 --> 00:00:43,995
important topic which is a revised
15
00:00:44,000 --> 00:00:46,995
version about the construction of neural
16
00:00:47,000 --> 00:00:49,275
nets the basic structure that we're
17
00:00:49,280 --> 00:00:54,824
we're working with so that's on the open
18
00:00:54,829 --> 00:01:00,495
web at section 7.1 that's so
19
00:01:00,500 --> 00:01:14,505
construction of neural nets really it's
20
00:01:14,510 --> 00:01:22,955
a construction of the learning function
21
00:01:22,960 --> 00:01:26,925
f so that's the function that you
22
00:01:26,930 --> 00:01:29,685
optimize by gradient descent or
23
00:01:29,690 --> 00:01:33,015
stochastic gradient descent and you
24
00:01:33,020 --> 00:01:39,224
apply to the training data to minimize
25
00:01:39,229 --> 00:01:43,724
the loss so just thinking about it
26
00:01:43,729 --> 00:01:46,215
in a more organized way because I wrote
27
00:01:46,220 --> 00:01:49,335
that section before I knew anything more
28
00:01:49,340 --> 00:01:52,985
than how to spell neural nets but now
29
00:01:52,990 --> 00:01:59,025
I've thought about it more so the key
30
00:01:59,030 --> 00:02:04,544
point maybe compared to what I had in
31
00:02:04,549 --> 00:02:07,425
the past is that I now think of this as
32
00:02:07,430 --> 00:02:11,865
a function of two sets of variables X
33
00:02:11,870 --> 00:02:14,625
and V
34
00:02:14,630 --> 00:02:23,295
so X are the weights and V are the
35
00:02:23,300 --> 00:02:29,245
feature vectors the sample feature
36
00:02:29,250 --> 00:02:37,585
vectors so those come from the
37
00:02:37,590 --> 00:02:42,085
training data either one at a time if
38
00:02:42,090 --> 00:02:44,245
we're doing stochastic gradient descent
39
00:02:44,250 --> 00:02:48,805
with mini-batch size one or B at a time
40
00:02:48,810 --> 00:02:51,325
if we're doing mini-batch of size B or
41
00:02:51,330 --> 00:02:54,895
the whole thing a whole epoch at once if
42
00:02:54,900 --> 00:02:58,215
we're doing full-scale gradient descent
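[Editor's note: a minimal sketch of the three ways the feature vectors can enter one gradient step, one sample at a time, a mini-batch of B, or the whole epoch. Array names and sizes here are illustrative only, not from the lecture.]

```python
import numpy as np

rng = np.random.default_rng(0)
N, n0 = 1000, 64                      # hypothetical: N training samples, n0 features each
V_train = rng.standard_normal((N, n0))

def pick_samples(mode, B=32):
    """Which feature vectors v_i enter one gradient step on the weights x."""
    if mode == "stochastic":          # stochastic gradient descent, batch size 1
        return rng.integers(N, size=1)
    if mode == "minibatch":           # mini-batch of size B
        return rng.choice(N, size=B, replace=False)
    return np.arange(N)               # full-scale gradient descent: the whole epoch

batch = V_train[pick_samples("minibatch")]   # the v's for this step; x (the weights) is what gets updated
```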
43
00:02:58,220 --> 00:03:01,725
so those are the feature vectors and
44
00:03:01,730 --> 00:03:08,635
these are the numbers in those in the
45
00:03:08,640 --> 00:03:11,875
weights in the linear steps where the
46
00:03:11,880 --> 00:03:19,165
weights so they're the matrices A k that
47
00:03:19,170 --> 00:03:25,195
you multiply V by and also
48
00:03:25,200 --> 00:03:34,435
the bias vectors b k that you add on to
49
00:03:34,440 --> 00:03:38,514
shift the origin okay and these are the
50
00:03:38,519 --> 00:03:41,035
and it's these that you
51
00:03:41,040 --> 00:03:48,475
optimize those are to optimize and
52
00:03:48,480 --> 00:03:55,465
what's the structure of the whole the
53
00:03:55,470 --> 00:03:58,255
whole of the learning function and how
54
00:03:58,260 --> 00:04:00,895
do you use it what does the neural
55
00:04:00,900 --> 00:04:07,815
net look like so you take f of a
56
00:04:07,820 --> 00:04:12,385
first set of weights so f of the first
57
00:04:12,390 --> 00:04:17,604
set of weights would be A1 and b1 so
58
00:04:17,609 --> 00:04:25,135
that's the X part and the actual
59
00:04:25,140 --> 00:04:27,265
sample vector the sample vectors
60
00:04:27,270 --> 00:04:31,254
v zero in the iteration and this
61
00:04:31,259 --> 00:04:35,985
produces and then you do the nonlinear
62
00:04:35,990 --> 00:04:39,055
step to each component and that produces
63
00:04:39,060 --> 00:04:40,555
V one
64
00:04:40,560 --> 00:04:45,325
so there is a typical I could write out
65
00:04:45,330 --> 00:04:54,205
what this is here A 1 V 0 plus b 1 so
66
00:04:54,210 --> 00:04:56,935
that's the two steps are the linear step
67
00:04:56,940 --> 00:05:01,314
the input is V 0 you take the linear
68
00:05:01,319 --> 00:05:04,284
step using the first weights A
69
00:05:04,289 --> 00:05:07,795
1 and b 1 then you take the nonlinear
70
00:05:07,800 --> 00:05:11,575
step and that gives you V 1 so that's
71
00:05:11,580 --> 00:05:15,055
really better than my line above so I'll
72
00:05:15,060 --> 00:05:24,464
erase that line above yep
73
00:05:24,469 --> 00:05:29,814
so that produces V 1 from V 0 and the
74
00:05:29,819 --> 00:05:35,235
first weight and then the next level
75
00:05:35,240 --> 00:05:41,545
inputs v1 so I'll just call this V K or V
76
00:05:41,550 --> 00:05:46,435
K minus 1 and I'll call this one V K okay
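[Editor's note: a minimal sketch in Python of the recursion just described, v_k = ReLU(A_k v_{k-1} + b_k), with the last layer skipping ReLU as noted a little further on. Layer sizes and names are made up for illustration, not from the lecture.]

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)              # the nonlinear step, applied to each component

def F(x, v0):
    """Learning function F(x, v0): x = [(A1, b1), ..., (AL, bL)], v0 = one feature vector."""
    v = v0
    for k, (A, b) in enumerate(x, start=1):
        v = A @ v + b                       # linear step with the k-th weights
        if k < len(x):
            v = relu(v)                     # ReLU at every layer except the last
    return v                                # v_L, the output of the final layer

# Illustrative sizes only: 4 input features, one hidden layer of 6 neurons, 1 output.
rng = np.random.default_rng(0)
x = [(rng.standard_normal((6, 4)), rng.standard_normal(6)),
     (rng.standard_normal((1, 6)), rng.standard_normal(1))]
output = F(x, rng.standard_normal(4))
```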
77
00:05:46,440 --> 00:05:50,575
so k equal to 1 up to however many
78
00:05:50,580 --> 00:05:58,375
layers L layers so the input was V 0
79
00:05:58,380 --> 00:06:02,185
so this V is really V 0 you could say
80
00:06:02,190 --> 00:06:07,974
and this is the neural net and this
81
00:06:07,979 --> 00:06:12,504
is the input and output from each
82
00:06:12,509 --> 00:06:17,904
layer and then V L is the final output
83
00:06:17,909 --> 00:06:21,654
from the final layer so let's just do
84
00:06:21,659 --> 00:06:28,555
a picture here here is V 0 a sample
85
00:06:28,560 --> 00:06:30,355
vector or if we're doing image
86
00:06:30,360 --> 00:06:34,305
processing it's all the pixels in the
87
00:06:34,310 --> 00:06:37,315
image in the data and the
88
00:06:37,320 --> 00:06:41,185
training from one sample this is
89
00:06:41,190 --> 00:06:50,695
one training sample and then you
90
00:06:50,700 --> 00:06:55,645
multiply by A 1 and you add b 1 and you
91
00:06:55,650 --> 00:07:01,015
take ReLU of that vector and that gives
92
00:07:01,020 --> 00:07:05,905
you V 2 V 1 sorry that gives you V 1 and
93
00:07:05,910 --> 00:07:12,835
then you iterate to finally V L the last
94
00:07:12,840 --> 00:07:15,775
layer you don't do ReLU at the last
95
00:07:15,780 --> 00:07:21,685
layer so it's just A L V L minus 1 plus
96
00:07:21,690 --> 00:07:26,455
b L and you may not do a bias vector also
97
00:07:26,460 --> 00:07:30,265
at that layer but you might and that's
98
00:07:30,270 --> 00:07:34,705
this is finally the output so is
99
00:07:34,710 --> 00:07:36,745
that picture and so that picture is
100
00:07:36,750 --> 00:07:40,405
clearer for me than it was previously to
101
00:07:40,410 --> 00:07:44,125
distinguish between the weights so in
102
00:07:44,130 --> 00:07:48,535
the gradient descent algorithm
103
00:07:48,540 --> 00:07:52,345
it's these X's that you're choosing the
104
00:07:52,350 --> 00:07:55,015
V's are not these are given by the
105
00:07:55,020 --> 00:07:56,995
training data that's not part of the
106
00:07:57,000 --> 00:07:59,995
optimization part that's not part of the
107
00:08:00,000 --> 00:08:03,355
X in chapter 6 where you're finding the
108
00:08:03,360 --> 00:08:06,865
optimal weights so this X really stands
109
00:08:06,870 --> 00:08:12,715
for X stands for all the weights
110
00:08:12,720 --> 00:08:25,405
that you compute up to A L b L so it's
111
00:08:25,410 --> 00:08:27,385
that's a collection of all the
112
00:08:27,390 --> 00:08:29,835
weights and the important part for
113
00:08:29,840 --> 00:08:32,635
applications for practice is to realize
114
00:08:32,640 --> 00:08:35,085
that there are often more weights and
115
00:08:35,090 --> 00:08:37,735
more components in the weights than
116
00:08:37,740 --> 00:08:40,914
there are components in the feature
117
00:08:40,919 --> 00:08:43,224
vectors in the samples in the V's
118
00:08:43,229 --> 00:08:46,675
so often the size of X is greater than
119
00:08:46,680 --> 00:08:49,765
the size of V which is an interesting
120
00:08:49,770 --> 00:08:55,075
and sort of unexpected situation so
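[Editor's note: a made-up count showing why the weights are typically underdetermined; the layer widths and sample count below are illustrative, not the lecture's.]

```python
# Widths n0, n1, ..., nL of a hypothetical net: n0 features per sample, nL outputs.
widths = [64, 128, 128, 10]

# Entries in each A_k (n_k by n_{k-1}) plus each bias b_k (n_k).
n_weights = sum(widths[k] * widths[k - 1] + widths[k] for k in range(1, len(widths)))
print(n_weights)                # 26122 numbers in X (the A's and b's)

# With, say, 300 training samples of 64 features each, the v's hold only
# 300 * 64 = 19200 numbers, so the X's outnumber the feature data.
```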
121
00:08:55,080 --> 00:09:00,595
often I'll just write that often the X's
122
00:09:00,600 --> 00:09:14,115
are the weights X's are underdetermined
123
00:09:14,120 --> 00:09:24,024
because the number of X's exceeds and
124
00:09:24,029 --> 00:09:27,204
often far exceeds the number of V's the
125
00:09:27,209 --> 00:09:31,315
number of the cardinality the number of
126
00:09:31,320 --> 00:09:36,135
weights this is in the A's and b's
127
00:09:36,140 --> 00:09:42,595
and these are in the samples in the
128
00:09:42,600 --> 00:09:51,954
training set the number well the number
129
00:09:51,959 --> 00:09:56,514
of features of all the samples in the
130
00:09:56,519 --> 00:10:01,405
training set so I'll get that new
131
00:10:01,410 --> 00:10:06,505
section 7.1 up hopefully this week on
132
00:10:06,510 --> 00:10:11,185
the open web that's the open set and
133
00:10:11,190 --> 00:10:15,535
I'll email to you on Stellar is there
134
00:10:15,540 --> 00:10:18,505
more I should say about this you see
135
00:10:18,510 --> 00:10:21,505
here I can draw the picture but of
136
00:10:21,510 --> 00:10:24,175
course a hand-drawn picture is far
137
00:10:24,180 --> 00:10:30,535
inferior to a machine-drawn picture
138
00:10:30,540 --> 00:10:33,595
an online picture but let me just do it
139
00:10:33,600 --> 00:10:37,454
so there's V the training sample has
140
00:10:37,459 --> 00:10:41,365
some components and then they're
141
00:10:41,370 --> 00:10:45,625
multiplied now here's going to be v1 the
142
00:10:45,630 --> 00:10:52,845
first layer and that can have a
143
00:10:52,850 --> 00:11:00,565
different number of components in
144
00:11:00,570 --> 00:11:03,295
the first layer different number of
145
00:11:03,300 --> 00:11:06,855
neurons and then each one comes from
146
00:11:06,860 --> 00:11:08,574
from the
147
00:11:08,579 --> 00:11:12,984
eze by so I won't keep going here but
148
00:11:12,989 --> 00:11:17,274
but you see the picture so there's a
149
00:11:17,279 --> 00:11:20,785
matrix that describes a matrix A 1 that
150
00:11:20,790 --> 00:11:22,674
tells you what the weights are on those
151
00:11:22,679 --> 00:11:28,644
and then there's a b1 that's added the
152
00:11:28,649 --> 00:11:33,024
bias vector is added to all those to get
153
00:11:33,029 --> 00:11:33,954
the v1
154
00:11:33,959 --> 00:11:40,105
so v1 is A 1 V 0 plus b 1 and
155
00:11:40,110 --> 00:11:45,024
then onwards so this is the spot
156
00:11:45,029 --> 00:11:48,434
where drawing it by hand is clearly
157
00:11:48,439 --> 00:11:52,824
inferior to any other possible way to do
158
00:11:52,829 --> 00:12:00,264
it okay so now I haven't yet put into
159
00:12:00,269 --> 00:12:03,774
the picture the loss function so that's
160
00:12:03,779 --> 00:12:07,454
the function that you want to minimize
161
00:12:07,459 --> 00:12:13,674
so what is the loss function so we're
162
00:12:13,679 --> 00:12:20,004
choosing X that's all the A's and b's
163
00:12:20,009 --> 00:12:28,674
to minimize the loss function L okay so
164
00:12:28,679 --> 00:12:31,405
it's this part that Professor Sra's
165
00:12:31,410 --> 00:12:36,564
lecture was about so he said L is
166
00:12:36,569 --> 00:12:45,655
often a finite sum over all the F's so
167
00:12:45,660 --> 00:12:53,274
what would that be f of X V i so this is
168
00:12:53,279 --> 00:12:59,754
the output with the
169
00:12:59,759 --> 00:13:03,984
weights in X from sample number i and if
170
00:13:03,989 --> 00:13:06,504
we're doing batch processing that is
171
00:13:06,509 --> 00:13:08,754
we're doing the whole batch at once then
172
00:13:08,759 --> 00:13:11,234
we compute that for all i and that's the
173
00:13:11,239 --> 00:13:13,465
computation that's ridiculously
174
00:13:13,470 --> 00:13:16,965
expensive and you go instead to
175
00:13:16,970 --> 00:13:19,735
stochastic gradient and you just choose
176
00:13:19,740 --> 00:13:22,324
one of those or B of those
177
00:13:22,329 --> 00:13:25,475
a small number B like 32 to 128 of
178
00:13:25,480 --> 00:13:29,764
these F's but full-scale gradient
179
00:13:29,769 --> 00:13:34,894
descent chooses the weights X to
180
00:13:34,899 --> 00:13:37,684
minimize the loss now so the loss I
181
00:13:37,689 --> 00:13:40,084
haven't got the loss here yet there is
182
00:13:40,089 --> 00:13:45,124
this function the loss would be minus
183
00:13:45,129 --> 00:13:51,694
the true result from sample i I don't
184
00:13:51,699 --> 00:13:53,644
write I haven't got a good notation for
185
00:13:53,649 --> 00:13:56,194
that I'm open to suggestions
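[Editor's note: one common choice of notation, offered only as a suggestion: write y_i for the true result from sample i, so a least-squares loss sums the squared errors over the samples. F here is the learning function sketched earlier; the names are illustrative.]

```python
import numpy as np

def squared_loss(x, V_train, Y_true):
    """L(x) = sum over samples i of ||F(x, v_i) - y_i||^2, with y_i the true result for sample i."""
    return sum(np.sum((F(x, v) - y) ** 2) for v, y in zip(V_train, Y_true))
```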
186
00:13:56,199 --> 00:13:57,845
so how do I want to write the error
187
00:13:57,850 --> 00:14:03,155
suppose suppose I'm so that would be if
188
00:14:03,160 --> 00:14:05,405
it was least squares I would maybe be
189
00:14:05,410 --> 00:14:10,444
squaring that so that would be a sum of
190
00:14:10,449 --> 00:14:14,465
squares of errors over all the
191
00:14:14,470 --> 00:14:17,434
samples or if I'm doing stochastic
192
00:14:17,439 --> 00:14:19,774
gradient descent I would minimize I
193
00:14:19,779 --> 00:14:22,234
guess I'm minimizing this but the
194
00:14:22,239 --> 00:14:25,624
question is do I use the whole function
195
00:14:25,629 --> 00:14:30,215
L at each iteration or do I just pick
196
00:14:30,220 --> 00:14:35,254
one or only B of the samples to look at
197
00:14:35,259 --> 00:14:39,064
at iteration number K so this is
198
00:14:39,069 --> 00:14:44,554
the L of X then I've added up
199
00:14:44,559 --> 00:14:48,215
over all the V's so just to keep the
200
00:14:48,220 --> 00:14:52,144
notation straight I have this function
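[Editor's note: a sketch of the choice raised here, using only B of the samples at each iteration instead of the whole function L. The routine grad_loss is a stand-in for whatever computes the gradient (backpropagation) and is not defined here; all names are illustrative, not the lecture's.]

```python
import numpy as np

def sgd_step(x, V_train, Y_true, grad_loss, B=32, lr=0.01):
    """One stochastic iteration: look at only B of the samples, not the whole sum L(x)."""
    rng = np.random.default_rng()
    idx = rng.choice(len(V_train), size=B, replace=False)
    grads = grad_loss(x, V_train[idx], Y_true[idx])    # gradient of the partial sum only
    # Step every A_k and b_k downhill by the learning rate times its gradient.
    return [(A - lr * gA, b - lr * gb) for (A, b), (gA, gb) in zip(x, grads)]
```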