26-Structure of Neural Nets for Deep Learning.srt
1
00:00:01,069 --> 00:00:03,194
The following content is provided under
2
00:00:03,199 --> 00:00:05,715
a Creative Commons license. Your support
3
00:00:05,720 --> 00:00:08,024
will help MIT OpenCourseWare continue to
4
00:00:08,029 --> 00:00:09,855
offer high-quality educational resources
5
00:00:09,860 --> 00:00:10,935
for free.
6
00:00:10,940 --> 00:00:13,125
To make a donation or to view additional
7
00:00:13,130 --> 00:00:15,165
materials from hundreds of MIT courses,
8
00:00:15,170 --> 00:00:21,754
visit MIT OpenCourseWare at ocw.mit.edu.
9
00:00:21,759 --> 00:00:24,524
OK, so this is an important day, and
10
00:00:24,529 --> 00:00:28,545
Friday was an important day. I hope you
11
00:00:28,550 --> 00:00:31,605
enjoyed Professor Sra's terrific
12
00:00:31,610 --> 00:00:33,945
lecture as much as I did. You probably
13
00:00:33,950 --> 00:00:37,985
saw me taking notes like mad for the
14
00:00:37,990 --> 00:00:40,545
section that's now to be written about
15
00:00:40,550 --> 00:00:44,955
stochastic gradient descent. And he
16
00:00:44,960 --> 00:00:48,705
promised a theorem, if you remember, and
17
00:00:48,710 --> 00:00:51,915
there wasn't time. So he was going to
18
00:00:51,920 --> 00:00:53,925
send it to me, or still is going to send
19
00:00:53,930 --> 00:00:56,055
it to me. I'll report: I haven't got it
20
00:00:56,060 --> 00:01:00,125
yet, but I'll bring it to class.
21
00:01:00,130 --> 00:01:05,025
Wait and see, hopefully. And that'll
22
00:01:05,030 --> 00:01:08,775
give us a chance to review stochastic
23
00:01:08,780 --> 00:01:11,235
gradient descent, the central algorithm
24
00:01:11,240 --> 00:01:15,525
of deep learning. And then today is
25
00:01:15,530 --> 00:01:21,794
about the central structure of deep
26
00:01:21,799 --> 00:01:25,245
neural nets. And some of you will know
27
00:01:25,250 --> 00:01:28,695
already how they're connected,
28
00:01:28,700 --> 00:01:34,904
what the function f is, the
29
00:01:34,909 --> 00:01:38,205
learning function, you could call it the
30
00:01:38,210 --> 00:01:40,544
learning function, that's
31
00:01:40,549 --> 00:01:43,695
constructed. The whole system is aiming
32
00:01:43,700 --> 00:01:49,005
at constructing this function f, which
33
00:01:49,010 --> 00:01:52,065
learns the training data, and then
34
00:01:52,070 --> 00:01:56,044
applying it to the test data. And the
35
00:01:56,049 --> 00:02:00,555
miracle is that it does so well in
36
00:02:00,560 --> 00:02:04,715
practice. That's what has transformed
37
00:02:04,720 --> 00:02:10,175
deep learning into such an important
38
00:02:10,180 --> 00:02:14,995
application. So
39
00:02:15,000 --> 00:02:18,875
this Chapter 7 has been up
40
00:02:18,880 --> 00:02:24,785
for months on the math.mit.edu/learningfromdata
41
00:02:24,790 --> 00:02:27,965
site, and I'll add it
42
00:02:27,970 --> 00:02:29,945
to Stellar, because that's where you'll
43
00:02:29,950 --> 00:02:35,075
be looking for it. OK, and then the
44
00:02:35,080 --> 00:02:37,925
second, the backpropagation, the way to
45
00:02:37,930 --> 00:02:43,145
compute the gradient, I'll probably
46
00:02:43,150 --> 00:02:47,375
reach that idea today. And you'll see
47
00:02:47,380 --> 00:02:49,865
it's the chain rule, but how is it
48
00:02:49,870 --> 00:02:52,775
organized? OK, so what's the
49
00:02:52,780 --> 00:02:54,875
structure? What's the plan for deep
50
00:02:54,880 --> 00:03:00,575
neural nets? Good. Starting here. So
51
00:03:00,580 --> 00:03:06,454
what we have is training data. So we have
52
00:03:06,459 --> 00:03:14,435
vectors x1 to x... what should I use for
53
00:03:14,440 --> 00:03:17,885
the number of samples that
54
00:03:17,890 --> 00:03:20,615
we have in the training
55
00:03:20,620 --> 00:03:26,375
data? Well, say D for data. OK, and each
56
00:03:26,380 --> 00:03:30,185
vector, those are called feature vectors,
57
00:03:30,190 --> 00:03:40,954
so "equals feature vectors". So each one,
58
00:03:40,959 --> 00:03:48,305
each x, has like m features. So
59
00:03:48,310 --> 00:03:52,324
maybe my notation isn't
60
00:03:52,329 --> 00:03:56,965
so hot here. I have a whole lot of
61
00:03:56,970 --> 00:04:01,445
vectors. Let me not use the subscript for
62
00:04:01,450 --> 00:04:05,405
those right away. So vectors, feature
63
00:04:05,410 --> 00:04:09,514
vectors, and each vector has got, maybe,
64
00:04:09,519 --> 00:04:14,104
we say, m features. Like if we were
65
00:04:14,109 --> 00:04:17,375
measuring height and age and weight and
66
00:04:17,380 --> 00:04:21,574
so on, those would be features. So the job,
67
00:04:21,579 --> 00:04:25,805
the job of the neural network,
68
00:04:25,810 --> 00:04:28,395
is to create a... and
69
00:04:28,400 --> 00:04:30,105
we're going to classify. Maybe we're
70
00:04:30,110 --> 00:04:35,025
going to classify men and women, or boys
71
00:04:35,030 --> 00:04:35,895
and girls.
72
00:04:35,900 --> 00:04:38,775
So let's make it a classification
73
00:04:38,780 --> 00:04:43,935
problem, just a binary one. So the
74
00:04:43,940 --> 00:04:48,015
classification problem is, what should we
75
00:04:48,020 --> 00:04:56,205
say, minus one or one, or
76
00:04:56,210 --> 00:04:59,685
zero or one, or boy or girl, or cat or dog,
77
00:04:59,690 --> 00:05:05,265
or truck or car. Anyway, just two
78
00:05:05,270 --> 00:05:09,435
classes. So I'm just going to
79
00:05:09,440 --> 00:05:15,315
do two-class classification. So we
80
00:05:15,320 --> 00:05:17,865
know which class the training data is in.
81
00:05:17,870 --> 00:05:20,145
For each vector x, we know the right
82
00:05:20,150 --> 00:05:23,384
answer. So we want to create a function
83
00:05:23,389 --> 00:05:25,755
that gives the right answer, and then
84
00:05:25,760 --> 00:05:28,925
we'll use that function on other data,
85
00:05:28,930 --> 00:05:32,675
you know. So let me write that down.
86
00:05:32,680 --> 00:05:40,664
Create a function f of x so
87
00:05:40,669 --> 00:05:49,055
that it gets most of... gets the class correct.
88
00:05:49,060 --> 00:05:54,875
In other words, f of x should be negative
89
00:05:54,880 --> 00:06:02,295
when the classification is
90
00:06:02,300 --> 00:06:06,335
minus 1, and f of x should be positive
91
00:06:06,340 --> 00:06:11,865
when the classification is plus 1. And as
92
00:06:11,870 --> 00:06:14,444
we know, we don't necessarily have to get
93
00:06:14,449 --> 00:06:18,284
every x, every sample, right. That may be
94
00:06:18,289 --> 00:06:21,075
overfitting. If there's some sample
95
00:06:21,080 --> 00:06:24,944
that's just truly weird, by getting that
96
00:06:24,949 --> 00:06:26,565
right, we're going to be looking for
97
00:06:26,570 --> 00:06:32,205
truly weird data in the test set, and
98
00:06:32,210 --> 00:06:35,594
that's not a good idea. We want
99
00:06:35,599 --> 00:06:40,875
the rule that we're trying to discover,
100
00:06:40,880 --> 00:06:41,615
the rule that
101
00:06:41,620 --> 00:06:44,224
covers almost all cases, but not every
102
00:06:44,229 --> 00:06:48,095
crazy, weird case. OK, so that's our job:
103
00:06:48,100 --> 00:06:52,955
to create a function f of x that is
104
00:06:52,960 --> 00:06:57,634
correct on almost all of the training
105
00:06:57,639 --> 00:07:05,005
data. Yeah.
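As a rough illustration of the sign rule just described, here is a minimal Python sketch. The feature values, the labels, and the hand-picked function `f` are made-up assumptions for illustration only; they are not data or a model from the lecture, and a real network would learn `f` from the training data.

```python
def classify(f, x):
    """Sign rule: class +1 when f(x) is positive, class -1 otherwise."""
    return 1 if f(x) > 0 else -1


def training_accuracy(f, samples, labels):
    """Fraction of training samples whose predicted class matches the label."""
    correct = sum(1 for x, y in zip(samples, labels) if classify(f, x) == y)
    return correct / len(samples)


# Hypothetical feature vectors (height in cm, weight in kg) with labels -1 / +1.
samples = [(150.0, 45.0), (160.0, 55.0), (175.0, 80.0), (185.0, 90.0), (155.0, 85.0)]
labels = [-1, -1, 1, 1, -1]


def f(x):
    """A hand-picked (not learned) function, used only to show the sign rule."""
    height, weight = x
    return 0.5 * (height - 165.0) + 0.5 * (weight - 65.0)


accuracy = training_accuracy(f, samples, labels)
print(accuracy)
```

In this toy set the last sample plays the role of the "truly weird" case: `f` gets the other four right and leaves that one wrong, which is the kind of "almost all" fit the lecture is asking for.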
106
00:07:05,010 --> 00:07:11,164
So before I draw the picture of the
107
00:07:11,169 --> 00:07:16,474
network, let me just remember to
108
00:07:16,479 --> 00:07:24,095
mention the site Playground. I don't know
109
00:07:24,100 --> 00:07:25,685
if you've looked at that. So I'm going to
110
00:07:25,690 --> 00:07:35,694
ask you: Playground, at tensorflow.org.
111
00:07:35,699 --> 00:07:39,305
How many know that site or have messed
112
00:07:39,310 --> 00:07:43,384
with it? Just a few. OK. So
113
00:07:43,389 --> 00:07:46,085
it's not a very sophisticated site. It's
114
00:07:46,090 --> 00:07:52,465
got only four examples. And
115
00:07:52,470 --> 00:07:58,505
so one example is a
116
00:07:58,510 --> 00:08:04,055
whole lot of points that are blue, B for
117
00:08:04,060 --> 00:08:11,694
blue, inside a bunch of points that are
118
00:08:11,699 --> 00:08:15,955
another set, that are O for orange.
119
00:08:15,960 --> 00:08:21,305
Orange. OK, so those are the two
120
00:08:21,310 --> 00:08:24,305
classes, orange and blue. So the points x,
121
00:08:24,310 --> 00:08:29,105
the feature vector,
122
00:08:29,110 --> 00:08:33,274
the feature vector here, is just the
123
00:08:33,279 --> 00:08:38,224
coordinates. The features are the x, y
124
00:08:38,229 --> 00:08:43,414
coordinates of the point. And our job
125
00:08:43,419 --> 00:08:47,315
is to find a function that's positive on
126
00:08:47,320 --> 00:08:49,895
these points and negative on those
127
00:08:49,900 --> 00:08:53,885
points. So there's a simple model problem.
128
00:08:53,890 --> 00:08:54,485
And
129
00:08:54,490 --> 00:08:58,115
I recommend it. Well, just partly... if
130
00:08:58,120 --> 00:09:01,524
you're an expert in deep learning, this is
131
00:09:01,529 --> 00:09:05,665
for children. But morally, here, I
132
00:09:05,670 --> 00:09:09,834
certainly learned from playing in this
133
00:09:09,839 --> 00:09:17,644
playground. So you
134
00:09:17,649 --> 00:09:24,095
set the step size. Do you set it, or does
135
00:09:24,100 --> 00:09:26,524
it say... no, yeah, I guess you can
136
00:09:26,529 --> 00:09:28,235
change it. I don't think I've
137
00:09:28,240 --> 00:09:31,445
changed it. What else do you set? Oh, you
138
00:09:31,450 --> 00:09:37,505
set the nonlinear activation, the
139
00:09:37,510 --> 00:09:44,704
nonlinear activation function. "Active",
140
00:09:44,709 --> 00:09:47,405
I'll say, "function". And let me just go
141
00:09:47,410 --> 00:09:51,125
over here and say what function people
142
00:09:51,130 --> 00:09:54,095
now mostly use. The activation
143
00:09:54,100 --> 00:10:02,375
function is called ReLU, pronounced
144
00:10:02,380 --> 00:10:04,445
different ways. I don't know how we got
145
00:10:04,450 --> 00:10:07,475
into that crazy thing for this function
146
00:10:07,480 --> 00:10:14,795
that is zero and x. So the function ReLU
147
00:10:14,800 --> 00:10:17,735
as a function of x is the maximum, the
148
00:10:17,740 --> 00:10:24,125
larger, of 0 and x. So it produces...
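The definition just given, ReLU(x) as the larger of 0 and x, can be sketched in a few lines of Python. The sample inputs are arbitrary choices for illustration, and the additivity check at the end is an added observation rather than something written on the board.

```python
def relu(x):
    """ReLU(x) = max(0, x): zero for negative inputs, x itself otherwise."""
    return max(0.0, x)


values = [-2.0, -0.5, 0.0, 1.5]
outputs = [relu(v) for v in values]
print(outputs)

# A linear function g would satisfy g(a + b) == g(a) + g(b).
# ReLU fails this for a = -1, b = 1: relu(0) is 0, but relu(-1) + relu(1) is 1.
a, b = -1.0, 1.0
is_additive = relu(a + b) == relu(a) + relu(b)
print(is_additive)
```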
149
00:10:24,130 --> 00:10:27,334
The point is, it's not linear. And the point
150
00:10:27,339 --> 00:10:30,545
is that if we didn't allow nonlinearity
151
00:10:30,550 --> 00:10:31,535
in here somewhere,
152
00:10:31,540 --> 00:10:33,515
we couldn't even solve this playground
153
00:10:33,520 --> 00:10:36,995
problem. Because if our classifiers were
154
00:10:37,000 --> 00:10:40,084
all linear classifiers, like support
155
00:10:40,089 --> 00:10:44,045
vector machines, I couldn't separate the
156
00:10:44,050 --> 00:10:48,905
blue from the orange with a plane. It's
157
00:10:48,910 --> 00:10:51,215
got to somehow create some nonlinear
158
00:10:51,220 --> 00:10:53,905
function, which is... maybe the function is
159
00:10:53,910 --> 00:10:57,365
trying to be... a good function
160
00:10:57,370 --> 00:11:02,605
would be a function of r and theta, maybe.
161
00:11:02,610 --> 00:11:08,855
Maybe r minus 5. So maybe the distance
162
00:11:08,860 --> 00:11:10,805
out to the... let's suppose that
163
00:11:10,810 --> 00:11:15,545
distance is 5. Then r minus 5 will be
164
00:11:15,550 --> 00:11:18,845
negative on the blues, because r is small,
165
00:11:18,850 --> 00:11:21,875
and r minus 5 will be positive on the
166
00:11:21,880 --> 00:11:23,885
oranges, because r is bigger. And
167
00:11:23,890 --> 00:11:29,215
therefore we will have the right signs,
168
00:11:29,220 --> 00:11:34,295
less than 0 and greater than 0, and it'll
169
00:11:34,300 --> 00:11:38,194
classify this data, this
170
00:11:38,199 --> 00:11:42,185
training data. Yeah, so...
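The "r minus 5" idea above can be checked numerically. This is a minimal sketch under stated assumptions: the blue and orange points are invented for illustration (blues inside radius 5, oranges outside), and the function is written down by hand, whereas the network has to discover something like it from the data.

```python
import math


def f(point, radius=5.0):
    """f(x, y) = r - radius, where r is the distance from the origin."""
    x, y = point
    return math.hypot(x, y) - radius


# Hypothetical training points: blues near the center, oranges farther out.
blues = [(0.0, 0.0), (1.0, -2.0), (-3.0, 2.0)]
oranges = [(6.0, 0.0), (-5.0, 5.0), (0.0, -8.0)]

blues_negative = all(f(p) < 0 for p in blues)
oranges_positive = all(f(p) > 0 for p in oranges)
print(blues_negative, oranges_positive)
```

Both checks come out true, so the sign of `f` separates the two classes on these sample points.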
171
00:11:42,190 --> 00:11:45,155
It has to do that. This is not a hard one to do. There
172
00:11:45,160 --> 00:11:49,115
are four examples. As I say, two are
173
00:11:49,120 --> 00:11:52,355
trivial. It finds a good
174
00:11:52,360 --> 00:11:54,905
function. Well, yeah, I've forgotten.
175
00:11:54,910 --> 00:11:58,505
They're so trivial they
176
00:11:58,510 --> 00:12:02,824
shouldn't be mentioned. And then this is
177
00:12:02,829 --> 00:12:09,334
the medium test. And then the hard test
178
00:12:09,339 --> 00:12:13,324
is when you have a sort
179
00:12:13,329 --> 00:12:18,394
of spiral of oranges, and inside you have
180
00:12:18,399 --> 00:12:23,074
a spiral of blues. And that was cooked
181
00:12:23,079 --> 00:12:32,345
up by a fiend. So the
182
00:12:32,350 --> 00:12:34,475
system is trying to find a function
183
00:12:34,480 --> 00:12:38,375
that's positive on one spiral and
184
00:12:38,380 --> 00:12:40,385
negative on the other spiral. And that
185
00:12:40,390 --> 00:12:44,215
takes quite a bit of time, many, many
186
00:12:44,220 --> 00:12:50,255
epochs. I learned what an epoch is. Did
187
00:12:50,260 --> 00:12:52,535
you know what an epoch is? I didn't
188
00:12:52,540 --> 00:12:53,915
know whether it was just a fancy word
189
00:12:53,920 --> 00:12:57,274
for counting the steps in gradient
190
00:12:57,279 --> 00:13:01,715
descent. But it counts the
191
00:13:01,720 --> 00:13:05,885
steps, all right, but one epoch is the
192
00:13:05,890 --> 00:13:09,965
number of steps that matches the size of
193
00:13:09,970 --> 00:13:12,035
the training data. So if you have a
194
00:13:12,040 --> 00:13:16,145
million samples, and we were doing ordinary
195
00:13:16,150 --> 00:13:18,215
gradient descent, you would be doing a
196
00:13:18,220 --> 00:13:20,695
million... you'd have a million-by-n
197
00:13:20,700 --> 00:13:25,165
problem per step. Of course, stochastic
198
00:13:25,170 --> 00:13:28,435
gradient descent just does a mini-batch
199
00:13:28,440 --> 00:13:33,535
of 1 or 32 or something. But anyway...
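To make the epoch count concrete, here is a small Python sketch. It assumes the common convention that one epoch is one full pass over the training data, so the number of steps per epoch is the sample count divided by the mini-batch size, rounded up. The helper name `steps_per_epoch` is hypothetical; the sample count and batch sizes are the ones mentioned above.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Steps needed to see every training sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)


num_samples = 1_000_000

sgd_single = steps_per_epoch(num_samples, 1)           # mini-batch of 1
sgd_batch32 = steps_per_epoch(num_samples, 32)         # mini-batch of 32
full_batch = steps_per_epoch(num_samples, num_samples) # ordinary gradient descent

print(sgd_single, sgd_batch32, full_batch)
```

Under this convention, ordinary gradient descent takes one large step per epoch, while a mini-batch of 32 takes 31,250 much cheaper steps.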
200
00:13:33,540 --> 00:13:39,505
Yeah, so it's the number of... you