HCQ_MSRVTT_1kA_xlnet-large.txt
Experiment directory: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large
Preparing the dataloaders ...
Loading dataset MSRVTT_jsfusion_trainval in ram ...
Finish loading dataset MSRVTT_jsfusion_trainval in ram, taking 420.19769644737244 s.
Loading dataset MSRVTT_jsfusion_test in ram ...
Finish loading dataset MSRVTT_jsfusion_test in ram, taking 34.35290241241455 s.
Loading dataset MSRVTT_jsfusion_test in ram ...
Finish loading dataset MSRVTT_jsfusion_test in ram, taking 28.41338062286377 s.
Training ...
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch0.pth ...
Done in 4.224s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch0.pth ...
Done in 8.362s
epoch : 0
loss : 0
learning_rate : 2e-05
n_samples : 0
n_steps : 0
MSRVTT_jsfusion_test/t2v_metrics/R1: 0.1
MSRVTT_jsfusion_test/t2v_metrics/R5: 0.6
MSRVTT_jsfusion_test/t2v_metrics/R10: 1.3
MSRVTT_jsfusion_test/t2v_metrics/R50: 4.9
MSRVTT_jsfusion_test/t2v_metrics/MedR: 503.5
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 500.71
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 0.42726586816979173
MSRVTT_jsfusion_test/v2t_metrics/R1: 0.1
MSRVTT_jsfusion_test/v2t_metrics/R5: 0.5
MSRVTT_jsfusion_test/v2t_metrics/R10: 1.2
MSRVTT_jsfusion_test/v2t_metrics/R50: 5.3
MSRVTT_jsfusion_test/v2t_metrics/MedR: 496.5
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 497.417
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 0.3914867641168864
mnt_best : 0.42726586816979173
not_improved_count: 0
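Note on the summary metric: the `geometric_mean_R1-R5-R10` values in each epoch summary match the cube root of the product of R1, R5 and R10. The sketch below reproduces the epoch 0 values from the recalls logged above. The helper name `geometric_mean` is illustrative and is not taken from the HCQ code base, and which function the training code actually calls is not visible in this log.

```python
import math


def geometric_mean(values):
    """Geometric mean via the mean of logs; assumes every value is > 0."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Epoch 0 recalls copied from the log above (R1, R5, R10).
t2v_epoch0 = geometric_mean([0.1, 0.6, 1.3])
v2t_epoch0 = geometric_mean([0.1, 0.5, 1.2])

print(f"t2v geometric mean: {t2v_epoch0:.6f}")
print(f"v2t geometric mean: {v2t_epoch0:.6f}")
```

The `mnt_best` value in each summary equals the t2v geometric mean of the best epoch so far, which suggests (but does not prove) that the t2v metric is the one being monitored.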
Train Epoch: 1 [1/250 128/32000 (0%)] Loss: 10.01799 (QuantReg: 22.50723) QuantErr: 22.50723 batch_time=34.81228
Train Epoch: 1 [12/250 1536/32000 (5%)] Loss: 9.77463 (QuantReg: 22.58473) QuantErr: 22.58473 batch_time=0.96884
Train Epoch: 1 [23/250 2944/32000 (9%)] Loss: 9.73438 (QuantReg: 22.63482) QuantErr: 22.63482 batch_time=1.00548
Train Epoch: 1 [34/250 4352/32000 (14%)] Loss: 9.71679 (QuantReg: 22.64890) QuantErr: 22.64890 batch_time=0.93058
Train Epoch: 1 [45/250 5760/32000 (18%)] Loss: 9.71577 (QuantReg: 22.63837) QuantErr: 22.63837 batch_time=0.93643
Train Epoch: 1 [56/250 7168/32000 (22%)] Loss: 9.70845 (QuantReg: 22.62885) QuantErr: 22.62885 batch_time=0.96476
Train Epoch: 1 [67/250 8576/32000 (27%)] Loss: 9.69755 (QuantReg: 22.62171) QuantErr: 22.62171 batch_time=10.24880
Train Epoch: 1 [78/250 9984/32000 (31%)] Loss: 9.68985 (QuantReg: 22.63718) QuantErr: 22.63718 batch_time=0.93780
Train Epoch: 1 [89/250 11392/32000 (36%)] Loss: 9.62693 (QuantReg: 22.70801) QuantErr: 22.70801 batch_time=0.97945
Train Epoch: 1 [100/250 12800/32000 (40%)] Loss: 9.63578 (QuantReg: 22.69494) QuantErr: 22.69494 batch_time=1.28777
Train Epoch: 1 [111/250 14208/32000 (44%)] Loss: 9.50666 (QuantReg: 22.74500) QuantErr: 22.74500 batch_time=0.96756
Train Epoch: 1 [122/250 15616/32000 (49%)] Loss: 9.25584 (QuantReg: 22.71401) QuantErr: 22.71401 batch_time=0.85694
Train Epoch: 1 [133/250 17024/32000 (53%)] Loss: 9.15487 (QuantReg: 22.71056) QuantErr: 22.71056 batch_time=1.61352
Train Epoch: 1 [144/250 18432/32000 (58%)] Loss: 8.98874 (QuantReg: 22.73242) QuantErr: 22.73242 batch_time=0.91312
Train Epoch: 1 [155/250 19840/32000 (62%)] Loss: 8.60274 (QuantReg: 22.77786) QuantErr: 22.77786 batch_time=0.88188
Train Epoch: 1 [166/250 21248/32000 (66%)] Loss: 8.18565 (QuantReg: 22.83610) QuantErr: 22.83610 batch_time=0.97522
Train Epoch: 1 [177/250 22656/32000 (71%)] Loss: 7.72776 (QuantReg: 22.79393) QuantErr: 22.79393 batch_time=0.99146
Train Epoch: 1 [188/250 24064/32000 (75%)] Loss: 8.05658 (QuantReg: 22.73022) QuantErr: 22.73022 batch_time=0.92927
Train Epoch: 1 [199/250 25472/32000 (80%)] Loss: 7.48474 (QuantReg: 22.77823) QuantErr: 22.77823 batch_time=0.99301
Train Epoch: 1 [210/250 26880/32000 (84%)] Loss: 7.43995 (QuantReg: 22.74429) QuantErr: 22.74429 batch_time=0.98067
Train Epoch: 1 [221/250 28288/32000 (88%)] Loss: 7.24783 (QuantReg: 22.77491) QuantErr: 22.77491 batch_time=0.99212
Train Epoch: 1 [232/250 29696/32000 (93%)] Loss: 6.88897 (QuantReg: 22.77134) QuantErr: 22.77134 batch_time=0.94361
Train Epoch: 1 [243/250 31104/32000 (97%)] Loss: 6.76353 (QuantReg: 22.71180) QuantErr: 22.71180 batch_time=1.47008
Train Epoch: 1 codebook_update_time=2.32706
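Note on reading the per-step lines: each `Train Epoch` line carries the epoch, step counter, sample counter, loss, quantization terms and batch time in a fixed layout. A minimal parsing sketch follows, assuming that layout holds for the whole file. The names `TRAIN_LINE` and `parse_train_line` are hypothetical and are not part of the HCQ code.

```python
import re

# Pattern inferred from the step lines in this log; an assumption, not a spec.
TRAIN_LINE = re.compile(
    r"Train Epoch: (?P<epoch>\d+) "
    r"\[(?P<step>\d+)/(?P<steps>\d+) (?P<seen>\d+)/(?P<total>\d+) \(\d+%\)\] "
    r"Loss: (?P<loss>[\d.]+) \(QuantReg: (?P<quant_reg>[\d.]+)\) "
    r"QuantErr: (?P<quant_err>[\d.]+) batch_time=(?P<batch_time>[\d.]+)"
)


def parse_train_line(line):
    """Return a dict of parsed fields, or None if the line is not a step line."""
    match = TRAIN_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    int_keys = {"epoch", "step", "steps", "seen", "total"}
    return {k: int(v) if k in int_keys else float(v) for k, v in fields.items()}


sample = (
    "Train Epoch: 1 [1/250 128/32000 (0%)] Loss: 10.01799 "
    "(QuantReg: 22.50723) QuantErr: 22.50723 batch_time=34.81228"
)
print(parse_train_line(sample))
```

Lines such as `Train Epoch: 1 codebook_update_time=2.32706` do not match the pattern and yield `None`, so they can be filtered out or handled separately.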
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch1.pth ...
Done in 11.364s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch1.pth ...
Done in 22.164s
epoch : 1
loss : 8.78908115196228
quant_reg : 22.709373641967773
quant_err : 22.709373641967773
learning_rate : 2e-05
n_samples : 32000
n_steps : 250
MSRVTT_jsfusion_test/t2v_metrics/R1: 1.1
MSRVTT_jsfusion_test/t2v_metrics/R5: 9.6
MSRVTT_jsfusion_test/t2v_metrics/R10: 17.1
MSRVTT_jsfusion_test/t2v_metrics/R50: 49.9
MSRVTT_jsfusion_test/t2v_metrics/MedR: 51.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 105.679
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 5.652232391128106
MSRVTT_jsfusion_test/v2t_metrics/R1: 1.8
MSRVTT_jsfusion_test/v2t_metrics/R5: 8.2
MSRVTT_jsfusion_test/v2t_metrics/R10: 16.2
MSRVTT_jsfusion_test/v2t_metrics/R50: 49.0
MSRVTT_jsfusion_test/v2t_metrics/MedR: 53.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 109.039
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 6.206791032688915
mnt_best : 5.652232391128106
not_improved_count: 0
Train Epoch: 2 [1/250 128/32000 (0%)] Loss: 7.01702 (QuantReg: 3.59413) QuantErr: 3.59413 batch_time=32.75300
Train Epoch: 2 [12/250 1536/32000 (5%)] Loss: 6.90289 (QuantReg: 3.77754) QuantErr: 3.77754 batch_time=0.98641
Train Epoch: 2 [23/250 2944/32000 (9%)] Loss: 6.44556 (QuantReg: 4.57808) QuantErr: 4.57808 batch_time=2.70837
Train Epoch: 2 [34/250 4352/32000 (14%)] Loss: 6.13177 (QuantReg: 5.49128) QuantErr: 5.49128 batch_time=1.00531
Train Epoch: 2 [45/250 5760/32000 (18%)] Loss: 6.14934 (QuantReg: 6.06904) QuantErr: 6.06904 batch_time=0.92511
Train Epoch: 2 [56/250 7168/32000 (22%)] Loss: 5.84079 (QuantReg: 6.98664) QuantErr: 6.98664 batch_time=1.00397
Train Epoch: 2 [67/250 8576/32000 (27%)] Loss: 5.61914 (QuantReg: 7.20684) QuantErr: 7.20684 batch_time=0.91712
Train Epoch: 2 [78/250 9984/32000 (31%)] Loss: 5.98795 (QuantReg: 10.03718) QuantErr: 10.03718 batch_time=0.92204
Train Epoch: 2 [89/250 11392/32000 (36%)] Loss: 5.08508 (QuantReg: 9.38529) QuantErr: 9.38529 batch_time=0.96653
Train Epoch: 2 [100/250 12800/32000 (40%)] Loss: 5.45357 (QuantReg: 9.84642) QuantErr: 9.84642 batch_time=0.98214
Train Epoch: 2 [111/250 14208/32000 (44%)] Loss: 5.42742 (QuantReg: 9.84158) QuantErr: 9.84158 batch_time=1.03121
Train Epoch: 2 [122/250 15616/32000 (49%)] Loss: 5.03619 (QuantReg: 10.40193) QuantErr: 10.40193 batch_time=0.98885
Train Epoch: 2 [133/250 17024/32000 (53%)] Loss: 5.08599 (QuantReg: 10.49106) QuantErr: 10.49106 batch_time=0.98140
Train Epoch: 2 [144/250 18432/32000 (58%)] Loss: 5.18431 (QuantReg: 12.86250) QuantErr: 12.86250 batch_time=0.97506
Train Epoch: 2 [155/250 19840/32000 (62%)] Loss: 5.30022 (QuantReg: 12.02941) QuantErr: 12.02941 batch_time=0.95761
Train Epoch: 2 [166/250 21248/32000 (66%)] Loss: 4.56831 (QuantReg: 14.11454) QuantErr: 14.11454 batch_time=1.00046
Train Epoch: 2 [177/250 22656/32000 (71%)] Loss: 4.74184 (QuantReg: 12.51161) QuantErr: 12.51161 batch_time=0.92933
Train Epoch: 2 [188/250 24064/32000 (75%)] Loss: 4.95900 (QuantReg: 13.35147) QuantErr: 13.35147 batch_time=0.95013
Train Epoch: 2 [199/250 25472/32000 (80%)] Loss: 5.02757 (QuantReg: 12.85251) QuantErr: 12.85251 batch_time=1.49267
Train Epoch: 2 [210/250 26880/32000 (84%)] Loss: 4.92415 (QuantReg: 14.26878) QuantErr: 14.26878 batch_time=6.91202
Train Epoch: 2 [221/250 28288/32000 (88%)] Loss: 4.75378 (QuantReg: 14.55404) QuantErr: 14.55404 batch_time=0.94807
Train Epoch: 2 [232/250 29696/32000 (93%)] Loss: 4.63703 (QuantReg: 14.80769) QuantErr: 14.80769 batch_time=0.93746
Train Epoch: 2 [243/250 31104/32000 (97%)] Loss: 4.55476 (QuantReg: 14.50089) QuantErr: 14.50089 batch_time=0.90195
Train Epoch: 2 codebook_update_time=1.83773
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch2.pth ...
Done in 11.612s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch2.pth ...
Done in 23.464s
removing stale ckpt [epoch 1] [took 0.00s]
removing stale ckpt [epoch 0] [took 0.00s]
epoch : 2
loss : 5.424358850479126
quant_reg : 10.23915901851654
quant_err : 10.23915901851654
learning_rate : 1.9e-05
n_samples : 64000
n_steps : 500
MSRVTT_jsfusion_test/t2v_metrics/R1: 6.9
MSRVTT_jsfusion_test/t2v_metrics/R5: 24.7
MSRVTT_jsfusion_test/t2v_metrics/R10: 38.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 73.9
MSRVTT_jsfusion_test/t2v_metrics/MedR: 18.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 51.525
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 18.672528700292037
MSRVTT_jsfusion_test/v2t_metrics/R1: 6.8
MSRVTT_jsfusion_test/v2t_metrics/R5: 25.0
MSRVTT_jsfusion_test/v2t_metrics/R10: 38.3
MSRVTT_jsfusion_test/v2t_metrics/R50: 73.4
MSRVTT_jsfusion_test/v2t_metrics/MedR: 17.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 52.911
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 18.673077446590145
mnt_best : 18.672528700292037
not_improved_count: 0
Train Epoch: 3 [1/250 128/32000 (0%)] Loss: 4.53523 (QuantReg: 8.12148) QuantErr: 8.12148 batch_time=32.39993
Train Epoch: 3 [12/250 1536/32000 (5%)] Loss: 4.07147 (QuantReg: 8.23518) QuantErr: 8.23518 batch_time=0.97431
Train Epoch: 3 [23/250 2944/32000 (9%)] Loss: 4.70174 (QuantReg: 8.27071) QuantErr: 8.27071 batch_time=0.96165
Train Epoch: 3 [34/250 4352/32000 (14%)] Loss: 4.73052 (QuantReg: 8.66160) QuantErr: 8.66160 batch_time=0.96992
Train Epoch: 3 [45/250 5760/32000 (18%)] Loss: 4.53682 (QuantReg: 9.09087) QuantErr: 9.09087 batch_time=1.00190
Train Epoch: 3 [56/250 7168/32000 (22%)] Loss: 4.00258 (QuantReg: 9.56769) QuantErr: 9.56769 batch_time=0.95984
Train Epoch: 3 [67/250 8576/32000 (27%)] Loss: 4.08891 (QuantReg: 8.97258) QuantErr: 8.97258 batch_time=5.35488
Train Epoch: 3 [78/250 9984/32000 (31%)] Loss: 4.02429 (QuantReg: 9.63496) QuantErr: 9.63496 batch_time=0.96027
Train Epoch: 3 [89/250 11392/32000 (36%)] Loss: 3.93850 (QuantReg: 9.53036) QuantErr: 9.53036 batch_time=0.96930
Train Epoch: 3 [100/250 12800/32000 (40%)] Loss: 4.34660 (QuantReg: 9.33809) QuantErr: 9.33809 batch_time=0.97623
Train Epoch: 3 [111/250 14208/32000 (44%)] Loss: 4.63871 (QuantReg: 10.57006) QuantErr: 10.57006 batch_time=0.97366
Train Epoch: 3 [122/250 15616/32000 (49%)] Loss: 4.09245 (QuantReg: 9.83315) QuantErr: 9.83315 batch_time=0.94604
Train Epoch: 3 [133/250 17024/32000 (53%)] Loss: 4.28288 (QuantReg: 10.63818) QuantErr: 10.63818 batch_time=1.02557
Train Epoch: 3 [144/250 18432/32000 (58%)] Loss: 3.74670 (QuantReg: 10.10635) QuantErr: 10.10635 batch_time=1.05611
Train Epoch: 3 [155/250 19840/32000 (62%)] Loss: 4.05089 (QuantReg: 11.13665) QuantErr: 11.13665 batch_time=1.04554
Train Epoch: 3 [166/250 21248/32000 (66%)] Loss: 4.06136 (QuantReg: 10.62087) QuantErr: 10.62087 batch_time=0.95320
Train Epoch: 3 [177/250 22656/32000 (71%)] Loss: 3.67122 (QuantReg: 10.96291) QuantErr: 10.96291 batch_time=0.90715
Train Epoch: 3 [188/250 24064/32000 (75%)] Loss: 3.84146 (QuantReg: 10.78650) QuantErr: 10.78650 batch_time=0.92799
Train Epoch: 3 [199/250 25472/32000 (80%)] Loss: 3.37843 (QuantReg: 11.78949) QuantErr: 11.78949 batch_time=0.97374
Train Epoch: 3 [210/250 26880/32000 (84%)] Loss: 3.76073 (QuantReg: 11.21923) QuantErr: 11.21923 batch_time=0.99026
Train Epoch: 3 [221/250 28288/32000 (88%)] Loss: 3.87638 (QuantReg: 11.69628) QuantErr: 11.69628 batch_time=0.92347
Train Epoch: 3 [232/250 29696/32000 (93%)] Loss: 4.63682 (QuantReg: 11.54513) QuantErr: 11.54513 batch_time=0.94461
Train Epoch: 3 [243/250 31104/32000 (97%)] Loss: 3.48229 (QuantReg: 12.09723) QuantErr: 12.09723 batch_time=0.94675
Train Epoch: 3 codebook_update_time=1.87231
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch3.pth ...
Done in 11.451s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch3.pth ...
Done in 22.856s
removing stale ckpt [epoch 2] [took 0.03s]
epoch : 3
loss : 4.098606305122376
quant_reg : 10.114148504257201
quant_err : 10.114148504257201
learning_rate : 1.805e-05
n_samples : 96000
n_steps : 750
MSRVTT_jsfusion_test/t2v_metrics/R1: 9.8
MSRVTT_jsfusion_test/t2v_metrics/R5: 32.5
MSRVTT_jsfusion_test/t2v_metrics/R10: 47.5
MSRVTT_jsfusion_test/t2v_metrics/R50: 80.3
MSRVTT_jsfusion_test/t2v_metrics/MedR: 11.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 40.018
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 24.73248088514102
MSRVTT_jsfusion_test/v2t_metrics/R1: 10.5
MSRVTT_jsfusion_test/v2t_metrics/R5: 33.3
MSRVTT_jsfusion_test/v2t_metrics/R10: 46.9
MSRVTT_jsfusion_test/v2t_metrics/R50: 79.7
MSRVTT_jsfusion_test/v2t_metrics/MedR: 12.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 41.1415
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 25.405951134132213
mnt_best : 24.73248088514102
not_improved_count: 0
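Note on the learning rate: the `learning_rate` values logged so far (2e-05 at epochs 0 and 1, 1.9e-05 at epoch 2, 1.805e-05 at epoch 3) are consistent with multiplying by 0.95 once per epoch after the first. The sketch below encodes that reading. The decay factor and the function name `lr_at_epoch` are inferred from the logged numbers, not taken from the experiment configuration, so treat them as assumptions.

```python
def lr_at_epoch(epoch, base_lr=2e-05, gamma=0.95):
    """Learning rate reported in the summary for `epoch`.

    Assumes exponential decay by `gamma` per epoch, with epochs 0 and 1
    both reporting `base_lr` (inferred from the log, not from the config).
    """
    return base_lr * gamma ** max(epoch - 1, 0)


for epoch in range(4):
    print(epoch, lr_at_epoch(epoch))
```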
Train Epoch: 4 [1/250 128/32000 (0%)] Loss: 4.41904 (QuantReg: 9.02146) QuantErr: 9.02146 batch_time=44.57084
Train Epoch: 4 [12/250 1536/32000 (5%)] Loss: 3.95179 (QuantReg: 9.57405) QuantErr: 9.57405 batch_time=1.10719
Train Epoch: 4 [23/250 2944/32000 (9%)] Loss: 3.51215 (QuantReg: 9.01416) QuantErr: 9.01416 batch_time=0.90060
Train Epoch: 4 [34/250 4352/32000 (14%)] Loss: 3.74746 (QuantReg: 9.61735) QuantErr: 9.61735 batch_time=0.96038
Train Epoch: 4 [45/250 5760/32000 (18%)] Loss: 3.67254 (QuantReg: 9.76642) QuantErr: 9.76642 batch_time=0.93926
Train Epoch: 4 [56/250 7168/32000 (22%)] Loss: 3.75506 (QuantReg: 10.25710) QuantErr: 10.25710 batch_time=0.98238
Train Epoch: 4 [67/250 8576/32000 (27%)] Loss: 3.92846 (QuantReg: 9.74849) QuantErr: 9.74849 batch_time=0.97028
Train Epoch: 4 [78/250 9984/32000 (31%)] Loss: 3.43258 (QuantReg: 9.75816) QuantErr: 9.75816 batch_time=0.96509
Train Epoch: 4 [89/250 11392/32000 (36%)] Loss: 3.40402 (QuantReg: 10.56158) QuantErr: 10.56158 batch_time=0.93015
Train Epoch: 4 [100/250 12800/32000 (40%)] Loss: 3.06566 (QuantReg: 10.94719) QuantErr: 10.94719 batch_time=0.97041
Train Epoch: 4 [111/250 14208/32000 (44%)] Loss: 3.33523 (QuantReg: 10.51681) QuantErr: 10.51681 batch_time=0.86529
Train Epoch: 4 [122/250 15616/32000 (49%)] Loss: 3.09144 (QuantReg: 10.54238) QuantErr: 10.54238 batch_time=0.97562
Train Epoch: 4 [133/250 17024/32000 (53%)] Loss: 3.37573 (QuantReg: 10.25210) QuantErr: 10.25210 batch_time=0.91491
Train Epoch: 4 [144/250 18432/32000 (58%)] Loss: 3.61729 (QuantReg: 10.64485) QuantErr: 10.64485 batch_time=0.93936
Train Epoch: 4 [155/250 19840/32000 (62%)] Loss: 3.84124 (QuantReg: 10.61423) QuantErr: 10.61423 batch_time=0.94563
Train Epoch: 4 [166/250 21248/32000 (66%)] Loss: 3.59680 (QuantReg: 11.01027) QuantErr: 11.01027 batch_time=0.91574
Train Epoch: 4 [177/250 22656/32000 (71%)] Loss: 3.31980 (QuantReg: 10.59215) QuantErr: 10.59215 batch_time=0.93221
Train Epoch: 4 [188/250 24064/32000 (75%)] Loss: 3.13286 (QuantReg: 11.25523) QuantErr: 11.25523 batch_time=1.20074
Train Epoch: 4 [199/250 25472/32000 (80%)] Loss: 2.93578 (QuantReg: 11.06846) QuantErr: 11.06846 batch_time=1.10409
Train Epoch: 4 [210/250 26880/32000 (84%)] Loss: 3.16489 (QuantReg: 11.34843) QuantErr: 11.34843 batch_time=0.90699
Train Epoch: 4 [221/250 28288/32000 (88%)] Loss: 3.17752 (QuantReg: 11.70614) QuantErr: 11.70614 batch_time=0.95648
Train Epoch: 4 [232/250 29696/32000 (93%)] Loss: 3.22727 (QuantReg: 11.43022) QuantErr: 11.43022 batch_time=0.92019
Train Epoch: 4 [243/250 31104/32000 (97%)] Loss: 3.04438 (QuantReg: 11.56193) QuantErr: 11.56193 batch_time=0.91485
Train Epoch: 4 codebook_update_time=1.70999
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch4.pth ...
Done in 29.248s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch4.pth ...
Done in 40.458s
removing stale ckpt [epoch 3] [took 0.00s]
epoch : 4
loss : 3.5073723478317262
quant_reg : 10.505402263641358
quant_err : 10.505402263641358
learning_rate : 1.71475e-05
n_samples : 128000
n_steps : 1000
MSRVTT_jsfusion_test/t2v_metrics/R1: 13.0
MSRVTT_jsfusion_test/t2v_metrics/R5: 36.0
MSRVTT_jsfusion_test/t2v_metrics/R10: 51.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 82.0
MSRVTT_jsfusion_test/t2v_metrics/MedR: 10.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 34.919
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 28.82959919863306
MSRVTT_jsfusion_test/v2t_metrics/R1: 11.6
MSRVTT_jsfusion_test/v2t_metrics/R5: 36.1
MSRVTT_jsfusion_test/v2t_metrics/R10: 51.9
MSRVTT_jsfusion_test/v2t_metrics/R50: 82.5
MSRVTT_jsfusion_test/v2t_metrics/MedR: 10.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 34.678
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 27.90685203167781
mnt_best : 28.82959919863306
not_improved_count: 0
Train Epoch: 5 [1/250 128/32000 (0%)] Loss: 3.17781 (QuantReg: 11.10160) QuantErr: 11.10160 batch_time=29.27460
Train Epoch: 5 [12/250 1536/32000 (5%)] Loss: 3.54663 (QuantReg: 10.32556) QuantErr: 10.32556 batch_time=0.99139
Train Epoch: 5 [23/250 2944/32000 (9%)] Loss: 3.42772 (QuantReg: 10.79662) QuantErr: 10.79662 batch_time=0.90961
Train Epoch: 5 [34/250 4352/32000 (14%)] Loss: 3.14198 (QuantReg: 10.99249) QuantErr: 10.99249 batch_time=0.91040
Train Epoch: 5 [45/250 5760/32000 (18%)] Loss: 3.52512 (QuantReg: 10.33237) QuantErr: 10.33237 batch_time=0.92445
Train Epoch: 5 [56/250 7168/32000 (22%)] Loss: 3.21850 (QuantReg: 10.85848) QuantErr: 10.85848 batch_time=0.92558
Train Epoch: 5 [67/250 8576/32000 (27%)] Loss: 3.34921 (QuantReg: 10.32959) QuantErr: 10.32959 batch_time=1.06501
Train Epoch: 5 [78/250 9984/32000 (31%)] Loss: 2.88913 (QuantReg: 10.97242) QuantErr: 10.97242 batch_time=0.96011
Train Epoch: 5 [89/250 11392/32000 (36%)] Loss: 3.17454 (QuantReg: 10.94588) QuantErr: 10.94588 batch_time=0.92760
Train Epoch: 5 [100/250 12800/32000 (40%)] Loss: 2.86925 (QuantReg: 10.87292) QuantErr: 10.87292 batch_time=1.19848
Train Epoch: 5 [111/250 14208/32000 (44%)] Loss: 2.95381 (QuantReg: 11.36944) QuantErr: 11.36944 batch_time=0.99583
Train Epoch: 5 [122/250 15616/32000 (49%)] Loss: 3.04091 (QuantReg: 11.23054) QuantErr: 11.23054 batch_time=0.89951
Train Epoch: 5 [133/250 17024/32000 (53%)] Loss: 2.78175 (QuantReg: 10.96769) QuantErr: 10.96769 batch_time=0.95991
Train Epoch: 5 [144/250 18432/32000 (58%)] Loss: 2.83699 (QuantReg: 11.68920) QuantErr: 11.68920 batch_time=0.98194
Train Epoch: 5 [155/250 19840/32000 (62%)] Loss: 2.80254 (QuantReg: 11.46029) QuantErr: 11.46029 batch_time=0.88352
Train Epoch: 5 [166/250 21248/32000 (66%)] Loss: 3.12340 (QuantReg: 11.73655) QuantErr: 11.73655 batch_time=0.94855
Train Epoch: 5 [177/250 22656/32000 (71%)] Loss: 2.74646 (QuantReg: 11.74212) QuantErr: 11.74212 batch_time=0.95308
Train Epoch: 5 [188/250 24064/32000 (75%)] Loss: 2.88388 (QuantReg: 11.74543) QuantErr: 11.74543 batch_time=0.96605
Train Epoch: 5 [199/250 25472/32000 (80%)] Loss: 3.51582 (QuantReg: 11.75576) QuantErr: 11.75576 batch_time=0.95817
Train Epoch: 5 [210/250 26880/32000 (84%)] Loss: 3.03463 (QuantReg: 11.52388) QuantErr: 11.52388 batch_time=2.01277
Train Epoch: 5 [221/250 28288/32000 (88%)] Loss: 3.12859 (QuantReg: 11.53335) QuantErr: 11.53335 batch_time=0.90964
Train Epoch: 5 [232/250 29696/32000 (93%)] Loss: 2.50921 (QuantReg: 11.71417) QuantErr: 11.71417 batch_time=0.92327
Train Epoch: 5 [243/250 31104/32000 (97%)] Loss: 3.70886 (QuantReg: 11.97993) QuantErr: 11.97993 batch_time=0.93461
Train Epoch: 5 codebook_update_time=2.00216
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch5.pth ...
Done in 11.348s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch5.pth ...
Done in 24.708s
removing stale ckpt [epoch 4] [took 0.00s]
epoch : 5
loss : 3.103938196182251
quant_reg : 11.28231481552124
quant_err : 11.28231481552124
learning_rate : 1.6290125e-05
n_samples : 160000
n_steps : 1250
MSRVTT_jsfusion_test/t2v_metrics/R1: 15.5
MSRVTT_jsfusion_test/t2v_metrics/R5: 39.4
MSRVTT_jsfusion_test/t2v_metrics/R10: 53.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 83.4
MSRVTT_jsfusion_test/t2v_metrics/MedR: 9.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 34.187
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 31.908999272420818
MSRVTT_jsfusion_test/v2t_metrics/R1: 14.4
MSRVTT_jsfusion_test/v2t_metrics/R5: 38.3
MSRVTT_jsfusion_test/v2t_metrics/R10: 52.8
MSRVTT_jsfusion_test/v2t_metrics/R50: 84.7
MSRVTT_jsfusion_test/v2t_metrics/MedR: 9.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 31.719
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 30.76557687893297
mnt_best : 31.908999272420818
not_improved_count: 0
Train Epoch: 6 [1/250 128/32000 (0%)] Loss: 2.46487 (QuantReg: 11.07335) QuantErr: 11.07335 batch_time=37.01370
Train Epoch: 6 [12/250 1536/32000 (5%)] Loss: 2.73939 (QuantReg: 11.70261) QuantErr: 11.70261 batch_time=0.88205
Train Epoch: 6 [23/250 2944/32000 (9%)] Loss: 2.54660 (QuantReg: 11.77385) QuantErr: 11.77385 batch_time=1.53580
Train Epoch: 6 [34/250 4352/32000 (14%)] Loss: 3.02294 (QuantReg: 11.61400) QuantErr: 11.61400 batch_time=0.95926
Train Epoch: 6 [45/250 5760/32000 (18%)] Loss: 3.33153 (QuantReg: 11.79185) QuantErr: 11.79185 batch_time=0.93230
Train Epoch: 6 [56/250 7168/32000 (22%)] Loss: 2.87068 (QuantReg: 11.54889) QuantErr: 11.54889 batch_time=0.96097
Train Epoch: 6 [67/250 8576/32000 (27%)] Loss: 2.99848 (QuantReg: 12.17841) QuantErr: 12.17841 batch_time=0.97459
Train Epoch: 6 [78/250 9984/32000 (31%)] Loss: 2.86316 (QuantReg: 11.85867) QuantErr: 11.85867 batch_time=0.92374
Train Epoch: 6 [89/250 11392/32000 (36%)] Loss: 2.78977 (QuantReg: 11.71375) QuantErr: 11.71375 batch_time=1.01895
Train Epoch: 6 [100/250 12800/32000 (40%)] Loss: 2.57292 (QuantReg: 12.01605) QuantErr: 12.01605 batch_time=0.97451
Train Epoch: 6 [111/250 14208/32000 (44%)] Loss: 3.25211 (QuantReg: 11.82637) QuantErr: 11.82637 batch_time=0.91109
Train Epoch: 6 [122/250 15616/32000 (49%)] Loss: 2.76665 (QuantReg: 11.99751) QuantErr: 11.99751 batch_time=0.96865
Train Epoch: 6 [133/250 17024/32000 (53%)] Loss: 3.03396 (QuantReg: 11.69678) QuantErr: 11.69678 batch_time=0.94832
Train Epoch: 6 [144/250 18432/32000 (58%)] Loss: 2.77629 (QuantReg: 11.59154) QuantErr: 11.59154 batch_time=0.93235
Train Epoch: 6 [155/250 19840/32000 (62%)] Loss: 2.47759 (QuantReg: 12.70865) QuantErr: 12.70865 batch_time=0.94280
Train Epoch: 6 [166/250 21248/32000 (66%)] Loss: 2.83831 (QuantReg: 11.74438) QuantErr: 11.74438 batch_time=0.93798
Train Epoch: 6 [177/250 22656/32000 (71%)] Loss: 3.05024 (QuantReg: 12.01515) QuantErr: 12.01515 batch_time=0.94891
Train Epoch: 6 [188/250 24064/32000 (75%)] Loss: 2.84345 (QuantReg: 11.95888) QuantErr: 11.95888 batch_time=0.92935
Train Epoch: 6 [199/250 25472/32000 (80%)] Loss: 2.54374 (QuantReg: 12.39259) QuantErr: 12.39259 batch_time=0.91901
Train Epoch: 6 [210/250 26880/32000 (84%)] Loss: 2.59466 (QuantReg: 12.75879) QuantErr: 12.75879 batch_time=0.91943
Train Epoch: 6 [221/250 28288/32000 (88%)] Loss: 3.12606 (QuantReg: 12.19607) QuantErr: 12.19607 batch_time=0.98761
Train Epoch: 6 [232/250 29696/32000 (93%)] Loss: 2.97998 (QuantReg: 12.30149) QuantErr: 12.30149 batch_time=1.00170
Train Epoch: 6 [243/250 31104/32000 (97%)] Loss: 2.56905 (QuantReg: 12.22343) QuantErr: 12.22343 batch_time=0.94753
Train Epoch: 6 codebook_update_time=1.85564
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch6.pth ...
Done in 11.366s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch6.pth ...
Done in 22.482s
removing stale ckpt [epoch 5] [took 0.00s]
epoch : 6
loss : 2.845298850059509
quant_reg : 11.994772911071777
quant_err : 11.994772911071777
learning_rate : 1.547561875e-05
n_samples : 192000
n_steps : 1500
MSRVTT_jsfusion_test/t2v_metrics/R1: 16.3
MSRVTT_jsfusion_test/t2v_metrics/R5: 42.2
MSRVTT_jsfusion_test/t2v_metrics/R10: 57.5
MSRVTT_jsfusion_test/t2v_metrics/R50: 85.1
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 31.702
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 34.07134671876686
MSRVTT_jsfusion_test/v2t_metrics/R1: 15.6
MSRVTT_jsfusion_test/v2t_metrics/R5: 42.6
MSRVTT_jsfusion_test/v2t_metrics/R10: 56.2
MSRVTT_jsfusion_test/v2t_metrics/R50: 84.3
MSRVTT_jsfusion_test/v2t_metrics/MedR: 8.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 30.6805
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 33.42644349499048
mnt_best : 34.07134671876686
not_improved_count: 0
Train Epoch: 7 [1/250 128/32000 (0%)] Loss: 2.71637 (QuantReg: 12.19452) QuantErr: 12.19452 batch_time=34.66906
Train Epoch: 7 [12/250 1536/32000 (5%)] Loss: 2.85204 (QuantReg: 12.31967) QuantErr: 12.31967 batch_time=1.00488
Train Epoch: 7 [23/250 2944/32000 (9%)] Loss: 2.29233 (QuantReg: 12.18498) QuantErr: 12.18498 batch_time=0.89078
Train Epoch: 7 [34/250 4352/32000 (14%)] Loss: 2.42701 (QuantReg: 12.39354) QuantErr: 12.39354 batch_time=0.94108
Train Epoch: 7 [45/250 5760/32000 (18%)] Loss: 2.88019 (QuantReg: 11.88167) QuantErr: 11.88167 batch_time=0.90732
Train Epoch: 7 [56/250 7168/32000 (22%)] Loss: 2.44305 (QuantReg: 12.50069) QuantErr: 12.50069 batch_time=0.90071
Train Epoch: 7 [67/250 8576/32000 (27%)] Loss: 2.43266 (QuantReg: 12.15127) QuantErr: 12.15127 batch_time=0.95499
Train Epoch: 7 [78/250 9984/32000 (31%)] Loss: 2.69548 (QuantReg: 12.58985) QuantErr: 12.58985 batch_time=0.92488
Train Epoch: 7 [89/250 11392/32000 (36%)] Loss: 2.58117 (QuantReg: 12.77910) QuantErr: 12.77910 batch_time=0.94150
Train Epoch: 7 [100/250 12800/32000 (40%)] Loss: 2.20056 (QuantReg: 12.15302) QuantErr: 12.15302 batch_time=1.16455
Train Epoch: 7 [111/250 14208/32000 (44%)] Loss: 2.48988 (QuantReg: 12.76624) QuantErr: 12.76624 batch_time=1.11312
Train Epoch: 7 [122/250 15616/32000 (49%)] Loss: 2.36966 (QuantReg: 12.66845) QuantErr: 12.66845 batch_time=0.97560
Train Epoch: 7 [133/250 17024/32000 (53%)] Loss: 2.79817 (QuantReg: 13.21059) QuantErr: 13.21059 batch_time=0.92996
Train Epoch: 7 [144/250 18432/32000 (58%)] Loss: 3.09839 (QuantReg: 12.28126) QuantErr: 12.28126 batch_time=0.94610
Train Epoch: 7 [155/250 19840/32000 (62%)] Loss: 2.71945 (QuantReg: 12.65611) QuantErr: 12.65611 batch_time=0.95961
Train Epoch: 7 [166/250 21248/32000 (66%)] Loss: 3.25871 (QuantReg: 13.28324) QuantErr: 13.28324 batch_time=1.06571
Train Epoch: 7 [177/250 22656/32000 (71%)] Loss: 2.57939 (QuantReg: 13.06782) QuantErr: 13.06782 batch_time=0.94125
Train Epoch: 7 [188/250 24064/32000 (75%)] Loss: 3.22869 (QuantReg: 12.85829) QuantErr: 12.85829 batch_time=1.00216
Train Epoch: 7 [199/250 25472/32000 (80%)] Loss: 2.36239 (QuantReg: 12.92698) QuantErr: 12.92698 batch_time=0.96886
Train Epoch: 7 [210/250 26880/32000 (84%)] Loss: 2.63610 (QuantReg: 12.69041) QuantErr: 12.69041 batch_time=0.90592
Train Epoch: 7 [221/250 28288/32000 (88%)] Loss: 2.67159 (QuantReg: 12.87861) QuantErr: 12.87861 batch_time=0.90005
Train Epoch: 7 [232/250 29696/32000 (93%)] Loss: 2.47842 (QuantReg: 13.49173) QuantErr: 13.49173 batch_time=0.90290
Train Epoch: 7 [243/250 31104/32000 (97%)] Loss: 2.62980 (QuantReg: 12.74806) QuantErr: 12.74806 batch_time=0.93308
Train Epoch: 7 codebook_update_time=1.67448
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch7.pth ...
Done in 23.893s
removing stale ckpt [epoch 6] [took 0.00s]
epoch : 7
loss : 2.6123549194335935
quant_reg : 12.605710403442384
quant_err : 12.605710403442384
learning_rate : 1.47018378125e-05
n_samples : 224000
n_steps : 1750
MSRVTT_jsfusion_test/t2v_metrics/R1: 16.0
MSRVTT_jsfusion_test/t2v_metrics/R5: 42.2
MSRVTT_jsfusion_test/t2v_metrics/R10: 57.3
MSRVTT_jsfusion_test/t2v_metrics/R50: 86.1
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 31.306
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 33.821719639555056
MSRVTT_jsfusion_test/v2t_metrics/R1: 17.0
MSRVTT_jsfusion_test/v2t_metrics/R5: 42.4
MSRVTT_jsfusion_test/v2t_metrics/R10: 55.6
MSRVTT_jsfusion_test/v2t_metrics/R50: 85.4
MSRVTT_jsfusion_test/v2t_metrics/MedR: 8.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 30.2765
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 34.221301550039655
mnt_best : 34.07134671876686
not_improved_count: 1
Train Epoch: 8 [1/250 128/32000 (0%)] Loss: 2.50877 (QuantReg: 12.63520) QuantErr: 12.63520 batch_time=30.35329
Train Epoch: 8 [12/250 1536/32000 (5%)] Loss: 2.33659 (QuantReg: 12.34136) QuantErr: 12.34136 batch_time=0.91303
Train Epoch: 8 [23/250 2944/32000 (9%)] Loss: 2.48395 (QuantReg: 12.30443) QuantErr: 12.30443 batch_time=0.97640
Train Epoch: 8 [34/250 4352/32000 (14%)] Loss: 2.85807 (QuantReg: 12.60335) QuantErr: 12.60335 batch_time=0.98031
Train Epoch: 8 [45/250 5760/32000 (18%)] Loss: 2.41956 (QuantReg: 12.85806) QuantErr: 12.85806 batch_time=1.04252
Train Epoch: 8 [56/250 7168/32000 (22%)] Loss: 2.72266 (QuantReg: 12.84640) QuantErr: 12.84640 batch_time=0.93676
Train Epoch: 8 [67/250 8576/32000 (27%)] Loss: 2.74498 (QuantReg: 12.81339) QuantErr: 12.81339 batch_time=3.29873
Train Epoch: 8 [78/250 9984/32000 (31%)] Loss: 2.44502 (QuantReg: 13.04124) QuantErr: 13.04124 batch_time=0.91732
Train Epoch: 8 [89/250 11392/32000 (36%)] Loss: 2.32284 (QuantReg: 13.20116) QuantErr: 13.20116 batch_time=0.94816
Train Epoch: 8 [100/250 12800/32000 (40%)] Loss: 2.43589 (QuantReg: 12.85992) QuantErr: 12.85992 batch_time=1.00351
Train Epoch: 8 [111/250 14208/32000 (44%)] Loss: 2.28442 (QuantReg: 12.98008) QuantErr: 12.98008 batch_time=0.98476
Train Epoch: 8 [122/250 15616/32000 (49%)] Loss: 2.68094 (QuantReg: 13.58034) QuantErr: 13.58034 batch_time=0.98002
Train Epoch: 8 [133/250 17024/32000 (53%)] Loss: 2.84370 (QuantReg: 13.17529) QuantErr: 13.17529 batch_time=6.52749
Train Epoch: 8 [144/250 18432/32000 (58%)] Loss: 2.77426 (QuantReg: 12.77742) QuantErr: 12.77742 batch_time=0.89138
Train Epoch: 8 [155/250 19840/32000 (62%)] Loss: 2.67675 (QuantReg: 13.08780) QuantErr: 13.08780 batch_time=1.08885
Train Epoch: 8 [166/250 21248/32000 (66%)] Loss: 2.58307 (QuantReg: 13.09278) QuantErr: 13.09278 batch_time=0.89856
Train Epoch: 8 [177/250 22656/32000 (71%)] Loss: 2.23800 (QuantReg: 13.51679) QuantErr: 13.51679 batch_time=0.96024
Train Epoch: 8 [188/250 24064/32000 (75%)] Loss: 2.79332 (QuantReg: 12.67534) QuantErr: 12.67534 batch_time=0.92204
Train Epoch: 8 [199/250 25472/32000 (80%)] Loss: 2.18047 (QuantReg: 13.40542) QuantErr: 13.40542 batch_time=0.91099
Train Epoch: 8 [210/250 26880/32000 (84%)] Loss: 1.90916 (QuantReg: 13.38227) QuantErr: 13.38227 batch_time=0.96513
Train Epoch: 8 [221/250 28288/32000 (88%)] Loss: 2.58002 (QuantReg: 13.38357) QuantErr: 13.38357 batch_time=0.96817
Train Epoch: 8 [232/250 29696/32000 (93%)] Loss: 2.82008 (QuantReg: 13.10613) QuantErr: 13.10613 batch_time=0.92485
Train Epoch: 8 [243/250 31104/32000 (97%)] Loss: 2.26333 (QuantReg: 13.44009) QuantErr: 13.44009 batch_time=0.94789
Train Epoch: 8 codebook_update_time=1.69230
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch8.pth ...
Done in 15.007s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch8.pth ...
Done in 26.277s
removing stale ckpt [epoch 7] [took 0.00s]
epoch : 8
loss : 2.4663368978500366
quant_reg : 12.98315299987793
quant_err : 12.98315299987793
learning_rate : 1.3966745921874999e-05
n_samples : 256000
n_steps : 2000
MSRVTT_jsfusion_test/t2v_metrics/R1: 16.9
MSRVTT_jsfusion_test/t2v_metrics/R5: 43.6
MSRVTT_jsfusion_test/t2v_metrics/R10: 58.4
MSRVTT_jsfusion_test/t2v_metrics/R50: 85.5
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 29.81
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 35.04252138122261
MSRVTT_jsfusion_test/v2t_metrics/R1: 18.1
MSRVTT_jsfusion_test/v2t_metrics/R5: 43.6
MSRVTT_jsfusion_test/v2t_metrics/R10: 57.0
MSRVTT_jsfusion_test/v2t_metrics/R50: 87.0
MSRVTT_jsfusion_test/v2t_metrics/MedR: 8.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 30.804
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 35.564221513252704
mnt_best : 35.04252138122261
not_improved_count: 0
Train Epoch: 9 [1/250 128/32000 (0%)] Loss: 1.84487 (QuantReg: 13.19439) QuantErr: 13.19439 batch_time=28.90711
Train Epoch: 9 [12/250 1536/32000 (5%)] Loss: 2.25644 (QuantReg: 13.55181) QuantErr: 13.55181 batch_time=0.91249
Train Epoch: 9 [23/250 2944/32000 (9%)] Loss: 2.34637 (QuantReg: 13.47284) QuantErr: 13.47284 batch_time=0.95897
Train Epoch: 9 [34/250 4352/32000 (14%)] Loss: 2.54107 (QuantReg: 13.16794) QuantErr: 13.16794 batch_time=1.32197
Train Epoch: 9 [45/250 5760/32000 (18%)] Loss: 2.24901 (QuantReg: 12.78832) QuantErr: 12.78832 batch_time=0.94984
Train Epoch: 9 [56/250 7168/32000 (22%)] Loss: 2.22648 (QuantReg: 13.51905) QuantErr: 13.51905 batch_time=0.94195
Train Epoch: 9 [67/250 8576/32000 (27%)] Loss: 1.95843 (QuantReg: 13.23476) QuantErr: 13.23476 batch_time=1.84099
Train Epoch: 9 [78/250 9984/32000 (31%)] Loss: 2.26064 (QuantReg: 13.83502) QuantErr: 13.83502 batch_time=0.89368
Train Epoch: 9 [89/250 11392/32000 (36%)] Loss: 2.10003 (QuantReg: 13.09937) QuantErr: 13.09937 batch_time=0.86626
Train Epoch: 9 [100/250 12800/32000 (40%)] Loss: 2.70611 (QuantReg: 13.00881) QuantErr: 13.00881 batch_time=0.99041
Train Epoch: 9 [111/250 14208/32000 (44%)] Loss: 2.19677 (QuantReg: 13.15735) QuantErr: 13.15735 batch_time=0.96739
Train Epoch: 9 [122/250 15616/32000 (49%)] Loss: 1.82602 (QuantReg: 13.62952) QuantErr: 13.62952 batch_time=0.90574
Train Epoch: 9 [133/250 17024/32000 (53%)] Loss: 2.55937 (QuantReg: 13.13681) QuantErr: 13.13681 batch_time=2.45273
Train Epoch: 9 [144/250 18432/32000 (58%)] Loss: 2.24416 (QuantReg: 13.40150) QuantErr: 13.40150 batch_time=0.93711
Train Epoch: 9 [155/250 19840/32000 (62%)] Loss: 2.21582 (QuantReg: 13.39546) QuantErr: 13.39546 batch_time=0.99605
Train Epoch: 9 [166/250 21248/32000 (66%)] Loss: 2.50154 (QuantReg: 13.73148) QuantErr: 13.73148 batch_time=0.99129
Train Epoch: 9 [177/250 22656/32000 (71%)] Loss: 3.14302 (QuantReg: 13.13132) QuantErr: 13.13132 batch_time=0.94032
Train Epoch: 9 [188/250 24064/32000 (75%)] Loss: 2.59679 (QuantReg: 13.22370) QuantErr: 13.22370 batch_time=1.29286
Train Epoch: 9 [199/250 25472/32000 (80%)] Loss: 2.42437 (QuantReg: 13.73668) QuantErr: 13.73668 batch_time=0.97024
Train Epoch: 9 [210/250 26880/32000 (84%)] Loss: 1.96389 (QuantReg: 14.04005) QuantErr: 14.04005 batch_time=0.96865
Train Epoch: 9 [221/250 28288/32000 (88%)] Loss: 1.50417 (QuantReg: 14.18342) QuantErr: 14.18342 batch_time=1.01799
Train Epoch: 9 [232/250 29696/32000 (93%)] Loss: 2.13043 (QuantReg: 14.29761) QuantErr: 14.29761 batch_time=0.94445
Train Epoch: 9 [243/250 31104/32000 (97%)] Loss: 2.42467 (QuantReg: 13.47423) QuantErr: 13.47423 batch_time=0.96066
Train Epoch: 9 codebook_update_time=1.84155
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch9.pth ...
Done in 11.042s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch9.pth ...
Done in 21.521s
removing stale ckpt [epoch 8] [took 0.00s]
epoch : 9
loss : 2.3228577609062193
quant_reg : 13.401194896697998
quant_err : 13.401194896697998
learning_rate : 1.3268408625781248e-05
n_samples : 288000
n_steps : 2250
MSRVTT_jsfusion_test/t2v_metrics/R1: 17.6
MSRVTT_jsfusion_test/t2v_metrics/R5: 44.1
MSRVTT_jsfusion_test/t2v_metrics/R10: 59.5
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.0
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 28.354
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 35.877546914162934
MSRVTT_jsfusion_test/v2t_metrics/R1: 18.6
MSRVTT_jsfusion_test/v2t_metrics/R5: 46.0
MSRVTT_jsfusion_test/v2t_metrics/R10: 59.1
MSRVTT_jsfusion_test/v2t_metrics/R50: 86.9
MSRVTT_jsfusion_test/v2t_metrics/MedR: 7.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 27.712
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 36.97879476431726
mnt_best : 35.877546914162934
not_improved_count: 0
Train Epoch: 10 [1/250 128/32000 (0%)] Loss: 1.87299 (QuantReg: 13.76980) QuantErr: 13.76980 batch_time=29.08093
Train Epoch: 10 [12/250 1536/32000 (5%)] Loss: 2.22026 (QuantReg: 13.82216) QuantErr: 13.82216 batch_time=0.91185
Train Epoch: 10 [23/250 2944/32000 (9%)] Loss: 2.32318 (QuantReg: 13.61902) QuantErr: 13.61902 batch_time=0.95536
Train Epoch: 10 [34/250 4352/32000 (14%)] Loss: 2.47852 (QuantReg: 13.36021) QuantErr: 13.36021 batch_time=0.91227
Train Epoch: 10 [45/250 5760/32000 (18%)] Loss: 2.29465 (QuantReg: 13.84867) QuantErr: 13.84867 batch_time=1.45828
Train Epoch: 10 [56/250 7168/32000 (22%)] Loss: 2.06031 (QuantReg: 13.81310) QuantErr: 13.81310 batch_time=0.90589
Train Epoch: 10 [67/250 8576/32000 (27%)] Loss: 2.30968 (QuantReg: 13.77792) QuantErr: 13.77792 batch_time=0.90241
Train Epoch: 10 [78/250 9984/32000 (31%)] Loss: 2.35116 (QuantReg: 13.40857) QuantErr: 13.40857 batch_time=0.91210
Train Epoch: 10 [89/250 11392/32000 (36%)] Loss: 2.10802 (QuantReg: 13.50336) QuantErr: 13.50336 batch_time=0.97662
Train Epoch: 10 [100/250 12800/32000 (40%)] Loss: 2.00886 (QuantReg: 13.88383) QuantErr: 13.88383 batch_time=0.94425
Train Epoch: 10 [111/250 14208/32000 (44%)] Loss: 1.83483 (QuantReg: 14.19246) QuantErr: 14.19246 batch_time=0.90253
Train Epoch: 10 [122/250 15616/32000 (49%)] Loss: 2.47413 (QuantReg: 13.81083) QuantErr: 13.81083 batch_time=0.97600
Train Epoch: 10 [133/250 17024/32000 (53%)] Loss: 2.09193 (QuantReg: 13.54428) QuantErr: 13.54428 batch_time=0.97331
Train Epoch: 10 [144/250 18432/32000 (58%)] Loss: 2.24514 (QuantReg: 14.17810) QuantErr: 14.17810 batch_time=0.94910
Train Epoch: 10 [155/250 19840/32000 (62%)] Loss: 2.11032 (QuantReg: 13.52281) QuantErr: 13.52281 batch_time=0.95867
Train Epoch: 10 [166/250 21248/32000 (66%)] Loss: 2.13632 (QuantReg: 13.43034) QuantErr: 13.43034 batch_time=1.42452
Train Epoch: 10 [177/250 22656/32000 (71%)] Loss: 1.92969 (QuantReg: 14.28326) QuantErr: 14.28326 batch_time=0.92161
Train Epoch: 10 [188/250 24064/32000 (75%)] Loss: 2.09987 (QuantReg: 14.41071) QuantErr: 14.41071 batch_time=0.90833
Train Epoch: 10 [199/250 25472/32000 (80%)] Loss: 2.12729 (QuantReg: 14.34742) QuantErr: 14.34742 batch_time=1.02915
Train Epoch: 10 [210/250 26880/32000 (84%)] Loss: 2.01988 (QuantReg: 14.45647) QuantErr: 14.45647 batch_time=0.90758
Train Epoch: 10 [221/250 28288/32000 (88%)] Loss: 1.81555 (QuantReg: 14.34044) QuantErr: 14.34044 batch_time=0.95171
Train Epoch: 10 [232/250 29696/32000 (93%)] Loss: 1.80815 (QuantReg: 14.33976) QuantErr: 14.33976 batch_time=0.97766
Train Epoch: 10 [243/250 31104/32000 (97%)] Loss: 1.77354 (QuantReg: 14.03878) QuantErr: 14.03878 batch_time=0.92165
Train Epoch: 10 codebook_update_time=1.69209
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch10.pth ...
Done in 11.276s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch10.pth ...
Done in 23.720s
removing stale ckpt [epoch 9] [took 0.00s]
epoch : 10
loss : 2.1712232708930967
quant_reg : 13.841436546325683
quant_err : 13.841436546325683
learning_rate : 1.2604988194492186e-05
n_samples : 320000
n_steps : 2500
MSRVTT_jsfusion_test/t2v_metrics/R1: 17.8
MSRVTT_jsfusion_test/t2v_metrics/R5: 46.1
MSRVTT_jsfusion_test/t2v_metrics/R10: 60.6
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.2
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.617
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 36.773179693749526
MSRVTT_jsfusion_test/v2t_metrics/R1: 18.4
MSRVTT_jsfusion_test/v2t_metrics/R5: 47.1
MSRVTT_jsfusion_test/v2t_metrics/R10: 60.0
MSRVTT_jsfusion_test/v2t_metrics/R50: 87.5
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 26.961
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 37.324728742588945
mnt_best : 36.773179693749526
not_improved_count: 0
Train Epoch: 11 [1/250 128/32000 (0%)] Loss: 1.93156 (QuantReg: 13.83332) QuantErr: 13.83332 batch_time=28.41601
Train Epoch: 11 [12/250 1536/32000 (5%)] Loss: 2.14626 (QuantReg: 13.75519) QuantErr: 13.75519 batch_time=1.40088
Train Epoch: 11 [23/250 2944/32000 (9%)] Loss: 1.86565 (QuantReg: 13.79788) QuantErr: 13.79788 batch_time=0.93204
Train Epoch: 11 [34/250 4352/32000 (14%)] Loss: 2.31121 (QuantReg: 13.96164) QuantErr: 13.96164 batch_time=1.19698
Train Epoch: 11 [45/250 5760/32000 (18%)] Loss: 2.39499 (QuantReg: 14.25036) QuantErr: 14.25036 batch_time=0.98372
Train Epoch: 11 [56/250 7168/32000 (22%)] Loss: 1.87954 (QuantReg: 14.01311) QuantErr: 14.01311 batch_time=0.99482
Train Epoch: 11 [67/250 8576/32000 (27%)] Loss: 2.18965 (QuantReg: 14.19204) QuantErr: 14.19204 batch_time=0.92575
Train Epoch: 11 [78/250 9984/32000 (31%)] Loss: 2.46704 (QuantReg: 14.06478) QuantErr: 14.06478 batch_time=0.91950
Train Epoch: 11 [89/250 11392/32000 (36%)] Loss: 2.24659 (QuantReg: 14.34724) QuantErr: 14.34724 batch_time=0.97037
Train Epoch: 11 [100/250 12800/32000 (40%)] Loss: 2.54108 (QuantReg: 14.22346) QuantErr: 14.22346 batch_time=1.07552
Train Epoch: 11 [111/250 14208/32000 (44%)] Loss: 2.22436 (QuantReg: 13.89672) QuantErr: 13.89672 batch_time=0.93197
Train Epoch: 11 [122/250 15616/32000 (49%)] Loss: 2.67775 (QuantReg: 14.07904) QuantErr: 14.07904 batch_time=1.00819
Train Epoch: 11 [133/250 17024/32000 (53%)] Loss: 2.32204 (QuantReg: 14.15905) QuantErr: 14.15905 batch_time=0.91625
Train Epoch: 11 [144/250 18432/32000 (58%)] Loss: 1.69969 (QuantReg: 14.41447) QuantErr: 14.41447 batch_time=1.27421
Train Epoch: 11 [155/250 19840/32000 (62%)] Loss: 1.97584 (QuantReg: 14.09743) QuantErr: 14.09743 batch_time=0.94383
Train Epoch: 11 [166/250 21248/32000 (66%)] Loss: 2.70536 (QuantReg: 13.76459) QuantErr: 13.76459 batch_time=2.66208
Train Epoch: 11 [177/250 22656/32000 (71%)] Loss: 2.07422 (QuantReg: 14.19648) QuantErr: 14.19648 batch_time=0.92173
Train Epoch: 11 [188/250 24064/32000 (75%)] Loss: 1.99970 (QuantReg: 14.12578) QuantErr: 14.12578 batch_time=1.11221
Train Epoch: 11 [199/250 25472/32000 (80%)] Loss: 2.19947 (QuantReg: 13.93186) QuantErr: 13.93186 batch_time=0.91376
Train Epoch: 11 [210/250 26880/32000 (84%)] Loss: 2.10473 (QuantReg: 14.36239) QuantErr: 14.36239 batch_time=0.91321
Train Epoch: 11 [221/250 28288/32000 (88%)] Loss: 1.89015 (QuantReg: 14.81385) QuantErr: 14.81385 batch_time=0.87831
Train Epoch: 11 [232/250 29696/32000 (93%)] Loss: 1.65221 (QuantReg: 14.59063) QuantErr: 14.59063 batch_time=0.97922
Train Epoch: 11 [243/250 31104/32000 (97%)] Loss: 1.70619 (QuantReg: 14.49242) QuantErr: 14.49242 batch_time=0.96081
Train Epoch: 11 codebook_update_time=1.82280
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch11.pth ...
Done in 10.994s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch11.pth ...
Done in 21.880s
removing stale ckpt [epoch 10] [took 0.00s]
epoch : 11
loss : 2.094702980518341
quant_reg : 14.13588438796997
quant_err : 14.13588438796997
learning_rate : 1.1974738784767577e-05
n_samples : 352000
n_steps : 2750
MSRVTT_jsfusion_test/t2v_metrics/R1: 19.2
MSRVTT_jsfusion_test/t2v_metrics/R5: 46.5
MSRVTT_jsfusion_test/t2v_metrics/R10: 61.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.2
MSRVTT_jsfusion_test/t2v_metrics/MedR: 7.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.436
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 37.946221248839265
MSRVTT_jsfusion_test/v2t_metrics/R1: 17.7
MSRVTT_jsfusion_test/v2t_metrics/R5: 45.8
MSRVTT_jsfusion_test/v2t_metrics/R10: 60.6
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.1
MSRVTT_jsfusion_test/v2t_metrics/MedR: 7.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 26.597
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 36.62439473736401
mnt_best : 37.946221248839265
not_improved_count: 0
Train Epoch: 12 [1/250 128/32000 (0%)] Loss: 2.61214 (QuantReg: 13.69654) QuantErr: 13.69654 batch_time=37.91902
Train Epoch: 12 [12/250 1536/32000 (5%)] Loss: 2.12480 (QuantReg: 13.99149) QuantErr: 13.99149 batch_time=1.03643
Train Epoch: 12 [23/250 2944/32000 (9%)] Loss: 1.82070 (QuantReg: 14.50130) QuantErr: 14.50130 batch_time=0.90662
Train Epoch: 12 [34/250 4352/32000 (14%)] Loss: 2.03084 (QuantReg: 14.07161) QuantErr: 14.07161 batch_time=0.96260
Train Epoch: 12 [45/250 5760/32000 (18%)] Loss: 2.01642 (QuantReg: 14.40558) QuantErr: 14.40558 batch_time=0.96244
Train Epoch: 12 [56/250 7168/32000 (22%)] Loss: 2.20106 (QuantReg: 13.75885) QuantErr: 13.75885 batch_time=0.94466
Train Epoch: 12 [67/250 8576/32000 (27%)] Loss: 2.12748 (QuantReg: 14.50060) QuantErr: 14.50060 batch_time=0.90100
Train Epoch: 12 [78/250 9984/32000 (31%)] Loss: 2.29096 (QuantReg: 14.61816) QuantErr: 14.61816 batch_time=1.23547
Train Epoch: 12 [89/250 11392/32000 (36%)] Loss: 2.51499 (QuantReg: 14.04959) QuantErr: 14.04959 batch_time=0.96400
Train Epoch: 12 [100/250 12800/32000 (40%)] Loss: 2.16783 (QuantReg: 14.01913) QuantErr: 14.01913 batch_time=0.95526
Train Epoch: 12 [111/250 14208/32000 (44%)] Loss: 1.92830 (QuantReg: 14.70532) QuantErr: 14.70532 batch_time=0.91334
Train Epoch: 12 [122/250 15616/32000 (49%)] Loss: 1.93278 (QuantReg: 14.61211) QuantErr: 14.61211 batch_time=0.93634
Train Epoch: 12 [133/250 17024/32000 (53%)] Loss: 1.94828 (QuantReg: 14.52110) QuantErr: 14.52110 batch_time=2.01315
Train Epoch: 12 [144/250 18432/32000 (58%)] Loss: 2.01007 (QuantReg: 14.47675) QuantErr: 14.47675 batch_time=0.89979
Train Epoch: 12 [155/250 19840/32000 (62%)] Loss: 2.00971 (QuantReg: 14.38321) QuantErr: 14.38321 batch_time=1.08931
Train Epoch: 12 [166/250 21248/32000 (66%)] Loss: 1.95146 (QuantReg: 14.28090) QuantErr: 14.28090 batch_time=1.04071
Train Epoch: 12 [177/250 22656/32000 (71%)] Loss: 1.79982 (QuantReg: 14.66872) QuantErr: 14.66872 batch_time=0.93374
Train Epoch: 12 [188/250 24064/32000 (75%)] Loss: 2.22600 (QuantReg: 14.64910) QuantErr: 14.64910 batch_time=0.90087
Train Epoch: 12 [199/250 25472/32000 (80%)] Loss: 1.95171 (QuantReg: 14.51005) QuantErr: 14.51005 batch_time=0.89492
Train Epoch: 12 [210/250 26880/32000 (84%)] Loss: 2.13262 (QuantReg: 14.33471) QuantErr: 14.33471 batch_time=1.02417
Train Epoch: 12 [221/250 28288/32000 (88%)] Loss: 2.06069 (QuantReg: 14.44377) QuantErr: 14.44377 batch_time=0.90103
Train Epoch: 12 [232/250 29696/32000 (93%)] Loss: 2.17394 (QuantReg: 14.68787) QuantErr: 14.68787 batch_time=0.98363
Train Epoch: 12 [243/250 31104/32000 (97%)] Loss: 1.33653 (QuantReg: 15.10424) QuantErr: 15.10424 batch_time=0.97659
Train Epoch: 12 codebook_update_time=1.68966
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch12.pth ...
Done in 11.614s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch12.pth ...
Done in 23.038s
removing stale ckpt [epoch 11] [took 0.00s]
epoch : 12
loss : 1.978166717529297
quant_reg : 14.472517227172851
quant_err : 14.472517227172851
learning_rate : 1.1376001845529198e-05
n_samples : 384000
n_steps : 3000
MSRVTT_jsfusion_test/t2v_metrics/R1: 19.0
MSRVTT_jsfusion_test/t2v_metrics/R5: 47.2
MSRVTT_jsfusion_test/t2v_metrics/R10: 61.4
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.5
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.433
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 38.04415918975242
MSRVTT_jsfusion_test/v2t_metrics/R1: 17.6
MSRVTT_jsfusion_test/v2t_metrics/R5: 47.5
MSRVTT_jsfusion_test/v2t_metrics/R10: 60.3
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.3
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 26.3375
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 36.94093327016191
mnt_best : 38.04415918975242
not_improved_count: 0
Train Epoch: 13 [1/250 128/32000 (0%)] Loss: 2.08975 (QuantReg: 14.48830) QuantErr: 14.48830 batch_time=34.36180
Train Epoch: 13 [12/250 1536/32000 (5%)] Loss: 1.86515 (QuantReg: 14.62930) QuantErr: 14.62930 batch_time=0.92812
Train Epoch: 13 [23/250 2944/32000 (9%)] Loss: 2.20327 (QuantReg: 14.17117) QuantErr: 14.17117 batch_time=0.90182
Train Epoch: 13 [34/250 4352/32000 (14%)] Loss: 1.53398 (QuantReg: 14.47392) QuantErr: 14.47392 batch_time=0.88538
Train Epoch: 13 [45/250 5760/32000 (18%)] Loss: 1.88404 (QuantReg: 14.43652) QuantErr: 14.43652 batch_time=0.94481
Train Epoch: 13 [56/250 7168/32000 (22%)] Loss: 1.71366 (QuantReg: 14.46061) QuantErr: 14.46061 batch_time=0.86657
Train Epoch: 13 [67/250 8576/32000 (27%)] Loss: 1.59654 (QuantReg: 14.85097) QuantErr: 14.85097 batch_time=0.92520
Train Epoch: 13 [78/250 9984/32000 (31%)] Loss: 1.70285 (QuantReg: 14.74530) QuantErr: 14.74530 batch_time=0.94129
Train Epoch: 13 [89/250 11392/32000 (36%)] Loss: 1.74269 (QuantReg: 14.77121) QuantErr: 14.77121 batch_time=0.94569
Train Epoch: 13 [100/250 12800/32000 (40%)] Loss: 1.51969 (QuantReg: 14.68976) QuantErr: 14.68976 batch_time=0.95864
Train Epoch: 13 [111/250 14208/32000 (44%)] Loss: 1.74121 (QuantReg: 14.56892) QuantErr: 14.56892 batch_time=0.89569
Train Epoch: 13 [122/250 15616/32000 (49%)] Loss: 1.81192 (QuantReg: 14.48050) QuantErr: 14.48050 batch_time=0.91139
Train Epoch: 13 [133/250 17024/32000 (53%)] Loss: 2.07973 (QuantReg: 14.59914) QuantErr: 14.59914 batch_time=0.92707
Train Epoch: 13 [144/250 18432/32000 (58%)] Loss: 2.03637 (QuantReg: 14.56265) QuantErr: 14.56265 batch_time=0.90598
Train Epoch: 13 [155/250 19840/32000 (62%)] Loss: 1.82887 (QuantReg: 14.49787) QuantErr: 14.49787 batch_time=0.99445
Train Epoch: 13 [166/250 21248/32000 (66%)] Loss: 1.82484 (QuantReg: 14.50947) QuantErr: 14.50947 batch_time=0.98652
Train Epoch: 13 [177/250 22656/32000 (71%)] Loss: 1.74567 (QuantReg: 14.99789) QuantErr: 14.99789 batch_time=1.05720
Train Epoch: 13 [188/250 24064/32000 (75%)] Loss: 1.66173 (QuantReg: 14.56537) QuantErr: 14.56537 batch_time=1.01333
Train Epoch: 13 [199/250 25472/32000 (80%)] Loss: 1.57669 (QuantReg: 14.87376) QuantErr: 14.87376 batch_time=0.93978
Train Epoch: 13 [210/250 26880/32000 (84%)] Loss: 2.02000 (QuantReg: 15.04495) QuantErr: 15.04495 batch_time=0.96430
Train Epoch: 13 [221/250 28288/32000 (88%)] Loss: 1.71071 (QuantReg: 14.96191) QuantErr: 14.96191 batch_time=0.92662
Train Epoch: 13 [232/250 29696/32000 (93%)] Loss: 1.67830 (QuantReg: 14.86715) QuantErr: 14.86715 batch_time=0.96325
Train Epoch: 13 [243/250 31104/32000 (97%)] Loss: 1.99515 (QuantReg: 14.71180) QuantErr: 14.71180 batch_time=1.06752
Train Epoch: 13 codebook_update_time=1.74308
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch13.pth ...
Done in 11.844s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch13.pth ...
Done in 23.168s
removing stale ckpt [epoch 12] [took 0.00s]
epoch : 13
loss : 1.9190941877365113
quant_reg : 14.678930751800538
quant_err : 14.678930751800538
learning_rate : 1.0807201753252737e-05
n_samples : 416000
n_steps : 3250
MSRVTT_jsfusion_test/t2v_metrics/R1: 19.7
MSRVTT_jsfusion_test/t2v_metrics/R5: 47.2
MSRVTT_jsfusion_test/t2v_metrics/R10: 62.2
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.3
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.533
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 38.67225853861698
MSRVTT_jsfusion_test/v2t_metrics/R1: 20.0
MSRVTT_jsfusion_test/v2t_metrics/R5: 48.3
MSRVTT_jsfusion_test/v2t_metrics/R10: 64.1
MSRVTT_jsfusion_test/v2t_metrics/R50: 87.8
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 26.4635
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 39.562013363485256
mnt_best : 38.67225853861698
not_improved_count: 0
Train Epoch: 14 [1/250 128/32000 (0%)] Loss: 1.97708 (QuantReg: 15.04557) QuantErr: 15.04557 batch_time=31.18802
Train Epoch: 14 [12/250 1536/32000 (5%)] Loss: 1.85914 (QuantReg: 14.69805) QuantErr: 14.69805 batch_time=0.89683
Train Epoch: 14 [23/250 2944/32000 (9%)] Loss: 2.07560 (QuantReg: 14.71085) QuantErr: 14.71085 batch_time=0.94611
Train Epoch: 14 [34/250 4352/32000 (14%)] Loss: 1.35986 (QuantReg: 15.11191) QuantErr: 15.11191 batch_time=0.94874
Train Epoch: 14 [45/250 5760/32000 (18%)] Loss: 2.36284 (QuantReg: 14.86612) QuantErr: 14.86612 batch_time=0.88854
Train Epoch: 14 [56/250 7168/32000 (22%)] Loss: 1.75122 (QuantReg: 14.83032) QuantErr: 14.83032 batch_time=0.92299
Train Epoch: 14 [67/250 8576/32000 (27%)] Loss: 1.96486 (QuantReg: 14.29847) QuantErr: 14.29847 batch_time=1.25382
Train Epoch: 14 [78/250 9984/32000 (31%)] Loss: 1.46403 (QuantReg: 14.87214) QuantErr: 14.87214 batch_time=0.96061
Train Epoch: 14 [89/250 11392/32000 (36%)] Loss: 2.20416 (QuantReg: 15.09789) QuantErr: 15.09789 batch_time=0.95149
Train Epoch: 14 [100/250 12800/32000 (40%)] Loss: 1.51158 (QuantReg: 14.31876) QuantErr: 14.31876 batch_time=0.93196
Train Epoch: 14 [111/250 14208/32000 (44%)] Loss: 1.70800 (QuantReg: 14.68949) QuantErr: 14.68949 batch_time=0.95651
Train Epoch: 14 [122/250 15616/32000 (49%)] Loss: 1.87077 (QuantReg: 14.60644) QuantErr: 14.60644 batch_time=0.91819
Train Epoch: 14 [133/250 17024/32000 (53%)] Loss: 1.88194 (QuantReg: 14.58100) QuantErr: 14.58100 batch_time=1.05136
Train Epoch: 14 [144/250 18432/32000 (58%)] Loss: 2.10262 (QuantReg: 15.10540) QuantErr: 15.10540 batch_time=4.01646
Train Epoch: 14 [155/250 19840/32000 (62%)] Loss: 1.90144 (QuantReg: 14.90787) QuantErr: 14.90787 batch_time=0.92675
Train Epoch: 14 [166/250 21248/32000 (66%)] Loss: 1.59332 (QuantReg: 14.86272) QuantErr: 14.86272 batch_time=0.94295
Train Epoch: 14 [177/250 22656/32000 (71%)] Loss: 1.87387 (QuantReg: 14.55794) QuantErr: 14.55794 batch_time=0.91704
Train Epoch: 14 [188/250 24064/32000 (75%)] Loss: 1.95408 (QuantReg: 15.23730) QuantErr: 15.23730 batch_time=1.05210
Train Epoch: 14 [199/250 25472/32000 (80%)] Loss: 2.00114 (QuantReg: 14.46506) QuantErr: 14.46506 batch_time=0.95303
Train Epoch: 14 [210/250 26880/32000 (84%)] Loss: 1.89136 (QuantReg: 15.22893) QuantErr: 15.22893 batch_time=4.34654
Train Epoch: 14 [221/250 28288/32000 (88%)] Loss: 2.05614 (QuantReg: 14.87249) QuantErr: 14.87249 batch_time=0.96867
Train Epoch: 14 [232/250 29696/32000 (93%)] Loss: 1.73657 (QuantReg: 15.13330) QuantErr: 15.13330 batch_time=0.93654
Train Epoch: 14 [243/250 31104/32000 (97%)] Loss: 1.91568 (QuantReg: 15.41397) QuantErr: 15.41397 batch_time=0.94119
Train Epoch: 14 codebook_update_time=2.34460
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch14.pth ...
Done in 12.517s
removing stale ckpt [epoch 13] [took 0.04s]
epoch : 14
loss : 1.8867003774642945
quant_reg : 14.862853115081787
quant_err : 14.862853115081787
learning_rate : 1.02668416655901e-05
n_samples : 448000
n_steps : 3500
MSRVTT_jsfusion_test/t2v_metrics/R1: 18.9
MSRVTT_jsfusion_test/t2v_metrics/R5: 48.9
MSRVTT_jsfusion_test/t2v_metrics/R10: 62.1
MSRVTT_jsfusion_test/t2v_metrics/R50: 88.4
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 28.104
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 38.57335561800797
MSRVTT_jsfusion_test/v2t_metrics/R1: 21.4
MSRVTT_jsfusion_test/v2t_metrics/R5: 48.7
MSRVTT_jsfusion_test/v2t_metrics/R10: 62.0
MSRVTT_jsfusion_test/v2t_metrics/R50: 87.8
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 25.7515
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 40.12774989807129
mnt_best : 38.67225853861698
not_improved_count: 1
Train Epoch: 15 [1/250 128/32000 (0%)] Loss: 1.72287 (QuantReg: 14.77785) QuantErr: 14.77785 batch_time=34.83907
Train Epoch: 15 [12/250 1536/32000 (5%)] Loss: 2.06146 (QuantReg: 14.83519) QuantErr: 14.83519 batch_time=0.98207
Train Epoch: 15 [23/250 2944/32000 (9%)] Loss: 1.73976 (QuantReg: 14.87178) QuantErr: 14.87178 batch_time=0.87408
Train Epoch: 15 [34/250 4352/32000 (14%)] Loss: 1.74861 (QuantReg: 15.06234) QuantErr: 15.06234 batch_time=0.95377
Train Epoch: 15 [45/250 5760/32000 (18%)] Loss: 2.10499 (QuantReg: 14.39252) QuantErr: 14.39252 batch_time=0.90304
Train Epoch: 15 [56/250 7168/32000 (22%)] Loss: 1.84365 (QuantReg: 15.06769) QuantErr: 15.06769 batch_time=0.90069
Train Epoch: 15 [67/250 8576/32000 (27%)] Loss: 1.86830 (QuantReg: 14.85887) QuantErr: 14.85887 batch_time=2.27287
Train Epoch: 15 [78/250 9984/32000 (31%)] Loss: 1.40855 (QuantReg: 15.14871) QuantErr: 15.14871 batch_time=1.00578
Train Epoch: 15 [89/250 11392/32000 (36%)] Loss: 1.93237 (QuantReg: 14.94285) QuantErr: 14.94285 batch_time=0.93953
Train Epoch: 15 [100/250 12800/32000 (40%)] Loss: 1.69258 (QuantReg: 15.22314) QuantErr: 15.22314 batch_time=0.98876
Train Epoch: 15 [111/250 14208/32000 (44%)] Loss: 1.88024 (QuantReg: 15.02106) QuantErr: 15.02106 batch_time=0.95843
Train Epoch: 15 [122/250 15616/32000 (49%)] Loss: 2.10314 (QuantReg: 14.54384) QuantErr: 14.54384 batch_time=0.90515
Train Epoch: 15 [133/250 17024/32000 (53%)] Loss: 1.71156 (QuantReg: 14.94934) QuantErr: 14.94934 batch_time=0.87156
Train Epoch: 15 [144/250 18432/32000 (58%)] Loss: 1.49333 (QuantReg: 14.90578) QuantErr: 14.90578 batch_time=0.95615
Train Epoch: 15 [155/250 19840/32000 (62%)] Loss: 1.45383 (QuantReg: 15.00553) QuantErr: 15.00553 batch_time=0.92860
Train Epoch: 15 [166/250 21248/32000 (66%)] Loss: 1.89123 (QuantReg: 15.19379) QuantErr: 15.19379 batch_time=0.90638
Train Epoch: 15 [177/250 22656/32000 (71%)] Loss: 1.93097 (QuantReg: 14.78199) QuantErr: 14.78199 batch_time=0.98288
Train Epoch: 15 [188/250 24064/32000 (75%)] Loss: 1.36193 (QuantReg: 15.24311) QuantErr: 15.24311 batch_time=0.91576
Train Epoch: 15 [199/250 25472/32000 (80%)] Loss: 2.02105 (QuantReg: 15.41585) QuantErr: 15.41585 batch_time=0.97174
Train Epoch: 15 [210/250 26880/32000 (84%)] Loss: 1.58718 (QuantReg: 15.55374) QuantErr: 15.55374 batch_time=2.65106
Train Epoch: 15 [221/250 28288/32000 (88%)] Loss: 1.69585 (QuantReg: 15.25782) QuantErr: 15.25782 batch_time=0.95054
Train Epoch: 15 [232/250 29696/32000 (93%)] Loss: 2.13562 (QuantReg: 15.19802) QuantErr: 15.19802 batch_time=0.92414
Train Epoch: 15 [243/250 31104/32000 (97%)] Loss: 1.47935 (QuantReg: 15.24244) QuantErr: 15.24244 batch_time=0.88458
Train Epoch: 15 codebook_update_time=1.77994
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch15.pth ...
Done in 11.106s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch15.pth ...
Done in 22.770s
removing stale ckpt [epoch 14] [took 0.00s]
epoch : 15
loss : 1.7913374891281129
quant_reg : 15.101330867767334
quant_err : 15.101330867767334
learning_rate : 9.753499582310594e-06
n_samples : 480000
n_steps : 3750
MSRVTT_jsfusion_test/t2v_metrics/R1: 20.2
MSRVTT_jsfusion_test/t2v_metrics/R5: 48.7
MSRVTT_jsfusion_test/t2v_metrics/R10: 63.4
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.3
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.961
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 39.65730488607522
MSRVTT_jsfusion_test/v2t_metrics/R1: 21.3
MSRVTT_jsfusion_test/v2t_metrics/R5: 47.4
MSRVTT_jsfusion_test/v2t_metrics/R10: 62.4
MSRVTT_jsfusion_test/v2t_metrics/R50: 87.7
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 26.019
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 39.790632712124214
mnt_best : 39.65730488607522
not_improved_count: 0
Train Epoch: 16 [1/250 128/32000 (0%)] Loss: 1.27851 (QuantReg: 15.18181) QuantErr: 15.18181 batch_time=31.52558
Train Epoch: 16 [12/250 1536/32000 (5%)] Loss: 1.68363 (QuantReg: 14.91276) QuantErr: 14.91276 batch_time=0.96897
Train Epoch: 16 [23/250 2944/32000 (9%)] Loss: 1.70073 (QuantReg: 15.36941) QuantErr: 15.36941 batch_time=0.94821
Train Epoch: 16 [34/250 4352/32000 (14%)] Loss: 1.63916 (QuantReg: 15.28623) QuantErr: 15.28623 batch_time=0.97291
Train Epoch: 16 [45/250 5760/32000 (18%)] Loss: 1.76896 (QuantReg: 14.94155) QuantErr: 14.94155 batch_time=0.93717
Train Epoch: 16 [56/250 7168/32000 (22%)] Loss: 1.74493 (QuantReg: 15.49982) QuantErr: 15.49982 batch_time=0.93187
Train Epoch: 16 [67/250 8576/32000 (27%)] Loss: 1.80134 (QuantReg: 15.25446) QuantErr: 15.25446 batch_time=2.70575
Train Epoch: 16 [78/250 9984/32000 (31%)] Loss: 1.87192 (QuantReg: 15.03439) QuantErr: 15.03439 batch_time=0.95583
Train Epoch: 16 [89/250 11392/32000 (36%)] Loss: 1.45895 (QuantReg: 15.34309) QuantErr: 15.34309 batch_time=0.92678
Train Epoch: 16 [100/250 12800/32000 (40%)] Loss: 1.86412 (QuantReg: 15.51902) QuantErr: 15.51902 batch_time=0.96324
Train Epoch: 16 [111/250 14208/32000 (44%)] Loss: 1.59912 (QuantReg: 15.13201) QuantErr: 15.13201 batch_time=0.91256
Train Epoch: 16 [122/250 15616/32000 (49%)] Loss: 1.95489 (QuantReg: 15.52283) QuantErr: 15.52283 batch_time=0.91391
Train Epoch: 16 [133/250 17024/32000 (53%)] Loss: 1.71647 (QuantReg: 14.83457) QuantErr: 14.83457 batch_time=1.19118
Train Epoch: 16 [144/250 18432/32000 (58%)] Loss: 1.48730 (QuantReg: 15.42063) QuantErr: 15.42063 batch_time=1.28721
Train Epoch: 16 [155/250 19840/32000 (62%)] Loss: 1.50335 (QuantReg: 15.43275) QuantErr: 15.43275 batch_time=0.93493
Train Epoch: 16 [166/250 21248/32000 (66%)] Loss: 1.58167 (QuantReg: 15.38560) QuantErr: 15.38560 batch_time=0.90934
Train Epoch: 16 [177/250 22656/32000 (71%)] Loss: 2.34808 (QuantReg: 15.34752) QuantErr: 15.34752 batch_time=0.94473
Train Epoch: 16 [188/250 24064/32000 (75%)] Loss: 1.44029 (QuantReg: 15.13417) QuantErr: 15.13417 batch_time=0.95359
Train Epoch: 16 [199/250 25472/32000 (80%)] Loss: 1.67327 (QuantReg: 15.57219) QuantErr: 15.57219 batch_time=0.87845
Train Epoch: 16 [210/250 26880/32000 (84%)] Loss: 1.79176 (QuantReg: 15.58656) QuantErr: 15.58656 batch_time=0.91614
Train Epoch: 16 [221/250 28288/32000 (88%)] Loss: 1.61074 (QuantReg: 15.84409) QuantErr: 15.84409 batch_time=0.92915
Train Epoch: 16 [232/250 29696/32000 (93%)] Loss: 1.91418 (QuantReg: 15.58796) QuantErr: 15.58796 batch_time=0.91899
Train Epoch: 16 [243/250 31104/32000 (97%)] Loss: 1.51808 (QuantReg: 15.14534) QuantErr: 15.14534 batch_time=0.90151
Train Epoch: 16 codebook_update_time=1.73735
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch16.pth ...
Done in 14.220s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch16.pth ...
Done in 25.650s
removing stale ckpt [epoch 15] [took 0.00s]
epoch : 16
loss : 1.7432164211273193
quant_reg : 15.328203147888184
quant_err : 15.328203147888184
learning_rate : 9.265824603195063e-06
n_samples : 512000
n_steps : 4000
MSRVTT_jsfusion_test/t2v_metrics/R1: 20.3
MSRVTT_jsfusion_test/t2v_metrics/R5: 49.3
MSRVTT_jsfusion_test/t2v_metrics/R10: 63.3
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.8
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.852
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 39.86412375544485
MSRVTT_jsfusion_test/v2t_metrics/R1: 21.1
MSRVTT_jsfusion_test/v2t_metrics/R5: 49.0
MSRVTT_jsfusion_test/v2t_metrics/R10: 63.7
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.8
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 25.0605
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 40.383689049977136
mnt_best : 39.86412375544485
not_improved_count: 0
Train Epoch: 17 [1/250 128/32000 (0%)] Loss: 1.56231 (QuantReg: 15.47337) QuantErr: 15.47337 batch_time=39.37867
Train Epoch: 17 [12/250 1536/32000 (5%)] Loss: 1.67722 (QuantReg: 15.34550) QuantErr: 15.34550 batch_time=0.91431
Train Epoch: 17 [23/250 2944/32000 (9%)] Loss: 1.58657 (QuantReg: 15.47903) QuantErr: 15.47903 batch_time=0.95475
Train Epoch: 17 [34/250 4352/32000 (14%)] Loss: 1.97195 (QuantReg: 15.20479) QuantErr: 15.20479 batch_time=0.94783
Train Epoch: 17 [45/250 5760/32000 (18%)] Loss: 1.42481 (QuantReg: 15.24973) QuantErr: 15.24973 batch_time=0.94439
Train Epoch: 17 [56/250 7168/32000 (22%)] Loss: 1.46853 (QuantReg: 15.07724) QuantErr: 15.07724 batch_time=0.93448
Train Epoch: 17 [67/250 8576/32000 (27%)] Loss: 1.68763 (QuantReg: 15.58274) QuantErr: 15.58274 batch_time=0.96582
Train Epoch: 17 [78/250 9984/32000 (31%)] Loss: 1.81766 (QuantReg: 15.89675) QuantErr: 15.89675 batch_time=0.92199
Train Epoch: 17 [89/250 11392/32000 (36%)] Loss: 1.28922 (QuantReg: 15.76390) QuantErr: 15.76390 batch_time=0.95391
Train Epoch: 17 [100/250 12800/32000 (40%)] Loss: 1.58214 (QuantReg: 15.80774) QuantErr: 15.80774 batch_time=0.91981
Train Epoch: 17 [111/250 14208/32000 (44%)] Loss: 1.76779 (QuantReg: 15.26629) QuantErr: 15.26629 batch_time=0.92594
Train Epoch: 17 [122/250 15616/32000 (49%)] Loss: 1.50217 (QuantReg: 15.46128) QuantErr: 15.46128 batch_time=0.97848
Train Epoch: 17 [133/250 17024/32000 (53%)] Loss: 1.42301 (QuantReg: 15.87632) QuantErr: 15.87632 batch_time=0.92259
Train Epoch: 17 [144/250 18432/32000 (58%)] Loss: 1.65101 (QuantReg: 15.69044) QuantErr: 15.69044 batch_time=0.92639
Train Epoch: 17 [155/250 19840/32000 (62%)] Loss: 2.00600 (QuantReg: 15.43097) QuantErr: 15.43097 batch_time=1.03675
Train Epoch: 17 [166/250 21248/32000 (66%)] Loss: 1.84027 (QuantReg: 15.24996) QuantErr: 15.24996 batch_time=0.97850
Train Epoch: 17 [177/250 22656/32000 (71%)] Loss: 1.91138 (QuantReg: 15.60410) QuantErr: 15.60410 batch_time=0.94043
Train Epoch: 17 [188/250 24064/32000 (75%)] Loss: 1.29958 (QuantReg: 15.90351) QuantErr: 15.90351 batch_time=0.95002
Train Epoch: 17 [199/250 25472/32000 (80%)] Loss: 1.58560 (QuantReg: 15.76266) QuantErr: 15.76266 batch_time=0.92739
Train Epoch: 17 [210/250 26880/32000 (84%)] Loss: 1.75143 (QuantReg: 15.67547) QuantErr: 15.67547 batch_time=0.88482
Train Epoch: 17 [221/250 28288/32000 (88%)] Loss: 1.73899 (QuantReg: 15.80598) QuantErr: 15.80598 batch_time=1.34648
Train Epoch: 17 [232/250 29696/32000 (93%)] Loss: 1.83879 (QuantReg: 15.65211) QuantErr: 15.65211 batch_time=0.94578
Train Epoch: 17 [243/250 31104/32000 (97%)] Loss: 1.47779 (QuantReg: 15.78644) QuantErr: 15.78644 batch_time=1.04794
Train Epoch: 17 codebook_update_time=1.82680
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch17.pth ...
Done in 11.404s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch17.pth ...
Done in 23.114s
removing stale ckpt [epoch 16] [took 0.00s]
epoch : 17
loss : 1.6884302434921266
quant_reg : 15.492783779144288
quant_err : 15.492783779144288
learning_rate : 8.80253337303531e-06
n_samples : 544000
n_steps : 4250
MSRVTT_jsfusion_test/t2v_metrics/R1: 20.3
MSRVTT_jsfusion_test/t2v_metrics/R5: 49.8
MSRVTT_jsfusion_test/t2v_metrics/R10: 64.3
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.2
MSRVTT_jsfusion_test/t2v_metrics/MedR: 6.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 27.152
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 40.207967283007825
MSRVTT_jsfusion_test/v2t_metrics/R1: 21.7
MSRVTT_jsfusion_test/v2t_metrics/R5: 49.6
MSRVTT_jsfusion_test/v2t_metrics/R10: 63.2
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.7
MSRVTT_jsfusion_test/v2t_metrics/MedR: 6.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 24.8255
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 40.82123718157939
mnt_best : 40.207967283007825
not_improved_count: 0
Train Epoch: 18 [1/250 128/32000 (0%)] Loss: 1.28361 (QuantReg: 15.38119) QuantErr: 15.38119 batch_time=31.08124
Train Epoch: 18 [12/250 1536/32000 (5%)] Loss: 1.23270 (QuantReg: 15.45279) QuantErr: 15.45279 batch_time=1.04871
Train Epoch: 18 [23/250 2944/32000 (9%)] Loss: 2.14332 (QuantReg: 14.98877) QuantErr: 14.98877 batch_time=2.85377
Train Epoch: 18 [34/250 4352/32000 (14%)] Loss: 1.58846 (QuantReg: 15.36753) QuantErr: 15.36753 batch_time=0.99053
Train Epoch: 18 [45/250 5760/32000 (18%)] Loss: 1.62132 (QuantReg: 15.70231) QuantErr: 15.70231 batch_time=0.98637
Train Epoch: 18 [56/250 7168/32000 (22%)] Loss: 1.98683 (QuantReg: 15.38814) QuantErr: 15.38814 batch_time=0.97374
Train Epoch: 18 [67/250 8576/32000 (27%)] Loss: 1.62664 (QuantReg: 15.57914) QuantErr: 15.57914 batch_time=2.87385
Train Epoch: 18 [78/250 9984/32000 (31%)] Loss: 1.53184 (QuantReg: 15.42665) QuantErr: 15.42665 batch_time=0.89814
Train Epoch: 18 [89/250 11392/32000 (36%)] Loss: 1.48272 (QuantReg: 15.80005) QuantErr: 15.80005 batch_time=0.93235
Train Epoch: 18 [100/250 12800/32000 (40%)] Loss: 1.69230 (QuantReg: 15.44788) QuantErr: 15.44788 batch_time=0.90289
Train Epoch: 18 [111/250 14208/32000 (44%)] Loss: 1.55759 (QuantReg: 16.04316) QuantErr: 16.04316 batch_time=0.97982
Train Epoch: 18 [122/250 15616/32000 (49%)] Loss: 1.61362 (QuantReg: 15.82780) QuantErr: 15.82780 batch_time=0.91792
Train Epoch: 18 [133/250 17024/32000 (53%)] Loss: 1.79851 (QuantReg: 15.95078) QuantErr: 15.95078 batch_time=0.92411
Train Epoch: 18 [144/250 18432/32000 (58%)] Loss: 1.51063 (QuantReg: 15.62129) QuantErr: 15.62129 batch_time=0.94302
Train Epoch: 18 [155/250 19840/32000 (62%)] Loss: 1.81548 (QuantReg: 15.49172) QuantErr: 15.49172 batch_time=0.96976
Train Epoch: 18 [166/250 21248/32000 (66%)] Loss: 1.61639 (QuantReg: 15.39183) QuantErr: 15.39183 batch_time=0.93099
Train Epoch: 18 [177/250 22656/32000 (71%)] Loss: 1.84064 (QuantReg: 15.27461) QuantErr: 15.27461 batch_time=1.00845
Train Epoch: 18 [188/250 24064/32000 (75%)] Loss: 1.62528 (QuantReg: 15.36502) QuantErr: 15.36502 batch_time=0.91242
Train Epoch: 18 [199/250 25472/32000 (80%)] Loss: 1.54520 (QuantReg: 15.51767) QuantErr: 15.51767 batch_time=0.92610
Train Epoch: 18 [210/250 26880/32000 (84%)] Loss: 1.72827 (QuantReg: 15.61959) QuantErr: 15.61959 batch_time=0.98559
Train Epoch: 18 [221/250 28288/32000 (88%)] Loss: 2.00565 (QuantReg: 15.78315) QuantErr: 15.78315 batch_time=0.92627
Train Epoch: 18 [232/250 29696/32000 (93%)] Loss: 1.49198 (QuantReg: 16.04692) QuantErr: 16.04692 batch_time=0.96335
Train Epoch: 18 [243/250 31104/32000 (97%)] Loss: 1.79945 (QuantReg: 15.65147) QuantErr: 15.65147 batch_time=0.92138
Train Epoch: 18 codebook_update_time=1.80065
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch18.pth ...
Done in 12.084s
Updating 'best' checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch18.pth ...
Done in 38.875s
removing stale ckpt [epoch 17] [took 0.00s]
epoch : 18
loss : 1.6545728011131287
quant_reg : 15.64542448425293
quant_err : 15.64542448425293
learning_rate : 8.362406704383544e-06
n_samples : 576000
n_steps : 4500
MSRVTT_jsfusion_test/t2v_metrics/R1: 22.1
MSRVTT_jsfusion_test/t2v_metrics/R5: 51.4
MSRVTT_jsfusion_test/t2v_metrics/R10: 64.9
MSRVTT_jsfusion_test/t2v_metrics/R50: 87.9
MSRVTT_jsfusion_test/t2v_metrics/MedR: 5.0
MSRVTT_jsfusion_test/t2v_metrics/MeanR: 26.053
MSRVTT_jsfusion_test/t2v_metrics/geometric_mean_R1-R5-R10: 41.930820733928286
MSRVTT_jsfusion_test/v2t_metrics/R1: 22.3
MSRVTT_jsfusion_test/v2t_metrics/R5: 51.1
MSRVTT_jsfusion_test/v2t_metrics/R10: 64.7
MSRVTT_jsfusion_test/v2t_metrics/R50: 88.5
MSRVTT_jsfusion_test/v2t_metrics/MedR: 5.0
MSRVTT_jsfusion_test/v2t_metrics/MeanR: 24.3035
MSRVTT_jsfusion_test/v2t_metrics/geometric_mean_R1-R5-R10: 41.93178476934956
mnt_best : 41.930820733928286
not_improved_count: 0
Train Epoch: 19 [1/250 128/32000 (0%)] Loss: 1.92468 (QuantReg: 15.29653) QuantErr: 15.29653 batch_time=31.43363
Train Epoch: 19 [12/250 1536/32000 (5%)] Loss: 1.74730 (QuantReg: 15.04210) QuantErr: 15.04210 batch_time=0.92388
Train Epoch: 19 [23/250 2944/32000 (9%)] Loss: 1.46651 (QuantReg: 15.39216) QuantErr: 15.39216 batch_time=0.94551
Train Epoch: 19 [34/250 4352/32000 (14%)] Loss: 1.79476 (QuantReg: 15.21726) QuantErr: 15.21726 batch_time=0.90137
Train Epoch: 19 [45/250 5760/32000 (18%)] Loss: 1.37357 (QuantReg: 15.47638) QuantErr: 15.47638 batch_time=0.90945
Train Epoch: 19 [56/250 7168/32000 (22%)] Loss: 1.22115 (QuantReg: 15.64889) QuantErr: 15.64889 batch_time=0.89606
Train Epoch: 19 [67/250 8576/32000 (27%)] Loss: 1.44654 (QuantReg: 15.70459) QuantErr: 15.70459 batch_time=0.93422
Train Epoch: 19 [78/250 9984/32000 (31%)] Loss: 1.33010 (QuantReg: 15.72937) QuantErr: 15.72937 batch_time=0.93470
Train Epoch: 19 [89/250 11392/32000 (36%)] Loss: 1.40101 (QuantReg: 16.25807) QuantErr: 16.25807 batch_time=0.96029
Train Epoch: 19 [100/250 12800/32000 (40%)] Loss: 1.60099 (QuantReg: 15.76589) QuantErr: 15.76589 batch_time=0.89154
Train Epoch: 19 [111/250 14208/32000 (44%)] Loss: 1.41880 (QuantReg: 15.84074) QuantErr: 15.84074 batch_time=0.98230
Train Epoch: 19 [122/250 15616/32000 (49%)] Loss: 2.27468 (QuantReg: 15.10261) QuantErr: 15.10261 batch_time=0.94567
Train Epoch: 19 [133/250 17024/32000 (53%)] Loss: 1.37711 (QuantReg: 15.46016) QuantErr: 15.46016 batch_time=0.90875
Train Epoch: 19 [144/250 18432/32000 (58%)] Loss: 2.09365 (QuantReg: 15.67321) QuantErr: 15.67321 batch_time=1.18388
Train Epoch: 19 [155/250 19840/32000 (62%)] Loss: 1.47882 (QuantReg: 15.86937) QuantErr: 15.86937 batch_time=0.91208
Train Epoch: 19 [166/250 21248/32000 (66%)] Loss: 1.57935 (QuantReg: 15.85233) QuantErr: 15.85233 batch_time=0.94699
Train Epoch: 19 [177/250 22656/32000 (71%)] Loss: 1.81412 (QuantReg: 16.11547) QuantErr: 16.11547 batch_time=0.90509
Train Epoch: 19 [188/250 24064/32000 (75%)] Loss: 1.59920 (QuantReg: 15.98169) QuantErr: 15.98169 batch_time=1.03687
Train Epoch: 19 [199/250 25472/32000 (80%)] Loss: 1.50413 (QuantReg: 16.42700) QuantErr: 16.42700 batch_time=0.94374
Train Epoch: 19 [210/250 26880/32000 (84%)] Loss: 1.20764 (QuantReg: 16.16178) QuantErr: 16.16178 batch_time=0.95028
Train Epoch: 19 [221/250 28288/32000 (88%)] Loss: 1.68380 (QuantReg: 15.63519) QuantErr: 15.63519 batch_time=0.92179
Train Epoch: 19 [232/250 29696/32000 (93%)] Loss: 1.13377 (QuantReg: 16.20352) QuantErr: 16.20352 batch_time=0.91693
Train Epoch: 19 [243/250 31104/32000 (97%)] Loss: 1.79107 (QuantReg: 15.79208) QuantErr: 15.79208 batch_time=0.90657
Train Epoch: 19 codebook_update_time=1.99010
Saving checkpoint: /apdcephfs/share_47076/gimwang/HCQ/exps/HCQ_MSRVTT_1kA_xlnet-large/checkpoint-epoch19.pth ...
Done in 11.495s
removing stale ckpt [epoch 18] [took 0.00s]
epoch : 19
loss : 1.6140804972648621
quant_reg : 15.734696117401123
quant_err : 15.734696117401123
learning_rate : 7.944286369164366e-06
n_samples : 608000
n_steps : 4750