Commit c7c7961

finish
1 parent 1d076f1 commit c7c7961

6 files changed: +4 -46 lines changed

.idea/workspace.xml

+2 -6
Generated file; diff not rendered by default.

README.md

+2 -32
@@ -337,7 +337,7 @@ You can check detail of dataset [here](https://arxiv.org/abs/1605.00459) <br>
 I follow original paper's parameter settings. (below) <br>
 
 ![conf](image/transformer-model-size.jpg)
-### 2.1 Transformer - Baseline
+### 2.1 Model Specification
 
 * total parameters = 55,207,087
 * model size = 215.7MB
@@ -362,7 +362,7 @@ I follow original paper's parameter settings. (below) <br>
 * clip = 1
 * weight_decay = 5e-4
 
-#### 2.1.2 Training Result
+#### 2.2 Training Result
 
 ![image](saved/transformer-base/train_result.jpg)
 * Minimum Training Loss = 2.852672759656864
@@ -376,36 +376,6 @@ I follow original paper's parameter settings. (below) <br>
 
 <br><br>
 
-### 2.2 Transformer - Big
-* total parameters = 232,082,095
-* model size = 906.6MB
-* lr scheduling : ReduceLROnPlateau
-
-#### 2.2.1 configuration
-
-* batch_size = 32
-* max_len = 256
-* d_model = 1024
-* n_layers = 6
-* n_heads = 16
-* ffn_hidden = 4096
-* drop_prob = 0.3
-* init_lr = 0.1
-* factor = 0.9
-* min_lr = init_lr * 1e-12
-* patience = 10
-* warmup = 300
-* adam_eps = 5e-9
-* epoch = 3000
-* clip = 1
-* weight_decay = 5e-4
-
-
-#### 2.2.2 Training Result
-
-Training now ...
-
-<br><br>
 
 ## 3. Reference
 |Reference|
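
Note on the figures in the README.md hunks: the quoted "total parameters" and "model size" values are mutually consistent if each parameter is stored as a 4-byte float32 and the byte count is divided by 1024 and then by 1000. That divisor is an assumption inferred from the arithmetic, not something the README states, and `model_size_mb` is a hypothetical helper name used only for this sketch.

```python
# Assumption: weights are float32, i.e. 4 bytes per parameter.
BYTES_PER_FLOAT32 = 4


def model_size_mb(n_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Estimate model size from a parameter count.

    Uses a mixed divisor (1024, then 1000). This is an assumption chosen
    because it reproduces the figures quoted in the README; it is not a
    standard MB (10**6 bytes) or MiB (2**20 bytes) conversion.
    """
    return n_params * bytes_per_param / 1024 / 1000


# Parameter counts copied from the README hunks.
for name, n_params in [("base", 55_207_087), ("big", 232_082_095)]:
    print(f"{name}: {model_size_mb(n_params):.1f}MB")
# prints:
# base: 215.7MB
# big: 906.6MB
```

With a strict MiB conversion (dividing by 1024 twice) the same counts would give roughly 210.6 and 885.3, so the README values should be read as approximate.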

result/bleu.txt

-1
This file was deleted.

result/test_loss.txt

-1
This file was deleted.

result/train_loss.txt

-1
This file was deleted.

saved/__init__.py

-5
This file was deleted.
