# Pre-trained Models

⚠️ Disclaimer: Checkpoints are based on training with publicly available datasets.
- Some datasets contain limitations, including non-commercial use limitations. Please review the terms and conditions made available by third parties before using
- the datasets provided. Checkpoints are licensed under
+ Some datasets contain limitations, including non-commercial use limitations.
+ Please review the terms and conditions made available by third parties before
+ using the datasets provided. Checkpoints are licensed under
[Apache 2.0](https://github.com/tensorflow/models/blob/master/LICENSE).

⚠️ Disclaimer: Datasets hyperlinked from this page are not owned or distributed
@@ -16,8 +17,9 @@ models.

### How to Initialize from Checkpoint

- **Note:** TF-HUB/Savedmodel is the preferred way to distribute models as it is
- self-contained. Please consider using TF-HUB for finetuning tasks first.
+ **Note:** TF-HUB/Kaggle SavedModel is the preferred way to distribute models as
+ it is self-contained. Please consider using TF-HUB/Kaggle for finetuning tasks
+ first.

If you use the [NLP training library](train.md),
you can specify the checkpoint path directly when launching your job. For
@@ -29,11 +31,11 @@ python3 train.py \
--params_override=task.init_checkpoint=PATH_TO_INIT_CKPT
```
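
Before launching, it can help to confirm that the checkpoint prefix resolves and to inspect the variables it carries. A minimal sketch, assuming TF2 and an already-downloaded, extracted checkpoint; the `bert_model.ckpt` prefix name is an assumption, not taken from this page:

```python
import tensorflow as tf

# Sanity check (not part of the training library): list the variables in a
# checkpoint before passing its path to --params_override=task.init_checkpoint.
# The path is the checkpoint prefix (an assumed name like ".../bert_model.ckpt"),
# not the .tar.gz archive.
reader = tf.train.load_checkpoint("path/to/bert_model.ckpt")
for name, shape in sorted(reader.get_variable_to_shape_map().items()):
    print(name, shape)
```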

- ### How to load TF-HUB SavedModel
+ ### How to load TF-HUB/Kaggle SavedModel

Finetuning tasks such as question answering (SQuAD) and sentence
- prediction (GLUE) support loading a model from TF-HUB. These built-in tasks
- support a specific `task.hub_module_url` parameter. To set this parameter,
+ prediction (GLUE) support loading a model from TF-HUB/Kaggle. These built-in
+ tasks support a specific `task.hub_module_url` parameter. To set this parameter,
replace `--params_override=task.init_checkpoint=...` with
`--params_override=task.hub_module_url=TF_HUB_URL`, like below:

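The same handle that goes into `task.hub_module_url` can also be loaded directly with the `tensorflow_hub` library, which is a quick way to verify a URL before launching a job. A minimal sketch, not the Model Garden task code; the versioned BERT-Base handle is an assumption based on the checkpoints table below:

```python
import tensorflow_hub as hub

# Load the SavedModel behind a TF-HUB/Kaggle handle as a Keras layer.
# trainable=True makes the encoder weights finetunable rather than frozen.
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=True)
```
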
@@ -54,7 +56,7 @@ in order to keep consistent with BERT paper.

### Checkpoints

- Model | Configuration | Training Data | Checkpoint & Vocabulary | TF-HUB SavedModels
+ Model | Configuration | Training Data | Checkpoint & Vocabulary | Kaggle SavedModels
---------------------------------------- | :--------------------------: | ------------: | ----------------------: | ------:
BERT-base uncased English | uncased_L-12_H-768_A-12 | Wiki + Books | [uncased_L-12_H-768_A-12](https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/uncased_L-12_H-768_A-12.tar.gz) | [`BERT-Base, Uncased`](https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/)
BERT-base cased English | cased_L-12_H-768_A-12 | Wiki + Books | [cased_L-12_H-768_A-12](https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/cased_L-12_H-768_A-12.tar.gz) | [`BERT-Base, Cased`](https://tfhub.dev/tensorflow/bert_en_cased_L-12_H-768_A-12/)
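
Each checkpoint link above is a tarball holding the checkpoint files and the vocabulary. A sketch of fetching one with plain TensorFlow utilities; the choice of `tf.keras.utils.get_file` and its cache location are assumptions, and any downloader works:

```python
import tensorflow as tf

# Download and unpack one of the archives from the table above; the tarball
# contains the checkpoint files and the vocabulary. get_file caches under
# ~/.keras by default and returns the local path.
path = tf.keras.utils.get_file(
    "uncased_L-12_H-768_A-12.tar.gz",
    "https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/uncased_L-12_H-768_A-12.tar.gz",
    extract=True)
print(path)
```
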
@@ -74,7 +76,7 @@ We also have pretrained BERT models with variants in both network architecture
and training methodologies. These models achieve higher downstream accuracy
scores.

- Model | Configuration | Training Data | TF-HUB SavedModels | Comment
+ Model | Configuration | Training Data | Kaggle SavedModels | Comment
-------------------------------- | :----------------------: | -----------------------: | ------------------------------------------------------------------------------------: | ------:
BERT-base talking heads + ggelu | uncased_L-12_H-768_A-12 | Wiki + Books | [talkheads_ggelu_base](https://tfhub.dev/tensorflow/talkheads_ggelu_bert_en_base/1) | BERT-base trained with [talking heads attention](https://arxiv.org/abs/2003.02436) and [gated GeLU](https://arxiv.org/abs/2002.05202).
BERT-large talking heads + ggelu | uncased_L-24_H-1024_A-16 | Wiki + Books | [talkheads_ggelu_large](https://tfhub.dev/tensorflow/talkheads_ggelu_bert_en_large/1) | BERT-large trained with [talking heads attention](https://arxiv.org/abs/2003.02436) and [gated GeLU](https://arxiv.org/abs/2002.05202).
@@ -96,13 +98,12 @@ ALBERT repository.

### Checkpoints

- Model | Training Data | Checkpoint & Vocabulary | TF-HUB SavedModels
+ Model | Training Data | Checkpoint & Vocabulary | Kaggle SavedModels
---------------------------------------- | ------------: | ----------------------: | ------:
- ALBERT-base English | Wiki + Books | [`ALBERT Base`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_base.tar.gz) | https://tfhub.dev/tensorflow/albert_en_base/3
- ALBERT-large English | Wiki + Books | [`ALBERT Large`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_large.tar.gz) | https://tfhub.dev/tensorflow/albert_en_large/3
- ALBERT-xlarge English | Wiki + Books | [`ALBERT XLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xlarge.tar.gz) | https://tfhub.dev/tensorflow/albert_en_xlarge/3
- ALBERT-xxlarge English | Wiki + Books | [`ALBERT XXLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xxlarge.tar.gz) | https://tfhub.dev/tensorflow/albert_en_xxlarge/3
-
+ ALBERT-base English | Wiki + Books | [`ALBERT Base`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_base.tar.gz) | [albert_en_base](https://tfhub.dev/tensorflow/albert_en_base/3)
+ ALBERT-large English | Wiki + Books | [`ALBERT Large`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_large.tar.gz) | [albert_en_large](https://tfhub.dev/tensorflow/albert_en_large/3)
+ ALBERT-xlarge English | Wiki + Books | [`ALBERT XLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xlarge.tar.gz) | [albert_en_xlarge](https://tfhub.dev/tensorflow/albert_en_xlarge/3)
+ ALBERT-xxlarge English | Wiki + Books | [`ALBERT XXLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xxlarge.tar.gz) | [albert_en_xxlarge](https://tfhub.dev/tensorflow/albert_en_xxlarge/3)

## ELECTRA