
Commit 9e78b91

Merge pull request #13544 from tensorflow:bharatjetti-patch-4
PiperOrigin-RevId: 752566225
2 parents: 1775a8a + 0300571

1 file changed: +16 -15


official/nlp/docs/pretrained_models.md (+16 -15)
@@ -1,8 +1,9 @@
 # Pre-trained Models
 
 ⚠️ Disclaimer: Checkpoints are based on training with publicly available datasets.
-Some datasets contain limitations, including non-commercial use limitations. Please review the terms and conditions made available by third parties before using
-the datasets provided. Checkpoints are licensed under
+Some datasets contain limitations, including non-commercial use limitations.
+Please review the terms and conditions made available by third parties before
+using the datasets provided. Checkpoints are licensed under
 [Apache 2.0](https://github.com/tensorflow/models/blob/master/LICENSE).
 
 ⚠️ Disclaimer: Datasets hyperlinked from this page are not owned or distributed
@@ -16,8 +17,9 @@ models.
 
 ### How to Initialize from Checkpoint
 
-**Note:** TF-HUB/Savedmodel is the preferred way to distribute models as it is
-self-contained. Please consider using TF-HUB for finetuning tasks first.
+**Note:** TF-HUB/Kaggle-Savedmodel is the preferred way to distribute models as
+it is self-contained. Please consider using TF-HUB/Kaggle for finetuning tasks
+first.
 
 If you use the [NLP training library](train.md),
 you can specify the checkpoint path link directly when launching your job. For
@@ -29,11 +31,11 @@ python3 train.py \
   --params_override=task.init_checkpoint=PATH_TO_INIT_CKPT
 ```
 
-### How to load TF-HUB SavedModel
+### How to load TF-HUB/Kaggle SavedModel
 
 Finetuning tasks such as question answering (SQuAD) and sentence
-prediction (GLUE) support loading a model from TF-HUB. These built-in tasks
-support a specific `task.hub_module_url` parameter. To set this parameter,
+prediction (GLUE) support loading a model from TF-HUB/Kaggle. These built-in
+tasks support a specific `task.hub_module_url` parameter. To set this parameter,
 replace `--params_override=task.init_checkpoint=...` with
 `--params_override=task.hub_module_url=TF_HUB_URL`, like below:
 
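The example command that "like below:" refers to falls outside this hunk's context. As a minimal sketch of that invocation, assuming the same `train.py` entry point shown above (the experiment name, mode, and model directory are illustrative placeholders, not taken from this diff; the hub URL is one listed in the checkpoint table below):

```bash
# Sketch: fine-tune a built-in task from a TF-HUB/Kaggle SavedModel instead of
# a raw checkpoint. --experiment, --mode, and --model_dir values are
# illustrative placeholders.
python3 train.py \
  --experiment=bert/sentence_prediction \
  --mode=train_and_eval \
  --model_dir=$MODEL_DIR \
  --params_override=task.hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/
```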

@@ -54,7 +56,7 @@ in order to keep consistent with BERT paper.
 
 ### Checkpoints
 
-Model | Configuration | Training Data | Checkpoint & Vocabulary | TF-HUB SavedModels
+Model | Configuration | Training Data | Checkpoint & Vocabulary | Kaggle SavedModels
 ---------------------------------------- | :--------------------------: | ------------: | ----------------------: | ------:
 BERT-base uncased English | uncased_L-12_H-768_A-12 | Wiki + Books | [uncased_L-12_H-768_A-12](https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/uncased_L-12_H-768_A-12.tar.gz) | [`BERT-Base, Uncased`](https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/)
 BERT-base cased English | cased_L-12_H-768_A-12 | Wiki + Books | [cased_L-12_H-768_A-12](https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/cased_L-12_H-768_A-12.tar.gz) | [`BERT-Base, Cased`](https://tfhub.dev/tensorflow/bert_en_cased_L-12_H-768_A-12/)
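The "Checkpoint & Vocabulary" column links to tarballs on Google Cloud Storage. A minimal sketch of fetching one and wiring it into `task.init_checkpoint`, assuming the archive unpacks to a directory containing `bert_model.ckpt` files (the inner layout is an assumption; inspect the extracted archive before relying on the path):

```bash
# Sketch: download and unpack the BERT-base uncased checkpoint, then pass it
# to a training job. The extracted directory layout is assumed, not taken
# from this diff.
wget https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/uncased_L-12_H-768_A-12.tar.gz
tar -xzf uncased_L-12_H-768_A-12.tar.gz
python3 train.py \
  --params_override=task.init_checkpoint=./uncased_L-12_H-768_A-12/bert_model.ckpt
```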
@@ -74,7 +76,7 @@ We also have pretrained BERT models with variants in both network architecture
 and training methodologies. These models achieve higher downstream accuracy
 scores.
 
-Model | Configuration | Training Data | TF-HUB SavedModels | Comment
+Model | Configuration | Training Data | Kaggle SavedModels | Comment
 -------------------------------- | :----------------------: | -----------------------: | ------------------------------------------------------------------------------------: | ------:
 BERT-base talking heads + ggelu | uncased_L-12_H-768_A-12 | Wiki + Books | [talkheads_ggelu_base](https://tfhub.dev/tensorflow/talkheads_ggelu_bert_en_base/1) | BERT-base trained with [talking heads attention](https://arxiv.org/abs/2003.02436) and [gated GeLU](https://arxiv.org/abs/2002.05202).
 BERT-large talking heads + ggelu | uncased_L-24_H-1024_A-16 | Wiki + Books | [talkheads_ggelu_large](https://tfhub.dev/tensorflow/talkheads_ggelu_bert_en_large/1) | BERT-large trained with [talking heads attention](https://arxiv.org/abs/2003.02436) and [gated GeLU](https://arxiv.org/abs/2002.05202).
@@ -96,13 +98,12 @@ ALBERT repository.
 
 ### Checkpoints
 
-Model | Training Data | Checkpoint & Vocabulary | TF-HUB SavedModels
+Model | Training Data | Checkpoint & Vocabulary | Kaggle SavedModels
 ---------------------------------------- | ------------: | ----------------------: | ------:
-ALBERT-base English | Wiki + Books | [`ALBERT Base`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_base.tar.gz) | https://tfhub.dev/tensorflow/albert_en_base/3
-ALBERT-large English | Wiki + Books | [`ALBERT Large`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_large.tar.gz) | https://tfhub.dev/tensorflow/albert_en_large/3
-ALBERT-xlarge English | Wiki + Books | [`ALBERT XLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xlarge.tar.gz) | https://tfhub.dev/tensorflow/albert_en_xlarge/3
-ALBERT-xxlarge English | Wiki + Books | [`ALBERT XXLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xxlarge.tar.gz) | https://tfhub.dev/tensorflow/albert_en_xxlarge/3
-
+ALBERT-base English | Wiki + Books | [`ALBERT Base`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_base.tar.gz) | [albert_en_base](https://tfhub.dev/tensorflow/albert_en_base/3)
+ALBERT-large English | Wiki + Books | [`ALBERT Large`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_large.tar.gz) | [albert_en_large](https://tfhub.dev/tensorflow/albert_en_large/3)
+ALBERT-xlarge English | Wiki + Books | [`ALBERT XLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xlarge.tar.gz) | [albert_en_xlarge](https://tfhub.dev/tensorflow/albert_en_xlarge/3)
+ALBERT-xxlarge English | Wiki + Books | [`ALBERT XXLarge`](https://storage.googleapis.com/tf_model_garden/nlp/albert/albert_xxlarge.tar.gz) | [albert_en_xxlarge](https://tfhub.dev/tensorflow/albert_en_xxlarge/3)
 
 ## ELECTRA
 