* allow scoring with features only (gradients deleted)
* migrate to black codestyle
* update scores_finalized in JSON file
* blockwise scoring (relevant when scoring large datasets, i.e. many targets)
* move the vectorize function to projectors; the fast projector (incoming feature) will not vectorize
* save on I/O overhead by only writing once to disk when scoring
* custom CudaProjector for large models to avoid overflow error in CUDA kernel
* allow computing TRAK with respect to specified parameter groups
---------
Co-authored-by: Alaa Khaddaj <alaakh@mit.edu>
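
To illustrate the "blockwise scoring" item in the commit message, here is a minimal sketch in plain Python. It is not the code from this commit: it assumes, as a simplification, that a score is the inner product between a training feature vector and a target feature vector, and every name in it (`dot`, `iter_blocks`, `score_blockwise`, `block_size`) is hypothetical. It only shows how targets can be processed in fixed-size blocks so that a single block's worth of targets is handled at a time.

```python
from typing import Iterator, List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def iter_blocks(n_items: int, block_size: int) -> Iterator[range]:
    """Yield consecutive index ranges of at most block_size items."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    for start in range(0, n_items, block_size):
        yield range(start, min(start + block_size, n_items))


def score_blockwise(
    train_feats: Sequence[Vector],
    target_feats: Sequence[Vector],
    block_size: int,
) -> List[List[float]]:
    """Fill a (num_train x num_targets) score matrix one target block at a time."""
    scores = [[0.0] * len(target_feats) for _ in train_feats]
    for block in iter_blocks(len(target_feats), block_size):
        # Only the targets in the current block are touched in this pass.
        for i, train_vec in enumerate(train_feats):
            for j in block:
                scores[i][j] = dot(train_vec, target_feats[j])
    return scores


if __name__ == "__main__":
    train = [[1.0, 0.0], [0.0, 2.0]]
    targets = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]
    print(score_blockwise(train, targets, block_size=2))
```

The result does not depend on `block_size`; the block size only bounds how many targets are handled per pass, which is the property that matters when there are many targets.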
docs/source/bert.rst (+57 -30)
@@ -63,7 +63,7 @@ to fit our API signatures.
 
 We slightly redefine the :code:`forward` function so that we can pass in the inputs (:code:`input_ids`, etc.) as positional arguments instead of as keyword arguments.
 
-For data loading, we adapt the code from Hugging Face example:
+For data loading, we adapt the code from the HuggingFace example:
 
 .. raw:: html
 
@@ -132,7 +132,7 @@ For data loading, we adapt the code from Hugging Face example:
 
 #NOTE: CHANGE THIS IF YOU WANT TO RUN ON FULL DATASET
 TRAIN_SET_SIZE=5_000
-VAL_SET_SIZE=1_00
+VAL_SET_SIZE=10
 
 def init_loaders(batch_size=16):
     ds_train = get_dataset('train')
@@ -180,38 +180,59 @@ The model output function is implemented as follows:
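
The first hunk of docs/source/bert.rst sits next to the sentence about redefining :code:`forward` so that inputs are passed positionally instead of as keyword arguments. The following is a minimal, framework-free sketch of that idea, not the code from the docs. The `PositionalForward` class and `toy_model` function are hypothetical. The argument names `token_type_ids` and `attention_mask` are assumed, since the source only says :code:`input_ids`, etc.

```python
from typing import Any, Callable, Sequence


class PositionalForward:
    """Wrap a keyword-only callable so it can be called with positional inputs."""

    def __init__(self, model: Callable[..., Any], arg_names: Sequence[str]):
        self.model = model
        self.arg_names = tuple(arg_names)

    def forward(self, *inputs: Any) -> Any:
        if len(inputs) != len(self.arg_names):
            raise TypeError(
                f"expected {len(self.arg_names)} inputs, got {len(inputs)}"
            )
        # Map each positional input back to the keyword the wrapped model expects.
        return self.model(**dict(zip(self.arg_names, inputs)))


def toy_model(*, input_ids, token_type_ids, attention_mask):
    """Hypothetical stand-in for a model whose inputs must be given by keyword."""
    return {
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
        "attention_mask": attention_mask,
    }


if __name__ == "__main__":
    wrapped = PositionalForward(
        toy_model, ("input_ids", "token_type_ids", "attention_mask")
    )
    print(wrapped.forward([101, 102], [0, 0], [1, 1]))
```

Fixing the argument order once in the wrapper lets downstream code unpack a batch tuple straight into `forward(*batch)` without knowing the keyword names.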