Releases: KarelZe/thesis
Changes between 8 May - 14 May
Took Friday and Saturday off to get uni-related work done.
What's Changed
Empirical Study ⚗️
- Clean up of outdated files ♻️ by @KarelZe in #355
- Implemented correct feature importance measures (WIP) by @KarelZe in #322
  - includes SAGE values with zero-one loss and permutation in groups. Also opened issue iancovert/sage#18 to discuss the idea and implementation with the authors. (WIP)
  - includes visualizations of categorical embeddings with highly promising results.
  - includes new approach to calculate attention maps (Chefer et al.) (WIP)
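
For readers unfamiliar with the idea of permuting features in groups, here is a minimal sketch. It is plain grouped permutation importance scored with zero-one loss, not SAGE and not the code from #322; the function names, the `groups` mapping, and the `predict` callable are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence

Row = List[float]


def zero_one_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of misclassified samples."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def grouped_permutation_importance(
    predict: Callable[[List[Row]], List[int]],  # hypothetical model interface
    X: List[Row],
    y: Sequence[int],
    groups: Dict[str, List[int]],  # group name -> column indices (assumed layout)
    n_repeats: int = 5,
    seed: int = 0,
) -> Dict[str, float]:
    """Mean increase in zero-one loss when all columns of a group are shuffled together."""
    rng = random.Random(seed)
    baseline = zero_one_loss(y, predict(X))
    importances: Dict[str, float] = {}
    for name, cols in groups.items():
        increases = []
        for _ in range(n_repeats):
            # One shared row order per group keeps the group's columns
            # aligned with each other while breaking their link to y.
            order = list(range(len(X)))
            rng.shuffle(order)
            X_perm = [row[:] for row in X]  # copy, so X itself is not modified
            for i, src in enumerate(order):
                for c in cols:
                    X_perm[i][c] = X[src][c]
            increases.append(zero_one_loss(y, predict(X_perm)) - baseline)
        importances[name] = sum(increases) / n_repeats
    return importances
```

Shuffling a whole group at once avoids attributing importance to a single column when several correlated columns carry the same information.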
Writing 📖
- Add paragraphs on label smoothing, lr warmup, optimizer, and viz🤖 by @KarelZe in #350
- Add results hyperparameter search gradient-boosting 😺 by @KarelZe in #352
- Rewrite chapter Hyperparameter Search with updated results 🗺️ by @KarelZe in #354
- Chapter on the selection of supervised methods👩‍🎓 (WIP) by @KarelZe in #353
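
As a quick reminder of two of the training techniques named in #350, here is a minimal sketch of label smoothing and linear learning-rate warmup. It shows the common textbook formulations, not necessarily the variants used in the thesis; the function names and default values are assumptions.

```python
from typing import List


def smooth_labels(one_hot: List[float], epsilon: float = 0.1) -> List[float]:
    """Mix a one-hot target with the uniform distribution over the classes."""
    k = len(one_hot)
    return [(1.0 - epsilon) * p + epsilon / k for p in one_hot]


def warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Increase the learning rate linearly up to base_lr, then hold it constant."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    return base_lr * (step + 1) / warmup_steps
```

With two classes and `epsilon = 0.1`, the hard target `[0, 1]` becomes roughly `[0.05, 0.95]`, which discourages overconfident predictions.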
Other Changes
- Bump gcsfs from 2023.4.0 to 2023.5.0 by @dependabot in #351
- Bump google-auth from 2.17.3 to 2.18.0 by @dependabot in #357
Outlook 🔭
See https://github.com/users/KarelZe/projects/1/views/4.
Full Changelog: 23-19...23-20
Changes between 1 May - 7 May
What's Changed
Empirical Study ⚗️
- Implement Pre-Training🛝 (WIP) by @KarelZe in #343
- Implement and Study Feature Importances🪄 (WIP) by @KarelZe in #322
  - identified that random feature permutation won't work as expected
Writing 📖
- Extend chapter on hyperparameter tuning, training of supervised / semi-supervised methods 📖 by @KarelZe in #342
  - includes new insights into the training configuration of models
  - includes new insights on the hyperparameters and their necessity
  - identified minor mistakes that caused strongly fluctuating errors
Other Changes
- Bump typer from 0.8.0 to 0.9.0 by @dependabot in #341
- Bump requests from 2.29.0 to 2.30.0 by @dependabot in #344
Outlook 🔭
See https://github.com/users/KarelZe/projects/1/views/4.
Full Changelog: 23-18...23-19
Changes between 24 April - 30 April
What's Changed
Empirical Study ⚗️
- Fix typo in `train_model.py` 🐍 by @KarelZe in #323
- Add label smoothing🍷 by @KarelZe in #328
- Fix Transformer Implementation 🚑 by @KarelZe in #320
- Add logistic regression🌉 by @KarelZe in #329
Writing 📖
- Rewrite chapter on Pre-Training and Rewrite selection of semi-supervised methods🤖 by @KarelZe in #316
- Add improved visualizations 🖼️ by @KarelZe in #318
- Edit in review comments👩‍🎓 by @KarelZe in #319
- Various writing improvements📖 by @KarelZe in #324
- Discussion on computational demand + smaller fixes🏭 by @KarelZe in #338
- Add in missing page numbers🔢 by @KarelZe in #340
Other Changes
- Bump requests from 2.28.2 to 2.29.0 by @dependabot in #321
- Bump fastparquet from 2023.2.0 to 2023.4.0 by @dependabot in #325
- Bump typer from 0.7.0 to 0.8.0 by @dependabot in #339
Outlook 🔭
See https://github.com/users/KarelZe/projects/1/views/4.
Full Changelog: 23-17...23-18
Changes between 17 April - 23 April
What's Changed
Empirical Study ⚗️
- Started with EDA on unlabelled data. Still have to make sense of the results.
- Continued working on the invalid gradient problem. Haven't yet figured out how to reproduce it reliably.
Writing 📖
- Reworked chapter on token embeddings
- Reworked chapter on FT-Transformer
- Reworked chapter on decision trees
- Shortened several chapters
- Added chapter on Attention Mechanism
- Added chapter on Gradient Boosting Procedure
- Added discussion on Selection of Semi-Supervised Approaches
- Added chapter on Pre-Training of Transformers
- Various other improvements: notation, viz, typos, 🇺🇸 / 🇬🇧 dialect, etc.
Other Changes
- Bump psutil from 5.9.4 to 5.9.5 by @dependabot in #313
Outlook 🔭
See https://github.com/users/KarelZe/projects/1/views/4.
Full Changelog: 23-16...23-17
Changes between 10 April - 16 April
What's Changed
Empirical Study ⚗️
- Implement proper training setup for transformers🤖 by @KarelZe in #292
- Remove TabTransformer🤖 by @KarelZe in #305
Writing 📖
- Fill in gap for trade initiator definition🧑‍🌾 by @KarelZe in #217
- Chapter on Self-Training⭕ by @KarelZe in #296
- Chapter on hyperparameter tuning🏎️ by @KarelZe in #300
- Streamline thesis 🚈 by @KarelZe in #302
- Add chapter on TokenEmbeddings💤 by @KarelZe in #307
- Streamline writing of thesis🪜 by @KarelZe in #297
- Paragraph on Random Feature Permutation / Partial Dependence Plots📑 by @KarelZe in #310
- Edit in comments from Patrick🐥 by @KarelZe in #308
Other Changes
- Bump gcsfs from 2023.3.0 to 2023.4.0 by @dependabot in #298
- Add chapter on hyperparameter tuning (current state)🏎️ by @KarelZe in #295
- Bump google-auth from 2.17.2 to 2.17.3 by @dependabot in #303
Outlook
Full Changelog: 23-15...23-16
Changes between 3 April - 9 April
What's Changed
Empirical Study ⚗️
- Fix totals in tables📊 by @KarelZe in #276
- Add retraining / semi-supervised mode to gradient boosting😺 by @KarelZe in #278
- Create summary statistics classical trade classification rules📊 by @KarelZe in #279
- Code review of data preparation notebooks😈 by @KarelZe in #280
- Run studies for `SelfTrainingClassifier` 🅾️ by @KarelZe in #249
- Fix statistical tests in effective spread calculation🌄 by @KarelZe in #281
- Add transfer learning results🔄️ by @KarelZe in #285
- Select benchmark on validation set🔧 by @KarelZe in #291
- Delete references to Docker⚓ by @KarelZe in #294
Writing 📖
- Chapter on evaluation metric🪙 by @KarelZe in #216
- Delete outdated files and add questions for meeting❌ by @KarelZe in #283
- Chapter on Semi-Supervised Learning🦯 by @KarelZe in #284
- Various improvements: evaluation metric, hyperparameter tuning, and application study🎩 by @KarelZe in #286
Other Changes
- Bump google-auth from 2.17.1 to 2.17.2 by @dependabot in #288
Full Changelog: 23-14...23-15
Changes between 27 March and 2 April
What's Changed
Empirical Study ⚗️
- Allow unclassified in ClassicalClassifier🏦 by @KarelZe in #219
- Implement Self-Training for CatBoost⭕ by @KarelZe in #215
- Extend result generation🏁 by @KarelZe in #228
- Improve Result Tables🖨️ by @KarelZe in #234
- Fix midpoint/spread in `ClassicalClassifier` 🐞 by @KarelZe in #235
- Improve feature engineering notebook🤏 by @KarelZe in #236
- Remove the zero imputation from feature set mode none🐞 by @KarelZe in #239
- Generate ISE / CBOE supervised results of Gradient Boosting🐈 by @KarelZe in #243
- Improvement of resumable studies and `SelfTrainingClassifier` 🅾️ by @KarelZe in #246 and in #224
- Run studies for `SelfTrainingClassifier` 🅾️ (WIP) by @KarelZe in #249
- Add visualizations of hyperparameter search space and fix minor typos🌔 by @KarelZe in #248
Writing 📖
- Chapter on Feature Engineering🪄 by @KarelZe in #212
- Update chapter on dataset/results 📑 by @KarelZe in #237
- Run studies for `SelfTrainingClassifier` 🅾️ (WIP) by @KarelZe in #249
- Add chapter on random feature permutation🔀 (WIP) by @KarelZe in #217
Other Changes
- Bump google-auth from 2.16.3 to 2.17.0 by @dependabot in #229
- Bump google-auth from 2.17.0 to 2.17.1 by @dependabot in #242
Outlook 🔭
See https://github.com/users/KarelZe/projects/1/views/4.
Full Changelog: 23-13...23-14
Changes between 20 March and 26 March
Picked up work again on Thursday.
What's Changed
Empirical Study ⚗️
- Fix NaN gradients🐞 by @KarelZe in #137
- Implement Self-Training Classifier⭕ (WIP) by @KarelZe in #209
- Allow unclassified in ClassicalClassifier🏦 (WIP) by @KarelZe in #218
Writing 📖
- Add questions for tomorrow❓ by @KarelZe in #205
- Create chapter on Data-Preprocessing🌋 by @KarelZe in #214
- Section on Feature Engineering🪄 (WIP) by @KarelZe in #208
- Chapter on Self-Training⭕ (WIP) by @KarelZe in #209
- Paragraph on Random Feature Permutation📑(WIP) by @KarelZe in #210
Other Changes
- Bump google-auth from 2.16.2 to 2.16.3 by @dependabot in #213
Outlook 🛩️
Full Changelog: 23-12...23-13
Changes between 13 March and 19 March
Didn't work 100 % on thesis. Spent most time on exam prep.
BwHPC Cluster is down until Friday. Thus, I will spend my time after the exam on writing ✏️.
What's Changed
Empirical Study ⚗️
- Generate results for classical classifier + effective spread👸 by @KarelZe in #200
- Automatic generation of results tables🏇 by @KarelZe in #201
- Automatic result / viz generation for gradient boosting🙀 by @KarelZe in #203
- Add ROC / Recall curves to notebooks🦉 by @KarelZe in #204
- Extended pipeline for result generation🛕 by @KarelZe in #202
- Gathered some ideas on how to retrieve the feature importances / need to correct probabilities.
Outlook🎒
- exam prep (Mo - Wed)
- write the chapter on data preprocessing incl. viz
- shorten / rewrite the chapter on feature engineering
- prewrite the sub-chapter on random feature permutation. Make sure it is the best possible choice.
- create prototype for grouped random feature permutation
- review and test #137
Full Changelog: 23-11...23-12
Changes between 6 March and 12 March
Didn't work 100 % on thesis. Spent some time on exam prep.
What's Changed
Writing 📖
- Chapter on environment🪐 by @KarelZe in #196
- Add chapter on train-test split🥮 by @KarelZe in #193
- Streamline chapter on train-test split and improve visualizations🚂 by @KarelZe in #197 and #198
- Add chapter on trade initiator🥯 by @KarelZe in #199
Other Changes
- Bump tqdm from 4.64.1 to 4.65.0 by @dependabot in #195
- Bump gcsfs from 2023.1.0 to 2023.3.0 by @dependabot in #194
Outlook🎒
- finish remaining tasks from last week
- exam prep
Full Changelog: 23-10...23-11