@liranc6 liranc6 commented Jul 31, 2025

No description provided.

liranc6 added 6 commits July 28, 2025 10:56
… fast eval.

files changed: eval.py, metrics/knowmem.py, metrics/privleak.py, metrics/verbmem.py
The primary purpose is to improve evaluation robustness and flexibility when managing model outputs and debug workflows.

The primary changes are:

- Updated `eval_model` to ensure `forget_data`, `retain_data`, and `holdout_data` are initialized consistently before use.
- Replaced hardcoded paths with `os.path.join` using `MUSE_DIR` in `eval_model` for improved path handling.
- Added a `kwargs` parameter to both `eval_model` and `load_then_eval_models` to support dynamic control over file creation and loading.
- Implemented conditional logic in `eval_model` for managing `privleak` file generation based on `kwargs['create_new_files']`.
- Removed unused imports and dynamic import logic from `eval.py`, replacing `importlib` with `sys.path.append` to streamline module loading.
- Improved debug visibility in `eval_model` with additional `print` statements for key file paths and parameter values.
- Increased `debug_subset_len` from 2 to 50 in `eval_model` for broader test coverage during debug mode.
- Updated `exp.ipynb` to align with changes in model handling and evaluation behavior in `eval_model`.
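A minimal sketch of the kwargs-driven file handling described above. `MUSE_DIR`, the exact `eval_model` signature, and the `privleak_<corpus>.json` filename are illustrative assumptions, not the project's actual code:

```python
import os

# Hypothetical stand-in for the project's MUSE_DIR constant.
MUSE_DIR = os.environ.get("MUSE_DIR", ".")

def eval_model(corpus, out_dir, **kwargs):
    # Build paths with os.path.join instead of hardcoded strings.
    privleak_path = os.path.join(MUSE_DIR, out_dir, f"privleak_{corpus}.json")
    create_new = kwargs.get("create_new_files", False)
    # Regenerate the privleak file only when explicitly requested,
    # or when it does not exist yet; otherwise load the cached one.
    if create_new or not os.path.exists(privleak_path):
        return privleak_path, "generate"
    return privleak_path, "load"
```

Passing `create_new_files` through `**kwargs` keeps the public signatures of `eval_model` and `load_then_eval_models` stable while still allowing callers to force regeneration.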
@liranc6 liranc6 force-pushed the initial-checks-cosmetic-edits branch from 8f76d7d to aba8ff4 on August 11, 2025.
Purpose: Improve the clarity and depth of Input Loss Landscape (ILL) evaluation, and introduce new tools for classifier-based analysis.

Changes:
- Updated  to clean outputs, improve ROC curves, and set .
- Improved structure and markdown clarity in , with added analysis on loss distributions and unlearning.
- Added  and  for classifier-based ILL feature exploration.
- Added  script for reproducible, scriptable Random Forest analysis.

These updates improve reproducibility, interpretability, and support deeper ILL feature analysis.
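As an illustration of the "reproducible, scriptable Random Forest analysis" mentioned above, a self-contained sketch on synthetic data. The feature matrix shape and labels are stand-ins, not the project's actual ILL features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 8))                  # stand-in for an ILL feature matrix
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic membership labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
# ROC-AUC on held-out data; a fixed random_state keeps the run reproducible.
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```

Fixing every random seed (data generation, split, and forest) is what makes the script reproducible end to end.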
…ion in notebooks

The primary purpose is to fix broken imports and implement functional Input Loss Landscape feature computation for machine learning interpretability analysis.

The primary changes are:

- Enhanced import structure in  with additional sklearn modules and SHAP availability check.
- Replaced broken  function calls with working ILL feature computation pipeline.
- Added comprehensive logistic regression analysis with performance metrics, confusion matrix, and feature importance analysis.
- Integrated permutation importance computation and visualization for feature interpretability.
- Fixed execution flow by removing error-prone cells and replacing with successful feature extraction results.
- Updated notebook outputs to show successful ILL feature computation for forget/retain/holdout datasets.
- Added baseline logistic regression performance evaluation with 74% accuracy and detailed classification report.
- Modified  to align with the working implementation in .
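The baseline analysis above can be sketched as follows, on synthetic data; the real notebook uses ILL features, so the shapes and labels here are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))
y = (X[:, 0] - X[:, 2] > 0).astype(int)     # forget-vs-retain stand-in labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = model.predict(X_te)

acc = accuracy_score(y_te, pred)            # overall accuracy
cm = confusion_matrix(y_te, pred)           # 2x2 confusion matrix
# Permutation importance: how much shuffling each feature hurts the score.
perm = permutation_importance(model, X_te, y_te, n_repeats=5, random_state=1)
```

`perm.importances_mean` gives one interpretability score per feature, which is what the notebook visualizes.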
The primary purpose is to evaluate the loss landscape of first neighbor sentences to understand the impact of unlearning.

The primary changes are:

- Created a new notebook `MUSE/notebooks/1st_neighbor_classification.ipynb` to analyze the loss landscape of first neighbor sentences.

- Modified `loss_landscape.py` to extract logic into `new_ILL_eval`, `get_features`, and `normalize_features`.

- Replaced dynamic imports with `sys.path` appends in `utils.py`.

- Added `transformers` to `requirements.txt`.

- Increased UMAP dimensionality from 2 to 10 in `embedding.py`.

- Added AUC heatmap and bar chart of top features in `visualization.py`.

- Modified `plotting.py` to return `matplotlib` figure objects instead of file paths.

- Updated `plotting.py` to align with changes made in `visualization.py`.
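The design change in `plotting.py` might look like the following sketch: return the `matplotlib` `Figure` instead of saving to disk and returning a path, so callers decide how to render or save. The function name and plot contents are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripts/CI
import matplotlib.pyplot as plt
import numpy as np

def plot_loss_landscape(losses):
    # Build the figure but do NOT save it; hand it back to the caller.
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot(np.asarray(losses), marker="o")
    ax.set_xlabel("perturbation step")
    ax.set_ylabel("loss")
    return fig  # caller may fig.savefig(...) or display inline in a notebook
```

Returning the figure object composes better with notebooks (inline display) and with test code, at the cost of making the caller responsible for closing or saving it.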
The purpose of this change is to prevent errors when saving the statistical distances heatmap.

The changes include:

- Added `os.makedirs(plots_base_dir, exist_ok=True)` before saving the heatmap in `eval_with_ILL.py` to ensure the directory exists.
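The fix above in miniature; the helper name is illustrative, but the pattern (create the directory tree before `savefig`) is the one described:

```python
import os
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

def save_heatmap(fig, plots_base_dir, name="stat_distances_heatmap.png"):
    # Create the plots directory (including parents) before saving;
    # exist_ok=True makes this a no-op when the directory already exists.
    os.makedirs(plots_base_dir, exist_ok=True)
    out_path = os.path.join(plots_base_dir, name)
    fig.savefig(out_path)
    return out_path
```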
The primary purpose is to provide a reproducible workflow for evaluating Input Loss Landscape (ILL) features on the TOFU dataset using a Llama-2-7b model.

The primary changes are:

- Added `TOFU/notebooks/eval_with_ILL.ipynb` containing a step-by-step pipeline for:

  - Loading and preprocessing the TOFU dataset from Hugging Face.

  - Loading model and tokenizer with correct prompt formatting.

  - Running ILL evaluation using project utilities and saving results.

  - Extracting and normalizing ILL feature tensors for analysis.

  - Visualizing loss landscape features with matplotlib plots.

- The notebook demonstrates integration between the TOFU, MUSE, and project source directories.

- Example code for prompt formatting, model inference, and loss calculation is included for clarity.

- Notebook serves as a reference for future ILL experiments and analysis on TOFU.
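The "extract and normalize ILL feature tensors" step can be sketched as a per-feature z-score, so features from the forget/retain/holdout splits are comparable; the actual feature layout in the notebook is an assumption here:

```python
import numpy as np

def normalize_features(features):
    # features: (n_examples, n_features) array of raw ILL feature values.
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0, keepdims=True)
    std = X.std(axis=0, keepdims=True)
    # Guard against constant columns to avoid divide-by-zero.
    return (X - mean) / np.where(std == 0, 1.0, std)
```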