Orb v3 #240

Open
wants to merge 12 commits into
base: main

DeNeutoy
Contributor

@DeNeutoy DeNeutoy commented Apr 4, 2025

Description

Orb-v3 - coming in the next few days! I've also included the script I used to do the geo_opt analysis (mostly based on the one in matbench, but isolated to a single script), because I know that was slightly up in the air based on #230. To check, I ran a couple of other models using their predictions in figshare (showing the utility of getting people to do this!):

| stol | structure_rmsd vs_dft | n_sym_ops_mae | symmetry decrease | symmetry match | symmetry increase | n_structures | model |
|---|---|---|---|---|---|---|---|
| 1.e-2 | 0.07313 | 1.809 | 0.05561 | 0.8144 | 0.1231 | 2.570e5 | MACE MPA 0 |
| 1.e-5 | 0.06078 | 4.163 | 0.4739 | 0.3636 | 0.1188 | 2.570e5 | esen-30m-oam |
| 1.e-2 | 0.06078 | 2.102 | 0.1616 | 0.7103 | 0.1085 | 2.570e5 | esen-30m-oam |
| 1.e-5 | 0.06391 | 2.033 | 0.04393 | 0.7057 | 0.2453 | 2.570e5 | sevennet-mf-ompa |
| 1.e-2 | 0.06391 | 1.705 | 0.04667 | 0.8181 | 0.1280 | 2.570e5 | sevennet-mf-ompa |
| 1.e-5 | 0.07481 | 10.17 | 0.8694 | 0.1304 | 0.0001440 | 2.570e5 | Orb-v3-con-inf-mpa |
| 1.e-2 | 0.07481 | 2.609 | 0.1592 | 0.7281 | 0.1012 | 2.570e5 | Orb-v3-con-inf-mpa |
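
For readers unfamiliar with the symmetry columns, here is a minimal sketch of how fractions like these could be computed from per-structure symmetry-operation counts. It is not the script included in this PR. The function name, the input format, and the definitions used (decrease, match, and increase judged by the sign of the difference in the number of symmetry operations between the ML-relaxed and DFT-relaxed structure) are assumptions for illustration; the actual analysis may define a match differently, for example by space group number.

```python
from dataclasses import dataclass


@dataclass
class SymmetrySummary:
    """Aggregate symmetry statistics over a set of relaxed structures."""

    n_sym_ops_mae: float
    symmetry_decrease: float
    symmetry_match: float
    symmetry_increase: float
    n_structures: int


def summarize_symmetry(n_ops_dft: list[int], n_ops_ml: list[int]) -> SymmetrySummary:
    """Compare symmetry-operation counts of ML-relaxed vs DFT-relaxed structures.

    Hypothetical helper: both lists hold one count per structure, in the same
    order, as detected at a single symmetry tolerance.
    """
    if len(n_ops_dft) != len(n_ops_ml):
        raise ValueError("both lists must have one entry per structure")
    if not n_ops_dft:
        raise ValueError("need at least one structure")

    # Positive difference: the ML-relaxed structure has more symmetry operations.
    diffs = [ml - dft for dft, ml in zip(n_ops_dft, n_ops_ml)]
    n_structures = len(diffs)

    return SymmetrySummary(
        n_sym_ops_mae=sum(abs(diff) for diff in diffs) / n_structures,
        symmetry_decrease=sum(diff < 0 for diff in diffs) / n_structures,
        symmetry_match=sum(diff == 0 for diff in diffs) / n_structures,
        symmetry_increase=sum(diff > 0 for diff in diffs) / n_structures,
        n_structures=n_structures,
    )


# Toy counts, not taken from any model in the table.
summary = summarize_symmetry(n_ops_dft=[48, 8, 4, 2], n_ops_ml=[48, 4, 4, 8])
print(summary)
```

Under these definitions the three fractions sum to one, which is roughly what each row of the table shows.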

We are still in the process of adding the exact models to orb-models but we'll make a release over the weekend and then I will update this PR.

Checklist

Please check the following items before submitting your PR:

  • I created a new folder and YAML metadata file models/<arch_name>/<model_variant>.yml for my submission. arch_name is the name of the architecture and model_variant.yml includes things like author details, training set names and important hyperparameters.
  • I added my new model as a new attribute on the Model.<arch_name> enum in enums.py.
  • I uploaded the energy/force/stress model prediction file for the WBM test set to Figshare or another cloud storage service (<yyyy-mm-dd>-<model_variant>-preds.csv.gz).
  • I uploaded the model-relaxed structures file to Figshare or another cloud storage service in JSON lines format (<yyyy-mm-dd>-wbm-IS2RE-FIRE.jsonl.gz). JSON Lines allows fast loading of small numbers of structures with pandas.read_json(lines=True, nrows=100) for inspection.
  • I uploaded the phonon predictions to Figshare or another cloud storage service (<yyyy-mm-dd>-kappa-103-FIRE-<values-of-dist|fmax|symprec>.gz).
  • I included the urls to the Figshare files in the YAML metadata file (models/<arch_name>/<model_variant>.yml). If not using Figshare I have included the urls to the cloud storage service in the description of the PR.
  • I included the test script (test_<arch_name>_<task>.py for task in discovery, kappa, diatomics) that generated the prediction files.
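
The checklist notes that JSON Lines allows a few structures to be loaded quickly for inspection, using pandas.read_json(lines=True, nrows=100). As a rough illustration of why that works, the sketch below does the same kind of partial read with only the standard library: each line is an independent JSON record, so reading can stop after the first few. The helper name, the demo file name, and the `material_id` key are hypothetical and not taken from the actual relaxed-structures files.

```python
import gzip
import json
import os
import tempfile
from itertools import islice


def peek_jsonl_gz(path: str, n_rows: int = 100) -> list[dict]:
    """Return the first n_rows records of a gzipped JSON Lines file."""
    with gzip.open(path, mode="rt", encoding="utf-8") as file:
        # Skip blank lines, then stop after n_rows records so the rest of
        # the file is never decompressed or parsed.
        non_blank = (line for line in file if line.strip())
        return [json.loads(line) for line in islice(non_blank, n_rows)]


# Demo on a tiny throwaway file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo-structures.jsonl.gz")
    with gzip.open(demo_path, mode="wt", encoding="utf-8") as file:
        for idx in range(5):
            file.write(json.dumps({"material_id": f"demo-{idx}"}) + "\n")
    first_two = peek_jsonl_gz(demo_path, n_rows=2)

print(first_two)
```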

Additional Information (Optional)

  • I included a training script (train_<arch_name>.py) if I trained a model specifically for this benchmark.
  • I included a readme.md with additional details about my model.

@DeNeutoy
Contributor Author

DeNeutoy commented Apr 6, 2025

@janosh / @CompRhys this is ready now - I am still seeing some error from the tests regarding the csv predictions - I can include them here, but I think it's not being fetched from the aws bucket - is it preferable to have them stored remotely?

I'll send a follow up PR when the paper is out to update a couple of the yaml sections, shouldn't be too long (but the models are available as of now in orb-models on main).

@CompRhys
Collaborator

CompRhys commented Apr 6, 2025

> @janosh / @CompRhys this is ready now - I am still seeing some error from the tests regarding the csv predictions - I can include them here, but I think it's not being fetched from the aws bucket - is it preferable to have them stored remotely?

@janosh has typically been mirroring things to Figshare so they can't be deleted and break things, so I think that will fix the test. I will take a quick look, but I'd guess there is nothing to fix on your side. I'll also handle the linting issue, which is that the license in the v3 yml needs a dash rather than a space to fit the schema.

Update: the file path needs to put the downloaded file in the models folder. It's the path where the analysis code saves the CSV after downloading it from Figshare, and the test checks that it gets put inside the models directory.
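
To make the path requirement concrete, here is a minimal sketch of the kind of check described: whether a file path resolves to somewhere inside a given directory. This is not the repository's actual test. The function name and the example paths are hypothetical.

```python
from pathlib import Path


def is_inside_dir(file_path: str, parent_dir: str) -> bool:
    """Return True if file_path resolves to a location inside parent_dir."""
    # resolve() makes both paths absolute and collapses ".." segments, so a
    # path that only starts with the directory name but escapes it is rejected.
    file_resolved = Path(file_path).resolve()
    parent_resolved = Path(parent_dir).resolve()
    return parent_resolved in file_resolved.parents


inside = is_inside_dir("models/some_arch/preds.csv.gz", "models")
outside = is_inside_dir("data/preds.csv.gz", "models")
escaped = is_inside_dir("models/../data/preds.csv.gz", "models")
print(inside, outside, escaped)
```

Resolving before comparing is a deliberate choice here: a plain string prefix check would accept the third example even though it points outside the directory.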

> I'll send a follow up PR when the paper is out to update a couple of the yaml sections, shouldn't be too long (but the models are available as of now in orb-models on main).

Sounds good.

2 participants