Orb v3 #240

DeNeutoy · 2025-04-04T18:53:09Z

Description

Orb-v3 - coming in the next few days! I've also included the script I used to do the geo_opt analysis (mostly based on the one in matbench, but isolated to a single script), because I know that was slightly up in the air based on #230. To check, I ran a couple of other models using their predictions in figshare (showing the utility of getting people to do this!):

| stol | structure_rmsd vs_dft | n_sym_ops_mae | symmetry decrease | symmetry match | symmetry increase | n_structures | model |
|---|---|---|---|---|---|---|---|
| 1.e-2 | 0.07313 | 1.809 | 0.05561 | 0.8144 | 0.1231 | 2.570e5 | MACE MPA 0 |
| 1.e-5 | 0.06078 | 4.163 | 0.4739 | 0.3636 | 0.1188 | 2.570e5 | esen-30m-oam |
| 1.e-2 | 0.06078 | 2.102 | 0.1616 | 0.7103 | 0.1085 | 2.570e5 | esen-30m-oam |
| 1.e-5 | 0.06391 | 2.033 | 0.04393 | 0.7057 | 0.2453 | 2.570e5 | sevennet-mf-ompa |
| 1.e-2 | 0.06391 | 1.705 | 0.04667 | 0.8181 | 0.1280 | 2.570e5 | sevennet-mf-ompa |
| 1.e-5 | 0.07481 | 10.17 | 0.8694 | 0.1304 | 0.0001440 | 2.570e5 | Orb-v3-con-inf-mpa |
| 1.e-2 | 0.07481 | 2.609 | 0.1592 | 0.7281 | 0.1012 | 2.570e5 | Orb-v3-con-inf-mpa |
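
As a rough illustration of how the symmetry columns in the table above could be derived, here is a minimal Python sketch. It assumes the three fractions come from comparing the number of symmetry operations found for the DFT-relaxed and the model-relaxed version of each structure. That definition, the function name `summarize_symmetry`, and the `SymmetrySummary` container are assumptions made for illustration, not the script included in this PR. The symmetry detection itself, and the tolerance it is run with, is left out.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SymmetrySummary:
    """Aggregate symmetry statistics over a set of relaxed structures."""

    n_sym_ops_mae: float
    symmetry_decrease: float
    symmetry_match: float
    symmetry_increase: float
    n_structures: int


def summarize_symmetry(
    n_ops_dft: Sequence[int], n_ops_ml: Sequence[int]
) -> SymmetrySummary:
    """Compare per-structure symmetry-operation counts (ML vs. DFT).

    Assumption: "decrease", "match" and "increase" are decided by the sign
    of (ML count - DFT count) for each structure.
    """
    if len(n_ops_dft) != len(n_ops_ml):
        raise ValueError("both inputs must describe the same structures")
    if not n_ops_dft:
        raise ValueError("need at least one structure")

    diffs = [ml - dft for dft, ml in zip(n_ops_dft, n_ops_ml)]
    n = len(diffs)
    return SymmetrySummary(
        n_sym_ops_mae=sum(abs(d) for d in diffs) / n,
        symmetry_decrease=sum(d < 0 for d in diffs) / n,
        symmetry_match=sum(d == 0 for d in diffs) / n,
        symmetry_increase=sum(d > 0 for d in diffs) / n,
        n_structures=n,
    )
```

Under this definition the three fractions sum to one over the structures that were analysed, and the same relaxed structures can give different rows when the symmetry tolerance changes, as the paired rows in the table show.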

We are still in the process of adding the exact models to orb-models but we'll make a release over the weekend and then I will update this PR.

Checklist

Please check the following items before submitting your PR:

I created a new folder and YAML metadata file models/<arch_name>/<model_variant>.yml for my submission. arch_name is the name of the architecture and model_variant.yml includes things like author details, training set names and important hyperparameters.
I added my new model as a new attribute on the Model.<arch_name> enum in enums.py.
I uploaded the energy/force/stress model prediction file for the WBM test set to Figshare or another cloud storage service (<yyyy-mm-dd>-<model_variant>-preds.csv.gz).
I uploaded the model-relaxed structures file to Figshare or another cloud storage service in JSON lines format (<yyyy-mm-dd>-wbm-IS2RE-FIRE.jsonl.gz). JSON Lines allows fast loading of small numbers of structures with pandas.read_json(lines=True, nrows=100) for inspection.
I uploaded the phonon predictions to Figshare or another cloud storage service (<yyyy-mm-dd>-kappa-103-FIRE-<values-of-dist|fmax|symprec>.gz).
I included the URLs to the Figshare files in the YAML metadata file (models/<arch_name>/<model_variant>.yml). If not using Figshare, I have included the URLs to the cloud storage service in the description of the PR.
I included the test script (test_<arch_name>_<task>.py for task in discovery, kappa, diatomics) that generated the prediction files.
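
The checklist notes that JSON Lines allows fast loading of a small number of structures, for example with `pandas.read_json(lines=True, nrows=100)`. The standard-library sketch below shows why that works: each line is an independent JSON record, so a reader can stop after the first few lines without parsing the rest of the file. The helper names `write_records` and `read_first_records`, the file name, and the `material_id` field are hypothetical and only serve the example.

```python
import gzip
import itertools
import json
import os
import tempfile
from typing import Any, Dict, Iterable, List


def write_records(path: str, records: Iterable[Dict[str, Any]]) -> None:
    """Write one JSON object per line into a gzip-compressed file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")


def read_first_records(path: str, nrows: int = 100) -> List[Dict[str, Any]]:
    """Parse only the first `nrows` lines of a gzipped JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        # islice stops reading once nrows lines have been consumed
        return [json.loads(line) for line in itertools.islice(fh, nrows)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        demo_path = os.path.join(tmp_dir, "structures.jsonl.gz")  # hypothetical name
        write_records(demo_path, ({"material_id": f"wbm-{i}"} for i in range(1000)))
        head = read_first_records(demo_path, nrows=3)
        print(len(head), head[0])
```

A single large JSON array would need to be parsed in full before any record is available, which is the practical reason for preferring the line-delimited layout here.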

Additional Information (Optional)

I included a training script (train_<arch_name>.py) if I trained a model specifically for this benchmark.
I included a readme.md with additional details about my model.


DeNeutoy · 2025-04-06T23:03:11Z

@janosh / @CompRhys this is ready now - I am still seeing some error from the tests regarding the csv predictions - I can include them here, but I think it's not being fetched from the aws bucket - is it preferable to have them stored remotely?

I'll send a follow up PR when the paper is out to update a couple of the yaml sections, shouldn't be too long (but the models are available as of now in orb-models on main).

CompRhys · 2025-04-06T23:30:30Z

> @janosh / @CompRhys this is ready now - I am still seeing some error from the tests regarding the csv predictions - I can include them here, but I think it's not being fetched from the aws bucket - is it preferable to have them stored remotely?

~~@janosh has typically been mirroring things to figshare so they can't be deleted and break things and so I think that will fix the test.~~ I will take a quick look, but I'd guess there is nothing to fix on your side. I'll also get the linting issue, which is that the license in the v3 yml needs a dash rather than a space to fit the schema. Update: the file path needs to put the downloaded file in the models folder. It's the path where the analysis code saves the csv after downloading it from figshare, and the test checks that the file ends up inside the models directory.

> I'll send a follow up PR when the paper is out to update a couple of the yaml sections, shouldn't be too long (but the models are available as of now in orb-models on main).

Sounds good.
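
For readers unfamiliar with the kind of check described in the update above (that a downloaded prediction file is saved inside the models directory), a minimal sketch could look like the following. The function name `is_inside_dir` and the example paths are hypothetical; this is not the repository's actual test, only an illustration of the idea using `pathlib`.

```python
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def is_inside_dir(file_path: PathLike, root_dir: PathLike) -> bool:
    """Return True if file_path resolves to a location under root_dir.

    Both paths are resolved first so that relative segments such as ".."
    cannot make a file outside root_dir look like it is inside.
    """
    resolved_file = Path(file_path).resolve()
    resolved_root = Path(root_dir).resolve()
    try:
        # relative_to raises ValueError when resolved_file is not under resolved_root
        resolved_file.relative_to(resolved_root)
    except ValueError:
        return False
    return True
```

With this helper, a path such as `models/orb/preds.csv.gz` passes against `models`, while `models/../cache/preds.csv.gz` does not, because it resolves to a location outside the directory.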

DeNeutoy and others added 10 commits April 4, 2025 10:23

add model yaml for orb-v3, include analysis script

14bb082

update

d1d1524

update num params

1648ed9

[pre-commit.ci] auto fixes from pre-commit.com hooks

374e631

for more information, see https://pre-commit.ci

add enums, update model weight path

c3ea415

use Omat24 not OMAT

af8f9f1

use http urls

98f71e9

try to fix some lint, conform to yaml schema

c928338

lint

863a61b

[pre-commit.ci] auto fixes from pre-commit.com hooks

43854fe

for more information, see https://pre-commit.ci

Merge remote-tracking branch 'origin/main' into orb-v3

7e202d6

fix: license for ci, download the orb models to the target folder

435a2fd
