#2047 - Add enum/permissible value validation to API endpoint #1590

Draft: wants to merge 12 commits into base: main

cristina-stonepedraza
Contributor

Extend the existing MIxS environmental triad slot review API endpoint with enum validation, ensuring that the env package, local scale, broad scale, and medium fields only include values that are part of each enum's permissible value list.
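
As a rough sketch of the kind of check described above (not the code in this PR), the validation can be thought of as a lookup of each submitted value against its enum's permissible values. The function name `find_invalid_values`, the enum name, and the sample values are hypothetical and only illustrate the shape of the data:

```python
from typing import Dict, Iterable, List, Mapping, Set


def find_invalid_values(
    values_by_enum: Mapping[str, Iterable[str]],
    permissible_by_enum: Mapping[str, Set[str]],
) -> Dict[str, List[str]]:
    """Return, per enum name, the submitted values that are not permissible.

    Enums with no invalid values are omitted from the result. An enum name
    that is missing from `permissible_by_enum` is treated as having no
    permissible values, so everything submitted for it is reported.
    """
    invalid: Dict[str, List[str]] = {}
    for enum_name, values in values_by_enum.items():
        allowed = permissible_by_enum.get(enum_name, set())
        bad = [value for value in values if value not in allowed]
        if bad:
            invalid[enum_name] = bad
    return invalid


# Illustrative data only; the enum name and values are assumptions,
# not taken from the submission schema.
permissible = {
    "EnvBroadScaleSoilEnum": {"terrestrial biome [ENVO:00000446]"},
}
submitted = {
    "EnvBroadScaleSoilEnum": [
        "terrestrial biome [ENVO:00000446]",
        "not a real term",
    ],
}
report = find_invalid_values(submitted, permissible)
```

In the endpoint, the permissible values would come from the schema's enum definitions rather than a hand-written dictionary.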

cristina-stonepedraza self-assigned this Apr 3, 2025
cristina-stonepedraza
Contributor Author

Here is an example of the TSV file results so far:
image

def fetch_nmdc_submission_schema_view():

    # Use SchemaView to open nmdc_submission_schema.yaml and get the enums
    view = SchemaView("https://raw.githubusercontent.com/microbiomedata/submission-schema/refs/heads/main/src/nmdc_submission_schema/schema/nmdc_submission_schema.yaml")
Collaborator

nmdc-submission-schema is already a dependency of this project, so there's no need to fetch it from GitHub here. Besides the unnecessary overhead of making a request over the internet, the bigger concern is that fetching it from the main branch will, in general, get a version that is ahead of the one used here in nmdc-server.

The good news is that there is already code which generates a SchemaView instance from the schema file bundled in the nmdc-submission-schema package:

submission_schema_files = importlib.resources.files("nmdc_submission_schema")
# Load each class in the submission schema, ensure that each slot of the class
# is fully materialized into attributes, and then drop the slot usage definitions
# to save some bytes.
schema_path = submission_schema_files / "schema/nmdc_submission_schema.yaml"
sv = SchemaView(str(schema_path))

I would recommend extracting that code into a utility function which returns a SchemaView instance (the result can/should even be cached). The new function can be called from the generate_submission_schema_files function and this function (although I might nitpick the name of this fetch_nmdc_submission_schema_view function since it doesn't return a SchemaView).
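
A minimal sketch of what such a utility could look like, under some assumptions: the name `get_submission_schema_view` is hypothetical, and `SchemaView` is assumed to be imported from `linkml_runtime.utils.schemaview`. The import is done lazily inside the function here only so the sketch has no import-time dependency; a module-level import would work just as well in the real code.

```python
import functools
import importlib.resources


@functools.cache
def get_submission_schema_view():
    """Return a SchemaView for the schema bundled with nmdc-submission-schema.

    The result is cached, so the YAML file is parsed at most once per
    process. Hypothetical helper: the name and location are assumptions.
    """
    # Assumed import path for SchemaView; imported lazily (see note above).
    from linkml_runtime.utils.schemaview import SchemaView

    # Resolve the schema file from the installed package instead of GitHub,
    # so the version always matches the pinned dependency.
    submission_schema_files = importlib.resources.files("nmdc_submission_schema")
    schema_path = submission_schema_files / "schema/nmdc_submission_schema.yaml"
    return SchemaView(str(schema_path))
```

Because every caller receives the same cached instance, callers should treat the returned view as read-only, or copy it first if they need to modify it.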

2 participants