add alternative way of ovr computation #275

Open
wants to merge 1 commit into
base: master

Contributor

@nicidob nicidob commented Jul 13, 2020

This is an alternative method of computing OVR. Instead of using massive-scale player plus-minus, it uses team-level results. For each game, it takes a minute-averaged rating for each team and tries to predict the team margin of victory from it. Because the regression uses home_team - away_team, the fitted intercept becomes home-court advantage.
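
A minimal sketch of the idea described above, not the code in this PR: it assumes NumPy and ordinary least squares, and the function names `team_rating` and `fit_margin_model` are hypothetical. Each team's ratings are averaged with minutes played as weights, and the margin of victory is regressed on the home-minus-away difference, so the intercept can be read as home-court advantage.

```python
import numpy as np


def team_rating(player_ratings, minutes):
    """Minute-weighted average of each rating across one team's players.

    player_ratings: shape (n_players, n_ratings)
    minutes:        shape (n_players,), minutes played in the game
    """
    return np.average(np.asarray(player_ratings, dtype=float),
                      axis=0, weights=np.asarray(minutes, dtype=float))


def fit_margin_model(home_ratings, away_ratings, margins):
    """Fit margin ~ intercept + (home - away) @ coefs by least squares.

    home_ratings, away_ratings: shape (n_games, n_ratings)
    margins: shape (n_games,), home score minus away score
    Returns (intercept, coefs); the intercept estimates home-court advantage.
    """
    diff = np.asarray(home_ratings, dtype=float) - np.asarray(away_ratings, dtype=float)
    # Prepend a column of ones so the first fitted value is the intercept.
    design = np.column_stack([np.ones(len(diff)), diff])
    solution, *_ = np.linalg.lstsq(design, np.asarray(margins, dtype=float), rcond=None)
    return float(solution[0]), solution[1:]
```

The fitted `coefs` play the role of the per-rating weights in an OVR formula; any rescaling or clamping of those weights would be a separate step that this sketch does not attempt.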

Benefits

  • With 82 games per season, this scales faster than player plus-minus (usually under 10 players per season).
  • In a GM game, you're trying to build a better roster to win games, so using team aggregated OVR for "this is a better team" makes sense. By construction, a team built optimizing this OVR will beat a team built optimizing player-level +/-.
  • Almost all coefficients are positive without any special hacks.
  • The lone exception is 2pt, which is usually about 50% of the coefficient for 3pt. That means you could read the pair as A * 3pt + B * (3pt - 2pt), which is a fine formula: it uses the 3pt rating and how much better your 3pt is than your 2pt.
  • The prediction is much more stable with better p-values for the regressed coefficients.
  • The result here is simpler and easier for others to reimplement for their own tools and calculators.
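
The rewrite of the 2pt and 3pt terms mentioned in the list above can be checked with a few lines of algebra. This is only an illustration: `reparameterize` is a hypothetical helper, it assumes the 2pt coefficient is the negative one, and the coefficient values in the usage lines are made up rather than taken from a fit.

```python
def reparameterize(coef_tp, coef_fg):
    """Rewrite coef_tp * tp + coef_fg * fg as A * tp + B * (tp - fg).

    Expanding the right-hand side gives (A + B) * tp - B * fg, so matching
    terms yields B = -coef_fg and A = coef_tp + coef_fg.
    """
    return coef_tp + coef_fg, -coef_fg


# Hypothetical coefficients: a negative 2pt weight about half the size of the 3pt weight.
A, B = reparameterize(0.12, -0.06)
tp, fg = 60.0, 45.0
original = 0.12 * tp + (-0.06) * fg
rewritten = A * tp + B * (tp - fg)
```

With a 2pt coefficient of roughly minus half the 3pt coefficient, both A and B come out positive, which is why the rewritten form is easier to read.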

@dumbmatter
Member

https://github.com/zengm-games/zengm/blob/new-ovr/analysis/player-ovr-basketball/process2.py - I tried to adapt your code to run on a JSON league file rather than the CSV exports, since it's easier to export large league files.

> The prediction is much more stable with better p-values for the regressed coefficients.

Is this true? I am getting a fair amount of variability even when running with 200 seasons of box scores. Here are a few runs:

(
    0.198 * (ratings.hgt - 47.8) +
    0.0710 * (ratings.stre - 47.1) +
    0.121 * (ratings.spd - 50.4) +
    0.0654 * (ratings.jmp - 48.5) +
    0.0353 * (ratings.endu - 37.5) +
    0.0386 * (ratings.ins - 40.1) +
    0.0318 * (ratings.dnk - 46.2) +
    0.0131 * (ratings.ft - 43.2) +
    0.0100 * (ratings.fg - 43.2) +
    0.128 * (ratings.tp - 43.3) +
    0.0715 * (ratings.oiq - 41.6) +
    0.100 * (ratings.diq - 42.3) +
    0.109 * (ratings.drb - 50.6) +
    0.0936 * (ratings.pss - 47.2) +
    0.0100 * (ratings.reb - 48.5)
) + 45.6

(
    0.212 * (ratings.hgt - 48.0) +
    0.0961 * (ratings.stre - 46.9) +
    0.134 * (ratings.spd - 50.1) +
    0.0661 * (ratings.jmp - 48.3) +
    0.0204 * (ratings.endu - 37.5) +
    0.0231 * (ratings.ins - 40.0) +
    0.0183 * (ratings.dnk - 45.7) +
    0.0372 * (ratings.ft - 42.8) +
    0.0100 * (ratings.fg - 42.7) +
    0.111 * (ratings.tp - 42.9) +
    0.0834 * (ratings.oiq - 41.4) +
    0.0941 * (ratings.diq - 42.1) +
    0.0876 * (ratings.drb - 50.5) +
    0.104 * (ratings.pss - 47.3) +
    0.0100 * (ratings.reb - 48.6)
) + 45.4

(
    0.208 * (ratings.hgt - 47.7) +
    0.0938 * (ratings.stre - 46.8) +
    0.154 * (ratings.spd - 50.3) +
    0.0400 * (ratings.jmp - 48.3) +
    0.019 * (ratings.endu - 37.4) +
    0.0238 * (ratings.ins - 39.9) +
    0.0287 * (ratings.dnk - 45.8) +
    0.021 * (ratings.ft - 42.9) +
    0.0100 * (ratings.fg - 42.9) +
    0.121 * (ratings.tp - 43.0) +
    0.092 * (ratings.oiq - 41.3) +
    0.0924 * (ratings.diq - 42.1) +
    0.0903 * (ratings.drb - 50.5) +
    0.0971 * (ratings.pss - 47.3) +
    0.0100 * (ratings.reb - 48.4)
) + 45.4

The end result is basically that 80% of players stay within +/-2 of their previous ovr, but some change by up to +/-10 or so. I think that's mostly because of the difference in the value of hgt: centers tend to get a boost.
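
To see how far the three fits above disagree for a given player, the formulas can be evaluated side by side. This is a rough sketch: the coefficients, centers, and intercepts are copied from the three runs, while `ovr`, `spread_across_runs`, and the example player are hypothetical. It compares the runs with each other, which is a different comparison from new ovr versus previous ovr.

```python
RATING_KEYS = ["hgt", "stre", "spd", "jmp", "endu", "ins", "dnk", "ft",
               "fg", "tp", "oiq", "diq", "drb", "pss", "reb"]

# One entry per run: (coefficient, center) pairs in RATING_KEYS order.
RUNS = [
    {"intercept": 45.6, "terms": [
        (0.198, 47.8), (0.0710, 47.1), (0.121, 50.4), (0.0654, 48.5),
        (0.0353, 37.5), (0.0386, 40.1), (0.0318, 46.2), (0.0131, 43.2),
        (0.0100, 43.2), (0.128, 43.3), (0.0715, 41.6), (0.100, 42.3),
        (0.109, 50.6), (0.0936, 47.2), (0.0100, 48.5)]},
    {"intercept": 45.4, "terms": [
        (0.212, 48.0), (0.0961, 46.9), (0.134, 50.1), (0.0661, 48.3),
        (0.0204, 37.5), (0.0231, 40.0), (0.0183, 45.7), (0.0372, 42.8),
        (0.0100, 42.7), (0.111, 42.9), (0.0834, 41.4), (0.0941, 42.1),
        (0.0876, 50.5), (0.104, 47.3), (0.0100, 48.6)]},
    {"intercept": 45.4, "terms": [
        (0.208, 47.7), (0.0938, 46.8), (0.154, 50.3), (0.0400, 48.3),
        (0.019, 37.4), (0.0238, 39.9), (0.0287, 45.8), (0.021, 42.9),
        (0.0100, 42.9), (0.121, 43.0), (0.092, 41.3), (0.0924, 42.1),
        (0.0903, 50.5), (0.0971, 47.3), (0.0100, 48.4)]},
]


def ovr(ratings, run):
    """Evaluate one fitted formula: sum of coef * (rating - center), plus intercept."""
    total = sum(coef * (ratings[key] - center)
                for key, (coef, center) in zip(RATING_KEYS, run["terms"]))
    return total + run["intercept"]


def spread_across_runs(ratings):
    """Largest minus smallest ovr that the three runs assign to one player."""
    values = [ovr(ratings, run) for run in RUNS]
    return max(values) - min(values)


# Hypothetical tall player: every rating at 50 except a high hgt.
example_center = {key: 50 for key in RATING_KEYS}
example_center["hgt"] = 85
print([round(ovr(example_center, run), 1) for run in RUNS])
print(round(spread_across_runs(example_center), 2))
```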

> The result here is simpler and easier for others to reimplement for their own tools and calculators.

Also skeptical about this... the code is longer and probably more confusing.
