ORIGINAL POST 3/6/2017:
There's an interesting offensive player stat out there called PORPAG, which is an acronym for "Points Over Replacement Per Adjusted Game." PORPAG was created by MSU fans KJ and Spartan Dan back in 2009, and the basic idea is to estimate how many more points per game a player creates than a hypothetical "replacement player" would. The basic formula for PORPAG is:
(OffRtg – 88) * %Poss * Min% *65
The 88 is the presumed O-rating of the replacement player, and the 65 at the end is the constant for possessions in a game. These days we'd probably want to use something more like 68 to account for the tempo frenzy unleashed by the 30-second shot clock. Arguably, we could also increase the O-rating of the replacement player to account for increased average efficiency since 2009, but I'm leaving it at 88 because the Chinese consider 8 to be a very lucky number and my mom was born on August 8th.
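As a rough illustration, the basic formula can be sketched in Python. The function name and the units are assumptions, not something the formula spells out: %Poss and Min% are taken as fractions (0.25 for 25%), and the result is divided by 100 because O-rating is points per 100 possessions.

```python
def porpag(ortg, poss_frac, min_frac, replacement_ortg=88.0, possessions=65.0):
    """Sketch of the basic PORPAG formula (hypothetical helper).

    ortg:       offensive rating, points produced per 100 possessions
    poss_frac:  share of team possessions used, as a fraction (assumed unit)
    min_frac:   share of team minutes played, as a fraction (assumed unit)
    """
    # Points per possession above the replacement player.
    margin_per_poss = (ortg - replacement_ortg) / 100.0
    # Scale by how many of the game's possessions the player uses on the floor.
    return margin_per_poss * poss_frac * min_frac * possessions


# Made-up player: 120 O-rating, 25% usage, 80% of minutes.
print(porpag(120, 0.25, 0.80))
```

Swapping in possessions=68 shows how much the tempo constant alone moves the number.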
Michigan fan CT in TC keeps the PORPAG torch lit, and tweeted out the final conference-only numbers for the Big Ten today:
Overall, this list tracks common sense reasonably well. Walton and Swanigan are pretty clearly the top two offensive players. And most of the other good offensive players in the conference make the cut.
But one thing PORPAG definitely misses is a fundamental basketball fact: it is harder to be efficient at high usage than it is to be efficient at low usage. Because PORPAG doesn't account for this, it shows a clear bias toward lower-usage, high-efficiency players. This is how you get a guy like Abdur-Rahman (16.6% usage) ranked over Ethan Happ (28.9%).
To try to fix this problem, I'm proposing a new stat called PORPAGATU! This stands for PORPAG At That Usage, with an exclamation point for emphasis. The fundamental problem with PORPAG is that there simply are not "replacement players" who can come in and even function at Ethan Happ's 29% usage rate. To account for this, I propose adjusting the hypothetical replacement player's O-Rating in the formula as follows:
Less than 10% usage: 93
Between 10% and 30%: 93 down to 83 on a sliding scale.
More than 30% usage: 83
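Here is one way the sliding scale might look in code. The post only says "sliding scale," so the straight-line interpolation between 10% and 30% usage is an assumption, and the function name is made up for illustration.

```python
def replacement_ortg(usage):
    """Replacement-level O-rating for a usage rate given in percent (e.g. 28.9).

    Assumes a linear slide from 93 at 10% usage down to 83 at 30% usage.
    """
    if usage <= 10:
        return 93.0
    if usage >= 30:
        return 83.0
    # 10 points of O-rating spread over 20 points of usage: 0.5 per point.
    return 93.0 - (usage - 10) * 0.5


for u in (5, 16.6, 20, 28.9, 35):
    print(u, replacement_ortg(u))
```

Under that linear assumption, a 20% usage player is still measured against the original 88, so the adjustment only moves the bar for players away from average usage.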
There's probably a better way to make this adjustment, and I'll take suggestions on that and think about it more. But with this fairly simple adjustment, here's what I get:
Some of the more unsatisfying results of PORPAG are noticeably diminished. For example, Nigel Hayes appears at 25th and Zak Showalter drops out from 21st, as it should be. Showalter's dominance of this metric earlier in the season was what first got me thinking about this kind of adjustment. Other high-usage guys like Jok, Happ, Ward and Trimble all move up significantly, better reflecting their offensive value (I think). While still not perfect, my subjective take is that every difference between PORPAG and PORPAGATU! is pro-PORPAGATU!
For the record, here's the Big Ten PORPAGATU! for the whole season (including non-conference):
Again, not perfect, but a pretty decent proxy for offensive value in my opinion.
Got a better idea for how to make the high usage adjustment? Let me know. If I change the formula, I'll update this post. I'll also put up a PORPAGATU! page on the T-Rank site when I get a chance.
I have made some further refinements to this would-be stat:
- Instead of using the sliding scale to adjust the value of a replacement player, I will just use the player's "usage-adjusted O-Rating" as the basis for the formula. That is calculated as: O-Rating + ((Usage - 20) * 1.25). In other words, add or subtract 1.25 points for every point above or below an average usage of 20.
- For these purposes, I will also adjust a player's O-Rating for the level of competition he has faced. This is done by comparing the average adjusted defensive efficiencies of his opponents versus the overall average defensive efficiency. So if the average efficiency is 103, and a player's opponents have an average adjusted defensive efficiency of 100, his O-Rating will be multiplied by 1.03 (103/100).
- Because usage is now accounted for in the player's O-Rating itself, it is not necessary (or proper) to include usage later in the formula and the total result must be divided by 20 to maintain the same scale.
- Because I can, instead of using a constant for the Tempo adjustment I'll just use whatever the average D1 tempo happens to be at a given moment.
((ORtg * (D1 Eff. / Opponents' Avg. Def. Eff.) + ((Usage - 20) * 1.25)) - 88) * Min% * D1 Avg. Tempo / 500
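A minimal sketch of that formula, with some assumptions flagged: the function and argument names are invented, Min% is treated as a fraction of minutes (0.8 for 80%), and the /500 is read as folding together the per-100-possessions conversion and a fixed 20% usage. The inputs in the example are made up.

```python
def porpagatu_v2(ortg, usage, min_frac, opp_def_eff, d1_eff, d1_tempo):
    """Sketch of the refined formula (hypothetical helper).

    usage is in percent (e.g. 24), min_frac is a fraction (assumed unit).
    """
    # Strength of schedule: league-average efficiency over the average
    # adjusted defensive efficiency of the player's opponents.
    sos = d1_eff / opp_def_eff
    # Add or subtract 1.25 points per point of usage above or below 20.
    adj_ortg = ortg * sos + (usage - 20) * 1.25
    return (adj_ortg - 88) * min_frac * d1_tempo / 500


# Made-up player: 110 O-rating, 24% usage, 80% of minutes,
# opponents' average defense 100 against a D1 average of 103, tempo 68.
print(porpagatu_v2(110, 24, 0.80, opp_def_eff=100, d1_eff=103, d1_tempo=68))
```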
I've added the results of this to the 2018 Team Pages on the T-Rank site, and I hope to roll them into the rest of the site soon.
Partially spurred by BTG's post on Grady Eifert, I've made some further refinements to the PORPAGATU! formula. Eifert was an example of an ultra-low-usage + very-high-efficiency loophole in the stat. You may recall that trying to account for that kind of guy was the original impetus for the "ATU" part of the acronym in the first place.
Overall, it isn't too big a problem—just a handful of guys over 10+ years really slip through, and for some of them (like maybe Jon Diebler) you could argue that the stat was actually on to something. Also, to some extent the "problem" is unsolvable: ultimately usage and offensive rating tell us different things, and a single stat that tries to combine them is fundamentally a questionable analytical exercise.
But let's face it, questionable analytics is totally my brand.
Here are the changes:
FIRST. I now apply the strength of schedule adjustment after the usage adjustment instead of before. For a guy like Grady Eifert—who was playing for a good team with a good strength of schedule—this means the SOS adjustment is operating on a smaller base, and therefore has less of an effect.
Here's a simplified example using Eifert's 2019 numbers:
Old: Adj O-Rtg = 144.7 * 1.077 + ((10.5 - 20) * 1.25) = 144.0
New: Adj O-Rtg = (144.7 + ((10.5-20) * 1.25)) * 1.077 = 143.0
Small beans, but a little better.
SECOND. I subtract 1.5 per point under 20 usage instead of 1.25. So:
Old: Adj O-Rtg = 144.7 * 1.077 + ((10.5 - 20) * 1.25) = 144.0
New: Adj O-Rtg = (144.7 + ((10.5-20) * 1.5)) * 1.077 = 140.5
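The first two changes can be checked with a few lines of Python using the Eifert inputs quoted above (144.7 O-rating, 10.5 usage, 1.077 SOS multiplier). The helper names are made up, and since these are rounded inputs the results should be expected to land within about a tenth of a point of the quoted figures rather than match them exactly.

```python
ORTG, USAGE, SOS = 144.7, 10.5, 1.077  # Eifert's 2019 numbers from the example


def adj_ortg_old(ortg, usage, sos, per_point=1.25):
    # Old order: SOS multiplier hits the raw O-rating, usage is added after.
    return ortg * sos + (usage - 20) * per_point


def adj_ortg_new(ortg, usage, sos, per_point=1.25):
    # New order: usage adjustment first, then SOS on the smaller base.
    return (ortg + (usage - 20) * per_point) * sos


print(round(adj_ortg_old(ORTG, USAGE, SOS), 2))                 # old order
print(round(adj_ortg_new(ORTG, USAGE, SOS), 2))                 # FIRST change
print(round(adj_ortg_new(ORTG, USAGE, SOS, per_point=1.5), 2))  # FIRST + SECOND
```

The post applies the steeper 1.5 only to players under 20 usage, so in this sketch per_point=1.5 would only be passed for those players.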
THIRD. Now let's get nuts. One fact, I think, about the relationship between usage and efficiency is that efficiency becomes more "stable" with more usage. For example, when trying to project performance for a guy with a 120 O-Rating, you can be more confident he's going to sustain that if his usage is closer to 30 than it is to 10. This makes sense because essentially the guy with a higher usage has just done more offensive stuff. And it's particularly true for the low-usage high-efficiency guys because what's often going on there is that they've shot an unsustainable percentage from three over not very many attempts.
One way to visualize this fact is to look at the standard deviation of adjusted efficiencies at various usage levels, and the resulting trendline:
The x-axis there is usage, and the y-axis is the standard deviation for adjusted offensive rating around that usage. Here's how I use this information:
- Take the adjusted O-Rating and calculate the number of standard deviations above mean (i.e., a Z-Score), based on the trendline standard deviation for a given usage. The formula for that is:
Adj. Z-Score = (Adj. O-Rating - Avg. Eff.) / (Usage * -.144 + 13.023)
- Multiply this adjusted Z-Score by 10.143 (the imputed standard deviation at 20 usage) and add to the average efficiency. For Grady Eifert we get:
Adj. Z-Score = (144.7 - 103.1) / (10.5 * -.144 + 13.023) = 3.62
Adj. O-Rating = 103.1 + (3.62 * 10.143) = 139.8
That 139.8 would then be the initial input for the formula above, so:
Final Adj. O-Rating = (139.8 + ((10.5-20) * 1.5)) * 1.077 = 135.2
Basically, for low usage guys their O-Ratings are regressed to the mean and for high usage guys their O-Ratings are stretched out from the mean.
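To see the regress-and-stretch effect in action, here is a small sketch of the Z-score step using the trendline coefficients above. The function names are invented, and the 120 O-rating, 30 usage player is a made-up comparison case.

```python
def usage_sd(usage):
    # Trendline standard deviation of adjusted O-rating at a given usage (percent).
    return usage * -0.144 + 13.023


def stretch_ortg(ortg, usage, avg_eff):
    """Re-express an O-rating as if it had been posted at 20 usage."""
    z = (ortg - avg_eff) / usage_sd(usage)
    # usage_sd(20) works out to the 10.143 used in the text.
    return avg_eff + z * usage_sd(20)


AVG_EFF = 103.1

# Low usage: Eifert's 144.7 at 10.5 usage gets pulled toward the mean.
print(round(stretch_ortg(144.7, 10.5, AVG_EFF), 1))
# High usage (hypothetical player): 120 at 30 usage gets pushed away from it.
print(round(stretch_ortg(120.0, 30.0, AVG_EFF), 1))
# At exactly 20 usage nothing changes.
print(round(stretch_ortg(120.0, 20.0, AVG_EFF), 1))
```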
Is this mathematically, statistically, or analytically sound? Almost certainly not! I have no idea! But it makes the results more pleasing to me, therefore it is done.
The final product: Grady Eifert's PRPG! falls to 4.3 from around 5.0, and he drops from about 3rd to 8th in the Big Ten last year. Take that, Grady.
While I'm on the topic, there are a couple other minor tweaks I made a while back, both to make the stat more comparable across seasons: I use 69.4 for the tempo variable (rather than calculating it for the season in question) and I normalize the average efficiency to 104.9 (can't remember why I chose that number, but it's ultimately arbitrary).
o_adj = avg_eff / opp_de
uFactor = Usage * -.144 + 13.023
altZscore = (ORtg - avg_eff) / uFactor
xORtg = avg_eff + altZscore * 10.143
if Usage > 20: adjoe = (xORtg + ((Usage - 20) * 1.25)) * o_adj
otherwise: adjoe = (xORtg + ((Usage - 20) * 1.5)) * o_adj
porpag = (adjoe + (104.9 - avg_eff) - (88)) * actual_Min_per * 69.4 / 500
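For anyone who wants to play with it, the pseudocode above translates fairly directly into Python. This is a sketch under a few assumptions: the function name is invented, actual_Min_per is treated as a fraction of minutes played, and the 0.63 minutes share in the example is a hypothetical value, not Eifert's actual number. The opponent defensive efficiency is backed out of the 1.077 multiplier used earlier.

```python
def prpg(ortg, usage, min_frac, avg_eff, opp_de):
    """Sketch of the current PORPAGATU! calculation (hypothetical helper).

    usage is in percent; min_frac is a fraction of minutes (assumed unit).
    """
    o_adj = avg_eff / opp_de                  # strength-of-schedule multiplier
    u_factor = usage * -0.144 + 13.023        # trendline SD at this usage
    alt_z = (ortg - avg_eff) / u_factor       # Z-score at this usage
    x_ortg = avg_eff + alt_z * 10.143         # re-expressed at 20 usage
    # 1.25 per usage point above 20, a steeper 1.5 per point at or below it.
    per_point = 1.25 if usage > 20 else 1.5
    adjoe = (x_ortg + (usage - 20) * per_point) * o_adj
    # Normalize average efficiency to 104.9 and use the fixed 69.4 tempo.
    return (adjoe + (104.9 - avg_eff) - 88) * min_frac * 69.4 / 500


avg_eff = 103.1
opp_de = avg_eff / 1.077  # implies the 1.077 SOS multiplier from the example
print(round(prpg(144.7, 10.5, 0.63, avg_eff, opp_de), 2))
```

With that made-up minutes share the result comes out close to the 4.3 mentioned above, which at least suggests the pieces fit together the way the text describes.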