I want to start off by saying that I don't yet have a solution to this problem and may never have one. I might be beating a dead horse in this post, too. We know the defensive stat UZR is flawed. Some of the points that follow are well founded; some might be a bit unreasonable. I found these comments on saber sites like Fangraphs. They illustrate the problems with UZR well.
"I think the more logical (and maybe more common) suggestion is to use multi-year and/or regressed fielding runs in the defense portion of things. I know Dave will counter with the argument that then we’re describing true talent rather than what actually *happened*, but I’ve never cared for that argument. And if you understand how UZR works, it’s not measuring what actually happened anyway (many plays are entirely thrown out of the data)."
This comment raises two problems.
- One-year UZR values are sketchy. A single season is not a large enough sample. The creator of UZR has stated that we need multiple years to draw conclusions from the data. My current thought on this argument: could we cut these UZR values in half to regress them closer to the mean? We would only do this for short samples.
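To make the halving idea concrete, here is a minimal sketch in Python. The function name, the innings cutoff for what counts as a "short sample", and the example numbers are all my own placeholders, not anything UZR or Fangraphs actually uses. It assumes the mean is 0 (league average).

```python
def regress_uzr(uzr, innings, short_sample_innings=1500.0, shrink=0.5, mean=0.0):
    """Pull a UZR value toward the mean, but only for short samples.

    short_sample_innings is a hypothetical cutoff chosen for illustration;
    shrink=0.5 is the "cut it in half" idea from the text.
    """
    if innings >= short_sample_innings:
        # Large enough sample: leave the value alone.
        return uzr
    # Short sample: keep only `shrink` of the distance from the mean.
    return mean + shrink * (uzr - mean)


# Made-up numbers, not real player data.
print(regress_uzr(12.0, 900))   # short sample, prints 6.0
print(regress_uzr(12.0, 2600))  # multi-year sample, prints 12.0
```

A flat one-half is crude. A fancier version could shrink less as the innings pile up, but that goes beyond what I'm suggesting here.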
The second problem stated in the comment is that plays are left out of the data. This is true. Any play that involves a defensive shift is left out. As we know, defensive shifts have been increasing over the last few years, with more and more being put on. This means more and more plays are being left out of UZR. This is a problem for sure.
Now I want to slightly alter the course of this topic to defense and WAR, which became a sticky subject on Fangraphs in September. Jeff Passan, a baseball writer, stated that he thought the defensive part of WAR was being weighted too heavily. He cited Alex Gordon as an example of a player who, despite okay-to-good batting numbers, was listed as one of the top players in the game by WAR. There was a post about it on Fangraphs and everything. Here is how I see Passan's argument.
- How WAR weights defense is not the problem in my eyes. The problem is what's being fed into WAR from the defensive side. It's not the output, it's the input, which of course is UZR. The creator of UZR, as I mentioned at the beginning, said you should use multi-year samples. WAR is fed one-year UZR data, which means very few of the takeaways from it are accurate. I understand why they don't use multi-year samples, but I think these numbers need to be regressed a bit closer to the mean (0). The mean is average, and since we're not making very good estimates using UZR the way it is now, why not move it closer to the mean? I'm not advocating changing WAR; I'm pushing to change UZR, which leads me to my second point.
- Let's say we were to do what I mentioned above and change UZR. I still don't know if I want it going into WAR. I know it will continue to, and it should. If it's the best we have, then we have to roll with it, but when we find a better option we need to put that into WAR instead. My point here is simply that UZR is messing up WAR; WAR isn't messing up UZR.
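To show what "it's the input, not the output" means in numbers, here is a rough Python sketch. It assumes the common rule of thumb of about 10 runs per win and treats the rest of WAR as untouched. The function name and the example values are made up for illustration; this is a simplification, not how Fangraphs actually computes WAR.

```python
RUNS_PER_WIN = 10.0  # rough rule of thumb, assumed for this sketch


def war_change_from_regression(uzr_runs, shrink=0.5, runs_per_win=RUNS_PER_WIN):
    """Change in WAR if only the UZR input is regressed toward 0.

    Everything else in WAR (batting, baserunning, position, etc.) is
    assumed to stay exactly the same.
    """
    regressed = shrink * uzr_runs
    return (regressed - uzr_runs) / runs_per_win


# Hypothetical fielder credited with +20 runs by one-year UZR.
print(war_change_from_regression(20.0))   # prints -1.0
# Hypothetical fielder charged with -10 runs.
print(war_change_from_regression(-10.0))  # prints 0.5
```

The WAR formula itself never changes in this sketch. Only the defensive number going in does, which is the whole point.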
Thanks for reading. I hope I painted an accurate picture of where we are with defense, UZR, and WAR. I might make a second post on this leading into a new idea. We'll see.