Sports Analysis

Saturday, January 24, 2015

Projection Systems

First off, I think the people who construct baseball projection systems have a basic understanding of projecting things. I just think the systems could be made more accurate by using one thing that I think everybody has: common sense. The numbers that I have seen from ZiPS on Fangraphs aren't very well thought out. It looks like a computer spat out the numbers, which is exactly the case. When making projections, people need to look at more than stats. Stats can help, and they have helped. But we can do better than these projection systems. Let's dig deeper.

I'm not sure I could make a better projection system, but I know a lot of people can make better ones than what we currently have. A related problem is that the people who read these projections take them as gospel. Projections are useful, but you should never buy too much into them. Last year, while researching for fantasy baseball, I came across a projection for Danny Salazar. Danny had come up late in the 2013 season and put up excellent numbers, with a FIP and strikeout rate to back them up. He was young, and every projection I read touted him as the next great pitcher. I picked him up in fantasy and he did awful; he got sent down to the minors before May even ended, and I dropped him. Here is one projection for him from last offseason for 2014.

W-L	ERA	IP	Strikeouts	WHIP
14-6	3.13	175	188	1.10

Compared to his real season...

W-L	ERA	IP	Strikeouts	WHIP
6-8	4.25	110	120	1.38

Projection systems can miss on players, and they do all the time. I already see some projections out for next season that I don't agree with. The reason is that they input numbers but don't input common sense. And for some reason people give these as much attention as end-of-season numbers.

Sorry for the shorter post but I just thought I would share that. Thanks for reading.

Saturday, December 13, 2014

Defense, UZR, and WAR

I want to start off by saying I don't yet have a solution to this problem and may never have one. I might be beating a dead horse in this post too. We know the defensive stat UZR is flawed. Some of the points that follow are well founded, and some might be a bit unreasonable. I found these comments on saber sites like Fangraphs. They make a good illustration of the problems with UZR.

"I think the more logical (and maybe more common) suggestion is to use multi-year and/or regressed fielding runs in the defense portion of things. I know Dave will counter with the argument that then we’re describing true talent rather than what actually *happened*, but I’ve never cared for that argument. And if you understand how UZR works, it’s not measuring what actually happened anyway (many plays are entirely thrown out of the data)."

This states two problems.

- One-year UZR values are sketchy. A single season is not a large enough sample size. The creator of UZR has stated that we need to use multiple years to draw conclusions from the data. My current thought on this argument is: could we regress these UZR values halfway to the mean? We would only do this for short samples.
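As a rough sketch of the "regress halfway to the mean" idea, assuming the mean is 0 runs and that half is the right amount (both are my working assumptions, not an established method), the arithmetic could look like this. The function name and the sample UZR value are made up for illustration.

```python
def regress_to_mean(value, mean=0.0, keep=0.5):
    """Pull a one-year value toward the mean.

    keep is the share of the distance from the mean that is kept.
    keep=0.5 is the "regress in half" idea; it is an assumption, not a tested weight.
    """
    return mean + keep * (value - mean)


# Hypothetical one-year UZR of +12 runs, regressed halfway toward average (0).
one_year_uzr = 12.0
regressed_uzr = regress_to_mean(one_year_uzr)
print(regressed_uzr)
```

A bad fielder gets pulled up the same way a good fielder gets pulled down, so a -8 would become a -4.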

Problem two stated in the comment is that plays are left out of the data. This is true. Any plays that involve defensive shifts are left out. As we know, defensive shifts have been increasing over the last few years, with more and more being put on. This means more and more plays are being left out of UZR. This is a problem for sure.

Now I want to slightly alter the course of this topic to defense and WAR, which became a sticky subject on Fangraphs in September. Jeff Passan, a baseball columnist, stated that he thought the defensive part of WAR was being weighted too much. He cited Alex Gordon as an example of a player who, despite his OK-to-good batting numbers, was listed as one of the top players in the game by WAR. There was a post about it on Fangraphs and everything. I'm going to respond to Passan's argument with how I see it.

- WAR weighting defense is not the problem in my eyes. What's being fed into WAR from the defensive side is incorrect. The problem is not the output, it's the input, which of course is UZR. The creator of UZR, like I mentioned at the beginning, said you should use multi-year samples. WAR is fed one-year UZR data, which means very few of the takeaways from it are accurate. I understand why they don't use multi-year samples, but I think these numbers need to be regressed just a small bit closer to the mean (0). The mean is average, and since we're not making very good estimates using UZR the way it is now, why not get it closer to the mean? I'm not advocating changing WAR, I'm pushing to change UZR, which leads me into my second point.

- Let's say we were to do what I mentioned above and change UZR. I still don't know if I want it going into WAR. I know it will continue to, and it should. If it's the best we have then we have to roll with it, but when we find a better option we need to input that into WAR. My point here is just that UZR is messing up WAR; WAR isn't messing up UZR.

Thanks for reading. I hope I painted an accurate picture of where we are at with defense, UZR, and WAR. I might make a second post on this leading into a new idea; we'll see.

Saturday, November 22, 2014

2014 Park Factors

Above 100: Hitters Park
Below 100: Pitchers Park
Exactly 100: Neutral

The reason for me posting these is because:

A) You might be interested in it.

B) These are the Park Factors I use for Total Runs.

In this article you can see how Park Factors are actually put into Total Runs (I have a new way of doing it):
http://regresstothemean.blogspot.com/2014/11/changes-to-total-runs.html?m=1

I'll be using just the team name, so Cubs of course means Wrigley Field.

D'Backs: 103
Braves: 98
Orioles: 99
Red Sox: 107
Cubs: 97
White Sox: 99
Reds: 97
Indians: 98
Rockies: 123
Tigers: 102
Astros: 100
Royals: 102
Angels: 97
Dodgers: 92
Marlins: 100
Brewers: 99
Twins: 105
Mets: 94
Yankees: 101
A's: 104
Phillies: 95
Pirates: 103
Padres: 92
Giants: 98
Mariners: 93
Cardinals: 100
Rays: 98
Rangers: 97
Blue Jays: 101
Nationals: 106

Friday, November 21, 2014

Changes to Total Runs

Some changes to my metric Total Runs have been implemented. The metric's formula is still:

Total Runs = Runs Created + Defensive Runs Saved + Speed Runs + League Adjustment + Positional Adjustment + Park Factors
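To make the formula concrete, here is a minimal sketch of the sum in Python. The class name, field names, and every number in the example are made up for illustration; only the six components themselves come from the formula above.

```python
from dataclasses import dataclass


@dataclass
class TotalRunsInputs:
    """The six values that go into Total Runs, all measured in runs."""
    runs_created: float
    defensive_runs_saved: float
    speed_runs: float
    league_adjustment: float
    positional_adjustment: float
    park_factors: float


def total_runs(x: TotalRunsInputs) -> float:
    """Add up the six components; nothing is weighted or scaled here."""
    return (
        x.runs_created
        + x.defensive_runs_saved
        + x.speed_runs
        + x.league_adjustment
        + x.positional_adjustment
        + x.park_factors
    )


# Hypothetical player, not real data.
example = TotalRunsInputs(
    runs_created=90.0,
    defensive_runs_saved=5.0,
    speed_runs=2.0,
    league_adjustment=5.0,
    positional_adjustment=18.0,
    park_factors=1.5,
)
print(total_runs(example))
```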

As you may have noticed, my metric Speed Runs is also included. I believe I made a blog post on it a while back.

The League Adjustments for 2014 have been determined, and they are +5 for NL batters and +3 for AL batters. Those are all the League Adjustments for 2014. (They're different for other years.)

The way the Positional Adjustments and Park Factors are calculated has been changed. The basis of the change is the same for both. I'll cover the Positional Adjustments first.

As you may know, Positional Adjustments used to be given to all players as a fixed amount based on the position they played. Here is the chart.

Catcher: +42 runs
Shortstop: +36 runs
Second Base: +32 runs
Center Fielder: +29 runs
Third Base: +25 runs
Right Fielder: +20 runs
Left Fielder: +19 runs
First Base: +13 runs
DH: -7 runs

I used to give the full credit from this chart, which is based on the assumption that the player played 162 games. The change is that now you only get credit for the games you played. The way to determine the Positional Adjustment Value is now the following:

Games Played/162 * Positional Adjustment Value based on Position.

This only gives players credit for the games they played rather than giving them credit for 162 games. My old approach was just lazy, but now it's been fixed.
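Here is a small sketch of the prorated Positional Adjustment, using the chart values above. The position abbreviations and the function name are my own labels for illustration.

```python
# Full-season (162-game) positional adjustments in runs, from the chart above.
POSITIONAL_RUNS = {
    "C": 42, "SS": 36, "2B": 32, "CF": 29, "3B": 25,
    "RF": 20, "LF": 19, "1B": 13, "DH": -7,
}


def positional_adjustment(position, games_played, season_games=162):
    """Games Played/162 * the full-season value for the position."""
    return games_played / season_games * POSITIONAL_RUNS[position]


# A shortstop who played half the season gets half of the +36.
print(positional_adjustment("SS", 81))
```

Under the old approach that same shortstop would have been given the full +36 no matter how many games he played.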

Now to the revamped Park Factors Value. As with the Positional Adjustment, I used to give players credit for playing 162 games; now that's been adjusted.

Above 100: Hitters Park
Below 100: Pitchers Park
Exactly 100: Neutral

Games Played/162 * the number of points the player's park is above (+) or below (-) 100.

That's the new way we do Park Factors and Positional Adjustments: only giving them credit for the games they actually played. The Positional Adjustment will have a larger impact on the player's Total Runs than Park Factors most of the time.
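The Park Factors Value can be sketched the same way. This assumes the value is simply the park factor minus 100, scaled by games played, with the sign taken straight from whether the park is above or below 100; that reading of the formula, and the function name, are assumptions for illustration.

```python
def park_factor_value(park_factor, games_played, season_games=162):
    """Games Played/162 * (park factor - 100).

    Assumed sign convention: positive for a park above 100,
    negative for a park below 100, zero for a neutral park.
    """
    return games_played / season_games * (park_factor - 100)


# Park factors 123 (Rockies) and 92 (Dodgers) come from the 2014 list;
# the games played are hypothetical.
print(park_factor_value(123, 81))
print(park_factor_value(92, 162))
```

Most parks sit within a few points of 100, which is why this value usually ends up smaller than the Positional Adjustment.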

Total Runs is hopefully done adding new parts, and by parts I mean the 6 "values" that go into it. The way we figure these values will continue to change. I've already had some new ideas to fix Speed Runs a bit.

Just an update on the database for 2014 stats: I'm currently on the last names starting with P, so I'm making steady progress.

Monday, October 27, 2014

Fielding Metric for Total Runs

As you may or may not know, I have now come out with a fielding metric. I think I’ve only posted results/formulas for outfielders and middle infielders. I’m working on expanding to other positions as well, but for now not all positions have been completed. Meanwhile, I have a bit of a problem on my hands for Total Runs.

About a year ago I introduced Total Runs and made adjustments like Park Factors to it. This summer I made yet another adjustment, based on the league’s offensive power. The defensive metric I used in it at the time was DRS, also known as Defensive Runs Saved. As I mentioned in the first paragraph, I have invented my first defensive metric, Fielding Linear Weights (FLW), and it’s reliable, kind of. You can’t tell me DRS is reliable too, though. That’s the problem with defensive stats: throughout building FLW I was constantly comparing it to what DRS and UZR said, when I should have been comparing it to my subjective knowledge. So, to make this long story short, I’ve made up my mind on whether to use DRS or FLW in Total Runs: I’ll be using DRS. There is one main reason for this:

1) If I choose FLW, I would have to figure all of the Fielding Linear Weights before I could do any Total Runs values.

There is also a secondary reason: I think I might need another year of tinkering with FLW before I insert it into Total Runs. The trustworthiness and common acceptance of FLW might be a bit lower than those of DRS, too.

That’s all I have for today. Thanks for reading. Some results on 2014 data regarding Total Runs, FLW, and other stats I created should start to appear on here in about 1-2 weeks.
