Back soon...
In response to your queries, many apologies. As you probably know, my computer fried late last September, which cut short my projections by about 4 games last year. But that is no excuse for the fact that we are now almost two weeks into the 2011 season and I haven’t posted a thing.
The short explanation is that I have been busy doing other things. And on top of that, I have been reformatting my code and input to allow for more and better analysis this year (more on that in the weeks to come). Suffice it to say that 2011 predictions are coming in the next couple of weeks. Until then, I’ll try to spend less time making excuses and more time making predictions.
Technical Difficulties
Loyal followers (both of you),
Unfortunately, yesterday my computer ate itself. The data was backed up, so nothing critical should be lost, but due to the fact that everything running this site was happening on that computer, it's going to take a week or so at least to get a new one, get everything installed, and be ready to go again with the predictions and analysis.
And by then of course, the season will be over.
On the plus side, the regular season has mostly shaken itself out by now. We'll see if we can get things going in time for some good postseason analysis though.
In any case, thanks for your patience.
Brad
Updated Forecaster's Challenge results
So scratch the end of my last post. I'm not 3rd anymore in the Forecaster's Challenge. According to the updated results here I am now 6th. Slipped a little, but still right up there. Last year my Achilles heel was picking a bunch of pitchers that were out for the season. So in the offseason I redoubled my efforts to figure out who was projected to actually play, and my performance has certainly improved as a result, and I was right up there at the All-Star break. I'll have to dig a little deeper now (whenever I get the chance - i.e. in the offseason) to figure out what is distinguishing the winners from the losers this year, and why I've slipped in the second half. There is certainly a lot of variance in these things, although it is impressive how last year's winner has done very well again this year. I imagine he is a very experienced Fantasy Leaguer.
What’s with the 2 sets of preseason predictions?
For those of you that have viewed the preseason predictions before, you may have wondered about a few things, like why no pitchers are projected to have more than 12 wins, or why Albert Pujols is projected for 480 ABs. There are a couple of reasons why these numbers are lower than you probably would have expected (especially in retrospect). For one, we expect performance to regress to the mean a bit (google "regression to the mean in baseball" if you don't know what this is all about). And for another, these are average expected performances, not most likely expected performances. Where that comes into play most notably is in accounting for injuries. For instance, there may be a 10% chance a player will get injured, and when injured he has 0 at bats and when not injured he has 600. So the expected value is 540 AB (600*90%), but the most likely value is 600.
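To make that arithmetic concrete, here is a rough Python sketch of the two-outcome injury example above. The function names are made up for illustration, and the all-or-nothing injury model is just the simplification used in this post, not how my actual projection code works.

```python
def expected_at_bats(p_injured, ab_if_healthy, ab_if_injured=0):
    """Average (expected) at bats over the healthy and injured outcomes."""
    return (1 - p_injured) * ab_if_healthy + p_injured * ab_if_injured


def most_likely_at_bats(p_injured, ab_if_healthy, ab_if_injured=0):
    """At bats in whichever of the two outcomes is more probable."""
    return ab_if_healthy if p_injured < 0.5 else ab_if_injured


# The example from the text: 10% injury chance, 600 AB when healthy, 0 when injured.
mean_ab = expected_at_bats(0.10, 600)
mode_ab = most_likely_at_bats(0.10, 600)
print(f"expected: {mean_ab:.0f} AB, most likely: {mode_ab} AB")
```

The projections on this site report the first kind of number, which is why they tend to look low next to what a healthy season would produce.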
This explains a lot of why these playing time projections are a little off of your expectations. The one other reason that these projections are off is that some of my methods for projecting these things are just not that well-developed. Those of you that are familiar with my research know that the main models I have built are intended to estimate the probabilities of specific outcomes of specific at bats, and then based on that I have built out systems to predict outcomes for games and seasons. But I have not developed sophisticated models to predict playing time and injuries and such over the course of a season, and as a result some of the projections that are based on overall playing time and player usage, like at bats and wins and saves may be a little off or just not that great.
This is where the second set of projections comes in. Knowing the weaknesses of my models, I have drawn on public sources of playing time and usage projections (most notably the community projections at fangraphs) to bring in what I believe to be better information about these variables. By generally blending some of these usage-based estimates with mine, I have augmented my projections. You will notice, though, that although the aggregate numbers change here, the ability-based estimates like batting average and OPS have remained the same.
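As a rough illustration of what that kind of blending can look like, here is a small Python sketch. Everything in it is an assumption for the sake of the example: the 50/50 weight, the function names, and the input numbers are hypothetical, and the real blending differs from stat to stat. The point is only that the playing time estimate moves while the rate stat stays put.

```python
def blend(own, external, weight_external=0.5):
    """Weighted average of my estimate and an outside estimate (weight is hypothetical)."""
    return (1 - weight_external) * own + weight_external * external


def augment_projection(own_ab, own_hits, external_ab, weight_external=0.5):
    """Blend playing time only; keep the ability-based rate (batting average) fixed."""
    batting_avg = own_hits / own_ab                 # ability estimate, left untouched
    blended_ab = blend(own_ab, external_ab, weight_external)
    return {
        "AB": blended_ab,
        "H": batting_avg * blended_ab,              # counting stat rescaled to new playing time
        "AVG": batting_avg,
    }


# Hypothetical inputs: my model says 480 AB and 156 hits, an outside source says 600 AB.
result = augment_projection(own_ab=480, own_hits=156, external_ab=600)
print(result)
```

With these made-up inputs the at bats and hits go up, but the batting average is the same before and after, which matches what you see when comparing the two sets of projections.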
The last big question then is, how do these projections compare? Well, on the true player ability metrics, which I would still maintain are the most important for actual estimation of player ability and impact (excepting propensity to injury), they are the same. For the aggregate metrics, which are no doubt a bit more important to applications like fantasy sports, they differ a bit. To evaluate one versus the other, one piece of anecdotal information would be to look at my rankings in the Forecaster's Challenge (at www.insidethebook.com) last year, when I used only my algorithms, and this year, when I augmented my results à la version 2 above. Last year I was in the middle of the pack. This year I am ranking third overall in the main challenge, and first in one of the alternate challenges.