Ignoring and Embracing Small Samples


Apr 5, 2013; Baltimore, MD, USA; Minnesota Twins center fielder Aaron Hicks (32) lies on the ground after fouling a ball off his leg in the sixth inning against the Baltimore Orioles on Opening Day at Oriole Park at Camden Yards. The Orioles defeated the Twins 9-5. Mandatory Credit: Joy R. Absalon-USA TODAY Sports

I learned a valuable lesson yesterday. I’d been toying with this idea for a week or so and started outlining this post on Tuesday. Then, yesterday, Cee Angi of SBNation posted this excellent article, and I realized that my article was going to look like a copy or some sort of veiled rebuttal, or else be completely ignored. However, I still like the concept, so I’ll give my thoughts. If nothing else, I hope to have taught everyone a valuable lesson: don’t wait, because if you do, Cee Angi will beat you (metaphorically, but possibly physically, but probably not, but who knows).

For fans of downtrodden teams, the early part of the season is often the most enjoyable part of the season. We haven’t had real baseball in our lives for months, so we thrive on the chance to watch meaningful games. However, these meaningful games can produce a lot of meaningless noise. Much of this noise comes in the form of early-season statistics, packaged into dreaded small samples. Small samples are the scourge of modern baseball society, and it is not hard to figure out why.

Aaron Hicks has been a disaster so far, so send him back to AAA, or maybe just cut him altogether! Kevin Correia has been a revelation, so extend his contract indefinitely and, while we’re at it, trade away the pitching prospects for championship champagne coupons! Ryan Doumit’s beard is freaking thick, so I will project that he will have a face filled with hair by season’s end! Just hair, no face.

The point of using statistics is to try to learn what the numbers actually mean. As a rule, small samples are not statistically significant. Without a good sample, any conclusions drawn aren’t particularly useful when trying to determine why something happened or what could happen next. What we are left with is simple descriptive information. We can state what happened, but deeper meaning cannot be attached.
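For anyone who wants to see how little a few weeks of at-bats can pin down, here is a rough sketch. It puts a 95% Wilson score interval around a batting average, treating every at-bat as an independent coin flip, which is my own simplification and not how real hitters work. The function name and the 6-for-30 and 100-for-500 lines are hypothetical, made up purely for illustration.

```python
import math


def wilson_interval(hits, at_bats, z=1.96):
    """Approximate 95% Wilson score interval for a batting average.

    Assumes each at-bat is an independent trial with the same hit
    probability, which is a simplification.
    """
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    p = hits / at_bats
    denom = 1 + z ** 2 / at_bats
    center = (p + z ** 2 / (2 * at_bats)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / at_bats + z ** 2 / (4 * at_bats ** 2)
    ) / denom
    return center - half_width, center + half_width


# Hypothetical lines: the same .200 average over 30 and 500 at-bats.
small_low, small_high = wilson_interval(6, 30)
large_low, large_high = wilson_interval(100, 500)
print(f"6-for-30:    {small_low:.3f} to {small_high:.3f}")
print(f"100-for-500: {large_low:.3f} to {large_high:.3f}")
```

Under those assumptions, the 30 at-bat hitter could plausibly be anything from roughly a .095 hitter to a .373 hitter, while the 500 at-bat range is several times narrower. Same average, very different amounts of information.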

Small samples have basically no predictive value. It would be crazy to assume that Aaron Hicks is going to be a bust because his first taste of MLB pitching has led to a tidal wave of flailing swings. If the Twins sent him to AAA or if fans gave up on his promise, both groups would likely regret the decision. However, it isn’t crazy to suggest that Hicks take a day off here and there, or possibly move down in the batting order. Hicks is struggling, and that is apparent, but that doesn’t mean he has some deficiency that will prevent him from having a good career.

April 08, 2013; Kansas City, MO, USA; Minnesota Twins pitcher Kevin Correia (30) delivers a pitch against the Kansas City Royals during the first inning at Kauffman Stadium. Mandatory Credit: Peter G. Aiken-USA TODAY Sports

Small samples aren’t all that evaluative either. Kevin Correia has had success in his first two starts. Looking just at the facts, he has gotten a lot of ground balls and double plays. It would be incorrect to conclude that Correia has somehow changed the way he pitches or has improved his ability to get key ground balls, at least at this point. Evaluating Correia on the small sample would basically ignore his entire track record as a pitcher. Ignoring a better sample is a major no-no. Basically, Correia has had a couple of good games, nothing more and nothing less.

However, I don’t believe that small samples are completely useless. While I believe that small samples aren’t particularly useful in finding meaning, that doesn’t mean they aren’t fun to look at or even worth monitoring. Eventually, small samples can become large samples. If we pay attention to the sample as it grows and changes, we might actually be able to find more meaning than we would have by just looking at the sample as a whole, in hindsight.

In addition, even the smallest of samples tells us what actually happened. If a player goes 4-for-4 with four home runs, we can safely say that the player had an excellent night at the plate. Whether that game will ever be replicated, we cannot say. What caused that power explosion? We’re not qualified to answer that question. Did the player have a ridiculously special game? Uh, yup.

If that same player posts a .450/.550/.800 slash line with six home runs over the first two weeks of the season, it would be faulty logic to assume that this player is on any sort of record-setting pace. It would be unfair to change your expectations for that player based on that sample. Extrapolating numbers to figure out season totals would be a complete waste of a calculator or abacus. However, it would be absolutely reasonable to marvel at this incredible stretch of baseball.

The majority of the issues with small samples come from the real scientific and real statistical community.  Baseball is a game; it’s not a science.  We can enjoy small samples in baseball because the greater meaning isn’t really all that important in the first place.  If we’re wrong in projecting a record-breaking season, then I guess we’re wrong.  Who really cares?  If we write off a slow-starting prospect and then he starts to hit, I’m pretty sure we’ll still be allowed on the bandwagon when he turns it around.

Relying on small samples to make declarations is a losing game.  That doesn’t mean it isn’t a fun game.  While it is always fun to be right about something, sometimes it can be equally fun to live in the world of small samples, gleefully watching as every record is on pace to be broken.  In that world, a player will actually hit 324 home runs one day.  While it may not be the real world, it’s a fun place to visit from time to time.
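If you do want to visit that world, the "on pace" math is nothing more than multiplication. Here is a small sketch of it. The helper name is made up, the 13 games is my own stand-in for "the first two weeks," and I’m assuming the 324 comes from a two-homer opener stretched over a 162-game schedule.

```python
def on_pace_for(count, games_played, season_games=162):
    """Naively scale a counting stat to a full season.

    This is the small-sample trap in one line: it assumes the player
    keeps producing at exactly the same rate all year.
    """
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return count * season_games / games_played


# Six homers in an assumed 13 games (roughly two weeks of baseball).
print(f"Hot two weeks: on pace for {on_pace_for(6, 13):.0f} home runs")

# Two homers in the first game of the year.
print(f"Big opener: on pace for {on_pace_for(2, 1):.0f} home runs")
```

The smaller the number of games played, the wilder the projection gets, which is exactly why the first week of the season is the best time to play this game.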

Speaking of stats, I debuted as a cartoonist this weekend. If you are interested, you can find it here. I also wrote about Three True Outcomes Games a while back. I’m not sure there is a smaller sample than one game. You can read about that one here. How do you feel about small samples? Please share your thoughts in the comments below.