Jeff Zimmerman recently put up the start of a very good guide on using SQL and R to graph things - such as the location of every pitch that Joe Mauer homered off of in 2009. Thanks to that - with some help from JD Sussman - I was able to set things up on my computer and start playing around with it.

One of the first things I thought to do was look at Jeremy Guthrie and all those home runs he gave up last year. As discussed previously, Guthrie's home run rate went from 1.1 HR/9 in 2008 to 1.6 HR/9 in 2009, mostly due to an increase in flyball rate from 38% to 47% - his HR/FB rate was similar, at 10.9% and 10.5% (both of which are very reasonable).

Here's a graphic that shows the pitch location for the home runs that Guthrie gave up in 2009 (the blue dots) and in 2008 (the red dots). The large square that contains most of the dots is the strike-zone. It's divided into four quadrants (by color), and a black box in the center of the strike-zone marks the pitches that were right down the middle.
Of his 23 home runs from 2008 (he actually gave up 24, but the PITCHf/x database seems to have missed the location for one), 11 are in the upper half of the zone (48%). In 2009, by contrast, 26 of his 35 home runs were in the upper half of the zone (74%). And it doesn't really seem to be the case that Guthrie made more mistakes, as there were fewer home runs hit on pitches in the middle of the plate in 2009 (about 17 of 35, or 49%) than in 2008 (about 15 of 23, or 61%).

I tried to run some more involved queries at that point, but MySQL didn't want to cooperate and R started crashing, so I fell back to the more comfortable* Excel and SAS.

* I spend my days at work largely using Excel and SAS. Then I come home and spend time working in Excel and (now) SAS. Not the most exciting-sounding thing in the world, but somehow I quite enjoy it.

I thought that maybe Guthrie was throwing all of his pitches higher in the strike-zone. The red line shows the proportion of all pitches thrown around a certain height (I used buckets of 3-inch intervals) in 2008, and the blue line shows the heights for 2009. The two parallel lines give an idea of the strike-zone height.
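The bucketing behind that kind of chart can be sketched in a few lines of Python. This is only an illustration, not the Excel/SAS work actually used here: the function name `height_distribution`, the sample heights, and the assumption that heights are measured in inches are all made up for the example.

```python
import math
from collections import Counter


def height_distribution(heights_in, bucket_in=3.0):
    """Return {bucket lower edge (inches): share of all pitches in that bucket}."""
    if not heights_in:
        return {}
    # Snap every height down to the lower edge of its bucket, then count.
    counts = Counter(math.floor(h / bucket_in) * bucket_in for h in heights_in)
    total = len(heights_in)
    return {edge: counts[edge] / total for edge in sorted(counts)}


# Hypothetical pitch heights in inches (NOT Guthrie's real data).
sample_heights = [18.2, 19.9, 22.5, 23.0, 27.4, 30.1, 31.8, 35.6]

for edge, share in height_distribution(sample_heights).items():
    print(f"{edge:.0f}-{edge + 3:.0f} in: {share:.1%}")
```

Running the same function on each season's pitches and plotting the two sets of shares against each other would give a comparison like the one described above.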
As you can see, there wasn't really any difference in pitch height. If anything, it's a hair lower in 2009. Breaking it down by pitch type, starting with fastballs:
The '09 heaters were actually lower in the zone than the '08 ones - by almost an inch, on average. Maybe that's not great, as the pitches were more in the middle of the zone rather than in the upper middle, but in any case that wouldn't seem like it should result in more flyballs. Change-ups:
Now we're getting somewhere (maybe). It looks like someone had a little harder time getting the change-up down in the zone last year, compared to the year before. Guthrie threw the pitch over 2 inches higher in 2009. Sliders:
Whereas Guthrie did a nice job throwing the slider down towards the knees in 2008, he got the ball up more in 2009. Not a huge difference though - only about half an inch, on average. Curveballs:
I find it interesting that the shapes of the two distributions are very similar, though shifted downward quite a bit from 2008 to 2009. Guthrie was able to get the curve almost 4 inches lower in the zone last season, which seems like it would be a positive development.

Here's a breakdown of the balls in play, split into air outs (fly outs, pop outs), ground outs, and everything else (line drives, hits). I'm aware that this isn't exactly the same as general FB% and GB%, but I didn't know how to get that with the data I have, and I thought that the changes in out type would be an OK proxy. Since there weren't very many curveballs, I lumped them in with sliders as "breaking-balls".
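For readers who want to try the same grouping themselves, here is a rough Python sketch of it. It is not the method actually used for the table: the event labels ("Fly Out", "Pop Out", "Ground Out"), the pitch-type names, and the sample rows are assumptions made up for illustration, and real data may label things differently.

```python
from collections import Counter, defaultdict

# Assumed event labels; adjust to whatever the data source actually uses.
AIR_OUTS = {"Fly Out", "Pop Out"}
GROUND_OUTS = {"Ground Out"}
BREAKING = {"slider", "curveball"}  # lumped together as "breaking-ball"


def outcome_shares(balls_in_play):
    """balls_in_play: iterable of (pitch_type, event) pairs.

    Returns {pitch group: {category: share of that group's balls in play}}.
    """
    counts = defaultdict(Counter)
    for pitch_type, event in balls_in_play:
        group = "breaking-ball" if pitch_type in BREAKING else pitch_type
        if event in AIR_OUTS:
            category = "air out"
        elif event in GROUND_OUTS:
            category = "ground out"
        else:
            category = "other"  # line drives, hits, everything else
        counts[group][category] += 1
    return {
        group: {cat: n / sum(cnt.values()) for cat, n in cnt.items()}
        for group, cnt in counts.items()
    }


# Hypothetical balls in play (NOT Guthrie's real data).
sample = [
    ("fastball", "Fly Out"), ("fastball", "Ground Out"),
    ("fastball", "Single"), ("fastball", "Pop Out"),
    ("slider", "Ground Out"), ("curveball", "Fly Out"),
    ("change-up", "Ground Out"), ("change-up", "Line Out"),
]
shares = outcome_shares(sample)
print(shares["fastball"])
```

Computing these shares separately for each season and subtracting gives the year-over-year change in air outs per pitch group.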
For every pitch type, Guthrie's share of outs in the air to balls in play went up by 13-14%. It looks like the change-up remained his best groundball pitch, but even it had a big jump in air outs. One could say that there were more air outs because of the outfield defense, but the O's outfield UZR was actually higher in 2008 (+38.9 runs) than in 2009 (-19.3 runs). Also, Guthrie's BABIP went from .267 in '08 to .294 in '09, so he was getting fewer outs on balls in play in general. Still, comparing just the air outs to the ground outs, there was a sharp increase in the former and none at all in the latter. I wouldn't be comfortable drawing hard conclusions from this, but it looks like every type of pitch was being hit into the air more in 2009.

Here's a look at which pitches the home runs were hit against. The HRper1000 columns are the number of home runs hit per 1,000 of each type of pitch thrown.
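The HRper1000 figure is just a rescaled rate. As a small Python sketch (the function name and the counts in the example are hypothetical, not taken from Guthrie's actual totals):

```python
def hr_per_1000(home_runs, pitches_thrown):
    """Home runs allowed per 1,000 pitches of a given type."""
    if pitches_thrown <= 0:
        raise ValueError("pitches_thrown must be positive")
    return 1000 * home_runs / pitches_thrown


# Hypothetical (home runs, pitches thrown) by pitch type, for illustration only.
example_counts = {"fastball": (12, 1500), "change-up": (3, 400)}

for pitch_type, (hrs, pitches) in example_counts.items():
    print(f"{pitch_type}: {hr_per_1000(hrs, pitches):.1f} HR per 1,000 pitches")
```

Scaling to 1,000 pitches keeps rare events readable and makes pitch types with very different usage comparable, though with small samples (like the curveball here) the rate will bounce around a lot.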
The rate for curveballs went down a little - corresponding with the pitch being lower in the zone - but there isn't a huge number of them anyway. For sliders it went up a little (as did the height of the pitch), but that's pretty much the same. For fastballs the increase was more pronounced, and goes against the pitch height graph. It's the change-ups where there was a very big jump - Guthrie threw the pitch higher in the zone, and it was deposited in the seats at a very high rate.

Switching gears a little: would a change in vertical pitch movement in general contribute to the flyball increase? According to this post, the rate at which a pitcher gives up flyballs and groundballs changes pretty decisively with the change in a pitch's vertical movement:
It's just for fastballs, but you can see that the more sink there is on the pitch (less "rise"), the greater the groundball rate and the lower the flyball and pop-up rates. How did the movement on Guthrie's pitches change? Here are the movement distributions, with the drop/rise on the vertical axis and the frequency (percent of all such pitches) on the horizontal axis. The black horizontal line gives the league average vertical movement for the given pitch type. Again, 2008 is red and 2009 is blue. Fastball:
Guthrie's fastball has always had a lot of "rise" to it, so you would expect him to be a flyball pitcher. It had even less sink on it in 2009 than in 2008 though, which would contribute somewhat to the increase in flyballs. It's lower, but straighter, and my understanding is that movement contributes more than location to the type of batted ball. Change-up:
The change-up had much less sink on it, and there was never a ton to begin with. Given that the pitch was also coming in higher in the zone, perhaps Jeremy was throwing them to the same places he was in 2008 but they just didn't get all the way down where he expected them to. It was a good pitch for him in '08, so hopefully he can get it back to where it was. Slider:
The slider actually had a little more downward movement, and it was already better than league average. Maybe the increase in flyball and home run rate has to do with pitch sequencing or the almost one mph drop in average velocity (or luck). Curveball:
The curve had almost the exact same average movement in both years, but how it got there was a little different. In 2008 the pitch was pretty consistent. In 2009 though, Guthrie seemed to snap off some really good ones but then also leave quite a few hangers.

One last thing: contact rate. In 2008, batters made contact about 83% of the time they swung at a pitch. In 2009, it went up to about 87%. Here's the breakdown of the change for each pitch type:
Not a great trend all around. The curveball really started getting hit, and given that the pitch got taken deep more frequently than the slider, it might make sense for Guthrie to scrap it altogether.

So what's the verdict? Honestly, I'm not that sure. I'd say it's one part less sink on the fastball, two to three parts less movement on the change-up along with leaving it higher in the zone, and maybe four parts random variation.