Sunday, October 11, 2009

Series Win Probability Graphs: STL vs. LAD

Trying something new here, so any feedback would be appreciated. I did two things to come up with the graph below. First, I adjusted the FanGraphs win probabilities for each game based on that game's betting line, so instead of starting G1 at 50-50, the Dodgers started at 44.4% (+122/-130).
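For anyone who wants to check the 44.4%, here is a rough sketch of one way to get from a moneyline to a starting probability. It assumes the Dodgers were the +122 side and the Cardinals the -130 side, and that the vig is removed by rescaling both sides proportionally; the function names are made up for illustration.

```python
def implied_prob(moneyline: int) -> float:
    """Raw implied probability from an American moneyline (vig still included)."""
    if moneyline > 0:
        return 100 / (moneyline + 100)
    return -moneyline / (-moneyline + 100)


def no_vig_probs(line_a: int, line_b: int) -> tuple[float, float]:
    """Rescale both sides so they sum to 1 (assumed proportional vig removal)."""
    a, b = implied_prob(line_a), implied_prob(line_b)
    total = a + b
    return a / total, b / total


# Assumption: LAD at +122, STL at -130 for Game 1.
dodgers, cardinals = no_vig_probs(122, -130)
print(f"LAD {dodgers:.1%}, STL {cardinals:.1%}")  # LAD 44.4%, STL 55.6%
```

Proportional rescaling is only one way to strip the vig, but it does reproduce the 44.4% figure here.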

Second, I incorporated both the actual game lines and my guesses at the lines for games that never happened to figure out the series odds following each possible game result. If the Cardinals had won Game 1, they would've been at 75.4% to win the series; LAD's victory got the Dodgers up to 61.7%. So throughout the game, those two outcomes are weighted by the batter-to-batter single-game win probability.
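The weighting amounts to an expected value over the two possible game results. A minimal sketch from the Dodgers' side, using the Game 1 numbers above (the function and constant names are hypothetical, not from any spreadsheet of mine):

```python
def series_win_prob(game_wp: float, series_if_win: float, series_if_loss: float) -> float:
    """Series win probability as a weighted average of the two post-game outcomes."""
    return game_wp * series_if_win + (1 - game_wp) * series_if_loss


# Dodgers' perspective for Game 1.
LAD_IF_WIN_G1 = 0.617        # series odds after a Dodgers win
LAD_IF_LOSE_G1 = 1 - 0.754   # Cardinals at 75.4% leaves the Dodgers at 24.6%

# Before the first pitch the game probability is the line-adjusted 44.4%.
pregame = series_win_prob(0.444, LAD_IF_WIN_G1, LAD_IF_LOSE_G1)
print(f"{pregame:.1%}")  # 41.1%
```

As the single-game probability moves batter to batter, the series number slides between 24.6% and 61.7%, landing on one of them when the game ends.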

Do that for all three games, and you end up with this:

I think the main thing this shows is just how little Holliday's error is to blame for St. Louis going home early.

Please leave any thoughts/potential additions in the comments.

8 comments:

adam said...

This is a very neat graph, but if you wanted to be a true badass you would enter your dataset into a flash graph and allow the user to roll-over the entire graph to receive pop-up-style data about each play instead of denoting just a few key moments... but that would likely be a pretty significant undertaking. Examples of sites that employ these flash graphs are sportsclubstats.com and Advanced NFL Stats, as I'm sure you know.

Edward said...

How do you alter the FanGraphs data based on the new initial chance of winning the game?

That is, hypothetically, the first player of the game increases WP from 50% to 51.5%, according to FanGraphs. However, your numbers show that his team had a 44% chance of winning instead of 50%. Is the new, post-player WP 45.5%? Still 51.5%?

adam said...

Yeah, that's also a good point. I'd imagine the effect of the initial chance diminishes over time. My initial intuition says the effect would diminish in a linear fashion, but I dunno. In later innings you also have the reliever effect, as the "initial chance" is largely based on the starting pitchers; that's kind of a mess to solve concretely.

Vegas Watch said...

What I'm doing is:

Game Win Probability = FG Win Prob + (Vegas Win Prob - 0.5) * (1 - Percentage of Game Completed)

Obvious problem with this, besides the relievers thing, is that if the favored team jumped out to a big lead they could get over 100%. That didn't come up for this series, with the underdog Dodgers winning every game, but is something I should fix before I make another one of these.
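Roughly, in code, and assuming "Percentage of Game Completed" is a fraction from 0 to 1 (how it gets measured is left out here; the function name and the blowout inputs are made up for illustration):

```python
def adjusted_wp(fg_wp: float, vegas_wp: float, pct_completed: float) -> float:
    """FanGraphs win probability shifted by the pregame line, fading linearly to zero."""
    return fg_wp + (vegas_wp - 0.5) * (1 - pct_completed)


# Edward's case: FanGraphs moves from 50% to 51.5% on the first batter,
# with a 44.4% pregame line and essentially none of the game completed.
print(round(adjusted_wp(0.515, 0.444, 0.0), 3))  # 0.459

# The overflow problem, with hypothetical inputs: a 60% favorite
# at 95% on FanGraphs with 20% of the game played.
print(round(adjusted_wp(0.95, 0.60, 0.2), 3))  # 1.03
```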

Xeifrank said...

Unless I am misinterpreting something here, it looks to me that your graph is actually showing "how much" Holliday's error cost the Cardinals. Holliday should be credited with the difference in series win probability before his error to the end of Game #2. Looks like nearly a 40% jump in probability. That's huge for one play.
vr, Xei

Vegas Watch said...

"Holliday should be credited with the difference in series win probability before his error to the end of Game #2."

Why? Because he walked Blake, gave up a single to Belliard, walked Martin, and gave up a single to Loretta? The Cardinals' G2 win probability didn't go down to 0% after he made that error.

hoody said...

the reliever problem is an interesting one. what i might do is treat the bullpen as an aggregate and then weight how often each reliever is used over a season by a specific manager. if one wanted to get really subtle one could weight the relievers based on context, that is if it's in the ninth inning of a close game the long reliever is weighted less than the closer. this all is very problematic in a multitude of ways (e.g. the sample size with players like Sherrill would be awful small), but the idea of weighting use patterns and using the bullpen as an aggregate could be productive if taken up by people better at math than I am.

adam said...

This is what I came up with to control overflow:

if (fangraphs > 0.5) { gwp = fangraphs + (vegas - 0.5) * 2 * (1 - fangraphs) * (1 - pct_completed) }
else { gwp = fangraphs + (vegas - 0.5) * 2 * fangraphs * (1 - pct_completed) }

That produces a graph like this where the lines are level curves of the Vegas odds. I'd love to see other solutions, though. I think I thought of one at some point that was more clever but forgot it.
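Here is the same rule as a runnable sketch, in case anyone wants to play with it. The names are mine, and pct_completed is assumed to be a fraction from 0 to 1. The idea is to scale the Vegas shift by the room left between the FanGraphs number and the nearer of 0 or 1.

```python
def bounded_wp(fg_wp: float, vegas_wp: float, pct_completed: float) -> float:
    """Line-adjusted win probability that cannot leave the 0-to-1 range."""
    # Room left before hitting the nearer boundary (1 if ahead, 0 if behind).
    room = (1 - fg_wp) if fg_wp > 0.5 else fg_wp
    return fg_wp + (vegas_wp - 0.5) * 2 * room * (1 - pct_completed)


# At the first pitch (FanGraphs at 50%, nothing completed) it returns the line.
print(round(bounded_wp(0.5, 0.444, 0.0), 3))  # 0.444

# Hypothetical blowout: a 60% favorite at 95% on FanGraphs, 20% of the game played.
print(round(bounded_wp(0.95, 0.60, 0.2), 3))  # 0.958
```

Since (vegas - 0.5) * 2 is never bigger than 1 in absolute value, the shift can never exceed the room that is left, so the result stays between 0 and 1.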
