Sox Probability Rollercoaster


JungleJimR
09-18-2008, 01:11 PM
Coolstandings.com runs a million computer simulations each day to assess playoff probabilities. It takes into account home/away splits and strength of schedule. Check it out at http://www.espn.com/mlb/standings (look at the three columns on the right).

The following is last week's up-and-down history of the Sox probability of winning the Central Division:

End of - Prob - Events
Fri - 53% - Start of week
Sat - 44% - Sox rained out, Twins win DH
Sun - 56% - Sox win DH, Twins lose
Mon - 54% - Sox and Twins lose
Tues - 67% - Sox win, Twins lose
Wed - 72% - Sox and Twins lose
Thur - 55% - Sox lose, Twins have big, big win
Fri - 72% - Sox win, Twins lose (system really values a 3 game loss differential)
Sat - 71% - Sox and Twins lose (system pauses before the big series)
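
For anyone curious what "a million simulations" might look like in practice, here is a minimal Python sketch. It is not Coolstandings' actual method: the function name `division_odds`, the fixed per-game win probabilities, and the coin-flip tiebreak are all assumptions for illustration, and it ignores head-to-head games between the two teams.

```python
import random

def division_odds(lead, games_left, p_us=0.5, p_them=0.5, trials=20_000, seed=1):
    """Estimate the chance that "us" finishes ahead of "them".

    lead         -- our current lead in wins (hypothetical input; may be negative)
    games_left   -- games remaining for each team (assumed equal for both)
    p_us, p_them -- assumed per-game win probabilities, held constant
    """
    rng = random.Random(seed)
    titles = 0.0
    for _ in range(trials):
        # Play out the rest of the season once for each team.
        our_wins = sum(rng.random() < p_us for _ in range(games_left))
        their_wins = sum(rng.random() < p_them for _ in range(games_left))
        margin = lead + our_wins - their_wins
        if margin > 0:
            titles += 1.0
        elif margin == 0:
            titles += 0.5  # assumption: a tie is settled by a 50/50 playoff
    return titles / trials

if __name__ == "__main__":
    for lead in (0, 1, 3):
        print(lead, round(division_odds(lead, games_left=11), 3))
```

Changing `lead`, `games_left`, or the two win probabilities shows how much a single day's results can move the estimate late in a season.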

35th&Shields
09-18-2008, 01:24 PM
The probabilities also illustrate just how important a win last night would have been. To have been able to get a win with our #5 pitcher on the mound in Yankee Stadium, on a night the Twins lost, would have been huge. Oh well. Let's keep fighting and try and pick up a W tonight against Mussina.

cws05champ
09-18-2008, 01:55 PM
I think Baseball Prospectus still has the 2005 White Sox as a 98% probability to win the division. :D:

spiffie
09-18-2008, 02:02 PM
I think Baseball Prospectus still has the 2005 White Sox as a 98% probability to win the division. :D:
HA!!!! I have to say three years after their tortured explanation for that this joke STILL makes me laugh every time I see it!

JungleJimR
09-18-2008, 02:03 PM
I think Baseball Prospectus still has the 2005 White Sox as a 98% probability to win the division. :D:


I have been a Sox fan for a long, long time; and hang with them rain or shine -

However, if I could get anything like these odds, I would not hesitate to put money on the Twins.

kjhanson
09-18-2008, 02:18 PM
Speaking of Baseball Prospectus, they have the Sox at about 88% to win the division. If you use not-so-Cool Standings' dumb mode, the Sox are at 82% to win the division.

jabrch
09-18-2008, 02:36 PM
Simulated games - they have worked so well for the Cubs. I don't give a damn what the output of 1,000,000 computer run games is. It has zero to do with the outcome of 11 human games.

Eddo144
09-18-2008, 02:38 PM
Simulated games - they have worked so well for the Cubs. I don't give a damn what the output of 1,000,000 computer run games is. It has zero to do with the outcome of 11 human games.
No kidding. But no one is suggesting we actually pick the playoff teams based on these odds. It's just information, sheesh.

Next you'll be suggesting Vegas stops making lines on football games. "Just because a team is favored by 10 1/2 points doesn't mean they can't lose."

Marqhead
09-18-2008, 02:42 PM
I think Baseball Prospectus still has the 2005 White Sox as a 98% probability to win the division. :D:

What is the story behind this, or is there a link to a thread that will explain it?

Evman5
09-18-2008, 02:54 PM
What is the story behind this, or is there a link to a thread that will explain it?


I am pretty sure that even after we clinched the division in '05 BP still had us as 98% chance of winning the division.

FielderJones
09-18-2008, 03:01 PM
What is the story behind this, or is there a link to a thread that will explain it?

Here (http://www.whitesoxinteractive.com/vbulletin/showthread.php?t=58671) you go. I started using the search term propellerhead, but ended up having to use prospectus instead.

UofCSoxFan
09-18-2008, 03:03 PM
I have been a Sox fan for a long, long time; and hang with them rain or shine -

However, if I could get anything like these odds, I would not hesitate to put money on the Twins.

You realize that you said you would put money on the Twins at 49 to 1 odds to win the 2005 AL Central? You do realize that is like saying you'd put money on the Bears today to win Super Bowl XLI (which they already did not win) given the right odds?
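
The 49 to 1 figure is just the 98% turned around: if BP gave the Sox 98%, everyone else combined had 2%, and 98/2 = 49. A small Python sketch of that conversion (the function name `odds_against` is made up for illustration):

```python
def odds_against(probability):
    """Convert a win probability into fair 'X to 1 against' odds."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return (1.0 - probability) / probability

# A 2% chance corresponds to roughly 49 to 1 against.
print(round(odds_against(0.02), 2))
```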

That reminds me of the quote by Kevin of The Office (and I'm quoting from memory): "If someone gives you one million to one on anything, you take it. If John Mellencamp ever wins an Oscar, I'm going to be a very rich dude."

JungleJimR
09-18-2008, 03:15 PM
You realize that you said you would put money on the Twins at 49 to 1 odds to win the 2005 AL Central? You do realize that is like saying you'd put money on the Bears today to win Super Bowl XLI (which they already did not win) given the right odds?

Sorry, my misread of 2005 for 2008!
(new bifocals maybe)

kjhanson
09-18-2008, 03:19 PM
End of - Prob - Events
Sun - 56% - Sox win DH, Twins lose
Mon - 54% - Sox and Twins lose


If you want to know why I call it not-so-Cool Standings, take a look at that logic. The Sox knock one off their magic number, and yet their probability of winning the division goes down. Not to mention that we played one of our remaining hard games (Yankees) and the Twins played an easy game (Indians).

JungleJimR
09-18-2008, 03:21 PM
I think Baseball Prospectus still has the 2005 White Sox as a 98% probability to win the division. :D:


Coolstandings may have resolved this rounding problem - e.g. since the day LAA clinched the AL West this year, their probability has been at 100%.

jabrch
09-18-2008, 04:31 PM
No kidding. But no one is suggesting we actually pick the playoff teams based on these odds. It's just information, sheesh.

Next you'll be suggesting Vegas stops making lines on football games. "Just because a team is favored by 10 1/2 points doesn't mean they can't lose."

Actually, Vegas lines are good for one thing only - betting. They are not there to forecast likelihood of winning, nor do they profess to. Sheesh.

All I am saying is that simulated games done a million times don't mean ****. It's one thing to use technology to analyze real data about a baseball game to draw meaningful conclusions about what happened. It is a reach to use that to predict what may happen next. It is complete insanity to then take that and create games that never happened, and say that they predict a likelihood of a real result happening.

Computer simulations are great for games of chance. Toss a coin 100 times - what's the result going to be? That's a great use of a computer simulation as there are so few variables. Evaluate the productivity of a piece of machinery on a manufacturing line that works to 99.9999999% accuracy, and project out defect rates or volumes - that's another great one. Predicting the outcome of a sporting event, where so few of the variables can be isolated, evaluated, measured, controlled and input into a model, is just asshattery.

The fact that supposedly statistics-minded folks propagate this crap without clearer disclaimers of its worth is just nuts. Then the fact that ignorant, self-proclaimed statistics people, who don't get it, preach this crap is even more silly.

People are welcome to value % chances of winning all they want. I'm equally within my right to think that it is complete idiocy.

Sheesh.

RockyMtnSoxFan
09-18-2008, 04:56 PM
Actually, Vegas lines are good for one thing only - betting. They are not there to forecast likelihood of winning, nor do they profess to. Sheesh.

All I am saying is that simulated games done a million times don't mean ****. It's one thing to use technology to analyze real data about a baseball game to draw meaningful conclusions about what happened. It is a reach to use that to predict what may happen next. It is complete insanity to then take that and create games that never happened, and say that they predict a likelihood of a real result happening.

Computer simulations are great for games of chance. Toss a coin 100 times - what's the result going to be? That's a great use of a computer simulation as there are so few variables. Evaluate the productivity of a piece of machinery on a manufacturing line that works to 99.9999999% accuracy, and project out defect rates or volumes - that's another great one. Predicting the outcome of a sporting event, where so few of the variables can be isolated, evaluated, measured, controlled and input into a model, is just asshattery.

The fact that supposedly statistics-minded folks propagate this crap without clearer disclaimers of its worth is just nuts. Then the fact that ignorant, self-proclaimed statistics people, who don't get it, preach this crap is even more silly.

People are welcome to value % chances of winning all they want. I'm equally within my right to think that it is complete idiocy.

Sheesh.

Take a deep breath.

Statistics is a branch of mathematics. The numbers given by coolstandings are predictions, kind of like the weather forecast; if the forecast gives a 60% chance of rain but it stays sunny, that doesn't mean the forecast was completely wrong. There was also a 40% chance it wouldn't rain.

The numbers put out by coolstandings are useful if you realize what they mean and where they come from. They are based primarily on run differential and remaining schedule. Thus, when the Sox were ahead of the Twins and had a better run differential, but only a marginal advantage in chance of winning the division, it was a sign that the Sox still had more difficult opponents on their schedule than the Twins.
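
To make the run differential idea concrete, here is a rough Python sketch. It assumes the classic Bill James Pythagorean expectation and log5 formulas; whether coolstandings uses exactly these is an assumption on my part, and the run totals below are made up for illustration.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def log5(p_a, p_b):
    """Chance that team A beats team B, given each team's winning percentage."""
    return p_a * (1.0 - p_b) / (p_a * (1.0 - p_b) + p_b * (1.0 - p_a))

# Hypothetical run totals, not real 2008 numbers.
strong = pythagorean_win_pct(800, 700)
weak = pythagorean_win_pct(700, 800)
print(round(strong, 3), round(log5(strong, weak), 3))
```

A tougher remaining schedule shows up as lower per-game log5 values, which is how a team with the better run differential can still have only a marginal edge in the odds.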

Before you go calling people idiots for using statistics, consider that it only makes you look ignorant.

Lundind1
09-18-2008, 04:58 PM
Computer simulations are great for games of chance. Toss a coin 100 times - what's the result going to be? That's a great use of a computer simulation as there are so few variables. Evaluate the productivity of a piece of machinery on a manufacturing line that works to 99.9999999% accuracy, and project out defect rates or volumes - that's another great one. Predicting the outcome of a sporting event, where so few of the variables can be isolated, evaluated, measured, controlled and input into a model, is just asshattery.

People are welcome to value % chances of winning all they want. I'm equally within my right to think that it is complete idiocy.

Sheesh.

I agree with that sentiment. Just ask how many of the computer simulations gave the New York Giants even an n'th of a chance to win the Super Bowl at the end of the season last year. That is why we play the games. The Sox may surprise us, and hopefully in a good way.

RockyMtnSoxFan
09-18-2008, 05:07 PM
I agree with that sentiment. Just ask how many of the computer simulations gave the New York Giants even an n'th of a chance to win the Super Bowl at the end of the season last year. That is why we play the games. The Sox may surprise us, and hopefully in a good way.

Come on, people, we all know that anything can happen in baseball, and you have to play the games. Nobody is saying that statistics can replace the games; besides, that would be less fun. All we're saying is that statistics can give useful insight into how a team or player is doing, and how difficult it might be for them to make it to the playoffs. Just like winning percentage and games behind are statistics, so are the playoff odds. It's a measuring stick, not a replacement for reality.

Lundind1
09-18-2008, 05:16 PM
I understand, and being from Chicago we all know what happens when you assume that something is locked up before it is, AHEM....some other team in town.

I like the measuring stick, I just try not to base my enjoyment or lack thereof on it. I just hate like hell to let it cause me undue pain if something were to go wrong, that's all.

Lip Man 1
09-18-2008, 05:23 PM
Speaking of BP, remember all the "hoopla" from certain writers like Dave Van Dyke who had these stories on how they pegged the Sox win total in 2007 EXACTLY right?

Geez, when does the article come out showing how BP bit the big one this season on their prediction?

Or should I stop holding my breath for it?

LOL

Lip

jabrch
09-18-2008, 05:44 PM
Take a deep breath.

Thanks for reminding me - glad you are here Rocky.


Before you go calling people idiots for using statistics, consider that it only makes you look ignorant.


If you read my post, I did nothing of the sort. I use statistics every day. There is nothing at all bad about using them. And I did not ever "call someone an idiot for using statistics".

There is a problem with misusing them - and worse yet, abusing them. I'm all for the use of statistics where they are meaningful. This is not an application that fits.

Foulke You
09-18-2008, 05:49 PM
Speaking of BP, remember all the "hoopla" from certain writers like Dave Van Dyke who had these stories on how they pegged the Sox win total in 2007 EXACTLY right?

Geez, when does the article come out showing how BP bit the big one this season on their prediction?

Or should I stop holding my breath for it?

LOL

Lip
The BP supporters quietly slink back into their hole when things don't go according to their calculations. I couldn't stand all the "I told you so" nonsense about BP picking the Sox perfect last year at 72 Wins. No mention of how wrong they were in 2005 when we won 99 games and the championship to boot. They projected our team at 72 wins last year when we were healthy which would have been incorrect. A lot went into the Sox having a 72W-90L season. Chief among them were injuries to Podsednik, Dye, Erstad, Thome, and Joe Crede. This was over half the starting 9! Throw in a bullpen that imploded and this provided a recipe for disaster. BP basically lucked into that prediction because had the Sox stayed healthy, they probably would have been more like 82W-80L. They still weren't going anywhere with that bullpen but it would have been a more respectable record.

Lundind1
09-18-2008, 05:57 PM
There is a problem with misusing them - and worse yet, abusing them. I'm all for the use of statistics where they are meaningful. This is not an application that fits.

What is the Homer Simpson line used on Smartline : "Aw, people can come up with statistics to prove anything, Kent. Forfty percent of all people know that." :D:

getonbckthr
09-18-2008, 06:30 PM
I just wanna know are we 100% to win the 2005 AL Central yet?

RockyMtnSoxFan
09-18-2008, 06:38 PM
If you read my post, I did nothing of the sort. I use statistics every day. There is nothing at all bad about using them. And I did not ever "call someone an idiot for using statistics".

There is a problem with misusing them - and worse yet, abusing them. I'm all for the use of statistics where they are meaningful. This is not an application that fits.

Sorry. Maybe I should take a deep breath too.

I agree that statistics can be misused. I don't think this is such a case, however. If you use the playoff odds as a measure of remaining schedule difficulty/run differential, it can be useful to see which teams have the harder road to make the playoffs.

I found it interesting a while back when we had played better opponents than the Twins, but also had a tougher remaining schedule. From here on out, we have easier opponents, but more road games.

jabrch
09-18-2008, 06:47 PM
Sorry. Maybe I should take a deep breath too.

It's all good dude...no worries



I agree that statistics can be misused. I don't think this is such a case, however. If you use the playoff odds as a measure of remaining schedule difficulty/run differential, it can be useful to see which teams have the harder road to make the playoffs.

I just don't believe you can simulate a baseball game on a computer and get a high degree of accuracy when using run differential and strength of schedule. Certainly the Sox are a good illustration of run differential being kinda weak. If you believe what we are told a lot, this team, more than any (and nobody has proven that to me), is one that scores big bunches or gets shut down. Differential looks odd that way. Also, far too many variables exist that a computer can't pick up.

I work in a 6 Sigma environment and I lead projects with that rigor. I love the use of statistics to measure and evaluate performance - if those statistics truly are performance measures. Run differential, to me, doesn't tell you enough to draw the conclusions that BP or Coolstandings say they can draw.

Peace....Love....Sox...

RowanDye
09-18-2008, 07:18 PM
Thanks for reminding me - glad you are here Rocky.




If you read my post, I did nothing of the sort. I use statistics every day. There is nothing at all bad about using them. And I did not ever "call someone an idiot for using statistics".

There is a problem with misusing them - and worse yet, abusing them. I'm all for the use of statistics where they are meaningful. This is not an application that fits.

Yeah, you're right. We should probably just give up trying to model things because there are too many variables (e.g. the weather).

While I agree that some of the BP types make fools of themselves and come off as know-it-alls, you do the same when you imply that modeling a baseball game is completely meaningless. As the models improve, we will begin to approach the set of definable terms and reduce the amount of residual error.

We are in agreement on the fact that some people misinterpret or overstate the meaning of statistics. They are not deterministic.

SBSoxFan
09-18-2008, 08:36 PM
Yeah, you're right. We should probably just give up trying to model things because there are too many variables (e.g. the weather).

While I agree that some of the BP types make fools of themselves and come off as know-it-alls, you do the same when you imply that modeling a baseball game is completely meaningless. As the models improve, we will begin to approach the set of definable terms and reduce the amount of residual error.

We are in agreement on the fact that some people misinterpret or overstate the meaning of statistics. They are not deterministic.

Comparing the weather with the outcome of a sporting event is a poor comparison. The accuracy of weather predictions improves as the time frame decreases, i.e., weather predictions about tomorrow are more accurate than weather predictions about a day a week from now. This is not the case for a sporting event. There is little to no correlation between how the Sox perform tonight versus the Yankees and tomorrow versus Kansas City.

I'd like to hear what variables you would add to improve the model of the physical performance of 20+ athletes on a given night. To run a sufficiently accurate model would, I suspect, take longer than it took Deep Thought to come up with 42. :D:

itsnotrequired
09-18-2008, 09:41 PM
There is a problem with misusing them - and worse yet, abusing them. I'm all for the use of statistics where they are meaningful. This is not an application that fits.

How are statistics being "abused" in this instance?

kitekrazy
09-18-2008, 09:54 PM
Simulated games - they have worked so well for the Cubs. I don't give a damn what the output of 1,000,000 computer run games is. It has zero to do with the outcome of 11 human games.

LOL. That's the best statement I've read.

jabrch
09-18-2008, 11:26 PM
How are statistics being "abused" in this instance?

Because computer simulations, no matter how many times run, don't take into account anywhere near the number of variables that exist when 50 humans play a game on grass in a stadium with a crowd and an atmosphere. It just isn't real. It's jerry-rigging numbers that lack meaning.

jabrch
09-18-2008, 11:27 PM
Yeah, you're right. We should probably just give up trying to model things because there are too many variables (e.g. the weather).

I never said that either - just that this model doesn't say anything significant and that it is misleading to use it as such.

itsnotrequired
09-18-2008, 11:28 PM
Because computer simulations, no matter how many times run, don't take into account anywhere near the number of variables that exist when 50 humans play a game on grass in a stadium with a crowd and an atmosphere. It just isn't real. It's jerry-rigging numbers that lack meaning.

The same could be said for weather models, population models, etc.

voodoochile
09-18-2008, 11:45 PM
The same could be said for weather models, population models, etc.
Weather models, yes. I am sure the number of different distinct things that influence the outcome of a baseball game rapidly push calculating that outcome ahead of time into some pretty far reaches of chaos math, which is needed to predict weather with any degree of accuracy.

Population on the other hand is pretty straightforward. When you calculate in the information from years and years of life insurance data it's really not that tough of a field mathematically.

itsnotrequired
09-18-2008, 11:52 PM
Population on the other hand is pretty straightforward. When you calculate in the information from years and years of life insurance data it's really not that tough of a field mathematically.

o rly?

http://www.fpri.org/ww/0505.200407.eberstadt.demography.html

Seems more like hating on these "playoff percentages" just because BP is involved.

:shrug:

voodoochile
09-18-2008, 11:59 PM
o rly?

http://www.fpri.org/ww/0505.200407.eberstadt.demography.html

Seems more like hating on these "playoff percentages" just because BP is involved.

:shrug:

That's a long article for midnight. I skimmed it, but is there an actual argument in there that predicting world population trends over the next decade is difficult to do?

And no, I am not hating on BP. I do think they take themselves WAY too seriously and their numbers really don't predict well over the course of a season of wins and losses because there are just too many factors that influence a baseball game, one of which is weather :D:

JungleJimR
09-18-2008, 11:59 PM
Coolstandings.com runs a million computer simulations each day to assess playoff probabilities. It takes into account home/away splits and strength of schedule. Check it out at http://www.espn.com/mlb/standings (look at the three columns on the right).

The following is last week's up-and-down history of the Sox probability of winning the Central Division:

End of - Prob - Events
Fri - 53% - Start of week
Sat - 44% - Sox rained out, Twins win DH
Sun - 56% - Sox win DH, Twins lose
Mon - 54% - Sox and Twins lose
Tues - 67% - Sox win, Twins lose
Wed - 72% - Sox and Twins lose
Thur - 55% - Sox lose, Twins have big, big win

SBSoxFan
09-19-2008, 12:05 AM
Weather models, yes. I am sure the number of different distinct things that influence the outcome of a baseball game rapidly push calculating that outcome ahead of time into some pretty far reaches of chaos math, which is needed to predict weather with any degree of accuracy.


I'm not sure meteorologists are using chaos theory to forecast tomorrow's weather on the 10-o'clock news. I think you're right though --- factors that affect the outcome of a sporting event are extremely numerous. Where the weather and a sporting event differ, however, is that weather predictions are fairly accurate in the short term because the initial conditions don't often change very much. A butterfly flapping its wings in Mexico has only a long-term effect on Chicago's weather. A pitcher losing it, or finding it, once he leaves the bullpen certainly has a greater influence on the short term, i.e., the game at hand. The factors are exacerbated by the fact that a baseball game is a discrete event as opposed to weather which is continuous. Discrete-time events require different math for analysis.

MetroPD
09-19-2008, 12:11 AM
Never count out the Twins to magically steal the show from the Sox.

JungleJimR
09-19-2008, 12:15 AM
Never count out the Twins to magically steal the show from the Sox.


This was a really huge win for the Twins and we can only watch to see if this turns this critical TB series their way.

voodoochile
09-19-2008, 12:54 AM
I'm not sure meteorologists are using chaos theory to forecast tomorrow's weather on the 10-o'clock news. I think you're right though --- factors that affect the outcome of a sporting event are extremely numerous. Where the weather and a sporting event differ, however, is that weather predictions are fairly accurate in the short term because the initial conditions don't often change very much. A butterfly flapping its wings in Mexico has only a long-term effect on Chicago's weather. A pitcher losing it, or finding it, once he leaves the bullpen certainly has a greater influence on the short term, i.e., the game at hand. The factors are exacerbated by the fact that a baseball game is a discrete event as opposed to weather which is continuous. Discrete-time events require different math for analysis.

For tomorrow's weather? No. What about a week from now? A month? Because of the number of different factors that can influence weather over a period of time, chaos theory does indeed rule the predictions. Anyone who tells you they can tell you with any degree of accuracy what the weather will be like in Chicago past 10 days out is lying and anything past 5 starts to get pretty fuzzy.

SBSoxFan
09-19-2008, 01:03 AM
For tomorrow's weather? No. What about a week from now? A month? Because of the number of different factors that can influence weather over a period of time, chaos theory does indeed rule the predictions. Anyone who tells you they can tell you with any degree of accuracy what the weather will be like in Chicago past 10 days out is lying and anything past 5 starts to get pretty fuzzy.

I agree completely. I didn't say chaos theory didn't rule the predictions, I just doubted that it's used in predictions we see on the news or in the newspaper. I've never asked a meteorologist, however, so I could be wrong.

And I believe they can tell you what the weather will be like tomorrow with far greater accuracy than they can tell you how the Sox will do against KC tomorrow. Of course, you and I already know the answer to that. :bandance:

delben91
09-19-2008, 07:29 AM
This was a really huge win for the Twins and we can only watch to see if this turns this critical TB series their way.

If the Sox start winning it won't matter what the Twins do.

JungleJimR
09-19-2008, 09:09 AM
If the Sox start winning it won't matter what the Twins do.



No doubt, but we've been having a very hard time winning on the road these days; and wouldn't it make you just downright giddy to go into Minn 3 games up in the loss column?

jabrch
09-19-2008, 09:59 AM
The same could be said for weather models, population models, etc.

You are entitled to that opinion. If you really believe that there is an equal level of precision between these post season models and the others you mention - then have at it. Party on. Tomorrow's weather is easier to predict for obvious reasons - you can see today's weather moving from the West... and at least have the majority of the data. Tomorrow's baseball game is largely independent of yesterday's. It is even more independent of the game played in May.

I think that's ridiculous - and as much as it is your right to believe it, it is my right to call it complete horsecrap. I spend a large portion of my day dealing with statistics and with models. If my model was as flimsy as this, I'd have no credibility. We look to Six Sigma precision. Often we get two or three sigma and work hard to move up the scale. But with so little information in a model, and so many variables out there unaccounted for, I just wouldn't even consider it as legitimate.

Again - your choice - but it is equally my choice to have an opinion.

itsnotrequired
09-19-2008, 12:11 PM
You are entitled to that opinion. If you really believe that there is an equal level of precision between these post season models and the others you mention - then have at it. Party on. Tomorrow's weather is easier to predict for obvious reasons - you can see today's weather moving from the West... and at least have the majority of the data. Tomorrow's baseball game is largely independent of yesterday's. It is even more independent of the game played in May.

I think that's ridiculous - and as much as it is your right to believe it, it is my right to call it complete horsecrap. I spend a large portion of my day dealing with statistics and with models. If my model was as flimsy as this, I'd have no credibility. We look to Six Sigma precision. Often we get two or three sigma and work hard to move up the scale. But with so little information in a model, and so many variables out there unaccounted for, I just wouldn't even consider it as legitimate.

Again - your choice - but it is equally my choice to have an opinion.

I just don't see the reason to get bent out of shape over a postseason odds report. BP isn't giving odds of wins for individual games but rather the remaining games as a whole. Using numbers and trends over the season, they used a model to run some simulated games using Monte Carlo methods and come out with cumulative results. It isn't like

Frankly, I don't know what inputs they put into the model so can't really say if it is a garbage in-garbage out type of scenario.

jabrch
09-19-2008, 12:26 PM
I just don't see the reason to get bent out of shape over a postseason odds report.

Me neither.

BP isn't giving odds of wins for individual games but rather the remaining games as a whole.

I understand - and I don't think that those odds are of any value.

Frankly, I don't know what inputs they put into the model so can't really say if it is a garbage in-garbage out type of scenario.

While that is very sensible, the fact of the matter is that many of these inputs are unquantifiable, inconsistent, and not projectable. You can do a Monte Carlo simulation easily if you can project the output of an event within the subset of events, and then forecast its likelihood.

I love stats. That's the funny thing about it. I just hate the misuse of them.

kjhanson
09-19-2008, 02:09 PM
I love stats. That's the funny thing about it. I just hate the misuse of them.

You're in the wrong place, man, let me tell you. There are a lot of people on this forum, and many others, that take one elementary stats class and anoint themselves the next Bill James.

Now I'd personally like to think I have a bit more credibility than the average poster, given my degree in Statistics from a top-five University, and my day-to-day use of statistics. As such, I've already pointed out how bad Cool Standings is.

The truth of the matter is, the probability of a coin landing on heads when you flip it is 50%. And the probability the White Sox are going to the playoffs is 0 or 1. (::awaiting backlash for this one::)

Let's make it 1.

Eddo144
09-19-2008, 03:19 PM
The truth of the matter is, the probability of a coin landing on heads when you flip it is 50%. And the probability the White Sox are going to the playoffs is 0 or 1.
I'll bite. The outcome that the Sox make the playoffs is 0 or 1. That's the same outcome of a coinflip (arbitrarily setting tails to 0 and heads to 1, for instance).
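
A quick way to see the difference between an outcome and a probability, sketched in Python (the function name and the 0.5 default are assumptions for illustration): every single flip comes out 0 or 1, but the share of 1s over many flips settles near the underlying probability.

```python
import random

def flip_many(p_heads=0.5, flips=10_000, seed=42):
    """Return a list of 0/1 outcomes from repeated (possibly biased) coin flips."""
    rng = random.Random(seed)
    return [1 if rng.random() < p_heads else 0 for _ in range(flips)]

outcomes = flip_many()
print(set(outcomes))                  # each outcome is 0 or 1
print(sum(outcomes) / len(outcomes))  # the frequency lands close to p_heads
```

A season is played only once, so the estimate has to come from a model rather than from repeated trials, but the quantity being estimated is the same kind of number.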

JungleJimR
09-21-2008, 11:15 AM
I think that's ridiculous - and as much as it is your right to believe it, it is my right to call it complete horsecrap. I spend a large portion of my day dealing with statistics and with models. If my model was as flimsy as this, I'd have no credibility. We look to Six Sigma precision. Often we get two or three sigma and work hard to move up the scale. But with so little information in a model, and so many variables out there unaccounted for, I just wouldn't even consider it as legitimate.

Again - your choice - but it is equally my choice to have an opinion.

I'm afraid that if our social and medical sciences were required to match this level of precision in order to infer causality then 99% of their most important work would be thrown out! We wouldn't know anything about what makes for better academic performance; about what important factors contribute to heart disease or the various cancers; to maternal and child health.

I, as you, have worked in manufacturing facing the challenges of controlling processes in order to limit end product variances to far less than one %. It's essential for us to not only identify, but to control every variable causing these product variances.

In contrast, the social and medical sciences work in far more complex causal systems, where most researchers would be ecstatic if they identified those causes that could account for only 50-60% of the resulting variance. A very robust statistical science has evolved that governs researchers at every point in their effort from sample size and selection to operationalization of variables to the confidence levels supported by their results. Every year, vast amounts of research in complex systems produce very meaningful results, without having to account for all variance, or even most of it.

Re: 0 or 1 as the only possible outcomes - Yes, each outcome has a binary value, but the probability (before the event) of each outcome does not; it lies somewhere between 0 and 1. Can you see that our object here is to estimate that?
If not, then perhaps we are burdened with a widespread need for certainty. But isn't it the case in post-quantum times that the one thing we know for certain is that there is no certainty? Nothing is certain - everything has a probability distribution, everything.

But let's not go there, but only to the vastly more trivial level of figuring out what our Sox chances are to win the Central.

JungleJimR
09-21-2008, 11:19 AM
Coolstandings.com runs a million computer simulations each day to assess playoff probabilities. It takes into account all of the various home/away and strength of schedule dimensions. Check it out at: www.espn.com/mlb/standings (http://www.espn.com/mlb/standings) (look to the 3 columns on the right)

The following is last week's up and down history of the Sox Probability for winning the Central Divn:

End of - Prob - Events
Fri - 53% - Start of week
Sat - 44% - Sox rained out - Twins win DH
Sun - 56% - Sox win DH - Twins lose
Mon - 54% - Sox and Twins lose
Tues - 67% - Sox win - Twins lose
Wed - 72% - Sox and Twins lose
Thur - 55% - Sox lose, Twins have big, big win
Fri - 72% - Sox win - Twins lose --- System really values a 3 game loss differential
Sat - 71% - Sox and Twins lose - System pauses before the big series

Let's go up by 4 in the loss column before Tuesday.

LJS1993
09-21-2008, 11:21 AM
Man I love this. Another pencil necked poindexter who never even had a sniff of the game in real life. Run your simulations all you want but it fails to take in so many factors that make baseball so great. :rolling::rolling:

MISoxfan
09-21-2008, 12:29 PM
I don't even care a little about these probability results, but you LJS are an *******.

Mod Edit: Why you decided to make this post right at the crucial point in the season is beyond me. It's clearly spelled out that personal attacks will not be tolerated. I was nice - I only gave you 3 days. In the future if you have a problem with another poster, simply ignore them or if you feel they are violating board policy, report their post.

kjhanson
09-21-2008, 12:41 PM
Let me point out the obvious again:

Wednesday: Magic Number = 9, 11 games to play and we are at 72%
Friday: Magic Number = 7, 9 games to play and we are still at 72%
Saturday: Magic Number = 6, 8 games to play and we are at 71%

How in the hell do our "odds" go down from Friday to Saturday when we shave a game off the magic number and have one less game to play?

LJS1993
09-21-2008, 12:42 PM
I don't even care a little about these probability results, but you LJS are an *******.

Mod Edit: Why you decided to make this post right at the crucial point in the season is beyond me. It's clearly spelled out that personal attacks will not be tolerated. I was nice - I only gave you 3 days. In the future if you have a problem with another poster, simply ignore them or if you feel they are violating board policy, report their post.

Cool, I'm glad you have an opinion on me. Like me, hate me, call me an *******, but at least have an opinion.

itsnotrequired
09-21-2008, 12:56 PM
Let me point out the obvious again:

Wednesday: Magic Number = 9, 11 games to play and we are at 72%
Friday: Magic Number = 7, 9 games to play and we are still at 72%
Saturday: Magic Number = 6, 8 games to play and we are at 71%

How in the hell do our "odds" go down from Friday to Saturday when we shave a game off the magic number and have one less game to play?

Which site are you looking at? BP has the Sox at 87.0% right now. They did drop 0.75% from yesterday but that is within the noise of the Monte Carlo method they use to come up with their odds.

Sox probability is up 28% over the last week even though they haven't increased their lead over Minnesota. Makes sense as like you pointed out, games were knocked off the schedule.

:shrug:

voodoochile
09-21-2008, 12:59 PM
Cool, I'm glad you have an opinion on me. Like me, hate me, call me an *******, but at least have an opinion.

Classic troll reply... I don't care if you agree with my posts, just so long as you think about me...

kjhanson
09-21-2008, 01:51 PM
Which site are you looking at? BP has the Sox at 87.0% right now. They did drop 0.75% from yesterday but that is within the noise of the Monte Carlo method they use to come up with their odds.

Sox probability is up 28% over the last week even though they haven't increased their lead over Minnesota. Makes sense as like you pointed out, games were knocked off the schedule.

:shrug:

I was referring to CoolStandings' numbers, which Jim has been updating in his original post. BP has had the White Sox in the 80s all week, which, I think from a betting perspective (which is all these are good for), is a more accurate figure.

JungleJimR
09-21-2008, 02:19 PM
I was referring to CoolStandings' numbers, which Jim has been updating in his original post. BP has had the White Sox in the 80s all week, which, I think from a betting perspective (which is all these are good for), is a more accurate figure.


What is/are the Vegas line(s) for the Sox winning the Divn?

RockyMtnSoxFan
09-22-2008, 12:02 PM
At least most of the people arguing in this thread seem to have some idea about statistics. That makes it more interesting.

You are entitled to that opinion. If you really believe that there is an equal level of precision between these post season models and the others you mention - then have at it. Party on. Tomorrow's weather is easier to predict for obvious reasons - you can see today's weather moving from the West, and at least have the majority of the data. Tomorrow's baseball game is largely independent of yesterday's. It is even more independent of the game played in May.



I think the point about baseball games being independent is important. If you use this as an assumption for a model, you can look at the probability distributions for scoring and allowing runs (i.e., the probability that your team will score four runs in a game). If you also assume that, in any given game, the home and away teams score independently of each other, you can create a model which should be accurate. The problem with the last assumption comes from bullpen usage. A manager will put in his best reliever if he's ahead by a run, but someone else if he's down by three.

Anyway, I created a model like this, and found that it was slightly more accurate than the Pythagorean model. Then I found a paper by a mathematician from Brown University stating that the Pythagorean model could be derived from mine by using the appropriate curve fits.

So basically, (as I understand it) the coolstandings model is based on assumptions about the independence of scoring events, which may or may not be completely independent, but nevertheless provides some useful information.

JungleJimR
09-22-2008, 01:04 PM
At least most of the people arguing in this thread seem to have some idea about statistics. That makes it more interesting.
.......

So basically, (as I understand it) the coolstandings model is based on assumptions about the independence of scoring events, which may or may not be completely independent, but nevertheless provides some useful information.

(First part) - Yes, but, as I have previously stated, it seems that some, if not most who have commented in this thread, abhor thinking in terms of probabilities. And, therefore, much of what you and I are saying here falls on deaf ears.

(Second part) - The fine print in the coolstandings model speaks to adding weight to recent trends, so there is perhaps a very small appreciation. And since they use a multiple regression model in several stages of the model, they could have some kind of auto-correlation variable. It seems that you could answer that better than I can.

jabrch
09-22-2008, 01:09 PM
(First part) - Yes, but, as I have previously stated, it seems that some, if not most who have commented in this thread, abhor thinking in terms of probabilities. And, therefore, much of what you and I are saying here falls on deaf ears.

When the probabilities are real - nobody abhors them.

What's the probability of a coin toss? What's the probability of busting when you hit on 16? What's the probability of a machine creating X widgets that pass a quality test in Y minutes of operating - and if so, what's the probability that machine lasts Z hours before needing maintenance?

All of this can be forecasted.

What's the probability of the Sox winning the division? That's voodoo (and I'm not referring to the WSI mod) not probability.

jabrch
09-22-2008, 01:11 PM
What is/are the Vegas line(s) for the Sox winning the Divn?

A month ago, it was something like 6:5...you won't even get even money right now - if that bet is even still on. I imagine that it is off and now all you could get would be Pennant or WS. And those odds would be about 3:1 and 5:1 if I had to guess.

JungleJimR
09-22-2008, 05:44 PM
When the probabilities are real - nobody abhors them.

What's the probability of a coin toss? All of this can be forecasted.

What's the probability of the Sox winning the division? That's voodoo (and I'm not referring to the WSI mod) not probability.

I no doubt overstated in using "abhor", maybe "resistant to use in baseball games" would have been more appropriate, sorry to offend, but at least it stimulated a reaction, so let's discuss.

Yes, a coin toss has a widely accepted probability distribution for the outcome of heads - with an expected value of 50%. But obviously over a specific sample with a size of 100 one might come up with 47%, or 53%, or even 20%, although these odds are rather remote. The point is that there is a probability distribution for this event that has well-established parameters - binomial, and approximately normal for a sample of this size, with certain tail lengths for confidence levels, a standard error, etc.
For multiple-toss coin events, there are also probabilities for certain combinations of outcomes, all based on the individual P of each event and the joint P of multiple events. I am starting to go beyond my memory of correct terminology in working in permutation and combination land, so bear with me.

The probability of getting a minimum of one heads outcome when tossing the coin 2 times is 75%: 3 out of the 4 possible toss outcomes in tossing a coin 2 times have heads in the resulting combinations.
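
One way to check that 75% figure is to list every equally likely sequence of tosses and count the ones that contain a head. The Python sketch below does that under the assumption of a fair coin; the function name `p_at_least_one_head` is only illustrative.

```python
from itertools import product

def p_at_least_one_head(n_tosses: int) -> float:
    """Enumerate all equally likely H/T sequences and count those with a head."""
    outcomes = list(product("HT", repeat=n_tosses))
    hits = sum(1 for seq in outcomes if "H" in seq)
    return hits / len(outcomes)

print(p_at_least_one_head(2))  # 0.75 (HH, HT, TH out of HH, HT, TH, TT)
```

The same count can be had from 1 - (1/2)^n, which is the chance of not getting tails on every toss.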

Conveniently, the outcome of a baseball game is also binary - a win or a loss. I would contend that the probability of each event varies somewhat from 50% depending on all of the wonderful factors that we fans look at - Home adds a little, away subtracts; pitching matchups work one way or the other, and the overall relative strength of each team - heck, why not use current overall season winning %, or the last month % at the location with that pitcher, with that lineup, against the other team's lineup. The point is the chances for a Sox win in a particular game can be established by knowledgeable fans - we could find out the mean value of 25 Sox fans and establish a distribution curve around that mean. (Actually countless arguments in WSI are around how far the P varies from 50%) If you don't agree with this general postulate, then do not read any further.

Now after establishing the probability of a Sox win in each game, we could also establish the P distribution curve for a Twins loss in each game. Again, conveniently, P in 3 of these games has already been established, since in the head-to-head games a Twins loss is a Sox win. All we Sox fans need right now, from all of the outcomes for these last 9 games - 3 joint and 6 separate - is a minimum of 5 outcomes that have either the Sox winning or the Twins losing. As in the minimum of one heads outcome in 2 tosses where the P of each event is 50%, we are looking for a minimum of 5 outcomes out of all 18 possible outcomes. If you are still reading, you get the point.

By all means we could establish such a P for our chances for winning the Central.
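
As a rough illustration of that counting, here is a Python sketch that enumerates every result of the remaining games and adds up the probability of the ones that reach the needed total. Everything in it is a placeholder assumption rather than real schedule data: the 3/3/3 split of head-to-head, Sox-only and Twins-only games, the flat 50% chance per game, the choice to count a head-to-head Sox win twice (as a Sox win and a Twins loss), and the name `clinch_probability`.

```python
from itertools import product

def clinch_probability(head_to_head=3, sox_only=3, twins_only=3,
                       needed=5, p_sox_win=0.5, p_twins_loss=0.5):
    """Exact P(Sox wins + Twins losses >= needed) by full enumeration.

    All defaults are hypothetical placeholders, not real schedule data.
    """
    n_games = head_to_head + sox_only + twins_only
    total = 0.0
    for results in product([0, 1], repeat=n_games):
        h2h = results[:head_to_head]
        sox = results[head_to_head:head_to_head + sox_only]
        twins = results[head_to_head + sox_only:]
        # 1 in a head-to-head or Sox-only slot is a Sox win;
        # 1 in a Twins-only slot is a Twins loss.
        # A head-to-head Sox win is also a Twins loss, so it counts twice.
        count = 2 * sum(h2h) + sum(sox) + sum(twins)
        if count < needed:
            continue
        prob = 1.0
        for r in h2h + sox:
            prob *= p_sox_win if r else 1 - p_sox_win
        for r in twins:
            prob *= p_twins_loss if r else 1 - p_twins_loss
        total += prob
    return total

print(round(clinch_probability(), 3))  # 0.754 under the coin-flip placeholders
```

Swapping in a per-game P for each matchup instead of the flat 50% would be the natural next step; the enumeration itself stays the same.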

JungleJimR
09-22-2008, 05:59 PM
Coolstandings.com runs a million computer simulations each day to assess playoff probabilities. It takes into account all of the various home/away and strength of schedule dimensions. Check it out at: www.espn.com/mlb/standings (http://www.espn.com/mlb/standings) (look to the 3 columns on the right)

The following is last week's up and down history of the Sox Probability for winning the Central Divn:

End of - Prob - Events
Fri - 53% - Start of week
Sat - 44% - Sox rained out - Twins win DH
Sun - 56% - Sox win DH - Twins lose
Mon - 54% - Sox and Twins lose
Tues - 67% - Sox win - Twins lose
Wed - 72% - Sox and Twins lose
Thur - 55% - Sox lose, Twins have big, big win
Fri - 72% - Sox win - Twins lose --- System really values a 3 game loss differential
Sat - 71% - Sox and Twins lose - System pauses before the big series

Sun - 71% - Sox and Twins win - System has Sox winning at least one game in Minn as much better than 50/50.

jabrch
09-22-2008, 06:02 PM
we could find out the mean value of 25 Sox fans and establish a distribution curve around that mean. (Actually countless arguments in WSI are around how far the P varies from 50%)

.
.
.

By all means we could establish such a P for our chances for winning the Central.

But that isn't the P of the Sox winning. That is the perception of a fan - even a knowledgeable one - which is not based on the same level of mathematical rigour that is needed to draw that conclusion. You'd be determining not the P of winning, rather the Perceived P of winning which is very subjective.

I love using math where you have a mathematic equation. I hate using math as a proxy for emotion/gut/etc. and then using calculations to hide the reality.

If a fan wants to say, "I KNOW we are going to win!" that's fine. We all know they don't "KNOW" it - they feel it. But all the math in the world doesn't change the fact that today's baseball game is based on hundreds and hundreds of events, ranging from if player X strains his neck sleeping last night, to the number of average runs scored last week, last month, etc. But why the math? Why the need to mathematically predict the unpredictable?

JungleJimR
09-22-2008, 06:17 PM
But that isn't the P of the Sox winning. That is the perception of a fan - even a knowledgeable one - which is not based on the same level of mathematical rigour that is needed to draw that conclusion. You'd be determining not the P of winning, rather the Perceived P of winning which is very subjective.

I love using math where you have a mathematic equation. I hate using math as a proxy for emotion/gut/etc. and then using calculations to hide the reality.

If a fan wants to say, "I KNOW we are going to win!" that's fine. We all know they don't "KNOW" it - they feel it. But all the math in the world doesn't change the fact that today's baseball game is based on hundreds and hundreds of events, ranging from if player X strains his neck sleeping last night, to the number of average runs scored last week, last month, etc. But why the math? Why the need to mathematically predict the unpredictable?

Well, I just don't know why you have a problem with assessing a probability to an event like the outcome of a Sox game. Don't you kind of wonder, or feel, something about our chances in tomorrow's game in Minn? I mean, who is pitching for us? I would say we should not be favored since we play sh..... up there and all, but I wouldn't go much below 43%, maybe 44.5% since there will be incredible pressure on the Twinkies. And..
So, what do you think our chances are? And you better defend your answer.

jabrch
09-22-2008, 06:24 PM
Well, I just don't know why you have a problem with assessing a probability to an event like the outcome of a Sox game. Don't you kind of wonder, or feel, something about our chances in tomorrow's game in Minn?

Absolutely - but my gut feel is not a mathematical probability and if I were to assign a number to it, it would be completely random and not a mathematical number at all.

I mean, who is pitching for us? I would say we should not be favored since we play sh..... up there and all, but I wouldn't go much below 43%, maybe 44.5% since there will be incredible pressure on the Twinkies. And..

But that's not math. That's not a probability. That's GUESSING and assigning a number to it.

So, what do you think our chances are? And you better defend your answer.

I think it is somewhere above 50%. But that's my own opinion. It isn't a fact, and I can't use it to create anything other than math based on my gut feel - which is not probability.

JungleJimR
09-22-2008, 11:02 PM
I think it is somewhere above 50%. But that's my own opinion. It isn't a fact, and I can't use it to create anything other than math based on my gut feel - which is not probability.


There you go! The water is not too cold, is it?

Yet, your response betrays an unmistakable Sox bias - certainly understandable given that we are on WSI, but I took you for a really cold-number, objective engineer.
No way are the Sox, or should the Sox be, at or above 50% in any of the games in Minn. Although they are odds-on to win one of them, which is all they need.
Sox on road winning % - .45
(1 - Minn at home %) - .35
Vazquez on road % - .38
(1 - Baker at home %) - .25
Ave - .33
Add 5% for improvement in Aug and Sept for Vazquez and that Baker's high % is based on small sample
Then Sox chance is 38% - 40%. Pretty much in line w Vegas line.
Really hope our Sox win this one, but they won't be favored. But Minn should be under a lot of pressure and maybe this should be factored in.

jabrch
09-22-2008, 11:34 PM
There you go! The water is not too cold, is it?

If you are asking my personal opinion - I can give it to you. I can even associate a bull**** percentage to it with nothing more than directional indicators. But that doesn't make it math. That doesn't make it statistics. That doesn't make it meaningful or significant. Applying crappy numbers to something is not probability if that isn't derived from fact.

It has nothing to do with water being cold.

Yet, your response betrays an unmistakable Sox bias - certainly understandable given that we are on WSI, but I took you for a really cold-number, objective engineer.

Screw that...this isn't my job.

No way are the Sox, or should the Sox be, at or above 50% in any of the games in Minn. Although they are odds-on to win one of them, which is all they need.

That's your opinion - and there is no fact to support it.


Sox on road winning % - .45
(1 - Minn at home %) - .35
Vazquez on road % - .38
(1 - Baker at home %) - .25
Ave - .33
Add 5% for improvement in Aug and Sept for Vazquez and that Baker's high % is based on small sample
Then Sox chance is 38% - 40%.

That is of very little relevance to a single isolated game being played tomorrow. Tomorrow's game will be played on the field, independent of all of the data you provided. None of that means much.

That sort of math works just fine if you are evaluating firm mathematical probabilities. Remember the story problems we had in school - where there are no variables that are uncontrolled or unknown...

The real world, on a baseball field, doesn't resemble a simple story problem.

Pretty much in line w Vegas line.

Which has little to do with probability and everything to do with money flow.

SBSoxFan
09-23-2008, 12:10 AM
There you go! The water is not too cold, is it?

Yet, your response betrays an unmistakable Sox bias - certainly understandable given that we are on WSI, but I took you for a really cold-number, objective engineer.
No way are the Sox, or should the Sox be, at or above 50% in any of the games in Minn. Although they are odds-on to win one of them, which is all they need.
Sox on road winning % - .45
(1 - Minn at home %) - .35
Vazquez on road % - .38
(1 - Baker at home %) - .25
Ave - .33
Add 5% for improvement in Aug and Sept for Vazquez and that Baker's high % is based on small sample
Then Sox chance is 38% - 40%. Pretty much in line w Vegas line.
Really hope our Sox win this one, but they won't be favored. But Minn should be under a lot of pressure and maybe this should be factored in.

Did you just take an average of probabilities to determine the overall probability of an outcome occurring? :thud:

[quote=jabrch;2049404]If you are asking my personal opinion - I can give it to you. I can even associate a bull**** percentage to it with nothing more than directional indicators. But that doesn't make it math. That doesn't make it statistics. That doesn't make it meaningful or significant. Applying crappy numbers to something is not probability if that isn't derived from fact.

It has nothing to do with water being cold.

[/quote]

Sure it does. Water is cold and not cold to some degree. It's called fuzzy logic. It's math. Some would argue it's a better approach than probability. While the outcome, a win or a loss, is crisp, all the factors that go into determining the win or loss are not crisp; they are fuzzy.

JungleJimR
09-23-2008, 11:07 AM
Did you just take an average of probabilities to determine the overall probability of an outcome occurring? :thud:

[quote=jabrch;2049404]If you are asking my personal opinion - I can give it to you. I can even associate a bull**** percentage to it with nothing more than directional indicators. But that doesn't make it math. That doesn't make it statistics. That doesn't make it meaningful or significant. Applying crappy numbers to something is not probability if that isn't derived from fact.

It has nothing to do with water being cold.

[/quote]

Sure it does. Water is cold and not cold to some degree. It's called fuzzy logic. It's math. Some would argue it's a better approach than probability. While the outcome, a win or a loss, is crisp, all the factors that go into determining the win or loss are not crisp; they are fuzzy.

My oh my. Averages of averages are much better, statistically speaking, than a simple average - just like stratified random sampling is much better than simple random (more efficient for the same sample size), and samples of samples are better than just one huge sample when looking at continuous improvement, etc. Check out www.encyclopedia.farlex.com/sampling+(statistics (http://www.encyclopedia.farlex.com/sampling+(statistics)) --- look under choosing a sample. I needn't say anything more.

RockyMtnSoxFan
09-23-2008, 11:14 AM
But that isn't the P of the Sox winning. That is the perception of a fan - even a knowledgeable one - which is not based on the same level of mathematical rigour that is needed to draw that conclusion. You'd be determining not the P of winning, rather the Perceived P of winning which is very subjective.

I love using math where you have a mathematic equation. I hate using math as a proxy for emotion/gut/etc. and then using calculations to hide the reality.

If a fan wants to say, "I KNOW we are going to win!" that's fine. We all know they don't "KNOW" it - they feel it. But all the math in the world doesn't change the fact that today's baseball game is based on hundreds and hundreds of events, ranging from if player X strains his neck sleeping last night, to the number of average runs scored last week, last month, etc. But why the math? Why the need to mathematically predict the unpredictable?

I think JungleJim made an excellent point above, and I will add to that. By looking at a team's histogram of scoring, you can create a probability distribution for how they will score in a future game. For example, a team with a good offense has maybe scored five runs in a game on 12 occasions during a season, and two runs on only 6 occasions (I'm making these numbers up), so they have a 12/162, or 7.4%, chance of scoring five runs, and a 6/162, or 3.7%, chance of scoring two runs. If you calculate these numbers for every run total on both offense and defense, you can find the P that they will score x number of runs and allow y number of runs.

Now, if you take two teams and do this, you can predict the outcome of future games. Yes, there are assumptions that go into these predictions (offense and defense in any given game are independent of each other, a team is likely to hit and pitch similarly to its history, etc), but if you recognize these assumptions you can still obtain meaningful results. These predictions are derived from real events and are not based on emotions, and the fact that anything can happen on the field is reflected by the trend of the predicted outcomes to 50%. What I mean by this is that, because there is such a wide variation in scoring from game to game, the model will generally predict the chance of winning closer to 50% than to 0% or 100%. The variable events that happen on the field might not be predicted directly, but they are indirectly taken into account by the uncertainty in the model.
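
For anyone who wants to see the histogram step concretely, here is a small Python sketch. It turns a list of per-game run totals into the kind of probability table described above. The season data is invented to match the made-up 12-of-162 and 6-of-162 example, and the name `run_distribution` is just for illustration.

```python
from collections import Counter

def run_distribution(runs_per_game):
    """Map each run total to the fraction of games in which it occurred."""
    counts = Counter(runs_per_game)
    n = len(runs_per_game)
    return {runs: c / n for runs, c in sorted(counts.items())}

# Invented season: 12 five-run games, 6 two-run games, and the other
# 144 games lumped in at three runs purely to reach 162.
season = [5] * 12 + [2] * 6 + [3] * 144
dist = run_distribution(season)
print(f"{dist[5]:.1%} {dist[2]:.1%}")  # 7.4% 3.7%
```

Doing the same for runs allowed gives the second table, and the two tables together are all the model needs as input.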

voodoochile
09-23-2008, 11:21 AM
I think JungleJim made an excellent point above, and I will add to that. By looking at a team's histogram of scoring, you can create a probability distribution for how they will score in a future game. For example, a team with a good offense has maybe scored five runs in a game on 12 occasions during a season, and two runs on only 6 occasions (I'm making these numbers up), so they have a 12/162, or 7.4%, chance of scoring five runs, and a 6/162, or 3.7%, chance of scoring two runs. If you calculate these numbers for every run total on both offense and defense, you can find the P that they will score x number of runs and allow y number of runs.

Now, if you take two teams and do this, you can predict the outcome of future games. Yes, there are assumptions that go into these predictions (offense and defense in any given game are independent of each other, a team is likely to hit and pitch similarly to its history, etc), but if you recognize these assumptions you can still obtain meaningful results. These predictions are derived from real events and are not based on emotions, and the fact that anything can happen on the field is reflected by the trend of the predicted outcomes to 50%. What I mean by this is that, because there is such a wide variation in scoring from game to game, the model will generally predict the chance of winning closer to 50% than to 0% or 100%. The variable events that happen on the field might not be predicted directly, but they are indirectly taken into account by the uncertainty in the model.

Sorry, that's way too simplistic...

Who were the opposing pitchers on the days they scored those various run totals?

Which way was the wind blowing on those days?

Were there any starters on the DL when they scored 2 runs?

Was the manager playing a "getaway day lineup" when they scored 2 runs?

Were the results bunched?

Were players slumping or on a hot streak?

Those are all basic examples of day to day things that can influence the total number of runs the team will score on any given day.

That's before getting into the human element to a large degree also - home/emotional problems, hangovers, feeling owey (but not to the degree that requires a day off), etc.

Until you can model ALL of these factors and many more effectively, you cannot expect to predict run totals let alone wins and losses to a large degree of accuracy, IMO.

Breaking it down to expected runs or W/L% home and road doesn't start to tell the story...

RockyMtnSoxFan
09-23-2008, 11:43 AM
Sorry, that's way too simplistic...

Who were the opposing pitchers on the days they scored those various run totals?

Which way was the wind blowing on those days?

Were there any starters on the DL when they scored 2 runs?

Was the manager playing a "getaway day lineup" when they scored 2 runs?

Were the results bunched?

Were players slumping or on a hot streak?

Those are all basic examples of day to day things that can influence the total number of runs the team will score on any given day.

That's before getting into the human element to a large degree also - home/emotional problems, hangovers, feeling owey (but not to the degree that requires a day off), etc.

Until you can model ALL of these factors and many more effectively, you cannot expect to predict run totals let alone wins and losses to a large degree of accuracy, IMO.

Breaking it down to expected runs or W/L% home and road doesn't start to tell the story...

You're right, there are many factors that aren't accounted for. But that's the point: they can't be predicted. Especially for games further than two weeks out, there are too many variables to model everything. But we're not claiming to model everything. We're not saying that the model is 100% correct. But if you compare two teams and find that one generally scores more runs and allows fewer runs, you can estimate the probability that team will win. Because of all of the factors that aren't modeled, the estimate will be close to 50% (which is the "who knows" value) and the uncertainty will be high. I'm not saying that this is a perfect prediction, just that it is a useful measurement which provides information.

Eddo144
09-23-2008, 12:23 PM
Sorry, that's way too simplistic...

Who were the opposing pitchers on the days they scored those various run totals?

Which way was the wind blowing on those days?

Were there any starters on the DL when they scored 2 runs?

Was the manager playing a "getaway day lineup" when they scored 2 runs?

Were the results bunched?

Were players slumping or on a hot streak?

Those are all basic examples of day to day things that can influence the total number of runs the team will score on any given day.

That's before getting into the human element to a large degree also - home/emotional problems, hangovers, feeling owey (but not to the degree that requires a day off), etc.

Until you can model ALL of these factors and many more effectively, you cannot expect to predict run totals let alone wins and losses to a large degree of accuracy, IMO.

Breaking it down to expected runs or W/L% home and road doesn't start to tell the story...
Yeah, all those factors affect the number of runs scored on a specific day. However, because of the sheer number of games played, things like wind and weather and quality of opposing pitchers tend to even out over the course of a season. Therefore, you can reasonably use a team's averages to make predictions.

For example, take tonight's game (I'm going to make this overly simplistic in terms of percentages, but you get the idea). You might find that the Sox run-scoring breakdown is...
0: 2%
1: 6%
2: 12%
3: 20%
4: 25%
5: 20%
6+: 15%

You might find that Baker gives up the following runs:
0: 1%
1: 3%
2: 6%
3: 15%
4: 35%
5: 30%
6+: 10%

Therefore, if you run 1000 simulations of the game, in 250 of those, the Sox will have scored 4 runs. Baker will have given up less than 4 runs in 25% of those 250 games, or 62.5. That would result in 187.5 wins for the Sox. You could then repeat that for the number of times the Sox score 0, 1, 2, 3, 5, and 6+ runs, and come up with an expected win percentage for tonight's game.
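
Here is a minimal Python sketch of that kind of simulation, using the two made-up tables above. It rests on assumptions that are mine rather than anything established in the thread: the second table is read as the runs the opposing team scores, the two draws are independent, the "6+" bucket is treated as a single value (stored under key 6), and games landing in the same bucket are reported as ties instead of being played out. The names `SOX`, `OPP` and `simulate` are illustrative.

```python
import random

# Run tables from the example; key 6 stands for the "6+" bucket.
SOX = {0: 0.02, 1: 0.06, 2: 0.12, 3: 0.20, 4: 0.25, 5: 0.20, 6: 0.15}
OPP = {0: 0.01, 1: 0.03, 2: 0.06, 3: 0.15, 4: 0.35, 5: 0.30, 6: 0.10}

def simulate(n_games=1000, seed=0):
    """Draw both run totals independently; return (wins, ties, losses)."""
    rng = random.Random(seed)  # fixed seed so a rerun gives the same counts
    sox_runs = rng.choices(list(SOX), weights=list(SOX.values()), k=n_games)
    opp_runs = rng.choices(list(OPP), weights=list(OPP.values()), k=n_games)
    wins = sum(s > o for s, o in zip(sox_runs, opp_runs))
    ties = sum(s == o for s, o in zip(sox_runs, opp_runs))
    return wins, ties, n_games - wins - ties

wins, ties, losses = simulate()
print(wins, ties, losses)
```

With only 1000 games the counts wobble by a few percentage points from run to run, which is one reason the real sites run far more simulations.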

jabrch
09-23-2008, 01:04 PM
Sorry, that's way too simplistic...

Who were the opposing pitchers on the days they scored those various run totals?

Which way was the wind blowing on those days?

Were there any starters on the DL when they scored 2 runs?

Was the manager playing a "getaway day lineup" when they scored 2 runs?

Were the results bunched?

Were players slumping or on a hot streak?

Those are all basic examples of day-to-day things that can influence the total number of runs the team will score on any given day.

That's before getting into the human element, which also matters to a large degree: home/emotional problems, hangovers, feeling owey (but not to the degree that requires a day off), etc.

Until you can model ALL of these factors and many more effectively, you cannot expect to predict run totals, let alone wins and losses, with a high degree of accuracy, IMO.

Breaking it down to expected runs or W/L% home and road doesn't start to tell the story...

And that's just about 1% of the variables that this grade school analysis doesn't account for.

RockyMtnSoxFan
09-23-2008, 01:17 PM
Yeah, all those factors affect the number of runs scored on a specific day. However, because of the sheer number of games played, things like wind and weather and quality of opposing pitchers tend to even out over the course of a season. Therefore, you can reasonably use a team's averages to make predictions.

For example, take tonight's game (I'm going to make this overly simplistic in terms of percentages, but you get the idea). You might find that the Sox run-scoring breakdown is...
0: 2%
1: 6%
2: 12%
3: 20%
4: 25%
5: 20%
6+: 15%

You might find that Baker gives up the following runs:
0: 1%
1: 3%
2: 6%
3: 15%
4: 35%
5: 30%
6+: 10%

Therefore, if you run 1000 simulations of the game, in 250 of those, the Sox will have scored 4 runs. Baker will have given up fewer than 4 runs in 25% of those 250 games, or 62.5. That would result in 62.5 wins for the Sox out of those 250 games (the other 187.5 being ties or losses). You could then repeat that for the number of times the Sox score 0, 1, 2, 3, 5, and 6+ runs, and come up with an expected win percentage for tonight's game.


Exactly. But you could even do it without simulations. For example, consider the case where the Sox score 4 runs (25% by your numbers). There is a 25% chance that the Twins will score 3 or fewer and a 40% chance that they will score 5 or more. Do this for all the Sox scoring from 0 to 6+, and you can determine the chance that they score more than the Twins or less than the Twins. You have to be careful with the ties, which of course can't happen in a real game.

Mathematically, this is a double sum. If we were dealing with continuous rather than discrete values, it would be a double integral.
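
For what it's worth, the double sum is short enough to write out. The sketch below uses the made-up percentages from the quoted post and reads the second table as the Twins' run totals, as in the paragraph above; the function name `matchup` is just for illustration, and "6+" is kept as one bucket, so a 6+ versus 6+ game is counted as a tie. Working in whole percentages keeps the arithmetic exact.

```python
# Percent chance of each run total; index 6 stands for the "6+" bucket.
SOX = [2, 6, 12, 20, 25, 20, 15]
TWINS = [1, 3, 6, 15, 35, 30, 10]   # assumption: second table read as Twins' runs

def matchup(sox, twins):
    """Double sum over every pair of run totals.

    Returns (win, tie, loss) for the Sox in units of 1/10000,
    i.e. percent times percent.
    """
    win = tie = loss = 0
    for s, p_s in enumerate(sox):
        for t, p_t in enumerate(twins):
            weight = p_s * p_t
            if s > t:
                win += weight
            elif s == t:
                tie += weight
            else:
                loss += weight
    return win, tie, loss

win, tie, loss = matchup(SOX, TWINS)
print("win %:", win / 100, "tie %:", tie / 100, "loss %:", loss / 100)

# One simple way to be careful with the ties: drop them and renormalize.
print("win % with ties removed:", round(100 * win / (win + loss), 1))
```

Dropping the ties and renormalizing is only one possible convention; it amounts to assuming a tied game is decided in the same proportions as the untied ones.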

SBSoxFan
09-23-2008, 06:40 PM
Did you just take an average of probabilities to determine the overall probability of an outcome occurring? :thud:



My oh my. Averages of averages are much better, statistically speaking, than a simple average, just as stratified random sampling is much better than simple random sampling (more efficient for the same sample size), and samples of samples are better than just one huge sample when looking at continuous improvement, etc. Check out www.encyclopedia.farlex.com/sampling+(statistics (http://www.encyclopedia.farlex.com/sampling+%28statistics)) and look under choosing a sample. I needn't say anything more.


It doesn't look like you made a new statistic or chose a sub-sample. Rather, you gave the probability of an event occurring based on the probabilities of other, conditional events occurring. I thought those were always multiplied. Maybe I'm missing something.

Anyway, fuzzy is the way to go. It can better capture the degree to which events occur versus statistics.

JungleJimR
09-24-2008, 09:40 AM


It doesn't look like you made a new statistic or chose a sub-sample. Rather, you gave the probability of an event occurring based on the probabilities of other, conditional events occurring. I thought those were always multiplied. Maybe I'm missing something.

Anyway, fuzzy is the way to go. It can better capture the degree to which events occur versus statistics.

For the averages in question, I don't believe so. Averages for the Sox away, the Twins' opponents in the Dome, Javy away, and Baker at home are all independent, except for the case(s) when all of these events occurred simultaneously. And that average, if there is one, is practically meaningless. Many fans would put a lot of faith in it: what Javy did in the Dome against Baker. But this is similar to what Javy had done in the Dome up until last night: 2-0.

JungleJimR
09-24-2008, 09:42 AM
Coolstandings.com runs a million computer simulations each day to assess playoff probabilities. It takes into account all of the various home/away and strength of schedule dimensions. Check it out at: www.espn.com/mlb/standings (http://www.espn.com/mlb/standings) (look to the 3 columns on the right)

The following is last week's up-and-down history of the Sox probability of winning the Central Division:

End of - Prob - Events
Fri - 53% - Start of week
Sat - 44% - Sox rained out/Twins win DH
Sun - 56% - Sox win DH/Twins lose
Mon - 54% - Sox and Twins lose
Tues - 67% - Sox win/Twins lose
Wed - 72% - Sox and Twins lose
Thur - 55% - Sox lose, Twins have big, big win
Fri - 72% - Sox win/Twins lose --- System really values a 3 game loss differential
Sat - 71% - Sox and Twins lose --- System pauses before the big series

Sun - 71% - Sox and Twins win - holding and waiting

Tues- 61% - Sox lose/Twins win - Sox still 2-1 to win at least one in Minn.

JungleJimR
09-24-2008, 01:34 PM
You're right, there are many factors that aren't accounted for. But that's the point: they can't be predicted. Especially for games further than two weeks out, there are too many variables to model everything. But we're not claiming to model everything. We're not saying that the model is 100% correct. But if you compare two teams and find that one generally scores more runs and allows fewer runs, you can estimate the probability that team will win. Because of all of the factors that aren't modeled, the estimate will be close to 50% (which is the "who knows" value) and the uncertainty will be high. I'm not saying that this is a perfect prediction, just that it is a useful measurement which provides information.

Are you not also coming to the conclusion that, as much as we want to enlighten some to a sense of how probabilities can provide perspective in things like this, we will fail in the end?
I really believe there are 2 kinds of people (maybe like Mars and Venus, whatever): those disposed to think in probabilistic ways ("What are my chances for this or that?") and those disposed to think in deterministic ways ("Until everything is known about what you are talking about, there is no way to answer your question. You will either win or lose this game or season. You can't fall in between.")

RockyMtnSoxFan
09-24-2008, 01:44 PM
Are you not also coming to the conclusion that, as much as we want to enlighten some to a sense of how probabilities can provide perspective in things like this, we will fail in the end?
I really believe there are 2 kinds of people (maybe like Mars and Venus, whatever): those disposed to think in probabilistic ways ("What are my chances for this or that?") and those disposed to think in deterministic ways ("Until everything is known about what you are talking about, there is no way to answer your question. You will either win or lose this game or season. You can't fall in between.")

Yeah, I think I've explained everything I can at this point. I think it also comes down to faith: people don't want to have the mystery of baseball explained away by statistics. They want to believe that it is way too complex to be explained by numbers. To some extent, I can understand this. But they refuse to consider that statistics aren't replacing the games, but just giving more information to the thinking fan.

JungleJimR
09-24-2008, 01:55 PM
Yeah, I think I've explained everything I can at this point. I think it also comes down to faith: people don't want to have the mystery of baseball explained away by statistics. They want to believe that it is way too complex to be explained by numbers. To some extent, I can understand this. But they refuse to consider that statistics aren't replacing the games, but just giving more information to the thinking fan.


You are wise beyond your years.

Nellie_Fox
09-24-2008, 02:06 PM
...But they refuse to consider that statistics aren't replacing the games, but just giving more information to the thinking fan.

Unfortunately, too many fans get so wrapped up in the stats that they adopt the attitude that if it can't be quantified, it must not be important. They therefore dismiss any aspect of the game that doesn't have a good statistical measure, and as a result don't value players whose strengths lie in those areas.

downstairs
09-24-2008, 02:11 PM
Unfortunately, too many fans get so wrapped up in the stats that they adopt the attitude that if it can't be quantified, it must not be important. They therefore dismiss any aspect of the game that doesn't have a good statistical measure, and as a result don't value players whose strengths lie in those areas.

Sometimes, yes. But not everyone. For example, stat heads know fielding is far from being accurately measurable in numbers. Very far. But I don't hear them saying fielding doesn't matter.

A good stathead merely believes that everything should somehow be explainable with stats; we just don't know all of the ways yet (and may never for some things).

I've said it before: I like to know that the White Sox have a 62% chance of making the playoffs this year. If MIN wins the Central, am I wrong? No... they just hunkered down and overcame bad odds.

Nellie_Fox
09-24-2008, 02:18 PM
I don't hear them saying fielding doesn't matter.

Maybe not in those words, but this board is loaded with people who pay no attention to anything but offense, and are fine with big-stick, no-glove guys at every position, but never want good-glove, no-stick guys at any position. I believe that attitude is directly attributable to stat geeks and fantasy leagues.

We have actually had a poster say that Aparicio "sucked" because of his OPS.

jabrch
09-24-2008, 02:27 PM
I think it also comes down to faith: people don't want to have the mystery of baseball explained away by statistics.

That's bull****. People don't want to see fake imaginary statistics used to fraudulently represent reality

They want to believe that it is way too complex to be explained by numbers. To some extent, I can understand this. But they refuse to consider that statistics aren't replacing the games, but just giving more information to the thinking fan.

So is that. The "thinking fan" is a self-aggrandizing term. Taking meaningless numbers in the context of a particular game and applying them to that specific game and then multiplying them by other equally meaningless numbers is not work that the "thinking fan" would do. It is ignorant of the level of rigor that a statistically focused person should use if (s)he is trying to actually propose mathematical significance.

Optipessimism
09-24-2008, 02:32 PM
Unfortunately, too many fans get so wrapped up in the stats that they adopt the attitude that if it can't be quantified, it must not be important. They therefore dismiss any aspect of the game that doesn't have a good statistical measure, and as a result don't value players whose strengths lie in those areas.

:gulp:

Maybe not in those words, but this board is loaded with people who pay no attention to anything but offense, and are fine with big-stick, no-glove guys at every position, but never want good-glove, no-stick guys at any position. I believe that attitude is directly attributable to stat geeks and fantasy leagues.

We have actually had a poster say that Aparicio "sucked" because of his OPS.

:gulp:

That's bull****. People don't want to see fake imaginary statistics used to fraudulently represent reality



So is that. The "thinking fan" is a self-aggrandizing term. Taking meaningless numbers in the context of a particular game and applying them to that specific game and then multiplying them by other equally meaningless numbers is not work that the "thinking fan" would do. It is ignorant of the level of rigor that a statistically focused person should use if (s)he is trying to actually propose mathematical significance.

:gulp:

JungleJimR
09-24-2008, 06:04 PM
We have actually had a poster say that Aparicio "sucked" because of his OPS.

ouch!!

Point well taken. The poster obviously never saw Little Looie play SS, like you and I. (Or, for that matter, never saw him taking a lead off 1st with Fox batting)

JungleJimR
09-25-2008, 09:36 AM
Coolstandings.com runs a million computer simulations each day to assess playoff probabilities. It takes into account all of the various home/away and strength of schedule dimensions. Check it out at: www.espn.com/mlb/standings (http://www.espn.com/mlb/standings) (look to the 3 columns on the right)

The following is last week's up-and-down history of the Sox probability of winning the Central Division:

End of - Prob - Events
Fri - 53% - Start of week
Sat - 44% - Sox rained out/Twins win DH
Sun - 56% - Sox win DH/Twins lose
Mon - 54% - Sox and Twins lose
Tues - 67% - Sox win/Twins lose
Wed - 72% - Sox and Twins lose
Thur - 55% - Sox lose, Twins have big, big win
Fri - 72% - Sox win/Twins lose --- System really values a 3 game loss differential
Sat - 71% - Sox and Twins lose --- System pauses before the big series

Sun - 71% - Sox & Twins win
Tue - 61% - Sox lose/Twins win
Wed- 49% - Sox lose/Twins win -------- This reporter disagrees, as system has trivialized the "Floyd Factor"; after all, Sox still a "loss ahead"

sox1970
09-29-2008, 11:49 PM
Coolstandings.com thinks we have a 61.8% chance of winning tomorrow....:shrug:

whitem0nkey
09-30-2008, 09:54 AM
Coolstandings.com thinks we have a 61.8% chance of winning tomorrow....:shrug:

ESPN standings say the same.

itsnotrequired
09-30-2008, 10:29 AM
56.7% on BP

Iwritecode
09-30-2008, 10:40 AM
56.7% on BP

I say it's 50%.

We either win or we don't. :shrug:

Noneck
09-30-2008, 11:45 AM
Minn -160 :(:

rustysurf83
09-30-2008, 01:21 PM
Minn -160 :(:

Everything I have seen says White Sox -147 or so.

downstairs
09-30-2008, 01:25 PM
Minn -160 :(:

Your bookie is high! Sox have killed MIN at home. Sox are the favorite, but I wouldn't say by that much.

gosox83
09-30-2008, 01:30 PM
Can anyone tell me what it is on bodoglife.com?

I can't get on that site from work.

Thanks!!

Noneck
09-30-2008, 01:52 PM
Can anyone tell me what it is on bodoglife.com?

I can't get on that site from work.

Thanks!!

Sox -143

Noneck
09-30-2008, 01:53 PM
Everything I have seen says White Sox -147 or so.
You are correct, I was wrong
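
To put the betting lines on the same scale as the percentages quoted from Coolstandings and BP, a moneyline can be turned into a break-even win probability. The sketch below uses the standard American-odds conversion; the function name `implied_probability` is made up for the example, and the result includes the bookmaker's margin, so it tends to overstate the favorite's true chance a little.

```python
def implied_probability(moneyline):
    """Break-even win probability for an American moneyline.

    Negative lines (favorites): risk |line| to win 100.
    Positive lines (underdogs): risk 100 to win line.
    """
    if moneyline < 0:
        stake = -moneyline
        return stake / (stake + 100)
    return 100 / (moneyline + 100)

# Lines mentioned in the thread.
for line in (-160, -147, -143):
    print(line, round(100 * implied_probability(line), 1))
```

On that scale, the Sox lines posted above land between the BP and Coolstandings numbers.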