
Loki Liesmith

(4,602 posts)
Sun Sep 25, 2016, 01:07 PM Sep 2016

538's estimates of probability net us almost ZERO INFORMATION

Just look at this graphic of the national poll average and the 538 probabilities



Maybe, MAYBE the 538 model is squeezing some information out at the margins. I bet I could construct a model that does as well or better than 538 simply by taking the differences in the two poll averages and converting it to a direct probability.
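One way to read "converting it to a direct probability" is to treat the poll-average lead as a noisy estimate of the final margin and ask how likely the true margin is to be above zero. The sketch below assumes a normal error with a standard deviation of 3 points. That figure and the function name `win_probability` are illustrative assumptions, not anything taken from 538 or from the post.

```python
import math

def win_probability(margin_pts, error_sd_pts=3.0):
    """Probability that the true margin is positive, assuming the true
    margin is Normal(margin_pts, error_sd_pts**2).
    error_sd_pts = 3.0 is an assumed value, not a fitted one."""
    z = margin_pts / error_sd_pts
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical leads in the national poll average, in percentage points.
for margin in (0.0, 1.0, 2.0, 3.0, 6.0):
    print(f"lead of {margin:+.1f} pts -> P(win) = {win_probability(margin):.2f}")
```

The whole mapping hangs on the assumed error width, which would have to be estimated from how far past poll averages landed from actual results.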

49 replies
538's estimates of probability net us almost ZERO INFORMATION (Original Post) Loki Liesmith Sep 2016 OP
Interesting vadermike Sep 2016 #1
This chart doesn't make me nervous Loki Liesmith Sep 2016 #2
That was a great book- The Signal And The Noise. DemocratSinceBirth Sep 2016 #3
I have a love-hate relationship with boxing Loki Liesmith Sep 2016 #6
The debate DemocratSinceBirth Sep 2016 #10
posdef he's on ludes Loki Liesmith Sep 2016 #13
State polls matter more than national averages, except that the MSM tblue37 Sep 2016 #7
The average of national polls is a very good predictor of an electoral win Loki Liesmith Sep 2016 #9
I have never understood why folks see his probability spreads as so great. wncHillsupport Sep 2016 #4
Something does bother me though... DemocratSinceBirth Sep 2016 #5
Democrats have the ground game, so we win. onehandle Sep 2016 #8
I'm in PA too. Can you dish? Loki Liesmith Sep 2016 #11
I'm fairly confident that HRC will Charles Bukowski Sep 2016 #39
You are giving yourself far too much credit CajunBlazer Sep 2016 #12
"There is no way you could do that unless you are a statistician" Loki Liesmith Sep 2016 #14
Nope, no offense, but you wouldn't have a clue how to do that CajunBlazer Sep 2016 #15
You're right Loki Liesmith Sep 2016 #17
Sounds like you are jealous of Silvers CajunBlazer Sep 2016 #25
You really don't know who you're talking to lol unblock Sep 2016 #29
And you don't know who you are talking to CajunBlazer Sep 2016 #33
Oh snap lol. Just check out loki's history before bashing him unblock Sep 2016 #34
I am A statistician BlueInPhilly Sep 2016 #18
I agree completely - but saying that two models are essentially equal in accuracy CajunBlazer Sep 2016 #24
I didn't say 2 models are equal BlueInPhilly Sep 2016 #28
I have a good background in statistics CajunBlazer Sep 2016 #35
Nope, not disputing your statement BlueInPhilly Sep 2016 #40
Yes, he's WAY Overselling considering what the data actually shows (The mark of bad science/model) Foggyhill Sep 2016 #42
One final serious post Loki Liesmith Sep 2016 #19
A less flip answer Loki Liesmith Sep 2016 #16
Doesn't he weigh his polls? BlueInPhilly Sep 2016 #20
Yes, and I actually have real problems with some of the things he weights Loki Liesmith Sep 2016 #21
That too is bull shit CajunBlazer Sep 2016 #22
You are absolutely right Loki Liesmith Sep 2016 #23
You may be a statistician who is jealous of the fame that Silver has received CajunBlazer Sep 2016 #31
If that makes you happy, roll with it Loki Liesmith Sep 2016 #36
Professional Quant here BlueInPhilly Sep 2016 #38
Silver misses the unexpected MFM008 Sep 2016 #26
If statistics could accurately predict the outcome of any event... CajunBlazer Sep 2016 #37
true MFM008 Sep 2016 #47
Look at the graphs? That's not how you compare statistical models. RAFisher Sep 2016 #27
It's on my agenda, don't worry. Loki Liesmith Sep 2016 #30
Maybe you will be more receptive to this Loki Liesmith Sep 2016 #46
No, it tells us that Hillary has a higher probability of winning than trump. Whether still_one Sep 2016 #32
I guess I did poke the right person :) Persondem Sep 2016 #41
apologies for not getting back Loki Liesmith Sep 2016 #43
Tell us how on earth he collates heaps of polls that individually yield near 0 confidence! Foggyhill Sep 2016 #44
Update here Loki Liesmith Sep 2016 #45
No worries. I hope you took your time. Persondem Sep 2016 #48
Sometimes there is an oops. fleabiscuit Sep 2016 #49

Loki Liesmith

(4,602 posts)
2. This chart doesn't make me nervous
Sun Sep 25, 2016, 01:11 PM
Sep 2016

Except about Silver's (and any modeler's) ability to distinguish signal and noise.

If you want to see who is likely to win, look at the national poll average and see who is ahead. Anything else may just be sophistry.

DemocratSinceBirth

(99,710 posts)
3. That was a great book- The Signal And The Noise.
Sun Sep 25, 2016, 01:15 PM
Sep 2016

I like the moral of the fox and the hedgehog. Nate is becoming a hedgehog. All he knows is how to scare people and generate clicks.

When I look at all the models they suggest to me Hill is a 3-1 to 5-2 favorite.

Do you like boxing, Loki?


Money Mayweather was a 3-1 favorite against Manny Pacquiao and he won. He wanted a boring fight and that's what he got. A boring debate would be good for Hill.

Loki Liesmith

(4,602 posts)
6. I have a love-hate relationship with boxing
Sun Sep 25, 2016, 01:19 PM
Sep 2016

I've watched it destroy people. Many of the players are vile. There is a kind of brutal purity to it...your analogy to politics is very apt.

In the end Nate's model will most likely be right. So will everyone else's. Almost no one is wrong at the last minute.

DemocratSinceBirth

(99,710 posts)
10. The debate
Sun Sep 25, 2016, 01:27 PM
Sep 2016

Hillary is a hell of a lot smarter than me and if I can, in my mind, parry his verbal brickbats I am sure she can.

Will we see Trump on Quaaludes or crystal meth tomorrow night?

tblue37

(65,318 posts)
7. State polls matter more than national averages, except that the MSM
Sun Sep 25, 2016, 01:21 PM
Sep 2016

uses any poll that presages a squeaker of an election as a tool for pumping up their horse race narrative.

By claiming all the time that Clinton is losing ground, the MSM helps Trump look like a plausible candidate, and a lot of low-info voters are undoubtedly influenced in his favor by stories that present him as a scrappy underdog defying all expectations to seriously challenge the campaign that is being presented as an invincible juggernaut.

The MSM did the same thing during the GOP primary, and by constantly painting Trump as a "winner" with polls, and then using those polls to justify giving him even more free positive publicity, they finally made him one.

Loki Liesmith

(4,602 posts)
9. The average of national polls is a very good predictor of an electoral win
Sun Sep 25, 2016, 01:26 PM
Sep 2016

I'm not convinced that Nate's model adds significant value to that of the national average.

wncHillsupport

(112 posts)
4. I have never understood why folks see his probability spreads as so great.
Sun Sep 25, 2016, 01:15 PM
Sep 2016

All he does is show huge overlaps where either candidate 'may' win an election. I can make a prediction like that without all the fancy modeling.
- There is a more likely chance Hillary will win.
- However, there is also a pretty good chance that Trump will win.

My rationale: It depends on which group turns out the most voters in swing states and which groups flood the poll booths in 'sure' states to be sure no 'stay at home' group loses their 'sure' state.

Nate is right either way, based on his overlaps. So he can't really be wrong. Big deal.

DemocratSinceBirth

(99,710 posts)
5. Something does bother me though...
Sun Sep 25, 2016, 01:18 PM
Sep 2016

I always read about the lack of organization in the Drumpf campaign, but polling suggests supporters of both him and Clinton have been equally likely to be contacted by the campaign.

Both can't be true.

onehandle

(51,122 posts)
8. Democrats have the ground game, so we win.
Sun Sep 25, 2016, 01:22 PM
Sep 2016

I encountered the GOP ground game for PA. It's a fucking joke.

PENNSYLVANIA WILL WIN THIS ELECTION.

 

Charles Bukowski

(1,132 posts)
39. I'm fairly confident that HRC will
Sun Sep 25, 2016, 03:37 PM
Sep 2016

overperform relative to the final RCP average for this reason. Similar to Obama.

CajunBlazer

(5,648 posts)
12. You are giving yourself far too much credit
Sun Sep 25, 2016, 01:28 PM
Sep 2016

"I bet I could construct a model that does as well or better than 538 simply by taking the differences in the two poll averages and converting it to a direct probability."

There is no way you could do that unless you are a statistician who has studied the major polls, their biases, and their accuracy over the years. Just because two graphs have similar shapes does not mean you can convert the vertical scale of one graph (national poll results) into the proper vertical scale of the other (probability of winning) without the extensive work that Nate Silver has put into his model.

Silver does not even use national polls in his probability calculations. His work begins with a statistical evaluation of all of the recent polls for every state, culminating in a probability of each candidate winning every state. Those results are then combined statistically into the probability of each candidate winning the Electoral College vote.

Let me briefly explain how your methodology could make you look like a fool. Let's say that Candidate A is leading in the national polls, but trails Candidate B in the majority of the big swing states. Your methodology will lead you to believe that Candidate A is ahead, while Candidate B would have a much better probability of winning.

The fact that the national poll results seem to correlate with the swing state results is coincidental.
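The step described above, combining per-state win probabilities into an Electoral College probability, can be sketched as follows. This is a toy under a strong assumption: states are treated as independent, which real forecasting models do not assume and which is not a description of Silver's actual method. The state list, probabilities, and `safe_votes` figure are all hypothetical.

```python
def ev_distribution(states):
    """states: list of (electoral_votes, p_win) pairs.
    Returns {total_votes_won: probability}, assuming independent states."""
    dist = {0: 1.0}
    for votes, p_win in states:
        nxt = {}
        for total, prob in dist.items():
            # The candidate either wins this state's votes or does not.
            nxt[total + votes] = nxt.get(total + votes, 0.0) + prob * p_win
            nxt[total] = nxt.get(total, 0.0) + prob * (1.0 - p_win)
        dist = nxt
    return dist

def p_reach(states, needed):
    """Probability of winning at least `needed` votes from `states`."""
    dist = ev_distribution(states)
    return sum(prob for total, prob in dist.items() if total >= needed)

# Hypothetical swing states: (electoral votes, win probability).
toy_states = [(20, 0.70), (29, 0.45), (18, 0.50), (15, 0.55)]
safe_votes = 230  # hypothetical votes assumed already locked in
print(f"P(reach 270) = {p_reach(toy_states, 270 - safe_votes):.3f}")
```

Because it ignores correlated polling errors across states, a calculation like this tends to make the leader look more certain than a model that lets states move together.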

Loki Liesmith

(4,602 posts)
14. "There is no way you could do that unless you are a statistician"
Sun Sep 25, 2016, 01:30 PM
Sep 2016

"who has studied the major polls, their biases, and their accuracy over the years."


Yeah, I couldn't be that, could I?

CajunBlazer

(5,648 posts)
15. Nope, no offense, but you wouldn't have a clue how to do that
Sun Sep 25, 2016, 02:16 PM
Sep 2016

Even if the national poll graphs could be used to predict the results of Presidential elections (which they can't, of course), without statistics you would have no way of assigning a probability of winning to each change in the percentage separation between the two candidates in the national polls without simply guessing.

For instance, if Hillary were ahead by a single percentage point over Trump in the combined national polls, what would you assign as her probability of winning the election? You could only guess. What would be her probability of winning if she were ahead by two or three points, ... or six or seven? Again, you would have no clue and could only guess, and my guess would be as good as yours, so why would anyone pay attention to your guesses?

And all of that avoids the fact that we don't elect Presidents in this country based on the results of national popular vote - we use state by state tabulations to establish Electoral College votes which in turn determine the winner.

So your original statement, "I bet I could construct a model that does as well or better than 538 simply by taking the differences in the two poll averages and converting it to a direct probability" is simply bull manure.

BlueInPhilly

(870 posts)
18. I am A statistician
Sun Sep 25, 2016, 02:26 PM
Sep 2016

And I say all models are wrong. Some just have smaller error than others. A model will NEVER equal reality. Ever. A model with zero error just means an over-fitted model that is so rigid it will not work except for the particular scenario it was fitted for.

So there.

CajunBlazer

(5,648 posts)
24. I agree completely - but saying that two models are essentially equal in accuracy
Sun Sep 25, 2016, 02:56 PM
Sep 2016

is equally wrong, especially when the two models are based on totally different events - in this case a model of the popular vote versus a model of Electoral College results - and only one models the only path to victory.

BlueInPhilly

(870 posts)
28. I didn't say 2 models are equal
Sun Sep 25, 2016, 03:03 PM
Sep 2016

I said all models have varying degrees of error. You cannot look at one model and equate it with another. There are so many moving pieces to consider.

I do this for a living. How about you?

CajunBlazer

(5,648 posts)
35. I have a good background in statistics
Sun Sep 25, 2016, 03:15 PM
Sep 2016

and polling has been a hobby for years. However, you don't have to be a professional statistician to know that if you wish to model an event as accurately as possible, say the results of votes in the Electoral College, it would be wise to model that event rather than model the results of a similar, but totally different event such as the popular vote.

I would expect that the above statement is not disputable.

BlueInPhilly

(870 posts)
40. Nope, not disputing your statement
Sun Sep 25, 2016, 03:39 PM
Sep 2016

But not sure why you are undermining other people. No one is jealous of Silver - he picked a niche and capitalized on it, so good for him.

Also, sometimes, you have to rely on proxy data that may not exactly be the same as what you want to predict, but may reasonably mimic its behaviour. It has been done and it has had decent results.

I do this for a living. I've seen all kinds of models, and like polls, not all models are created equal.

Foggyhill

(1,060 posts)
42. Yes, he's WAY Overselling considering what the data actually shows (The mark of bad science/model)
Sun Sep 25, 2016, 03:43 PM
Sep 2016

I don't care how "A+" (sic) a polling outfit is supposed to be.
For me, a poll of 500 people (or fewer) where the demos are way off from the actual voting population is close to garbage level.

The margin of error on such polls is 4.5% 19 times out of 20 (and a lot more 5% of the time).
IF THE RANDOM SAMPLE IS ACTUALLY REPRESENTATIVE OF THE POPULATION POLLED (say likely voters)
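The "19 times out of 20" figure is the usual 95% margin of error for a simple random sample. A minimal sketch of that arithmetic, assuming the worst-case proportion of 50% and the textbook z-value of 1.96; the function name `margin_of_error` is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the 95% interval for a proportion p estimated from a
    simple random sample of size n (z = 1.96 means 19 times out of 20)."""
    return z * math.sqrt(p * (1.0 - p) / n)

for n in (500, 1000, 2048):
    print(f"n = {n:5d}: +/- {100 * margin_of_error(n):.1f} points")
```

For 500 respondents this works out to a little under 4.5 points, and it covers sampling error only. It says nothing about the non-representative samples and question-order effects raised in this post.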

If you're off from that by just a few persons, because say you use land lines, call during the day, and poll more older voters or fewer minorities (or whatever limitations or biases are inherent in your sampling method), then your result may be way beyond the margin of error and can't be relied on in any way.

That's not even taking into account the fact that the way the interview questions are asked can change answers. Asking about two way choice, before 4 way will lead to different answers than the reverse (strange but true).

That's not even taking into account biases in likely vs registered voters that overlay this whole mess of polls.

And yes, I've got experience/expertise in stats through engineering/MBA and Communication degrees.

If you see a professionally made poll that polls 2048 people in Pennsylvania that shows Trump on top in registered voters... Then be very worried, otherwise chill out and GOTV.

Yes, it's my first post but been lurking awhile in the Hillary forum (even during the crazy primary).
I'm Canadian and can't vote in this Election, but feel like a concerned third party because the USA is our closest partner.

Loki Liesmith

(4,602 posts)
19. One final serious post
Sun Sep 25, 2016, 02:32 PM
Sep 2016

I stated above that national popular vote is a good predictor of electoral vote outcomes. I believe the data bear that out.

https://leftymitt.com/projects/us-election-dynamics/

The point is not that this is the best model we can have, only that Silver's model (and really anybody's, including my own) does not necessarily introduce much new information over and against a simple poll average.

Furthermore, if a given model is actually NOISIER than a poll average, one begins to suspect that the model is INTRODUCING noise into the equation, or at least wonder how that model reacts to new information. For an industry that brags that it is bringing increased certainty to statistical questions, that would seem to be a problem, would it not?

I can't really be bothered to go into much more detail about this.

Loki Liesmith

(4,602 posts)
16. A less flip answer
Sun Sep 25, 2016, 02:18 PM
Sep 2016

If Silver's model has approximately the same variance as the polls he is incorporating into it over the same interval...how much information gain can his model claim to have?

Assuming a normal distribution, the entropy (the uncertainty of an estimate) is a function only of the variance of the distribution.



Given that, if both the poll average and the Silver model are strongly correlated and have more or less the same variance, then the entropy of both distributions should be about the same.

Given their strong correlation that means that Silver's model brings minimal added value to our ability to predict outcomes.
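The entropy argument can be made concrete with the standard differential entropy of a normal distribution, h = 0.5 * ln(2 * pi * e * sigma^2). The sketch below shows that the entropy gap between two normal distributions depends only on the ratio of their variances. The variances used are hypothetical, and in practice both series would first need to be put on a common scale before their variances are comparable.

```python
import math

def normal_entropy_nats(variance):
    """Differential entropy of a normal distribution, in nats."""
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)

def entropy_gap_bits(var_a, var_b):
    """Entropy of A minus entropy of B, in bits.
    Algebraically this is 0.5 * log2(var_a / var_b)."""
    return (normal_entropy_nats(var_a) - normal_entropy_nats(var_b)) / math.log(2.0)

# Hypothetical variances for two standardized series.
print(f"equal variances:  {entropy_gap_bits(1.0, 1.0):+.3f} bits")
print(f"10% less variance: {entropy_gap_bits(0.9, 1.0):+.3f} bits")
print(f"4x the variance:   {entropy_gap_bits(4.0, 1.0):+.3f} bits")
```

Under this toy model, two series with nearly equal variance differ by only a small fraction of a bit, which is the sense in which the post calls the information gain marginal.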

BlueInPhilly

(870 posts)
20. Doesn't he weigh his polls?
Sun Sep 25, 2016, 02:33 PM
Sep 2016

He gives higher coefficients for some polls. That is a subjective judgment that needs to be evaluated objectively. We don't know how he separates one poll from another.

So he doesn't necessarily have a normal distribution - he skews it towards some and I bet his methodology is a blackbox. As your argument pointed out, it wouldn't make his method any more special.

Loki Liesmith

(4,602 posts)
21. Yes, and I actually have real problems with some of the things he weights
Sun Sep 25, 2016, 02:38 PM
Sep 2016

For example he does trendline adjustments to individual polls. If a given poll is a large outlier in one direction in a preceding epoch, and gives you a large lead in that epoch and then you poll again on the current epoch and you have a smaller lead, the current lead is actually counted as a loss. Essentially you penalize a given poll for finding an outlier in the past. If there was a significant house effect for a given pollster, I'd be kind of OK with this, except that Silver already corrects for house effects, so this amounts to a bit of a double correction. Or if the trendline adjustment was calculated from the aggregate on an epoch...I'd be mostly OK with that too.

Silver has been criticized for introducing too much correlated error in his model and I think this is a fair criticism.

In any event the normal distribution example is a toy model. It tries to capture the fact that, treating the 538 model as a black box transducer of some signal(s), its bitrate is pretty low. The information gain is marginal. Perhaps if the polls closed to <1% nationally the added value would become more clear...it would certainly be easier to evaluate the performance of his secret sauce in that case.

CajunBlazer

(5,648 posts)
22. That too is bull shit
Sun Sep 25, 2016, 02:45 PM
Sep 2016

Pulling information off of the internet does not make you a statistician. And were you a statistician you would never have made your absurd comments to begin with.

There is ample reason to believe that the national polls need not be strongly correlated to a state by state statistical analysis of Electoral College results. Just because they appear to be correlated for a particular data set does not mean that they should be for any situation.

There is ample reason to believe that they may not be correlated at all. The national polls are strongly influenced by small samples taken from consistently blue and consistently red states, where the difference in support between candidates can be and often is huge. The fact that when one takes these various small samples from each state and combines them in national polls, the graph of the results happens to resemble Silver's graph for an election cycle is no proof of correlation.

The fact that there is any correlation at all is coincidental - and there is no reason to believe that correlation will continue in the future. Even if we were to accept the proposition that voters in swing states were very much the same as voters of the country as a whole, the two groups are not subject to the same stimuli. Candidates put hundreds of dollars into commercials, rallies and voter turnout drives in swing states, especially in the later stages of a campaign, which the people in other states never experience. There is no reason to believe that, given all these additional stimuli, voters in swing states will continue to behave in the same way as voters across the country.

Bottom line, it is a mistake to assume correlation in one data set means that that correlation will continue across all data sets. And you have not begun to address the issue explored in my other post that there is no way to assign probabilities of winning to the graphs obtained from national polls.

Loki Liesmith

(4,602 posts)
23. You are absolutely right
Sun Sep 25, 2016, 02:50 PM
Sep 2016

Pulling information off the internet does not make me a statistician. We agree. Why argue?

CajunBlazer

(5,648 posts)
31. You may be a statistician who is jealous of the fame that Silver has received
Sun Sep 25, 2016, 03:04 PM
Sep 2016

and are eager to discredit his work. However, in an effort to voice your displeasure in terms which most DU members could understand, you used a clumsy example and you were called on it. Then instead of admitting that it was a clumsy example as you should have, you tried to overcome the objections with statistical BS you knew didn't apply.

If you want to criticize Silver's work, you should at least have the decency to first understand how he does his calculations.
Then you can throw your statistical darts.

Loki Liesmith

(4,602 posts)
36. If that makes you happy, roll with it
Sun Sep 25, 2016, 03:17 PM
Sep 2016

I'll have a follow up on this as soon as I can rasterize the 538 trendlines, or get some csv files of it...

I still fail to see what's gotten you so riled up, but to each their own.

I don't see how constructive this is until I work up that data.

BlueInPhilly

(870 posts)
38. Professional Quant here
Sun Sep 25, 2016, 03:23 PM
Sep 2016

Last edited Sun Sep 25, 2016, 08:37 PM - Edit history (1)

I can construct my very own model but I do not have the time to gather all the data I need. I don't know if Nate normalizes for different confidence limits and intervals; if he doesn't, he should. Historical performance is only relevant to a certain degree. You cannot drive forward by simply looking at your rear view mirror. You have to consider turning points, changing population distribution, and other attributes that may affect an individual's propensity to vote for a candidate.

I take all these polls with a grain of salt. Not all polls are created equal. I see a number and I immediately look at the underlying data - sorry, 2nd nature. Whilst a 500 sample size is enough for a +/- 5% confidence interval at 50%, the required sample size goes up considerably when a poll aims for more granular questions and answers.
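The point about granular questions can be illustrated by inverting the usual margin-of-error formula. This is a sketch assuming simple random sampling, a 95% level (z = 1.96), and the worst-case 50% proportion. The function names and the 25% subgroup share are hypothetical.

```python
import math

def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest simple random sample giving a 95% margin of error
    no larger than `margin` for a proportion near p."""
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))

def total_sample_for_subgroup(margin, subgroup_share):
    """Total sample needed so that a subgroup making up `subgroup_share`
    of respondents reaches the same margin on its own."""
    return math.ceil(required_sample_size(margin) / subgroup_share)

print(required_sample_size(0.05))             # whole sample, +/- 5 points
print(required_sample_size(0.03))             # whole sample, +/- 3 points
print(total_sample_for_subgroup(0.05, 0.25))  # subgroup that is 25% of respondents
```

Reading a subgroup at the same precision as the topline multiplies the required total by the inverse of the subgroup's share, which is why crosstabs from a 500-person poll are so noisy.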

Remember the adage: All models are wrong but some are right sometimes.

In the case of Nate's model, the volatility of his predictive variables (i.e., disparate polls) probably undermines any semblance of stability, and volatility is the kryptonite of models. We don't like it.

MFM008

(19,804 posts)
26. Silver misses the unexpected
Sun Sep 25, 2016, 02:58 PM
Sep 2016

He had the Seahawks down to win our second super bowl 2 years ago.
Totally winning.
Then came the unexpected throw instead of running the ball at the 2 yard line to win.
Intercepted.
We lose in the last seconds of the game.
Point is.....
Yeah we're still pissed.
They can't see the unexpected...
Or calculate it.
He was wrong.

CajunBlazer

(5,648 posts)
37. If statistics could accurately predict the outcome of any event...
Sun Sep 25, 2016, 03:18 PM
Sep 2016

We would never have any reason to have an election or play a game.

RAFisher

(466 posts)
27. Look at the graphs? That's not how you compare statistical models.
Sun Sep 25, 2016, 02:59 PM
Sep 2016

Do the math and come up with empirical data showing one is better than the other. Using smoothing or not doesn't make a model incorrect. Honestly I don't even understand what the complaint is. You probably should first read how the 538 model works before claiming you know it's wrong.

Loki Liesmith

(4,602 posts)
30. It's on my agenda, don't worry.
Sun Sep 25, 2016, 03:04 PM
Sep 2016

As soon as I get a csv of 538's numbers I plan to look at

1) distribution of the absolute value of instantaneous derivative of 538 model probabilities vs. the same for huffpost poll trendlines.
2) distribution of 538 model probabilities - trendlines
3) PCA and ICA on both sets together resampled to the same grid.
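The first two items in the plan above can be sketched with the standard library alone: put both series on the same grid, standardize them, then compare step sizes and the residual between them. The data below are hypothetical placeholders, not 538 or HuffPost numbers, first differences stand in for the "instantaneous derivative", and the PCA/ICA step is left out.

```python
import statistics

def resample(times, values, grid):
    """Linearly interpolate (times, values) onto grid.
    Assumes times and grid are increasing and grid lies within times."""
    out, j = [], 0
    for t in grid:
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out

def standardize(series):
    """Shift to zero mean and scale to unit standard deviation."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [(x - mean) / sd for x in series]

def abs_steps(series):
    """Absolute first differences, a discrete stand-in for |derivative|."""
    return [abs(b - a) for a, b in zip(series, series[1:])]

# Hypothetical series sampled on different days.
poll_days, poll_margin = [0, 3, 7, 10], [4.0, 3.0, 1.5, 2.5]
model_days, model_prob = [0, 2, 4, 6, 8, 10], [75.0, 70.0, 66.0, 60.0, 58.0, 62.0]

grid = list(range(0, 11))
polls = standardize(resample(poll_days, poll_margin, grid))
model = standardize(resample(model_days, model_prob, grid))

print("mean |step|, polls:", round(statistics.fmean(abs_steps(polls)), 3))
print("mean |step|, model:", round(statistics.fmean(abs_steps(model)), 3))
print("max |model - polls|:", round(max(abs(m - p) for m, p in zip(model, polls)), 3))
```

Standardizing first matters because one series is a margin in points and the other a probability in percent, so their raw step sizes are not directly comparable.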

Unfortunately having some trouble getting one raw data set. Expect rectified shortly.

Cheers,

still_one

(92,126 posts)
32. No, it tells us that Hillary has a higher probability of winning than trump. Whether
Sun Sep 25, 2016, 03:08 PM
Sep 2016

one subscribes to his model or not is another question.

The polls say 25% of the populace are undecided.

While undecideds will play a big part, I personally believe if Democrats come out in full force to vote, we will win.

Who is more motivated?

Persondem

(1,936 posts)
41. I guess I did poke the right person :)
Sun Sep 25, 2016, 03:39 PM
Sep 2016

I figured you could take a deeper dive into 538 ways of doing things than I could have. I too saw a problem with his use of the trendline as the adjustments did not seem to agree with the recent trend at all. Also he insists on using a Google based poll that is clearly an outlier for Trump; it consistently shows Trump with a double digit lead in FL. Said poll is internet based, has a 29% response rate and infers state of residency based on IP address.

I'll check back. Still curious as to what you come up with by crunching the 538 data.

Loki Liesmith

(4,602 posts)
43. apologies for not getting back
Sun Sep 25, 2016, 04:00 PM
Sep 2016

But I do tackle several of your thoughts in the posts above.

I've just written some python code to turn these jpegs into data I can crunch. That's not perfect, but I think it may be the best I can do.
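The post does not show its code, so the following is only a guess at the general approach: scan each pixel column for pixels close to the line's color, then map the row position back to a data value using the axis range. It works on a plain grid of RGB tuples. Loading the JPEG into that grid with an imaging library is assumed and not shown, and `trace_line`, the colors, and the axis values are all hypothetical.

```python
def trace_line(pixels, line_rgb, y_top, y_bottom, tolerance=60):
    """pixels: list of rows, each a list of (r, g, b) tuples, row 0 at the top.
    y_top / y_bottom: data values at the first and last pixel row.
    Returns one data value per column, or None where no line pixel is found."""
    height, width = len(pixels), len(pixels[0])
    values = []
    for col in range(width):
        # JPEG compression smears colors, so match within a tolerance.
        rows = [
            row for row in range(height)
            if sum(abs(a - b) for a, b in zip(pixels[row][col], line_rgb)) <= tolerance
        ]
        if not rows:
            values.append(None)
            continue
        mean_row = sum(rows) / len(rows)
        frac = mean_row / (height - 1)  # 0.0 at the top row, 1.0 at the bottom
        values.append(y_top + frac * (y_bottom - y_top))
    return values

# Tiny synthetic "chart": white background, a blue line falling left to right.
WHITE, BLUE = (255, 255, 255), (20, 40, 200)
image = [[WHITE] * 4 for _ in range(5)]
for col, row in enumerate([0, 2, 4]):
    image[row][col] = BLUE

print(trace_line(image, BLUE, y_top=100.0, y_bottom=0.0))
```

Averaging the matching rows in each column gives a usable value when the line is several pixels thick, but gridlines or a second series in a similar color would contaminate the trace, which fits the post's own caveat that the method is not perfect.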

Already converted the huffpo data, will be converting the 538 data while I watch football. Hopefully some answers tonight.

Foggyhill

(1,060 posts)
44. Tell us how on earth he collates heaps of polls that individually yield near 0 confidence!
Sun Sep 25, 2016, 04:22 PM
Sep 2016

And yes, I'm upset by the amount of attention this amount of bad science is getting.
But, shouldn't be really, much of the population has no understanding of stats and polling,
so misusing this lack of knowledge is kind of expected I guess (it's certainly not up to scientific standard...)
I do question his ethics...

Persondem

(1,936 posts)
48. No worries. I hope you took your time.
Sun Sep 25, 2016, 08:23 PM
Sep 2016

I checked out the post you link to below. No way I could do the analysis you did, but my thoughts after looking at his ratings for FL were that a lot of what Silver looks at is arbitrary ... or at least poorly explained. Also he seems to let polls with very questionable results influence his ratings - a la the USC/LA Times and the Google based surveys. His trend line calculation for FL looked way off as well.

Thank you for taking the time to look at 538's data in detail.
