November 25, 2004  ·  Lessig

[Image: shift.jpeg — graph of the “shift” from 4pm exit poll data to final results]

So this is the beginning of some fascinating data. The graph represents the “shift” from 4pm exit poll data to final results. The puzzle is why the shift is so biased. It is an “academic” puzzle because it won’t matter to this election. But it should be explained.

  • http://arton.cunst.net/ hungerbug

    People did not want to commit themselves to their vote; they were ashamed. It’s as simple as that: as with any other poll, there is no guarantee the interviewer is told the truth, especially with this candidate…

  • Michael

    I suspect the bulk of ordinary people didn’t even stop to talk to those conducting exit polls.

    Those who stopped to talk were the kind of people more likely to put political signs in their yards, stickers on their cars, and, to a lesser extent, to get involved in the political process.

    “The silent majority” just walked out of the polls and went on their way.

  • http://blog.deconcept.com Geoff

    I’d like to see data from the last 5 or 10 presidential elections too.

    Without some comparison, this won’t mean much to a lot of people.

  • http://allthingsalceste.com Dan

    2 possibilities:

    1) Exit pollsters are doing something horribly wrong.
    2) People voted after work?

  • http://www.daviddfriedman.com David Friedman

    1. What is the source of the 4 p.m. data? Most of what I have seen seemed to imply that the data that leaked were from the morning and the rest of the data were only released much later in corrected form.

    2. Only half the states are named on the graph, which makes it harder to guess the pattern, if there is one. It would help if the rest of the names were added, perhaps alternating above and below the line.

    3. Do we have the full exit polls, uncorrected? If they don’t show the pattern, it’s presumably a time of day effect. If they do, it’s something else.

    4. Does anyone know enough about the corrections normally applied to the raw data to guess if that’s the issue?

    In other words, possibilities include

    A. Time of day.
    B. Willingness to be polled.
    C. Biased sample of precincts in the raw data–which perhaps would have been routinely corrected before the data were released.
    D. Fraud.

  • Max Lybbert

    The story I’ve heard is that, traditionally, conservatives vote before work, and liberals after work. However, that seems to be based on income levels, and income has become less of an indicator of how somebody will vote.

    For the record, I voted before work.

  • DOT

    A Professor of Mathematics at Temple University, John Allen Paulos, has joined Professor Freeman(?) of Univ. of Pennsylvania in reporting that the disparity between the exit polls and the actual vote count is statistically improbable.
    He states that the withholding of the full exit-poll data (by Edison Media Research, Mitofsky International, the Associated Press, and the various networks) is indefensible.
    He claims the cause could be:
    1. Massive fraud.
    2. Many people working independently to subvert the election.
    3. The tabulation machines and software, plus malevolent service by relatively few operatives.
    He also adds that without paper trails, it is difficult, but probably not impossible, to establish an accurate count.

  • Bill McGonigle

    Anecdotally, my Democratic friends like to prattle on about their politics and my Republican friends tend to just “shut up and vote”. (These are not “Evangelical Christian” Republicans). It will be interesting to see if the pollsters attempted to measure and/or account for this bias before conducting the polls.

  • http://www.needsomewood.us/ Need Some Wood

    Maybe people who voted for Bush were more ashamed to admit it – for good reason.

  • http://stereo.lu Guillaume Rischard

    I thought it would be interesting to see whether the swoosh served any purpose. The afternoon change notoriously changed the outcome in OH, but not in most of the other swoosh states. In Ohio, Florida, New Mexico and Indiana, the swoosh brought Bush to 51%, 52%, 50% and 50%. In all the other swoosh states, either Bush was already winning or the swoosh wasn’t strong enough to stop Kerry from winning.

    If anyone wants to play with the numbers, I have cleaned up the html and uploaded the table here. The big secret we might have uncovered is that polls are inaccurate.

  • http://stereo.lu Guillaume Rischard

    Sorry, the table is here. Feel free to edit my above post for the URL and delete this post.

  • Max Lybbert

    DOT wrote:

    /*A Professor of Mathematics at Temple University, John Allen Paulos, has joined Professor Freeman(?) of Univ. of Pennsylvania in reporting that the disparity between the exit polls and the actual vote count is statistically improbable.
    */

    What does he base this on? On pre-election polls that had margins of error larger than the final spread? On the early-leaked poll data that included all of ten voters per state? On the exit poll data posted on CNN.com that shows Bush winning (but which has also been weighted specifically to show Bush winning)? Can you send me a link so I can take a look?

    /* He states that the withholding of the full exit-poll data (by Edison Media Research, Mitofsky International, the Associated Press, and the various networks) is indefensible.
    */

    Does this mean that he doesn’t have the full exit poll data? If so, what are his claims of fraud based on? Why would a college professor act so irresponsibly?

  • Richard Head

    Most likely, this is the result of “under-polling,” the phenomenon in which interviewees are not always honest in their responses because a candidate is too controversial, and that would make sense here. Under-polling was observed repeatedly in the South in recent years when a white candidate ran against a black candidate: the white candidate would “under-poll” because some whites named the black candidate as their answer, ashamed to appear racist. The biggest problem with a conspiracy theory here is: why would anyone risk exposure by stuffing ballots in so many uncontested Bush states? It doesn’t make any sense…

  • Tito Villalobos

    Another explanation that I’ve “heard” mentioned is that this time around the Kerry supporters were a lot “angrier” than the Bush supporters, and therefore made sure to get out and vote as early as possible. The Bush supporters were more leisurely, and voted when they made time that day.
    (For the record, I was one of the angry Kerry supporters and I took the day off work.)

    I am deeply concerned about the large number of problems we still have in our election system…. While I don’t think this election was “stolen” or anything, we do need to have these things worked out for next time. Things aren’t any better than they were in 2000; we just didn’t have a court fight this time.
    Just think if the Washington Governor’s race had happened in a state with computerized voting machines and no paper trail….

  • http://affbrainwash.com/chrisroach Roach

    Republicans have jobs and families to take care of and don’t have time to talk to exit pollsters. Democrats are often disaffected ideologues, single welfare moms with time on their hands, or paid-off union workers, all of whom have plenty of time to “rap politics” with an annoying exit pollster.

  • Alex in Los Angeles

    Dr. Freeman’s study can be read here:
    http://www.appliedresearch.us/sf/Documents/ExitPoll.pdf

    I think it would answer Max’s questions.

    You would also want to read this from Mysterypollster.com to better understand exit polling:
    http://www.typepad.com/t/trackback/1471163

    In fact Mysterypollster.com conducted an extensive review of the Freeman paper here:
    http://www.typepad.com/t/trackback/1438364

    All in all, Freeman’s paper is valuable, and the data he uses are the best available prior to the NEP release, but the exit poll discrepancy may be due mostly to increasing bias in exit polls against the GOP.

    Error in the vote count, fraud, vote spoilage, etc. may be factors as well, but most likely they exist on a smaller scale than the exit poll discrepancy.

  • http://www.stevedonohue.blogspot.com/ Steven Donohue

    Of course, those of us who have taken Stat 100 at some point in our lives understand the implications of not procuring a truly random sample. One famous example of failing to do so is the election of 1936, when an incumbent FDR easily trounced Landon by one of the largest popular and electoral margins in history, despite a poll conducted by a (then) famous magazine that had more than 2.5 million respondents and favored the challenger Landon.

    Perhaps the question I have is this: what exactly is the method being used to make sure the questions are asked of random voters? How easily can a poll such as this turn into a convenience sample with no statistical bearing, whatever the numbers? And if it was truly a convenience sample, I do subscribe to the theory that Kerry supporters were more angry and visible than their red counterparts (both Nader and Bush supporters :-)

  • DOT

    I apologize – I should have added the following to my prior comment:

    John Allen Paulos’ (Professor of Mathematics at Temple University) report was published in the Philadelphia Inquirer, and is now linked at nov2truth.org

  • Max Lybbert

    I’m currently taking a look at the report by John Allen Paulos, and I keep seeing internal inconsistencies:

    For this report, I use data that apparently are based solely on subjects surveyed leaving the polling place. These data were reportedly not meant to be released directly to the public, and were reportedly available to late evening Election Night viewers only because a computer glitch prevented NEP from making updates sometime around 8:30 p.m. that night (pg. 4).

    and

    Some commentators on an early draft of this paper rejected these data as unweighted, meaning that they have not been adjusted to appropriately weight demographic groups pollsters knowingly under- or over-sampled, but it makes no sense that NEP would ever to distribute unweighted data to anyone, let alone publish them on the web election night (pg 5).

    Didn’t he just say that the data was accidentally posted? That is, NEP didn’t want to distribute these numbers, since they “were based solely on subjects surveyed leaving the polling place”? Isn’t that unweighted data?

    He seems like a pretty smart guy who wrote an intentionally misleading paper hoping to make enough smoke that people might think that there’s actual fire somewhere. For instance, take this comment, “Anchor people were discussing who Kerry would choose for his cabinet, conservative radio hosts were warning how now we’re going to see the true John and Teresa Heinz Kerry.” Unless he’s talking about CBS (which I didn’t watch), I have no idea what channel had anchor people actually discussing Kerry’s cabinet as if it were a done deal.

    Once upon a time people would be ashamed to write and sign something like this.

    I haven’t finished, but I suspect I’ll find other flaws in this paper.

  • http://www.sbs-world.com Zennie Abraham

    I still contend that this effort at figuring out an academic puzzle is flawed in that the constant assumption is that people who answer the polls are telling the truth.

    Think about it: does anyone you don’t know tell you every voting action they took? My experience is no. Some people do, and others consider it an intrusion. The danger is in attracting the attention of the voter who answers the poll with the intent to mislead, thus wrecking the data’s outcome, assuming there are a large number of people who do this.

    I believe voters, thanks to television and communications, have become savvy enough to do this.

    I’m also not convinced that the error considerations would cover such behavior. If that’s the case, the error consideration would have to be so large as to render any poll data unreliable for the purpose of prediction.

    Zennie

  • Max Lybbert

    Well, I finished the report, although I just noticed that Stephen F. Freeman’s name is on it instead of John Allen Paulos (I found the report from the links posted in this forum, not from nov2truth.org).

    Aside from Freeman’s admission that unweighted data would destroy his case (and his denial of using unweighted data, a page after identifying the data as unweighted), there is another reason to consider the paper misleading, irresponsible, and incorrect.

    Remember that the discrepancies complained about are in the 2% range. In Florida, for instance, Bush was predicted to get 49.8% of the vote, but got 52.1%, a difference of 2.2%.

    Without access to the data and methodology, we cannot model the sample characteristics precisely. But we do know the general procedures by which exit polls are conducted. … Based on these we can make a reasonable approximation (pg 10).

    Being off by just over 2% seems to be in the range of a reasonable approximation. Especially when you consider the next paragraph (on page 11):

    A random sample of a population can be modeled as a normal distribution curve. Exit polls, however, are not random samples. To avoid prohibitive expense, exit poll samples are clustered, which means that precincts, rather than individuals, are randomly selected. This increases variance and thus the margin of error.

    This is why BYU’s exit poll, which undershot Bush’s Utah win by 1%, was lauded on page 8. However, something tells me that the Utah exit poll had a margin of error larger than 1%. That is, even if the poll had been off by more than 1%, it would still have been an accurate poll.
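
    A rough back-of-the-envelope sketch of that clustering point (the interview count and design effect below are assumed numbers, not NEP’s actual parameters):

        import math

        n = 1963      # assumed completed interviews in one state's exit poll
        p = 0.5       # worst-case proportion for the margin-of-error formula
        deff = 1.7    # assumed design effect from sampling whole precincts

        srs_moe = 1.96 * math.sqrt(p * (1 - p) / n)             # simple random sample
        cluster_moe = 1.96 * math.sqrt(deff * p * (1 - p) / n)  # clustered sample

        print(f"SRS margin of error:       +/-{srs_moe:.1%}")      # about +/-2.2%
        print(f"Clustered margin of error: +/-{cluster_moe:.1%}")  # about +/-2.9%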

    Now, which is more likely: (1) a vast conspiracy, involving thousands of people able to pull off voter fraud on a massive scale without a hitch, or (2) a poll that has historically been close but never exact, because it uses flawed methods, turned out to be close but not exact this year as well?

    What do I mean by “flawed methods”? Picking precincts in a state to represent the entire state voting population. So long as voters base their decision on the same kinds of issues they used in previous years, this method works. However, when voters change their preference because of a changing world, this will skew the result and make the poll inaccurate.

    In an earlier post I wrote that, historically, conservative voters vote earlier in the day than liberal voters. In the past, income or age were the best ways to separate conservative and liberal voters. Today, religious affiliation is the best predictor of voter behavior. If the precincts chosen to represent the state were chosen according to income or age, then the poll would be wrong. If voters who historically voted Democrat (late voters, lower-income voters, young voters) instead voted for Bush for religious reasons, then the poll would be wrong. If conservative voters were less likely to return the pollster’s questionnaire, then the poll would be wrong. Overall, there are several possible reasons that the exit poll may simply be wrong.

    And instead of recognizing this, we have a Ph.D. publicly state that the only possible way that the exit polls could be incorrect is from a vast conspiracy that I wasn’t even invited to be involved in! This kind of irresponsibility is amazing, but not inspiring.

  • marcello

    To get a handle on possible systematic errors, one should look at similar studies done for other elections.
    Q: Have these been done?

    All this speculation can be answered if such studies were done. (One presumes that all the silly little details, like whether only Democrats answer exit polls before work but after they’ve had their lattes, will be the same for past elections as for this one.) If all other elections show no statistically significant deviations, then we know the systematic uncertainties are small and something is up with the present election.
    If the deviations in past elections are large as well, then the present study merely demonstrates that they do not have a handle on the systematic uncertainties.

    So, please, let’s not waste too much time on speculation and find out if these studies were done in the past.

  • raoul

    “Today, religious affiliation is the best predictor of voter behavior.”

    Evidence please?

  • http://people.redhat.com/tiemann/ Michael Tiemann

    I agree with the earlier comment that with half the state names missing, the diagram is difficult to interpret. Another way that would make the graphic more understandable would be to use blue for the states that ended blue, red for the states that ended red, so one could see whether the “swoosh” actually impacted the outcome in the state.
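
    Something along these lines would do it; this is a minimal matplotlib sketch, and the numbers in it are placeholders rather than the actual exit poll data:

        import matplotlib.pyplot as plt

        # state -> (4pm exit poll Bush share, final Bush share); placeholder values
        states = {"OH": (47.9, 51.0), "FL": (49.8, 52.1), "PA": (45.4, 48.6),
                  "NY": (38.0, 40.5), "UT": (69.0, 71.0)}

        for name, (poll, final) in states.items():
            color = "red" if final > 50 else "blue"   # color by who carried the state
            plt.plot([0, 1], [poll, final], color=color, marker="o")
            plt.annotate(name, (1.01, final), fontsize=8)

        plt.xticks([0, 1], ["4pm exit poll", "final result"])
        plt.ylabel("Bush share of the vote (%)")
        plt.title("Shift from exit poll to final result")
        plt.show()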

  • Max Lybbert

    (Me): “Today, religious affiliation is the best predictor of voter behavior.”

    (Raoul): Evidence please?

    Well, I don’t have evidence per se (that is, I haven’t spent the time looking at studies of voter behavior). Instead, I have a few news sources that state religious affiliation is the second-best predictor, after party affiliation. So, I have to admit I was wrong in calling it the best overall predictor. But in the South (where I live) Democrats often vote for Republicans. And, according to what I understand, that applies in Florida as well (one of the states the paper believes was involved in the nation-wide conspiracy that hasn’t yet had a single leak to the media).

    So, as I stated earlier, if voter behavior is in a state of flux, then exit polls that massage the data according to old models of voter behavior will massage the data the wrong way, and not accurately predict the election. BTW, the paper’s “proof” that exit polls are accurate in the US includes their track record in Germany. Unfortunately politics in Germany are different enough that this concept falls flat. For instance, Germany’s multi-party system, less-religious society, and (IIRC) the ability the ruling party has to call elections at any time it thinks it will win (so the data doesn’t include elections where the majority of voters may feel conflicted between their party and the opposition) all make voter behavior easier to predict there. On the same note, references to Vicente Fox’s election (in Mexico) ignore the fact that the election wasn’t even close. The exit polls could have been off by 3% or 4% and still have accurately predicted the winner. In this election, exit polls off by that much would have wrongly predicted the election. Remember, the paper complains that Florida’s exit poll was off by only 2%.

    Bush got a lot more low-income votes than he was expected to, because voters squared off on religious differences instead of income differences. On top of that, the Catholic church pressured Catholics (who are largely Democrats) to consider abortion a bigger issue than war. I know that many Catholics ignored the advice, but I also know that many Catholics listened.

  • Alex in Los Angeles

    Max:

    I think you significantly mischaracterize exit poll methodology.
    1. Precincts are chosen randomly with weighting by expected turnout.
    2. Freeman does not use unweighted data. You originally wondered if he used unweighted data, and the answer is no.
    3. The 2% discrepancy you cite is significantly outside the margin of error for this exit poll as Freeman’s study explains.

    I agree in general with your skepticism of the accuracy of exit polls, but not with your arguments. The criticisms you offer mischaracterize exit poll data. And some of your criticism of Freeman is off the mark, because of this mischaracterization.

    BTW, professionals think the answer is GOP voter non-response bias, i.e., the response rate for exit polls is around 50%, and GOP voters were less likely to respond than Democratic voters.
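
    To make the non-response idea concrete, here is a toy calculation; the true vote shares and response rates below are invented purely for illustration:

        # Suppose the electorate really split 51% Bush / 48% Kerry, but Bush voters
        # agreed to the exit interview 47% of the time and Kerry voters 56% of the time.
        bush_true, kerry_true = 0.51, 0.48
        bush_rr, kerry_rr = 0.47, 0.56

        bush_resp = bush_true * bush_rr
        kerry_resp = kerry_true * kerry_rr
        total = bush_resp + kerry_resp   # ignore third parties for simplicity

        print(f"Exit poll shows Bush:  {bush_resp / total:.1%}")   # ~47.1%
        print(f"Exit poll shows Kerry: {kerry_resp / total:.1%}")  # ~52.9%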

    You would want to read this from Mysterypollster.com to better understand exit polling:
    http://www.typepad.com/t/trackback/1471163

    In fact Mysterypollster.com conducted an extensive review of the Freeman paper here:
    http://www.typepad.com/t/trackback/1438364

  • Max Lybbert

    Alex:

    (1) After taking another look at footnote 22 (page 11), I have to concede this point.

    (2) Freeman said the numbers he used “were reportedly not meant to be released directly to the public.” Why not? There are several reasons that the information might not have been meant to be released to the public. One is that it hadn’t been weighted.

    He later added a section saying he believes the numbers are properly weighted, based solely on his belief that NEP wouldn’t intentionally post unweighted numbers. Then again, the numbers weren’t supposed to be posted publicly at all. That isn’t proof that the numbers were properly weighted, nor is it proof that the numbers aren’t weighted. To keep this post almost short, I’ll concede this point, because Freeman’s paper still has a fatal flaw in it (next item).

    (3) The paper only gives the margin of error for Ohio (which was 2.2% — pg 12, footnote 23), and Bush’s 2.1% increase there falls squarely within it. I would expect Florida’s and Pennsylvania’s MOEs to be similar, because the sample size for a poll is usually chosen to get a particular margin of error. If they were, Bush’s 2.2% increase would be identical to the MOE.

    Freeman’s paper complains that Kerry’s overall drop is outside the margin of error. The mistake the paper makes is that it ignores the simple fact that Bush, Nader, and Badnarik all received statistically defensible gains (Nader getting 1% of the vote compared to less than 1% of the vote, for instance), and those gains add up to Kerry’s corresponding loss. Remember all those pre-election polls with MOEs of 3% that were “statistically tied” if the results were within 6 points? Why? Because a 3% increase for one candidate would mean a 3% drop for the other — a total difference of 6%.

    I think we’ve found the culprit. Sorry, nothing to see here. Moral of the story: don’t put your money on vast conspiracy theories.

    What about Pennsylvania? In Pennsylvania, the gain was 3.2 points (as was Kerry’s drop), and that is fishy. Then again, the MOE isn’t always accurate. Page 13 says that the chance the election results would be that far outside the MOE is slightly more than one-in-one-hundred. Let’s see: the chance that one poll is off by that much is one in one hundred, and we conduct fifty polls, so what is the chance that at least one of those polls is off by that much? The way I see it, something like 40%. Statistically speaking, I think we can expect this once every other election (each time in a different state).
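
    Here is that arithmetic spelled out, treating each state poll as an independent one-in-one-hundred event (itself an assumption):

        p_single = 0.01   # assumed chance any one state poll misses by that much
        n_states = 50

        p_at_least_one = 1 - (1 - p_single) ** n_states
        expected_misses = n_states * p_single

        print(f"P(at least one such miss): {p_at_least_one:.0%}")   # ~39%
        print(f"Expected misses per election: {expected_misses}")   # 0.5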

    For the record, I would like to have the raw data released as well. Then I could determine what the appropriate margins of error should be, and other people could determine if there is any evidence of true fraud. It’s a lot like honest companies submitting to audits.

  • Alex in Los Angeles

    Max:

    You misunderstand the categories of exit poll data. Freeman was wrong when he claimed the data he used were not meant to be released. The Slate article he references refers to the afternoon poll numbers. The numbers CNN.com posted on its website election NIGHT were released to the public, via cnn.com, and were weighted. What is true is that those election night exit polls are no longer available to the public, as they were replaced by corrected polls that match the election results. But don’t take my word for it. Read mysterypollster.com.

    The Ohio margin of error at the 95% confidence level was +/-2.87%. Kerry received 48.5%, but the exit poll stated 52.1%, for a difference of 3.6%. But you make a seemingly valid point that the margin of error is “double” if Bush’s gains are taken into account. It does seem an absurd oversight if true. Interesting. I’ll write back with what I find out.

    Thanks,

    Alex

  • Max Lybbert

    OK, thinking things over, I’ll concede a little ground.

    The most intelligent frauds are perpetrated in ways that leave plausible explanations. For instance, instead of stealing money straight from the company’s bank account, a crooked accountant would be smarter to structure some kind of fake service agreement. The accountant creates documents agreeing to pay a consulting firm for some service; the consulting firm doesn’t exist and the service is never provided, but nobody ever complains, the check is written, and the accountant gets his money. There are ways to catch this kind of thing, so the accountant works on making the fake transaction seem plausible and less likely to be looked at closely.

    So, I’m going to concede that if I were to commit voter fraud, I wouldn’t stuff ballot boxes with millions of votes, because I would expect that people would get suspicious when exit polls showed a different result. I would, instead, create a system meant to exploit weaknesses in exit polls so I could fly under the radar. For instance, I would create a lot of fake votes for Bush, and fewer fake votes for third-party candidates (who couldn’t win), since the election goes to whoever gets the most votes, not necessarily a majority.

    OTOH, when a strange occurrence can be explained plausibly or with a vast conspiracy theory, I put my money on the plausible explanation.

    The only way to determine beyond reasonable doubt whether fraud occurred would be a release of the exit poll data, and access to the ballots (and voting machines). Since access to the Florida 2000 ballots was granted to several news organizations, I know it’s possible. I don’t expect to find anything, but an audit is always a good idea.

  • Alex in Los Angeles

    Hi Max,

    Thank you for the dialogue. Don’t forget that there are two other factors to the exit poll discrepancy:

    1. Provisional votes and spoiled votes, once a recount is completed, might lower the discrepancy into the MOE.

    2. Systematic errors in the widely deployed electronic tabulation machines used throughout the country. A hand recount would again ameliorate that problem.

    Thanks!

  • Cranky Observer

    Well, my speculation is as good as anyone’s, so here goes: it all ties back to the mid-2004 instruction by Karl Rove that Bush volunteers submit copies of their church directories to the campaign. On Election Day, Rove got the same early exit polls as everyone else. After cautioning Bush that things weren’t going well (hence the long faces in Bushland in the afternoon), he got on the phone and triggered off the process of using those directories to flush out more votes, using the gay-bashing argument in particular. Successfully as it turned out.

    Cranky

  • Max Lybbert

    Well, Alex, I may not like it when you point out logic errors in my reasoning, or when you correct something based on my misunderstanding of a document, but I can eventually appreciate it.

    Thanks.

  • http://www.rezab.com reza behforooz

    Are exit polls distributed in a way that captures the actual voting population? I doubt it. We live in a country where zip code has the highest correlation with a voter’s decision (and, interestingly enough, with SAT scores too).

    To get correct exit polls, you need to get a really good sample normalized on voter population and previous voting patterns.

    If you put more exit polls in the cities, then you’ll get a biased result.

    -reza

  • Alex in Los Angeles

    Exit poll FAQ at Mysterypollster.com:

    http://www.mysterypollster.com/main/2004/11/faq_questions_a.html

    My understanding:
    1. Random precinct selection within each state, with each precinct’s chance of selection weighted by turnout history (see the sketch after this list).

    2. Each state is polled independently, with 1,469 precincts selected across all 50 states in total. That is about 30 precincts per state on average, but Ohio and the other important states got more, to increase accuracy in calling those states.

    3. A separate National exit poll is conducted using a different set of 250 precincts selected randomly with weighting by turnout history.
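
    A toy sketch of what turnout-weighted precinct selection might look like (the precinct names and turnout figures are invented, and NEP’s real procedure is certainly more involved):

        import random

        # precinct -> turnout in the previous comparable election (made-up numbers)
        precincts = {
            "Precinct A": 1200,
            "Precinct B": 450,
            "Precinct C": 3100,
            "Precinct D": 800,
            "Precinct E": 2200,
        }

        k = 2  # how many precincts to poll in this (tiny) hypothetical state
        names = list(precincts)
        weights = [precincts[name] for name in names]

        # Draw without replacement, with probability proportional to past turnout.
        chosen = []
        while len(chosen) < k:
            pick = random.choices(names, weights=weights, k=1)[0]
            if pick not in chosen:
                chosen.append(pick)

        print("Selected precincts:", chosen)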

    Basic Limitations:
    The margin of error of exit polls is in the range of 3% or less, but there are emerging limitations to exit polls, such as a declining response rate, which is now around 50%. Also, it is being investigated whether GOP voter non-response is greater than Democratic voter non-response, and why.

    Hope that helps.

  • Max Lybbert

    I hate to bring up an old subject, but I heard on NPR today that a lawsuit has been filed in Ohio over this argument that “the exit polls just couldn’t be that wrong.” Can anyone point me to a link where I can find more information?