From the Washington Post, by way of OTB:
Breaking from two decades of tradition, this year’s election exit poll is set to include surveys of voters in 31 states, not all 50 as it has for the past five presidential elections, according to multiple people involved in the planning.
Dan Merkle, director of elections for ABC News, and a member of the consortium that runs the exit poll, confirmed the shift Wednesday. The aim, he said, “is to still deliver a quality product in the most important states,” in the face of mounting survey costs.
The decision by the National Election Pool — a joint venture of the major television networks and The Associated Press — is sure to cause some pain to election watchers across the country. (For a full list of the states that won’t have exit polls scroll to the bottom of this post.)
Voters in the excluded states will still be interviewed as part of a national exit poll, but state-level estimates of the partisan, age or racial makeups of electorates won’t be available as they have been since 1992. The lack of data may hamper election night analyses in some states, and it will almost certainly limit post-election research for years to come.
DC is excluded, in addition to the nineteen states in black on the map (in case you can’t see it, Rhode Island is one of them). Remember when I talked about pseudostates? It appears that the NEP has found them.
This list includes sixteen red states and four blue ones (including DC). They excluded as many of Obama’s states as they included where McCain managed a majority of the vote. This ought to raise some serious alarm bells. I am at a loss as to what, precisely, the methodology here is.
They’re not going for regional balance, as they are excluding the entirety of the south central states and north central great plains. They’re excluding both Mormon states, and both West Virginia and Kentucky.
We could cite diversity, but no Texas.
We could say that “Oh, gosh, those rural states are expensive…” and yet urban/suburban Utah is excluded while Vermont is included.
We could talk about competitive senate or gubernatorial races, but North Dakota is a tossup and was excluded.
Does this matter? I don’t know. But it sure seems to me that exit polls in non-swing states should matter or should not. If they do not matter, I really do not see much reason to include six of Obama’s top ten. If phone polls are good enough for Louisiana, they’re good enough for Maryland. If they’re worried about missing something in Maryland, they ought to be worried about missing something in Louisiana.
I’d like to be able to say “this may discredit exit polls into oblivion” and a part of me wouldn’t mind that for a variety of reasons. The other part of me loves data. Here at the League, we’ve had people sift through it and get some quite interesting tidbits.
So I hope that this is a mistake or misunderstanding or something.
I can find no cause for disagreement here — either do the polling right, or don’t do it at all.
I find eliminating Tennessee and Georgia from the polling particularly distressing. These are states with strong Democratic minorities and real battleground regions. To call them automatic wins for Republicans is to oversimplify considerably.
IIRC, there’s plenty of Democrats in Texas, too. And nothing but Democrats in Arkansas and West Virginia, due to the way those states’ laws and histories have evolved over time.
From a competitive standpoint, there is no way that you can look at Mississippi and Alabama as being more competitive than Georgia and Tennessee. Aside from the red-state skew, I am really at a loss as to where they are coming from here.
Texas is the real stunner. No, it’s not going to be competitive this election. But the demographic shift going on there makes it far, far more interesting than New York. At least as far as I’m concerned.
I just don’t get it.
Apparently the media hasn’t learned about unlimited toll free calling plans. If only they’d watch the commercials on their own networks.
Or I could wear my Simon Bar Sinister hat (and SBS was always going to be my nom de guerre) and say they are doing what they’re doing to try and “leak” the “results” early to have large numbers of voters in the Central time zones and west give up.
Actually, I think Tennessee is an automatic win for the Republicans, or at least for Romney. Nonetheless, the demographics, crosstabs, and margins will be extremely interesting and valuable information post-election.
For purposes of the Presidential race, I agree with you.
In two years, though, Governor Haslam (still seems weird to me to say that; he was mayor of Knoxville when I lived there and I was unimpressed one way or the other) is very clearly not a lock for re-election, even if he’s done a good job.
I’d love to know how much they are saving by ditching those states. In TV money I can’t imagine it being all that much. In any case elections are important enough to do correctly instead of semi assed.
I saw this on WaPo and it took me a while to parse out what’s going on here. The key is in this sentence: “Voters in the excluded states will still be interviewed as part of a national exit poll, but state-level estimates of the partisan, age or racial makeups of electorates won’t be available as they have been since 1992.”
So they’re still going to poll people from those states to get a read on the national popular vote total, but they’re not going to attempt to get a statistically valid read on each of those state-level races.
Let me amplify and parse this out a bit more. You know how polls are always preceded by a statement like, “We polled 1049 registered adults… and we have a margin of error of +- 3%”? The sample size is always about that large. Why? Well this goes back to my SQL training at AT&T (my first job out of college, when I was still an engineer). If your population size was 1, how many would you have to poll? That’s right, that one guy. If your population size was 2, again your sample size would have to be both of them. For p=10, maybe s=5; for p=100, maybe s=20, etc. As p gets larger, the required sample size, s, to reach the same level of statistical significance levels off asymptotically. A graph of s vs. p looks a lot like a logarithmic plot. At the p levels we’re talking about for presidential elections–hundreds of thousands to millions–the sample size is always going to be about 1000.
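To put rough numbers on that leveling-off, here is a minimal sketch using the standard textbook sample-size formula with a finite population correction. The function name and the defaults (95% confidence, a worst-case 50/50 split, a ±3% margin, simple random sampling) are assumptions for illustration, not anything the pollsters have published, and the small-population figures guessed at above will not match this formula exactly.

```python
import math

def required_sample_size(population, margin=0.03, z=1.96, p=0.5):
    """Simple-random-sample size needed for a given margin of error.

    Assumptions (illustrative only): 95% confidence (z = 1.96),
    worst-case proportion p = 0.5, and a finite population correction.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # The finite population correction shrinks n0 for small populations.
    n = n0 / (1 + (n0 - 1) / population)
    # Round off float noise before the ceiling; never exceed the population.
    return min(population, math.ceil(round(n, 9)))

if __name__ == "__main__":
    for population in (1, 2, 10, 100, 10_000, 1_000_000, 100_000_000):
        print(f"{population:>11,} -> {required_sample_size(population)}")
```

Under these assumptions the required sample climbs quickly for small populations and then flattens out a little below 1,070 once the population reaches the millions. Real exit polls sample clusters of voters at selected precincts rather than drawing a simple random sample, so actual designs will differ.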
But wait, there’s more! The population isn’t homogeneous. As a political scientist, you’re interested in breaking this down by things like age, race, sex, partisanship, etc. So that sample also has to contain sufficient numbers of each of those cohort groups. You may have to oversample a bit in order to make sure you have enough black, lesbian, Republicans over the age of 50 or male, married, Hispanic, Democrats between the ages of 25 to 35, etc. Without that concern, the sample sizes would likely be something like 500 or 750.
So if you want a national poll with a +-3% margin, you need about a thousand respondents. If you want a state level poll with the same margin, you need… wait for it… about a thousand respondents. So a complete electoral map needs about 50,000 respondents. Similarly, a demographically weighted Congressional map is going to need almost 500,000 respondents. Exercise for the reader: contemplate what this would mean if you wanted to figure out all the state legislatures.
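A quick way to see why subgroups drive the sample size up: the margin of error depends on how many respondents fall in the subgroup, not on the total. The sketch below assumes simple random sampling, 95% confidence, and a worst-case 50/50 split; the group names and shares are made up for illustration.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Approximate margin of error for a proportion from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

def subgroup_margins(sample_size, shares):
    """Margin of error for each subgroup, given its share of the sample."""
    return {group: margin_of_error(sample_size * share)
            for group, share in shares.items()}

if __name__ == "__main__":
    # Hypothetical shares of a 1,000-person sample (not real exit poll data).
    shares = {"everyone": 1.0, "group_a": 0.25, "group_b": 0.10, "group_c": 0.02}
    for group, moe in subgroup_margins(1000, shares).items():
        print(f"{group:>9}: +/- {moe * 100:.1f} points")
```

With these made-up shares, a group that is a tenth of the sample carries a margin roughly three times wider than the full sample does, which is the reason for oversampling small cohorts.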
Now I doubt if they’ve ever tried to get that detailed. It depends a lot on the underlying population composition and what questions they’re trying to answer. But you can see how this could get really expensive really quick. And ever since they (the pollsters) muffed the Dewey-Truman race, they’ve gotten pretty darned good at squeezing a lot of information out of a handful of data.
To get back to the OP, it looks to me like they’ve decided to limit their detailed polling to those states that are competitive in either the presidential or congressional races. For instance, Kansas is deep, Romney, red, and I doubt if the Senate race is particularly close (I don’t even know if we have a Senate race this time around) but we have a couple congressional districts (around Wichita and the Topeka/Lawrence corridor) that flip now and then. That and our demographics, particularly the Hispanic voters–natch, have been morphing rapidly.
Of the ten most competitive congressional races as defined by sfgate, three are in blacked out states. I would be surprised if Vermont, for instance, has a remarkably competitive race. Or Mississippi. North Dakota has a competitive senate race. Georgia has a tossup. Kentucky and West Virginia have leaners. (While we’re at it, Rhode Island actually has a competitive one apparently.)
Regarding the phone polling, what jumped out at me with regard to that is that they are apparently getting far less data with the phone poll than with the poll-place ones.
Regardless of the initial rationale, it needs to be rethunk. When you are bypassing entire subregions, I think that’s problematic. It leaves a significant amount of information on the table.
And I just for the life of me cannot figure out why you leave out Texas and North Dakota, both rather dynamic, in favor of Mississippi and Alabama. Why Vermont is valid, but Idaho is not. Maryland yes, Georgia no?
I didn’t fit this into the comment’s body, but my overall faith in exit polling is lower than yours.
Exit polling used to be better. Now we have the phenomenon of “red-shift,” for instance. You know, where states with electronic black-box voting show “inexplicable” shifts to the right relative to the exit polls. Hmmm….
But regardless, I suspect it’s a fairly straightforward, if complex, cost-benefit analysis. I bet if you could dig up which states do early voting and for how long as well as the stats on absentee balloting things might get clearer. Or not, IDK, just spit-ballin’ here. But they did mention that the traditional exit polls, i.e. standing outside the polling station and asking folks who they voted for, are having to be supplemented by phone polling to capture the early votes and that’s expensive. It also adds an additional layer of complexity and uncertainty to the weighting and needed massaging of data.
People bitch about polls, but if you take the job seriously, it’s HARD work to do right.
Oh, I’m not saying that I could do a better job of counting heads than they do. Merely that I think for a variety of reasons they are less reliable.
My theory, to the extent that I have one, is that there is a degree of bias involved. I don’t mean Evil Anti-Republican Bias. I mean, I think that they include New Jersey because they think it’s important. Georgia and Texas, less so. I think they more easily evaluate the importance of the northeastern states in comparison to their Great Plains brethren. I think if their formula led to their cutting some northeastern states, they’d take a second look at the formula. Here, minus a public outcry, they won’t.
Which could be all wrong. I am more inclined than many to put the human motivation in corporate action. Even when it leads to decisions I think are wrong-headed (or outright disastrous).
Maybe. But none of that explains Kansas. We’re totally in the tank for Romney, we have no Senate races, our governor’s not up for re-election yet, and none of our House races are listed as competitive by RealClearPolitics. We have early voting but I think it’s only for a couple weeks. And let’s be serious, how many “northeastern elites” give a flying eff for what happens in Kansas?
The only thing I can think of that makes us interesting is the influx of Hispanics to mainly the southwest quarter of the state. So wrt your comment about Texas’ demographics… maybe they have more, but we’re on the leading edge of a wave. Or something. Maybe.
I say the above based on “… state-level estimates of the partisan, age or racial makeups of electorates won’t be available …” So apparently they have some black-box algorithm or decision tree that incorporates demographics, competitiveness, and cost. It’s not that I have “faith” in them so much as I assume they’re professionals that have some idea what they’re doing. I also assume they would rather not be making these kinds of cost-cutting compromises.
I also suspect that if you could get someone from the org to sit down and explain it to you, it would be so much inside baseball that you may not actually know more coming out than going in. Like sitting in on a staff meeting at CERN.
There really isn’t much of any explanation for including Kansas that I can find. Or any of the other McCain states that they are keeping in there (except Missouri and Montana, where he won very narrow pluralities). I have to stress again that I can almost fit the number of McCain states that are included on a single hand. Even if it was inadvertent, it should be considered a problem if we’re trying to understand the nation.
I’m not saying that there wasn’t a formula they used. Rather, I have a nagging suspicion that they accepted the states that were spit out of it too easily (in a way they couldn’t have if more interesting states had been excluded) or alternately that the formula spit out more than 19 states and they specifically chose not to exclude New Jersey and such because, hey, how can we exclude New Jersey?
BTW, the real tragedy is that Will is going to have to work harder to find Monday Trivia questions! 😉
I do have to confess that the trivia guy in me is saddened to see any of this be discontinued.
(BTW, I didn’t say so, but I do appreciate the perspective above and the background and such.)
If I were going to guess, I’d guess that the choices have something to do with predictability. I can think fairly quickly of two ways to approach that problem.
The easy approach might be: From the historical data, find the states where the detailed exit poll information is least important (in a statistical sense) to predicting election results. While academic researchers might like to get complete data, the people who are paying the bill are interested in beating each other to accurate election night predictions.
A more complicated approach might go like this: From the historical data, find the single state where, given that state’s broad exit poll numbers, and the other 49 states’ detailed exit poll numbers, you can most accurately predict the single state’s detailed numbers. Repeat the process for two states, three states, etc, until either (a) you’ve saved enough money on polling expenses or (b) your predictions of the detailed numbers become unacceptably inaccurate. So a Texas or a Georgia or a Nebraska would be excluded, not because of any particular political situation, but because their detailed numbers can be accurately estimated from other numbers that are still being collected.
The latter would require immense computational resources — it’s a combinatorial search problem. OTOH, immense computational resources have become quite cheap.
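As a rough illustration of the second approach, here is a minimal greedy sketch. It is not the NEP's method, which isn't public as far as this thread knows, and it sidesteps the full combinatorial search by dropping one state at a time. Everything in it is an assumption made for illustration: the function names, the use of a single "detailed number" per state per past election, the one-predictor least-squares fit, and the toy data.

```python
from statistics import mean

def fit_error(target, predictor):
    """Mean squared error of the least-squares line target ~ a + b * predictor."""
    mx, my = mean(predictor), mean(target)
    sxx = sum((x - mx) ** 2 for x in predictor)
    sxy = sum((x - mx) * (y - my) for x, y in zip(predictor, target))
    b = sxy / sxx if sxx else 0.0
    a = my - b * mx
    return mean((y - (a + b * x)) ** 2 for x, y in zip(predictor, target))

def best_error(state, kept, history):
    """Smallest error achievable predicting `state` from any one kept state."""
    return min(fit_error(history[state], history[k]) for k in kept)

def choose_exclusions(history, max_excluded, tolerance):
    """Greedily drop states whose numbers the remaining states predict well."""
    kept = set(history)
    excluded = []
    while len(excluded) < max_excluded and len(kept) > 1:
        candidates = []
        for state in sorted(kept):
            remaining = kept - {state}
            # Every dropped state, old or new, must stay predictable.
            worst = max(best_error(s, remaining, history)
                        for s in excluded + [state])
            candidates.append((worst, state))
        worst, state = min(candidates)
        if worst > tolerance:
            break  # (b) predictions would become unacceptably inaccurate
        kept.remove(state)
        excluded.append(state)
    return excluded, kept

if __name__ == "__main__":
    # Toy history: one detailed number per state over five past elections.
    history = {
        "A": [10, 12, 14, 16, 18],
        "B": [20, 24, 28, 32, 36],  # moves in lockstep with A
        "C": [5, 9, 4, 11, 6],
        "D": [30, 25, 33, 27, 31],
    }
    print(choose_exclusions(history, max_excluded=2, tolerance=0.5))
```

In the toy data, A and B track each other, so one of them can be dropped and reconstructed from the other, while C and D cannot be dropped within the tolerance. A real version would use many numbers per state and several predictors at once, which is where the computational cost comes from.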
The “predictability” would carry more weight with me if it didn’t include some really, really predictable blue states (and a few red ones). I was also keeping an eye on regionalism for this reason and would have been more understanding if they thought they could extrapolate the findings from North Dakota into South Dakota, or extrapolate Idaho from its surrounding states (since Idaho is such a mess).
You might be able to get a glimpse at Texas without exit polling Texas if you’re exit polling Oklahoma, Arkansas, and northern Louisiana… but all of those states were excluded. When you’re excluding entire subregions, it sure seems to me that you’re seriously hindering your ability to get a grasp on the states because you can’t even extrapolate from similar voters elsewhere.
(I’d also add that Utah actually isn’t very predictable this time around, except that Romney will win. Will Romney get 75% of the vote? It’s not impossible. Or will he be hindered by inward migration? It’s a state with some changing dynamics. And, of course, there is a competitive House race there.)
I think Rod’s got a point above, but this is one of those moments where regardless of the underlying reason, this is going backwards.
It’s immensely important to increase total information regarding the electorate in the U.S. Our political process – especially in the information age – can be hugely informed by better data analysis all the way down to that local representative level. We should be adding more polling on more regional and local races, not reducing polling on the national level.
I get that this might be an economic decision, but seriously, fire some of those dorks at CNN and put their salaries to work, here.
So evidently, the TV networks are making $5.2 billion on TV advertising, their coffers are enhanced by 23%, and they can’t afford national state-by-state exit polls?!
As I’ve been driving across the states of TX and OK on I-40 today, I’ve had a fair amount of time to think about this:
1. This isn’t about polling in general. There’s still Zogby and Rasmussen and Gallup and Quinnipiac and ABC, NBC, CBS, Fox, etc., etc., etc.
2. This is a specific poll conducted by one specific organization for one specific purpose: to predict the winners on election night prior to the official results being posted. Prior to the formation of the National Election Pool each news organization would conduct their own exit polling. In order to save money those self-same news organizations decided to go in together on the data collection effort and then independently conduct their own analysis and predictions on that data.
3. The methodology has always relied upon a great deal of interpolation and extrapolation. Predicting the results in precincts A, B, and C, based on samples from precincts D and E for instance. I’m not going to pretend to understand all the statistical methodology behind this, but it’s surprising how much information you can pull from such a sparse data set.
4. If you read the press release carefully, they’re not saying that they aren’t going to be exit polling at all in those 19 states. They’re saying they won’t be exit polling at a sufficient level to pull out “state-level estimates of the partisan, age or racial makeups of electorates,” which is a far cry from what some of you seem to be assuming here. You’re still going to get the election night predictions based on “5% precincts reporting and exit polling” sort of thing. They just won’t be able to say exactly how the Hispanic or female vote is trending.
5. The academic value of the NEP exit polling data lies in it being this fairly consistent and comprehensive series of nation-wide snapshots taken at precise two-year intervals. That considerably simplifies the data analysis. This change doesn’t mean comparable data won’t be available. It just won’t be available from one source in a nice, neat, package and it will take more work to massage the data to answer specific questions. I don’t know anything about how and under what arrangements NEP makes its data set available to academics, but maybe these poli-sci researchers could pony up a few bucks to make up the difference in what the networks are willing to pay for?
6. This isn’t the only, or even the best, source of underlying demographics. You may recall we had a big, nationwide, survey only two years ago called the Census. Combining relatively fresh census data with the 2008 and 2010 exit polling data is likely what makes this reduction in data collection this time around statistically feasible.
7. Finally, wrt Kansas. As I noted previously, and as Will reiterated, there doesn’t at first glance seem to be much reason to include Kansas while knocking out other states. I suspect if you could look at the underlying data from the past five election cycles you would see correlations in the detailed demographic data between KS and the other prairie states north and south. It’s also notable that KS is sandwiched between CO to the west and MO to the east. MO and CO are really dissimilar while OK through the Dakotas are really similar. My guess is that by including KS they get crucial data that allowed them to knock out 2 or 3 or even more other states and still retain the same overall level of demographic “best guess-itude.”
1. Yeah, but this is the data for the most important sample set there is: those that actually vote.
2. That’s not the only reason for it, though. They also rely on this data to talk about the elections afterwards. Rather than some statistical magic, this list might simply be indicative of the states that the news outlets feel they are going to want to talk about after the election. Kansas’s inclusion may simply be due to What’s The Matter With, or maybe Kansas City’s newspaper (Yes, I know the principal KC is in Missouri, but my understanding is that it’s a bi-state media market) has some serious pull. The absence of Texas remains baffling under this view, but including New Jersey while excluding Texas makes sense.
3. We’ve also learned of the limits of this extrapolation. With an Excel spreadsheet, I was better able to determine that calling Florida for Bush was a mistake than the networks apparently were.
4. I haven’t seen the press release. It’s better if they’re at least doing some polling in the states, but the inability to provide actual data for each state remains problematic. The inability to state where certain votes in certain states are trending is a pretty big deal, regardless of whether the state is “called” correctly (I have no doubt that Vermont will be called correctly, but it’s still good to have data available – now if only we were also getting it for South Dakota). According to the New York Times, the national precinct poll includes only 350 precincts nationally. Not confidence-inspiring for having any local data whatsoever.
5. The massaging is going to include guesswork. The lack of a uniform dataset remains problematic. You can investigate data for South Dakota by looking at polls taken there, but the actual data for election day voter information is going to be even more hindered than it already is. As I say above, I don’t consider this just a problem for future political science PhD’s. This sort of thing will almost certainly have an effect on how elections are covered. There is already a significant dearth of knowledge in precisely the states that are being excluded.
6. The census does a good job of telling us who lives where, but not necessarily of who is going to vote. Who votes from one election to the next is constantly shifting.
7. As I say above, Kansas’s inclusion may have little statistical rationale at all. Or maybe it’s along the lines that you’re talking about. But a glance at the voting patterns between North Dakota and Kansas demonstrates some significant differences. More broadly, one of the first things I did when I compiled the above map was look for the possibility of interstate extrapolation.
Instead, entire subregions are left out.
The only southern states that were included were Mississippi, Alabama, and some eastern swing states. Which is almost comical, because you might actually be able to extrapolate back and forth between Mississippi and Alabama, but both are included. You could maybe shift between Texas, northern Louisiana, Arkansas, and Oklahoma… but all are excluded. The only state you could shift onto Utah is Idaho, which is also excluded. You could pin down Idaho if you had eastern Washington, eastern Oregon, and Utah, but Utah is excluded (and Oregon is always going to be suspect).
Maybe Kansas was picked as a cross-representative state. But that doesn’t make it any easier to understand a good deal of the rest of the map. It looks to me to be more designed to exclude regions rather than trying to exclude certain states for which they can gather intel from other states.
“The only southern states that were included were Mississippi, Alabama, and some eastern swing states. Which is almost comical, because you might actually be able to extrapolate back and forth between Mississippi and Alabama, but both are included.”
I’ve hung out with professional statisticians most of my adult life, and on occasion, been required to pretend I was a professional statistician because no real statistician was available. A not uncommon conversation with non-statistician experts (let me call them “management” for this exercise) in some field goes like this:
Statistician: Here’s your model.
Management: But you didn’t include factor X!!
Statistician: That’s because factor X has little or no predictive power here.
Management: But it has to! It’s the key factor in the entire situation! Everyone knows that!
Statistician: The data doesn’t support that view.
I think you’re in the position of management here. In effect, you’re saying that it’s not possible to make accurate predictions about the election results in Tennessee from the gross voting results in Tennessee plus detailed results from Alabama, Mississippi and Missouri. The statisticians are saying, by the nature of the experimental design here, that it is possible. Or at least that, within the budget constraints they were given, Alabama is a good state for predicting results in other states and Tennessee is a good state for using out-of-state data for predictions. I am reasonably sure that the statisticians have a big pile of historical data to support their case.
I sympathize with the position that having consistent data collection across the 50 states is valuable, particularly for academics, but would argue that if that’s so, then the federal government and not the networks ought to be picking up the tab.
One thing I do know is that when you get into a complex experimental design problem like this one, with lots of historical data to use in considering a variety of factors, the results can often be non-intuitive, or even counter-intuitive.
Michael, there are two main purposes for exit polls. The first is to be able to call states. The second is to be able to look at demographics and how they voted after the election.
In the first case, this isn’t going to change much. None of the states removed from consideration are likely to alter it that much because, well, none of them are competitive. Past experience tells us most of what we need to know about these states. If the NEP were to say “We just can’t afford the state demographics angle anymore for non-competitive states”… I’d actually understand.
But here we are with Vermont, New Jersey, Maryland, Washington, Oregon, and so on… there is greater predictive value in which states were excluded from further inspection based on whether it’s a Republican state than whether it’s a competitive one.
And when looking at the individual states, for the second purpose, there are notable differences between Arkansas vs Mississippi and Kansas vs North Dakota. How do I know this? Because previous exit polls tell me. Now, the only way to really look at Arkansas is through phone polls and such independent of this. Because you certainly can’t extrapolate from Mississippi. And there is no way to dig down for Utah, because Idaho is the only state of any comparability.
I also consider the notion that this is Journalists vs Academics to be flawed. I’m relatively certain that the pollsters and political reporters are looking at exit polling from 2004 and 2008 to gauge what the demographic turnout is going to be. If Texas is indeed going to become more competitive due to demographic changes, that’s going to be hard to track without knowing how many Hispanics voted in Texas this cycle and how many old whites did that might not be around for the next election. This is a big deal journalistically, or ought to be.
Also, to accurately report on the next Arkansas Senate race, it would help for there to be data from Arkansas. You’re not going to get it from Mississippi. Of course, the alternative is to kinda ignore the race. Which, hey, is kind of like North Dakota this year. It’s their prerogative to do that, but there are grounds for objection here.