
Sunday, December 18, 2016

Quebec: Léger or CROP, or both.

Hello,

Following the recent debate over the differences between the latest Léger and CROP polls, I offer a few analyses that help clarify the picture. I analyze all the polls published by the two firms since June 2014: 26 for CROP and 19 for Léger. I have no intention of pronouncing on "who is right". I believe that both firms try to do their work as well as possible and have far more to gain from publishing good estimates than from supposedly leaning toward their alleged friends. Moreover, the firms are no doubt best placed to examine their own data and see whether their methods need correcting.

Note that both firms use web panels, but their recruitment methods differ: Léger has its own panel, while CROP uses a provider, Research Now. Each method has its advantages and disadvantages. Finally, remember that, since these are non-probability samples, the margin of error does not apply. There is, however, a tendency in this case to speak of a credibility interval, which is computed like a margin of error and at least allows us to talk about possible error.
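To make this concrete, here is a minimal sketch of that computation; the 95% level and the sample size of 1,000 are illustrative assumptions, not figures from either firm.

```python
import math

def credibility_interval(p, n, z=1.96):
    """Half-width of a 95% interval for a proportion p measured on n
    respondents. The formula is the same as the classical margin of error;
    for non-probability web panels it is read as a credibility interval."""
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative values: a party at 35% in a poll of 1,000 panel respondents.
half_width = credibility_interval(0.35, 1000)
print(f"35% +/- {100 * half_width:.1f} points")  # about +/- 3.0 points
```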

How have voting intentions evolved as measured by the two firms?

The following graph shows the evolution of voting intentions since June 2014. It shows that, if we take the polls of both firms into account, support for the PLQ dropped by five points during the last months of 2014 and then stabilized at 35%. Support for the PQ, meanwhile, rose by 10 points, from 20% to 30%, between June 2014 and June 2015, then declined to about 27%. As for the CAQ, its support fell by ten points in 2014-2015, from 30% to 20%, and has since climbed back to 25%, very close to the PQ's support; the two parties appear to be trading voting intentions with each other. Finally, support for Québec solidaire rose slowly over this period but has now returned to its June 2014 level.

We do, however, see some "atypical" estimates in these distributions. For example, in March and May 2015, two estimates (red dots) put the PLQ below 30%, and several polls put the PQ near 35% during the same period. We also see, in 2015, exceptionally high estimates for Québec solidaire, above 15%, and exceptionally low ones for the CAQ, at 16%-17%. Looking at the last two months, we see PLQ estimates that are exceptionally high (38%) or low (30%-31%) and PQ estimates ranging from 24% to 30%. Are such differences usual, and are they associated with particular firms?




Are there differences between the firms?

The next graphs compare the two firms' estimates for each party. I begin with the Parti libéral du Québec. The graph shows that the two firms painted nearly identical pictures of voting intentions until very recently. Note also that CROP's estimates for the PLQ are sometimes lower than Léger's: 29% in March and May 2015, when Léger had the PLQ at 36%-37%.

Finally, looking at the last four estimates, the two firms were entirely in line until the last two polls, which pull the curves apart: upward for CROP, downward for Léger. It would take only two new polls with similar estimates, however, to conclude that PLQ voting intentions have in fact been stable since the beginning of 2015. There is no statistically significant difference between the two firms' estimates over the period as a whole.




Let us turn to the Parti Québécois. We see the same features as in the preceding graph: CROP sometimes estimates PQ support higher than Léger and vice versa, and until the beginning of 2016 the two firms' estimates were identical. It is mostly the last two CROP polls that pull the firm's PQ estimates downward. Note also that the October CROP poll putting the PQ at 30% was atypical compared with the firm's other polls but similar to Léger's estimates. The fact remains that, since the beginning of 2016, Léger's estimates have been rather systematically four to five points higher than CROP's. Statistically speaking, however, over all the polls, there is no difference between CROP's and Léger's estimates.


As for the Coalition Avenir Québec, the situation is different: the two firms give identical estimates, with little variation between them, as the following graph illustrates. For both firms, voting intentions for the CAQ declined after the 2014 election and then began climbing again in 2016. They seem to have plateaued recently, however.


Finally, here is the graph for Québec solidaire, a party that flies somewhat under the radar given its low voting intentions. We see a fairly systematic difference between the two firms until very recently; it is in fact the only party for which there is a statistically significant difference between the firms. For Léger, voting intentions for QS are stable at 10% over the period, whereas for CROP they are higher but have been declining recently.


At CROP, there is a significant negative relationship between voting intentions for the PLQ and those for Québec solidaire, whereas this relationship is weaker and not significant at Léger. Conversely, at Léger, the correlation between PLQ and PQ support is strong, negative, and significant, whereas it is not significant at CROP. Finally, both firms show the same strong negative correlation between PQ and CAQ support.
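For readers who want to reproduce this kind of check, here is a minimal sketch; the file name and column names are hypothetical, not the actual dataset used here.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical file: one row per poll, with the firm name and one column
# of estimates (in %) per party.
polls = pd.read_csv("quebec_polls.csv")  # columns: firm, PLQ, PQ, CAQ, QS

for firm in ["CROP", "Leger"]:
    sub = polls[polls["firm"] == firm]
    for a, b in [("PLQ", "QS"), ("PLQ", "PQ"), ("PQ", "CAQ")]:
        r, p = pearsonr(sub[a], sub[b])
        print(f"{firm}: r({a}, {b}) = {r:.2f} (p = {p:.3f})")
```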


Conclusion

Contrary to what one might have thought, the difference between the two firms is minimal and not systematic, as one pollster has in fact noted. On the other hand, the significant difference in support for Québec solidaire, which persisted for almost the entire period, is worrying and no doubt calls for further analysis. We should perhaps also question the proportional allocation of undecideds practiced today and consider returning to the non-proportional allocation practiced until 2002.

What should the average citizen, the interested observer, or the passionate partisan do with this contradictory information? Only one solution for now: be patient and wait for the next polls. If the situation persisted, the two firms would no doubt have to examine their methods, their data, and their weighting schemes carefully to make sure that every decision taken is appropriate. It remains far preferable for the two firms to publish their estimates even when they differ than to see them adjust their estimates to align with each other, as is suspected in some countries. That is a sure recipe for a forecasting disaster.

Thursday, November 10, 2016

Could we have forecasted accurately?

Hi everybody,

For francophone readers: this post is in English, but texts and interviews in French are planned over the coming days. Here is the link to the article published in La Presse, Les méthodes de sondage et leurs limites, and here is the link to the interview with Anne-Marie Dussault (24/60) last Monday evening presenting my predictions (the first graph of this post): Entrevue avec Anne-Marie Dussault, 24/60, Monday, November 7, 2016.

I will start by bragging a bit... Here is the graph that I presented on a TV program on Monday night (in French), which means that I have proof of it (mind you): Interview with A.-M. Dussault, RDI, November 7, 2016. This graph shows voting intentions when I attribute 67% of the undecideds in each poll to Trump and 33% to Clinton. It forecasts Clinton ahead of Trump by one point, an almost perfect forecast (both candidates are one point too low, the difference going to Others).
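For the curious, here is a minimal sketch of that allocation; the poll figures in the example are made up for illustration.

```python
def allocate_undecideds(clinton, trump, others, undecided,
                        to_trump=0.67, to_clinton=0.33):
    """Non-proportional allocation: give a fixed share of the undecideds to
    each main candidate and none to the others. The 67/33 split is the one
    used in this post; it is a correction device, not a claim about how
    undecideds actually split."""
    return (clinton + to_clinton * undecided,
            trump + to_trump * undecided,
            others)

# Illustrative poll: Clinton 45%, Trump 41%, others 8%, undecided 6%.
print(allocate_undecideds(45, 41, 8, 6))  # (46.98, 45.02, 8)
```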


Why perform this non-proportional distribution of non-disclosers' preferences? I explained it elsewhere. It gave a perfect forecast of the Brexit results when I attributed the bulk of the undecideds to the appropriate side, i.e., the Leave side in that case. However, this time, I also have to thank Bryan Breguet from the site "Too close to call" and Tamas Bodor, from the University of Wisconsin, who sent me an email saying "this is the perfect storm for a spiral of silence". He wrote about it in IJPOR: The Issue of Timing and Opinion Congruity in Spiral of Silence Research: Why Does Research Suggest Limited Empirical Support for the Theory? I borrowed the term "toxic climate" from him, if I remember well.

Now what? Were there other intriguing figures that should have tipped us off?
In the preceding graph, we see that every point lost by "others" was taken by Trump. The others lost a third of their support, from 12% on September 1st to 8% on election day, according to the polls. They finally got 5% of the vote, the remaining three points going mostly to Trump.


What about the modes? Two remarks. In the following graph, you see that while the telephone polls, with or without an interviewer, show a "bump" in support for Clinton in October, during the period when the debates were held, no such bump is measured by the Internet polls. How come? We do not know the answer yet, but we will have to look into it and see whether it is a recurrent phenomenon. It may be that Internet polls relying on panels use samples that are more homogeneous. It may also mean that there was no real bump in support for Clinton.


Finally, the IVR/online polls tended to show Trump ahead almost all the way. No herding here: they stood by their numbers. My conclusion was that they were outliers. What if, in fact, they had Trump higher because they phone only landlines, which make up 80% of their sample? This is consistent with the fact that Trump was quite strong in rural areas, where cell phones are used less than in cities. Perhaps the other polls have proportionally too many cell phones in their samples; since most people who have a cell phone also have a landline, they have a greater chance of being contacted. In short, urban people have a greater chance of being selected.

Where have all the margins of error gone?

I have 21 polls in my database for the last week (I included the tracking polls only once per fieldwork period and did not include the LA Times polls). If you compute the margin of error for the difference between two proportions, you will realize that 18 of these 21 polls were within the margin of error for such a difference. This means that every time these polls were published, there should have been a mention, a RED ALERT, by the pollster and/or the media stating: "According to the margin of error, or credibility interval, there is no difference between the two candidates." Right now, the information is there, but nobody talks about what it means. The aggregators and analysts may say that a very substantial majority of the polls had Clinton winning, but most individual polls showed a tie. If this had been stated loud and clear, the population would have been accurately informed.
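As a sketch, here is that computation, using the standard formula for the variance of the difference between two proportions estimated from the same sample; the poll figures are illustrative.

```python
import math

def moe_difference(p1, p2, n, z=1.96):
    """Margin of error for the difference between two proportions from the
    SAME poll (multinomial sample):
    Var(p1 - p2) = (p1 + p2 - (p1 - p2)**2) / n."""
    return z * math.sqrt((p1 + p2 - (p1 - p2) ** 2) / n)

# Illustrative poll: Clinton 47%, Trump 44%, n = 900.
lead = 0.47 - 0.44
moe = moe_difference(0.47, 0.44, 900)
print(f"lead = {100*lead:.0f} pts, MOE of the difference = {100*moe:.1f} pts")
# A 3-point lead inside a ~6.2-point margin: a statistical tie.
```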

What about the likely voter models? They are a black box. Happily, some pollsters publish two estimates, one for registered voters and another for likely voters, which allows comparison. Perhaps we should always ask for these two estimates in order to analyze the impact of using different kinds of models. The current situation reminds me of the French presidential election of 2002, where each pollster had its own recipe (but almost all the recipes gave exactly the same estimate!).

Finally, what about last-minute changes?

This is usually "the explanation" put forward by the pollsters. In this case, however, it may be true that there was some last-minute change in which voting intentions for the other, smaller candidates went to Trump. This is even more likely since the majority of that support was for Johnson.

In conclusion

I think we had some means to forecast what was happening. The problem is that it is easy to say so afterwards. Anyhow, it takes polling errors to drive improvement, though they are not fun at all. The AAPOR ad hoc committee will certainly have the cooperation of all pollsters in its quest to understand what happened and, hopefully, to recommend improvements. One open question, however, concerns the work of aggregators and analysts: with the type of analyses that we use, can we easily point to last-minute changes?


Acknowledgements: Luis Pena Ibarra is responsible for validating and entering the data, conducting the analyses that produce the graphs and editing the graphs.


Methodology for this analysis.

1) Non-proportional attribution of preferences to non-disclosers was used in Quebec in the 1980s and 1990s. It was proposed by Maurice Pinard of McGill University and validated by Pierre Drouilly of UQAM. It was used by pollsters at the time to compensate for the fact that the PLQ, a centre-right party, was always underestimated by the polls. In the 1995 referendum on sovereignty, 75% of non-disclosers were attributed to the No side, and this gave a perfect prediction (50.5-49.5).

2) The estimation presented is not an average, weighted or otherwise. It is produced by a local regression (Loess). This analysis gives more weight to data points that are close together and less to outliers. It is rather dependent on where the series starts. I started this series with the polls conducted after September 1st, which means that all the polls conducted since then influence the trend. I try to balance the need to have enough polls for the analysis against the importance of not letting old information influence the trend too much. (A sketch of this pipeline follows the list.)

3) Every poll is positioned at the middle of its fieldwork, not at the end. This seems more appropriate since the information was gathered over a period of time and reflects the variation in opinion during that period.

4) The data used come from the answers to the question about voting intention for the four candidates (and others).


5) Tracking polls are included only once per fieldwork period: for example, a poll conducted over three days is included once every three days. In this way, we include only independent data, which is appropriate in statistical terms.

6) I do not include the LA Times polls, for two main reasons. First, only one sample is interviewed, always the same one; if this sample is biased, all the polls are biased. Second, the question used asks respondents to rate the probability that they will vote for each candidate. It is well known that such probabilities do not usually add up to 100% unless this is forced, and we may suspect that a share of them cluster around 100 or 0. We do not have the distribution of these probabilities, only their average. In my view, there is not enough information to include this poll, the question asked differs too much from other polls to be compared with them, and the sample is akin to a sample of professional respondents, which is problematic.

7) For Canadians, note that, in the USA, IVR cannot be used to call cell phones. This is why these pollsters use web opt-in for part of their sample (20% in the case of Rasmussen).
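For readers who want to see how items 2), 3) and 5) fit together, here is a minimal sketch of the pipeline. The file and column names are hypothetical, the tracking-poll thinning is a simplified reading of item 5), and statsmodels implements Loess with a tricube kernel rather than the Epanechnikov kernel mentioned elsewhere on this blog, so the trend would differ slightly.

```python
import pandas as pd
from statsmodels.nonparametric.smoothers_lowess import lowess

# Hypothetical columns: 'start' and 'end' (fieldwork dates), 'pollster',
# 'tracking' (True for tracking polls), 'clinton' (estimate in %).
polls = pd.read_csv("us_polls.csv", parse_dates=["start", "end"])

# Item 3: position each poll at the middle of its fieldwork.
polls["mid"] = polls["start"] + (polls["end"] - polls["start"]) / 2

# Item 5 (simplified): within each pollster, keep a k-day tracking poll
# only once every k releases; assumes releases are sorted by date and
# that a pollster is either always or never tracking.
def thin_tracking(group):
    if not group["tracking"].iloc[0]:
        return group
    k = max(int((group["end"] - group["start"]).dt.days.iloc[0]) + 1, 1)
    return group.iloc[::k]

polls = (polls.sort_values("mid")
              .groupby("pollster", group_keys=False)
              .apply(thin_tracking))

# Item 2: local regression (Loess) of support over time.
x = polls["mid"].map(pd.Timestamp.toordinal).astype(float).to_numpy()
trend = lowess(polls["clinton"].to_numpy(), x, frac=0.65)  # (x, fit) pairs
```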



Friday, November 4, 2016

USA2016: What's happening? What about the "undecideds"?

Hi,

In this post, after presenting the usual graphs of the likely change in voting intentions since the beginning of September, I will focus on the "undecideds" and show what happens if I allocate them differently from what is usually done.

For francophone readers: this post is in English, but texts and interviews in French are planned over the coming days. Here is the article published in La Presse: http://plus.lapresse.ca/screens/d6ab6a79-5aa8-40c3-b840-731fdc3e9246%7C_0.html

First, as I explained in the preceding post, the method I use depends on the period included in the analysis. Since there was movement in the polls recently, I dropped the polls conducted in August to ensure that the estimation is sufficiently driven by recent polls. The graphs and analyses now start on September 1st. I have 141 polls in the database for that period (polls published up to yesterday, November 3). I include the tracking polls only once per fieldwork period, and I do not include the LA Times polls, for a number of reasons (see the methodology at the end).

The first graph traces the support for Clinton, Trump, and others, excluding the undecideds. This is equivalent to a proportional attribution of undecideds, which is what all pollsters seem to do. The graph shows that Clinton is ahead of Trump. However, support for Clinton appears stable since the second debate, while support for Trump has increased at the expense of the other candidates, mostly Johnson. This analysis puts support for Clinton at close to 49% and Trump's at close to 45%. All the recent polls have Clinton ahead of Trump. The race is tightening, according to the polls, because the decrease in support for others is helping only Trump. Note that support for Clinton is now higher than it was at the beginning of September; this is confirmed by the regression analyses that I performed.


If I analyze only the support for the two main candidates, we see that Clinton's share first increased and then decreased in October, so that her support now stands somewhat above 52%, slightly more than at the beginning of September. No poll has shown Trump ahead of Clinton in the last two weeks.



Finally, there is still a significant difference between modes of administration. The green line shows that polls using an IVR/online methodology, mostly Rasmussen, give a systematically lower estimate of support for Clinton.

On this question, I performed a series of regression analyses, controlling for time and time squared. They show that, after controlling for change over time, both web polls and live phone polls give on average around 2 points more to Clinton. However, I remembered that in previous elections questions had been raised about tracking polls, so I also checked for a difference between tracking polls and other polls. The conclusion: all else being equal, tracking polls estimate support for Clinton 0.8 point lower than other polls.
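Here is a minimal sketch of such a regression, with hypothetical column names; this is one plausible specification, not necessarily the exact one used for the figures above.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical columns: 'clinton' (support in %), 'days' (days since
# September 1), 'mode' (Web / LivePhone / IVR_online), 'tracking' (0 or 1).
polls = pd.read_csv("us_polls.csv")

# Quadratic time trend plus dummies for mode and tracking status.
model = smf.ols("clinton ~ days + I(days**2) + C(mode) + tracking",
                data=polls).fit()
print(model.summary())
# The C(mode) coefficients give each mode's average difference after
# controlling for the trend; 'tracking' corresponds to the 0.8-point gap.
```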


Now, what about the "Undecideds"?

The current situation is that "nobody cares" about the undecideds, and all the pollsters and analysts attribute them proportionally to the candidates. Mark Blumenthal presented an analysis of the undecided and uncertain in the campaign here. It shows that undecideds tend to be slightly more Republican or to lean Republican, and to be less likely to approve of Obama's job performance. However, this analysis is based on one series of polls conducted with one methodology. An analysis of all the polls shows that the proportion of undecideds has decreased a little over time, from close to 8% at the beginning of September to around 5% last week. The most important information, however, is that the proportion of undecideds varies by mode of administration. Since the beginning of September, it has averaged 8.2% for web polls, 5.9% for IVR/online polls, and 4.4% for live phone polls. It also varies substantially between pollsters: from 2% (AP-GfK) to 13% (Zogby) for web polls, from 5.3% (Rasmussen) to 11% (Survey USA) for IVR/online polls, and from 1.3% (CNN) to 12% (Princeton Surveys) for live phone polls. This means that the proportion of undecideds is a methodological feature more than a "real" proportion of likely voters who do not know whom they will vote for.
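Such a breakdown is a simple grouped average; here is a sketch, again with hypothetical file and column names.

```python
import pandas as pd

# Hypothetical file: one row per poll, with 'pollster', 'mode'
# (Web / IVR_online / LivePhone) and 'undecided' (in %).
polls = pd.read_csv("us_polls.csv")

print(polls.groupby("mode")["undecided"].mean())  # average by mode
print(polls.groupby(["mode", "pollster"])["undecided"].mean().sort_values())
```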

In previous elections and referendums, I used a non-proportional attribution of non-disclosers (including undecideds and respondents stating that they would not vote). I could rely on what had happened in Quebec in the 1995 referendum and in most elections (see Durand, Blais and Vachon (2001) in POQ on the Quebec election of 1998). For the Scottish referendum and the Brexit referendum, I showed that attributing 67% of the non-disclosers to the conservative side gave a better forecast than a proportional attribution. Note that this does not mean that I hypothesize that 67% of the non-disclosers are on the conservative side. I use this procedure as a simple and empirically validated way to compensate for differences between methodologies (house effects) and for a possible underrepresentation of the more conservative respondents. These respondents may be less likely to be part of the samples, less likely to cooperate with pollsters, and less likely to reveal their vote. This is consistent with the "spiral of silence" hypothesis put forward by Elisabeth Noelle-Neumann a long time ago.

Without any cue as to the best allocation in the US case, I opted for consistency and gave 67% of the non-disclosers to Trump, 33% to Clinton, and none to the other candidates (it is well known that small candidates are almost never underestimated by the polls). The next graph shows what this means in terms of estimation. I want to stress that this would be the most pessimistic scenario for Clinton's supporters. Clinton still appears to be ahead of Trump, but only by 2 points.

Conclusion

This election will be very interesting in terms of the analysis of electoral polls. We tend to think that the more "toxic" the climate in which an election occurs, the more likely a spiral of silence, in which some respondents with specific characteristics will not participate in polls or will not reveal their preference. It is not clear that we have this situation here. Are Trump supporters less likely to reveal their vote? Perhaps not, but we may think that they are less likely to cooperate with an "institution" like pre-election polls. Participation in the election may also vary. And we should not forget that one mode of administration is pushing the estimates of Clinton's support down.


Acknowledgements: Luis Pena Ibarra is responsible for validating and entering the data, conducting the analyses that produce the graphs and editing the graphs.

Methodology for this analysis.

1) The estimation presented is not an average, weighted or otherwise. It is produced by a local regression (Loess), which gives more weight to data points that are close together and less to outliers. It is, however, rather dependent on where the series starts. I started this series with the polls conducted after September 1st, which means that all the polls conducted since then influence the trend. I try to balance the need to have enough polls for the analysis against the importance of not letting old information influence the trend too much.

2) Every poll is positioned at the middle of its fieldwork, not at the end. This seems more appropriate since the information was gathered over a period of time and reflects the variation in opinion during that period.

3) The data used come from the answers to the question about voting intention for the four candidates.


4) Tracking polls are included only once per fieldwork period: for example, a poll conducted over three days is included once every three days. In this way, we include only independent data, which is appropriate in statistical terms.

5) I do not include the LA Times polls, for two main reasons. First, only one sample is interviewed, always the same one; if this sample is biased, all the polls are biased. Second, the question used asks respondents to rate the probability that they will vote for each candidate. It is well known that such probabilities do not usually add up to 100% unless this is forced, and we may suspect that a share of these so-called probabilities cluster around 100 or 0. We do not have the distribution of these probabilities, only their average. In my view, there is not enough information to include this poll, the question asked differs too much from other polls to be compared with them, and the sample is akin to a sample of professional respondents, which is problematic.

6) For Canadians, note that, in the USA, IVR cannot be used to call cell phones. This is why these pollsters use web opt-in for part of their sample (20% in the case of Rasmussen).

Friday, October 28, 2016

USA2016: Is the race tightening? It's all about mode

Hi,

In this post, I examine whether the race between Clinton and Trump is really tightening, as some suggest these days.

First, a word for francophone readers: sorry, this post is in English only, as is always the case when I analyze polls conducted in an anglophone country. I would like to have the time to translate, but I do not.

I first show our estimation of the progression of the race. The methodology used here differs from that used by other analysts; I explain the differences at the end of this post.

Some pollsters ask two questions related to voting intentions, one listing the four candidates and one asking for a preference between Clinton and Trump only. All the data analysed here are based on the first question. The first graph shows the change in voting intentions since August 1st, 2016. The vertical lines indicate the three debates. The graph shows that, since the beginning of October, there are almost no polls in which Trump has more support than Clinton; this would appear as red dots (support for Trump in a poll) higher than blue dots. There is indeed some variation in the polls, and some polls between the two debates showed Clinton very high (three blue dots between 52% and 55%). This may have led some to believe that the gap between Clinton and Trump was widening substantially. But these polls seem to be outliers, or related to specific news about Trump during that period, as can be seen from all the other polls, which are rather close in their estimation.

Therefore, the estimation from all the polls is now Clinton at 49%, Trump at 42%, and others at 10%. This estimation differs somewhat from others, probably because of methodological features (see the methodology at the end). Notice, however, that the lines illustrating the estimation are regression lines that give more weight to data points that are close to one another and less to outliers.

If we use the same data to compute support for Clinton versus Trump only, i.e., each candidate's share of the sum of their support, we get the following graph. This gives us Clinton at 54% and Trump at 46% of the total support for the two of them.
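The computation behind this graph is straightforward; here is a sketch with illustrative figures.

```python
def two_way_share(clinton, trump):
    """Each candidate's share of the two-candidate total, as a proportion."""
    total = clinton + trump
    return clinton / total, trump / total

# Illustrative estimates: Clinton 49%, Trump 42% (others excluded).
c, t = two_way_share(49, 42)
print(f"Clinton {100*c:.0f}%, Trump {100*t:.0f}%")  # Clinton 54%, Trump 46%
```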

We clearly see, after mid-October, three red dots positioned above the 50% line, i.e., showing support for Trump higher than support for Clinton, and two other red dots showing support for Trump at 49%. Do these polls have specific characteristics that would explain their estimates? This is what we examine in the next section.

It's all about modes

The next two graphs show, for Clinton and for Trump, the estimation of their support (as a share of the sum of their support) traced by the polls according to the mode of administration: web, live phone, or IVR/online. We focus on support for Clinton, Trump's being the exact mirror image. As we can see, the estimation lines for Clinton traced by web and live phone polls (in blue) are almost identical. The green line represents the estimation traced by polls combining IVR (interactive voice response) to households with landline phones and web interviews with respondents who do not have access to a landline and who are members of an opt-in Internet panel. These polls give a quite different estimate of support for Clinton, usually systematically lower than the other polls. Three pollsters use this technology: Survey USA, Public Policy Polling (PPP), and Rasmussen, all of which use a likely voter model. However, only Rasmussen estimates that Clinton's support is lower than Trump's. Since Rasmussen conducts a tracking poll and publishes 3-day estimates every day, analysts who include all of its releases pull the average of the polls downward. In our case, we include tracking polls only once per "period" in order to include only independent data in the analysis (see the methodology).

In short, without the Rasmussen polls, the trend for Clinton is still going up, though it has probably reached a plateau. Her share of the support for the two main candidates is estimated at close to 55% by the web and live phone polls but at only 50% by the IVR/online polls, this low figure being due solely to Rasmussen's estimates, however.




Conclusion

The race does not seem to be tightening right now. There are some outliers that put Clinton somewhat higher, but the main "problem" lies with the Rasmussen estimates, which depart seriously from other pollsters, including those using the same methodology.


Acknowledgements: Luis Pena Ibarra is responsible for entering the data, conducting the analyses that produce the graphs and editing the graphs.

Methodology for this analysis.

1) The estimation produced is not an average, weighted or otherwise. It is produced by a local regression (Loess), which gives more weight to data points that are close together and less to outliers. It is, however, rather dependent on where the series starts. For example, I started this series with the polls conducted after August 1st, which means that all the polls conducted since then influence the trend. My first graphs started on June 1st; if I still started at that date, the trend would be different because of the influence of the polls conducted in June and July. I try to balance the need to have enough polls for the analysis against the importance of not letting old information influence the trend too much.

2) Every poll is positioned at the middle of its fieldwork, not at the end. This seems more appropriate since the information was gathered over a period of time and reflects the variation in opinion during that period.

3) Tracking polls are included only once per fieldwork period: a poll conducted over three days is included once every three days. In this way, we include only independent data, which is appropriate in statistical terms.

4) The data used come from the answers to the question about voting intention for the four candidates. Undecideds (non-disclosers) are attributed proportionally, for now.

5) For Canadians, note that, in the USA, IVR cannot be used to call cell phones. This is why these pollsters use web opt-in for part of their sample (20% in the case of Rasmussen).


Tuesday, June 28, 2016

Brexit: why and how were we misled? The modes again

Hi,

Since Friday, besides being angry at having failed to forecast the likely result, I have been asking myself how this happened. I thought it required further analysis. What about modes?


Well, the difference between modes is one clear reason why I, and others, I guess, were misled. Contrary to what we expected based on previous referendums and elections, the web polls' estimates of the Leave side, which we could call the "status quo ante" side, were higher than the telephone polls' estimates. As we can see in the following graph, during the campaign, web opt-in polls tended to put Leave at 50%. Telephone polls' estimates were well behind at the beginning of the campaign, five points lower, and around two points lower at the end. On average, taking change over the campaign into account, web polls put the Leave side 3.3 points higher than telephone polls.


However, there is substantial variation between estimates at the end of the campaign. Some web polls estimated support for Leave as high as 55% before Jo Cox's death and as low as 45% after. The situation is similar for telephone polls: after estimating an advantage for Leave before Jo Cox's death, estimates went below the 50% line following her death. On average, in the polls, Jo Cox's death is followed by a drop of 2.9 points in support for Leave.

Three conclusions arise from this analysis. First, the web polls generally gave a better estimation of the situation than the telephone polls. We tended to believe that web polls generally give higher estimates to the more liberal side (as in the Scottish referendum of 2014, the UK 2015 general election, and many Canadian elections). However, in the US presidential campaign of 2012, web polls tended to produce lower estimates for Obama than telephone and IVR polls (see here). We may now conclude that web polls generally give different estimates during the campaign, but that they are not necessarily always biased in the same political direction.

Second, it is possible that people who were for Leave became even less likely to say so after Jo Cox's death. In short, maybe Jo Cox's death did not influence the vote but only the tendency to reveal a vote for Leave. This would give weight to the idea that social desirability, or the spiral of silence, played a role in this campaign and that it was the Leave side that tended to hide its preferences.

Third, and very importantly, in the UK, this is the third time in a row (Scotland 2014, UK 2015, Brexit 2016) that the estimates from the different modes differ substantially during the campaign but converge to similar average estimates at the end. This is mind-boggling. I know that some researchers and pollsters explain it by "herding", but there is no proof of that, quite the reverse in fact. Although the averages tend to be similar, there is much variation in estimates at the end of the campaign; it would be quite a coincidence for pollsters using the same mode to agree on an average! In France in 2002 (see Durand et al., 2004), some pollsters told me how they "chose" the estimates they published. It would be "interesting" to get some insider information in the British case as well.




With a non-proportional attribution of non-disclosers

Now, when I attribute 67% of non-disclosers to the Leave side, as I did in my last post and as I should have done all along, the web poll estimates show the Leave side ahead throughout the campaign, whereas the telephone polls did so only at the end. With this non-proportional attribution, however, both modes give a very good estimate of the final vote, although some pollsters fared better than others.



In conclusion,

Pollsters conduct polls, and their estimates are published in the media. The role of researchers is to analyze the polls and report on their likely bias. However, when bias is not systematic, this is very difficult to do. We need to understand when and why polls go wrong, and when and why estimates differ according to the mode of administration. To do so, however, we need to know where the numbers we are working with come from and how they are compiled, weighted, and adjusted. Since polls may influence the vote in some situations, maybe the data of published polls should be made available to researchers during electoral campaigns. Sturgis et al.'s analysis of the UK 2015 polls is a good example of what can be done when researchers have access to the data.

Friday, June 24, 2016

Brexit: we should have known better

Hi,

Well, Quebec voted No to leaving Canada, Scotland voted No to leaving the UK, and then the UK votes to leave the EU. It is a bit ironic when you see it from Quebec, although Europe is not a country per se.

So what happened?

I am not sure that the pollsters failed that much. However, I think the analysts, and I include myself, failed. And I know how and why I failed. At the beginning of the campaign, a colleague of mine, Henry Milner, told me that, in this referendum, the status quo side might be the Leave side. His argument was that older people tended to vote Leave, that they were raised in a country that was outside the EU, and that they might want to go back to "normality".

So what is the consequence in terms of analysis? The "Law of even polls", which states that when the race is even, the status quo or more conservative side is likely to prevail, still applies... but you have to know which side is the status quo! I should have known better. I amend my law, adding this by-law: if you want to know which side is the status quo, look at how older people vote. They are the ones who win elections. People between 18 and 34 years old form less than 20% of the population and an even lower proportion of the voters.

So, here is the graph I get when I attribute 67% of non-disclosers to the Leave side instead of the Remain side (the reverse of what I had done so far). I get a perfect prediction of the results.

In conclusion

A number of analysts, journalists, and pollsters noticed during the campaign that older people clearly favored the Leave side. This should have rung alarm bells and led us to conclude that, in a very close race, the Leave side was likely to win. In my case, I should have listened and attributed two thirds of the non-disclosers to the Leave side instead of Remain. With this procedure, the prediction is perfect.

Thursday, June 23, 2016

Brexit, an update that changes things a bit

Hi,

I was not planning to post an update unless there was some substantial change. Since yesterday, we have added 10 new polls: those published since Tuesday and the SurveyMonkey polls that were on the lists we consulted.

With these new polls, the situation is somewhat different.

The first graph shows change over time, with non-disclosers included. It shows a tendency towards a decrease in the proportion of non-disclosers (mostly undecideds). It also now shows clearly that the tendency is towards an increase in support for Remain.


The second graph shows the estimates when non-disclosers are attributed proportionally. Even with this type of allocation, Remain is now ahead of Leave.




The final graph shows even more clearly that the Remain side is ahead of Leave. With this allocation, all the polls give a majority to Remain except one that puts the two sides at par, and the gap between the two is now estimated at 4.5 points.


Conclusion

With these new results, it is possible to conclude that the fatal shooting of Jo Cox probably had an impact on the campaign. It is rather clear from the second graph that most of the polls conducted before the shooting gave an advantage to the Leave side and, on the contrary, most polls conducted after it give an advantage to Remain. In the last graph, which uses a non-proportional allocation of non-disclosers, this is less clear; nonetheless, the only estimates that gave a majority to Leave came before the shooting. More sophisticated statistical analysis will allow us to validate this conclusion, or not.

With these new results, we may conclude that Remain is likely to end up with a clear advantage of at least four points. Like everybody else, I am eager to see the final results.

P.S. Thanks to Luis Pena Ibarra, who retrieved the data and did most of the graphs for this campaign.

Wednesday, June 22, 2016

Brexit, the day before

Hi,

In this last analysis before election day, I use only the polls conducted during the campaign, i.e., from April 15 to June 20. Any polls published since then are unlikely to change much of what we see now; however, if there are such polls, I may update this post during the day. I first look at the change in support overall, and then at the different portraits traced by the two modes of administration, telephone and web opt-in.

Change in support

The first graph shows the estimates of the different pollsters. It shows that the two sides are very close to each other. It also shows that the proportion of non-disclosers, including undecideds and, for pollsters who keep them in the samples, those who say they will not vote, is quite stable. However, this proportion varies greatly between pollsters, from 3% to 26%, so it is not appropriate to look at the Remain and Leave estimates without attributing these non-disclosers so that the proportions of Remain and Leave add up to 100%.







The next graph shows change over time when non-disclosers are attributed proportionally to each side for each poll. This is the procedure used by all pollsters except for one recent BMG telephone poll; I will come back to this question later on. The portrait that emerges is that positions have "crystallized" since the end of May. Since then, support for Leave appears to be somewhat higher than support for Remain. Note also that the ceiling was reached not after the shooting of MP Jo Cox but well before.

It is interesting to point out that the same situation occurred in Scotland during the referendum on independence. You can see in my last post of that campaign that support for both sides had also reached a ceiling close to 50% in the last weeks of the campaign. In Scotland, however, it was slightly more favourable to the status quo.



However, what Scotland, and Quebec 1995, also show is that a proportional attribution of non-disclosers is likely to overestimate support for change. For example, in Scotland, a non-proportional attribution of 67% to the No side gave an estimation that was still a few points lower than the results of the referendum. You can see this analysis in my post Scotland, the day after.

I used the same non-proportional attribution of non-disclosers for the Brexit. One pollster, BMG Research, used the same attribution for its telephone polls (not its web polls). The pollster states that it asked a number of questions (which ones, we do not know) that led it to conclude that this allocation was the appropriate one. You may look at the BMG report here. In addition, this post by Elections Etc. shows that polls almost always overestimate change.

The following graph shows the likely change in support over time using the non-proportional attribution. Remain appears to be about two points ahead of Leave. In fact, with this allocation, there was only a short period last week when Leave was ahead of Remain. The last polls tend to show Remain ahead, at least when we use the non-proportional attribution of non-disclosers.




By mode

Is the portrait traced by the two modes of administration the same? Not exactly. The next two graphs show the change over time in support for Remain, using either the proportional or the non-proportional attribution of non-disclosers. The two graphs show that the portrait differs by mode. They also both show that telephone polls tended to estimate support for Remain five points higher than web opt-in polls at the beginning of the campaign, a discrepancy reduced to two points at the end. With proportional attribution, telephone polls put support for Remain at 50% and opt-in web polls at 48%; with non-proportional attribution, the respective estimates are 52% and 50%. This means that the global estimates depend in part on the proportion of web versus telephone polls conducted, so that weighting according to mode of administration, as Number Cruncher does, is not a bad idea.
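One simple way to balance modes, sketched below with hypothetical file and column names, is to average within each mode first and then across modes; Number Cruncher's actual scheme may well differ.

```python
import pandas as pd

# Hypothetical columns: 'mode' (telephone / web) and 'remain' (support in %).
polls = pd.read_csv("brexit_polls.csv")

by_mode = polls.groupby("mode")["remain"].mean()  # telephone and web averages
mode_balanced = by_mode.mean()                    # each mode counts equally
print(by_mode)
print(f"mode-balanced estimate: {mode_balanced:.1f}%")
```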




Conclusion

It is interesting to notice that, as with the Scottish 2014 and Quebec 1995 referendums, the Change side had momentum during the campaign but reached a ceiling in the last two weeks (or the last few days, in the Quebec case). It seems clear that referendum campaigns do make a difference. I leave it to political scientists to analyse why and how.

The fact that the two modes of administration do not give the same estimates, not only of the level of support but also of change over time, is problematic. It is even more problematic since, in small markets, the only polls conducted are often web opt-in polls. We will see tomorrow which mode led to better estimates. Nonetheless, there is an urgent need for research on ways to improve the samples and estimates of polls if we do not want polls to mislead voters.

Will Remain win tomorrow? Like many others (see Elections Etc., for example), I think that it will. First, I think a non-proportional attribution of non-disclosers is more realistic and appropriate than a proportional attribution. Second, my own analysis is that the "Law of even polls" applies: when the polls show two sides at par, the status quo side is likely to win, as was the case in recent elections (Israel and the UK, for example). If Remain does not win, I will have to modify the Law to take exceptions into account and figure out, in the end, why this campaign was an exception.

I will have a last post on Friday to compare estimates and results.

Until next time

Tuesday, June 14, 2016

Brexit: It's all about modes?

Hi,

We are about to enter the last week of the campaign. I will present an update of last week's analysis but, in this post, I will mainly focus on the major differences between modes.

First, an update

As we can see in this first graph, the progression of the Leave side continued last week. However, it is important to notice that the Stay side remained stable; its support did not decrease. What happens is that the progression of the Leave side seems to come almost entirely from a decrease in the proportion of respondents who say that they do not know how they will vote or that they will not vote. This proportion, as noted in a previous post, varies substantially, from 3% to 15% during the last week.



If we attribute non-disclosers, i.e., don't knows and will-not-votes, proportionally, as all the pollsters do, we get the following graph. We see that support for Leave has now passed support for Stay, as others have shown.



However, as I explained in preceding posts, it is empirically much sounder to attribute non-disclosers non-proportionally, attributing more of them to the status quo side, in this case Stay. If I keep the same non-proportional attribution that I used before, 67% to Stay and 33% to Leave, I get the following portrait of the situation: support for Stay is slightly ahead of support for Leave, by about three points (it was five points last week).

However, it is very relevant to ask whether the portrait traced by the polls is the same for telephone and web opt-in polls.

It's all about modes

The first question is whether there is a difference between modes, controlling for change over time. To check this, I ran a regression. The conclusions are:
  • As we can easily see from the shape of the curves in the preceding graphs, change over time has the form of an inverted U (quadratic).
  • For Stay, with proportional attribution of non-disclosers and after taking change over time into account, web polls give on average 5.1 points less to Stay than telephone polls. Mode, by itself, explains 45% of the variation between polls (which is huge!).
  • Using non-proportional attribution, the difference between modes is partly corrected. Support for Stay in web opt-in polls is 3.66 points lower on average than in telephone polls, and mode explains 26% of the variation between polls.
The second question is whether the two modes trace the same portrait of change over time. The simple answer is no. The first graph shows the change in support for Stay, with proportional attribution of non-disclosers, according to mode. It shows that, while telephone polls have estimated a steady decrease in support for Stay since the beginning of 2016, the web opt-in polls trace quite a different portrait: it is only recently that they show a small decrease. These quite different portraits nevertheless converge to a similar estimate, close to 50%, in the last few days. The same thing happened, in a way, in the Scottish referendum, where the difference between modes disappeared in the last weeks before election day.


If we use the non-proportional attribution of non-disclosers, the portrait is similar but the endpoint estimate is slightly different, at 51.5%. However, since the non-proportional attribution corrects for some of the differences between modes, there is no difference left in the estimates according to mode.



Conclusion

Although the polls using different modes of administration do not trace the same portrait of change in support over time, it seems that, in the end, they tend to agree. So, as of now, we do not have to start a battle over who is right.

It seems to me that referendums on "independence" look somewhat alike, if one compares Quebec 1995, Scotland 2014, and the current Brexit. During the campaign, the "change" side always gains support and, in the days before the election, comes close to 50 percent. In Quebec and in Scotland, the "Law of even polls" was respected: when the two sides are at par, the status quo side is likely to win. Why is that? We may speculate. It is possible that people who are for the status quo are less inclined to reveal their preferences to pollsters or are less present in the samples. It is also possible that some people who favour change are afraid of what could happen if change wins by a tiny margin; they may therefore change their minds at the last minute. Anyhow, it is easier and less consequential to tell a pollster that you are going to vote for change than to do it for real. And "the message" is sent to leaders nonetheless.

Another nine days to go to see whether what happened in Quebec and Scotland will happen with the Brexit. We know, however, that the situation is somewhat different, in particular in the sociodemographic profile of the two sides' supporters. Support for change in the current campaign comes more from older people, who tend to turn out in larger proportions.



**Notice on methodology: In the graphs, each point represents a poll estimate positioned at the middle of the fieldwork; lines represent the likely change in support estimated using Loess (Epanechnikov, 0.65).

For methodologists and other interested people

A question to ask is whether there is more variation according to mode and whether there is variation within modes. The next graph shows a box-and-whiskers plot of the variation by mode in support for Stay, with proportional attribution of non-disclosers. The graph again illustrates that support for Stay is estimated higher by telephone polls. However, there is not much difference between modes in the level of variation, and not that many polls differ significantly from other polls using the same mode. Two poll estimates by Survation are significantly higher than the other web opt-in polls, and one YouGov poll is somewhat lower. Among telephone polls, ICM and ORB each have two polls that are somewhat low.
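Here is a sketch of how such a box-and-whiskers plot can be produced, with hypothetical file and column names.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical columns: 'mode' (telephone / web) and 'stay' (support for
# Stay in %, proportional attribution of non-disclosers).
polls = pd.read_csv("brexit_polls.csv")

labels, groups = zip(*[(m, g["stay"].to_numpy())
                       for m, g in polls.groupby("mode")])
plt.boxplot(groups, labels=labels)
plt.ylabel("Support for Stay (%)")
plt.title("Variation in Stay estimates by mode of administration")
plt.show()
```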



A similar graph done with estimates using the non-proportional attribution of non-disclosers shows a similar portrait. However, this procedure reduces the variance among telephone polls and now shows two Ipsos MORI polls and one ComRes poll somewhat higher than the other telephone polls.


A general conclusion from this analysis would be that the major difference between modes lies in the estimates, in this case the median estimate, not in the variation. And there is not much difference within modes either.

Thursday, June 9, 2016

To Brexit or not to Brexit,...

Hi everybody,

Welcome to my first analysis of the polls on the Brexit. I will perform the same analysis as for the Scottish referendum, using graphs of local regressions. I will look at the likely change in support for the Brexit and at the differences between modes.

First, here is the graph that takes into account all the polls conducted since January 2016. The dots represent poll estimates. The lines represent the estimation of change using local regressions (Epanechnikov kernel, bandwidth 0.65, for the specialists).

The graph shows that the two sides are now practically at the same level according to the published polls. It also shows that the proportion of non-disclosers, including the undecideds and those who say they will not vote, has decreased since March, from around 17% to 11%. It is the Leave side that has gained most from this decrease; the proportion of Stay supporters has remained the same over the period.


However, the graph also lets us see that the proportion of non-disclosers, the dots in the graph, varies greatly, from 4% to 30%. This proportion varies by pollster, from an average of 4.7% for ORB to 27.8% for TNS, and by mode: 16.8% for the web polls, 10.2% for the telephone polls. Note that the proportion of non-disclosers was not published for three ORB polls. Since this would have biased the analyses, I attributed a proportion of 5% of undecideds to these ORB polls and modified the proportions of Stay and Leave accordingly.
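Here is one plausible reading of that adjustment, sketched as code; the figures are illustrative, not the actual ORB numbers.

```python
def impute_non_disclosers(stay, leave, assumed_nd=5.0):
    """For a poll that did not publish its share of non-disclosers, assume
    a share (5% here, as in the post) and rescale the published Stay and
    Leave figures so that the three categories sum to 100."""
    scale = (100.0 - assumed_nd) / (stay + leave)
    return stay * scale, leave * scale, assumed_nd

# Illustrative poll published as Stay 51%, Leave 49%, no non-disclosers.
print(impute_non_disclosers(51, 49))  # (48.45, 46.55, 5.0)
```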


The following graph illustrates the change in support when undecideds are allocated proportionally to each side, which is the usual procedure for all the pollsters. The portrait is much the same as in the preceding graph: the two sides are at par, with a possible tiny advantage for Stay.


For the Scottish referendum, I had suggested that a non-proportional attribution of non-disclosers be used, as was done for the Quebec 1995 referendum. I had proposed attributing 67% of the non-disclosers to the No side and 33% to the Yes side. This procedure produced a very good prediction: I had predicted a difference of at least 7 points between the two sides, and it ended up at 10 percentage points. The argument here is not that the non-disclosers really split in these proportions. This procedure is a way to correct for a number of phenomena. Partisans of the status quo are likely to be underrepresented in the samples, since they are generally older and harder to contact, even more so in web polls. They are also likely to be less prone to answer polls and, when they do, to reveal their vote. In addition, the fact that the proportion of non-disclosers varies between pollsters means that it is a feature of the methods used more than the real proportion in the population. Using a non-proportional attribution means that the higher the proportion of non-disclosers, the higher the share attributed to the status quo. Empirically, for the polls conducted in 2016, there is a positive correlation between the proportion of non-disclosers and the proportion of Leave supporters, which tends to justify the non-proportional attribution.

One could argue that the situation differs from the Scottish referendum since, for instance, older people were more likely to support the No side in Scotland, whereas it is the opposite for the Brexit: older people seem more likely to support the Leave side. However, this may be partly due to a paradox whereby older people who are for Leave are more likely to answer polls.

Since I have no theoretical or empirical justification for changing the attribution I used in the Scottish referendum, I decided to use the same one. Here is the graph I get using this procedure. The two sides are now about five points apart, which is, I think, more realistic.


In conclusion, it will be very interesting to follow the campaign over the next two weeks. My next post will deal with the substantial differences in the portraits traced by web polls compared with telephone polls.