The BBC Debate: absentees more influential?

Last night the BBC aired its debate of the challengers, as it put it, with leaders of the five opposition parties squaring up to each other. Prime Minister David Cameron and Deputy Prime Minister Nick Clegg did not participate, and the latter was at pains to point out that he wasn’t even invited.

There’s little doubt this wasn’t the biggest Twitter event of the election campaign, but nonetheless well over a thousand tweets per minute were recorded, and in total we collected 151,417 tweets surrounding the event. Most activity, understandably, came towards the end of the debate as each politician tried to leave viewers with their version of events:
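We don’t show the plumbing in these posts, but binning tweets into per-minute counts is straightforward; here’s a minimal sketch in Python, with made-up timestamps standing in for our collected data:

```python
from collections import Counter
from datetime import datetime

# Made-up ISO timestamps standing in for the collected tweets.
timestamps = [
    "2015-04-16 20:58:10",
    "2015-04-16 20:58:42",
    "2015-04-16 20:59:05",
]

# Truncate each timestamp to the minute and count occurrences.
per_minute = Counter(
    datetime.strptime(t, "%Y-%m-%d %H:%M:%S").strftime("%H:%M")
    for t in timestamps
)

print(per_minute["20:58"])  # 2
```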

Number of tweets per minute

The spike towards the end could perhaps be explained away by the three “major” parties going into spinning overdrive as the debate closed; this seems clearer looking at the numbers of tweets per party:

Number of tweets per party

The second Ukip spike, just after 8:30pm, appears to coincide with Nigel Farage’s attack on the audience both in the studio and at home, while nearer 9pm is when the debate moved to immigration; at this point Ukip were getting more than twice as many mentions on Twitter as any other party.

As Sylvia outlined in our last post after the seven-way debate, we’ve created our own sentiment index, and below we plot the index for each of the parties, including the two not participating in the debate:

Sentiment during #BBCDebate


What is perhaps most notable is that the index with the biggest range is the Conservatives’, despite David Cameron not participating. Just before 9pm, not long after the question on defence, Conservative sentiment is at rock bottom, but just before the end of the debate (perhaps co-ordinated?) Tory sentiment is soaring, although in the final minute Labour’s sentiment is almost identical. The SNP, widely noted for their social media campaigning, also show a late burst, although Sturgeon’s somewhat disappointing final comments appear to be reflected in the last-minute tail-off in sentiment.
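For the curious: we won’t reproduce our index here, but as a purely illustrative sketch (the lexicon and tweets below are invented, and our actual index is rather more involved than this), a crude per-minute, per-party sentiment score could be built like so:

```python
from collections import defaultdict

# Invented word lists and tweets, purely for illustration.
POSITIVE = {"strong", "great", "win", "good"}
NEGATIVE = {"weak", "bad", "lose", "awful"}
PARTIES = {"ukip", "labour", "snp", "tory"}

tweets = [
    ("20:58", "great answer from labour tonight"),
    ("20:58", "ukip looking weak on defence"),
    ("20:59", "strong close from labour"),
]

# Net sentiment per (minute, party): +1 per positive word, -1 per negative.
index = defaultdict(int)
for minute, text in tweets:
    words = set(text.split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    for party in words & PARTIES:
        index[(minute, party)] += score

print(index[("20:58", "labour")])  # 1
print(index[("20:58", "ukip")])    # -1
```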

Overall it’s clear that very little is clear regarding who “won” last night, and whether indeed it was one of the two parties that didn’t participate – at least in the televised debate…

Racism, Farage and Clarkson

The political story of the last 24 hours is clear: Ukip’s leader Nigel Farage would scrap racial discrimination laws in order to set free our employers from the shackles that bind them. Regardless of one’s feelings about this (Fraser Nelson thinks the SNP’s anti-rich bigotry is more appalling, while naturally the Huffington Post takes a different line), there’s little doubt it’s driven much content on social media in the last 24 hours; in the last hour alone, over a thousand tweets specifically mentioning Ukip have been sent.

Here in Reading we’re collecting election-related tweets, and so this seemed like a good opportunity to visualise what’s going on. Below is a word cloud composed of two types of words: the first set, in green, are terms such as party names, party references and other proper nouns; the second set are plain old words. The font size is dictated by the frequency of the word or term: bigger for more commonly found terms.
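As a rough sketch of those mechanics (the sample text, party-term list and font-size range below are all invented for illustration):

```python
from collections import Counter

# Invented party-term list; in the real cloud these are rendered green.
PARTY_TERMS = {"ukip", "labour", "tory", "conservative", "farage"}

# Invented sample of tweet text.
words = "ukip would scrap racial discrimination laws says farage ukip".split()
counts = Counter(words)

MIN_PT, MAX_PT = 10, 48  # font-size range in points
max_count = counts.most_common(1)[0][1]

# Scale font size linearly with frequency; colour party terms green.
for word, n in counts.most_common():
    size = MIN_PT + (MAX_PT - MIN_PT) * n / max_count
    colour = "green" if word in PARTY_TERMS else "black"
    print(f"{word}: {n} occurrence(s), {size:.0f}pt, {colour}")
```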

Wordcloud 12 March

Unsurprisingly Ukip figure prominently in the terms, but amongst the words we see: legislation, racist, racial, discrimination, equality, scrap, rid, laws, and Nigel.

One interesting word there is misrepresent; it’s often claimed that Ukip are misrepresented – could that be what’s happening here?

Another term, tucked away in small font is one that keeps rumbling along: Clarkson.

How Accurate are Constituency Polls?

An additional source of data for calibrating forecast models for the forthcoming general election is the sudden abundance of constituency-level polls, almost exclusively thanks to Lord Ashcroft. This is undoubtedly an awesome resource, but there are at least two problems:

  1. Some of them must be inaccurate, writes Stephen Tall: if we choose a 5% level of significance, roughly 1 in 20 statistical tests will produce an error, so roughly 1 in 20 polls, statistically speaking, will be wrong. Hence with close on 200 constituency polls thus far, we should expect around ten of them to be wrong – which ones, though?
  2. How do we calibrate constituency polls into forecast models? In order to do so, we need some historical precedent – a previous election, for example.
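The arithmetic behind the first point is a simple expectation; in Python:

```python
n_polls = 200       # close on 200 constituency polls so far
error_rate = 0.05   # 5% level of significance

# Expected number of "wrong" polls, and the chance that none are.
expected_wrong = n_polls * error_rate
p_all_right = (1 - error_rate) ** n_polls

print(expected_wrong)  # 10.0
print(p_all_right)     # about 3.5e-05 -- all but certain some are wrong
```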

As with Stephen Tall’s article, I don’t wish to diminish the importance, or the welcome addition, of Ashcroft’s polls. However, I do wish to dig a little deeper into both of these questions.

The only historical precedent we have for Ashcroft’s polls is by-elections, where we know the outcome. Wikipedia’s page on constituency polling can, with a little bit of pain, be turned into a usable spreadsheet and marshalled for this purpose.

There have been six by-elections for which constituency polling was carried out in this parliament: Clacton, Eastleigh, Heywood and Middleton, Newark, Rochester and Strood, and Wythenshawe and Sale East. For these by-elections we can plot the opinion poll vote share against actual vote share each party received in the by-election.

By-Election Opinion Polls and Outcomes

The 45-degree line represents a polling ideal: opinion poll vote shares exactly equal outcomes. Clearly this is unrealistic for every poll, but pollsters must aim to be near this line, assuming voting intent does not change between the polling date and election day. Points above the line show that a party got more votes on election day than it was polled to get, while points below suggest it got fewer.

Plots are undoubtedly informative, but quantifying potential biases needs more serious statistical work; a linear regression of by-election vote shares on poll shares can reveal the extent to which polls may be biased towards or against particular parties.
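As an illustration of what that regression does (with invented numbers standing in for the real by-election data), fitting actual share on polled share and reading the bias off the intercept might look like:

```python
def ols(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

# Invented poll shares (% points) with a built-in six-point shift,
# mimicking a consistent polling under-estimate.
polled = [25, 30, 18, 26, 32, 15]
actual = [p + 6 for p in polled]

a, b = ols(polled, actual)
print(f"intercept {a:.1f}, slope {b:.2f}")  # intercept 6.0, slope 1.00
```

A slope near one with a positive intercept reads as “actual share equals polled share plus a constant” – exactly the pattern of a systematic under-estimate.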

The purple dots above the 45-degree line are indicative of a downward bias in polls for Ukip’s vote share; linear regression analysis shows that this bias is significant, and amounts to about six polling points: Ukip’s actual vote share in these by-elections was six points more than it was polled to get. Hence pollsters under-estimated Ukip support. Equivalently, Labour’s red dots are generally below the line; pollsters over-estimated Labour’s vote share by three points in these by-elections.

Now, to some extent, it can be argued that by-elections are not representative of reality, since they often constitute protest votes by fed-up voters. And these two biases (the rest are insignificantly different from zero) certainly suggest a protest vote away from the major party (Labour) to the fringe party (Ukip). But were this the case, shouldn’t pollsters pick up this sentiment when polling likely voters?

Nonetheless, this mini-analysis does suggest that, by and large, constituency polling is accurate – deviations from the 45-degree line are mostly marginal (Labour and Ukip excepted)…