Sampling checks in GE 2015

The Elections Department released early indications of polling results for the first time during the 2015 General Elections by publicly announcing the result of sampling checks within 2 hours of the close of polls. Sampling checks have been carried out since 2001 or earlier, but the results were not made known to all candidates or the public until this year. The sampling check is carried out by Counting Assistants drawing a sample of 100 ballots from each counting table after opening the ballot boxes and mixing the ballot papers together. As each counting table corresponds to one polling district and there are about 2,000-3,000 voters per polling district, the 100-ballot sample per polling district corresponds to roughly 3-5% of votes cast and is a large enough sample to make a good estimate of the final polling result.

As shown in Table 1, the margin of error varies from ±1.2% points for a large GRC, Pasir Ris-Punggol with 66 polling districts, to ±4.4% points for Potong Pasir, the smallest Single Member Constituency (SMC) with only 5 polling districts. For simplicity, no adjustments are made to account for the variation in number of voters in each polling district, and the error margin is calculated for a 50:50 vote split. The Straits Times has reported that the error margin for sampling counts is ±4% but this is a worst case and only applies to Potong Pasir, which is the smallest SMC by far. On average, the uncertainty in the sampling check is within 1.5% points for GRCs and 3.3% points for SMCs.

Table 1 – Error margins for selected constituencies and SMC and GRC averages

| Constituency | Number of polling districts | Sample size | Margin of error at 50% vote share |
|---|---|---|---|
| Pasir Ris-Punggol GRC | 66 | 6,600 | 1.2% |
| Aljunied GRC | 50 | 5,000 | 1.4% |
| East Coast GRC | 32 | 3,200 | 1.7% |
| Bukit Panjang SMC | 11 | 1,100 | 3.0% |
| Potong Pasir SMC | 5 | 500 | 4.4% |
| GRC Average | 45 | 4,500 | 1.5% |
| SMC Average | 9 | 900 | 3.3% |

Error margin is calculated as half the width of the two-sided 95% confidence interval (Normal approximation) at 50% vote share. All polling districts within a constituency are assumed to have an equal number of voters.
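
The margins in Table 1 follow from the usual normal-approximation formula for a proportion. A minimal sketch, assuming 100 ballots per polling district and a 50:50 split as stated in the table note (the function and constant names are illustrative only):

```python
from math import sqrt
from statistics import NormalDist

BALLOTS_PER_DISTRICT = 100  # sample drawn at each counting table


def margin_of_error(polling_districts, share=0.5, confidence=0.95):
    """Half-width of the two-sided confidence interval, as a fraction."""
    n = polling_districts * BALLOTS_PER_DISTRICT
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(share * (1 - share) / n)


for name, districts in [("Pasir Ris-Punggol GRC", 66), ("Aljunied GRC", 50),
                        ("East Coast GRC", 32), ("Bukit Panjang SMC", 11),
                        ("Potong Pasir SMC", 5)]:
    print(f"{name}: ±{100 * margin_of_error(districts):.1f}% points")
```

Under these assumptions the printed values should match the last column of Table 1 after rounding to one decimal place.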

In a first-past-the-post election, however, what counts is not the actual vote share but crossing the 50% threshold. If we ignore three-cornered fights and make use of the normal approximation again, we can calculate a “victory threshold” or minimum sampling check result for which a candidate can be 95% certain of victory. This is shown in Table 2 for selected constituencies. Note that the victory threshold is slightly lower than would be obtained if we simply added the error margin from Table 1 to 50% because the victory threshold is based on a one-sided rather than a two-sided test. If you’re not statistically inclined, don’t worry about it – that effect is small in this case. In Aljunied, despite the appearance of a nail-biting finish, the 52:48 sampling check result was above the victory threshold for the Workers’ Party and the PAP in fact had only a 0.2% chance of winning in that constituency once the sampling check result was known [See note 1]. In Punggol East, however, the sampling check result of 51% for PAP was lower than the victory threshold of 52.4% and the PAP could only be 75% certain that they would win there. Put another way, Lee Li Lian still had a 25% chance of winning even after seeing the sampling check results. In all the other constituencies, the winning party’s sampling check result significantly exceeded the victory threshold so the final result was not in doubt once the sampling checks were completed.

Table 2 – Victory threshold for selected constituencies.

| Constituency | No. of polling districts | Victory threshold | Actual PAP share in sampling check | Probability of PAP win given sampling check result |
|---|---|---|---|---|
| Pasir Ris-Punggol (6-member GRC) | 66 | 51.0% | 73% | 100% |
| Aljunied (5-member GRC) | 50 | 51.2% | 48% | 0.2% |
| East Coast (4-member GRC) | 32 | 51.5% | 61% | 100% |
| Sengkang West (SMC) | 13 | 52.3% | 63% | 100% |
| Punggol East (SMC) | 12 | 52.4% | 51% | 75% |
| Hougang (SMC) | 9 | 52.7% | 42% | 0.0% |
| Potong Pasir (SMC) | 5 | 53.7% | 68% | 100% |

1. Victory threshold is defined as the minimum sample count result for which a candidate can be 95% certain of receiving over 50% of the actual vote (one-sided test).
2. Probability of PAP win is the probability that the actual PAP vote share is > 50%, given the observed sampling check result.
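
The two quantities defined in the notes to Table 2 can be sketched as follows. This is only one plausible reading of the method: it assumes 100 ballots per polling district, a two-way fight, and that the uncertainty around the observed sample share is normal with the usual binomial standard error. The function names are illustrative and not taken from the original spreadsheet.

```python
from math import sqrt
from statistics import NormalDist


def victory_threshold(polling_districts, confidence=0.95, ballots=100):
    """Minimum sample share giving one-sided `confidence` of exceeding 50%."""
    n = polling_districts * ballots
    z = NormalDist().inv_cdf(confidence)  # one-sided, about 1.645 for 95%
    return 0.5 + z * sqrt(0.25 / n)


def win_probability(sample_share, polling_districts, ballots=100):
    """Approximate probability that the true share exceeds 50%."""
    n = polling_districts * ballots
    se = sqrt(sample_share * (1 - sample_share) / n)
    return NormalDist().cdf((sample_share - 0.5) / se)


print(f"Potong Pasir threshold: {100 * victory_threshold(5):.1f}%")
print(f"Aljunied PAP win probability: {100 * win_probability(0.48, 50):.1f}%")
print(f"Punggol East PAP win probability: {100 * win_probability(0.51, 12):.0f}%")
```

The results should land close to the Table 2 figures, though small differences from rounding or from the exact method used are possible.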

The error between the sampling check and the actual result gets smaller as the sample size increases. Hence GRCs will have smaller error margins than SMCs and 6-member GRCs will have smaller error margins than 3-member GRCs. This is seen in Figure 1 where the difference between the sampling check and the actual result was only 0.1% points in the 6-member Pasir Ris-Punggol GRC while the largest difference of 2.6% points was observed in MacPherson SMC.  The observed differences between sampling check and actual vote counts were all within the expected 95% error margins except for one constituency, but one out of 29 is about right, statistically. We also did not take rounding errors into account, which slightly widen the margin of error.

[Figure 1 – Difference between sampling check and actual result, by constituency]
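
The remark that one constituency out of 29 falling outside its 95% margin is "about right" can be checked with a short binomial calculation. This sketch assumes the 29 constituencies are independent and that each has exactly a 5% chance of falling outside its margin, which is a simplification:

```python
from math import comb

CONSTITUENCIES = 29
P_OUTSIDE = 0.05  # chance that one constituency falls outside its 95% margin


def prob_exactly(k, n=CONSTITUENCIES, p=P_OUTSIDE):
    """Binomial probability of exactly k constituencies outside the margin."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


expected_misses = CONSTITUENCIES * P_OUTSIDE
p_at_least_one = 1 - prob_exactly(0)

print(f"Expected number outside the margin: {expected_misses:.2f}")
print(f"Probability of at least one: {100 * p_at_least_one:.0f}%")
```

With an expected count between one and two, seeing a single constituency outside its margin is unremarkable.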

The sampling check is a form of quick count, which is used in developing democracies where there are concerns with regard to the compilation of electoral results by the central government. In Singapore’s case, there is no obvious need for a quick count as the entire counting process can be observed by candidates’ counting agents and election results have always been announced within a few hours. Nonetheless, given that the Elections Department has chosen to conduct sampling checks, the decision to publicly reveal sampling check results is a welcome one.

Notes

[1] Some additional uncertainty is caused by the sampling check results only being reported as whole number percentages. The true PAP sampling check result from 5,000 samples could have been as high as 48.5% rather than the reported 48%.  However, this would still have given the PAP only a 1.8% chance of victory once the sampling check results were known.

Data

Sampling check results  and actual vote counts are tabulated in Sample check vs actual v3 (Excel format).

A version of this note was previously published on The Online Citizen, http://www.theonlinecitizen.com/2015/09/how-accurate-is-the-ge-sample-vote-count/.

Guide for Counting Agents (ELD)

Looks like the Elections Department has released its Guide for Counting Agents for the Hougang by-election, and the sampling check for “the purpose of checking against the result of count for that counting place” remains in place. Seems fairly pointless to check “against the result of count” given that there’s no way to change what’s on the ballot papers even if the final results don’t agree with the sampling check.

The real question is whether the AROs will disclose the results of the validity check to Counting Agents at the time that it is carried out, and who receives the “sampling check” information after it is compiled by ELD HQ, but before the announcement of the vote counts at the counting centres.

http://www.eld.gov.sg/pdf/Guide%20for%20Counting%20Agents%20(Final).pdf

Background here:

https://stngiam.wordpress.com/2012/05/01/electoral-procedure-sampling-checks/

Electoral Procedure: Sampling checks in the 2011 Presidential Election

As recounted in my earlier posts, I served as a Counting Agent in both the General Election and Presidential Election last year. One of the pleasant surprises of the 2011 elections was the number of Singaporeans who stepped forward as volunteers to assist the different parties and candidates in campaigning and to serve as Polling Agents and Counting Agents in both elections. Polling agents are appointed by candidates to observe the polling process while Counting Agents observe the counting of ballot papers. Unfortunately, I think the smaller parties were overwhelmed by the response so the administration and training of their volunteers was less than ideal. Still, it is a good sign of the health of Singapore’s political development that so many did step forward to serve.

The Elections Department (ELD) also helped by publishing for the first time two guides for Polling Agents and Counting Agents. Unfortunately, these guides were only released three days before polling day so it was too late for the candidates to use them in their training sessions. Hopefully, the Elections Department will update these guides for future elections and release them earlier so that candidates, agents and voters will have a clearer understanding of the procedures and rules regarding the casting and counting of votes.

By and large, I think both elections went off smoothly and by the time of the Presidential Election, both elections officials and Counting Agents were already familiar with the procedures and in some cases, with each other, because they had met previously during the General Election. The Elected President is intended to be above party politics and I was pleased to find that at least in the counting centre that I was assigned to, there was a high level of co-operation between the Counting Agents representing all four of the candidates. Apart from Tony Tan, the other candidates did not manage to recruit enough Counting Agents to cover all the Counting Places. Nonetheless, Counting Agents for the other three candidates informally spread themselves out among the tables and held watching briefs for each other. In any case, there were few disagreements between the Counting Agents and Assistant Returning Officers (AROs) or among the Counting Agents over adjudication of ballots.

Counting Procedures

The counting procedures for both Parliamentary and Presidential Elections are substantially similar and ELD’s Guide for Counting Agents provides a good overview. One of the interesting features of the process is what ELD calls in its guide a “sampling check”:

Sampling checks

5.16 During the counting process, the ARO will conduct a sampling check to obtain a sample of the possible electoral outcome for that counting place, for the purpose of checking against the result of count for that counting place.

What I observed was that after the ballot boxes were opened and the contents mixed together on the counting table, one of the counting assistants would randomly select 100 ballot papers and do a quick tally of the votes on that sample and then report the results to the Assistant Returning Officer in charge of that Counting Place. The results of the sampling check were not formally announced to those present but Counting Agents could observe the recording of the results by the AROs. As I mentioned earlier, there was very good co-operation and sharing of data between the Counting Agents representing all four candidates so I managed to collect the sampling check data for all the counting tables at our Counting Centre (Table 1).

Table 1 – Sampling check results for Presidential Election, 27 August 2011
Nanyang Junior College Counting Centre

| No. | Counting Place | Polling District | Polling Station | Tan Cheng Bock | Tan Jee Say | Tony Tan Keng Yam | Tan Kin Lian |
|---|---|---|---|---|---|---|---|
| 1 | GK01 | MA27 | Nanyang JC | 33% | 25% | 36% | 6% |
| 2 | GK02 | MA24 | Braddell Heights CC (B) | 22% | 34% | 39% | 5% |
| 3 | GK03 | MA23 | Braddell Heights CC (A) | 30% | 28% | 38% | 4% |
| 4 | GK04 | MA22 | 419 Serangoon Central | 35% | 21% | 34% | 10% |
| 5 | GK05 | MA26 | 305 Serangoon Ave 2 | 21% | 32% | 43% | 4% |
| 6 | GK06 | MA25 | 240 Serangoon Ave 2 | 26% | 30% | 35% | 9% |
| Overall for Counting Centre | | | | 27.8% | 28.3% | 37.5% | 6.3% |

Sampling is conducted by taking a sample of 100 ballots at each Counting Place after mixing of ballot papers but before commencement of counting. The overall share for each candidate was computed by simply averaging the results for each polling district without adjusting for the different number of voters in each polling district.
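
To illustrate what adjusting for polling district size would change, the sketch below compares the simple average used in Table 1 with an average weighted by district size. As an assumption for illustration, it uses the valid vote counts from Table 2 below as the weights. Those counts were of course not available when the sample was taken; turnout figures would be the practical substitute.

```python
# Sampling check counts per polling district (Table 1, out of 100 ballots each)
SAMPLE = {
    "Tan Cheng Bock":    [33, 22, 30, 35, 21, 26],
    "Tan Jee Say":       [25, 34, 28, 21, 32, 30],
    "Tony Tan Keng Yam": [36, 39, 38, 34, 43, 35],
    "Tan Kin Lian":      [6, 5, 4, 10, 4, 9],
}
# Valid votes per polling district (Table 2), used here as a stand-in for size
VALID_VOTES = [3237, 3074, 3198, 3539, 2946, 3434]


def unweighted(shares):
    """Simple average across polling districts, as in Table 1."""
    return sum(shares) / len(shares)


def weighted(shares, weights):
    """Average weighted by the size of each polling district."""
    return sum(s * w for s, w in zip(shares, weights)) / sum(weights)


for name, shares in SAMPLE.items():
    print(f"{name}: simple {unweighted(shares):.1f}%, "
          f"weighted {weighted(shares, VALID_VOTES):.1f}%")
```

Because the six polling districts are of broadly similar size, the two averages differ only slightly here.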

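Tony Tan came out ahead overall in the sampling check, leading in five of the six polling districts (Tan Cheng Bock was narrowly ahead in MA22), just as he came out ahead overall in the final tally (Table 2), where Tan Cheng Bock narrowly led only in MA27. There was, however, some difference between the final result and the sampling check (Table 3).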

Table 2 – Actual results for Presidential Election 27 August 2011
Nanyang Junior College Counting Centre

| No. | Counting Place | Polling District | Polling Station | Tan Cheng Bock | Tan Jee Say | Tony Tan Keng Yam | Tan Kin Lian | Total number of valid votes |
|---|---|---|---|---|---|---|---|---|
| 1 | GK01 | MA27 | Nanyang JC | 34.1% | 26.6% | 34.0% | 5.3% | 3,237 |
| 2 | GK02 | MA24 | Braddell Heights CC (B) | 32.5% | 28.6% | 33.0% | 5.9% | 3,074 |
| 3 | GK03 | MA23 | Braddell Heights CC (A) | 31.5% | 28.7% | 34.8% | 5.1% | 3,198 |
| 4 | GK04 | MA22 | 419 Serangoon Central | 32.5% | 26.0% | 35.8% | 5.7% | 3,539 |
| 5 | GK05 | MA26 | 305 Serangoon Ave 2 | 33.5% | 25.3% | 36.4% | 4.9% | 2,946 |
| 6 | GK06 | MA25 | 240 Serangoon Ave 2 | 32.7% | 24.8% | 36.6% | 5.9% | 3,434 |
| Overall for Counting Centre | | | | 32.8% | 26.6% | 35.1% | 5.5% | 19,428 |

See https://stngiam.wordpress.com/2011/08/29/flash-results-micropolling-results-of-presidential-elections-2011/ for more polling-district level results.

Table 3 – Difference between actual vote share and sampling check
Nanyang Junior College Counting Centre

| No. | Counting Place | Polling District | Polling Station | Tan Cheng Bock | Tan Jee Say | Tony Tan Keng Yam | Tan Kin Lian |
|---|---|---|---|---|---|---|---|
| 1 | GK01 | MA27 | Nanyang JC | 1.1% pt | 1.6% pt | -2.0% pt | -0.7% pt |
| 2 | GK02 | MA24 | Braddell Heights CC (B) | 10.5% pt | -5.4% pt | -6.0% pt | 0.9% pt |
| 3 | GK03 | MA23 | Braddell Heights CC (A) | 1.5% pt | 0.7% pt | -3.2% pt | 1.1% pt |
| 4 | GK04 | MA22 | 419 Serangoon Central | -2.5% pt | 5.0% pt | 1.8% pt | -4.3% pt |
| 5 | GK05 | MA26 | 305 Serangoon Ave 2 | 12.5% pt | -6.7% pt | -6.6% pt | 0.9% pt |
| 6 | GK06 | MA25 | 240 Serangoon Ave 2 | 6.7% pt | -5.2% pt | 1.6% pt | -3.1% pt |
| Overall for Counting Centre | | | | 5.0% pt | -1.7% pt | -2.4% pt | -0.8% pt |

E.g., in polling district MA27, Tan Cheng Bock actually received 34.1% of the vote compared to 33% in the sampling check, a difference of 1.1% points.

The sampling check is not specifically called out in the Presidential Elections Act or Parliamentary Elections Act though it does not appear to be prohibited either. I did not observe the counting assistants carrying out a sampling check during last May’s General Elections. However, I did observe the ARO at a different counting centre personally pick up a stack of ballots and scrutinize them very closely. When I asked him what he was doing at that time, he answered that he was checking the validity of the ballot papers. Possibly, he was referring to Section 50(1)(a) of the Parliamentary Elections Act under which ballot papers must bear an official authentication mark to be considered valid. Given the thoroughness of ELD’s pre-election preparations and the scrutiny of Presiding Officers and Polling Agents, not to mention voters, during polling, I find it very unlikely that any unauthenticated ballot papers could slip through. In any case, the ARO is required under Section 50 to check the validity of every ballot paper when it is counted so a validity check on a subset of the ballots appears to be superfluous.

Regardless, the validity check or sampling check cannot affect election results because they are only conducted after polls have closed. Conceivably, the sampling check could be construed as being a form of exit polling and while Section 78D of the Parliamentary Elections Act prohibits the publication of exit poll results on polling day, this prohibition only applies while polling stations are open. Even if a sampling check were conducted during a Parliamentary Election and the results leaked out, there would not be any violation of the Act because polls would already have closed by the time the sampling check is conducted.

Sampling check as predictor of election result

The ELD Guide for Counting Agents says that the purpose of the sampling check is to “obtain a sample of the possible electoral outcome for that counting place, for the purpose of checking against the result of count for that counting place.” This sentence is quite awkwardly constructed and doesn’t make a lot of sense since the final vote tally will be the official result regardless of whether it agrees with the sampling check. Presumably, what they really meant to say was that the sampling check is used to predict the outcome of the election early in the counting process.

As can be seen in Tables 1 to 3, the sampling check predicted correctly that Tony Tan would come out on top at Nanyang Junior College, though his actual vote share was 2.4 percentage points lower than that in the sampling check. The sampling check result for Tan Cheng Bock in polling district MA24 stands out as it was 10.5 percentage points lower than his actual vote share. I estimate a slightly more than 1% chance of this occurring by chance, which is a low probability but not exceptionally low. Of course, it’s also quite possible that the Counting Agent at that table just made a mistake because the ARO did not officially announce the sampling check results over the table.
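
A chance estimate of this kind can be sketched in two ways, both assuming the 100 sampled ballots behave like independent draws from a district where the true share is the final 32.5% (sampling without replacement is ignored). The exact method behind the figure quoted above is not stated, so this is only an illustration that gives a number of the same order:

```python
from math import comb, sqrt
from statistics import NormalDist

N_SAMPLE = 100       # ballots in the sampling check
TRUE_SHARE = 0.325   # Tan Cheng Bock's actual share in MA24 (Table 2)
OBSERVED = 22        # ballots for him in the MA24 sample (Table 1)

# Normal approximation: one-sided probability of a sample share this low
se = sqrt(TRUE_SHARE * (1 - TRUE_SHARE) / N_SAMPLE)
p_normal = NormalDist().cdf((OBSERVED / N_SAMPLE - TRUE_SHARE) / se)

# Exact binomial: probability of OBSERVED or fewer ballots out of N_SAMPLE
p_exact = sum(comb(N_SAMPLE, k) * TRUE_SHARE**k
              * (1 - TRUE_SHARE) ** (N_SAMPLE - k)
              for k in range(OBSERVED + 1))

print(f"Normal approximation: {100 * p_normal:.1f}%")
print(f"Exact binomial: {100 * p_exact:.1f}%")
```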

For this election, analyzing the sampling check results is quite challenging because there were four candidates so the problem is a multiple comparison problem rather than the usual comparison of two proportions. In a normal two-horse race, we would just have to predict whether the votes for one candidate exceed 50% and that would tell us the outcome of the race. In this case, however, we would have had to predict the vote shares of at least two, perhaps three, candidates, but the vote shares of the candidates are not independently distributed, which makes the problem rather difficult. If any more statistically-inclined reader has a good method for estimating probability distributions for this type of problem, please contact me.
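
One possible approach, offered only as a sketch and not as a method used in this analysis, is to treat the four vote shares jointly and simulate them. The example below pools the 600 sampled ballots from Table 1, assumes a uniform Dirichlet prior over the four shares, and estimates by Monte Carlo how often the sample leader is also the true leader. It ignores differences in polling district size, and the pooled counts only describe this one counting centre, not the national result.

```python
import random

# Pooled sample counts from Table 1 (six polling districts, 100 ballots each)
COUNTS = {
    "Tan Cheng Bock": 167,
    "Tan Jee Say": 170,
    "Tony Tan Keng Yam": 225,
    "Tan Kin Lian": 38,
}


def dirichlet_draw(rng, alphas):
    """Draw one set of shares from a Dirichlet via normalised gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def prob_leader(counts, leader, draws=20000, seed=2011):
    """Estimate the probability that `leader` has the largest true share."""
    rng = random.Random(seed)
    names = list(counts)
    idx = names.index(leader)
    alphas = [counts[n] + 1 for n in names]  # uniform prior (assumption)
    wins = 0
    for _ in range(draws):
        shares = dirichlet_draw(rng, alphas)
        if shares[idx] == max(shares):
            wins += 1
    return wins / draws


print(f"P(Tony Tan leads): {prob_leader(COUNTS, 'Tony Tan Keng Yam'):.3f}")
```

Drawing the shares jointly keeps them summing to one, which sidesteps the problem of the candidates' shares not being independent.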

Four-way elections will hopefully remain rare in Singapore, so I present a simplified analysis of the sampling check in a standard two-way election instead. There were 782 polling stations in the last election and if 100 ballots are sampled from each one, there would be a total of 78,200 ballots in the sampling check for a nation-wide election such as the presidential election. We assume that each polling district has the same number of voters, and using the normal approximation to the binomial distribution, the 95% confidence interval for the sampling check is roughly ±0.4% points. If we don’t need to estimate the actual vote share and only need to know whether a candidate has won (i.e., received > 50% of the vote), we can be 95% confident that he has won if he receives over 50.3% of the votes in the sampling check (one-tailed test). For the elections officials, what counts perhaps is not who won but rather whether there would be a recount. To avoid a recount, the winning candidate must receive at least 51% of the final vote (2% winning margin over his opponent) so if the sampling check reveals that one candidate has scored at least 51.4% in the sample, the elections officials can be 99% certain that they would not have to stay overnight. In reality, the number of voters varies from about 2,000 to 3,500 per polling district and since voter turnout will be known by the close of polls, we could make some adjustments for polling district size and voter turnout to improve the accuracy of the forecast. Of course, there is no way to estimate the number of spoilt votes, which could affect the results, but I don’t think those would have a large effect in most circumstances.
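
The three nationwide figures in the paragraph above can be reproduced with a few lines, under the same simplifying assumptions: 782 polling districts of equal size, 100 ballots sampled from each, a two-way contest and the normal approximation at a 50:50 split. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

N = 782 * 100               # total ballots in a nationwide sampling check
SE = sqrt(0.25 / N)         # standard error of the sample share at 50:50
z = NormalDist().inv_cdf    # standard normal quantile function

ci_half_width = z(0.975) * SE               # two-sided 95% interval
win_threshold = 0.50 + z(0.95) * SE         # 95% sure of more than 50%
no_recount_threshold = 0.51 + z(0.99) * SE  # 99% sure of at least 51%

print(f"95% confidence interval: ±{100 * ci_half_width:.1f}% points")
print(f"Win threshold: {100 * win_threshold:.1f}%")
print(f"No-recount threshold: {100 * no_recount_threshold:.1f}%")
```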

Because the sample size is large in a presidential election, the forecast made by the sampling check is quite precise. In parliamentary elections, however, there may be as few as five polling districts in a single-member constituency (SMC) such as Potong Pasir so the sample would be smaller and the uncertainty in the sampling check larger. Assuming a sample of 500 out of a total of 15,870 valid votes in Potong Pasir, a candidate would have to receive at least 53.7% in the sampling check to be 95% certain of winning the election (one-tailed test). Hougang is larger and has nine polling districts with 23,000 voters. For that constituency, a candidate would have to poll at least 52.7% in the sampling check to be 95% certain of winning the election. Again, I’m assuming equal polling district sizes in these analyses but adjusting for polling district size and turnout would be more important in small constituencies.

Purpose of the sampling check

A rather obvious question is what ELD does with the sampling check data. As described above, one possible use of the sampling check is to predict whether recounts would be necessary and to prepare the elections officials accordingly. I do not know whether this was done during the presidential election, but I presume not, because I did not observe the elections officials at my counting centre start to make preparations for the recount until very late in the night. Since the sampling check takes place after the close of polls it cannot affect voter turnout and it cannot have any effect on the ballot papers which have already been poured out and mixed together on the counting table. The only possible effect that I can conceive is that if a candidate learns that the results are close in a particular counting centre, he could redeploy his more persuasive Counting Agents there in the hope of swaying the ARO into interpreting unclear ballots in a more favorable manner. This has less of an impact in Presidential Elections where every vote has the same weight regardless of location, but in a General Election, political parties may be able to use sampling check data to reposition Counting Agents from safer seats to more contested constituencies where they might be able to make a difference. Smaller parties in particular could benefit more from this information in that they could make more effective use of their smaller pool of volunteers, whereas larger parties already have an excess of Counting Agents and so have less need to redeploy them even in the event of a close fight. To ensure the appearance of impartiality, however, ELD should formally announce the results of the sampling check rather than leave it to Counting Agents to look over the shoulders of the AROs. While the AROs at my counting centre did not prevent the Counting Agents from jotting down the results of the sampling check, they did not explicitly announce the results in the same way that they announced the final vote count over the table.

On reflection, however, it is not really clear to me what purpose the sampling check serves. ELD does not appear to use the results to prepare its officials for recounts, and it does not officially share the results with candidates or media. Hopefully ELD will be able to explain the purpose and use of the sampling check when it prepares its Guides for Candidates and Counting Agents for the next election, whether General Election or by-election. While I can appreciate it if ELD has concerns that revealing sampling check results could raise temperatures in close elections, I also don’t think it is tenable for them to conduct a sampling check during the course of counting without being more open and transparent as to the procedure and the use of the data generated by the sampling check.