Remote invigilation – lessons learned during the pandemic

Demand for remote invigilation (RI – also known as remote proctoring; the two terms are used interchangeably) has grown significantly during the COVID-19 pandemic, as awarding organisations (AOs) have been forced to find new solutions to allow candidates to take their exams safely and securely.

AOs have had to get to grips with the practicalities of remote invigilation in a short space of time. We have undertaken research into the approaches followed and, below, we summarise a few of the issues that awarding organisations are facing and some of the lessons learned.

Key Findings

  • Good practice for remote invigilation is still developing and is currently largely undocumented in formal research. This applies to all aspects of remote invigilation (decisions about which modes to use, the selection of technology, the extent to which automated functions for detecting suspicious practice through technology can be trusted, the test security infrastructure, and the training and management of human-mediated processes such as invigilation, review, appeals, etc.). AOs are largely “making it up as they go” based on their lengthy experience of what constitutes acceptable assessment practice – understandable given the COVID crisis – and regulatory processes are largely playing catch-up. Relaxations in regulation during COVID have helped, although there is uncertainty about the implications when/if these relaxations are removed as education and assessment return to normal practice.
  • Experiences with RI are generally encouraging, but most organisations have experienced teething problems, some of which have been severe and lengthy enough to risk reputational damage. However, as time progresses and individual AOs adjust to RI delivery, their satisfaction (and that of their candidates) appears to increase, suggesting that even significant teething problems can be overcome.
  • Remote invigilation can broadly be divided into three distinct options – Live Invigilation, Record & Review (R&R) and AI-mediated RI. There is some overlap between them, particularly where AI is used to support R&R.

Live Invigilation

  • In Live Invigilation (where candidates are watched via one or more cameras while taking a test), the ratio of invigilators to candidates varies greatly – from 1:2 to 1:15 or more. The quality of service must be affected by this, and we would expect the range of ratios to narrow over time; a worked staffing illustration follows this list. At very high invigilation ratios (one invigilator supervising many candidates), a perception is emerging that AI-supported R&R might be better at identifying cheating.
  • We are reasonably confident that live invigilation – with strong processes, well-trained and monitored invigilators, a low ratio of invigilators to candidates (probably no more than four candidates per invigilator, i.e. 1:4 or lower) and good technology – is at least as secure as a traditional exam centre. However, because the risks are different to those of a traditional exam centre model (which is well understood by stakeholders, including regulators), Live Invigilation will not necessarily be immediately acceptable to a wider audience. Good communication with candidates and stakeholders (e.g. employers, regulators, the public, etc.) will be an important part of any new implementation.
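To make the staffing implication of these ratios concrete, the sketch below estimates invigilator headcount for a single sitting. It is a minimal illustration only: the session size and the set of ratios are our assumptions, not figures drawn from the research.

```python
import math

# Hypothetical illustration: invigilator headcount implied by different
# invigilator:candidate ratios. The session size and ratios below are
# assumptions for illustration, not figures from the research.

def invigilators_needed(candidates: int, candidates_per_invigilator: int) -> int:
    """Minimum invigilators needed to supervise a sitting at a given ratio."""
    return math.ceil(candidates / candidates_per_invigilator)

session_size = 500  # assumed number of candidates sitting simultaneously

for ratio in (2, 4, 8, 15):  # spans the 1:2 to 1:15 range reported above
    staff = invigilators_needed(session_size, ratio)
    print(f"1:{ratio} -> {staff} invigilators for {session_size} candidates")
```

Under these assumptions, a 500-candidate sitting needs 250 invigilators at 1:2 but only 34 at 1:15 – a cost difference large enough to explain the wide range of ratios currently observed.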

Record & Review

  • Many AOs choose Record & Review, as it is significantly cheaper to operate than Live Invigilation. Candidates’ assessments are recorded and reviewed afterwards (either watched at speed, sampled, or reviewed when triggered by AI). However, AOs need to consider the proportion of recorded videos that will trigger a review (and therefore the amount of reviewing needed) when choosing a remote invigilation solution; a worked example of the review workload follows this list. Many AOs are still dealing with very large numbers of false positives, having not factored the staffing and logistical implications of this review into their business and operational models.
  • Record & Review appears to be more resilient than Live Invigilation in the event of weak or intermittent internet connections. We believe this may be because its low-bandwidth triggers are more relaxed than those for Live Invigilation, or because of client-side functionality that allows video to be stored locally and subsequently uploaded.
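The sketch below shows how sensitive reviewer workload is to the flag rate. Again, a minimal illustration: the sitting size, review time and flag rates are assumptions made for the sake of the arithmetic, not figures from the research.

```python
# Hypothetical illustration: how the proportion of recordings flagged for
# review drives R&R reviewer workload. Sitting size, review time and flag
# rates are assumptions for the sake of the arithmetic.

def review_hours(candidates: int, flag_rate: float, minutes_per_review: float) -> float:
    """Total reviewer-hours needed for one exam sitting."""
    flagged = candidates * flag_rate
    return flagged * minutes_per_review / 60

candidates = 10_000       # assumed sitting size
minutes_per_review = 12   # assumed time to review one flagged recording

for flag_rate in (0.02, 0.10, 0.30):  # low vs high false-positive scenarios
    hours = review_hours(candidates, flag_rate, minutes_per_review)
    print(f"{flag_rate:.0%} flagged -> {hours:,.0f} reviewer-hours per sitting")
```

At these assumed figures, moving from a 2% to a 30% flag rate multiplies the reviewing effort fifteen-fold (40 to 600 reviewer-hours) – precisely the manpower implication that has caught many AOs out.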

AI-mediated

  • AI-mediated invigilation (without R&R – i.e. where a computer alone decides whether the candidate’s activity is acceptable) is widely regarded as good enough only for low-stakes assessment – the AI algorithms simply aren’t good enough (yet) to accurately distinguish between acceptable and unacceptable behaviours.
  • The use and effectiveness of AI technology to flag candidate behaviours/test activity for review as a possible rules breach seems to be developing quickly. It seems reasonable to expect that AI will eventually develop to the point where AI-supported R&R is as good as or better than live invigilation. However, the evidence we have heard to date is of large numbers of false positives and missed true positives; the sketch below illustrates the distinction.
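In more formal terms, false positives erode precision while missed true positives erode recall. The sketch below makes this concrete; the confusion-matrix counts are invented purely for illustration.

```python
# Hypothetical illustration: precision and recall of an AI flagging system.
# "False positives" erode precision; "missed true positives" erode recall.
# The confusion-matrix counts below are invented for illustration.

def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision: share of raised flags that are genuine breaches.
    Recall: share of genuine breaches that get flagged."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Assumed outcome over 10,000 recordings containing 100 genuine breaches:
tp, fp, fn = 60, 940, 40  # 1,000 flags raised, only 60 of them genuine

precision, recall = precision_recall(tp, fp, fn)
print(f"precision = {precision:.0%}  (94% of flags are false alarms)")
print(f"recall    = {recall:.0%}  (40% of genuine breaches are missed)")
```

Low precision drives the review burden described above; low recall means genuine breaches go undetected. An AI-only system would need to perform well on both before it could replace live invigilation for high-stakes use.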

Candidate Experience

  • There is little more than anecdotal evidence about candidate satisfaction with RI at this point, although there have been a significant number of very negative stories in which candidates’ assessment experience has fallen well below the standard that their awarding organisation or its regulator would find acceptable.
  • Use of online proctoring services has increased rapidly, according to several suppliers. In parallel (and probably as a result of stretched capacity), customer service has deteriorated, and some candidate experiences are sufficiently poor to create a risk of reputational damage to some AOs. It is important to note that, because candidates are remote, the service provided before the exam (both in preparation and while in the “exam waiting room”) and, to a lesser extent, after the exam is important – it’s not just about the invigilated experience during the test.
  • Some candidates are experiencing long waits to access their exams because of limited service capacity. Capacity planning with providers (particularly for live invigilation, but also for R&R as a managed service) is important, especially for session-based exams, where candidates arrive in large simultaneous batches; a simple illustration follows this list.
  • The end-to-end candidate experience needs to be carefully thought through – there are many points of failure which can lead to failed tests and a poor candidate experience. In an environment where practice is emerging, our view (based largely on anecdotal evidence) is that it is acceptable (to candidates) for practices to need improvement in their early stages, provided that AOs (a) communicate very clearly with candidates about what is expected, and (b) are seen to be responding quickly and adequately to feedback as it emerges.
  • Candidates do not follow guidance well (this is not news to people working in exams, but Remote Invigilation gives candidates new ways to run into trouble before an exam through lack of proper preparation), and this can result in failed assessments (assessments not running or completing properly) and complaints. Linking the completion of preparatory activities to permission to start, and giving candidates opportunities to get ready before the test time, are both seen as critical.
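To illustrate why session-based exams are particularly exposed, the sketch below estimates how long the last candidate waits to be checked in when everyone arrives for the same start time. A minimal illustration: the session size, per-candidate check-in time and agent counts are all assumptions, not data from the research.

```python
import math

# Hypothetical illustration: check-in capacity vs waiting time when all
# candidates arrive for a fixed session start. All figures are assumptions,
# not data from the research.

def last_candidate_wait(candidates: int, agents: int, minutes_per_checkin: float) -> float:
    """Minutes the last candidate waits if check-ins run in parallel batches
    of `agents`, each batch taking `minutes_per_checkin`."""
    batches = math.ceil(candidates / agents)
    return (batches - 1) * minutes_per_checkin

candidates = 1_000        # assumed session size
minutes_per_checkin = 5   # assumed ID check / room scan time per candidate

for agents in (20, 50, 100):
    wait = last_candidate_wait(candidates, agents, minutes_per_checkin)
    print(f"{agents} check-in agents -> last candidate waits ~{wait:.0f} minutes")
```

Under these assumptions, under-provisioning check-in staff by a factor of five turns a 45-minute worst-case wait into a four-hour one – the kind of experience that generates the complaints described above.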

Conclusion

It is clear that many awarding organisations have had teething problems when introducing remote invigilation, particularly relating to candidate experience. Nevertheless, we have not come across any AO that has tried remote invigilation and is planning to abandon it (e.g. once exam centres can reopen). We take this as a significant positive, and it mirrors, to some extent, the experience in previous years with the adoption of e-testing.