Evaluation and statistical services

Quantitative evaluation

AlphaPlus has developed a highly successful quantitative research and analysis function and has accumulated a diverse range of clients, including UK government departments (e.g. the DfE, BIS and Ofqual), awarding organisations, and learned societies such as the Institute of Physics.

We analyse a broad range of data types, including examinations and assessment data, consultation and questionnaire data, and national education data sets such as the National Pupil Database and EduBase. We are accredited by government to handle massive data sets, and total confidentiality is assured.

The quantitative analysis team care about education, so we leverage data to help improve the education system. To this end, we often collaborate with other specialists, finding solutions to problems and conveying the outcomes of complex analysis clearly and succinctly to our clients. Using a range of graphical presentation techniques, we give insight into what the data is telling us so that clients can go on to use it to best effect.

But the work is not always complex. While advanced analytical techniques and diverse approaches – such as generalisability theory (G-theory) or item response theory (IRT) – may be appropriate for some projects, some questions can be answered relatively simply through careful study. Importantly, we use the techniques best suited to answering the research question at hand and to meeting the client’s requirements.


Assessment performance management

AlphaPlus is a leading assessment consultancy in the UK. Major government and awarding organisation (AO) clients rely on our Assessment Quality Improvement (AQI) service both in the UK and around the world. Our team have the expertise you’d expect – in assessment and statistics – but they are also excellent communicators, speaking and writing clearly and accurately and giving messages that are to the point and free of jargon. We provide realistic evaluations of results (not theoretical, not perfectionist), telling our clients where and why we think improvements are necessary, but equally giving sensible, calm judgements of where services are good enough, or as good as they can be.

Needless to say, AlphaPlus is accredited by government agencies as a responsible handler of sensitive data and we have a strong track record of respecting confidentiality.


Results processing

We publish high-stakes exam results for several clients across the professional exams and schools’ assessment sectors. In this work, we provide reports and outputs to each client’s specific requirements – for example, pass decisions against an Angoff-derived cut score for a professional exam, or standardised scores for a school-based assessment.
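
To make the distinction concrete, here is a minimal R sketch of those two output types, using simulated data; the cut score, column names and scale parameters are illustrative assumptions, not any client’s actual specification.

    # Two common result outputs, sketched on simulated data.
    set.seed(1)
    results <- data.frame(candidate = sprintf("C%03d", 1:200),
                          raw_score = rbinom(200, size = 60, prob = 0.6))

    # Professional exam: pass/fail against an Angoff-derived cut score
    # (39 out of 60 is an assumed value for illustration only).
    angoff_cut <- 39
    results$outcome <- ifelse(results$raw_score >= angoff_cut, "Pass", "Fail")

    # School-based assessment: standardised scores (mean 100, SD 15).
    results$standardised <- round(100 + 15 * as.numeric(scale(results$raw_score)))

    head(results)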

We also apply multiple checking routines (both manual and automated) to make sure that the results we provide are accurate and meet the high standards our clients expect.
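
As an illustration of the automated side, the following R sketch recomputes each candidate’s total from item-level marks and flags any disagreement with the reported total; the data layout is hypothetical, and real routines are built to each client’s specification.

    # One automated check, sketched: recompute totals from item marks
    # and flag any candidate whose reported total disagrees.
    set.seed(2)
    item_marks <- matrix(rbinom(200 * 60, 1, 0.6), nrow = 200)
    reported   <- rowSums(item_marks)
    reported[7] <- reported[7] + 1          # inject a deliberate error to catch

    recomputed <- rowSums(item_marks)
    mismatches <- which(recomputed != reported)
    if (length(mismatches) > 0) {
      # In production this would halt the run and trigger investigation.
      warning("Total-score mismatch for candidates: ",
              paste(mismatches, collapse = ", "))
    }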


Classical test analysis and IRT

AlphaPlus uses a range of techniques from within classical test theory (CTT) to give insight into the performance of tests and exams, and of the questions within them. Such techniques – illustrated in the sketch after the list – might include:

  • Calculating facility values to show the ease or difficulty of questions.
  • Calculating discrimination indices to show whether questions seem to function well in the test.
  • Estimating reliability to assess the overall quality of a test.
  • Constructing scoring scales (e.g. standardised scores) of varying types.
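
The following base-R sketch illustrates the first three techniques on simulated dichotomous item data; the data and dimensions are made up for illustration.

    # CTT statistics on simulated data: 200 candidates x 20 dichotomous items.
    set.seed(42)
    n <- 200; k <- 20
    ability    <- rnorm(n)
    difficulty <- rnorm(k)
    X <- matrix(rbinom(n * k, 1, plogis(outer(ability, difficulty, "-"))), n, k)

    # Facility values: proportion correct per item (higher = easier).
    facility <- colMeans(X)

    # Discrimination: corrected item-total (point-biserial) correlation.
    total <- rowSums(X)
    discrimination <- sapply(seq_len(k), function(j) cor(X[, j], total - X[, j]))

    # Reliability: Cronbach's alpha from item and total-score variances.
    alpha <- (k / (k - 1)) * (1 - sum(apply(X, 2, var)) / var(total))

    round(cbind(facility, discrimination), 2)
    round(alpha, 2)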

All our analyses can be accompanied by a range of informative visuals. We also provide interpretations and write-ups, telling users what the scores mean in their context. We avoid technical jargon and get to the nub of what the client needs to know.

Our statisticians are highly skilled and use the R programming environment to carry out analysis quickly and efficiently. This programmatic approach also means that we have a clear audit trail, showing precisely what was done and how.


Item response theory (IRT)

As well as carrying out classical test theory analyses, our teams are skilled IRT practitioners. We use a range of models, focusing predominantly on the one-parameter Rasch model. In our IRT work, we do the following (a short sketch follows the list):

  • Use IRT to construct scaled sets of items for adaptive tests.
  • Carry out equating and linking studies to establish the relative difficulty of different test versions.
  • Check the functioning of items using techniques such as model-fit statistics.
  • Make sure that tests are fair for all test takers – for instance, by undertaking differential item functioning (DIF) analysis.
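
By way of illustration, the sketch below calibrates a Rasch model and runs a simple Mantel-Haenszel DIF check on simulated data. The mirt package is one of several R options for this kind of work, and the grouping variable and item names are made up for the example.

    # Rasch calibration, item fit and a Mantel-Haenszel DIF check,
    # sketched on simulated data (500 candidates x 15 items).
    library(mirt)
    set.seed(7)
    n <- 500; k <- 15
    group <- rep(c("A", "B"), each = n / 2)   # hypothetical subgroups
    theta <- rnorm(n)
    b     <- seq(-2, 2, length.out = k)
    X <- sapply(b, function(bj) rbinom(n, 1, plogis(theta - bj)))
    colnames(X) <- paste0("Q", seq_len(k))

    # One-parameter (Rasch) calibration and infit statistics.
    mod <- mirt(as.data.frame(X), 1, itemtype = "Rasch", verbose = FALSE)
    coef(mod, simplify = TRUE)$items    # item difficulty estimates
    itemfit(mod, fit_stats = "infit")   # flags items that misfit the model

    # Mantel-Haenszel DIF for one item, stratified by total-score band.
    band <- cut(rowSums(X), breaks = 5)
    tab  <- table(X[, "Q5"], group, band)   # 2 x 2 x 5 contingency array
    mantelhaen.test(tab)                    # odds ratio near 1 suggests no DIF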

As with our classical theory analyses, IRT outputs are interpreted clearly for the intelligent layperson, summarised succinctly and accompanied by informative graphics.