The selection of employees is one of the most significant tasks a human resources practitioner faces, because it affects the flow of employees entering and exiting the firm. Many issues may arise if the recruiting process does not comply with South African legislation, namely the Employment Equity Act and the Labour Relations Act, which govern the reliability, validity, bias and fairness of psychometric assessment measures. This legislation was introduced to protect employees against the discrimination and unfair practices experienced during previous dispensations. In terms of Section 8 of the Employment Equity Act (55 of 1998), “Psychological testing and other similar assessments of an employee are prohibited unless the test or assessment being used (a) has been scientifically shown to be valid and reliable; (b) can be applied fairly to all employees; and (c) is not biased against any employee or group”. The psychological measuring instrument we have chosen, in accordance with the HPCSA, is the Ability, Processing of Information and Learning Battery (APIL-B), which will be critically evaluated in this essay. This psychometric battery can assist recruiters in identifying employees who have the potential to grow and learn within organisations. Furthermore, the APIL-B is a cognitive measure and is used not only for recruitment and selection in organisations but also for selection into schools, universities and other areas. This essay reports on the APIL-B under the following headings: Evaluating the APIL-B, Composition of the APIL-B, Validity, Reliability, Bias and Limitations.
Evaluating the APIL-B
According to Foxcroft and Roodt (2013), it is an assessment practitioner’s duty to evaluate the information offered about a measure and determine whether it is valid and reliable for its intended purpose. Foxcroft and Roodt (2013) further state that, when evaluating a measure, an assessment practitioner should consider, among other things, how long ago it was developed, the quality of the manual’s contents, the clarity of the instructions and its cultural appropriateness.
First conceptualised in 1994 by T.R. Taylor, the APIL-B (Ability, Processing of Information and Learning Battery; Taylor, n.d.) was designed as a set of tests for assessing one’s vital cognitive capabilities. To be most effective, the assessment should be administered to individuals with a minimum of twelve years of education (Taylor, n.d.). The APIL-B is ideal for identifying those who are likely to master new, cognitively challenging content in a training context and for establishing levels so that people can be placed in the correct positions. Taylor (n.d.) identifies three norm scales that the APIL-B makes use of, namely stanines (scale of 1 to 9), stens (scale of 1 to 10) and percentiles (scale of 1 to 99). Taylor (n.d.) further states that stens are used in the Flexibility-Accuracy-Speed Tests (FAST); stanines are used in the Concept Formation Test, the Memory Test and the Knowledge Transfer Test; and percentiles are used in the Curve of Learning test.
According to Taylor (n.d.), the APIL-B consists of five test booklets and two ancillary booklets, which together yield eight scores: abstract thinking; speed of information processing; accuracy of information processing; cognitive flexibility; performance gain in a learning task; final level of proficiency; memory and understanding; and transfer of knowledge. The battery takes approximately three hours and forty-five minutes to administer.
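To make the three norm scales concrete, the sketch below converts a standardised score (z-score) into a stanine, a sten and a percentile using the conventional textbook definitions of these scales. It is an illustration only: it assumes normally distributed scores, the function names are our own, and it does not reproduce the norm tables in the APIL-B manual, which are not described in the sources consulted.

```python
import math
from statistics import NormalDist


def z_to_stanine(z: float) -> int:
    """Stanine scale: mean 5, SD 2, whole numbers clipped to 1..9."""
    return max(1, min(9, math.floor(2 * z + 5.5)))


def z_to_sten(z: float) -> int:
    """Sten scale: mean 5.5, SD 2, whole numbers clipped to 1..10."""
    return max(1, min(10, math.floor(2 * z + 6.0)))


def z_to_percentile(z: float) -> int:
    """Percentile rank under a normal curve, clipped to 1..99."""
    return max(1, min(99, round(100 * NormalDist().cdf(z))))


# Hypothetical candidate scoring half a standard deviation above the norm-group mean.
z = 0.5
print(z_to_stanine(z), z_to_sten(z), z_to_percentile(z))
```

The three scales express the same relative standing at different levels of detail, which is why a battery can report different subtests on different scales without changing the interpretation.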
Composition of the APIL-B
Concept formation test
This test was designed to assess one’s ability to “think abstractly and conceptually: to form abstract concepts, reason hypothetically, theorise, build scenarios (and) trace causes” (Taylor, n.d., p. 4). The test comprises thirty questions, each consisting of six depictions of a similar nature; the test taker must identify the depiction that does not share a characteristic common to the rest (Taylor, n.d.).
Flexibility-Accuracy-Speed Tests (FAST)
Taylor (n.d.) suggests that “this battery within a battery measures speed (quickness) and accuracy of information processing, and cognitive flexibility”. The FAST is made up of four individual assessments, namely the Series, Mirror Image, Transformations and Combined tests. All four assessments are timed and have been designed in such a way that it is very rare for a test taker to complete the entire assessment. They use shapes of different sizes, which may contain either a dot or a line in the centre. The basic idea of the tests is to identify a pattern and find the omitted depiction.
Curve of learning
According to Taylor (n.d.), this test focuses on learning potential: it aims to assess a person’s capacity to master new skills. It looks at future achievement potential rather than the abilities that the person already has. The test is split into four timed sessions, which require the test taker to decode a series of paired images into another set of images and then decode those images into a set of words. Images are decoded with the aid of the first ancillary booklet, the dictionary.
Memory test
Directly after the test taker has completed the Curve of Learning test, the Memory Test is administered. It follows the same concept as the Curve of Learning test, in that test takers are required to decode images into words; however, the dictionary is now taken away. Performance on this test reflects the extent to which the test taker has understood the logical relations between the symbols and the words.
Knowledge transfer test
According to Ferguson (1956, as cited in Taylor, n.d.), transferring knowledge and skills to similar areas or situations is a vital process of cognitive development. The Knowledge Transfer Test, as the name suggests, measures this ability. The test consists of a series of connected depictions referred to as “pieces of equipment” (Taylor, n.d., p. 19), each of which has a specific feature in addition to a basic shape. The test taker is required to categorise them under symbols. Test takers are also given the second ancillary booklet.
Validity
Before a test can be used on test takers, its validity needs to be established to ensure that it is appropriate for the purpose for which it will be used. Foxcroft and Roodt (2013) state that “the validity of a measure concerns what the test measures and how well it does so”. The studies consulted show evidence of both construct and criterion validity for the APIL-B. The construct validity of a measure is the extent to which it measures the theoretical construct or trait that it is supposed to measure (Foxcroft & Roodt, 2013). Criterion validity was defined by Phelan and Wren (2005), who stated that “Criterion-Related Validity is used to predict future or current performance”. The method used to determine criterion-related validity is predictive validity, which Murphy and Davidshofer (2005) define as a method of determining criterion validity. It is also used to determine the correlation between a test taker’s test score and their criterion-related scores.
Taylor (1995) investigated the validity of the Concept Formation Test (CFT) by administering the measure to 33 first-year university students who had been accepted into the university on merits other than their grade twelve results. Taylor correlated their CFT scores with their marks in a course designed to improve their logical thinking and reasoning skills. The correlation was 0.44 (p = 0.012). Another study by Taylor (1995, as cited in Strachan, 2008) investigated the validity of the Curve of Learning and the Memory and Understanding tests using a sample of 110 workers from a beverage manufacturing firm. The criteria for evaluating workers included facets such as their capacity to learn new procedures and concepts, to understand why things happen in the firm as a whole, and to plan and organise. The correlations averaged 0.35. This low correlation can be attributed to the fact that a diverse sample was not used. In a further study, Taylor (1995) reported criterion correlations for 43 employees who were enrolled in a course designed to prepare them for promotion to junior management positions. The two correlations reported were 0.67 and 0.79, which can be interpreted as showing that the battery is an accurate predictor of performance.
In an additional study, Lopes, Roodt and Mauer (2001) assessed the predictive validity of the APIL-B test battery in a financial institution in order to identify learning potential. A sample of 235 successful job applicants completed the test battery, and its predictive validity was assessed using a canonical discriminant analysis procedure. This procedure was adopted in view of the nominal strength of the managers’ ratings, and, owing to the limited sample size, the five-point rating scale was eventually collapsed into a two-point classification.
Reliability
An assessment is reliable if it measures the same construct in a consistent and precise manner over time. Foxcroft and Roodt (2013) define the reliability of a measure as “the consistency to which it measures whatever it measures”. Split-half reliability was the main form of reliability evidence reported in the majority of the literature we consulted.
Split-Half Reliability
Taylor (1995) explains that split-half reliability was used to investigate whether the APIL-B is reliable. Foxcroft and Roodt (2013, p. 47) define split-half reliability as “obtained by splitting the measure into two equivalent halves (after a single administration of the test) and computing the correlation coefficient between these two sets of scores”.
During his investigations into the reliability of the APIL-B, Taylor used six sample groups to estimate the reliability coefficients of the Flexibility-Accuracy-Speed Tests and the Knowledge Transfer Test. The reliability coefficients range from 0.70 to 0.86 and from 0.71 to 0.84 respectively (Taylor, 1995).
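The split-half procedure defined above can be sketched in a few lines. The example below is a minimal illustration under our own assumptions: it uses an odd-even item split and applies the Spearman-Brown correction to estimate full-length reliability, neither of which is confirmed in the sources as the method Taylor (1995) used, and the item responses are invented for demonstration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores (0 or 1) per test taker."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical responses from five test takers on a six-item test.
responses = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0],
]
r_half, r_full = split_half_reliability(responses)
print(f"half-test r = {r_half:.2f}, corrected estimate = {r_full:.2f}")
```

The corrected estimate is the figure usually compared against a benchmark such as 0.70, because the raw half-test correlation describes a test only half as long as the one actually administered.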
A study conducted in the defence force over a period of three years with new recruits aimed to determine whether the psychometric evaluation process can reliably predict the learning potential of first-year recruits at the academy (Pretorius, 2010). With regard to the FAST, the study first investigated whether FAST performance is positively related to how quickly recruits learn new abilities. A significant relationship, with a reliability coefficient of r = 0.491, was found between flexibility of information processing and steepness of the learning curve. This is below the accepted reliability coefficient of r = 0.70.
Secondly, a strong relationship, with a reliability coefficient of r = 0.72, was found between speed of information processing and the total amount of work completed by the recruits. Lastly, a small relationship, with a reliability coefficient of r = 0.392, was found between accuracy of information processing and steepness of the learning curve, which is again below the accepted reliability coefficient of r = 0.70. Nevertheless, the study concluded that the three components of the FAST are accurate in predicting how quickly new recruits in the defence force will develop new competencies. The findings further indicated that the accuracy with which information is processed has a minimal influence on the rate at which a recruit will develop new competencies (Pretorius, 2010).
The Knowledge Transfer Test was used to investigate whether there was a transfer of knowledge to crystallised abilities, that is, whether recruits transfer what they have learnt and apply it in combat situations. Pretorius (2010) defines crystallised abilities as “specialized insight or understanding and knowledge that emerge via transfer from existing knowledge and that is subsequently, successfully stored in memory”. The Memory and Understanding sub-test of the APIL-B was used to measure the crystallised ability of recruits. A positive directional relationship was found between the transfer of what the recruits had learnt and crystallised abilities. A substantial relationship, with a reliability coefficient of r = 0.515, exists between memory and understanding and crystallised abilities, which suggests a moderate correlation.
With regard to the Curve of Learning, prior learning was found to have a positive directional effect on learning performance; the results indicate a substantial relationship and a moderate correlation, with a reliability coefficient of r = 0.431. In concluding on this study, it can be said that the defence force’s use of the APIL-B was not fair and efficient, as it is biased against historically disadvantaged groups (Pretorius, 2010).
A study by De Goede and Theron (2010) concurred with Pretorius (2010). It used a non-probability sample of 434 new recruits from the South African Police Service Training College in Philippi, Cape Town. Even though the sample size is quite acceptable, the use of non-probability sampling means that caution should be taken when generalising to the target population. De Goede and Theron (2010) found a reliability score of r = 0.45. This suggests that a question mark hangs over the success with which at least some of the latent variables comprising the learning potential of police recruits were measured.
Standard Error of Measurement
Foxcroft and Roodt (2013, p. 249) explain that “the standard error of measurement indicates the band of error around each obtained score, and examiners should be aware of the standard error of measurement for each subtest before interpreting the test-taker’s score”. Assessors must therefore be cognisant of the test taker’s history and current circumstances. Factors such as culture, transient conditions, prior learning and test-wiseness can affect the difference between the true score (obtained under perfect conditions) and the obtained score. Pretorius (2010) notes that an individual’s prior learning and familiarity with taking assessments have a significant impact on their ability to perform under test conditions, while Doosi (2000) was of the view that a testee’s culture and environmental factors also affect the scores of historically disadvantaged people in South Africa.
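The band of error described above can be illustrated with the classical test theory formula SEM = SD × √(1 − reliability), which is a standard textbook result rather than something taken from the APIL-B manual. The sketch below assumes a sten scale (standard deviation of 2) and borrows the lowest and highest FAST reliability coefficients reported by Taylor (1995); the obtained score of 6, the 95% multiplier and the function names are our own illustrative choices.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(obtained: float, sd: float, reliability: float, z: float = 1.96):
    """Band of error around an obtained score (z = 1.96 gives roughly 95%)."""
    sem = standard_error_of_measurement(sd, reliability)
    return obtained - z * sem, obtained + z * sem


# Hypothetical sten score of 6 (sten SD = 2) at the lowest and highest
# FAST reliability coefficients reported in the essay (0.70 and 0.86).
for r in (0.70, 0.86):
    low, high = score_band(obtained=6, sd=2.0, reliability=r)
    print(f"reliability {r:.2f}: band {low:.1f} to {high:.1f}")
```

The lower the reliability, the wider the band, which is why two candidates with slightly different obtained scores should not automatically be treated as differing in ability.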
Bias
Piro (2011) explains that bias “implies that test scores obtained for various subgroups of a given population cannot be interpreted in the same way across the groups”. Taylor (1995) suggests that the APIL-B was designed as a learning potential test and therefore limits any bias based on cultural differences. This is because the test is non-verbal, except for the instructions, and consists mainly of geometric depictions, so language does not become an issue of concern.
Strachan (2008) concurs with Taylor (1995). In a study conducted with a sample of 400 individuals, 66 testees had African surnames, while the large majority could be classified as white. The results for the two race groups were highly correlated, which was taken to indicate that there is no potential for bias. However, it should be kept in mind that this was not a representative sample.
Further studies were consulted to investigate potential bias in the APIL-B. When a sample of 20 psychological professionals from various fields was asked to assess the cultural bias of the APIL-B, 6 of the 20 felt that the test was biased (Doosi, 2000). Thus, it can be stated that there is a potential for bias based on one’s culture. Similarly, Pretorius (2010) reports that the APIL-B was accused of being biased and of under-representing the cognitive capacity of individuals from historically disadvantaged backgrounds. In order to bring recruitment practices in line with the Employment Equity Act, the battery was subsequently replaced with a selection battery thought to be less susceptible to culture, race and gender bias, and the measure was removed from use in the defence force.
Limitations of the APIL-B
According to De Goede and Theron (2010), their sample was not diverse enough to represent the target population. The same limitation applies to Strachan (2008), who also did not make use of a diverse sample. Based on the literature from these authors, it is therefore evident that accurate conclusions cannot be drawn, which indicates that there are limitations in the above studies.
Conclusion
In conclusion, the results show that the APIL-B is able to predict the performance of individuals accurately, not only in certain institutions but in any selection context, which makes the battery a vital instrument to use. The APIL-B is a somewhat outdated measure, but it still proves to be valid and reliable in measuring cognitive abilities today. However, caution should be taken when administering the APIL-B, as some authors have found that it is biased against historically disadvantaged groups. This essay has therefore reported on the APIL-B under the following headings: Evaluating the APIL-B, Composition of the APIL-B, Validity, Reliability, Bias and Limitations.
Recommendations
Firstly, it should be noted that the APIL-B is an outdated selection battery. In order for organisations to make fair decisions in line with the Employment Equity Act, a more relevant battery needs to be considered. Secondly, it should not be used on its own within the recruitment and selection process; it is advisable to use it together with other valid information, such as candidates’ curricula vitae and other test results. Thirdly, the use of the APIL-B can be considered biased in instances where people from different cultures and race groups are affected. In addition, Strachan (2008) and De Goede and Theron (2010) should have made use of more representative samples in order to draw conclusions about the reliability of their findings. Lastly, we propose that measures within the battery should not require such strict prior learning criteria, as these have been shown to disadvantage historically disadvantaged individuals who have not had exposure to prior learning.
Reference List
Doosi, M. (2000). An investigation into the attitudes, opinions, and feelings of psychometric test administrators toward the APIL B as a culture fair assessment with special reference to the Employment Equity Act. Unpublished master’s thesis, University of Natal, Durban.
Employment Equity Act (55 of 1998).
Foxcroft, C., & Roodt, G. (2013). Introduction to psychological assessment in the South African context (4th ed.). Cape Town: Oxford University Press.
De Goede, J., & Theron, C. (2010). An investigation into the internal structure of the learning potential construct of the APIL B test battery. Management Dynamics, 19(4), 30-55.
Lopes, A., Roodt, G., & Mauer, R. (2001). The predictive validity of the APIL-B in a financial institution. Journal of Industrial Psychology, 27(1), 61-69.
Murphy, K.R., & Davidshofer, C.O. (2005). Psychological testing: Principles and applications (6th ed.). New Jersey: Pearson Education International.
Phelan, C., & Wren, J. (2005). Exploring Reliability In Academic Assessment.
Retrieved March 3, 2014 From
Piro, K. (2011). Investigating the impact of a psychometric assessment technique in the South African automotive industry. Unpublished thesis, Nelson Mandela Metropolitan University, Port Elizabeth.
Pretorius, M. (2010). Validation of a selection battery used by the South African Military Academy. Unpublished master’s thesis, University of Stellenbosch, Stellenbosch.
Strachan, E. J. (2008). APIL-B as a predictor of job performance in a South African financial consulting firm. Unpublished master’s thesis, University of Cape Town, Cape Town.
Taylor, T.R. (1995). APIL. Johannesburg: Aprolab.
Taylor, T. (n.d.). Administrator’s manual for APIL Battery. Johannesburg: Aprolab.