Don Kirkpatrick originally developed his four levels of training evaluation — reaction, learning, behavior, and results — nearly half a century ago. At each level the data become more objective and more meaningful to management and the bottom line.
In addition, progressing through each level requires more work and more sophisticated data-gathering techniques.
It is interesting to note that many organizations evaluate training only at the first level. It is the easiest, but it doesn't really give organizations what they need to measure the value of training.
Level I: Reaction
Level I, or participant reaction data, measures participant satisfaction with the training. Whether participants were satisfied with the experience may be the deciding point as to whether the training will be repeated.
Sometimes called a smile sheet, a Level I evaluation usually consists of a questionnaire that participants use to rate their satisfaction with the training program and the trainer, among other things.
Level II: Learning
Level II measures the extent to which learning has occurred. Measuring changes in knowledge, skills, and attitudes (KSAs) indicates what participants have absorbed and whether they know how to implement what they learned.
The training design’s objectives provide an initial basis for what to evaluate in Level II.
Tests, skill practices, simulations, group evaluations, role-plays, and other assessment tools focus on what participants learned during the program. Measuring learning provides excellent data about what participants mastered as a result of the training experience. This data can be used in several ways:
- a self-assessment that lets participants gauge what they gained as a result of the training;
- an assessment of an employee’s knowledge and skills related to the job requirements;
- an assessment of whether participants possess knowledge to safely perform their duties; this is especially critical in a manufacturing setting.
Level III: Behavior
Level III evaluation measures whether the skills and knowledge are being implemented. Because this measurement focuses on changes in behavior on the job, it is more difficult to conduct, for several reasons. First, participants can't implement the new behavior until they have an opportunity to do so. In addition, it is difficult to predict when the new behavior will be implemented.
Measuring Levels I and II should occur immediately following the training, but you can see why this would not be true for Level III. To conduct a Level III evaluation correctly, you must find time to observe participants on the job, create questionnaires, speak to supervisors, and correlate the data. The measurement process itself may even encourage behavioral change on the job.
When possible, Level III results can be quantified and tied to other outcomes on the job. A clearly defined lack of skill transfer can point to a needed change in the training design. A before-and-after measurement provides data that can be used to interpret other events. Sometimes, Level III evaluations help to identify reasons, unrelated to the training itself, that change has not occurred.
Level IV: Results
Level IV measures the business impact. Sometimes (incorrectly) called cost-benefit analysis, or return on investment, it determines whether the benefits of the training were worth its cost. At this level, the evaluation is not accomplished through methods like those suggested for the previous three levels.
Results could be determined by factors such as reduced turnover, improved quality, increased quantity or output, reduction of costs, increase in profits, increased sales, improved customer service, reduction in waste or errors, less absenteeism, or fewer grievances.
A cost-benefit analysis is usually completed before a training program is created to decide whether it is worth the investment of resources required to develop the program. Return on investment (ROI) is conducted after the training has been completed to determine whether it was worth the investment.
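The ROI comparison described above reduces to simple arithmetic: net benefits divided by costs, expressed as a percentage. The sketch below illustrates this with hypothetical figures; neither the function name nor the dollar amounts come from the source.

```python
def roi_percent(benefits: float, costs: float) -> float:
    """Return on investment as a percentage: (net benefits / costs) * 100.

    A positive result means the training returned more than it cost;
    0 means it exactly broke even.
    """
    return (benefits - costs) / costs * 100

# Hypothetical example: a program costing $40,000 whose measured
# benefits (e.g., reduced turnover and fewer errors) total $100,000.
print(roi_percent(100_000, 40_000))  # 150.0 — each dollar returned $1.50 net
```

In practice, the hard part is not this calculation but isolating and pricing the benefits (reduced turnover, improved quality, fewer grievances) that can credibly be attributed to the training.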
Measurements focus on actual business results as participants successfully apply the program material. Typical measures include output, quality, time, costs, and customer satisfaction.
Adapted from Elaine Biech, Training for Dummies, Wiley