Year: 2022 | Volume: 7 | Issue: 2 | Page: 184-191
Assessment of clinical skill competency of medical postgraduate students – Recommendations for upcoming CBME curriculum for postgraduates
R Rajashree1, Smita Kottagi2, Triveni Jambale2, Gajanan Channashetti3
1 Department of Physiology, Biochemistry, Gadag Institute of Medical Sciences, Gadag, Karnataka, India
2 Department of Biochemistry, Gadag Institute of Medical Sciences, Gadag, Karnataka, India
3 Department of Ophthalmology, Gadag Institute of Medical Sciences, Gadag, Karnataka, India
Department of Ophthalmology, Gadag Institute of Medical Sciences, Gadag, Karnataka
Direct observation of postgraduate medical trainees with actual patients by clinical faculty has traditionally been a standard tool for assessing knowledge and skills in clinical subjects. By assessing trainees as they perform a medical interview, conduct a physical examination, or counsel patients, and by giving them feedback, faculty can help budding physicians to practice medicine successfully in the future. Despite advances in clinical skills evaluation, direct observation remains the most popular and time-tested method. However, observation of postgraduate medical students by faculty is highly subjective and, unfortunately, often sporadic and nonstandardized. A substantial body of literature identifies several threats to its construct validity as an assessment tool. Although many authors have tried to demonstrate methods to minimize those threats, many lacunae remain inherent to the direct observation method. Hence, the need of the hour is to re-examine observation as an assessment tool rather than discard it entirely as inappropriate. The authors first analysed the problems in present settings in India. After an extensive literature search, the authors advocate a few additions and modifications to the existing system. Thus, the present study not only highlights the pitfalls of the direct observation method but also suggests solutions to them.
|How to cite this article:|
Rajashree R, Kottagi S, Jambale T, Channashetti G. Assessment of clinical skill competency of medical postgraduate students – Recommendations for upcoming CBME curriculum for postgraduates. BLDE Univ J Health Sci 2022;7:184-191
|How to cite this URL:|
Rajashree R, Kottagi S, Jambale T, Channashetti G. Assessment of clinical skill competency of medical postgraduate students – Recommendations for upcoming CBME curriculum for postgraduates. BLDE Univ J Health Sci [serial online] 2022 [cited 2023 Mar 31 ];7:184-191
Available from: https://www.bldeujournalhs.in/text.asp?2022/7/2/184/355860
Patient-based clinical examinations have always been an essential component of assessment in medical schools. For decades, they have been considered a necessary complement to written and oral examinations. Direct observation by faculty has traditionally been a standard tool for assessing higher-order skills in clinical subjects. It is one of the observational assessment tools designed to assess the “DOES” level of George Miller's pyramid. Despite being an inherent part of assessment in medical schools, it has remained an informal and underused assessment tool across all specialties. A substantial body of literature not only identifies several threats to its construct validity as an assessment but also demonstrates methods to minimize those threats. In India, the present examination system for postgraduate (PG) students in medical schools needs dynamic curricular reform, as is being done for undergraduates in the new competency-based medical education (CBME) curriculum.
Currently, the scheme of PG examination comprises a four-step approach for passing the PG degree examinations, as detailed in [Annexure 1]. The first step is a dissertation carried out by the PG student, completion of which is a precondition for appearing in the final PG examination. The second step is a written examination of four papers carrying 100 marks each. Each paper consists of 2 long and 6 short essays carrying 20 and 10 marks each, respectively. The third step is a practical examination aimed at assessing psychomotor/skill-related competencies, in which different techniques and procedures are tested. Here, the student's ability to make relevant and valid observations on patients, and to interpret and draw inferences from laboratory or experimental work relating to his/her subject, is assessed in various examination settings by different assessors. The practical/clinical examination carries a total of 200 marks. Each student is assigned one long case carrying 65 marks, for which 45 min is allotted, and 3 short cases carrying 45 marks each, which are tested for about 30 min each. Each student faces four examiners (2 internal and 2 external), and each examiner assesses one case presentation by the PG student. The fourth step is an oral/viva voce examination that aims to assess the depth of knowledge, logical reasoning, confidence, and communication skills of the examinee. Of the 100 marks assigned, 80 are dedicated to the viva, assessed by all four examiners jointly. The remaining 20 marks are given for pedagogy, wherein the PG student has to present an assigned topic for 10 min in front of all four examiners. To pass the university examination, a PG student must secure not less than 50% of the marks in each component individually, namely (1) theory and (2) practical, including clinical and viva voce examination. The criterion for distinction is a grand total aggregate of 75% or above in the first attempt.
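The marks scheme described above can be tallied in a short sketch. The Python below is illustrative only: the constant and function names are hypothetical, and it assumes one reading of the pass rule, namely that theory is one component and practical (clinical plus viva voce) is the other, each requiring at least 50%.

```python
# Mark allocation as stated in the text; all names are hypothetical.
THEORY_MAX = 4 * 100            # four written papers, 100 marks each
CLINICAL_MAX = 65 + 3 * 45      # one long case plus three short cases
VIVA_MAX = 80 + 20              # viva voce plus pedagogy
PRACTICAL_MAX = CLINICAL_MAX + VIVA_MAX


def result(theory: float, clinical: float, viva: float,
           first_attempt: bool = True) -> str:
    """Return 'fail', 'pass', or 'distinction' for one candidate.

    Assumption: 'practical' means clinical and viva marks combined.
    """
    practical = clinical + viva
    passed = (theory >= 0.5 * THEORY_MAX
              and practical >= 0.5 * PRACTICAL_MAX)
    if not passed:
        return "fail"
    aggregate = (theory + practical) / (THEORY_MAX + PRACTICAL_MAX)
    if first_attempt and aggregate >= 0.75:
        return "distinction"
    return "pass"


if __name__ == "__main__":
    print(CLINICAL_MAX)                      # total clinical marks
    print(result(theory=320, clinical=160, viva=80))
```

Under this reading, a strong clinical score cannot offset a theory score below 50%, because each component is checked separately before the aggregate is considered.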
PG students in medical schools are expected to be able to manage a variety of critical cases after completing clinical training in their specialty. However, many of them fail to perform well in later clinical practice despite scoring well in the examinations. Physicians' poor performance could be attributed either to limited clinical experience or to poor assessment methods. The major challenge now facing clinical faculty is to identify the pitfalls in assessment and the measures to correct them, so that assessment eventually predicts the successful future performance of medical PG students. Regrettably, current methods of assessing clinical competency, direct observation in particular, are often implicit, unsystematic, and therefore inadequate. Many medical schools in India still follow age-old curricula for the assessment of PGs, though a handful of schools, especially colleges owned by deemed universities, have revised them to some extent. This warrants uniform curricula, timetables, etc., throughout India irrespective of affiliation, as all graduates will be designated Indian medical postgraduates. In the present study, we propose a set of recommendations to reduce the validity threats identified in the present observational assessment system in India.
Analysis of the Problem
Several threats to validity interfere with the meaningful interpretation of data in the assessment of clinical competency by direct observation. We used Messick's concept of construct validity as our overall conceptual framework to analyze the problems with the existing examination setting. Kane et al. stated that “validity model is the process of building logical, scientific and empirical argument to support or refute very specific intended interpretations of assessment scores.” Essentially, all validity is construct validity and is based on five major sources of evidence, namely content, response process, internal structure, relation to other variables, and consequences. Unfortunately, threats to validity may outnumber the sources of evidence; the two major threats are construct undersampling (CU) and construct-irrelevant variance (CIV).
CU refers to under-sampling or biased sampling of the content domain by the assessment tool. In the present examination setting, we assess PG students on the basis of very few observations of clinical behavior: one long and three short cases. Too few cases lead to a lack of generalizability, which is one of the major CU threats to validity. The cases under-represent the domain, as they may not address all the specified core competencies developed in the context of desirable physician attributes, namely patient care, medical knowledge, practice-based learning and improvement, systems-based practice, professionalism, and interpersonal and communication skills.
Furthermore, high case specificity is well documented in observational assessments, and the time allotted (30 min for each short case) is not adequate to draw meaningful conclusions. Clinical observation depends primarily on faculty rating, which introduces considerable subjectivity. Other threats, such as incomplete observation and too few raters, may worsen the problem further.
The present PG curriculum still follows the old pattern, which has not been revised for decades. The methods presently followed may have suited the clinical setting of decades ago, when there were many resource constraints in infrastructure, manpower, and time.
Potential CIV threats are inherent in clinical performance examinations. CIV mainly concerns systematic rater error, such as cognitive bias (halo, severity, leniency, central tendency, recency effect, and restriction of range) and gender-, ethnicity-, and race-related bias, which results in unfair scoring. Rating by faculty who are not well trained in the use of rating scales poses a CIV threat to validity: such raters may rate students' performance inappropriately, which eventually leads to scores that are higher or lower than they should be.
Other major CIV threats originate from flawed rating scales or ambiguous checklists. Regrettably, luck plays a major role in whether a student gets a difficult or a cooperative patient, as cases are selected by lottery, and this leads to erroneous scores. Here, the patient's cultural background, gender, language, and educational level all affect how the actual clinical features are presented to the examinee, who may end up with an incomplete or wrong diagnosis.
Review of Literature
We reviewed the English-language literature in PubMed using the search terms “direct observation, performance observation, clinical observation, and validity threats.” We also searched Google Scholar to learn more about ACGME, IMG, NCME, AERA, etc. Ample anecdotal evidence shows the need for changes in health professions education as well as in assessment methods, which are integral to eventual improvement in this area. Researchers have shown that faculty evaluators need to be diligent in performing clinical observational assessments to provide reliable, valid data. Current evidence suggests significant deficiencies in faculty direct-observation evaluation skills and in examination settings with actual patients. Assessments that rely heavily on data collected by direct observation should focus on the level of evaluation using Miller's pyramid or Kirkpatrick's criteria. Since observational assessment tests the highest level of Miller's pyramid (“DOES”), raters should be able to judge accurately.
Loftus et al. showed the importance of five links in the cascade of observational assessment for maximizing the accuracy of judgment. The links are the clinical behavior/response of the learner, faculty observation, interpretation of the observed behavior by faculty, recording of the data, and judgment of the collected data by the rater. Sources of error may arise from the 2nd, 3rd, and 4th links, owing to poor eyesight, poor insight, and flawed foresight, respectively. The R-I-M-E model was found to be useful for helping faculty judge students precisely, as learning develops from “reporter” to “interpreter” to “manager” to “educator.”
A structured format will improve the accuracy of observations by enhancing the reliability, and ultimately the validity, of faculty assessment of residents' clinical competence. Reliability refers to the precision of the examination, that is, the likelihood of obtaining the same results on repeated measurement. Construct validity is a unitary concept with multiple facets, used to support or refute the meaning associated with assessment data. Although reliability is an essential component of validity, it is not sufficient. Low reliability is a major threat, and generalizability theory can be applied to determine how far the obtained data can be generalized.
Of the two major threats to validity, the CU threat can be reduced by increasing the number and duration of clinical encounters until they are sufficient to achieve minimum generalizability. A generalizability coefficient of at least 0.80 is required for high-stakes performance examinations, while the phi coefficient is used for criterion-referenced performance examinations. Errors due to under-representative sampling of the domain can be corrected by systematically planning the blueprint for the specific cases as well as the rating scales. CU threats related to raters can be addressed by obtaining more independent ratings, enough to produce interpretable, generalizable data. Using trained raters from different disciplines with different perspectives, and complementing them with 360° evaluation, which has shown an alpha reliability coefficient of 0.89, will help to reduce errors in rating.
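As a rough illustration of how the number of raters affects these coefficients, the sketch below assumes a fully crossed person × rater design and uses the standard generalizability-theory formulas for the G (relative) and phi (absolute) coefficients. The variance components are made-up numbers chosen only for illustration, not estimates from any study, and the function names are hypothetical.

```python
from typing import Optional, Tuple


def g_and_phi(var_p: float, var_r: float, var_pr_e: float,
              n_raters: int) -> Tuple[float, float]:
    """G and phi for a crossed person x rater design, averaged over n_raters.

    var_p    : variance between examinees (the signal of interest)
    var_r    : variance due to rater stringency ("hawk" vs "dove")
    var_pr_e : examinee-by-rater interaction confounded with residual error
    """
    rel_error = var_pr_e / n_raters             # affects rank ordering
    abs_error = (var_r + var_pr_e) / n_raters   # also affects absolute scores
    g = var_p / (var_p + rel_error)
    phi = var_p / (var_p + abs_error)
    return g, phi


def raters_needed(var_p: float, var_r: float, var_pr_e: float,
                  target: float = 0.80,
                  max_raters: int = 50) -> Optional[int]:
    """Smallest number of raters whose G coefficient reaches the target."""
    for n in range(1, max_raters + 1):
        g, _ = g_and_phi(var_p, var_r, var_pr_e, n)
        if g >= target:
            return n
    return None


# Hypothetical variance components, for illustration only.
VAR_P, VAR_R, VAR_PR_E = 0.5, 0.1, 0.5

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        g, phi = g_and_phi(VAR_P, VAR_R, VAR_PR_E, n)
        print(f"raters={n}: G={g:.2f}, phi={phi:.2f}")
```

Because rater stringency enters only the absolute error term, phi is never larger than G for the same design, which is why a criterion-referenced (pass/fail) examination generally needs more raters than a norm-referenced one to reach the same coefficient.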
CIV threats are mainly due to systematic rather than random rater error. Having many trained raters will reduce systematic error; however, a student may still be lucky or unlucky in drawing “dove” or “hawk” raters, respectively. Providing raters with a scoring rubric that they have already been trained to use is another remedial measure. Raters' cognitive bias can be eliminated by providing frame-of-reference training and feedback. Systematic errors associated with checklists or rating scales can be eliminated by seeking expert opinion in framing the anchors and by conducting pilot studies.
The use of high-fidelity simulated patients (SPs) is recommended, as they help medical PG students to diagnose and treat patients' problems with the expected competencies. Diagnostic expertise is usually content specific, tending to be limited to clinical problems with which a physician has had substantial experience; this limitation can be reduced by the use of SPs.
PG students are rarely observed and evaluated during their educational process. This may be due to lack of faculty time, resource constraints, and a perceived lack of validation of the assessment. The mini-clinical evaluation exercise (mini-CEX) is typically based on real-life data, and many authors have demonstrated it to be a reliable and acceptable instrument for workplace-based assessment.
Other neglected aspects that need to be introduced and assessed are the noncognitive variables of students. Although many researchers have shown the importance of noncognitive variables in predicting the successful performance of PG students as the physicians of tomorrow, their assessment has never been given due importance. The multiple mini-interview (MMI) is an innovative, valid tool developed by McMaster University (2001) to assess noncognitive qualities. The MMI consists of 10 stations, each representing one qualitative category elaborated by McGaghie, which are assessed using behaviorally anchored 7-point global rating scales (1 = unsatisfactory and 7 = outstanding). A minimum score of 5 (satisfactory) was required at each station, and the total cutoff score was set at 50, which was noncompensatory.
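The noncompensatory rule can be expressed as a simple check. This is a minimal sketch using the thresholds stated above (10 stations, a 7-point scale, a station minimum of 5, and a total cutoff of 50); the function name and its parameters are hypothetical.

```python
from typing import Sequence


def mmi_pass(station_scores: Sequence[int],
             min_station: int = 5,
             total_cutoff: int = 50) -> bool:
    """Noncompensatory MMI decision: every station must meet the minimum.

    A high score at one station cannot make up for a low score at another.
    """
    if len(station_scores) != 10:
        raise ValueError("expected scores for exactly 10 stations")
    if any(not 1 <= s <= 7 for s in station_scores):
        raise ValueError("each score must lie on the 1-7 rating scale")
    every_station_ok = min(station_scores) >= min_station
    total_ok = sum(station_scores) >= total_cutoff
    return every_station_ok and total_ok


if __name__ == "__main__":
    print(mmi_pass([5] * 10))          # satisfactory at every station
    print(mmi_pass([7] * 9 + [4]))     # high total, one weak station
```

With ten stations and a station minimum of 5, any candidate who clears every station already totals at least 50, so under this reading the per-station rule is the binding one.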
Sir William Osler, one of the most honored physicians in the history of medicine, stated that hospitals are meant not only for patient care but should also serve as schools that facilitate PG students' learning and training in solving real-world problems. Once a standardized learning environment is provided, equal emphasis should be given to the evaluation system that assesses candidates' clinical competence. The evaluation system should address whether students have gained the necessary knowledge and skills by the end of their postgraduate training and are ready to start clinical practice. Toward this purpose, we propose a few tools that carry the fewest validity threats. Having explained the lacunae in the existing examination setting earlier and reviewed the literature, we recommend a few changes to the setting rather than replacement of the whole evaluation system, as shown in [Annexure 2].
Validity threats to the direct observation method of assessment can be addressed efficiently by re-compartmentalizing the domains as follows. Clinical skills can be observed in about 20 objective structured clinical examination (OSCE) stations lasting 15–30 min each, depending on the task to be performed. A content blueprint, a sample of which is provided in [Annexure 3], should be prepared well in advance by experts, matching the objectives in the skills domain exclusively and without repeating the knowledge domains already tested in the MCQ and MEQ examinations. Each station should have 20 independent raters and an arrangement for simultaneous video recording, so that the recordings can later be assessed independently by different experts in the respective subjects.
Noncognitive variables such as professionalism and communication skills should be assessed in a separate 10-station MMI. MMI stations should be planned with the modifications described above for OSCE stations to optimize the validity of the outcome. Noncognitive variables and systems-based practice, which require interprofessional communication skills, can be assessed effectively by 360° evaluation, which has a reported reliability of 0.89.
Physicians are expected to be lifelong learners who update and upgrade their medical knowledge and skills throughout their years of practice. These qualities can be nurtured from the beginning of their medical career by assessing their interest in improvement through the mini-CEX, which has demonstrated high reliability coefficients. The mini-CEX will help to eliminate the errors that arise from assessment by direct observation during final examinations, such as too few cases and raters, biased rating, inadequate timing, and luck-based performance. The mini-CEX compensates for the shortcomings of a final, one-time assessment, since students' performance on real patients is assessed regularly throughout the year by multiple raters.
A growing body of research has shown that several threats exist in assessment by direct observation, yet it is still frequently done in a nonstandardized setting and in an unsystematic way. To eliminate these errors, a number of medical schools are using SPs. Medical simulation technology provides maximum control over the training environment and the examination setting, resulting in valid assessment of examinees' clinical performance. SPs are trained persons or devices that attempt to present education and evaluation problems authentically.
SPs usually demonstrate high fidelity in portraying real patients in natural clinical situations. The major drawbacks of simulation are its high cost and the difficulty of establishing clinical realism. Medical simulations only complement, and will never replace, educational activities based on real patient care experiences. Their effectiveness depends on employing a mastery model of training and on informed use with trainees, including providing feedback, engaging learners in deliberate practice, and integrating SPs into the curriculum and examinations.
Our recommendation is that deliberate mixed practice of ten mini-CEXs on real patients and two on SPs (12 in total) in a year should compensate for the scores of clinical assessment in the final examinations. Furthermore, we strongly recommend the use of SPs, which provide a controlled setting and yield valid assessment scores during final examinations. Standard setting, a process used to distinguish categories such as pass or fail, should be done using either the borderline or the contrasting groups method. In a nutshell, direct observation is a unique and reliable tool for assessing the clinical competency of medical students; however, it requires some essential changes to optimize its validity. The fundamental aim of training physicians is to see whether they can apply the acquired knowledge and skills in natural settings. We have identified several threats to validity and recommended methods to minimize them by gathering evidence from five sources, as shown in [Annexure 4]. Direct observation should be included in medical education curricula, but with the proposed recommendations and with special attention paid to development and design.
The current practice of assessing the performance of PG students on actual patients by direct observation in the final examination needs to be more systematic and standardized if the data obtained are to be interpreted meaningfully. The present article identifies several threats to validity and recommends essential changes to improve the existing observational assessment. Validity, the sine qua non of assessment data, faces major threats such as CU and CIV, which can be eliminated by the proposed remedies. A set of 16 practical recommendations for improving observational assessment practices has been proposed; these include broad, systematic sampling of clinical situations; keeping rating instruments short and focused; encouraging immediate feedback; encouraging prompt recording of observational data; giving raters feedback about their ratings; supplementing formal with unobtrusive observation; making promotion decisions via group review; supplementing traditional observation with other clinical skills measures (e.g., OSCE and SPs); encouraging rating of specific performances rather than global ratings; and establishing the meaning of ratings in the manner used to set normal limits for clinical diagnostic investigations.
We conclude our proposal by stating that modified, valid teaching methods and assessment tools need to be incorporated in the upcoming PG curriculum. The age-old PG curriculum needs a facelift with advanced and valid learning methods. In summary, to assess the knowledge domain, we propose the MCQ, MTF, and MEQ formats for the PG final examination. Skills that are not adequately assessed by the MCQ and MEQ formats should be assessed with complementary formats such as the objective structured practical/clinical examination (OSPE/OSCE I and II) with valid checklists and rating rubrics. The attitude domain, which is one of the major predictors of physicians' successful future performance, is not well addressed in the existing system. Thus, there is an urgent need to evaluate noncognitive variables using tools such as 360°/multisource feedback, role-plays, counseling, and interviews. Policymakers in the National Medical Commission (NMC) can consider many other innovative observational assessment tools, using standardized patients, simulations, etc., that can withstand unexpected barriers in the future. The COVID-19 pandemic has taught us many things, both good and bad. Online teaching and assessment picked up a sudden, unexpected pace during the COVID era from March 2020. Computer-based case simulations, the objective structured video examination, and modified versions of the OSCE are now considered very useful tools for testing skills during online assessment.
Currently, many reforms are proposed in the CBME curriculum for undergraduates but not yet for PGs. Hence, it is the need of the hour to revise and upgrade the PG curriculum to global standards. Furthermore, there is an urgent need to implement the proposed solutions uniformly across Indian medical schools, as presently there is no mandate from the NMC on uniformity. Operationalization of CBME remains a major challenge, however attractive the curriculum we propose. In this regard, phase-wise implementation and effective faculty development programs for sensitization and training in the new PG curriculum can address the effectiveness of the new proposal.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
|1||Williams RG, Klamen DA, McGaghie WC. Cognitive, social and environmental sources of bias in clinical performance ratings. Teach Learn Med 2003;15:270-92.|
|2||Fromme HB, Karani R, Downing SM. Direct observation in medical education: A review of the literature and evidence for validity. Mt Sinai J Med 2009;76:365-71.|
|3||Downing SM. Validity: On meaningful interpretation of assessment data. Med Educ 2003;37:830-7.|
|4||Messick S. Validity. In: Linn RL, editor. Educational Measurement. 3rd ed. New York, NY: Macmillan Publishing Co, Inc; 1989.|
|5||Brown E, Rosinski EF, Altman DF. Comparing medical school graduates who perform poorly in residency with graduates who perform well. Acad Med 1993;68:806-8.|
|6||Dirschl DR, Campion ER, Gilliam K. Resident selection and predictors of performance: Can we be evidence based? Clin Orthop Relat Res 2006;449:44-9.|
|7||Joorabchi B, Devries JM. Evaluation of clinical competence: The gap between expectation and performance. Pediatrics 1996;97:179-84.|
|8||Ward PJ. Influence of study approaches on academic outcomes during pre-clinical medical education. Med Teach 2011;33:e651-62.|
|9||Messick S. Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. Am Psychol 1995;50:741-9.|
|10||Kane M. Content-related validity evidence in test development. In: Downing SM, Haladyna TM, editors. Handbook of Test Development. Mahwah, NJ: Lawrence Erlbaum Associates Publishers; 2006.|
|11||American Educational Research Association. Standards for Educational and Psychological Testing. Washington, DC: American Educational Research Association; 1999.|
|12||Vallevand A, Violato C. A predictive and construct validity study of a high-stakes objective clinical examination for assessing the clinical competence of international medical graduates. Teach Learn Med 2012;24:168-76.|
|13||Downing SM, Haladyna TM. Validity threats: Overcoming interference with proposed interpretations of assessment data. Med Educ 2004;38:327-33.|
|14||Accreditation Council for Graduate Medical Education. Outcomes Project: General Competencies; 2004. Available from: http://www.acgme.org/outcome/comp/compFull.asp. [Last accessed on 2012 May 11].|
|15||Elstein AS, Shulman LS, Sprafka SA. Medical Problem Solving: An Analysis of Clinical Reasoning. Cambridge, Massachusetts: Harvard University Press; 1978.|
|16||Norman GR, Tugwell P, Feightner JW, Muzzin LJ, Jacoby LL. Knowledge and clinical problem-solving. Med Educ 1985;19:344-56.|
|17||Engelhard G. Monitoring raters in performance assessments. In: Tindal G, Haladyna T, editors. Large-scale Assessment Programs for ALL Students: Development, Implementation, and Analysis. Mahwah, NJ: Erlbaum; 2002. p. 261-87.|
|18||Holmboe ES, Yepes M, Williams F, Huot SJ. Feedback and the mini clinical evaluation exercise. J Gen Intern Med 2004;19:558-61.|
|19||Loftus EF, Schneider NG. Behold with Strange Surprise: Judicial Reactions to Expert Testimony Concerning Eyewitness Reliability. USA: University of Missouri-Kansas City Law Review; 1987. p. 56, 1-45.|
|20||Downing SM, Yudkowsky R. Assessment in Health Professions Education. New York, NY: Routledge; 2009.|
|21||Pangaro L. A new vocabulary and other innovations for improving descriptive in-training evaluations. Acad Med 1999;74:1203-7.|
|22||Ogburn T, Espey E. The R-I-M-E method for evaluation of medical students on an obstetrics and gynecology clerkship. Am J Obstet Gynecol 2003;189:666-9.|
|23||Carnahan D, Hemmer PA. “Descriptive evaluation”. In: Fincher RM, editor. Guidebook for Clerkship Directors. 3rd ed. Omaha: Alliance for Clinical Education; 2005. p. 150-62.|
|24||Noel GL, Herbers JE Jr., Caplow MP, Cooper GS, Pangaro LN, Harvey J. How well do internal medicine faculty members evaluate the clinical skills of residents? Ann Intern Med 1992;117:757-65.|
|25||Kane MT. An argument-based approach to validity. Psychol Bull 1992;112:527-35.|
|26||Bari V. Direct observation of procedural skills in radiology. AJR Am J Roentgenol 2010;195:W14-8.|
|27||van der Vleuten CP, Swanson DB. Assessment of clinical skills with standardized patients: State of the art. Teach Learn Med 1990;2:58-76.|
|28||Brannick MT, Erol-Korkmaz HT, Prewett M. A systematic review of the reliability of objective structured clinical examination scores. Med Educ 2011;45:1181-9.|
|29||Williams RG. Have standardized patient examinations stood the test of time and experience? Teach Learn Med 2004;16:215-22.|
|30||Massagli TL, Carline JD. Reliability of a 360° evaluation to assess resident competence. Am J Phys Med Rehabil 2007;86:845-52.|
|31||Herbers JE Jr., Noel GL, Cooper GS, Harvey J, Pangaro LN, Weaver MJ. How accurate are faculty evaluations of clinical competence? J Gen Intern Med 1989;4:202-8.|
|32||Swanson DB, Norman GR, Linn RL. Performance-based assessment: Lessons from the health professions. Educ Res 1995;24:5-11.|
|33||Sidhu RS, Hatala R, Barron S, Broudo M, Pachev G, Page G. Reliability and acceptance of the mini-clinical evaluation exercise as a performance assessment of practicing physicians. Acad Med 2009;84:S113-5.|
|34||Alves de Lima A, Conde D, Costabel J, Corso J, Van der Vleuten C. A laboratory study on the reliability estimations of the mini-CEX. Adv Health Sci Educ Theory Pract 2013;18:5-13.|
|35||Altmaier EM, McGuinness G, Wood P, Ross RR, Bartley J, Smith W. Defining successful performance among pediatric residents. Pediatrics 1990;85:139-43.|
|36||Kastner L, Gore E, Novack AH, et al. Pediatric residents' attitudes and cognitive knowledge and faculty ratings. J Pediatr 1984;104:814-8.|
|37||Ginsburg S, Regehr G, Hatala R, McNaughton N, Frohna A, Hodges B, et al. Context, conflict, and resolution: A new conceptual framework for evaluating professionalism. Acad Med 2000;75 Suppl 10:S6-11.|
|38||McGaghie WC. Qualitative variables in medical school admission. Acad Med 1990;65:145-9.|
|39||Eva KW, Rosenfeld J, Reiter HI, Norman GR. Admissions OSCE: The Multiple Mini-interviews. Med Educ 2004;38:314-26.|
|40||Reiter HI, Eva KW, Rosenfeld J, Norman GR. Multiple mini-interviews predict clerkship and licensing examination performance. Med Educ 2007;41:378-84.|
|41||Rushforth HE. Objective Structured Clinical Examination (OSCE): Review of literature and implications for nursing education. Nurse Educ Today 2007;27:481-90.|
|42||Shatzer JH, Darosa D, Colliver JA, Barkmeier L. Station-length requirements for reliable performance-based examination scores. Acad Med 1993;68:224-9.|
|43||Scheidt PC, Lazoritz S, Ebbeling WL, Figelman AR, Moessner HF, Singer JE. Evaluation of system providing feedback to students on videotaped patient encounters. J Med Educ 1986;61:585-90.|
|44||Yang YY, Lee FY, Hsu HC, Huang CC, Chen JW, Cheng HM, et al. Assessment of first-year post-graduate residents: Usefulness of multiple tools. J Chin Med Assoc 2011;74:531-8.|
|45||Cook DA, Beckman TJ, Mandrekar JN, Pankratz VS. Internal structure of mini-CEX scores for internal medicine residents: Factor analysis and generalizability. Adv Health Sci Educ Theory Pract 2010;15:633-45.|
|46||Hawkins RE, Margolis MJ, Durning SJ, Norcini JJ. Constructing a validity argument for the mini-clinical evaluation exercise: A review of the research. Acad Med 2010;85:1453-61.|
|47||Boulet JR, Murray D, Kras J, Woodhouse J, McAllister J, Ziv A. Reliability and validity of a simulation-based acute care skills assessment for medical students and residents. Anesthesiology 2003;99:1270-80.|
|48||Marinopoulos SS, Dorman T, Ratanawongsa N, Wilson LM, Ashar BH, Magaziner JL, et al. Effectiveness of Continuing Medical Education. Rockville, MD: Agency for Healthcare Research and Quality: Evidence Report/Technology Assessment No 149; 2007.|
|49||McGaghie WC, Siddall VJ, Mazmanian PE, Myers J; American College of Chest Physicians Health and Science Policy Committee. Lessons for continuing medical education from simulation research in undergraduate and graduate medical education: effectiveness of continuing medical education: American College of Chest Physicians Evidence-Based Educational Guidelines. Chest 2009;135 Suppl 3:62S-8S.|
|50||Block JH, editor. Mastery Learning: Theory and Practice. New York: Holt, Rinehart and Winston; 1971.|
|51||Liu M, Liu KM. Setting pass scores for clinical skills assessment. Kaohsiung J Med Sci 2008;24:656-63.|
|52||Burrows PJ, Bingham L, Brailovsky CA. A modified contrasting groups method used for setting the passmark in a small scale standardised patient examination. Adv Health Sci Educ Theory Pract 1999;4:145-54.|
|53||Kogan JR, Holmboe ES, Hauer KE. Tools for direct observation and assessment of clinical skills of medical trainees: A systematic review. JAMA 2009;302:1316-26.|
|54||Boulet JR, McKinley DW, Whelan GP, Hambleton RK. Quality assurance methods for performance-based assessments. Adv Health Sci Educ Theory Pract 2003;8:27-47.|
|55||Brennan RL. Generalizability Theory. New York: Springer-Verlag; 2001.|
|56||Tavakol M, Brennan RL. Medical education assessment: A brief overview of concepts in generalizability theory. Int J Med Educ 2013;4:221-2.|
|57||Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol Bull 1959;56:81-105.|
|58||Norcini JJ. Setting standards on educational tests. Med Educ 2003;37:464-9.|
|59||Rajashree R, Chandrashekhar DM. Competency based medical education in India: A work in progress. Indian J Physiol Pharmacol 2020;64 Suppl 1:S7-9.|