Experienced exam markers were unable to detect papers generated by artificial intelligence (AI) in 94% of cases and gave them higher grades than those written by real students, a new study has found.

Researchers at the University of Reading used ChatGPT to generate exam answers that were submitted for several undergraduate psychology modules. They used ChatGPT-4 – the most advanced version of the popular AI platform – and submitted the answers using fake student identities.

The team believes their blind study was the largest and most robust of its kind to date in trying to challenge human educators to detect AI-generated content. They said their findings should provide a “wake-up call” for educators across the world.

A recent Unesco survey of 450 schools and universities found that less than 10% had policies or guidance on the use of generative AI.

Peter Scarfe, an associate professor at Reading’s school of psychology and clinical language sciences and co-leader of the study, said: “Many...

Comment
  • This is worrying and needs urgent attention. As @Peter pointed out, the solution would be to have students defend their submissions in a face-to-face interview. This may not be feasible for a large cohort. One way of addressing the issue of a large cohort would be to use AI to complete the first round of grading, freeing markers' time to conduct the interviews.
