The Misuse of AI: ChatGPT and Unfair Advantage in Assessment Tests

A little while ago, our team at Zoftify advertised for a project manager role. As part of our standard process, we included three moderately challenging take-home questions to give us an initial impression of the candidates’ abilities.

These questions aren't rocket science for someone with hands-on experience with client projects. They're a handy way to weed out the 'shotgun' applicants, saving us and them precious time.

Typically, only about 15-20% of applicants produce satisfactory answers, moving them on to the next stage. But that was before ChatGPT entered the scene.

When something isn't right

Surprisingly, in 2023, everyone's become an expert. This is especially noticeable among our applicants: 65% of them now produce satisfactory answers, to be exact.

That’s right. Compared with our usual 15-20%, we began receiving roughly three to four times as many applications with competently answered qualification questions. It was a result we'd normally celebrate, but something didn't feel right.

We had a feeling that AI was the reason. And sure enough, the jump coincided with the burgeoning popularity of ChatGPT.

ChatGPT's imperfections

Our qualification questions operate on a straightforward principle: there's no single correct answer, allowing for a range of responses. We aim to evaluate two things: the candidates' understanding of the topic and their thinking process.

One task, for example, involves responding to a hypothetical email from a client with a complicated request. Interestingly, we started noticing a striking similarity in most candidates' responses. Different wording, same problem-solving approach.

This brings us to a limitation of the current version of ChatGPT. As OpenAI CEO Sam Altman pointed out:

It’s still limited, and it seems more impressive on first use than it does after you spend more time with it.

— Sam Altman

I've come to see this as well. ChatGPT tends to deliver similar results, just dressed up differently. Many people just haven’t realized this yet.

Tackling the use of generative AI in assessment tests

The pressing question is how to spot candidates who game assessment tests using ChatGPT, and what to do about it. In other words, how do we ensure we're hiring real talent, not posers riding the AI wave?

Don't get me wrong — ChatGPT is a fantastic tool. I have no issue with candidates using it to help articulate their knowledge. If they draft their responses and then turn to ChatGPT for some language polishing, that's a clever use of modern tools.

But those who merely feed our questions into ChatGPT and submit the AI's responses as is are not worth your time.

After filtering through close to 300 applications, our team has refined our approach to weed out the cheaters:

We review all applications and their responses.
We spot and discard any near-identical responses, which make up around 40% of what we get. They're surprisingly easy to spot, thanks to specific unnecessary details ChatGPT likes to include.
We cross-reference the remaining high-quality responses with other candidate information: CVs, cover letters, shared links, and experience. If there's a glaring mismatch between a candidate's experience and their superb answers, we reject the application.
Finally, we hold an intro call with the shortlisted candidates. We pose the same question but in a different context. If their verbal response falls short of their written one, that's a red flag.
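
We do this matching by eye, but the second step could in principle be supported by a small script. What follows is only a rough sketch in Python, not our actual process: it assumes responses are available as plain text keyed by a candidate label, and it flags pairs that share many three-word phrases. The function names, the phrase length, and the 0.3 threshold are all illustrative assumptions that would need tuning on real data.

```python
from __future__ import annotations

from itertools import combinations


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams (shingles) in a response."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def flag_similar(
    responses: dict[str, str], threshold: float = 0.3
) -> list[tuple[str, str, float]]:
    """List candidate pairs whose responses overlap at or above the threshold.

    The threshold is a hypothetical starting point, not a validated cutoff.
    """
    sets = {name: shingles(text) for name, text in responses.items()}
    flagged = []
    for (name_a, set_a), (name_b, set_b) in combinations(sets.items(), 2):
        score = jaccard(set_a, set_b)
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 2)))
    # Most similar pairs first, so a reviewer can start at the top.
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

Calling `flag_similar` on a dictionary of responses returns the pairs worth a second look. A script like this only catches shared wording. Responses that use different words but follow the same problem-solving approach still need a human reader, which is why the cross-referencing and the intro call stay in the process.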

We've also tweaked our assessment tests:

We've scrapped all closed-ended questions.
Our questions now revolve around practical, real-life examples rather than theoretical scenarios.
We provide a detailed context or backstory for the task and then ask candidates to generate ideas based on this setting.

Final thoughts

Just like any technological innovation, ChatGPT is a mixed bag. It's capable of delivering immense benefits while simultaneously raising complex challenges. It's clear that companies, particularly those recruiting remotely, must recalibrate their candidate evaluation processes to navigate this new landscape.

In practice, that means zeroing in on candidates who bring their own perspective to the qualification process. If AI can do a candidate's job just as well as the candidate can, what's the rationale behind hiring them in the first place?

What truly distinguishes us from AI is our individuality and creativity. This is what will always make humans valuable in the workforce.