This content originally appeared on HackerNoon and was authored by Pair Programming AI Agent
Table of Links
2. Prior conceptualisations of intelligent assistance for programmers
3. A brief overview of large language models for code generation
4. Commercial programming tools that use large language models
5. Reliability, safety, and security implications of code-generating AI models
6. Usability and design studies of AI-assisted programming
7. Experience reports and 7.1. Writing effective prompts is hard
7.2. The activity of programming shifts towards checking and unfamiliar debugging
7.3. These tools are useful for boilerplate and code reuse
8. The inadequacy of existing metaphors for AI-assisted programming
8.1. AI assistance as search
8.2. AI assistance as compilation
8.3. AI assistance as pair programming
8.4. A distinct way of programming
9. Issues with application to end-user programming
9.1. Issue 1: Intent specification, problem decomposition and computational thinking
9.2. Issue 2: Code correctness, quality and (over)confidence
9.3. Issue 3: Code comprehension and maintenance
9.4. Issue 4: Consequences of automation in end-user programming
9.5. Issue 5: No code, and the dilemma of the direct answer
9.2. Issue 2: Code correctness, quality and (over)confidence
The second challenge lies in verifying whether the code generated by the model is correct. In GridBook, a system that synthesizes spreadsheet formulas from natural language, users could see the natural language utterance, the synthesized formula, and the formula’s result. Of these, participants relied heavily on ‘eyeballing’ the final output to judge the correctness of the code, rather than, for example, reading the formula or testing it rigorously.
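To see why eyeballing is a weak check, consider a minimal sketch (the scenario, data, and function names are hypothetical, not from the GridBook study): a generated computation can produce a plausible-looking number while silently mishandling an edge case that a simple targeted test would catch.

```python
# Minimal sketch (hypothetical data and names): AI-generated code whose
# output looks plausible, so 'eyeballing' passes, but a targeted test fails.

# Suppose an assistant synthesizes this from "average the monthly sales":
def average_sales(sales):
    # Subtle bug: a missing month (None) is silently counted as zero,
    # dragging the average down instead of being excluded.
    return sum(s or 0 for s in sales) / len(sales)

sales = [1200, 1350, None, 1280]  # one month has no data yet
print(average_sales(sales))       # 957.5 -- plausible at a glance

# Checking against the mean of the known values (1276.67) catches the bug:
known = [s for s in sales if s is not None]
expected = sum(known) / len(known)
assert average_sales(sales) == expected, "missing values mishandled"  # fails
```

The flawed version looks right on every run a user is likely to glance at; only a deliberate comparison against an independently computed expectation reveals the discrepancy.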
While this lack of rigorous testing by end-user programmers is unsurprising, some users, particularly those with low computer self-efficacy, might overestimate the accuracy of the AI, compounding the overconfidence end-user programmers are known to have in the accuracy of their programs (Panko, 2008). Moreover, end-user programmers may be unable to discern the quality of non-functional aspects of the generated code, such as security flaws, poor robustness, or performance problems, even when the functional output looks right.
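The point about non-functional quality can be made concrete with another hypothetical sketch: generated code that returns exactly the right rows for ordinary input, and so survives any amount of eyeballing, while harbouring a classic SQL injection flaw that a parameterised query would avoid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")
conn.execute("INSERT INTO customers VALUES ('Ada', 'ada@example.com')")
conn.execute("INSERT INTO customers VALUES ('Bob', 'bob@example.com')")

# Hypothetical AI-generated lookup: correct output for ordinary input,
# so a purely functional eyeball check passes...
def find_customer(name):
    query = f"SELECT * FROM customers WHERE name = '{name}'"  # string-built: unsafe
    return conn.execute(query).fetchall()

print(find_customer("Ada"))          # [('Ada', 'ada@example.com')] -- looks right

# ...but crafted input slips through the string-built query (SQL injection):
print(find_customer("' OR '1'='1"))  # returns every row in the table

# A parameterised version behaves identically on normal input, yet is safe:
def find_customer_safe(name):
    return conn.execute(
        "SELECT * FROM customers WHERE name = ?", (name,)
    ).fetchall()

print(find_customer_safe("' OR '1'='1"))  # [] -- no injection
```

Nothing in the visible results for ordinary input distinguishes the unsafe version from the safe one, which is precisely why such flaws are invisible to output-based evaluation.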
:::info Authors:
(1) Advait Sarkar, Microsoft Research, University of Cambridge (advait@microsoft.com);
(2) Andrew D. Gordon, Microsoft Research, University of Edinburgh (adg@microsoft.com);
(3) Carina Negreanu, Microsoft Research (cnegreanu@microsoft.com);
(4) Christian Poelitz, Microsoft Research (cpoelitz@microsoft.com);
(5) Sruti Srinivasa Ragavan, Microsoft Research (a-srutis@microsoft.com);
(6) Ben Zorn, Microsoft Research (ben.zorn@microsoft.com).
:::
:::info This paper is available on arXiv under CC BY-NC-ND 4.0 DEED license.
:::