How do you evaluate whether your software is doing what it's supposed to do?
Do you test all your app's possible cases, branches, and states? I don't, at least not manually. Nobody's got time to click through all the edge cases by hand. QA'ing a simple login form takes time, let alone testing complex applications.
Having robots do that helps a ton, and I recommend writing automated tests to help you sleep well at night (and release fewer bugs)!
Ignoring the burden of writing and maintaining tests, testing a "normal" web application is straightforward because it's predictable. Throw something at your app and expect a result. The result should always be the same. Most apps are CRUD apps anyway; easy peasy.
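
To make that concrete: a unit test for deterministic code is as boring as it gets. Call a function, compare the result. Here's a throwaway sketch using Node's built-in test runner (the `slugify` helper is made up for illustration):

```ts
import { test } from "node:test";
import assert from "node:assert/strict";

// Made-up, fully deterministic helper: same input, same output, every time.
function slugify(title: string): string {
  return title.toLowerCase().trim().replace(/[^a-z0-9]+/g, "-");
}

test("slugify produces a URL-safe slug", () => {
  assert.equal(slugify("Unit Testing AI Apps"), "unit-testing-ai-apps");
});
```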
But what if there are unpredictable parts in your app's core?
If you're riding the AI buzzword wave, you probably implemented an "I know everything" smart-ass right in your app's core that's known for lying and spreading fake news. (Yes, I mean some sort of LLM.)
How would you test your app's quality if you're building software on top of software you probably don't understand?
Here's Hamel Husain's recommendation:
There are three levels of evaluation to consider:
- Level 1: Unit Tests
- Level 2: Model & Human Eval (this includes debugging)
- Level 3: A/B testing
I'm not planning to get into serious AI work or LLM programming anytime soon, but unit testing software sitting on top of LLMs is fascinating and worth more than a bookmark!
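
To give Level 1 some shape, here's a rough sketch of what an assertion-based unit test for an LLM-backed function could look like. Both `generateReleaseNotes` and `callModel` are made up for illustration; the idea is to assert on deterministic properties of the response (JSON shape, length limits, banned phrases) instead of exact wording.

```ts
import { test } from "node:test";
import assert from "node:assert/strict";

type ReleaseNotes = { title: string; bullets: string[] };

// Stand-in for a real LLM client. In a unit test you'd stub it or replay
// recorded responses so the test stays fast and deterministic.
async function callModel(prompt: string): Promise<string> {
  return JSON.stringify({ title: "v1.2.0", bullets: ["Fixed the login button"] });
}

// Made-up LLM-backed feature: turn a changelog into structured release notes.
async function generateReleaseNotes(changelog: string): Promise<ReleaseNotes> {
  const raw = await callModel(`Summarize this changelog as JSON: ${changelog}`);
  return JSON.parse(raw) as ReleaseNotes;
}

// Level 1: check properties you can verify deterministically
// (shape, limits, banned phrases), not the exact wording.
test("release notes have the expected shape", async () => {
  const notes = await generateReleaseNotes("- fix: login button ignored clicks");

  assert.equal(typeof notes.title, "string");
  assert.ok(Array.isArray(notes.bullets) && notes.bullets.length > 0);
  assert.ok(notes.bullets.every((bullet) => bullet.length <= 200));
  assert.ok(!JSON.stringify(notes).toLowerCase().includes("as an ai"));
});
```

Assertions like these can't tell you whether the summary is actually faithful to the changelog; that's what the model and human eval in Level 2 is for.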