This content originally appeared on DEV Community and was authored by Takuto Yuki
Introduction
End-to-end (E2E) testing is crucial for ensuring software quality, yet it is often costly. Writing and maintaining test scripts is time-consuming, and even minor DOM changes can cause test failures.
While some engineers enjoy writing tests, few specialize in E2E testing. The tech industry has responded with numerous automated and no-code E2E testing solutions, but these tools are often expensive and lack precision.
browser-use offers a new approach by automating browser interactions. Given its capabilities, I experimented with using it for E2E test automation.
Overview
In this experiment, I used the following tools and services:
- Tools: Python, browser-use, Playwright, Jest
- Test site: Sauce Demo
- A mock e-commerce site widely used for testing
- Fully mocked backend with publicly available test accounts
- Goal: Minimize manual effort in E2E testing
Key Takeaway
browser-use performs well for this task, but the workflow requires optimization.
The complete source code is available in this repository:
GitHub - browser-use E2E test automation
Strategy
I initially expected to fully automate E2E testing with a single prompt like "Test this site with E2E!" However, the process required structured steps:
- Extract the site structure using browser-use
- Generate test scenarios for each page
- Review and refine the generated scenarios manually
- Generate test code based on the reviewed scenarios
- Execute the generated test code
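These five steps can be wired together as a small driver. The sketch below passes the stages in as callables; the stage names (`extract_structure`, `generate_scenarios`, and so on) are my own placeholders for the browser-use and LLM calls, not part of any library:

```python
from typing import Callable

def run_pipeline(
    url: str,
    extract_structure: Callable[[str], list[dict]],
    generate_scenarios: Callable[[dict], str],
    review: Callable[[str], str],
    generate_test_code: Callable[[str], str],
) -> list[str]:
    """Return one generated test-code string per page of the site."""
    pages = extract_structure(url)           # Step 1: site map via browser-use
    tests = []
    for page in pages:                       # pages are handled one at a time
        scenario = generate_scenarios(page)  # Step 2: natural-language scenarios
        scenario = review(scenario)          # Step 3: manual review hook
        tests.append(generate_test_code(scenario))  # Step 4: test code
    return tests                             # Step 5: hand off to the runner
```

Injecting the stages as parameters keeps the orchestration testable without network access or API keys.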
Step 1: Extracting the Site Structure
The first step was extracting the site's structure with browser-use to create a list of pages for testing.
Processing the entire site at once did not work effectively, likely due to LLM processing constraints. Handling each page separately improved stability.
Prompt Example
site_structure_task = f"""
Analyze the website starting from {url}. Identify and output:
1. All accessible pages and subpages within the domain ({url}), including dynamically loaded content.
2. Each page's purpose in concise terms.
3. Include:
- Static links
- JavaScript-driven links
- Form submissions
- API endpoints (if visible)
4. Group similar structured pages (e.g., query parameters like ?id=).
## Output JSON Format:
[
{{ "path": "<path or URL>", "purpose": "<brief description>" }}
...
]
## Login Information
- id: {user_id}
- password: {password}
"""
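The agent's answer comes back as free text, so the JSON array has to be extracted before Step 2 can iterate over pages. A minimal sketch (the fence-stripping regex and the `parse_site_structure` name are mine, not part of browser-use):

```python
import json
import re

def parse_site_structure(raw: str) -> list[dict]:
    """Pull the JSON page list out of the agent's raw answer.

    Tolerates a ```json ... ``` fence around the payload, a common LLM habit.
    """
    match = re.search(r"\[.*\]", raw, re.DOTALL)  # outermost JSON array
    if not match:
        raise ValueError("no JSON array found in agent output")
    pages = json.loads(match.group(0))
    # Keep only well-formed entries so one malformed item doesn't sink the run.
    return [p for p in pages if isinstance(p, dict) and "path" in p and "purpose" in p]
```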
Step 2: Generating Test Scenarios
Once the site structure was extracted, test scenarios were generated for each page. By generating scenarios in natural language first, I achieved:
- Better manual review
- More stable test code generation
Prompt Example
Since I read Japanese more accurately than English, I set scenario_language to Japanese.
scenario_task = f"""
Generate exhaustive test scenarios for the following page:
- Page: {page_path}
Purpose: {page_purpose}
For this page, include all possible user actions, such as:
- Form submissions
- Button clicks
- Dropdown selections
- Interactions with modals or dynamic elements
Test both expected behaviors and edge cases for each action.
Output format:
path: {page_path},
actions:
- test: <description of action>,
expect: <expected result>,
- test: <description of action>,
expect: <expected result>,
The output must be written in {scenario_language}.
## Root URL
{url}
## Login Information
- id: {user_id}
- password: {password}
"""
Sample Output
path: /,
actions:
- test: Enter correct username and password, then click login,
expect: Redirect to user dashboard,
- test: Leave username blank and attempt login,
expect: Show error message,
- test: Enter invalid username and attempt login,
expect: Show error message,
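For downstream processing, this YAML-like output can be parsed into structured records. A minimal sketch assuming the exact format shown above (the `parse_scenarios` helper is hypothetical):

```python
def parse_scenarios(raw: str) -> dict:
    """Parse the scenario listing into {'path': ..., 'actions': [{'test', 'expect'}, ...]}."""
    result = {"path": None, "actions": []}
    for line in raw.splitlines():
        line = line.strip().rstrip(",")  # drop the trailing commas in the output
        if line.startswith("path:"):
            result["path"] = line[len("path:"):].strip()
        elif line.startswith("- test:"):
            result["actions"].append({"test": line[len("- test:"):].strip(), "expect": ""})
        elif line.startswith("expect:") and result["actions"]:
            result["actions"][-1]["expect"] = line[len("expect:"):].strip()
    return result
```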
Step 3: Generating Test Code
Using the reviewed scenarios, Jest and Playwright-based test code was generated. Although running tests directly via LLM is possible, generating structured test code is more reliable and cost-effective.
Prompt Example
task = f"""
Generate Jest + Playwright test code for:
- URL: {url}
- Scenario: {scenario}
Ensure the output is fully executable without modification.
"""
Generated Test Code Example
const { test, expect } = require('@playwright/test');
test.describe('Login Tests', () => {
test('Valid login', async ({ page }) => {
await page.goto('https://www.saucedemo.com/');
await page.fill('input[name="user-name"]', 'standard_user');
await page.fill('input[name="password"]', 'secret_sauce');
await page.click('input[name="login-button"]');
await expect(page).toHaveURL('https://www.saucedemo.com/inventory.html');
});
});
For pages that require login, the generated file also correctly includes a beforeEach hook that signs in before each test:
const { test, expect } = require('@playwright/test');
test.describe('Checkout Step One Tests', () => {
test.beforeEach(async ({ page }) => {
await page.goto('https://www.saucedemo.com/');
await page.fill('#user-name', 'standard_user');
await page.fill('#password', 'secret_sauce');
await page.click('#login-button');
await page.goto('https://www.saucedemo.com/checkout-step-one.html');
});
test('User fills in all fields correctly and clicks the Continue button', async ({ page }) => {
await page.fill('#first-name', 'John');
await page.fill('#last-name', 'Doe');
await page.fill('#postal-code', '12345');
await page.click('#continue');
await expect(page).toHaveURL('https://www.saucedemo.com/checkout-step-two.html');
});
});
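Before execution, each generated snippet has to land on disk where the runner can find it. A sketch of that glue code (the `write_test_file` helper and its naming scheme are my own):

```python
from pathlib import Path

def write_test_file(out_dir: Path, page_path: str, code: str) -> Path:
    """Save one generated test as <slug>.spec.js so the test runner picks it up."""
    slug = page_path.strip("/").replace("/", "_") or "root"
    code = code.strip()
    # Strip a markdown fence if the model wrapped the code in one.
    if code.startswith("```"):
        code = "\n".join(code.splitlines()[1:-1])
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{slug}.spec.js"
    target.write_text(code, encoding="utf-8")
    return target
```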
Test Execution & Results
Executing the generated tests produced 44 test runs with 18 failures. However, many of the failures were reasonable:
- Incorrect Expectations (5 tests)
- Example: Expected an error when entering numbers in the name field, but the site allowed it.
- The test scenarios were generated based on ideal behavior, but the actual behavior of the test site differed.
- This is not an issue with the test code itself but rather a mismatch between the expected functionality and the actual implementation of the site.
- Test Runner Mismatch (12 tests)
- The tests assumed Playwright’s runner, but Jest Circus was used, causing failures. This can be fixed by specifying the correct runner in the prompt.
Cost Analysis
Using OpenAI’s GPT-4o API, my runs during development totaled about $7. With optimized prompts, however, one full pipeline pass (site structure analysis → scenario generation → test code generation) costs under $1.
Conclusion
Manual E2E test writing is becoming obsolete. For now, though, programming languages remain the most precise way to give instructions to AI.

Takuto Yuki | Sciencx (2025-01-29T02:14:37+00:00) Automating 44 E2E Tests with AI-Powered Browser Control for Under $1. Retrieved from https://www.scien.cx/2025/01/29/automating-44-e2e-tests-with-ai-powered-browser-control-for-under-1/