In this demo, Pentest Copilot’s AI drives a browser, auto-discovers routes, captures real requests, generates targeted XSS test cases, executes them, and writes a verified finding back to the Exploit Graph. No external “agent” is required; everything happens inside Copilot.
Capture → Browse → Generate → Execute → Triage.
Copilot’s AI opens the target app (OWASP Juice Shop in the demo), browses it like a user, and captures every request/response. From those interactions it creates test cases, small scripts, and payloads tailored to the context it observed (for example, query parameters rendered into HTML). It then runs those tests in a controlled browser session. When JavaScript executes (alert()) or the page visibly changes at the injection point, Copilot records the evidence and attaches a Vulnerability node to the exact Action that is exploitable.
Kickoff: Submodule Testing → ACTV_CRAWLER
You pick the crawler in the Submodule Testing panel and target juice.bugbase.ai. The Exploit Graph starts to populate as Copilot navigates the app and captures requests (method, path, parameters).
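To make the "method, path, parameters" capture concrete, here is a minimal Python sketch of how a captured request could be reduced to an Action record. The `Action` class and `action_from_request` helper are hypothetical names for illustration; they are not Copilot's actual schema or API.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit, parse_qs


@dataclass(frozen=True)
class Action:
    """Hypothetical record for one captured request (not Copilot's real schema)."""
    method: str
    path: str
    params: tuple  # parameter names, sorted


def action_from_request(method: str, url: str) -> Action:
    """Reduce a captured request to method, path, and parameter names."""
    parts = urlsplit(url)
    path, query = parts.path, parts.query
    # Hash-routed apps keep the route and its query string in the fragment.
    if parts.fragment.startswith("/"):
        frag = urlsplit(parts.fragment)
        path, query = frag.path, frag.query
    names = tuple(sorted(parse_qs(query, keep_blank_values=True)))
    return Action(method.upper(), path, names)


action = action_from_request("get", "https://juice.bugbase.ai/#/search?q=apple")
print(action)
```

Keeping only parameter names (not values) means repeated visits to the same route collapse into one Action, which is one plausible way to keep a graph from filling with duplicates.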
Test authoring: ACTV_TEST_GENERATOR
Copilot takes discovered Actions (for example, GET /search?q=) and generates test cases for inputs likely to be reflected or sent to DOM sinks. In the graph/UI you see Action and TestCaseSet nodes grow.
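As a rough illustration of test authoring, the sketch below pairs each parameter of a discovered Action with each payload. The `XssTestCase` shape and `generate_test_cases` function are assumptions made for this example; Copilot's real generator is context-aware and is not shown in the demo at this level of detail.

```python
from dataclasses import dataclass

# The two payloads named in this write-up.
PAYLOADS = [
    "\"><script>alert('XSS')</script>",
    "<img src=x onerror=alert(1)>",
]


@dataclass(frozen=True)
class XssTestCase:
    """Hypothetical test-case shape: one payload aimed at one parameter."""
    method: str
    path: str
    param: str
    payload: str


def generate_test_cases(method, path, params, payloads=PAYLOADS):
    """Pair every discovered parameter with every payload."""
    return [XssTestCase(method, path, p, pl) for p in params for pl in payloads]


cases = generate_test_cases("GET", "/search", ["q"])
for case in cases:
    print(case.method, case.path, case.param, case.payload)
```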
Execution: ACTV_TESTER
The runner executes each test in a controlled browser session. The demo shows dual browser windows where payloads are run through the search feature; address bars display encoded payloads during these runs.
Finding: XSS on /search?q=…
Reflected input is returned into the page without proper encoding. Payloads exercised include classic variants:
"><script>alert('XSS')</script>
<img src=x onerror=alert(1)>
The UI then reports: “Exploit graph updated with found XSS vulnerability.” Submodule Activity shows runs moving to Completed with a positive finding count.
Result linking in the graph
The discovered Vulnerability (XSS) node is linked to the specific Action (for example, GET …/#/search?q=), so the fix is traceable to a route and parameter.
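One way to picture this linking is a small in-memory graph where Vulnerability nodes hang off the Action they were confirmed on. This is a sketch only: `ExploitGraph`, `ActionNode`, and `VulnerabilityNode` are hypothetical stand-ins, and the evidence string is a made-up example, not output from the tool.

```python
from dataclasses import dataclass, field


@dataclass
class VulnerabilityNode:
    kind: str
    param: str
    evidence: str


@dataclass
class ActionNode:
    method: str
    route: str
    vulnerabilities: list = field(default_factory=list)


class ExploitGraph:
    """Hypothetical stand-in for the graph: Actions keyed by method and route."""

    def __init__(self):
        self.actions = {}

    def add_action(self, method, route):
        # Reuse the existing node if this Action was already discovered.
        return self.actions.setdefault((method, route), ActionNode(method, route))

    def attach(self, method, route, vuln):
        self.add_action(method, route).vulnerabilities.append(vuln)


graph = ExploitGraph()
graph.attach("GET", "/search",
             VulnerabilityNode("XSS", "q", "illustrative evidence: alert(1) executed"))

for (method, route), node in graph.actions.items():
    for v in node.vulnerabilities:
        print(f"{v.kind} on {method} {route} (parameter {v.param})")
```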
Use this only on assets you own or are authorized to test (the demo uses OWASP Juice Shop).
Script tag variant (URL-encoded):
https://juice.bugbase.ai/#/search?q=%22%3E%3Cscript%3Ealert(1)%3C/script%3E
<img onerror> variant:
https://juice.bugbase.ai/#/search?q=%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E
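If you want to derive these encoded URLs yourself, a short Python sketch using the standard library's `urllib.parse.quote` reproduces them. The `encoded_url` helper is a name chosen for this example.

```python
from urllib.parse import quote

BASE = "https://juice.bugbase.ai/#/search?q="


def encoded_url(payload: str) -> str:
    # Keep "/", "(" and ")" literal so the result matches the URLs above;
    # by default quote() would encode the parentheses as %28 and %29.
    return BASE + quote(payload, safe="/()")


print(encoded_url('"><script>alert(1)</script>'))
print(encoded_url("<img src=x onerror=alert(1)>"))
```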
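Peek at reflection in raw output. Note that everything after # is a URL fragment, which HTTP clients such as curl do not send to the server, so for this hash-routed URL the response is only the application shell and the reflection itself has to be observed in a browser: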
curl -s "https://juice.bugbase.ai/#/search?q=%22%3E%3Cscript%3Ealert(1)%3C/script%3E"
If the value of q is reflected without encoding, a browser run will execute the payload; Copilot logs the finding against that Action.
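A simple way to reason about "reflected without encoding" is to classify how a payload appears in a piece of markup, whether that markup is a server response or the serialized DOM from a browser session. The `reflection_state` function below is an illustrative sketch, not Copilot's detection logic, and a "raw" result is only a hint: actual execution in a browser remains the real proof.

```python
import html


def reflection_state(body: str, payload: str) -> str:
    """Classify how a payload shows up in markup: raw, encoded, or absent."""
    if payload in body:
        return "raw"  # reflected verbatim, worth testing in a browser
    if html.escape(payload, quote=True) in body:
        return "encoded"  # reflected, but HTML-encoded on output
    return "absent"


payload = "<img src=x onerror=alert(1)>"
vulnerable = f"<div>Results for {payload}</div>"
patched = f"<div>Results for {html.escape(payload, quote=True)}</div>"

print(reflection_state(vulnerable, payload))
print(reflection_state(patched, payload))
print(reflection_state("<div>No results</div>", payload))
```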
How to fix it (for example, GET /search?q=)
Encode on output, by context: in HTML contexts, escape & < > " ' /
Prefer safe DOM APIs: avoid innerHTML, document.write, and inline on* handlers; use text/node setters and event listeners
Sanitize any permitted HTML
Adopt a protective CSP early
Regression-proof the fix
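As a minimal sketch of output encoding plus a regression check, the Python below encodes q for an HTML text context and asserts that the demo payloads no longer produce live tags. The `render_search_heading` helper is hypothetical, and this covers server-side templating only; in a client-rendered app the equivalent fix is a text setter rather than an HTML setter.

```python
import html

PAYLOADS = [
    "\"><script>alert('XSS')</script>",
    "<img src=x onerror=alert(1)>",
]


def render_search_heading(q: str) -> str:
    """Hypothetical template helper: encode q for an HTML text context."""
    # html.escape covers & < > and, with quote=True, " and '.
    # It does not touch "/", so add that yourself if your policy requires it.
    return f"<h2>Search results for {html.escape(q, quote=True)}</h2>"


# Regression check: no payload should survive as a live tag.
for p in PAYLOADS:
    out = render_search_heading(p)
    assert "<script" not in out and "<img" not in out
    print(out)
```

Keeping the payload list next to the check makes it easy to re-run the same inputs after every change to the rendering code.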
1) Is an external agent required?
No. Everything runs within Pentest Copilot: browsing, capture, test generation, execution, and triage.
2) How is this different from a traditional scanner?
Copilot derives tests from real app interactions and validates in a live browser session, reporting only when execution or evidence is observed.
3) Can it detect DOM-only XSS?
Yes. Since verification happens in the browser, DOM-driven sinks (like values written by client-side code) are exercised and confirmed.
4) Will it blast payloads everywhere?
No. It focuses on discovered Actions and uses small, targeted payloads first, escalating only when needed.
5) How do I confirm the fix?
Re-run the same TestCaseSet against your patched build; Copilot will mark the finding resolved when execution no longer occurs.
Watch the full demo: https://youtu.be/DySyVlCgRjU