
Stripe environment

Five tasks, each with a model prompt, a seeded environment state, and a grader contract.
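As a rough illustration of how a prompt, seeded state, and grader might fit together, here is a minimal sketch. The `Task` class, its field names, and the example email addresses are all hypothetical. They are not taken from the environment's actual schema.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical structure: the field names below are assumptions made for
# illustration, not the environment's real task format.
@dataclass(frozen=True)
class Task:
    title: str
    prompt: str                                # instruction shown to the model
    seed_state: dict[str, Any]                 # state the environment starts in
    grader: Callable[[dict[str, Any]], bool]   # inspects the final state


def run_grader(task: Task, final_state: dict[str, Any]) -> bool:
    """Apply the task's grader to the state left behind by the agent."""
    return task.grader(final_state)


# Placeholder addresses; the real target address would come from the brief.
example = Task(
    title="Update a billing email",
    prompt="Open the customer profile and update the billing email "
           "to the address from the brief.",
    seed_state={"billing_email": "old@example.com"},
    grader=lambda state: state.get("billing_email") == "new@example.com",
)
```

Keeping the grader as a plain function over the final state means it can be tested without driving the UI.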

Task 1

Investigate a failed payment

Prompt
Find the failed payment from the brief and copy the failure reason into the case note.
Environment
The payments list includes several failed payments with different decline reasons.
Grader
Checks payment ID and case note text.
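A check on payment ID and case note text could look roughly like the sketch below. The function name, the dictionary keys, and the IDs and decline reasons are assumptions made for illustration. The actual grader may match the note differently, for example by exact string rather than by substring.

```python
def grade_failed_payment(case: dict, expected_payment_id: str,
                         expected_reason: str) -> bool:
    """Pass only if the case cites the target payment and its decline reason.

    Hypothetical contract: `case` holds the payment the agent selected and
    the free-text note it wrote.
    """
    note = case.get("note", "").strip().lower()
    return (
        case.get("payment_id") == expected_payment_id
        and expected_reason.strip().lower() in note
    )


# Made-up values: the seeded list contains other failed payments with
# different decline reasons, so both fields have to match.
good = {"payment_id": "pay_target", "note": "Declined: insufficient_funds"}
wrong_payment = {"payment_id": "pay_other", "note": "Declined: insufficient_funds"}
wrong_reason = {"payment_id": "pay_target", "note": "Declined: expired_card"}
```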
Task 2

Issue a targeted refund

Prompt
Refund the exact charge amount listed in the brief and leave the required refund reason.
Environment
The customer has multiple charges, including the target charge.
Grader
Checks charge ID, refund amount, and reason.
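The three checks named in the grader line can be sketched as a single predicate. The key names, IDs, and amounts here are hypothetical. Amounts are assumed to be integers in the smallest currency unit, so the comparison is exact rather than subject to floating-point rounding.

```python
def grade_refund(refund: dict, expected_charge_id: str,
                 expected_amount: int, expected_reason: str) -> bool:
    """Pass only if the refund targets the right charge, for the exact
    amount, with the required reason. Key names are assumptions."""
    return (
        refund.get("charge_id") == expected_charge_id
        and refund.get("amount") == expected_amount   # integer minor units
        and refund.get("reason") == expected_reason
    )


# Made-up values: the customer has several charges, so refunding a
# neighboring charge or a different amount should both fail.
correct = {"charge_id": "ch_target", "amount": 2500,
           "reason": "requested_by_customer"}
wrong_charge = {**correct, "charge_id": "ch_other"}
wrong_amount = {**correct, "amount": 2000}
```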
Task 3

Export a payout report

Prompt
Filter payouts to the previous week and export the payout report.
Environment
The payouts page defaults to all recent payouts.
Grader
Checks date range and exported report.
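The source does not define "previous week". The sketch below assumes a Monday-to-Sunday calendar week immediately before the current one. Under a different convention, such as Sunday-start weeks or a rolling seven days, the bounds would differ. The `export` dictionary and its keys are also hypothetical.

```python
from datetime import date, timedelta


def previous_week(today: date) -> tuple[date, date]:
    """Return (Monday, Sunday) of the calendar week before `today`'s week.

    Assumes weeks start on Monday, the convention date.weekday() uses.
    """
    this_monday = today - timedelta(days=today.weekday())
    return this_monday - timedelta(days=7), this_monday - timedelta(days=1)


def grade_payout_export(export: dict, today: date) -> bool:
    """Pass only if a report was exported with the previous-week range."""
    start, end = previous_week(today)
    return (
        export.get("exported") is True
        and export.get("start") == start
        and export.get("end") == end
    )
```

Because the page defaults to all recent payouts, an export made without touching the filter would fail the date-range check under this sketch.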
Task 4

Update a billing email

Prompt
Open the customer profile and update the billing email to the address from the brief.
Environment
The customer profile contains an outdated billing email.
Grader
Checks the billing email and that primary account fields are preserved.
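One way to check the new email while guarding against collateral edits is to compare a before-and-after snapshot of the profile. This is a sketch under assumptions: the source does not say which fields count as primary account fields, so the caller passes them in, and the field names and values below are placeholders.

```python
def grade_billing_email(before: dict, after: dict, expected_email: str,
                        protected_fields: tuple[str, ...]) -> bool:
    """Pass only if the billing email matches and every protected field
    is unchanged between the two snapshots. Key names are assumptions."""
    if after.get("billing_email") != expected_email:
        return False
    return all(before.get(f) == after.get(f) for f in protected_fields)


# Made-up profile: the seeded customer starts with an outdated billing email.
before = {"name": "Ada Example", "email": "ada@example.com",
          "billing_email": "old-billing@example.com"}
good_after = {**before, "billing_email": "billing@example.com"}
# Overwriting the primary email along the way should fail the check.
bad_after = {**good_after, "email": "billing@example.com"}
```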
Task 5

Create a payment link

Prompt
Create a payment link for the seeded product and set the quantity limit from the brief.
Environment
The product exists and no active link has the requested limit.
Grader
Checks product, quantity limit, and active link creation.
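Since the seeded state has no active link with the requested limit, a grader can look for a newly created link that satisfies all three conditions. The sketch below assumes links are plain dictionaries with hypothetical `id`, `active`, `product_id`, and `quantity_limit` keys. It is not the real payment link object.

```python
def grade_payment_link(seeded_ids: set[str], links: list[dict],
                       product_id: str, quantity_limit: int) -> bool:
    """Pass if some link created after seeding is active, points at the
    seeded product, and carries the requested quantity limit."""
    return any(
        link.get("id") not in seeded_ids          # created during the episode
        and link.get("active") is True
        and link.get("product_id") == product_id
        and link.get("quantity_limit") == quantity_limit
        for link in links
    )


# Made-up data: one pre-existing link with a different limit, one new link.
seeded = {"plink_old"}
links = [
    {"id": "plink_old", "active": True, "product_id": "prod_1", "quantity_limit": 10},
    {"id": "plink_new", "active": True, "product_id": "prod_1", "quantity_limit": 3},
]
```

Excluding seeded IDs keeps an agent from passing by editing an existing link instead of creating one. Whether the real grader enforces that distinction is not stated in the source.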
UseDesktop Evals

Computer-use agent evals.
