Live Case Demos

Across 9 demos, BrowserBC reduces actions by 53% and tokens by 28% on average. Left: base run. Right: with BrowserBC skill.

Find 2022 PISA mathematics scores for Japan, Korea, and the US

Goal

Find the PISA {year} mathematics mean score (indicator PEE_PISA_M) for {countries} from the OECD Data Explorer and report the value(s).

Preconditions

  • No authentication required.
  • The task start page is https://data-explorer.oecd.org/.

Strategy (read this first — do NOT skip)

The PISA mathematics mean score is stored in the OECD SDMX dataflow DSD_GOV_INT@DF_GOV_SPS_2025 under the agency OECD.GOV.GIP, indicator code PEE_PISA_M. The fastest and most reliable path is to construct and navigate to the direct /vis URL with the correct dimension query (dq) that filters by country ISO-3 codes and the indicator, then read the table. The table view (vw=tb) shows one row per time period and one column per country. All required values are visible directly in the table — no further drill-in is needed.

Steps

  1. On the task start page (OECD Data Explorer home), do NOT interact with the search box. Instead, navigate directly to the following URL, substituting the ISO-3 country codes for {country_codes} (e.g. JPN%2BKOR%2BUSA for Japan, Korea, USA):

https://data-explorer.oecd.org/vis?df[ds]=dsDisseminateFinalDMZ&df[id]=DSD_GOV_INT%40DF_GOV_SPS_2025&df[ag]=OECD.GOV.GIP&df[vs]=1.1&dq=A.{country_codes}.PEE_PISA_M......&lom=LASTNPERIODS&lo=5&to[TIME_PERIOD]=false&vw=tb

For the default task (Japan / Korea / USA), the ready-to-use URL is: https://data-explorer.oecd.org/vis?df[ds]=dsDisseminateFinalDMZ&df[id]=DSD_GOV_INT%40DF_GOV_SPS_2025&df[ag]=OECD.GOV.GIP&df[vs]=1.1&dq=A.JPN%2BKOR%2BUSA.PEE_PISA_M......&lom=LASTNPERIODS&lo=5&to[TIME_PERIOD]=false&vw=tb

  2. Wait for the table to fully render. Verify the filter took effect by confirming:
    • The page title or dataset label references "Government at a Glance" or DSD_GOV_INT / DF_GOV_SPS_2025.
    • The table has columns for each requested country (e.g. Japan, Korea, United States).
    • The TIME_PERIOD column lists years including {year}.

  3. In the table, locate the row where TIME_PERIOD equals {year} (e.g. 2022).

  4. For each requested country, read the numeric value in the column corresponding to that country in the {year} row. Record these values as your candidate ledger:
    • Japan (JPN): <value>
    • Korea (KOR): <value>
    • United States (USA): <value>

  5. Cross-check: confirm each value has a three-digit integer part in the range 350–650 (typical PISA math scores); decimals such as 535.58 are expected. If a cell is blank or shows .., the data is unavailable for that country/year; report N/A for that entry.

  6. Stop and answer with the values in the format: JP <value> KR <value> US <value>.
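
The URL construction in step 1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the function name oecd_pisa_url is hypothetical, and the parameter values are copied verbatim from the template above.

```python
BASE = "https://data-explorer.oecd.org/vis"

def oecd_pisa_url(country_codes):
    """Build the /vis table URL for a list of ISO-3 country codes (hypothetical helper)."""
    # Countries are joined with an encoded '+' (%2B), as in the skill's example URL.
    joined = "%2B".join(code.upper() for code in country_codes)
    params = [
        ("df[ds]", "dsDisseminateFinalDMZ"),
        ("df[id]", "DSD_GOV_INT%40DF_GOV_SPS_2025"),
        ("df[ag]", "OECD.GOV.GIP"),
        ("df[vs]", "1.1"),
        ("dq", f"A.{joined}.PEE_PISA_M......"),
        ("lom", "LASTNPERIODS"),
        ("lo", "5"),
        ("to[TIME_PERIOD]", "false"),
        ("vw", "tb"),
    ]
    # Values are already encoded where needed, so they are joined without re-encoding.
    return BASE + "?" + "&".join(f"{key}={value}" for key, value in params)

print(oecd_pisa_url(["JPN", "KOR", "USA"]))
```

Keeping the parameters as an ordered list preserves the exact order of the template, so the result can be compared character by character with the address bar.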

Hard Stop Rule

  • Navigate to the direct /vis URL at most once; do NOT reload or re-search if the table renders.
  • Do NOT click into individual data cells or drill down further; all required values are visible in the top-level table.
  • If the table does not render after one navigation attempt, try removing &to[TIME_PERIOD]=false from the URL and reload once. If still broken, answer with whatever partial values are visible.
  • If zero rows match {year}, report the most recent available year's values and note the year used.

Stop Rule

The task is complete when you have read and recorded the numeric PISA mathematics mean score for every requested country from the {year} row of the table.

UI Control and Filter Guardrails

  • Prefer the direct /vis URL over any search-box or dropdown workflow; it bypasses unreliable date-picker and facet-filter controls.
  • Verify the URL loaded correctly by checking the browser address bar contains PEE_PISA_M and the ISO-3 codes for the requested countries.
  • If the dataset selector shows a different df[id], the wrong dataflow loaded — navigate again with the exact URL above.
  • Do not attempt to change filters via the UI sidebar; the query string already encodes all required filters.
  • If the table shows more columns than the requested countries, ignore the extra columns and read only the target country columns.

Extraction and Verification Guardrails

  • Candidate ledger fields: Country (ISO-3), TIME_PERIOD (year), Score (numeric, 2 decimal places).
  • Evidence required before recording a value: the row's TIME_PERIOD cell must equal {year} AND the column header must match the requested country name or ISO-3 code.
  • For numeric answers, read the cell value exactly as displayed (do not round). If the table shows 535.58, record 535.58.
  • Recompute check: after reading all values, verify the count of non-null values equals the count of requested countries. If any are missing, scroll down at most once to see if additional rows are present before reporting N/A.
  • Final answer format: list each country's score in the order requested, e.g. JP 535.58 KR 527.30 US 464.89.
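
The ledger checks above could be expressed roughly as follows. This is a sketch under stated assumptions: check_score, format_answer, and the LABELS mapping from ISO-3 codes to two-letter labels are hypothetical names introduced only for illustration, and the sample values are the ones used in the format example above.

```python
# Hypothetical mapping from ISO-3 codes to the two-letter labels used in the answer.
LABELS = {"JPN": "JP", "KOR": "KR", "USA": "US"}

def check_score(raw):
    """Return the cell text as displayed, or 'N/A' for blank or '..' cells."""
    text = raw.strip()
    if text in ("", ".."):
        return "N/A"
    value = float(text)
    if not 350 <= value <= 650:
        raise ValueError(f"implausible PISA score: {text}")
    return text  # keep the displayed text unchanged; do not round

def format_answer(ledger):
    """ledger is a list of (iso3, cell_text) pairs in the requested order."""
    return " ".join(f"{LABELS[code]} {check_score(cell)}" for code, cell in ledger)

print(format_answer([("JPN", "535.58"), ("KOR", "527.30"), ("USA", "464.89")]))
```

Returning the original text rather than the parsed float keeps trailing zeros such as 527.30 intact.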

Find Texas total nonfarm employment from 2020 to 2024 and download CSV

Goal

Identify the FRED series ID for {search_query} (e.g. Texas Total Nonfarm Employment, seasonally adjusted), record it, set the date range from {start_date} to {end_date}, and download the series data as a CSV file.

Preconditions

  • Browser is open at https://fred.stlouisfed.org/
  • No authentication required

Strategy (read this first — do NOT skip)

FRED exposes a direct CSV download URL of the form /graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date}. Once you have identified the series ID from the search results page or series detail page URL, you can construct and navigate to this URL directly — bypassing all date-picker interactions — to both confirm the answer and trigger the download. The series ID is always visible in the /series/{series_id} URL after clicking a result. The date range fields on the series page are plain text inputs; clear them fully before typing.

Steps

  1. On the FRED homepage, locate the search input labeled "Search FRED Data..." and type "{search_query}" (e.g. "Texas Total Nonfarm Employment seasonally adjusted"), then press Enter.
  2. On the /searchresults page, scan the result links for the entry whose title most closely matches the desired indicator and region (e.g. "All Employees: Total Nonfarm in Texas"). Click that link.
  3. On the /series/{series_id} page, read the series ID from the page URL (the segment after /series/). Record it — this is the answer's series identifier (e.g. TXNA).
  4. Locate the "Change start date" input field. Click it, press Meta+a (or Ctrl+a) to select all, press Backspace to clear, then type "{start_date}" (format YYYY-MM-DD), then press Enter.
  5. Locate the "Change end date" input field. Click it, press Meta+a (or Ctrl+a) to select all, press Backspace to clear, then type "{end_date}" (format YYYY-MM-DD), then press Enter.
  6. Verify the date range was applied: the start date field should now display {start_date} and the end date field should display {end_date}. If either field still shows the old value, repeat the clear-and-type step once more for that field only.
  7. Preferred shortcut: Navigate directly to https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date}. This both downloads the CSV and confirms the series ID and date parameters in the URL.
  8. If the direct URL navigation is not possible, click the "Download" button on the series page, then click the "CSV (data)" link in the dropdown that appears.
  9. Stop and report: the series ID (e.g. TXNA) and the CSV download URL with query parameters cosd={start_date}&coed={end_date}.
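
The direct CSV URL from step 7 can be assembled as in the sketch below. The function name fred_csv_url is hypothetical, and the dates in the usage line are illustrative values for a 2020 to 2024 range, not values taken from the skill.

```python
from datetime import datetime

def fred_csv_url(series_id, start_date, end_date):
    """Build the fredgraph.csv URL (hypothetical helper)."""
    # strptime raises ValueError if a date cannot be parsed as year-month-day.
    start = datetime.strptime(start_date, "%Y-%m-%d")
    end = datetime.strptime(end_date, "%Y-%m-%d")
    if start > end:
        raise ValueError("start_date must not be after end_date")
    return (
        "https://fred.stlouisfed.org/graph/fredgraph.csv"
        f"?id={series_id}&cosd={start_date}&coed={end_date}"
    )

print(fred_csv_url("TXNA", "2020-01-01", "2024-12-31"))
```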

Hard Stop Rule

  • Search results may return multiple series; click only the single best-matching result. Do not iterate through multiple series.
  • If no search result matches the desired indicator and region, refine the search query by removing words (e.g. drop "seasonally adjusted") and try once more. If still no match after two attempts, stop and report "not found".
  • Attempt to clear and set each date field at most twice. If the field still shows the wrong value after two attempts, proceed with the direct CSV URL shortcut using the correct date parameters.
  • Do not re-open the search results page once you have identified the series ID from the /series/ URL.

Stop Rule

The task is complete when: (a) the series ID has been read from the /series/{series_id} URL, and (b) the browser has navigated to or downloaded from /graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date}.

UI Control and Filter Guardrails

  • The date inputs on FRED series pages often contain a pre-filled default date. Always clear the entire field (Meta+a then Backspace) before typing the new date to avoid appending to the existing value.
  • After pressing Enter on a date field, verify the input now displays the intended date. If it reverted to the default, the Enter key may not have committed the change — try clicking elsewhere on the page and then re-read the field.
  • The most reliable method is the direct CSV URL: https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date}. Prefer this over manual date-picker steps whenever the series ID is known.
  • Proof that the date filter was applied: the CSV URL query string contains the exact cosd and coed values you specified.

Extraction and Verification Guardrails

  • The series ID is the path segment in the URL /series/XXXX — read it character-by-character if needed; do not guess or paraphrase it.
  • The final answer must include: (1) the literal series ID string, (2) the CSV download URL or confirmation that cosd={start_date}&coed={end_date} are present in the downloaded file's URL.
  • Do not report a series ID from the search results page title alone; confirm it from the /series/ URL after clicking through.
  • If multiple series appear plausible, prefer the one whose title contains both the region name and the exact indicator phrase from the task.
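
Reading the series ID from the URL, rather than from a title, might look like the following. It is a minimal sketch; series_id_from_url is a hypothetical helper name.

```python
from urllib.parse import urlparse

def series_id_from_url(url):
    """Return the path segment after /series/ (hypothetical helper)."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    if len(segments) >= 2 and segments[0] == "series":
        return segments[1]
    raise ValueError(f"not a /series/ URL: {url}")

print(series_id_from_url("https://fred.stlouisfed.org/series/TXNA"))
```

Raising on any URL that is not a series page enforces the rule that an ID seen only on the search results page is not accepted.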

Find Brazil renewable energy per capita from 2000 to 2023 and get the CSV path

Goal

Find the per-capita renewable energy consumption value for {country} in {target_year} from the Our World in Data grapher at slug {slug}, and confirm the CSV download URL.

Preconditions

  • No login is required; the site is public.
  • The grapher slug {slug} is known (see Strategy for how to discover it via search if not).

Strategy (read this first — do NOT skip)

The grapher page /grapher/{slug} holds all data needed. After filtering to {country} and switching to the Table view, the year-by-year values are visible directly in the table — no drill-in is required. The CSV download URL is always https://ourworldindata.org/grapher/{slug}.csv and is confirmed by opening the Download panel. The slug itself is the path segment after /grapher/ in the browser URL.

Credentials

None required (public site).

Steps

  1. On the task start page (ourworldindata.org homepage), if a cookie-consent banner is present, click Reject optional cookies to dismiss it.
  2. Navigate directly to https://ourworldindata.org/grapher/{slug}. (If the slug is unknown: type {search_term} into the homepage search input, press Enter, click Submit search, then identify the grapher link whose title matches the indicator and read its URL path to extract {slug}.)
  3. On the grapher page, locate the input labelled Search for a country or region. Type {country} and press Enter. Confirm the country name appears as a selected series label on the chart or legend.
  4. Click the Table tab (chart-view toggle, labelled "Table"). The view switches to a year-indexed table showing values for {country}.
  5. Scan the table rows to find the row whose Year column equals {target_year}. Read the numeric value in that row — this is the per-capita figure (in kWh or the indicator's unit).
  6. Click the Download button (toolbar of the grapher). The download panel opens and displays a CSV link. Confirm the link follows the pattern https://ourworldindata.org/grapher/{slug}.csv.
  7. Stop and answer with: (a) the slug, (b) the {target_year} value for {country}, and (c) the CSV URL.

Hard Stop Rule

  • There is exactly one grapher page to visit; do not open multiple grapher URLs.
  • If the Table tab shows no row for {target_year}, report the most recent year available and note the discrepancy.
  • If the country search returns no match, try the official English country name variant (e.g. "Brazil" not "Brasil").
  • If you have attempted the country search twice without a visible series, stop and report what years and countries are visible.

Stop Rule

Stop as soon as you have read the {target_year} value from the Table view and confirmed the CSV URL from the Download panel.

UI Control and Filter Guardrails

  • Prefer navigating directly to https://ourworldindata.org/grapher/{slug} over searching, because the search flow requires multiple clicks and the slug is predictable.
  • Verify the country filter took effect by checking that the legend or chart title includes {country} before switching to Table view.
  • Verify the Table tab is active by confirming that a year-indexed grid (not a map or line chart) is now visible.
  • If the Download panel does not open after one click on Download, click once more; do not click a third time — instead read the URL from the browser address bar and append .csv.

Extraction and Verification Guardrails

  • Candidate fields to record: slug (from URL path), target-year row value, CSV URL.
  • The slug is the path segment between /grapher/ and any ? query separator in the browser URL.
  • The CSV URL is always https://ourworldindata.org/grapher/{slug}.csv — verify it matches the link shown in the Download panel.
  • Do not confuse the total energy figure with the per-capita figure; confirm the page title or indicator label contains "per capita" before recording the value.
  • Cross-check: the value in the {target_year} table row must be a plausible kWh-per-person figure (hundreds to tens of thousands); if it appears to be in EJ or TWh, you are likely on the wrong indicator page.
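
The slug and CSV URL rules above can be sketched as follows. The helper names are hypothetical, and the slug in the usage line is a made-up placeholder, not the actual slug for this task.

```python
from urllib.parse import urlparse

def slug_from_grapher_url(url):
    """Return the path segment after /grapher/, ignoring any query string."""
    path = urlparse(url).path  # the part after '?' is not included in .path
    prefix = "/grapher/"
    if not path.startswith(prefix):
        raise ValueError(f"not a grapher URL: {url}")
    slug = path[len(prefix):].strip("/")
    if not slug or "/" in slug:
        raise ValueError(f"unexpected grapher path: {path}")
    return slug

def owid_csv_url(slug):
    return f"https://ourworldindata.org/grapher/{slug}.csv"

# 'example-slug' is a placeholder for illustration only.
page = "https://ourworldindata.org/grapher/example-slug?tab=table&country=BRA"
print(owid_csv_url(slug_from_grapher_url(page)))
```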

List co-op strategy games under $20 released since 2022 with Very Positive reviews

Goal

Identify co-op strategy games priced under ${maxprice} released since {release_year} that carry Very Positive user reviews on Steam, and return their titles.

Preconditions

  • Browser can reach store.steampowered.com.
  • No login is required.

Strategy (read this first — do NOT skip)

All filtering is done via URL query parameters in a single navigation step, which is the most reliable approach. The search listing page already exposes each game's title, release date, price, and review summary badge (e.g. 'Very Positive') without needing to open individual game pages. Sort by User Reviews descending so the best-reviewed titles surface first. Candidate rows are those whose visible release date is {release_year} or later AND whose review badge reads 'Very Positive' (or better). Read all qualifying titles from the first page of results (typically the first 10–25 rows); do NOT paginate unless the task explicitly asks for exhaustive enumeration.

Credentials

None required.

Steps

  1. On the task start page (Steam search), navigate directly to the pre-filtered search URL, which applies the Strategy tag (9), the Co-op tag (1685), the Online Co-op feature (category2=9), and a maximum price of {maxprice}: https://store.steampowered.com/search/?tags=9%2C1685&category2=9&maxprice={maxprice}&supportedlang=english&ndl=1
  2. Verify the filter took effect: the URL in the address bar contains tags=9%2C1685, category2=9, and maxprice={maxprice}. If results are empty or the URL does not match, reload once.
  3. Click the sort dropdown trigger (labelled 'Relevance' by default).
  4. Click the 'User Reviews' sort option (element name Reviews_DESC). Confirm the URL now contains sort_by=Reviews_DESC and the result list re-ordered.
  5. Scan each result row on the current page (do NOT click into game pages). For each row, read:
    • Title
    • Release date (visible in the row, e.g. 'Nov 16, 2023')
    • Review badge text (e.g. 'Very Positive', 'Overwhelmingly Positive')
    • Price
  6. Build a candidate list of rows where ALL of the following are true:
    a. Release date year >= {release_year}
    b. Review badge is 'Very Positive' OR better (e.g. 'Overwhelmingly Positive')
    c. Price <= ${maxprice} (already enforced by filter, but verify)
  7. If a release date is NOT visible in a row (rare), skip that row.
  8. Stop scanning after the first page (scroll down once if necessary to reveal all rows in the initial load; do not navigate to page 2).
  9. Report all qualifying titles found in step 6.
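
Steps 1 and 2 (build the filtered URL, then verify it) could be sketched as below. The function names are hypothetical; the query parameters are those given in step 1.

```python
from urllib.parse import urlparse, parse_qs

def steam_search_url(maxprice):
    """Build the pre-filtered Steam search URL (hypothetical helper)."""
    return (
        "https://store.steampowered.com/search/"
        f"?tags=9%2C1685&category2=9&maxprice={maxprice}&supportedlang=english&ndl=1"
    )

def filters_applied(url, maxprice):
    """Check that the address-bar URL still carries the three required filters."""
    query = parse_qs(urlparse(url).query)  # parse_qs decodes %2C to ','
    return (
        query.get("tags") == ["9,1685"]
        and query.get("category2") == ["9"]
        and query.get("maxprice") == [str(maxprice)]
    )

print(filters_applied(steam_search_url(20), 20))
```

Comparing decoded values means the check passes whether the browser shows the comma as %2C or as a literal comma.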

Hard Stop Rule

  • Do NOT open individual game detail pages; all needed data is visible in the search listing.
  • Scan at most the first page of results (approximately 25 rows); do NOT follow pagination links.
  • If zero rows match both the release-year and review-badge criteria, answer 'None found'.
  • If you have attempted the sort control twice without the URL updating to include sort_by=Reviews_DESC, read results in default order and apply the review-badge and date filters manually from visible row data.

Stop Rule

Task is complete when you have read all qualifying titles from the first result page and can state them.

UI Control and Filter Guardrails

  • Preferred approach: single goto to https://store.steampowered.com/search/?tags=9%2C1685&category2=9&maxprice={maxprice}&supportedlang=english&ndl=1 — skip manual tag/category UI interaction entirely.
  • Verify filter applied: URL must contain tags=9%2C1685, category2=9, maxprice={maxprice}. If any are missing, re-navigate with the full URL.
  • Verify sort applied: URL must contain sort_by=Reviews_DESC after clicking 'User Reviews'. Observable signal: top result should have a review badge of 'Very Positive' or higher.
  • If the sort dropdown does not respond after two clicks, proceed with the unsorted list and manually filter by the review badge text visible on each row.
  • Do NOT use the review_score query parameter (e.g. review_score=8) — the human's trajectory showed this did not reliably filter results and was ultimately dropped.

Extraction and Verification Guardrails

  • For each candidate row, record: Title | Release Year | Review Badge | Price.
  • Include a row only if review badge text contains 'Very Positive' or 'Overwhelmingly Positive' AND release year >= {release_year}.
  • Do not infer review sentiment from numeric scores; read the badge label exactly as displayed.
  • Final answer: list all qualifying titles, comma-separated. Cross-check count against the number of rows you recorded in your candidate ledger before stopping.
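
The inclusion rule can be sketched as a small filter over the rows read from the listing. The row dictionaries, field names, and game titles below are invented for illustration; only the date format and badge labels come from the skill.

```python
from datetime import datetime

GOOD_BADGES = {"Very Positive", "Overwhelmingly Positive"}

def qualifies(row, release_year, maxprice):
    """Apply the release-year, badge, and price checks to one listing row."""
    try:
        year = datetime.strptime(row["release"], "%b %d, %Y").year
    except (KeyError, ValueError):
        return False  # no readable release date: skip the row
    return (
        year >= release_year
        and row.get("badge") in GOOD_BADGES
        and row.get("price", float("inf")) <= maxprice
    )

def qualifying_titles(rows, release_year, maxprice):
    return [row["title"] for row in rows if qualifies(row, release_year, maxprice)]

# Hypothetical rows for illustration only.
rows = [
    {"title": "Example A", "release": "Nov 16, 2023", "badge": "Very Positive", "price": 14.99},
    {"title": "Example B", "release": "Mar 3, 2021", "badge": "Very Positive", "price": 9.99},
    {"title": "Example C", "release": "Jan 5, 2024", "badge": "Mixed", "price": 4.99},
]
print(", ".join(qualifying_titles(rows, 2022, 20)))
```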

List 2024 PC games with Metascore ≥ 85

Goal

Return a list of {platform} games released in {year} whose Metascore is greater than or equal to {min_metascore}, sorted by Metascore descending.

Preconditions

  • No login required.
  • The task start page is https://www.metacritic.com/.

Strategy (read this first — do NOT skip)

The browse URL encodes platform, year, and sort order directly. A single goto to /browse/game/{platform}/all/{year}/metascore/ lands on the correct pre-sorted listing. Each visible row on this page already shows the game title and its Metascore — no drilling into detail pages is needed. Read rows top-to-bottom and stop when the Metascore in the row drops below {min_metascore}.

Credentials

None required.

Steps

  1. On the task start page (Metacritic homepage), navigate directly to the full listing of {platform} games for {year}, sorted by Metascore descending: https://www.metacritic.com/browse/game/{platform}/all/{year}/metascore/
  2. If a cookie-consent banner is present, click the "Accept Cookies" button (button id onetrust-accept-btn-handler) to dismiss it and expose the full list.
  3. On the listing page, scan each game row from top to bottom. For each row, read:
    • The game title (link text in the title cell)
    • The Metascore (numeric score displayed in the score badge)
  4. Record every game whose Metascore is >= {min_metascore}. Maintain a running ledger: [(title, metascore), ...].
  5. Stop reading rows as soon as a row's Metascore is strictly less than {min_metascore} — all lower rows are also below the threshold because the list is already sorted descending.
  6. If the listing spans multiple pages (pagination controls visible at the bottom), navigate to the next page and repeat steps 3–5, stopping the moment the first below-threshold score is encountered.
  7. Stop and answer with the collected ledger of matching games and their Metascores.

Hard Stop Rule

  • Read at most 5 pages of results; a score >= {min_metascore} is unlikely to appear after a long contiguous block of lower scores.
  • Do NOT open any individual game detail page — all required data (title, Metascore) is visible in the listing.
  • If zero rows have Metascore >= {min_metascore}, answer with an empty list or "None found".
  • If you have visited the same page URL already in this trajectory, do NOT visit it again — stop and report what you have.

Stop Rule

Task is complete when the first row with Metascore < {min_metascore} is encountered (or the last page is exhausted), and the collected ledger is reported as the answer.

UI Control and Filter Guardrails

  • Prefer the direct URL /browse/game/{platform}/all/{year}/metascore/ over any manual dropdown/filter interactions on the site.
  • Verify the filter took effect by confirming: (a) the URL contains /{year}/metascore/, and (b) the page heading or breadcrumb reflects the correct platform and year.
  • If the page loads without a visible score column or shows zero results unexpectedly, reload once; if still empty after reload, answer "None found".
  • Do not interact with sort dropdowns or filter controls — the URL already encodes the correct sort and scope.

Extraction and Verification Guardrails

  • Maintain an explicit ledger: each entry must have both a title string and an integer Metascore before being recorded.
  • Do not infer or guess Metascores — only record values explicitly displayed in the score badge of each row.
  • Before reporting, re-read the ledger and confirm every entry has Metascore >= {min_metascore}.
  • Report titles in the order they appear on the page (highest Metascore first).
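
The early-stop scan over a list sorted by Metascore descending can be sketched as below. The function names and sample rows are hypothetical; the URL pattern is the one given in the Strategy section.

```python
def browse_url(platform, year):
    """Build the pre-sorted browse URL (hypothetical helper)."""
    return f"https://www.metacritic.com/browse/game/{platform}/all/{year}/metascore/"

def collect_until_below(rows, min_metascore):
    """rows: (title, metascore) pairs in page order, highest score first."""
    ledger = []
    for title, score in rows:
        if score < min_metascore:
            break  # list is sorted descending, so every later row is lower too
        ledger.append((title, score))
    return ledger

# Hypothetical rows for illustration only.
rows = [("Example A", 94), ("Example B", 88), ("Example C", 85), ("Example D", 84)]
print(collect_until_below(rows, 85))
```

Because the loop breaks at the first score below the threshold, it reads one row past the last match and never scans the rest of the page.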

Find Clark County WA unemployment from 2019 to 2024 and get the CSV URL

Goal

Find the FRED series ID for {search_term}, set the date range to {start_date}–{end_date}, and report both the series ID and the CSV download URL.

Preconditions

  • No authentication needed; FRED is public.
  • Task starts on https://fred.stlouisfed.org/

Strategy (read this first — do NOT skip)

The series ID is embedded in the series-page URL (/series/{series_id}) and must be read from there — never guessed. Once the series ID is known, the CSV URL can be constructed directly as https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date} without any further UI interaction; this direct-URL approach is preferred over the Download button because FRED's date-input fields are pre-populated with the full series range and require a clear-before-type sequence (Meta+A → Backspace) that the agent often corrupts. Use the date-picker UI only as a fallback verification step.

Steps

  1. On the FRED homepage, locate the Search FRED Data... input. Type {search_term} and press Enter.
  2. On the /searchresults page, scan the result titles. Click the link whose title best matches the target region and series type (e.g. Unemployment Rate in Clark County, WA). Click it exactly once.
  3. You are now on the series detail page. Read the series ID from the browser URL: it is the segment after /series/ (e.g. WACLAR1URN). Record this value.
  4. Preferred shortcut: Construct the CSV URL as https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date} substituting the series ID you just read and the task-specified dates. Navigate (goto) to that URL. Confirm the response is a plain-text CSV whose first line is DATE,{series_id}. If confirmed, go to Step 6.
  5. Fallback (only if shortcut URL cannot be verified): Return to /series/{series_id}.
    a. Click the Change start date input field.
    b. Press Meta+A to select all existing content, then Backspace to clear it, then type {start_date}, then press Enter.
    c. Click the Change end date input field.
    d. Press Meta+A, then Backspace, then type {end_date}, then press Enter.
    e. Click the Download button. In the download menu, locate the CSV option and read its href; it should be https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date}.
  6. Stop and report: the series ID (e.g. WACLAR1URN) and the full CSV URL.

Hard Stop Rule

  • Click only the single best-matching result on the search results page; do not open multiple series.
  • If the search returns zero results, retry once with a shorter query (e.g. Clark County WA unemployment) and click the best match; do not retry more than once.
  • If after two full Meta+A → Backspace → type → Enter cycles a date field still shows the wrong value, abandon the date-picker UI and use the direct CSV URL from Step 4.
  • Do not re-open the same series page more than twice.

Stop Rule

Task is complete when the agent can state the series ID and a fully-formed CSV URL containing id=, cosd=, and coed= query parameters with the correct values.

UI Control and Filter Guardrails

  • Date fields are pre-populated. Typing immediately into a date input appends to the existing value and corrupts it. The required sequence is: click the field → Meta+A (select all) → Backspace (clear) → type the date → Enter. Never skip the clear step.
  • Prefer direct URL construction over UI controls. The pattern https://fred.stlouisfed.org/graph/fredgraph.csv?id={series_id}&cosd={start_date}&coed={end_date} is stable and avoids all date-picker risk.
  • Verify a date was applied by checking that the chart axis or URL query string reflects the new boundary. If neither changes after an Enter press, the field was not cleared properly — repeat the Meta+A → Backspace sequence once more before re-typing.
  • After two failed date-field attempts, do not repeat the same keystrokes; switch to the direct URL shortcut.

Extraction and Verification Guardrails

  • Series ID must be read from the /series/<ID> URL segment; do not infer it from the page title or metadata.
  • The CSV URL must contain all three params: id, cosd, coed.
  • Navigate to the constructed CSV URL and confirm the response starts with DATE,{series_id} (plain text, not an HTML error page). If the response is an HTML page, the series ID or dates are wrong — re-read the series URL and reconstruct.
  • Final reported answer must include both the series ID string and the complete CSV URL.
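
The two verification checks above (all three query parameters present, and a CSV rather than an HTML response) might be sketched as follows. The function names are hypothetical, the dates and the sample data row are illustrative, and the expected header DATE,{series_id} is taken from this skill's description rather than verified against the live service.

```python
from urllib.parse import urlparse, parse_qs

def csv_url_ok(url, series_id, start_date, end_date):
    """Check that id, cosd, and coed are present with the intended values."""
    query = parse_qs(urlparse(url).query)
    return (
        query.get("id") == [series_id]
        and query.get("cosd") == [start_date]
        and query.get("coed") == [end_date]
    )

def looks_like_csv(body, series_id):
    """True if the first line matches the header this skill expects."""
    lines = body.lstrip("\ufeff").splitlines()
    # Assumption from the skill text: the header line is DATE,{series_id}.
    return bool(lines) and lines[0].strip() == f"DATE,{series_id}"

url = "https://fred.stlouisfed.org/graph/fredgraph.csv?id=WACLAR1URN&cosd=2019-01-01&coed=2024-12-31"
print(csv_url_ok(url, "WACLAR1URN", "2019-01-01", "2024-12-31"))
```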

Retrieve 2023 Germany-to-China passenger-car export value

Goal

Retrieve the 2023 bilateral trade value (USD) and net weight (kg) for HS {hs_code} exports from {reporter_country} to {partner_country} using the UN Comtrade Plus TradeFlow form.

Preconditions

  • Browser has loaded https://comtradeplus.un.org/TradeFlow (the task start page).
  • No login is required.

Strategy (read this first — do NOT skip)

The TradeFlow page is a single-page facet form. Each facet (HS Code, Partner, Period/Year, Reporter, Trade Flow) is a combobox chip-selector. The reliable pattern for every facet is: click the facet input → press Backspace to remove any pre-existing chip → type the search term → press ArrowDown one or more times to highlight the correct suggestion → press Enter to confirm → press Escape to close the dropdown. The Trade Flow facet has no text to type; navigate purely with ArrowDown after clearing. After all five facets are set, click Preview and read the trade value and net weight columns from the result row.

Steps

  1. On the task start page (UN Comtrade TradeFlow form), locate the HS Code / Commodity facet input (placeholder text similar to 'Search' or blank).
  2. Click the HS Code facet input. Press Backspace once to clear any pre-filled chip. Type {hs_code}. Press ArrowDown once to highlight the first suggestion. Press Enter to confirm. Press Escape to close the dropdown. Verify a chip containing {hs_code} appears in the facet.
    • If the chip does not appear, click the input again, press Backspace, retype {hs_code}, press ArrowDown, press Enter.
  3. Locate the Partner Country facet input (placeholder 'Select...' near a label such as 'Partner').
  4. Click the Partner facet input. Press Backspace once to clear any default chip. Type {partner_country}. Press ArrowDown once. Press Enter. Press Escape. Verify a chip for {partner_country} is visible.
  5. Locate the Period / Year facet input.
  6. Click the Period facet input. Press Backspace once to clear any existing chip. Type {year}. Press ArrowDown once. Press Enter. Press Escape. Verify a chip for {year} is visible.
  7. Locate the Reporter Country facet input (label such as 'Reporter').
  8. Click the Reporter facet input. Press Backspace once. Type {reporter_country}. Press ArrowDown three times (to skip past aggregate/regional entries and land on the standard country entry). Press Enter. Press Escape. Verify a chip for {reporter_country} is visible.
    • If the chip shows a regional aggregate rather than the individual country, press Backspace to remove it, retype {reporter_country}, and adjust the ArrowDown count (try 1, then 2, then 3) until the chip label matches the expected country name.
  9. Locate the Trade Flow facet input (label such as 'Trade Flow' or 'Flow').
  10. Click the Trade Flow facet input. Press Backspace once (or twice if two default chips are present) until the facet is empty. Do NOT type anything. Press ArrowDown three times to highlight 'Exports'. Press Enter. Press Escape. Verify a chip labeled 'Exports' (or equivalent) is visible.
    • If the highlighted option after three ArrowDowns is not 'Exports', press Backspace to clear and retry: press ArrowDown once for each option until 'Exports' is highlighted, then Enter.
  11. Confirm all five facet chips are set: HS Code = {hs_code}, Partner = {partner_country}, Year = {year}, Reporter = {reporter_country}, Trade Flow = Exports.
  12. Click the 'Preview' button. Wait for the results grid to load.
  13. In the results table, locate the row corresponding to the configured query. Read the trade value (USD column, labeled 'Trade Value (US$)' or similar) and the net weight (column labeled 'Net Wgt (kg)' or similar).
  14. Stop and report both values.

Hard Stop Rule

  • Attempt each facet at most three times before moving on with whatever chip is set.
  • Do NOT click Preview until all five chips are confirmed present.
  • If the results grid shows zero rows after Preview, recheck the Trade Flow chip (it may have defaulted to 'Import'); clear and reselect 'Exports', then click Preview again — at most once more.
  • If after two Preview attempts the grid is still empty, report 'No data found for the given parameters'.
  • If you have attempted to set the same facet more than three times without success, stop and report what partial data is available.

Stop Rule

The task is complete when the trade value (USD) and net weight (kg) have been read from the results grid row after a successful Preview render.

UI Control and Filter Guardrails

  • Each facet uses a chip-based combobox. The reliable interaction sequence is: click input → Backspace (clear existing chip) → type search term → ArrowDown (one or more times) → Enter → Escape.
  • Verify each chip appeared before moving to the next facet: the chip text must be visible inside the facet control.
  • For Reporter Country, the dropdown may list regional aggregates before individual countries; use ArrowDown up to three times. If the chip label after Enter contains words like 'Area', 'Region', or 'World', it is wrong — Backspace it and try more ArrowDown presses.
  • For Trade Flow, do not type anything; navigate purely with ArrowDown after clearing. 'Exports' is typically the 3rd or 4th option in the list.
  • Proof that filters took effect: the Preview button becomes clickable/highlighted and the results grid header row shows the configured parameters.
  • After three failed attempts on any single facet control (the limit set in the Hard Stop Rule), note the current chip state and proceed to Preview anyway; record the uncertainty in the answer.
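
The interaction sequence and the aggregate-label check above can be sketched in Python. This is only an illustration: `facet_actions` and `is_aggregate_chip` are hypothetical helper names, the actions are plain strings rather than calls into any real browser-automation API, and the aggregate word list is taken directly from the guardrail.

```python
import re

# Words that, per the guardrail above, mark a regional aggregate chip.
AGGREGATE_WORDS = {"area", "region", "world"}


def facet_actions(search_term: str = "", arrow_downs: int = 1) -> list[str]:
    """Return the ordered actions for setting one chip-based facet."""
    actions = ["click input", "press Backspace"]
    if search_term:  # Trade Flow is navigated without typing anything
        actions.append(f"type {search_term!r}")
    actions += ["press ArrowDown"] * arrow_downs
    actions += ["press Enter", "press Escape"]
    return actions


def is_aggregate_chip(label: str) -> bool:
    """True if the chip label contains 'Area', 'Region', or 'World' as a word."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", label)}
    return bool(words & AGGREGATE_WORDS)
```

Matching on whole words rather than substrings keeps the check from misfiring on country names that merely contain one of the fragments.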

Extraction and Verification Guardrails

  • Maintain a two-field ledger: trade_value_usd and net_weight_kg.
  • Read both values from the same result row; do not mix values from different rows if multiple rows appear.
  • Trade value is typically expressed in thousands or billions USD — note the unit header and scale accordingly.
  • Net weight is typically in kg — confirm the column header unit.
  • Cross-check: for car exports from a major economy, a trade value in the tens of billions of USD and a net weight in the hundreds of millions of kg are both plausible.
  • Report both values with their units as read from the table, e.g. 'Trade Value: $16.48B; Net Weight: 489.6M kg'.
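
A minimal sketch of the two-field ledger, assuming values are read as strings in the style of the example above ('$16.48B', '489.6M kg'). The `TradeLedger` class, the `parse_scaled` helper, and the K/M/B suffix convention are illustrative assumptions, not something the results grid is known to use.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed suffix convention: K = thousand, M = million, B = billion.
SCALE = {"K": 1e3, "M": 1e6, "B": 1e9}
_PATTERN = re.compile(r"\$?\s*([\d,]+(?:\.\d+)?)\s*([KMB])?\s*(?:kg)?")


def parse_scaled(text: str) -> float:
    """Convert strings such as '$16.48B' or '489.6M kg' to a plain number."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised value: {text!r}")
    number = float(match.group(1).replace(",", ""))
    return number * SCALE.get(match.group(2), 1.0)


@dataclass
class TradeLedger:
    """Both fields must come from the same result row."""
    trade_value_usd: Optional[float] = None
    net_weight_kg: Optional[float] = None

    def is_complete(self) -> bool:
        return self.trade_value_usd is not None and self.net_weight_kg is not None
```

Keeping the fields as `None` until they are actually read makes it easy to refuse to answer while the ledger is incomplete.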

Find full metadata for the 2024 EPA methane final rule

Goal

Locate the Federal Register document detail page for the {year} EPA {topic} final rule and extract: FR Doc number, Federal Register citation, publication date, CFR part(s) affected, effective date, and govinfo PDF URL.

Preconditions

  • No authentication required.
  • Task start page is https://www.federalregister.gov/.

Strategy (read this first — do NOT skip)

The federalregister.gov document detail page (URL pattern /documents/{pub_year}/{pub_month}/{pub_day}/{fr_doc_number}/{slug}) contains all six required metadata fields in a structured sidebar/header block — Document Number, Citation, Publication Date, CFR Affected, Effective Date, and a govinfo PDF link — without requiring any drill-down into sub-pages. The most reliable path is a direct URL navigation when the FR Doc number and publication date are known; use keyword search only as a fallback to identify the correct document from the results listing. Do not iterate through multiple documents; the task targets a single specific rule.
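
The URL pattern can be assembled mechanically once its components are known. The sketch below assumes zero-padded month and day segments (consistent with the YYYY/MM/DD form the guardrails refer to); `document_url` is a hypothetical helper, and the date and slug in the usage line are placeholders rather than the real values for any rule.

```python
from datetime import date


def document_url(pub_date: date, fr_doc_number: str, slug: str) -> str:
    """Build /documents/{pub_year}/{pub_month}/{pub_day}/{fr_doc_number}/{slug}."""
    return (
        "https://www.federalregister.gov/documents/"
        f"{pub_date:%Y/%m/%d}/{fr_doc_number}/{slug}"
    )


# Placeholder date and slug, for illustration only.
example = document_url(date(2024, 1, 2), "2024-00366", "example-slug")
```

If any component is unknown, the search fallback remains the safer route, since a guessed date or slug may not resolve.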

Steps

  1. Preferred shortcut: If the FR Doc number and publication date are known, navigate directly to https://www.federalregister.gov/documents/{pub_year}/{pub_month}/{pub_day}/{fr_doc_number}/{slug} and skip to step 5. Otherwise, continue with step 2.
  2. Search fallback: On the task start page, locate the search input labeled "Enter a search term or citation". Type {search_term} and press Enter. The page navigates to /documents/search.
  3. On the search results page, scan the document title cards. Identify the single candidate whose title contains {rule_title_keywords} and whose listed publication date is in {year}. If needed, apply the Agency filter for Environmental Protection Agency to narrow results; confirm the agency chip appears in the active filter bar before proceeding.
  4. Click the title link of the matching document. This navigates to the document detail page.
  5. On the document detail page, locate the structured metadata block (right-hand sidebar or below the document title). Read and record all six fields exactly as displayed:
    • Document Number: format YYYY-NNNNN (cited as FR Doc YYYY-NNNNN)
    • Citation: format VV FR PPPPP (volume, space, FR, space, start page)
    • Publication Date: date the document was published in the Federal Register
    • CFR: Code of Federal Regulations part(s) affected (e.g., 40 CFR 60)
    • Effective Date: date the rule takes legal effect (distinct from publication date)
    • Full text PDF link: URL pointing to govinfo.gov (record the source domain as govinfo PDF)
  6. Stop and answer in the format: FR Doc {doc_number} {citation}; pub {pub_date}; {cfr_citation}; effective {effective_date}; govinfo PDF.

Hard Stop Rule

  • Open at most one document detail page per session. Do not iterate through multiple documents.
  • If zero search results match {rule_title_keywords} and {year}, answer "No matching document found".
  • If the direct URL returns a 404 or redirect, fall back to keyword search (step 2).
  • If you have attempted to reach the document detail page twice without success, stop and report whatever metadata fields were observed.

Stop Rule

Task is complete when all six fields — Document Number, Citation, Publication Date, CFR, Effective Date, PDF source — have been read from the document detail page and composed into the answer string.

UI Control and Filter Guardrails

  • Prefer the direct permanent URL /documents/{pub_year}/{pub_month}/{pub_day}/{fr_doc_number}/{slug} over manual search when the FR Doc number is determinable from the task.
  • To verify an Agency filter was applied, confirm the agency label appears as an active chip/badge above the results list and the result count changes.
  • After two failed filter attempts (dropdown interaction does not change visible results), abandon the filter and scroll through unfiltered results instead.
  • Type {search_term} as a single string into the input labeled "Enter a search term or citation"; do not split it across multiple fields.
  • The search input may also accept an FR Doc number directly (e.g., 2024-00366); use this if keyword search returns too many results.

Extraction and Verification Guardrails

  • Maintain a six-field ledger: Document Number, Citation, Publication Date, CFR, Effective Date, PDF source domain.
  • All six fields must be read directly from the document detail page metadata block — do not infer or guess any field.
  • Verify that the Document Number visible in the metadata block matches the numeric path segment in the current page URL.
  • Verify that Publication Date matches the YYYY/MM/DD components in the URL path.
  • For CFR, read the exact title and part (e.g., 40 CFR 60), not just the title number.
  • For Effective Date, read from the Dates or Effective: field in the metadata block, not from prose in the document body.
  • Cross-check: the Citation volume number should correspond to the Federal Register volume published in {pub_year} (e.g., volume 89 for 2024).
  • Do not submit the answer until all six ledger fields are populated from observed page content.
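
The URL-based cross-checks above can be sketched as a single function. Everything here is an assumption for illustration: `ledger_problems` is a hypothetical helper, the publication date is assumed to have been normalized to ISO form (YYYY-MM-DD) first, and the volume rule `year - 1935` is extrapolated from the "volume 89 for 2024" example.

```python
import re
from urllib.parse import urlparse


def ledger_problems(url: str, doc_number: str, pub_date_iso: str, citation: str) -> list[str]:
    """Return a list of mismatches between the ledger and the page URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path: documents / YYYY / MM / DD / doc_number / slug
    if len(parts) < 5 or parts[0] != "documents" or not parts[1].isdigit():
        return ["URL is not a document detail page"]
    year, month, day, url_doc = parts[1:5]

    problems = []
    if url_doc != doc_number:
        problems.append("Document Number does not match URL")
    if f"{year}-{month}-{day}" != pub_date_iso:
        problems.append("Publication Date does not match URL")

    match = re.fullmatch(r"(\d+) FR (\d+)", citation)
    if match is None:
        problems.append("Citation is not in 'VV FR PPPPP' form")
    elif int(match.group(1)) != int(year) - 1935:  # assumed rule: 2024 -> volume 89
        problems.append("Citation volume does not match publication year")
    return problems
```

An empty list means the three checks passed; it says nothing about the CFR, Effective Date, or PDF fields, which still have to be read from the page.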

Find recruiting phase-3 melanoma trials in California

Goal

Search ClinicalTrials.gov for recruiting Phase {phase} {condition} trials located in {location} and download the results as a CSV file.

Preconditions

  • No authentication required.
  • Task supplies: {condition} (e.g. melanoma), {location} (e.g. California), {phase} (e.g. 3).

Strategy (read this first — do NOT skip)

All filters can be encoded directly in the search URL using query parameters cond, locStr, and aggFilters. This bypasses manual dropdown and filter interactions that are fragile. Navigate to the pre-filtered URL first, then switch to Table view, stage the studies for download, open the Download panel, select CSV format, and confirm the download. No row-by-row inspection is needed — the export covers all matching results.
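
As a rough sketch, the pre-filtered URL can be built with Python's standard `urllib.parse.urlencode`. The `search_url` helper is hypothetical; keeping `:` and `,` unescaped mirrors the URL form used in this skill, and encoding spaces as `+` is an assumption about what the site accepts.

```python
from urllib.parse import urlencode


def search_url(condition: str, location: str, phase: int) -> str:
    """Build the ClinicalTrials.gov search URL for recruiting trials."""
    params = {
        "cond": condition,
        "locStr": location,
        "aggFilters": f"status:rec,phase:{phase}",
    }
    # safe=":," leaves the aggFilters separators unescaped.
    return "https://clinicaltrials.gov/search?" + urlencode(params, safe=":,")
```

For example, `search_url("melanoma", "California", 3)` yields the same URL as substituting the example values into the template by hand.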

Steps

  1. Navigate directly to the pre-filtered search URL:

https://clinicaltrials.gov/search?cond={condition}&locStr={location}&aggFilters=status:rec,phase:{phase}

     Verify: the page title or result count area shows a non-zero number of studies, and the active filter chips show "Recruiting", "Phase {phase}", and "{location}".

  2. Click the Table toggle button (label: "Table view is hidden" or "Table") to switch the results to table view. Verify: a tabular grid with column headers (NCT Number, Title, Status, etc.) is now visible.

  3. Click the Submit Studies button to stage the current filtered result set for export. Verify: button state changes or a count/confirmation message appears.

  4. Click the Download button to open the download panel or dialog. Verify: a panel or modal appears showing format options and/or a record count.

  5. In the download panel, locate the input whose current value or label corresponds to CSV format (observed as an input with associated value "10" or labelled "CSV"). Click it to select CSV. Verify: CSV option appears selected/checked.

  6. Click the confirmation input or button (second click on the "10" input, or an explicit "Download" / "Confirm" button within the panel) to initiate the file download. Verify: the browser begins downloading a .csv file.

  7. Stop and answer. The answer is the set of NCT IDs present in the downloaded CSV, plus the filter string used: aggFilters status:rec,phase:{phase}.

Hard Stop Rule

  • This task is NOT iterative over individual rows — it exports the full filtered set in one CSV download.
  • If the result count on the search page is 0, answer "0 results found" and do not attempt to download.
  • If the Download button is not present after clicking Submit Studies, reload the filtered URL (step 1) and retry once; make no more than two attempts in total.
  • If the download panel does not appear after clicking Download, try clicking Download once more; if still absent after two attempts, stop and report the NCT IDs visible in the table view as the answer.

Stop Rule

The task is complete when the browser has initiated a .csv file download containing the filtered study records.

UI Control and Filter Guardrails

  • Prefer the direct URL: https://clinicaltrials.gov/search?cond={condition}&locStr={location}&aggFilters=status:rec,phase:{phase} is the authoritative shortcut. Use it instead of manually clicking filter dropdowns or checkboxes.
  • Verify filters applied: After navigation, confirm that the page displays filter chips or labels reading "Recruiting", "Phase {phase}", and "{location}". If the result count is unexpectedly 0 and the chips are absent, re-navigate to the URL.
  • Do not manually interact with the condition/location/status/phase filter controls unless the direct URL fails twice — the URL encoding is more reliable than the UI controls.
  • If the URL navigation lands on the homepage instead of the search results, append the query string manually in the address bar and retry once.

Extraction and Verification Guardrails

  • The NCT IDs in the answer must match what is visible in the Table view before download (spot-check the first 3 rows).
  • The filter string in the answer must be exactly aggFilters status:rec,phase:{phase} (reflecting what was encoded in the URL).
  • Do not manually record individual NCT IDs from the table unless the download fails; if it fails, list all NCT IDs visible in the current page of results.
  • Confirm the downloaded file extension is .csv before stopping.
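
The post-download checks can be sketched as follows. The helper names are hypothetical, and the CSV column header "NCT Number" is an assumption borrowed from the table view's column header; the actual export may label it differently.

```python
import csv
import io
from pathlib import Path


def has_csv_extension(filename: str) -> bool:
    """Confirm the downloaded file ends in .csv (case-insensitive)."""
    return Path(filename).suffix.lower() == ".csv"


def nct_ids_from_csv_text(text: str, column: str = "NCT Number") -> list[str]:
    """Extract NCT IDs from CSV content; the column name is an assumption."""
    reader = csv.DictReader(io.StringIO(text))
    return [row[column].strip() for row in reader if row.get(column)]


def spot_check(csv_ids: list[str], table_ids: list[str], n: int = 3) -> bool:
    """Compare the first n IDs from the CSV with those seen in the Table view."""
    return csv_ids[:n] == table_ids[:n]
```

If the assumed column is missing, `nct_ids_from_csv_text` returns an empty list, which should be treated as a failed extraction rather than as zero results.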