Commit 3917df4

derekmeegan and claude authored
Add browse get markdown command (#1907)
## Summary

- Adds `browse get markdown [selector]` to convert page HTML to clean markdown
- Defaults to body content when no selector given, accepts optional CSS/XPath/ref selector
- Uses `node-html-markdown` for quality conversion (links, tables, code blocks preserved)
- Useful for agents that need readable page content without HTML noise

## Usage

```bash
browse get markdown           # full page body as markdown
browse get markdown .article  # specific element
browse get markdown @0-5      # ref from snapshot
```

## Test results

| Test | Local | Remote (Browserbase) |
|------|-------|----------------------|
| `get markdown` (body default) | HN full page markdown | HN full page markdown |
| `get markdown .titleline` (selector) | Clean link with title | Clean link with title |

## Test plan

- [x] Test locally with no selector (full body)
- [x] Test locally with CSS selector
- [x] Test on remote Browserbase session (no selector)
- [x] Test on remote Browserbase session (with selector)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 041cccc commit 3917df4

4 files changed

Lines changed: 104 additions & 1 deletion

.changeset/add-get-markdown.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"@browserbasehq/browse-cli": patch
+---
+
+Add `browse get markdown [selector]` command to convert page HTML to markdown. Defaults to body content, supports optional selector for specific elements. Uses node-html-markdown for high-quality conversion with links, tables, and code blocks preserved.

packages/cli/package.json

Lines changed: 1 addition & 0 deletions
@@ -57,6 +57,7 @@
     "@browserbasehq/stagehand": "workspace:*",
     "commander": "^12.0.0",
     "dotenv": "^16.4.5",
+    "node-html-markdown": "^1.3.0",
     "pino": "^9.6.0",
     "pino-pretty": "^13.0.0",
     "ws": "^8.18.0"

packages/cli/src/index.ts

Lines changed: 7 additions & 1 deletion
@@ -19,6 +19,7 @@ import * as readline from "readline";
 import type { Protocol } from "devtools-protocol";
 import { version as VERSION } from "../package.json";
 import { resolveWsTarget } from "./resolve-ws";
+import { NodeHtmlMarkdown } from "node-html-markdown";
 
 const program = new Command();
 
@@ -1231,6 +1232,11 @@ async function executeCommand(
           .deepLocator(resolveSelector(selector!))
           .isChecked(),
       };
+    case "markdown": {
+      const target = selector ? resolveSelector(selector) : "body";
+      const html = await page!.deepLocator(target).innerHtml();
+      return { markdown: NodeHtmlMarkdown.translate(html) };
+    }
     default:
       throw new Error(`Unknown get type: ${what}`);
   }
@@ -2465,7 +2471,7 @@ program
 program
   .command("get <what> [selector]")
   .description(
-    "Get page info: url, title, text, html, value, box, visible, checked",
+    "Get page info: url, title, text, html, markdown, value, box, visible, checked",
   )
   .action(async (what: string, selector?: string) => {
     const opts = program.opts<GlobalOpts>();
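
The `markdown` case added in `executeCommand` can be read as three steps: pick a target (the resolved selector, or `"body"` when none is given), read that element's inner HTML, and pass the HTML to a converter. The sketch below isolates that flow so it can be exercised without a browser. It is an illustration under assumptions, not the CLI's code: `LocatorLike`, `PageLike`, `pickTarget`, and `getMarkdown` are hypothetical names, the identity default for `resolve` only stands in for the real `resolveSelector`, and the converter is injected rather than imported (in the commit it is `NodeHtmlMarkdown.translate`).

```typescript
// Structural stand-ins for the page and locator objects used by the CLI.
// These shapes are assumptions made for illustration, not the real Stagehand types.
interface LocatorLike {
  innerHtml(): Promise<string>;
}

interface PageLike {
  deepLocator(selector: string): LocatorLike;
}

// Pick the locator target: the resolved selector if one was given, else "body".
// `resolve` stands in for the CLI's `resolveSelector`; the identity default is hypothetical.
function pickTarget(
  selector: string | undefined,
  resolve: (s: string) => string = (s) => s,
): string {
  return selector ? resolve(selector) : "body";
}

// Read the target's inner HTML and hand it to an injected converter.
// In the commit, the converter is `NodeHtmlMarkdown.translate`.
async function getMarkdown(
  page: PageLike,
  selector: string | undefined,
  toMarkdown: (html: string) => string,
): Promise<{ markdown: string }> {
  const html = await page.deepLocator(pickTarget(selector)).innerHtml();
  return { markdown: toMarkdown(html) };
}
```

Because the check is a truthiness test, an empty-string selector falls back to `"body"` just as an omitted one does. The same holds for the `selector ? resolveSelector(selector) : "body"` expression in the diff.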

pnpm-lock.yaml

Lines changed: 91 additions & 0 deletions
