
Overview

The @onkernel/ai-sdk package provides Vercel AI SDK-compatible tools for browser automation powered by Kernel. It exposes a Playwright execution tool that lets Large Language Models (LLMs) browse the web, interact with websites, and perform automation tasks from natural language instructions. The tool runs the Playwright code an AI agent produces on Kernel’s remote browsers, so your application does not need to host a browser itself.

Installation

Install the package along with its peer dependencies:
npm install @onkernel/ai-sdk zod
npm install ai @onkernel/sdk
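The @onkernel/sdk and ai packages are peer dependencies that must be installed separately. The examples on this page also import a model provider from @ai-sdk/openai; install it, or the provider package for your chosen LLM, as well.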

Prerequisites

Before using the AI SDK tool, you’ll need:
  1. Kernel API Key - Obtain from the Kernel Dashboard or through the Vercel Marketplace integration
  2. AI Model Provider - An API key for your chosen LLM provider (OpenAI, Anthropic, etc.)
  3. Kernel Browser Session - A running browser session created via the Kernel SDK
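
A small startup check can surface a missing key before any browser session is created. The sketch below is illustrative only: requireEnv is a hypothetical helper, KERNEL_API_KEY is the variable used in the examples on this page, and OPENAI_API_KEY assumes the OpenAI provider (other providers read their own variables).

```typescript
// Hypothetical helper: returns the requested variables, or throws an error
// naming every variable that is missing or empty.
function requireEnv(
  env: Record<string, string | undefined>,
  names: string[],
): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  const found: Record<string, string> = {};
  for (const name of names) {
    found[name] = env[name] as string;
  }
  return found;
}
```

In Node.js this could be called as requireEnv(process.env, ['KERNEL_API_KEY', 'OPENAI_API_KEY']) before constructing the Kernel client.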

How It Works

The playwrightExecuteTool function creates a Vercel AI SDK tool. When an LLM uses it:
  1. The LLM translates natural language instructions into Playwright code
  2. The tool receives that code as its input
  3. The code executes on a Kernel remote browser
  4. The results are returned to the LLM
This enables AI agents to autonomously browse websites, extract data, and perform complex automation tasks.
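
The flow above can be sketched with a stand-in for the remote browser. This is not the package's implementation: Executor and handleToolCall are hypothetical names, and the JSON result shape is an assumption chosen for illustration.

```typescript
// Input the LLM supplies when it calls the tool (mirrors the tool's input schema).
interface PlaywrightToolInput {
  code: string;
  timeout_sec?: number;
}

// Hypothetical stand-in for remote execution; the real tool calls the Kernel SDK.
type Executor = (sessionId: string, input: PlaywrightToolInput) => Promise<unknown>;

// Forwards the LLM-written code to the executor and hands the outcome back
// as a string the model can read on its next step. Failures are reported
// in the result instead of being thrown, so the model can try again.
async function handleToolCall(
  execute: Executor,
  sessionId: string,
  input: PlaywrightToolInput,
): Promise<string> {
  try {
    const output = await execute(sessionId, input);
    return JSON.stringify({ success: true, output });
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return JSON.stringify({ success: false, error: message });
  }
}
```

Returning errors as data rather than throwing is one possible design choice: it lets the model see what went wrong and adjust its next piece of code.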

Usage with generateText()

The simplest way to use the AI SDK tool is with Vercel’s generateText() function:
import { openai } from '@ai-sdk/openai';
import { playwrightExecuteTool } from '@onkernel/ai-sdk';
import { Kernel } from '@onkernel/sdk';
import { generateText } from 'ai';

// 1) Create Kernel client and start a browser session
const client = new Kernel({
  apiKey: process.env.KERNEL_API_KEY,
});

const browser = await client.browsers.create({});

const sessionId = browser.session_id;
console.log('Browser session started:', sessionId);

// 2) Create the Playwright execution tool
const playwrightTool = playwrightExecuteTool({
  client,
  sessionId
});

// 3) Use with Vercel AI SDK
const result = await generateText({
  model: openai('gpt-5.1'),
  prompt: 'Open example.com and click the first link',
  tools: {
    playwright_execute: playwrightTool,
  },
});

console.log('Result:', result.text);

// 4) Clean up
await client.browsers.deleteByID(sessionId);

Usage with Agent() Class

For more complex, multi-step automation tasks, use the Vercel AI SDK’s Agent() class. Agents can autonomously plan and execute a series of actions to accomplish a goal:
import { openai } from '@ai-sdk/openai';
import { playwrightExecuteTool } from '@onkernel/ai-sdk';
import { Kernel } from '@onkernel/sdk';
import { Experimental_Agent as Agent, stepCountIs } from 'ai';

const kernel = new Kernel({
  apiKey: process.env.KERNEL_API_KEY
});

const browser = await kernel.browsers.create({});

// Initialize the AI agent with GPT-5.1
const agent = new Agent({
  model: openai('gpt-5.1'),
  tools: {
    playwright_execute: playwrightExecuteTool({
      client: kernel,
      sessionId: browser.session_id,
    }),
  },
  stopWhen: stepCountIs(20), // Maximum 20 steps
  system: `You are a browser automation expert. You help users execute tasks in their browser using Playwright.`,
});

// Execute the agent with the user's task
const { text, steps, usage } = await agent.generate({
  prompt: 'Go to news.ycombinator.com, find the top 3 posts, and summarize them',
});

console.log('Agent response:', text);
console.log('Steps taken:', steps.length);
console.log('Token usage:', usage);

await kernel.browsers.deleteByID(browser.session_id);

Tool Parameters

The playwrightExecuteTool function accepts the following parameters:
function playwrightExecuteTool(options: {
  client: Kernel;     // Kernel SDK client instance
  sessionId: string;  // Existing browser session ID
}): Tool;

Tool Input Schema

The generated tool accepts the following input from the LLM:
{
  code: string;          // Required: JavaScript/TypeScript code to execute
  timeout_sec?: number;  // Optional: Execution timeout in seconds (default: 60)
}
Under the hood, the tool calls client.browsers.playwright.execute(sessionId, { code, timeout_sec }).
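
To make the schema concrete, here is a hand-written validator for the same shape. It is a sketch, not the package's code: parseToolInput and NormalizedToolInput are hypothetical names, and rejecting empty code or non-positive timeouts is an assumption rather than documented behavior.

```typescript
// Tool input after the default timeout has been applied.
interface NormalizedToolInput {
  code: string;
  timeout_sec: number;
}

const DEFAULT_TIMEOUT_SEC = 60; // default stated in the schema above

// Checks an untrusted value against the input schema and fills in the
// default timeout when the LLM did not provide one.
function parseToolInput(raw: unknown): NormalizedToolInput {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('Tool input must be an object');
  }
  const { code, timeout_sec } = raw as { code?: unknown; timeout_sec?: unknown };
  if (typeof code !== 'string' || code.trim() === '') {
    throw new Error('`code` must be a non-empty string');
  }
  if (timeout_sec === undefined) {
    return { code, timeout_sec: DEFAULT_TIMEOUT_SEC };
  }
  if (typeof timeout_sec !== 'number' || !Number.isFinite(timeout_sec) || timeout_sec <= 0) {
    throw new Error('`timeout_sec` must be a positive number');
  }
  return { code, timeout_sec };
}
```

A typical input from the LLM might look like { code: "await page.goto('https://example.com'); return await page.title();" }, assuming the executed code has a Playwright page object in scope and can return a value.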

Examples

Web Scraping

const result = await generateText({
  model: openai('gpt-5.1'),
  prompt: 'Go to producthunt.com and extract the top 5 product names and descriptions',
  tools: {
    playwright_execute: playwrightTool,
  },
});

Form Automation

const agent = new Agent({
  model: openai('gpt-5.1'),
  tools: {
    playwright_execute: playwrightTool,
  },
  stopWhen: stepCountIs(10),
  system: 'You are a form filling assistant.',
});

const result = await agent.generate({
  prompt: 'Navigate to example.com/contact, fill out the contact form with name "John Doe" and email "john@example.com", and submit it',
});

Data Extraction

const result = await generateText({
  model: openai('gpt-5.1'),
  prompt: 'Visit github.com/onkernel/kernel-nextjs-template, extract the README content, and count how many code examples are shown',
  tools: {
    playwright_execute: playwrightTool,
  },
});

Best Practices

1. Session Management

Always clean up browser sessions after use to avoid unnecessary costs:
try {
  const result = await generateText({
    // ... configuration
  });
  // Process results
} finally {
  await client.browsers.deleteByID(sessionId);
}
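
One way to make cleanup hard to forget is to wrap session creation and deletion in a single helper. In this sketch, withBrowserSession and SessionClient are hypothetical names; SessionClient covers only the two client methods used on this page and may need adjusting to match the real Kernel client type.

```typescript
// Structural subset of the Kernel client used on this page; the real client
// comes from `@onkernel/sdk`.
interface SessionClient {
  browsers: {
    create(options: Record<string, unknown>): Promise<{ session_id: string }>;
    deleteByID(sessionId: string): Promise<unknown>;
  };
}

// Hypothetical helper: creates a session, runs `task`, and always deletes the
// session afterwards, whether `task` resolves or throws.
async function withBrowserSession<T>(
  client: SessionClient,
  task: (sessionId: string) => Promise<T>,
): Promise<T> {
  const browser = await client.browsers.create({});
  try {
    return await task(browser.session_id);
  } finally {
    await client.browsers.deleteByID(browser.session_id);
  }
}
```

The callback receives the session ID, so the tool can be created and the generateText() call made inside it without a separate cleanup step.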

2. Error Handling

Implement robust error handling for production applications:
try {
  const agent = new Agent({
    model: openai('gpt-5.1'),
    tools: {
      playwright_execute: playwrightTool,
    },
    stopWhen: stepCountIs(20),
  });

  const result = await agent.generate({
    prompt: userTask,
  });

  return { success: true, data: result.text };
} catch (error) {
  console.error('Agent execution failed:', error);
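  // `error` is typed `unknown` in strict TypeScript, so narrow it before reading `.message`
  return { success: false, error: error instanceof Error ? error.message : String(error) };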
} finally {
  await client.browsers.deleteByID(sessionId);
}
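
Callers are easier to write when success and failure share one typed result. The following sketch shows that pattern in isolation; TaskResult, toErrorMessage, and runSafely are hypothetical names, not part of any package mentioned here.

```typescript
// Discriminated union: checking `success` tells TypeScript which fields exist.
type TaskResult =
  | { success: true; data: string }
  | { success: false; error: string };

// Thrown values are not guaranteed to be Error instances, so normalize them.
function toErrorMessage(error: unknown): string {
  if (error instanceof Error) return error.message;
  if (typeof error === 'string') return error;
  return 'Unknown error';
}

// Runs an async task and converts any thrown value into a failed result.
async function runSafely(task: () => Promise<string>): Promise<TaskResult> {
  try {
    return { success: true, data: await task() };
  } catch (error) {
    return { success: false, error: toErrorMessage(error) };
  }
}
```

A caller can then branch on result.success and read result.data or result.error without a type cast.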

3. Enable Stealth Mode

For websites with bot detection, enable stealth mode:
const browser = await client.browsers.create({
  stealth: true, // Evade bot detection
});

4. Live View for Debugging

Live view is enabled by default for non-headless browsers. Access the live view URL from the browser object:
const browser = await client.browsers.create({});

console.log('Watch your browser:', browser.browser_live_view_url);

Troubleshooting

Tool Not Being Called

If the LLM isn’t using the tool, make your prompt more explicit:
const result = await generateText({
  model: openai('gpt-5.1'),
  prompt: 'Use the Playwright tool to navigate to example.com and extract the page title',
  tools: {
    playwright_execute: playwrightTool,
  },
});
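
If rewording the prompt is not enough, the Vercel AI SDK's toolChoice option can require a specific tool call. The sketch below builds such options without calling the SDK: forceTool and ToolCallOptions are hypothetical names, and the types are a structural subset of what generateText() accepts.

```typescript
// Values the AI SDK accepts for `toolChoice`.
type ToolChoice =
  | 'auto'
  | 'none'
  | 'required'
  | { type: 'tool'; toolName: string };

// Structural subset of generateText options; the real type comes from `ai`.
interface ToolCallOptions {
  prompt: string;
  tools: Record<string, unknown>;
  toolChoice?: ToolChoice;
}

// Hypothetical helper: returns a copy of `options` that forces the named tool,
// and fails early if that tool is not registered.
function forceTool<T extends ToolCallOptions>(
  options: T,
  toolName: string,
): T & { toolChoice: ToolChoice } {
  if (!(toolName in options.tools)) {
    throw new Error(`Unknown tool: ${toolName}`);
  }
  const toolChoice: ToolChoice = { type: 'tool', toolName };
  return { ...options, toolChoice };
}
```

The result can be spread into a generateText() call alongside the model. Forcing a tool call may leave result.text empty in a single-step call, so this is likely best kept for debugging or paired with a multi-step setup.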

Timeout Errors

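If Playwright executions time out, the LLM can request a longer limit by setting the optional timeout_sec parameter in its tool call. Executions default to 60 seconds, so mention the expected duration in your prompt or system message when a task is likely to run longer.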

Additional Resources