An interactive guide to the future of on-device AI in the browser.
The LLM Prompt API gives web developers direct access to large language models running entirely on a user's device. This on-device approach offers significant benefits, including enhanced privacy (user data never leaves the machine) and the ability to run offline once the model is downloaded.
The API is available via an Origin Trial, or by enabling the chrome://flags/#prompt-api-for-gemini-nano-multimodal-input flag in your browser. For more information, check out the official documentation.
More features, such as tool use and embedding models, are coming soon!
Before trying to create a model session, check if the browser supports the API and if the required model data is ready. The LanguageModel.availability() method returns the status, which can be available, downloadable, downloading, or unavailable.
const availability = await LanguageModel.availability();
addOutput(`Model availability: ${availability}`);
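Each of the four states can drive a different message in your UI. A minimal sketch, assuming a hypothetical describeAvailability helper (the messages are illustrative, not part of the API):

```javascript
// Hypothetical helper mapping each availability state to a user-facing message.
function describeAvailability(status) {
  switch (status) {
    case "available":
      return "Model is ready to use.";
    case "downloadable":
      return "Model will be downloaded when a session is created.";
    case "downloading":
      return "Model download is already in progress.";
    default: // "unavailable"
      return "On-device model is not supported here.";
  }
}

// Usage in the browser:
// const availability = await LanguageModel.availability();
// addOutput(describeAvailability(availability));
```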
If the model isn't available locally, creating a session with LanguageModel.create() may trigger a download. You can show the progress to the user with the monitor option.
const session = await LanguageModel.create({
monitor: (monitor) => {
monitor.addEventListener("downloadprogress", (e) => {
// e.loaded is a value between 0 and 1
const percent = Math.round(e.loaded * 100);
if (percent < 100) {
addOutput(`Downloading... ${percent}%`, true);
} else {
addOutput(`Download complete. Processing model, this may take a bit...`, true);
}
});
}
});
addOutput("Session created and ready!", true);
You can control the model's creativity by setting parameters like temperature (randomness) and topK (limits the pool of candidate next tokens). Remember, sessions are stateful: each prompt adds to the conversation's context. Call LanguageModel.create() as soon as you know the model will be needed to avoid a delay on the first prompt.
const session = await LanguageModel.create({
temperature: 0.8,
topK: 10
});
addOutput("Session with custom parameters created!");
const result = await session.prompt("Tell me a one sentence story about a girl and her dog.");
addOutput("\n\n" + result);
session.prompt() sends a request and returns the complete response as a string.
const session = await LanguageModel.create();
const result = await session.prompt("Write a short, upbeat poem about coding.");
addOutput(result);
For longer responses, session.promptStreaming() returns a ReadableStream that provides the response in chunks.
const session = await LanguageModel.create();
const stream = session.promptStreaming("Tell me a one-paragraph story about a brave robot.");
for await (const chunk of stream) {
addOutput(chunk);
}
Set a persistent persona for the model by providing a system role in initialPrompts.
const session = await LanguageModel.create({
initialPrompts: [
{ role: "system", content: "You are a witty pirate. All your responses must be in character." }
]
});
const stream = await session.promptStreaming("What's the best thing about sailing the high seas?");
for await (const chunk of stream) {
addOutput(chunk);
}
Sessions are stateful, meaning they remember the context of the conversation. You can ask follow-up questions, and the model will use the previous prompts and responses to inform its new answer.
const session = await LanguageModel.create();
addOutput("User: Tell me a one liner about cats.\n\n");
addOutput("Assistant: ");
const stream1 = session.promptStreaming("Tell me a one liner about cats.");
for await (const chunk of stream1) {
addOutput(chunk);
}
addOutput("\n\nUser: Tell me another one.\n\n");
addOutput("Assistant: ");
const stream2 = session.promptStreaming("Tell me another one.");
for await (const chunk of stream2) {
addOutput(chunk);
}
Use session.clone() to start a new, independent conversation from the same base state. Cloning efficiently shares the initial state, so base prompts don't have to be reprocessed for every conversation.
const baseSession = await LanguageModel.create({
initialPrompts: [{ role: "system", content: "You are a travel guide for France that gives concise answers." }]
});
// Create two independent clones
const clone1 = await baseSession.clone();
const clone2 = await baseSession.clone();
// Ask clone 1 about the capital
const stream1 = await clone1.promptStreaming("What is the capital of France?");
for await (const chunk of stream1) {
addOutput(chunk, false, 'clone1');
}
// Ask clone 2 about pastries
const stream2 = await clone2.promptStreaming("What color is a croissant?");
for await (const chunk of stream2) {
addOutput(chunk, false, 'clone2');
}
Guide the model by providing examples ("shots") of the desired interaction in initialPrompts.
const session = await LanguageModel.create({
initialPrompts: [
{ role: "system", content: "Suggest up to 3 emojis for a comment." },
{ role: "user", content: "This is amazing!" },
{ role: "assistant", content: "🎉, ✨, ❤️" },
{ role: "user", content: "LGTM" },
{ role: "assistant", content: "👍, ✅" }
]
});
const stream = await session.promptStreaming("This is a big improvement!");
for await (const chunk of stream) {
addOutput(chunk);
}
Pass non-text inputs like images by specifying expectedInputs and providing an array of content types.
This canvas provides an image for the example:
const session = await LanguageModel.create({
expectedInputs: [{ type: "image" }]
});
// The userDrawnImage can be any ImageBitmapSource, like an <img> or <canvas>
const userDrawnImage = document.getElementById('mock-canvas');
const stream = await session.promptStreaming([{
role: "user",
content: [
{ type: "text", value: "Describe the image." },
{ type: "image", value: userDrawnImage }
]
}]);
for await (const chunk of stream) {
addOutput(chunk);
}
Constrain the model's output to a format like JSON, or to text matching a regular expression, using responseConstraint.
const schema = {
type: "object",
properties: {
city: { type: "string", description: "The city name." },
country: { type: "string", description: "The country name." },
},
required: ["city", "country"]
};
const session = await LanguageModel.create();
const result = await session.prompt(
"What is the capital of France?",
{ responseConstraint: schema }
);
const parsed = JSON.parse(result);
addOutput(JSON.stringify(parsed, null, 2));
To constrain output with a regular expression instead:
const hexColorRegExp = /^#[0-9a-fA-F]{6}$/;
const session = await LanguageModel.create();
const result = await session.prompt(
"Give me a hex code for a calm blue color.",
{ responseConstraint: hexColorRegExp }
);
addOutput(result);
Guide the model by pre-filling the start of its response. This is useful for code completion, where the model generates code based on a function signature or description.
const session = await LanguageModel.create({
initialPrompts: [{
role: "system",
content: "You are a JavaScript code completion assistant. Complete the given function. Do not provide any explanations or surrounding text."
}]
});
const functionDescription = "A JavaScript function that calculates the factorial of a number.";
const functionSignature = "function factorial(n) {";
const stream = await session.promptStreaming([
{ role: "user", content: functionDescription },
{ role: "assistant", content: functionSignature, prefix: true }
]);
// Display the prefix immediately
addOutput(functionSignature, true);
for await (const chunk of stream) {
addOutput(chunk);
}
append()
Add messages to the session's context with append() without immediately generating a response. Use append() to pre-process inputs before the user submits the final prompt, hiding latency.
const session = await LanguageModel.create();
// Add messages to the session's history without getting a response.
await session.append([{role: "user", content: "Fact 1: The sky is blue."}]);
await session.append([{role: "user", content: "Fact 2: Grass is green."}]);
addOutput("Appended two facts to the session context.\n\nNow, asking for a summary...\n\n");
// Now prompt for a response based on the full context.
const stream = await session.promptStreaming("Summarize the facts I've given you in a single sentence.");
for await (const chunk of stream) {
addOutput(chunk);
}
Provide an AbortSignal to stop a long-running request.
const controller = new AbortController();
window.abortExampleController = controller; // Make accessible to button
try {
const session = await LanguageModel.create();
const stream = session.promptStreaming(
"Tell me the entire history of AI in great detail.",
{ signal: controller.signal }
);
for await (const chunk of stream) {
addOutput(chunk);
}
} catch (err) {
if (err.name === 'AbortError') {
addOutput('\n--- Prompt aborted by user. ---', false);
} else {
throw err; // Re-throw other errors
}
}
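The same mechanism also works with the standard AbortSignal.timeout() helper to cancel a prompt that runs too long. A sketch (the 10-second limit is an arbitrary choice for illustration):

```javascript
// Abort automatically after 10 seconds instead of on a button press.
const signal = AbortSignal.timeout(10_000);

// In the browser:
// const result = await session.prompt(
//   "Tell me the entire history of AI in great detail.",
//   { signal }
// );
// A timed-out request rejects with a "TimeoutError" DOMException
// rather than an "AbortError".
```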
An interactive guide to on-device translation and language detection.
The Translator and Language Detector APIs expose a browser's existing translation capabilities to web pages. This allows for simple, resource-efficient, and private text translation and language identification.
For more information, check out the official documentation for the Translator API and the Language Detector API.
Check if capabilities are supported. Enter BCP 47 language tags (e.g., en, es, ja) to see the status.
// This code is not editable. Use the inputs above.
const sourceLanguage = document.getElementById('translate-source-lang').value;
const targetLanguage = document.getElementById('translate-target-lang').value;
const availability = await Translator.availability({ sourceLanguage, targetLanguage });
addOutput(`Translator availability ('${sourceLanguage}' -> '${targetLanguage}'): ${availability}`);
If a language pair isn't available locally, creating a translator may trigger a download. You can show progress with the monitor option.
const translator = await Translator.create({
sourceLanguage: "en",
targetLanguage: "de",
monitor: (m) => {
m.addEventListener("downloadprogress", e => {
const percent = Math.round(e.loaded * 100);
addOutput(`Downloading German model... ${percent}%`, true);
});
}
});
addOutput("Translator created! Download (if any) is complete.", true);
const result = await translator.translate("This is a test.");
addOutput(`\n\nTranslation: ${result}`);
Create a Translator instance for a language pair. The translate() method returns the translated text.
const translator = await Translator.create({
sourceLanguage: "en",
targetLanguage: "fr"
});
const result = await translator.translate("Hello, world! How are you today?");
addOutput(result);
The LanguageDetector identifies the language of a text, returning possibilities sorted by confidence.
const detector = await LanguageDetector.create();
const text = "Comment allez-vous aujourd'hui?";
const results = await detector.detect(text);
let output = `Detected languages for: "${text}"\n\n`;
for (const result of results) {
const confidence = (result.confidence * 100).toFixed(2);
output += `${result.detectedLanguage}: ${confidence}% confident\n`;
}
addOutput(output.trim());
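Detection pairs naturally with translation: take the most confident result and use it as the Translator's sourceLanguage. A sketch, where topLanguage is a hypothetical helper and the results array is assumed sorted by confidence as shown above:

```javascript
// Hypothetical helper: take the most likely language from detect() results.
function topLanguage(results) {
  return results.length > 0 ? results[0].detectedLanguage : undefined;
}

// Usage in the browser:
// const detector = await LanguageDetector.create();
// const results = await detector.detect(text);
// const translator = await Translator.create({
//   sourceLanguage: topLanguage(results),
//   targetLanguage: "en"
// });
```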
An interactive guide to on-device text summarization.
The Summarizer API provides a high-level interface for generating summaries of text. This allows web applications to perform tasks like creating headlines, summarizing articles, or condensing user reviews, all on the user's device.
For more information, check out the official documentation.
First, check if the Summarizer API is available in the browser.
const availability = await Summarizer.availability();
addOutput(`Summarizer availability: ${availability}`);
Create a Summarizer and use the summarize() method to get a condensed version of a long text.
const textToSummarize = `The James Webb Space Telescope (JWST) is a space telescope designed primarily to conduct infrared astronomy. As the largest optical telescope in space, it is equipped with high-resolution and high-sensitivity instruments, allowing it to view objects too old, distant, or faint for the Hubble Space Telescope. This enables investigations across many fields of astronomy and cosmology, such as observation of the first stars and the formation of the first galaxies, and detailed atmospheric characterization of potentially habitable exoplanets.`;
const summarizer = await Summarizer.create();
const result = await summarizer.summarize(textToSummarize);
addOutput(result);
For longer summaries, you can stream the result to display it token by token as it's generated.
const textToSummarize = `The history of artificial intelligence (AI) began in antiquity, with myths, stories and rumors of artificial beings endowed with intelligence or consciousness by master craftsmen. The seeds of modern AI were planted by classical philosophers who attempted to describe the process of human thinking as the mechanical manipulation of symbols. This work culminated in the invention of the programmable digital computer in the 1940s, a machine based on the abstract essence of mathematical reasoning. This device and the ideas behind it inspired a handful of scientists to begin seriously discussing the possibility of building an electronic brain.`;
const summarizer = await Summarizer.create();
const stream = await summarizer.summarizeStreaming(textToSummarize);
for await (const chunk of stream) {
addOutput(chunk);
}
You can control the format of the output by specifying a type and length when creating the summarizer.
const textToSummarize = `The history of artificial intelligence (AI) began in antiquity, with myths, stories and rumors of artificial beings endowed with intelligence or consciousness by master craftsmen. The seeds of modern AI were planted by classical philosophers who attempted to describe the process of human thinking as the mechanical manipulation of symbols. This work culminated in the invention of the programmable digital computer in the 1940s, a machine based on the abstract essence of mathematical reasoning. This device and the ideas behind it inspired a handful of scientists to begin seriously discussing the possibility of building an electronic brain.`;
// Let's ask for a short headline
const summarizer = await Summarizer.create({
type: 'headline',
length: 'short'
});
const result = await summarizer.summarize(textToSummarize);
addOutput(result);
Interactive guides for on-device content generation and revision.
The Writer and Rewriter APIs provide high-level assistance for content creation and modification. Use the Writer API to generate new text from a prompt, and the Rewriter API to adjust existing text.
For more information, check out the official documentation for the Writer API and the Rewriter API.
Before using the API, you should check if it's available in the user's browser.
const availability = await Writer.availability();
addOutput(`Writer availability: ${availability}`);
Generate new text from a simple prompt describing the writing task.
const writer = await Writer.create();
const result = await writer.write("A short, exciting paragraph about a journey to Mars.");
addOutput(result);
Control the generated text by specifying a tone and length.
const writer = await Writer.create({
tone: 'formal',
length: 'long'
});
const stream = await writer.writeStreaming("An email to a client requesting project feedback.");
for await (const chunk of stream) {
addOutput(chunk);
}
Just like the other APIs, it's a good practice to check for availability first.
const availability = await Rewriter.availability();
addOutput(`Rewriter availability: ${availability}`);
Provide text to the rewrite() method to get a revised version.
const rewriter = await Rewriter.create();
const originalText = "This thing is kinda cool i guess, u should get one.";
const result = await rewriter.rewrite(originalText);
addOutput(`Original: ${originalText}\n\nRewritten: ${result}`);
Guide the revision by specifying a desired tone or length.
const rewriter = await Rewriter.create({
tone: 'more-formal'
});
const originalText = "hey, can you pls look at this when you have a sec? thx";
const result = await rewriter.rewrite(originalText);
addOutput(`Original: ${originalText}\n\nRewritten to be more formal: ${result}`);