An interactive guide to the future of on-device AI in the browser.
The LLM Prompt API gives web developers direct access to large language models running entirely on a user's device. This on-device approach offers significant benefits, including enhanced privacy (user data never leaves the machine) and the ability to run offline once the model is downloaded.
The API is available via an Origin Trial, or by enabling the chrome://flags/#prompt-api-for-gemini-nano-multimodal-input flag in your browser. For more information, check out the official documentation.
More features, such as tool use and embedding models, are coming soon!
Before trying to create a model session, check if the browser supports the API and if the required model data is ready. The LanguageModel.availability() method returns the status, which can be available, downloadable, downloading, or unavailable.
const availability = await LanguageModel.availability();
addOutput(`Model availability: ${availability}`);
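Each of the four states can drive a different message in your UI. A minimal sketch, assuming a hypothetical describeAvailability helper (the messages are illustrative, not part of the API):

```javascript
// Hypothetical helper mapping each availability state to a user-facing message.
function describeAvailability(status) {
  switch (status) {
    case "available":
      return "Model is ready to use.";
    case "downloadable":
      return "Model will be downloaded when a session is created.";
    case "downloading":
      return "Model download is already in progress.";
    default: // "unavailable"
      return "On-device model is not supported here.";
  }
}

// Usage in the browser:
// const availability = await LanguageModel.availability();
// addOutput(describeAvailability(availability));
```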
If the model isn't available locally, creating a session with LanguageModel.create() may trigger a download. You can show the progress to the user with the monitor option.
const session = await LanguageModel.create({
monitor: (monitor) => {
monitor.addEventListener("downloadprogress", (e) => {
// e.loaded is a value between 0 and 1
const percent = Math.round(e.loaded * 100);
if (percent < 100) {
addOutput(`Downloading... ${percent}%`, true);
} else {
addOutput(`Download complete. Processing model, this may take a bit...`, true);
}
});
}
});
addOutput("Session created and ready!", true);
You can control the model's creativity by setting parameters like temperature (randomness) and topK (limits the pool of candidate next tokens). Remember, sessions are stateful: each prompt adds to the conversation's context. Call LanguageModel.create() as soon as you know the model will be needed to avoid a delay on the first prompt.
const session = await LanguageModel.create({
temperature: 0.8,
topK: 10
});
addOutput("Session with custom parameters created!");
const result = await session.prompt("Tell me a one sentence story about a girl and her dog.");
addOutput("\n\n" + result);
session.prompt() sends a request and returns the complete response as a string.
const session = await LanguageModel.create();
const result = await session.prompt("Write a short, upbeat poem about coding.");
addOutput(result);
For longer responses, session.promptStreaming() returns a ReadableStream that provides the response in chunks.
const session = await LanguageModel.create();
const stream = session.promptStreaming("Tell me a one-paragraph story about a brave robot.");
for await (const chunk of stream) {
addOutput(chunk);
}
Set a persistent persona for the model by providing a system role in initialPrompts.
const session = await LanguageModel.create({
initialPrompts: [
{ role: "system", content: "You are a witty pirate. All your responses must be in character." }
]
});
const stream = await session.promptStreaming("What's the best thing about sailing the high seas?");
for await (const chunk of stream) {
addOutput(chunk);
}
Sessions are stateful, meaning they remember the context of the conversation. You can ask follow-up questions, and the model will use the previous prompts and responses to inform its new answer.
const session = await LanguageModel.create();
addOutput("User: Tell me a one liner about cats.\n\n");
addOutput("Assistant: ");
const stream1 = session.promptStreaming("Tell me a one liner about cats.");
for await (const chunk of stream1) {
addOutput(chunk);
}
addOutput("\n\nUser: Tell me another one.\n\n");
addOutput("Assistant: ");
const stream2 = session.promptStreaming("Tell me another one.");
for await (const chunk of stream2) {
addOutput(chunk);
}
Use session.clone() to start a new, independent conversation from the same base state. Cloning efficiently shares the initial state, so base prompts don't have to be reprocessed for every conversation.
const baseSession = await LanguageModel.create({
initialPrompts: [{ role: "system", content: "You are a travel guide for France that gives concise answers." }]
});
// Create two independent clones
const clone1 = await baseSession.clone();
const clone2 = await baseSession.clone();
// Ask clone 1 about the capital
const stream1 = await clone1.promptStreaming("What is the capital of France?");
for await (const chunk of stream1) {
addOutput(chunk, false, 'clone1');
}
// Ask clone 2 about pastries
const stream2 = await clone2.promptStreaming("What color is a croissant?");
for await (const chunk of stream2) {
addOutput(chunk, false, 'clone2');
}
Guide the model by providing examples ("shots") of the desired interaction in initialPrompts.
const session = await LanguageModel.create({
initialPrompts: [
{ role: "system", content: "Suggest up to 3 emojis for a comment." },
{ role: "user", content: "This is amazing!" },
{ role: "assistant", content: "🎉, ✨, ❤️" },
{ role: "user", content: "LGTM" },
{ role: "assistant", content: "👍, ✅" }
]
});
const stream = await session.promptStreaming("This is a big improvement!");
for await (const chunk of stream) {
addOutput(chunk);
}
Pass non-text inputs like images by specifying expectedInputs and providing an array of content types.
This canvas provides an image for the example:
const session = await LanguageModel.create({
expectedInputs: [{ type: "image" }]
});
// The userDrawnImage can be any ImageBitmapSource, like an <img> or <canvas>
const userDrawnImage = document.getElementById('mock-canvas');
const stream = await session.promptStreaming([{
role: "user",
content: [
{ type: "text", value: "Describe the image." },
{ type: "image", value: userDrawnImage }
]
}]);
for await (const chunk of stream) {
addOutput(chunk);
}
Constrain the model's output to a format like JSON, or to text matching a regular expression, using responseConstraint.
const schema = {
type: "object",
properties: {
city: { type: "string", description: "The city name." },
country: { type: "string", description: "The country name." },
},
required: ["city", "country"]
};
const session = await LanguageModel.create();
const result = await session.prompt(
"What is the capital of France?",
{ responseConstraint: schema }
);
const parsed = JSON.parse(result);
addOutput(JSON.stringify(parsed, null, 2));
To constrain output with a regular expression instead:
const hexColorRegExp = /^#[0-9a-fA-F]{6}$/;
const session = await LanguageModel.create();
const result = await session.prompt(
"Give me a hex code for a calm blue color.",
{ responseConstraint: hexColorRegExp }
);
addOutput(result);
Guide the model by pre-filling the start of its response. This is useful for code completion, where the model generates code based on a function signature or description.
const session = await LanguageModel.create({
initialPrompts: [{
role: "system",
content: "You are a JavaScript code completion assistant. Complete the given function. Do not provide any explanations or surrounding text."
}]
});
const functionDescription = "A JavaScript function that calculates the factorial of a number.";
const functionSignature = "function factorial(n) {";
const stream = await session.promptStreaming([
{ role: "user", content: functionDescription },
{ role: "assistant", content: functionSignature, prefix: true }
]);
// Display the prefix immediately
addOutput(functionSignature, true);
for await (const chunk of stream) {
addOutput(chunk);
}
append()
Add messages to the session's context with append() without immediately generating a response. Use append() to pre-process inputs before the user submits the final prompt, hiding latency.
const session = await LanguageModel.create();
// Add messages to the session's history without getting a response.
await session.append([{role: "user", content: "Fact 1: The sky is blue."}]);
await session.append([{role: "user", content: "Fact 2: Grass is green."}]);
addOutput("Appended two facts to the session context.\n\nNow, asking for a summary...\n\n");
// Now prompt for a response based on the full context.
const stream = await session.promptStreaming("Summarize the facts I've given you in a single sentence.");
for await (const chunk of stream) {
addOutput(chunk);
}
Provide an AbortSignal to stop a long-running request.
const controller = new AbortController();
window.abortExampleController = controller; // Make accessible to button
try {
const session = await LanguageModel.create();
const stream = session.promptStreaming(
"Tell me the entire history of AI in great detail.",
{ signal: controller.signal }
);
for await (const chunk of stream) {
addOutput(chunk);
}
} catch (err) {
if (err.name === 'AbortError') {
addOutput('\n--- Prompt aborted by user. ---', false);
} else {
throw err; // Re-throw other errors
}
}
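The same mechanism also works with the standard AbortSignal.timeout() helper to cancel a prompt that runs too long. A sketch (the 10-second limit is an arbitrary choice for illustration):

```javascript
// Abort automatically after 10 seconds instead of on a button press.
const signal = AbortSignal.timeout(10_000);

// In the browser:
// const result = await session.prompt(
//   "Tell me the entire history of AI in great detail.",
//   { signal }
// );
// A timed-out request rejects with a "TimeoutError" DOMException
// rather than an "AbortError".
```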
An interactive guide to on-device translation and language detection.
The Translator and Language Detector APIs expose a browser's existing translation capabilities to web pages. This allows for simple, resource-efficient, and private text translation and language identification.
For more information, check out the official documentation for the Translator API and the Language Detector API.
Check if capabilities are supported. Enter BCP 47 language tags (e.g., en, es, ja) to see the status.
// This code is not editable. Use the inputs above.
const sourceLanguage = document.getElementById('translate-source-lang').value;
const targetLanguage = document.getElementById('translate-target-lang').value;
const availability = await Translator.availability({ sourceLanguage, targetLanguage });
addOutput(`Translator availability ('${sourceLanguage}' -> '${targetLanguage}'): ${availability}`);
If a language pair isn't available locally, creating a translator may trigger a download. You can show progress with the monitor option.
const translator = await Translator.create({
sourceLanguage: "en",
targetLanguage: "de",
monitor: (m) => {
m.addEventListener("downloadprogress", e => {
const percent = Math.round(e.loaded * 100);
addOutput(`Downloading German model... ${percent}%`, true);
});
}
});
addOutput("Translator created! Download (if any) is complete.", true);
const result = await translator.translate("This is a test.");
addOutput(`\n\nTranslation: ${result}`);
Create a Translator instance for a language pair. The translate() method returns the translated text.
const translator = await Translator.create({
sourceLanguage: "en",
targetLanguage: "fr"
});
const result = await translator.translate("Hello, world! How are you today?");
addOutput(result);
The LanguageDetector identifies the language of a text, returning possibilities sorted by confidence.
const detector = await LanguageDetector.create();
const text = "Comment allez-vous aujourd'hui?";
const results = await detector.detect(text);
let output = `Detected languages for: "${text}"\n\n`;
for (const result of results) {
const confidence = (result.confidence * 100).toFixed(2);
output += `${result.detectedLanguage}: ${confidence}% confident\n`;
}
addOutput(output.trim());
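Detection pairs naturally with translation: take the most confident result and use it as the Translator's sourceLanguage. A sketch, where topLanguage is a hypothetical helper and the results array is assumed sorted by confidence as shown above:

```javascript
// Hypothetical helper: take the most likely language from detect() results.
function topLanguage(results) {
  return results.length > 0 ? results[0].detectedLanguage : undefined;
}

// Usage in the browser:
// const detector = await LanguageDetector.create();
// const results = await detector.detect(text);
// const translator = await Translator.create({
//   sourceLanguage: topLanguage(results),
//   targetLanguage: "en"
// });
```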
An interactive guide to on-device text summarization.
The Summarizer API provides a high-level interface for generating summaries of text. This allows web applications to perform tasks like creating headlines, summarizing articles, or condensing user reviews, all on the user's device.
For more information, check out the official documentation.
First, check if the Summarizer API is available in the browser.
const availability = await Summarizer.availability();
addOutput(`Summarizer availability: ${availability}`);
Create a Summarizer and use the summarize() method to get a condensed version of a long text.
const textToSummarize = `The James Webb Space Telescope (JWST) is a space telescope designed primarily to conduct infrared astronomy. As the largest optical telescope in space, it is equipped with high-resolution and high-sensitivity instruments, allowing it to view objects too old, distant, or faint for the Hubble Space Telescope. This enables investigations across many fields of astronomy and cosmology, such as observation of the first stars and the formation of the first galaxies, and detailed atmospheric characterization of potentially habitable exoplanets.`;
const summarizer = await Summarizer.create();
const result = await summarizer.summarize(textToSummarize);
addOutput(result);
For longer summaries, you can stream the result to display it token by token as it's generated.
const textToSummarize = `The history of artificial intelligence (AI) began in antiquity, with myths, stories and rumors of artificial beings endowed with intelligence or consciousness by master craftsmen. The seeds of modern AI were planted by classical philosophers who attempted to describe the process of human thinking as the mechanical manipulation of symbols. This work culminated in the invention of the programmable digital computer in the 1940s, a machine based on the abstract essence of mathematical reasoning. This device and the ideas behind it inspired a handful of scientists to begin seriously discussing the possibility of building an electronic brain.`;
const summarizer = await Summarizer.create();
const stream = await summarizer.summarizeStreaming(textToSummarize);
for await (const chunk of stream) {
addOutput(chunk);
}
You can control the format of the output by specifying a type and length when creating the summarizer.
const textToSummarize = `The history of artificial intelligence (AI) began in antiquity, with myths, stories and rumors of artificial beings endowed with intelligence or consciousness by master craftsmen. The seeds of modern AI were planted by classical philosophers who attempted to describe the process of human thinking as the mechanical manipulation of symbols. This work culminated in the invention of the programmable digital computer in the 1940s, a machine based on the abstract essence of mathematical reasoning. This device and the ideas behind it inspired a handful of scientists to begin seriously discussing the possibility of building an electronic brain.`;
// Let's ask for a short headline
const summarizer = await Summarizer.create({
type: 'headline',
length: 'short'
});
const result = await summarizer.summarize(textToSummarize);
addOutput(result);
Interactive guides for on-device content generation and revision.
The Writer and Rewriter APIs provide high-level assistance for content creation and modification. Use the Writer API to generate new text from a prompt, and the Rewriter API to adjust existing text.
For more information, check out the official documentation for the Writer API and the Rewriter API.
Before using the API, you should check if it's available in the user's browser.
const availability = await Writer.availability();
addOutput(`Writer availability: ${availability}`);
Generate new text from a simple prompt describing the writing task.
const writer = await Writer.create();
const result = await writer.write("A short, exciting paragraph about a journey to Mars.");
addOutput(result);
Control the generated text by specifying a tone and length.
const writer = await Writer.create({
tone: 'formal',
length: 'long'
});
const stream = await writer.writeStreaming("An email to a client requesting project feedback.");
for await (const chunk of stream) {
addOutput(chunk);
}
Just like the other APIs, it's a good practice to check for availability first.
const availability = await Rewriter.availability();
addOutput(`Rewriter availability: ${availability}`);
Provide text to the rewrite() method to get a revised version.
const rewriter = await Rewriter.create();
const originalText = "This thing is kinda cool i guess, u should get one.";
const result = await rewriter.rewrite(originalText);
addOutput(`Original: ${originalText}\n\nRewritten: ${result}`);
Guide the revision by specifying a desired tone or length.
const rewriter = await Rewriter.create({
tone: 'more-formal'
});
const originalText = "hey, can you pls look at this when you have a sec? thx";
const result = await rewriter.rewrite(originalText);
addOutput(`Original: ${originalText}\n\nRewritten to be more formal: ${result}`);