Anthropic chat model integration.

Setup: Install @langchain/anthropic and set environment variable ANTHROPIC_API_KEY.

npm install @langchain/anthropic
export ANTHROPIC_API_KEY="your-api-key"

Runtime args can be passed as the second argument to any of the base runnable methods (.invoke, .stream, .batch, etc.). They can also be passed via .bind, or as the second argument to .bindTools, as shown in the examples below:

// When calling `.bind`, call options should be passed via the first argument
const llmWithArgsBound = llm.bind({
  stop: ["\n"],
  tools: [...],
});

// When calling `.bindTools`, call options should be passed via the second argument
const llmWithTools = llm.bindTools(
  [...],
  {
    stop: ["stop on this token!"],
  }
);

Instantiate
import { ChatAnthropic } from '@langchain/anthropic';

const llm = new ChatAnthropic({
  model: "claude-3-5-sonnet-20240620",
  temperature: 0,
  maxTokens: undefined,
  maxRetries: 2,
  // apiKey: "...",
  // baseUrl: "...",
  // other params...
});

Invoking
const messages = [
  {
    type: "system" as const,
    content: "You are a helpful translator. Translate the user sentence to French.",
  },
  {
    type: "human" as const,
    content: "I love programming.",
  },
];
const result = await llm.invoke(messages);
console.log(result);
AIMessage {
  "id": "msg_01QDpd78JUHpRP6bRRNyzbW3",
  "content": "Here's the translation to French:\n\nJ'adore la programmation.",
  "response_metadata": {
    "id": "msg_01QDpd78JUHpRP6bRRNyzbW3",
    "model": "claude-3-5-sonnet-20240620",
    "stop_reason": "end_turn",
    "stop_sequence": null,
    "usage": {
      "input_tokens": 25,
      "output_tokens": 19
    },
    "type": "message",
    "role": "assistant"
  },
  "usage_metadata": {
    "input_tokens": 25,
    "output_tokens": 19,
    "total_tokens": 44
  }
}

Streaming Chunks
for await (const chunk of await llm.stream(messages)) {
  console.log(chunk);
}
AIMessageChunk {
  "id": "msg_01N8MwoYxiKo9w4chE4gXUs4",
  "content": "",
  "additional_kwargs": {
    "id": "msg_01N8MwoYxiKo9w4chE4gXUs4",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-5-sonnet-20240620"
  },
  "usage_metadata": {
    "input_tokens": 25,
    "output_tokens": 1,
    "total_tokens": 26
  }
}
AIMessageChunk {
  "content": "",
}
AIMessageChunk {
  "content": "Here",
}
AIMessageChunk {
  "content": "'s",
}
AIMessageChunk {
  "content": " the translation to",
}
AIMessageChunk {
  "content": " French:\n\nJ",
}
AIMessageChunk {
  "content": "'adore la programmation",
}
AIMessageChunk {
  "content": ".",
}
AIMessageChunk {
  "content": "",
  "additional_kwargs": {
    "stop_reason": "end_turn",
    "stop_sequence": null
  },
  "usage_metadata": {
    "input_tokens": 0,
    "output_tokens": 19,
    "total_tokens": 19
  }
}

Aggregate Streamed Chunks
import { AIMessageChunk } from '@langchain/core/messages';
import { concat } from '@langchain/core/utils/stream';

const stream = await llm.stream(messages);
let full: AIMessageChunk | undefined;
for await (const chunk of stream) {
  full = !full ? chunk : concat(full, chunk);
}
console.log(full);
AIMessageChunk {
  "id": "msg_01SBTb5zSGXfjUc7yQ8EKEEA",
  "content": "Here's the translation to French:\n\nJ'adore la programmation.",
  "additional_kwargs": {
    "id": "msg_01SBTb5zSGXfjUc7yQ8EKEEA",
    "type": "message",
    "role": "assistant",
    "model": "claude-3-5-sonnet-20240620",
    "stop_reason": "end_turn",
    "stop_sequence": null
  },
  "usage_metadata": {
    "input_tokens": 25,
    "output_tokens": 20,
    "total_tokens": 45
  }
}
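
The aggregated usage_metadata above is consistent with adding up the per-chunk counts from the Streaming Chunks example. The sketch below reproduces that arithmetic by hand. It is not the logic inside concat; UsageMetadata and sumUsage are hypothetical names introduced only for illustration.

```typescript
// Shape mirrors the usage_metadata objects printed in the examples above.
interface UsageMetadata {
  input_tokens: number;
  output_tokens: number;
  total_tokens: number;
}

// Sum usage across chunks; chunks without usage_metadata contribute nothing.
function sumUsage(chunks: Array<UsageMetadata | undefined>): UsageMetadata {
  const total: UsageMetadata = { input_tokens: 0, output_tokens: 0, total_tokens: 0 };
  for (const usage of chunks) {
    if (!usage) continue;
    total.input_tokens += usage.input_tokens;
    total.output_tokens += usage.output_tokens;
    total.total_tokens += usage.total_tokens;
  }
  return total;
}

// Values taken from the first and last streamed chunks shown earlier;
// the content-only chunks in between carry no usage_metadata.
const perChunk: Array<UsageMetadata | undefined> = [
  { input_tokens: 25, output_tokens: 1, total_tokens: 26 },
  undefined,
  { input_tokens: 0, output_tokens: 19, total_tokens: 19 },
];

const summed = sumUsage(perChunk);
console.log(summed);
// { input_tokens: 25, output_tokens: 20, total_tokens: 45 }
```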

Bind tools
import { z } from 'zod';

const GetWeather = {
  name: "GetWeather",
  description: "Get the current weather in a given location",
  schema: z.object({
    location: z.string().describe("The city and state, e.g. San Francisco, CA")
  }),
}

const GetPopulation = {
  name: "GetPopulation",
  description: "Get the current population in a given location",
  schema: z.object({
    location: z.string().describe("The city and state, e.g. San Francisco, CA")
  }),
}

const llmWithTools = llm.bindTools([GetWeather, GetPopulation]);
const aiMsg = await llmWithTools.invoke(
  "Which city is hotter today and which is bigger: LA or NY?"
);
console.log(aiMsg.tool_calls);
[
  {
    name: 'GetWeather',
    args: { location: 'Los Angeles, CA' },
    id: 'toolu_01WjW3Dann6BPJVtLhovdBD5',
    type: 'tool_call'
  },
  {
    name: 'GetWeather',
    args: { location: 'New York, NY' },
    id: 'toolu_01G6wfJgqi5zRmJomsmkyZXe',
    type: 'tool_call'
  },
  {
    name: 'GetPopulation',
    args: { location: 'Los Angeles, CA' },
    id: 'toolu_0165qYWBA2VFyUst5RA18zew',
    type: 'tool_call'
  },
  {
    name: 'GetPopulation',
    args: { location: 'New York, NY' },
    id: 'toolu_01PGNyP33vxr13tGqr7i3rDo',
    type: 'tool_call'
  }
]

Structured Output
import { z } from 'zod';

const Joke = z.object({
  setup: z.string().describe("The setup of the joke"),
  punchline: z.string().describe("The punchline to the joke"),
  rating: z.number().optional().describe("How funny the joke is, from 1 to 10")
}).describe('Joke to tell user.');

const structuredLlm = llm.withStructuredOutput(Joke);
const jokeResult = await structuredLlm.invoke("Tell me a joke about cats");
console.log(jokeResult);
{
  setup: "Why don't cats play poker in the jungle?",
  punchline: 'Too many cheetahs!',
  rating: 7
}

Multimodal
import { HumanMessage } from '@langchain/core/messages';

const imageUrl = "https://example.com/image.jpg";
const imageData = await fetch(imageUrl).then(res => res.arrayBuffer());
const base64Image = Buffer.from(imageData).toString('base64');

const message = new HumanMessage({
  content: [
    { type: "text", text: "describe the weather in this image" },
    {
      type: "image_url",
      image_url: { url: `data:image/jpeg;base64,${base64Image}` },
    },
  ]
});

const imageDescriptionAiMsg = await llm.invoke([message]);
console.log(imageDescriptionAiMsg.content);
The weather in this image appears to be beautiful and clear. The sky is a vibrant blue with scattered white clouds, suggesting a sunny and pleasant day. The clouds are wispy and light, indicating calm conditions without any signs of storms or heavy weather. The bright green grass on the rolling hills looks lush and well-watered, which could mean recent rainfall or good growing conditions. Overall, the scene depicts a perfect spring or early summer day with mild temperatures, plenty of sunshine, and gentle breezes - ideal weather for enjoying the outdoors or for plant growth.

Usage Metadata
const aiMsgForMetadata = await llm.invoke(messages);
console.log(aiMsgForMetadata.usage_metadata);
{ input_tokens: 25, output_tokens: 19, total_tokens: 44 }

Stream Usage Metadata
const streamForMetadata = await llm.stream(
  messages,
  {
    streamUsage: true
  }
);
let fullForMetadata: AIMessageChunk | undefined;
for await (const chunk of streamForMetadata) {
  fullForMetadata = !fullForMetadata ? chunk : concat(fullForMetadata, chunk);
}
console.log(fullForMetadata?.usage_metadata);
{ input_tokens: 25, output_tokens: 20, total_tokens: 45 }

Response Metadata
const aiMsgForResponseMetadata = await llm.invoke(messages);
console.log(aiMsgForResponseMetadata.response_metadata);
{
  id: 'msg_01STxeQxJmp4sCSpioD6vK3L',
  model: 'claude-3-5-sonnet-20240620',
  stop_reason: 'end_turn',
  stop_sequence: null,
  usage: { input_tokens: 25, output_tokens: 19 },
  type: 'message',
  role: 'assistant'
}
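
The stop_reason field in the metadata above indicates why generation ended. The following is a minimal sketch of branching on it. ResponseMetadataLike and describeStop are hypothetical names, and "max_tokens" is a stop reason from the Anthropic Messages API that does not appear in this page's examples.

```typescript
// Subset of the response_metadata fields printed above.
interface ResponseMetadataLike {
  stop_reason: string | null;
  stop_sequence: string | null;
}

// Describe why generation ended. "end_turn" and "stop_sequence" appear in this
// document; "max_tokens" is reported when the token limit cuts the output short.
function describeStop(meta: ResponseMetadataLike): string {
  switch (meta.stop_reason) {
    case "end_turn":
      return "model finished its turn";
    case "stop_sequence":
      return `stopped on sequence ${JSON.stringify(meta.stop_sequence)}`;
    case "max_tokens":
      return "output was cut off at the maxTokens limit";
    default:
      return `other stop reason: ${String(meta.stop_reason)}`;
  }
}

console.log(describeStop({ stop_reason: "end_turn", stop_sequence: null }));
// model finished its turn
```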

Properties

clientOptions: ClientOptions

Overridable Anthropic ClientOptions

maxTokens: number = 2048

A maximum number of tokens to generate before stopping.

model: string = "claude-2.1"

Model name to use

modelName: string = "claude-2.1"

Model name to use

streamUsage: boolean = true

Whether or not to include token usage data in streamed chunks.

streaming: boolean = false

Whether to stream the results or not

temperature: number = 1

Amount of randomness injected into the response. Ranges from 0 to 1. Use a temperature closer to 0 for analytical or multiple-choice tasks, and closer to 1 for creative and generative tasks.
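
To build intuition for what temperature does, here is a sketch of the textbook temperature-scaled softmax over a toy set of scores. How the Anthropic API applies temperature internally is not specified on this page, so treat this as an illustration only. softmaxWithTemperature and toyLogits are hypothetical names, and treating a temperature of 0 as a pure argmax is a simplifying assumption.

```typescript
// Convert raw scores (logits) to probabilities, dividing by the temperature first.
// Lower temperature sharpens the distribution; 0 is treated here as argmax (assumption).
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature === 0) {
    const best = logits.indexOf(Math.max(...logits));
    return logits.map((_, i) => (i === best ? 1 : 0));
  }
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const toyLogits = [2, 1, 0]; // hypothetical scores for three candidate tokens
console.log(softmaxWithTemperature(toyLogits, 1)); // relatively flat
console.log(softmaxWithTemperature(toyLogits, 0.2)); // mass concentrates on the top score
console.log(softmaxWithTemperature(toyLogits, 0)); // [ 1, 0, 0 ]
```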

topK: number = -1

Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Defaults to -1, which disables it.
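
The effect of top-K filtering can be sketched on a toy probability distribution: keep the K most probable options, zero out the rest, and renormalize. This is a generic illustration, not the API's implementation. topKFilter is a hypothetical helper, and treating any K of 0 or less as "disabled" mirrors the -1 default described above.

```typescript
// Keep only the k most probable options and renormalize.
// k <= 0 (or k >= number of options) leaves the distribution unchanged.
function topKFilter(probs: number[], k: number): number[] {
  if (k <= 0 || k >= probs.length) return [...probs];
  const keep = new Set(
    probs
      .map((p, i) => [p, i] as const)
      .sort((a, b) => b[0] - a[0]) // decreasing probability
      .slice(0, k)
      .map(([, i]) => i),
  );
  const kept = probs.map((p, i) => (keep.has(i) ? p : 0));
  const sum = kept.reduce((a, b) => a + b, 0);
  return kept.map((p) => p / sum);
}

// With k = 2, only the 0.5 and 0.25 options survive (renormalized to 2/3 and 1/3).
console.log(topKFilter([0.5, 0.25, 0.125, 0.125], 2));
// With k = -1, filtering is disabled and the input comes back unchanged.
console.log(topKFilter([0.5, 0.25, 0.125, 0.125], -1));
```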

topP: number = -1

Does nucleus sampling, in which we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by top_p. Defaults to -1, which disables it. Note that you should either alter temperature or top_p, but not both.
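
The cumulative cut-off described above can be sketched as follows. This is a generic nucleus filter over a toy distribution, not the API's implementation. topPFilter is a hypothetical helper, and two details are assumptions: the option that crosses the threshold is kept, and values outside the open interval (0, 1) disable filtering, which covers the -1 default.

```typescript
// Nucleus filter: walk options in decreasing probability, keep them until the
// cumulative probability reaches topP, then renormalize what was kept.
function topPFilter(probs: number[], topP: number): number[] {
  if (topP <= 0 || topP >= 1) return [...probs]; // treated as disabled (assumption)
  const order = probs.map((p, i) => ({ p, i })).sort((a, b) => b.p - a.p);
  const kept = new Array<number>(probs.length).fill(0);
  let cumulative = 0;
  for (const { p, i } of order) {
    kept[i] = p;
    cumulative += p;
    if (cumulative >= topP) break; // the option that crosses the threshold is kept
  }
  const sum = kept.reduce((a, b) => a + b, 0);
  return kept.map((p) => p / sum);
}

// 0.5 alone is below 0.7; adding 0.25 reaches 0.75, so the two 0.125 options are dropped.
console.log(topPFilter([0.5, 0.25, 0.125, 0.125], 0.7));
```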

anthropicApiKey?: string

Anthropic API key

apiKey?: string

Anthropic API key

apiUrl?: string

invocationKwargs?: Kwargs

Holds any additional parameters that are valid to pass to anthropic.messages that are not explicitly specified on this class.

stopSequences?: string[]

A list of strings upon which to stop generating. You probably want ["\n\nHuman:"], as that's the cue for the next turn in the dialog agent.
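
Stop sequences are applied by the API during generation, but their effect can be simulated client-side to see what to expect. The sketch below cuts a string at the earliest matching sequence and reports stop_reason and stop_sequence in the same shape as the Response Metadata example. applyStopSequences and StopResult are hypothetical names, and excluding the matched sequence from the returned text is an assumption of this sketch.

```typescript
interface StopResult {
  text: string;
  stop_reason: "end_turn" | "stop_sequence";
  stop_sequence: string | null;
}

// Cut `text` at the earliest occurrence of any stop sequence.
function applyStopSequences(text: string, stopSequences: string[]): StopResult {
  let cutAt = -1;
  let matched: string | null = null;
  for (const stop of stopSequences) {
    if (stop.length === 0) continue; // ignore empty strings
    const idx = text.indexOf(stop);
    if (idx !== -1 && (cutAt === -1 || idx < cutAt)) {
      cutAt = idx;
      matched = stop;
    }
  }
  if (matched === null) {
    // No sequence matched: the text is returned whole.
    return { text, stop_reason: "end_turn", stop_sequence: null };
  }
  return { text: text.slice(0, cutAt), stop_reason: "stop_sequence", stop_sequence: matched };
}

const stopped = applyStopSequences("Bonjour.\n\nHuman: next question", ["\n\nHuman:"]);
console.log(stopped.text); // Bonjour.
console.log(stopped.stop_reason); // stop_sequence
```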

batchClient: Anthropic

streamingClient: Anthropic

Methods

  • Get the identifying parameters for the model

    Returns {
        max_tokens: number;
        model:
            | string & {}
            | "claude-2.1"
            | "claude-3-opus-20240229"
            | "claude-3-sonnet-20240229"
            | "claude-3-haiku-20240307"
            | "claude-2.0"
            | "claude-instant-1.2";
        model_name: string;
        metadata?: Metadata;
        stop_sequences?: string[];
        stream?: boolean;
        system?: string;
        temperature?: number;
        tool_choice?: ToolChoiceAuto | ToolChoiceAny | ToolChoiceTool;
        tools?: Tool[];
        top_k?: number;
        top_p?: number;
    }

    • max_tokens: number

      The maximum number of tokens to generate before stopping.

      Note that our models may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate.

      Different models have different maximum values for this parameter. See models for details.

    • model:
          | string & {}
          | "claude-2.1"
          | "claude-3-opus-20240229"
          | "claude-3-sonnet-20240229"
          | "claude-3-haiku-20240307"
          | "claude-2.0"
          | "claude-instant-1.2"

      The model that will complete your prompt.

      See models for additional details and options.

    • model_name: string
    • Optional metadata?: Metadata

      An object describing metadata about the request.

    • Optional stop_sequences?: string[]

      Custom text sequences that will cause the model to stop generating.

      Our models will normally stop when they have naturally completed their turn, which will result in a response stop_reason of "end_turn".

      If you want the model to stop generating when it encounters custom strings of text, you can use the stop_sequences parameter. If the model encounters one of the custom sequences, the response stop_reason value will be "stop_sequence" and the response stop_sequence value will contain the matched stop sequence.

    • Optional stream?: boolean

      Whether to incrementally stream the response using server-sent events.

      See streaming for details.

    • Optional system?: string

      System prompt.

      A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role. See our guide to system prompts.

    • Optional temperature?: number

      Amount of randomness injected into the response.

      Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks.

      Note that even with temperature of 0.0, the results will not be fully deterministic.

    • Optional tool_choice?: ToolChoiceAuto | ToolChoiceAny | ToolChoiceTool

      How the model should use the provided tools. The model can use a specific tool, any available tool, or decide by itself.

    • Optional tools?: Tool[]

      Definitions of tools that the model may use.

      If you include tools in your API request, the model may return tool_use content blocks that represent the model's use of those tools. You can then run those tools using the tool input generated by the model and then optionally return results back to the model using tool_result content blocks.

      Each tool definition includes:

      • name: Name of the tool.
      • description: Optional, but strongly-recommended description of the tool.
      • input_schema: JSON schema for the tool input shape that the model will produce in tool_use output content blocks.

      For example, if you defined tools as:

      [
      {
      "name": "get_stock_price",
      "description": "Get the current stock price for a given ticker symbol.",
      "input_schema": {
      "type": "object",
      "properties": {
      "ticker": {
      "type": "string",
      "description": "The stock ticker symbol, e.g. AAPL for Apple Inc."
      }
      },
      "required": ["ticker"]
      }
      }
      ]

      And then asked the model "What's the S&P 500 at today?", the model might produce tool_use content blocks in the response like this:

      [
      {
      "type": "tool_use",
      "id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
      "name": "get_stock_price",
      "input": { "ticker": "^GSPC" }
      }
      ]

      You might then run your get_stock_price tool with {"ticker": "^GSPC"} as an input, and return the following back to the model in a subsequent user message:

      [
      {
      "type": "tool_result",
      "tool_use_id": "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
      "content": "259.75 USD"
      }
      ]

      Tools can be used for workflows that include running client-side tools and functions, or more generally whenever you want the model to produce a particular JSON structure of output.

      See our guide for more details.

    • Optional top_k?: number

      Only sample from the top K options for each subsequent token.

      Used to remove "long tail" low probability responses. Learn more technical details here.

      Recommended for advanced use cases only. You usually only need to use temperature.

    • Optional top_p?: number

      Use nucleus sampling.

      In nucleus sampling, we compute the cumulative distribution over all the options for each subsequent token in decreasing probability order and cut it off once it reaches a particular probability specified by top_p. You should either alter temperature or top_p, but not both.

      Recommended for advanced use cases only. You usually only need to use temperature.
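
Returning to the tools parameter described above: the round trip from tool_use blocks to tool_result blocks can be sketched as a small dispatcher. The block shapes are copied from the JSON examples under tools. runTools, ToolHandler, the handlers map, and the "Unknown tool" fallback are hypothetical and exist only for this illustration, and the get_stock_price handler is a stub that returns the figure used in the example rather than looking anything up.

```typescript
// Block shapes copied from the tool_use / tool_result examples above.
interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: Record<string, unknown>;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
}

type ToolHandler = (input: Record<string, unknown>) => string;

// Run each requested tool locally and pair its output with the originating id.
function runTools(
  blocks: ToolUseBlock[],
  handlers: Record<string, ToolHandler>,
): ToolResultBlock[] {
  return blocks.map((block): ToolResultBlock => {
    const handler = handlers[block.name];
    const content = handler
      ? handler(block.input)
      : `Unknown tool: ${block.name}`; // hypothetical fallback, not an API convention
    return { type: "tool_result", tool_use_id: block.id, content };
  });
}

// Hypothetical stub standing in for a real price lookup.
const handlers: Record<string, ToolHandler> = {
  get_stock_price: () => "259.75 USD",
};

const results = runTools(
  [
    {
      type: "tool_use",
      id: "toolu_01D7FLrfh4GYq7yT1ULFeyMV",
      name: "get_stock_price",
      input: { ticker: "^GSPC" },
    },
  ],
  handlers,
);
console.log(JSON.stringify(results, null, 2));
```

The resulting array has the same shape as the tool_result example above and would be sent back to the model in a subsequent user message.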