LLMs
Info: If you'd like to write your own LLM, see this how-to. If you'd like to contribute an integration, see Contributing integrations.
Features (natively supported)
All LLMs implement the Runnable interface, which comes with default implementations of all methods, i.e. `ainvoke`, `batch`, `abatch`, `stream`, `astream`. This gives all LLMs basic support for async, streaming and batch, which by default is implemented as below (a usage sketch follows the list):
- Async support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the LLM is being executed, by moving this call to a background thread.
- Streaming support defaults to returning an `Iterator` (or `AsyncIterator` in the case of async streaming) of a single value, the final result returned by the underlying LLM provider. This obviously doesn't give you token-by-token streaming, which requires native support from the LLM provider, but ensures your code that expects an iterator of tokens can work for any of our LLM integrations.
- Batch support defaults to calling the underlying LLM in parallel for each input by making use of a thread pool executor (in the sync batch case) or `asyncio.gather` (in the async batch case). The concurrency can be controlled with the `max_concurrency` key in `RunnableConfig`.
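As a minimal sketch of these defaults (assuming the `langchain-openai` package is installed and an API key is configured; `OpenAI` stands in for any LLM integration, since all of them expose the same Runnable methods):

```python
import asyncio

from langchain_openai import OpenAI  # any LLM integration exposes the same interface

llm = OpenAI()

# Sync invocation of a single prompt.
print(llm.invoke("Tell me a joke"))

# Streaming: yields a single final string unless the provider
# implements native token-by-token streaming.
for chunk in llm.stream("Tell me a joke"):
    print(chunk, end="", flush=True)

# Batch: inputs are run in parallel via a thread pool executor;
# max_concurrency in the config caps the parallelism.
print(llm.batch(["Tell me a joke", "Write a haiku"], config={"max_concurrency": 2}))

# Async: by default delegates the sync method to a background thread,
# so other coroutines can make progress in the meantime.
async def main() -> None:
    print(await llm.ainvoke("Tell me a joke"))

asyncio.run(main())
```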
Each LLM integration can optionally provide native implementations for async, streaming or batch, which, for providers that support it, can be more efficient. The table shows, for each integration, which features have been implemented with native support.
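For a concrete sense of what a native implementation looks like, here is a toy custom LLM (a hypothetical `EchoLLM`, sketched on top of `langchain_core`'s `LLM` base class) that overrides `_stream` so that `stream`/`astream` yield real chunks instead of the single-value default:

```python
from typing import Any, Iterator, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk


class EchoLLM(LLM):
    """Toy LLM that echoes the prompt; used only to illustrate native streaming."""

    @property
    def _llm_type(self) -> str:
        return "echo"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        # The required sync implementation: return the final text in one piece.
        return prompt

    def _stream(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[GenerationChunk]:
        # Optional native streaming: yield one chunk per token so that
        # .stream() produces incremental output instead of one final value.
        for token in prompt.split():
            yield GenerationChunk(text=token + " ")
```

With `_stream` defined, `EchoLLM().stream("hello world")` yields one chunk per token rather than a single final string.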
All LLMs
Integration | Description |
---|---|
AI21 Labs | This example goes over how to use LangChain to interact with AI21 Jur... |
Aleph Alpha | The Luminous series is a family of large language models. |
Alibaba Cloud PAI EAS | Machine Learning Platform for AI of Alibaba Cloud is a machine learni... |
Amazon API Gateway | Amazon API Gateway is a fully managed service that makes it easy for ... |
Anthropic | You are currently on a page documenting the use of Anthropic legacy C... |
Anyscale | Anyscale is a fully-managed Ray platform, on which you can build, dep... |
Aphrodite Engine | Aphrodite is the open-source large-scale inference engine designed to... |
Arcee | This notebook demonstrates how to use the Arcee class for generating ... |
Azure ML | Azure ML is a platform used to build, train, and deploy machine learn... |
Azure OpenAI | You are currently on a page documenting the use of Azure OpenAI text ... |
Baichuan LLM | Baichuan Inc. is dedicated to Efficiency, Health, and Happiness. |
Baidu Qianfan | Baidu AI Cloud Qianfan Platform is a one-stop large model development... |
Banana | Banana is focused on building the machine learning infrastructure. |
Baseten | Baseten is a Provider in the LangChain ecosystem that implements the ... |
Beam | Calls the Beam API wrapper to deploy and make subsequent calls to an ... |
Bedrock | You are currently on a page documenting the use of Amazon Bedrock mod... |
Bittensor | Bittensor is a mining network, similar to Bitcoin, that includes buil... |
CerebriumAI | Cerebrium is an AWS Sagemaker alternative. It also provides API acces... |
ChatGLM | ChatGLM-6B is an open bilingual language model based on General Langu... |
Clarifai | Clarifai is an AI Platform that provides the full AI lifecycle rangin... |
Cloudflare Workers AI | Cloudflare AI documentation lists all generative text models availab... |
Cohere | You are currently on a page documenting the use of Cohere models as t... |
C Transformers | The C Transformers library provides Python bindings for GGML models. |
CTranslate2 | CTranslate2 is a C++ and Python library for efficient inference with ... |
Databricks | Databricks Lakehouse Platform unifies data, analytics, and AI on one ... |
DeepInfra | DeepInfra is a serverless inference as a service that provides access... |
DeepSparse | This page covers how to use the DeepSparse inference runtime within L... |
Eden AI | Eden AI is revolutionizing the AI landscape by uniting the best AI pr... |
ExLlamaV2 | ExLlamav2 is a fast inference library for running LLMs locally on mod... |
Fireworks | You are currently on a page documenting the use of Fireworks models a... |
ForefrontAI | The Forefront platform gives you the ability to fine-tune and use ope... |
Friendli | Friendli enhances AI application performance and optimizes cost savin... |
GigaChat | This notebook shows how to use LangChain with GigaChat. |
Google AI | You are currently on a page documenting the use of Google models as t... |
Google Cloud Vertex AI | You are currently on a page documenting the use of Google Vertex text... |
GooseAI | GooseAI is a fully managed NLP-as-a-Service, delivered via API. Goose... |
GPT4All | GitHub:nomic-ai/gpt4all an ecosystem of open-source chatbots trained ... |
Gradient | Gradient allows you to fine-tune and get completions on LLMs with a simpl... |
Huggingface Endpoints | The Hugging Face Hub is a platform with over 120k models, 20k dataset... |
Hugging Face Local Pipelines | Hugging Face models can be run locally through the HuggingFacePipelin... |
IBM watsonx.ai | WatsonxLLM is a wrapper for IBM watsonx.ai foundation models. |
IPEX-LLM | IPEX-LLM is a PyTorch library for running LLM on Intel CPU and GPU (e... |
Javelin AI Gateway Tutorial | This Jupyter Notebook will explore how to interact with the Javelin A... |
JSONFormer | JSONFormer is a library that wraps local Hugging Face pipeline models... |
KoboldAI API | KoboldAI is "a browser-based front-end for AI-assisted writing with... |
Konko | Konko API is a fully managed Web API designed to help application dev... |
Layerup Security | The Layerup Security integration allows you to secure your calls to a... |
Llama.cpp | llama-cpp-python is a Python binding for llama.cpp. |
Llamafile | Llamafile lets you distribute and run LLMs with a single file. |
LM Format Enforcer | LM Format Enforcer is a library that enforces the output format of la... |
Manifest | This notebook goes over how to use Manifest and LangChain. |
Minimax | Minimax is a Chinese startup that provides natural language processin... |
MLX Local Pipelines | MLX models can be run locally through the MLXPipeline class. |
Modal | The Modal cloud platform provides convenient, on-demand access to ser... |
MoonshotChat | Moonshot is a Chinese startup that provides LLM service for companies... |
MosaicML | MosaicML offers a managed inference service. You can either use a var... |
NLP Cloud | The NLP Cloud serves high performance pre-trained or custom models fo... |
oci_generative_ai | Oracle Cloud Infrastructure Generative AI |
OCI Data Science Model Deployment Endpoint | OCI Data Science is a fully managed and serverless platform for data ... |
OctoAI | OctoAI offers easy access to efficient compute and enables users to i... |
Ollama | You are currently on a page documenting the use of Ollama models as t... |
OpaquePrompts | OpaquePrompts is a service that enables applications to leverage the ... |
OpenAI | You are currently on a page documenting the use of OpenAI text comple... |
OpenLLM | 🦾 OpenLLM is an open platform for operating large language models (L... |
OpenLM | OpenLM is a zero-dependency OpenAI-compatible LLM provider that can c... |
OpenVINO | OpenVINO™ is an open-source toolkit for optimizing and deploying AI i... |
Petals | Petals runs 100B+ language models at home, BitTorrent-style. |
PipelineAI | PipelineAI allows you to run your ML models at scale in the cloud. It... |
Predibase | Predibase allows you to train, fine-tune, and deploy any ML model—fro... |
Prediction Guard | Basic LLM usage |
PromptLayer OpenAI | PromptLayer is the first platform that allows you to track, manage, a... |
RELLM | RELLM is a library that wraps local Hugging Face pipeline models for ... |
Replicate | Replicate runs machine learning models in the cloud. We have a librar... |
Runhouse | Runhouse allows remote compute and data across environments and users... |
SageMakerEndpoint | Amazon SageMaker is a system that can build, train, and deploy machin... |
SambaNova | SambaNova's Sambaverse and Sambastudio are platforms for running your... |
Solar | This community integration is deprecated. You should use ChatUpstage ... |
SparkLLM | SparkLLM is a large-scale cognitive model independently developed by ... |
StochasticAI | Stochastic Acceleration Platform aims to simplify the life cycle of a... |
Nebula (Symbl.ai) | Nebula is a large language model (LLM) built by Symbl.ai. It is train... |
TextGen | GitHub:oobabooga/text-generation-webui A gradio web UI for running La... |
Titan Takeoff | TitanML helps businesses build and deploy better, smaller, cheaper, a... |
Together AI | You are currently on a page documenting the use of Together AI models... |
Tongyi Qwen | Tongyi Qwen is a large-scale language model developed by Alibaba's Da... |
vLLM | vLLM is a fast and easy-to-use library for LLM inference and serving,... |
Volc Engine Maas | This notebook provides you with a guide on how to get started with Vo... |
Intel Weight-Only Quantization | Weight-Only Quantization for Huggingface Models with Intel Extension ... |
Writer | Writer is a platform to generate different language content. |
Xorbits Inference (Xinference) | Xinference is a powerful and versatile library designed to serve LLMs... |
YandexGPT | This notebook goes over how to use LangChain with YandexGPT. |
Yi | 01.AI, founded by Dr. Kai-Fu Lee, is a global company at the forefron... |
Yuan2.0 | Yuan2.0 is a new generation Fundamental Large Language Model develope... |