LLMs
Info: If you'd like to write your own LLM, see this how-to. If you'd like to contribute an integration, see Contributing integrations.
Features (natively supported)
All LLMs implement the Runnable interface, which comes with default implementations of all methods, i.e. `ainvoke`, `batch`, `abatch`, `stream`, `astream`. This gives all LLMs basic support for async, streaming and batch, which by default is implemented as below (a usage sketch follows the list):
- Async support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the LLM is being executed, by moving this call to a background thread.
- Streaming support defaults to returning an `Iterator` (or `AsyncIterator` in the case of async streaming) of a single value, the final result returned by the underlying LLM provider. This obviously doesn't give you token-by-token streaming, which requires native support from the LLM provider, but ensures your code that expects an iterator of tokens can work for any of our LLM integrations.
- Batch support defaults to calling the underlying LLM in parallel for each input by making use of a thread pool executor (in the sync batch case) or `asyncio.gather` (in the async batch case). The concurrency can be controlled with the `max_concurrency` key in `RunnableConfig`.
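As a minimal sketch of these defaults (assuming the `langchain-openai` package is installed and an API key is configured; `OpenAI` stands in for any LLM integration, since all of them expose the same Runnable methods):

```python
import asyncio

from langchain_openai import OpenAI  # any LLM integration exposes the same interface

llm = OpenAI()

# Sync invocation of a single prompt.
print(llm.invoke("Tell me a joke"))

# Streaming: yields a single final string unless the provider
# implements native token-by-token streaming.
for chunk in llm.stream("Tell me a joke"):
    print(chunk, end="", flush=True)

# Batch: inputs are run in parallel via a thread pool executor;
# max_concurrency in the config caps the parallelism.
print(llm.batch(["Tell me a joke", "Write a haiku"], config={"max_concurrency": 2}))

# Async: by default delegates the sync method to a background thread,
# so other coroutines can make progress in the meantime.
async def main() -> None:
    print(await llm.ainvoke("Tell me a joke"))

asyncio.run(main())
```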
Each LLM integration can optionally provide native implementations for async, streaming or batch, which, for providers that support it, can be more efficient. The table shows, for each integration, which features have been implemented with native support.
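For a concrete sense of what a native implementation looks like, here is a toy custom LLM (a hypothetical `EchoLLM`, sketched on top of `langchain_core`'s `LLM` base class) that overrides `_stream` so that `stream`/`astream` yield real chunks instead of the single-value default:

```python
from typing import Any, Iterator, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.llms import LLM
from langchain_core.outputs import GenerationChunk


class EchoLLM(LLM):
    """Toy LLM that echoes the prompt; used only to illustrate native streaming."""

    @property
    def _llm_type(self) -> str:
        return "echo"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        # The required sync implementation: return the final text in one piece.
        return prompt

    def _stream(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> Iterator[GenerationChunk]:
        # Optional native streaming: yield one chunk per token so that
        # .stream() produces incremental output instead of one final value.
        for token in prompt.split():
            yield GenerationChunk(text=token + " ")
```

With `_stream` defined, `EchoLLM().stream("hello world")` yields one chunk per token rather than a single final string.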
All LLMs
Integration | Description |
---|---|
AI21 Labs | This example goes over how to use LangChain to interact with AI21 Jur... |
Aleph Alpha | The Luminous series is a family of large language models. |
Alibaba Cloud PAI EAS | Machine Learning Platform for AI of Alibaba Cloud is a machine learni... |
Amazon API Gateway | Amazon API Gateway is a fully managed service that makes it easy for ... |
Anthropic | You are currently on a page documenting the use of Anthropic legacy C... |
Anyscale | Anyscale is a fully-managed Ray platform, on which you can build, dep... |
Aphrodite Engine | Aphrodite is the open-source large-scale inference engine designed to... |
Arcee | This notebook demonstrates how to use the Arcee class for generating ... |
Azure ML | Azure ML is a platform used to build, train, and deploy machine learn... |
Azure OpenAI | You are currently on a page documenting the use of Azure OpenAI text ... |
Baichuan LLM | Baichuan Inc. is dedicated to Efficiency, Health, and Happiness. |
Baidu Qianfan | Baidu AI Cloud Qianfan Platform is a one-stop large model development... |
Banana | Banana is focused on building the machine learning infrastructure. |
Baseten | Baseten is a Provider in the LangChain ecosystem that implements the ... |
Beam | Calls the Beam API wrapper to deploy and make subsequent calls to an ... |
Bedrock | You are currently on a page documenting the use of Amazon Bedrock mod... |
Bittensor | Bittensor is a mining network, similar to Bitcoin, that includes buil... |
CerebriumAI | Cerebrium is an AWS Sagemaker alternative. It also provides API acces... |
ChatGLM | ChatGLM-6B is an open bilingual language model based on General Langu... |
Clarifai | Clarifai is an AI Platform that provides the full AI lifecycle rangin... |
Cloudflare Workers AI | Cloudflare AI documentation lists all generative text models availab... |
Cohere | You are currently on a page documenting the use of Cohere models as t... |
C Transformers | The C Transformers library provides Python bindings for GGML models. |
CTranslate2 | CTranslate2 is a C++ and Python library for efficient inference with ... |
Databricks | Databricks Lakehouse Platform unifies data, analytics, and AI on one ... |
DeepInfra | DeepInfra is a serverless inference as a service that provides access... |
DeepSparse | This page covers how to use the DeepSparse inference runtime within L... |
Eden AI | Eden AI is revolutionizing the AI landscape by uniting the best AI pr... |
ExLlamaV2 | ExLlamav2 is a fast inference library for running LLMs locally on mod... |
Fireworks | You are currently on a page documenting the use of Fireworks models a... |
ForefrontAI | The Forefront platform gives you the ability to fine-tune and use ope... |
Friendli | Friendli enhances AI application performance and optimizes cost savin... |
GigaChat | This notebook shows how to use LangChain with GigaChat. |
Google AI | You are currently on a page documenting the use of Google models as t... |
Google Cloud Vertex AI | You are currently on a page documenting the use of Google Vertex text... |
GooseAI | GooseAI is a fully managed NLP-as-a-Service, delivered via API. Goose... |
GPT4All | GitHub:nomic-ai/gpt4all an ecosystem of open-source chatbots trained ... |
Gradient | Gradient allows you to fine-tune and get completions on LLMs with a simpl... |
Huggingface Endpoints | The Hugging Face Hub is a platform with over 120k models, 20k dataset... |
Hugging Face Local Pipelines | Hugging Face models can be run locally through the HuggingFacePipelin... |
IBM watsonx.ai | WatsonxLLM is a wrapper for IBM watsonx.ai foundation models. |
IPEX-LLM | IPEX-LLM is a PyTorch library for running LLM on Intel CPU and GPU (e... |
Javelin AI Gateway Tutorial | This Jupyter Notebook will explore how to interact with the Javelin A... |
JSONFormer | JSONFormer is a library that wraps local Hugging Face pipeline models... |
KoboldAI API | KoboldAI is "a browser-based front-end for AI-assisted writing with... |
Konko | Konko API is a fully managed Web API designed to help application dev... |
Layerup Security | The Layerup Security integration allows you to secure your calls to a... |
Llama.cpp | llama-cpp-python is a Python binding for llama.cpp. |
Llamafile | Llamafile lets you distribute and run LLMs with a single file. |
LM Format Enforcer | LM Format Enforcer is a library that enforces the output format of la... |
Manifest | This notebook goes over how to use Manifest and LangChain. |
Minimax | Minimax is a Chinese startup that provides natural language processin... |
MLX Local Pipelines | MLX models can be run locally through the MLXPipeline class. |
Modal | The Modal cloud platform provides convenient, on-demand access to ser... |
MoonshotChat | Moonshot is a Chinese startup that provides LLM service for companies... |
MosaicML | MosaicML offers a managed inference service. You can either use a var... |
NLP Cloud | The NLP Cloud serves high performance pre-trained or custom models fo... |
oci_generative_ai | Oracle Cloud Infrastructure Generative AI |
OCI Data Science Model Deployment Endpoint | OCI Data Science is a fully managed and serverless platform for data ... |
OctoAI | OctoAI offers easy access to efficient compute and enables users to i... |
Ollama | You are currently on a page documenting the use of Ollama models as t... |
OpaquePrompts | OpaquePrompts is a service that enables applications to leverage the ... |
OpenAI | You are currently on a page documenting the use of OpenAI text comple... |
OpenLLM | 🦾 OpenLLM is an open platform for operating large language models (L... |
OpenLM | OpenLM is a zero-dependency OpenAI-compatible LLM provider that can c... |
OpenVINO | OpenVINO™ is an open-source toolkit for optimizing and deploying AI i... |
Petals | Petals runs 100B+ language models at home, BitTorrent-style. |
PipelineAI | PipelineAI allows you to run your ML models at scale in the cloud. It... |
Predibase | Predibase allows you to train, fine-tune, and deploy any ML model—fro... |
Prediction Guard | Basic LLM usage |
PromptLayer OpenAI | PromptLayer is the first platform that allows you to track, manage, a... |
RELLM | RELLM is a library that wraps local Hugging Face pipeline models for ... |
Replicate | Replicate runs machine learning models in the cloud. We have a librar... |
Runhouse | Runhouse allows remote compute and data across environments and users... |
SageMakerEndpoint | Amazon SageMaker is a system that can build, train, and deploy machin... |
SambaNova | SambaNova's Sambaverse and Sambastudio are platforms for running your... |
Solar | This community integration is deprecated. You should use ChatUpstage ... |
SparkLLM | SparkLLM is a large-scale cognitive model independently developed by ... |
StochasticAI | Stochastic Acceleration Platform aims to simplify the life cycle of a... |
Nebula (Symbl.ai) | Nebula is a large language model (LLM) built by Symbl.ai. It is train... |
TextGen | GitHub:oobabooga/text-generation-webui A gradio web UI for running La... |
Titan Takeoff | TitanML helps businesses build and deploy better, smaller, cheaper, a... |
Together AI | You are currently on a page documenting the use of Together AI models... |
Tongyi Qwen | Tongyi Qwen is a large-scale language model developed by Alibaba's Da... |
vLLM | vLLM is a fast and easy-to-use library for LLM inference and serving,... |
Volc Engine Maas | This notebook provides you with a guide on how to get started with Vo... |
Intel Weight-Only Quantization | Weight-Only Quantization for Huggingface Models with Intel Extension ... |
Writer | Writer is a platform to generate different language content. |
Xorbits Inference (Xinference) | Xinference is a powerful and versatile library designed to serve LLMs... |
YandexGPT | This notebook goes over how to use LangChain with YandexGPT. |
Yi | 01.AI, founded by Dr. Kai-Fu Lee, is a global company at the forefron... |
Yuan2.0 | Yuan2.0 is a new generation Fundamental Large Language Model develope... |