🤖 Large language models (LLMs)
Overview
Embedchain comes with built-in support for various popular large language models. We handle the complexity of integrating these models for you, allowing you to easily customize your language model interactions through a user-friendly interface.
- OpenAI
- Google AI
- Azure OpenAI
- Anthropic
- Cohere
- Together
- Ollama
- vLLM
- Clarifai
- GPT4All
- JinaChat
- Hugging Face
- Llama2
- Vertex AI
- Mistral AI
- AWS Bedrock
- Groq
- NVIDIA AI
OpenAI
To use OpenAI LLM models, you have to set the OPENAI_API_KEY environment variable. You can obtain the OpenAI API key from the OpenAI Platform.
Once you have obtained the key, you can use it like this:
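A minimal sketch (the ingested URL and question are placeholders):

```python
import os
from embedchain import App

os.environ["OPENAI_API_KEY"] = "sk-xxx"  # your OpenAI API key

app = App()  # the default app uses an OpenAI chat model
app.add("https://en.wikipedia.org/wiki/OpenAI")  # ingest a source
answer = app.query("What is OpenAI?")
print(answer)
```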
If you want to configure different parameters of the LLM, you can do so by loading the app using a YAML config file.
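For example, a sketch using an inline config dict (the same structure can live in a YAML file passed via config_path; the parameter values below are placeholders):

```python
from embedchain import App

# Equivalent YAML would go in config.yaml and load via:
#   app = App.from_config(config_path="config.yaml")
app = App.from_config(config={
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-3.5-turbo",  # placeholder model
            "temperature": 0.5,
            "max_tokens": 1000,
            "top_p": 1,
            "stream": False,
        },
    }
})
```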
Function Calling
Embedchain supports OpenAI function calling with a single function. It accepts inputs in accordance with the LangChain interface, such as a Pydantic model, a Python function, or an OpenAI tool schema. With any of these inputs, the OpenAI LLM can be queried to provide the appropriate arguments for the function.
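A sketch of the idea; the OpenAILlm import path and its tools argument are assumptions drawn from the Embedchain source, so verify them against your installed version:

```python
from embedchain import App
from embedchain.llm.openai import OpenAILlm  # assumed import path

def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# Pass a plain Python function (a Pydantic model or an OpenAI tool
# schema is assumed to work the same way)
llm = OpenAILlm(tools=multiply)  # `tools` argument is an assumption
app = App(llm=llm)
app.query("What is 125 multiplied by 15?")
```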
Google AI
To use Google AI models, you have to set the GOOGLE_API_KEY environment variable. You can obtain the Google API key from the Google Maker Suite.
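A hedged sketch (the provider name and model id are assumptions; check the Embedchain config reference):

```python
import os
from embedchain import App

os.environ["GOOGLE_API_KEY"] = "xxx"

app = App.from_config(config={
    "llm": {
        "provider": "google",  # assumed provider name
        "config": {"model": "gemini-pro", "temperature": 0.5},  # placeholder values
    }
})
```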
Azure OpenAI
To use an Azure OpenAI model, you have to set the Azure OpenAI-related environment variables, as shown in the code block below:
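A sketch of the variables involved; the exact variable names depend on your OpenAI SDK version and the config keys are assumptions, so double-check them against your Azure resource:

```python
import os
from embedchain import App

# Assumed variable names; fill in the real values from your Azure OpenAI resource
os.environ["OPENAI_API_TYPE"] = "azure"
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://<your-resource>.openai.azure.com/"
os.environ["AZURE_OPENAI_API_KEY"] = "xxx"
os.environ["OPENAI_API_VERSION"] = "2024-02-01"  # placeholder API version

app = App.from_config(config={
    "llm": {
        "provider": "azure_openai",  # assumed provider name
        "config": {
            "model": "gpt-35-turbo",            # placeholder model
            "deployment_name": "my-deployment",  # hypothetical deployment name
        },
    }
})
```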
You can find the list of models and deployment names on the Azure OpenAI Platform.
Anthropic
To use Anthropic’s models, set the ANTHROPIC_API_KEY environment variable, which you can find on their Account Settings page.
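A hedged sketch (the provider name and model id are assumptions):

```python
import os
from embedchain import App

os.environ["ANTHROPIC_API_KEY"] = "xxx"

app = App.from_config(config={
    "llm": {
        "provider": "anthropic",  # assumed provider name
        "config": {"model": "claude-instant-1", "temperature": 0.5},  # placeholders
    }
})
```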
Cohere
Install related dependencies using the following command:
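A typical command (the exact extra name is an assumption; check the Embedchain docs):

```bash
pip install --upgrade 'embedchain[cohere]'
```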
Set the COHERE_API_KEY as an environment variable, which you can find on their Account settings page. Once you have the API key, you are all set to use it with Embedchain:
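A hedged sketch (the provider name and model id are assumptions):

```python
import os
from embedchain import App

os.environ["COHERE_API_KEY"] = "xxx"

app = App.from_config(config={
    "llm": {
        "provider": "cohere",  # assumed provider name
        "config": {"model": "command", "temperature": 0.5, "max_tokens": 1000},
    }
})
```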
Together
Install related dependencies using the following command:
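A typical command (the exact extra name is an assumption):

```bash
pip install --upgrade 'embedchain[together]'
```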
Set the TOGETHER_API_KEY as an environment variable, which you can find on their Account settings page. Once you have the API key, you are all set to use it with Embedchain:
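A hedged sketch (the provider name and model id are assumptions):

```python
import os
from embedchain import App

os.environ["TOGETHER_API_KEY"] = "xxx"

app = App.from_config(config={
    "llm": {
        "provider": "together",  # assumed provider name
        "config": {"model": "mistralai/Mixtral-8x7B-Instruct-v0.1"},  # placeholder model
    }
})
```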
Ollama
Set up Ollama by following the instructions at https://github.com/jmorganca/ollama. Once it is running locally, you can use it with Embedchain like this:
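A hedged sketch (the provider name and model are assumptions; the base URL is Ollama's default):

```python
from embedchain import App

# Assumes an Ollama server is running locally with the model already pulled
app = App.from_config(config={
    "llm": {
        "provider": "ollama",  # assumed provider name
        "config": {
            "model": "llama2",                     # placeholder model
            "base_url": "http://localhost:11434",  # default Ollama address
        },
    }
})
```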
vLLM
Set up vLLM by following the instructions in their docs. Then use it with Embedchain like this:
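A hedged sketch (the provider name and model id are assumptions):

```python
from embedchain import App

app = App.from_config(config={
    "llm": {
        "provider": "vllm",  # assumed provider name
        "config": {
            "model": "mosaicml/mpt-7b",  # placeholder model
            "temperature": 0.5,
        },
    }
})
```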
Clarifai
Install related dependencies using the following command:
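A typical command (the exact extra name is an assumption; it may instead be a direct clarifai install):

```bash
pip install --upgrade 'embedchain[clarifai]'
```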
Set the CLARIFAI_PAT as an environment variable, which you can find on the Security page. Optionally, you can also pass the PAT key as a parameter to the LLM/Embedder class. Now you are all set to explore Embedchain:
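A hedged sketch (the provider name and model URL are assumptions; Clarifai models are addressed by URL):

```python
import os
from embedchain import App

os.environ["CLARIFAI_PAT"] = "xxx"

app = App.from_config(config={
    "llm": {
        "provider": "clarifai",  # assumed provider name
        "config": {
            # Placeholder model URL; point this at the Clarifai model you want
            "model": "https://clarifai.com/openai/chat-completion/models/GPT-4",
            # "api_key": "xxx",  # alternatively pass the PAT here instead of the env var
        },
    }
})
```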
GPT4All
Install related dependencies using the following command:
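A typical command (the exact extra name is an assumption):

```bash
pip install --upgrade 'embedchain[opensource]'
```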
GPT4All is a free-to-use, locally running, privacy-aware chatbot that requires no GPU or internet connection. You can use it with Embedchain using the following code:
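A hedged sketch (the provider name and local model file are assumptions):

```python
from embedchain import App

app = App.from_config(config={
    "llm": {
        "provider": "gpt4all",  # assumed provider name
        "config": {
            "model": "orca-mini-3b-gguf2-q4_0.gguf",  # placeholder local model file
            "temperature": 0.5,
            "max_tokens": 1000,
        },
    }
})
```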
JinaChat
First, set the JINACHAT_API_KEY environment variable, which you can obtain from their platform. Once you have the key, load the app using the config yaml file:
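A hedged sketch (the provider name is an assumption):

```python
import os
from embedchain import App

os.environ["JINACHAT_API_KEY"] = "xxx"

app = App.from_config(config={
    "llm": {
        "provider": "jina",  # assumed provider name
        "config": {"temperature": 0.5, "max_tokens": 1000},
    }
})
```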
Hugging Face
Install related dependencies using the following command:
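A typical command (the exact extra name is an assumption):

```bash
pip install --upgrade 'embedchain[huggingface-hub]'
```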
First, set the HUGGINGFACE_ACCESS_TOKEN environment variable, which you can obtain from their platform. You can load LLMs from Hugging Face in three ways:
Hugging Face Hub
To load the model from Hugging Face Hub, use the following code:
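A hedged sketch (the provider name and Hub model id are assumptions):

```python
import os
from embedchain import App

os.environ["HUGGINGFACE_ACCESS_TOKEN"] = "hf_xxx"

app = App.from_config(config={
    "llm": {
        "provider": "huggingface",  # assumed provider name
        "config": {
            "model": "google/flan-t5-xxl",  # placeholder Hub model id
            "temperature": 0.5,
            "max_tokens": 1000,
        },
    }
})
```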
Hugging Face Local Pipelines
If you want to load a locally downloaded model from Hugging Face, you can do so using the code below:
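A hedged sketch; the local flag is an assumption about how Embedchain switches to a local pipeline, so verify it against the config reference:

```python
from embedchain import App

app = App.from_config(config={
    "llm": {
        "provider": "huggingface",
        "config": {
            "model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # placeholder local model
            "local": True,  # assumed flag for running the pipeline locally
            "top_p": 0.5,
            "max_tokens": 1000,
        },
    }
})
```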
Hugging Face Inference Endpoint
You can also use Hugging Face Inference Endpoints to access custom endpoints. First, set the HUGGINGFACE_ACCESS_TOKEN as above.
Then, load the app using the config yaml file:
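A hedged sketch (the endpoint URL is hypothetical and the endpoint/model_params keys are assumptions):

```python
from embedchain import App

app = App.from_config(config={
    "llm": {
        "provider": "huggingface",
        "config": {
            # Hypothetical endpoint URL; point this at your own Inference Endpoint
            "endpoint": "https://api-inference.huggingface.co/models/gpt2",
            "model_params": {"max_new_tokens": 100, "temperature": 0.5},  # assumed key
        },
    }
})
```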
Currently, only text-generation and text2text-generation tasks are supported [ref]. See LangChain’s Hugging Face endpoint documentation for more information.
Llama2
Llama2 is integrated through Replicate. Set the REPLICATE_API_TOKEN environment variable, which you can obtain from their platform. Once you have the token, load the app using the config yaml file:
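A hedged sketch (the provider name is an assumption, and the Replicate model reference below is a placeholder, not a real version hash):

```python
import os
from embedchain import App

os.environ["REPLICATE_API_TOKEN"] = "xxx"

app = App.from_config(config={
    "llm": {
        "provider": "llama2",  # assumed provider name
        "config": {
            # Replicate models are referenced as "owner/name:version"; placeholder below
            "model": "a16z-infra/llama13b-v2-chat:<version-hash>",
            "temperature": 0.5,
            "max_tokens": 1000,
        },
    }
})
```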
Vertex AI
Set up Google Cloud Platform application credentials by following the instructions on GCP. Once setup is done, use the following code to create an app using Vertex AI as the provider:
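A hedged sketch (the provider name and model id are assumptions):

```python
from embedchain import App

# Assumes GCP application credentials (e.g. gcloud auth) are already configured
app = App.from_config(config={
    "llm": {
        "provider": "vertexai",  # assumed provider name
        "config": {"model": "chat-bison", "temperature": 0.5},  # placeholder model
    }
})
```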
Mistral AI
Obtain the Mistral AI API key from their console. Then use it with Embedchain like this:
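A hedged sketch (the environment variable name, provider name, and model id are assumptions):

```python
import os
from embedchain import App

os.environ["MISTRAL_API_KEY"] = "xxx"  # assumed variable name

app = App.from_config(config={
    "llm": {
        "provider": "mistralai",  # assumed provider name
        "config": {"model": "mistral-tiny", "temperature": 0.5},  # placeholder model
    }
})
```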
AWS Bedrock
Setup
- Before using the AWS Bedrock LLM, make sure you have the appropriate model access from the Bedrock Console.
- You will also need to authenticate the boto3 client by using a method in the AWS documentation.
- You can optionally export an AWS_REGION.
Usage
The model arguments are different for each provider. Please refer to the AWS Bedrock Documentation to find the appropriate arguments for your model.
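A hedged sketch (the provider name, Bedrock model id, and model_kwargs key are assumptions):

```python
from embedchain import App

# Assumes boto3 credentials and Bedrock model access are already configured
app = App.from_config(config={
    "llm": {
        "provider": "aws_bedrock",  # assumed provider name
        "config": {
            "model": "amazon.titan-text-express-v1",  # placeholder Bedrock model id
            "model_kwargs": {"temperature": 0.5},     # provider-specific arguments
        },
    }
})
```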
Groq
Groq is the creator of the world’s first Language Processing Unit (LPU), providing exceptional speed performance for AI workloads running on their LPU Inference Engine.
Usage
To use LLMs from Groq, go to their platform and get the API key. Set the API key as the GROQ_API_KEY environment variable, or pass it in your app configuration, to use the model as shown in the example below.
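A hedged sketch (the provider name and model id are assumptions):

```python
import os
from embedchain import App

os.environ["GROQ_API_KEY"] = "gsk_xxx"

app = App.from_config(config={
    "llm": {
        "provider": "groq",  # assumed provider name
        # "api_key" could alternatively be passed here instead of the env var
        "config": {"model": "mixtral-8x7b-32768", "stream": True},  # placeholders
    }
})
```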
NVIDIA AI
NVIDIA AI Foundation Endpoints let you quickly use NVIDIA’s AI models, such as Mixtral 8x7B and Llama 2, through their API. These models are available in the NVIDIA NGC catalog, fully optimized and ready to use on NVIDIA’s AI platform. They are designed for high speed and easy customization, ensuring smooth performance on any accelerated setup.
Usage
To use LLMs from NVIDIA AI, create an account on NVIDIA NGC Service. Generate an API key from their dashboard. Set the API key as the NVIDIA_API_KEY environment variable. Note that the NVIDIA_API_KEY will start with nvapi-. Below is an example of how to use an LLM and an embedding model from NVIDIA AI:
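A hedged sketch (the provider names and model ids are assumptions):

```python
import os
from embedchain import App

os.environ["NVIDIA_API_KEY"] = "nvapi-xxx"

app = App.from_config(config={
    "llm": {
        "provider": "nvidia",  # assumed provider name
        "config": {"model": "mistralai/mixtral-8x7b-instruct-v0.1"},  # placeholder
    },
    "embedder": {
        "provider": "nvidia",                 # assumed provider name
        "config": {"model": "nvolveqa_40k"},  # placeholder embedding model
    },
})
```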
Token Usage
You can get the cost of the query by setting token_usage to True in the config file. This will return the token details: prompt_tokens, completion_tokens, total_tokens, total_cost, cost_currency.
The list of paid LLMs that support token usage are:
- OpenAI
- Vertex AI
- Anthropic
- Cohere
- Together
- Groq
- Mistral AI
- NVIDIA AI
Here is an example of how to use token usage:
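A hedged sketch; the exact return shape of query with token_usage enabled is an assumption, so verify it against the Embedchain docs:

```python
import os
from embedchain import App

os.environ["OPENAI_API_KEY"] = "sk-xxx"

app = App.from_config(config={
    "llm": {
        "provider": "openai",
        "config": {"model": "gpt-3.5-turbo", "token_usage": True},
    }
})
app.add("https://en.wikipedia.org/wiki/OpenAI")

# With token_usage enabled, the query is assumed to also return the token details
answer, token_info = app.query("What is OpenAI?")
print(token_info)  # prompt_tokens, completion_tokens, total_tokens, total_cost, cost_currency
```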
If a model is missing and you’d like to add it to model_prices_and_context_window.json, please feel free to open a PR.
If you can't find the specific LLM you need, no need to fret. We're continuously expanding our support for additional LLMs, and you can help us prioritize by opening an issue on our GitHub or simply reaching out to us on our Slack or Discord community.