Here, “custom” means that you supply your own loader and chunker, tailored to your data source, by passing them to the add method.

from embedchain import App
# my_module is a placeholder for the module that defines your own classes
from my_module import CustomChunker, CustomLoader

app = App()
loader = CustomLoader()
chunker = CustomChunker()

# "source" is whatever input your loader's load method understands
app.add("source", data_type="custom", loader=loader, chunker=chunker)

The custom loader and chunker must be classes that inherit from BaseLoader and BaseChunker, respectively.
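As a rough illustration of the loader half of that requirement, the sketch below defines a hypothetical ParagraphLoader. The import path embedchain.loaders.base_loader, the load_data method name, and the shape of the returned dictionary (doc_id plus a list of content/meta_data entries) are assumptions based on how embedchain's bundled loaders appear to be written, so check them against the version you have installed. A stand-in base class is used when embedchain is not available, purely so the snippet runs on its own.

```python
import hashlib

try:
    # Assumed import path; verify against your installed embedchain version.
    from embedchain.loaders.base_loader import BaseLoader
except ImportError:
    class BaseLoader:
        """Stand-in so this sketch runs without embedchain installed."""


class ParagraphLoader(BaseLoader):
    """Hypothetical loader: treats the source string itself as the text and
    emits one entry per blank-line-separated paragraph."""

    def load_data(self, source):
        # Split on blank lines and drop empty fragments.
        paragraphs = [p.strip() for p in source.split("\n\n") if p.strip()]
        # Stable identifier derived from the raw input.
        doc_id = hashlib.sha256(source.encode("utf-8")).hexdigest()
        return {
            "doc_id": doc_id,
            "data": [
                {"content": p, "meta_data": {"url": "local", "paragraph": i}}
                for i, p in enumerate(paragraphs)
            ],
        }


result = ParagraphLoader().load_data("First paragraph.\n\nSecond paragraph.")
print(len(result["data"]))
```

An instance of such a class could then be passed as the loader argument of add. A custom chunker would follow the same pattern with BaseChunker as its parent.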

If the data_type is not a valid data type, the add method falls back to the custom data type and expects the user to pass a custom loader and chunker.
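The fallback described above can be pictured with a small, self-contained sketch. This is not embedchain's actual code: the DataType enum here lists only an illustrative subset of types, and resolve_data_type is a hypothetical helper introduced for this example.

```python
from enum import Enum


class DataType(Enum):
    # Illustrative subset only; the real list of data types is longer.
    PDF_FILE = "pdf_file"
    WEB_PAGE = "web_page"
    CUSTOM = "custom"


def resolve_data_type(name):
    """Return the matching DataType, or DataType.CUSTOM if name is unknown."""
    try:
        return DataType(name)
    except ValueError:
        # Unknown names fall back to the custom type.
        return DataType.CUSTOM
```

Under this sketch, a recognized name such as "web_page" resolves to its own type, while an unrecognized name resolves to the custom type, at which point a user-supplied loader is needed to read the source.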

Example:

from embedchain import App
from embedchain.loaders.github import GithubLoader

app = App()

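# Replace the placeholder with your own GitHub personal access token
loader = GithubLoader(config={"token": "ghp_xxx"})

# The loader is passed explicitly alongside the data type
app.add("repo:embedchain/embedchain type:repo", data_type="github", loader=loader)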

app.query("What is Embedchain?")
# Answer: Embedchain is a Data Platform for Large Language Models (LLMs). It allows users to seamlessly load, index, retrieve, and sync unstructured data in order to build dynamic, LLM-powered applications. There is also a JavaScript implementation called embedchain-js available on GitHub.