⚙️ Custom
“Custom” means that you can tailor the loader and chunker to your needs by passing your own loader and chunker to the add method.
from embedchain import App
from my_module import CustomLoader  # your BaseLoader subclass
from my_module import CustomChunker  # your BaseChunker subclass

app = App()
loader = CustomLoader()
chunker = CustomChunker()
# Pass both custom components explicitly along with data_type="custom".
app.add("source", data_type="custom", loader=loader, chunker=chunker)
The custom loader and chunker must be classes that inherit from the BaseLoader and BaseChunker classes, respectively.
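To make the inheritance requirement concrete, here is a minimal sketch of what such a pair of classes might look like. It is not Embedchain's implementation: the BaseLoader and BaseChunker below are local stand-ins so the snippet runs on its own, and the method names (load_data, create_chunks), the returned dictionary shape, and the chunk_size parameter are assumptions made for illustration. In a real project you would import the base classes from embedchain and check their actual signatures.

```python
import hashlib


class BaseLoader:
    """Local stand-in for embedchain's BaseLoader (interface assumed)."""

    def load_data(self, source):
        raise NotImplementedError


class BaseChunker:
    """Local stand-in for embedchain's BaseChunker (interface assumed)."""

    def create_chunks(self, loader, source):
        raise NotImplementedError


class CustomLoader(BaseLoader):
    """Hypothetical loader: treats the source string itself as the document text."""

    def load_data(self, source):
        # Derive a stable document id from the content.
        doc_id = hashlib.sha256(source.encode("utf-8")).hexdigest()
        return {
            "doc_id": doc_id,
            "data": [{"content": source, "meta_data": {"url": "local"}}],
        }


class CustomChunker(BaseChunker):
    """Hypothetical chunker: splits text into fixed-size character windows."""

    def __init__(self, chunk_size=20):
        self.chunk_size = chunk_size

    def create_chunks(self, loader, source):
        loaded = loader.load_data(source)
        chunks = []
        for doc in loaded["data"]:
            text = doc["content"]
            # Step through the text in non-overlapping windows.
            for start in range(0, len(text), self.chunk_size):
                chunks.append(text[start:start + self.chunk_size])
        return {"doc_id": loaded["doc_id"], "documents": chunks}


text = "Embedchain loads, chunks, and indexes data."
result = CustomChunker(chunk_size=20).create_chunks(CustomLoader(), text)
print(result["documents"])
```

The loader is responsible only for turning a source into raw documents, while the chunker decides how those documents are split; keeping the two separate is what lets you swap either one independently.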
If the data_type is not a valid data type, the add method will fall back to the custom data type and expect the user to pass a custom loader and chunker.
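The fallback rule can be pictured with a small conceptual sketch. This is not the library's code: resolve_data_type is a hypothetical helper, KNOWN_DATA_TYPES is an illustrative subset rather than the full list of supported types, and raising ValueError when a component is missing is an assumption about how such a check could behave.

```python
# Illustrative subset of built-in data types (assumption, not the full list).
KNOWN_DATA_TYPES = {"web_page", "pdf_file", "youtube_video"}


def resolve_data_type(data_type, loader=None, chunker=None):
    """Return the data type to use, falling back to "custom" for unknown names."""
    if data_type in KNOWN_DATA_TYPES:
        return data_type
    # Unknown name: treat it as custom, which needs user-supplied components.
    if loader is None or chunker is None:
        raise ValueError(
            f"Unknown data type {data_type!r}: pass both a loader and a chunker."
        )
    return "custom"


print(resolve_data_type("pdf_file"))
print(resolve_data_type("my_format", loader=object(), chunker=object()))
```

The practical takeaway is that a misspelled data_type may not fail as a typo; it can instead be treated as a custom type, so an error about a missing loader or chunker is worth reading as a hint to check the spelling.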
Example:
from embedchain import App
from embedchain.loaders.github import GithubLoader
app = App()
loader = GithubLoader(config={"token": "ghp_xxx"})
app.add("repo:embedchain/embedchain type:repo", data_type="github", loader=loader)
app.query("What is Embedchain?")
# Answer: Embedchain is a Data Platform for Large Language Models (LLMs). It allows users to seamlessly load, index, retrieve, and sync unstructured data in order to build dynamic, LLM-powered applications. There is also a JavaScript implementation called embedchain-js available on GitHub.