To add all web pages listed in an XML sitemap, pass the sitemap URL and set data_type to 'sitemap'. Non-text files are filtered out. For example:

from embedchain import App

# Create an app with the default configuration
app = App()

# Add every page listed in the sitemap
app.add('https://example.com/sitemap.xml', data_type='sitemap')
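
To give a feel for what "all web pages" and "filters non-text files" could mean in practice, here is a minimal standalone sketch that pulls page URLs out of a sitemap and skips entries that look like binary files. It is not embedchain's implementation: the function name page_urls_from_sitemap, the NON_TEXT_EXTENSIONS list, and the sample sitemap are all assumptions made for illustration, and the library's actual filtering rules may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical list of extensions treated as non-text (an assumption,
# not embedchain's actual filter).
NON_TEXT_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".pdf", ".zip", ".mp4")

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def page_urls_from_sitemap(xml_text):
    """Return the <loc> URLs in a sitemap, skipping likely non-text files."""
    root = ET.fromstring(xml_text)
    urls = []
    for loc in root.findall(".//sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        # Compare against the URL path only, so query strings are ignored.
        path = urlparse(url).path.lower()
        if url and not path.endswith(NON_TEXT_EXTENSIONS):
            urls.append(url)
    return urls


# Made-up sitemap used only to exercise the function.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs/intro</loc></url>
  <url><loc>https://example.com/logo.png</loc></url>
</urlset>"""

print(page_urls_from_sitemap(sample))
```

Running the sketch on the sample keeps the two page URLs and drops the image. With embedchain itself none of this is needed; the single app.add call with data_type='sitemap' is the intended interface.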