We are excited to share an upcoming improvement in search quality: with an improved chunking algorithm, Moveworks AI Assistant will now answer questions from long KB articles more accurately.
This change will be rolled out gradually and is expected to be available for all customers in all regions by Thursday, December 18.
What problem are we addressing?
We are improving search performance on articles that produce very long chunks. A chunk is a sub-part of an article that Moveworks extracts during ingestion.
“Chunking” (sometimes referred to as “snippetization” in Moveworks terminology) is an important part of processing KB articles and web-format documents: it breaks every article down into smaller parts when the article is ingested.
Every chunk is converted into an embedding, a semantic representation of the text that captures what the chunk is about. Embeddings enable Moveworks Search to precisely match the specific part of an article that is relevant to a user’s question, and they allow the AI Assistant to surface that part in a citation if it is used in the response. This is the underpinning of our semantic search capabilities.
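To make the mechanics concrete, here is a minimal sketch of embedding-based matching. The `embed` function below is a toy stand-in (a hashed bag of words) for a real learned embedding model, and `best_chunk` is a hypothetical helper; neither reflects Moveworks’ actual implementation.

```python
import hashlib
import re

import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy stand-in for an embedding model: hash each word into a
    fixed-size vector and normalize. Real embeddings are learned."""
    vec = np.zeros(dim)
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def best_chunk(question: str, chunks: list[str]) -> str:
    """Return the chunk whose embedding is most similar to the question.
    With unit-length vectors, cosine similarity is just a dot product."""
    q = embed(question)
    return max(chunks, key=lambda c: float(np.dot(q, embed(c))))

# Example: the question matches the password chunk, not the VPN chunk.
print(best_chunk("How do I reset my password?",
                 ["To reset your password, open the account portal...",
                  "To connect to the VPN, install the client..."]))
```

The key point is that each chunk is scored independently against the question, which is why smaller, more focused chunks produce sharper matches.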
Moveworks uses HTML headers to divide KB articles into chunks. While this usually produces an even distribution of moderately sized chunks, individual chunks can be as long as 1,000 words if chunking occurs at a higher header level, or if the document does not use headers consistently.
Embeddings generated from such large chunks are generally less effective than those from smaller, more focused chunks: the chunk contains too much text for the embedding to precisely capture the essence of the content, so the signal it creates is too diffuse to be matched precisely to a specific question.
How does this change solve the problem?
We want to ensure that every chunk created is of an optimal size, while also preserving the benefits of using headers to divide the document.
Consequently, we have implemented a target maximum chunk size of 256 tokens (about 200 words), with exceptions for scenarios where a larger chunk is useful to retain:
- If a paragraph has fewer than 512 tokens, it won’t be split.
- If a list or table has fewer than 1024 tokens, it won’t be split.
With this change, you can still expect Moveworks to create chunks or snippets from KB articles along header boundaries, as we have always done. Wherever that produces excessively long chunks, however, the new chunking algorithm will create additional chunks, staying close to the target maximum size of 256 tokens per chunk.
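For illustration, here is a minimal sketch of the splitting policy described above. Token counting and block parsing are simplified stand-ins (counting words rather than running a real tokenizer, and taking pre-parsed blocks rather than HTML), and the thresholds mirror the ones listed above; this is not Moveworks’ actual implementation.

```python
# Hypothetical sketch of the splitting policy described above, not
# Moveworks' actual implementation. Tokens are approximated by words.
TARGET_MAX = 256         # target maximum chunk size, in tokens
PARAGRAPH_LIMIT = 512    # paragraphs under this size are never split
LIST_TABLE_LIMIT = 1024  # lists and tables under this size are never split

def n_tokens(text: str) -> int:
    """Crude token count; a real tokenizer would be used in practice."""
    return len(text.split())

def split_section(blocks: list[tuple[str, str]]) -> list[str]:
    """Split one header-delimited section into right-sized chunks.

    `blocks` is a list of (kind, text) pairs, where kind is
    "paragraph", "list", or "table".
    """
    chunks: list[str] = []
    current: list[str] = []

    def flush() -> None:
        if current:
            chunks.append("\n".join(current))
            current.clear()

    for kind, text in blocks:
        limit = LIST_TABLE_LIMIT if kind in ("list", "table") else PARAGRAPH_LIMIT
        if n_tokens(text) <= limit:
            # Small enough to keep whole: start a new chunk if adding it
            # would push the current chunk past the target maximum.
            if current and n_tokens("\n".join(current + [text])) > TARGET_MAX:
                flush()
            current.append(text)
        else:
            # Oversized block: split it on its own, near the target maximum.
            flush()
            words = text.split()
            for i in range(0, len(words), TARGET_MAX):
                chunks.append(" ".join(words[i:i + TARGET_MAX]))
    flush()
    return chunks
```

Note how the exceptions work: a 400-token paragraph stays whole because it is under the 512-token paragraph limit, even though it exceeds the 256-token target, while a 1,500-token wall of text is divided into chunks near the target size.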
What can users expect?
Users will see an improvement in performance on long articles! Additionally, the exact chunks or snippets that are cited may contain different text than before, and in some cases you may see more cited chunks.
Does this change Moveworks’ recommendations for writing KB articles?
The high-level guidelines are still relevant:
- You should continue to use consistent header styles.
- If snippetization occurs at a higher header level and the resulting chunks are too big, Moveworks will create “right-sized” snippets by supplementing the header-based snippets with further chunking.
In essence, header styles are still important, but our chunking will handle certain kinds of exceptions more gracefully than before.
When will this update be available?
This change will be rolled out gradually and is expected to be available for all customers in all regions by Thursday, December 18.
I saw a significant change in search performance for an article. What should I do?
Our evaluations show that for the vast majority of articles from our customers, this change will improve search performance. However, if you notice any degradation for a high priority article, please file a support ticket with the example utterance and expected article, and we will review your case.