EmbeddingStore
Overview
EmbeddingStore provides methods to embed text data and perform similarity search using embeddings. It supports asynchronous embedding and blocking search, and can be backed by various embedding store implementations.
Methods Summarized
Promise embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars) Embeds a PDF document by splitting it into paragraphs of the specified size and overlap.
Promise embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars, metaData) Embeds a PDF document by splitting it into paragraphs of the specified size and overlap, with metadata.
Promise embed(data, metaData) Asynchronously embeds an array of text data with optional metadata and stores the results.
Promise embedAll(foundSet, textColumns) Generates embeddings for all records in the specified foundSet for the given textColumns and stores them in the specified vector column.
Array search(text) Performs a blocking similarity search for the given text, returning the best matches from the store.
Array search(text, maxResults) Performs a blocking similarity search for the given text, returning the best matches from the store.
Methods Detailed
embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars)
Embeds a PDF document by splitting it into paragraphs of the specified size and overlap. The text segments are stored in the embedding store with metadata taken from the PDF; the name of the PDF is also set when it is known.
Parameters
Object pdfSource A JSFile, a String representing a file path, or a byte[] containing the PDF content.
Number maxSegmentSizeInChars The maximum number of characters per segment.
Number maxOverlapSizeInChars The maximum number of characters of overlap between segments.
Returns: Promise A Promise resolving to the store
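A minimal sketch of calling this overload from scripting. It assumes `store` is an EmbeddingStore obtained elsewhere; the function name `embedPdf` and the sizes 1000 and 200 are illustrative choices, not recommended values.

```javascript
/**
 * Embeds the PDF at pdfPath into the given store.
 * `store` is assumed to be an EmbeddingStore obtained elsewhere.
 */
function embedPdf(store, pdfPath) {
    // 1000 characters per segment, 200 characters of overlap (illustrative values)
    return store.embed(pdfPath, 1000, 200).then(function (resolvedStore) {
        // The Promise resolves to the store, so callers can query it next
        return resolvedStore;
    });
}
```

Because the call is asynchronous, any work that depends on the embedded segments belongs in the `then` callback of the returned Promise.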
embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars, metaData)
Embeds a PDF document by splitting it into paragraphs of the specified size and overlap, with metadata. The text segments are stored in the embedding store with metadata taken from the PDF; the name of the PDF is also set when it is known. Additional metadata can be provided and is attached to each segment.
Parameters
Object pdfSource A JSFile, a String representing a file path, or a byte[] containing the PDF content.
Number maxSegmentSizeInChars The maximum number of characters per segment.
Number maxOverlapSizeInChars The maximum number of characters of overlap between segments.
Object metaData Metadata to attach to each segment.
Returns: Promise A Promise resolving to the store
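The metadata overload might be used as sketched below. The metadata keys (`department`, `language`), the function name, and the size values are hypothetical; `store` is assumed to be an existing EmbeddingStore.

```javascript
/**
 * Embeds a PDF and tags every segment with the same extra metadata.
 * The keys used here are examples only.
 */
function embedTaggedPdf(store, pdfPath, department) {
    // This object is attached to each segment created from the PDF
    var metaData = {
        department: department,
        language: 'en'
    };
    return store.embed(pdfPath, 1000, 200, metaData);
}
```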
embed(data, metaData)
Asynchronously embeds an array of text data with optional metadata and stores the results.
Parameters
Array data The array of text strings to embed.
Array metaData An array of metadata maps, one for each text string (may be null).
Returns: Promise A Promise resolving to the store
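Since the metadata array holds one map per text string, the two arrays are most easily built side by side. The sketch below assumes a hypothetical `notes` input (objects with `id` and `text` properties) and a hypothetical `noteId` metadata key.

```javascript
/**
 * Builds parallel text and metadata arrays from a list of notes
 * and embeds them in one call.
 */
function embedNotes(store, notes) {
    var texts = [];
    var metaData = [];
    for (var i = 0; i < notes.length; i++) {
        texts.push(notes[i].text);
        // One metadata map per text, at the same index
        metaData.push({ noteId: String(notes[i].id) });
    }
    return store.embed(texts, metaData);
}
```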
embedAll(foundSet, textColumns)
Generates embeddings for all records in the specified foundSet for the given textColumns and stores them in the specified vector column.
Parameters
JSFoundSet foundSet The foundSet whose records are embedded.
Array textColumns The columns of the foundSet to embed.
Returns: Promise A Promise that resolves with the given foundSet once the embeddings are stored, or rejects on error.
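A sketch of handling both outcomes of the returned Promise. The column names `productname` and `description` are hypothetical, as are the function name and the two callbacks; `store` and `foundSet` are assumed to exist already.

```javascript
/**
 * Embeds two text columns of every record in the foundSet.
 * onDone receives the foundSet; onError receives the rejection reason.
 */
function embedProducts(store, foundSet, onDone, onError) {
    var textColumns = ['productname', 'description'];
    // Resolves with the same foundSet once embeddings are stored
    return store.embedAll(foundSet, textColumns).then(onDone, onError);
}
```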
search(text)
Performs a blocking similarity search for the given text, returning the best matches from the store. By default, 3 results are returned.
Parameters
String text The query text to search for.
Returns: Array An array of SearchResult objects representing the best matches.
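Embedding is asynchronous while searching blocks, so a common pattern is to search only after the embed Promise has resolved. The sketch below assumes `store` is an existing EmbeddingStore; the function name is illustrative, and the SearchResult objects are passed through untouched because their properties are not described here.

```javascript
/**
 * Embeds the texts, then runs a blocking search on the resolved store.
 * Returns a Promise for the array of SearchResult objects.
 */
function embedThenSearch(store, texts, question) {
    // metaData may be null when no metadata is needed
    return store.embed(texts, null).then(function (resolvedStore) {
        // search(text) blocks until the matches are available
        return resolvedStore.search(question);
    });
}
```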
search(text, maxResults)
Performs a blocking similarity search for the given text, returning the best matches from the store.
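Parameters
String text The query text to search for.
Number maxResults The maximum number of results to return.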
Returns: Array An array of SearchResult objects representing the best matches.
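A small wrapper can choose between the two search overloads. This is a sketch only: the function name `searchTop` is hypothetical and `store` is assumed to be an existing EmbeddingStore.

```javascript
/**
 * Searches with an explicit limit when one is given,
 * otherwise falls back to the single-argument overload.
 */
function searchTop(store, question, maxResults) {
    if (maxResults === undefined || maxResults === null) {
        // No limit supplied: use the store's default number of results
        return store.search(question);
    }
    return store.search(question, maxResults);
}
```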