EmbeddingStore

Overview

EmbeddingStore provides methods to embed text data and perform similarity search using embeddings. It supports asynchronous embedding and blocking search, and can be backed by various embedding store implementations.

Methods Summarized

Each entry lists the return type, the method name, and a summary.

  • Promise embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars) Embeds a PDF document by splitting it into paragraphs of the specified size and overlap. The text segments are stored in the embedding store with metadata taken from the PDF, including the PDF's name when it is known.

  • Promise embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars, metaData) Embeds a PDF document by splitting it into paragraphs of the specified size and overlap, with metadata.

  • Promise embed(data, metaData) Asynchronously embeds an array of text data with optional metadata and stores the results.

  • Promise embedAll(foundSet, textColumns) Generates embeddings for all records in the specified foundSet for the given textColumns and stores them in the specified vector column.

  • Array search(text) Performs a blocking similarity search for the given text, returning the best matches from the store.

  • Array search(text, maxResults) Performs a blocking similarity search for the given text, returning the best matches from the store.

Methods Detailed

embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars)

Embeds a PDF document by splitting it into paragraphs of the specified size and overlap. The text segments are stored in the embedding store with metadata taken from the PDF, including the PDF's name when it is known.

Parameters

  • Object pdfSource This can be a JSFile, a String representing a file path, or a byte[] holding the PDF content.

  • Number maxSegmentSizeInChars The maximum number of characters per segment.

  • Number maxOverlapSizeInChars The maximum number of characters of overlap between consecutive segments.

Returns: Promise A Promise resolving to the store
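To make the interaction between maxSegmentSizeInChars and maxOverlapSizeInChars concrete, here is a minimal sketch of fixed-size windows with overlap. The helper splitWithOverlap is hypothetical and is not part of EmbeddingStore. The real splitter works on paragraphs and its exact rules are not described on this page, so treat this only as an illustration of what the two numbers mean.

```javascript
// Hypothetical helper, not part of the EmbeddingStore API. It cuts plain
// character windows; the real splitter is paragraph-aware.
function splitWithOverlap(text, maxSegmentSizeInChars, maxOverlapSizeInChars) {
  if (maxSegmentSizeInChars <= 0) {
    throw new Error('maxSegmentSizeInChars must be positive');
  }
  if (maxOverlapSizeInChars < 0 || maxOverlapSizeInChars >= maxSegmentSizeInChars) {
    throw new Error('maxOverlapSizeInChars must be >= 0 and smaller than the segment size');
  }
  var segments = [];
  // Each new segment starts (size - overlap) characters after the previous one.
  var step = maxSegmentSizeInChars - maxOverlapSizeInChars;
  for (var start = 0; start < text.length; start += step) {
    segments.push(text.substring(start, start + maxSegmentSizeInChars));
    // Stop once a segment has reached the end of the text.
    if (start + maxSegmentSizeInChars >= text.length) {
      break;
    }
  }
  return segments;
}

var segments = splitWithOverlap('abcdefghij', 4, 1);
// segments is ['abcd', 'defg', 'ghij']: each segment repeats the last
// character of the previous one.
```

A larger overlap keeps more shared context between neighbouring segments at the cost of storing more segments.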

embed(pdfSource, maxSegmentSizeInChars, maxOverlapSizeInChars, metaData)

Embeds a PDF document by splitting it into paragraphs of the specified size and overlap, with metadata. The text segments are stored in the embedding store with metadata taken from the PDF, including the PDF's name when it is known. Additional metadata can also be provided; it is attached to each segment.

Parameters

  • Object pdfSource This can be a JSFile, a String representing a file path, or a byte[] holding the PDF content.

  • Number maxSegmentSizeInChars The maximum number of characters per segment.

  • Number maxOverlapSizeInChars The maximum number of characters of overlap between consecutive segments.

  • Object metaData Metadata to attach to each segment.

Returns: Promise A Promise resolving to the store

embed(data, metaData)

Asynchronously embeds an array of text data with optional metadata and stores the results.

Parameters

  • Array data The array of text strings to embed.

  • Array metaData An array of metadata maps, one for each text string (may be null).

Returns: Promise A Promise resolving to the store
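The data and metaData arrays are parallel: the metadata map at index i belongs to the text at index i. The sketch below shows one way to build both arrays from a list of records before calling embed(data, metaData). It assumes that store is an EmbeddingStore obtained elsewhere. The articles input and its title, body, and source fields are hypothetical names chosen for illustration.

```javascript
// `store` is assumed to be an EmbeddingStore obtained elsewhere.
// `articles` and its fields (title, body, source) are hypothetical.
function embedArticles(store, articles) {
  var data = [];
  var metaData = [];
  articles.forEach(function (article) {
    data.push(article.body);
    // One metadata map per text string, kept at the same index.
    metaData.push({ title: article.title, source: article.source });
  });
  // Returns the Promise from embed(), which resolves to the store.
  return store.embed(data, metaData);
}
```

Building both arrays in the same loop keeps the indexes aligned, which is easy to get wrong when the arrays are assembled separately.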

embedAll(foundSet, textColumns)

Generates embeddings for all records in the specified foundSet for the given textColumns and stores them in the specified vector column.

Parameters

  • JSFoundSet foundSet The foundSet whose records are embedded.

  • Array textColumns The columns of the foundSet whose text is embedded.

Returns: Promise A Promise that resolves with the given foundSet once the embeddings are stored, or rejects on error.
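Because embedAll returns a Promise, both the success and the failure path need a handler. The following is a minimal sketch under stated assumptions: store is an EmbeddingStore and foundSet is a JSFoundSet, both obtained elsewhere; the column names product_name and description and the onDone and onError callbacks are hypothetical.

```javascript
// `store` and `foundSet` are assumed to be obtained elsewhere.
// The column names and both callbacks are hypothetical.
function embedProductText(store, foundSet, onDone, onError) {
  var textColumns = ['product_name', 'description'];
  return store.embedAll(foundSet, textColumns).then(
    function (embeddedFoundSet) {
      // Resolves with the same foundSet that was passed in.
      onDone(embeddedFoundSet);
    },
    function (error) {
      // Rejects when the embeddings could not be generated or stored.
      onError(error);
    }
  );
}
```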

search(text)

Performs a blocking similarity search for the given text, returning the best matches from the store. This overload returns at most 3 results, the default.

Parameters

  • String text The query text to search for.

Returns: Array An array of SearchResult objects representing the best matches.

search(text, maxResults)

Performs a blocking similarity search for the given text, returning the best matches from the store.

Parameters

  • String text The query text to search for.

  • Number maxResults The maximum number of results to return.

Returns: Array An array of SearchResult objects representing the best matches.

