Building AI Features

This guide shows you how to infuse your applications with AI capabilities using the AI Plugin.

Overview

Developers can build AI features into their applications using Servoy's AI Plugin, a developer-focused toolkit that enables a broad range of functionality with today's latest Large Language Models (LLMs).

How It Works

Developers can use the plugin to programmatically interact with models of their choosing for chat, vector embedding/search and agentic features. This enables them to infuse business applications with AI capabilities for a broad range of use cases.

Use Cases

The potential use cases are virtually limitless, but we'll break them down into a few categories and patterns.

Natural Language Interfaces (NLI)

Business Apps are NOT User-Friendly: In the real world, we deal with systems of record, structured data, and rigid business rules. Despite our best efforts in modern UI design, the end-user has always been forced to reckon with the inherent structure of the underlying system.

Better UX through NLIs: LLMs offer the potential to break this age-old incompatibility by allowing users to interact with systems using natural language (text, but also images and sound), both by understanding user intent as input and by explaining results as output.

For example, a user today may be forced to review a complicated BI report when all she really needs is an answer to a question, a sharper insight, or help with a decision.

Perhaps she could just ask in her own words and be answered in plain language, or even with a picture. Here are some more examples:

  • Let users interact with the application using everyday language

  • Turn plain-language requests into searches, filters, and reports

  • Reduce the need for complex or highly customized UI

  • Support flexible input instead of rigid forms

  • Make advanced features easier to access without training

  • Capture user intent without requiring knowledge of the data model

Knowledge Retrieval

Most organizations rely on an ecosystem of data and information. AI-infused business applications can provide context on demand.

  • Let users find information by meaning, not just keywords

  • Surface relevant documents and records automatically

  • Find similar cases, issues, or documents

  • Answer questions using your own content and data

  • Ground AI responses in known, trusted sources

Unstructured Data

Most organizations sit on large amounts of dark data — emails, documents, notes, attachments, images, PDFs, and free-text fields that are stored but rarely used. AI makes it possible to extract meaning from this unstructured content and turn it into data that applications can search, reason over and act on.

  • Turn document archives into structured, usable data

  • Automatically classify and tag content

  • Extract key information from messy or inconsistent inputs

  • Group and match similar content

  • Make large volumes of text searchable and actionable

Assistants

Assistants can be embedded directly into applications to provide contextual guidance, explanations, and support at the moment it’s needed. They help users understand data, decisions, and processes without leaving the context of their work.

Assistants commonly:

  • Provide in-app guidance and explanations

  • Help users understand records, screens, and decisions

  • Answer questions using application context

  • Support better human decision-making

  • Reduce friction in complex workflows

Agents

Agents go a step further than assistants by acting on behalf of the user. They can plan and execute multi-step tasks, call application services or external tools, and operate with varying levels of autonomy while remaining under application control. An agent would work in collaboration with a user or, using Servoy's Automation Tools, could run completely autonomously to fulfill tasks and pass control to users only as needed.

In general, agents can:

  • Execute multi-step tasks toward a defined goal

  • Invoke application services and external tools

  • Coordinate actions across systems or workflows

  • Operate with user oversight or approval

  • Automate repetitive or complex processes

Model Choice

Servoy’s AI plugin is intentionally model-agnostic. It does not lock you into a specific LLM, embedding model, or vector store. Instead, it provides a consistent integration layer that lets you choose, configure, and evolve the AI components that best fit your application, architecture, and compliance requirements.

What This Means in Practice

  • Choose your own LLMs, embedding models, and vector stores

  • Switch or combine providers without changing application logic

  • Manage your own API keys and credentials

  • Control usage limits, quotas, and rate limiting

  • Monitor and manage token consumption and costs

  • Decide where models run (cloud, private, or local)

  • Apply your own security, compliance, and data-handling policies

Why This Matters

  • Avoid vendor lock-in

  • Adapt quickly as models and providers evolve

  • Align AI usage with your organization’s governance and cost controls

Get Started

TBD

Building Chat Flows

Quick Overview

All language models work in essentially the same way: they take text as input and return text as output. A "chat completion" is simply a structured way to send prompts, instructions, and conversation history to a model and receive a response.

The model itself does not understand your application, data, or workflows. It is up to the developer to provide context, design prompt templates, manage conversation state, and connect the model’s output to application logic and UI.

The examples below show how chat completions can be used as a foundational building block that you can extend with context, retrieval, and application behavior.

You don't have to build a chatbot: While chat completions can be used to build chatbot-style interfaces, they are not limited to chat-based UX. In many applications, chat completions run entirely behind the scenes — generating explanations, interpreting user intent, transforming text, or driving application logic — without the end user ever seeing a “chat” interface.

Basic Chat Completion

This is the Hello World example: starting up a Chat Client, sending in a prompt, and receiving the response. It's just a few lines of code, and every use case builds upon this.

Build a Chat Client instance

First, you will need to create an instance of a ChatClient using a Builder object. There are different builders, depending on the model provider of your choice.

Here you see a couple of example builders for different providers, both taking several parameters. Some are optional, but the following are required:

  • API Key - Obtain this from your model's vendor after you set up your developer account.

  • Model Name - Obtain a list of compatible model names from your vendor of choice.

  • System Message - A guiding instruction for the model that shapes the type of responses it generates. For example, it could establish the model as a domain expert (e.g. US Copyright Law) or require that it always answers in strict JSON output.
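
Here is a minimal sketch of two such builders. The plugins.ai accessor and the factory and setter names (createOpenAiChatClientBuilder, createGeminiChatClientBuilder, apiKey, modelName, systemMessage, build) are illustrative assumptions; consult the plugin's API reference for the exact builders available for your provider.

    // Sketch only: accessor and builder names below are assumptions, not the confirmed plugin API.
    // OpenAI-style builder
    var chatClient = plugins.ai.createOpenAiChatClientBuilder()
        .apiKey(application.getServoyProperty('openai_api_key')) // load the key from servoy.properties
        .modelName('gpt-4o-mini')                                // any compatible model name from your vendor
        .systemMessage('You are a helpful assistant for the Orders application. Answer concisely.')
        .build();

    // Gemini-style builder (same pattern, different provider)
    var geminiClient = plugins.ai.createGeminiChatClientBuilder()
        .apiKey(application.getServoyProperty('gemini_api_key'))
        .modelName('gemini-1.5-flash')
        .systemMessage('Always answer in strict JSON.')
        .build();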

Handling API Keys: As a best practice, do not hardcode API keys; load them from a secure location, such as your servoy.properties file, e.g. application.getServoyProperty('openai_api_key');

Prompt the Model

Once you have created a ChatClient instance, sending your prompt to the model takes a single line of code: call the chat method, passing in the userMessage parameter. This method runs asynchronously and returns a JavaScript Promise object to manage the response. More on that below.
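
For example, assuming the chatClient instance built above:

    // chat() runs asynchronously and immediately returns a JavaScript Promise
    var userMessage = 'Summarize the open orders for customer ACME in two sentences.';
    var promise = chatClient.chat(userMessage);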

Handle the Response

Here you can see that the Promise object resolves to a ChatResponse object, which is passed into the handling function. Then simply call the getResponse method to get the String value of the response.
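
A minimal sketch of that handler, continuing from the promise above:

    promise.then(function (chatResponse) {
        // getResponse() returns the String value of the model's reply
        var answer = chatResponse.getResponse();
        application.output(answer);
    });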

Conversation Memory

Chat Models, contrary to many assumptions, don't actually preserve any state or memory. The illusion of a continuous conversation is created by sending the chat history with every call. Fortunately, Servoy's AI plugin handles this for you. All you have to do is enable memory.

Setting the Max Memory

To enable chat memory, simply set the max number of tokens that is "remembered" when you provision the ChatClient by calling maxMemoryTokens on the builder. After this, you can reuse the client object and every time you call chat(), it will automatically remember the last X tokens from your session to be used for context.
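
A sketch of enabling memory, reusing the assumed builder names from the first example (maxMemoryTokens is the builder method described above):

    var chatClient = plugins.ai.createOpenAiChatClientBuilder()
        .apiKey(application.getServoyProperty('openai_api_key'))
        .modelName('gpt-4o-mini')
        .systemMessage('You are a support assistant.')
        .maxMemoryTokens(2000) // keep roughly the last 2000 tokens of the conversation as context
        .build();

    // Reuse the same client; each chat() call automatically carries the recent history
    chatClient.chat('My printer shows as offline. What should I check first?').then(function (first) {
        application.output(first.getResponse());
        // the follow-up relies on memory to resolve what "that" refers to
        return chatClient.chat('I tried that and it still will not connect. What next?');
    }).then(function (second) {
        application.output(second.getResponse());
    });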

Token Management: Memory is not enabled unless specified, and it is recommended to use memory only for use cases where you need to keep a conversation thread as part of your context. Keep in mind your cost per token and your feature requirements when setting your max value.

Streaming Response

The response payload from a chat completion request is actually delivered in chunks as it is generated. However, the default approach is to call the then() method of the Promise object, which resolves to the entire response payload. This is really all you need if you are doing pure programmatic interactions.

However, if you are displaying the response content to the user, you may want to display the chunks as they become available.

To handle a streaming response, call the chat() method as before, but with slightly different parameters:

  1. String - The prompt input (same as the first example)

  2. Function - Called on each partial completion, receiving a String. You can append this String to a local variable to show the response as it is generated.

  3. Function - Called when the final chunk is generated, also receiving a String. You can append this String one last time to complete the transaction.

Remember that all you need to do is create a Form Variable and append the chunks to it in each callback. You can render the result with a data-bound component (such as a TextArea) as it is generated.
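
Putting that together, a streaming call might look like the following sketch. The three-argument chat() form follows the parameter list above; the form variable responseText and the prompt are hypothetical.

    // Form variable bound to a TextArea, declared elsewhere as: var responseText = '';
    chatClient.chat(
        'Explain the difference between gross margin and net margin.', // 1. the prompt
        function (partial) {                                           // 2. called for each partial chunk
            responseText += partial;                                   //    append to show progress live
        },
        function (lastChunk) {                                         // 3. called with the final chunk
            responseText += lastChunk;
        }
    );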

Vector Embedding and Search

Overview

Vector Embeddings are a way to represent text as numerical values that capture its meaning rather than its exact wording. By converting text into embeddings, applications can compare, search, and match content based on semantic similarity instead of keywords alone.

The Servoy AI plugin supports creating Embedding Models to generate embeddings from text and storing those embeddings in a Vector Store. Once stored, embeddings can be searched to find related or similar content, enabling features such as semantic search, similarity matching, and retrieval of relevant context for AI-driven workflows.

Developers control how embeddings are created, what content is embedded, where embeddings are stored, and how search results are used within the application. This makes vector search a flexible building block that can be applied to many use cases, including document search, case matching, recommendation, and context retrieval for language models.

Basic Text Embedding

While there are many use cases and approaches, all of them follow a general pattern:

  • Configure an Embedding Model to convert text into vector representations

  • Generate embeddings for documents, records, or other content

  • Store those embeddings in a Vector Store, along with relevant metadata

  • Convert the search input into an embedding using the same model

  • Search the vector store for the most semantically similar vectors

  • Use the search results in application logic, UI, or prompt construction

Create an Embedding Model

Embedding Models are a type of LLM that is specifically designed to generate vector embeddings. This is your first step. To create a model instance, you'll need:

  1. Your API key (e.g. for OpenAI or Gemini)

  2. The name of the model that you want to use
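
A minimal sketch, reusing the assumed plugins.ai accessor; the factory name and the text-embedding-3-small model name are illustrative only:

    var embeddingModel = plugins.ai.createOpenAiEmbeddingModelBuilder()
        .apiKey(application.getServoyProperty('openai_api_key')) // 1. your API key
        .modelName('text-embedding-3-small')                     // 2. the embedding model name
        .build();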

Create a Vector Store

While you can directly use the model to create vector embeddings (an array of numbers), it's most common to pass those vectors into a store. You can do this in one step by creating a store from the model, then generating the embeddings and storing them in a single line of code.

In this example, the embed() method takes a single argument:

  1. An Array of Strings to embed and store.
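
A sketch of that one-step pattern. The createInMemoryEmbeddingStore method name is an assumption; the embed() call with an Array of Strings follows the description above.

    // Create an in-memory store from the embedding model (method name assumed)
    var store = embeddingModel.createInMemoryEmbeddingStore();

    // Generate embeddings for each String and store them in a single call
    store.embed(['Amber ale, 5.2% ABV', 'Pilsner glassware set', 'Nitro cold brew coffee']);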

Once you have stored content as vectors in a store, you can do similarity searches.

In this example, we run a similarity search and print the resulting scores.
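
Here is a sketch of such a search. The search(text, maxResults) method and the getText()/getScore() accessors on each result are assumed names, not confirmed by this guide.

    var results = store.search('wheat beer', 5);
    for (var i = 0; i < results.length; i++) {
        // each result pairs the stored text with its similarity score (0-1)
        application.output(results[i].getText() + ' -> ' + results[i].getScore());
    }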

You can see that each item in the Vector Store is returned with a Similarity Score (0-1). This is a simple example, but if you take this to scale, you can embed, classify and search documents and unstructured data.

Document Understanding

Continuing with the vector embedding example, let's take a look at how one could digest and search an entire document.

Chunking and Embedding Documents

This example uses a Vector Store, as before, but this time the call to embed takes the following arguments:

  1. File (JSFile) - The file whose contents will be embedded.

  2. Chunk Size (Number) - The text content is split into smaller chunks, and each chunk is embedded separately. This value defines the maximum size of each chunk, measured in tokens.

  3. Chunk Overlap (Number) - Specifies how much text (in tokens) is shared between adjacent chunks. Using an overlap helps preserve context when content is split at arbitrary boundaries, reducing the risk of losing meaning across chunks.
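
For example, embedding a PDF manual might look like this sketch; the file path and the chunk values are illustrative:

    // Read the document to embed
    var manual = plugins.file.convertToJSFile('/docs/product_manual.pdf');

    // Split into chunks of at most 400 tokens with a 50-token overlap, embed each chunk, and store it
    store.embed(manual, 400, 50);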

What is the ideal chunk strategy? It depends on the document content and use case. For general documents, start with 300-500 tokens and a 10-15% overlap. Very long-form text may need larger chunks to capture context. Highly structured (dense) text, such as code, lists, and tables, may require smaller chunks.

Searching Documents

Once you have chunked and embedded your document(s), you can search for matching chunks using the exact same approach from the previous example:
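
A sketch, using the same assumed search() method and result accessors as before:

    var results = store.search('How do I reset the device to factory settings?', 3);
    for (var i = 0; i < results.length; i++) {
        application.output(results[i].getScore() + ': ' + results[i].getText());
    }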

This will return an array of results, each containing the matching chunk and its similarity score.

Using Documents as a Knowledge Base

Let's build on this example to show how you can use documents as a knowledge base by leveraging vector search and chat together. Imagine that you have digested a repository of product manuals and the end-user can interact with the knowledge base via chat.
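
The sketch below combines the assumed search() call with the chat client from earlier; the prompt wording is illustrative.

    var question = 'How do I replace the filter on the X200?';

    // 1. Retrieve the most relevant chunk from the embedded manuals
    var results = store.search(question, 1);
    var context = results.length > 0 ? results[0].getText() : '';

    // 2. Pass the retrieved chunk to the chat model as context for the answer
    var prompt = 'Using only the following excerpt from our product manuals, answer the question.\n'
        + 'Excerpt: ' + context + '\n'
        + 'Question: ' + question;

    chatClient.chat(prompt).then(function (chatResponse) {
        application.output(chatResponse.getResponse());
    });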

In this example, we use the search result from the document as context for a chat session. (For simplicity, we chose only the top-ranking chunk, but you can imagine a more complex scenario that fuses and ranks multiple results to provide context for the chat input.)

Using Vector Metadata

When you embed text into vectors, the embedding captures meaning — but not where it came from. Metadata solves that by storing structured fields alongside each vector so you can (a) trace results back to the original source and (b) filter or scope searches to the right subset of content.

Practical Uses for Metadata

  • Traceability: show “this result came from Document X (chunk 12)”

  • Filtering: search only within a customer, tenant, project, category, date range, etc.

  • Security scoping: restrict retrieval to what the current user is allowed to access

  • Navigation: open the exact record or document location directly from results

Embedding Relational Data

Vector embeddings are not limited to documents and files. Text derived from relational data can also be embedded to enable semantic search, similarity matching, and intent-based retrieval over application records.

For example, suppose an end-user is searching a database of products and enters the keyword "beer".

In this approach, selected fields from a database record are combined into a textual representation and embedded as a vector, while metadata is used to preserve the record’s identity, type, and access scope. This allows applications to perform semantic search across structured records—such as customers, cases, tickets, or products—while still resolving results back to concrete records that users can view and act on.

Create Embeddings for Record Sets

In this example, Servoy's AI plugin offers a shortcut: By calling embedAll and passing a FoundSet object, you can embed a set of records in one simple call. You'll notice that the method is overloaded, so you can embed data from multiple text columns in a single call. Finally, you don't have to worry about metadata. The plugin will automatically store each record's primary key (PK) column(s).
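
A sketch of that call. The foundset, the column names, and the exact shape of the overload (separate String arguments per column) are assumptions for illustration:

    // Embed every record in the foundset; the PK column(s) are stored automatically as metadata
    var fs = datasources.db.example_data.products.getFoundSet();
    fs.loadAllRecords();

    // Overloaded call: embed the product name and description columns together
    store.embedAll(fs, 'product_name', 'description');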

Semantic Search of Record Embeddings

In this example, we search the vector store for products with a name similar to the input search text. Critically, we use the getMetadata method of the results to link each hit back to the products entity. This method returns a simple JavaScript object, from which you can directly access the PK value(s) by column name.
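
A sketch of that lookup. getMetadata() is the method named above; the search() call, the result accessors, and the productid PK column name are assumptions:

    var results = store.search('beer', 5);
    for (var i = 0; i < results.length; i++) {
        // getMetadata() returns a plain JavaScript object holding the record's PK column(s)
        var meta = results[i].getMetadata();
        application.output('Matched product ' + meta.productid + ' (score ' + results[i].getScore() + ')');
    }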

Types of Vector Stores

In-Memory Vector Store

Until this point, all the examples have used an In-Memory Vector Store. This implementation is great for getting started because it is easy to set up and offers the same functionality. Depending on your use case, it may be perfectly adequate.

When NOT to use an in-memory store:

As the name suggests, this type of vector store caches the vectors and therefore has the following limitations:

  • Not persistent — Vectors stored in memory will not persist beyond the current runtime in which they were embedded. Therefore it is not suitable for use across sessions.

  • Not Scalable — If you need to embed large volumes of data, then you should consider a proper Vector Database. In-memory is ideal for quick, one-off jobs, such as a single document. However, it's not uncommon for an organization to vectorize many documents in a single store. In this case, you would want to avoid the in-memory implementation.

Persistent Vector Store

Many relational databases include extensions to provide Vector Embedding and Search capabilities. Servoy's AI plugin gives you the option to connect to vector-enabled databases and use them as your store.

Since v2025.12, Servoy Developer ships PG Vector, a vector extension for the PostgreSQL database. You can use this out-of-the-box on your PostgreSQL servers.

In this example, we use the createServoyEmbeddingStoreBuilder method of the EmbeddingModel to create a persistent Vector Store, with the following options:

  • serverName — The target server (must have PG Vector or vector extension)

  • tableName — The name of the table that will be created on that server

  • recreate — Boolean, true if you want the persistence cleared (not used across sessions)

  • addText — Boolean, true if you want to add the plain (unembedded) text

  • metadataColumns — sub-builder to capture metadata, for later filtering and retrieval
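
A sketch of the builder call. createServoyEmbeddingStoreBuilder and the option names come from the list above; the fluent setter style is an assumption, and the metadataColumns sub-builder is omitted for brevity:

    var store = embeddingModel.createServoyEmbeddingStoreBuilder()
        .serverName('example_data')        // target server with the vector extension installed
        .tableName('product_embeddings')   // table that will be created on that server
        .recreate(false)                   // keep previously stored vectors across sessions
        .addText(true)                     // also store the plain (unembedded) text
        .build();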

Tool Calling

FAQ

How am I charged for model usage?

Can I use a local or open source model?
