
Add AI superpower to your Delphi & C++Builder apps part 6: RAG

Monday, June 2, 2025


This is part 6 in our article series about adding AI superpower to your Delphi & C++Builder apps, building on the topics covered in the first five installments.

In this new episode, we will explore RAG. RAG stands for Retrieval Augmented Generation. Let's turn to AI to get us a proper definition of RAG:

Definition

RAG stands for Retrieval-Augmented Generation. It's a technique in AI that combines two key components:

  1. Retrieval: Finding relevant information from external sources (like databases, documents, or knowledge bases)
  2. Generation: Using a language model to create responses based on that retrieved information

Here's how it works: Instead of relying solely on what a language model learned during training, RAG systems first search for relevant information from external sources, then use that information to generate more accurate and up-to-date responses.

Why RAG is useful:

  • Provides access to current information beyond the model's training data
  • Reduces hallucinations by grounding responses in actual sources
  • Allows models to work with private or domain-specific information
  • Enables citation of sources for transparency


RAG with TTMSFNCCloudAI

So, how can we use RAG with the TTMSFNCCloudAI component when automating the use of LLMs? This is possible in two ways:

  1. We add data to the context when sending the prompt text

  2. We take advantage of files we can store in the account we use for the cloud LLM service

If we want to reuse existing data, method 2 is clearly preferred: it avoids having to send the same data along with the prompt over and over again. We can upload the files once and then refer to them in the prompt by the ID assigned by the cloud LLM service. At this moment, of the cloud LLMs supported by TTMSFNCCloudAI, OpenAI, Gemini, Claude and Mistral offer support for file storage. While Mistral documents that it supports this, we have so far not found a way to use their API successfully; even their own cURL examples fail when run with cURL. So, for now we focused on OpenAI, Gemini and Claude. More on how to use this later.

RAG with data sent along with the prompt

Fortunately, there is broad support for this mechanism across the supported cloud LLMs, and it is simple to use. The TTMSFNCCloudAI component has the methods:

  • AddFile()
  • AddText()
  • AddURL()

AddFile() allows you to send data such as text files, PDF files, Excel files, image files and audio files along with the prompt (supported file types differ from LLM to LLM at this moment; check the service documentation to verify the supported types). For binary files, the data is typically base64 encoded and sent along with the prompt.

AddText() sends additional data with the prompt as plain text.

AddURL() sends a hyperlink to data, which can be a link to an online text document, image, PDF, etc. Here too, check with the LLM service which types are supported.
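As a minimal sketch of this first approach, assuming a form with a TTMSFNCCloudAI component, a memo for the response, and that AddFile() accepts a local file path (the file name and memo names below are illustrative), sending a PDF along with a prompt could look like:

// Sketch: send a local PDF along with the prompt.
// 'invoice.pdf' is an illustrative file name.
TMSFNCCloudAI1.Context.Text := 'Summarize the attached document.';
TMSFNCCloudAI1.AddFile('invoice.pdf'); // binary data is base64 encoded and sent with the prompt
TMSFNCCloudAI1.Execute; // the response arrives asynchronously in the OnExecuted event

The same pattern applies to AddText() and AddURL(): attach the extra context first, then execute the prompt.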


RAG with uploaded files

Another approach is to send the files separately to the cloud LLM (for those cloud LLMs that offer this capability) and refer to this data (multiple times) via an ID. At this moment, we have implemented this for OpenAI, Gemini and Claude.

To have access to files, the TTMSFNCCloudAI component has the .Files collection. Call TTMSFNCCloudAI.GetFiles to retrieve all previously uploaded file information into the .Files collection. Call TTMSFNCCloudAI.UploadFile() to add files to the cloud LLM. To delete a file, use TTMSFNCCloudAI.Files[index].Delete.

When the Files collection is filled with the files retrieved from the cloud LLM, the IDs of the available files are automatically sent along with the prompt and thus, the LLM can use them for retrieval augmented generation.
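A minimal sketch of managing stored files with these calls follows. Note that the calls are asynchronous in practice, so real code would inspect the Files collection from the corresponding callback or event rather than immediately after GetFiles; the file name below is illustrative.

// Upload a file once; the cloud LLM service assigns it an ID.
TMSFNCCloudAI1.UploadFile('goldenbird.txt');

// Later: retrieve the stored file information into the Files collection.
TMSFNCCloudAI1.GetFiles;

// Once the collection is filled, inspect the stored file IDs.
for i := 0 to TMSFNCCloudAI1.Files.Count - 1 do
  Memo2.Lines.Add(TMSFNCCloudAI1.Files[i].ID);

// Delete a previously uploaded file.
if TMSFNCCloudAI1.Files.Count > 0 then
  TMSFNCCloudAI1.Files[0].Delete;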

The following example should make the concept much clearer right away:

We uploaded to the cloud LLM a text file containing the story of The Golden Bird by the Brothers Grimm. With this story added, we can query the LLM for the information we want from this fairy tale.

(Screenshot: the file-based RAG demo included in TMS FNC Cloud Pack)

What you see in the screenshot of the demo (also included in the latest version of TMS FNC Cloud Pack) is the list of files that were uploaded to the cloud LLM, here Claude. Then you can see the prompt that queries for a summary and the main characters of the fairy tale.

The code used for this is as simple as:

// send the prompt
TMSFNCCloudAI1.Context.Text := Memo1.Lines.Text;
TMSFNCCloudAI1.Execute;

// get the response
procedure TForm1.TMSFNCCloudAI1Executed(Sender: TObject;
  AResponse: TTMSFNCCloudAIResponse; AHttpStatusCode: Integer;
  AHttpResult: string);
begin
  ProgressBar1.State := pbsPaused;
  if AHttpStatusCode = 200 then
  begin
    memo2.Text := AResponse.Content.Text;
  end
  else
    ShowMessage('HTTP error code: '+AHttpStatusCode.ToString+#13#13+ AHttpResult);
end;

OpenAI assistants

OpenAI works with stored files in a somewhat different way than the Claude and Gemini APIs. The OpenAI Completion API can't work with uploaded files; one needs to use the Assistants API. Basically, you create an assistant with the API, then create a thread and add messages (prompts) to it that contain both the prompt text and the file IDs the LLM should refer to. You then run the assistant on this thread to perform the requested action, poll for the completion of the run, and, when completed, retrieve the result:
(Diagram: the OpenAI Assistants API workflow)
The programmatic interface we offer in Delphi via the TTMSFNCCloudAI component to execute this from your apps is:

  if TMSFNCCloudAI1.Service = aiOpenAI then
  begin
    // if an assistant exists, use it, if not create a new assistant 
    TMSFNCCloudAI1.GetAssistants(procedure(AResponse: TTMSFNCCloudAIResponse; AHttpStatusCode: Integer; AHttpResult: string)
     var
       id: string;
     begin
       if AHttpStatusCode div 100 = 2 then
       begin
         if TMSFNCCloudAI1.Assistants.Count > 0 then
         begin
           id := TMSFNCCloudAI1.Assistants[0].ID;
           RunThread(id);
         end
         else
           TMSFNCCloudAI1.CreateAssistant('My assistant','you assist me with files',[aitFileSearch], procedure(const AID: string)
             begin
               id := AID;
               RunThread(id);
             end);

       end;
     end);
  end;

procedure TForm1.RunThread(id: string);
begin
  // create a new thread
  TMSFNCCloudAI1.CreateThread(procedure(const AId: string)
    var
      sl: TStringList;
      i: integer;
      threadid: string;
    begin
      threadid := aid;
      // create a list of file IDs found to add these to the message for the thread to run with the assistant 
      sl := TStringList.Create;
      for i := 0 to TMSFNCCloudAI1.Files.Count - 1 do
        sl.Add(TMSFNCCloudAI1.Files[i].ID);

      TMSFNCCloudAI1.CreateMessage(ThreadId, 'user', Memo1.Lines.Text, sl, aitFileSearch, procedure(const AId: string)
        begin
          // message created, now run the thread and wait for it to complete
          TMSFNCCloudAI1.RunThreadAndWait(ThreadId, id, procedure(AResponse: TTMSFNCCloudAIResponse; AHttpStatusCode: Integer; AHttpResult: string)
            begin
              // when the thread run completed, the LLM returned a response
              memo2.Lines.Add(AHttpResult);
              btnExec.Enabled := true;
              progressbar1.State := pbsPaused;
            end);
        end,
        procedure(AResponse: TTMSFNCCloudAIResponse; AHttpStatusCode: Integer; AHttpResult: string)
        begin
          memo2.Lines.Add('Error ' + AHttpStatusCode.ToString);
          memo2.Lines.Add(AHttpResult);
          btnExec.Enabled := true;
          progressbar1.State := pbsPaused;
        end
        );
      sl.Free;
    end);
end;

Conclusion

While the cloud LLMs have a quite similar feature set for chat completion (as we have seen in the prior installments of this article series), in the area of RAG and using files for RAG there are considerable differences in how the various LLMs handle it. We also see that several cloud LLMs do not offer the files feature yet, or have only just added it with documentation too sparse to get it working properly. As the race in AI is ongoing, we expect these things to stabilize and become a common feature. The advantage of using the TTMSFNCCloudAI component from the TMS FNC Cloud Pack is that you are relieved of the hard work of deciphering these APIs, and that our team adapts the Delphi code to further evolutions of the LLM APIs.

This new functionality for using files with the cloud LLM is available in TMS FNC Cloud Pack. If you have an active TMS ALL-ACCESS license, you can now also get access to the first test version of TMS AI Studio that uses the TTMSFNCCloudAI component but also has everything on board to let you build MCP servers and clients. 

Register now to participate in this testing via this landing page.





Bruno Fierens






