Additional audio transcribing support in TMS AI Studio v1.2.3.0 and more ...

Thursday, September 25, 2025

We are pleased to announce the availability of TMS AI Studio v1.2.3.0!

More audio transcribing capabilities

We have extended the support for audio transcribing in several ways. First of all, we have added two additional services:

Mistral : using the Mistral Voxtral audio transcribing model
Gemini : using the multimodal capability of gemini-2.5-flash

Simply set TMSMCPCloudAI.Service to either aiMistral or aiGemini and call TMSMCPCloudAI.Transcribe() on an existing MP3 file or recorded sound buffer and it will return the text detected from the audio.

Note that for direct microphone audio recording, we offer a free Delphi audio library that you can retrieve from our GitHub repository.

OpenAI audio translation and model configuration

OpenAI not only offers the whisper-1 model for audio transcribing, it also introduced gpt-4o-mini-transcribe and gpt-4o-transcribe. You can now select the model via TMSMCPCloudAI.Settings.OpenAITranscribeModel.

In addition, OpenAI can transcribe audio directly to English, irrespective of the language spoken in the audio. This lets your software work with English text no matter what language the user spoke. This can be done via a new method Translate that can be used with an MP3 file or an MP3 sound buffer:

procedure TTMSMCPCloudAI.Translate(SoundFile: string); overload;
procedure TTMSMCPCloudAI.Translate(SoundBuffer: TMemoryStream); overload;

Note that at this moment, this translate capability is only offered through OpenAI.

New TTS / STT demo

We added a new demo that shows OpenAI's speech to text and text to speech capabilities, along with the prompts used for translation. You can use this demo application as a spoken word translator to the language of your choice.

First, we record the audio from the default computer microphone after a click on the green button. When the user clicks stop, we stop the recording, get the recorded sound as an MP3 stream and send it to OpenAI via TMSMCPCloudAI1.Transcribe(s):

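// body of the record / stop button click handler
var
  s: TMemoryStream;
begin
  if not FIsSpeaking then
  begin
    // start a fresh recording from the default microphone
    ar.ClearRecordedData;
    ar.StartRecording;
  end
  else
  begin
    // stop recording and send the recorded sound to the service as MP3
    ar.StopRecording;
    s := ar.GetMP3Stream(20500);
    try
      s.Position := 0;
      TMSMCPCloudAI1.Transcribe(s);
    finally
      // release the stream even when Transcribe raises an exception
      s.Free;
    end;
  end;
end;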

Once OpenAI has transcribed the audio, the event TMSMCPCloudAI.OnTranscribeAudio is triggered. From this event, we can retrieve the spoken words as text and create a prompt to perform the translation to the selected language:

procedure TForm1.DoTranslate(Text, Language: string);
begin
  TMSMCPCloudAI1.AssistantRole.Text := 'You are a translator that literally translates this text to '+ language;
  TMSMCPCloudAI1.Context.Text := Text;
  TMSMCPCloudAI1.Execute;
end;

When this prompt has been executed and has returned the translated text, we can invoke OpenAI's text to speech function to let the computer speak the translated text:

procedure TForm1.TMSMCPCloudAI1Executed(Sender: TObject;
  AResponse: TTMSMCPCloudAIResponse; AHttpStatusCode: Integer;
  AHttpResult: string);
begin
  if AHttpStatusCode div 100 = 2 then  // any HTTP status code 2xx = success
  begin
    memo2.Lines.Text := AResponse.Content.Text;
    TMSMCPCloudAI1.Speak(memo2.Lines.Text);
  end;
end;

That is how easy it is to let your computer speak for you in another language.
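The status check in the handler above relies on integer division, so every code from 200 to 299 counts as success, not only 200. A minimal standalone sketch of that check follows; the helper name IsHttpSuccess is hypothetical and not part of TMS AI Studio:

```pascal
program StatusCheckSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: True for any status code in the range 200..299
function IsHttpSuccess(AStatusCode: Integer): Boolean;
begin
  // Integer division drops the last two digits, so 200, 201 and 204 all yield 2
  Result := AStatusCode div 100 = 2;
end;

begin
  Writeln(IsHttpSuccess(200)); // TRUE
  Writeln(IsHttpSuccess(204)); // TRUE
  Writeln(IsHttpSuccess(401)); // FALSE
  Writeln(IsHttpSuccess(500)); // FALSE
end.
```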

Usage tracking across requests

Using AI from services such as OpenAI, Gemini, Mistral, Claude, Grok, Perplexity, DeepSeek, ... isn't free. Typically, the cost is directly related to the number of tokens used. In the context of AI, a token is a single unit used to represent words, subwords, or punctuation in a piece of text.

Tokens are taken into account for the prompt text as well as for the resulting answer text produced by the LLM. In TMSMCPCloudAI, the number of tokens consumed for a prompt request can be retrieved from the response object:

procedure TForm1.TMSMCPCloudAI1Executed(Sender: TObject;
  AResponse: TTMSMCPCloudAIResponse; AHttpStatusCode: Integer;
  AHttpResult: string);
begin
  // here we can check the tokens used for the request:
  //   AResponse.TotalTokens: Integer       sum of prompt and completion tokens
  //   AResponse.PromptTokens: Integer      tokens used for the prompt
  //   AResponse.CompletionTokens: Integer  tokens used in the response text produced by the LLM
end;
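Since the cost follows the token counts, the prompt and completion token values of a request can be turned into a rough cost estimate. The sketch below shows the arithmetic only; the two prices are made-up placeholders and the function name EstimateCost is hypothetical, so substitute the actual rates from your provider's price list:

```pascal
program TokenCostSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Hypothetical prices in USD per one million tokens, for illustration only
  PromptPricePerMillion = 0.50;
  CompletionPricePerMillion = 1.50;

// Estimated cost in USD of one request, given its token counts
function EstimateCost(APromptTokens, ACompletionTokens: Integer): Double;
begin
  Result := (APromptTokens * PromptPricePerMillion +
    ACompletionTokens * CompletionPricePerMillion) / 1000000;
end;

begin
  // 1200 prompt tokens and 300 completion tokens:
  // (1200 * 0.50 + 300 * 1.50) / 1,000,000 = 0.00105 USD
  Writeln(Format('Estimated cost: %.5f USD', [EstimateCost(1200, 300)]));
end.
```

In a real application, the two arguments would come from the PromptTokens and CompletionTokens values of the response object.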

We have now extended the TMSMCPCloudAI class to track the usage across multiple requests via TMSMCPCloudAI.Usage. As long as you do not call TMSMCPCloudAI.Usage.Reset, the token counts of each request are added up, so you can check the total number of tokens used during a session. These totals can be retrieved via:

TMSMCPCloudAI.Usage.TotalTokens: integer
TMSMCPCloudAI.Usage.PromptTokens: integer
TMSMCPCloudAI.Usage.CompletionTokens: integer
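To illustrate the accumulate-until-reset behavior described here, the following is a minimal standalone sketch. TUsageTracker and its Add method are hypothetical stand-ins written for this example; they are not the TMS AI Studio implementation, which does the bookkeeping for you:

```pascal
program UsageSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for a usage tracker; not the TMS AI Studio class
  TUsageTracker = class
  private
    FPromptTokens: Integer;
    FCompletionTokens: Integer;
  public
    // Adds the token counts of one finished request to the running totals
    procedure Add(APromptTokens, ACompletionTokens: Integer);
    // Sets the running totals back to zero
    procedure Reset;
    function TotalTokens: Integer;
    property PromptTokens: Integer read FPromptTokens;
    property CompletionTokens: Integer read FCompletionTokens;
  end;

procedure TUsageTracker.Add(APromptTokens, ACompletionTokens: Integer);
begin
  Inc(FPromptTokens, APromptTokens);
  Inc(FCompletionTokens, ACompletionTokens);
end;

procedure TUsageTracker.Reset;
begin
  FPromptTokens := 0;
  FCompletionTokens := 0;
end;

function TUsageTracker.TotalTokens: Integer;
begin
  Result := FPromptTokens + FCompletionTokens;
end;

var
  Usage: TUsageTracker;
begin
  Usage := TUsageTracker.Create;
  try
    Usage.Add(120, 80);   // first request
    Usage.Add(200, 150);  // second request
    Writeln(Usage.PromptTokens);      // 320
    Writeln(Usage.CompletionTokens);  // 230
    Writeln(Usage.TotalTokens);       // 550
    Usage.Reset;
    Writeln(Usage.TotalTokens);       // 0
  finally
    Usage.Free;
  end;
end.
```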

Get started

If you are new to integrating AI capabilities into your Delphi applications, download the fully functional TMS AI Studio trial version and discover how you can make your software more powerful with it.
If you have an active TMS ALL-ACCESS subscription, you'll find this product automatically in your toolbox. If you are a student or teacher, we now also offer a free academic version of TMS AI Studio!

We are very eager to learn what amazing new functionality you'll integrate with TMS AI Studio in your apps and what other AI powered functionality you would like to see added in future versions of TMS AI Studio. Let us know in the blog comments below or via email!

Bruno Fierens
