Additional audio transcribing support in TMS AI Studio v1.2.3.0 and more ...
Today
We are pleased to announce the availability of TMS AI Studio v1.2.3.0!
More audio transcribing capabilities
We have extended the support for audio transcribing in several ways. First of all, we have added two services:
- Mistral : using the Mistral Voxtral audio transcribing model
- Gemini : using the multimodal capability of gemini-2.5-flash
Simply set TMSMCPCloudAI.Service to either aiMistral or aiGemini and call TMSMCPCloudAI.Transcribe() on an existing MP3 file or recorded sound buffer and it will return the text detected from the audio.
Note that for direct microphone audio recording, you can use our free Delphi audio library, available from our GitHub repository.
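As a rough sketch of the steps above, the method below selects Mistral as the service, loads an existing MP3 file into a memory stream and passes it to the stream overload of Transcribe. The method name TranscribeWithMistral and the parameter AFileName are hypothetical. TForm1 and TMSMCPCloudAI1 are borrowed from the demo code further down, and the sketch assumes the buffer may be freed right after the call, as that demo does. How the transcribed text is delivered back to the application is not shown here.

```pascal
// Hypothetical helper; it must also be declared in the TForm1 class.
// Requires System.Classes in the uses clause for TMemoryStream.
procedure TForm1.TranscribeWithMistral(const AFileName: string);
var
  Buffer: TMemoryStream;
begin
  // choose the transcription service (aiGemini works the same way)
  TMSMCPCloudAI1.Service := aiMistral;

  Buffer := TMemoryStream.Create;
  try
    // read the MP3 file into memory and rewind the stream
    Buffer.LoadFromFile(AFileName);
    Buffer.Position := 0;
    // hand the MP3 data to the component for transcription
    TMSMCPCloudAI1.Transcribe(Buffer);
  finally
    Buffer.Free;
  end;
end;
```

Switching to Gemini only requires replacing aiMistral with aiGemini; the rest of the method stays the same.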
OpenAI audio translation and model configuration
OpenAI not only offers the whisper-1 model for audio transcribing, it also introduced gpt-4o-mini-transcribe and gpt-4o-transcribe. You can now select the model via TMSMCPCloudAI.Settings.OpenAITranscribeModel.
Beyond this, OpenAI can also automatically translate audio into English, irrespective of the language spoken in the audio. This lets your software work with English text regardless of the language the user spoke. It is done via the new Translate method, which accepts an MP3 file or an MP3 sound buffer:
procedure TTMSMCPCloudAI.Translate(SoundFile: string); overload;
procedure TTMSMCPCloudAI.Translate(SoundBuffer: TMemoryStream); overload;
Note that at this moment, this translate capability is only offered through OpenAI.
New TTS / STT demo
We added a new demo that shows the speech-to-text, text-to-speech and prompt-based translation capabilities with OpenAI. You can use this demo application as a spoken-word translator to the language of your choice.
First, we record the audio from the default computer microphone after a click on the green button. When the user clicks stop, we stop the recording, get the recorded sound as an MP3 stream and send it to OpenAI via TMSMCPCloudAI1.Transcribe(s):
var
  s: TMemoryStream;
begin
  if not FIsSpeaking then
  begin
    // start a fresh recording
    ar.ClearRecordedData;
    ar.StartRecording;
  end
  else
  begin
    // stop recording and send the MP3 data for transcription
    ar.StopRecording;
    s := ar.GetMP3Stream(20500);
    try
      s.Position := 0;
      TMSMCPCloudAI1.Transcribe(s);
    finally
      s.Free;
    end;
  end;
end;
procedure TForm1.DoTranslate(Text, Language: string);
begin
  TMSMCPCloudAI1.AssistantRole.Text := 'You are a translator that literally translates this text to ' + Language;
  TMSMCPCloudAI1.Context.Text := Text;
  TMSMCPCloudAI1.Execute;
end;
procedure TForm1.TMSMCPCloudAI1Executed(Sender: TObject; AResponse: TTMSMCPCloudAIResponse; AHttpStatusCode: Integer; AHttpResult: string);
begin
  if AHttpStatusCode div 100 = 2 then // any 2xx HTTP status code = success
  begin
    memo2.Lines.Text := AResponse.Content.Text;
    TMSMCPCloudAI1.Speak(memo2.Lines.Text);
  end;
end;
Usage tracking across requests
Using AI from services such as OpenAI, Gemini, Mistral, Claude, Grok, Perplexity, DeepSeek, ... isn't free. Typically the cost is directly related to the number of tokens used. In the context of AI, a token is a single unit of text, such as a word, a subword or a punctuation mark.
Tokens are counted for the prompt text as well as for the answer text produced by the LLM. In TMSMCPCloudAI, the number of tokens consumed by a prompt request can be retrieved from the response object:
procedure TForm1.TMSMCPCloudAI1Executed(Sender: TObject; AResponse: TTMSMCPCloudAIResponse; AHttpStatusCode: Integer; AHttpResult: string);
begin
  // here we can check the tokens used for the request:
  //   AResponse.TotalTokens: Integer       sum of prompt and completion tokens
  //   AResponse.PromptTokens: Integer      tokens used for the prompt
  //   AResponse.CompletionTokens: Integer  tokens used in the response text produced by the LLM
end;
The usage tracked across requests is available on the component itself:
TMSMCPCloudAI.Usage.TotalTokens: integer
TMSMCPCloudAI.Usage.PromptTokens: integer
TMSMCPCloudAI.Usage.CompletionTokens: integer
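To illustrate how token counts relate to cost, the small console program below sums the counters of two requests and estimates a price. It is only a sketch: the TTokenUsage record, the AddUsage and EstimateCost helpers, the sample token counts and both prices are made up for this example and are not part of TMS AI Studio or of any provider's price list.

```pascal
program TokenCostSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical record holding the same kind of counters as the response object.
  TTokenUsage = record
    PromptTokens: Integer;
    CompletionTokens: Integer;
  end;

const
  // Placeholder prices in USD per one million tokens, NOT real tariffs.
  PromptPricePerMillion = 0.50;
  CompletionPricePerMillion = 1.50;

// Adds the counters of one request to a running total.
procedure AddUsage(var Total: TTokenUsage; const Request: TTokenUsage);
begin
  Inc(Total.PromptTokens, Request.PromptTokens);
  Inc(Total.CompletionTokens, Request.CompletionTokens);
end;

// Estimates the cost of the given usage with the placeholder prices.
function EstimateCost(const Usage: TTokenUsage): Double;
begin
  Result := (Usage.PromptTokens * PromptPricePerMillion +
             Usage.CompletionTokens * CompletionPricePerMillion) / 1000000;
end;

var
  Total, Request: TTokenUsage;
begin
  Total.PromptTokens := 0;
  Total.CompletionTokens := 0;

  // first sample request
  Request.PromptTokens := 1200;
  Request.CompletionTokens := 300;
  AddUsage(Total, Request);

  // second sample request
  Request.PromptTokens := 800;
  Request.CompletionTokens := 700;
  AddUsage(Total, Request);

  WriteLn('Prompt tokens:     ', Total.PromptTokens);
  WriteLn('Completion tokens: ', Total.CompletionTokens);
  WriteLn('Total tokens:      ', Total.PromptTokens + Total.CompletionTokens);
  WriteLn('Estimated cost:    ', FormatFloat('0.000000', EstimateCost(Total)), ' USD');
end.
```

In a real application, the sample values would be replaced by the counters read from AResponse in the Executed event, and the prices by those published by the service you use. Prompt and completion tokens are kept separate here because providers commonly price them differently.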
Get started
If you are new to integrating AI capabilities into your Delphi applications, download the fully functional TMS AI Studio trial version and discover how you can make your software more powerful with it.
If you have an active TMS ALL-ACCESS subscription, you'll find this product automatically in your toolbox. If you are a student or teacher, we now also offer a free academic version of TMS AI Studio!
We are very eager to learn what amazing new functionality you'll integrate with TMS AI Studio in your apps and what other AI-powered functionality you would like to see added in future versions of TMS AI Studio. Let us know in the blog comments below or via email!
Bruno Fierens
