Transcribe endpoint

Dictation combines speech-to-text with command-and-control capabilities. It helps clinicians capture documentation faster and automate repetitive tasks, which in turn supports EHR adoption. It can be offered in desktop, web, and mobile applications and in a variety of languages.

Connecting from a browser? Never expose client_id/client_secret or a full-scope access token to the frontend. Issue a limited-scope token with scope="openid transcribe" from your backend so the token can only be used against this WebSocket endpoint. See Limited-scope credentials for streaming APIs.
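As a rough illustration of the backend side of this advice, the sketch below builds (but does not send) a token request limited to scope="openid transcribe". It assumes a standard OAuth 2.0 client-credentials grant; the token URL and the function name are hypothetical placeholders, not values taken from this page, so check the limited-scope credentials guide for the real endpoint and parameters.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder: replace with the token URL for your tenant and environment.
TOKEN_URL = "https://auth.example.com/oauth2/token"


def build_limited_scope_token_request(
    client_id: str, client_secret: str, token_url: str = TOKEN_URL
) -> Request:
    """Build a form-encoded token request restricted to the transcribe scope.

    Assumption: the identity provider accepts a standard OAuth 2.0
    client-credentials grant. This runs on the backend only; the secret
    never leaves the server.
    """
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "openid transcribe",  # scope named in the note above
        }
    ).encode("ascii")
    return Request(
        token_url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

The backend would send this request, then hand only the resulting limited-scope access token to the browser.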

This page explains the key features of Corti dictation, which is powered by the /transcribe API endpoint. Corti provides both a dictation API and a web component, ready for you to integrate with your app.

Dictation is available for all supported languages; however, languages in the Enhanced or Premier tier offer the most functionality.

Feature availability per language

Base tier

Language	Code	Interim Results	Commands	Spoken Punctuation	Formatting
All	All

See the languages page for the full list of available languages.

Enhanced and Premier tier

Language	Code	Interim Results	Commands	Spoken Punctuation	Formatting
Arabic	`ar`
Danish	`da`
Dutch	`nl`
English (US)	`en`
English (AU)	`en-AU`
English (UK)	`en-GB`
Finnish	`fi`
French	`fr`
German	`de`
Hungarian	`hu`
Norwegian	`no`
Spanish	`es`
Swedish	`sv`
Swiss French	`fr-CH`
Swiss German	`gsw-CH`
Swiss High German	`de-CH`

Features

Languages

Corti speech-to-text is designed specifically for the healthcare domain. A tier system categorizes the functionality and performance available per language and endpoint. Languages in the Enhanced and Premier tiers offer the most functionality and the highest recognition accuracy, and they are the ones recommended for dictation use.

Interim Results

Interim results are low-latency previews of the final transcript text. They give the user visual feedback during active dictation, reassuring them that audio is being processed and enabling real-time quality validation.
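One way to picture the interim/final distinction on the client is a buffer that overwrites a preview on every interim result and commits text only when a final result arrives. This is a minimal sketch: the class, its method names, and the naive spacing rule are illustrative assumptions, not part of the Corti API.

```python
from dataclasses import dataclass


def _join(left: str, right: str) -> str:
    """Join two fragments with a single space unless one side already handles spacing."""
    if not left:
        return right
    if not right:
        return left
    if left[-1].isspace() or right[0].isspace() or right[0] in ".,;:!?":
        return left + right
    return left + " " + right


@dataclass
class DictationBuffer:
    committed: str = ""  # text from final results; never rewritten
    preview: str = ""    # latest interim result; replaced on every update

    def apply(self, text: str, is_final: bool) -> None:
        """Apply one transcript result (hypothetical shape: text plus a final flag)."""
        if is_final:
            self.committed = _join(self.committed, text)
            self.preview = ""  # the final result supersedes the preview
        else:
            self.preview = text

    def render(self) -> str:
        """Text to display: committed text followed by the current preview."""
        return _join(self.committed, self.preview)
```

In a real integration, the preview would typically be styled differently (for example, greyed out) so users can tell it may still change.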

Punctuation

Punctuation is essential for coherent documentation. Setting the spokenPunctuation parameter to true in the /transcribe configuration lets users control, by voice, when punctuation is inserted in the document output. Additionally, the automaticPunctuation parameter can be used to have the AI model add periods and commas as appropriate.
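A configuration payload that sets these two parameters might be assembled as follows. Only spokenPunctuation and automaticPunctuation come from this page; the surrounding envelope (the "type" and "configuration" keys) and the helper name are assumptions made for illustration, so confirm the exact message shape in the /transcribe API reference.

```python
import json


def build_punctuation_config(spoken: bool, automatic: bool) -> str:
    """Serialize a configuration message carrying the two punctuation flags."""
    configuration = {
        "spokenPunctuation": spoken,        # user dictates punctuation by voice
        "automaticPunctuation": automatic,  # model adds periods and commas
    }
    # Assumption: the envelope keys below are illustrative, not confirmed by this page.
    return json.dumps({"type": "config", "configuration": configuration})
```

For example, build_punctuation_config(spoken=True, automatic=False) produces a message that leaves punctuation entirely under the speaker's control.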

Commands

Commands are a key capability that takes the system beyond speech-to-text to a complete dictation solution. Put your users in the driver's seat of their workflow by defining commands to insert templates, navigate the application, automate repetitive tasks, and more.
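On the application side, recognized commands usually need to be routed to the code that performs the action. The sketch below shows one possible routing table. The CommandRouter class, the command identifiers, and the variables dictionary are all hypothetical and do not reflect the actual command schema of the /transcribe endpoint.

```python
from typing import Callable, Dict, Optional


class CommandRouter:
    """Map command identifiers to application handlers (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], None]] = {}

    def register(self, command_id: str, handler: Callable[[dict], None]) -> None:
        self._handlers[command_id] = handler

    def dispatch(self, command_id: str, variables: Optional[dict] = None) -> bool:
        """Run the handler for a recognized command; return False if none is registered."""
        handler = self._handlers.get(command_id)
        if handler is None:
            return False
        handler(variables or {})
        return True


# Usage with hypothetical command identifiers:
actions = []
router = CommandRouter()
router.register("insert_template", lambda v: actions.append(("template", v.get("name"))))
router.register("next_section", lambda v: actions.append(("navigate", "next")))
router.dispatch("insert_template", {"name": "discharge summary"})
```

Returning False for unknown identifiers lets the application decide whether to ignore the command or surface a warning to the user.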

Formatting

Speech-to-text can be used to create a verbatim transcript of the audio; however, some content is not documented in the same manner as it is verbalized. The formatting features provide control over how key information should be represented in the textual output.

Audio Events

Audio events are real-time notifications, sent during audio streaming, about audio quality and speech activity. They are intended to notify the integrator of audio health degradation, periods of silence, or other conditions that could drive application behavior or user warnings.

Replacements

Define words or phrases that should be returned in place of the standard output of the speech-to-text model.

Keyterms

Bias speech-to-text output so that new words can be introduced to the system vocabulary (e.g., surnames) or to improve recognition reliability for homophones and words with ambiguous pronunciation.

Dictation workflows

Hold-to-talk

Most common for dictation using a handheld microphone. It gives the user the most direct control over the microphone: press and hold the record button to turn it on, and release the button to turn it off.

Toggle-to-talk

Most common for dictation using a wearable or desktop microphone, where the microphone is turned on and remains in an active recording state until it is turned off.
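The difference between the two workflows comes down to how button events map to the recording state. A minimal sketch of that mapping, with hypothetical class and method names that are not part of any Corti SDK:

```python
from enum import Enum


class Mode(Enum):
    HOLD = "hold-to-talk"
    TOGGLE = "toggle-to-talk"


class MicController:
    """Track recording state from button press and release events."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.recording = False

    def button_down(self) -> None:
        if self.mode is Mode.HOLD:
            self.recording = True  # record only while the button is held
        else:
            self.recording = not self.recording  # each press flips the state

    def button_up(self) -> None:
        if self.mode is Mode.HOLD:
            self.recording = False  # releasing the button stops recording
        # In toggle mode, releasing the button changes nothing.
```

An application would start or stop sending audio to the endpoint whenever the recording flag changes.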

See the detailed guide on how to insert transcript segments correctly, including handling of whitespace, interim vs. final results, and the text vs. rawTranscriptText fields.

Please contact us for more information or help.
