The speech-to-text feature of the Speech service, part of Azure Cognitive Services, transcribes audio streams into text in real time using machine learning. Your application can display this text to the user or act on it as command input. You can call the service through an SDK client library (for supported platforms and languages) or a representational state transfer (REST) API, so developers can add end-to-end, real-time speech transcription to their applications or services.
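As a rough illustration of the REST option, the sketch below builds a request for the short-audio recognition endpoint and pulls the transcript out of a JSON response. It uses only the Python standard library. The function names (`build_stt_request`, `extract_text`, `transcribe`) are hypothetical helpers invented for this example, the audio is assumed to be 16 kHz PCM WAV, and the region and key are placeholders that come from your own Speech resource. Check the endpoint path and headers against the current REST reference before relying on them.

```python
import json
import urllib.parse
import urllib.request


def build_stt_request(region, key, wav_bytes, language="en-US"):
    """Build (but do not send) a short-audio speech-to-text request.

    Assumes the regional endpoint pattern <region>.stt.speech.microsoft.com
    and 16 kHz PCM WAV input; adjust both for your own resource and audio.
    """
    query = urllib.parse.urlencode({"language": language})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # key of your Speech resource
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=wav_bytes, headers=headers, method="POST")


def extract_text(response_body):
    """Return the display text from a JSON response, or None if nothing was recognized."""
    result = json.loads(response_body)
    if result.get("RecognitionStatus") == "Success":
        return result.get("DisplayText", "")
    return None


def transcribe(region, key, wav_path):
    """Send one WAV file to the service and return the transcript (network call)."""
    with open(wav_path, "rb") as audio_file:
        request = build_stt_request(region, key, audio_file.read())
    with urllib.request.urlopen(request) as response:
        return extract_text(response.read())
```

Calling `transcribe("<your-region>", "<your-key>", "sample.wav")` would send a single file and return the recognized text. Keeping request construction separate from the network call makes it possible to inspect the URL and headers without contacting the service.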
Speech services are designed to perform real-time speech-to-text for scenarios like:
- Translation of live presentations
- In-person or remote translated communications
- Customer support
- Business intelligence
- Media subtitling
- Multilingual AI interactions
Before you can begin performing speech-to-text transcription, you need to create an Azure Speech resource. You can do this by using the Azure portal, the Azure CLI, or Azure Cloud Shell. This exercise uses the Azure portal.
Sign in to the Azure portal.
Select + Create a resource. In the Search the Marketplace box, type speech and press Enter.
In the Results list, select Speech. In the Speech pane, select Create.
Enter a unique name for your Speech Service resource.
Select an appropriate subscription.
Choose a location to host the resource. This is typically the region where the resource will be used.
For the Pricing tier, select a tier. The available tiers may change, but currently you can select F0 or S0. For testing, we selected F0.
Create a new resource group (RG) named mslearn-speechapi to hold your resources. You can also choose an existing RG if you prefer.
Select Create to create the Speech resource.
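The portal steps above can also be approximated with the Azure CLI, which the text mentions as an alternative. The following sketch assembles the two CLI commands in Python and, by default, only prints them. The helper names (`speech_resource_commands`, `run`), the resource name `my-speech-resource`, and the `westus` location are assumptions made for illustration. The flags are written as commonly documented for `az group create` and `az cognitiveservices account create`, so confirm them with `--help` before running anything.

```python
import shlex
import subprocess


def speech_resource_commands(name, resource_group, location, sku="F0"):
    """Return the Azure CLI commands that mirror the portal steps.

    The first command creates the resource group; the second creates a
    Speech resource (kind "SpeechServices") in that group.
    """
    create_group = [
        "az", "group", "create",
        "--name", resource_group,
        "--location", location,
    ]
    create_speech = [
        "az", "cognitiveservices", "account", "create",
        "--name", name,
        "--resource-group", resource_group,
        "--kind", "SpeechServices",
        "--sku", sku,            # F0 is the free tier used for testing
        "--location", location,
        "--yes",                 # skip the interactive confirmation prompt
    ]
    return [create_group, create_speech]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            subprocess.run(command, check=True)


# Hypothetical resource name and location; mslearn-speechapi is the group used above.
commands = speech_resource_commands("my-speech-resource", "mslearn-speechapi", "westus")
run(commands)  # dry run: prints the commands without calling the Azure CLI
```

Passing `dry_run=False` would execute the commands, which requires the Azure CLI to be installed and signed in with `az login`.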