Audio to Text

This endpoint transcribes an audio file using the specified model and parameters.

Request body parameters

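file (file): The audio file object (not the file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.

model (text): ID of the model to use. Currently only whisper-1 (powered by the open-source Whisper V2 model) is available.

prompt (text): Optional text to guide the model's style or to continue a previous audio segment. The prompt should match the language of the audio.

response_format (text): Output format. One of json, text, srt, verbose_json, or vtt.

temperature (text): Sampling temperature, between 0 and 1. Higher values (such as 0.8) make the output more random; lower values (such as 0.2) make it more focused and deterministic. If set to 0, the model uses log probability to raise the temperature automatically until certain thresholds are reached.

language (text): The language of the input audio. Supplying the input language in ISO-639-1 format improves accuracy and latency.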
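The constraints in the parameter list above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the API itself: the function name build_form_fields is hypothetical, the file-extension test is only a heuristic for the supported formats, and the language check verifies the two-letter shape of an ISO-639-1 code rather than looking it up in the actual code list.

```python
from pathlib import Path

# Values copied from the parameter list above.
SUPPORTED_EXTENSIONS = {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_form_fields(filename, model="whisper-1", prompt=None,
                      response_format=None, temperature=None, language=None):
    """Validate the arguments and return the text form fields as strings.

    Hypothetical helper: the audio file itself is attached separately.
    """
    # Heuristic: judge the audio format by the file extension only.
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext!r}")

    fields = {"model": model}
    if prompt is not None:
        fields["prompt"] = prompt
    if response_format is not None:
        if response_format not in RESPONSE_FORMATS:
            raise ValueError(f"unknown response_format: {response_format!r}")
        fields["response_format"] = response_format
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        fields["temperature"] = str(temperature)
    if language is not None:
        # Shape check only: ISO-639-1 codes are two ASCII letters.
        if not (len(language) == 2 and language.isascii() and language.isalpha()):
            raise ValueError(f"not a two-letter language code: {language!r}")
        fields["language"] = language.lower()
    return fields
```

Optional parameters that are left as None are omitted from the result, so the server-side defaults apply.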

curl --location --request POST 'https://api.deerapi.com/v1/audio/transcriptions' \
  --header 'Authorization: Bearer {{api-key}}' \
  --form 'file=@""' \
  --form 'model="whisper-1"' \
  --form 'prompt="eiusmod nulla"' \
  --form 'response_format="json"' \
  --form 'temperature="0"' \
  --form 'language=""'
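For readers calling the endpoint from code, the same multipart/form-data request can be assembled with the Python standard library alone. This is a rough sketch under stated assumptions: encode_multipart and build_request are hypothetical helper names, the file part is labeled application/octet-stream for simplicity, and the file name is assumed to contain no double quotes. The request object is only built here, not sent.

```python
import urllib.request
import uuid

# Endpoint taken from the curl example above.
API_URL = "https://api.deerapi.com/v1/audio/transcriptions"


def encode_multipart(fields, file_name, file_bytes, boundary=None):
    """Encode text fields plus one 'file' part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The audio goes into the part named "file", as in the curl example.
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key, fields, file_name, file_bytes):
    """Build (but do not send) the POST request for the transcription endpoint."""
    body, content_type = encode_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
    )
```

Passing the returned object to urllib.request.urlopen would send it; reading the audio bytes from disk and supplying a real API key are left to the caller.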

{ "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. This is a place where you can get to do that." }
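The sample response above is a JSON object with a single text field. A small sketch of pulling the transcript out of a response body follows. It assumes, without confirmation from this page, that the json and verbose_json formats both return a JSON object containing text, and that the text, srt, and vtt formats return a plain-text body; extract_transcript is a hypothetical helper name.

```python
import json

# Assumption: these two formats return a JSON object with a "text" field.
JSON_FORMATS = {"json", "verbose_json"}


def extract_transcript(raw_body, response_format="json"):
    """Return the transcript from a raw response body (bytes)."""
    text = raw_body.decode("utf-8")
    if response_format in JSON_FORMATS:
        return json.loads(text)["text"]
    # Assumption: text, srt, and vtt bodies are already the final output.
    return text
```

For example, calling extract_transcript on the bytes of the sample response above returns the sentence stored under text.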
