douyin-archive/app/api/stt/prompt.md

You will receive an audio clip. Perform voice activity detection (VAD) and sound-source classification on it.

**Output JSON Schema (example)**

```json
{
  "speech_detected": true,
  "language": "zh-CN",
  "audio_type": null,
  "transcript": ["大家好，我是xxx", "欢迎来到今天的视频", "今天我们来聊聊AI的未来"],
  "non_speech_summary": null
}
```

**When no speech is detected, return:**

```json
{
  "speech_detected": false,
  "language": null,
  "audio_type": "music | ambience | animal | mechanical | other",
  "transcript": [],
  "non_speech_summary": "Example: pure music (solo piano, slow and soothing); or ambient sound (rain with thunder)."
}
```

**Additional rules**

* `language` may hold multiple comma-separated tags (e.g. "zh-CN,en"), or `null` when no speech is detected.
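
For callers that consume this output, a minimal Python sketch of how a reply might be checked against the two shapes above. The names `parse_language` and `validate_stt_result` are hypothetical and not part of this repository. The strict checks are assumptions drawn from the examples: `language` must carry at least one tag when speech is detected, and must be `null` with an empty `transcript` otherwise.

```python
import json

# Allowed values for audio_type when no speech is detected (taken from the example above).
AUDIO_TYPES = {"music", "ambience", "animal", "mechanical", "other"}
REQUIRED_KEYS = {"speech_detected", "language", "audio_type", "transcript", "non_speech_summary"}


def parse_language(value):
    """Split a comma-separated language field into a list of tags; None yields []."""
    if value is None:
        return []
    if not isinstance(value, str):
        raise ValueError("language must be a string or null")
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def validate_stt_result(raw):
    """Parse a JSON reply and check it against the speech / no-speech shapes."""
    data = json.loads(raw)  # strict JSON: a trailing comma raises an error here
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(data["speech_detected"], bool):
        raise ValueError("speech_detected must be a boolean")
    if not isinstance(data["transcript"], list):
        raise ValueError("transcript must be a list")

    tags = parse_language(data["language"])
    if data["speech_detected"]:
        # Assumption: at least one language tag is expected alongside speech.
        if not tags:
            raise ValueError("language is required when speech is detected")
    else:
        # Assumption: the no-speech shape is enforced exactly as in the example.
        if tags or data["transcript"]:
            raise ValueError("language and transcript must be empty without speech")
        if data["audio_type"] not in AUDIO_TYPES:
            raise ValueError(f"unexpected audio_type: {data['audio_type']!r}")
    return data
```

Because `json.loads` is strict, a reply that is not valid JSON fails early with a `ValueError` subclass, so the caller can retry or log the raw text.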