You will receive an audio clip. Perform voice activity detection (VAD) and sound source classification on it.

**Output JSON schema (example)**

```json
{
  "speech_detected": true,
  "language": "zh-CN",
  "audio_type": null,
  "transcript": ["大家好,我是xxx", "欢迎来到今天的视频", "今天我们来聊聊AI的未来"],
  "non_speech_summary": null
}
```

**When no speech is detected, return:**

```json
{
  "speech_detected": false,
  "language": null,
  "audio_type": "music | ambience | animal | mechanical | other",
  "transcript": [],
  "non_speech_summary": "Example: pure music - solo piano, slow and soothing; or ambience - rain accompanied by thunder."
}
```

**Additional rules**

* `language` may hold multiple labels (e.g. "zh-CN,en"), or be `null` when there is no speech.
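
A caller receiving this JSON may want to check it before use. The following is a minimal sketch, not part of the prompt itself: the helper names `parse_languages` and `validate_result` are hypothetical, and it assumes that the five `audio_type` values listed above are the complete set and that a reply with speech always carries a non-empty `transcript`.

```python
import json
from typing import Any, Dict, List, Optional

# Assumption: the values listed in the no-speech example are exhaustive.
AUDIO_TYPES = {"music", "ambience", "animal", "mechanical", "other"}
REQUIRED_KEYS = {
    "speech_detected",
    "language",
    "audio_type",
    "transcript",
    "non_speech_summary",
}


def parse_languages(language: Optional[str]) -> List[str]:
    """Split a possibly multi-label language field such as "zh-CN,en"."""
    if language is None:
        return []
    return [tag.strip() for tag in language.split(",") if tag.strip()]


def validate_result(raw: str) -> Dict[str, Any]:
    """Parse a reply and check it against the two shapes shown above."""
    result = json.loads(raw)
    if not isinstance(result, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - set(result)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(result["speech_detected"], bool):
        raise ValueError("speech_detected must be a boolean")
    if not isinstance(result["transcript"], list):
        raise ValueError("transcript must be a list")

    if result["speech_detected"]:
        # Assumed rule: speech implies at least one language and one line.
        if not parse_languages(result["language"]):
            raise ValueError("language is required when speech is detected")
        if not result["transcript"]:
            raise ValueError("transcript is empty although speech was detected")
    else:
        if result["language"] is not None:
            raise ValueError("language must be null when there is no speech")
        if result["audio_type"] not in AUDIO_TYPES:
            raise ValueError(f"unknown audio_type: {result['audio_type']!r}")
        if result["transcript"]:
            raise ValueError("transcript must be empty when there is no speech")
    return result
```

Calling `validate_result(reply_text)` returns the parsed dictionary or raises `ValueError`; `parse_languages(result["language"])` then yields the individual language tags for multi-label replies.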