Summary
Packaged research-oriented ASR capability into a web demo with local inference, preprocessing, batch handling, and structured export.
Business Value
Validated solution feasibility with a rapid prototype, reducing project initiation risk.
Engineering Depth
Demonstrates rapid domain knowledge absorption and strict project timeline management.
Evidence
Project acceptance confirmation
Delivery record · Confidence Medium · Verified 2026-02-10
- Evidence level: strict review (core sections only show verifiable metrics)
- Source type: Project delivery record
- Source link: no public link provided, review against delivery records
- Verified at: 2026-02-10 (127 days ago, fresh evidence)
Rationale: Medium confidence: missing a public source link.
Repository · Confidence High · Verified 2026-03-31
- Evidence level: strict review (core sections only show verifiable metrics)
- Source type: Repository / code records
- Source link: no public link provided, review against delivery records
- Verified at: 2026-03-31 (78 days ago, fresh evidence)
Rationale: High confidence: organized under strict evidence rules, traceable to repository or code records, verified 78 days ago.
Background
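The client wanted to validate the feasibility of automatic transcription of Chinese dialect speech in research settings, and needed a prototype system that could be demonstrated and trialed directly within a short timeframe.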
Challenge
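The local ASR model had to be wrapped reliably as a web application while also supporting batch upload, in-browser recording, result export, and usability for non-technical users, all within a tight time window.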
Action and Results
Solution
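- Inference wrapping: Wrapped the local model with Flask and the transformers ASR pipeline, using torchaudio and ffmpeg to normalize audio formats.
- Interaction flow: Supports multi-file upload, streaming upload, in-browser recording, sample audio, and re-transcription after upload, covering the main demo and trial paths.
- Structured results: Maintains tone/initial/final mappings in an Excel sheet, outputs IPA-split results, and supports text and Excel export.
- Operability: Added request logs, error logs, concurrent batch processing, and a system manual page to ease acceptance and later troubleshooting.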
Result
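Delivered the IPA Demo prototype on schedule, forming a complete demo loop of "upload/record -> transcribe -> IPA split -> export" that can support follow-up engineering discussions and project approval evaluation.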
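The "IPA split -> export" end of that loop might look roughly like the sketch below. It makes several assumptions: the transcription is taken to be romanized syllables with a trailing tone digit, the mapping tables are small hypothetical placeholders held in memory (the project keeps its real mappings in an Excel sheet), and CSV stands in for the text/Excel export. The names `split_syllable` and `export_csv` are illustrative only.

```python
import csv
import io

# Hypothetical placeholder mappings, for illustration only.
INITIALS = {"ng": "ŋ", "n": "n", "h": "h"}
FINALS = {"ei": "ei", "ou": "ou", "o": "ɔ"}
TONES = {"2": "35", "5": "13"}


def split_syllable(syllable):
    """Split a romanized syllable such as 'nei5' into (initial, final, tone)."""
    body, tone = syllable, ""
    if syllable and syllable[-1].isdigit():
        body, tone = syllable[:-1], syllable[-1]
    # Try longer initials first so that 'ng' is preferred over 'n'.
    initial = ""
    for candidate in sorted(INITIALS, key=len, reverse=True):
        if body.startswith(candidate) and body[len(candidate):] in FINALS:
            initial = candidate
            break
    final = body[len(initial):]
    if final not in FINALS:
        raise KeyError(f"unknown final in syllable: {syllable!r}")
    return INITIALS.get(initial, ""), FINALS[final], TONES.get(tone, "")


def export_csv(syllables):
    """Return one CSV row per syllable with its split components."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["syllable", "initial", "final", "tone"])
    for syllable in syllables:
        writer.writerow([syllable, *split_syllable(syllable)])
    return buffer.getvalue()
```

Keeping the mappings as data rather than code mirrors the source's choice of an Excel sheet: the tables can be corrected without touching the splitting logic.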
Key Signals
- Wrapped local ASR inference and audio preprocessing.
- Added batch processing and structured result export.
- Delivered a web-based demo for feasibility validation.
Tech Stack
Flask, PyTorch, Transformers, Torchaudio, ASR, Pandas, FFmpeg