Conversation
Summary of ChangesHello @realquantumcookie, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! 此PR旨在解决Telegram语音消息与OpenAI API语音转文本(STT)模型之间的兼容性问题。由于Telegram语音消息采用OGG格式,导致STT模型无法正确处理。本次更改在Telegram适配器中引入了新功能,能够下载OGG格式的语音消息并将其转换为WAV格式,从而确保与STT服务的顺畅集成。 Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Hey - 我发现了两个问题,并给出了一些总体反馈:
- 直接使用
file_basename作为临时文件名,在多个语音消息同名的情况下可能会产生冲突;建议在临时路径和输出路径中添加一个唯一后缀(例如消息 ID 或 UUID),以避免并发请求中的竞争条件。 - 新增的下载/转换逻辑没有处理失败场景(例如下载过程中网络错误或转换错误),目前会直接中断流程;建议用 try/except 包裹这部分逻辑,并在失败时回退到原始的
file.file_path或返回一个清晰的错误响应。
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Using `file_basename` directly for the temp filename can cause collisions if multiple voice messages share the same name; consider adding a unique suffix (e.g., message ID or a UUID) to the temp and output paths to avoid race conditions in concurrent requests.
- The new download/convert logic does not handle failures (e.g., network error during download or conversion error) and will currently break the flow; consider wrapping this block in try/except and falling back to the original `file.file_path` or returning a clear error response.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/telegram/tg_adapter.py" line_range="384-391" />
<code_context>
file = await update.message.voice.get_file()
+
+ file_basename = os.path.basename(file.file_path)
+ temp_dir = get_astrbot_temp_path()
+ temp_path = os.path.join(temp_dir, file_basename)
+ temp_path = await download_image_by_url(file.file_path, path=temp_path)
+ path_wav = os.path.join(
+ temp_dir,
+ f"{file_basename}.wav",
+ )
+ path_wav = await convert_audio_to_wav(temp_path, path_wav)
+
message.message = [
</code_context>
<issue_to_address>
**suggestion (performance):** 考虑一下这些临时音频文件在何时以及如何被清理,以避免磁盘空间被占满。
原始下载文件(`temp_path`)和转换后的文件(`path_wav`)都会保留在 `temp_dir` 中且没有清理逻辑。对于短时处理的场景,它们应该写入到一个会被定期清理的位置,或者在处理完成后显式删除;否则,对于长期运行的 bot,这些文件会逐渐占满磁盘。
推荐实现方式:
```python
import os
from astrbot.core.star.star_handler import star_handlers_registry
from astrbot.core.utils.io import download_file, download_image_by_url, file_to_base64
```
```python
elif update.message.voice:
file = await update.message.voice.get_file()
file_basename = os.path.basename(file.file_path)
temp_dir = get_astrbot_temp_path()
temp_path = os.path.join(temp_dir, file_basename)
path_wav = os.path.join(
temp_dir,
f"{file_basename}.wav",
)
try:
temp_path = await download_image_by_url(file.file_path, path=temp_path)
path_wav = await convert_audio_to_wav(temp_path, path_wav)
message.message = [
Comp.Record(file=path_wav, url=path_wav),
]
finally:
# Best-effort cleanup of temporary audio files to avoid disk buildup
for _path in (temp_path, path_wav):
if _path and os.path.exists(_path):
try:
os.remove(_path)
except OSError:
# Ignore cleanup errors; they shouldn't break message handling
pass
```
这里假设所有需要读取 WAV 文件的处理逻辑都会在同一个调用栈中同步完成(即在 `finally` 执行之前完成)。如果你的处理流水线会延迟读取 `Comp.Record.file`/`.url`(例如在另一个任务或 worker 中),就需要将清理逻辑移动到生命周期的更后阶段——在语音消息完全处理完之后——或者重构 `Comp.Record`,使其使用内存数据而不是文件路径。
</issue_to_address>
### Comment 2
<location path="astrbot/core/platform/sources/telegram/tg_adapter.py" line_range="383-385" />
<code_context>
elif update.message.voice:
file = await update.message.voice.get_file()
+
+ file_basename = os.path.basename(file.file_path)
+ temp_dir = get_astrbot_temp_path()
+ temp_path = os.path.join(temp_dir, file_basename)
+ temp_path = await download_image_by_url(file.file_path, path=temp_path)
+ path_wav = os.path.join(
</code_context>
<issue_to_address>
**issue (bug_risk):** 临时文件仅使用 basename 作为文件名,在并发或重复处理场景下可能发生冲突。
由于 `temp_path` 只是来自 `os.path.basename(file.file_path)`,不同的语音消息如果有相同的 basename(或者同一条消息被并发处理),就可能在 `temp_dir` 中互相覆盖,从而导致竞争条件或损坏的音频文件。请在临时文件名中增加一个唯一标识(例如 UUID、时间戳或 update/message ID)。
</issue_to_address>帮我变得更有用!请在每条评论上点 👍 或 👎,我会根据你的反馈改进评审质量。
Original comment in English
Hey - I've found 2 issues, and left some high level feedback:
- Using
file_basenamedirectly for the temp filename can cause collisions if multiple voice messages share the same name; consider adding a unique suffix (e.g., message ID or a UUID) to the temp and output paths to avoid race conditions in concurrent requests. - The new download/convert logic does not handle failures (e.g., network error during download or conversion error) and will currently break the flow; consider wrapping this block in try/except and falling back to the original
file.file_pathor returning a clear error response.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Using `file_basename` directly for the temp filename can cause collisions if multiple voice messages share the same name; consider adding a unique suffix (e.g., message ID or a UUID) to the temp and output paths to avoid race conditions in concurrent requests.
- The new download/convert logic does not handle failures (e.g., network error during download or conversion error) and will currently break the flow; consider wrapping this block in try/except and falling back to the original `file.file_path` or returning a clear error response.
## Individual Comments
### Comment 1
<location path="astrbot/core/platform/sources/telegram/tg_adapter.py" line_range="384-391" />
<code_context>
file = await update.message.voice.get_file()
+
+ file_basename = os.path.basename(file.file_path)
+ temp_dir = get_astrbot_temp_path()
+ temp_path = os.path.join(temp_dir, file_basename)
+ temp_path = await download_image_by_url(file.file_path, path=temp_path)
+ path_wav = os.path.join(
+ temp_dir,
+ f"{file_basename}.wav",
+ )
+ path_wav = await convert_audio_to_wav(temp_path, path_wav)
+
message.message = [
</code_context>
<issue_to_address>
**suggestion (performance):** Consider how/when these temporary audio files are cleaned up to avoid disk buildup.
Both the original download (`temp_path`) and converted file (`path_wav`) remain in `temp_dir` with no cleanup. For short‑lived processing, they should either be written to a location that’s periodically cleaned or explicitly deleted once processing is complete; otherwise, long‑running bots will gradually fill the disk.
Suggested implementation:
```python
import os
from astrbot.core.star.star_handler import star_handlers_registry
from astrbot.core.utils.io import download_file, download_image_by_url, file_to_base64
```
```python
elif update.message.voice:
file = await update.message.voice.get_file()
file_basename = os.path.basename(file.file_path)
temp_dir = get_astrbot_temp_path()
temp_path = os.path.join(temp_dir, file_basename)
path_wav = os.path.join(
temp_dir,
f"{file_basename}.wav",
)
try:
temp_path = await download_image_by_url(file.file_path, path=temp_path)
path_wav = await convert_audio_to_wav(temp_path, path_wav)
message.message = [
Comp.Record(file=path_wav, url=path_wav),
]
finally:
# Best-effort cleanup of temporary audio files to avoid disk buildup
for _path in (temp_path, path_wav):
if _path and os.path.exists(_path):
try:
os.remove(_path)
except OSError:
# Ignore cleanup errors; they shouldn't break message handling
pass
```
This change assumes that any processing that needs to read the WAV file happens synchronously within the same call stack (i.e., before the `finally` block executes). If your pipeline defers reading `Comp.Record.file`/`.url` until later (e.g., in another task or worker), you'll need to move the cleanup logic to a later point in the lifecycle—after the voice message has been fully processed—or refactor `Comp.Record` to work with in-memory data instead of a file path.
</issue_to_address>
### Comment 2
<location path="astrbot/core/platform/sources/telegram/tg_adapter.py" line_range="383-385" />
<code_context>
elif update.message.voice:
file = await update.message.voice.get_file()
+
+ file_basename = os.path.basename(file.file_path)
+ temp_dir = get_astrbot_temp_path()
+ temp_path = os.path.join(temp_dir, file_basename)
+ temp_path = await download_image_by_url(file.file_path, path=temp_path)
+ path_wav = os.path.join(
</code_context>
<issue_to_address>
**issue (bug_risk):** Using only the basename for temp files can cause collisions under concurrent or repeated processing.
Because `temp_path` comes only from `os.path.basename(file.file_path)`, different voice messages with the same basename (or concurrent handling of the same message) can overwrite each other in `temp_dir`, leading to race conditions or corrupted audio. Please add a uniqueness component (e.g., UUID, timestamp, or update/message ID) to the temp filename.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| temp_dir = get_astrbot_temp_path() | ||
| temp_path = os.path.join(temp_dir, file_basename) | ||
| temp_path = await download_image_by_url(file.file_path, path=temp_path) | ||
| path_wav = os.path.join( | ||
| temp_dir, | ||
| f"{file_basename}.wav", | ||
| ) | ||
| path_wav = await convert_audio_to_wav(temp_path, path_wav) |
There was a problem hiding this comment.
suggestion (performance): 考虑一下这些临时音频文件在何时以及如何被清理,以避免磁盘空间被占满。
原始下载文件(temp_path)和转换后的文件(path_wav)都会保留在 temp_dir 中且没有清理逻辑。对于短时处理的场景,它们应该写入到一个会被定期清理的位置,或者在处理完成后显式删除;否则,对于长期运行的 bot,这些文件会逐渐占满磁盘。
推荐实现方式:
import os
from astrbot.core.star.star_handler import star_handlers_registry
from astrbot.core.utils.io import download_file, download_image_by_url, file_to_base64 elif update.message.voice:
file = await update.message.voice.get_file()
file_basename = os.path.basename(file.file_path)
temp_dir = get_astrbot_temp_path()
temp_path = os.path.join(temp_dir, file_basename)
path_wav = os.path.join(
temp_dir,
f"{file_basename}.wav",
)
try:
temp_path = await download_image_by_url(file.file_path, path=temp_path)
path_wav = await convert_audio_to_wav(temp_path, path_wav)
message.message = [
Comp.Record(file=path_wav, url=path_wav),
]
finally:
# Best-effort cleanup of temporary audio files to avoid disk buildup
for _path in (temp_path, path_wav):
if _path and os.path.exists(_path):
try:
os.remove(_path)
except OSError:
# Ignore cleanup errors; they shouldn't break message handling
pass这里假设所有需要读取 WAV 文件的处理逻辑都会在同一个调用栈中同步完成(即在 finally 执行之前完成)。如果你的处理流水线会延迟读取 Comp.Record.file/.url(例如在另一个任务或 worker 中),就需要将清理逻辑移动到生命周期的更后阶段——在语音消息完全处理完之后——或者重构 Comp.Record,使其使用内存数据而不是文件路径。
Original comment in English
suggestion (performance): Consider how/when these temporary audio files are cleaned up to avoid disk buildup.
Both the original download (temp_path) and converted file (path_wav) remain in temp_dir with no cleanup. For short‑lived processing, they should either be written to a location that’s periodically cleaned or explicitly deleted once processing is complete; otherwise, long‑running bots will gradually fill the disk.
Suggested implementation:
import os
from astrbot.core.star.star_handler import star_handlers_registry
from astrbot.core.utils.io import download_file, download_image_by_url, file_to_base64 elif update.message.voice:
file = await update.message.voice.get_file()
file_basename = os.path.basename(file.file_path)
temp_dir = get_astrbot_temp_path()
temp_path = os.path.join(temp_dir, file_basename)
path_wav = os.path.join(
temp_dir,
f"{file_basename}.wav",
)
try:
temp_path = await download_image_by_url(file.file_path, path=temp_path)
path_wav = await convert_audio_to_wav(temp_path, path_wav)
message.message = [
Comp.Record(file=path_wav, url=path_wav),
]
finally:
# Best-effort cleanup of temporary audio files to avoid disk buildup
for _path in (temp_path, path_wav):
if _path and os.path.exists(_path):
try:
os.remove(_path)
except OSError:
# Ignore cleanup errors; they shouldn't break message handling
passThis change assumes that any processing that needs to read the WAV file happens synchronously within the same call stack (i.e., before the finally block executes). If your pipeline defers reading Comp.Record.file/.url until later (e.g., in another task or worker), you'll need to move the cleanup logic to a later point in the lifecycle—after the voice message has been fully processed—or refactor Comp.Record to work with in-memory data instead of a file path.
| file_basename = os.path.basename(file.file_path) | ||
| temp_dir = get_astrbot_temp_path() | ||
| temp_path = os.path.join(temp_dir, file_basename) |
There was a problem hiding this comment.
issue (bug_risk): 临时文件仅使用 basename 作为文件名,在并发或重复处理场景下可能发生冲突。
由于 temp_path 只是来自 os.path.basename(file.file_path),不同的语音消息如果有相同的 basename(或者同一条消息被并发处理),就可能在 temp_dir 中互相覆盖,从而导致竞争条件或损坏的音频文件。请在临时文件名中增加一个唯一标识(例如 UUID、时间戳或 update/message ID)。
Original comment in English
issue (bug_risk): Using only the basename for temp files can cause collisions under concurrent or repeated processing.
Because temp_path comes only from os.path.basename(file.file_path), different voice messages with the same basename (or concurrent handling of the same message) can overwrite each other in temp_dir, leading to race conditions or corrupted audio. Please add a uniqueness component (e.g., UUID, timestamp, or update/message ID) to the temp filename.
There was a problem hiding this comment.
Code Review
This PR addresses the issue of Telegram voice messages being in OGG format, which causes problems with the OpenAI API STT model. It implements OGG download and WAV conversion logic in the TG Adaptor to enable proper transcription. However, the current implementation introduces a high-severity Path Traversal vulnerability. User-controlled filenames are used without sanitization, allowing files to be written to arbitrary locations. Please refer to the specific comments for remediation, which involves using securely generated filenames.
| file_basename = os.path.basename(file.file_path) | ||
| temp_dir = get_astrbot_temp_path() | ||
| temp_path = os.path.join(temp_dir, file_basename) | ||
| temp_path = await download_image_by_url(file.file_path, path=temp_path) | ||
| path_wav = os.path.join( | ||
| temp_dir, | ||
| f"{file_basename}.wav", | ||
| ) | ||
| path_wav = await convert_audio_to_wav(temp_path, path_wav) |
There was a problem hiding this comment.
A high-severity Path Traversal vulnerability exists here. The filename from file.file_path is used to construct local file paths without proper sanitization, which could allow an attacker to write files to arbitrary locations. Additionally, the use of download_image_by_url is misleading and incorrect for OGG audio files, as it internally calls save_temp_img which saves files as .jpg. While the provided suggestion addresses the path traversal by generating a secure, random filename and explicitly setting the .ogg extension for the temporary file, ensure that download_image_by_url correctly saves the file with the specified extension, or consider using a more generic file download function like download_file if download_image_by_url forces a .jpg extension.
# To prevent path traversal, generate a random filename instead of using the one from the URL.
safe_basename = uuid.uuid4().hex
temp_dir = get_astrbot_temp_path()
# Assume the downloaded voice file is in ogg format as per the PR description.
temp_path = os.path.join(temp_dir, f\"{safe_basename}.ogg\")
await download_image_by_url(file.file_path, path=temp_path)
path_wav = os.path.join(
temp_dir, f\"{safe_basename}.wav\"
)
path_wav = await convert_audio_to_wav(temp_path, path_wav)| temp_dir, | ||
| f"{file_basename}.wav", | ||
| ) | ||
| path_wav = await convert_audio_to_wav(temp_path, path_wav) |
| from astrbot.core.star.filter.command_group import CommandGroupFilter | ||
| from astrbot.core.star.star import star_map | ||
| from astrbot.core.star.star_handler import star_handlers_registry | ||
| from astrbot.core.utils.io import download_file, download_image_by_url, file_to_base64 |
此PR解决Telegram的语音消息format是ogg而不是wav的问题,在调用openai api STT模型时会出现问题
Modifications / 改动点
在TG Adaptor增加了OGG下载和转换成wav逻辑
Screenshots or Test Results / 运行截图或测试结果
修改前:
修改后:

Checklist / 检查清单
requirements.txt和pyproject.toml文件相应位置。/ I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations inrequirements.txtandpyproject.toml.Summary by Sourcery
Bug Fixes:
Original summary in English
Summary by Sourcery
Bug Fixes: