fix: improve LLM prompt to use actual rubric score ranges

The previous prompt hardcoded 0-3 score examples, which misled the LLM.
The prompt now instructs the LLM to read max_score from the rubric for each criterion.
sit002 2025-12-02 14:21:41 +08:00
parent 1afc2eae48
commit c25bcfddd0
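
The two constraints this commit states in the prompt (each score within 0 to its criterion's max_score, and total equal to the sum of the scores) can also be checked on the parsed reply instead of relying on the model alone. The following is a minimal sketch, not code from this repository: the function name `validate_grading` and the rubric shape (`{"criteria": [{"id": ..., "max_score": ...}]}`) are assumptions made for illustration.

```python
import json


def validate_grading(response_text, rubric):
    """Return a list of problems found in the model's JSON reply (empty if none).

    Assumed (hypothetical) rubric shape:
        {"criteria": [{"id": "accuracy", "max_score": 5}, ...]}
    """
    result = json.loads(response_text)
    max_scores = {c["id"]: c["max_score"] for c in rubric["criteria"]}
    problems = []
    score_sum = 0.0

    for item in result.get("criteria", []):
        cid = item.get("id")
        score = item.get("score")
        if cid not in max_scores:
            problems.append(f"unknown criterion: {cid}")
            continue
        if not isinstance(score, (int, float)):
            problems.append(f"{cid}: score is not a number")
            continue
        # Each score must lie within 0..max_score for its own criterion.
        if not 0 <= score <= max_scores[cid]:
            problems.append(f"{cid}: score {score} outside 0..{max_scores[cid]}")
        score_sum += score

    # total must equal the sum of the criterion scores (two decimal places).
    total = result.get("total")
    if not isinstance(total, (int, float)) or abs(total - round(score_sum, 2)) > 0.005:
        problems.append(f"total {total} does not match sum {round(score_sum, 2)}")

    return problems
```

A caller could add the "need_review" flag whenever the returned list is non-empty, which keeps out-of-range scores from silently reaching the final grade.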

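@@ -36,19 +36,25 @@ def read_file_or_string(value):
 PROMPT_TEMPLATE = """You are a strict and consistent teaching assistant; grade the student's short-answer response against the provided rubric.
-- Grade only by the rubric, without subjective extrapolation; accept varied wording
-- Output no explanatory text; output only JSON, containing:
-{{
-"total": number(0-10, two decimal places),
+- Grade only according to each criterion's max_score and scoring_guide in the rubric
+- Each criterion's score ranges from 0 to that criterion's max_score
+- Output no explanatory text; output only JSON
+
+Output format:
+{{
+"total": number (sum of all criterion scores, two decimal places),
 "criteria": [
-{{"id":"accuracy","score":0-3,"reason":"one bullet-style sentence"}},
-{{"id":"coverage","score":0-3,"reason":""}},
-{{"id":"clarity","score":0-3,"reason":""}}
+{{"id": "criterion id", "score": number(0 to that criterion's max_score), "reason": "brief comment"}},
+...
 ],
 "flags": [],
-"confidence": number(0-1)
+"confidence": number(0-1, grading confidence)
 }}
-If the answer is unrelated to the question, set total=0 and add the flag "need_review"
+
+Important:
+- Each criterion's score must be within 0 to that criterion's max_score
+- total must equal the sum of all criteria scores
+- If the answer is unrelated to the question or empty, set total=0 and add the flag "need_review"
 Question:
 <<<{question}>>>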