fix: improve LLM prompt to use actual rubric score ranges
The previous prompt contained hardcoded 0-3 score examples, which misled the LLM. The prompt now instructs the LLM to read max_score from the rubric for each criterion.
This commit is contained in:
parent
1afc2eae48
commit
c25bcfddd0
@@ -36,19 +36,25 @@ def read_file_or_string(value):
 PROMPT_TEMPLATE = """You are a strict and consistent teaching assistant; grade the student's short answers according to the provided rubric.
 
-- Base the score only on the rubric, without subjective extrapolation; allow varied phrasings.
-- Do not output any explanatory text; output only JSON, containing:
-{{
-"total": number (0-10, two decimal places),
-"criteria": [
-{{"id":"accuracy","score":0-3,"reason":"one bullet-style sentence"}},
-{{"id":"coverage","score":0-3,"reason":""}},
-{{"id":"clarity","score":0-3,"reason":""}}
-],
-"flags": [],
-"confidence": number(0-1)
-}}
-If the answer is unrelated to the question, set total=0 and add the flag "need_review".
-
+- Score only according to each criterion's max_score and scoring_guide in the rubric
+- Each criterion's score ranges from 0 to that criterion's max_score
+- Do not output any explanatory text; output only JSON
+
+Output format:
+{{
+"total": number (sum of the criterion scores, two decimal places),
+"criteria": [
+{{"id": "criterion id", "score": number (0 to that criterion's max_score), "reason": "brief comment"}},
+...
+],
+"flags": [],
+"confidence": number (0-1, grading confidence)
+}}
+
+Important:
+- Each criterion's score must be within 0 to that criterion's max_score
+- total must equal the sum of all criteria's scores
+- If the answer is unrelated to the question or empty, set total=0 and add the flag "need_review"
+
 
 [Question]
 <<<{question}>>>
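The new prompt constrains the model's JSON output: each criterion's score must lie in [0, max_score] and total must equal the sum of the criterion scores. Those constraints can be verified downstream before trusting a grade. A minimal sketch, assuming the rubric is a list of dicts with "id" and "max_score" keys (that shape is a guess, not from this commit):

```python
import json

def validate_grading(response_text: str, rubric: list) -> bool:
    """Check an LLM grading response against the constraints the
    prompt states: each score in [0, max_score], total == sum of scores."""
    data = json.loads(response_text)
    max_by_id = {c["id"]: c["max_score"] for c in rubric}
    total = 0.0
    for item in data["criteria"]:
        if item["id"] not in max_by_id:   # criterion not in the rubric
            return False
        if not 0 <= item["score"] <= max_by_id[item["id"]]:
            return False                  # score outside 0..max_score
        total += item["score"]
    # allow small float/rounding drift in the reported total
    return abs(total - data["total"]) < 0.01

rubric = [{"id": "accuracy", "max_score": 4}, {"id": "coverage", "max_score": 3}]
resp = ('{"total": 6.0, "criteria": ['
        '{"id": "accuracy", "score": 4, "reason": ""}, '
        '{"id": "coverage", "score": 2, "reason": ""}], '
        '"flags": [], "confidence": 0.9}')
print(validate_grading(resp, rubric))  # True
```

A failed check is a natural place to attach the "need_review" flag the prompt already defines, rather than silently accepting an out-of-range score.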