The LLM sometimes returns a total that does not match the sum of the per-criterion scores. We now always compute total = sum(criteria.score) ourselves instead of trusting the returned value.
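
A minimal sketch of the recomputation, assuming the parsed LLM response is a dict with a `criteria` list whose items each carry a numeric `score`. The function name `recompute_total` and the dict layout are hypothetical and only illustrate the idea; the actual data model may differ.

```python
from typing import Any


def recompute_total(result: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `result` whose `total` is the sum of criterion scores.

    Assumed (hypothetical) shape:
        {"criteria": [{"score": <number>, ...}, ...], "total": <number>}
    """
    criteria = result.get("criteria", [])
    # Derive the total from the individual scores.
    total = sum(item["score"] for item in criteria)
    # Overwrite whatever total the model reported; the input is not mutated.
    return {**result, "total": total}


if __name__ == "__main__":
    llm_output = {
        "criteria": [{"name": "clarity", "score": 3}, {"name": "depth", "score": 4}],
        "total": 9,  # inconsistent value reported by the model
    }
    print(recompute_total(llm_output)["total"])  # prints 7
```

Deriving the total in one place means a mismatched value from the model never reaches downstream consumers, at the cost of silently discarding what the model reported.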