This commit is contained in:
commit
e25dff3734
16
Project_Design.md
Normal file
@ -0,0 +1,16 @@
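# Intelligent Knowledge Base Q&A - Project Design Document

## One-Sentence Description

My application is called "Intelligent Knowledge Base Q&A". It is built for companies and courses: it answers questions from their own documents, reducing manual question answering.

## Core Features (MVP)

1. **Document upload** - Upload documents in PDF, Word, TXT, and similar formats; the system parses them automatically and builds a knowledge base index
2. **Intelligent Q&A** - The user enters a question, and the AI retrieves from the uploaded documents and answers based on their content
3. **Knowledge base management** - View the list of uploaded documents, delete documents, and check each document's processing status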
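
The design does not say how the knowledge base index in feature 1 is built. As a minimal sketch, assuming the index is built from fixed-size, overlapping text chunks, the splitting step could look like this. The function name `split_into_chunks` and the default sizes are illustrative assumptions, not part of the project.

```python
from typing import List


def split_into_chunks(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    chunk boundary still appears whole in one of them. Hypothetical helper.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")

    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once the current chunk already reaches the end of the text.
        if start + size >= len(text):
            break
    return chunks
```

The number of chunks returned here would be a natural value for the `chunks` column of the `documents` table.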
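
## Interaction Flow

1. The user opens the app → sees the main screen, which contains a document upload area and a Q&A area
2. The user clicks the upload button → selects a local document → the system shows upload progress and processing status
3. Once the document has been processed → the user types a question into the Q&A input box
4. The user clicks the ask button → the AI searches the knowledge base and returns an answer, along with the source documents it referenced
5. The user can keep asking questions or upload more documents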
235
README.md
Normal file
@ -0,0 +1,235 @@
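# 🧠 Intelligent Knowledge Base Q&A System

A Flask-based intelligent Q&A system for companies and courses. Upload your own documents and get answers grounded in them, reducing the cost of manual question answering.

## ✨ Core Features

- 📚 **Document upload and management**: upload PDF, Word (`.docx`), and TXT documents, which are parsed automatically
- 🤖 **Intelligent Q&A**: answers questions based on the content of the uploaded documents
- 💾 **Conversation history**: every question and answer is saved automatically for later review
- 📱 **Responsive design**: works on both desktop and mobile
- 🎨 **Polished interface**: a modern UI design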

## 🚀 Quick Start

### Requirements

- Python 3.8+
- pip

### Installation

1. **Clone the project**
   ```bash
   git clone <repository-url>
   cd 12
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```

3. **Configure environment variables**

   Create a `.env` file with the following variables. `app.py` reads `DEEPSEEK_API_KEY`, plus an optional `DEEPSEEK_BASE_URL` that defaults to `https://api.deepseek.com`:
   ```env
   DEEPSEEK_API_KEY=your_deepseek_api_key_here
   FLASK_SECRET_KEY=your_secret_key_here
   ```

4. **Initialize the database**
   ```bash
   python app.py
   ```

   The first run creates the database automatically as `knowledge_base.db` in the project root.

5. **Start the application**
   ```bash
   python app.py
   ```

   The application starts at `http://localhost:5000`.
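
## 📖 Usage Guide

### 1. Upload a document

- Click the 📤 upload area in the knowledge base panel on the left, or drag a file onto it
- Select the document to upload (PDF, Word, or TXT)
- The system parses the document and builds the knowledge base index

### 2. Ask a question

- Type a question into the chat input box on the right
- Click the send button or press Enter to submit
- The system answers based on the content of the uploaded documents
- The answer lists its reference sources with document name and page number (the current `app.py` lists every readable document as a source and always reports page 1)

### 3. Manage documents

- View all uploaded documents in the knowledge base panel
- Click the 🗑️ delete button to remove a document you no longer need
- Each document shows its processing status (processing / completed)

### 4. View history

- All questions and answers are saved automatically
- Conversation history is reloaded when the page is refreshed
- Earlier questions and answers can be reviewed at any time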
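
The history feature above maps onto the `conversations` table that `app.py` creates. The following is a small, self-contained sketch of that save-and-reload cycle using an in-memory SQLite database. The helper names `open_db`, `save_conversation`, and `load_conversations` are illustrative, and the `id DESC` tie-breaker is an addition that keeps the order stable when two rows share a timestamp.

```python
import json
import sqlite3
from typing import Any, Dict, List

# Same columns as the conversations table created by init_db() in app.py.
SCHEMA = """
CREATE TABLE IF NOT EXISTS conversations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    question TEXT NOT NULL,
    answer TEXT NOT NULL,
    sources TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""


def open_db(path: str) -> sqlite3.Connection:
    """Open the database, enable name-based row access, and create the table."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    return conn


def save_conversation(conn: sqlite3.Connection, question: str, answer: str,
                      sources: List[Dict[str, Any]]) -> None:
    """Store one question/answer pair; sources are serialized as JSON text."""
    conn.execute(
        "INSERT INTO conversations (question, answer, sources) VALUES (?, ?, ?)",
        (question, answer, json.dumps(sources)),
    )
    conn.commit()


def load_conversations(conn: sqlite3.Connection, limit: int = 50) -> List[Dict[str, Any]]:
    """Return the most recent conversations, newest first."""
    rows = conn.execute(
        "SELECT * FROM conversations ORDER BY created_at DESC, id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    result = []
    for row in rows:
        item = dict(row)
        item["sources"] = json.loads(item["sources"]) if item["sources"] else []
        result.append(item)
    return result
```

Because the rows come back newest first, a chat view that appends messages top to bottom would need to reverse the list before rendering.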
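
## 🎬 Demo Walkthrough

### Scenario 1: Course Q&A

1. **Preparation**
   - Upload the course lecture notes as a PDF
   - Wait for the system to finish parsing the document (about 2-3 seconds)

2. **Asking questions**
   - Enter: "What are the main learning objectives of this course?"
   - The system returns an answer based on the lecture notes and cites the reference page
   - Follow up with: "How do I complete the final assignment?"
   - The system provides a detailed description of the assignment requirements

3. **Results**
   - Show the accuracy of the answers and their reference sources
   - Show conversation history being saved and reloaded

### Scenario 2: Company document lookup

1. **Preparation**
   - Upload the company's rules and regulations
   - Upload the product manual

2. **Asking questions**
   - Enter: "What is the company's process for requesting leave?"
   - The system extracts the relevant content from the rules and regulations
   - Enter: "How long is the warranty for product A?"
   - The system finds the answer in the product manual

3. **Results**
   - Show how a multi-document knowledge base is combined
   - Show the responsive design on mobile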
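
## 🛠️ Technical Architecture

### Back-end stack

- **Flask**: lightweight web framework
- **SQLite**: local database storing conversation history and document metadata
- **DeepSeek API (via the OpenAI Python SDK)**: provides the Q&A capability; `app.py` calls the `deepseek-chat` model through the OpenAI-compatible client
- **LangChain**: document processing and vector retrieval (planned)
- **ChromaDB**: vector database (planned)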
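
Vector retrieval is listed as planned, and `app.py` currently sends the first 3000 characters of each document to the model. As a rough illustration of what ranking text chunks against a question could look like, here is a minimal sketch that uses bag-of-words cosine similarity from the standard library. It is a stand-in and not LangChain or ChromaDB, and the names `tokenize`, `cosine_similarity`, and `rank_chunks` are hypothetical. The `\w+` tokenizer also treats an unbroken run of Chinese characters as a single token, so a real system would need proper word segmentation or embeddings.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_chunks(question: str, chunks: List[str], top_k: int = 3) -> List[Tuple[float, str]]:
    """Return the top_k chunks most similar to the question, best first."""
    query = tokenize(question)
    scored = [(cosine_similarity(query, tokenize(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

With a ranking step like this, only the best-matching chunks would be placed in the prompt, and each chunk could carry its real page number for the source list.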

### Front-end stack

- **HTML5**: page structure
- **CSS3**: styling, including the responsive layout
- **JavaScript**: interaction logic and API calls

### Project structure

```
12/
├── app.py                 # Flask application entry point
├── requirements.txt       # Python dependencies
├── Project_Design.md      # Project design document
├── README.md              # Project README
├── knowledge_base.db      # SQLite database (generated automatically)
├── templates/
│   └── index.html        # Front-end page template
└── static/
    ├── style.css         # Stylesheet
    └── script.js         # JavaScript
```

## 🔧 API Endpoints

### Upload a document
```
POST /api/upload
Content-Type: multipart/form-data

Body: file (the file to upload)
Response: { id, name, status }
```

### List documents
```
GET /api/documents
Response: [{ id, name, status, chunks, created_at }]
```

### Delete a document
```
DELETE /api/documents/{doc_id}
Response: { success: true }
```

### Ask a question
```
POST /api/ask
Content-Type: application/json

Body: { question: "question text" }
Response: { question, answer, sources: [{ doc_id, name, page }] }
```

### Get conversation history
```
GET /api/conversations
Response: [{ id, question, answer, sources, created_at }]
```
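
To show how the endpoints above fit together from a script, here is a minimal client sketch for `POST /api/ask` that uses only the Python standard library. The function names `build_ask_request` and `ask` are illustrative, and the base URL assumes the local development server from the quick start.

```python
import json
import urllib.request
from typing import Any, Dict


def build_ask_request(base_url: str, question: str) -> urllib.request.Request:
    """Build the JSON POST request for /api/ask without sending it."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url: str, question: str, timeout: float = 30.0) -> Dict[str, Any]:
    """Send the question and return the decoded JSON response.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so callers
    that want the server's {"error": ...} body need to catch it.
    """
    request = build_ask_request(base_url, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask("http://localhost:5000", "How long is the warranty for product A?")` would return a dictionary containing the `answer` and `sources` fields.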
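
## 🎨 Interface Features

### Responsive design

- **Desktop** (>1024px): two-column layout, knowledge base on the left, chat on the right
- **Tablet** (768px-1024px): single-column layout with adjusted spacing
- **Mobile** (<768px): full-screen display, vertically stacked, large buttons

### Interaction feedback

- Toast notifications that show operation status in real time
- A character counter that indicates input length
- Loading indicators while a request is in progress
- Emoji icons that aid visual recognition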
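
## 📝 Notes

1. **API key**: make sure the DeepSeek API key (`DEEPSEEK_API_KEY`) is configured correctly
2. **Document formats**: PDF, Word (`.docx`), and TXT are currently supported
3. **Question length**: 3-500 characters is recommended; the front end rejects questions shorter than 3 characters, and the server rejects questions longer than 1000
4. **Database**: conversation history is stored in a local SQLite database
5. **Browser compatibility**: a modern browser such as Chrome, Firefox, or Edge is recommended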
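
The question-length rules are currently split between `static/script.js` (at least 3 characters after trimming) and `/api/ask` in `app.py` (at most 1000 characters). As a sketch of how both checks could live in one place, assuming a hypothetical helper named `validate_question` that returns a short error code:

```python
from typing import Optional

MIN_QUESTION_LENGTH = 3     # lower bound currently enforced in static/script.js
MAX_QUESTION_LENGTH = 1000  # upper bound currently enforced by /api/ask in app.py


def validate_question(question: str) -> Optional[str]:
    """Return an error code for an unacceptable question, or None if it is fine."""
    stripped = question.strip()
    if not stripped:
        return "empty"
    if len(stripped) < MIN_QUESTION_LENGTH:
        return "too_short"
    if len(question) > MAX_QUESTION_LENGTH:
        return "too_long"
    return None
```

Returning codes rather than messages leaves the wording to the caller, so the same function could back both the API response and the UI toast.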

## 🚧 Roadmap

- [ ] Integrate LangChain for more capable document processing
- [ ] Build a vector database with ChromaDB
- [ ] Support more document formats (Excel, PPT, etc.)
- [ ] Add document preview
- [ ] Add conversation export
- [ ] Add user authentication and permission management
- [ ] Support multilingual Q&A

## 📄 License

MIT License

## 🤝 Contributing

Issues and pull requests are welcome!

## 📧 Contact

For questions or suggestions, get in touch by:
- Opening an issue
- Sending email to: your-email@example.com

---

**Enjoy the convenience of intelligent Q&A!** 🎉
BIN
__pycache__/app.cpython-314.pyc
Normal file
Binary file not shown.
334
app.py
Normal file
@ -0,0 +1,334 @@
import os
import sqlite3
import json
from flask import Flask, render_template, request, jsonify
from werkzeug.utils import secure_filename
import uuid
from datetime import datetime
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = 'uploads'
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024  # 16 MB upload limit
app.config['DATABASE'] = 'knowledge_base.db'

DEEPSEEK_API_KEY = os.getenv('DEEPSEEK_API_KEY')
DEEPSEEK_BASE_URL = os.getenv('DEEPSEEK_BASE_URL', 'https://api.deepseek.com')

# The OpenAI SDK client is pointed at the DeepSeek endpoint via base_url.
client = OpenAI(
    api_key=DEEPSEEK_API_KEY,
    base_url=DEEPSEEK_BASE_URL
)

ALLOWED_EXTENSIONS = {'txt', 'pdf', 'docx'}

os.makedirs(app.config['UPLOAD_FOLDER'], exist_ok=True)

# In-memory cache of document rows, keyed by document id.
documents = {}

def init_db():
    conn = sqlite3.connect(app.config['DATABASE'])
    cursor = conn.cursor()
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS conversations (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            question TEXT NOT NULL,
            answer TEXT NOT NULL,
            sources TEXT,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS documents (
            id TEXT PRIMARY KEY,
            name TEXT NOT NULL,
            status TEXT NOT NULL,
            chunks INTEGER DEFAULT 0,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )
    ''')
    conn.commit()
    conn.close()

def get_db_connection():
    conn = sqlite3.connect(app.config['DATABASE'])
    conn.row_factory = sqlite3.Row
    return conn

def load_documents_from_db():
    conn = get_db_connection()
    docs = conn.execute('SELECT * FROM documents ORDER BY created_at DESC').fetchall()
    conn.close()

    global documents
    documents = {doc['id']: dict(doc) for doc in docs}

def allowed_file(filename):
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

def read_document_content(doc_id):
    try:
        for file in os.listdir(app.config['UPLOAD_FOLDER']):
            if file.startswith(doc_id):
                filepath = os.path.join(app.config['UPLOAD_FOLDER'], file)

                # Choose the parser from the file extension
                if file.lower().endswith('.txt'):
                    with open(filepath, 'r', encoding='utf-8') as f:
                        return f.read()

                elif file.lower().endswith('.pdf'):
                    import pypdf
                    with open(filepath, 'rb') as f:
                        reader = pypdf.PdfReader(f)
                        text = ''
                        for page in reader.pages:
                            text += page.extract_text() + '\n'
                        return text

                elif file.lower().endswith('.docx'):
                    from docx import Document
                    doc = Document(filepath)
                    text = ''
                    for paragraph in doc.paragraphs:
                        text += paragraph.text + '\n'
                    return text

                # No recognized extension (secure_filename() can strip a
                # non-ASCII name down to just "docx"): try each format in turn
                else:
                    # Try docx first
                    try:
                        from docx import Document
                        doc = Document(filepath)
                        text = ''
                        for paragraph in doc.paragraphs:
                            text += paragraph.text + '\n'
                        if text.strip():
                            return text
                    except Exception:
                        pass

                    # Then try plain text
                    try:
                        with open(filepath, 'r', encoding='utf-8') as f:
                            text = f.read()
                        if text.strip():
                            return text
                    except Exception:
                        pass

                    # Finally try PDF
                    try:
                        import pypdf
                        with open(filepath, 'rb') as f:
                            reader = pypdf.PdfReader(f)
                            text = ''
                            for page in reader.pages:
                                text += page.extract_text() + '\n'
                            if text.strip():
                                return text
                    except Exception:
                        pass

        return None
    except Exception as e:
        print(f"Error reading document: {e}")
        import traceback
        traceback.print_exc()
        return None

@app.route('/')
def index():
    load_documents_from_db()
    return render_template('index.html')

@app.route('/api/upload', methods=['POST'])
def upload_document():
    try:
        if 'file' not in request.files:
            return jsonify({'error': 'No file provided'}), 400

        file = request.files['file']
        if file.filename == '':
            return jsonify({'error': 'No file selected'}), 400

        if file and allowed_file(file.filename):
            doc_id = str(uuid.uuid4())
            filename = secure_filename(file.filename)
            # Files are stored as "<doc_id>_<filename>" so they can be found by id later.
            filepath = os.path.join(app.config['UPLOAD_FOLDER'], f"{doc_id}_{filename}")
            file.save(filepath)

            conn = get_db_connection()
            conn.execute(
                'INSERT INTO documents (id, name, status, chunks) VALUES (?, ?, ?, ?)',
                (doc_id, filename, 'completed', 1)
            )
            conn.commit()
            conn.close()

            load_documents_from_db()

            return jsonify({
                'id': doc_id,
                'name': filename,
                'status': 'completed'
            })

        return jsonify({'error': 'Unsupported file format'}), 400

    except Exception as e:
        return jsonify({'error': f'Upload failed: {str(e)}'}), 500

@app.route('/api/ask', methods=['POST'])
def ask_question():
    try:
        # Tolerate a missing or non-JSON body instead of failing with a 500.
        data = request.get_json(silent=True) or {}
        question = data.get('question', '')

        if not question or not question.strip():
            return jsonify({'error': 'Please enter a question'}), 400

        if len(question) > 1000:
            return jsonify({'error': 'The question must not exceed 1000 characters'}), 400

        load_documents_from_db()

        if not documents:
            return jsonify({'error': 'Please upload a document first'}), 400

        context_parts = []
        sources = []

        for doc_id, doc_info in documents.items():
            if doc_info['status'] == 'completed':
                content = read_document_content(doc_id)
                if content:
                    # Only the first 3000 characters of each document are sent to the model.
                    context_parts.append(f"Document: {doc_info['name']}\nContent: {content[:3000]}")
                    sources.append({
                        'doc_id': doc_id,
                        'name': doc_info['name'],
                        'page': 1  # Placeholder: the page is always reported as 1
                    })

        if not context_parts:
            return jsonify({'error': 'No document content available'}), 400

        context = '\n\n'.join(context_parts)

        system_prompt = """You are an intelligent knowledge base Q&A assistant. Answer the user's question based on the provided document content.
Requirements:
1. Use only information from the documents to answer
2. If the documents contain no relevant information, say so clearly
3. Keep answers accurate, concise, and well organized
4. Answer in Chinese"""

        user_prompt = f"""Document content:
{context}

User question: {question}

Answer the user's question based on the document content above."""

        try:
            response = client.chat.completions.create(
                model="deepseek-chat",
                messages=[
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt}
                ],
                temperature=0.7,
                max_tokens=2000
            )

            answer = response.choices[0].message.content

            result = {
                'question': question,
                'answer': answer,
                'sources': sources
            }

            conn = get_db_connection()
            conn.execute(
                'INSERT INTO conversations (question, answer, sources) VALUES (?, ?, ?)',
                (question, result['answer'], json.dumps(result['sources']))
            )
            conn.commit()
            conn.close()

            return jsonify(result)

        except Exception as api_error:
            print(f"DeepSeek API Error: {api_error}")
            return jsonify({'error': f'The AI service is temporarily unavailable: {str(api_error)}'}), 500

    except Exception as e:
        return jsonify({'error': f'Error while answering the question: {str(e)}'}), 500

@app.route('/api/documents', methods=['GET'])
def get_documents():
    try:
        load_documents_from_db()
        return jsonify(list(documents.values()))
    except Exception as e:
        return jsonify({'error': f'Failed to get the document list: {str(e)}'}), 500

@app.route('/api/documents/<doc_id>', methods=['DELETE'])
def delete_document(doc_id):
    try:
        conn = get_db_connection()
        cursor = conn.execute('DELETE FROM documents WHERE id = ?', (doc_id,))

        if cursor.rowcount == 0:
            conn.close()
            return jsonify({'error': 'Document not found'}), 404

        conn.commit()
        conn.close()

        load_documents_from_db()
        return jsonify({'success': True})

    except Exception as e:
        return jsonify({'error': f'Failed to delete the document: {str(e)}'}), 500

@app.route('/api/conversations', methods=['GET'])
def get_conversations():
    try:
        conn = get_db_connection()
        conversations = conn.execute(
            'SELECT * FROM conversations ORDER BY created_at DESC LIMIT 50'
        ).fetchall()
        conn.close()

        result = []
        for conv in conversations:
            conv_dict = dict(conv)
            # Sources are stored as JSON text; decode them for the response.
            conv_dict['sources'] = json.loads(conv_dict['sources']) if conv_dict['sources'] else []
            result.append(conv_dict)

        return jsonify(result)

    except Exception as e:
        return jsonify({'error': f'Failed to get the conversation history: {str(e)}'}), 500

@app.route('/api/conversations', methods=['DELETE'])
def clear_conversations():
    try:
        conn = get_db_connection()
        conn.execute('DELETE FROM conversations')
        conn.commit()
        conn.close()

        return jsonify({'success': True})

    except Exception as e:
        return jsonify({'error': f'Failed to clear the conversation history: {str(e)}'}), 500

if __name__ == '__main__':
    init_db()
    load_documents_from_db()
    app.run(debug=True, port=5000)
27
check_db.py
Normal file
@ -0,0 +1,27 @@
import sqlite3
import os

conn = sqlite3.connect('knowledge_base.db')
cursor = conn.cursor()

print("=== Documents in the database ===")
cursor.execute('SELECT * FROM documents')
docs = cursor.fetchall()
if docs:
    for row in docs:
        print(f"ID: {row[0]}, Name: {row[1]}, Status: {row[2]}, Chunks: {row[3]}")
else:
    print("No documents in the database")

print("\n=== Files in the uploads folder ===")
if os.path.exists('uploads'):
    files = os.listdir('uploads')
    if files:
        for f in files:
            print(f"File: {f}")
    else:
        print("The uploads folder is empty")
else:
    print("The uploads folder does not exist")

conn.close()
34
cleanup_db.py
Normal file
@ -0,0 +1,34 @@
import sqlite3
import os

conn = sqlite3.connect('knowledge_base.db')
cursor = conn.cursor()

print("=== Removing invalid document records ===\n")

cursor.execute('SELECT * FROM documents')
docs = cursor.fetchall()

for doc in docs:
    doc_id, name, status, chunks, created_at = doc
    print(f"Checking document: ID={doc_id}, Name={name}, Status={status}")

    # A record is valid only if an uploaded file named "<doc_id>_..." exists.
    found = False
    if os.path.exists('uploads'):
        for file in os.listdir('uploads'):
            if file.startswith(doc_id):
                print(f"  ✓ Found file: {file}")
                found = True
                break

    if not found:
        print("  ✗ File not found, deleting record")
        cursor.execute('DELETE FROM documents WHERE id = ?', (doc_id,))
    else:
        print("  ✓ Keeping record")
    print()

conn.commit()
conn.close()

print("=== Cleanup complete ===")
BIN
knowledge_base.db
Normal file
Binary file not shown.
8
requirements.txt
Normal file
@ -0,0 +1,8 @@
flask==3.0.0
openai>=1.50.0
python-dotenv==1.0.0
langchain==0.1.0
langchain-openai==0.0.2
pypdf==3.17.4
python-docx==1.1.0
chromadb==0.4.22
272
static/script.js
Normal file
@ -0,0 +1,272 @@
const uploadArea = document.getElementById('upload-area');
const fileInput = document.getElementById('file-input');
const documentList = document.getElementById('document-list');
const chatMessages = document.getElementById('chat-messages');
const questionInput = document.getElementById('question-input');
const docCount = document.getElementById('doc-count');
const charCount = document.getElementById('char-count');
const toast = document.getElementById('toast');

function showToast(message, duration = 3000) {
    toast.textContent = message;
    toast.classList.add('show');

    setTimeout(() => {
        toast.classList.remove('show');
    }, duration);
}

uploadArea.addEventListener('click', () => fileInput.click());

uploadArea.addEventListener('dragover', (e) => {
    e.preventDefault();
    uploadArea.style.borderColor = '#1e3c72';
    uploadArea.style.background = '#e8f0fe';
});

uploadArea.addEventListener('dragleave', () => {
    uploadArea.style.borderColor = '#cbd5e0';
    uploadArea.style.background = 'transparent';
});

uploadArea.addEventListener('drop', (e) => {
    e.preventDefault();
    uploadArea.style.borderColor = '#cbd5e0';
    uploadArea.style.background = 'transparent';

    const files = e.dataTransfer.files;
    if (files.length > 0) {
        uploadFile(files[0]);
    }
});

fileInput.addEventListener('change', (e) => {
    if (e.target.files.length > 0) {
        uploadFile(e.target.files[0]);
    }
});

async function uploadFile(file) {
    const formData = new FormData();
    formData.append('file', file);

    try {
        showToast('⏳ Uploading document...');

        const response = await fetch('/api/upload', {
            method: 'POST',
            body: formData
        });

        const data = await response.json();

        if (data.error) {
            showToast(`❌ ${data.error}`);
            return;
        }

        loadDocuments();
        showToast(`✅ Document "${data.name}" uploaded, processing...`);
        addMessage('bot', `📄 Document "${data.name}" has been uploaded; the system is parsing its content...`);

        // Refresh the list again shortly afterwards to pick up the final status.
        setTimeout(() => {
            loadDocuments();
        }, 2000);

    } catch (error) {
        showToast('❌ Upload failed, please try again');
        console.error(error);
    }
}

async function loadDocuments() {
    try {
        const response = await fetch('/api/documents');
        const documents = await response.json();

        docCount.textContent = `${documents.length} documents`;

        if (documents.length === 0) {
            documentList.innerHTML = `
                <div class="empty-state">
                    <div class="empty-icon">📭</div>
                    <p>No documents yet</p>
                    <p class="empty-hint">Upload a document to start asking questions</p>
                </div>
            `;
            return;
        }

        documentList.innerHTML = documents.map(doc => `
            <div class="document-item">
                <div class="document-info">
                    <div class="document-name">📄 ${doc.name}</div>
                    <div class="document-status ${doc.status}">
                        ${doc.status === 'processing' ? '⏳ Processing...' : '✅ Completed'}
                    </div>
                </div>
                <button class="delete-btn" onclick="deleteDocument('${doc.id}')">🗑️ Delete</button>
            </div>
        `).join('');

    } catch (error) {
        console.error(error);
        showToast('❌ Failed to load the document list');
    }
}

async function deleteDocument(docId) {
    if (!confirm('Are you sure you want to delete this document?')) {
        return;
    }

    try {
        showToast('⏳ Deleting document...');

        const response = await fetch(`/api/documents/${docId}`, {
            method: 'DELETE'
        });
        const data = await response.json();

        // Surface server-side failures (for example a 404) instead of reporting success.
        if (data.error) {
            showToast(`❌ ${data.error}`);
            return;
        }

        loadDocuments();
        showToast('✅ Document deleted');
        addMessage('bot', '🗑️ The document has been removed from the knowledge base');

    } catch (error) {
        showToast('❌ Delete failed, please try again');
        console.error(error);
    }
}

async function askQuestion() {
    const question = questionInput.value.trim();

    if (!question) {
        showToast('⚠️ Please enter a question');
        questionInput.focus();
        return;
    }

    if (question.length < 3) {
        showToast('⚠️ The question is too short; enter at least 3 characters');
        questionInput.focus();
        return;
    }

    addMessage('user', question);
    questionInput.value = '';
    updateCharCount();

    try {
        showToast('🤖 Thinking...');

        const response = await fetch('/api/ask', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({ question })
        });

        const data = await response.json();

        if (data.error) {
            showToast(`❌ ${data.error}`);
            addMessage('bot', `❌ Sorry: ${data.error}`);
            return;
        }

        let answerHtml = `<p>${data.answer}</p>`;

        if (data.sources && data.sources.length > 0) {
            answerHtml += `
                <div class="sources">
                    <div class="sources-title">📚 Sources:</div>
                    ${data.sources.map(source => `
                        <div class="source-item">📄 ${source.name} (page ${source.page})</div>
                    `).join('')}
                </div>
            `;
        }

        showToast('✅ Answer complete');
        addMessage('bot', answerHtml);

    } catch (error) {
        showToast('❌ Something went wrong while answering, please try again');
        addMessage('bot', '❌ Sorry, something went wrong while answering, please try again');
        console.error(error);
    }
}

// Note: content is inserted as HTML without escaping.
function addMessage(type, content) {
    const messageDiv = document.createElement('div');
    messageDiv.className = `message ${type}`;
    messageDiv.innerHTML = `
        <div class="message-content">
            ${content}
        </div>
    `;

    chatMessages.appendChild(messageDiv);
    chatMessages.scrollTop = chatMessages.scrollHeight;
}

function updateCharCount() {
    const length = questionInput.value.length;
    charCount.textContent = `${length} characters`;

    // Red above 500 characters, amber above 300, grey otherwise.
    if (length > 500) {
        charCount.style.color = '#e53e3e';
    } else if (length > 300) {
        charCount.style.color = '#d69e2e';
    } else {
        charCount.style.color = '#718096';
    }
}

async function loadConversationHistory() {
    try {
        const response = await fetch('/api/conversations');
        const conversations = await response.json();

        if (conversations.length === 0) {
            addMessage('bot', '👋 Welcome to the Intelligent Knowledge Base Q&A System! After uploading a document, you can ask me anything about it.');
            return;
        }

        // The API returns the newest conversation first; reverse so the chat reads oldest to newest.
        conversations.slice().reverse().forEach(conv => {
            addMessage('user', conv.question);

            let answerHtml = `<p>${conv.answer}</p>`;

            if (conv.sources && conv.sources.length > 0) {
                answerHtml += `
                    <div class="sources">
                        <div class="sources-title">📚 Sources:</div>
                        ${conv.sources.map(source => `
                            <div class="source-item">📄 ${source.name} (page ${source.page})</div>
                        `).join('')}
                    </div>
                `;
            }

            addMessage('bot', answerHtml);
        });

    } catch (error) {
        console.error(error);
        addMessage('bot', '👋 Welcome to the Intelligent Knowledge Base Q&A System! After uploading a document, you can ask me anything about it.');
    }
}

questionInput.addEventListener('input', updateCharCount);

// Enter submits the question; Shift+Enter inserts a newline.
questionInput.addEventListener('keydown', (e) => {
    if (e.key === 'Enter' && !e.shiftKey) {
        e.preventDefault();
        askQuestion();
    }
});

loadDocuments();
loadConversationHistory();
490
static/style.css
Normal file
@ -0,0 +1,490 @@
* {
    margin: 0;
    padding: 0;
    box-sizing: border-box;
}

body {
    font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
    background: linear-gradient(135deg, #1e3c72 0%, #2a5298 100%);
    min-height: 100vh;
    padding: 20px;
}

.container {
    max-width: 1400px;
    margin: 0 auto;
    background: white;
    border-radius: 20px;
    padding: 30px;
    box-shadow: 0 20px 60px rgba(0, 0, 0, 0.3);
}

.header {
    text-align: center;
    margin-bottom: 30px;
}

h1 {
    color: #1e3c72;
    margin-bottom: 10px;
    font-size: 32px;
}

.subtitle {
    color: #666;
    font-size: 15px;
}

.main-content {
    display: grid;
    grid-template-columns: 380px 1fr;
    gap: 25px;
}

.left-panel,
.right-panel {
    display: flex;
    flex-direction: column;
    gap: 20px;
}

.panel-section {
    background: #f8f9fa;
    border-radius: 15px;
    padding: 25px;
    box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}

.panel-header {
    display: flex;
    justify-content: space-between;
    align-items: center;
    margin-bottom: 20px;
}

.panel-header h2 {
    font-size: 20px;
    color: #1e3c72;
    margin: 0;
}

.doc-count {
    background: #1e3c72;
    color: white;
    padding: 5px 12px;
    border-radius: 20px;
    font-size: 13px;
    font-weight: 600;
}

.upload-area {
    border: 3px dashed #cbd5e0;
    border-radius: 15px;
    padding: 50px 30px;
    text-align: center;
    cursor: pointer;
    transition: all 0.3s;
    background: white;
}

.upload-area:hover {
    border-color: #1e3c72;
    background: #e8f0fe;
    transform: translateY(-2px);
    box-shadow: 0 4px 12px rgba(30, 60, 114, 0.15);
}

.upload-icon {
    font-size: 56px;
    margin-bottom: 15px;
}

.upload-text {
    color: #333;
    font-size: 16px;
    font-weight: 600;
    margin-bottom: 8px;
}

.upload-hint {
    font-size: 13px;
    color: #666;
}

.document-list {
    max-height: 450px;
    overflow-y: auto;
}

.document-item {
    background: white;
    border-radius: 12px;
    padding: 18px;
    margin-bottom: 12px;
    display: flex;
    justify-content: space-between;
    align-items: center;
    box-shadow: 0 2px 6px rgba(0, 0, 0, 0.1);
    transition: transform 0.2s, box-shadow 0.2s;
}

.document-item:hover {
    transform: translateY(-2px);
    box-shadow: 0 4px 12px rgba(0, 0, 0, 0.15);
}

.document-info {
    flex: 1;
}

.document-name {
    font-weight: 600;
    color: #333;
    margin-bottom: 8px;
    font-size: 15px;
}

.document-status {
    font-size: 13px;
    color: #666;
    display: flex;
    align-items: center;
    gap: 5px;
}

.document-status.processing {
    color: #f59e0b;
}

.document-status.completed {
    color: #10b981;
}

.delete-btn {
    background: #ef4444;
    color: white;
    border: none;
    border-radius: 8px;
    padding: 8px 16px;
    cursor: pointer;
    font-size: 13px;
    font-weight: 600;
    transition: all 0.3s;
}

.delete-btn:hover {
    background: #dc2626;
    transform: scale(1.05);
}

.empty-state {
    text-align: center;
    color: #999;
    padding: 60px 30px;
}

.empty-icon {
    font-size: 64px;
    margin-bottom: 15px;
}

.empty-state p {
    margin-bottom: 5px;
}

.empty-hint {
    font-size: 13px;
    color: #bbb;
}

.chat-container {
    display: flex;
    flex-direction: column;
    height: 650px;
}

.chat-messages {
    flex: 1;
    overflow-y: auto;
    padding: 25px;
    background: white;
    border-radius: 15px;
    margin-bottom: 20px;
    box-shadow: inset 0 2px 4px rgba(0, 0, 0, 0.05);
}

.message {
    margin-bottom: 25px;
    animation: fadeIn 0.3s ease-in;
}

@keyframes fadeIn {
    from {
        opacity: 0;
        transform: translateY(10px);
    }
    to {
        opacity: 1;
        transform: translateY(0);
    }
}

.message.user {
    display: flex;
    justify-content: flex-end;
}

.message.bot {
    display: flex;
    justify-content: flex-start;
}

.message-content {
    max-width: 80%;
    padding: 18px;
    border-radius: 15px;
    line-height: 1.7;
    font-size: 15px;
}

.message.user .message-content {
    background: linear-gradient(135deg, #1e3c72 0%, #2a5298 100%);
    color: white;
    box-shadow: 0 4px 12px rgba(30, 60, 114, 0.3);
}

.message.bot .message-content {
    background: #f0f0f0;
    color: #333;
    box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1);
}

.sources {
    margin-top: 15px;
    padding-top: 15px;
    border-top: 2px solid #e0e0e0;
}

.sources-title {
    font-size: 13px;
    color: #666;
    margin-bottom: 8px;
    font-weight: 600;
}

.source-item {
    font-size: 13px;
    color: #1e3c72;
    margin-bottom: 5px;
    padding: 5px 10px;
    background: #e8f0fe;
    border-radius: 6px;
    display: inline-block;
}

.chat-input {
    display: flex;
    gap: 15px;
}

.input-wrapper {
    flex: 1;
    position: relative;
}

.chat-input textarea {
    width: 100%;
    padding: 18px;
    padding-right: 100px;
    border: 2px solid #e0e0e0;
    border-radius: 15px;
    resize: none;
    font-size: 15px;
    font-family: inherit;
    min-height: 80px;
    transition: all 0.3s;
}

.chat-input textarea:focus {
    outline: none;
    border-color: #1e3c72;
    box-shadow: 0 0 0 3px rgba(30, 60, 114, 0.1);
}

.char-count {
    position: absolute;
    bottom: 10px;
    right: 15px;
    font-size: 12px;
    color: #999;
    background: white;
    padding: 3px 8px;
    border-radius: 10px;
}

.send-btn {
    padding: 18px 35px;
    background: linear-gradient(135deg, #1e3c72 0%, #2a5298 100%);
    color: white;
    border: none;
    border-radius: 15px;
    cursor: pointer;
    font-weight: 600;
    font-size: 16px;
    transition: all 0.3s;
    display: flex;
    align-items: center;
    gap: 8px;
    box-shadow: 0 4px 12px rgba(30, 60, 114, 0.3);
}

.send-btn:hover {
    transform: translateY(-2px);
    box-shadow: 0 6px 16px rgba(30, 60, 114, 0.4);
}

.send-btn:active {
    transform: translateY(0);
}

.send-btn:disabled {
    background: #ccc;
    cursor: not-allowed;
    transform: none;
    box-shadow: none;
}

.btn-icon {
    font-size: 18px;
}

.clear-btn {
    background: #f59e0b;
    color: white;
    border: none;
    border-radius: 8px;
    padding: 8px 16px;
    cursor: pointer;
    font-size: 13px;
    font-weight: 600;
    transition: all 0.3s;
}

.clear-btn:hover {
    background: #d97706;
    transform: scale(1.05);
}
|
||||
|
||||
.toast {
|
||||
position: fixed;
|
||||
top: 20px;
|
||||
right: 20px;
|
||||
padding: 15px 25px;
|
||||
background: #333;
|
||||
color: white;
|
||||
border-radius: 10px;
|
||||
box-shadow: 0 4px 12px rgba(0, 0, 0, 0.3);
|
||||
transform: translateX(400px);
|
||||
transition: transform 0.3s ease-out;
|
||||
z-index: 1000;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.toast.show {
|
||||
transform: translateX(0);
|
||||
}
|
||||
|
||||
.toast.success {
|
||||
background: #10b981;
|
||||
}
|
||||
|
||||
.toast.error {
|
||||
background: #ef4444;
|
||||
}
|
||||
|
||||
.toast.info {
|
||||
background: #3b82f6;
|
||||
}
|
||||
@media (max-width: 1024px) {
    .main-content {
        grid-template-columns: 1fr;
    }

    .chat-container {
        height: 550px;
    }

    h1 {
        font-size: 28px;
    }
}

@media (max-width: 768px) {
    body {
        padding: 10px;
    }

    .container {
        padding: 20px;
        border-radius: 15px;
    }

    h1 {
        font-size: 24px;
    }

    .subtitle {
        font-size: 13px;
    }

    .panel-section {
        padding: 20px;
    }

    .chat-container {
        height: 500px;
    }

    .chat-input {
        flex-direction: column;
    }

    .send-btn {
        width: 100%;
        justify-content: center;
    }

    .upload-area {
        padding: 40px 20px;
    }

    .upload-icon {
        font-size: 48px;
    }
}

@media (max-width: 480px) {
    .container {
        padding: 15px;
    }

    h1 {
        font-size: 20px;
    }

    .panel-section {
        padding: 15px;
    }

    .message-content {
        max-width: 90%;
        font-size: 14px;
        padding: 15px;
    }

    .chat-messages {
        padding: 15px;
    }
}
80
templates/index.html
Normal file
@ -0,0 +1,80 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Smart Knowledge Base Q&amp;A</title>
    <link rel="stylesheet" href="/static/style.css">
</head>
<body>
    <div class="container">
        <div class="header">
            <h1>🧠 Smart Knowledge Base Q&amp;A</h1>
            <p class="subtitle">✨ Enterprise / Courses · Precise Q&amp;A over your own documents, less manual support</p>
        </div>

        <div class="main-content">
            <div class="left-panel">
                <div class="panel-section">
                    <h2>📁 Document Upload</h2>
                    <div class="upload-area" id="upload-area">
                        <input type="file" id="file-input" accept=".txt,.pdf,.docx" hidden>
                        <div class="upload-icon">📤</div>
                        <p class="upload-text">Click or drag a file here</p>
                        <p class="upload-hint">💡 Supports PDF, Word, and TXT (max 16 MB)</p>
                    </div>
                </div>

                <div class="panel-section">
                    <div class="panel-header">
                        <h2>📚 Knowledge Base</h2>
                        <span class="doc-count" id="doc-count">0 documents</span>
                    </div>
                    <div class="document-list" id="document-list">
                        <div class="empty-state">
                            <div class="empty-icon">📭</div>
                            <p>No documents yet</p>
                            <p class="empty-hint">Upload a document to start asking questions</p>
                        </div>
                    </div>
                </div>
            </div>

            <div class="right-panel">
                <div class="panel-section">
                    <div class="panel-header">
                        <h2>💬 Smart Q&amp;A</h2>
                        <button class="clear-btn" onclick="clearHistory()" title="Clear conversation history">🗑️ Clear</button>
                    </div>
                    <div class="chat-container">
                        <div class="chat-messages" id="chat-messages">
                            <div class="message bot">
                                <div class="message-content">
                                    <p>👋 Hello! I am your smart knowledge base assistant.</p>
                                    <p>📋 Please upload a document first; I can then answer your questions based on its content.</p>
                                    <p>🎯 Precise retrieval means less time spent on manual answers!</p>
                                </div>
                            </div>
                        </div>

                        <div class="chat-input">
                            <div class="input-wrapper">
                                <textarea id="question-input" placeholder="💭 Enter your question... (Enter to send, Shift+Enter for a new line)" maxlength="1000"></textarea>
                                <div class="char-count" id="char-count">0/1000</div>
                            </div>
                            <button class="send-btn" onclick="askQuestion()" id="send-btn">
                                <span class="btn-text">Send</span>
                                <span class="btn-icon">🚀</span>
                            </button>
                        </div>
                    </div>
                </div>
            </div>
        </div>
    </div>

    <div class="toast" id="toast"></div>

    <script src="/static/script.js"></script>
</body>
</html>
23
test_document_read.py
Normal file
@ -0,0 +1,23 @@
import sys
import os

# Put this script's directory on the import path so `app` resolves from any working directory.
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

from app import read_document_content

# ID prefix of a file committed under uploads/.
doc_id = '3cac70d6-cd02-493e-a08a-c7794927e7b4'

print("=== Document read test ===\n")
print(f"Document ID: {doc_id}\n")

content = read_document_content(doc_id)

if content:
    print("✓ Document content read successfully")
    print(f"Content length: {len(content)} characters")
    print("\nFirst 200 characters:")
    print(content[:200])
    print("\n✓ Document reading works!")
else:
    print("✗ Could not read the document content")
    print("Check that the file exists and that its format is supported")
17
test_output.txt
Normal file
@ -0,0 +1,17 @@
=== Document read test ===

File: 3cac70d6-cd02-493e-a08a-c7794927e7b4_docx
File: 77320887-f79f-48a0-aa20-bea2d9c1f5af_SIT.docx
Type: DOCX
Paragraphs: 30
Content length: 336 characters
First 100 characters: 《长河入海时——致SIT七十一周年》

【第一篇章:长河溯源】
你从1956年的晨光中启航,
一捧夯土,铸成应用之学的堤岸。
工程卷轴在黄浦江畔舒展,
墨迹里游动着钢铁与代码的基因链。

【第二篇章:...
39
test_read.py
Normal file
@ -0,0 +1,39 @@
import os
from docx import Document

uploads_folder = 'uploads'

print("=== Document read test ===\n")

# Print a short summary for each file in the uploads folder, chosen by file extension.
for file in os.listdir(uploads_folder):
    filepath = os.path.join(uploads_folder, file)
    print(f"File: {file}")

    if file.endswith('.txt'):
        with open(filepath, 'r', encoding='utf-8') as f:
            content = f.read()
        print("Type: TXT")
        print(f"Content length: {len(content)} characters")
        print(f"First 100 characters: {content[:100]}...\n")

    elif file.endswith('.pdf'):
        # Imported here so pypdf is only required when a PDF is actually present.
        import pypdf
        with open(filepath, 'rb') as f:
            reader = pypdf.PdfReader(f)
            text = ''
            for page in reader.pages:
                text += page.extract_text() + '\n'
            print("Type: PDF")
            print(f"Pages: {len(reader.pages)}")
            print(f"Content length: {len(text)} characters")
            print(f"First 100 characters: {text[:100]}...\n")

    elif file.endswith('.docx'):
        doc = Document(filepath)
        text = ''
        for paragraph in doc.paragraphs:
            text += paragraph.text + '\n'
        print("Type: DOCX")
        print(f"Paragraphs: {len(doc.paragraphs)}")
        print(f"Content length: {len(text)} characters")
        print(f"First 100 characters: {text[:100]}...\n")
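Note on the recorded output in `test_output.txt`, which appears to come from `test_read.py`: the script dispatches on the file extension, so the upload named `3cac70d6-cd02-493e-a08a-c7794927e7b4_docx` (no dot before `docx`) matches no branch and is listed without a type. Below is a minimal sketch of one possible workaround, assuming it is acceptable to identify files by content rather than by name. `sniff_document_type` is a hypothetical helper and is not part of this commit.

```python
import zipfile
from pathlib import Path


def sniff_document_type(path):
    """Guess 'pdf', 'docx', or 'txt' from the file contents; return None if unknown.

    Hypothetical helper for illustration only; it does not exist in app.py.
    """
    path = Path(path)

    # PDF files start with the signature "%PDF-".
    with path.open("rb") as f:
        header = f.read(5)
    if header.startswith(b"%PDF-"):
        return "pdf"

    # A .docx file is a ZIP archive that conventionally contains word/document.xml.
    if zipfile.is_zipfile(str(path)):
        with zipfile.ZipFile(str(path)) as archive:
            if "word/document.xml" in archive.namelist():
                return "docx"
        return None

    # Treat the file as plain text only if all of its bytes decode as UTF-8.
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "txt"
```

A caller could use the returned label to pick the TXT, PDF, or DOCX branch instead of relying on `file.endswith(...)`. The sketch reads the whole file to validate UTF-8, which is fine for small uploads but may be worth bounding for large ones.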
BIN
uploads/3cac70d6-cd02-493e-a08a-c7794927e7b4_docx
Normal file
Binary file not shown.
BIN
uploads/77320887-f79f-48a0-aa20-bea2d9c1f5af_SIT.docx
Normal file
Binary file not shown.