Compare commits


48 Commits

Author SHA1 Message Date
python  d27c18d0d2  Fix document generation error tests  2025-12-30 10:41:35 +08:00
python  7bb69af45e  Fix template directory hierarchy again  2025-12-26 09:32:15 +08:00
python  ac8bdba941  Finish updating and testing 121 templates.  2025-12-26 09:16:31 +08:00
python  eec66cbe05  Refactor template save/download logic: save files to the local template_finish folder and read templates locally in the document service; improve error handling to keep file paths valid and safe.  2025-12-18 16:45:31 +08:00
python  fb7fb985ad  Redo the interview approval form and update the test code  2025-12-15 16:23:37 +08:00
python  557c9ae351  Add a placeholder replacement test script  2025-12-15 14:45:42 +08:00
python  cb4a07f148  Improve the data field association tool so associations display, add, and edit correctly.  2025-12-14 20:12:05 +08:00
python  665612d2bf  Update .env with the new MinIO and AI service settings; remove unused docs and templates; add database backup/restore; manage tenant IDs and field associations via API; improve the front-end template field management UI.  2025-12-14 16:56:26 +08:00
python  1d0f7a5bfe  Add .env to .gitignore to keep the environment file out of version control  2025-12-12 22:46:59 +08:00
python  672dd2e516  Remove the tenant_id restriction from file-config and field queries; initialize the document and field services from environment variables for database and MinIO settings.  2025-12-12 22:41:01 +08:00
python  2563e7fc74  Sync database and MinIO template data to the Smart Supervision server  2025-12-12 15:20:43 +08:00
python  640f7834b6  Temporary checkpoint  2025-12-12 09:50:22 +08:00
python  70f5be89ce  Remove invalid and duplicate template files; fix template-field associations in the database; fix placeholder replacement bugs; improve error handling and debug output; update the markdown docs.  2025-12-11 21:19:23 +08:00
python  dab5d8ee59  Add direct XML placeholder replacement for Word documents as a fallback when table processing fails; improve table placeholder logic, error handling, and debug output.  2025-12-11 16:48:43 +08:00
python  4d9080855c  Access table rows and cells by index to avoid iteration index errors; harden access to rows, cells, and paragraphs in the document service  2025-12-11 16:34:50 +08:00
python  91fcd5461d  Fix placeholder replacement inside tables on Ubuntu  2025-12-11 16:30:42 +08:00
python  d01f367ffb  Fix placeholder replacement issue in the Ubuntu environment  2025-12-11 16:24:21 +08:00
python  7cbe4b29b7  Fix a document generation exception and a template input-field association exception.  2025-12-11 16:10:48 +08:00
python  d3afba9191  Adjust templates and the test page  2025-12-11 15:10:46 +08:00
python  d8d2817aed  Resolve conflicts in document generation; update file path and download link handling; load all available templates in the front-end file list and add a clear-list action.  2025-12-11 14:50:07 +08:00
python  28bf100ca4  Add a file-config query endpoint that fetches documents by taskId, with parameter validation and error handling; make generated document names and paths accurate.  2025-12-11 12:14:25 +08:00
python  6dd272d083  Add MinIO presigned download URL generation; return download links from document generation; clean up console output.  2025-12-11 12:10:13 +08:00
python  2a5952f3f5  Update input field defaults; use file IDs instead of template codes when adding file items; simplify file-config queries.  2025-12-11 09:16:14 +08:00
python  a320f55da0  Use file IDs instead of template codes in document generation, with parameter validation and error handling; look up file config by file ID.  2025-12-11 09:09:10 +08:00
python  ebc1154beb  Update download link generation and file paths for the latest document version; validate that template file paths are non-empty in the document service.  2025-12-10 19:02:20 +08:00
python  0563ff5346  Infer age from date of birth when the age field is missing; improve related logging; update docs for the new field configuration.  2025-12-10 14:16:59 +08:00
python  e38ba42669  Delete obsolete docs (AI error analysis report, template tree update notes, database backup/restore guide, etc.); support extracting template config from input_data and template_code in the document service.  2025-12-10 10:39:36 +08:00
python  11be119ffc  Support multiple AI providers (Huawei and SiliconFlow) via environment config; update the docs accordingly.  2025-12-10 10:05:45 +08:00
python  cd27bb4bd0  Improve AI content extraction and inference of missing fields; refine post-processing for accuracy and completeness.  2025-12-10 09:49:57 +08:00
python  6871c2e803  Infer missing gender and age fields from the raw input text in post-processing.  2025-12-10 09:37:37 +08:00
python  24fdfdea4c  Improve AI extraction prompts; add post-processing that infers missing gender, rank, and clue-source fields.  2025-12-09 15:32:25 +08:00
python  563d97184b  Tighten the JSON output requirements in the extraction prompts; fix field-name errors and underscore-prefix handling.  2025-12-09 15:19:32 +08:00
python  9bf1dd1210  Harden API response handling: improve JSON parsing and error handling so failed extractions return empty results instead of raising; log detailed debug info.  2025-12-09 15:01:31 +08:00
python  315301fc0b  Add an AI logger that records requests and responses, including success and error cases, for debugging and monitoring.  2025-12-09 14:51:33 +08:00
python  8bebc13efe  Tune the extraction logic  2025-12-09 14:41:26 +08:00
python  f1b5c52500  Fix json-repair installation and import  2025-12-09 14:18:32 +08:00
python  7c30e59328  Use the json-repair library to handle incomplete or malformed JSON; allow some fields to be empty.  2025-12-09 14:13:07 +08:00
python  eaa384cf7e  Update the prompt config; clarify extraction requirements; add post-processing to infer missing fields.  2025-12-09 12:56:28 +08:00
python  b8d89c28ec  Add debug output for AI results and field mappings; fix field-name cleaning to avoid empty-field-name errors.  2025-12-09 12:45:11 +08:00
python  e1d8d27dc4  Normalize dates to the Chinese format; handle common misspellings; improve field-name cleaning and normalization.  2025-12-09 12:34:01 +08:00
python  e31cd0b764  Add a max-tokens setting; add JSON cleaning/repair helpers; improve field-name normalization.  2025-12-09 12:14:34 +08:00
python  d8fa4c3d7e  Add an API timeout setting with a longer timeout in thinking mode; increase the retry delay from 1s to 2s.  2025-12-09 11:58:07 +08:00
python  c7a7780e71  Simplify the extraction prompts; improve handling of API response content; add debug output.  2025-12-09 11:46:52 +08:00
python  14ff607b52  Add a retry mechanism with detailed errors for Huawei LLM API calls; move the API call into its own method for readability.  2025-12-09 11:41:45 +08:00
python  8461725a13  Clarify extraction requirements and rules in the prompts; add new field extraction logic.  2025-12-09 11:39:18 +08:00
python  684cb0141a  Add JSON-object extraction from text so Huawei LLM responses yield only the JSON object.  2025-12-09 11:30:02 +08:00
python  f0cb4a7ba0  Update .env with the latest Huawei LLM parameters  2025-12-09 11:03:19 +08:00
python  7d50b160c2  Create a new branch for the Static Traffic on-prem deployment; default to the Huawei LLM.  2025-12-09 10:33:41 +08:00
965 changed files with 58737 additions and 34689 deletions

.cursorrules Normal file

@@ -0,0 +1,324 @@
# Smart Supervision AI Document Writing Service - AI Development Handbook
## Project Background
This project is an LLM-based intelligent document generation service. Its main features are:
- Extracting structured field data from unstructured text (using an AI large language model)
- Filling Word templates with field data to generate formal documents
- Managing document templates for multiple business types
- Document storage and download management (MinIO object storage)
Core business flow:
1. Receive unstructured input text
2. Extract structured fields with the AI model
3. Fill the Word template with the field data
4. Generate the document and upload it to MinIO
5. Return a download link for the document
## Tech Stack and Coding Standards
### Core Tech Stack
- **Python version**: Python 3.8+
- **Web framework**: Flask 3.0.0
- **Database**: MySQL (via PyMySQL 1.1.2)
- **Document processing**: python-docx 1.1.0
- **Object storage**: MinIO 7.2.3
- **AI services**:
  - Huawei LLM (DeepSeek-R1-Distill-Llama-70B)
  - SiliconFlow (DeepSeek-V3.2-Exp)
- **Other dependencies**: flask-cors, flasgger, python-dotenv, requests, openpyxl, json-repair
### Coding Conventions
- **Code style**: follow PEP 8
- **Naming**:
  - Classes use PascalCase: `AIService`, `DocumentService`
  - Functions and variables use snake_case: `get_connection`, `field_data`
  - Constants use UPPER_SNAKE_CASE: `AI_PROVIDER`, `DB_HOST`
- **Comments**:
  - Every class and method must have a triple-quoted docstring
  - Complex logic must carry inline comments
  - Comments are written in Chinese (project-wide convention)
- **Type hints**: type hints (typing module) are recommended for readability
- **Exception handling**: wrap fallible operations in try-except and provide meaningful error messages
### File Encoding
- All Python files use UTF-8
- No BOM marker at the start of files
## Project Layout
```
.
├── app.py                      # Flask main app; defines all API routes
├── requirements.txt            # Python dependency list
├── .env.example                # Example environment configuration
├── .cursorrules                # AI development handbook (this file)
├── services/                   # Service layer (business logic)
│   ├── __init__.py
│   ├── ai_service.py           # AI service: wraps LLM call logic
│   ├── document_service.py     # Document service: Word template filling, MinIO upload
│   ├── field_service.py        # Field service: database field-config queries
│   └── ai_logger.py            # AI logger
├── utils/                      # Utilities
│   ├── __init__.py
│   └── response.py             # Unified API response helpers
├── config/                     # Configuration files
│   ├── prompt_config.json      # AI prompt configuration
│   └── field_defaults.json     # Field default values
├── static/                     # Static files
│   ├── index.html              # Test page
│   └── template_field_manager.html  # Template field management page
├── template/                   # Word template directory
├── template_finish/            # Finished template files
├── test_scripts/               # Test scripts
└── 技术文档/                   # Technical documentation
```
## Architecture Constraints and Best Practices
### 1. Layered Architecture
The project uses a strict layered architecture:
- **Route layer** (`app.py`): only receives HTTP requests, validates parameters, calls the service layer, and returns responses
- **Service layer** (`services/`): holds all business logic; service classes may call each other
- **Utility layer** (`utils/`): generic helpers with no business logic
- **Data layer**: database access is encapsulated inside the service layer, not abstracted separately
**Key principles**:
- The route layer contains no business logic; it only validates parameters and formats responses
- Service methods must be testable and must not depend on Flask's request object
- Database connections must be closed after use (enforced with try-finally)
### 2. Service Layer Design
#### AI service (`services/ai_service.py`)
- Owns all LLM calls
- Supports multiple providers (Huawei, SiliconFlow), switched via environment variables
- Must handle JSON parse failures with multiple repair strategies
- Field-name normalization: map the various field-name spellings the AI returns to the correct field codes
- Date normalization: convert dates to the Chinese format (YYYY年MM月 or YYYY年MM月DD日)
- Post-processing: infer missing fields from known data (e.g. compute age from date of birth)
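The date normalization and age inference rules above can be sketched as two pure helpers. This is an illustration only: the function names `normalize_date` and `infer_age` are assumptions, not the actual `ai_service.py` API.

```python
import re
from datetime import date
from typing import Optional


def normalize_date(value: str) -> str:
    """Convert common date spellings to the Chinese format used by the service."""
    # 2021-05-12 / 2021/05/12 -> 2021年05月12日
    m = re.match(r'^(\d{4})[-/.](\d{1,2})[-/.](\d{1,2})$', value.strip())
    if m:
        y, mo, d = m.groups()
        return f"{y}年{int(mo):02d}月{int(d):02d}日"
    # 2021-05 -> 2021年05月
    m = re.match(r'^(\d{4})[-/.](\d{1,2})$', value.strip())
    if m:
        y, mo = m.groups()
        return f"{y}年{int(mo):02d}月"
    return value  # already normalized or unrecognized: leave unchanged


def infer_age(date_of_birth: str, today: date) -> Optional[int]:
    """Infer age from a Chinese-format date of birth, or None if unparseable."""
    m = re.match(r'^(\d{4})年(\d{1,2})月', date_of_birth)
    if not m:
        return None
    year, month = int(m.group(1)), int(m.group(2))
    age = today.year - year
    if today.month < month:  # birthday month not reached yet this year
        age -= 1
    return age
```

For example, `normalize_date("1975-08-03")` yields `1975年08月03日`, and the inferred age ticks down by one when the birth month has not yet been reached.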
#### Document service (`services/document_service.py`)
- Downloads, fills, and uploads Word templates
- Placeholder format: `{{field_code}}` (double braces)
- Must handle placeholders inside tables (access by index to avoid an iterator bug)
- Provides an XML fallback for unusual table structures
- Document naming: derive the base name from the original file name and append the subject's name
- MinIO path format: `/{tenant_id}/{timestamp}/{file_name}`
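Run-splitting is the usual pitfall with `{{field_code}}` placeholders: Word may split one placeholder across several runs. A minimal sketch of the merge-and-replace idea, operating on plain run texts (illustrative only; the real `document_service.py` works on python-docx run objects):

```python
from typing import Dict, List


def replace_placeholders(run_texts: List[str], values: Dict[str, str]) -> List[str]:
    """Replace {{field_code}} placeholders that may be split across runs.

    Joins the run texts, substitutes every known placeholder, then writes
    the result back into the first run and blanks the rest -- a common
    python-docx workaround that keeps the run count (and thus the
    surrounding formatting objects) stable.
    """
    joined = "".join(run_texts)
    for code, value in values.items():
        joined = joined.replace("{{" + code + "}}", value)
    if not run_texts:
        return run_texts
    return [joined] + [""] * (len(run_texts) - 1)
```

For example, `replace_placeholders(["{{tar", "get_name}}同志"], {"target_name": "张三"})` yields `["张三同志", ""]`. The trade-off of this approach is that per-run character formatting inside the merged span collapses to the first run's formatting.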
#### Field service (`services/field_service.py`)
- Queries field configuration from the database
- Builds AI prompts from the input data and the output-field configuration
- Loads prompt templates and field defaults from config files
### 3. Database Schema
#### Main tables
- `f_polic_field`: field configuration
  - `id`: primary key
  - `name`: field name
  - `filed_code`: field code (note: the column is spelled `filed_code`, not `field_code`)
  - `field_type`: field type (1 = input field, 2 = output field)
  - `state`: status (1 = enabled, 0 = disabled)
- `f_polic_file_config`: file (template) configuration
  - `id`: primary key (used as fileId)
  - `name`: file name
  - `file_path`: file path in MinIO
  - `input_data`: input-data configuration in JSON
  - `state`: status (1 = enabled, 0 = disabled)
- `f_polic_file_field`: file-field association
  - `file_id`: file config ID
  - `filed_id`: field ID
  - `state`: status (1 = enabled, 0 = disabled)
#### Database access rules
- Always use parameterized queries to prevent SQL injection
- Use `pymysql.cursors.DictCursor` to get dict-shaped rows
- Close connections after use (try-finally pattern)
- Transactions must roll back correctly on error
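The access rules above combine into a pattern like the following sketch. It uses the stdlib `sqlite3` module so it runs anywhere; the project itself uses PyMySQL, where the placeholder is `%s` instead of `?` and `pymysql.cursors.DictCursor` yields dict rows instead of tuples. The helper name `get_enabled_fields` is illustrative.

```python
import sqlite3


def get_enabled_fields(conn):
    """Parameterized query plus guaranteed cursor cleanup (try-finally)."""
    sql = "SELECT id, name, filed_code FROM f_polic_field WHERE state = ?"
    cursor = conn.cursor()
    try:
        cursor.execute(sql, (1,))  # parameterized: no SQL injection
        return cursor.fetchall()
    finally:
        cursor.close()  # always released, even if execute() raises


# Demo setup with an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE f_polic_field (id INTEGER, name TEXT, filed_code TEXT, state INTEGER)"
)
conn.executemany(
    "INSERT INTO f_polic_field VALUES (?, ?, ?, ?)",
    [(1, "被核查人姓名", "target_name", 1),
     (2, "旧字段", "old_field", 0)],  # disabled row: must be filtered out
)
rows = get_enabled_fields(conn)
conn.close()
```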
### 4. API Design
#### Unified response format
All APIs must use the helpers in `utils/response.py`:
**Success**:
```python
return success_response(data={'key': 'value'}, msg="ok")
```
**Error**:
```python
return error_response(code=400, error_msg="error message")
```
**Response shape**:
```json
{
    "code": 0,                  // 0 = success; any other value is an error code
    "data": {},                 // response payload
    "msg": "ok",                // response message
    "timestamp": "1234567890",  // timestamp (milliseconds)
    "errorMsg": "",             // error message (empty on success)
    "isSuccess": true           // success flag
}
```
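A minimal sketch of what the `utils/response.py` helpers could look like. This is an assumption for illustration: the real module presumably returns Flask responses, while plain dicts are returned here so the envelope shape is easy to see.

```python
import time
from typing import Any, Optional


def success_response(data: Optional[Any] = None, msg: str = "ok") -> dict:
    """Build the unified success envelope described above."""
    return {
        "code": 0,
        "data": data if data is not None else {},
        "msg": msg,
        "timestamp": str(int(time.time() * 1000)),  # milliseconds
        "errorMsg": "",
        "isSuccess": True,
    }


def error_response(code: int, error_msg: str) -> dict:
    """Build the unified error envelope; code 0 is reserved for success."""
    return {
        "code": code,
        "data": {},
        "msg": "",
        "timestamp": str(int(time.time() * 1000)),
        "errorMsg": error_msg,
        "isSuccess": False,
    }
```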
#### Routing rules
- Define routes with the `@app.route` decorator
- Provide Swagger docs via flasgger
- Route paths use lowercase letters and hyphens: `/api/ai/extract`
- Keep legacy paths for compatibility: both `/ai/extract` and `/api/ai/extract` are supported
#### Error codes
- `0`: success
- `400`: invalid request parameters
- `500`: internal server error
- `1001`: template not found
- `2001`: AI parsing timeout
- `2002`: AI parsing failed
- `3001`: file generation failed
- `3002`: file save failed
### 5. Environment Configuration
All configuration is managed through environment variables in a `.env` file (never commit it):
**Required**:
- `DB_HOST`: database host
- `DB_PORT`: database port
- `DB_USER`: database user
- `DB_PASSWORD`: database password
- `DB_NAME`: database name
- `MINIO_ENDPOINT`: MinIO endpoint
- `MINIO_ACCESS_KEY`: MinIO access key
- `MINIO_SECRET_KEY`: MinIO secret key
- `MINIO_BUCKET`: MinIO bucket name
- `MINIO_SECURE`: use HTTPS (true/false)
**AI service**:
- `AI_PROVIDER`: AI provider, 'huawei' or 'siliconflow'
- `HUAWEI_API_ENDPOINT`: Huawei API endpoint
- `HUAWEI_API_KEY`: Huawei API key
- `HUAWEI_MODEL`: Huawei model name
- `HUAWEI_API_TIMEOUT`: timeout (seconds)
- `HUAWEI_API_MAX_TOKENS`: max tokens
- `SILICONFLOW_URL`: SiliconFlow API endpoint
- `SILICONFLOW_API_KEY`: SiliconFlow API key
- `SILICONFLOW_MODEL`: SiliconFlow model name
- `SILICONFLOW_API_TIMEOUT`: timeout (seconds)
- `SILICONFLOW_API_MAX_TOKENS`: max tokens
**Optional**:
- `PORT`: service port (default 7500)
- `DEBUG`: debug mode, true/false (default false)
- `TENANT_ID`: tenant ID (used in MinIO paths)
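Loading this configuration typically looks like the sketch below, using the documented defaults. `load_ai_config` is an illustrative name, not an actual project function; the `load_dotenv` call is skipped gracefully if python-dotenv is absent.

```python
import os

try:
    from dotenv import load_dotenv  # python-dotenv
    load_dotenv()  # read .env into the process environment
except ImportError:
    pass  # fall back to the real environment


def load_ai_config() -> dict:
    """Collect provider settings with the defaults documented above."""
    provider = os.getenv("AI_PROVIDER", "siliconflow")
    cfg = {
        "provider": provider,
        "port": int(os.getenv("PORT", "7500")),
        "debug": os.getenv("DEBUG", "false").lower() == "true",
    }
    if provider == "huawei":
        cfg["timeout"] = int(os.getenv("HUAWEI_API_TIMEOUT", "180"))
        cfg["max_tokens"] = int(os.getenv("HUAWEI_API_MAX_TOKENS", "12000"))
    else:
        cfg["timeout"] = int(os.getenv("SILICONFLOW_API_TIMEOUT", "120"))
        cfg["max_tokens"] = int(os.getenv("SILICONFLOW_API_MAX_TOKENS", "2000"))
    return cfg
```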
### 6. Error Handling
- Every fallible operation must be wrapped in try-except
- Exception messages must be detailed and include context
- Database exceptions must roll back the transaction
- File-operation exceptions must clean up temporary files
- Errors exposed to clients must be friendly and must not leak implementation details
**Example**:
```python
try:
    # business logic
    result = some_operation()
    return success_response(data=result)
except ValueError as e:
    return error_response(400, f"Invalid parameter: {str(e)}")
except Exception as e:
    # log the full error details
    print(f"[ERROR] Operation failed: {str(e)}")
    import traceback
    print(traceback.format_exc())
    return error_response(500, "Internal server error")
```
### 7. Logging
- Logs are emitted with `print` (the project currently uses print, not the logging module)
- Log format: `[LEVEL] message`
- Levels:
  - `[DEBUG]`: debugging detail (execution traces)
  - `[INFO]`: normal flow
  - `[WARN]`: warnings (functionality unaffected but noteworthy)
  - `[ERROR]`: failures
- Key operations must be logged (AI calls, file generation, database operations)
### 8. Code Quality
- **Readability**: clear code with meaningful variable names
- **Maintainability**: avoid duplication; extract shared helpers
- **Testability**: service methods should be pure functions where possible, for easy unit testing
- **Robustness**: handle edge cases; never crash
- **Performance**: optimize database queries; avoid N+1 queries
### 9. Special Notes
#### Word template handling
- Use the `python-docx` library
- Placeholder format: `{{field_code}}` (double braces)
- Tables must be accessed by index to avoid iterator-induced IndexError
- Placeholders split across runs must be merged while preserving formatting
- Some complex table structures can fail to process; an XML fallback is provided
#### AI field extraction
- The JSON the AI returns may be malformed; multiple repair strategies are needed
- Field names can be irregular (e.g. `_source`, `target_organisation`) and must be normalized
- Dates must be converted to the unified Chinese format
- Missing fields must be inferred from known data (e.g. compute age from date of birth)
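The field-name normalization described above can be sketched as cleanup plus an alias lookup. The alias table below is illustrative, built from the examples in this section (`_source`, `target_organisation`); the real mapping lives in the AI service.

```python
import re
from typing import Optional

# Illustrative alias table; the project's real mapping is larger
FIELD_ALIASES = {
    "source": "clue_source",
    "target_organisation": "target_organization",  # British spelling variant
}


def normalize_field_name(raw: str, known_codes: set) -> Optional[str]:
    """Map an AI-returned field name onto a known field code.

    Strips underscore prefixes (e.g. '_source'), lowercases, collapses
    non-word characters, then checks the known codes and the alias table.
    Returns None when the name cannot be mapped.
    """
    name = raw.strip().lstrip("_")
    name = re.sub(r"[^\w]", "_", name).lower()
    if name in known_codes:
        return name
    return FIELD_ALIASES.get(name)
```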
#### MinIO file management
- File paths are relative (starting with `/`)
- Uploads generate a timestamped path automatically
- Presigned download URLs are supported (valid for 7 days)
- Temporary files must be cleaned up after use
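The timestamped object path can be built with a small helper. A sketch: the timestamp format shown is an assumption, and the presigned-URL step (in the MinIO Python SDK, `client.presigned_get_object(bucket, object_name, expires=timedelta(days=7))`) is left out so the snippet has no external dependency.

```python
from datetime import datetime


def build_object_path(tenant_id: str, file_name: str, now: datetime) -> str:
    """Build the /{tenant_id}/{timestamp}/{file_name} MinIO path.

    The YYYYMMDDHHMMSS timestamp format is an assumption for illustration.
    """
    timestamp = now.strftime("%Y%m%d%H%M%S")
    return f"/{tenant_id}/{timestamp}/{file_name}"
```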
## Development Workflow
1. **Adding a feature**:
   - Add the business logic method in the service layer
   - Add the API endpoint in the route layer
   - Update the Swagger doc comments
   - Test the feature
2. **Modifying a feature**:
   - Understand the existing logic first
   - Keep the API backward compatible (consider versioning for breaking changes)
   - Update the related docs
3. **Debugging**:
   - Read the log output (use the [DEBUG] level)
   - Check that the database data is correct
   - Verify the environment configuration
   - Check that the AI service is reachable
## Troubleshooting
1. **AI parsing fails**: check the input text quality, read the AI logs, try JSON repair
2. **Template filling fails**: check the placeholder format and whether the table structure is unusual
3. **MinIO upload fails**: check the network connection and MinIO credentials/permissions
4. **Database connection fails**: check the database configuration and network
## Code Generation Guidance
When the AI generates code, follow these principles:
1. **Consistency**: follow the project's existing style and architecture
2. **Error handling**: every fallible operation needs exception handling
3. **Logging**: log key operations
4. **Parameter validation**: API endpoints must validate input
5. **Resource cleanup**: files, database connections, and similar resources must be released
6. **Chinese comments**: write comments and docstrings in Chinese
7. **Type hints**: recommended for readability
## Changelog
- 2025-12-13: initial version

.env

@@ -1,14 +1,68 @@
-# SiliconFlow API configuration
-SILICONFLOW_API_KEY=sk-xnhmtotmlpjomrejbwdbczbpbyvanpxndvbxltodjwzbpmni
-SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
-
-# Huawei LLM API configuration (reserved)
-HUAWEI_API_ENDPOINT=
-HUAWEI_API_KEY=
-
-# Database configuration
+# ========== AI service provider ==========
+# Select the AI service provider
+# Allowed values: 'huawei' or 'siliconflow'
+# Default: 'siliconflow'
+AI_PROVIDER=siliconflow
+
+# ========== Huawei LLM API ==========
+# Used when AI_PROVIDER=huawei
+
+# API endpoint
+HUAWEI_API_ENDPOINT=http://10.100.31.26:3001/v1/chat/completions
+# API key
+HUAWEI_API_KEY=sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186
+# Model name
+HUAWEI_MODEL=DeepSeek-R1-Distill-Llama-70B
+# API timeout
+# Thinking mode increases response time significantly and needs a longer timeout
+# Default: 180 seconds (3 minutes)
+HUAWEI_API_TIMEOUT=180
+# Max tokens
+# Thinking mode may generate longer responses and needs more tokens
+# Default: 12000
+HUAWEI_API_MAX_TOKENS=12000
+
+# ========== SiliconFlow API ==========
+# Used when AI_PROVIDER=siliconflow
+
+# API endpoint (default; usually no need to change)
+SILICONFLOW_URL=https://api.siliconflow.cn/v1/chat/completions
+# API key (required)
+SILICONFLOW_API_KEY=sk-pgujibohpenkomkwlufexmqzyckglgogdiubfplgqxkfqgfu
+# Model name (default; usually no need to change)
+SILICONFLOW_MODEL=Qwen/Qwen2.5-72B-Instruct
+# API timeout
+# Default: 120 seconds
+SILICONFLOW_API_TIMEOUT=120
+# Max tokens
+# Default: 2000
+SILICONFLOW_API_MAX_TOKENS=2000
+
+# ========== Database configuration ==========
 DB_HOST=152.136.177.240
 DB_PORT=5012
 DB_USER=finyx
 DB_PASSWORD=6QsGK6MpePZDE57Z
 DB_NAME=finyx
+
+# ========== MinIO configuration (optional; required for document generation) ==========
+MINIO_ENDPOINT=minio.datacubeworld.com:9000
+MINIO_ACCESS_KEY=JOLXFXny3avFSzB0uRA5
+MINIO_SECRET_KEY=G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I
+MINIO_BUCKET=finyx
+MINIO_SECURE=true
+
+# ========== Service configuration ==========
+# Service port
+PORT=7500
+# Debug mode (true/false)
+DEBUG=False

.env.example

@@ -1,14 +1,68 @@
-# SiliconFlow API configuration
-SILICONFLOW_API_KEY=your_api_key_here
-SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
-
-# Huawei LLM API configuration (reserved)
-HUAWEI_API_ENDPOINT=
-HUAWEI_API_KEY=
-
-# Database configuration
+# ========== AI service provider ==========
+# Select the AI service provider
+# Allowed values: 'huawei' or 'siliconflow'
+# Default: 'siliconflow'
+AI_PROVIDER=siliconflow
+
+# ========== Huawei LLM API ==========
+# Used when AI_PROVIDER=huawei
+
+# API endpoint
+HUAWEI_API_ENDPOINT=http://10.100.31.26:3001/v1/chat/completions
+# API key
+HUAWEI_API_KEY=sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186
+# Model name
+HUAWEI_MODEL=DeepSeek-R1-Distill-Llama-70B
+# API timeout
+# Thinking mode increases response time significantly and needs a longer timeout
+# Default: 180 seconds (3 minutes)
+HUAWEI_API_TIMEOUT=180
+# Max tokens
+# Thinking mode may generate longer responses and needs more tokens
+# Default: 12000
+HUAWEI_API_MAX_TOKENS=12000
+
+# ========== SiliconFlow API ==========
+# Used when AI_PROVIDER=siliconflow
+
+# API endpoint (default; usually no need to change)
+SILICONFLOW_URL=https://api.siliconflow.cn/v1/chat/completions
+# API key (required)
+SILICONFLOW_API_KEY=sk-pgujibohpenkomkwlufexmqzyckglgogdiubfplgqxkfqgfu
+# Model name (default; usually no need to change)
+SILICONFLOW_MODEL=Qwen/Qwen2.5-72B-Instruct
+# API timeout
+# Default: 120 seconds
+SILICONFLOW_API_TIMEOUT=120
+# Max tokens
+# Default: 2000
+SILICONFLOW_API_MAX_TOKENS=2000
+
+# ========== Database configuration ==========
 DB_HOST=152.136.177.240
 DB_PORT=5012
 DB_USER=finyx
 DB_PASSWORD=6QsGK6MpePZDE57Z
 DB_NAME=finyx
+
+# ========== MinIO configuration (optional; required for document generation) ==========
+MINIO_ENDPOINT=minio.datacubeworld.com:9000
+MINIO_ACCESS_KEY=JOLXFXny3avFSzB0uRA5
+MINIO_SECRET_KEY=G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I
+MINIO_BUCKET=finyx
+MINIO_SECURE=true
+
+# ========== Service configuration ==========
+# Service port
+PORT=7500
+# Debug mode (true/false)
+DEBUG=False

.gitignore vendored Normal file

@@ -0,0 +1,56 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual Environment
venv/
env/
ENV/
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
# Logs
logs/
*.log
# Environment variables
.env
.env.local
# Database
*.db
*.sqlite
*.sqlite3
# OS
.DS_Store
Thumbs.db
# Project specific
parsed_fields.json
*.docx.bak
.env

README.md

@@ -6,8 +6,10 @@
 - ✅ AI parsing endpoint (`/api/ai/extract`) - extracts structured fields from input text
 - ✅ Field configuration management - reads field configs from the database
-- ✅ SiliconFlow LLM support (DeepSeek)
-- 🔄 Huawei LLM support reserved
+- ✅ Multiple AI service providers supported
+  - Huawei LLM (DeepSeek-R1-Distill-Llama-70B)
+  - SiliconFlow (DeepSeek-V3.2-Exp)
+- ✅ AI provider switchable via configuration
 - ✅ Web test UI - visual testing of the parsing feature

 ## Project Structure
@@ -70,14 +72,30 @@ copy .env.example .env
 cp .env.example .env
 ```
-Edit the `.env` file and fill in your API key:
+Edit the `.env` file and fill in your configuration:
 ```env
-# SiliconFlow API configuration (required)
-SILICONFLOW_API_KEY=your_api_key_here
-SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
+# ========== AI service provider ==========
+# Select the AI service provider
+# Allowed values: 'huawei' or 'siliconflow'
+# Default: 'siliconflow'
+AI_PROVIDER=siliconflow
-# Database configuration (preconfigured; adjust if needed)
+# ========== Huawei LLM API (used when AI_PROVIDER=huawei) ==========
+HUAWEI_API_ENDPOINT=http://10.100.31.26:3001/v1/chat/completions
+HUAWEI_API_KEY=sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186
+HUAWEI_MODEL=DeepSeek-R1-Distill-Llama-70B
+HUAWEI_API_TIMEOUT=180
+HUAWEI_API_MAX_TOKENS=12000
+
+# ========== SiliconFlow API (used when AI_PROVIDER=siliconflow) ==========
+SILICONFLOW_URL=https://api.siliconflow.cn/v1/chat/completions
+SILICONFLOW_API_KEY=your_siliconflow_api_key_here
+SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
+SILICONFLOW_API_TIMEOUT=120
+SILICONFLOW_API_MAX_TOKENS=2000
+
+# ========== Database configuration ==========
 DB_HOST=152.136.177.240
 DB_PORT=5012
 DB_USER=finyx
@@ -85,6 +103,41 @@ DB_PASSWORD=6QsGK6MpePZDE57Z
 DB_NAME=finyx
 ```
+
+**Choosing an AI provider**:
+- **Huawei LLM**: set `AI_PROVIDER=huawei` and configure `HUAWEI_API_KEY` and `HUAWEI_API_ENDPOINT`
+- **SiliconFlow**: set `AI_PROVIDER=siliconflow` (the default) and configure `SILICONFLOW_API_KEY`
+If the configured AI service is incomplete, the system automatically tries the other available service.
+
+**Huawei LLM API call example**:
+```bash
+curl --location --request POST 'http://10.100.31.26:3001/v1/chat/completions' \
+--header 'Authorization: Bearer sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+    "model": "DeepSeek-R1-Distill-Llama-70B",
+    "messages": [
+        {
+            "role": "user",
+            "content": "介绍一下山西的营商环境,推荐适合什么行业经营"
+        }
+    ],
+    "stream": false,
+    "presence_penalty": 1.03,
+    "frequency_penalty": 1.0,
+    "repetition_penalty": 1.0,
+    "temperature": 0.5,
+    "top_p": 0.95,
+    "top_k": 1,
+    "seed": 1,
+    "max_tokens": 8192,
+    "n": 2,
+    "best_of": 2
+}'
+```
+
 ### 3. Start the Service
 ```bash
@@ -243,7 +296,7 @@ print(response.json())
 ## FAQ
 **Q: "AI service not configured" error?**
-A: Check that `SILICONFLOW_API_KEY` is set correctly in the `.env` file.
+A: The system supports only the Huawei LLM (default configuration is built in); make sure `HUAWEI_API_KEY` and `HUAWEI_API_ENDPOINT` are set correctly in the `.env` file. If the Huawei LLM is unavailable, check the network connection and API configuration.
 **Q: Empty parsing result?**
 A: Check that the input text contains enough information; try a more detailed input.


@@ -0,0 +1,68 @@
# Template Field Export Guide
## Overview
The `export_template_fields_to_excel.py` script exports every template and its associated input and output fields to an Excel sheet, making it easy to review and consolidate template-field relationships.
## Usage
```bash
python export_template_fields_to_excel.py
```
## Output
The script writes an Excel file to the current directory, named `template_fields_export_YYYYMMDD_HHMMSS.xlsx`.
## Sheet Columns
The generated sheet contains the following columns:
1. **Template ID** - the template's unique database ID
2. **Template name** - the template's Chinese name
3. **Template parent** - the template's category path (inferred from the file path or template name; may be incomplete and need manual correction)
4. **Input fields** - the template's input fields, formatted as `字段名称(字段编码); 字段名称(字段编码)`
5. **Output fields** - the template's output fields, same format
6. **Input field count** - number of input fields
7. **Output field count** - number of output fields
## Notes
1. **Template parent**: the script tries to infer the category from the file path or template name, but the result may be incomplete or inaccurate. You can correct it manually in Excel.
2. **Field format**: input and output fields are semicolon-separated; each entry has the form `字段名称(字段编码)`.
3. **Data source**: all data comes from the database; only enabled templates and fields (state=1) are exported.
4. **Follow-up**: with this sheet you can:
   - Manually fill in or fix template parent categories
   - Add new template-field relationships
   - Build an import script to load the edited data back into the database
## Sample Data
```
Template ID: 1765432134276990
Template name: 1.请示报告卡(初核谈话)
Template parent: 2-初核模版/2.谈话审批
Input fields: 线索信息(clue_info); 被核查人员工作基本情况线索(target_basic_info_clue)
Output fields: 被核查人姓名(target_name); 被核查人员单位及职务(target_organization_and_position); ...
Input field count: 2
Output field count: 3
```
## Import Script Suggestions
A future import script could follow these steps:
1. Read the Excel file
2. Parse the template name, parent, input fields, and output fields
3. Look up or create the template record by name
4. Look up field IDs by field code
5. Create or update the template-field associations
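Step 2 above needs to split the `字段名称(字段编码)` cells back apart. A sketch of that parser (reading the sheet itself would use openpyxl's `load_workbook`, omitted here to keep the snippet dependency-free; the function name is illustrative):

```python
import re
from typing import List, Tuple


def parse_field_cell(cell: str) -> List[Tuple[str, str]]:
    """Parse '字段名称(字段编码); ...' into (name, code) pairs."""
    pairs = []
    for part in cell.split(";"):
        part = part.strip()
        if not part:
            continue
        # Greedy name match, then the code inside the trailing parentheses
        m = re.match(r"^(.*)\(([^)]+)\)$", part)
        if m:
            pairs.append((m.group(1), m.group(2)))
    return pairs
```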
## Related Files
- `export_template_fields_to_excel.py` - the export script
- `template_fields_export_*.xlsx` - generated Excel files

Binary file not shown.


@@ -0,0 +1,582 @@
"""
分析和修复字段编码问题
1. 分析f_polic_file_field表中的重复项
2. 检查f_polic_field表中的中文field_code
3. 根据占位符与字段对照表更新field_code
4. 合并重复项并更新关联表
"""
import os
import json
import pymysql
import re
from typing import Dict, List, Optional, Tuple
from datetime import datetime
from pathlib import Path
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 从占位符与字段对照表文档中提取的字段映射
# 格式: {字段名称: field_code}
FIELD_NAME_TO_CODE_MAPPING = {
# 基本信息字段
'被核查人姓名': 'target_name',
'被核查人员单位及职务': 'target_organization_and_position',
'被核查人员单位': 'target_organization',
'被核查人员职务': 'target_position',
'被核查人员性别': 'target_gender',
'被核查人员出生年月': 'target_date_of_birth',
'被核查人员出生年月日': 'target_date_of_birth_full',
'被核查人员年龄': 'target_age',
'被核查人员文化程度': 'target_education_level',
'被核查人员政治面貌': 'target_political_status',
'被核查人员职级': 'target_professional_rank',
'被核查人员身份证号': 'target_id_number',
'被核查人员身份证件及号码': 'target_id_number',
'被核查人员住址': 'target_address',
'被核查人员户籍住址': 'target_registered_address',
'被核查人员联系方式': 'target_contact',
'被核查人员籍贯': 'target_place_of_origin',
'被核查人员民族': 'target_ethnicity',
# 问题相关字段
'线索来源': 'clue_source',
'主要问题线索': 'target_issue_description',
'被核查人问题描述': 'target_problem_description',
# 审批相关字段
'初步核实审批表承办部门意见': 'department_opinion',
'初步核实审批表填表人': 'filler_name',
'批准时间': 'approval_time',
# 核查相关字段
'核查单位名称': 'investigation_unit_name',
'核查组代号': 'investigation_team_code',
'核查组组长姓名': 'investigation_team_leader_name',
'核查组成员姓名': 'investigation_team_member_names',
'核查地点': 'investigation_location',
# 风险评估相关字段
'被核查人员家庭情况': 'target_family_situation',
'被核查人员社会关系': 'target_social_relations',
'被核查人员健康状况': 'target_health_status',
'被核查人员性格特征': 'target_personality',
'被核查人员承受能力': 'target_tolerance',
'被核查人员涉及问题严重程度': 'target_issue_severity',
'被核查人员涉及其他问题的可能性': 'target_other_issues_possibility',
'被核查人员此前被审查情况': 'target_previous_investigation',
'被核查人员社会负面事件': 'target_negative_events',
'被核查人员其他情况': 'target_other_situation',
'风险等级': 'risk_level',
# 其他字段
'线索信息': 'clue_info',
'被核查人员工作基本情况线索': 'target_basic_info_clue',
'被核查人员工作基本情况': 'target_work_basic_info',
'请示报告卡请示时间': 'report_card_request_time',
'应到时间': 'appointment_time',
'应到地点': 'appointment_location',
'承办部门': 'handling_department',
'承办人': 'handler_name',
'谈话通知时间': 'notification_time',
'谈话通知地点': 'notification_location',
'被核查人员本人认识和态度': 'target_attitude',
'纪委名称': 'commission_name',
}
def is_chinese(text: str) -> bool:
"""判断字符串是否包含中文字符"""
if not text:
return False
return bool(re.search(r'[\u4e00-\u9fff]', text))
def analyze_f_polic_field(conn) -> Dict:
"""分析f_polic_field表找出中文field_code和重复项"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("1. 分析 f_polic_field 表")
print("="*80)
# 查询所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name, filed_code
""", (TENANT_ID,))
fields = cursor.fetchall()
print(f"\n总共找到 {len(fields)} 个字段记录")
# 找出中文field_code
chinese_field_codes = []
for field in fields:
if is_chinese(field['filed_code']):
chinese_field_codes.append(field)
print(f"\n发现 {len(chinese_field_codes)} 个中文field_code:")
for field in chinese_field_codes:
print(f" - ID: {field['id']}, 名称: {field['name']}, field_code: {field['filed_code']}")
# 找出重复的字段名称
name_to_fields = {}
for field in fields:
name = field['name']
if name not in name_to_fields:
name_to_fields[name] = []
name_to_fields[name].append(field)
duplicates = {name: fields_list for name, fields_list in name_to_fields.items()
if len(fields_list) > 1}
print(f"\n发现 {len(duplicates)} 个重复的字段名称:")
for name, fields_list in duplicates.items():
print(f"\n 字段名称: {name} (共 {len(fields_list)} 条记录)")
for field in fields_list:
print(f" - ID: {field['id']}, field_code: {field['filed_code']}, "
f"field_type: {field['field_type']}, state: {field['state']}")
# 找出重复的field_code
code_to_fields = {}
for field in fields:
code = field['filed_code']
if code not in code_to_fields:
code_to_fields[code] = []
code_to_fields[code].append(field)
duplicate_codes = {code: fields_list for code, fields_list in code_to_fields.items()
if len(fields_list) > 1}
print(f"\n发现 {len(duplicate_codes)} 个重复的field_code:")
for code, fields_list in duplicate_codes.items():
print(f"\n field_code: {code} (共 {len(fields_list)} 条记录)")
for field in fields_list:
print(f" - ID: {field['id']}, 名称: {field['name']}, "
f"field_type: {field['field_type']}, state: {field['state']}")
return {
'all_fields': fields,
'chinese_field_codes': chinese_field_codes,
'duplicate_names': duplicates,
'duplicate_codes': duplicate_codes
}
def analyze_f_polic_file_field(conn) -> Dict:
"""分析f_polic_file_field表找出重复项"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("2. 分析 f_polic_file_field 表")
print("="*80)
# 查询所有关联关系
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id,
fc.name as file_name, f.name as field_name, f.filed_code
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id
LEFT JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.tenant_id = %s
ORDER BY fff.file_id, fff.filed_id
""", (TENANT_ID,))
relations = cursor.fetchall()
print(f"\n总共找到 {len(relations)} 个关联关系")
# 找出重复的关联关系相同的file_id和filed_id
relation_key_to_records = {}
for rel in relations:
key = (rel['file_id'], rel['filed_id'])
if key not in relation_key_to_records:
relation_key_to_records[key] = []
relation_key_to_records[key].append(rel)
duplicates = {key: records for key, records in relation_key_to_records.items()
if len(records) > 1}
print(f"\n发现 {len(duplicates)} 个重复的关联关系:")
for (file_id, filed_id), records in duplicates.items():
print(f"\n 文件ID: {file_id}, 字段ID: {filed_id} (共 {len(records)} 条记录)")
for record in records:
print(f" - 关联ID: {record['id']}, 文件: {record['file_name']}, "
f"字段: {record['field_name']} ({record['filed_code']})")
# 统计使用中文field_code的关联关系
chinese_relations = [rel for rel in relations if rel['filed_code'] and is_chinese(rel['filed_code'])]
print(f"\n发现 {len(chinese_relations)} 个使用中文field_code的关联关系:")
for rel in chinese_relations[:10]: # 只显示前10个
print(f" - 文件: {rel['file_name']}, 字段: {rel['field_name']}, "
f"field_code: {rel['filed_code']}")
if len(chinese_relations) > 10:
print(f" ... 还有 {len(chinese_relations) - 10}")
return {
'all_relations': relations,
'duplicate_relations': duplicates,
'chinese_relations': chinese_relations
}
def get_correct_field_code(field_name: str, current_code: str) -> Optional[str]:
"""根据字段名称获取正确的field_code"""
# 首先从映射表中查找
if field_name in FIELD_NAME_TO_CODE_MAPPING:
return FIELD_NAME_TO_CODE_MAPPING[field_name]
# 如果当前code已经是英文且符合规范保留
if current_code and not is_chinese(current_code) and re.match(r'^[a-z_]+$', current_code):
return current_code
return None
def fix_f_polic_field(conn, dry_run: bool = True) -> Dict:
"""修复f_polic_field表中的问题"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("3. 修复 f_polic_field 表")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 获取所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
""", (TENANT_ID,))
fields = cursor.fetchall()
updates = []
merges = []
# 按字段名称分组,找出需要合并的重复项
name_to_fields = {}
for field in fields:
name = field['name']
if name not in name_to_fields:
name_to_fields[name] = []
name_to_fields[name].append(field)
# 处理每个字段名称
for field_name, field_list in name_to_fields.items():
if len(field_list) == 1:
# 单个字段检查是否需要更新field_code
field = field_list[0]
correct_code = get_correct_field_code(field['name'], field['filed_code'])
if correct_code and correct_code != field['filed_code']:
updates.append({
'id': field['id'],
'name': field['name'],
'old_code': field['filed_code'],
'new_code': correct_code,
'field_type': field['field_type']
})
else:
# 多个字段,需要合并
# 找出最佳的field_code
best_field = None
best_code = None
for field in field_list:
correct_code = get_correct_field_code(field['name'], field['filed_code'])
if correct_code:
if not best_field or (field['state'] == 1 and best_field['state'] == 0):
best_field = field
best_code = correct_code
# 如果没找到最佳字段,选择第一个启用的,或者第一个
if not best_field:
enabled_fields = [f for f in field_list if f['state'] == 1]
best_field = enabled_fields[0] if enabled_fields else field_list[0]
best_code = get_correct_field_code(best_field['name'], best_field['filed_code'])
if not best_code:
# 生成一个基于名称的code
best_code = field_name.lower().replace('被核查人员', 'target_').replace('被核查人', 'target_')
best_code = re.sub(r'[^\w]', '_', best_code)
best_code = re.sub(r'_+', '_', best_code).strip('_')
# 确定要保留的字段和要删除的字段
keep_field = best_field
remove_fields = [f for f in field_list if f['id'] != keep_field['id']]
# 更新保留字段的field_code
if best_code and best_code != keep_field['filed_code']:
updates.append({
'id': keep_field['id'],
'name': keep_field['name'],
'old_code': keep_field['filed_code'],
'new_code': best_code,
'field_type': keep_field['field_type']
})
merges.append({
'keep_field_id': keep_field['id'],
'keep_field_name': keep_field['name'],
'keep_field_code': best_code or keep_field['filed_code'],
'remove_field_ids': [f['id'] for f in remove_fields],
'remove_fields': remove_fields
})
# 显示更新计划
print(f"\n需要更新 {len(updates)} 个字段的field_code:")
for update in updates:
print(f" - ID: {update['id']}, 名称: {update['name']}, "
f"{update['old_code']} -> {update['new_code']}")
print(f"\n需要合并 {len(merges)} 组重复字段:")
for merge in merges:
print(f"\n 保留字段: ID={merge['keep_field_id']}, 名称={merge['keep_field_name']}, "
f"field_code={merge['keep_field_code']}")
print(f" 删除字段: {len(merge['remove_field_ids'])}")
for remove_field in merge['remove_fields']:
print(f" - ID: {remove_field['id']}, field_code: {remove_field['filed_code']}, "
f"field_type: {remove_field['field_type']}, state: {remove_field['state']}")
# 执行更新
if not dry_run:
print("\n开始执行更新...")
# 1. 先更新field_code
for update in updates:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = %s, updated_by = %s
WHERE id = %s
""", (update['new_code'], CURRENT_TIME, UPDATED_BY, update['id']))
print(f" ✓ 更新字段 ID {update['id']}: {update['old_code']} -> {update['new_code']}")
# 2. 合并重复字段:先更新关联表,再删除重复字段
for merge in merges:
keep_id = merge['keep_field_id']
for remove_id in merge['remove_field_ids']:
# 更新f_polic_file_field表中的关联
cursor.execute("""
UPDATE f_polic_file_field
SET filed_id = %s, updated_time = %s, updated_by = %s
WHERE filed_id = %s AND tenant_id = %s
""", (keep_id, CURRENT_TIME, UPDATED_BY, remove_id, TENANT_ID))
# 删除重复的字段记录
cursor.execute("""
DELETE FROM f_polic_field
WHERE id = %s AND tenant_id = %s
""", (remove_id, TENANT_ID))
print(f" ✓ 合并字段: 保留 ID {keep_id}, 删除 {len(merge['remove_field_ids'])} 个重复字段")
conn.commit()
print("\n✓ 更新完成")
else:
print("\n[DRY RUN] 以上操作不会实际执行")
return {
'updates': updates,
'merges': merges
}
def fix_f_polic_file_field(conn, dry_run: bool = True) -> Dict:
"""修复f_polic_file_field表中的重复项"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("4. 修复 f_polic_file_field 表")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 找出重复的关联关系
cursor.execute("""
SELECT file_id, filed_id, COUNT(*) as count, GROUP_CONCAT(id) as ids
FROM f_polic_file_field
WHERE tenant_id = %s
GROUP BY file_id, filed_id
HAVING count > 1
""", (TENANT_ID,))
duplicates = cursor.fetchall()
print(f"\n发现 {len(duplicates)} 组重复的关联关系")
deletes = []
for dup in duplicates:
file_id = dup['file_id']
filed_id = dup['filed_id']
ids = [int(id_str) for id_str in dup['ids'].split(',')]
# 保留第一个,删除其他的
keep_id = ids[0]
remove_ids = ids[1:]
deletes.append({
'file_id': file_id,
'filed_id': filed_id,
'keep_id': keep_id,
'remove_ids': remove_ids
})
print(f"\n 文件ID: {file_id}, 字段ID: {filed_id}")
print(f" 保留关联ID: {keep_id}")
print(f" 删除关联ID: {', '.join(map(str, remove_ids))}")
# 执行删除
if not dry_run:
print("\n开始删除重复的关联关系...")
for delete in deletes:
for remove_id in delete['remove_ids']:
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE id = %s AND tenant_id = %s
""", (remove_id, TENANT_ID))
print(f" ✓ 删除文件ID {delete['file_id']} 和字段ID {delete['filed_id']} 的重复关联")
conn.commit()
print("\n✓ 删除完成")
else:
print("\n[DRY RUN] 以上操作不会实际执行")
return {
'deletes': deletes
}
def check_other_tables(conn):
"""检查其他可能受影响的表"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("5. 检查其他关联表")
print("="*80)
# 检查f_polic_task表
print("\n检查 f_polic_task 表...")
try:
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_task
WHERE tenant_id = %s
""", (TENANT_ID,))
task_count = cursor.fetchone()['count']
print(f" 找到 {task_count} 个任务记录")
# 检查是否有引用字段ID的列
cursor.execute("DESCRIBE f_polic_task")
columns = [col['Field'] for col in cursor.fetchall()]
print(f" 表字段: {', '.join(columns)}")
# 检查是否有引用f_polic_field的字段
field_refs = [col for col in columns if 'field' in col.lower() or 'filed' in col.lower()]
if field_refs:
print(f" 可能引用字段的列: {', '.join(field_refs)}")
except Exception as e:
print(f" 检查f_polic_task表时出错: {e}")
# 检查f_polic_file表
print("\n检查 f_polic_file 表...")
try:
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file
WHERE tenant_id = %s
""", (TENANT_ID,))
file_count = cursor.fetchone()['count']
print(f" 找到 {file_count} 个文件记录")
cursor.execute("DESCRIBE f_polic_file")
columns = [col['Field'] for col in cursor.fetchall()]
print(f" 表字段: {', '.join(columns)}")
except Exception as e:
print(f" 检查f_polic_file表时出错: {e}")
def main():
"""主函数"""
print("="*80)
print("字段编码问题分析和修复工具")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
# 1. 分析f_polic_field表
field_analysis = analyze_f_polic_field(conn)
# 2. 分析f_polic_file_field表
relation_analysis = analyze_f_polic_file_field(conn)
# 3. 检查其他表
check_other_tables(conn)
# 4. 询问是否执行修复
print("\n" + "="*80)
print("分析完成")
print("="*80)
print("\n是否执行修复?")
print("1. 先执行DRY RUN(不实际修改数据库)")
print("2. 直接执行修复(会修改数据库)")
print("3. 仅查看分析结果,不执行修复")
choice = input("\n请选择 (1/2/3,默认1): ").strip() or "1"
if choice == "1":
# DRY RUN
print("\n" + "="*80)
print("执行DRY RUN...")
print("="*80)
fix_f_polic_field(conn, dry_run=True)
fix_f_polic_file_field(conn, dry_run=True)
print("\n" + "="*80)
confirm = input("DRY RUN完成。是否执行实际修复?(y/n,默认n): ").strip().lower()
if confirm == 'y':
print("\n执行实际修复...")
fix_f_polic_field(conn, dry_run=False)
fix_f_polic_file_field(conn, dry_run=False)
print("\n✓ 修复完成!")
elif choice == "2":
# 直接执行
print("\n" + "="*80)
print("执行修复...")
print("="*80)
fix_f_polic_field(conn, dry_run=False)
fix_f_polic_file_field(conn, dry_run=False)
print("\n✓ 修复完成!")
else:
print("\n仅查看分析结果,未执行修复")
conn.close()
except Exception as e:
print(f"\n✗ 执行失败: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
main()
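上面脚本采用"先 DRY RUN 预览、确认后再实际删除"的模式。下面是该模式的一个最小示意(用内存列表代替数据库表,`dedupe` 等命名均为示例,不是脚本中的真实函数):

```python
def dedupe(rows, dry_run=True):
    """按 (file_id, filed_id) 去重:保留每组第一条,其余列入删除计划。

    rows 是形如 {'id': ..., 'file_id': ..., 'filed_id': ...} 的字典列表,
    在这里代替数据库表;dry_run=True 时只返回计划,不修改数据。
    """
    seen, planned = set(), []
    for row in rows:
        key = (row["file_id"], row["filed_id"])
        if key in seen:
            planned.append(row["id"])  # 重复项:计划删除
        else:
            seen.add(key)              # 首次出现:保留
    if not dry_run:
        keep = set(planned)
        rows[:] = [r for r in rows if r["id"] not in keep]
    return planned
```

真实脚本中,`planned` 对应打印出的"删除关联ID"列表,只有用户确认后才执行 DELETE 并 commit。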

@@ -0,0 +1,555 @@
"""
分析和更新模板树状结构
根据 template_finish 目录结构规划树状层级并更新数据库中的 parent_id 字段
"""
import os
import json
import pymysql
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
# 从 init_all_templates.py 复制的文档类型映射
DOCUMENT_TYPE_MAPPING = {
"1.请示报告卡XXX": {
"template_code": "REPORT_CARD",
"name": "1.请示报告卡XXX",
"business_type": "INVESTIGATION"
},
"2.初步核实审批表XXX": {
"template_code": "PRELIMINARY_VERIFICATION_APPROVAL",
"name": "2.初步核实审批表XXX",
"business_type": "INVESTIGATION"
},
"3.附件初核方案(XXX)": {
"template_code": "INVESTIGATION_PLAN",
"name": "3.附件初核方案(XXX)",
"business_type": "INVESTIGATION"
},
"谈话通知书第一联": {
"template_code": "NOTIFICATION_LETTER_1",
"name": "谈话通知书第一联",
"business_type": "INVESTIGATION"
},
"谈话通知书第二联": {
"template_code": "NOTIFICATION_LETTER_2",
"name": "谈话通知书第二联",
"business_type": "INVESTIGATION"
},
"谈话通知书第三联": {
"template_code": "NOTIFICATION_LETTER_3",
"name": "谈话通知书第三联",
"business_type": "INVESTIGATION"
},
"1.请示报告卡(初核谈话)": {
"template_code": "REPORT_CARD_INTERVIEW",
"name": "1.请示报告卡(初核谈话)",
"business_type": "INVESTIGATION"
},
"2谈话审批表": {
"template_code": "INTERVIEW_APPROVAL_FORM",
"name": "2谈话审批表",
"business_type": "INVESTIGATION"
},
"3.谈话前安全风险评估表": {
"template_code": "PRE_INTERVIEW_RISK_ASSESSMENT",
"name": "3.谈话前安全风险评估表",
"business_type": "INVESTIGATION"
},
"4.谈话方案": {
"template_code": "INTERVIEW_PLAN",
"name": "4.谈话方案",
"business_type": "INVESTIGATION"
},
"5.谈话后安全风险评估表": {
"template_code": "POST_INTERVIEW_RISK_ASSESSMENT",
"name": "5.谈话后安全风险评估表",
"business_type": "INVESTIGATION"
},
"1.谈话笔录": {
"template_code": "INTERVIEW_RECORD",
"name": "1.谈话笔录",
"business_type": "INVESTIGATION"
},
"2.谈话询问对象情况摸底调查30问": {
"template_code": "INVESTIGATION_30_QUESTIONS",
"name": "2.谈话询问对象情况摸底调查30问",
"business_type": "INVESTIGATION"
},
"3.被谈话人权利义务告知书": {
"template_code": "RIGHTS_OBLIGATIONS_NOTICE",
"name": "3.被谈话人权利义务告知书",
"business_type": "INVESTIGATION"
},
"4.点对点交接单": {
"template_code": "HANDOVER_FORM",
"name": "4.点对点交接单",
"business_type": "INVESTIGATION"
},
"4.点对点交接单2": {
"template_code": "HANDOVER_FORM_2",
"name": "4.点对点交接单2",
"business_type": "INVESTIGATION"
},
"5.陪送交接单(新)": {
"template_code": "ESCORT_HANDOVER_FORM",
"name": "5.陪送交接单(新)",
"business_type": "INVESTIGATION"
},
"6.1保密承诺书(谈话对象使用-非中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_NON_PARTY",
"name": "6.1保密承诺书(谈话对象使用-非中共党员用)",
"business_type": "INVESTIGATION"
},
"6.2保密承诺书(谈话对象使用-中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_PARTY",
"name": "6.2保密承诺书(谈话对象使用-中共党员用)",
"business_type": "INVESTIGATION"
},
"7.办案人员-办案安全保密承诺书": {
"template_code": "INVESTIGATOR_CONFIDENTIALITY_COMMITMENT",
"name": "7.办案人员-办案安全保密承诺书",
"business_type": "INVESTIGATION"
},
"8-1请示报告卡初核报告结论 ": {
"template_code": "REPORT_CARD_CONCLUSION",
"name": "8-1请示报告卡初核报告结论 ",
"business_type": "INVESTIGATION"
},
"8.XXX初核情况报告": {
"template_code": "INVESTIGATION_REPORT",
"name": "8.XXX初核情况报告",
"business_type": "INVESTIGATION"
}
}
def generate_id():
"""生成ID(使用时间戳+随机数的方式,模拟雪花算法)"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
def identify_document_type(file_name: str) -> Optional[Dict]:
"""根据完整文件名识别文档类型"""
base_name = Path(file_name).stem
if base_name in DOCUMENT_TYPE_MAPPING:
return DOCUMENT_TYPE_MAPPING[base_name]
return None
def scan_directory_structure(base_dir: Path) -> Dict:
"""
扫描目录结构,构建树状层级
Returns:
包含目录和文件层级结构的字典
"""
structure = {
'directories': {}, # {path: {'name': ..., 'parent': ..., 'level': ...}}
'files': {} # {file_path: {'name': ..., 'parent': ..., 'template_code': ...}}
}
def process_path(path: Path, parent_path: Optional[str] = None, level: int = 0):
"""递归处理路径"""
if path.is_file() and path.suffix == '.docx':
# 处理文件
file_name = path.stem
doc_config = identify_document_type(file_name)
structure['files'][str(path)] = {
'name': file_name,
'parent': parent_path,
'level': level,
'template_code': doc_config['template_code'] if doc_config else None,
'full_path': str(path)
}
elif path.is_dir():
# 处理目录
dir_name = path.name
structure['directories'][str(path)] = {
'name': dir_name,
'parent': parent_path,
'level': level
}
# 递归处理子目录和文件
for child in sorted(path.iterdir()):
if child.name != '__pycache__':
process_path(child, str(path), level + 1)
# 从根目录开始扫描
if TEMPLATES_DIR.exists():
for item in sorted(TEMPLATES_DIR.iterdir()):
if item.name != '__pycache__':
process_path(item, None, 0)
return structure
def get_existing_data(conn) -> Dict:
"""
获取数据库中的现有数据
Returns:
{
'by_id': {id: {...}},
'by_name': {name: {...}},
'by_template_code': {template_code: {...}}
}
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, parent_id, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
result = {
'by_id': {},
'by_name': {},
'by_template_code': {}
}
for config in configs:
config_id = config['id']
config_name = config['name']
# 尝试从 input_data 中提取 template_code
template_code = config.get('template_code')
if not template_code and config.get('input_data'):
try:
input_data = json.loads(config['input_data']) if isinstance(config['input_data'], str) else config['input_data']
if isinstance(input_data, dict):
template_code = input_data.get('template_code')
except (json.JSONDecodeError, TypeError):
pass
result['by_id'][config_id] = config
result['by_name'][config_name] = config
if template_code:
# 如果已存在相同 template_code,保留第一个
if template_code not in result['by_template_code']:
result['by_template_code'][template_code] = config
cursor.close()
return result
def analyze_structure():
"""分析目录结构和数据库数据"""
print("="*80)
print("分析模板目录结构和数据库数据")
print("="*80)
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return None, None
# 扫描目录结构
print("扫描目录结构...")
dir_structure = scan_directory_structure(TEMPLATES_DIR)
print(f" 找到 {len(dir_structure['directories'])} 个目录")
print(f" 找到 {len(dir_structure['files'])} 个文件\n")
# 获取数据库现有数据
print("获取数据库现有数据...")
existing_data = get_existing_data(conn)
print(f" 数据库中有 {len(existing_data['by_id'])} 条记录\n")
# 分析缺少 parent_id 的记录
print("分析缺少 parent_id 的记录...")
missing_parent = []
for config in existing_data['by_id'].values():
if config.get('parent_id') is None:
missing_parent.append(config)
print(f"{len(missing_parent)} 条记录缺少 parent_id\n")
conn.close()
return dir_structure, existing_data
def plan_tree_structure(dir_structure: Dict, existing_data: Dict) -> List[Dict]:
"""
规划树状结构
Returns:
更新计划列表,每个元素包含:
{
'type': 'directory' | 'file',
'name': ...,
'parent_name': ...,
'level': ...,
'action': 'create' | 'update',
'config_id': ... (如果是更新),
'template_code': ... (如果是文件)
}
"""
plan = []
# 按层级排序目录
directories = sorted(dir_structure['directories'].items(),
key=lambda x: (x[1]['level'], x[0]))
# 按层级排序文件
files = sorted(dir_structure['files'].items(),
key=lambda x: (x[1]['level'], x[0]))
# 创建目录映射用于查找父目录ID
dir_id_map = {} # {dir_path: config_id}
# 处理目录(按层级顺序)
for dir_path, dir_info in directories:
dir_name = dir_info['name']
parent_path = dir_info['parent']
level = dir_info['level']
# 查找父目录ID
parent_id = None
if parent_path:
parent_id = dir_id_map.get(parent_path)
# 检查数据库中是否已存在
existing = existing_data['by_name'].get(dir_name)
if existing:
# 更新现有记录
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': existing['id'],
'current_parent_id': existing.get('parent_id')
})
dir_id_map[dir_path] = existing['id']
else:
# 创建新记录(目录节点)
new_id = generate_id()
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'current_parent_id': None
})
dir_id_map[dir_path] = new_id
# 处理文件
for file_path, file_info in files:
file_name = file_info['name']
parent_path = file_info['parent']
level = file_info['level']
template_code = file_info['template_code']
# 查找父目录ID
parent_id = dir_id_map.get(parent_path) if parent_path else None
# 查找数据库中的记录(通过 template_code 或 name)
existing = None
if template_code:
existing = existing_data['by_template_code'].get(template_code)
if not existing:
existing = existing_data['by_name'].get(file_name)
if existing:
# 更新现有记录
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': existing['id'],
'template_code': template_code,
'current_parent_id': existing.get('parent_id')
})
else:
# 创建新记录(文件节点)
new_id = generate_id()
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'template_code': template_code,
'current_parent_id': None
})
return plan
def generate_update_sql(plan: List[Dict], output_file: str = 'update_template_tree.sql'):
"""生成更新SQL脚本"""
sql_lines = [
"-- 模板树状结构更新脚本",
f"-- 生成时间: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
"-- 注意:执行前请备份数据库!",
"",
"USE finyx;",
"",
"START TRANSACTION;",
""
]
# 按层级分组
by_level = {}
for item in plan:
level = item['level']
if level not in by_level:
by_level[level] = []
by_level[level].append(item)
# 按层级顺序处理(从顶层到底层)
for level in sorted(by_level.keys()):
sql_lines.append(f"-- ===== 层级 {level} =====")
sql_lines.append("")
for item in by_level[level]:
if item['action'] == 'create':
# 创建新记录
if item['type'] == 'directory':
sql_lines.append(f"-- 创建目录节点: {item['name']}")
sql_lines.append(f"INSERT INTO f_polic_file_config")
sql_lines.append(f" (id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)")
parent_id_sql = f"{item['parent_id']}" if item['parent_id'] else "NULL"
name_sql = item['name'].replace("'", "''")  # 转义名称中的单引号,避免生成非法SQL
sql_lines.append(f"VALUES ({item['config_id']}, {TENANT_ID}, {parent_id_sql}, '{name_sql}', NULL, NULL, NOW(), {CREATED_BY}, NOW(), {UPDATED_BY}, 1);")
else:
# 文件节点(需要 template_code)
sql_lines.append(f"-- 创建文件节点: {item['name']}")
input_data = json.dumps({
'template_code': item.get('template_code', ''),
'business_type': 'INVESTIGATION'
}, ensure_ascii=False).replace("'", "''")
sql_lines.append(f"INSERT INTO f_polic_file_config")
sql_lines.append(f" (id, tenant_id, parent_id, name, input_data, file_path, template_code, created_time, created_by, updated_time, updated_by, state)")
parent_id_sql = f"{item['parent_id']}" if item['parent_id'] else "NULL"
name_sql = item['name'].replace("'", "''")  # 同样转义名称中的单引号
template_code_sql = f"'{item.get('template_code', '')}'" if item.get('template_code') else "NULL"
sql_lines.append(f"VALUES ({item['config_id']}, {TENANT_ID}, {parent_id_sql}, '{name_sql}', '{input_data}', NULL, {template_code_sql}, NOW(), {CREATED_BY}, NOW(), {UPDATED_BY}, 1);")
sql_lines.append("")
else:
# 更新现有记录
current_parent = item.get('current_parent_id')
new_parent = item.get('parent_id')
if current_parent != new_parent:
sql_lines.append(f"-- 更新: {item['name']} (parent_id: {current_parent} -> {new_parent})")
parent_id_sql = f"{new_parent}" if new_parent else "NULL"
sql_lines.append(f"UPDATE f_polic_file_config")
sql_lines.append(f"SET parent_id = {parent_id_sql}, updated_time = NOW(), updated_by = {UPDATED_BY}")
sql_lines.append(f"WHERE id = {item['config_id']} AND tenant_id = {TENANT_ID};")
sql_lines.append("")
sql_lines.append("COMMIT;")
sql_lines.append("")
sql_lines.append("-- 更新完成")
# 写入文件
with open(output_file, 'w', encoding='utf-8') as f:
f.write('\n'.join(sql_lines))
print(f"✓ SQL脚本已生成: {output_file}")
return output_file
def print_analysis_report(dir_structure: Dict, existing_data: Dict, plan: List[Dict]):
"""打印分析报告"""
print("\n" + "="*80)
print("分析报告")
print("="*80)
print(f"\n目录结构:")
print(f" - 目录数量: {len(dir_structure['directories'])}")
print(f" - 文件数量: {len(dir_structure['files'])}")
print(f"\n数据库现状:")
print(f" - 总记录数: {len(existing_data['by_id'])}")
missing_parent = sum(1 for c in existing_data['by_id'].values() if c.get('parent_id') is None)
print(f" - 缺少 parent_id 的记录: {missing_parent}")
print(f"\n更新计划:")
create_count = sum(1 for p in plan if p['action'] == 'create')
update_count = sum(1 for p in plan if p['action'] == 'update')
print(f" - 需要创建: {create_count}")
print(f" - 需要更新: {update_count}")
print(f"\n层级分布:")
by_level = {}
for item in plan:
level = item['level']
by_level[level] = by_level.get(level, 0) + 1
for level in sorted(by_level.keys()):
print(f" - 层级 {level}: {by_level[level]} 个节点")
print("\n" + "="*80)
def main():
"""主函数"""
# 分析
dir_structure, existing_data = analyze_structure()
if not dir_structure or not existing_data:
return
# 规划树状结构
print("规划树状结构...")
plan = plan_tree_structure(dir_structure, existing_data)
print(f" 生成 {len(plan)} 个更新计划\n")
# 打印报告
print_analysis_report(dir_structure, existing_data, plan)
# 生成SQL脚本
print("\n生成SQL更新脚本...")
sql_file = generate_update_sql(plan)
print("\n" + "="*80)
print("分析完成!")
print("="*80)
print(f"\n请检查生成的SQL脚本: {sql_file}")
print("确认无误后,可以执行该脚本更新数据库。")
print("\n注意:执行前请备份数据库!")
if __name__ == '__main__':
main()
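`plan_tree_structure` 的正确性依赖一个顺序约定:目录按层级从浅到深处理,父目录总是先于子目录拿到 ID,这样 `dir_id_map` 查找父 ID 才不会落空。下面用路径深度排序演示这一思路(示意代码,`build_parent_map` 是示例命名,用自增整数代替雪花 ID):

```python
from pathlib import PurePosixPath

def build_parent_map(paths):
    """按路径深度排序后依次分配 ID,保证父目录先于子目录拿到 ID。

    返回 {path: {'id': ..., 'parent_id': ...}};根节点的 parent_id 为 None。
    """
    ids, next_id = {}, 1
    for p in sorted(paths, key=lambda s: len(PurePosixPath(s).parts)):
        parent = str(PurePosixPath(p).parent)
        # 父路径若已分配过 ID 就引用它,否则视为根节点
        ids[p] = {"id": next_id, "parent_id": ids.get(parent, {}).get("id")}
        next_id += 1
    return ids
```

如果不先按深度排序,处理 "a/b" 时 "a" 可能尚未入表,parent_id 就会被错误地置空。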

analyze_duplicate_fields.py Normal file
@@ -0,0 +1,148 @@
"""分析 f_polic_field 表中的重复字段"""
import pymysql
import os
from dotenv import load_dotenv
from collections import defaultdict
load_dotenv()
TENANT_ID = 615873064429507639
conn = pymysql.connect(
host=os.getenv('DB_HOST', '152.136.177.240'),
port=int(os.getenv('DB_PORT', 5012)),
user=os.getenv('DB_USER', 'finyx'),
password=os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
database=os.getenv('DB_NAME', 'finyx'),
charset='utf8mb4'
)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("=" * 80)
print("1. 分析按 name 字段的重复情况")
print("=" * 80)
# 查询所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name, id
""", (TENANT_ID,))
all_fields = cursor.fetchall()
# 按 name 分组
name_groups = defaultdict(list)
for field in all_fields:
name_groups[field['name']].append(field)
# 找出重复的 name
duplicate_names = {name: fields for name, fields in name_groups.items() if len(fields) > 1}
print(f"\n发现 {len(duplicate_names)} 个重复的字段名称:\n")
for name, fields in sorted(duplicate_names.items()):
print(f"字段名称: {name}")
for field in fields:
print(f" ID: {field['id']}, filed_code: {field['filed_code']}, field_type: {field['field_type']}, state: {field['state']}")
print()
print("\n" + "=" * 80)
print("2. 分析按 filed_code 字段的重复情况")
print("=" * 80)
# 按 filed_code 分组
code_groups = defaultdict(list)
for field in all_fields:
code_groups[field['filed_code']].append(field)
# 找出重复的 filed_code
duplicate_codes = {code: fields for code, fields in code_groups.items() if len(fields) > 1}
print(f"\n发现 {len(duplicate_codes)} 个重复的字段编码:\n")
for code, fields in sorted(duplicate_codes.items()):
print(f"字段编码: {code}")
for field in fields:
print(f" ID: {field['id']}, name: {field['name']}, field_type: {field['field_type']}, state: {field['state']}")
print()
print("\n" + "=" * 80)
print("3. 分析重复字段的关联关系(f_polic_file_field)")
print("=" * 80)
# 获取所有重复字段的ID
all_duplicate_field_ids = set()
for fields in duplicate_names.values():
for field in fields:
all_duplicate_field_ids.add(field['id'])
for fields in duplicate_codes.values():
for field in fields:
all_duplicate_field_ids.add(field['id'])
if all_duplicate_field_ids:
placeholders = ','.join(['%s'] * len(all_duplicate_field_ids))
cursor.execute(f"""
SELECT ff.file_id, ff.filed_id, f.name, f.filed_code, fc.name as file_name, fc.state as file_state
FROM f_polic_file_field ff
INNER JOIN f_polic_field f ON ff.filed_id = f.id
INNER JOIN f_polic_file_config fc ON ff.file_id = fc.id
WHERE ff.filed_id IN ({placeholders})
AND f.tenant_id = %s
ORDER BY f.filed_code, ff.file_id
""", list(all_duplicate_field_ids) + [TENANT_ID])
associations = cursor.fetchall()
# 按 filed_code 分组关联关系
code_associations = defaultdict(list)
for assoc in associations:
code_associations[assoc['filed_code']].append(assoc)
print(f"\n重复字段的关联关系:\n")
for code, assocs in sorted(code_associations.items()):
print(f"字段编码: {code} ({assocs[0]['name']})")
for assoc in assocs:
print(f" 字段ID: {assoc['filed_id']}, 文件ID: {assoc['file_id']}, 文件名: {assoc['file_name']}, 文件状态: {assoc['file_state']}")
print()
else:
print("\n没有发现重复字段的关联关系")
print("\n" + "=" * 80)
print("4. 统计每个 filed_code 关联的模板数量")
print("=" * 80)
cursor.execute("""
SELECT f.filed_code, f.name, COUNT(DISTINCT ff.file_id) as template_count,
GROUP_CONCAT(DISTINCT ff.filed_id ORDER BY ff.filed_id) as field_ids,
GROUP_CONCAT(DISTINCT fc.name ORDER BY fc.name SEPARATOR ' | ') as template_names
FROM f_polic_field f
LEFT JOIN f_polic_file_field ff ON f.id = ff.filed_id
LEFT JOIN f_polic_file_config fc ON ff.file_id = fc.id AND fc.state = 1
WHERE f.tenant_id = %s
GROUP BY f.filed_code, f.name
HAVING COUNT(DISTINCT ff.filed_id) > 0 OR f.filed_code IN (
SELECT filed_code FROM (
SELECT filed_code, COUNT(*) as cnt
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY filed_code
HAVING cnt > 1
) AS dup
)
ORDER BY template_count DESC, f.filed_code
""", (TENANT_ID, TENANT_ID))
stats = cursor.fetchall()
print(f"\n字段关联统计(包含重复字段):\n")
for stat in stats:
print(f"字段编码: {stat['filed_code']}")
print(f" 字段名称: {stat['name']}")
print(f" 关联模板数: {stat['template_count']}")
print(f" 字段ID列表: {stat['field_ids']}")
if stat['template_names']:
print(f" 关联模板: {stat['template_names']}")
print()
cursor.close()
conn.close()
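脚本中"按 name 分组"和"按 filed_code 分组"找重复的两段逻辑完全同构,可以抽成一个通用辅助函数(示意写法,`find_duplicates` 为示例命名):

```python
from collections import defaultdict

def find_duplicates(fields, key):
    """按指定 key 分组,只返回出现多于一次的分组。

    fields 是字典列表(对应查询结果行),key 如 'name' 或 'filed_code'。
    """
    groups = defaultdict(list)
    for f in fields:
        groups[f[key]].append(f)
    # 只保留真正重复的组
    return {k: v for k, v in groups.items() if len(v) > 1}
```

这样 `duplicate_names = find_duplicates(all_fields, 'name')` 与 `duplicate_codes = find_duplicates(all_fields, 'filed_code')` 就能复用同一实现。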

app.py
File diff suppressed because it is too large
backup_database.py Normal file
@@ -0,0 +1,314 @@
"""
数据库备份脚本
支持使用mysqldump命令或Python直接导出SQL文件
"""
import os
import sys
import subprocess
import pymysql
from datetime import datetime
from pathlib import Path
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
class DatabaseBackup:
"""数据库备份类"""
def __init__(self):
"""初始化数据库配置"""
self.db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# 备份文件存储目录
self.backup_dir = Path('backups')
self.backup_dir.mkdir(exist_ok=True)
def backup_with_mysqldump(self, output_file=None, compress=False):
"""
使用mysqldump命令备份数据库(推荐方式)
Args:
output_file: 输出文件路径,如果为None则自动生成
compress: 是否压缩备份文件
Returns:
备份文件路径
"""
# 生成备份文件名
if output_file is None:
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
output_file = self.backup_dir / f"backup_{self.db_config['database']}_{timestamp}.sql"
output_file = Path(output_file)
# 构建mysqldump命令
cmd = [
'mysqldump',
f"--host={self.db_config['host']}",
f"--port={self.db_config['port']}",
f"--user={self.db_config['user']}",
f"--password={self.db_config['password']}",
'--single-transaction', # 保证数据一致性
'--routines', # 包含存储过程和函数
'--triggers', # 包含触发器
'--events', # 包含事件
'--add-drop-table', # 添加DROP TABLE语句
'--default-character-set=utf8mb4', # 设置字符集
self.db_config['database']
]
try:
print(f"开始备份数据库 {self.db_config['database']}...")
print(f"备份文件: {output_file}")
# 执行备份命令
with open(output_file, 'w', encoding='utf-8') as f:
result = subprocess.run(
cmd,
stdout=f,
stderr=subprocess.PIPE,
text=True
)
if result.returncode != 0:
error_msg = result.stderr if result.stderr else '未知错误'  # text=True 时 stderr 已是 str,无需 decode
raise Exception(f"mysqldump执行失败: {error_msg}")
# 检查文件大小
file_size = output_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
# 如果需要压缩
if compress:
compressed_file = self._compress_file(output_file)
print(f"压缩完成: {compressed_file}")
return str(compressed_file)
return str(output_file)
except FileNotFoundError:
print("错误: 未找到mysqldump命令请确保MySQL客户端已安装并在PATH中")
print("尝试使用Python方式备份...")
return self.backup_with_python(output_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
def backup_with_python(self, output_file=None):
"""
使用Python直接连接数据库备份(备用方式)
Args:
output_file: 输出文件路径,如果为None则自动生成
Returns:
备份文件路径
"""
if output_file is None:
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
output_file = self.backup_dir / f"backup_{self.db_config['database']}_{timestamp}.sql"
output_file = Path(output_file)
try:
print(f"开始使用Python方式备份数据库 {self.db_config['database']}...")
print(f"备份文件: {output_file}")
# 连接数据库
connection = pymysql.connect(**self.db_config)
cursor = connection.cursor()
with open(output_file, 'w', encoding='utf-8') as f:
# 写入文件头
f.write(f"-- MySQL数据库备份\n")
f.write(f"-- 数据库: {self.db_config['database']}\n")
f.write(f"-- 备份时间: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
f.write(f"-- 主机: {self.db_config['host']}:{self.db_config['port']}\n")
f.write("--\n\n")
f.write(f"SET NAMES utf8mb4;\n")
f.write(f"SET FOREIGN_KEY_CHECKS=0;\n\n")
# 获取所有表
cursor.execute("SHOW TABLES")
tables = [table[0] for table in cursor.fetchall()]
print(f"找到 {len(tables)} 个表")
# 备份每个表
for table in tables:
print(f"备份表: {table}")
# 获取表结构
cursor.execute(f"SHOW CREATE TABLE `{table}`")
create_table_sql = cursor.fetchone()[1]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表结构: {table}\n")
f.write(f"-- ----------------------------\n")
f.write(f"DROP TABLE IF EXISTS `{table}`;\n")
f.write(f"{create_table_sql};\n\n")
# 获取表数据
cursor.execute(f"SELECT * FROM `{table}`")
rows = cursor.fetchall()
if rows:
# 获取列名
cursor.execute(f"DESCRIBE `{table}`")
columns = [col[0] for col in cursor.fetchall()]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表数据: {table}\n")
f.write(f"-- ----------------------------\n")
# 分批写入数据
batch_size = 1000
for i in range(0, len(rows), batch_size):
batch = rows[i:i+batch_size]
values_list = []
for row in batch:
values = []
for value in row:
if value is None:
values.append('NULL')
elif isinstance(value, (int, float)):
values.append(str(value))
else:
# 转义特殊字符
escaped_value = str(value).replace('\\', '\\\\').replace("'", "\\'")
values.append(f"'{escaped_value}'")
values_list.append(f"({', '.join(values)})")
columns_str = ', '.join([f"`{col}`" for col in columns])
values_str = ',\n'.join(values_list)
f.write(f"INSERT INTO `{table}` ({columns_str}) VALUES\n")
f.write(f"{values_str};\n\n")
print(f" 完成: {len(rows)} 条记录")
f.write("SET FOREIGN_KEY_CHECKS=1;\n")
cursor.close()
connection.close()
# 检查文件大小
file_size = output_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
return str(output_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
def _compress_file(self, file_path):
"""
压缩备份文件
Args:
file_path: 文件路径
Returns:
压缩后的文件路径
"""
import gzip
file_path = Path(file_path)
compressed_path = file_path.with_suffix('.sql.gz')
with open(file_path, 'rb') as f_in:
with gzip.open(compressed_path, 'wb') as f_out:
f_out.writelines(f_in)
# 删除原文件
file_path.unlink()
return compressed_path
def list_backups(self):
"""
列出所有备份文件
Returns:
备份文件列表
"""
backups = []
for file in sorted(self.backup_dir.glob('backup_*.sql*'), reverse=True):
file_info = {
'filename': file.name,
'path': str(file),
'size': file.stat().st_size,
'size_mb': file.stat().st_size / 1024 / 1024,
'modified': datetime.fromtimestamp(file.stat().st_mtime)
}
backups.append(file_info)
return backups
def main():
"""主函数"""
import argparse
parser = argparse.ArgumentParser(description='数据库备份工具')
parser.add_argument('--method', choices=['mysqldump', 'python', 'auto'],
default='auto', help='备份方法 (默认: auto)')
parser.add_argument('--output', '-o', help='输出文件路径')
parser.add_argument('--compress', '-c', action='store_true',
help='压缩备份文件')
parser.add_argument('--list', '-l', action='store_true',
help='列出所有备份文件')
args = parser.parse_args()
backup = DatabaseBackup()
# 列出备份文件
if args.list:
backups = backup.list_backups()
if backups:
print(f"\n找到 {len(backups)} 个备份文件:\n")
print(f"{'文件名':<50} {'大小(MB)':<15} {'修改时间':<20}")
print("-" * 85)
for b in backups:
print(f"{b['filename']:<50} {b['size_mb']:<15.2f} {b['modified'].strftime('%Y-%m-%d %H:%M:%S'):<20}")
else:
print("未找到备份文件")
return
# 执行备份
try:
if args.method == 'mysqldump':
backup_file = backup.backup_with_mysqldump(args.output, args.compress)
elif args.method == 'python':
backup_file = backup.backup_with_python(args.output)
else: # auto
try:
backup_file = backup.backup_with_mysqldump(args.output, args.compress)
except Exception:
print("\nmysqldump方式失败切换到Python方式...")
backup_file = backup.backup_with_python(args.output)
print(f"\n备份成功!")
print(f"备份文件: {backup_file}")
except Exception as e:
print(f"\n备份失败: {str(e)}")
sys.exit(1)
if __name__ == '__main__':
main()
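`backup_with_mysqldump` 把 `--password=...` 放在命令行参数里,密码会出现在进程列表中。MySQL 客户端支持通过 MYSQL_PWD 环境变量传递密码,下面是一个只构造命令不实际执行的示意(假设本机已安装 mysqldump,`build_dump_cmd` 为示例命名):

```python
import os

def build_dump_cmd(cfg):
    """构造 mysqldump 命令与环境变量。

    密码通过 MYSQL_PWD 传递,不出现在 argv 中;
    cfg 形如 {'host': ..., 'port': ..., 'user': ..., 'password': ..., 'database': ...}。
    """
    env = dict(os.environ, MYSQL_PWD=cfg["password"])
    cmd = [
        "mysqldump",
        f"--host={cfg['host']}",
        f"--port={cfg['port']}",
        f"--user={cfg['user']}",
        "--single-transaction",
        "--default-character-set=utf8mb4",
        cfg["database"],
    ]
    return cmd, env
```

调用时把返回值交给 `subprocess.run(cmd, env=env, ...)` 即可;其余 `--routines` 等选项可按原脚本追加。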

check_and_fix_duplicates.py Normal file
@@ -0,0 +1,117 @@
"""
检查并修复重复记录
"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 检查"1.初核请示"下的所有记录
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND parent_id = %s
ORDER BY id
""", (TENANT_ID, 1765431558933731)) # 1.初核请示
results = cursor.fetchall()
print(f"'1.初核请示'下有 {len(results)} 条记录:\n")
for r in results:
print(f"ID: {r['id']}, name: {r['name']}, file_path: {r['file_path']}")
# 检查"1请示报告卡"的记录
request_cards = [r for r in results if r['name'] == '1请示报告卡']
if len(request_cards) > 1:
print(f"\n发现 {len(request_cards)} 个重复的'1请示报告卡'记录")
# 保留file_path正确的那个
correct_one = None
for r in request_cards:
if r['file_path'] and '1.请示报告卡XXX' in r['file_path']:
correct_one = r
break
if correct_one:
# 删除其他的
for r in request_cards:
if r['id'] != correct_one['id']:
# 删除关联关系
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
""", (TENANT_ID, r['id']))
# 删除模板记录
cursor.execute("""
DELETE FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, r['id']))
print(f"[DELETE] 删除重复记录: ID {r['id']}, file_path: {r['file_path']}")
# 检查"走读式谈话审批"下是否有"1请示报告卡"
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND parent_id = %s AND name = %s
""", (TENANT_ID, 1765273962700431, '1请示报告卡')) # 走读式谈话审批
result = cursor.fetchone()
if not result:
print("\n[WARN] '走读式谈话审批'下缺少'1请示报告卡'记录")
# 创建记录
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
new_id = timestamp * 1000 + random_part
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
new_id,
TENANT_ID,
1765273962700431, # 走读式谈话审批
'1请示报告卡',
None,
'/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',
655162080928945152,
655162080928945152,
1
))
print(f"[CREATE] 在'走读式谈话审批'下创建'1请示报告卡'记录 (ID: {new_id})")
else:
# 检查file_path是否正确
if result['file_path'] and '1.请示报告卡(初核谈话)' not in result['file_path']:  # 与下方写入的路径使用同一种括号,避免每次都误判为需要修复
cursor.execute("""
UPDATE f_polic_file_config
SET file_path = %s, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s AND id = %s
""", ('/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx', UPDATED_BY, TENANT_ID, result['id']))
print(f"[UPDATE] 修复'走读式谈话审批'下'1请示报告卡'的file_path")
conn.commit()
print("\n[OK] 修复完成")
except Exception as e:
conn.rollback()
print(f"[ERROR] 修复失败: {e}")
import traceback
traceback.print_exc()
finally:
cursor.close()
conn.close()
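脚本里 try/commit、except/rollback、finally/close 的样板在多处重复,可以封装成一个上下文管理器(示意写法,`transaction` 为示例命名):

```python
from contextlib import contextmanager

@contextmanager
def transaction(conn):
    """事务封装:with 块正常退出时 commit,抛异常时 rollback 并重新抛出。"""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
```

用法如 `with transaction(conn): cursor.execute(...)`,出错时数据库会自动回滚,调用方仍能看到原始异常。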

@@ -0,0 +1,551 @@
"""
检查并修复 f_polic_file_field 表的关联关系
1. 检查无效的关联(关联到不存在的 file_id 或 filed_id)
2. 检查重复的关联关系
3. 检查关联到已删除或未启用的字段/文件
4. 根据其他表的数据更新关联关系
"""
import pymysql
import os
from typing import Dict, List, Tuple
from collections import defaultdict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_invalid_relations(conn) -> Dict:
"""检查无效的关联关系(关联到不存在的 file_id 或 filed_id)"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("1. 检查无效的关联关系")
print("="*80)
# 检查关联到不存在的 file_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.id IS NULL
""", (TENANT_ID,))
invalid_file_relations = cursor.fetchall()
# 检查关联到不存在的 filed_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.id IS NULL
""", (TENANT_ID,))
invalid_field_relations = cursor.fetchall()
print(f"\n关联到不存在的 file_id: {len(invalid_file_relations)}")
if invalid_file_relations:
print(" 详情:")
for rel in invalid_file_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_file_relations) > 10:
print(f" ... 还有 {len(invalid_file_relations) - 10}")
print(f"\n关联到不存在的 filed_id: {len(invalid_field_relations)}")
if invalid_field_relations:
print(" 详情:")
for rel in invalid_field_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_field_relations) > 10:
print(f" ... 还有 {len(invalid_field_relations) - 10}")
return {
'invalid_file_relations': invalid_file_relations,
'invalid_field_relations': invalid_field_relations
}
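The `LEFT JOIN ... IS NULL` queries above are anti-joins: a relation is "invalid" when its `file_id` (or `filed_id`) has no match in the referenced table. A database-free sketch of the same logic, with made-up sample rows:

```python
# Anti-join in pure Python: keep relations whose file_id has no match
# in the set of existing file ids (sample data is hypothetical).
def find_invalid_relations(relations, existing_file_ids):
    """Mirror of LEFT JOIN f_polic_file_config ... WHERE fc.id IS NULL."""
    return [r for r in relations if r['file_id'] not in existing_file_ids]

relations = [
    {'id': 1, 'file_id': 10, 'filed_id': 100},
    {'id': 2, 'file_id': 99, 'filed_id': 101},  # file_id 99 does not exist
]
invalid = find_invalid_relations(relations, existing_file_ids={10, 20})
print(invalid)  # [{'id': 2, 'file_id': 99, 'filed_id': 101}]
```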
def check_duplicate_relations(conn) -> Dict:
"""检查重复的关联关系(相同的 file_id 和 filed_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("2. 检查重复的关联关系")
print("="*80)
# 查找重复的关联关系
cursor.execute("""
SELECT file_id, filed_id, COUNT(*) as count, GROUP_CONCAT(id ORDER BY id) as ids
FROM f_polic_file_field
WHERE tenant_id = %s
GROUP BY file_id, filed_id
HAVING COUNT(*) > 1
ORDER BY count DESC
""", (TENANT_ID,))
duplicates = cursor.fetchall()
print(f"\n发现 {len(duplicates)} 个重复的关联关系:")
duplicate_details = []
for dup in duplicates:
ids = [int(id_str) for id_str in dup['ids'].split(',')]
duplicate_details.append({
'file_id': dup['file_id'],
'filed_id': dup['filed_id'],
'count': dup['count'],
'ids': ids
})
print(f"\n 文件ID: {dup['file_id']}, 字段ID: {dup['filed_id']} (共 {dup['count']} 条)")
print(f" 关联ID列表: {ids}")
return {
'duplicates': duplicate_details
}
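The fix for duplicates (below) keeps the row with the smallest id in each `(file_id, filed_id)` group and deletes the rest, which is what `GROUP_CONCAT(id ORDER BY id)` plus `ids[1:]` amounts to. A standalone sketch of that rule, on hypothetical rows:

```python
# Keep-first deduplication: group rows by (file_id, filed_id), keep the
# smallest id per group, collect the rest for deletion (sample data only).
from collections import defaultdict

def ids_to_delete(rows):
    groups = defaultdict(list)
    for r in rows:
        groups[(r['file_id'], r['filed_id'])].append(r['id'])
    doomed = []
    for ids in groups.values():
        ids.sort()
        doomed.extend(ids[1:])  # everything after the smallest id
    return sorted(doomed)

rows = [
    {'id': 3, 'file_id': 1, 'filed_id': 7},
    {'id': 1, 'file_id': 1, 'filed_id': 7},
    {'id': 2, 'file_id': 1, 'filed_id': 7},
    {'id': 9, 'file_id': 2, 'filed_id': 8},  # unique, untouched
]
print(ids_to_delete(rows))  # [2, 3]
```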
def check_disabled_relations(conn) -> Dict:
"""检查关联到已删除或未启用的字段/文件"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("3. 检查关联到已删除或未启用的字段/文件")
print("="*80)
# 检查关联到未启用的文件
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fc.name as file_name, fc.state as file_state
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.state = 0
""", (TENANT_ID,))
disabled_file_relations = cursor.fetchall()
# 检查关联到未启用的字段
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, f.name as field_name, f.filed_code, f.state as field_state
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.state = 0
""", (TENANT_ID,))
disabled_field_relations = cursor.fetchall()
print(f"\n关联到未启用的文件: {len(disabled_file_relations)}")
if disabled_file_relations:
print(" 详情:")
for rel in disabled_file_relations[:10]:
print(f" - 关联ID: {rel['id']}, 文件: {rel['file_name']} (ID: {rel['file_id']})")
if len(disabled_file_relations) > 10:
print(f" ... 还有 {len(disabled_file_relations) - 10}")
print(f"\n关联到未启用的字段: {len(disabled_field_relations)}")
if disabled_field_relations:
print(" 详情:")
for rel in disabled_field_relations[:10]:
print(f" - 关联ID: {rel['id']}, 字段: {rel['field_name']} ({rel['filed_code']}, ID: {rel['filed_id']})")
if len(disabled_field_relations) > 10:
print(f" ... 还有 {len(disabled_field_relations) - 10}")
return {
'disabled_file_relations': disabled_file_relations,
'disabled_field_relations': disabled_field_relations
}
def check_missing_relations(conn) -> Dict:
"""检查应该存在但缺失的关联关系(文件节点应该有输出字段关联)"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("4. 检查缺失的关联关系")
print("="*80)
# 获取所有有 template_code 的文件节点(这些应该是文件,不是目录)
cursor.execute("""
SELECT fc.id, fc.name, fc.template_code
FROM f_polic_file_config fc
WHERE fc.tenant_id = %s AND fc.template_code IS NOT NULL AND fc.state = 1
""", (TENANT_ID,))
file_configs = cursor.fetchall()
# 获取所有启用的输出字段
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 2 AND state = 1
""", (TENANT_ID,))
output_fields = cursor.fetchall()
# 获取现有的关联关系
cursor.execute("""
SELECT file_id, filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
existing_relations = {(rel['file_id'], rel['filed_id']) for rel in cursor.fetchall()}
print(f"\n文件节点总数: {len(file_configs)}")
print(f"输出字段总数: {len(output_fields)}")
print(f"现有关联关系总数: {len(existing_relations)}")
# 这里不自动创建缺失的关联,因为不是所有文件都需要所有字段
# 只显示统计信息
print("\n注意: 缺失的关联关系需要根据业务逻辑手动创建")
return {
'file_configs': file_configs,
'output_fields': output_fields,
'existing_relations': existing_relations
}
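Finding the relations that could exist but do not is a set difference between the expected `(file_id, filed_id)` pairs and the existing ones. A minimal sketch with hypothetical ids (as the script notes, not every file needs every field, so this only enumerates candidates):

```python
# Candidate missing relations = all (file, field) pairs minus existing pairs.
# Ids are made up for illustration.
file_ids = {1, 2}
field_ids = {10, 11}
existing = {(1, 10), (2, 11)}

expected = {(f, fld) for f in file_ids for fld in field_ids}
missing = sorted(expected - existing)
print(missing)  # [(1, 11), (2, 10)]
```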
def check_field_type_consistency(conn) -> Dict:
"""检查关联关系的字段类型一致性f_polic_file_field 应该只关联输出字段)"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("5. 检查字段类型一致性")
print("="*80)
# 检查是否关联了输入字段(field_type=1)
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id,
fc.name as file_name, fc.template_code, f.name as field_name, f.filed_code, f.field_type
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.field_type = 1
ORDER BY fc.name, f.name
""", (TENANT_ID,))
input_field_relations = cursor.fetchall()
print(f"\n关联到输入字段 (field_type=1) 的记录: {len(input_field_relations)}")
if input_field_relations:
print(" 注意: f_polic_file_field 表通常只应该关联输出字段 (field_type=2)")
print(" 根据业务逻辑,输入字段不需要通过此表关联")
print(" 详情:")
for rel in input_field_relations:
print(f" - 关联ID: {rel['id']}, 文件: {rel['file_name']} (code: {rel['template_code']}), "
f"字段: {rel['field_name']} ({rel['filed_code']}, type={rel['field_type']})")
else:
print(" ✓ 所有关联都是输出字段")
return {
'input_field_relations': input_field_relations
}
def fix_invalid_relations(conn, dry_run: bool = True) -> Dict:
"""修复无效的关联关系"""
cursor = conn.cursor()
print("\n" + "="*80)
print("修复无效的关联关系")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 获取无效的关联(只查询一次,避免重复执行检查)
invalid_result = check_invalid_relations(conn)
invalid_file_relations = invalid_result['invalid_file_relations']
invalid_field_relations = invalid_result['invalid_field_relations']
all_invalid_ids = set()
for rel in invalid_file_relations:
all_invalid_ids.add(rel['id'])
for rel in invalid_field_relations:
all_invalid_ids.add(rel['id'])
if not all_invalid_ids:
print("\n✓ 没有无效的关联关系需要删除")
return {'deleted': 0}
print(f"\n准备删除 {len(all_invalid_ids)} 条无效的关联关系")
if not dry_run:
placeholders = ','.join(['%s'] * len(all_invalid_ids))
cursor.execute(f"""
DELETE FROM f_polic_file_field
WHERE id IN ({placeholders})
""", list(all_invalid_ids))
conn.commit()
print(f"✓ 已删除 {cursor.rowcount} 条无效的关联关系")
else:
print(f"[DRY RUN] 将删除以下关联ID: {sorted(all_invalid_ids)}")
return {'deleted': len(all_invalid_ids) if not dry_run else 0}
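The delete statements above build one `%s` placeholder per id so the ids stay parameterized instead of being string-formatted into the SQL. That step can be sketched on its own:

```python
# Build a parameterized "DELETE ... WHERE id IN (...)" statement:
# one %s per id, with the ids passed separately as parameters.
def build_delete(ids):
    placeholders = ','.join(['%s'] * len(ids))
    sql = f"DELETE FROM f_polic_file_field WHERE id IN ({placeholders})"
    return sql, list(ids)

sql, params = build_delete([4, 8, 15])
print(sql)     # DELETE FROM f_polic_file_field WHERE id IN (%s,%s,%s)
print(params)  # [4, 8, 15]
```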
def fix_input_field_relations(conn, dry_run: bool = True) -> Dict:
"""删除关联到输入字段的记录f_polic_file_field 应该只关联输出字段)"""
cursor = conn.cursor()
print("\n" + "="*80)
print("删除关联到输入字段的记录")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 获取关联到输入字段的记录
input_field_relations = check_field_type_consistency(conn)['input_field_relations']
if not input_field_relations:
print("\n✓ 没有关联到输入字段的记录需要删除")
return {'deleted': 0}
ids_to_delete = [rel['id'] for rel in input_field_relations]
print(f"\n准备删除 {len(ids_to_delete)} 条关联到输入字段的记录")
if not dry_run:
placeholders = ','.join(['%s'] * len(ids_to_delete))
cursor.execute(f"""
DELETE FROM f_polic_file_field
WHERE id IN ({placeholders})
""", ids_to_delete)
conn.commit()
print(f"✓ 已删除 {cursor.rowcount} 条关联到输入字段的记录")
else:
print(f"[DRY RUN] 将删除以下关联ID: {sorted(ids_to_delete)}")
return {'deleted': len(ids_to_delete) if not dry_run else 0}
def fix_duplicate_relations(conn, dry_run: bool = True) -> Dict:
"""修复重复的关联关系(保留第一条,删除其他)"""
cursor = conn.cursor()
print("\n" + "="*80)
print("修复重复的关联关系")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
duplicates = check_duplicate_relations(conn)['duplicates']
if not duplicates:
print("\n✓ 没有重复的关联关系需要修复")
return {'deleted': 0}
ids_to_delete = []
for dup in duplicates:
# 保留第一条(ID最小的),删除其他的
ids_to_delete.extend(dup['ids'][1:])
print(f"\n准备删除 {len(ids_to_delete)} 条重复的关联关系")
if not dry_run:
placeholders = ','.join(['%s'] * len(ids_to_delete))
cursor.execute(f"""
DELETE FROM f_polic_file_field
WHERE id IN ({placeholders})
""", ids_to_delete)
conn.commit()
print(f"✓ 已删除 {cursor.rowcount} 条重复的关联关系")
else:
print(f"[DRY RUN] 将删除以下关联ID: {sorted(ids_to_delete)}")
return {'deleted': len(ids_to_delete) if not dry_run else 0}
def get_statistics(conn) -> Dict:
"""获取统计信息"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("统计信息")
print("="*80)
# 总关联数
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
total_relations = cursor.fetchone()['total']
# 有效的关联数(关联到存在的、启用的文件和字段)
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id AND fc.state = 1
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id AND f.state = 1
WHERE fff.tenant_id = %s
""", (TENANT_ID,))
valid_relations = cursor.fetchone()['total']
# 关联的文件数
cursor.execute("""
SELECT COUNT(DISTINCT file_id) as total
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
related_files = cursor.fetchone()['total']
# 关联的字段数
cursor.execute("""
SELECT COUNT(DISTINCT filed_id) as total
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
related_fields = cursor.fetchone()['total']
print(f"\n总关联数: {total_relations}")
print(f"有效关联数: {valid_relations}")
print(f"关联的文件数: {related_files}")
print(f"关联的字段数: {related_fields}")
return {
'total_relations': total_relations,
'valid_relations': valid_relations,
'related_files': related_files,
'related_fields': related_fields
}
def main():
"""主函数"""
print("="*80)
print("检查并修复 f_polic_file_field 表的关联关系")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 1. 检查无效的关联关系
invalid_result = check_invalid_relations(conn)
# 2. 检查重复的关联关系
duplicate_result = check_duplicate_relations(conn)
# 3. 检查关联到已删除或未启用的字段/文件
disabled_result = check_disabled_relations(conn)
# 4. 检查缺失的关联关系
missing_result = check_missing_relations(conn)
# 5. 检查字段类型一致性
type_result = check_field_type_consistency(conn)
# 6. 获取统计信息
stats = get_statistics(conn)
# 总结
print("\n" + "="*80)
print("检查总结")
print("="*80)
has_issues = (
len(invalid_result['invalid_file_relations']) > 0 or
len(invalid_result['invalid_field_relations']) > 0 or
len(duplicate_result['duplicates']) > 0 or
len(type_result['input_field_relations']) > 0
)
if has_issues:
print("\n⚠ 发现以下问题:")
print(f" - 无效的 file_id 关联: {len(invalid_result['invalid_file_relations'])}")
print(f" - 无效的 filed_id 关联: {len(invalid_result['invalid_field_relations'])}")
print(f" - 重复的关联关系: {len(duplicate_result['duplicates'])}")
print(f" - 关联到未启用的文件: {len(disabled_result['disabled_file_relations'])}")
print(f" - 关联到未启用的字段: {len(disabled_result['disabled_field_relations'])}")
print(f" - 关联到输入字段: {len(type_result['input_field_relations'])}")
print("\n是否要修复这些问题?")
print("运行以下命令进行修复:")
print(" python check_and_fix_file_field_relations.py --fix")
else:
print("\n✓ 未发现需要修复的问题")
print("\n" + "="*80)
except Exception as e:
print(f"\n✗ 检查过程中发生错误: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\n数据库连接已关闭")
def fix_main():
"""修复主函数"""
print("="*80)
print("修复 f_polic_file_field 表的关联关系")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 先检查当前状态(各修复函数内部会再次查询)
print("\n[第一步] 检查当前状态...")
check_invalid_relations(conn)
check_duplicate_relations(conn)
# 修复无效的关联关系
print("\n[第二步] 修复无效的关联关系...")
fix_invalid_relations(conn, dry_run=False)
# 修复重复的关联关系
print("\n[第三步] 修复重复的关联关系...")
fix_duplicate_relations(conn, dry_run=False)
# 删除关联到输入字段的记录
print("\n[第四步] 删除关联到输入字段的记录...")
fix_input_field_relations(conn, dry_run=False)
# 重新获取统计信息
print("\n[第五步] 修复后的统计信息...")
stats = get_statistics(conn)
print("\n" + "="*80)
print("修复完成")
print("="*80)
except Exception as e:
print(f"\n✗ 修复过程中发生错误: {e}")
import traceback
traceback.print_exc()
conn.rollback()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
import sys
if '--fix' in sys.argv:
# 确认操作
print("\n⚠ 警告: 这将修改数据库!")
response = input("确认要继续吗? (yes/no): ")
if response.lower() == 'yes':
fix_main()
else:
print("操作已取消")
else:
main()
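The entry point above uses a dry-run-by-default pattern: the script only mutates the database when `--fix` is passed and the operator types `yes`. A minimal sketch of that gate, with `input()` replaced by an injected callable so it can run non-interactively:

```python
# Dry-run-by-default CLI gate: check unless --fix is passed and confirmed.
# `confirm` stands in for input() so the logic is testable.
def decide(argv, confirm):
    if '--fix' not in argv:
        return 'check'
    return 'fix' if confirm().lower() == 'yes' else 'cancelled'

print(decide(['prog'], lambda: 'yes'))           # check
print(decide(['prog', '--fix'], lambda: 'yes'))  # fix
print(decide(['prog', '--fix'], lambda: 'no'))   # cancelled
```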


@ -0,0 +1,36 @@
"""查询保密承诺书相关的模板记录"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name LIKE %s
ORDER BY name
""", (TENANT_ID, '%保密承诺书%'))
results = cursor.fetchall()
print(f"找到 {len(results)} 条记录:\n")
for r in results:
print(f"ID: {r['id']}")
print(f"名称: {r['name']}")
print(f"文件路径: {r['file_path']}")
print(f"父节点ID: {r['parent_id']}")
print()
cursor.close()
conn.close()


@ -0,0 +1,539 @@
"""
检查数据库中的ID关系是否正确
功能:
1. 检查f_polic_file_config表中的数据
2. 检查f_polic_field表中的数据
3. 检查f_polic_file_field表中的关联关系
4. 验证ID关系是否正确匹配
5. 找出孤立数据和错误关联
使用方法:
python check_database_id_relations.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
import os
import sys
import pymysql
import argparse
from typing import Dict, List, Set, Optional
from collections import defaultdict
# 设置输出编码为UTF-8(Windows兼容)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def get_db_config_from_args() -> Dict:
"""从命令行参数获取数据库配置"""
parser = argparse.ArgumentParser(
description='检查数据库中的ID关系是否正确',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
示例:
python check_database_id_relations.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
)
parser.add_argument('--host', type=str, required=True, help='MySQL服务器地址')
parser.add_argument('--port', type=int, required=True, help='MySQL服务器端口')
parser.add_argument('--user', type=str, required=True, help='MySQL用户名')
parser.add_argument('--password', type=str, required=True, help='MySQL密码')
parser.add_argument('--database', type=str, required=True, help='数据库名称')
parser.add_argument('--tenant-id', type=int, required=True, help='租户ID')
parser.add_argument('--file-id', type=int, help='检查特定的文件ID')
args = parser.parse_args()
return {
'host': args.host,
'port': args.port,
'user': args.user,
'password': args.password,
'database': args.database,
'charset': 'utf8mb4',
'tenant_id': args.tenant_id,
'file_id': args.file_id
}
def test_db_connection(config: Dict) -> Optional[pymysql.Connection]:
"""测试数据库连接"""
try:
conn = pymysql.connect(
host=config['host'],
port=config['port'],
user=config['user'],
password=config['password'],
database=config['database'],
charset=config['charset']
)
return conn
except Exception as e:
print_result(False, f"数据库连接失败: {str(e)}")
return None
def check_file_config(conn, tenant_id: int, file_id: Optional[int] = None):
"""检查f_polic_file_config表"""
print_section("检查 f_polic_file_config 表")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
if file_id:
# 检查特定文件ID
cursor.execute("""
SELECT id, tenant_id, parent_id, name, file_path, state
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s
""", (file_id, tenant_id))
result = cursor.fetchone()
if result:
print(f"\n 文件ID {file_id} 的信息:")
print(f" - ID: {result['id']}")
print(f" - 租户ID: {result['tenant_id']}")
print(f" - 父级ID: {result['parent_id']}")
print(f" - 名称: {result['name']}")
print(f" - 文件路径: {result['file_path']}")
# 处理state字段(可能是bytes或int)
state_raw = result['state']
if isinstance(state_raw, bytes):
state_value = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
state_value = int(state_raw)
else:
state_value = 0
print(f" - 状态: {state_value} ({'启用' if state_value == 1 else '禁用'})")
if state_value != 1:
print_result(False, f"文件ID {file_id} 的状态为禁用(state={state_value})")
else:
print_result(True, f"文件ID {file_id} 存在且已启用")
else:
print_result(False, f"文件ID {file_id} 不存在或不属于租户 {tenant_id}")
return
# 统计信息
cursor.execute("""
SELECT
COUNT(*) as total,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled,
SUM(CASE WHEN state = 0 THEN 1 ELSE 0 END) as disabled,
SUM(CASE WHEN file_path IS NOT NULL AND file_path != '' THEN 1 ELSE 0 END) as files,
SUM(CASE WHEN file_path IS NULL OR file_path = '' THEN 1 ELSE 0 END) as directories
FROM f_polic_file_config
WHERE tenant_id = %s
""", (tenant_id,))
stats = cursor.fetchone()
print(f"\n 统计信息:")
print(f" - 总记录数: {stats['total']}")
print(f" - 启用记录: {stats['enabled']}")
print(f" - 禁用记录: {stats['disabled']}")
print(f" - 文件节点: {stats['files']}")
print(f" - 目录节点: {stats['directories']}")
# 检查parent_id引用
cursor.execute("""
SELECT fc1.id, fc1.name, fc1.parent_id
FROM f_polic_file_config fc1
LEFT JOIN f_polic_file_config fc2 ON fc1.parent_id = fc2.id AND fc1.tenant_id = fc2.tenant_id
WHERE fc1.tenant_id = %s
AND fc1.parent_id IS NOT NULL
AND fc2.id IS NULL
""", (tenant_id,))
broken_parents = cursor.fetchall()
if broken_parents:
print(f"\n [警告] 发现 {len(broken_parents)} 个parent_id引用错误:")
for item in broken_parents[:10]:
print(f" - ID: {item['id']}, 名称: {item['name']}, parent_id: {item['parent_id']} (不存在)")
if len(broken_parents) > 10:
print(f" ... 还有 {len(broken_parents) - 10}")
else:
print_result(True, "所有parent_id引用正确")
finally:
cursor.close()
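The bytes-or-int decoding of the `state` column (MySQL BIT values can arrive as `bytes` through pymysql) is repeated in several functions below. A shared helper like this sketch would centralize it; the name `normalize_state` is hypothetical, not part of the script:

```python
# Hypothetical helper: coerce a state value that may be bytes, int, or None
# to a plain int, matching the inline logic repeated in the script.
def normalize_state(raw):
    if isinstance(raw, bytes):
        return int.from_bytes(raw, byteorder='big')
    return int(raw) if raw is not None else 0

print(normalize_state(b'\x01'))  # 1
print(normalize_state(None))     # 0
print(normalize_state(1))        # 1
```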
def check_fields(conn, tenant_id: int):
"""检查f_polic_field表"""
print_section("检查 f_polic_field 表")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 统计信息
cursor.execute("""
SELECT
field_type,
COUNT(*) as total,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled,
SUM(CASE WHEN state = 0 THEN 1 ELSE 0 END) as disabled
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY field_type
""", (tenant_id,))
stats = cursor.fetchall()
print(f"\n 统计信息:")
for stat in stats:
field_type_name = "输入字段" if stat['field_type'] == 1 else "输出字段" if stat['field_type'] == 2 else "未知"
print(f" - {field_type_name} (field_type={stat['field_type']}):")
print(f" 总记录数: {stat['total']}")
print(f" 启用: {stat['enabled']}")
print(f" 禁用: {stat['disabled']}")
# 检查重复的filed_code
cursor.execute("""
SELECT filed_code, field_type, COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
AND state = 1
GROUP BY filed_code, field_type
HAVING count > 1
""", (tenant_id,))
duplicates = cursor.fetchall()
if duplicates:
print(f"\n [警告] 发现重复的filed_code:")
for dup in duplicates:
print(f" - filed_code: {dup['filed_code']}, field_type: {dup['field_type']}, 重复数: {dup['count']}")
else:
print_result(True, "没有重复的filed_code")
finally:
cursor.close()
def check_file_field_relations(conn, tenant_id: int, file_id: Optional[int] = None):
"""检查f_polic_file_field表"""
print_section("检查 f_polic_file_field 表(关联关系)")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 统计信息
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field
WHERE tenant_id = %s AND state = 1
""", (tenant_id,))
total_relations = cursor.fetchone()['total']
print(f"\n 总关联关系数: {total_relations}")
if file_id:
# 检查特定文件ID的关联关系
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.state,
fc.name as file_name, fc.file_path, fc.state as file_state,
f.name as field_name, f.filed_code, f.field_type, f.state as field_state
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s
""", (tenant_id, file_id))
relations = cursor.fetchall()
if relations:
print(f"\n 文件ID {file_id} 的关联关系 ({len(relations)} 条):")
for rel in relations:
print(f"\n 关联ID: {rel['id']}")
print(f" - file_id: {rel['file_id']}")
if rel['file_name']:
print(f" 模板: {rel['file_name']} (路径: {rel['file_path']})")
# 处理state字段(可能是bytes或int)
state_raw = rel['file_state']
if isinstance(state_raw, bytes):
file_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
file_state = int(state_raw)
else:
file_state = 0
print(f" 状态: {file_state} ({'启用' if file_state == 1 else '禁用'})")
else:
print(f" [错误] 模板不存在!")
print(f" - filed_id: {rel['filed_id']}")
if rel['field_name']:
field_type_name = "输入字段" if rel['field_type'] == 1 else "输出字段" if rel['field_type'] == 2 else "未知"
# 处理state字段(可能是bytes或int)
state_raw = rel['field_state']
if isinstance(state_raw, bytes):
field_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
field_state = int(state_raw)
else:
field_state = 0
print(f" 字段: {rel['field_name']} ({rel['filed_code']}, {field_type_name})")
print(f" 状态: {field_state} ({'启用' if field_state == 1 else '禁用'})")
else:
print(f" [错误] 字段不存在!")
else:
print(f"\n 文件ID {file_id} 没有关联关系")
# 检查孤立的关联关系(file_id不存在)
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND fc.id IS NULL
""", (tenant_id,))
orphaned_file_relations = cursor.fetchall()
if orphaned_file_relations:
print(f"\n [错误] 发现 {len(orphaned_file_relations)} 个孤立的关联关系(file_id不存在):")
for rel in orphaned_file_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(orphaned_file_relations) > 10:
print(f" ... 还有 {len(orphaned_file_relations) - 10}")
else:
print_result(True, "所有file_id引用正确")
# 检查孤立的关联关系(filed_id不存在)
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.id IS NULL
""", (tenant_id,))
orphaned_field_relations = cursor.fetchall()
if orphaned_field_relations:
print(f"\n [错误] 发现 {len(orphaned_field_relations)} 个孤立的关联关系(filed_id不存在):")
for rel in orphaned_field_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(orphaned_field_relations) > 10:
print(f" ... 还有 {len(orphaned_field_relations) - 10}")
else:
print_result(True, "所有filed_id引用正确")
# 检查关联到禁用模板或字段的关联关系
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id,
fc.state as file_state, f.state as field_state
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND (fc.state != 1 OR f.state != 1)
""", (tenant_id,))
disabled_relations = cursor.fetchall()
if disabled_relations:
print(f"\n [警告] 发现 {len(disabled_relations)} 个关联到禁用模板或字段的关联关系:")
for rel in disabled_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
print(f" 模板状态: {rel['file_state']}, 字段状态: {rel['field_state']}")
if len(disabled_relations) > 10:
print(f" ... 还有 {len(disabled_relations) - 10}")
else:
print_result(True, "所有关联关系都关联到启用的模板和字段")
finally:
cursor.close()
def check_specific_file(conn, tenant_id: int, file_id: int):
"""检查特定文件ID的完整信息"""
print_section(f"详细检查文件ID {file_id}")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. 检查文件配置
cursor.execute("""
SELECT id, tenant_id, parent_id, name, file_path, state, created_time, updated_time
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s
""", (file_id, tenant_id))
file_config = cursor.fetchone()
if not file_config:
print_result(False, f"文件ID {file_id} 不存在或不属于租户 {tenant_id}")
return
print(f"\n 文件配置信息:")
print(f" - ID: {file_config['id']}")
print(f" - 租户ID: {file_config['tenant_id']}")
print(f" - 父级ID: {file_config['parent_id']}")
print(f" - 名称: {file_config['name']}")
print(f" - 文件路径: {file_config['file_path']}")
# 处理state字段(可能是bytes或int)
state_raw = file_config['state']
if isinstance(state_raw, bytes):
file_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
file_state = int(state_raw)
else:
file_state = 0
print(f" - 状态: {file_state} ({'启用' if file_state == 1 else '禁用'})")
print(f" - 创建时间: {file_config['created_time']}")
print(f" - 更新时间: {file_config['updated_time']}")
# 2. 检查父级
if file_config['parent_id']:
cursor.execute("""
SELECT id, name, file_path, state
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s
""", (file_config['parent_id'], tenant_id))
parent = cursor.fetchone()
if parent:
# 处理state字段(可能是bytes或int)
state_raw = parent['state']
if isinstance(state_raw, bytes):
parent_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
parent_state = int(state_raw)
else:
parent_state = 0
print(f"\n 父级信息:")
print(f" - ID: {parent['id']}")
print(f" - 名称: {parent['name']}")
print(f" - 状态: {parent_state} ({'启用' if parent_state == 1 else '禁用'})")
else:
print(f"\n [错误] 父级ID {file_config['parent_id']} 不存在!")
# 3. 检查关联的字段
cursor.execute("""
SELECT fff.id as relation_id, fff.filed_id,
f.name as field_name, f.filed_code, f.field_type, f.state as field_state
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s AND fff.state = 1
ORDER BY f.field_type, f.filed_code
""", (tenant_id, file_id))
relations = cursor.fetchall()
print(f"\n 关联的字段 ({len(relations)} 个):")
input_fields = []
output_fields = []
for rel in relations:
field_type_name = "输入字段" if rel['field_type'] == 1 else "输出字段" if rel['field_type'] == 2 else "未知"
# 处理state字段(可能是bytes或int)
state_raw = rel['field_state']
if isinstance(state_raw, bytes):
field_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
field_state = int(state_raw)
else:
field_state = 0
field_info = f" - {rel['field_name']} ({rel['filed_code']}, {field_type_name})"
if field_state != 1:
field_info += f" [状态: 禁用]"
if not rel['field_name']:
field_info += f" [错误: 字段不存在!]"
if rel['field_type'] == 1:
input_fields.append(field_info)
else:
output_fields.append(field_info)
if input_fields:
print(f"\n 输入字段 ({len(input_fields)} 个):")
for info in input_fields:
print(info)
if output_fields:
print(f"\n 输出字段 ({len(output_fields)} 个):")
for info in output_fields:
print(info)
# 4. 检查是否有孤立的关联关系
cursor.execute("""
SELECT fff.id, fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s AND fff.state = 1 AND f.id IS NULL
""", (tenant_id, file_id))
orphaned = cursor.fetchall()
if orphaned:
print(f"\n [错误] 发现 {len(orphaned)} 个孤立的关联关系(字段不存在):")
for rel in orphaned:
print(f" - 关联ID: {rel['id']}, filed_id: {rel['filed_id']}")
finally:
cursor.close()
def main():
"""主函数"""
print_section("数据库ID关系检查工具")
# 获取配置
config = get_db_config_from_args()
# 显示配置信息
print_section("配置信息")
print(f" 数据库服务器: {config['host']}:{config['port']}")
print(f" 数据库名称: {config['database']}")
print(f" 用户名: {config['user']}")
print(f" 租户ID: {config['tenant_id']}")
if config.get('file_id'):
print(f" 检查文件ID: {config['file_id']}")
# 连接数据库
print_section("连接数据库")
conn = test_db_connection(config)
if not conn:
return
print_result(True, "数据库连接成功")
try:
tenant_id = config['tenant_id']
file_id = config.get('file_id')
# 检查各个表
check_file_config(conn, tenant_id, file_id)
check_fields(conn, tenant_id)
check_file_field_relations(conn, tenant_id, file_id)
# 如果指定了文件ID,进行详细检查
if file_id:
check_specific_file(conn, tenant_id, file_id)
# 总结
print_section("检查完成")
print("请查看上述检查结果,找出问题所在")
finally:
conn.close()
print_result(True, "数据库连接已关闭")
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\n[中断] 用户取消操作")
sys.exit(0)
except Exception as e:
print(f"\n[错误] 发生异常: {str(e)}")
import traceback
traceback.print_exc()
sys.exit(1)

check_database_templates.py Normal file

@ -0,0 +1,202 @@
"""
检查数据库中的模板记录情况
"""
import os
import pymysql
from pathlib import Path
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# 先检查数据库中的实际 tenant_id
TENANT_ID = 615873064429507639 # 默认值,会在检查时自动发现实际的 tenant_id
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def check_database():
"""检查数据库记录"""
print_section("数据库模板记录检查")
try:
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 0. 先检查所有 tenant_id,确定实际使用的 tenant_id
print_section("0. 检查所有不同的 tenant_id")
cursor.execute("SELECT DISTINCT tenant_id, COUNT(*) as count FROM f_polic_file_config GROUP BY tenant_id")
tenant_ids = cursor.fetchall()
actual_tenant_id = None
for row in tenant_ids:
print(f" tenant_id={row['tenant_id']}: {row['count']} 条记录")
if actual_tenant_id is None:
actual_tenant_id = row['tenant_id']
# 使用实际的 tenant_id
if actual_tenant_id:
print(f"\n [使用] tenant_id={actual_tenant_id} 进行后续检查")
tenant_id = actual_tenant_id
else:
tenant_id = TENANT_ID
print(f"\n [使用] 默认 tenant_id={tenant_id}")
# 1. 检查 f_polic_file_config 表的所有记录(不限制条件)
print_section("1. 检查 f_polic_file_config 表(所有记录)")
cursor.execute("SELECT COUNT(*) as count FROM f_polic_file_config")
total_count = cursor.fetchone()['count']
print(f" 总记录数: {total_count}")
# 2. 检查按 tenant_id 过滤
print_section("2. 检查 f_polic_file_config 表(按 tenant_id 过滤)")
cursor.execute("SELECT COUNT(*) as count FROM f_polic_file_config WHERE tenant_id = %s", (tenant_id,))
tenant_count = cursor.fetchone()['count']
print(f" tenant_id={tenant_id} 的记录数: {tenant_count}")
# 3. 检查有 file_path 的记录
print_section("3. 检查 f_polic_file_config 表(有 file_path 的记录)")
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
AND file_path IS NOT NULL
AND file_path != ''
""", (tenant_id,))
path_count = cursor.fetchone()['count']
print(f" 有 file_path 的记录数: {path_count}")
# 4. 检查不同状态的记录
print_section("4. 检查 f_polic_file_config 表(按 state 分组)")
cursor.execute("""
SELECT state, COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
GROUP BY state
""", (tenant_id,))
state_counts = cursor.fetchall()
for row in state_counts:
state_name = "已启用" if row['state'] == 1 else "已禁用"
print(f" state={row['state']} ({state_name}): {row['count']}")
# 5. 查看前10条记录示例
print_section("5. f_polic_file_config 表记录示例前10条")
cursor.execute("""
SELECT id, name, file_path, state, tenant_id, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
LIMIT 10
""", (tenant_id,))
samples = cursor.fetchall()
if samples:
for i, row in enumerate(samples, 1):
print(f"\n 记录 {i}:")
print(f" ID: {row['id']}")
print(f" 名称: {row['name']}")
print(f" 路径: {row['file_path']}")
print(f" 状态: {row['state']} ({'已启用' if row['state'] == 1 else '已禁用'})")
print(f" 租户ID: {row['tenant_id']}")
print(f" 父级ID: {row['parent_id']}")
else:
print(" 没有找到记录")
# 7. 检查 file_path 的类型分布
print_section("7. 检查 file_path 路径类型分布")
cursor.execute("""
SELECT
CASE
WHEN file_path LIKE 'template_finish/%%' THEN '本地路径'
WHEN file_path LIKE '/%%TEMPLATE/%%' THEN 'MinIO路径'
WHEN file_path IS NULL OR file_path = '' THEN '空路径'
ELSE '其他路径'
END as path_type,
COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
GROUP BY path_type
""", (tenant_id,))
path_types = cursor.fetchall()
for row in path_types:
print(f" {row['path_type']}: {row['count']}")
# 8. 检查 f_polic_file_field 关联表
print_section("8. 检查 f_polic_file_field 关联表")
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_field
WHERE tenant_id = %s
""", (tenant_id,))
relation_count = cursor.fetchone()['count']
print(f" 关联记录数: {relation_count}")
# 9. 检查 f_polic_field 字段表
print_section("9. 检查 f_polic_field 字段表")
cursor.execute("""
SELECT
field_type,
CASE
WHEN field_type = 1 THEN '输入字段'
WHEN field_type = 2 THEN '输出字段'
ELSE '未知'
END as type_name,
COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY field_type
""", (tenant_id,))
field_types = cursor.fetchall()
for row in field_types:
print(f" {row['type_name']} (field_type={row['field_type']}): {row['count']}")
# 10. 检查完整的关联关系
print_section("10. 检查模板与字段的关联关系(示例)")
cursor.execute("""
SELECT
fc.id as file_id,
fc.name as file_name,
fc.file_path,
COUNT(ff.filed_id) as field_count
FROM f_polic_file_config fc
LEFT JOIN f_polic_file_field ff ON fc.id = ff.file_id AND ff.tenant_id = %s
WHERE fc.tenant_id = %s
GROUP BY fc.id, fc.name, fc.file_path
LIMIT 10
""", (tenant_id, tenant_id))
relations = cursor.fetchall()
if relations:
for i, row in enumerate(relations, 1):
print(f"\n Template {i}:")
print(f" ID: {row['file_id']}")
print(f" Name: {row['file_name']}")
print(f" Path: {row['file_path']}")
print(f" Related fields: {row['field_count']}")
else:
print(" No relation records found")
cursor.close()
conn.close()
print_section("Check complete")
except Exception as e:
print(f"Check failed: {str(e)}")
import traceback
traceback.print_exc()
if __name__ == "__main__":
check_database()

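The CASE expression in section 7 above buckets `file_path` values by prefix. A minimal Python mirror of those rules (the function name is hypothetical; useful for unit-testing the classification without a database):

```python
def classify_path(file_path):
    """Bucket a file_path the same way the SQL CASE in section 7 does."""
    if not file_path:
        return "empty path"
    if file_path.startswith("template_finish/"):
        return "local path"
    if file_path.startswith("/") and "TEMPLATE/" in file_path:
        return "MinIO path"
    return "other"

print(classify_path("template_finish/a.docx"))  # local path
```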

@ -0,0 +1,140 @@
"""
Check the actual data in the database: list the tenant_id values present and the row counts for each.
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
def check_tenant_data():
"""Check the tenant_id data in each table"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
print("=" * 80)
print("Checking tenant_id data in the database")
print("=" * 80)
# 1. tenant_id values in f_polic_field
print("\n1. tenant_id distribution in f_polic_field:")
cursor.execute("""
SELECT tenant_id,
COUNT(*) as total_count,
SUM(CASE WHEN field_type = 1 THEN 1 ELSE 0 END) as input_count,
SUM(CASE WHEN field_type = 2 THEN 1 ELSE 0 END) as output_count,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled_count
FROM f_polic_field
GROUP BY tenant_id
ORDER BY tenant_id
""")
field_tenants = cursor.fetchall()
for row in field_tenants:
print(f" tenant_id: {row['tenant_id']}")
print(f" total fields: {row['total_count']}, input: {row['input_count']}, output: {row['output_count']}, enabled: {row['enabled_count']}")
# 2. tenant_id values in f_polic_file_config
print("\n2. tenant_id distribution in f_polic_file_config:")
cursor.execute("""
SELECT tenant_id,
COUNT(*) as total_count,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled_count
FROM f_polic_file_config
GROUP BY tenant_id
ORDER BY tenant_id
""")
config_tenants = cursor.fetchall()
for row in config_tenants:
print(f" tenant_id: {row['tenant_id']}")
print(f" total templates: {row['total_count']}, enabled: {row['enabled_count']}")
# 3. tenant_id values in f_polic_file_field
print("\n3. tenant_id distribution in f_polic_file_field:")
cursor.execute("""
SELECT tenant_id,
COUNT(*) as total_count,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled_count
FROM f_polic_file_field
GROUP BY tenant_id
ORDER BY tenant_id
""")
relation_tenants = cursor.fetchall()
for row in relation_tenants:
print(f" tenant_id: {row['tenant_id']}")
print(f" total relations: {row['total_count']}, enabled: {row['enabled_count']}")
# 4. Detailed data for a specific tenant_id
test_tenant_id = 615873064429507600
print(f"\n4. Details for tenant_id = {test_tenant_id}:")
# Field data
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
""", (test_tenant_id,))
field_count = cursor.fetchone()['count']
print(f" fields in f_polic_field: {field_count}")
if field_count > 0:
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
LIMIT 10
""", (test_tenant_id,))
sample_fields = cursor.fetchall()
print(f" Sample fields (first 10):")
for field in sample_fields:
print(f" ID: {field['id']}, name: {field['name']}, code: {field['filed_code']}, type: {field['field_type']}, state: {field['state']}")
# Template data
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
""", (test_tenant_id,))
template_count = cursor.fetchone()['count']
print(f" templates in f_polic_file_config: {template_count}")
# Relation data
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_field
WHERE tenant_id = %s
""", (test_tenant_id,))
relation_count = cursor.fetchone()['count']
print(f" relations in f_polic_file_field: {relation_count}")
# 5. All distinct tenant_id values across the tables
print("\n5. tenant_id values appearing in any table:")
cursor.execute("""
SELECT DISTINCT tenant_id FROM f_polic_field
UNION
SELECT DISTINCT tenant_id FROM f_polic_file_config
UNION
SELECT DISTINCT tenant_id FROM f_polic_file_field
ORDER BY tenant_id
""")
all_tenants = cursor.fetchall()
print(" All tenant_id values:")
for row in all_tenants:
print(f" {row['tenant_id']}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_tenant_data()

105
check_existing_data.py Normal file

@ -0,0 +1,105 @@
"""
Check the existing data in the database and confirm how it matches up.
"""
import os
import json
import pymysql
from pathlib import Path
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_existing_data():
"""Check the existing data in the database"""
print("="*80)
print("Checking existing data in the database")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# Query all records
sql = """
SELECT id, name, parent_id, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
ORDER BY name
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
print(f"\nFound {len(configs)} records\n")
# Split records by parent_id
with_parent = []
without_parent = []
for config in configs:
# Try to extract template_code from input_data
template_code = config.get('template_code')
if not template_code and config.get('input_data'):
try:
input_data = json.loads(config['input_data']) if isinstance(config['input_data'], str) else config['input_data']
if isinstance(input_data, dict):
template_code = input_data.get('template_code')
except (ValueError, TypeError):
pass
config['extracted_template_code'] = template_code
if config.get('parent_id'):
with_parent.append(config)
else:
without_parent.append(config)
print(f"Records with parent_id: {len(with_parent)}")
print(f"Records without parent_id: {len(without_parent)}\n")
# Show records without parent_id
print("="*80)
print("Records without parent_id:")
print("="*80)
for i, config in enumerate(without_parent, 1):
print(f"\n{i}. {config['name']}")
print(f" ID: {config['id']}")
print(f" template_code: {config.get('extracted_template_code') or config.get('template_code') or ''}")
print(f" file_path: {config.get('file_path', '')}")
print(f" state: {config.get('state')}")
# Show records with parent_id (tree structure)
print("\n" + "="*80)
print("Records with parent_id (tree structure):")
print("="*80)
# Build an ID-to-name map
id_to_name = {config['id']: config['name'] for config in configs}
for config in with_parent:
parent_name = id_to_name.get(config['parent_id'], f"ID:{config['parent_id']}")
print(f"\n{config['name']}")
print(f" ID: {config['id']}")
print(f" parent: {parent_name} (ID: {config['parent_id']})")
print(f" template_code: {config.get('extracted_template_code') or config.get('template_code') or ''}")
cursor.close()
conn.close()
except Exception as e:
print(f"Error: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
check_existing_data()


@ -0,0 +1,496 @@
"""
Comprehensively check the relations in the f_polic_file_field table.
Focuses on how input fields (field_type=1) and output fields (field_type=2) are linked to templates.
"""
import pymysql
import os
import sys
from typing import Dict, List, Tuple
from collections import defaultdict
# Force UTF-8 output on Windows
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_all_templates_field_relations(conn) -> Dict:
"""Check the field relations of every template"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("1. Field relations for every template")
print("="*80)
# Fetch all enabled templates
cursor.execute("""
SELECT id, name, template_code
FROM f_polic_file_config
WHERE tenant_id = %s AND state = 1
ORDER BY name
""", (TENANT_ID,))
templates = cursor.fetchall()
# Fetch the fields linked to each template (grouped by type)
cursor.execute("""
SELECT
fc.id AS template_id,
fc.name AS template_name,
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
f.field_type
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id AND fc.tenant_id = fff.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fc.tenant_id = %s
AND fc.state = 1
AND fff.state = 1
AND f.state = 1
ORDER BY fc.name, f.field_type, f.name
""", (TENANT_ID,))
relations = cursor.fetchall()
# Group statistics by template
template_stats = {}
for template in templates:
template_stats[template['id']] = {
'template_id': template['id'],
'template_name': template['name'],
'template_code': template.get('template_code'),
'input_fields': [],
'output_fields': [],
'input_count': 0,
'output_count': 0
}
# Populate the field info
for rel in relations:
template_id = rel['template_id']
if template_id in template_stats:
field_info = {
'field_id': rel['field_id'],
'field_name': rel['field_name'],
'field_code': rel['field_code'],
'field_type': rel['field_type']
}
if rel['field_type'] == 1:
template_stats[template_id]['input_fields'].append(field_info)
template_stats[template_id]['input_count'] += 1
elif rel['field_type'] == 2:
template_stats[template_id]['output_fields'].append(field_info)
template_stats[template_id]['output_count'] += 1
# Print the statistics
print(f"\nFound {len(templates)} enabled templates\n")
templates_without_input = []
templates_without_output = []
templates_without_any = []
for template_id, stats in template_stats.items():
status = []
if stats['input_count'] == 0:
status.append("missing input fields")
templates_without_input.append(stats)
if stats['output_count'] == 0:
status.append("missing output fields")
templates_without_output.append(stats)
if stats['input_count'] == 0 and stats['output_count'] == 0:
templates_without_any.append(stats)
status_str = " | ".join(status) if status else "[OK] normal"
print(f" {stats['template_name']} (ID: {stats['template_id']})")
print(f" input fields: {stats['input_count']} | output fields: {stats['output_count']} | {status_str}")
return {
'template_stats': template_stats,
'templates_without_input': templates_without_input,
'templates_without_output': templates_without_output,
'templates_without_any': templates_without_any
}
def check_input_field_relations_detail(conn) -> Dict:
"""Check input-field relations in detail"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("2. Input-field relations in detail")
print("="*80)
# Fetch all enabled input fields
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 1 AND state = 1
ORDER BY name
""", (TENANT_ID,))
input_fields = cursor.fetchall()
# Fetch the templates linked to each input field
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
fc.id AS template_id,
fc.name AS template_name
FROM f_polic_field f
INNER JOIN f_polic_file_field fff ON f.id = fff.filed_id AND f.tenant_id = fff.tenant_id
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE f.tenant_id = %s
AND f.field_type = 1
AND f.state = 1
AND fff.state = 1
AND fc.state = 1
ORDER BY f.name, fc.name
""", (TENANT_ID,))
input_field_relations = cursor.fetchall()
# Group by field
field_template_map = defaultdict(list)
for rel in input_field_relations:
field_template_map[rel['field_id']].append({
'template_id': rel['template_id'],
'template_name': rel['template_name']
})
print(f"\nFound {len(input_fields)} enabled input fields\n")
fields_without_relations = []
fields_with_relations = []
for field in input_fields:
field_id = field['id']
templates = field_template_map.get(field_id, [])
if not templates:
fields_without_relations.append(field)
print(f" [WARN] {field['name']} ({field['filed_code']}) - not linked to any template")
else:
fields_with_relations.append({
'field': field,
'templates': templates
})
print(f" [OK] {field['name']} ({field['filed_code']}) - linked to {len(templates)} templates:")
for template in templates:
print(f" - {template['template_name']} (ID: {template['template_id']})")
return {
'input_fields': input_fields,
'fields_without_relations': fields_without_relations,
'fields_with_relations': fields_with_relations
}
def check_output_field_relations_detail(conn) -> Dict:
"""Check output-field relations in detail"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("3. Output-field relations in detail")
print("="*80)
# Fetch all enabled output fields
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 2 AND state = 1
ORDER BY name
""", (TENANT_ID,))
output_fields = cursor.fetchall()
# Fetch the templates linked to each output field
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
fc.id AS template_id,
fc.name AS template_name
FROM f_polic_field f
INNER JOIN f_polic_file_field fff ON f.id = fff.filed_id AND f.tenant_id = fff.tenant_id
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE f.tenant_id = %s
AND f.field_type = 2
AND f.state = 1
AND fff.state = 1
AND fc.state = 1
ORDER BY f.name, fc.name
""", (TENANT_ID,))
output_field_relations = cursor.fetchall()
# Group by field
field_template_map = defaultdict(list)
for rel in output_field_relations:
field_template_map[rel['field_id']].append({
'template_id': rel['template_id'],
'template_name': rel['template_name']
})
print(f"\nFound {len(output_fields)} enabled output fields\n")
fields_without_relations = []
fields_with_relations = []
for field in output_fields:
field_id = field['id']
templates = field_template_map.get(field_id, [])
if not templates:
fields_without_relations.append(field)
print(f" [WARN] {field['name']} ({field['filed_code']}) - not linked to any template")
else:
fields_with_relations.append({
'field': field,
'templates': templates
})
if len(templates) <= 5:
print(f" [OK] {field['name']} ({field['filed_code']}) - linked to {len(templates)} templates:")
for template in templates:
print(f" - {template['template_name']} (ID: {template['template_id']})")
else:
print(f" [OK] {field['name']} ({field['filed_code']}) - linked to {len(templates)} templates")
for template in templates[:3]:
print(f" - {template['template_name']} (ID: {template['template_id']})")
print(f" ... and {len(templates) - 3} more templates")
return {
'output_fields': output_fields,
'fields_without_relations': fields_without_relations,
'fields_with_relations': fields_with_relations
}
def check_invalid_relations(conn) -> Dict:
"""Check for invalid relations"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("4. Invalid relations")
print("="*80)
# Relations pointing at a nonexistent file_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.id IS NULL
""", (TENANT_ID,))
invalid_file_relations = cursor.fetchall()
# Relations pointing at a nonexistent filed_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.id IS NULL
""", (TENANT_ID,))
invalid_field_relations = cursor.fetchall()
print(f"\nRelations pointing at a nonexistent file_id: {len(invalid_file_relations)}")
if invalid_file_relations:
print(" Details:")
for rel in invalid_file_relations[:10]:
print(f" - relation ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_file_relations) > 10:
print(f" ... and {len(invalid_file_relations) - 10} more")
else:
print(" [OK] No invalid file_id relations")
print(f"\nRelations pointing at a nonexistent filed_id: {len(invalid_field_relations)}")
if invalid_field_relations:
print(" Details:")
for rel in invalid_field_relations[:10]:
print(f" - relation ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_field_relations) > 10:
print(f" ... and {len(invalid_field_relations) - 10} more")
else:
print(" [OK] No invalid filed_id relations")
return {
'invalid_file_relations': invalid_file_relations,
'invalid_field_relations': invalid_field_relations
}
def get_summary_statistics(conn) -> Dict:
"""Collect summary statistics"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("5. Summary statistics")
print("="*80)
# Total relations
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field
WHERE tenant_id = %s AND state = 1
""", (TENANT_ID,))
total_relations = cursor.fetchone()['total']
# Input-field relations
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 1
""", (TENANT_ID,))
input_relations = cursor.fetchone()['total']
# Output-field relations
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 2
""", (TENANT_ID,))
output_relations = cursor.fetchone()['total']
# Number of templates with relations
cursor.execute("""
SELECT COUNT(DISTINCT file_id) as total
FROM f_polic_file_field
WHERE tenant_id = %s AND state = 1
""", (TENANT_ID,))
related_templates = cursor.fetchone()['total']
# Number of input fields with relations
cursor.execute("""
SELECT COUNT(DISTINCT fff.filed_id) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 1
""", (TENANT_ID,))
related_input_fields = cursor.fetchone()['total']
# Number of output fields with relations
cursor.execute("""
SELECT COUNT(DISTINCT fff.filed_id) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 2
""", (TENANT_ID,))
related_output_fields = cursor.fetchone()['total']
print(f"\nTotal relations: {total_relations}")
print(f" input-field relations: {input_relations}")
print(f" output-field relations: {output_relations}")
print(f"\nTemplates with relations: {related_templates}")
print(f"Input fields with relations: {related_input_fields}")
print(f"Output fields with relations: {related_output_fields}")
return {
'total_relations': total_relations,
'input_relations': input_relations,
'output_relations': output_relations,
'related_templates': related_templates,
'related_input_fields': related_input_fields,
'related_output_fields': related_output_fields
}
def main():
"""Entry point"""
print("="*80)
print("Comprehensive check of f_polic_file_field relations")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("[OK] Database connection established\n")
except Exception as e:
print(f"[ERROR] Database connection failed: {e}")
return
try:
# 1. Field relations for every template
template_result = check_all_templates_field_relations(conn)
# 2. Input-field relations in detail
input_result = check_input_field_relations_detail(conn)
# 3. Output-field relations in detail
output_result = check_output_field_relations_detail(conn)
# 4. Invalid relations
invalid_result = check_invalid_relations(conn)
# 5. Summary statistics
stats = get_summary_statistics(conn)
# Wrap-up
print("\n" + "="*80)
print("Check summary")
print("="*80)
issues = []
if len(template_result['templates_without_input']) > 0:
issues.append(f"[WARN] {len(template_result['templates_without_input'])} templates have no input-field relations")
if len(template_result['templates_without_output']) > 0:
issues.append(f"[WARN] {len(template_result['templates_without_output'])} templates have no output-field relations")
if len(template_result['templates_without_any']) > 0:
issues.append(f"[WARN] {len(template_result['templates_without_any'])} templates have no field relations at all")
if len(input_result['fields_without_relations']) > 0:
issues.append(f"[WARN] {len(input_result['fields_without_relations'])} input fields are not linked to any template")
if len(output_result['fields_without_relations']) > 0:
issues.append(f"[WARN] {len(output_result['fields_without_relations'])} output fields are not linked to any template")
if len(invalid_result['invalid_file_relations']) > 0:
issues.append(f"[ERROR] {len(invalid_result['invalid_file_relations'])} invalid file_id relations")
if len(invalid_result['invalid_field_relations']) > 0:
issues.append(f"[ERROR] {len(invalid_result['invalid_field_relations'])} invalid filed_id relations")
if issues:
print("\nIssues found:\n")
for issue in issues:
print(f" {issue}")
else:
print("\n[OK] No obvious problems found")
print("\n" + "="*80)
except Exception as e:
print(f"\n[ERROR] Error during check: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\nDatabase connection closed")
if __name__ == '__main__':
main()

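The detail checks above group result rows into `field_template_map` with `defaultdict(list)`; the same pattern in isolation, with made-up rows standing in for what the DictCursor returns:

```python
from collections import defaultdict

# Rows as a DictCursor would return them (values are made up)
rows = [
    {"field_id": 1, "template_name": "A"},
    {"field_id": 1, "template_name": "B"},
    {"field_id": 2, "template_name": "A"},
]

# Missing keys start as an empty list, so no existence check is needed
field_template_map = defaultdict(list)
for rel in rows:
    field_template_map[rel["field_id"]].append(rel["template_name"])

print(dict(field_template_map))  # {1: ['A', 'B'], 2: ['A']}
```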
88
check_relations_query.py Normal file

@ -0,0 +1,88 @@
"""
Check the relation-query logic
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_relations():
"""Check the relation queries"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# Inspect the relations of one specific template
template_id = 1765273962716807 # 走读式谈话流程
print(f"Checking template ID: {template_id}")
# Approach 1: the query the API currently uses
print("\nApproach 1: query the API currently uses (with INNER JOIN and state=1):")
cursor.execute("""
SELECT fff.file_id, fff.filed_id, fff.state as relation_state, fc.state as template_state
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fff.state = 1 AND fff.file_id = %s
""", (TENANT_ID, template_id))
results1 = cursor.fetchall()
print(f" rows: {len(results1)}")
for r in results1[:5]:
print(f" file_id: {r['file_id']}, filed_id: {r['filed_id']}, relation_state: {r['relation_state']}, template_state: {r['template_state']}")
# Approach 2: query only the relation table, ignoring template state
print("\nApproach 2: relation table only (template state not checked):")
cursor.execute("""
SELECT fff.file_id, fff.filed_id, fff.state as relation_state
FROM f_polic_file_field fff
WHERE fff.tenant_id = %s AND fff.state = 1 AND fff.file_id = %s
""", (TENANT_ID, template_id))
results2 = cursor.fetchall()
print(f" rows: {len(results2)}")
for r in results2[:5]:
print(f" file_id: {r['file_id']}, filed_id: {r['filed_id']}, relation_state: {r['relation_state']}")
# Approach 3: check whether the template exists and is enabled
print("\nApproach 3: template status:")
cursor.execute("""
SELECT id, name, state
FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, template_id))
template = cursor.fetchone()
if template:
print(f" template exists: {template['name']}, state: {template['state']}")
else:
print(f" template does not exist")
# Approach 4: all relations, including state=0
print("\nApproach 4: all relations (including disabled):")
cursor.execute("""
SELECT fff.file_id, fff.filed_id, fff.state as relation_state
FROM f_polic_file_field fff
WHERE fff.tenant_id = %s AND fff.file_id = %s
""", (TENANT_ID, template_id))
results4 = cursor.fetchall()
print(f" rows: {len(results4)}")
enabled = [r for r in results4 if r['relation_state'] == 1]
disabled = [r for r in results4 if r['relation_state'] == 0]
print(f" enabled: {len(enabled)}, disabled: {len(disabled)}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_relations()

131
check_remaining_fields.py Normal file

@ -0,0 +1,131 @@
"""
Check the remaining unprocessed fields and generate suitable field_code values.
"""
import os
import pymysql
import re
from typing import Dict, List
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def is_chinese(text: str) -> bool:
"""Return True if the string contains Chinese characters"""
if not text:
return False
return bool(re.search(r'[\u4e00-\u9fff]', text))
def generate_field_code(field_name: str) -> str:
"""Generate a field_code from the field name"""
# Strip common prefixes
name = field_name.replace('被核查人员', 'target_').replace('被核查人', 'target_')
# Lowercase and replace special characters
code = name.lower()
code = re.sub(r'[^\w\u4e00-\u9fff]', '_', code)
code = re.sub(r'_+', '_', code).strip('_')
# If the code is still Chinese, try a smarter conversion
if is_chinese(code):
# Simple keyword mapping (illustrative only; a real implementation should use a pinyin library)
# For now fall back to simpler rules
code = field_name.lower()
code = code.replace('被核查人员', 'target_')
code = code.replace('被核查人', 'target_')
code = code.replace('谈话', 'interview_')
code = code.replace('审批', 'approval_')
code = code.replace('核查', 'investigation_')
code = code.replace('人员', '')
code = code.replace('时间', '_time')
code = code.replace('地点', '_location')
code = code.replace('部门', '_department')
code = code.replace('姓名', '_name')
code = code.replace('号码', '_number')
code = code.replace('情况', '_situation')
code = code.replace('问题', '_issue')
code = code.replace('描述', '_description')
code = re.sub(r'[^\w]', '_', code)
code = re.sub(r'_+', '_', code).strip('_')
return code
def check_remaining_fields():
"""Check the remaining unprocessed fields"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("Checking remaining unprocessed fields")
print("="*80)
# Fields whose field_code is still Chinese, NULL, or empty.
# Note: MySQL string literals drop unrecognized backslash escapes, so a '\\u4e00'
# pattern would reach the regex engine as 'u4e00'. The ICU \x{...} form below
# assumes MySQL 8+; older servers need a different non-ASCII test.
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s AND (
filed_code REGEXP '[\\\\x{4e00}-\\\\x{9fff}]'
OR filed_code IS NULL
OR filed_code = ''
)
ORDER BY name
""", (TENANT_ID,))
fields = cursor.fetchall()
print(f"\nFound {len(fields)} fields that still need attention:\n")
suggestions = []
for field in fields:
suggested_code = generate_field_code(field['name'])
suggestions.append({
'id': field['id'],
'name': field['name'],
'current_code': field['filed_code'],
'suggested_code': suggested_code,
'field_type': field['field_type']
})
print(f" ID: {field['id']}")
print(f" name: {field['name']}")
print(f" current field_code: {field['filed_code']}")
print(f" suggested field_code: {suggested_code}")
print(f" field_type: {field['field_type']}")
print()
# Ask whether to apply the updates
if suggestions:
print("="*80)
choice = input("Update these field_code values? (y/n, default n): ").strip().lower()
if choice == 'y':
print("\nApplying updates...")
for sug in suggestions:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s
""", (sug['suggested_code'], 655162080928945152, sug['id']))
print(f" ✓ updated field ID {sug['id']}: {sug['name']} -> {sug['suggested_code']}")
conn.commit()
print("\n✓ Update complete")
else:
print("No updates applied")
cursor.close()
conn.close()
if __name__ == '__main__':
check_remaining_fields()

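generate_field_code above amounts to a keyword substitution followed by regex normalization. A trimmed, testable sketch of that idea (the function name and shortened keyword map are illustrative, not the script's full mapping):

```python
import re

def slugify_field_name(name, keyword_map=None):
    """Simplified sketch of generate_field_code: map known keywords, then normalize."""
    keyword_map = keyword_map or {"时间": "_time", "姓名": "_name"}  # shortened for illustration
    code = name.lower()
    for zh, en in keyword_map.items():
        code = code.replace(zh, en)
    code = re.sub(r"[^\w]", "_", code)          # non-word characters -> underscore
    return re.sub(r"_+", "_", code).strip("_")  # collapse runs, trim edges

# Note: \w matches CJK characters in Python 3, so unmapped Chinese survives
print(slugify_field_name("谈话时间"))  # 谈话_time
```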

@ -0,0 +1,198 @@
"""
Check the relations of a specific template
"""
import pymysql
import os
import re
from pathlib import Path
from docx import Document
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
TEMPLATE_NAME = "1.请示报告卡(初核谈话)"
TEMPLATE_FILE = "template_finish/2-初核模版/2.谈话审批/走读式谈话审批/1.请示报告卡(初核谈话).docx"
def extract_placeholders_from_docx(file_path: str):
"""Extract all placeholders from a .docx file"""
placeholders = set()
pattern = r'\{\{([^}]+)\}\}'
try:
doc = Document(file_path)
# Extract placeholders from paragraphs
for paragraph in doc.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
cleaned = match.strip()
if cleaned and '{' not in cleaned and '}' not in cleaned:
placeholders.add(cleaned)
# Extract placeholders from tables
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
cleaned = match.strip()
if cleaned and '{' not in cleaned and '}' not in cleaned:
placeholders.add(cleaned)
except Exception as e:
print(f"Error: failed to read file - {str(e)}")
return []
return sorted(list(placeholders))
def check_template():
"""Check the template's relations"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
print(f"Checking template: {TEMPLATE_NAME}")
print("=" * 80)
# 1. Extract placeholders from the document
print("\n1. Placeholders extracted from the document:")
if not Path(TEMPLATE_FILE).exists():
print(f" File not found: {TEMPLATE_FILE}")
return
placeholders = extract_placeholders_from_docx(TEMPLATE_FILE)
print(f" placeholder count: {len(placeholders)}")
print(f" placeholders: {placeholders}")
# 2. Look up the template ID
print(f"\n2. Template ID lookup:")
cursor.execute("""
SELECT id, name
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
""", (TENANT_ID, TEMPLATE_NAME))
template = cursor.fetchone()
if not template:
print(f" Template does not exist")
return
template_id = template['id']
print(f" template ID: {template_id}")
# 3. Load the field map
print(f"\n3. Field map:")
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
""", (TENANT_ID,))
fields = cursor.fetchall()
field_map = {}
for field in fields:
state = field['state']
if isinstance(state, bytes):
state = int.from_bytes(state, byteorder='big') if len(state) == 1 else 1
field_map[field['filed_code']] = {
'id': field['id'],
'name': field['name'],
'field_type': field['field_type'],
'state': state
}
print(f" total fields: {len(field_map)}")
# 4. Match placeholders to fields
print(f"\n4. Matching placeholders to fields:")
input_field_ids = []
output_field_ids = []
not_found = []
for placeholder in placeholders:
if placeholder in field_map:
field_info = field_map[placeholder]
if field_info['state'] == 1:
if field_info['field_type'] == 1:
input_field_ids.append(field_info['id'])
elif field_info['field_type'] == 2:
output_field_ids.append(field_info['id'])
else:
not_found.append(placeholder)
# Add the required input fields
required_input_fields = ['clue_info', 'target_basic_info_clue']
for req_field in required_input_fields:
if req_field in field_map:
field_info = field_map[req_field]
if field_info['state'] == 1 and field_info['id'] not in input_field_ids:
input_field_ids.append(field_info['id'])
print(f" input field IDs: {input_field_ids}")
print(f" output field IDs: {output_field_ids}")
if not_found:
print(f" unmatched placeholders: {not_found}")
# 5. Query the relations stored in the database
print(f"\n5. Relations stored in the database:")
cursor.execute("""
SELECT fff.filed_id, fff.state, f.name, f.field_type
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s
""", (TENANT_ID, template_id))
db_relations = cursor.fetchall()
db_input_ids = []
db_output_ids = []
for rel in db_relations:
state = rel['state']
if isinstance(state, bytes):
state = int.from_bytes(state, byteorder='big') if len(state) == 1 else 1
if state == 1:
if rel['field_type'] == 1:
db_input_ids.append(rel['filed_id'])
elif rel['field_type'] == 2:
db_output_ids.append(rel['filed_id'])
print(f" input field IDs in DB: {sorted(db_input_ids)}")
print(f" output field IDs in DB: {sorted(db_output_ids)}")
# 6. Compare
print(f"\n6. Comparison:")
expected_input = set(input_field_ids)
expected_output = set(output_field_ids)
actual_input = set(db_input_ids)
actual_output = set(db_output_ids)
print(f" input fields - expected: {sorted(expected_input)}, actual: {sorted(actual_input)}")
print(f" input fields match: {expected_input == actual_input}")
print(f" output fields - expected: {sorted(expected_output)}, actual: {sorted(actual_output)}")
print(f" output fields match: {expected_output == actual_output}")
if expected_output != actual_output:
missing = expected_output - actual_output
extra = actual_output - expected_output
print(f" missing output fields: {sorted(missing)}")
print(f" extra output fields: {sorted(extra)}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_template()

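The `{{...}}` extraction above can be exercised without a .docx file by running the same regex over plain text (the sample text and function name below are made up):

```python
import re

PATTERN = r"\{\{([^}]+)\}\}"

def extract_placeholders(text):
    """Collect unique, cleaned placeholder names, as extract_placeholders_from_docx does per paragraph."""
    found = set()
    for match in re.findall(PATTERN, text):
        cleaned = match.strip()
        # Drop matches that still contain braces (nested/broken placeholders)
        if cleaned and "{" not in cleaned and "}" not in cleaned:
            found.add(cleaned)
    return sorted(found)

print(extract_placeholders("姓名:{{target_name}},时间:{{ interview_time }}"))
# ['interview_time', 'target_name']
```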

@ -0,0 +1,98 @@
"""
Check all of a template's relations, including disabled ones.
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
TEMPLATE_ID = 1765432134276990 # 1.请示报告卡(初核谈话)
def check_all_relations():
"""Check all of the template's relations"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
print(f"Checking template ID: {TEMPLATE_ID}")
print("=" * 80)
# Look up the template
cursor.execute("""
SELECT id, name, state
FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, TEMPLATE_ID))
template = cursor.fetchone()
if template:
print(f"Template name: {template['name']}")
print(f"Template state: {template['state']}")
else:
print("Template does not exist")
return
# Fetch all relations, including state=0
cursor.execute("""
SELECT
fff.file_id,
fff.filed_id,
fff.state as relation_state,
f.name as field_name,
f.field_type,
f.state as field_state
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s
ORDER BY f.field_type, f.name
""", (TENANT_ID, TEMPLATE_ID))
all_relations = cursor.fetchall()
print(f"\n所有关联关系数: {len(all_relations)}")
# Group by relation state
enabled_relations = [r for r in all_relations if r['relation_state'] == 1 or (isinstance(r['relation_state'], bytes) and r['relation_state'] == b'\x01')]
disabled_relations = [r for r in all_relations if r not in enabled_relations]
print(f"启用的关联关系: {len(enabled_relations)}")
print(f"未启用的关联关系: {len(disabled_relations)}")
# Group by field type
input_fields = [r for r in enabled_relations if r['field_type'] == 1]
output_fields = [r for r in enabled_relations if r['field_type'] == 2]
print(f"\n启用的输入字段关联: {len(input_fields)}")
for r in input_fields:
state_str = str(r['relation_state']) if not isinstance(r['relation_state'], bytes) else 'bytes'
print(f" - {r['field_name']} (ID: {r['filed_id']}, relation_state: {state_str}, field_state: {r['field_state']})")
print(f"\n启用的输出字段关联: {len(output_fields)}")
for r in output_fields[:10]:
state_str = str(r['relation_state']) if not isinstance(r['relation_state'], bytes) else 'bytes'
print(f" - {r['field_name']} (ID: {r['filed_id']}, relation_state: {state_str}, field_state: {r['field_state']})")
if len(output_fields) > 10:
print(f" ... 还有 {len(output_fields) - 10} 个输出字段")
# Inspect the disabled relations
if disabled_relations:
print(f"\n未启用的关联关系: {len(disabled_relations)}")
disabled_input = [r for r in disabled_relations if r['field_type'] == 1]
disabled_output = [r for r in disabled_relations if r['field_type'] == 2]
print(f" 输入字段: {len(disabled_input)}, 输出字段: {len(disabled_output)}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_all_relations()
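The dual check on `relation_state` above (`== 1` or `== b'\x01'`) exists because MySQL `BIT(1)` columns come back from PyMySQL as one-byte `bytes`, while `TINYINT` columns arrive as plain `int`. A small helper (hypothetical, not part of the script) collapses both representations:

```python
def normalize_state(value) -> int:
    # PyMySQL returns BIT(1) columns as bytes like b'\x01', while
    # TINYINT columns arrive as int; collapsing both to int avoids the
    # dual check used in check_all_relations().
    if isinstance(value, (bytes, bytearray)):
        return int.from_bytes(value, 'big')
    return int(value)

print(normalize_state(b'\x01'), normalize_state(1), normalize_state(b'\x00'))  # 1 1 0
```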


@ -0,0 +1,76 @@
"""
Check which templates have output-field relations.
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_templates_with_output_fields():
"""检查哪些模板有输出字段关联"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# Query all templates together with their related field counts
cursor.execute("""
SELECT
fc.id as template_id,
fc.name as template_name,
COUNT(CASE WHEN f.field_type = 2 THEN 1 END) as output_field_count,
COUNT(CASE WHEN f.field_type = 1 THEN 1 END) as input_field_count,
COUNT(*) as total_field_count
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id AND fc.tenant_id = fff.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fc.tenant_id = %s
AND fff.state = 1
AND fc.state = 1
GROUP BY fc.id, fc.name
HAVING output_field_count > 0
ORDER BY output_field_count DESC
LIMIT 10
""", (TENANT_ID,))
templates = cursor.fetchall()
print(f"有输出字段关联的模板前10个:")
print("=" * 80)
for t in templates:
print(f"\n模板: {t['template_name']} (ID: {t['template_id']})")
print(f" 输入字段: {t['input_field_count']}, 输出字段: {t['output_field_count']}, 总计: {t['total_field_count']}")
# Fetch a few of this template's output fields
cursor.execute("""
SELECT f.id, f.name, f.filed_code
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.file_id = %s
AND fff.state = 1
AND f.field_type = 2
LIMIT 5
""", (TENANT_ID, t['template_id']))
output_fields = cursor.fetchall()
print(f" 输出字段示例前5个:")
for f in output_fields:
print(f" - {f['name']} (ID: {f['id']}, code: {f['filed_code']})")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_templates_with_output_fields()


@ -0,0 +1,874 @@
"""
清理并重新同步模板数据到指定数据库
功能
1. 清理指定tenant_id下的旧数据包括MinIO路径的数据
2. 清理相关的字段关联关系
3. 重新扫描template_finish/目录
4. 重新创建/更新模板数据
5. 重新建立字段关联关系
使用方法
python clean_and_resync_templates.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
import os
import sys
import pymysql
import argparse
from pathlib import Path
from typing import Dict, List, Set, Optional
import re
from docx import Document
import getpass
# Force UTF-8 output encoding (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
# Project root directory
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def generate_id():
"""生成ID"""
import time
return int(time.time() * 1000000)
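Since `generate_id()` derives IDs purely from the microsecond clock, two calls within the same microsecond (easy in the tight insert loops later in this script) can return the same value. A hypothetical collision-safe variant, not part of the script, mixes in a process-local counter:

```python
import itertools
import time

_seq = itertools.count()

def generate_unique_id() -> int:
    # Hypothetical variant of generate_id(): microsecond timestamp plus a
    # process-local sequence number, so IDs minted in a tight loop stay
    # unique and still fit in a signed 64-bit BIGINT column.
    return int(time.time() * 1_000_000) * 1000 + next(_seq) % 1000

ids = [generate_unique_id() for _ in range(1000)]
print(len(ids) == len(set(ids)))  # True: no duplicates
```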
def get_db_config_from_args() -> Optional[Dict]:
"""从命令行参数获取数据库配置"""
parser = argparse.ArgumentParser(
description='清理并重新同步模板数据到指定数据库',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
示例
python clean_and_resync_templates.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
)
parser.add_argument('--host', type=str, required=True, help='MySQL服务器地址')
parser.add_argument('--port', type=int, required=True, help='MySQL服务器端口')
parser.add_argument('--user', type=str, required=True, help='MySQL用户名')
parser.add_argument('--password', type=str, required=True, help='MySQL密码')
parser.add_argument('--database', type=str, required=True, help='数据库名称')
parser.add_argument('--tenant-id', type=int, required=True, help='租户ID')
parser.add_argument('--dry-run', action='store_true', help='预览模式(不实际更新数据库)')
parser.add_argument('--skip-clean', action='store_true', help='跳过清理步骤(只同步)')
args = parser.parse_args()
return {
'host': args.host,
'port': args.port,
'user': args.user,
'password': args.password,
'database': args.database,
'charset': 'utf8mb4',
'tenant_id': args.tenant_id,
'dry_run': args.dry_run,
'skip_clean': args.skip_clean
}
def test_db_connection(config: Dict) -> Optional[pymysql.Connection]:
"""测试数据库连接"""
try:
conn = pymysql.connect(
host=config['host'],
port=config['port'],
user=config['user'],
password=config['password'],
database=config['database'],
charset=config['charset']
)
return conn
except Exception as e:
print_result(False, f"数据库连接失败: {str(e)}")
return None
def scan_local_templates() -> Dict[str, Path]:
"""扫描本地template_finish目录返回file_path -> Path的映射"""
templates = {}
if not TEMPLATES_DIR.exists():
return templates
for item in TEMPLATES_DIR.rglob("*"):
if item.is_file() and item.suffix.lower() in ['.docx', '.doc']:
rel_path = item.relative_to(PROJECT_ROOT)
rel_path_str = str(rel_path).replace('\\', '/')
templates[rel_path_str] = item
return templates
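`scan_local_templates()` keys its mapping by a forward-slash relative path so a `file_path` recorded on Windows still matches on Linux. The normalization step in isolation, using pure paths only (no filesystem access):

```python
from pathlib import PurePosixPath, PureWindowsPath

# On Windows, relative_to() yields backslash-separated paths; the script
# rewrites them to forward slashes before using them as dictionary keys.
win_rel = PureWindowsPath(r"template_finish\初核\请示报告卡.docx")
normalized = str(win_rel).replace("\\", "/")
print(normalized)  # template_finish/初核/请示报告卡.docx
assert normalized == str(PurePosixPath(*win_rel.parts))
```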
def clean_old_data(conn, tenant_id: int, local_templates: Dict[str, Path], dry_run: bool = False):
"""清理旧数据"""
print_section("清理旧数据")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. Fetch all templates
cursor.execute("""
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
""", (tenant_id,))
all_templates = cursor.fetchall()
print(f" 数据库中的模板总数: {len(all_templates)}")
# 2. Identify templates to delete
to_delete = []
minio_paths = []
invalid_paths = []
duplicate_paths = []
# Count occurrences of each file_path
path_count = {}
for template in all_templates:
file_path = template.get('file_path')
if file_path:
if file_path not in path_count:
path_count[file_path] = []
path_count[file_path].append(template)
for template in all_templates:
file_path = template.get('file_path')
template_id = template['id']
# Is it a MinIO-style path?
if file_path and ('minio' in file_path.lower() or file_path.startswith('http://') or file_path.startswith('https://')):
minio_paths.append(template)
to_delete.append(template_id)
continue
# Does the file path exist locally?
if file_path:
if file_path not in local_templates:
invalid_paths.append(template)
to_delete.append(template_id)
continue
# Is the path duplicated?
if len(path_count.get(file_path, [])) > 1:
# Keep the first record, delete the rest
if template != path_count[file_path][0]:
duplicate_paths.append(template)
to_delete.append(template_id)
continue
# 3. Summarize what will be deleted
print(f"\n 需要删除的模板:")
print(f" - MinIO路径的模板: {len(minio_paths)}")
print(f" - 无效路径的模板: {len(invalid_paths)}")
print(f" - 重复路径的模板: {len(duplicate_paths)}")
print(f" - 总计: {len(to_delete)}")
if to_delete and not dry_run:
# 4. Delete field relations
print("\n 删除字段关联关系...")
if to_delete:
placeholders = ','.join(['%s'] * len(to_delete))
delete_relations_sql = f"""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id IN ({placeholders})
"""
cursor.execute(delete_relations_sql, [tenant_id] + to_delete)
deleted_relations = cursor.rowcount
print(f" 删除了 {deleted_relations} 条字段关联关系")
# 5. Soft-delete template records
print("\n 删除模板记录...")
delete_templates_sql = f"""
UPDATE f_polic_file_config
SET state = 0, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s
AND id IN ({placeholders})
"""
cursor.execute(delete_templates_sql, [UPDATED_BY, tenant_id] + to_delete)
deleted_templates = cursor.rowcount
print(f" 删除了 {deleted_templates} 个模板记录标记为state=0")
conn.commit()
print_result(True, f"清理完成:删除了 {deleted_templates} 个模板记录")
elif to_delete:
print("\n [预览模式] 将删除上述模板记录")
else:
print_result(True, "没有需要清理的数据")
return {
'total': len(all_templates),
'deleted': len(to_delete),
'minio_paths': len(minio_paths),
'invalid_paths': len(invalid_paths),
'duplicate_paths': len(duplicate_paths)
}
finally:
cursor.close()
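`clean_old_data()` builds its variable-length `IN (...)` clause by formatting only a list of `%s` placeholders into the SQL string; the IDs themselves still travel as bound parameters. The pattern in isolation:

```python
# Variable-length parameterized IN clause, as used in clean_old_data():
# only the placeholder list is interpolated, never the values themselves.
to_delete = [1765432134276990, 1765432134276991]
tenant_id = 615873064429507639

placeholders = ','.join(['%s'] * len(to_delete))
sql = (
    "DELETE FROM f_polic_file_field "
    f"WHERE tenant_id = %s AND file_id IN ({placeholders})"
)
params = [tenant_id] + to_delete
print(sql)
print(params)
```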
def scan_directory_structure(base_dir: Path) -> Dict:
"""扫描目录结构"""
directories = []
files = []
def scan_recursive(current_path: Path, parent_path: Optional[str] = None):
"""递归扫描目录"""
if not current_path.exists() or not current_path.is_dir():
return
# Relative path
rel_path = current_path.relative_to(base_dir)
rel_path_str = str(rel_path).replace('\\', '/')
# Add a directory node
if rel_path_str != '.':
directories.append({
'name': current_path.name,
'path': rel_path_str,
'parent_path': parent_path
})
# Scan children
for item in sorted(current_path.iterdir()):
if item.is_dir():
scan_recursive(item, rel_path_str)
elif item.is_file() and item.suffix.lower() in ['.docx', '.doc']:
file_rel_path = item.relative_to(base_dir)
file_rel_path_str = str(file_rel_path).replace('\\', '/')
files.append({
'name': item.name,
'path': file_rel_path_str,
'parent_path': rel_path_str if rel_path_str != '.' else None
})
scan_recursive(base_dir)
return {
'directories': directories,
'files': files
}
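A minimal check of the `parent_path` bookkeeping used by `scan_directory_structure()`: files directly under the base directory get `parent_path=None`, nested files get their directory's relative path with forward slashes. This sketch builds a throwaway tree and replays just that logic:

```python
import tempfile
from pathlib import Path

# Build a tiny temporary tree and record each .docx file's parent path
# the same way scan_directory_structure() does.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "初核").mkdir()
    (base / "初核" / "a.docx").write_bytes(b"")
    (base / "b.docx").write_bytes(b"")

    parent_of = {}
    for item in base.rglob("*.docx"):
        rel = item.relative_to(base)
        parent = str(rel.parent).replace("\\", "/")
        parent_of[str(rel).replace("\\", "/")] = None if parent == "." else parent

    print(parent_of)
```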
def get_existing_templates(conn, tenant_id: int) -> Dict:
"""获取现有模板只获取state=1的"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
""", (tenant_id,))
templates = cursor.fetchall()
result = {
'by_path': {},
'by_name': {},
'by_id': {}
}
for t in templates:
result['by_id'][t['id']] = t
if t['file_path']:
result['by_path'][t['file_path']] = t
else:
name = t['name']
if name not in result['by_name']:
result['by_name'][name] = []
result['by_name'][name].append(t)
return result
finally:
cursor.close()
def sync_template_hierarchy(conn, tenant_id: int, dry_run: bool = False):
"""同步模板层级结构"""
print_section("同步模板层级结构")
# 1. Scan the directory structure
print("1. 扫描目录结构...")
structure = scan_directory_structure(TEMPLATES_DIR)
print_result(True, f"找到 {len(structure['directories'])} 个目录,{len(structure['files'])} 个文件")
if not structure['directories'] and not structure['files']:
print_result(False, "未找到任何目录或文件")
return None
# 2. Fetch existing templates
print("\n2. 获取现有模板...")
existing_templates = get_existing_templates(conn, tenant_id)
print_result(True, f"找到 {len(existing_templates['by_path'])} 个文件模板,{len(existing_templates['by_name'])} 个目录模板")
# 3. Create/update directory nodes
print("\n3. 创建/更新目录节点...")
path_to_id = {}
dir_created = 0
dir_updated = 0
for dir_info in structure['directories']:
parent_id = None
if dir_info['parent_path']:
parent_id = path_to_id.get(dir_info['parent_path'])
existing = None
candidates = existing_templates['by_name'].get(dir_info['name'], [])
for candidate in candidates:
if candidate.get('parent_id') == parent_id and not candidate.get('file_path'):
existing = candidate
break
if existing:
dir_id = existing['id']
if existing.get('parent_id') != parent_id:
dir_updated += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (parent_id, UPDATED_BY, dir_id, tenant_id))
conn.commit()
cursor.close()
else:
dir_id = generate_id()
dir_created += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NULL, NOW(), %s, NOW(), %s, 1)
""", (dir_id, tenant_id, parent_id, dir_info['name'], CREATED_BY, UPDATED_BY))
conn.commit()
cursor.close()
path_to_id[dir_info['path']] = dir_id
print_result(True, f"创建 {dir_created} 个目录,更新 {dir_updated} 个目录")
# 4. Create/update file nodes
print("\n4. 创建/更新文件节点...")
file_created = 0
file_updated = 0
for file_info in structure['files']:
parent_id = None
if file_info['parent_path']:
parent_id = path_to_id.get(file_info['parent_path'])
existing = existing_templates['by_path'].get(file_info['path'])
if existing:
file_id = existing['id']
if existing.get('parent_id') != parent_id or existing.get('name') != file_info['name']:
file_updated += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, name = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (parent_id, file_info['name'], UPDATED_BY, file_id, tenant_id))
conn.commit()
cursor.close()
else:
file_id = generate_id()
file_created += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (file_id, tenant_id, parent_id, file_info['name'], file_info['path'], CREATED_BY, UPDATED_BY))
conn.commit()
cursor.close()
print_result(True, f"创建 {file_created} 个文件,更新 {file_updated} 个文件")
return {
'directories_created': dir_created,
'directories_updated': dir_updated,
'files_created': file_created,
'files_updated': file_updated
}
def get_input_fields(conn, tenant_id: int) -> Dict[str, int]:
"""获取输入字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, filed_code, name
FROM f_polic_field
WHERE tenant_id = %s
AND field_type = 1
AND filed_code IN ('clue_info', 'target_basic_info_clue')
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
fields = cursor.fetchall()
result = {}
for field in fields:
result[field['filed_code']] = field['id']
return result
finally:
cursor.close()
def get_output_fields(conn, tenant_id: int) -> Dict[str, int]:
"""获取所有输出字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, filed_code, name
FROM f_polic_field
WHERE tenant_id = %s
AND field_type = 2
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
fields = cursor.fetchall()
result = {}
for field in fields:
result[field['filed_code']] = field['id']
return result
finally:
cursor.close()
def extract_placeholders_from_docx(file_path: Path) -> Set[str]:
"""从docx文件中提取所有占位符"""
placeholders = set()
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')
try:
doc = Document(file_path)
# Extract from paragraphs
for paragraph in doc.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
# Extract from tables
for table in doc.tables:
try:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
except Exception:
# Skip tables whose structure python-docx cannot iterate
continue
except Exception:
# Unreadable document: return whatever placeholders were collected
pass
return placeholders
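The `{{field_code}}` matching that `extract_placeholders_from_docx()` applies to every paragraph, run here against a plain string so the pattern's behavior is easy to verify (no .docx file required):

```python
import re

# Same pattern as in extract_placeholders_from_docx(); whitespace inside
# the braces is stripped and empty placeholders never match.
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')

text = "姓名:{{target_name}} 单位:{{ target_unit }} 空占位符:{{}}"
placeholders = {m.strip() for m in placeholder_pattern.findall(text) if m.strip()}
print(sorted(placeholders))  # ['target_name', 'target_unit']
```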
def create_missing_input_field(conn, tenant_id: int, field_code: str) -> Optional[int]:
"""创建缺失的输入字段"""
cursor = conn.cursor()
try:
field_id = generate_id()
field_name_map = {
'clue_info': '线索信息',
'target_basic_info_clue': '被核查人基本信息(线索)'
}
field_name = field_name_map.get(field_code, field_code.replace('_', ' '))
insert_sql = """
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
"""
cursor.execute(insert_sql, (
field_id,
tenant_id,
field_name,
field_code,
1,
CREATED_BY,
UPDATED_BY
))
conn.commit()
return field_id
except Exception as e:
conn.rollback()
return None
finally:
cursor.close()
def create_missing_output_field(conn, tenant_id: int, field_code: str) -> Optional[int]:
"""创建缺失的输出字段"""
cursor = conn.cursor()
try:
# Check whether it already exists
check_cursor = conn.cursor(pymysql.cursors.DictCursor)
check_cursor.execute("""
SELECT id FROM f_polic_field
WHERE tenant_id = %s AND filed_code = %s
""", (tenant_id, field_code))
existing = check_cursor.fetchone()
check_cursor.close()
if existing:
return existing['id']
# Create the new field
field_id = generate_id()
field_name = field_code.replace('_', ' ')
insert_sql = """
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
"""
cursor.execute(insert_sql, (
field_id,
tenant_id,
field_name,
field_code,
2,
CREATED_BY,
UPDATED_BY
))
conn.commit()
return field_id
except Exception as e:
conn.rollback()
return None
finally:
cursor.close()
def get_existing_relations(conn, tenant_id: int, file_id: int) -> Set[int]:
"""获取模板的现有关联关系"""
cursor = conn.cursor()
try:
sql = """
SELECT filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id = %s
AND state = 1
"""
cursor.execute(sql, (tenant_id, file_id))
results = cursor.fetchall()
return {row[0] for row in results}
finally:
cursor.close()
def sync_field_relations(conn, tenant_id: int, dry_run: bool = False):
"""同步字段关联关系"""
print_section("同步字段关联关系")
# 1. Get input fields
print("1. 获取输入字段...")
input_fields = get_input_fields(conn, tenant_id)
if not input_fields:
print(" 创建缺失的输入字段...")
for field_code in ['clue_info', 'target_basic_info_clue']:
field_id = create_missing_input_field(conn, tenant_id, field_code)
if field_id:
input_fields[field_code] = field_id
if not input_fields:
print_result(False, "无法获取或创建输入字段")
return None
input_field_ids = list(input_fields.values())
print_result(True, f"找到 {len(input_field_ids)} 个输入字段")
# 2. Get output fields
print("\n2. 获取输出字段...")
output_fields = get_output_fields(conn, tenant_id)
print_result(True, f"找到 {len(output_fields)} 个输出字段")
# 3. Get all templates
print("\n3. 获取所有模板...")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND file_path IS NOT NULL
AND file_path != ''
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
templates = cursor.fetchall()
finally:
cursor.close()
print_result(True, f"找到 {len(templates)} 个模板")
if not templates:
print_result(False, "未找到模板")
return None
# 4. First clear all existing relations
print("\n4. 清理现有关联关系...")
if not dry_run:
cursor = conn.cursor()
try:
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s
""", (tenant_id,))
deleted_count = cursor.rowcount
conn.commit()
print_result(True, f"删除了 {deleted_count} 条旧关联关系")
finally:
cursor.close()
else:
print(" [预览模式] 将清理所有现有关联关系")
# 5. Scan template placeholders and create relations
print("\n5. 扫描模板占位符并创建关联关系...")
total_updated = 0
total_errors = 0
all_placeholders_found = set()
missing_fields = set()
for i, template in enumerate(templates, 1):
template_id = template['id']
template_name = template['name']
file_path = template['file_path']
if i % 20 == 0:
print(f" 处理进度: {i}/{len(templates)}")
# Does the local file exist?
local_file = PROJECT_ROOT / file_path
if not local_file.exists():
total_errors += 1
continue
# Extract placeholders
placeholders = extract_placeholders_from_docx(local_file)
all_placeholders_found.update(placeholders)
# Map placeholders to output-field IDs
output_field_ids = []
for placeholder in placeholders:
if placeholder in output_fields:
output_field_ids.append(output_fields[placeholder])
else:
# Field does not exist; try to create it
missing_fields.add(placeholder)
field_id = create_missing_output_field(conn, tenant_id, placeholder)
if field_id:
output_fields[placeholder] = field_id
output_field_ids.append(field_id)
# Create the relations
all_field_ids = input_field_ids + output_field_ids
if not dry_run and all_field_ids:
cursor = conn.cursor()
try:
for field_id in all_field_ids:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
"""
cursor.execute(insert_sql, (
relation_id,
tenant_id,
template_id,
field_id,
CREATED_BY,
UPDATED_BY
))
conn.commit()
total_updated += 1
except Exception as e:
conn.rollback()
total_errors += 1
finally:
cursor.close()
else:
total_updated += 1
# 6. Summarize results
print_section("字段关联同步结果")
print(f" 总模板数: {len(templates)}")
print(f" 已处理: {total_updated}")
print(f" 错误: {total_errors}")
print(f" 发现的占位符总数: {len(all_placeholders_found)}")
print(f" 创建的字段数: {len(missing_fields)}")
return {
'total_templates': len(templates),
'updated': total_updated,
'errors': total_errors,
'placeholders_found': len(all_placeholders_found),
'fields_created': len(missing_fields)
}
def main():
"""主函数"""
print_section("清理并重新同步模板数据")
# Load config
config = get_db_config_from_args()
# Show config
print_section("配置信息")
print(f" 数据库服务器: {config['host']}:{config['port']}")
print(f" 数据库名称: {config['database']}")
print(f" 用户名: {config['user']}")
print(f" 租户ID: {config['tenant_id']}")
print(f" 预览模式: {'' if config['dry_run'] else ''}")
print(f" 跳过清理: {'' if config['skip_clean'] else ''}")
if config['dry_run']:
print("\n[注意] 当前为预览模式,不会实际更新数据库")
# Confirm
if not config.get('dry_run'):
print("\n[警告] 此操作将清理指定租户下的旧数据并重新同步")
confirm = input("确认执行?[yes/N]: ").strip().lower()
if confirm != 'yes':
print("已取消")
return
# Connect to the database
print_section("连接数据库")
conn = test_db_connection(config)
if not conn:
return
print_result(True, "数据库连接成功")
try:
tenant_id = config['tenant_id']
dry_run = config['dry_run']
skip_clean = config['skip_clean']
results = {}
# 1. Scan local templates
print_section("扫描本地模板")
local_templates = scan_local_templates()
print_result(True, f"找到 {len(local_templates)} 个本地模板文件")
# 2. Clean stale data
if not skip_clean:
clean_result = clean_old_data(conn, tenant_id, local_templates, dry_run)
results['clean'] = clean_result
else:
print_section("跳过清理步骤")
print(" 已跳过清理步骤")
# 3. Sync the template hierarchy
hierarchy_result = sync_template_hierarchy(conn, tenant_id, dry_run)
results['hierarchy'] = hierarchy_result
# 4. Sync field relations
fields_result = sync_field_relations(conn, tenant_id, dry_run)
results['fields'] = fields_result
# 5. Summary
print_section("同步完成")
if config['dry_run']:
print(" 本次为预览模式,未实际更新数据库")
else:
print(" 数据库已更新")
if 'clean' in results:
c = results['clean']
print(f"\n 清理结果:")
print(f" - 总模板数: {c['total']}")
print(f" - 删除模板: {c['deleted']}")
print(f" * MinIO路径: {c['minio_paths']}")
print(f" * 无效路径: {c['invalid_paths']}")
print(f" * 重复路径: {c['duplicate_paths']}")
if 'hierarchy' in results and results['hierarchy']:
h = results['hierarchy']
print(f"\n 层级结构:")
print(f" - 创建目录: {h['directories_created']}")
print(f" - 更新目录: {h['directories_updated']}")
print(f" - 创建文件: {h['files_created']}")
print(f" - 更新文件: {h['files_updated']}")
if 'fields' in results and results['fields']:
f = results['fields']
print(f"\n 字段关联:")
print(f" - 总模板数: {f['total_templates']}")
print(f" - 已处理: {f['updated']}")
print(f" - 发现的占位符: {f['placeholders_found']}")
print(f" - 创建的字段: {f['fields_created']}")
finally:
conn.close()
print_result(True, "数据库连接已关闭")
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\n[中断] 用户取消操作")
sys.exit(0)
except Exception as e:
print(f"\n[错误] 发生异常: {str(e)}")
import traceback
traceback.print_exc()
sys.exit(1)


@ -0,0 +1,361 @@
"""
Clean duplicate and invalid rows in the f_polic_file_config table,
keeping the document-template structure in step with the template_finish/ folder.
"""
import os
import re
import json
import pymysql
from pathlib import Path
from typing import Dict, List, Set, Optional
from collections import defaultdict
# Database connection config
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
TEMPLATE_BASE_DIR = 'template_finish'
def normalize_template_name(name: str) -> str:
"""
Normalize a template name: strip the extension, parenthesized content, numeric prefixes, etc.
Args:
name: file name or template name
Returns:
the normalized name
"""
# Strip the extension
name = Path(name).stem if '.' in name else name
# Strip full-width parenthesized content
name = re.sub(r'[(].*?[)]', '', name)
name = name.strip()
# Strip numeric prefixes and dots
name = re.sub(r'^\d+[\.\-]?\s*', '', name)
name = name.strip()
return name
def scan_template_files(base_dir: str) -> Dict[str, Dict]:
"""
Scan the template folder and collect all valid template files.
Returns:
dict keyed by normalized name; each value is a list of template infos (several files may share a name)
"""
base_path = Path(base_dir)
if not base_path.exists():
print(f"错误: 目录不存在 - {base_dir}")
return {}
templates = defaultdict(list)
print("=" * 80)
print("扫描模板文件...")
print("=" * 80)
for docx_file in sorted(base_path.rglob("*.docx")):
# Skip Office temp files
if docx_file.name.startswith("~$"):
continue
relative_path = docx_file.relative_to(base_path)
file_name = docx_file.name
normalized_name = normalize_template_name(file_name)
templates[normalized_name].append({
'file_path': str(docx_file),
'relative_path': str(relative_path),
'file_name': file_name,
'normalized_name': normalized_name
})
print(f"总共扫描到 {sum(len(v) for v in templates.values())} 个模板文件")
print(f"唯一模板名称: {len(templates)}")
return dict(templates)
def get_all_templates_from_db(conn) -> Dict[str, List[Dict]]:
"""
Fetch all templates from the database, grouped by normalized name.
Returns:
dict keyed by normalized name; each value is a list of template records
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, file_path, parent_id, state, input_data, created_time, updated_time
FROM f_polic_file_config
WHERE tenant_id = %s
ORDER BY created_time DESC
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
result = defaultdict(list)
for template in templates:
normalized_name = normalize_template_name(template['name'])
result[normalized_name].append({
'id': template['id'],
'name': template['name'],
'normalized_name': normalized_name,
'file_path': template['file_path'],
'parent_id': template['parent_id'],
'state': template['state'],
'input_data': template['input_data'],
'created_time': template['created_time'],
'updated_time': template['updated_time']
})
cursor.close()
return dict(result)
def find_duplicates(db_templates: Dict[str, List[Dict]]) -> Dict[str, List[Dict]]:
"""
Find duplicate templates (same normalized name, multiple records).
Returns:
dict keyed by normalized name; each value is the list of duplicate records
"""
duplicates = {}
for normalized_name, templates in db_templates.items():
if len(templates) > 1:
duplicates[normalized_name] = templates
return duplicates
def select_best_template(templates: List[Dict], valid_template_files: List[Dict]) -> Optional[Dict]:
"""
Pick the best record among duplicates (keep the newest valid one).
Args:
templates: template records from the database
valid_template_files: list of valid template files (currently unused in the body)
Returns:
the template record to keep, or None
"""
if not templates:
return None
# Prefer records with state=1 and a valid file_path
enabled_templates = [t for t in templates if t.get('state') == 1]
if enabled_templates:
# Several enabled records: keep the newest
enabled_templates.sort(key=lambda x: x.get('updated_time') or x.get('created_time'), reverse=True)
return enabled_templates[0]
# No enabled records: keep the newest
templates.sort(key=lambda x: x.get('updated_time') or x.get('created_time'), reverse=True)
return templates[0]
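The selection rule above can be reproduced with plain dictionaries: enabled (`state=1`) records win, and ties break on the most recent timestamp.

```python
from datetime import datetime

# Replays select_best_template()'s rule on sample records: prefer
# state=1, then the newest updated_time/created_time.
templates = [
    {'id': 1, 'state': 0, 'updated_time': datetime(2025, 12, 18)},
    {'id': 2, 'state': 1, 'updated_time': datetime(2025, 12, 11)},
    {'id': 3, 'state': 1, 'updated_time': datetime(2025, 12, 15)},
]
pool = [t for t in templates if t.get('state') == 1] or templates
pool.sort(key=lambda t: t.get('updated_time') or t.get('created_time'), reverse=True)
print(pool[0]['id'])  # 3
```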
def delete_template_and_relations(conn, template_id: int):
"""
Delete a template and its field relations.
Args:
conn: database connection
template_id: template ID
"""
cursor = conn.cursor()
try:
# Delete field relations
delete_relations_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_relations_sql, (TENANT_ID, template_id))
relations_deleted = cursor.rowcount
# Delete the template config row
delete_template_sql = """
DELETE FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
"""
cursor.execute(delete_template_sql, (TENANT_ID, template_id))
template_deleted = cursor.rowcount
conn.commit()
return relations_deleted, template_deleted
except Exception as e:
conn.rollback()
raise Exception(f"删除模板失败: {str(e)}")
finally:
cursor.close()
def mark_invalid_templates(conn, valid_template_names: Set[str]):
"""
Mark invalid templates (those not present in the template_finish folder).
Args:
conn: database connection
valid_template_names: set of valid (normalized) template names
"""
cursor = conn.cursor()
try:
# Fetch all templates
sql = """
SELECT id, name FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
all_templates = cursor.fetchall()
invalid_count = 0
for template in all_templates:
template_id = template[0]
template_name = template[1]
normalized_name = normalize_template_name(template_name)
# Is it in the valid-template set?
if normalized_name not in valid_template_names:
# Mark as disabled
update_sql = """
UPDATE f_polic_file_config
SET state = 0, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (UPDATED_BY, template_id, TENANT_ID))
invalid_count += 1
print(f" [WARN] 标记无效模板: {template_name} (ID: {template_id})")
conn.commit()
print(f"\n总共标记 {invalid_count} 个无效模板")
except Exception as e:
conn.rollback()
raise Exception(f"标记无效模板失败: {str(e)}")
finally:
cursor.close()
def main():
"""主函数"""
print("=" * 80)
print("清理重复和无效的模板数据")
print("=" * 80)
print()
try:
# Connect to the database
print("1. 连接数据库...")
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功\n")
# Scan template files
print("2. 扫描模板文件...")
valid_templates = scan_template_files(TEMPLATE_BASE_DIR)
valid_template_names = set(valid_templates.keys())
print(f"[OK] 找到 {len(valid_template_names)} 个有效模板名称\n")
# Fetch templates from the database
print("3. 获取数据库中的模板...")
db_templates = get_all_templates_from_db(conn)
print(f"[OK] 数据库中有 {sum(len(v) for v in db_templates.values())} 个模板记录")
print(f"[OK] 唯一模板名称: {len(db_templates)}\n")
# Find duplicates
print("4. 查找重复的模板...")
duplicates = find_duplicates(db_templates)
print(f"[OK] 找到 {len(duplicates)} 个重复的模板名称\n")
# Handle duplicates
print("5. 处理重复模板...")
print("=" * 80)
total_deleted = 0
total_relations_deleted = 0
for normalized_name, templates in duplicates.items():
print(f"\n处理重复模板: {normalized_name}")
print(f" 重复记录数: {len(templates)}")
# The valid files for this normalized name
valid_files = valid_templates.get(normalized_name, [])
# Choose the record to keep
keep_template = select_best_template(templates, valid_files)
if keep_template:
print(f" [KEEP] 保留模板: {keep_template['name']} (ID: {keep_template['id']})")
# Delete the other duplicates
for template in templates:
if template['id'] != keep_template['id']:
print(f" [DELETE] 删除重复模板: {template['name']} (ID: {template['id']})")
relations_deleted, template_deleted = delete_template_and_relations(conn, template['id'])
total_relations_deleted += relations_deleted
total_deleted += template_deleted
else:
print(f" [WARN] 无法确定要保留的模板,跳过")
print(f"\n[OK] 删除重复模板: {total_deleted}")
print(f"[OK] 删除关联关系: {total_relations_deleted}\n")
# Mark invalid templates
print("6. 标记无效模板...")
mark_invalid_templates(conn, valid_template_names)
# Final stats
print("\n7. 统计最终结果...")
final_templates = get_all_templates_from_db(conn)
enabled_count = sum(1 for templates in final_templates.values()
for t in templates if t.get('state') == 1)
disabled_count = sum(1 for templates in final_templates.values()
for t in templates if t.get('state') != 1)
print(f"[OK] 最终模板总数: {sum(len(v) for v in final_templates.values())}")
print(f"[OK] 启用模板数: {enabled_count}")
print(f"[OK] 禁用模板数: {disabled_count}")
print(f"[OK] 唯一模板名称: {len(final_templates)}")
# Print the final (enabled) template list
print("\n8. 最终模板列表(启用的):")
print("=" * 80)
for normalized_name, templates in sorted(final_templates.items()):
enabled = [t for t in templates if t.get('state') == 1]
if enabled:
for template in enabled:
print(f" - {template['name']} (ID: {template['id']})")
print("\n" + "=" * 80)
print("清理完成!")
print("=" * 80)
except Exception as e:
print(f"\n[ERROR] 发生错误: {e}")
import traceback
traceback.print_exc()
if 'conn' in locals():
conn.rollback()
finally:
if 'conn' in locals():
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -1,21 +1,34 @@
 {
   "prompt_template": {
-    "intro": "请从以下输入文本中提取结构化信息。",
+    "intro": "请从以下输入文本中提取结构化信息。仔细分析文本内容,准确提取每个字段的值。\n\n⚠ 重要提醒:请逐字逐句仔细阅读输入文本,不要遗漏任何信息。对于性别、年龄、职务、单位、文化程度等字段,请特别仔细查找,这些信息可能以各种形式出现在文本中。",
     "input_text_label": "输入文本:",
-    "output_fields_label": "需要提取的字段",
-    "json_format_label": "请严格按照以下JSON格式返回结果只返回JSON,不要包含其他文字说明:",
-    "requirements_label": "要求:",
+    "output_fields_label": "需要提取的字段(请仔细分析每个字段,确保提取完整)",
+    "json_format_label": "请严格按照以下JSON格式返回结果只返回JSON对象,不要包含任何其他文字说明或markdown代码块标记",
+    "requirements_label": "重要要求(请严格遵守)",
     "requirements": [
-      "仔细分析输入文本,准确提取每个字段的值",
-      "如果某个字段在输入文本中找不到对应信息,该字段值设为空字符串\"\"",
-      "日期格式统一为YYYYMM198005表示1980年5月如果包含日期信息则格式为YYYYMMDD",
-      "性别统一为\"男\"或\"女\",不要使用\"男性\"或\"女性\"",
-      "政治面貌使用标准表述(如:中共党员、中共预备党员、共青团员、群众等)",
+      "⚠️ 逐字逐句仔细分析输入文本,不要遗漏任何信息。请特别关注性别、年龄、职务、单位、文化程度等字段",
+      "对于每个字段,请从多个角度思考:直接提及、同义词、隐含信息、可推断信息。例如:性别可能以\"男\"、\"女\"、\"男性\"、\"女性\"、\"先生\"、\"女士\"等形式出现",
+      "如果文本中明确提到某个信息(如\"30岁\"、\"男\"、\"总经理\"、\"某公司\"等),必须提取出来,不能设为空",
+      "如果可以通过已有信息合理推断,请进行推断并填写:\n - 根据出生年月如1980年05月和当前年份2024年计算年龄44岁\n - 从单位及职务(如\"某公司总经理\")中拆分单位(\"某公司\")和职务(\"总经理\"\n - 从工作基本情况中提取性别、文化程度等信息",
+      "如果某个字段在输入文本中确实找不到任何相关信息,该字段值才设为空字符串\"\"",
+      "日期格式统一为中文格式YYYY年MM月1980年05月表示1980年5月如果包含日期信息则格式为YYYY年MM月DD日1985年05月17日。注意年份必须是4位数字月份和日期必须是2位数字如1980年5月应格式化为1980年05月不是1980年5月",
+      "性别统一为\"男\"或\"女\",不要使用\"男性\"或\"女性\"。如果文本中提到\"男性\"、\"男\"、\"先生\"等,统一转换为\"男\";如果提到\"女性\"、\"女\"、\"女士\"等,统一转换为\"女\"",
+      "年龄字段:如果文本中直接提到年龄(如\"30岁\"、\"30周岁\"直接提取数字如果只有出生年月可以根据当前年份计算年龄当前年份为2024年",
+      "单位及职务字段:如果文本中提到\"XX公司总经理\"、\"XX单位XX职务\"等,需要同时提取单位名称和职务名称",
+      "单位字段:从单位及职务信息中提取单位名称部分(如\"XX公司\"、\"XX局\"、\"XX部门\"等)",
+      "职务字段:从单位及职务信息中提取职务名称部分(如\"总经理\"、\"局长\"、\"主任\"等)",
+      "文化程度字段:注意识别\"本科\"、\"大专\"、\"高中\"、\"中专\"、\"研究生\"、\"硕士\"、\"博士\"等表述",
+      "政治面貌使用标准表述(如:中共党员、中共预备党员、共青团员、群众等)。如果文本中提到\"党员\",统一转换为\"中共党员\"",
       "职级使用标准表述(如:正处级、副处级、正科级、副科级等)",
+      "线索来源字段:注意识别\"举报\"、\"来信\"、\"来电\"、\"网络举报\"、\"上级交办\"等表述",
+      "主要问题线索字段:提取文本中关于问题、线索、举报内容等的描述",
       "身份证号码只提取数字,不包含其他字符",
       "联系方式提取电话号码,格式化为纯数字",
       "地址信息保持完整,包含省市区街道等详细信息",
-      "只返回JSON对象不要包含markdown代码块标记"
+      "只返回JSON对象不要包含markdown代码块标记、思考过程或其他说明文字",
+      "JSON格式要求所有字段名必须使用双引号字段名中不能包含前导点如不能使用\".target_gender\",应使用\"target_gender\"),字段名前后不能有空格",
+      "必须返回所有要求的字段即使值为空字符串也要包含在JSON中",
+      "字段名必须严格按照JSON示例中的字段编码不能随意修改或拼写错误如不能使用\"targetsProfessionalRank\",应使用\"target_professional_rank\""
     ]
   },
   "field_formatting": {
@@ -34,23 +47,32 @@
       "description": "被核查人员性别",
       "rules": [
         "只能返回\"男\"或\"女\"",
-        "如果文本中提到\"男性\"、\"男性公民\"等,统一转换为\"男\"",
-        "如果文本中提到\"女性\"、\"女性公民\"等,统一转换为\"女\""
+        "如果文本中提到\"男性\"、\"男性公民\"、\"男\"、\"先生\"等,统一转换为\"男\"",
+        "如果文本中提到\"女性\"、\"女性公民\"、\"女\"、\"女士\"等,统一转换为\"女\"",
+        "请仔细查找文本中所有可能表示性别的词汇,不要遗漏",
+        "如果文本中提到\"XXX...\"或\"XXX...\",必须提取性别",
+        "如果工作基本情况中提到性别信息,必须提取"
       ]
     },
     "target_date_of_birth": {
       "description": "被核查人员出生年月",
       "rules": [
-        "格式YYYYMM如198005表示1980年5月",
-        "如果只有年份月份设为01",
-        "如果文本中提到\"X年X月X日出生\",只提取年月,忽略日期"
+        "格式YYYY年MM月中文格式如1980年05月表示1980年5月注意月份必须是2位数字如5月应写为05月不是5月",
+        "如果只有年份月份设为01如1980年应格式化为1980年01月",
+        "如果文本中提到\"X年X月X日出生\",只提取年月,忽略日期",
+        "如果文本中提到\"1980年5月\",格式化为\"1980年05月\"(月份补零)",
+        "如果文本中提到\"1980年05月\",保持为\"1980年05月\"",
+        "年份必须是4位数字月份必须是2位数字01-12",
+        "输出格式示例1980年05月、1985年03月、1990年12月"
       ]
     },
     "target_date_of_birth_full": {
       "description": "被核查人员出生年月日",
       "rules": [
-        "格式YYYYMMDD如19800515表示1980年5月15日",
-        "如果只有年月日期设为01"
+        "格式YYYY年MM月DD日中文格式如1985年05月17日表示1985年5月17日",
+        "如果只有年月日期设为01如1980年05月应格式化为1980年05月01日",
+        "年份必须是4位数字月份和日期必须是2位数字01-12和01-31",
+        "输出格式示例1985年05月17日、1980年03月15日、1990年12月01日"
      ]
    },
    "target_political_status": {
@@ -99,6 +121,84 @@
       "学历使用标准表述:本科、大专、高中、中专、研究生等",
       "政治面貌部分:如果是中共党员,写\"加入中国共产党\";如果不是,省略此部分"
     ]
+    },
+    "target_age": {
+      "description": "被核查人员年龄",
+      "rules": [
+        "如果文本中直接提到年龄(如\"30岁\"、\"30周岁\"、\"年龄30\"、\"现年30\"),直接提取数字部分",
+        "如果无法抽取到年龄数据,但抽取到了\"被核查人员出生年月\",系统将根据出生年月和当前日期自动计算年龄",
+        "年龄格式:纯数字,单位为岁,如\"44\"表示44岁",
+        "如果文本中既没有直接提到年龄,也没有出生年月信息,则设为空字符串"
+      ]
+    },
+    "target_organization_and_position": {
+      "description": "被核查人员单位及职务(包括兼职)",
+      "rules": [
+        "提取完整的单位及职务信息,格式如:\"XX公司总经理\"、\"XX局XX处处长\"、\"XX单位XX职务\"",
+        "如果文本中提到\"XX公司总经理\"、\"XX单位XX职务\"等,需要完整提取",
+        "如果文本中分别提到单位和职务,需要组合成\"单位+职务\"的格式",
+        "如果文本中提到多个职务或兼职,需要全部包含,用\"、\"或\"兼\"连接",
+        "保持原文中的表述,不要随意修改"
+      ]
+    },
+    "target_organization": {
+      "description": "被核查人员单位",
+      "rules": [
+        "从单位及职务信息中提取单位名称部分",
+        "单位名称包括:公司、企业、机关、事业单位、部门等(如\"XX公司\"、\"XX局\"、\"XX部门\"、\"XX委员会\"等)",
+        "如果文本中只提到单位名称,直接提取",
+        "⚠️ 如果文本中提到\"XX公司总经理\",必须提取\"XX公司\"部分,不能设为空",
+        "如果文本中提到\"XX单位XX职务\",提取\"XX单位\"部分",
+        "如果已有单位及职务字段target_organization_and_position必须从中拆分出单位名称",
+        "保持单位名称的完整性,不要遗漏"
+      ]
+    },
+    "target_position": {
+      "description": "被核查人员职务",
+      "rules": [
+        "从单位及职务信息中提取职务名称部分",
+        "职务名称包括:总经理、经理、局长、处长、科长、主任、书记、部长等",
+        "如果文本中只提到职务名称,直接提取",
+        "⚠️ 如果文本中提到\"XX公司总经理\",必须提取\"总经理\"部分,不能设为空",
+        "如果文本中提到\"XX单位XX职务\",提取\"XX职务\"部分",
+        "如果已有单位及职务字段target_organization_and_position必须从中拆分出职务名称",
+        "如果文本中提到多个职务,需要全部提取,用\"、\"连接",
+        "保持职务名称的准确性"
+      ]
+    },
+    "target_education_level": {
+      "description": "被核查人员文化程度",
+      "rules": [
+        "识别文本中关于学历、文化程度的表述",
+        "标准表述包括:小学、初中、高中、中专、大专、本科、研究生、硕士、博士等",
+        "如果文本中提到\"大学\"、\"大学毕业\",通常指\"本科\"",
+        "如果文本中提到\"专科\",通常指\"大专\"",
+        "如果文本中提到\"研究生学历\",可以写\"研究生\"",
+        "保持标准表述,不要使用非标准表述"
+      ]
+    },
+    "clue_source": {
+      "description": "线索来源",
+      "rules": [
+        "识别文本中关于线索来源的表述",
+        "常见来源包括:举报、来信、来电、网络举报、上级交办、巡视发现、审计发现、媒体曝光等",
+        "如果文本中提到\"举报\"、\"被举报\",线索来源可能是\"举报\"或\"来信举报\"",
+        "如果文本中提到\"电话\"、\"来电\",线索来源可能是\"来电举报\"",
+        "如果文本中提到\"网络\"、\"网上\",线索来源可能是\"网络举报\"",
+        "如果文本中提到\"上级\"、\"交办\",线索来源可能是\"上级交办\"",
+        "如果文本中没有明确提到线索来源,但提到\"举报\"相关信息,可以推断为\"举报\"",
+        "保持标准表述"
+      ]
+    },
+    "target_issue_description": {
+      "description": "主要问题线索",
+      "rules": [
+        "提取文本中关于问题、线索、举报内容等的描述",
+        "包括但不限于:违纪违法问题、工作作风问题、经济问题、生活作风问题等",
+        "如果文本中提到\"问题\"、\"线索\"、\"举报\"、\"反映\"等关键词,提取相关内容",
+        "保持问题描述的完整性和准确性,不要遗漏重要信息",
+        "如果文本中没有明确的问题描述,但提到了相关情况,也要尽量提取"
+      ]
     }
   }
 }


@@ -0,0 +1,482 @@
"""
诊断MinIO文档生成问题
测试新MinIO服务器配置下的文档生成流程
"""
import os
import sys
from minio import Minio
from minio.error import S3Error
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
# 新MinIO配置用户提供
NEW_MINIO_CONFIG = {
'endpoint': '10.100.31.21:9000',
'access_key': 'minio_PC8dcY',
'secret_key': 'minio_7k7RNJ',
'secure': False # 重要根据测试结果应该是false
}
BUCKET_NAME = 'finyx'
TENANT_ID = 615873064429507639
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印测试结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def check_environment_variables():
"""检查环境变量配置"""
print_section("1. 检查环境变量配置")
env_vars = {
'MINIO_ENDPOINT': os.getenv('MINIO_ENDPOINT'),
'MINIO_ACCESS_KEY': os.getenv('MINIO_ACCESS_KEY'),
'MINIO_SECRET_KEY': os.getenv('MINIO_SECRET_KEY'),
'MINIO_BUCKET': os.getenv('MINIO_BUCKET'),
'MINIO_SECURE': os.getenv('MINIO_SECURE')
}
print("\n当前环境变量配置:")
for key, value in env_vars.items():
if key == 'MINIO_SECRET_KEY' and value:
# 隐藏密钥的部分内容
masked_value = value[:8] + '***' if len(value) > 8 else '***'
print(f" {key}: {masked_value}")
else:
print(f" {key}: {value}")
# 检查配置是否正确
issues = []
if env_vars['MINIO_ENDPOINT'] != NEW_MINIO_CONFIG['endpoint']:
issues.append(f"MINIO_ENDPOINT 应该是 '{NEW_MINIO_CONFIG['endpoint']}',当前是 '{env_vars['MINIO_ENDPOINT']}'")
if env_vars['MINIO_ACCESS_KEY'] != NEW_MINIO_CONFIG['access_key']:
issues.append(f"MINIO_ACCESS_KEY 应该是 '{NEW_MINIO_CONFIG['access_key']}',当前是 '{env_vars['MINIO_ACCESS_KEY']}'")
secure_value = env_vars['MINIO_SECURE']
if secure_value and secure_value.lower() == 'true':
issues.append(f"[WARN] MINIO_SECURE 设置为 'true'但新服务器使用HTTP应该设置为 'false'")
if issues:
print("\n[WARN] 发现配置问题:")
for issue in issues:
print(f" - {issue}")
print_result(False, "环境变量配置需要更新")
return False
else:
print_result(True, "环境变量配置正确")
return True
def test_minio_connection():
"""测试MinIO连接"""
print_section("2. 测试MinIO连接")
# 先尝试用户配置的secure值
secure_values = [False, True] # 优先尝试false根据测试结果
for secure in secure_values:
try:
print(f"\n尝试连接secure={secure}...")
client = Minio(
NEW_MINIO_CONFIG['endpoint'],
access_key=NEW_MINIO_CONFIG['access_key'],
secret_key=NEW_MINIO_CONFIG['secret_key'],
secure=secure
)
# 测试连接:列出存储桶
buckets = client.list_buckets()
print_result(True, f"MinIO连接成功secure={secure}")
print(f"\n 连接信息:")
print(f" 端点: {NEW_MINIO_CONFIG['endpoint']}")
print(f" 使用HTTPS: {secure}")
print(f" 访问密钥: {NEW_MINIO_CONFIG['access_key']}")
print(f"\n 可用存储桶:")
for bucket in buckets:
print(f" - {bucket.name} (创建时间: {bucket.creation_date})")
# 检查目标存储桶
bucket_exists = client.bucket_exists(BUCKET_NAME)
if bucket_exists:
print_result(True, f"存储桶 '{BUCKET_NAME}' 存在")
else:
print_result(False, f"存储桶 '{BUCKET_NAME}' 不存在")
print(f" 建议:需要创建存储桶 '{BUCKET_NAME}'")
return None, False
return client, True
except Exception as e:
error_msg = str(e)
if secure:
print_result(False, f"使用HTTPS连接失败: {error_msg}")
print(f" 将尝试使用HTTP连接...")
continue
else:
print_result(False, f"MinIO连接失败: {error_msg}")
import traceback
traceback.print_exc()
return None, False
return None, False
def test_template_download(client):
"""测试模板下载功能"""
print_section("3. 测试模板下载功能")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
try:
# 查询数据库获取一个模板文件路径
import pymysql
db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
conn = pymysql.connect(**db_config)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 查询一个启用的模板
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
AND file_path IS NOT NULL
AND file_path != ''
LIMIT 1
"""
cursor.execute(sql, (TENANT_ID,))
template = cursor.fetchone()
cursor.close()
conn.close()
if not template:
print_result(False, "数据库中没有找到可用的模板文件")
print(" 建议:检查数据库中的 f_polic_file_config 表")
return False
print(f"\n找到模板:")
print(f" ID: {template['id']}")
print(f" 名称: {template['name']}")
print(f" 文件路径: {template['file_path']}")
# 尝试下载模板
object_name = template['file_path'].lstrip('/')
print(f"\n尝试下载模板...")
print(f" 对象名称: {object_name}")
# 检查文件是否存在
try:
stat = client.stat_object(BUCKET_NAME, object_name)
print_result(True, f"模板文件存在(大小: {stat.size:,} 字节)")
except S3Error as e:
if e.code == 'NoSuchKey':
print_result(False, f"模板文件不存在: {object_name}")
print(f" 错误: {str(e)}")
print(f" 建议检查MinIO服务器上是否存在该文件")
return False
else:
raise
# 尝试下载(使用临时文件)
import tempfile
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.docx')
temp_file.close()
try:
client.fget_object(BUCKET_NAME, object_name, temp_file.name)
file_size = os.path.getsize(temp_file.name)
print_result(True, f"模板下载成功(大小: {file_size:,} 字节)")
# 清理临时文件
os.unlink(temp_file.name)
return True
except Exception as e:
print_result(False, f"模板下载失败: {str(e)}")
# 清理临时文件
if os.path.exists(temp_file.name):
os.unlink(temp_file.name)
return False
except Exception as e:
print_result(False, f"测试模板下载时出错: {str(e)}")
import traceback
traceback.print_exc()
return False
def test_file_upload(client):
"""测试文件上传功能"""
print_section("4. 测试文件上传功能")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
try:
# 创建一个测试文件
import tempfile
from datetime import datetime
test_content = b"Test document content for MinIO upload test"
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.docx')
temp_file.write(test_content)
temp_file.close()
print(f"\n创建测试文件: {temp_file.name}")
# 生成上传路径
now = datetime.now()
timestamp = f"{now.strftime('%Y%m%d%H%M%S')}{now.microsecond:06d}"
object_name = f"{TENANT_ID}/TEST/{timestamp}/test_upload.docx"
print(f"\n尝试上传文件...")
print(f" 对象名称: {object_name}")
# 上传文件
client.fput_object(
BUCKET_NAME,
object_name,
temp_file.name,
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
print_result(True, "文件上传成功")
# 验证文件是否存在
stat = client.stat_object(BUCKET_NAME, object_name)
print(f" 上传的文件大小: {stat.size:,} 字节")
# 清理测试文件
os.unlink(temp_file.name)
# 可选:删除测试文件
try:
client.remove_object(BUCKET_NAME, object_name)
print(f" 已清理测试文件: {object_name}")
except Exception:
pass
return True
except Exception as e:
print_result(False, f"文件上传失败: {str(e)}")
import traceback
traceback.print_exc()
# 清理临时文件
if 'temp_file' in locals() and os.path.exists(temp_file.name):
os.unlink(temp_file.name)
return False
def test_presigned_url(client):
"""测试预签名URL生成"""
print_section("5. 测试预签名URL生成")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
try:
# 使用一个测试对象名称
from datetime import datetime, timedelta
now = datetime.now()
timestamp = f"{now.strftime('%Y%m%d%H%M%S')}{now.microsecond:06d}"
test_object_name = f"{TENANT_ID}/TEST/{timestamp}/test_url.docx"
# 先创建一个测试文件
import tempfile
test_content = b"Test content"
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.docx')
temp_file.write(test_content)
temp_file.close()
# 上传测试文件
client.fput_object(
BUCKET_NAME,
test_object_name,
temp_file.name,
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
os.unlink(temp_file.name)
print(f"\n生成预签名URL...")
print(f" 对象名称: {test_object_name}")
# 生成预签名URL
url = client.presigned_get_object(
BUCKET_NAME,
test_object_name,
expires=timedelta(days=7)
)
print_result(True, "预签名URL生成成功")
print(f"\n URL: {url[:100]}...")
# 清理测试文件
try:
client.remove_object(BUCKET_NAME, test_object_name)
except Exception:
pass
return True
except Exception as e:
print_result(False, f"预签名URL生成失败: {str(e)}")
import traceback
traceback.print_exc()
return False
def check_directory_structure(client):
"""检查目录结构MinIO是对象存储不需要创建目录"""
print_section("6. 检查目录结构")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
print("\n说明MinIO是对象存储不需要创建目录。")
print("对象名称可以包含路径分隔符(如 '/'MinIO会自动处理。")
print("\n检查存储桶中的对象结构...")
try:
# 列出一些对象,查看目录结构
objects = client.list_objects(BUCKET_NAME, prefix=f"{TENANT_ID}/", recursive=False)
prefixes = set()
count = 0
for obj in objects:
count += 1
if count <= 20: # 只显示前20个
# 提取前缀(目录)
parts = obj.object_name.split('/')
if len(parts) > 1:
prefix = '/'.join(parts[:-1])
prefixes.add(prefix)
if prefixes:
print(f"\n发现的前缀目录结构前20个对象:")
for prefix in sorted(prefixes):
print(f" - {prefix}/")
print_result(True, f"存储桶结构正常(已检查 {count} 个对象)")
return True
except Exception as e:
print_result(False, f"检查目录结构失败: {str(e)}")
import traceback
traceback.print_exc()
return False
def print_recommendations():
"""打印修复建议"""
print_section("修复建议")
print("\n根据诊断结果,请执行以下步骤:")
print("\n1. 更新环境变量配置(.env文件或系统环境变量:")
print(" MINIO_ENDPOINT=10.100.31.21:9000")
print(" MINIO_ACCESS_KEY=minio_PC8dcY")
print(" MINIO_SECRET_KEY=minio_7k7RNJ")
print(" MINIO_BUCKET=finyx")
print(" MINIO_SECURE=false # [IMPORTANT] 重要必须是false不是true")
print("\n2. 确保存储桶存在:")
print(f" 存储桶名称: {BUCKET_NAME}")
print(" 如果不存在,需要创建存储桶")
print("\n3. 确保模板文件已上传到MinIO:")
print(" 检查数据库中的 f_polic_file_config 表的 file_path 字段")
print(" 确保对应的文件在MinIO服务器上存在")
print("\n4. 关于目录:")
print(" MinIO是对象存储不需要创建目录")
print(" 对象名称可以包含路径分隔符(如 '/'MinIO会自动处理")
print(" 例如: 615873064429507639/TEMPLATE/2024/12/template.docx")
print("\n5. 重启应用:")
print(" 更新环境变量后,需要重启应用服务才能生效")
print("\n[IMPORTANT] MINIO_SECURE=false # 注意必须是false不是true")
def main():
"""主函数"""
print("\n" + "="*70)
print(" MinIO文档生成问题诊断工具")
print("="*70)
print(f"\n新MinIO服务器配置:")
print(f" 端点: {NEW_MINIO_CONFIG['endpoint']}")
print(f" 存储桶: {BUCKET_NAME}")
print(f" 访问密钥: {NEW_MINIO_CONFIG['access_key']}")
print(f" 使用HTTPS: {NEW_MINIO_CONFIG['secure']}")
results = {}
try:
# 1. 检查环境变量
results['环境变量'] = check_environment_variables()
# 2. 测试MinIO连接
client, bucket_exists = test_minio_connection()
results['MinIO连接'] = client is not None and bucket_exists
if client and bucket_exists:
# 3. 测试模板下载
results['模板下载'] = test_template_download(client)
# 4. 测试文件上传
results['文件上传'] = test_file_upload(client)
# 5. 测试预签名URL
results['预签名URL'] = test_presigned_url(client)
# 6. 检查目录结构
results['目录结构'] = check_directory_structure(client)
# 总结
print_section("诊断总结")
print("\n测试结果:")
for test_name, success in results.items():
status = "[OK] 通过" if success else "[FAIL] 失败"
print(f" {test_name}: {status}")
passed = sum(1 for v in results.values() if v)
total = len(results)
print(f"\n通过率: {passed}/{total} ({passed*100//total if total > 0 else 0}%)")
if passed == total:
print("\n[OK] 所有测试通过MinIO配置正确文档生成应该可以正常工作。")
else:
print("\n[WARN] 部分测试失败,请查看上面的错误信息并按照建议进行修复。")
print_recommendations()
except KeyboardInterrupt:
print("\n\n诊断已中断")
except Exception as e:
print(f"\n[ERROR] 诊断过程中发生错误: {e}")
import traceback
traceback.print_exc()
print_recommendations()
if __name__ == '__main__':
main()
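The recommendations above boil down to reading the MinIO settings from the environment instead of hard-coding them. A minimal sketch of what that looks like in application code — the `.env` keys match the ones `check_environment_variables` prints, and the default endpoint is only an example taken from `NEW_MINIO_CONFIG`:

```python
import os


def parse_minio_secure(value):
    """Interpret the MINIO_SECURE environment string as a boolean.

    Only the literal string 'true' (any case, surrounding whitespace
    ignored) enables HTTPS; 'false', '', or an unset variable mean HTTP.
    """
    return (value or '').strip().lower() == 'true'


def minio_client_from_env():
    """Build a MinIO client from the same .env keys this script checks."""
    from minio import Minio  # lazy import; requires the 'minio' package
    return Minio(
        os.getenv('MINIO_ENDPOINT', '10.100.31.21:9000'),
        access_key=os.getenv('MINIO_ACCESS_KEY', ''),
        secret_key=os.getenv('MINIO_SECRET_KEY', ''),
        secure=parse_minio_secure(os.getenv('MINIO_SECURE')),
    )
```

Parsing `MINIO_SECURE` explicitly matters because environment values are strings, so a naive `bool(os.getenv('MINIO_SECURE'))` would treat `'false'` as truthy.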

enable_all_fields.py Normal file

@@ -0,0 +1,231 @@
"""
启用f_polic_field表中所有字段将state更新为1
"""
import pymysql
import os
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
def check_field_states(conn):
"""检查字段状态统计"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 统计各状态的字段数量使用CAST来正确处理二进制类型
sql = """
SELECT
CAST(state AS UNSIGNED) as state_int,
field_type,
COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY CAST(state AS UNSIGNED), field_type
ORDER BY field_type, CAST(state AS UNSIGNED)
"""
cursor.execute(sql, (TENANT_ID,))
stats = cursor.fetchall()
cursor.close()
return stats
def get_fields_by_state(conn, state):
"""获取指定状态的字段列表"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type, CAST(state AS UNSIGNED) as state_int
FROM f_polic_field
WHERE tenant_id = %s
AND CAST(state AS UNSIGNED) = %s
ORDER BY field_type, name
"""
cursor.execute(sql, (TENANT_ID, state))
fields = cursor.fetchall()
cursor.close()
return fields
def enable_all_fields(conn, dry_run=True):
"""启用所有字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 查询需要更新的字段使用CAST来正确处理二进制类型
sql = """
SELECT id, name, filed_code, field_type, CAST(state AS UNSIGNED) as state_int
FROM f_polic_field
WHERE tenant_id = %s
AND CAST(state AS UNSIGNED) != 1
ORDER BY field_type, name
"""
cursor.execute(sql, (TENANT_ID,))
fields_to_update = cursor.fetchall()
if not fields_to_update:
print("✓ 所有字段已经是启用状态,无需更新")
cursor.close()
return 0
print(f"\n找到 {len(fields_to_update)} 个需要启用的字段:")
for field in fields_to_update:
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
print(f" - {field['name']} ({field['filed_code']}) [{field_type_str}] (当前state={field['state_int']})")
if dry_run:
print("\n⚠ 这是预览模式dry_run=True不会实际更新数据库")
cursor.close()
return len(fields_to_update)
# 执行更新使用CAST来正确比较
update_sql = """
UPDATE f_polic_field
SET state = 1, updated_time = %s, updated_by = %s
WHERE tenant_id = %s
AND CAST(state AS UNSIGNED) != 1
"""
cursor.execute(update_sql, (CURRENT_TIME, UPDATED_BY, TENANT_ID))
updated_count = cursor.rowcount
conn.commit()
cursor.close()
return updated_count
def main():
"""主函数"""
print("="*80)
print("启用f_polic_field表中所有字段")
print("="*80)
print()
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功")
except Exception as e:
print(f"✗ 数据库连接失败: {str(e)}")
return
try:
# 1. 检查当前状态统计
print("\n正在检查字段状态统计...")
stats = check_field_states(conn)
print("\n字段状态统计:")
total_fields = 0
enabled_fields = 0
disabled_fields = 0
for stat in stats:
state_int = stat['state_int']
field_type = stat['field_type']
count = stat['count']
total_fields += count
state_str = "启用" if state_int == 1 else "未启用"
type_str = "输出字段" if field_type == 2 else "输入字段"
print(f" {type_str} - {state_str} (state={state_int}): {count}")
if state_int == 1:
enabled_fields += count
else:
disabled_fields += count
print(f"\n总计: {total_fields} 个字段")
print(f" 启用: {enabled_fields}")
print(f" 未启用: {disabled_fields}")
# 2. 显示未启用的字段详情
if disabled_fields > 0:
print(f"\n正在查询未启用的字段详情...")
disabled_fields_list = get_fields_by_state(conn, 0)
print(f"\n未启用的字段列表 ({len(disabled_fields_list)} 个):")
for field in disabled_fields_list:
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
print(f" - {field['name']} ({field['filed_code']}) [{field_type_str}]")
# 3. 预览更新dry_run
print("\n" + "="*80)
print("预览更新(不会实际修改数据库)")
print("="*80)
count_to_update = enable_all_fields(conn, dry_run=True)
if count_to_update == 0:
print("\n所有字段已经是启用状态,无需更新")
return
# 4. 执行更新(此脚本不做交互确认,预览后直接执行)
print("\n" + "="*80)
print("准备执行更新")
print("="*80)
print(f"将更新 {count_to_update} 个字段的状态为启用state=1")
# 实际执行更新
print("\n正在执行更新...")
updated_count = enable_all_fields(conn, dry_run=False)
print(f"\n✓ 更新成功!共更新 {updated_count} 个字段")
# 5. 验证更新结果
print("\n正在验证更新结果...")
final_stats = check_field_states(conn)
print("\n更新后的字段状态统计:")
final_enabled = 0
final_disabled = 0
for stat in final_stats:
state_int = stat['state_int']
field_type = stat['field_type']
count = stat['count']
state_str = "启用" if state_int == 1 else "未启用"
type_str = "输出字段" if field_type == 2 else "输入字段"
print(f" {type_str} - {state_str} (state={state_int}): {count}")
if state_int == 1:
final_enabled += count
else:
final_disabled += count
print(f"\n总计: {final_enabled + final_disabled} 个字段")
print(f" 启用: {final_enabled}")
print(f" 未启用: {final_disabled}")
if final_disabled == 0:
print("\n✓ 所有字段已成功启用!")
else:
print(f"\n⚠ 仍有 {final_disabled} 个字段未启用")
print("\n" + "="*80)
print("操作完成!")
print("="*80)
except Exception as e:
print(f"\n✗ 处理失败: {str(e)}")
import traceback
traceback.print_exc()
conn.rollback()
finally:
conn.close()
if __name__ == '__main__':
main()
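The repeated `CAST(state AS UNSIGNED)` in the queries above works around PyMySQL returning MySQL BIT columns as `bytes` (for example `b'\x01'`), so a plain `state != 1` comparison misbehaves. If the cast were not applied in SQL, the equivalent normalization on the Python side would be a small helper along these lines (a sketch; it assumes the value fits the big-endian interpretation that `int.from_bytes` gives):

```python
def bit_to_int(value):
    r"""Normalize a BIT column value returned by PyMySQL.

    BIT columns come back as bytes objects such as b'\x00' or b'\x01';
    values from an already-CAST query arrive as plain integers and
    pass through unchanged.
    """
    if isinstance(value, bytes):
        return int.from_bytes(value, byteorder='big')
    return int(value)
```

This is the same idea as the `clean_query_result` helper used by the export script elsewhere in this changeset.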


@@ -0,0 +1,328 @@
"""
导出模板和字段关系到Excel表格
用于汇总整理模板和字段关系后续可以基于这个Excel表格新增数据并增加导入脚本
"""
import pymysql
import os
from dotenv import load_dotenv
from openpyxl import Workbook
from openpyxl.styles import Font, Alignment, PatternFill, Border, Side
from openpyxl.utils import get_column_letter
from datetime import datetime
import re
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST'),
'port': int(os.getenv('DB_PORT', 3306)),
'user': os.getenv('DB_USER'),
'password': os.getenv('DB_PASSWORD'),
'database': os.getenv('DB_NAME'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def clean_query_result(data):
"""清理查询结果,将 bytes 类型转换为字符串"""
if isinstance(data, bytes):
if len(data) == 1:
return int.from_bytes(data, byteorder='big')
try:
return data.decode('utf-8')
except UnicodeDecodeError:
return data.decode('utf-8', errors='ignore')
elif isinstance(data, dict):
return {key: clean_query_result(value) for key, value in data.items()}
elif isinstance(data, list):
return [clean_query_result(item) for item in data]
elif isinstance(data, (int, float, str, bool, type(None))):
return data
else:
return str(data)
def extract_template_category(file_path, template_name):
"""
从文件路径或模板名称提取模板的上级分类
例如/615873064429507639/TEMPLATE/2025/12/2-初核模版/2.谈话审批/走读式谈话审批/2谈话审批表.docx
提取为2-初核模版/2.谈话审批/走读式谈话审批
"""
category = ""
# 首先尝试从文件路径提取
if file_path:
# 移除开头的斜杠和租户ID部分
path = file_path.lstrip('/')
# 移除租户ID/TEMPLATE/年份/月份/部分
pattern = r'^\d+/TEMPLATE/\d+/\d+/(.+)'
match = re.match(pattern, path)
if match:
full_path = match.group(1)
# 移除文件名,只保留目录路径
if '/' in full_path:
category = '/'.join(full_path.split('/')[:-1])
# 如果路径格式不匹配,尝试其他方式
if not category and ('template_finish' in path.lower() or '初核' in path or '谈话' in path or '函询' in path):
# 尝试提取目录结构
parts = path.split('/')
result_parts = []
for part in parts:
if any(keyword in part for keyword in ['初核', '谈话', '函询', '模版', '模板']):
result_parts.append(part)
if result_parts:
category = '/'.join(result_parts[:-1]) if len(result_parts) > 1 else result_parts[0]
# 如果从路径无法提取,尝试从模板名称推断
if not category and template_name:
# 根据模板名称中的关键词推断分类
if '初核' in template_name:
if '谈话' in template_name:
category = '2-初核模版/2.谈话审批'
elif '请示' in template_name or '审批' in template_name:
category = '2-初核模版/1.初核请示'
elif '结论' in template_name or '报告' in template_name:
category = '2-初核模版/3.初核结论'
else:
category = '2-初核模版'
elif '谈话' in template_name:
if '函询' in template_name:
category = '1-谈话函询模板/函询模板'
else:
category = '1-谈话函询模板/谈话模版'
elif '函询' in template_name:
category = '1-谈话函询模板/函询模板'
return category
def get_all_templates_with_fields():
"""
获取所有模板及其关联的输入和输出字段
Returns:
list: 模板列表每个模板包含字段信息
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 查询所有启用的模板
cursor.execute("""
SELECT
fc.id AS template_id,
fc.name AS template_name,
fc.file_path
FROM f_polic_file_config fc
WHERE fc.tenant_id = %s
AND fc.state = 1
ORDER BY fc.name
""", (TENANT_ID,))
templates = cursor.fetchall()
templates = [clean_query_result(t) for t in templates]
result = []
for template in templates:
template_id = template['template_id']
template_name = template['template_name']
file_path = template.get('file_path', '')
# 提取模板上级分类
template_category = extract_template_category(file_path, template_name)
# 查询该模板关联的输入字段
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.file_id = %s
AND fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 1
ORDER BY f.name
""", (template_id, TENANT_ID))
input_fields = cursor.fetchall()
input_fields = [clean_query_result(f) for f in input_fields]
# 查询该模板关联的输出字段
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.file_id = %s
AND fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 2
ORDER BY f.name
""", (template_id, TENANT_ID))
output_fields = cursor.fetchall()
output_fields = [clean_query_result(f) for f in output_fields]
# 格式化字段信息
input_fields_str = '; '.join([f"{f['field_name']}({f['field_code']})" for f in input_fields])
output_fields_str = '; '.join([f"{f['field_name']}({f['field_code']})" for f in output_fields])
result.append({
'template_id': template_id,
'template_name': template_name,
'template_category': template_category,
'input_fields': input_fields,
'output_fields': output_fields,
'input_fields_str': input_fields_str,
'output_fields_str': output_fields_str,
'input_field_count': len(input_fields),
'output_field_count': len(output_fields)
})
return result
finally:
cursor.close()
conn.close()
def create_excel_file(templates_data, output_file='template_fields_export.xlsx'):
"""
创建Excel文件
Args:
templates_data: 模板数据列表
output_file: 输出文件名
"""
wb = Workbook()
ws = wb.active
ws.title = "模板字段关系"
# 设置表头
headers = ['模板ID', '模板名称', '模板上级', '输入字段', '输出字段', '输入字段数量', '输出字段数量']
ws.append(headers)
# 设置表头样式
header_fill = PatternFill(start_color="366092", end_color="366092", fill_type="solid")
header_font = Font(bold=True, color="FFFFFF", size=11)
header_alignment = Alignment(horizontal="center", vertical="center", wrap_text=True)
border = Border(
left=Side(style='thin'),
right=Side(style='thin'),
top=Side(style='thin'),
bottom=Side(style='thin')
)
for col_num, header in enumerate(headers, 1):
cell = ws.cell(row=1, column=col_num)
cell.fill = header_fill
cell.font = header_font
cell.alignment = header_alignment
cell.border = border
# 填充数据
data_font = Font(size=10)
data_alignment = Alignment(horizontal="left", vertical="top", wrap_text=True)
for template in templates_data:
row = [
template['template_id'],
template['template_name'],
template['template_category'],
template['input_fields_str'],
template['output_fields_str'],
template['input_field_count'],
template['output_field_count']
]
ws.append(row)
# 设置数据行样式
for col_num in range(1, len(headers) + 1):
cell = ws.cell(row=ws.max_row, column=col_num)
cell.font = data_font
cell.alignment = data_alignment
cell.border = border
# 设置列宽
ws.column_dimensions['A'].width = 18 # 模板ID
ws.column_dimensions['B'].width = 40 # 模板名称
ws.column_dimensions['C'].width = 50 # 模板上级
ws.column_dimensions['D'].width = 60 # 输入字段
ws.column_dimensions['E'].width = 80 # 输出字段
ws.column_dimensions['F'].width = 15 # 输入字段数量
ws.column_dimensions['G'].width = 15 # 输出字段数量
# 设置行高
ws.row_dimensions[1].height = 30 # 表头行高
for row_num in range(2, ws.max_row + 1):
ws.row_dimensions[row_num].height = 60 # 数据行高
# 冻结首行
ws.freeze_panes = 'A2'
# 保存文件
wb.save(output_file)
print(f"Excel文件已生成: {output_file}")
print(f"共导出 {len(templates_data)} 个模板")
def main():
"""主函数"""
print("开始导出模板和字段关系...")
print("=" * 80)
try:
# 获取所有模板及其字段
templates_data = get_all_templates_with_fields()
if not templates_data:
print("未找到任何模板数据")
return
print(f"共找到 {len(templates_data)} 个模板")
# 生成Excel文件
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_file = f"template_fields_export_{timestamp}.xlsx"
create_excel_file(templates_data, output_file)
# 打印统计信息
print("\n统计信息:")
print(f" 模板总数: {len(templates_data)}")
total_input_fields = sum(t['input_field_count'] for t in templates_data)
total_output_fields = sum(t['output_field_count'] for t in templates_data)
print(f" 输入字段总数: {total_input_fields}")
print(f" 输出字段总数: {total_output_fields}")
# 打印前几个模板的信息
print("\n前5个模板预览:")
for i, template in enumerate(templates_data[:5], 1):
print(f"\n{i}. {template['template_name']}")
print(f" 上级: {template['template_category']}")
print(f" 输入字段: {template['input_field_count']}")
print(f" 输出字段: {template['output_field_count']}")
if len(templates_data) > 5:
print(f"\n... 还有 {len(templates_data) - 5} 个模板")
except Exception as e:
print(f"导出失败: {str(e)}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
main()
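The module docstring anticipates an import script built on this Excel file. The 输入字段/输出字段 cells are written as `'; '`-joined `名称(编码)` strings, so an importer would first need to split them back into pairs — a sketch, assuming the exact `f"{field_name}({field_code})"` format (ASCII parentheses) used in `get_all_templates_with_fields`:

```python
import re

# Matches one '名称(编码)' pair, assuming the ASCII parentheses produced by
# f"{field_name}({field_code})" when the export file was written.
FIELD_RE = re.compile(r'(?P<name>[^;]+?)\((?P<code>[^()]+)\)')


def parse_field_column(cell_value):
    """Split a '名称(编码); 名称(编码)' Excel cell into (name, code) tuples."""
    if not cell_value:
        return []
    pairs = []
    for part in cell_value.split('; '):
        m = FIELD_RE.fullmatch(part.strip())
        if m:
            pairs.append((m.group('name'), m.group('code')))
    return pairs
```

Field names containing parentheses would need a more careful pattern; this sketch covers the format the exporter currently emits.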


@@ -0,0 +1,158 @@
"""
最终完善模板层级结构
修复文件路径错误和重复问题
"""
import pymysql
import time
import random
from pathlib import Path
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def generate_id():
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 检查"1请示报告卡"的记录
# 根据目录结构,应该有两个不同的文件:
# 1. "1.初核请示"下的"1.请示报告卡XXX.docx"
# 2. "走读式谈话审批"下的"1.请示报告卡(初核谈话).docx"
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
        ORDER BY id
    """, (TENANT_ID, '1请示报告卡'))
    results = cursor.fetchall()

    # Check whether a record already exists under "1.初核请示"
    in_initial_request = any(r['parent_id'] == 1765431558933731 for r in results)
    # Check whether a record already exists under "走读式谈话审批"
    in_interview_approval = any(r['parent_id'] == 1765273962700431 for r in results)

    if not in_initial_request:
        # Create the record under "1.初核请示"
        new_id = generate_id()
        insert_sql = """
            INSERT INTO f_polic_file_config
            (id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
            VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
        """
        cursor.execute(insert_sql, (
            new_id,
            TENANT_ID,
            1765431558933731,  # 1.初核请示
            '1请示报告卡',
            None,
            '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx',
            CREATED_BY,
            CREATED_BY,
            1
        ))
        print(f"[CREATE] Created a '1请示报告卡' record under '1.初核请示' (ID: {new_id})")

    if not in_interview_approval:
        # Create the record under "走读式谈话审批"
        new_id = generate_id()
        insert_sql = """
            INSERT INTO f_polic_file_config
            (id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
            VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
        """
        cursor.execute(insert_sql, (
            new_id,
            TENANT_ID,
            1765273962700431,  # 走读式谈话审批
            '1请示报告卡',
            None,
            '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',
            CREATED_BY,
            CREATED_BY,
            1
        ))
        print(f"[CREATE] Created a '1请示报告卡' record under '走读式谈话审批' (ID: {new_id})")

    # Update the file paths of the existing records
    for result in results:
        if result['parent_id'] == 1765431558933731:  # 1.初核请示
            correct_path = '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx'
        elif result['parent_id'] == 1765273962700431:  # 走读式谈话审批
            correct_path = '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx'
        else:
            continue
        if result['file_path'] != correct_path:
            cursor.execute("""
                UPDATE f_polic_file_config
                SET file_path = %s, updated_time = NOW(), updated_by = %s
                WHERE tenant_id = %s AND id = %s
            """, (correct_path, UPDATED_BY, TENANT_ID, result['id']))
            print(f"[UPDATE] Fixed the file path of '1请示报告卡' (ID: {result['id']}): {result['file_path']} -> {correct_path}")

    # Check for duplicate "XXX初核情况报告" records
    cursor.execute("""
        SELECT id, name, file_path, parent_id
        FROM f_polic_file_config
        WHERE tenant_id = %s AND name LIKE %s
        ORDER BY id
    """, (TENANT_ID, '%XXX初核情况报告%'))
    results = cursor.fetchall()

    if len(results) > 1:
        # Keep the correct record and delete the rest. Per the directory
        # structure the source file is "8.XXX初核情况报告.docx", and the record
        # name should carry no numeric prefix.
        correct_name = 'XXX初核情况报告'
        correct_path = '/615873064429507639/TEMPLATE/2025/12/8.XXX初核情况报告.docx'
        for r in results:
            if r['name'] == '8.XXX初核情况报告':
                # This one should be deleted (its name carries a numeric prefix)
                cursor.execute("""
                    DELETE FROM f_polic_file_field
                    WHERE tenant_id = %s AND file_id = %s
                """, (TENANT_ID, r['id']))
                cursor.execute("""
                    DELETE FROM f_polic_file_config
                    WHERE tenant_id = %s AND id = %s
                """, (TENANT_ID, r['id']))
                print(f"[DELETE] Removed duplicate record: {r['name']} (ID: {r['id']})")
            elif r['name'] == 'XXX初核情况报告':
                # Update this record's file path
                if r['file_path'] != correct_path:
                    cursor.execute("""
                        UPDATE f_polic_file_config
                        SET file_path = %s, updated_time = NOW(), updated_by = %s
                        WHERE tenant_id = %s AND id = %s
                    """, (correct_path, UPDATED_BY, TENANT_ID, r['id']))
                    print(f"[UPDATE] Updated the file path of 'XXX初核情况报告': {r['file_path']} -> {correct_path}")

    conn.commit()
    print("\n[OK] Fix complete")
except Exception as e:
    conn.rollback()
    print(f"[ERROR] Fix failed: {e}")
    import traceback
    traceback.print_exc()
finally:
    cursor.close()
    conn.close()
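The per-parent path corrections above can be factored into a single lookup table. A minimal sketch — the helper name `correct_path_for` is hypothetical; the IDs and paths are the ones hard-coded in the script:

```python
# Expected template path for each parent directory ID (values from the script above).
PATH_BY_PARENT = {
    1765431558933731: '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx',       # 1.初核请示
    1765273962700431: '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',  # 走读式谈话审批
}

def correct_path_for(parent_id):
    """Return the expected file_path for a '1请示报告卡' record, or None for untracked parents."""
    return PATH_BY_PARENT.get(parent_id)
```

Centralizing the mapping keeps the insert branch and the update branch from drifting apart.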


@@ -0,0 +1,102 @@
"""
修复document_service.py中的tenant_id查询问题
问题get_file_config_by_id方法没有检查tenant_id导致查询可能失败
解决方案在查询中添加tenant_id检查
"""
import re
from pathlib import Path
def fix_document_service():
"""修复document_service.py中的查询逻辑"""
file_path = Path("services/document_service.py")
if not file_path.exists():
print(f"[错误] 文件不存在: {file_path}")
return False
# 读取文件
with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()
# 查找get_file_config_by_id方法
pattern = r'(def get_file_config_by_id\(self, file_id: int\) -> Optional\[Dict\]:.*?)(\s+sql = """.*?WHERE id = %s\s+AND state = 1\s+""".*?cursor\.execute\(sql, \(file_id,\)\))'
match = re.search(pattern, content, re.DOTALL)
if not match:
print("[错误] 未找到get_file_config_by_id方法或查询语句")
return False
old_code = match.group(0)
# 检查是否已经包含tenant_id
if 'tenant_id' in old_code:
print("[信息] 查询已经包含tenant_id检查无需修复")
return True
# 生成新的代码
new_sql = ''' sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE id = %s
AND tenant_id = %s
AND state = 1
"""
# 获取tenant_id从环境变量或请求中获取
tenant_id = self.tenant_id if self.tenant_id else os.getenv('TENANT_ID', '1')
try:
tenant_id = int(tenant_id)
except (ValueError, TypeError):
tenant_id = 1 # 默认值
cursor.execute(sql, (file_id, tenant_id))'''
# 替换
new_code = re.sub(
r'sql = """.*?WHERE id = %s\s+AND state = 1\s+""".*?cursor\.execute\(sql, \(file_id,\)\)',
new_sql,
old_code,
flags=re.DOTALL
)
new_content = content.replace(old_code, new_code)
# 检查是否需要导入os
if 'import os' not in new_content and 'os.getenv' in new_content:
# 在文件开头添加import os如果还没有
if 'from dotenv import load_dotenv' in new_content:
new_content = new_content.replace('from dotenv import load_dotenv', 'from dotenv import load_dotenv\nimport os')
elif 'import pymysql' in new_content:
new_content = new_content.replace('import pymysql', 'import pymysql\nimport os')
else:
# 在文件开头添加
lines = new_content.split('\n')
import_line = 0
for i, line in enumerate(lines):
if line.startswith('import ') or line.startswith('from '):
import_line = i + 1
lines.insert(import_line, 'import os')
new_content = '\n'.join(lines)
# 写回文件
with open(file_path, 'w', encoding='utf-8') as f:
f.write(new_content)
print("[成功] 已修复get_file_config_by_id方法添加了tenant_id检查")
return True
if __name__ == "__main__":
print("="*70)
print("修复document_service.py中的tenant_id查询问题")
print("="*70)
if fix_document_service():
print("\n修复完成!")
print("\n注意:")
print("1. 请确保.env文件中配置了TENANT_ID")
print("2. 或者确保应用程序在调用时正确传递tenant_id")
print("3. 建议在app.py中从请求中获取tenant_id并传递给document_service")
else:
print("\n修复失败,请手动检查代码")

176
fix_duplicate_fields.py Normal file

@@ -0,0 +1,176 @@
"""修复 f_polic_field 表中的重复字段"""
import pymysql
import os
from dotenv import load_dotenv
from collections import defaultdict
load_dotenv()
TENANT_ID = 615873064429507639
conn = pymysql.connect(
host=os.getenv('DB_HOST', '152.136.177.240'),
port=int(os.getenv('DB_PORT', 5012)),
user=os.getenv('DB_USER', 'finyx'),
password=os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
database=os.getenv('DB_NAME', 'finyx'),
charset='utf8mb4'
)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("=" * 80)
print("修复重复字段")
print("=" * 80)
# 1. 查找所有重复的 filed_code
cursor.execute("""
SELECT filed_code, COUNT(*) as cnt, GROUP_CONCAT(id ORDER BY id) as field_ids
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY filed_code
HAVING cnt > 1
""", (TENANT_ID,))
duplicate_codes = cursor.fetchall()
print(f"\n发现 {len(duplicate_codes)} 个重复的字段编码:\n")
for dup in duplicate_codes:
code = dup['filed_code']
field_ids = [int(x) for x in dup['field_ids'].split(',')]
print(f"\n处理字段编码: {code}")
print(f" 字段ID列表: {field_ids}")
# 获取每个字段的详细信息
placeholders = ','.join(['%s'] * len(field_ids))
cursor.execute(f"""
SELECT id, name, field_type, state
FROM f_polic_field
WHERE id IN ({placeholders})
ORDER BY id
""", field_ids)
fields = cursor.fetchall()
# 获取每个字段的关联关系
field_associations = {}
for field_id in field_ids:
cursor.execute("""
SELECT COUNT(*) as cnt, GROUP_CONCAT(file_id) as file_ids
FROM f_polic_file_field
WHERE filed_id = %s
""", (field_id,))
result = cursor.fetchone()
field_associations[field_id] = {
'count': result['cnt'] if result else 0,
'file_ids': result['file_ids'].split(',') if result and result['file_ids'] else []
}
print(f"\n 字段详情和关联关系:")
for field in fields:
assoc = field_associations[field['id']]
print(f" ID: {field['id']}, name: {field['name']}, "
f"field_type: {field['field_type']}, state: {field['state']}, "
f"关联模板数: {assoc['count']}")
# 选择保留的字段优先选择关联模板数最多的如果相同则选择ID较小的
fields_with_assoc = [(f, field_associations[f['id']]) for f in fields]
fields_with_assoc.sort(key=lambda x: (-x[1]['count'], x[0]['id']))
keep_field = fields_with_assoc[0][0]
remove_fields = [f for f, _ in fields_with_assoc[1:]]
print(f"\n 保留字段: ID={keep_field['id']}, name={keep_field['name']}, "
f"关联模板数={field_associations[keep_field['id']]['count']}")
print(f" 删除字段: {[f['id'] for f in remove_fields]}")
# 迁移关联关系:将删除字段的关联关系迁移到保留字段
for remove_field in remove_fields:
remove_id = remove_field['id']
keep_id = keep_field['id']
# 获取删除字段的所有关联
cursor.execute("""
SELECT file_id
FROM f_polic_file_field
WHERE filed_id = %s
""", (remove_id,))
remove_assocs = cursor.fetchall()
migrated_count = 0
skipped_count = 0
for assoc in remove_assocs:
file_id = assoc['file_id']
# 检查保留字段是否已经关联了这个文件
cursor.execute("""
SELECT COUNT(*) as cnt
FROM f_polic_file_field
WHERE filed_id = %s AND file_id = %s
""", (keep_id, file_id))
exists = cursor.fetchone()['cnt'] > 0
if not exists:
# 迁移关联关系
cursor.execute("""
UPDATE f_polic_file_field
SET filed_id = %s
WHERE filed_id = %s AND file_id = %s
""", (keep_id, remove_id, file_id))
migrated_count += 1
else:
# 如果已存在,直接删除重复的关联
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE filed_id = %s AND file_id = %s
""", (remove_id, file_id))
skipped_count += 1
print(f" 字段ID {remove_id} -> {keep_id}: 迁移 {migrated_count} 个关联, 跳过 {skipped_count} 个重复关联")
# 删除字段的所有关联关系(应该已经迁移或删除完毕)
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE filed_id = %s
""", (remove_id,))
# 删除字段本身
cursor.execute("""
DELETE FROM f_polic_field
WHERE id = %s
""", (remove_id,))
print(f" 已删除字段 ID {remove_id} 及其关联关系")
print("\n" + "=" * 80)
print("验证修复结果")
print("=" * 80)
# 再次检查是否还有重复
cursor.execute("""
SELECT filed_code, COUNT(*) as cnt
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY filed_code
HAVING cnt > 1
""", (TENANT_ID,))
remaining_duplicates = cursor.fetchall()
if remaining_duplicates:
print(f"\n警告:仍有 {len(remaining_duplicates)} 个重复的字段编码:")
for dup in remaining_duplicates:
print(f" {dup['filed_code']}: {dup['cnt']}")
else:
print("\n[OK] 所有重复字段已修复filed_code 现在唯一")
# 提交事务
conn.commit()
print("\n[OK] 所有更改已提交到数据库")
cursor.close()
conn.close()
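The keep/remove selection rule above — most template associations wins, ties go to the smaller ID — boils down to one sort key. A sketch with a hypothetical `pick_keeper` helper operating on `(field_id, association_count)` pairs:

```python
def pick_keeper(fields):
    """Split (field_id, assoc_count) pairs into the id to keep and the ids to remove.

    The field with the most associations wins; ties go to the smaller field id.
    """
    ranked = sorted(fields, key=lambda f: (-f[1], f[0]))
    return ranked[0][0], [f[0] for f in ranked[1:]]
```

Isolating the rule in a pure function makes it easy to verify before it is allowed to delete database rows.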


@@ -0,0 +1,131 @@
"""
修复重复的"1请示报告卡"记录
确保每个文件在正确的位置只有一个记录
"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 查找所有"1请示报告卡"记录
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
ORDER BY id
""", (TENANT_ID, '1请示报告卡'))
results = cursor.fetchall()
print(f"找到 {len(results)}'1请示报告卡'记录:\n")
# 根据file_path和parent_id判断哪些是正确的
correct_records = []
for r in results:
print(f"ID: {r['id']}, file_path: {r['file_path']}, parent_id: {r['parent_id']}")
# 判断是否正确
if r['parent_id'] == 1765431558933731: # 1.初核请示
if '1.请示报告卡XXX' in (r['file_path'] or ''):
correct_records.append(r)
elif r['parent_id'] == 1765273962700431: # 走读式谈话审批
if '1.请示报告卡(初核谈话)' in (r['file_path'] or ''):
correct_records.append(r)
print(f"\n正确的记录数: {len(correct_records)}")
# 删除不正确的记录
for r in results:
if r not in correct_records:
# 先删除关联关系
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
""", (TENANT_ID, r['id']))
# 删除模板记录
cursor.execute("""
DELETE FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, r['id']))
print(f"[DELETE] 删除不正确的记录: ID {r['id']}, file_path: {r['file_path']}, parent_id: {r['parent_id']}")
# 确保两个位置都有正确的记录
has_initial_request = any(r['parent_id'] == 1765431558933731 for r in correct_records)
has_interview_approval = any(r['parent_id'] == 1765273962700431 for r in correct_records)
if not has_initial_request:
# 创建"1.初核请示"下的记录
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
new_id = timestamp * 1000 + random_part
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
new_id,
TENANT_ID,
1765431558933731, # 1.初核请示
'1请示报告卡',
None,
'/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx',
655162080928945152,
655162080928945152,
1
))
print(f"[CREATE] 在'1.初核请示'下创建'1请示报告卡'记录 (ID: {new_id})")
if not has_interview_approval:
# 创建"走读式谈话审批"下的记录
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
new_id = timestamp * 1000 + random_part
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
new_id,
TENANT_ID,
1765273962700431, # 走读式谈话审批
'1请示报告卡',
None,
'/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',
655162080928945152,
655162080928945152,
1
))
print(f"[CREATE] 在'走读式谈话审批'下创建'1请示报告卡'记录 (ID: {new_id})")
conn.commit()
print("\n[OK] 修复完成")
except Exception as e:
conn.rollback()
print(f"[ERROR] 修复失败: {e}")
import traceback
traceback.print_exc()
finally:
cursor.close()
conn.close()
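The correctness test above — a record survives only if its path carries the marker expected under its parent — can be isolated into a predicate. `EXPECTED_MARKER` and `is_correct_record` are hypothetical names; the IDs and markers come from the script:

```python
EXPECTED_MARKER = {
    1765431558933731: '1.请示报告卡XXX',          # under 1.初核请示
    1765273962700431: '1.请示报告卡(初核谈话)',    # under 走读式谈话审批
}

def is_correct_record(parent_id, file_path):
    """True only when the path contains the marker expected under this parent."""
    marker = EXPECTED_MARKER.get(parent_id)
    return bool(marker) and marker in (file_path or '')
```

The `(file_path or '')` guard mirrors the script's handling of NULL paths.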

147
fix_isolated_template.py Normal file

@@ -0,0 +1,147 @@
"""
修复孤立的模板文件有路径但无父级
"""
import os
import pymysql
from pathlib import Path
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
UPDATED_BY = 655162080928945152
def get_actual_tenant_id(conn) -> int:
"""获取数据库中的实际tenant_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("SELECT DISTINCT tenant_id FROM f_polic_file_config LIMIT 1")
result = cursor.fetchone()
if result:
return result['tenant_id']
return 1
finally:
cursor.close()
def find_parent_directory(conn, tenant_id: int, file_path: str) -> int:
"""根据文件路径找到父目录ID"""
# 从文件路径中提取父目录路径
path_parts = file_path.split('/')
if len(path_parts) < 2:
return None
# 父目录路径(去掉文件名)
parent_path = '/'.join(path_parts[:-1])
parent_dir_name = path_parts[-2] # 父目录名称
# 查找父目录通过名称匹配且file_path为NULL
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name
FROM f_polic_file_config
WHERE tenant_id = %s
AND name = %s
AND file_path IS NULL
ORDER BY id
LIMIT 1
"""
cursor.execute(sql, (tenant_id, parent_dir_name))
result = cursor.fetchone()
if result:
return result['id']
return None
finally:
cursor.close()
def main():
"""主函数"""
print("="*70)
print("修复孤立的模板文件")
print("="*70)
try:
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功")
except Exception as e:
print(f"[FAIL] 数据库连接失败: {str(e)}")
return
try:
tenant_id = get_actual_tenant_id(conn)
print(f"实际tenant_id: {tenant_id}")
# 查找孤立的文件有路径但无父级且路径包含至少2级
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND file_path IS NOT NULL
AND parent_id IS NULL
AND file_path LIKE 'template_finish/%%/%%'
"""
cursor.execute(sql, (tenant_id,))
isolated_files = cursor.fetchall()
if not isolated_files:
print("[OK] 没有发现孤立的文件")
return
print(f"\n发现 {len(isolated_files)} 个孤立的文件:")
fixed_count = 0
for file in isolated_files:
print(f"\n 文件: {file['name']}")
print(f" ID: {file['id']}")
print(f" 路径: {file['file_path']}")
# 查找父目录
parent_id = find_parent_directory(conn, tenant_id, file['file_path'])
if parent_id:
# 更新parent_id
update_cursor = conn.cursor()
try:
update_cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (parent_id, UPDATED_BY, file['id'], tenant_id))
conn.commit()
print(f" [修复] 设置parent_id: {parent_id}")
fixed_count += 1
except Exception as e:
conn.rollback()
print(f" [错误] 更新失败: {str(e)}")
finally:
update_cursor.close()
else:
print(f" [警告] 未找到父目录")
print(f"\n[OK] 成功修复 {fixed_count} 个文件")
finally:
cursor.close()
finally:
conn.close()
print("[OK] 数据库连接已关闭")
if __name__ == "__main__":
main()
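`find_parent_directory` keys its database lookup on the second-to-last path component. That extraction step can be tested in isolation — `parent_dir_name` here is a hypothetical standalone version of it:

```python
def parent_dir_name(file_path):
    """Return the immediate parent directory name of a slash-separated path, or None."""
    parts = [p for p in file_path.split('/') if p]
    return parts[-2] if len(parts) >= 2 else None
```

Filtering out empty components also makes the helper tolerant of a leading slash, which the raw `split('/')` in the script is not.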

209
fix_minio_config.py Normal file

@@ -0,0 +1,209 @@
"""
修复MinIO配置
1. 创建或更新.env文件
2. 检查并迁移模板文件
"""
import os
from pathlib import Path
# 新MinIO配置
NEW_MINIO_CONFIG = {
'endpoint': '10.100.31.21:9000',
'access_key': 'minio_PC8dcY',
'secret_key': 'minio_7k7RNJ',
'secure': 'false', # 重要必须是false
'bucket': 'finyx'
}
def create_env_file():
"""创建或更新.env文件"""
env_file = Path('.env')
print("="*70)
print("创建/更新 .env 文件")
print("="*70)
# 读取现有.env文件如果存在
existing_vars = {}
if env_file.exists():
print(f"\n发现现有 .env 文件将更新MinIO相关配置...")
with open(env_file, 'r', encoding='utf-8') as f:
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, value = line.split('=', 1)
existing_vars[key.strip()] = value.strip()
else:
print(f"\n创建新的 .env 文件...")
# 更新MinIO配置
existing_vars['MINIO_ENDPOINT'] = NEW_MINIO_CONFIG['endpoint']
existing_vars['MINIO_ACCESS_KEY'] = NEW_MINIO_CONFIG['access_key']
existing_vars['MINIO_SECRET_KEY'] = NEW_MINIO_CONFIG['secret_key']
existing_vars['MINIO_BUCKET'] = NEW_MINIO_CONFIG['bucket']
existing_vars['MINIO_SECURE'] = NEW_MINIO_CONFIG['secure']
# 写入.env文件
with open(env_file, 'w', encoding='utf-8') as f:
f.write("# MinIO配置\n")
f.write(f"MINIO_ENDPOINT={NEW_MINIO_CONFIG['endpoint']}\n")
f.write(f"MINIO_ACCESS_KEY={NEW_MINIO_CONFIG['access_key']}\n")
f.write(f"MINIO_SECRET_KEY={NEW_MINIO_CONFIG['secret_key']}\n")
f.write(f"MINIO_BUCKET={NEW_MINIO_CONFIG['bucket']}\n")
f.write(f"MINIO_SECURE={NEW_MINIO_CONFIG['secure']} # 重要新服务器使用HTTP必须是false\n")
f.write("\n")
# 保留其他配置(如果有)
other_keys = set(existing_vars.keys()) - {
'MINIO_ENDPOINT', 'MINIO_ACCESS_KEY', 'MINIO_SECRET_KEY',
'MINIO_BUCKET', 'MINIO_SECURE'
}
if other_keys:
f.write("# 其他配置\n")
for key in sorted(other_keys):
f.write(f"{key}={existing_vars[key]}\n")
print(f"\n[OK] .env 文件已更新")
print(f"\n更新的配置:")
print(f" MINIO_ENDPOINT={NEW_MINIO_CONFIG['endpoint']}")
print(f" MINIO_ACCESS_KEY={NEW_MINIO_CONFIG['access_key']}")
print(f" MINIO_SECRET_KEY={NEW_MINIO_CONFIG['secret_key'][:8]}***")
print(f" MINIO_BUCKET={NEW_MINIO_CONFIG['bucket']}")
print(f" MINIO_SECURE={NEW_MINIO_CONFIG['secure']} # [IMPORTANT] 必须是false")
return True
def check_template_files():
"""检查模板文件是否存在"""
print("\n" + "="*70)
print("检查模板文件")
print("="*70)
try:
from minio import Minio
from minio.error import S3Error
import pymysql
from dotenv import load_dotenv
load_dotenv()
# 连接新MinIO
client = Minio(
NEW_MINIO_CONFIG['endpoint'],
access_key=NEW_MINIO_CONFIG['access_key'],
secret_key=NEW_MINIO_CONFIG['secret_key'],
secure=False
)
# 连接数据库
db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
conn = pymysql.connect(**db_config)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 查询所有模板
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
AND file_path IS NOT NULL
AND file_path != ''
"""
cursor.execute(sql, (615873064429507639,))
templates = cursor.fetchall()
print(f"\n数据库中找到 {len(templates)} 个模板文件")
missing_files = []
existing_files = []
for template in templates:
object_name = template['file_path'].lstrip('/')
try:
stat = client.stat_object(NEW_MINIO_CONFIG['bucket'], object_name)
existing_files.append(template)
print(f" [OK] {template['name']} - 存在 ({stat.size:,} 字节)")
except S3Error as e:
if e.code == 'NoSuchKey':
missing_files.append(template)
print(f" [FAIL] {template['name']} - 不存在")
print(f" 路径: {object_name}")
cursor.close()
conn.close()
print(f"\n总结:")
print(f" 存在的文件: {len(existing_files)}")
print(f" 缺失的文件: {len(missing_files)}")
if missing_files:
print(f"\n[WARN] 发现 {len(missing_files)} 个模板文件在新MinIO服务器上不存在")
print(f"\n需要执行以下操作之一:")
print(f" 1. 从旧MinIO服务器迁移这些文件到新服务器")
print(f" 2. 重新上传这些模板文件到新MinIO服务器")
print(f"\n缺失的文件列表:")
for template in missing_files:
print(f" - {template['name']}")
print(f" 路径: {template['file_path']}")
return len(missing_files) == 0
except Exception as e:
print(f"\n[ERROR] 检查模板文件时出错: {str(e)}")
import traceback
traceback.print_exc()
return False
def main():
"""主函数"""
print("\n" + "="*70)
print("MinIO配置修复工具")
print("="*70)
try:
# 1. 创建/更新.env文件
create_env_file()
# 2. 检查模板文件
all_files_exist = check_template_files()
# 总结
print("\n" + "="*70)
print("修复总结")
print("="*70)
print("\n[OK] .env 文件已更新")
if all_files_exist:
print("[OK] 所有模板文件都存在")
print("\n下一步:")
print(" 1. 重启应用服务以使新的环境变量生效")
print(" 2. 测试文档生成功能")
else:
print("[WARN] 部分模板文件缺失")
print("\n下一步:")
print(" 1. 迁移或上传缺失的模板文件到新MinIO服务器")
print(" 2. 重启应用服务以使新的环境变量生效")
print(" 3. 测试文档生成功能")
print("\n重要提示:")
print(" - MINIO_SECURE 必须设置为 false新服务器使用HTTP")
print(" - 更新环境变量后必须重启应用才能生效")
except Exception as e:
print(f"\n[ERROR] 修复过程中发生错误: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
main()
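The .env reader above keeps only `KEY=VALUE` lines and splits on the first `=`, so values containing `=` survive intact. As a pure function that loop is easy to unit-test — `parse_env_lines` is a hypothetical extraction of it:

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks and comments; split on the first '=' only."""
    env = {}
    for line in lines:
        line = line.strip()
        if line and not line.startswith('#') and '=' in line:
            key, value = line.split('=', 1)
            env[key.strip()] = value.strip()
    return env
```

Note that this simple parser (like the script's) does not strip inline `# …` comments, which is why the script's habit of appending a comment after `MINIO_SECURE=false` can leak into the value on re-read.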


@@ -0,0 +1,191 @@
"""
修复缺失的 target_education_level 字段
检查并创建被核查人员文化程度字段
"""
import pymysql
import os
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 字段定义
FIELD_DEFINITION = {
'name': '被核查人员文化程度',
'field_code': 'target_education_level',
'field_type': 2, # 输出字段
'description': '被核查人员文化程度(如:本科、大专、高中等)'
}
def generate_id():
"""生成ID使用时间戳+随机数的方式,模拟雪花算法)"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
def check_field_exists(conn):
"""检查字段是否存在"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s AND filed_code = %s
"""
cursor.execute(sql, (TENANT_ID, FIELD_DEFINITION['field_code']))
field = cursor.fetchone()
cursor.close()
return field
def create_field(conn, dry_run: bool = True):
"""创建字段"""
cursor = conn.cursor()
field_id = generate_id()
insert_sql = """
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
"""
if dry_run:
print(f"[DRY RUN] 将创建字段:")
print(f" ID: {field_id}")
print(f" 名称: {FIELD_DEFINITION['name']}")
print(f" 编码: {FIELD_DEFINITION['field_code']}")
print(f" 类型: {FIELD_DEFINITION['field_type']} (输出字段)")
print(f" 状态: 1 (启用)")
else:
cursor.execute(insert_sql, (
field_id,
TENANT_ID,
FIELD_DEFINITION['name'],
FIELD_DEFINITION['field_code'],
FIELD_DEFINITION['field_type'],
CURRENT_TIME,
CREATED_BY,
CURRENT_TIME,
UPDATED_BY,
1 # state: 1表示启用
))
conn.commit()
print(f"✓ 成功创建字段: {FIELD_DEFINITION['name']} ({FIELD_DEFINITION['field_code']}), ID: {field_id}")
cursor.close()
return field_id
def update_field_state(conn, field_id, dry_run: bool = True):
"""更新字段状态为启用"""
cursor = conn.cursor()
update_sql = """
UPDATE f_polic_field
SET state = 1, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
if dry_run:
print(f"[DRY RUN] 将更新字段状态为启用: ID={field_id}")
else:
cursor.execute(update_sql, (UPDATED_BY, field_id, TENANT_ID))
conn.commit()
print(f"✓ 成功更新字段状态为启用: ID={field_id}")
cursor.close()
def main(dry_run: bool = True):
"""主函数"""
print("="*80)
print("修复缺失的 target_education_level 字段")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
else:
print("\n[实际执行模式 - 将修改数据库]")
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
# 检查字段是否存在
print("1. 检查字段是否存在...")
existing_field = check_field_exists(conn)
if existing_field:
print(f" ✓ 字段已存在:")
print(f" ID: {existing_field['id']}")
print(f" 名称: {existing_field['name']}")
print(f" 编码: {existing_field['filed_code']}")
print(f" 类型: {existing_field['field_type']} ({'输出字段' if existing_field['field_type'] == 2 else '输入字段'})")
print(f" 状态: {existing_field['state']} ({'启用' if existing_field['state'] == 1 else '未启用'})")
# 如果字段存在但未启用,启用它
if existing_field['state'] != 1:
print(f"\n2. 字段存在但未启用,将更新状态...")
update_field_state(conn, existing_field['id'], dry_run=dry_run)
else:
print(f"\n✓ 字段已存在且已启用,无需操作")
else:
print(f" ✗ 字段不存在,需要创建")
print(f"\n2. 创建字段...")
field_id = create_field(conn, dry_run=dry_run)
if not dry_run:
print(f"\n✓ 字段创建完成")
print("\n" + "="*80)
if dry_run:
print("\n这是DRY RUN模式未实际修改数据库。")
print("要实际执行,请运行: python fix_missing_education_level_field.py --execute")
else:
print("\n✓ 字段修复完成")
except Exception as e:
print(f"\n✗ 发生错误: {e}")
import traceback
traceback.print_exc()
if not dry_run:
conn.rollback()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
import sys
dry_run = '--execute' not in sys.argv
if not dry_run:
print("\n⚠ 警告: 这将修改数据库!")
response = input("确认要继续吗? (yes/no): ")
if response.lower() != 'yes':
print("操作已取消")
sys.exit(0)
main(dry_run=dry_run)
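The `generate_id` helper shared by these scripts is timestamp-based but, unlike a real snowflake ID, has no machine or sequence bits: the 6-digit random part spans roughly a thousand adjacent millisecond slots, so two calls can collide. A sketch of the same scheme with that caveat spelled out:

```python
import time
import random

def generate_id():
    """Millisecond timestamp * 1000 plus a 6-digit random part.

    Caveat: random.randint(100000, 999999) exceeds the 1000-wide
    per-millisecond slot, so uniqueness is probabilistic, not guaranteed
    (a true snowflake ID reserves dedicated sequence bits instead).
    """
    timestamp = int(time.time() * 1000)
    return timestamp * 1000 + random.randint(100000, 999999)
```

For one-off repair scripts the collision risk is acceptable; anything running at scale should use the database's own ID generator instead.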


@@ -0,0 +1,260 @@
"""
修复缺少字段关联的模板
为有 template_code 但没有字段关联的文件节点补充字段关联
"""
import os
import json
import pymysql
from typing import Dict, List
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def generate_id():
"""生成ID"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
def get_templates_without_relations(conn):
"""获取没有字段关联的文件节点"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT
fc.id,
fc.name,
fc.template_code,
fc.input_data,
COUNT(ff.id) as relation_count
FROM f_polic_file_config fc
LEFT JOIN f_polic_file_field ff ON fc.id = ff.file_id AND ff.tenant_id = fc.tenant_id
WHERE fc.tenant_id = %s
AND fc.template_code IS NOT NULL
AND fc.template_code != ''
GROUP BY fc.id, fc.name, fc.template_code, fc.input_data
HAVING relation_count = 0
ORDER BY fc.name
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
cursor.close()
return templates
def get_fields_by_code(conn):
"""获取所有字段,按字段编码索引"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {
'by_code': {},
'by_name': {}
}
for field in fields:
field_code = field['filed_code']
field_name = field['name']
result['by_code'][field_code] = field
result['by_name'][field_name] = field
cursor.close()
return result
def extract_fields_from_input_data(input_data: str) -> List[str]:
"""从 input_data 中提取字段编码列表"""
try:
data = json.loads(input_data) if isinstance(input_data, str) else input_data
if isinstance(data, dict):
return data.get('input_fields', [])
except:
pass
return []
def create_field_relations(conn, file_id: int, field_codes: List[str], field_type: int,
db_fields: Dict, dry_run: bool = True):
"""创建字段关联关系"""
cursor = conn.cursor()
try:
created_count = 0
for field_code in field_codes:
field = db_fields['by_code'].get(field_code)
if not field:
print(f" ⚠ 字段不存在: {field_code}")
continue
if field['field_type'] != field_type:
print(f" ⚠ 字段类型不匹配: {field_code} (期望 {field_type}, 实际 {field['field_type']})")
continue
if not dry_run:
# 检查是否已存在
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, file_id, field['id']))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, file_id, field['id'],
CREATED_BY, UPDATED_BY, 1
))
created_count += 1
print(f" ✓ 创建关联: {field['name']} ({field_code})")
else:
created_count += 1
print(f" [模拟] 将创建关联: {field_code}")
if not dry_run:
conn.commit()
return created_count
finally:
cursor.close()
def main():
"""主函数"""
print("="*80)
print("修复缺少字段关联的模板")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 获取没有字段关联的模板
print("查找缺少字段关联的模板...")
templates = get_templates_without_relations(conn)
print(f" 找到 {len(templates)} 个缺少字段关联的文件节点\n")
if not templates:
print("✓ 所有文件节点都有字段关联,无需修复")
return
# 获取所有字段
print("获取字段定义...")
db_fields = get_fields_by_code(conn)
print(f" 找到 {len(db_fields['by_code'])} 个字段\n")
# 显示需要修复的模板
print("需要修复的模板:")
for template in templates:
print(f" - {template['name']} (code: {template['template_code']})")
# 尝试从 input_data 中提取字段
print("\n" + "="*80)
print("分析并修复")
print("="*80)
fixable_count = 0
unfixable_count = 0
for template in templates:
print(f"\n处理: {template['name']}")
print(f" template_code: {template['template_code']}")
input_data = template.get('input_data')
if not input_data:
print(" ⚠ 没有 input_data无法自动修复")
unfixable_count += 1
continue
# 从 input_data 中提取输入字段
input_fields = extract_fields_from_input_data(input_data)
if not input_fields:
print(" ⚠ input_data 中没有 input_fields无法自动修复")
unfixable_count += 1
continue
print(f" 找到 {len(input_fields)} 个输入字段")
fixable_count += 1
# 创建输入字段关联
print(" 创建输入字段关联...")
created = create_field_relations(conn, template['id'], input_fields, 1, db_fields, dry_run=True)
print(f" 将创建 {created} 个输入字段关联")
print("\n" + "="*80)
print("统计")
print("="*80)
print(f" 可修复: {fixable_count}")
print(f" 无法自动修复: {unfixable_count}")
# 询问是否执行
if fixable_count > 0:
print("\n" + "="*80)
response = input("\n是否执行修复?(yes/no默认no): ").strip().lower()
if response == 'yes':
print("\n执行修复...")
for template in templates:
input_data = template.get('input_data')
if not input_data:
continue
input_fields = extract_fields_from_input_data(input_data)
if not input_fields:
continue
print(f"\n修复: {template['name']}")
create_field_relations(conn, template['id'], input_fields, 1, db_fields, dry_run=False)
print("\n" + "="*80)
print("✓ 修复完成!")
print("="*80)
else:
print("\n已取消修复")
else:
print("\n没有可以自动修复的模板")
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -0,0 +1,201 @@
"""
只修复真正包含中文的field_code字段
"""
import os
import pymysql
import re
from typing import Dict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
# 字段名称到field_code的映射针对剩余的中文字段
FIELD_MAPPING = {
# 谈话相关字段
'拟谈话地点': 'proposed_interview_location',
'拟谈话时间': 'proposed_interview_time',
'谈话事由': 'interview_reason',
'谈话人': 'interviewer',
'谈话人员-安全员': 'interview_personnel_safety_officer',
'谈话人员-组长': 'interview_personnel_leader',
'谈话人员-谈话人员': 'interview_personnel',
'谈话前安全风险评估结果': 'pre_interview_risk_assessment_result',
'谈话地点': 'interview_location',
'谈话次数': 'interview_count',
# 被核查人员相关字段
'被核查人单位及职务': 'target_organization_and_position', # 注意:这个和"被核查人员单位及职务"应该是同一个
'被核查人员交代问题程度': 'target_confession_level',
'被核查人员减压后的表现': 'target_behavior_after_relief',
'被核查人员学历': 'target_education', # 注意:这个和"被核查人员文化程度"可能不同
'被核查人员工作履历': 'target_work_history',
'被核查人员思想负担程度': 'target_mental_burden_level',
'被核查人员职业': 'target_occupation',
'被核查人员谈话中的表现': 'target_behavior_during_interview',
'被核查人员问题严重程度': 'target_issue_severity_level',
'被核查人员风险等级': 'target_risk_level',
'被核查人基本情况': 'target_basic_info',
# 其他字段
'补空人员': 'backup_personnel',
'记录人': 'recorder',
'评估意见': 'assessment_opinion',
}
def is_chinese(text: str) -> bool:
"""判断字符串是否完全或主要包含中文字符"""
if not text:
return False
# 如果包含中文字符且中文字符占比超过50%,认为是中文
chinese_chars = len(re.findall(r'[\u4e00-\u9fff]', text))
total_chars = len(text)
if total_chars == 0:
return False
return chinese_chars / total_chars > 0.3 # 如果中文字符占比超过30%,认为是中文
def fix_chinese_fields(dry_run: bool = True):
"""修复包含中文的field_code字段"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("修复包含中文的field_code字段")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 查询所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name
""", (TENANT_ID,))
all_fields = cursor.fetchall()
# 找出field_code包含中文的字段
chinese_fields = []
for field in all_fields:
if field['filed_code'] and is_chinese(field['filed_code']):
chinese_fields.append(field)
print(f"\n找到 {len(chinese_fields)} 个field_code包含中文的字段:\n")
updates = []
for field in chinese_fields:
field_name = field['name']
new_code = FIELD_MAPPING.get(field_name)
if not new_code:
# 如果没有映射生成一个基于名称的code
new_code = field_name.lower()
new_code = new_code.replace('被核查人员', 'target_').replace('被核查人', 'target_')
new_code = new_code.replace('谈话', 'interview_')
new_code = new_code.replace('人员', '')
new_code = new_code.replace('时间', '_time')
new_code = new_code.replace('地点', '_location')
new_code = new_code.replace('问题', '_issue')
new_code = new_code.replace('情况', '_situation')
new_code = new_code.replace('程度', '_level')
new_code = new_code.replace('表现', '_behavior')
new_code = new_code.replace('等级', '_level')
new_code = new_code.replace('履历', '_history')
new_code = new_code.replace('学历', '_education')
new_code = new_code.replace('职业', '_occupation')
new_code = new_code.replace('事由', '_reason')
new_code = new_code.replace('次数', '_count')
new_code = new_code.replace('结果', '_result')
new_code = new_code.replace('意见', '_opinion')
new_code = re.sub(r'[^\w]', '_', new_code)
new_code = re.sub(r'_+', '_', new_code).strip('_')
new_code = new_code.replace('__', '_')
updates.append({
'id': field['id'],
'name': field_name,
'old_code': field['filed_code'],
'new_code': new_code,
'field_type': field['field_type']
})
print(f" ID: {field['id']}")
print(f" 名称: {field_name}")
print(f" 当前field_code: {field['filed_code']}")
print(f" 新field_code: {new_code}")
print()
# 检查是否有重复的new_code
code_to_fields = {}
for update in updates:
code = update['new_code']
if code not in code_to_fields:
code_to_fields[code] = []
code_to_fields[code].append(update)
duplicate_codes = {code: fields_list for code, fields_list in code_to_fields.items()
if len(fields_list) > 1}
if duplicate_codes:
print("\n⚠ 警告以下field_code会重复:")
for code, fields_list in duplicate_codes.items():
print(f" field_code: {code}")
for field in fields_list:
print(f" - ID: {field['id']}, 名称: {field['name']}")
print()
# 执行更新
if not dry_run:
print("开始执行更新...\n")
for update in updates:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s
""", (update['new_code'], UPDATED_BY, update['id']))
print(f" ✓ 更新字段 ID {update['id']}: {update['name']}")
print(f" {update['old_code']} -> {update['new_code']}")
conn.commit()
print("\n✓ 更新完成")
else:
print("[DRY RUN] 以上操作不会实际执行")
cursor.close()
conn.close()
return updates
if __name__ == '__main__':
print("是否执行修复?")
print("1. DRY RUN不实际修改数据库")
print("2. 直接执行修复(会修改数据库)")
choice = input("\n请选择 (1/2默认1): ").strip() or "1"
if choice == "2":
print("\n执行实际修复...")
fix_chinese_fields(dry_run=False)
else:
print("\n执行DRY RUN...")
updates = fix_chinese_fields(dry_run=True)
if updates:
confirm = input("\nDRY RUN完成。是否执行实际修复(y/n默认n): ").strip().lower()
if confirm == 'y':
print("\n执行实际修复...")
fix_chinese_fields(dry_run=False)
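
The 30% heuristic in `is_chinese` above can be exercised on its own; a minimal standalone sketch (reimplemented here rather than imported, since the script is a one-off):

```python
import re

def is_chinese(text: str) -> bool:
    # Treat text as Chinese when CJK characters exceed 30% of its length
    if not text:
        return False
    chinese_chars = len(re.findall(r'[\u4e00-\u9fff]', text))
    return chinese_chars / len(text) > 0.3

print(is_chinese("谈话人"))          # 3/3 characters are CJK
print(is_chinese("interview_time"))  # no CJK characters
print(is_chinese("target_学历"))     # 2 of 9 characters, below the threshold
```

Note that a half-converted code like `target_学历` slips under the 30% bar, which is presumably why the follow-up script switches to an any-CJK check.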


@@ -0,0 +1,191 @@
"""
修复剩余的中文field_code字段
为这些字段生成合适的英文field_code
"""
import os
import pymysql
import re
from typing import Dict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
# Mapping from field name to field_code (for the remaining Chinese fields)
FIELD_MAPPING = {
# 谈话相关字段
'拟谈话地点': 'proposed_interview_location',
'拟谈话时间': 'proposed_interview_time',
'谈话事由': 'interview_reason',
'谈话人': 'interviewer',
'谈话人员-安全员': 'interview_personnel_safety_officer',
'谈话人员-组长': 'interview_personnel_leader',
'谈话人员-谈话人员': 'interview_personnel',
'谈话前安全风险评估结果': 'pre_interview_risk_assessment_result',
'谈话地点': 'interview_location',
'谈话次数': 'interview_count',
# 被核查人员相关字段
'被核查人单位及职务': 'target_organization_and_position', # note: presumably the same field as "被核查人员单位及职务"
'被核查人员交代问题程度': 'target_confession_level',
'被核查人员减压后的表现': 'target_behavior_after_relief',
'被核查人员学历': 'target_education', # note: may differ from "被核查人员文化程度"
'被核查人员工作履历': 'target_work_history',
'被核查人员思想负担程度': 'target_mental_burden_level',
'被核查人员职业': 'target_occupation',
'被核查人员谈话中的表现': 'target_behavior_during_interview',
'被核查人员问题严重程度': 'target_issue_severity_level',
'被核查人员风险等级': 'target_risk_level',
'被核查人基本情况': 'target_basic_info',
# 其他字段
'补空人员': 'backup_personnel',
'记录人': 'recorder',
'评估意见': 'assessment_opinion',
}
def is_chinese(text: str) -> bool:
"""判断字符串是否包含中文字符"""
if not text:
return False
return bool(re.search(r'[\u4e00-\u9fff]', text))
def fix_remaining_fields(dry_run: bool = True):
"""修复剩余的中文field_code字段"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("修复剩余的中文field_code字段")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# Fetch all fields and filter client-side: MySQL's REGEXP has no \uXXXX escape
# syntax, so a '[\u4e00-\u9fff]' pattern would silently match the wrong characters
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name
""", (TENANT_ID,))
fields = [f for f in cursor.fetchall() if f['filed_code'] and is_chinese(f['filed_code'])]
print(f"\n找到 {len(fields)} 个需要修复的字段:\n")
updates = []
for field in fields:
field_name = field['name']
new_code = FIELD_MAPPING.get(field_name)
if not new_code:
# No mapping entry: derive a code from the name itself
new_code = field_name.lower()
new_code = new_code.replace('被核查人员', 'target_').replace('被核查人', 'target_')
new_code = new_code.replace('谈话', 'interview_')
new_code = new_code.replace('人员', '')
new_code = new_code.replace('时间', '_time')
new_code = new_code.replace('地点', '_location')
new_code = new_code.replace('问题', '_issue')
new_code = new_code.replace('情况', '_situation')
new_code = new_code.replace('程度', '_level')
new_code = new_code.replace('表现', '_behavior')
new_code = new_code.replace('等级', '_level')
new_code = new_code.replace('履历', '_history')
new_code = new_code.replace('学历', '_education')
new_code = new_code.replace('职业', '_occupation')
new_code = new_code.replace('事由', '_reason')
new_code = new_code.replace('次数', '_count')
new_code = new_code.replace('结果', '_result')
new_code = new_code.replace('意见', '_opinion')
new_code = re.sub(r'[^\w]', '_', new_code)
new_code = re.sub(r'_+', '_', new_code).strip('_')
new_code = new_code.replace('__', '_')
updates.append({
'id': field['id'],
'name': field_name,
'old_code': field['filed_code'],
'new_code': new_code,
'field_type': field['field_type']
})
print(f" ID: {field['id']}")
print(f" 名称: {field_name}")
print(f" 当前field_code: {field['filed_code']}")
print(f" 新field_code: {new_code}")
print()
# 检查是否有重复的new_code
code_to_fields = {}
for update in updates:
code = update['new_code']
if code not in code_to_fields:
code_to_fields[code] = []
code_to_fields[code].append(update)
duplicate_codes = {code: fields_list for code, fields_list in code_to_fields.items()
if len(fields_list) > 1}
if duplicate_codes:
print("\n⚠ 警告以下field_code会重复:")
for code, fields_list in duplicate_codes.items():
print(f" field_code: {code}")
for field in fields_list:
print(f" - ID: {field['id']}, 名称: {field['name']}")
print()
# 执行更新
if not dry_run:
print("开始执行更新...\n")
for update in updates:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s
""", (update['new_code'], UPDATED_BY, update['id']))
print(f" ✓ 更新字段 ID {update['id']}: {update['name']}")
print(f" {update['old_code']} -> {update['new_code']}")
conn.commit()
print("\n✓ 更新完成")
else:
print("[DRY RUN] 以上操作不会实际执行")
cursor.close()
conn.close()
return updates
if __name__ == '__main__':
print("是否执行修复?")
print("1. DRY RUN不实际修改数据库")
print("2. 直接执行修复(会修改数据库)")
choice = input("\n请选择 (1/2默认1): ").strip() or "1"
if choice == "2":
print("\n执行实际修复...")
fix_remaining_fields(dry_run=False)
else:
print("\n执行DRY RUN...")
updates = fix_remaining_fields(dry_run=True)
if updates:
confirm = input("\nDRY RUN完成。是否执行实际修复(y/n默认n): ").strip().lower()
if confirm == 'y':
print("\n执行实际修复...")
fix_remaining_fields(dry_run=False)
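
The unmapped-name fallback above builds a code from substring replacements plus a `[^\w]` cleanup. One subtlety worth knowing: Python's `\w` is Unicode-aware, so any CJK character not covered by a replacement rule survives the cleanup. A condensed sketch of the chain (only a few of the replacement rules shown):

```python
import re

def fallback_code(field_name: str) -> str:
    code = field_name.lower()
    code = code.replace('谈话', 'interview_')
    code = code.replace('时间', '_time').replace('地点', '_location')
    code = re.sub(r'[^\w]', '_', code)          # \w matches CJK too, so unmapped 拟 etc. survive
    return re.sub(r'_+', '_', code).strip('_')  # collapse underscore runs, trim the ends

print(fallback_code('谈话时间'))    # every segment has a rule
print(fallback_code('拟谈话地点'))  # the leading 拟 has no rule and stays in the code
```

So the generated code can itself still contain Chinese, which is worth checking (e.g. by re-running `is_chinese`) before writing it back.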


@@ -0,0 +1,61 @@
"""
修复剩余的层级结构问题
"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor()
try:
# 1. 修复"2保密承诺书"的parent_id应该在"走读式谈话流程"下)
# "走读式谈话流程"的ID是 1765273962716807
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s AND id = %s
""", (1765273962716807, UPDATED_BY, TENANT_ID, 1765425919729046))
print(f"[UPDATE] 更新'2保密承诺书'的parent_id: {cursor.rowcount}")
# 2. 检查"8.XXX初核情况报告"的位置(应该在"3.初核结论"下,而不是"走读式谈话流程"下)
# "3.初核结论"的ID是 1765431559135346
# 先查找"8.XXX初核情况报告"的ID
cursor.execute("""
SELECT id, name, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name LIKE %s
""", (TENANT_ID, '%XXX初核情况报告%'))
result = cursor.fetchone()
if result:
file_id, file_name, current_parent = result
if current_parent != 1765431559135346:
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s AND id = %s
""", (1765431559135346, UPDATED_BY, TENANT_ID, file_id))
print(f"[UPDATE] 更新'{file_name}'的parent_id: {cursor.rowcount}")
conn.commit()
print("\n[OK] 修复完成")
except Exception as e:
conn.rollback()
print(f"[ERROR] 修复失败: {e}")
import traceback
traceback.print_exc()
finally:
cursor.close()
conn.close()
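
The commit-on-success / rollback-on-error shape used above is the standard pattern for one-off migrations like this; a self-contained sketch against an in-memory SQLite table (illustrative schema, not the production f_polic_file_config):

```python
import sqlite3

def reparent(conn: sqlite3.Connection, child_id: int, new_parent_id: int) -> int:
    """Move one node under a new parent; commit on success, roll back on error."""
    cursor = conn.cursor()
    try:
        cursor.execute(
            "UPDATE file_config SET parent_id = ? WHERE id = ?",
            (new_parent_id, child_id),
        )
        conn.commit()
        return cursor.rowcount
    except Exception:
        conn.rollback()  # undo any partial change before re-raising
        raise
    finally:
        cursor.close()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_config (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany("INSERT INTO file_config VALUES (?, ?)", [(1, None), (2, None)])
print(reparent(conn, child_id=2, new_parent_id=1))  # one row updated
```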


@@ -0,0 +1,272 @@
"""
修复"1.请示报告卡(初核谈话)"模板的input_data字段
分析模板占位符根据数据库字段对应关系生成input_data并更新数据库
"""
import pymysql
import json
import os
import re
from datetime import datetime
from pathlib import Path
from docx import Document
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 模板信息
TEMPLATE_NAME = "1.请示报告卡(初核谈话)"
TEMPLATE_CODE = "REPORT_CARD_INTERVIEW"
BUSINESS_TYPE = "INVESTIGATION"
TEMPLATE_FILE_PATH = "template_finish/2-初核模版/2.谈话审批/走读式谈话审批/1.请示报告卡(初核谈话).docx"
def extract_placeholders_from_docx(file_path):
"""从docx文件中提取所有占位符"""
placeholders = set()
pattern = r'\{\{([^}]+)\}\}'
try:
doc = Document(file_path)
# 从段落中提取占位符
for paragraph in doc.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
# 从表格中提取占位符
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
except Exception as e:
print(f" 错误: 读取文件失败 - {str(e)}")
return []
return sorted(list(placeholders))
def get_template_config(conn):
"""查询模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
"""
cursor.execute(sql, (TENANT_ID, TEMPLATE_NAME))
config = cursor.fetchone()
cursor.close()
return config
def get_template_fields(conn, file_config_id):
"""查询模板关联的字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT f.id, f.name, f.filed_code as field_code, f.field_type
FROM f_polic_field f
INNER JOIN f_polic_file_field ff ON f.id = ff.filed_id
WHERE ff.file_id = %s
AND f.tenant_id = %s
ORDER BY f.field_type, f.name
"""
cursor.execute(sql, (file_config_id, TENANT_ID))
fields = cursor.fetchall()
cursor.close()
return fields
def verify_placeholders_in_database(conn, placeholders):
"""验证占位符是否在数据库中存在对应的字段"""
if not placeholders:
return {}
cursor = conn.cursor(pymysql.cursors.DictCursor)
placeholders_list = list(placeholders)
placeholders_str = ','.join(['%s'] * len(placeholders_list))
# 查询所有字段(包括未启用的)
sql = f"""
SELECT id, name, filed_code as field_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
AND filed_code IN ({placeholders_str})
"""
cursor.execute(sql, [TENANT_ID] + placeholders_list)
fields = cursor.fetchall()
cursor.close()
# 构建字段映射
field_map = {f['field_code']: f for f in fields}
# 检查缺失的字段
missing_fields = set(placeholders) - set(field_map.keys())
return {
'found_fields': field_map,
'missing_fields': missing_fields
}
def update_input_data(conn, file_config_id, input_data):
"""更新input_data字段"""
cursor = conn.cursor()
input_data_str = json.dumps(input_data, ensure_ascii=False)
update_sql = """
UPDATE f_polic_file_config
SET input_data = %s, updated_time = %s, updated_by = %s
WHERE id = %s
"""
cursor.execute(update_sql, (input_data_str, CURRENT_TIME, UPDATED_BY, file_config_id))
conn.commit()
cursor.close()
def main():
"""主函数"""
print("="*80)
print("修复'1.请示报告卡(初核谈话)'模板的input_data字段")
print("="*80)
print()
# 1. 检查模板文件是否存在
template_path = Path(TEMPLATE_FILE_PATH)
if not template_path.exists():
print(f"✗ 错误: 模板文件不存在 - {TEMPLATE_FILE_PATH}")
return
print(f"✓ 找到模板文件: {TEMPLATE_FILE_PATH}")
# 2. 提取占位符
print("\n正在提取占位符...")
placeholders = extract_placeholders_from_docx(str(template_path))
print(f"✓ 找到 {len(placeholders)} 个占位符:")
for i, placeholder in enumerate(placeholders, 1):
print(f" {i}. {{{{ {placeholder} }}}}")
# 3. 连接数据库
print("\n正在连接数据库...")
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功")
except Exception as e:
print(f"✗ 数据库连接失败: {str(e)}")
return
try:
# 4. 查询模板配置
print(f"\n正在查询模板配置: {TEMPLATE_NAME}")
config = get_template_config(conn)
if not config:
print(f"✗ 未找到模板配置: {TEMPLATE_NAME}")
return
print(f"✓ 找到模板配置:")
print(f" ID: {config['id']}")
print(f" 名称: {config['name']}")
print(f" 当前template_code: {config.get('template_code', 'NULL')}")
print(f" 当前input_data: {config.get('input_data', 'NULL')}")
print(f" 文件路径: {config.get('file_path', 'NULL')}")
print(f" 状态: {config.get('state', 0)}")
file_config_id = config['id']
# 5. 查询模板关联的字段
print(f"\n正在查询模板关联的字段...")
template_fields = get_template_fields(conn, file_config_id)
print(f"✓ 找到 {len(template_fields)} 个关联字段:")
for field in template_fields:
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
print(f" - {field['name']} ({field['field_code']}) [{field_type_str}]")
# 6. 验证占位符是否在数据库中存在
print(f"\n正在验证占位符...")
verification = verify_placeholders_in_database(conn, placeholders)
found_fields = verification['found_fields']
missing_fields = verification['missing_fields']
print(f"✓ 在数据库中找到 {len(found_fields)} 个字段:")
for field_code, field in found_fields.items():
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
state_str = "启用" if field.get('state', 0) == 1 else "未启用"
print(f" - {field['name']} ({field_code}) [{field_type_str}] [状态: {state_str}]")
if missing_fields:
print(f"\n⚠ 警告: 以下占位符在数据库中未找到对应字段:")
for field_code in missing_fields:
print(f" - {field_code}")
print("\n这些占位符仍会被包含在input_data中但可能无法正确填充。")
# 7. 生成input_data
print(f"\n正在生成input_data...")
input_data = {
'template_code': TEMPLATE_CODE,
'business_type': BUSINESS_TYPE,
'placeholders': placeholders
}
print(f"✓ input_data内容:")
print(json.dumps(input_data, ensure_ascii=False, indent=2))
# 8. 更新数据库
print(f"\n正在更新数据库...")
update_input_data(conn, file_config_id, input_data)
print(f"✓ 更新成功!")
# 9. 验证更新结果
print(f"\n正在验证更新结果...")
updated_config = get_template_config(conn)
if updated_config:
try:
updated_input_data = json.loads(updated_config['input_data'])
if updated_input_data.get('template_code') == TEMPLATE_CODE:
print(f"✓ 验证成功: template_code = {TEMPLATE_CODE}")
if updated_input_data.get('business_type') == BUSINESS_TYPE:
print(f"✓ 验证成功: business_type = {BUSINESS_TYPE}")
if set(updated_input_data.get('placeholders', [])) == set(placeholders):
print(f"✓ 验证成功: placeholders 匹配")
except Exception as e:
print(f"⚠ 验证时出错: {str(e)}")
print("\n" + "="*80)
print("修复完成!")
print("="*80)
except Exception as e:
print(f"\n✗ 处理失败: {str(e)}")
import traceback
traceback.print_exc()
finally:
conn.close()
if __name__ == '__main__':
main()
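
The `{{placeholder}}` extraction above runs one regex over `paragraph.text` (which concatenates a paragraph's runs, so placeholders split across runs are still seen whole). The pattern itself can be checked without python-docx:

```python
import re

PATTERN = r'\{\{([^}]+)\}\}'  # same pattern as extract_placeholders_from_docx

text = "谈话人:{{ interviewer }} 时间:{{interview_time}}"
placeholders = sorted({m.strip() for m in re.findall(PATTERN, text)})
print(placeholders)
```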

fix_template_names.py Normal file

@@ -0,0 +1,234 @@
"""
检查并修复 f_polic_file_config 表中模板名称与文件名的对应关系
确保 name 字段与模板文档名称去掉扩展名完全一致
"""
import os
import sys
import pymysql
from pathlib import Path
from typing import Dict, List, Optional
# Force stdout/stderr to UTF-8 (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
# 数据库连接配置
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
TEMPLATE_BASE_DIR = 'template_finish'
def scan_template_files(base_dir: str) -> Dict[str, str]:
"""
扫描模板文件夹获取所有模板文件信息
Returns:
字典key为MinIO路径用于匹配value为文件名不含扩展名
"""
base_path = Path(base_dir)
if not base_path.exists():
print(f"错误: 目录不存在 - {base_dir}")
return {}
templates = {}
print("=" * 80)
print("扫描模板文件...")
print("=" * 80)
for docx_file in sorted(base_path.rglob("*.docx")):
# 跳过临时文件
if docx_file.name.startswith("~$"):
continue
# 获取文件名(不含扩展名)
file_name_without_ext = docx_file.stem
# Build the MinIO path used to match file_path in the database. Note this uses
# the current year/month, so match_file_path() later falls back to file-name
# matching when the actual upload date differs.
from datetime import datetime
now = datetime.now()
minio_path = f'/615873064429507639/TEMPLATE/{now.year}/{now.month:02d}/{docx_file.name}'
templates[minio_path] = {
'file_name': docx_file.name,
'name_without_ext': file_name_without_ext,
'relative_path': str(docx_file.relative_to(base_path))
}
print(f"找到 {len(templates)} 个模板文件\n")
return templates
def get_db_templates(conn) -> Dict[str, Dict]:
"""
获取数据库中所有模板记录
Returns:
字典key为file_pathvalue为模板信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND file_path IS NOT NULL
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
result = {}
for template in templates:
if template['file_path']:
result[template['file_path']] = {
'id': template['id'],
'name': template['name'],
'file_path': template['file_path'],
'parent_id': template['parent_id']
}
cursor.close()
return result
def update_template_name(conn, template_id: int, new_name: str, old_name: str):
"""
更新模板名称
"""
cursor = conn.cursor()
try:
update_sql = """
UPDATE f_polic_file_config
SET name = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (new_name, UPDATED_BY, template_id, TENANT_ID))
conn.commit()
print(f" [UPDATE] ID: {template_id}")
print(f" 旧名称: {old_name}")
print(f" 新名称: {new_name}")
return True
except Exception as e:
conn.rollback()
print(f" [ERROR] 更新失败: {str(e)}")
return False
finally:
cursor.close()
def match_file_path(file_path: str, db_paths: List[str]) -> Optional[str]:
"""
匹配文件路径可能日期不同
Args:
file_path: 当前构建的MinIO路径
db_paths: 数据库中的所有路径列表
Returns:
匹配的数据库路径如果找到的话
"""
# 提取文件名
file_name = Path(file_path).name
# 在数据库路径中查找相同文件名的路径
for db_path in db_paths:
if Path(db_path).name == file_name:
return db_path
return None
def main():
"""主函数"""
print("=" * 80)
print("检查并修复模板名称")
print("=" * 80)
print()
try:
# 连接数据库
print("1. 连接数据库...")
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功\n")
# 扫描模板文件
print("2. 扫描模板文件...")
file_templates = scan_template_files(TEMPLATE_BASE_DIR)
# 获取数据库模板
print("3. 获取数据库模板...")
db_templates = get_db_templates(conn)
print(f"[OK] 找到 {len(db_templates)} 个数据库模板\n")
# 检查并更新
print("4. 检查并更新模板名称...")
print("=" * 80)
updated_count = 0
not_found_count = 0
matched_count = 0
# 遍历文件模板
for file_path, file_info in file_templates.items():
file_name = file_info['file_name']
expected_name = file_info['name_without_ext']
# 尝试直接匹配
db_template = db_templates.get(file_path)
# 如果直接匹配失败,尝试通过文件名匹配
if not db_template:
matched_path = match_file_path(file_path, list(db_templates.keys()))
if matched_path:
db_template = db_templates[matched_path]
if db_template:
matched_count += 1
current_name = db_template['name']
# 检查名称是否一致
if current_name != expected_name:
print(f"\n文件: {file_name}")
if update_template_name(conn, db_template['id'], expected_name, current_name):
updated_count += 1
else:
print(f" [OK] {file_name} - 名称已正确")
else:
not_found_count += 1
print(f" [WARN] 未找到: {file_name}")
print("\n" + "=" * 80)
print("检查完成")
print("=" * 80)
print(f"总文件数: {len(file_templates)}")
print(f"匹配成功: {matched_count}")
print(f"更新数量: {updated_count}")
print(f"未找到: {not_found_count}")
print("=" * 80)
except Exception as e:
print(f"\n[ERROR] 发生错误: {e}")
import traceback
traceback.print_exc()
if 'conn' in locals():
conn.rollback()
finally:
if 'conn' in locals():
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()
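
`match_file_path` above falls back to comparing bare file names because the date segment of the MinIO path can differ between upload time and script run time. Its behavior can be exercised standalone (using `PurePosixPath` here so the comparison is OS-independent):

```python
from typing import List, Optional
from pathlib import PurePosixPath

def match_by_name(file_path: str, db_paths: List[str]) -> Optional[str]:
    """Return the stored path whose file name matches, ignoring date segments."""
    file_name = PurePosixPath(file_path).name
    for db_path in db_paths:
        if PurePosixPath(db_path).name == file_name:
            return db_path
    return None

db_paths = ['/615873064429507639/TEMPLATE/2025/11/4.谈话方案.docx']
print(match_by_name('/615873064429507639/TEMPLATE/2025/12/4.谈话方案.docx', db_paths))
```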

generate_download_urls.py Normal file

@@ -0,0 +1,129 @@
"""
为指定的文件路径生成 MinIO 预签名下载 URL
"""
import sys
import io
from minio import Minio
from datetime import timedelta
# 设置输出编码为UTF-8避免Windows控制台编码问题
if sys.platform == 'win32':
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
# MinIO连接配置
MINIO_CONFIG = {
'endpoint': 'minio.datacubeworld.com:9000',
'access_key': 'JOLXFXny3avFSzB0uRA5',
'secret_key': 'G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I',
'secure': True
}
BUCKET_NAME = 'finyx'
# 文件相对路径列表
FILE_PATHS = [
'/615873064429507639/20251211112544/初步核实审批表_张三.docx',
'/615873064429507639/20251211112545/请示报告卡_张三.docx'
]
def generate_download_urls():
"""为文件路径列表生成下载 URL"""
print("="*80)
print("生成 MinIO 下载链接")
print("="*80)
try:
# 创建MinIO客户端
client = Minio(
MINIO_CONFIG['endpoint'],
access_key=MINIO_CONFIG['access_key'],
secret_key=MINIO_CONFIG['secret_key'],
secure=MINIO_CONFIG['secure']
)
print(f"\n存储桶: {BUCKET_NAME}")
print(f"端点: {MINIO_CONFIG['endpoint']}")
print(f"使用HTTPS: {MINIO_CONFIG['secure']}\n")
results = []
for file_path in FILE_PATHS:
# 去掉开头的斜杠,得到对象名称
object_name = file_path.lstrip('/')
print("-"*80)
print(f"文件: {file_path}")
print(f"对象名称: {object_name}")
try:
# 检查文件是否存在
stat = client.stat_object(BUCKET_NAME, object_name)
print(f"[OK] 文件存在")
print(f" 文件大小: {stat.size:,} 字节")
print(f" 最后修改: {stat.last_modified}")
# 生成预签名URL7天有效期
url = client.presigned_get_object(
BUCKET_NAME,
object_name,
expires=timedelta(days=7)
)
print(f"[OK] 预签名URL生成成功7天有效")
print(f"\n下载链接:")
print(f"{url}\n")
results.append({
'file_path': file_path,
'object_name': object_name,
'url': url,
'size': stat.size,
'exists': True
})
except Exception as e:
print(f"[ERROR] 错误: {e}\n")
results.append({
'file_path': file_path,
'object_name': object_name,
'url': None,
'exists': False,
'error': str(e)
})
# 输出汇总
print("\n" + "="*80)
print("下载链接汇总")
print("="*80)
for i, result in enumerate(results, 1):
print(f"\n{i}. {result['file_path']}")
if result['exists']:
print(f" [OK] 文件存在")
print(f" 下载链接: {result['url']}")
else:
print(f" [ERROR] 文件不存在或无法访问")
if 'error' in result:
print(f" 错误: {result['error']}")
print("\n" + "="*80)
print("完成")
print("="*80)
return results
except Exception as e:
print(f"\n[ERROR] 连接MinIO失败: {e}")
import traceback
traceback.print_exc()
return None
if __name__ == '__main__':
generate_download_urls()
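
Two small details of the script above can be checked without a MinIO connection: the object key is the stored path minus its leading slash, and the 7-day expiry matches the maximum that S3-style (SigV4) presigned URLs allow:

```python
from datetime import timedelta

file_path = '/615873064429507639/20251211112544/初步核实审批表_张三.docx'
object_name = file_path.lstrip('/')  # MinIO object keys carry no leading slash
expires = timedelta(days=7)          # SigV4 presigned URLs cap out at 7 days

print(object_name)
print(int(expires.total_seconds()))  # expiry expressed in seconds
```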


@@ -0,0 +1,219 @@
"""
生成模板 file_id 和关联关系的详细报告
重点检查每个模板的 file_id 是否正确以及 f_polic_file_field 表的关联关系
"""
import sys
import pymysql
from pathlib import Path
from typing import Dict, List
from collections import defaultdict
# Set console encoding to UTF-8 (Windows compatibility)
if sys.platform == 'win32':
try:
sys.stdout.reconfigure(encoding='utf-8')
sys.stderr.reconfigure(encoding='utf-8')
except Exception:
pass
# 数据库连接配置
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def generate_detailed_report():
"""生成详细的 file_id 和关联关系报告"""
print("="*80)
print("模板 file_id 和关联关系详细报告")
print("="*80)
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("\n[OK] 数据库连接成功\n")
except Exception as e:
print(f"\n[ERROR] 数据库连接失败: {e}")
return
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. 查询所有有 file_path 的模板(实际模板文件,不是目录节点)
cursor.execute("""
SELECT id, name, template_code, file_path, state, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND file_path IS NOT NULL AND file_path != ''
ORDER BY name, id
""", (TENANT_ID,))
all_templates = cursor.fetchall()
print(f"总模板数(有 file_path: {len(all_templates)}\n")
# 2. 查询每个模板的关联字段
template_field_map = defaultdict(list)
cursor.execute("""
SELECT
fff.file_id,
fff.filed_id,
fff.state as relation_state,
fc.name as template_name,
fc.template_code,
f.name as field_name,
f.filed_code,
f.field_type,
CASE
WHEN f.field_type = 1 THEN '输入字段'
WHEN f.field_type = 2 THEN '输出字段'
ELSE '未知'
END as field_type_name
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
ORDER BY fff.file_id, f.field_type, f.name
""", (TENANT_ID,))
all_relations = cursor.fetchall()
for rel in all_relations:
template_field_map[rel['file_id']].append(rel)
# 3. 按模板分组显示
print("="*80)
print("每个模板的 file_id 和关联字段详情")
print("="*80)
# 按名称分组,显示重复的模板
templates_by_name = defaultdict(list)
for template in all_templates:
templates_by_name[template['name']].append(template)
duplicate_templates = {name: tmpls for name, tmpls in templates_by_name.items() if len(tmpls) > 1}
if duplicate_templates:
print("\n[WARN] 发现重复名称的模板:\n")
for name, tmpls in duplicate_templates.items():
print(f" 模板名称: {name}")
for tmpl in tmpls:
field_count = len(template_field_map.get(tmpl['id'], []))
input_count = sum(1 for f in template_field_map.get(tmpl['id'], []) if f['field_type'] == 1)
output_count = sum(1 for f in template_field_map.get(tmpl['id'], []) if f['field_type'] == 2)
print(f" - file_id: {tmpl['id']}")
print(f" template_code: {tmpl.get('template_code', 'N/A')}")
print(f" file_path: {tmpl.get('file_path', 'N/A')}")
print(f" 关联字段: 总计 {field_count} 个 (输入 {input_count}, 输出 {output_count})")
print()
# 4. 显示每个模板的详细信息
print("\n" + "="*80)
print("所有模板的 file_id 和关联字段统计")
print("="*80)
for template in all_templates:
file_id = template['id']
name = template['name']
template_code = template.get('template_code', 'N/A')
file_path = template.get('file_path', 'N/A')
fields = template_field_map.get(file_id, [])
input_fields = [f for f in fields if f['field_type'] == 1]
output_fields = [f for f in fields if f['field_type'] == 2]
print(f"\n模板: {name}")
print(f" file_id: {file_id}")
print(f" template_code: {template_code}")
print(f" file_path: {file_path}")
print(f" 关联字段: 总计 {len(fields)}")
print(f" - 输入字段 (field_type=1): {len(input_fields)}")
print(f" - 输出字段 (field_type=2): {len(output_fields)}")
if len(fields) == 0:
print(f" [WARN] 该模板没有关联任何字段")
# 5. 检查关联关系的完整性
print("\n" + "="*80)
print("关联关系完整性检查")
print("="*80)
# 检查是否有 file_id 在 f_polic_file_field 中但没有对应的文件配置
cursor.execute("""
SELECT DISTINCT fff.file_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.id IS NULL
""", (TENANT_ID,))
orphan_file_ids = cursor.fetchall()
if orphan_file_ids:
print(f"\n[ERROR] 发现孤立的 file_id在 f_polic_file_field 中但不在 f_polic_file_config 中):")
for item in orphan_file_ids:
print(f" - file_id: {item['file_id']}")
else:
print("\n[OK] 所有关联关系的 file_id 都有效")
# 检查是否有 filed_id 在 f_polic_file_field 中但没有对应的字段
cursor.execute("""
SELECT DISTINCT fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.id IS NULL
""", (TENANT_ID,))
orphan_field_ids = cursor.fetchall()
if orphan_field_ids:
print(f"\n[ERROR] 发现孤立的 filed_id在 f_polic_file_field 中但不在 f_polic_field 中):")
for item in orphan_field_ids:
print(f" - filed_id: {item['filed_id']}")
else:
print("\n[OK] 所有关联关系的 filed_id 都有效")
# 6. 统计汇总
print("\n" + "="*80)
print("统计汇总")
print("="*80)
total_templates = len(all_templates)
templates_with_fields = len([t for t in all_templates if len(template_field_map.get(t['id'], [])) > 0])
templates_without_fields = total_templates - templates_with_fields
total_relations = len(all_relations)
total_input_relations = sum(1 for r in all_relations if r['field_type'] == 1)
total_output_relations = sum(1 for r in all_relations if r['field_type'] == 2)
print(f"\n模板统计:")
print(f" 总模板数: {total_templates}")
print(f" 有关联字段的模板: {templates_with_fields}")
print(f" 无关联字段的模板: {templates_without_fields}")
print(f"\n关联关系统计:")
print(f" 总关联关系数: {total_relations}")
print(f" 输入字段关联: {total_input_relations}")
print(f" 输出字段关联: {total_output_relations}")
if duplicate_templates:
print(f"\n[WARN] 发现 {len(duplicate_templates)} 个模板名称有重复记录")
print(" 建议: 确认每个模板应该使用哪个 file_id并清理重复记录")
if templates_without_fields:
print(f"\n[WARN] 发现 {templates_without_fields} 个模板没有关联任何字段")
print(" 建议: 检查这些模板是否需要关联字段")
finally:
cursor.close()
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
generate_detailed_report()
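
The report's grouping (`template_field_map`, `templates_by_name`) is the usual `defaultdict(list)` pattern; a minimal sketch of the duplicate-name detection with toy rows:

```python
from collections import defaultdict

rows = [
    {'name': '2谈话审批表', 'id': 101},
    {'name': '2谈话审批表', 'id': 102},  # second record under the same template name
    {'name': '4.谈话方案', 'id': 103},
]
by_name = defaultdict(list)
for row in rows:
    by_name[row['name']].append(row)  # group rows sharing a name

duplicates = {name: grp for name, grp in by_name.items() if len(grp) > 1}
print(sorted(duplicates))  # names that have more than one record
```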

get_available_file_ids.py Normal file

@@ -0,0 +1,64 @@
"""
获取所有可用的文件ID列表用于测试
"""
import pymysql
import os
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def get_available_file_configs():
"""获取所有可用的文件配置"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
ORDER BY name
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
print("="*80)
print("可用的文件配置列表state=1")
print("="*80)
print(f"\n共找到 {len(configs)} 个启用的文件配置:\n")
for i, config in enumerate(configs, 1):
print(f"{i}. ID: {config['id']}")
print(f" 名称: {config['name']}")
print(f" 文件路径: {config['file_path'] or '(空)'}")
print()
# 输出JSON格式方便复制
print("\n" + "="*80)
print("JSON格式可用于测试:")
print("="*80)
print("[")
for i, config in enumerate(configs):
comma = "," if i < len(configs) - 1 else ""
print(f' {{"fileId": {config["id"]}, "fileName": "{config["name"]}.doc"}}{comma}')
print("]")
return configs
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
get_available_file_configs()
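
The JSON block at the end is assembled by hand with f-strings; if a template name ever contained a quote or backslash, the output would be invalid JSON. Building the same payload with `json.dumps` (a sketch of an alternative, not a required change) sidesteps escaping entirely:

```python
import json

configs = [
    {'id': 101, 'name': '1.请示报告卡(初核谈话)'},
    {'id': 102, 'name': '4.谈话方案'},
]
# Same shape as the hand-built output: [{"fileId": ..., "fileName": "....doc"}, ...]
payload = [{'fileId': c['id'], 'fileName': f"{c['name']}.doc"} for c in configs]
print(json.dumps(payload, ensure_ascii=False, indent=2))
```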


@@ -0,0 +1,478 @@
"""
改进的匹配和更新脚本
增强匹配逻辑能够匹配数据库中的已有数据
"""
import os
import json
import pymysql
import re
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
# 文档类型映射
DOCUMENT_TYPE_MAPPING = {
"1.请示报告卡XXX": {
"template_code": "REPORT_CARD",
"name": "1.请示报告卡XXX",
"business_type": "INVESTIGATION"
},
"2.初步核实审批表XXX": {
"template_code": "PRELIMINARY_VERIFICATION_APPROVAL",
"name": "2.初步核实审批表XXX",
"business_type": "INVESTIGATION"
},
"3.附件初核方案(XXX)": {
"template_code": "INVESTIGATION_PLAN",
"name": "3.附件初核方案(XXX)",
"business_type": "INVESTIGATION"
},
"谈话通知书第一联": {
"template_code": "NOTIFICATION_LETTER_1",
"name": "谈话通知书第一联",
"business_type": "INVESTIGATION"
},
"谈话通知书第二联": {
"template_code": "NOTIFICATION_LETTER_2",
"name": "谈话通知书第二联",
"business_type": "INVESTIGATION"
},
"谈话通知书第三联": {
"template_code": "NOTIFICATION_LETTER_3",
"name": "谈话通知书第三联",
"business_type": "INVESTIGATION"
},
"1.请示报告卡(初核谈话)": {
"template_code": "REPORT_CARD_INTERVIEW",
"name": "1.请示报告卡(初核谈话)",
"business_type": "INVESTIGATION"
},
"2谈话审批表": {
"template_code": "INTERVIEW_APPROVAL_FORM",
"name": "2谈话审批表",
"business_type": "INVESTIGATION"
},
"3.谈话前安全风险评估表": {
"template_code": "PRE_INTERVIEW_RISK_ASSESSMENT",
"name": "3.谈话前安全风险评估表",
"business_type": "INVESTIGATION"
},
"4.谈话方案": {
"template_code": "INTERVIEW_PLAN",
"name": "4.谈话方案",
"business_type": "INVESTIGATION"
},
"5.谈话后安全风险评估表": {
"template_code": "POST_INTERVIEW_RISK_ASSESSMENT",
"name": "5.谈话后安全风险评估表",
"business_type": "INVESTIGATION"
},
"1.谈话笔录": {
"template_code": "INTERVIEW_RECORD",
"name": "1.谈话笔录",
"business_type": "INVESTIGATION"
},
"2.谈话询问对象情况摸底调查30问": {
"template_code": "INVESTIGATION_30_QUESTIONS",
"name": "2.谈话询问对象情况摸底调查30问",
"business_type": "INVESTIGATION"
},
"3.被谈话人权利义务告知书": {
"template_code": "RIGHTS_OBLIGATIONS_NOTICE",
"name": "3.被谈话人权利义务告知书",
"business_type": "INVESTIGATION"
},
"4.点对点交接单": {
"template_code": "HANDOVER_FORM",
"name": "4.点对点交接单",
"business_type": "INVESTIGATION"
},
"5.陪送交接单(新)": {
"template_code": "ESCORT_HANDOVER_FORM",
"name": "5.陪送交接单(新)",
"business_type": "INVESTIGATION"
},
"6.1保密承诺书(谈话对象使用-非中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_NON_PARTY",
"name": "6.1保密承诺书(谈话对象使用-非中共党员用)",
"business_type": "INVESTIGATION"
},
"6.2保密承诺书(谈话对象使用-中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_PARTY",
"name": "6.2保密承诺书(谈话对象使用-中共党员用)",
"business_type": "INVESTIGATION"
},
"7.办案人员-办案安全保密承诺书": {
"template_code": "INVESTIGATOR_CONFIDENTIALITY_COMMITMENT",
"name": "7.办案人员-办案安全保密承诺书",
"business_type": "INVESTIGATION"
},
"8-1请示报告卡初核报告结论 ": {
"template_code": "REPORT_CARD_CONCLUSION",
"name": "8-1请示报告卡初核报告结论 ",
"business_type": "INVESTIGATION"
},
"8.XXX初核情况报告": {
"template_code": "INVESTIGATION_REPORT",
"name": "8.XXX初核情况报告",
"business_type": "INVESTIGATION"
}
}
def normalize_name(name: str) -> str:
"""Normalize a name for fuzzy matching."""
# Strip a leading ordinal such as "1.", "2.", or "8-"
name = re.sub(r'^\d+[\.\-]\s*', '', name)
# Strip parenthesized segments such as "(XXX)" or "(初核谈话)"
name = re.sub(r'[(].*?[)]', '', name)
# Strip surrounding whitespace
name = name.strip()
return name
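As a quick sanity check, here is a self-contained sketch of the same normalization (sample names come from the mapping table above; note that a compound ordinal such as "8-1" is only partially stripped, because the pattern consumes a single separator):

```python
import re

def normalize_name(name: str) -> str:
    """Strip leading ordinals and parenthesized segments for fuzzy matching."""
    name = re.sub(r'^\d+[\.\-]\s*', '', name)
    name = re.sub(r'[(].*?[)]', '', name)
    return name.strip()

# Numbered/parenthesized variants of one template collapse onto one key:
print(normalize_name("1.请示报告卡(初核谈话)"))  # 请示报告卡
```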
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits;
# multiplying by 1000 lets the random part bleed into the timestamp digits
return timestamp * 1_000_000 + random_part
def identify_document_type(file_name: str) -> Optional[Dict]:
"""根据完整文件名识别文档类型"""
base_name = Path(file_name).stem
if base_name in DOCUMENT_TYPE_MAPPING:
return DOCUMENT_TYPE_MAPPING[base_name]
return None
def scan_directory_structure(base_dir: Path) -> Dict:
"""扫描目录结构,构建树状层级"""
structure = {
'directories': {},
'files': {}
}
def process_path(path: Path, parent_path: Optional[str] = None, level: int = 0):
"""递归处理路径"""
if path.is_file() and path.suffix == '.docx':
file_name = path.stem
doc_config = identify_document_type(file_name)
structure['files'][str(path)] = {
'name': file_name,
'parent': parent_path,
'level': level,
'template_code': doc_config['template_code'] if doc_config else None,
'full_path': str(path),
'normalized_name': normalize_name(file_name)
}
elif path.is_dir():
dir_name = path.name
structure['directories'][str(path)] = {
'name': dir_name,
'parent': parent_path,
'level': level,
'normalized_name': normalize_name(dir_name)
}
for child in sorted(path.iterdir()):
if child.name != '__pycache__':
process_path(child, str(path), level + 1)
if TEMPLATES_DIR.exists():
for item in sorted(TEMPLATES_DIR.iterdir()):
if item.name != '__pycache__':
process_path(item, None, 0)
return structure
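A minimal sketch of the scan against a throwaway directory tree (simplified node shape; only the level/parent bookkeeping that the planner relies on is kept):

```python
import tempfile
from pathlib import Path

def scan(base_dir: Path) -> dict:
    """Record .docx files and directories together with their nesting level."""
    structure = {'directories': {}, 'files': {}}

    def walk(path: Path, parent, level):
        if path.is_file() and path.suffix == '.docx':
            structure['files'][str(path)] = {'name': path.stem, 'parent': parent, 'level': level}
        elif path.is_dir():
            structure['directories'][str(path)] = {'name': path.name, 'parent': parent, 'level': level}
            for child in sorted(path.iterdir()):
                if child.name != '__pycache__':
                    walk(child, str(path), level + 1)

    for item in sorted(base_dir.iterdir()):
        if item.name != '__pycache__':
            walk(item, None, 0)
    return structure

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "2-初核模版" / "2.谈话审批").mkdir(parents=True)
    (root / "2-初核模版" / "2.谈话审批" / "4.谈话方案.docx").touch()
    (root / "readme.txt").touch()  # non-.docx files are skipped
    s = scan(root)
```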
def get_existing_data(conn) -> Dict:
"""获取数据库中的现有数据,增强匹配能力"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, parent_id, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
result = {
'by_id': {},
'by_name': {},
'by_template_code': {},
'by_normalized_name': {} # 新增:标准化名称索引
}
for config in configs:
config_id = config['id']
config_name = config['name']
# 提取 template_code
template_code = config.get('template_code')
if not template_code and config.get('input_data'):
try:
input_data = json.loads(config['input_data']) if isinstance(config['input_data'], str) else config['input_data']
if isinstance(input_data, dict):
template_code = input_data.get('template_code')
except (json.JSONDecodeError, TypeError):
pass
config['extracted_template_code'] = template_code
config['normalized_name'] = normalize_name(config_name)
result['by_id'][config_id] = config
result['by_name'][config_name] = config
if template_code:
if template_code not in result['by_template_code']:
result['by_template_code'][template_code] = config
# 标准化名称索引(可能有多个记录匹配同一个标准化名称)
normalized = config['normalized_name']
if normalized not in result['by_normalized_name']:
result['by_normalized_name'][normalized] = []
result['by_normalized_name'][normalized].append(config)
cursor.close()
return result
def find_matching_config(file_info: Dict, existing_data: Dict) -> Optional[Dict]:
"""
Find the matching database record.
Priority: 1. exact template_code match; 2. exact name match; 3. normalized-name match
"""
template_code = file_info.get('template_code')
file_name = file_info['name']
normalized_name = file_info.get('normalized_name', normalize_name(file_name))
# 优先级1: template_code 精确匹配
if template_code:
matched = existing_data['by_template_code'].get(template_code)
if matched:
return matched
# 优先级2: 名称精确匹配
matched = existing_data['by_name'].get(file_name)
if matched:
return matched
# 优先级3: 标准化名称匹配
candidates = existing_data['by_normalized_name'].get(normalized_name, [])
if candidates:
# 如果有多个候选,优先选择有正确 template_code 的
for candidate in candidates:
if candidate.get('extracted_template_code') == template_code:
return candidate
# 否则返回第一个
return candidates[0]
return None
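The three-tier priority can be exercised with toy data; the dicts below are hypothetical stand-ins for the indexes built by get_existing_data:

```python
def find_match(file_info: dict, existing: dict):
    """Three-tier lookup: template_code, exact name, then normalized name."""
    code = file_info.get('template_code')
    if code and code in existing['by_template_code']:
        return existing['by_template_code'][code]
    if file_info['name'] in existing['by_name']:
        return existing['by_name'][file_info['name']]
    candidates = existing['by_normalized_name'].get(file_info['normalized_name'], [])
    for c in candidates:
        if c.get('extracted_template_code') == code:
            return c
    return candidates[0] if candidates else None

existing = {
    'by_template_code': {'INTERVIEW_PLAN': {'id': 1, 'extracted_template_code': 'INTERVIEW_PLAN'}},
    'by_name': {'旧谈话方案': {'id': 2}},
    'by_normalized_name': {'谈话方案': [{'id': 3, 'extracted_template_code': None}]},
}
# Tier 1 wins even though the on-disk name differs from the DB name:
m = find_match({'name': '4.谈话方案', 'template_code': 'INTERVIEW_PLAN', 'normalized_name': '谈话方案'}, existing)
# No template_code and no exact name: falls through to the normalized-name tier:
m3 = find_match({'name': '谈话方案(新)', 'template_code': None, 'normalized_name': '谈话方案'}, existing)
```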
def plan_tree_structure(dir_structure: Dict, existing_data: Dict) -> List[Dict]:
"""规划树状结构,使用改进的匹配逻辑"""
plan = []
directories = sorted(dir_structure['directories'].items(),
key=lambda x: (x[1]['level'], x[0]))
files = sorted(dir_structure['files'].items(),
key=lambda x: (x[1]['level'], x[0]))
dir_id_map = {}
# 处理目录
for dir_path, dir_info in directories:
dir_name = dir_info['name']
parent_path = dir_info['parent']
level = dir_info['level']
parent_id = None
if parent_path:
parent_id = dir_id_map.get(parent_path)
# 查找匹配的数据库记录
matched = find_matching_config(dir_info, existing_data)
if matched:
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': matched['id'],
'current_parent_id': matched.get('parent_id'),
'matched_by': 'existing'
})
dir_id_map[dir_path] = matched['id']
else:
new_id = generate_id()
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'current_parent_id': None,
'matched_by': 'new'
})
dir_id_map[dir_path] = new_id
# 处理文件
for file_path, file_info in files:
file_name = file_info['name']
parent_path = file_info['parent']
level = file_info['level']
template_code = file_info['template_code']
parent_id = dir_id_map.get(parent_path) if parent_path else None
# 查找匹配的数据库记录
matched = find_matching_config(file_info, existing_data)
if matched:
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': matched['id'],
'template_code': template_code,
'current_parent_id': matched.get('parent_id'),
'matched_by': 'existing'
})
else:
new_id = generate_id()
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'template_code': template_code,
'current_parent_id': None,
'matched_by': 'new'
})
return plan
def print_matching_report(plan: List[Dict]):
"""打印匹配报告"""
print("\n" + "="*80)
print("匹配报告")
print("="*80)
matched = [p for p in plan if p.get('matched_by') == 'existing']
unmatched = [p for p in plan if p.get('matched_by') == 'new']
print(f"\n已匹配的记录: {len(matched)}")
print(f"未匹配的记录(将创建): {len(unmatched)}\n")
if unmatched:
print("未匹配的记录列表:")
for item in unmatched:
print(f" - {item['name']} ({item['type']})")
print("\n匹配详情:")
by_level = {}
for item in plan:
level = item['level']
if level not in by_level:
by_level[level] = []
by_level[level].append(item)
for level in sorted(by_level.keys()):
print(f"\n【层级 {level}】")
for item in by_level[level]:
indent = " " * level
match_status = "✓" if item.get('matched_by') == 'existing' else "+"
print(f"{indent}{match_status} {item['name']} (ID: {item['config_id']})")
if item.get('parent_name'):
print(f"{indent} 父节点: {item['parent_name']}")
if item['action'] == 'update':
current = item.get('current_parent_id')
new = item.get('parent_id')
if current != new:
print(f"{indent} parent_id: {current} -> {new}")
def main():
"""主函数"""
print("="*80)
print("改进的模板树状结构分析和更新")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
print("扫描目录结构...")
dir_structure = scan_directory_structure(TEMPLATES_DIR)
print(f" 找到 {len(dir_structure['directories'])} 个目录")
print(f" 找到 {len(dir_structure['files'])} 个文件\n")
print("获取数据库现有数据...")
existing_data = get_existing_data(conn)
print(f" 数据库中有 {len(existing_data['by_id'])} 条记录\n")
print("规划树状结构(使用改进的匹配逻辑)...")
plan = plan_tree_structure(dir_structure, existing_data)
print(f" 生成 {len(plan)} 个更新计划\n")
print_matching_report(plan)
# 询问是否继续
print("\n" + "="*80)
response = input("\n是否生成更新SQL脚本?(yes/no,默认no): ").strip().lower()
if response == 'yes':
from analyze_and_update_template_tree import generate_update_sql
sql_file = generate_update_sql(plan)
print(f"\n✓ SQL脚本已生成: {sql_file}")
else:
print("\n已取消")
finally:
conn.close()
if __name__ == '__main__':
main()


@ -0,0 +1,544 @@
"""
Initialize the template tree from the template_finish directory:
delete the old data, then rebuild it entirely from the directory structure.
"""
import os
import json
import pymysql
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from datetime import datetime
from minio import Minio
from minio.error import S3Error
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# MinIO连接配置
MINIO_CONFIG = {
'endpoint': 'minio.datacubeworld.com:9000',
'access_key': 'JOLXFXny3avFSzB0uRA5',
'secret_key': 'G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I',
'secure': True
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
BUCKET_NAME = 'finyx'
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
# 文档类型映射
DOCUMENT_TYPE_MAPPING = {
"1.请示报告卡XXX": {
"template_code": "REPORT_CARD",
"name": "1.请示报告卡XXX",
"business_type": "INVESTIGATION"
},
"2.初步核实审批表XXX": {
"template_code": "PRELIMINARY_VERIFICATION_APPROVAL",
"name": "2.初步核实审批表XXX",
"business_type": "INVESTIGATION"
},
"3.附件初核方案(XXX)": {
"template_code": "INVESTIGATION_PLAN",
"name": "3.附件初核方案(XXX)",
"business_type": "INVESTIGATION"
},
"谈话通知书第一联": {
"template_code": "NOTIFICATION_LETTER_1",
"name": "谈话通知书第一联",
"business_type": "INVESTIGATION"
},
"谈话通知书第二联": {
"template_code": "NOTIFICATION_LETTER_2",
"name": "谈话通知书第二联",
"business_type": "INVESTIGATION"
},
"谈话通知书第三联": {
"template_code": "NOTIFICATION_LETTER_3",
"name": "谈话通知书第三联",
"business_type": "INVESTIGATION"
},
"1.请示报告卡(初核谈话)": {
"template_code": "REPORT_CARD_INTERVIEW",
"name": "1.请示报告卡(初核谈话)",
"business_type": "INVESTIGATION"
},
"2谈话审批表": {
"template_code": "INTERVIEW_APPROVAL_FORM",
"name": "2谈话审批表",
"business_type": "INVESTIGATION"
},
"3.谈话前安全风险评估表": {
"template_code": "PRE_INTERVIEW_RISK_ASSESSMENT",
"name": "3.谈话前安全风险评估表",
"business_type": "INVESTIGATION"
},
"4.谈话方案": {
"template_code": "INTERVIEW_PLAN",
"name": "4.谈话方案",
"business_type": "INVESTIGATION"
},
"5.谈话后安全风险评估表": {
"template_code": "POST_INTERVIEW_RISK_ASSESSMENT",
"name": "5.谈话后安全风险评估表",
"business_type": "INVESTIGATION"
},
"1.谈话笔录": {
"template_code": "INTERVIEW_RECORD",
"name": "1.谈话笔录",
"business_type": "INVESTIGATION"
},
"2.谈话询问对象情况摸底调查30问": {
"template_code": "INVESTIGATION_30_QUESTIONS",
"name": "2.谈话询问对象情况摸底调查30问",
"business_type": "INVESTIGATION"
},
"3.被谈话人权利义务告知书": {
"template_code": "RIGHTS_OBLIGATIONS_NOTICE",
"name": "3.被谈话人权利义务告知书",
"business_type": "INVESTIGATION"
},
"4.点对点交接单": {
"template_code": "HANDOVER_FORM",
"name": "4.点对点交接单",
"business_type": "INVESTIGATION"
},
"5.陪送交接单(新)": {
"template_code": "ESCORT_HANDOVER_FORM",
"name": "5.陪送交接单(新)",
"business_type": "INVESTIGATION"
},
"6.1保密承诺书(谈话对象使用-非中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_NON_PARTY",
"name": "6.1保密承诺书(谈话对象使用-非中共党员用)",
"business_type": "INVESTIGATION"
},
"6.2保密承诺书(谈话对象使用-中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_PARTY",
"name": "6.2保密承诺书(谈话对象使用-中共党员用)",
"business_type": "INVESTIGATION"
},
"7.办案人员-办案安全保密承诺书": {
"template_code": "INVESTIGATOR_CONFIDENTIALITY_COMMITMENT",
"name": "7.办案人员-办案安全保密承诺书",
"business_type": "INVESTIGATION"
},
"8-1请示报告卡初核报告结论 ": {
"template_code": "REPORT_CARD_CONCLUSION",
"name": "8-1请示报告卡初核报告结论 ",
"business_type": "INVESTIGATION"
},
"8.XXX初核情况报告": {
"template_code": "INVESTIGATION_REPORT",
"name": "8.XXX初核情况报告",
"business_type": "INVESTIGATION"
}
}
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits
return timestamp * 1_000_000 + random_part
def identify_document_type(file_name: str) -> Optional[Dict]:
"""根据完整文件名识别文档类型"""
base_name = Path(file_name).stem
if base_name in DOCUMENT_TYPE_MAPPING:
return DOCUMENT_TYPE_MAPPING[base_name]
return None
def upload_to_minio(file_path: Path) -> str:
"""上传文件到MinIO"""
try:
client = Minio(
MINIO_CONFIG['endpoint'],
access_key=MINIO_CONFIG['access_key'],
secret_key=MINIO_CONFIG['secret_key'],
secure=MINIO_CONFIG['secure']
)
found = client.bucket_exists(BUCKET_NAME)
if not found:
raise Exception(f"存储桶 '{BUCKET_NAME}' 不存在,请先创建")
now = datetime.now()
object_name = f'{TENANT_ID}/TEMPLATE/{now.year}/{now.month:02d}/{file_path.name}'
client.fput_object(
BUCKET_NAME,
object_name,
str(file_path),
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
return f"/{object_name}"
except S3Error as e:
raise Exception(f"MinIO错误: {e}")
except Exception as e:
raise Exception(f"上传文件时发生错误: {e}")
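The object key layout used above, {tenant}/TEMPLATE/{year}/{month}/{filename}, can be checked without a MinIO connection; this is a sketch of the key builder only:

```python
from datetime import datetime
from pathlib import Path

TENANT_ID = 615873064429507639

def build_object_name(file_path: Path, now: datetime) -> str:
    # Zero-padded month keeps keys sorting lexicographically by date
    return f'{TENANT_ID}/TEMPLATE/{now.year}/{now.month:02d}/{file_path.name}'

key = build_object_name(Path('template_finish/4.谈话方案.docx'), datetime(2025, 3, 7))
```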
def scan_directory_structure(base_dir: Path) -> List[Dict]:
"""
Scan the directory structure and return a list of nodes sorted by level.
Each node carries: type, name, path, parent_path, level, template_code, file_path
"""
nodes = []
def process_path(path: Path, parent_path: Optional[str] = None, level: int = 0):
"""递归处理路径"""
if path.is_file() and path.suffix == '.docx':
file_name = path.stem
doc_config = identify_document_type(file_name)
nodes.append({
'type': 'file',
'name': file_name,
'path': str(path),
'parent_path': parent_path,
'level': level,
'template_code': doc_config['template_code'] if doc_config else None,
'doc_config': doc_config,
'file_path': path
})
elif path.is_dir():
dir_name = path.name
nodes.append({
'type': 'directory',
'name': dir_name,
'path': str(path),
'parent_path': parent_path,
'level': level,
'template_code': None,
'doc_config': None,
'file_path': None
})
for child in sorted(path.iterdir()):
if child.name != '__pycache__':
process_path(child, str(path), level + 1)
if TEMPLATES_DIR.exists():
for item in sorted(TEMPLATES_DIR.iterdir()):
if item.name != '__pycache__':
process_path(item, None, 0)
# 按层级排序
return sorted(nodes, key=lambda x: (x['level'], x['path']))
def delete_old_data(conn, dry_run: bool = True):
"""删除旧数据"""
cursor = conn.cursor()
try:
print("\n" + "="*80)
print("删除旧数据")
print("="*80)
# 1. 先删除关联表 f_polic_file_field
print("\n1. 删除 f_polic_file_field 关联记录...")
if not dry_run:
# 先获取所有相关的 file_id
select_file_ids_sql = """
SELECT id FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(select_file_ids_sql, (TENANT_ID,))
file_ids = [row[0] for row in cursor.fetchall()]
if file_ids:
# 使用占位符构建SQL
placeholders = ','.join(['%s'] * len(file_ids))
delete_file_field_sql = f"""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id IN ({placeholders})
"""
cursor.execute(delete_file_field_sql, [TENANT_ID] + file_ids)
deleted_count = cursor.rowcount
print(f" ✓ 删除了 {deleted_count} 条关联记录")
else:
print(" ✓ 没有需要删除的关联记录")
else:
# 模拟模式:只统计
count_sql = """
SELECT COUNT(*) FROM f_polic_file_field
WHERE tenant_id = %s AND file_id IN (
SELECT id FROM f_polic_file_config WHERE tenant_id = %s
)
"""
cursor.execute(count_sql, (TENANT_ID, TENANT_ID))
count = cursor.fetchone()[0]
print(f" [模拟] 将删除 {count} 条关联记录")
# 2. 删除 f_polic_file_config 记录
print("\n2. 删除 f_polic_file_config 记录...")
delete_config_sql = """
DELETE FROM f_polic_file_config
WHERE tenant_id = %s
"""
if not dry_run:
cursor.execute(delete_config_sql, (TENANT_ID,))
deleted_count = cursor.rowcount
print(f" ✓ 删除了 {deleted_count} 条配置记录")
conn.commit()
else:
count_sql = "SELECT COUNT(*) FROM f_polic_file_config WHERE tenant_id = %s"
cursor.execute(count_sql, (TENANT_ID,))
count = cursor.fetchone()[0]
print(f" [模拟] 将删除 {count} 条配置记录")
return True
except Exception as e:
if not dry_run:
conn.rollback()
print(f" ✗ 删除失败: {e}")
raise
finally:
cursor.close()
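The variable-length IN clause built from placeholders above can be demonstrated against an in-memory SQLite table (SQLite uses ? where pymysql uses %s; table and column names here are simplified stand-ins):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE file_field (file_id INTEGER)')
conn.executemany('INSERT INTO file_field VALUES (?)', [(1,), (2,), (3,)])

file_ids = [1, 3]
# One placeholder per id keeps the statement parameterized for any list length
placeholders = ','.join(['?'] * len(file_ids))
cur = conn.execute(f'DELETE FROM file_field WHERE file_id IN ({placeholders})', file_ids)
deleted = cur.rowcount
```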
def create_tree_structure(conn, nodes: List[Dict], upload_files: bool = True, dry_run: bool = True):
"""创建树状结构"""
cursor = conn.cursor()
try:
if not dry_run:
conn.autocommit(False)
print("\n" + "="*80)
print("创建树状结构")
print("="*80)
# 创建路径到ID的映射
path_to_id = {}
created_count = 0
updated_count = 0
# 按层级顺序处理
for node in nodes:
node_path = node['path']
node_name = node['name']
parent_path = node['parent_path']
level = node['level']
# 获取父节点ID
parent_id = path_to_id.get(parent_path) if parent_path else None
if node['type'] == 'directory':
# 创建目录节点
node_id = generate_id()
path_to_id[node_path] = node_id
if not dry_run:
# 目录节点不包含 template_code 字段
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path,
created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
node_id,
TENANT_ID,
parent_id,
node_name,
None,
None,
CREATED_BY,
UPDATED_BY,
1
))
indent = " " * level
parent_info = f" [父: {path_to_id.get(parent_path, 'None')}]" if parent_path else ""
print(f"{indent}{'[模拟]' if dry_run else ''}创建目录: {node_name} (ID: {node_id}){parent_info}")
created_count += 1
else:
# 创建文件节点
node_id = generate_id()
path_to_id[node_path] = node_id
doc_config = node.get('doc_config')
template_code = node.get('template_code')
file_path_obj = node.get('file_path')
# Upload the file to MinIO (if requested)
minio_path = None
if upload_files and file_path_obj and file_path_obj.exists():
try:
if not dry_run:
minio_path = upload_to_minio(file_path_obj)
else:
minio_path = f"/{TENANT_ID}/TEMPLATE/2025/12/{file_path_obj.name}"
print(f" {'[模拟]' if dry_run else ''}上传文件: {file_path_obj.name} -> {minio_path}")
except Exception as e:
print(f" ⚠ 上传文件失败: {e}")
# Keep going and use None as the path
# 构建 input_data
input_data = None
if doc_config:
input_data = json.dumps({
'template_code': doc_config['template_code'],
'business_type': doc_config['business_type']
}, ensure_ascii=False)
if not dry_run:
# If template_code is None, fall back to an empty string
template_code_value = template_code if template_code else ''
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, template_code,
created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
node_id,
TENANT_ID,
parent_id,
node_name,
input_data,
minio_path,
template_code_value,
CREATED_BY,
UPDATED_BY,
1
))
indent = " " * level
parent_info = f" [父: {path_to_id.get(parent_path, 'None')}]" if parent_path else ""
template_info = f" [code: {template_code}]" if template_code else ""
print(f"{indent}{'[模拟]' if dry_run else ''}创建文件: {node_name} (ID: {node_id}){parent_info}{template_info}")
created_count += 1
if not dry_run:
conn.commit()
print(f"\n✓ 创建完成!共创建 {created_count} 个节点")
else:
print(f"\n[模拟模式] 将创建 {created_count} 个节点")
return path_to_id
except Exception as e:
if not dry_run:
conn.rollback()
print(f"\n✗ 创建失败: {e}")
import traceback
traceback.print_exc()
raise
finally:
cursor.close()
def main():
"""主函数"""
print("="*80)
print("初始化模板树状结构(从目录结构完全重建)")
print("="*80)
print("\n⚠️ 警告:此操作将删除当前租户的所有模板数据!")
print(" 包括:")
print(" - f_polic_file_config 表中的所有记录")
print(" - f_polic_file_field 表中的相关关联记录")
print(" 然后根据 template_finish 目录结构完全重建")
# 确认
print("\n" + "="*80)
confirm1 = input("\n确认继续?(yes/no,默认no): ").strip().lower()
if confirm1 != 'yes':
print("已取消")
return
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 扫描目录结构
print("\n扫描目录结构...")
nodes = scan_directory_structure(TEMPLATES_DIR)
print(f" 找到 {len(nodes)} 个节点")
print(f" 其中目录: {len([n for n in nodes if n['type'] == 'directory'])}")
print(f" 其中文件: {len([n for n in nodes if n['type'] == 'file'])}")
# 显示预览
print("\n目录结构预览:")
for node in nodes[:10]: # 只显示前10个
indent = " " * node['level']
type_icon = "📁" if node['type'] == 'directory' else "📄"
print(f"{indent}{type_icon} {node['name']}")
if len(nodes) > 10:
print(f" ... 还有 {len(nodes) - 10} 个节点")
# 询问是否上传文件
print("\n" + "="*80)
upload_files = input("\n是否上传文件到MinIO?(yes/no,默认yes): ").strip().lower()
upload_files = upload_files != 'no'
# 先执行模拟删除
print("\n执行模拟删除...")
delete_old_data(conn, dry_run=True)
# 再执行模拟创建
print("\n执行模拟创建...")
create_tree_structure(conn, nodes, upload_files=upload_files, dry_run=True)
# 最终确认
print("\n" + "="*80)
confirm2 = input("\n确认执行实际更新?(yes/no,默认no): ").strip().lower()
if confirm2 != 'yes':
print("已取消")
return
# 执行实际删除
print("\n执行实际删除...")
delete_old_data(conn, dry_run=False)
# 执行实际创建
print("\n执行实际创建...")
create_tree_structure(conn, nodes, upload_files=upload_files, dry_run=False)
print("\n" + "="*80)
print("初始化完成!")
print("="*80)
except Exception as e:
print(f"\n✗ 初始化失败: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@ -0,0 +1,234 @@
"""
将所有模板与两个输入字段关联
- 线索信息 (clue_info)
- 被核查人员工作基本情况线索 (target_basic_info_clue)
"""
import pymysql
import os
import sys
import time
import random
from datetime import datetime
# 设置输出编码为UTF-8
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits
return timestamp * 1_000_000 + random_part
def get_input_field_ids(conn):
"""获取两个输入字段的ID"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
field_codes = ['clue_info', 'target_basic_info_clue']
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s
AND filed_code IN (%s, %s)
AND field_type = 1
AND state = 1
""", (TENANT_ID, field_codes[0], field_codes[1]))
fields = cursor.fetchall()
field_map = {field['filed_code']: field for field in fields}
result = {}
for code in field_codes:
if code in field_map:
result[code] = field_map[code]
else:
print(f"[WARN] 未找到字段: {code}")
return result
def get_all_templates(conn):
"""获取所有启用的模板"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
cursor.execute("""
SELECT id, name
FROM f_polic_file_config
WHERE tenant_id = %s AND state = 1
ORDER BY name
""", (TENANT_ID,))
return cursor.fetchall()
def get_existing_relations(conn, template_id, field_ids):
"""获取模板与字段的现有关联关系"""
cursor = conn.cursor()
if not field_ids:
return set()
placeholders = ','.join(['%s'] * len(field_ids))
cursor.execute(f"""
SELECT filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id = %s
AND filed_id IN ({placeholders})
AND state = 1
""", [TENANT_ID, template_id] + list(field_ids))
return {row[0] for row in cursor.fetchall()}
def create_relation(conn, template_id, field_id):
"""创建模板与字段的关联关系"""
cursor = conn.cursor()
current_time = datetime.now()
# 检查是否已存在
cursor.execute("""
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id = %s
AND filed_id = %s
""", (TENANT_ID, template_id, field_id))
if cursor.fetchone():
return False # 已存在,不需要创建
# 创建新关联
relation_id = generate_id()
cursor.execute("""
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, 1)
""", (
relation_id,
TENANT_ID,
template_id,
field_id,
current_time,
CREATED_BY,
current_time,
UPDATED_BY
))
return True # 创建成功
def main():
"""主函数"""
print("="*80)
print("将所有模板与输入字段关联")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功\n")
except Exception as e:
print(f"[ERROR] 数据库连接失败: {e}")
return
try:
# 1. 获取输入字段ID
print("1. 获取输入字段ID...")
input_fields = get_input_field_ids(conn)
if len(input_fields) != 2:
print(f"[ERROR] 未找到所有输入字段,只找到: {list(input_fields.keys())}")
return
field_ids = [field['id'] for field in input_fields.values()]
print(f" 找到字段:")
for code, field in input_fields.items():
print(f" - {field['name']} ({code}): ID={field['id']}")
print()
# 2. 获取所有模板
print("2. 获取所有启用的模板...")
templates = get_all_templates(conn)
print(f" 找到 {len(templates)} 个模板\n")
# 3. 为每个模板创建关联关系
print("3. 创建关联关系...")
created_count = 0
existing_count = 0
error_count = 0
for template in templates:
template_id = template['id']
template_name = template['name']
# 获取现有关联
existing_relations = get_existing_relations(conn, template_id, field_ids)
# 为每个字段创建关联(如果不存在)
for field_code, field_info in input_fields.items():
field_id = field_info['id']
if field_id in existing_relations:
existing_count += 1
continue
try:
if create_relation(conn, template_id, field_id):
created_count += 1
print(f" [OK] {template_name} <- {field_info['name']} ({field_code})")
else:
existing_count += 1
except Exception as e:
error_count += 1
print(f" [ERROR] {template_name} <- {field_info['name']}: {e}")
# 提交事务
conn.commit()
# 4. 统计结果
print("\n" + "="*80)
print("执行结果")
print("="*80)
print(f"模板总数: {len(templates)}")
print(f"字段总数: {len(input_fields)}")
print(f"预期关联数: {len(templates) * len(input_fields)}")
print(f"新创建关联: {created_count}")
print(f"已存在关联: {existing_count}")
print(f"错误数量: {error_count}")
print(f"实际关联数: {created_count + existing_count}")
if error_count == 0:
print("\n[OK] 所有关联关系已成功创建或已存在")
else:
print(f"\n[WARN] 有 {error_count} 个关联关系创建失败")
except Exception as e:
conn.rollback()
print(f"\n[ERROR] 执行过程中发生错误: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@ -0,0 +1,372 @@
"""
处理"6.1保密承诺书(谈话对象使用-非中共党员用).docx"
- 解析占位符
- 上传到MinIO
- 更新数据库
"""
import os
import sys
import re
import json
import pymysql
from minio import Minio
from minio.error import S3Error
from datetime import datetime
from pathlib import Path
from docx import Document
from typing import Dict, List, Optional, Tuple
# Force UTF-8 output encoding (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
# MinIO连接配置
MINIO_CONFIG = {
'endpoint': 'minio.datacubeworld.com:9000',
'access_key': 'JOLXFXny3avFSzB0uRA5',
'secret_key': 'G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I',
'secure': True
}
# 数据库连接配置
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
# 固定值
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
BUCKET_NAME = 'finyx'
# 文件路径
TEMPLATE_FILE = 'template_finish/2-初核模版/2.谈话审批/走读式谈话流程/6.1保密承诺书(谈话对象使用-非中共党员用).docx'
PARENT_ID = 1765273962716807 # 走读式谈话流程的ID
TEMPLATE_NAME = '6.1保密承诺书(谈话对象使用-非中共党员用)'
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits
return timestamp * 1_000_000 + random_part
def extract_placeholders_from_docx(file_path: str) -> List[str]:
"""
从docx文件中提取所有占位符
Args:
file_path: docx文件路径
Returns:
Placeholder list, e.g.: ['field_code1', 'field_code2', ...]
"""
placeholders = set()
pattern = r'\{\{([^}]+)\}\}' # 匹配 {{field_code}} 格式
try:
doc = Document(file_path)
# 从段落中提取占位符
for paragraph in doc.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
# 从表格中提取占位符
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
except Exception as e:
print(f" 错误: 读取文件失败 - {str(e)}")
return []
return sorted(list(placeholders))
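The placeholder regex can be exercised on a plain string (the sample field codes below are hypothetical); since the extractor reads paragraph.text, which joins a paragraph's runs, placeholders split across runs are still matched at the paragraph level:

```python
import re

PATTERN = r'\{\{([^}]+)\}\}'  # {{field_code}}

def extract_placeholders(text: str) -> list:
    """Collect unique placeholder codes, sorted for stable output."""
    return sorted({m.strip() for m in re.findall(PATTERN, text)})

sample = '被谈话人:{{ target_name }},谈话时间:{{interview_date}};再次出现 {{interview_date}}'
codes = extract_placeholders(sample)
```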
def upload_to_minio(file_path: str, minio_client: Minio) -> str:
"""
上传文件到MinIO
Args:
file_path: 本地文件路径
minio_client: MinIO客户端实例
Returns:
MinIO中的相对路径
"""
try:
# 检查存储桶是否存在
found = minio_client.bucket_exists(BUCKET_NAME)
if not found:
raise Exception(f"存储桶 '{BUCKET_NAME}' 不存在,请先创建")
# Build the MinIO object path (dated with the current year/month)
now = datetime.now()
file_name = Path(file_path).name
object_name = f'{TENANT_ID}/TEMPLATE/{now.year}/{now.month:02d}/{file_name}'
# 上传文件
minio_client.fput_object(
BUCKET_NAME,
object_name,
file_path,
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
# 返回相对路径(以/开头)
return f"/{object_name}"
except S3Error as e:
raise Exception(f"MinIO错误: {e}")
except Exception as e:
raise Exception(f"上传文件时发生错误: {e}")
def get_db_fields(conn) -> Dict[str, Dict]:
"""
获取数据库中所有字段field_type=2的输出字段
Returns:
字典key为filed_codevalue为字段信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 2
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {}
for field in fields:
result[field['filed_code']] = {
'id': field['id'],
'name': field['name'],
'filed_code': field['filed_code'],
'field_type': field['field_type']
}
cursor.close()
return result
def match_placeholders_to_fields(placeholders: List[str], fields: Dict[str, Dict]) -> Tuple[List[int], List[str]]:
"""
匹配占位符到数据库字段
Returns:
(匹配的字段ID列表, 未匹配的占位符列表)
"""
matched_field_ids = []
unmatched_placeholders = []
for placeholder in placeholders:
if placeholder in fields:
matched_field_ids.append(fields[placeholder]['id'])
else:
unmatched_placeholders.append(placeholder)
return matched_field_ids, unmatched_placeholders
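Matching is a plain membership split; a minimal sketch with hypothetical field ids:

```python
def match_placeholders(placeholders, fields):
    """Split placeholders into matched field ids and unmatched codes."""
    matched_ids, unmatched = [], []
    for p in placeholders:
        if p in fields:
            matched_ids.append(fields[p]['id'])
        else:
            unmatched.append(p)
    return matched_ids, unmatched

fields = {'target_name': {'id': 101}, 'interview_date': {'id': 102}}
matched, missing = match_placeholders(['target_name', 'unknown_code'], fields)
```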
def create_or_update_template(conn, template_name: str, file_path: str, minio_path: str, parent_id: Optional[int]) -> int:
"""
创建或更新模板记录
Returns:
模板ID
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# Check for an existing record (matched by name and parent_id)
sql = """
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s AND parent_id = %s
"""
cursor.execute(sql, (TENANT_ID, template_name, parent_id))
existing = cursor.fetchone()
if existing:
# 更新现有记录
template_id = existing['id']
update_sql = """
UPDATE f_polic_file_config
SET file_path = %s, updated_time = NOW(), updated_by = %s, state = 1
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (minio_path, UPDATED_BY, template_id, TENANT_ID))
conn.commit()
print(f" [UPDATE] 更新模板记录 (ID: {template_id})")
return template_id
else:
# 创建新记录
template_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
template_id,
TENANT_ID,
parent_id,
template_name,
None, # input_data
minio_path,
CREATED_BY,
UPDATED_BY,
1 # state: 1表示启用
))
conn.commit()
print(f" [CREATE] 创建模板记录 (ID: {template_id})")
return template_id
except Exception as e:
conn.rollback()
raise Exception(f"创建或更新模板失败: {str(e)}")
finally:
cursor.close()
def update_template_field_relations(conn, template_id: int, field_ids: List[int]):
"""
更新模板-字段关联关系
"""
cursor = conn.cursor()
try:
# 删除旧的关联关系
delete_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_sql, (TENANT_ID, template_id))
# 插入新的关联关系
if field_ids:
insert_sql = """
INSERT INTO f_polic_file_field
(tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by)
VALUES (%s, %s, %s, NOW(), %s, NOW(), %s)
"""
for field_id in field_ids:
cursor.execute(insert_sql, (TENANT_ID, template_id, field_id, CREATED_BY, UPDATED_BY))
conn.commit()
print(f" [UPDATE] 更新字段关联关系: {len(field_ids)} 个字段")
except Exception as e:
conn.rollback()
raise Exception(f"更新字段关联关系失败: {str(e)}")
finally:
cursor.close()
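The delete-then-reinsert pattern above is idempotent; a sketch against in-memory SQLite (simplified schema, ? placeholders instead of pymysql's %s) shows that rerunning never duplicates rows:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE file_field (file_id INTEGER, filed_id INTEGER)')

def replace_relations(conn, file_id, field_ids):
    """Delete the old rows, then bulk-insert the new set in one pass."""
    conn.execute('DELETE FROM file_field WHERE file_id = ?', (file_id,))
    conn.executemany('INSERT INTO file_field VALUES (?, ?)',
                     [(file_id, fid) for fid in field_ids])
    conn.commit()

replace_relations(conn, 42, [101, 102])
replace_relations(conn, 42, [102, 103])  # rerunning replaces, never duplicates
rows = sorted(r[1] for r in conn.execute('SELECT * FROM file_field'))
```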
def main():
"""主函数"""
print("=" * 80)
print("处理保密承诺书(非中共党员用)模板")
print("=" * 80)
print()
# 检查文件是否存在
if not os.path.exists(TEMPLATE_FILE):
print(f"错误: 文件不存在 - {TEMPLATE_FILE}")
return
print(f"文件路径: {TEMPLATE_FILE}")
print()
try:
# 1. 提取占位符
print("1. 提取占位符...")
placeholders = extract_placeholders_from_docx(TEMPLATE_FILE)
print(f" 找到 {len(placeholders)} 个占位符:")
for i, placeholder in enumerate(placeholders, 1):
print(f" {i}. {{{{ {placeholder} }}}}")
print()
# 2. 连接数据库和MinIO
print("2. 连接数据库和MinIO...")
conn = pymysql.connect(**DB_CONFIG)
minio_client = Minio(
MINIO_CONFIG['endpoint'],
access_key=MINIO_CONFIG['access_key'],
secret_key=MINIO_CONFIG['secret_key'],
secure=MINIO_CONFIG['secure']
)
print(" [OK] 连接成功\n")
# 3. 获取数据库字段
print("3. 获取数据库字段...")
db_fields = get_db_fields(conn)
print(f" [OK] 找到 {len(db_fields)} 个输出字段\n")
# 4. 匹配占位符到字段
print("4. 匹配占位符到字段...")
matched_field_ids, unmatched_placeholders = match_placeholders_to_fields(placeholders, db_fields)
print(f" 匹配成功: {len(matched_field_ids)}")
print(f" 未匹配: {len(unmatched_placeholders)}")
if unmatched_placeholders:
print(f" 未匹配的占位符: {', '.join(unmatched_placeholders)}")
print()
# 5. 上传到MinIO
print("5. 上传到MinIO...")
minio_path = upload_to_minio(TEMPLATE_FILE, minio_client)
print(f" [OK] MinIO路径: {minio_path}\n")
# 6. 创建或更新数据库记录
print("6. 创建或更新数据库记录...")
template_id = create_or_update_template(conn, TEMPLATE_NAME, TEMPLATE_FILE, minio_path, PARENT_ID)
print(f" [OK] 模板ID: {template_id}\n")
# 7. 更新字段关联关系
print("7. 更新字段关联关系...")
update_template_field_relations(conn, template_id, matched_field_ids)
print()
print("=" * 80)
print("处理完成!")
print("=" * 80)
print(f"模板ID: {template_id}")
print(f"MinIO路径: {minio_path}")
print(f"关联字段数: {len(matched_field_ids)}")
except Exception as e:
print(f"\n[ERROR] 发生错误: {e}")
import traceback
traceback.print_exc()
if 'conn' in locals():
conn.rollback()
finally:
if 'conn' in locals():
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -0,0 +1,318 @@
"""
Template-field relation query examples
Demonstrates how to query a template's linked input and output fields
"""
import pymysql
import os
from typing import Dict, List, Optional
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def get_template_fields_by_name(template_name: str) -> Optional[Dict]:
"""
Fetch the fields linked to a template, by template name
Args:
template_name: template name, e.g. '初步核实审批表'
Returns:
dict: containing template_id, template_name, input_fields and output_fields
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT
fc.id AS template_id,
fc.name AS template_name,
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
f.field_type
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fc.tenant_id = %s
AND fc.name = %s
AND fc.state = 1
AND fff.state = 1
AND f.state = 1
ORDER BY f.field_type, f.name
"""
cursor.execute(sql, (TENANT_ID, template_name))
rows = cursor.fetchall()
if not rows:
return None
result = {
'template_id': rows[0]['template_id'],
'template_name': rows[0]['template_name'],
'input_fields': [],
'output_fields': []
}
for row in rows:
field_info = {
'id': row['field_id'],
'name': row['field_name'],
'field_code': row['field_code'],
'field_type': row['field_type']
}
if row['field_type'] == 1:
result['input_fields'].append(field_info)
elif row['field_type'] == 2:
result['output_fields'].append(field_info)
return result
finally:
cursor.close()
conn.close()
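The row-grouping step above, which splits joined rows into input and output fields by field_type, can be isolated as a small pure function; a sketch with hypothetical rows:

```python
# Group joined rows by field_type: 1 = input, 2 = output.
# Rows with any other field_type are silently dropped, matching the
# if/elif in get_template_fields_by_name.
def group_fields(rows):
    grouped = {'input_fields': [], 'output_fields': []}
    buckets = {1: 'input_fields', 2: 'output_fields'}
    for row in rows:
        bucket = buckets.get(row['field_type'])
        if bucket:
            grouped[bucket].append(row['field_name'])
    return grouped

rows = [{'field_name': 'clue_info', 'field_type': 1},
        {'field_name': 'target_name', 'field_type': 2},
        {'field_name': 'legacy', 'field_type': 9}]
# group_fields(rows) keeps clue_info as input and target_name as output
```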
def get_template_fields_by_id(template_id: int) -> Optional[Dict]:
"""
Fetch the fields linked to a template, by template ID
Args:
template_id: template ID
Returns:
dict: containing template_id, template_name, input_fields and output_fields
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 先获取模板名称
sql_template = """
SELECT id, name
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s AND state = 1
"""
cursor.execute(sql_template, (template_id, TENANT_ID))
template = cursor.fetchone()
if not template:
return None
# 获取字段
sql_fields = """
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
f.field_type
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.file_id = %s
AND fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
ORDER BY f.field_type, f.name
"""
cursor.execute(sql_fields, (template_id, TENANT_ID))
rows = cursor.fetchall()
result = {
'template_id': template['id'],
'template_name': template['name'],
'input_fields': [],
'output_fields': []
}
for row in rows:
field_info = {
'id': row['field_id'],
'name': row['field_name'],
'field_code': row['field_code'],
'field_type': row['field_type']
}
if row['field_type'] == 1:
result['input_fields'].append(field_info)
elif row['field_type'] == 2:
result['output_fields'].append(field_info)
return result
finally:
cursor.close()
conn.close()
def get_all_templates_with_field_stats() -> List[Dict]:
"""
Fetch all templates with per-template field statistics
Returns:
list: templates, each with its field counts
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT
fc.id AS template_id,
fc.name AS template_name,
COUNT(DISTINCT CASE WHEN f.field_type = 1 THEN f.id END) AS input_field_count,
COUNT(DISTINCT CASE WHEN f.field_type = 2 THEN f.id END) AS output_field_count,
COUNT(DISTINCT f.id) AS total_field_count
FROM f_polic_file_config fc
LEFT JOIN f_polic_file_field fff ON fc.id = fff.file_id AND fff.state = 1
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND f.state = 1
WHERE fc.tenant_id = %s
AND fc.state = 1
GROUP BY fc.id, fc.name
ORDER BY fc.name
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
return [
{
'template_id': t['template_id'],
'template_name': t['template_name'],
'input_field_count': t['input_field_count'] or 0,
'output_field_count': t['output_field_count'] or 0,
'total_field_count': t['total_field_count'] or 0
}
for t in templates
]
finally:
cursor.close()
conn.close()
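The COUNT(DISTINCT CASE WHEN ...) pattern above does conditional counting in one SQL pass; the same aggregation can be cross-checked in Python over the joined rows (the data below is hypothetical):

```python
from collections import Counter

# Conditional counting over joined rows: field_id is None for templates
# that matched no field through the LEFT JOINs.
def field_stats(rows):
    counts = Counter(r['field_type'] for r in rows if r['field_id'] is not None)
    return {'input': counts[1], 'output': counts[2],
            'total': counts[1] + counts[2]}

rows = [{'field_id': 1, 'field_type': 1},
        {'field_id': 2, 'field_type': 2},
        {'field_id': None, 'field_type': None}]
# field_stats(rows) == {'input': 1, 'output': 1, 'total': 2}
```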
def find_templates_using_field(field_code: str) -> List[Dict]:
"""
Find all templates that use a given field
Args:
field_code: field code, e.g. 'target_name'
Returns:
list: templates using that field
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT DISTINCT
fc.id AS template_id,
fc.name AS template_name
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fc.tenant_id = %s
AND f.tenant_id = %s
AND f.filed_code = %s
AND fc.state = 1
AND fff.state = 1
AND f.state = 1
ORDER BY fc.name
"""
cursor.execute(sql, (TENANT_ID, TENANT_ID, field_code))
templates = cursor.fetchall()
return [
{
'template_id': t['template_id'],
'template_name': t['template_name']
}
for t in templates
]
finally:
cursor.close()
conn.close()
def print_template_fields(result: Dict):
"""打印模板字段信息"""
if not result:
print("未找到模板")
return
print("="*80)
print(f"模板: {result['template_name']} (ID: {result['template_id']})")
print("="*80)
print(f"\n输入字段 ({len(result['input_fields'])} 个):")
if result['input_fields']:
for field in result['input_fields']:
print(f" - {field['name']} ({field['field_code']})")
else:
print(" (无)")
print(f"\n输出字段 ({len(result['output_fields'])} 个):")
if result['output_fields']:
for field in result['output_fields']:
print(f" - {field['name']} ({field['field_code']})")
else:
print(" (无)")
def main():
"""主函数 - 演示各种查询方式"""
print("="*80)
print("模板字段关联查询示例")
print("="*80)
# 示例1: 根据模板名称查询
print("\n【示例1】根据模板名称查询字段")
print("-" * 80)
# 注意:模板名称需要完全匹配,如 "2.初步核实审批表XXX"
result = get_template_fields_by_name('2.初步核实审批表XXX')
if not result:
# 尝试其他可能的名称
result = get_template_fields_by_name('初步核实审批表')
print_template_fields(result)
# 示例2: 获取所有模板的字段统计
print("\n\n【示例2】获取所有模板的字段统计")
print("-" * 80)
templates = get_all_templates_with_field_stats()
print(f"共找到 {len(templates)} 个模板:\n")
for template in templates[:5]: # 只显示前5个
print(f" {template['template_name']} (ID: {template['template_id']})")
print(f" 输入字段: {template['input_field_count']}")
print(f" 输出字段: {template['output_field_count']}")
print(f" 总字段数: {template['total_field_count']}\n")
if len(templates) > 5:
print(f" ... 还有 {len(templates) - 5} 个模板")
# 示例3: 查找使用特定字段的模板
print("\n\n【示例3】查找使用 'target_name' 字段的模板")
print("-" * 80)
templates_using_field = find_templates_using_field('target_name')
print(f"共找到 {len(templates_using_field)} 个模板使用该字段:")
for template in templates_using_field:
print(f" - {template['template_name']} (ID: {template['template_id']})")
print("\n" + "="*80)
print("查询完成")
print("="*80)
if __name__ == '__main__':
main()


@@ -0,0 +1,536 @@
"""
Rebuild the template-to-field relations
Re-creates the f_polic_file_field links based on template names,
no longer relying on the input_data / template_code columns
"""
import pymysql
import os
import json
from typing import Dict, List, Set, Optional
from datetime import datetime
from collections import defaultdict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# 模板名称到字段编码的映射(根据业务逻辑定义)
# 格式:{模板名称: {'input_fields': [字段编码列表], 'output_fields': [字段编码列表]}}
TEMPLATE_FIELD_MAPPING = {
# 初步核实审批表
'初步核实审批表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_organization',
'target_position', 'target_gender', 'target_date_of_birth', 'target_age',
'target_education_level', 'target_political_status', 'target_professional_rank',
'clue_source', 'target_issue_description', 'department_opinion', 'filler_name'
]
},
# 谈话前安全风险评估表
'谈话前安全风险评估表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_family_situation', 'target_social_relations', 'target_health_status',
'target_personality', 'target_tolerance', 'target_issue_severity',
'target_other_issues_possibility', 'target_previous_investigation',
'target_negative_events', 'target_other_situation', 'risk_level'
]
},
# 请示报告卡
'请示报告卡': {
'input_fields': ['clue_info'],
'output_fields': ['target_name', 'target_organization_and_position', 'report_card_request_time']
},
# 初核方案
'初核方案': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_work_basic_info',
'target_issue_description', 'investigation_unit_name', 'investigation_team_leader_name',
'investigation_team_member_names', 'investigation_location'
]
},
# 谈话通知书
'谈话通知书': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话通知书第一联
'谈话通知书第一联': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话通知书第二联
'谈话通知书第二联': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话通知书第三联
'谈话通知书第三联': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话笔录
'谈话笔录': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_gender',
'target_date_of_birth_full', 'target_political_status', 'target_address',
'target_registered_address', 'target_contact', 'target_place_of_origin',
'target_ethnicity', 'target_id_number', 'investigation_team_code'
]
},
# 谈话后安全风险评估表
'谈话后安全风险评估表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_gender',
'target_date_of_birth_full', 'target_political_status', 'target_address',
'target_registered_address', 'target_contact', 'target_place_of_origin',
'target_ethnicity', 'target_id_number', 'investigation_team_code'
]
},
# XXX初核情况报告
'XXX初核情况报告': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_issue_description',
'target_work_basic_info', 'investigation_unit_name', 'investigation_team_leader_name'
]
},
# 走读式谈话审批
'走读式谈话审批': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name'
]
},
# 走读式谈话流程
'走读式谈话流程': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name'
]
},
# 谈话审批 / 谈话审批表
'谈话审批': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name'
]
},
'谈话审批表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_gender',
'target_date_of_birth_full', 'target_political_status', 'target_address',
'target_registered_address', 'target_contact', 'target_place_of_origin',
'target_ethnicity', 'target_id_number', 'investigation_team_code'
]
},
}
# 模板名称的标准化映射(处理不同的命名方式)
TEMPLATE_NAME_NORMALIZE = {
'1.请示报告卡XXX': '请示报告卡',
'2.初步核实审批表XXX': '初步核实审批表',
'3.附件初核方案(XXX)': '初核方案',
'8.XXX初核情况报告': 'XXX初核情况报告',
'2.谈话审批': '谈话审批',
'2谈话审批表': '谈话审批表',
}
def generate_id():
"""Generate an ID (timestamp + random part, a rough imitation of a snowflake ID)"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
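Note that the six-digit random part does not fit into the three decimal digits freed by `timestamp * 1000`, so it spills into the timestamp's low digits and IDs from adjacent milliseconds can sort out of order. Usually harmless for uniqueness, but worth knowing. A demonstration with fixed, illustrative inputs:

```python
# Same arithmetic as generate_id, but with the timestamp and random part
# passed in so the digit overlap is visible.
def make_id(timestamp_ms, random_part):
    return timestamp_ms * 1000 + random_part

earlier = make_id(1700000000000, 999999)  # large random part
later = make_id(1700000000001, 100000)    # next millisecond, small random part
# earlier > later: the later millisecond produced the smaller ID
```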
def normalize_template_name(name: str) -> str:
"""标准化模板名称"""
# 先检查映射表
if name in TEMPLATE_NAME_NORMALIZE:
return TEMPLATE_NAME_NORMALIZE[name]
# 移除常见的后缀和前缀
name = name.strip()
# 移除括号内容
import re
name = re.sub(r'[(].*?[)]', '', name)
name = name.strip()
# 移除数字前缀和点号
name = re.sub(r'^\d+\.', '', name)
name = name.strip()
return name
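For names not covered by the mapping table, the fallback strips bracketed suffixes first, then numeric prefixes (the order matters). A stand-alone rendition of just that fallback path:

```python
import re

# Fallback normalization only; names hit by TEMPLATE_NAME_NORMALIZE
# never reach this logic.
def normalize_fallback(name: str) -> str:
    name = re.sub(r'[(].*?[)]', '', name.strip())  # drop fullwidth-bracketed parts
    name = re.sub(r'^\d+\.', '', name.strip())       # drop "3."-style prefixes
    return name.strip()

# normalize_fallback('3.附件初核方案(XXX)') == '附件初核方案'
```

That example also shows why the mapping table exists: the fallback only gets as far as '附件初核方案', while TEMPLATE_NAME_NORMALIZE maps the same name all the way to '初核方案'.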
def get_all_templates(conn) -> Dict:
"""获取所有模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, parent_id, state
FROM f_polic_file_config
WHERE tenant_id = %s
ORDER BY name
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
result = {}
for template in templates:
name = template['name']
normalized_name = normalize_template_name(name)
# 处理state字段可能是二进制格式
state = template['state']
if isinstance(state, bytes):
state = int.from_bytes(state, byteorder='big')
elif isinstance(state, (int, str)):
state = int(state)
else:
state = 0
result[template['id']] = {
'id': template['id'],
'name': name,
'normalized_name': normalized_name,
'parent_id': template['parent_id'],
'state': state
}
cursor.close()
return result
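The state handling above exists because binary MySQL columns (likely a BIT column here) can come back from PyMySQL as bytes objects rather than ints; a minimal sketch of the same normalization:

```python
# Normalize a state value that may arrive as bytes (binary/BIT column),
# int, or str; anything else falls back to 0, as in get_all_templates.
def normalize_state(state):
    if isinstance(state, bytes):
        return int.from_bytes(state, byteorder='big')
    if isinstance(state, (int, str)):
        return int(state)
    return 0

# normalize_state(b'\x01') == 1, normalize_state('1') == 1, normalize_state(None) == 0
```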
def get_all_fields(conn) -> Dict:
"""获取所有字段定义"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY field_type, filed_code
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {
'by_code': {},
'by_name': {},
'input_fields': [],
'output_fields': []
}
for field in fields:
field_code = field['filed_code']
field_name = field['name']
field_type = field['field_type']
result['by_code'][field_code] = field
result['by_name'][field_name] = field
if field_type == 1:
result['input_fields'].append(field)
elif field_type == 2:
result['output_fields'].append(field)
cursor.close()
return result
def get_existing_relations(conn) -> Set[tuple]:
"""获取现有的关联关系"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT file_id, filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
relations = cursor.fetchall()
result = {(rel['file_id'], rel['filed_id']) for rel in relations}
cursor.close()
return result
def rebuild_template_relations(conn, template_id: int, template_name: str,
normalized_name: str, field_mapping: Dict,
dry_run: bool = True) -> Dict:
"""重建单个模板的关联关系"""
cursor = conn.cursor()
# 查找模板对应的字段配置
template_config = None
# 优先精确匹配标准化名称
if normalized_name in TEMPLATE_FIELD_MAPPING:
template_config = TEMPLATE_FIELD_MAPPING[normalized_name]
else:
# 尝试模糊匹配
for name, config in TEMPLATE_FIELD_MAPPING.items():
if name == normalized_name or name in normalized_name or normalized_name in name:
template_config = config
break
# 也检查原始名称
if name in template_name or template_name in name:
template_config = config
break
if not template_config:
return {
'template_id': template_id,
'template_name': template_name,
'status': 'skipped',
'reason': '未找到字段配置映射',
'input_count': 0,
'output_count': 0
}
input_field_codes = template_config.get('input_fields', [])
output_field_codes = template_config.get('output_fields', [])
# 查找字段ID
input_field_ids = []
output_field_ids = []
for field_code in input_field_codes:
field = field_mapping['by_code'].get(field_code)
if field:
if field['field_type'] == 1:
input_field_ids.append(field['id'])
else:
print(f" ⚠ 警告: 字段 {field_code} 应该是输入字段,但实际类型为 {field['field_type']}")
else:
print(f" ⚠ 警告: 字段 {field_code} 不存在")
for field_code in output_field_codes:
field = field_mapping['by_code'].get(field_code)
if field:
if field['field_type'] == 2:
output_field_ids.append(field['id'])
else:
print(f" ⚠ 警告: 字段 {field_code} 应该是输出字段,但实际类型为 {field['field_type']}")
else:
print(f" ⚠ 警告: 字段 {field_code} 不存在")
# 删除旧的关联关系
if not dry_run:
delete_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_sql, (TENANT_ID, template_id))
deleted_count = cursor.rowcount
else:
deleted_count = 0
# 创建新的关联关系
created_count = 0
all_field_ids = input_field_ids + output_field_ids
for field_id in all_field_ids:
if not dry_run:
# 检查是否已存在(虽然已经删除了,但为了安全还是检查一下)
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, template_id, field_id))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, template_id, field_id,
CREATED_BY, UPDATED_BY, 1 # state=1 表示启用
))
created_count += 1
else:
created_count += 1  # dry run: count what would be created
if not dry_run:
conn.commit()
return {
'template_id': template_id,
'template_name': template_name,
'normalized_name': normalized_name,
'status': 'success',
'deleted_count': deleted_count,
'input_count': len(input_field_ids),
'output_count': len(output_field_ids),
'created_count': created_count
}
def main(dry_run: bool = True):
"""主函数"""
print("="*80)
print("重新建立模板和字段的关联关系")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
else:
print("\n[实际执行模式 - 将修改数据库]")
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
# 获取所有模板
print("1. 获取所有模板配置...")
templates = get_all_templates(conn)
print(f" 找到 {len(templates)} 个模板")
# 获取所有字段
print("\n2. 获取所有字段定义...")
field_mapping = get_all_fields(conn)
print(f" 输入字段: {len(field_mapping['input_fields'])}")
print(f" 输出字段: {len(field_mapping['output_fields'])}")
print(f" 总字段数: {len(field_mapping['by_code'])}")
# 获取现有关联关系
print("\n3. 获取现有关联关系...")
existing_relations = get_existing_relations(conn)
print(f" 现有关联关系: {len(existing_relations)}")
# 重建关联关系
print("\n4. 重建模板和字段的关联关系...")
print("="*80)
results = []
for template_id, template_info in templates.items():
template_name = template_info['name']
normalized_name = template_info['normalized_name']
state = template_info['state']
# 处理所有模板(包括未启用的,因为可能需要建立关联)
# 但可以记录状态
status_note = f" (state={state})" if state != 1 else ""
if state != 1:
print(f"\n处理未启用的模板: {template_name}{status_note}")
print(f"\n处理模板: {template_name}")
print(f" 标准化名称: {normalized_name}")
result = rebuild_template_relations(
conn, template_id, template_name, normalized_name,
field_mapping, dry_run=dry_run
)
results.append(result)
if result['status'] == 'success':
print(f" ✓ 成功: 删除 {result['deleted_count']} 条旧关联, "
f"创建 {result['created_count']} 条新关联 "
f"(输入字段: {result['input_count']}, 输出字段: {result['output_count']})")
else:
print(f"{result['status']}: {result.get('reason', '')}")
# 统计信息
print("\n" + "="*80)
print("处理结果统计")
print("="*80)
success_count = sum(1 for r in results if r['status'] == 'success')
skipped_count = sum(1 for r in results if r['status'] == 'skipped')
total_input = sum(r.get('input_count', 0) for r in results)
total_output = sum(r.get('output_count', 0) for r in results)
total_created = sum(r.get('created_count', 0) for r in results)
print(f"\n成功处理: {success_count} 个模板")
print(f"跳过: {skipped_count} 个模板")
print(f"总输入字段关联: {total_input}")
print(f"总输出字段关联: {total_output}")
print(f"总关联关系: {total_created}")
# 显示详细结果
print("\n详细结果:")
for result in results:
if result['status'] == 'success':
print(f" - {result['template_name']}: "
f"输入字段 {result['input_count']} 个, "
f"输出字段 {result['output_count']} 个")
else:
print(f" - {result['template_name']}: {result['status']} - {result.get('reason', '')}")
print("\n" + "="*80)
if dry_run:
print("\n这是DRY RUN模式,未实际修改数据库。")
print("要实际执行,请运行: python rebuild_template_field_relations.py --execute")
else:
print("\n✓ 关联关系已更新完成")
except Exception as e:
print(f"\n✗ 发生错误: {e}")
import traceback
traceback.print_exc()
if not dry_run:
conn.rollback()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
import sys
dry_run = '--execute' not in sys.argv
if not dry_run:
print("\n⚠ 警告: 这将修改数据库!")
response = input("确认要继续吗? (yes/no): ")
if response.lower() != 'yes':
print("操作已取消")
sys.exit(0)
main(dry_run=dry_run)

@@ -1,10 +1,12 @@
 flask==3.0.0
 flask-cors==4.0.0
 pymysql==1.1.2
+cryptography>=41.0.0
 python-dotenv==1.0.0
 requests==2.31.0
 flasgger==0.9.7.1
 python-docx==1.1.0
 minio==7.2.3
 openpyxl==3.1.2
+json-repair


@@ -0,0 +1,405 @@
"""
Re-scan template placeholders and update the database
1. Scan all local template files, including newly converted .docx files
2. Extract all placeholders
3. Check the template records in the database
4. Update the database if anything changed
"""
import os
import pymysql
from pathlib import Path
from typing import Dict, List, Set, Tuple
from dotenv import load_dotenv
import re
from docx import Document
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def generate_id():
"""Generate an ID from the current microsecond timestamp (two calls within the same microsecond collide)"""
import time
return int(time.time() * 1000000)
def scan_local_templates(base_dir: Path) -> Dict[str, Path]:
"""扫描本地模板文件"""
templates = {}
if not base_dir.exists():
return templates
for file_path in base_dir.rglob('*'):
if file_path.is_file():
# document files only: .docx preferred; .doc and .wps are included for inspection
if file_path.suffix.lower() in ['.doc', '.docx', '.wps']:
relative_path = file_path.relative_to(PROJECT_ROOT)
relative_path_str = str(relative_path).replace('\\', '/')
templates[relative_path_str] = file_path
return templates
def get_actual_tenant_id(conn) -> int:
"""获取数据库中的实际tenant_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("SELECT DISTINCT tenant_id FROM f_polic_file_config LIMIT 1")
result = cursor.fetchone()
if result:
return result['tenant_id']
return 1 # 默认值
finally:
cursor.close()
def get_db_templates(conn, tenant_id: int) -> Dict[str, Dict]:
"""从数据库获取所有模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path, state, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (tenant_id,))
templates = cursor.fetchall()
result = {}
for template in templates:
file_path = template['file_path']
if file_path:
result[file_path] = {
'id': template['id'],
'name': template['name'],
'file_path': file_path,
'state': template['state'],
'parent_id': template['parent_id']
}
return result
finally:
cursor.close()
def extract_placeholders_from_docx(file_path: Path) -> Tuple[Set[str], bool]:
"""
Extract all placeholders from a .docx file
Returns:
(placeholder set, read success flag)
"""
placeholders = set()
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')
success = False
try:
doc = Document(file_path)
success = True
# 从段落中提取占位符
for paragraph in doc.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
# 从表格中提取占位符
for table in doc.tables:
try:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
except Exception as e:
# 某些表格结构可能导致错误,跳过
continue
except Exception as e:
# 文件读取失败(可能是.doc格式或其他问题
return placeholders, False
return placeholders, success
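The pattern r'\{\{([^}]+)\}\}' captures everything between double braces up to the first closing brace; inner whitespace is tolerated because each match is .strip()ped. A quick check on hypothetical sample text:

```python
import re

# Same pattern as extract_placeholders_from_docx.
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')

text = '姓名:{{ target_name }} 单位:{{target_organization}}'
found = {m.strip() for m in placeholder_pattern.findall(text)}
# found == {'target_name', 'target_organization'}
```

Two caveats: a placeholder whose body itself contains '}' will not match, and the extraction reads paragraph.text (which python-docx assembles from the underlying runs), so placeholders that Word has split across runs are still matched.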
def scan_all_templates_placeholders(local_templates: Dict[str, Path]) -> Dict[str, Tuple[Set[str], bool, str]]:
"""
Scan the placeholders of every template
Returns:
dict keyed by relative path; value is (placeholder set, read success flag, file extension)
"""
results = {}
for rel_path, file_path in local_templates.items():
file_ext = file_path.suffix.lower()
placeholders, success = extract_placeholders_from_docx(file_path)
results[rel_path] = (placeholders, success, file_ext)
return results
def update_or_create_template(conn, tenant_id: int, rel_path: str, file_path: Path, db_templates: Dict[str, Dict]):
"""更新或创建模板记录"""
cursor = conn.cursor()
try:
# 检查是否已存在
if rel_path in db_templates:
# 已存在,检查是否需要更新
template_id = db_templates[rel_path]['id']
# 这里可以添加更新逻辑,比如更新名称等
return template_id, 'exists'
else:
# 不存在,创建新记录
template_id = generate_id()
file_name = file_path.stem # 不含扩展名的文件名
cursor.execute("""
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (
template_id,
tenant_id,
None, # parent_id
file_name,
'{}', # input_data
rel_path,
CREATED_BY,
UPDATED_BY
))
conn.commit()
return template_id, 'created'
except Exception as e:
conn.rollback()
raise e
finally:
cursor.close()
def main():
"""主函数"""
print_section("重新扫描模板占位符并更新数据库")
# 1. 扫描本地模板
print_section("1. 扫描本地模板文件")
local_templates = scan_local_templates(TEMPLATES_DIR)
print_result(True, f"找到 {len(local_templates)} 个本地模板文件")
# 统计文件类型
file_types = {}
for file_path in local_templates.values():
ext = file_path.suffix.lower()
file_types[ext] = file_types.get(ext, 0) + 1
print("\n文件类型统计:")
for ext, count in sorted(file_types.items()):
print(f" {ext}: {count}")
if not local_templates:
print_result(False, "未找到本地模板文件")
return
# 2. 连接数据库
print_section("2. 连接数据库")
try:
conn = pymysql.connect(**DB_CONFIG)
print_result(True, "数据库连接成功")
except Exception as e:
print_result(False, f"数据库连接失败: {str(e)}")
return
try:
# 3. 获取实际的tenant_id
print_section("3. 获取实际的tenant_id")
tenant_id = get_actual_tenant_id(conn)
print_result(True, f"实际tenant_id: {tenant_id}")
# 4. 获取数据库中的模板
print_section("4. 获取数据库中的模板配置")
db_templates = get_db_templates(conn, tenant_id)
print_result(True, f"找到 {len(db_templates)} 条数据库模板记录(有file_path的)")
# 5. 扫描所有模板的占位符
print_section("5. 扫描所有模板的占位符")
print(" 正在扫描,请稍候...")
template_placeholders = scan_all_templates_placeholders(local_templates)
# 统计结果
all_placeholders = set()
templates_with_placeholders = 0
templates_without_placeholders = 0
templates_read_success = 0
templates_read_failed = 0
doc_files = []
docx_files = []
for rel_path, (placeholders, success, file_ext) in template_placeholders.items():
all_placeholders.update(placeholders)
if success:
templates_read_success += 1
if placeholders:
templates_with_placeholders += 1
else:
templates_without_placeholders += 1
else:
templates_read_failed += 1
# failed .doc files are counted once by the elif branch below; appending
# here as well would double-count them
if file_ext == '.docx':
docx_files.append(rel_path)
elif file_ext == '.doc':
doc_files.append(rel_path)
print(f"\n扫描结果统计:")
print(f" - 成功读取: {templates_read_success}")
print(f" - 读取失败: {templates_read_failed}")
print(f" - 有占位符: {templates_with_placeholders}")
print(f" - 无占位符: {templates_without_placeholders}")
print(f" - 发现的占位符总数: {len(all_placeholders)} 个不同的占位符")
if doc_files:
print(f"\n [注意] 发现 {len(doc_files)} 个.doc文件(可能无法读取):")
for doc_file in doc_files[:5]:
print(f" - {doc_file}")
if len(doc_files) > 5:
print(f" ... 还有 {len(doc_files) - 5} 个")
print(f"\n .docx文件: {len(docx_files)}")
# 6. 显示所有占位符
print_section("6. 所有占位符列表")
if all_placeholders:
for placeholder in sorted(all_placeholders):
print(f" - {placeholder}")
else:
print(" 未发现占位符")
# 7. 检查并更新数据库
print_section("7. 检查并更新数据库")
missing_templates = []
for rel_path in local_templates.keys():
if rel_path not in db_templates:
missing_templates.append(rel_path)
if missing_templates:
print(f" 发现 {len(missing_templates)} 个缺失的模板记录")
created_count = 0
for rel_path in missing_templates:
file_path = local_templates[rel_path]
try:
template_id, status = update_or_create_template(conn, tenant_id, rel_path, file_path, db_templates)
if status == 'created':
print(f" [创建] ID={template_id}, 路径={rel_path}")
created_count += 1
except Exception as e:
print(f" [错误] 创建失败: {rel_path}, 错误: {str(e)}")
if created_count > 0:
print_result(True, f"成功创建 {created_count} 条模板记录")
else:
print_result(True, "所有本地模板都已存在于数据库中")
# 8. 检查文件格式变化(.doc -> .docx
print_section("8. 检查文件格式变化")
# 检查数据库中是否有.doc路径但本地已经是.docx
format_changes = []
for db_path, db_info in db_templates.items():
if db_path.endswith('.doc'):
# check for a matching .docx; strip only the trailing extension, since
# str.replace would also rewrite any ".doc" inside directory names
docx_path = db_path[:-len('.doc')] + '.docx'
if docx_path in local_templates:
format_changes.append((db_path, docx_path, db_info))
if format_changes:
print(f" 发现 {len(format_changes)} 个文件格式变化(.doc -> .docx)")
updated_count = 0
for old_path, new_path, db_info in format_changes:
try:
cursor = conn.cursor()
cursor.execute("""
UPDATE f_polic_file_config
SET file_path = %s
WHERE id = %s
""", (new_path, db_info['id']))
conn.commit()
cursor.close()
print(f" [更新] ID={db_info['id']}, 名称={db_info['name']}")
print(f" 旧路径: {old_path}")
print(f" 新路径: {new_path}")
updated_count += 1
except Exception as e:
print(f" [错误] 更新失败: {str(e)}")
if updated_count > 0:
print_result(True, f"成功更新 {updated_count} 条路径记录")
else:
print_result(True, "未发现文件格式变化")
# 9. 生成详细报告
print_section("9. 详细报告")
# 找出有占位符的模板示例
templates_with_placeholders_list = []
for rel_path, (placeholders, success, file_ext) in template_placeholders.items():
if success and placeholders and file_ext == '.docx':
templates_with_placeholders_list.append((rel_path, placeholders))
if templates_with_placeholders_list:
print(f"\n 有占位符的模板示例(前5个):")
for i, (rel_path, placeholders) in enumerate(templates_with_placeholders_list[:5], 1):
print(f"\n {i}. {Path(rel_path).name}")
print(f" 路径: {rel_path}")
print(f" 占位符数量: {len(placeholders)}")
print(f" 占位符: {sorted(placeholders)}")
finally:
conn.close()
print_result(True, "数据库连接已关闭")
print_section("完成")
if __name__ == "__main__":
main()

restore_database.py (new file, 340 lines)

@@ -0,0 +1,340 @@
"""
Database restore script
Restores the database from a SQL backup file
"""
import os
import sys
import subprocess
import pymysql
from pathlib import Path
from dotenv import load_dotenv
import gzip
# 加载环境变量
load_dotenv()
class DatabaseRestore:
"""数据库恢复类"""
def __init__(self):
"""初始化数据库配置"""
self.db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
def restore_with_mysql(self, backup_file, drop_database=False):
"""
使用mysql命令恢复数据库(推荐方式)
Args:
backup_file: 备份文件路径
drop_database: 是否先删除数据库(危险操作)
Returns:
是否成功
"""
backup_file = Path(backup_file)
if not backup_file.exists():
raise FileNotFoundError(f"备份文件不存在: {backup_file}")
# 如果是压缩文件,先解压
sql_file = backup_file
temp_file = None
if backup_file.suffix == '.gz':
print(f"检测到压缩文件,正在解压...")
temp_file = backup_file.with_suffix('')
with gzip.open(backup_file, 'rb') as f_in:
with open(temp_file, 'wb') as f_out:
f_out.write(f_in.read())
sql_file = temp_file
print(f"解压完成: {sql_file}")
try:
print(f"开始恢复数据库 {self.db_config['database']}...")
print(f"备份文件: {backup_file}")
# 如果指定删除数据库
if drop_database:
print("警告: 将删除现有数据库!")
confirm = input("确认继续? (yes/no): ")
if confirm.lower() != 'yes':
print("已取消恢复操作")
return False
# 删除数据库
self._drop_database()
# 构建mysql命令
cmd = [
'mysql',
f"--host={self.db_config['host']}",
f"--port={self.db_config['port']}",
f"--user={self.db_config['user']}",
f"--password={self.db_config['password']}",
'--default-character-set=utf8mb4',
self.db_config['database']
]
# 执行恢复命令
with open(sql_file, 'r', encoding='utf-8') as f:
result = subprocess.run(
cmd,
stdin=f,
stderr=subprocess.PIPE,
text=True
)
if result.returncode != 0:
# text=True 时 stderr 已是字符串,不能再调用 decode
error_msg = result.stderr if result.stderr else '未知错误'
raise Exception(f"mysql执行失败: {error_msg}")
print("恢复完成!")
return True
except FileNotFoundError:
print("错误: 未找到mysql命令,请确保MySQL客户端已安装并在PATH中")
print("尝试使用Python方式恢复...")
return self.restore_with_python(backup_file, drop_database)
except Exception as e:
print(f"恢复失败: {str(e)}")
raise
finally:
# 清理临时解压文件
if temp_file and temp_file.exists():
temp_file.unlink()
def restore_with_python(self, backup_file, drop_database=False):
"""
使用Python直接连接数据库恢复(备用方式)
Args:
backup_file: 备份文件路径
drop_database: 是否先删除数据库(危险操作)
Returns:
是否成功
"""
backup_file = Path(backup_file)
if not backup_file.exists():
raise FileNotFoundError(f"备份文件不存在: {backup_file}")
# 如果是压缩文件,先解压
sql_file = backup_file
temp_file = None
if backup_file.suffix == '.gz':
print(f"检测到压缩文件,正在解压...")
temp_file = backup_file.with_suffix('')
with gzip.open(backup_file, 'rb') as f_in:
with open(temp_file, 'wb') as f_out:
f_out.write(f_in.read())
sql_file = temp_file
print(f"解压完成: {sql_file}")
try:
print(f"开始使用Python方式恢复数据库 {self.db_config['database']}...")
print(f"备份文件: {backup_file}")
# 如果指定删除数据库
if drop_database:
print("警告: 将删除现有数据库!")
confirm = input("确认继续? (yes/no): ")
if confirm.lower() != 'yes':
print("已取消恢复操作")
return False
# 删除数据库
self._drop_database()
# 连接数据库
connection = pymysql.connect(**self.db_config)
cursor = connection.cursor()
# 读取SQL文件
print("读取SQL文件...")
with open(sql_file, 'r', encoding='utf-8') as f:
sql_content = f.read()
# 分割SQL语句(按分号分割,但要注意字符串中的分号)
print("执行SQL语句...")
statements = self._split_sql_statements(sql_content)
total = len(statements)
print(f"{total} 条SQL语句")
# 执行每条SQL语句
for i, statement in enumerate(statements, 1):
statement = statement.strip()
if not statement or statement.startswith('--'):
continue
try:
cursor.execute(statement)
if i % 100 == 0:
print(f"进度: {i}/{total} ({i*100//total}%)")
except Exception as e:
# 某些错误可以忽略(如表已存在等)
error_msg = str(e).lower()
if 'already exists' in error_msg or 'duplicate' in error_msg:
continue
print(f"警告: 执行SQL语句时出错 (第{i}条): {str(e)}")
print(f"SQL: {statement[:100]}...")
# 提交事务
connection.commit()
cursor.close()
connection.close()
print("恢复完成!")
return True
except Exception as e:
print(f"恢复失败: {str(e)}")
raise
finally:
# 清理临时解压文件
if temp_file and temp_file.exists():
temp_file.unlink()
def _split_sql_statements(self, sql_content):
"""
分割SQL语句(处理字符串中的分号)
Args:
sql_content: SQL内容
Returns:
SQL语句列表
"""
statements = []
current_statement = []
in_string = False
string_char = None
i = 0
while i < len(sql_content):
char = sql_content[i]
# 检测字符串开始/结束
if char in ("'", '"', '`') and (i == 0 or sql_content[i-1] != '\\'):
if not in_string:
in_string = True
string_char = char
elif char == string_char:
in_string = False
string_char = None
current_statement.append(char)
# 如果不在字符串中且遇到分号,分割语句
if not in_string and char == ';':
statement = ''.join(current_statement).strip()
if statement:
statements.append(statement)
current_statement = []
i += 1
# 添加最后一条语句
if current_statement:
statement = ''.join(current_statement).strip()
if statement:
statements.append(statement)
return statements
def _drop_database(self):
"""删除数据库(危险操作)"""
try:
# 连接到MySQL服务器(不指定数据库)
config = self.db_config.copy()
config.pop('database')
connection = pymysql.connect(**config)
cursor = connection.cursor()
cursor.execute(f"DROP DATABASE IF EXISTS `{self.db_config['database']}`")
cursor.execute(f"CREATE DATABASE `{self.db_config['database']}` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci")
connection.commit()
cursor.close()
connection.close()
print(f"数据库 {self.db_config['database']} 已删除并重新创建")
except Exception as e:
raise Exception(f"删除数据库失败: {str(e)}")
def test_connection(self):
"""测试数据库连接"""
try:
connection = pymysql.connect(**self.db_config)
cursor = connection.cursor()
cursor.execute("SELECT VERSION()")
version = cursor.fetchone()[0]
cursor.close()
connection.close()
print(f"数据库连接成功!MySQL版本: {version}")
return True
except Exception as e:
print(f"数据库连接失败: {str(e)}")
return False
def main():
"""主函数"""
import argparse
parser = argparse.ArgumentParser(description='数据库恢复工具')
parser.add_argument('backup_file', help='备份文件路径')
parser.add_argument('--method', choices=['mysql', 'python', 'auto'],
default='auto', help='恢复方法 (默认: auto)')
parser.add_argument('--drop-db', action='store_true',
help='恢复前删除现有数据库(危险操作)')
parser.add_argument('--test', action='store_true',
help='仅测试数据库连接')
args = parser.parse_args()
restore = DatabaseRestore()
# 测试连接
if args.test:
restore.test_connection()
return
# 执行恢复
try:
if args.method == 'mysql':
success = restore.restore_with_mysql(args.backup_file, args.drop_db)
elif args.method == 'python':
success = restore.restore_with_python(args.backup_file, args.drop_db)
else: # auto
try:
success = restore.restore_with_mysql(args.backup_file, args.drop_db)
except Exception:
print("\nmysql方式失败切换到Python方式...")
success = restore.restore_with_python(args.backup_file, args.drop_db)
if success:
print("\n恢复成功!")
else:
print("\n恢复失败!")
sys.exit(1)
except Exception as e:
print(f"\n恢复失败: {str(e)}")
sys.exit(1)
if __name__ == '__main__':
main()
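Both restore paths above decompress a `.gz` backup with `f_in.read()`, which pulls the entire dump into memory before writing it out; for multi-gigabyte backups a streamed copy is gentler. A minimal standard-library sketch (illustrative, not wired into the class; `decompress_gz` is a hypothetical helper):

```python
import gzip
import shutil
from pathlib import Path

def decompress_gz(src, dest, chunk_size=1 << 20):
    """Stream-decompress src (.gz) to dest without loading it all at once."""
    src_path, dest_path = Path(src), Path(dest)
    with gzip.open(src_path, 'rb') as f_in, open(dest_path, 'wb') as f_out:
        # copyfileobj copies in chunk_size blocks, keeping memory flat
        shutil.copyfileobj(f_in, f_out, length=chunk_size)
    return dest_path
```

Dropping this in place of the `f_in.read()` block would keep peak memory at roughly one chunk regardless of backup size.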

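The quote handling in `_split_sql_statements` can be exercised in isolation; the sketch below mirrors its logic as a standalone function, for illustration only (`split_sql` is not part of the script above):

```python
def split_sql(sql):
    """Split SQL text on ';' while ignoring semicolons inside ', " or ` strings."""
    statements, current = [], []
    in_string, quote = False, None
    for i, ch in enumerate(sql):
        # toggle string state on an unescaped quote character
        if ch in ("'", '"', '`') and (i == 0 or sql[i - 1] != '\\'):
            if not in_string:
                in_string, quote = True, ch
            elif ch == quote:
                in_string, quote = False, None
        current.append(ch)
        # a semicolon outside any string terminates a statement
        if not in_string and ch == ';':
            stmt = ''.join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
    tail = ''.join(current).strip()
    if tail:
        statements.append(tail)
    return statements

print(split_sql("INSERT INTO t VALUES ('a;b'); SELECT 1;"))
```

A semicolon inside the quoted `'a;b'` does not split the statement, which is exactly the case a naive `sql.split(';')` would get wrong.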

@@ -0,0 +1,122 @@
"""
回滚错误的更新:恢复被错误修改的字段
"""
import os
import pymysql
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
# 需要恢复的字段映射字段ID -> 正确的field_code
ROLLBACK_MAPPING = {
# 这些字段被错误地从英文改成了中文,需要恢复
1764656917410273: 'target_issue_description',
1764656918032031: 'filler_name',
1764656917418979: 'department_opinion',
1764836032906561: 'appointment_location',
1764836032488198: 'appointment_time',
1764836033052889: 'approval_time',
1764836032655678: 'handler_name',
1764836033342084: 'handling_department',
1764836033240593: 'investigation_unit_name',
1764836033018470: 'investigation_location',
1764836033274278: 'investigation_team_code',
1764836033094781: 'investigation_team_member_names',
1764836033176386: 'investigation_team_leader_name',
1764836033500799: 'commission_name',
1764656917384058: 'clue_info',
1764656917861268: 'clue_source',
1764836032538308: 'target_address',
1764836033565636: 'target_health_status',
1764836033332970: 'target_other_situation',
1764656917299164: 'target_date_of_birth',
1764836033269146: 'target_date_of_birth_full',
1765151880445876: 'target_organization',
1764656917367205: 'target_organization_and_position',
1764836033405778: 'target_family_situation',
1764836033162748: 'target_work_basic_info',
1764656917996367: 'target_basic_info_clue',
1764836032997850: 'target_age',
1764656917561689: 'target_gender',
1764836032855869: 'target_personality',
1764836032893680: 'target_registered_address',
1764836033603501: 'target_tolerance',
1764656917185956: 'target_political_status',
1764836033786057: 'target_attitude',
1764836033587951: 'target_previous_investigation',
1764836032951705: 'target_ethnicity',
1764836033280024: 'target_other_issues_possibility',
1764836033458872: 'target_issue_severity',
1764836032929811: 'target_social_relations',
1764836033618877: 'target_negative_events',
1764836032926994: 'target_place_of_origin',
1765151880304552: 'target_position',
1764656917802442: 'target_professional_rank',
1764836032817243: 'target_contact',
1764836032902356: 'target_id_number',
1764836032913357: 'target_id_number',
1764656917073644: 'target_name',
1764836033571266: 'target_problem_description',
1764836032827460: 'report_card_request_time',
1764836032694865: 'notification_location',
1764836032909732: 'notification_time',
1764836033451248: 'risk_level',
}
def rollback():
"""回滚错误的更新"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("回滚错误的字段更新")
print("="*80)
print(f"\n需要恢复 {len(ROLLBACK_MAPPING)} 个字段\n")
# 先查询当前状态
for field_id, correct_code in ROLLBACK_MAPPING.items():
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE id = %s AND tenant_id = %s
""", (field_id, TENANT_ID))
field = cursor.fetchone()
if field:
print(f" ID: {field_id}")
print(f" 名称: {field['name']}")
print(f" 当前field_code: {field['filed_code']}")
print(f" 恢复为: {correct_code}")
print()
# 执行回滚
print("开始执行回滚...\n")
for field_id, correct_code in ROLLBACK_MAPPING.items():
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (correct_code, UPDATED_BY, field_id, TENANT_ID))
print(f" ✓ 恢复字段 ID {field_id}: {correct_code}")
conn.commit()
print("\n✓ 回滚完成")
cursor.close()
conn.close()
if __name__ == '__main__':
rollback()
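The rollback loop above issues one `UPDATE` per field. With pymysql, `cursor.executemany` can batch them into a single call; a sketch of the parameter preparation (`build_rollback_params` is a hypothetical helper, and the commented line shows the intended use with an open connection):

```python
def build_rollback_params(mapping, tenant_id, updated_by):
    """Turn {field_id: correct_code} into parameter tuples ordered to match:
    SET filed_code = %s, updated_by = %s WHERE id = %s AND tenant_id = %s"""
    return [(code, updated_by, fid, tenant_id) for fid, code in mapping.items()]

ROLLBACK_SQL = (
    "UPDATE f_polic_field "
    "SET filed_code = %s, updated_time = NOW(), updated_by = %s "
    "WHERE id = %s AND tenant_id = %s"
)

# Usage with an open pymysql connection:
# cursor.executemany(ROLLBACK_SQL, build_rollback_params(ROLLBACK_MAPPING, TENANT_ID, UPDATED_BY))
# conn.commit()
```

Note the `filed_code` spelling matches the actual column name used throughout the schema.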

services/ai_logger.py Normal file

@@ -0,0 +1,181 @@
"""
AI对话日志记录模块
用于记录大模型对话的输入和输出信息,方便排查问题
"""
import os
import json
import time
from datetime import datetime
from pathlib import Path
from typing import Dict, Optional, Any
from threading import Lock
class AILogger:
"""AI对话日志记录器"""
def __init__(self, log_dir: Optional[str] = None):
"""
初始化日志记录器
Args:
log_dir: 日志文件保存目录,默认为项目根目录下的 logs/ai_conversations 目录
"""
if log_dir is None:
# 默认日志目录:项目根目录下的 logs/ai_conversations
project_root = Path(__file__).parent.parent
log_dir = project_root / "logs" / "ai_conversations"
self.log_dir = Path(log_dir)
self.log_dir.mkdir(parents=True, exist_ok=True)
# 线程锁,确保日志写入的线程安全
self._lock = Lock()
# 是否启用日志记录(可通过环境变量控制)
self.enabled = os.getenv('AI_LOG_ENABLED', 'true').lower() == 'true'
print(f"[AI日志] 日志记录器初始化完成,日志目录: {self.log_dir}")
print(f"[AI日志] 日志记录状态: {'启用' if self.enabled else '禁用'}")
def log_conversation(
self,
prompt: str,
api_request: Dict[str, Any],
api_response: Optional[Dict[str, Any]] = None,
extracted_data: Optional[Dict[str, Any]] = None,
error: Optional[str] = None,
session_id: Optional[str] = None
) -> str:
"""
记录一次完整的AI对话
Args:
prompt: 输入提示词
api_request: API请求参数
api_response: API响应内容(完整响应)
extracted_data: 提取后的结构化数据
error: 错误信息(如果有)
session_id: 会话ID(可选,用于关联多次对话)
Returns:
日志文件路径
"""
if not self.enabled:
return ""
try:
with self._lock:
# 生成时间戳和会话ID
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")[:-3] # 精确到毫秒
if session_id is None:
session_id = f"session_{int(time.time() * 1000)}"
# 创建日志记录
log_entry = {
"timestamp": datetime.now().isoformat(),
"session_id": session_id,
"prompt": prompt,
"api_request": {
"endpoint": api_request.get("endpoint", "unknown"),
"model": api_request.get("model", "unknown"),
"messages": api_request.get("messages", []),
"temperature": api_request.get("temperature"),
"max_tokens": api_request.get("max_tokens"),
"enable_thinking": api_request.get("enable_thinking", False),
},
"api_response": api_response,
"extracted_data": extracted_data,
"error": error,
"success": error is None
}
# 保存到文件(按日期组织)
date_str = datetime.now().strftime("%Y%m%d")
log_file = self.log_dir / f"conversation_{date_str}_{timestamp}.json"
with open(log_file, 'w', encoding='utf-8') as f:
json.dump(log_entry, f, ensure_ascii=False, indent=2)
print(f"[AI日志] 对话日志已保存: {log_file.name}")
return str(log_file)
except Exception as e:
print(f"[AI日志] 保存日志失败: {e}")
return ""
def log_request_only(
self,
prompt: str,
api_request: Dict[str, Any],
session_id: Optional[str] = None
) -> str:
"""
仅记录请求信息(在发送请求前调用)
Args:
prompt: 输入提示词
api_request: API请求参数
session_id: 会话ID
Returns:
日志文件路径
"""
return self.log_conversation(
prompt=prompt,
api_request=api_request,
session_id=session_id
)
def get_recent_logs(self, limit: int = 10) -> list:
"""
获取最近的日志文件列表
Args:
limit: 返回的日志文件数量
Returns:
日志文件路径列表(按时间倒序)
"""
try:
log_files = sorted(
self.log_dir.glob("conversation_*.json"),
key=lambda x: x.stat().st_mtime,
reverse=True
)
return [str(f) for f in log_files[:limit]]
except Exception as e:
print(f"[AI日志] 获取日志列表失败: {e}")
return []
def read_log(self, log_file: str) -> Optional[Dict]:
"""
读取指定的日志文件
Args:
log_file: 日志文件路径
Returns:
日志内容字典,如果读取失败返回None
"""
try:
log_path = Path(log_file)
if not log_path.is_absolute():
log_path = self.log_dir / log_file
with open(log_path, 'r', encoding='utf-8') as f:
return json.load(f)
except Exception as e:
print(f"[AI日志] 读取日志文件失败: {e}")
return None
# 全局日志记录器实例
_ai_logger: Optional[AILogger] = None
def get_ai_logger() -> AILogger:
"""获取全局AI日志记录器实例"""
global _ai_logger
if _ai_logger is None:
_ai_logger = AILogger()
return _ai_logger
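The write path of `AILogger` (a lock guarding each write, one timestamped JSON file per conversation) can be illustrated with a stripped-down standalone version; this is a sketch for clarity, not the module itself:

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path
from threading import Lock

class MiniLogger:
    """Stripped-down illustration of AILogger's write path."""

    def __init__(self, log_dir):
        self.log_dir = Path(log_dir)
        self.log_dir.mkdir(parents=True, exist_ok=True)
        self._lock = Lock()  # serialize writes across threads

    def log(self, prompt, error=None):
        with self._lock:
            entry = {
                "timestamp": datetime.now().isoformat(),
                "prompt": prompt,
                "error": error,
                "success": error is None,
            }
            # one JSON file per conversation, microsecond-resolution name
            name = f"conversation_{datetime.now():%Y%m%d_%H%M%S_%f}.json"
            path = self.log_dir / name
            path.write_text(json.dumps(entry, ensure_ascii=False, indent=2),
                            encoding='utf-8')
            return str(path)

logger = MiniLogger(tempfile.mkdtemp())
saved = logger.log("hello")
```

Writing each conversation to its own file keeps the lock hold time short and makes individual exchanges easy to inspect or delete.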

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -12,15 +12,27 @@ class FieldService:
     """字段服务类"""

     def __init__(self):
+        # 从环境变量读取数据库配置,不设置默认值,确保必须通过.env文件配置
+        db_host = os.getenv('DB_HOST')
+        db_port = os.getenv('DB_PORT')
+        db_user = os.getenv('DB_USER')
+        db_password = os.getenv('DB_PASSWORD')
+        db_name = os.getenv('DB_NAME')
+        if not all([db_host, db_port, db_user, db_password, db_name]):
+            raise ValueError(
+                "数据库配置不完整,请在.env文件中配置以下环境变量:\n"
+                "DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME"
+            )
         self.db_config = {
-            'host': os.getenv('DB_HOST', '152.136.177.240'),
-            'port': int(os.getenv('DB_PORT', 5012)),
-            'user': os.getenv('DB_USER', 'finyx'),
-            'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
-            'database': os.getenv('DB_NAME', 'finyx'),
+            'host': db_host,
+            'port': int(db_port),
+            'user': db_user,
+            'password': db_password,
+            'database': db_name,
             'charset': 'utf8mb4'
         }
-        self.tenant_id = 615873064429507639

         # 加载提示词配置文件
         self.prompt_config = self._load_prompt_config()
@@ -137,17 +149,16 @@ class FieldService:
         cursor = conn.cursor(pymysql.cursors.DictCursor)

         try:
-            # 根据字段编码查询字段信息
+            # 根据字段编码查询字段信息(不限制tenant_id)
             placeholders = ','.join(['%s'] * len(field_codes))
             sql = f"""
                 SELECT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
-                WHERE f.tenant_id = %s
-                AND f.filed_code IN ({placeholders})
+                WHERE f.filed_code IN ({placeholders})
                 AND f.field_type = 2
                 ORDER BY f.id
             """
-            cursor.execute(sql, [self.tenant_id] + field_codes)
+            cursor.execute(sql, field_codes)
             fields = cursor.fetchall()

             # 转换为字典列表
@@ -183,12 +194,11 @@ class FieldService:
             sql = """
                 SELECT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
-                WHERE f.tenant_id = %s
-                AND f.filed_code = %s
+                WHERE f.filed_code = %s
                 AND f.field_type = 1
                 LIMIT 1
             """
-            cursor.execute(sql, (self.tenant_id, field_code))
+            cursor.execute(sql, (field_code,))
             field = cursor.fetchone()

             if field:
@@ -224,12 +234,11 @@ class FieldService:
             sql_input = """
                 SELECT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
-                WHERE f.tenant_id = %s
-                AND f.field_type = 1
+                WHERE f.field_type = 1
                 AND (f.filed_code = 'clue_info' OR f.filed_code = 'target_basic_info_clue')
                 ORDER BY f.id
             """
-            cursor.execute(sql_input, (self.tenant_id,))
+            cursor.execute(sql_input)
             input_fields = cursor.fetchall()

             # 获取输出字段(field_type=2)
@@ -239,12 +248,11 @@ class FieldService:
                 FROM f_polic_field f
                 INNER JOIN f_polic_file_field ff ON f.id = ff.filed_id
                 INNER JOIN f_polic_file_config fc ON ff.file_id = fc.id
-                WHERE f.tenant_id = %s
-                AND f.field_type = 2
+                WHERE f.field_type = 2
                 AND fc.state = 1
                 ORDER BY f.id
             """
-            cursor.execute(sql_output, (self.tenant_id,))
+            cursor.execute(sql_output)
             all_output_fields = cursor.fetchall()

             # 根据business_type过滤输出字段
@@ -252,10 +260,9 @@ class FieldService:
             sql_file_configs = """
                 SELECT id, name, input_data
                 FROM f_polic_file_config
-                WHERE tenant_id = %s
-                AND state = 1
+                WHERE state = 1
             """
-            cursor.execute(sql_file_configs, (self.tenant_id,))
+            cursor.execute(sql_file_configs)
             file_configs = cursor.fetchall()

             # 找到匹配business_type的文件配置ID列表
@@ -277,12 +284,11 @@ class FieldService:
                 SELECT DISTINCT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
                 INNER JOIN f_polic_file_field ff ON f.id = ff.filed_id
-                WHERE f.tenant_id = %s
-                AND f.field_type = 2
+                WHERE f.field_type = 2
                 AND ff.file_id IN ({placeholders})
                 ORDER BY f.id
             """
-            cursor.execute(sql_filtered, [self.tenant_id] + matching_file_ids)
+            cursor.execute(sql_filtered, matching_file_ids)
             output_fields = cursor.fetchall()

             return {


@@ -326,11 +326,18 @@
         </div>
         <div class="form-group">
-            <label>文件列表</label>
+            <label>文件列表(文档模板类型)</label>
+            <div style="margin-bottom: 10px;">
+                <button class="btn btn-secondary" onclick="loadAvailableFiles()" style="margin-right: 10px;">📋 加载全部可用模板</button>
+                <button class="btn btn-secondary" onclick="addFileItem()" style="margin-right: 10px;">+ 手动添加文件</button>
+                <button class="btn btn-danger" onclick="clearAllFiles()">🗑️ 清空列表</button>
+            </div>
+            <div style="margin-bottom: 10px; padding: 10px; background: #f0f0f0; border-radius: 4px; font-size: 13px; color: #666;">
+                💡 提示:点击"加载全部可用模板"可以加载所有可用的文档模板类型,方便测试不同模板的生成效果
+            </div>
             <div id="fileListContainer">
                 <!-- 动态生成的文件列表 -->
             </div>
-            <button class="btn btn-secondary" onclick="addFileItem()">+ 添加文件</button>
         </div>
     </div>
@@ -381,9 +388,9 @@
         // ==================== 解析接口相关 ====================

         function initExtractTab() {
-            // 初始化默认输入字段(虚拟测试数据)
-            addInputField('clue_info', '被举报用户名称是张三年龄44岁某公司总经理男性1980年5月出生本科文化程度中共党员正处级。主要问题线索违反国家计划生育有关政策规定于2010年10月生育二胎。线索来源群众举报。');
-            addInputField('target_basic_info_clue', '被核查人员工作基本情况张三1980年5月生本科文化中共党员现为某公司总经理正处级。');
+            // 初始化默认输入字段
+            addInputField('clue_info', '张三多次在私下聚会、网络群组中发表抹黑党中央决策部署的言论传播歪曲党的理论和路线方针政策的错误观点频繁接受管理服务对象安排的高档宴请、私人会所聚餐以及高尔夫球、高端足浴等娱乐活动相关费用均由对方全额承担在干部选拔任用、岗位调整工作中利用职务便利收受他人财物利用职权为其亲属经营的公司谋取不正当利益帮助该公司违规承接本单位及关联单位工程项目3个合同总额超200万元从中收受亲属给予的"感谢费"15万元其本人沉迷赌博活动每周至少参与1次大额赌资赌博单次赌资超1万元累计赌资达数十万元。');
+            addInputField('target_basic_info_clue', '张三汉族1990年9月出生云南普洱人研究生学历2005年8月参加工作2006年10月加入中国共产党。2004年8月至2005年2月在云南省农业机械公司工作2005年2月至2012年2月历任云南省农业机械公司办公室副主任、主任、团委书记2012年2月至2018年3月任云南省农业机械公司支部书记、厂长2018年3月至2020年3月任云南省农业机械公司总经理助理、销售部部长2020年3月至2022年3月任云南省农业机械公司总经理助理2022年3月至2022年7月任云南省农业机械公司大理分公司副经理2022年7月至2023年12月任云南省农业机械公司西双版纳分公司经理2023年12月至今任云南省农业机械公司党支部书记、经理。');

             // 初始化默认输出字段(包含完整的字段列表)
             addOutputField('target_name');
@@ -548,26 +555,153 @@
         // ==================== 文档生成接口相关 ====================

-        function initGenerateTab() {
-            // 初始化默认字段(完整的虚拟测试数据)
+        async function loadAvailableFiles() {
+            try {
+                const response = await fetch('/api/file-configs');
+                const result = await response.json();
+                if (result.isSuccess && result.data && result.data.fileConfigs) {
+                    const container = document.getElementById('fileListContainer');
+                    container.innerHTML = ''; // 清空现有列表
+                    // 只添加有filePath的文件(有模板文件的)
+                    const filesWithPath = result.data.fileConfigs.filter(f => f.filePath);
+                    if (filesWithPath.length === 0) {
+                        alert('没有找到可用的文件配置(需要有filePath)');
+                        return;
+                    }
+                    // 加载所有可用文件
+                    filesWithPath.forEach(file => {
+                        addFileItem(file.fileId, file.fileName);
+                    });
+                    alert(`已加载全部 ${filesWithPath.length} 个可用文件模板`);
+                } else {
+                    alert('获取文件列表失败: ' + (result.errorMsg || '未知错误'));
+                }
+            } catch (error) {
+                alert('加载文件列表失败: ' + error.message);
+            }
+        }
+
+        async function initGenerateTab() {
+            // 初始化所有字段(完整的虚拟测试数据)
+            // 基本信息字段
             addGenerateField('target_name', '张三');
             addGenerateField('target_gender', '男');
-            addGenerateField('target_age', '44');
-            addGenerateField('target_date_of_birth', '198005');
-            addGenerateField('target_organization_and_position', '某公司总经理');
-            addGenerateField('target_organization', '某公司');
-            addGenerateField('target_position', '总经理');
-            addGenerateField('target_education_level', '本科');
+            addGenerateField('target_age', '34');
+            addGenerateField('target_date_of_birth', '199009');
+            addGenerateField('target_date_of_birth_full', '1990年9月');
+            addGenerateField('target_id_number', '530123199009123456');
+            addGenerateField('target_ethnicity', '汉族');
+            addGenerateField('target_place_of_origin', '云南普洱');
+            addGenerateField('target_address', '云南省昆明市五华区某某街道某某小区1栋1单元101室');
+            addGenerateField('target_registered_address', '云南省昆明市五华区某某街道某某小区1栋1单元101室');
+            addGenerateField('target_contact', '13800138000');
+
+            // 组织和工作信息
+            addGenerateField('target_organization_and_position', '云南省农业机械公司党支部书记、经理');
+            addGenerateField('target_organization', '云南省农业机械公司');
+            addGenerateField('target_position', '党支部书记、经理');
+            addGenerateField('target_education_level', '研究生');
+            addGenerateField('target_education', '研究生');
             addGenerateField('target_political_status', '中共党员');
-            addGenerateField('target_professional_rank', '正处级');
+            addGenerateField('target_professional_rank', '高级工程师');
+            addGenerateField('target_occupation', '企业管理人员');
+            addGenerateField('target_work_basic_info', '2005年8月参加工作现任云南省农业机械公司党支部书记、经理');
+            addGenerateField('target_work_history', '2004年8月至2005年2月在云南省农业机械公司工作2005年2月至2012年2月历任云南省农业机械公司办公室副主任、主任、团委书记2012年2月至2018年3月任云南省农业机械公司支部书记、厂长2018年3月至2020年3月任云南省农业机械公司总经理助理、销售部部长2020年3月至2022年3月任云南省农业机械公司总经理助理2022年3月至2022年7月任云南省农业机械公司大理分公司副经理2022年7月至2023年12月任云南省农业机械公司西双版纳分公司经理2023年12月至今任云南省农业机械公司党支部书记、经理。');
+            addGenerateField('target_basic_info', '张三汉族1990年9月出生云南普洱人研究生学历中共党员现任云南省农业机械公司党支部书记、经理。');
+
+            // 线索和问题信息
+            addGenerateField('clue_info', '张三多次在私下聚会、网络群组中发表抹黑党中央决策部署的言论传播歪曲党的理论和路线方针政策的错误观点频繁接受管理服务对象安排的高档宴请、私人会所聚餐以及高尔夫球、高端足浴等娱乐活动相关费用均由对方全额承担在干部选拔任用、岗位调整工作中利用职务便利收受他人财物利用职权为其亲属经营的公司谋取不正当利益帮助该公司违规承接本单位及关联单位工程项目3个合同总额超200万元从中收受亲属给予的"感谢费"15万元其本人沉迷赌博活动每周至少参与1次大额赌资赌博单次赌资超1万元累计赌资达数十万元。');
+            addGenerateField('target_basic_info_clue', '张三汉族1990年9月出生云南普洱人研究生学历2005年8月参加工作2006年10月加入中国共产党。2004年8月至2005年2月在云南省农业机械公司工作2005年2月至2012年2月历任云南省农业机械公司办公室副主任、主任、团委书记2012年2月至2018年3月任云南省农业机械公司支部书记、厂长2018年3月至2020年3月任云南省农业机械公司总经理助理、销售部部长2020年3月至2022年3月任云南省农业机械公司总经理助理2022年3月至2022年7月任云南省农业机械公司大理分公司副经理2022年7月至2023年12月任云南省农业机械公司西双版纳分公司经理2023年12月至今任云南省农业机械公司党支部书记、经理。');
             addGenerateField('clue_source', '群众举报');
-            addGenerateField('target_issue_description', '违反国家计划生育有关政策规定于2010年10月生育二胎。');
-            addGenerateField('department_opinion', '建议进行初步核实');
+            addGenerateField('target_issue_description', '张三多次在私下聚会、网络群组中发表抹黑党中央决策部署的言论传播歪曲党的理论和路线方针政策的错误观点频繁接受管理服务对象安排的高档宴请、私人会所聚餐以及高尔夫球、高端足浴等娱乐活动相关费用均由对方全额承担在干部选拔任用、岗位调整工作中利用职务便利收受他人财物利用职权为其亲属经营的公司谋取不正当利益帮助该公司违规承接本单位及关联单位工程项目3个合同总额超200万元从中收受亲属给予的"感谢费"15万元其本人沉迷赌博活动每周至少参与1次大额赌资赌博单次赌资超1万元累计赌资达数十万元。');
+            addGenerateField('target_problem_description', '违反政治纪律、组织纪律、廉洁纪律,涉嫌违纪违法');
+            addGenerateField('target_issue_severity', '严重');
+            addGenerateField('target_issue_severity_level', '严重');
+            addGenerateField('target_other_issues_possibility', '较大');
+
+            // 个人情况评估
+            addGenerateField('target_family_situation', '家庭关系和谐稳定');
+            addGenerateField('target_social_relations', '社会交往较多,人际关系基本正常');
+            addGenerateField('target_health_status', '良好');
+            addGenerateField('target_personality', '开朗');
+            addGenerateField('target_tolerance', '较强');
+            addGenerateField('target_previous_investigation', '无');
+            addGenerateField('target_negative_events', '无');
+            addGenerateField('target_other_situation', '无');
+
+            // 谈话和调查相关
+            addGenerateField('target_attitude', '配合调查');
+            addGenerateField('target_confession_level', '部分承认');
+            addGenerateField('target_behavior_during_interview', '情绪稳定,配合调查');
+            addGenerateField('target_behavior_after_relief', '情绪有所缓解');
+            addGenerateField('target_mental_burden_level', '中等');
+            addGenerateField('target_risk_level', '中');
+            addGenerateField('risk_level', '中');
+            addGenerateField('pre_interview_risk_assessment_result', '风险等级:中,已制定安全预案');
+
+            // 调查组织和人员
+            addGenerateField('investigation_unit_name', '纪检监察室');
+            addGenerateField('investigation_team_code', 'JC2024001');
+            addGenerateField('investigation_team_leader_name', '赵六');
+            addGenerateField('investigation_team_member_names', '赵六、钱七、孙八');
+            addGenerateField('investigation_location', '纪检监察室谈话室');
+            addGenerateField('handler_name', '王五');
+            addGenerateField('handling_department', '纪检监察室');
+            addGenerateField('commission_name', '中共某某市纪律检查委员会');
+
+            // 谈话相关
+            addGenerateField('interview_location', '纪检监察室谈话室');
+            addGenerateField('proposed_interview_location', '纪检监察室谈话室');
+            addGenerateField('notification_location', '纪检监察室');
+            addGenerateField('appointment_location', '纪检监察室谈话室');
+            addGenerateField('interview_time', '2024年12月10日14:00');
+            addGenerateField('proposed_interview_time', '2024年12月10日14:00');
+            addGenerateField('notification_time', '2024年12月9日');
+            addGenerateField('appointment_time', '2024年12月10日14:00');
+            addGenerateField('interview_reason', '就相关问题进行核实了解');
+            addGenerateField('interview_count', '1');
+            addGenerateField('interviewer', '赵六');
+            addGenerateField('recorder', '钱七');
+            addGenerateField('interview_personnel', '赵六、钱七');
+            addGenerateField('interview_personnel_leader', '赵六');
+            addGenerateField('interview_personnel_safety_officer', '孙八');
+            addGenerateField('backup_personnel', '周九');
+
+            // 审批和意见
+            addGenerateField('approval_time', '2024年12月8日');
+            addGenerateField('report_card_request_time', '2024年12月8日');
+            addGenerateField('department_opinion', '经初步核实,建议立案调查');
+            addGenerateField('assessment_opinion', '建议进行谈话核实');
             addGenerateField('filler_name', '李四');

-            // 初始化默认文件(包含多个模板用于测试)
-            addFileItem(1, '初步核实审批表.doc', 'PRELIMINARY_VERIFICATION_APPROVAL');
-            addFileItem(2, '请示报告卡.doc', 'REPORT_CARD');
+            // 自动加载所有可用的文件列表
+            try {
+                const response = await fetch('/api/file-configs');
+                const result = await response.json();
+                if (result.isSuccess && result.data && result.data.fileConfigs) {
+                    // 只添加有filePath的文件(有模板文件的)
+                    const filesWithPath = result.data.fileConfigs.filter(f => f.filePath);
+                    // 加载所有可用文件
+                    filesWithPath.forEach(file => {
+                        addFileItem(file.fileId, file.fileName);
+                    });
+                    if (filesWithPath.length > 0) {
+                        console.log(`已自动加载 ${filesWithPath.length} 个可用文件模板`);
+                    }
+                } else {
+                    console.warn('未找到可用的文件配置');
+                }
+            } catch (error) {
+                console.warn('自动加载文件列表失败:', error);
+            }
         }

         function addGenerateField(fieldCode = '', fieldValue = '') {
@@ -584,15 +718,14 @@
             container.appendChild(fieldDiv);
         }

-        function addFileItem(fileId = '', fileName = '', templateCode = '') {
+        function addFileItem(fileId = '', fileName = '') {
             const container = document.getElementById('fileListContainer');
             const fileDiv = document.createElement('div');
             fileDiv.className = 'field-row';
             fileDiv.innerHTML = `
-                <input type="number" placeholder="文件ID" value="${fileId}" class="file-id" style="width: 150px;">
+                <input type="number" placeholder="文件ID (从f_polic_file_config表获取)" value="${fileId}" class="file-id" style="width: 200px;">
                 <div style="display: flex; gap: 10px; flex: 1;">
                     <input type="text" placeholder="文件名称 (如: 初步核实审批表.doc)" value="${fileName}" class="file-name" style="flex: 1;">
-                    <input type="text" placeholder="模板编码 (如: PRELIMINARY_VERIFICATION_APPROVAL)" value="${templateCode}" class="template-code" style="flex: 1;">
                     <button class="btn btn-danger" onclick="removeField(this)">删除</button>
                 </div>
             `;
@@ -628,13 +761,11 @@
             fileContainers.forEach(container => {
                 const fileId = container.querySelector('.file-id').value.trim();
                 const fileName = container.querySelector('.file-name').value.trim();
-                const templateCode = container.querySelector('.template-code').value.trim();

-                if (fileId && fileName && templateCode) {
+                if (fileId) {
                     fileList.push({
                         fileId: parseInt(fileId),
-                        fileName: fileName,
-                        templateCode: templateCode
+                        fileName: fileName || 'generated.docx' // fileName可选
                     });
                 }
             });
@@ -714,8 +845,15 @@
                     result.data.fpolicFieldParamFileList.forEach(file => {
                         html += `<div class="result-item">
                             <strong>${file.fileName}:</strong><br>
-                            文件路径: ${file.filePath || '(无路径)'}
-                        </div>`;
+                            文件路径: ${file.filePath || '(无路径)'}<br>`;
+                        // 如果有下载链接,显示可点击的链接
+                        if (file.downloadUrl) {
+                            html += `下载链接: <a href="${file.downloadUrl}" target="_blank" style="color: #667eea; text-decoration: underline; word-break: break-all;">${file.downloadUrl}</a><br>`;
+                            html += `<button class="btn btn-secondary" onclick="window.open('${file.downloadUrl}', '_blank')" style="margin-top: 5px; padding: 6px 15px; font-size: 14px;">📥 下载文档</button>`;
+                        }
+                        html += `</div>`;
                     });
                 }
             }
@@ -741,6 +879,12 @@
             btn.closest('.field-row').remove();
         }

+        function clearAllFiles() {
+            if (confirm('确定要清空所有文件列表吗?')) {
+                document.getElementById('fileListContainer').innerHTML = '';
+            }
+        }
+
         function displayError(tabType, errorMsg) {
             const resultSection = document.getElementById(tabType + 'ResultSection');
             const resultBox = document.getElementById(tabType + 'ResultBox');

File diff suppressed because it is too large


@@ -0,0 +1,458 @@
"""
Sync three tables from the existing database to a new database.
Tables synced: f_polic_field, f_polic_file_config, f_polic_file_field
The target database is backed up before the sync runs.
"""
import os
import sys
import subprocess
import pymysql
from datetime import datetime
from pathlib import Path
from typing import List, Dict, Any
from dotenv import load_dotenv
# Force stdout/stderr to UTF-8 (Windows console compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
# 加载环境变量
load_dotenv()
# 现有数据库配置(源数据库)
SOURCE_DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# 新数据库配置(目标数据库)
TARGET_DB_CONFIG = {
'host': '10.100.31.21',
'port': 3306,
'user': 'finyx',
'password': 'FknJYz3FA5WDYtsd',
'database': 'finyx',
'charset': 'utf8mb4'
}
# 需要同步的表
TABLES_TO_SYNC = ['f_polic_field', 'f_polic_file_config', 'f_polic_file_field']
# 备份文件存储目录
BACKUP_DIR = Path('backups')
BACKUP_DIR.mkdir(exist_ok=True)
def backup_target_database() -> str:
"""
备份目标数据库
Returns:
备份文件路径
"""
print("=" * 60)
print("步骤 1: 备份新数据库")
print("=" * 60)
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
backup_file = BACKUP_DIR / f"backup_target_db_{timestamp}.sql"
# 构建mysqldump命令
cmd = [
'mysqldump',
f"--host={TARGET_DB_CONFIG['host']}",
f"--port={TARGET_DB_CONFIG['port']}",
f"--user={TARGET_DB_CONFIG['user']}",
f"--password={TARGET_DB_CONFIG['password']}",
'--single-transaction',
'--routines',
'--triggers',
'--events',
'--add-drop-table',
'--default-character-set=utf8mb4',
TARGET_DB_CONFIG['database']
]
try:
print(f"开始备份数据库 {TARGET_DB_CONFIG['database']}...")
print(f"备份文件: {backup_file}")
# 执行备份命令
with open(backup_file, 'w', encoding='utf-8') as f:
result = subprocess.run(
cmd,
stdout=f,
stderr=subprocess.PIPE,
text=True
)
if result.returncode != 0:
error_msg = result.stderr if result.stderr else '未知错误'
raise Exception(f"mysqldump执行失败: {error_msg}")
# 检查文件大小
file_size = backup_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
print(f"备份文件路径: {backup_file}")
return str(backup_file)
except FileNotFoundError:
print("警告: 未找到mysqldump命令,尝试使用Python方式备份...")
return backup_target_database_with_python(backup_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
def backup_target_database_with_python(backup_file: Path) -> str:
"""
使用Python方式备份目标数据库(备用方式)
Args:
backup_file: 备份文件路径
Returns:
备份文件路径
"""
try:
print(f"开始使用Python方式备份数据库 {TARGET_DB_CONFIG['database']}...")
# 连接数据库
connection = pymysql.connect(**TARGET_DB_CONFIG)
cursor = connection.cursor()
with open(backup_file, 'w', encoding='utf-8') as f:
# 写入文件头
f.write(f"-- MySQL数据库备份\n")
f.write(f"-- 数据库: {TARGET_DB_CONFIG['database']}\n")
f.write(f"-- 备份时间: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
f.write(f"-- 主机: {TARGET_DB_CONFIG['host']}:{TARGET_DB_CONFIG['port']}\n")
f.write("--\n\n")
f.write(f"SET NAMES utf8mb4;\n")
f.write(f"SET FOREIGN_KEY_CHECKS=0;\n\n")
# 获取所有表
cursor.execute("SHOW TABLES")
tables = [table[0] for table in cursor.fetchall()]
print(f"找到 {len(tables)} 个表")
# 备份每个表
for table in tables:
print(f"备份表: {table}")
# 获取表结构
cursor.execute(f"SHOW CREATE TABLE `{table}`")
create_table_sql = cursor.fetchone()[1]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表结构: {table}\n")
f.write(f"-- ----------------------------\n")
f.write(f"DROP TABLE IF EXISTS `{table}`;\n")
f.write(f"{create_table_sql};\n\n")
# 获取表数据
cursor.execute(f"SELECT * FROM `{table}`")
rows = cursor.fetchall()
if rows:
# 获取列名
cursor.execute(f"DESCRIBE `{table}`")
columns = [col[0] for col in cursor.fetchall()]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表数据: {table}\n")
f.write(f"-- ----------------------------\n")
# 分批写入数据
batch_size = 1000
for i in range(0, len(rows), batch_size):
batch = rows[i:i+batch_size]
values_list = []
for row in batch:
values = []
for value in row:
if value is None:
values.append('NULL')
elif isinstance(value, (int, float)):
values.append(str(value))
else:
# 转义特殊字符
escaped_value = str(value).replace('\\', '\\\\').replace("'", "\\'")
values.append(f"'{escaped_value}'")
values_list.append(f"({', '.join(values)})")
columns_str = ', '.join([f"`{col}`" for col in columns])
values_str = ',\n'.join(values_list)
f.write(f"INSERT INTO `{table}` ({columns_str}) VALUES\n")
f.write(f"{values_str};\n\n")
print(f" 完成: {len(rows)} 条记录")
f.write("SET FOREIGN_KEY_CHECKS=1;\n")
cursor.close()
connection.close()
# 检查文件大小
file_size = backup_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
return str(backup_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
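The Python fallback above escapes values with plain string replacement, which silently mangles datetime and binary columns (`str(value)` emits datetimes unquoted-unfriendly and corrupts bytes). A self-contained sketch of a more type-aware literal renderer — `sql_literal` is a hypothetical helper, not part of the script:

```python
from datetime import datetime, date

def sql_literal(value):
    # Render one Python value as a SQL literal for a backup INSERT.
    # NULL, numbers, dates, bytes, and strings each need different handling.
    if value is None:
        return 'NULL'
    if isinstance(value, bool):          # bool before int: True is an int subclass
        return '1' if value else '0'
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, datetime):
        return f"'{value.isoformat(sep=' ')}'"
    if isinstance(value, date):
        return f"'{value.isoformat()}'"
    if isinstance(value, (bytes, bytearray)):
        return '0x' + value.hex() if value else "''"
    escaped = str(value).replace('\\', '\\\\').replace("'", "\\'")
    return f"'{escaped}'"
```

In production, letting the driver do the quoting (e.g. parameterized statements, or `mysqldump` itself) is still the safer route; this only patches the fallback path.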
def get_table_data(conn, table_name: str) -> List[Dict[str, Any]]:
"""
从源数据库获取表的所有数据
Args:
conn: 数据库连接
table_name: 表名
Returns:
数据列表
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute(f"SELECT * FROM `{table_name}`")
return cursor.fetchall()
finally:
cursor.close()
def get_table_columns(conn, table_name: str) -> List[str]:
"""
获取表的列名
Args:
conn: 数据库连接
table_name: 表名
Returns:
列名列表
"""
cursor = conn.cursor()
try:
cursor.execute(f"DESCRIBE `{table_name}`")
return [col[0] for col in cursor.fetchall()]
finally:
cursor.close()
def clear_table(conn, table_name: str):
"""
清空目标数据库中的表
Args:
conn: 数据库连接
table_name: 表名
"""
cursor = conn.cursor()
try:
# 禁用外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=0")
# 清空表
cursor.execute(f"TRUNCATE TABLE `{table_name}`")
# 恢复外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=1")
conn.commit()
print(f" 已清空表: {table_name}")
except Exception as e:
conn.rollback()
raise Exception(f"清空表 {table_name} 失败: {str(e)}")
finally:
cursor.close()
def insert_table_data(conn, table_name: str, columns: List[str], data: List[Dict[str, Any]]):
"""
将数据插入到目标数据库
Args:
conn: 数据库连接
table_name: 表名
columns: 列名列表
data: 数据列表
"""
if not data:
print(f"{table_name} 没有数据需要插入")
return
cursor = conn.cursor()
try:
# 禁用外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=0")
# 构建INSERT语句
columns_str = ', '.join([f"`{col}`" for col in columns])
placeholders = ', '.join(['%s'] * len(columns))
insert_sql = f"INSERT INTO `{table_name}` ({columns_str}) VALUES ({placeholders})"
# 批量插入数据
batch_size = 1000
total_inserted = 0
for i in range(0, len(data), batch_size):
batch = data[i:i+batch_size]
values_list = []
for row in batch:
values = [row.get(col) for col in columns]
values_list.append(values)
cursor.executemany(insert_sql, values_list)
total_inserted += len(batch)
# 恢复外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=1")
conn.commit()
print(f" 已插入 {total_inserted} 条记录到表: {table_name}")
except Exception as e:
conn.rollback()
raise Exception(f"插入数据到表 {table_name} 失败: {str(e)}")
finally:
cursor.close()
def sync_table(source_conn, target_conn, table_name: str):
"""
同步单个表的数据
Args:
source_conn: 源数据库连接
target_conn: 目标数据库连接
table_name: 表名
"""
print(f"\n同步表: {table_name}")
print("-" * 60)
try:
# 获取表的列名
columns = get_table_columns(source_conn, table_name)
print(f" 表列: {', '.join(columns)}")
# 从源数据库获取数据
print(f" 从源数据库读取数据...")
source_data = get_table_data(source_conn, table_name)
print(f" 读取到 {len(source_data)} 条记录")
# 清空目标表
print(f" 清空目标表...")
clear_table(target_conn, table_name)
# 插入数据到目标表
if source_data:
print(f" 插入数据到目标表...")
insert_table_data(target_conn, table_name, columns, source_data)
else:
print(f"{table_name} 没有数据需要同步")
print(f"[OK] 表 {table_name} 同步完成")
except Exception as e:
print(f"[ERROR] 表 {table_name} 同步失败: {str(e)}")
raise
def main():
"""主函数"""
print("=" * 60)
print("数据库表同步工具")
print("=" * 60)
print(f"源数据库: {SOURCE_DB_CONFIG['host']}:{SOURCE_DB_CONFIG['port']}/{SOURCE_DB_CONFIG['database']}")
print(f"目标数据库: {TARGET_DB_CONFIG['host']}:{TARGET_DB_CONFIG['port']}/{TARGET_DB_CONFIG['database']}")
print(f"同步表: {', '.join(TABLES_TO_SYNC)}")
print("=" * 60)
# 步骤1: 备份目标数据库
try:
backup_file = backup_target_database()
print(f"\n[OK] 备份完成: {backup_file}\n")
except Exception as e:
print(f"\n[ERROR] 备份失败: {str(e)}")
response = input("是否继续同步?(y/n): ")
if response.lower() != 'y':
print("已取消同步")
sys.exit(1)
# 步骤2: 连接数据库
print("=" * 60)
print("步骤 2: 连接数据库")
print("=" * 60)
source_conn = None
target_conn = None
try:
print("连接源数据库...")
source_conn = pymysql.connect(**SOURCE_DB_CONFIG)
print("[OK] 源数据库连接成功")
print("连接目标数据库...")
try:
target_conn = pymysql.connect(**TARGET_DB_CONFIG, connect_timeout=10)
print("[OK] 目标数据库连接成功\n")
except pymysql.err.OperationalError as e:
if "timed out" in str(e) or "2003" in str(e):
print(f"[ERROR] 无法连接到目标数据库 {TARGET_DB_CONFIG['host']}:{TARGET_DB_CONFIG['port']}")
print("请检查:")
print(" 1. 网络连接是否正常")
print(" 2. 是否需要VPN连接")
print(" 3. 数据库服务器是否可访问")
print(" 4. 防火墙设置是否正确")
raise
# 步骤3: 同步表数据
print("=" * 60)
print("步骤 3: 同步表数据")
print("=" * 60)
for table_name in TABLES_TO_SYNC:
try:
sync_table(source_conn, target_conn, table_name)
except Exception as e:
print(f"\n[ERROR] 同步表 {table_name} 时发生错误: {str(e)}")
print("已停止同步")
sys.exit(1)
print("\n" + "=" * 60)
print("[OK] 所有表同步完成!")
print("=" * 60)
except pymysql.Error as e:
print(f"\n[ERROR] 数据库连接失败: {str(e)}")
if "timed out" in str(e) or "2003" in str(e):
print("\n提示:如果无法连接到目标数据库,请检查网络连接和VPN设置")
sys.exit(1)
except Exception as e:
print(f"\n[ERROR] 发生错误: {str(e)}")
sys.exit(1)
finally:
if source_conn:
source_conn.close()
if target_conn:
target_conn.close()
if __name__ == '__main__':
main()


@@ -0,0 +1,552 @@
"""
Sync each template's input_data, template_code, and field associations
according to the Excel data-design document.
"""
import os
import json
import pymysql
import pandas as pd
from pathlib import Path
from typing import Dict, List, Optional, Set
from datetime import datetime
from collections import defaultdict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# Excel文件路径
EXCEL_FILE = '技术文档/智慧监督项目模板数据结构设计表-20251125-一凡标注.xlsx'
# 模板名称映射(Excel中的名称 -> 数据库中的名称)
TEMPLATE_NAME_MAPPING = {
'请示报告卡': '1.请示报告卡XXX',
'初步核实审批表': '2.初步核实审批表XXX',
'初核方案': '3.附件初核方案(XXX)',
'谈话通知书': '谈话通知书',
'谈话通知书第一联': '谈话通知书第一联',
'谈话通知书第二联': '谈话通知书第二联',
'谈话通知书第三联': '谈话通知书第三联',
'走读式谈话审批': '走读式谈话审批',
'走读式谈话流程': '走读式谈话流程',
'请示报告卡(初核报告结论)': '8-1请示报告卡初核报告结论 ',
'XXX初核情况报告': '8.XXX初核情况报告',
}
# 模板编码映射(Excel中的名称 -> template_code)
TEMPLATE_CODE_MAPPING = {
'请示报告卡': 'REPORT_CARD',
'初步核实审批表': 'PRELIMINARY_VERIFICATION_APPROVAL',
'初核方案': 'INVESTIGATION_PLAN',
'谈话通知书第一联': 'NOTIFICATION_LETTER_1',
'谈话通知书第二联': 'NOTIFICATION_LETTER_2',
'谈话通知书第三联': 'NOTIFICATION_LETTER_3',
'请示报告卡(初核报告结论)': 'REPORT_CARD_CONCLUSION',
'XXX初核情况报告': 'INVESTIGATION_REPORT',
}
# 字段名称到字段编码的映射
FIELD_NAME_TO_CODE_MAP = {
# 输入字段
'线索信息': 'clue_info',
'被核查人员工作基本情况线索': 'target_basic_info_clue',
# 输出字段 - 基本信息
'被核查人姓名': 'target_name',
'被核查人员单位及职务': 'target_organization_and_position',
'被核查人员性别': 'target_gender',
'被核查人员出生年月': 'target_date_of_birth',
'被核查人员出生年月日': 'target_date_of_birth_full',
'被核查人员政治面貌': 'target_political_status',
'被核查人员职级': 'target_professional_rank',
'被核查人员单位': 'target_organization',
'被核查人员职务': 'target_position',
# 输出字段 - 其他信息
'线索来源': 'clue_source',
'主要问题线索': 'target_issue_description',
'初步核实审批表承办部门意见': 'department_opinion',
'初步核实审批表填表人': 'filler_name',
'请示报告卡请示时间': 'report_card_request_time',
'被核查人员身份证件及号码': 'target_id_number',
'被核查人员身份证号': 'target_id_number',
'应到时间': 'appointment_time',
'应到地点': 'appointment_location',
'批准时间': 'approval_time',
'承办部门': 'handling_department',
'承办人': 'handler_name',
'谈话通知时间': 'notification_time',
'谈话通知地点': 'notification_location',
'被核查人员住址': 'target_address',
'被核查人员户籍住址': 'target_registered_address',
'被核查人员联系方式': 'target_contact',
'被核查人员籍贯': 'target_place_of_origin',
'被核查人员民族': 'target_ethnicity',
'被核查人员工作基本情况': 'target_work_basic_info',
'核查单位名称': 'investigation_unit_name',
'核查组组长姓名': 'investigation_team_leader_name',
'核查组成员姓名': 'investigation_team_member_names',
'核查地点': 'investigation_location',
}
def generate_id():
"""生成ID"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
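`generate_id` above adds a 6-digit random part into a slot only 1000 wide (`timestamp * 1000`), so the random part spills into the timestamp bits and IDs generated within roughly a second of each other can collide. A hedged, process-local sketch of a collision-free variant (illustrative only; across processes a shared allocator or the database's AUTO_INCREMENT would still be needed):

```python
import itertools
import threading
import time

# A 3-digit sequence fits the *1000 slot, so two calls in the same
# millisecond cannot collide within one process.
_seq = itertools.count()
_seq_lock = threading.Lock()

def generate_monotonic_id() -> int:
    with _seq_lock:
        n = next(_seq) % 1000
    return int(time.time() * 1000) * 1000 + n
```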
def normalize_template_name(name: str) -> str:
"""标准化模板名称,用于匹配"""
import re
# 去掉开头的编号和括号内容
name = re.sub(r'^\d+[\.\-]\s*', '', name)
name = re.sub(r'[(].*?[)]', '', name)
name = name.strip()
return name
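A standalone illustration of the normalization above, reproducing its two substitutions (assuming the paren character class is fullwidth, matching names like `3.附件初核方案(XXX)`). Note how `8-1…` loses only the `8-` prefix, leaving the `1`:

```python
import re

def normalize_template_name_demo(name: str) -> str:
    # Same two substitutions as normalize_template_name above:
    # strip a leading "N." / "N-" prefix, then any fullwidth-paren span.
    name = re.sub(r'^\d+[\.\-]\s*', '', name)
    name = re.sub(r'[(].*?[)]', '', name)
    return name.strip()

print(normalize_template_name_demo('2.初步核实审批表XXX'))    # 初步核实审批表XXX
print(normalize_template_name_demo('3.附件初核方案(XXX)'))   # 附件初核方案
```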
def parse_excel_data() -> Dict:
"""解析Excel文件提取模板和字段的关联关系"""
print("="*80)
print("解析Excel数据设计文档")
print("="*80)
if not Path(EXCEL_FILE).exists():
print(f"✗ Excel文件不存在: {EXCEL_FILE}")
return None
try:
df = pd.read_excel(EXCEL_FILE)
print(f"✓ 成功读取Excel文件,共 {len(df)} 行数据\n")
templates = defaultdict(lambda: {
'template_name': '',
'template_code': '',
'input_fields': [],
'output_fields': []
})
current_template = None
current_input_field = None
for idx, row in df.iterrows():
level1 = row.get('一级分类')
level2 = row.get('二级分类')
level3 = row.get('三级分类')
input_field = row.get('输入数据字段')
output_field = row.get('输出数据字段')
# 处理二级分类(模板名称)
if pd.notna(level2) and level2:
current_template = str(level2).strip()
# 获取模板编码
template_code = TEMPLATE_CODE_MAPPING.get(current_template, '')
if not template_code:
# 如果没有映射,尝试生成
template_code = current_template.upper().replace(' ', '_')
templates[current_template]['template_name'] = current_template
templates[current_template]['template_code'] = template_code
current_input_field = None # 重置输入字段
print(f" 模板: {current_template} (code: {template_code})")
# 处理三级分类(子模板,如谈话通知书第一联)
if pd.notna(level3) and level3:
current_template = str(level3).strip()
template_code = TEMPLATE_CODE_MAPPING.get(current_template, '')
if not template_code:
template_code = current_template.upper().replace(' ', '_')
templates[current_template]['template_name'] = current_template
templates[current_template]['template_code'] = template_code
current_input_field = None
print(f" 子模板: {current_template} (code: {template_code})")
# 处理输入字段
if pd.notna(input_field) and input_field:
input_field_name = str(input_field).strip()
if input_field_name != current_input_field:
current_input_field = input_field_name
field_code = FIELD_NAME_TO_CODE_MAP.get(input_field_name, input_field_name.lower().replace(' ', '_'))
if current_template:
templates[current_template]['input_fields'].append({
'name': input_field_name,
'field_code': field_code
})
# 处理输出字段
if pd.notna(output_field) and output_field:
output_field_name = str(output_field).strip()
field_code = FIELD_NAME_TO_CODE_MAP.get(output_field_name, output_field_name.lower().replace(' ', '_'))
if current_template:
templates[current_template]['output_fields'].append({
'name': output_field_name,
'field_code': field_code
})
# 去重
for template_name, template_info in templates.items():
# 输入字段去重
seen_input = set()
unique_input = []
for field in template_info['input_fields']:
key = field['field_code']
if key not in seen_input:
seen_input.add(key)
unique_input.append(field)
template_info['input_fields'] = unique_input
# 输出字段去重
seen_output = set()
unique_output = []
for field in template_info['output_fields']:
key = field['field_code']
if key not in seen_output:
seen_output.add(key)
unique_output.append(field)
template_info['output_fields'] = unique_output
print(f"\n✓ 解析完成,共 {len(templates)} 个模板")
for template_name, template_info in templates.items():
print(f" - {template_name}: {len(template_info['input_fields'])} 个输入字段, {len(template_info['output_fields'])} 个输出字段")
return dict(templates)
except Exception as e:
print(f"✗ 解析Excel文件失败: {e}")
import traceback
traceback.print_exc()
return None
def get_database_templates(conn) -> Dict:
"""获取数据库中的模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, template_code, input_data, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
result = {}
for config in configs:
name = config['name']
result[name] = config
# 也添加标准化名称的映射
normalized = normalize_template_name(name)
if normalized not in result:
result[normalized] = config
cursor.close()
return result
def get_database_fields(conn) -> Dict:
"""获取数据库中的字段定义"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {
'by_code': {},
'by_name': {}
}
for field in fields:
field_code = field['filed_code']
field_name = field['name']
result['by_code'][field_code] = field
result['by_name'][field_name] = field
cursor.close()
return result
def find_matching_template(excel_template_name: str, db_templates: Dict) -> Optional[Dict]:
"""查找匹配的数据库模板"""
# 1. 精确匹配
if excel_template_name in db_templates:
return db_templates[excel_template_name]
# 2. 通过映射表匹配
mapped_name = TEMPLATE_NAME_MAPPING.get(excel_template_name)
if mapped_name and mapped_name in db_templates:
return db_templates[mapped_name]
# 3. 标准化名称匹配
normalized = normalize_template_name(excel_template_name)
if normalized in db_templates:
return db_templates[normalized]
# 4. 模糊匹配
for db_name, db_config in db_templates.items():
if normalized in normalize_template_name(db_name) or normalize_template_name(db_name) in normalized:
return db_config
return None
def update_template_config(conn, template_id: int, template_code: str, input_fields: List[Dict], dry_run: bool = True):
"""更新模板配置的input_data和template_code"""
cursor = conn.cursor()
try:
# 构建input_data
input_data = {
'template_code': template_code,
'business_type': 'INVESTIGATION',
'input_fields': [f['field_code'] for f in input_fields]
}
input_data_json = json.dumps(input_data, ensure_ascii=False)
if not dry_run:
update_sql = """
UPDATE f_polic_file_config
SET template_code = %s, input_data = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (template_code, input_data_json, UPDATED_BY, template_id, TENANT_ID))
conn.commit()
print(f" ✓ 更新模板配置")
else:
print(f" [模拟] 将更新模板配置: template_code={template_code}")
finally:
cursor.close()
def update_template_field_relations(conn, template_id: int, input_fields: List[Dict], output_fields: List[Dict],
db_fields: Dict, dry_run: bool = True):
"""更新模板和字段的关联关系"""
cursor = conn.cursor()
try:
# 先删除旧的关联关系
if not dry_run:
delete_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_sql, (TENANT_ID, template_id))
# 创建新的关联关系
relations_created = 0
# 关联输入字段field_type=1
for field_info in input_fields:
field_code = field_info['field_code']
field = db_fields['by_code'].get(field_code)
if not field:
print(f" ⚠ 输入字段不存在: {field_code}")
continue
if field['field_type'] != 1:
print(f" ⚠ 字段类型不匹配: {field_code} (期望输入字段,实际为输出字段)")
continue
if not dry_run:
# 检查是否已存在
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, template_id, field['id']))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, template_id, field['id'],
CREATED_BY, UPDATED_BY, 1
))
relations_created += 1
else:
relations_created += 1
# 关联输出字段field_type=2
for field_info in output_fields:
field_code = field_info['field_code']
field = db_fields['by_code'].get(field_code)
if not field:
# 尝试通过名称匹配
field_name = field_info['name']
field = db_fields['by_name'].get(field_name)
if not field:
print(f" ⚠ 输出字段不存在: {field_code} ({field_info['name']})")
continue
if field['field_type'] != 2:
print(f" ⚠ 字段类型不匹配: {field_code} (期望输出字段,实际为输入字段)")
continue
if not dry_run:
# 检查是否已存在
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, template_id, field['id']))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, template_id, field['id'],
CREATED_BY, UPDATED_BY, 1
))
relations_created += 1
else:
relations_created += 1
if not dry_run:
conn.commit()
print(f" ✓ 创建 {relations_created} 个字段关联关系")
else:
print(f" [模拟] 将创建 {relations_created} 个字段关联关系")
finally:
cursor.close()
def main():
"""主函数"""
print("="*80)
print("同步模板字段信息(根据Excel数据设计文档)")
print("="*80)
# 解析Excel
excel_data = parse_excel_data()
if not excel_data:
return
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("\n✓ 数据库连接成功")
except Exception as e:
print(f"\n✗ 数据库连接失败: {e}")
return
try:
# 获取数据库中的模板和字段
print("\n获取数据库中的模板和字段...")
db_templates = get_database_templates(conn)
db_fields = get_database_fields(conn)
print(f" 数据库中有 {len(db_templates)} 个模板")
print(f" 数据库中有 {len(db_fields['by_code'])} 个字段")
# 匹配和更新
print("\n" + "="*80)
print("匹配模板并更新配置")
print("="*80)
matched_count = 0
unmatched_templates = []
for excel_template_name, template_info in excel_data.items():
print(f"\n处理模板: {excel_template_name}")
# 查找匹配的数据库模板
db_template = find_matching_template(excel_template_name, db_templates)
if not db_template:
print(f" ✗ 未找到匹配的数据库模板")
unmatched_templates.append(excel_template_name)
continue
print(f" ✓ 匹配到数据库模板: {db_template['name']} (ID: {db_template['id']})")
matched_count += 1
# 更新模板配置
template_code = template_info['template_code']
input_fields = template_info['input_fields']
output_fields = template_info['output_fields']
print(f" 模板编码: {template_code}")
print(f" 输入字段: {len(input_fields)}")
print(f" 输出字段: {len(output_fields)}")
# 先执行模拟更新
print(" [模拟模式]")
update_template_config(conn, db_template['id'], template_code, input_fields, dry_run=True)
update_template_field_relations(conn, db_template['id'], input_fields, output_fields, db_fields, dry_run=True)
# 显示统计
print("\n" + "="*80)
print("统计信息")
print("="*80)
print(f"Excel中的模板数: {len(excel_data)}")
print(f"成功匹配: {matched_count}")
print(f"未匹配: {len(unmatched_templates)}")
if unmatched_templates:
print("\n未匹配的模板:")
for template in unmatched_templates:
print(f" - {template}")
# 询问是否执行实际更新
print("\n" + "="*80)
response = input("\n是否执行实际更新?(yes/no,默认no): ").strip().lower()
if response == 'yes':
print("\n执行实际更新...")
for excel_template_name, template_info in excel_data.items():
db_template = find_matching_template(excel_template_name, db_templates)
if db_template:
print(f"\n更新: {db_template['name']}")
update_template_config(conn, db_template['id'], template_info['template_code'],
template_info['input_fields'], dry_run=False)
update_template_field_relations(conn, db_template['id'],
template_info['input_fields'],
template_info['output_fields'],
db_fields, dry_run=False)
print("\n" + "="*80)
print("✓ 同步完成!")
print("="*80)
else:
print("\n已取消更新")
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -0,0 +1,779 @@
"""
Sync templates, fields, and field associations across databases.
Features:
1. Read the source database config from the .env file
2. Sync to the target database (10.100.31.21)
3. Map IDs between databases (the two databases use different IDs)
4. Match rows by business keys (name, filed_code, file_path)
Usage:
python sync_templates_between_databases.py --target-host 10.100.31.21 --target-port 3306 --target-user finyx --target-password FknJYz3FA5WDYtsd --target-database finyx --target-tenant-id 1
"""
import os
import sys
import pymysql
import argparse
from pathlib import Path
from typing import Dict, List, Set, Optional, Tuple
from dotenv import load_dotenv
# Force stdout/stderr to UTF-8 (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
# 加载环境变量
load_dotenv()
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def generate_id():
    """生成ID(微秒级时间戳;快速连续调用或并发时可能重复)"""
    import time
    return int(time.time() * 1000000)
def get_source_db_config() -> Dict:
"""从.env文件读取源数据库配置"""
db_host = os.getenv('DB_HOST')
db_port = os.getenv('DB_PORT')
db_user = os.getenv('DB_USER')
db_password = os.getenv('DB_PASSWORD')
db_name = os.getenv('DB_NAME')
if not all([db_host, db_port, db_user, db_password, db_name]):
raise ValueError(
"源数据库配置不完整,请在.env文件中配置以下环境变量\n"
"DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME"
)
return {
'host': db_host,
'port': int(db_port),
'user': db_user,
'password': db_password,
'database': db_name,
'charset': 'utf8mb4'
}
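For reference, a minimal `.env` matching the variables read above (placeholder values, not real credentials):

```ini
DB_HOST=127.0.0.1
DB_PORT=3306
DB_USER=finyx
DB_PASSWORD=change-me
DB_NAME=finyx
```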
def get_target_db_config_from_args() -> Dict:
"""从命令行参数获取目标数据库配置"""
parser = argparse.ArgumentParser(
description='跨数据库同步模板、字段和关联关系',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
示例
python sync_templates_between_databases.py --target-host 10.100.31.21 --target-port 3306 --target-user finyx --target-password FknJYz3FA5WDYtsd --target-database finyx --target-tenant-id 1
"""
)
parser.add_argument('--target-host', type=str, required=True, help='目标MySQL服务器地址')
parser.add_argument('--target-port', type=int, required=True, help='目标MySQL服务器端口')
parser.add_argument('--target-user', type=str, required=True, help='目标MySQL用户名')
parser.add_argument('--target-password', type=str, required=True, help='目标MySQL密码')
parser.add_argument('--target-database', type=str, required=True, help='目标数据库名称')
parser.add_argument('--target-tenant-id', type=int, required=True, help='目标租户ID')
parser.add_argument('--source-tenant-id', type=int, help='源租户ID(如果不指定,将使用数据库中的第一个tenant_id)')
parser.add_argument('--dry-run', action='store_true', help='预览模式(不实际更新数据库)')
args = parser.parse_args()
return {
'host': args.target_host,
'port': args.target_port,
'user': args.target_user,
'password': args.target_password,
'database': args.target_database,
'charset': 'utf8mb4',
'tenant_id': args.target_tenant_id,
'source_tenant_id': args.source_tenant_id,
'dry_run': args.dry_run
}
def test_db_connection(config: Dict, label: str) -> Optional[pymysql.Connection]:
"""测试数据库连接"""
try:
conn = pymysql.connect(
host=config['host'],
port=config['port'],
user=config['user'],
password=config['password'],
database=config['database'],
charset=config['charset']
)
return conn
except Exception as e:
print_result(False, f"{label}数据库连接失败: {str(e)}")
return None
def get_source_tenant_id(conn) -> int:
"""获取源数据库中的tenant_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("SELECT DISTINCT tenant_id FROM f_polic_file_config LIMIT 1")
result = cursor.fetchone()
if result:
return result['tenant_id']
return 1
finally:
cursor.close()
def read_source_fields(conn, tenant_id: int) -> Tuple[Dict[str, Dict], Dict[str, Dict]]:
"""
从源数据库读取字段数据
Returns:
(input_fields_dict, output_fields_dict)
key: filed_code, value: 字段信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, tenant_id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
AND state = 1
ORDER BY field_type, filed_code
"""
cursor.execute(sql, (tenant_id,))
fields = cursor.fetchall()
input_fields = {}
output_fields = {}
for field in fields:
field_info = {
'id': field['id'],
'tenant_id': field['tenant_id'],
'name': field['name'],
'filed_code': field['filed_code'],
'field_type': field['field_type'],
'state': field['state']
}
if field['field_type'] == 1:
input_fields[field['filed_code']] = field_info
elif field['field_type'] == 2:
output_fields[field['filed_code']] = field_info
return input_fields, output_fields
finally:
cursor.close()
def read_source_templates(conn, tenant_id: int) -> Dict[str, Dict]:
"""
从源数据库读取模板数据
Returns:
key: file_path (如果为空则使用name), value: 模板信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, tenant_id, parent_id, name, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
ORDER BY file_path, name
"""
cursor.execute(sql, (tenant_id,))
templates = cursor.fetchall()
result = {}
for template in templates:
# 使用file_path作为key如果没有file_path则使用name
key = template['file_path'] if template['file_path'] else f"DIR:{template['name']}"
result[key] = {
'id': template['id'],
'tenant_id': template['tenant_id'],
'parent_id': template['parent_id'],
'name': template['name'],
'file_path': template['file_path'],
'state': template['state']
}
return result
finally:
cursor.close()
def read_source_relations(conn, tenant_id: int) -> Dict[int, List[int]]:
"""
从源数据库读取字段关联关系
Returns:
key: file_id, value: [filed_id列表]
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT file_id, filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
relations = cursor.fetchall()
result = {}
for rel in relations:
file_id = rel['file_id']
filed_id = rel['filed_id']
if file_id not in result:
result[file_id] = []
result[file_id].append(filed_id)
return result
finally:
cursor.close()
def sync_fields_to_target(conn, tenant_id: int, source_input_fields: Dict, source_output_fields: Dict,
dry_run: bool = False) -> Tuple[Dict[int, int], Dict[int, int]]:
"""
同步字段到目标数据库
Returns:
(input_field_id_map, output_field_id_map)
key: 源字段ID, value: 目标字段ID
"""
print_section("同步字段到目标数据库")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. 获取目标数据库中的现有字段
cursor.execute("""
SELECT id, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s
AND state = 1
""", (tenant_id,))
existing_fields = cursor.fetchall()
existing_by_code = {}
for field in existing_fields:
key = (field['filed_code'], field['field_type'])
existing_by_code[key] = field['id']
print(f" 目标数据库现有字段: {len(existing_fields)}")
# 2. 同步输入字段
print("\n 同步输入字段...")
input_field_id_map = {}
input_created = 0
input_matched = 0
for code, source_field in source_input_fields.items():
key = (code, 1)
if key in existing_by_code:
# 字段已存在使用现有ID
target_id = existing_by_code[key]
input_field_id_map[source_field['id']] = target_id
input_matched += 1
else:
# 创建新字段
target_id = generate_id()
input_field_id_map[source_field['id']] = target_id
if not dry_run:
insert_cursor = conn.cursor()
try:
insert_cursor.execute("""
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (
target_id,
tenant_id,
source_field['name'],
source_field['filed_code'],
1,
CREATED_BY,
UPDATED_BY
))
conn.commit()
input_created += 1
finally:
insert_cursor.close()
else:
input_created += 1
print(f" 匹配: {input_matched} 个,创建: {input_created}")
# 3. 同步输出字段
print("\n 同步输出字段...")
output_field_id_map = {}
output_created = 0
output_matched = 0
for code, source_field in source_output_fields.items():
key = (code, 2)
if key in existing_by_code:
# 字段已存在使用现有ID
target_id = existing_by_code[key]
output_field_id_map[source_field['id']] = target_id
output_matched += 1
else:
# 创建新字段
target_id = generate_id()
output_field_id_map[source_field['id']] = target_id
if not dry_run:
insert_cursor = conn.cursor()
try:
insert_cursor.execute("""
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (
target_id,
tenant_id,
source_field['name'],
source_field['filed_code'],
2,
CREATED_BY,
UPDATED_BY
))
conn.commit()
output_created += 1
finally:
insert_cursor.close()
else:
output_created += 1
print(f" 匹配: {output_matched} 个,创建: {output_created}")
return input_field_id_map, output_field_id_map
finally:
cursor.close()


def sync_templates_to_target(conn, tenant_id: int, source_templates: Dict,
                             dry_run: bool = False) -> Dict[int, int]:
    """
    Sync templates to the target database.

    Returns:
        template_id_map: key = source template ID, value = target template ID
    """
    print_section("Syncing templates to the target database")
    cursor = conn.cursor(pymysql.cursors.DictCursor)
    try:
        # 1. Fetch the templates that already exist in the target database
        cursor.execute("""
            SELECT id, name, file_path, parent_id
            FROM f_polic_file_config
            WHERE tenant_id = %s
              AND state = 1
        """, (tenant_id,))
        existing_templates = cursor.fetchall()
        existing_by_path = {}
        existing_by_name = {}
        for template in existing_templates:
            if template['file_path']:
                existing_by_path[template['file_path']] = template
            else:
                # Directory node
                name = template['name']
                if name not in existing_by_name:
                    existing_by_name[name] = []
                existing_by_name[name].append(template)
        print(f"  Existing templates in target database: {len(existing_templates)}")
        # 2. Process directory nodes first (in hierarchy order)
        print("\n  Syncing directory nodes...")
        template_id_map = {}
        dir_created = 0
        dir_matched = 0
        # Separate directories from files
        dir_templates = {}
        file_templates = {}
        for key, source_template in source_templates.items():
            if source_template['file_path']:
                file_templates[key] = source_template
            else:
                dir_templates[key] = source_template
        # Build the directory hierarchy (parents must be processed before children):
        # group by depth, handling directories without a parent_id first.
        dirs_by_level = {}
        for key, source_template in dir_templates.items():
            level = 0
            current = source_template
            while current.get('parent_id'):
                level += 1
                # Look up the parent directory
                parent_found = False
                for t in dir_templates.values():
                    if t['id'] == current['parent_id']:
                        current = t
                        parent_found = True
                        break
                if not parent_found:
                    break
            if level not in dirs_by_level:
                dirs_by_level[level] = []
            dirs_by_level[level].append((key, source_template))
        # Process directories level by level
        for level in sorted(dirs_by_level.keys()):
            for key, source_template in dirs_by_level[level]:
                source_id = source_template['id']
                name = source_template['name']
                # Find a matching directory by name and parent_id
                matched = None
                target_parent_id = None
                if source_template['parent_id']:
                    target_parent_id = template_id_map.get(source_template['parent_id'])
                for existing in existing_by_name.get(name, []):
                    if not existing['file_path']:  # make sure it is a directory node
                        # Check whether the parent_id matches
                        if existing['parent_id'] == target_parent_id:
                            matched = existing
                            break
                if matched:
                    target_id = matched['id']
                    template_id_map[source_id] = target_id
                    dir_matched += 1
                else:
                    target_id = generate_id()
                    template_id_map[source_id] = target_id
                    if not dry_run:
                        insert_cursor = conn.cursor()
                        try:
                            insert_cursor.execute("""
                                INSERT INTO f_polic_file_config
                                (id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
                                VALUES (%s, %s, %s, %s, NULL, NOW(), %s, NOW(), %s, 1)
                            """, (
                                target_id,
                                tenant_id,
                                target_parent_id,
                                name,
                                CREATED_BY,
                                UPDATED_BY
                            ))
                            conn.commit()
                            dir_created += 1
                        finally:
                            insert_cursor.close()
                    else:
                        dir_created += 1
        print(f"  Matched: {dir_matched}, created: {dir_created}")
        # 3. Process file nodes
        print("\n  Syncing file nodes...")
        file_created = 0
        file_matched = 0
        file_updated = 0
        for key, source_template in file_templates.items():
            source_id = source_template['id']
            file_path = source_template['file_path']
            name = source_template['name']
            # Match by file_path
            matched = existing_by_path.get(file_path)
            if matched:
                target_id = matched['id']
                template_id_map[source_id] = target_id
                file_matched += 1
                # Check whether an update is needed
                target_parent_id = None
                if source_template['parent_id']:
                    target_parent_id = template_id_map.get(source_template['parent_id'])
                if matched['parent_id'] != target_parent_id or matched['name'] != name:
                    file_updated += 1
                    if not dry_run:
                        update_cursor = conn.cursor()
                        try:
                            update_cursor.execute("""
                                UPDATE f_polic_file_config
                                SET parent_id = %s, name = %s, updated_time = NOW(), updated_by = %s
                                WHERE id = %s AND tenant_id = %s
                            """, (target_parent_id, name, UPDATED_BY, target_id, tenant_id))
                            conn.commit()
                        finally:
                            update_cursor.close()
            else:
                target_id = generate_id()
                template_id_map[source_id] = target_id
                if not dry_run:
                    insert_cursor = conn.cursor()
                    try:
                        # Map parent_id to the target database
                        target_parent_id = None
                        if source_template['parent_id']:
                            target_parent_id = template_id_map.get(source_template['parent_id'])
                        insert_cursor.execute("""
                            INSERT INTO f_polic_file_config
                            (id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
                            VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
                        """, (
                            target_id,
                            tenant_id,
                            target_parent_id,
                            name,
                            file_path,
                            CREATED_BY,
                            UPDATED_BY
                        ))
                        conn.commit()
                        file_created += 1
                    finally:
                        insert_cursor.close()
                else:
                    file_created += 1
        print(f"  Matched: {file_matched}, created: {file_created}, updated: {file_updated}")
        return template_id_map
    finally:
        cursor.close()


def sync_relations_to_target(conn, tenant_id: int, source_relations: Dict[int, List[int]],
                             template_id_map: Dict[int, int],
                             input_field_id_map: Dict[int, int],
                             output_field_id_map: Dict[int, int],
                             dry_run: bool = False):
    """Sync field relations to the target database."""
    print_section("Syncing field relations to the target database")
    # 1. Remove existing relations
    print("1. Removing existing relations...")
    if not dry_run:
        cursor = conn.cursor()
        try:
            cursor.execute("""
                DELETE FROM f_polic_file_field
                WHERE tenant_id = %s
            """, (tenant_id,))
            deleted_count = cursor.rowcount
            conn.commit()
            print_result(True, f"Deleted {deleted_count} old relations")
        finally:
            cursor.close()
    else:
        print("  [dry run] All existing relations would be removed")
    # 2. Create the new relations
    print("\n2. Creating new relations...")
    all_field_id_map = {**input_field_id_map, **output_field_id_map}
    relations_created = 0
    relations_skipped = 0
    for source_file_id, source_field_ids in source_relations.items():
        # Resolve the target file_id
        target_file_id = template_id_map.get(source_file_id)
        if not target_file_id:
            relations_skipped += 1
            continue
        # Map the field IDs to the target database
        target_field_ids = []
        for source_field_id in source_field_ids:
            target_field_id = all_field_id_map.get(source_field_id)
            if target_field_id:
                target_field_ids.append(target_field_id)
        if not target_field_ids:
            continue
        # Create the relations
        if not dry_run:
            cursor = conn.cursor()
            try:
                for target_field_id in target_field_ids:
                    relation_id = generate_id()
                    cursor.execute("""
                        INSERT INTO f_polic_file_field
                        (id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
                        VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
                    """, (
                        relation_id,
                        tenant_id,
                        target_file_id,
                        target_field_id,
                        CREATED_BY,
                        UPDATED_BY
                    ))
                conn.commit()
                relations_created += len(target_field_ids)
            except Exception as e:
                conn.rollback()
                print(f"  [error] Failed to create relations: {str(e)}")
            finally:
                cursor.close()
        else:
            relations_created += len(target_field_ids)
    print_result(True, f"Created {relations_created} relations, skipped {relations_skipped} templates")
    return {
        'created': relations_created,
        'skipped': relations_skipped
    }


def main():
    """Entry point."""
    print_section("Cross-database sync of templates, fields, and relations")
    # 1. Read the source database config (from .env)
    print_section("Reading source database config")
    try:
        source_config = get_source_db_config()
        print_result(True, f"Source database: {source_config['host']}:{source_config['port']}/{source_config['database']}")
    except Exception as e:
        print_result(False, str(e))
        return
    # 2. Read the target database config (from command-line arguments)
    print_section("Reading target database config")
    target_config = get_target_db_config_from_args()
    print_result(True, f"Target database: {target_config['host']}:{target_config['port']}/{target_config['database']}")
    print(f"  Target tenant ID: {target_config['tenant_id']}")
    if target_config['dry_run']:
        print("\n[note] Dry-run mode: the database will not actually be updated")
    # 3. Connect to the databases
    print_section("Connecting to databases")
    source_conn = test_db_connection(source_config, "source")
    if not source_conn:
        return
    target_conn = test_db_connection(target_config, "target")
    if not target_conn:
        source_conn.close()
        return
    print_result(True, "Database connections established")
    try:
        # 4. Determine the source tenant ID
        source_tenant_id = target_config.get('source_tenant_id')
        if not source_tenant_id:
            source_tenant_id = get_source_tenant_id(source_conn)
        print(f"\nSource tenant ID: {source_tenant_id}")
        # 5. Read the source data
        print_section("Reading data from the source database")
        print("  Reading fields...")
        source_input_fields, source_output_fields = read_source_fields(source_conn, source_tenant_id)
        print_result(True, f"Input fields: {len(source_input_fields)}, output fields: {len(source_output_fields)}")
        print("\n  Reading templates...")
        source_templates = read_source_templates(source_conn, source_tenant_id)
        print_result(True, f"Total templates: {len(source_templates)}")
        print("\n  Reading relations...")
        source_relations = read_source_relations(source_conn, source_tenant_id)
        print_result(True, f"Relations: {len(source_relations)} templates have field relations")
        # 6. Sync to the target database
        target_tenant_id = target_config['tenant_id']
        dry_run = target_config['dry_run']
        # 6.1 Sync fields
        input_field_id_map, output_field_id_map = sync_fields_to_target(
            target_conn, target_tenant_id,
            source_input_fields, source_output_fields,
            dry_run
        )
        # 6.2 Sync templates
        template_id_map = sync_templates_to_target(
            target_conn, target_tenant_id,
            source_templates,
            dry_run
        )
        # 6.3 Sync relations
        relations_result = sync_relations_to_target(
            target_conn, target_tenant_id,
            source_relations,
            template_id_map,
            input_field_id_map,
            output_field_id_map,
            dry_run
        )
        # 7. Summary
        print_section("Sync finished")
        if dry_run:
            print("  This was a dry run; the database was not updated")
        else:
            print("  Database updated")
        print("\n  Sync statistics:")
        print(f"    - input fields: {len(input_field_id_map)}")
        print(f"    - output fields: {len(output_field_id_map)}")
        print(f"    - templates: {len(template_id_map)}")
        print(f"    - relations: {relations_result['created']}")
    finally:
        source_conn.close()
        target_conn.close()
        print_result(True, "Database connections closed")


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n\n[interrupted] Cancelled by user")
        sys.exit(0)
    except Exception as e:
        print(f"\n[error] Unexpected exception: {str(e)}")
        import traceback
        traceback.print_exc()
        sys.exit(1)
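The three sync phases above hand their ID maps forward, so the relation step is at heart a pure remapping of (file, field) pairs. A minimal sketch of that remapping, as a standalone function with made-up IDs (the real script additionally inserts rows into `f_polic_file_field`):

```python
from typing import Dict, List, Tuple


def remap_relations(source_relations: Dict[int, List[int]],
                    template_id_map: Dict[int, int],
                    field_id_map: Dict[int, int]) -> List[Tuple[int, int]]:
    """Translate source (file_id, field_id) pairs into target IDs,
    skipping anything that has no mapping (unsynced templates/fields)."""
    pairs = []
    for src_file_id, src_field_ids in source_relations.items():
        tgt_file_id = template_id_map.get(src_file_id)
        if not tgt_file_id:
            continue  # template was not synced; skip its relations
        for src_field_id in src_field_ids:
            tgt_field_id = field_id_map.get(src_field_id)
            if tgt_field_id:
                pairs.append((tgt_file_id, tgt_field_id))
    return pairs


# Illustrative IDs only: template 200 and field 3 have no target mapping
relations = {100: [1, 2, 3], 200: [2]}
templates = {100: 9100}
fields = {1: 901, 2: 902}
print(remap_relations(relations, templates, fields))  # [(9100, 901), (9100, 902)]
```

Keeping the remapping separate from the INSERT loop, as the script does, is what makes `--dry-run`-style previews cheap: the counts can be computed without touching the target database.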

View File

@@ -15,12 +15,50 @@ class TemplateAIHelper:
     """Template AI helper class for intelligent analysis of document content."""

     def __init__(self):
-        self.api_key = os.getenv('SILICONFLOW_API_KEY')
-        self.model = os.getenv('SILICONFLOW_MODEL', 'deepseek-ai/DeepSeek-V3.2-Exp')
-        self.api_url = "https://api.siliconflow.cn/v1/chat/completions"
-
-        if not self.api_key:
-            raise Exception("SILICONFLOW_API_KEY is not configured; set it in the .env file")
+        # ========== AI provider selection ==========
+        # Choose the AI service via the AI_PROVIDER environment variable.
+        # Valid values: 'huawei' or 'siliconflow'; defaults to 'siliconflow'.
+        ai_provider = os.getenv('AI_PROVIDER', 'siliconflow').lower()
+        # ========== Huawei LLM configuration ==========
+        huawei_key = os.getenv('HUAWEI_API_KEY', 'sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186')
+        huawei_endpoint = os.getenv('HUAWEI_API_ENDPOINT', 'http://10.100.31.26:3001/v1/chat/completions')
+        huawei_model = os.getenv('HUAWEI_MODEL', 'DeepSeek-R1-Distill-Llama-70B')
+        # ========== SiliconFlow configuration ==========
+        siliconflow_key = os.getenv('SILICONFLOW_API_KEY', '')
+        siliconflow_url = os.getenv('SILICONFLOW_URL', 'https://api.siliconflow.cn/v1/chat/completions')
+        siliconflow_model = os.getenv('SILICONFLOW_MODEL', 'deepseek-ai/DeepSeek-V3.2-Exp')
+        # Pick the provider based on the configuration
+        if ai_provider == 'huawei':
+            if not huawei_key or not huawei_endpoint:
+                raise Exception("Huawei LLM service is not configured; set HUAWEI_API_KEY and HUAWEI_API_ENDPOINT, or set AI_PROVIDER=siliconflow to use SiliconFlow")
+            self.api_key = huawei_key
+            self.model = huawei_model
+            self.api_url = huawei_endpoint
+            print(f"[Template AI helper] Using Huawei LLM: {huawei_model}")
+        elif ai_provider == 'siliconflow':
+            if not siliconflow_key:
+                raise Exception("SiliconFlow service is not configured; set SILICONFLOW_API_KEY, or set AI_PROVIDER=huawei to use the Huawei LLM")
+            self.api_key = siliconflow_key
+            self.model = siliconflow_model
+            self.api_url = siliconflow_url
+            print(f"[Template AI helper] Using SiliconFlow: {siliconflow_model}")
+        else:
+            # Auto-detect: prefer SiliconFlow; fall back to the Huawei LLM if it is not configured
+            if siliconflow_key and siliconflow_url:
+                self.api_key = siliconflow_key
+                self.model = siliconflow_model
+                self.api_url = siliconflow_url
+                print(f"[Template AI helper] Auto-selected SiliconFlow: {siliconflow_model}")
+            elif huawei_key and huawei_endpoint:
+                self.api_key = huawei_key
+                self.model = huawei_model
+                self.api_url = huawei_endpoint
+                print(f"[Template AI helper] Auto-selected Huawei LLM: {huawei_model}")
+            else:
+                raise Exception("No AI service is configured; set the AI_PROVIDER environment variable ('huawei' or 'siliconflow') and the corresponding API key")

     def test_api_connection(self) -> bool:
         """
@@ -30,7 +68,9 @@ class TemplateAIHelper:
             Whether the connection succeeded
         """
         try:
-            print("  [test] Testing the SiliconFlow API connection...")
+            print(f"  [test] Testing the API connection...")
+            # Test payload
             test_payload = {
                 "model": self.model,
                 "messages": [
@@ -39,9 +79,14 @@ class TemplateAIHelper:
                         "content": "test"
                     }
                 ],
+                "temperature": 0.5,
                 "max_tokens": 10
             }
+            # For the Huawei LLM, add extra parameters
+            if 'huawei' in self.api_url.lower() or '10.100.31.26' in self.api_url:
+                test_payload["stream"] = False
             headers = {
                 "Authorization": f"Bearer {self.api_key}",
                 "Content-Type": "application/json"
@@ -150,10 +195,21 @@ class TemplateAIHelper:
                     "content": prompt
                 }
             ],
-            "temperature": 0.2,
-            "max_tokens": 4000
+            "temperature": 0.5,
+            "max_tokens": 8192
         }
+        # For the Huawei LLM, add extra parameters
+        if 'huawei' in self.api_url.lower() or '10.100.31.26' in self.api_url:
+            payload["stream"] = False
+            payload["presence_penalty"] = 1.03
+            payload["frequency_penalty"] = 1.0
+            payload["repetition_penalty"] = 1.0
+            payload["top_p"] = 0.95
+            payload["top_k"] = 1
+            payload["seed"] = 1
+            payload["n"] = 1
         headers = {
             "Authorization": f"Bearer {self.api_key}",
             "Content-Type": "application/json"
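The provider-selection precedence introduced in this diff (explicit `AI_PROVIDER` wins; otherwise prefer SiliconFlow, then Huawei) can be isolated into a small testable function. This is a simplified sketch using the environment-variable names from the diff, with the printing and HTTP details omitted:

```python
from typing import Tuple


def select_provider(env: dict) -> Tuple[str, str, str]:
    """Return (provider, model, api_url) with the same precedence as the diff:
    an explicit AI_PROVIDER wins; otherwise prefer SiliconFlow, then Huawei."""
    provider = env.get('AI_PROVIDER', '').lower()
    sf_key = env.get('SILICONFLOW_API_KEY', '')
    sf_url = env.get('SILICONFLOW_URL', 'https://api.siliconflow.cn/v1/chat/completions')
    sf_model = env.get('SILICONFLOW_MODEL', 'deepseek-ai/DeepSeek-V3.2-Exp')
    hw_key = env.get('HUAWEI_API_KEY', '')
    hw_url = env.get('HUAWEI_API_ENDPOINT', '')
    hw_model = env.get('HUAWEI_MODEL', 'DeepSeek-R1-Distill-Llama-70B')
    if provider == 'huawei':
        if not (hw_key and hw_url):
            raise RuntimeError("Huawei provider selected but not configured")
        return 'huawei', hw_model, hw_url
    if provider == 'siliconflow':
        if not sf_key:
            raise RuntimeError("SiliconFlow provider selected but not configured")
        return 'siliconflow', sf_model, sf_url
    # Auto-detect: SiliconFlow first, Huawei as the fallback
    if sf_key:
        return 'siliconflow', sf_model, sf_url
    if hw_key and hw_url:
        return 'huawei', hw_model, hw_url
    raise RuntimeError("No AI provider configured")


print(select_provider({'SILICONFLOW_API_KEY': 'sk-demo'})[0])  # siliconflow
```

Taking the environment as a plain dict (rather than reading `os.getenv` inside the function, as `__init__` does) makes each precedence branch unit-testable without mutating process state.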

Binary file not shown.

Some files were not shown because too many files have changed in this diff.