Compare commits


48 Commits

Author SHA1 Message Date
python  d27c18d0d2  Fix document generation error tests  2025-12-30 10:41:35 +08:00
python  7bb69af45e  Fix template directory hierarchy again  2025-12-26 09:32:15 +08:00
python  ac8bdba941  Finish updating and testing 121 templates.  2025-12-26 09:16:31 +08:00
python  eec66cbe05  Refactor template save/download logic: save files to the local template_finish folder and read templates locally in the document service; improve error handling to keep file paths valid and safe.  2025-12-18 16:45:31 +08:00
python  fb7fb985ad  Redo the interview approval form and update the test code  2025-12-15 16:23:37 +08:00
python  557c9ae351  Add a placeholder replacement test script  2025-12-15 14:45:42 +08:00
python  cb4a07f148  Improve the data field association tool so associations display, add, and edit correctly.  2025-12-14 20:12:05 +08:00
python  665612d2bf  Update .env with the new MinIO and AI service settings; remove unused docs and templates; add database backup/restore; manage tenant IDs and field associations via API; improve the front-end template field management UI.  2025-12-14 16:56:26 +08:00
python  1d0f7a5bfe  Add .env to .gitignore to keep the environment file out of version control  2025-12-12 22:46:59 +08:00
python  672dd2e516  Remove the tenant_id restriction from file-config and field queries; initialize the document and field services from environment variables for database and MinIO settings.  2025-12-12 22:41:01 +08:00
python  2563e7fc74  Sync database and MinIO template data to the Smart Supervision server  2025-12-12 15:20:43 +08:00
python  640f7834b6  Temporary checkpoint  2025-12-12 09:50:22 +08:00
python  70f5be89ce  Remove invalid and duplicate template files; fix template-field associations in the database; fix placeholder replacement bugs; improve error handling and debug output; update the markdown docs.  2025-12-11 21:19:23 +08:00
python  dab5d8ee59  Add direct XML placeholder replacement for Word documents as a fallback when table processing fails; improve table placeholder logic, error handling, and debug output.  2025-12-11 16:48:43 +08:00
python  4d9080855c  Access table rows and cells by index to avoid iteration index errors; harden access to rows, cells, and paragraphs in the document service  2025-12-11 16:34:50 +08:00
python  91fcd5461d  Fix placeholder replacement inside tables on Ubuntu  2025-12-11 16:30:42 +08:00
python  d01f367ffb  Fix placeholder replacement issue in the Ubuntu environment  2025-12-11 16:24:21 +08:00
python  7cbe4b29b7  Fix a document generation exception and a template input-field association exception.  2025-12-11 16:10:48 +08:00
python  d3afba9191  Adjust templates and the test page  2025-12-11 15:10:46 +08:00
python  d8d2817aed  Resolve conflicts in document generation; update file path and download link handling; load all available templates in the front-end file list and add a clear-list action.  2025-12-11 14:50:07 +08:00
python  28bf100ca4  Add a file-config query endpoint that fetches documents by taskId, with parameter validation and error handling; make generated document names and paths accurate.  2025-12-11 12:14:25 +08:00
python  6dd272d083  Add MinIO presigned download URL generation; return download links from document generation; clean up console output.  2025-12-11 12:10:13 +08:00
python  2a5952f3f5  Update input field defaults; use file IDs instead of template codes when adding file items; simplify file-config queries.  2025-12-11 09:16:14 +08:00
python  a320f55da0  Use file IDs instead of template codes in document generation, with parameter validation and error handling; look up file config by file ID.  2025-12-11 09:09:10 +08:00
python  ebc1154beb  Update download link generation and file paths for the latest document version; validate that template file paths are non-empty in the document service.  2025-12-10 19:02:20 +08:00
python  0563ff5346  Infer age from date of birth when the age field is missing; improve related logging; update docs for the new field configuration.  2025-12-10 14:16:59 +08:00
python  e38ba42669  Delete obsolete docs (AI error analysis report, template tree update notes, database backup/restore guide, etc.); support extracting template config from input_data and template_code in the document service.  2025-12-10 10:39:36 +08:00
python  11be119ffc  Support multiple AI providers (Huawei and SiliconFlow) via environment config; update the docs accordingly.  2025-12-10 10:05:45 +08:00
python  cd27bb4bd0  Improve AI content extraction and inference of missing fields; refine post-processing for accuracy and completeness.  2025-12-10 09:49:57 +08:00
python  6871c2e803  Infer missing gender and age fields from the raw input text in post-processing.  2025-12-10 09:37:37 +08:00
python  24fdfdea4c  Improve AI extraction prompts; add post-processing that infers missing gender, rank, and clue-source fields.  2025-12-09 15:32:25 +08:00
python  563d97184b  Tighten the JSON output requirements in the extraction prompts; fix field-name errors and underscore-prefix handling.  2025-12-09 15:19:32 +08:00
python  9bf1dd1210  Harden API response handling: improve JSON parsing and error handling so failed extractions return empty results instead of raising; log detailed debug info.  2025-12-09 15:01:31 +08:00
python  315301fc0b  Add an AI logger that records requests and responses, including success and error cases, for debugging and monitoring.  2025-12-09 14:51:33 +08:00
python  8bebc13efe  Tune the extraction logic  2025-12-09 14:41:26 +08:00
python  f1b5c52500  Fix json-repair installation and import  2025-12-09 14:18:32 +08:00
python  7c30e59328  Use the json-repair library to handle incomplete or malformed JSON; allow some fields to be empty.  2025-12-09 14:13:07 +08:00
python  eaa384cf7e  Update the prompt config; clarify extraction requirements; add post-processing to infer missing fields.  2025-12-09 12:56:28 +08:00
python  b8d89c28ec  Add debug output for AI results and field mappings; fix field-name cleaning to avoid empty-field-name errors.  2025-12-09 12:45:11 +08:00
python  e1d8d27dc4  Normalize dates to the Chinese format; handle common misspellings; improve field-name cleaning and normalization.  2025-12-09 12:34:01 +08:00
python  e31cd0b764  Add a max-tokens setting; add JSON cleaning/repair helpers; improve field-name normalization.  2025-12-09 12:14:34 +08:00
python  d8fa4c3d7e  Add an API timeout setting with a longer timeout in thinking mode; increase the retry delay from 1s to 2s.  2025-12-09 11:58:07 +08:00
python  c7a7780e71  Simplify the extraction prompts; improve handling of API response content; add debug output.  2025-12-09 11:46:52 +08:00
python  14ff607b52  Add a retry mechanism with detailed errors for Huawei LLM API calls; move the API call into its own method for readability.  2025-12-09 11:41:45 +08:00
python  8461725a13  Clarify extraction requirements and rules in the prompts; add new field extraction logic.  2025-12-09 11:39:18 +08:00
python  684cb0141a  Add JSON-object extraction from text so Huawei LLM responses yield only the JSON object.  2025-12-09 11:30:02 +08:00
python  f0cb4a7ba0  Update .env with the latest Huawei LLM parameters  2025-12-09 11:03:19 +08:00
python  7d50b160c2  Create a new branch for the Static Traffic on-prem deployment; default to the Huawei LLM.  2025-12-09 10:33:41 +08:00
965 changed files with 58737 additions and 34689 deletions

.cursorrules Normal file

@@ -0,0 +1,324 @@
# Smart Supervision AI Document Writing Service - AI Development Handbook
## Project Background
This project is an LLM-based intelligent document generation service. Its main features are:
- Extracting structured field data from unstructured text (using an AI large language model)
- Filling Word templates with field data to generate formal documents
- Managing document templates for multiple business types
- Document storage and download management (MinIO object storage)
Core business flow:
1. Receive unstructured input text
2. Extract structured fields with the AI model
3. Fill the Word template with the field data
4. Generate the document and upload it to MinIO
5. Return a download link for the document
## Tech Stack and Coding Standards
### Core Tech Stack
- **Python version**: Python 3.8+
- **Web framework**: Flask 3.0.0
- **Database**: MySQL (via PyMySQL 1.1.2)
- **Document processing**: python-docx 1.1.0
- **Object storage**: MinIO 7.2.3
- **AI services**:
  - Huawei LLM (DeepSeek-R1-Distill-Llama-70B)
  - SiliconFlow (DeepSeek-V3.2-Exp)
- **Other dependencies**: flask-cors, flasgger, python-dotenv, requests, openpyxl, json-repair
### Coding Conventions
- **Code style**: follow PEP 8
- **Naming**:
  - Classes use PascalCase: `AIService`, `DocumentService`
  - Functions and variables use snake_case: `get_connection`, `field_data`
  - Constants use UPPER_SNAKE_CASE: `AI_PROVIDER`, `DB_HOST`
- **Comments**:
  - Every class and method must have a triple-quoted docstring
  - Complex logic must carry inline comments
  - Comments are written in Chinese (project-wide convention)
- **Type hints**: type hints (typing module) are recommended for readability
- **Exception handling**: wrap fallible operations in try-except and provide meaningful error messages
### File Encoding
- All Python files use UTF-8
- No BOM marker at the start of files
## Project Layout
```
.
├── app.py                      # Flask main app; defines all API routes
├── requirements.txt            # Python dependency list
├── .env.example                # Example environment configuration
├── .cursorrules                # AI development handbook (this file)
├── services/                   # Service layer (business logic)
│   ├── __init__.py
│   ├── ai_service.py           # AI service: wraps LLM call logic
│   ├── document_service.py     # Document service: Word template filling, MinIO upload
│   ├── field_service.py        # Field service: database field-config queries
│   └── ai_logger.py            # AI logger
├── utils/                      # Utilities
│   ├── __init__.py
│   └── response.py             # Unified API response helpers
├── config/                     # Configuration files
│   ├── prompt_config.json      # AI prompt configuration
│   └── field_defaults.json     # Field default values
├── static/                     # Static files
│   ├── index.html              # Test page
│   └── template_field_manager.html  # Template field management page
├── template/                   # Word template directory
├── template_finish/            # Finished template files
├── test_scripts/               # Test scripts
└── 技术文档/                   # Technical documentation
```
## Architecture Constraints and Best Practices
### 1. Layered Architecture
The project uses a strict layered architecture:
- **Route layer** (`app.py`): only receives HTTP requests, validates parameters, calls the service layer, and returns responses
- **Service layer** (`services/`): holds all business logic; service classes may call each other
- **Utility layer** (`utils/`): generic helpers with no business logic
- **Data layer**: database access is encapsulated inside the service layer, not abstracted separately
**Key principles**:
- The route layer contains no business logic; it only validates parameters and formats responses
- Service methods must be testable and must not depend on Flask's request object
- Database connections must be closed after use (enforced with try-finally)
### 2. Service Layer Design
#### AI service (`services/ai_service.py`)
- Owns all LLM calls
- Supports multiple providers (Huawei, SiliconFlow), switched via environment variables
- Must handle JSON parse failures with multiple repair strategies
- Field-name normalization: map the various field-name spellings the AI returns to the correct field codes
- Date normalization: convert dates to the Chinese format (YYYY年MM月 or YYYY年MM月DD日)
- Post-processing: infer missing fields from known data (e.g. compute age from date of birth)
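The date normalization and age inference rules above can be sketched as two pure helpers. This is an illustration only: the function names `normalize_date` and `infer_age` are assumptions, not the actual `ai_service.py` API.

```python
import re
from datetime import date
from typing import Optional


def normalize_date(value: str) -> str:
    """Convert common date spellings to the Chinese format used by the service."""
    # 2021-05-12 / 2021/05/12 -> 2021年05月12日
    m = re.match(r'^(\d{4})[-/.](\d{1,2})[-/.](\d{1,2})$', value.strip())
    if m:
        y, mo, d = m.groups()
        return f"{y}年{int(mo):02d}月{int(d):02d}日"
    # 2021-05 -> 2021年05月
    m = re.match(r'^(\d{4})[-/.](\d{1,2})$', value.strip())
    if m:
        y, mo = m.groups()
        return f"{y}年{int(mo):02d}月"
    return value  # already normalized or unrecognized: leave unchanged


def infer_age(date_of_birth: str, today: date) -> Optional[int]:
    """Infer age from a Chinese-format date of birth, or None if unparseable."""
    m = re.match(r'^(\d{4})年(\d{1,2})月', date_of_birth)
    if not m:
        return None
    year, month = int(m.group(1)), int(m.group(2))
    age = today.year - year
    if today.month < month:  # birthday month not reached yet this year
        age -= 1
    return age
```

For example, `normalize_date("1975-08-03")` yields `1975年08月03日`, and the inferred age ticks down by one when the birth month has not yet been reached.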
#### Document service (`services/document_service.py`)
- Downloads, fills, and uploads Word templates
- Placeholder format: `{{field_code}}` (double braces)
- Must handle placeholders inside tables (access by index to avoid an iterator bug)
- Provides an XML fallback for unusual table structures
- Document naming: derive the base name from the original file name and append the subject's name
- MinIO path format: `/{tenant_id}/{timestamp}/{file_name}`
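Run-splitting is the usual pitfall with `{{field_code}}` placeholders: Word may split one placeholder across several runs. A minimal sketch of the merge-and-replace idea, operating on plain run texts (illustrative only; the real `document_service.py` works on python-docx run objects):

```python
from typing import Dict, List


def replace_placeholders(run_texts: List[str], values: Dict[str, str]) -> List[str]:
    """Replace {{field_code}} placeholders that may be split across runs.

    Joins the run texts, substitutes every known placeholder, then writes
    the result back into the first run and blanks the rest -- a common
    python-docx workaround that keeps the run count (and thus the
    surrounding formatting objects) stable.
    """
    joined = "".join(run_texts)
    for code, value in values.items():
        joined = joined.replace("{{" + code + "}}", value)
    if not run_texts:
        return run_texts
    return [joined] + [""] * (len(run_texts) - 1)
```

For example, `replace_placeholders(["{{tar", "get_name}}同志"], {"target_name": "张三"})` yields `["张三同志", ""]`. The trade-off of this approach is that per-run character formatting inside the merged span collapses to the first run's formatting.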
#### Field service (`services/field_service.py`)
- Queries field configuration from the database
- Builds AI prompts from the input data and the output-field configuration
- Loads prompt templates and field defaults from config files
### 3. Database Schema
#### Main tables
- `f_polic_field`: field configuration
  - `id`: primary key
  - `name`: field name
  - `filed_code`: field code (note: the column is spelled `filed_code`, not `field_code`)
  - `field_type`: field type (1 = input field, 2 = output field)
  - `state`: status (1 = enabled, 0 = disabled)
- `f_polic_file_config`: file (template) configuration
  - `id`: primary key (used as fileId)
  - `name`: file name
  - `file_path`: file path in MinIO
  - `input_data`: input-data configuration in JSON
  - `state`: status (1 = enabled, 0 = disabled)
- `f_polic_file_field`: file-field association
  - `file_id`: file config ID
  - `filed_id`: field ID
  - `state`: status (1 = enabled, 0 = disabled)
#### Database access rules
- Always use parameterized queries to prevent SQL injection
- Use `pymysql.cursors.DictCursor` to get dict-shaped rows
- Close connections after use (try-finally pattern)
- Transactions must roll back correctly on error
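The access rules above combine into a pattern like the following sketch. It uses the stdlib `sqlite3` module so it runs anywhere; the project itself uses PyMySQL, where the placeholder is `%s` instead of `?` and `pymysql.cursors.DictCursor` yields dict rows instead of tuples. The helper name `get_enabled_fields` is illustrative.

```python
import sqlite3


def get_enabled_fields(conn):
    """Parameterized query plus guaranteed cursor cleanup (try-finally)."""
    sql = "SELECT id, name, filed_code FROM f_polic_field WHERE state = ?"
    cursor = conn.cursor()
    try:
        cursor.execute(sql, (1,))  # parameterized: no SQL injection
        return cursor.fetchall()
    finally:
        cursor.close()  # always released, even if execute() raises


# Demo setup with an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE f_polic_field (id INTEGER, name TEXT, filed_code TEXT, state INTEGER)"
)
conn.executemany(
    "INSERT INTO f_polic_field VALUES (?, ?, ?, ?)",
    [(1, "被核查人姓名", "target_name", 1),
     (2, "旧字段", "old_field", 0)],  # disabled row: must be filtered out
)
rows = get_enabled_fields(conn)
conn.close()
```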
### 4. API Design
#### Unified response format
All APIs must use the helpers in `utils/response.py`:
**Success**:
```python
return success_response(data={'key': 'value'}, msg="ok")
```
**Error**:
```python
return error_response(code=400, error_msg="error message")
```
**Response shape**:
```json
{
    "code": 0,                  // 0 = success; any other value is an error code
    "data": {},                 // response payload
    "msg": "ok",                // response message
    "timestamp": "1234567890",  // timestamp (milliseconds)
    "errorMsg": "",             // error message (empty on success)
    "isSuccess": true           // success flag
}
```
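A minimal sketch of what the `utils/response.py` helpers could look like. This is an assumption for illustration: the real module presumably returns Flask responses, while plain dicts are returned here so the envelope shape is easy to see.

```python
import time
from typing import Any, Optional


def success_response(data: Optional[Any] = None, msg: str = "ok") -> dict:
    """Build the unified success envelope described above."""
    return {
        "code": 0,
        "data": data if data is not None else {},
        "msg": msg,
        "timestamp": str(int(time.time() * 1000)),  # milliseconds
        "errorMsg": "",
        "isSuccess": True,
    }


def error_response(code: int, error_msg: str) -> dict:
    """Build the unified error envelope; code 0 is reserved for success."""
    return {
        "code": code,
        "data": {},
        "msg": "",
        "timestamp": str(int(time.time() * 1000)),
        "errorMsg": error_msg,
        "isSuccess": False,
    }
```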
#### Routing rules
- Define routes with the `@app.route` decorator
- Provide Swagger docs via flasgger
- Route paths use lowercase letters and hyphens: `/api/ai/extract`
- Keep legacy paths for compatibility: both `/ai/extract` and `/api/ai/extract` are supported
#### Error codes
- `0`: success
- `400`: invalid request parameters
- `500`: internal server error
- `1001`: template not found
- `2001`: AI parsing timeout
- `2002`: AI parsing failed
- `3001`: file generation failed
- `3002`: file save failed
### 5. Environment Configuration
All configuration is managed through environment variables in a `.env` file (never commit it):
**Required**:
- `DB_HOST`: database host
- `DB_PORT`: database port
- `DB_USER`: database user
- `DB_PASSWORD`: database password
- `DB_NAME`: database name
- `MINIO_ENDPOINT`: MinIO endpoint
- `MINIO_ACCESS_KEY`: MinIO access key
- `MINIO_SECRET_KEY`: MinIO secret key
- `MINIO_BUCKET`: MinIO bucket name
- `MINIO_SECURE`: use HTTPS (true/false)
**AI service**:
- `AI_PROVIDER`: AI provider, 'huawei' or 'siliconflow'
- `HUAWEI_API_ENDPOINT`: Huawei API endpoint
- `HUAWEI_API_KEY`: Huawei API key
- `HUAWEI_MODEL`: Huawei model name
- `HUAWEI_API_TIMEOUT`: timeout (seconds)
- `HUAWEI_API_MAX_TOKENS`: max tokens
- `SILICONFLOW_URL`: SiliconFlow API endpoint
- `SILICONFLOW_API_KEY`: SiliconFlow API key
- `SILICONFLOW_MODEL`: SiliconFlow model name
- `SILICONFLOW_API_TIMEOUT`: timeout (seconds)
- `SILICONFLOW_API_MAX_TOKENS`: max tokens
**Optional**:
- `PORT`: service port (default 7500)
- `DEBUG`: debug mode, true/false (default false)
- `TENANT_ID`: tenant ID (used in MinIO paths)
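Loading this configuration typically looks like the sketch below, using the documented defaults. `load_ai_config` is an illustrative name, not an actual project function; the `load_dotenv` call is skipped gracefully if python-dotenv is absent.

```python
import os

try:
    from dotenv import load_dotenv  # python-dotenv
    load_dotenv()  # read .env into the process environment
except ImportError:
    pass  # fall back to the real environment


def load_ai_config() -> dict:
    """Collect provider settings with the defaults documented above."""
    provider = os.getenv("AI_PROVIDER", "siliconflow")
    cfg = {
        "provider": provider,
        "port": int(os.getenv("PORT", "7500")),
        "debug": os.getenv("DEBUG", "false").lower() == "true",
    }
    if provider == "huawei":
        cfg["timeout"] = int(os.getenv("HUAWEI_API_TIMEOUT", "180"))
        cfg["max_tokens"] = int(os.getenv("HUAWEI_API_MAX_TOKENS", "12000"))
    else:
        cfg["timeout"] = int(os.getenv("SILICONFLOW_API_TIMEOUT", "120"))
        cfg["max_tokens"] = int(os.getenv("SILICONFLOW_API_MAX_TOKENS", "2000"))
    return cfg
```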
### 6. Error Handling
- Every fallible operation must be wrapped in try-except
- Exception messages must be detailed and include context
- Database exceptions must roll back the transaction
- File-operation exceptions must clean up temporary files
- Errors exposed to clients must be friendly and must not leak implementation details
**Example**:
```python
try:
    # business logic
    result = some_operation()
    return success_response(data=result)
except ValueError as e:
    return error_response(400, f"Invalid parameter: {str(e)}")
except Exception as e:
    # log the full error details
    print(f"[ERROR] Operation failed: {str(e)}")
    import traceback
    print(traceback.format_exc())
    return error_response(500, "Internal server error")
```
### 7. Logging
- Logs are emitted with `print` (the project currently uses print, not the logging module)
- Log format: `[LEVEL] message`
- Levels:
  - `[DEBUG]`: debugging detail (execution traces)
  - `[INFO]`: normal flow
  - `[WARN]`: warnings (functionality unaffected but noteworthy)
  - `[ERROR]`: failures
- Key operations must be logged (AI calls, file generation, database operations)
### 8. Code Quality
- **Readability**: clear code with meaningful variable names
- **Maintainability**: avoid duplication; extract shared helpers
- **Testability**: service methods should be pure functions where possible, for easy unit testing
- **Robustness**: handle edge cases; never crash
- **Performance**: optimize database queries; avoid N+1 queries
### 9. Special Notes
#### Word template handling
- Use the `python-docx` library
- Placeholder format: `{{field_code}}` (double braces)
- Tables must be accessed by index to avoid iterator-induced IndexError
- Placeholders split across runs must be merged while preserving formatting
- Some complex table structures can fail to process; an XML fallback is provided
#### AI field extraction
- The JSON the AI returns may be malformed; multiple repair strategies are needed
- Field names can be irregular (e.g. `_source`, `target_organisation`) and must be normalized
- Dates must be converted to the unified Chinese format
- Missing fields must be inferred from known data (e.g. compute age from date of birth)
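The field-name normalization described above can be sketched as cleanup plus an alias lookup. The alias table below is illustrative, built from the examples in this section (`_source`, `target_organisation`); the real mapping lives in the AI service.

```python
import re
from typing import Optional

# Illustrative alias table; the project's real mapping is larger
FIELD_ALIASES = {
    "source": "clue_source",
    "target_organisation": "target_organization",  # British spelling variant
}


def normalize_field_name(raw: str, known_codes: set) -> Optional[str]:
    """Map an AI-returned field name onto a known field code.

    Strips underscore prefixes (e.g. '_source'), lowercases, collapses
    non-word characters, then checks the known codes and the alias table.
    Returns None when the name cannot be mapped.
    """
    name = raw.strip().lstrip("_")
    name = re.sub(r"[^\w]", "_", name).lower()
    if name in known_codes:
        return name
    return FIELD_ALIASES.get(name)
```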
#### MinIO file management
- File paths are relative (starting with `/`)
- Uploads generate a timestamped path automatically
- Presigned download URLs are supported (valid for 7 days)
- Temporary files must be cleaned up after use
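The timestamped object path can be built with a small helper. A sketch: the timestamp format shown is an assumption, and the presigned-URL step (in the MinIO Python SDK, `client.presigned_get_object(bucket, object_name, expires=timedelta(days=7))`) is left out so the snippet has no external dependency.

```python
from datetime import datetime


def build_object_path(tenant_id: str, file_name: str, now: datetime) -> str:
    """Build the /{tenant_id}/{timestamp}/{file_name} MinIO path.

    The YYYYMMDDHHMMSS timestamp format is an assumption for illustration.
    """
    timestamp = now.strftime("%Y%m%d%H%M%S")
    return f"/{tenant_id}/{timestamp}/{file_name}"
```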
## Development Workflow
1. **Adding a feature**:
   - Add the business logic method in the service layer
   - Add the API endpoint in the route layer
   - Update the Swagger doc comments
   - Test the feature
2. **Modifying a feature**:
   - Understand the existing logic first
   - Keep the API backward compatible (consider versioning for breaking changes)
   - Update the related docs
3. **Debugging**:
   - Read the log output (use the [DEBUG] level)
   - Check that the database data is correct
   - Verify the environment configuration
   - Check that the AI service is reachable
## Troubleshooting
1. **AI parsing fails**: check the input text quality, read the AI logs, try JSON repair
2. **Template filling fails**: check the placeholder format and whether the table structure is unusual
3. **MinIO upload fails**: check the network connection and MinIO credentials/permissions
4. **Database connection fails**: check the database configuration and network
## Code Generation Guidance
When the AI generates code, follow these principles:
1. **Consistency**: follow the project's existing style and architecture
2. **Error handling**: every fallible operation needs exception handling
3. **Logging**: log key operations
4. **Parameter validation**: API endpoints must validate input
5. **Resource cleanup**: files, database connections, and similar resources must be released
6. **Chinese comments**: write comments and docstrings in Chinese
7. **Type hints**: recommended for readability
## Changelog
- 2025-12-13: initial version

.env

@@ -1,14 +1,68 @@
-# SiliconFlow API configuration
-SILICONFLOW_API_KEY=sk-xnhmtotmlpjomrejbwdbczbpbyvanpxndvbxltodjwzbpmni
-SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
-
-# Huawei LLM API configuration (reserved)
-HUAWEI_API_ENDPOINT=
-HUAWEI_API_KEY=
-
-# Database configuration
+# ========== AI service provider ==========
+# Select the AI service provider
+# Allowed values: 'huawei' or 'siliconflow'
+# Default: 'siliconflow'
+AI_PROVIDER=siliconflow
+
+# ========== Huawei LLM API ==========
+# Used when AI_PROVIDER=huawei
+
+# API endpoint
+HUAWEI_API_ENDPOINT=http://10.100.31.26:3001/v1/chat/completions
+# API key
+HUAWEI_API_KEY=sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186
+# Model name
+HUAWEI_MODEL=DeepSeek-R1-Distill-Llama-70B
+# API timeout
+# Thinking mode increases response time significantly and needs a longer timeout
+# Default: 180 seconds (3 minutes)
+HUAWEI_API_TIMEOUT=180
+# Max tokens
+# Thinking mode may generate longer responses and needs more tokens
+# Default: 12000
+HUAWEI_API_MAX_TOKENS=12000
+
+# ========== SiliconFlow API ==========
+# Used when AI_PROVIDER=siliconflow
+
+# API endpoint (default; usually no need to change)
+SILICONFLOW_URL=https://api.siliconflow.cn/v1/chat/completions
+# API key (required)
+SILICONFLOW_API_KEY=sk-pgujibohpenkomkwlufexmqzyckglgogdiubfplgqxkfqgfu
+# Model name (default; usually no need to change)
+SILICONFLOW_MODEL=Qwen/Qwen2.5-72B-Instruct
+# API timeout
+# Default: 120 seconds
+SILICONFLOW_API_TIMEOUT=120
+# Max tokens
+# Default: 2000
+SILICONFLOW_API_MAX_TOKENS=2000
+
+# ========== Database configuration ==========
 DB_HOST=152.136.177.240
 DB_PORT=5012
 DB_USER=finyx
 DB_PASSWORD=6QsGK6MpePZDE57Z
 DB_NAME=finyx
+
+# ========== MinIO configuration (optional; required for document generation) ==========
+MINIO_ENDPOINT=minio.datacubeworld.com:9000
+MINIO_ACCESS_KEY=JOLXFXny3avFSzB0uRA5
+MINIO_SECRET_KEY=G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I
+MINIO_BUCKET=finyx
+MINIO_SECURE=true
+
+# ========== Service configuration ==========
+# Service port
+PORT=7500
+# Debug mode (true/false)
+DEBUG=False

.env.example

@@ -1,14 +1,68 @@
-# SiliconFlow API configuration
-SILICONFLOW_API_KEY=your_api_key_here
-SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
-
-# Huawei LLM API configuration (reserved)
-HUAWEI_API_ENDPOINT=
-HUAWEI_API_KEY=
-
-# Database configuration
+# ========== AI service provider ==========
+# Select the AI service provider
+# Allowed values: 'huawei' or 'siliconflow'
+# Default: 'siliconflow'
+AI_PROVIDER=siliconflow
+
+# ========== Huawei LLM API ==========
+# Used when AI_PROVIDER=huawei
+
+# API endpoint
+HUAWEI_API_ENDPOINT=http://10.100.31.26:3001/v1/chat/completions
+# API key
+HUAWEI_API_KEY=sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186
+# Model name
+HUAWEI_MODEL=DeepSeek-R1-Distill-Llama-70B
+# API timeout
+# Thinking mode increases response time significantly and needs a longer timeout
+# Default: 180 seconds (3 minutes)
+HUAWEI_API_TIMEOUT=180
+# Max tokens
+# Thinking mode may generate longer responses and needs more tokens
+# Default: 12000
+HUAWEI_API_MAX_TOKENS=12000
+
+# ========== SiliconFlow API ==========
+# Used when AI_PROVIDER=siliconflow
+
+# API endpoint (default; usually no need to change)
+SILICONFLOW_URL=https://api.siliconflow.cn/v1/chat/completions
+# API key (required)
+SILICONFLOW_API_KEY=sk-pgujibohpenkomkwlufexmqzyckglgogdiubfplgqxkfqgfu
+# Model name (default; usually no need to change)
+SILICONFLOW_MODEL=Qwen/Qwen2.5-72B-Instruct
+# API timeout
+# Default: 120 seconds
+SILICONFLOW_API_TIMEOUT=120
+# Max tokens
+# Default: 2000
+SILICONFLOW_API_MAX_TOKENS=2000
+
+# ========== Database configuration ==========
 DB_HOST=152.136.177.240
 DB_PORT=5012
 DB_USER=finyx
 DB_PASSWORD=6QsGK6MpePZDE57Z
 DB_NAME=finyx
+
+# ========== MinIO configuration (optional; required for document generation) ==========
+MINIO_ENDPOINT=minio.datacubeworld.com:9000
+MINIO_ACCESS_KEY=JOLXFXny3avFSzB0uRA5
+MINIO_SECRET_KEY=G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I
+MINIO_BUCKET=finyx
+MINIO_SECURE=true
+
+# ========== Service configuration ==========
+# Service port
+PORT=7500
+# Debug mode (true/false)
+DEBUG=False

.gitignore vendored Normal file

@@ -0,0 +1,56 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual Environment
venv/
env/
ENV/
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
# Logs
logs/
*.log
# Environment variables
.env
.env.local
# Database
*.db
*.sqlite
*.sqlite3
# OS
.DS_Store
Thumbs.db
# Project specific
parsed_fields.json
*.docx.bak
.env

README.md

@@ -6,8 +6,10 @@
 - ✅ AI parsing endpoint (`/api/ai/extract`) - extracts structured fields from input text
 - ✅ Field configuration management - reads field configs from the database
-- ✅ SiliconFlow LLM support (DeepSeek)
-- 🔄 Huawei LLM support reserved
+- ✅ Multiple AI service providers supported
+  - Huawei LLM (DeepSeek-R1-Distill-Llama-70B)
+  - SiliconFlow (DeepSeek-V3.2-Exp)
+- ✅ AI provider switchable via configuration
 - ✅ Web test UI - visual testing of the parsing feature

 ## Project Structure
@@ -70,14 +72,30 @@ copy .env.example .env
 cp .env.example .env
 ```
-Edit the `.env` file and fill in your API key:
+Edit the `.env` file and fill in your configuration:
 ```env
-# SiliconFlow API configuration (required)
-SILICONFLOW_API_KEY=your_api_key_here
-SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
+# ========== AI service provider ==========
+# Select the AI service provider
+# Allowed values: 'huawei' or 'siliconflow'
+# Default: 'siliconflow'
+AI_PROVIDER=siliconflow
-# Database configuration (preconfigured; adjust if needed)
+# ========== Huawei LLM API (used when AI_PROVIDER=huawei) ==========
+HUAWEI_API_ENDPOINT=http://10.100.31.26:3001/v1/chat/completions
+HUAWEI_API_KEY=sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186
+HUAWEI_MODEL=DeepSeek-R1-Distill-Llama-70B
+HUAWEI_API_TIMEOUT=180
+HUAWEI_API_MAX_TOKENS=12000
+
+# ========== SiliconFlow API (used when AI_PROVIDER=siliconflow) ==========
+SILICONFLOW_URL=https://api.siliconflow.cn/v1/chat/completions
+SILICONFLOW_API_KEY=your_siliconflow_api_key_here
+SILICONFLOW_MODEL=deepseek-ai/DeepSeek-V3.2-Exp
+SILICONFLOW_API_TIMEOUT=120
+SILICONFLOW_API_MAX_TOKENS=2000
+
+# ========== Database configuration ==========
 DB_HOST=152.136.177.240
 DB_PORT=5012
 DB_USER=finyx
@@ -85,6 +103,41 @@ DB_PASSWORD=6QsGK6MpePZDE57Z
 DB_NAME=finyx
 ```
+
+**Choosing an AI provider**:
+- **Huawei LLM**: set `AI_PROVIDER=huawei` and configure `HUAWEI_API_KEY` and `HUAWEI_API_ENDPOINT`
+- **SiliconFlow**: set `AI_PROVIDER=siliconflow` (the default) and configure `SILICONFLOW_API_KEY`
+If the configured AI service is incomplete, the system automatically tries the other available service.
+
+**Huawei LLM API call example**:
+```bash
+curl --location --request POST 'http://10.100.31.26:3001/v1/chat/completions' \
+--header 'Authorization: Bearer sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186' \
+--header 'Content-Type: application/json' \
+--data-raw '{
+    "model": "DeepSeek-R1-Distill-Llama-70B",
+    "messages": [
+        {
+            "role": "user",
+            "content": "介绍一下山西的营商环境,推荐适合什么行业经营"
+        }
+    ],
+    "stream": false,
+    "presence_penalty": 1.03,
+    "frequency_penalty": 1.0,
+    "repetition_penalty": 1.0,
+    "temperature": 0.5,
+    "top_p": 0.95,
+    "top_k": 1,
+    "seed": 1,
+    "max_tokens": 8192,
+    "n": 2,
+    "best_of": 2
+}'
+```
+
 ### 3. Start the Service
 ```bash
@@ -243,7 +296,7 @@ print(response.json())
 ## FAQ
 **Q: "AI service not configured" error?**
-A: Check that `SILICONFLOW_API_KEY` is set correctly in the `.env` file.
+A: The system supports only the Huawei LLM (default configuration is built in); make sure `HUAWEI_API_KEY` and `HUAWEI_API_ENDPOINT` are set correctly in the `.env` file. If the Huawei LLM is unavailable, check the network connection and API configuration.
 **Q: Empty parsing result?**
 A: Check that the input text contains enough information; try a more detailed input.


@@ -0,0 +1,68 @@
# Template Field Export Guide
## Overview
The `export_template_fields_to_excel.py` script exports every template and its associated input and output fields to an Excel sheet, making it easy to review and consolidate template-field relationships.
## Usage
```bash
python export_template_fields_to_excel.py
```
## Output
The script writes an Excel file to the current directory, named `template_fields_export_YYYYMMDD_HHMMSS.xlsx`.
## Sheet Columns
The generated sheet contains the following columns:
1. **Template ID** - the template's unique database ID
2. **Template name** - the template's Chinese name
3. **Template parent** - the template's category path (inferred from the file path or template name; may be incomplete and need manual correction)
4. **Input fields** - the template's input fields, formatted as `字段名称(字段编码); 字段名称(字段编码)`
5. **Output fields** - the template's output fields, same format
6. **Input field count** - number of input fields
7. **Output field count** - number of output fields
## Notes
1. **Template parent**: the script tries to infer the category from the file path or template name, but the result may be incomplete or inaccurate. You can correct it manually in Excel.
2. **Field format**: input and output fields are semicolon-separated; each entry has the form `字段名称(字段编码)`.
3. **Data source**: all data comes from the database; only enabled templates and fields (state=1) are exported.
4. **Follow-up**: with this sheet you can:
   - Manually fill in or fix template parent categories
   - Add new template-field relationships
   - Build an import script to load the edited data back into the database
## Sample Data
```
Template ID: 1765432134276990
Template name: 1.请示报告卡(初核谈话)
Template parent: 2-初核模版/2.谈话审批
Input fields: 线索信息(clue_info); 被核查人员工作基本情况线索(target_basic_info_clue)
Output fields: 被核查人姓名(target_name); 被核查人员单位及职务(target_organization_and_position); ...
Input field count: 2
Output field count: 3
```
## Import Script Suggestions
A future import script could follow these steps:
1. Read the Excel file
2. Parse the template name, parent, input fields, and output fields
3. Look up or create the template record by name
4. Look up field IDs by field code
5. Create or update the template-field associations
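Step 2 above needs to split the `字段名称(字段编码)` cells back apart. A sketch of that parser (reading the sheet itself would use openpyxl's `load_workbook`, omitted here to keep the snippet dependency-free; the function name is illustrative):

```python
import re
from typing import List, Tuple


def parse_field_cell(cell: str) -> List[Tuple[str, str]]:
    """Parse '字段名称(字段编码); ...' into (name, code) pairs."""
    pairs = []
    for part in cell.split(";"):
        part = part.strip()
        if not part:
            continue
        # Greedy name match, then the code inside the trailing parentheses
        m = re.match(r"^(.*)\(([^)]+)\)$", part)
        if m:
            pairs.append((m.group(1), m.group(2)))
    return pairs
```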
## Related Files
- `export_template_fields_to_excel.py` - the export script
- `template_fields_export_*.xlsx` - generated Excel files

Binary file not shown.


@@ -0,0 +1,582 @@
"""
分析和修复字段编码问题
1. 分析f_polic_file_field表中的重复项
2. 检查f_polic_field表中的中文field_code
3. 根据占位符与字段对照表更新field_code
4. 合并重复项并更新关联表
"""
import os
import json
import pymysql
import re
from typing import Dict, List, Optional, Tuple
from datetime import datetime
from pathlib import Path
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 从占位符与字段对照表文档中提取的字段映射
# 格式: {字段名称: field_code}
FIELD_NAME_TO_CODE_MAPPING = {
# 基本信息字段
'被核查人姓名': 'target_name',
'被核查人员单位及职务': 'target_organization_and_position',
'被核查人员单位': 'target_organization',
'被核查人员职务': 'target_position',
'被核查人员性别': 'target_gender',
'被核查人员出生年月': 'target_date_of_birth',
'被核查人员出生年月日': 'target_date_of_birth_full',
'被核查人员年龄': 'target_age',
'被核查人员文化程度': 'target_education_level',
'被核查人员政治面貌': 'target_political_status',
'被核查人员职级': 'target_professional_rank',
'被核查人员身份证号': 'target_id_number',
'被核查人员身份证件及号码': 'target_id_number',
'被核查人员住址': 'target_address',
'被核查人员户籍住址': 'target_registered_address',
'被核查人员联系方式': 'target_contact',
'被核查人员籍贯': 'target_place_of_origin',
'被核查人员民族': 'target_ethnicity',
# 问题相关字段
'线索来源': 'clue_source',
'主要问题线索': 'target_issue_description',
'被核查人问题描述': 'target_problem_description',
# 审批相关字段
'初步核实审批表承办部门意见': 'department_opinion',
'初步核实审批表填表人': 'filler_name',
'批准时间': 'approval_time',
# 核查相关字段
'核查单位名称': 'investigation_unit_name',
'核查组代号': 'investigation_team_code',
'核查组组长姓名': 'investigation_team_leader_name',
'核查组成员姓名': 'investigation_team_member_names',
'核查地点': 'investigation_location',
# 风险评估相关字段
'被核查人员家庭情况': 'target_family_situation',
'被核查人员社会关系': 'target_social_relations',
'被核查人员健康状况': 'target_health_status',
'被核查人员性格特征': 'target_personality',
'被核查人员承受能力': 'target_tolerance',
'被核查人员涉及问题严重程度': 'target_issue_severity',
'被核查人员涉及其他问题的可能性': 'target_other_issues_possibility',
'被核查人员此前被审查情况': 'target_previous_investigation',
'被核查人员社会负面事件': 'target_negative_events',
'被核查人员其他情况': 'target_other_situation',
'风险等级': 'risk_level',
# 其他字段
'线索信息': 'clue_info',
'被核查人员工作基本情况线索': 'target_basic_info_clue',
'被核查人员工作基本情况': 'target_work_basic_info',
'请示报告卡请示时间': 'report_card_request_time',
'应到时间': 'appointment_time',
'应到地点': 'appointment_location',
'承办部门': 'handling_department',
'承办人': 'handler_name',
'谈话通知时间': 'notification_time',
'谈话通知地点': 'notification_location',
'被核查人员本人认识和态度': 'target_attitude',
'纪委名称': 'commission_name',
}
def is_chinese(text: str) -> bool:
"""判断字符串是否包含中文字符"""
if not text:
return False
return bool(re.search(r'[\u4e00-\u9fff]', text))
def analyze_f_polic_field(conn) -> Dict:
"""分析f_polic_field表找出中文field_code和重复项"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("1. 分析 f_polic_field 表")
print("="*80)
# 查询所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name, filed_code
""", (TENANT_ID,))
fields = cursor.fetchall()
print(f"\n总共找到 {len(fields)} 个字段记录")
# 找出中文field_code
chinese_field_codes = []
for field in fields:
if is_chinese(field['filed_code']):
chinese_field_codes.append(field)
print(f"\n发现 {len(chinese_field_codes)} 个中文field_code:")
for field in chinese_field_codes:
print(f" - ID: {field['id']}, 名称: {field['name']}, field_code: {field['filed_code']}")
# 找出重复的字段名称
name_to_fields = {}
for field in fields:
name = field['name']
if name not in name_to_fields:
name_to_fields[name] = []
name_to_fields[name].append(field)
duplicates = {name: fields_list for name, fields_list in name_to_fields.items()
if len(fields_list) > 1}
print(f"\n发现 {len(duplicates)} 个重复的字段名称:")
for name, fields_list in duplicates.items():
print(f"\n 字段名称: {name} (共 {len(fields_list)} 条记录)")
for field in fields_list:
print(f" - ID: {field['id']}, field_code: {field['filed_code']}, "
f"field_type: {field['field_type']}, state: {field['state']}")
# 找出重复的field_code
code_to_fields = {}
for field in fields:
code = field['filed_code']
if code not in code_to_fields:
code_to_fields[code] = []
code_to_fields[code].append(field)
duplicate_codes = {code: fields_list for code, fields_list in code_to_fields.items()
if len(fields_list) > 1}
print(f"\n发现 {len(duplicate_codes)} 个重复的field_code:")
for code, fields_list in duplicate_codes.items():
print(f"\n field_code: {code} (共 {len(fields_list)} 条记录)")
for field in fields_list:
print(f" - ID: {field['id']}, 名称: {field['name']}, "
f"field_type: {field['field_type']}, state: {field['state']}")
return {
'all_fields': fields,
'chinese_field_codes': chinese_field_codes,
'duplicate_names': duplicates,
'duplicate_codes': duplicate_codes
}
def analyze_f_polic_file_field(conn) -> Dict:
"""分析f_polic_file_field表找出重复项"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("2. 分析 f_polic_file_field 表")
print("="*80)
# 查询所有关联关系
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id,
fc.name as file_name, f.name as field_name, f.filed_code
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id
LEFT JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.tenant_id = %s
ORDER BY fff.file_id, fff.filed_id
""", (TENANT_ID,))
relations = cursor.fetchall()
print(f"\n总共找到 {len(relations)} 个关联关系")
# 找出重复的关联关系相同的file_id和filed_id
relation_key_to_records = {}
for rel in relations:
key = (rel['file_id'], rel['filed_id'])
if key not in relation_key_to_records:
relation_key_to_records[key] = []
relation_key_to_records[key].append(rel)
duplicates = {key: records for key, records in relation_key_to_records.items()
if len(records) > 1}
print(f"\n发现 {len(duplicates)} 个重复的关联关系:")
for (file_id, filed_id), records in duplicates.items():
print(f"\n 文件ID: {file_id}, 字段ID: {filed_id} (共 {len(records)} 条记录)")
for record in records:
print(f" - 关联ID: {record['id']}, 文件: {record['file_name']}, "
f"字段: {record['field_name']} ({record['filed_code']})")
# 统计使用中文field_code的关联关系
chinese_relations = [rel for rel in relations if rel['filed_code'] and is_chinese(rel['filed_code'])]
print(f"\n发现 {len(chinese_relations)} 个使用中文field_code的关联关系:")
for rel in chinese_relations[:10]: # 只显示前10个
print(f" - 文件: {rel['file_name']}, 字段: {rel['field_name']}, "
f"field_code: {rel['filed_code']}")
if len(chinese_relations) > 10:
print(f" ... 还有 {len(chinese_relations) - 10}")
return {
'all_relations': relations,
'duplicate_relations': duplicates,
'chinese_relations': chinese_relations
}
def get_correct_field_code(field_name: str, current_code: str) -> Optional[str]:
"""根据字段名称获取正确的field_code"""
# 首先从映射表中查找
if field_name in FIELD_NAME_TO_CODE_MAPPING:
return FIELD_NAME_TO_CODE_MAPPING[field_name]
# 如果当前code已经是英文且符合规范保留
if current_code and not is_chinese(current_code) and re.match(r'^[a-z_]+$', current_code):
return current_code
return None
def fix_f_polic_field(conn, dry_run: bool = True) -> Dict:
"""修复f_polic_field表中的问题"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("3. 修复 f_polic_field 表")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 获取所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
""", (TENANT_ID,))
fields = cursor.fetchall()
updates = []
merges = []
# 按字段名称分组,找出需要合并的重复项
name_to_fields = {}
for field in fields:
name = field['name']
if name not in name_to_fields:
name_to_fields[name] = []
name_to_fields[name].append(field)
# 处理每个字段名称
for field_name, field_list in name_to_fields.items():
if len(field_list) == 1:
# 单个字段检查是否需要更新field_code
field = field_list[0]
correct_code = get_correct_field_code(field['name'], field['filed_code'])
if correct_code and correct_code != field['filed_code']:
updates.append({
'id': field['id'],
'name': field['name'],
'old_code': field['filed_code'],
'new_code': correct_code,
'field_type': field['field_type']
})
else:
# 多个字段,需要合并
# 找出最佳的field_code
best_field = None
best_code = None
for field in field_list:
correct_code = get_correct_field_code(field['name'], field['filed_code'])
if correct_code:
if not best_field or (field['state'] == 1 and best_field['state'] == 0):
best_field = field
best_code = correct_code
# 如果没找到最佳字段,选择第一个启用的,或者第一个
if not best_field:
enabled_fields = [f for f in field_list if f['state'] == 1]
best_field = enabled_fields[0] if enabled_fields else field_list[0]
best_code = get_correct_field_code(best_field['name'], best_field['filed_code'])
if not best_code:
# 生成一个基于名称的code
best_code = field_name.lower().replace('被核查人员', 'target_').replace('被核查人', 'target_')
best_code = re.sub(r'[^\w]', '_', best_code)
best_code = re.sub(r'_+', '_', best_code).strip('_')
# 确定要保留的字段和要删除的字段
keep_field = best_field
remove_fields = [f for f in field_list if f['id'] != keep_field['id']]
# 更新保留字段的field_code
if best_code and best_code != keep_field['filed_code']:
updates.append({
'id': keep_field['id'],
'name': keep_field['name'],
'old_code': keep_field['filed_code'],
'new_code': best_code,
'field_type': keep_field['field_type']
})
merges.append({
'keep_field_id': keep_field['id'],
'keep_field_name': keep_field['name'],
'keep_field_code': best_code or keep_field['filed_code'],
'remove_field_ids': [f['id'] for f in remove_fields],
'remove_fields': remove_fields
})
# 显示更新计划
print(f"\n需要更新 {len(updates)} 个字段的field_code:")
for update in updates:
print(f" - ID: {update['id']}, 名称: {update['name']}, "
f"{update['old_code']} -> {update['new_code']}")
print(f"\n需要合并 {len(merges)} 组重复字段:")
for merge in merges:
print(f"\n 保留字段: ID={merge['keep_field_id']}, 名称={merge['keep_field_name']}, "
f"field_code={merge['keep_field_code']}")
print(f" 删除字段: {len(merge['remove_field_ids'])}")
for remove_field in merge['remove_fields']:
print(f" - ID: {remove_field['id']}, field_code: {remove_field['filed_code']}, "
f"field_type: {remove_field['field_type']}, state: {remove_field['state']}")
# 执行更新
if not dry_run:
print("\n开始执行更新...")
# 1. 先更新field_code
for update in updates:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = %s, updated_by = %s
WHERE id = %s
""", (update['new_code'], CURRENT_TIME, UPDATED_BY, update['id']))
print(f" ✓ 更新字段 ID {update['id']}: {update['old_code']} -> {update['new_code']}")
# 2. 合并重复字段:先更新关联表,再删除重复字段
for merge in merges:
keep_id = merge['keep_field_id']
for remove_id in merge['remove_field_ids']:
# 更新f_polic_file_field表中的关联
cursor.execute("""
UPDATE f_polic_file_field
SET filed_id = %s, updated_time = %s, updated_by = %s
WHERE filed_id = %s AND tenant_id = %s
""", (keep_id, CURRENT_TIME, UPDATED_BY, remove_id, TENANT_ID))
# 删除重复的字段记录
cursor.execute("""
DELETE FROM f_polic_field
WHERE id = %s AND tenant_id = %s
""", (remove_id, TENANT_ID))
print(f" ✓ 合并字段: 保留 ID {keep_id}, 删除 {len(merge['remove_field_ids'])} 个重复字段")
conn.commit()
print("\n✓ 更新完成")
else:
print("\n[DRY RUN] 以上操作不会实际执行")
return {
'updates': updates,
'merges': merges
}
def fix_f_polic_file_field(conn, dry_run: bool = True) -> Dict:
"""修复f_polic_file_field表中的重复项"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("4. 修复 f_polic_file_field 表")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 找出重复的关联关系
cursor.execute("""
SELECT file_id, filed_id, COUNT(*) as count, GROUP_CONCAT(id) as ids
FROM f_polic_file_field
WHERE tenant_id = %s
GROUP BY file_id, filed_id
HAVING count > 1
""", (TENANT_ID,))
duplicates = cursor.fetchall()
print(f"\n发现 {len(duplicates)} 组重复的关联关系")
deletes = []
for dup in duplicates:
file_id = dup['file_id']
filed_id = dup['filed_id']
ids = [int(id_str) for id_str in dup['ids'].split(',')]
# 保留第一个,删除其他的
keep_id = ids[0]
remove_ids = ids[1:]
deletes.append({
'file_id': file_id,
'filed_id': filed_id,
'keep_id': keep_id,
'remove_ids': remove_ids
})
print(f"\n 文件ID: {file_id}, 字段ID: {filed_id}")
print(f" 保留关联ID: {keep_id}")
print(f" 删除关联ID: {', '.join(map(str, remove_ids))}")
# 执行删除
if not dry_run:
print("\n开始删除重复的关联关系...")
for delete in deletes:
for remove_id in delete['remove_ids']:
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE id = %s AND tenant_id = %s
""", (remove_id, TENANT_ID))
print(f" ✓ 删除文件ID {delete['file_id']} 和字段ID {delete['filed_id']} 的重复关联")
conn.commit()
print("\n✓ 删除完成")
else:
print("\n[DRY RUN] 以上操作不会实际执行")
return {
'deletes': deletes
}
def check_other_tables(conn):
"""检查其他可能受影响的表"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("5. 检查其他关联表")
print("="*80)
# 检查f_polic_task表
print("\n检查 f_polic_task 表...")
try:
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_task
WHERE tenant_id = %s
""", (TENANT_ID,))
task_count = cursor.fetchone()['count']
print(f" 找到 {task_count} 个任务记录")
# 检查是否有引用字段ID的列
cursor.execute("DESCRIBE f_polic_task")
columns = [col['Field'] for col in cursor.fetchall()]
print(f" 表字段: {', '.join(columns)}")
# 检查是否有引用f_polic_field的字段
field_refs = [col for col in columns if 'field' in col.lower() or 'filed' in col.lower()]
if field_refs:
print(f" 可能引用字段的列: {', '.join(field_refs)}")
except Exception as e:
print(f" 检查f_polic_task表时出错: {e}")
# 检查f_polic_file表
print("\n检查 f_polic_file 表...")
try:
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file
WHERE tenant_id = %s
""", (TENANT_ID,))
file_count = cursor.fetchone()['count']
print(f" 找到 {file_count} 个文件记录")
cursor.execute("DESCRIBE f_polic_file")
columns = [col['Field'] for col in cursor.fetchall()]
print(f" 表字段: {', '.join(columns)}")
except Exception as e:
print(f" 检查f_polic_file表时出错: {e}")
def main():
"""主函数"""
print("="*80)
print("字段编码问题分析和修复工具")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
# 1. 分析f_polic_field表
field_analysis = analyze_f_polic_field(conn)
# 2. 分析f_polic_file_field表
relation_analysis = analyze_f_polic_file_field(conn)
# 3. 检查其他表
check_other_tables(conn)
# 4. 询问是否执行修复
print("\n" + "="*80)
print("分析完成")
print("="*80)
print("\n是否执行修复?")
print("1. 先执行DRY RUN(不实际修改数据库)")
print("2. 直接执行修复(会修改数据库)")
print("3. 仅查看分析结果,不执行修复")
choice = input("\n请选择 (1/2/3,默认1): ").strip() or "1"
if choice == "1":
# DRY RUN
print("\n" + "="*80)
print("执行DRY RUN...")
print("="*80)
fix_f_polic_field(conn, dry_run=True)
fix_f_polic_file_field(conn, dry_run=True)
print("\n" + "="*80)
confirm = input("DRY RUN完成。是否执行实际修复?(y/n,默认n): ").strip().lower()
if confirm == 'y':
print("\n执行实际修复...")
fix_f_polic_field(conn, dry_run=False)
fix_f_polic_file_field(conn, dry_run=False)
print("\n✓ 修复完成!")
elif choice == "2":
# 直接执行
print("\n" + "="*80)
print("执行修复...")
print("="*80)
fix_f_polic_field(conn, dry_run=False)
fix_f_polic_file_field(conn, dry_run=False)
print("\n✓ 修复完成!")
else:
print("\n仅查看分析结果,未执行修复")
conn.close()
except Exception as e:
print(f"\n✗ 执行失败: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
main()
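上面脚本采用"先 DRY RUN 预览、确认后再实际删除"的模式。下面是该模式的一个最小示意(用内存列表代替数据库表,`dedupe` 等命名均为示例,不是脚本中的真实函数):

```python
def dedupe(rows, dry_run=True):
    """按 (file_id, filed_id) 去重:保留每组第一条,其余列入删除计划。

    rows 是形如 {'id': ..., 'file_id': ..., 'filed_id': ...} 的字典列表,
    在这里代替数据库表;dry_run=True 时只返回计划,不修改数据。
    """
    seen, planned = set(), []
    for row in rows:
        key = (row["file_id"], row["filed_id"])
        if key in seen:
            planned.append(row["id"])  # 重复项:计划删除
        else:
            seen.add(key)              # 首次出现:保留
    if not dry_run:
        keep = set(planned)
        rows[:] = [r for r in rows if r["id"] not in keep]
    return planned
```

真实脚本中,`planned` 对应打印出的"删除关联ID"列表,只有用户确认后才执行 DELETE 并 commit。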

@@ -0,0 +1,555 @@
"""
分析和更新模板树状结构
根据 template_finish 目录结构规划树状层级并更新数据库中的 parent_id 字段
"""
import os
import json
import pymysql
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
# 从 init_all_templates.py 复制的文档类型映射
DOCUMENT_TYPE_MAPPING = {
"1.请示报告卡XXX": {
"template_code": "REPORT_CARD",
"name": "1.请示报告卡XXX",
"business_type": "INVESTIGATION"
},
"2.初步核实审批表XXX": {
"template_code": "PRELIMINARY_VERIFICATION_APPROVAL",
"name": "2.初步核实审批表XXX",
"business_type": "INVESTIGATION"
},
"3.附件初核方案(XXX)": {
"template_code": "INVESTIGATION_PLAN",
"name": "3.附件初核方案(XXX)",
"business_type": "INVESTIGATION"
},
"谈话通知书第一联": {
"template_code": "NOTIFICATION_LETTER_1",
"name": "谈话通知书第一联",
"business_type": "INVESTIGATION"
},
"谈话通知书第二联": {
"template_code": "NOTIFICATION_LETTER_2",
"name": "谈话通知书第二联",
"business_type": "INVESTIGATION"
},
"谈话通知书第三联": {
"template_code": "NOTIFICATION_LETTER_3",
"name": "谈话通知书第三联",
"business_type": "INVESTIGATION"
},
"1.请示报告卡(初核谈话)": {
"template_code": "REPORT_CARD_INTERVIEW",
"name": "1.请示报告卡(初核谈话)",
"business_type": "INVESTIGATION"
},
"2谈话审批表": {
"template_code": "INTERVIEW_APPROVAL_FORM",
"name": "2谈话审批表",
"business_type": "INVESTIGATION"
},
"3.谈话前安全风险评估表": {
"template_code": "PRE_INTERVIEW_RISK_ASSESSMENT",
"name": "3.谈话前安全风险评估表",
"business_type": "INVESTIGATION"
},
"4.谈话方案": {
"template_code": "INTERVIEW_PLAN",
"name": "4.谈话方案",
"business_type": "INVESTIGATION"
},
"5.谈话后安全风险评估表": {
"template_code": "POST_INTERVIEW_RISK_ASSESSMENT",
"name": "5.谈话后安全风险评估表",
"business_type": "INVESTIGATION"
},
"1.谈话笔录": {
"template_code": "INTERVIEW_RECORD",
"name": "1.谈话笔录",
"business_type": "INVESTIGATION"
},
"2.谈话询问对象情况摸底调查30问": {
"template_code": "INVESTIGATION_30_QUESTIONS",
"name": "2.谈话询问对象情况摸底调查30问",
"business_type": "INVESTIGATION"
},
"3.被谈话人权利义务告知书": {
"template_code": "RIGHTS_OBLIGATIONS_NOTICE",
"name": "3.被谈话人权利义务告知书",
"business_type": "INVESTIGATION"
},
"4.点对点交接单": {
"template_code": "HANDOVER_FORM",
"name": "4.点对点交接单",
"business_type": "INVESTIGATION"
},
"4.点对点交接单2": {
"template_code": "HANDOVER_FORM_2",
"name": "4.点对点交接单2",
"business_type": "INVESTIGATION"
},
"5.陪送交接单(新)": {
"template_code": "ESCORT_HANDOVER_FORM",
"name": "5.陪送交接单(新)",
"business_type": "INVESTIGATION"
},
"6.1保密承诺书(谈话对象使用-非中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_NON_PARTY",
"name": "6.1保密承诺书(谈话对象使用-非中共党员用)",
"business_type": "INVESTIGATION"
},
"6.2保密承诺书(谈话对象使用-中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_PARTY",
"name": "6.2保密承诺书(谈话对象使用-中共党员用)",
"business_type": "INVESTIGATION"
},
"7.办案人员-办案安全保密承诺书": {
"template_code": "INVESTIGATOR_CONFIDENTIALITY_COMMITMENT",
"name": "7.办案人员-办案安全保密承诺书",
"business_type": "INVESTIGATION"
},
"8-1请示报告卡初核报告结论 ": {
"template_code": "REPORT_CARD_CONCLUSION",
"name": "8-1请示报告卡初核报告结论 ",
"business_type": "INVESTIGATION"
},
"8.XXX初核情况报告": {
"template_code": "INVESTIGATION_REPORT",
"name": "8.XXX初核情况报告",
"business_type": "INVESTIGATION"
}
}
def generate_id():
"""生成ID(使用时间戳+随机数的方式,模拟雪花算法)"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
def identify_document_type(file_name: str) -> Optional[Dict]:
"""根据完整文件名识别文档类型"""
base_name = Path(file_name).stem
if base_name in DOCUMENT_TYPE_MAPPING:
return DOCUMENT_TYPE_MAPPING[base_name]
return None
def scan_directory_structure(base_dir: Path) -> Dict:
"""
扫描目录结构,构建树状层级
Returns:
包含目录和文件层级结构的字典
"""
structure = {
'directories': {}, # {path: {'name': ..., 'parent': ..., 'level': ...}}
'files': {} # {file_path: {'name': ..., 'parent': ..., 'template_code': ...}}
}
def process_path(path: Path, parent_path: Optional[str] = None, level: int = 0):
"""递归处理路径"""
if path.is_file() and path.suffix == '.docx':
# 处理文件
file_name = path.stem
doc_config = identify_document_type(file_name)
structure['files'][str(path)] = {
'name': file_name,
'parent': parent_path,
'level': level,
'template_code': doc_config['template_code'] if doc_config else None,
'full_path': str(path)
}
elif path.is_dir():
# 处理目录
dir_name = path.name
structure['directories'][str(path)] = {
'name': dir_name,
'parent': parent_path,
'level': level
}
# 递归处理子目录和文件
for child in sorted(path.iterdir()):
if child.name != '__pycache__':
process_path(child, str(path), level + 1)
# 从根目录开始扫描
if TEMPLATES_DIR.exists():
for item in sorted(TEMPLATES_DIR.iterdir()):
if item.name != '__pycache__':
process_path(item, None, 0)
return structure
def get_existing_data(conn) -> Dict:
"""
获取数据库中的现有数据
Returns:
{
'by_id': {id: {...}},
'by_name': {name: {...}},
'by_template_code': {template_code: {...}}
}
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, parent_id, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
result = {
'by_id': {},
'by_name': {},
'by_template_code': {}
}
for config in configs:
config_id = config['id']
config_name = config['name']
# 尝试从 input_data 中提取 template_code
template_code = config.get('template_code')
if not template_code and config.get('input_data'):
try:
input_data = json.loads(config['input_data']) if isinstance(config['input_data'], str) else config['input_data']
if isinstance(input_data, dict):
template_code = input_data.get('template_code')
except (json.JSONDecodeError, TypeError):
pass
result['by_id'][config_id] = config
result['by_name'][config_name] = config
if template_code:
# 如果已存在相同 template_code,保留第一个
if template_code not in result['by_template_code']:
result['by_template_code'][template_code] = config
cursor.close()
return result
def analyze_structure():
"""分析目录结构和数据库数据"""
print("="*80)
print("分析模板目录结构和数据库数据")
print("="*80)
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return None, None
# 扫描目录结构
print("扫描目录结构...")
dir_structure = scan_directory_structure(TEMPLATES_DIR)
print(f" 找到 {len(dir_structure['directories'])} 个目录")
print(f" 找到 {len(dir_structure['files'])} 个文件\n")
# 获取数据库现有数据
print("获取数据库现有数据...")
existing_data = get_existing_data(conn)
print(f" 数据库中有 {len(existing_data['by_id'])} 条记录\n")
# 分析缺少 parent_id 的记录
print("分析缺少 parent_id 的记录...")
missing_parent = []
for config in existing_data['by_id'].values():
if config.get('parent_id') is None:
missing_parent.append(config)
print(f"{len(missing_parent)} 条记录缺少 parent_id\n")
conn.close()
return dir_structure, existing_data
def plan_tree_structure(dir_structure: Dict, existing_data: Dict) -> List[Dict]:
"""
规划树状结构
Returns:
更新计划列表,每个元素包含:
{
'type': 'directory' | 'file',
'name': ...,
'parent_name': ...,
'level': ...,
'action': 'create' | 'update',
'config_id': ... (如果是更新),
'template_code': ... (如果是文件)
}
"""
plan = []
# 按层级排序目录
directories = sorted(dir_structure['directories'].items(),
key=lambda x: (x[1]['level'], x[0]))
# 按层级排序文件
files = sorted(dir_structure['files'].items(),
key=lambda x: (x[1]['level'], x[0]))
# 创建目录映射用于查找父目录ID
dir_id_map = {} # {dir_path: config_id}
# 处理目录(按层级顺序)
for dir_path, dir_info in directories:
dir_name = dir_info['name']
parent_path = dir_info['parent']
level = dir_info['level']
# 查找父目录ID
parent_id = None
if parent_path:
parent_id = dir_id_map.get(parent_path)
# 检查数据库中是否已存在
existing = existing_data['by_name'].get(dir_name)
if existing:
# 更新现有记录
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': existing['id'],
'current_parent_id': existing.get('parent_id')
})
dir_id_map[dir_path] = existing['id']
else:
# 创建新记录(目录节点)
new_id = generate_id()
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'current_parent_id': None
})
dir_id_map[dir_path] = new_id
# 处理文件
for file_path, file_info in files:
file_name = file_info['name']
parent_path = file_info['parent']
level = file_info['level']
template_code = file_info['template_code']
# 查找父目录ID
parent_id = dir_id_map.get(parent_path) if parent_path else None
# 查找数据库中的记录(通过 template_code 或 name)
existing = None
if template_code:
existing = existing_data['by_template_code'].get(template_code)
if not existing:
existing = existing_data['by_name'].get(file_name)
if existing:
# 更新现有记录
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': existing['id'],
'template_code': template_code,
'current_parent_id': existing.get('parent_id')
})
else:
# 创建新记录(文件节点)
new_id = generate_id()
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'template_code': template_code,
'current_parent_id': None
})
return plan
def generate_update_sql(plan: List[Dict], output_file: str = 'update_template_tree.sql'):
"""生成更新SQL脚本"""
sql_lines = [
"-- 模板树状结构更新脚本",
f"-- 生成时间: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}",
"-- 注意:执行前请备份数据库!",
"",
"USE finyx;",
"",
"START TRANSACTION;",
""
]
# 按层级分组
by_level = {}
for item in plan:
level = item['level']
if level not in by_level:
by_level[level] = []
by_level[level].append(item)
# 按层级顺序处理(从顶层到底层)
for level in sorted(by_level.keys()):
sql_lines.append(f"-- ===== 层级 {level} =====")
sql_lines.append("")
for item in by_level[level]:
if item['action'] == 'create':
# 创建新记录
if item['type'] == 'directory':
sql_lines.append(f"-- 创建目录节点: {item['name']}")
sql_lines.append(f"INSERT INTO f_polic_file_config")
sql_lines.append(f" (id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)")
parent_id_sql = f"{item['parent_id']}" if item['parent_id'] else "NULL"
name_sql = item['name'].replace("'", "''")  # 转义名称中的单引号,避免生成非法SQL
sql_lines.append(f"VALUES ({item['config_id']}, {TENANT_ID}, {parent_id_sql}, '{name_sql}', NULL, NULL, NOW(), {CREATED_BY}, NOW(), {UPDATED_BY}, 1);")
else:
# 文件节点(需要 template_code)
sql_lines.append(f"-- 创建文件节点: {item['name']}")
input_data = json.dumps({
'template_code': item.get('template_code', ''),
'business_type': 'INVESTIGATION'
}, ensure_ascii=False).replace("'", "''")
sql_lines.append(f"INSERT INTO f_polic_file_config")
sql_lines.append(f" (id, tenant_id, parent_id, name, input_data, file_path, template_code, created_time, created_by, updated_time, updated_by, state)")
parent_id_sql = f"{item['parent_id']}" if item['parent_id'] else "NULL"
name_sql = item['name'].replace("'", "''")  # 同样转义名称中的单引号
template_code_sql = f"'{item.get('template_code', '')}'" if item.get('template_code') else "NULL"
sql_lines.append(f"VALUES ({item['config_id']}, {TENANT_ID}, {parent_id_sql}, '{name_sql}', '{input_data}', NULL, {template_code_sql}, NOW(), {CREATED_BY}, NOW(), {UPDATED_BY}, 1);")
sql_lines.append("")
else:
# 更新现有记录
current_parent = item.get('current_parent_id')
new_parent = item.get('parent_id')
if current_parent != new_parent:
sql_lines.append(f"-- 更新: {item['name']} (parent_id: {current_parent} -> {new_parent})")
parent_id_sql = f"{new_parent}" if new_parent else "NULL"
sql_lines.append(f"UPDATE f_polic_file_config")
sql_lines.append(f"SET parent_id = {parent_id_sql}, updated_time = NOW(), updated_by = {UPDATED_BY}")
sql_lines.append(f"WHERE id = {item['config_id']} AND tenant_id = {TENANT_ID};")
sql_lines.append("")
sql_lines.append("COMMIT;")
sql_lines.append("")
sql_lines.append("-- 更新完成")
# 写入文件
with open(output_file, 'w', encoding='utf-8') as f:
f.write('\n'.join(sql_lines))
print(f"✓ SQL脚本已生成: {output_file}")
return output_file
def print_analysis_report(dir_structure: Dict, existing_data: Dict, plan: List[Dict]):
"""打印分析报告"""
print("\n" + "="*80)
print("分析报告")
print("="*80)
print(f"\n目录结构:")
print(f" - 目录数量: {len(dir_structure['directories'])}")
print(f" - 文件数量: {len(dir_structure['files'])}")
print(f"\n数据库现状:")
print(f" - 总记录数: {len(existing_data['by_id'])}")
missing_parent = sum(1 for c in existing_data['by_id'].values() if c.get('parent_id') is None)
print(f" - 缺少 parent_id 的记录: {missing_parent}")
print(f"\n更新计划:")
create_count = sum(1 for p in plan if p['action'] == 'create')
update_count = sum(1 for p in plan if p['action'] == 'update')
print(f" - 需要创建: {create_count}")
print(f" - 需要更新: {update_count}")
print(f"\n层级分布:")
by_level = {}
for item in plan:
level = item['level']
by_level[level] = by_level.get(level, 0) + 1
for level in sorted(by_level.keys()):
print(f" - 层级 {level}: {by_level[level]} 个节点")
print("\n" + "="*80)
def main():
"""主函数"""
# 分析
dir_structure, existing_data = analyze_structure()
if not dir_structure or not existing_data:
return
# 规划树状结构
print("规划树状结构...")
plan = plan_tree_structure(dir_structure, existing_data)
print(f" 生成 {len(plan)} 个更新计划\n")
# 打印报告
print_analysis_report(dir_structure, existing_data, plan)
# 生成SQL脚本
print("\n生成SQL更新脚本...")
sql_file = generate_update_sql(plan)
print("\n" + "="*80)
print("分析完成!")
print("="*80)
print(f"\n请检查生成的SQL脚本: {sql_file}")
print("确认无误后,可以执行该脚本更新数据库。")
print("\n注意:执行前请备份数据库!")
if __name__ == '__main__':
main()
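`plan_tree_structure` 的正确性依赖一个顺序约定:目录按层级从浅到深处理,父目录总是先于子目录拿到 ID,这样 `dir_id_map` 查找父 ID 才不会落空。下面用路径深度排序演示这一思路(示意代码,`build_parent_map` 是示例命名,用自增整数代替雪花 ID):

```python
from pathlib import PurePosixPath

def build_parent_map(paths):
    """按路径深度排序后依次分配 ID,保证父目录先于子目录拿到 ID。

    返回 {path: {'id': ..., 'parent_id': ...}};根节点的 parent_id 为 None。
    """
    ids, next_id = {}, 1
    for p in sorted(paths, key=lambda s: len(PurePosixPath(s).parts)):
        parent = str(PurePosixPath(p).parent)
        # 父路径若已分配过 ID 就引用它,否则视为根节点
        ids[p] = {"id": next_id, "parent_id": ids.get(parent, {}).get("id")}
        next_id += 1
    return ids
```

如果不先按深度排序,处理 "a/b" 时 "a" 可能尚未入表,parent_id 就会被错误地置空。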

analyze_duplicate_fields.py Normal file
@@ -0,0 +1,148 @@
"""分析 f_polic_field 表中的重复字段"""
import pymysql
import os
from dotenv import load_dotenv
from collections import defaultdict
load_dotenv()
TENANT_ID = 615873064429507639
conn = pymysql.connect(
host=os.getenv('DB_HOST', '152.136.177.240'),
port=int(os.getenv('DB_PORT', 5012)),
user=os.getenv('DB_USER', 'finyx'),
password=os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
database=os.getenv('DB_NAME', 'finyx'),
charset='utf8mb4'
)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("=" * 80)
print("1. 分析按 name 字段的重复情况")
print("=" * 80)
# 查询所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name, id
""", (TENANT_ID,))
all_fields = cursor.fetchall()
# 按 name 分组
name_groups = defaultdict(list)
for field in all_fields:
name_groups[field['name']].append(field)
# 找出重复的 name
duplicate_names = {name: fields for name, fields in name_groups.items() if len(fields) > 1}
print(f"\n发现 {len(duplicate_names)} 个重复的字段名称:\n")
for name, fields in sorted(duplicate_names.items()):
print(f"字段名称: {name}")
for field in fields:
print(f" ID: {field['id']}, filed_code: {field['filed_code']}, field_type: {field['field_type']}, state: {field['state']}")
print()
print("\n" + "=" * 80)
print("2. 分析按 filed_code 字段的重复情况")
print("=" * 80)
# 按 filed_code 分组
code_groups = defaultdict(list)
for field in all_fields:
code_groups[field['filed_code']].append(field)
# 找出重复的 filed_code
duplicate_codes = {code: fields for code, fields in code_groups.items() if len(fields) > 1}
print(f"\n发现 {len(duplicate_codes)} 个重复的字段编码:\n")
for code, fields in sorted(duplicate_codes.items()):
print(f"字段编码: {code}")
for field in fields:
print(f" ID: {field['id']}, name: {field['name']}, field_type: {field['field_type']}, state: {field['state']}")
print()
print("\n" + "=" * 80)
print("3. 分析重复字段的关联关系(f_polic_file_field)")
print("=" * 80)
# 获取所有重复字段的ID
all_duplicate_field_ids = set()
for fields in duplicate_names.values():
for field in fields:
all_duplicate_field_ids.add(field['id'])
for fields in duplicate_codes.values():
for field in fields:
all_duplicate_field_ids.add(field['id'])
if all_duplicate_field_ids:
placeholders = ','.join(['%s'] * len(all_duplicate_field_ids))
cursor.execute(f"""
SELECT ff.file_id, ff.filed_id, f.name, f.filed_code, fc.name as file_name, fc.state as file_state
FROM f_polic_file_field ff
INNER JOIN f_polic_field f ON ff.filed_id = f.id
INNER JOIN f_polic_file_config fc ON ff.file_id = fc.id
WHERE ff.filed_id IN ({placeholders})
AND f.tenant_id = %s
ORDER BY f.filed_code, ff.file_id
""", list(all_duplicate_field_ids) + [TENANT_ID])
associations = cursor.fetchall()
# 按 filed_code 分组关联关系
code_associations = defaultdict(list)
for assoc in associations:
code_associations[assoc['filed_code']].append(assoc)
print(f"\n重复字段的关联关系:\n")
for code, assocs in sorted(code_associations.items()):
print(f"字段编码: {code} ({assocs[0]['name']})")
for assoc in assocs:
print(f" 字段ID: {assoc['filed_id']}, 文件ID: {assoc['file_id']}, 文件名: {assoc['file_name']}, 文件状态: {assoc['file_state']}")
print()
else:
print("\n没有发现重复字段的关联关系")
print("\n" + "=" * 80)
print("4. 统计每个 filed_code 关联的模板数量")
print("=" * 80)
cursor.execute("""
SELECT f.filed_code, f.name, COUNT(DISTINCT ff.file_id) as template_count,
GROUP_CONCAT(DISTINCT ff.filed_id ORDER BY ff.filed_id) as field_ids,
GROUP_CONCAT(DISTINCT fc.name ORDER BY fc.name SEPARATOR ' | ') as template_names
FROM f_polic_field f
LEFT JOIN f_polic_file_field ff ON f.id = ff.filed_id
LEFT JOIN f_polic_file_config fc ON ff.file_id = fc.id AND fc.state = 1
WHERE f.tenant_id = %s
GROUP BY f.filed_code, f.name
HAVING COUNT(DISTINCT ff.filed_id) > 0 OR f.filed_code IN (
SELECT filed_code FROM (
SELECT filed_code, COUNT(*) as cnt
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY filed_code
HAVING cnt > 1
) AS dup
)
ORDER BY template_count DESC, f.filed_code
""", (TENANT_ID, TENANT_ID))
stats = cursor.fetchall()
print(f"\n字段关联统计(包含重复字段):\n")
for stat in stats:
print(f"字段编码: {stat['filed_code']}")
print(f" 字段名称: {stat['name']}")
print(f" 关联模板数: {stat['template_count']}")
print(f" 字段ID列表: {stat['field_ids']}")
if stat['template_names']:
print(f" 关联模板: {stat['template_names']}")
print()
cursor.close()
conn.close()
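脚本中"按 name 分组"和"按 filed_code 分组"找重复的两段逻辑完全同构,可以抽成一个通用辅助函数(示意写法,`find_duplicates` 为示例命名):

```python
from collections import defaultdict

def find_duplicates(fields, key):
    """按指定 key 分组,只返回出现多于一次的分组。

    fields 是字典列表(对应查询结果行),key 如 'name' 或 'filed_code'。
    """
    groups = defaultdict(list)
    for f in fields:
        groups[f[key]].append(f)
    # 只保留真正重复的组
    return {k: v for k, v in groups.items() if len(v) > 1}
```

这样 `duplicate_names = find_duplicates(all_fields, 'name')` 与 `duplicate_codes = find_duplicates(all_fields, 'filed_code')` 就能复用同一实现。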

app.py
File diff suppressed because it is too large
backup_database.py Normal file
@@ -0,0 +1,314 @@
"""
数据库备份脚本
支持使用mysqldump命令或Python直接导出SQL文件
"""
import os
import sys
import subprocess
import pymysql
from datetime import datetime
from pathlib import Path
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
class DatabaseBackup:
"""数据库备份类"""
def __init__(self):
"""初始化数据库配置"""
self.db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# 备份文件存储目录
self.backup_dir = Path('backups')
self.backup_dir.mkdir(exist_ok=True)
def backup_with_mysqldump(self, output_file=None, compress=False):
"""
使用mysqldump命令备份数据库(推荐方式)
Args:
output_file: 输出文件路径,如果为None则自动生成
compress: 是否压缩备份文件
Returns:
备份文件路径
"""
# 生成备份文件名
if output_file is None:
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
output_file = self.backup_dir / f"backup_{self.db_config['database']}_{timestamp}.sql"
output_file = Path(output_file)
# 构建mysqldump命令
cmd = [
'mysqldump',
f"--host={self.db_config['host']}",
f"--port={self.db_config['port']}",
f"--user={self.db_config['user']}",
f"--password={self.db_config['password']}",
'--single-transaction', # 保证数据一致性
'--routines', # 包含存储过程和函数
'--triggers', # 包含触发器
'--events', # 包含事件
'--add-drop-table', # 添加DROP TABLE语句
'--default-character-set=utf8mb4', # 设置字符集
self.db_config['database']
]
try:
print(f"开始备份数据库 {self.db_config['database']}...")
print(f"备份文件: {output_file}")
# 执行备份命令
with open(output_file, 'w', encoding='utf-8') as f:
result = subprocess.run(
cmd,
stdout=f,
stderr=subprocess.PIPE,
text=True
)
if result.returncode != 0:
error_msg = result.stderr if result.stderr else '未知错误'  # text=True 时 stderr 已是 str,无需 decode
raise Exception(f"mysqldump执行失败: {error_msg}")
# 检查文件大小
file_size = output_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
# 如果需要压缩
if compress:
compressed_file = self._compress_file(output_file)
print(f"压缩完成: {compressed_file}")
return str(compressed_file)
return str(output_file)
except FileNotFoundError:
print("错误: 未找到mysqldump命令请确保MySQL客户端已安装并在PATH中")
print("尝试使用Python方式备份...")
return self.backup_with_python(output_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
def backup_with_python(self, output_file=None):
"""
使用Python直接连接数据库备份(备用方式)
Args:
output_file: 输出文件路径,如果为None则自动生成
Returns:
备份文件路径
"""
if output_file is None:
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
output_file = self.backup_dir / f"backup_{self.db_config['database']}_{timestamp}.sql"
output_file = Path(output_file)
try:
print(f"开始使用Python方式备份数据库 {self.db_config['database']}...")
print(f"备份文件: {output_file}")
# 连接数据库
connection = pymysql.connect(**self.db_config)
cursor = connection.cursor()
with open(output_file, 'w', encoding='utf-8') as f:
# 写入文件头
f.write(f"-- MySQL数据库备份\n")
f.write(f"-- 数据库: {self.db_config['database']}\n")
f.write(f"-- 备份时间: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
f.write(f"-- 主机: {self.db_config['host']}:{self.db_config['port']}\n")
f.write("--\n\n")
f.write(f"SET NAMES utf8mb4;\n")
f.write(f"SET FOREIGN_KEY_CHECKS=0;\n\n")
# 获取所有表
cursor.execute("SHOW TABLES")
tables = [table[0] for table in cursor.fetchall()]
print(f"找到 {len(tables)} 个表")
# 备份每个表
for table in tables:
print(f"备份表: {table}")
# 获取表结构
cursor.execute(f"SHOW CREATE TABLE `{table}`")
create_table_sql = cursor.fetchone()[1]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表结构: {table}\n")
f.write(f"-- ----------------------------\n")
f.write(f"DROP TABLE IF EXISTS `{table}`;\n")
f.write(f"{create_table_sql};\n\n")
# 获取表数据
cursor.execute(f"SELECT * FROM `{table}`")
rows = cursor.fetchall()
if rows:
# 获取列名
cursor.execute(f"DESCRIBE `{table}`")
columns = [col[0] for col in cursor.fetchall()]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表数据: {table}\n")
f.write(f"-- ----------------------------\n")
# 分批写入数据
batch_size = 1000
for i in range(0, len(rows), batch_size):
batch = rows[i:i+batch_size]
values_list = []
for row in batch:
values = []
for value in row:
if value is None:
values.append('NULL')
elif isinstance(value, (int, float)):
values.append(str(value))
else:
# 转义特殊字符
escaped_value = str(value).replace('\\', '\\\\').replace("'", "\\'")
values.append(f"'{escaped_value}'")
values_list.append(f"({', '.join(values)})")
columns_str = ', '.join([f"`{col}`" for col in columns])
values_str = ',\n'.join(values_list)
f.write(f"INSERT INTO `{table}` ({columns_str}) VALUES\n")
f.write(f"{values_str};\n\n")
print(f" 完成: {len(rows)} 条记录")
f.write("SET FOREIGN_KEY_CHECKS=1;\n")
cursor.close()
connection.close()
# 检查文件大小
file_size = output_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
return str(output_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
def _compress_file(self, file_path):
"""
压缩备份文件
Args:
file_path: 文件路径
Returns:
压缩后的文件路径
"""
import gzip
file_path = Path(file_path)
compressed_path = file_path.with_suffix('.sql.gz')
with open(file_path, 'rb') as f_in:
with gzip.open(compressed_path, 'wb') as f_out:
f_out.writelines(f_in)
# 删除原文件
file_path.unlink()
return compressed_path
def list_backups(self):
"""
列出所有备份文件
Returns:
备份文件列表
"""
backups = []
for file in sorted(self.backup_dir.glob('backup_*.sql*'), reverse=True):
file_info = {
'filename': file.name,
'path': str(file),
'size': file.stat().st_size,
'size_mb': file.stat().st_size / 1024 / 1024,
'modified': datetime.fromtimestamp(file.stat().st_mtime)
}
backups.append(file_info)
return backups
def main():
"""主函数"""
import argparse
parser = argparse.ArgumentParser(description='数据库备份工具')
parser.add_argument('--method', choices=['mysqldump', 'python', 'auto'],
default='auto', help='备份方法 (默认: auto)')
parser.add_argument('--output', '-o', help='输出文件路径')
parser.add_argument('--compress', '-c', action='store_true',
help='压缩备份文件')
parser.add_argument('--list', '-l', action='store_true',
help='列出所有备份文件')
args = parser.parse_args()
backup = DatabaseBackup()
# 列出备份文件
if args.list:
backups = backup.list_backups()
if backups:
print(f"\n找到 {len(backups)} 个备份文件:\n")
print(f"{'文件名':<50} {'大小(MB)':<15} {'修改时间':<20}")
print("-" * 85)
for b in backups:
print(f"{b['filename']:<50} {b['size_mb']:<15.2f} {b['modified'].strftime('%Y-%m-%d %H:%M:%S'):<20}")
else:
print("未找到备份文件")
return
# 执行备份
try:
if args.method == 'mysqldump':
backup_file = backup.backup_with_mysqldump(args.output, args.compress)
elif args.method == 'python':
backup_file = backup.backup_with_python(args.output)
else: # auto
try:
backup_file = backup.backup_with_mysqldump(args.output, args.compress)
except Exception:
print("\nmysqldump方式失败切换到Python方式...")
backup_file = backup.backup_with_python(args.output)
print(f"\n备份成功!")
print(f"备份文件: {backup_file}")
except Exception as e:
print(f"\n备份失败: {str(e)}")
sys.exit(1)
if __name__ == '__main__':
main()
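`backup_with_mysqldump` 把 `--password=...` 放在命令行参数里,密码会出现在进程列表中。MySQL 客户端支持通过 MYSQL_PWD 环境变量传递密码,下面是一个只构造命令不实际执行的示意(假设本机已安装 mysqldump,`build_dump_cmd` 为示例命名):

```python
import os

def build_dump_cmd(cfg):
    """构造 mysqldump 命令与环境变量。

    密码通过 MYSQL_PWD 传递,不出现在 argv 中;
    cfg 形如 {'host': ..., 'port': ..., 'user': ..., 'password': ..., 'database': ...}。
    """
    env = dict(os.environ, MYSQL_PWD=cfg["password"])
    cmd = [
        "mysqldump",
        f"--host={cfg['host']}",
        f"--port={cfg['port']}",
        f"--user={cfg['user']}",
        "--single-transaction",
        "--default-character-set=utf8mb4",
        cfg["database"],
    ]
    return cmd, env
```

调用时把返回值交给 `subprocess.run(cmd, env=env, ...)` 即可;其余 `--routines` 等选项可按原脚本追加。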

check_and_fix_duplicates.py Normal file
@@ -0,0 +1,117 @@
"""
检查并修复重复记录
"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 检查"1.初核请示"下的所有记录
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND parent_id = %s
ORDER BY id
""", (TENANT_ID, 1765431558933731)) # 1.初核请示
results = cursor.fetchall()
print(f"'1.初核请示'下有 {len(results)} 条记录:\n")
for r in results:
print(f"ID: {r['id']}, name: {r['name']}, file_path: {r['file_path']}")
# 检查"1请示报告卡"的记录
request_cards = [r for r in results if r['name'] == '1请示报告卡']
if len(request_cards) > 1:
print(f"\n发现 {len(request_cards)} 个重复的'1请示报告卡'记录")
# 保留file_path正确的那个
correct_one = None
for r in request_cards:
if r['file_path'] and '1.请示报告卡XXX' in r['file_path']:
correct_one = r
break
if correct_one:
# 删除其他的
for r in request_cards:
if r['id'] != correct_one['id']:
# 删除关联关系
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
""", (TENANT_ID, r['id']))
# 删除模板记录
cursor.execute("""
DELETE FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, r['id']))
print(f"[DELETE] 删除重复记录: ID {r['id']}, file_path: {r['file_path']}")
# 检查"走读式谈话审批"下是否有"1请示报告卡"
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND parent_id = %s AND name = %s
""", (TENANT_ID, 1765273962700431, '1请示报告卡')) # 走读式谈话审批
result = cursor.fetchone()
if not result:
print("\n[WARN] '走读式谈话审批'下缺少'1请示报告卡'记录")
# 创建记录
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
new_id = timestamp * 1000 + random_part
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
new_id,
TENANT_ID,
1765273962700431, # 走读式谈话审批
'1请示报告卡',
None,
'/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',
655162080928945152,
655162080928945152,
1
))
print(f"[CREATE] 在'走读式谈话审批'下创建'1请示报告卡'记录 (ID: {new_id})")
else:
# 检查file_path是否正确
if result['file_path'] and '1.请示报告卡(初核谈话)' not in result['file_path']:  # 与下方写入的路径使用同一种括号,避免每次都误判为需要修复
cursor.execute("""
UPDATE f_polic_file_config
SET file_path = %s, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s AND id = %s
""", ('/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx', UPDATED_BY, TENANT_ID, result['id']))
print(f"[UPDATE] 修复'走读式谈话审批'下'1请示报告卡'的file_path")
conn.commit()
print("\n[OK] 修复完成")
except Exception as e:
conn.rollback()
print(f"[ERROR] 修复失败: {e}")
import traceback
traceback.print_exc()
finally:
cursor.close()
conn.close()
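脚本里 try/commit、except/rollback、finally/close 的样板在多处重复,可以封装成一个上下文管理器(示意写法,`transaction` 为示例命名):

```python
from contextlib import contextmanager

@contextmanager
def transaction(conn):
    """事务封装:with 块正常退出时 commit,抛异常时 rollback 并重新抛出。"""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
```

用法如 `with transaction(conn): cursor.execute(...)`,出错时数据库会自动回滚,调用方仍能看到原始异常。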

@@ -0,0 +1,551 @@
"""
检查并修复 f_polic_file_field 表的关联关系
1. 检查无效的关联(关联到不存在的 file_id 或 filed_id)
2. 检查重复的关联关系
3. 检查关联到已删除或未启用的字段/文件
4. 根据其他表的数据更新关联关系
"""
import pymysql
import os
from typing import Dict, List, Tuple
from collections import defaultdict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_invalid_relations(conn) -> Dict:
"""检查无效的关联关系(关联到不存在的 file_id 或 filed_id)"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("1. 检查无效的关联关系")
print("="*80)
# 检查关联到不存在的 file_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.id IS NULL
""", (TENANT_ID,))
invalid_file_relations = cursor.fetchall()
# 检查关联到不存在的 filed_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.id IS NULL
""", (TENANT_ID,))
invalid_field_relations = cursor.fetchall()
print(f"\n关联到不存在的 file_id: {len(invalid_file_relations)}")
if invalid_file_relations:
print(" 详情:")
for rel in invalid_file_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_file_relations) > 10:
print(f" ... 还有 {len(invalid_file_relations) - 10}")
print(f"\n关联到不存在的 filed_id: {len(invalid_field_relations)}")
if invalid_field_relations:
print(" 详情:")
for rel in invalid_field_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_field_relations) > 10:
print(f" ... 还有 {len(invalid_field_relations) - 10}")
return {
'invalid_file_relations': invalid_file_relations,
'invalid_field_relations': invalid_field_relations
}
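The `LEFT JOIN ... IS NULL` queries above are anti-joins: a relation is "invalid" when its `file_id` (or `filed_id`) has no match in the referenced table. A database-free sketch of the same logic, with made-up sample rows:

```python
# Anti-join in pure Python: keep relations whose file_id has no match
# in the set of existing file ids (sample data is hypothetical).
def find_invalid_relations(relations, existing_file_ids):
    """Mirror of LEFT JOIN f_polic_file_config ... WHERE fc.id IS NULL."""
    return [r for r in relations if r['file_id'] not in existing_file_ids]

relations = [
    {'id': 1, 'file_id': 10, 'filed_id': 100},
    {'id': 2, 'file_id': 99, 'filed_id': 101},  # file_id 99 does not exist
]
invalid = find_invalid_relations(relations, existing_file_ids={10, 20})
print(invalid)  # [{'id': 2, 'file_id': 99, 'filed_id': 101}]
```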
def check_duplicate_relations(conn) -> Dict:
"""检查重复的关联关系(相同的 file_id 和 filed_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("2. 检查重复的关联关系")
print("="*80)
# 查找重复的关联关系
cursor.execute("""
SELECT file_id, filed_id, COUNT(*) as count, GROUP_CONCAT(id ORDER BY id) as ids
FROM f_polic_file_field
WHERE tenant_id = %s
GROUP BY file_id, filed_id
HAVING COUNT(*) > 1
ORDER BY count DESC
""", (TENANT_ID,))
duplicates = cursor.fetchall()
print(f"\n发现 {len(duplicates)} 个重复的关联关系:")
duplicate_details = []
for dup in duplicates:
ids = [int(id_str) for id_str in dup['ids'].split(',')]
duplicate_details.append({
'file_id': dup['file_id'],
'filed_id': dup['filed_id'],
'count': dup['count'],
'ids': ids
})
print(f"\n 文件ID: {dup['file_id']}, 字段ID: {dup['filed_id']} (共 {dup['count']} 条)")
print(f" 关联ID列表: {ids}")
return {
'duplicates': duplicate_details
}
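The fix for duplicates (below) keeps the row with the smallest id in each `(file_id, filed_id)` group and deletes the rest, which is what `GROUP_CONCAT(id ORDER BY id)` plus `ids[1:]` amounts to. A standalone sketch of that rule, on hypothetical rows:

```python
# Keep-first deduplication: group rows by (file_id, filed_id), keep the
# smallest id per group, collect the rest for deletion (sample data only).
from collections import defaultdict

def ids_to_delete(rows):
    groups = defaultdict(list)
    for r in rows:
        groups[(r['file_id'], r['filed_id'])].append(r['id'])
    doomed = []
    for ids in groups.values():
        ids.sort()
        doomed.extend(ids[1:])  # everything after the smallest id
    return sorted(doomed)

rows = [
    {'id': 3, 'file_id': 1, 'filed_id': 7},
    {'id': 1, 'file_id': 1, 'filed_id': 7},
    {'id': 2, 'file_id': 1, 'filed_id': 7},
    {'id': 9, 'file_id': 2, 'filed_id': 8},  # unique, untouched
]
print(ids_to_delete(rows))  # [2, 3]
```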
def check_disabled_relations(conn) -> Dict:
"""检查关联到已删除或未启用的字段/文件"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("3. 检查关联到已删除或未启用的字段/文件")
print("="*80)
# 检查关联到未启用的文件
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fc.name as file_name, fc.state as file_state
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.state = 0
""", (TENANT_ID,))
disabled_file_relations = cursor.fetchall()
# 检查关联到未启用的字段
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, f.name as field_name, f.filed_code, f.state as field_state
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.state = 0
""", (TENANT_ID,))
disabled_field_relations = cursor.fetchall()
print(f"\n关联到未启用的文件: {len(disabled_file_relations)}")
if disabled_file_relations:
print(" 详情:")
for rel in disabled_file_relations[:10]:
print(f" - 关联ID: {rel['id']}, 文件: {rel['file_name']} (ID: {rel['file_id']})")
if len(disabled_file_relations) > 10:
print(f" ... 还有 {len(disabled_file_relations) - 10}")
print(f"\n关联到未启用的字段: {len(disabled_field_relations)}")
if disabled_field_relations:
print(" 详情:")
for rel in disabled_field_relations[:10]:
print(f" - 关联ID: {rel['id']}, 字段: {rel['field_name']} ({rel['filed_code']}, ID: {rel['filed_id']})")
if len(disabled_field_relations) > 10:
print(f" ... 还有 {len(disabled_field_relations) - 10}")
return {
'disabled_file_relations': disabled_file_relations,
'disabled_field_relations': disabled_field_relations
}
def check_missing_relations(conn) -> Dict:
"""检查应该存在但缺失的关联关系(文件节点应该有输出字段关联)"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("4. 检查缺失的关联关系")
print("="*80)
# 获取所有有 template_code 的文件节点(这些应该是文件,不是目录)
cursor.execute("""
SELECT fc.id, fc.name, fc.template_code
FROM f_polic_file_config fc
WHERE fc.tenant_id = %s AND fc.template_code IS NOT NULL AND fc.state = 1
""", (TENANT_ID,))
file_configs = cursor.fetchall()
# 获取所有启用的输出字段
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 2 AND state = 1
""", (TENANT_ID,))
output_fields = cursor.fetchall()
# 获取现有的关联关系
cursor.execute("""
SELECT file_id, filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
existing_relations = {(rel['file_id'], rel['filed_id']) for rel in cursor.fetchall()}
print(f"\n文件节点总数: {len(file_configs)}")
print(f"输出字段总数: {len(output_fields)}")
print(f"现有关联关系总数: {len(existing_relations)}")
# 这里不自动创建缺失的关联,因为不是所有文件都需要所有字段
# 只显示统计信息
print("\n注意: 缺失的关联关系需要根据业务逻辑手动创建")
return {
'file_configs': file_configs,
'output_fields': output_fields,
'existing_relations': existing_relations
}
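Finding the relations that could exist but do not is a set difference between the expected `(file_id, filed_id)` pairs and the existing ones. A minimal sketch with hypothetical ids (as the script notes, not every file needs every field, so this only enumerates candidates):

```python
# Candidate missing relations = all (file, field) pairs minus existing pairs.
# Ids are made up for illustration.
file_ids = {1, 2}
field_ids = {10, 11}
existing = {(1, 10), (2, 11)}

expected = {(f, fld) for f in file_ids for fld in field_ids}
missing = sorted(expected - existing)
print(missing)  # [(1, 11), (2, 10)]
```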
def check_field_type_consistency(conn) -> Dict:
"""检查关联关系的字段类型一致性f_polic_file_field 应该只关联输出字段)"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("5. 检查字段类型一致性")
print("="*80)
# 检查是否关联了输入字段(field_type=1)
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id,
fc.name as file_name, fc.template_code, f.name as field_name, f.filed_code, f.field_type
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.field_type = 1
ORDER BY fc.name, f.name
""", (TENANT_ID,))
input_field_relations = cursor.fetchall()
print(f"\n关联到输入字段 (field_type=1) 的记录: {len(input_field_relations)}")
if input_field_relations:
print(" 注意: f_polic_file_field 表通常只应该关联输出字段 (field_type=2)")
print(" 根据业务逻辑,输入字段不需要通过此表关联")
print(" 详情:")
for rel in input_field_relations:
print(f" - 关联ID: {rel['id']}, 文件: {rel['file_name']} (code: {rel['template_code']}), "
f"字段: {rel['field_name']} ({rel['filed_code']}, type={rel['field_type']})")
else:
print(" ✓ 所有关联都是输出字段")
return {
'input_field_relations': input_field_relations
}
def fix_invalid_relations(conn, dry_run: bool = True) -> Dict:
"""修复无效的关联关系"""
cursor = conn.cursor()
print("\n" + "="*80)
print("修复无效的关联关系")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 获取无效的关联(只查询一次,避免重复执行检查)
invalid_result = check_invalid_relations(conn)
invalid_file_relations = invalid_result['invalid_file_relations']
invalid_field_relations = invalid_result['invalid_field_relations']
all_invalid_ids = set()
for rel in invalid_file_relations:
all_invalid_ids.add(rel['id'])
for rel in invalid_field_relations:
all_invalid_ids.add(rel['id'])
if not all_invalid_ids:
print("\n✓ 没有无效的关联关系需要删除")
return {'deleted': 0}
print(f"\n准备删除 {len(all_invalid_ids)} 条无效的关联关系")
if not dry_run:
placeholders = ','.join(['%s'] * len(all_invalid_ids))
cursor.execute(f"""
DELETE FROM f_polic_file_field
WHERE id IN ({placeholders})
""", list(all_invalid_ids))
conn.commit()
print(f"✓ 已删除 {cursor.rowcount} 条无效的关联关系")
else:
print(f"[DRY RUN] 将删除以下关联ID: {sorted(all_invalid_ids)}")
return {'deleted': len(all_invalid_ids) if not dry_run else 0}
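The delete statements above build one `%s` placeholder per id so the ids stay parameterized instead of being string-formatted into the SQL. That step can be sketched on its own:

```python
# Build a parameterized "DELETE ... WHERE id IN (...)" statement:
# one %s per id, with the ids passed separately as parameters.
def build_delete(ids):
    placeholders = ','.join(['%s'] * len(ids))
    sql = f"DELETE FROM f_polic_file_field WHERE id IN ({placeholders})"
    return sql, list(ids)

sql, params = build_delete([4, 8, 15])
print(sql)     # DELETE FROM f_polic_file_field WHERE id IN (%s,%s,%s)
print(params)  # [4, 8, 15]
```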
def fix_input_field_relations(conn, dry_run: bool = True) -> Dict:
"""删除关联到输入字段的记录f_polic_file_field 应该只关联输出字段)"""
cursor = conn.cursor()
print("\n" + "="*80)
print("删除关联到输入字段的记录")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 获取关联到输入字段的记录
input_field_relations = check_field_type_consistency(conn)['input_field_relations']
if not input_field_relations:
print("\n✓ 没有关联到输入字段的记录需要删除")
return {'deleted': 0}
ids_to_delete = [rel['id'] for rel in input_field_relations]
print(f"\n准备删除 {len(ids_to_delete)} 条关联到输入字段的记录")
if not dry_run:
placeholders = ','.join(['%s'] * len(ids_to_delete))
cursor.execute(f"""
DELETE FROM f_polic_file_field
WHERE id IN ({placeholders})
""", ids_to_delete)
conn.commit()
print(f"✓ 已删除 {cursor.rowcount} 条关联到输入字段的记录")
else:
print(f"[DRY RUN] 将删除以下关联ID: {sorted(ids_to_delete)}")
return {'deleted': len(ids_to_delete) if not dry_run else 0}
def fix_duplicate_relations(conn, dry_run: bool = True) -> Dict:
"""修复重复的关联关系(保留第一条,删除其他)"""
cursor = conn.cursor()
print("\n" + "="*80)
print("修复重复的关联关系")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
duplicates = check_duplicate_relations(conn)['duplicates']
if not duplicates:
print("\n✓ 没有重复的关联关系需要修复")
return {'deleted': 0}
ids_to_delete = []
for dup in duplicates:
# 保留第一条(ID最小的),删除其他的
ids_to_delete.extend(dup['ids'][1:])
print(f"\n准备删除 {len(ids_to_delete)} 条重复的关联关系")
if not dry_run:
placeholders = ','.join(['%s'] * len(ids_to_delete))
cursor.execute(f"""
DELETE FROM f_polic_file_field
WHERE id IN ({placeholders})
""", ids_to_delete)
conn.commit()
print(f"✓ 已删除 {cursor.rowcount} 条重复的关联关系")
else:
print(f"[DRY RUN] 将删除以下关联ID: {sorted(ids_to_delete)}")
return {'deleted': len(ids_to_delete) if not dry_run else 0}
def get_statistics(conn) -> Dict:
"""获取统计信息"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("统计信息")
print("="*80)
# 总关联数
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
total_relations = cursor.fetchone()['total']
# 有效的关联数(关联到存在的、启用的文件和字段)
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id AND fc.state = 1
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id AND f.state = 1
WHERE fff.tenant_id = %s
""", (TENANT_ID,))
valid_relations = cursor.fetchone()['total']
# 关联的文件数
cursor.execute("""
SELECT COUNT(DISTINCT file_id) as total
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
related_files = cursor.fetchone()['total']
# 关联的字段数
cursor.execute("""
SELECT COUNT(DISTINCT filed_id) as total
FROM f_polic_file_field
WHERE tenant_id = %s
""", (TENANT_ID,))
related_fields = cursor.fetchone()['total']
print(f"\n总关联数: {total_relations}")
print(f"有效关联数: {valid_relations}")
print(f"关联的文件数: {related_files}")
print(f"关联的字段数: {related_fields}")
return {
'total_relations': total_relations,
'valid_relations': valid_relations,
'related_files': related_files,
'related_fields': related_fields
}
def main():
"""主函数"""
print("="*80)
print("检查并修复 f_polic_file_field 表的关联关系")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 1. 检查无效的关联关系
invalid_result = check_invalid_relations(conn)
# 2. 检查重复的关联关系
duplicate_result = check_duplicate_relations(conn)
# 3. 检查关联到已删除或未启用的字段/文件
disabled_result = check_disabled_relations(conn)
# 4. 检查缺失的关联关系
missing_result = check_missing_relations(conn)
# 5. 检查字段类型一致性
type_result = check_field_type_consistency(conn)
# 6. 获取统计信息
stats = get_statistics(conn)
# 总结
print("\n" + "="*80)
print("检查总结")
print("="*80)
has_issues = (
len(invalid_result['invalid_file_relations']) > 0 or
len(invalid_result['invalid_field_relations']) > 0 or
len(duplicate_result['duplicates']) > 0 or
len(type_result['input_field_relations']) > 0
)
if has_issues:
print("\n⚠ 发现以下问题:")
print(f" - 无效的 file_id 关联: {len(invalid_result['invalid_file_relations'])}")
print(f" - 无效的 filed_id 关联: {len(invalid_result['invalid_field_relations'])}")
print(f" - 重复的关联关系: {len(duplicate_result['duplicates'])}")
print(f" - 关联到未启用的文件: {len(disabled_result['disabled_file_relations'])}")
print(f" - 关联到未启用的字段: {len(disabled_result['disabled_field_relations'])}")
print(f" - 关联到输入字段: {len(type_result['input_field_relations'])}")
print("\n是否要修复这些问题?")
print("运行以下命令进行修复:")
print(" python check_and_fix_file_field_relations.py --fix")
else:
print("\n✓ 未发现需要修复的问题")
print("\n" + "="*80)
except Exception as e:
print(f"\n✗ 检查过程中发生错误: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\n数据库连接已关闭")
def fix_main():
"""修复主函数"""
print("="*80)
print("修复 f_polic_file_field 表的关联关系")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 先检查当前状态(各修复函数内部会再次查询)
print("\n[第一步] 检查当前状态...")
check_invalid_relations(conn)
check_duplicate_relations(conn)
# 修复无效的关联关系
print("\n[第二步] 修复无效的关联关系...")
fix_invalid_relations(conn, dry_run=False)
# 修复重复的关联关系
print("\n[第三步] 修复重复的关联关系...")
fix_duplicate_relations(conn, dry_run=False)
# 删除关联到输入字段的记录
print("\n[第四步] 删除关联到输入字段的记录...")
fix_input_field_relations(conn, dry_run=False)
# 重新获取统计信息
print("\n[第五步] 修复后的统计信息...")
stats = get_statistics(conn)
print("\n" + "="*80)
print("修复完成")
print("="*80)
except Exception as e:
print(f"\n✗ 修复过程中发生错误: {e}")
import traceback
traceback.print_exc()
conn.rollback()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
import sys
if '--fix' in sys.argv:
# 确认操作
print("\n⚠ 警告: 这将修改数据库!")
response = input("确认要继续吗? (yes/no): ")
if response.lower() == 'yes':
fix_main()
else:
print("操作已取消")
else:
main()
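The entry point above uses a dry-run-by-default pattern: the script only mutates the database when `--fix` is passed and the operator types `yes`. A minimal sketch of that gate, with `input()` replaced by an injected callable so it can run non-interactively:

```python
# Dry-run-by-default CLI gate: check unless --fix is passed and confirmed.
# `confirm` stands in for input() so the logic is testable.
def decide(argv, confirm):
    if '--fix' not in argv:
        return 'check'
    return 'fix' if confirm().lower() == 'yes' else 'cancelled'

print(decide(['prog'], lambda: 'yes'))           # check
print(decide(['prog', '--fix'], lambda: 'yes'))  # fix
print(decide(['prog', '--fix'], lambda: 'no'))   # cancelled
```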


@ -0,0 +1,36 @@
"""查询保密承诺书相关的模板记录"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name LIKE %s
ORDER BY name
""", (TENANT_ID, '%保密承诺书%'))
results = cursor.fetchall()
print(f"找到 {len(results)} 条记录:\n")
for r in results:
print(f"ID: {r['id']}")
print(f"名称: {r['name']}")
print(f"文件路径: {r['file_path']}")
print(f"父节点ID: {r['parent_id']}")
print()
cursor.close()
conn.close()


@ -0,0 +1,539 @@
"""
检查数据库中的ID关系是否正确
功能:
1. 检查f_polic_file_config表中的数据
2. 检查f_polic_field表中的数据
3. 检查f_polic_file_field表中的关联关系
4. 验证ID关系是否正确匹配
5. 找出孤立数据和错误关联
使用方法:
python check_database_id_relations.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
import os
import sys
import pymysql
import argparse
from typing import Dict, List, Set, Optional
from collections import defaultdict
# 设置输出编码为UTF-8(Windows兼容)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def get_db_config_from_args() -> Dict:
"""从命令行参数获取数据库配置"""
parser = argparse.ArgumentParser(
description='检查数据库中的ID关系是否正确',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
示例:
python check_database_id_relations.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
)
parser.add_argument('--host', type=str, required=True, help='MySQL服务器地址')
parser.add_argument('--port', type=int, required=True, help='MySQL服务器端口')
parser.add_argument('--user', type=str, required=True, help='MySQL用户名')
parser.add_argument('--password', type=str, required=True, help='MySQL密码')
parser.add_argument('--database', type=str, required=True, help='数据库名称')
parser.add_argument('--tenant-id', type=int, required=True, help='租户ID')
parser.add_argument('--file-id', type=int, help='检查特定的文件ID')
args = parser.parse_args()
return {
'host': args.host,
'port': args.port,
'user': args.user,
'password': args.password,
'database': args.database,
'charset': 'utf8mb4',
'tenant_id': args.tenant_id,
'file_id': args.file_id
}
def test_db_connection(config: Dict) -> Optional[pymysql.Connection]:
"""测试数据库连接"""
try:
conn = pymysql.connect(
host=config['host'],
port=config['port'],
user=config['user'],
password=config['password'],
database=config['database'],
charset=config['charset']
)
return conn
except Exception as e:
print_result(False, f"数据库连接失败: {str(e)}")
return None
def check_file_config(conn, tenant_id: int, file_id: Optional[int] = None):
"""检查f_polic_file_config表"""
print_section("检查 f_polic_file_config 表")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
if file_id:
# 检查特定文件ID
cursor.execute("""
SELECT id, tenant_id, parent_id, name, file_path, state
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s
""", (file_id, tenant_id))
result = cursor.fetchone()
if result:
print(f"\n 文件ID {file_id} 的信息:")
print(f" - ID: {result['id']}")
print(f" - 租户ID: {result['tenant_id']}")
print(f" - 父级ID: {result['parent_id']}")
print(f" - 名称: {result['name']}")
print(f" - 文件路径: {result['file_path']}")
# 处理state字段(可能是bytes或int)
state_raw = result['state']
if isinstance(state_raw, bytes):
state_value = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
state_value = int(state_raw)
else:
state_value = 0
print(f" - 状态: {state_value} ({'启用' if state_value == 1 else '禁用'})")
if state_value != 1:
print_result(False, f"文件ID {file_id} 的状态为禁用(state={state_value})")
else:
print_result(True, f"文件ID {file_id} 存在且已启用")
else:
print_result(False, f"文件ID {file_id} 不存在或不属于租户 {tenant_id}")
return
# 统计信息
cursor.execute("""
SELECT
COUNT(*) as total,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled,
SUM(CASE WHEN state = 0 THEN 1 ELSE 0 END) as disabled,
SUM(CASE WHEN file_path IS NOT NULL AND file_path != '' THEN 1 ELSE 0 END) as files,
SUM(CASE WHEN file_path IS NULL OR file_path = '' THEN 1 ELSE 0 END) as directories
FROM f_polic_file_config
WHERE tenant_id = %s
""", (tenant_id,))
stats = cursor.fetchone()
print(f"\n 统计信息:")
print(f" - 总记录数: {stats['total']}")
print(f" - 启用记录: {stats['enabled']}")
print(f" - 禁用记录: {stats['disabled']}")
print(f" - 文件节点: {stats['files']}")
print(f" - 目录节点: {stats['directories']}")
# 检查parent_id引用
cursor.execute("""
SELECT fc1.id, fc1.name, fc1.parent_id
FROM f_polic_file_config fc1
LEFT JOIN f_polic_file_config fc2 ON fc1.parent_id = fc2.id AND fc1.tenant_id = fc2.tenant_id
WHERE fc1.tenant_id = %s
AND fc1.parent_id IS NOT NULL
AND fc2.id IS NULL
""", (tenant_id,))
broken_parents = cursor.fetchall()
if broken_parents:
print(f"\n [警告] 发现 {len(broken_parents)} 个parent_id引用错误:")
for item in broken_parents[:10]:
print(f" - ID: {item['id']}, 名称: {item['name']}, parent_id: {item['parent_id']} (不存在)")
if len(broken_parents) > 10:
print(f" ... 还有 {len(broken_parents) - 10}")
else:
print_result(True, "所有parent_id引用正确")
finally:
cursor.close()
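The bytes-or-int decoding of the `state` column (MySQL BIT values can arrive as `bytes` through pymysql) is repeated in several functions below. A shared helper like this sketch would centralize it; the name `normalize_state` is hypothetical, not part of the script:

```python
# Hypothetical helper: coerce a state value that may be bytes, int, or None
# to a plain int, matching the inline logic repeated in the script.
def normalize_state(raw):
    if isinstance(raw, bytes):
        return int.from_bytes(raw, byteorder='big')
    return int(raw) if raw is not None else 0

print(normalize_state(b'\x01'))  # 1
print(normalize_state(None))     # 0
print(normalize_state(1))        # 1
```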
def check_fields(conn, tenant_id: int):
"""检查f_polic_field表"""
print_section("检查 f_polic_field 表")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 统计信息
cursor.execute("""
SELECT
field_type,
COUNT(*) as total,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled,
SUM(CASE WHEN state = 0 THEN 1 ELSE 0 END) as disabled
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY field_type
""", (tenant_id,))
stats = cursor.fetchall()
print(f"\n 统计信息:")
for stat in stats:
field_type_name = "输入字段" if stat['field_type'] == 1 else "输出字段" if stat['field_type'] == 2 else "未知"
print(f" - {field_type_name} (field_type={stat['field_type']}):")
print(f" 总记录数: {stat['total']}")
print(f" 启用: {stat['enabled']}")
print(f" 禁用: {stat['disabled']}")
# 检查重复的filed_code
cursor.execute("""
SELECT filed_code, field_type, COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
AND state = 1
GROUP BY filed_code, field_type
HAVING count > 1
""", (tenant_id,))
duplicates = cursor.fetchall()
if duplicates:
print(f"\n [警告] 发现重复的filed_code:")
for dup in duplicates:
print(f" - filed_code: {dup['filed_code']}, field_type: {dup['field_type']}, 重复数: {dup['count']}")
else:
print_result(True, "没有重复的filed_code")
finally:
cursor.close()
def check_file_field_relations(conn, tenant_id: int, file_id: Optional[int] = None):
"""检查f_polic_file_field表"""
print_section("检查 f_polic_file_field 表(关联关系)")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 统计信息
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field
WHERE tenant_id = %s AND state = 1
""", (tenant_id,))
total_relations = cursor.fetchone()['total']
print(f"\n 总关联关系数: {total_relations}")
if file_id:
# 检查特定文件ID的关联关系
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.state,
fc.name as file_name, fc.file_path, fc.state as file_state,
f.name as field_name, f.filed_code, f.field_type, f.state as field_state
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s
""", (tenant_id, file_id))
relations = cursor.fetchall()
if relations:
print(f"\n 文件ID {file_id} 的关联关系 ({len(relations)} 条):")
for rel in relations:
print(f"\n 关联ID: {rel['id']}")
print(f" - file_id: {rel['file_id']}")
if rel['file_name']:
print(f" 模板: {rel['file_name']} (路径: {rel['file_path']})")
# 处理state字段(可能是bytes或int)
state_raw = rel['file_state']
if isinstance(state_raw, bytes):
file_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
file_state = int(state_raw)
else:
file_state = 0
print(f" 状态: {file_state} ({'启用' if file_state == 1 else '禁用'})")
else:
print(f" [错误] 模板不存在!")
print(f" - filed_id: {rel['filed_id']}")
if rel['field_name']:
field_type_name = "输入字段" if rel['field_type'] == 1 else "输出字段" if rel['field_type'] == 2 else "未知"
# 处理state字段(可能是bytes或int)
state_raw = rel['field_state']
if isinstance(state_raw, bytes):
field_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
field_state = int(state_raw)
else:
field_state = 0
print(f" 字段: {rel['field_name']} ({rel['filed_code']}, {field_type_name})")
print(f" 状态: {field_state} ({'启用' if field_state == 1 else '禁用'})")
else:
print(f" [错误] 字段不存在!")
else:
print(f"\n 文件ID {file_id} 没有关联关系")
# 检查孤立的关联关系(file_id不存在)
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND fc.id IS NULL
""", (tenant_id,))
orphaned_file_relations = cursor.fetchall()
if orphaned_file_relations:
print(f"\n [错误] 发现 {len(orphaned_file_relations)} 个孤立的关联关系(file_id不存在):")
for rel in orphaned_file_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(orphaned_file_relations) > 10:
print(f" ... 还有 {len(orphaned_file_relations) - 10}")
else:
print_result(True, "所有file_id引用正确")
# 检查孤立的关联关系(filed_id不存在)
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.id IS NULL
""", (tenant_id,))
orphaned_field_relations = cursor.fetchall()
if orphaned_field_relations:
print(f"\n [错误] 发现 {len(orphaned_field_relations)} 个孤立的关联关系(filed_id不存在):")
for rel in orphaned_field_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(orphaned_field_relations) > 10:
print(f" ... 还有 {len(orphaned_field_relations) - 10}")
else:
print_result(True, "所有filed_id引用正确")
# 检查关联到禁用模板或字段的关联关系
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id,
fc.state as file_state, f.state as field_state
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND (fc.state != 1 OR f.state != 1)
""", (tenant_id,))
disabled_relations = cursor.fetchall()
if disabled_relations:
print(f"\n [警告] 发现 {len(disabled_relations)} 个关联到禁用模板或字段的关联关系:")
for rel in disabled_relations[:10]:
print(f" - 关联ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
print(f" 模板状态: {rel['file_state']}, 字段状态: {rel['field_state']}")
if len(disabled_relations) > 10:
print(f" ... 还有 {len(disabled_relations) - 10}")
else:
print_result(True, "所有关联关系都关联到启用的模板和字段")
finally:
cursor.close()
def check_specific_file(conn, tenant_id: int, file_id: int):
"""检查特定文件ID的完整信息"""
print_section(f"详细检查文件ID {file_id}")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. 检查文件配置
cursor.execute("""
SELECT id, tenant_id, parent_id, name, file_path, state, created_time, updated_time
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s
""", (file_id, tenant_id))
file_config = cursor.fetchone()
if not file_config:
print_result(False, f"文件ID {file_id} 不存在或不属于租户 {tenant_id}")
return
print(f"\n 文件配置信息:")
print(f" - ID: {file_config['id']}")
print(f" - 租户ID: {file_config['tenant_id']}")
print(f" - 父级ID: {file_config['parent_id']}")
print(f" - 名称: {file_config['name']}")
print(f" - 文件路径: {file_config['file_path']}")
# 处理state字段(可能是bytes或int)
state_raw = file_config['state']
if isinstance(state_raw, bytes):
file_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
file_state = int(state_raw)
else:
file_state = 0
print(f" - 状态: {file_state} ({'启用' if file_state == 1 else '禁用'})")
print(f" - 创建时间: {file_config['created_time']}")
print(f" - 更新时间: {file_config['updated_time']}")
# 2. 检查父级
if file_config['parent_id']:
cursor.execute("""
SELECT id, name, file_path, state
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s
""", (file_config['parent_id'], tenant_id))
parent = cursor.fetchone()
if parent:
# 处理state字段(可能是bytes或int)
state_raw = parent['state']
if isinstance(state_raw, bytes):
parent_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
parent_state = int(state_raw)
else:
parent_state = 0
print(f"\n 父级信息:")
print(f" - ID: {parent['id']}")
print(f" - 名称: {parent['name']}")
print(f" - 状态: {parent_state} ({'启用' if parent_state == 1 else '禁用'})")
else:
print(f"\n [错误] 父级ID {file_config['parent_id']} 不存在!")
# 3. 检查关联的字段
cursor.execute("""
SELECT fff.id as relation_id, fff.filed_id,
f.name as field_name, f.filed_code, f.field_type, f.state as field_state
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s AND fff.state = 1
ORDER BY f.field_type, f.filed_code
""", (tenant_id, file_id))
relations = cursor.fetchall()
print(f"\n 关联的字段 ({len(relations)} 个):")
input_fields = []
output_fields = []
for rel in relations:
field_type_name = "输入字段" if rel['field_type'] == 1 else "输出字段" if rel['field_type'] == 2 else "未知"
# 处理state字段(可能是bytes或int)
state_raw = rel['field_state']
if isinstance(state_raw, bytes):
field_state = int.from_bytes(state_raw, byteorder='big')
elif state_raw is not None:
field_state = int(state_raw)
else:
field_state = 0
field_info = f" - {rel['field_name']} ({rel['filed_code']}, {field_type_name})"
if field_state != 1:
field_info += f" [状态: 禁用]"
if not rel['field_name']:
field_info += f" [错误: 字段不存在!]"
if rel['field_type'] == 1:
input_fields.append(field_info)
else:
output_fields.append(field_info)
if input_fields:
print(f"\n 输入字段 ({len(input_fields)} 个):")
for info in input_fields:
print(info)
if output_fields:
print(f"\n 输出字段 ({len(output_fields)} 个):")
for info in output_fields:
print(info)
# 4. 检查是否有孤立的关联关系
cursor.execute("""
SELECT fff.id, fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s AND fff.state = 1 AND f.id IS NULL
""", (tenant_id, file_id))
orphaned = cursor.fetchall()
if orphaned:
print(f"\n [错误] 发现 {len(orphaned)} 个孤立的关联关系(字段不存在):")
for rel in orphaned:
print(f" - 关联ID: {rel['id']}, filed_id: {rel['filed_id']}")
finally:
cursor.close()
def main():
"""主函数"""
print_section("数据库ID关系检查工具")
# 获取配置
config = get_db_config_from_args()
# 显示配置信息
print_section("配置信息")
print(f" 数据库服务器: {config['host']}:{config['port']}")
print(f" 数据库名称: {config['database']}")
print(f" 用户名: {config['user']}")
print(f" 租户ID: {config['tenant_id']}")
if config.get('file_id'):
print(f" 检查文件ID: {config['file_id']}")
# 连接数据库
print_section("连接数据库")
conn = test_db_connection(config)
if not conn:
return
print_result(True, "数据库连接成功")
try:
tenant_id = config['tenant_id']
file_id = config.get('file_id')
# 检查各个表
check_file_config(conn, tenant_id, file_id)
check_fields(conn, tenant_id)
check_file_field_relations(conn, tenant_id, file_id)
# 如果指定了文件ID,进行详细检查
if file_id:
check_specific_file(conn, tenant_id, file_id)
# 总结
print_section("检查完成")
print("请查看上述检查结果,找出问题所在")
finally:
conn.close()
print_result(True, "数据库连接已关闭")
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\n[中断] 用户取消操作")
sys.exit(0)
except Exception as e:
print(f"\n[错误] 发生异常: {str(e)}")
import traceback
traceback.print_exc()
sys.exit(1)

check_database_templates.py Normal file

@ -0,0 +1,202 @@
"""
检查数据库中的模板记录情况
"""
import os
import pymysql
from pathlib import Path
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# 先检查数据库中的实际 tenant_id
TENANT_ID = 615873064429507639 # 默认值,会在检查时自动发现实际的 tenant_id
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def check_database():
"""检查数据库记录"""
print_section("数据库模板记录检查")
try:
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 0. 先检查所有 tenant_id,确定实际使用的 tenant_id
print_section("0. 检查所有不同的 tenant_id")
cursor.execute("SELECT DISTINCT tenant_id, COUNT(*) as count FROM f_polic_file_config GROUP BY tenant_id")
tenant_ids = cursor.fetchall()
actual_tenant_id = None
for row in tenant_ids:
print(f" tenant_id={row['tenant_id']}: {row['count']} 条记录")
if actual_tenant_id is None:
actual_tenant_id = row['tenant_id']
# 使用实际的 tenant_id
if actual_tenant_id:
print(f"\n [使用] tenant_id={actual_tenant_id} 进行后续检查")
tenant_id = actual_tenant_id
else:
tenant_id = TENANT_ID
print(f"\n [使用] 默认 tenant_id={tenant_id}")
# 1. 检查 f_polic_file_config 表的所有记录(不限制条件)
print_section("1. 检查 f_polic_file_config 表(所有记录)")
cursor.execute("SELECT COUNT(*) as count FROM f_polic_file_config")
total_count = cursor.fetchone()['count']
print(f" 总记录数: {total_count}")
# 2. 检查按 tenant_id 过滤
print_section("2. 检查 f_polic_file_config 表(按 tenant_id 过滤)")
cursor.execute("SELECT COUNT(*) as count FROM f_polic_file_config WHERE tenant_id = %s", (tenant_id,))
tenant_count = cursor.fetchone()['count']
print(f" tenant_id={tenant_id} 的记录数: {tenant_count}")
# 3. 检查有 file_path 的记录
print_section("3. 检查 f_polic_file_config 表(有 file_path 的记录)")
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
AND file_path IS NOT NULL
AND file_path != ''
""", (tenant_id,))
path_count = cursor.fetchone()['count']
print(f" 有 file_path 的记录数: {path_count}")
# 4. 检查不同状态的记录
print_section("4. 检查 f_polic_file_config 表(按 state 分组)")
cursor.execute("""
SELECT state, COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
GROUP BY state
""", (tenant_id,))
state_counts = cursor.fetchall()
for row in state_counts:
state_name = "已启用" if row['state'] == 1 else "已禁用"
print(f" state={row['state']} ({state_name}): {row['count']}")
# 5. 查看前10条记录示例
print_section("5. f_polic_file_config 表记录示例前10条")
cursor.execute("""
SELECT id, name, file_path, state, tenant_id, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
LIMIT 10
""", (tenant_id,))
samples = cursor.fetchall()
if samples:
for i, row in enumerate(samples, 1):
print(f"\n 记录 {i}:")
print(f" ID: {row['id']}")
print(f" 名称: {row['name']}")
print(f" 路径: {row['file_path']}")
print(f" 状态: {row['state']} ({'已启用' if row['state'] == 1 else '已禁用'})")
print(f" 租户ID: {row['tenant_id']}")
print(f" 父级ID: {row['parent_id']}")
else:
print(" 没有找到记录")
# 7. 检查 file_path 的类型分布
print_section("7. 检查 file_path 路径类型分布")
cursor.execute("""
SELECT
CASE
WHEN file_path LIKE 'template_finish/%%' THEN '本地路径'
WHEN file_path LIKE '/%%TEMPLATE/%%' THEN 'MinIO路径'
WHEN file_path IS NULL OR file_path = '' THEN '空路径'
ELSE '其他路径'
END as path_type,
COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
GROUP BY path_type
""", (tenant_id,))
path_types = cursor.fetchall()
for row in path_types:
print(f" {row['path_type']}: {row['count']}")
# 8. 检查 f_polic_file_field 关联表
print_section("8. 检查 f_polic_file_field 关联表")
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_field
WHERE tenant_id = %s
""", (tenant_id,))
relation_count = cursor.fetchone()['count']
print(f" 关联记录数: {relation_count}")
# 9. 检查 f_polic_field 字段表
print_section("9. 检查 f_polic_field 字段表")
cursor.execute("""
SELECT
field_type,
CASE
WHEN field_type = 1 THEN '输入字段'
WHEN field_type = 2 THEN '输出字段'
ELSE '未知'
END as type_name,
COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY field_type
""", (tenant_id,))
field_types = cursor.fetchall()
for row in field_types:
print(f" {row['type_name']} (field_type={row['field_type']}): {row['count']}")
# 10. 检查完整的关联关系
print_section("10. 检查模板与字段的关联关系(示例)")
cursor.execute("""
SELECT
fc.id as file_id,
fc.name as file_name,
fc.file_path,
COUNT(ff.filed_id) as field_count
FROM f_polic_file_config fc
LEFT JOIN f_polic_file_field ff ON fc.id = ff.file_id AND ff.tenant_id = %s
WHERE fc.tenant_id = %s
GROUP BY fc.id, fc.name, fc.file_path
LIMIT 10
""", (tenant_id, tenant_id))
relations = cursor.fetchall()
if relations:
for i, row in enumerate(relations, 1):
print(f"\n Template {i}:")
print(f" ID: {row['file_id']}")
print(f" Name: {row['file_name']}")
print(f" Path: {row['file_path']}")
print(f" Related fields: {row['field_count']}")
else:
print(" No relation records found")
cursor.close()
conn.close()
print_section("Check complete")
except Exception as e:
print(f"Check failed: {str(e)}")
import traceback
traceback.print_exc()
if __name__ == "__main__":
check_database()

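The CASE expression in section 7 above buckets `file_path` values by prefix. A minimal Python mirror of those rules (the function name is hypothetical; useful for unit-testing the classification without a database):

```python
def classify_path(file_path):
    """Bucket a file_path the same way the SQL CASE in section 7 does."""
    if not file_path:
        return "empty path"
    if file_path.startswith("template_finish/"):
        return "local path"
    if file_path.startswith("/") and "TEMPLATE/" in file_path:
        return "MinIO path"
    return "other"

print(classify_path("template_finish/a.docx"))  # local path
```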

@ -0,0 +1,140 @@
"""
Check the actual data in the database: list the tenant_id values present and the row counts for each.
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
def check_tenant_data():
"""Check the tenant_id data in each table"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
print("=" * 80)
print("Checking tenant_id data in the database")
print("=" * 80)
# 1. tenant_id values in f_polic_field
print("\n1. tenant_id distribution in f_polic_field:")
cursor.execute("""
SELECT tenant_id,
COUNT(*) as total_count,
SUM(CASE WHEN field_type = 1 THEN 1 ELSE 0 END) as input_count,
SUM(CASE WHEN field_type = 2 THEN 1 ELSE 0 END) as output_count,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled_count
FROM f_polic_field
GROUP BY tenant_id
ORDER BY tenant_id
""")
field_tenants = cursor.fetchall()
for row in field_tenants:
print(f" tenant_id: {row['tenant_id']}")
print(f" total fields: {row['total_count']}, input: {row['input_count']}, output: {row['output_count']}, enabled: {row['enabled_count']}")
# 2. tenant_id values in f_polic_file_config
print("\n2. tenant_id distribution in f_polic_file_config:")
cursor.execute("""
SELECT tenant_id,
COUNT(*) as total_count,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled_count
FROM f_polic_file_config
GROUP BY tenant_id
ORDER BY tenant_id
""")
config_tenants = cursor.fetchall()
for row in config_tenants:
print(f" tenant_id: {row['tenant_id']}")
print(f" total templates: {row['total_count']}, enabled: {row['enabled_count']}")
# 3. tenant_id values in f_polic_file_field
print("\n3. tenant_id distribution in f_polic_file_field:")
cursor.execute("""
SELECT tenant_id,
COUNT(*) as total_count,
SUM(CASE WHEN state = 1 THEN 1 ELSE 0 END) as enabled_count
FROM f_polic_file_field
GROUP BY tenant_id
ORDER BY tenant_id
""")
relation_tenants = cursor.fetchall()
for row in relation_tenants:
print(f" tenant_id: {row['tenant_id']}")
print(f" total relations: {row['total_count']}, enabled: {row['enabled_count']}")
# 4. Detailed data for a specific tenant_id
test_tenant_id = 615873064429507600
print(f"\n4. Details for tenant_id = {test_tenant_id}:")
# Field data
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
""", (test_tenant_id,))
field_count = cursor.fetchone()['count']
print(f" fields in f_polic_field: {field_count}")
if field_count > 0:
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
LIMIT 10
""", (test_tenant_id,))
sample_fields = cursor.fetchall()
print(f" Sample fields (first 10):")
for field in sample_fields:
print(f" ID: {field['id']}, name: {field['name']}, code: {field['filed_code']}, type: {field['field_type']}, state: {field['state']}")
# Template data
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_config
WHERE tenant_id = %s
""", (test_tenant_id,))
template_count = cursor.fetchone()['count']
print(f" templates in f_polic_file_config: {template_count}")
# Relation data
cursor.execute("""
SELECT COUNT(*) as count
FROM f_polic_file_field
WHERE tenant_id = %s
""", (test_tenant_id,))
relation_count = cursor.fetchone()['count']
print(f" relations in f_polic_file_field: {relation_count}")
# 5. All distinct tenant_id values across the tables
print("\n5. tenant_id values appearing in any table:")
cursor.execute("""
SELECT DISTINCT tenant_id FROM f_polic_field
UNION
SELECT DISTINCT tenant_id FROM f_polic_file_config
UNION
SELECT DISTINCT tenant_id FROM f_polic_file_field
ORDER BY tenant_id
""")
all_tenants = cursor.fetchall()
print(" All tenant_id values:")
for row in all_tenants:
print(f" {row['tenant_id']}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_tenant_data()

105
check_existing_data.py Normal file

@ -0,0 +1,105 @@
"""
Check the existing data in the database and confirm how it matches up.
"""
import os
import json
import pymysql
from pathlib import Path
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_existing_data():
"""Check the existing data in the database"""
print("="*80)
print("Checking existing data in the database")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# Query all records
sql = """
SELECT id, name, parent_id, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
ORDER BY name
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
print(f"\nFound {len(configs)} records\n")
# Split records by parent_id
with_parent = []
without_parent = []
for config in configs:
# Try to extract template_code from input_data
template_code = config.get('template_code')
if not template_code and config.get('input_data'):
try:
input_data = json.loads(config['input_data']) if isinstance(config['input_data'], str) else config['input_data']
if isinstance(input_data, dict):
template_code = input_data.get('template_code')
except (ValueError, TypeError):
pass
config['extracted_template_code'] = template_code
if config.get('parent_id'):
with_parent.append(config)
else:
without_parent.append(config)
print(f"Records with parent_id: {len(with_parent)}")
print(f"Records without parent_id: {len(without_parent)}\n")
# Show records without parent_id
print("="*80)
print("Records without parent_id:")
print("="*80)
for i, config in enumerate(without_parent, 1):
print(f"\n{i}. {config['name']}")
print(f" ID: {config['id']}")
print(f" template_code: {config.get('extracted_template_code') or config.get('template_code') or ''}")
print(f" file_path: {config.get('file_path', '')}")
print(f" state: {config.get('state')}")
# Show records with parent_id (tree structure)
print("\n" + "="*80)
print("Records with parent_id (tree structure):")
print("="*80)
# Build an ID-to-name map
id_to_name = {config['id']: config['name'] for config in configs}
for config in with_parent:
parent_name = id_to_name.get(config['parent_id'], f"ID:{config['parent_id']}")
print(f"\n{config['name']}")
print(f" ID: {config['id']}")
print(f" parent: {parent_name} (ID: {config['parent_id']})")
print(f" template_code: {config.get('extracted_template_code') or config.get('template_code') or ''}")
cursor.close()
conn.close()
except Exception as e:
print(f"Error: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
check_existing_data()


@ -0,0 +1,496 @@
"""
Comprehensively check the relations in the f_polic_file_field table.
Focuses on how input fields (field_type=1) and output fields (field_type=2) are linked to templates.
"""
import pymysql
import os
import sys
from typing import Dict, List, Tuple
from collections import defaultdict
# Force UTF-8 output on Windows
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_all_templates_field_relations(conn) -> Dict:
"""Check the field relations of every template"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("1. Field relations for every template")
print("="*80)
# Fetch all enabled templates
cursor.execute("""
SELECT id, name, template_code
FROM f_polic_file_config
WHERE tenant_id = %s AND state = 1
ORDER BY name
""", (TENANT_ID,))
templates = cursor.fetchall()
# Fetch the fields linked to each template (grouped by type)
cursor.execute("""
SELECT
fc.id AS template_id,
fc.name AS template_name,
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
f.field_type
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id AND fc.tenant_id = fff.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fc.tenant_id = %s
AND fc.state = 1
AND fff.state = 1
AND f.state = 1
ORDER BY fc.name, f.field_type, f.name
""", (TENANT_ID,))
relations = cursor.fetchall()
# Group statistics by template
template_stats = {}
for template in templates:
template_stats[template['id']] = {
'template_id': template['id'],
'template_name': template['name'],
'template_code': template.get('template_code'),
'input_fields': [],
'output_fields': [],
'input_count': 0,
'output_count': 0
}
# Populate the field info
for rel in relations:
template_id = rel['template_id']
if template_id in template_stats:
field_info = {
'field_id': rel['field_id'],
'field_name': rel['field_name'],
'field_code': rel['field_code'],
'field_type': rel['field_type']
}
if rel['field_type'] == 1:
template_stats[template_id]['input_fields'].append(field_info)
template_stats[template_id]['input_count'] += 1
elif rel['field_type'] == 2:
template_stats[template_id]['output_fields'].append(field_info)
template_stats[template_id]['output_count'] += 1
# Print the statistics
print(f"\nFound {len(templates)} enabled templates\n")
templates_without_input = []
templates_without_output = []
templates_without_any = []
for template_id, stats in template_stats.items():
status = []
if stats['input_count'] == 0:
status.append("missing input fields")
templates_without_input.append(stats)
if stats['output_count'] == 0:
status.append("missing output fields")
templates_without_output.append(stats)
if stats['input_count'] == 0 and stats['output_count'] == 0:
templates_without_any.append(stats)
status_str = " | ".join(status) if status else "[OK] normal"
print(f" {stats['template_name']} (ID: {stats['template_id']})")
print(f" input fields: {stats['input_count']} | output fields: {stats['output_count']} | {status_str}")
return {
'template_stats': template_stats,
'templates_without_input': templates_without_input,
'templates_without_output': templates_without_output,
'templates_without_any': templates_without_any
}
def check_input_field_relations_detail(conn) -> Dict:
"""Check input-field relations in detail"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("2. Input-field relations in detail")
print("="*80)
# Fetch all enabled input fields
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 1 AND state = 1
ORDER BY name
""", (TENANT_ID,))
input_fields = cursor.fetchall()
# Fetch the templates linked to each input field
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
fc.id AS template_id,
fc.name AS template_name
FROM f_polic_field f
INNER JOIN f_polic_file_field fff ON f.id = fff.filed_id AND f.tenant_id = fff.tenant_id
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE f.tenant_id = %s
AND f.field_type = 1
AND f.state = 1
AND fff.state = 1
AND fc.state = 1
ORDER BY f.name, fc.name
""", (TENANT_ID,))
input_field_relations = cursor.fetchall()
# Group by field
field_template_map = defaultdict(list)
for rel in input_field_relations:
field_template_map[rel['field_id']].append({
'template_id': rel['template_id'],
'template_name': rel['template_name']
})
print(f"\nFound {len(input_fields)} enabled input fields\n")
fields_without_relations = []
fields_with_relations = []
for field in input_fields:
field_id = field['id']
templates = field_template_map.get(field_id, [])
if not templates:
fields_without_relations.append(field)
print(f" [WARN] {field['name']} ({field['filed_code']}) - not linked to any template")
else:
fields_with_relations.append({
'field': field,
'templates': templates
})
print(f" [OK] {field['name']} ({field['filed_code']}) - linked to {len(templates)} templates:")
for template in templates:
print(f" - {template['template_name']} (ID: {template['template_id']})")
return {
'input_fields': input_fields,
'fields_without_relations': fields_without_relations,
'fields_with_relations': fields_with_relations
}
def check_output_field_relations_detail(conn) -> Dict:
"""Check output-field relations in detail"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("3. Output-field relations in detail")
print("="*80)
# Fetch all enabled output fields
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 2 AND state = 1
ORDER BY name
""", (TENANT_ID,))
output_fields = cursor.fetchall()
# Fetch the templates linked to each output field
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
fc.id AS template_id,
fc.name AS template_name
FROM f_polic_field f
INNER JOIN f_polic_file_field fff ON f.id = fff.filed_id AND f.tenant_id = fff.tenant_id
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE f.tenant_id = %s
AND f.field_type = 2
AND f.state = 1
AND fff.state = 1
AND fc.state = 1
ORDER BY f.name, fc.name
""", (TENANT_ID,))
output_field_relations = cursor.fetchall()
# Group by field
field_template_map = defaultdict(list)
for rel in output_field_relations:
field_template_map[rel['field_id']].append({
'template_id': rel['template_id'],
'template_name': rel['template_name']
})
print(f"\nFound {len(output_fields)} enabled output fields\n")
fields_without_relations = []
fields_with_relations = []
for field in output_fields:
field_id = field['id']
templates = field_template_map.get(field_id, [])
if not templates:
fields_without_relations.append(field)
print(f" [WARN] {field['name']} ({field['filed_code']}) - not linked to any template")
else:
fields_with_relations.append({
'field': field,
'templates': templates
})
if len(templates) <= 5:
print(f" [OK] {field['name']} ({field['filed_code']}) - linked to {len(templates)} templates:")
for template in templates:
print(f" - {template['template_name']} (ID: {template['template_id']})")
else:
print(f" [OK] {field['name']} ({field['filed_code']}) - linked to {len(templates)} templates")
for template in templates[:3]:
print(f" - {template['template_name']} (ID: {template['template_id']})")
print(f" ... and {len(templates) - 3} more templates")
return {
'output_fields': output_fields,
'fields_without_relations': fields_without_relations,
'fields_with_relations': fields_with_relations
}
def check_invalid_relations(conn) -> Dict:
"""Check for invalid relations"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("4. Invalid relations")
print("="*80)
# Relations pointing at a nonexistent file_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.id IS NULL
""", (TENANT_ID,))
invalid_file_relations = cursor.fetchall()
# Relations pointing at a nonexistent filed_id
cursor.execute("""
SELECT fff.id, fff.file_id, fff.filed_id, fff.tenant_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.id IS NULL
""", (TENANT_ID,))
invalid_field_relations = cursor.fetchall()
print(f"\nRelations pointing at a nonexistent file_id: {len(invalid_file_relations)}")
if invalid_file_relations:
print(" Details:")
for rel in invalid_file_relations[:10]:
print(f" - relation ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_file_relations) > 10:
print(f" ... and {len(invalid_file_relations) - 10} more")
else:
print(" [OK] No invalid file_id relations")
print(f"\nRelations pointing at a nonexistent filed_id: {len(invalid_field_relations)}")
if invalid_field_relations:
print(" Details:")
for rel in invalid_field_relations[:10]:
print(f" - relation ID: {rel['id']}, file_id: {rel['file_id']}, filed_id: {rel['filed_id']}")
if len(invalid_field_relations) > 10:
print(f" ... and {len(invalid_field_relations) - 10} more")
else:
print(" [OK] No invalid filed_id relations")
return {
'invalid_file_relations': invalid_file_relations,
'invalid_field_relations': invalid_field_relations
}
def get_summary_statistics(conn) -> Dict:
"""Collect summary statistics"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("\n" + "="*80)
print("5. Summary statistics")
print("="*80)
# Total relations
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field
WHERE tenant_id = %s AND state = 1
""", (TENANT_ID,))
total_relations = cursor.fetchone()['total']
# Input-field relations
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 1
""", (TENANT_ID,))
input_relations = cursor.fetchone()['total']
# Output-field relations
cursor.execute("""
SELECT COUNT(*) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 2
""", (TENANT_ID,))
output_relations = cursor.fetchone()['total']
# Number of templates with relations
cursor.execute("""
SELECT COUNT(DISTINCT file_id) as total
FROM f_polic_file_field
WHERE tenant_id = %s AND state = 1
""", (TENANT_ID,))
related_templates = cursor.fetchone()['total']
# Number of input fields with relations
cursor.execute("""
SELECT COUNT(DISTINCT fff.filed_id) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 1
""", (TENANT_ID,))
related_input_fields = cursor.fetchone()['total']
# Number of output fields with relations
cursor.execute("""
SELECT COUNT(DISTINCT fff.filed_id) as total
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 2
""", (TENANT_ID,))
related_output_fields = cursor.fetchone()['total']
print(f"\nTotal relations: {total_relations}")
print(f" input-field relations: {input_relations}")
print(f" output-field relations: {output_relations}")
print(f"\nTemplates with relations: {related_templates}")
print(f"Input fields with relations: {related_input_fields}")
print(f"Output fields with relations: {related_output_fields}")
return {
'total_relations': total_relations,
'input_relations': input_relations,
'output_relations': output_relations,
'related_templates': related_templates,
'related_input_fields': related_input_fields,
'related_output_fields': related_output_fields
}
def main():
"""Entry point"""
print("="*80)
print("Comprehensive check of f_polic_file_field relations")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("[OK] Database connection established\n")
except Exception as e:
print(f"[ERROR] Database connection failed: {e}")
return
try:
# 1. Field relations for every template
template_result = check_all_templates_field_relations(conn)
# 2. Input-field relations in detail
input_result = check_input_field_relations_detail(conn)
# 3. Output-field relations in detail
output_result = check_output_field_relations_detail(conn)
# 4. Invalid relations
invalid_result = check_invalid_relations(conn)
# 5. Summary statistics
stats = get_summary_statistics(conn)
# Wrap-up
print("\n" + "="*80)
print("Check summary")
print("="*80)
issues = []
if len(template_result['templates_without_input']) > 0:
issues.append(f"[WARN] {len(template_result['templates_without_input'])} templates have no input-field relations")
if len(template_result['templates_without_output']) > 0:
issues.append(f"[WARN] {len(template_result['templates_without_output'])} templates have no output-field relations")
if len(template_result['templates_without_any']) > 0:
issues.append(f"[WARN] {len(template_result['templates_without_any'])} templates have no field relations at all")
if len(input_result['fields_without_relations']) > 0:
issues.append(f"[WARN] {len(input_result['fields_without_relations'])} input fields are not linked to any template")
if len(output_result['fields_without_relations']) > 0:
issues.append(f"[WARN] {len(output_result['fields_without_relations'])} output fields are not linked to any template")
if len(invalid_result['invalid_file_relations']) > 0:
issues.append(f"[ERROR] {len(invalid_result['invalid_file_relations'])} invalid file_id relations")
if len(invalid_result['invalid_field_relations']) > 0:
issues.append(f"[ERROR] {len(invalid_result['invalid_field_relations'])} invalid filed_id relations")
if issues:
print("\nIssues found:\n")
for issue in issues:
print(f" {issue}")
else:
print("\n[OK] No obvious problems found")
print("\n" + "="*80)
except Exception as e:
print(f"\n[ERROR] Error during check: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\nDatabase connection closed")
if __name__ == '__main__':
main()

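The detail checks above group result rows into `field_template_map` with `defaultdict(list)`; the same pattern in isolation, with made-up rows standing in for what the DictCursor returns:

```python
from collections import defaultdict

# Rows as a DictCursor would return them (values are made up)
rows = [
    {"field_id": 1, "template_name": "A"},
    {"field_id": 1, "template_name": "B"},
    {"field_id": 2, "template_name": "A"},
]

# Missing keys start as an empty list, so no existence check is needed
field_template_map = defaultdict(list)
for rel in rows:
    field_template_map[rel["field_id"]].append(rel["template_name"])

print(dict(field_template_map))  # {1: ['A', 'B'], 2: ['A']}
```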
88
check_relations_query.py Normal file

@ -0,0 +1,88 @@
"""
Check the relation-query logic
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_relations():
"""Check the relation queries"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# Inspect the relations of one specific template
template_id = 1765273962716807 # 走读式谈话流程
print(f"Checking template ID: {template_id}")
# Approach 1: the query the API currently uses
print("\nApproach 1: query the API currently uses (with INNER JOIN and state=1):")
cursor.execute("""
SELECT fff.file_id, fff.filed_id, fff.state as relation_state, fc.state as template_state
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fff.state = 1 AND fff.file_id = %s
""", (TENANT_ID, template_id))
results1 = cursor.fetchall()
print(f" rows: {len(results1)}")
for r in results1[:5]:
print(f" file_id: {r['file_id']}, filed_id: {r['filed_id']}, relation_state: {r['relation_state']}, template_state: {r['template_state']}")
# Approach 2: query only the relation table, ignoring template state
print("\nApproach 2: relation table only (template state not checked):")
cursor.execute("""
SELECT fff.file_id, fff.filed_id, fff.state as relation_state
FROM f_polic_file_field fff
WHERE fff.tenant_id = %s AND fff.state = 1 AND fff.file_id = %s
""", (TENANT_ID, template_id))
results2 = cursor.fetchall()
print(f" rows: {len(results2)}")
for r in results2[:5]:
print(f" file_id: {r['file_id']}, filed_id: {r['filed_id']}, relation_state: {r['relation_state']}")
# Approach 3: check whether the template exists and is enabled
print("\nApproach 3: template status:")
cursor.execute("""
SELECT id, name, state
FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, template_id))
template = cursor.fetchone()
if template:
print(f" template exists: {template['name']}, state: {template['state']}")
else:
print(f" template does not exist")
# Approach 4: all relations, including state=0
print("\nApproach 4: all relations (including disabled):")
cursor.execute("""
SELECT fff.file_id, fff.filed_id, fff.state as relation_state
FROM f_polic_file_field fff
WHERE fff.tenant_id = %s AND fff.file_id = %s
""", (TENANT_ID, template_id))
results4 = cursor.fetchall()
print(f" rows: {len(results4)}")
enabled = [r for r in results4 if r['relation_state'] == 1]
disabled = [r for r in results4 if r['relation_state'] == 0]
print(f" enabled: {len(enabled)}, disabled: {len(disabled)}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_relations()

131
check_remaining_fields.py Normal file

@ -0,0 +1,131 @@
"""
Check the remaining unprocessed fields and generate suitable field_code values.
"""
import os
import pymysql
import re
from typing import Dict, List
# Database connection settings
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def is_chinese(text: str) -> bool:
"""Return True if the string contains Chinese characters"""
if not text:
return False
return bool(re.search(r'[\u4e00-\u9fff]', text))
def generate_field_code(field_name: str) -> str:
"""Generate a field_code from the field name"""
# Strip common prefixes
name = field_name.replace('被核查人员', 'target_').replace('被核查人', 'target_')
# Lowercase and replace special characters
code = name.lower()
code = re.sub(r'[^\w\u4e00-\u9fff]', '_', code)
code = re.sub(r'_+', '_', code).strip('_')
# If the code is still Chinese, try a smarter conversion
if is_chinese(code):
# Simple keyword mapping (illustrative only; a real implementation should use a pinyin library)
# For now fall back to simpler rules
code = field_name.lower()
code = code.replace('被核查人员', 'target_')
code = code.replace('被核查人', 'target_')
code = code.replace('谈话', 'interview_')
code = code.replace('审批', 'approval_')
code = code.replace('核查', 'investigation_')
code = code.replace('人员', '')
code = code.replace('时间', '_time')
code = code.replace('地点', '_location')
code = code.replace('部门', '_department')
code = code.replace('姓名', '_name')
code = code.replace('号码', '_number')
code = code.replace('情况', '_situation')
code = code.replace('问题', '_issue')
code = code.replace('描述', '_description')
code = re.sub(r'[^\w]', '_', code)
code = re.sub(r'_+', '_', code).strip('_')
return code
def check_remaining_fields():
"""Check the remaining unprocessed fields"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("Checking remaining unprocessed fields")
print("="*80)
# Fields whose field_code is still Chinese, NULL, or empty.
# Note: MySQL string literals drop unrecognized backslash escapes, so a '\\u4e00'
# pattern would reach the regex engine as 'u4e00'. The ICU \x{...} form below
# assumes MySQL 8+; older servers need a different non-ASCII test.
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s AND (
filed_code REGEXP '[\\\\x{4e00}-\\\\x{9fff}]'
OR filed_code IS NULL
OR filed_code = ''
)
ORDER BY name
""", (TENANT_ID,))
fields = cursor.fetchall()
print(f"\nFound {len(fields)} fields that still need attention:\n")
suggestions = []
for field in fields:
suggested_code = generate_field_code(field['name'])
suggestions.append({
'id': field['id'],
'name': field['name'],
'current_code': field['filed_code'],
'suggested_code': suggested_code,
'field_type': field['field_type']
})
print(f" ID: {field['id']}")
print(f" name: {field['name']}")
print(f" current field_code: {field['filed_code']}")
print(f" suggested field_code: {suggested_code}")
print(f" field_type: {field['field_type']}")
print()
# Ask whether to apply the updates
if suggestions:
print("="*80)
choice = input("Update these field_code values? (y/n, default n): ").strip().lower()
if choice == 'y':
print("\nApplying updates...")
for sug in suggestions:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s
""", (sug['suggested_code'], 655162080928945152, sug['id']))
print(f" ✓ updated field ID {sug['id']}: {sug['name']} -> {sug['suggested_code']}")
conn.commit()
print("\n✓ Update complete")
else:
print("No updates applied")
cursor.close()
conn.close()
if __name__ == '__main__':
check_remaining_fields()

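generate_field_code above amounts to a keyword substitution followed by regex normalization. A trimmed, testable sketch of that idea (the function name and shortened keyword map are illustrative, not the script's full mapping):

```python
import re

def slugify_field_name(name, keyword_map=None):
    """Simplified sketch of generate_field_code: map known keywords, then normalize."""
    keyword_map = keyword_map or {"时间": "_time", "姓名": "_name"}  # shortened for illustration
    code = name.lower()
    for zh, en in keyword_map.items():
        code = code.replace(zh, en)
    code = re.sub(r"[^\w]", "_", code)          # non-word characters -> underscore
    return re.sub(r"_+", "_", code).strip("_")  # collapse runs, trim edges

# Note: \w matches CJK characters in Python 3, so unmapped Chinese survives
print(slugify_field_name("谈话时间"))  # 谈话_time
```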

@ -0,0 +1,198 @@
"""
Check the relations of a specific template
"""
import pymysql
import os
import re
from pathlib import Path
from docx import Document
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
TEMPLATE_NAME = "1.请示报告卡(初核谈话)"
TEMPLATE_FILE = "template_finish/2-初核模版/2.谈话审批/走读式谈话审批/1.请示报告卡(初核谈话).docx"
def extract_placeholders_from_docx(file_path: str):
"""Extract all placeholders from a .docx file"""
placeholders = set()
pattern = r'\{\{([^}]+)\}\}'
try:
doc = Document(file_path)
# Extract placeholders from paragraphs
for paragraph in doc.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
cleaned = match.strip()
if cleaned and '{' not in cleaned and '}' not in cleaned:
placeholders.add(cleaned)
# Extract placeholders from tables
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
cleaned = match.strip()
if cleaned and '{' not in cleaned and '}' not in cleaned:
placeholders.add(cleaned)
except Exception as e:
print(f"Error: failed to read file - {str(e)}")
return []
return sorted(list(placeholders))
def check_template():
"""Check the template's relations"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
print(f"Checking template: {TEMPLATE_NAME}")
print("=" * 80)
# 1. Extract placeholders from the document
print("\n1. Placeholders extracted from the document:")
if not Path(TEMPLATE_FILE).exists():
print(f" File not found: {TEMPLATE_FILE}")
return
placeholders = extract_placeholders_from_docx(TEMPLATE_FILE)
print(f" placeholder count: {len(placeholders)}")
print(f" placeholders: {placeholders}")
# 2. Look up the template ID
print(f"\n2. Template ID lookup:")
cursor.execute("""
SELECT id, name
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
""", (TENANT_ID, TEMPLATE_NAME))
template = cursor.fetchone()
if not template:
print(f" Template does not exist")
return
template_id = template['id']
print(f" template ID: {template_id}")
# 3. Load the field map
print(f"\n3. Field map:")
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
""", (TENANT_ID,))
fields = cursor.fetchall()
field_map = {}
for field in fields:
state = field['state']
if isinstance(state, bytes):
state = int.from_bytes(state, byteorder='big') if len(state) == 1 else 1
field_map[field['filed_code']] = {
'id': field['id'],
'name': field['name'],
'field_type': field['field_type'],
'state': state
}
print(f" total fields: {len(field_map)}")
# 4. Match placeholders to fields
print(f"\n4. Matching placeholders to fields:")
input_field_ids = []
output_field_ids = []
not_found = []
for placeholder in placeholders:
if placeholder in field_map:
field_info = field_map[placeholder]
if field_info['state'] == 1:
if field_info['field_type'] == 1:
input_field_ids.append(field_info['id'])
elif field_info['field_type'] == 2:
output_field_ids.append(field_info['id'])
else:
not_found.append(placeholder)
# Add the required input fields
required_input_fields = ['clue_info', 'target_basic_info_clue']
for req_field in required_input_fields:
if req_field in field_map:
field_info = field_map[req_field]
if field_info['state'] == 1 and field_info['id'] not in input_field_ids:
input_field_ids.append(field_info['id'])
print(f" input field IDs: {input_field_ids}")
print(f" output field IDs: {output_field_ids}")
if not_found:
print(f" unmatched placeholders: {not_found}")
# 5. Query the relations stored in the database
print(f"\n5. Relations stored in the database:")
cursor.execute("""
SELECT fff.filed_id, fff.state, f.name, f.field_type
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s
""", (TENANT_ID, template_id))
db_relations = cursor.fetchall()
db_input_ids = []
db_output_ids = []
for rel in db_relations:
state = rel['state']
if isinstance(state, bytes):
state = int.from_bytes(state, byteorder='big') if len(state) == 1 else 1
if state == 1:
if rel['field_type'] == 1:
db_input_ids.append(rel['filed_id'])
elif rel['field_type'] == 2:
db_output_ids.append(rel['filed_id'])
print(f" input field IDs in DB: {sorted(db_input_ids)}")
print(f" output field IDs in DB: {sorted(db_output_ids)}")
# 6. Compare
print(f"\n6. Comparison:")
expected_input = set(input_field_ids)
expected_output = set(output_field_ids)
actual_input = set(db_input_ids)
actual_output = set(db_output_ids)
print(f" input fields - expected: {sorted(expected_input)}, actual: {sorted(actual_input)}")
print(f" input fields match: {expected_input == actual_input}")
print(f" output fields - expected: {sorted(expected_output)}, actual: {sorted(actual_output)}")
print(f" output fields match: {expected_output == actual_output}")
if expected_output != actual_output:
missing = expected_output - actual_output
extra = actual_output - expected_output
print(f" missing output fields: {sorted(missing)}")
print(f" extra output fields: {sorted(extra)}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_template()

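The `{{...}}` extraction above can be exercised without a .docx file by running the same regex over plain text (the sample text and function name below are made up):

```python
import re

PATTERN = r"\{\{([^}]+)\}\}"

def extract_placeholders(text):
    """Collect unique, cleaned placeholder names, as extract_placeholders_from_docx does per paragraph."""
    found = set()
    for match in re.findall(PATTERN, text):
        cleaned = match.strip()
        # Drop matches that still contain braces (nested/broken placeholders)
        if cleaned and "{" not in cleaned and "}" not in cleaned:
            found.add(cleaned)
    return sorted(found)

print(extract_placeholders("姓名:{{target_name}},时间:{{ interview_time }}"))
# ['interview_time', 'target_name']
```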

@ -0,0 +1,98 @@
"""
Check all of a template's relations, including disabled ones.
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
TEMPLATE_ID = 1765432134276990 # 1.请示报告卡(初核谈话)
def check_all_relations():
"""Check all of the template's relations"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
print(f"Checking template ID: {TEMPLATE_ID}")
print("=" * 80)
# Look up the template
cursor.execute("""
SELECT id, name, state
FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, TEMPLATE_ID))
template = cursor.fetchone()
if template:
print(f"Template name: {template['name']}")
print(f"Template state: {template['state']}")
else:
print("Template does not exist")
return
# Fetch all relations, including state=0
cursor.execute("""
SELECT
fff.file_id,
fff.filed_id,
fff.state as relation_state,
f.name as field_name,
f.field_type,
f.state as field_state
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND fff.file_id = %s
ORDER BY f.field_type, f.name
""", (TENANT_ID, TEMPLATE_ID))
all_relations = cursor.fetchall()
print(f"\n所有关联关系数: {len(all_relations)}")
# Group by relation state
enabled_relations = [r for r in all_relations if r['relation_state'] == 1 or (isinstance(r['relation_state'], bytes) and r['relation_state'] == b'\x01')]
disabled_relations = [r for r in all_relations if r not in enabled_relations]
print(f"启用的关联关系: {len(enabled_relations)}")
print(f"未启用的关联关系: {len(disabled_relations)}")
# Group by field type
input_fields = [r for r in enabled_relations if r['field_type'] == 1]
output_fields = [r for r in enabled_relations if r['field_type'] == 2]
print(f"\n启用的输入字段关联: {len(input_fields)}")
for r in input_fields:
state_str = str(r['relation_state']) if not isinstance(r['relation_state'], bytes) else 'bytes'
print(f" - {r['field_name']} (ID: {r['filed_id']}, relation_state: {state_str}, field_state: {r['field_state']})")
print(f"\n启用的输出字段关联: {len(output_fields)}")
for r in output_fields[:10]:
state_str = str(r['relation_state']) if not isinstance(r['relation_state'], bytes) else 'bytes'
print(f" - {r['field_name']} (ID: {r['filed_id']}, relation_state: {state_str}, field_state: {r['field_state']})")
if len(output_fields) > 10:
print(f" ... 还有 {len(output_fields) - 10} 个输出字段")
# Inspect the disabled relations
if disabled_relations:
print(f"\n未启用的关联关系: {len(disabled_relations)}")
disabled_input = [r for r in disabled_relations if r['field_type'] == 1]
disabled_output = [r for r in disabled_relations if r['field_type'] == 2]
print(f" 输入字段: {len(disabled_input)}, 输出字段: {len(disabled_output)}")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_all_relations()
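The dual check on `relation_state` above (`== 1` or `== b'\x01'`) exists because MySQL `BIT(1)` columns come back from PyMySQL as one-byte `bytes`, while `TINYINT` columns arrive as plain `int`. A small helper (hypothetical, not part of the script) collapses both representations:

```python
def normalize_state(value) -> int:
    # PyMySQL returns BIT(1) columns as bytes like b'\x01', while
    # TINYINT columns arrive as int; collapsing both to int avoids the
    # dual check used in check_all_relations().
    if isinstance(value, (bytes, bytearray)):
        return int.from_bytes(value, 'big')
    return int(value)

print(normalize_state(b'\x01'), normalize_state(1), normalize_state(b'\x00'))  # 1 1 0
```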


@ -0,0 +1,76 @@
"""
Check which templates have output-field relations.
"""
import pymysql
import os
from dotenv import load_dotenv
load_dotenv()
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def check_templates_with_output_fields():
"""检查哪些模板有输出字段关联"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# Query all templates together with their related field counts
cursor.execute("""
SELECT
fc.id as template_id,
fc.name as template_name,
COUNT(CASE WHEN f.field_type = 2 THEN 1 END) as output_field_count,
COUNT(CASE WHEN f.field_type = 1 THEN 1 END) as input_field_count,
COUNT(*) as total_field_count
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id AND fc.tenant_id = fff.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fc.tenant_id = %s
AND fff.state = 1
AND fc.state = 1
GROUP BY fc.id, fc.name
HAVING output_field_count > 0
ORDER BY output_field_count DESC
LIMIT 10
""", (TENANT_ID,))
templates = cursor.fetchall()
print(f"有输出字段关联的模板前10个:")
print("=" * 80)
for t in templates:
print(f"\n模板: {t['template_name']} (ID: {t['template_id']})")
print(f" 输入字段: {t['input_field_count']}, 输出字段: {t['output_field_count']}, 总计: {t['total_field_count']}")
# Fetch a few of this template's output fields
cursor.execute("""
SELECT f.id, f.name, f.filed_code
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
AND fff.file_id = %s
AND fff.state = 1
AND f.field_type = 2
LIMIT 5
""", (TENANT_ID, t['template_id']))
output_fields = cursor.fetchall()
print(f" 输出字段示例前5个:")
for f in output_fields:
print(f" - {f['name']} (ID: {f['id']}, code: {f['filed_code']})")
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
check_templates_with_output_fields()


@ -0,0 +1,874 @@
"""
清理并重新同步模板数据到指定数据库
功能
1. 清理指定tenant_id下的旧数据包括MinIO路径的数据
2. 清理相关的字段关联关系
3. 重新扫描template_finish/目录
4. 重新创建/更新模板数据
5. 重新建立字段关联关系
使用方法
python clean_and_resync_templates.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
import os
import sys
import pymysql
import argparse
from pathlib import Path
from typing import Dict, List, Set, Optional
import re
from docx import Document
import getpass
# Force UTF-8 output encoding (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
# Project root directory
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def generate_id():
"""生成ID"""
import time
return int(time.time() * 1000000)
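Since `generate_id()` derives IDs purely from the microsecond clock, two calls within the same microsecond (easy in the tight insert loops later in this script) can return the same value. A hypothetical collision-safe variant, not part of the script, mixes in a process-local counter:

```python
import itertools
import time

_seq = itertools.count()

def generate_unique_id() -> int:
    # Hypothetical variant of generate_id(): microsecond timestamp plus a
    # process-local sequence number, so IDs minted in a tight loop stay
    # unique and still fit in a signed 64-bit BIGINT column.
    return int(time.time() * 1_000_000) * 1000 + next(_seq) % 1000

ids = [generate_unique_id() for _ in range(1000)]
print(len(ids) == len(set(ids)))  # True: no duplicates
```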
def get_db_config_from_args() -> Optional[Dict]:
"""从命令行参数获取数据库配置"""
parser = argparse.ArgumentParser(
description='清理并重新同步模板数据到指定数据库',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
示例
python clean_and_resync_templates.py --host 10.100.31.21 --port 3306 --user finyx --password FknJYz3FA5WDYtsd --database finyx --tenant-id 1
"""
)
parser.add_argument('--host', type=str, required=True, help='MySQL服务器地址')
parser.add_argument('--port', type=int, required=True, help='MySQL服务器端口')
parser.add_argument('--user', type=str, required=True, help='MySQL用户名')
parser.add_argument('--password', type=str, required=True, help='MySQL密码')
parser.add_argument('--database', type=str, required=True, help='数据库名称')
parser.add_argument('--tenant-id', type=int, required=True, help='租户ID')
parser.add_argument('--dry-run', action='store_true', help='预览模式(不实际更新数据库)')
parser.add_argument('--skip-clean', action='store_true', help='跳过清理步骤(只同步)')
args = parser.parse_args()
return {
'host': args.host,
'port': args.port,
'user': args.user,
'password': args.password,
'database': args.database,
'charset': 'utf8mb4',
'tenant_id': args.tenant_id,
'dry_run': args.dry_run,
'skip_clean': args.skip_clean
}
def test_db_connection(config: Dict) -> Optional[pymysql.Connection]:
"""测试数据库连接"""
try:
conn = pymysql.connect(
host=config['host'],
port=config['port'],
user=config['user'],
password=config['password'],
database=config['database'],
charset=config['charset']
)
return conn
except Exception as e:
print_result(False, f"数据库连接失败: {str(e)}")
return None
def scan_local_templates() -> Dict[str, Path]:
"""扫描本地template_finish目录返回file_path -> Path的映射"""
templates = {}
if not TEMPLATES_DIR.exists():
return templates
for item in TEMPLATES_DIR.rglob("*"):
if item.is_file() and item.suffix.lower() in ['.docx', '.doc']:
rel_path = item.relative_to(PROJECT_ROOT)
rel_path_str = str(rel_path).replace('\\', '/')
templates[rel_path_str] = item
return templates
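`scan_local_templates()` keys its mapping by a forward-slash relative path so a `file_path` recorded on Windows still matches on Linux. The normalization step in isolation, using pure paths only (no filesystem access):

```python
from pathlib import PurePosixPath, PureWindowsPath

# On Windows, relative_to() yields backslash-separated paths; the script
# rewrites them to forward slashes before using them as dictionary keys.
win_rel = PureWindowsPath(r"template_finish\初核\请示报告卡.docx")
normalized = str(win_rel).replace("\\", "/")
print(normalized)  # template_finish/初核/请示报告卡.docx
assert normalized == str(PurePosixPath(*win_rel.parts))
```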
def clean_old_data(conn, tenant_id: int, local_templates: Dict[str, Path], dry_run: bool = False):
"""清理旧数据"""
print_section("清理旧数据")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. Fetch all templates
cursor.execute("""
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
""", (tenant_id,))
all_templates = cursor.fetchall()
print(f" 数据库中的模板总数: {len(all_templates)}")
# 2. Identify templates to delete
to_delete = []
minio_paths = []
invalid_paths = []
duplicate_paths = []
# Count occurrences of each file_path
path_count = {}
for template in all_templates:
file_path = template.get('file_path')
if file_path:
if file_path not in path_count:
path_count[file_path] = []
path_count[file_path].append(template)
for template in all_templates:
file_path = template.get('file_path')
template_id = template['id']
# Is it a MinIO-style path?
if file_path and ('minio' in file_path.lower() or file_path.startswith('http://') or file_path.startswith('https://')):
minio_paths.append(template)
to_delete.append(template_id)
continue
# Does the file path exist locally?
if file_path:
if file_path not in local_templates:
invalid_paths.append(template)
to_delete.append(template_id)
continue
# Is the path duplicated?
if len(path_count.get(file_path, [])) > 1:
# Keep the first record, delete the rest
if template != path_count[file_path][0]:
duplicate_paths.append(template)
to_delete.append(template_id)
continue
# 3. Summarize what will be deleted
print(f"\n 需要删除的模板:")
print(f" - MinIO路径的模板: {len(minio_paths)}")
print(f" - 无效路径的模板: {len(invalid_paths)}")
print(f" - 重复路径的模板: {len(duplicate_paths)}")
print(f" - 总计: {len(to_delete)}")
if to_delete and not dry_run:
# 4. Delete field relations
print("\n 删除字段关联关系...")
if to_delete:
placeholders = ','.join(['%s'] * len(to_delete))
delete_relations_sql = f"""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id IN ({placeholders})
"""
cursor.execute(delete_relations_sql, [tenant_id] + to_delete)
deleted_relations = cursor.rowcount
print(f" 删除了 {deleted_relations} 条字段关联关系")
# 5. Soft-delete template records
print("\n 删除模板记录...")
delete_templates_sql = f"""
UPDATE f_polic_file_config
SET state = 0, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s
AND id IN ({placeholders})
"""
cursor.execute(delete_templates_sql, [UPDATED_BY, tenant_id] + to_delete)
deleted_templates = cursor.rowcount
print(f" 删除了 {deleted_templates} 个模板记录标记为state=0")
conn.commit()
print_result(True, f"清理完成:删除了 {deleted_templates} 个模板记录")
elif to_delete:
print("\n [预览模式] 将删除上述模板记录")
else:
print_result(True, "没有需要清理的数据")
return {
'total': len(all_templates),
'deleted': len(to_delete),
'minio_paths': len(minio_paths),
'invalid_paths': len(invalid_paths),
'duplicate_paths': len(duplicate_paths)
}
finally:
cursor.close()
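`clean_old_data()` builds its variable-length `IN (...)` clause by formatting only a list of `%s` placeholders into the SQL string; the IDs themselves still travel as bound parameters. The pattern in isolation:

```python
# Variable-length parameterized IN clause, as used in clean_old_data():
# only the placeholder list is interpolated, never the values themselves.
to_delete = [1765432134276990, 1765432134276991]
tenant_id = 615873064429507639

placeholders = ','.join(['%s'] * len(to_delete))
sql = (
    "DELETE FROM f_polic_file_field "
    f"WHERE tenant_id = %s AND file_id IN ({placeholders})"
)
params = [tenant_id] + to_delete
print(sql)
print(params)
```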
def scan_directory_structure(base_dir: Path) -> Dict:
"""扫描目录结构"""
directories = []
files = []
def scan_recursive(current_path: Path, parent_path: Optional[str] = None):
"""递归扫描目录"""
if not current_path.exists() or not current_path.is_dir():
return
# Relative path
rel_path = current_path.relative_to(base_dir)
rel_path_str = str(rel_path).replace('\\', '/')
# Add a directory node
if rel_path_str != '.':
directories.append({
'name': current_path.name,
'path': rel_path_str,
'parent_path': parent_path
})
# Scan children
for item in sorted(current_path.iterdir()):
if item.is_dir():
scan_recursive(item, rel_path_str)
elif item.is_file() and item.suffix.lower() in ['.docx', '.doc']:
file_rel_path = item.relative_to(base_dir)
file_rel_path_str = str(file_rel_path).replace('\\', '/')
files.append({
'name': item.name,
'path': file_rel_path_str,
'parent_path': rel_path_str if rel_path_str != '.' else None
})
scan_recursive(base_dir)
return {
'directories': directories,
'files': files
}
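A minimal check of the `parent_path` bookkeeping used by `scan_directory_structure()`: files directly under the base directory get `parent_path=None`, nested files get their directory's relative path with forward slashes. This sketch builds a throwaway tree and replays just that logic:

```python
import tempfile
from pathlib import Path

# Build a tiny temporary tree and record each .docx file's parent path
# the same way scan_directory_structure() does.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "初核").mkdir()
    (base / "初核" / "a.docx").write_bytes(b"")
    (base / "b.docx").write_bytes(b"")

    parent_of = {}
    for item in base.rglob("*.docx"):
        rel = item.relative_to(base)
        parent = str(rel.parent).replace("\\", "/")
        parent_of[str(rel).replace("\\", "/")] = None if parent == "." else parent

    print(parent_of)
```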
def get_existing_templates(conn, tenant_id: int) -> Dict:
"""获取现有模板只获取state=1的"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
""", (tenant_id,))
templates = cursor.fetchall()
result = {
'by_path': {},
'by_name': {},
'by_id': {}
}
for t in templates:
result['by_id'][t['id']] = t
if t['file_path']:
result['by_path'][t['file_path']] = t
else:
name = t['name']
if name not in result['by_name']:
result['by_name'][name] = []
result['by_name'][name].append(t)
return result
finally:
cursor.close()
def sync_template_hierarchy(conn, tenant_id: int, dry_run: bool = False):
"""同步模板层级结构"""
print_section("同步模板层级结构")
# 1. Scan the directory structure
print("1. 扫描目录结构...")
structure = scan_directory_structure(TEMPLATES_DIR)
print_result(True, f"找到 {len(structure['directories'])} 个目录,{len(structure['files'])} 个文件")
if not structure['directories'] and not structure['files']:
print_result(False, "未找到任何目录或文件")
return None
# 2. Fetch existing templates
print("\n2. 获取现有模板...")
existing_templates = get_existing_templates(conn, tenant_id)
print_result(True, f"找到 {len(existing_templates['by_path'])} 个文件模板,{len(existing_templates['by_name'])} 个目录模板")
# 3. Create/update directory nodes
print("\n3. 创建/更新目录节点...")
path_to_id = {}
dir_created = 0
dir_updated = 0
for dir_info in structure['directories']:
parent_id = None
if dir_info['parent_path']:
parent_id = path_to_id.get(dir_info['parent_path'])
existing = None
candidates = existing_templates['by_name'].get(dir_info['name'], [])
for candidate in candidates:
if candidate.get('parent_id') == parent_id and not candidate.get('file_path'):
existing = candidate
break
if existing:
dir_id = existing['id']
if existing.get('parent_id') != parent_id:
dir_updated += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (parent_id, UPDATED_BY, dir_id, tenant_id))
conn.commit()
cursor.close()
else:
dir_id = generate_id()
dir_created += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NULL, NOW(), %s, NOW(), %s, 1)
""", (dir_id, tenant_id, parent_id, dir_info['name'], CREATED_BY, UPDATED_BY))
conn.commit()
cursor.close()
path_to_id[dir_info['path']] = dir_id
print_result(True, f"创建 {dir_created} 个目录,更新 {dir_updated} 个目录")
# 4. Create/update file nodes
print("\n4. 创建/更新文件节点...")
file_created = 0
file_updated = 0
for file_info in structure['files']:
parent_id = None
if file_info['parent_path']:
parent_id = path_to_id.get(file_info['parent_path'])
existing = existing_templates['by_path'].get(file_info['path'])
if existing:
file_id = existing['id']
if existing.get('parent_id') != parent_id or existing.get('name') != file_info['name']:
file_updated += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, name = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (parent_id, file_info['name'], UPDATED_BY, file_id, tenant_id))
conn.commit()
cursor.close()
else:
file_id = generate_id()
file_created += 1
if not dry_run:
cursor = conn.cursor()
cursor.execute("""
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (file_id, tenant_id, parent_id, file_info['name'], file_info['path'], CREATED_BY, UPDATED_BY))
conn.commit()
cursor.close()
print_result(True, f"创建 {file_created} 个文件,更新 {file_updated} 个文件")
return {
'directories_created': dir_created,
'directories_updated': dir_updated,
'files_created': file_created,
'files_updated': file_updated
}
def get_input_fields(conn, tenant_id: int) -> Dict[str, int]:
"""获取输入字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, filed_code, name
FROM f_polic_field
WHERE tenant_id = %s
AND field_type = 1
AND filed_code IN ('clue_info', 'target_basic_info_clue')
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
fields = cursor.fetchall()
result = {}
for field in fields:
result[field['filed_code']] = field['id']
return result
finally:
cursor.close()
def get_output_fields(conn, tenant_id: int) -> Dict[str, int]:
"""获取所有输出字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, filed_code, name
FROM f_polic_field
WHERE tenant_id = %s
AND field_type = 2
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
fields = cursor.fetchall()
result = {}
for field in fields:
result[field['filed_code']] = field['id']
return result
finally:
cursor.close()
def extract_placeholders_from_docx(file_path: Path) -> Set[str]:
"""从docx文件中提取所有占位符"""
placeholders = set()
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')
try:
doc = Document(file_path)
# Extract from paragraphs
for paragraph in doc.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
# Extract from tables
for table in doc.tables:
try:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
except Exception:
# Skip tables whose structure python-docx cannot iterate
continue
except Exception:
# Unreadable document: return whatever placeholders were collected
pass
return placeholders
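The `{{field_code}}` matching that `extract_placeholders_from_docx()` applies to every paragraph, run here against a plain string so the pattern's behavior is easy to verify (no .docx file required):

```python
import re

# Same pattern as in extract_placeholders_from_docx(); whitespace inside
# the braces is stripped and empty placeholders never match.
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')

text = "姓名:{{target_name}} 单位:{{ target_unit }} 空占位符:{{}}"
placeholders = {m.strip() for m in placeholder_pattern.findall(text) if m.strip()}
print(sorted(placeholders))  # ['target_name', 'target_unit']
```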
def create_missing_input_field(conn, tenant_id: int, field_code: str) -> Optional[int]:
"""创建缺失的输入字段"""
cursor = conn.cursor()
try:
field_id = generate_id()
field_name_map = {
'clue_info': '线索信息',
'target_basic_info_clue': '被核查人基本信息(线索)'
}
field_name = field_name_map.get(field_code, field_code.replace('_', ' '))
insert_sql = """
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
"""
cursor.execute(insert_sql, (
field_id,
tenant_id,
field_name,
field_code,
1,
CREATED_BY,
UPDATED_BY
))
conn.commit()
return field_id
except Exception as e:
conn.rollback()
return None
finally:
cursor.close()
def create_missing_output_field(conn, tenant_id: int, field_code: str) -> Optional[int]:
"""创建缺失的输出字段"""
cursor = conn.cursor()
try:
# Check whether it already exists
check_cursor = conn.cursor(pymysql.cursors.DictCursor)
check_cursor.execute("""
SELECT id FROM f_polic_field
WHERE tenant_id = %s AND filed_code = %s
""", (tenant_id, field_code))
existing = check_cursor.fetchone()
check_cursor.close()
if existing:
return existing['id']
# Create the new field
field_id = generate_id()
field_name = field_code.replace('_', ' ')
insert_sql = """
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
"""
cursor.execute(insert_sql, (
field_id,
tenant_id,
field_name,
field_code,
2,
CREATED_BY,
UPDATED_BY
))
conn.commit()
return field_id
except Exception as e:
conn.rollback()
return None
finally:
cursor.close()
def get_existing_relations(conn, tenant_id: int, file_id: int) -> Set[int]:
"""获取模板的现有关联关系"""
cursor = conn.cursor()
try:
sql = """
SELECT filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id = %s
AND state = 1
"""
cursor.execute(sql, (tenant_id, file_id))
results = cursor.fetchall()
return {row[0] for row in results}
finally:
cursor.close()
def sync_field_relations(conn, tenant_id: int, dry_run: bool = False):
"""同步字段关联关系"""
print_section("同步字段关联关系")
# 1. Get input fields
print("1. 获取输入字段...")
input_fields = get_input_fields(conn, tenant_id)
if not input_fields:
print(" 创建缺失的输入字段...")
for field_code in ['clue_info', 'target_basic_info_clue']:
field_id = create_missing_input_field(conn, tenant_id, field_code)
if field_id:
input_fields[field_code] = field_id
if not input_fields:
print_result(False, "无法获取或创建输入字段")
return None
input_field_ids = list(input_fields.values())
print_result(True, f"找到 {len(input_field_ids)} 个输入字段")
# 2. Get output fields
print("\n2. 获取输出字段...")
output_fields = get_output_fields(conn, tenant_id)
print_result(True, f"找到 {len(output_fields)} 个输出字段")
# 3. Get all templates
print("\n3. 获取所有模板...")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND file_path IS NOT NULL
AND file_path != ''
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
templates = cursor.fetchall()
finally:
cursor.close()
print_result(True, f"找到 {len(templates)} 个模板")
if not templates:
print_result(False, "未找到模板")
return None
# 4. First clear all existing relations
print("\n4. 清理现有关联关系...")
if not dry_run:
cursor = conn.cursor()
try:
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s
""", (tenant_id,))
deleted_count = cursor.rowcount
conn.commit()
print_result(True, f"删除了 {deleted_count} 条旧关联关系")
finally:
cursor.close()
else:
print(" [预览模式] 将清理所有现有关联关系")
# 5. Scan template placeholders and create relations
print("\n5. 扫描模板占位符并创建关联关系...")
total_updated = 0
total_errors = 0
all_placeholders_found = set()
missing_fields = set()
for i, template in enumerate(templates, 1):
template_id = template['id']
template_name = template['name']
file_path = template['file_path']
if i % 20 == 0:
print(f" 处理进度: {i}/{len(templates)}")
# Does the local file exist?
local_file = PROJECT_ROOT / file_path
if not local_file.exists():
total_errors += 1
continue
# Extract placeholders
placeholders = extract_placeholders_from_docx(local_file)
all_placeholders_found.update(placeholders)
# Map placeholders to output-field IDs
output_field_ids = []
for placeholder in placeholders:
if placeholder in output_fields:
output_field_ids.append(output_fields[placeholder])
else:
# Field does not exist; try to create it
missing_fields.add(placeholder)
field_id = create_missing_output_field(conn, tenant_id, placeholder)
if field_id:
output_fields[placeholder] = field_id
output_field_ids.append(field_id)
# Create the relations
all_field_ids = input_field_ids + output_field_ids
if not dry_run and all_field_ids:
cursor = conn.cursor()
try:
for field_id in all_field_ids:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
"""
cursor.execute(insert_sql, (
relation_id,
tenant_id,
template_id,
field_id,
CREATED_BY,
UPDATED_BY
))
conn.commit()
total_updated += 1
except Exception as e:
conn.rollback()
total_errors += 1
finally:
cursor.close()
else:
total_updated += 1
# 6. Summarize results
print_section("字段关联同步结果")
print(f" 总模板数: {len(templates)}")
print(f" 已处理: {total_updated}")
print(f" 错误: {total_errors}")
print(f" 发现的占位符总数: {len(all_placeholders_found)}")
print(f" 创建的字段数: {len(missing_fields)}")
return {
'total_templates': len(templates),
'updated': total_updated,
'errors': total_errors,
'placeholders_found': len(all_placeholders_found),
'fields_created': len(missing_fields)
}
def main():
"""主函数"""
print_section("清理并重新同步模板数据")
# Load config
config = get_db_config_from_args()
# Show config
print_section("配置信息")
print(f" 数据库服务器: {config['host']}:{config['port']}")
print(f" 数据库名称: {config['database']}")
print(f" 用户名: {config['user']}")
print(f" 租户ID: {config['tenant_id']}")
print(f" 预览模式: {'' if config['dry_run'] else ''}")
print(f" 跳过清理: {'' if config['skip_clean'] else ''}")
if config['dry_run']:
print("\n[注意] 当前为预览模式,不会实际更新数据库")
# Confirm
if not config.get('dry_run'):
print("\n[警告] 此操作将清理指定租户下的旧数据并重新同步")
confirm = input("确认执行?[yes/N]: ").strip().lower()
if confirm != 'yes':
print("已取消")
return
# Connect to the database
print_section("连接数据库")
conn = test_db_connection(config)
if not conn:
return
print_result(True, "数据库连接成功")
try:
tenant_id = config['tenant_id']
dry_run = config['dry_run']
skip_clean = config['skip_clean']
results = {}
# 1. Scan local templates
print_section("扫描本地模板")
local_templates = scan_local_templates()
print_result(True, f"找到 {len(local_templates)} 个本地模板文件")
# 2. Clean stale data
if not skip_clean:
clean_result = clean_old_data(conn, tenant_id, local_templates, dry_run)
results['clean'] = clean_result
else:
print_section("跳过清理步骤")
print(" 已跳过清理步骤")
# 3. Sync the template hierarchy
hierarchy_result = sync_template_hierarchy(conn, tenant_id, dry_run)
results['hierarchy'] = hierarchy_result
# 4. Sync field relations
fields_result = sync_field_relations(conn, tenant_id, dry_run)
results['fields'] = fields_result
# 5. Summary
print_section("同步完成")
if config['dry_run']:
print(" 本次为预览模式,未实际更新数据库")
else:
print(" 数据库已更新")
if 'clean' in results:
c = results['clean']
print(f"\n 清理结果:")
print(f" - 总模板数: {c['total']}")
print(f" - 删除模板: {c['deleted']}")
print(f" * MinIO路径: {c['minio_paths']}")
print(f" * 无效路径: {c['invalid_paths']}")
print(f" * 重复路径: {c['duplicate_paths']}")
if 'hierarchy' in results and results['hierarchy']:
h = results['hierarchy']
print(f"\n 层级结构:")
print(f" - 创建目录: {h['directories_created']}")
print(f" - 更新目录: {h['directories_updated']}")
print(f" - 创建文件: {h['files_created']}")
print(f" - 更新文件: {h['files_updated']}")
if 'fields' in results and results['fields']:
f = results['fields']
print(f"\n 字段关联:")
print(f" - 总模板数: {f['total_templates']}")
print(f" - 已处理: {f['updated']}")
print(f" - 发现的占位符: {f['placeholders_found']}")
print(f" - 创建的字段: {f['fields_created']}")
finally:
conn.close()
print_result(True, "数据库连接已关闭")
if __name__ == "__main__":
try:
main()
except KeyboardInterrupt:
print("\n\n[中断] 用户取消操作")
sys.exit(0)
except Exception as e:
print(f"\n[错误] 发生异常: {str(e)}")
import traceback
traceback.print_exc()
sys.exit(1)


@ -0,0 +1,361 @@
"""
Clean duplicate and invalid rows in the f_polic_file_config table,
keeping the document-template structure in step with the template_finish/ folder.
"""
import os
import re
import json
import pymysql
from pathlib import Path
from typing import Dict, List, Set, Optional
from collections import defaultdict
# Database connection config
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
TEMPLATE_BASE_DIR = 'template_finish'
def normalize_template_name(name: str) -> str:
"""
Normalize a template name: strip the extension, parenthesized content, numeric prefixes, etc.
Args:
name: file name or template name
Returns:
the normalized name
"""
# Strip the extension
name = Path(name).stem if '.' in name else name
# Strip full-width parenthesized content
name = re.sub(r'[(].*?[)]', '', name)
name = name.strip()
# Strip numeric prefixes and dots
name = re.sub(r'^\d+[\.\-]?\s*', '', name)
name = name.strip()
return name
def scan_template_files(base_dir: str) -> Dict[str, Dict]:
"""
Scan the template folder and collect all valid template files.
Returns:
dict keyed by normalized name; each value is a list of template infos (several files may share a name)
"""
base_path = Path(base_dir)
if not base_path.exists():
print(f"错误: 目录不存在 - {base_dir}")
return {}
templates = defaultdict(list)
print("=" * 80)
print("扫描模板文件...")
print("=" * 80)
for docx_file in sorted(base_path.rglob("*.docx")):
# Skip Office temp files
if docx_file.name.startswith("~$"):
continue
relative_path = docx_file.relative_to(base_path)
file_name = docx_file.name
normalized_name = normalize_template_name(file_name)
templates[normalized_name].append({
'file_path': str(docx_file),
'relative_path': str(relative_path),
'file_name': file_name,
'normalized_name': normalized_name
})
print(f"总共扫描到 {sum(len(v) for v in templates.values())} 个模板文件")
print(f"唯一模板名称: {len(templates)}")
return dict(templates)
def get_all_templates_from_db(conn) -> Dict[str, List[Dict]]:
"""
Fetch all templates from the database, grouped by normalized name.
Returns:
dict keyed by normalized name; each value is a list of template records
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, file_path, parent_id, state, input_data, created_time, updated_time
FROM f_polic_file_config
WHERE tenant_id = %s
ORDER BY created_time DESC
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
result = defaultdict(list)
for template in templates:
normalized_name = normalize_template_name(template['name'])
result[normalized_name].append({
'id': template['id'],
'name': template['name'],
'normalized_name': normalized_name,
'file_path': template['file_path'],
'parent_id': template['parent_id'],
'state': template['state'],
'input_data': template['input_data'],
'created_time': template['created_time'],
'updated_time': template['updated_time']
})
cursor.close()
return dict(result)
def find_duplicates(db_templates: Dict[str, List[Dict]]) -> Dict[str, List[Dict]]:
"""
Find duplicate templates (same normalized name, multiple records).
Returns:
dict keyed by normalized name; each value is the list of duplicate records
"""
duplicates = {}
for normalized_name, templates in db_templates.items():
if len(templates) > 1:
duplicates[normalized_name] = templates
return duplicates
def select_best_template(templates: List[Dict], valid_template_files: List[Dict]) -> Optional[Dict]:
"""
Pick the best record among duplicates (keep the newest valid one).
Args:
templates: template records from the database
valid_template_files: list of valid template files (currently unused in the body)
Returns:
the template record to keep, or None
"""
if not templates:
return None
# Prefer records with state=1 and a valid file_path
enabled_templates = [t for t in templates if t.get('state') == 1]
if enabled_templates:
# Several enabled records: keep the newest
enabled_templates.sort(key=lambda x: x.get('updated_time') or x.get('created_time'), reverse=True)
return enabled_templates[0]
# No enabled records: keep the newest
templates.sort(key=lambda x: x.get('updated_time') or x.get('created_time'), reverse=True)
return templates[0]
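The selection rule above can be reproduced with plain dictionaries: enabled (`state=1`) records win, and ties break on the most recent timestamp.

```python
from datetime import datetime

# Replays select_best_template()'s rule on sample records: prefer
# state=1, then the newest updated_time/created_time.
templates = [
    {'id': 1, 'state': 0, 'updated_time': datetime(2025, 12, 18)},
    {'id': 2, 'state': 1, 'updated_time': datetime(2025, 12, 11)},
    {'id': 3, 'state': 1, 'updated_time': datetime(2025, 12, 15)},
]
pool = [t for t in templates if t.get('state') == 1] or templates
pool.sort(key=lambda t: t.get('updated_time') or t.get('created_time'), reverse=True)
print(pool[0]['id'])  # 3
```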
def delete_template_and_relations(conn, template_id: int):
"""
Delete a template and its field relations.
Args:
conn: database connection
template_id: template ID
"""
cursor = conn.cursor()
try:
# Delete field relations
delete_relations_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_relations_sql, (TENANT_ID, template_id))
relations_deleted = cursor.rowcount
# Delete the template config row
delete_template_sql = """
DELETE FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
"""
cursor.execute(delete_template_sql, (TENANT_ID, template_id))
template_deleted = cursor.rowcount
conn.commit()
return relations_deleted, template_deleted
except Exception as e:
conn.rollback()
raise Exception(f"删除模板失败: {str(e)}")
finally:
cursor.close()
def mark_invalid_templates(conn, valid_template_names: Set[str]):
"""
Mark invalid templates (those not present in the template_finish folder).
Args:
conn: database connection
valid_template_names: set of valid (normalized) template names
"""
cursor = conn.cursor()
try:
# Fetch all templates
sql = """
SELECT id, name FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
all_templates = cursor.fetchall()
invalid_count = 0
for template in all_templates:
template_id = template[0]
template_name = template[1]
normalized_name = normalize_template_name(template_name)
# Is it in the valid-template set?
if normalized_name not in valid_template_names:
# Mark as disabled
update_sql = """
UPDATE f_polic_file_config
SET state = 0, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (UPDATED_BY, template_id, TENANT_ID))
invalid_count += 1
print(f" [WARN] 标记无效模板: {template_name} (ID: {template_id})")
conn.commit()
print(f"\n总共标记 {invalid_count} 个无效模板")
except Exception as e:
conn.rollback()
raise Exception(f"标记无效模板失败: {str(e)}")
finally:
cursor.close()
def main():
"""主函数"""
print("=" * 80)
print("清理重复和无效的模板数据")
print("=" * 80)
print()
try:
# Connect to the database
print("1. 连接数据库...")
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功\n")
# Scan template files
print("2. 扫描模板文件...")
valid_templates = scan_template_files(TEMPLATE_BASE_DIR)
valid_template_names = set(valid_templates.keys())
print(f"[OK] 找到 {len(valid_template_names)} 个有效模板名称\n")
# Fetch templates from the database
print("3. 获取数据库中的模板...")
db_templates = get_all_templates_from_db(conn)
print(f"[OK] 数据库中有 {sum(len(v) for v in db_templates.values())} 个模板记录")
print(f"[OK] 唯一模板名称: {len(db_templates)}\n")
# Find duplicates
print("4. 查找重复的模板...")
duplicates = find_duplicates(db_templates)
print(f"[OK] 找到 {len(duplicates)} 个重复的模板名称\n")
# Handle duplicates
print("5. 处理重复模板...")
print("=" * 80)
total_deleted = 0
total_relations_deleted = 0
for normalized_name, templates in duplicates.items():
print(f"\n处理重复模板: {normalized_name}")
print(f" 重复记录数: {len(templates)}")
# The valid files for this normalized name
valid_files = valid_templates.get(normalized_name, [])
# Choose the record to keep
keep_template = select_best_template(templates, valid_files)
if keep_template:
print(f" [KEEP] 保留模板: {keep_template['name']} (ID: {keep_template['id']})")
# Delete the other duplicates
for template in templates:
if template['id'] != keep_template['id']:
print(f" [DELETE] 删除重复模板: {template['name']} (ID: {template['id']})")
relations_deleted, template_deleted = delete_template_and_relations(conn, template['id'])
total_relations_deleted += relations_deleted
total_deleted += template_deleted
else:
print(f" [WARN] 无法确定要保留的模板,跳过")
print(f"\n[OK] 删除重复模板: {total_deleted}")
print(f"[OK] 删除关联关系: {total_relations_deleted}\n")
# Mark invalid templates
print("6. 标记无效模板...")
mark_invalid_templates(conn, valid_template_names)
# Final stats
print("\n7. 统计最终结果...")
final_templates = get_all_templates_from_db(conn)
enabled_count = sum(1 for templates in final_templates.values()
for t in templates if t.get('state') == 1)
disabled_count = sum(1 for templates in final_templates.values()
for t in templates if t.get('state') != 1)
print(f"[OK] 最终模板总数: {sum(len(v) for v in final_templates.values())}")
print(f"[OK] 启用模板数: {enabled_count}")
print(f"[OK] 禁用模板数: {disabled_count}")
print(f"[OK] 唯一模板名称: {len(final_templates)}")
# Print the final (enabled) template list
print("\n8. 最终模板列表(启用的):")
print("=" * 80)
for normalized_name, templates in sorted(final_templates.items()):
enabled = [t for t in templates if t.get('state') == 1]
if enabled:
for template in enabled:
print(f" - {template['name']} (ID: {template['id']})")
print("\n" + "=" * 80)
print("清理完成!")
print("=" * 80)
except Exception as e:
print(f"\n[ERROR] 发生错误: {e}")
import traceback
traceback.print_exc()
if 'conn' in locals():
conn.rollback()
finally:
if 'conn' in locals():
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -1,21 +1,34 @@
 {
   "prompt_template": {
-    "intro": "请从以下输入文本中提取结构化信息。",
+    "intro": "请从以下输入文本中提取结构化信息。仔细分析文本内容,准确提取每个字段的值。\n\n⚠ 重要提醒:请逐字逐句仔细阅读输入文本,不要遗漏任何信息。对于性别、年龄、职务、单位、文化程度等字段,请特别仔细查找,这些信息可能以各种形式出现在文本中。",
     "input_text_label": "输入文本:",
-    "output_fields_label": "需要提取的字段",
-    "json_format_label": "请严格按照以下JSON格式返回结果只返回JSON,不要包含其他文字说明:",
-    "requirements_label": "要求:",
+    "output_fields_label": "需要提取的字段(请仔细分析每个字段,确保提取完整)",
+    "json_format_label": "请严格按照以下JSON格式返回结果只返回JSON对象,不要包含任何其他文字说明或markdown代码块标记",
+    "requirements_label": "重要要求(请严格遵守)",
     "requirements": [
-      "仔细分析输入文本,准确提取每个字段的值",
-      "如果某个字段在输入文本中找不到对应信息,该字段值设为空字符串\"\"",
-      "日期格式统一为YYYYMM198005表示1980年5月如果包含日期信息则格式为YYYYMMDD",
-      "性别统一为\"男\"或\"女\",不要使用\"男性\"或\"女性\"",
-      "政治面貌使用标准表述(如:中共党员、中共预备党员、共青团员、群众等)",
+      "⚠️ 逐字逐句仔细分析输入文本,不要遗漏任何信息。请特别关注性别、年龄、职务、单位、文化程度等字段",
+      "对于每个字段,请从多个角度思考:直接提及、同义词、隐含信息、可推断信息。例如:性别可能以\"男\"、\"女\"、\"男性\"、\"女性\"、\"先生\"、\"女士\"等形式出现",
+      "如果文本中明确提到某个信息(如\"30岁\"、\"男\"、\"总经理\"、\"某公司\"等),必须提取出来,不能设为空",
+      "如果可以通过已有信息合理推断,请进行推断并填写:\n - 根据出生年月如1980年05月和当前年份2024年计算年龄44岁\n - 从单位及职务(如\"某公司总经理\")中拆分单位(\"某公司\")和职务(\"总经理\"\n - 从工作基本情况中提取性别、文化程度等信息",
+      "如果某个字段在输入文本中确实找不到任何相关信息,该字段值才设为空字符串\"\"",
+      "日期格式统一为中文格式YYYY年MM月1980年05月表示1980年5月如果包含日期信息则格式为YYYY年MM月DD日1985年05月17日。注意年份必须是4位数字月份和日期必须是2位数字如1980年5月应格式化为1980年05月不是1980年5月",
+      "性别统一为\"男\"或\"女\",不要使用\"男性\"或\"女性\"。如果文本中提到\"男性\"、\"男\"、\"先生\"等,统一转换为\"男\";如果提到\"女性\"、\"女\"、\"女士\"等,统一转换为\"女\"",
+      "年龄字段:如果文本中直接提到年龄(如\"30岁\"、\"30周岁\"直接提取数字如果只有出生年月可以根据当前年份计算年龄当前年份为2024年",
+      "单位及职务字段:如果文本中提到\"XX公司总经理\"、\"XX单位XX职务\"等,需要同时提取单位名称和职务名称",
+      "单位字段:从单位及职务信息中提取单位名称部分(如\"XX公司\"、\"XX局\"、\"XX部门\"等)",
+      "职务字段:从单位及职务信息中提取职务名称部分(如\"总经理\"、\"局长\"、\"主任\"等)",
+      "文化程度字段:注意识别\"本科\"、\"大专\"、\"高中\"、\"中专\"、\"研究生\"、\"硕士\"、\"博士\"等表述",
+      "政治面貌使用标准表述(如:中共党员、中共预备党员、共青团员、群众等)。如果文本中提到\"党员\",统一转换为\"中共党员\"",
       "职级使用标准表述(如:正处级、副处级、正科级、副科级等)",
+      "线索来源字段:注意识别\"举报\"、\"来信\"、\"来电\"、\"网络举报\"、\"上级交办\"等表述",
+      "主要问题线索字段:提取文本中关于问题、线索、举报内容等的描述",
       "身份证号码只提取数字,不包含其他字符",
       "联系方式提取电话号码,格式化为纯数字",
       "地址信息保持完整,包含省市区街道等详细信息",
-      "只返回JSON对象不要包含markdown代码块标记"
+      "只返回JSON对象不要包含markdown代码块标记、思考过程或其他说明文字",
+      "JSON格式要求所有字段名必须使用双引号字段名中不能包含前导点如不能使用\".target_gender\",应使用\"target_gender\"),字段名前后不能有空格",
+      "必须返回所有要求的字段即使值为空字符串也要包含在JSON中",
+      "字段名必须严格按照JSON示例中的字段编码不能随意修改或拼写错误如不能使用\"targetsProfessionalRank\",应使用\"target_professional_rank\""
     ]
   },
   "field_formatting": {
@@ -34,23 +47,32 @@
       "description": "被核查人员性别",
       "rules": [
         "只能返回\"男\"或\"女\"",
-        "如果文本中提到\"男性\"、\"男性公民\"等,统一转换为\"男\"",
-        "如果文本中提到\"女性\"、\"女性公民\"等,统一转换为\"女\""
+        "如果文本中提到\"男性\"、\"男性公民\"、\"男\"、\"先生\"等,统一转换为\"男\"",
+        "如果文本中提到\"女性\"、\"女性公民\"、\"女\"、\"女士\"等,统一转换为\"女\"",
+        "请仔细查找文本中所有可能表示性别的词汇,不要遗漏",
+        "如果文本中提到\"XXX...\"或\"XXX...\",必须提取性别",
+        "如果工作基本情况中提到性别信息,必须提取"
       ]
     },
     "target_date_of_birth": {
       "description": "被核查人员出生年月",
       "rules": [
-        "格式YYYYMM如198005表示1980年5月",
-        "如果只有年份月份设为01",
-        "如果文本中提到\"X年X月X日出生\",只提取年月,忽略日期"
+        "格式YYYY年MM月中文格式如1980年05月表示1980年5月注意月份必须是2位数字如5月应写为05月不是5月",
+        "如果只有年份月份设为01如1980年应格式化为1980年01月",
+        "如果文本中提到\"X年X月X日出生\",只提取年月,忽略日期",
+        "如果文本中提到\"1980年5月\",格式化为\"1980年05月\"(月份补零)",
+        "如果文本中提到\"1980年05月\",保持为\"1980年05月\"",
+        "年份必须是4位数字月份必须是2位数字01-12",
+        "输出格式示例1980年05月、1985年03月、1990年12月"
       ]
     },
     "target_date_of_birth_full": {
       "description": "被核查人员出生年月日",
       "rules": [
-        "格式YYYYMMDD如19800515表示1980年5月15日",
-        "如果只有年月日期设为01"
+        "格式YYYY年MM月DD日中文格式如1985年05月17日表示1985年5月17日",
+        "如果只有年月日期设为01如1980年05月应格式化为1980年05月01日",
+        "年份必须是4位数字月份和日期必须是2位数字01-12和01-31",
+        "输出格式示例1985年05月17日、1980年03月15日、1990年12月01日"
      ]
    },
    "target_political_status": {
@@ -99,6 +121,84 @@
       "学历使用标准表述:本科、大专、高中、中专、研究生等",
       "政治面貌部分:如果是中共党员,写\"加入中国共产党\";如果不是,省略此部分"
     ]
+    },
+    "target_age": {
+      "description": "被核查人员年龄",
+      "rules": [
+        "如果文本中直接提到年龄(如\"30岁\"、\"30周岁\"、\"年龄30\"、\"现年30\"),直接提取数字部分",
+        "如果无法抽取到年龄数据,但抽取到了\"被核查人员出生年月\",系统将根据出生年月和当前日期自动计算年龄",
+        "年龄格式:纯数字,单位为岁,如\"44\"表示44岁",
+        "如果文本中既没有直接提到年龄,也没有出生年月信息,则设为空字符串"
+      ]
+    },
+    "target_organization_and_position": {
+      "description": "被核查人员单位及职务(包括兼职)",
+      "rules": [
+        "提取完整的单位及职务信息,格式如:\"XX公司总经理\"、\"XX局XX处处长\"、\"XX单位XX职务\"",
+        "如果文本中提到\"XX公司总经理\"、\"XX单位XX职务\"等,需要完整提取",
+        "如果文本中分别提到单位和职务,需要组合成\"单位+职务\"的格式",
+        "如果文本中提到多个职务或兼职,需要全部包含,用\"、\"或\"兼\"连接",
+        "保持原文中的表述,不要随意修改"
+      ]
+    },
+    "target_organization": {
+      "description": "被核查人员单位",
+      "rules": [
+        "从单位及职务信息中提取单位名称部分",
+        "单位名称包括:公司、企业、机关、事业单位、部门等(如\"XX公司\"、\"XX局\"、\"XX部门\"、\"XX委员会\"等)",
+        "如果文本中只提到单位名称,直接提取",
+        "⚠️ 如果文本中提到\"XX公司总经理\",必须提取\"XX公司\"部分,不能设为空",
+        "如果文本中提到\"XX单位XX职务\",提取\"XX单位\"部分",
+        "如果已有单位及职务字段target_organization_and_position必须从中拆分出单位名称",
+        "保持单位名称的完整性,不要遗漏"
+      ]
+    },
+    "target_position": {
+      "description": "被核查人员职务",
+      "rules": [
+        "从单位及职务信息中提取职务名称部分",
+        "职务名称包括:总经理、经理、局长、处长、科长、主任、书记、部长等",
+        "如果文本中只提到职务名称,直接提取",
+        "⚠️ 如果文本中提到\"XX公司总经理\",必须提取\"总经理\"部分,不能设为空",
+        "如果文本中提到\"XX单位XX职务\",提取\"XX职务\"部分",
+        "如果已有单位及职务字段target_organization_and_position必须从中拆分出职务名称",
+        "如果文本中提到多个职务,需要全部提取,用\"、\"连接",
+        "保持职务名称的准确性"
+      ]
+    },
+    "target_education_level": {
+      "description": "被核查人员文化程度",
+      "rules": [
+        "识别文本中关于学历、文化程度的表述",
+        "标准表述包括:小学、初中、高中、中专、大专、本科、研究生、硕士、博士等",
+        "如果文本中提到\"大学\"、\"大学毕业\",通常指\"本科\"",
+        "如果文本中提到\"专科\",通常指\"大专\"",
+        "如果文本中提到\"研究生学历\",可以写\"研究生\"",
+        "保持标准表述,不要使用非标准表述"
+      ]
+    },
+    "clue_source": {
+      "description": "线索来源",
+      "rules": [
+        "识别文本中关于线索来源的表述",
+        "常见来源包括:举报、来信、来电、网络举报、上级交办、巡视发现、审计发现、媒体曝光等",
+        "如果文本中提到\"举报\"、\"被举报\",线索来源可能是\"举报\"或\"来信举报\"",
+        "如果文本中提到\"电话\"、\"来电\",线索来源可能是\"来电举报\"",
+        "如果文本中提到\"网络\"、\"网上\",线索来源可能是\"网络举报\"",
+        "如果文本中提到\"上级\"、\"交办\",线索来源可能是\"上级交办\"",
+        "如果文本中没有明确提到线索来源,但提到\"举报\"相关信息,可以推断为\"举报\"",
+        "保持标准表述"
+      ]
+    },
+    "target_issue_description": {
+      "description": "主要问题线索",
+      "rules": [
+        "提取文本中关于问题、线索、举报内容等的描述",
+        "包括但不限于:违纪违法问题、工作作风问题、经济问题、生活作风问题等",
+        "如果文本中提到\"问题\"、\"线索\"、\"举报\"、\"反映\"等关键词,提取相关内容",
+        "保持问题描述的完整性和准确性,不要遗漏重要信息",
+        "如果文本中没有明确的问题描述,但提到了相关情况,也要尽量提取"
+      ]
     }
   }
 }


@@ -0,0 +1,482 @@
"""
诊断MinIO文档生成问题
测试新MinIO服务器配置下的文档生成流程
"""
import os
import sys
from minio import Minio
from minio.error import S3Error
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
# 新MinIO配置用户提供
NEW_MINIO_CONFIG = {
'endpoint': '10.100.31.21:9000',
'access_key': 'minio_PC8dcY',
'secret_key': 'minio_7k7RNJ',
'secure': False # 重要根据测试结果应该是false
}
BUCKET_NAME = 'finyx'
TENANT_ID = 615873064429507639
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印测试结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def check_environment_variables():
"""检查环境变量配置"""
print_section("1. 检查环境变量配置")
env_vars = {
'MINIO_ENDPOINT': os.getenv('MINIO_ENDPOINT'),
'MINIO_ACCESS_KEY': os.getenv('MINIO_ACCESS_KEY'),
'MINIO_SECRET_KEY': os.getenv('MINIO_SECRET_KEY'),
'MINIO_BUCKET': os.getenv('MINIO_BUCKET'),
'MINIO_SECURE': os.getenv('MINIO_SECURE')
}
print("\n当前环境变量配置:")
for key, value in env_vars.items():
if key == 'MINIO_SECRET_KEY' and value:
# 隐藏密钥的部分内容
masked_value = value[:8] + '***' if len(value) > 8 else '***'
print(f" {key}: {masked_value}")
else:
print(f" {key}: {value}")
# 检查配置是否正确
issues = []
if env_vars['MINIO_ENDPOINT'] != NEW_MINIO_CONFIG['endpoint']:
issues.append(f"MINIO_ENDPOINT 应该是 '{NEW_MINIO_CONFIG['endpoint']}',当前是 '{env_vars['MINIO_ENDPOINT']}'")
if env_vars['MINIO_ACCESS_KEY'] != NEW_MINIO_CONFIG['access_key']:
issues.append(f"MINIO_ACCESS_KEY 应该是 '{NEW_MINIO_CONFIG['access_key']}',当前是 '{env_vars['MINIO_ACCESS_KEY']}'")
secure_value = env_vars['MINIO_SECURE']
if secure_value and secure_value.lower() == 'true':
issues.append(f"[WARN] MINIO_SECURE 设置为 'true'但新服务器使用HTTP应该设置为 'false'")
if issues:
print("\n[WARN] 发现配置问题:")
for issue in issues:
print(f" - {issue}")
print_result(False, "环境变量配置需要更新")
return False
else:
print_result(True, "环境变量配置正确")
return True
def test_minio_connection():
"""测试MinIO连接"""
print_section("2. 测试MinIO连接")
# 先尝试用户配置的secure值
secure_values = [False, True] # 优先尝试false根据测试结果
for secure in secure_values:
try:
print(f"\n尝试连接secure={secure}...")
client = Minio(
NEW_MINIO_CONFIG['endpoint'],
access_key=NEW_MINIO_CONFIG['access_key'],
secret_key=NEW_MINIO_CONFIG['secret_key'],
secure=secure
)
# 测试连接:列出存储桶
buckets = client.list_buckets()
print_result(True, f"MinIO连接成功secure={secure}")
print(f"\n 连接信息:")
print(f" 端点: {NEW_MINIO_CONFIG['endpoint']}")
print(f" 使用HTTPS: {secure}")
print(f" 访问密钥: {NEW_MINIO_CONFIG['access_key']}")
print(f"\n 可用存储桶:")
for bucket in buckets:
print(f" - {bucket.name} (创建时间: {bucket.creation_date})")
# 检查目标存储桶
bucket_exists = client.bucket_exists(BUCKET_NAME)
if bucket_exists:
print_result(True, f"存储桶 '{BUCKET_NAME}' 存在")
else:
print_result(False, f"存储桶 '{BUCKET_NAME}' 不存在")
print(f" 建议:需要创建存储桶 '{BUCKET_NAME}'")
return None, False
return client, True
except Exception as e:
error_msg = str(e)
if secure:
print_result(False, f"使用HTTPS连接失败: {error_msg}")
print(f" 将尝试使用HTTP连接...")
continue
else:
print_result(False, f"MinIO连接失败: {error_msg}")
import traceback
traceback.print_exc()
return None, False
return None, False
def test_template_download(client):
"""测试模板下载功能"""
print_section("3. 测试模板下载功能")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
try:
# 查询数据库获取一个模板文件路径
import pymysql
db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
conn = pymysql.connect(**db_config)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 查询一个启用的模板
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
AND file_path IS NOT NULL
AND file_path != ''
LIMIT 1
"""
cursor.execute(sql, (TENANT_ID,))
template = cursor.fetchone()
cursor.close()
conn.close()
if not template:
print_result(False, "数据库中没有找到可用的模板文件")
print(" 建议:检查数据库中的 f_polic_file_config 表")
return False
print(f"\n找到模板:")
print(f" ID: {template['id']}")
print(f" 名称: {template['name']}")
print(f" 文件路径: {template['file_path']}")
# 尝试下载模板
object_name = template['file_path'].lstrip('/')
print(f"\n尝试下载模板...")
print(f" 对象名称: {object_name}")
# 检查文件是否存在
try:
stat = client.stat_object(BUCKET_NAME, object_name)
print_result(True, f"模板文件存在(大小: {stat.size:,} 字节)")
except S3Error as e:
if e.code == 'NoSuchKey':
print_result(False, f"模板文件不存在: {object_name}")
print(f" 错误: {str(e)}")
print(f" 建议检查MinIO服务器上是否存在该文件")
return False
else:
raise
# 尝试下载(使用临时文件)
import tempfile
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.docx')
temp_file.close()
try:
client.fget_object(BUCKET_NAME, object_name, temp_file.name)
file_size = os.path.getsize(temp_file.name)
print_result(True, f"模板下载成功(大小: {file_size:,} 字节)")
# 清理临时文件
os.unlink(temp_file.name)
return True
except Exception as e:
print_result(False, f"模板下载失败: {str(e)}")
# 清理临时文件
if os.path.exists(temp_file.name):
os.unlink(temp_file.name)
return False
except Exception as e:
print_result(False, f"测试模板下载时出错: {str(e)}")
import traceback
traceback.print_exc()
return False
def test_file_upload(client):
"""测试文件上传功能"""
print_section("4. 测试文件上传功能")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
try:
# 创建一个测试文件
import tempfile
from datetime import datetime
test_content = b"Test document content for MinIO upload test"
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.docx')
temp_file.write(test_content)
temp_file.close()
print(f"\n创建测试文件: {temp_file.name}")
# 生成上传路径
now = datetime.now()
timestamp = f"{now.strftime('%Y%m%d%H%M%S')}{now.microsecond:06d}"
object_name = f"{TENANT_ID}/TEST/{timestamp}/test_upload.docx"
print(f"\n尝试上传文件...")
print(f" 对象名称: {object_name}")
# 上传文件
client.fput_object(
BUCKET_NAME,
object_name,
temp_file.name,
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
print_result(True, "文件上传成功")
# 验证文件是否存在
stat = client.stat_object(BUCKET_NAME, object_name)
print(f" 上传的文件大小: {stat.size:,} 字节")
# 清理测试文件
os.unlink(temp_file.name)
# 可选:删除测试文件
try:
client.remove_object(BUCKET_NAME, object_name)
print(f" 已清理测试文件: {object_name}")
except Exception:
pass
return True
except Exception as e:
print_result(False, f"文件上传失败: {str(e)}")
import traceback
traceback.print_exc()
# 清理临时文件
if 'temp_file' in locals() and os.path.exists(temp_file.name):
os.unlink(temp_file.name)
return False
def test_presigned_url(client):
"""测试预签名URL生成"""
print_section("5. 测试预签名URL生成")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
try:
# 使用一个测试对象名称
from datetime import datetime, timedelta
now = datetime.now()
timestamp = f"{now.strftime('%Y%m%d%H%M%S')}{now.microsecond:06d}"
test_object_name = f"{TENANT_ID}/TEST/{timestamp}/test_url.docx"
# 先创建一个测试文件
import tempfile
test_content = b"Test content"
temp_file = tempfile.NamedTemporaryFile(delete=False, suffix='.docx')
temp_file.write(test_content)
temp_file.close()
# 上传测试文件
client.fput_object(
BUCKET_NAME,
test_object_name,
temp_file.name,
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
os.unlink(temp_file.name)
print(f"\n生成预签名URL...")
print(f" 对象名称: {test_object_name}")
# 生成预签名URL
url = client.presigned_get_object(
BUCKET_NAME,
test_object_name,
expires=timedelta(days=7)
)
print_result(True, "预签名URL生成成功")
print(f"\n URL: {url[:100]}...")
# 清理测试文件
try:
client.remove_object(BUCKET_NAME, test_object_name)
except Exception:
pass
return True
except Exception as e:
print_result(False, f"预签名URL生成失败: {str(e)}")
import traceback
traceback.print_exc()
return False
def check_directory_structure(client):
"""检查目录结构MinIO是对象存储不需要创建目录"""
print_section("6. 检查目录结构")
if not client:
print_result(False, "MinIO客户端未连接跳过测试")
return False
print("\n说明MinIO是对象存储不需要创建目录。")
print("对象名称可以包含路径分隔符(如 '/'MinIO会自动处理。")
print("\n检查存储桶中的对象结构...")
try:
# 列出一些对象,查看目录结构
objects = client.list_objects(BUCKET_NAME, prefix=f"{TENANT_ID}/", recursive=False)
prefixes = set()
count = 0
for obj in objects:
count += 1
if count <= 20: # 只显示前20个
# 提取前缀(目录)
parts = obj.object_name.split('/')
if len(parts) > 1:
prefix = '/'.join(parts[:-1])
prefixes.add(prefix)
if prefixes:
print(f"\n发现的前缀目录结构前20个对象:")
for prefix in sorted(prefixes):
print(f" - {prefix}/")
print_result(True, f"存储桶结构正常(已检查 {count} 个对象)")
return True
except Exception as e:
print_result(False, f"检查目录结构失败: {str(e)}")
import traceback
traceback.print_exc()
return False
def print_recommendations():
"""打印修复建议"""
print_section("修复建议")
print("\n根据诊断结果,请执行以下步骤:")
print("\n1. 更新环境变量配置(.env文件或系统环境变量:")
print(" MINIO_ENDPOINT=10.100.31.21:9000")
print(" MINIO_ACCESS_KEY=minio_PC8dcY")
print(" MINIO_SECRET_KEY=minio_7k7RNJ")
print(" MINIO_BUCKET=finyx")
print(" MINIO_SECURE=false # [IMPORTANT] 重要必须是false不是true")
print("\n2. 确保存储桶存在:")
print(f" 存储桶名称: {BUCKET_NAME}")
print(" 如果不存在,需要创建存储桶")
print("\n3. 确保模板文件已上传到MinIO:")
print(" 检查数据库中的 f_polic_file_config 表的 file_path 字段")
print(" 确保对应的文件在MinIO服务器上存在")
print("\n4. 关于目录:")
print(" MinIO是对象存储不需要创建目录")
print(" 对象名称可以包含路径分隔符(如 '/'MinIO会自动处理")
print(" 例如: 615873064429507639/TEMPLATE/2024/12/template.docx")
print("\n5. 重启应用:")
print(" 更新环境变量后,需要重启应用服务才能生效")
print("\n[IMPORTANT] MINIO_SECURE=false # 注意必须是false不是true")
def main():
"""主函数"""
print("\n" + "="*70)
print(" MinIO文档生成问题诊断工具")
print("="*70)
print(f"\n新MinIO服务器配置:")
print(f" 端点: {NEW_MINIO_CONFIG['endpoint']}")
print(f" 存储桶: {BUCKET_NAME}")
print(f" 访问密钥: {NEW_MINIO_CONFIG['access_key']}")
print(f" 使用HTTPS: {NEW_MINIO_CONFIG['secure']}")
results = {}
try:
# 1. 检查环境变量
results['环境变量'] = check_environment_variables()
# 2. 测试MinIO连接
client, bucket_exists = test_minio_connection()
results['MinIO连接'] = client is not None and bucket_exists
if client and bucket_exists:
# 3. 测试模板下载
results['模板下载'] = test_template_download(client)
# 4. 测试文件上传
results['文件上传'] = test_file_upload(client)
# 5. 测试预签名URL
results['预签名URL'] = test_presigned_url(client)
# 6. 检查目录结构
results['目录结构'] = check_directory_structure(client)
# 总结
print_section("诊断总结")
print("\n测试结果:")
for test_name, success in results.items():
status = "[OK] 通过" if success else "[FAIL] 失败"
print(f" {test_name}: {status}")
passed = sum(1 for v in results.values() if v)
total = len(results)
print(f"\n通过率: {passed}/{total} ({passed*100//total if total > 0 else 0}%)")
if passed == total:
print("\n[OK] 所有测试通过MinIO配置正确文档生成应该可以正常工作。")
else:
print("\n[WARN] 部分测试失败,请查看上面的错误信息并按照建议进行修复。")
print_recommendations()
except KeyboardInterrupt:
print("\n\n诊断已中断")
except Exception as e:
print(f"\n[ERROR] 诊断过程中发生错误: {e}")
import traceback
traceback.print_exc()
print_recommendations()
if __name__ == '__main__':
main()
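The recommendations above boil down to reading the MinIO settings from the environment instead of hard-coding them. A minimal sketch of what that looks like in application code — the `.env` keys match the ones `check_environment_variables` prints, and the default endpoint is only an example taken from `NEW_MINIO_CONFIG`:

```python
import os


def parse_minio_secure(value):
    """Interpret the MINIO_SECURE environment string as a boolean.

    Only the literal string 'true' (any case, surrounding whitespace
    ignored) enables HTTPS; 'false', '', or an unset variable mean HTTP.
    """
    return (value or '').strip().lower() == 'true'


def minio_client_from_env():
    """Build a MinIO client from the same .env keys this script checks."""
    from minio import Minio  # lazy import; requires the 'minio' package
    return Minio(
        os.getenv('MINIO_ENDPOINT', '10.100.31.21:9000'),
        access_key=os.getenv('MINIO_ACCESS_KEY', ''),
        secret_key=os.getenv('MINIO_SECRET_KEY', ''),
        secure=parse_minio_secure(os.getenv('MINIO_SECURE')),
    )
```

Parsing `MINIO_SECURE` explicitly matters because environment values are strings, so a naive `bool(os.getenv('MINIO_SECURE'))` would treat `'false'` as truthy.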

enable_all_fields.py Normal file

@@ -0,0 +1,231 @@
"""
启用f_polic_field表中所有字段将state更新为1
"""
import pymysql
import os
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
def check_field_states(conn):
"""检查字段状态统计"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 统计各状态的字段数量使用CAST来正确处理二进制类型
sql = """
SELECT
CAST(state AS UNSIGNED) as state_int,
field_type,
COUNT(*) as count
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY CAST(state AS UNSIGNED), field_type
ORDER BY field_type, CAST(state AS UNSIGNED)
"""
cursor.execute(sql, (TENANT_ID,))
stats = cursor.fetchall()
cursor.close()
return stats
def get_fields_by_state(conn, state):
"""获取指定状态的字段列表"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type, CAST(state AS UNSIGNED) as state_int
FROM f_polic_field
WHERE tenant_id = %s
AND CAST(state AS UNSIGNED) = %s
ORDER BY field_type, name
"""
cursor.execute(sql, (TENANT_ID, state))
fields = cursor.fetchall()
cursor.close()
return fields
def enable_all_fields(conn, dry_run=True):
"""启用所有字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 查询需要更新的字段使用CAST来正确处理二进制类型
sql = """
SELECT id, name, filed_code, field_type, CAST(state AS UNSIGNED) as state_int
FROM f_polic_field
WHERE tenant_id = %s
AND CAST(state AS UNSIGNED) != 1
ORDER BY field_type, name
"""
cursor.execute(sql, (TENANT_ID,))
fields_to_update = cursor.fetchall()
if not fields_to_update:
print("✓ 所有字段已经是启用状态,无需更新")
cursor.close()
return 0
print(f"\n找到 {len(fields_to_update)} 个需要启用的字段:")
for field in fields_to_update:
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
print(f" - {field['name']} ({field['filed_code']}) [{field_type_str}] (当前state={field['state_int']})")
if dry_run:
print("\n⚠ 这是预览模式dry_run=True不会实际更新数据库")
cursor.close()
return len(fields_to_update)
# 执行更新使用CAST来正确比较
update_sql = """
UPDATE f_polic_field
SET state = 1, updated_time = %s, updated_by = %s
WHERE tenant_id = %s
AND CAST(state AS UNSIGNED) != 1
"""
cursor.execute(update_sql, (CURRENT_TIME, UPDATED_BY, TENANT_ID))
updated_count = cursor.rowcount
conn.commit()
cursor.close()
return updated_count
def main():
"""主函数"""
print("="*80)
print("启用f_polic_field表中所有字段")
print("="*80)
print()
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功")
except Exception as e:
print(f"✗ 数据库连接失败: {str(e)}")
return
try:
# 1. 检查当前状态统计
print("\n正在检查字段状态统计...")
stats = check_field_states(conn)
print("\n字段状态统计:")
total_fields = 0
enabled_fields = 0
disabled_fields = 0
for stat in stats:
state_int = stat['state_int']
field_type = stat['field_type']
count = stat['count']
total_fields += count
state_str = "启用" if state_int == 1 else "未启用"
type_str = "输出字段" if field_type == 2 else "输入字段"
print(f" {type_str} - {state_str} (state={state_int}): {count}")
if state_int == 1:
enabled_fields += count
else:
disabled_fields += count
print(f"\n总计: {total_fields} 个字段")
print(f" 启用: {enabled_fields}")
print(f" 未启用: {disabled_fields}")
# 2. 显示未启用的字段详情
if disabled_fields > 0:
print(f"\n正在查询未启用的字段详情...")
disabled_fields_list = get_fields_by_state(conn, 0)
print(f"\n未启用的字段列表 ({len(disabled_fields_list)} 个):")
for field in disabled_fields_list:
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
print(f" - {field['name']} ({field['filed_code']}) [{field_type_str}]")
# 3. 预览更新dry_run
print("\n" + "="*80)
print("预览更新(不会实际修改数据库)")
print("="*80)
count_to_update = enable_all_fields(conn, dry_run=True)
if count_to_update == 0:
print("\n所有字段已经是启用状态,无需更新")
return
# 4. 执行更新(此脚本不做交互确认,预览后直接执行)
print("\n" + "="*80)
print("准备执行更新")
print("="*80)
print(f"将更新 {count_to_update} 个字段的状态为启用state=1")
# 实际执行更新
print("\n正在执行更新...")
updated_count = enable_all_fields(conn, dry_run=False)
print(f"\n✓ 更新成功!共更新 {updated_count} 个字段")
# 5. 验证更新结果
print("\n正在验证更新结果...")
final_stats = check_field_states(conn)
print("\n更新后的字段状态统计:")
final_enabled = 0
final_disabled = 0
for stat in final_stats:
state_int = stat['state_int']
field_type = stat['field_type']
count = stat['count']
state_str = "启用" if state_int == 1 else "未启用"
type_str = "输出字段" if field_type == 2 else "输入字段"
print(f" {type_str} - {state_str} (state={state_int}): {count}")
if state_int == 1:
final_enabled += count
else:
final_disabled += count
print(f"\n总计: {final_enabled + final_disabled} 个字段")
print(f" 启用: {final_enabled}")
print(f" 未启用: {final_disabled}")
if final_disabled == 0:
print("\n✓ 所有字段已成功启用!")
else:
print(f"\n⚠ 仍有 {final_disabled} 个字段未启用")
print("\n" + "="*80)
print("操作完成!")
print("="*80)
except Exception as e:
print(f"\n✗ 处理失败: {str(e)}")
import traceback
traceback.print_exc()
conn.rollback()
finally:
conn.close()
if __name__ == '__main__':
main()
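The repeated `CAST(state AS UNSIGNED)` in the queries above works around PyMySQL returning MySQL BIT columns as `bytes` (for example `b'\x01'`), so a plain `state != 1` comparison misbehaves. If the cast were not applied in SQL, the equivalent normalization on the Python side would be a small helper along these lines (a sketch; it assumes the value fits the big-endian interpretation that `int.from_bytes` gives):

```python
def bit_to_int(value):
    r"""Normalize a BIT column value returned by PyMySQL.

    BIT columns come back as bytes objects such as b'\x00' or b'\x01';
    values from an already-CAST query arrive as plain integers and
    pass through unchanged.
    """
    if isinstance(value, bytes):
        return int.from_bytes(value, byteorder='big')
    return int(value)
```

This is the same idea as the `clean_query_result` helper used by the export script elsewhere in this changeset.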


@@ -0,0 +1,328 @@
"""
导出模板和字段关系到Excel表格
用于汇总整理模板和字段关系后续可以基于这个Excel表格新增数据并增加导入脚本
"""
import pymysql
import os
from dotenv import load_dotenv
from openpyxl import Workbook
from openpyxl.styles import Font, Alignment, PatternFill, Border, Side
from openpyxl.utils import get_column_letter
from datetime import datetime
import re
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST'),
'port': int(os.getenv('DB_PORT', 3306)),
'user': os.getenv('DB_USER'),
'password': os.getenv('DB_PASSWORD'),
'database': os.getenv('DB_NAME'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def clean_query_result(data):
"""清理查询结果,将 bytes 类型转换为字符串"""
if isinstance(data, bytes):
if len(data) == 1:
return int.from_bytes(data, byteorder='big')
try:
return data.decode('utf-8')
except UnicodeDecodeError:
return data.decode('utf-8', errors='ignore')
elif isinstance(data, dict):
return {key: clean_query_result(value) for key, value in data.items()}
elif isinstance(data, list):
return [clean_query_result(item) for item in data]
elif isinstance(data, (int, float, str, bool, type(None))):
return data
else:
return str(data)
def extract_template_category(file_path, template_name):
"""
从文件路径或模板名称提取模板的上级分类
例如/615873064429507639/TEMPLATE/2025/12/2-初核模版/2.谈话审批/走读式谈话审批/2谈话审批表.docx
提取为2-初核模版/2.谈话审批/走读式谈话审批
"""
category = ""
# 首先尝试从文件路径提取
if file_path:
# 移除开头的斜杠和租户ID部分
path = file_path.lstrip('/')
# 移除租户ID/TEMPLATE/年份/月份/部分
pattern = r'^\d+/TEMPLATE/\d+/\d+/(.+)'
match = re.match(pattern, path)
if match:
full_path = match.group(1)
# 移除文件名,只保留目录路径
if '/' in full_path:
category = '/'.join(full_path.split('/')[:-1])
# 如果路径格式不匹配,尝试其他方式
if not category and ('template_finish' in path.lower() or '初核' in path or '谈话' in path or '函询' in path):
# 尝试提取目录结构
parts = path.split('/')
result_parts = []
for part in parts:
if any(keyword in part for keyword in ['初核', '谈话', '函询', '模版', '模板']):
result_parts.append(part)
if result_parts:
category = '/'.join(result_parts[:-1]) if len(result_parts) > 1 else result_parts[0]
# 如果从路径无法提取,尝试从模板名称推断
if not category and template_name:
# 根据模板名称中的关键词推断分类
if '初核' in template_name:
if '谈话' in template_name:
category = '2-初核模版/2.谈话审批'
elif '请示' in template_name or '审批' in template_name:
category = '2-初核模版/1.初核请示'
elif '结论' in template_name or '报告' in template_name:
category = '2-初核模版/3.初核结论'
else:
category = '2-初核模版'
elif '谈话' in template_name:
if '函询' in template_name:
category = '1-谈话函询模板/函询模板'
else:
category = '1-谈话函询模板/谈话模版'
elif '函询' in template_name:
category = '1-谈话函询模板/函询模板'
return category
def get_all_templates_with_fields():
"""
获取所有模板及其关联的输入和输出字段
Returns:
list: 模板列表每个模板包含字段信息
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 查询所有启用的模板
cursor.execute("""
SELECT
fc.id AS template_id,
fc.name AS template_name,
fc.file_path
FROM f_polic_file_config fc
WHERE fc.tenant_id = %s
AND fc.state = 1
ORDER BY fc.name
""", (TENANT_ID,))
templates = cursor.fetchall()
templates = [clean_query_result(t) for t in templates]
result = []
for template in templates:
template_id = template['template_id']
template_name = template['template_name']
file_path = template.get('file_path', '')
# 提取模板上级分类
template_category = extract_template_category(file_path, template_name)
# 查询该模板关联的输入字段
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.file_id = %s
AND fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 1
ORDER BY f.name
""", (template_id, TENANT_ID))
input_fields = cursor.fetchall()
input_fields = [clean_query_result(f) for f in input_fields]
# 查询该模板关联的输出字段
cursor.execute("""
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.file_id = %s
AND fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
AND f.field_type = 2
ORDER BY f.name
""", (template_id, TENANT_ID))
output_fields = cursor.fetchall()
output_fields = [clean_query_result(f) for f in output_fields]
# 格式化字段信息
input_fields_str = '; '.join([f"{f['field_name']}({f['field_code']})" for f in input_fields])
output_fields_str = '; '.join([f"{f['field_name']}({f['field_code']})" for f in output_fields])
result.append({
'template_id': template_id,
'template_name': template_name,
'template_category': template_category,
'input_fields': input_fields,
'output_fields': output_fields,
'input_fields_str': input_fields_str,
'output_fields_str': output_fields_str,
'input_field_count': len(input_fields),
'output_field_count': len(output_fields)
})
return result
finally:
cursor.close()
conn.close()
def create_excel_file(templates_data, output_file='template_fields_export.xlsx'):
"""
创建Excel文件
Args:
templates_data: 模板数据列表
output_file: 输出文件名
"""
wb = Workbook()
ws = wb.active
ws.title = "模板字段关系"
# 设置表头
headers = ['模板ID', '模板名称', '模板上级', '输入字段', '输出字段', '输入字段数量', '输出字段数量']
ws.append(headers)
# 设置表头样式
header_fill = PatternFill(start_color="366092", end_color="366092", fill_type="solid")
header_font = Font(bold=True, color="FFFFFF", size=11)
header_alignment = Alignment(horizontal="center", vertical="center", wrap_text=True)
border = Border(
left=Side(style='thin'),
right=Side(style='thin'),
top=Side(style='thin'),
bottom=Side(style='thin')
)
for col_num, header in enumerate(headers, 1):
cell = ws.cell(row=1, column=col_num)
cell.fill = header_fill
cell.font = header_font
cell.alignment = header_alignment
cell.border = border
# 填充数据
data_font = Font(size=10)
data_alignment = Alignment(horizontal="left", vertical="top", wrap_text=True)
for template in templates_data:
row = [
template['template_id'],
template['template_name'],
template['template_category'],
template['input_fields_str'],
template['output_fields_str'],
template['input_field_count'],
template['output_field_count']
]
ws.append(row)
# 设置数据行样式
for col_num in range(1, len(headers) + 1):
cell = ws.cell(row=ws.max_row, column=col_num)
cell.font = data_font
cell.alignment = data_alignment
cell.border = border
# 设置列宽
ws.column_dimensions['A'].width = 18 # 模板ID
ws.column_dimensions['B'].width = 40 # 模板名称
ws.column_dimensions['C'].width = 50 # 模板上级
ws.column_dimensions['D'].width = 60 # 输入字段
ws.column_dimensions['E'].width = 80 # 输出字段
ws.column_dimensions['F'].width = 15 # 输入字段数量
ws.column_dimensions['G'].width = 15 # 输出字段数量
# 设置行高
ws.row_dimensions[1].height = 30 # 表头行高
for row_num in range(2, ws.max_row + 1):
ws.row_dimensions[row_num].height = 60 # 数据行高
# 冻结首行
ws.freeze_panes = 'A2'
# 保存文件
wb.save(output_file)
print(f"Excel文件已生成: {output_file}")
print(f"共导出 {len(templates_data)} 个模板")
def main():
"""主函数"""
print("开始导出模板和字段关系...")
print("=" * 80)
try:
# 获取所有模板及其字段
templates_data = get_all_templates_with_fields()
if not templates_data:
print("未找到任何模板数据")
return
print(f"共找到 {len(templates_data)} 个模板")
# 生成Excel文件
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
output_file = f"template_fields_export_{timestamp}.xlsx"
create_excel_file(templates_data, output_file)
# 打印统计信息
print("\n统计信息:")
print(f" 模板总数: {len(templates_data)}")
total_input_fields = sum(t['input_field_count'] for t in templates_data)
total_output_fields = sum(t['output_field_count'] for t in templates_data)
print(f" 输入字段总数: {total_input_fields}")
print(f" 输出字段总数: {total_output_fields}")
# 打印前几个模板的信息
print("\n前5个模板预览:")
for i, template in enumerate(templates_data[:5], 1):
print(f"\n{i}. {template['template_name']}")
print(f" 上级: {template['template_category']}")
print(f" 输入字段: {template['input_field_count']}")
print(f" 输出字段: {template['output_field_count']}")
if len(templates_data) > 5:
print(f"\n... 还有 {len(templates_data) - 5} 个模板")
except Exception as e:
print(f"导出失败: {str(e)}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
main()
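The module docstring anticipates an import script built on this Excel file. The 输入字段/输出字段 cells are written as `'; '`-joined `名称(编码)` strings, so an importer would first need to split them back into pairs — a sketch, assuming the exact `f"{field_name}({field_code})"` format (ASCII parentheses) used in `get_all_templates_with_fields`:

```python
import re

# Matches one '名称(编码)' pair, assuming the ASCII parentheses produced by
# f"{field_name}({field_code})" when the export file was written.
FIELD_RE = re.compile(r'(?P<name>[^;]+?)\((?P<code>[^()]+)\)')


def parse_field_column(cell_value):
    """Split a '名称(编码); 名称(编码)' Excel cell into (name, code) tuples."""
    if not cell_value:
        return []
    pairs = []
    for part in cell_value.split('; '):
        m = FIELD_RE.fullmatch(part.strip())
        if m:
            pairs.append((m.group('name'), m.group('code')))
    return pairs
```

Field names containing parentheses would need a more careful pattern; this sketch covers the format the exporter currently emits.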


@@ -0,0 +1,158 @@
"""
最终完善模板层级结构
修复文件路径错误和重复问题
"""
import pymysql
import time
import random
from pathlib import Path
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def generate_id():
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 检查"1请示报告卡"的记录
# 根据目录结构,应该有两个不同的文件:
# 1. "1.初核请示"下的"1.请示报告卡XXX.docx"
# 2. "走读式谈话审批"下的"1.请示报告卡(初核谈话).docx"
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
        ORDER BY id
    """, (TENANT_ID, '1请示报告卡'))
    results = cursor.fetchall()

    # Check whether a record already exists under "1.初核请示"
    in_initial_request = any(r['parent_id'] == 1765431558933731 for r in results)
    # Check whether a record already exists under "走读式谈话审批"
    in_interview_approval = any(r['parent_id'] == 1765273962700431 for r in results)

    if not in_initial_request:
        # Create the record under "1.初核请示"
        new_id = generate_id()
        insert_sql = """
            INSERT INTO f_polic_file_config
            (id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
            VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
        """
        cursor.execute(insert_sql, (
            new_id,
            TENANT_ID,
            1765431558933731,  # 1.初核请示
            '1请示报告卡',
            None,
            '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx',
            CREATED_BY,
            CREATED_BY,
            1
        ))
        print(f"[CREATE] Created a '1请示报告卡' record under '1.初核请示' (ID: {new_id})")

    if not in_interview_approval:
        # Create the record under "走读式谈话审批"
        new_id = generate_id()
        insert_sql = """
            INSERT INTO f_polic_file_config
            (id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
            VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
        """
        cursor.execute(insert_sql, (
            new_id,
            TENANT_ID,
            1765273962700431,  # 走读式谈话审批
            '1请示报告卡',
            None,
            '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',
            CREATED_BY,
            CREATED_BY,
            1
        ))
        print(f"[CREATE] Created a '1请示报告卡' record under '走读式谈话审批' (ID: {new_id})")

    # Update the file paths of the existing records
    for result in results:
        if result['parent_id'] == 1765431558933731:  # 1.初核请示
            correct_path = '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx'
        elif result['parent_id'] == 1765273962700431:  # 走读式谈话审批
            correct_path = '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx'
        else:
            continue
        if result['file_path'] != correct_path:
            cursor.execute("""
                UPDATE f_polic_file_config
                SET file_path = %s, updated_time = NOW(), updated_by = %s
                WHERE tenant_id = %s AND id = %s
            """, (correct_path, UPDATED_BY, TENANT_ID, result['id']))
            print(f"[UPDATE] Fixed the file path of '1请示报告卡' (ID: {result['id']}): {result['file_path']} -> {correct_path}")

    # Check for duplicate "XXX初核情况报告" records
    cursor.execute("""
        SELECT id, name, file_path, parent_id
        FROM f_polic_file_config
        WHERE tenant_id = %s AND name LIKE %s
        ORDER BY id
    """, (TENANT_ID, '%XXX初核情况报告%'))
    results = cursor.fetchall()

    if len(results) > 1:
        # Keep the correct record and delete the rest. Per the directory
        # structure the source file is "8.XXX初核情况报告.docx", and the record
        # name should carry no numeric prefix.
        correct_name = 'XXX初核情况报告'
        correct_path = '/615873064429507639/TEMPLATE/2025/12/8.XXX初核情况报告.docx'
        for r in results:
            if r['name'] == '8.XXX初核情况报告':
                # This one should be deleted (its name carries a numeric prefix)
                cursor.execute("""
                    DELETE FROM f_polic_file_field
                    WHERE tenant_id = %s AND file_id = %s
                """, (TENANT_ID, r['id']))
                cursor.execute("""
                    DELETE FROM f_polic_file_config
                    WHERE tenant_id = %s AND id = %s
                """, (TENANT_ID, r['id']))
                print(f"[DELETE] Removed duplicate record: {r['name']} (ID: {r['id']})")
            elif r['name'] == 'XXX初核情况报告':
                # Update this record's file path
                if r['file_path'] != correct_path:
                    cursor.execute("""
                        UPDATE f_polic_file_config
                        SET file_path = %s, updated_time = NOW(), updated_by = %s
                        WHERE tenant_id = %s AND id = %s
                    """, (correct_path, UPDATED_BY, TENANT_ID, r['id']))
                    print(f"[UPDATE] Updated the file path of 'XXX初核情况报告': {r['file_path']} -> {correct_path}")

    conn.commit()
    print("\n[OK] Fix complete")
except Exception as e:
    conn.rollback()
    print(f"[ERROR] Fix failed: {e}")
    import traceback
    traceback.print_exc()
finally:
    cursor.close()
    conn.close()
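The per-parent path corrections above can be factored into a single lookup table. A minimal sketch — the helper name `correct_path_for` is hypothetical; the IDs and paths are the ones hard-coded in the script:

```python
# Expected template path for each parent directory ID (values from the script above).
PATH_BY_PARENT = {
    1765431558933731: '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx',       # 1.初核请示
    1765273962700431: '/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',  # 走读式谈话审批
}

def correct_path_for(parent_id):
    """Return the expected file_path for a '1请示报告卡' record, or None for untracked parents."""
    return PATH_BY_PARENT.get(parent_id)
```

Centralizing the mapping keeps the insert branch and the update branch from drifting apart.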


@@ -0,0 +1,102 @@
"""
修复document_service.py中的tenant_id查询问题
问题get_file_config_by_id方法没有检查tenant_id导致查询可能失败
解决方案在查询中添加tenant_id检查
"""
import re
from pathlib import Path
def fix_document_service():
"""修复document_service.py中的查询逻辑"""
file_path = Path("services/document_service.py")
if not file_path.exists():
print(f"[错误] 文件不存在: {file_path}")
return False
# 读取文件
with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()
# 查找get_file_config_by_id方法
pattern = r'(def get_file_config_by_id\(self, file_id: int\) -> Optional\[Dict\]:.*?)(\s+sql = """.*?WHERE id = %s\s+AND state = 1\s+""".*?cursor\.execute\(sql, \(file_id,\)\))'
match = re.search(pattern, content, re.DOTALL)
if not match:
print("[错误] 未找到get_file_config_by_id方法或查询语句")
return False
old_code = match.group(0)
# 检查是否已经包含tenant_id
if 'tenant_id' in old_code:
print("[信息] 查询已经包含tenant_id检查无需修复")
return True
# 生成新的代码
new_sql = ''' sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE id = %s
AND tenant_id = %s
AND state = 1
"""
# 获取tenant_id从环境变量或请求中获取
tenant_id = self.tenant_id if self.tenant_id else os.getenv('TENANT_ID', '1')
try:
tenant_id = int(tenant_id)
except (ValueError, TypeError):
tenant_id = 1 # 默认值
cursor.execute(sql, (file_id, tenant_id))'''
# 替换
new_code = re.sub(
r'sql = """.*?WHERE id = %s\s+AND state = 1\s+""".*?cursor\.execute\(sql, \(file_id,\)\)',
new_sql,
old_code,
flags=re.DOTALL
)
new_content = content.replace(old_code, new_code)
# 检查是否需要导入os
if 'import os' not in new_content and 'os.getenv' in new_content:
# 在文件开头添加import os如果还没有
if 'from dotenv import load_dotenv' in new_content:
new_content = new_content.replace('from dotenv import load_dotenv', 'from dotenv import load_dotenv\nimport os')
elif 'import pymysql' in new_content:
new_content = new_content.replace('import pymysql', 'import pymysql\nimport os')
else:
# 在文件开头添加
lines = new_content.split('\n')
import_line = 0
for i, line in enumerate(lines):
if line.startswith('import ') or line.startswith('from '):
import_line = i + 1
lines.insert(import_line, 'import os')
new_content = '\n'.join(lines)
# 写回文件
with open(file_path, 'w', encoding='utf-8') as f:
f.write(new_content)
print("[成功] 已修复get_file_config_by_id方法添加了tenant_id检查")
return True
if __name__ == "__main__":
print("="*70)
print("修复document_service.py中的tenant_id查询问题")
print("="*70)
if fix_document_service():
print("\n修复完成!")
print("\n注意:")
print("1. 请确保.env文件中配置了TENANT_ID")
print("2. 或者确保应用程序在调用时正确传递tenant_id")
print("3. 建议在app.py中从请求中获取tenant_id并传递给document_service")
else:
print("\n修复失败,请手动检查代码")

176
fix_duplicate_fields.py Normal file

@@ -0,0 +1,176 @@
"""修复 f_polic_field 表中的重复字段"""
import pymysql
import os
from dotenv import load_dotenv
from collections import defaultdict
load_dotenv()
TENANT_ID = 615873064429507639
conn = pymysql.connect(
host=os.getenv('DB_HOST', '152.136.177.240'),
port=int(os.getenv('DB_PORT', 5012)),
user=os.getenv('DB_USER', 'finyx'),
password=os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
database=os.getenv('DB_NAME', 'finyx'),
charset='utf8mb4'
)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("=" * 80)
print("修复重复字段")
print("=" * 80)
# 1. 查找所有重复的 filed_code
cursor.execute("""
SELECT filed_code, COUNT(*) as cnt, GROUP_CONCAT(id ORDER BY id) as field_ids
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY filed_code
HAVING cnt > 1
""", (TENANT_ID,))
duplicate_codes = cursor.fetchall()
print(f"\n发现 {len(duplicate_codes)} 个重复的字段编码:\n")
for dup in duplicate_codes:
code = dup['filed_code']
field_ids = [int(x) for x in dup['field_ids'].split(',')]
print(f"\n处理字段编码: {code}")
print(f" 字段ID列表: {field_ids}")
# 获取每个字段的详细信息
placeholders = ','.join(['%s'] * len(field_ids))
cursor.execute(f"""
SELECT id, name, field_type, state
FROM f_polic_field
WHERE id IN ({placeholders})
ORDER BY id
""", field_ids)
fields = cursor.fetchall()
# 获取每个字段的关联关系
field_associations = {}
for field_id in field_ids:
cursor.execute("""
SELECT COUNT(*) as cnt, GROUP_CONCAT(file_id) as file_ids
FROM f_polic_file_field
WHERE filed_id = %s
""", (field_id,))
result = cursor.fetchone()
field_associations[field_id] = {
'count': result['cnt'] if result else 0,
'file_ids': result['file_ids'].split(',') if result and result['file_ids'] else []
}
print(f"\n 字段详情和关联关系:")
for field in fields:
assoc = field_associations[field['id']]
print(f" ID: {field['id']}, name: {field['name']}, "
f"field_type: {field['field_type']}, state: {field['state']}, "
f"关联模板数: {assoc['count']}")
# 选择保留的字段优先选择关联模板数最多的如果相同则选择ID较小的
fields_with_assoc = [(f, field_associations[f['id']]) for f in fields]
fields_with_assoc.sort(key=lambda x: (-x[1]['count'], x[0]['id']))
keep_field = fields_with_assoc[0][0]
remove_fields = [f for f, _ in fields_with_assoc[1:]]
print(f"\n 保留字段: ID={keep_field['id']}, name={keep_field['name']}, "
f"关联模板数={field_associations[keep_field['id']]['count']}")
print(f" 删除字段: {[f['id'] for f in remove_fields]}")
# 迁移关联关系:将删除字段的关联关系迁移到保留字段
for remove_field in remove_fields:
remove_id = remove_field['id']
keep_id = keep_field['id']
# 获取删除字段的所有关联
cursor.execute("""
SELECT file_id
FROM f_polic_file_field
WHERE filed_id = %s
""", (remove_id,))
remove_assocs = cursor.fetchall()
migrated_count = 0
skipped_count = 0
for assoc in remove_assocs:
file_id = assoc['file_id']
# 检查保留字段是否已经关联了这个文件
cursor.execute("""
SELECT COUNT(*) as cnt
FROM f_polic_file_field
WHERE filed_id = %s AND file_id = %s
""", (keep_id, file_id))
exists = cursor.fetchone()['cnt'] > 0
if not exists:
# 迁移关联关系
cursor.execute("""
UPDATE f_polic_file_field
SET filed_id = %s
WHERE filed_id = %s AND file_id = %s
""", (keep_id, remove_id, file_id))
migrated_count += 1
else:
# 如果已存在,直接删除重复的关联
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE filed_id = %s AND file_id = %s
""", (remove_id, file_id))
skipped_count += 1
print(f" 字段ID {remove_id} -> {keep_id}: 迁移 {migrated_count} 个关联, 跳过 {skipped_count} 个重复关联")
# 删除字段的所有关联关系(应该已经迁移或删除完毕)
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE filed_id = %s
""", (remove_id,))
# 删除字段本身
cursor.execute("""
DELETE FROM f_polic_field
WHERE id = %s
""", (remove_id,))
print(f" 已删除字段 ID {remove_id} 及其关联关系")
print("\n" + "=" * 80)
print("验证修复结果")
print("=" * 80)
# 再次检查是否还有重复
cursor.execute("""
SELECT filed_code, COUNT(*) as cnt
FROM f_polic_field
WHERE tenant_id = %s
GROUP BY filed_code
HAVING cnt > 1
""", (TENANT_ID,))
remaining_duplicates = cursor.fetchall()
if remaining_duplicates:
print(f"\n警告:仍有 {len(remaining_duplicates)} 个重复的字段编码:")
for dup in remaining_duplicates:
print(f" {dup['filed_code']}: {dup['cnt']}")
else:
print("\n[OK] 所有重复字段已修复filed_code 现在唯一")
# 提交事务
conn.commit()
print("\n[OK] 所有更改已提交到数据库")
cursor.close()
conn.close()
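The keep/remove selection rule above — most template associations wins, ties go to the smaller ID — boils down to one sort key. A sketch with a hypothetical `pick_keeper` helper operating on `(field_id, association_count)` pairs:

```python
def pick_keeper(fields):
    """Split (field_id, assoc_count) pairs into the id to keep and the ids to remove.

    The field with the most associations wins; ties go to the smaller field id.
    """
    ranked = sorted(fields, key=lambda f: (-f[1], f[0]))
    return ranked[0][0], [f[0] for f in ranked[1:]]
```

Isolating the rule in a pure function makes it easy to verify before it is allowed to delete database rows.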


@@ -0,0 +1,131 @@
"""
修复重复的"1请示报告卡"记录
确保每个文件在正确的位置只有一个记录
"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 查找所有"1请示报告卡"记录
cursor.execute("""
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
ORDER BY id
""", (TENANT_ID, '1请示报告卡'))
results = cursor.fetchall()
print(f"找到 {len(results)}'1请示报告卡'记录:\n")
# 根据file_path和parent_id判断哪些是正确的
correct_records = []
for r in results:
print(f"ID: {r['id']}, file_path: {r['file_path']}, parent_id: {r['parent_id']}")
# 判断是否正确
if r['parent_id'] == 1765431558933731: # 1.初核请示
if '1.请示报告卡XXX' in (r['file_path'] or ''):
correct_records.append(r)
elif r['parent_id'] == 1765273962700431: # 走读式谈话审批
if '1.请示报告卡(初核谈话)' in (r['file_path'] or ''):
correct_records.append(r)
print(f"\n正确的记录数: {len(correct_records)}")
# 删除不正确的记录
for r in results:
if r not in correct_records:
# 先删除关联关系
cursor.execute("""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
""", (TENANT_ID, r['id']))
# 删除模板记录
cursor.execute("""
DELETE FROM f_polic_file_config
WHERE tenant_id = %s AND id = %s
""", (TENANT_ID, r['id']))
print(f"[DELETE] 删除不正确的记录: ID {r['id']}, file_path: {r['file_path']}, parent_id: {r['parent_id']}")
# 确保两个位置都有正确的记录
has_initial_request = any(r['parent_id'] == 1765431558933731 for r in correct_records)
has_interview_approval = any(r['parent_id'] == 1765273962700431 for r in correct_records)
if not has_initial_request:
# 创建"1.初核请示"下的记录
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
new_id = timestamp * 1000 + random_part
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
new_id,
TENANT_ID,
1765431558933731, # 1.初核请示
'1请示报告卡',
None,
'/615873064429507639/TEMPLATE/2025/12/1.请示报告卡XXX.docx',
655162080928945152,
655162080928945152,
1
))
print(f"[CREATE] 在'1.初核请示'下创建'1请示报告卡'记录 (ID: {new_id})")
if not has_interview_approval:
# 创建"走读式谈话审批"下的记录
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
new_id = timestamp * 1000 + random_part
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
new_id,
TENANT_ID,
1765273962700431, # 走读式谈话审批
'1请示报告卡',
None,
'/615873064429507639/TEMPLATE/2025/12/1.请示报告卡(初核谈话).docx',
655162080928945152,
655162080928945152,
1
))
print(f"[CREATE] 在'走读式谈话审批'下创建'1请示报告卡'记录 (ID: {new_id})")
conn.commit()
print("\n[OK] 修复完成")
except Exception as e:
conn.rollback()
print(f"[ERROR] 修复失败: {e}")
import traceback
traceback.print_exc()
finally:
cursor.close()
conn.close()
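The correctness test above — a record survives only if its path carries the marker expected under its parent — can be isolated into a predicate. `EXPECTED_MARKER` and `is_correct_record` are hypothetical names; the IDs and markers come from the script:

```python
EXPECTED_MARKER = {
    1765431558933731: '1.请示报告卡XXX',          # under 1.初核请示
    1765273962700431: '1.请示报告卡(初核谈话)',    # under 走读式谈话审批
}

def is_correct_record(parent_id, file_path):
    """True only when the path contains the marker expected under this parent."""
    marker = EXPECTED_MARKER.get(parent_id)
    return bool(marker) and marker in (file_path or '')
```

The `(file_path or '')` guard mirrors the script's handling of NULL paths.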

147
fix_isolated_template.py Normal file

@@ -0,0 +1,147 @@
"""
修复孤立的模板文件有路径但无父级
"""
import os
import pymysql
from pathlib import Path
from dotenv import load_dotenv
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
UPDATED_BY = 655162080928945152
def get_actual_tenant_id(conn) -> int:
"""获取数据库中的实际tenant_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("SELECT DISTINCT tenant_id FROM f_polic_file_config LIMIT 1")
result = cursor.fetchone()
if result:
return result['tenant_id']
return 1
finally:
cursor.close()
def find_parent_directory(conn, tenant_id: int, file_path: str) -> int:
"""根据文件路径找到父目录ID"""
# 从文件路径中提取父目录路径
path_parts = file_path.split('/')
if len(path_parts) < 2:
return None
# 父目录路径(去掉文件名)
parent_path = '/'.join(path_parts[:-1])
parent_dir_name = path_parts[-2] # 父目录名称
# 查找父目录通过名称匹配且file_path为NULL
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name
FROM f_polic_file_config
WHERE tenant_id = %s
AND name = %s
AND file_path IS NULL
ORDER BY id
LIMIT 1
"""
cursor.execute(sql, (tenant_id, parent_dir_name))
result = cursor.fetchone()
if result:
return result['id']
return None
finally:
cursor.close()
def main():
"""主函数"""
print("="*70)
print("修复孤立的模板文件")
print("="*70)
try:
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功")
except Exception as e:
print(f"[FAIL] 数据库连接失败: {str(e)}")
return
try:
tenant_id = get_actual_tenant_id(conn)
print(f"实际tenant_id: {tenant_id}")
# 查找孤立的文件有路径但无父级且路径包含至少2级
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND file_path IS NOT NULL
AND parent_id IS NULL
AND file_path LIKE 'template_finish/%%/%%'
"""
cursor.execute(sql, (tenant_id,))
isolated_files = cursor.fetchall()
if not isolated_files:
print("[OK] 没有发现孤立的文件")
return
print(f"\n发现 {len(isolated_files)} 个孤立的文件:")
fixed_count = 0
for file in isolated_files:
print(f"\n 文件: {file['name']}")
print(f" ID: {file['id']}")
print(f" 路径: {file['file_path']}")
# 查找父目录
parent_id = find_parent_directory(conn, tenant_id, file['file_path'])
if parent_id:
# 更新parent_id
update_cursor = conn.cursor()
try:
update_cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (parent_id, UPDATED_BY, file['id'], tenant_id))
conn.commit()
print(f" [修复] 设置parent_id: {parent_id}")
fixed_count += 1
except Exception as e:
conn.rollback()
print(f" [错误] 更新失败: {str(e)}")
finally:
update_cursor.close()
else:
print(f" [警告] 未找到父目录")
print(f"\n[OK] 成功修复 {fixed_count} 个文件")
finally:
cursor.close()
finally:
conn.close()
print("[OK] 数据库连接已关闭")
if __name__ == "__main__":
main()
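`find_parent_directory` keys its database lookup on the second-to-last path component. That extraction step can be tested in isolation — `parent_dir_name` here is a hypothetical standalone version of it:

```python
def parent_dir_name(file_path):
    """Return the immediate parent directory name of a slash-separated path, or None."""
    parts = [p for p in file_path.split('/') if p]
    return parts[-2] if len(parts) >= 2 else None
```

Filtering out empty components also makes the helper tolerant of a leading slash, which the raw `split('/')` in the script is not.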

209
fix_minio_config.py Normal file

@@ -0,0 +1,209 @@
"""
修复MinIO配置
1. 创建或更新.env文件
2. 检查并迁移模板文件
"""
import os
from pathlib import Path
# 新MinIO配置
NEW_MINIO_CONFIG = {
'endpoint': '10.100.31.21:9000',
'access_key': 'minio_PC8dcY',
'secret_key': 'minio_7k7RNJ',
'secure': 'false', # 重要必须是false
'bucket': 'finyx'
}
def create_env_file():
"""创建或更新.env文件"""
env_file = Path('.env')
print("="*70)
print("创建/更新 .env 文件")
print("="*70)
# 读取现有.env文件如果存在
existing_vars = {}
if env_file.exists():
print(f"\n发现现有 .env 文件将更新MinIO相关配置...")
with open(env_file, 'r', encoding='utf-8') as f:
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, value = line.split('=', 1)
existing_vars[key.strip()] = value.strip()
else:
print(f"\n创建新的 .env 文件...")
# 更新MinIO配置
existing_vars['MINIO_ENDPOINT'] = NEW_MINIO_CONFIG['endpoint']
existing_vars['MINIO_ACCESS_KEY'] = NEW_MINIO_CONFIG['access_key']
existing_vars['MINIO_SECRET_KEY'] = NEW_MINIO_CONFIG['secret_key']
existing_vars['MINIO_BUCKET'] = NEW_MINIO_CONFIG['bucket']
existing_vars['MINIO_SECURE'] = NEW_MINIO_CONFIG['secure']
# 写入.env文件
with open(env_file, 'w', encoding='utf-8') as f:
f.write("# MinIO配置\n")
f.write(f"MINIO_ENDPOINT={NEW_MINIO_CONFIG['endpoint']}\n")
f.write(f"MINIO_ACCESS_KEY={NEW_MINIO_CONFIG['access_key']}\n")
f.write(f"MINIO_SECRET_KEY={NEW_MINIO_CONFIG['secret_key']}\n")
f.write(f"MINIO_BUCKET={NEW_MINIO_CONFIG['bucket']}\n")
f.write(f"MINIO_SECURE={NEW_MINIO_CONFIG['secure']} # 重要新服务器使用HTTP必须是false\n")
f.write("\n")
# 保留其他配置(如果有)
other_keys = set(existing_vars.keys()) - {
'MINIO_ENDPOINT', 'MINIO_ACCESS_KEY', 'MINIO_SECRET_KEY',
'MINIO_BUCKET', 'MINIO_SECURE'
}
if other_keys:
f.write("# 其他配置\n")
for key in sorted(other_keys):
f.write(f"{key}={existing_vars[key]}\n")
print(f"\n[OK] .env 文件已更新")
print(f"\n更新的配置:")
print(f" MINIO_ENDPOINT={NEW_MINIO_CONFIG['endpoint']}")
print(f" MINIO_ACCESS_KEY={NEW_MINIO_CONFIG['access_key']}")
print(f" MINIO_SECRET_KEY={NEW_MINIO_CONFIG['secret_key'][:8]}***")
print(f" MINIO_BUCKET={NEW_MINIO_CONFIG['bucket']}")
print(f" MINIO_SECURE={NEW_MINIO_CONFIG['secure']} # [IMPORTANT] 必须是false")
return True
def check_template_files():
"""检查模板文件是否存在"""
print("\n" + "="*70)
print("检查模板文件")
print("="*70)
try:
from minio import Minio
from minio.error import S3Error
import pymysql
from dotenv import load_dotenv
load_dotenv()
# 连接新MinIO
client = Minio(
NEW_MINIO_CONFIG['endpoint'],
access_key=NEW_MINIO_CONFIG['access_key'],
secret_key=NEW_MINIO_CONFIG['secret_key'],
secure=False
)
# 连接数据库
db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
conn = pymysql.connect(**db_config)
cursor = conn.cursor(pymysql.cursors.DictCursor)
# 查询所有模板
sql = """
SELECT id, name, file_path
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
AND file_path IS NOT NULL
AND file_path != ''
"""
cursor.execute(sql, (615873064429507639,))
templates = cursor.fetchall()
print(f"\n数据库中找到 {len(templates)} 个模板文件")
missing_files = []
existing_files = []
for template in templates:
object_name = template['file_path'].lstrip('/')
try:
stat = client.stat_object(NEW_MINIO_CONFIG['bucket'], object_name)
existing_files.append(template)
print(f" [OK] {template['name']} - 存在 ({stat.size:,} 字节)")
except S3Error as e:
if e.code == 'NoSuchKey':
missing_files.append(template)
print(f" [FAIL] {template['name']} - 不存在")
print(f" 路径: {object_name}")
cursor.close()
conn.close()
print(f"\n总结:")
print(f" 存在的文件: {len(existing_files)}")
print(f" 缺失的文件: {len(missing_files)}")
if missing_files:
print(f"\n[WARN] 发现 {len(missing_files)} 个模板文件在新MinIO服务器上不存在")
print(f"\n需要执行以下操作之一:")
print(f" 1. 从旧MinIO服务器迁移这些文件到新服务器")
print(f" 2. 重新上传这些模板文件到新MinIO服务器")
print(f"\n缺失的文件列表:")
for template in missing_files:
print(f" - {template['name']}")
print(f" 路径: {template['file_path']}")
return len(missing_files) == 0
except Exception as e:
print(f"\n[ERROR] 检查模板文件时出错: {str(e)}")
import traceback
traceback.print_exc()
return False
def main():
"""主函数"""
print("\n" + "="*70)
print("MinIO配置修复工具")
print("="*70)
try:
# 1. 创建/更新.env文件
create_env_file()
# 2. 检查模板文件
all_files_exist = check_template_files()
# 总结
print("\n" + "="*70)
print("修复总结")
print("="*70)
print("\n[OK] .env 文件已更新")
if all_files_exist:
print("[OK] 所有模板文件都存在")
print("\n下一步:")
print(" 1. 重启应用服务以使新的环境变量生效")
print(" 2. 测试文档生成功能")
else:
print("[WARN] 部分模板文件缺失")
print("\n下一步:")
print(" 1. 迁移或上传缺失的模板文件到新MinIO服务器")
print(" 2. 重启应用服务以使新的环境变量生效")
print(" 3. 测试文档生成功能")
print("\n重要提示:")
print(" - MINIO_SECURE 必须设置为 false新服务器使用HTTP")
print(" - 更新环境变量后必须重启应用才能生效")
except Exception as e:
print(f"\n[ERROR] 修复过程中发生错误: {e}")
import traceback
traceback.print_exc()
if __name__ == '__main__':
main()
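The .env reader above keeps only `KEY=VALUE` lines and splits on the first `=`, so values containing `=` survive intact. As a pure function that loop is easy to unit-test — `parse_env_lines` is a hypothetical extraction of it:

```python
def parse_env_lines(lines):
    """Parse KEY=VALUE lines, skipping blanks and comments; split on the first '=' only."""
    env = {}
    for line in lines:
        line = line.strip()
        if line and not line.startswith('#') and '=' in line:
            key, value = line.split('=', 1)
            env[key.strip()] = value.strip()
    return env
```

Note that this simple parser (like the script's) does not strip inline `# …` comments, which is why the script's habit of appending a comment after `MINIO_SECURE=false` can leak into the value on re-read.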


@@ -0,0 +1,191 @@
"""
修复缺失的 target_education_level 字段
检查并创建被核查人员文化程度字段
"""
import pymysql
import os
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 字段定义
FIELD_DEFINITION = {
'name': '被核查人员文化程度',
'field_code': 'target_education_level',
'field_type': 2, # 输出字段
'description': '被核查人员文化程度(如:本科、大专、高中等)'
}
def generate_id():
"""生成ID使用时间戳+随机数的方式,模拟雪花算法)"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
def check_field_exists(conn):
"""检查字段是否存在"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s AND filed_code = %s
"""
cursor.execute(sql, (TENANT_ID, FIELD_DEFINITION['field_code']))
field = cursor.fetchone()
cursor.close()
return field
def create_field(conn, dry_run: bool = True):
"""创建字段"""
cursor = conn.cursor()
field_id = generate_id()
insert_sql = """
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
"""
if dry_run:
print(f"[DRY RUN] 将创建字段:")
print(f" ID: {field_id}")
print(f" 名称: {FIELD_DEFINITION['name']}")
print(f" 编码: {FIELD_DEFINITION['field_code']}")
print(f" 类型: {FIELD_DEFINITION['field_type']} (输出字段)")
print(f" 状态: 1 (启用)")
else:
cursor.execute(insert_sql, (
field_id,
TENANT_ID,
FIELD_DEFINITION['name'],
FIELD_DEFINITION['field_code'],
FIELD_DEFINITION['field_type'],
CURRENT_TIME,
CREATED_BY,
CURRENT_TIME,
UPDATED_BY,
1 # state: 1表示启用
))
conn.commit()
print(f"✓ 成功创建字段: {FIELD_DEFINITION['name']} ({FIELD_DEFINITION['field_code']}), ID: {field_id}")
cursor.close()
return field_id
def update_field_state(conn, field_id, dry_run: bool = True):
"""更新字段状态为启用"""
cursor = conn.cursor()
update_sql = """
UPDATE f_polic_field
SET state = 1, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
if dry_run:
print(f"[DRY RUN] 将更新字段状态为启用: ID={field_id}")
else:
cursor.execute(update_sql, (UPDATED_BY, field_id, TENANT_ID))
conn.commit()
print(f"✓ 成功更新字段状态为启用: ID={field_id}")
cursor.close()
def main(dry_run: bool = True):
"""主函数"""
print("="*80)
print("修复缺失的 target_education_level 字段")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
else:
print("\n[实际执行模式 - 将修改数据库]")
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
# 检查字段是否存在
print("1. 检查字段是否存在...")
existing_field = check_field_exists(conn)
if existing_field:
print(f" ✓ 字段已存在:")
print(f" ID: {existing_field['id']}")
print(f" 名称: {existing_field['name']}")
print(f" 编码: {existing_field['filed_code']}")
print(f" 类型: {existing_field['field_type']} ({'输出字段' if existing_field['field_type'] == 2 else '输入字段'})")
print(f" 状态: {existing_field['state']} ({'启用' if existing_field['state'] == 1 else '未启用'})")
# 如果字段存在但未启用,启用它
if existing_field['state'] != 1:
print(f"\n2. 字段存在但未启用,将更新状态...")
update_field_state(conn, existing_field['id'], dry_run=dry_run)
else:
print(f"\n✓ 字段已存在且已启用,无需操作")
else:
print(f" ✗ 字段不存在,需要创建")
print(f"\n2. 创建字段...")
field_id = create_field(conn, dry_run=dry_run)
if not dry_run:
print(f"\n✓ 字段创建完成")
print("\n" + "="*80)
if dry_run:
print("\n这是DRY RUN模式未实际修改数据库。")
print("要实际执行,请运行: python fix_missing_education_level_field.py --execute")
else:
print("\n✓ 字段修复完成")
except Exception as e:
print(f"\n✗ 发生错误: {e}")
import traceback
traceback.print_exc()
if not dry_run:
conn.rollback()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
import sys
dry_run = '--execute' not in sys.argv
if not dry_run:
print("\n⚠ 警告: 这将修改数据库!")
response = input("确认要继续吗? (yes/no): ")
if response.lower() != 'yes':
print("操作已取消")
sys.exit(0)
main(dry_run=dry_run)
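The `generate_id` helper shared by these scripts is timestamp-based but, unlike a real snowflake ID, has no machine or sequence bits: the 6-digit random part spans roughly a thousand adjacent millisecond slots, so two calls can collide. A sketch of the same scheme with that caveat spelled out:

```python
import time
import random

def generate_id():
    """Millisecond timestamp * 1000 plus a 6-digit random part.

    Caveat: random.randint(100000, 999999) exceeds the 1000-wide
    per-millisecond slot, so uniqueness is probabilistic, not guaranteed
    (a true snowflake ID reserves dedicated sequence bits instead).
    """
    timestamp = int(time.time() * 1000)
    return timestamp * 1000 + random.randint(100000, 999999)
```

For one-off repair scripts the collision risk is acceptable; anything running at scale should use the database's own ID generator instead.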


@@ -0,0 +1,260 @@
"""
修复缺少字段关联的模板
为有 template_code 但没有字段关联的文件节点补充字段关联
"""
import os
import json
import pymysql
from typing import Dict, List
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def generate_id():
"""生成ID"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
def get_templates_without_relations(conn):
"""获取没有字段关联的文件节点"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT
fc.id,
fc.name,
fc.template_code,
fc.input_data,
COUNT(ff.id) as relation_count
FROM f_polic_file_config fc
LEFT JOIN f_polic_file_field ff ON fc.id = ff.file_id AND ff.tenant_id = fc.tenant_id
WHERE fc.tenant_id = %s
AND fc.template_code IS NOT NULL
AND fc.template_code != ''
GROUP BY fc.id, fc.name, fc.template_code, fc.input_data
HAVING relation_count = 0
ORDER BY fc.name
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
cursor.close()
return templates
def get_fields_by_code(conn):
"""获取所有字段,按字段编码索引"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {
'by_code': {},
'by_name': {}
}
for field in fields:
field_code = field['filed_code']
field_name = field['name']
result['by_code'][field_code] = field
result['by_name'][field_name] = field
cursor.close()
return result
def extract_fields_from_input_data(input_data: str) -> List[str]:
"""从 input_data 中提取字段编码列表"""
try:
data = json.loads(input_data) if isinstance(input_data, str) else input_data
if isinstance(data, dict):
return data.get('input_fields', [])
except:
pass
return []
def create_field_relations(conn, file_id: int, field_codes: List[str], field_type: int,
db_fields: Dict, dry_run: bool = True):
"""创建字段关联关系"""
cursor = conn.cursor()
try:
created_count = 0
for field_code in field_codes:
field = db_fields['by_code'].get(field_code)
if not field:
print(f" ⚠ 字段不存在: {field_code}")
continue
if field['field_type'] != field_type:
print(f" ⚠ 字段类型不匹配: {field_code} (期望 {field_type}, 实际 {field['field_type']})")
continue
if not dry_run:
# 检查是否已存在
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, file_id, field['id']))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, file_id, field['id'],
CREATED_BY, UPDATED_BY, 1
))
created_count += 1
print(f" ✓ 创建关联: {field['name']} ({field_code})")
else:
created_count += 1
print(f" [模拟] 将创建关联: {field_code}")
if not dry_run:
conn.commit()
return created_count
finally:
cursor.close()
def main():
"""主函数"""
print("="*80)
print("修复缺少字段关联的模板")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 获取没有字段关联的模板
print("查找缺少字段关联的模板...")
templates = get_templates_without_relations(conn)
print(f" 找到 {len(templates)} 个缺少字段关联的文件节点\n")
if not templates:
print("✓ 所有文件节点都有字段关联,无需修复")
return
# 获取所有字段
print("获取字段定义...")
db_fields = get_fields_by_code(conn)
print(f" 找到 {len(db_fields['by_code'])} 个字段\n")
# 显示需要修复的模板
print("需要修复的模板:")
for template in templates:
print(f" - {template['name']} (code: {template['template_code']})")
# 尝试从 input_data 中提取字段
print("\n" + "="*80)
print("分析并修复")
print("="*80)
fixable_count = 0
unfixable_count = 0
for template in templates:
print(f"\n处理: {template['name']}")
print(f" template_code: {template['template_code']}")
input_data = template.get('input_data')
if not input_data:
print(" ⚠ 没有 input_data无法自动修复")
unfixable_count += 1
continue
# 从 input_data 中提取输入字段
input_fields = extract_fields_from_input_data(input_data)
if not input_fields:
print(" ⚠ input_data 中没有 input_fields无法自动修复")
unfixable_count += 1
continue
print(f" 找到 {len(input_fields)} 个输入字段")
fixable_count += 1
# 创建输入字段关联
print(" 创建输入字段关联...")
created = create_field_relations(conn, template['id'], input_fields, 1, db_fields, dry_run=True)
print(f" 将创建 {created} 个输入字段关联")
print("\n" + "="*80)
print("统计")
print("="*80)
print(f" 可修复: {fixable_count}")
print(f" 无法自动修复: {unfixable_count}")
# 询问是否执行
if fixable_count > 0:
print("\n" + "="*80)
response = input("\n是否执行修复?(yes/no默认no): ").strip().lower()
if response == 'yes':
print("\n执行修复...")
for template in templates:
input_data = template.get('input_data')
if not input_data:
continue
input_fields = extract_fields_from_input_data(input_data)
if not input_fields:
continue
print(f"\n修复: {template['name']}")
create_field_relations(conn, template['id'], input_fields, 1, db_fields, dry_run=False)
print("\n" + "="*80)
print("✓ 修复完成!")
print("="*80)
else:
print("\n已取消修复")
else:
print("\n没有可以自动修复的模板")
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -0,0 +1,201 @@
"""
只修复真正包含中文的field_code字段
"""
import os
import pymysql
import re
from typing import Dict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
# 字段名称到field_code的映射针对剩余的中文字段
FIELD_MAPPING = {
# 谈话相关字段
'拟谈话地点': 'proposed_interview_location',
'拟谈话时间': 'proposed_interview_time',
'谈话事由': 'interview_reason',
'谈话人': 'interviewer',
'谈话人员-安全员': 'interview_personnel_safety_officer',
'谈话人员-组长': 'interview_personnel_leader',
'谈话人员-谈话人员': 'interview_personnel',
'谈话前安全风险评估结果': 'pre_interview_risk_assessment_result',
'谈话地点': 'interview_location',
'谈话次数': 'interview_count',
# 被核查人员相关字段
'被核查人单位及职务': 'target_organization_and_position', # 注意:这个和"被核查人员单位及职务"应该是同一个
'被核查人员交代问题程度': 'target_confession_level',
'被核查人员减压后的表现': 'target_behavior_after_relief',
'被核查人员学历': 'target_education', # 注意:这个和"被核查人员文化程度"可能不同
'被核查人员工作履历': 'target_work_history',
'被核查人员思想负担程度': 'target_mental_burden_level',
'被核查人员职业': 'target_occupation',
'被核查人员谈话中的表现': 'target_behavior_during_interview',
'被核查人员问题严重程度': 'target_issue_severity_level',
'被核查人员风险等级': 'target_risk_level',
'被核查人基本情况': 'target_basic_info',
# 其他字段
'补空人员': 'backup_personnel',
'记录人': 'recorder',
'评估意见': 'assessment_opinion',
}
def is_chinese(text: str) -> bool:
"""判断字符串是否完全或主要包含中文字符"""
if not text:
return False
# 如果包含中文字符且中文字符占比超过50%,认为是中文
chinese_chars = len(re.findall(r'[\u4e00-\u9fff]', text))
total_chars = len(text)
if total_chars == 0:
return False
return chinese_chars / total_chars > 0.3 # 如果中文字符占比超过30%,认为是中文
def fix_chinese_fields(dry_run: bool = True):
"""修复包含中文的field_code字段"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("修复包含中文的field_code字段")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# 查询所有字段
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name
""", (TENANT_ID,))
all_fields = cursor.fetchall()
# 找出field_code包含中文的字段
chinese_fields = []
for field in all_fields:
if field['filed_code'] and is_chinese(field['filed_code']):
chinese_fields.append(field)
print(f"\n找到 {len(chinese_fields)} 个field_code包含中文的字段:\n")
updates = []
for field in chinese_fields:
field_name = field['name']
new_code = FIELD_MAPPING.get(field_name)
if not new_code:
# 如果没有映射生成一个基于名称的code
new_code = field_name.lower()
new_code = new_code.replace('被核查人员', 'target_').replace('被核查人', 'target_')
new_code = new_code.replace('谈话', 'interview_')
new_code = new_code.replace('人员', '')
new_code = new_code.replace('时间', '_time')
new_code = new_code.replace('地点', '_location')
new_code = new_code.replace('问题', '_issue')
new_code = new_code.replace('情况', '_situation')
new_code = new_code.replace('程度', '_level')
new_code = new_code.replace('表现', '_behavior')
new_code = new_code.replace('等级', '_level')
new_code = new_code.replace('履历', '_history')
new_code = new_code.replace('学历', '_education')
new_code = new_code.replace('职业', '_occupation')
new_code = new_code.replace('事由', '_reason')
new_code = new_code.replace('次数', '_count')
new_code = new_code.replace('结果', '_result')
new_code = new_code.replace('意见', '_opinion')
new_code = re.sub(r'[^\w]', '_', new_code)
new_code = re.sub(r'_+', '_', new_code).strip('_')
new_code = new_code.replace('__', '_')
updates.append({
'id': field['id'],
'name': field_name,
'old_code': field['filed_code'],
'new_code': new_code,
'field_type': field['field_type']
})
print(f" ID: {field['id']}")
print(f" 名称: {field_name}")
print(f" 当前field_code: {field['filed_code']}")
print(f" 新field_code: {new_code}")
print()
# 检查是否有重复的new_code
code_to_fields = {}
for update in updates:
code = update['new_code']
if code not in code_to_fields:
code_to_fields[code] = []
code_to_fields[code].append(update)
duplicate_codes = {code: fields_list for code, fields_list in code_to_fields.items()
if len(fields_list) > 1}
if duplicate_codes:
print("\n⚠ 警告以下field_code会重复:")
for code, fields_list in duplicate_codes.items():
print(f" field_code: {code}")
for field in fields_list:
print(f" - ID: {field['id']}, 名称: {field['name']}")
print()
# 执行更新
if not dry_run:
print("开始执行更新...\n")
for update in updates:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s
""", (update['new_code'], UPDATED_BY, update['id']))
print(f" ✓ 更新字段 ID {update['id']}: {update['name']}")
print(f" {update['old_code']} -> {update['new_code']}")
conn.commit()
print("\n✓ 更新完成")
else:
print("[DRY RUN] 以上操作不会实际执行")
cursor.close()
conn.close()
return updates
if __name__ == '__main__':
print("是否执行修复?")
print("1. DRY RUN不实际修改数据库")
print("2. 直接执行修复(会修改数据库)")
choice = input("\n请选择 (1/2默认1): ").strip() or "1"
if choice == "2":
print("\n执行实际修复...")
fix_chinese_fields(dry_run=False)
else:
print("\n执行DRY RUN...")
updates = fix_chinese_fields(dry_run=True)
if updates:
confirm = input("\nDRY RUN完成。是否执行实际修复(y/n默认n): ").strip().lower()
if confirm == 'y':
print("\n执行实际修复...")
fix_chinese_fields(dry_run=False)
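
The 30% heuristic in `is_chinese` above can be exercised on its own; a minimal standalone sketch (reimplemented here rather than imported, since the script is a one-off):

```python
import re

def is_chinese(text: str) -> bool:
    # Treat text as Chinese when CJK characters exceed 30% of its length
    if not text:
        return False
    chinese_chars = len(re.findall(r'[\u4e00-\u9fff]', text))
    return chinese_chars / len(text) > 0.3

print(is_chinese("谈话人"))          # 3/3 characters are CJK
print(is_chinese("interview_time"))  # no CJK characters
print(is_chinese("target_学历"))     # 2 of 9 characters, below the threshold
```

Note that a half-converted code like `target_学历` slips under the 30% bar, which is presumably why the follow-up script switches to an any-CJK check.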


@@ -0,0 +1,191 @@
"""
修复剩余的中文field_code字段
为这些字段生成合适的英文field_code
"""
import os
import pymysql
import re
from typing import Dict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
# Mapping from field name to field_code (for the remaining Chinese fields)
FIELD_MAPPING = {
# 谈话相关字段
'拟谈话地点': 'proposed_interview_location',
'拟谈话时间': 'proposed_interview_time',
'谈话事由': 'interview_reason',
'谈话人': 'interviewer',
'谈话人员-安全员': 'interview_personnel_safety_officer',
'谈话人员-组长': 'interview_personnel_leader',
'谈话人员-谈话人员': 'interview_personnel',
'谈话前安全风险评估结果': 'pre_interview_risk_assessment_result',
'谈话地点': 'interview_location',
'谈话次数': 'interview_count',
# 被核查人员相关字段
'被核查人单位及职务': 'target_organization_and_position', # note: presumably the same field as "被核查人员单位及职务"
'被核查人员交代问题程度': 'target_confession_level',
'被核查人员减压后的表现': 'target_behavior_after_relief',
'被核查人员学历': 'target_education', # note: may differ from "被核查人员文化程度"
'被核查人员工作履历': 'target_work_history',
'被核查人员思想负担程度': 'target_mental_burden_level',
'被核查人员职业': 'target_occupation',
'被核查人员谈话中的表现': 'target_behavior_during_interview',
'被核查人员问题严重程度': 'target_issue_severity_level',
'被核查人员风险等级': 'target_risk_level',
'被核查人基本情况': 'target_basic_info',
# 其他字段
'补空人员': 'backup_personnel',
'记录人': 'recorder',
'评估意见': 'assessment_opinion',
}
def is_chinese(text: str) -> bool:
"""判断字符串是否包含中文字符"""
if not text:
return False
return bool(re.search(r'[\u4e00-\u9fff]', text))
def fix_remaining_fields(dry_run: bool = True):
"""修复剩余的中文field_code字段"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("修复剩余的中文field_code字段")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
# Fetch all fields and filter client-side: MySQL's REGEXP has no \uXXXX escape
# syntax, so a '[\u4e00-\u9fff]' pattern would silently match the wrong characters
cursor.execute("""
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY name
""", (TENANT_ID,))
fields = [f for f in cursor.fetchall() if f['filed_code'] and is_chinese(f['filed_code'])]
print(f"\n找到 {len(fields)} 个需要修复的字段:\n")
updates = []
for field in fields:
field_name = field['name']
new_code = FIELD_MAPPING.get(field_name)
if not new_code:
# No mapping entry: derive a code from the name itself
new_code = field_name.lower()
new_code = new_code.replace('被核查人员', 'target_').replace('被核查人', 'target_')
new_code = new_code.replace('谈话', 'interview_')
new_code = new_code.replace('人员', '')
new_code = new_code.replace('时间', '_time')
new_code = new_code.replace('地点', '_location')
new_code = new_code.replace('问题', '_issue')
new_code = new_code.replace('情况', '_situation')
new_code = new_code.replace('程度', '_level')
new_code = new_code.replace('表现', '_behavior')
new_code = new_code.replace('等级', '_level')
new_code = new_code.replace('履历', '_history')
new_code = new_code.replace('学历', '_education')
new_code = new_code.replace('职业', '_occupation')
new_code = new_code.replace('事由', '_reason')
new_code = new_code.replace('次数', '_count')
new_code = new_code.replace('结果', '_result')
new_code = new_code.replace('意见', '_opinion')
new_code = re.sub(r'[^\w]', '_', new_code)
new_code = re.sub(r'_+', '_', new_code).strip('_')
new_code = new_code.replace('__', '_')
updates.append({
'id': field['id'],
'name': field_name,
'old_code': field['filed_code'],
'new_code': new_code,
'field_type': field['field_type']
})
print(f" ID: {field['id']}")
print(f" 名称: {field_name}")
print(f" 当前field_code: {field['filed_code']}")
print(f" 新field_code: {new_code}")
print()
# 检查是否有重复的new_code
code_to_fields = {}
for update in updates:
code = update['new_code']
if code not in code_to_fields:
code_to_fields[code] = []
code_to_fields[code].append(update)
duplicate_codes = {code: fields_list for code, fields_list in code_to_fields.items()
if len(fields_list) > 1}
if duplicate_codes:
print("\n⚠ 警告以下field_code会重复:")
for code, fields_list in duplicate_codes.items():
print(f" field_code: {code}")
for field in fields_list:
print(f" - ID: {field['id']}, 名称: {field['name']}")
print()
# 执行更新
if not dry_run:
print("开始执行更新...\n")
for update in updates:
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s
""", (update['new_code'], UPDATED_BY, update['id']))
print(f" ✓ 更新字段 ID {update['id']}: {update['name']}")
print(f" {update['old_code']} -> {update['new_code']}")
conn.commit()
print("\n✓ 更新完成")
else:
print("[DRY RUN] 以上操作不会实际执行")
cursor.close()
conn.close()
return updates
if __name__ == '__main__':
print("是否执行修复?")
print("1. DRY RUN不实际修改数据库")
print("2. 直接执行修复(会修改数据库)")
choice = input("\n请选择 (1/2默认1): ").strip() or "1"
if choice == "2":
print("\n执行实际修复...")
fix_remaining_fields(dry_run=False)
else:
print("\n执行DRY RUN...")
updates = fix_remaining_fields(dry_run=True)
if updates:
confirm = input("\nDRY RUN完成。是否执行实际修复(y/n默认n): ").strip().lower()
if confirm == 'y':
print("\n执行实际修复...")
fix_remaining_fields(dry_run=False)
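
The unmapped-name fallback above builds a code from substring replacements plus a `[^\w]` cleanup. One subtlety worth knowing: Python's `\w` is Unicode-aware, so any CJK character not covered by a replacement rule survives the cleanup. A condensed sketch of the chain (only a few of the replacement rules shown):

```python
import re

def fallback_code(field_name: str) -> str:
    code = field_name.lower()
    code = code.replace('谈话', 'interview_')
    code = code.replace('时间', '_time').replace('地点', '_location')
    code = re.sub(r'[^\w]', '_', code)          # \w matches CJK too, so unmapped 拟 etc. survive
    return re.sub(r'_+', '_', code).strip('_')  # collapse underscore runs, trim the ends

print(fallback_code('谈话时间'))    # every segment has a rule
print(fallback_code('拟谈话地点'))  # the leading 拟 has no rule and stays in the code
```

So the generated code can itself still contain Chinese, which is worth checking (e.g. by re-running `is_chinese`) before writing it back.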


@@ -0,0 +1,61 @@
"""
修复剩余的层级结构问题
"""
import pymysql
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor()
try:
# 1. 修复"2保密承诺书"的parent_id应该在"走读式谈话流程"下)
# "走读式谈话流程"的ID是 1765273962716807
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s AND id = %s
""", (1765273962716807, UPDATED_BY, TENANT_ID, 1765425919729046))
print(f"[UPDATE] 更新'2保密承诺书'的parent_id: {cursor.rowcount}")
# 2. 检查"8.XXX初核情况报告"的位置(应该在"3.初核结论"下,而不是"走读式谈话流程"下)
# "3.初核结论"的ID是 1765431559135346
# 先查找"8.XXX初核情况报告"的ID
cursor.execute("""
SELECT id, name, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name LIKE %s
""", (TENANT_ID, '%XXX初核情况报告%'))
result = cursor.fetchone()
if result:
file_id, file_name, current_parent = result
if current_parent != 1765431559135346:
cursor.execute("""
UPDATE f_polic_file_config
SET parent_id = %s, updated_time = NOW(), updated_by = %s
WHERE tenant_id = %s AND id = %s
""", (1765431559135346, UPDATED_BY, TENANT_ID, file_id))
print(f"[UPDATE] 更新'{file_name}'的parent_id: {cursor.rowcount}")
conn.commit()
print("\n[OK] 修复完成")
except Exception as e:
conn.rollback()
print(f"[ERROR] 修复失败: {e}")
import traceback
traceback.print_exc()
finally:
cursor.close()
conn.close()
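
The commit-on-success / rollback-on-error shape used above is the standard pattern for one-off migrations like this; a self-contained sketch against an in-memory SQLite table (illustrative schema, not the production f_polic_file_config):

```python
import sqlite3

def reparent(conn: sqlite3.Connection, child_id: int, new_parent_id: int) -> int:
    """Move one node under a new parent; commit on success, roll back on error."""
    cursor = conn.cursor()
    try:
        cursor.execute(
            "UPDATE file_config SET parent_id = ? WHERE id = ?",
            (new_parent_id, child_id),
        )
        conn.commit()
        return cursor.rowcount
    except Exception:
        conn.rollback()  # undo any partial change before re-raising
        raise
    finally:
        cursor.close()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_config (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany("INSERT INTO file_config VALUES (?, ?)", [(1, None), (2, None)])
print(reparent(conn, child_id=2, new_parent_id=1))  # one row updated
```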


@@ -0,0 +1,272 @@
"""
修复"1.请示报告卡(初核谈话)"模板的input_data字段
分析模板占位符根据数据库字段对应关系生成input_data并更新数据库
"""
import pymysql
import json
import os
import re
from datetime import datetime
from pathlib import Path
from docx import Document
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
CURRENT_TIME = datetime.now()
# 模板信息
TEMPLATE_NAME = "1.请示报告卡(初核谈话)"
TEMPLATE_CODE = "REPORT_CARD_INTERVIEW"
BUSINESS_TYPE = "INVESTIGATION"
TEMPLATE_FILE_PATH = "template_finish/2-初核模版/2.谈话审批/走读式谈话审批/1.请示报告卡(初核谈话).docx"
def extract_placeholders_from_docx(file_path):
"""从docx文件中提取所有占位符"""
placeholders = set()
pattern = r'\{\{([^}]+)\}\}'
try:
doc = Document(file_path)
# 从段落中提取占位符
for paragraph in doc.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
# 从表格中提取占位符
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
except Exception as e:
print(f" 错误: 读取文件失败 - {str(e)}")
return []
return sorted(list(placeholders))
def get_template_config(conn):
"""查询模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s
"""
cursor.execute(sql, (TENANT_ID, TEMPLATE_NAME))
config = cursor.fetchone()
cursor.close()
return config
def get_template_fields(conn, file_config_id):
"""查询模板关联的字段"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT f.id, f.name, f.filed_code as field_code, f.field_type
FROM f_polic_field f
INNER JOIN f_polic_file_field ff ON f.id = ff.filed_id
WHERE ff.file_id = %s
AND f.tenant_id = %s
ORDER BY f.field_type, f.name
"""
cursor.execute(sql, (file_config_id, TENANT_ID))
fields = cursor.fetchall()
cursor.close()
return fields
def verify_placeholders_in_database(conn, placeholders):
"""验证占位符是否在数据库中存在对应的字段"""
if not placeholders:
return {}
cursor = conn.cursor(pymysql.cursors.DictCursor)
placeholders_list = list(placeholders)
placeholders_str = ','.join(['%s'] * len(placeholders_list))
# 查询所有字段(包括未启用的)
sql = f"""
SELECT id, name, filed_code as field_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
AND filed_code IN ({placeholders_str})
"""
cursor.execute(sql, [TENANT_ID] + placeholders_list)
fields = cursor.fetchall()
cursor.close()
# 构建字段映射
field_map = {f['field_code']: f for f in fields}
# 检查缺失的字段
missing_fields = set(placeholders) - set(field_map.keys())
return {
'found_fields': field_map,
'missing_fields': missing_fields
}
def update_input_data(conn, file_config_id, input_data):
"""更新input_data字段"""
cursor = conn.cursor()
input_data_str = json.dumps(input_data, ensure_ascii=False)
update_sql = """
UPDATE f_polic_file_config
SET input_data = %s, updated_time = %s, updated_by = %s
WHERE id = %s
"""
cursor.execute(update_sql, (input_data_str, CURRENT_TIME, UPDATED_BY, file_config_id))
conn.commit()
cursor.close()
def main():
"""主函数"""
print("="*80)
print("修复'1.请示报告卡(初核谈话)'模板的input_data字段")
print("="*80)
print()
# 1. 检查模板文件是否存在
template_path = Path(TEMPLATE_FILE_PATH)
if not template_path.exists():
print(f"✗ 错误: 模板文件不存在 - {TEMPLATE_FILE_PATH}")
return
print(f"✓ 找到模板文件: {TEMPLATE_FILE_PATH}")
# 2. 提取占位符
print("\n正在提取占位符...")
placeholders = extract_placeholders_from_docx(str(template_path))
print(f"✓ 找到 {len(placeholders)} 个占位符:")
for i, placeholder in enumerate(placeholders, 1):
print(f" {i}. {{{{ {placeholder} }}}}")
# 3. 连接数据库
print("\n正在连接数据库...")
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功")
except Exception as e:
print(f"✗ 数据库连接失败: {str(e)}")
return
try:
# 4. 查询模板配置
print(f"\n正在查询模板配置: {TEMPLATE_NAME}")
config = get_template_config(conn)
if not config:
print(f"✗ 未找到模板配置: {TEMPLATE_NAME}")
return
print(f"✓ 找到模板配置:")
print(f" ID: {config['id']}")
print(f" 名称: {config['name']}")
print(f" 当前template_code: {config.get('template_code', 'NULL')}")
print(f" 当前input_data: {config.get('input_data', 'NULL')}")
print(f" 文件路径: {config.get('file_path', 'NULL')}")
print(f" 状态: {config.get('state', 0)}")
file_config_id = config['id']
# 5. 查询模板关联的字段
print(f"\n正在查询模板关联的字段...")
template_fields = get_template_fields(conn, file_config_id)
print(f"✓ 找到 {len(template_fields)} 个关联字段:")
for field in template_fields:
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
print(f" - {field['name']} ({field['field_code']}) [{field_type_str}]")
# 6. 验证占位符是否在数据库中存在
print(f"\n正在验证占位符...")
verification = verify_placeholders_in_database(conn, placeholders)
found_fields = verification['found_fields']
missing_fields = verification['missing_fields']
print(f"✓ 在数据库中找到 {len(found_fields)} 个字段:")
for field_code, field in found_fields.items():
field_type_str = "输出字段" if field['field_type'] == 2 else "输入字段"
state_str = "启用" if field.get('state', 0) == 1 else "未启用"
print(f" - {field['name']} ({field_code}) [{field_type_str}] [状态: {state_str}]")
if missing_fields:
print(f"\n⚠ 警告: 以下占位符在数据库中未找到对应字段:")
for field_code in missing_fields:
print(f" - {field_code}")
print("\n这些占位符仍会被包含在input_data中但可能无法正确填充。")
# 7. 生成input_data
print(f"\n正在生成input_data...")
input_data = {
'template_code': TEMPLATE_CODE,
'business_type': BUSINESS_TYPE,
'placeholders': placeholders
}
print(f"✓ input_data内容:")
print(json.dumps(input_data, ensure_ascii=False, indent=2))
# 8. 更新数据库
print(f"\n正在更新数据库...")
update_input_data(conn, file_config_id, input_data)
print(f"✓ 更新成功!")
# 9. 验证更新结果
print(f"\n正在验证更新结果...")
updated_config = get_template_config(conn)
if updated_config:
try:
updated_input_data = json.loads(updated_config['input_data'])
if updated_input_data.get('template_code') == TEMPLATE_CODE:
print(f"✓ 验证成功: template_code = {TEMPLATE_CODE}")
if updated_input_data.get('business_type') == BUSINESS_TYPE:
print(f"✓ 验证成功: business_type = {BUSINESS_TYPE}")
if set(updated_input_data.get('placeholders', [])) == set(placeholders):
print(f"✓ 验证成功: placeholders 匹配")
except Exception as e:
print(f"⚠ 验证时出错: {str(e)}")
print("\n" + "="*80)
print("修复完成!")
print("="*80)
except Exception as e:
print(f"\n✗ 处理失败: {str(e)}")
import traceback
traceback.print_exc()
finally:
conn.close()
if __name__ == '__main__':
main()
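
The `{{placeholder}}` extraction above runs one regex over `paragraph.text` (which concatenates a paragraph's runs, so placeholders split across runs are still seen whole). The pattern itself can be checked without python-docx:

```python
import re

PATTERN = r'\{\{([^}]+)\}\}'  # same pattern as extract_placeholders_from_docx

text = "谈话人:{{ interviewer }} 时间:{{interview_time}}"
placeholders = sorted({m.strip() for m in re.findall(PATTERN, text)})
print(placeholders)
```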

fix_template_names.py Normal file

@@ -0,0 +1,234 @@
"""
检查并修复 f_polic_file_config 表中模板名称与文件名的对应关系
确保 name 字段与模板文档名称去掉扩展名完全一致
"""
import os
import sys
import pymysql
from pathlib import Path
from typing import Dict, List, Optional
# Force stdout/stderr to UTF-8 (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
# 数据库连接配置
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
TEMPLATE_BASE_DIR = 'template_finish'
def scan_template_files(base_dir: str) -> Dict[str, str]:
"""
扫描模板文件夹获取所有模板文件信息
Returns:
字典key为MinIO路径用于匹配value为文件名不含扩展名
"""
base_path = Path(base_dir)
if not base_path.exists():
print(f"错误: 目录不存在 - {base_dir}")
return {}
templates = {}
print("=" * 80)
print("扫描模板文件...")
print("=" * 80)
for docx_file in sorted(base_path.rglob("*.docx")):
# 跳过临时文件
if docx_file.name.startswith("~$"):
continue
# 获取文件名(不含扩展名)
file_name_without_ext = docx_file.stem
# Build the MinIO path used to match file_path in the database. Note this uses
# the current year/month, so match_file_path() later falls back to file-name
# matching when the actual upload date differs.
from datetime import datetime
now = datetime.now()
minio_path = f'/615873064429507639/TEMPLATE/{now.year}/{now.month:02d}/{docx_file.name}'
templates[minio_path] = {
'file_name': docx_file.name,
'name_without_ext': file_name_without_ext,
'relative_path': str(docx_file.relative_to(base_path))
}
print(f"找到 {len(templates)} 个模板文件\n")
return templates
def get_db_templates(conn) -> Dict[str, Dict]:
"""
获取数据库中所有模板记录
Returns:
字典key为file_pathvalue为模板信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND file_path IS NOT NULL
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
result = {}
for template in templates:
if template['file_path']:
result[template['file_path']] = {
'id': template['id'],
'name': template['name'],
'file_path': template['file_path'],
'parent_id': template['parent_id']
}
cursor.close()
return result
def update_template_name(conn, template_id: int, new_name: str, old_name: str):
"""
更新模板名称
"""
cursor = conn.cursor()
try:
update_sql = """
UPDATE f_polic_file_config
SET name = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (new_name, UPDATED_BY, template_id, TENANT_ID))
conn.commit()
print(f" [UPDATE] ID: {template_id}")
print(f" 旧名称: {old_name}")
print(f" 新名称: {new_name}")
return True
except Exception as e:
conn.rollback()
print(f" [ERROR] 更新失败: {str(e)}")
return False
finally:
cursor.close()
def match_file_path(file_path: str, db_paths: List[str]) -> Optional[str]:
"""
匹配文件路径可能日期不同
Args:
file_path: 当前构建的MinIO路径
db_paths: 数据库中的所有路径列表
Returns:
匹配的数据库路径如果找到的话
"""
# 提取文件名
file_name = Path(file_path).name
# 在数据库路径中查找相同文件名的路径
for db_path in db_paths:
if Path(db_path).name == file_name:
return db_path
return None
def main():
"""主函数"""
print("=" * 80)
print("检查并修复模板名称")
print("=" * 80)
print()
try:
# 连接数据库
print("1. 连接数据库...")
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功\n")
# 扫描模板文件
print("2. 扫描模板文件...")
file_templates = scan_template_files(TEMPLATE_BASE_DIR)
# 获取数据库模板
print("3. 获取数据库模板...")
db_templates = get_db_templates(conn)
print(f"[OK] 找到 {len(db_templates)} 个数据库模板\n")
# 检查并更新
print("4. 检查并更新模板名称...")
print("=" * 80)
updated_count = 0
not_found_count = 0
matched_count = 0
# 遍历文件模板
for file_path, file_info in file_templates.items():
file_name = file_info['file_name']
expected_name = file_info['name_without_ext']
# 尝试直接匹配
db_template = db_templates.get(file_path)
# 如果直接匹配失败,尝试通过文件名匹配
if not db_template:
matched_path = match_file_path(file_path, list(db_templates.keys()))
if matched_path:
db_template = db_templates[matched_path]
if db_template:
matched_count += 1
current_name = db_template['name']
# 检查名称是否一致
if current_name != expected_name:
print(f"\n文件: {file_name}")
if update_template_name(conn, db_template['id'], expected_name, current_name):
updated_count += 1
else:
print(f" [OK] {file_name} - 名称已正确")
else:
not_found_count += 1
print(f" [WARN] 未找到: {file_name}")
print("\n" + "=" * 80)
print("检查完成")
print("=" * 80)
print(f"总文件数: {len(file_templates)}")
print(f"匹配成功: {matched_count}")
print(f"更新数量: {updated_count}")
print(f"未找到: {not_found_count}")
print("=" * 80)
except Exception as e:
print(f"\n[ERROR] 发生错误: {e}")
import traceback
traceback.print_exc()
if 'conn' in locals():
conn.rollback()
finally:
if 'conn' in locals():
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()
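
`match_file_path` above falls back to comparing bare file names because the date segment of the MinIO path can differ between upload time and script run time. Its behavior can be exercised standalone (using `PurePosixPath` here so the comparison is OS-independent):

```python
from typing import List, Optional
from pathlib import PurePosixPath

def match_by_name(file_path: str, db_paths: List[str]) -> Optional[str]:
    """Return the stored path whose file name matches, ignoring date segments."""
    file_name = PurePosixPath(file_path).name
    for db_path in db_paths:
        if PurePosixPath(db_path).name == file_name:
            return db_path
    return None

db_paths = ['/615873064429507639/TEMPLATE/2025/11/4.谈话方案.docx']
print(match_by_name('/615873064429507639/TEMPLATE/2025/12/4.谈话方案.docx', db_paths))
```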

generate_download_urls.py Normal file

@@ -0,0 +1,129 @@
"""
为指定的文件路径生成 MinIO 预签名下载 URL
"""
import sys
import io
from minio import Minio
from datetime import timedelta
# 设置输出编码为UTF-8避免Windows控制台编码问题
if sys.platform == 'win32':
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
# MinIO连接配置
MINIO_CONFIG = {
'endpoint': 'minio.datacubeworld.com:9000',
'access_key': 'JOLXFXny3avFSzB0uRA5',
'secret_key': 'G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I',
'secure': True
}
BUCKET_NAME = 'finyx'
# 文件相对路径列表
FILE_PATHS = [
'/615873064429507639/20251211112544/初步核实审批表_张三.docx',
'/615873064429507639/20251211112545/请示报告卡_张三.docx'
]
def generate_download_urls():
"""为文件路径列表生成下载 URL"""
print("="*80)
print("生成 MinIO 下载链接")
print("="*80)
try:
# 创建MinIO客户端
client = Minio(
MINIO_CONFIG['endpoint'],
access_key=MINIO_CONFIG['access_key'],
secret_key=MINIO_CONFIG['secret_key'],
secure=MINIO_CONFIG['secure']
)
print(f"\n存储桶: {BUCKET_NAME}")
print(f"端点: {MINIO_CONFIG['endpoint']}")
print(f"使用HTTPS: {MINIO_CONFIG['secure']}\n")
results = []
for file_path in FILE_PATHS:
# 去掉开头的斜杠,得到对象名称
object_name = file_path.lstrip('/')
print("-"*80)
print(f"文件: {file_path}")
print(f"对象名称: {object_name}")
try:
# 检查文件是否存在
stat = client.stat_object(BUCKET_NAME, object_name)
print(f"[OK] 文件存在")
print(f" 文件大小: {stat.size:,} 字节")
print(f" 最后修改: {stat.last_modified}")
# 生成预签名URL7天有效期
url = client.presigned_get_object(
BUCKET_NAME,
object_name,
expires=timedelta(days=7)
)
print(f"[OK] 预签名URL生成成功7天有效")
print(f"\n下载链接:")
print(f"{url}\n")
results.append({
'file_path': file_path,
'object_name': object_name,
'url': url,
'size': stat.size,
'exists': True
})
except Exception as e:
print(f"[ERROR] 错误: {e}\n")
results.append({
'file_path': file_path,
'object_name': object_name,
'url': None,
'exists': False,
'error': str(e)
})
# 输出汇总
print("\n" + "="*80)
print("下载链接汇总")
print("="*80)
for i, result in enumerate(results, 1):
print(f"\n{i}. {result['file_path']}")
if result['exists']:
print(f" [OK] 文件存在")
print(f" 下载链接: {result['url']}")
else:
print(f" [ERROR] 文件不存在或无法访问")
if 'error' in result:
print(f" 错误: {result['error']}")
print("\n" + "="*80)
print("完成")
print("="*80)
return results
except Exception as e:
print(f"\n[ERROR] 连接MinIO失败: {e}")
import traceback
traceback.print_exc()
return None
if __name__ == '__main__':
generate_download_urls()
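
Two small details of the script above can be checked without a MinIO connection: the object key is the stored path minus its leading slash, and the 7-day expiry matches the maximum that S3-style (SigV4) presigned URLs allow:

```python
from datetime import timedelta

file_path = '/615873064429507639/20251211112544/初步核实审批表_张三.docx'
object_name = file_path.lstrip('/')  # MinIO object keys carry no leading slash
expires = timedelta(days=7)          # SigV4 presigned URLs cap out at 7 days

print(object_name)
print(int(expires.total_seconds()))  # expiry expressed in seconds
```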


@@ -0,0 +1,219 @@
"""
生成模板 file_id 和关联关系的详细报告
重点检查每个模板的 file_id 是否正确以及 f_polic_file_field 表的关联关系
"""
import sys
import pymysql
from pathlib import Path
from typing import Dict, List
from collections import defaultdict
# Set console encoding to UTF-8 (Windows compatibility)
if sys.platform == 'win32':
try:
sys.stdout.reconfigure(encoding='utf-8')
sys.stderr.reconfigure(encoding='utf-8')
except Exception:
pass
# 数据库连接配置
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def generate_detailed_report():
"""生成详细的 file_id 和关联关系报告"""
print("="*80)
print("模板 file_id 和关联关系详细报告")
print("="*80)
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("\n[OK] 数据库连接成功\n")
except Exception as e:
print(f"\n[ERROR] 数据库连接失败: {e}")
return
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. 查询所有有 file_path 的模板(实际模板文件,不是目录节点)
cursor.execute("""
SELECT id, name, template_code, file_path, state, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND file_path IS NOT NULL AND file_path != ''
ORDER BY name, id
""", (TENANT_ID,))
all_templates = cursor.fetchall()
print(f"总模板数(有 file_path: {len(all_templates)}\n")
# 2. 查询每个模板的关联字段
template_field_map = defaultdict(list)
cursor.execute("""
SELECT
fff.file_id,
fff.filed_id,
fff.state as relation_state,
fc.name as template_name,
fc.template_code,
f.name as field_name,
f.filed_code,
f.field_type,
CASE
WHEN f.field_type = 1 THEN '输入字段'
WHEN f.field_type = 2 THEN '输出字段'
ELSE '未知'
END as field_type_name
FROM f_polic_file_field fff
INNER JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s
ORDER BY fff.file_id, f.field_type, f.name
""", (TENANT_ID,))
all_relations = cursor.fetchall()
for rel in all_relations:
template_field_map[rel['file_id']].append(rel)
# 3. 按模板分组显示
print("="*80)
print("每个模板的 file_id 和关联字段详情")
print("="*80)
# 按名称分组,显示重复的模板
templates_by_name = defaultdict(list)
for template in all_templates:
templates_by_name[template['name']].append(template)
duplicate_templates = {name: tmpls for name, tmpls in templates_by_name.items() if len(tmpls) > 1}
if duplicate_templates:
print("\n[WARN] 发现重复名称的模板:\n")
for name, tmpls in duplicate_templates.items():
print(f" 模板名称: {name}")
for tmpl in tmpls:
field_count = len(template_field_map.get(tmpl['id'], []))
input_count = sum(1 for f in template_field_map.get(tmpl['id'], []) if f['field_type'] == 1)
output_count = sum(1 for f in template_field_map.get(tmpl['id'], []) if f['field_type'] == 2)
print(f" - file_id: {tmpl['id']}")
print(f" template_code: {tmpl.get('template_code', 'N/A')}")
print(f" file_path: {tmpl.get('file_path', 'N/A')}")
print(f" 关联字段: 总计 {field_count} 个 (输入 {input_count}, 输出 {output_count})")
print()
# 4. 显示每个模板的详细信息
print("\n" + "="*80)
print("所有模板的 file_id 和关联字段统计")
print("="*80)
for template in all_templates:
file_id = template['id']
name = template['name']
template_code = template.get('template_code', 'N/A')
file_path = template.get('file_path', 'N/A')
fields = template_field_map.get(file_id, [])
input_fields = [f for f in fields if f['field_type'] == 1]
output_fields = [f for f in fields if f['field_type'] == 2]
print(f"\n模板: {name}")
print(f" file_id: {file_id}")
print(f" template_code: {template_code}")
print(f" file_path: {file_path}")
print(f" 关联字段: 总计 {len(fields)}")
print(f" - 输入字段 (field_type=1): {len(input_fields)}")
print(f" - 输出字段 (field_type=2): {len(output_fields)}")
if len(fields) == 0:
print(f" [WARN] 该模板没有关联任何字段")
# 5. 检查关联关系的完整性
print("\n" + "="*80)
print("关联关系完整性检查")
print("="*80)
# 检查是否有 file_id 在 f_polic_file_field 中但没有对应的文件配置
cursor.execute("""
SELECT DISTINCT fff.file_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_file_config fc ON fff.file_id = fc.id AND fff.tenant_id = fc.tenant_id
WHERE fff.tenant_id = %s AND fc.id IS NULL
""", (TENANT_ID,))
orphan_file_ids = cursor.fetchall()
if orphan_file_ids:
print(f"\n[ERROR] 发现孤立的 file_id在 f_polic_file_field 中但不在 f_polic_file_config 中):")
for item in orphan_file_ids:
print(f" - file_id: {item['file_id']}")
else:
print("\n[OK] 所有关联关系的 file_id 都有效")
# 检查是否有 filed_id 在 f_polic_file_field 中但没有对应的字段
cursor.execute("""
SELECT DISTINCT fff.filed_id
FROM f_polic_file_field fff
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND fff.tenant_id = f.tenant_id
WHERE fff.tenant_id = %s AND f.id IS NULL
""", (TENANT_ID,))
orphan_field_ids = cursor.fetchall()
if orphan_field_ids:
print(f"\n[ERROR] 发现孤立的 filed_id在 f_polic_file_field 中但不在 f_polic_field 中):")
for item in orphan_field_ids:
print(f" - filed_id: {item['filed_id']}")
else:
print("\n[OK] 所有关联关系的 filed_id 都有效")
# 6. 统计汇总
print("\n" + "="*80)
print("统计汇总")
print("="*80)
total_templates = len(all_templates)
templates_with_fields = len([t for t in all_templates if len(template_field_map.get(t['id'], [])) > 0])
templates_without_fields = total_templates - templates_with_fields
total_relations = len(all_relations)
total_input_relations = sum(1 for r in all_relations if r['field_type'] == 1)
total_output_relations = sum(1 for r in all_relations if r['field_type'] == 2)
print(f"\n模板统计:")
print(f" 总模板数: {total_templates}")
print(f" 有关联字段的模板: {templates_with_fields}")
print(f" 无关联字段的模板: {templates_without_fields}")
print(f"\n关联关系统计:")
print(f" 总关联关系数: {total_relations}")
print(f" 输入字段关联: {total_input_relations}")
print(f" 输出字段关联: {total_output_relations}")
if duplicate_templates:
print(f"\n[WARN] 发现 {len(duplicate_templates)} 个模板名称有重复记录")
print(" 建议: 确认每个模板应该使用哪个 file_id并清理重复记录")
if templates_without_fields:
print(f"\n[WARN] 发现 {templates_without_fields} 个模板没有关联任何字段")
print(" 建议: 检查这些模板是否需要关联字段")
finally:
cursor.close()
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
generate_detailed_report()
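
The report's grouping (`template_field_map`, `templates_by_name`) is the usual `defaultdict(list)` pattern; a minimal sketch of the duplicate-name detection with toy rows:

```python
from collections import defaultdict

rows = [
    {'name': '2谈话审批表', 'id': 101},
    {'name': '2谈话审批表', 'id': 102},  # second record under the same template name
    {'name': '4.谈话方案', 'id': 103},
]
by_name = defaultdict(list)
for row in rows:
    by_name[row['name']].append(row)  # group rows sharing a name

duplicates = {name: grp for name, grp in by_name.items() if len(grp) > 1}
print(sorted(duplicates))  # names that have more than one record
```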

get_available_file_ids.py Normal file

@@ -0,0 +1,64 @@
"""
获取所有可用的文件ID列表用于测试
"""
import pymysql
import os
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def get_available_file_configs():
"""获取所有可用的文件配置"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
ORDER BY name
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
print("="*80)
print("可用的文件配置列表state=1")
print("="*80)
print(f"\n共找到 {len(configs)} 个启用的文件配置:\n")
for i, config in enumerate(configs, 1):
print(f"{i}. ID: {config['id']}")
print(f" 名称: {config['name']}")
print(f" 文件路径: {config['file_path'] or '(空)'}")
print()
# 输出JSON格式方便复制
print("\n" + "="*80)
print("JSON格式可用于测试:")
print("="*80)
print("[")
for i, config in enumerate(configs):
comma = "," if i < len(configs) - 1 else ""
print(f' {{"fileId": {config["id"]}, "fileName": "{config["name"]}.doc"}}{comma}')
print("]")
return configs
finally:
cursor.close()
conn.close()
if __name__ == '__main__':
get_available_file_configs()
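
The JSON block at the end is assembled by hand with f-strings; if a template name ever contained a quote or backslash, the output would be invalid JSON. Building the same payload with `json.dumps` (a sketch of an alternative, not a required change) sidesteps escaping entirely:

```python
import json

configs = [
    {'id': 101, 'name': '1.请示报告卡(初核谈话)'},
    {'id': 102, 'name': '4.谈话方案'},
]
# Same shape as the hand-built output: [{"fileId": ..., "fileName": "....doc"}, ...]
payload = [{'fileId': c['id'], 'fileName': f"{c['name']}.doc"} for c in configs]
print(json.dumps(payload, ensure_ascii=False, indent=2))
```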


@@ -0,0 +1,478 @@
"""
改进的匹配和更新脚本
增强匹配逻辑能够匹配数据库中的已有数据
"""
import os
import json
import pymysql
import re
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from datetime import datetime
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
# 文档类型映射
DOCUMENT_TYPE_MAPPING = {
"1.请示报告卡XXX": {
"template_code": "REPORT_CARD",
"name": "1.请示报告卡XXX",
"business_type": "INVESTIGATION"
},
"2.初步核实审批表XXX": {
"template_code": "PRELIMINARY_VERIFICATION_APPROVAL",
"name": "2.初步核实审批表XXX",
"business_type": "INVESTIGATION"
},
"3.附件初核方案(XXX)": {
"template_code": "INVESTIGATION_PLAN",
"name": "3.附件初核方案(XXX)",
"business_type": "INVESTIGATION"
},
"谈话通知书第一联": {
"template_code": "NOTIFICATION_LETTER_1",
"name": "谈话通知书第一联",
"business_type": "INVESTIGATION"
},
"谈话通知书第二联": {
"template_code": "NOTIFICATION_LETTER_2",
"name": "谈话通知书第二联",
"business_type": "INVESTIGATION"
},
"谈话通知书第三联": {
"template_code": "NOTIFICATION_LETTER_3",
"name": "谈话通知书第三联",
"business_type": "INVESTIGATION"
},
"1.请示报告卡(初核谈话)": {
"template_code": "REPORT_CARD_INTERVIEW",
"name": "1.请示报告卡(初核谈话)",
"business_type": "INVESTIGATION"
},
"2谈话审批表": {
"template_code": "INTERVIEW_APPROVAL_FORM",
"name": "2谈话审批表",
"business_type": "INVESTIGATION"
},
"3.谈话前安全风险评估表": {
"template_code": "PRE_INTERVIEW_RISK_ASSESSMENT",
"name": "3.谈话前安全风险评估表",
"business_type": "INVESTIGATION"
},
"4.谈话方案": {
"template_code": "INTERVIEW_PLAN",
"name": "4.谈话方案",
"business_type": "INVESTIGATION"
},
"5.谈话后安全风险评估表": {
"template_code": "POST_INTERVIEW_RISK_ASSESSMENT",
"name": "5.谈话后安全风险评估表",
"business_type": "INVESTIGATION"
},
"1.谈话笔录": {
"template_code": "INTERVIEW_RECORD",
"name": "1.谈话笔录",
"business_type": "INVESTIGATION"
},
"2.谈话询问对象情况摸底调查30问": {
"template_code": "INVESTIGATION_30_QUESTIONS",
"name": "2.谈话询问对象情况摸底调查30问",
"business_type": "INVESTIGATION"
},
"3.被谈话人权利义务告知书": {
"template_code": "RIGHTS_OBLIGATIONS_NOTICE",
"name": "3.被谈话人权利义务告知书",
"business_type": "INVESTIGATION"
},
"4.点对点交接单": {
"template_code": "HANDOVER_FORM",
"name": "4.点对点交接单",
"business_type": "INVESTIGATION"
},
"5.陪送交接单(新)": {
"template_code": "ESCORT_HANDOVER_FORM",
"name": "5.陪送交接单(新)",
"business_type": "INVESTIGATION"
},
"6.1保密承诺书(谈话对象使用-非中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_NON_PARTY",
"name": "6.1保密承诺书(谈话对象使用-非中共党员用)",
"business_type": "INVESTIGATION"
},
"6.2保密承诺书(谈话对象使用-中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_PARTY",
"name": "6.2保密承诺书(谈话对象使用-中共党员用)",
"business_type": "INVESTIGATION"
},
"7.办案人员-办案安全保密承诺书": {
"template_code": "INVESTIGATOR_CONFIDENTIALITY_COMMITMENT",
"name": "7.办案人员-办案安全保密承诺书",
"business_type": "INVESTIGATION"
},
"8-1请示报告卡初核报告结论 ": {
"template_code": "REPORT_CARD_CONCLUSION",
"name": "8-1请示报告卡初核报告结论 ",
"business_type": "INVESTIGATION"
},
"8.XXX初核情况报告": {
"template_code": "INVESTIGATION_REPORT",
"name": "8.XXX初核情况报告",
"business_type": "INVESTIGATION"
}
}
def normalize_name(name: str) -> str:
"""Normalize a name for fuzzy matching."""
# Strip a leading ordinal such as "1.", "2.", or "8-"
name = re.sub(r'^\d+[\.\-]\s*', '', name)
# Strip parenthesized segments such as "(XXX)" or "(初核谈话)"
name = re.sub(r'[(].*?[)]', '', name)
# Strip surrounding whitespace
name = name.strip()
return name
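As a quick sanity check, here is a self-contained sketch of the same normalization (sample names come from the mapping table above; note that a compound ordinal such as "8-1" is only partially stripped, because the pattern consumes a single separator):

```python
import re

def normalize_name(name: str) -> str:
    """Strip leading ordinals and parenthesized segments for fuzzy matching."""
    name = re.sub(r'^\d+[\.\-]\s*', '', name)
    name = re.sub(r'[(].*?[)]', '', name)
    return name.strip()

# Numbered/parenthesized variants of one template collapse onto one key:
print(normalize_name("1.请示报告卡(初核谈话)"))  # 请示报告卡
```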
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits;
# multiplying by 1000 lets the random part bleed into the timestamp digits
return timestamp * 1_000_000 + random_part
def identify_document_type(file_name: str) -> Optional[Dict]:
"""根据完整文件名识别文档类型"""
base_name = Path(file_name).stem
if base_name in DOCUMENT_TYPE_MAPPING:
return DOCUMENT_TYPE_MAPPING[base_name]
return None
def scan_directory_structure(base_dir: Path) -> Dict:
"""扫描目录结构,构建树状层级"""
structure = {
'directories': {},
'files': {}
}
def process_path(path: Path, parent_path: Optional[str] = None, level: int = 0):
"""递归处理路径"""
if path.is_file() and path.suffix == '.docx':
file_name = path.stem
doc_config = identify_document_type(file_name)
structure['files'][str(path)] = {
'name': file_name,
'parent': parent_path,
'level': level,
'template_code': doc_config['template_code'] if doc_config else None,
'full_path': str(path),
'normalized_name': normalize_name(file_name)
}
elif path.is_dir():
dir_name = path.name
structure['directories'][str(path)] = {
'name': dir_name,
'parent': parent_path,
'level': level,
'normalized_name': normalize_name(dir_name)
}
for child in sorted(path.iterdir()):
if child.name != '__pycache__':
process_path(child, str(path), level + 1)
if TEMPLATES_DIR.exists():
for item in sorted(TEMPLATES_DIR.iterdir()):
if item.name != '__pycache__':
process_path(item, None, 0)
return structure
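A minimal sketch of the scan against a throwaway directory tree (simplified node shape; only the level/parent bookkeeping that the planner relies on is kept):

```python
import tempfile
from pathlib import Path

def scan(base_dir: Path) -> dict:
    """Record .docx files and directories together with their nesting level."""
    structure = {'directories': {}, 'files': {}}

    def walk(path: Path, parent, level):
        if path.is_file() and path.suffix == '.docx':
            structure['files'][str(path)] = {'name': path.stem, 'parent': parent, 'level': level}
        elif path.is_dir():
            structure['directories'][str(path)] = {'name': path.name, 'parent': parent, 'level': level}
            for child in sorted(path.iterdir()):
                if child.name != '__pycache__':
                    walk(child, str(path), level + 1)

    for item in sorted(base_dir.iterdir()):
        if item.name != '__pycache__':
            walk(item, None, 0)
    return structure

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "2-初核模版" / "2.谈话审批").mkdir(parents=True)
    (root / "2-初核模版" / "2.谈话审批" / "4.谈话方案.docx").touch()
    (root / "readme.txt").touch()  # non-.docx files are skipped
    s = scan(root)
```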
def get_existing_data(conn) -> Dict:
"""获取数据库中的现有数据,增强匹配能力"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, parent_id, template_code, input_data, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
result = {
'by_id': {},
'by_name': {},
'by_template_code': {},
'by_normalized_name': {} # 新增:标准化名称索引
}
for config in configs:
config_id = config['id']
config_name = config['name']
# 提取 template_code
template_code = config.get('template_code')
if not template_code and config.get('input_data'):
try:
input_data = json.loads(config['input_data']) if isinstance(config['input_data'], str) else config['input_data']
if isinstance(input_data, dict):
template_code = input_data.get('template_code')
except (json.JSONDecodeError, TypeError):
pass
config['extracted_template_code'] = template_code
config['normalized_name'] = normalize_name(config_name)
result['by_id'][config_id] = config
result['by_name'][config_name] = config
if template_code:
if template_code not in result['by_template_code']:
result['by_template_code'][template_code] = config
# 标准化名称索引(可能有多个记录匹配同一个标准化名称)
normalized = config['normalized_name']
if normalized not in result['by_normalized_name']:
result['by_normalized_name'][normalized] = []
result['by_normalized_name'][normalized].append(config)
cursor.close()
return result
def find_matching_config(file_info: Dict, existing_data: Dict) -> Optional[Dict]:
"""
Find the matching database record.
Priority: 1. exact template_code match; 2. exact name match; 3. normalized-name match
"""
template_code = file_info.get('template_code')
file_name = file_info['name']
normalized_name = file_info.get('normalized_name', normalize_name(file_name))
# 优先级1: template_code 精确匹配
if template_code:
matched = existing_data['by_template_code'].get(template_code)
if matched:
return matched
# 优先级2: 名称精确匹配
matched = existing_data['by_name'].get(file_name)
if matched:
return matched
# 优先级3: 标准化名称匹配
candidates = existing_data['by_normalized_name'].get(normalized_name, [])
if candidates:
# 如果有多个候选,优先选择有正确 template_code 的
for candidate in candidates:
if candidate.get('extracted_template_code') == template_code:
return candidate
# 否则返回第一个
return candidates[0]
return None
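The three-tier priority can be exercised with toy data; the dicts below are hypothetical stand-ins for the indexes built by get_existing_data:

```python
def find_match(file_info: dict, existing: dict):
    """Three-tier lookup: template_code, exact name, then normalized name."""
    code = file_info.get('template_code')
    if code and code in existing['by_template_code']:
        return existing['by_template_code'][code]
    if file_info['name'] in existing['by_name']:
        return existing['by_name'][file_info['name']]
    candidates = existing['by_normalized_name'].get(file_info['normalized_name'], [])
    for c in candidates:
        if c.get('extracted_template_code') == code:
            return c
    return candidates[0] if candidates else None

existing = {
    'by_template_code': {'INTERVIEW_PLAN': {'id': 1, 'extracted_template_code': 'INTERVIEW_PLAN'}},
    'by_name': {'旧谈话方案': {'id': 2}},
    'by_normalized_name': {'谈话方案': [{'id': 3, 'extracted_template_code': None}]},
}
# Tier 1 wins even though the on-disk name differs from the DB name:
m = find_match({'name': '4.谈话方案', 'template_code': 'INTERVIEW_PLAN', 'normalized_name': '谈话方案'}, existing)
# No template_code and no exact name: falls through to the normalized-name tier:
m3 = find_match({'name': '谈话方案(新)', 'template_code': None, 'normalized_name': '谈话方案'}, existing)
```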
def plan_tree_structure(dir_structure: Dict, existing_data: Dict) -> List[Dict]:
"""规划树状结构,使用改进的匹配逻辑"""
plan = []
directories = sorted(dir_structure['directories'].items(),
key=lambda x: (x[1]['level'], x[0]))
files = sorted(dir_structure['files'].items(),
key=lambda x: (x[1]['level'], x[0]))
dir_id_map = {}
# 处理目录
for dir_path, dir_info in directories:
dir_name = dir_info['name']
parent_path = dir_info['parent']
level = dir_info['level']
parent_id = None
if parent_path:
parent_id = dir_id_map.get(parent_path)
# 查找匹配的数据库记录
matched = find_matching_config(dir_info, existing_data)
if matched:
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': matched['id'],
'current_parent_id': matched.get('parent_id'),
'matched_by': 'existing'
})
dir_id_map[dir_path] = matched['id']
else:
new_id = generate_id()
plan.append({
'type': 'directory',
'name': dir_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'current_parent_id': None,
'matched_by': 'new'
})
dir_id_map[dir_path] = new_id
# 处理文件
for file_path, file_info in files:
file_name = file_info['name']
parent_path = file_info['parent']
level = file_info['level']
template_code = file_info['template_code']
parent_id = dir_id_map.get(parent_path) if parent_path else None
# 查找匹配的数据库记录
matched = find_matching_config(file_info, existing_data)
if matched:
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'update',
'config_id': matched['id'],
'template_code': template_code,
'current_parent_id': matched.get('parent_id'),
'matched_by': 'existing'
})
else:
new_id = generate_id()
plan.append({
'type': 'file',
'name': file_name,
'parent_name': dir_structure['directories'].get(parent_path, {}).get('name') if parent_path else None,
'parent_id': parent_id,
'level': level,
'action': 'create',
'config_id': new_id,
'template_code': template_code,
'current_parent_id': None,
'matched_by': 'new'
})
return plan
def print_matching_report(plan: List[Dict]):
"""打印匹配报告"""
print("\n" + "="*80)
print("匹配报告")
print("="*80)
matched = [p for p in plan if p.get('matched_by') == 'existing']
unmatched = [p for p in plan if p.get('matched_by') == 'new']
print(f"\n已匹配的记录: {len(matched)}")
print(f"未匹配的记录(将创建): {len(unmatched)}\n")
if unmatched:
print("未匹配的记录列表:")
for item in unmatched:
print(f" - {item['name']} ({item['type']})")
print("\n匹配详情:")
by_level = {}
for item in plan:
level = item['level']
if level not in by_level:
by_level[level] = []
by_level[level].append(item)
for level in sorted(by_level.keys()):
print(f"\n【层级 {level}】")
for item in by_level[level]:
indent = " " * level
match_status = "✓" if item.get('matched_by') == 'existing' else "+"
print(f"{indent}{match_status} {item['name']} (ID: {item['config_id']})")
if item.get('parent_name'):
print(f"{indent} 父节点: {item['parent_name']}")
if item['action'] == 'update':
current = item.get('current_parent_id')
new = item.get('parent_id')
if current != new:
print(f"{indent} parent_id: {current} -> {new}")
def main():
"""主函数"""
print("="*80)
print("改进的模板树状结构分析和更新")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
print("扫描目录结构...")
dir_structure = scan_directory_structure(TEMPLATES_DIR)
print(f" 找到 {len(dir_structure['directories'])} 个目录")
print(f" 找到 {len(dir_structure['files'])} 个文件\n")
print("获取数据库现有数据...")
existing_data = get_existing_data(conn)
print(f" 数据库中有 {len(existing_data['by_id'])} 条记录\n")
print("规划树状结构(使用改进的匹配逻辑)...")
plan = plan_tree_structure(dir_structure, existing_data)
print(f" 生成 {len(plan)} 个更新计划\n")
print_matching_report(plan)
# 询问是否继续
print("\n" + "="*80)
response = input("\n是否生成更新SQL脚本?(yes/no,默认no): ").strip().lower()
if response == 'yes':
from analyze_and_update_template_tree import generate_update_sql
sql_file = generate_update_sql(plan)
print(f"\n✓ SQL脚本已生成: {sql_file}")
else:
print("\n已取消")
finally:
conn.close()
if __name__ == '__main__':
main()


@ -0,0 +1,544 @@
"""
Initialize the template tree from the template_finish directory:
delete the old data, then rebuild it entirely from the directory structure.
"""
import os
import json
import pymysql
from pathlib import Path
from typing import Dict, List, Optional, Tuple
from datetime import datetime
from minio import Minio
from minio.error import S3Error
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# MinIO连接配置
MINIO_CONFIG = {
'endpoint': 'minio.datacubeworld.com:9000',
'access_key': 'JOLXFXny3avFSzB0uRA5',
'secret_key': 'G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I',
'secure': True
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
BUCKET_NAME = 'finyx'
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
# 文档类型映射
DOCUMENT_TYPE_MAPPING = {
"1.请示报告卡XXX": {
"template_code": "REPORT_CARD",
"name": "1.请示报告卡XXX",
"business_type": "INVESTIGATION"
},
"2.初步核实审批表XXX": {
"template_code": "PRELIMINARY_VERIFICATION_APPROVAL",
"name": "2.初步核实审批表XXX",
"business_type": "INVESTIGATION"
},
"3.附件初核方案(XXX)": {
"template_code": "INVESTIGATION_PLAN",
"name": "3.附件初核方案(XXX)",
"business_type": "INVESTIGATION"
},
"谈话通知书第一联": {
"template_code": "NOTIFICATION_LETTER_1",
"name": "谈话通知书第一联",
"business_type": "INVESTIGATION"
},
"谈话通知书第二联": {
"template_code": "NOTIFICATION_LETTER_2",
"name": "谈话通知书第二联",
"business_type": "INVESTIGATION"
},
"谈话通知书第三联": {
"template_code": "NOTIFICATION_LETTER_3",
"name": "谈话通知书第三联",
"business_type": "INVESTIGATION"
},
"1.请示报告卡(初核谈话)": {
"template_code": "REPORT_CARD_INTERVIEW",
"name": "1.请示报告卡(初核谈话)",
"business_type": "INVESTIGATION"
},
"2谈话审批表": {
"template_code": "INTERVIEW_APPROVAL_FORM",
"name": "2谈话审批表",
"business_type": "INVESTIGATION"
},
"3.谈话前安全风险评估表": {
"template_code": "PRE_INTERVIEW_RISK_ASSESSMENT",
"name": "3.谈话前安全风险评估表",
"business_type": "INVESTIGATION"
},
"4.谈话方案": {
"template_code": "INTERVIEW_PLAN",
"name": "4.谈话方案",
"business_type": "INVESTIGATION"
},
"5.谈话后安全风险评估表": {
"template_code": "POST_INTERVIEW_RISK_ASSESSMENT",
"name": "5.谈话后安全风险评估表",
"business_type": "INVESTIGATION"
},
"1.谈话笔录": {
"template_code": "INTERVIEW_RECORD",
"name": "1.谈话笔录",
"business_type": "INVESTIGATION"
},
"2.谈话询问对象情况摸底调查30问": {
"template_code": "INVESTIGATION_30_QUESTIONS",
"name": "2.谈话询问对象情况摸底调查30问",
"business_type": "INVESTIGATION"
},
"3.被谈话人权利义务告知书": {
"template_code": "RIGHTS_OBLIGATIONS_NOTICE",
"name": "3.被谈话人权利义务告知书",
"business_type": "INVESTIGATION"
},
"4.点对点交接单": {
"template_code": "HANDOVER_FORM",
"name": "4.点对点交接单",
"business_type": "INVESTIGATION"
},
"5.陪送交接单(新)": {
"template_code": "ESCORT_HANDOVER_FORM",
"name": "5.陪送交接单(新)",
"business_type": "INVESTIGATION"
},
"6.1保密承诺书(谈话对象使用-非中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_NON_PARTY",
"name": "6.1保密承诺书(谈话对象使用-非中共党员用)",
"business_type": "INVESTIGATION"
},
"6.2保密承诺书(谈话对象使用-中共党员用)": {
"template_code": "CONFIDENTIALITY_COMMITMENT_PARTY",
"name": "6.2保密承诺书(谈话对象使用-中共党员用)",
"business_type": "INVESTIGATION"
},
"7.办案人员-办案安全保密承诺书": {
"template_code": "INVESTIGATOR_CONFIDENTIALITY_COMMITMENT",
"name": "7.办案人员-办案安全保密承诺书",
"business_type": "INVESTIGATION"
},
"8-1请示报告卡初核报告结论 ": {
"template_code": "REPORT_CARD_CONCLUSION",
"name": "8-1请示报告卡初核报告结论 ",
"business_type": "INVESTIGATION"
},
"8.XXX初核情况报告": {
"template_code": "INVESTIGATION_REPORT",
"name": "8.XXX初核情况报告",
"business_type": "INVESTIGATION"
}
}
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits
return timestamp * 1_000_000 + random_part
def identify_document_type(file_name: str) -> Optional[Dict]:
"""根据完整文件名识别文档类型"""
base_name = Path(file_name).stem
if base_name in DOCUMENT_TYPE_MAPPING:
return DOCUMENT_TYPE_MAPPING[base_name]
return None
def upload_to_minio(file_path: Path) -> str:
"""上传文件到MinIO"""
try:
client = Minio(
MINIO_CONFIG['endpoint'],
access_key=MINIO_CONFIG['access_key'],
secret_key=MINIO_CONFIG['secret_key'],
secure=MINIO_CONFIG['secure']
)
found = client.bucket_exists(BUCKET_NAME)
if not found:
raise Exception(f"存储桶 '{BUCKET_NAME}' 不存在,请先创建")
now = datetime.now()
object_name = f'{TENANT_ID}/TEMPLATE/{now.year}/{now.month:02d}/{file_path.name}'
client.fput_object(
BUCKET_NAME,
object_name,
str(file_path),
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
return f"/{object_name}"
except S3Error as e:
raise Exception(f"MinIO错误: {e}")
except Exception as e:
raise Exception(f"上传文件时发生错误: {e}")
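The object key layout used above, {tenant}/TEMPLATE/{year}/{month}/{filename}, can be checked without a MinIO connection; this is a sketch of the key builder only:

```python
from datetime import datetime
from pathlib import Path

TENANT_ID = 615873064429507639

def build_object_name(file_path: Path, now: datetime) -> str:
    # Zero-padded month keeps keys sorting lexicographically by date
    return f'{TENANT_ID}/TEMPLATE/{now.year}/{now.month:02d}/{file_path.name}'

key = build_object_name(Path('template_finish/4.谈话方案.docx'), datetime(2025, 3, 7))
```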
def scan_directory_structure(base_dir: Path) -> List[Dict]:
"""
Scan the directory structure and return a list of nodes sorted by level.
Each node carries: type, name, path, parent_path, level, template_code, file_path
"""
nodes = []
def process_path(path: Path, parent_path: Optional[str] = None, level: int = 0):
"""递归处理路径"""
if path.is_file() and path.suffix == '.docx':
file_name = path.stem
doc_config = identify_document_type(file_name)
nodes.append({
'type': 'file',
'name': file_name,
'path': str(path),
'parent_path': parent_path,
'level': level,
'template_code': doc_config['template_code'] if doc_config else None,
'doc_config': doc_config,
'file_path': path
})
elif path.is_dir():
dir_name = path.name
nodes.append({
'type': 'directory',
'name': dir_name,
'path': str(path),
'parent_path': parent_path,
'level': level,
'template_code': None,
'doc_config': None,
'file_path': None
})
for child in sorted(path.iterdir()):
if child.name != '__pycache__':
process_path(child, str(path), level + 1)
if TEMPLATES_DIR.exists():
for item in sorted(TEMPLATES_DIR.iterdir()):
if item.name != '__pycache__':
process_path(item, None, 0)
# 按层级排序
return sorted(nodes, key=lambda x: (x['level'], x['path']))
def delete_old_data(conn, dry_run: bool = True):
"""删除旧数据"""
cursor = conn.cursor()
try:
print("\n" + "="*80)
print("删除旧数据")
print("="*80)
# 1. 先删除关联表 f_polic_file_field
print("\n1. 删除 f_polic_file_field 关联记录...")
if not dry_run:
# 先获取所有相关的 file_id
select_file_ids_sql = """
SELECT id FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(select_file_ids_sql, (TENANT_ID,))
file_ids = [row[0] for row in cursor.fetchall()]
if file_ids:
# 使用占位符构建SQL
placeholders = ','.join(['%s'] * len(file_ids))
delete_file_field_sql = f"""
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id IN ({placeholders})
"""
cursor.execute(delete_file_field_sql, [TENANT_ID] + file_ids)
deleted_count = cursor.rowcount
print(f" ✓ 删除了 {deleted_count} 条关联记录")
else:
print(" ✓ 没有需要删除的关联记录")
else:
# 模拟模式:只统计
count_sql = """
SELECT COUNT(*) FROM f_polic_file_field
WHERE tenant_id = %s AND file_id IN (
SELECT id FROM f_polic_file_config WHERE tenant_id = %s
)
"""
cursor.execute(count_sql, (TENANT_ID, TENANT_ID))
count = cursor.fetchone()[0]
print(f" [模拟] 将删除 {count} 条关联记录")
# 2. 删除 f_polic_file_config 记录
print("\n2. 删除 f_polic_file_config 记录...")
delete_config_sql = """
DELETE FROM f_polic_file_config
WHERE tenant_id = %s
"""
if not dry_run:
cursor.execute(delete_config_sql, (TENANT_ID,))
deleted_count = cursor.rowcount
print(f" ✓ 删除了 {deleted_count} 条配置记录")
conn.commit()
else:
count_sql = "SELECT COUNT(*) FROM f_polic_file_config WHERE tenant_id = %s"
cursor.execute(count_sql, (TENANT_ID,))
count = cursor.fetchone()[0]
print(f" [模拟] 将删除 {count} 条配置记录")
return True
except Exception as e:
if not dry_run:
conn.rollback()
print(f" ✗ 删除失败: {e}")
raise
finally:
cursor.close()
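The variable-length IN clause built from placeholders above can be demonstrated against an in-memory SQLite table (SQLite uses ? where pymysql uses %s; table and column names here are simplified stand-ins):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE file_field (file_id INTEGER)')
conn.executemany('INSERT INTO file_field VALUES (?)', [(1,), (2,), (3,)])

file_ids = [1, 3]
# One placeholder per id keeps the statement parameterized for any list length
placeholders = ','.join(['?'] * len(file_ids))
cur = conn.execute(f'DELETE FROM file_field WHERE file_id IN ({placeholders})', file_ids)
deleted = cur.rowcount
```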
def create_tree_structure(conn, nodes: List[Dict], upload_files: bool = True, dry_run: bool = True):
"""创建树状结构"""
cursor = conn.cursor()
try:
if not dry_run:
conn.autocommit(False)
print("\n" + "="*80)
print("创建树状结构")
print("="*80)
# 创建路径到ID的映射
path_to_id = {}
created_count = 0
updated_count = 0
# 按层级顺序处理
for node in nodes:
node_path = node['path']
node_name = node['name']
parent_path = node['parent_path']
level = node['level']
# 获取父节点ID
parent_id = path_to_id.get(parent_path) if parent_path else None
if node['type'] == 'directory':
# 创建目录节点
node_id = generate_id()
path_to_id[node_path] = node_id
if not dry_run:
# 目录节点不包含 template_code 字段
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path,
created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
node_id,
TENANT_ID,
parent_id,
node_name,
None,
None,
CREATED_BY,
UPDATED_BY,
1
))
indent = " " * level
parent_info = f" [父: {path_to_id.get(parent_path, 'None')}]" if parent_path else ""
print(f"{indent}{'[模拟]' if dry_run else ''}创建目录: {node_name} (ID: {node_id}){parent_info}")
created_count += 1
else:
# 创建文件节点
node_id = generate_id()
path_to_id[node_path] = node_id
doc_config = node.get('doc_config')
template_code = node.get('template_code')
file_path_obj = node.get('file_path')
# Upload the file to MinIO (if requested)
minio_path = None
if upload_files and file_path_obj and file_path_obj.exists():
try:
if not dry_run:
minio_path = upload_to_minio(file_path_obj)
else:
minio_path = f"/{TENANT_ID}/TEMPLATE/2025/12/{file_path_obj.name}"
print(f" {'[模拟]' if dry_run else ''}上传文件: {file_path_obj.name} -> {minio_path}")
except Exception as e:
print(f" ⚠ 上传文件失败: {e}")
# Keep going and use None as the path
# 构建 input_data
input_data = None
if doc_config:
input_data = json.dumps({
'template_code': doc_config['template_code'],
'business_type': doc_config['business_type']
}, ensure_ascii=False)
if not dry_run:
# If template_code is None, fall back to an empty string
template_code_value = template_code if template_code else ''
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, template_code,
created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
node_id,
TENANT_ID,
parent_id,
node_name,
input_data,
minio_path,
template_code_value,
CREATED_BY,
UPDATED_BY,
1
))
indent = " " * level
parent_info = f" [父: {path_to_id.get(parent_path, 'None')}]" if parent_path else ""
template_info = f" [code: {template_code}]" if template_code else ""
print(f"{indent}{'[模拟]' if dry_run else ''}创建文件: {node_name} (ID: {node_id}){parent_info}{template_info}")
created_count += 1
if not dry_run:
conn.commit()
print(f"\n✓ 创建完成!共创建 {created_count} 个节点")
else:
print(f"\n[模拟模式] 将创建 {created_count} 个节点")
return path_to_id
except Exception as e:
if not dry_run:
conn.rollback()
print(f"\n✗ 创建失败: {e}")
import traceback
traceback.print_exc()
raise
finally:
cursor.close()
def main():
"""主函数"""
print("="*80)
print("初始化模板树状结构(从目录结构完全重建)")
print("="*80)
print("\n⚠️ 警告:此操作将删除当前租户的所有模板数据!")
print(" 包括:")
print(" - f_polic_file_config 表中的所有记录")
print(" - f_polic_file_field 表中的相关关联记录")
print(" 然后根据 template_finish 目录结构完全重建")
# 确认
print("\n" + "="*80)
confirm1 = input("\n确认继续?(yes/no,默认no): ").strip().lower()
if confirm1 != 'yes':
print("已取消")
return
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功")
except Exception as e:
print(f"✗ 数据库连接失败: {e}")
return
try:
# 扫描目录结构
print("\n扫描目录结构...")
nodes = scan_directory_structure(TEMPLATES_DIR)
print(f" 找到 {len(nodes)} 个节点")
print(f" 其中目录: {len([n for n in nodes if n['type'] == 'directory'])}")
print(f" 其中文件: {len([n for n in nodes if n['type'] == 'file'])}")
# 显示预览
print("\n目录结构预览:")
for node in nodes[:10]: # 只显示前10个
indent = " " * node['level']
type_icon = "📁" if node['type'] == 'directory' else "📄"
print(f"{indent}{type_icon} {node['name']}")
if len(nodes) > 10:
print(f" ... 还有 {len(nodes) - 10} 个节点")
# 询问是否上传文件
print("\n" + "="*80)
upload_files = input("\n是否上传文件到MinIO?(yes/no,默认yes): ").strip().lower()
upload_files = upload_files != 'no'
# 先执行模拟删除
print("\n执行模拟删除...")
delete_old_data(conn, dry_run=True)
# 再执行模拟创建
print("\n执行模拟创建...")
create_tree_structure(conn, nodes, upload_files=upload_files, dry_run=True)
# 最终确认
print("\n" + "="*80)
confirm2 = input("\n确认执行实际更新?(yes/no,默认no): ").strip().lower()
if confirm2 != 'yes':
print("已取消")
return
# 执行实际删除
print("\n执行实际删除...")
delete_old_data(conn, dry_run=False)
# 执行实际创建
print("\n执行实际创建...")
create_tree_structure(conn, nodes, upload_files=upload_files, dry_run=False)
print("\n" + "="*80)
print("初始化完成!")
print("="*80)
except Exception as e:
print(f"\n✗ 初始化失败: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@ -0,0 +1,234 @@
"""
将所有模板与两个输入字段关联
- 线索信息 (clue_info)
- 被核查人员工作基本情况线索 (target_basic_info_clue)
"""
import pymysql
import os
import sys
import time
import random
from datetime import datetime
# 设置输出编码为UTF-8
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits
return timestamp * 1_000_000 + random_part
def get_input_field_ids(conn):
"""获取两个输入字段的ID"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
field_codes = ['clue_info', 'target_basic_info_clue']
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE tenant_id = %s
AND filed_code IN (%s, %s)
AND field_type = 1
AND state = 1
""", (TENANT_ID, field_codes[0], field_codes[1]))
fields = cursor.fetchall()
field_map = {field['filed_code']: field for field in fields}
result = {}
for code in field_codes:
if code in field_map:
result[code] = field_map[code]
else:
print(f"[WARN] 未找到字段: {code}")
return result
def get_all_templates(conn):
"""获取所有启用的模板"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
cursor.execute("""
SELECT id, name
FROM f_polic_file_config
WHERE tenant_id = %s AND state = 1
ORDER BY name
""", (TENANT_ID,))
return cursor.fetchall()
def get_existing_relations(conn, template_id, field_ids):
"""获取模板与字段的现有关联关系"""
cursor = conn.cursor()
if not field_ids:
return set()
placeholders = ','.join(['%s'] * len(field_ids))
cursor.execute(f"""
SELECT filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id = %s
AND filed_id IN ({placeholders})
AND state = 1
""", [TENANT_ID, template_id] + list(field_ids))
return {row[0] for row in cursor.fetchall()}
def create_relation(conn, template_id, field_id):
"""创建模板与字段的关联关系"""
cursor = conn.cursor()
current_time = datetime.now()
# 检查是否已存在
cursor.execute("""
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s
AND file_id = %s
AND filed_id = %s
""", (TENANT_ID, template_id, field_id))
if cursor.fetchone():
return False # 已存在,不需要创建
# 创建新关联
relation_id = generate_id()
cursor.execute("""
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s, 1)
""", (
relation_id,
TENANT_ID,
template_id,
field_id,
current_time,
CREATED_BY,
current_time,
UPDATED_BY
))
return True # 创建成功
def main():
"""主函数"""
print("="*80)
print("将所有模板与输入字段关联")
print("="*80)
try:
conn = pymysql.connect(**DB_CONFIG)
print("[OK] 数据库连接成功\n")
except Exception as e:
print(f"[ERROR] 数据库连接失败: {e}")
return
try:
# 1. 获取输入字段ID
print("1. 获取输入字段ID...")
input_fields = get_input_field_ids(conn)
if len(input_fields) != 2:
print(f"[ERROR] 未找到所有输入字段,只找到: {list(input_fields.keys())}")
return
field_ids = [field['id'] for field in input_fields.values()]
print(f" 找到字段:")
for code, field in input_fields.items():
print(f" - {field['name']} ({code}): ID={field['id']}")
print()
# 2. 获取所有模板
print("2. 获取所有启用的模板...")
templates = get_all_templates(conn)
print(f" 找到 {len(templates)} 个模板\n")
# 3. 为每个模板创建关联关系
print("3. 创建关联关系...")
created_count = 0
existing_count = 0
error_count = 0
for template in templates:
template_id = template['id']
template_name = template['name']
# 获取现有关联
existing_relations = get_existing_relations(conn, template_id, field_ids)
# 为每个字段创建关联(如果不存在)
for field_code, field_info in input_fields.items():
field_id = field_info['id']
if field_id in existing_relations:
existing_count += 1
continue
try:
if create_relation(conn, template_id, field_id):
created_count += 1
print(f" [OK] {template_name} <- {field_info['name']} ({field_code})")
else:
existing_count += 1
except Exception as e:
error_count += 1
print(f" [ERROR] {template_name} <- {field_info['name']}: {e}")
# 提交事务
conn.commit()
# 4. 统计结果
print("\n" + "="*80)
print("执行结果")
print("="*80)
print(f"模板总数: {len(templates)}")
print(f"字段总数: {len(input_fields)}")
print(f"预期关联数: {len(templates) * len(input_fields)}")
print(f"新创建关联: {created_count}")
print(f"已存在关联: {existing_count}")
print(f"错误数量: {error_count}")
print(f"实际关联数: {created_count + existing_count}")
if error_count == 0:
print("\n[OK] 所有关联关系已成功创建或已存在")
else:
print(f"\n[WARN] 有 {error_count} 个关联关系创建失败")
except Exception as e:
conn.rollback()
print(f"\n[ERROR] 执行过程中发生错误: {e}")
import traceback
traceback.print_exc()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@ -0,0 +1,372 @@
"""
处理"6.1保密承诺书(谈话对象使用-非中共党员用).docx"
- 解析占位符
- 上传到MinIO
- 更新数据库
"""
import os
import sys
import re
import json
import pymysql
from minio import Minio
from minio.error import S3Error
from datetime import datetime
from pathlib import Path
from docx import Document
from typing import Dict, List, Optional, Tuple
# Force UTF-8 output encoding (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
# MinIO连接配置
MINIO_CONFIG = {
'endpoint': 'minio.datacubeworld.com:9000',
'access_key': 'JOLXFXny3avFSzB0uRA5',
'secret_key': 'G1BR8jStNfovkfH5ou39EmPl34E4l7dGrnd3Cz0I',
'secure': True
}
# 数据库连接配置
DB_CONFIG = {
'host': '152.136.177.240',
'port': 5012,
'user': 'finyx',
'password': '6QsGK6MpePZDE57Z',
'database': 'finyx',
'charset': 'utf8mb4'
}
# 固定值
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
BUCKET_NAME = 'finyx'
# 文件路径
TEMPLATE_FILE = 'template_finish/2-初核模版/2.谈话审批/走读式谈话流程/6.1保密承诺书(谈话对象使用-非中共党员用).docx'
PARENT_ID = 1765273962716807 # 走读式谈话流程的ID
TEMPLATE_NAME = '6.1保密承诺书(谈话对象使用-非中共党员用)'
def generate_id():
"""Generate an ID: millisecond timestamp with a 6-digit random suffix."""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
# Shift by 1_000_000 so the 6-digit random part gets its own digits
return timestamp * 1_000_000 + random_part
def extract_placeholders_from_docx(file_path: str) -> List[str]:
"""
从docx文件中提取所有占位符
Args:
file_path: docx文件路径
Returns:
Placeholder list, e.g.: ['field_code1', 'field_code2', ...]
"""
placeholders = set()
pattern = r'\{\{([^}]+)\}\}' # 匹配 {{field_code}} 格式
try:
doc = Document(file_path)
# 从段落中提取占位符
for paragraph in doc.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
# 从表格中提取占位符
for table in doc.tables:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = re.findall(pattern, text)
for match in matches:
placeholders.add(match.strip())
except Exception as e:
print(f" 错误: 读取文件失败 - {str(e)}")
return []
return sorted(list(placeholders))
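The placeholder regex can be exercised on a plain string (the sample field codes below are hypothetical); since the extractor reads paragraph.text, which joins a paragraph's runs, placeholders split across runs are still matched at the paragraph level:

```python
import re

PATTERN = r'\{\{([^}]+)\}\}'  # {{field_code}}

def extract_placeholders(text: str) -> list:
    """Collect unique placeholder codes, sorted for stable output."""
    return sorted({m.strip() for m in re.findall(PATTERN, text)})

sample = '被谈话人:{{ target_name }},谈话时间:{{interview_date}};再次出现 {{interview_date}}'
codes = extract_placeholders(sample)
```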
def upload_to_minio(file_path: str, minio_client: Minio) -> str:
"""
上传文件到MinIO
Args:
file_path: 本地文件路径
minio_client: MinIO客户端实例
Returns:
MinIO中的相对路径
"""
try:
# 检查存储桶是否存在
found = minio_client.bucket_exists(BUCKET_NAME)
if not found:
raise Exception(f"存储桶 '{BUCKET_NAME}' 不存在,请先创建")
# Build the MinIO object path (dated with the current year/month)
now = datetime.now()
file_name = Path(file_path).name
object_name = f'{TENANT_ID}/TEMPLATE/{now.year}/{now.month:02d}/{file_name}'
# 上传文件
minio_client.fput_object(
BUCKET_NAME,
object_name,
file_path,
content_type='application/vnd.openxmlformats-officedocument.wordprocessingml.document'
)
# 返回相对路径(以/开头)
return f"/{object_name}"
except S3Error as e:
raise Exception(f"MinIO错误: {e}")
except Exception as e:
raise Exception(f"上传文件时发生错误: {e}")
def get_db_fields(conn) -> Dict[str, Dict]:
"""
获取数据库中所有字段field_type=2的输出字段
Returns:
字典key为filed_codevalue为字段信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s AND field_type = 2
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {}
for field in fields:
result[field['filed_code']] = {
'id': field['id'],
'name': field['name'],
'filed_code': field['filed_code'],
'field_type': field['field_type']
}
cursor.close()
return result
def match_placeholders_to_fields(placeholders: List[str], fields: Dict[str, Dict]) -> Tuple[List[int], List[str]]:
"""
匹配占位符到数据库字段
Returns:
(匹配的字段ID列表, 未匹配的占位符列表)
"""
matched_field_ids = []
unmatched_placeholders = []
for placeholder in placeholders:
if placeholder in fields:
matched_field_ids.append(fields[placeholder]['id'])
else:
unmatched_placeholders.append(placeholder)
return matched_field_ids, unmatched_placeholders
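Matching is a plain membership split; a minimal sketch with hypothetical field ids:

```python
def match_placeholders(placeholders, fields):
    """Split placeholders into matched field ids and unmatched codes."""
    matched_ids, unmatched = [], []
    for p in placeholders:
        if p in fields:
            matched_ids.append(fields[p]['id'])
        else:
            unmatched.append(p)
    return matched_ids, unmatched

fields = {'target_name': {'id': 101}, 'interview_date': {'id': 102}}
matched, missing = match_placeholders(['target_name', 'unknown_code'], fields)
```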
def create_or_update_template(conn, template_name: str, file_path: str, minio_path: str, parent_id: Optional[int]) -> int:
"""
创建或更新模板记录
Returns:
模板ID
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# Check for an existing record (matched by name and parent_id)
sql = """
SELECT id, name, file_path, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s AND name = %s AND parent_id = %s
"""
cursor.execute(sql, (TENANT_ID, template_name, parent_id))
existing = cursor.fetchone()
if existing:
# 更新现有记录
template_id = existing['id']
update_sql = """
UPDATE f_polic_file_config
SET file_path = %s, updated_time = NOW(), updated_by = %s, state = 1
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (minio_path, UPDATED_BY, template_id, TENANT_ID))
conn.commit()
print(f" [UPDATE] 更新模板记录 (ID: {template_id})")
return template_id
else:
# 创建新记录
template_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
template_id,
TENANT_ID,
parent_id,
template_name,
None, # input_data
minio_path,
CREATED_BY,
UPDATED_BY,
1 # state: 1表示启用
))
conn.commit()
print(f" [CREATE] 创建模板记录 (ID: {template_id})")
return template_id
except Exception as e:
conn.rollback()
raise Exception(f"创建或更新模板失败: {str(e)}")
finally:
cursor.close()
def update_template_field_relations(conn, template_id: int, field_ids: List[int]):
"""
更新模板-字段关联关系
"""
cursor = conn.cursor()
try:
# 删除旧的关联关系
delete_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_sql, (TENANT_ID, template_id))
# 插入新的关联关系
if field_ids:
insert_sql = """
INSERT INTO f_polic_file_field
(tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by)
VALUES (%s, %s, %s, NOW(), %s, NOW(), %s)
"""
for field_id in field_ids:
cursor.execute(insert_sql, (TENANT_ID, template_id, field_id, CREATED_BY, UPDATED_BY))
conn.commit()
print(f" [UPDATE] 更新字段关联关系: {len(field_ids)} 个字段")
except Exception as e:
conn.rollback()
raise Exception(f"更新字段关联关系失败: {str(e)}")
finally:
cursor.close()
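The delete-then-reinsert pattern above is idempotent; a sketch against in-memory SQLite (simplified schema, ? placeholders instead of pymysql's %s) shows that rerunning never duplicates rows:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE file_field (file_id INTEGER, filed_id INTEGER)')

def replace_relations(conn, file_id, field_ids):
    """Delete the old rows, then bulk-insert the new set in one pass."""
    conn.execute('DELETE FROM file_field WHERE file_id = ?', (file_id,))
    conn.executemany('INSERT INTO file_field VALUES (?, ?)',
                     [(file_id, fid) for fid in field_ids])
    conn.commit()

replace_relations(conn, 42, [101, 102])
replace_relations(conn, 42, [102, 103])  # rerunning replaces, never duplicates
rows = sorted(r[1] for r in conn.execute('SELECT * FROM file_field'))
```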
def main():
"""主函数"""
print("=" * 80)
print("处理保密承诺书(非中共党员用)模板")
print("=" * 80)
print()
# 检查文件是否存在
if not os.path.exists(TEMPLATE_FILE):
print(f"错误: 文件不存在 - {TEMPLATE_FILE}")
return
print(f"文件路径: {TEMPLATE_FILE}")
print()
try:
# 1. 提取占位符
print("1. 提取占位符...")
placeholders = extract_placeholders_from_docx(TEMPLATE_FILE)
print(f" 找到 {len(placeholders)} 个占位符:")
for i, placeholder in enumerate(placeholders, 1):
print(f" {i}. {{{{ {placeholder} }}}}")
print()
# 2. 连接数据库和MinIO
print("2. 连接数据库和MinIO...")
conn = pymysql.connect(**DB_CONFIG)
minio_client = Minio(
MINIO_CONFIG['endpoint'],
access_key=MINIO_CONFIG['access_key'],
secret_key=MINIO_CONFIG['secret_key'],
secure=MINIO_CONFIG['secure']
)
print(" [OK] 连接成功\n")
# 3. 获取数据库字段
print("3. 获取数据库字段...")
db_fields = get_db_fields(conn)
print(f" [OK] 找到 {len(db_fields)} 个输出字段\n")
# 4. 匹配占位符到字段
print("4. 匹配占位符到字段...")
matched_field_ids, unmatched_placeholders = match_placeholders_to_fields(placeholders, db_fields)
print(f" 匹配成功: {len(matched_field_ids)}")
print(f" 未匹配: {len(unmatched_placeholders)}")
if unmatched_placeholders:
print(f" 未匹配的占位符: {', '.join(unmatched_placeholders)}")
print()
# 5. 上传到MinIO
print("5. 上传到MinIO...")
minio_path = upload_to_minio(TEMPLATE_FILE, minio_client)
print(f" [OK] MinIO路径: {minio_path}\n")
# 6. 创建或更新数据库记录
print("6. 创建或更新数据库记录...")
template_id = create_or_update_template(conn, TEMPLATE_NAME, TEMPLATE_FILE, minio_path, PARENT_ID)
print(f" [OK] 模板ID: {template_id}\n")
# 7. 更新字段关联关系
print("7. 更新字段关联关系...")
update_template_field_relations(conn, template_id, matched_field_ids)
print()
print("=" * 80)
print("处理完成!")
print("=" * 80)
print(f"模板ID: {template_id}")
print(f"MinIO路径: {minio_path}")
print(f"关联字段数: {len(matched_field_ids)}")
except Exception as e:
print(f"\n[ERROR] 发生错误: {e}")
import traceback
traceback.print_exc()
if 'conn' in locals():
conn.rollback()
finally:
if 'conn' in locals():
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -0,0 +1,318 @@
"""
Template-field relation query examples
Demonstrates how to query a template's linked input and output fields
"""
import pymysql
import os
from typing import Dict, List, Optional
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
def get_template_fields_by_name(template_name: str) -> Optional[Dict]:
"""
Fetch the fields linked to a template, by template name
Args:
template_name: template name, e.g. '初步核实审批表'
Returns:
dict: containing template_id, template_name, input_fields and output_fields
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT
fc.id AS template_id,
fc.name AS template_name,
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
f.field_type
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fc.tenant_id = %s
AND fc.name = %s
AND fc.state = 1
AND fff.state = 1
AND f.state = 1
ORDER BY f.field_type, f.name
"""
cursor.execute(sql, (TENANT_ID, template_name))
rows = cursor.fetchall()
if not rows:
return None
result = {
'template_id': rows[0]['template_id'],
'template_name': rows[0]['template_name'],
'input_fields': [],
'output_fields': []
}
for row in rows:
field_info = {
'id': row['field_id'],
'name': row['field_name'],
'field_code': row['field_code'],
'field_type': row['field_type']
}
if row['field_type'] == 1:
result['input_fields'].append(field_info)
elif row['field_type'] == 2:
result['output_fields'].append(field_info)
return result
finally:
cursor.close()
conn.close()
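The row-grouping step above, which splits joined rows into input and output fields by field_type, can be isolated as a small pure function; a sketch with hypothetical rows:

```python
# Group joined rows by field_type: 1 = input, 2 = output.
# Rows with any other field_type are silently dropped, matching the
# if/elif in get_template_fields_by_name.
def group_fields(rows):
    grouped = {'input_fields': [], 'output_fields': []}
    buckets = {1: 'input_fields', 2: 'output_fields'}
    for row in rows:
        bucket = buckets.get(row['field_type'])
        if bucket:
            grouped[bucket].append(row['field_name'])
    return grouped

rows = [{'field_name': 'clue_info', 'field_type': 1},
        {'field_name': 'target_name', 'field_type': 2},
        {'field_name': 'legacy', 'field_type': 9}]
# group_fields(rows) keeps clue_info as input and target_name as output
```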
def get_template_fields_by_id(template_id: int) -> Optional[Dict]:
"""
Fetch the fields linked to a template, by template ID
Args:
template_id: template ID
Returns:
dict: containing template_id, template_name, input_fields and output_fields
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 先获取模板名称
sql_template = """
SELECT id, name
FROM f_polic_file_config
WHERE id = %s AND tenant_id = %s AND state = 1
"""
cursor.execute(sql_template, (template_id, TENANT_ID))
template = cursor.fetchone()
if not template:
return None
# 获取字段
sql_fields = """
SELECT
f.id AS field_id,
f.name AS field_name,
f.filed_code AS field_code,
f.field_type
FROM f_polic_file_field fff
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fff.file_id = %s
AND fff.tenant_id = %s
AND fff.state = 1
AND f.state = 1
ORDER BY f.field_type, f.name
"""
cursor.execute(sql_fields, (template_id, TENANT_ID))
rows = cursor.fetchall()
result = {
'template_id': template['id'],
'template_name': template['name'],
'input_fields': [],
'output_fields': []
}
for row in rows:
field_info = {
'id': row['field_id'],
'name': row['field_name'],
'field_code': row['field_code'],
'field_type': row['field_type']
}
if row['field_type'] == 1:
result['input_fields'].append(field_info)
elif row['field_type'] == 2:
result['output_fields'].append(field_info)
return result
finally:
cursor.close()
conn.close()
def get_all_templates_with_field_stats() -> List[Dict]:
"""
Fetch all templates with per-template field statistics
Returns:
list: templates, each with its field counts
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT
fc.id AS template_id,
fc.name AS template_name,
COUNT(DISTINCT CASE WHEN f.field_type = 1 THEN f.id END) AS input_field_count,
COUNT(DISTINCT CASE WHEN f.field_type = 2 THEN f.id END) AS output_field_count,
COUNT(DISTINCT f.id) AS total_field_count
FROM f_polic_file_config fc
LEFT JOIN f_polic_file_field fff ON fc.id = fff.file_id AND fff.state = 1
LEFT JOIN f_polic_field f ON fff.filed_id = f.id AND f.state = 1
WHERE fc.tenant_id = %s
AND fc.state = 1
GROUP BY fc.id, fc.name
ORDER BY fc.name
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
return [
{
'template_id': t['template_id'],
'template_name': t['template_name'],
'input_field_count': t['input_field_count'] or 0,
'output_field_count': t['output_field_count'] or 0,
'total_field_count': t['total_field_count'] or 0
}
for t in templates
]
finally:
cursor.close()
conn.close()
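The COUNT(DISTINCT CASE WHEN ...) pattern above does conditional counting in one SQL pass; the same aggregation can be cross-checked in Python over the joined rows (the data below is hypothetical):

```python
from collections import Counter

# Conditional counting over joined rows: field_id is None for templates
# that matched no field through the LEFT JOINs.
def field_stats(rows):
    counts = Counter(r['field_type'] for r in rows if r['field_id'] is not None)
    return {'input': counts[1], 'output': counts[2],
            'total': counts[1] + counts[2]}

rows = [{'field_id': 1, 'field_type': 1},
        {'field_id': 2, 'field_type': 2},
        {'field_id': None, 'field_type': None}]
# field_stats(rows) == {'input': 1, 'output': 1, 'total': 2}
```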
def find_templates_using_field(field_code: str) -> List[Dict]:
"""
Find all templates that use a given field
Args:
field_code: field code, e.g. 'target_name'
Returns:
list: templates using that field
"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT DISTINCT
fc.id AS template_id,
fc.name AS template_name
FROM f_polic_file_config fc
INNER JOIN f_polic_file_field fff ON fc.id = fff.file_id
INNER JOIN f_polic_field f ON fff.filed_id = f.id
WHERE fc.tenant_id = %s
AND f.tenant_id = %s
AND f.filed_code = %s
AND fc.state = 1
AND fff.state = 1
AND f.state = 1
ORDER BY fc.name
"""
cursor.execute(sql, (TENANT_ID, TENANT_ID, field_code))
templates = cursor.fetchall()
return [
{
'template_id': t['template_id'],
'template_name': t['template_name']
}
for t in templates
]
finally:
cursor.close()
conn.close()
def print_template_fields(result: Dict):
"""打印模板字段信息"""
if not result:
print("未找到模板")
return
print("="*80)
print(f"模板: {result['template_name']} (ID: {result['template_id']})")
print("="*80)
print(f"\n输入字段 ({len(result['input_fields'])} 个):")
if result['input_fields']:
for field in result['input_fields']:
print(f" - {field['name']} ({field['field_code']})")
else:
print(" (无)")
print(f"\n输出字段 ({len(result['output_fields'])} 个):")
if result['output_fields']:
for field in result['output_fields']:
print(f" - {field['name']} ({field['field_code']})")
else:
print(" (无)")
def main():
"""主函数 - 演示各种查询方式"""
print("="*80)
print("模板字段关联查询示例")
print("="*80)
# 示例1: 根据模板名称查询
print("\n【示例1】根据模板名称查询字段")
print("-" * 80)
# 注意:模板名称需要完全匹配,如 "2.初步核实审批表XXX"
result = get_template_fields_by_name('2.初步核实审批表XXX')
if not result:
# 尝试其他可能的名称
result = get_template_fields_by_name('初步核实审批表')
print_template_fields(result)
# 示例2: 获取所有模板的字段统计
print("\n\n【示例2】获取所有模板的字段统计")
print("-" * 80)
templates = get_all_templates_with_field_stats()
print(f"共找到 {len(templates)} 个模板:\n")
for template in templates[:5]: # 只显示前5个
print(f" {template['template_name']} (ID: {template['template_id']})")
print(f" 输入字段: {template['input_field_count']}")
print(f" 输出字段: {template['output_field_count']}")
print(f" 总字段数: {template['total_field_count']}\n")
if len(templates) > 5:
print(f" ... 还有 {len(templates) - 5} 个模板")
# 示例3: 查找使用特定字段的模板
print("\n\n【示例3】查找使用 'target_name' 字段的模板")
print("-" * 80)
templates_using_field = find_templates_using_field('target_name')
print(f"共找到 {len(templates_using_field)} 个模板使用该字段:")
for template in templates_using_field:
print(f" - {template['template_name']} (ID: {template['template_id']})")
print("\n" + "="*80)
print("查询完成")
print("="*80)
if __name__ == '__main__':
main()


@@ -0,0 +1,536 @@
"""
Rebuild the template-to-field relations
Re-creates the f_polic_file_field links based on template names,
no longer relying on the input_data / template_code columns
"""
import pymysql
import os
import json
from typing import Dict, List, Set, Optional
from datetime import datetime
from collections import defaultdict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# 模板名称到字段编码的映射(根据业务逻辑定义)
# 格式:{模板名称: {'input_fields': [字段编码列表], 'output_fields': [字段编码列表]}}
TEMPLATE_FIELD_MAPPING = {
# 初步核实审批表
'初步核实审批表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_organization',
'target_position', 'target_gender', 'target_date_of_birth', 'target_age',
'target_education_level', 'target_political_status', 'target_professional_rank',
'clue_source', 'target_issue_description', 'department_opinion', 'filler_name'
]
},
# 谈话前安全风险评估表
'谈话前安全风险评估表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_family_situation', 'target_social_relations', 'target_health_status',
'target_personality', 'target_tolerance', 'target_issue_severity',
'target_other_issues_possibility', 'target_previous_investigation',
'target_negative_events', 'target_other_situation', 'risk_level'
]
},
# 请示报告卡
'请示报告卡': {
'input_fields': ['clue_info'],
'output_fields': ['target_name', 'target_organization_and_position', 'report_card_request_time']
},
# 初核方案
'初核方案': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_work_basic_info',
'target_issue_description', 'investigation_unit_name', 'investigation_team_leader_name',
'investigation_team_member_names', 'investigation_location'
]
},
# 谈话通知书
'谈话通知书': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话通知书第一联
'谈话通知书第一联': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话通知书第二联
'谈话通知书第二联': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话通知书第三联
'谈话通知书第三联': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name', 'notification_time', 'notification_location'
]
},
# 谈话笔录
'谈话笔录': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_gender',
'target_date_of_birth_full', 'target_political_status', 'target_address',
'target_registered_address', 'target_contact', 'target_place_of_origin',
'target_ethnicity', 'target_id_number', 'investigation_team_code'
]
},
# 谈话后安全风险评估表
'谈话后安全风险评估表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_gender',
'target_date_of_birth_full', 'target_political_status', 'target_address',
'target_registered_address', 'target_contact', 'target_place_of_origin',
'target_ethnicity', 'target_id_number', 'investigation_team_code'
]
},
# XXX初核情况报告
'XXX初核情况报告': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_issue_description',
'target_work_basic_info', 'investigation_unit_name', 'investigation_team_leader_name'
]
},
# 走读式谈话审批
'走读式谈话审批': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name'
]
},
# 走读式谈话流程
'走读式谈话流程': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name'
]
},
# 谈话审批 / 谈话审批表
'谈话审批': {
'input_fields': ['target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_id_number',
'appointment_time', 'appointment_location', 'approval_time',
'handling_department', 'handler_name'
]
},
'谈话审批表': {
'input_fields': ['clue_info', 'target_basic_info_clue'],
'output_fields': [
'target_name', 'target_organization_and_position', 'target_gender',
'target_date_of_birth_full', 'target_political_status', 'target_address',
'target_registered_address', 'target_contact', 'target_place_of_origin',
'target_ethnicity', 'target_id_number', 'investigation_team_code'
]
},
}
# 模板名称的标准化映射(处理不同的命名方式)
TEMPLATE_NAME_NORMALIZE = {
'1.请示报告卡XXX': '请示报告卡',
'2.初步核实审批表XXX': '初步核实审批表',
'3.附件初核方案(XXX)': '初核方案',
'8.XXX初核情况报告': 'XXX初核情况报告',
'2.谈话审批': '谈话审批',
'2谈话审批表': '谈话审批表',
}
def generate_id():
"""Generate an ID (timestamp + random part, a rough imitation of a snowflake ID)"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
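Note that the six-digit random part does not fit into the three decimal digits freed by `timestamp * 1000`, so it spills into the timestamp's low digits and IDs from adjacent milliseconds can sort out of order. Usually harmless for uniqueness, but worth knowing. A demonstration with fixed, illustrative inputs:

```python
# Same arithmetic as generate_id, but with the timestamp and random part
# passed in so the digit overlap is visible.
def make_id(timestamp_ms, random_part):
    return timestamp_ms * 1000 + random_part

earlier = make_id(1700000000000, 999999)  # large random part
later = make_id(1700000000001, 100000)    # next millisecond, small random part
# earlier > later: the later millisecond produced the smaller ID
```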
def normalize_template_name(name: str) -> str:
"""标准化模板名称"""
# 先检查映射表
if name in TEMPLATE_NAME_NORMALIZE:
return TEMPLATE_NAME_NORMALIZE[name]
# 移除常见的后缀和前缀
name = name.strip()
# 移除括号内容
import re
name = re.sub(r'[(].*?[)]', '', name)
name = name.strip()
# 移除数字前缀和点号
name = re.sub(r'^\d+\.', '', name)
name = name.strip()
return name
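For names not covered by the mapping table, the fallback strips bracketed suffixes first, then numeric prefixes (the order matters). A stand-alone rendition of just that fallback path:

```python
import re

# Fallback normalization only; names hit by TEMPLATE_NAME_NORMALIZE
# never reach this logic.
def normalize_fallback(name: str) -> str:
    name = re.sub(r'[(].*?[)]', '', name.strip())  # drop fullwidth-bracketed parts
    name = re.sub(r'^\d+\.', '', name.strip())       # drop "3."-style prefixes
    return name.strip()

# normalize_fallback('3.附件初核方案(XXX)') == '附件初核方案'
```

That example also shows why the mapping table exists: the fallback only gets as far as '附件初核方案', while TEMPLATE_NAME_NORMALIZE maps the same name all the way to '初核方案'.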
def get_all_templates(conn) -> Dict:
"""获取所有模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, parent_id, state
FROM f_polic_file_config
WHERE tenant_id = %s
ORDER BY name
"""
cursor.execute(sql, (TENANT_ID,))
templates = cursor.fetchall()
result = {}
for template in templates:
name = template['name']
normalized_name = normalize_template_name(name)
# 处理state字段可能是二进制格式
state = template['state']
if isinstance(state, bytes):
state = int.from_bytes(state, byteorder='big')
elif isinstance(state, (int, str)):
state = int(state)
else:
state = 0
result[template['id']] = {
'id': template['id'],
'name': name,
'normalized_name': normalized_name,
'parent_id': template['parent_id'],
'state': state
}
cursor.close()
return result
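The state handling above exists because binary MySQL columns (likely a BIT column here) can come back from PyMySQL as bytes objects rather than ints; a minimal sketch of the same normalization:

```python
# Normalize a state value that may arrive as bytes (binary/BIT column),
# int, or str; anything else falls back to 0, as in get_all_templates.
def normalize_state(state):
    if isinstance(state, bytes):
        return int.from_bytes(state, byteorder='big')
    if isinstance(state, (int, str)):
        return int(state)
    return 0

# normalize_state(b'\x01') == 1, normalize_state('1') == 1, normalize_state(None) == 0
```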
def get_all_fields(conn) -> Dict:
"""获取所有字段定义"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
ORDER BY field_type, filed_code
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {
'by_code': {},
'by_name': {},
'input_fields': [],
'output_fields': []
}
for field in fields:
field_code = field['filed_code']
field_name = field['name']
field_type = field['field_type']
result['by_code'][field_code] = field
result['by_name'][field_name] = field
if field_type == 1:
result['input_fields'].append(field)
elif field_type == 2:
result['output_fields'].append(field)
cursor.close()
return result
def get_existing_relations(conn) -> Set[tuple]:
"""获取现有的关联关系"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT file_id, filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
relations = cursor.fetchall()
result = {(rel['file_id'], rel['filed_id']) for rel in relations}
cursor.close()
return result
def rebuild_template_relations(conn, template_id: int, template_name: str,
normalized_name: str, field_mapping: Dict,
dry_run: bool = True) -> Dict:
"""重建单个模板的关联关系"""
cursor = conn.cursor()
# 查找模板对应的字段配置
template_config = None
# 优先精确匹配标准化名称
if normalized_name in TEMPLATE_FIELD_MAPPING:
template_config = TEMPLATE_FIELD_MAPPING[normalized_name]
else:
# 尝试模糊匹配
for name, config in TEMPLATE_FIELD_MAPPING.items():
if name == normalized_name or name in normalized_name or normalized_name in name:
template_config = config
break
# 也检查原始名称
if name in template_name or template_name in name:
template_config = config
break
if not template_config:
return {
'template_id': template_id,
'template_name': template_name,
'status': 'skipped',
'reason': '未找到字段配置映射',
'input_count': 0,
'output_count': 0
}
input_field_codes = template_config.get('input_fields', [])
output_field_codes = template_config.get('output_fields', [])
# 查找字段ID
input_field_ids = []
output_field_ids = []
for field_code in input_field_codes:
field = field_mapping['by_code'].get(field_code)
if field:
if field['field_type'] == 1:
input_field_ids.append(field['id'])
else:
print(f" ⚠ 警告: 字段 {field_code} 应该是输入字段,但实际类型为 {field['field_type']}")
else:
print(f" ⚠ 警告: 字段 {field_code} 不存在")
for field_code in output_field_codes:
field = field_mapping['by_code'].get(field_code)
if field:
if field['field_type'] == 2:
output_field_ids.append(field['id'])
else:
print(f" ⚠ 警告: 字段 {field_code} 应该是输出字段,但实际类型为 {field['field_type']}")
else:
print(f" ⚠ 警告: 字段 {field_code} 不存在")
# 删除旧的关联关系
if not dry_run:
delete_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_sql, (TENANT_ID, template_id))
deleted_count = cursor.rowcount
else:
deleted_count = 0
# 创建新的关联关系
created_count = 0
all_field_ids = input_field_ids + output_field_ids
for field_id in all_field_ids:
if not dry_run:
# 检查是否已存在(虽然已经删除了,但为了安全还是检查一下)
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, template_id, field_id))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, template_id, field_id,
CREATED_BY, UPDATED_BY, 1 # state=1 表示启用
))
created_count += 1
else:
created_count += 1  # dry run: count what would be created
if not dry_run:
conn.commit()
return {
'template_id': template_id,
'template_name': template_name,
'normalized_name': normalized_name,
'status': 'success',
'deleted_count': deleted_count,
'input_count': len(input_field_ids),
'output_count': len(output_field_ids),
'created_count': created_count
}
def main(dry_run: bool = True):
"""主函数"""
print("="*80)
print("重新建立模板和字段的关联关系")
print("="*80)
if dry_run:
print("\n[DRY RUN模式 - 不会实际修改数据库]")
else:
print("\n[实际执行模式 - 将修改数据库]")
try:
conn = pymysql.connect(**DB_CONFIG)
print("✓ 数据库连接成功\n")
# 获取所有模板
print("1. 获取所有模板配置...")
templates = get_all_templates(conn)
print(f" 找到 {len(templates)} 个模板")
# 获取所有字段
print("\n2. 获取所有字段定义...")
field_mapping = get_all_fields(conn)
print(f" 输入字段: {len(field_mapping['input_fields'])}")
print(f" 输出字段: {len(field_mapping['output_fields'])}")
print(f" 总字段数: {len(field_mapping['by_code'])}")
# 获取现有关联关系
print("\n3. 获取现有关联关系...")
existing_relations = get_existing_relations(conn)
print(f" 现有关联关系: {len(existing_relations)}")
# 重建关联关系
print("\n4. 重建模板和字段的关联关系...")
print("="*80)
results = []
for template_id, template_info in templates.items():
template_name = template_info['name']
normalized_name = template_info['normalized_name']
state = template_info['state']
# 处理所有模板(包括未启用的,因为可能需要建立关联)
# 但可以记录状态
status_note = f" (state={state})" if state != 1 else ""
if state != 1:
print(f"\n处理未启用的模板: {template_name}{status_note}")
print(f"\n处理模板: {template_name}")
print(f" 标准化名称: {normalized_name}")
result = rebuild_template_relations(
conn, template_id, template_name, normalized_name,
field_mapping, dry_run=dry_run
)
results.append(result)
if result['status'] == 'success':
print(f" ✓ 成功: 删除 {result['deleted_count']} 条旧关联, "
f"创建 {result['created_count']} 条新关联 "
f"(输入字段: {result['input_count']}, 输出字段: {result['output_count']})")
else:
print(f"{result['status']}: {result.get('reason', '')}")
# 统计信息
print("\n" + "="*80)
print("处理结果统计")
print("="*80)
success_count = sum(1 for r in results if r['status'] == 'success')
skipped_count = sum(1 for r in results if r['status'] == 'skipped')
total_input = sum(r.get('input_count', 0) for r in results)
total_output = sum(r.get('output_count', 0) for r in results)
total_created = sum(r.get('created_count', 0) for r in results)
print(f"\n成功处理: {success_count} 个模板")
print(f"跳过: {skipped_count} 个模板")
print(f"总输入字段关联: {total_input}")
print(f"总输出字段关联: {total_output}")
print(f"总关联关系: {total_created}")
# 显示详细结果
print("\n详细结果:")
for result in results:
if result['status'] == 'success':
print(f" - {result['template_name']}: "
f"输入字段 {result['input_count']} 个, "
f"输出字段 {result['output_count']} 个")
else:
print(f" - {result['template_name']}: {result['status']} - {result.get('reason', '')}")
print("\n" + "="*80)
if dry_run:
print("\n这是DRY RUN模式,未实际修改数据库。")
print("要实际执行,请运行: python rebuild_template_field_relations.py --execute")
else:
print("\n✓ 关联关系已更新完成")
except Exception as e:
print(f"\n✗ 发生错误: {e}")
import traceback
traceback.print_exc()
if not dry_run:
conn.rollback()
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
import sys
dry_run = '--execute' not in sys.argv
if not dry_run:
print("\n⚠ 警告: 这将修改数据库!")
response = input("确认要继续吗? (yes/no): ")
if response.lower() != 'yes':
print("操作已取消")
sys.exit(0)
main(dry_run=dry_run)

@@ -1,10 +1,12 @@
 flask==3.0.0
 flask-cors==4.0.0
 pymysql==1.1.2
+cryptography>=41.0.0
 python-dotenv==1.0.0
 requests==2.31.0
 flasgger==0.9.7.1
 python-docx==1.1.0
 minio==7.2.3
 openpyxl==3.1.2
+json-repair


@@ -0,0 +1,405 @@
"""
Re-scan template placeholders and update the database
1. Scan all local template files, including newly converted .docx files
2. Extract all placeholders
3. Check the template records in the database
4. Update the database if anything changed
"""
import os
import pymysql
from pathlib import Path
from typing import Dict, List, Set, Tuple
from dotenv import load_dotenv
import re
from docx import Document
# 加载环境变量
load_dotenv()
# 数据库配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def generate_id():
"""Generate an ID from the current microsecond timestamp (two calls within the same microsecond collide)"""
import time
return int(time.time() * 1000000)
def scan_local_templates(base_dir: Path) -> Dict[str, Path]:
"""扫描本地模板文件"""
templates = {}
if not base_dir.exists():
return templates
for file_path in base_dir.rglob('*'):
if file_path.is_file():
# document files only: .docx preferred; .doc and .wps are included for inspection
if file_path.suffix.lower() in ['.doc', '.docx', '.wps']:
relative_path = file_path.relative_to(PROJECT_ROOT)
relative_path_str = str(relative_path).replace('\\', '/')
templates[relative_path_str] = file_path
return templates
def get_actual_tenant_id(conn) -> int:
"""获取数据库中的实际tenant_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("SELECT DISTINCT tenant_id FROM f_polic_file_config LIMIT 1")
result = cursor.fetchone()
if result:
return result['tenant_id']
return 1 # 默认值
finally:
cursor.close()
def get_db_templates(conn, tenant_id: int) -> Dict[str, Dict]:
"""从数据库获取所有模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, name, file_path, state, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (tenant_id,))
templates = cursor.fetchall()
result = {}
for template in templates:
file_path = template['file_path']
if file_path:
result[file_path] = {
'id': template['id'],
'name': template['name'],
'file_path': file_path,
'state': template['state'],
'parent_id': template['parent_id']
}
return result
finally:
cursor.close()
def extract_placeholders_from_docx(file_path: Path) -> Tuple[Set[str], bool]:
"""
Extract all placeholders from a .docx file
Returns:
(placeholder set, read success flag)
"""
placeholders = set()
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')
success = False
try:
doc = Document(file_path)
success = True
# 从段落中提取占位符
for paragraph in doc.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
# 从表格中提取占位符
for table in doc.tables:
try:
for row in table.rows:
for cell in row.cells:
for paragraph in cell.paragraphs:
text = paragraph.text
matches = placeholder_pattern.findall(text)
for match in matches:
field_code = match.strip()
if field_code:
placeholders.add(field_code)
except Exception as e:
# 某些表格结构可能导致错误,跳过
continue
except Exception as e:
# 文件读取失败(可能是.doc格式或其他问题
return placeholders, False
return placeholders, success
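The pattern r'\{\{([^}]+)\}\}' captures everything between double braces up to the first closing brace; inner whitespace is tolerated because each match is .strip()ped. A quick check on hypothetical sample text:

```python
import re

# Same pattern as extract_placeholders_from_docx.
placeholder_pattern = re.compile(r'\{\{([^}]+)\}\}')

text = '姓名:{{ target_name }} 单位:{{target_organization}}'
found = {m.strip() for m in placeholder_pattern.findall(text)}
# found == {'target_name', 'target_organization'}
```

Two caveats: a placeholder whose body itself contains '}' will not match, and the extraction reads paragraph.text (which python-docx assembles from the underlying runs), so placeholders that Word has split across runs are still matched.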
def scan_all_templates_placeholders(local_templates: Dict[str, Path]) -> Dict[str, Tuple[Set[str], bool, str]]:
"""
Scan the placeholders of every template
Returns:
dict keyed by relative path; value is (placeholder set, read success flag, file extension)
"""
results = {}
for rel_path, file_path in local_templates.items():
file_ext = file_path.suffix.lower()
placeholders, success = extract_placeholders_from_docx(file_path)
results[rel_path] = (placeholders, success, file_ext)
return results
def update_or_create_template(conn, tenant_id: int, rel_path: str, file_path: Path, db_templates: Dict[str, Dict]):
"""更新或创建模板记录"""
cursor = conn.cursor()
try:
# 检查是否已存在
if rel_path in db_templates:
# 已存在,检查是否需要更新
template_id = db_templates[rel_path]['id']
# 这里可以添加更新逻辑,比如更新名称等
return template_id, 'exists'
else:
# 不存在,创建新记录
template_id = generate_id()
file_name = file_path.stem # 不含扩展名的文件名
cursor.execute("""
INSERT INTO f_polic_file_config
(id, tenant_id, parent_id, name, input_data, file_path, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (
template_id,
tenant_id,
None, # parent_id
file_name,
'{}', # input_data
rel_path,
CREATED_BY,
UPDATED_BY
))
conn.commit()
return template_id, 'created'
except Exception as e:
conn.rollback()
raise e
finally:
cursor.close()
def main():
"""主函数"""
print_section("重新扫描模板占位符并更新数据库")
# 1. 扫描本地模板
print_section("1. 扫描本地模板文件")
local_templates = scan_local_templates(TEMPLATES_DIR)
print_result(True, f"找到 {len(local_templates)} 个本地模板文件")
# 统计文件类型
file_types = {}
for file_path in local_templates.values():
ext = file_path.suffix.lower()
file_types[ext] = file_types.get(ext, 0) + 1
print("\n文件类型统计:")
for ext, count in sorted(file_types.items()):
print(f" {ext}: {count}")
if not local_templates:
print_result(False, "未找到本地模板文件")
return
# 2. 连接数据库
print_section("2. 连接数据库")
try:
conn = pymysql.connect(**DB_CONFIG)
print_result(True, "数据库连接成功")
except Exception as e:
print_result(False, f"数据库连接失败: {str(e)}")
return
try:
# 3. 获取实际的tenant_id
print_section("3. 获取实际的tenant_id")
tenant_id = get_actual_tenant_id(conn)
print_result(True, f"实际tenant_id: {tenant_id}")
# 4. 获取数据库中的模板
print_section("4. 获取数据库中的模板配置")
db_templates = get_db_templates(conn, tenant_id)
print_result(True, f"找到 {len(db_templates)} 条数据库模板记录(有file_path的)")
# 5. 扫描所有模板的占位符
print_section("5. 扫描所有模板的占位符")
print(" 正在扫描,请稍候...")
template_placeholders = scan_all_templates_placeholders(local_templates)
# 统计结果
all_placeholders = set()
templates_with_placeholders = 0
templates_without_placeholders = 0
templates_read_success = 0
templates_read_failed = 0
doc_files = []
docx_files = []
for rel_path, (placeholders, success, file_ext) in template_placeholders.items():
all_placeholders.update(placeholders)
if success:
templates_read_success += 1
if placeholders:
templates_with_placeholders += 1
else:
templates_without_placeholders += 1
else:
templates_read_failed += 1
# failed .doc files are counted once by the elif branch below; appending
# here as well would double-count them
if file_ext == '.docx':
docx_files.append(rel_path)
elif file_ext == '.doc':
doc_files.append(rel_path)
print(f"\n扫描结果统计:")
print(f" - 成功读取: {templates_read_success}")
print(f" - 读取失败: {templates_read_failed}")
print(f" - 有占位符: {templates_with_placeholders}")
print(f" - 无占位符: {templates_without_placeholders}")
print(f" - 发现的占位符总数: {len(all_placeholders)} 个不同的占位符")
if doc_files:
print(f"\n [注意] 发现 {len(doc_files)} 个.doc文件(可能无法读取):")
for doc_file in doc_files[:5]:
print(f" - {doc_file}")
if len(doc_files) > 5:
print(f" ... 还有 {len(doc_files) - 5} 个")
print(f"\n .docx文件: {len(docx_files)}")
# 6. 显示所有占位符
print_section("6. 所有占位符列表")
if all_placeholders:
for placeholder in sorted(all_placeholders):
print(f" - {placeholder}")
else:
print(" 未发现占位符")
# 7. 检查并更新数据库
print_section("7. 检查并更新数据库")
missing_templates = []
for rel_path in local_templates.keys():
if rel_path not in db_templates:
missing_templates.append(rel_path)
if missing_templates:
print(f" 发现 {len(missing_templates)} 个缺失的模板记录")
created_count = 0
for rel_path in missing_templates:
file_path = local_templates[rel_path]
try:
template_id, status = update_or_create_template(conn, tenant_id, rel_path, file_path, db_templates)
if status == 'created':
print(f" [创建] ID={template_id}, 路径={rel_path}")
created_count += 1
except Exception as e:
print(f" [错误] 创建失败: {rel_path}, 错误: {str(e)}")
if created_count > 0:
print_result(True, f"成功创建 {created_count} 条模板记录")
else:
print_result(True, "所有本地模板都已存在于数据库中")
# 8. 检查文件格式变化(.doc -> .docx
print_section("8. 检查文件格式变化")
# 检查数据库中是否有.doc路径但本地已经是.docx
format_changes = []
for db_path, db_info in db_templates.items():
if db_path.endswith('.doc'):
# check for a matching .docx; strip only the trailing extension, since
# str.replace would also rewrite any ".doc" inside directory names
docx_path = db_path[:-len('.doc')] + '.docx'
if docx_path in local_templates:
format_changes.append((db_path, docx_path, db_info))
if format_changes:
print(f" 发现 {len(format_changes)} 个文件格式变化(.doc -> .docx)")
updated_count = 0
for old_path, new_path, db_info in format_changes:
try:
cursor = conn.cursor()
cursor.execute("""
UPDATE f_polic_file_config
SET file_path = %s
WHERE id = %s
""", (new_path, db_info['id']))
conn.commit()
cursor.close()
print(f" [更新] ID={db_info['id']}, 名称={db_info['name']}")
print(f" 旧路径: {old_path}")
print(f" 新路径: {new_path}")
updated_count += 1
except Exception as e:
print(f" [错误] 更新失败: {str(e)}")
if updated_count > 0:
print_result(True, f"成功更新 {updated_count} 条路径记录")
else:
print_result(True, "未发现文件格式变化")
# 9. 生成详细报告
print_section("9. 详细报告")
# 找出有占位符的模板示例
templates_with_placeholders_list = []
for rel_path, (placeholders, success, file_ext) in template_placeholders.items():
if success and placeholders and file_ext == '.docx':
templates_with_placeholders_list.append((rel_path, placeholders))
if templates_with_placeholders_list:
print(f"\n 有占位符的模板示例(前5个):")
for i, (rel_path, placeholders) in enumerate(templates_with_placeholders_list[:5], 1):
print(f"\n {i}. {Path(rel_path).name}")
print(f" 路径: {rel_path}")
print(f" 占位符数量: {len(placeholders)}")
print(f" 占位符: {sorted(placeholders)}")
finally:
conn.close()
print_result(True, "数据库连接已关闭")
print_section("完成")
if __name__ == "__main__":
main()

restore_database.py (new file, 340 lines)

@@ -0,0 +1,340 @@
"""
Database restore script
Restores the database from a SQL backup file
"""
import os
import sys
import subprocess
import pymysql
from pathlib import Path
from dotenv import load_dotenv
import gzip
# 加载环境变量
load_dotenv()
class DatabaseRestore:
"""数据库恢复类"""
def __init__(self):
"""初始化数据库配置"""
self.db_config = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
def restore_with_mysql(self, backup_file, drop_database=False):
"""
使用mysql命令恢复数据库(推荐方式)
Args:
backup_file: 备份文件路径
drop_database: 是否先删除数据库(危险操作)
Returns:
是否成功
"""
backup_file = Path(backup_file)
if not backup_file.exists():
raise FileNotFoundError(f"备份文件不存在: {backup_file}")
# 如果是压缩文件,先解压
sql_file = backup_file
temp_file = None
if backup_file.suffix == '.gz':
print(f"检测到压缩文件,正在解压...")
temp_file = backup_file.with_suffix('')
with gzip.open(backup_file, 'rb') as f_in:
with open(temp_file, 'wb') as f_out:
f_out.write(f_in.read())
sql_file = temp_file
print(f"解压完成: {sql_file}")
try:
print(f"开始恢复数据库 {self.db_config['database']}...")
print(f"备份文件: {backup_file}")
# 如果指定删除数据库
if drop_database:
print("警告: 将删除现有数据库!")
confirm = input("确认继续? (yes/no): ")
if confirm.lower() != 'yes':
print("已取消恢复操作")
return False
# 删除数据库
self._drop_database()
# 构建mysql命令
cmd = [
'mysql',
f"--host={self.db_config['host']}",
f"--port={self.db_config['port']}",
f"--user={self.db_config['user']}",
f"--password={self.db_config['password']}",
'--default-character-set=utf8mb4',
self.db_config['database']
]
# 执行恢复命令
with open(sql_file, 'r', encoding='utf-8') as f:
result = subprocess.run(
cmd,
stdin=f,
stderr=subprocess.PIPE,
text=True
)
if result.returncode != 0:
# text=True 时 stderr 已是字符串,不能再调用 decode
error_msg = result.stderr if result.stderr else '未知错误'
raise Exception(f"mysql执行失败: {error_msg}")
print("恢复完成!")
return True
except FileNotFoundError:
print("错误: 未找到mysql命令,请确保MySQL客户端已安装并在PATH中")
print("尝试使用Python方式恢复...")
return self.restore_with_python(backup_file, drop_database)
except Exception as e:
print(f"恢复失败: {str(e)}")
raise
finally:
# 清理临时解压文件
if temp_file and temp_file.exists():
temp_file.unlink()
def restore_with_python(self, backup_file, drop_database=False):
"""
使用Python直接连接数据库恢复(备用方式)
Args:
backup_file: 备份文件路径
drop_database: 是否先删除数据库(危险操作)
Returns:
是否成功
"""
backup_file = Path(backup_file)
if not backup_file.exists():
raise FileNotFoundError(f"备份文件不存在: {backup_file}")
# 如果是压缩文件,先解压
sql_file = backup_file
temp_file = None
if backup_file.suffix == '.gz':
print(f"检测到压缩文件,正在解压...")
temp_file = backup_file.with_suffix('')
with gzip.open(backup_file, 'rb') as f_in:
with open(temp_file, 'wb') as f_out:
f_out.write(f_in.read())
sql_file = temp_file
print(f"解压完成: {sql_file}")
try:
print(f"开始使用Python方式恢复数据库 {self.db_config['database']}...")
print(f"备份文件: {backup_file}")
# 如果指定删除数据库
if drop_database:
print("警告: 将删除现有数据库!")
confirm = input("确认继续? (yes/no): ")
if confirm.lower() != 'yes':
print("已取消恢复操作")
return False
# 删除数据库
self._drop_database()
# 连接数据库
connection = pymysql.connect(**self.db_config)
cursor = connection.cursor()
# 读取SQL文件
print("读取SQL文件...")
with open(sql_file, 'r', encoding='utf-8') as f:
sql_content = f.read()
# 分割SQL语句(按分号分割,但要注意字符串中的分号)
print("执行SQL语句...")
statements = self._split_sql_statements(sql_content)
total = len(statements)
print(f"{total} 条SQL语句")
# 执行每条SQL语句
for i, statement in enumerate(statements, 1):
statement = statement.strip()
if not statement or statement.startswith('--'):
continue
try:
cursor.execute(statement)
if i % 100 == 0:
print(f"进度: {i}/{total} ({i*100//total}%)")
except Exception as e:
# 某些错误可以忽略(如表已存在等)
error_msg = str(e).lower()
if 'already exists' in error_msg or 'duplicate' in error_msg:
continue
print(f"警告: 执行SQL语句时出错 (第{i}条): {str(e)}")
print(f"SQL: {statement[:100]}...")
# 提交事务
connection.commit()
cursor.close()
connection.close()
print("恢复完成!")
return True
except Exception as e:
print(f"恢复失败: {str(e)}")
raise
finally:
# 清理临时解压文件
if temp_file and temp_file.exists():
temp_file.unlink()
def _split_sql_statements(self, sql_content):
"""
分割SQL语句(处理字符串中的分号)
Args:
sql_content: SQL内容
Returns:
SQL语句列表
"""
statements = []
current_statement = []
in_string = False
string_char = None
i = 0
while i < len(sql_content):
char = sql_content[i]
# 检测字符串开始/结束
if char in ("'", '"', '`') and (i == 0 or sql_content[i-1] != '\\'):
if not in_string:
in_string = True
string_char = char
elif char == string_char:
in_string = False
string_char = None
current_statement.append(char)
# 如果不在字符串中且遇到分号,分割语句
if not in_string and char == ';':
statement = ''.join(current_statement).strip()
if statement:
statements.append(statement)
current_statement = []
i += 1
# 添加最后一条语句
if current_statement:
statement = ''.join(current_statement).strip()
if statement:
statements.append(statement)
return statements
def _drop_database(self):
"""删除数据库(危险操作)"""
try:
# 连接到MySQL服务器(不指定数据库)
config = self.db_config.copy()
config.pop('database')
connection = pymysql.connect(**config)
cursor = connection.cursor()
cursor.execute(f"DROP DATABASE IF EXISTS `{self.db_config['database']}`")
cursor.execute(f"CREATE DATABASE `{self.db_config['database']}` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci")
connection.commit()
cursor.close()
connection.close()
print(f"数据库 {self.db_config['database']} 已删除并重新创建")
except Exception as e:
raise Exception(f"删除数据库失败: {str(e)}")
def test_connection(self):
"""测试数据库连接"""
try:
connection = pymysql.connect(**self.db_config)
cursor = connection.cursor()
cursor.execute("SELECT VERSION()")
version = cursor.fetchone()[0]
cursor.close()
connection.close()
print(f"数据库连接成功!MySQL版本: {version}")
return True
except Exception as e:
print(f"数据库连接失败: {str(e)}")
return False
def main():
"""主函数"""
import argparse
parser = argparse.ArgumentParser(description='数据库恢复工具')
parser.add_argument('backup_file', help='备份文件路径')
parser.add_argument('--method', choices=['mysql', 'python', 'auto'],
default='auto', help='恢复方法 (默认: auto)')
parser.add_argument('--drop-db', action='store_true',
help='恢复前删除现有数据库(危险操作)')
parser.add_argument('--test', action='store_true',
help='仅测试数据库连接')
args = parser.parse_args()
restore = DatabaseRestore()
# 测试连接
if args.test:
restore.test_connection()
return
# 执行恢复
try:
if args.method == 'mysql':
success = restore.restore_with_mysql(args.backup_file, args.drop_db)
elif args.method == 'python':
success = restore.restore_with_python(args.backup_file, args.drop_db)
else: # auto
try:
success = restore.restore_with_mysql(args.backup_file, args.drop_db)
except Exception:
print("\nmysql方式失败切换到Python方式...")
success = restore.restore_with_python(args.backup_file, args.drop_db)
if success:
print("\n恢复成功!")
else:
print("\n恢复失败!")
sys.exit(1)
except Exception as e:
print(f"\n恢复失败: {str(e)}")
sys.exit(1)
if __name__ == '__main__':
main()
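Both restore paths above decompress a `.gz` backup with `f_in.read()`, which pulls the entire dump into memory before writing it out; for multi-gigabyte backups a streamed copy is gentler. A minimal standard-library sketch (illustrative, not wired into the class; `decompress_gz` is a hypothetical helper):

```python
import gzip
import shutil
from pathlib import Path

def decompress_gz(src, dest, chunk_size=1 << 20):
    """Stream-decompress src (.gz) to dest without loading it all at once."""
    src_path, dest_path = Path(src), Path(dest)
    with gzip.open(src_path, 'rb') as f_in, open(dest_path, 'wb') as f_out:
        # copyfileobj copies in chunk_size blocks, keeping memory flat
        shutil.copyfileobj(f_in, f_out, length=chunk_size)
    return dest_path
```

Dropping this in place of the `f_in.read()` block would keep peak memory at roughly one chunk regardless of backup size.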

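The quote handling in `_split_sql_statements` can be exercised in isolation; the sketch below mirrors its logic as a standalone function, for illustration only (`split_sql` is not part of the script above):

```python
def split_sql(sql):
    """Split SQL text on ';' while ignoring semicolons inside ', " or ` strings."""
    statements, current = [], []
    in_string, quote = False, None
    for i, ch in enumerate(sql):
        # toggle string state on an unescaped quote character
        if ch in ("'", '"', '`') and (i == 0 or sql[i - 1] != '\\'):
            if not in_string:
                in_string, quote = True, ch
            elif ch == quote:
                in_string, quote = False, None
        current.append(ch)
        # a semicolon outside any string terminates a statement
        if not in_string and ch == ';':
            stmt = ''.join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
    tail = ''.join(current).strip()
    if tail:
        statements.append(tail)
    return statements

print(split_sql("INSERT INTO t VALUES ('a;b'); SELECT 1;"))
```

A semicolon inside the quoted `'a;b'` does not split the statement, which is exactly the case a naive `sql.split(';')` would get wrong.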

@@ -0,0 +1,122 @@
"""
回滚错误的更新:恢复被错误修改的字段
"""
import os
import pymysql
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
UPDATED_BY = 655162080928945152
# 需要恢复的字段映射字段ID -> 正确的field_code
ROLLBACK_MAPPING = {
# 这些字段被错误地从英文改成了中文,需要恢复
1764656917410273: 'target_issue_description',
1764656918032031: 'filler_name',
1764656917418979: 'department_opinion',
1764836032906561: 'appointment_location',
1764836032488198: 'appointment_time',
1764836033052889: 'approval_time',
1764836032655678: 'handler_name',
1764836033342084: 'handling_department',
1764836033240593: 'investigation_unit_name',
1764836033018470: 'investigation_location',
1764836033274278: 'investigation_team_code',
1764836033094781: 'investigation_team_member_names',
1764836033176386: 'investigation_team_leader_name',
1764836033500799: 'commission_name',
1764656917384058: 'clue_info',
1764656917861268: 'clue_source',
1764836032538308: 'target_address',
1764836033565636: 'target_health_status',
1764836033332970: 'target_other_situation',
1764656917299164: 'target_date_of_birth',
1764836033269146: 'target_date_of_birth_full',
1765151880445876: 'target_organization',
1764656917367205: 'target_organization_and_position',
1764836033405778: 'target_family_situation',
1764836033162748: 'target_work_basic_info',
1764656917996367: 'target_basic_info_clue',
1764836032997850: 'target_age',
1764656917561689: 'target_gender',
1764836032855869: 'target_personality',
1764836032893680: 'target_registered_address',
1764836033603501: 'target_tolerance',
1764656917185956: 'target_political_status',
1764836033786057: 'target_attitude',
1764836033587951: 'target_previous_investigation',
1764836032951705: 'target_ethnicity',
1764836033280024: 'target_other_issues_possibility',
1764836033458872: 'target_issue_severity',
1764836032929811: 'target_social_relations',
1764836033618877: 'target_negative_events',
1764836032926994: 'target_place_of_origin',
1765151880304552: 'target_position',
1764656917802442: 'target_professional_rank',
1764836032817243: 'target_contact',
1764836032902356: 'target_id_number',
1764836032913357: 'target_id_number',
1764656917073644: 'target_name',
1764836033571266: 'target_problem_description',
1764836032827460: 'report_card_request_time',
1764836032694865: 'notification_location',
1764836032909732: 'notification_time',
1764836033451248: 'risk_level',
}
def rollback():
"""回滚错误的更新"""
conn = pymysql.connect(**DB_CONFIG)
cursor = conn.cursor(pymysql.cursors.DictCursor)
print("="*80)
print("回滚错误的字段更新")
print("="*80)
print(f"\n需要恢复 {len(ROLLBACK_MAPPING)} 个字段\n")
# 先查询当前状态
for field_id, correct_code in ROLLBACK_MAPPING.items():
cursor.execute("""
SELECT id, name, filed_code
FROM f_polic_field
WHERE id = %s AND tenant_id = %s
""", (field_id, TENANT_ID))
field = cursor.fetchone()
if field:
print(f" ID: {field_id}")
print(f" 名称: {field['name']}")
print(f" 当前field_code: {field['filed_code']}")
print(f" 恢复为: {correct_code}")
print()
# 执行回滚
print("开始执行回滚...\n")
for field_id, correct_code in ROLLBACK_MAPPING.items():
cursor.execute("""
UPDATE f_polic_field
SET filed_code = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
""", (correct_code, UPDATED_BY, field_id, TENANT_ID))
print(f" ✓ 恢复字段 ID {field_id}: {correct_code}")
conn.commit()
print("\n✓ 回滚完成")
cursor.close()
conn.close()
if __name__ == '__main__':
rollback()
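The rollback loop above issues one `UPDATE` per field. With pymysql, `cursor.executemany` can batch them into a single call; a sketch of the parameter preparation (`build_rollback_params` is a hypothetical helper, and the commented line shows the intended use with an open connection):

```python
def build_rollback_params(mapping, tenant_id, updated_by):
    """Turn {field_id: correct_code} into parameter tuples ordered to match:
    SET filed_code = %s, updated_by = %s WHERE id = %s AND tenant_id = %s"""
    return [(code, updated_by, fid, tenant_id) for fid, code in mapping.items()]

ROLLBACK_SQL = (
    "UPDATE f_polic_field "
    "SET filed_code = %s, updated_time = NOW(), updated_by = %s "
    "WHERE id = %s AND tenant_id = %s"
)

# Usage with an open pymysql connection:
# cursor.executemany(ROLLBACK_SQL, build_rollback_params(ROLLBACK_MAPPING, TENANT_ID, UPDATED_BY))
# conn.commit()
```

Note the `filed_code` spelling matches the actual column name used throughout the schema.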

services/ai_logger.py Normal file

@@ -0,0 +1,181 @@
"""
AI对话日志记录模块
用于记录大模型对话的输入和输出信息,方便排查问题
"""
import os
import json
import time
from datetime import datetime
from pathlib import Path
from typing import Dict, Optional, Any
from threading import Lock
class AILogger:
"""AI对话日志记录器"""
def __init__(self, log_dir: Optional[str] = None):
"""
初始化日志记录器
Args:
log_dir: 日志文件保存目录,默认为项目根目录下的 logs/ai_conversations 目录
"""
if log_dir is None:
# 默认日志目录:项目根目录下的 logs/ai_conversations
project_root = Path(__file__).parent.parent
log_dir = project_root / "logs" / "ai_conversations"
self.log_dir = Path(log_dir)
self.log_dir.mkdir(parents=True, exist_ok=True)
# 线程锁,确保日志写入的线程安全
self._lock = Lock()
# 是否启用日志记录(可通过环境变量控制)
self.enabled = os.getenv('AI_LOG_ENABLED', 'true').lower() == 'true'
print(f"[AI日志] 日志记录器初始化完成,日志目录: {self.log_dir}")
print(f"[AI日志] 日志记录状态: {'启用' if self.enabled else '禁用'}")
def log_conversation(
self,
prompt: str,
api_request: Dict[str, Any],
api_response: Optional[Dict[str, Any]] = None,
extracted_data: Optional[Dict[str, Any]] = None,
error: Optional[str] = None,
session_id: Optional[str] = None
) -> str:
"""
记录一次完整的AI对话
Args:
prompt: 输入提示词
api_request: API请求参数
api_response: API响应内容(完整响应)
extracted_data: 提取后的结构化数据
error: 错误信息(如果有)
session_id: 会话ID(可选,用于关联多次对话)
Returns:
日志文件路径
"""
if not self.enabled:
return ""
try:
with self._lock:
# 生成时间戳和会话ID
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S_%f")[:-3] # 精确到毫秒
if session_id is None:
session_id = f"session_{int(time.time() * 1000)}"
# 创建日志记录
log_entry = {
"timestamp": datetime.now().isoformat(),
"session_id": session_id,
"prompt": prompt,
"api_request": {
"endpoint": api_request.get("endpoint", "unknown"),
"model": api_request.get("model", "unknown"),
"messages": api_request.get("messages", []),
"temperature": api_request.get("temperature"),
"max_tokens": api_request.get("max_tokens"),
"enable_thinking": api_request.get("enable_thinking", False),
},
"api_response": api_response,
"extracted_data": extracted_data,
"error": error,
"success": error is None
}
# 保存到文件(按日期组织)
date_str = datetime.now().strftime("%Y%m%d")
log_file = self.log_dir / f"conversation_{date_str}_{timestamp}.json"
with open(log_file, 'w', encoding='utf-8') as f:
json.dump(log_entry, f, ensure_ascii=False, indent=2)
print(f"[AI日志] 对话日志已保存: {log_file.name}")
return str(log_file)
except Exception as e:
print(f"[AI日志] 保存日志失败: {e}")
return ""
def log_request_only(
self,
prompt: str,
api_request: Dict[str, Any],
session_id: Optional[str] = None
) -> str:
"""
仅记录请求信息(在发送请求前调用)
Args:
prompt: 输入提示词
api_request: API请求参数
session_id: 会话ID
Returns:
日志文件路径
"""
return self.log_conversation(
prompt=prompt,
api_request=api_request,
session_id=session_id
)
def get_recent_logs(self, limit: int = 10) -> list:
"""
获取最近的日志文件列表
Args:
limit: 返回的日志文件数量
Returns:
日志文件路径列表(按时间倒序)
"""
try:
log_files = sorted(
self.log_dir.glob("conversation_*.json"),
key=lambda x: x.stat().st_mtime,
reverse=True
)
return [str(f) for f in log_files[:limit]]
except Exception as e:
print(f"[AI日志] 获取日志列表失败: {e}")
return []
def read_log(self, log_file: str) -> Optional[Dict]:
"""
读取指定的日志文件
Args:
log_file: 日志文件路径
Returns:
日志内容字典,如果读取失败返回None
"""
try:
log_path = Path(log_file)
if not log_path.is_absolute():
log_path = self.log_dir / log_file
with open(log_path, 'r', encoding='utf-8') as f:
return json.load(f)
except Exception as e:
print(f"[AI日志] 读取日志文件失败: {e}")
return None
# 全局日志记录器实例
_ai_logger: Optional[AILogger] = None
def get_ai_logger() -> AILogger:
"""获取全局AI日志记录器实例"""
global _ai_logger
if _ai_logger is None:
_ai_logger = AILogger()
return _ai_logger
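The write path of `AILogger` (a lock guarding each write, one timestamped JSON file per conversation) can be illustrated with a stripped-down standalone version; this is a sketch for clarity, not the module itself:

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path
from threading import Lock

class MiniLogger:
    """Stripped-down illustration of AILogger's write path."""

    def __init__(self, log_dir):
        self.log_dir = Path(log_dir)
        self.log_dir.mkdir(parents=True, exist_ok=True)
        self._lock = Lock()  # serialize writes across threads

    def log(self, prompt, error=None):
        with self._lock:
            entry = {
                "timestamp": datetime.now().isoformat(),
                "prompt": prompt,
                "error": error,
                "success": error is None,
            }
            # one JSON file per conversation, microsecond-resolution name
            name = f"conversation_{datetime.now():%Y%m%d_%H%M%S_%f}.json"
            path = self.log_dir / name
            path.write_text(json.dumps(entry, ensure_ascii=False, indent=2),
                            encoding='utf-8')
            return str(path)

logger = MiniLogger(tempfile.mkdtemp())
saved = logger.log("hello")
```

Writing each conversation to its own file keeps the lock hold time short and makes individual exchanges easy to inspect or delete.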

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -12,15 +12,27 @@ class FieldService:
     """字段服务类"""

     def __init__(self):
+        # 从环境变量读取数据库配置,不设置默认值,确保必须通过.env文件配置
+        db_host = os.getenv('DB_HOST')
+        db_port = os.getenv('DB_PORT')
+        db_user = os.getenv('DB_USER')
+        db_password = os.getenv('DB_PASSWORD')
+        db_name = os.getenv('DB_NAME')
+        if not all([db_host, db_port, db_user, db_password, db_name]):
+            raise ValueError(
+                "数据库配置不完整,请在.env文件中配置以下环境变量:\n"
+                "DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME"
+            )
         self.db_config = {
-            'host': os.getenv('DB_HOST', '152.136.177.240'),
-            'port': int(os.getenv('DB_PORT', 5012)),
-            'user': os.getenv('DB_USER', 'finyx'),
-            'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
-            'database': os.getenv('DB_NAME', 'finyx'),
+            'host': db_host,
+            'port': int(db_port),
+            'user': db_user,
+            'password': db_password,
+            'database': db_name,
             'charset': 'utf8mb4'
         }
-        self.tenant_id = 615873064429507639

         # 加载提示词配置文件
         self.prompt_config = self._load_prompt_config()
@@ -137,17 +149,16 @@ class FieldService:
         cursor = conn.cursor(pymysql.cursors.DictCursor)

         try:
-            # 根据字段编码查询字段信息
+            # 根据字段编码查询字段信息(不限制tenant_id)
             placeholders = ','.join(['%s'] * len(field_codes))
             sql = f"""
                 SELECT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
-                WHERE f.tenant_id = %s
-                AND f.filed_code IN ({placeholders})
+                WHERE f.filed_code IN ({placeholders})
                 AND f.field_type = 2
                 ORDER BY f.id
             """
-            cursor.execute(sql, [self.tenant_id] + field_codes)
+            cursor.execute(sql, field_codes)
             fields = cursor.fetchall()

             # 转换为字典列表
@@ -183,12 +194,11 @@ class FieldService:
             sql = """
                 SELECT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
-                WHERE f.tenant_id = %s
-                AND f.filed_code = %s
+                WHERE f.filed_code = %s
                 AND f.field_type = 1
                 LIMIT 1
             """
-            cursor.execute(sql, (self.tenant_id, field_code))
+            cursor.execute(sql, (field_code,))
             field = cursor.fetchone()

             if field:
@@ -224,12 +234,11 @@ class FieldService:
             sql_input = """
                 SELECT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
-                WHERE f.tenant_id = %s
-                AND f.field_type = 1
+                WHERE f.field_type = 1
                 AND (f.filed_code = 'clue_info' OR f.filed_code = 'target_basic_info_clue')
                 ORDER BY f.id
             """
-            cursor.execute(sql_input, (self.tenant_id,))
+            cursor.execute(sql_input)
             input_fields = cursor.fetchall()

             # 获取输出字段(field_type=2)
@@ -239,12 +248,11 @@ class FieldService:
                 FROM f_polic_field f
                 INNER JOIN f_polic_file_field ff ON f.id = ff.filed_id
                 INNER JOIN f_polic_file_config fc ON ff.file_id = fc.id
-                WHERE f.tenant_id = %s
-                AND f.field_type = 2
+                WHERE f.field_type = 2
                 AND fc.state = 1
                 ORDER BY f.id
             """
-            cursor.execute(sql_output, (self.tenant_id,))
+            cursor.execute(sql_output)
             all_output_fields = cursor.fetchall()

             # 根据business_type过滤输出字段
@@ -252,10 +260,9 @@ class FieldService:
             sql_file_configs = """
                 SELECT id, name, input_data
                 FROM f_polic_file_config
-                WHERE tenant_id = %s
-                AND state = 1
+                WHERE state = 1
             """
-            cursor.execute(sql_file_configs, (self.tenant_id,))
+            cursor.execute(sql_file_configs)
             file_configs = cursor.fetchall()

             # 找到匹配business_type的文件配置ID列表
@@ -277,12 +284,11 @@ class FieldService:
                 SELECT DISTINCT f.id, f.name, f.filed_code as field_code, f.field_type
                 FROM f_polic_field f
                 INNER JOIN f_polic_file_field ff ON f.id = ff.filed_id
-                WHERE f.tenant_id = %s
-                AND f.field_type = 2
+                WHERE f.field_type = 2
                 AND ff.file_id IN ({placeholders})
                 ORDER BY f.id
             """
-            cursor.execute(sql_filtered, [self.tenant_id] + matching_file_ids)
+            cursor.execute(sql_filtered, matching_file_ids)
             output_fields = cursor.fetchall()

             return {


@@ -326,11 +326,18 @@
         </div>
         <div class="form-group">
-            <label>文件列表</label>
+            <label>文件列表(文档模板类型)</label>
+            <div style="margin-bottom: 10px;">
+                <button class="btn btn-secondary" onclick="loadAvailableFiles()" style="margin-right: 10px;">📋 加载全部可用模板</button>
+                <button class="btn btn-secondary" onclick="addFileItem()" style="margin-right: 10px;">+ 手动添加文件</button>
+                <button class="btn btn-danger" onclick="clearAllFiles()">🗑️ 清空列表</button>
+            </div>
+            <div style="margin-bottom: 10px; padding: 10px; background: #f0f0f0; border-radius: 4px; font-size: 13px; color: #666;">
+                💡 提示:点击"加载全部可用模板"可以加载所有可用的文档模板类型,方便测试不同模板的生成效果
+            </div>
             <div id="fileListContainer">
                 <!-- 动态生成的文件列表 -->
             </div>
-            <button class="btn btn-secondary" onclick="addFileItem()">+ 添加文件</button>
         </div>
     </div>
@@ -381,9 +388,9 @@
         // ==================== 解析接口相关 ====================

         function initExtractTab() {
-            // 初始化默认输入字段(虚拟测试数据)
-            addInputField('clue_info', '被举报用户名称是张三年龄44岁某公司总经理男性1980年5月出生本科文化程度中共党员正处级。主要问题线索违反国家计划生育有关政策规定于2010年10月生育二胎。线索来源群众举报。');
-            addInputField('target_basic_info_clue', '被核查人员工作基本情况张三1980年5月生本科文化中共党员现为某公司总经理正处级。');
+            // 初始化默认输入字段
+            addInputField('clue_info', '张三多次在私下聚会、网络群组中发表抹黑党中央决策部署的言论传播歪曲党的理论和路线方针政策的错误观点频繁接受管理服务对象安排的高档宴请、私人会所聚餐以及高尔夫球、高端足浴等娱乐活动相关费用均由对方全额承担在干部选拔任用、岗位调整工作中利用职务便利收受他人财物利用职权为其亲属经营的公司谋取不正当利益帮助该公司违规承接本单位及关联单位工程项目3个合同总额超200万元从中收受亲属给予的"感谢费"15万元其本人沉迷赌博活动每周至少参与1次大额赌资赌博单次赌资超1万元累计赌资达数十万元。');
+            addInputField('target_basic_info_clue', '张三汉族1990年9月出生云南普洱人研究生学历2005年8月参加工作2006年10月加入中国共产党。2004年8月至2005年2月在云南省农业机械公司工作2005年2月至2012年2月历任云南省农业机械公司办公室副主任、主任、团委书记2012年2月至2018年3月任云南省农业机械公司支部书记、厂长2018年3月至2020年3月任云南省农业机械公司总经理助理、销售部部长2020年3月至2022年3月任云南省农业机械公司总经理助理2022年3月至2022年7月任云南省农业机械公司大理分公司副经理2022年7月至2023年12月任云南省农业机械公司西双版纳分公司经理2023年12月至今任云南省农业机械公司党支部书记、经理。');

             // 初始化默认输出字段(包含完整的字段列表)
             addOutputField('target_name');
@@ -548,26 +555,153 @@
         // ==================== 文档生成接口相关 ====================

-        function initGenerateTab() {
-            // 初始化默认字段(完整的虚拟测试数据)
+        async function loadAvailableFiles() {
+            try {
+                const response = await fetch('/api/file-configs');
+                const result = await response.json();
+                if (result.isSuccess && result.data && result.data.fileConfigs) {
+                    const container = document.getElementById('fileListContainer');
+                    container.innerHTML = ''; // 清空现有列表
+                    // 只添加有filePath的文件(有模板文件的)
+                    const filesWithPath = result.data.fileConfigs.filter(f => f.filePath);
+                    if (filesWithPath.length === 0) {
+                        alert('没有找到可用的文件配置(需要有filePath)');
+                        return;
+                    }
+                    // 加载所有可用文件
+                    filesWithPath.forEach(file => {
+                        addFileItem(file.fileId, file.fileName);
+                    });
+                    alert(`已加载全部 ${filesWithPath.length} 个可用文件模板`);
+                } else {
+                    alert('获取文件列表失败: ' + (result.errorMsg || '未知错误'));
+                }
+            } catch (error) {
+                alert('加载文件列表失败: ' + error.message);
+            }
+        }
+
+        async function initGenerateTab() {
+            // 初始化所有字段(完整的虚拟测试数据)
+            // 基本信息字段
             addGenerateField('target_name', '张三');
             addGenerateField('target_gender', '男');
-            addGenerateField('target_age', '44');
-            addGenerateField('target_date_of_birth', '198005');
-            addGenerateField('target_organization_and_position', '某公司总经理');
-            addGenerateField('target_organization', '某公司');
-            addGenerateField('target_position', '总经理');
-            addGenerateField('target_education_level', '本科');
+            addGenerateField('target_age', '34');
+            addGenerateField('target_date_of_birth', '199009');
+            addGenerateField('target_date_of_birth_full', '1990年9月');
+            addGenerateField('target_id_number', '530123199009123456');
+            addGenerateField('target_ethnicity', '汉族');
+            addGenerateField('target_place_of_origin', '云南普洱');
+            addGenerateField('target_address', '云南省昆明市五华区某某街道某某小区1栋1单元101室');
+            addGenerateField('target_registered_address', '云南省昆明市五华区某某街道某某小区1栋1单元101室');
+            addGenerateField('target_contact', '13800138000');
+
+            // 组织和工作信息
+            addGenerateField('target_organization_and_position', '云南省农业机械公司党支部书记、经理');
+            addGenerateField('target_organization', '云南省农业机械公司');
+            addGenerateField('target_position', '党支部书记、经理');
+            addGenerateField('target_education_level', '研究生');
+            addGenerateField('target_education', '研究生');
             addGenerateField('target_political_status', '中共党员');
-            addGenerateField('target_professional_rank', '正处级');
+            addGenerateField('target_professional_rank', '高级工程师');
+            addGenerateField('target_occupation', '企业管理人员');
+            addGenerateField('target_work_basic_info', '2005年8月参加工作现任云南省农业机械公司党支部书记、经理');
+            addGenerateField('target_work_history', '2004年8月至2005年2月在云南省农业机械公司工作2005年2月至2012年2月历任云南省农业机械公司办公室副主任、主任、团委书记2012年2月至2018年3月任云南省农业机械公司支部书记、厂长2018年3月至2020年3月任云南省农业机械公司总经理助理、销售部部长2020年3月至2022年3月任云南省农业机械公司总经理助理2022年3月至2022年7月任云南省农业机械公司大理分公司副经理2022年7月至2023年12月任云南省农业机械公司西双版纳分公司经理2023年12月至今任云南省农业机械公司党支部书记、经理。');
+            addGenerateField('target_basic_info', '张三汉族1990年9月出生云南普洱人研究生学历中共党员现任云南省农业机械公司党支部书记、经理。');
+
+            // 线索和问题信息
+            addGenerateField('clue_info', '张三多次在私下聚会、网络群组中发表抹黑党中央决策部署的言论传播歪曲党的理论和路线方针政策的错误观点频繁接受管理服务对象安排的高档宴请、私人会所聚餐以及高尔夫球、高端足浴等娱乐活动相关费用均由对方全额承担在干部选拔任用、岗位调整工作中利用职务便利收受他人财物利用职权为其亲属经营的公司谋取不正当利益帮助该公司违规承接本单位及关联单位工程项目3个合同总额超200万元从中收受亲属给予的"感谢费"15万元其本人沉迷赌博活动每周至少参与1次大额赌资赌博单次赌资超1万元累计赌资达数十万元。');
+            addGenerateField('target_basic_info_clue', '张三汉族1990年9月出生云南普洱人研究生学历2005年8月参加工作2006年10月加入中国共产党。2004年8月至2005年2月在云南省农业机械公司工作2005年2月至2012年2月历任云南省农业机械公司办公室副主任、主任、团委书记2012年2月至2018年3月任云南省农业机械公司支部书记、厂长2018年3月至2020年3月任云南省农业机械公司总经理助理、销售部部长2020年3月至2022年3月任云南省农业机械公司总经理助理2022年3月至2022年7月任云南省农业机械公司大理分公司副经理2022年7月至2023年12月任云南省农业机械公司西双版纳分公司经理2023年12月至今任云南省农业机械公司党支部书记、经理。');
             addGenerateField('clue_source', '群众举报');
-            addGenerateField('target_issue_description', '违反国家计划生育有关政策规定于2010年10月生育二胎。');
-            addGenerateField('department_opinion', '建议进行初步核实');
+            addGenerateField('target_issue_description', '张三多次在私下聚会、网络群组中发表抹黑党中央决策部署的言论传播歪曲党的理论和路线方针政策的错误观点频繁接受管理服务对象安排的高档宴请、私人会所聚餐以及高尔夫球、高端足浴等娱乐活动相关费用均由对方全额承担在干部选拔任用、岗位调整工作中利用职务便利收受他人财物利用职权为其亲属经营的公司谋取不正当利益帮助该公司违规承接本单位及关联单位工程项目3个合同总额超200万元从中收受亲属给予的"感谢费"15万元其本人沉迷赌博活动每周至少参与1次大额赌资赌博单次赌资超1万元累计赌资达数十万元。');
+            addGenerateField('target_problem_description', '违反政治纪律、组织纪律、廉洁纪律,涉嫌违纪违法');
+            addGenerateField('target_issue_severity', '严重');
+            addGenerateField('target_issue_severity_level', '严重');
+            addGenerateField('target_other_issues_possibility', '较大');
+
+            // 个人情况评估
+            addGenerateField('target_family_situation', '家庭关系和谐稳定');
+            addGenerateField('target_social_relations', '社会交往较多,人际关系基本正常');
+            addGenerateField('target_health_status', '良好');
+            addGenerateField('target_personality', '开朗');
+            addGenerateField('target_tolerance', '较强');
+            addGenerateField('target_previous_investigation', '无');
+            addGenerateField('target_negative_events', '无');
+            addGenerateField('target_other_situation', '无');
+
+            // 谈话和调查相关
+            addGenerateField('target_attitude', '配合调查');
+            addGenerateField('target_confession_level', '部分承认');
+            addGenerateField('target_behavior_during_interview', '情绪稳定,配合调查');
+            addGenerateField('target_behavior_after_relief', '情绪有所缓解');
+            addGenerateField('target_mental_burden_level', '中等');
+            addGenerateField('target_risk_level', '中');
+            addGenerateField('risk_level', '中');
+            addGenerateField('pre_interview_risk_assessment_result', '风险等级:中,已制定安全预案');
+
+            // 调查组织和人员
+            addGenerateField('investigation_unit_name', '纪检监察室');
+            addGenerateField('investigation_team_code', 'JC2024001');
+            addGenerateField('investigation_team_leader_name', '赵六');
+            addGenerateField('investigation_team_member_names', '赵六、钱七、孙八');
+            addGenerateField('investigation_location', '纪检监察室谈话室');
+            addGenerateField('handler_name', '王五');
+            addGenerateField('handling_department', '纪检监察室');
+            addGenerateField('commission_name', '中共某某市纪律检查委员会');
+
+            // 谈话相关
+            addGenerateField('interview_location', '纪检监察室谈话室');
+            addGenerateField('proposed_interview_location', '纪检监察室谈话室');
+            addGenerateField('notification_location', '纪检监察室');
+            addGenerateField('appointment_location', '纪检监察室谈话室');
+            addGenerateField('interview_time', '2024年12月10日14:00');
+            addGenerateField('proposed_interview_time', '2024年12月10日14:00');
+            addGenerateField('notification_time', '2024年12月9日');
+            addGenerateField('appointment_time', '2024年12月10日14:00');
+            addGenerateField('interview_reason', '就相关问题进行核实了解');
+            addGenerateField('interview_count', '1');
+            addGenerateField('interviewer', '赵六');
+            addGenerateField('recorder', '钱七');
+            addGenerateField('interview_personnel', '赵六、钱七');
+            addGenerateField('interview_personnel_leader', '赵六');
+            addGenerateField('interview_personnel_safety_officer', '孙八');
+            addGenerateField('backup_personnel', '周九');
+
+            // 审批和意见
+            addGenerateField('approval_time', '2024年12月8日');
+            addGenerateField('report_card_request_time', '2024年12月8日');
+            addGenerateField('department_opinion', '经初步核实,建议立案调查');
+            addGenerateField('assessment_opinion', '建议进行谈话核实');
             addGenerateField('filler_name', '李四');

-            // 初始化默认文件(包含多个模板用于测试)
-            addFileItem(1, '初步核实审批表.doc', 'PRELIMINARY_VERIFICATION_APPROVAL');
-            addFileItem(2, '请示报告卡.doc', 'REPORT_CARD');
+            // 自动加载所有可用的文件列表
+            try {
+                const response = await fetch('/api/file-configs');
+                const result = await response.json();
+                if (result.isSuccess && result.data && result.data.fileConfigs) {
+                    // 只添加有filePath的文件(有模板文件的)
+                    const filesWithPath = result.data.fileConfigs.filter(f => f.filePath);
+                    // 加载所有可用文件
+                    filesWithPath.forEach(file => {
+                        addFileItem(file.fileId, file.fileName);
+                    });
+                    if (filesWithPath.length > 0) {
+                        console.log(`已自动加载 ${filesWithPath.length} 个可用文件模板`);
+                    }
+                } else {
+                    console.warn('未找到可用的文件配置');
+                }
+            } catch (error) {
+                console.warn('自动加载文件列表失败:', error);
+            }
         }

         function addGenerateField(fieldCode = '', fieldValue = '') {
@@ -584,15 +718,14 @@
             container.appendChild(fieldDiv);
         }

-        function addFileItem(fileId = '', fileName = '', templateCode = '') {
+        function addFileItem(fileId = '', fileName = '') {
             const container = document.getElementById('fileListContainer');
             const fileDiv = document.createElement('div');
             fileDiv.className = 'field-row';
             fileDiv.innerHTML = `
-                <input type="number" placeholder="文件ID" value="${fileId}" class="file-id" style="width: 150px;">
+                <input type="number" placeholder="文件ID (从f_polic_file_config表获取)" value="${fileId}" class="file-id" style="width: 200px;">
                 <div style="display: flex; gap: 10px; flex: 1;">
                     <input type="text" placeholder="文件名称 (如: 初步核实审批表.doc)" value="${fileName}" class="file-name" style="flex: 1;">
-                    <input type="text" placeholder="模板编码 (如: PRELIMINARY_VERIFICATION_APPROVAL)" value="${templateCode}" class="template-code" style="flex: 1;">
                     <button class="btn btn-danger" onclick="removeField(this)">删除</button>
                 </div>
             `;
@@ -628,13 +761,11 @@
             fileContainers.forEach(container => {
                 const fileId = container.querySelector('.file-id').value.trim();
                 const fileName = container.querySelector('.file-name').value.trim();
-                const templateCode = container.querySelector('.template-code').value.trim();

-                if (fileId && fileName && templateCode) {
+                if (fileId) {
                     fileList.push({
                         fileId: parseInt(fileId),
-                        fileName: fileName,
-                        templateCode: templateCode
+                        fileName: fileName || 'generated.docx' // fileName可选
                     });
                 }
             });
@@ -714,8 +845,15 @@
                     result.data.fpolicFieldParamFileList.forEach(file => {
                         html += `<div class="result-item">
                             <strong>${file.fileName}:</strong><br>
-                            文件路径: ${file.filePath || '(无路径)'}
-                        </div>`;
+                            文件路径: ${file.filePath || '(无路径)'}<br>`;
+                        // 如果有下载链接,显示可点击的链接
+                        if (file.downloadUrl) {
+                            html += `下载链接: <a href="${file.downloadUrl}" target="_blank" style="color: #667eea; text-decoration: underline; word-break: break-all;">${file.downloadUrl}</a><br>`;
+                            html += `<button class="btn btn-secondary" onclick="window.open('${file.downloadUrl}', '_blank')" style="margin-top: 5px; padding: 6px 15px; font-size: 14px;">📥 下载文档</button>`;
+                        }
+                        html += `</div>`;
                     });
                 }
             }
@@ -741,6 +879,12 @@
             btn.closest('.field-row').remove();
         }

+        function clearAllFiles() {
+            if (confirm('确定要清空所有文件列表吗?')) {
+                document.getElementById('fileListContainer').innerHTML = '';
+            }
+        }
+
         function displayError(tabType, errorMsg) {
             const resultSection = document.getElementById(tabType + 'ResultSection');
             const resultBox = document.getElementById(tabType + 'ResultBox');

File diff suppressed because it is too large


@@ -0,0 +1,458 @@
"""
Sync three tables from the existing database to a new database.
Tables synced: f_polic_field, f_polic_file_config, f_polic_file_field
The target database is backed up before the sync runs.
"""
import os
import sys
import subprocess
import pymysql
from datetime import datetime
from pathlib import Path
from typing import List, Dict, Any
from dotenv import load_dotenv
# Force stdout/stderr to UTF-8 (Windows console compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8', errors='replace')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8', errors='replace')
# 加载环境变量
load_dotenv()
# 现有数据库配置(源数据库)
SOURCE_DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
# 新数据库配置(目标数据库)
TARGET_DB_CONFIG = {
'host': '10.100.31.21',
'port': 3306,
'user': 'finyx',
'password': 'FknJYz3FA5WDYtsd',
'database': 'finyx',
'charset': 'utf8mb4'
}
# 需要同步的表
TABLES_TO_SYNC = ['f_polic_field', 'f_polic_file_config', 'f_polic_file_field']
# 备份文件存储目录
BACKUP_DIR = Path('backups')
BACKUP_DIR.mkdir(exist_ok=True)
def backup_target_database() -> str:
"""
备份目标数据库
Returns:
备份文件路径
"""
print("=" * 60)
print("步骤 1: 备份新数据库")
print("=" * 60)
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
backup_file = BACKUP_DIR / f"backup_target_db_{timestamp}.sql"
# 构建mysqldump命令
cmd = [
'mysqldump',
f"--host={TARGET_DB_CONFIG['host']}",
f"--port={TARGET_DB_CONFIG['port']}",
f"--user={TARGET_DB_CONFIG['user']}",
f"--password={TARGET_DB_CONFIG['password']}",
'--single-transaction',
'--routines',
'--triggers',
'--events',
'--add-drop-table',
'--default-character-set=utf8mb4',
TARGET_DB_CONFIG['database']
]
try:
print(f"开始备份数据库 {TARGET_DB_CONFIG['database']}...")
print(f"备份文件: {backup_file}")
# 执行备份命令
with open(backup_file, 'w', encoding='utf-8') as f:
result = subprocess.run(
cmd,
stdout=f,
stderr=subprocess.PIPE,
text=True
)
if result.returncode != 0:
error_msg = result.stderr if result.stderr else '未知错误'
raise Exception(f"mysqldump执行失败: {error_msg}")
# 检查文件大小
file_size = backup_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
print(f"备份文件路径: {backup_file}")
return str(backup_file)
except FileNotFoundError:
print("警告: 未找到mysqldump命令,尝试使用Python方式备份...")
return backup_target_database_with_python(backup_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
def backup_target_database_with_python(backup_file: Path) -> str:
"""
使用Python方式备份目标数据库(备用方式)
Args:
backup_file: 备份文件路径
Returns:
备份文件路径
"""
try:
print(f"开始使用Python方式备份数据库 {TARGET_DB_CONFIG['database']}...")
# 连接数据库
connection = pymysql.connect(**TARGET_DB_CONFIG)
cursor = connection.cursor()
with open(backup_file, 'w', encoding='utf-8') as f:
# 写入文件头
f.write(f"-- MySQL数据库备份\n")
f.write(f"-- 数据库: {TARGET_DB_CONFIG['database']}\n")
f.write(f"-- 备份时间: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}\n")
f.write(f"-- 主机: {TARGET_DB_CONFIG['host']}:{TARGET_DB_CONFIG['port']}\n")
f.write("--\n\n")
f.write(f"SET NAMES utf8mb4;\n")
f.write(f"SET FOREIGN_KEY_CHECKS=0;\n\n")
# 获取所有表
cursor.execute("SHOW TABLES")
tables = [table[0] for table in cursor.fetchall()]
print(f"找到 {len(tables)} 个表")
# 备份每个表
for table in tables:
print(f"备份表: {table}")
# 获取表结构
cursor.execute(f"SHOW CREATE TABLE `{table}`")
create_table_sql = cursor.fetchone()[1]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表结构: {table}\n")
f.write(f"-- ----------------------------\n")
f.write(f"DROP TABLE IF EXISTS `{table}`;\n")
f.write(f"{create_table_sql};\n\n")
# 获取表数据
cursor.execute(f"SELECT * FROM `{table}`")
rows = cursor.fetchall()
if rows:
# 获取列名
cursor.execute(f"DESCRIBE `{table}`")
columns = [col[0] for col in cursor.fetchall()]
f.write(f"-- ----------------------------\n")
f.write(f"-- 表数据: {table}\n")
f.write(f"-- ----------------------------\n")
# 分批写入数据
batch_size = 1000
for i in range(0, len(rows), batch_size):
batch = rows[i:i+batch_size]
values_list = []
for row in batch:
values = []
for value in row:
if value is None:
values.append('NULL')
elif isinstance(value, (int, float)):
values.append(str(value))
else:
# 转义特殊字符
escaped_value = str(value).replace('\\', '\\\\').replace("'", "\\'")
values.append(f"'{escaped_value}'")
values_list.append(f"({', '.join(values)})")
columns_str = ', '.join([f"`{col}`" for col in columns])
values_str = ',\n'.join(values_list)
f.write(f"INSERT INTO `{table}` ({columns_str}) VALUES\n")
f.write(f"{values_str};\n\n")
print(f" 完成: {len(rows)} 条记录")
f.write("SET FOREIGN_KEY_CHECKS=1;\n")
cursor.close()
connection.close()
# 检查文件大小
file_size = backup_file.stat().st_size
print(f"备份完成!文件大小: {file_size / 1024 / 1024:.2f} MB")
return str(backup_file)
except Exception as e:
print(f"备份失败: {str(e)}")
raise
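The Python fallback above escapes values with plain string replacement, which silently mangles datetime and binary columns (`str(value)` emits datetimes unquoted-unfriendly and corrupts bytes). A self-contained sketch of a more type-aware literal renderer — `sql_literal` is a hypothetical helper, not part of the script:

```python
from datetime import datetime, date

def sql_literal(value):
    # Render one Python value as a SQL literal for a backup INSERT.
    # NULL, numbers, dates, bytes, and strings each need different handling.
    if value is None:
        return 'NULL'
    if isinstance(value, bool):          # bool before int: True is an int subclass
        return '1' if value else '0'
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, datetime):
        return f"'{value.isoformat(sep=' ')}'"
    if isinstance(value, date):
        return f"'{value.isoformat()}'"
    if isinstance(value, (bytes, bytearray)):
        return '0x' + value.hex() if value else "''"
    escaped = str(value).replace('\\', '\\\\').replace("'", "\\'")
    return f"'{escaped}'"
```

In production, letting the driver do the quoting (e.g. parameterized statements, or `mysqldump` itself) is still the safer route; this only patches the fallback path.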
def get_table_data(conn, table_name: str) -> List[Dict[str, Any]]:
"""
从源数据库获取表的所有数据
Args:
conn: 数据库连接
table_name: 表名
Returns:
数据列表
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute(f"SELECT * FROM `{table_name}`")
return cursor.fetchall()
finally:
cursor.close()
def get_table_columns(conn, table_name: str) -> List[str]:
"""
获取表的列名
Args:
conn: 数据库连接
table_name: 表名
Returns:
列名列表
"""
cursor = conn.cursor()
try:
cursor.execute(f"DESCRIBE `{table_name}`")
return [col[0] for col in cursor.fetchall()]
finally:
cursor.close()
def clear_table(conn, table_name: str):
"""
清空目标数据库中的表
Args:
conn: 数据库连接
table_name: 表名
"""
cursor = conn.cursor()
try:
# 禁用外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=0")
# 清空表
cursor.execute(f"TRUNCATE TABLE `{table_name}`")
# 恢复外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=1")
conn.commit()
print(f" 已清空表: {table_name}")
except Exception as e:
conn.rollback()
raise Exception(f"清空表 {table_name} 失败: {str(e)}")
finally:
cursor.close()
def insert_table_data(conn, table_name: str, columns: List[str], data: List[Dict[str, Any]]):
"""
将数据插入到目标数据库
Args:
conn: 数据库连接
table_name: 表名
columns: 列名列表
data: 数据列表
"""
if not data:
print(f"{table_name} 没有数据需要插入")
return
cursor = conn.cursor()
try:
# 禁用外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=0")
# 构建INSERT语句
columns_str = ', '.join([f"`{col}`" for col in columns])
placeholders = ', '.join(['%s'] * len(columns))
insert_sql = f"INSERT INTO `{table_name}` ({columns_str}) VALUES ({placeholders})"
# 批量插入数据
batch_size = 1000
total_inserted = 0
for i in range(0, len(data), batch_size):
batch = data[i:i+batch_size]
values_list = []
for row in batch:
values = [row.get(col) for col in columns]
values_list.append(values)
cursor.executemany(insert_sql, values_list)
total_inserted += len(batch)
# 恢复外键检查
cursor.execute("SET FOREIGN_KEY_CHECKS=1")
conn.commit()
print(f" 已插入 {total_inserted} 条记录到表: {table_name}")
except Exception as e:
conn.rollback()
raise Exception(f"插入数据到表 {table_name} 失败: {str(e)}")
finally:
cursor.close()
def sync_table(source_conn, target_conn, table_name: str):
"""
同步单个表的数据
Args:
source_conn: 源数据库连接
target_conn: 目标数据库连接
table_name: 表名
"""
print(f"\n同步表: {table_name}")
print("-" * 60)
try:
# 获取表的列名
columns = get_table_columns(source_conn, table_name)
print(f" 表列: {', '.join(columns)}")
# 从源数据库获取数据
print(f" 从源数据库读取数据...")
source_data = get_table_data(source_conn, table_name)
print(f" 读取到 {len(source_data)} 条记录")
# 清空目标表
print(f" 清空目标表...")
clear_table(target_conn, table_name)
# 插入数据到目标表
if source_data:
print(f" 插入数据到目标表...")
insert_table_data(target_conn, table_name, columns, source_data)
else:
print(f"{table_name} 没有数据需要同步")
print(f"[OK] 表 {table_name} 同步完成")
except Exception as e:
print(f"[ERROR] 表 {table_name} 同步失败: {str(e)}")
raise
def main():
"""主函数"""
print("=" * 60)
print("数据库表同步工具")
print("=" * 60)
print(f"源数据库: {SOURCE_DB_CONFIG['host']}:{SOURCE_DB_CONFIG['port']}/{SOURCE_DB_CONFIG['database']}")
print(f"目标数据库: {TARGET_DB_CONFIG['host']}:{TARGET_DB_CONFIG['port']}/{TARGET_DB_CONFIG['database']}")
print(f"同步表: {', '.join(TABLES_TO_SYNC)}")
print("=" * 60)
# 步骤1: 备份目标数据库
try:
backup_file = backup_target_database()
print(f"\n[OK] 备份完成: {backup_file}\n")
except Exception as e:
print(f"\n[ERROR] 备份失败: {str(e)}")
response = input("是否继续同步?(y/n): ")
if response.lower() != 'y':
print("已取消同步")
sys.exit(1)
# 步骤2: 连接数据库
print("=" * 60)
print("步骤 2: 连接数据库")
print("=" * 60)
source_conn = None
target_conn = None
try:
print("连接源数据库...")
source_conn = pymysql.connect(**SOURCE_DB_CONFIG)
print("[OK] 源数据库连接成功")
print("连接目标数据库...")
try:
target_conn = pymysql.connect(**TARGET_DB_CONFIG, connect_timeout=10)
print("[OK] 目标数据库连接成功\n")
except pymysql.err.OperationalError as e:
if "timed out" in str(e) or "2003" in str(e):
print(f"[ERROR] 无法连接到目标数据库 {TARGET_DB_CONFIG['host']}:{TARGET_DB_CONFIG['port']}")
print("请检查:")
print(" 1. 网络连接是否正常")
print(" 2. 是否需要VPN连接")
print(" 3. 数据库服务器是否可访问")
print(" 4. 防火墙设置是否正确")
raise
# 步骤3: 同步表数据
print("=" * 60)
print("步骤 3: 同步表数据")
print("=" * 60)
for table_name in TABLES_TO_SYNC:
try:
sync_table(source_conn, target_conn, table_name)
except Exception as e:
print(f"\n[ERROR] 同步表 {table_name} 时发生错误: {str(e)}")
print("已停止同步")
sys.exit(1)
print("\n" + "=" * 60)
print("[OK] 所有表同步完成!")
print("=" * 60)
except pymysql.Error as e:
print(f"\n[ERROR] 数据库连接失败: {str(e)}")
if "timed out" in str(e) or "2003" in str(e):
print("\n提示:如果无法连接到目标数据库,请检查网络连接和VPN设置")
sys.exit(1)
except Exception as e:
print(f"\n[ERROR] 发生错误: {str(e)}")
sys.exit(1)
finally:
if source_conn:
source_conn.close()
if target_conn:
target_conn.close()
if __name__ == '__main__':
main()


@@ -0,0 +1,552 @@
"""
Sync each template's input_data, template_code, and field associations
according to the Excel data-design document.
"""
import os
import json
import pymysql
import pandas as pd
from pathlib import Path
from typing import Dict, List, Optional, Set
from datetime import datetime
from collections import defaultdict
# 数据库连接配置
DB_CONFIG = {
'host': os.getenv('DB_HOST', '152.136.177.240'),
'port': int(os.getenv('DB_PORT', 5012)),
'user': os.getenv('DB_USER', 'finyx'),
'password': os.getenv('DB_PASSWORD', '6QsGK6MpePZDE57Z'),
'database': os.getenv('DB_NAME', 'finyx'),
'charset': 'utf8mb4'
}
TENANT_ID = 615873064429507639
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
# Excel文件路径
EXCEL_FILE = '技术文档/智慧监督项目模板数据结构设计表-20251125-一凡标注.xlsx'
# 模板名称映射(Excel中的名称 -> 数据库中的名称)
TEMPLATE_NAME_MAPPING = {
'请示报告卡': '1.请示报告卡XXX',
'初步核实审批表': '2.初步核实审批表XXX',
'初核方案': '3.附件初核方案(XXX)',
'谈话通知书': '谈话通知书',
'谈话通知书第一联': '谈话通知书第一联',
'谈话通知书第二联': '谈话通知书第二联',
'谈话通知书第三联': '谈话通知书第三联',
'走读式谈话审批': '走读式谈话审批',
'走读式谈话流程': '走读式谈话流程',
'请示报告卡(初核报告结论)': '8-1请示报告卡初核报告结论 ',
'XXX初核情况报告': '8.XXX初核情况报告',
}
# 模板编码映射(Excel中的名称 -> template_code)
TEMPLATE_CODE_MAPPING = {
'请示报告卡': 'REPORT_CARD',
'初步核实审批表': 'PRELIMINARY_VERIFICATION_APPROVAL',
'初核方案': 'INVESTIGATION_PLAN',
'谈话通知书第一联': 'NOTIFICATION_LETTER_1',
'谈话通知书第二联': 'NOTIFICATION_LETTER_2',
'谈话通知书第三联': 'NOTIFICATION_LETTER_3',
'请示报告卡(初核报告结论)': 'REPORT_CARD_CONCLUSION',
'XXX初核情况报告': 'INVESTIGATION_REPORT',
}
# 字段名称到字段编码的映射
FIELD_NAME_TO_CODE_MAP = {
# 输入字段
'线索信息': 'clue_info',
'被核查人员工作基本情况线索': 'target_basic_info_clue',
# 输出字段 - 基本信息
'被核查人姓名': 'target_name',
'被核查人员单位及职务': 'target_organization_and_position',
'被核查人员性别': 'target_gender',
'被核查人员出生年月': 'target_date_of_birth',
'被核查人员出生年月日': 'target_date_of_birth_full',
'被核查人员政治面貌': 'target_political_status',
'被核查人员职级': 'target_professional_rank',
'被核查人员单位': 'target_organization',
'被核查人员职务': 'target_position',
# 输出字段 - 其他信息
'线索来源': 'clue_source',
'主要问题线索': 'target_issue_description',
'初步核实审批表承办部门意见': 'department_opinion',
'初步核实审批表填表人': 'filler_name',
'请示报告卡请示时间': 'report_card_request_time',
'被核查人员身份证件及号码': 'target_id_number',
'被核查人员身份证号': 'target_id_number',
'应到时间': 'appointment_time',
'应到地点': 'appointment_location',
'批准时间': 'approval_time',
'承办部门': 'handling_department',
'承办人': 'handler_name',
'谈话通知时间': 'notification_time',
'谈话通知地点': 'notification_location',
'被核查人员住址': 'target_address',
'被核查人员户籍住址': 'target_registered_address',
'被核查人员联系方式': 'target_contact',
'被核查人员籍贯': 'target_place_of_origin',
'被核查人员民族': 'target_ethnicity',
'被核查人员工作基本情况': 'target_work_basic_info',
'核查单位名称': 'investigation_unit_name',
'核查组组长姓名': 'investigation_team_leader_name',
'核查组成员姓名': 'investigation_team_member_names',
'核查地点': 'investigation_location',
}
def generate_id():
"""生成ID"""
import time
import random
timestamp = int(time.time() * 1000)
random_part = random.randint(100000, 999999)
return timestamp * 1000 + random_part
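`generate_id` above adds a 6-digit random part into a slot only 1000 wide (`timestamp * 1000`), so the random part spills into the timestamp bits and IDs generated within roughly a second of each other can collide. A hedged, process-local sketch of a collision-free variant (illustrative only; across processes a shared allocator or the database's AUTO_INCREMENT would still be needed):

```python
import itertools
import threading
import time

# A 3-digit sequence fits the *1000 slot, so two calls in the same
# millisecond cannot collide within one process.
_seq = itertools.count()
_seq_lock = threading.Lock()

def generate_monotonic_id() -> int:
    with _seq_lock:
        n = next(_seq) % 1000
    return int(time.time() * 1000) * 1000 + n
```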
def normalize_template_name(name: str) -> str:
"""标准化模板名称,用于匹配"""
import re
# 去掉开头的编号和括号内容
name = re.sub(r'^\d+[\.\-]\s*', '', name)
name = re.sub(r'[(].*?[)]', '', name)
name = name.strip()
return name
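A standalone illustration of the normalization above, reproducing its two substitutions (assuming the paren character class is fullwidth, matching names like `3.附件初核方案(XXX)`). Note how `8-1…` loses only the `8-` prefix, leaving the `1`:

```python
import re

def normalize_template_name_demo(name: str) -> str:
    # Same two substitutions as normalize_template_name above:
    # strip a leading "N." / "N-" prefix, then any fullwidth-paren span.
    name = re.sub(r'^\d+[\.\-]\s*', '', name)
    name = re.sub(r'[(].*?[)]', '', name)
    return name.strip()

print(normalize_template_name_demo('2.初步核实审批表XXX'))    # 初步核实审批表XXX
print(normalize_template_name_demo('3.附件初核方案(XXX)'))   # 附件初核方案
```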
def parse_excel_data() -> Dict:
"""解析Excel文件提取模板和字段的关联关系"""
print("="*80)
print("解析Excel数据设计文档")
print("="*80)
if not Path(EXCEL_FILE).exists():
print(f"✗ Excel文件不存在: {EXCEL_FILE}")
return None
try:
df = pd.read_excel(EXCEL_FILE)
print(f"✓ 成功读取Excel文件,共 {len(df)} 行数据\n")
templates = defaultdict(lambda: {
'template_name': '',
'template_code': '',
'input_fields': [],
'output_fields': []
})
current_template = None
current_input_field = None
for idx, row in df.iterrows():
level1 = row.get('一级分类')
level2 = row.get('二级分类')
level3 = row.get('三级分类')
input_field = row.get('输入数据字段')
output_field = row.get('输出数据字段')
# 处理二级分类(模板名称)
if pd.notna(level2) and level2:
current_template = str(level2).strip()
# 获取模板编码
template_code = TEMPLATE_CODE_MAPPING.get(current_template, '')
if not template_code:
# 如果没有映射,尝试生成
template_code = current_template.upper().replace(' ', '_')
templates[current_template]['template_name'] = current_template
templates[current_template]['template_code'] = template_code
current_input_field = None # 重置输入字段
print(f" 模板: {current_template} (code: {template_code})")
# 处理三级分类(子模板,如谈话通知书第一联)
if pd.notna(level3) and level3:
current_template = str(level3).strip()
template_code = TEMPLATE_CODE_MAPPING.get(current_template, '')
if not template_code:
template_code = current_template.upper().replace(' ', '_')
templates[current_template]['template_name'] = current_template
templates[current_template]['template_code'] = template_code
current_input_field = None
print(f" 子模板: {current_template} (code: {template_code})")
# 处理输入字段
if pd.notna(input_field) and input_field:
input_field_name = str(input_field).strip()
if input_field_name != current_input_field:
current_input_field = input_field_name
field_code = FIELD_NAME_TO_CODE_MAP.get(input_field_name, input_field_name.lower().replace(' ', '_'))
if current_template:
templates[current_template]['input_fields'].append({
'name': input_field_name,
'field_code': field_code
})
# 处理输出字段
if pd.notna(output_field) and output_field:
output_field_name = str(output_field).strip()
field_code = FIELD_NAME_TO_CODE_MAP.get(output_field_name, output_field_name.lower().replace(' ', '_'))
if current_template:
templates[current_template]['output_fields'].append({
'name': output_field_name,
'field_code': field_code
})
# 去重
for template_name, template_info in templates.items():
# 输入字段去重
seen_input = set()
unique_input = []
for field in template_info['input_fields']:
key = field['field_code']
if key not in seen_input:
seen_input.add(key)
unique_input.append(field)
template_info['input_fields'] = unique_input
# 输出字段去重
seen_output = set()
unique_output = []
for field in template_info['output_fields']:
key = field['field_code']
if key not in seen_output:
seen_output.add(key)
unique_output.append(field)
template_info['output_fields'] = unique_output
print(f"\n✓ 解析完成,共 {len(templates)} 个模板")
for template_name, template_info in templates.items():
print(f" - {template_name}: {len(template_info['input_fields'])} 个输入字段, {len(template_info['output_fields'])} 个输出字段")
return dict(templates)
except Exception as e:
print(f"✗ 解析Excel文件失败: {e}")
import traceback
traceback.print_exc()
return None
def get_database_templates(conn) -> Dict:
"""获取数据库中的模板配置"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, template_code, input_data, parent_id
FROM f_polic_file_config
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
configs = cursor.fetchall()
result = {}
for config in configs:
name = config['name']
result[name] = config
# 也添加标准化名称的映射
normalized = normalize_template_name(name)
if normalized not in result:
result[normalized] = config
cursor.close()
return result
def get_database_fields(conn) -> Dict:
"""获取数据库中的字段定义"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
sql = """
SELECT id, name, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s
"""
cursor.execute(sql, (TENANT_ID,))
fields = cursor.fetchall()
result = {
'by_code': {},
'by_name': {}
}
for field in fields:
field_code = field['filed_code']
field_name = field['name']
result['by_code'][field_code] = field
result['by_name'][field_name] = field
cursor.close()
return result
def find_matching_template(excel_template_name: str, db_templates: Dict) -> Optional[Dict]:
"""查找匹配的数据库模板"""
# 1. 精确匹配
if excel_template_name in db_templates:
return db_templates[excel_template_name]
# 2. 通过映射表匹配
mapped_name = TEMPLATE_NAME_MAPPING.get(excel_template_name)
if mapped_name and mapped_name in db_templates:
return db_templates[mapped_name]
# 3. 标准化名称匹配
normalized = normalize_template_name(excel_template_name)
if normalized in db_templates:
return db_templates[normalized]
# 4. 模糊匹配
for db_name, db_config in db_templates.items():
if normalized in normalize_template_name(db_name) or normalize_template_name(db_name) in normalized:
return db_config
return None
def update_template_config(conn, template_id: int, template_code: str, input_fields: List[Dict], dry_run: bool = True):
"""更新模板配置的input_data和template_code"""
cursor = conn.cursor()
try:
# 构建input_data
input_data = {
'template_code': template_code,
'business_type': 'INVESTIGATION',
'input_fields': [f['field_code'] for f in input_fields]
}
input_data_json = json.dumps(input_data, ensure_ascii=False)
if not dry_run:
update_sql = """
UPDATE f_polic_file_config
SET template_code = %s, input_data = %s, updated_time = NOW(), updated_by = %s
WHERE id = %s AND tenant_id = %s
"""
cursor.execute(update_sql, (template_code, input_data_json, UPDATED_BY, template_id, TENANT_ID))
conn.commit()
print(f" ✓ 更新模板配置")
else:
print(f" [模拟] 将更新模板配置: template_code={template_code}")
finally:
cursor.close()
def update_template_field_relations(conn, template_id: int, input_fields: List[Dict], output_fields: List[Dict],
db_fields: Dict, dry_run: bool = True):
"""更新模板和字段的关联关系"""
cursor = conn.cursor()
try:
# 先删除旧的关联关系
if not dry_run:
delete_sql = """
DELETE FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s
"""
cursor.execute(delete_sql, (TENANT_ID, template_id))
# 创建新的关联关系
relations_created = 0
# 关联输入字段field_type=1
for field_info in input_fields:
field_code = field_info['field_code']
field = db_fields['by_code'].get(field_code)
if not field:
print(f" ⚠ 输入字段不存在: {field_code}")
continue
if field['field_type'] != 1:
print(f" ⚠ 字段类型不匹配: {field_code} (期望输入字段,实际为输出字段)")
continue
if not dry_run:
# 检查是否已存在
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, template_id, field['id']))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, template_id, field['id'],
CREATED_BY, UPDATED_BY, 1
))
relations_created += 1
else:
relations_created += 1
# 关联输出字段field_type=2
for field_info in output_fields:
field_code = field_info['field_code']
field = db_fields['by_code'].get(field_code)
if not field:
# 尝试通过名称匹配
field_name = field_info['name']
field = db_fields['by_name'].get(field_name)
if not field:
print(f" ⚠ 输出字段不存在: {field_code} ({field_info['name']})")
continue
if field['field_type'] != 2:
print(f" ⚠ 字段类型不匹配: {field_code} (期望输出字段,实际为输入字段)")
continue
if not dry_run:
# 检查是否已存在
check_sql = """
SELECT id FROM f_polic_file_field
WHERE tenant_id = %s AND file_id = %s AND filed_id = %s
"""
cursor.execute(check_sql, (TENANT_ID, template_id, field['id']))
existing = cursor.fetchone()
if not existing:
relation_id = generate_id()
insert_sql = """
INSERT INTO f_polic_file_field
(id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, %s)
"""
cursor.execute(insert_sql, (
relation_id, TENANT_ID, template_id, field['id'],
CREATED_BY, UPDATED_BY, 1
))
relations_created += 1
else:
relations_created += 1
if not dry_run:
conn.commit()
print(f" ✓ 创建 {relations_created} 个字段关联关系")
else:
print(f" [模拟] 将创建 {relations_created} 个字段关联关系")
finally:
cursor.close()
def main():
"""主函数"""
print("="*80)
print("同步模板字段信息(根据Excel数据设计文档)")
print("="*80)
# 解析Excel
excel_data = parse_excel_data()
if not excel_data:
return
# 连接数据库
try:
conn = pymysql.connect(**DB_CONFIG)
print("\n✓ 数据库连接成功")
except Exception as e:
print(f"\n✗ 数据库连接失败: {e}")
return
try:
# 获取数据库中的模板和字段
print("\n获取数据库中的模板和字段...")
db_templates = get_database_templates(conn)
db_fields = get_database_fields(conn)
print(f" 数据库中有 {len(db_templates)} 个模板")
print(f" 数据库中有 {len(db_fields['by_code'])} 个字段")
# 匹配和更新
print("\n" + "="*80)
print("匹配模板并更新配置")
print("="*80)
matched_count = 0
unmatched_templates = []
for excel_template_name, template_info in excel_data.items():
print(f"\n处理模板: {excel_template_name}")
# 查找匹配的数据库模板
db_template = find_matching_template(excel_template_name, db_templates)
if not db_template:
print(f" ✗ 未找到匹配的数据库模板")
unmatched_templates.append(excel_template_name)
continue
print(f" ✓ 匹配到数据库模板: {db_template['name']} (ID: {db_template['id']})")
matched_count += 1
# 更新模板配置
template_code = template_info['template_code']
input_fields = template_info['input_fields']
output_fields = template_info['output_fields']
print(f" 模板编码: {template_code}")
print(f" 输入字段: {len(input_fields)}")
print(f" 输出字段: {len(output_fields)}")
# 先执行模拟更新
print(" [模拟模式]")
update_template_config(conn, db_template['id'], template_code, input_fields, dry_run=True)
update_template_field_relations(conn, db_template['id'], input_fields, output_fields, db_fields, dry_run=True)
# 显示统计
print("\n" + "="*80)
print("统计信息")
print("="*80)
print(f"Excel中的模板数: {len(excel_data)}")
print(f"成功匹配: {matched_count}")
print(f"未匹配: {len(unmatched_templates)}")
if unmatched_templates:
print("\n未匹配的模板:")
for template in unmatched_templates:
print(f" - {template}")
# 询问是否执行实际更新
print("\n" + "="*80)
response = input("\n是否执行实际更新?(yes/no,默认no): ").strip().lower()
if response == 'yes':
print("\n执行实际更新...")
for excel_template_name, template_info in excel_data.items():
db_template = find_matching_template(excel_template_name, db_templates)
if db_template:
print(f"\n更新: {db_template['name']}")
update_template_config(conn, db_template['id'], template_info['template_code'],
template_info['input_fields'], dry_run=False)
update_template_field_relations(conn, db_template['id'],
template_info['input_fields'],
template_info['output_fields'],
db_fields, dry_run=False)
print("\n" + "="*80)
print("✓ 同步完成!")
print("="*80)
else:
print("\n已取消更新")
finally:
conn.close()
print("\n数据库连接已关闭")
if __name__ == '__main__':
main()


@@ -0,0 +1,779 @@
"""
Sync templates, fields, and field associations across databases.
Features:
1. Read the source database config from the .env file
2. Sync to the target database (10.100.31.21)
3. Map IDs between databases (the two databases use different IDs)
4. Match rows by business keys (name, filed_code, file_path)
Usage:
python sync_templates_between_databases.py --target-host 10.100.31.21 --target-port 3306 --target-user finyx --target-password FknJYz3FA5WDYtsd --target-database finyx --target-tenant-id 1
"""
import os
import sys
import pymysql
import argparse
from pathlib import Path
from typing import Dict, List, Set, Optional, Tuple
from dotenv import load_dotenv
# Force stdout/stderr to UTF-8 (Windows compatibility)
if sys.platform == 'win32':
import io
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
# 加载环境变量
load_dotenv()
# 项目根目录
PROJECT_ROOT = Path(__file__).parent
TEMPLATES_DIR = PROJECT_ROOT / "template_finish"
CREATED_BY = 655162080928945152
UPDATED_BY = 655162080928945152
def print_section(title):
"""打印章节标题"""
print("\n" + "="*70)
print(f" {title}")
print("="*70)
def print_result(success, message):
"""打印结果"""
status = "[OK]" if success else "[FAIL]"
print(f"{status} {message}")
def generate_id():
    """生成ID(微秒级时间戳;快速连续调用或并发时可能重复)"""
    import time
    return int(time.time() * 1000000)
def get_source_db_config() -> Dict:
"""从.env文件读取源数据库配置"""
db_host = os.getenv('DB_HOST')
db_port = os.getenv('DB_PORT')
db_user = os.getenv('DB_USER')
db_password = os.getenv('DB_PASSWORD')
db_name = os.getenv('DB_NAME')
if not all([db_host, db_port, db_user, db_password, db_name]):
raise ValueError(
"源数据库配置不完整,请在.env文件中配置以下环境变量\n"
"DB_HOST, DB_PORT, DB_USER, DB_PASSWORD, DB_NAME"
)
return {
'host': db_host,
'port': int(db_port),
'user': db_user,
'password': db_password,
'database': db_name,
'charset': 'utf8mb4'
}
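For reference, a minimal `.env` matching the variables read above (placeholder values, not real credentials):

```ini
DB_HOST=127.0.0.1
DB_PORT=3306
DB_USER=finyx
DB_PASSWORD=change-me
DB_NAME=finyx
```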
def get_target_db_config_from_args() -> Dict:
"""从命令行参数获取目标数据库配置"""
parser = argparse.ArgumentParser(
description='跨数据库同步模板、字段和关联关系',
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
示例
python sync_templates_between_databases.py --target-host 10.100.31.21 --target-port 3306 --target-user finyx --target-password FknJYz3FA5WDYtsd --target-database finyx --target-tenant-id 1
"""
)
parser.add_argument('--target-host', type=str, required=True, help='目标MySQL服务器地址')
parser.add_argument('--target-port', type=int, required=True, help='目标MySQL服务器端口')
parser.add_argument('--target-user', type=str, required=True, help='目标MySQL用户名')
parser.add_argument('--target-password', type=str, required=True, help='目标MySQL密码')
parser.add_argument('--target-database', type=str, required=True, help='目标数据库名称')
parser.add_argument('--target-tenant-id', type=int, required=True, help='目标租户ID')
parser.add_argument('--source-tenant-id', type=int, help='源租户ID(如果不指定,将使用数据库中的第一个tenant_id)')
parser.add_argument('--dry-run', action='store_true', help='预览模式(不实际更新数据库)')
args = parser.parse_args()
return {
'host': args.target_host,
'port': args.target_port,
'user': args.target_user,
'password': args.target_password,
'database': args.target_database,
'charset': 'utf8mb4',
'tenant_id': args.target_tenant_id,
'source_tenant_id': args.source_tenant_id,
'dry_run': args.dry_run
}
def test_db_connection(config: Dict, label: str) -> Optional[pymysql.Connection]:
"""测试数据库连接"""
try:
conn = pymysql.connect(
host=config['host'],
port=config['port'],
user=config['user'],
password=config['password'],
database=config['database'],
charset=config['charset']
)
return conn
except Exception as e:
print_result(False, f"{label}数据库连接失败: {str(e)}")
return None
def get_source_tenant_id(conn) -> int:
"""获取源数据库中的tenant_id"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
cursor.execute("SELECT DISTINCT tenant_id FROM f_polic_file_config LIMIT 1")
result = cursor.fetchone()
if result:
return result['tenant_id']
return 1
finally:
cursor.close()
def read_source_fields(conn, tenant_id: int) -> Tuple[Dict[str, Dict], Dict[str, Dict]]:
"""
从源数据库读取字段数据
Returns:
(input_fields_dict, output_fields_dict)
key: filed_code, value: 字段信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, tenant_id, name, filed_code, field_type, state
FROM f_polic_field
WHERE tenant_id = %s
AND state = 1
ORDER BY field_type, filed_code
"""
cursor.execute(sql, (tenant_id,))
fields = cursor.fetchall()
input_fields = {}
output_fields = {}
for field in fields:
field_info = {
'id': field['id'],
'tenant_id': field['tenant_id'],
'name': field['name'],
'filed_code': field['filed_code'],
'field_type': field['field_type'],
'state': field['state']
}
if field['field_type'] == 1:
input_fields[field['filed_code']] = field_info
elif field['field_type'] == 2:
output_fields[field['filed_code']] = field_info
return input_fields, output_fields
finally:
cursor.close()
def read_source_templates(conn, tenant_id: int) -> Dict[str, Dict]:
"""
从源数据库读取模板数据
Returns:
key: file_path (如果为空则使用name), value: 模板信息
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT id, tenant_id, parent_id, name, file_path, state
FROM f_polic_file_config
WHERE tenant_id = %s
AND state = 1
ORDER BY file_path, name
"""
cursor.execute(sql, (tenant_id,))
templates = cursor.fetchall()
result = {}
for template in templates:
# 使用file_path作为key如果没有file_path则使用name
key = template['file_path'] if template['file_path'] else f"DIR:{template['name']}"
result[key] = {
'id': template['id'],
'tenant_id': template['tenant_id'],
'parent_id': template['parent_id'],
'name': template['name'],
'file_path': template['file_path'],
'state': template['state']
}
return result
finally:
cursor.close()
def read_source_relations(conn, tenant_id: int) -> Dict[int, List[int]]:
"""
从源数据库读取字段关联关系
Returns:
key: file_id, value: [filed_id列表]
"""
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
sql = """
SELECT file_id, filed_id
FROM f_polic_file_field
WHERE tenant_id = %s
AND state = 1
"""
cursor.execute(sql, (tenant_id,))
relations = cursor.fetchall()
result = {}
for rel in relations:
file_id = rel['file_id']
filed_id = rel['filed_id']
if file_id not in result:
result[file_id] = []
result[file_id].append(filed_id)
return result
finally:
cursor.close()
def sync_fields_to_target(conn, tenant_id: int, source_input_fields: Dict, source_output_fields: Dict,
dry_run: bool = False) -> Tuple[Dict[int, int], Dict[int, int]]:
"""
同步字段到目标数据库
Returns:
(input_field_id_map, output_field_id_map)
key: 源字段ID, value: 目标字段ID
"""
print_section("同步字段到目标数据库")
cursor = conn.cursor(pymysql.cursors.DictCursor)
try:
# 1. 获取目标数据库中的现有字段
cursor.execute("""
SELECT id, filed_code, field_type
FROM f_polic_field
WHERE tenant_id = %s
AND state = 1
""", (tenant_id,))
existing_fields = cursor.fetchall()
existing_by_code = {}
for field in existing_fields:
key = (field['filed_code'], field['field_type'])
existing_by_code[key] = field['id']
print(f" 目标数据库现有字段: {len(existing_fields)}")
# 2. 同步输入字段
print("\n 同步输入字段...")
input_field_id_map = {}
input_created = 0
input_matched = 0
for code, source_field in source_input_fields.items():
key = (code, 1)
if key in existing_by_code:
# 字段已存在使用现有ID
target_id = existing_by_code[key]
input_field_id_map[source_field['id']] = target_id
input_matched += 1
else:
# 创建新字段
target_id = generate_id()
input_field_id_map[source_field['id']] = target_id
if not dry_run:
insert_cursor = conn.cursor()
try:
insert_cursor.execute("""
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (
target_id,
tenant_id,
source_field['name'],
source_field['filed_code'],
1,
CREATED_BY,
UPDATED_BY
))
conn.commit()
input_created += 1
finally:
insert_cursor.close()
else:
input_created += 1
print(f" 匹配: {input_matched} 个,创建: {input_created}")
# 3. 同步输出字段
print("\n 同步输出字段...")
output_field_id_map = {}
output_created = 0
output_matched = 0
for code, source_field in source_output_fields.items():
key = (code, 2)
if key in existing_by_code:
# 字段已存在使用现有ID
target_id = existing_by_code[key]
output_field_id_map[source_field['id']] = target_id
output_matched += 1
else:
# 创建新字段
target_id = generate_id()
output_field_id_map[source_field['id']] = target_id
if not dry_run:
insert_cursor = conn.cursor()
try:
insert_cursor.execute("""
INSERT INTO f_polic_field
(id, tenant_id, name, filed_code, field_type, created_time, created_by, updated_time, updated_by, state)
VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
""", (
target_id,
tenant_id,
source_field['name'],
source_field['filed_code'],
2,
CREATED_BY,
UPDATED_BY
))
conn.commit()
output_created += 1
finally:
insert_cursor.close()
else:
output_created += 1
print(f" 匹配: {output_matched} 个,创建: {output_created}")
return input_field_id_map, output_field_id_map
finally:
cursor.close()


def sync_templates_to_target(conn, tenant_id: int, source_templates: Dict,
                             dry_run: bool = False) -> Dict[int, int]:
    """
    Sync templates to the target database.

    Returns:
        template_id_map: key = source template ID, value = target template ID
    """
    print_section("Syncing templates to the target database")
    cursor = conn.cursor(pymysql.cursors.DictCursor)
    try:
        # 1. Fetch the templates that already exist in the target database
        cursor.execute("""
            SELECT id, name, file_path, parent_id
            FROM f_polic_file_config
            WHERE tenant_id = %s
              AND state = 1
        """, (tenant_id,))
        existing_templates = cursor.fetchall()
        existing_by_path = {}
        existing_by_name = {}
        for template in existing_templates:
            if template['file_path']:
                existing_by_path[template['file_path']] = template
            else:
                # Directory node
                name = template['name']
                if name not in existing_by_name:
                    existing_by_name[name] = []
                existing_by_name[name].append(template)
        print(f"  Existing templates in target database: {len(existing_templates)}")
        # 2. Process directory nodes first (in hierarchy order)
        print("\n  Syncing directory nodes...")
        template_id_map = {}
        dir_created = 0
        dir_matched = 0
        # Separate directories from files
        dir_templates = {}
        file_templates = {}
        for key, source_template in source_templates.items():
            if source_template['file_path']:
                file_templates[key] = source_template
            else:
                dir_templates[key] = source_template
        # Build the directory hierarchy (parents must be processed before children):
        # group by depth, handling directories without a parent_id first.
        dirs_by_level = {}
        for key, source_template in dir_templates.items():
            level = 0
            current = source_template
            while current.get('parent_id'):
                level += 1
                # Look up the parent directory
                parent_found = False
                for t in dir_templates.values():
                    if t['id'] == current['parent_id']:
                        current = t
                        parent_found = True
                        break
                if not parent_found:
                    break
            if level not in dirs_by_level:
                dirs_by_level[level] = []
            dirs_by_level[level].append((key, source_template))
        # Process directories level by level
        for level in sorted(dirs_by_level.keys()):
            for key, source_template in dirs_by_level[level]:
                source_id = source_template['id']
                name = source_template['name']
                # Find a matching directory by name and parent_id
                matched = None
                target_parent_id = None
                if source_template['parent_id']:
                    target_parent_id = template_id_map.get(source_template['parent_id'])
                for existing in existing_by_name.get(name, []):
                    if not existing['file_path']:  # make sure it is a directory node
                        # Check whether the parent_id matches
                        if existing['parent_id'] == target_parent_id:
                            matched = existing
                            break
                if matched:
                    target_id = matched['id']
                    template_id_map[source_id] = target_id
                    dir_matched += 1
                else:
                    target_id = generate_id()
                    template_id_map[source_id] = target_id
                    if not dry_run:
                        insert_cursor = conn.cursor()
                        try:
                            insert_cursor.execute("""
                                INSERT INTO f_polic_file_config
                                (id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
                                VALUES (%s, %s, %s, %s, NULL, NOW(), %s, NOW(), %s, 1)
                            """, (
                                target_id,
                                tenant_id,
                                target_parent_id,
                                name,
                                CREATED_BY,
                                UPDATED_BY
                            ))
                            conn.commit()
                            dir_created += 1
                        finally:
                            insert_cursor.close()
                    else:
                        dir_created += 1
        print(f"  Matched: {dir_matched}, created: {dir_created}")
        # 3. Process file nodes
        print("\n  Syncing file nodes...")
        file_created = 0
        file_matched = 0
        file_updated = 0
        for key, source_template in file_templates.items():
            source_id = source_template['id']
            file_path = source_template['file_path']
            name = source_template['name']
            # Match by file_path
            matched = existing_by_path.get(file_path)
            if matched:
                target_id = matched['id']
                template_id_map[source_id] = target_id
                file_matched += 1
                # Check whether an update is needed
                target_parent_id = None
                if source_template['parent_id']:
                    target_parent_id = template_id_map.get(source_template['parent_id'])
                if matched['parent_id'] != target_parent_id or matched['name'] != name:
                    file_updated += 1
                    if not dry_run:
                        update_cursor = conn.cursor()
                        try:
                            update_cursor.execute("""
                                UPDATE f_polic_file_config
                                SET parent_id = %s, name = %s, updated_time = NOW(), updated_by = %s
                                WHERE id = %s AND tenant_id = %s
                            """, (target_parent_id, name, UPDATED_BY, target_id, tenant_id))
                            conn.commit()
                        finally:
                            update_cursor.close()
            else:
                target_id = generate_id()
                template_id_map[source_id] = target_id
                if not dry_run:
                    insert_cursor = conn.cursor()
                    try:
                        # Map parent_id to the target database
                        target_parent_id = None
                        if source_template['parent_id']:
                            target_parent_id = template_id_map.get(source_template['parent_id'])
                        insert_cursor.execute("""
                            INSERT INTO f_polic_file_config
                            (id, tenant_id, parent_id, name, file_path, created_time, created_by, updated_time, updated_by, state)
                            VALUES (%s, %s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
                        """, (
                            target_id,
                            tenant_id,
                            target_parent_id,
                            name,
                            file_path,
                            CREATED_BY,
                            UPDATED_BY
                        ))
                        conn.commit()
                        file_created += 1
                    finally:
                        insert_cursor.close()
                else:
                    file_created += 1
        print(f"  Matched: {file_matched}, created: {file_created}, updated: {file_updated}")
        return template_id_map
    finally:
        cursor.close()


def sync_relations_to_target(conn, tenant_id: int, source_relations: Dict[int, List[int]],
                             template_id_map: Dict[int, int],
                             input_field_id_map: Dict[int, int],
                             output_field_id_map: Dict[int, int],
                             dry_run: bool = False):
    """Sync field relations to the target database."""
    print_section("Syncing field relations to the target database")
    # 1. Remove existing relations
    print("1. Removing existing relations...")
    if not dry_run:
        cursor = conn.cursor()
        try:
            cursor.execute("""
                DELETE FROM f_polic_file_field
                WHERE tenant_id = %s
            """, (tenant_id,))
            deleted_count = cursor.rowcount
            conn.commit()
            print_result(True, f"Deleted {deleted_count} old relations")
        finally:
            cursor.close()
    else:
        print("  [dry run] All existing relations would be removed")
    # 2. Create the new relations
    print("\n2. Creating new relations...")
    all_field_id_map = {**input_field_id_map, **output_field_id_map}
    relations_created = 0
    relations_skipped = 0
    for source_file_id, source_field_ids in source_relations.items():
        # Resolve the target file_id
        target_file_id = template_id_map.get(source_file_id)
        if not target_file_id:
            relations_skipped += 1
            continue
        # Map the field IDs to the target database
        target_field_ids = []
        for source_field_id in source_field_ids:
            target_field_id = all_field_id_map.get(source_field_id)
            if target_field_id:
                target_field_ids.append(target_field_id)
        if not target_field_ids:
            continue
        # Create the relations
        if not dry_run:
            cursor = conn.cursor()
            try:
                for target_field_id in target_field_ids:
                    relation_id = generate_id()
                    cursor.execute("""
                        INSERT INTO f_polic_file_field
                        (id, tenant_id, file_id, filed_id, created_time, created_by, updated_time, updated_by, state)
                        VALUES (%s, %s, %s, %s, NOW(), %s, NOW(), %s, 1)
                    """, (
                        relation_id,
                        tenant_id,
                        target_file_id,
                        target_field_id,
                        CREATED_BY,
                        UPDATED_BY
                    ))
                conn.commit()
                relations_created += len(target_field_ids)
            except Exception as e:
                conn.rollback()
                print(f"  [error] Failed to create relations: {str(e)}")
            finally:
                cursor.close()
        else:
            relations_created += len(target_field_ids)
    print_result(True, f"Created {relations_created} relations, skipped {relations_skipped} templates")
    return {
        'created': relations_created,
        'skipped': relations_skipped
    }


def main():
    """Entry point."""
    print_section("Cross-database sync of templates, fields, and relations")
    # 1. Read the source database config (from .env)
    print_section("Reading source database config")
    try:
        source_config = get_source_db_config()
        print_result(True, f"Source database: {source_config['host']}:{source_config['port']}/{source_config['database']}")
    except Exception as e:
        print_result(False, str(e))
        return
    # 2. Read the target database config (from command-line arguments)
    print_section("Reading target database config")
    target_config = get_target_db_config_from_args()
    print_result(True, f"Target database: {target_config['host']}:{target_config['port']}/{target_config['database']}")
    print(f"  Target tenant ID: {target_config['tenant_id']}")
    if target_config['dry_run']:
        print("\n[note] Dry-run mode: the database will not actually be updated")
    # 3. Connect to the databases
    print_section("Connecting to databases")
    source_conn = test_db_connection(source_config, "source")
    if not source_conn:
        return
    target_conn = test_db_connection(target_config, "target")
    if not target_conn:
        source_conn.close()
        return
    print_result(True, "Database connections established")
    try:
        # 4. Determine the source tenant ID
        source_tenant_id = target_config.get('source_tenant_id')
        if not source_tenant_id:
            source_tenant_id = get_source_tenant_id(source_conn)
        print(f"\nSource tenant ID: {source_tenant_id}")
        # 5. Read the source data
        print_section("Reading data from the source database")
        print("  Reading fields...")
        source_input_fields, source_output_fields = read_source_fields(source_conn, source_tenant_id)
        print_result(True, f"Input fields: {len(source_input_fields)}, output fields: {len(source_output_fields)}")
        print("\n  Reading templates...")
        source_templates = read_source_templates(source_conn, source_tenant_id)
        print_result(True, f"Total templates: {len(source_templates)}")
        print("\n  Reading relations...")
        source_relations = read_source_relations(source_conn, source_tenant_id)
        print_result(True, f"Relations: {len(source_relations)} templates have field relations")
        # 6. Sync to the target database
        target_tenant_id = target_config['tenant_id']
        dry_run = target_config['dry_run']
        # 6.1 Sync fields
        input_field_id_map, output_field_id_map = sync_fields_to_target(
            target_conn, target_tenant_id,
            source_input_fields, source_output_fields,
            dry_run
        )
        # 6.2 Sync templates
        template_id_map = sync_templates_to_target(
            target_conn, target_tenant_id,
            source_templates,
            dry_run
        )
        # 6.3 Sync relations
        relations_result = sync_relations_to_target(
            target_conn, target_tenant_id,
            source_relations,
            template_id_map,
            input_field_id_map,
            output_field_id_map,
            dry_run
        )
        # 7. Summary
        print_section("Sync finished")
        if dry_run:
            print("  This was a dry run; the database was not updated")
        else:
            print("  Database updated")
        print("\n  Sync statistics:")
        print(f"    - input fields: {len(input_field_id_map)}")
        print(f"    - output fields: {len(output_field_id_map)}")
        print(f"    - templates: {len(template_id_map)}")
        print(f"    - relations: {relations_result['created']}")
    finally:
        source_conn.close()
        target_conn.close()
        print_result(True, "Database connections closed")


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\n\n[interrupted] Cancelled by user")
        sys.exit(0)
    except Exception as e:
        print(f"\n[error] Unexpected exception: {str(e)}")
        import traceback
        traceback.print_exc()
        sys.exit(1)
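The three sync phases above hand their ID maps forward, so the relation step is at heart a pure remapping of (file, field) pairs. A minimal sketch of that remapping, as a standalone function with made-up IDs (the real script additionally inserts rows into `f_polic_file_field`):

```python
from typing import Dict, List, Tuple


def remap_relations(source_relations: Dict[int, List[int]],
                    template_id_map: Dict[int, int],
                    field_id_map: Dict[int, int]) -> List[Tuple[int, int]]:
    """Translate source (file_id, field_id) pairs into target IDs,
    skipping anything that has no mapping (unsynced templates/fields)."""
    pairs = []
    for src_file_id, src_field_ids in source_relations.items():
        tgt_file_id = template_id_map.get(src_file_id)
        if not tgt_file_id:
            continue  # template was not synced; skip its relations
        for src_field_id in src_field_ids:
            tgt_field_id = field_id_map.get(src_field_id)
            if tgt_field_id:
                pairs.append((tgt_file_id, tgt_field_id))
    return pairs


# Illustrative IDs only: template 200 and field 3 have no target mapping
relations = {100: [1, 2, 3], 200: [2]}
templates = {100: 9100}
fields = {1: 901, 2: 902}
print(remap_relations(relations, templates, fields))  # [(9100, 901), (9100, 902)]
```

Keeping the remapping separate from the INSERT loop, as the script does, is what makes `--dry-run`-style previews cheap: the counts can be computed without touching the target database.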

View File

@@ -15,12 +15,50 @@ class TemplateAIHelper:
     """Template AI helper class for intelligent analysis of document content."""

     def __init__(self):
-        self.api_key = os.getenv('SILICONFLOW_API_KEY')
-        self.model = os.getenv('SILICONFLOW_MODEL', 'deepseek-ai/DeepSeek-V3.2-Exp')
-        self.api_url = "https://api.siliconflow.cn/v1/chat/completions"
-
-        if not self.api_key:
-            raise Exception("SILICONFLOW_API_KEY is not configured; set it in the .env file")
+        # ========== AI provider selection ==========
+        # Choose the AI service via the AI_PROVIDER environment variable.
+        # Valid values: 'huawei' or 'siliconflow'; defaults to 'siliconflow'.
+        ai_provider = os.getenv('AI_PROVIDER', 'siliconflow').lower()
+        # ========== Huawei LLM configuration ==========
+        huawei_key = os.getenv('HUAWEI_API_KEY', 'sk-PoeiV3qwyTIRqcVc84E8E24cD2904872859a87922e0d9186')
+        huawei_endpoint = os.getenv('HUAWEI_API_ENDPOINT', 'http://10.100.31.26:3001/v1/chat/completions')
+        huawei_model = os.getenv('HUAWEI_MODEL', 'DeepSeek-R1-Distill-Llama-70B')
+        # ========== SiliconFlow configuration ==========
+        siliconflow_key = os.getenv('SILICONFLOW_API_KEY', '')
+        siliconflow_url = os.getenv('SILICONFLOW_URL', 'https://api.siliconflow.cn/v1/chat/completions')
+        siliconflow_model = os.getenv('SILICONFLOW_MODEL', 'deepseek-ai/DeepSeek-V3.2-Exp')
+        # Pick the provider based on the configuration
+        if ai_provider == 'huawei':
+            if not huawei_key or not huawei_endpoint:
+                raise Exception("Huawei LLM service is not configured; set HUAWEI_API_KEY and HUAWEI_API_ENDPOINT, or set AI_PROVIDER=siliconflow to use SiliconFlow")
+            self.api_key = huawei_key
+            self.model = huawei_model
+            self.api_url = huawei_endpoint
+            print(f"[Template AI helper] Using Huawei LLM: {huawei_model}")
+        elif ai_provider == 'siliconflow':
+            if not siliconflow_key:
+                raise Exception("SiliconFlow service is not configured; set SILICONFLOW_API_KEY, or set AI_PROVIDER=huawei to use the Huawei LLM")
+            self.api_key = siliconflow_key
+            self.model = siliconflow_model
+            self.api_url = siliconflow_url
+            print(f"[Template AI helper] Using SiliconFlow: {siliconflow_model}")
+        else:
+            # Auto-detect: prefer SiliconFlow; fall back to the Huawei LLM if it is not configured
+            if siliconflow_key and siliconflow_url:
+                self.api_key = siliconflow_key
+                self.model = siliconflow_model
+                self.api_url = siliconflow_url
+                print(f"[Template AI helper] Auto-selected SiliconFlow: {siliconflow_model}")
+            elif huawei_key and huawei_endpoint:
+                self.api_key = huawei_key
+                self.model = huawei_model
+                self.api_url = huawei_endpoint
+                print(f"[Template AI helper] Auto-selected Huawei LLM: {huawei_model}")
+            else:
+                raise Exception("No AI service is configured; set the AI_PROVIDER environment variable ('huawei' or 'siliconflow') and the corresponding API key")

     def test_api_connection(self) -> bool:
         """
@@ -30,7 +68,9 @@ class TemplateAIHelper:
             Whether the connection succeeded
         """
         try:
-            print("  [test] Testing the SiliconFlow API connection...")
+            print(f"  [test] Testing the API connection...")
+            # Test payload
             test_payload = {
                 "model": self.model,
                 "messages": [
@@ -39,9 +79,14 @@ class TemplateAIHelper:
                         "content": "test"
                     }
                 ],
+                "temperature": 0.5,
                 "max_tokens": 10
             }
+            # For the Huawei LLM, add extra parameters
+            if 'huawei' in self.api_url.lower() or '10.100.31.26' in self.api_url:
+                test_payload["stream"] = False
             headers = {
                 "Authorization": f"Bearer {self.api_key}",
                 "Content-Type": "application/json"
@@ -150,10 +195,21 @@ class TemplateAIHelper:
                     "content": prompt
                 }
             ],
-            "temperature": 0.2,
-            "max_tokens": 4000
+            "temperature": 0.5,
+            "max_tokens": 8192
         }
+        # For the Huawei LLM, add extra parameters
+        if 'huawei' in self.api_url.lower() or '10.100.31.26' in self.api_url:
+            payload["stream"] = False
+            payload["presence_penalty"] = 1.03
+            payload["frequency_penalty"] = 1.0
+            payload["repetition_penalty"] = 1.0
+            payload["top_p"] = 0.95
+            payload["top_k"] = 1
+            payload["seed"] = 1
+            payload["n"] = 1
         headers = {
             "Authorization": f"Bearer {self.api_key}",
             "Content-Type": "application/json"
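The provider-selection precedence introduced in this diff (explicit `AI_PROVIDER` wins; otherwise prefer SiliconFlow, then Huawei) can be isolated into a small testable function. This is a simplified sketch using the environment-variable names from the diff, with the printing and HTTP details omitted:

```python
from typing import Tuple


def select_provider(env: dict) -> Tuple[str, str, str]:
    """Return (provider, model, api_url) with the same precedence as the diff:
    an explicit AI_PROVIDER wins; otherwise prefer SiliconFlow, then Huawei."""
    provider = env.get('AI_PROVIDER', '').lower()
    sf_key = env.get('SILICONFLOW_API_KEY', '')
    sf_url = env.get('SILICONFLOW_URL', 'https://api.siliconflow.cn/v1/chat/completions')
    sf_model = env.get('SILICONFLOW_MODEL', 'deepseek-ai/DeepSeek-V3.2-Exp')
    hw_key = env.get('HUAWEI_API_KEY', '')
    hw_url = env.get('HUAWEI_API_ENDPOINT', '')
    hw_model = env.get('HUAWEI_MODEL', 'DeepSeek-R1-Distill-Llama-70B')
    if provider == 'huawei':
        if not (hw_key and hw_url):
            raise RuntimeError("Huawei provider selected but not configured")
        return 'huawei', hw_model, hw_url
    if provider == 'siliconflow':
        if not sf_key:
            raise RuntimeError("SiliconFlow provider selected but not configured")
        return 'siliconflow', sf_model, sf_url
    # Auto-detect: SiliconFlow first, Huawei as the fallback
    if sf_key:
        return 'siliconflow', sf_model, sf_url
    if hw_key and hw_url:
        return 'huawei', hw_model, hw_url
    raise RuntimeError("No AI provider configured")


print(select_provider({'SILICONFLOW_API_KEY': 'sk-demo'})[0])  # siliconflow
```

Taking the environment as a plain dict (rather than reading `os.getenv` inside the function, as `__init__` does) makes each precedence branch unit-testable without mutating process state.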

Binary file not shown.

Some files were not shown because too many files have changed in this diff.