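@@ -0,0 +1,241 @@
+# BrandCultivation: Cigarette Brand Cultivation Recommendation System
+
+A merchant recommendation system for cigarette brand cultivation, built on collaborative filtering, Item2Vec, and GBDT-LR. It provides product-to-merchant matching recommendations, delivery quantity allocation, and effect validation.
+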
+## Directory Structure
+
+```
+BrandCultivation/
+├── core/              # Infrastructure layer (logging, configuration, exceptions, middleware)
+├── api/               # FastAPI routing layer
+├── database/          # Data access layer (MySQL DAO + Redis)
+├── models/            # ML models (Item2Vec, ItemCF, GBDT-LR)
+├── utils/             # Utilities (file upload, report generation)
+├── config/            # Configuration files (YAML)
+├── run_api.py         # API service entry point
+├── train.py           # Model training entry point
+├── requirements.txt   # Python dependencies
+└── .env.example       # Environment variable template
+```
+
+## Requirements
+
+- Python 3.10+
+- MySQL 5.7+
+- Redis 5.0+
+
+## Installation
+
+```bash
+# Clone the project
+git clone <repo-url>
+cd BrandCultivation
+
+# Create a virtual environment
+conda create -n recommend python=3.10
+conda activate recommend
+
+# Install dependencies
+pip install -r requirements.txt
+```
+
+## Configuration
+
+### Environment Variables
+
+Copy `.env.example` to `.env` and fill in the actual values:
+
+```bash
+cp .env.example .env
+```
+
+The following environment variables must be configured:
+
+| Variable | Description | Example |
+|----------|-------------|---------|
+| `MYSQL_HOST` | MySQL host address | `rm-xxx.mysql.rds.aliyuncs.com` |
+| `MYSQL_PORT` | MySQL port | `3036` |
+| `MYSQL_USER` | MySQL user name | `BrandCultivation` |
+| `MYSQL_PASSWORD` | MySQL password | (required) |
+| `MYSQL_DB` | Database name | `brand_cultivation` |
+| `REDIS_HOST` | Redis host address | `r-xxx.redis.rds.aliyuncs.com` |
+| `REDIS_PORT` | Redis port | `5000` |
+| `REDIS_PASSWORD` | Redis password | (required) |
+| `REDIS_DB` | Redis database index | `10` |
+| `LOG_LEVEL` | Log level | `INFO` (default) |
+| `FILE_UPLOAD_URL` | File upload service URL | `http://file-center.jcpt:8080/file/fileUpload` |
+| `FILE_DOWNLOAD_URL` | File download service URL | `http://file-center.jcpt:8080/file/fileDownload` |
+
+If you do not use a `.env` file, you can export the environment variables directly:
+
+```bash
+export MYSQL_PASSWORD='your_password'
+export REDIS_PASSWORD='your_password'
+```
+
+### YAML Configuration
+
+Non-sensitive settings are kept in YAML files under the `config/` directory. Environment variables take precedence over YAML values.
+
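The precedence rule above can be sketched roughly as follows. This is a minimal illustration, not the project's actual loader: the function name `resolve_setting` is hypothetical, a plain dict stands in for a parsed file from `config/`, and treating an empty environment value as "unset" is an assumption.

```python
import os
from typing import Any, Mapping, Optional


def resolve_setting(name: str,
                    yaml_config: Mapping[str, Any],
                    env: Optional[Mapping[str, str]] = None,
                    default: Any = None) -> Any:
    """Return the environment value if set, else the YAML value, else default."""
    env = os.environ if env is None else env
    # Assumption: an empty string counts as "not set" and falls through to YAML
    if env.get(name, "") != "":
        return env[name]
    return yaml_config.get(name, default)


# Hypothetical values standing in for a parsed YAML file and the process environment
yaml_config = {"LOG_LEVEL": "DEBUG", "REDIS_DB": "10"}
fake_env = {"LOG_LEVEL": "WARNING"}

print(resolve_setting("LOG_LEVEL", yaml_config, env=fake_env))  # environment wins
print(resolve_setting("REDIS_DB", yaml_config, env=fake_env))   # falls back to YAML
```

Passing the environment as a parameter keeps the lookup easy to test without mutating `os.environ`.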
+## Running
+
+### Starting the API Service
+
+```bash
+python run_api.py
+```
+
+Once started, the service listens on `0.0.0.0:7960`. To verify that it is running:
+
+```bash
+# Health check
+curl http://localhost:7960/health
+
+# Expected response
+# {"status":"healthy","mysql":"ok","redis":"ok"}
+```
+
+Alternatively, start the service directly with uvicorn (hot reload supported):
+
+```bash
+uvicorn run_api:app --host 0.0.0.0 --port 7960 --reload
+```
+
+### Model Training
+
+Before training, make sure both MySQL and Redis are reachable.
+
+```bash
+# Full training (collaborative filtering + popularity recall + GBDT-LR)
+python train.py --run_train --city_uuid 00000000000000000000000011445301
+
+# Train the recall models only (collaborative filtering + popularity recall)
+python train.py --run_recall --city_uuid 00000000000000000000000011445301
+
+# Train the ranking model only (GBDT-LR)
+python train.py --run_gbdtlr --city_uuid 00000000000000000000000011445301
+```
+
+Training parameters:
+
+| Parameter | Description | Default |
+|-----------|-------------|---------|
+| `--city_uuid` | City UUID | `00000000000000000000000011445301` |
+| `--train_data_dir` | Directory where training data is saved | `./data/gbdt` |
+| `--model_path` | Directory where model weights are saved | `./models/rank/weights` |
+| `--largest_n` | ItemCF popularity Top N | `300` |
+| `--similarity_k` | Number of similar merchants for ItemCF | `100` |
+| `--top_n` | Number of ItemCF recommendation candidates | `1500` |
+| `--n_jobs` | Number of threads for parallel computation | `2` |
+
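For readers who want to script around these options, the flags and defaults in the table could be mirrored with `argparse` along these lines. This is only a sketch: how `train.py` actually defines its arguments is not shown here, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical mirror of the documented train.py options."""
    parser = argparse.ArgumentParser(description="Sketch of the training CLI")
    # Stage switches taken from the example commands
    parser.add_argument("--run_train", action="store_true", help="full training")
    parser.add_argument("--run_recall", action="store_true", help="recall models only")
    parser.add_argument("--run_gbdtlr", action="store_true", help="GBDT-LR ranking model only")
    # Defaults copied from the parameter table
    parser.add_argument("--city_uuid", default="00000000000000000000000011445301")
    parser.add_argument("--train_data_dir", default="./data/gbdt")
    parser.add_argument("--model_path", default="./models/rank/weights")
    parser.add_argument("--largest_n", type=int, default=300)
    parser.add_argument("--similarity_k", type=int, default=100)
    parser.add_argument("--top_n", type=int, default=1500)
    parser.add_argument("--n_jobs", type=int, default=2)
    return parser


# Parse an explicit argument list instead of sys.argv so the example runs anywhere
args = build_parser().parse_args(["--run_recall", "--top_n", "800"])
print(args.run_recall, args.top_n, args.similarity_k)
```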
+## API Endpoints
+
+Base path: `/brandcultivation/api/v1`
+
+### POST /recommend
+
+Generates a merchant recommendation list and allocates delivery quantities.
+
+Request body:
+```json
+{
+  "city_uuid": "00000000000000000000000011445301",
+  "product_code": "440298",
+  "recall_cust_count": 500,
+  "delivery_count": 5000,
+  "cultivacation_id": "10000001",
+  "limit_cycle_name": "202505W1(05.05-05.11)"
+}
+```
+
+Response:
+```json
+{
+  "code": 200,
+  "msg": "success",
+  "data": {
+    "recommendationInfo": [
+      {"id": 1, "cust_code": "445300108802", "recommend_score": 95.3, "delivery_count": 120}
+    ]
+  }
+}
+```
+
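A client for this endpoint might look roughly like the sketch below, using only the standard library. Several things are assumptions rather than documented behavior: that the endpoint is served at host + base path + `/recommend`, that any `code` other than 200 signals failure, and the helper names `build_recommend_request` and `parse_recommendations`. The request is built but not sent, so the example runs without a live service.

```python
import json
import urllib.request
from typing import Any, Dict, List

BASE_PATH = "/brandcultivation/api/v1"


def build_recommend_request(host: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for /recommend."""
    return urllib.request.Request(
        url=f"{host}{BASE_PATH}/recommend",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_recommendations(body: str) -> List[Dict[str, Any]]:
    """Return the merchant list; raise if the service reported a non-200 code."""
    doc = json.loads(body)
    if doc.get("code") != 200:
        raise RuntimeError(f"recommend failed: {doc.get('msg')}")
    return doc["data"]["recommendationInfo"]


request = build_recommend_request("http://localhost:7960", {
    "city_uuid": "00000000000000000000000011445301",
    "product_code": "440298",
    "recall_cust_count": 500,
    "delivery_count": 5000,
    "cultivacation_id": "10000001",
    "limit_cycle_name": "202505W1(05.05-05.11)",
})
# To actually send it: urllib.request.urlopen(request, timeout=30)

# Parse the sample response shown above
sample = ('{"code": 200, "msg": "success", "data": {"recommendationInfo": ['
          '{"id": 1, "cust_code": "445300108802", "recommend_score": 95.3, "delivery_count": 120}]}}')
for item in parse_recommendations(sample):
    print(item["cust_code"], item["recommend_score"], item["delivery_count"])
```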
+### POST /report
+
+Returns the file IDs of the reports related to a recommendation.
+
+Request body:
+```json
+{
+  "cultivacation_id": "10000001"
+}
+```
+
+### POST /eval_report
+
+Generates a delivery effect validation report.
+
+Request body:
+```json
+{
+  "city_uuid": "00000000000000000000000011445301",
+  "product_code": "440298",
+  "cultivacation_id": "10000001",
+  "start_time": "2025/2/10",
+  "end_time": "2025/2/16"
+}
+```
+
+### GET /health
+
+Health check. Returns the connection status of MySQL and Redis.
+
+## Logging
+
+The system writes JSON-formatted logs to stdout. Each log entry contains:
+
+```json
+{
+  "timestamp": "2026-05-21T03:35:48.869426+00:00",
+  "level": "INFO",
+  "module": "recommend",
+  "function": "recommend",
+  "line": 18,
+  "message": "Recommend request: city=xxx, product=440298, recall=500",
+  "request_id": "a1b2c3d4"
+}
+```
+
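+The log level is controlled by the `LOG_LEVEL` environment variable (DEBUG / INFO / WARNING / ERROR).
+
+Each API request is automatically assigned a `request_id` that is carried through the entire request chain, which makes issues easier to trace. The ID is also returned in the `X-Request-ID` response header.
+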
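One way to produce log entries of this shape is a custom `logging.Formatter` combined with a context variable that carries the request ID. The sketch below is an illustration under assumptions, not the code in `core/`: the names `JsonFormatter`, `request_id_var`, and `new_request_id` are hypothetical, and the 8-character hex ID merely imitates the `a1b2c3d4` example.

```python
import contextvars
import json
import logging
import sys
import uuid
from datetime import datetime, timezone

# Holds the ID of the request being handled ("-" when outside a request)
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object with the documented fields."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        }, ensure_ascii=False)


def new_request_id() -> str:
    """Generate a short ID (assumed format) and bind it to the current context."""
    rid = uuid.uuid4().hex[:8]
    request_id_var.set(rid)
    return rid


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("brandcultivation.sketch")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

rid = new_request_id()  # middleware would do this once per incoming request
logger.info("Recommend request: product=%s, recall=%d", "440298", 500)
```

In a FastAPI service, middleware would typically call something like `new_request_id()` at the start of each request and copy the value into the `X-Request-ID` response header.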
+## Docker Deployment
+
+```dockerfile
+FROM python:3.10-slim
+
+WORKDIR /app
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+
+COPY . .
+
+ENV MYSQL_PASSWORD=""
+ENV REDIS_PASSWORD=""
+ENV LOG_LEVEL=INFO
+
+EXPOSE 7960
+CMD ["python", "run_api.py"]
+```
+
+```bash
+docker build -t brand-cultivation .
+docker run -d \
+  -p 7960:7960 \
+  -e MYSQL_PASSWORD='your_password' \
+  -e REDIS_PASSWORD='your_password' \
+  brand-cultivation
+```
+