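docs(predictV3): add predictV3 algorithm optimization research docs and frontend feature implementation

- Complete Phase 11: predictV3 algorithm optimization research docs, covering technical analysis of 6 optimization directions
- Implement confidence evaluation, providing confidence indicators based on historical hit rate, score distribution, and multi-dimension consistency
- Extend the backtest metric system with ranking-quality metrics: NDCG@K, MRR, and hit-rate distribution
- Improve the transition-probability algorithm by introducing a second-order Markov chain and multi-attribute joint transitions to improve prediction accuracy
- Design a weight-training mechanism that supports grid search and genetic algorithms for data-driven parameter optimization
- Integrate combined-feature mining, using association rules and sequential patterns to discover latent relationships between numbers
- Implement a complete frontend interface that supports prediction display, confidence display, and backtest validation
- Establish performance optimization strategies, including precomputed caching, batch computation, and fallback strategies to keep responses fast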
@@ -0,0 +1,328 @@
---
phase: 11-predictv3
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - application/admin/model/History.php
autonomous: true
requirements:
  - PRED-02
  - PRED-05
must_haves:
  truths:
    - "Users can see the NDCG@5 metric in the backtest results"
    - "Users can see the MRR metric in the backtest results"
    - "Users can see hit-distribution statistics for each rank position"
    - "The system returns sensible defaults or a notice when data is insufficient"
  artifacts:
    - path: "application/admin/model/History.php"
      provides: "NDCG, MRR, and hit-distribution calculation methods"
      contains: "_calculateNDCG|_calculateMRR|_calculateHitDistribution"
  key_links:
    - from: "_runBacktestV3"
      to: "_calculateNDCG, _calculateMRR, _calculateHitDistribution"
      via: "method call in return statement"
---
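
# Phase 11 - Plan 01: Backtest Metric Extension

## Objective

Extend the backtest metrics of the `_runBacktestV3` method with ranking-quality metrics (NDCG@5, MRR, and hit distribution) to improve algorithm evaluation.

**Purpose:** The backtest currently returns only the Top5 hit rate and the average rank, with no ranking-quality metrics. NDCG and MRR are well-established recommender-system evaluation metrics that reflect prediction ranking quality more fully.

**Output:** Three new calculation methods in `History.php`, and an extended return value from `_runBacktestV3`.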

## Tasks

### Task 1: Implement NDCG@5 calculation (with empty-prediction guard and formula documentation)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (line 3495-3560, _runBacktestV3 method)
</read_first>

<action>
Add a `_calculateNDCG` method at the end of `History.php` (inside the class):

```php
/**
 * Compute NDCG@K (Normalized Discounted Cumulative Gain).
 *
 * Formula:
 * - DCG per period = rel / log2(rank + 1),
 *   where rel = 1 (hit within Top-K) or 0 (miss) and rank is the predicted position.
 * - IDCG per period = 1 / log2(1 + 1) = 1, because each detail carries a single
 *   hit/rank pair (one relevant number per period) and the ideal ranking puts it first.
 * - NDCG@K = mean of DCG / IDCG over all valid periods, range 0-1;
 *   closer to 1 means better ranking quality.
 *
 * Normalizing per period keeps the result within 0-1. Dividing a DCG summed over
 * all periods by an IDCG capped at K terms could exceed 1.
 *
 * @param array $backtestDetails Backtest details, each item contains {hit: bool, rank: int}
 * @param int $K Top-K parameter, default 5; evaluates ranking quality of the first K positions
 * @return float NDCG value (0-1 range); returns 0 for empty data
 */
private function _calculateNDCG($backtestDetails, $K = 5)
{
    // Edge case: empty predictions or an invalid K
    if (empty($backtestDetails) || $K <= 0) {
        return 0.0;
    }

    $ndcgSum = 0.0;
    $validCount = 0;

    foreach ($backtestDetails as $detail) {
        if (!isset($detail['hit']) || !isset($detail['rank'])) {
            continue; // skip invalid entries
        }
        $validCount++;
        if ($detail['hit'] && $detail['rank'] > 0 && $detail['rank'] <= $K) {
            // Per-period DCG: rel / log2(rank + 1), with rel = 1 on a hit; IDCG is 1
            $ndcgSum += 1 / log($detail['rank'] + 1, 2);
        }
    }

    // Average over valid periods; return 0 when no valid period exists
    return $validCount > 0 ? round($ndcgSum / $validCount, 4) : 0.0;
}
```

Implementation notes:
- Formula: per-period DCG = 1/log2(rank+1) for a hit within Top-K, per-period IDCG = 1, NDCG = mean over valid periods
- Empty-prediction guard: check whether $backtestDetails is empty
- Data-integrity check: make sure the hit and rank fields exist
- Use log(rank + 1, 2) as the discount function, so hits at higher ranks carry more weight
- Return a normalized value in the 0-1 range; closer to 1 means better ranking quality
</action>

<acceptance_criteria>
- grep regex match: `_calculateNDCG\s*\(` exists in History.php
- grep match: `empty($backtestDetails)` exists in the method (empty-prediction guard)
- The method returns a float value
- A function-level comment explains the NDCG calculation logic and formula
</acceptance_criteria>

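As a cross-check on the metric, the following is a minimal JavaScript sketch of a per-period NDCG@K. It assumes each backtest period has exactly one relevant number, so the ideal DCG for a period is 1 and the result stays within 0-1. The function name `ndcgAtK` and the sample data are illustrative only and are not part of `History.php`.

```javascript
// Sketch: per-period NDCG@K for backtest details shaped like {hit, rank}.
// Assumption: one relevant number per period, so the ideal DCG per period is 1.
function ndcgAtK(details, k) {
    if (!Array.isArray(details) || details.length === 0 || k <= 0) {
        return 0;
    }
    var sum = 0;
    var valid = 0;
    for (var i = 0; i < details.length; i++) {
        var d = details[i];
        if (!d || typeof d.hit === 'undefined' || typeof d.rank === 'undefined') {
            continue; // skip malformed entries
        }
        valid++;
        if (d.hit && d.rank >= 1 && d.rank <= k) {
            sum += 1 / Math.log2(d.rank + 1); // discount grows with rank
        }
    }
    return valid > 0 ? sum / valid : 0;
}

var sample = [
    { hit: true, rank: 1 },  // contributes 1 / log2(2) = 1
    { hit: true, rank: 3 },  // contributes 1 / log2(4) = 0.5
    { hit: false, rank: 0 }  // a miss contributes 0
];
console.log(ndcgAtK(sample, 5)); // (1 + 0.5 + 0) / 3
```

A hit at rank 1 in every period would give 1, and a run with no Top-K hits would give 0, which makes the value easy to compare across backtest windows of different lengths.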
### Task 2: Implement MRR and hit-distribution calculation (with edge-case handling)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (location of the newly added _calculateNDCG method)
</read_first>

<action>
After the `_calculateNDCG` method, add the `_calculateMRR` and `_calculateHitDistribution` methods:

```php
/**
 * Compute MRR (Mean Reciprocal Rank).
 * Focuses on the exact rank position of the hit number.
 *
 * Formula:
 * - MRR = Σ(1/rank_i) / N, where rank_i is the rank of the hit number and N is the number of tests
 * - A test without a hit contributes 0 to the sum
 * - MRR ranges from 0 to 1; closer to 1 means hits rank nearer the top on average
 *
 * @param array $backtestDetails Backtest details, each item contains {hit: bool, rank: int}
 * @return float MRR value (0-1 range); returns 0 for empty data
 */
private function _calculateMRR($backtestDetails)
{
    // Edge case: empty predictions
    if (empty($backtestDetails)) {
        return 0.0;
    }

    $reciprocalRanks = [];

    foreach ($backtestDetails as $detail) {
        if (!isset($detail['hit']) || !isset($detail['rank'])) {
            continue; // skip invalid entries
        }
        if ($detail['hit'] && $detail['rank'] > 0) {
            $reciprocalRanks[] = 1 / $detail['rank'];
        } else {
            $reciprocalRanks[] = 0; // a miss counts as 0
        }
    }

    return count($reciprocalRanks) > 0
        ? round(array_sum($reciprocalRanks) / count($reciprocalRanks), 4)
        : 0.0;
}

/**
 * Compute the hit distribution.
 * Counts hits at each rank position (1-5).
 *
 * Structure:
 * - Return format: {rank_1: n, rank_2: n, rank_3: n, rank_4: n, rank_5: n}
 * - rank_N is the number of hits at predicted rank N
 * - Used by the frontend bar chart
 *
 * @param array $backtestDetails Backtest details, each item contains {hit: bool, rank: int}
 * @return array Hit counts per rank (1-5), keyed rank_1 through rank_5
 */
private function _calculateHitDistribution($backtestDetails)
{
    // Initialize the distribution; rank_N keys are easy for the frontend to parse
    $distribution = [
        'rank_1' => 0,
        'rank_2' => 0,
        'rank_3' => 0,
        'rank_4' => 0,
        'rank_5' => 0
    ];

    // Edge case: empty predictions return the all-zero distribution
    if (empty($backtestDetails)) {
        return $distribution;
    }

    foreach ($backtestDetails as $detail) {
        if (!isset($detail['hit']) || !isset($detail['rank'])) {
            continue; // skip invalid entries
        }
        if ($detail['hit'] && $detail['rank'] >= 1 && $detail['rank'] <= 5) {
            $key = 'rank_' . (int) $detail['rank'];
            $distribution[$key]++;
        }
    }

    return $distribution;
}
```

Implementation notes:
- MRR: mean of the reciprocal ranks of the hit numbers, formula Σ(1/rank)/N
- Hit distribution: the structure is explicitly `{rank_1: n, rank_2: n, ..., rank_5: n}`
- Both methods include an empty-prediction guard and skip invalid entries
- hit_distribution uses rank_N key names so the frontend bar chart can render it easily
</action>

<acceptance_criteria>
- grep regex match: `_calculateMRR\s*\(` exists in History.php
- grep regex match: `_calculateHitDistribution\s*\(` exists in History.php
- grep match: `empty($backtestDetails)` exists in both methods (empty-prediction guard)
- grep match: `rank_1|rank_2|rank_3|rank_4|rank_5` exists in _calculateHitDistribution
- Both methods include a function-level comment
</acceptance_criteria>

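The two metrics above can be checked by hand on a small input. This is a minimal JavaScript sketch under the same `{hit, rank}` shape; the function names `meanReciprocalRank` and `hitDistribution` and the sample data are illustrative assumptions, not the PHP implementation.

```javascript
// Sketch: MRR = sum(1 / rank of each hit) / number of valid tests.
function meanReciprocalRank(details) {
    var sum = 0;
    var valid = 0;
    for (var i = 0; i < details.length; i++) {
        var d = details[i];
        if (!d || typeof d.hit === 'undefined' || typeof d.rank === 'undefined') {
            continue; // skip malformed entries
        }
        valid++;
        if (d.hit && d.rank > 0) {
            sum += 1 / d.rank; // a miss adds nothing
        }
    }
    return valid > 0 ? sum / valid : 0;
}

// Sketch: count hits per rank position 1-5, keyed rank_1..rank_5.
function hitDistribution(details) {
    var dist = { rank_1: 0, rank_2: 0, rank_3: 0, rank_4: 0, rank_5: 0 };
    for (var i = 0; i < details.length; i++) {
        var d = details[i];
        if (d && d.hit && d.rank >= 1 && d.rank <= 5) {
            dist['rank_' + Math.floor(d.rank)]++;
        }
    }
    return dist;
}

var runs = [
    { hit: true, rank: 1 },
    { hit: true, rank: 2 },
    { hit: false, rank: 0 },
    { hit: true, rank: 4 }
];
console.log(meanReciprocalRank(runs)); // (1 + 0.5 + 0 + 0.25) / 4
console.log(hitDistribution(runs));
```

Unlike the hit distribution, MRR also rewards hits ranked below position 5, so the two numbers answer slightly different questions.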
### Task 3: Extend the _runBacktestV3 return value (with data-volume check)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (line 3549-3556, _runBacktestV3 return statement)
</read_first>

<action>
Modify the return statement of the `_runBacktestV3` method, adding the new metrics and a data-volume check to the existing return structure:

Find the following snippet (around line 3549-3556):
```php
return [
    'hit_rate' => $hitRate,
    'avg_rank' => $avgRank,
    'total_tests' => $testCount,
    'total_hits' => $hits,
    'details' => $details
];
```

Replace it with:
```php
// Compute the new metrics (with a data-volume check)
$minDataThreshold = 50; // minimum number of test periods for reliable metrics

// If there is not enough test data, return defaults and attach a warning
if ($testCount < $minDataThreshold) {
    $ndcg5 = 0;
    $mrr = 0;
    $hitDistribution = [
        'rank_1' => 0,
        'rank_2' => 0,
        'rank_3' => 0,
        'rank_4' => 0,
        'rank_5' => 0
    ];
    $dataWarning = 'Insufficient backtest data (' . $testCount . ' periods); at least 50 periods are recommended for reliable metrics';
} else {
    $ndcg5 = $this->_calculateNDCG($details, 5);
    $mrr = $this->_calculateMRR($details);
    $hitDistribution = $this->_calculateHitDistribution($details);
    $dataWarning = null;
}

$precision5 = $testCount > 0 ? round($hits / ($testCount * 5) * 100, 2) : 0;

return [
    'hit_rate' => $hitRate,
    'avg_rank' => $avgRank,
    'total_tests' => $testCount,
    'total_hits' => $hits,
    'details' => $details,
    // New ranking-quality metrics
    'ndcg_5' => $ndcg5,
    'mrr' => $mrr,
    'hit_distribution' => $hitDistribution,
    'precision_5' => $precision5,
    // Data-volume warning (set when data is insufficient)
    'data_warning' => $dataWarning,
    'data_sufficient' => $testCount >= $minDataThreshold
];
```

Notes:
- Compute the new metrics just before the return statement, so the $details array is fully built
- Add a minimum data-volume check (50 periods); return defaults and a warning when it is not met
- Add the data_warning and data_sufficient fields for the frontend to display
</action>

<acceptance_criteria>
- grep match: `ndcg_5` exists in the _runBacktestV3 return structure
- grep match: `mrr` exists in the _runBacktestV3 return structure
- grep match: `hit_distribution` exists in the _runBacktestV3 return structure
- grep match: `precision_5` exists in the _runBacktestV3 return structure
- grep match: `data_warning` exists in the _runBacktestV3 return structure
- grep match: the `minDataThreshold` variable exists in the method
</acceptance_criteria>

## Verification

Call the prediction endpoint to verify that the new metrics are returned:

```bash
curl -s "http://127.0.0.1:8000/admin/history/predictV3?periods=200&backtest=10" | grep -E "ndcg_5|mrr|hit_distribution|precision_5|data_warning"
```

Expected result: the returned JSON contains the ndcg_5, mrr, hit_distribution, precision_5, and data_warning fields.
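
Because `grep` on a single-line JSON body only shows that the names occur somewhere, a stricter check can help. The sketch below assumes the backtest object has already been parsed out of the response; how the controller wraps it is not specified here, and the helper name `missingBacktestFields` is hypothetical.

```javascript
// Sketch: list which of the new backtest fields are absent from a parsed object.
// The field names come from the extended _runBacktestV3 return structure.
var REQUIRED_BACKTEST_FIELDS = [
    'ndcg_5', 'mrr', 'hit_distribution',
    'precision_5', 'data_warning', 'data_sufficient'
];

function missingBacktestFields(backtest) {
    return REQUIRED_BACKTEST_FIELDS.filter(function (key) {
        // data_warning may legitimately be null, so test for the key, not the value
        return !backtest || !Object.prototype.hasOwnProperty.call(backtest, key);
    });
}

// Illustrative payload for a run below the 50-period threshold.
var shortRun = {
    hit_rate: 10, avg_rank: 3, total_tests: 10, total_hits: 1, details: [],
    ndcg_5: 0, mrr: 0,
    hit_distribution: { rank_1: 0, rank_2: 0, rank_3: 0, rank_4: 0, rank_5: 0 },
    precision_5: 2, data_warning: 'Insufficient backtest data', data_sufficient: false
};
console.log(missingBacktestFields(shortRun)); // expected to be empty
console.log(missingBacktestFields({ hit_rate: 10 }));
```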

## Success Criteria

1. The three methods `_calculateNDCG`, `_calculateMRR`, and `_calculateHitDistribution` are implemented
2. All calculation methods include an empty-prediction guard and a data-integrity check
3. The NDCG formula is documented in the docblock, including the DCG discount term 1/log2(rank+1)
4. The hit_distribution structure is explicitly `{rank_1..rank_5: counts}`
5. The `_runBacktestV3` return structure contains the ndcg_5, mrr, hit_distribution, precision_5, and data_warning fields
6. A data-volume check is in place and returns a warning below 50 periods
7. All new methods include function-level comments

## Output

When finished, create `.planning/phases/11-predictv3/11-01-SUMMARY.md`
@@ -0,0 +1,325 @@
---
phase: 11-predictv3
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
  - application/admin/model/History.php
autonomous: true
requirements:
  - PRED-01
must_haves:
  truths:
    - "Users can see a confidence percentage for each predicted number"
    - "Users can see the overall confidence of the Top5 predictions"
    - "Confidence is computed from three dimensions: historical hit rate, score concentration, and score distribution"
    - "The system provides a reasonable confidence estimate when data is insufficient"
  artifacts:
    - path: "application/admin/model/History.php"
      provides: "Confidence calculation methods"
      contains: "_calculateConfidence|_getHistoricalHitRateByRank|_getScoreDistributionConfidence|_getScoreConcentration"
  key_links:
    - from: "getPredictionV3"
      to: "_calculateConfidence"
      via: "method call before return"
---
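
# Phase 11 - Plan 02: Confidence Evaluation

## Objective

Add a confidence evaluation to the prediction results to help users judge how reliable a prediction is. Confidence is computed from three dimensions: historical rank hit rate, score distribution, and score concentration.

**Purpose:** Prediction results currently carry only a rank and a score, with no confidence indicator. Users cannot judge how trustworthy a prediction is, and a confidence evaluation gives them a concrete aid for decisions.

**Output:** New confidence calculation methods in `History.php`, and a `confidence` field added to the `getPredictionV3` return value.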

## Tasks

### Task 1: Implement the core confidence calculation (with explicit dimension definitions and data-volume check)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (line 2436-2444, getPredictionV3 return statement)
- D:\code\php\amlhc\application\admin\model\History.php (line 3495-3556, _runBacktestV3 method)
</read_first>

<action>
Add the confidence-related methods at the end of the `History.php` class:

```php
/**
 * Compute prediction confidence.
 *
 * Confidence is a weighted average of three dimensions:
 * - Dimension 1: historical rank hit rate (weight 0.4) - per-rank hit rate from backtest data
 * - Dimension 2: score distribution confidence (weight 0.3) - where the number's score sits within the Top5 score range
 * - Dimension 3: score concentration (weight 0.3) - gap between the score and the Top5 average; a larger gap means higher confidence
 *
 * Weighted formula:
 *   confidence = 0.4 * historical_hit_rate + 0.3 * score_distribution + 0.3 * score_concentration
 *
 * Thresholds:
 * - High confidence: >= 70% (shown in green)
 * - Medium confidence: 50-70% (shown in orange)
 * - Low confidence: < 50% (shown in red)
 *
 * @param array $predictions Prediction results (Top5)
 * @param array $backtest Backtest results
 * @param array $scoresAll Score details for all numbers (optional, intended for concentration; currently unused)
 * @param int $minDataThreshold Minimum data threshold, default 50 periods
 * @return array {confidence_scores: [], overall_confidence: float, data_warning: string|null}
 */
private function _calculateConfidence($predictions, $backtest, $scoresAll = null, $minDataThreshold = 50)
{
    // Data-volume check
    $dataWarning = null;
    $hasBacktest = $backtest && !empty($backtest['details']) && $backtest['total_tests'] > 0;

    if (!$hasBacktest || $backtest['total_tests'] < $minDataThreshold) {
        $dataWarning = 'Insufficient backtest data (' . ($backtest['total_tests'] ?? 0) . ' periods); confidence is estimated, at least 50 periods are recommended';
    }

    $confidenceScores = [];

    // Average Top5 score (used for the concentration dimension)
    $avgScore = 0;
    if (!empty($predictions)) {
        $totalScore = array_sum(array_column($predictions, 'score'));
        $avgScore = $totalScore / count($predictions);
    }

    foreach ($predictions as $idx => $pred) {
        $rank = $idx + 1;
        $num = $pred['num'];
        $score = $pred['score'];

        // Dimension 1: historical rank hit rate (weight 0.4)
        $rankHitRate = $this->_getHistoricalHitRateByRank($rank, $backtest);

        // Dimension 2: score distribution confidence (weight 0.3), a score ratio
        $scoreDistribution = $this->_getScoreDistributionConfidence($score, $predictions);

        // Dimension 3: score concentration (weight 0.3), gap between the score and the average
        $scoreConcentration = $this->_getScoreConcentration($score, $avgScore, $predictions);

        // Combined confidence for this number (weighted average)
        $itemConfidence = $rankHitRate * 0.4 + $scoreDistribution * 0.3 + $scoreConcentration * 0.3;

        $confidenceScores[] = [
            'num' => $num,
            'rank' => $rank,
            'confidence' => round($itemConfidence * 100, 1),
            'rank_hit_rate' => round($rankHitRate * 100, 1),
            'score_distribution' => round($scoreDistribution * 100, 1),
            'score_concentration' => round($scoreConcentration * 100, 1)
        ];
    }

    // Overall confidence (Top5 average)
    $overallConfidence = count($confidenceScores) > 0
        ? round(array_sum(array_column($confidenceScores, 'confidence')) / count($confidenceScores), 1)
        : 0;

    return [
        'confidence_scores' => $confidenceScores,
        'overall_confidence' => $overallConfidence,
        'data_warning' => $dataWarning
    ];
}

/**
 * Get the hit rate for a given historical rank.
 *
 * Calculation:
 * - With backtest data: historical hits at this rank / total number of tests
 * - Without backtest data: estimate from the rank; a higher rank means higher confidence
 *   Estimation formula: 1 - (rank - 1) * 0.15, i.e. rank 1 estimates 100% and rank 5 estimates 40%
 *
 * @param int $rank Rank position (1-5)
 * @param array $backtest Backtest results
 * @return float Historical hit rate for this rank (0-1)
 */
private function _getHistoricalHitRateByRank($rank, $backtest)
{
    if (!$backtest || empty($backtest['details']) || $backtest['total_tests'] == 0) {
        // No backtest data: estimate from the rank (higher rank, higher confidence)
        // Estimation formula: 1 - (rank - 1) * 0.15
        // Rank 1: 1.0, rank 2: 0.85, rank 3: 0.70, rank 4: 0.55, rank 5: 0.40
        return max(0, 1 - ($rank - 1) * 0.15);
    }

    // Count historical hits per rank
    $rankHits = array_fill(1, 5, 0);
    foreach ($backtest['details'] as $detail) {
        if ($detail['hit'] && $detail['rank'] >= 1 && $detail['rank'] <= 5) {
            $rankHits[$detail['rank']]++;
        }
    }

    $totalTests = $backtest['total_tests'];
    // Ranks outside 1-5 have no counter and fall back to 0
    return $totalTests > 0 ? ($rankHits[$rank] ?? 0) / $totalTests : 0;
}

/**
 * Compute the score distribution confidence.
 *
 * Calculation:
 * - score ratio = (score - bottomScore) / (topScore - bottomScore)
 * - The closer the score is to first place, the higher the confidence
 * - Returns 1 when all scores are identical
 *
 * @param float $score Score of the current number
 * @param array $predictions All prediction results
 * @return float Score confidence (0-1)
 */
private function _getScoreDistributionConfidence($score, $predictions)
{
    if (empty($predictions)) return 0;

    $topScore = $predictions[0]['score'];
    $bottomScore = end($predictions)['score'];

    if ($topScore == $bottomScore) return 1; // all scores identical

    // Score ratio: (score - bottom) / (top - bottom)
    $ratio = ($score - $bottomScore) / ($topScore - $bottomScore);
    return max(0, min(1, $ratio));
}

/**
 * Compute the score concentration.
 *
 * Calculation:
 * - concentration = 0.5 if all scores are identical (topScore == avgScore)
 * - concentration = (score - avgScore) / (topScore - avgScore) if score > avgScore
 * - concentration = 0 if score <= avgScore
 * - The larger the gap between the top score and the average, the clearer the separation between predictions
 *
 * @param float $score Score of the current number
 * @param float $avgScore Average Top5 score
 * @param array $predictions All prediction results
 * @return float Concentration confidence (0-1)
 */
private function _getScoreConcentration($score, $avgScore, $predictions)
{
    if (empty($predictions)) return 0;

    $topScore = $predictions[0]['score'];

    // If the top score equals the average, all scores are identical: use 0.5.
    // This check must come first, otherwise the score <= avgScore test below hides it.
    if ($topScore == $avgScore) {
        return $score == $topScore ? 0.5 : 0;
    }

    // Scores at or below the average get zero concentration
    if ($score <= $avgScore) {
        return 0;
    }

    // concentration = (score - avg) / (top - avg)
    $concentration = ($score - $avgScore) / ($topScore - $avgScore);
    return max(0, min(1, $concentration));
}
```

Implementation notes:
- **Dimension renamed**: "multi-dimension consistency" becomes "score concentration", which states the gap between the top score and the average score more clearly
- **Explicit weighting formula**: `confidence = 0.4*historical hit rate + 0.3*score distribution + 0.3*score concentration`
- **Data-volume check**: return a warning when the backtest covers fewer than 50 periods
- **Explicit thresholds**: >=70% high, 50-70% medium, <50% low
- **No-data fallback**: use the estimation formula when backtest data is missing
</action>

<acceptance_criteria>
- grep regex match: `_calculateConfidence\s*\(` exists in History.php
- grep regex match: `_getHistoricalHitRateByRank\s*\(` exists in History.php
- grep regex match: `_getScoreDistributionConfidence\s*\(` exists in History.php
- grep regex match: `_getScoreConcentration\s*\(` exists in History.php (replaces the former _getDimensionConsistency)
- grep match: `minDataThreshold` exists in the method (data-volume threshold)
- grep match: `score_concentration` exists in the return structure (replaces the former consistency)
- All methods include function-level comments, and the comments document the weighting formula
</acceptance_criteria>

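To make the weighting concrete, here is a minimal JavaScript sketch of the 0.4 / 0.3 / 0.3 combination together with the no-backtest fallback `1 - (rank - 1) * 0.15`. The function names and the example scores (90, 80, 70, 60, 50) are assumptions chosen for illustration; only the weights and the fallback formula come from the plan.

```javascript
// Sketch: fallback hit-rate estimate used when no backtest data exists.
function fallbackRankHitRate(rank) {
    return Math.max(0, 1 - (rank - 1) * 0.15);
}

// Sketch: combine the three dimensions (each in 0-1) into a percentage
// rounded to one decimal place.
function combineConfidence(rankHitRate, scoreDistribution, scoreConcentration) {
    var value = 0.4 * rankHitRate + 0.3 * scoreDistribution + 0.3 * scoreConcentration;
    return Math.round(value * 1000) / 10;
}

// Hypothetical Top5 scores 90, 80, 70, 60, 50 (average 70), no backtest data.
// Rank 1 (score 90): distribution (90-50)/(90-50) = 1, concentration (90-70)/(90-70) = 1.
var rank1 = combineConfidence(fallbackRankHitRate(1), 1, 1);
// Rank 3 (score 70): distribution (70-50)/(90-50) = 0.5, concentration 0 (not above average).
var rank3 = combineConfidence(fallbackRankHitRate(3), 0.5, 0);
// Rank 5 (score 50): distribution 0, concentration 0.
var rank5 = combineConfidence(fallbackRankHitRate(5), 0, 0);
console.log(rank1, rank3, rank5);
```

Under these assumed scores only the first-ranked number reaches the 70% "high" band, which shows how strongly the two score-based dimensions pull lower ranks down.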
### Task 2: Call the confidence calculation in getPredictionV3 (passing the data threshold)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (line 2413-2444, getPredictionV3 return section)
</read_first>

<action>
In the `getPredictionV3` method, find the following snippet (around line 2413-2444):

```php
// ====== 10. Historical backtest validation ======
$backtest = $skipBacktest ? null : $this->_runBacktestV3($periods, $weights, $backtestCount, $cutoffTime);

// Compute the hit information
$hitInfo = null;
...
return [
    'predictions' => $predictions,
    ...
];
```

After `$backtest` is computed and before `$hitInfo` is computed, insert the confidence calculation:

```php
// ====== 10. Historical backtest validation ======
$backtest = $skipBacktest ? null : $this->_runBacktestV3($periods, $weights, $backtestCount, $cutoffTime);

// ====== 11. Confidence evaluation (new) ======
// Minimum data threshold is 50 periods; below that, confidence is based on estimates
$minDataThreshold = 50;
$confidence = $this->_calculateConfidence($predictions, $backtest, null, $minDataThreshold);

// Compute the hit information
$hitInfo = null;
...
```

Then modify the return structure to add the `confidence` field:

```php
return [
    'predictions' => $predictions,
    'last_special' => $lastSpecial,
    'last_expect' => $lastExpect,
    'analysis' => $analysis,
    'actual_result' => $actualResult,
    'hit_info' => $hitInfo,
    'backtest' => $backtest,
    'confidence' => $confidence // new confidence field
];
```
</action>

<acceptance_criteria>
- grep match: `_calculateConfidence` is called in the getPredictionV3 method
- grep match: `$minDataThreshold` exists in getPredictionV3
- grep match: `'confidence'` exists in the getPredictionV3 return structure
- The confidence calculation runs after the backtest validation
</acceptance_criteria>

## Verification

Call the prediction endpoint to verify that the confidence fields are returned:

```bash
curl -s "http://127.0.0.1:8000/admin/history/predictV3?periods=200&backtest=10" | grep -E "confidence|overall_confidence|confidence_scores|score_concentration"
```

Expected result: the returned JSON contains the confidence, overall_confidence, confidence_scores, and score_concentration fields.
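
## Success Criteria

1. `_calculateConfidence` and its 3 helper methods are implemented
2. The confidence dimensions are named: historical rank hit rate, score distribution, score concentration
3. The weighting formula is stated in the comment: `confidence = 0.4*historical + 0.3*distribution + 0.3*concentration`
4. A data-volume check is in place and returns a warning below 50 periods
5. The `getPredictionV3` return structure contains the confidence field
6. All new methods include function-level comments

## Output

When finished, create `.planning/phases/11-predictv3/11-02-SUMMARY.md`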
@@ -0,0 +1,278 @@
---
phase: 11-predictv3
plan: 03
type: execute
wave: 2
depends_on:
  - 11-01
  - 11-02
files_modified:
  - public/assets/js/backend/history.js
autonomous: true
requirements:
  - PRED-01
  - PRED-02
must_haves:
  truths:
    - "Users can see a confidence percentage for each number in the prediction dialog"
    - "Users can see the NDCG@5 and MRR metrics in the backtest results area"
    - "Users can see a bar chart of the hit distribution per rank position"
    - "Users can see a warning when data is insufficient"
  artifacts:
    - path: "public/assets/js/backend/history.js"
      provides: "Frontend display of confidence and the new backtest metrics"
      contains: "renderPredict|confidence|ndcg_5|mrr|hit_distribution|data_warning"
  key_links:
    - from: "renderPredict"
      to: "backtest.ndcg_5, backtest.mrr, confidence, backtest.data_warning"
      via: "property access in rendering logic"
---
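
# Phase 11 - Plan 03: Frontend Display Improvements

## Objective

Update the frontend `renderPredict` method to display the new confidence indicators and the extended backtest metrics (NDCG, MRR, hit distribution, data warning).

**Purpose:** The backend now computes confidence and the new backtest metrics. The frontend needs to visualize this data so prediction results are easier to read and more useful for decisions.

**Output:** An extended `renderPredict` method in `history.js`, with a new confidence display area and additional backtest metrics.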

## Tasks

### Task 1: Add the confidence display area (with data warning)

<read_first>
- D:\code\php\amlhc\public\assets\js\backend\history.js (line 1700-1871, renderPredict method)
</read_first>

<action>
In the `renderPredict` method, find the following variable declarations (around line 1701):

```javascript
var predictions = data.predictions || [];
var analysis = data.analysis || {};
var hitInfo = data.hit_info || null;
var actualResult = data.actual_result || null;
var backtest = data.backtest || null;
```

Add the `confidence` variable after the `backtest` declaration:

```javascript
var confidence = data.confidence || null;
```

Then insert the confidence display area before the backtest results area (around line 1732-1751).

Before line 1732 (that is, before the `// 回测验证结果` backtest-results comment), insert:

```javascript
// Confidence assessment display (V2 and V3)
if (confidence && (version === 'v2' || version === 'v3')) {
    html += '<div style="background:#fff8e1;border:1px solid #ffb300;border-radius:6px;padding:12px;margin-bottom:15px;">';
    html += '<div style="font-size:13px;font-weight:bold;color:#ff8f00;margin-bottom:8px;"><i class="fa fa-star-half-o"></i> Prediction Confidence Assessment</div>';

    // Data warning (shown when data is insufficient)
    if (confidence.data_warning) {
        html += '<div style="font-size:11px;color:#d32f2f;background:#ffebee;padding:6px;border-radius:4px;margin-bottom:8px;"><i class="fa fa-exclamation-triangle"></i> ' + confidence.data_warning + '</div>';
    }

    html += '<div style="display:flex;gap:20px;align-items:center;">';
    html += '<div style="text-align:center;"><div style="font-size:24px;font-weight:bold;color:#ff8f00;">' + confidence.overall_confidence + '%</div><div style="font-size:12px;color:#666;">Overall confidence</div></div>';

    // Per-rank confidence (combined value, which includes the score concentration dimension)
    if (confidence.confidence_scores && confidence.confidence_scores.length > 0) {
        html += '<div style="display:flex;gap:8px;">';
        for (var i = 0; i < confidence.confidence_scores.length; i++) {
            var cs = confidence.confidence_scores[i];
            // Thresholds: >=70% high (green), 50-70% medium (orange), <50% low (red)
            var confLevel = cs.confidence >= 70 ? 'High' : (cs.confidence >= 50 ? 'Medium' : 'Low');
            var confColor = cs.confidence >= 70 ? '#4caf50' : (cs.confidence >= 50 ? '#ff9800' : '#f44336');
            html += '<div style="text-align:center;padding:5px;background:#fff;border-radius:4px;">';
            html += '<div style="font-size:14px;font-weight:bold;color:' + confColor + ';">' + cs.confidence + '%</div>';
            html += '<div style="font-size:10px;color:#999;">#' + cs.rank + '</div>';
            html += '</div>';
        }
        html += '</div>';
    }
    html += '</div></div>';
}
```

Implementation notes:
- Overall confidence is shown as a large number; per-rank confidence is shown as small cards in a row
- Confidence has three levels: high (>=70%, green), medium (50-70%, orange), low (<50%, red)
- Shown only for V2 and V3
- New data_warning display: a red warning appears when data is insufficient
</action>

<acceptance_criteria>
- grep match: `confidence.overall_confidence` exists in the renderPredict method of history.js
- grep match: `confidence.confidence_scores` exists in the renderPredict method of history.js
- grep match: `confidence.data_warning` exists in the renderPredict method of history.js
- grep match: the `confLevel` and `confColor` variables exist in history.js
- The confidence display area appears before the backtest results
</acceptance_criteria>

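The level and color thresholds are evaluated in two places (the summary cards here and the per-number cards later), so one option is a small shared helper. This is a sketch only: `confidenceBadge` is a hypothetical name, and the plan itself keeps the inline ternaries.

```javascript
// Sketch: map a confidence percentage to its level label and display color.
// Thresholds follow the plan: >=70 high (green), 50-70 medium (orange), <50 low (red).
function confidenceBadge(percent) {
    if (percent >= 70) {
        return { level: 'High', color: '#4caf50' };
    }
    if (percent >= 50) {
        return { level: 'Medium', color: '#ff9800' };
    }
    return { level: 'Low', color: '#f44336' };
}

console.log(confidenceBadge(82.5)); // high band
console.log(confidenceBadge(50));   // lower edge of the medium band
console.log(confidenceBadge(43));   // low band
```

Note that exactly 70 counts as high and exactly 50 as medium, matching the `>=` comparisons in the rendering code.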
### Task 2: Extend the backtest metrics area (with data warning and hit-distribution bar chart)

<read_first>
- D:\code\php\amlhc\public\assets\js\backend\history.js (line 1732-1751, backtest results display area)
</read_first>

<action>
In the `renderPredict` method, find the backtest results display area (around line 1732-1751). The existing code shows three metrics: hit rate, hit count, and average rank.

Find the following snippet:

```javascript
html += '<div style="display:flex;gap:20px;">';
html += '<div style="text-align:center;"><div style="font-size:24px;font-weight:bold;color:#2196f3;">' + backtest.hit_rate + '%</div><div style="font-size:12px;color:#666;">Hit rate (Top5)</div></div>';
html += '<div style="text-align:center;"><div style="font-size:24px;font-weight:bold;color:#4caf50;">' + backtest.total_hits + '/' + backtest.total_tests + '</div><div style="font-size:12px;color:#666;">Hits</div></div>';
html += '<div style="text-align:center;"><div style="font-size:24px;font-weight:bold;color:#ff9800;">' + (backtest.avg_rank || '—') + '</div><div style="font-size:12px;color:#666;">Average rank</div></div>';
html += '</div>';
```

Replace it with:

```javascript
// Backtest data warning
if (backtest.data_warning) {
    html += '<div style="font-size:11px;color:#d32f2f;background:#ffebee;padding:6px;border-radius:4px;margin-bottom:8px;"><i class="fa fa-exclamation-triangle"></i> ' + backtest.data_warning + '</div>';
}

html += '<div style="display:flex;gap:15px;flex-wrap:wrap;">';
html += '<div style="text-align:center;padding:8px;"><div style="font-size:22px;font-weight:bold;color:#2196f3;">' + backtest.hit_rate + '%</div><div style="font-size:11px;color:#666;">Hit rate (Top5)</div></div>';
html += '<div style="text-align:center;padding:8px;"><div style="font-size:22px;font-weight:bold;color:#4caf50;">' + backtest.total_hits + '/' + backtest.total_tests + '</div><div style="font-size:11px;color:#666;">Hits</div></div>';
html += '<div style="text-align:center;padding:8px;"><div style="font-size:22px;font-weight:bold;color:#ff9800;">' + (backtest.avg_rank || '—') + '</div><div style="font-size:11px;color:#666;">Average rank</div></div>';

// New metrics: NDCG@5 and MRR (shown as percentages)
if (backtest.ndcg_5 !== undefined) {
    html += '<div style="text-align:center;padding:8px;"><div style="font-size:22px;font-weight:bold;color:#9c27b0;">' + (backtest.ndcg_5 * 100).toFixed(1) + '%</div><div style="font-size:11px;color:#666;">NDCG@5</div></div>';
}
if (backtest.mrr !== undefined) {
    // The outer wrapper div is required so the two closing tags stay balanced
    html += '<div style="text-align:center;padding:8px;"><div style="font-size:22px;font-weight:bold;color:#00bcd4;">' + (backtest.mrr * 100).toFixed(1) + '%</div><div style="font-size:11px;color:#666;">MRR</div></div>';
}

// Transition-probability order (from the transition_order field of 11-05)
if (analysis && analysis.transition_order !== undefined) {
    html += '<div style="text-align:center;padding:8px;"><div style="font-size:22px;font-weight:bold;color:#607d8b;">Order ' + analysis.transition_order + '</div><div style="font-size:11px;color:#666;">Transition probability</div></div>';
}
html += '</div>';

// Hit-distribution bar chart (uses the rank_1..rank_5 keys)
if (backtest.hit_distribution && Object.keys(backtest.hit_distribution).length > 0) {
    var distribution = backtest.hit_distribution;
    var maxHit = 0;
    // Find the maximum, used to scale the bar heights
    for (var r = 1; r <= 5; r++) {
        var key = 'rank_' + r;
        if (distribution[key] > maxHit) {
            maxHit = distribution[key];
        }
    }

    html += '<div style="margin-top:10px;font-size:11px;color:#666;">Hit distribution (hits per rank):</div>';
    html += '<div style="display:flex;gap:8px;align-items:flex-end;height:60px;margin-top:5px;padding:5px;background:#f5f5f5;border-radius:4px;">';
    for (var r = 1; r <= 5; r++) {
        var key = 'rank_' + r;
        var hitCount = distribution[key] || 0;
        var barHeight = maxHit > 0 ? (hitCount / maxHit * 45) : 0;
        var barColor = hitCount > 0 ? '#4caf50' : '#e0e0e0';
        html += '<div style="text-align:center;min-width:50px;">';
        html += '<div style="height:' + barHeight + 'px;background:' + barColor + ';border-radius:2px 2px 0 0;width:35px;margin:0 auto;"></div>';
        html += '<div style="font-size:10px;color:#666;margin-top:2px;">#' + r + '</div>';
        html += '<div style="font-size:11px;color:#333;font-weight:bold;">' + hitCount + '</div>';
        html += '</div>';
    }
    html += '</div>';
}
```

Implementation notes:
- NDCG@5 and MRR are shown as percentages (multiplied by 100, one decimal place)
- The hit distribution is a simple bar chart with ranks 1-5 laid out horizontally
- Bar height is proportional to the maximum hit count
- The distribution is read using the rank_1..rank_5 key format
- New data_warning display: a warning appears when backtest data is insufficient
- The transition-probability order is read from analysis.transition_order (not from backtest)
</action>

<acceptance_criteria>
- grep match: `backtest.ndcg_5` exists in the renderPredict method of history.js
- grep match: `backtest.mrr` exists in the renderPredict method of history.js
- grep match: `backtest.hit_distribution` exists in the renderPredict method of history.js
- grep match: `backtest.data_warning` exists in the renderPredict method of history.js
- grep match: `rank_1|rank_2|rank_3|rank_4|rank_5` exists in the hit-distribution parsing
- grep match: `analysis.transition_order` exists in the renderPredict method of history.js
- The hit-distribution bar chart is built from div elements
</acceptance_criteria>

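The bar scaling can be isolated from the HTML string building and checked on its own. The sketch below assumes the `rank_1..rank_5` structure and the 45px maximum used in the rendering code; `barHeights` is a hypothetical helper name.

```javascript
// Sketch: scale hit counts for ranks 1-5 to pixel heights, tallest bar = maxPx.
function barHeights(distribution, maxPx) {
    var counts = [];
    var maxHit = 0;
    for (var r = 1; r <= 5; r++) {
        var count = Number(distribution['rank_' + r]) || 0; // missing keys count as 0
        counts.push(count);
        if (count > maxHit) {
            maxHit = count;
        }
    }
    return counts.map(function (count) {
        // With no hits at all every bar stays at height 0
        return maxHit > 0 ? (count / maxHit) * maxPx : 0;
    });
}

var heights = barHeights({ rank_1: 3, rank_2: 6, rank_3: 0, rank_4: 0, rank_5: 1 }, 45);
console.log(heights); // rank_2 gets the full 45px, rank_1 half of that
```

Scaling against the maximum rather than the total keeps the tallest bar at a fixed height, so the chart stays readable for both short and long backtests.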
### Task 3: Show confidence on the predicted-number cards (with score concentration included)

<read_first>
- D:\code\php\amlhc\public\assets\js\backend\history.js (line 1828-1866, predicted-number list rendering)
</read_first>
|
||||
|
||||
<action>
|
||||
在预测号码列表渲染区域(约 line 1828-1866),找到号码卡片渲染代码:
|
||||
|
||||
在现有得分显示后添加置信度显示。找到以下代码段:
|
||||
|
||||
```javascript
|
||||
html += '<div style="font-size:11px;color:#2e7d32;font-weight:bold;">得分:' + p.score + '</div>';
|
||||
```
|
||||
|
||||
替换为:
|
||||
|
||||
```javascript
|
||||
html += '<div style="font-size:11px;color:#2e7d32;font-weight:bold;">得分:' + p.score + '</div>';
|
||||
|
||||
// 显示置信度(V3版本)
|
||||
if (version === 'v3' && confidence && confidence.confidence_scores) {
|
||||
var csForNum = confidence.confidence_scores.find(function(c) { return c.num === p.num; });
|
||||
if (csForNum) {
|
||||
// 阈值定义:>=70%高(绿)、50-70%中(橙)、<50%低(红)
|
||||
var confLevel = csForNum.confidence >= 70 ? '高' : (csForNum.confidence >= 50 ? '中' : '低');
|
||||
var confColor = csForNum.confidence >= 70 ? '#4caf50' : (csForNum.confidence >= 50 ? '#ff9800' : '#f44336');
|
||||
html += '<div style="font-size:10px;"><span style="color:' + confColor + ';font-weight:bold;">置信度:' + confLevel + '</span> <span style="color:#666;">(' + csForNum.confidence + '%)</span></div>';
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
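Implementation notes:

- Each number card shows the confidence level (高/中/低) and the exact percentage
- The color thresholds are the same as those used for the overall confidence display
- Shown only for the V3 version
- The confidence field is based on the score_concentration (score concentration) dimension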
|
||||
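The snippet above evaluates the same thresholds twice, once for the label and once for the color. A minimal sketch of how the mapping could be factored into one place, assuming two hypothetical helper names (`confidenceBadge`, `confidenceHtml`) that are not part of history.js:

```javascript
// Hypothetical helper (not part of history.js): maps a confidence percentage
// to the level label and color used on the number card.
// Thresholds follow the plan: >=70 high (green), 50-70 medium (orange), <50 low (red).
function confidenceBadge(confidence) {
    if (confidence >= 70) {
        return { level: '高', color: '#4caf50' };
    }
    if (confidence >= 50) {
        return { level: '中', color: '#ff9800' };
    }
    return { level: '低', color: '#f44336' };
}

// Builds the same markup as the plan's snippet for one confidence value.
function confidenceHtml(confidence) {
    var badge = confidenceBadge(confidence);
    return '<div style="font-size:10px;"><span style="color:' + badge.color +
        ';font-weight:bold;">置信度:' + badge.level + '</span> <span style="color:#666;">(' +
        confidence + '%)</span></div>';
}
```

Keeping the thresholds in a single function would also let the overall confidence area reuse them, so the two displays cannot drift apart.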
</action>
|
||||
|
||||
<acceptance_criteria>
|
||||
- grep match: the `csForNum` variable exists in history.js
- grep match: `置信度:` exists in the number-card rendering code
- The confidence is displayed below the score
- grep match: `csForNum.confidence` exists in the number-card rendering
|
||||
</acceptance_criteria>
|
||||
|
||||
## Verification
|
||||
|
||||
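1. Open the history page and click the "智能预测" (Smart Prediction) button
2. Run a prediction with the V3 version and check that:
   - the confidence assessment area is displayed
   - a warning is shown when data is insufficient
   - the backtest results show NDCG@5, MRR, and the hit-distribution bar chart
   - each predicted-number card shows the confidence level and percentage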
|
||||
|
||||
## Success Criteria
|
||||
|
||||
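1. The `renderPredict` method is updated with a new confidence display area
2. The backtest results area is extended with NDCG@5, MRR, the hit distribution, and the data warning
3. Predicted-number cards show the confidence level and percentage
4. The hit distribution is parsed using the rank_1..rank_5 key format
5. All new display styles are consistent with the existing UI
6. The data warning is highlighted with a red background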
|
||||
|
||||
## Output
|
||||
|
||||
When finished, create `.planning/phases/11-predictv3/11-03-SUMMARY.md`
|
||||
@@ -0,0 +1,346 @@
|
||||
---
phase: 11-predictv3
plan: 04
type: execute
wave: 1
depends_on: []
files_modified:
  - application/admin/model/History.php
  - application/admin/controller/History.php
autonomous: true
requirements:
  - PRED-04
must_haves:
  truths:
    - "Users can fetch the optimal weight configuration through the endpoint"
    - "The system returns weight-optimization results based on historical backtests"
    - "The results include the hit rate and NDCG of each weight configuration"
    - "The grid search has a timeout guard"
  artifacts:
    - path: "application/admin/model/History.php"
      provides: "Weight grid-search optimization method"
      contains: "_optimizeWeightsGridSearch"
    - path: "application/admin/controller/History.php"
      provides: "Weight-optimization endpoint entry point"
      contains: "optimizeWeights"
  key_links:
    - from: "optimizeWeights controller"
      to: "_optimizeWeightsGridSearch model"
      via: "method call"
---
|
||||
|
||||
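# Phase 11 - Plan 04: Weight Grid Search Optimization

## Objective

Implement weight grid-search optimization: backtest a predefined set of weight combinations in batch and pick the best-performing configuration, with the aim of improving prediction accuracy.

**Purpose:** The weights are currently configured by hand, with no data-driven tuning. Grid search is a well-established parameter-optimization method that can find a better weight combination from historical backtest data.

**Output:** The model `History.php` gains the `_optimizeWeightsGridSearch` method, and the controller `History.php` gains the `optimizeWeights` endpoint.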
|
||||
|
||||
## Tasks
|
||||
|
||||
### Task 1: Implement the weight grid-search method (5 concrete configurations and a timeout guard)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (line 2094-2113, default weight configuration)
- D:\code\php\amlhc\application\admin\model\History.php (line 3495-3556, _runBacktestV3 method)
</read_first>

<action>
Add the weight grid-search optimization method at the end of the `History.php` class:
|
||||
|
||||
```php
|
||||
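    /**
     * Weight grid-search optimization
     *
     * Optimization objective:
     * - combined score = hit_rate * 0.6 + ndcg_5 * 100 * 0.4
     * - hit rate weighted 60%, NDCG weighted 40%
     * - returns the weight configuration with the highest combined score
     *
     * 5 predefined weight configurations:
     * - Config 1: omission-first - omit_regression has the highest weight (0.25)
     * - Config 2: transition-probability-first - transition_prob has the highest weight (0.25)
     * - Config 3: trend-direction-first - trend_direction has the highest weight (0.25)
     * - Config 4: balanced - weights spread fairly evenly across dimensions
     * - Config 5: combination-feature-first - combination has the highest weight (0.20)
     *
     * Declared public because the optimizeWeights controller action calls it
     * through $this->model; a private method would not be callable from there.
     *
     * @param int $periods number of periods used for statistics, range 50-500
     * @param int $backtestCount number of backtest periods, range 10-100
     * @param int $timeoutSeconds timeout limit in seconds, default 60
     * @return array {best_weights: [], best_hit_rate: float, best_ndcg: float, best_combined_score: float, all_results: [], periods: int, backtest_count: int, timeout_seconds: int, timed_out: bool, elapsed_time: float}
     */
    public function _optimizeWeightsGridSearch($periods = 200, $backtestCount = 50, $timeoutSeconds = 60)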
|
||||
{
|
||||
// 超时保护:记录开始时间
|
||||
$startTime = microtime(true);
|
||||
$timedOut = false;
|
||||
|
||||
// 5种预定义权重配置(具体权重值明确)
|
||||
$weightConfigs = [
|
||||
// 配置1: 遗漏优先型 - 遗漏回归权重最高
|
||||
[
|
||||
'omit_regression' => 0.25, // 遗漏回归权重25%
|
||||
'freq_regression' => 0.12, // 频率回归权重12%
|
||||
'transition_prob' => 0.15, // 转移概率权重15%
|
||||
'trend_direction' => 0.12, // 走势方向权重12%
|
||||
'oddeven_balance' => 0.08, // 单双平衡权重8%
|
||||
'bigsmall_balance' => 0.08, // 大小平衡权重8%
|
||||
'zone_balance' => 0.05, // 区域平衡权重5%
|
||||
'color_balance' => 0.05, // 波色平衡权重5%
|
||||
'combination' => 0.10 // 组合特征权重10%
|
||||
],
|
||||
// 配置2: 转移概率优先型 - 转移概率权重最高
|
||||
[
|
||||
'omit_regression' => 0.15,
|
||||
'freq_regression' => 0.10,
|
||||
'transition_prob' => 0.25, // 转移概率权重25%(最高)
|
||||
'trend_direction' => 0.12,
|
||||
'oddeven_balance' => 0.08,
|
||||
'bigsmall_balance' => 0.08,
|
||||
'zone_balance' => 0.04,
|
||||
'color_balance' => 0.04,
|
||||
'combination' => 0.14
|
||||
],
|
||||
// 配置3: 走势方向优先型 - 走势方向权重最高
|
||||
[
|
||||
'omit_regression' => 0.12,
|
||||
'freq_regression' => 0.10,
|
||||
'transition_prob' => 0.15,
|
||||
'trend_direction' => 0.25, // 走势方向权重25%(最高)
|
||||
'oddeven_balance' => 0.08,
|
||||
'bigsmall_balance' => 0.08,
|
||||
'zone_balance' => 0.04,
|
||||
'color_balance' => 0.04,
|
||||
'combination' => 0.12
|
||||
],
|
||||
// 配置4: 平衡型(默认配置)- 各维度权重较均衡
|
||||
[
|
||||
'omit_regression' => 0.18,
|
||||
'freq_regression' => 0.12,
|
||||
'transition_prob' => 0.18,
|
||||
'trend_direction' => 0.14,
|
||||
'oddeven_balance' => 0.08,
|
||||
'bigsmall_balance' => 0.08,
|
||||
'zone_balance' => 0.04,
|
||||
'color_balance' => 0.04,
|
||||
'combination' => 0.10
|
||||
],
|
||||
// 配置5: 组合特征优先型 - 组合特征权重最高
|
||||
[
|
||||
'omit_regression' => 0.15,
|
||||
'freq_regression' => 0.10,
|
||||
'transition_prob' => 0.15,
|
||||
'trend_direction' => 0.12,
|
||||
'oddeven_balance' => 0.06,
|
||||
'bigsmall_balance' => 0.06,
|
||||
'zone_balance' => 0.03,
|
||||
'color_balance' => 0.03,
|
||||
'combination' => 0.20 // 组合特征权重20%(最高)
|
||||
]
|
||||
];
|
||||
|
||||
$bestWeights = [];
|
||||
$bestHitRate = 0;
|
||||
$bestNdcg = 0;
|
||||
$bestCombinedScore = 0;
|
||||
$allResults = [];
|
||||
|
||||
// 执行每种配置的回测(添加超时检查)
|
||||
foreach ($weightConfigs as $configIdx => $weights) {
|
||||
// 超时检查:超过限制时间则停止
|
||||
$elapsedTime = microtime(true) - $startTime;
|
||||
if ($elapsedTime > $timeoutSeconds) {
|
||||
$timedOut = true;
|
||||
break;
|
||||
}
|
||||
|
||||
// 执行回测
|
||||
$backtest = $this->_runBacktestV3($periods, $weights, $backtestCount);
|
||||
|
||||
$hitRate = $backtest['hit_rate'] ?? 0;
|
||||
$ndcg = $backtest['ndcg_5'] ?? 0;
|
||||
$avgRank = $backtest['avg_rank'] ?? 0;
|
||||
$mrr = $backtest['mrr'] ?? 0;
|
||||
|
||||
// 综合评估得分:命中率60% + NDCG40%
|
||||
$combinedScore = $hitRate * 0.6 + $ndcg * 100 * 0.4;
|
||||
|
||||
$result = [
|
||||
'config_name' => $configIdx + 1,
|
||||
'config_type' => ['遗漏优先型', '转移概率优先型', '走势方向优先型', '平衡型', '组合特征优先型'][$configIdx],
|
||||
'weights' => $weights,
|
||||
'hit_rate' => $hitRate,
|
||||
'avg_rank' => $avgRank,
|
||||
'ndcg_5' => $ndcg,
|
||||
'mrr' => $mrr,
|
||||
'combined_score' => round($combinedScore, 2),
|
||||
'total_hits' => $backtest['total_hits'] ?? 0
|
||||
];
|
||||
|
||||
$allResults[] = $result;
|
||||
|
||||
// 更新最优配置
|
||||
            if (empty($bestWeights) || $combinedScore > $bestCombinedScore) {
|
||||
$bestCombinedScore = $combinedScore;
|
||||
$bestHitRate = $hitRate;
|
||||
$bestNdcg = $ndcg;
|
||||
$bestWeights = $weights;
|
||||
}
|
||||
}
|
||||
|
||||
// 按综合得分降序排序结果
|
||||
usort($allResults, function($a, $b) {
|
||||
            return $b['combined_score'] <=> $a['combined_score'];
|
||||
});
|
||||
|
||||
return [
|
||||
'best_weights' => $bestWeights,
|
||||
'best_hit_rate' => $bestHitRate,
|
||||
'best_ndcg' => $bestNdcg,
|
||||
'best_combined_score' => round($bestCombinedScore, 2),
|
||||
'all_results' => $allResults,
|
||||
'periods' => $periods,
|
||||
'backtest_count' => $backtestCount,
|
||||
'timeout_seconds' => $timeoutSeconds,
|
||||
'timed_out' => $timedOut,
|
||||
'elapsed_time' => round(microtime(true) - $startTime, 2)
|
||||
];
|
||||
}
|
||||
```
|
||||
|
||||
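Implementation notes:

- **5 concrete weight configurations**: the weight values of every configuration are listed explicitly in the code
- **Explicit optimization objective**: combined score = hit_rate*0.6 + ndcg_5*100*0.4
- **Timeout guard**: the $timeoutSeconds parameter (default 60 seconds) stops testing the remaining configurations once the limit is exceeded; the check runs before each configuration, so a backtest already in progress is not interrupted
- **Configuration type names**: 遗漏优先型 (omission-first), 转移概率优先型 (transition-probability-first), 走势方向优先型 (trend-direction-first), 平衡型 (balanced), 组合特征优先型 (combination-feature-first)
- The timed_out flag and elapsed_time are returned so the frontend can act on them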
|
||||
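One detail worth checking before comparing configurations: as listed, configurations 1 and 2 sum to 1.00, while configurations 3, 4, and 5 sum to 0.98, 0.96, and 0.90. Whether `_runBacktestV3` normalizes weights internally is not shown in this plan, so the sketch below is only an illustration, written in JavaScript (the frontend language used elsewhere in this plan) so it can run standalone. It shows a normalization step together with the combined-score ranking. All function names and the backtest numbers are hypothetical.

```javascript
// Illustration only: mirrors the plan's scoring in JavaScript.
// Function names are hypothetical and not part of the codebase.

// Scales a weight map so that its values sum to 1.
function normalizeWeights(weights) {
    var total = 0;
    Object.keys(weights).forEach(function (k) { total += weights[k]; });
    var out = {};
    Object.keys(weights).forEach(function (k) { out[k] = weights[k] / total; });
    return out;
}

// combined score = hit_rate * 0.6 + ndcg_5 * 100 * 0.4
// hitRate is a percentage (0-100); ndcg5 is a fraction (0-1).
function combinedScore(hitRate, ndcg5) {
    return hitRate * 0.6 + ndcg5 * 100 * 0.4;
}

// Returns a new array sorted by combined score, highest first.
function rankResults(results) {
    return results
        .map(function (r) {
            return { name: r.name, score: combinedScore(r.hitRate, r.ndcg5) };
        })
        .sort(function (a, b) { return b.score - a.score; });
}

// Configuration 5 from the plan; its weights sum to 0.90 before normalization.
var config5 = {
    omit_regression: 0.15, freq_regression: 0.10, transition_prob: 0.15,
    trend_direction: 0.12, oddeven_balance: 0.06, bigsmall_balance: 0.06,
    zone_balance: 0.03, color_balance: 0.03, combination: 0.20
};
var normalized = normalizeWeights(config5);

// Made-up backtest numbers, used purely to exercise the ranking.
var ranked = rankResults([
    { name: 'A', hitRate: 10, ndcg5: 0.05 },
    { name: 'B', hitRate: 12, ndcg5: 0.04 }
]);
```

With these made-up inputs, B ranks first because its higher hit rate outweighs A's slightly better NDCG under the 60/40 split.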
</action>
|
||||
|
||||
<acceptance_criteria>
|
||||
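- grep regex match: `_optimizeWeightsGridSearch\s*\(` exists in History.php
- grep match: the `$weightConfigs` array contains 5 configurations
- grep match: the `$timeoutSeconds` parameter is in the method signature
- grep match: the `$timedOut` variable exists in the method
- grep match: `combined_score` exists in the returned result
- grep match: `config_type` exists in the result
- The method calls `_runBacktestV3` to run the backtest
- The method has a function-level comment describing the optimization objective and configuration types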
|
||||
</acceptance_criteria>
|
||||
|
||||
### Task 2: Add the weight-optimization endpoint (parameter validation and timeout warning)

<read_first>
- D:\code\php\amlhc\application\admin\controller\History.php (line 25, noNeedRight array)
- D:\code\php\amlhc\application\admin\controller\History.php (line 460-489, predictV3 method)
</read_first>

<action>
In the `History.php` controller:

1. Add `optimizeWeights` to the `noNeedRight` array (around line 25):

Find:
|
||||
```php
|
||||
protected $noNeedRight = ['missingNum', 'trendData', ..., 'predictV3'];
|
||||
```
|
||||
|
||||
Append `'optimizeWeights'` after `predictV3`:
|
||||
|
||||
```php
|
||||
protected $noNeedRight = ['missingNum', 'trendData', ..., 'predictV3', 'optimizeWeights'];
|
||||
```
|
||||
|
||||
2. Add the `optimizeWeights` method after the `predictV3` method (after roughly line 489):
|
||||
|
||||
```php
|
||||
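    /**
     * Weight grid-search optimization endpoint
     * Backtests multiple weight configurations and returns the best combination
     *
     * Parameters:
     * - periods: number of periods used for statistics, range 50-500, default 200
     * - backtest: number of backtest periods, range 10-100, default 30
     * - timeout: timeout in seconds, range 10-120, default 60
     *
     * Response fields:
     * - best_weights: the best weight configuration
     * - best_hit_rate: hit rate of the best configuration
     * - all_results: results of every tested configuration
     * - timed_out: whether the run was cut short by the timeout
     */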
|
||||
public function optimizeWeights()
|
||||
{
|
||||
if ($this->request->isAjax()) {
|
||||
// 参数验证
|
||||
$periods = $this->request->get('periods', 200, 'intval');
|
||||
if ($periods < 50 || $periods > 500) {
|
||||
$this->error('期数范围必须在 50-500 之间');
|
||||
}
|
||||
|
||||
$backtestCount = $this->request->get('backtest', 30, 'intval');
|
||||
if ($backtestCount < 10 || $backtestCount > 100) {
|
||||
$backtestCount = 30; // 使用默认值而非报错
|
||||
}
|
||||
|
||||
$timeoutSeconds = $this->request->get('timeout', 60, 'intval');
|
||||
if ($timeoutSeconds < 10 || $timeoutSeconds > 120) {
|
||||
$timeoutSeconds = 60; // 使用默认值
|
||||
}
|
||||
|
||||
// 执行优化
|
||||
$result = $this->model->_optimizeWeightsGridSearch($periods, $backtestCount, $timeoutSeconds);
|
||||
|
||||
// 超时警告提示
|
||||
$message = '优化完成';
|
||||
if ($result['timed_out']) {
|
||||
$message = '优化超时中断,已完成' . count($result['all_results']) . '种配置测试';
|
||||
}
|
||||
|
||||
$this->success($message, null, $result);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
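Implementation notes:

- Endpoint parameters: periods (statistics periods), backtest (backtest periods), timeout (timeout in seconds)
- Range validation: an out-of-range periods value returns an error, while out-of-range backtest and timeout values fall back to their defaults
- Returns the best weight configuration together with all test results
- On timeout, returns a warning message and the number of configurations completed
- Must be added to noNeedRight so it can be called without a permission check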
|
||||
</action>
|
||||
|
||||
<acceptance_criteria>
|
||||
- grep match: `optimizeWeights` exists in the noNeedRight array
- grep regex match: `public function optimizeWeights\s*\(` exists in the controller
- grep match: `$timeoutSeconds` exists in the method
- grep match: `$result['timed_out']` exists in the method
- The method calls `$this->model->_optimizeWeightsGridSearch`
- The method has a function-level comment
|
||||
</acceptance_criteria>
|
||||
|
||||
## Verification
|
||||
|
||||
Call the weight-optimization endpoint and check the response:
|
||||
|
||||
```bash
|
||||
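# The action only responds when isAjax() is true, so send the AJAX request header
curl -s -H "X-Requested-With: XMLHttpRequest" "http://127.0.0.1:8000/admin/history/optimizeWeights?periods=200&backtest=20&timeout=60" | grep -E "best_weights|best_hit_rate|best_ndcg|all_results|timed_out|config_type"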
|
||||
```
|
||||
|
||||
Expected result: the returned JSON contains the best_weights, best_hit_rate, best_ndcg, all_results, timed_out, and config_type fields.
|
||||
|
||||
## Success Criteria
|
||||
|
||||
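1. `_optimizeWeightsGridSearch` is implemented with 5 predefined weight configurations (explicit weight values)
2. The optimization objective is explicit: combined score = hit_rate*0.6 + ndcg_5*100*0.4
3. A timeout guard is in place, default 60 seconds
4. The `optimizeWeights` controller endpoint is implemented with parameter validation
5. The endpoint returns the best weight configuration and the test results
6. A warning message is returned on timeout
7. All new methods have function-level comments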
|
||||
|
||||
## Output
|
||||
|
||||
When finished, create `.planning/phases/11-predictv3/11-04-SUMMARY.md`
|
||||
@@ -0,0 +1,455 @@
|
||||
---
phase: 11-predictv3
plan: 05
type: execute
wave: 1
depends_on: []
files_modified:
  - application/admin/model/History.php
autonomous: true
requirements:
  - PRED-03
must_haves:
  truths:
    - "The transition probability takes the states of the two preceding periods into account jointly"
    - "The system uses second-order Markov when data is sufficient and falls back to first order otherwise"
    - "The prediction result shows which transition-probability order was used"
    - "Second-order Markov checks state-pair observation counts and falls back to first order when they are insufficient"
  artifacts:
    - path: "application/admin/model/History.php"
      provides: "Second-order Markov transition-matrix builder"
      contains: "_getTransitionMatrix2ndOrder|_calcTransitionScore2ndOrder"
  key_links:
    - from: "getPredictionV3"
      to: "_getTransitionMatrix2ndOrder"
      via: "conditional call based on data availability and state pair count"
---
|
||||
|
||||
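# Phase 11 - Plan 05: Second-Order Markov Transition Probability Enhancement

## Objective

Extend the existing first-order Markov transition-probability calculation with a second-order Markov chain, so that the states of the two preceding periods jointly determine the transition probability, with the aim of improving its predictive accuracy.

**Purpose:** The existing `_getTransitionMatrix` looks only at the previous period, which limits the information available. A second-order Markov chain uses a longer history and can, in theory, predict more precisely.

**Important:** This plan is a standalone enhancement that does not depend on any other plan and can be executed independently.

**Output:** `History.php` gains the `_getTransitionMatrix2ndOrder` and `_calcTransitionScore2ndOrder` methods, and `getPredictionV3` chooses first- or second-order transition probabilities based on the amount of data and the state-pair observation counts.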
|
||||
|
||||
## Tasks
|
||||
|
||||
### Task 1: Implement the second-order Markov transition-matrix builder (with state-pair observation check)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (line 2468-2493, _getTransitionMatrix method)
- D:\code\php\amlhc\application\admin\model\History.php (line 2452-2460, _getHeadIdx method)
</read_first>

<action>
Add the second-order Markov transition-matrix builder at the end of the `History.php` class:
|
||||
|
||||
```php
|
||||
    /**
     * Build the second-order Markov transition matrix
     * The states of the two preceding periods jointly determine the transition probability
     *
     * State space:
     * - First-order Markov: N states (zone: 5, tail: 10, head: 5)
     * - Second-order Markov: N^2 state pairs (zone: 25, tail: 100, head: 25)
     * - State key format: "prev1-prev2", e.g. "2-3" means zone 2 one period back and zone 3 two periods back
     *
     * Data-volume thresholds:
     * - At least 200 periods of history are recommended for stable second-order estimates
     * - A state pair counts as sufficiently observed once it has >= 5 observations
     * - The method reports sufficient_pairs and total_pairs so the caller can decide whether to fall back to first order
     *
     * @param array $history historical data (descending, newest first)
     * @param string $type type: zone/tail/head
     * @param int $minStatePairCount minimum observations per state pair, default 5
     * @return array {matrix: [], prob_matrix: [], state_totals: [], num_categories: int, sufficient_pairs: int, total_pairs: int, min_threshold: int}
     */
|
||||
private function _getTransitionMatrix2ndOrder($history, $type, $minStatePairCount = 5)
|
||||
{
|
||||
// 升序排列(从旧到新)
|
||||
$historyAsc = array_reverse($history);
|
||||
|
||||
// 确定类别数量和索引函数
|
||||
switch ($type) {
|
||||
case 'zone':
|
||||
$numCategories = 5;
|
||||
$getIdx = function ($num) {
|
||||
if ($num <= 10) return 0;
|
||||
if ($num <= 20) return 1;
|
||||
if ($num <= 30) return 2;
|
||||
if ($num <= 40) return 3;
|
||||
return 4;
|
||||
};
|
||||
break;
|
||||
case 'tail':
|
||||
$numCategories = 10;
|
||||
$getIdx = function ($num) { return $num % 10; };
|
||||
break;
|
||||
case 'head':
|
||||
$numCategories = 5;
|
||||
$getIdx = function ($num) {
|
||||
if ($num <= 9) return 0;
|
||||
if ($num <= 19) return 1;
|
||||
if ($num <= 29) return 2;
|
||||
if ($num <= 39) return 3;
|
||||
return 4;
|
||||
};
|
||||
break;
|
||||
default:
|
||||
return [
|
||||
'matrix' => [],
|
||||
'prob_matrix' => [],
|
||||
'state_totals' => [],
|
||||
'num_categories' => 0,
|
||||
'sufficient_pairs' => 0,
|
||||
'total_pairs' => 0,
|
||||
'min_threshold' => $minStatePairCount
|
||||
];
|
||||
}
|
||||
|
||||
// 状态空间: (prev1, prev2) -> current,共 numCategories^2 个前置状态
|
||||
$matrix = [];
|
||||
$stateTotals = [];
|
||||
|
||||
// 初始化矩阵结构
|
||||
for ($i = 0; $i < $numCategories; $i++) {
|
||||
for ($j = 0; $j < $numCategories; $j++) {
|
||||
$stateKey = $i . '-' . $j;
|
||||
$matrix[$stateKey] = array_fill(0, $numCategories, 0);
|
||||
$stateTotals[$stateKey] = 0;
|
||||
}
|
||||
}
|
||||
|
||||
// 统计二阶转移
|
||||
for ($i = 0; $i < count($historyAsc) - 2; $i++) {
|
||||
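            // $historyAsc is oldest-first: index $i is two periods back, $i + 1 is one period back.
            // This keeps the "prev1-prev2" key order consistent with the lookup in _calcTransitionScore2ndOrder.
            $prev2 = $getIdx((int)$historyAsc[$i]['num7']);
            $prev1 = $getIdx((int)$historyAsc[$i + 1]['num7']);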
|
||||
$current = $getIdx((int)$historyAsc[$i + 2]['num7']);
|
||||
|
||||
if ($prev1 < 0 || $prev2 < 0 || $current < 0) continue;
|
||||
|
||||
$stateKey = $prev1 . '-' . $prev2;
|
||||
$matrix[$stateKey][$current]++;
|
||||
$stateTotals[$stateKey]++;
|
||||
}
|
||||
|
||||
// 统计充分观察的状态对数量(观察次数 >= minStatePairCount)
|
||||
$sufficientPairs = 0;
|
||||
$totalPairs = $numCategories * $numCategories;
|
||||
foreach ($stateTotals as $stateKey => $count) {
|
||||
if ($count >= $minStatePairCount) {
|
||||
$sufficientPairs++;
|
||||
}
|
||||
}
|
||||
|
||||
// 拉普拉斯平滑处理
|
||||
$probMatrix = [];
|
||||
foreach ($matrix as $stateKey => $counts) {
|
||||
$smoothTotal = $stateTotals[$stateKey] + $numCategories;
|
||||
$probMatrix[$stateKey] = [];
|
||||
for ($j = 0; $j < $numCategories; $j++) {
|
||||
$probMatrix[$stateKey][$j] = ($counts[$j] + 1) / $smoothTotal;
|
||||
}
|
||||
}
|
||||
|
||||
return [
|
||||
'matrix' => $matrix,
|
||||
'prob_matrix' => $probMatrix,
|
||||
'state_totals' => $stateTotals,
|
||||
'num_categories' => $numCategories,
|
||||
'sufficient_pairs' => $sufficientPairs,
|
||||
'total_pairs' => $totalPairs,
|
||||
'min_threshold' => $minStatePairCount
|
||||
];
|
||||
}
|
||||
```
|
||||
|
||||
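Implementation notes:

- The state space grows from N to N^2 (zone: 25 states, tail: 100 states, head: 25 states)
- Laplace smoothing avoids zero probabilities
- The state key format is "prev1-prev2"
- **New state-pair observation check**: counts sufficient_pairs (the number of state pairs observed >= 5 times)
- Returns sufficient_pairs, total_pairs, and min_threshold so the caller can judge whether the estimates are stable enough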
|
||||
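The counting and smoothing steps are easy to get subtly wrong, in particular the order of the two indices in the state key. The sketch below is an illustration only, written in JavaScript so it runs standalone. `buildSecondOrder` is a hypothetical name, and the input is assumed to be a list of category indices ordered oldest first.

```javascript
// Illustration only: a JavaScript mirror of the second-order counting idea.
// buildSecondOrder is a hypothetical name; `seq` holds category indices, oldest first.
function buildSecondOrder(seq, numCategories, minCount) {
    var counts = {};
    var totals = {};
    var i, j, key;
    // Pre-create every "prev1-prev2" key so unseen pairs still get smoothed rows.
    for (i = 0; i < numCategories; i++) {
        for (j = 0; j < numCategories; j++) {
            key = i + '-' + j;
            counts[key] = new Array(numCategories).fill(0);
            totals[key] = 0;
        }
    }
    for (i = 0; i + 2 < seq.length; i++) {
        var prev2 = seq[i];       // two periods back
        var prev1 = seq[i + 1];   // one period back
        var current = seq[i + 2];
        key = prev1 + '-' + prev2; // "prev1-prev2" key format from the plan
        counts[key][current] += 1;
        totals[key] += 1;
    }
    var prob = {};
    var sufficientPairs = 0;
    Object.keys(counts).forEach(function (k) {
        if (totals[k] >= minCount) sufficientPairs += 1;
        // Laplace smoothing: add 1 to every count and numCategories to the total.
        prob[k] = counts[k].map(function (c) {
            return (c + 1) / (totals[k] + numCategories);
        });
    });
    return {
        prob: prob,
        totals: totals,
        sufficientPairs: sufficientPairs,
        totalPairs: numCategories * numCategories
    };
}

// Tiny two-category sequence, used only to exercise the function.
var demo = buildSecondOrder([0, 1, 0, 1, 0, 1, 1], 2, 2);
```

In this toy sequence the pair "1-0" (previous 1, two back 0) is seen three times and is followed by 0 twice and 1 once, so its smoothed row is [3/5, 2/5]. A pair that never occurs, such as "0-0", falls back to the uniform row.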
</action>
|
||||
|
||||
<acceptance_criteria>
|
||||
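- grep regex match: `_getTransitionMatrix2ndOrder\s*\(` exists in History.php
- grep match: the `$minStatePairCount` parameter is in the method signature
- grep match: `sufficient_pairs` exists in the returned structure
- grep match: `total_pairs` exists in the returned structure
- The method has a stateKey variable (format prev1-prev2)
- The method has a function-level comment describing the state space and data-volume thresholds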
|
||||
</acceptance_criteria>
|
||||
|
||||
### Task 2: Implement the second-order transition-probability score method

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (the newly added _getTransitionMatrix2ndOrder method)
- D:\code\php\amlhc\application\admin\model\History.php (locate the _calcTransitionScore method)
</read_first>

<action>
Use grep to locate the `_calcTransitionScore` method, then add the second-order transition-probability score method next to it:
|
||||
|
||||
```bash
|
||||
grep -n "_calcTransitionScore" application/admin/model/History.php
|
||||
```
|
||||
|
||||
Add the `_calcTransitionScore2ndOrder` method:
|
||||
|
||||
```php
|
||||
    /**
     * Compute the second-order transition-probability score
     *
     * Method:
     * - Combines the second-order transition probabilities of three dimensions: zone, tail digit, head digit
     * - Dimension weights: zone 40%, tail 35%, head 25%
     * - Score range: 0-100
     *
     * @param int $num candidate number
     * @param int $prev1Zone zone index one period back
     * @param int $prev2Zone zone index two periods back
     * @param int $prev1Tail tail-digit index one period back
     * @param int $prev2Tail tail-digit index two periods back
     * @param int $prev1Head head-digit index one period back
     * @param int $prev2Head head-digit index two periods back
     * @param array $zoneTrans2nd second-order zone transition matrix
     * @param array $tailTrans2nd second-order tail-digit transition matrix
     * @param array $headTrans2nd second-order head-digit transition matrix
     * @param array $zoneMap number-to-zone map
     * @param array $tailMap number-to-tail-digit map
     * @param array $headMap number-to-head-digit map
     * @return float combined transition score (0-100)
     */
|
||||
private function _calcTransitionScore2ndOrder(
|
||||
$num,
|
||||
$prev1Zone, $prev2Zone,
|
||||
$prev1Tail, $prev2Tail,
|
||||
$prev1Head, $prev2Head,
|
||||
$zoneTrans2nd, $tailTrans2nd, $headTrans2nd,
|
||||
$zoneMap, $tailMap, $headMap
|
||||
)
|
||||
{
|
||||
$zone = $zoneMap[$num];
|
||||
$tail = $tailMap[$num];
|
||||
$head = $headMap[$num];
|
||||
|
||||
$score = 0;
|
||||
|
||||
// 区域二阶转移得分(权重40%)
|
||||
$zoneStateKey = $prev1Zone . '-' . $prev2Zone;
|
||||
if (isset($zoneTrans2nd['prob_matrix'][$zoneStateKey][$zone])) {
|
||||
$prob = $zoneTrans2nd['prob_matrix'][$zoneStateKey][$zone];
|
||||
$score += $prob * 40;
|
||||
}
|
||||
|
||||
// 尾号二阶转移得分(权重35%)
|
||||
$tailStateKey = $prev1Tail . '-' . $prev2Tail;
|
||||
if (isset($tailTrans2nd['prob_matrix'][$tailStateKey][$tail])) {
|
||||
$prob = $tailTrans2nd['prob_matrix'][$tailStateKey][$tail];
|
||||
$score += $prob * 35;
|
||||
}
|
||||
|
||||
// 首号二阶转移得分(权重25%)
|
||||
$headStateKey = $prev1Head . '-' . $prev2Head;
|
||||
if (isset($headTrans2nd['prob_matrix'][$headStateKey][$head])) {
|
||||
$prob = $headTrans2nd['prob_matrix'][$headStateKey][$head];
|
||||
$score += $prob * 25;
|
||||
}
|
||||
|
||||
return round($score, 2);
|
||||
}
|
||||
```
|
||||
|
||||
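Implementation notes:

- Combines the zone, tail-digit, and head-digit dimensions
- Dimension weights: zone 40%, tail 35%, head 25%
- Uses the probability stored in prob_matrix under the matching state key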
|
||||
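To get a feel for the scale of this score, the weighting can be sketched in isolation. This is an illustration only, in JavaScript. `transitionScore2nd` is a hypothetical name, and each `*Probs` argument is assumed to be the probability row already selected for the current "prev1-prev2" state key.

```javascript
// Illustration only; names are hypothetical. Each *Probs argument is the
// probability row already looked up for the current "prev1-prev2" state key.
function transitionScore2nd(zoneProbs, tailProbs, headProbs, zone, tail, head) {
    var score = 0;
    // A missing row or entry contributes nothing, like the isset() guards in the plan.
    if (zoneProbs && zoneProbs[zone] !== undefined) score += zoneProbs[zone] * 40;
    if (tailProbs && tailProbs[tail] !== undefined) score += tailProbs[tail] * 35;
    if (headProbs && headProbs[head] !== undefined) score += headProbs[head] * 25;
    return Math.round(score * 100) / 100; // two decimal places
}

// Uniform rows: 5 zones, 10 tail digits, 5 head digits.
var uniformZone = [0.2, 0.2, 0.2, 0.2, 0.2];
var uniformTail = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1];
var uniformHead = [0.2, 0.2, 0.2, 0.2, 0.2];
var uniformScore = transitionScore2nd(uniformZone, uniformTail, uniformHead, 0, 0, 0);
```

With uniform probabilities the score is 0.2*40 + 0.1*35 + 0.2*25 = 16.5, so 100 is an upper bound that is only reached when all three probabilities equal 1. That matters when this score is combined with other dimensions that use the full 0-100 range.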
</action>
|
||||
|
||||
<acceptance_criteria>
|
||||
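- grep regex match: `_calcTransitionScore2ndOrder\s*\(` exists in History.php
- The method parameters include the second-order state parameters such as prev1Zone and prev2Zone
- The method has the zoneStateKey, tailStateKey, and headStateKey variables
- The method has a function-level comment describing the weight allocation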
|
||||
</acceptance_criteria>
|
||||
|
||||
### Task 3: Integrate second-order Markov into getPredictionV3 (200-period threshold and state-pair check)

<read_first>
- D:\code\php\amlhc\application\admin\model\History.php (line 2230-2239, transition-probability analysis section)
- D:\code\php\amlhc\application\admin\model\History.php (line 2159-2161, history data-volume check)
</read_first>

<action>
In the `getPredictionV3` method, modify the transition-probability analysis section (around line 2230-2239):

1. Find the following snippet:
|
||||
```php
|
||||
// ====== 3. 转移概率分析(新增)======
|
||||
// 获取转移概率矩阵数据
|
||||
$zoneTransition = $this->_getTransitionMatrix($allHistory, 'zone');
|
||||
$tailTransition = $this->_getTransitionMatrix($allHistory, 'tail');
|
||||
$headTransition = $this->_getTransitionMatrix($allHistory, 'head');
|
||||
|
||||
// 上期号码的各类属性
|
||||
$lastZone = $this->_getZoneIdx($lastSpecial);
|
||||
$lastTail = $lastSpecial % 10;
|
||||
$lastHead = $this->_getHeadIdx($lastSpecial);
|
||||
```
|
||||
|
||||
Replace it with:
|
||||
```php
|
||||
// ====== 3. 转移概率分析 ======
|
||||
// 根据历史数据量决定使用一阶或二阶马尔可夫
|
||||
// 阈值条件:总期数 >= 200 且 状态对观察次数充足(>=5次的比例>=30%)
|
||||
$minPeriodsThreshold = 200; // 二阶马尔可夫最小历史期数阈值(从100提升到200)
|
||||
$minStatePairCount = 5; // 状态对最小观察次数
|
||||
$use2ndOrder = false;
|
||||
$secondOrderAvailable = false;
|
||||
|
||||
// 获取一阶转移概率矩阵(始终计算,作为fallback)
|
||||
$zoneTransition = $this->_getTransitionMatrix($allHistory, 'zone');
|
||||
$tailTransition = $this->_getTransitionMatrix($allHistory, 'tail');
|
||||
$headTransition = $this->_getTransitionMatrix($allHistory, 'head');
|
||||
|
||||
// 获取二阶转移概率矩阵(数据充足时)
|
||||
$zoneTransition2nd = null;
|
||||
$tailTransition2nd = null;
|
||||
$headTransition2nd = null;
|
||||
$prev2Zone = 0;
|
||||
$prev2Tail = 0;
|
||||
$prev2Head = 0;
|
||||
|
||||
        if (count($allHistory) >= $minPeriodsThreshold) {
|
||||
// 获取前两期号码属性
|
||||
$prev2Special = (int)$allHistory[1]['num7'];
|
||||
$prev2Zone = $this->_getZoneIdx($prev2Special);
|
||||
$prev2Tail = $prev2Special % 10;
|
||||
$prev2Head = $this->_getHeadIdx($prev2Special);
|
||||
|
||||
// 构建二阶转移矩阵
|
||||
$zoneTransition2nd = $this->_getTransitionMatrix2ndOrder($allHistory, 'zone', $minStatePairCount);
|
||||
$tailTransition2nd = $this->_getTransitionMatrix2ndOrder($allHistory, 'tail', $minStatePairCount);
|
||||
$headTransition2nd = $this->_getTransitionMatrix2ndOrder($allHistory, 'head', $minStatePairCount);
|
||||
|
||||
// 检查状态对观察次数是否充足(至少30%的状态对有足够观察)
|
||||
// tail类型状态空间最大(100),以tail为基准判断
|
||||
if ($tailTransition2nd['total_pairs'] > 0) {
|
||||
$sufficientRatio = $tailTransition2nd['sufficient_pairs'] / $tailTransition2nd['total_pairs'];
|
||||
$secondOrderAvailable = $sufficientRatio >= 0.3; // 至少30%状态对观察>=5次
|
||||
}
|
||||
|
||||
$use2ndOrder = $secondOrderAvailable;
|
||||
}
|
||||
|
||||
// 上期号码的各类属性
|
||||
$lastZone = $this->_getZoneIdx($lastSpecial);
|
||||
$lastTail = $lastSpecial % 10;
|
||||
$lastHead = $this->_getHeadIdx($lastSpecial);
|
||||
```
|
||||
|
||||
2. Add the transition-order information to the analysis array (around line 2297-2317):

Find:
|
||||
```php
|
||||
$analysis = [
|
||||
'last_special' => $lastSpecial,
|
||||
'last_expect' => $lastExpect,
|
||||
'weights' => $weights,
|
||||
...
|
||||
];
|
||||
```
|
||||
|
||||
Add the following after `trend_direction`:
|
||||
```php
|
||||
$analysis = [
|
||||
...
|
||||
'trend_direction' => $trendDirection,
|
||||
'transition_order' => $use2ndOrder ? 2 : 1, // 新增:转移概率阶数
|
||||
'transition_available' => $secondOrderAvailable, // 二阶是否可用
|
||||
'history_count' => count($allHistory), // 历史期数
|
||||
'min_periods_threshold' => $minPeriodsThreshold, // 阈值
|
||||
'last_zone' => $zoneLabels[$lastZone] ?? '',
|
||||
...
|
||||
];
|
||||
```
|
||||
|
||||
3. In the score-calculation loop (around line 2342-2349), change the transition-probability score calculation:

Find:
|
||||
```php
|
||||
// === 转移概率得分 ===
|
||||
$transScore = $this->_calcTransitionScore(
|
||||
$num, $lastZone, $lastTail, $lastHead,
|
||||
$zoneTransition, $tailTransition, $headTransition,
|
||||
$zoneMap, $tailMap, $headMap
|
||||
);
|
||||
```
|
||||
|
||||
Replace it with:
|
||||
```php
|
||||
// === 转移概率得分(根据阶数选择计算方法)===
|
||||
if ($use2ndOrder && $zoneTransition2nd && $tailTransition2nd && $headTransition2nd) {
|
||||
$transScore = $this->_calcTransitionScore2ndOrder(
|
||||
$num, $lastZone, $prev2Zone, $lastTail, $prev2Tail, $lastHead, $prev2Head,
|
||||
$zoneTransition2nd, $tailTransition2nd, $headTransition2nd,
|
||||
$zoneMap, $tailMap, $headMap
|
||||
);
|
||||
$detail['trans_order'] = 2;
|
||||
} else {
|
||||
$transScore = $this->_calcTransitionScore(
|
||||
$num, $lastZone, $lastTail, $lastHead,
|
||||
$zoneTransition, $tailTransition, $headTransition,
|
||||
$zoneMap, $tailMap, $headMap
|
||||
);
|
||||
$detail['trans_order'] = 1;
|
||||
}
|
||||
```
|
||||
</action>
|
||||
|
||||
<acceptance_criteria>
|
||||
- grep match: the `minPeriodsThreshold` variable exists in getPredictionV3 (value 200)
- grep match: the `minStatePairCount` variable exists in getPredictionV3 (value 5)
- grep match: the `$secondOrderAvailable` variable exists in getPredictionV3
- grep match: `sufficientRatio` exists in getPredictionV3 (state-pair observation ratio)
- grep match: `_getTransitionMatrix2ndOrder` is called in getPredictionV3
- grep match: `transition_order` exists in the analysis array
- grep match: `transition_available` exists in the analysis array
- grep match: `_calcTransitionScore2ndOrder` is called in the score calculation
- The data-volume threshold is set to 200 periods (instead of the original 100)
- State-pair check: pairs with >= 5 observations must make up >= 30% of all pairs
|
||||
</acceptance_criteria>
|
||||
|
||||
## Verification
|
||||
|
||||
Call the prediction endpoint and check whether second-order Markov is used:
|
||||
|
||||
```bash
|
||||
curl -s "http://127.0.0.1:8000/admin/history/predictV3?periods=300&backtest=10" | grep -E "transition_order|transition_available|history_count"
|
||||
```
|
||||
|
||||
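Expected results:

- With at least 200 periods of history and sufficient state-pair observations, transition_order: 2 is returned
- With fewer than 200 periods or insufficient state-pair observations, transition_order: 1 is returned
- transition_available indicates whether second order is available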
|
||||
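The two conditions behind these expected results can be written as a single decision function. This is a sketch only, in JavaScript. `chooseTransitionOrder` is a hypothetical name, and the thresholds are the ones stated in this plan.

```javascript
// Illustration only; chooseTransitionOrder is a hypothetical name.
// historyCount: number of historical periods available
// sufficientPairs / totalPairs: values reported for the tail-digit matrix
function chooseTransitionOrder(historyCount, sufficientPairs, totalPairs) {
    var minPeriods = 200; // minPeriodsThreshold in the plan
    var minRatio = 0.3;   // at least 30% of state pairs observed >= 5 times
    if (historyCount < minPeriods || totalPairs <= 0) {
        return 1; // not enough history: first order
    }
    return (sufficientPairs / totalPairs) >= minRatio ? 2 : 1;
}
```

For scale: 200 periods yield 198 second-order transitions, and reaching 30 of the 100 tail-digit pairs at 5 observations each needs at least 150 of them to land on those 30 pairs. Near the 200-period minimum the function will therefore usually return 1, and a return value of 2 becomes likely only with a considerably longer history.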
|
||||
## Success Criteria
|
||||
|
||||
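1. `_getTransitionMatrix2ndOrder` is implemented, including construction of the second-order state space
2. `_calcTransitionScore2ndOrder` is implemented
3. `getPredictionV3` automatically chooses first- or second-order Markov based on data volume and state-pair observation counts
4. The data-volume threshold is raised to 200 periods (from the original 100)
5. Second order is used only when state pairs with >= 5 observations make up >= 30% of all pairs
6. The analysis response includes the transition_order and transition_available fields
7. All new methods have function-level comments
8. depends_on is corrected to an empty array (standalone feature)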
|
||||
|
||||
## Output
|
||||
|
||||
When finished, create `.planning/phases/11-predictv3/11-05-SUMMARY.md`
|
||||
@@ -844,25 +844,29 @@ foreach ($weights as $key => $value) {
|
||||
|
||||
---
|
||||
|
||||
## Open Questions
|
||||
## Open Questions (RESOLVED)
|
||||
|
||||
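1. **Is there enough historical data to support the advanced optimizations?**
   - The default is currently 200 periods of statistics; 500+ periods are recommended for second-order Markov and association-rule mining
   - The number of periods actually available in the database needs to be checked
   - Recommended: run `SELECT COUNT(*) FROM fa_history` to confirm the data volume
   - **Resolution:** 11-05 Task 3 sets a 200-period threshold plus a state-pair observation check and falls back to first-order Markov when data is insufficient; handled in the plan

2. **How should the weight-optimization results be persisted?**
   - Option A: store them in the `application/extra/predict.php` config file
   - Option B: store them in a database config table
   - Option C: compute them dynamically on every prediction (high performance cost)
   - **Resolution:** 11-04 adopts Option C (dynamic computation) and returns the results to the frontend for display without persisting them. Design decision: avoid overfitting to a particular time window and always fetch the latest optimization result

3. **How should the confidence thresholds be defined?**
   - Current assumption: >=70% is high, 50-70% is medium, <50% is low
   - The thresholds need to be tuned against actual backtest data
   - **Resolution:** 11-02 Task 1 fixes the thresholds: >=70% high (green), 50-70% medium (orange), <50% low (red); the 11-03 frontend uses the same mapping

4. **How should the frontend present the new backtest metrics (NDCG, MRR)?**
   - A user-friendly presentation needs to be designed
   - They could be simplified into a single "prediction quality score"
   - **Resolution:** 11-03 Task 2 implements percentage display plus a bar chart: NDCG@5 and MRR are shown as percentages, and the hit distribution is visualized as a bar chart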
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -0,0 +1,158 @@
|
||||
---
|
||||
phase: 11
|
||||
reviewers: [codex, opencode]
|
||||
reviewed_at: 2026-05-01T12:30:00+08:00
|
||||
plans_reviewed: [11-01-PLAN.md, 11-02-PLAN.md, 11-03-PLAN.md, 11-04-PLAN.md, 11-05-PLAN.md]
|
||||
---
|
||||
|
||||
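# Cross-AI Plan Review: Phase 11 predictV3 Algorithm Optimization

## Codex Review

### Summary

The plans are technically sound, but several points need attention: fixing the dependency declarations, data validation, performance safeguards, and edge-case handling. The overall architecture follows the existing patterns well, and NDCG@5 and MRR are appropriate ranking-evaluation metrics.

### Strengths

- Clear plan structure with a sensible task breakdown
- NDCG@5, MRR, and hit distribution are all industry-standard ranking-quality metrics
- The added fields are explicit (ndcg_5, mrr, hit_distribution, precision_5)
- The overall architecture follows existing code patterns

### Concerns

| Severity | Concern |
|----------|---------|
| HIGH | **Wrong dependency**: Plan 05 (second-order Markov) should not depend on Plans 01 and 03; it is a standalone enhancement |
| HIGH | **Missing data validation**: minimum sample-size checks are needed; 50+ periods suggested for confidence, 150+ periods for second-order Markov |
| MEDIUM | **Performance safeguards**: the grid search needs a timeout guard and caching of the best weights |
| MEDIUM | **Edge cases**: the NDCG calculation needs an empty-prediction guard; the hit distribution needs clearly defined buckets |

### Suggestions

- Fix the Plan 05 dependencies so it can run independently
- Add data-volume threshold checks that return a notice or fall back when data is insufficient
- Add an execution timeout to the grid search (e.g. 60 seconds)
- Check whether `$details` is empty before computing NDCG/MRR

### Risk Assessment

**MEDIUM**: the dependency and edge-case issues need fixing, otherwise execution may fail.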
|
||||
|
||||
---
|
||||
|
||||
## OpenCode Review
|
||||
|
||||
### Summary
|
||||
|
||||
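Phase 11 is reasonable in the direction it extends functionality: on top of the existing V3 prediction algorithm (9 dimensions + dynamic weights) it adds confidence assessment, extended backtest metrics, weight optimization, and a second-order Markov enhancement. The plans are clearly structured overall with explicit dependencies, but several implementation details are missing and some potential risks need to be addressed.

### Strengths

- Plan 01: sensible task breakdown and well-chosen metrics
- Plan 02: decoupled from the backtest-metric extension and can be implemented independently
- Plan 03: clear dependencies, implemented within the existing modal architecture
- Plan 04: grid search is a systematic hyperparameter-optimization method
- Plan 05: second-order Markov is a reasonable direction for strengthening the algorithm

### Concerns

| Severity | Concern |
|----------|---------|
| HIGH | **Plan 01: NDCG formula unclear**: the "ideal ranking" it is computed against needs confirming, and the relevance score is loosely defined |
| HIGH | **Plan 01: hit_distribution loosely defined**: which distribution exactly? By hit rank? Hits and misses by period? |
| HIGH | **Plan 02: data source of the "historical rank hit rate" unclear**: the existing `_runBacktestV3` does not store historical hit rates, so it must be stated whether they are computed live or cached |
| HIGH | **Plan 02: "multi-dimension consistency" undefined**: consistency between which dimensions, and how is it quantified? |
| HIGH | **Plan 04: the 5 weight configurations are unspecified**: no concrete list is given |
| HIGH | **Plan 04: optimization objective unclear**: which metric is the target? hit_rate? NDCG? MRR? |
| HIGH | **Plan 05: state-space explosion**: the second-order state space is large (tail digit 10×10=100); with 100 periods of history many state pairs never occur, so the probability estimates are inaccurate |
| HIGH | **Plan 05: the 100-period threshold lacks justification**: second-order Markov needs a longer history; at least 200-300 periods suggested |
| MEDIUM | **Plan 01: precision_5 vs hit_rate**: both are Top-5 hits; are they duplicates? Suggest distinguishing or merging them |
| MEDIUM | **Plan 02: confidence percentage formula**: how are the three dimensions weighted? No ratios are given |
| MEDIUM | **Plan 02: data requirements**: with < 20 periods of history the confidence is inaccurate, and there is no fallback strategy |
| MEDIUM | **Plan 03: UI design details missing**: no concrete layout proposal; the confidence display style is unclear |
| MEDIUM | **Plan 03: hit-distribution chart implementation**: ECharts or CSS is not stated; performance needs consideration with large data |
| MEDIUM | **Plan 04: result persistence**: recomputed on every call? Suggest adding a cache |
| MEDIUM | **Plan 04: compute cost**: 5 configurations × 50 backtest periods = 250 predictions, so responses are slow; suggest async execution |
| MEDIUM | **Plan 05: first/second-order switching logic**: the criterion is unclear: total-period threshold or state-pair observation threshold? |
| MEDIUM | **Plan 05: relation to the existing first-order weight**: is second order a separate dimension or a replacement for first order? |
| LOW | **Plan 01: compute performance**: batch computation on every backtest may affect performance |
| LOW | **Plan 02: return structure**: is a confidence value provided for every number? |
| LOW | **Plan 03: frontend data volume**: does the hit distribution need paging or lazy loading? |
| LOW | **Plan 03: internationalization**: do the new UI strings need multiple languages? |
| LOW | **Plan 05: performance impact**: second-order computation is more complex than first order |

### Suggestions

**Plan 01:**
- Add the NDCG formula: DCG = Σ(1/log2(rank+1)), IDCG = Σ(1/log2(i+1)) for i=1..hits
- Make the hit_distribution structure explicit: `{rank_1: n, rank_2: n, ..., rank_5: n, miss: n}`
- Add a caching mechanism for backtest results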
|
||||
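The formulas suggested for Plan 01 can be made concrete for the single-winner case, where each period has exactly one drawn number. IDCG is then 1 and NDCG@K reduces to 1/log2(rank+1). The sketch below is an illustration in JavaScript under that assumption. The function names are hypothetical, and treating an absent number as contributing 0 to MRR is also an assumption.

```javascript
// Illustration only; function names are hypothetical. `rank` is the 1-based
// position of the drawn number in the prediction list, or null if it is absent.
function ndcgAtK(rank, k) {
    if (rank === null || rank > k) return 0;
    var dcg = 1 / Math.log2(rank + 1);
    var idcg = 1 / Math.log2(2); // ideal case: the single hit sits at rank 1
    return dcg / idcg;
}

// Mean reciprocal rank over all periods; a missing rank contributes 0.
function meanReciprocalRank(ranks) {
    if (ranks.length === 0) return 0; // empty-input guard
    var sum = 0;
    ranks.forEach(function (r) { if (r !== null) sum += 1 / r; });
    return sum / ranks.length;
}

// Counts hits per rank position in the {rank_1..rank_k, miss} shape.
function hitDistribution(ranks, k) {
    var dist = { miss: 0 };
    for (var i = 1; i <= k; i++) dist['rank_' + i] = 0;
    ranks.forEach(function (r) {
        if (r !== null && r <= k) dist['rank_' + r] += 1;
        else dist.miss += 1;
    });
    return dist;
}
```

A hit at rank 3 scores 1/log2(4) = 0.5, and any rank beyond K scores 0, which is what distinguishes NDCG@5 from a plain Top-5 hit rate.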
|
||||
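**Plan 02:**
- Rename "multi-dimension consistency" to "prediction score concentration", computed from the gap between the Top-5 scores and the average score
- Make the weighting formula explicit: `confidence = 0.4*historical + 0.3*score + 0.3*consistency`
- Obtain the historical hit rate as a by-product of `_runBacktestV3`, with a single call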
|
||||
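As a quick numeric check of the suggested weighting formula, here is a minimal sketch in JavaScript. `combineConfidence` is a hypothetical name, and treating all three inputs as percentages on a 0-100 scale is an assumption, since the review does not state their scales.

```javascript
// Illustration only; combineConfidence is a hypothetical name.
// All three inputs are assumed to be percentages in the 0-100 range.
function combineConfidence(historical, score, consistency) {
    var value = 0.4 * historical + 0.3 * score + 0.3 * consistency;
    return Math.round(value * 10) / 10; // one decimal place
}

// Example inputs chosen only for illustration.
var exampleConfidence = combineConfidence(60, 80, 70);
```

The example works out to 24 + 24 + 21 = 69, which falls just under the 70% boundary used for the "high" level elsewhere in these plans. Small changes in any one dimension can therefore move a number between levels.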
|
||||
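**Plan 03:**
- Add a UI mockup or a concrete description
- Use the existing ECharts to display the hit distribution
- Show the high/medium/low confidence levels with three colors (red/yellow/green)

**Plan 04:**
- Add the concrete weight values for the 5 predefined configurations
- Define the optimization objective: a blend of hit_rate (60%) + avg_rank (40%)
- Process asynchronously in a background queue and return a task_id

**Plan 05:**
- Re-evaluate whether second-order Markov is necessary: data sparsity is severe for a 1-in-49 lottery
- Alternative: a weighted N-order Markov (70% second order + 30% first order)
- Make the criterion explicit: `count($history) >= 200 && $statePairCount >= 5`

### Risk Assessment

**MEDIUM**: moderate implementation complexity, a high data-sparsity risk, and several loose definitions that need filling in. Recommend implementing Plans 01, 02, and 04 first and treating Plan 05 as an optional advanced optimization, or re-evaluating whether it is needed.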
|
||||
|
||||
---
|
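
## Consensus Summary

### Agreed Strengths

- The plans are clearly structured and the task breakdown is sensible ✓
- NDCG@5 and MRR are appropriate ranking-quality metrics ✓
- The overall architecture follows existing code patterns ✓
- Grid search is a systematic approach to parameter optimization ✓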
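
### Agreed Concerns (Highest Priority)

| Priority | Concern | Source |
|----------|---------|--------|
| 1 | **Plan 05 dependency is wrong**: it should not depend on Plans 01 and 03; it is an independent feature | Codex + OpenCode |
| 2 | **Missing data volume validation**: confidence needs 50+ draws, second-order Markov needs 150-200+ | Codex + OpenCode |
| 3 | **Edge-case handling**: NDCG/MRR need empty-prediction guards, and hit_distribution is vaguely defined | Codex + OpenCode |
| 4 | **Plan 05 state-space explosion**: with 100 draws, second-order Markov probability estimates are unreliable | OpenCode |
| 5 | **Plan 02 confidence dimension is ill-defined**: how is "multi-dimensional consistency" quantified? | OpenCode |
| 6 | **Plan 04 configurations unspecified**: concrete values for the 5 weight sets are missing and the optimization objective is unclear | OpenCode |
| 7 | **Performance impact**: grid search needs a timeout or asynchronous execution, and second-order Markov is computationally heavy | Codex + OpenCode |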

### Divergent Views

| Issue | Codex | OpenCode |
|-------|-------|----------|
| Plan 05 data threshold | Suggests 150+ | Suggests 200-300+, and requires a state-pair observation count >= 5 |
| precision_5 vs. hit_rate | Not mentioned | Considers them possibly redundant; suggests differentiating them |

---

## Action Items for Replanning

1. **Fix Plan 05 depends_on** → change to `depends_on: []`
2. **Add data validation** → add minimum data volume checks and fallbacks to all calculation methods
3. **Clarify NDCG formula** → add the complete formula to Plan 01 Task 1
4. **Clarify hit_distribution** → specify the structure as `{rank_1..rank_5: counts}`
5. **Clarify confidence dimensions** → rename "multi-dimensional consistency" to "score concentration"
6. **Add weight configs** → add the 5 concrete weight configurations to Plan 04
7. **Raise 2nd-order threshold** → change Plan 05 to 200 draws plus a state-pair observation count check
8. **Add performance protection** → add a timeout to the grid search and consider asynchronous execution

@@ -0,0 +1,97 @@
---
phase: 11
phase_slug: predictv3
created: 2026-05-01
---

# Phase 11: predictV3 Algorithm Optimization - Validation Strategy

## Overview

This phase is an algorithm optimization, so validation focuses on:
1. Accuracy of the new metric calculations (NDCG, MRR, confidence)
2. Correct construction of the second-order Markov transition matrix
3. Validity of the weight optimization results
4. Verified improvement in the backtest hit rate

## Test Framework

| Property | Value |
|----------|-------|
| Framework | PHPUnit (bundled with ThinkPHP) |
| Config file | None; tests are run via `php think unit` |
| Quick run command | `php think unit --filter HistoryTest` |
| Full suite command | `php think unit` |

## Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| PRED-01 | Confidence calculation accuracy | unit | `php think unit --filter testConfidenceCalculation` | ❌ Wave 0 |
| PRED-02 | NDCG calculation accuracy | unit | `php think unit --filter testNDCGCalculation` | ❌ Wave 0 |
| PRED-03 | Second-order Markov transition matrix construction | unit | `php think unit --filter testTransitionMatrix2ndOrder` | ❌ Wave 0 |
| PRED-04 | Weight optimization convergence | unit | `php think unit --filter testWeightOptimization` | ❌ Wave 0 |
| PRED-05 | Backtest result completeness | unit | `php think unit --filter testBacktestV3Extended` | ❌ Wave 0 |

## Sampling Rate

- **Per task commit:** quick unit tests covering the core methods
- **Per wave merge:** backtest validation against real historical data
- **Phase gate:** full backtest (100 draws) verifying an overall hit-rate improvement

## Wave 0 Gaps

- [ ] `tests/HistoryTest.php`: unit tests for the core prediction methods
- [ ] `tests/ConfidenceTest.php`: confidence calculation tests
- [ ] `tests/BacktestMetricsTest.php`: tests for NDCG, MRR, and related metric calculations
- [ ] Shared fixtures: a mock generator for historical data

## Validation Dimensions

### Dimension 8: Nyquist Test Coverage

**Target:** every PLAN.md has at least one verifiable test command

| Plan | Primary Test | Coverage |
|------|--------------|----------|
| 11-01 | testBacktestV3Extended | NDCG/MRR calculation accuracy |
| 11-02 | testConfidenceCalculation | Confidence calculation accuracy |
| 11-03 | Manual UI verification | Frontend display correctness |
| 11-04 | testWeightOptimization | Weight optimization convergence |
| 11-05 | testTransitionMatrix2ndOrder | Second-order Markov correctness |

### Dimension 9: Integration Verification

**End-to-End Flow:**

1. The backend `getPredictionV3()` returns the complete data structure (predictions + confidence + backtest)
2. The frontend `renderPredict()` correctly renders all new metrics
3. The backtest hit rate can be compared quantitatively (V2 vs V3)

**Verification Command:**

```bash
# Verify that the backend endpoint returns the complete structure
curl -s "http://localhost/history/predictV3?periods=100" | jq '.data | keys'
# Expected output includes: predictions, confidence, backtest, analysis

# Verify that NDCG/MRR are present
curl -s "http://localhost/history/predictV3?periods=100" | jq '.data.backtest | keys'
# Expected output includes: hit_rate, avg_rank, ndcg_5, mrr, hit_distribution
```

## Acceptance Criteria

### Phase Gate

- [ ] NDCG@5 is computed correctly (unit tests pass)
- [ ] MRR is computed correctly (unit tests pass)
- [ ] Confidence thresholds are correct (>=70% high, 50-70% medium, <50% low)
- [ ] Second-order Markov is enabled when data is sufficient
- [ ] The frontend displays all new metrics
- [ ] The backtest hit rate improves (relative to V2)

---

*Phase: 11-predictv3*
*Validation strategy created: 2026-05-01*

@@ -0,0 +1,55 @@
---
name: history-predict
created: 2026-04-30
type: quick
---

# Number Prediction Feature Plan

## Goal
Add a number prediction feature to the history page that combines multi-dimensional analysis of the historical records to suggest predicted numbers.

## Analysis Dimensions
The existing system already provides the following transition-probability analyses:
1. **Zone transition** - zoneTransition (1-10, 11-20, 21-30, 31-40, 41-49)
2. **Zodiac transition** - zodiacTransition (12 zodiac signs)
3. **Tail digit transition** - tailNumberTransition (tail digits 0-9)
4. **Head digit transition** - headNumberTransition (head digits 0-4)
5. **Color wave transition** - colorWaveTransition (red/blue/green)
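
Three of the dimensions above follow directly from the number itself and can be derived arithmetically, as sketched below. The function name `describeNumber` and the 1-5 zone indexing are assumptions; zodiac and color wave are omitted because they depend on lookup tables that this plan does not show.

```php
<?php
/**
 * Hypothetical helper: derive the zone, tail digit and head digit of a number in 1..49.
 * Zones are indexed 1..5 for the ranges 1-10, 11-20, 21-30, 31-40, 41-49.
 */
function describeNumber(int $number): array
{
    if ($number < 1 || $number > 49) {
        throw new InvalidArgumentException('Number must be between 1 and 49');
    }

    return [
        'zone' => intdiv($number - 1, 10) + 1, // 10 stays in zone 1, 11 starts zone 2
        'tail' => $number % 10,                // last digit, 0-9
        'head' => intdiv($number, 10),         // leading digit, 0-4
    ];
}
```

Note that zone and head digit are not interchangeable: 10 has head digit 1 but belongs to the 1-10 zone.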
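
## Prediction Algorithm
Based on the special numbers of the most recent N draws, combined with the transition probability matrix of each dimension:
- For each dimension of the previous draw's special number (zone, zodiac, tail digit, head digit), look up the target with the highest transition probability
- Combine the per-dimension predictions to compute a composite score for each number
- Score = the sum of the weighted zone, zodiac, tail digit, head digit, and color wave probabilities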

## Implementation Steps

### 1. Backend Model: New Method
- `getPrediction($periods, $weights)` - composite prediction method
- Input: number of historical draws, per-dimension weight configuration
- Output: list of predicted numbers (sorted by score)

### 2. Backend Controller: New Endpoint
- `predict()` - AJAX endpoint
- Parameters: periods, weights (optional)
- Returns: list of predicted numbers + per-dimension analysis details

### 3. Frontend JS: New Features
- Prediction dialog `showPredictDialog()`
- Weight configuration panel
- Prediction result rendering (number balls + scores + per-dimension analysis notes)

### 4. Weight Configuration
Default weights:
- Zone transition: 0.25
- Zodiac transition: 0.20
- Tail digit transition: 0.20
- Head digit transition: 0.15
- Color wave transition: 0.10
- Hot/cold coefficient: 0.10
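
A minimal sketch of how these default weights could feed the composite score. The dimension keys (`zone`, `zodiac`, `tail`, `head`, `color`, `hot_cold`) and the function names are illustrative assumptions, not the identifiers used by `getPrediction`; each number is assumed to arrive with one probability per dimension.

```php
<?php
// Default weights from the plan; the key names are hypothetical.
const DEFAULT_WEIGHTS = [
    'zone' => 0.25,
    'zodiac' => 0.20,
    'tail' => 0.20,
    'head' => 0.15,
    'color' => 0.10,
    'hot_cold' => 0.10,
];

/**
 * Weighted sum of per-dimension probabilities; a missing dimension counts as 0.
 */
function compositeScore(array $probabilities, array $weights = DEFAULT_WEIGHTS): float
{
    $score = 0.0;
    foreach ($weights as $dimension => $weight) {
        $score += $weight * ($probabilities[$dimension] ?? 0.0);
    }

    return $score;
}

/**
 * Score every candidate number and sort by score, highest first.
 *
 * @param array $perNumberProbabilities map of number => [dimension => probability]
 * @return array map of number => score, in descending score order
 */
function rankNumbers(array $perNumberProbabilities, array $weights = DEFAULT_WEIGHTS): array
{
    $scores = [];
    foreach ($perNumberProbabilities as $number => $probabilities) {
        $scores[$number] = compositeScore($probabilities, $weights);
    }
    arsort($scores); // descending by value, number keys preserved

    return $scores;
}
```

Because the default weights sum to 1.0, a number whose probability is 1.0 in every dimension scores exactly 1.0, which keeps scores comparable across weight configurations.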

## Files to Modify
1. `application/admin/model/History.php` - add the getPrediction method
2. `application/admin/controller/History.php` - add the predict endpoint and update noNeedRight
3. `public/assets/js/backend/history.js` - add the prediction dialog and rendering logic

@@ -0,0 +1,37 @@
---
status: complete
created: 2026-04-30
slug: history-predict
---

# Number Prediction Feature Complete

## Implementation

### 1. Backend Model (History.php)
Added the `getPrediction($periods, $weights)` method:
- Computes a composite prediction score from 6 dimensions
- Zone transition, zodiac transition, tail digit transition, head digit transition, color wave transition, hot/cold coefficient
- Returns the Top 20 predicted numbers with a detailed score breakdown

### 2. Backend Controller (History.php)
Added the `predict()` endpoint:
- Supports AJAX requests
- Configurable number of draws and weight parameters
- Added to the `noNeedRight` whitelist

### 3. Frontend JS (history.js)
Added the prediction feature:
- `showPredictDialog()` - prediction dialog
- `queryPredict()` - AJAX query
- `renderPredict()` - result rendering
- Supports custom weight configuration

### 4. View (index.html)
Added a "Smart Prediction" button

## File Changes
- `application/admin/model/History.php` (+150 lines)
- `application/admin/controller/History.php` (+25 lines)
- `public/assets/js/backend/history.js` (+180 lines)
- `application/admin/view/history/index.html` (+1 line)

@@ -0,0 +1,39 @@
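## Analysis of Correlations Between Regular Numbers and the Special Number

### Data Scope
- Total draws: about 500 (2025111-2026120)
- Per-draw data: num1-6 (regular numbers) + num7 (special number)

### Analysis Dimensions

#### 1. Difference Between the Regular-Number Mean and the Special Number
Analyze the distribution of the difference between each draw's regular-number mean (avg) and its special number (num7):
- Difference = num7 - avg(num1-6)
- Tally the most frequent difference ranges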
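
The difference statistic can be sketched as follows, assuming each draw is given as a plain list `[num1, ..., num6, num7]`. The function names and the bucket width of 5 are illustrative choices, not values from the analysis plan.

```php
<?php
/**
 * Difference = num7 - avg(num1-6) for one draw given as [num1, ..., num6, num7].
 */
function specialMinusMean(array $draw): float
{
    $regular = array_slice($draw, 0, 6);

    return $draw[6] - array_sum($regular) / count($regular);
}

/**
 * Hypothetical histogram: group the differences into buckets of $bucketWidth
 * and sort the buckets by frequency, most frequent first.
 *
 * @return array map of bucket start => number of draws in [start, start + width)
 */
function differenceHistogram(array $draws, int $bucketWidth = 5): array
{
    $histogram = [];
    foreach ($draws as $draw) {
        $diff = specialMinusMean($draw);
        // floor() keeps negative differences in the correct bucket (e.g. -27.3 => -30)
        $bucketStart = (int) (floor($diff / $bucketWidth) * $bucketWidth);
        $histogram[$bucketStart] = ($histogram[$bucketStart] ?? 0) + 1;
    }
    arsort($histogram);

    return $histogram;
}
```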
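
#### 2. Regular-Number Range vs. the Special Number
Analyze the relationship between the [min, max] range of the regular numbers and the special number:
- Does the special number fall inside the regular-number range?
- Distribution of the special number's distance from that range

#### 3. Distance Between the Sorted Regular Numbers and the Special Number
Sort num1-6 and analyze the distance between the special number and the nearest regular number:
- Distribution of the shortest distance
- Does the special number equal one of the regular numbers?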
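
Dimensions 2 and 3 reduce to a few per-draw features, sketched below under the assumption that the regular numbers and the special number are passed separately. The function name and the result keys are hypothetical.

```php
<?php
/**
 * Hypothetical per-draw features for the range and nearest-distance analyses.
 *
 * @param int[] $regular the six regular numbers (num1-6), in any order
 * @param int   $special the special number (num7)
 */
function rangeAndDistance(array $regular, int $special): array
{
    $min = min($regular);
    $max = max($regular);

    // Distance to the [min, max] range; 0 when the special number lies inside it
    if ($special < $min) {
        $distanceToRange = $min - $special;
    } elseif ($special > $max) {
        $distanceToRange = $special - $max;
    } else {
        $distanceToRange = 0;
    }

    // Shortest distance to any regular number; 0 means the special number equals one
    $distances = array_map(function (int $n) use ($special): int {
        return abs($n - $special);
    }, $regular);

    return [
        'in_range' => $distanceToRange === 0,
        'distance_to_range' => $distanceToRange,
        'nearest_distance' => min($distances),
    ];
}
```

Collecting these features over all draws and tallying each key gives the distributions the two sections ask for.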
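
#### 4. Tail Digit of the Sum vs. Tail Digit of the Special Number
Analyze the relationship between the tail digit of the regular-number sum and the tail digit of the special number:
- Probability that the two tail digits match?
- Distribution of the difference?

#### 5. Regular-Number Zone Coverage
Divide 1-49 into 5 zones and analyze the relationship between the zones covered by the regular numbers and the zone of the special number:
- Does the special number appear in a zone the regular numbers did not cover?

#### 6. Color Wave / Zodiac Correlation
Analyze the relationship between the count of each color wave/zodiac among the regular numbers and the color wave/zodiac of the special number

---

### Data Extraction Preparation
Extract all INSERT rows from the SQL for statistical analysis.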