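916117771 8b2590c5b5 docs(predictV3): add predictV3 algorithm optimization research document and front-end feature implementation

- Complete Phase 11: predictV3 algorithm optimization research document, covering technical analysis of 6 optimization directions
- Implement confidence evaluation, providing confidence metrics for historical hit rate, score distribution, and multi-dimensional consistency
- Extend the backtest metric set with ranking-quality metrics: NDCG@K, MRR, and hit-rate distribution
- Optimize the transition probability algorithm by introducing a second-order Markov chain and multi-attribute joint transitions to improve prediction accuracy
- Design a weight training mechanism that supports grid search and genetic algorithms for data-driven parameter optimization
- Integrate combination feature mining, using association rules and sequence patterns to discover latent relationships between numbers
- Implement a complete front-end interface supporting prediction display, confidence display, and backtest verification
- Establish performance optimization strategies, including precomputed caching, batch computation, and a degradation strategy to protect response time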

2026-05-01 23:17:24 +08:00


field	value
phase	11-predictv3
plan	
type	execute
wave	
depends_on	
files_modified	application/admin/model/History.php
autonomous	true
requirements	PRED-01
must_haves	truths, artifacts, key_links

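truths

Users can see a confidence percentage for each predicted number
Users can see the overall confidence of the Top 5 predictions
Confidence is computed from three dimensions: historical hit rate, score concentration, and score distribution
The system provides a reasonable confidence estimate when data is insufficient

artifacts

path	provides	contains
application/admin/model/History.php	Confidence calculation methods	_calculateConfidence\|_getHistoricalHitRateByRank\|_getScoreDistributionConfidence\|_getScoreConcentration

key_links

from	to	via
getPredictionV3	_calculateConfidence	method call before return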

Phase 11 - Plan 02: Confidence Evaluation Implementation

Objective

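Add a confidence evaluation to the prediction results so that users can judge how reliable a prediction is. Confidence is computed from three dimensions: historical rank hit rate, score distribution, and score concentration.

Purpose: The current prediction results contain only a rank and a score, with no confidence metric. Users cannot tell how trustworthy a prediction is; a confidence evaluation gives them a concrete basis for their decisions.

Output: New confidence calculation methods in History.php, and a confidence field added to the result returned by getPredictionV3.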
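As a worked instance of the weighting described in Task 1 (weights 0.4, 0.3, 0.3), take a rank-1 number with illustrative, assumed dimension values $h = 0.20$ (historical rank hit rate), $d = 1.00$ (score distribution), and $c = 1.00$ (score concentration). These inputs are invented for the example and are not measured results.

```latex
\[
\text{confidence} = 0.4\,h + 0.3\,d + 0.3\,c
                  = 0.4(0.20) + 0.3(1.00) + 0.3(1.00)
                  = 0.68
\]
```

Expressed as a percentage this is 68%, which falls in the 50-70% medium band of the thresholds defined in Task 1.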

Tasks

Task 1: Implement the core confidence calculation methods (with explicit dimension definitions and a data volume check)

<read_first>

D:\code\php\amlhc\application\admin\model\History.php (lines 2436-2444, getPredictionV3 return statement)
D:\code\php\amlhc\application\admin\model\History.php (lines 3495-3556, _runBacktestV3 method) </read_first>

Add the following confidence calculation methods at the end of the class in `History.php`:

/**
 * Calculate prediction confidence.
 *
 * Confidence is a weighted average of three dimensions:
 * - Dimension 1: historical rank hit rate (weight 0.4) - hit rate of each rank position, taken from backtest data
 * - Dimension 2: score distribution confidence (weight 0.3) - position of the number's score within the Top 5 score range
 * - Dimension 3: score concentration (weight 0.3) - gap between the score and the Top 5 average; the larger the gap, the higher the confidence
 *
 * Weighted formula:
 * confidence = 0.4 * historical_hit_rate + 0.3 * score_distribution + 0.3 * score_concentration
 *
 * Thresholds:
 * - High confidence: >= 70% (shown in green)
 * - Medium confidence: 50-70% (shown in orange)
 * - Low confidence: < 50% (shown in red)
 *
 * @param array $predictions Prediction results (Top 5, sorted by score descending)
 * @param array|null $backtest Backtest results
 * @param array|null $scoresAll Score details for all numbers (optional, intended for the concentration calculation; not used yet)
 * @param int $minDataThreshold Minimum number of backtest periods, default 50
 * @return array {confidence_scores: [], overall_confidence: float, data_warning: string|null}
 */
private function _calculateConfidence($predictions, $backtest, $scoresAll = null, $minDataThreshold = 50)
{
    // Data volume check
    $dataWarning = null;
    $totalTests = $backtest['total_tests'] ?? 0;
    $hasBacktest = $backtest && !empty($backtest['details']) && $totalTests > 0;

    if (!$hasBacktest || $totalTests < $minDataThreshold) {
        $dataWarning = 'Insufficient backtest data (' . $totalTests . ' periods); confidence is an estimate. '
            . 'At least ' . $minDataThreshold . ' periods are recommended';
    }

    $confidenceScores = [];

    // Average Top 5 score (used by the concentration calculation)
    $avgScore = 0;
    if (!empty($predictions)) {
        $totalScore = array_sum(array_column($predictions, 'score'));
        $avgScore = $totalScore / count($predictions);
    }

    foreach ($predictions as $idx => $pred) {
        $rank = $idx + 1;
        $num = $pred['num'];
        $score = $pred['score'];

        // Dimension 1: historical rank hit rate (weight 0.4)
        $rankHitRate = $this->_getHistoricalHitRateByRank($rank, $backtest);

        // Dimension 2: score distribution confidence (weight 0.3) - position within the Top 5 score range
        $scoreDistribution = $this->_getScoreDistributionConfidence($score, $predictions);

        // Dimension 3: score concentration (weight 0.3) - gap to the average relative to the top score's gap
        $scoreConcentration = $this->_getScoreConcentration($score, $avgScore, $predictions);

        // Combined confidence (weighted average)
        $overallConfidence = $rankHitRate * 0.4 + $scoreDistribution * 0.3 + $scoreConcentration * 0.3;

        $confidenceScores[] = [
            'num' => $num,
            'rank' => $rank,
            'confidence' => round($overallConfidence * 100, 1),
            'rank_hit_rate' => round($rankHitRate * 100, 1),
            'score_distribution' => round($scoreDistribution * 100, 1),
            'score_concentration' => round($scoreConcentration * 100, 1)
        ];
    }

    // Overall confidence (mean of the Top 5 confidence values)
    $overallConfidence = count($confidenceScores) > 0
        ? round(array_sum(array_column($confidenceScores, 'confidence')) / count($confidenceScores), 1)
        : 0;

    return [
        'confidence_scores' => $confidenceScores,
        'overall_confidence' => $overallConfidence,
        'data_warning' => $dataWarning
    ];
}

/**
 * Get the hit rate for a given rank position.
 *
 * Calculation:
 * - With backtest data: historical hits at this rank / total number of tests
 * - Without backtest data: estimate from the rank; a higher rank gives higher confidence
 *   Estimation formula: 1 - (rank - 1) * 0.15, so rank 1 is estimated at 100% and rank 5 at 40%
 *
 * @param int $rank Rank position (1-5)
 * @param array|null $backtest Backtest results
 * @return float Historical hit rate for this rank (0-1)
 */
private function _getHistoricalHitRateByRank($rank, $backtest)
{
    if (!$backtest || empty($backtest['details']) || empty($backtest['total_tests'])) {
        // No backtest data: estimate from the rank (a higher rank gives higher confidence)
        // Estimation formula: 1 - (rank - 1) * 0.15
        // Rank 1: 1.0, rank 2: 0.85, rank 3: 0.70, rank 4: 0.55, rank 5: 0.40
        return max(0, 1 - ($rank - 1) * 0.15);
    }

    // Count historical hits per rank
    $rankHits = array_fill(1, 5, 0);
    foreach ($backtest['details'] as $detail) {
        if ($detail['hit'] && $detail['rank'] >= 1 && $detail['rank'] <= 5) {
            $rankHits[$detail['rank']]++;
        }
    }

    // Ranks outside 1-5 have no counter and count as zero hits
    return ($rankHits[$rank] ?? 0) / $backtest['total_tests'];
}

/**
 * Calculate score distribution confidence.
 *
 * Calculation:
 * - score ratio = (score - bottomScore) / (topScore - bottomScore)
 * - The closer the score is to the top score, the higher the confidence
 * - Returns 1 when all scores are equal
 *
 * @param float $score Score of the current number
 * @param array $predictions All prediction results (sorted by score descending)
 * @return float Score confidence (0-1)
 */
private function _getScoreDistributionConfidence($score, $predictions)
{
    if (empty($predictions)) return 0;

    $topScore = $predictions[0]['score'];
    $bottomScore = end($predictions)['score'];

    if ($topScore == $bottomScore) return 1; // all scores are equal

    // Score ratio: (score - bottom) / (top - bottom)
    $ratio = ($score - $bottomScore) / ($topScore - $bottomScore);
    return max(0, min(1, $ratio));
}

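/**
 * Calculate score concentration.
 *
 * Calculation:
 * - concentration = 0.5 if all scores are equal (topScore == avgScore)
 * - concentration = (score - avgScore) / (topScore - avgScore) if score > avgScore
 * - concentration = 0 if score <= avgScore
 * - The larger the gap between the top score and the average, the clearer the separation between predictions
 *
 * @param float $score Score of the current number
 * @param float $avgScore Average Top 5 score
 * @param array $predictions All prediction results (sorted by score descending)
 * @return float Concentration confidence (0-1)
 */
private function _getScoreConcentration($score, $avgScore, $predictions)
{
    if (empty($predictions)) return 0;

    $topScore = $predictions[0]['score'];

    // Top score equal to the average means all scores are equal: concentration is 0.5.
    // Checked first because such scores also satisfy $score <= $avgScore.
    if ($topScore == $avgScore) {
        return $score == $topScore ? 0.5 : 0;
    }

    // Scores at or below the average have zero concentration
    if ($score <= $avgScore) {
        return 0;
    }

    // concentration = (score - avg) / (top - avg)
    $concentration = ($score - $avgScore) / ($topScore - $avgScore);
    return max(0, min(1, $concentration));
}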

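Implementation notes:

Dimension rename: "multi-dimensional consistency" becomes "score concentration", which states more clearly that it measures the gap between the top score and the average score
Explicit weighted formula: confidence = 0.4 * historical hit rate + 0.3 * score distribution + 0.3 * score concentration
Data volume check: return a warning when fewer than 50 backtest periods are available
Explicit thresholds: >= 70% high, 50-70% medium, < 50% low
No-data fallback: use the estimation formula when backtest data is missing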
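To make the two score-based dimensions concrete, the sketch below restates their formulas as standalone functions and evaluates them on an invented Top 5 score list. The function names `scoreDistribution` and `scoreConcentration` and the sample scores are assumptions for illustration; they are not part of History.php. The all-equal case is checked first so that its 0.5 value is reachable.

```php
<?php
// Standalone restatement of the two score-based dimensions, for illustration only.
// Assumes $scores is sorted in descending order, as the Top 5 list is.

function scoreDistribution(float $score, array $scores): float
{
    $top = $scores[0];
    $bottom = $scores[count($scores) - 1];
    if ($top == $bottom) {
        return 1.0; // all scores are equal
    }
    // Position of the score within the [bottom, top] range, clamped to 0-1
    return max(0.0, min(1.0, ($score - $bottom) / ($top - $bottom)));
}

function scoreConcentration(float $score, array $scores): float
{
    $top = $scores[0];
    $avg = array_sum($scores) / count($scores);
    if ($top == $avg) {
        return 0.5; // all scores are equal, so there is no separation
    }
    if ($score <= $avg) {
        return 0.0; // at or below the average
    }
    // Gap to the average, relative to the top score's gap, clamped to 0-1
    return max(0.0, min(1.0, ($score - $avg) / ($top - $avg)));
}

// Invented sample scores, not real prediction output
$scores = [90.0, 80.0, 70.0, 60.0, 50.0];
foreach ($scores as $s) {
    printf(
        "score %.0f: distribution=%.2f concentration=%.2f\n",
        $s,
        scoreDistribution($s, $scores),
        scoreConcentration($s, $scores)
    );
}
```

With evenly spaced scores like these, the lowest-ranked number scores 0 on both dimensions, so its confidence comes entirely from the 0.4-weighted hit rate term.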

<acceptance_criteria>

grep regex match: _calculateConfidence\s*\( exists in History.php
grep regex match: _getHistoricalHitRateByRank\s*\( exists in History.php
grep regex match: _getScoreDistributionConfidence\s*\( exists in History.php
grep regex match: _getScoreConcentration\s*\( exists in History.php (replaces the former _getDimensionConsistency)
grep match: minDataThreshold exists in the method (data volume threshold)
grep match: score_concentration exists in the return structure (replaces the former consistency)
Every method has a function-level comment, and the comments document the weighted formula </acceptance_criteria>

Task 2: Call the confidence calculation from getPredictionV3 (passing the data volume threshold)

<read_first>

D:\code\php\amlhc\application\admin\model\History.php (lines 2413-2444, return section of getPredictionV3) </read_first>

In the `getPredictionV3` method, locate the following code (around lines 2413-2444):

// ====== 10. Historical backtest verification ======
$backtest = $skipBacktest ? null : $this->_runBacktestV3($periods, $weights, $backtestCount, $cutoffTime);

// Compute hit information
$hitInfo = null;
...
return [
    'predictions' => $predictions,
    ...
];

Insert the confidence calculation after $backtest is computed and before $hitInfo is computed:

// ====== 10. Historical backtest verification ======
$backtest = $skipBacktest ? null : $this->_runBacktestV3($periods, $weights, $backtestCount, $cutoffTime);

// ====== 11. Confidence evaluation (new) ======
// Minimum data volume threshold is 50 periods; below that, confidence is an estimate
$minDataThreshold = 50;
$confidence = $this->_calculateConfidence($predictions, $backtest, null, $minDataThreshold);

// Compute hit information
$hitInfo = null;
...

Then extend the return structure with a confidence field:

return [
    'predictions' => $predictions,
    'last_special' => $lastSpecial,
    'last_expect' => $lastExpect,
    'analysis' => $analysis,
    'actual_result' => $actualResult,
    'hit_info' => $hitInfo,
    'backtest' => $backtest,
    'confidence' => $confidence  // new confidence field
];

<acceptance_criteria>

grep match: _calculateConfidence is called inside the getPredictionV3 method
grep match: $minDataThreshold exists in getPredictionV3
grep match: 'confidence' exists in the getPredictionV3 return structure
The confidence calculation runs after the backtest verification </acceptance_criteria>
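A consumer of the new `confidence` field might map each percentage to the display tiers defined in Task 1 (>= 70% high, 50-70% medium, < 50% low). The sketch below is one way to do that; `confidenceTier`, `summarizeConfidence`, and the sample values are hypothetical and only the field names come from the plan.

```php
<?php
// Hypothetical helper: map a confidence percentage to the plan's display tiers.
function confidenceTier(float $percent): string
{
    if ($percent >= 70) {
        return 'high';   // shown in green
    }
    if ($percent >= 50) {
        return 'medium'; // shown in orange
    }
    return 'low';        // shown in red
}

// Hypothetical helper: turn the 'confidence' field into display lines.
function summarizeConfidence(array $confidence): array
{
    $lines = [];
    foreach ($confidence['confidence_scores'] as $row) {
        $lines[] = sprintf(
            '#%d num %s: %.1f%% (%s)',
            $row['rank'],
            $row['num'],
            $row['confidence'],
            confidenceTier($row['confidence'])
        );
    }
    // Surface the data volume warning when one was returned
    if ($confidence['data_warning'] !== null) {
        $lines[] = 'warning: ' . $confidence['data_warning'];
    }
    return $lines;
}

// Invented sample in the shape returned by _calculateConfidence
$confidence = [
    'confidence_scores' => [
        ['num' => 7, 'rank' => 1, 'confidence' => 72.5],
        ['num' => 23, 'rank' => 2, 'confidence' => 55.0],
        ['num' => 41, 'rank' => 3, 'confidence' => 31.2],
    ],
    'overall_confidence' => 52.9,
    'data_warning' => null,
];

foreach (summarizeConfidence($confidence) as $line) {
    echo $line . PHP_EOL;
}
```

Keeping the tier boundaries in one function means a value of exactly 70 is always classified as high, matching the ">= 70%" wording of the thresholds.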

Verification

Call the prediction endpoint and verify that the confidence fields are returned:

curl -s "http://127.0.0.1:8000/admin/history/predictV3?periods=200&backtest=10" | grep -E "confidence|overall_confidence|confidence_scores|score_concentration"

Expected result: the returned JSON contains the confidence, overall_confidence, confidence_scores, and score_concentration fields.
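A grep on raw text only shows that the strings occur somewhere in the output. As a stricter alternative, the following sketch decodes the JSON and reports which of the four expected keys are missing. It searches keys at any depth because the response envelope of the endpoint is not specified here; `collectKeys`, `missingConfidenceFields`, and the sample JSON are hypothetical.

```php
<?php
// Recursively collect every string key that appears anywhere in a decoded JSON array.
function collectKeys(array $data): array
{
    $keys = [];
    foreach ($data as $key => $value) {
        if (is_string($key)) {
            $keys[$key] = true;
        }
        if (is_array($value)) {
            $keys += collectKeys($value);
        }
    }
    return $keys;
}

// Return the expected confidence fields that are absent from the JSON document.
function missingConfidenceFields(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        return ['(invalid JSON)'];
    }
    $present = collectKeys($decoded);
    $required = ['confidence', 'overall_confidence', 'confidence_scores', 'score_concentration'];
    $missing = [];
    foreach ($required as $field) {
        if (!isset($present[$field])) {
            $missing[] = $field;
        }
    }
    return $missing;
}

// Invented sample response; the real envelope may differ.
$sample = '{"data":{"confidence":{"overall_confidence":52.9,'
    . '"confidence_scores":[{"num":7,"score_concentration":100}],"data_warning":null}}}';

$missing = missingConfidenceFields($sample);
echo $missing === [] ? 'all confidence fields present' : 'missing: ' . implode(', ', $missing);
echo PHP_EOL;
```

In practice the output of the curl command above could be passed to `missingConfidenceFields` in place of `$sample`.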

Success Criteria

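_calculateConfidence and its 3 helper methods are implemented
The confidence dimensions are named: historical rank hit rate, score distribution, score concentration
The weighted formula is stated in the comments: confidence = 0.4 * history + 0.3 * distribution + 0.3 * concentration
A data volume check is in place and returns a warning below 50 periods
The getPredictionV3 return structure contains the confidence field
Every new method has a function-level comment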

Output

After completion, create .planning/phases/11-predictv3/11-02-SUMMARY.md
