TradeArena Leaderboard Registry

Generated from redacted leaderboard submission manifests. Raw provider prompts, responses, credentials, and private portfolio data are not included.

Entry | Scenario | Agent | Prompt | Feedback | Evidence | Claim | Tier | Parse | Data | Return | Max DD | Fill | Rejected | Risk edits | Audit | Badges | Details
ta-109a118ee5d7anonymous_entry_synthetic_stress_v0_1anonymous / anonymous-redacted-policyweights_onlytruestress-onlyexternal-submittedredacted-promptbenchmarkstress-benchmark0.9420synthetic-market (daily, 3 symbols)0.0184-0.02610.81259311.0000ReproducibleRedacted
Model redacted: True
Claim scope: anonymous external submission under stress-only execution
Source: examples/benchmark_submissions/anonymous_entry_redacted_submission.json
Hash: sha256:109a118ee5d70ca663873c613da8caef7802ceb2f80a45df7b05f48e25ecced9
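In the entries listed here, the entry ID appears to be `ta-` followed by the first 12 hex digits of the manifest hash. The sketch below checks that correspondence. It is an illustration only: the function names and the 12-character prefix length are assumptions inferred from the listed values, not a documented TradeArena API, and the sketch does not recompute the hash from the manifest file.

```python
import re

# Matches a hash field such as "sha256:<64 lowercase hex digits>".
HASH_RE = re.compile(r"^sha256:([0-9a-f]{64})$")


def entry_id_from_hash(hash_field: str, prefix_len: int = 12) -> str:
    """Derive the expected entry ID from a 'sha256:...' hash field.

    Assumption: the ID is 'ta-' plus the first `prefix_len` hex digits,
    as observed in the registry rows (hypothetical helper).
    """
    match = HASH_RE.match(hash_field.strip())
    if match is None:
        raise ValueError(f"not a sha256 hash field: {hash_field!r}")
    return "ta-" + match.group(1)[:prefix_len]


def id_matches_hash(entry_id: str, hash_field: str) -> bool:
    """Return True when the entry ID agrees with the hash prefix."""
    return entry_id == entry_id_from_hash(hash_field)


if __name__ == "__main__":
    first_hash = "sha256:109a118ee5d70ca663873c613da8caef7802ceb2f80a45df7b05f48e25ecced9"
    print(id_matches_hash("ta-109a118ee5d7", first_hash))
```

A mismatch between the ID and the hash prefix would suggest that a row and its details block were paired incorrectly.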
ta-aad1948b44bfcrisis_scene_llm_redacted_examplepoe / frontier-chat-model-redactedrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark0.9670yahoo-finance-csv (hourly, 3 symbols)0.0108-0.01870.7816281961.0000ReproducibleRedacted
Model redacted: True
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/example_llm_redacted_submission.json
Hash: sha256:aad1948b44bf9d607641a8b84455224c87d5bcc6446a8acf5bf2fc8f81f29ff0
ta-ed2d5e4f2ff3quickstart_core_synthetic_v0_1deterministic / signal-weighted-baselinenonetruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.3508-0.01260.9034141241.0000ReproducibleRedacted
Model redacted: True
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/example_redacted_submission.json
Hash: sha256:ed2d5e4f2ff3c87513c4b72ab96a375369150877e6771e7a86b5baa911e9c138
ta-0a4d0479945dleaderboard_llm_calm_trend_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_always_hold__seed_11.json
Hash: sha256:0a4d0479945d620fa4a6558eada3d815242f670a88cc099653bbafad18937b49
ta-e6b53c235779leaderboard_llm_calm_trend_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_always_hold__seed_17.json
Hash: sha256:e6b53c2357796f84ae7e9134a7ec5d87f29f9ccbda89513d5eb9ae299ec440da
ta-0d0673c4df95leaderboard_llm_calm_trend_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_always_hold__seed_23.json
Hash: sha256:0d0673c4df95ec1d80a708921d5295eb172e26e8703e6bb69645d815e6fff981
ta-bdad4e3ca968leaderboard_llm_calm_trend_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_always_hold__seed_31.json
Hash: sha256:bdad4e3ca9680dd061973d21fcb7e18a0dc29ba6a756af66abab08647fef6746
ta-70cc7102f89eleaderboard_llm_calm_trend_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_always_hold__seed_7.json
Hash: sha256:70cc7102f89ea2e707c0f9a6e684b76ffe543abbc3e3cc29222cb2834a4eebee
ta-43260ef9e93eleaderboard_llm_calm_trend_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0174-0.00180.7333201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_random__seed_11.json
Hash: sha256:43260ef9e93e709396975af40b15adfb61dc06ac61e1777968e5f67c4522dd06
ta-3ae7ff9c8fffleaderboard_llm_calm_trend_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0098-0.00230.7333201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_random__seed_17.json
Hash: sha256:3ae7ff9c8ffff38edd76d9e697055fd640837c94010fd3a72ce087bd70583ab0
ta-7fc2c1ac4801leaderboard_llm_calm_trend_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0362-0.00120.6667201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_random__seed_23.json
Hash: sha256:7fc2c1ac4801de38cacd90d1fa1a506c8c1721fe5c48a682cd767cabd588a0d2
ta-cf42e362917aleaderboard_llm_calm_trend_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0195-0.00120.7778001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_random__seed_31.json
Hash: sha256:cf42e362917a88acf59925ca1d65599bbde3caf1adf95a71efaec9b480b6e0be
ta-1aae0b97e8b7leaderboard_llm_calm_trend_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0031-0.00690.7857101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__baseline_random__seed_7.json
Hash: sha256:1aae0b97e8b7aa1145e714bb5c9106e126bda5189fbe29b22e1dd2cb63080b2b
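Because the random baseline is listed once per seed, a per-scenario summary can be easier to compare against the single-run provider entries. The sketch below averages the five calm_trend random-baseline rows above. It assumes that the two four-decimal values following the data description are Return and Max DD, in that order; the variable and function names are illustrative.

```python
from statistics import mean

# (seed, return, max drawdown) copied from the five calm_trend
# baseline / random rows; the column assignment is an assumption.
RANDOM_BASELINE_CALM_TREND = [
    (11, 0.0174, -0.0018),
    (17, 0.0098, -0.0023),
    (23, 0.0362, -0.0012),
    (31, 0.0195, -0.0012),
    (7, 0.0031, -0.0069),
]


def summarize(rows):
    """Aggregate per-seed results into a mean return and worst drawdown."""
    returns = [ret for _, ret, _ in rows]
    drawdowns = [dd for _, _, dd in rows]
    return {
        "n_seeds": len(rows),
        "mean_return": round(mean(returns), 4),
        "worst_drawdown": min(drawdowns),  # most negative value
    }


if __name__ == "__main__":
    print(summarize(RANDOM_BASELINE_CALM_TREND))
```

With only five seeds per scenario, the mean is a rough reference point rather than a statistically robust estimate.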
ta-06927d1dc59cleaderboard_llm_calm_trend_synthetic_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0257-0.00080.83330121.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__deepseek_deepseek_v4_flash.json
Hash: sha256:06927d1dc59c5d5c1fd56aa60226c55cd251643347f0a31412483749c417b855
ta-14378629b078leaderboard_llm_calm_trend_synthetic_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0240-0.00050.6667141.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__deepseek_deepseek_v4_pro.json
Hash: sha256:14378629b078648470eeeca89976bdfbfccfdc7033e14a1693c6183f6118f320
ta-4b68d43caa14leaderboard_llm_calm_trend_synthetic_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0309-0.00080.84620121.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__poe_claude_opus_4_7.json
Hash: sha256:4b68d43caa1450e078562b14ce58495e1aacb6aefc71f6c9756131f4fcf2da51
ta-06ea62894f11leaderboard_llm_calm_trend_synthetic_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0253-0.00080.6667261.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__poe_gemini_3_1_pro.json
Hash: sha256:06ea62894f112c7f946fc6f9a611ccc444d932b899901c9ebe7efc41bb8bdd4b
ta-4e7b44b87eb3leaderboard_llm_calm_trend_synthetic_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0257-0.00080.83330121.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__poe_glm_5.json
Hash: sha256:4e7b44b87eb3b6d0d8ea4495f46c7a3a29b4c2a1689c8c34a0cfc3074a606766
ta-1a6cea67f2ebleaderboard_llm_calm_trend_synthetic_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0288-0.00080.83330111.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__poe_gpt_5_5.json
Hash: sha256:1a6cea67f2ebbc9cb0a5375e05852dcc75fb3e1b8041f606a0f448cc93e18fd5
ta-e3e0649b2c2bleaderboard_llm_calm_trend_synthetic_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0319-0.00080.75001111.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/calm_trend__poe_kimi_k2_5.json
Hash: sha256:e3e0649b2c2baae0e0fb580f079af69bee241d1820376430cc06ed5d368c0066
ta-20c5743cca23leaderboard_llm_high_vol_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_always_hold__seed_17.json
Hash: sha256:20c5743cca23bc3adfad33316cbd1e17914dfad144841e80e7f03bcac364962a
ta-05283aa56f64leaderboard_llm_high_vol_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_always_hold__seed_21.json
Hash: sha256:05283aa56f64bfd8abb14dba02509818b421f1019b5a7492f0cd44dcf10b0b92
ta-cf6987259a15leaderboard_llm_high_vol_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_always_hold__seed_27.json
Hash: sha256:cf6987259a158d7b5c33aa7611ea4d6d1a7d90e78ab21bed358552a2ed300064
ta-4dfc17c135a0leaderboard_llm_high_vol_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_always_hold__seed_33.json
Hash: sha256:4dfc17c135a0e1c2ff7e7c715722d2904de20678b09ded0ead3b3c5ff35671b7
ta-85914c9333ebleaderboard_llm_high_vol_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_always_hold__seed_41.json
Hash: sha256:85914c9333eb1b2436090f81d23c90d759b5f115265ac090b854bc16e32e268d
ta-56efa955c691leaderboard_llm_high_vol_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0141-0.01410.7333201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_random__seed_17.json
Hash: sha256:56efa955c69154b343ec2a083e11d476053f86c3ee9d5f0bdffc79b73721cc8c
ta-a2f838f37d16leaderboard_llm_high_vol_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0139-0.00220.7500301.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_random__seed_21.json
Hash: sha256:a2f838f37d1695fbfad592e2f9869f7f5efb7f595eb5832e3a5be76d230e9268
ta-9662fa012555leaderboard_llm_high_vol_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0238-0.02620.7857101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_random__seed_27.json
Hash: sha256:9662fa0125554fb00e583209281b52dd9e67069a3472b36ab873694d9486ac90
ta-0a73e46ec48bleaderboard_llm_high_vol_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0204-0.01790.8125101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_random__seed_33.json
Hash: sha256:0a73e46ec48b495c3adbc03b973c3ab67c6e9686bb246c0837be6c5437dc7860
ta-315a367cb5b1leaderboard_llm_high_vol_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0383-0.00090.8125101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__baseline_random__seed_41.json
Hash: sha256:315a367cb5b14bf1459643e1319d825134ec43108cc7117c41e1fb116c20cd7a
ta-817eef2f37a2leaderboard_llm_high_vol_synthetic_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0033-0.00750.76921101.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__deepseek_deepseek_v4_flash.json
Hash: sha256:817eef2f37a2024f091e6fdf02a054517680b7f212ccba64636b56364378d299
ta-05e101952ae2leaderboard_llm_high_vol_synthetic_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0043-0.00650.7500191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__deepseek_deepseek_v4_pro.json
Hash: sha256:05e101952ae24d095162eab0825c55527082b1f979982c7a75502c3935912e4b
ta-8aaf791e5111leaderboard_llm_high_vol_synthetic_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0045-0.00580.7500181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__poe_claude_opus_4_7.json
Hash: sha256:8aaf791e5111bb3817f8ef20c8dd6551975587c68bab188c7d874eedb62539b7
ta-11a2de57a72dleaderboard_llm_high_vol_synthetic_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0144-0.00400.7500261.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__poe_gemini_3_1_pro.json
Hash: sha256:11a2de57a72d0449395c0bf0a50f78ac7f57d72ecbc1bf3c6699c5d1db60935b
ta-96447d495a71leaderboard_llm_high_vol_synthetic_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0030-0.00580.8462081.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__poe_glm_5.json
Hash: sha256:96447d495a712b930d8ada9f6c184271ded3e5fc6e87d31980ea136ec8cb6966
ta-e622727b4323leaderboard_llm_high_vol_synthetic_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0038-0.00650.7500181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__poe_gpt_5_5.json
Hash: sha256:e622727b432351dcf220bc4a812b614fe2a099d91f7423b9c9c527081b132f88
ta-bcde515dace4leaderboard_llm_high_vol_synthetic_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0037-0.00520.8462081.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/high_vol__poe_kimi_k2_5.json
Hash: sha256:bcde515dace4494efb6ec18fbaec68b891688fc915e3832d865db4c314650928
ta-d15a87438728leaderboard_llm_jump_tail_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_always_hold__seed_29.json
Hash: sha256:d15a87438728de761e003597c65c7c580cff1b6a78f4f3ad079f0f2fd8893834
ta-c353fa8d8aacleaderboard_llm_jump_tail_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_always_hold__seed_33.json
Hash: sha256:c353fa8d8aacc5b5ea87516f297c7172da5d8cf370df9d142517c0fb1f0c70e2
ta-3f4548fcfe1aleaderboard_llm_jump_tail_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_always_hold__seed_39.json
Hash: sha256:3f4548fcfe1a3e197dd1ad41ac1b07cb0e7b1dd24e1972b6e23fdc2668934853
ta-7f22806dea18leaderboard_llm_jump_tail_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_always_hold__seed_45.json
Hash: sha256:7f22806dea18d7b73257b411d6a5d750ee29adb351e43b526d9995f49d6df680
ta-836432d9b11aleaderboard_llm_jump_tail_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_always_hold__seed_53.json
Hash: sha256:836432d9b11adbcad2d8a42ffc1a71d0c0f073526149cfc79415b43078bb777e
ta-5cb2d1f11f6dleaderboard_llm_jump_tail_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0104-0.03420.7500201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_random__seed_29.json
Hash: sha256:5cb2d1f11f6dfd2e555c79dc5f0f71b1e2b26006c249a9a5642ab28bdc575b2c
ta-b441e1db5528leaderboard_llm_jump_tail_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0269-0.00870.8125101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_random__seed_33.json
Hash: sha256:b441e1db5528bf3f1fe727d50357ba039f5aa98d1e373c9c432bddb84bd293c4
ta-3b6992b6545dleaderboard_llm_jump_tail_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0437-0.05420.8125101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_random__seed_39.json
Hash: sha256:3b6992b6545d1e366f00589815a1c01d9b42100d2ac33b1215cfea0f4549020c
ta-a84ff6d7402fleaderboard_llm_jump_tail_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0286-0.04780.8125101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_random__seed_45.json
Hash: sha256:a84ff6d7402f725575ad3704f57f6537a7bc7d16ef16d0eb3456d11283f9cabb
ta-6825325cfdb0leaderboard_llm_jump_tail_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0942-0.00430.7500201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__baseline_random__seed_53.json
Hash: sha256:6825325cfdb071e85696bf681170db75ef232c3b081d4ef95894aedde038f72d
ta-99983feb3ad7leaderboard_llm_jump_tail_synthetic_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0167-0.02140.66673121.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__deepseek_deepseek_v4_flash.json
Hash: sha256:99983feb3ad7745410e61fdfd0a339aeb0bbc61d5a970457be68f04883f7b3cd
ta-30b9043e4dafleaderboard_llm_jump_tail_synthetic_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0167-0.02140.7692291.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__deepseek_deepseek_v4_pro.json
Hash: sha256:30b9043e4dafd573418092dfedb7a78b772f28bca64a5b234e8ca7ec3d5509c1
ta-0d773b86a41bleaderboard_llm_jump_tail_synthetic_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0167-0.02140.66673101.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__poe_claude_opus_4_7.json
Hash: sha256:0d773b86a41b201e3824c946197c2dfaab9c744fcde5158195ba92ad857b53e5
ta-5c0f219d4cealeaderboard_llm_jump_tail_synthetic_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0157-0.00750.6429391.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__poe_gemini_3_1_pro.json
Hash: sha256:5c0f219d4cea28712e7d0bc11c323943280c578d087acffc9f36b8f6af0b33df
ta-514d4157571dleaderboard_llm_jump_tail_synthetic_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0167-0.02140.71433111.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__poe_glm_5.json
Hash: sha256:514d4157571d384947bba0ea295da35096643d1ca9a16b20ebf6042c59426587
ta-50e42e87acb5leaderboard_llm_jump_tail_synthetic_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0167-0.02140.66673121.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__poe_gpt_5_5.json
Hash: sha256:50e42e87acb5866092c8059f76d9af57993deb02219c93b68e81ac9abe3da12a
ta-07e91c904d14leaderboard_llm_jump_tail_synthetic_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0167-0.02140.66673111.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/jump_tail__poe_kimi_k2_5.json
Hash: sha256:07e91c904d14031ed2eedd42dead47953a65e7603a8ff6015e9575c30e724063
ta-fd3ab766cf8eleaderboard_llm_latency_spike_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_always_hold__seed_65.json
Hash: sha256:fd3ab766cf8e028e48f327c867424833d5ea8ea4495792aaeed1fc14504f6840
ta-29d627fb2020leaderboard_llm_latency_spike_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_always_hold__seed_69.json
Hash: sha256:29d627fb2020db782d17a8b72728cb7b536d7aaec3951ab7eea2fd528a4c82f3
ta-828ff0cc0185leaderboard_llm_latency_spike_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_always_hold__seed_75.json
Hash: sha256:828ff0cc0185c46011bae40b05f4a04308f48947dc1ed65b4c1f00bbd2b7194e
ta-0803defa1f4eleaderboard_llm_latency_spike_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_always_hold__seed_81.json
Hash: sha256:0803defa1f4e25532c5a2f98205b31e04b125ae71dde35ae00a9414c490a2fa5
ta-a6890d44909fleaderboard_llm_latency_spike_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_always_hold__seed_89.json
Hash: sha256:a6890d44909ff2ab39ad4feab7b6d74a399ba3d3fd0bc3e6411f484f1debe541
ta-a59a34b1b3ffleaderboard_llm_latency_spike_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0197-0.01100.2143301.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_random__seed_65.json
Hash: sha256:a59a34b1b3ff18154a2c6c49f38542fa96371037902e9ed3b3f37da0fb53b4f1
ta-210515aca353leaderboard_llm_latency_spike_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0033-0.00390.1875501.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_random__seed_69.json
Hash: sha256:210515aca353d3f076f5f624fd3b72458a1a61feb8c80ec065f1e5c8210f4ac2
ta-181d6d3404bdleaderboard_llm_latency_spike_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0203-0.03460.2500401.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_random__seed_75.json
Hash: sha256:181d6d3404bd3b031d693be4c4d564239c07c7ce33c384d942aad4dca6bcc09f
ta-012721f5cc0fleaderboard_llm_latency_spike_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0475-0.00330.1875501.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_random__seed_81.json
Hash: sha256:012721f5cc0f604897f2e1fcd19669ec09cc92cdd03e8f594541667c844c6f19
ta-381c3fa51ad8leaderboard_llm_latency_spike_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0070-0.00700.3333201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__baseline_random__seed_89.json
Hash: sha256:381c3fa51ad8829c0fe4ae0171dabec4cec1e8c514a09622e74e5997028e5385
ta-ea537e97eec9leaderboard_llm_latency_spike_synthetic_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0010-0.01330.2727191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__deepseek_deepseek_v4_flash.json
Hash: sha256:ea537e97eec939685beeb77cb4beed02341db376e53bc5254c6e562d2e104c08
ta-c1a0fb026bddleaderboard_llm_latency_spike_synthetic_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0010-0.01330.2727191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__deepseek_deepseek_v4_pro.json
Hash: sha256:c1a0fb026bddee5eabc86cdf6201a1d7b40a5053949c59784d4eeb5736834250
ta-88f9614808b5leaderboard_llm_latency_spike_synthetic_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0010-0.01330.2727191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__poe_claude_opus_4_7.json
Hash: sha256:88f9614808b564cf60fedd3fbca8d5d913a86c46ec07bf82220d8b12ade92cfa
ta-6dc1d83bcc76leaderboard_llm_latency_spike_synthetic_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0329-0.00910.23083131.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__poe_gemini_3_1_pro.json
Hash: sha256:6dc1d83bcc76a051907218994fb623e1d7286df8fa8664adbf1b3b03ae86020d
ta-6dd5c84b9142leaderboard_llm_latency_spike_synthetic_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark0.8750synthetic-market (daily, 2 symbols)0.0010-0.01330.3000091.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__poe_glm_5.json
Hash: sha256:6dd5c84b914274571c34c5e89aca8e0a42baa7664f188959d7590e18f53bd8c2
ta-fb175dbf03ccleaderboard_llm_latency_spike_synthetic_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0010-0.01330.2727191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__poe_gpt_5_5.json
Hash: sha256:fb175dbf03cc28342ba82a2296241ae915c6b0db752137ebd9a4d62ecf4c509e
ta-d83fec4e3980leaderboard_llm_latency_spike_synthetic_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0010-0.01330.2727191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/latency_spike__poe_kimi_k2_5.json
Hash: sha256:d83fec4e3980482715bfe34d19b3765b66206a7df8d12e01d2a4db9b646d3d67
ta-9910a6a580f0leaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_always_hold__seed_41.json
Hash: sha256:9910a6a580f0e7ce7e0368e0551094e77b900e94a6c32c12ba3051ce4e781336
ta-0a0d435433d7leaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_always_hold__seed_45.json
Hash: sha256:0a0d435433d7c7222d0c1b81e55a7d795d593b8ae4c9440236abf1a5ea5ddac0
ta-5a12ea693cfdleaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_always_hold__seed_51.json
Hash: sha256:5a12ea693cfdac56ae45404dfc240283b9e2eb357d46f03c8d93b09fea96885c
ta-1df45ac7fc12leaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_always_hold__seed_57.json
Hash: sha256:1df45ac7fc129370790b2124042d2d7dbe942e5743f919421b25a019cb48bcce
ta-1b934080b980leaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_always_hold__seed_65.json
Hash: sha256:1b934080b980734cfeb56fb6b0b9e976349f8a15db837bffc9d0db0923034e4c
ta-8786d22669aaleaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0094-0.01650.8125101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_random__seed_41.json
Hash: sha256:8786d22669aa0e92d16c586e0c8c53a270d1452558760ec308e950b2e2f1c3d5
ta-3281dbcfb47aleaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0574-0.00570.8125101.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_random__seed_45.json
Hash: sha256:3281dbcfb47a6966a3391c0c8217b40c45fa6bbef6402d3fb2523628fe133b52
ta-20120db93e7eleaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0304-0.00880.8667001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_random__seed_51.json
Hash: sha256:20120db93e7eaf4b8c37432dd83f066a7a0faae26cd5a1056fdf234daec5eda2
ta-57c175c1aafaleaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0012-0.01410.8571001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_random__seed_57.json
Hash: sha256:57c175c1aafaa5ef18ff230882abbd605fdac07847bce2743d0cd5f7d75541d3
ta-f6a55649e5f6leaderboard_llm_liquidity_collapse_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0297-0.00900.7333201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__baseline_random__seed_65.json
Hash: sha256:f6a55649e5f6e1952ced589b86e5d0c1bd2297df9e08c9e2a4acdc9936ae38a4
ta-9700e6062d4bleaderboard_llm_liquidity_collapse_synthetic_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0442-0.00120.8000091.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__deepseek_deepseek_v4_flash.json
Hash: sha256:9700e6062d4b474e4b75a15f58f01608084464508abbbef9807375942083fba8
ta-16ce8dd0a233leaderboard_llm_liquidity_collapse_synthetic_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0236-0.01670.6250161.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__deepseek_deepseek_v4_pro.json
Hash: sha256:16ce8dd0a233d319fc3984a13e6024c0692cacccbf4e02ef3d2c110f14433b2b
ta-46b6ba61345aleaderboard_llm_liquidity_collapse_synthetic_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0442-0.00120.8000081.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__poe_claude_opus_4_7.json
Hash: sha256:46b6ba61345a587cc724ae55207f6383643577acbbb56e73071282f13397a8b7
ta-21b67d022098leaderboard_llm_liquidity_collapse_synthetic_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0209-0.01020.76921101.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__poe_gemini_3_1_pro.json
Hash: sha256:21b67d022098b525010682bb7d25854f52b01f1d163c5aceb2a2158fdbbf65ab
ta-5d82e02acbd9leaderboard_llm_liquidity_collapse_synthetic_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0150-0.03720.84620111.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__poe_glm_5.json
Hash: sha256:5d82e02acbd9a164b94d14fb5a7ae3a6a67832a73427ba58375ba3ab0c4a4b7f
ta-a4b7fa9d47d2leaderboard_llm_liquidity_collapse_synthetic_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0442-0.00120.8000091.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__poe_gpt_5_5.json
Hash: sha256:a4b7fa9d47d238c7729587ceed72dc28d8571a39170d90adff58a1719a347316
ta-4a456013198bleaderboard_llm_liquidity_collapse_synthetic_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0442-0.00120.8000091.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/liquidity_collapse__poe_kimi_k2_5.json
Hash: sha256:4a456013198b2779569bf94b150144f3b0dcc4c21e1cf60e68a74e86d68224d6
ta-f6d12eaf7bc6leaderboard_llm_spread_explosion_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_always_hold__seed_53.json
Hash: sha256:f6d12eaf7bc644ea3465f144c226ea058f478e474156958d1a06b71bd2d66306
ta-ea8e3471713cleaderboard_llm_spread_explosion_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_always_hold__seed_57.json
Hash: sha256:ea8e3471713cc1a994a25dae87a051f02009de413656bc3aaa79b9310a524cb6
ta-90ffb0a33779leaderboard_llm_spread_explosion_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_always_hold__seed_63.json
Hash: sha256:90ffb0a33779be056115d57464bbf2bd2ddd4ab194ae7539f29f0d51ceda030b
ta-e0bc6cfad842leaderboard_llm_spread_explosion_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_always_hold__seed_69.json
Hash: sha256:e0bc6cfad842b84395e7dedd18938b4773af9e744be27619734fb5eaad0eaf73
ta-1aa1a999eea5leaderboard_llm_spread_explosion_synthetic_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_always_hold__seed_77.json
Hash: sha256:1aa1a999eea5f773a61eef4bd39c025650098c473622665e493b2d4c2fdc764b
ta-41b6d20be85bleaderboard_llm_spread_explosion_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0007-0.03920.7500201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_random__seed_53.json
Hash: sha256:41b6d20be85b2c237ec9b8140eb59df1e79479ffe5eaf947bc1160e3edb8bd10
ta-63351965b1cfleaderboard_llm_spread_explosion_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0108-0.01370.8571001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_random__seed_57.json
Hash: sha256:63351965b1cfdb1786964410601ab87fd75fa45a24c052117eb114baf04a6f76
ta-218ecb3bb308leaderboard_llm_spread_explosion_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0263-0.04580.8750001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_random__seed_63.json
Hash: sha256:218ecb3bb308847c7680e38f60e7ab0de292e523d451a2d9344eafeb6f29aba6
ta-6ecc04dbeac2leaderboard_llm_spread_explosion_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0141-0.01220.8750001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_random__seed_69.json
Hash: sha256:6ecc04dbeac2b366fcea8e6dac63f9945fd036f43c3a50480d44fc3d707bd7bd
ta-e0cc2136ae91leaderboard_llm_spread_explosion_synthetic_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0137-0.01770.7500201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__baseline_random__seed_77.json
Hash: sha256:e0cc2136ae9183ca6dbb4ae277854071b92dd7522f56fd417477156f818c1afe
ta-3f07c8ab9963leaderboard_llm_spread_explosion_synthetic_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0155-0.04240.7692191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__deepseek_deepseek_v4_flash.json
Hash: sha256:3f07c8ab996333cd2453f3b7b2db76cb42e52d9a6894de08f6cfdb3ed5dfbe53
ta-e51d439c3fb3leaderboard_llm_spread_explosion_synthetic_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)0.0048-0.02060.8333141.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__deepseek_deepseek_v4_pro.json
Hash: sha256:e51d439c3fb379a071bbd1e72982a586c39a415d220585bbacf51c14148b6beb
ta-fe1be1a36c38leaderboard_llm_spread_explosion_synthetic_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0155-0.04240.7692191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__poe_claude_opus_4_7.json
Hash: sha256:fe1be1a36c38fc5a4ca851b2db5ba2fe5a0b496a252fb068dfe642a357efd731
ta-895143237d66leaderboard_llm_spread_explosion_synthetic_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0065-0.03340.7857191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__poe_gemini_3_1_pro.json
Hash: sha256:895143237d66885366809bd9ad86995312a5c136e822965ac0cfe5ec35840663
ta-886315ca40c5leaderboard_llm_spread_explosion_synthetic_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark0.8750synthetic-market (daily, 2 symbols)-0.0116-0.02690.7500181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__poe_glm_5.json
Hash: sha256:886315ca40c5b3196fbfb1a906376a777bc3e70a63ffdb6f41bc6293efaeeefa
ta-4cb4062b028aleaderboard_llm_spread_explosion_synthetic_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0155-0.04240.7692191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__poe_gpt_5_5.json
Hash: sha256:4cb4062b028a14852be60cf99ba6665d6dd8405044ffceae7c5ea2e1b253bc80
ta-0820021e1502leaderboard_llm_spread_explosion_synthetic_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000synthetic-market (daily, 2 symbols)-0.0007-0.02330.7500181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/model_matrix/spread_explosion__poe_kimi_k2_5.json
Hash: sha256:0820021e150264610213f88640e10a0ab2519ea4914a62403aba3d2b341ff1c1
ta-f85e3b63f63cleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_always_hold__seed_11.json
Hash: sha256:f85e3b63f63cbce7b774bc6b780b1af15b5edf5d0ca93c3f6a25264e50b9c9ab
ta-14330dea416cleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_always_hold__seed_17.json
Hash: sha256:14330dea416c278668f1e7849a9fdbfeeb57db6f3f7d1ba51d2c1e3d88b8cf0a
ta-ed60bebc2bacleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_always_hold__seed_23.json
Hash: sha256:ed60bebc2bac4e1f1ea4b9e66ef561a009c46d57a1b154c1db3cfa08e6c40efb
ta-493b31723e1aleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_always_hold__seed_31.json
Hash: sha256:493b31723e1aa3dcba513cf51cfd783c22aae061c622a34b572f305ab377f653
ta-bb97d5793bbfleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_always_hold__seed_7.json
Hash: sha256:bb97d5793bbfb0ef392f9919a89164f9cc18106a6d1457ef0f56ceb39a1b0f3f
ta-7aa11e67f69aleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0721-0.02260.7353601.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_random__seed_11.json
Hash: sha256:7aa11e67f69ac91c26333c66126c35925ee3b1caf5c27830a528206aab38b2ab
ta-6a1d2a467cd8leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1061-0.15850.8000401.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_random__seed_17.json
Hash: sha256:6a1d2a467cd8f4ba8686c350a869757c01a54d2d1971fb44058111756acf0c09
ta-40ca7e76523aleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0982-0.17320.6765801.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_random__seed_23.json
Hash: sha256:40ca7e76523ad47f72bd496a7da8340772e582349d8838ac623e59f992ee90ef
ta-9661fd5891a5leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0416-0.04630.8333201.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_random__seed_31.json
Hash: sha256:9661fd5891a5da906dc8ade2d2fa80d64129dba22794df31e618ed2d2b60d6a3
ta-480274e4c362leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0695-0.11160.7429601.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__baseline_random__seed_7.json
Hash: sha256:480274e4c362e189011e2b908c978a44aae67cc60c52e58c5d72c81eb20352d2
ta-b89707bd8492leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1014-0.16580.85191251.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_flash__seed_11.json
Hash: sha256:b89707bd849277cdf8776f4af235d603a67b961bf7d6ef717147e60427e000aa
ta-f2dfd5a8b623leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0212-0.05500.70375271.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_flash__seed_17.json
Hash: sha256:f2dfd5a8b6238186f04f9d9f8f464d49ac255527360b6ae73b2e38e7f196611c
ta-e1889c42a03eleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0897-0.10580.60876241.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_flash__seed_23.json
Hash: sha256:e1889c42a03e98b5be5b91bf45ad1c3ee83d296f9ffbc78e1f0fc94fbf7cd17b
ta-f08bc58ea950leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1346-0.19430.88003221.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_flash__seed_31.json
Hash: sha256:f08bc58ea95057405e8ea3d7cef8f00b6560b9ab6d91816c3626b1f2b5a2e463
ta-ea5d4d3b4effleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1601-0.20580.72734191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_flash__seed_7.json
Hash: sha256:ea5d4d3b4eff3796378b50b0623ab6f26fee4a241cc341ae52993ec833e8e809
ta-569966a71c3dleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1343-0.19630.73084231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_pro__seed_11.json
Hash: sha256:569966a71c3dfc2daf35bee61204dd10b428f29d1d65c9308ec553a4b62b4571
ta-0a4a0fab532eleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0012-0.05500.66675231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_pro__seed_17.json
Hash: sha256:0a4a0fab532eadafa8478075e92fc69b338fed601e3868186dd33d6ae32e31ac
ta-fea519cd8389leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0203-0.05270.60005231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_pro__seed_23.json
Hash: sha256:fea519cd8389993a75dc41fa894adb03bb374ef8d64cd348e828304c81f60612
ta-a378c6db80c2leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0144-0.05290.76195181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_pro__seed_31.json
Hash: sha256:a378c6db80c2cbd502b9f1113eeb6137d145074ba54f27a7092f413a645f03c2
ta-c7d453e14955leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1385-0.18540.72005231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__deepseek_deepseek_v4_pro__seed_7.json
Hash: sha256:c7d453e149555ca531d1cd714e83c6014307d56f879e09ca958b20c5d07a7490
ta-22ab171611c0leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1343-0.19630.73084231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_claude_opus_4_7__seed_11.json
Hash: sha256:22ab171611c072826b238ce1cc15c171366fe58b1351d19c1aafc488f5afdbef
ta-f4dee12a74b8leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0207-0.05500.75004281.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_claude_opus_4_7__seed_17.json
Hash: sha256:f4dee12a74b88dfe881a0619657ab3c063f84ddbbb28985199352651b4595769
ta-b70991c0a097leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0473-0.07600.70375271.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_claude_opus_4_7__seed_23.json
Hash: sha256:b70991c0a0979efb854bb5996cf35f26280300d10f296751125c401d5d77905b
ta-2d83546bf33dleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1039-0.16190.82145231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_claude_opus_4_7__seed_31.json
Hash: sha256:2d83546bf33def821d608c3e9ffbca0890738f76fd920573f152517e0ee15375
ta-a9db83b7afa7leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1385-0.18540.72005221.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_claude_opus_4_7__seed_7.json
Hash: sha256:a9db83b7afa7e61668ee0989d4699d560aec4cf51c7c949bf17014c37c2f4484
ta-5a4b4ac82a0dleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1343-0.19630.73084211.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gemini_3_1_pro__seed_11.json
Hash: sha256:5a4b4ac82a0d0da517fbbe4d8f95a0a10c4010b2722b7a28e4b37f6054f9bc6f
ta-5ea4bfedc3deleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0256-0.05500.65386221.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gemini_3_1_pro__seed_17.json
Hash: sha256:5ea4bfedc3de266b2a62f83f1c5d1f154f11f8a283ab571c5d4c5c190a9d5343
ta-95fb805c9921leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1277-0.18510.57698221.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gemini_3_1_pro__seed_23.json
Hash: sha256:95fb805c99219cab83f4c8632a3f576ee60a1668b794b5a80897dbee947df697
ta-7ddf880bf3ecleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0577-0.08810.80954181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gemini_3_1_pro__seed_31.json
Hash: sha256:7ddf880bf3ecc2f57562998f61359fc5b456f3afe44293c1867c492a972c66e6
ta-292c6fde1882leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1385-0.18540.68006211.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gemini_3_1_pro__seed_7.json
Hash: sha256:292c6fde1882bb9ccd9b5932978aa731488fab1fcc39d10181e56e4a7c19f682
ta-26e5eb0a6305leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1343-0.19630.73084231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_glm_5__seed_11.json
Hash: sha256:26e5eb0a630593e6c1ad5fb3383fc32e0ef3c9d526c30badfabe9c52d14112f3
ta-4954717bfa81leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0212-0.05500.70375271.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_glm_5__seed_17.json
Hash: sha256:4954717bfa810f0fa079609f0928deb6ddeb4067ca14f40144a6f41b2544f12b
ta-0876e7751807leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark0.9167yahoo-finance-csv (weekly, 3 symbols)0.0202-0.05500.66675271.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_glm_5__seed_23.json
Hash: sha256:0876e7751807f8bb022bdeba8ad3c4146e69658500633513908246bf19875c3a
ta-2e43162b667aleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark0.9167yahoo-finance-csv (weekly, 3 symbols)-0.1228-0.18410.78576241.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_glm_5__seed_31.json
Hash: sha256:2e43162b667af37195ee0d9073762bd1b9df1ff950930d800d052a4a5c63445c
ta-a4b9ecae134cleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1385-0.18540.72005231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_glm_5__seed_7.json
Hash: sha256:a4b9ecae134cf82e08792e02ec69caa87a3f1b9cde96f60f71babf0472f5302e
ta-9f33639094f7leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1343-0.19630.73084211.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gpt_5_5__seed_11.json
Hash: sha256:9f33639094f774c2f0a9a896f158509ae3b1ff083b83b977e2a5bca8e0e3c29e
ta-7b3f159e7656leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0212-0.05500.70375271.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gpt_5_5__seed_17.json
Hash: sha256:7b3f159e765622987b44ce741cbeec0840f668f21f3b97ad4fb8023c6b0a083e
ta-7f028c544df5leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0672-0.10070.61547281.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gpt_5_5__seed_23.json
Hash: sha256:7f028c544df549aa0c3de9068f6c741c95d0132fb3a223ee826ac367818f831d
ta-bbf88fa62999leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1228-0.18410.78576241.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gpt_5_5__seed_31.json
Hash: sha256:bbf88fa62999a501592cbc262422935b190a60f600b355d1b056c532cd6a2fca
ta-5872db22a33eleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1385-0.18540.72005221.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_gpt_5_5__seed_7.json
Hash: sha256:5872db22a33e2b834ed39a97301088d92366296f63ff23bb437d19c0e1c6d0fa
ta-0bc65cff62d0leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1420-0.20400.62967251.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_kimi_k2_5__seed_11.json
Hash: sha256:0bc65cff62d0375547027e0891a449ce57dcd151cfd5a3e4691c4f14b142dbb3
ta-8f36b888685cleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0256-0.05500.62967231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_kimi_k2_5__seed_17.json
Hash: sha256:8f36b888685c5032ed7de3a2ecf80081464f6378de074f23fe3ef2af50ed648b
ta-243b6ff130f2leaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0674-0.10080.64297301.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_kimi_k2_5__seed_23.json
Hash: sha256:243b6ff130f2da101d196e36bfc304968cbbce2b02f61eb7202cb401cf29cea7
ta-92023a00f17dleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0527-0.09940.84623191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_kimi_k2_5__seed_31.json
Hash: sha256:92023a00f17d034051c434acf28cc6513f0c07e034e2d383299851515c88ed7e
ta-5e0395334c9dleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1258-0.17550.68007181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/rates_drawdown__poe_kimi_k2_5__seed_7.json
Hash: sha256:5e0395334c9d4444fe13204f3141b699128fd402543589373c5f872d09d78683
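Note on entry IDs: in every entry listed here, the ID appears to be `ta-` followed by the first 12 hex characters of the manifest hash shown on the entry's `Hash:` line. The following is a minimal sketch of that consistency check, assuming this convention holds for the whole registry. The function name `id_matches_hash` is hypothetical and not part of any TradeArena tooling. The sketch does not recompute the digest, because the registry does not state which bytes are hashed.

```python
import re

# Assumed convention (observed in the listed rows, not documented):
# entry ID == "ta-" + first 12 hex characters of the sha256 digest.
ENTRY_ID = re.compile(r"^ta-([0-9a-f]{12})")
HASH_LINE = re.compile(r"^Hash: sha256:([0-9a-f]{64})$")


def id_matches_hash(row_line: str, hash_line: str) -> bool:
    """Return True when the row's entry ID is the 12-character prefix of the hash."""
    id_match = ENTRY_ID.match(row_line)
    hash_match = HASH_LINE.match(hash_line.strip())
    if id_match is None or hash_match is None:
        return False  # either line is not in the expected shape
    return hash_match.group(1).startswith(id_match.group(1))


# Values copied from the entry directly above.
row = "ta-5e0395334c9dleaderboard_real_yahoo_2022_gspc_btc_btcf_weekly_v0_1"
digest = "Hash: sha256:5e0395334c9d4444fe13204f3141b699128fd402543589373c5f872d09d78683"
print(id_matches_hash(row, digest))
```

A mismatch under this check would suggest a row and its details block were paired incorrectly during extraction, not necessarily that the submission itself is invalid.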
ta-9f03f4f1c0e4leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_always_hold__seed_11.json
Hash: sha256:9f03f4f1c0e4ac0c633e4b6c73b685ee78bc23c401d49400cbbb11b18433a75a
ta-61f784cac384leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_always_hold__seed_17.json
Hash: sha256:61f784cac384e97886b974e740ab1662ebd20081b427a4ad85ca4d37bda8e817
ta-098a9999647aleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_always_hold__seed_23.json
Hash: sha256:098a9999647aa9c942965a12d401a6b981e5aa706192ba76f93b78cb9bf3b710
ta-ff6a1c057835leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_always_hold__seed_31.json
Hash: sha256:ff6a1c057835df2d5780284b1ee7f3df5469b5fe72dcc1b5895c821a99657205
ta-e61a71e6b1cdleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / always-holdrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.00000.00000.0000001.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_always_hold__seed_7.json
Hash: sha256:e61a71e6b1cdcdabdae52507a0336d33422736193f483de1c01ae76ee86bbb57
ta-81e598e08f3aleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0758-0.05460.7647501.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_random__seed_11.json
Hash: sha256:81e598e08f3a7645df441d4d465b5c00893f66d58ce9e85a0ed64a8509cf7f42
ta-490b13bf92d9leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0542-0.02780.7714501.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_random__seed_17.json
Hash: sha256:490b13bf92d9e3bb9fa0f29d0eac85ab051bdb9eb90be5ee879578ae598e9826
ta-923a9957f19eleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0635-0.12160.8286301.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_random__seed_23.json
Hash: sha256:923a9957f19e628d1c4681cb6061a4e2fc0dfc7102b350ffa1b24229a5d64231
ta-b19cdee87beeleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0416-0.05340.8000301.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_random__seed_31.json
Hash: sha256:b19cdee87bee635aafa1316d2df05095449f6b549d8a54d1331f7e23745fcc1c
ta-f17b01957b04leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1baseline / randomrationaletruestress-onlydeterministic-baselinebenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0692-0.04490.6857801.0000ReproducibleRedacted
Model redacted: False
Claim scope: deterministic baseline under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__baseline_random__seed_7.json
Hash: sha256:f17b01957b04e4ad8361d2daf47d5ec0dfff9cbb0c14dca6fcae120d2213b386
ta-62847a2b522aleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0111-0.05660.78573371.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_flash__seed_11.json
Hash: sha256:62847a2b522abb58940534288fadfbafd5975cbdbcf5baad5442eaf670e45759
ta-1e387f026febleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0111-0.05660.80003311.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_flash__seed_17.json
Hash: sha256:1e387f026febbacee227ba0039137468355fe5a1757a0165e9dbeb08f0c9b873
ta-1798bc1768beleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1093-0.11990.70976361.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_flash__seed_23.json
Hash: sha256:1798bc1768be5d50328fc202a4a4cdd4975e8ce252b005840603fa284f7bce8f
ta-6b9d42757b1aleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1184-0.11990.67866301.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_flash__seed_31.json
Hash: sha256:6b9d42757b1abda1b4758b22401421cea1f0cf76bc9c0c8f16578b7dc87143a4
ta-3c681fbadadaleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-flashrationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0055-0.05660.82764371.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_flash__seed_7.json
Hash: sha256:3c681fbadada5f0bf46fa1d37623912ea6332af72327d1a9020a6bbe24cec886
ta-a72a9fc81b75leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0229-0.04110.71433241.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_pro__seed_11.json
Hash: sha256:a72a9fc81b7594188c74ddff29bf1ff57e4e2e46271da26f0ab1b3837b88051d
ta-6e1fb210d152leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0382-0.05090.75003171.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_pro__seed_17.json
Hash: sha256:6e1fb210d152ac2236214bed0ebf27e8ead73f909ed6fc8d75a21caf844876c9
ta-6e36090be7ccleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0411-0.05090.66674221.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_pro__seed_23.json
Hash: sha256:6e36090be7ccb6b1795d0051dbd717f9e88c61619513be558b8063c4e3e6b664
ta-33360ef68cb9leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1184-0.11990.73915181.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_pro__seed_31.json
Hash: sha256:33360ef68cb91f0c48e05321eef5b68d1ea009039a70f773823e7fe4eea2d80a
ta-956794b2c8dbleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1deepseek / deepseek-v4-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0055-0.05090.76925251.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__deepseek_deepseek_v4_pro__seed_7.json
Hash: sha256:956794b2c8db6a940c4a4da5ed967f3e9b4cd7e97d46a98b8277e73a45e55d67
ta-5d7be6f375feleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark0.9167yahoo-finance-csv (weekly, 3 symbols)-0.0146-0.04240.72004341.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_claude_opus_4_7__seed_11.json
Hash: sha256:5d7be6f375fed388458db37b8493375fa4431b0c71424633b078c101d68d7281
ta-f3e3116e2730leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0182-0.05830.80003311.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_claude_opus_4_7__seed_17.json
Hash: sha256:f3e3116e2730f9c224e2a65f64368089f73aa0f882a94e1bb253036d0591475e
ta-76bd42463b57leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0101-0.04430.65386351.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_claude_opus_4_7__seed_23.json
Hash: sha256:76bd42463b57a75f69e3d3002f50f8aec6cc568d261a2f8755421b75aff4a5a4
ta-52282c9a9d51leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0429-0.04460.69574331.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_claude_opus_4_7__seed_31.json
Hash: sha256:52282c9a9d512fdc58d2416a90e15b0cf5b6c92bd4e8e707c027a9d8e53763b8
ta-28d78d5b9968leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / claude-opus-4.7rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0439-0.05400.81253401.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_claude_opus_4_7__seed_7.json
Hash: sha256:28d78d5b99688fb8246c25de33d70eb25bef0c920d630aaea370f8a4981a2de8
ta-4415accc1ec9leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0166-0.04130.76003271.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gemini_3_1_pro__seed_11.json
Hash: sha256:4415accc1ec93cd393a23987a0af4dc53352aee940934ba53fc3391c16412004
ta-8ec660b9803bleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0315-0.04230.77273231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gemini_3_1_pro__seed_17.json
Hash: sha256:8ec660b9803be9925ef87d04700ed157ae065178e380f301e5156e3cceb11010
ta-1c7196d1253dleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0529-0.12120.63338231.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gemini_3_1_pro__seed_23.json
Hash: sha256:1c7196d1253dc98c256131df5e97c0da9f83345a733247277bf09f9d2ab8d424
ta-499250f22d86leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0974-0.12070.62967191.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gemini_3_1_pro__seed_31.json
Hash: sha256:499250f22d862e51cdc5116d25099137f8a82039c99ae1b5a1acdc8b193857df
ta-e57b4893cff3leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gemini-3.1-prorationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0486-0.04230.73086241.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gemini_3_1_pro__seed_7.json
Hash: sha256:e57b4893cff3f049e6c6e305af3a88fee6ca2e13ac651d124e4661adee1a5e68
ta-1bd2254f8b18leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0116-0.05660.81483321.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_glm_5__seed_11.json
Hash: sha256:1bd2254f8b183f8c6a02875d9f1dbffa07446540078ad30befd3891110d8b349
ta-bfea064200d1leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0107-0.05710.78263261.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_glm_5__seed_17.json
Hash: sha256:bfea064200d15b20d01f32712d8bf81dd1c791d91b59639e0f5838a89560c805
ta-08978397f6dfleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1093-0.11990.70976361.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_glm_5__seed_23.json
Hash: sha256:08978397f6df59a199b9c709ca8b26edc2c6a711c92256cd0ab8e25fe35b4600
ta-307185d6b99cleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1184-0.11990.67866301.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_glm_5__seed_31.json
Hash: sha256:307185d6b99c6113542439cb76cea0b7b759d52506bd29bde8cf6f5e4294ffb4
ta-694e250680feleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / glm-5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0066-0.05660.82144331.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_glm_5__seed_7.json
Hash: sha256:694e250680feb51bf86760ed8dd1f6fe0cc80c9305bba09c0d922ddd2222baac
ta-bd8f2291a9fbleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0082-0.05670.79313371.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gpt_5_5__seed_11.json
Hash: sha256:bd8f2291a9fbcb67fb2cccd9fe47a751841190d67704a5dd8aa77a52d6da2407
ta-8705d0026b8bleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0053-0.05690.80773321.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gpt_5_5__seed_17.json
Hash: sha256:8705d0026b8b449515dbd60f2a147c7cf608342457de93f06b29c82f0b62607b
ta-892eda5d0bbbleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1031-0.12000.68757361.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gpt_5_5__seed_23.json
Hash: sha256:892eda5d0bbb298a4a4a2f870ebe58ed8fdd8be3cbc3154dc979f119a22e862b
ta-56ef503e6f84leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1103-0.12020.68976301.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gpt_5_5__seed_31.json
Hash: sha256:56ef503e6f84dcf50b448435428472b326db3628cd28f3e8ecd2d6dc251c1ea6
ta-d080576867c6leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / gpt-5.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0095-0.05660.83333371.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_gpt_5_5__seed_7.json
Hash: sha256:d080576867c6ebe09805870258ff5a5caab2e6dddd1b9f286b3bbab677d6a493
ta-99bc73bf56a8leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.0046-0.05710.73335411.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_kimi_k2_5__seed_11.json
Hash: sha256:99bc73bf56a8edd11736737f45fe321da861dad7a3303eabb4794d8943d4b5e5
ta-60d531e261b0leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0068-0.04210.73914251.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_kimi_k2_5__seed_17.json
Hash: sha256:60d531e261b0e16bc714b1693ad64fe43338ee4a55aa2b5f0b4f4f16c0d8a593
ta-9fdedfcd3b81leaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1093-0.11990.67747321.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_kimi_k2_5__seed_23.json
Hash: sha256:9fdedfcd3b814e67fdb4491f2d17e1732805d549c2d6088714bf352512a60979
ta-028041a81f5cleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)-0.1103-0.12020.68976261.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_kimi_k2_5__seed_31.json
Hash: sha256:028041a81f5c6dec9799c8f19598b8d82e918bfa8692d5e2fcd5017b11da62bd
ta-40cce8421d1cleaderboard_real_yahoo_recent_gspc_btc_btcf_weekly_v0_1poe / kimi-k2.5rationaletruestress-onlycached-providerredacted-promptbenchmarkstress-benchmark1.0000yahoo-finance-csv (weekly, 3 symbols)0.0055-0.05660.82764371.0000ReproducibleRedacted
Model redacted: False
Claim scope: cached-provider reliability under stress-only execution
Source: examples/benchmark_submissions/real_market_matrix/recent_cross_asset__poe_kimi_k2_5__seed_7.json
Hash: sha256:40cce8421d1cdec7d235a50dae95561df55f605c71eaaf6a812711353de44db3

Rows are accepted only after schema validation and reproducibility-hash verification.
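
The acceptance rule above can be illustrated with a minimal sketch. It is not the registry's actual validator. The field names in REQUIRED_FIELDS and the choice to hash a canonical JSON encoding of the manifest are assumptions made for illustration; the registry does not state how the hash input is constructed. The one relationship taken directly from the rows above is that each entry ID is "ta-" followed by the first 12 hex characters of the entry's sha256 hash.

```python
import hashlib
import json

# Hypothetical required fields; the real submission schema is not shown in the registry.
REQUIRED_FIELDS = ("scenario", "agent", "metrics")


def canonical_sha256(manifest: dict) -> str:
    """Hash a canonical JSON encoding of the manifest.

    Assumption: sorted keys and compact separators. The registry may hash
    different bytes (for example, the raw file contents).
    """
    payload = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def entry_id(digest: str) -> str:
    """Derive the entry ID from a 'sha256:<hex>' digest (pattern seen in the rows)."""
    hex_part = digest.split(":", 1)[1]
    return "ta-" + hex_part[:12]


def accept_row(manifest: dict, claimed_hash: str) -> bool:
    """Accept a row only if it passes the schema check and the hash check."""
    if any(field not in manifest for field in REQUIRED_FIELDS):
        return False  # schema validation failed
    return canonical_sha256(manifest) == claimed_hash  # hash verification


if __name__ == "__main__":
    manifest = {"scenario": "demo", "agent": "demo-agent", "metrics": {"return": 0.0}}
    digest = canonical_sha256(manifest)
    print(entry_id(digest), accept_row(manifest, digest))
```

Under these assumptions, any change to the manifest after submission changes the digest, so a row whose stored hash no longer matches its manifest would be rejected rather than silently updated.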