How to turn money into predictions

A menu of mechanisms for organizations that want to fund forecasts on policy, economic, and AI outcomes.

Organizations that fund policy research, economic analysis, or AI deployment want predictions about future outcomes: which tax provisions will pass, how high inflation will run, how labor markets will absorb a new model class. The Federation of American Scientists is sponsoring forecasting tournaments to inform climate policy; the Forecasting Research Institute ran its Existential Risk Persuasion Tournament with 80 domain experts and 89 superforecasters on AI and other catastrophic risks; FiscalNote announced a major expansion into prediction-market infrastructure for its policy-intelligence customer base in February 2026; and AI researchers are now benchmarking LLM forecasting against expert and crowd panels. Quality varies sharply across markets: some are deep, actively traded, and accurate; others sit thinly populated and stay mispriced for weeks. Funders can improve forecast quality on the questions that matter to them.

This post lays out the available mechanisms. I drafted an earlier version in 2024 and circulated it to a small group of researchers and funders. The landscape has changed enough since then that an updated public version is overdue: prediction markets crossed $44 billion in notional volume in 2025, Kalshi got Federal Reserve research validation, and a serious proposal for sponsorship-funded markets on AI labor impacts hit the literature.

Two structural problems

Two problems make most prediction markets shallower than they should be.

The liquidity problem. Markets need both informed traders (“smart money”) and less informed traders willing to absorb their bets (“dumb money”). Andrew Gelman puts it cleanly in “Prediction markets and the need for ‘dumb money’ as well as ‘smart money’”: “Markets become efficient when making them efficient is profitable.” Without enough total volume, smart traders can’t profitably correct a mispriced market, so prices stay wrong.

A Manifold market asking “Will there be any tax on unrealized capital gains in the USA before EOY2028?” sat at 22% probability for nearly two weeks after Trump’s 2024 election victory, with 27 traders. The election outcome obviously cut the probability substantially, but no one was paid enough to move the price.
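A rough sketch of the arithmetic behind that stall, in Python. The 22% price comes from the market above; the trader's post-election belief (5%) and the number of shares available at the quoted price are illustrative assumptions, and the function ignores fees and price impact.

```python
def expected_profit_buying_no(market_p_yes, believed_p_yes, shares):
    """Expected profit from buying `shares` NO contracts at a fixed quoted price.

    Each NO share costs (1 - market_p_yes) and pays 1 if the event does not
    happen. Fees and price impact are ignored (simplifying assumptions).
    """
    price_no = 1.0 - market_p_yes
    expected_payout_no = 1.0 - believed_p_yes
    return shares * (expected_payout_no - price_no)


# Market quotes 22% YES; a hypothetical trader believes the true figure is 5%.
thin = expected_profit_buying_no(0.22, 0.05, shares=100)      # shallow book (assumed)
deep = expected_profit_buying_no(0.22, 0.05, shares=10_000)   # deep book (assumed)
print(f"thin book: {thin:.2f}  deep book: {deep:.2f}")
```

The edge per share is identical in both cases. Only the depth changes, and with it whether the correction is worth anyone's time.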

A parallel Manifold market on “Will a raise to the top capital gains tax rate be enacted in 2025?” behaved differently. Manifold staff made it sweepstakes-eligible at creation, so the market traded in real money (Manifold’s then-experimental real-cash mode, shut down in March 2025). It held 20-21% until election results, then dropped to 6%. The mana version attracted 18 trades and 4,093 mana; the sweepstakes version had 9 traders and 334 sweepcash. The cash version attracted enough attention to update on election news; the play-money version did not. Money matters here mostly as an attention allocator, not as a calibration mechanism — Servan-Schreiber et al. (2004) found play-money and real-money NFL prediction markets equally accurate when both had engaged trader bases.

The zero-sum challenge. Unlike equities, where investors collectively benefit from economic growth, prediction markets are zero-sum or negative-sum after fees. Every winner needs a loser. Zvi Mowshowitz lists the consequences in “Subsidizing Prediction Markets”: no natural long-term investors, higher perceived risk, and limited professional capital deployment. Funders willing to take an expected loss on a market are the substitute for that missing long-horizon capital.

Where prediction markets are in 2026

Volume. Total notional exceeded $44 billion in 2025, with monthly peaks around $13 billion during the US election period. The market is concentrated in Kalshi (a CFTC-designated contract market) and Polymarket (crypto-native, recently returned to the US via a $112 million acquisition of CFTC-licensed exchange QCX).

Regulatory legitimacy. Kalshi went from a startup with questionable contract approvals to the subject of a Federal Reserve working paper (Diercks, Katz, and Wright, 2026) that finds Kalshi’s prices on inflation, jobs, GDP, and FOMC decisions “outperform surveys and interest-rate futures” as macro forecasting tools. The paper explicitly recommends Kalshi as a real-time benchmark for researchers and policymakers. That is the most credible institutional validation prediction markets have ever received.

A sponsorship thesis. Andrey Fradkin, Brian Jabarian, and Andrew Koh published “We need well-capitalized prediction markets,” arguing that the right model is sponsorship: AI labs, Big Tech firms, and philanthropies seed liquidity in markets on labor outcomes, AI capability benchmarks, and other questions where they have decision-relevant interest. The sponsor covers the expected loss in exchange for better information for everyone. I’ll come back to this in the last section.

Funding mechanism 1: Subsidize public markets

Each major platform supports liquidity provision through a different mechanism.

Manifold Markets runs on play money called mana. The platform briefly offered sweepstakes trading in real cash but shut that program down in March 2025 and returned to a pure play-money model. Despite the play-money currency, Manifold’s aggregate calibration is strong: predictions land within roughly four percentage points of realized frequencies, and Manifold’s own analysis finds calibration improves with trader count up to roughly 10-20 traders per market, after which it plateaus. The implication for funders: subsidizing a market that’s stuck below the plateau directly buys better predictions. Organizations can purchase mana with dollars and add it to specific markets as liquidity; the deeper book attracts more traders and tightens the spread. The subsidizer should expect to lose mana, and the further the market price moves, the less of the subsidy comes back. That lost mana is the cost of a better-priced market, not a tax on the platform. Manifold’s code is also MIT-licensed, which makes it the friendliest platform for organizations that want to run programmatic experiments at low cost.
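To make the cost of a subsidy concrete, here is a minimal sketch using Hanson's logarithmic market scoring rule (LMSR), a textbook automated market maker. It is a stand-in chosen because its losses have a closed form, not a description of Manifold's own market maker. The liquidity parameter b, the 50% starting price, and the function names are assumptions for illustration.

```python
import math


def lmsr_cost(q_yes, q_no, b):
    """LMSR cost function C(q) = b * ln(exp(q_yes/b) + exp(q_no/b))."""
    m = max(q_yes, q_no) / b  # subtract the max exponent for numerical stability
    return b * (m + math.log(math.exp(q_yes / b - m) + math.exp(q_no / b - m)))


def yes_shares_for_price(p_target, b):
    """Net YES shares traders must hold, starting from an empty 50% market,
    for the LMSR price of YES to equal p_target."""
    return b * math.log(p_target / (1.0 - p_target))


def sponsor_loss_if_yes(p_final, b):
    """Sponsor's net loss if traders push the price to p_final and YES occurs."""
    q_yes = yes_shares_for_price(p_final, b)
    collected = lmsr_cost(q_yes, 0.0, b) - lmsr_cost(0.0, 0.0, b)
    return q_yes - collected  # payout owed to traders minus money taken in


b = 100.0  # hypothetical subsidy scale, in mana
for p in (0.5, 0.7, 0.9, 0.99):
    print(f"final price {p:.2f}: sponsor loss {sponsor_loss_if_yes(p, b):6.2f}")
print(f"worst case: {b * math.log(2):.2f}")
```

In this model the loss for a final price p is b·ln(2p) when YES occurs, capped at b·ln 2. The sponsor therefore knows its maximum exposure up front, and the loss grows as traders move the price further toward the outcome that actually happens.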

Polymarket uses a Liquidity Mining & Rewards program. Market makers earn rewards for providing consistent quotes; traders earn fee rebates proportional to trading activity; community bounties incentivize market promotion. Organizations can participate through these mechanisms rather than as standalone subsidizers. After a 2022 CFTC settlement that barred US users for nearly four years, Polymarket regained US access in December 2025 via the QCX acquisition above, and runs hundreds of macro markets globally.

Kalshi offers a limit order liquidity model. Organizations place standing limit orders at prices they’re willing to trade at; orders execute when the market crosses those prices. Limit orders avoid trading fees, allow precise price targeting, and accumulate into market liquidity over time. Qualified market makers can join a formal market maker program with additional benefits. The CFTC approval and Fed validation make this the right platform for institutional-grade questions where regulatory clarity matters.
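As a sketch of what standing limit orders might look like in practice, the snippet below builds a symmetric ladder of resting bids around a sponsor's own fair-value estimate and totals the cash tied up if every order fills. It is generic bookkeeping, not Kalshi's API: the `LimitOrder` type, the function names, and all parameter values are hypothetical, and it assumes binary contracts that settle at 100 cents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitOrder:
    side: str          # "yes" or "no"
    price_cents: int   # price paid per contract, 1..99
    contracts: int


def build_ladder(fair_cents, half_spread_cents, step_cents, levels, contracts_per_level):
    """Resting bids on both sides of a fair-value estimate (all inputs hypothetical)."""
    orders = []
    for i in range(levels):
        offset = half_spread_cents + i * step_cents
        yes_bid = fair_cents - offset
        # Bidding x for NO is equivalent to offering YES at 100 - x.
        no_bid = 100 - (fair_cents + offset)
        if 1 <= yes_bid <= 99:
            orders.append(LimitOrder("yes", yes_bid, contracts_per_level))
        if 1 <= no_bid <= 99:
            orders.append(LimitOrder("no", no_bid, contracts_per_level))
    return orders


def capital_required_cents(orders):
    """Cash committed if every resting order fills (an upper bound)."""
    return sum(o.price_cents * o.contracts for o in orders)


ladder = build_ladder(fair_cents=30, half_spread_cents=2, step_cents=2,
                      levels=3, contracts_per_level=100)
print(len(ladder), "orders,", capital_required_cents(ladder) / 100, "dollars committed")
```

A tighter half-spread and more levels mean a deeper book for other traders, and more capital exposed to being picked off when the sponsor's fair-value estimate is stale.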

Hypermind has run institutional prediction markets since 2000, with clients including the US intelligence community and the Johns Hopkins Center for Health Security. Sponsors fund custom markets and panels of selected forecasters on bespoke questions, with the platform providing aggregation and reporting infrastructure. It is less suited to one-off retail experiments and better suited to sustained corporate or government use.

Funding mechanism 2: Sponsor a tournament

Metaculus is a forecasting platform aggregating predictions across thousands of binary and numerical questions, scored on accuracy. Organizations sponsor dedicated tournaments — a slate of related questions with a prize pool that pays out to the best forecasters.

The Federation of American Scientists funded a $5,000 Climate Tipping Points tournament covering 43 questions about climate policy and outcomes, including conditional questions. The tournament structure lets a funder set the agenda — what questions matter, what counts as resolution, what horizons to ask about — without taking on market-maker risk.

The Forecasting Research Institute (Philip Tetlock’s research arm, distinct from Good Judgment Inc) runs research-oriented tournaments like the Existential Risk Persuasion Tournament (80 domain experts, 89 superforecasters, thousands of forecasts on long-horizon catastrophic risks). Sponsoring FRI is closer to funding a research program than buying a forecast.

Tournaments work well for questions where you want a calibrated probability today, not a tradeable instrument over time. They also support reasoning prizes for the best-argued forecasts, not just the most accurate ones, which matters when you’re trying to surface analytical talent rather than just numerical answers.

Funding mechanism 3: Buy custom human forecasts

Good Judgment sells custom forecasting from professional superforecasters — individuals identified by Philip Tetlock’s Good Judgment Project as consistently outperforming domain experts and intelligence analysts (the project beat intelligence-community analysts with classified access by 25-30% in the IARPA ACE tournament). Services include question development, daily forecast updates, written analysis, and follow-up question support. Organizations can keep forecasts private or release them publicly.

This is the most concierge option. You write a question, a small team of trained forecasters works on it, and you get back a probability with an explanation. There is no liquidity to manage and no market to monitor. The cost per question is higher, and the methodology is opaque relative to a public market.

Funding mechanism 4: Buy AI-generated forecasts

A new category emerged in 2024-2026. FutureSearch (publicly launched in early 2026) runs LLM-based research agents that gather evidence, weigh base rates, and produce calibrated probability forecasts with reasoning. Pricing is per-operation — deep-research agents at 1-11¢, forecasters at 20-90¢ per researcher per row — orders of magnitude cheaper than human superforecasters per question, with the obvious tradeoff that the depth of domain expertise is whatever the model has internalized plus what its retrieval finds.

The category builds on published research: Halawi et al. (2024) “Approaching Human-Level Forecasting with Language Models” (NeurIPS) showed retrieval-augmented LLM systems approaching competitive-forecaster accuracy; the Center for AI Safety’s forecasting bot reported superhuman accuracy on certain competitive platforms; ForecastBench (Karger et al., ICLR 2025) — a dynamic, continuously-updated benchmark — finds frontier LLMs roughly match the general-public crowd in Brier score but still trail elite superforecasters by a wide margin (LLM Brier ≈ 0.135–0.159 vs. superforecaster Brier ≈ 0.02). The ForecastBench team’s linear extrapolation projects superforecaster parity around late 2026, with a 95% confidence interval from December 2025 to January 2028.
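For readers unfamiliar with the metric: the Brier score is the mean squared error between probability forecasts and binary outcomes. Lower is better, and always answering 50% scores 0.25. A minimal sketch with made-up forecasts (not ForecastBench data):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities in [0, 1] and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


outcomes = [1, 0, 0, 1]            # hypothetical resolutions
hedger = [0.5, 0.5, 0.5, 0.5]      # never commits
confident = [0.9, 0.1, 0.2, 0.8]   # commits and is right each time

print(f"hedger:    {brier_score(hedger, outcomes):.3f}")
print(f"confident: {brier_score(confident, outcomes):.3f}")
```

Because errors are squared, a confident miss costs far more than a hedged one, which is why the score rewards calibration rather than bravado.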

The cost structure inverts the human-forecasting category. Where Good Judgment’s value is depth on a few high-stakes questions, AI forecasting’s value is breadth — hundreds or thousands of calibrated probabilities a day. Use AI forecasting when you want coverage; use human superforecasters when you want depth and defensibility on a question where the audience expects a human accountable for the call.

The trajectory matters more than the snapshot. Per-query costs are falling and accuracy is rising; within a few years, the marginal cost of a calibrated probability on most public questions should drop sharply, even if the gap to elite superforecasters takes longer to close than the ForecastBench point estimate suggests. What remains expensive, and where the real returns on investment will sit, is optimizing the agents: which tools they call, which data sources they query, how their calibration is benchmarked, how they update under new information. Cheap commodity forecasts and frontier-quality forecasts will look very different, and the work to produce the latter still needs to be incentivized.

Funding mechanism 5: Run your own prediction market

For internal questions an organization doesn’t want public — product launch dates, project completion probabilities, internal strategic questions — private prediction markets aggregate dispersed information already inside the organization. Eli Lilly used internal markets in 2003 to forecast which drug candidates would advance through clinical trials; Google, Microsoft, and Ford have run variants. The corporate-prediction-market vendor landscape has consolidated — Cultivate Labs (formerly Inkling Markets) dropped market mechanics in 2022 in favor of opinion pools, citing user confusion — but Hypermind’s Prescience platform still offers managed private markets for corporate and government clients.

The sponsorship thesis, and why it matters

The Fradkin / Jabarian / Koh proposal is worth treating as its own category. They argue that the chicken-and-egg problem in prediction markets — no liquidity, no traders; no traders, no liquidity — can be solved by sponsorship capital from organizations that have real decision-relevant interest in the questions. Their canonical example is AI labor impacts: a major AI lab, a federal agency, or a philanthropy seeds liquidity on markets tied to Bureau of Labor Statistics data series — occupation-level employment, wages, labor-force participation at 1-, 2-, and 5-year horizons.

The contract design they advocate has four properties: verifiability (objective resolution against published government data), stability (consistent measurement over time), robustness to gaming (large underlying quantities that one trader can’t move), and attention (sufficient interest to draw informed traders). BLS data hits all four. So do many other government statistics — BEA NIPA tables, Census ACS variables, IRS Statistics of Income.

The model generalizes. Any organization that benefits from better forecasting on a specific question can sponsor a market on it without expecting trading profit. The sponsorship pays for information that improves everyone’s decisions — including the sponsor’s. Conditional markets (outcome Y given policy state X) are particularly underprovided today and particularly valuable for policy analysis. Sponsorship is the cleanest mechanism to bring them into existence.
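One way to read a pair of conditional markets, sketched in Python. The prices are invented for illustration, and the function names are hypothetical. The second function is just the law of total probability, used as a consistency check against an unconditional market on the same outcome.

```python
def implied_effect(p_y_given_x, p_y_given_not_x):
    """Difference in the outcome's probability across the two policy states."""
    return p_y_given_x - p_y_given_not_x


def implied_unconditional(p_x, p_y_given_x, p_y_given_not_x):
    """P(Y) implied by the policy market and the two conditional markets."""
    return p_x * p_y_given_x + (1.0 - p_x) * p_y_given_not_x


# Hypothetical prices: the policy passes with 30% probability; the outcome
# trades at 60% conditional on passage and 40% conditional on failure.
effect = implied_effect(0.60, 0.40)
marginal = implied_unconditional(0.30, 0.60, 0.40)
print(f"implied effect: {effect:+.2f}  implied P(Y): {marginal:.2f}")
```

If a separate unconditional market on Y trades far from the implied figure, at least one of the three prices is off. That is the kind of gap a sponsor's liquidity pays traders to close.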

The sponsorship lane connects to a structural question: who pays for the public good of calibrated forecasts? In 2024 the answer was mostly “individual hobbyist traders subsidizing inefficient markets.” In 2026 the answer is starting to be “institutions with skin in the question, sponsoring markets that produce information they want.” That shift is what makes 2026 different from 2024.

Where to start

Different mechanisms suit different goals:

  • Want to test the waters cheaply? Buy mana on Manifold and seed a market that matters to you. Total commitment can be under $1,000. The mana you lose to traders moving the price is the cost of a deeper, better-calibrated market on a question you care about.
  • Want a probability anchored to your priorities? Sponsor a Metaculus tournament. $5,000–$50,000 buys a slate of questions with a real prize pool and visible forecasters.
  • Want a single high-stakes forecast with human reasoning? Engage Good Judgment. Highest cost per question, deepest analysis, accountable human author.
  • Want broad coverage across many questions cheaply? Use FutureSearch or a similar AI-forecasting service. Per-query pricing in pennies, hundreds of forecasts a day, calibrated but synthetic.
  • Want regulated, real-money markets on macro questions? Place limit orders on Kalshi. Limit orders avoid trading fees and allow precise price targeting, on the most institutionally credible platform.
  • Want an institutional sustained-engagement platform? Use Hypermind for custom corporate panels (a track record of 25+ years) or for running a private internal market with a defined trader population.
  • Want a research tournament on long-horizon questions? Fund the Forecasting Research Institute directly — closer to research-program sponsorship than to buying a forecast.
  • Want to move the field? Sponsor markets that solve a real liquidity hole — BLS-anchored contracts, AI capability benchmarks, conditional policy-outcome markets. The Fradkin / Jabarian / Koh model is the right starting point.

The hard part is no longer figuring out the mechanism. The mechanisms exist, the platforms work, the regulatory path is clearer than at any point in the past decade. The opportunity is institutional: adding prediction markets to the portfolio of truth-seeking tools organizations already use — alongside surveys, expert panels, internal modeling, and traditional forecasts — and developing the craft of writing forecastable questions. A well-formed question is conditional, scoreable, and decision-relevant (“will 2028 child poverty under reform X exceed Y%?”); a poorly-formed one is none of those. Prediction markets are a new epistemic technology. Organizations that build the discipline to use them well will reason more clearly about the future.