Methodology

How AI rates election promises.

RealityCheck Malta is an AI-operated information product. Every pledge feasibility rating on this site is generated by automated AI systems — not by human journalists, courts, regulators or political actors. This page explains exactly how that works.

AI-operated, by design

Every pledge on this site is collected, evaluated, rated and described by an automated AI pipeline. There is no human editor reviewing each verdict before publication. We chose this model for transparency, scale and consistency — the same evidence rules are applied to every pledge regardless of party.

AI systems can make mistakes. They can misread a statistic, miss context or over-weight a secondary source. That is why every rating is published with the source the pledge was drawn from and a written rationale, so any reader can audit the reasoning and challenge it via the corrections page.

What is eligible for rating

The AI rates specific, concrete pledges made publicly by political parties or their leaders — for example in manifestos, conference speeches, press releases, parliamentary statements and official party channels. Eligible pledges are those specific enough to be tested against fiscal, legal and delivery evidence.

What we do not rate

Opinions, value judgements and predictions about the future.
Sarcasm, satire and obvious rhetorical flourishes.
Vague pledges that are not specific enough to be tested.
Personal characterisations or attacks on individuals.

Source hierarchy

The AI is instructed to prefer primary sources: official statistics, original documents, raw datasets, legislative records and verifiable recordings. Secondary reporting may be included for context but should not be the sole basis for a verdict where primary evidence is reasonably available.
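The source preference above can be pictured as a simple check. The sketch below is illustrative only and is not the site's pipeline code: the `Source` type, the `kind` labels and the `needs_primary_evidence` function are hypothetical names chosen for this example. It flags a verdict that rests only on secondary reporting when primary evidence is reasonably available.

```python
from dataclasses import dataclass

# Hypothetical labels for the primary source types named in the text.
PRIMARY_KINDS = {
    "official_statistics",
    "original_document",
    "raw_dataset",
    "legislative_record",
    "verifiable_recording",
}


@dataclass(frozen=True)
class Source:
    title: str
    kind: str  # hypothetical label, e.g. "official_statistics" or "news_report"


def is_primary(source: Source) -> bool:
    """A source counts as primary if its kind is in the primary set."""
    return source.kind in PRIMARY_KINDS


def needs_primary_evidence(sources: list[Source], primary_available: bool) -> bool:
    """True when a verdict cites no primary source although one is reasonably available."""
    has_primary = any(is_primary(s) for s in sources)
    return primary_available and not has_primary
```

Whether primary evidence is "reasonably available" is a judgement the sketch takes as an input rather than computes.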

The feasibility scale

Pledges are placed on a five-level feasibility scale. The written rationale does the heavy lifting; the meter position is a reading aid.

Feasible
Implementable within a normal legislative term using existing fiscal headroom.
Partially feasible
Achievable in part. Full delivery faces budget, capacity or legal hurdles.
Costly
Possible but with material recurrent cost or trade-offs against other spending.
Unlikely as stated
Substantial legal, structural or fiscal barriers make delivery improbable in one term.
Insufficient detail
The pledge lacks the numbers or mechanism needed for an AI feasibility assessment.

The AI model and rubric

Pledges are graded by Google Gemini 2.5 Pro. Each pledge is passed to the model with a strict, Malta-specific rubric anchored to the public finances framework: the EU fiscal reference values (a deficit of 3% of GDP and debt of 60% of GDP), the current national budget envelope, the legal powers of central government, and the routine delivery capacity of the Maltese civil service.

Under that rubric, "feasible" is reserved for measures that can be funded within existing budget lines, sit within executive authority, follow routine delivery patterns, and do not depend on uncertain third parties. Anything requiring meaningfully new funding, new legislation, or significant institutional change defaults to "partially-feasible" or "costly". "Unfeasible" is reserved for measures that break EU rules, lack legal basis, or are physically impossible at Malta's scale.
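The defaulting logic described above can be sketched as a small decision function. This is a simplified illustration under stated assumptions, not the rubric prompt itself: the real grading is done by the language model, and the `PledgeFacts` fields and `default_rating` function are hypothetical names. The label strings are taken from the paragraph above. The rubric text does not say when a pledge lands on "costly" rather than "partially-feasible", so the split used here (new funding leads to "costly") is an assumption made only for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Rating(Enum):
    FEASIBLE = "feasible"
    PARTIALLY_FEASIBLE = "partially-feasible"
    COSTLY = "costly"
    UNFEASIBLE = "unfeasible"


@dataclass(frozen=True)
class PledgeFacts:
    # Hypothetical yes/no answers a reviewer might record for one pledge.
    funded_within_existing_budget: bool
    within_executive_authority: bool
    follows_routine_delivery: bool
    depends_on_uncertain_third_parties: bool
    breaks_eu_rules: bool = False
    lacks_legal_basis: bool = False
    physically_impossible: bool = False


def default_rating(facts: PledgeFacts) -> Rating:
    """Apply the defaulting order described in the text to one pledge."""
    # Hard blockers come first.
    if facts.breaks_eu_rules or facts.lacks_legal_basis or facts.physically_impossible:
        return Rating.UNFEASIBLE
    # "Feasible" requires all four conditions to hold.
    if (
        facts.funded_within_existing_budget
        and facts.within_executive_authority
        and facts.follows_routine_delivery
        and not facts.depends_on_uncertain_third_parties
    ):
        return Rating.FEASIBLE
    # Assumption for this sketch: a funding gap maps to "costly",
    # every other shortfall maps to "partially-feasible".
    if not facts.funded_within_existing_budget:
        return Rating.COSTLY
    return Rating.PARTIALLY_FEASIBLE
```

The ordering matters: a pledge that breaks EU rules is rated "unfeasible" even if it would otherwise be affordable.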

Known limitations

We want to be honest about what this system does and does not do. The AI is a structured first-pass reviewer, not a fiscal auditor.

Each pledge is evaluated in isolation. The model does not see, and does not weigh, the cumulative cost of all the other pledges in the same manifesto against the same budget envelope. A chapter that promises ten new spending programmes may still see each individual programme rated as feasible.
Large language models tend toward leniency on plausible-sounding proposals. Even with a strict rubric, the headline share of pledges scored as "feasible" is likely higher than a full fiscal audit would conclude. Read the full distribution of ratings, not just the headline percentage.
Costing is qualitative, not parametric. The model does not compute euro-denominated cost estimates. It assesses whether a pledge fits a credible funding pattern, not the precise price tag.
Source coverage varies by party. Where a party has published a full manifesto, coverage is broad. Where only press releases and speeches are available, coverage is narrower and skews toward whatever the party chooses to publicise.

These limitations are why every pledge is published with its source link and written rationale. The numbers are a navigation aid; the rationale and source are the substance.
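Two of the limitations above are easier to see with numbers. The sketch below uses invented figures in arbitrary units. The pipeline does not compute cost estimates, so this illustrates only why isolated evaluation can be lenient and why the spread of ratings says more than a single percentage. All function names are hypothetical.

```python
from collections import Counter


def fits_individually(costs: list[float], envelope: float) -> list[bool]:
    """Check each programme against the envelope on its own."""
    return [cost <= envelope for cost in costs]


def fits_together(costs: list[float], envelope: float) -> bool:
    """Check the combined cost of all programmes against the same envelope."""
    return sum(costs) <= envelope


def rating_distribution(ratings: list[str]) -> dict[str, float]:
    """Share of pledges at each rating level, as fractions of the total."""
    total = len(ratings)
    if total == 0:
        return {}
    return {label: count / total for label, count in Counter(ratings).items()}


# Invented example: ten programmes of 20 units each against an envelope of 100.
costs = [20.0] * 10
envelope = 100.0
each_fits = all(fits_individually(costs, envelope))  # every programme fits alone
all_fit = fits_together(costs, envelope)  # the combined 200 units do not fit
```

In this invented case every programme passes when judged alone, while the chapter as a whole costs twice the envelope.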

Re-grading and revisions

We periodically re-run the rubric across the full pledge set when the prompt is tightened or when new context is added (for example, after a full manifesto is ingested). When this happens, individual verdicts and headline scores can move materially. We do not freeze old ratings to protect a narrative: the ratings shown are always those from the most recent application of the rubric. At the time of writing, every published manifesto from the four tracked parties has been ingested in full, and the pledge corpus is locked through polling day on 30 May 2026. Any further movement in scores will come from rubric refinement, not from new ingestion.
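A re-grade can be audited by comparing two runs over the same pledge set. The sketch below is a minimal illustration, assuming each run is stored as a mapping from a pledge identifier to its rating label. The identifiers and the `changed_verdicts` function are hypothetical and do not describe how the site stores its data.

```python
def changed_verdicts(
    previous: dict[str, str], current: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Return {pledge_id: (old_rating, new_rating)} for pledges whose rating moved."""
    return {
        pledge_id: (previous[pledge_id], new_rating)
        for pledge_id, new_rating in current.items()
        if pledge_id in previous and previous[pledge_id] != new_rating
    }


# Invented example: one pledge moves after the rubric is tightened.
before = {"pledge-001": "feasible", "pledge-002": "costly"}
after = {"pledge-001": "partially-feasible", "pledge-002": "costly"}
moved = changed_verdicts(before, after)
```

Pledges present in only one of the two runs are ignored here, which fits a locked corpus where only the rubric changes.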

Updates and corrections

Verdicts reflect the cited evidence available at the time of publication. Where new credible evidence emerges, ratings may be revised. All revisions are logged transparently. Readers may submit correction requests via our corrections page.

Editorial scope

We focus on the pledge, not on the person. We do not use language that implies unlawful conduct unless it is presented strictly as a sourced quotation with context. We avoid characterisations such as "lied", "fraud" or "corrupt" in our own editorial voice.