The Complete Guide to Tennis Data: How to Use the fault.bet API for Trading, Predictions, and Analysis

April 2026 · 25 min read

If you're reading this, you probably fall into one of a few categories. Maybe you trade tennis on Betfair and you're tired of gut-feel decisions. Maybe you're a Polymarket trader looking for an edge on tennis contracts. Maybe you're a journalist who needs serve data for a story. Maybe you build models and you need clean training data.

Whatever brought you here, the underlying problem is the same: tennis data is a mess. It's scattered across dozens of sources, formatted inconsistently, missing key metrics, and almost nobody provides the kind of derived features — pressure indices, weather-adjusted performance, surface-specific ratings — that actually move the needle on prediction accuracy.

We built the fault.bet Tennis Data API to fix that. One endpoint, one API key, clean JSON. 500 active players, 40+ computed features each, updated daily. This guide covers everything you can do with it.

What's in this guide

1. What data is available (and why it matters)
2. Quick start: your first API call in 60 seconds
3. Use case: Betfair tennis trading
4. Use case: Polymarket and prediction markets
5. Use case: Sports journalism and content
6. Use case: Fantasy tennis and DFS
7. Use case: Building your own prediction model
8. Deep dive: Elo ratings and why WElo matters
9. Deep dive: Pressure metrics nobody else tracks
10. Deep dive: Weather-adjusted serve projections
11. The data CSV: bulk analysis without code
12. Pricing and getting started

1. What data is available (and why it matters)

Most tennis data sources give you the basics: who won, what the score was, maybe some serve percentages. That's fine for a Wikipedia article but useless for prediction. The gap between "Sinner won 6-3 6-4" and understanding why he won — and whether he'll win next time — is enormous.

The fault.bet API provides 40+ computed features per player, updated daily. These aren't raw stats pulled from a match database. They're derived metrics calculated from rolling windows and exponential averages, then cross-referenced with weather and venue data. Here's what you get:

Elo ratings (8 variants)

Standard Elo tells you who's better overall. But tennis is a surface sport. A player rated 1800 overall might be 2100 on clay and 1500 on grass. We maintain separate Elo ratings for hard court, clay, and grass, plus weighted versions (WElo) that give recent matches more importance.

Right now, the API returns this for Carlos Alcaraz:

{
  "elo_overall": 2078,
  "elo_hard": 2041,
  "elo_clay": 1961,
  "elo_grass": 1840,
  "welo_overall": 2228,
  "welo_momentum_8w": -135.2
}

That momentum number is interesting. -135 over 8 weeks means Alcaraz's form has dropped significantly in the last two months. The standard Elo (2078) doesn't reflect that because it moves slowly. The WElo (2228) does — it's pulling back hard. If you're pricing a match, which number do you trust more?

Serve and return metrics (12 rolling stats)

We compute rolling averages over the last 20 matches and 52 matches for every serve and return metric that matters:

Hold percentage — how often they win their service games
Break percentage — how often they break the opponent
First serve percentage — how often the first serve lands in
First serve points won — win rate when the first serve goes in
Second serve points won — win rate on second serve (the vulnerable moments)
Ace rate — free points per service game
Double fault rate — free points given away per service game
Break points saved — clutch serving under pressure
Break points converted — clinical returning when it matters
Return points won — overall return effectiveness

The 20-match window captures current form. The 52-match window captures baseline ability. When they diverge, something interesting is happening — a player improving rapidly, or one falling off a cliff.
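
One way to act on that divergence is to subtract the long-window value from the short-window value for each metric. The sketch below assumes the 52-match fields follow the same naming pattern as the 20-match ones (hold_pct_52w is a guess, not a confirmed field name), and the sample numbers are invented for illustration.

```python
# Hypothetical rolling_stats payload; the *_52w names and all values are assumptions.
rolling_stats = {
    "hold_pct_20w": 0.88,
    "hold_pct_52w": 0.82,
    "break_pct_20w": 0.21,
    "break_pct_52w": 0.26,
}

def form_gaps(stats, short="_20w", long="_52w"):
    """Return short-window minus long-window value for every metric that has both."""
    gaps = {}
    for key, recent in stats.items():
        if key.endswith(short):
            metric = key[: -len(short)]
            baseline = stats.get(metric + long)
            if baseline is not None:
                gaps[metric] = recent - baseline
    return gaps

gaps = form_gaps(rolling_stats)
for metric, gap in gaps.items():
    trend = "above" if gap > 0 else "at or below"
    print(f"{metric}: {gap:+.1%} ({trend} baseline)")
```

A positive gap means current form is running ahead of baseline; a negative gap means the opposite. How large a gap has to be before it matters is something you would need to calibrate yourself.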

Pressure metrics (5 indices)

This is where it gets unique. Nobody else provides these because they require point-level data analysis, not just match results. We compute:

Choke index — how often a player loses games from a winning position. Sinner's is 0.065 (elite). Some players sit above 0.40 (they fold under pressure regularly).
Lead-lost rate — similar but focused on set-level leads
Closing rate — when serving for the set or match, how often do they convert? Sinner closes at 93.6%. The tour average is around 75%.
Tiebreak win percentage — some players are tiebreak specialists, others collapse
Deciding set win percentage — who holds their nerve in a final set?

Why this matters for pricing: Two players with identical Elo ratings can have wildly different pressure profiles. If Player A chokes 40% of the time and Player B closes 90% of the time, the market price should reflect that — but bookmakers rarely adjust for it. That's edge.

Weather-adjusted performance (6 splits)

Wind above 20 km/h reduces first-serve accuracy by 5-8% on average. But the effect isn't uniform. Some players' hold rates barely move in wind. Others drop by 15 percentage points.

We track:

Hold percentage in wind vs hold percentage in calm conditions
Hold percentage indoors vs outdoors
Ace rate in cold vs warm conditions

When Madrid's altitude and wind combine with a specific player matchup, these splits become genuinely predictive.

2. Quick start: your first API call in 60 seconds

You need an API key. Get one at fault.bet/api (£59/month, cancel anytime). Your key starts with fb_ and gets emailed to you immediately after payment.

Then:

# Get today's ATP rankings by Elo
curl -H "X-API-Key: fb_your_key_here" \
  "https://api.fault.bet/v1/rankings/elo?tour=atp&limit=5"

Response:

{
  "status": "ok",
  "data": [
    {"rank": 1, "name": "Jannik Sinner", "elo_overall": 2156},
    {"rank": 2, "name": "Carlos Alcaraz", "elo_overall": 2078},
    {"rank": 3, "name": "Novak Djokovic", "elo_overall": 1971},
    {"rank": 4, "name": "Alexander Zverev", "elo_overall": 1921},
    {"rank": 5, "name": "Arthur Fils", "elo_overall": 1818}
  ]
}

Python example

# pip install requests
import requests

API_KEY = "fb_your_key_here"
BASE = "https://api.fault.bet/v1"
headers = {"X-API-Key": API_KEY}

# Get a player profile with all stats
player = requests.get(f"{BASE}/player/Sinner", headers=headers).json()

elo = player["data"]["ratings"]["elo_overall"]
hold = player["data"]["rolling_stats"]["hold_pct_20w"]
choke = player["data"]["pressure"]["choke_index"]

print(f"Sinner: Elo {elo:.0f}, hold {hold:.1%}, choke {choke:.3f}")
# Output: Sinner: Elo 2156, hold 99.2%, choke 0.065

Full interactive documentation with "try it" buttons is at api.fault.bet/docs.

3. Use case: Betfair tennis trading

This is what fault.bet was originally built for. The core idea is simple: if your model says a player has a 65% chance of winning but Betfair is pricing them at 55% (odds of 1.82), you have an edge of about 10 percentage points. Bet enough of these and the edge compounds.
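
The arithmetic behind that example is short enough to write down. This is a minimal sketch that ignores exchange commission; the function names are illustrative and not part of the API.

```python
def implied_probability(decimal_odds):
    """Convert decimal odds to the probability they imply (ignoring commission)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds

def edge(model_prob, decimal_odds):
    """Model probability minus market-implied probability, in probability units."""
    return model_prob - implied_probability(decimal_odds)

def expected_profit_per_unit(model_prob, decimal_odds):
    """Expected profit on a 1-unit back bet if the model probability is right."""
    return model_prob * (decimal_odds - 1.0) - (1.0 - model_prob)

market = implied_probability(1.82)
print(f"Implied: {market:.1%}")
print(f"Edge: {edge(0.65, 1.82):.1%}")
print(f"EV per unit staked: {expected_profit_per_unit(0.65, 1.82):+.3f}")
```

The expected-profit figure only holds if the model probability is well calibrated, which is the hard part.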

Building a pre-match value finder

import requests

API_KEY = "fb_your_key_here"
BASE = "https://api.fault.bet/v1"
h = {"X-API-Key": API_KEY}

# Get today's signals (pre-computed value bets)
signals = requests.get(f"{BASE}/signals/today", headers=h).json()

for sig in signals["data"]:
    print(f"{sig['player1']} vs {sig['player2']}")
    print(f"  Back: {sig['signal_player']}")
    print(f"  Model: {sig['model_prob']:.0%}")
    print(f"  Market: {sig['market_implied']:.0%}")
    print(f"  Edge: {sig['edge_pct']:.1f}%")
    print()

The signals endpoint does all the heavy lifting — it's already compared model probability against live exchange prices and filtered for matches where the edge exceeds 6%. You just need to place the bet.

Building a custom model

If you want to build your own model rather than using our signals, the player endpoint gives you everything you need:

# Compare two players for a specific match
p1 = requests.get(f"{BASE}/player/Sinner", headers=h).json()["data"]
p2 = requests.get(f"{BASE}/player/Alcaraz", headers=h).json()["data"]

# Simple Elo-based win probability
elo_diff = p1["ratings"]["elo_clay"] - p2["ratings"]["elo_clay"]
p1_win_prob = 1 / (1 + 10 ** (-elo_diff / 400))

# Adjust for pressure — if one player chokes more
choke_diff = p1["pressure"]["choke_index"] - p2["pressure"]["choke_index"]
# Lower choke = better under pressure = slight boost
pressure_adj = choke_diff * -0.05
adjusted_prob = min(max(p1_win_prob + pressure_adj, 0.0), 1.0)  # keep within [0, 1]

# Compare against market
betfair_price = 1.85  # from Betfair API
market_implied = 1 / betfair_price
edge = adjusted_prob - market_implied

print(f"Model: {adjusted_prob:.1%}, Market: {market_implied:.1%}, Edge: {edge:.1%}")

The more features you incorporate — serve stats, weather, fatigue, surface form — the more accurate your model becomes. Our API gives you all the ingredients; you build the recipe.

Automating with polling

For serious traders, checking the API manually isn't practical. You can poll the signals endpoint every hour and auto-place via the Betfair API when value appears:

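# Poll for new signals every hour
import time
import requests
import betfairlightweight
from betfairlightweight import filters

trading = betfairlightweight.APIClient(...)
trading.login()

placed = set()  # matches already bet, so a signal isn't re-placed every hour

while True:
    signals = requests.get(f"{BASE}/signals/today", headers=h).json()
    for sig in signals["data"]:
        key = (sig["player1"], sig["player2"])
        if key in placed:
            continue
        if sig["edge_pct"] > 8.0 and sig["confidence"] > 70:
            # find_betfair_market and calculate_stake are helpers you write yourself;
            # the first must return the market ID and the runner's selection ID
            market_id, selection_id = find_betfair_market(sig)
            order = filters.limit_order(
                size=calculate_stake(sig["edge_pct"]),
                price=sig["betfair_price"],
                persistence_type="LAPSE",
            )
            instruction = filters.place_instruction(
                order_type="LIMIT",
                selection_id=selection_id,
                side="BACK",
                limit_order=order,
            )
            trading.betting.place_orders(market_id=market_id, instructions=[instruction])
            placed.add(key)
            print(f"Bet placed: {sig['signal_player']} @ {sig['betfair_price']}")
    time.sleep(3600)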
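
The polling loop leaves calculate_stake undefined. One common choice is fractional Kelly staking, sketched below. This is an assumption on our part rather than a description of any fault.bet staking rule, and it takes the model probability and price rather than the edge alone. The bankroll, the quarter-Kelly fraction, and the 2% cap are illustrative values.

```python
def kelly_stake(model_prob, decimal_odds, bankroll, fraction=0.25, max_pct=0.02):
    """Fractional Kelly stake for a back bet, capped at max_pct of bankroll."""
    b = decimal_odds - 1.0  # net odds received on a win
    if b <= 0 or not 0.0 < model_prob < 1.0:
        return 0.0
    full_kelly = (model_prob * b - (1.0 - model_prob)) / b
    if full_kelly <= 0:
        return 0.0  # no positive edge, no bet
    stake = bankroll * min(full_kelly * fraction, max_pct)
    return round(stake, 2)

print(kelly_stake(0.65, 1.82, bankroll=1000))
print(kelly_stake(0.50, 1.82, bankroll=1000))
```

The cap matters more than the Kelly fraction when the model's probabilities are optimistic, because full Kelly is very sensitive to overestimated edges.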

4. Use case: Polymarket and prediction markets

Prediction markets like Polymarket are growing fast, and tennis is one of the sports they're adding. The mechanics are different from traditional betting — you're buying "shares" in an outcome — but the underlying question is the same: what's the true probability?

The fault.bet API gives you calibrated probabilities. Here's how to use them on Polymarket:

Finding mispriced contracts

# Get model probabilities for today's matches
matches = requests.get(f"{BASE}/matches/today", headers=h).json()

for match in matches["data"]:
    model_p1 = match["p1_win_prob"]

    # Compare with Polymarket price (you'd scrape or use their API)
    poly_price = get_polymarket_price(match["player1"], match["player2"])

    if poly_price and abs(model_p1 - poly_price) > 0.08:
        direction = "BUY" if model_p1 > poly_price else "SELL"
        edge = abs(model_p1 - poly_price)
        print(f"{direction} {match['player1']} contract")
        print(f"  Model: {model_p1:.0%}, Polymarket: {poly_price:.0%}")
        print(f"  Edge: {edge:.0%}")

Prediction markets are less efficient than Betfair for tennis because they have less liquidity and fewer sharp bettors. The edges tend to be larger but the volume you can trade is smaller.

Important: Prediction market prices can move quickly during matches. The fault.bet API provides pre-match probabilities. For in-play trading, you'd need to combine our pre-match data with live scoring data.
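
To turn a probability gap into a trade decision, it helps to think in expected value per share. The sketch below assumes a share pays 1 unit if the outcome occurs, that there are no fees, and that a NO share costs 1 minus the YES price (no spread). Under those assumptions, "selling" an overpriced YES contract is treated as buying NO. The function name and the 0.08 threshold are illustrative.

```python
def share_trade(model_prob, yes_price, threshold=0.08):
    """Return the side with expected profit per share above threshold, else None."""
    ev_yes = model_prob - yes_price  # expected profit per YES share
    if ev_yes > threshold:
        side, cost, ev = "BUY YES", yes_price, ev_yes
    elif -ev_yes > threshold:
        side, cost, ev = "BUY NO", 1.0 - yes_price, -ev_yes
    else:
        return None
    return {"side": side, "ev_per_share": ev, "return_on_cost": ev / cost}

for model_p, price in [(0.62, 0.50), (0.55, 0.50), (0.30, 0.45)]:
    print(model_p, price, share_trade(model_p, price))
```

Return on cost is the more useful number when liquidity is thin, since it tells you how much you earn per unit of capital tied up until the market resolves.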

5. Use case: Sports journalism and content

If you write about tennis, you need data to back up your analysis. "Djokovic is clutch" is an opinion. "Djokovic closes out 87% of sets when serving at 5-4" is a fact. The API gives you facts.

Pre-match preview data

# Pull stats for a match preview
p1 = requests.get(f"{BASE}/player/Rublev", headers=h).json()["data"]
p2 = requests.get(f"{BASE}/player/Fils", headers=h).json()["data"]

print(f"=== {p1['name']} vs {p2['name']} ===\n")

print(f"Elo: {p1['ratings']['elo_overall']:.0f} vs {p2['ratings']['elo_overall']:.0f}")
print(f"Hold %: {p1['rolling_stats']['hold_pct_20w']:.0%} vs {p2['rolling_stats']['hold_pct_20w']:.0%}")
print(f"Break %: {p1['rolling_stats']['break_pct_20w']:.0%} vs {p2['rolling_stats']['break_pct_20w']:.0%}")
print(f"Choke: {p1['pressure']['choke_index']:.2f} vs {p2['pressure']['choke_index']:.2f}")
print(f"Closing: {p1['pressure']['closing_rate']:.0%} vs {p2['pressure']['closing_rate']:.0%}")
print(f"Deciding sets: {p1['pressure']['deciding_set_pct']:.0%} vs {p2['pressure']['deciding_set_pct']:.0%}")

That's a match preview written in 10 lines of Python. You can build a weekly newsletter, a podcast stat sheet, or a pre-tournament analysis piece from the same data.

Trend stories

The player history endpoint gives you time-series data going back to 2022. Want to write about Sinner's rise? Pull his Elo trajectory:

# Sinner's Elo trajectory over time
history = requests.get(f"{BASE}/player/Sinner/history", headers=h).json()

for point in history["data"][-12:]:  # Last 12 data points
    print(f"{point['date']}: Elo {point['elo_overall']:.0f}, WElo {point['welo_overall']:.0f}")

Plot that in a chart and you've got a visual story that no other outlet has — because nobody else maintains surface-specific, recency-weighted Elo ratings with 8-week momentum tracking.

6. Use case: Fantasy tennis and DFS

Daily fantasy sports (DraftKings, FanDuel) score tennis players based on games won, aces, break points, and match results. The fault.bet API helps you find underpriced players:

Finding value picks

# Screen the top 50 WTA players by Elo for strong servers (compare against DFS salaries yourself)
rankings = requests.get(f"{BASE}/rankings/elo?tour=wta&limit=50", headers=h).json()

for player in rankings["data"]:
    # High Elo + high ace rate = likely to score fantasy points
    profile = requests.get(f"{BASE}/player/{player['name']}", headers=h).json()["data"]

    ace_rate = profile["rolling_stats"].get("ace_rate_20w", 0)
    hold_pct = profile["rolling_stats"]["hold_pct_20w"]

    # Ace machines who hold serve well = high fantasy floor
    if ace_rate > 0.5 and hold_pct > 0.80:
        print(f"{player['name']}: {ace_rate:.1f} aces/game, {hold_pct:.0%} hold")

The weather splits add another dimension. If you know tomorrow's match is outdoors in 25+ km/h wind, you can find players whose serve doesn't degrade in wind — they'll outperform their DFS salary.

7. Use case: Building your own prediction model

If you're a data scientist or quantitative trader, you probably want the raw features to build your own model. The fault.bet API gives you two options:

Option 1: Use the API for live features

Pull player stats before each match, compute your own diff features, run through your model:

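import requests

def get_player_stats(name):
    # Flatten the nested profile groups into one dict
    # (assumes welo_momentum_8w sits under "ratings")
    data = requests.get(f"{BASE}/player/{name}", headers=h).json()["data"]
    return {**data["ratings"], **data["rolling_stats"], **data["pressure"]}

today_matches = requests.get(f"{BASE}/matches/today", headers=h).json()["data"]

# For each match, get both player profiles
features = []
for match in today_matches:
    p1 = get_player_stats(match["player1"])
    p2 = get_player_stats(match["player2"])

    # Compute diff features (player1 - player2)
    diff = {
        "elo_diff": p1["elo_overall"] - p2["elo_overall"],
        "hold_diff": p1["hold_pct_20w"] - p2["hold_pct_20w"],
        "break_diff": p1["break_pct_20w"] - p2["break_pct_20w"],
        "choke_diff": p2["choke_index"] - p1["choke_index"],  # inverted
        "momentum_diff": p1["welo_momentum_8w"] - p2["welo_momentum_8w"],
    }
    features.append(diff)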
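
For a logistic model, "running it through your model" means taking a weighted sum of the diff features and passing it through a sigmoid. The sketch below shows that scoring step only. Apart from the Elo coefficient, the weights are made-up placeholders, not fitted values and not fault.bet's model; in practice you would fit them on historical matches.

```python
import math

# Placeholder coefficients for illustration only; fit your own on historical data.
# ln(10)/400 on elo_diff makes an Elo-only model match the standard Elo formula.
WEIGHTS = {
    "elo_diff": math.log(10) / 400,
    "hold_diff": 2.0,
    "break_diff": 2.0,
    "choke_diff": 1.0,
    "momentum_diff": 0.002,
}

def win_probability(diff, weights=WEIGHTS, intercept=0.0):
    """Logistic score: sigmoid of the weighted sum of diff features."""
    z = intercept + sum(weights[name] * diff.get(name, 0.0) for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

example = {"elo_diff": 78, "hold_diff": 0.04, "break_diff": 0.02,
           "choke_diff": 0.05, "momentum_diff": 40}
print(f"P(player1 wins) = {win_probability(example):.1%}")
```

Because every feature is a player1-minus-player2 difference and the intercept is zero, swapping the two players gives the complementary probability.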

Option 2: Buy the CSV for training data

If you need historical snapshots for backtesting, the Data CSV gives you the full dataset: 250 ATP + 250 WTA players with all 40+ features, updated weekly. At £200 per download for both tours (or £600 for 6 downloads over a year), it's significantly cheaper than building the data pipeline yourself.

The CSV includes everything the API provides, but in a format you can load directly into pandas, R, or Excel:

import pandas as pd

atp = pd.read_csv("faultbet_atp_top250_2026-04-23.csv")
wta = pd.read_csv("faultbet_wta_top250_2026-04-23.csv")

# Immediate analysis
print(atp.sort_values("welo_overall", ascending=False)[["player_name", "welo_overall", "hold_pct_20w", "choke_index"]].head(10))

8. Deep dive: Elo ratings and why WElo matters

Standard Elo was designed for chess. It works reasonably well for tennis but has a critical flaw: it treats a match from 6 months ago the same as a match from last week.

Tennis form is volatile. A player can be unbeatable for two months and then crash for six weeks. Djokovic in January is a different beast from Djokovic in March. Standard Elo can't capture this because it updates by the same amount regardless of when the match was played.

WElo (Weighted Elo) fixes this with exponential decay. Recent matches carry more weight. The welo_momentum_8w field tells you whether a player is trending up or down over the last 8 weeks — positive means improving, negative means declining.
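
The exact WElo formula isn't spelled out in this guide, so the following is only a generic illustration of exponential recency weighting, not an Elo update. Each result is weighted by 0.5 ** (age / half_life), so a match one half-life old counts half as much as one played today. The 60-day half-life and the sample results are assumptions.

```python
def recency_weight(age_days, half_life_days=60.0):
    """Weight that halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)

def weighted_win_rate(results, half_life_days=60.0):
    """results: list of (age_days, won) tuples. Returns the recency-weighted win rate."""
    total = sum(recency_weight(age, half_life_days) for age, _ in results)
    if total == 0:
        return None  # no matches to weight
    wins = sum(recency_weight(age, half_life_days) for age, won in results if won)
    return wins / total

# Invented sample: strong six months ago, losing recently.
results = [(7, False), (14, False), (30, True), (150, True), (180, True), (200, True)]
plain = sum(won for _, won in results) / len(results)
weighted = weighted_win_rate(results)
print(f"Unweighted win rate: {plain:.0%}")
print(f"Recency-weighted:    {weighted:.0%}")
```

The unweighted figure still looks healthy because of the older wins, while the weighted one drops sharply. That is the same effect the momentum field is meant to surface.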

Right now, our API shows Alcaraz with a WElo momentum of -135. That's a significant decline. The market may not have fully priced this in because his "name value" and official ranking still look strong. But the data says his recent form has dropped off a cliff.

Surface matters as well, which is why we maintain surface-specific Elo. Alcaraz's clay Elo (1961) is noticeably lower than his hard court Elo (2041). If he's about to play on clay, the market might price him based on his overall ranking, but our surface Elo suggests he's less dominant on this surface than people think.

How to use Elo for pricing

The simplest model: take both players' surface-matched WElo ratings and compute expected win probability:

def elo_probability(elo_a, elo_b):
    return 1 / (1 + 10 ** ((elo_b - elo_a) / 400))

# Sinner (clay WElo 2097) vs Alcaraz (clay WElo 2109)
prob = elo_probability(2097, 2109)
print(f"Sinner win prob: {prob:.1%}")  # ~48.3%
print(f"Fair price: {1/prob:.2f}")     # ~2.07

If Betfair has Sinner at 2.30 (an implied probability of 43.5%), that's an edge of roughly 5 percentage points. Combine this with serve stats and pressure metrics and you've got a multi-factor model that's more accurate than Elo alone.

9. Deep dive: Pressure metrics nobody else tracks

The choke index is our most distinctive feature. Here's how it works:

We look at every game where a player was in a winning position — ahead in the score, serving for the set, up a break. Then we measure how often they lose from that position. A choke index of 0.10 means they convert 90% of their advantages. A choke index of 0.45 means they blow nearly half their leads.

This metric is invisible to traditional stats. Two players might both hold serve 85% of the time, but one holds at 5-5 and the other collapses at 5-4. The hold percentage doesn't distinguish between these — the choke index does.
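
Read literally, the definition above is a ratio: games lost from a winning position divided by all games played from a winning position. The sketch below shows only that ratio. The record format is invented (the ahead and won fields are hypothetical), and what counts as a winning position is whatever rule you define; the production index is built from point-level data that isn't reproduced here.

```python
def choke_index(games):
    """Share of games lost among those the player entered from a winning position."""
    from_ahead = [g for g in games if g["ahead"]]
    if not from_ahead:
        return None  # no winning positions observed
    lost = sum(1 for g in from_ahead if not g["won"])
    return lost / len(from_ahead)

# Invented sample: 20 games from a winning position (2 lost), 5 from behind.
games = (
    [{"ahead": True, "won": True}] * 18
    + [{"ahead": True, "won": False}] * 2
    + [{"ahead": False, "won": False}] * 5
)
index = choke_index(games)
print(f"Choke index: {index:.2f}, converts {1 - index:.0%} of advantages")
```

Games played from behind are excluded from both the numerator and the denominator, which is why the index can differ so much from a plain hold or win percentage.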

How the market misprices pressure

Bookmakers set prices based on overall ability, recent results, and market sentiment. They don't adjust for pressure-specific performance because they don't have the data. This creates systematic mispricings:

Players with low choke indices get underpriced in tight matches — the market doesn't give them enough credit for converting close situations
Players with high choke indices get overpriced as favourites — their overall stats look good but they leak value in the moments that decide matches
Tiebreak specialists are undervalued in sets that project close — a player who wins 60% of tiebreaks vs one who wins 40% is worth a lot more in a match where multiple tiebreaks are likely

The API makes this data available for every player. You don't need to build the analysis pipeline — just pull the numbers and compare against the market.

10. Deep dive: Weather-adjusted serve projections

This one is genuinely underexplored in tennis analytics. Most models treat every hard court match the same, whether it's played indoors in Rotterdam in January or outdoors in Miami in 30+ km/h wind.

We use historical weather data from Open-Meteo (hourly conditions for every match venue) to compute each player's serve performance in different conditions:

Wind above 20 km/h — ball trajectory becomes unpredictable, first serve accuracy drops, toss becomes inconsistent
Cold conditions (below 15°C) — ball doesn't bounce as high, heavier, favours flat hitters over spin-heavy players
High altitude (Madrid, Bogota) — ball travels faster through thinner air, serve speeds increase, rallies shorten
Indoor vs outdoor — no wind, consistent bounce, big servers dominate indoors

The API returns splits like hold_pct_windy and hold_pct_calm. When there's a significant gap (some players drop 15+ percentage points in wind), that's actionable information that the market may not be pricing in.

# Example: Finding wind-vulnerable players
rankings = requests.get(f"{BASE}/rankings/elo?tour=atp&limit=100", headers=h).json()

for player in rankings["data"]:
    profile = requests.get(f"{BASE}/player/{player['name']}", headers=h).json()["data"]
    ws = profile.get("weather_splits", {})
    calm = ws.get("hold_pct_calm", 0)
    windy = ws.get("hold_pct_windy", 0)

    if calm and windy and (calm - windy) > 0.08:
        print(f"{player['name']}: hold drops {(calm - windy) * 100:.0f} points in wind ({calm:.0%} → {windy:.0%})")

Run that before a windy outdoor event and you've got a list of players the market is likely overpricing.

11. The data CSV: bulk analysis without code

Not everyone wants to write Python. If you're an analyst working in Excel, a researcher building a dataset, or a content creator who just needs numbers — the CSV download is simpler.

Each download gives you:

Column	Example (Sinner)	What it tells you
elo_overall	2156	Overall strength rating
welo_overall	2360	Recency-weighted strength
welo_delta_8w	+80	Trending up or down
hold_pct_20w	99.2%	Current serve form
break_pct_20w	33.9%	Return game quality
choke_index	0.065	Pressure performance
closing_rate	93.6%	Finishing matches
hold_pct_windy	93.8%	Serve in wind
hold_pct_indoor	89.5%	Indoor performance
ace_rate_cold	0.44	Aces in cold weather

...plus 30 more columns. Updated at point of download.

Pricing

ATP Top 250 — £125 per download
WTA Top 250 — £125 per download
Both tours (500 players) — £200 per download
6 downloads over 12 months — £600 (save £600)
10 downloads over 12 months — £1,000 (save £1,000)

Buy at fault.bet/api. The CSV is emailed to you immediately after purchase.

12. Pricing and getting started

API — £59/month

Full access to all endpoints. 1,000 calls per day. Cancel anytime. Your API key is emailed immediately after payment.
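
Loops that fetch one profile per player can use up a daily allowance quickly. A small in-memory cache with a call counter is one way to stay under it. In this sketch, fetch is any function you supply that performs the HTTP request; the class name and the way the budget is enforced are illustrative, and how the API itself responds once the limit is reached isn't covered here.

```python
class CachedClient:
    """Cache responses per path and refuse to exceed a daily call budget."""

    def __init__(self, fetch, daily_limit=1000):
        self.fetch = fetch  # callable: path -> parsed JSON
        self.daily_limit = daily_limit
        self.calls = 0
        self.cache = {}

    def get(self, path):
        if path in self.cache:
            return self.cache[path]  # served locally, no call spent
        if self.calls >= self.daily_limit:
            raise RuntimeError("daily call budget exhausted")
        self.calls += 1
        self.cache[path] = self.fetch(path)
        return self.cache[path]

    def reset(self):
        """Call once per day to clear stale data and the counter."""
        self.calls = 0
        self.cache.clear()

# Stand-in fetcher so the example runs offline.
def fake_fetch(path):
    return {"status": "ok", "path": path}

client = CachedClient(fake_fetch, daily_limit=2)
client.get("/player/Sinner")
client.get("/player/Sinner")  # cache hit, not counted
client.get("/player/Alcaraz")
print(f"Calls used: {client.calls}")
```

With the requests library, the fetcher could be as simple as lambda path: requests.get(BASE + path, headers=h).json(), reusing the BASE and h variables from the earlier examples.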

Endpoints included:

GET /v1/signals/today — today's value signals with edge percentages
GET /v1/matches/today — all matches with win probabilities
GET /v1/player/{name} — full player profile (40+ features)
GET /v1/rankings/elo — Elo rankings by tour and surface
GET /v1/player/{name}/matches — recent match history
GET /v1/player/{name}/history — historical feature time series
GET /v1/tournament/{name}/matches — tournament match data
GET /v1/data/features/export — bulk feature export
GET /v1/data/matches/export — bulk match data
GET /v1/data/stats/export — bulk serve stats

Full interactive documentation at api.fault.bet/docs.

Start using the fault.bet Tennis Data API today.

Built on 1.6M matches · Updated daily · ATP + WTA coverage

Who's using this data?

We built this API because we needed it ourselves. fault.bet started as a Betfair tennis trading project — we built the data pipeline, the feature engineering, the models. Then we realised the data itself was the most valuable part.

The signal service has a verified track record of 71W 65L since launch. Every signal is publicly tracked at fault.bet/results. But the signals are just one application of the data. The same data powers Polymarket analysis, match previews, fantasy sports research, and academic papers.

If you're doing anything with tennis data, we've probably already solved the hardest part — the data collection, cleaning, and feature engineering. The API gives you the finished product.