If you're reading this, you probably fall into one of a few categories. Maybe you trade tennis on Betfair and you're tired of gut-feel decisions. Maybe you're a Polymarket trader looking for an edge on tennis contracts. Maybe you're a journalist who needs serve data for a story. Maybe you build models and you need clean training data.
Whatever brought you here, the underlying problem is the same: tennis data is a mess. It's scattered across dozens of sources, formatted inconsistently, missing key metrics, and almost nobody provides the kind of derived features — pressure indices, weather-adjusted performance, surface-specific ratings — that actually move the needle on prediction accuracy.
We built the fault.bet Tennis Data API to fix that. One endpoint, one API key, clean JSON. 500 active players, 40+ computed features each, updated daily. This guide covers everything you can do with it.
Most tennis data sources give you the basics: who won, what the score was, maybe some serve percentages. That's fine for a Wikipedia article but useless for prediction. The gap between "Sinner won 6-3 6-4" and understanding why he won — and whether he'll win next time — is enormous.
The fault.bet API provides 40+ computed features per player, updated daily. These aren't raw stats pulled from a match database. They're derived metrics calculated from rolling windows and exponentially weighted averages, then cross-referenced with weather and venue data. Here's what you get:
Standard Elo tells you who's better overall. But tennis is a surface sport. A player rated 1800 overall might be 2100 on clay and 1500 on grass. We maintain separate Elo ratings for hard court, clay, and grass, plus weighted versions (WElo) that give recent matches more importance.
Right now, the API returns this for Carlos Alcaraz:
{
  "elo_overall": 2078,
  "elo_hard": 2041,
  "elo_clay": 1961,
  "elo_grass": 1840,
  "welo_overall": 2228,
  "welo_momentum_8w": -135.2
}
That momentum number is interesting. -135 over 8 weeks means Alcaraz's form has dropped significantly in the last two months. The standard Elo (2078) doesn't reflect that because it moves slowly. The WElo (2228) does — it's pulling back hard. If you're pricing a match, which number do you trust more?
We compute rolling averages over the last 20 matches and 52 matches for every serve and return metric that matters:
The 20-match window captures current form. The 52-match window captures baseline ability. When they diverge, something interesting is happening — a player improving rapidly, or one falling off a cliff.
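One way to act on that divergence is to subtract the long-window figure from the short-window one. The sketch below is only an illustration: the function names, the sample hold percentages, and the 3-point threshold are all made up for the example, and you would feed in whichever 20-match and 52-match fields the API returns for the metric you care about.

```python
def form_gap(recent: float, baseline: float) -> float:
    """Short-window value minus long-window value, in the metric's own units."""
    return recent - baseline


def describe_form(recent: float, baseline: float, threshold: float = 0.03) -> str:
    """Label a player as improving, declining, or steady.

    The default threshold is an arbitrary illustration, not a fault.bet setting.
    """
    gap = form_gap(recent, baseline)
    if gap > threshold:
        return "improving"
    if gap < -threshold:
        return "declining"
    return "steady"


# Hypothetical hold percentages: 20-match window vs 52-match window
print(describe_form(recent=0.88, baseline=0.83))  # improving
print(describe_form(recent=0.79, baseline=0.84))  # declining
print(describe_form(recent=0.85, baseline=0.84))  # steady
```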
This is where it gets unique. Nobody else provides these because they require point-level data analysis, not just match results. We compute:
Why this matters for pricing: Two players with identical Elo ratings can have wildly different pressure profiles. If Player A chokes 40% of the time and Player B closes 90% of the time, the market price should reflect that — but bookmakers rarely adjust for it. That's edge.
Wind above 20 km/h reduces first-serve accuracy by 5-8% on average. But the effect isn't uniform. Some players' hold rates barely move in wind. Others drop by 15 percentage points.
We track:
When Madrid's altitude and wind combine with a specific player matchup, these splits become genuinely predictive.
You need an API key. Get one at fault.bet/api (£59/month, cancel anytime). Your key starts with fb_ and gets emailed to you immediately after payment.
Then:
# Get today's ATP rankings by Elo
# (the URL is quoted so the shell doesn't treat & as a background operator)
curl -H "X-API-Key: fb_your_key_here" \
  "https://api.fault.bet/v1/rankings/elo?tour=atp&limit=5"
Response:
{
  "status": "ok",
  "data": [
    {"rank": 1, "name": "Jannik Sinner", "elo_overall": 2156},
    {"rank": 2, "name": "Carlos Alcaraz", "elo_overall": 2078},
    {"rank": 3, "name": "Novak Djokovic", "elo_overall": 1971},
    {"rank": 4, "name": "Alexander Zverev", "elo_overall": 1921},
    {"rank": 5, "name": "Arthur Fils", "elo_overall": 1818}
  ]
}
# pip install requests
import requests

API_KEY = "fb_your_key_here"
BASE = "https://api.fault.bet/v1"
headers = {"X-API-Key": API_KEY}

# Get a player profile with all stats
player = requests.get(f"{BASE}/player/Sinner", headers=headers).json()
elo = player["data"]["ratings"]["elo_overall"]
hold = player["data"]["rolling_stats"]["hold_pct_20w"]
choke = player["data"]["pressure"]["choke_index"]
# .1% keeps one decimal place so the hold figure matches the output shown
print(f"Sinner: Elo {elo:.0f}, hold {hold:.1%}, choke {choke:.3f}")
# Output: Sinner: Elo 2156, hold 99.2%, choke 0.065
Full interactive documentation with "try it" buttons is at api.fault.bet/docs.
This is what fault.bet was originally built for. The core idea is simple: if your model says a player has a 65% chance of winning but Betfair is pricing them at 55% (odds of 1.82), you have an edge of 10 percentage points. Bet enough of these and the edge compounds.
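The arithmetic behind that example can be written out as a small sketch. The function names are illustrative, and commission is ignored for simplicity.

```python
def implied_probability(decimal_odds: float) -> float:
    """Convert decimal odds to the probability they imply (commission ignored)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def edge(model_prob: float, decimal_odds: float) -> float:
    """Model probability minus market-implied probability, as a fraction."""
    return model_prob - implied_probability(decimal_odds)


def expected_return(model_prob: float, decimal_odds: float) -> float:
    """Expected profit per unit staked, if the model probability is right."""
    return model_prob * decimal_odds - 1.0


print(f"Market implied: {implied_probability(1.82):.1%}")    # 54.9%
print(f"Edge: {edge(0.65, 1.82):.1%}")                       # 10.1%
print(f"Expected return: {expected_return(0.65, 1.82):.1%}")  # 18.3%
```

The edge here is a difference in probabilities, while the expected return is what that difference is worth per unit staked at the quoted price. The two numbers answer different questions, so it helps to be explicit about which one you mean.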
import requests

API_KEY = "fb_your_key_here"
BASE = "https://api.fault.bet/v1"
h = {"X-API-Key": API_KEY}

# Get today's signals (pre-computed value bets)
signals = requests.get(f"{BASE}/signals/today", headers=h).json()
for sig in signals["data"]:
    print(f"{sig['player1']} vs {sig['player2']}")
    print(f"  Back: {sig['signal_player']}")
    print(f"  Model: {sig['model_prob']:.0%}")
    print(f"  Market: {sig['market_implied']:.0%}")
    print(f"  Edge: {sig['edge_pct']:.1f}%")
    print()
The signals endpoint does all the heavy lifting — it's already compared model probability against live exchange prices and filtered for matches where the edge exceeds 6%. You just need to place the bet.
If you want to build your own model rather than using our signals, the player endpoint gives you everything you need:
# Compare two players for a specific match
p1 = requests.get(f"{BASE}/player/Sinner", headers=h).json()["data"]
p2 = requests.get(f"{BASE}/player/Alcaraz", headers=h).json()["data"]

# Simple Elo-based win probability
elo_diff = p1["ratings"]["elo_clay"] - p2["ratings"]["elo_clay"]
p1_win_prob = 1 / (1 + 10 ** (-elo_diff / 400))

# Adjust for pressure: if one player chokes more
choke_diff = p1["pressure"]["choke_index"] - p2["pressure"]["choke_index"]
# Lower choke = better under pressure = slight boost
pressure_adj = choke_diff * -0.05
adjusted_prob = p1_win_prob + pressure_adj

# Compare against market
betfair_price = 1.85  # from Betfair API
market_implied = 1 / betfair_price
edge = adjusted_prob - market_implied
print(f"Model: {adjusted_prob:.1%}, Market: {market_implied:.1%}, Edge: {edge:.1%}")
The more features you incorporate — serve stats, weather, fatigue, surface form — the more accurate your model becomes. Our API gives you all the ingredients; you build the recipe.
For serious traders, checking the API manually isn't practical. You can poll the signals endpoint every hour and auto-place via the Betfair API when value appears:
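# Poll for new signals every hour
import time

import betfairlightweight
from betfairlightweight import filters

trading = betfairlightweight.APIClient(...)
trading.login()

placed = set()  # signals already bet on, so hourly polls don't repeat orders

while True:
    signals = requests.get(f"{BASE}/signals/today", headers=h).json()
    for sig in signals["data"]:
        key = (sig["player1"], sig["player2"], sig["signal_player"])
        if key in placed:
            continue
        if sig["edge_pct"] > 8.0 and sig["confidence"] > 70:
            # find_betfair_market and calculate_stake are helpers you write yourself.
            # find_betfair_market is assumed here to return (market_id, selection_id).
            market_id, selection_id = find_betfair_market(sig)
            order = filters.limit_order(
                price=sig["betfair_price"],
                size=calculate_stake(sig["edge_pct"]),
                persistence_type="LAPSE",
            )
            # Betfair expects the limit order wrapped in a place instruction
            instruction = filters.place_instruction(
                order_type="LIMIT",
                selection_id=selection_id,
                side="BACK",
                limit_order=order,
            )
            trading.betting.place_orders(market_id=market_id, instructions=[instruction])
            placed.add(key)
            print(f"Bet placed: {sig['signal_player']} @ {sig['betfair_price']}")
    time.sleep(3600)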
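The loop above calls `calculate_stake` without defining it. One simple possibility, sketched here with placeholder numbers, is a stake that grows with the edge and is capped at a fixed share of the bankroll. The bankroll, the scaling factor, the cap, and the minimum stake are all assumptions for illustration; check your exchange's actual minimum bet size.

```python
def calculate_stake(edge_pct: float, bankroll: float = 1000.0,
                    stake_per_point: float = 0.0025,
                    max_fraction: float = 0.05,
                    min_stake: float = 2.0) -> float:
    """Stake that scales with edge and is capped at a share of the bankroll.

    edge_pct is the edge in percentage points (e.g. 8.0 for 8%).
    All defaults are placeholders, not recommendations.
    """
    if edge_pct <= 0:
        return 0.0
    # Fraction of bankroll to risk: proportional to edge, never above the cap
    fraction = min(edge_pct * stake_per_point, max_fraction)
    stake = round(bankroll * fraction, 2)
    # Skip bets too small for the exchange to accept
    return stake if stake >= min_stake else 0.0


print(calculate_stake(8.0))   # 20.0
print(calculate_stake(30.0))  # 50.0 (capped at 5% of bankroll)
print(calculate_stake(0.5))   # 0.0 (below the minimum stake)
```

A cap matters because a very large apparent edge is more often a data problem than a real opportunity.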
Prediction markets like Polymarket are growing fast, and tennis is one of the sports they're adding. The mechanics are different from traditional betting — you're buying "shares" in an outcome — but the underlying question is the same: what's the true probability?
The fault.bet API gives you calibrated probabilities. Here's how to use them on Polymarket:
# Get model probabilities for today's matches
matches = requests.get(f"{BASE}/matches/today", headers=h).json()
for match in matches["data"]:
    model_p1 = match["p1_win_prob"]
    # Compare with Polymarket price (you'd scrape or use their API)
    poly_price = get_polymarket_price(match["player1"], match["player2"])
    if poly_price and abs(model_p1 - poly_price) > 0.08:
        direction = "BUY" if model_p1 > poly_price else "SELL"
        edge = abs(model_p1 - poly_price)
        print(f"{direction} {match['player1']} contract")
        print(f"  Model: {model_p1:.0%}, Polymarket: {poly_price:.0%}")
        print(f"  Edge: {edge:.0%}")
Prediction markets are less efficient than Betfair for tennis because they have less liquidity and fewer sharp bettors. The edges tend to be larger but the volume you can trade is smaller.
Important: Prediction market prices can move quickly during matches. The fault.bet API provides pre-match probabilities. For in-play trading, you'd need to combine our pre-match data with live scoring data.
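For a binary contract that pays 1 if the player wins and 0 otherwise, the value of a share follows directly from the model probability and the price. The sketch below ignores fees and slippage, and the function names are illustrative.

```python
def share_expected_value(model_prob: float, price: float) -> float:
    """Expected profit per share bought at `price` (between 0 and 1).

    A winning share pays 1, so profit is (1 - price) on a win and -price on a loss.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be between 0 and 1")
    return model_prob * (1.0 - price) - (1.0 - model_prob) * price


def return_on_cost(model_prob: float, price: float) -> float:
    """Expected profit as a fraction of the money paid for the share."""
    return share_expected_value(model_prob, price) / price


# Hypothetical example: the model says 62%, the contract trades at 0.50
print(f"EV per share: {share_expected_value(0.62, 0.50):.2f}")  # 0.12
print(f"Return on cost: {return_on_cost(0.62, 0.50):.0%}")      # 24%
```

The expected value per share simplifies to model probability minus price, which is why comparing the two numbers directly works as a first filter.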
If you write about tennis, you need data to back up your analysis. "Djokovic is clutch" is an opinion. "Djokovic closes out 87% of sets when serving at 5-4" is a fact. The API gives you facts.
# Pull stats for a match preview
p1 = requests.get(f"{BASE}/player/Rublev", headers=h).json()["data"]
p2 = requests.get(f"{BASE}/player/Fils", headers=h).json()["data"]
print(f"=== {p1['name']} vs {p2['name']} ===\n")
print(f"Elo: {p1['ratings']['elo_overall']:.0f} vs {p2['ratings']['elo_overall']:.0f}")
print(f"Hold %: {p1['rolling_stats']['hold_pct_20w']:.0%} vs {p2['rolling_stats']['hold_pct_20w']:.0%}")
print(f"Break %: {p1['rolling_stats']['break_pct_20w']:.0%} vs {p2['rolling_stats']['break_pct_20w']:.0%}")
print(f"Choke: {p1['pressure']['choke_index']:.2f} vs {p2['pressure']['choke_index']:.2f}")
print(f"Closing: {p1['pressure']['closing_rate']:.0%} vs {p2['pressure']['closing_rate']:.0%}")
print(f"Deciding sets: {p1['pressure']['deciding_set_pct']:.0%} vs {p2['pressure']['deciding_set_pct']:.0%}")
That's a match preview written in 10 lines of Python. You can build a weekly newsletter, a podcast stat sheet, or a pre-tournament analysis piece from the same data.
The player history endpoint gives you time-series data going back to 2022. Want to write about Sinner's rise? Pull his Elo trajectory:
# Sinner's Elo trajectory over time
history = requests.get(f"{BASE}/player/Sinner/history", headers=h).json()
for point in history["data"][-12:]:  # Last 12 data points
    print(f"{point['date']}: Elo {point['elo_overall']:.0f}, WElo {point['welo_overall']:.0f}")
Plot that in a chart and you've got a visual story that no other outlet has — because nobody else maintains surface-specific, recency-weighted Elo ratings with 8-week momentum tracking.
Daily fantasy sports (DraftKings, FanDuel) score tennis players based on games won, aces, break points, and match results. The fault.bet API helps you find underpriced players:
# Shortlist high-Elo players with strong serve numbers, then compare against DFS salaries
rankings = requests.get(f"{BASE}/rankings/elo?tour=wta&limit=50", headers=h).json()
for player in rankings["data"]:
    # High Elo + high ace rate = likely to score fantasy points
    profile = requests.get(f"{BASE}/player/{player['name']}", headers=h).json()["data"]
    ace_rate = profile["rolling_stats"].get("ace_rate_20w", 0)
    hold_pct = profile["rolling_stats"]["hold_pct_20w"]
    # Ace machines who hold serve well = high fantasy floor
    if ace_rate > 0.5 and hold_pct > 0.80:
        print(f"{player['name']}: {ace_rate:.2f} aces/game, {hold_pct:.0%} hold")
The weather splits add another dimension. If you know tomorrow's match is outdoors in 25+ km/h wind, you can find players whose serve doesn't degrade in wind — they'll outperform their DFS salary.
If you're a data scientist or quantitative trader, you probably want the raw features to build your own model. The fault.bet API gives you two options:
Pull player stats before each match, compute your own diff features, run through your model:
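import numpy as np
from sklearn.linear_model import LogisticRegression

# today_matches and get_player_stats are placeholders: a list of today's matches
# and a helper that returns one player's features as a flat dict.
features = []
for match in today_matches:
    # For each match, get both player profiles
    p1 = get_player_stats(match["player1"])
    p2 = get_player_stats(match["player2"])
    # Compute diff features (player1 - player2)
    diff = {
        "elo_diff": p1["elo_overall"] - p2["elo_overall"],
        "hold_diff": p1["hold_pct_20w"] - p2["hold_pct_20w"],
        "break_diff": p1["break_pct_20w"] - p2["break_pct_20w"],
        "choke_diff": p2["choke_index"] - p1["choke_index"],  # inverted
        "momentum_diff": p1["welo_momentum_8w"] - p2["welo_momentum_8w"],
    }
    features.append(diff)

# Feature matrix, one row per match, ready for a model fitted on historical data
X = np.array([list(d.values()) for d in features])
# model = LogisticRegression().fit(X_train, y_train)
# probs = model.predict_proba(X)[:, 1]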
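To make the last step concrete, here is a minimal sketch of how diff features like these turn into a win probability. It uses only the standard library and hand-set placeholder weights; in practice the weights would come from fitting a model (for example a logistic regression) on historical matches, so treat every number below as an assumption.

```python
import math

# Placeholder weights for illustration only; fit real ones on historical matches.
WEIGHTS = {
    "elo_diff": 0.005,
    "hold_diff": 2.0,
    "break_diff": 2.0,
    "choke_diff": 1.0,
    "momentum_diff": 0.002,
}


def predict_p1_win(diff: dict, weights: dict = WEIGHTS, intercept: float = 0.0) -> float:
    """Logistic model: sigmoid of the weighted sum of the diff features."""
    z = intercept + sum(w * diff.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical diff features for one match (player1 minus player2)
example = {
    "elo_diff": 80.0,
    "hold_diff": 0.04,
    "break_diff": 0.02,
    "choke_diff": 0.05,
    "momentum_diff": -30.0,
}
print(f"P(player1 wins) = {predict_p1_win(example):.1%}")
```

Because every feature is a player1-minus-player2 difference and the intercept is zero, swapping the players flips the probability to one minus its previous value, which is a useful sanity check on any model built this way.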
If you need historical snapshots for backtesting, the Data CSV gives you the full dataset: 250 ATP + 250 WTA players with all 40+ features, updated weekly. At £200 per download (or £600 for 6 downloads over a year), it's significantly cheaper than building the data pipeline yourself.
The CSV includes everything the API provides, but in a format you can load directly into pandas, R, or Excel:
import pandas as pd
atp = pd.read_csv("faultbet_atp_top250_2026-04-23.csv")
wta = pd.read_csv("faultbet_wta_top250_2026-04-23.csv")
# Immediate analysis
print(atp.sort_values("welo_overall", ascending=False)[["player_name", "welo_overall", "hold_pct_20w", "choke_index"]].head(10))
Standard Elo was designed for chess. It works reasonably well for tennis but has a critical flaw: it treats a match from 6 months ago the same as a match from last week.
Tennis form is volatile. A player can be unbeatable for two months and then crash for six weeks. Djokovic in January is a different beast from Djokovic in March. Standard Elo can't capture this because it updates by the same amount regardless of when the match was played.
WElo (Weighted Elo) fixes this with exponential decay. Recent matches carry more weight. The welo_momentum_8w field tells you whether a player is trending up or down over the last 8 weeks — positive means improving, negative means declining.
Right now, our API shows Alcaraz with a WElo momentum of -135. That's a significant decline. The market may not have fully priced this in because his "name value" and official ranking still look strong. But the data says his recent form has dropped off a cliff.
Surface matters too. Alcaraz's clay Elo (1961) is well below his hard court Elo (2041). If he's about to play on clay, the market might price him off his overall ranking, but our surface Elo suggests he's less dominant there than people think.
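The exact WElo formula isn't spelled out here, so the following is only an illustration of the general idea of recency weighting, not fault.bet's implementation. It runs a standard Elo update but shrinks the step size for older matches with an exponential decay. The 90-day half-life, the K-factor of 32, and the starting rating of 1500 are all assumptions.

```python
from datetime import date


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def recency_weight(match_date: date, today: date, half_life_days: float = 90.0) -> float:
    """Weight that halves every `half_life_days`; a match played today has weight 1."""
    age_days = max((today - match_date).days, 0)
    return 0.5 ** (age_days / half_life_days)


def weighted_rating(matches, today: date, start: float = 1500.0, k: float = 32.0) -> float:
    """Replay matches in chronological order with a decayed K-factor.

    matches: list of (match_date, opponent_rating, won) tuples.
    """
    rating = start
    for match_date, opponent_rating, won in matches:
        step = k * recency_weight(match_date, today)
        actual = 1.0 if won else 0.0
        rating += step * (actual - expected_score(rating, opponent_rating))
    return rating


today = date(2026, 4, 1)
recent_win = [(date(2026, 4, 1), 1500.0, True)]
old_win = [(date(2026, 1, 1), 1500.0, True)]  # 90 days earlier
print(weighted_rating(recent_win, today))  # 1516.0
print(weighted_rating(old_win, today))     # 1508.0
```

The same win moves the rating half as far when it is one half-life old, which is the behaviour the text describes: recent matches carry more weight.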
The simplest model: take both players' surface-matched WElo ratings and compute expected win probability:
def elo_probability(elo_a, elo_b):
    return 1 / (1 + 10 ** ((elo_b - elo_a) / 400))
# Sinner (clay WElo 2097) vs Alcaraz (clay WElo 2109)
prob = elo_probability(2097, 2109)
print(f"Sinner win prob: {prob:.1%}") # ~48.3%
print(f"Fair price: {1/prob:.2f}") # ~2.07
If Betfair has Sinner at 2.30 (an implied probability of about 43.5%), that's an edge of roughly 5 percentage points. Combine this with serve stats and pressure metrics and you've got a multi-factor model that's more accurate than Elo alone.
The choke index is our most unique feature. Here's how it works:
We look at every game where a player was in a winning position — ahead in the score, serving for the set, up a break. Then we measure how often they lose from that position. A choke index of 0.10 means they convert 90% of their advantages. A choke index of 0.45 means they blow nearly half their leads.
This metric is invisible to traditional stats. Two players might both hold serve 85% of the time, but one holds at 5-5 and the other collapses at 5-4. The hold percentage doesn't distinguish between these — the choke index does.
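As a rough illustration of the definition above, the index is the share of winning positions that were not converted. The input format (one boolean per winning position) is a simplification assumed for this sketch; the real calculation works from point-level data.

```python
def choke_index(converted_flags):
    """Fraction of winning positions that were lost.

    converted_flags: iterable of bools, one per winning position,
    True if the player converted it and False if they lost from there.
    Returns None when there are no winning positions to measure.
    """
    flags = list(converted_flags)
    if not flags:
        return None
    lost = sum(1 for converted in flags if not converted)
    return lost / len(flags)


# Hypothetical sample: 20 winning positions, 18 converted
sample = [True] * 18 + [False] * 2
print(choke_index(sample))  # 0.1
```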
Bookmakers set prices based on overall ability, recent results, and market sentiment. They don't adjust for pressure-specific performance because they don't have the data. This creates systematic mispricings:
The API makes this data available for every player. You don't need to build the analysis pipeline — just pull the numbers and compare against the market.
This one is genuinely underexplored in tennis analytics. Most models treat every hard court match the same, whether it's played indoors in Rotterdam in January or outdoors in Miami in 30+ km/h wind.
We use historical weather data from Open-Meteo (hourly conditions for every match venue) to compute each player's serve performance in different conditions:
The API returns splits like hold_pct_windy and hold_pct_calm. When there's a significant gap (some players drop 15+ percentage points in wind), that's actionable information that the market isn't pricing in.
# Example: Finding wind-vulnerable players
rankings = requests.get(f"{BASE}/rankings/elo?tour=atp&limit=100", headers=h).json()
for player in rankings["data"]:
    profile = requests.get(f"{BASE}/player/{player['name']}", headers=h).json()["data"]
    ws = profile.get("weather_splits", {})
    calm = ws.get("hold_pct_calm", 0)
    windy = ws.get("hold_pct_windy", 0)
    if calm and windy and (calm - windy) > 0.08:
        print(f"{player['name']}: hold drops {(calm-windy)*100:.0f}% in wind ({calm:.0%} → {windy:.0%})")
Run that before a windy outdoor event and you've got a list of players the market is likely overpricing.
Not everyone wants to write Python. If you're an analyst working in Excel, a researcher building a dataset, or a content creator who just needs numbers — the CSV download is simpler.
Each download gives you:
| Column | Example (Sinner) | What it tells you |
|---|---|---|
| elo_overall | 2156 | Overall strength rating |
| welo_overall | 2360 | Recency-weighted strength |
| welo_delta_8w | +80 | Trending up or down |
| hold_pct_20w | 99.2% | Current serve form |
| break_pct_20w | 33.9% | Return game quality |
| choke_index | 0.065 | Pressure performance |
| closing_rate | 93.6% | Finishing matches |
| hold_pct_windy | 93.8% | Serve in wind |
| hold_pct_indoor | 89.5% | Indoor performance |
| ace_rate_cold | 0.44 | Aces in cold weather |
...plus 30 more columns. Updated at point of download.
Buy at fault.bet/api. The CSV is emailed to you immediately after purchase.
Full access to all endpoints. 1,000 calls per day. Cancel anytime. Your API key is emailed immediately after payment.
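Loops that fetch one profile per player can use up a daily allowance quickly, so it may help to cache responses and count calls on your side. The class below is a hypothetical client-side helper, not part of the API. It takes the fetch function as a parameter so the example runs without a network connection.

```python
class CachedClient:
    """Cache responses per path and stop before a call budget is exceeded."""

    def __init__(self, fetch, daily_limit=1000):
        self._fetch = fetch            # callable: path -> parsed response
        self._daily_limit = daily_limit
        self._cache = {}
        self.calls_made = 0            # note: not reset automatically each day

    def get(self, path):
        if path in self._cache:
            return self._cache[path]   # no API call spent
        if self.calls_made >= self._daily_limit:
            raise RuntimeError("daily call budget exhausted")
        self.calls_made += 1
        result = self._fetch(path)
        self._cache[path] = result
        return result


def fake_fetch(path):
    """Stand-in for a real HTTP call, used only for this demonstration."""
    return {"status": "ok", "path": path}


client = CachedClient(fake_fetch, daily_limit=2)
client.get("/player/Sinner")
client.get("/player/Sinner")   # served from cache
client.get("/player/Alcaraz")
print(client.calls_made)       # 2
```

With the real API you would pass something like `lambda path: requests.get(f"{BASE}{path}", headers=h).json()` as the fetch function, and clear the cache whenever you need fresh data.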
Endpoints included:
GET /v1/signals/today: today's value signals with edge percentages
GET /v1/matches/today: all matches with win probabilities
GET /v1/player/{name}: full player profile (40+ features)
GET /v1/rankings/elo: Elo rankings by tour and surface
GET /v1/player/{name}/matches: recent match history
GET /v1/player/{name}/history: historical feature time series
GET /v1/tournament/{name}/matches: tournament match data
GET /v1/data/features/export: bulk feature export
GET /v1/data/matches/export: bulk match data
GET /v1/data/stats/export: bulk serve stats
Full interactive documentation at api.fault.bet/docs.
Start using the fault.bet Tennis Data API today.
We built this API because we needed it ourselves. fault.bet started as a Betfair tennis trading project: we built the data pipeline, the feature engineering, the models. Then we realised the data itself was the most valuable part.
The signal service has a verified track record of 71W 65L since launch. Every signal is publicly tracked at fault.bet/results. But the signals are just one application of the data. The same data powers Polymarket analysis, match previews, fantasy sports research, and academic papers.
If you're doing anything with tennis data, we've probably already solved the hardest part — the data collection, cleaning, and feature engineering. The API gives you the finished product.