Elo Rating

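When a player wins a game, their rating goes up and the loser's rating goes down. The size of the change depends on the rating gap: a higher-rated player gains only a few points for beating a lower-rated opponent, while an upset win by the lower-rated player transfers more points. The Elo system is relatively easy to improve on, but improvements usually come at the cost of added complexity.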

Probability

New rating = Old rating + K * (outcome - expected outcome)

where:
- New rating is the updated rating after the game
- Old rating is the player's rating before the game
- K is a constant (the K-factor) that sets the maximum number of points a rating can change in a single game
- Outcome is the actual result of the game (1 for a win, 0 for a loss, 0.5 for a draw)
- Expected outcome is the player's expected score (the probability of winning, if draws are ignored), calculated using the following formula:

Expected outcome = 1 / (1 + 10^((opponent's rating - player's rating) / 400))
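
As a worked example of the two formulas above, the sketch below computes the expected outcome and the new rating for a 1500-rated player facing a 1700-rated opponent. The function names `expected_score` and `new_rating` are illustrative, and K = 32 is an assumed value rather than a fixed part of the system.

```python
def expected_score(player_rating: float, opponent_rating: float) -> float:
    """Expected outcome for the player, per the formula above."""
    return 1 / (1 + 10 ** ((opponent_rating - player_rating) / 400))


def new_rating(old_rating: float, opponent_rating: float, outcome: float, k: float = 32) -> float:
    """New rating = old rating + K * (outcome - expected outcome)."""
    return old_rating + k * (outcome - expected_score(old_rating, opponent_rating))


# A 1500-rated player facing a 1700-rated opponent (K = 32 is an assumption).
e = expected_score(1500, 1700)
print(round(e, 3))                          # 0.24
print(round(new_rating(1500, 1700, 1), 2))  # 1524.31 after an upset win
print(round(new_rating(1500, 1700, 0), 2))  # 1492.31 after a loss
```

The underdog gains about 24 points for a win but loses only about 8 for a loss, because the loss was already the expected result.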

# Define a function to calculate the Elo rating for each player
def calculate_elo(player_A, player_B, result):
    # Set the basic parameters for the Elo calculation
    K = 32
    RA = player_A.rating
    RB = player_B.rating

    # Calculate the expected score for each player
    EA = 1 / (1 + 10**((RB - RA) / 400))
    EB = 1 / (1 + 10**((RA - RB) / 400))

    # Update each player's rating based on the actual result:
    # "A" means player A won, "B" means player B won, "T" means a tie
    if result == "A":
        RA = RA + K * (1 - EA)
        RB = RB + K * (0 - EB)
    elif result == "B":
        RA = RA + K * (0 - EA)
        RB = RB + K * (1 - EB)
    elif result == "T":
        RA = RA + K * (0.5 - EA)
        RB = RB + K * (0.5 - EB)
    else:
        # Reject unknown result codes instead of silently leaving ratings unchanged
        raise ValueError(f"result must be 'A', 'B', or 'T', got {result!r}")

    # Set the updated ratings for each player
    player_A.rating = RA
    player_B.rating = RB
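
The function above reads and writes a `rating` attribute on each player object, but the player type itself is not shown. A minimal sketch, assuming a hypothetical `Player` dataclass and a non-mutating helper named `update_ratings` (both names are illustrative, not from the original), applies the same update to one game:

```python
from dataclasses import dataclass


@dataclass
class Player:
    # Hypothetical container; the update only needs a `rating` attribute.
    name: str
    rating: float = 1500.0


def update_ratings(ra, rb, score_a, k=32):
    """Return the new (A, B) ratings; score_a is 1, 0.5, or 0 from A's side."""
    ea = 1 / (1 + 10 ** ((rb - ra) / 400))
    eb = 1 - ea  # the two expected scores always sum to 1
    return ra + k * (score_a - ea), rb + k * ((1 - score_a) - eb)


alice, bob = Player("Alice", 1600), Player("Bob", 1400)
total_before = alice.rating + bob.rating

# Bob, the lower-rated player, wins (score_a = 0 from Alice's side).
alice.rating, bob.rating = update_ratings(alice.rating, bob.rating, score_a=0)

print(round(alice.rating, 2), round(bob.rating, 2))  # 1575.69 1424.31
# With the same K for both players, one player's gain is the other's loss.
print(abs((alice.rating + bob.rating) - total_before) < 1e-9)  # True
```

Returning the new ratings instead of mutating the players makes the update easy to test in isolation. Whether that is preferable to the in-place version depends on how the surrounding code stores players.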

Use Cases

  • Matching players in online multiplayer games
  • Ranking professional sports teams or players
  • Evaluating the performance of political candidates in an election
  • Predicting the success of romantic relationships in online dating (Mark Zuckerberg allegedly used Elo in his "Facemash" site to rank students against one another)
  • Ranking the quality of restaurants or other businesses based on customer ratings and reviews

Shortcomings

  • Rating protection: players may stop playing in order to preserve their current rating
  • Selective match-making: players seek out opponents they believe are overrated and avoid those they believe are underrated
  • Inability to compare ratings across periods, as ratings may inflate or deflate over time
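
The selective match-making incentive can be made concrete with a small calculation. The sketch below assumes a player whose true strength matches their 1500 rating, K = 32, and opponents who actually play at 1500 strength but whose listed ratings are off by 200 points. All of these numbers, and the helper name `expected_gain`, are illustrative assumptions. The average rating change per game is K times the gap between the true expected score and the expected outcome computed from the listed ratings.

```python
def expected_score(player_rating, opponent_rating):
    """Expected outcome for the player under the Elo formula."""
    return 1 / (1 + 10 ** ((opponent_rating - player_rating) / 400))


def expected_gain(own_rating, listed_opponent_rating, true_opponent_strength, k=32):
    """Average rating change per game when the opponent's listed rating is wrong.

    Assumes the player's own rating is accurate.
    """
    true_score = expected_score(own_rating, true_opponent_strength)
    predicted_score = expected_score(own_rating, listed_opponent_rating)
    return k * (true_score - predicted_score)


# Overrated opponent: listed at 1700, actually plays at 1500 strength.
print(round(expected_gain(1500, 1700, 1500), 2))  # 8.31
# Underrated opponent: listed at 1300, actually plays at 1500 strength.
print(round(expected_gain(1500, 1300, 1500), 2))  # -8.31
```

Under these assumptions, a player who can choose opponents gains about 8 points per game on average against the overrated opponent and loses about the same against the underrated one, without any change in actual skill.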