Calculation of Performance Rating

alexholowczak

It is sometimes desirable to post the results of tournaments elsewhere, or amend the results of a tournament after it has finished for "some reason"...

There is an opportunity to do that via the API and new links on the tournament download pages, because you can generate the TRF.

Lichess's two tie-breaks for Swisses are Sonneborn-Berger and Performance Rating. I've recently posted about an issue with Sonneborn-Berger. However, the TRF includes the ratings at the end of the tournament, which means when you import the TRF into other software, you get the Performance Rating calculated based on that rating.

There are two ways in which Lichess could calculate Performance:
1. Using the rating at the start of the tournament
2. Using the rating before each individual game in the tournament

Which is it?

For the purpose of reproducibility, my fingers are crossed that it is 1. If it is 1, is it possible for the TRF to include the rating at the start of the tournament, not the end of the tournament?

tpr

#1
"Lichess's two tie-breaks for Swisses are Sonneborn-Berger and Performance Rating. "
That is what Lichess does, but it is the wrong choice. Sonneborn-Berger is a good tie-breaker for round robins, not for Swiss tournaments.
Per the FIDE Handbook, modified Buchholz should be the tie-breaker for Swiss tournaments.

alexholowczak

You've said that once in another thread already. The FIDE Handbook does not tell people what tie-breaks to use, it provides recommendations. People (or websites!) can make their own judgements.

tpr

#3
That is right, but it is the deviation from the recommendations that causes your problem.
Performance rating at the start? performance rating at the end? Performance rating with or without correction for players disqualified for cheating?

13.16.4. Individual Swiss Tournaments where not all the ratings are consistent:
Buchholz Cut 1
Buchholz
Sonneborn-Berger
Cumulative system - Sum of Progressive Scores
Direct encounter
The greater number of wins including forfeits
The greater number of wins with Black pieces

13.16.5. Individual Swiss Tournaments where all the ratings are consistent:
Buchholz Cut 1
Buchholz
Direct encounter
AROC
The greater number of wins including forfeits
The greater number of wins with Black pieces
The greater number of games with Black (unplayed games shall be counted as played with White)
Sonneborn-Berger

handbook.fide.com/files/handbook/C02Standards.pdf

Toadofsky

#1 "Lichess's two tie-breaks for Swisses are Sonneborn-Berger and Performance Rating. I've recently posted about an issue with Sonneborn-Berger. However, the TRF includes the ratings at the end of the tournament, which means when you import the TRF into other software, you get the Performance Rating calculated based on that rating."

You get the performance rating calculated based on that rating... what? Huh? http://xyproblem.info/

alexholowczak

@Toadofsky

1. I download the Lichess TRF. It exports with the end-of-tournament ratings.

2. I then import the TRF into (e.g.) Swiss-Manager, which calculates the performance rating from the end-of-tournament ratings.

3. The Lichess calculation of performance rating in the API does not equal the performance rating in 2.

4. There are the following possibilities that I can think of:
(a) Lichess is calculating performance rating wrong in principle. (I assume not)
(b) Lichess is not using the rating at the end of the tournament to calculate the performance rating. (I assume more likely, because using the end of tournament rating is the wrong thing to do in any case.)

5. What Lichess should be doing is using the rating at the start of the tournament to calculate performance rating. This is not reproducible from the exported TRF; at least, not without manually changing the rating of every single player that entered the tournament.

The questions:
1. How does Lichess calculate Performance Rating?
2. Are the ratings that Lichess uses to calculate the Performance Rating the same as the rating contained in the TRF?
3. If the answer to 2 is no, can the TRF export be adjusted so that the ratings Lichess uses to calculate the performance are the same as the ones contained in the TRF?
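
Until that is answered, the manual rating changes mentioned in point 5 could in principle be scripted. The following is only a rough sketch: `set_trf_ratings` and `start_ratings` are hypothetical names, and it assumes the fixed-width TRF16 player line layout ("001" in columns 1-3, starting rank in columns 5-8, rating in columns 49-52), which should be checked against an actual Lichess export. Where the start-of-tournament ratings would come from is left open.

```python
def set_trf_ratings(trf_text, start_ratings):
    """Return TRF text with the rating field of each player line replaced.

    `start_ratings` maps starting rank number -> rating to write.
    Assumption: TRF16 fixed-width player lines, with "001" in columns 1-3,
    starting rank in columns 5-8 and rating in columns 49-52 (1-based).
    """
    out = []
    for line in trf_text.splitlines():
        # Only player lines are touched; header lines pass through unchanged.
        if line.startswith("001") and len(line) >= 52:
            rank = int(line[4:8])
            if rank in start_ratings:
                # Right-align the new rating in the 4-character rating field.
                line = line[:48] + f"{start_ratings[rank]:>4}" + line[52:]
        out.append(line)
    return "\n".join(out)
```

Calling `set_trf_ratings(trf_text, {1: 1800, 2: 1725})` would overwrite the ratings of the players with starting ranks 1 and 2 and leave everything else in the file as it was.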

Toadofsky

#6 OK, I endorse (agree with) your questions and am trying to find answers (at least to the first two):
github.com/ornicar/lila/commit/18c882cc7a053b740ac33569b19f48904b34594c#diff-aa0c90c32dfa73096e4e256a134f9a205bd2f79c9d379fd0673b59f68dc953c0

1. Time permitting I will try to figure this out.
2. Time permitting I will try to figure this out.
3. My knee-jerk reaction here is that "if somehow the answer to 2 is no, the answer to 3 is that it's probably a lot of work to bring the two implementations into sync otherwise that work would have been done already" but I am uncertain.

kleineme

The performance seems to be the average of
opponent's rating in case of a draw
opponent's rating +500 in case of a win
opponent's rating -500 in case of a loss

This seems to be consistent for players with a few games in arena tournaments. For no apparent reason the results are a couple of points off for more than a few games, even for 50% results. In arena tournaments the ratings taken into account are the ones at the start of any given game.
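
The averaging rule described above can be sketched as follows. This only restates the observed pattern; it is not taken from the Lichess source, and the names `RESULT_OFFSET` and `performance` are made up for illustration.

```python
from statistics import mean

# Offsets applied to the opponent's rating, per the rule observed in this thread.
# This is an empirical guess, not a confirmed Lichess implementation.
RESULT_OFFSET = {"win": 500, "draw": 0, "loss": -500}


def performance(games):
    """Average of the offset-adjusted opponent ratings.

    `games` is a list of (opponent_rating, result) pairs, where result is
    "win", "draw" or "loss". Returns None when no games were played.
    """
    if not games:
        return None
    return mean(rating + RESULT_OFFSET[result] for rating, result in games)


# One win, one draw and one loss against 1500, 1600 and 1700.
games = [(1500, "win"), (1600, "draw"), (1700, "loss")]
print(performance(games))
```

Under this rule a 50% score simply gives the average opponent rating, so any deviation for such results has to come from which ratings are fed in, not from the formula.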

For Swiss I've checked one of my recent tournaments, where I didn't come to a consistent conclusion. In a few cases I had a perfect fit with the above formula for the ratings at the start of the tournament, but not even for most of them. Anyway, this is counterintuitive when it's the ratings at the end of the tournament which are listed in the players' score cards and in the TRF.

I would support the idea to create the TRF with the ratings at the start of the tournament. And I would support the idea to calculate the performance according to § 8.1a of the FIDE Rating Regulations (handbook.fide.com/chapter/B022017). Meanwhile I'll keep modifying the export files from SwissChess manually ;-)

CM TBest

"I would support the idea to create the TRF with the ratings at the start of the tournament"

You should really make sure to define what you expect here. Does this mean the rating a player has when they sign up? Or the rating when the first game is played (BUT NOT the rating that is SHOWN during the first game, since that could be different).

Things get a bit confusing when a player signs up to an arena, plays games, and then the arena starts, since the rating changes in between. (Example of confusion: github.com/ornicar/lila/issues/8327)

Similarly, a player's rating could change during a game (if you play two games at the same time etc.) or really at any point. What I am trying to say is that finding a point where the rating is constant is perhaps easier said than done. But one should probably be really clear about the implementation that's wanted, to avoid the weird edge cases where a player's rating changes.

I do, ofc, agree that a less confusing and more easily reproducible calculation would be nice. Just keep in mind that "start of the tournament" also needs to be defined.

[As a side note: other weird behavior includes stuff like this: github.com/ornicar/lila/issues/7963, not sure if this could also apply to Swiss when a TC is changed.]

kleineme

#10

Thanks for your reply, these are valid points to be taken into account. I would say that the rating at the start of the first game in that tournament would be ok.

Regarding rating changes after joining a tournament, this is only a problem in Arena tournaments. I've just checked with one of my future Swiss tournaments: I've played two games and the rating change is already reflected in the upcoming Swiss tournament.
