Hey guys, I wanted to share a little side project I've been toying around with.
What is expected goals ("xG")? It is a method for estimating the quality of chances that a football team creates or concedes in a match. In real-life football, there is plenty of historical data on the location of the shot, the type of pass that assisted that shot, whether the attacker dribbled before taking the shot, etc. By aggregating all of these factors, you can estimate the likelihood of scoring from each shot. For example, if your team takes 10 shots during a match, and each one of those shots has a historical goal rate of 0.2 (aka 20%), then on average, your team would expect to score 2 goals in that match.
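For anyone who likes to see the arithmetic, here is a minimal Python sketch of that 10-shot example. The variable names are my own, and the 0.2 rate for every shot is just the illustrative figure from the paragraph above, not real data.

```python
# Illustrative only: ten shots, each with an assumed historical goal rate of 0.2.
shot_goal_rates = [0.2] * 10

# Expected goals for the match is simply the sum of the per-shot goal rates.
expected_goals = sum(shot_goal_rates)

print(f"Expected goals from {len(shot_goal_rates)} shots: {expected_goals:.1f}")
```

In a real-life model each shot would have its own rate depending on location, assist type and so on; the sum works the same way.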
In FMM, the data that we get is very limited. However, that doesn't mean we can't have fun and use the idea behind xG to create a model that suits the game. There are a few uses for this, but let me list the ones I like most:
Sometimes it feels like you got unlucky (or lucky) in a match. The difference between actual goals scored and expected goals answers that question (and tells you by how much!)
Sometimes you will think "my team is creating chances but not scoring". A simple calculation quantifies that in a single number, which helps show how much of that is perception and how much is fact.
xG is a good way to discuss and compare the effectiveness of a tactic, especially over time. If you are inclined to keep track of your results for fun, or if you're testing something, or for a career narrative, xG can provide insight into how well a given tactic is working for you.
Arriving at xG in real life is an extremely complicated calculation that takes into account a dozen factors. However, we don't have many of the same key data points in FMM so the math is much simpler. All we can use is shots, shots on target ("SOT"), and clear-cut chances ("CCC"), plus a few secondary statistics.
After experimenting with the data, I decided to primarily focus on SOT and CCC. I made this choice because it's simple while still getting the job done, and because FMM is a simulation and real-life analogies don't always apply. While a high number of shots is nice, a shot that is off target doesn't force a save from the goalkeeper, and it's the save that triggers the extra check to determine whether a goal is scored. By comparison, in real-life xG calculations, all shots count because even the act of getting a shot off can be an indication of the potency of an attack. In FMM, the link between the two isn't always clear because players often fire off shots without any rhyme or reason. As this is a simulation, the goal of any tactic should be to force the goalkeeper into making saves as frequently as possible.
This is also a good place to mention that penalty kicks and own goals do NOT count toward goals scored when using this metric. Open play goals only (including free kicks).
The basic idea is then xG = (SOT*x) + (CCC*y), where x and y are constants representing how often each action is expected to result in a goal.
Now let's take a brief look at the data. It comes from three sources:
Season 1 from my career with El Ejido (49 matches)
Match screenshots from @BatiGoal's Villalibre career (138 matches)
Tests I've been running recently with MU & Leicester in the Community Shield (125 matches)
After evaluating all the data sources, I've settled on the following constants:
SOT = 0.15
CCC = 0.73 (note: this may be slightly higher than the actual goal rate from CCCs in-game)
I also thought that shot attempts, while a minor factor, should make a small contribution to the formula as a proxy for successful possession, therefore Shots = 0.005.
What this means is that, on average and over time, you should expect to see a goal from a CCC about 73% of the time. The same logic applies to SOT - a goal can be expected from about 15% of shots on target. So our final formula becomes xG = (Shots*0.005) + (SOT*0.15) + (CCC*0.73). And remember, open play goals only!
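If you'd rather not do this by hand after every match, here is a rough Python sketch of the formula. The function name fmm_xg and the sample match stats are made up for illustration; only the three constants come from this post.

```python
# Constants from this post: expected goals per shot, per shot on target,
# and per clear-cut chance.
SHOT_WEIGHT = 0.005
SOT_WEIGHT = 0.15
CCC_WEIGHT = 0.73


def fmm_xg(shots: int, sot: int, ccc: int) -> float:
    """Return xG = (Shots*0.005) + (SOT*0.15) + (CCC*0.73)."""
    return shots * SHOT_WEIGHT + sot * SOT_WEIGHT + ccc * CCC_WEIGHT


# Hypothetical match: 15 shots, 7 on target, 2 clear-cut chances.
match_xg = fmm_xg(shots=15, sot=7, ccc=2)
print(f"xG for the match: {match_xg:.3f}")
```

Compare the result against open play goals only, so leave penalties and own goals out of the actual goal count.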
This formula seems to scale well across saves and formations. Here are the results for the three data sources mentioned above:
BG save - goals scored 527 (xG = 531.255) / goals allowed 207 (xG = 269.895) (I've noticed that with good GKs, actual goals usually come in under xG, especially in the AI's case)
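To put a number on the lucky/unlucky question for BG's save, here is a small Python sketch that subtracts xG from actual goals using the totals above. The helper name xg_difference is my own; a negative value means fewer goals than the formula expected.

```python
def xg_difference(goals: int, xg: float) -> float:
    """Actual goals minus expected goals (negative = fewer goals than expected)."""
    return goals - xg


# Totals from BG's save, as listed above.
scored_diff = xg_difference(goals=527, xg=531.255)
allowed_diff = xg_difference(goals=207, xg=269.895)

print(f"Goals scored vs xG:  {scored_diff:+.3f}")
print(f"Goals allowed vs xG: {allowed_diff:+.3f}")
```

The attacking side lands within about 4 goals of xG over 138 matches, while the goals-allowed side comes in roughly 63 under, which fits the note about good keepers.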
A fun observation - in BG's save, I show 70 home games in which he had 288.925 xG (about 4.13 xG per game), while in the 68 games on the road he had 242.330 xG (about 3.56 xG per game). Home-field advantage is real in this game! And it's worth about a goal every two games!
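Here is a quick Python sketch of that home/away split, in case you want to run the same check on your own save. The variable names are mine; the totals and game counts are the ones from BG's save above.

```python
# Home and away xG totals and game counts from BG's save.
home_xg, home_games = 288.925, 70
away_xg, away_games = 242.330, 68

home_per_game = home_xg / home_games
away_per_game = away_xg / away_games
home_edge = home_per_game - away_per_game

print(f"Home xG per game: {home_per_game:.2f}")
print(f"Away xG per game: {away_per_game:.2f}")
print(f"Home edge per game: {home_edge:.2f}")
```

The edge works out to a bit over half a goal of xG per game, which is where the "goal every two games" figure comes from.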
I plan on using xG in my career thread for El Ejido when it suits the narrative as well as a key comparison metric for some tests I've been running for a future article. I welcome your comments to help refine these numbers.