# Training an ML model for sport results forecasting

A bet at odds of 2 yields a £1 profit on a £1 stake, while a bet at odds of 1.1 yields only £0.1. Both bets, however, lose the same £1 if they fail. So they are not equivalent: risking £1 to earn £1 is not the same as risking £1 to earn £0.1.
2022-10-26

#Data Science || #ML || #AI ||

To take this into account in our neural network, we need a custom loss function. The loss function (or objective function) measures how well the neural network performs, given its training samples and expected outcomes. Standard neural network classification uses loss functions such as categorical cross-entropy. However, such functions give equal weight to every bet, ignoring the differences in profitability.

In our case, we want the model to maximise the overall profit of the betting strategy. Thus, the input of our custom loss function must include the potential profit of each bet.
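To make the point concrete, here is a minimal sketch in plain Python. The helper `bet_profit` and the two lists of bets are hypothetical, introduced only for illustration. Both bettors win two bets out of three, so a metric based on accuracy would rate them equally, yet their profits differ.

```python
def bet_profit(odds: float, won: bool, stake: float = 1.0) -> float:
    """Profit of one bet at decimal odds: stake * (odds - 1) if won, -stake otherwise."""
    return stake * (odds - 1.0) if won else -stake


# Two hypothetical bettors with the same accuracy (2 wins out of 3 bets).
favourites = [(1.1, True), (1.1, True), (1.1, False)]
underdogs = [(2.0, True), (2.0, True), (2.0, False)]

pnl_favourites = sum(bet_profit(odds, won) for odds, won in favourites)  # about -0.8
pnl_underdogs = sum(bet_profit(odds, won) for odds, won in underdogs)    # 1.0

print(f"favourites: {pnl_favourites:+.2f}, underdogs: {pnl_underdogs:+.2f}")
```

Same hit rate, opposite outcomes: this is the gap a profit-aware loss function is meant to close.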

## Custom loss function

We set up our custom loss function with Keras on top of TensorFlow.

In Keras, the loss function takes two arguments:

• y_true: vector of true labels (home win, home win or draw, away win, away win or draw, draw, no bet). Based on our neural network architecture, this takes the form of a vector of 1s and 0s. For example, a game ending in a home win has the y_true vector (1,1,0,0,0,0).
• y_pred: Prediction vector. This is the output of our neural network classifier.

Since we can't pass the game odds to the loss function due to Keras constraints, we have to pass them as additional elements of the y_true vector.
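As a rough illustration of this packing trick, the NumPy sketch below builds one such row and splits it again. The game and its odds (2.0 for the home team, 3.5 for the away team) are made-up values, and the variable names are assumptions.

```python
import numpy as np

# Hypothetical game: home win, decimal odds 2.0 (home) and 3.5 (away).
labels = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # the six outcome flags
odds = np.array([2.0, 3.5])                          # home odds, away odds

# Keras only hands y_true and y_pred to the loss, so the odds ride along
# as two extra elements at the end of the label vector.
y_true_row = np.concatenate([labels, odds])          # shape (8,)

# Inside the loss function, the two parts are recovered by slicing.
recovered_labels = y_true_row[:6]
recovered_odds = y_true_row[6:8]
```

The prediction vector keeps its six elements; only y_true grows to eight.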

## A bit of Python code

Below is our custom loss function written in Python and Keras.
In a nutshell, it measures the average profit or loss per unit stake over the input batch. For each observation (each game), the following steps are performed:

• Get the odds from the y_true input.
• Calculate the potential profit of each bet, using the odds.
• Combine the profit of the winning bets and the loss of the losing bets.

Weighting these by the network's predicted probabilities gives the expected profit for this observation. We multiply this by -1 to get a "loss" to minimise (rather than a gain to maximise).
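The steps above can be summarised in a formula. The notation is an assumption introduced here for illustration: for game $i$ and bet type $j$, $o_{ij}$ is the decimal odds of the bet, $\hat{p}_{ij}$ is the network's output for that bet, and $N$ is the number of games in the batch.

$$
g_{ij} =
\begin{cases}
o_{ij} - 1 & \text{if bet } j \text{ wins in game } i,\\
-1 & \text{if it loses,}\\
0 & \text{for the "no bet" option,}
\end{cases}
\qquad
\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{6} \hat{p}_{ij}\, g_{ij}
$$

Minimising $\mathcal{L}$ is therefore the same as maximising the average profit per unit stake.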

```python
# Assumes the Keras 2.x backend API (K.concatenate, K.zeros_like, K.mean, K.sum).
from tensorflow.keras import backend as K


def odds_loss(y_true, y_pred):
    """
    The function implements the custom loss function

    Inputs
    y_true : a tensor of dimension (batch_size, 8). Columns 0-5 are the
             label-encoded outcome, columns 6-7 are the home and away odds
             (odds_a and odds_b).
    y_pred : a tensor of probabilities of dimension (batch_size, 6).

    Returns
    the loss value
    """
    win_home_team = y_true[:, 0:1]
    win_home_or_draw = y_true[:, 1:2]
    win_away = y_true[:, 2:3]
    win_away_or_draw = y_true[:, 3:4]
    draw = y_true[:, 4:5]
    no_bet = y_true[:, 5:6]  # not used: the "no bet" option always has zero P&L
    odds_a = y_true[:, 6:7]
    odds_b = y_true[:, 7:8]
    # Profit (odds - 1) when a bet wins, -1 when it loses, one column per bet type.
    gain_loss_vector = K.concatenate([
        win_home_team * (odds_a - 1) + (1 - win_home_team) * -1,
        win_home_or_draw * (1 / (1 - 1 / odds_b) - 1) + (1 - win_home_or_draw) * -1,
        win_away * (odds_b - 1) + (1 - win_away) * -1,
        win_away_or_draw * (1 / (1 - 1 / odds_a) - 1) + (1 - win_away_or_draw) * -1,
        draw * (1 / (1 - 1 / odds_a - 1 / odds_b) - 1) + (1 - draw) * -1,
        K.zeros_like(odds_a)], axis=1)
    return -1 * K.mean(K.sum(gain_loss_vector * y_pred, axis=1))
```
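One way to sanity-check the function is a NumPy mirror that can be evaluated by hand. The sketch below is not part of the original code: `odds_loss_np` is a hypothetical name, and the game (home win at odds 2.0 and 4.0) is a made-up example. As the formulas in the function suggest, the "or draw" and draw odds are derived from the two team odds by treating 1/odds as probabilities that sum to one.

```python
import numpy as np


def odds_loss_np(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """NumPy mirror of odds_loss, for checking values by hand."""
    win_home, win_home_or_draw, win_away, win_away_or_draw, draw = (
        y_true[:, j:j + 1] for j in range(5)
    )
    odds_a, odds_b = y_true[:, 6:7], y_true[:, 7:8]

    def pnl(win, odds):
        # Profit of (odds - 1) on a winning unit stake, -1 on a losing one.
        return win * (odds - 1) - (1 - win)

    gain_loss = np.concatenate([
        pnl(win_home, odds_a),
        pnl(win_home_or_draw, 1 / (1 - 1 / odds_b)),
        pnl(win_away, odds_b),
        pnl(win_away_or_draw, 1 / (1 - 1 / odds_a)),
        pnl(draw, 1 / (1 - 1 / odds_a - 1 / odds_b)),
        np.zeros_like(odds_a),  # "no bet" neither wins nor loses
    ], axis=1)
    return float(-np.mean(np.sum(gain_loss * y_pred, axis=1)))


# Hypothetical game: home win, odds 2.0 (home) and 4.0 (away).
y_true = np.array([[1, 1, 0, 0, 0, 0, 2.0, 4.0]])
all_in_home = np.array([[1.0, 0, 0, 0, 0, 0]])
all_in_away = np.array([[0, 0, 1.0, 0, 0, 0]])
no_bet = np.array([[0, 0, 0, 0, 0, 1.0]])

loss_home = odds_loss_np(y_true, all_in_home)  # -1: a £1 profit
loss_away = odds_loss_np(y_true, all_in_away)  # +1: the stake is lost
loss_none = odds_loss_np(y_true, no_bet)       # 0: nothing staked
```

Putting the whole stake on the correct outcome gives the lowest loss, the wrong outcome gives the highest, and abstaining sits in between at zero.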

## Data

For our data, we use a list of 200 English Premier League games from the 2018-2019 season (August to December 2018). It contains descriptive game data such as team names, Betfair odds and a sentiment score (the percentage of positive tweets among positive and negative tweets).

### Loading data from a file and splitting into training and testing data

Our data contains the results of each game as 1, 2 or 3:

• 1: home win
• 2: away win
• 3: Draw

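This needs to be converted into a one-hot encoded vector matching the output layer of our neural network (strictly speaking a multi-hot vector, since a home win also counts as "home win or draw"). We also append each team's odds as extra elements of this vector. This is exactly what we do below.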

```python
import numpy as np
from sklearn.model_selection import train_test_split


def get_data():
    # `data` is assumed to be a pandas DataFrame already loaded from the file.
    X = data.values[:, 5:-5]
    y = data.values[:, -1]
    y_full = np.zeros((X.shape[0], 8))
    for i, y_i in enumerate(y):
        if y_i == 1:  # home win
            y_full[i, 0] = 1.0
            y_full[i, 1] = 1.0
        if y_i == 2:  # away win
            y_full[i, 2] = 1.0
            y_full[i, 3] = 1.0
        if y_i == 3:  # draw
            y_full[i, 1] = 1.0
            y_full[i, 3] = 1.0
            y_full[i, 4] = 1.0
        y_full[i, 6] = X[i, 1]  # ADD ODDS OF HOME TEAM
        y_full[i, 7] = X[i, 2]  # ADD ODDS OF AWAY TEAM
    return X, y_full, y


X, y, outcome = get_data()
# SPLIT THE DATA IN TRAIN AND TEST DATASET.
train_x, test_x, train_y, test_y = train_test_split(X, y)
```

## Training the model

Before we can train the model, we must first define it. We use a fully connected neural network with two hidden layers. We use BatchNormalization to normalise the inputs to each layer and mitigate the vanishing gradient problem.
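A model along these lines could be sketched as follows. This is not the original definition: the function name `build_model`, the layer sizes, the ReLU activations, the softmax output and the placeholder of 10 input features are all assumptions chosen for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(input_dim: int, output_dim: int = 6,
                hidden_units=(64, 16)) -> keras.Model:
    """Fully connected classifier with two hidden layers (sizes are assumptions)."""
    inputs = keras.Input(shape=(input_dim,))
    x = layers.BatchNormalization()(inputs)
    for units in hidden_units:
        x = layers.Dense(units, activation="relu")(x)
        x = layers.BatchNormalization()(x)
    # Softmax (an assumption) spreads one unit of stake across the six bet types.
    outputs = layers.Dense(output_dim, activation="softmax")(x)
    return keras.Model(inputs=inputs, outputs=outputs)


model = build_model(input_dim=10)  # 10 features is a placeholder value
```

Such a model would then be compiled with the custom loss, for example `model.compile(optimizer="nadam", loss=odds_loss)`, where the choice of optimizer is again an assumption.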

Then we train the model using a set of arbitrary parameters.

We end up with a training loss of -0.05. Keep in mind that we are trying to minimise our loss function, which is the negative of our profit and loss. This number tells us that, on average, each bet produces a profit of £0.05 for every £1 staked. Our validation data set shows an average profit of £0.08 for every £1. Not bad, considering we only fed basic data to our neural network. Over 200 games, our theoretical NN betting strategy would yield between £10 and £16.6, assuming we bet £1 on each game (200 × £0.05 = £10; the upper figure presumably reflects an unrounded validation profit of about £0.083 per £1).

## Conclusion

We have presented a way to incorporate the P&L of bets into a neural network classifier using a custom loss function. This goes beyond accuracy, which can be misleading when designing betting systems. We believe it is useful for anyone who wants to use machine learning in sports.