Betting vs Investments
Sports betting can be treated as an investment whose purpose is to make a profit over a given time horizon.
A bet with odds of 2 may yield £1 profit, while odds of 1.1 yield a smaller profit of £0.1. Both bets, however, are subject to the same £1 loss if they fail. So they are not equal; risking £1 to earn £1 is not the same as risking £1 to earn £0.1.
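To make the asymmetry concrete, the short sketch below computes the expected profit of a £1 bet and the win probability needed to break even. The function names are illustrative and not part of the original code; decimal odds are assumed.

```python
def expected_profit(odds, win_probability, stake=1.0):
    """Expected profit of a single bet at decimal odds."""
    profit_if_win = stake * (odds - 1.0)
    loss_if_lose = stake
    return win_probability * profit_if_win - (1.0 - win_probability) * loss_if_lose


def break_even_probability(odds):
    """Win probability at which the expected profit is exactly zero."""
    return 1.0 / odds


for odds in (2.0, 1.1):
    print(odds, round(break_even_probability(odds), 3), round(expected_profit(odds, 0.8), 3))
```

With an assumed 80% win rate, the bet at odds of 2 is profitable, while the bet at odds of 1.1 loses money because it must win about 91% of the time just to break even.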
2022-10-26
To take this into account in our neural network, we need a custom loss function. The loss function (or objective function) measures how well the neural network performs, given its training sample and the expected output. In standard neural network classification, we use loss functions such as categorical cross-entropy. However, these functions give equal weight to all bets and ignore the differences in profitability.
In our case, we want the model to maximise the overall profit of the betting strategy. The input of our custom loss function must therefore include the potential profit of each bet.
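A small sketch illustrates why cross-entropy is blind to profitability. Two hypothetical games have the same outcome and the same model prediction, so their cross-entropy is identical, yet the profit differs once the odds are taken into account. The helper names and the odds are made up for illustration.

```python
import math


def cross_entropy(true_one_hot, predicted):
    """Categorical cross-entropy for a single sample."""
    return -sum(t * math.log(p) for t, p in zip(true_one_hot, predicted) if t)


def realised_profit(true_one_hot, predicted, odds):
    """Profit when the stake on each outcome equals its predicted probability."""
    return sum(
        p * (t * (o - 1.0) - (1 - t))
        for t, p, o in zip(true_one_hot, predicted, odds)
    )


label = [1, 0]            # outcome 0 happened in both games
prediction = [0.7, 0.3]   # same model output for both games
game_a_odds = [2.0, 2.0]  # hypothetical odds
game_b_odds = [1.1, 8.0]  # hypothetical odds

print(round(cross_entropy(label, prediction), 4))
print(round(realised_profit(label, prediction, game_a_odds), 2))
print(round(realised_profit(label, prediction, game_b_odds), 2))
```

The cross-entropy is the same for both games, but the first game makes money and the second loses money, which is the information a profit-aware loss needs to see.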
We set up our custom loss function with Keras on top of TensorFlow.
In Keras, a loss function takes two arguments: y_true, the tensor of true labels, and y_pred, the tensor of model predictions.
Since Keras does not let us pass the game odds to the loss function as a separate argument, we pass them as additional elements of the y_true vector.
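The packing trick can be sketched with NumPy on a hypothetical mini-batch of two games. The odds values are invented; the column layout (six bet flags followed by two odds columns) follows the loss function used in this article.

```python
import numpy as np

# Hypothetical mini-batch of two games: five outcome flags plus a "no bet" flag.
labels = np.array([
    [1, 1, 0, 0, 0, 0],   # home win
    [0, 1, 0, 1, 1, 0],   # draw
], dtype=float)

# Hypothetical decimal odds for the home and away teams.
odds = np.array([
    [1.8, 4.5],
    [2.6, 2.9],
])

# Append the odds as extra columns so they travel with the labels.
y_true = np.concatenate([labels, odds], axis=1)

# Inside the loss, slicing recovers each part while keeping a column shape.
outcome_flags = y_true[:, 0:6]
odds_home = y_true[:, 6:7]
odds_away = y_true[:, 7:8]

print(y_true.shape, odds_home.shape)
```

Slicing with 6:7 rather than indexing with 6 keeps a trailing dimension of 1, so each slice can later be concatenated column by column.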
Below is our custom loss function written in Python and Keras.
In a nutshell, it measures the average profit or loss per unit stake over a batch. For each observation (each game), it reads the odds from y_true, computes the potential profit or loss of each bet from those odds, and weights the results by the predicted probabilities in y_pred.
We end up with the expected profit for this observation. We multiply this by -1 to get a "loss" to minimise (rather than a gain to maximise).
from tensorflow.keras import backend as K  # assumes the Keras 2.x backend API (TF 2.x)


def odds_loss(y_true, y_pred):
    """
    Custom loss function: the negative mean profit per unit stake.
    Inputs
        y_true : tensor of shape (batch_size, 8). Six bet flags followed by
                 the odds of the home team (odds_a) and the away team (odds_b).
        y_pred : tensor of probabilities of shape (batch_size, 6).
    Returns
        the loss value
    """
    # Slices such as 0:1 keep a trailing dimension of 1 for the concatenation below.
    win_home_team = y_true[:, 0:1]
    win_home_or_draw = y_true[:, 1:2]
    win_away = y_true[:, 2:3]
    win_away_or_draw = y_true[:, 3:4]
    draw = y_true[:, 4:5]
    no_bet = y_true[:, 5:6]  # unused: the "no bet" output always yields 0
    odds_a = y_true[:, 6:7]
    odds_b = y_true[:, 7:8]
    # For each bet: profit of (odds - 1) if it wins, -1 if it loses.
    gain_loss_vector = K.concatenate([
        win_home_team * (odds_a - 1) + (1 - win_home_team) * -1,
        win_home_or_draw * (1 / (1 - 1 / odds_b) - 1) + (1 - win_home_or_draw) * -1,
        win_away * (odds_b - 1) + (1 - win_away) * -1,
        win_away_or_draw * (1 / (1 - 1 / odds_a) - 1) + (1 - win_away_or_draw) * -1,
        draw * (1 / (1 - 1 / odds_a - 1 / odds_b) - 1) + (1 - draw) * -1,
        K.zeros_like(odds_a)], axis=1)
    # Weight each bet by its predicted probability, then average over the batch.
    return -1 * K.mean(K.sum(gain_loss_vector * y_pred, axis=1))
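To check the arithmetic by hand, here is a minimal pure-Python sketch of the same calculation for a single game. The function game_profit and the example odds are hypothetical; the formulas for the combined bets are copied from odds_loss.

```python
def game_profit(flags, odds_home, odds_away, stakes):
    """Profit of one game when stakes[i] is placed on bet i (mirrors odds_loss)."""
    # Odds of each bet; the combined bets use the same formulas as odds_loss.
    odds_per_bet = [
        odds_home,                                         # home win
        1.0 / (1.0 - 1.0 / odds_away),                     # home win or draw
        odds_away,                                         # away win
        1.0 / (1.0 - 1.0 / odds_home),                     # away win or draw
        1.0 / (1.0 - 1.0 / odds_home - 1.0 / odds_away),   # draw
    ]
    # Profit of (odds - 1) if the bet wins, -1 if it loses.
    gains = [won * (o - 1.0) - (1 - won) for won, o in zip(flags, odds_per_bet)]
    gains.append(0.0)  # sixth output, "no bet": nothing won, nothing lost
    return sum(g * s for g, s in zip(gains, stakes))


flags = [1, 1, 0, 0, 0]               # a home win
all_on_home = [1, 0, 0, 0, 0, 0]
split_stake = [0.5, 0.5, 0, 0, 0, 0]  # half on home, half on home-or-draw

profits = [game_profit(flags, 2.0, 4.0, s) for s in (all_on_home, split_stake)]
loss = -sum(profits) / len(profits)   # negative mean profit, as in odds_loss
print(round(loss, 3))
```

Here the stakes play the role of y_pred: a softmax output sums to 1, so the model effectively spreads a £1 stake across the six bets.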
For our data, we take 200 English Premier League games from the 2018-2019 season, played between August and December 2018. The data contains descriptive information such as team names, odds from Betfair and a sentiment score (the share of positive tweets among positive and negative tweets).
Our data contains the result of each game as 1, 2 or 3, where 1 is a home win, 2 is an away win and 3 is a draw.
This needs to be converted into a one-hot style vector, with one flag per bet, representing the output layer of our neural network. We also add each team's odds as elements of this vector. This is exactly what we do below.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split


def get_data():
    data = pd.read_csv('extract-betsentiment-com.csv')
    X = data.values[:, 5:-5]
    y = data.values[:, -1]
    # Six bet flags followed by the home and away odds.
    y_full = np.zeros((X.shape[0], 8))
    for i, y_i in enumerate(y):
        if y_i == 1:  # home win
            y_full[i, 0] = 1.0
            y_full[i, 1] = 1.0
        if y_i == 2:  # away win
            y_full[i, 2] = 1.0
            y_full[i, 3] = 1.0
        if y_i == 3:  # draw
            y_full[i, 1] = 1.0
            y_full[i, 3] = 1.0
            y_full[i, 4] = 1.0
        y_full[i, 6] = X[i, 1]  # odds of the home team
        y_full[i, 7] = X[i, 2]  # odds of the away team
    return X, y_full, y


X, y, outcome = get_data()
# Split the data into training and test sets.
train_x, test_x, train_y, test_y = train_test_split(X, y)
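The encoding inside get_data can be checked on a single game with a small stand-alone helper. The function encode_result is hypothetical and only mirrors the column assignments in the loop above; the odds are invented.

```python
def encode_result(result, odds_home, odds_away):
    """Build the 8-element target row for one game.

    result: 1 = home win, 2 = away win, 3 = draw.
    """
    # Columns: home, home-or-draw, away, away-or-draw, draw, no bet, odds_a, odds_b.
    winning_columns = {1: (0, 1), 2: (2, 3), 3: (1, 3, 4)}
    row = [0.0] * 8
    for column in winning_columns[result]:
        row[column] = 1.0
    row[6] = odds_home
    row[7] = odds_away
    return row


print(encode_result(1, 1.8, 4.5))
print(encode_result(3, 2.6, 2.9))
```

A draw sets three flags at once because it wins the draw bet and both double-chance bets, which is why the target is not a strict one-hot vector.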
Before we can train the model, we must first define it. We use a fully connected neural network with two hidden layers. We use BatchNormalization to normalise the inputs of each layer, which helps mitigate the vanishing gradient problem.
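As a rough illustration, a network of this kind could be defined as in the sketch below. This is not the exact model used here: the function name build_model, the layer sizes and the number of input features are all assumptions chosen for the example.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(n_features, n_outputs=6, hidden_units=(64, 16)):
    """Fully connected classifier with two hidden layers (sizes are assumptions)."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(n_features,)))
    model.add(layers.BatchNormalization())
    for units in hidden_units:
        model.add(layers.Dense(units, activation="relu"))
        model.add(layers.BatchNormalization())
    # Softmax output: one probability per bet, including "no bet".
    model.add(layers.Dense(n_outputs, activation="softmax"))
    return model


model = build_model(n_features=10)  # 10 input features is an arbitrary choice
model.summary()
```

The softmax output matters for the custom loss: because the six probabilities sum to 1, they can be read as the share of a £1 stake placed on each bet. The model would then be compiled by passing the custom loss function as the loss argument of model.compile.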
Then we train the model using a set of arbitrary parameters.
As we can see, we end up with a training loss of -0.05. Keep in mind that we are minimising our loss function, which is the negative of our profit and loss. This number tells us that, on average, each bet produces a profit of £0.05 for every £1 staked. Our validation set shows an average profit of £0.08 for every £1. Not bad, considering we only gave the neural network basic data. Over 200 games, our theoretical NN betting strategy would yield between £10 and £16.6, assuming we bet £1 on each game.
We have presented a way to incorporate the profit and loss (P&L) of bets into a neural network classifier using a custom loss function. This goes beyond accuracy, which can be misleading when designing betting systems. We believe it is useful for anyone who wants to use machine learning in sports.
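The conversion from a loss value to a total profit is simple arithmetic, sketched below. The helper total_profit is hypothetical and assumes a flat stake on every game.

```python
def total_profit(mean_loss, n_games, stake=1.0):
    """Convert a mean per-bet loss value into total profit over n_games bets."""
    # The loss is the negative of the mean profit, so the sign is flipped back.
    return -mean_loss * n_games * stake


print(round(total_profit(-0.05, 200), 2))  # training loss
print(round(total_profit(-0.08, 200), 2))  # validation loss, rounded figure
```

With the rounded validation figure of 0.08 this gives £16; the £16.6 quoted above corresponds to an unrounded mean profit of about 0.083 per £1.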