What better opportunity than the Melbourne Cup for the Flow data science team to investigate whether historical data and media sentiment can predict a future result?
By producing a quantitative model that allocates weightings and ratings across multiple factors, including a horse’s current form, make-up (age, colour, country of birth), prize money, odds and jockey performance, Flow’s data minds can see how similar a current runner is to all historical winners of the Melbourne Cup.
In addition, the agency has included media presence in the model by looking at how often each horse is being talked about and at the public’s sentiment towards it in online mentions. Flow is hoping that social listening can deliver a good indication of a horse’s performance, and that the public, as a collective, can pick the winner come race day.
Flow’s Data Lead, Michael Elith, said: “It is always great to see how data can play a role in our daily decision making, and how a number of factors can be brought together in an automated and simplified way to help answer complex questions.”
The results have confirmed that the favourite, Deauville Legend, is a favourite for a reason, running well ahead of the pack (though it is important to note that Melbourne Cup favourites have won only 21% of the time since the race’s inception in 1861). Coming in second is an outsider, Interpretation, gaining extra places largely through its social and media impact. Finally, rounding out the top 3 is Without a Fight.
“While we are much better at buying media than picking horses, it is all done in fun and hopefully gives everyone something to chat about come the big race,” said Jimmy Hyett, Founder & CEO, This is Flow.
Methodology.
Flow has highlighted 7 core fields and ranked each against the full history of previous winners, then added all the current horses, their form and their social impact to paint the full picture and deliver a rating for each horse.
The core fields are:
- Horse Characteristics – Age, Sex, Weight, Barrier, Saddlecloth Number, Foaled, Colour, and whether they are a newcomer
- Winning Ratio – Win percentage, Place percentage and Experience
- Prize Money – Average prize money
- Form – Previous Form and Average Time (the time from the longest-distance race each horse has run, recalculated to match the Melbourne Cup distance of 3200m)
- Jockey Performance – Jockey Barrier, Jockey Distance, Field Size and Jockey Weight
- Media Impact – Mentions, Interactions, Reach, Shares and Likes across a variety of sites, including social platforms, Google Trends, etc.
- Market Sentiment – Odds (as of Monday 12pm, so we could get this out; note that these may have changed by race day)
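As a rough illustration of how scores for the seven core fields could be combined into a single rating per horse, here is a minimal Python sketch. Flow has not published its weightings or scoring scale, so the weights, the field keys, the 0-to-1 score range and the horse names below are all hypothetical.

```python
# Hypothetical weights for the seven core fields (not Flow's actual values).
FIELD_WEIGHTS = {
    "horse_characteristics": 0.15,
    "winning_ratio": 0.20,
    "prize_money": 0.10,
    "form": 0.20,
    "jockey_performance": 0.15,
    "media_impact": 0.10,
    "market_sentiment": 0.10,
}


def overall_rating(field_scores, weights=FIELD_WEIGHTS):
    """Weighted average of per-field scores, each assumed to lie in [0, 1]."""
    missing = set(weights) - set(field_scores)
    if missing:
        raise ValueError(f"missing field scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[f] * field_scores[f] for f in weights)
    return weighted_sum / total_weight


def rank_horses(scores_by_horse):
    """Return (horse, rating) pairs sorted from highest rating to lowest."""
    ratings = {h: overall_rating(s) for h, s in scores_by_horse.items()}
    return sorted(ratings.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up scores for two placeholder horses, purely for illustration.
    field = {
        "Horse A": {f: 0.8 for f in FIELD_WEIGHTS},
        "Horse B": {f: 0.4 for f in FIELD_WEIGHTS},
    }
    for horse, rating in rank_horses(field):
        print(f"{horse}: {rating:.2f}")
```

Dividing by the total weight keeps the rating on the same 0-to-1 scale even if the weights are later changed and no longer sum to one.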
Time Predictor – we have set estimated times for the horses based on the average speed of the previous 25 Melbourne Cup winners, reweighted for each horse according to its final rating.
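The two time calculations mentioned in the methodology (rescaling a horse's time to 3200m, and estimating a race time from the winners' average speed) could be sketched as follows. The exact formulas are not stated, so this assumes simple constant-speed scaling and a small linear adjustment by rating; the function names and the `max_adjust` parameter are hypothetical.

```python
CUP_DISTANCE_M = 3200  # Melbourne Cup distance in metres


def scale_time_to_cup(time_s, distance_m):
    """Rescale a race time to 3200m, assuming the same average speed is held."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    return time_s * CUP_DISTANCE_M / distance_m


def average_speed(winning_times_s):
    """Mean speed in m/s across a list of past winning times over 3200m."""
    if not winning_times_s:
        raise ValueError("at least one winning time is required")
    speeds = [CUP_DISTANCE_M / t for t in winning_times_s]
    return sum(speeds) / len(speeds)


def estimated_time(base_speed, rating, mean_rating, max_adjust=0.02):
    """Estimate a finishing time in seconds.

    Assumption: ratings lie in [0, 1], and a horse rated above the field
    mean runs proportionally faster than base_speed, by at most max_adjust.
    """
    adjust = max_adjust * (rating - mean_rating)
    return CUP_DISTANCE_M / (base_speed * (1 + adjust))
```

Under these assumptions a horse rated exactly at the field mean is given the historical winners' average pace, and higher-rated horses receive slightly faster estimates.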