In my previous post I wrote about people courtsiding at tennis. @borisranting from www.sportismadeforbetting.com dropped by and left the first comment in this blog’s short life so far:
Courtsiding is only an issue though if you are trying to price snip, i.e. think you are beating the market because you react faster than everyone else. If you are pitting your wits against a settled market, and saying that your assessment is better than the market’s, it doesn’t affect you. But you need to factor it into your betting behaviour.
Two essential rules of betting punters must live by:
There will always be people with more information than you in a market.
There will always be people with faster access to live information in a market.
And if your model/assessment is exactly the same as the market, then you are either punting, or playing a zero-sum game with no margin, like Exchange Games.
Can’t argue with any of that.
I promised to write more about my experience of trading tennis, so here it is. First off, I am by no means an expert on tennis trading/betting, so any comments, as always, are welcome.
Ok a quick recap of what I did:
- Wrote a program to scrape 2 years’ worth of data. It was all match summary statistics, not point-by-point data, as I think only 1 site had that and it was a bit patchy to say the least.
- Sorted and organised this data into a lovely database.
- Then, for a particular matchup on a given surface, I could calculate each player’s:
- 1st Serve Percentage
- Ace Rate (per 1st Serve)
- 1st Serve Points Won (exc Ace)
- DF Rate (per 2nd Serve)
- 2nd Serve Points Won (exc DFs)
I would then simulate a tennis match using these stats as many times as I wanted, usually around 50,000 times. I could do this pre-game or in play using the current match score. The model would spit out what it thought each player’s chances of winning were from whatever scoreline, compare those to the match odds, and tell me if there were any differences, i.e. did I have an edge. Unfortunately, these odds were very rarely different from the market. I seem to remember at most I would have about a 2% edge, which rarely happened; the majority of the time it was bang on.
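For anyone who wants to see roughly what that kind of simulation looks like, here is a minimal Python sketch. It is not my original program. The function names are made up for illustration, and it leans on a few assumptions: the ace rate is per first serve that lands in, the double fault rate is per second serve attempted, second-serve aces are ignored, every set (including the last) ends in a normal tiebreak, and the in-play part only starts from a sets score rather than a full game and point score.

```python
import random

def serve_point_prob(first_in, ace_rate, first_won, df_rate, second_won):
    """Chance the server wins a point, built from the five stats listed above.
    Assumption: ace_rate is per first serve in, df_rate per second serve."""
    first = first_in * (ace_rate + (1 - ace_rate) * first_won)
    second = (1 - first_in) * (1 - df_rate) * second_won
    return first + second

def play_game(p_server, rng):
    """Service game: first to 4 points, win by 2. True if the server holds."""
    s = r = 0
    while not (max(s, r) >= 4 and abs(s - r) >= 2):
        if rng.random() < p_server:
            s += 1
        else:
            r += 1
    return s > r

def play_tiebreak(p_a, p_b, a_serves_first, rng):
    """First to 7, win by 2. Serve changes after point 1, then every 2 points."""
    a = b = n = 0
    while not (max(a, b) >= 7 and abs(a - b) >= 2):
        first_server_on_serve = ((n + 1) // 2) % 2 == 0
        a_serving = first_server_on_serve == a_serves_first
        if a_serving:
            a_wins_point = rng.random() < p_a
        else:
            a_wins_point = rng.random() >= p_b
        if a_wins_point:
            a += 1
        else:
            b += 1
        n += 1
    return a > b

def play_set(p_a, p_b, a_serving, rng):
    """Returns (did A win the set, does A serve first in the next set)."""
    a = b = 0
    while True:
        if a == 6 and b == 6:
            a_won_game = play_tiebreak(p_a, p_b, a_serving, rng)
        elif a_serving:
            a_won_game = play_game(p_a, rng)
        else:
            a_won_game = not play_game(p_b, rng)
        if a_won_game:
            a += 1
        else:
            b += 1
        a_serving = not a_serving  # serve alternates every game
        if max(a, b) == 7 or (max(a, b) == 6 and abs(a - b) >= 2):
            return a > b, a_serving

def win_probability(p_a, p_b, best_of=3, n_sims=50_000, a_serves_first=True,
                    sets_a=0, sets_b=0, seed=None):
    """Share of simulated matches won by A, starting from a given sets score."""
    rng = random.Random(seed)
    need = best_of // 2 + 1
    wins = 0
    for _ in range(n_sims):
        sa, sb, a_serving = sets_a, sets_b, a_serves_first
        while sa < need and sb < need:
            a_won, a_serving = play_set(p_a, p_b, a_serving, rng)
            if a_won:
                sa += 1
            else:
                sb += 1
        wins += sa == need
    return wins / n_sims
```

You would feed each player's five stats through `serve_point_prob` to get `p_a` and `p_b`, then call `win_probability(p_a, p_b)`. More simulations give a less noisy answer at the cost of run time.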
So, I created a model that gave the same match odds that a highly liquid market was at… This at least proved that my model wasn’t a load of rubbish. During the testing phase of this model I was ever-so-slightly up. I didn’t test it long enough to determine if it did have around a 2% edge when the odds were different. I felt like I overdosed on tennis, and I didn’t enjoy watching it to the level that I would have been required to watch it. I certainly prefer watching other sports, even without a bet on, such as basketball, NFL and rugby.
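One common way to put a number on an edge is expected profit per unit staked, given a model probability and the decimal odds on offer. The sketch below assumes that definition (it may not be exactly how I measured it at the time), ignores exchange commission, and uses made-up function names and figures.

```python
def implied_probability(decimal_odds):
    """Probability implied by decimal odds, before any margin is stripped out."""
    return 1.0 / decimal_odds

def edge(model_prob, decimal_odds):
    """Expected profit per unit staked if the model probability is right."""
    return model_prob * decimal_odds - 1.0

# Hypothetical example: the model says 51%, the market offers evens (2.0).
model_prob, odds = 0.51, 2.0
print(f"market implies {implied_probability(odds):.1%}, edge {edge(model_prob, odds):+.1%}")
```

A positive number means the model thinks the price is too big; when the model and the market agree, the edge is zero before costs.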
Not that the model I created was perfect. I was running under the assumption that points in tennis are Independent and Identically Distributed (IID).
Huh, what the hell does that mean?
It means a player has the same constant chance of winning a point on serve regardless of the score or whether he lost the previous point. My model ran with Player A’s chance of winning a point on serve being the same whether it was the first point of a service game in the 1st set, or if he was defending a break point in the final set. IIRC the research available suggests that points in tennis are not IID, but the effects are small. I’m sure that the top players play better on the “bigger” points and certain players are slow starters or streaky players. Plus there are players such as Petra Kvitova who seem to love playing 3 sets, hence her nickname of P3TRA.
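One handy side effect of the IID assumption is that the chance of holding serve has a simple closed form, since every point is won with the same probability p. The Python sketch below is the standard textbook result rather than anything lifted from my model, and the function name is just for illustration.

```python
def hold_probability(p):
    """Chance the server wins a service game if every point is won with
    probability p, independently of the score (the IID assumption)."""
    q = 1.0 - p
    # Win to love, to 15, or to 30: 1, 4 and 10 orderings respectively.
    win_before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach 3-3 (deuce): 20 orderings of three points each.
    reach_deuce = 20 * p**3 * q**3
    # From deuce: win two in a row before the opponent does.
    win_from_deuce = p**2 / (p**2 + q**2)
    return win_before_deuce + reach_deuce * win_from_deuce

print(round(hold_probability(0.60), 3))  # 0.736
```

A 60% chance per point turns into roughly a 74% chance of holding, which shows how small differences at point level get magnified over a game, a set and a match.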
If I was to ever go back and improve my model (not likely) I would need to tackle this problem. It would need a lot of point by point data though.
My model obviously wouldn’t know if one player was hitting the ball cleanly, hitting winner after winner, or if they were playing like a total knobhead, hitting constant unforced errors. That’s the problem with a computer model: a lack of subjectivity. Obviously I’m not advocating blind betting based on what your numbers tell you; you need an element of both. As mentioned previously, it’s very difficult to make money betting in play on tennis. (It’s pretty difficult on any sport.) But there are people who do and who aren’t courtsiders. These chaps are probably armed with their own numbers, watch A LOT of tennis, can ascertain how well a player is hitting the ball, know the player’s characteristics and will know how a player is likely to perform when they’re breathing out of their arse.
My model didn’t just tell me who would win a match, it also calculated the odds related to some of the popular alternative tennis betting markets:
- Actual match scoreline
- Total points/games/sets
- Number of aces/double faults.
- Chance of a tiebreaker
1st Set Score Betting
First off, the overround on set scoreline betting is not great (it’s a piss take). So it’s already unlikely that you’re going to find any edge. You don’t need to mess about designing a model to know that if you’re betting on 7-6 or 7-5 scorelines you need your head looking at. The slim likelihood of those scorelines is rarely even remotely matched by realistic odds.
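If you haven't met the term, the overround is how far the implied probabilities of every outcome in a market add up beyond 100%. A quick Python sketch, with prices that are invented purely for illustration and not taken from any bookmaker:

```python
def overround(decimal_odds):
    """Bookmaker margin: sum of implied probabilities minus 1."""
    return sum(1.0 / o for o in decimal_odds) - 1.0

# Hypothetical two-way market priced at 1.80 each way.
print(f"{overround([1.80, 1.80]):.1%}")  # 11.1%
```

The more selections a market has, the more room there is to shave a little off every price, which is one reason scoreline markets tend to carry a bigger margin than match odds.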
Another thing that should already be completely obvious is that in this market, it matters who’s serving first. Yet most people don’t even grasp that! Djokovic will be playing Federer and you’ll see on Twitter or wherever that some mug has bet on Djokovic to win the 1st set 6-4. Djokovic wins the toss and elects to serve first, and that bet is already pretty much dead in the water. If Djoko serves first, how is he going to win 6-4? Most likely he would need to break Federer to end the set. If Djokovic serves first, the most likely scoreline is 6-3 Djokovic, with one break of serve (and not to end the set).
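To show how much the serving order shifts the scoreline chances, here is a rough Python sketch that works out exact set score probabilities from each player's chance of holding serve. The function name, the flat 50% tiebreak and the 85% hold figures are all assumptions for illustration, not numbers from my model.

```python
def set_score_probs(hold_a, hold_b, a_serves_first=True, tiebreak_a=0.5):
    """Probability of each final set score (games_a, games_b), assuming a
    constant chance of holding serve for each player."""
    live = {(0, 0): 1.0}   # scores still in progress -> probability
    final = {}             # finished set scores -> probability
    for n in range(12):    # at most 12 games before a tiebreak
        a_serving = (n % 2 == 0) == a_serves_first
        p_a_wins = hold_a if a_serving else 1.0 - hold_b
        nxt = {}
        for (a, b), p in live.items():
            for score, q in (((a + 1, b), p_a_wins), ((a, b + 1), 1.0 - p_a_wins)):
                hi, lo = max(score), min(score)
                done = hi == 7 or (hi == 6 and hi - lo >= 2)
                target = final if done else nxt
                target[score] = target.get(score, 0.0) + p * q
        live = nxt
    p_six_all = live.get((6, 6), 0.0)  # only 6-6 can still be live here
    final[(7, 6)] = p_six_all * tiebreak_a
    final[(6, 7)] = p_six_all * (1.0 - tiebreak_a)
    return final

# Hypothetical: both players hold 85% of the time.
first = set_score_probs(0.85, 0.85, a_serves_first=True)
second = set_score_probs(0.85, 0.85, a_serves_first=False)
print("A serves first:  6-3", round(first[(6, 3)], 3), " 6-4", round(first[(6, 4)], 3))
print("A serves second: 6-3", round(second[(6, 3)], 3), " 6-4", round(second[(6, 4)], 3))
```

With evenly matched strong servers, 6-3 comes out clearly more likely than 6-4 for whoever serves first, and the other way round for whoever serves second.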
Rafael Nadal famously always elects to receive. So as long as he’s not facing someone else who likes to receive, you know who’s going to be serving first. When Djokovic played Nadal I used to like betting on 6-3 to Djokovic as I found an edge here. The market hadn’t compensated enough for the fact that with someone like Nadal, you can always tell who’s serving first. Otherwise it’s just 50-50 with that coin toss.
Sometimes I found value in 6-0 or 6-1 scorelines when there was a big difference in the standard of the 2 players. But not when Serena Williams was playing; her 6-0 scorelines were always too short.
Total Points was always a decent market to bet on. It’s pretty hard to know whether a line of 135.5 total points was too high or low without a computer model.
Most Aces in a Match
My favourite alternative tennis market to bet on. Unfortunately this market’s only available for bigger events, and only with some bookmakers. The edge here lies not just with knowing who hits aces, but with knowing who’s likely to have more service points and knowing how often a player gets aced.
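As a back-of-an-envelope illustration of how those three ingredients fit together, here is a Python sketch. The multiplicative adjustment for the returner is my assumption for this example, one simple way of doing it rather than a description of what my model did, and every name and number in it is hypothetical.

```python
def expected_aces(service_points, ace_rate, opp_aced_rate, avg_aced_rate):
    """Rough expected ace count for one player in a match.

    service_points: how many points the player is expected to serve
    ace_rate:       the player's aces per service point
    opp_aced_rate:  how often the opponent gets aced per return point
    avg_aced_rate:  the same figure for an average returner
    Assumption: the returner scales the server's ace rate multiplicatively.
    """
    return service_points * ace_rate * (opp_aced_rate / avg_aced_rate)

# Hypothetical: same server, 80 service points, against two different returners.
print(expected_aces(80, 0.10, 0.10, 0.08))  # returner who gets aced a lot
print(expected_aces(80, 0.10, 0.05, 0.08))  # returner who rarely gets aced
```

The point is just that the same server can have a very different ace expectation depending on the opponent and on how long his service games are likely to run.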
Have a guess which ATP players hit lots of aces:
Yep, easy question, no surprises. Now guess which players get aced a lot?
Notice how Del Potro, the big, heavy-footed tub of lard, gets aced a lot. When accounting for strength of opponent serve, Djokovic is the least likely to be aced. A player as agile as Djoko will at least get a racket head on a big serve when others could only watch it go past. As a result there was often value in backing Djokovic in the most aces betting, and opposing Del Potro against decent servers.
Most Double Faults in a Match
This market is quite similar to the aces. Again the edge doesn’t just lie with knowing who has lots of double faults, but with knowing who’s likely to have more service points in a match. I’ll show you some old data for the leaders for both ATP & WTA:
Notice the grand slam winners high up on the WTA list…