
Tennis Trading

In my previous post I wrote about people courtsiding at tennis. @borisranting from www.sportismadeforbetting.com dropped by and left the first comment in this blog’s short life so far:

Courtsiding is only an issue though if you are trying to price snip, i.e. think you are beating the market because you react faster than everyone else. If you are pitting your wits against a settled market, and saying that your assessment is better than the market’s, it doesn’t affect you. But you need to factor it into your betting behaviour.

Two essential rules of betting punters must live by:
There will always be people with more information than you in a market.
There will always be people with faster access to live information in a market.

And if your model/assessment is exactly the same as the market, then you are either punting, or playing a zero-sum game with no margin, like Exchange Games.

Can’t argue with any of that.

I promised to write more about my experience of trading tennis, so here it is. First off, I am by no means an expert on tennis trading/betting, so any comments are, as always, welcome.

Ok a quick recap of what I did:

  • Wrote a program to scrape 2 years’ worth of data. It was all match summary statistics, not point-by-point data, as I think only 1 site had that and it was a bit patchy to say the least.
  • Sorted and organised this data into a lovely database
  • Then for a particular matchup, on a given surface, I could calculate each player’s:
    • 1st Serve Percentage
    • Ace Rate (per 1st Serve)
    • 1st Serve Points Won (exc Ace)
    • DF Rate (per 2nd Serve)
    • 2nd Serve Points Won (exc DFs)
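As a rough sketch of how those five numbers can be rolled into a single chance of winning a point on serve: the names `ServeStats` and `point_win_on_serve` are made up for illustration, and I'm assuming the ace rate is per first serve that lands in and the "won" figures are shares of the remaining (non-ace, non-double-fault) points. It doesn't attempt the matchup adjustment against a particular returner.

```python
from dataclasses import dataclass


@dataclass
class ServeStats:
    """Hypothetical container for one player's serve numbers (all fractions)."""
    first_serve_in: float    # 1st Serve Percentage
    ace_rate: float          # aces per 1st serve in (assumption)
    first_won_ex_ace: float  # share of non-ace 1st-serve points won
    df_rate: float           # double faults per 2nd serve
    second_won_ex_df: float  # share of non-DF 2nd-serve points won


def point_win_on_serve(s: ServeStats) -> float:
    """Chance the server wins a single point, combining both serves."""
    # First serve lands in: either an ace, or a rally the server wins.
    first = s.first_serve_in * (s.ace_rate + (1 - s.ace_rate) * s.first_won_ex_ace)
    # First serve misses: second serve must avoid a double fault, then win the rally.
    second = (1 - s.first_serve_in) * (1 - s.df_rate) * s.second_won_ex_df
    return first + second


# Made-up numbers, purely for illustration.
example = ServeStats(0.60, 0.10, 0.70, 0.10, 0.50)
p_example = point_win_on_serve(example)
```

With these made-up inputs the first-serve branch contributes 0.438 and the second-serve branch 0.18.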

I would then simulate a tennis match using these stats as many times as I wanted, usually around 50,000 times. I could do this pre-game or in play using the current match score. The model would spit out what it thought the chances of each player winning were from whatever scoreline, compare those to the match odds, and tell me if there were any differences, i.e. did I have an edge. Unfortunately, these odds were very rarely different from the market. I seem to remember at most I would have about a 2% edge, which rarely happened; the majority of the time it was bang on.
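For anyone curious what that sort of simulation looks like, here is a minimal sketch, not my actual program. It assumes each player has one fixed chance of winning a point on serve (`p_a`, `p_b`), a tiebreak at 6-6 in every set, and best of 3 by default. All function names and the starting-score parameters are invented for this example, and it only restarts from a set/game score, not from mid-game.

```python
import random


def play_game(p_server, rng):
    """Simulate one service game; return True if the server holds."""
    s = r = 0
    while True:
        if rng.random() < p_server:
            s += 1
        else:
            r += 1
        if s >= 4 and s - r >= 2:
            return True
        if r >= 4 and r - s >= 2:
            return False


def play_tiebreak(p_a, p_b, a_serves_first, rng):
    """First to 7, win by 2; return True if player A wins."""
    a = b = 0
    while True:
        n = a + b
        # First server serves point 0, then serve alternates in pairs.
        first_server_turn = ((n + 1) // 2) % 2 == 0
        a_serving = first_server_turn == a_serves_first
        if rng.random() < (p_a if a_serving else 1 - p_b):
            a += 1
        else:
            b += 1
        if max(a, b) >= 7 and abs(a - b) >= 2:
            return a > b


def play_set(p_a, p_b, a_serving, rng, games_a=0, games_b=0):
    """Play out a set; return (A won the set, A serves first in the next set)."""
    a, b = games_a, games_b
    while True:
        if a == 6 and b == 6:
            return play_tiebreak(p_a, p_b, a_serving, rng), not a_serving
        a_won = play_game(p_a, rng) if a_serving else not play_game(p_b, rng)
        if a_won:
            a += 1
        else:
            b += 1
        a_serving = not a_serving
        if max(a, b) >= 6 and abs(a - b) >= 2:
            return a > b, a_serving


def match_win_probability(p_a, p_b, a_serving=True, sets_a=0, sets_b=0,
                          games_a=0, games_b=0, best_of=3,
                          n_sims=50_000, seed=0):
    """Monte Carlo estimate of A's chance of winning from the given score."""
    rng = random.Random(seed)
    target = best_of // 2 + 1
    wins = 0
    for _ in range(n_sims):
        sa, sb, serving = sets_a, sets_b, a_serving
        ga, gb = games_a, games_b
        while sa < target and sb < target:
            a_won, serving = play_set(p_a, p_b, serving, rng, ga, gb)
            ga = gb = 0  # only the set in progress starts mid-way
            if a_won:
                sa += 1
            else:
                sb += 1
        wins += sa == target
    return wins / n_sims
```

Calling `match_win_probability(0.65, 0.62)` gives a pre-game estimate, and passing `sets_a=1, games_a=3, games_b=2` re-runs it from an in-play score. Lower `n_sims` if it is too slow.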

So, I created a model that gave the same match odds that a highly liquid market was at… This at least proved that my model wasn’t a load of crap. During the testing phase of this model I was ever-so-slightly up. I didn’t test it long enough to determine if it did have around a 2% edge when the odds were different. I felt like I overdosed on tennis, and I didn’t enjoy watching it to the level that I would have been required to watch it. I certainly prefer watching other sports such as basketball, NFL and rugby, even without a bet on.

Not that the model I created was perfect. I was running under the assumption that points in tennis are Independent and Identically Distributed (IID).

Huh, what the fuck does that mean?

It means each player has the same constant chance of winning a point on serve regardless of score or whether he lost the previous point. My model ran with Player A’s chance of winning a point on serve being the same whether it was the first point on a service game in the 1st set, or if he was defending a break point in the final set. IIRC research available suggests that points in tennis are not IID, but the effects are small. I’m sure that the top players play better on the “bigger” points and certain players are slow starters or streaky players. Plus there’s players such as Petra Kvitova who seem to love playing 3 sets, hence her nickname of P3TRA.
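One thing the IID assumption buys you is that the chance of holding serve falls out of a closed-form formula in the single point-win probability. This is the standard textbook result rather than anything specific to my model, and the function name `hold_probability` is just for illustration.

```python
def hold_probability(p: float) -> float:
    """Chance the server wins a game if every point is won with probability p."""
    q = 1 - p
    # Win to love, to 15, or to 30 (4 points won, with 0, 1 or 2 lost first).
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # Reach 3-3 (deuce): 20 orderings of three points each.
    reach_deuce = 20 * p**3 * q**3
    # From deuce: win two in a row, or split the next two and start again.
    win_from_deuce = p**2 / (1 - 2 * p * q)
    return before_deuce + reach_deuce * win_from_deuce
```

A server winning 60% of points on serve holds roughly 74% of the time under this assumption, which shows how a small edge per point compounds over a game.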

If I was to ever go back and improve my model (not likely) I would need to tackle this problem. It would need a lot of point by point data though.

My model obviously wouldn’t know if one player was hitting the ball cleanly, hitting winner after winner or if they were playing like a total knobhead, hitting constant unforced errors. That’s the problem with a computer model: a lack of subjectivity. Obviously I’m not advocating blind betting based on what your numbers tell you; you need an element of both. As mentioned previously, it’s very difficult to make money betting in play on tennis. (It’s pretty difficult on any sport). But there are people that do who aren’t courtsiders. These chaps are probably armed with their own numbers, watch A LOT of tennis, can ascertain how well a player is hitting the ball, know the player’s characteristics and will know how a player is likely to perform when they’re breathing out of their arse.

My model didn’t just tell me who would win a match, it also calculated the odds related to some of the popular alternative tennis betting markets:

  • Actual match scoreline
  • Total points/games/sets
  • Number of aces/double faults.
  • Chance of a tiebreaker

1st Set Score Betting

First off the overround on set scoreline betting is not great (it’s a piss take). So it’s already unlikely that you’re going to find any edge. You don’t need to mess about designing a model to know that if you’re betting on 7-6 or 7-5 scorelines you need your head looking at. The slim likelihood of those scorelines is rarely even remotely matched by realistic odds.
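For anyone unfamiliar with the term, the overround is how far the implied probabilities of every outcome in a market add up past 100%. A quick sketch follows. The function name and the odds are made up, not taken from any real bookmaker.

```python
def overround(decimal_odds):
    """Sum of implied probabilities minus 1; 0.0 means a perfectly fair book."""
    return sum(1.0 / o for o in decimal_odds) - 1.0


# Hypothetical two-way market priced at 1.80 each side.
two_way = overround([1.80, 1.80])
```

A two-way market at 1.80 each side carries an overround of about 11%. Feed in the full list of set-score prices from a bookmaker and you can see for yourself how much worse those markets are.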

Another thing that should already be completely obvious is that in this market, it matters who’s serving first. Yet most people don’t even grasp that! Djokovic will be playing Federer and you’ll see on Twitter or wherever that some mug has bet on Djokovic to win the 1st set 6-4. Djokovic wins the toss and elects to serve first, and that bet is already pretty much dead in the water. If Djoko serves first, how is he going to win 6-4? Most likely he would need to break Federer to end the set. If Djokovic serves first the most likely score line is 6-3 Djokovic, with one break of serve (and not to end the set).
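To put some numbers on the serving-order point, here is a small sketch that works out the exact distribution of set scores from two hold probabilities. It assumes each player holds serve with a fixed probability in every game (a simplification) and it stops at 6-6 rather than resolving the tiebreak. `set_score_distribution` and the 85% hold figures are invented for the example.

```python
from collections import defaultdict


def set_score_distribution(hold_a, hold_b, a_serves_first=True):
    """Return {(games_a, games_b): probability}; (6, 6) means a tiebreak."""

    def is_over(a, b):
        return (a == 6 and b == 6) or (max(a, b) >= 6 and abs(a - b) >= 2)

    dist = defaultdict(float)
    frontier = {(0, 0): 1.0}
    while frontier:
        nxt = defaultdict(float)
        for (a, b), prob in frontier.items():
            if is_over(a, b):
                dist[(a, b)] += prob
                continue
            # Serve alternates every game, starting with the first server.
            a_serving = ((a + b) % 2 == 0) == a_serves_first
            p_a_wins_game = hold_a if a_serving else 1 - hold_b
            nxt[(a + 1, b)] += prob * p_a_wins_game
            nxt[(a, b + 1)] += prob * (1 - p_a_wins_game)
        frontier = nxt
    return dict(dist)


# Hypothetical evenly matched big servers, each holding 85% of the time.
serving_first = set_score_distribution(0.85, 0.85, a_serves_first=True)
receiving_first = set_score_distribution(0.85, 0.85, a_serves_first=False)
```

Comparing `serving_first[(6, 4)]` with `serving_first[(6, 3)]` and `receiving_first[(6, 4)]` shows the effect: 6-4 to the player serving first has to end on a break, so it comes out much less likely than 6-3.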

Rafael Nadal famously always elects to receive. So as long as he’s not facing someone else who likes to receive, you know who’s going to be serving first. When Djokovic played Nadal I used to like betting on 6-3 to Djokovic as I found an edge here. The market hadn’t compensated enough for the fact that with someone like Nadal, you can always tell who’s serving first. Otherwise it’s just 50-50 with that coin toss.

Sometimes I found value in 6-0 or 6-1 scorelines when there was a big difference in the standard of the 2 players. But not when Serena Williams was playing, her 6-0 scorelines were always too short.

Total Points was always a decent market to bet on. It’s pretty hard to know whether a line of 135.5 total points was too high or low without a computer model.

Most Aces in a Match

My favourite alternative tennis market to bet on. Unfortunately this market’s only available for bigger events, and only with some bookmakers. The edge here lies not just with knowing who hits aces, but with knowing who’s likely to have more service points and knowing how often a player gets aced.

Have a guess which ATP players hit lots of aces:

ATP Ace Rate
2012 & 2013 seasons, all surfaces, min 20 matches on all tables

Yep, easy question, no surprises. Now guess which players get aced a lot?

ATP AceD Rate

Notice how Del Potro, the big heavy footed tub of lard gets aced a lot. When accounting for strength of opponent serve, Djokovic is the least likely to be aced. A player as agile as Djoko will at least get a racket head on a big serve when others could only watch it go past. As a result there was often value in backing Djokovic in the most aces betting, and opposing Del Potro against decent servers.
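One simple way to combine a server’s ace rate with how often the returner gets aced is to scale by the tour average, then multiply by the number of service points you expect the server to play. This is a common rule of thumb and not necessarily what my model did. The function names and every number below are made up, and rates here are per service point to keep it short.

```python
def matchup_ace_rate(server_ace_rate, returner_aced_rate, tour_avg_ace_rate):
    """Server's ace rate scaled by how ace-prone the returner is vs the tour."""
    return server_ace_rate * returner_aced_rate / tour_avg_ace_rate


def expected_aces(service_points, server_ace_rate, returner_aced_rate,
                  tour_avg_ace_rate):
    """Expected ace count over a given number of service points."""
    rate = matchup_ace_rate(server_ace_rate, returner_aced_rate,
                            tour_avg_ace_rate)
    return service_points * rate


# Hypothetical: a 10% ace-rate server against a returner aced on 6% of points,
# with a tour average of 8%, over 80 service points.
aces = expected_aces(80, 0.10, 0.06, 0.08)
```

In this made-up case the good returner drags the server from 8 expected aces down to about 6. Do the same for the other player and compare the two totals to get a view on the most aces market.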

Most Double Faults in a Match

This market is quite similar to the aces. Again the edge doesn’t just lie with knowing who has lots of double faults, but with knowing who’s likely to have more service points in a match. I’ll show you some old data for the leaders for both ATP & WTA:

ATP DF Rate

WTA DF Rate

Notice the grand slam winners high up on the WTA list…

Caught Courtsiding

Yesterday the BBC had an interesting radio programme about courtsiding in tennis. For the benefit of those who don’t know what courtsiding is: it’s people who sit court side (stop me if I’m going too fast) and either bet themselves, or relay what’s happening to someone else who then places bets. As the pictures you view on TV or online are around 5 seconds behind the actual action, the odds on Betfair will move before you see a point being won on TV or online.

They are basically like High Frequency Trading firms in the financial world. They have a massive edge in terms of time, they can buy the higher price before everyone else knows that a point has even been won, and can then quickly sell the lower price back into the market. The market in this case will usually refer to a betting exchange such as Betfair. And this happens on every point in a tennis match.

If you want to check it out, the radio programme is here:

http://www.bbc.co.uk/programmes/b05r3w43

And an article based on it is here:

http://www.bbc.co.uk/news/magazine-32402945

Some of the key excerpts are:

“The purpose of us being there is that we can send back information a lot faster than TV or betting companies can get the data,” says Dobson.

That information would be fed back in milliseconds to the syndicate in London. The syndicate designed betting software for tennis which works out the probability of each player winning the match at any point and how the odds are moving in real time on the betting exchange.

Steve showed me the kind of device Dan used, a modified game console controller.

“We had an automated system whereby the point data would come in and then we would cancel any bets that we had in the market that we deemed were at the wrong price. And then we would place bets straight back into the market that we deemed were now the correct price.” he says.

The difference in getting that data first and changing the syndicate’s betting position can be worth thousands of pounds per point, sometimes even tens of thousands, and there are a lot of points in a tennis match.

This has led many other syndicates to employ courtsiders. Steve High says he has been told reliably that 75 people were at last year’s Wimbledon final, “sending information back or betting on their own”.

To put it mildly: your ordinary Joe sat at home who fancies trying to trade tennis has an incredibly hard job to have any sort of edge. The big fish in this pond have the fastest information and have programs to tell them what the correct price should be.

I used to trade tennis myself. I used to do alright, but this was back in the days before courtsiding was as prevalent as it is now. After it was clear that any sort of edge, if I actually even had an edge on tennis, had gone, I decided to write my own program to predict the odds either pre-game or in-play.

Tennis is a really easy sport to model. If I was to succeed in this field I needed something to give me an edge. So I downloaded as much data as I could get, then calculated player A’s chance of winning a point on serve against player B, and vice versa. I broke this down further into first serve %, winning a point on first serve, winning a point on second serve etc. I also took into account aces and double faults. My program would use these factors and simulate the match as many times as I wanted, and then calculate each player’s chance of winning. It would run after each point so the calculated odds would be continuously updated depending on the game state.

After a few weeks testing, the majority of the time my program gave out the same prices that the betting market was at. If there were ever any differences between my price and the market price, it wasn’t out by much at all.
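Comparing a model price with the market price boils down to a couple of one-liners. This sketch assumes decimal odds and ignores exchange commission, and the function names are only illustrative.

```python
def implied_probability(decimal_odds: float) -> float:
    """Probability implied by a decimal price, e.g. 2.0 -> 0.5."""
    return 1.0 / decimal_odds


def edge(model_probability: float, decimal_odds: float) -> float:
    """Expected profit per unit staked if the model probability is right."""
    return model_probability * decimal_odds - 1.0


# Hypothetical: the model says 51%, the market offers 2.0 (implied 50%).
example_edge = edge(0.51, 2.0)
```

A model probability of 51% against a price of 2.0 works out at an edge of 2% per unit staked, before any commission.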

Although my tennis program didn’t give an edge in the Match odds/Moneyline market, it did provide an edge in some of the alternative markets… I’ll write a longer post in the future about my experiences of trading tennis, as I’ve got a lot more to say on this topic.