TOSS: Tennis Open Software Standards

An ecosystem to support transparency and promote innovation

Charles Allen
Apr 29, 2018

Tennis is an old sport. It predates many other sports, and yet it lags far behind in terms of data gathering, data management and data analytics. At the 2015 MIT Sloan Sports Analytics Conference it was noted that tennis is a “fragmented, decentralized landscape” and that as an individual sport there is less incentive for investment. This is a structural problem, because tennis is an expensive sport for families.

The largest organizational structures which exist for tennis are professional organizations such as ATP, WTA and ITF. Such organizations cater to only a very small percentage of tennis players, and the investments in data analytics that have been made have been driven primarily by the presentation of information to spectators. IBM, SAP and others use the presentation of such data to market their services. It is rare for such information to contribute in any way towards player development.

The vast majority of tennis players who are part of any organization have registered with National Federations which are tasked with assembling Davis Cup and Fed Cup teams and organizing tournaments for junior players. Only in the largest, wealthiest countries are these organizations well funded and sufficiently staffed. Typically these organizations have no in-house expertise in data management; many are still using Excel spreadsheets (or paper) for running tournaments, and often do not have historical match data in any form other than PDFs.

Some federations have paid local software companies to create a player management component, tournament registrations, and even in some cases rudimentary tournament management… but these local internet companies don’t have a deep understanding of tennis, and the “solutions” are incomplete. The wealthier countries have outsourced such functions to Tournament Management solutions which are not exclusively focused on Tennis, and in the process have given their data away… it exists only on servers in the Cloud, in proprietary formats.

In short, tennis data does not exist in any standard, accessible format that Federations could use to create additional value. That value lies in player development, player retention and “growing the game”: not only in the sport's appeal to players and spectators, but also in the additional revenues which increased website traffic and marketing partnerships can generate and feed back into national programs. In fact there appears to be very little understanding of how such data could be used, and how valuable it really could be.

One obvious example of how value can be extracted from such data is Universal Tennis. UTR (Universal Tennis Rating) has had a great deal of success recently, primarily in the US, though it got some airtime during the 2018 Australian Open and has recently signed a strategic partnership with World Team Tennis.

UTR benefits players looking for scholarships to US universities, and was used by Universal Tennis during 2017 to organize more than 1000 tournaments featuring “level-based play”. There is a strong argument that such tournaments are beneficial for player development, and that level-based play and the structuring of tournaments based on analysis of player match data can also contribute to player retention. At the same time it should be a significant concern for federations that these events were not sanctioned by any governing body.

Another downside in this example is that Universal Tennis has described a rather limited match format; it’s actually not even a specification and contemplates nothing more than what Universal Tennis requires to build its platform at this moment in time. Apart from the extensive website scraping they do to gather their data, the burden is on Federations to convert whatever format they have internally into the Universal Tennis Excel/CSV format. Because there is no specification, and no transparency as to how the data is used, many Federations are giving away more data than is necessary, which raises privacy concerns.

Federations are giving their data away (or having it taken) with no benefit other than the promise that UTR will help a few of their top players with college scholarships. And Universal Tennis is benefitting from the deep analysis of this data and the traffic it drives to its own website. UTR utilizes a proprietary algorithm and lacks transparency. It runs counter to the Open Data movement that is growing rapidly in many areas of public life. Federations should be wary of partnerships without transparency, as it could result (and has resulted) in backlash from members and the public.

There is, in fact, nothing that unique about UTR ratings. Elo ratings for Tennis predated them and extensive research has been done on various tennis rating systems within universities and by private individuals. All that such tools lack is accessible data and a curated archive; with those, this kind of analysis could be freely available to tennis organizations such as National Federations. Without a set of standards in place, it becomes more difficult and costly for federations to use alternatives to UTR to create their own player ratings in the future, or to do their own analysis of the strength of players coming from various regions, clubs, academies and training systems.
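
As a rough illustration of how little is proprietary about a rating system of this kind, the standard Elo update fits in a few lines. This is a minimal sketch and not UTR's algorithm, which is unpublished; the K-factor of 32, the 400-point scale and the function names are conventional choices assumed here for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one match."""
    expected_win = expected_score(winner, loser)
    # The winner gains what the loser gives up, scaled by how surprising the result was.
    delta = k * (1.0 - expected_win)
    return winner + delta, loser - delta


# Two equally rated players: the winner gains exactly k/2 points.
new_winner, new_loser = update_elo(1500.0, 1500.0)
print(new_winner, new_loser)  # 1516.0 1484.0
```

The hard part is not the arithmetic but the input: a function like this is only useful to a federation that holds its own match results in a form it can read.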

Other examples of how valuable tennis data can be are easy to come across:

  • Many European countries award an “international ranking” to players based on a Tennis Europe ranking which is above a certain threshold; but this is an entirely manual process. A standard way of representing rank list data would enable in-country rank lists to automatically include international ranking data.
  • Some countries have agreements with neighboring countries such that foreign players’ rankings are valid when those players participate in cross-border tournaments… but there is no standard way of exchanging rank list data, so, again, it is all done manually. If there were a standard in place this process could be completely automated, saving referees time and ensuring accuracy.
  • Players in remote areas are often geographically closer to tournaments in neighboring countries than to tournaments in their own country; federations could better support player development if there were “federated” tournament schedules in such areas, including points or rating systems that recognize cross-border results. Standards for representing tournament schedules (including geospatial data), match outcomes, and points systems would make this possible; it would also be possible to better analyze player participation by location, tournament categories, etc.
  • There are many software solutions for Court Management, but these systems cannot exchange data with Tournament Management software, so court management functions are replicated and both systems run side-by-side.
  • Live Broadcasting applications such as Tennis Tracker don’t do tournament management, so information about tournament matches is imported by custom conversion scripts… this is not scalable if they are to support multiple tournament management applications… It would be far easier for this type of software vendor if there were standards, and it would mean that clients (Federations or other organizations) were not locked into single-vendor software solutions.
  • Match tracking systems, from simple mobile apps to automated systems such as PlaySight, Mojjo, and even Hawk-Eye, could use standard interfaces and data formats to seamlessly integrate with various Tournament Management, Live Broadcasting, and Player Development platforms.
  • Logistics challenges (tournament selection, travel and hospitality/accommodation) are faced by families, clubs, national teams and professional managers. Standard formats for the representation of tournament calendars (including geospatial data) and tournament rankings/ratings, can enable a whole new class of applications related to tennis. This topic was covered with respect to tennis at the 2015 MIT Sloan Sports Analytics Conference… but little if anything has been achieved.
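
To make the first two items above concrete: if rank lists were exchanged in one agreed format, folding international rankings into a national list would be a simple lookup rather than a manual task. The sketch below is hypothetical throughout; the JSON field names (`list`, `entries`, `player_id`, `rank`), the threshold of 100 and the function name are assumptions for illustration, not part of any existing standard or of any Tennis Europe format.

```python
import json

# Hypothetical rank list documents, as they might be exchanged between systems.
NATIONAL = json.loads("""
{"list": "national-U14", "entries": [
  {"player_id": "P1", "rank": 1},
  {"player_id": "P2", "rank": 2},
  {"player_id": "P3", "rank": 3}
]}
""")
INTERNATIONAL = json.loads("""
{"list": "international-U14", "entries": [
  {"player_id": "P2", "rank": 40},
  {"player_id": "P3", "rank": 250}
]}
""")


def attach_international(national: dict, international: dict, threshold: int) -> list[dict]:
    """Copy national entries, adding an international rank when it meets the threshold."""
    intl = {e["player_id"]: e["rank"] for e in international["entries"]}
    merged = []
    for entry in national["entries"]:
        row = dict(entry)  # copy, so the source list is left untouched
        rank = intl.get(entry["player_id"])
        # Lower rank numbers are better, so "above the threshold" means rank <= threshold.
        row["international_rank"] = rank if rank is not None and rank <= threshold else None
        merged.append(row)
    return merged


for row in attach_international(NATIONAL, INTERNATIONAL, threshold=100):
    print(row)
```

The same lookup would serve cross-border tournaments, provided both federations agree on how a player is identified; that shared identifier is exactly the kind of detail a standard would need to settle.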

And of course the most obvious way to extract value out of tennis data is to present it in such a way that website traffic grows… so that Federation sponsors see more value in being sponsors in the first place, and so that advertisements get more “hits”.

Live Broadcasting of events drives website traffic, as does presentation of historical match and tournament data and “Head to Head” comparison of players. This means that Federation websites need to be built in such a way that they can integrate with the data being made available by other systems related to tennis, and this integration will be greatly simplified by standard data formats and standardized application programming interfaces.

Federations which have no long term strategy for leveraging their tennis data to improve player development in their own countries will have no choice but to accept solutions offered by software vendors… and unless these vendors support some standards for the exchange of data related to tennis (players, matches, tournaments, schedules, calendars, rankings), federations will be “locked in” to whichever vendor captures them first, making change difficult. Vendors which are creating their own platforms are very likely to end up competing with Federations, using the Federation’s own membership to build up private data stores as a launch pad for future, as yet unspecified, product offerings.

Tennis Open Software Standards (TOSS) should cover data formats, interfaces for the exchange of data, algorithms for modeling aspects of tennis such as the structure of matches, the relationships between tournament events (qualification=>main draw=>consolation), rating systems, and point allocation strategies. TOSS can apply at all levels within the world of tennis, up to and including the pro circuits.
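
A minimal sketch of what modeling “the structure of matches” and “the relationships between tournament events” could look like follows. All class and field names are hypothetical, and the scoring logic is deliberately simplified (sets are recorded as final game counts, with no tiebreak or retirement handling); it is meant only to show the kind of model a standard could specify, not to propose one.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Match:
    side1: str
    side2: str
    sets: list[tuple[int, int]]  # games won per set, e.g. [(6, 4), (3, 6), (7, 5)]

    def winner(self) -> Optional[str]:
        """Return the side that won more sets, or None if sets are level."""
        won1 = sum(1 for a, b in self.sets if a > b)
        won2 = sum(1 for a, b in self.sets if b > a)
        if won1 == won2:
            return None
        return self.side1 if won1 > won2 else self.side2


@dataclass
class Event:
    name: str
    matches: list[Match] = field(default_factory=list)
    feeds_into: Optional["Event"] = None  # e.g. qualification -> main draw


def chain(event: Event) -> list[str]:
    """Follow feeds_into links and list event names in order."""
    names = []
    current: Optional[Event] = event
    while current is not None:
        names.append(current.name)
        current = current.feeds_into
    return names


# Build the qualification => main draw => consolation chain from the text.
consolation = Event("consolation")
main_draw = Event("main draw", feeds_into=consolation)
qualifying = Event("qualification", feeds_into=main_draw)
qualifying.matches.append(Match("Player A", "Player B", [(6, 4), (3, 6), (7, 5)]))

print(chain(qualifying))
print(qualifying.matches[0].winner())
```

A single `feeds_into` link is the simplest possible choice; a real specification would likely need to say who moves between events (winners of qualification, losers of the main draw) and how many.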

TOSS can also include certification that software vendors’ products adhere to regulations and standards which bear on tennis data, such as the GDPR (General Data Protection Regulation). This is an area where all vendors of software related to tennis should work together to forge an agreement on what junior player data should be warehoused and how it should be maintained.

The development and promotion of TOSS would not only help set data standards, but also promote an environment where systems could provide better transparency, and where the creativity of university and private researchers and other software vendors could be unleashed. Match data analysis can contribute to early identification of players for Davis Cup and Fed Cup participation, and the TOSS framework can help drive the development of additional metrics which can be used to benchmark player development, as well as the performance of coaches, clubs and academies; it can empower national federations to better measure, manage and improve performance metrics using their own data in a cost-effective way.

It’s not hard to see that data is actually the new oil, a primary resource, and that there is currently a “land grab” underway by well funded interests. If this data is locked up, inaccessible, in proprietary formats, alternative approaches will become increasingly untenable, and the influence of federations will diminish.

Tennis is an old sport, enjoyed by millions of players of all ages, all over the world; yet it is uniquely decentralized and fragmented, and far behind other sports in terms of constructively and creatively using data to support the game. The structures that have supported tennis to this point are facing disruptive forces, though they may not realize it. The creation of an organization to support a set of standards such as TOSS can propel Tennis from a laggard to a leader in sports data management and analytics.

An early version of this paper was presented at the Tennis Europe Annual General Meeting which was hosted by the Hungarian Tennis Association in Budapest from 22–24 March, 2018.
