TD;LR: Tennis Data; Loving Ratings

A brief response to TL;DR responses

Charles Allen
Jun 8, 2018


A number of people have requested a condensed version of the concerns I raised about Universal Tennis in my article “The Universal Appeal of Tennis Ratings”. Here it is… Though I would still prefer that these concerns be understood in the larger context presented in the original article.

Data Sharing: One-way

The Universal Tennis (UT) data sharing agreement contemplates only data shared by organizations with UT and makes no mention of any data sharing on the part of UT. While the benefits of players receiving a UTR are touted, there is nothing to indicate whether or how calculated UTRs might be shared with the organizations whose data is being used to calculate the ratings.

The marketing materials sent along with the Data Sharing agreement first define UTR, speak about its “rapid expansion” and then outline “problems” in tennis which may be addressed by “empowering” organizations with a new tool… but data provided by any such tools are not included in the data sharing agreement.

Data Sharing: Scope of use

The data sharing agreement grants “worldwide, non-exclusive, perpetual, irrevocable, transferable, sublicensable, royalty-free” rights to Universal Tennis, which will be able to “freely use, reproduce, display, distribute, modify and create derivative works” from all shared data.

Since the UT business model is as-yet unknown, having an undefined scope of use while granting unrestricted use of data provided to UT is problematic for any number of reasons.

Privacy Concerns

The data sharing agreement provides no assurances with respect to the safeguarding of the data being shared, and even specifically states that data is not to be considered confidential. While UT talks about taking GDPR seriously, there has been no mass email to existing users and nothing can yet be found on the UT website (as of May 25th, 2018). Many smaller firms in the tennis industry addressed GDPR weeks, even months, ago.

Non-specific, Non-standard

There is no reference to any specification defining what data is to be shared.

Burden of data conversion

The marketing materials state that there is no cost to submit federation results… this is, well, marketing, because, even though Universal Tennis may not charge fees, it is actually a huge cost in terms of time and energy to prepare results for submission. If there were a set of standards in place with respect to how data related to tennis may flow into and out of applications, often maintained by other vendors (services and software) on which federations depend, then the investments made in data transformation and export could be justified in a broader context.

In fact, preparing results for submission to Universal Tennis could result in a benefit to many federations if in the process they were able to rationalize the management of their own data, which may then be used in other ways.

No benefit to Federations

Independent of any specific plan to utilize UTRs to analyze or reorganize federation tournament structures, there is no benefit in federations undertaking the effort to share data with Universal Tennis. The data sharing agreement obligates organizations to provide data, but does not obligate UT to provide anything in return.

The benefit stated in marketing outreach is that all players with results in the system receive a UTR, and that UTRs may help some players get scholarships to US universities… but that’s not a benefit to federations, and in fact several federation representatives have stated that they did not want their players going to US universities… they wanted them to stay in Europe to represent their countries. Some even have a partnership with academies giving their players a path to sponsorship… and one of the points in the contract is that this sponsorship obligates players to represent their home country when they play internationally.

As an aside: From the point of view of non-US citizens, there can also be unexpected problems with playing college tennis in the US. For many players it is hard to stay in the US after graduation; in some cases US degrees are non-transferable; if it is not possible to pursue tennis professionally, then many players must return to university in their home countries.

Transparency: Business Model Unknown

Universal Tennis is amassing a dataset which is unavailable to any other organization; it has the ability to mine this data, to perform data analysis and data analytics, in the pursuit of new product and service offerings.

Universal Tennis has not declared a business model, a path to profit. Given the scope of the data sharing agreement it is reasonable to ask whether organizations should be freely sharing data when as yet unannounced offerings from Universal Tennis may end up negatively impacting their existing business.

If there were a collaborative model for sourcing and managing data in a standards-based framework, this concern would evaporate. The data would be available under agreed-upon terms to all interested parties. Unique, proprietary, and specific analysis and analytics would still be possible under this scenario.

Data Acquisition

Universal Tennis has no public data acquisition model, apart from the statement that only results which are publicly available on the internet are used to calculate UTR. It is unknown what percentage of data is being scraped from websites. Claims are made that data is being submitted by federations when in fact the data is being scraped from federation websites. This is disingenuous marketing which does not inspire confidence or respect.

Data Quality

Universal Tennis makes no assurances about data quality, and does not even have any disclaimers. If UTRs are being used to seed players in events then poor data quality (some results being included, some not, one player having multiple UTRs, incorrect match results) can have real-world consequences.

In many cases, particularly with screen scraping, it is impossible to uniquely identify players, which leads to duplicate player records.
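To make the identification problem concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the site names, the player, the row layout and the two matching rules are hypothetical and say nothing about how UT actually matches players. It only shows why results scraped without a shared player identifier cannot be reliably reconciled.

```python
from collections import defaultdict

# Hypothetical scraped rows: (source site, displayed name, birth year if shown).
# None of this is real data; it only illustrates the matching problem.
scraped_rows = [
    ("federation-a.example", "Ana Novak", 2003),
    ("federation-b.example", "NOVAK, Ana", None),
    ("club-c.example", "Ana Novak", 2005),
]


def name_key(displayed_name):
    """Normalize a displayed name to a sorted tuple of lowercase tokens."""
    tokens = displayed_name.replace(",", " ").lower().split()
    return tuple(sorted(tokens))


def group_by_name(rows):
    """Loose rule: rows with the same normalized name are one player."""
    groups = defaultdict(list)
    for source, name, birth_year in rows:
        groups[name_key(name)].append((source, birth_year))
    return dict(groups)


def group_by_name_and_birth_year(rows):
    """Strict rule: name and birth year must both match exactly."""
    groups = defaultdict(list)
    for source, name, birth_year in rows:
        groups[(name_key(name), birth_year)].append(source)
    return dict(groups)


by_name = group_by_name(scraped_rows)
by_name_and_year = group_by_name_and_birth_year(scraped_rows)

print(len(by_name))           # loose rule: everything collapses into one player
print(len(by_name_and_year))  # strict rule: every row becomes its own player
```

The loose rule may merge two different people who share a name (the 2003 and 2005 birth years), while the strict rule may split one person into several records because one site omits the birth year. Without an identifier issued by the source organization, either choice produces errors.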

It is unknown whether corrected match results are fed back into UTR calculations. Are sites periodically re-scraped to validate results? For data directly submitted to UT, is there a defined workflow for the submission of corrections? The data sharing agreement states that defects in data will be promptly fixed. But apart from such agreements, who has the authority to correct scores?

Data Sourcing

Universal Tennis does not specify the criteria for data source selection. It is stated that only results available on the internet are included, but who validates internet-based results? What qualifies as a tournament? If academies are able to host internal events amongst their own (and visiting?) players, should these events count as tournament play? Since a player’s last 30 matches are considered, will players be more or less likely to play high school tournaments and internal academy events if these matches could make up a significant portion of the matches considered in their UTR? If Universal Tennis’ Tournament Management System is used by a wide variety of organizations, from federations down to individual clubs and community organizations, how can anyone be assured that these (potentially unsanctioned) events are properly conducted? Is the kitchen-sink approach appropriate for all applications of tennis ratings, or might there be different criteria for different types of players and different types of events (sanctioned vs. unsanctioned might be only one such dimension)? If ratings are calculated for different surfaces (for marketing purposes only?), why couldn’t this be done with specific parameters for different constituents?

“Free” TMS Offering

Independent tournaments (whether or not they are level-based) do draw players away from federation events… and federations have been looking for ways to register more recreational players to build their base. These efforts will be impacted by the emergence of third-party events.

Data Availability

Will federations or other organizations own and have access to the data generated by using UT’s TMS? If federations wish to use the data for calculating their own rankings, or to submit it for use in other ratings algorithms, will that be allowed (and easy)? Standards for data import/export would make this a non-issue.
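As a rough illustration of what a shared import/export standard could mean in practice, here is a minimal Python sketch. No such standard is referenced in the data sharing agreement; the record type `MatchResult`, all of its field names and the sample values are hypothetical. The point is only that a documented record format lets any application write results out and any other application read them back unchanged.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MatchResult:
    # Every field name here is an illustrative assumption, not an existing standard.
    event_id: str
    match_date: str   # ISO 8601 date, e.g. "2018-05-25"
    winner_id: str    # identifier issued by the source organization
    loser_id: str
    score: str
    sanctioned: bool


def export_results(results):
    """Serialize results to JSON text that any other system could read."""
    return json.dumps([asdict(r) for r in results], indent=2, sort_keys=True)


def import_results(payload):
    """Rebuild MatchResult records from exported JSON text."""
    return [MatchResult(**item) for item in json.loads(payload)]


# Hypothetical sample record for the round trip.
results = [
    MatchResult("evt-001", "2018-05-25", "player-17", "player-42", "6-4 3-6 7-5", True),
]

payload = export_results(results)
restored = import_results(payload)
print(restored == results)
```

With a format like this agreed upon, the same export could feed a federation's own rankings, a TMS, or any ratings algorithm, which is why the question of access would largely answer itself.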

Data Additions

Universal Tennis does not specify the conditions under which historical data may be added to the system which calculates UTRs. Large batches of incoming data could suddenly and significantly impact players’ UTRs, which could have material effects, especially now that UTR is being used in the college admission process. This is rather unprecedented as PPR systems never retroactively add points for events not previously included in player rankings (unless of course tournaments or matches were accidentally excluded… but within the context of a federation parents and players are always keeping a close watch on ranking lists and all factors which influence their rankings).

Proprietary Algorithm

There is no good justification for the Universal Tennis Rating to be based on a proprietary algorithm; at least no good justification has been given. Protecting people from attempts to game the system is not a sound or sufficient argument.

Research Community

I’ve yet to come across anyone in the research community who is happy with the fact that UTR is proprietary. They, and others, want comparative analysis to be possible. They want to know why and under what circumstances one way of calculating ratings is superior to others. They, and others, want to know how the interaction of disparate pools of players can be taken into consideration… whether confidence factors should play a part in how tournaments are organized when such pools of players intersect.

Is Universal Tennis the only entity which gets to decide how players on different parts of the spectrum are differentiated? Where along the spectrum do the majority of recreational players fall and how are/might those players be treated differently algorithmically from others?

If the algorithm is proprietary then it is not possible for others to model the application of the algorithm to different data sets, to understand how and under what circumstances such a rating might be most beneficially applied. UTR may be beneficial for player development, but competition structures based on rankings will persist. Universal Tennis claims that level-based play is complementary, but how, exactly, might such tournament structures co-exist in the future? How are we to expect such decisions to be made? Organically or by market forces?

Business Continuity / Blowback

Universal Tennis is still in startup mode. There are a number of transitions that are possible, from management changes and ownership changes to questions of whether or not the business is viable for the long term. What safeguards are there that the UTR algorithm will not be modified to satisfy strategic objectives of the organization in whatever future form it may take? What safeguards are there in the event that the business fails? Should other organizations make changes to their businesses based on an algorithm that may change or disappear without warning?

There is also the concern that federations or other organizations could be opening themselves to criticism if they base their activities on a proprietary algorithm. The proprietary nature of UTR runs counter to the Open Data and Open Source movements which are gathering momentum.

Website Traffic

Will the UT business model drive traffic to UT and away from federation sites? News, rankings, tournament registrations and tournament results are the primary sources of federation web traffic, and web traffic is the basis of many sponsorship deals. If results are replicated on the UT site and registrations are done in UT software, this would cut traffic to federation websites unless UT software seamlessly integrates with federation websites or there is branding/advertising/sponsor pass-through to the UT properties. Will UT allow ratings histories to be posted in player profiles on federation sites?

