One of the common areas of data collection within sport science is daily wellness monitoring. This includes both subjective questions around athletes' general well-being (soreness, sleep, appetite, energy, mood etc.) and objective musculoskeletal (MSK) based data (groin squeeze, knee-to-wall, sit and reach, INT/EXT shoulder rotation etc.). One of the issues with this data collection is how to record it in a streamlined manner while still getting it analysed in time for action to be taken based on its results.
There are a number of athlete management systems (AMS) available to purchase which attempt, in different ways, to provide ready-made solutions to these issues. While the available AMS often work well, they can be costly and are not an option for many clubs. However, with some work and a combination of software tools, a custom wellness monitoring system can be designed. For this example, we will use a combination of Google Forms/Google Sheets, Power BI and R. The end goal is to have data collected and stored by Google, then analysed and visualised in Power BI, with R running in the background to aid the analysis.
When it comes to collecting data for this system there are a number of considerations: the type of data; who will be inputting it; where the data will be stored; and whether the storage allows for easy export.
Type of Data
The type of data is important as it informs how the data will be collected. If we take wellness data based on a scale, it can be straightforward enough if we want: a case of selecting the correct word or number. However, I have found that it can be useful to vary how this data is input in order to gain a bit more interest or increase the amount of “thinking” needed by the athlete to complete the questionnaire. For example, if we take 3 questions, Sleep, Soreness and Energy:
- Sleep could be a basic 1-5 on how well someone slept;
- Soreness might be numbered 1-5, with each number having a sentence describing what that number should feel like (often this can be made humorous or specific to a team/group of athletes etc.);
- Energy could be numbered but paired with images, where the athlete selects the image they feel best represents their energy level (again these images can be anything relevant to energy level).
Current weight is a question regularly included in wellness questionnaires that the above approach isn’t suitable for. For weight, ensuring the correct value is entered is often the most important issue. It’s not unusual for athletes (pre-morning coffee!) to enter this incorrectly, whether that means missing a decimal point or entering 200kg instead of 100kg (those buttons can be small on tablets!). Thankfully we can opt to include upper/lower limits on the data entry to, at the very least, limit the potential for data to be entered incorrectly.
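To make the idea of upper/lower limits concrete, here is a minimal sketch of the check in Python. The limits (40kg and 160kg) and the function name `validate_weight` are assumptions for illustration only, not values from any particular system; pick limits that suit your own squad. Note that limits only catch implausible entries; a wrong but believable weight will still get through.

```python
from typing import Optional

# Hypothetical limits for illustration; adjust to your own athletes.
WEIGHT_MIN_KG = 40.0
WEIGHT_MAX_KG = 160.0


def validate_weight(raw: str,
                    low: float = WEIGHT_MIN_KG,
                    high: float = WEIGHT_MAX_KG) -> Optional[float]:
    """Return the weight in kg if it parses and sits inside the limits, else None."""
    try:
        # Accept a comma as the decimal separator, e.g. "98,4".
        value = float(raw.strip().replace(",", "."))
    except ValueError:
        return None  # not a number at all
    # Out-of-range values (and NaN) fail this comparison and are rejected.
    return value if low <= value <= high else None


if __name__ == "__main__":
    for entry in ["100.5", "1005", "200", "abc"]:
        print(entry, "->", validate_weight(entry))
```

A missing decimal point ("1005") and a mistyped first digit ("200") both fall outside the limits and come back as `None`, so the athlete can be asked to re-enter the value.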
MSK data often falls into the same category as weight, where the goal is to limit the potential for incorrect data entry by similar means. Along with having limits in place for the data, images of the test for which the data is being entered can help as well. Anyone who has stood around while shoulder or hip external and internal rotation is being tested will definitely have been asked numerous times daily which is which!
Who Will be Inputting Data
Who will be inputting the data is important as we want the process to be as seamless and intuitive as possible. Often this data is recorded as early in the day as possible, so we will usually be dealing with athletes in varying states of waking up, where resisting the temptation to enter the same answer for all questions seems to be too large a step for many (in my experience anyway ;)). Some of the steps outlined above help to prevent this; however, a simple step that can make them considerably more effective is to randomise the order that questions appear in. While this can initially cause untold chaos as athletes react to this drastic step, once they are over the initial shock it can be very effective at increasing the honesty with which people answer, as again it’s a small way to increase the “thinking” needed.
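As a rough illustration of both ideas, the sketch below shuffles a list of questions and flags a response where every question received the same score. The question names and the function names `shuffled_questions` and `is_straight_lined` are hypothetical; in practice the form tool itself would usually handle the shuffling, and a flag like this would be applied to the stored responses afterwards.

```python
import random
from typing import Dict, List, Optional

# Example question names only; use whatever your questionnaire contains.
QUESTIONS = ["Sleep", "Soreness", "Energy", "Mood", "Appetite"]


def shuffled_questions(questions: List[str], seed: Optional[int] = None) -> List[str]:
    """Return a new list with the questions in random order; the input is unchanged."""
    rng = random.Random(seed)  # pass a seed only if you need a repeatable order
    order = list(questions)
    rng.shuffle(order)
    return order


def is_straight_lined(answers: Dict[str, int]) -> bool:
    """True when every question received the same score (needs at least two answers)."""
    scores = list(answers.values())
    return len(scores) > 1 and len(set(scores)) == 1


if __name__ == "__main__":
    print(shuffled_questions(QUESTIONS))
    print(is_straight_lined({"Sleep": 3, "Soreness": 3, "Energy": 3}))
    print(is_straight_lined({"Sleep": 4, "Soreness": 2, "Energy": 3}))
```

An identical score across all questions is not proof of a careless answer, so a flag like this is best treated as a prompt for a quick conversation rather than a reason to discard the data.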
Where the Data Will be Stored
Data storage has two main issues: is it secure, and can it be accessed easily?
In terms of security, the first thing to realise is that your data probably isn’t 100% secure no matter where you have it. A quick search around the internet shows that huge breaches in data security have occurred everywhere from large multinationals to the German government! If someone really wants access to your data, they will more than likely find a way. Luckily, this hasn’t appeared to be a major issue for sports (unless of course this is Leeds United’s next step).
The vast majority of sports organisations store wellness based data in one of two ways: an AMS or Excel. From a data storage perspective, the main benefit of using an AMS is that it is password protected and no data is stored on a shared drive/laptop etc. It’s certainly beyond the scope of this blog (and my knowledge!) to explore the cyber security of an AMS; however, one point of note would be to ensure your AMS provider allows for efficient data removal. While this has yet to surface to any great extent in Europe, athletes are becoming more and more protective of their own data and have a right (potentially even a legal one due to GDPR) to own it and ask for its removal if they leave the organisation. This is not always the easiest step to perform when using an external data management system.
Of course, if you are going down the route of using Excel, then removing data is a simple process of pressing delete. However, the risk here is that data can be very easily transferred to a USB or Dropbox file and shown to anyone and everyone. To a certain degree this can be lessened by using password protected files, storing data on an external hard-drive, or using a more secure online storage system (with Google Sheets and/or Dropbox being the most frequent). However, all of these options still have issues: files can be left open and accessed by unauthorised staff, and with online storage sites you are reliant on their policies protecting your data.
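If the data lives in a spreadsheet or a CSV export, the removal step can also be scripted so that it is applied consistently. The sketch below is only an illustration under stated assumptions: the column name `Athlete`, the IDs and the function name `remove_athlete` are all made up, and it works on CSV text rather than on an Excel workbook directly.

```python
import csv
import io
from typing import Tuple


def remove_athlete(csv_text: str, athlete_id: str,
                   id_column: str = "Athlete") -> Tuple[str, int]:
    """Return the CSV text without the athlete's rows, plus the number of rows removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise ValueError(f"column {id_column!r} not found")
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    removed = 0
    for record in reader:
        if record[id_column] == athlete_id:
            removed += 1  # skip this athlete's row
        else:
            writer.writerow(record)
    return output.getvalue(), removed


if __name__ == "__main__":
    # Made-up sample data purely to show the call.
    sample = (
        "Athlete,Date,Sleep\n"
        "A01,2019-01-14,4\n"
        "A02,2019-01-14,3\n"
        "A01,2019-01-15,5\n"
    )
    cleaned, count = remove_athlete(sample, "A01")
    print(count)
    print(cleaned)
```

Bear in mind that deleting rows from one file does nothing about copies, backups or exports held elsewhere, which is exactly the risk described above.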
There is a third option of using an external data warehousing company; however, for the majority of sports organisations the volume of data collected doesn’t require this step (yet). For national or university based organisations dealing with hundreds of athletes this can be a very beneficial option to investigate.
While an AMS does look a more promising approach from a security point of view, it can often make accessing your data difficult. Not all AMS are designed for easy access to the raw data, and the data export can be in a difficult format or very time-consuming to produce. This has often led to me writing R scripts to transform a data export into a usable format so I can then analyse the data. However, it must be noted this is not always the case and some companies do allow for API access.
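As an illustration of the kind of transformation involved, the sketch below reshapes a "wide" export (one column per question) into a "long" one (one row per athlete, date and question), which is usually easier to analyse and visualise. It is written in Python rather than R, and everything about the export is assumed: the column names `Athlete` and `Date`, the sample rows and the function name `wide_to_long` are hypothetical, since real AMS exports vary a lot.

```python
import csv
import io
from typing import Dict, List, Sequence


def wide_to_long(csv_text: str,
                 id_columns: Sequence[str] = ("Athlete", "Date")) -> List[Dict[str, str]]:
    """Turn one row per athlete/date into one row per athlete/date/question."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows: List[Dict[str, str]] = []
    for record in reader:
        for column, value in record.items():
            # Skip identifier columns, stray extra cells and missing/blank answers.
            if column is None or column in id_columns:
                continue
            if value is None or value.strip() == "":
                continue
            row = {key: record[key] for key in id_columns}
            row["Question"] = column
            row["Score"] = value.strip()
            rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up export purely to show the reshaping.
    export = (
        "Athlete,Date,Sleep,Soreness\n"
        "A01,2019-01-14,4,2\n"
        "A02,2019-01-14,,3\n"
    )
    for row in wide_to_long(export):
        print(row)
```

Blank answers are dropped here rather than kept as empty rows; whether that is the right choice depends on whether you want to report on missed questions.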
With Excel, data can be moved around efficiently through some variation of linked files (Power Query, Power Pivot, Power BI etc.). Similarly, with Google Sheets you can automate data access, although this can be dependent on company firewalls. However, depending on how you choose to do this, it can turn into a time-consuming process as data builds up.
Hopefully I have outlined some areas of consideration when it comes to online Wellness data collection, maybe even some you hadn’t previously considered. Up next we will look to use Google Forms to create our own Wellness data collection system.