Qualitative Prediction of Weight-Lifting Exercises
Suppose your company tells your team to switch from a proprietary analytics platform like SAS to something open source like RStudio. Some analysts will be fascinated, while others will be anxious. How would you get them excited about the switch?
My approach was to demonstrate the potential of the new open-source platform. How easily could RStudio generate reproducible research and facilitate storytelling with data? How could we weave together narrative text and code to produce and deliver elegantly formatted analyses to multiple audiences?
Human Activity Recognition (HAR) data from a Groupware@LES study provided the machine learning use case. HAR data has become ubiquitous with the advent of devices like the Fitbit, Nike FuelBand, and even smartphones. Although users of these devices tend to quantify how much of an activity they do, they rarely consider how well they perform it.
The provided multiclass variable was generated by participants wearing HAR devices, and its classes are relatively balanced (roughly equally distributed). This simplifies the analysis somewhat, since we don’t need to consider tactics to combat imbalanced classes, and it lets us focus on the other major analytics steps required in most machine learning projects.
Machine learning project goals:
- Use the multiclass variable to build a predictor that distinguishes participants who completed the fitness exercises correctly from those who did not, and indicates what their mistakes may have been.
- Demonstrate the feasibility of using RStudio to deliver reproducible research via a webpage.
The project results and code are available on my GitHub repository.