Machine Learning’s Growing Pains

Brad Solomon, Junior Investment Analyst

“Machine learning,” on its surface, sounds nothing short of miraculous.  For anyone who has ever felt overwhelmed when working with a large amount of intractable data, it evokes a certain fantasy: press a button, and let the machine learn. Poof, without any further instruction, your computer spits out relationships in the data seemingly untraceable to the human eye.

Yet paradoxically, there is also a competing perception that only a certain breed of mathematics PhDs and programming prodigies is worthy of using machine learning (ML) techniques.  The field of computer science has never been short on patronization; one post recommends first making sure that you have several advanced degrees and then learning the C or C++ languages, both of which are seen as some of the least user-friendly computer languages and neither of which is the language in which most machine learning is actually implemented.

Now that machine learning has made its way to the top of Gartner’s Hype Cycle for emerging technologies, and has also become pervasively marketed as part of the tool set of quantitative investment strategies, it’s probably a good time to debunk some misconceptions about what machine learning is, and what it isn’t.

Let’s start with a positive.  ML encompasses a wide range of statistical modeling techniques that can be applied toward facial recognition, predicting credit card fraud, and classifying tumors as malignant or benign, to name just a few implementations.  At the heart of machine learning are a number of different models that all serve as means to the same ends: predicting a value or classifying something categorically.  The list of models themselves is an intimidating mouthful: to name a few, there are neural networks, decision trees, Bayesian ridge regression, and support vector machines.

If your head is spinning, you’re not alone.  However, you might be surprised to learn that you likely covered some elements of machine learning in an introductory statistics course: for instance, ordinary least squares regression (linear regression) also falls under the umbrella of machine learning.  Machine learning practitioners also like to throw around a number of fancy terms that go by other names elsewhere in the broader discipline of statistics.  For example, training and test data are analogous to the more familiar terms in-sample and out-of-sample; supervised learning simply means that you start with independent and dependent variables, establish a relationship between them, and then apply that relationship to “fresh” (test) data.
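To make the in-sample versus out-of-sample idea concrete, here is a minimal sketch in plain Python. It fits an ordinary least squares line on training data and then checks the error on held-out test data. The data is synthetic and invented purely for illustration, and the helper names (fit_ols, mean_squared_error) are hypothetical rather than taken from any particular package.

```python
import random


def fit_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(xs, ys, intercept, slope):
    """Average squared gap between actual and predicted values."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Synthetic data (an assumption for illustration): y = 2 + 3x plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

# Training (in-sample) data: the first 150 points.
# Test (out-of-sample) data: the remaining 50 points, never used in fitting.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

# "Supervised learning": learn the relationship from the training data only...
intercept, slope = fit_ols(train_x, train_y)

# ...then apply that relationship to the fresh test data.
train_mse = mean_squared_error(train_x, train_y, intercept, slope)
test_mse = mean_squared_error(test_x, test_y, intercept, slope)

print(f"intercept={intercept:.2f}, slope={slope:.2f}")
print(f"in-sample MSE={train_mse:.2f}, out-of-sample MSE={test_mse:.2f}")
```

If the out-of-sample error is far worse than the in-sample error, the model has likely memorized quirks of the training data rather than learned a relationship that generalizes.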

Now, to debunk one of several myths: ML is not new; the term was coined in 1959, and the techniques have been used widely in the tech industry for decades.  However, growth in the popularity of the Python programming language, which is open-source, free, and offers a number of user-friendly machine learning packages, has fueled interest in the concept.

One result has been the proliferation of machine learning techniques and their (purported) use in quantitative investment applications.  At Brinker, we’ve come across more than a handful of managers using machine learning: the use of random forest classification to identify the likelihood that a company will cut its dividend, or forecasting of market volatility regimes through Markov chain Monte Carlo methods.  However, we would be remiss not to mention that for every manager who usefully employs machine learning, there are a half-dozen others who simply like being able to include it on a slide in their strategy’s pitch-book.  Bloomberg bluntly articulated this recently: “Hedge Funds Beware: Most Machine Learning Talk Is Really ‘Hokum’.”  A healthy dose of skepticism is warranted.

That brings up a second key point: when interacting with managers who profess to use ML in their everyday process, ask as many “dumb” questions as possible.  In layman’s terms, can you describe what’s going on “under the hood”?  Why did you select this model in particular?  While the mathematics behind certain models can be quite hairy, the high-level intuition should not be.  And lastly, while machine learning hasn’t yet been fully commoditized, that doesn’t mean you should be paying a 2 & 20 fee to access its capabilities.

The views expressed are those of Brinker Capital and are not intended as investment advice or recommendation. For informational purposes only. Holdings are subject to change. Brinker Capital, Inc., a Registered Investment Advisor.