Original article: https://www.makingdatamistakes.com/what-is-machine-learning/ (a local copy is included below for convenience).

Machine learning algorithms are faced with the same challenge you had as a pupil at school.

There’s going to be an exam. The higher you score, the better. You’re given some exams from the past with answer sheets to study from.

This leads to a few standard data science principles that will be familiar from your schooldays:

In machine learning terms, we’d say that the practice exams are the algorithm’s ‘training set’, i.e., the data it learns from. Its ‘performance’ is its score on new, unseen data, aka the ‘testing set’, and that’s what we really care about.
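To make the training set / testing set idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the toy data (hours studied, and whether the student passed), the "roughly 5 hours" rule used to generate it, and the helper names `make_data`, `fit_threshold` and `accuracy`. The 'algorithm' simply picks the cut-off that scores best on the practice exams, and is then marked on exams it has never seen. Real machine learning algorithms are far more sophisticated, but the workflow is the same.

```python
import random


def make_data(n, rng):
    """Hypothetical toy data: (hours_studied, passed) pairs with some noise."""
    data = []
    for _ in range(n):
        hours = rng.uniform(0, 10)
        # Assumed rule, for illustration only: students tend to pass
        # once they have studied for roughly 5 hours.
        passed = (hours + rng.gauss(0, 1)) > 5
        data.append((hours, passed))
    return data


def accuracy(threshold, data):
    """Fraction of examples where 'hours > threshold' matches the true label."""
    correct = sum((hours > threshold) == passed for hours, passed in data)
    return correct / len(data)


def fit_threshold(training_set):
    """'Learn' by choosing the cut-off that scores best on the training set."""
    candidates = [hours for hours, _ in training_set]
    return max(candidates, key=lambda t: accuracy(t, training_set))


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(200, rng)

# Practice exams (training set) vs. the real exam (testing set).
training_set, testing_set = data[:150], data[150:]

threshold = fit_threshold(training_set)
train_score = accuracy(threshold, training_set)
test_score = accuracy(threshold, testing_set)

print(f"learned threshold: {threshold:.2f} hours")
print(f"training score:    {train_score:.2f}")
print(f"testing score:     {test_score:.2f}")
```

The number to pay attention to is the testing score. The training score only tells you how well the algorithm did on the practice exams it already had the answers to.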

Machine learning algorithms automatically improve with experience. They are unlike normal computer code, which needs every single step to be spelled out in unambiguous and complete detail. Given enough high-quality data and a clear scoring system, machine learning algorithms use clever maths to notice patterns in the training set and apply them to new data.

In Part 2, we’ll talk about some techniques for guessing when the algorithms will be able to meet this magical-sounding promise, and when they’ll fall far short.

This post is part of a series called Data Science Explained - where I explain Data Science concepts to managers and other non-technical people. Other posts in this series include: