In supervised learning we are given an input domain $A$ and an output codomain $B$ with some observations $O \subset A \times B$ of a function $f^{\ast} : A \rightarrow B$. The goal is to approximate $f^{\ast}$ the best we can given these observations.