Week 5 - Infinite hypothesis spaces

OMSCS

In Week 5 - Computational Learning Theory we developed a method to build a PAC learner over a finite hypothesis space. However, that argument breaks down when the hypothesis space is infinite, so we need a way around it.

Let's just cheat

Whilst the hypothesis space might be infinite in theory, the number of distinct hypotheses it actually realises could be finite. In that case we can fall back on the previous work.

Discrete problem domain

Let $A=\{i \in \mathbb{N} \ \vert \ 1 \leq i \leq 10\}$ and $B = \{T,F\}$. Then notice that $\vert Fun(A,B) \vert = 2^{\vert A \vert} = 2^{10}$, which is finite. If instead we took $H = \{x \geq \theta \ \vert \ \theta \in \mathbb{R}\}$ then $H$ is infinite as a family, since $\theta$ ranges over $\mathbb{R}$. However, in reality at most 11 distinct functions are realised on $A$, namely $\{x > n \ \vert \ 0 \leq n \leq 10, \ n \in \mathbb{Z}\}$, so effectively we have a finite hypothesis space.
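To make this concrete, here is a minimal Python sketch (my own illustration, not part of the notes) that sweeps $\theta$ and counts the label patterns actually realised on $A$; only 11 appear.

```python
# Minimal sketch: count the distinct functions that h_theta(x) = [x >= theta]
# actually realises on A = {1, ..., 10}.
A = range(1, 11)

# As theta sweeps over R, the induced labelling only changes when theta
# crosses an element of A, so one representative theta per "gap" suffices.
thetas = [n + 0.5 for n in range(0, 11)]

realised = {tuple(x >= theta for x in A) for theta in thetas}
print(len(realised))  # 11 distinct functions, even though theta ranges over R
```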

VC-dimension

In the problem above the hypothesis space was limited in some way. We want to capture this idea of a limited hypothesis space when working with genuinely infinite hypothesis spaces.

VC dimension

The VC (Vapnik-Chervonenkis) dimension of a hypothesis space $H$ is the size of the largest set of inputs that $H$ can shatter, i.e. label in every possible way.

In the example above the VC dimension is 1. A single point can be given either label by choosing $\theta$, but for any two points $1 \leq l < u \leq 10$ no hypothesis $h_{\theta}(x) = [x \geq \theta]$ can set $h_{\theta}(l) = T$ without also setting $h_{\theta}(u) = T$, so the labelling $(T, F)$ is never realised.
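As a sanity check, here is a small Python sketch (again my own, not from the notes) that tests shattering directly: a single point is shattered, but no pair of points is, because the labelling $(T, F)$ with $l < u$ is unreachable.

```python
# Sketch: can threshold hypotheses h_theta(x) = [x >= theta] shatter a given
# set of points, i.e. realise every possible True/False labelling of it?
def shatters(points):
    pts = sorted(points)
    # The labelling only changes when theta crosses a point, so one
    # representative theta per gap (plus one below and one above all points)
    # covers every hypothesis distinguishable on these points.
    thetas = ([pts[0] - 1.0]
              + [(a + b) / 2.0 for a, b in zip(pts, pts[1:])]
              + [pts[-1] + 1.0])
    realised = {tuple(x >= t for x in points) for t in thetas}
    return len(realised) == 2 ** len(points)

print(shatters([3]))     # True  -> a single point is shattered, so VC >= 1
print(shatters([3, 7]))  # False -> (T, F) is unreachable, so VC = 1
```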

VC dimensions roughly follow degrees of freedom

| Separator | VC dimension | Parameters |
| --- | --- | --- |
| 1-dimensional hyperplane | 1 | 1 |
| Interval on $\mathbb{R}$ | 2 | 2 |
| 2-dimensional hyperplane | 3 | 3 |
| $d$-dimensional hyperplane | $d + 1$ | - |
| Convex polygon in $\mathbb{R}^n$, $n > 1$ | $\infty$ | $\infty$ |
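The second row can be checked in the same spirit. The sketch below (my own; it writes interval hypotheses as $h_{a,b}(x) = [a \leq x \leq b]$) confirms that intervals shatter two points but not three, matching a VC dimension of 2.

```python
# Sketch: interval hypotheses h_{a,b}(x) = [a <= x <= b] on R.
def shatters_with_intervals(points):
    pts = sorted(points)
    # The labelling only changes when an endpoint crosses a point, so a few
    # representative endpoints suffice to enumerate every realisable labelling.
    cuts = ([pts[0] - 1.0]
            + [(x + y) / 2.0 for x, y in zip(pts, pts[1:])]
            + [pts[-1] + 1.0])
    realised = {tuple(a <= x <= b for x in points)
                for a in cuts for b in cuts if a <= b}
    return len(realised) == 2 ** len(points)

print(shatters_with_intervals([2, 5]))     # True  -> VC dimension >= 2
print(shatters_with_intervals([2, 5, 9]))  # False -> (T, F, T) is impossible, so VC = 2
```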

PAC learnable and VC dimension

PAC learnable bound with VC-dimension

$$ m \geq \frac{1}{\epsilon}\left( 8 \, VC(H) \log_2\left(\frac{13}{\epsilon}\right) + 4\log_2\left(\frac{2}{\delta}\right) \right). $$

This formula appears out of nowhere, but there is a connection with the finite case, where

$$ m \geq \frac{1}{\epsilon}\left ( \ln(\vert H \vert) + \ln\left(\frac{1}{\delta}\right) \right). $$

For a finite hypothesis space $H$ with $VC(H) = d$ we require $\vert H \vert \geq 2^d$: to shatter $d$ points, every one of the $2^d$ possible labellings must be realised, and distinct labellings need distinct hypotheses. Therefore we get a natural upper bound on the VC dimension

$$ d \leq \log_2(\vert H \vert). $$

This nicely connects the formulas.
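For a rough numeric feel, here is a sketch of my own that evaluates both bounds on the 10-element example from earlier; $\epsilon = 0.1$ and $\delta = 0.05$ are arbitrary example values, and $d = \log_2 \vert H \vert = 10$ uses the upper bound just derived.

```python
import math

epsilon, delta = 0.1, 0.05  # assumed example values: error bound and failure probability

def m_finite(h_size):
    """Finite-H bound: m >= (1/eps) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)

def m_vc(d):
    """VC bound: m >= (1/eps) * (8 d log2(13/eps) + 4 log2(2/delta))."""
    return math.ceil((8 * d * math.log2(13 / epsilon) + 4 * math.log2(2 / delta)) / epsilon)

h_size = 2 ** 10            # all Boolean functions on the 10-element domain
d = int(math.log2(h_size))  # d <= log2|H| = 10

print(m_finite(h_size))  # samples sufficient by the finite-H bound
print(m_vc(d))           # the VC bound ignores |H|, so it is far looser here, but still finite
```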

PAC learnable if and only if finite VC dimension

A hypothesis space $H$ is PAC-learnable if and only if its VC dimension is finite.