Programming

Pages in this section

Last edited: 2023-11-12

2SAT(f):
    Input: 2-SAT problem on n variables with m clauses.
    Output: Assignment from n variables satisfying f or stating no assignment is possible.
1. Find all the unit clauses u_i in f.
2. Create a dummy variable Y. Set f' = f
    1. Replace the unit clauses in f' with u_i \lor Y.
    2. Add new clauses to f' of u_i \lor \overline{Y}.
3. Construct graph G for f'.
4. Run the SCC algorithm on G.
5. If any variable and its complement are in the same SCC, return no assignment possible.
6. While there are still SCCs left:
    1. Take sink SCC S.
    2. Set the clauses in S to true (and their complements to false).
    3. Remove S & \overline{S}.
7. Return the assignment, excluding the value for Y.

Correctness

3-SAT is NP-complete
Last edited: 2025-12-05
# Statement
Lemma
3-SAT is NP-complete .
Adjacency list format (graph)
Last edited: 2026-02-05
Adjacency list
For a graph $G = (V,E)$ (that can be a directed graph or undirected graph ) the adjacency list format is a list of neighbourhoods $N_v = \{u \in V \vert (v,u) \in E\}$ for $v \in V$.
Anonymous Functions
Last edited: 2023-11-11
These are functions that are not named. They are usually used when you are using a function once and don’t want to clutter up the name space.
# Examples
In Python Index these are implemented in by Lambda functions but these are explicitly for functions that can be defined in the Functional Programming paradigm.
Application Programming Interface (API)
Last edited: 2023-11-11
This is a interface intended to be used by other applications (i.e. not other humans). Whist this is mainly thought of as a web API’s, there are others such as programming languages (think machine code), software libraries or operating systems .
Array (data structure)
Last edited: 2026-01-28
Array
This data structure consists of a collection of elements of the same memory size, where each one is identified by an index. This comes with a function to map indices to memory location of each element. (This is normally integer index with a contiguous memory block.) These can be static where you fix the size of the array beforehand or dynamic where you allow the array size to change.
Associative array
Last edited: 2026-02-05
Associative array
This is an abstract data structure that hold a collection of key - value pairs. The keys must be unique. The data structure needs to support the following operations:
Asynchronous programming
Last edited: 2026-02-05
Asynchronous programming
Asynchronous programming is a programming paradigm that allows a program to handle operations that may take an indeterminate amount of time, such as I/O -bound tasks, without blocking the execution of the entire program. Instead of waiting for an operation to complete before moving on to the next one, asynchronous programming enables a program to initiate an operation and then continue executing other tasks while waiting for the operation to finish. Once the operation completes, the program is notified, and the relevant code can be executed.
Backward propagation of errors (Back propagation)
Last edited: 2026-02-05
This is the specific implementation of gradient descent applied to neural networks . There are two stages of this calculation:
- Forward pass through the neural network as you evaluate it on test data .
- Backward propagation of that error as you use the chain rule for differentiation on each of the perceptrons .
For this to work all the perceptrons need to have differentiable activation function .
Backwards compatibility
Last edited: 2023-11-11
This is the concept that new releases can still work in applications that used the old version of your code. It is an absolute nightmare when you are programming leading to horrible design choices and many Deprecation warnings.
Bagging
Last edited: 2024-01-24
This is one of the simplest Ensemble learning methods but out performs classic modelling methods in certain problems.
Suppose we are in the modelling framework . The bagging is the following process
1. We choose a set of modelling algorithms $A_i$ for $1 \leq i \leq k$ that could fit the problem - these could all be the same.
2. We take random subsets of the training data $T$, $T_i$ for $1 \leq i \leq k$, with replacement - so two samples could contain the same data point.
3. We then train $\hat{f}_i$ using algorithm $A_i$ with training data $T_i$ for $1 \leq i \leq k$.
4. Then we have some method of averaging these models over our problem space to produce $\hat{f}$ our final model.
# Example
Suppose we want to use polynomial regression on a simple function $f: \mathbb{R} \rightarrow \mathbb{R}$ with training data $T$.
Balance between library providers and users
Last edited: 2023-11-11
When developing third-party libraries, creators often aim for wide applicability, packing in a broad range of features to cater to as many use-cases as possible. On the flip side, consumers of these libraries typically have a more narrow focus, needing only a subset of the available functionality. This dynamic can create a form of natural tension between library providers and users. It’s like buying a multipurpose tool loaded with features, knowing you’ll probably only ever use a few of them.
Balanced cut problem
Last edited: 2026-02-05
# Statement
Balanced cut problem
Given an undirected graph $G = (V,E)$ and an integer $b$. Output a cut $S, T \subset V$ such that $cut(S,T) \leq b$ or no if no such cut exists.
Bellman-Ford algorithm
Last edited: 2023-11-11
Suppose we are trying to solve the problem of how to Find path in undirected graph where we have no negative weight cycles.
In this case you can guarantee that paths will only visit each vertex at most once. So they will use at most $\vert V \vert - 1$ edges. So it is enough to solve the subproblem.
Let $D(i,z)$ be the length of the shortest path between $s$ and $z$ that use at most $0 \leq i \leq n-1$ edges.
Big-O notation
Last edited: 2023-11-11
Big-O notation is the main way people determine the runtime of algorithms. (Rightly or wrongly.) It is the worst case analysis and has a formal mathematical definition
$$ f = O(g) \mbox{ if } \exists \ x_0 \in \mathbb{N}, m \in \mathbb{R}_{>0} \mbox{ such that } f(x) \leq m \ast g(x)\ \forall x > x_0.$$
This is just saying that $f$ eventually is at most a scalar multiple of $g$.
Big-Omega notation
Last edited: 2023-11-11
Big-Omega notation is said to be the best case runtime of an algorithm. However formally it is simply a lower bound on the run time - the same as Big-O notation is simply an upper bound. This has a similar formal definition
$$ f = \Omega(g) \mbox{ if } \exists \ x_0 \in \mathbb{N}, m \in \mathbb{R}_{>0} \mbox{ such that } 0 \leq m \ast g(x) \leq f(x) \ \forall x > x_0.$$
This is just saying that $f$ eventually is at least a scalar multiple of $g$.
Big-Theta notation
Last edited: 2023-11-11
Big-Theta notation is the tight bound on the runtime of an algorithm. This is only really defined when some function is the following $f = O(g)$ using Big-O notation and $f = \Omega(g)$ then we have $f=\Theta(g)$ using Big-Theta notation this is also an equivalence relation, so $g = \Theta(f)$ as well.
Bit
Last edited: 2023-11-11
A bit in the context of Computer Science is the smallest unit of data. It is a single 0 or 1.
This should not be confused with a byte which is 8 bits.
Bitwise operations in python
Last edited: 2026-01-28
# Bitwise operations
Bitwise operations can apply to the binary data type but also integers .
Representation of integers in Python Index
In Python integers are signed and stored using Two’s complement .
Boolean variable
Last edited: 2023-11-11
A Boolean variable is one that takes the value True or False it is a basic type in most programming languages.
In Maths Index the True or False value is regularly replaced with 1 or 0 respectively. This is similar to the representation that variable would have in memory.
Boosting
Last edited: 2024-01-24
Boosting is an Ensemble learning method that uses multiple instances of another model to average them out to make a classifier that is better than any one individual one.
Suppose we are in the modelling framework where $B = \{-1,1\}$ and we have a training set $T$ where $(a_t, b_t) = t \in T$. Boosting iteratively train classifiers $h_i: A \rightarrow \mathbb{R}$ based on how the past classifiers have performed. To do this before training $h_i$ we define a distribution $\mathbb{D}_i$ on $A_T = \{a_t \vert t \in T\}$.

Breadth-first search (BFS)

Last edited: 2023-11-11

This is a method for exploring a tree. Lets say you have a vertex $v \in E$ and you either want to run some procedure on vertices or extract some information form them (here we return the parents). Here we use a queue data structure.

 BFS(G, root):
	Input: Graph G = (V,E) and root in V
	Output: Shortest paths to the root in a tree.
1. let Q be a queue
2. label root as explored
3. Queue the root node on Q.
4. while Q is not empty
	4.1. v = Q.dequeue()
	4.2. for all edges (v,w) in E
		4.2.1. if w is not labeled as explored:
			4.2.1.1. label w as explored
			4.2.1.2. set v to be the parent of w.
			4.2.1.3. Queue w on Q.
5. return parent array

This runs in $O(\vert V \vert + \vert E \vert)$ as we run over each edge once (or twice if undirected) and visit each vertex once.

Byte
Last edited: 2023-11-11
This is a group of 8 bits i.e. a 1 or 0.
Cache
Last edited: 2026-02-05
Cache
A cache is a method used to speed up data retrieval. It stores data being accessed in a faster type of memory or a closer location to the CPU . If the data is not present in the cache, it is loaded from somewhere else into the cache. This is called a cache miss and normally results in a costly action.
Chain Hashing
Last edited: 2026-02-05
Chain Hashing
This is a form of Hash table ($A$, $h: K \rightarrow A$) who’s entries in the associative array are linked lists . Given an element $k \in K$ to add it to this hash table you calculate $h(k)$ then append it to the associated linked list . Then to check if an element $e \in K$ is in your hash table calculate $h(e)$ and then search the associated linked list.
Chain matrix multiply problem
Last edited: 2026-01-28
# Statement
Chain Matrix Multiply problem
For $n$ matrices $A_1, A_2, \ldots, A_n$, where $A_i$ has size $m_{i-1} \times m_i$, what is the minimum cost for computing $A_1 A_2 \ldots A_n$?
Check if a linear programme is solvable
Last edited: 2023-11-11
We are going to use the feasibility algorithm for a linear programme to do this - lets call this $feasibility\_checker$ that takes a linear programme and outputs a boolean based on if it is feasible or not.
This uses a corollary of the Weak duality theorem (linear programme) which tells us unbounded linear programmes have infeasible duals . So we will check if the dual linear programme is feasible. For this let $dual$ be an algorithm to calculate the dual.
Checked exceptions
Last edited: 2023-11-11
Checked exceptions are a feature of some programming languages, most notably Java , that enforce a strong contract between a method that can fail (throw an exception) and the caller that must handle that failure. When a method might throw a checked exception, the language requires that the method declare this fact in its method signature using a throws clause. Any calling method is then obligated either to catch the exception and handle it or to declare that it, too, throws the exception , propagating it up the call stack.
Checking if a linear programme is feasible
Last edited: 2023-11-11
Suppose we are given a linear programme in standard form specified by $x, c \in M_{n,1}(\mathbb{R})$, $A \in M_{m,n}(\mathbb{R})$, and $b \in M_{m,1}(\mathbb{R})$ and we have a standard form linear programme solver $linear\_programme\_solver$.
Note we will include in the algorithm some transformation to guarantee the inputted linear programme is in standard form by breaking up the added variable into it’s positive and negative components. See getting a linear programme in standard form
Classes in Python
Last edited: 2023-11-11
Classes in python are defined by the class keyword for example.
```
class NewThing:

	def __init__(self, variable):
		self.variable = variable
```
When using classes you use the self keyword to refer back to the object. Therefore a class method requires to have the first argument as self.
Special functions
# __call__
When a class has this property it makes it callable. For example:
Clean Code
Last edited: 2023-11-11
This is a book by C. Martin Robert . It contains ideas about how code should be written. I have summarised the ideas I took from this into the Conventions notes.
Clique of a given size problem
Last edited: 2026-02-05
# Statement
Clique of a given size problem
Given an undirected graph $G = (V,E)$ and a positive integer $g > 0$. Does there exist a clique of size at least $g$ if so what is it?
Clique of a given size problem is in NP
Last edited: 2026-01-28
# Statement
Lemma
Clique of a given size problem is in NP .
Cocktail party problem
Last edited: 2026-02-05
# Statement
Cocktail party problem
Suppose you have $n$ people talking at the same time with $k$ microphones placed around the room. From the recordings of the $k$ microphones, when and how can you recover the $n$ sounds?
Coding Principles
Last edited: 2023-11-11
There are many coding principles to stick to, these are just a collection of useful heuristics to avoid the pain of past programmers.
- SOLID
- GRASP
- The Law of Demeter
Cohesion
Last edited: 2023-11-11
When googling what this word means I got the following
Google
The action or fact of forming a united whole.
Comment conventions
Last edited: 2023-11-11
Comments in code are a necessary evil. As they are not the code, there is no way to check their integrity. As code gets changed the comments may not be updated and can easily cause more confusion that good. Comments also take some thought to process as you need to understand what the person who wrote the comment meant - which is far easier to do in code.
There are good use cases like a public API or todo notes - though they are mainly there to compensate for our failure to express ourselves in code. They do not make up for bad code!
Conventions
Last edited: 2023-11-11
Whilst I have written summaries of the conventions from Clean Code and added in some comments about my preferences. The main thing to remember is that the team conventions are the ones that matter. The purpose of conventions is for teams to work better together, when you look at file and your brain doesn’t have to mentally map the code to how to like to read code. The team rules, when it comes to conventions. Whenever you start a new project, get the people who are working on it to agree on a standard and then write what that is down or better implement a Linter to automatically apply the standard when saving or pushing to the common repository.
Cost complexity pruning for decision trees (CPP)
Last edited: 2024-02-02
When making a decision tree with no pruning it can tend to overfit the training data . To reduce this we can “prune” the decision tree. We do this by looking at how useful each of the branches are - then removing the ones that don’t add enough value. This is controlled by a variable $\alpha$ which increases the cost of having more leaf nodes.
Coupling
Last edited: 2023-11-11
Two bits of code are ‘Coupled’ if they rely on one another. In other words if one changes then you are required to change the other bit of code. When we write programs we really want low coupling as these are easier to change and maintain.
There are lots of methods to decrease coupling, such as introducing interfaces and The Law of Demeter .
CRUD API
Last edited: 2023-11-11
A CRUD API is one that has a set of standard end points:
- Create,
- Read,
- Update, and
- Delete. These normally connects to a database .
CS6035 O01 Introduction to Information Security
Last edited: 2026-01-28
Introduction to Information Security is a graduate-level introductory course in information security. It teaches the basic concepts and principles of information security, and the fundamental approaches to secure computers and networks.
Main topics include:
- Security basics
- Security management and risk assessment
- Software security
- Operating systems security
- Database security
- Cryptography algorithms and protocols
- Network authentication and secure network applications
- Malware
- Network threats and defences
- Web security
- Mobile security
- Legal and ethical issues
- Privacy
# Links
- Central
- OMSCS page
CS6200 Graduate introduction to Operating Systems
Last edited: 2023-12-03
This is a graduate-level introductory course in operating systems. This course teaches the basic operating system abstractions, mechanisms, and their implementations. The core of the course focuses on OS support for concurrency (threads) and synchronisation, resource management (CPU, memory, I/O), and distributed services. The practical component of the course teaches multi-thread programming, inter-process communication, and distributed interactions via RPC.
Break
I retook this course so there is a slight break in dates between lectures 4 and 5. The content was the same for both times I took it.
CS6210 Advanced Operating Systems
Last edited: 2025-12-03
Advanced Operating Systems is a graduate-level course that addresses a broad range of topics in operating system design and implementation, including:
- Operating system structuring
- Synchronization, communication, and scheduling in parallel systems
- Distributed systems, their communication mechanisms, distributed objects and middleware
- Failures and recovery management
- System support for Internet-scale computing By tracing the key ideas of today’s most popular systems to their origins in research, the class highlights key developments in operating system design over the last two decades and illustrates how insight has evolved to implementation.
# Links
- Central
- OMSCS page
CS6215 Introduction to Graduate Algorithms
Last edited: 2023-12-03
This course is a graduate-level course in the design and analysis of algorithms. We study techniques for the design of algorithms (such as dynamic programming) and algorithms for fundamental problems (such as fast Fourier transform FFT). In addition, we study computational intractability, specifically, the theory of NP-completeness. The main topics covered in the course include: dynamic programming; divide and conquer, including FFT; randomized algorithms, including RSA cryptosystem; graph algorithms; max-flow algorithms; linear programming; and NP-completeness.
CS6250 Computer Networks
Last edited: 2025-12-05
This course provides a quick refresher on introductory material and offers broad coverage of protocols and algorithms that span all layers of the Internet protocol stack.
The course begins with an overview of the evolution of the Internet architecture and a short refresh of the transport layer algorithms and protocols such as TCP and congestion control. Next, the course will dive into intradomain/interdomain routing, peering and networks relationships with a follow-up into router design and functionalities. After discussing routing and router technologies, the course will continue with Software Defined Networking technologies and discuss topics that intersect Network Security and Computer Networks, for example, attacks on Internet routing such as BGP hijacking. Finally, the course wraps up with a discussion on multimedia applications and Content Delivery Networks.
CS7641 Machine Learning
Last edited: 2024-01-10
This is a 3-course Machine Learning Series, taught as a dialogue between Professors Charles Isbell (Georgia Tech) and Michael Littman (Brown University).
Supervised Learning Supervised Learning is a machine learning task that makes it possible for your phone to recognize your voice, your email to filter spam, and for computers to learn a number of fascinating things. This sort of machine learning task is an important component in all kinds of technologies. From stopping credit card fraud; to finding faces in camera images; to recognizing spoken language - our goal is to give students the skills they need to apply supervised learning to these technologies and interpret their output. This is especially important for solving a range of data science problems.
CS7642 Reinforcement Learning
Last edited: 2024-01-10
The course explores automated decision-making from a computational perspective through a combination of classic papers and more recent work. It examines efficient algorithms, where they exist, for learning single-agent and multi-agent behavioral policies and approaches to learning near-optimal decisions from experience.
Topics include Markov decision processes, stochastic and repeated games, partially observable Markov decision processes, reinforcement learning, deep reinforcement learning, and multi-agent deep reinforcement learning. Of particular interest will be issues of generalization, exploration, and representation. We will cover these topics through lecture videos, paper readings, and the book Reinforcement Learning by Sutton and Barto.
CS8803 O08 Compilers - Theory and Practice
Last edited: 2026-01-28
The objective of this course is to learn the theory and practice behind building automatic translators (compilers) for higher level programming languages and to engineer and build key phases of a compiler in Java or C++ for a small language.
# Links
- Central
- OMSCS page
- Course textbook
CSE6220 Introduction to High Performance Computing
Last edited: 2026-01-28
This course is a graduate-level introduction to scalable parallel algorithms. “Scale” really refers to two things: efficient as the problem size grows, and efficient as the system size (measured in numbers of cores or compute nodes) grows. To really scale your algorithm in both of these senses, you need to be smart about reducing asymptotic complexity the way you’ve done for sequential algorithms since CS 101; but you also need to think about reducing communication and data movement. This course is about the basic algorithmic techniques you’ll need to do so. The techniques you’ll encounter cover the main algorithm design and analysis ideas for three major classes of machines: for multicore and manycore shared memory machines, via the work-span model; for distributed memory machines like clusters and supercomputers, via network models; and for sequential or parallel machines with deep memory hierarchies (e.g., caches). You will see these techniques applied to fundamental problems, like sorting, search on trees and graphs, and linear algebra, among others. The practical aspect of this course is implementing the algorithms and techniques you’ll learn to run on real parallel and distributed systems, so you can check whether what appears to work well in theory also translates into practice. (Programming models you’ll use include OpenMP, MPI, and possibly others.)
Cut property
Last edited: 2023-11-11
Cut property
Let $G = (V,E)$ be an undirected graph with an edge weight $w: E \rightarrow \mathbb{R}$. Suppose $X \subset E$ is part of an MST $T$. Let $S \subset V$ be a cut where no edge of $X$ is in the cut edges $cut(S, \overline{S})$. Then for $e^{\ast} \in cut(S, \overline{S})$ of minimum weight - $X \cup \{e^{\ast}\}$ is part of an MST $T^{\ast}$.
Data - Object Anti-Symmetry
Last edited: 2026-01-28
Clean Code
Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.
Data structure
Last edited: 2023-11-11
A data structure is simply a container of information. All its variables are public and can be depended on by other actors in the code base. It could contain some functions that present the data in a nicer format but it can’t rely on private data holding internal state (like a boolean has_printed).
In python data structures are really easy to define.
```
from dataclasses import dataclass

@dataclass
class point:
	x : float
	y : float
```
Normally, a single data structure will have lots of implementations as it is just a way to group data together in a meaningful way.
Declarative Language
Last edited: 2023-11-11
This is a language that specifies the desired end state. So it is the picture on the front of the Ikea manual rather than the instruction manual!
Examples
- Markdown
- Markup Language
Thoughts A good User Interface (UI) should be declarative with the system behind the scenes doing the leg work to get to that state. This paradigm is being applied more and more within DevOps for example Kubernetes and Terraform both you specify the end state and the program sorts altering the current state to match this.
Dependency Inversion Principle (DIP)
Last edited: 2023-11-11
The Dependency Inversion Principle (DIP) is one of the five principles of the SOLID acronym in object-oriented programming and design. It states that:
1. High-level modules should not depend on low-level modules. Both should depend on abstractions.
2. Abstractions should not depend on details. Details should depend on abstractions.
In simpler terms, the Dependency Inversion Principle promotes the idea that software components should depend on abstractions (interfaces or abstract classes) rather than concrete implementations. This principle aims to reduce the coupling between modules, making the system more flexible, maintainable, and easier to change or extend.
Depth-first search (DFS)
Last edited: 2023-11-11
This is a way of traversing a graph (could be a directed graph or undirected graph ). It explores all the way down one branch of the graph before tracing and exploring other branches.
The generally runs in $O(\vert V \vert + \vert E \vert)$ time.
# Examples
- DFS to find connected components in an undirected graph
- DFS to find path in a directed graph
- DFS to find path in an undirected graph
Design Patterns
Last edited: 2023-11-11
- Special case pattern

DFS for finding strongly connected components

Last edited: 2026-02-05

SCC(G):
	Input: directed graph G = (V,E) in adjacency list
	Output: labeling of V by strongly connected component
		1. Construct G^R
		2. Run DFS on G^R
		3. Order V by decreasing post order number.
		4. Run directed DFS on G using the vertex ordering from before and
		   label the connected components we reach.

Additional property

In addition to labelling the connected components the order we have labelled them is in reverse topological ordering. This is because we always start at the next sink vertex after we have labelled a component.

DFS to find connected components in an undirected graph

Last edited: 2026-02-05

To solve the following problem: Statement

We can use the following DFS algorithm.

DFS(G)
  input: G = (V,E) in adjacency list representation
  output: Vertices labelled by connected components
    connected_component = 0
    for all v in V, set visited(v) = False
    for all v in V
      if not visited(v) then
        connected_component++
        explore(v)
    return connected_component_number (defined in explore)

Explore(z)
  input: vertex z
  output: side effect, it defines the connected component_number on some vertices and marks them as visited.
    connected_component_number(z) = connected_component
    visited(z) = True
    for all (z, w) in E
      if not visited(w) then
        Explore(w)

As we run through every vertex once and then every edge twice we have $O(\vert V \vert + \vert E \vert)$.

DFS to find path in a directed graph

Last edited: 2023-11-11

# Finding paths in a directed graph via DFS

To do this we are going to use a DFS algorithm like [[]] but we are going to track pre/postorder numbers.

DFS(G)
  input: G = (V,E) in adjacency list representation
  output: Vertices labelled by connected components
    clock = 1
    for all v in V, set visited(v) = False
    for all v in V
      if not visited(v) then
        explore(v)
  return post (defined in Explore)

Explore(z)
  input: vertex z
    pre(z) = clock, clock ++
    visited(z) = True
    for all (z, w) in E
      if not visited(w)
        Explore(w)
	post(z) = clock, clock++

# Example

Suppose we have the following graph and let $B$ be the root node. Suppose we explore edges alphabetically and lets run the algorithm above on it.

DFS to find path in an undirected graph
Last edited: 2023-11-11
# DFS to find path in an undirected graph
For undirected graphs we just keep track of the previous vertex and find a spanning sub -forest for the graph . We can use this to find paths between vertices by going back to the root vertices of the trees .

Dijkstra's algorithm

Last edited: 2023-11-11

This is an algorithm to solve the shortest path problem in undirected graphs or directed graphs $G = (V,E)$. It is based on Breath-first search but uses a priority queue instead of just a queue . It requires positive edge lengths $w: E \rightarrow \mathbb{R}_{>0}$.

Dijkstra(G,w,s):
	Input: graph G=(V,E) with weights w(e) and a start vertex s.
	Output: For all vertices reachable from s a shortest path length dist
1. for all u in V
	1.1 dist(u) = inf
	1.2 prev(u) = nil
2. dist(s) = 0
3. H = makequeue(V)
4. while H is not empty:
	4.1 v = deletemin(H)
	4.2 for each {v,z} in E:
		4.2.1 if dist(z) > dist(v) + w(v,z):
			4.2.1.1 dist(z) = dist(v) + w(v,z)
			4.2.1.2 prev(z) = v
			4.2.1.3 decreasekey(H,z)
5. output dist

# Correctness

# Run time

To initialise $dist$ and $prev$ takes $O(\vert V \vert)$ time.

Directed acyclic graph (DAG)
Last edited: 2026-01-23
Directed acyclic graph (DAG)
A directed acyclic graph.
Dual linear programme
Last edited: 2026-02-05
Dual linear programme
Suppose we have a linear programme in standard form given by $A \in M_{m,n}(\mathbb{R})$, $b \in M_{m,1}(\mathbb{R})$ and $c \in M_{n,1}(\mathbb{R})$. Define the dual linear programme to be defined by $-A^T \in M_{n,m}$, $-c \in M_{n,1}(\mathbb{R})$ and $-b \in M_{m,1}(\mathbb{R})$. Where we want to solve for some $y \in M_{m,1}$ where we
Dynamic Programming
Last edited: 2023-11-11
This is an algorithmic technique to solve some recursion type problems in an iterative way. This can speed up these processes dramatically especially when recursion would compute a similar step multiple times.
The basic structure of this technique is to:
1. Break a problem down into smaller sub problems.
2. Use the solutions to previous sub problems to solve subsequent sub-problems.
3. Use the solution to all sub problems to solve the main question.
(If this feels a lot like Induction , that is because this is essentially a programming equivalent.)

Edmonds-Karp algorithm

Last edited: 2023-11-11

This is an algorithm to solve the max flow problem but also solves the min st-cut problem on a flow network $(G = (V,E), c, s, t)$. This is very similar to the Ford-Fulkerson Algorithm but with the use of BFS when picking the s-t path.

# Algorithm

edmonds_karp(G, c, s, t)
	Input: A flow network.
	Output: A max flow on the network.
1. Set f(e) = 0 for all e in E.
2. Build the residual network G^f for the current flow f.
3. Check for any s-t paths p in G^f using BFS.
	1. If no such path then output f.
4. Given p let c(p) = min_{e in p} c^f(e).
5. Augment f by c(p) units along p.
	1. f(e) = f(e) + c(p) for forward edges.
	2. f(e) = f(e) - c(p) for backward edges.
6. Repeat from step 2.

# Correctness

This proof is the same as Ford-Fulkerson Algorithm .

Error code
Last edited: 2023-11-11
An error code is a numerical or alphanumerical value that represents the status or outcome of an operation in a program. Unlike exceptions , which interrupt the flow of a program, error codes are returned as the result of a function or method call to indicate success, failure, or some specific condition. Programs that rely on error codes require explicit checks after each operation to determine whether it was successful and to decide how to proceed based on the returned error code.
Error Handling
Last edited: 2023-11-11
To build robust software you must handle things going wrong. In code this comes in the form of error handling. There are many ways of handling errors, some common methods include: exceptions , error codes , and return objects.
We must build robust code, however this should be done in a way that maintains the readability and reusability of the code. For which there are some helpful principles here.
Euclidean algorithm
Last edited: 2023-11-11
This algorithm is used to calculate the greatest common divisor for two positive integers. This uses Euclid’s rule and modular arithmetic .
# Algorithm
```
Euclid(x,y):
	Input: integers x,y where x >= y >= 0
	Output: gcd(x,y)
1. If y = 0 return x
2. return Eculid(y, x mod y)
```
# Run time
If $x$ and $y$ are $n$-bit integers then calculating $x$ mod $y$ takes $O(n^2)$ time. From Claim 1 we have every 2 rounds $x$ decreases by at least a factor of 2. So there are at most $2n$ rounds giving this algorithm a run time of $O(n^3)$.
Every min-cut is at full capacity in a max-flow
Last edited: 2023-11-11
# Statement
Lemma
Given a Flow network $(G, c, s, t)$ with a max flow $f$. Then for every min st-cut ($S$, $T$) all edges in $cut(S,T)$ have maximum capacity in $f$.
Exception
Last edited: 2023-11-11
An exception is a mechanism used in many programming languages to interrupt the normal flow of execution when an error occurs. In languages like Java and Python, an exception is represented as an object . When an error condition is encountered, an exception is “thrown,” interrupting the current process and propagating up through the program’s call stack to find an appropriate “catch” block to handle the error.
Expectation Maximisation
Last edited: 2024-03-09
Expectation maximising is an algorithm which can be used on problems such as clustering problem . It is used in situations where you assume your data follows some underlying distribution $\mathbb{D}$ which in this instance is given by parameters $\theta$. We assume $\theta \in H$ our hypothesis space .
Suppose we draw $m$ samples from $\mathbb{D}_{\theta}$ which has an observed component $X = \{x_1, \ldots, x_m\}$ and an unobserved component $Z = \{z_1, \ldots, z_m\}$. Therefore the samples of $\mathbb{D}_{\theta}$ are given by $Y = X \cup Z$.
Fast API
Last edited: 2023-11-11
This is a python library to implement an API . Whilst there are other API implementations in python such as Flask the unique selling point of Fast API is it’s speed.
Similarly to Flask this is implemented mainly by using decorators to wrap functions to turn them into end points.
Find connected components in an undirected graph
Last edited: 2026-02-05
# Statement
Find connected components
in an undirected graph Given a graph $G = (V,E)$ how can we find a mapping from its vertices $V$ to the connected components of $G$.
Find path in a directed graph
Last edited: 2025-12-05
# Statement
Find path
between vertices in a directed graph Given a directed graph $G = (V,E)$ and two vertices $a,b \in V$, how can we find a path in $G$ between them? Sometimes we will also have a weighting $W: E \rightarrow D$ and we want to minimise the path length.
Find path in undirected graph
Last edited: 2025-12-05
# Statement
Find path
between vertices in an undirected graph Given an undirected graph $G = (V,E)$ and two vertices $a,b \in V$. How can we find a path in $G$ between them. Sometimes we will also have a weighting $W: E \rightarrow D$ and we want to minimise the path length.
Find strongly connected components for a directed graph
Last edited: 2026-01-28
# Statement
Find strongly connected components
in a directed graph Given a directed graph $G = (V,E)$ how can we find a mapping from its vertices $V$ to the strongly connected components of $G$.
First-class object
Last edited: 2023-11-11
This is an entity that can be:
- Created at runtime,
- Assigned to a variable or element in a data structure,
- Passed as an argument to a function, and
- Returned as the result of a function.
In Python Index most data types are First-class objects such as Integers , Floats , Dictionaries , Lists and Functions .
Flow
Last edited: 2026-02-05
Flow
Given a flow network $(G, c, s, t)$ a flow is an allocation $f: E \rightarrow \mathbb{R}_{\geq0}$ such that the following holds:
Flow network
Last edited: 2026-02-05
Flow network
A flow network contains the following data:
Ford-Fulkerson Algorithm
Last edited: 2023-11-11
This is the naïve solution to the Max flow problem . It runs in pseudo-polynomial time depending on the size of the solution. A more developed algorithm that uses the same design is the Edmonds-Karp algorithm .
Their main difference is that Edmonds-Karp algorithm must use BFS whereas Ford-Fulkerson Algorithm can use DFS also. For a runtime bound we require Ford-Fulkerson Algorithm to use integer capacities.
Formatting conventions
Last edited: 2023-11-11
Here formatting means the spacing and organisation of files in your code base. This is about communication and leaving a well organised desk space for others to use after you.
The main metaphor used for formatting is a news paper. Nearly all code is read left to right and top to bottom. You documents, should start with a head line, then as you go down slowly increase in the level of detail. The code should be in nice columns and no story should be too long. It shouldn’t use too much indentation but should be nicely spaced to make it easy to read. You should use paragraphs to break up concepts.
Fourier Matrix
Last edited: 2023-11-11
The Fourier matrix is used in Fast Fourier Transform . Let $\omega$ be a $n$‘th root of unity then define the Fourier matrix to be
$$F_n(\omega) = \left [ \begin{array} \ 1 & 1 & 1 & \cdots & 1\\ 1 & \omega & \omega^2 & \cdots & \omega^{n-1}\\ 1 & \omega^2 & \omega^4 & \cdots & \omega^{2(n-1)}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & \omega^{n-1} & \omega^{n-2} & \cdots & \omega \end{array} \right ].$$
Function conventions
Last edited: 2023-11-11
Functions should be small encapsulated bits of code that does a single thing. It takes parameters which it does not alter and potentially returns some value or alters the state of the object they belong to.
# Function body
# Functions should do one thing
Function should do one thing, and one thing only. If you need to decide if it is doing more, write out in words what it is doing. If there are sections in your functions, then it for sure is doing too much.
Functions in Python
Last edited: 2023-11-11
Functions in Python Index are a class and have type ‘function’. i.e.
>>> type(my_function) <class ‘function’>
They have their own namespace that gets destroyed once the function call has ended. In Python Index functions are First-class objects , meaning they can be assigned to variables.
You can see this in the below example.
Genetic algorithm (meta)
Last edited: 2024-02-24
Suppose we have a optimisation problem which as two properties associated to is domain $A$
- There is a method to make a small change to any $a \in A$ called a mutation $M: A \rightarrow \mathcal{P}(A)$ where any of $m(a)$ are a small change to $a$.
- It has some crossover function $C: A \times A \rightarrow A^k$.
Then for a collection of points $P \subset A$ called a population the algorithm assess the fitness of each point $f(a)$ for $a \in P$.
Gradient decent
Last edited: 2026-02-05
This algorithm uses differentiation to minimise error terms for models. Suppose we have a model with some parameters $w \in \mathbb{R}^k$. Further suppose the error function $E(w)$ is differentiable with respect to $w$. Then we set some learning rate $\eta$ and we perturb the weights $w$ by $w \leftarrow w - \eta \frac{\partial E}{\partial w}$ until $w$ settles in some local minimum.

Graph representations

Last edited: 2023-11-11

How we represent a directed graph or undirected graph in an algorithm can effect the run time of an algorithm.

Bellow are some common ways to do this:

Data structure	Space complexity	Time to check connection	Time to find neighbours
Adjacency list	$O(\vert V \vert + \vert E \vert)$	$O(N_v)$	$O(N_v)$
Adjacency matrix	$O(\vert V \vert^2)$	$O(1)$	$O(\vert V \vert)$
Edge list	$O(\vert E \vert)$	$O(\vert E \vert)$	$O(\vert E \vert)$

Halting problem
Last edited: 2026-02-05
# Statement
Halting problem
Given a programme $P$ with an input $I$. Does $P(I)$ ever terminate?
Hash function
Last edited: 2025-12-05
Hash function
A hash function $h$ is a function that takes an input in a given domain and returns a fixed-size string of bytes $h: K \rightarrow B$, this output is normally called the hash value. It should ideally be:
Hash table
Last edited: 2026-02-05
Hash table
This consists of an associative array $A$ and a hash function $f: K \rightarrow A$. Values provided to the hash table get hashed using $f$ which provides a key (usually thought of as a number below a given load value) for $A$. Then the within this entry of $A$ you will store some object that can verify if the value is in this table. In an ideal world the hash function $f$ would map each key to a different key of $A$ however this is unlikely to happen in practice. So hash tables need to handle key collision.
Heap (OS)
Last edited: 2026-01-28
Heap
The heap of a process is dynamic memory which is allocated at run time. It will be used to store variables which may vary dramatically in size depending on what the application is run on - for example reading data into memory. This memory will stay allocated until it is explicitly deallocated. Therefore the heap can come with considerable overheads and require techniques like garbage collection or custom allocations to handle.

Hill climbing

Last edited: 2024-02-24

This is the most naïve random optimisation algorithm. Suppose we have a optimisation problem where $A$ has some sense of neighbourhood $N(a)$ for each point $a \in A$.

Start at some random point $a_s \in A$ then move to the point

$$a' = \mbox{arg}\max_{a' \in N(a_s)} f(a')$$

keep iterating until $a_s = a'$.

# Pseudocode

hill_climber(optimise, neighourhood):
	Input:
		optimise: The function you are looking to optimise
		neighbourhood: A function that returns the neighbourhood of a point A -> P(A).
	Output: Some a in A that is a local optimum for optimise.
1. Select random point current_best in A.
2. Let next_best be some different point in A.
3. While current_best != next_best:
	1. Let current_best = next_best
	2. next_best_score = optimise(current_best)
	3. for n in neighourhood(next_best):
		1. if optimise(n) > next_best_score
			1. next_best_score = optimise(n)
			2. next_best = n
4. Return current_best

# Run time

This strongly depends on the sense of neighbourhood. For example if the neighbourhoods were the entire domain this might be very slow.

Image Segmentation
Last edited: 2026-02-05
# Statement
Image Segmentation
Given an undirected graph $G = (V,E)$ with weights:
Image segmentation by max flow
Last edited: 2026-02-05
# Theory
# Reformulation
Right now we have two problems, it is a maximalisation problem and the weights might not be non-negative.
Let
$$L = \sum_{v \in V} f(v) + b(v).$$
Then we can reformulate
$$\begin{aligned}w(F,B) & = \sum_{v \in F} f(v) + \sum_{u \in B} b(u) - \sum_{(v,u) \in cut(F,B)} p(u,v)\\ & = L - \sum_{u \in B} f(u) - \sum_{v \in F} b(v) - \sum_{(v,u) \in cut(F,B)} p(u,v)\end{aligned}$$
Therefore to maximise $w(F,B)$ we could instead minimise
Imperative Programming
Last edited: 2023-11-11
In this paradigm the language specifies a series of instructions to potentially get to an end state (cough Holting Problem cough). Lots of peoples first programs are imperative.
```
letters = ["a","B","C","d"]
output = []

for letter in letters:
	if letter.lower() == letter:
		output.append(letter)

print(output) # ["a", "b"]
```
Here you provide the machine exact steps to what has to be done to produce the output.
There is a sub category of this called Procedural Programming that uses this with functions .
Independent set of a given size
Last edited: 2026-02-05
# Statement
Independent set of a given size
Given a undirected graph $G = (V,E)$ and positive integer $g > 0$. Does $G$ have an independent set of size $g$?
Independent set of a given size is in NP
Last edited: 2025-12-05
# Statement
Lemma
Independent set of a given size is in NP .
Infeasible linear programme
Last edited: 2026-02-05
Infeasible linear programme
A linear programme is infeasible if there are no $x \in \mathbb{R}^n$ that satisfy its constraints.
Integer linear programming problem
Last edited: 2026-02-05
# Statement
Linear programme problem
Given a linear programme what is an $x \in \mathbb{z}^n$ that achieves an optimal solution if it exists.
Interface
Last edited: 2023-11-11
An interface is a standard way for a someone else to interact with an object. There are lots of interfaces you will deal with such as API’s , library functions, or UI’s .
A well defined interface makes objects easy to use and help the process of Abstraction .
Iterative algorithms
Last edited: 2023-11-11
An iterative algorithm is a computational procedure that solves a problem by repeating a set of steps until a specific condition is met, without making recursive calls. In other words, it uses loops (like for, while, etc.) to repeatedly execute code blocks, updating the state variables as it progresses, until it reaches an exit condition that signifies the problem has been solved.
Here are some key characteristics of iterative algorithms:
Iterative Dichotomiser 3 (ID3)
Last edited: 2024-01-11
This algorithm works on classification problems . It is a greedy algorithm to design a maximal decision tree . Therefore are looking for the tree to represent some function $f: A \rightarrow B$, where we may only have access to some training data $D \subset f$.
This decision tree will be maximal with regards to the information entropy of each of the leaf nodes within that decision tree . For any $A' \subset A$ we can define $Entropy(A')$ by considering drawing a random element in $A'$ out with a probability of them being mapped to some element of $B$.
k-colourings problem (graphs)
Last edited: 2026-02-05
# Statement
$k$-colourings problem (graphs)
Given an undirected graph $G = (V,E)$ and integer $k > 0$. Does $G$ have a proper vertex colouring using $k$-colours, if it does output the colouring and if not say so.
k-means clustering
Last edited: 2024-03-09
Suppose we are in the clustering problem set up. Suppose we are provided $k$ the number of clusters. Further assume $A$ is a space with some method of averaging $Avg:2^A \rightarrow A$ which satisfies
$$ \sum_{x \in X} d(x, Avg(X)) = \min_{a \in A} \sum_{x \in X} d(x,a), \mbox{ for all } X \subset A. $$
In the Euclidian example we will have
$$ Avg(X) = \frac{1}{\vert X \vert} \sum_{x \in X} x. $$
To start the algorithm we pick $k$ random points in our training data $T$. We set these to be the current centres $center^0_i$ for $1 \leq i \leq k$. For iteration $t \in \mathbb{N}$ assume we have centres $center_i^{t-1} \in A$ (they don’t need to be points in $T$). Now we group the training data
k-nearest neighbour
Last edited: 2024-01-22
This algorithm is an Instance-based learning model and uses closest neighbours to determine what values should be.
In the modelling framework , suppose we are given:
- training data $T \subset (A,B)$,
- a metric on $A$ called $d: A \times A \rightarrow \mathbb{R}$,
- $k \in \mathbb{Z}$ to be the number of points to consider, and
- an averaging mechanism $V: \mathcal{P}(B \times \mathbb{R}) \rightarrow B$ that takes a weighted set of values in $B$ and returned another value in $B$ - like the weighted geometric mean . Then to model our data when given a new input point $a \in A$ we find the $k$-nearest neighbours in our training data $T$ then take the average of their associated values in $B$.
This model is very simple - however a lot of the complexity comes in defining $d$ and $V$.
k-SAT is in NP
Last edited: 2026-01-28
# Statement
Lemma
k-SAT is in NP
k-SAT is NP-complete for k greater than or equal to 3
Last edited: 2026-01-28
# Statement
Lemma
k-SAT is NP-complete for $k \geq 3$
k-satisfiability problem (k-SAT problem)
Last edited: 2026-02-05
# Statement
k-SAT Problem
Given a boolean function $f$ in CNF with $n$ variables and $m$ clauses of size at most $k$. Is there a true/false assignment to the $n$ variables that satisfies $f$. If yes then output it, otherwise say no.
Knapsack Problem
Last edited: 2025-12-05
# Statement
The knapsack problem
Given $n$ objects $o_i$ with weight $w_i$ and value $v_i$ how do we maximise the value of a bag that can carry weight $B$. The solution to this is a subset of objects $S \subset \{1, 2, \ldots, n\}$ such that:
Knapsack problem (without repetition)
Last edited: 2026-01-28
# Statement
The knapsack problem
Given $n$ objects $o_i$ with weight $w_i$ and value $v_i$ how do we maximise the value of a bag that can carry weight $B$. The solution to this is a subset of objects $S \subset \{1, 2, \ldots, n\}$ such that:
Knapsack-search (without replacement)
Last edited: 2026-01-28
# Statement
Knapsack-search (without replacement)
Given $n$ objects $o_i$ with weight $w_i$ and value $v_i$, and a goal $g$ is there a bag that has weight less than $B$ but value at least $g$. The solution to this is a subset of objects $S \subset \{1, 2, \ldots, n\}$ such that:
Knapsack-search is NP-complete
Last edited: 2025-12-05
# Statement
Lemma
Knapsack-search is NP-complete .
Kruskal's algorithm
Last edited: 2023-11-11
This is an algorithm to solve the MST problem. It uses the union-find data structure to help it identify cycles.
# Pseudocode
```
Kruskal's(G,w):
	Input: undirected graph G=(V,E) with weights w(e).
	Output: MST edges X
1. Sort E by weights, smallest to largest
2. Set X to be empty
3. For e = (v,w) in E (ordered)
	1. If X U e does't have a cycle then X = X U e.
4. Output X
```
# Run time
For step 1, we use merge sort so this takes $O(\vert E \vert \log(\vert E \vert))$ time. Though as $\vert E \vert \leq \vert V \vert^2$, we can think of this as $O(\vert E \vert \log(\vert V \vert))$.
Lambda functions
Last edited: 2023-11-11
These are Anonymous Functions that take multiple inputs and evaluate the output in a single statement. It has the following syntax:
```
lambda arguments : expression
```
Syntactic Sugar
A lambda expression creates a function object just like the def statement.
Linear programme
Last edited: 2026-02-05
Linear programme
A linear programme has $n$ variables $x_j$ for $1 \leq j \leq n$ where we are trying to either
Linear programme standard form
Last edited: 2026-02-05
Linear programme standard form
The standard form for a linear programme with $n$ variables and $m$ constraints is given by matrices $x, c \in M_{n,1}(\mathbb{R})$, $A \in M_{m,n}(\mathbb{R})$, and $b \in M_{m,1}(\mathbb{R})$ were we want to
Linear programming problem
Last edited: 2026-02-05
# Statement
Linear programme problem
Given a linear programme what is an $x \in \mathbb{R}^n$ that achieves an optimal solution if it exists.
Linked lists
Last edited: 2026-02-05
Linked list
A linked list stores a collect of elements of the same data type. The data structure consists of a set of sequential nodes. Each node has the following information
Logging in python
Last edited: 2023-11-11
# logging python
The logging module in python is used to do exactly what you would expect, it logs messages.
Some cool features it has is:
- different levels of messages for filtering of messages,
- different logger names for different parts of the application,
- different handlers to post messages such as streamed, files, http and rotating, and
- custom formatters for different handlers.
This library was converted from Java which is why it does not conform to the python naming conventions.
Machine Learning by Tom Mitchell
Last edited: 2024-01-10
This is a book by Tom Mitchell about Machine Learning that was the course text book for CS7641 Machine Learning . You can find it the repo below or for free on his website .
Many-one reduction (problem)
Last edited: 2026-02-05
Many-one reduction
There is a many-one reduction of problem $A$ to $B$ ($A \rightarrow B$ or $A \leq B$) if a polynomial time algorithm to solve $B$ would also solve $A$ in polynomial time .
Markdown
Last edited: 2023-11-11
This is a Markup Language that is widely used, for example it is used in GitHub README pages and Obsidian notes. It has a fairly basic syntax, that I am going to have great fun describing in a Markdown file.
Syntax I am sure there is more, but this is what I have found useful so far!
Markup Language
Last edited: 2023-11-11
A Markup Language is text-encoding system that tells the computer how to display some text. There are 3 main categories
- Presentational markup - ironically the least well presented of all of them. Essentially binary code that is interpreted for programs such as Word.
- Procedural markup - This uses tags within the language to define how it should be rendered. Such as Markdown and Tex .
- Descriptive markup - (Sometimes called semantic) This defines how to label sections of the text. Such as Latex and HTML .
Difference from a Programming Language
Max clique problem (graph)
Last edited: 2026-02-05
# Statement
Max clique size problem
Given an undirected graph $G = (V,E)$ what is the largest $S \subset V$ such that the induced graph on $S$ forms a clique .
Max flow problem
Last edited: 2025-12-05
# Statement
Max flow problem
Given a flow network $(G, c, s, t)$, what is the maximum value a flow on this network can have?
Max independent set problem (graph)
Last edited: 2026-02-05
# Statement
Max independent set problem
Given a undirected graph $G = (V,E)$ what is the size of the largest independent set in $G$?
Max independent set problem is NP-hard
Last edited: 2025-12-05
# Statement
Lemma
Max independent set problem is NP-hard .
Max-flow min-cut Theorem
Last edited: 2023-11-11
# Statement
Theorem
For a flow network $(G, c, s, t)$ the size of the max flow is the same as the min st-cut .
Max-k-exact-satisfiability problem
Last edited: 2026-02-05
# Statement
Max-$k$-exact-satisfiability problem
Provided with a boolean function $f$ in CNF with $n$ variables and $m$ clauses, where each variable only appears in a single literal in each clause and each clause has exactly $k$ litterals. What is an assignment maximising the number of satisfied clauses?

Max-SAT random approximation algorithm

Last edited: 2023-11-12

This is a naïve approximate solution to the Max-SAT problem. If the formula has $n$ variables and $m$ clauses this guarantees $m/2$ clauses will be satisfied by the assignment.

# Pseudocode

Name(f):
	Input: Boolean formula in CNF form f, with variables x_i and clauses c_j.
	Output: An assignment to x_i that satisified m/2 clauses.
1. Let C be the set of clauses and build assignment a : x_i |-> T/F.
2. For each x_i:
	1. Let A be all the clauses in C containing the x_i literal.
	2. Let B be all the clauses in C containing the not x_i literal.
	3. If A is bigger than or equal to B, set a(x_i) = T and remove A from C.
	4. Else set a(x_i) = F and remove B from C.
3. Return assignment a.

# Run time

We loop $n$ times having to look through a list of size at most $m$. Therefore this takes $O(nm)$.

Max-Satisfiability Problem
Last edited: 2026-02-05
# Statement
Max-Satisfiability Problem (Max-SAT problem)
Provided with a boolean function $f$ in CNF with $n$ variables and $m$ clauses, where each variable only appears in a single literal in each clause. What is an assignment maximising the number of satisfied clauses?
MIMIC (meta)
Last edited: 2024-02-24
Suppose we have some Optimisation problem with input space $A$ and fitness function $f: A \rightarrow \mathbb{R}$. Then for any $\theta \in \mathbb{R}$ define
$$ P^{\theta}(a) = \begin{cases} \frac{1}{Z_{\theta}} & \mbox{if } f(a) \geq \theta\\ 0 & \mbox{otherwise.} \end{cases}$$
(Where $Z_{\theta}$ is some normalisation coefficient to make this a probability distribution .) Then let $\theta_{min} = \min_{a \in A} f(a)$ and $\theta_{max} = \max_{a \in A} f(a)$ then $P^{\theta_{min}}$ is the uniform distribution over $A$ whereas $P^{\theta_{max}}$ is the uniform distribution over the optimum.
MIMIC by dependency trees
Last edited: 2024-02-25
This follows on from MIMIC (meta) . Where we are going to use dependency trees to calculate the distribution. Here we are just implementing a version of the probability_constructor.
We assume the domain of our problem breaks up $A = \oplus_{i=1}^n A_i$ into features, we will also break up the random variable $X=\oplus_{i=1}^n X_i$ on this also. To generate the probability distribution we are going to build a dependency tree . To decide the form of this dependency tree - due to some maths - we will use provide pass or fail samples to calculate Mutual information for each pair of features.
Min st-cut problem
Last edited: 2026-02-05
# Statement
Min st-cut
Given a flow network $(G, c, s, t)$ what is the st-cut $(L, R)$ with minimum capacity.
Min-heap
Last edited: 2026-01-21
A min-heap is a complete binary tree T (i.e., all layers are full except possibly the bottom). Each node $x \in T$ in this tree is associated with a value in the heap $v_x$. This tree obeys the min-heap property:
- For all nodes $x \in T$ in the tree if they have a parent $p \in T$ then $v_x \geq v_p$.
This means the root node is the smallest value in the heap.
Minimum Spanning Tree problem (MST)
Last edited: 2026-02-05
# Statement
Minimum Spanning Tree problem
Given an undirected graph $G = (V,E)$ with weights $w: E \rightarrow \mathbb{R}$ can you find a spanning tree $T = (V, E')$ with minimum weight
Minimum vertex cover problem
Last edited: 2026-02-05
# Statement
Minimum vertex cover problem
Given an undirected graph $G = (V,E)$ provide a vertex cover of smallest size.
Modular exponent algorithm
Last edited: 2023-11-11
This algorithm has a polynomial run time in the size of the input to calculate powers in using modular arithmetic . This has not been sped up using Euler’s theorem (modular arithmetic) .
Algorithm
```
resursive_mod_exp(x,y,N):
	Input: n-bit integers x,y,N >= 0
	Ouput: x^y mod N
1. if y = 0, return 1
2. z = resursive_mod_exp(x, floor(y/2), N)
3. if y is even return z^2 mod N
4. if y is off return xz^2 mod N
```
Run time
Modular exponent problem
Last edited: 2026-02-05
# Statement
Modular exponent problem
Given 3 $n$-bit integers $x, y, N \geq 0$ what is $x^y$ mod $N$.
Modular inverse algorithm (extended Euclidean algorithm)
Last edited: 2023-11-11
This algorithm solves the Modular inverse problem and uses the Extended Euclidean algorithm to do it.
Algorithm
```
modular_inverse_algorithm(x, N):
	Input: 2 n-bit integers x, N.
	Output: Either x^{-1} mod N or it says it does not exists.
1. Run extended_euclidean_algorithm(x, N) = (d, a, b)
2. If d != 1 return that no inverse exists.
3. Return a
```
Runtime
As the Extended Euclidean algorithm takes $O(n^3)$ so does this algorithm.
Modular inverse problem
Last edited: 2026-02-05
# Statement
Modular inverse problem
Given 2 $n$-bit integers $x, N \geq 0$ what is $x^{-1}$ mod $N$.
Mutability
Last edited: 2023-11-11
An object is considered mutable if if can be changed after it is created, whereas it is immutable if it can not be changed. If you change a mutable object all references to that object will also be changed.
In Python Index , lists, sets and dictionaries are mutable whereas numbers, strings, tuples and frozen sets are immutable.
This is commonly used in interview questions, for example:
Mutability in Python
Last edited: 2025-12-05
In Python Index all variables are references to objects. Every object has a mutability property. Objects that are mutable in Python behave as you would expect: when a variable pointing to a mutable object is altered, the object itself is altered.
```
x = [1,2]
y = x
x.append(3)
print(y) # [1,2,3]
```
Here x and y point to the mutable list object. When x is altered to add 3 to the list, y also reflects this change because both variables point to the same object.
Naive Bayes classifier
Last edited: 2024-02-24
Suppose we have a classification problem and we are in the modelling framework . We are given features $A = \oplus_{i=1}^n A_i$ where each feature $A_i$ has finite values (also split up the random variable $X$ into $X = \oplus_{i=1}^n X_i$ for our input). Let our classification also have finite domain $B$.
We can use our testing data to estimate $\mathbb{P}[Y=b]$ for each $b \in B$ and $\mathbb{P}[X_i = a_i \vert Y = b]$ for each $a_i \in A_i$ and $b \in B$.
Namespaces
Last edited: 2023-11-11
A namespace in computer science is essentially a container where identifiers (names of types, variables, functions, etc.) live. It allows different developers, libraries, or parts of a program to use the same identifier names without causing conflicts, because the same identifier can be bound to different entities in different namespaces.
In other words, namespaces provide a way to disambiguate identifiers that might be common and likely to be used more than once. This way, even if the same identifier is used in different parts of a program, as long as they are in different namespaces, they are considered to be different entities.
Naming conventions
Last edited: 2023-11-11
When coding we repeatedly have to name things. So having some go to standards for how to do this makes writing code quickly but it easier to read. The key thing is for the name to be fully expansive about what the variable is doing, if you can’t do this maybe this variable is doing too much.
Clean Code
One difference between a smart programmer and a professional programmer is that the professional understands that clarity is king. Professionals use their powers for good and write code that others can understand.
Neighbourhood (graph)
Last edited: 2026-02-05
Neighbourhood
For a undirected graph $G = (V,E)$ the neighbourhood of $X \subset V$ is $N_G(X) = \{u \in V \vert (x,u) \in E, x \in X\}$ ($\backslash X$, $\cup X$). Sometimes this will be defined to include $X$ or exclude $X$ - this is called open and closed neighbourhoods.
Nondeterministic Polynomial time (NP)
Last edited: 2026-02-05
This definition can mean multiple things.
Nondeterministic Polynomial time (search)
The class of NP problems is the class of all search problems .
NP-Complete
Last edited: 2026-02-05
NP-Complete
A problem $f$ is NP-complete if
NP-hard
Last edited: 2026-02-05
NP-hard
A problem $f$ is NP-hard if for all search problems $g$ there is a many-one reduction from $g$ to $f$. In other terms, $f$ is as hard as any search problem .
Object
Last edited: 2023-11-11
An object keeps data behind an interface. It makes its internal variables private. Then exposes setter functions that allow people to edit in there preferred way. It also has getter functions to provide the methods to get at the data how they would like to access it.
The following would be a data structure .
```
class point:
	x : float
	y : float

	def init(self, x: float, y: float):
		self.x = x
		self.y = y
```
Whereas the following would be an object.
Ones complement
Last edited: 2026-02-05
Ones complement
The ones complement of a binary number is obtained by flipping 1’s and 0’s. Therefore if you were to add a number with its one complement you would end up with a number of all 1’s.
Open Closed Principle (OCP)
Last edited: 2023-11-11
# The Open Closed Principle (OCP)
This is one of the SOLID principles and states:
Bertrand Meyer
- software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification
P equals NP or P not equals NP
Last edited: 2023-11-11
This is the open problem whether the class of Polynomial time problems is equal to the class of Nondeterministic Polynomial time (NP) problems.
Page rank
Last edited: 2026-02-05
Page rank
The page rank is a probability distribution over the vertices of the Webgraph $G = (V,E)$. This is a function $\pi : V \rightarrow [0,1]$ that assigns a ranking to each page. It is defined recursively as
Page rank algorithm
Last edited: 2023-12-03
This algorithm is used to calculate the “importance” of a page on the internet, or the page rank . This a Probability distribution on the Webgraph , $\pi$.
To approximate this we use a Markov chain and approximate its stationary distribution .
# Pseudocode
Build a Markov chain $P$ on the vertices of the Webgraph $G = (V, E)$. This will have
Palindrome
Last edited: 2026-01-28
Palindrome
A sequence $a_0, a_1, \ldots, a_n$ is a palindrome if $a_i = a_{n-i}$ for all $0 \leq i \leq n$.
Parallelisation
Last edited: 2026-02-05
Parallelisation
This is when tasks are scheduled on different CPU and so happen in parallel to one another. This usually uses multi-threading or multi-processing .
Passing variables to a function
Last edited: 2023-11-11
When you pass a variable to a function , what does the function actually receive?
First lets name things to make this a little simpler. The variable we pass to the function will be called the argument whereas the variable used by the function will be called the parameter.
```
def function(parameter):
	# The variable used by the function is called the parameter
	...

argument = ...

 The variable we pass to the function is called the argument

function(argument)
```
There are two main choices of what can happen to the argument when it gets passed into the parameter, it can be:
Perceptron rule
Last edited: 2024-01-18
This is a way to train a perceptron using the binary step activation function. To classify training data $T$ where $A = \mathbb{R}^n$ and $B = \{0,1\}$. We assume the perceptron we are training is a function $p: \mathbb{R}^n \rightarrow \{0,1\}$ defined by weights $w_i, \theta \in \mathbb{R}$. For this we require a sufficiently small learning rate $\eta \in \mathbb{R}_{>0}$. The idea is to reply the data by nudging the weights ($\Delta w_i$) of the perceptron slowly towards an even split.
Pipe
Last edited: 2023-11-11
This is where you do multiple manipulations to a data object in ‘one line’ but break it up over multiple. This is commonly used in Python Index with the Pandas library.
```
data_frame.group_by(["animals"])
	.mean()
	.reset_index()
	.rename({"height": "mean_height", "weight": "mean_weight"})
```
This operations are naturally grouped together and effect the same object. This is not to be confused with a train wreck .
Policy Iteration (MDP)
Last edited: 2024-04-06
This is a method of solving Markov decision processes . It uses discounted rewards to get a local as well as global optimum solution.
It uses the same set up as Value iteration (MDP) but instead we use the policy to decide the utility and iterate that.
# Pseudocode
Instead of looking at the utility of a state we could instead look at the policy and use the utility to guide us.
Polymorphism
Last edited: 2023-11-11
This is the ability of a single interface or base class to represent multiple concrete types or classes. This allows objects of different classes to be treated as objects of a common superclass or interface.
Polymorphism enables developers to write code that is more flexible, extensible, and easier to maintain because it helps decouple the implementation details from the code that uses those implementations. By interacting with the abstract interface, code can work with different concrete implementations without needing to know their specific details.
Polynomial time
Last edited: 2026-01-28
Polynomial time
An algorithm runs in polynomial time if its running time is polynomial in the length of the input. A problem can be solved in polynomial time if it is a search problem and there exists a polynomial time algorithm for it.
Polynomial time is a subset of NP-complete
Last edited: 2023-11-11
# Statement
Lemma
The class of Polynomial time problems is a subset of Nondeterministic Polynomial time (NP) .
Pre-commit hooks
Last edited: 2023-11-11
Pre-commit hooks are a great way to run checks against your code before it is committed to the repository. This saves time and money by not having to run checks in the CI/CD pipeline, and also puts control of your commits back in your hands.
# Setting up pre-commit hooks
You can install the pre-commit package using pip.

Prim's algorithm

Last edited: 2023-11-11

This solves the MST problem using a method similar to Dijkstra’s algorithm . It will use a Priority queue to do this.

Algorithm

Prim's(G,w):
	Input: undirected graph G=(V,E) with weights w(e).
	Output: MST defined by the array prev
for all u in V
	cost(u) = inf
	prev(u) = nil
Pick any initial node u_0
cost(u_0) = 0

H = makequeue(V)
while H is not empty:
	v = deletemin(H)
	for each {v,z} in E:
		if cost(z) > w(v,z):
			cost(z) = w(v,z)
			prev(z) = v
			decreasekey(H,z)
output prev

Correctness

Principle component analysis
Last edited: 2026-02-05
Principle component analysis is a linear dimension reduction algorithm. In concept principle component analysis find the axis along which the data has maximal variance if it were projected. It does this by finding the Eigenvector and Eigenvalues of the Covariance matrix . It uses these eigenvectors as a new basis of the data’s feature space.
Procedural Programming
Last edited: 2023-11-11
Programming paradigms
Last edited: 2023-11-11
When designing a programming language, there are different aspects you can choose to focus on. This leads to different programming paradigms that are useful to know about, and understand the pay offs of each of them.
- Object Oriented Programming (OOP)
- Functional Programming
- Logical Programming
- Imperative programming
Pseudo-polynomial time
Last edited: 2026-01-28
Pseudo-polynomial time
An algorithm runs in pseudo-polynomial time if its running time is polynomial in the numeric value of the input but not necessarily in the length of the input.
Python Built-in Functions
Last edited: 2023-11-11
# Python Builtin Functions
Python Index has some built in functions that make using the language slightly easier. You can find a full list of these in the python documentation .
# Functions
# callable(object)
This returns True or False depending on if the object is callable. For example
Pythonic
Last edited: 2023-11-11
Pythonic means code that doesn’t just get the syntax right, but that follows the conventions of the Python community and uses the language in the way it is intended to be used. ¹
For example:
```
for index in range(len(some_list)):
	some_list[index].do_something()
```
would not be considered pythonic however
```
for item in some_list:
	item.do_something()
```
would be as it uses the built in iteration of the python language.
Q-learning
Last edited: 2024-04-06
Q-learning is a reinforcement learning class of algorithms which are value function based. It uses the approach of Incremental learning of the Q-function (RL) . We use the model of transitions where the learning can provide the action each iteration.
Pick an initial estimation $\hat{Q}(s,a)$ of the Q-function (RL) .
We need to pick a learning rate $\alpha_t$ such that
Quick sort
Last edited: 2023-11-11
This is a recursive algorithm used to solve the sorting problem that uses pivots $p$ to sort the list. A very brief guide to how this works is as follow
1. Choose a pivot $p$.
2. Partition $A$ into $A_{p}$ .
3. Recursively sort $A_{p}$.
The issue is how to choose a good piviot $p$ - ideally you would want to pick the median element.
Random Access Memory (RAM)
Last edited: 2026-02-05
Random Access Memory (RAM)
Random access memory is used to temporarily store data relevant to an executing program. Unlike disk storage RAM is temporary and will lose its state once the machine is turned off. However, RAM is fast access which makes it more usable to programs when they need to access it frequently. Processes access RAM through virtual memory which is mapped to physical memory through page tables .
Random component analysis
Last edited: 2024-03-10
Random component analysis is a form of linear dimension reduction . It picks a random linear map $p: \mathbb{F}^n \rightarrow \mathbb{F}^m$ and applies that.
This works mainly due to The curse of dimensionality .
Recursion
Last edited: 2023-11-13
This is the technique of a function calling itself recursively to find an answer and returning at some given condition. This is in opposition to Iterative algorithms that achieve what they want in a for or while loop.
The important thing to remember is to set up a return condition that will always be met otherwise, this could go on forever (see Halting problem ).
Refactored
Last edited: 2023-11-11
The process of refactoring is to review your code and make it more readable by breaking it down into smaller components.
Reference counting in Python
Last edited: 2023-11-11
In Python Index when objects are created, it also stores the number of references there are to that object. This is what we mean when we say the reference count of an object.
You can access this reference count by using the sys library with the function sys.getrefcount.
```
import sys

x = [1,2]
print(sys.getrefcount(x)) # 2
y = x
print(sys.getrefcount(x)) # 3
print(sys.getrefcount(y)) # 3
del y
print(sys.getrefcount(x)) # 2
```
(Note the numbers are 1 more than you would expect, this is due to the object [0,1] being passed into the function sys.getrefcount which then has a reference to that object also.)
REST API
Last edited: 2023-11-11
This is a specific type of API that conforms to a certain specification.
Restart hill climbing
Last edited: 2024-02-24
# Random restart hill climbing
Suppose we have a optimisation problem where $A$ has some sense of neighbourhood $N(a)$ for each point $a \in A$.
Then we want to run Hill climbing but try to get around the issue with only finding local optimum by at the end of a run restarting in a new place.
Rivest-Shamir-Adleman algorithm (RSA algorithm)
Last edited: 2023-11-11
To set up the RSA algorithm you do the following:
1. You pick 2 n-bit random primes $p$ and $q$.
2. You choose $e$ relatively prime to $(p-1)(q-1)$.
3. You let $N = pq$ and you publish the public key $(N,e)$.
4. You compute your private key $d = e^{-1}$ (mod $(p-1)(q-1)$).
Then to encrypt a message $m \in \mathbb{Z}$ someone else does the following:
1. Take a public key $(N,e)$.
2. You compute $y = m^e$ (mod $N$).
3. Then you send message $y$.
Lastly to decipher the message using the private key you do the following.
Rudrata cycle problem
Last edited: 2026-02-05
# Statement
Rudrata cycle problem
Given a graph $G = (V,E)$, what is a Rudrata cycle if it exists?
Rudrata path problem
Last edited: 2026-02-05
# Statement
Rudrata path problem
Given a graph $G = (V,E)$, what is a Rudrata path if it exists?
Run time complexity
Last edited: 2023-11-11
There are 3 main ways people use to analyse the run time complexity of algorithms.
# Big-O notation
Big-O notation is the main way people determine the run time of algorithms. (Rightly or wrongly.) It is the worst case analysis and has a formal mathematical definition
$$ f = O(g) \mbox{ if } \exists \ x_0 \in \mathbb{N}, m \in \mathbb{R}_{>0} \mbox{ such that } f(x) \leq m \ast g(x)\ \forall x > x_0.$$
This is just saying that $f$ eventually is at most a scalar multiple of $g$.
Satisfiability problem (SAT problem)
Last edited: 2026-02-05
# Statement
Sat Problem
Given a boolean function $f$ in CNF with $n$ variables and $m$ clauses. Is there a true/false assignment to the $n$ variables that satisfies $f$. If yes then output it, otherwise say no.
Side effect
Last edited: 2023-11-11
In programming, a side effect of a subroutine is the effect it has on things outside its domain. For example updating a database, changing a global variable, or changing an input mutable variable. It is good practice to avoid doing this at all costs. It makes your code harder to understand and debugging even worse!
Signed or unsigned integers
Last edited: 2023-11-11
# Signed integers
When storing integers in a computer, you can do it with a ‘sign’ or without it. This essentially just means you designate one bit at the start of the representation of that integer which indicates it is either positive or negative number.
Simplex method (linear programme)
Last edited: 2023-11-11
Simulated Annealing
Last edited: 2024-02-24
Suppose we have a optimisation problem where $A$ has some sense of neighbourhood $N(a)$ for each point $a \in A$.
Then we want to emulate Hill climbing but probabilistically allow for random walking. We do this by randomly deciding to take moves from $x$ to $x_t$ based on the temperature of the algorithm at that time $T$ this is given by
$$P(x,x_t,T) = \begin{cases} 1 & \mbox{if } f(x_t) \geq f(x)\\ \exp{\frac{f(x_t) - f(x)}{T}} & \mbox{otherwise.} \end{cases}$$
Then we just modify what is done in Hill climbing with this and gradually decrease the temperature and we have simulated annealing.
Single linkage clustering
Last edited: 2024-03-09
Suppose we are in the clustering problem set up. Here we are given the number of desired clusters $k$, and the sense of distance $d$.
The idea will be to make every element in our training data $T$ its own cluster. Then we take the union of two clusters which are considered the closest. For this linkage clustering we use the minimum distance of points in the cluster. i.e.
Single Responsibility Principle (SRP)
Last edited: 2023-11-11
This is one of the SOLID principles and it states:
Robert C. Martin
“A module should be responsible to one, and only one, actor.”
Singleton
Last edited: 2023-11-11
The Singleton pattern is a creational design pattern that is used to ensure that only one instance of a class is created throughout the entire lifetime of an application. This pattern is particularly useful in situations where multiple objects of a class could cause issues with data consistency, resource allocation, or system performance.
To implement the Singleton pattern, a class must have a private constructor so that no other objects can be instantiated, and a static method that allows access to a single instance of the class. This static method will typically create an instance of the class if one does not exist, and return the existing instance if it already exists.
Soft clustering
Last edited: 2024-03-09
Suppose we are in the clustering problem set up. Soft clustering will use Expectation Maximisation with the Gaussian distribution . Here we fix a $\sigma^2$ and then we have
$$ \mathbb{P}[X=x_i \vert \mu = \mu_j] = \exp\left[- \frac{1}{2} \sigma^2 (x_i - \mu_j)^2 \right]. $$
We will use a mean to optimise $\mu_j$ at each step.
# Correctness
- It will have all the downsides of Expectation Maximisation but we can use restarts to over come them.
SOLID principles
Last edited: 2023-11-11
The SOLID principles are a set of five design guidelines for writing maintainable, scalable, and robust software in object-oriented programming. These principles were introduced by Robert C. Martin and are an acronym formed from the first letters of each principle:
1. Single Responsibility Principle (SRP) : A class or module should have only one reason to change, meaning it should have a single responsibility or job. By adhering to this principle, developers can create more modular and focused code, making it easier to understand, maintain, and modify.
2. Open/Closed Principle (OCP) : Software entities (classes, modules, functions, etc.) should be open for extension but closed for modification. This principle encourages creating code that can be easily extended with new functionality without altering the existing code, which reduces the risk of introducing new bugs and improves maintainability.
3. Liskov Substitution Principle (LSP) : Subtypes must be substitute for their base types, meaning objects of a derived class should be able to replace objects of the base class without affecting the correctness of the program. By following this principle, developers can create more reusable and flexible code that adheres to the proper inheritance hierarchy.
4. Interface Segregation Principle (ISP) : Clients should not be forced to depend on interfaces they do not use. This principle suggests that interfaces should be small, focused, and specific to a particular client’s needs, resulting in more modular and easier-to-understand code.
5. Dependency Inversion Principle (DIP) : High-level modules should not depend on low-level modules; both should depend on abstractions. Additionally, abstractions should not depend on details; details should depend on abstractions. This principle promotes the use of abstractions (interfaces or abstract classes) to reduce the coupling between software components, making the system more flexible, maintainable, and easier to change.
By adhering to the SOLID principles, developers can create more robust, maintainable, and scalable software systems in object-oriented programming.
Sorting problem
Last edited: 2026-02-05
Sorting problem
Given an unsorted list $A = [a_1, \ldots, a_n]$ of $n$ numbers can you return a sorted list for a given comparison.
Spanning Tree Protocol (STP)
Last edited: 2024-05-23
The spanning tree protocol was introduced to get rid of cycles in networks that used bridges and hubs . It does this by forming a spanning tree in the network so that the layer 2 devices use to send messages through.
This is a distributed algorithm and is computed locally on each device in the network for their neighbours. Each device will have a unique ID and the tree will be formed by picking the root of the tree to be the device with the lowest ID. Then each device working out its nearest neighbour to the root of the tree.
Special case pattern
Last edited: 2023-11-11
The Special Case Pattern is a design pattern used in software development to handle exceptional conditions without resorting to conditional logic scattered throughout the code. Essentially, it allows you to model what would be a “special case” as an object of the same type as the typical cases. By doing so, you can reduce the need for conditional logic to check for these cases, making the code more maintainable and easier to understand.
Special functions
Last edited: 2023-11-11
This are also known as dunder (double underscore) or magic functions. They are signified by two underscores before and after its name i.e. __len__.
These functions are not normally called by the user, the are for the python interpreter to use. For example if you call
```
item in some_iterable:
	pass
```
to get the list of items this calls some_iterable.__iter__(). The same for len(object) this calls object.__len__().
st-cut
Last edited: 2026-02-05
st-cut
Given a flow network $(G, c, s, t)$, an st-cut is a cut $(L, R)$ of $G$ where $s \in L$ and $t \in R$. The capacity of this cut is
Stack (OS)
Last edited: 2026-02-05
Stack (OS)
The stack of an application is a FIFO queue of stack frames—these contain a function’s parameters, local variables, and return address. These get added when a function is called and removed once a function completes. The stack acts as the control flow for a process , determining where to return to once a function has completed. The stack has a fixed size when a process starts, and if it goes beyond that size, it can cause a stack overflow.
Subset-sum problem
Last edited: 2026-02-05
# Statement
Subset-sum problem
Given positive integers $a_1, \ldots, a_n$ and $t$, is there a subset $S \subset \{1, \ldots, n\}$ where
Subset-sum problem is in NP
Last edited: 2025-12-05
# Statement
Lemma
Subset-sum problem is in NP .
Subset-sum problem is NP-complete
Last edited: 2026-01-28
# Statement
Lemma
Subset-sum problem is NP-Complete .
Substring
Last edited: 2026-01-28
Substring
Given a sequence $a_1, a_2, \ldots, a_n$ a substring is a sequence of consecutive terms of $a_i$. For example $a_{i+1}, a_{i+2}, \ldots, a_{i+k-1}, a_{i+k}$. The length of the substring is the number of terms it contains, in the example the length would be $k$.
Support vector machines (SVM)
Last edited: 2024-02-02
Support Vector Machines operate using the modelling framework to try to linearly separate data points. Suppose we have some training data $T \subset A \times B$. This utilises the kernel trick to change the topology of the feature space of our data whilst still keeping the computation relatively simple. Let $K: A \times A \rightarrow \mathbb{R}$ represent such a kernel. Then we solve the following optimisation problem
Test Driven Development (TDD)
Last edited: 2023-11-11
Testing conventions
Last edited: 2023-11-11
Tests are what makes your code safe to change. They are validate the assumptions that other bits of your code make. Without them you can not change your code without fearing the consequences. Though for this we pay a price with time, time to write them, time to run them and time to maintain them. This is a problem similar to the rest of your code but with different emphasis - tests do not need to be hyper efficient, however they will need to clearly state the assumption that has been broken.
The 5S Philosophy
Last edited: 2023-11-11
This was taken from the book Clean Code but was created else where:
Clean Code
, page 24 The 5S philosophy comprises these concepts:
The flow across an st-cut is equal to the value of the flow itself
Last edited: 2026-01-28
# Statement
Lemma
Let $(G, c, s, t)$ be a flow network with a flow $f$ and an st-cut $(S,T)$. Then the flow across $(S,T)$
The Law of Demeter
Last edited: 2023-11-11
This is a heuristic to determine what an object should and shouldn’t know about. It says:
A method f of an object C should only call the methods of the following:
- The class C,
- An object created in f, and
- An object held in an instance variable of C.
It should explicitly not call methods on objects that are returned by any of the allowed functions.
You know this has been violated this when you see Train Wrecks in the code.
Train Wrecks
Last edited: 2023-11-11
This is a line of code which repeatedly calls different functions on return variables. Like the following:
```
object.get_settings().get_graphics_config().get_makers_url().get_abs_path()
```
Note that they must return a different object, if they are doing permuting objects this a pipe and is more acceptable when doing data manipulation.
Trampolining
Last edited: 2023-11-11
When using recursion you generate a large number of namespaces for each function call you make. To get around this in the case of tail recursion instead of leaving the function open whilst you call itself again - you can instead return a function to be called with the parameters already inside it. The runner for a trampoline looks like the following.
Traveling salesman problem
Last edited: 2026-02-05
# Statement
Traveling salesman problem
Given a list of $n$ cities $C$ and a symmetric distance function $d: C \times C \rightarrow \mathbb{R}$. What is an ordering on $C$, $c_i \in C$ with $1 \leq i \leq n$ that visits every city and has minimum length $d(c_n, c_1) + \sum_{i=1}^{n-1} d(c_i, c_{i+1})$?
Travelling salesman problem (search)
Last edited: 2026-02-05
# Statement
Travelling salesman problem (search)
Given a list of $n$ cities $C$, a symmetric distance function $d: C \times C \rightarrow \mathbb{R}$ and a goal $g$. Is there an ordering on $C$, $c_i \in C$ with $1 \leq i \leq n$, that visits every city and has length $d(c_n, c_1) + \sum_{i=1}^{n-1} d(c_i, c_{i+1}) \leq g$ if it exists?
Trie
Last edited: 2026-02-05
Trie
A trie or prefix tree is used for associating words in a given alphabet $A$ to some values in $X$.
Two's complement
Last edited: 2025-12-05
This is a way to represent signed integers . Assuming the leftmost digit is the most significant bit then:
- Sign digit: Whether the number is positive or negative
- Numerical representation: The magnitude of the number
So if you store your signed integer as a byte , the leftmost bit would represent the sign and the other 7 bits would represent the magnitude.
Unbounded linear programme
Last edited: 2026-02-05
Unbounded linear programme
A Linear programme is unbounded if the value of the objective value can be arbitrarily large and so is not bounded above.
Using multiple git profiles
Last edited: 2023-11-11
# Issue
I am using a laptop where I have my personal repositories and some work ones. I would like to commit to these using different profiles and secure them using different SSH keys .
# Assumptions
I will assume you are:
Value iteration (MDP)
Last edited: 2024-04-06
This is a method of solving Markov decision processes . It uses discounted rewards to get a local as well as global optimum solution.
We ideally would like to work out what the best policy is
$$ \pi^{\ast} = \mbox{arg}\max_{\pi} \mathbb{E}\left [ \sum_{i=1}^{\infty} \gamma^i R(s_t) \ \Bigg \vert \ \pi \right ]. $$
To help us work out what $\pi^{\ast}$ should be lets define the utility of a state
Variables in python
Last edited: 2023-11-11
To understand how variables work in python it is good to keep in mind the separation between computer memory and references to places in computer memory . In Python Index , variables are more like tags or labels that are attached to objects, rather than containers that store data, as they are in many other programming languages.
When you assign a value to a variable, what Python Index actually does is create an object in computer memory representing that value, and then creates a name in the appropriate namespace that points to that object. This happens regardless of whether the value is immutable (like an integer or a string) or mutable (like a list or a dictionary).
Vertex cover of a given size
Last edited: 2026-02-05
# Statement
Vertex cover of a given size
Given an undirected graph $G = (V,E)$ and a positive integer $g > 0$, is there a vertex cover using at most $g$ vertices, if so what is it?
Vertex cover of a given size is NP
Last edited: 2026-01-28
# Statement
Lemma
Vertex cover of a given size is NP .
Vertex cover of a given size is NP-complete
Last edited: 2026-01-28
# Statement
Lemma
Vertex cover of a given size is NP-complete .
Webgraph
Last edited: 2026-02-05
Webgraph
We define a directed graph to represent a web network. Define its vertices $V$ to be the set of webpages and the directed edges $E$ to be the hyperlinks so $(x, y) \in E$ if webpage $x$ has a hyperlink to $y$.
When to Use Error Codes and Exceptions
Last edited: 2026-02-05
# HTTP Stack
In the HTTP , status codes serve as a standardized way of communicating the result of a client’s request to a server. These status codes are well-defined, universally understood, and cover a broad range of scenarios from success (200 OK) to client errors (404 Not Found) and server errors (500 Internal Server Error). Using status codes in HTTP is important for interoperability, as multiple clients and servers across different platforms and languages interact with each other.
Wrap 3rd party libraries
Last edited: 2023-11-11
It is generally good practice to wrap 3rd party libraries with your own API to be used within your code bases. However, this does come with some downsides. There is a balance between library providers and users and depending on how well they have managed this balance also effects the decision on whether to wrap the library or not.
# Advantages of Wrapping
1. API Control: Wrapping allows you to define your own API, giving you greater control over how you interact with the 3rd party library. This can simplify usage and make it easier to implement across your codebase.
2. Easy Transition: If the 3rd party library becomes obsolete, gets deprecated, or you find a better alternative, wrapping allows you to switch to a different library with minimal effort. You only have to update the wrapper instead of changing every instance where the library is used.
3. Exception Handling: Wrapping allows you to handle exceptions in a way that’s more aligned with your application’s needs. This can improve robustness and make it easier to debug issues.
4. Isolation: Wrapping isolates the 3rd party library, making it easier to mock for unit testing, monitor its behaviour, or switch out for a different library.
# Caveats and Considerations
1. Overhead: Wrapping adds another layer to your code, which can introduce bugs, increase complexity, and potentially reduce performance.
2. Maintenance: The wrapper needs to be updated whenever the 3rd party library is updated, especially if there are breaking changes. This can become a maintenance burden over time.
3. Learning Curve: New team members will need to understand both the wrapper and the underlying library, which can increase the time needed to become productive.
4. Feature Utilization: It’s easy to create a wrapper that only exposes the subset of the library’s features you initially need, making it harder to take advantage of the library’s full capabilities later on.
5. Premature Abstraction : For very small projects or prototypes, wrapping might be overkill and could unnecessarily complicate the codebase.
# Summary
While wrapping 3rd party libraries can provide benefits in terms of API control, transition ease, and tailored exception handling, it’s essential to weigh these advantages against the potential downsides of increased complexity, maintenance burden, and a steeper learning curve. The decision should be made based on the project’s size, long-term maintenance outlook, and the criticality of the 3rd party library to the application.

Programming

Pages in this section

# Statement

# Examples

# Example

# Statement

# Bitwise operations

# Statement

# __call__

# Statement

# Statement

# Statement

# Links

# Links

# Links

# Examples

# Finding paths in a directed graph via DFS

# Example

# DFS to find path in an undirected graph

# Correctness

# Run time

# Algorithm

# Correctness

# Algorithm

# Run time

# Statement

# Statement

# Statement

# Statement

# Statement

# Function body

# Functions should do one thing

# Statement

# Pseudocode

# Run time

# Statement

# Theory

# Reformulation

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# Pseudocode

# Run time

# Statement

# logging python

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# Pseudocode

# Run time

# Statement

# Statement

# Statement

# Statement

# Statement

# Statement

# The Open Closed Principle (OCP)

# Pseudocode

# Pseudocode

# Statement

# Setting up pre-commit hooks

# Python Builtin Functions

# Functions

# callable(object)

# Random restart hill climbing

# Statement

# Statement

# Big-O notation

# Statement

# call