Programming
Pages in this section
- 2-SAT algorithm using SCCLast edited: 2023-11-12
2SAT(f): Input: 2-SAT problem on n variables with m clauses. Output: Assignment from n variables satisfying f or stating no assignment is possible. 1. Find all the unit clauses u_i in f. 2. Create a dummy variable Y. Set f' = f 1. Replace the unit clauses in f' with u_i \lor Y. 2. Add new clauses to f' of u_i \lor \overline{Y}. 3. Construct graph G for f'. 4. Run the SCC algorithm on G. 5. If any variable and its complement are in the same SCC, return no assignment possible. 6. While there are still SCCs left: 1. Take sink SCC S. 2. Set the clauses in S to true (and their complements to false). 3. Remove S & \overline{S}. 7. Return the assignment, excluding the value for Y.Correctness
- 3-SAT is NP-completeLast edited: 2025-12-05
# Statement
Lemma3-SAT is NP-complete .
- Adjacency list format (graph)
Last edited: 2026-02-05Adjacency listFor a graph $G = (V,E)$ (that can be a directed graph or undirected graph ) the adjacency list format is a list of neighbourhoods $N_v = \{u \in V \vert (v,u) \in E\}$ for $v \in V$.
- Anonymous Functions
Last edited: 2023-11-11These are functions that are not named. They are usually used when you are using a function once and don’t want to clutter up the name space.
# Examples
In Python Index these are implemented in by Lambda functions but these are explicitly for functions that can be defined in the Functional Programming paradigm.
- Application Programming Interface (API)
Last edited: 2023-11-11This is a interface intended to be used by other applications (i.e. not other humans). Whist this is mainly thought of as a web API’s, there are others such as programming languages (think machine code), software libraries or operating systems .
- Array (data structure)
Last edited: 2026-01-28ArrayThis data structure consists of a collection of elements of the same memory size, where each one is identified by an index. This comes with a function to map indices to memory location of each element. (This is normally integer index with a contiguous memory block.) These can be static where you fix the size of the array beforehand or dynamic where you allow the array size to change.
- Associative array
Last edited: 2026-02-05Associative arrayThis is an abstract data structure that hold a collection of key - value pairs. The keys must be unique. The data structure needs to support the following operations:
- Asynchronous programming
Last edited: 2026-02-05Asynchronous programmingAsynchronous programming is a programming paradigm that allows a program to handle operations that may take an indeterminate amount of time, such as I/O -bound tasks, without blocking the execution of the entire program. Instead of waiting for an operation to complete before moving on to the next one, asynchronous programming enables a program to initiate an operation and then continue executing other tasks while waiting for the operation to finish. Once the operation completes, the program is notified, and the relevant code can be executed.
- Backward propagation of errors (Back propagation)
Last edited: 2026-02-05This is the specific implementation of gradient descent applied to neural networks . There are two stages of this calculation:
- Forward pass through the neural network as you evaluate it on test data .
- Backward propagation of that error as you use the chain rule for differentiation on each of the perceptrons .
For this to work all the perceptrons need to have differentiable activation function .
- Backwards compatibility
Last edited: 2023-11-11This is the concept that new releases can still work in applications that used the old version of your code. It is an absolute nightmare when you are programming leading to horrible design choices and many Deprecation warnings.
- Bagging
Last edited: 2024-01-24This is one of the simplest Ensemble learning methods but out performs classic modelling methods in certain problems.
Suppose we are in the modelling framework . The bagging is the following process
- We choose a set of modelling algorithms $A_i$ for $1 \leq i \leq k$ that could fit the problem - these could all be the same.
- We take random subsets of the training data $T$, $T_i$ for $1 \leq i \leq k$, with replacement - so two samples could contain the same data point.
- We then train $\hat{f}_i$ using algorithm $A_i$ with training data $T_i$ for $1 \leq i \leq k$.
- Then we have some method of averaging these models over our problem space to produce $\hat{f}$ our final model.
# Example
Suppose we want to use polynomial regression on a simple function $f: \mathbb{R} \rightarrow \mathbb{R}$ with training data $T$.
- Balance between library providers and users
Last edited: 2023-11-11When developing third-party libraries, creators often aim for wide applicability, packing in a broad range of features to cater to as many use-cases as possible. On the flip side, consumers of these libraries typically have a more narrow focus, needing only a subset of the available functionality. This dynamic can create a form of natural tension between library providers and users. It’s like buying a multipurpose tool loaded with features, knowing you’ll probably only ever use a few of them.
- Balanced cut problem
Last edited: 2026-02-05# Statement
Balanced cut problemGiven an undirected graph $G = (V,E)$ and an integer $b$. Output a cut $S, T \subset V$ such that $cut(S,T) \leq b$ or no if no such cut exists.
- Bellman-Ford algorithm
Last edited: 2023-11-11Suppose we are trying to solve the problem of how to Find path in undirected graph where we have no negative weight cycles.
In this case you can guarantee that paths will only visit each vertex at most once. So they will use at most $\vert V \vert - 1$ edges. So it is enough to solve the subproblem.
Let $D(i,z)$ be the length of the shortest path between $s$ and $z$ that use at most $0 \leq i \leq n-1$ edges.
- Big-O notation
Last edited: 2023-11-11Big-O notation is the main way people determine the runtime of algorithms. (Rightly or wrongly.) It is the worst case analysis and has a formal mathematical definition
$$ f = O(g) \mbox{ if } \exists \ x_0 \in \mathbb{N}, m \in \mathbb{R}_{>0} \mbox{ such that } f(x) \leq m \ast g(x)\ \forall x > x_0.$$This is just saying that $f$ eventually is at most a scalar multiple of $g$.
- Big-Omega notation
Last edited: 2023-11-11Big-Omega notation is said to be the best case runtime of an algorithm. However formally it is simply a lower bound on the run time - the same as Big-O notation is simply an upper bound. This has a similar formal definition
$$ f = \Omega(g) \mbox{ if } \exists \ x_0 \in \mathbb{N}, m \in \mathbb{R}_{>0} \mbox{ such that } 0 \leq m \ast g(x) \leq f(x) \ \forall x > x_0.$$This is just saying that $f$ eventually is at least a scalar multiple of $g$.
- Big-Theta notation
Last edited: 2023-11-11Big-Theta notation is the tight bound on the runtime of an algorithm. This is only really defined when some function is the following $f = O(g)$ using Big-O notation and $f = \Omega(g)$ then we have $f=\Theta(g)$ using Big-Theta notation this is also an equivalence relation, so $g = \Theta(f)$ as well.
- Bit
Last edited: 2023-11-11A bit in the context of Computer Science is the smallest unit of data. It is a single 0 or 1.
This should not be confused with a byte which is 8 bits.
- Bitwise operations in python
Last edited: 2026-01-28# Bitwise operations
Bitwise operations can apply to the binary data type but also integers .
Representation of integers in Python IndexIn Python integers are signed and stored using Two’s complement .
- Boolean variable
Last edited: 2023-11-11A Boolean variable is one that takes the value
TrueorFalseit is a basic type in most programming languages.In Maths Index the
TrueorFalsevalue is regularly replaced with1or0respectively. This is similar to the representation that variable would have in memory.- Boosting
Last edited: 2024-01-24Boosting is an Ensemble learning method that uses multiple instances of another model to average them out to make a classifier that is better than any one individual one.
Suppose we are in the modelling framework where $B = \{-1,1\}$ and we have a training set $T$ where $(a_t, b_t) = t \in T$. Boosting iteratively train classifiers $h_i: A \rightarrow \mathbb{R}$ based on how the past classifiers have performed. To do this before training $h_i$ we define a distribution $\mathbb{D}_i$ on $A_T = \{a_t \vert t \in T\}$.
- Breadth-first search (BFS)
Last edited: 2023-11-11This is a method for exploring a tree. Lets say you have a vertex $v \in E$ and you either want to run some procedure on vertices or extract some information form them (here we return the parents). Here we use a queue data structure.
BFS(G, root): Input: Graph G = (V,E) and root in V Output: Shortest paths to the root in a tree. 1. let Q be a queue 2. label root as explored 3. Queue the root node on Q. 4. while Q is not empty 4.1. v = Q.dequeue() 4.2. for all edges (v,w) in E 4.2.1. if w is not labeled as explored: 4.2.1.1. label w as explored 4.2.1.2. set v to be the parent of w. 4.2.1.3. Queue w on Q. 5. return parent arrayThis runs in $O(\vert V \vert + \vert E \vert)$ as we run over each edge once (or twice if undirected) and visit each vertex once.
- Byte
Last edited: 2023-11-11This is a group of 8 bits i.e. a 1 or 0.
- Cache
Last edited: 2026-02-05CacheA cache is a method used to speed up data retrieval. It stores data being accessed in a faster type of memory or a closer location to the CPU . If the data is not present in the cache, it is loaded from somewhere else into the cache. This is called a cache miss and normally results in a costly action.
- Chain Hashing
Last edited: 2026-02-05Chain HashingThis is a form of Hash table ($A$, $h: K \rightarrow A$) who’s entries in the associative array are linked lists . Given an element $k \in K$ to add it to this hash table you calculate $h(k)$ then append it to the associated linked list . Then to check if an element $e \in K$ is in your hash table calculate $h(e)$ and then search the associated linked list.
- Chain matrix multiply problem
Last edited: 2026-01-28# Statement
Chain Matrix Multiply problemFor $n$ matrices $A_1, A_2, \ldots, A_n$, where $A_i$ has size $m_{i-1} \times m_i$, what is the minimum cost for computing $A_1 A_2 \ldots A_n$?
- Check if a linear programme is solvable
Last edited: 2023-11-11We are going to use the feasibility algorithm for a linear programme to do this - lets call this $feasibility\_checker$ that takes a linear programme and outputs a boolean based on if it is feasible or not.
This uses a corollary of the Weak duality theorem (linear programme) which tells us unbounded linear programmes have infeasible duals . So we will check if the dual linear programme is feasible. For this let $dual$ be an algorithm to calculate the dual.
- Checked exceptions
Last edited: 2023-11-11Checked exceptions are a feature of some programming languages, most notably Java , that enforce a strong contract between a method that can fail (throw an exception) and the caller that must handle that failure. When a method might throw a checked exception, the language requires that the method declare this fact in its method signature using a
throwsclause. Any calling method is then obligated either to catch the exception and handle it or to declare that it, too, throws the exception , propagating it up the call stack.- Checking if a linear programme is feasible
Last edited: 2023-11-11Suppose we are given a linear programme in standard form specified by $x, c \in M_{n,1}(\mathbb{R})$, $A \in M_{m,n}(\mathbb{R})$, and $b \in M_{m,1}(\mathbb{R})$ and we have a standard form linear programme solver $linear\_programme\_solver$.
Note we will include in the algorithm some transformation to guarantee the inputted linear programme is in standard form by breaking up the added variable into it’s positive and negative components. See getting a linear programme in standard form
- Classes in Python
Last edited: 2023-11-11Classes in python are defined by the
classkeyword for example.class NewThing: def __init__(self, variable): self.variable = variableWhen using classes you use the
selfkeyword to refer back to the object. Therefore a class method requires to have the first argument asself.Special functions
# __call__
When a class has this property it makes it callable. For example:
- Clean Code
Last edited: 2023-11-11This is a book by C. Martin Robert . It contains ideas about how code should be written. I have summarised the ideas I took from this into the Conventions notes.
- Clique of a given size problem
Last edited: 2026-02-05# Statement
Clique of a given size problemGiven an undirected graph $G = (V,E)$ and a positive integer $g > 0$. Does there exist a clique of size at least $g$ if so what is it?
- Clique of a given size problem is in NP
Last edited: 2026-01-28# Statement
LemmaClique of a given size problem is in NP .
- Cocktail party problem
Last edited: 2026-02-05# Statement
Cocktail party problemSuppose you have $n$ people talking at the same time with $k$ microphones placed around the room. From the recordings of the $k$ microphones, when and how can you recover the $n$ sounds?
- Coding Principles
Last edited: 2023-11-11There are many coding principles to stick to, these are just a collection of useful heuristics to avoid the pain of past programmers.
- Cohesion
Last edited: 2023-11-11When googling what this word means I got the following
GoogleThe action or fact of forming a united whole.
- Comment conventions
Last edited: 2023-11-11Comments in code are a necessary evil. As they are not the code, there is no way to check their integrity. As code gets changed the comments may not be updated and can easily cause more confusion that good. Comments also take some thought to process as you need to understand what the person who wrote the comment meant - which is far easier to do in code.
There are good use cases like a public API or todo notes - though they are mainly there to compensate for our failure to express ourselves in code. They do not make up for bad code!
- Conventions
Last edited: 2023-11-11Whilst I have written summaries of the conventions from Clean Code and added in some comments about my preferences. The main thing to remember is that the team conventions are the ones that matter. The purpose of conventions is for teams to work better together, when you look at file and your brain doesn’t have to mentally map the code to how to like to read code. The team rules, when it comes to conventions. Whenever you start a new project, get the people who are working on it to agree on a standard and then write what that is down or better implement a Linter to automatically apply the standard when saving or pushing to the common repository.
- Cost complexity pruning for decision trees (CPP)
Last edited: 2024-02-02When making a decision tree with no pruning it can tend to overfit the training data . To reduce this we can “prune” the decision tree. We do this by looking at how useful each of the branches are - then removing the ones that don’t add enough value. This is controlled by a variable $\alpha$ which increases the cost of having more leaf nodes.
- Coupling
Last edited: 2023-11-11Two bits of code are ‘Coupled’ if they rely on one another. In other words if one changes then you are required to change the other bit of code. When we write programs we really want low coupling as these are easier to change and maintain.
There are lots of methods to decrease coupling, such as introducing interfaces and The Law of Demeter .
- CRUD API
Last edited: 2023-11-11A CRUD API is one that has a set of standard end points:
- Create,
- Read,
- Update, and
- Delete. These normally connects to a database .
- CS6035 O01 Introduction to Information Security
Last edited: 2026-01-28Introduction to Information Security is a graduate-level introductory course in information security. It teaches the basic concepts and principles of information security, and the fundamental approaches to secure computers and networks.
Main topics include:
- Security basics
- Security management and risk assessment
- Software security
- Operating systems security
- Database security
- Cryptography algorithms and protocols
- Network authentication and secure network applications
- Malware
- Network threats and defences
- Web security
- Mobile security
- Legal and ethical issues
- Privacy
# Links
- CS6200 Graduate introduction to Operating Systems
Last edited: 2023-12-03This is a graduate-level introductory course in operating systems. This course teaches the basic operating system abstractions, mechanisms, and their implementations. The core of the course focuses on OS support for concurrency (threads) and synchronisation, resource management (CPU, memory, I/O), and distributed services. The practical component of the course teaches multi-thread programming, inter-process communication, and distributed interactions via RPC.
BreakI retook this course so there is a slight break in dates between lectures 4 and 5. The content was the same for both times I took it.
- CS6210 Advanced Operating Systems
Last edited: 2025-12-03Advanced Operating Systems is a graduate-level course that addresses a broad range of topics in operating system design and implementation, including:
- Operating system structuring
- Synchronization, communication, and scheduling in parallel systems
- Distributed systems, their communication mechanisms, distributed objects and middleware
- Failures and recovery management
- System support for Internet-scale computing By tracing the key ideas of today’s most popular systems to their origins in research, the class highlights key developments in operating system design over the last two decades and illustrates how insight has evolved to implementation.
# Links
- CS6215 Introduction to Graduate Algorithms
Last edited: 2023-12-03This course is a graduate-level course in the design and analysis of algorithms. We study techniques for the design of algorithms (such as dynamic programming) and algorithms for fundamental problems (such as fast Fourier transform FFT). In addition, we study computational intractability, specifically, the theory of NP-completeness. The main topics covered in the course include: dynamic programming; divide and conquer, including FFT; randomized algorithms, including RSA cryptosystem;Â graph algorithms; max-flow algorithms; linear programming; and NP-completeness.
- CS6250 Computer Networks
Last edited: 2025-12-05This course provides a quick refresher on introductory material and offers broad coverage of protocols and algorithms that span all layers of the Internet protocol stack.
The course begins with an overview of the evolution of the Internet architecture and a short refresh of the transport layer algorithms and protocols such as TCP and congestion control. Next, the course will dive into intradomain/interdomain routing, peering and networks relationships with a follow-up into router design and functionalities. After discussing routing and router technologies, the course will continue with Software Defined Networking technologies and discuss topics that intersect Network Security and Computer Networks, for example, attacks on Internet routing such as BGP hijacking. Finally, the course wraps up with a discussion on multimedia applications and Content Delivery Networks.
- CS7641 Machine Learning
Last edited: 2024-01-10This is a 3-course Machine Learning Series, taught as a dialogue between Professors Charles Isbell (Georgia Tech) and Michael Littman (Brown University).
Supervised Learning Supervised Learning is a machine learning task that makes it possible for your phone to recognize your voice, your email to filter spam, and for computers to learn a number of fascinating things. This sort of machine learning task is an important component in all kinds of technologies. From stopping credit card fraud; to finding faces in camera images; to recognizing spoken language - our goal is to give students the skills they need to apply supervised learning to these technologies and interpret their output. This is especially important for solving a range of data science problems.
- CS7642 Reinforcement Learning
Last edited: 2024-01-10The course explores automated decision-making from a computational perspective through a combination of classic papers and more recent work. It examines efficient algorithms, where they exist, for learning single-agent and multi-agent behavioral policies and approaches to learning near-optimal decisions from experience.
Topics include Markov decision processes, stochastic and repeated games, partially observable Markov decision processes, reinforcement learning, deep reinforcement learning, and multi-agent deep reinforcement learning. Of particular interest will be issues of generalization, exploration, and representation. We will cover these topics through lecture videos, paper readings, and the book Reinforcement Learning by Sutton and Barto.
- CS8803 O08 Compilers - Theory and Practice
Last edited: 2026-01-28The objective of this course is to learn the theory and practice behind building automatic translators (compilers) for higher level programming languages and to engineer and build key phases of a compiler in Java or C++ for a small language.
# Links
- CSE6220 Introduction to High Performance Computing
Last edited: 2026-01-28This course is a graduate-level introduction to scalable parallel algorithms. “Scale” really refers to two things: efficient as the problem size grows, and efficient as the system size (measured in numbers of cores or compute nodes) grows. To really scale your algorithm in both of these senses, you need to be smart about reducing asymptotic complexity the way you’ve done for sequential algorithms since CS 101; but you also need to think about reducing communication and data movement. This course is about the basic algorithmic techniques you’ll need to do so. The techniques you’ll encounter cover the main algorithm design and analysis ideas for three major classes of machines: for multicore and manycore shared memory machines, via the work-span model; for distributed memory machines like clusters and supercomputers, via network models; and for sequential or parallel machines with deep memory hierarchies (e.g., caches). You will see these techniques applied to fundamental problems, like sorting, search on trees and graphs, and linear algebra, among others. The practical aspect of this course is implementing the algorithms and techniques you’ll learn to run on real parallel and distributed systems, so you can check whether what appears to work well in theory also translates into practice. (Programming models you’ll use include OpenMP, MPI, and possibly others.)
- Cut property
Last edited: 2023-11-11Cut propertyLet $G = (V,E)$ be an undirected graph with an edge weight $w: E \rightarrow \mathbb{R}$. Suppose $X \subset E$ is part of an MST $T$. Let $S \subset V$ be a cut where no edge of $X$ is in the cut edges $cut(S, \overline{S})$. Then for $e^{\ast} \in cut(S, \overline{S})$ of minimum weight - $X \cup \{e^{\ast}\}$ is part of an MST $T^{\ast}$.
- Data - Object Anti-Symmetry
Last edited: 2026-01-28Clean CodeObjects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.
- Data structure
Last edited: 2023-11-11A data structure is simply a container of information. All its variables are public and can be depended on by other actors in the code base. It could contain some functions that present the data in a nicer format but it can’t rely on private data holding internal state (like a boolean has_printed).
In python data structures are really easy to define.
from dataclasses import dataclass @dataclass class point: x : float y : floatNormally, a single data structure will have lots of implementations as it is just a way to group data together in a meaningful way.
- Declarative Language
Last edited: 2023-11-11This is a language that specifies the desired end state. So it is the picture on the front of the Ikea manual rather than the instruction manual!
Examples
Thoughts A good User Interface (UI) should be declarative with the system behind the scenes doing the leg work to get to that state. This paradigm is being applied more and more within DevOps for example Kubernetes and Terraform both you specify the end state and the program sorts altering the current state to match this.
- Dependency Inversion Principle (DIP)
Last edited: 2023-11-11The Dependency Inversion Principle (DIP) is one of the five principles of the SOLID acronym in object-oriented programming and design. It states that:
- High-level modules should not depend on low-level modules. Both should depend on abstractions.
- Abstractions should not depend on details. Details should depend on abstractions.
In simpler terms, the Dependency Inversion Principle promotes the idea that software components should depend on abstractions (interfaces or abstract classes) rather than concrete implementations. This principle aims to reduce the coupling between modules, making the system more flexible, maintainable, and easier to change or extend.
- Depth-first search (DFS)
Last edited: 2023-11-11This is a way of traversing a graph (could be a directed graph or undirected graph ). It explores all the way down one branch of the graph before tracing and exploring other branches.
The generally runs in $O(\vert V \vert + \vert E \vert)$ time.
# Examples
- Design Patterns
Last edited: 2023-11-11- DFS for finding strongly connected components
Last edited: 2026-02-05SCC(G): Input: directed graph G = (V,E) in adjacency list Output: labeling of V by strongly connected component 1. Construct G^R 2. Run DFS on G^R 3. Order V by decreasing post order number. 4. Run directed DFS on G using the vertex ordering from before and label the connected components we reach.Additional propertyIn addition to labelling the connected components the order we have labelled them is in reverse topological ordering. This is because we always start at the next sink vertex after we have labelled a component.
- DFS to find connected components in an undirected graph
Last edited: 2026-02-05To solve the following problem: Statement
We can use the following DFS algorithm.
DFS(G) input: G = (V,E) in adjacency list representation output: Vertices labelled by connected components connected_component = 0 for all v in V, set visited(v) = False for all v in V if not visited(v) then connected_component++ explore(v) return connected_component_number (defined in explore)Explore(z) input: vertex z output: side effect, it defines the connected component_number on some vertices and marks them as visited. connected_component_number(z) = connected_component visited(z) = True for all (z, w) in E if not visited(w) then Explore(w)As we run through every vertex once and then every edge twice we have $O(\vert V \vert + \vert E \vert)$.
- DFS to find path in a directed graph
Last edited: 2023-11-11# Finding paths in a directed graph via DFS
To do this we are going to use a DFS algorithm like [[]] but we are going to track pre/postorder numbers.
DFS(G) input: G = (V,E) in adjacency list representation output: Vertices labelled by connected components clock = 1 for all v in V, set visited(v) = False for all v in V if not visited(v) then explore(v) return post (defined in Explore)Explore(z) input: vertex z pre(z) = clock, clock ++ visited(z) = True for all (z, w) in E if not visited(w) Explore(w) post(z) = clock, clock++# Example
Suppose we have the following graph and let $B$ be the root node. Suppose we explore edges alphabetically and lets run the algorithm above on it.
- DFS to find path in an undirected graph
Last edited: 2023-11-11# DFS to find path in an undirected graph
For undirected graphs we just keep track of the previous vertex and find a spanning sub -forest for the graph . We can use this to find paths between vertices by going back to the root vertices of the trees .
- Dijkstra's algorithm
Last edited: 2023-11-11This is an algorithm to solve the shortest path problem in undirected graphs or directed graphs $G = (V,E)$. It is based on Breath-first search but uses a priority queue instead of just a queue . It requires positive edge lengths $w: E \rightarrow \mathbb{R}_{>0}$.
Dijkstra(G,w,s): Input: graph G=(V,E) with weights w(e) and a start vertex s. Output: For all vertices reachable from s a shortest path length dist 1. for all u in V 1.1 dist(u) = inf 1.2 prev(u) = nil 2. dist(s) = 0 3. H = makequeue(V) 4. while H is not empty: 4.1 v = deletemin(H) 4.2 for each {v,z} in E: 4.2.1 if dist(z) > dist(v) + w(v,z): 4.2.1.1 dist(z) = dist(v) + w(v,z) 4.2.1.2 prev(z) = v 4.2.1.3 decreasekey(H,z) 5. output dist# Correctness
# Run time
To initialise $dist$ and $prev$ takes $O(\vert V \vert)$ time.
- Directed acyclic graph (DAG)
Last edited: 2026-01-23Directed acyclic graph (DAG)- Dual linear programme
Last edited: 2026-02-05Dual linear programmeSuppose we have a linear programme in standard form given by $A \in M_{m,n}(\mathbb{R})$, $b \in M_{m,1}(\mathbb{R})$ and $c \in M_{n,1}(\mathbb{R})$. Define the dual linear programme to be defined by $-A^T \in M_{n,m}$, $-c \in M_{n,1}(\mathbb{R})$ and $-b \in M_{m,1}(\mathbb{R})$. Where we want to solve for some $y \in M_{m,1}$ where we
- Dynamic Programming
Last edited: 2023-11-11This is an algorithmic technique to solve some recursion type problems in an iterative way. This can speed up these processes dramatically especially when recursion would compute a similar step multiple times.
The basic structure of this technique is to:
- Break a problem down into smaller sub problems.
- Use the solutions to previous sub problems to solve subsequent sub-problems.
- Use the solution to all sub problems to solve the main question.
(If this feels a lot like Induction , that is because this is essentially a programming equivalent.)
- Edmonds-Karp algorithm
Last edited: 2023-11-11This is an algorithm to solve the max flow problem but also solves the min st-cut problem on a flow network $(G = (V,E), c, s, t)$. This is very similar to the Ford-Fulkerson Algorithm but with the use of BFS when picking the s-t path.
# Algorithm
edmonds_karp(G, c, s, t) Input: A flow network. Output: A max flow on the network. 1. Set f(e) = 0 for all e in E. 2. Build the residual network G^f for the current flow f. 3. Check for any s-t paths p in G^f using BFS. 1. If no such path then output f. 4. Given p let c(p) = min_{e in p} c^f(e). 5. Augment f by c(p) units along p. 1. f(e) = f(e) + c(p) for forward edges. 2. f(e) = f(e) - c(p) for backward edges. 6. Repeat from step 2.# Correctness
This proof is the same as Ford-Fulkerson Algorithm .
- Error code
Last edited: 2023-11-11An error code is a numerical or alphanumerical value that represents the status or outcome of an operation in a program. Unlike exceptions , which interrupt the flow of a program, error codes are returned as the result of a function or method call to indicate success, failure, or some specific condition. Programs that rely on error codes require explicit checks after each operation to determine whether it was successful and to decide how to proceed based on the returned error code.
- Error Handling
Last edited: 2023-11-11To build robust software you must handle things going wrong. In code this comes in the form of error handling. There are many ways of handling errors, some common methods include: exceptions , error codes , and return objects.
We must build robust code, however this should be done in a way that maintains the readability and reusability of the code. For which there are some helpful principles here.
- Euclidean algorithm
Last edited: 2023-11-11This algorithm is used to calculate the greatest common divisor for two positive integers. This uses Euclid’s rule and modular arithmetic .
# Algorithm
Euclid(x,y): Input: integers x,y where x >= y >= 0 Output: gcd(x,y) 1. If y = 0 return x 2. return Eculid(y, x mod y)# Run time
If $x$ and $y$ are $n$-bit integers then calculating $x$ mod $y$ takes $O(n^2)$ time. From Claim 1 we have every 2 rounds $x$ decreases by at least a factor of 2. So there are at most $2n$ rounds giving this algorithm a run time of $O(n^3)$.
- Every min-cut is at full capacity in a max-flow
Last edited: 2023-11-11# Statement
LemmaGiven a Flow network $(G, c, s, t)$ with a max flow $f$. Then for every min st-cut ($S$, $T$) all edges in $cut(S,T)$ have maximum capacity in $f$.
- Exception
Last edited: 2023-11-11An exception is a mechanism used in many programming languages to interrupt the normal flow of execution when an error occurs. In languages like Java and Python, an exception is represented as an object . When an error condition is encountered, an exception is “thrown,” interrupting the current process and propagating up through the program’s call stack to find an appropriate “catch” block to handle the error.
- Expectation Maximisation
Last edited: 2024-03-09Expectation maximising is an algorithm which can be used on problems such as clustering problem . It is used in situations where you assume your data follows some underlying distribution $\mathbb{D}$ which in this instance is given by parameters $\theta$. We assume $\theta \in H$ our hypothesis space .
Suppose we draw $m$ samples from $\mathbb{D}_{\theta}$ which has an observed component $X = \{x_1, \ldots, x_m\}$ and an unobserved component $Z = \{z_1, \ldots, z_m\}$. Therefore the samples of $\mathbb{D}_{\theta}$ are given by $Y = X \cup Z$.
- Fast API
Last edited: 2023-11-11This is a python library to implement an API . Whilst there are other API implementations in python such as Flask the unique selling point of Fast API is it’s speed.
Similarly to Flask this is implemented mainly by using decorators to wrap functions to turn them into end points.
- Find connected components in an undirected graph
Last edited: 2026-02-05# Statement
Find connected componentsin an undirected graph Given a graph $G = (V,E)$ how can we find a mapping from its vertices $V$ to the connected components of $G$.
- Find path in a directed graph
Last edited: 2025-12-05# Statement
Find pathbetween vertices in a directed graph Given a directed graph $G = (V,E)$ and two vertices $a,b \in V$, how can we find a path in $G$ between them? Sometimes we will also have a weighting $W: E \rightarrow D$ and we want to minimise the path length.
- Find path in undirected graph
Last edited: 2025-12-05# Statement
Find pathbetween vertices in an undirected graph Given an undirected graph $G = (V,E)$ and two vertices $a,b \in V$. How can we find a path in $G$ between them. Sometimes we will also have a weighting $W: E \rightarrow D$ and we want to minimise the path length.
- Find strongly connected components for a directed graph
Last edited: 2026-01-28# Statement
in a directed graph Given a directed graph $G = (V,E)$ how can we find a mapping from its vertices $V$ to the strongly connected components of $G$.
- First-class object
Last edited: 2023-11-11This is an entity that can be:
- Created at runtime,
- Assigned to a variable or element in a data structure,
- Passed as an argument to a function, and
- Returned as the result of a function.
In Python Index most data types are First-class objects such as Integers , Floats , Dictionaries , Lists and Functions .
- Flow
Last edited: 2026-02-05FlowGiven a flow network $(G, c, s, t)$ a flow is an allocation $f: E \rightarrow \mathbb{R}_{\geq0}$ such that the following holds:
- Flow network
Last edited: 2026-02-05Flow networkA flow network contains the following data:
- Ford-Fulkerson Algorithm
Last edited: 2023-11-11This is the naĂŻve solution to the Max flow problem . It runs in pseudo-polynomial time depending on the size of the solution. A more developed algorithm that uses the same design is the Edmonds-Karp algorithm .
Their main difference is that Edmonds-Karp algorithm must use BFS whereas Ford-Fulkerson Algorithm can use DFS also. For a runtime bound we require Ford-Fulkerson Algorithm to use integer capacities.
- Formatting conventions
Last edited: 2023-11-11Here formatting means the spacing and organisation of files in your code base. This is about communication and leaving a well organised desk space for others to use after you.
The main metaphor used for formatting is a news paper. Nearly all code is read left to right and top to bottom. You documents, should start with a head line, then as you go down slowly increase in the level of detail. The code should be in nice columns and no story should be too long. It shouldn’t use too much indentation but should be nicely spaced to make it easy to read. You should use paragraphs to break up concepts.
- Fourier Matrix
Last edited: 2023-11-11The Fourier matrix is used in Fast Fourier Transform . Let $\omega$ be a $n$‘th root of unity then define the Fourier matrix to be
$$F_n(\omega) = \left [ \begin{array} \ 1 & 1 & 1 & \cdots & 1\\ 1 & \omega & \omega^2 & \cdots & \omega^{n-1}\\ 1 & \omega^2 & \omega^4 & \cdots & \omega^{2(n-1)}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & \omega^{n-1} & \omega^{n-2} & \cdots & \omega \end{array} \right ].$$- Function conventions
Last edited: 2023-11-11Functions should be small encapsulated bits of code that does a single thing. It takes parameters which it does not alter and potentially returns some value or alters the state of the object they belong to.
# Function body
# Functions should do one thing
Function should do one thing, and one thing only. If you need to decide if it is doing more, write out in words what it is doing. If there are sections in your functions, then it for sure is doing too much.
- Functions in Python
Last edited: 2023-11-11Functions in Python Index are a class and have type ‘function’. i.e.
>>> type(my_function) <class ‘function’>
They have their own namespace that gets destroyed once the function call has ended. In Python Index functions are First-class objects , meaning they can be assigned to variables.
You can see this in the below example.
- Genetic algorithm (meta)
Last edited: 2024-02-24Suppose we have a optimisation problem which as two properties associated to is domain $A$
- There is a method to make a small change to any $a \in A$ called a mutation $M: A \rightarrow \mathcal{P}(A)$ where any of $m(a)$ are a small change to $a$.
- It has some crossover function $C: A \times A \rightarrow A^k$.
Then for a collection of points $P \subset A$ called a population the algorithm assess the fitness of each point $f(a)$ for $a \in P$.
- Gradient decent
Last edited: 2026-02-05This algorithm uses differentiation to minimise error terms for models. Suppose we have a model with some parameters $w \in \mathbb{R}^k$. Further suppose the error function $E(w)$ is differentiable with respect to $w$. Then we set some learning rate $\eta$ and we perturb the weights $w$ by $w \leftarrow w - \eta \frac{\partial E}{\partial w}$ until $w$ settles in some local minimum.
- Graph representations
Last edited: 2023-11-11How we represent a directed graph or undirected graph in an algorithm can effect the run time of an algorithm.
Bellow are some common ways to do this:
Data structure Space complexity Time to check connection Time to find neighbours Adjacency list $O(\vert V \vert + \vert E \vert)$ $O(N_v)$ $O(N_v)$ Adjacency matrix $O(\vert V \vert^2)$ $O(1)$ $O(\vert V \vert)$ Edge list $O(\vert E \vert)$ $O(\vert E \vert)$ $O(\vert E \vert)$ - Halting problem
Last edited: 2026-02-05# Statement
Halting problemGiven a programme $P$ with an input $I$. Does $P(I)$ ever terminate?
- Hash function
Last edited: 2025-12-05Hash functionA hash function $h$ is a function that takes an input in a given domain and returns a fixed-size string of bytes $h: K \rightarrow B$, this output is normally called the hash value. It should ideally be:
- Hash table
Last edited: 2026-02-05Hash tableThis consists of an associative array $A$ and a hash function $f: K \rightarrow A$. Values provided to the hash table get hashed using $f$ which provides a key (usually thought of as a number below a given load value) for $A$. Then the within this entry of $A$ you will store some object that can verify if the value is in this table. In an ideal world the hash function $f$ would map each key to a different key of $A$ however this is unlikely to happen in practice. So hash tables need to handle key collision.
- Heap (OS)
Last edited: 2026-01-28HeapThe heap of a process is dynamic memory which is allocated at run time. It will be used to store variables which may vary dramatically in size depending on what the application is run on - for example reading data into memory. This memory will stay allocated until it is explicitly deallocated. Therefore the heap can come with considerable overheads and require techniques like garbage collection or custom allocations to handle.
- Hill climbing
Last edited: 2024-02-24This is the most naĂŻve random optimisation algorithm. Suppose we have a optimisation problem where $A$ has some sense of neighbourhood $N(a)$ for each point $a \in A$.
Start at some random point $a_s \in A$ then move to the point
$$a' = \mbox{arg}\max_{a' \in N(a_s)} f(a')$$keep iterating until $a_s = a'$.
# Pseudocode
hill_climber(optimise, neighourhood): Input: optimise: The function you are looking to optimise neighbourhood: A function that returns the neighbourhood of a point A -> P(A). Output: Some a in A that is a local optimum for optimise. 1. Select random point current_best in A. 2. Let next_best be some different point in A. 3. While current_best != next_best: 1. Let current_best = next_best 2. next_best_score = optimise(current_best) 3. for n in neighourhood(next_best): 1. if optimise(n) > next_best_score 1. next_best_score = optimise(n) 2. next_best = n 4. Return current_best# Run time
This strongly depends on the sense of neighbourhood. For example if the neighbourhoods were the entire domain this might be very slow.
- Image Segmentation
Last edited: 2026-02-05# Statement
Image SegmentationGiven an undirected graph $G = (V,E)$ with weights:
- Image segmentation by max flow
Last edited: 2026-02-05# Theory
# Reformulation
Right now we have two problems, it is a maximalisation problem and the weights might not be non-negative.
Let
$$L = \sum_{v \in V} f(v) + b(v).$$Then we can reformulate
$$\begin{aligned}w(F,B) & = \sum_{v \in F} f(v) + \sum_{u \in B} b(u) - \sum_{(v,u) \in cut(F,B)} p(u,v)\\ & = L - \sum_{u \in B} f(u) - \sum_{v \in F} b(v) - \sum_{(v,u) \in cut(F,B)} p(u,v)\end{aligned}$$Therefore to maximise $w(F,B)$ we could instead minimise
- Imperative Programming
Last edited: 2023-11-11In this paradigm the language specifies a series of instructions to potentially get to an end state (cough Holting Problem cough). Lots of peoples first programs are imperative.
letters = ["a","B","C","d"] output = [] for letter in letters: if letter.lower() == letter: output.append(letter) print(output) # ["a", "b"]Here you provide the machine exact steps to what has to be done to produce the output.
There is a sub category of this called Procedural Programming that uses this with functions .
- Independent set of a given size
Last edited: 2026-02-05# Statement
Independent set of a given sizeGiven a undirected graph $G = (V,E)$ and positive integer $g > 0$. Does $G$ have an independent set of size $g$?
- Independent set of a given size is in NP
Last edited: 2025-12-05# Statement
LemmaIndependent set of a given size is in NP .
- Infeasible linear programme
Last edited: 2026-02-05Infeasible linear programmeA linear programme is infeasible if there are no $x \in \mathbb{R}^n$ that satisfy its constraints.
- Integer linear programming problem
Last edited: 2026-02-05# Statement
Linear programme problemGiven a linear programme what is an $x \in \mathbb{z}^n$ that achieves an optimal solution if it exists.
- Interface
Last edited: 2023-11-11An interface is a standard way for a someone else to interact with an object. There are lots of interfaces you will deal with such as API’s , library functions, or UI’s .
A well defined interface makes objects easy to use and help the process of Abstraction .
- Iterative algorithms
Last edited: 2023-11-11An iterative algorithm is a computational procedure that solves a problem by repeating a set of steps until a specific condition is met, without making recursive calls. In other words, it uses loops (like
for,while, etc.) to repeatedly execute code blocks, updating the state variables as it progresses, until it reaches an exit condition that signifies the problem has been solved.Here are some key characteristics of iterative algorithms:
- Iterative Dichotomiser 3 (ID3)
Last edited: 2024-01-11This algorithm works on classification problems . It is a greedy algorithm to design a maximal decision tree . Therefore are looking for the tree to represent some function $f: A \rightarrow B$, where we may only have access to some training data $D \subset f$.
This decision tree will be maximal with regards to the information entropy of each of the leaf nodes within that decision tree . For any $A' \subset A$ we can define $Entropy(A')$ by considering drawing a random element in $A'$ out with a probability of them being mapped to some element of $B$.
- k-colourings problem (graphs)
Last edited: 2026-02-05# Statement
$k$-colourings problem (graphs)Given an undirected graph $G = (V,E)$ and integer $k > 0$. Does $G$ have a proper vertex colouring using $k$-colours, if it does output the colouring and if not say so.
- k-means clustering
Last edited: 2024-03-09Suppose we are in the clustering problem set up. Suppose we are provided $k$ the number of clusters. Further assume $A$ is a space with some method of averaging $Avg:2^A \rightarrow A$ which satisfies
$$ \sum_{x \in X} d(x, Avg(X)) = \min_{a \in A} \sum_{x \in X} d(x,a), \mbox{ for all } X \subset A. $$In the Euclidian example we will have
$$ Avg(X) = \frac{1}{\vert X \vert} \sum_{x \in X} x. $$To start the algorithm we pick $k$ random points in our training data $T$. We set these to be the current centres $center^0_i$ for $1 \leq i \leq k$. For iteration $t \in \mathbb{N}$ assume we have centres $center_i^{t-1} \in A$ (they don’t need to be points in $T$). Now we group the training data
- k-nearest neighbour
Last edited: 2024-01-22This algorithm is an Instance-based learning model and uses closest neighbours to determine what values should be.
In the modelling framework , suppose we are given:
- training data $T \subset (A,B)$,
- a metric on $A$ called $d: A \times A \rightarrow \mathbb{R}$,
- $k \in \mathbb{Z}$ to be the number of points to consider, and
- an averaging mechanism $V: \mathcal{P}(B \times \mathbb{R}) \rightarrow B$ that takes a weighted set of values in $B$ and returned another value in $B$ - like the weighted geometric mean . Then to model our data when given a new input point $a \in A$ we find the $k$-nearest neighbours in our training data $T$ then take the average of their associated values in $B$.
This model is very simple - however a lot of the complexity comes in defining $d$ and $V$.
- k-SAT is in NP
Last edited: 2026-01-28# Statement
Lemma- k-SAT is NP-complete for k greater than or equal to 3
Last edited: 2026-01-28# Statement
Lemmak-SAT is NP-complete for $k \geq 3$
- k-satisfiability problem (k-SAT problem)
Last edited: 2026-02-05# Statement
k-SAT ProblemGiven a boolean function $f$ in CNF with $n$ variables and $m$ clauses of size at most $k$. Is there a true/false assignment to the $n$ variables that satisfies $f$. If yes then output it, otherwise say no.
- Knapsack Problem
Last edited: 2025-12-05# Statement
The knapsack problemGiven $n$ objects $o_i$ with weight $w_i$ and value $v_i$ how do we maximise the value of a bag that can carry weight $B$. The solution to this is a subset of objects $S \subset \{1, 2, \ldots, n\}$ such that:
- Knapsack problem (without repetition)
Last edited: 2026-01-28# Statement
The knapsack problemGiven $n$ objects $o_i$ with weight $w_i$ and value $v_i$ how do we maximise the value of a bag that can carry weight $B$. The solution to this is a subset of objects $S \subset \{1, 2, \ldots, n\}$ such that:
- Knapsack-search (without replacement)
Last edited: 2026-01-28# Statement
Knapsack-search (without replacement)Given $n$ objects $o_i$ with weight $w_i$ and value $v_i$, and a goal $g$ is there a bag that has weight less than $B$ but value at least $g$. The solution to this is a subset of objects $S \subset \{1, 2, \ldots, n\}$ such that:
- Knapsack-search is NP-complete
Last edited: 2025-12-05# Statement
LemmaKnapsack-search is NP-complete .
- Kruskal's algorithm
Last edited: 2023-11-11This is an algorithm to solve the MST problem. It uses the union-find data structure to help it identify cycles.
# Pseudocode
Kruskal's(G,w): Input: undirected graph G=(V,E) with weights w(e). Output: MST edges X 1. Sort E by weights, smallest to largest 2. Set X to be empty 3. For e = (v,w) in E (ordered) 1. If X U e does't have a cycle then X = X U e. 4. Output X# Run time
For step 1, we use merge sort so this takes $O(\vert E \vert \log(\vert E \vert))$ time. Though as $\vert E \vert \leq \vert V \vert^2$, we can think of this as $O(\vert E \vert \log(\vert V \vert))$.
- Lambda functions
Last edited: 2023-11-11These are Anonymous Functions that take multiple inputs and evaluate the output in a single statement. It has the following syntax:
lambda arguments : expressionSyntactic SugarA
lambdaexpression creates a function object just like thedefstatement.- Linear programme
Last edited: 2026-02-05Linear programmeA linear programme has $n$ variables $x_j$ for $1 \leq j \leq n$ where we are trying to either
- Linear programme standard form
Last edited: 2026-02-05Linear programme standard formThe standard form for a linear programme with $n$ variables and $m$ constraints is given by matrices $x, c \in M_{n,1}(\mathbb{R})$, $A \in M_{m,n}(\mathbb{R})$, and $b \in M_{m,1}(\mathbb{R})$ were we want to
- Linear programming problem
Last edited: 2026-02-05# Statement
Linear programme problemGiven a linear programme what is an $x \in \mathbb{R}^n$ that achieves an optimal solution if it exists.
- Linked lists
Last edited: 2026-02-05Linked listA linked list stores a collect of elements of the same data type. The data structure consists of a set of sequential nodes. Each node has the following information
- Logging in python
Last edited: 2023-11-11# logging python
The logging module in python is used to do exactly what you would expect, it logs messages.
Some cool features it has is:
- different levels of messages for filtering of messages,
- different logger names for different parts of the application,
- different handlers to post messages such as streamed, files, http and rotating, and
- custom formatters for different handlers.
This library was converted from Java which is why it does not conform to the python naming conventions.
- Machine Learning by Tom Mitchell
Last edited: 2024-01-10This is a book by Tom Mitchell about Machine Learning that was the course text book for CS7641 Machine Learning . You can find it the repo below or for free on his website .
- Many-one reduction (problem)
Last edited: 2026-02-05Many-one reductionThere is a many-one reduction of problem $A$ to $B$ ($A \rightarrow B$ or $A \leq B$) if a polynomial time algorithm to solve $B$ would also solve $A$ in polynomial time .
- Markdown
Last edited: 2023-11-11This is a Markup Language that is widely used, for example it is used in GitHub README pages and Obsidian notes. It has a fairly basic syntax, that I am going to have great fun describing in a Markdown file.
Syntax I am sure there is more, but this is what I have found useful so far!
- Markup Language
Last edited: 2023-11-11A Markup Language is text-encoding system that tells the computer how to display some text. There are 3 main categories
- Presentational markup - ironically the least well presented of all of them. Essentially binary code that is interpreted for programs such as Word.
- Procedural markup - This uses tags within the language to define how it should be rendered. Such as Markdown and Tex .
- Descriptive markup - (Sometimes called semantic) This defines how to label sections of the text. Such as Latex and HTML .
Difference from a Programming Language
- Max clique problem (graph)
Last edited: 2026-02-05# Statement
Max clique size problemGiven an undirected graph $G = (V,E)$ what is the largest $S \subset V$ such that the induced graph on $S$ forms a clique .
- Max flow problem
Last edited: 2025-12-05# Statement
Max flow problemGiven a flow network $(G, c, s, t)$, what is the maximum value a flow on this network can have?
- Max independent set problem (graph)
Last edited: 2026-02-05# Statement
Max independent set problemGiven a undirected graph $G = (V,E)$ what is the size of the largest independent set in $G$?
- Max independent set problem is NP-hard
Last edited: 2025-12-05# Statement
LemmaMax independent set problem is NP-hard .
- Max-flow min-cut Theorem
Last edited: 2023-11-11# Statement
TheoremFor a flow network $(G, c, s, t)$ the size of the max flow is the same as the min st-cut .
- Max-k-exact-satisfiability problem
Last edited: 2026-02-05# Statement
Max-$k$-exact-satisfiability problemProvided with a boolean function $f$ in CNF with $n$ variables and $m$ clauses, where each variable only appears in a single literal in each clause and each clause has exactly $k$ litterals. What is an assignment maximising the number of satisfied clauses?
- Max-SAT random approximation algorithm
Last edited: 2023-11-12This is a naĂŻve approximate solution to the Max-SAT problem. If the formula has $n$ variables and $m$ clauses this guarantees $m/2$ clauses will be satisfied by the assignment.
# Pseudocode
Name(f): Input: Boolean formula in CNF form f, with variables x_i and clauses c_j. Output: An assignment to x_i that satisified m/2 clauses. 1. Let C be the set of clauses and build assignment a : x_i |-> T/F. 2. For each x_i: 1. Let A be all the clauses in C containing the x_i literal. 2. Let B be all the clauses in C containing the not x_i literal. 3. If A is bigger than or equal to B, set a(x_i) = T and remove A from C. 4. Else set a(x_i) = F and remove B from C. 3. Return assignment a.# Run time
We loop $n$ times having to look through a list of size at most $m$. Therefore this takes $O(nm)$.
- Max-Satisfiability Problem
Last edited: 2026-02-05# Statement
Max-Satisfiability Problem (Max-SAT problem)Provided with a boolean function $f$ in CNF with $n$ variables and $m$ clauses, where each variable only appears in a single literal in each clause. What is an assignment maximising the number of satisfied clauses?
- MIMIC (meta)
Last edited: 2024-02-24Suppose we have some Optimisation problem with input space $A$ and fitness function $f: A \rightarrow \mathbb{R}$. Then for any $\theta \in \mathbb{R}$ define
$$ P^{\theta}(a) = \begin{cases} \frac{1}{Z_{\theta}} & \mbox{if } f(a) \geq \theta\\ 0 & \mbox{otherwise.} \end{cases}$$(Where $Z_{\theta}$ is some normalisation coefficient to make this a probability distribution .) Then let $\theta_{min} = \min_{a \in A} f(a)$ and $\theta_{max} = \max_{a \in A} f(a)$ then $P^{\theta_{min}}$ is the uniform distribution over $A$ whereas $P^{\theta_{max}}$ is the uniform distribution over the optimum.
- MIMIC by dependency trees
Last edited: 2024-02-25This follows on from MIMIC (meta) . Where we are going to use dependency trees to calculate the distribution. Here we are just implementing a version of the
probability_constructor.We assume the domain of our problem breaks up $A = \oplus_{i=1}^n A_i$ into features, we will also break up the random variable $X=\oplus_{i=1}^n X_i$ on this also. To generate the probability distribution we are going to build a dependency tree . To decide the form of this dependency tree - due to some maths - we will use provide pass or fail samples to calculate Mutual information for each pair of features.
- Min st-cut problem
Last edited: 2026-02-05# Statement
Min st-cutGiven a flow network $(G, c, s, t)$ what is the st-cut $(L, R)$ with minimum capacity.
- Min-heap
Last edited: 2026-01-21A min-heap is a complete binary tree T (i.e., all layers are full except possibly the bottom). Each node $x \in T$ in this tree is associated with a value in the heap $v_x$. This tree obeys the min-heap property:
- For all nodes $x \in T$ in the tree if they have a parent $p \in T$ then $v_x \geq v_p$.
This means the root node is the smallest value in the heap.
- Minimum Spanning Tree problem (MST)
Last edited: 2026-02-05# Statement
Minimum Spanning Tree problemGiven an undirected graph $G = (V,E)$ with weights $w: E \rightarrow \mathbb{R}$ can you find a spanning tree $T = (V, E')$ with minimum weight
- Minimum vertex cover problem
Last edited: 2026-02-05# Statement
Minimum vertex cover problemGiven an undirected graph $G = (V,E)$ provide a vertex cover of smallest size.
- Modular exponent algorithm
Last edited: 2023-11-11This algorithm has a polynomial run time in the size of the input to calculate powers in using modular arithmetic . This has not been sped up using Euler’s theorem (modular arithmetic) .
Algorithm
resursive_mod_exp(x,y,N): Input: n-bit integers x,y,N >= 0 Ouput: x^y mod N 1. if y = 0, return 1 2. z = resursive_mod_exp(x, floor(y/2), N) 3. if y is even return z^2 mod N 4. if y is off return xz^2 mod NRun time
- Modular exponent problem
Last edited: 2026-02-05# Statement
Modular exponent problemGiven 3 $n$-bit integers $x, y, N \geq 0$ what is $x^y$ mod $N$.
- Modular inverse algorithm (extended Euclidean algorithm)
Last edited: 2023-11-11This algorithm solves the Modular inverse problem and uses the Extended Euclidean algorithm to do it.
Algorithm
modular_inverse_algorithm(x, N): Input: 2 n-bit integers x, N. Output: Either x^{-1} mod N or it says it does not exists. 1. Run extended_euclidean_algorithm(x, N) = (d, a, b) 2. If d != 1 return that no inverse exists. 3. Return aRuntime
As the Extended Euclidean algorithm takes $O(n^3)$ so does this algorithm.
- Modular inverse problem
Last edited: 2026-02-05# Statement
Modular inverse problemGiven 2 $n$-bit integers $x, N \geq 0$ what is $x^{-1}$ mod $N$.
- Mutability
Last edited: 2023-11-11An object is considered mutable if if can be changed after it is created, whereas it is immutable if it can not be changed. If you change a mutable object all references to that object will also be changed.
In Python Index , lists, sets and dictionaries are mutable whereas numbers, strings, tuples and frozen sets are immutable.
This is commonly used in interview questions, for example:
- Mutability in Python
Last edited: 2025-12-05In Python Index all variables are references to objects. Every object has a mutability property. Objects that are mutable in Python behave as you would expect: when a variable pointing to a mutable object is altered, the object itself is altered.
x = [1,2] y = x x.append(3) print(y) # [1,2,3]Here
xandypoint to the mutable list object. Whenxis altered to add3to the list,yalso reflects this change because both variables point to the same object.- Naive Bayes classifier
Last edited: 2024-02-24Suppose we have a classification problem and we are in the modelling framework . We are given features $A = \oplus_{i=1}^n A_i$ where each feature $A_i$ has finite values (also split up the random variable $X$ into $X = \oplus_{i=1}^n X_i$ for our input). Let our classification also have finite domain $B$.
We can use our testing data to estimate $\mathbb{P}[Y=b]$ for each $b \in B$ and $\mathbb{P}[X_i = a_i \vert Y = b]$ for each $a_i \in A_i$ and $b \in B$.
- Namespaces
Last edited: 2023-11-11A namespace in computer science is essentially a container where identifiers (names of types, variables, functions, etc.) live. It allows different developers, libraries, or parts of a program to use the same identifier names without causing conflicts, because the same identifier can be bound to different entities in different namespaces.
In other words, namespaces provide a way to disambiguate identifiers that might be common and likely to be used more than once. This way, even if the same identifier is used in different parts of a program, as long as they are in different namespaces, they are considered to be different entities.
- Naming conventions
Last edited: 2023-11-11When coding we repeatedly have to name things. So having some go to standards for how to do this makes writing code quickly but it easier to read. The key thing is for the name to be fully expansive about what the variable is doing, if you can’t do this maybe this variable is doing too much.
One difference between a smart programmer and a professional programmer is that the professional understands that clarity is king. Professionals use their powers for good and write code that others can understand.
- Neighbourhood (graph)
Last edited: 2026-02-05NeighbourhoodFor a undirected graph $G = (V,E)$ the neighbourhood of $X \subset V$ is $N_G(X) = \{u \in V \vert (x,u) \in E, x \in X\}$ ($\backslash X$, $\cup X$). Sometimes this will be defined to include $X$ or exclude $X$ - this is called open and closed neighbourhoods.
- Nondeterministic Polynomial time (NP)
Last edited: 2026-02-05This definition can mean multiple things.
Nondeterministic Polynomial time (search)The class of NP problems is the class of all search problems .
- NP-Complete
Last edited: 2026-02-05NP-CompleteA problem $f$ is NP-complete if
- NP-hard
Last edited: 2026-02-05NP-hardA problem $f$ is NP-hard if for all search problems $g$ there is a many-one reduction from $g$ to $f$. In other terms, $f$ is as hard as any search problem .
- Object
Last edited: 2023-11-11An object keeps data behind an interface. It makes its internal variables private. Then exposes setter functions that allow people to edit in there preferred way. It also has getter functions to provide the methods to get at the data how they would like to access it.
The following would be a data structure .
class point: x : float y : float def init(self, x: float, y: float): self.x = x self.y = yWhereas the following would be an object.
- Ones complement
Last edited: 2026-02-05Ones complementThe ones complement of a binary number is obtained by flipping 1’s and 0’s. Therefore if you were to add a number with its one complement you would end up with a number of all 1’s.
- Open Closed Principle (OCP)
Last edited: 2023-11-11# The Open Closed Principle (OCP)
This is one of the SOLID principles and states:
- software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification
- P equals NP or P not equals NP
Last edited: 2023-11-11This is the open problem whether the class of Polynomial time problems is equal to the class of Nondeterministic Polynomial time (NP) problems.
- Page rank
Last edited: 2026-02-05Page rankThe page rank is a probability distribution over the vertices of the Webgraph $G = (V,E)$. This is a function $\pi : V \rightarrow [0,1]$ that assigns a ranking to each page. It is defined recursively as
- Page rank algorithm
Last edited: 2023-12-03This algorithm is used to calculate the “importance” of a page on the internet, or the page rank . This a Probability distribution on the Webgraph , $\pi$.
To approximate this we use a Markov chain and approximate its stationary distribution .
# Pseudocode
Build a Markov chain $P$ on the vertices of the Webgraph $G = (V, E)$. This will have
- Palindrome
Last edited: 2026-01-28PalindromeA sequence $a_0, a_1, \ldots, a_n$ is a palindrome if $a_i = a_{n-i}$ for all $0 \leq i \leq n$.
- Parallelisation
Last edited: 2026-02-05ParallelisationThis is when tasks are scheduled on different CPU and so happen in parallel to one another. This usually uses multi-threading or multi-processing .
- Passing variables to a function
Last edited: 2023-11-11When you pass a variable to a function , what does the function actually receive?
First lets name things to make this a little simpler. The variable we pass to the function will be called the
argumentwhereas the variable used by the function will be called theparameter.def function(parameter): # The variable used by the function is called the parameter ... argument = ... The variable we pass to the function is called the argument function(argument)There are two main choices of what can happen to the argument when it gets passed into the parameter, it can be:
- Perceptron rule
Last edited: 2024-01-18This is a way to train a perceptron using the binary step activation function. To classify training data $T$ where $A = \mathbb{R}^n$ and $B = \{0,1\}$. We assume the perceptron we are training is a function $p: \mathbb{R}^n \rightarrow \{0,1\}$ defined by weights $w_i, \theta \in \mathbb{R}$. For this we require a sufficiently small learning rate $\eta \in \mathbb{R}_{>0}$. The idea is to reply the data by nudging the weights ($\Delta w_i$) of the perceptron slowly towards an even split.
- Pipe
Last edited: 2023-11-11This is where you do multiple manipulations to a data object in ‘one line’ but break it up over multiple. This is commonly used in Python Index with the Pandas library.
data_frame.group_by(["animals"]) .mean() .reset_index() .rename({"height": "mean_height", "weight": "mean_weight"})This operations are naturally grouped together and effect the same object. This is not to be confused with a train wreck .
- Policy Iteration (MDP)
Last edited: 2024-04-06This is a method of solving Markov decision processes . It uses discounted rewards to get a local as well as global optimum solution.
It uses the same set up as Value iteration (MDP) but instead we use the policy to decide the utility and iterate that.
# Pseudocode
Instead of looking at the utility of a state we could instead look at the policy and use the utility to guide us.
- Polymorphism
Last edited: 2023-11-11This is the ability of a single interface or base class to represent multiple concrete types or classes. This allows objects of different classes to be treated as objects of a common superclass or interface.
Polymorphism enables developers to write code that is more flexible, extensible, and easier to maintain because it helps decouple the implementation details from the code that uses those implementations. By interacting with the abstract interface, code can work with different concrete implementations without needing to know their specific details.
- Polynomial time
Last edited: 2026-01-28Polynomial timeAn algorithm runs in polynomial time if its running time is polynomial in the length of the input. A problem can be solved in polynomial time if it is a search problem and there exists a polynomial time algorithm for it.
- Polynomial time is a subset of NP-complete
Last edited: 2023-11-11# Statement
LemmaThe class of Polynomial time problems is a subset of Nondeterministic Polynomial time (NP) .
- Pre-commit hooks
Last edited: 2023-11-11Pre-commit hooks are a great way to run checks against your code before it is committed to the repository. This saves time and money by not having to run checks in the CI/CD pipeline, and also puts control of your commits back in your hands.
# Setting up pre-commit hooks
You can install the pre-commit package using pip.
- Prim's algorithm
Last edited: 2023-11-11This solves the MST problem using a method similar to Dijkstra’s algorithm . It will use a Priority queue to do this.
Algorithm
Prim's(G,w): Input: undirected graph G=(V,E) with weights w(e). Output: MST defined by the array prev for all u in V cost(u) = inf prev(u) = nil Pick any initial node u_0 cost(u_0) = 0 H = makequeue(V) while H is not empty: v = deletemin(H) for each {v,z} in E: if cost(z) > w(v,z): cost(z) = w(v,z) prev(z) = v decreasekey(H,z) output prevCorrectness
- Principle component analysis
Last edited: 2026-02-05Principle component analysis is a linear dimension reduction algorithm. In concept principle component analysis find the axis along which the data has maximal variance if it were projected. It does this by finding the Eigenvector and Eigenvalues of the Covariance matrix . It uses these eigenvectors as a new basis of the data’s feature space.
- Procedural Programming
Last edited: 2023-11-11- Programming paradigms
Last edited: 2023-11-11When designing a programming language, there are different aspects you can choose to focus on. This leads to different programming paradigms that are useful to know about, and understand the pay offs of each of them.
- Pseudo-polynomial time
Last edited: 2026-01-28Pseudo-polynomial timeAn algorithm runs in pseudo-polynomial time if its running time is polynomial in the numeric value of the input but not necessarily in the length of the input.
- Python Built-in Functions
Last edited: 2023-11-11# Python Builtin Functions
Python Index has some built in functions that make using the language slightly easier. You can find a full list of these in the python documentation .
# Functions
# callable(object)
This returns
TrueorFalsedepending on if the object is callable. For example- Pythonic
Last edited: 2023-11-11Pythonic means code that doesn’t just get the syntax right, but that follows the conventions of the Python community and uses the language in the way it is intended to be used. 1
For example:
for index in range(len(some_list)): some_list[index].do_something()would not be considered pythonic however
for item in some_list: item.do_something()would be as it uses the built in iteration of the python language.
- Q-learning
Last edited: 2024-04-06Q-learning is a reinforcement learning class of algorithms which are value function based. It uses the approach of Incremental learning of the Q-function (RL) . We use the model of transitions where the learning can provide the action each iteration.
Pick an initial estimation $\hat{Q}(s,a)$ of the Q-function (RL) .
We need to pick a learning rate $\alpha_t$ such that
- Quick sort
Last edited: 2023-11-11This is a recursive algorithm used to solve the sorting problem that uses pivots $p$ to sort the list. A very brief guide to how this works is as follow
- Choose a pivot $p$.
- Partition $A$ into $A_{
p}$ .
- Recursively sort $A_{
p}$.
The issue is how to choose a good piviot $p$ - ideally you would want to pick the median element.
- Random Access Memory (RAM)
Last edited: 2026-02-05Random Access Memory (RAM)Random access memory is used to temporarily store data relevant to an executing program. Unlike disk storage RAM is temporary and will lose its state once the machine is turned off. However, RAM is fast access which makes it more usable to programs when they need to access it frequently. Processes access RAM through virtual memory which is mapped to physical memory through page tables .
- Random component analysis
Last edited: 2024-03-10Random component analysis is a form of linear dimension reduction . It picks a random linear map $p: \mathbb{F}^n \rightarrow \mathbb{F}^m$ and applies that.
This works mainly due to The curse of dimensionality .
- Recursion
Last edited: 2023-11-13This is the technique of a function calling itself recursively to find an answer and returning at some given condition. This is in opposition to Iterative algorithms that achieve what they want in a for or while loop.
The important thing to remember is to set up a return condition that will always be met otherwise, this could go on forever (see Halting problem ).
- Refactored
Last edited: 2023-11-11The process of refactoring is to review your code and make it more readable by breaking it down into smaller components.
- Reference counting in Python
Last edited: 2023-11-11In Python Index when objects are created, it also stores the number of references there are to that object. This is what we mean when we say the reference count of an object.
You can access this reference count by using the
syslibrary with the functionsys.getrefcount.import sys x = [1,2] print(sys.getrefcount(x)) # 2 y = x print(sys.getrefcount(x)) # 3 print(sys.getrefcount(y)) # 3 del y print(sys.getrefcount(x)) # 2(Note the numbers are 1 more than you would expect, this is due to the object
[0,1]being passed into the functionsys.getrefcountwhich then has a reference to that object also.)- REST API
Last edited: 2023-11-11This is a specific type of API that conforms to a certain specification.
- Restart hill climbing
Last edited: 2024-02-24# Random restart hill climbing
Suppose we have a optimisation problem where $A$ has some sense of neighbourhood $N(a)$ for each point $a \in A$.
Then we want to run Hill climbing but try to get around the issue with only finding local optimum by at the end of a run restarting in a new place.
- Rivest-Shamir-Adleman algorithm (RSA algorithm)
Last edited: 2023-11-11To set up the RSA algorithm you do the following:
- You pick 2 n-bit random primes $p$ and $q$.
- You choose $e$ relatively prime to $(p-1)(q-1)$.
- You let $N = pq$ and you publish the public key $(N,e)$.
- You compute your private key $d = e^{-1}$ (mod $(p-1)(q-1)$).
Then to encrypt a message $m \in \mathbb{Z}$ someone else does the following:
- Take a public key $(N,e)$.
- You compute $y = m^e$ (mod $N$).
- Then you send message $y$.
Lastly to decipher the message using the private key you do the following.
- Rudrata cycle problem
Last edited: 2026-02-05# Statement
Rudrata cycle problemGiven a graph $G = (V,E)$, what is a Rudrata cycle if it exists?
- Rudrata path problem
Last edited: 2026-02-05# Statement
Rudrata path problemGiven a graph $G = (V,E)$, what is a Rudrata path if it exists?
- Run time complexity
Last edited: 2023-11-11There are 3 main ways people use to analyse the run time complexity of algorithms.
# Big-O notation
Big-O notation is the main way people determine the run time of algorithms. (Rightly or wrongly.) It is the worst case analysis and has a formal mathematical definition
$$ f = O(g) \mbox{ if } \exists \ x_0 \in \mathbb{N}, m \in \mathbb{R}_{>0} \mbox{ such that } f(x) \leq m \ast g(x)\ \forall x > x_0.$$This is just saying that $f$ eventually is at most a scalar multiple of $g$.
- Satisfiability problem (SAT problem)
Last edited: 2026-02-05# Statement
Sat ProblemGiven a boolean function $f$ in CNF with $n$ variables and $m$ clauses. Is there a true/false assignment to the $n$ variables that satisfies $f$. If yes then output it, otherwise say no.
- Side effect
Last edited: 2023-11-11In programming, a side effect of a subroutine is the effect it has on things outside its domain. For example updating a database, changing a global variable, or changing an input mutable variable. It is good practice to avoid doing this at all costs. It makes your code harder to understand and debugging even worse!
- Signed or unsigned integers
Last edited: 2023-11-11# Signed integers
When storing integers in a computer, you can do it with a ‘sign’ or without it. This essentially just means you designate one bit at the start of the representation of that integer which indicates it is either positive or negative number.
- Simplex method (linear programme)
Last edited: 2023-11-11- Simulated Annealing
Last edited: 2024-02-24Suppose we have a optimisation problem where $A$ has some sense of neighbourhood $N(a)$ for each point $a \in A$.
Then we want to emulate Hill climbing but probabilistically allow for random walking. We do this by randomly deciding to take moves from $x$ to $x_t$ based on the temperature of the algorithm at that time $T$ this is given by
$$P(x,x_t,T) = \begin{cases} 1 & \mbox{if } f(x_t) \geq f(x)\\ \exp{\frac{f(x_t) - f(x)}{T}} & \mbox{otherwise.} \end{cases}$$Then we just modify what is done in Hill climbing with this and gradually decrease the temperature and we have simulated annealing.
- Single linkage clustering
Last edited: 2024-03-09Suppose we are in the clustering problem set up. Here we are given the number of desired clusters $k$, and the sense of distance $d$.
The idea will be to make every element in our training data $T$ its own cluster. Then we take the union of two clusters which are considered the closest. For this linkage clustering we use the minimum distance of points in the cluster. i.e.
- Single Responsibility Principle (SRP)
Last edited: 2023-11-11This is one of the SOLID principles and it states:
“A module should be responsible to one, and only one, actor.”
- Singleton
Last edited: 2023-11-11The Singleton pattern is a creational design pattern that is used to ensure that only one instance of a class is created throughout the entire lifetime of an application. This pattern is particularly useful in situations where multiple objects of a class could cause issues with data consistency, resource allocation, or system performance.
To implement the Singleton pattern, a class must have a private constructor so that no other objects can be instantiated, and a static method that allows access to a single instance of the class. This static method will typically create an instance of the class if one does not exist, and return the existing instance if it already exists.
- Soft clustering
Last edited: 2024-03-09Suppose we are in the clustering problem set up. Soft clustering will use Expectation Maximisation with the Gaussian distribution . Here we fix a $\sigma^2$ and then we have
$$ \mathbb{P}[X=x_i \vert \mu = \mu_j] = \exp\left[- \frac{1}{2} \sigma^2 (x_i - \mu_j)^2 \right]. $$We will use a mean to optimise $\mu_j$ at each step.
# Correctness
- It will have all the downsides of Expectation Maximisation but we can use restarts to over come them.
- SOLID principles
Last edited: 2023-11-11The SOLID principles are a set of five design guidelines for writing maintainable, scalable, and robust software in object-oriented programming. These principles were introduced by Robert C. Martin and are an acronym formed from the first letters of each principle:
- Single Responsibility Principle (SRP) : A class or module should have only one reason to change, meaning it should have a single responsibility or job. By adhering to this principle, developers can create more modular and focused code, making it easier to understand, maintain, and modify.
- Open/Closed Principle (OCP) : Software entities (classes, modules, functions, etc.) should be open for extension but closed for modification. This principle encourages creating code that can be easily extended with new functionality without altering the existing code, which reduces the risk of introducing new bugs and improves maintainability.
- Liskov Substitution Principle (LSP) : Subtypes must be substitute for their base types, meaning objects of a derived class should be able to replace objects of the base class without affecting the correctness of the program. By following this principle, developers can create more reusable and flexible code that adheres to the proper inheritance hierarchy.
- Interface Segregation Principle (ISP) : Clients should not be forced to depend on interfaces they do not use. This principle suggests that interfaces should be small, focused, and specific to a particular client’s needs, resulting in more modular and easier-to-understand code.
- Dependency Inversion Principle (DIP) : High-level modules should not depend on low-level modules; both should depend on abstractions. Additionally, abstractions should not depend on details; details should depend on abstractions. This principle promotes the use of abstractions (interfaces or abstract classes) to reduce the coupling between software components, making the system more flexible, maintainable, and easier to change.
By adhering to the SOLID principles, developers can create more robust, maintainable, and scalable software systems in object-oriented programming.
- Sorting problem
Last edited: 2026-02-05Sorting problemGiven an unsorted list $A = [a_1, \ldots, a_n]$ of $n$ numbers can you return a sorted list for a given comparison.
- Spanning Tree Protocol (STP)
Last edited: 2024-05-23The spanning tree protocol was introduced to get rid of cycles in networks that used bridges and hubs . It does this by forming a spanning tree in the network so that the layer 2 devices use to send messages through.
This is a distributed algorithm and is computed locally on each device in the network for their neighbours. Each device will have a unique ID and the tree will be formed by picking the root of the tree to be the device with the lowest ID. Then each device working out its nearest neighbour to the root of the tree.
- Special case pattern
Last edited: 2023-11-11The Special Case Pattern is a design pattern used in software development to handle exceptional conditions without resorting to conditional logic scattered throughout the code. Essentially, it allows you to model what would be a “special case” as an object of the same type as the typical cases. By doing so, you can reduce the need for conditional logic to check for these cases, making the code more maintainable and easier to understand.
- Special functions
Last edited: 2023-11-11This are also known as dunder (double underscore) or magic functions. They are signified by two underscores before and after its name i.e. __len__.
These functions are not normally called by the user, the are for the python interpreter to use. For example if you call
item in some_iterable: passto get the list of items this calls
some_iterable.__iter__(). The same forlen(object)this callsobject.__len__().- st-cut
Last edited: 2026-02-05st-cutGiven a flow network $(G, c, s, t)$, an st-cut is a cut $(L, R)$ of $G$ where $s \in L$ and $t \in R$. The capacity of this cut is
- Stack (OS)
Last edited: 2026-02-05 - Adjacency list format (graph)