
Vector Space and Matrices



Linear Subspace

  • The zero element has to be part of the subspace.

  • A linear combination (multiplication by a scalar, addition of two elements) of any two elements belonging to the subspace also needs to belong to the subspace.

Spanning

To span a complete $n$-dimensional space, we need at least $n$ independent vectors of dimension $n$. For example, to span the complete $\mathbb{R}^n$, you need at least $n$ independent vectors $x$ such that $x \in \mathbb{R}^n$.

Determining number of linear dependences

Let's say you have $K$ vectors $x_i \in \mathbb{R}^n$, with $K > n$. At most $n$ linearly independent vectors are needed to span the whole $\mathbb{R}^n$, so when you have $K$ vectors, some of them have to be linearly dependent on the others. If the $K$ vectors do span $\mathbb{R}^n$, there will be exactly $K - n$ linear dependences among them (in general, the number of dependences is $K$ minus the rank).
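
A quick numerical check of this counting, as a small sketch with NumPy (the example vectors are arbitrary):

```python
import numpy as np

# K = 4 vectors in R^3 (so K > n), stacked as columns of a 3x4 matrix.
vectors = np.array([[1.0, 0.0, 1.0, 2.0],
                    [0.0, 1.0, 1.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])

K = vectors.shape[1]                    # number of vectors
rank = np.linalg.matrix_rank(vectors)   # number of independent vectors (<= n)

print("independent vectors:", rank)     # 3
print("linear dependences: ", K - rank) # 1
```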

Linear Combinations are vectors!

Say you have three vectors $\vec{a}, \vec{b}, \vec{c}$ such that $\vec{a} + \vec{b} + \vec{c} = \vec{0}$; this is a linear dependence. But that linear combination is itself a vector, namely the $\vec{0}$ vector. So you can treat each linear dependence equation as a vector.

Null space

The span formed by the vectors of coefficients for which the linear combination of the given vectors is zero.

i.e. the null space of given $k$ vectors is basically the span of the vectors made up of the coefficients of the linear dependence equations among those $k$ vectors. One linear dependence equation gives one vector; hence if there are $m$ dependence equations among those $k$ vectors, the null space will be the span of those $m$ vectors.

Essentially, the null space of given $k$ vectors is the space of vectors that always give zero when those $k$ vectors are linearly combined with the entries of the vector as coefficients. i.e. let $A$ be the matrix formed by stacking the $k$ vectors as columns of $A$. Then the null space is the space such that any vector $x$ belonging to it gives $Ax = 0$, i.e. the null space is the space of all vectors $x$ such that $Ax = 0$, where $A$ is the matrix formed by the $k$ vectors. Note that if the $k$ vectors are linearly independent, then there is no non-trivial $x$ possible, hence the null space is just the zero vector.

You can also see the null space as defining the relationships among the columns of the matrix: the null space captures all the relations among the columns, i.e. it defines the linear dependences of the columns.
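
Below is a minimal sketch, using NumPy and SciPy with an arbitrary example matrix, showing that the null-space basis vectors satisfy $Ax = 0$ and encode the dependence among the columns:

```python
import numpy as np
from scipy.linalg import null_space

# Columns: c0, c1, c2 with c2 = c0 + c1 (one linear dependence).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [2.0, 3.0, 5.0]])

N = null_space(A)               # orthonormal basis of {x : Ax = 0}
print(N.shape)                  # (3, 1): one dependence -> 1-dimensional null space
print(np.allclose(A @ N, 0))    # True: every null-space vector x satisfies Ax = 0

# The basis vector is proportional to (1, 1, -1), i.e. c0 + c1 - c2 = 0.
print(N[:, 0] / N[0, 0])
```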

Spanning Set

A set of vectors that can express any other vector from the same vector space as a linear combination. There is some minimum number of vectors needed in the spanning set in order to express all the vectors in the vector space. For example: if you have one 2D vector, you can only express the vectors that lie on the same line as that vector. But if you have two 2D vectors which are not collinear, then you can span all the vectors in the plane using their linear combinations. A spanning set with the minimum number of vectors (those vectors have to be linearly independent) is called the basis.

Dimension of vector space

The minimum number of vectors needed in a spanning set in order to express all the vectors in that vector space (of course those vectors have to be linearly independent) is called the dimension of that vector space.

Basis

A spanning set with the minimum number of vectors needed to express all other vectors in the space is called a basis. A basis can express every vector as a linear combination, and does so uniquely. The number of vectors in the basis is the dimension of that vector space.

Way to Look at Linear system of Eq'ns

Null space vs Column Space

Null Space: Analyzing the null space of the given vectors helps us know the dependence between the vectors, i.e. whether the vectors are linearly dependent or independent.

Column Space: The column space is the span of the given vectors; it helps us analyse which vectors can be expressed by them.

The null space helps tell whether the system of equations has infinitely many solutions or a unique one, and the column space can help tell whether there is a unique solution or none at all.

One can easily see that $\text{dim}(R) + \text{dim}(N) =$ number of columns, where $R$ is the column space and $N$ is the null space.

Solving Systems of Eqn's: Related to Gaussian elimination

The point of Gaussian elimination is to make the relations among the given vectors easier to parse, or more evident. It helps to know the span of the vectors, or to see if the vectors are linearly independent/dependent, or to check if some vector is indeed in the span of the given vectors, etc.
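
A minimal sketch of this with SymPy's row reduction (the matrix is just an illustrative example); the reduced row echelon form makes the pivot columns and dependences explicit:

```python
import sympy as sp

# Three column vectors stacked into a matrix; the third is the sum of the first two.
A = sp.Matrix([[1, 0, 1],
               [0, 1, 1],
               [2, 3, 5]])

rref, pivot_cols = A.rref()     # reduced row echelon form via Gaussian elimination
print(rref)                     # dependences are now easy to read off
print(pivot_cols)               # (0, 1): columns 0 and 1 are independent, column 2 depends on them
```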

Matrix Multiplication

Column Perspective (What we have kind of been looking at until now)

Let's consider matrix multiplication of the form $Ax$, where $A$ is a matrix of some dimension and $x$ is a vector. The multiplication basically gives us a vector which is in the column space of $A$, specifically a linear combination of the columns of $A$ where the coefficients of the combination are given by $x$. $Ax$ can be seen as $x$ doing something to the matrix $A$ to get some output vector: $x$ decides the proportions in which the columns of $A$ are combined to get the new vector.

Now let's talk about multiplication of the form $AX$, where both $A, X$ are matrices. You can see this matrix multiplication as many products $Ax_i$, where each $x_i$ is a column of the matrix $X$. Hence all the columns of the resultant matrix are essentially some combination of the columns of matrix $A$, where the proportions/coefficients of the combination are decided by the columns of matrix $X$.
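
A small NumPy sketch of the column picture (example values are arbitrary), checking that $Ax$ is the combination of the columns of $A$ weighted by the entries of $x$, and that each column of $AX$ is $A$ times the corresponding column of $X$:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([10.0, -1.0])

# Ax as a linear combination of the columns of A with coefficients from x.
combo = x[0] * A[:, 0] + x[1] * A[:, 1]
print(np.allclose(A @ x, combo))                  # True

# For AX, each column of the result is A @ (corresponding column of X).
X = np.array([[1.0, 0.0],
              [2.0, 1.0]])
print(np.allclose((A @ X)[:, 0], A @ X[:, 0]))    # True
```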

Row perspective

The resultant matrix can be seen as linear combinations of the rows of the right-hand matrix, where the coefficients are given by the rows of the left-hand matrix.
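
And the analogous sketch for the row picture (again with arbitrary example matrices): each row of the product is a combination of the rows of the right-hand matrix, with coefficients taken from the corresponding row of the left-hand matrix.

```python
import numpy as np

L = np.array([[1.0, 2.0],
              [0.0, 3.0]])
R = np.array([[1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])

# Row 0 of L @ R = 1 * (row 0 of R) + 2 * (row 1 of R).
row0 = L[0, 0] * R[0, :] + L[0, 1] * R[1, :]
print(np.allclose((L @ R)[0, :], row0))   # True
```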

Matrix Inverse

  • $A$ and $A^{-1}$ always commute: $AA^{-1} = A^{-1}A = I$. (Note that the converse does not hold in general: commuting matrices need not be each other's inverses; for example, every matrix commutes with itself and with the identity.)

You can find the matrix inverse by using Gaussian elimination (augment $A$ with the identity and row-reduce).
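
A quick sketch (NumPy, arbitrary invertible matrix) checking that $A$ and $A^{-1}$ commute and multiply to the identity:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])
A_inv = np.linalg.inv(A)

I = np.eye(2)
print(np.allclose(A @ A_inv, I))    # True
print(np.allclose(A_inv @ A, I))    # True: multiplication with the inverse commutes
```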

Transpose and Symmetric Matrices

Columns become rows when transposed. When transposing, it can be useful to use the row perspective.

  • Can be used to describe symmetric matrices, i.e. $A = A^T$.

  • $AA^T$ is always a symmetric matrix. This is easy to see since $A^T$ just has rows and columns interchanged, so entry $(i,j)$ of $AA^T$ is the dot product of rows $i$ and $j$ of $A$, which equals entry $(j,i)$.
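
A quick numerical sketch of this (random example matrix):

```python
import numpy as np

A = np.random.default_rng(0).normal(size=(3, 2))   # any (even non-square) matrix
S = A @ A.T
print(np.allclose(S, S.T))    # True: A A^T is always symmetric
```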

x^TAy

Properties of $x^TAy$.

A=LU Decomposition

You can see solving $A = LU$ as trying to find $L^{-1}$ such that applying it on the left-hand side of $A$, i.e. doing row operations on $A$, will get you the matrix $U$.

Similarly, you can see it as trying to find $U^{-1}$ such that applying it on the right-hand side of $A$, i.e. doing column operations on $A$, will get you the matrix $L$.
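
A small sketch with SciPy's LU routine (note that `scipy.linalg.lu` returns a permuted factorization $A = PLU$, i.e. it may include row exchanges on top of the plain $A = LU$ described above):

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[4.0, 3.0],
              [6.0, 3.0]])

P, L, U = lu(A)                      # A = P @ L @ U
print(np.allclose(A, P @ L @ U))     # True
print(L)                             # lower triangular with unit diagonal
print(U)                             # upper triangular
```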

Determinant

The determinant is essentially a way to find out whether the columns of the matrix, i.e. the spanning vectors of the column space, are linearly dependent or not. If the determinant of the matrix is zero, it means that the columns are linearly dependent.

It is also a way of telling whether a system of equations will have infinitely many solutions or a unique one: if it has infinitely many solutions, the determinant is zero.

  • A matrix with determinant zero is called a singular matrix.

  • The determinant of an upper or lower triangular matrix is given by the product of the diagonal entries of the matrix.

There is a geometric analogue of the determinant as well. In 2D the determinant gives the (signed) area of the parallelogram spanned by the columns of the matrix, and in 3D it gives the volume of the parallelepiped.
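
These properties are easy to check numerically; a small sketch with arbitrary example matrices:

```python
import numpy as np

# Singular matrix: second column = 2 * first column, so det is (numerically) zero.
S = np.array([[1.0, 2.0],
              [3.0, 6.0]])
print(np.isclose(np.linalg.det(S), 0.0))         # True

# Triangular matrix: determinant = product of diagonal entries.
T = np.array([[2.0, 5.0, 1.0],
              [0.0, 3.0, 4.0],
              [0.0, 0.0, 7.0]])
print(np.isclose(np.linalg.det(T), 2 * 3 * 7))   # True

# 2D geometric meaning: |det| = area of the parallelogram spanned by the columns.
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])
print(abs(np.linalg.det(A)))   # 6.0: area of the parallelogram with sides (3,0) and (1,2)
```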

A matrix is invertible only if its columns are linearly independent (equivalently, its rows are linearly independent).

https://youtu.be/fNpPrRNq8DU?list=PLlXfTHzgMRUKXD88IdzS14F4NxAZudSmv
https://youtu.be/OIEEt8SuQYk?list=PLlXfTHzgMRULWJYthculb2QWEiZOkwTSU