CS231n Study Notes

7/5/2026

Basics

5-fold cross validation

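Split the dataset into 5 parts: d1, d2, d3, d4, d5, and run 5 rounds. In round 1, d1 is the validation set and the rest form the training set, which gives one accuracy. In round 2, d2 is the validation set and the rest form the training set, which gives another accuracy. Continue the same way for d3, d4, and d5, then average the 5 accuracies.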
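The rounds described above can be sketched in NumPy as follows. This is a minimal sketch, not the course's reference code: `k_fold_indices`, `cross_validate`, and the `evaluate` callback (train on one split, return accuracy on the other) are hypothetical names introduced for illustration, and the shuffling before splitting is an assumption.

```python
import numpy as np

def k_fold_indices(num_samples, k=5, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(num_samples), k)

def cross_validate(X, y, evaluate, k=5, seed=0):
    """Run k rounds; round i validates on fold i and trains on the other folds.

    `evaluate(X_train, y_train, X_val, y_val)` is a hypothetical callback that
    trains a model and returns its validation accuracy.
    """
    folds = k_fold_indices(len(X), k, seed)
    accuracies = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        acc = evaluate(X[train_idx], y[train_idx], X[val_idx], y[val_idx])
        accuracies.append(acc)
    # The mean over the k rounds is the cross-validation estimate.
    return float(np.mean(accuracies)), accuracies
```

The mean is typically what gets compared when choosing a hyperparameter such as k in k-NN.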

Approximate Nearest Neighbor (ANN)

FLANN: https://github.com/mariusmuja/flann

t-SNE

t-Distributed Stochastic Neighbor Embedding (t-SNE) https://lvdmaaten.github.io/tsne/

NCA

https://en.wikipedia.org/wiki/Neighbourhood_components_analysis https://kevinzakka.github.io/2020/02/10/nca/

Random Projection

https://scikit-learn.org/stable/modules/random_projection.html

Vectorized L2 distance

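import numpy as np

# X: (num_test, D) test points; self.X_train: (num_train, D) training points.
# Expand ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y and let broadcasting build the full matrix.
xx = np.sum(X**2, axis=1).reshape(-1, 1)  # (num_test, 1)
yy = np.sum(self.X_train**2, axis=1)      # (num_train,)
xy = X.dot(self.X_train.T)                # (num_test, num_train)
# Round-off can make the squared distance slightly negative, so clamp at 0 before sqrt.
dists = np.sqrt(np.maximum(xx + yy - 2 * xy, 0))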
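To check the vectorized form, it can be compared against the plain double loop. The sketch below wraps the same computation in a standalone function; `pairwise_l2` and `pairwise_l2_loops` are names assumed here for illustration, not part of the assignment code.

```python
import numpy as np

def pairwise_l2(X, X_train):
    """Vectorized L2 distances: result[i, j] = ||X[i] - X_train[j]||."""
    xx = np.sum(X**2, axis=1).reshape(-1, 1)   # (num_test, 1)
    yy = np.sum(X_train**2, axis=1)            # (num_train,)
    xy = X.dot(X_train.T)                      # (num_test, num_train)
    # Clamp at 0 so floating-point round-off never reaches sqrt as a negative.
    return np.sqrt(np.maximum(xx + yy - 2 * xy, 0.0))

def pairwise_l2_loops(X, X_train):
    """Reference version with two explicit loops, used only for checking."""
    dists = np.zeros((X.shape[0], X_train.shape[0]))
    for i in range(X.shape[0]):
        for j in range(X_train.shape[0]):
            dists[i, j] = np.sqrt(np.sum((X[i] - X_train[j]) ** 2))
    return dists
```

Both functions should agree up to round-off, while the vectorized one replaces the Python loops with a single matrix product.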

Multi-class SVM

$L_i = \sum_{j \neq y_i} \max(0, s_j - s_{y_i} + \Delta)$

Typically, $\Delta = 1$.

$s_j$ is the score of a negative (incorrect) class, and $s_{y_i}$ is the score of the positive (correct) class.
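The loss for a single example can be sketched directly from the formula. The function name `svm_loss_single` and its argument names are assumptions made for this illustration.

```python
import numpy as np

def svm_loss_single(scores, y_i, delta=1.0):
    """Multi-class SVM (hinge) loss L_i for one example.

    scores: 1-D array of class scores s; y_i: index of the correct class.
    """
    scores = np.asarray(scores, dtype=float)
    margins = np.maximum(0.0, scores - scores[y_i] + delta)
    margins[y_i] = 0.0  # the sum skips j == y_i
    return float(np.sum(margins))
```

For example, with scores `[3.2, 5.1, -1.7]` and correct class 0, the two terms are max(0, 5.1 - 3.2 + 1) = 2.9 and max(0, -1.7 - 3.2 + 1) = 0, so the loss is 2.9.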

HOG

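Histogram of Oriented Gradients (HOG) reduces dimensionality and picks out object edges. It is a traditional image-processing method; a CNN, by comparison, learns its filters automatically during training to do this kind of detection.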
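The core idea can be sketched as a magnitude-weighted histogram of gradient orientations. This is a heavily simplified illustration, not full HOG: it builds one histogram for the whole image and omits the cells, blocks, and block normalization. The function name `orientation_histogram` and the choice of 9 bins are assumptions.

```python
import numpy as np

def orientation_histogram(image, num_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    image: (H, W) grayscale array. Simplification: a single histogram for the
    whole image, without the cells and block normalization of full HOG.
    """
    gy, gx = np.gradient(image.astype(float))       # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation in degrees
    hist, _ = np.histogram(angle, bins=num_bins, range=(0.0, 180.0), weights=magnitude)
    return hist
```

An image whose intensity only changes from left to right puts all of its weight into the first bin (orientation near 0 degrees), which shows how the feature summarizes edge direction in a handful of numbers.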

Color Histogram

For values 0-255, count how many pixels take each color value and use the counts as the image's feature.
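A minimal sketch of this feature, assuming an 8-bit image with shape (H, W, C) and one histogram per channel; the per-channel layout and the name `color_histogram` are assumptions, since the notes do not say which color space or binning is used.

```python
import numpy as np

def color_histogram(image, bins=256):
    """Concatenate per-channel pixel-value counts into one feature vector.

    image: (H, W, C) array with values in 0..255.
    """
    feats = []
    for c in range(image.shape[2]):
        counts, _ = np.histogram(image[:, :, c], bins=bins, range=(0, 256))
        feats.append(counts)
    return np.concatenate(feats)
```

With the default 256 bins every value gets its own bin; a smaller `bins` gives a coarser, shorter feature.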

Bag of words

Borrowed from NLP: sample random patches, cluster them with k-means, then count how many patches fall into each cluster and use the counts as the feature.
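The pipeline above can be sketched end to end in NumPy. Everything here is an illustrative assumption: grayscale (H, W) images, raw pixel patches as descriptors, a plain Lloyd's k-means written inline, and the helper names `sample_patches`, `assign`, `kmeans`, and `bow_feature`.

```python
import numpy as np

def sample_patches(image, patch_size, num_patches, rng):
    """Cut random square patches from a (H, W) image and flatten each one."""
    H, W = image.shape
    rows = rng.integers(0, H - patch_size + 1, size=num_patches)
    cols = rng.integers(0, W - patch_size + 1, size=num_patches)
    return np.stack([image[r:r + patch_size, c:c + patch_size].ravel()
                     for r, c in zip(rows, cols)]).astype(float)

def assign(points, centers):
    """Index of the nearest center (squared L2) for every point."""
    d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(points, k, num_iters, rng):
    """Plain Lloyd's k-means; an empty cluster keeps its previous center."""
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(num_iters):
        labels = assign(points, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers

def bow_feature(image, centers, patch_size, num_patches, rng):
    """Count how many patches of `image` fall into each cluster."""
    patches = sample_patches(image, patch_size, num_patches, rng)
    return np.bincount(assign(patches, centers), minlength=len(centers))
```

The cluster centers play the role of the vocabulary: they are learned once from patches pooled over the training images, and each image is then described by its histogram over that vocabulary.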

CNN

YOLO/SSD

SSD

Mask R-CNN

Image segmentation

semantic/instance segmentation

upsampling/unpooling

bed-of-nails upsampling
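A minimal sketch of this unpooling step for a single 2-D feature map: each input value is placed in the top-left corner of its output block and the rest is filled with zeros. The function name and the default stride of 2 are assumptions.

```python
import numpy as np

def bed_of_nails_upsample(x, stride=2):
    """Upsample a (H, W) map by writing each value into the top-left corner
    of a stride x stride block and leaving the other positions at zero."""
    H, W = x.shape
    out = np.zeros((H * stride, W * stride), dtype=x.dtype)
    out[::stride, ::stride] = x
    return out
```

A 2x2 input `[[1, 2], [3, 4]]` becomes a 4x4 output with 1, 2, 3, 4 at positions (0, 0), (0, 2), (2, 0), (2, 2) and zeros elsewhere.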

Transpose convolution
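A 1-D sketch of the operation: each input value scales a copy of the filter, the copies are placed `stride` positions apart in the output, and overlapping positions are summed. The function name `conv_transpose_1d` is an assumption, and padding and output cropping are left out.

```python
import numpy as np

def conv_transpose_1d(x, w, stride=1):
    """1-D transpose convolution without padding.

    Output length is (len(x) - 1) * stride + len(w).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    out = np.zeros((len(x) - 1) * stride + len(w))
    for i, v in enumerate(x):
        # Each input value contributes a scaled copy of the filter.
        out[i * stride:i * stride + len(w)] += v * w
    return out
```

With input `[a, b]`, filter `[x, y, z]`, and stride 2, the output is `[ax, ay, az + bx, by, bz]`: the middle entry is where the two filter copies overlap.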

Interpretability

guided backprop

Gradient Ascent

DeepDream

Amplify existing features

GAN

Generative model
- Explicit density
  - Tractable density
    - PixelRNN/CNN
  - Approximate density
    - Variational
      - Variational Autoencoder
    - Markov Chain
      - Boltzmann machine
- Implicit density
  - Direct
    - GAN
  - Markov Chain
    - GSN

Autoencoder

encoder: conv; decoder: upconv

conv --> upconv

GAN

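Alternate between gradient ascent on the discriminator and gradient ascent on the generator (the generator uses the alternative objective of maximizing the likelihood that the discriminator is wrong).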
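Written out, the two updates look as follows. This is a sketch of the standard GAN objectives; the parameter names $\theta_d$, $\theta_g$ and the noise prior $p(z)$ are notation assumed here, not taken from the notes.

Minimax game:

$\min_{\theta_g} \max_{\theta_d} \left[ \mathbb{E}_{x \sim p_{data}} \log D_{\theta_d}(x) + \mathbb{E}_{z \sim p(z)} \log\left(1 - D_{\theta_d}(G_{\theta_g}(z))\right) \right]$

Gradient ascent on the discriminator:

$\max_{\theta_d} \left[ \mathbb{E}_{x \sim p_{data}} \log D_{\theta_d}(x) + \mathbb{E}_{z \sim p(z)} \log\left(1 - D_{\theta_d}(G_{\theta_g}(z))\right) \right]$

Gradient ascent on the generator, used in place of gradient descent on $\log(1 - D(G(z)))$ because it gives a stronger gradient while the generated samples are still poor:

$\max_{\theta_g} \mathbb{E}_{z \sim p(z)} \log D_{\theta_d}(G_{\theta_g}(z))$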

Reinforcement Learning

Bellman equation

Q Learning
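Q-learning turns the Bellman optimality equation $Q^*(s, a) = \mathbb{E}\left[r + \gamma \max_{a'} Q^*(s', a')\right]$ into an iterative update. The sketch below shows one tabular update step; the function name, the table layout `Q[state, action]`, and the default `alpha` and `gamma` values are assumptions for illustration.

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9, done=False):
    """One tabular Q-learning step, applied in place to Q[state, action].

    The target is r + gamma * max_a' Q[s_next, a'], or just r when the
    episode ends at s_next.
    """
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

For example, with `Q[0, 1] = 0`, `max(Q[1]) = 2`, reward 1, `alpha = 0.5`, and `gamma = 0.9`, the target is 1 + 0.9 * 2 = 2.8 and the entry moves halfway there, to 1.4.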

Todo

Recurrent Attention Model

RAM, glimpse