# Self Organizing Maps (SOM) in NetLogo

Last modified: April 23, 2017


Self Organizing Maps, or SOM, were created in 1982 by Teuvo Kohonen, professor at the Academy of Finland, and provide a way to represent multidimensional data (vectors) in lower dimensional spaces, usually 2D. In general, techniques for reducing vector dimensionality are data compression techniques, also known as Vector Quantization. One important feature of Kohonen's technique is that it also preserves some topological properties of the original data.

A typical example to see SOM working is the projection of colors (associated with 3D vectors, for example the RGB components) onto a 2D space. The next figure shows a SOM trained to recognize the 8 colors on the left. In the resulting 2D representation, similar colors are located in adjacent regions.

As a machine learning algorithm, SOM belongs to the smaller family of unsupervised methods, which means there is no feedback about a target to reach; instead, it produces a distribution of the vectors according to their similarities. From a technical point of view, it works with a network of nodes in a way similar to artificial neural networks.

## Network Architecture

The SOM algorithm has a 2-layer network architecture: a layer of learning nodes, where we care about the geometrical-topological relation between them and which will finally contain the resulting representation; and a layer of input nodes, through which the original vectors are fed to the network during the training process. In addition, there are connections between every node of one layer and every node of the other.

The next figure shows a possible 2D architecture for SOM training. As in other similar models, the idea of the algorithm is to find the correct weights for the connections between layers, so that the geometrical structure of learning nodes gives a correct representation of the input data.

In fact, since we are not concerned with the geometrical or topological arrangement of the input nodes, it is usual to work only with the representation of the learning nodes, and hence the weights are shown as vectors stored in these nodes (every element of this vector is the weight of the connection with the corresponding input node). In this way, if the input layer has size $$n$$ (the dimension of the original space), then every weight vector, $$W$$, will have dimension $$n$$.

In our first NetLogo implementation we will use patches as learning nodes, and we will store in a custom variable the weights of the connections with the 3 input nodes (3D, RGB colors).

patches-own [
  weight
]


The training vectors will be stored in a global variable:

; TSet stores the Training Set
globals [
  TSet
]


## Learning Algorithm

Roughly, since there is no target vector to approximate, what happens is this: when a node of the network has a weight vector close to an input vector, the surrounding learning nodes move their respective weights toward that same value. In this way, starting from an initial distribution of weights (usually a random one), the SOM network converges toward a stable weight distribution. Each of these stabilized areas becomes a feature classifier. Once the network stabilizes, any new vector will stimulate the area with similar weights.

The Setup procedure initially assigns a weight to every node. This can be done randomly, or using a grey gradient, selected by means of the Init chooser control in the interface.

to setup
  ca
  ifelse (Init = "Random")
  [ ; Associate a random vector to every patch
    ask patches [
      set weight n-values 3 [random-float 1]
      set pcolor rgb (255 * item 0 weight) (255 * item 1 weight) (255 * item 2 weight) ] ]
  [ ; Associate a gradient grey vector to every patch
    ask patches [
      set pcolor scale-color black ((pxcor + pycor + 2 * max-pxcor) / (4 * max-pxcor)) 0 1
      set weight map [? / 255] extract-rgb pcolor ] ]
  ; Create a random list of random RGB-colors
  set TSet n-values TSet-size [n-values 3 [random-float 1]]
  reset-ticks
end


In more detail, the steps of the training process are:

1. Every node is initialized with a (random) weight, usually a vector from $$[0,1]^n$$.

2. A vector from the input data is selected, $$V$$.

3. The node with the weight most similar to $$V$$ is selected; we will call it the Best Matching Unit (BMU). To find it, we simply compute the distances between $$V$$ and the weight vectors of all the learning nodes (for efficiency, we skip the square root, since we only need to know where the minimum is).

4. Compute the radius of the BMU neighbourhood. This radius starts with a large value (enough to cover the whole network) and decreases on every iteration of the algorithm.

5. Every node in the BMU neighbourhood adjusts its weights to get closer to $$V$$, in such a way that nodes closer to the BMU are modified more strongly.

6. Repeat from step 2 (the number of iterations deemed necessary).
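The six steps above can be sketched in a language-agnostic way. The following Python/NumPy version is only an illustration of the algorithm, not part of the NetLogo model: all names (`train_som`, `grid`, `n_iters`, `r0`, `lr0`) are made up for the example, and it uses a Gaussian influence inside the shrinking radius, which is one common choice.

```python
import numpy as np

def train_som(data, grid=10, dim=3, n_iters=200, r0=5.0, lr0=0.1, seed=0):
    """Sketch of the SOM training loop; weights[i, j] is node (i, j)'s weight vector."""
    rng = np.random.default_rng(seed)
    weights = rng.random((grid, grid, dim))            # step 1: random init in [0,1]^n
    lam = n_iters / np.log(r0)                         # decay constant for the radius
    ii, jj = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")
    for t in range(n_iters):
        v = data[rng.integers(len(data))]              # step 2: pick an input vector
        d2 = ((weights - v) ** 2).sum(axis=2)          # step 3: squared distances (no sqrt)
        bi, bj = np.unravel_index(d2.argmin(), d2.shape)  # BMU coordinates
        r = r0 * np.exp(-t / lam)                      # step 4: shrinking radius
        grid_d2 = (ii - bi) ** 2 + (jj - bj) ** 2      # squared grid distance to the BMU
        influence = np.exp(-grid_d2 / (2 * r ** 2))    # step 5: closer nodes move more
        influence[grid_d2 > r ** 2] = 0.0              # only nodes inside the radius
        lr = lr0 * np.exp(-t / lam)                    # learning rate also decays
        weights += lr * influence[..., None] * (v - weights)  # step 6: loop again
    return weights

colors = np.random.default_rng(1).random((8, 3))       # 8 random RGB vectors in [0,1]^3
w = train_som(colors)
```

Since every update is a convex combination of a weight and an input vector, the trained weights stay inside $$[0,1]^3$$ and can be painted directly as RGB colors.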

This algorithm translates directly into NetLogo. We only code one iteration step, because it will be associated with a Forever button, but it stops automatically after Training-Time steps (a global variable from the interface):

to SOM  ; One cycle considers all the vectors from TSet
  foreach (shuffle TSet)
  [
    let V ?
    let W BMU V
    ask W [
      ; Every node in the BMU neighbourhood adapts its weight
      ; (R reports the neighbourhood radius at the current tick, see below)
      ask patches in-radius (R ticks)
      [ ; Compute the new weight
        set weight new-weight weight V ticks
        ; Adapt the color to the new weight
        set pcolor rgb (255 * item 0 weight) (255 * item 1 weight) (255 * item 2 weight) ] ]
  ]
  tick
  if ticks > Training-Time [stop]
end


BMU can be computed easily as:

; Best Matching Unit: Closest weight-patch to V
to-report BMU [V]
  report min-one-of patches [dist weight V]
end

; Euclidean Distance function (without square root)
to-report dist [v1 v2]
  report sum (map [(?1 - ?2) ^ 2] v1 v2)
end
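For readers less familiar with NetLogo, the two reporters above can be mirrored in plain Python. This sketch is only illustrative (the names `dist2` and `bmu` are ours); it also shows why skipping the square root is safe: the minimum is reached at the same node either way, because the square root is monotone.

```python
def dist2(v1, v2):
    # squared Euclidean distance, as in the NetLogo `dist` reporter (no sqrt)
    return sum((a - b) ** 2 for a, b in zip(v1, v2))

def bmu(weights, v):
    # index of the weight vector closest to v, as in the `BMU` reporter
    return min(range(len(weights)), key=lambda i: dist2(weights[i], v))

ws = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
print(bmu(ws, [1.0, 0.0, 0.1]))  # prints 1: the weight closest to near-pure red
```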


The formula that establishes the radius as a function of the iteration (it decreases, but not linearly) is:

$$r(t) = r_0 \, e^{-\frac{t}{\lambda}}$$

where $$r_0$$ is the initial radius and $$\lambda$$ is a time constant that controls how fast the neighbourhood shrinks.
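A quick numeric sketch of this decay (in Python). The choice $$\lambda = T / \ln r_0$$, which makes the radius shrink to exactly 1 at the end of training, is a common convention and an assumption here, not something fixed by the formula itself:

```python
import math

r0, training_time = 10.0, 100
lam = training_time / math.log(r0)   # assumption: radius reaches 1 at t = training_time

def r(t):
    # exponentially shrinking neighbourhood radius r(t) = r0 * e^(-t / lambda)
    return r0 * math.exp(-t / lam)

print(round(r(0), 3), round(r(50), 3), round(r(100), 3))  # 10.0 3.162 1.0
```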