Unsupervised QPC - Clustering
Clustering with Data Relabelling
The UQPC index:
<latex> UQPC(\vec{w})=\sum_{i=1}^n\sum_{j=1}^k \alpha_{ij} G\left( \vec{w}(\vec{x}_i - \vec{t}_j) \right) </latex>
The coefficients <latex>\alpha_{ij}</latex> depend on the distance of the vector <latex>\vec{x}_i</latex> from the prototype (after projection onto the direction <latex>\vec{w}</latex>).
Each vector is assigned the label of its nearest prototype, and the index is then computed with the standard QPC method.
<latex> \alpha_{ij}>0 \qquad \text{if} \qquad \vec{t}_j : j=\arg \min_l{|\vec{w}(\vec{x}_i-\vec{t}_l) |} </latex>
<latex> \alpha_{ij}<0 \qquad \text{if} \qquad \vec{t}_j : j\ne\arg \min_l{|\vec{w}(\vec{x}_i-\vec{t}_l) |} </latex>
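The relabelling rule above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code behind `uqpc_train`: the Gaussian window with a `width` parameter, the constant magnitudes `a_plus` and `a_minus` used for <latex>\alpha_{ij}</latex>, prototypes given as vectors in the input space, and all function names are choices made for this example only.

```python
import math

def dot(u, v):
    """Scalar product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def gauss_window(z, width=1.0):
    """Localized window G; a Gaussian bump is ASSUMED here."""
    return math.exp(-(z / width) ** 2)

def nearest_projected(w, x, prototypes):
    """Index j minimising |w.(x - t_j)|, i.e. nearest prototype after projection."""
    wx = dot(w, x)
    dists = [abs(wx - dot(w, t)) for t in prototypes]
    return min(range(len(prototypes)), key=dists.__getitem__)

def uqpc_index(w, data, prototypes, a_plus=1.0, a_minus=1.0, width=1.0):
    """Sum over vectors i and prototypes j of alpha_ij * G(w.(x_i - t_j)).

    alpha_ij = +a_plus for the nearest projected prototype and -a_minus
    for every other one (constant magnitudes are an ASSUMPTION).
    """
    total = 0.0
    for x in data:
        wx = dot(w, x)
        j_star = nearest_projected(w, x, prototypes)
        for j, t in enumerate(prototypes):
            alpha = a_plus if j == j_star else -a_minus
            total += alpha * gauss_window(wx - dot(w, t), width)
    return total

# Toy data: two tight groups along the diagonal, one prototype per group.
data = [(-1.0, -1.0), (-1.2, -0.8), (1.0, 1.0), (0.9, 1.1)]
prototypes = [(-1.0, -1.0), (1.0, 1.0)]
s = 2 ** -0.5
good = uqpc_index((s, s), data, prototypes)    # direction separating the groups
bad = uqpc_index((s, -s), data, prototypes)    # direction collapsing them
print(good, bad)
```

On this toy data the separating direction scores close to 4 (one unit per vector), while the orthogonal direction scores 0 because both prototypes project onto the same point and their contributions cancel.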
Another variant may take into account the positions of the prototypes in the original space <latex>R^n</latex>, e.g.:
<latex> \alpha_{ij}>0 \qquad \text{if} \qquad \vec{t}_j : j=\arg \min_l{||\vec{x}_i-\vec{t}_l ||} </latex>
<latex> \alpha_{ij}<0 \qquad \text{if} \qquad \vec{t}_j : j\ne\arg \min_l{||\vec{x}_i-\vec{t}_l ||} </latex>
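The two labelling rules can assign the same vector to different prototypes. The short sketch below shows one such case; the helper names and the example coordinates are hypothetical and serve only to illustrate the difference between the projected distance and the Euclidean distance in <latex>R^n</latex>.

```python
import math

def dot(u, v):
    """Scalar product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def nearest_projected(w, x, prototypes):
    """Index j minimising |w.(x - t_j)| (distance measured along w)."""
    return min(range(len(prototypes)),
               key=lambda j: abs(dot(w, x) - dot(w, prototypes[j])))

def nearest_original(x, prototypes):
    """Index j minimising the Euclidean norm ||x - t_j|| in the input space."""
    return min(range(len(prototypes)),
               key=lambda j: math.dist(x, prototypes[j]))

# Hypothetical example: the projection onto w ignores the second coordinate.
w = (1.0, 0.0)
x = (0.1, 3.0)
prototypes = [(0.0, 0.0), (1.0, 3.0)]

j_proj = nearest_projected(w, x, prototypes)   # compares 0.1 with 0.9
j_orig = nearest_original(x, prototypes)       # compares about 3.0 with 0.9
print(j_proj, j_orig)
```

Here the projected rule picks the first prototype, because only the first coordinate is visible along `w`, while the original-space rule picks the second one, so the sign of <latex>\alpha_{ij}</latex> for this vector flips between the two variants.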
Tests on simple data
Config: example configuration. The only parameter varied in these tests was the number of prototypes K.
>> [w q p]=uqpc_train(data,2,'dataName','iris')
uqpc_parameters =
K: 2
uqpc_initiations: 10
qpc_parameters =
beta: 0.1000
checkPeriod: 5
dataName: 'dataname'
directions: 2
display: 'none'
eps: 1.0000e-03
function: 'gauss'
indGmax: []
initiations: 10
initWeights: []
killPeriod: 10
killRatio: 0.5000
lambda: 0.1000
learningRate: 0.1000
log: 'off'
logFileName: []
maxIterations: 1000
multistart: 'no'
OptConf: []
OptMethod: 'gd'
orthogonalizationMethod: 'projection'
ortoWeights: []
plot: 'none'
plr: 0.1000
prototypes: []
QPCMethod: 'uqpc1'
save: 'none'
savedir: []
stopCriterium: 2
Gauss2 - First two projections
Artificial data containing vectors drawn from a normal distribution
Features: 3 Instances: 400 Source: artificial data
Description: two Gaussian clusters (no overlapping). 200 vectors drawn from the distribution N([-1 -1 -1];[0.4 0.4 0.4]) and another 200 from N([+1 +1 +1];[0.4 0.4 0.4]).
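A data set of this shape can be regenerated with a few lines of Python. The sketch below is an approximation, not the original generator: it assumes that the second argument of N(...) is a per-feature standard deviation (it could equally be a variance) and that features are independent, and the function name `make_gauss2` and the fixed seed are hypothetical.

```python
import random

def make_gauss2(n_per_cluster=200, spread=0.4, seed=0):
    """Two 3-D Gaussian clusters centred at (-1,-1,-1) and (+1,+1,+1).

    `spread` is treated as the standard deviation of every feature
    (an ASSUMPTION about the notation N(mean; [0.4 0.4 0.4])).
    """
    rng = random.Random(seed)
    centers = [(-1.0, -1.0, -1.0), (1.0, 1.0, 1.0)]
    data, labels = [], []
    for label, center in enumerate(centers, start=1):
        for _ in range(n_per_cluster):
            data.append([rng.gauss(mu, spread) for mu in center])
            labels.append(label)
    return data, labels

data, labels = make_gauss2()
print(len(data), len(data[0]))  # instances, features
```

The labels are returned only so that a clustering result can be checked afterwards; the unsupervised index itself does not use them.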
w =
-0.6774 -0.1991 -0.7081
0.1694 -0.9790 0.1133
qpc =
0.6528
0.7564
prototypes =
-1.6026 -0.8391 1.0000
1.5947 0.8877 2.0000
w =
-0.6594 -0.0136 -0.7517
-0.2458 -0.9410 0.2326
qpc =
0.7164
0.7636
prototypes =
-1.6482 -1.4852 1.0000
-0.9964 -0.6550 2.0000
1.4212 1.0054 3.0000
w =
-0.8863 -0.4130 -0.2097
0.4008 -0.4568 -0.7942
qpc =
0.7706
0.7472
prototypes =
-2.0251 -0.5340 1.0000
-1.4904 -1.2452 2.0000
-0.9402 0.6215 3.0000
1.5231 1.3934 4.0000
w =
-0.7445 -0.5505 -0.3776
0.6299 -0.3918 -0.6706
qpc =
0.7945
0.7679
prototypes =
-2.2472 -0.0997 1.0000
-1.7423 -1.2027 2.0000
-1.2746 -0.7011 3.0000
-0.9240 0.5470 4.0000
1.6828 1.0826 5.0000
w =
-0.6915 -0.4394 -0.5734
0.4090 -0.8924 0.1905
qpc =
0.8374
0.4182
prototypes =
-2.6780 -1.1170 1.0000
-2.2099 -1.9602 2.0000
-1.7416 -1.9625 3.0000
-1.3012 -1.9600 4.0000
-0.9648 -0.6108 5.0000
-0.6844 0.0602 6.0000
1.7147 0.7292 7.0000
Gauss3a - First two projections
Artificial data containing vectors drawn from a normal distribution
Features: 3 Instances: 600 Source: artificial data
Description: three Gaussian clusters (non-overlapping and weakly overlapping). 400 vectors are identical to those in the Gauss2 data.
An additional 200 vectors were drawn from the distribution N([0 3 3];[1 1 1]).
w =
-0.7692 -0.1555 -0.6198
0.5675 -0.6122 -0.5506
qpc =
0.8484
0.8040
prototypes =
-0.0668 -0.6088 1.0000
0.9765 0.4789 2.0000
w =
-0.1569 -0.6702 -0.7254
-0.9876 0.1003 0.1210
qpc =
0.8547
0.7066
prototypes =
-0.5836 0.8915 1.0000
0.2209 -0.2944 2.0000
1.1085 0.4466 3.0000
w =
-0.2446 -0.7543 -0.6092
-0.9695 0.1826 0.1632
qpc =
0.8474
0.7416
prototypes =
-0.8046 -0.3162 1.0000
-0.3981 0.4073 2.0000
0.2201 -0.7951 3.0000
1.1300 0.9064 4.0000
w =
-0.2985 -0.7124 -0.6351
0.9475 -0.1412 -0.2870
qpc =
0.8501
0.7457
prototypes =
-1.0304 -0.7514 1.0000
-0.7630 0.7096 2.0000
-0.3741 -1.0696 3.0000
0.2116 -0.3223 4.0000
1.1433 0.3275 5.0000
Iris - First two projections
w =
0.1368 0.1681 -0.9666 0.1369
-0.3418 0.5856 -0.0504 -0.7333
q =
0.9010
0.8841
p =
-0.3124 -0.4709 1.0000
0.6586 1.0459 2.0000
w =
0.1986 0.1252 -0.8541 -0.4640
-0.6044 0.2767 0.2770 -0.6939
q =
0.8516
0.7963
p =
-0.7688 -0.6718 1.0000
-0.0887 -0.0270 2.0000
1.0320 0.8203 3.0000
w =
-0.0979 -0.0137 -0.8268 -0.5538
-0.7768 -0.6135 0.1341 -0.0477
q =
0.8172
0.7590
p =
-0.9725 -0.9724 1.0000
-0.3340 0.4432 2.0000
0.1701 0.9497 3.0000
1.2426 -0.2903 4.0000
Gauss2n2 - First two projections
Artificial data containing vectors drawn from a normal distribution plus uniform noise.
Features: 4 Instances: 600 Source: artificial data
Description: two Gaussian clusters (weakly overlapping) and uniform noise.
Features 1 and 3 were drawn from the distributions N(-1.3,1) and N(+1.3,1).
Features 2 and 4 were drawn from a uniform distribution on the range [-4,+4].
w =
-0.5523 -0.0285 -0.8326 -0.0303
0.1305 -0.0569 -0.0487 -0.9886
q =
0.7557
0.7203
p =
-0.5532 -0.6354 1.0000
0.4429 0.6202 2.0000
w =
-0.5920 -0.5247 -0.6050 -0.0910
-0.1031 0.0999 0.1612 -0.9764
q =
0.7224
0.7346
p =
-0.7178 -0.7449 1.0000
0.0330 -0.0118 2.0000
0.7115 0.7362 3.0000
w =
-0.1100 -0.9853 -0.0730 -0.1085
-0.3473 0.1587 -0.3430 -0.8582
q =
0.7451
0.7415
p =
-0.8747 -0.9069 1.0000
-0.3419 -0.2859 2.0000
0.2004 0.3974 3.0000
0.7912 0.9605 4.0000
w =
-0.4693 -0.8418 -0.2383 -0.1195
-0.3406 0.3571 -0.1619 -0.8545
q =
0.7598
0.7474
p =
-1.1052 -0.5754 1.0000
-0.7851 -1.0211 2.0000
-0.2884 -0.0100 3.0000
0.3211 0.5271 4.0000
0.8709 1.0255 5.0000
Notes
* Problem with the positions of the prototypes when combining projections. The prototype-based clustering method needs further work. Does this way of determining clusters make sense? Marek Grochowski 2011/02/04 11:28