This post is about matrix factorization, the technique underlying recommender systems.
In essence, it is the process of compressing an N-dimensional representation down to d dimensions and then reconstructing the N dimensions from it.
The catch is that the d-dimensional compression has to represent the original N dimensions well.
It would be worth knowing how this compares to PCA (dimensionality reduction) and autoencoders, but I'll get to that gradually.
For the theory, I referred to the blog below.
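To pin down what "representing the N dimensions well" means, here is the usual objective written in my own notation (a sketch, not taken from the referenced blog): with adjacency matrix $A \in \mathbb{R}^{N \times N}$ and embedding matrices $P, Q \in \mathbb{R}^{d \times N}$ with $d \ll N$, we minimize the squared reconstruction error

$$\mathcal{L}(P, Q) = \lVert P^{\top}Q - A \rVert_F^2 = \sum_{i,j}\big((P^{\top}Q)_{ij} - A_{ij}\big)^2,$$

where each column of $P$ (or $Q$) is one node's $d$-dimensional embedding. This is exactly the loss the code below implements with $N = 34$ and $d = 4$.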
1. Matrix Factorization for Network Embedding
In [1]:
import pandas as pd
import numpy as np
import random
import networkx as nx
from matplotlib import pyplot as plt
np.random.seed(15)
In [2]:
#Load data
adjlist = nx.read_adjlist("karate_club.adjlist", nodetype=int)
karate_label = np.loadtxt("karate_label.txt")
In [6]:
adj = nx.to_numpy_array(adjlist)
label = karate_label[:,-1]
print(adj.shape)
print(label.shape)
(34, 34)
(34,)
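A quick sanity check I added (not in the original post): the karate club graph is undirected and unweighted, so the adjacency matrix should be symmetric with binary entries.

# sanity check: undirected, unweighted graph
print(np.allclose(adj, adj.T))  # expect True (symmetric)
print(np.unique(adj))           # expect [0. 1.] (binary entries)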
In [7]:
#defining P, Q for matrix factorization
d = 4
P = np.random.random((d, 34))
Q = np.random.random((d, 34))
In [8]:
zuzv = np.dot(P.T,Q)
zuzv.shape
Out[8]:
(34, 34)
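Another small check (my addition): since P and Q each have d = 4 rows, the product P.T @ Q can have rank at most 4, so the whole exercise is approximating the 34×34 adjacency matrix with a low-rank reconstruction.

# the reconstruction is rank-d by construction
print(np.linalg.matrix_rank(zuzv))  # expect 4, far below the full dimension 34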
In [9]:
# loss function: squared reconstruction error
def loss(a, b):
    return np.sum((a - b) ** 2)
In [10]:
loss(zuzv,adj)
Out[10]:
1057.929366546701
In [11]:
epoch = 500
lr = 0.001
In [12]:
#Updating params with gradient descent
# gradients of the squared error: dL/dP = 2 * Q @ (P.T@Q - A).T and
# dL/dQ = 2 * P @ (P.T@Q - A); the constant factor 2 is absorbed into lr
loss_list = [0 for _ in range(epoch)]
for i in range(epoch):
    P -= lr * np.dot(Q, (zuzv - adj).T)
    Q -= lr * np.dot(P, zuzv - adj)
    zuzv = np.dot(P.T, Q)           # refresh the reconstruction
    loss_list[i] = loss(zuzv, adj)  # record the loss after this update
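To double-check the analytic gradient used above, here is a finite-difference comparison on a single entry of P (my own sketch, not part of the original notebook):

# numeric vs. analytic gradient of the loss w.r.t. one entry of P
eps = 1e-6
grad_P = 2 * np.dot(Q, (np.dot(P.T, Q) - adj).T)  # analytic dL/dP
P_pert = P.copy()
P_pert[0, 0] += eps                               # nudge a single parameter
numeric = (loss(np.dot(P_pert.T, Q), adj) - loss(np.dot(P.T, Q), adj)) / eps
print(numeric, grad_P[0, 0])                      # the two values should nearly agree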
In [13]:
#plotting the loss
plt.plot(loss_list)
Out[13]:
[Figure: training loss over epochs]
T-SNE
- nodes that share many relationships end up located near each other
- the result differs quite a lot when the perplexity changes (a comparison sketch follows the embedding plot below)
- unlike what the figure may suggest, the labels (expressed by the colors) don't mean anything here; they were never used to learn the embedding
In [14]:
# row u is the sum of the embedding columns of u's neighbors
ans = np.dot(adj, P.T)
In [15]:
from sklearn.manifold import TSNE

model = TSNE(learning_rate=100, perplexity=3)
transformed = model.fit_transform(ans)
xs = transformed[:, 0]
ys = transformed[:, 1]
plt.scatter(xs, ys, c=label)        # color each node by its club label
for i in range(len(xs)):
    plt.text(xs[i], ys[i], str(i))  # annotate each point with its node index
plt.show()
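Since the bullet above claims the layout changes a lot with perplexity, here is a small comparison sketch (my addition, not in the original notebook; the perplexity values are arbitrary choices):

# the same embedding under a few different perplexity values
fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, perp in zip(axes, [3, 5, 10]):
    emb = TSNE(learning_rate=100, perplexity=perp).fit_transform(ans)
    ax.scatter(emb[:, 0], emb[:, 1], c=label)
    ax.set_title(f"perplexity={perp}")
plt.show()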