Point-wise Mutual Information
(Yao et al., 2019) restated a clear definition of Point-wise Mutual Information (PMI) as below:
\[
PMI(i, j) = \log \frac{p(i,j)}{p(i)p(j)} \\
p(i, j) = \frac{\#(i,j)}{\#W} \\
p(i) = \frac{\#(i)}{\#W}
\]
where \(\#(i)\) is the number of sliding windows in the corpus that contain word \(i\), \(\#(i,j)\) is the number of sliding windows that contain both word \(i\) and word \(j\), and \(\#W\) is the total number of sliding windows in the corpus.
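As a concrete illustration, here is a minimal Python sketch that estimates these window counts and the resulting PMI scores from a tokenized corpus. It is not the authors' implementation; the function name `pmi_from_windows` and the default window size are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations
import math

def pmi_from_windows(tokens, window_size=20):
    """Estimate PMI(i, j) for word pairs from sliding-window co-occurrence counts."""
    # Slide a fixed-size window over the corpus; each position is one window.
    if len(tokens) <= window_size:
        windows = [tokens]
    else:
        windows = [tokens[k:k + window_size] for k in range(len(tokens) - window_size + 1)]

    num_windows = len(windows)        # #W
    word_window_count = Counter()     # #(i): windows containing word i
    pair_window_count = Counter()     # #(i, j): windows containing both i and j

    for window in windows:
        unique_words = set(window)    # count each word at most once per window
        word_window_count.update(unique_words)
        for i, j in combinations(sorted(unique_words), 2):
            pair_window_count[(i, j)] += 1

    # PMI(i, j) = log( p(i, j) / (p(i) * p(j)) ), with probabilities
    # estimated as window counts divided by #W.
    pmi = {}
    for (i, j), n_ij in pair_window_count.items():
        p_ij = n_ij / num_windows
        p_i = word_window_count[i] / num_windows
        p_j = word_window_count[j] / num_windows
        pmi[(i, j)] = math.log(p_ij / (p_i * p_j))
    return pmi

# Toy usage example
corpus = "graph convolution helps text classification with graph structure".split()
for pair, score in sorted(pmi_from_windows(corpus, window_size=3).items()):
    print(pair, round(score, 3))
```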
(Levy et al., 2014) simplified the PMI formula as below:
\[
PMI(i,j) = \log\frac{\#(i,j)\#W}{\#(i)\#(j)}
\]
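This form follows directly from substituting the window-count estimates of \(p(i,j)\), \(p(i)\), and \(p(j)\) into the definition above:

\[
PMI(i,j) = \log\frac{p(i,j)}{p(i)p(j)} = \log\frac{\#(i,j)/\#W}{(\#(i)/\#W)(\#(j)/\#W)} = \log\frac{\#(i,j)\#W}{\#(i)\#(j)}
\]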
Obviously, \(\#W\) is a constant once we fix the sliding-window size and the corpus, hence we can further simplify the formula as below:
\[
PMI(i, j) = \log\frac{\#(i,j)}{\#(i)\#(j)}
\]
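To be precise, dropping \(\#W\) only shifts every PMI value by the same constant:

\[
\log\frac{\#(i,j)\#W}{\#(i)\#(j)} = \log\frac{\#(i,j)}{\#(i)\#(j)} + \log\#W
\]

so the simplified score differs from the original PMI by \(\log\#W\), which leaves relative comparisons between word pairs unchanged.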
References
Liang Yao et al., 2019. Graph Convolutional Networks for Text Classification. AAAI.
Omer Levy et al., 2014. Neural Word Embedding as Implicit Matrix Factorization. NIPS.