Google Research’s word2vec tool, quoting from https://code.google.com/p/word2vec/, “provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words.”
It produces some curious results. In the project's own example, Russia ranks as closer to France than Germany does:
A simple way to investigate the learned representations is to find the closest words for a user-specified word. The distance tool serves that purpose. For example, if you enter ‘france’, distance will display the most similar words and their distances to ‘france’, which should look like:
                    Word       Cosine distance
    -------------------------------------------
                   spain              0.678515
                 belgium              0.665923
             netherlands              0.652428
                   italy              0.633130
             switzerland              0.622323
              luxembourg              0.610033
                portugal              0.577154
                  russia              0.571507
                 germany              0.563291
               catalonia              0.534176
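To make the ranking above concrete, here is a minimal sketch of a nearest-word lookup in Python. It is not the distance tool's actual code. It assumes the scores behave like cosine similarity, since the list is described as the "most similar words" and the largest number comes first, even though the column is labelled "Cosine distance". The names `cosine_similarity`, `nearest_words` and `toy_vectors` are hypothetical, and the three-dimensional vectors are made up for illustration; they are not taken from any trained model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_words(query, vectors, top_n=10):
    """Rank every other word by cosine similarity to `query`, highest first."""
    q = vectors[query]
    scored = [
        (word, cosine_similarity(q, vec))
        for word, vec in vectors.items()
        if word != query
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up toy vectors, purely illustrative (NOT real word2vec output).
toy_vectors = {
    "france": [1.0, 0.0, 0.0],
    "spain": [0.9, 0.1, 0.0],
    "russia": [0.6, 0.8, 0.0],
    "germany": [0.5, 0.0, 1.2],
}

if __name__ == "__main__":
    for word, score in nearest_words("france", toy_vectors):
        print(f"{word:>20}{score:>23.6f}")
```

With real vectors the ordering depends entirely on the training corpus, so a ranking like "russia" ahead of "germany" says more about how the words co-occur in the text than about geography.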