 Description: Learn vector representations of sentences, paragraphs or documents by using the 'Paragraph Vector' algorithms,
 namely the distributed bag of words ('PV-DBOW') and the distributed memory ('PV-DM') model.
-The techniques in the package are detailed in the paper "Distributed Representations of Sentences and Documents" by Mikolov et al. (2014), available at <arXiv:1405.4053>.
+The techniques in the package are detailed in the paper "Distributed Representations of Sentences and Documents" by Mikolov et al. (2014), available at <doi:10.48550/arXiv.1405.4053>.
 The package also provides an implementation to cluster documents based on these embedding using a technique called top2vec.
 Top2vec finds clusters in text documents by combining techniques to embed documents and words and density-based clustering.
 It does this by embedding documents in the semantic space as defined by the 'doc2vec' algorithm. Next it maps
@@ -18,12 +18,12 @@ Description: Learn vector representations of sentences, paragraphs or documents
 areas are the topic clusters which can be represented by the corresponding topic vector which is an aggregate of the
 document embeddings of the documents which are part of that topic cluster. In the same semantic space similar words can
 be found which are representative of the topic.
-More details can be found in the paper 'Top2Vec: Distributed Representations of Topics' by D. Angelov available at <arXiv:2008.09470>.
+More details can be found in the paper 'Top2Vec: Distributed Representations of Topics' by D. Angelov available at <doi:10.48550/arXiv.2008.09470>.