Commit c0d3ac4

docs
1 parent 7bdcb76 commit c0d3ac4

1 file changed

Lines changed: 47 additions & 47 deletions

README.md

@@ -7,7 +7,7 @@ This repository contains an R package allowing to build `Paragraph Vector` model
 - The package allows one
 - to train paragraph embeddings (also known as document embeddings) on character data or data in a text file
 - use the embeddings to find similar documents, paragraphs, sentences or words
-- Note. For getting word vectors in R: look at package https://github.com/bnosac/word2vec
+- Note. For getting word vectors in R: look at package https://github.com/bnosac/word2vec, details [here](https://www.bnosac.be/index.php/blog/100-word2vec-in-r), for Starspace embeddings: look at package https://github.com/bnosac/ruimtehol, details [here](https://cran.r-project.org/web/packages/ruimtehol/vignettes/ground-control-to-ruimtehol.pdf)
 
 ## Installation
 
@@ -61,7 +61,7 @@ str(model)
 ## List of 3
 ## $ model :<externalptr>
 ## $ data :List of 4
-## ..$ file : chr "C:\\Users\\Jan\\AppData\\Local\\Temp\\Rtmpk9Npjg\\textspace_1c4458cb6943.txt"
+## ..$ file : chr "C:\\Users\\Jan\\AppData\\Local\\Temp\\Rtmpk9Npjg\\textspace_1c4432666686.txt"
 ## ..$ n : num 170469
 ## ..$ n_vocabulary: num 3867
 ## ..$ n_docs : num 1000
@@ -117,10 +117,10 @@ embedding[, 1:4]
 ```
 
 ```
-## [,1] [,2] [,3] [,4]
-## doc_1 0.08172660 -0.03679979 0.05726605 -0.06496991
-## doc_10 0.13976580 0.10821507 -0.06986591 -0.05825572
-## doc_3 0.09486584 -0.07999156 0.03448128 0.02999697
+## [,1] [,2] [,3] [,4]
+## doc_1 0.038523957 -0.14341952 -0.06087392 -0.01625664
+## doc_10 0.003298676 -0.04789201 0.06048679 -0.14829759
+## doc_3 0.030986091 0.08946659 0.02453904 -0.01900235
 ```
 
 - Get similar documents or words when providing sentences, documents or words
- Get similar documents or words when providing sentences, documents or words
@@ -134,19 +134,19 @@ nn
 ```
 ## [[1]]
 ## term1 term2 similarity rank
-## 1 proximus telefoontoestellen 0.5571629 1
-## 2 proximus belfius 0.4994604 2
-## 3 proximus toenmalige 0.4873388 3
-## 4 proximus internetverbinding 0.4730936 4
-## 5 proximus gefactureerd 0.4568973 5
+## 1 proximus telefoontoestellen 0.5364115 1
+## 2 proximus belfius 0.5292925 2
+## 3 proximus internetverbinding 0.5140554 3
+## 4 proximus ceo 0.4961080 4
+## 5 proximus fusie 0.4803250 5
 ##
 ## [[2]]
-## term1 term2 similarity rank
-## 1 koning grondwet 0.5572801 1
-## 2 koning verplaatsingen 0.5373006 2
-## 3 koning ministerie 0.5140343 3
-## 4 koning familie 0.4943074 4
-## 5 koning vereiste 0.4715540 5
+## term1 term2 similarity rank
+## 1 koning ministerie 0.5567209 1
+## 2 koning verplaatsingen 0.5317563 2
+## 3 koning grondwet 0.5118545 3
+## 4 koning gedragen 0.4884593 4
+## 5 koning verantwoordelijk 0.4788159 5
 ```
 
 ```r
@@ -157,19 +157,19 @@ nn
 ```
 ## [[1]]
 ## term1 term2 similarity rank
-## 1 proximus doc_105 0.6922343 1
-## 2 proximus doc_863 0.5826316 2
-## 3 proximus doc_186 0.5146015 3
-## 4 proximus doc_862 0.5051525 4
-## 5 proximus doc_746 0.4467830 5
+## 1 proximus doc_105 0.7080573 1
+## 2 proximus doc_863 0.6275553 2
+## 3 proximus doc_186 0.5301130 3
+## 4 proximus doc_862 0.4656175 4
+## 5 proximus doc_620 0.4396312 5
 ##
 ## [[2]]
 ## term1 term2 similarity rank
-## 1 koning doc_44 0.6228581 1
-## 2 koning doc_583 0.5643232 2
-## 3 koning doc_45 0.5535781 3
-## 4 koning doc_797 0.4408725 4
-## 5 koning doc_943 0.4039679 5
+## 1 koning doc_44 0.6395732 1
+## 2 koning doc_583 0.5574296 2
+## 3 koning doc_45 0.5361990 3
+## 4 koning doc_943 0.4225507 4
+## 5 koning doc_797 0.4086391 5
 ```
 
 ```r
@@ -180,19 +180,19 @@ nn
 ```
 ## [[1]]
 ## term1 term2 similarity rank
-## 1 doc_198 doc_343 0.4893735 1
-## 2 doc_198 doc_569 0.4858374 2
-## 3 doc_198 doc_358 0.4831750 3
-## 4 doc_198 doc_498 0.4766597 4
-## 5 doc_198 doc_983 0.4761481 5
+## 1 doc_198 doc_343 0.4947847 1
+## 2 doc_198 doc_899 0.4893836 2
+## 3 doc_198 doc_923 0.4850165 3
+## 4 doc_198 doc_708 0.4697377 4
+## 5 doc_198 doc_642 0.4622465 5
 ##
 ## [[2]]
 ## term1 term2 similarity rank
-## 1 doc_285 doc_319 0.5304061 1
-## 2 doc_285 doc_286 0.5205777 2
-## 3 doc_285 doc_76 0.5086077 3
-## 4 doc_285 doc_74 0.4975725 4
-## 5 doc_285 doc_537 0.4802507 5
+## 1 doc_285 doc_286 0.5537772 1
+## 2 doc_285 doc_319 0.5478524 2
+## 3 doc_285 doc_874 0.5095125 3
+## 4 doc_285 doc_113 0.4878533 4
+## 5 doc_285 doc_76 0.4863345 5
 ```
 
 ```r
@@ -206,19 +206,19 @@ nn
 ```
 ## $sent1
 ## term1 term2 similarity rank
-## 1 sent1 doc_740 0.4637638 1
-## 2 sent1 doc_742 0.4621139 2
-## 3 sent1 doc_206 0.4315273 3
-## 4 sent1 doc_825 0.4221503 4
-## 5 sent1 doc_151 0.4183135 5
+## 1 sent1 doc_742 0.4385398 1
+## 2 sent1 doc_776 0.4269895 2
+## 3 sent1 doc_740 0.4247892 3
+## 4 sent1 doc_206 0.4162723 4
+## 5 sent1 doc_509 0.4153925 5
 ##
 ## $sent2
 ## term1 term2 similarity rank
-## 1 sent2 doc_105 0.5789919 1
-## 2 sent2 doc_186 0.4938067 2
-## 3 sent2 doc_862 0.4848365 3
-## 4 sent2 doc_863 0.4685720 4
-## 5 sent2 doc_620 0.4497271 5
+## 1 sent2 doc_105 0.5738307 1
+## 2 sent2 doc_863 0.5229421 2
+## 3 sent2 doc_862 0.4981593 3
+## 4 sent2 doc_186 0.4873295 4
+## 5 sent2 doc_18 0.4671208 5
 ```
 
 ```r
