forked from forgejo/forgejo
Migrate to dep (#3972)
* Update makefile to use dep
* Migrate to dep
* Fix some deps
* Try to find a better version for golang.org/x/net
* Try to find a better version for golang.org/x/oauth2
parent d7fd9bf7bb
commit 3f3383dc0a
281 changed files with 12024 additions and 32676 deletions
118
vendor/github.com/blevesearch/go-porterstemmer/README.md
generated
vendored
@@ -1,118 +0,0 @@
# This fork...

I'm maintaining this fork because the original author was not replying to issues or pull requests. For now I plan on maintaining this fork as necessary.

## Status

[Build Status](https://travis-ci.org/blevesearch/go-porterstemmer)

[Coverage Status](https://coveralls.io/r/blevesearch/go-porterstemmer?branch=HEAD)

# Go Porter Stemmer

A native Go clean room implementation of the Porter Stemming Algorithm.

This algorithm is of interest to people doing Machine Learning or
Natural Language Processing (NLP).

This is NOT a port. This is a native Go implementation from the human-readable
description of the algorithm.

I've tried to make it (more) efficient by NOT internally using strings, but
instead internally using []rune slices and using the same (array) buffer used by
the []rune slice (and sub-slices) at all steps of the algorithm.

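To illustrate the buffer-sharing idea in general Go terms, here is a minimal sketch. It is not the library's code: `dropSuffix` is a hypothetical helper introduced only for this example. Re-slicing a []rune yields a view onto the same backing array, so trimming a suffix allocates nothing.

```go
package main

import "fmt"

// dropSuffix is a hypothetical helper, not part of go-porterstemmer.
// It returns word without its last n runes by re-slicing, so the result
// shares word's backing array and no new memory is allocated.
func dropSuffix(word []rune, n int) []rune {
	if n < 0 {
		n = 0
	}
	if n > len(word) {
		n = len(word)
	}
	return word[:len(word)-n]
}

func main() {
	word := []rune("waxes")
	stem := dropSuffix(word, 2)

	fmt.Println(string(stem)) // wax

	// Writing through the sub-slice is visible through the original slice,
	// which shows that both refer to the same buffer.
	stem[0] = 'W'
	fmt.Println(string(word)) // Waxes
}
```

The same property that avoids allocations is what makes the input and the result alias each other, which matters for callers that still need the original word.
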
For the Porter Stemmer algorithm, see:

http://tartarus.org/martin/PorterStemmer/def.txt (URL #1)

http://tartarus.org/martin/PorterStemmer/ (URL #2)

# Departures

When I initially implemented it, it failed the tests at...

http://tartarus.org/martin/PorterStemmer/voc.txt (URL #3)

http://tartarus.org/martin/PorterStemmer/output.txt (URL #4)

... after reading the human-readable text over and over again to try to figure out
what error I had made (and doing all sorts of things to debug it) I came to the
conclusion that some of these tests were wrong according to the human-readable
description of the algorithm.

This led me to wonder if maybe other people's code that was passing these tests had
rules that were not in the human-readable description. Which led me to look at the source
code here...

http://tartarus.org/martin/PorterStemmer/c.txt (URL #5)

... When I looked there I noticed that there are some items marked as a "DEPARTURE",
which differ from the original algorithm. (There are 2 of these.)

I implemented these departures, and the tests at URL #3 and URL #4 all passed.

## Usage

To use this Go library, use something like:

    package main

    import (
        "fmt"
        "github.com/reiver/go-porterstemmer"
    )

    func main() {

        word := "Waxes"

        stem := porterstemmer.StemString(word)

        fmt.Printf("The word [%s] has the stem [%s].\n", word, stem)
    }

Alternatively, if you want to be a bit more efficient, use []rune slices instead, with code like:

    package main

    import (
        "fmt"
        "github.com/reiver/go-porterstemmer"
    )

    func main() {

        word := []rune("Waxes")

        stem := porterstemmer.Stem(word)

        fmt.Printf("The word [%s] has the stem [%s].\n", string(word), string(stem))
    }

NOTE, however, that the above code may modify the original slice (named "word" in the example) as a side
effect, for efficiency reasons, and that the slice named "stem" in the example above may be a
sub-slice of the slice named "word".

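If the original word must stay intact, one option is to stem a copy. The sketch below is self-contained and does not call the library: `stemInPlace` is a hypothetical stand-in that only mimics the documented side effects (it overwrites its input and returns a sub-slice of it) and is not the Porter algorithm.

```go
package main

import "fmt"

// stemInPlace is a hypothetical stand-in for a stemmer that may overwrite
// its input and return a sub-slice of it. It lowercases ASCII letters in
// place and drops a trailing "es"; it is not the Porter algorithm.
func stemInPlace(word []rune) []rune {
	for i, r := range word {
		if r >= 'A' && r <= 'Z' {
			word[i] = r + ('a' - 'A')
		}
	}
	n := len(word)
	if n > 2 && word[n-2] == 'e' && word[n-1] == 's' {
		return word[:n-2]
	}
	return word
}

func main() {
	original := []rune("Waxes")

	// Copy first so the callee's in-place edits cannot reach the original.
	scratch := make([]rune, len(original))
	copy(scratch, original)

	stem := stemInPlace(scratch)
	fmt.Printf("The word [%s] has the stem [%s].\n", string(original), string(stem))
}
```

The copy costs one allocation per word, so it gives back part of the efficiency the []rune variant is meant to provide; it is only worth it when the caller reuses the original slice afterwards.
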
Alternatively, if you know that your word is already lowercase (and you don't need
this library to lowercase your word for you) you can instead use code like:

    package main

    import (
        "fmt"
        "github.com/reiver/go-porterstemmer"
    )

    func main() {

        word := []rune("waxes")

        stem := porterstemmer.StemWithoutLowerCasing(word)

        fmt.Printf("The word [%s] has the stem [%s].\n", string(word), string(stem))
    }

Again NOTE (as with the previous example) that the above code may modify the original slice (named
"word" in the example) as a side effect, for efficiency reasons, and that the slice named "stem"
in the example above may be a sub-slice of the slice named "word".

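When it is not certain that the input is already lowercase, a cheap check can decide whether the non-lowercasing variant is safe to use. The helper below is a hypothetical sketch built on the standard `unicode` package; `alreadyLower` is not part of the library.

```go
package main

import (
	"fmt"
	"unicode"
)

// alreadyLower reports whether word contains no uppercase runes.
// It is a hypothetical helper, not part of go-porterstemmer.
func alreadyLower(word []rune) bool {
	for _, r := range word {
		if unicode.IsUpper(r) {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"waxes", "Waxes"} {
		fmt.Printf("%q already lowercase: %t\n", s, alreadyLower([]rune(s)))
	}
}
```
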