Class: DictionaryMatcher
In: lib/more/facets/dictionarymatcher.rb
Parent: Object

Attributes
word_count [R]
Create a DictionaryMatcher with no words in it.

    # File lib/more/facets/dictionarymatcher.rb, line 16
    def initialize
      @trie = {}
      @word_count = 0
    end
Determines whether any word in the DictionaryMatcher is a substring of text. Returns the index of the first match, or nil if none is found.

    # File lib/more/facets/dictionarymatcher.rb, line 79
    def =~ text
      internal_match(text){|md| return md.index}
      nil
    end
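The `return` inside the block is what makes =~ stop at the first match: it exits the enclosing method, not just the block. The sketch below illustrates that idiom only. `each_match_index` and `first_index` are hypothetical names, and the naive scan is a stand-in for internal_match, whose source is not shown here.

```ruby
# Hypothetical stand-in for internal_match: yields the index of every
# position where one of the words starts, scanning left to right.
def each_match_index(words, text)
  (0...text.length).each do |i|
    yield i if words.any? { |w| text[i, w.length] == w }
  end
end

# Mirrors the shape of =~: `return` inside the block leaves first_index
# itself, so only the first yielded index is ever seen.
def first_index(words, text)
  each_match_index(words, text) { |i| return i }
  nil # reached only when the block was never called
end

first_index(%w[cat dog], "hotdog cat") # => 3
first_index(%w[cat dog], "bird")       # => nil
```

The trailing `nil` matters: without it the method would return whatever the iterator returns when no match is yielded.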
Add a word to the DictionaryMatcher. Returns self, so calls can be chained.

    # File lib/more/facets/dictionarymatcher.rb, line 22
    def add(word)
      @word_count += 1
      container = @trie
      containers = []
      i = 0
      # Walk the trie one byte at a time, creating nodes as needed
      word.each_byte do |b|
        container[b] = {} unless container.has_key? b
        container[:depth] = i
        containers << container
        container = container[b]
        i += 1
      end
      containers << container
      container[0] = true # Mark end of word
      container[:depth] = i
      ff = compute_failure_function word
      # Link each node on the path to the node its failure function points to
      ff.zip(containers).each do |pointto, node|
        node[:failure] = containers[pointto] if pointto
      end
      self
    end
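To make the data structure concrete, here is a minimal sketch of the nested-Hash trie that add builds: one Hash per node, keyed by byte value, with the key 0 marking the end of a word. `trie_insert` and `trie_word?` are hypothetical helpers, not part of Facets, and the sketch leaves out the :depth and :failure bookkeeping.

```ruby
# Hypothetical helper: inserts a word into a nested-Hash trie keyed by
# byte value, using the same end-of-word key (0) as add.
def trie_insert(trie, word)
  node = trie
  word.each_byte do |b|
    node[b] ||= {} # create the child node on first use
    node = node[b]
  end
  node[0] = true   # end-of-word marker
  trie
end

# Hypothetical helper: true if word was inserted as a whole word.
def trie_word?(trie, word)
  node = trie
  word.each_byte do |b|
    node = node[b]
    return false unless node
  end
  node[0] == true
end

trie = {}
trie_insert(trie, "he")
trie_insert(trie, "her")

trie_word?(trie, "he")  # => true
trie_word?(trie, "her") # => true
trie_word?(trie, "h")   # => false, "h" is only a prefix
```

Because "he" and "her" share a prefix, they share the nodes for the bytes 104 ("h") and 101 ("e"). Only the final "r" adds a new node.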
Determines whether any word in the DictionaryMatcher is a substring of text. Returns a DictionaryMatcher::MatchData object for the first match, or nil if none is found.

    # File lib/more/facets/dictionarymatcher.rb, line 88
    def match text
      internal_match(text){|md| return md}
      nil
    end
Scans text for all occurrences of words in the DictionaryMatcher. Overlapping matches are skipped (only the first one is yielded), and when some words in the DictionaryMatcher are substrings of others, only the shortest match at a given position is found. Without a block, the matches are collected and returned in an array; with a block, each match is yielded to it instead.

    # File lib/more/facets/dictionarymatcher.rb, line 126
    def scan(text, &block)
      matches = []
      block = lambda{ |md| matches << md } unless block
      internal_match(text, &block)
      matches
    end
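The matching rules above can be illustrated with a naive scanner that follows the description literally. This is a sketch under that assumption, not the Facets implementation: `naive_scan` is a hypothetical name, it reports `[index, word]` pairs rather than MatchData objects, and it may pick different matches than the trie-based internal_match in edge cases.

```ruby
# Hypothetical stand-in: at each position take the shortest word that
# matches, then resume after it so overlapping matches are skipped.
def naive_scan(words, text)
  matches = []
  by_length = words.sort_by(&:length) # shortest candidates first
  i = 0
  while i < text.length
    word = by_length.find { |w| !w.empty? && text[i, w.length] == w }
    if word
      hit = [i, word]
      # Same convention as scan: yield when a block is given,
      # otherwise collect into the returned array.
      block_given? ? yield(hit) : matches << hit
      i += word.length # skip past the match: no overlaps
    else
      i += 1
    end
  end
  matches
end

naive_scan(%w[he her], "her hero") # => [[0, "he"], [4, "he"]]
naive_scan(%w[aba], "ababa")       # => [[0, "aba"]]
```

In the first call "her" is never reported, because "he" is the shorter match at the same position. In the second, the occurrence of "aba" at index 2 overlaps the one at index 0 and is skipped.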