C
IMPORTANT: + FIX file descriptor overflow. See Tickets #341 and #343
add .. operator to query parser. For example, [100 200] could be written as 100..200 or 100…201 like in Ruby Ranges
remove exception handling from C code. All errors to be handled by return values.
Move to sqlite's locking model. Ferret should work fine in a multi-process environment.
Add optional logging. To be enabled at compilation time, perhaps?
Add support for changing zlib and bzlib compression parameters
Improve unit test coverage to 100%
Add benchmark suite
Add Rakefile for development purposes + task to publish gcov and benchmark results to ferret wiki
Index rebuilding of old versioned indexes.
Add a globally accessable, threadsafe symbol table. This will be very useful for storing field names so that no objects need to strdup the field-names but can just store the symbol representative instead. + this has been done but it can be improved using actual Symbol structs
instead of plain char*
Make threading optional at compile time
to_json should limit output to prevent memory overflow on large indexes. Perhaps we could use some type of buffered read for this.
Make BitVector run as fast as bitset from C++ STL. See;
c/benchmark/bm_bitvector.c
Add a symbol table for field names. This will mean that we won't need to worry about mallocing and freeing field names which happens all over the place.
Divide the headers into public and private (the private headers to be stored in the src directory).
Group-by search. ie you should be able to pass a field to group search results by
Auto-loading of documents during search. ie actual documents get returned instead of document numbers.
Ruby bindings
argument checking for every method. We need a new api for argument checking so that the arguments get checked at the start of each method that could cause a segfault.
improve memory management. It was way to complex at the moment. I also need to document how it works so that other developers understand what is going on.
Replace Data_Wrap_Struct with ferret alternative which handles rewrapping of structs automatically and also knows when to release a struct by using refcounting.
Ruby
integrate rcov
improve unit test coverage to 100%
Documentation.
generate Ruby binding documentation with custom build template similar jaxdoc rubyforge.org/projects/jaxdoc
all documentation should meet DOCUMENTATION_STANDARDS
documentation in C code to be generated by doxygen
Someday Maybe
apply for Google Summer of Code 2009
optimize read and write vint
test the following outside of ferret before implementing
perform a binary scan using bit-wise or to find out how many bytes need to be written
if the write/read will overflow the buffer, split it into two, refreshing the buffer in between
use Duff's device to write bytes now that we know how many we need
add a super fast language based dictionary compression
add portable stacktrace function. Perhaps implement as an external library.
investigate unscored searching
user defined sorting
Fix highlighting to work for external fields
investigate faster string hashing method
Done
add rake install task
FIX :create parameter so that it only deletes the files owned by Ferret.
fix compression. Currently nothing is happening if you set a field to :compress. I guess we'll just assume zlib is installed, as I think it has to be for Ruby to be installed.
add bzlib support
integrate gcov
add a field cache to IndexReader
setup email alerts for svn commits
Ranged, unordered searching. Ie search through the index until you have the required number of documents and then break. This will require the ability to start searches from a particular doc-num. + See searcher_search_unordered in the C code and Searcher#scan in Ruby
improve unit test code. I'd like to implement some way to print out a stack trace when a test fails so that it is easy to find the source of the error.
catch segfaults and print stack trace so users can post helpful bug tickets. again, see the same links for adding stacktrace to unit tests.
Add string Sort descripter
fix memory bug
add MultiReader interface
add lexicographical sort (byte sort)
Add highlighting
add field compression
Fix highlighting to work for compressed fields
Fix: + Working Query: field1:value1 AND NOT field2:value2 + Failing Query: field1:value1 AND ( NOT field2:value2 )
update benchmark suite to use getrusage