It is a C library implementation based on swish-e-2.1-dev, but many of the
functions have been rewritten in order to get a thread safe library. That's
is not to say that it is currently thread safe.
The advantage of the library is that the index file(s) can be
opened one time and many queries made on the open index. This saves the
startup time required to fork and run the swish-e binary, and the expensive
time of opening up the index file. Some benchmarks have shown a three fold
increase in speed.
The downside is that your program now has more code and data in it (the
index tables can use quite a bit of memory), and if a fatal error happens
in swish it will bring down your program. These are things to think about,
especially if embedding swish into a web server such as Apache where there
are many processes serving requests.
The best way to learn about the library is to look at two files included
with the swish-e distribution that make use of the library.
This file gives a basic overview of linking a C program with the Swish-e
library. Not all available functions are used in that example, but it
should give you a good overview of building a program with swish-e.
To build libtest run and run libtest:
$ make libtest
$ ./libtest [optional name of index file(s)]
You will be prompted for the search words. The default index used is index.swish-e. This can be overridden by placing a list of index files in a
quote-protected string.
The SWISHE.xs file contains more examples of how to read from the perl library. It
includes example code for reading additional information from the index
files.
Not all available functions are documented here. That's both do to
laziness, and the hope that a better interface will be created for these
functions. Check the above files for details.
You should check for errors after every call. See the src/libtest.c file for examples.
This functions opens and reads the header info of the index files included
in IndexFiles string. The string should contain a space separated list of
index files.
handle : value returned by SwishOpen
words : the search string
structure : At this moment always one (it will implement the -t option of Swish-e)
properties : [Depreciated] Set as NULL. See text for comments.
sortspec : Sort specs for the results. Use NULL if sort by rank
Returns the number of hits or a negative value on error.
There is a new feature here that it is not included in swish-e-2.0: You can
specify several sorting properties including a combination of descending
and ascending fields.
field1 asc field2 desc
Currently, when num_results is zero there is also an error condition set
(``Word not found''). Therefore, only check and report errors if
num_results is a negative number.
You can also pass in a space-separated list of properties to the
SwishSearch() function. This will parse and cache the list of
properites and then the property IDs can be used to fetch the property
values. This saves the time of converting the property names from a string
to a property ID value for each result. It's unlikely that the speed-up is
sigificant. See the perl/SWISHE.xs code for an example how this can be
done.
int SwishSeek(struct SWISH *handle, int n)
This function puts the results pointer on the nth result. The first result
is number zero. Returns n if operation goes OK or a negative number on
error. After calling SwishSeek() call SwishNext()
to fetch the first record at the position selected by
SwishSeek();
Example:
SwishSeek( swish_handle, 0 ); /* start at the beginning */
SwishSeek( swish_handle, 5 ); /* start at the sixth record */
If you always read results from the very start you do not need to call
SwishSeek(). After a query the position is set to the start of
the result list.
struct result *SwishNext(struct SWISH *handle)
This function returns next result. It must be executed after SwishSearch.
Returns NULL on error or when no more results are available. Call
SwishError() to check for errors.
The value returned is used to fetch the various properties for a given file (e.g. rank, title, path name). Typically,
SwishNext() is called in a loop to fetch and display all the
properties.
If the property named is not defined (invalid name supplied) swish will
return the string ``(null)''. If the property does not exist for this
result the null string will be returned.
You must not free the memory returned by the call, and you must copy the
string to a new memory location if you wish to keep the string around
longer than just while processing the current result.
Currently, a cache of one result's properties (per index) are stored in
memory.
This will return numeric (and date) properties as an unsigned long.
It will return ULONG_MAX on error, which can mean either that the property
name specified was invalid, the property specified was not a numeric or
date property, or simply that the no value exists for the current result.
Check SwishError() to determine if it's a real error vs. just
that the result does not have the property.
int SwishError(struct SWISH *handle)
This function returns the last error code. It's often used as a test to see
if any errors happened on the last operation.
char *SwishErrorString(struct SWISH *handle)
Returns the string version of the error code. See src/error.c
for possible errors. This is a generic error class. See
SwishLastErrorMsg() for possible specific messages.
char *SwishLastErrorMsg(struct SWISH *handle)
This can return additional (more specific) information about the last
error. For example, SwishErrorString() might return:
Index file error
But SwishLastErrorMsg might give details like:
Couldn't open the property file "index1.prop": No such file or directory
This will return true if the last error was critical. A critical error
means swish is in an unstable state and you must call
SwishClose() on the handle.
Call this after calling SwishInit() and any messages or
warning will be sent to stderr (standard error) instead of to stdout. This
might be important when running swish-e in a web server environment.