ASPseek

ASPseek is an Internet search engine written in C++ using the STL. It consists of an indexing robot, a search daemon, and a search frontend (CGI or Apache module). It can index up to a few million URLs and supports searching for words and phrases, wildcards, and Boolean queries. Search results can be limited to a given time period, site, or Web space (a set of sites) and sorted by relevance (PageRanks are used) or by date. It is optimized for multiple sites (threaded index, asynchronous DNS lookups, grouping of results by site, and Web spaces), but can be used for searching a single site as well. It can work with multiple languages and encodings at once (including multi-byte encodings such as Chinese) thanks to an optional Unicode storage mode. Other features include stopwords and ispell support, a charset and language guesser, HTML templates for search results, excerpts, and highlighting of query words.

Tags Internet Web Indexing/Search
Licenses GPL
Operating Systems POSIX
Implementation C++


Recent releases

  • 22 Jul 2002 08:50

Changes: A bug in reverse citation merging that was introduced in 1.2.9 was fixed. Incorrect merging of redirects was fixed. Incorrect merging of the reverse citation index was fixed for the case where URL A refers to B, which redirects to C, A also refers to C, and one of the references from A disappeared. A bug where the ResultsPerPage parameter from s.htm overrode the PS cookie was fixed. Small improvements and fixes were made to init.d/aspseek, aspseek-mysql-postinstall, the spec file, and the manual pages. The DBLibDir directive was added to aspseek.conf and searchd.conf.

  • 03 Jul 2002 07:20

Changes: This version added support for the HTTP POST method in s.cgi, the IncrementHopsOnRedirect and RedirectLoopLimit options, and sed-like \1 to \9 sequences in the Replace command in aspseek.conf. Tag parsing in "index" was improved to handle omitted quotes, and the option to limit results by a range of dates was fixed. Several rare core dumps in searchd and s.cgi and two rare memory leaks in searchd were fixed. FreeBSD portability fixes were made, along with fixes and improvements in the results cache and man pages.
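
The sed-like \1 to \9 sequences mentioned above refer to the text captured by parenthesized groups in a pattern. As a rough illustration only (this is not ASPseek's code, and it does not reproduce the actual syntax of the Replace command), the following C++ sketch uses the standard library's std::regex_replace with the format_sed flag, which follows the same convention. The function name rewrite_url and the example URLs are hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper, not part of ASPseek.
// Replaces every match of `pattern` in `url` with `replacement`.
// With format_sed, \1..\9 in `replacement` expand to the corresponding
// captured groups and & expands to the whole match, as in sed.
// If nothing matches, the input is returned unchanged.
std::string rewrite_url(const std::string& url,
                        const std::string& pattern,
                        const std::string& replacement) {
    const std::regex re(pattern);  // default ECMAScript grammar
    return std::regex_replace(url, re, replacement,
                              std::regex_constants::format_sed);
}
```

For example, calling rewrite_url("http://www.docs.example.com/index.html", "^http://www\\.([a-z]+)\\.example\\.com/(.*)$", "http://\\1.example.com/\\2") yields "http://docs.example.com/index.html": \1 expands to "docs" and \2 to "index.html".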

  • 19 Feb 2002 13:22

Changes: This release adds a generic client library (libaspseek), a module for the Apache server, and new man pages (index(1) and aspseek-sql(5)). It includes fixes for 64-bit platforms such as Alpha and for gcc3/ISO C++ conformance, faster citation merging, and a mechanism to reuse URL IDs (which saves memory in "searchd"). It also fixes incorrect processing (in some cases) of the -u, -t, and -s options in "index", a file that was not closed in "searchd", a bug in excerpts processing, and a bug that caused "orphaned" URLs in the database.

  • 07 Dec 2001 09:19

Changes: The package was made more portable, and it should now compile on non-Linux systems. The code is compilable with gcc3. Linking on certain Linux platforms was fixed. Checking of deleted entries in the inverted index was fixed. Processing of clones was fixed. The random number generation facility in s.cgi was fixed. CheckOnly and CheckOnlyNoMatch behavior was fixed (GET was used instead of HEAD). More langmaps for Czech charsets were added, and another (smaller) list of Italian stopwords was written.

  • 14 Nov 2001 13:45

Changes: This release implements a buddy-like heap and buffered file in "index" for better memory usage and faster processing. It fixes improper clones processing in "index", a few core dumps in "searchd" and "index", a few memory leaks in "index", and the -P and -A flags in "index". The amount of stack used by "index" has been reduced. Concurrency between searchd threads having many DB connections has been improved. Options have been added to "index" to delete URLs from an inverted index and to re-create broken citation files from the DB. The I/O performance of "searchd" has been optimized.

Recent comments

01 Feb 2006 16:47 newjobdirect

Aspseek is discontinued
It's obvious that SWSoft does not develop or support this great project any more. Sadly... However, there is another version in which a lot of bugs were fixed and some features were added, like MySQL 4 and gcc 3.2 support. As far as I know, this is the last and most stable version (we ran it successfully on our site last year). You can download it from http://www.newjobdirect.co.uk/aspseek/. Search Man (http://www.newjobdirect.co.uk/)
