libbnr is an implementation of the Bayesian Noise Reduction (BNR) algorithm. Every sample of text contains some degree of noise: data that is irrelevant, intentionally or not, to accurate statistical analysis of the sample, and whose removal results in a cleaner analysis. BNR removes this noise before classification, so the classifier learns from and evaluates only the data relevant to the classification, which ultimately leads to better sample analysis. libbnr can be linked into your classifier and called through a standard C interface.
| Tags | Adaptive Technologies Communications Email Filters Usenet News Software Development Libraries Text Processing |
|---|---|
| Licenses | GPL |
| Operating Systems | POSIX Unix |
| Implementation | C C++ |
Recent releases


Changes: A critical bug that caused an invalid memory read in bnr_hash_destroy() has been fixed.


Changes: Some minor changes were made to the API to accommodate the needs of some filters. Some symbols were also renamed to avoid conflicts with other libraries.


Changes: This version employs a purely statistical method of noise reduction based on pattern learning and consistency checking. Patterns of p-value tuples are generated and learned as metatokens within the classifier. The disposition of each pattern is then compared against the p-values of the tokens included in the pattern. Tokens whose p-values are inconsistent with the pattern by more than an exclusionary radius are eliminated as noise.
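
The pattern consistency check described in this release note can be sketched in C roughly as follows. This is an illustrative sketch only, not libbnr's actual code or API: the names `reduce_noise`, `constant_lookup`, and `pattern_lookup_fn`, the window size of 3, the 0.05 banding step, the 0.25 exclusionary radius, and the 0.45 confidence threshold are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* All constants below are assumptions for illustration. */
#define WINDOW        3     /* tokens per pattern */
#define BAND          0.05  /* p-values are rounded to this step */
#define EX_RADIUS     0.25  /* exclusionary radius */
#define MIN_DEVIATION 0.45  /* how far from 0.5 a pattern must be to be trusted */

/* Absolute value without depending on libm. */
double absd(double x) { return x < 0 ? -x : x; }

/* Round a p-value in [0, 1] to the nearest band so that similar
 * contexts map to the same pattern. */
double band(double p) { return (double)(int)(p / BAND + 0.5) * BAND; }

/* Hypothetical callback: returns the classifier's learned p-value for
 * the metatoken formed by WINDOW banded p-values. */
typedef double (*pattern_lookup_fn)(const double bands[WINDOW], void *user);

/* Stand-in lookup for demonstration: ignores the pattern and returns
 * the fixed p-value that `user` points to. */
double constant_lookup(const double bands[WINDOW], void *user)
{
    (void)bands;
    return *(const double *)user;
}

/* Slide a window over the token p-values. For each window whose pattern
 * has a strong disposition, flag every token whose own p-value differs
 * from the pattern's p-value by more than EX_RADIUS.
 * Sets eliminated[i] to 1 for flagged tokens and 0 otherwise, and
 * returns the number of flagged tokens. */
size_t reduce_noise(const double *pvalue, size_t n,
                    pattern_lookup_fn lookup, void *user,
                    int *eliminated)
{
    size_t count = 0;
    assert(lookup != NULL);
    assert(n == 0 || (pvalue != NULL && eliminated != NULL));

    for (size_t i = 0; i < n; i++)
        eliminated[i] = 0;

    for (size_t i = 0; i + WINDOW <= n; i++) {
        double bands[WINDOW];
        for (size_t j = 0; j < WINDOW; j++)
            bands[j] = band(pvalue[i + j]);

        double pattern_p = lookup(bands, user);

        /* Skip patterns that are too close to neutral to be trusted. */
        if (absd(pattern_p - 0.5) < MIN_DEVIATION)
            continue;

        for (size_t j = 0; j < WINDOW; j++) {
            if (!eliminated[i + j] &&
                absd(pvalue[i + j] - pattern_p) > EX_RADIUS) {
                eliminated[i + j] = 1;
                count++;
            }
        }
    }
    return count;
}
```

A caller would pass the per-token p-values in message order together with a lookup backed by its own classifier, then skip every token flagged in `eliminated` when computing the final score.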


Changes: Some initial release bugs in the algorithm were repaired. The code was upgraded to v1.2 of the algorithm.


No changes have been submitted for this release.