MPCA is a comprehensive suite of tools for discrete principal components analysis on data sets of 100 MB or more. It scales by using sparse vectors, multi-threading, memory mapping, and other POSIX techniques. Reports, file dumping utilities, and other utilities are included. The general problem of discrete components analysis is variously called grade of membership, PLSA, non-negative matrix factorization, multinomial admixtures, LDA, and multinomial PCA.
| Tags | Scientific/Engineering |
|---|---|
| Licenses | GPL |
| Operating Systems | POSIX, Linux |
| Implementation | C |
Recent releases


Changes: Links to the ALVIS system at http://www.alvis.info allow the software to be used to create topic models and annotate linguistically tagged content. Some of the linkBags Perl utilities have been cleaned up and moved out to CPAN. To see some of the models in action, visit the search demos at www.alvis.info.


Changes: A bug in mpbags that made it constantly use the CPU was fixed. There is no need to update from 1.54 if you don't use mpbags.


Changes: This release focuses on integration with the tool suite at www.alvis.info. Useful new scripts include the linkBags, linkTables, and linkMpca set for running MPCA on Web link data augmented with names and title text.


Changes: This release includes a significant bugfix for gibbsk sampling and adds new capabilities to the ALVIS support (still incomplete and undocumented). Users should upgrade to this release.


Changes: This release compiles under Cygwin. Many other minor updates and moderate bugfixes were made.