
DataCleaner

DataCleaner is an application for profiling, validating, and comparing data. These activities help you administer and monitor your data quality in order to ensure that your data is useful and applicable to your business situation. It can be used for master data management methodologies, data warehousing projects, statistical research, preparation for extract-transform-load activities, and more.

Tags: Office/Business, Scientific/Engineering, Information Management, Metadata/Semantic Models, Records Management, Database, Data Warehousing, Business Intelligence, Data Profiling
Implementation: Java


Recent releases

  •  20 Apr 2009 22:01

Changes: An additional HTML export format has been added to the built-in export formats (usable when exporting Profiler results in the desktop app and when executing the runjob command-line tool). The export format can be chosen directly from the desktop app. Four new measures were added to the String Analysis profile: average characters and maximum/minimum/average white spaces.
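The four String Analysis measures named above can be illustrated with a small standalone sketch. It is not DataCleaner's code or API: the class name StringMeasuresSketch, the Measures holder, and the analyze method are hypothetical, and the sketch assumes that "white spaces" means characters for which Java's Character.isWhitespace returns true and that the input values are non-null.

```java
import java.util.List;

// Hypothetical illustration only; none of these names come from DataCleaner.
class StringMeasuresSketch {

    /** Holds the four measures computed over one column of string values. */
    static final class Measures {
        final double avgChars;
        final int maxWhiteSpaces;
        final int minWhiteSpaces;
        final double avgWhiteSpaces;

        Measures(double avgChars, int maxWhiteSpaces, int minWhiteSpaces, double avgWhiteSpaces) {
            this.avgChars = avgChars;
            this.maxWhiteSpaces = maxWhiteSpaces;
            this.minWhiteSpaces = minWhiteSpaces;
            this.avgWhiteSpaces = avgWhiteSpaces;
        }
    }

    /** Counts whitespace characters in a single value (assumed definition). */
    static int countWhiteSpaces(String value) {
        int count = 0;
        for (int i = 0; i < value.length(); i++) {
            if (Character.isWhitespace(value.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    /** Computes average characters and max/min/average white spaces. */
    static Measures analyze(List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no values to analyze");
        }
        long totalChars = 0;
        long totalWhiteSpaces = 0;
        int max = Integer.MIN_VALUE;
        int min = Integer.MAX_VALUE;
        for (String value : values) {
            int whiteSpaces = countWhiteSpaces(value);
            totalChars += value.length();
            totalWhiteSpaces += whiteSpaces;
            max = Math.max(max, whiteSpaces);
            min = Math.min(min, whiteSpaces);
        }
        int n = values.size();
        return new Measures((double) totalChars / n, max, min, (double) totalWhiteSpaces / n);
    }

    public static void main(String[] args) {
        Measures m = analyze(List.of("John Smith", "Ann", "Mary  Jane Doe"));
        System.out.println("avg chars: " + m.avgChars);
        System.out.println("max/min/avg white spaces: "
                + m.maxWhiteSpaces + "/" + m.minWhiteSpaces + "/" + m.avgWhiteSpaces);
    }
}
```

A column whose maximum and minimum white space counts differ widely, as in the sample values here, may point to inconsistent formatting that is worth validating.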

  •  16 Mar 2009 07:51

Changes: The license was changed to LGPL. The profiler and validator can be executed using multiple threads. DataCleaner tasks can be executed from the command line for batch operation. More elaborate status information is given during profiler and validator execution. Date mask matcher and regex matcher profiles were added. Regexes can be loaded from the online RegexSwap repository. Popular database drivers are automatically downloaded and installed. More file types are supported, such as .dat and .txt. XML file support was improved. Memory improvements were made in the Time analysis profile. Logging when running profiling and validation was improved. An information schema is provided for file-based datastores. Columns in the datastore tree are lazy-loaded.

Changes: This release adds multi-threaded execution, a command-line interface (runjob.sh/runjob.cmd), some UI updates, and a few bug fixes.

Changes: The new online RegexSwap system has been integrated to support browsing and downloading of regexes. Popular database drivers are now downloaded and installed automatically. Templates for JDBC connection strings were added. Profiling and validation results now include detailed execution status and monitoring capabilities. Database and XML file compatibility is better due to updated MetaModel libraries.

Changes: A major functionality update was made, with many new features built upon the 1.4 stabilization release. The license was changed to LGPL. New profiles were added for a date mask matcher and a regex matcher. More file types are supported (.dat and .txt). XML file support was improved.
