Open Source Suggester Spellcheck - Spell Checking Java library

Contents:

1. What is it?
2. Advantages
3. Where to get it?
4. Requirements
5. Basic, Advanced and Enterprise versions of Suggester software
6. Documentation
7. Java Code Samples
8. Web Examples
9. Dictionaries
10. Release Notes
11. Licensing and Legal Issues


1. What is it?

Suggester Spell Check is a 100% pure Java library that provides a local spell checking service. It is free to use with the pre-compiled dictionaries. Suggester Spell Check uses Basic Suggester as its spellchecker.

What is the Suggester software?
The Suggester is a Java program that recommends replacements for unknown words in user queries to local search systems. A system administrator can create a list of preferred words and assign a higher weight to such words. In its basic implementation, the Suggester can serve as a spellchecker.

2. Advantages

Smart suggestions:
The Suggester uses a shortest edit-distance measure combined with the Metaphone algorithm and its own fuzzy-matching algorithm to select suggestions. You can adjust the influence of each algorithm using a configuration file. If your browser allows running Java applets, try the Java applet based Spelling Suggestions Test, or the more advanced remote spell checking service, the English Spell Check test, to see how it works.
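
To illustrate the edit-distance part only, here is a minimal sketch of ranking dictionary words by Levenshtein distance. It is not the Suggester's code: the class name EditDistanceSketch, the methods editDistance and suggest, and the maxDistance parameter are hypothetical, the Metaphone and fuzzy-matching parts are left out, and the sketch assumes Java 9 or newer rather than the Java 1.2 level the library targets.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of the Suggester API.
class EditDistanceSketch {

    // Classic Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn a into b.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the dictionary words within maxDistance of word, closest first.
    // Words at the same distance keep their dictionary order.
    static List<String> suggest(String word, List<String> dictionary, int maxDistance) {
        List<String> result = new ArrayList<>();
        for (String candidate : dictionary) {
            if (editDistance(word, candidate) <= maxDistance) {
                result.add(candidate);
            }
        }
        result.sort(Comparator.comparingInt((String c) -> editDistance(word, c)));
        return result;
    }

    public static void main(String[] args) {
        List<String> dictionary = List.of("spelling", "spilling", "selling", "suggest");
        System.out.println(suggest("speling", dictionary, 2));
    }
}
```

A real suggester would not scan the whole dictionary for every unknown word; the linear scan here only keeps the sketch short.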

Local service:
Unlike Google's Spelling API, the Suggester library and a dictionary file are all you need to run a local spellchecking service. There is no need to worry about exposure on the Internet, connectivity problems, or the availability of an external service, and there are no hidden fees.

Multi-lingual:
See below which dictionaries / languages are available for download.

Custom dictionaries:
The Index Builder lets users create custom dictionaries. It can also be used to extract all words from a dictionary and to modify existing dictionaries. The Index Builder is included in the Basic Suggester download package.

High dictionary compression:
The word dictionary is compressed both on disk and in memory. A basic UK English dictionary contains about 57,000 words and has a size of about 90 KB. The full English dictionary contains about 200,000 words (including names, abbreviations, geographic places, etc.); it takes a 236 KB file on disk and about 2 MB in memory. Other languages compress even better. For example, the Russian dictionary contains more than 1,300,000 words (including variants); it takes a 315 KB file on disk and a little more than 2 MB in memory. Compared with the original word list, which is more than 30 MB in UTF-8 format, the compressed file is close to 1% of the original size.

High dictionary search and suggestion selection speed:
Dictionary look-up, whether case-sensitive or case-insensitive, takes about 0.002 to 0.005 ms per word (about 500,000 to 200,000 words per second). Suggestion search averages about 40 ms per set of suggestions for each unknown word on a Pentium M 1.4 GHz (with high quality of suggestions).

Portability:
The Suggester software is written entirely in Java 1.2. It runs on any Java® platform: Windows®, Mac OS®, Unix, Linux. It has been tested on JRE 1.2 and up.

A comparison of Suggester with other popular spellchecking software (the Apache Java library "Jazzy", the web site Dictionary.com, Microsoft Word 2000, and the Google search engine) was done in 2007 and is now outdated. In 2016, ROSS Intelligence did an independent evaluation of legal words in three Java open source spell checkers: Hunspell, Basic Suggester, and Jazzy.
Please note one important limitation: Basic Suggester is not context sensitive. The same applies to all other open source spell checkers.

3. Where to get it?

The home page for the Suggester Spell Check project is on the SoftCorporation LLC. web site: http://www.softcorporation.com/products/spellcheck. There you can also find information on how to download the latest release, as well as any other information you might need regarding this project.
To download, go to the Basic Suggester project page.

4. Requirements

o A Java 1.2 compatible or newer JVM for your operating system.
o There are no other requirements to run Suggester as a Spellchecker.
o To run the Index Builder you may need up to 1 GB of virtual memory.

5. Basic, Advanced and Enterprise versions of Suggester software

There are 3 different versions of Suggester software:
o Basic Suggester - (free open source) uses one dictionary, where all words have the same weight. The Suggester Spell Check uses Basic Suggester.
o Advanced Suggester - (commercial) can use multiple dictionaries with different weights assigned to each dictionary and each word. It also supports multiple languages.
o Enterprise Suggester - (not ready for distribution) has all the features of Advanced Suggester and can also compress information at a much higher rate. This is achieved by removing repeated segments of the trie that stores the dictionary information. As a result, each trie segment of the Enterprise Suggester dictionary is unique.
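
The idea of removing repeated trie segments can be sketched as follows. This is a generic illustration of sharing identical subtrees (the technique behind a directed acyclic word graph), not the Enterprise Suggester implementation; the class SharedTrieSketch and all of its methods are hypothetical names chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration; not part of the Suggester API.
class SharedTrieSketch {

    static final class Node {
        final TreeMap<Character, Node> children = new TreeMap<>();
        boolean endOfWord;
    }

    static void insert(Node root, String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.endOfWord = true;
    }

    static boolean contains(Node root, String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return false;
            }
        }
        return node.endOfWord;
    }

    // Replaces every repeated subtree with a single shared copy.
    static void compress(Node root) {
        share(root, new HashMap<>(), new IdentityHashMap<>());
    }

    // Bottom-up pass: a node's signature is its end-of-word flag plus the
    // (character code, canonical child id) pairs of its children. Nodes with
    // equal signatures accept the same suffixes, so one copy is enough.
    private static Node share(Node node, Map<String, Node> canonical, Map<Node, Integer> ids) {
        StringBuilder signature = new StringBuilder(node.endOfWord ? "1|" : "0|");
        for (Map.Entry<Character, Node> entry : node.children.entrySet()) {
            Node child = share(entry.getValue(), canonical, ids);
            entry.setValue(child); // point at the shared copy
            signature.append((int) entry.getKey().charValue())
                     .append(':').append(ids.get(child)).append(',');
        }
        Node existing = canonical.get(signature.toString());
        if (existing != null) {
            return existing;
        }
        canonical.put(signature.toString(), node);
        ids.put(node, ids.size());
        return node;
    }

    // Counts distinct node objects reachable from root.
    static int countNodes(Node root) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            if (seen.add(node)) {
                for (Node child : node.children.values()) {
                    stack.push(child);
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Node root = new Node();
        for (String word : new String[] {"walking", "talking", "walked", "talked"}) {
            insert(root, word);
        }
        System.out.println("nodes before sharing: " + countNodes(root));
        compress(root);
        System.out.println("nodes after sharing:  " + countNodes(root));
        System.out.println("contains talked: " + contains(root, "talked"));
    }
}
```

For these four words the plain trie has 19 nodes and the shared structure has 9, because "walking"/"talking" and "walked"/"talked" differ only in their first letter. Look-ups work the same way before and after sharing.
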
6. Documentation
See the Basic Suggester project for documentation.

7. Java Code Samples
Java code samples are included in the download package. For more information, see "How to use Basic Suggester Spell Check".

8. Web Examples
The Advanced and Enterprise versions of the Suggester software make it possible to build a context-sensitive spell checker, which you can try in these tests:
English Spell Check test.
Russian Spell Check test.

9. Dictionaries
Suggester dictionaries, including an English medical dictionary, are available for download. The full English/American dictionary contains about 200,000 words, including geographical places and frequently used names. The full Russian dictionary contains more than 1,300,000 words (including variants).
Send us an e-mail if you need Suggester Spell Check for other languages.

10. Release Notes


11. Licensing and Legal Issues

For legal and licensing issues, please read the LICENSE.TXT file.

Java (TM) is a trademark of Oracle Corporation.
