Problem C
Lost is Close to Lose
Better Documents Inc. is contemplating the next generation of word processors. Now, nearly every word processor includes a Spell Checker. BDI, however, is looking forward to replacing that with a true Typo Checker. We’ve all been caught, after relying on a spell checker, by typing mistakes (“typos”) that just happen to wind up as a correctly spelled word, just not the word we intended. BDI hopes to use AI to determine when a word’s context suggests that it is out of place and probably should have been a different, but similarly spelled, word.
As a first step in this process, they want to see how common such similar words really are in ordinary text. Write a program to read in paragraphs of text and to produce a list of similarly spelled words occurring in that text.
For the purpose of this program, a word is any maximal string of non-whitespace characters containing at least one alphabetic character. Whitespace can be either blanks or line terminators (“\r” or “\n”). The core of a word is what you have left after removing any non-alphabetic characters and replacing any upper-case alphabetic characters to their lower-case equivalents.
Two words are considered to be similarly spelled if the core of one word can be converted to the core of the other word by a single application of any one of the following transformations:
-
Delete a single character.
-
Insert a single alphabetic character.
-
Replace a single character by a different alphabetic character.
-
Transpose (exchange) any two adjacent characters.
Input
Input consists of $1$ to $100$ lines of text, followed by an end of input marker in the form of a line containing only the string “***”.
Each line of text will contain $0$ to $80$ ASCII characters (not counting line terminators).
Output
For each word core in the text that has one or more similarly spelled words, print a line consisting of
-
That word core
-
A colon (“:”) followed by a blank
-
A list of all similarly spelled word cores (with no duplicates and not containing the core to the left of the colons), in alphabetic order, separated by single spaces.
The lines printed should be in alphabetic order of the word cores to the left of the colon.
If there are no similarly spelled words in the input, print a single line containing the string “***”.
Sample Input 1 | Sample Output 1 |
---|---|
Lost is Close to Lose "Better Documents Inc. wants to add Typo Checking in to the next generation of word processors," he said. *** |
close: lose he: the in: inc is inc: in is: in lose: close lost lost: lose the: he |
Sample Input 2 | Sample Output 2 |
---|---|
The fox said, "When?" "Not till 12 o'clock", replied the hen. "That clock is stopped, it will never strike.", he said. *** |
clock: oclock he: hen the hen: he when is: it it: is oclock: clock the: he till: will when: hen will: till |
Sample Input 3 | Sample Output 3 |
---|---|
There are no similar words in this input set. *** |
*** |