Dividing Words into Syllables

Using a very simple algorithm that involves looking at the sequence of vowels and consonants a word contains, a computer program can decide how to divide words into syllables. For instance, consider the two sequences:

1) vowel consonant vowel

In this case, the word is divided after the first vowel. For example, this rule can be applied to the following words:

ruler > ru-ler
prolog > pro-log

2) vowel consonant consonant vowel

In this case, the word is divided between the two consonants. For example,

number > num-ber
panter > pan-ter
console > con-sole

These two rules work well for most words but fail with words like handbook and hungry, which conform to neither pattern. To divide such words, your program would have to use a library containing all words.

Write a Visual Prolog program to divide a word into syllables. The program will first ask for a word to be typed in, and then attempt to split it into syllables using the two rules just given. As we've mentioned, this will not always produce correct results.

First, the program should split the word up into a list of characters. You therefore need the following domain declarations:

    DOMAINS
        letter = symbol
        word= letter*

You must have a predicate that determines whether the letter is a vowel or a consonant. However, the two rules given can also work with the vocals (the usual vowels--a, e, i, o, and u--plus the letter y). The letter y sounds like (and is considered to be) a vowel in many words, for example, hyphen, pity, myrrh, syzygy, and martyr. To account for the vocals, you have the clauses

vocal(a). vocal(e). vocal(i).
vocal(o). vocal(u). vocal(y).

for the predicate vocal. A consonant is defined as a letter that is not a vocal:

consonant(L) :- not(vocal(L)).

You also need two more predicates. First, you need the append predicate.

append(word, word, word)

Second, you need a predicate to convert a string to a list of the characters in that string:

string_word(string, word)

This predicate will use the standard predicate frontstr (described in chapter 13), as well as the standard predicates free and bound (where free(X) succeeds if X is a free variable at the time of calling, and bound(Y) succeeds if Y is bound), to control which clause to activate, dependent on the flow-pattern.

Now you're ready to attack the main problem: defining the predicate divide that separates a word into syllables. divide has four parameters and is defined recursively. The first and second parameters contain, respectively, the Start and the Remainder of a given word during the recursion. The last two arguments return, respectively, the first and the last part of the word after the word has been divided into syllables.

As a example, the first rule for divide is:

    divide(Start, [T1, T2, T3|Rest], D, [T2, T3|Rest]) :-
        vocal(T1), consonant(T2), vocal(T3),
        append(Start, [T1], D).

where Start is a list of the first group of characters in the word to be divided. The next three characters in the word are represented by T1, T2, and T3, while Rest represents the remaining characters in the word. In list D, the characters T2 and T3, and the list Rest represent the complete sequence of letters in the word. The word is divided into syllables at the end of those letters contained in D.

This rule can be satisfied by the call:

divide([p, r], [o, l, o, g], P1, P2)

To see how, insert the appropriate letters into the clause:

    divide([p, r], [o, l, o|[g]], [p, r, o], [l, o | [g]]) :-
        vocal(o), consonant(l), vocal(o),
        append([p, r], [o], [p, r, o]).

The append predicate concatenates the first vocal to the start of the word. P1 becomes bound to [p, r, o], and P2 is bound to [l, o, g].

The second rule for divide is shown in the complete program, 6.

/* Program ch16e06.pro */

DOMAINS

lett er = char

word_ = letter*

PREDICATES

nondeterm divide(word_,word_,word_,word_)

vocal(letter)

consonant(letter)

nondeterm string_word(string,word_)

append(word_,word_,word_)

nondeterm repeat

CLAUSES

vocal(T1),consonant(T2),vocal(T3),

append(Start,[T1],D1).

divide(Start,[T1,T2,T3,T4|Rest],D1,[T3,T4|Rest]):-

vocal(T1),consonant(T2),consonant(T3),vocal(T4),

append(Start,[T1,T2],D1).

divide(Start,[T1|Rest],D1,D2):-

append(Start,[T1],S),

divide(S,Rest,D1,D2).

vocal('a'). vocal('e'). vocal('i').

vocal('o'). vocal('u'). vocal('y').

consonant(B):-

not(vocal(B)),B <= 'z','a' <= B.

string_word("",[]):-!.

string_word(Str,[H|T]):-

bound(Str),frontchar(Str,H,S),string_word(S,T).

string_word(Str,[H|T]):-

free(Str),bound(H),string_word(S,T),frontchar(Str,H,S).

append([],L,L):-!.

append([X|L1],L2,[X|L3]):-

append(L1,L2,L3).

repeat:-repeat.

repeat,

write("Write a multi-syllable word: "),

readln(S),nl,

string_word(S,Word),

divide([],Word,Part1,Part2),

string_word(Syllable1,Part1),

string_word(Syllable2,Part2),

write("Division: ",Syllable1,"-",Syllable2),nl,

fail.