5. Improvements which are necessary
For a computer to interact successfully with humans using natural language, advanced
technology in the field of Speech and Language Processing is required. Speech and
Language Processing can be divided into the following areas:
- Speech recognition
- Natural language understanding
- Information retrieval
- Information extraction
- Inference
- Natural language generation
- Speech synthesis
Speech Recognition
Speech-recognition software is now widely available. It is not yet a perfected technology,
but competition in the field has driven rapid improvement: recent software versions
recognise vocabularies of tens of thousands of words, handle up to a hundred words per
minute, and cover an extensive range of idioms. More research is still needed, especially
in distinguishing human speech from other sounds and in enabling the computer to
recognise different human voices.
Natural Language Understanding
However, we need to do more than convert audible signals to digital symbols and
back again. We must continue to incorporate language-understanding programs into our
systems so that computers can comprehend the meaning and context of spoken words.
Computers must do more than process what we say; they must work out what we mean, through
syntactic, semantic and pragmatic analysis of sentences. Ambiguities arise at every level
of language, and computers must be able to resolve them through techniques such as
part-of-speech tagging and word-sense disambiguation.
Deciding whether the phrase 'boil the kettle' means to bring the water in the kettle to
100 °C, or to bring the metal and other kettle components to their boiling points, can
be resolved by word-sense disambiguation. The computer must also analyse a sentence into
noun and verb phrases and divide these phrases into smaller units such as nouns, verbs and
adjectives (part-of-speech tagging), so that the function of each morpheme is clearly
established.
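The word-sense disambiguation step above can be sketched with a toy version of a dictionary-overlap method (in the spirit of the Lesk algorithm): each sense of an ambiguous word carries a signature of related words, and the sense whose signature best overlaps the surrounding context wins. The senses and signature words below are invented purely for illustration.

```python
# Toy word-sense disambiguation: pick the sense whose signature words
# overlap most with the words surrounding the ambiguous term.
# Senses and signatures are invented for illustration only.

SENSES = {
    "kettle": [
        ("heat the water inside the kettle", {"heat", "water", "boil", "tea", "cup"}),
        ("melt the metal kettle itself", {"melt", "metal", "vessel", "furnace"}),
    ],
}

def disambiguate(word, context_words):
    """Return the gloss of the sense overlapping most with the context."""
    context = set(context_words)
    best_gloss, best_overlap = None, -1
    for gloss, signature in SENSES[word]:
        overlap = len(context & signature)
        if overlap > best_overlap:
            best_gloss, best_overlap = gloss, overlap
    return best_gloss

print(disambiguate("kettle", ["please", "boil", "the", "kettle", "for", "tea"]))
# The 'boil'/'tea' context selects the water-heating sense.
```

A real system would use far richer sense inventories and context models, but the principle of choosing the contextually best-supported sense is the same.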
Probabilistic parsing is a method for resolving syntactic ambiguity, based on assigning
probabilities to the possible parses of a sentence. The parser then selects the parse with
the highest probability of being correct (that is, of matching the intended meaning).
Resolving these problems remains a huge task, however: there are many possibilities, many
wrong paths the program can take that must be repaired, and always new situations that
remain to be added.
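The parse-selection idea can be sketched with a toy probabilistic grammar for the classic ambiguity "I saw the man with the telescope": each candidate parse is the list of grammar rules in its derivation, its probability is the product of the rule probabilities, and the parser keeps the most probable derivation. The rules and numbers below are invented for illustration; real systems estimate them from annotated corpora.

```python
from math import prod

# Hypothetical rule probabilities for a toy probabilistic grammar.
rule_prob = {
    "VP -> V NP PP": 0.3,   # "saw [the man] [with the telescope]"
    "VP -> V NP":    0.7,
    "NP -> NP PP":   0.1,   # "[the man [with the telescope]]"
    "NP -> Det N":   0.9,
}

# Each candidate parse is the list of rules used in its derivation.
parses = {
    "PP attaches to verb": ["VP -> V NP PP", "NP -> Det N", "NP -> Det N"],
    "PP attaches to noun": ["VP -> V NP", "NP -> NP PP", "NP -> Det N", "NP -> Det N"],
}

def parse_probability(rules):
    """Probability of a parse = product of its rule probabilities."""
    return prod(rule_prob[r] for r in rules)

best = max(parses, key=lambda name: parse_probability(parses[name]))
for name, rules in parses.items():
    print(f"{name}: {parse_probability(rules):.4f}")
print("chosen parse:", best)
```

With these invented numbers the verb-attachment reading wins; different probabilities, learned from real text, would shift the choice.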
Information Retrieval and Extraction
The computer would need to convert the request into a specially formatted query, determine
which database to search, and extract the pertinent information from that database. We would
need to increase the number of domains: the computer must look across all domains without
being specifically requested to do so. Computers can complete this task reasonably well;
once sentences are syntactically parsed and the correct meaning is understood, the correct
data can be extracted. However, programs like Galaxy are currently confined to one domain
at a time (e.g. weather). Programs need to be able to draw data from all domains without
any further input from the user. In effect there should be only one domain, containing vast
amounts of knowledge, and advances in processing speed and memory should provide acceptable
search speeds with no waiting period, allowing conversations to happen in real time. The
knowledge-acquisition process also needs to be automated, so that agents (software which
runs without constant human supervision) can learn the vast amounts of world knowledge
required; otherwise that information would have to be entered manually. It seems that the
deeper you go in trying to capture human knowledge, the more details there are to represent.
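The single-broad-domain idea can be sketched as a dispatcher that, instead of confining a query to one pre-selected database, routes it to every domain handler and merges the answers. The domain data and the simple keyword matcher below are invented for illustration.

```python
# Toy cross-domain retrieval: search every domain for a query, rather
# than a single pre-selected one. Data and matching are illustrative.

DOMAINS = {
    "weather": {"dublin": "rain expected", "madrid": "sunny"},
    "flights": {"dublin": "EI123 departs 09:40"},
    "hotels":  {"madrid": "rooms available from EUR 80"},
}

def answer(query):
    """Search all domains; collect every fact about entities in the query."""
    key = query.lower()
    results = {}
    for domain, table in DOMAINS.items():
        for entity, fact in table.items():
            if entity in key:
                results[domain] = fact
    return results

print(answer("What should I know about Dublin?"))
# Draws from both the weather and flights domains without being told which.
```

A real system would replace the keyword match with full parsing and query formatting, but the point stands: no domain is excluded unless the data simply is not there.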
Natural Language Generation
In natural language generation, a computer automatically creates natural language from a
computational representation. Once the information is retrieved, the program must plan and
organise it, deciding on the scope of information to generate, the lexical content and the
syntactic structure. It then renders the information in the user's natural language,
starting with the phrases and morphemes of a sentence and then translating these morphemes
into phonemes. This technology is currently used for automatic weather reports and for
explanations in expert systems, but is still largely under development.
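A minimal sketch of this planning step, in the style of the automatic weather reports mentioned above: from a computational representation (here a dictionary of weather facts), the program chooses lexical content and a syntactic frame and emits an English sentence. The field names and thresholds are invented for illustration.

```python
# Toy natural language generation: render structured weather data as
# an English sentence. Field names and thresholds are illustrative.

def generate_weather_report(data):
    """Choose words (lexical content) and slot them into a sentence frame."""
    sky = "cloudy" if data["cloud_cover"] > 0.5 else "clear"
    trend = "rising" if data["temp_trend"] > 0 else "falling"
    return (f"Today in {data['city']} it will be {sky}, "
            f"with temperatures around {data['temp_c']} degrees and {trend}.")

report = generate_weather_report(
    {"city": "Galway", "cloud_cover": 0.8, "temp_c": 14, "temp_trend": 1})
print(report)
```

Production generators add sentence planning, aggregation and referring-expression choice on top of this template core.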
Speech Synthesis
The computer must be able to produce properly formed sentences and to engage in
dialogue with the user to clarify ambiguities or mistakes in the user's questions. The
dialogue must be natural-sounding, highly intelligible speech. This is achieved using a database
of recorded speech. The database is obtained by recording someone whose voice we wish to
emulate. This person records a set of words and phrases. Those words and phrases are then
run through a computer program that stores them in a database in a way that can be accessed
by a synthesizer. Then, when speech is synthesized, the longest continuous strings of speech
are appended together. So if someone is trying to synthesize a phrase or sentence that has
been stored in the database, the resulting synthesized speech will sound as natural as
recorded speech. If, however, someone is trying to synthesize a more unusual phrase or
sentence, shorter portions of speech will be appended together. The resulting synthetic
speech will sound slightly less natural, but it will still be in the same voice as the
person who was recorded. Synthesizing speech this way allows all possible words, phrases,
and sentences to be synthesized even though only a limited number of words and phrases have
been recorded.
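The concatenation step described above can be sketched as a greedy longest-match search: cover the target sentence with the longest recorded strings of words available, falling back to single words where nothing longer was recorded. The recorded inventory below is invented for illustration.

```python
# Toy unit selection for concatenative synthesis: greedily take the
# longest recorded string matching the start of the remaining sentence.
# The recorded inventory is invented for illustration.

RECORDED = {"good morning", "the weather today", "is", "sunny", "good", "evening"}

def synthesize(sentence):
    """Cover the sentence with the longest recorded word strings."""
    words = sentence.lower().split()
    units = []
    i = 0
    while i < len(words):
        for j in range(len(words), i, -1):  # try the longest span first
            span = " ".join(words[i:j])
            if span in RECORDED:
                units.append(span)
                i = j
                break
        else:
            units.append(words[i])  # fall back to a single, unrecorded word
            i += 1
    return units

print(synthesize("Good morning the weather today is sunny"))
```

Long chunks give near-recorded naturalness; unusual sentences decompose into shorter, slightly less natural units, exactly as the text describes.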
6. Summary
Significant progress has been made both in fundamental research (search, logic, machine
learning, knowledge representation, computational linguistics, robotics) and in deployed
applications (speech recognition and synthesis, financial expert systems, chess-playing
programs).
However, much more research is needed in natural language understanding, particularly in
resolving ambiguities (syntactic, semantic and pragmatic). Further research is needed in
using models such as state machines (Markov models), formal rule systems (regular-grammar
and context-free-grammar rule systems) and logic (predicate calculus). State-space search
algorithms and dynamic-programming algorithms are among the other methods used. All of
these techniques are boosted by probability theory, where we are essentially concerned with
finding the most probable sequence of words given a sentence or phrase. There are many
possible sequences, and we assign a probability to each (e.g. in speech recognition,
selecting the most probable sequence from those proposed by the speech analyser).
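The "most probable sequence" task can be sketched with a toy bigram model: each candidate transcription proposed by the speech analyser is scored as the product of its word-pair probabilities, and the highest-scoring candidate is kept. The probabilities below are invented; real models estimate them from large corpora.

```python
from math import prod

# Toy bigram scoring of candidate transcriptions from a speech analyser.
# Probabilities are invented for illustration; unseen pairs get a tiny floor.
bigram = {
    ("recognise", "speech"): 0.01,
    ("wreck", "a"): 0.002, ("a", "nice"): 0.005, ("nice", "beach"): 0.001,
}

def sequence_probability(words):
    """Product of the bigram probabilities along the word sequence."""
    return prod(bigram.get(pair, 1e-6) for pair in zip(words, words[1:]))

candidates = [
    ["recognise", "speech"],
    ["wreck", "a", "nice", "beach"],
]
best = max(candidates, key=sequence_probability)
print(" ".join(best))
```

Dynamic-programming algorithms such as Viterbi search do this efficiently over the full lattice of candidates rather than an explicit list.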
Also, the knowledge-acquisition process must somehow be automated so that the computers can
read and learn on their own.
There are basically two schools of thought, with different approaches to overcoming
these problems. The 'top-down' approach sees human thought processes as the result of rule-based
symbol processing in the brain: the brain manipulates symbols according to set rules. The chess-playing
program Deep Blue applied these methods when it dethroned chess champion Garry
Kasparov. However, this approach requires enormous amounts of manual coding.
In contrast, the 'bottom-up' approach seeks to build a mechanism that can evolve useful
systems by itself. Using neural networks, in which the computing structure of neurons in the
brain is emulated, statistical language models can now perform many tasks once thought to
require manually constructed rules, such as word-sense disambiguation. The hope is that
machines will develop intelligence by learning from their surroundings, as humans do.
Neural networks are an approach to machine learning that uses simple processing units
(neurons), organised in a layered and highly parallel architecture, to perform arbitrarily
complex calculations. Learning is achieved through repeated minor modifications to selected
neurons, which results in a very powerful classification system. Successful applications
include stock-market analysis and character and speech recognition.
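The learning rule described above, repeated minor modifications to a unit's weights, can be sketched with a single neuron (a perceptron) learning the logical AND function. This is a deliberately minimal example of the principle, not a full layered network.

```python
# Minimal sketch of neural-style learning: one neuron's weights receive
# repeated small corrections until it classifies its examples correctly.

def train_perceptron(examples, epochs=20, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in examples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out
            # minor modification proportional to the error
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
for (x1, x2), target in AND:
    print((x1, x2), "->", 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0)
```

Stacking many such units in layers, and propagating the error corrections backwards through them, yields the powerful classifiers the text refers to.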
So some researchers see the solution in a system modelled on neural nets - a type of
learning system. Others believe that symbol manipulation is the only viable approach.
Perhaps a combination of both approaches is necessary to unravel the problems.
7. What does the future hold?
Can the human language user be replicated by a computer? Some researchers regard the
intricate workings of the brain as an algorithm, a step-by-step program, which consequently
can and will be run on a computer. But this will take decades, and will probably be developed
on an artificial brain of silicon and plastic. Is there a critical thought level, like a
critical mass? And if so, will mechanical or electronic minds start up by themselves when
they reach a certain level of complexity? Perhaps they too will have a language acquisition
device: that innate potential to develop language with which humans are born. Other
researchers ask how, since we do not yet understand the basis of human consciousness or
intelligence, we can hope to create it in machines. And the philosopher John Searle insists
that the mere computational manipulation of symbols and successful running of algorithms
does not constitute understanding and thinking.
How far are we from a computer that could coherently and articulately participate in
dialogue with humans and exhibit Hockett's characteristics of language, like HAL in '2001:
A Space Odyssey'? Perhaps developing computers that can understand and produce natural language
will not happen until the computer can also mimic other cognitive faculties including emotions,
dreams, feelings and desires. Or perhaps these cognitive faculties will appear in parallel with
the development of language understanding in computers.
On the other hand, with our artificial limbs, implants and pacemakers of today, and with the
mechanical hearts, cochlear and retinal implants of tomorrow, perhaps we are already part
cyborg. How long will it be before we implant integrated circuits into our brains, to enhance
or repair functions? We will become cyborgs. There will be no distinction between man and
machine. Homo Sapiens will vanish and evolve into Robo Sapiens - with far superior intelligence
and unlimited life expectancy.