1.1 Is Pattern Recognition Important?
Pattern recognition is the scientific discipline whose goal is the classification of objects into a number of categories or classes. Depending on the application, these objects can be images or signal waveforms or any type of measurements that need to be classified. We will refer to these objects using the generic term patterns. Pattern recognition has a long history, but before the 1960s it was mostly the output of theoretical research in the area of statistics. As with everything else, the advent of computers increased the demand for practical applications of pattern recognition, which in turn set new demands for further theoretical developments. As our society evolves from the industrial to its postindustrial phase, automation in industrial production and the need for information handling and retrieval are becoming increasingly important. This trend has pushed pattern recognition to the leading edge of today's engineering applications and research. Pattern recognition is an integral part of most machine intelligence systems built for decision making.
Machine vision is an area in which pattern recognition is of importance. A machine vision system captures images via a camera and analyzes them to produce descriptions of what is imaged. A typical application of a machine vision system is in the manufacturing industry, either for automated visual inspection or for automation in the assembly line. For example, in inspection, manufactured objects on a moving conveyor may pass the inspection station, where the camera stands, and it has to be ascertained whether there is a defect. Thus, images have to be analyzed online, and a pattern recognition system has to classify the objects into the “defect” or “nondefect” class. After that, an action has to be taken, such as to reject the offending parts. In an assembly line, different objects must be located and “recognized,” that is, classified in one of a number of classes known a priori. Examples are the “screwdriver class,” the “German key class,” and so forth in a tool manufacturing unit. Then a robot arm can move the objects to the right place.
Character (letter or number) recognition is another important area of pattern recognition, with major implications in automation and information handling. Optical character recognition (OCR) systems are already commercially available and more or less familiar to all of us. An OCR system has a “front-end” device consisting of a light source, a scan lens, a document transport, and a detector. At the output of the light-sensitive detector, light-intensity variation is translated into “numbers” and an image array is formed. In the sequel, a series of image processing techniques is applied, leading to line and character segmentation. The pattern recognition software then takes over to recognize the characters, that is, to classify each character in the correct “letter, number, punctuation” class. Storing the recognized document has a twofold advantage over storing its scanned image. First, further electronic processing, if needed, is easy via a word processor, and second, it is much more efficient to store ASCII characters than a document image. Besides the printed character recognition systems, there is a great deal of interest invested in systems that recognize handwriting. A typical commercial application of such a system is in the machine reading of bank checks. The machine must be able to recognize the amounts in figures and digits and match them. Furthermore, it could check whether the payee corresponds to the account to be credited. Even if only half of the checks are manipulated correctly by such a machine, much labor can be saved from a tedious job. Another application is in automatic mail-sorting machines for postal code identification in post offices. Online handwriting recognition systems are another area of great commercial interest. Such systems will accompany pen computers, with which the entry of data will be done not via the keyboard but by writing. This complies with today's tendency to develop machines and computers with interfaces acquiring human-like skills.
Computer-aided diagnosis is another important application of pattern recognition, aiming at assisting doctors in making diagnostic decisions. The final diagnosis is, of course, made by the doctor. Computer-assisted diagnosis has been applied to and is of interest for a variety of medical data, such as X-rays, computed tomographic images, ultrasound images, electrocardiograms (ECGs), and electroencephalograms (EEGs). The need for a computer-aided diagnosis stems from the fact that medical data are often not easily interpretable, and the interpretation can depend very much on the skill of the doctor. Let us take for example X-ray mammography for the detection of breast cancer. Although mammography is currently the best method for detecting breast cancer, 10 to 30% of women who have the disease and undergo mammography have negative mammograms. In approximately two-thirds of these cases with false results, the radiologist failed to detect the cancer, which was evident retrospectively. This may be due to poor image quality, eye fatigue of the radiologist, or the subtle nature of the findings. The percentage of correct classifications improves at a second reading by another radiologist. Thus, one can aim to develop a pattern recognition system in order to assist radiologists with a “second” opinion. Increasing confidence in the diagnosis based on mammograms would, in turn, decrease the number of patients with suspected breast cancer who have to undergo surgical breast biopsy, with its associated complications.
Speech recognition is another area in which a great deal of research and development effort has been invested. Speech is the most natural means by which humans communicate and exchange information. Thus, the goal of building intelligent machines that recognize spoken information has been a long-standing one for scientists and engineers as well as science fiction writers. Potential applications of such machines are numerous. They can be used, for example, to improve efficiency in a manufacturing environment, to control machines in hazardous environments remotely, and to help handicapped people to control machines by talking to them. A major effort, which has already had considerable success, is to enter data into a computer via a microphone. Software, built around a pattern (spoken sounds in this case) recognition system, recognizes the spoken text and translates it into ASCII characters, which are shown on the screen and can be stored in the memory. Entering information by “talking” to a computer is twice as fast as entry by a skilled typist. Furthermore, this can enhance our ability to communicate with deaf and dumb people.
Data mining and knowledge discovery in databases is another key application area of pattern recognition. Data mining is of intense interest in a wide range of applications such as medicine and biology, market and financial analysis, business management, science exploration, and image and music retrieval. Its popularity stems from the fact that in the age of information and knowledge society there is an ever increasing demand for retrieving information and turning it into knowledge. Moreover, this information exists in huge amounts of data in various forms including text, images, audio, and video, stored in different places distributed all over the world. The traditional way of searching information in databases was the description-based model, where object retrieval was based on keyword description and subsequent word matching. However, this type of searching presupposes that a manual annotation of the stored information has previously been performed by a human. This is a very time-consuming job and, although feasible when the size of the stored information is limited, it is not possible when the amount of the available information becomes large. Moreover, the task of manual annotation becomes problematic when the stored information is widely distributed and shared by a heterogeneous “mixture” of sites and users. Content-based retrieval systems are becoming more and more popular, where information is sought based on “similarity” between an object, which is presented to the system, and objects stored in sites all over the world. In a content-based image retrieval (CBIR) system, an image is presented to an input device (e.g., scanner). The system returns “similar” images based on a measured “signature,” which can encode, for example, information related to color, texture, and shape. In a music content-based retrieval system, an example (i.e., an extract from a music piece) is presented to a microphone input device and the system returns “similar” music pieces. In this case, similarity is based on certain (automatically) measured cues that characterize a music piece, such as the music meter, the music tempo, and the location of certain repeated patterns.
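The idea of retrieving by a measured “signature” can be illustrated with a minimal sketch. It assumes, purely for illustration, that the signature is a coarse joint RGB color histogram and that similarity is measured by Euclidean distance between signatures; the function names, the toy image library, and the bin count are hypothetical and not taken from any particular CBIR system, which would typically also encode texture and shape.

```python
from math import sqrt


def color_signature(pixels, bins=4):
    """Return a normalized joint RGB histogram (assumed signature) of a pixel list.

    `pixels` is a list of (r, g, b) tuples with values in 0..255.
    """
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each pixel to one of bins**3 coarse color cells.
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1.0
    return [h / len(pixels) for h in hist]


def distance(sig_a, sig_b):
    """Euclidean distance between two signatures (smaller means more similar)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(sig_a, sig_b)))


def retrieve(query_pixels, library, top_k=1):
    """Rank the stored images by signature distance to the query image."""
    q = color_signature(query_pixels)
    ranked = sorted(
        library,
        key=lambda name: distance(q, color_signature(library[name])),
    )
    return ranked[:top_k]


# Toy "database": each image is just a short list of pixels (hypothetical data).
library = {
    "sunset": [(250, 120, 30)] * 8 + [(200, 60, 20)] * 2,
    "forest": [(20, 160, 40)] * 9 + [(10, 90, 30)],
    "ocean": [(10, 60, 220)] * 10,
}
query = [(245, 110, 35)] * 7 + [(30, 150, 50)] * 3

print(retrieve(query, library, top_k=2))  # → ['sunset', 'forest']
```

No keywords or manual annotation are involved: the ranking depends only on quantities measured from the content itself, which is the point of the content-based model.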
Mining for biomedical and DNA data analysis has enjoyed an explosive growth since the mid-1990s. All DNA sequences comprise four basic building elements, the nucleotides: adenine (A), cytosine (C), guanine (G), and thymine (T). Like the letters in our alphabets and the seven notes in music, these four nucleotides are combined to form long sequences in a twisted ladder form. Genes usually consist of hundreds of nucleotides arranged in a particular order. Specific gene-sequence patterns are related to particular diseases and play an important role in medicine. To this end, pattern recognition is a key area that offers a wealth of developed tools for similarity search and comparison between DNA sequences. Such comparisons between healthy and diseased tissues are very important in medicine to identify critical differences between these two classes.
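As a rough illustration of what a similarity comparison between DNA sequences can look like, the sketch below uses the edit (Levenshtein) distance, one common string-comparison measure, chosen here as an assumption and not as the specific tool the text has in mind. The reference sequences and the class labels are invented toy data.

```python
def edit_distance(a, b):
    """Minimum number of single-nucleotide insertions, deletions, or
    substitutions needed to turn sequence `a` into sequence `b`."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def closest_class(sequence, references):
    """Return the label of the reference sequence nearest to `sequence`."""
    return min(references, key=lambda label: edit_distance(sequence, references[label]))


# Hypothetical reference patterns for two classes (toy data, not real genes).
references = {
    "healthy": "ACGTTGCAAC",
    "diseased": "ACGTAGCTAC",
}

sample = "ACGTAGCAAC"
print(edit_distance(sample, references["healthy"]))   # → 1
print(edit_distance(sample, references["diseased"]))  # → 1
print(closest_class("ACGTAGCTAA", references))        # → diseased
```

Here the first sample is equally close to both references, which shows why practical systems rely on richer similarity measures and many more reference patterns than a single nearest-sequence rule.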
The foregoing are only five examples from a much larger number of possible applications. Typically, we refer to fingerprint identification, signature authentication, text retrieval, and face and gesture recognition. The last applications have recently attracted much research interest and investment in an attempt to facilitate human–machine interaction and further enhance the role of computers in office automation, automatic personalization of environments, and so forth. Just to provoke imagination, it is worth pointing out that the MPEG-7 standard includes a provision for content-based video information retrieval from digital libraries of the type: search and find all video scenes in a digital library showing person “X” laughing. Of course, to achieve the final goals in all of these applications, pattern recognition is closely linked with other scientific disciplines, such as linguistics, computer graphics, machine vision, and database design.
Having aroused the reader's curiosity about pattern recognition, we will next sketch the basic philosophy and methodological directions in which the various pattern recognition approaches have evolved and developed.