SCANNING GLOSSARY

A-L

 

 

AIIM (Association for Information and Image Management) - A Maryland-based organization dedicated to promoting development of systems that store, retrieve and manage document images.

 

Applet - A small program developed using either Sun Microsystems' Java or Microsoft's ActiveX Web-based technology.

APRP (Adaptive Pattern Recognition Processing) - One of the most sophisticated technologies currently available in modern text retrieval software. APRP automatically indexes the binary patterns in digital information, creating a pattern-based memory that is optimized for the content of the data. It eliminates the costly labor of manually defining keywords and sorting and labeling information in database fields. APRP has a high tolerance for input data errors, eliminating the need for OCR cleanup.

Auto Document Feeder (ADF) - An accessory that feeds pages into a scanner when required, enabling the unattended scanning of a large number of documents.

Bandwidth - (1) Number of hertz expressing the difference between the lower and upper limiting frequencies of a frequency band. (2) Width of a band of frequencies. (3) Maximum number of information units (bits, characters) capable of traversing a communications path per second.

Barcode - Array of vertical rectangular marks and spaces in a predetermined pattern.

BMP - (1) Sometimes called the bitmap or DIB (device-independent bitmap) file format, BMP is an image file format used to store bitmap digital images, especially on Microsoft Windows and OS/2 operating systems. (2) Use BMP for any type of bitmap (pixel-based) image. BMP files are very large, but there is no loss in quality. BMP has no real benefits over TIFF, except that you can use it for Windows wallpaper. (3) The simplicity of the BMP file format, its widespread familiarity in Windows and elsewhere, and the fact that it is relatively well documented and free of patents make it a very common format that image processing programs on many operating systems can read and write.

CD (Compact Disc) - The trademarked name for the laser-read digital audio disc, 12 cm in diameter, developed jointly by Philips and Sony.

 

CD-R (Compact Disc-Recordable) - A standard and technology that enables users to write on and read from a compact disc. This new technology is compatible with existing CDs and CD players.

CD-ROM (Compact Disc Read-Only Memory) - (1) A version of the standard compact disc intended to store general-purpose digital data; provides 556-Mbyte user capacity at a corrected bit error rate of 10^-13, compared to 635 Mbyte at 10^-9 for the standard CD. (2) A circular disk used to store large amounts of electronic data. CD-ROMs can hold up to 680 MB of computer data. The media is low cost and durable, and in large-scale applications can be inexpensively duplicated into thousands of copies. Unlike rewritable optical disks, which can be written to many times, a CD-ROM is read-only.

Client-Server Based System - A system that stores electronic documents on one computer, the server, while making those documents available to other computers, the clients, via a network.

COD (Computer Originated Document) - Any document that was originally created on a computer, such as a word processing document or a spreadsheet.

 COLD (Computer Output to Laser Disk) - Microfiche replacement system. COLD systems offer economies as a replacement medium when rapid and/or frequent access to archived documents is necessary. Typically, a 12-inch optical disk platter holds approximately 1.4 million 8.5-by-11-inch pages of information, equal to 7,000 fiche masters.

Color depth - The number of colors present in the images of your converted rollfilm. Which color depth you choose depends on your project's requirements, the content of your documents, the intended use of the images, etc. Also remember that grayscale images are significantly larger than bitonal images as more color information is captured, so they will have greater storage requirements (and higher media costs).
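
As a rough illustration of why color depth drives storage requirements, the sketch below estimates the uncompressed size of one scanned page. The letter-size page, the 300 dpi resolution, the 8-bit grayscale depth, and the function name are all assumptions chosen for the example; real files are usually smaller once compression is applied.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw (uncompressed) size of a scanned page in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed example: an 8.5 x 11 inch page scanned at 300 dpi.
bitonal = uncompressed_bytes(8.5, 11, 300, 1)    # 1 bit per pixel
grayscale = uncompressed_bytes(8.5, 11, 300, 8)  # 8 bits per pixel (256 shades)

print(bitonal, grayscale, grayscale // bitonal)  # 1051875 8415000 8
```

Under these assumptions the grayscale page needs eight times the raw storage of the bitonal page, simply because each pixel carries eight bits instead of one.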

COM (Computer Output Microfilm) - A system in which digital data is converted into an image on dry processed microfilm.

 

Compound Document - Any document containing more than one data type, typically rich text, synthetic graphics and raster images.

Compression - (1) In the specific context of digital image representation, the process of compacting data based on the presence of large white or black areas in common business documents, printed pages, and engineering drawings. More generally, the re-encoding of data to make it smaller. Most image file formats use compression because image files tend to be large and consume large amounts of disk space and transmission time over networks. (2) A process that reduces the number of bytes required to define a document in order to save disk space or transmission time. Compression is achieved by replacing commonly occurring sequences of pixels with shorter codes. Some compression methods, like JPEG, throw away some data, seeking only to preserve the appearance of the image. Others, like Group IV, preserve all of the original information.
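
The idea of replacing long runs of identical pixels with shorter codes can be sketched with run-length encoding. This is a deliberately simple lossless scheme used here for illustration only; it is not the Group IV algorithm, which is considerably more elaborate. The function names and the sample row are hypothetical.

```python
def rle_encode(row):
    """Run-length encode a row of pixels (0 = white, 1 = black).

    Returns a list of (pixel_value, run_length) pairs.
    """
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            # Same value as the previous pixel: extend the current run.
            runs[-1] = (pixel, runs[-1][1] + 1)
        else:
            # Value changed: start a new run.
            runs.append((pixel, 1))
    return runs


def rle_decode(runs):
    """Rebuild the original row from (pixel_value, run_length) pairs."""
    row = []
    for pixel, length in runs:
        row.extend([pixel] * length)
    return row


# A 40-pixel row that is mostly white, as on a typical business document.
row = [0] * 20 + [1] * 3 + [0] * 17
encoded = rle_encode(row)
print(encoded)  # [(0, 20), (1, 3), (0, 17)]
assert rle_decode(encoded) == row  # lossless: nothing was thrown away
```

Forty pixels collapse into three pairs, and decoding restores every pixel, which is what distinguishes a lossless method from a lossy one such as JPEG.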

Contrast - The range between the darkest and lightest parts of an image. Part of the trick of improving scan quality is increasing contrast without losing detail in the brightest and darkest parts of the image.

Cropping - Removing unwanted areas from an output image. Cropping is often used while outputting images of scanned microfilm to reduce borders or excess space between the edge of the image and the scanned document.

Cross-Platform Software - Software that enables you to share information between computers running different operating systems, such as Macintosh and Windows workstations.

Database - (1) An organized collection of information, such as electronic documents, stored on a computer. The database is structured to facilitate the search and retrieval of the information it contains. (2) A logical assembly of electronic images within the context of suitable document management software, allowing users to browse and search for documents or files within the database.

Database Field - A placeholder for a discrete piece of information in a database. For example, your last name would be typed into a field for that purpose. The grouped contents of several fields together form a record.

Database Publishing - enables you to publish a select group of documents from a large-scale document database to laptops and CD-ROMs, allowing you to create miniature, portable databases.

Database Query Screen - a computer generated form which allows you to search for information contained in the fields of a database. By entering information in pre-defined text fields, you instruct the computer to search the database for documents which contain that information. Some document management systems allow you to customize the query screens to accept information that is applicable to the database you wish to search.

Database Record - a collection of the contents of a related group of database fields.
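
The relationship between fields, records, and a query can be sketched in a few lines. The field names, sample values, and the query function below are hypothetical and stand in for whatever a real document management system defines.

```python
# Each dictionary is one record; each key is a field name (illustrative only).
records = [
    {"last_name": "Smith", "doc_type": "invoice", "date_archived": "2001-03-14"},
    {"last_name": "Jones", "doc_type": "contract", "date_archived": "2001-04-02"},
    {"last_name": "Smith", "doc_type": "contract", "date_archived": "2001-05-20"},
]


def query(records, **criteria):
    """Return the records whose fields exactly match every given criterion."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


# Comparable to filling in two text fields on a query screen.
matches = query(records, last_name="Smith", doc_type="contract")
print(matches)
```

Each keyword argument plays the role of one filled-in field on a query screen, and only records matching all of them are returned.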

Data Compression - Conversion of a digital image to a lower number of bits for storage.

Descreening - An option included in many scanner drivers that can help remove unpleasant dot patterns from newsprint or magazine images.

Deskew - Removal of any angle present in the document as it appears on the rollfilm in order to make its vertical edge parallel with the edge of the image. Deskew essentially straightens any "crooked" (skewed) images appearing on your rollfilm.

Despeckle - A filtering process which removes noise from images without blurring edges. Despeckling attempts to detect complex areas and leave those intact while smoothing areas where noise is more noticeable.
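
One very simple way to picture despeckling on a bitonal image is to clear any black pixel that has no black neighbors, leaving connected strokes untouched. This is an illustrative sketch only, not the filter used by any particular scanner or product; the function name and sample image are assumptions.

```python
def despeckle(image):
    """Clear black pixels (1) that have no black pixel among their 8 neighbors.

    `image` is a list of rows, each a list of 0 (white) or 1 (black).
    A new image is returned; the input is not modified.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_black_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_black_neighbor:
                result[y][x] = 0  # isolated speck: treat it as noise
    return result


page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],  # isolated speck
    [0, 0, 0, 1, 1],  # part of a stroke
    [0, 0, 0, 1, 1],
]
cleaned = despeckle(page)
```

The isolated pixel is removed while the 2 x 2 block, which stands in for part of a character stroke, keeps its edges.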

Digital - Use of binary code to record information. Information can be text in a binary code (e.g., ASCII), images in a bitmapped form, sound in a sampled digital form, or video.

Digital Documents - documents that are stored on a computer. The documents may have been created on a computer, as with word-processing files and spreadsheets, or they may have been converted into digital documents by means of document imaging. Digital documents are also referred to as electronic documents.

Digitization - Synonymous with scanning, it is the conversion from printed paper, film, or some other media, to an electronic form where the page is represented as either black and white dots, or color or grayscale pixels.

 

Dithering - A technique that is used to add more colors or shades of gray to an existing image, the goal being to improve the appearance of the image. Can be thought of as the inverse to quantization.
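
As a sketch of how dithering can suggest shades that the output cannot actually show, the example below uses ordered dithering with a 2 x 2 Bayer threshold matrix to render a grayscale image using only black and white pixels. Ordered dithering is just one of several dithering methods, and the matrix size, function name, and sample data are assumptions made for illustration.

```python
# 2x2 Bayer matrix: the order in which pixels in each tile switch to white.
BAYER_2X2 = [[0, 2],
             [3, 1]]


def ordered_dither(gray):
    """Convert a grayscale image (values 0-255) to pure black (0) and white (255).

    Each pixel is compared with a threshold that varies by position, so a
    flat gray area becomes a pattern whose average brightness approximates
    the original shade.
    """
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, value in enumerate(row):
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(255 if value > threshold else 0)
        out.append(out_row)
    return out


mid_gray = [[128, 128], [128, 128]]
print(ordered_dither(mid_gray))  # [[255, 0], [0, 255]]
```

A flat mid-gray tile becomes a checkerboard with half its pixels white, which the eye blends back into gray at normal viewing distance.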


Dots per inch (dpi) - Measure of output device resolution and quality. For example, the number of pixels per inch on a display device. Measures the number of dots horizontally and vertically.

 

DVD (Digital Video Disk) - An optical storage medium that can store up to 4.7 Gigabytes (single layer), 8.5 GB (double layer), 9.4 GB (double sided, single layer), or 17 GB (double sided, double layer). Transfer rates and seek times are similar to those of CD-ROM for currently available drives. The DVD spec included higher level specs for audio and video capabilities.

 

Fielded Searching - Searching in a text search database which has the records organized into fields. Common fields are title, author, keywords, abstract, and date. Fields give the searcher a means to focus the search.

 

Electronic Document - A document that has been scanned, or was originally created on a computer. Documents become more useful when stored electronically because they can be widely distributed instantly, and allow searching. HTML and PDF are well known electronic document formats.

 

Electronic Imaging - the process by which documents are digitized.

 

Enhancement - Technique for processing an image so that the result is visually clearer than the original image.

 

Filmer/Scanners - Microfilm cameras that can also operate as electronic document scanners. A filmer/scanner produces a microfilm image and a digital image of a page in a single exposure and is intended for applications where both microform and electronic copies are desired. These units offer a labor-saving alternative to microfilming and scanning source documents in separate operations with different equipment. Alternatively, a filmer/scanner can be used as a microfilm camera only or as a document scanner only for applications that do not require both types of images. A filmer/scanner may be rotary or planetary in design and operation. Rotary filmer/scanners, the most common configuration, produce 16mm microfilm, like their camera-only counterparts.

Filters - Tools that allow you to apply effects to your images. These range from conventional options such as Sharpen, which are intended to improve the quality of an image, to more artistic special effects such as Emboss or Mosaic.

Full Text Searching - One big advantage of electronic documentation is the ability to use the computer to search for words in a document, or search for a document containing the word (or words) you are looking for.
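
A common way to make word searches fast is to build an index from each word to the documents that contain it. The sketch below shows that idea in miniature; the document names, their text, and the function names are hypothetical, and real systems add features such as stemming and phrase matching.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, *words):
    """Return the names of documents that contain every given word."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()


# Hypothetical documents, e.g. text produced by OCR or created on a computer.
documents = {
    "memo.txt": "Invoice for scanner maintenance",
    "report.txt": "Annual scanner usage report",
    "letter.txt": "Thank you for the invoice",
}
index = build_index(documents)
print(sorted(search(index, "scanner")))             # ['memo.txt', 'report.txt']
print(sorted(search(index, "scanner", "invoice")))  # ['memo.txt']
```

Searching for one word finds every document containing it, and adding a second word narrows the result to documents containing both.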

G4 Compression - A compression technique used in Fax Group 4. It produces very good results for black and white, and is frequently used as an option in TIFF files for black and white images. It is also used in Adobe Acrobat (PDF) files.

Gamma Correction - (1) Simply put, the contrast selection used to adjust the recorded image. In microfilm, as with other photographic applications, the term "gamma" describes the relationship between the density of the film image and the film's exposure to light. Gamma correction allows us to adjust the scanned images' contrast for a variety of applications: to produce more natural-looking images, to "lift" or enhance text on dark film to make it more readable, to compensate for under- or over-exposure, etc. (2) The contrast around the midtone of an image. Adjust the gamma and you can alter the brightness of an image without drastically affecting shadows or highlights.
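
The effect described in (2) can be sketched with the power-law formula often used in image software, output = 255 * (input / 255) ^ (1 / gamma). This is one common convention, assumed here for illustration; individual scanners and programs may define their gamma control differently. The function name is hypothetical.

```python
def gamma_correct(value, gamma):
    """Apply gamma correction to one 8-bit pixel value (0-255).

    With this convention, gamma > 1 brightens midtones and gamma < 1
    darkens them, while pure black (0) and pure white (255) stay fixed.
    """
    normalized = value / 255
    return round(255 * normalized ** (1 / gamma))


for pixel in (0, 64, 128, 192, 255):
    print(pixel, gamma_correct(pixel, 2.2), gamma_correct(pixel, 0.5))
```

Because 0 and 255 map to themselves for any gamma, the darkest shadows and brightest highlights are preserved while the midtones shift.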

GIF - An image file format that is commonly used on the web. It uses LZW compression, which makes it good for color and grayscale images, but it does not compress as well as G4 for black and white. LZW is "lossless" which means it will not compress as well as JPEG, but will retain all of the image's quality.

Grayscale - (1) An image consisting of shades of gray, with no color. (2) An image type that uses black, white, and a range of shades of gray. The number of shades of gray depends on the number of bits per pixel. The larger the number of shades of gray, the better the image will look, and the larger the file will be. (3) A range of gray tones from black to white used to create an image.

Halftone - A printing technique which uses patterns of dots to create the illusion of a continuous tone image.

Highlights - The lightest parts of an image.

Hybrids - See filmer/scanner and scanner/filmer.

Hyperlinks - allow you to 'link' any document stored in a database with any other document. You can link a spreadsheet to an image, a database to a graphic, or a word processing file to a site on the World Wide Web. You can then navigate from one related document to another, simply by clicking on the hyperlinks.

ICR (Intelligent Character Recognition) - (1) A technology that employs either software only or software and hardware to automatically recognize and translate raster images into structured data. ICR is an advanced form of OCR technology that may include capabilities such as learning fonts during processing or using context to strengthen probabilities of correct recognition. (2) The process of recognizing handwritten characters. Similar to OCR, but more difficult, since OCR works from printed text.

Image - A digital representation of a document.

 

Image File Format - When a page is scanned, the page can be stored in a number of file types. The type should be chosen based on the desired use of the image, and the software that will be used. Different file formats commonly use different methods of compression as well, and some types of images compress better using some formats rather than others.

 

Image Support - Hardware (scanner, workstation, printer) and software support for image as a system-recognized information type. Typically, although not necessarily, support for optical storage devices is included.

Index - refers to the information contained in an electronic document that enables you to retrieve it from a database. The index can include physical location information (e.g., where the document is stored) and document identification information (e.g., date archived, creator, and contents).

Inspection - A process of verifying the organization and quality of images on a disk or in an electronic records/imaging system.

Intelligent Scanner - Scanner with additional image processing capabilities, such as OCR, bar code reading, etc.

JPEG - (1) An image file format that is best suited for photographs. It is "lossy," which means that it will throw away some detail in order to achieve better compression. It does not work well for text. (2) JPEG (Joint Photographic Experts Group) is a standard image compression mechanism. JPEG compression is "lossy," meaning that the compression scheme sacrifices some image quality in exchange for a reduction in the file's size.

Jukebox - Automated device for housing multiple optical disks and one or more read/write drives.

 

Keyword Searching - The ability to search for any single word within certain or all fields of a computer database, thereby providing more precise retrieval of required data.

 

Landscape Mode - A manner of positioning the image so that it is wider than it is tall, e.g., 11 x 8 1/2 or 14 x 8 1/2 inches.

 
