SCANNING GLOSSARY

M-Z

 

 

Magnetic Disk - Digital media that uses magnetic particles to store data. Both hard disks and floppy disks are magnetic disks.

Metadata - Data about data. (2) An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items. Metadata (sometimes written 'meta data') is used to facilitate the understanding, use and management of data. The metadata required for effective data management varies with the type of data and context of use. In a library, where the data is the content of the titles stocked, metadata about a title would typically include a description of the content, the author, the publication date and the physical location. In the context of a camera, where the data is the photographic image, metadata would typically include the date the photograph was taken and details of the camera settings. In the context of an information system, where the data is the content of the computer files, metadata about an individual data item would typically include the name of the field and its length. Metadata about a collection of data items, a computer file, might typically include the name of the file, the type of file and the name of the data administrator. (3) Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities. (4) Metadata is a set of optional structured descriptions that are publicly available to explicitly assist in locating objects.

 

Microfilm/Microfiche Scanner - a type of scanner that converts microfilm or microfiche documents into electronic documents.

Mode – a manner of positioning images

Network - refers to two or more computers that have been linked together to enable them to communicate with each other, exchange information, and share resources.

OCR (Optical Character Recognition) - The ability of a computer to recognize written characters through some optical-sensing device and pattern recognition software. Characters are read and then converted into computer-processable codes (e.g., ASCII, EBCDIC). (2) OCR is a type of software designed to extract text from images (for example, digitized images of your roll film) and output it to a file such as a PDF or text file. Note that while OCR works very well with typewritten text, it is limited in its ability to recognize handwritten (cursive) script, particularly on older documents of dubious quality. OCR is the recognition of printed or written text characters by a computer. This involves scanning of the text, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII. (3) Refers to the process by which scanned images are electronically "read" to convert them into editable text. This conversion is performed after scanning, and may output formatted text or text-only files (flat ASCII files). Text generated by OCR is often input into text search databases, allowing retrieval of the original scanned image based on its content. (4) The means of producing a text document from an electronic image. The text document can then be edited in a suitable text editor (e.g. Microsoft Word). In a suitable database, the text document will sit behind the corresponding image to give free text as well as key field searching.

Optical Disk - A disk read or written by light, generally laser light; such a disk may store video, audio, or digital data. (2) Discs that use tiny optically reflective particles to store data. A laser is used to read the reflective bits, and write data. Unlike CD-ROM, which is read-only, most optical disc systems are writable.

Optical Memory - Memory in which data are recorded and/or read by optical means.

Patch Card - is a document that contains scanner and indexing instructions in the form of a bar code. Patch Cards can be inserted at specific points in a 'scan batch' where you desire new scanner or indexing settings to begin or end. Patch cards can instruct document imaging software to store a document in a specific database, assign the document an incremental sequence number, assign a job name, or record the scan date of a document. Patch cards are also capable of adjusting scanner settings and performing image enhancement operations such as 'deskew,' 'rotate,' and 'despeckle'.

PDF (Portable Document Format) – A format developed by Adobe Systems that lets you capture and view robust information, from any application, on any computer system, and share it with anyone around the world. Adobe PDF files look exactly like original documents and preserve source file information (text, drawings, 3D, full-color graphics, photos, and even business logic) regardless of the application used to create them. Full-text search features can be used to locate words, bookmarks, and data fields in documents. PDF originated as a proprietary Adobe format and was later published as an open ISO standard (ISO 32000).

Pixel - Smallest element of a display surface that independently can be assigned color or intensity. The smallest mark or dot on a screen. Short for "picture element". The smallest segment of a document that can be digitized.

Portable - To be functional across differing types of computers and operating systems. This can be used to describe programs or electronic documents.

Portrait Mode – a manner of positioning an image so that it is taller than it is wide, e.g., 8 1/2 x 11

Preview - A low quality, but fast scan, useful to get a quick idea of how a scanned image is going to look. With TWAIN software and many scanners, this is part of the set-up process to allow you to select a specific portion of a document.

Proofing - In OCR, the results are never perfect. The same is true for conversion to PDF. Proofing (in this context) is a service by which the resulting OCR text or PDF file is repaired for errors induced by the OCR process.

RAID (Redundant Array of Inexpensive Disks) - a storage technique that enables you to obtain increased storage reliability and performance by writing data to a connected series of disks referred to as a logical volume. Data reliability is achieved with error correction techniques or data duplication. Disk performance is achieved by parallel data transfers to a set of disks, a technique known as 'data striping.'
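
The 'data striping' idea in the RAID entry can be sketched in a few lines. This is a simplified illustration only: the function name `stripe` and the block size are assumptions made for the example, and real RAID levels add parity or mirroring that is not shown here.

```python
def stripe(data, disk_count, block_size):
    """Split data into fixed-size blocks and deal them round-robin to disks.

    Hypothetical helper for illustration; no parity or mirroring is modeled.
    """
    disks = [[] for _ in range(disk_count)]
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Block 0 goes to disk 0, block 1 to disk 1, and so on, wrapping around.
        disks[(offset // block_size) % disk_count].append(block)
    return disks


layout = stripe(b"ABCDEFGH", disk_count=2, block_size=2)
print(layout)  # disk 0 holds b"AB" and b"EF"; disk 1 holds b"CD" and b"GH"
```

Because consecutive blocks land on different disks, they can in principle be transferred in parallel, which is where the performance gain comes from.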

Raster - Description of a rectangular or square array formed by a number of horizontal scan lines, each comprising a number of picture elements. The number of scan lines establishes the vertical dimension of the array, and the number of picture elements in each scan line establishes the horizontal dimension of the array.

Resize - Your digitized images can be resized according to your parameters (e.g., variable width with maximum height of 350 pixels, resize all to 900 x 650 pixels, etc.). Often, multiple images may be output: a smaller thumbnail image at lower resolution, and a high-quality, full-size archival image.
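
As a rough illustration of the "variable width with maximum height of 350 pixels" case in the Resize entry, the sketch below computes proportional output dimensions. The function name `fit_to_max_height` is hypothetical and does not belong to any particular imaging package.

```python
def fit_to_max_height(width, height, max_height=350):
    """Return (new_width, new_height) so that height does not exceed max_height.

    The aspect ratio is preserved; images that are already short enough
    are returned unchanged.
    """
    if height <= max_height:
        return width, height
    scale = max_height / height
    # Round to whole pixels, keeping at least one pixel of width.
    return max(1, round(width * scale)), max_height


print(fit_to_max_height(1700, 2200))  # (270, 350)
```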

Resolution - The ability of a scanning or image generation device to reproduce the details of an image. (2) The measure of capability to delineate picture detail. (3) The number of dots per inch (dpi) that were stored during scanning. The greater the number, the greater the amount of detail that is visible. It is recommended that you use between 72 and 100 dpi for images that will be displayed on the screen, and 300 dpi for images that will print on common inexpensive printers. Higher resolution images take up more space as well. (4) Refers to the 'image-sharpness' of a document, usually measured in dots (or pixels) per inch (dpi). Documents can be scanned at various resolutions depending on your particular needs. The higher the resolution of a document, the greater the image-sharpness, and the larger the file size will be. Resolution also refers to the image-sharpness that printers and monitors are capable of reproducing. (5) Typically expressed as dots per inch (dpi), resolution is a measure of the quality of an electronic image: a higher dpi indicates a better quality of image, but also a larger file size. In general, the chosen resolution is a trade-off between acceptable quality and available drive space. (6) The number of dots per inch (dpi) at which an image is scanned. Images scanned at higher resolutions contain more image information (detail) but also take up more storage space. 300 dpi is the minimum scanning resolution required if you intend to use OCR to extract text from your images. If you're opting for web presentation, 200 dpi is often a good resolution to use. For example, an 8.5" x 11" document scanned at 200 dpi in 256-level grayscale and output as a JPEG with slight compression is approximately 1.5 MB in size.
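
To make the link between resolution and file size concrete, the sketch below estimates the uncompressed size of a scanned page. It assumes 8 bits per pixel for 256 gray levels and ignores file headers and row padding; the function name `uncompressed_size_bytes` is made up for this example. The 1.5 MB figure quoted in the entry is a compressed JPEG size, so it is smaller than the raw estimate.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate raw image size: pixels across x pixels down x bits per pixel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# 8.5" x 11" page at 200 dpi, 8-bit grayscale: 1700 x 2200 pixels.
print(uncompressed_size_bytes(8.5, 11, 200, 8))  # 3740000 bytes, about 3.7 MB

# The same page at 300 dpi more than doubles the raw size.
print(uncompressed_size_bytes(8.5, 11, 300, 8))  # 8415000 bytes, about 8.4 MB
```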

Rotation - A process which rotates the output image 90, 180, or 270 degrees from its orientation on the original microfilm. This option is typically used to ensure that all images may be easily read, even if they may have been filmed sideways or upside-down.
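
A 90-degree rotation of a raster image can be sketched as below, treating the image as a list of rows of pixel values. This is only an illustration of the idea; the function name `rotate_90_clockwise` is hypothetical, and production scanning software would normally rely on an imaging library instead.

```python
def rotate_90_clockwise(raster):
    """Rotate a raster (list of rows) 90 degrees clockwise.

    Reversing the row order and then transposing turns the bottom row
    into the first column of the result.
    """
    return [list(row) for row in zip(*raster[::-1])]


page = [[1, 2, 3],
        [4, 5, 6]]
print(rotate_90_clockwise(page))  # [[4, 1], [5, 2], [6, 3]]
```

Applying the function twice gives a 180-degree rotation, and three times gives 270 degrees.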

Scaleable - refers to the ability to enlarge or reduce the size of an image. A document management system is said to be 'scaleable' if its capabilities can be increased to support additional users or platforms.

Scan Batch - a collection of documents that are fed into a scanner for the purpose of being converted into digital or electronic documents.

Scan Size - The length and width dimensions of the part of a document that can be digitized.

 

Scan Time - The total time to convert text or graphical information into electronic raster form.

 

Scanner - A device that electro-optically converts a document into binary (digital) code by detecting and measuring the intensity of light reflected or transmitted.

 

Scanner, combination – for maximum flexibility, some scanners support both sheet fed and flatbed input methods. An operator can remove or lift the scanner's sheet feed mechanism to reveal a flat glass surface on which bound volumes or fragile documents can be placed for scanning.

 

Scanner, sheet fed – Also known as pass-through or pull-through scanners. Pages are inserted into a narrow opening to be scanned. They are transported individually across a scanning mechanism that includes optical and photosensitive components. Depending on design, the scanned pages are ejected at the back or bottom of the machine. Sheet fed scanners have dominated imaging production since the industry's inception and are recommended for most records management applications. These scanners can accommodate up to 11x17 documents and some large-format documents such as engineering drawings. Sheet fed scanners are faster and highly productive. Most sheet fed scanners are equipped with automatic page feeders as a standard feature.

 

Scanner/filmers - Most scanner/filmers are high-speed, heavy-duty devices intended for large-scale imaging projects, centralized document conversion departments or commercial service bureaus. The most prevalent use of scanner/filmers is in the proof departments of banks and in government agencies, where the speed of retrieval of digital images meets their constituents' needs and the microfilm meets the long-term requirements of their State Archives.

Scanner, flatbed – a device with a flat exposure surface on which pages are individually positioned for scanning. Most models feature a glass exposure surface on which pages are placed face down, with the scanning components located beneath the glass surface, like an office copier. Flatbed scanners in a planetary camera design, with pages positioned face up on the exposure board and the optical and photosensitive components positioned at the top of a vertical column, are much less common.

Scanner Interface Board - a piece of hardware that enables software programs to communicate with various models of scanners.

Scanner Threshold - Setting that determines whether a pixel is white or black.  

 

Scanning/Document Scanning - Scanning/Document Scanning produces electronic images from hard copy documentation. The most commonly used image formats for monochrome images are TIFF and PDF.

Scanning Service Bureau – a business that performs one or more scanning services to customer specification using the customer's own documents, computer data or other source material.

Shadows - The darkest parts of an image.

Sharpening - A filtering process which increases the apparent sharpness of microfilmed images. Proper use of a sharpen filter can increase the readability of frames, particularly those which were originally microfilmed slightly out-of-focus. However, care must be used when using sharpen filters as too much of the effect can produce unwanted, exaggerated edges.

Skew - During printing or scanning, the contents of a page are almost never exactly vertical; this misalignment is referred to as skew. De-skewing is a process where the computer detects and corrects the skew in an image file.

 

Source documents – the paper documents to be converted to digital images

SQL (Structured Query Language) - a database access language that originated on mainframes and minicomputers, and which is now popular on PCs.

Text Retrieval Software - enables you to retrieve electronic documents from databases by entering 'key' words in a text search field. Documents containing the text you entered are retrieved from the database, and presented to you in a list ranked by relevancy.
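
The keyword search and relevancy ranking described in the Text Retrieval Software entry can be illustrated with a toy example. The sketch assumes relevancy is simply the number of times the search words occur in a document; commercial products use more sophisticated scoring, and the function name `search` and the sample file names are invented for the example.

```python
def search(documents, query):
    """Return names of documents containing any query word, best match first.

    documents maps a document name to its text. Relevancy here is just a
    count of matching words, which is an assumption made for illustration.
    """
    terms = query.lower().split()
    scored = []
    for name, text in documents.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


library = {
    "a.txt": "claim form for an insurance claim",
    "b.txt": "monthly invoice",
    "c.txt": "claim received",
}
print(search(library, "claim"))  # ['a.txt', 'c.txt']
```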

Thresholding - When converting a pixel from grayscale to black and white, the threshold is the gray value above which a pixel is considered white, and at or below which it is considered black.
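
The thresholding rule can be written out directly. The sketch below assumes 8-bit gray values (0 is black, 255 is white) and a default threshold of 128; both the function name `apply_threshold` and the default level are illustrative choices, not settings taken from any particular scanner.

```python
BLACK, WHITE = 0, 255


def apply_threshold(gray_rows, level=128):
    """Convert grayscale rows to black and white.

    Values above the level become white; values at or below it become black.
    """
    return [[WHITE if value > level else BLACK for value in row]
            for row in gray_rows]


print(apply_threshold([[0, 128, 129, 255]]))  # [[0, 0, 255, 255]]
```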

 

TIFF (Tagged Image File Format) - A de facto standard format for image files. (2) An industry standard image file format. It is unique in that it incorporates multiple compression techniques, allowing the user to specify the best format for a type of image, and in that one file can contain multiple images.

TWAIN - the standard means by which a PC can send commands to, and retrieve data from, an external device, most commonly a scanner or digital camera. Most advanced graphics programs support TWAIN, and so will work with your scanner immediately. (2) A scanning interface standard developed to address the need for consistent, easy integration of scanners with document imaging programs. Software programs that are written to support the TWAIN standard are capable of controlling any TWAIN compliant scanner.

Vector Graphic - A graphic file whereby an image is represented by continuous functions, as opposed to an image file which is represented by dots (pixels). Vector graphics require less space and can generate much higher quality output, but almost always come from an electronically produced original (such as one from a CAD or drawing program).

Workflow Software - allows businesses to move electronic documents along a user-defined 'routing' path, from one workstation to the next, around a local or wide-area network. Once the document arrives at any given workstation, the receiver can add notations to, or modify, the document as they see fit. An insurance company might use workflow software to route claim forms through their organization. A user at one step might wish to review the forms and add a new document to the electronic 'package' before sending it to the next workstation. The next user might wish to add several notations to the forms before sending it on to the final workstation for approval. The route can be as simple or as complex as a business process requires.

WORM (Write Once, Read Many) - A digital, optical storage medium on which information can be recorded once and read many times. Provides for extremely compact storage of data at relatively low prices compared to traditional magnetic storage.

 

XML - eXtensible Markup Language. Used to apply structure to electronic documents. It is more narrowly focused than SGML (the underlying language), thereby making it easier to define standard document types.

 

Zooming - To make an image appear larger (zoom in) or smaller (zoom out) by re-displaying the image at different resolutions. Higher resolutions will make the image appear larger and easier to read.

 

 

 
