Please note that these materials are provided for historical purposes only. The information presented is out of date and may be neither accurate nor useful. External hyperlinks may no longer be valid.
CHAPTER TWO
Functions and Technology
Two questions immediately arise when we think about imaging. What functions does it perform? How does it perform those functions?
This chapter will answer those questions in terms the non-technical person can understand. (The site reports in Chapter Seven and Appendix A contain more detailed technical descriptions of imaging equipment, software, and communications.)
At its most general level, imaging consists of the following functions:
- Capture. Receive a document (usually something printed on paper) from a source external to the imaging system and enter it into the imaging system;
- Index. Cross-reference the document so that, when stored in the system, it is associated with documents in the same group (e.g., same court case, party, and date filed) and can be found and retrieved from storage;
- Process. Perform the functions explicitly requested in the document (e.g., prepare a bench warrant according to a court order) or implicitly required because of the type of document, its source, when it was received, or some other factor (e.g., route court receipts to workstations that handle funds allocation);
- Store and Retrieve. Store a picture of the document in the system's storage, interpret inquiries to retrieve the document from storage, work with the index to find and retrieve it, route it to the proper workstation(s) or parts of the system for dissemination, and return it to the system's storage when the user has completed work on it; and
- Disseminate. Send the document to imaging system users and people outside the system who need it, either by printing it on paper or sending it through communications networks.
At a similar level of generality, the equipment and software that support imaging consist of the following:
- Scanners. Scan documents and other objects (e.g., photographs) into the computer;
- Computer. Contains imaging software and processes images according to instructions in the software;
- Software. Either automatically or with operator intervention, accomplishes the imaging functions noted above through coordinated use of operator information, computer and imaging equipment, other software, images and other data in the computer, and communications;
- Workstations. Permit imaging system operators (e.g., the scanning unit) to capture, edit, and index documents; to work on document images; and to display images with other information--including information to and from other computers and systems (e.g., case processing) that work with the imaging system;
- Storage. Stores images of the scanned documents--usually on optical disk but sometimes on magnetic disk;
- Printers. Print documents from imaging and other systems;
- Viewers. Permit imaging system users (e.g., judges, law clerks, other attorneys, and clerical staff) to view documents as they formerly would have referenced pages in printed books and other documents (e.g., court land records); and
- Communications. Permit imaging computer(s) to communicate (1) with imaging equipment so that operators can process documents and (2) with other computers through, for example, networks so that users and people outside the imaging system can have access to imaged documents (see Figure 4).
The remainder of this chapter describes imaging in terms of its functions (capture, index, process, store and retrieve, and disseminate) and the technology (equipment and software) typically required to perform those functions.
If your primary interests are whether imaging is right for you, how to acquire and implement it, and what the future holds for imaging, please skip to Chapter Three, Is Imaging Right For You? Otherwise, continue reading this technical description.
The description does not distinguish between the size and scope of imaging systems, which can range from stand-alone systems in small courts used for record searches to huge systems in government and industry (such as IRS, NASA, and insurance companies) used to enter, process, and disseminate massive volumes of documents.
Because computers and software are intrinsic to each function described below, they are discussed separately only where they have unique features. Computers provide the processing, and software makes the different equipment work together to accomplish the functions. For example, database management software helps the user create the index for an imaged document, and it works with the disk controller to store the document in the proper location and retrieve the document when requested by a user.
The chapter concludes with a summary of other technologies that relate to or could be used instead of imaging.
II.A.1. Functions
Documents are entered into an imaging system by scanner, facsimile transmission (fax), or imported image files; converted to images from data or text when initially scanned; and stored on disk storage devices.
If the person with a document to be entered is in the same location as the imaging system, he or she can scan the document directly into the system. This is called local document entry, or local scanning.
If the person and the system are in different locations, he or she sends the document to the system location. There are several ways to accomplish this. First, the document can be sent in paper form and scanned locally at the system site. Second, some imaging systems accept faxes, and since documents must be scanned into a fax machine to be transmitted, the document enters the system directly when it is faxed. Third, the person may have a scanner at his or her location connected to the imaging system. In this case, the document can be scanned directly into the system from the person's location, which is called remote document entry, or remote scanning.
Documents can be transferred into imaging systems from other systems. These imported documents may be in imaged, data, or text form. Even though the documents are in the same system, truly integrated documents--that permit images, text, and data to be merged into one compound document--are uncommon. Current document imports usually are limited to, first, importing documents into a system that can store and retrieve imaging, office, and data processing files and, then, displaying images and outputs from the other two types of files together on a split screen.
Figure 4: LAN-Based Document Imaging System
II.A.2. Equipment and Software
Scanners, including fax machines that actually are scanners combined with a communications modem and printer, are the principal means of capturing documents in imaging systems. Some local and remote scanners can accept batches of documents, with no operator attention required while a batch is being scanned, and some can scan only one page at a time. Scanners can be stand-alone, combined with PCs, combined with fax machines, or combined with other equipment.
During scanning, each discrete element (e.g., letter, number, punctuation mark, and graphic mark on the paper) of the document is digitized. This means it is bit-mapped or transformed into a pattern of bits (see Figure 5), each of which is either on or off (i.e., either one or zero). Scanning can be binary, halftone, grayscale, or color. A higher density of bits permits shades of "gray," as would be needed for graphics, instead of the bitonal (i.e., black or white) that represents characters. A technique called dithering permits adjacent black and white dots to be combined to simulate shades of gray so that, for example, photographs can be imaged.
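The difference between bitonal scanning and dithering can be illustrated with a minimal Python sketch. It assumes a small grayscale patch held as a list of rows (0 = black, 255 = white) and uses a 2x2 ordered-dither matrix; the function names and the cutoff value are illustrative assumptions, not the method of any particular scanner.

```python
# A minimal sketch of bit-mapping, assuming grayscale input where
# 0 = black and 255 = white. Output bits: 1 = black (on), 0 = white (off).

# 2x2 ordered-dither (Bayer) matrix; each cell shifts the cutoff for one
# position in a repeating 2x2 tile.
BAYER_2X2 = [[0, 2],
             [3, 1]]


def threshold(gray, cutoff=128):
    """Bitonal scan: every pixel darker than one fixed cutoff becomes black."""
    return [[1 if px < cutoff else 0 for px in row] for row in gray]


def ordered_dither(gray):
    """Dithered scan: the cutoff varies across each 2x2 tile, so a mid-gray
    area comes out as a mix of adjacent black and white dots."""
    out = []
    for y, row in enumerate(gray):
        out_row = []
        for x, px in enumerate(row):
            cutoff = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            out_row.append(1 if px < cutoff else 0)
        out.append(out_row)
    return out


# Hypothetical 4x4 patch of uniform mid-gray.
FLAT_GRAY = [[128] * 4 for _ in range(4)]
```

With a single cutoff, the mid-gray patch becomes entirely white; with dithering, half of its dots are black, which the eye reads as gray at normal viewing distance.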
Throughout image processing, workstations with monitors capable of displaying graphics (and therefore imaged documents) and data and text characters on a split screen are needed. These high-resolution monitors normally range from 17 to 21 inches. They must be accompanied by a keyboard and preferably a pointing device such as a mouse. PCs should be considered for workstations because of their low cost, flexibility, and increasing ability to display graphics such as document images.
Usually, imaged documents are stored permanently on optical disk. Magnetic disk provides a staging storage area for documents on their way to or from processing. Disk storage will be covered more fully later in this chapter.
Figure 5: Illustration of Bit-Mapping
II.B.1. Functions
Like their manual counterparts, documents that have been imaged and stored by a computer must be retrieved from storage so they can be used. Since documents for a given case are stored wherever space is available (and not necessarily together), indexes, which contain information needed to reference and find documents in computer storage, commonly are used to accomplish file retrievals. Most imaging systems, therefore, have an indexing capability.
Imaging systems frequently operate with other types of systems--particularly in courts. For example, if imaging provides input, output, and storage for court documents, the imaging and court case processing systems should function together, just as the case processing system and manually filed documents supplemented each other. Sometimes the imaging system is subordinate to the case processing system; sometimes they simply provide data to each other.
Since the bedrock of imaging, data processing, and office systems is their ability to retrieve and use previously stored documents and information, they depend on indexes. Often these three types of systems use the same index, which makes the index one of their main intersection points.
The index contains data and formatted text that uniquely identify each document, and it associates these descriptors with the location of the document in storage. It is a data processing--as opposed to an imaging--function, which may be part of the imaging system or in a separate system. The image, which is a bit-mapped picture of a document, is in the imaging system, and the index, which locates the image, is in a data processing system. This means that when a document is scanned into the imaging system, the index must be entered on the data processing side.
Indexes have different forms:
- Some indexing software lets users define the number, titles, and types (e.g., data or text) of document descriptors. A few products let the user design an index that can be tailored exactly to the organization's needs.
- Other indexing is fixed in that it permits only specific, predefined document descriptors. For example, such an index may associate storage location with case number and case style (e.g., Jones v. Smith).
- Many systems provide index entries, called keywords, from which the user can choose the words or phrases that best describe a document. The choice of keywords may or may not be restricted to a predetermined list.
Index entries and keywords are only as useful as the search capability of the software and the quality of the entries themselves permit. Does the user have to remember the exact wording of the keyword or are prompts available? Are searches case-sensitive? Are Boolean searches with logical "AND" and "OR" statements permitted? Are nested searches permitted so that the user can conduct progressively more precise searches to home in on the proper document? Have the index entries and keywords been chosen appropriately and entered accurately?
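A minimal Python sketch can make the idea of an index and its search capabilities concrete. It assumes a fixed set of descriptors (case number, case style, keywords) tied to a storage location; the field names, sample entries, and storage addresses are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    case_number: str
    case_style: str
    keywords: frozenset
    location: str  # hypothetical address of the image in storage


def matches_all(entry, *terms):
    """Boolean AND: every term must be among the entry's keywords.
    Comparison is case-insensitive."""
    keywords = {k.lower() for k in entry.keywords}
    return all(t.lower() in keywords for t in terms)


def matches_any(entry, *terms):
    """Boolean OR: at least one term must be among the entry's keywords."""
    keywords = {k.lower() for k in entry.keywords}
    return any(t.lower() in keywords for t in terms)


def search(entries, predicate):
    """Return the entries that satisfy the predicate. Passing an earlier
    result back in gives a nested (progressively narrower) search."""
    return [e for e in entries if predicate(e)]


# Hypothetical sample index.
ENTRIES = [
    IndexEntry("95-001", "Jones v. Smith", frozenset({"Deposition", "Witness"}), "disk 3, sector 120"),
    IndexEntry("95-001", "Jones v. Smith", frozenset({"Motion"}), "disk 1, sector 48"),
    IndexEntry("95-002", "Doe v. Roe", frozenset({"Order", "Bench Warrant"}), "disk 2, sector 7"),
]
```

For example, `search(ENTRIES, lambda e: e.case_number == "95-001")` returns both Jones v. Smith documents, and searching that result again with `matches_all(e, "motion")` narrows it to one.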
II.B.2. Equipment and Software
Most data and text are entered into indexes by a keyboard. Generally, users work with a split screen monitor that shows the image on one part and the data processing system information on another part. Index information, and perhaps other information pertaining to the transaction, is entered into the data processing system using the latter area. For example, in a court case processing system, if a deposition is being entered, the image of the scanned deposition would appear on part of the screen and the case processing system data entry screen would appear on the other part. The data entry screen would be used to enter data and text pertaining to the deposition, from which an index entry would be created. Indexing and index processing may occur on either the computer that runs the imaging system or the computer that runs the data processing or office system.
Technology is being perfected to extract data and text for indexes from scanned images to avoid separately entering images and index data. This would result in a one-step scanning and indexing process. The method used to accomplish this is optical character recognition (OCR), which reads documents or parts of documents and converts the images into data or text.
OCR combines scanning with image analysis to convert scanned bit-mapped patterns into ASCII code, which is used to represent data and text in most data and word processing systems. OCR can be used with imaging software to extract data and text for indexes and for other purposes in data and word processing systems. For indexes, this is accomplished in two ways:
- Zones in an imaged document are converted into data or text for index fields so that the document does not have to be indexed by separate data entry. This can save time if an organization uses forms in which zones for fields can be standardized.
- All or part of an imaged document is converted into text so that keywords can be extracted and placed in the index.
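The zone approach can be sketched in a few lines of Python. The sketch assumes a page held as rows of bits and zones given as (top, left, bottom, right) rectangles; the character recognition step itself is not implemented here and is represented by a caller-supplied `recognize` function, which is a placeholder rather than a real OCR library call.

```python
def crop(page, zone):
    """Return the part of the bitmap inside zone = (top, left, bottom, right).
    Bottom and right are exclusive."""
    top, left, bottom, right = zone
    return [row[left:right] for row in page[top:bottom]]


def index_fields_from_zones(page, zones, recognize):
    """Build index fields by running a recognizer over each predefined zone.

    `zones` maps an index field name to its rectangle on the form.
    `recognize` is a hypothetical stand-in for an OCR engine: it takes a
    cropped bitmap and returns the text it reads there.
    """
    return {field: recognize(crop(page, zone)) for field, zone in zones.items()}


# Hypothetical 4x6 page and zone layout for a standardized form.
PAGE = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 0],
]
ZONES = {"case_number": (0, 0, 2, 2), "date_filed": (2, 4, 4, 6)}
```

Because the zones are fixed per form, the same layout can be reused for every document of that type, which is where the time savings described above would come from.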
The accuracy, and hence the usefulness, of OCR depends on its ability to recognize whatever lettering is used in the scanned documents. Clearly typed or printed characters with excellent contrast to the background are likely to be read correctly. While OCR technology is improving, its use remains limited except for applications in which scanning is restricted to specific, well-defined zones.
Bar code reading is a less versatile but more accurate cousin of OCR in which a bar code reader scans and interprets patterns of closely spaced vertical lines. Grocery stores and other retailers often use bar coding, and courts sometimes pre-number citations with bar codes. In these court situations, when the bar code serves the dual role of case number and document identifier in the index, bar code scanning usually obviates key entry of the index.
II.C.1. Functions
When the document is in the imaging system, it can be processed in the following ways:
- Quality Assurance. In most systems, the document undergoes a quality control check in which the user examines the scanned image, against the actual document if possible, and identifies any parts that were scanned improperly. Some imaging systems help the user identify problems through capabilities such as image enhancement, in which areas of the image are enlarged and clarified by different contrasts and degrees of gray (much like you would adjust your television).
- Compression and Decompression. Since images in imaging systems require more computer space (i.e., more bits) than data or text representations in data processing or office systems, most imaging systems compress the image before it is stored or sent over a network. In compression, the system removes as much of the image that carries no useful information as it can. For example, many compression techniques eliminate margins, which they can recognize by a succession of "white" bits, in creating compressed images. This reduces the number of bits needed to contain the image, with a commensurate reduction in storage space and communications line usage. Before the image can be viewed, it must undergo a reverse, or decompression, process.
- Database. Many imaging systems are used with a database of information about or extracted from the imaged documents. These databases range from simple indexes to sophisticated keyword or topical document references. As described above, indexes locate documents in storage and may provide limited summary information. The more sophisticated databases reference documents by descriptors, such as keywords or topics, and may contain document abstracts. In addition to locating documents, such databases may provide information about documents or groups of documents without having to look at the actual documents. This type of database sometimes provides the capability to identify and store information at the summary or "folder" level rather than the individual document level. For example, information from documents about judicial applicants could be organized into a database "folder" containing a summary for each applicant to avoid looking up multiple documents. The system could then statistically analyze the information about applicants and automatically produce a summary sheet for each applicant.
- Document Management. Imaging systems permit vastly improved capabilities for document management--filing, distributing, displaying, and keeping track of documents. As anyone who has worked in a court clerk's office knows, strong document management is a necessity. Imaging systems allow the user to (1) find documents according to which electronic "folder, drawer, and cabinet" the file is in; (2) identify who has electronically "checked out" the file; (3) find documents by keywords, dates, and other index fields; and (4) display documents in imaged form instead of hard copy. Through strong document management, users know more about their documents and can use this knowledge to complete daily tasks more effectively.
- Workflow. The workflow capability in imaging systems routes images of documents to workstations for processing. Automated workflow could be regarded as a progression of electronic in-baskets along specific routes. (But that analogy does not imply that electronic in-baskets contain actual copies of documents; electronic in-baskets contain notifications that there is work to do and identifiers of assigned documents.) In establishing these routes and defining the work for the workflow, system designers can standardize the repetitive tasks requiring document routing. A workflow consists of the following four items: the workstations to which a given document is to be routed, the functions to be performed at each workstation, the information to be routed with the document, and the methodology to evaluate performance at each workstation. For example, in a court clerk's office when a case is ready for trial, suppose one location determines that the case is ready, another location schedules the hearing, and a third location prepares and mails notices. Each location, in turn, needs the case file. The workflow capability automatically sends, or sends with supervisor intervention, an electronic message to a staff member when a task needs to be completed. The message can include instructions, a copy of the documents necessary to complete the task, a deadline, and the identity of the next workstation in the processing sequence. Workflow helps an organization perform these tasks more effectively and provides the opportunity to review and improve the organization's work processes. Workflow procedures can range from simple rules, with fixed routing based on document type, document submitter, or other criteria, to complex formulas that enable the computer to:
  - Dynamically route or reroute work based on a document's situation (e.g., unusual contents or length, elevated priority), availability of qualified staff to work on the document, or other conditions;
  - Automatically detect and rectify or alert user to delays in the work;
  - Raise the status of a given task to expedite it;
  - Save data on task performance; and
  - Expand tasks into subtasks to handle unusual situations.
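The simplest form of workflow described above, fixed routing based on document type, can be sketched in Python as a chain of electronic in-baskets. The sketch assumes that each in-basket holds notifications carrying a document identifier rather than the document itself; the route name, workstation names, and class names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Notification:
    document_id: str  # identifier of the assigned document, not a copy of it
    route: list       # ordered workstations for this document type
    step: int         # position of the current workstation in the route


class Workflow:
    def __init__(self, routes):
        self.routes = routes
        stations = {s for route in routes.values() for s in route}
        self.in_baskets = {s: deque() for s in stations}
        self.completed = []  # (workstation, document_id) pairs, kept for review

    def submit(self, document_id, document_type):
        """Start a document on the fixed route for its type."""
        self._notify(document_id, self.routes[document_type], 0)

    def _notify(self, document_id, route, step):
        self.in_baskets[route[step]].append(Notification(document_id, route, step))

    def complete_next(self, station):
        """Finish the oldest task at a workstation and pass the document's
        notification to the next workstation on its route, if any."""
        note = self.in_baskets[station].popleft()
        self.completed.append((station, note.document_id))
        if note.step + 1 < len(note.route):
            self._notify(note.document_id, note.route, note.step + 1)
        return note


# Hypothetical route for the trial-readiness example in the text.
ROUTES = {"trial_ready": ["readiness_review", "scheduling", "notices"]}
```

A production workflow engine would add the dynamic behaviors listed above (rerouting, delay detection, priorities); this sketch covers only the fixed-route case.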
II.C.2. Equipment and Software
Image enhancement, compression, and decompression are accomplished by circuit boards installed in scanners or the computers to which the scanners are connected. Compression and decompression also can be done by software.
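Software compression of a bitonal image can be illustrated with run-length encoding, one simple technique that exploits long successions of identical bits such as white margins. The Python sketch below is an illustration under that assumption, not the scheme of any particular imaging product.

```python
def rle_encode(row):
    """Compress a row of bits into (bit, run_length) pairs."""
    runs = []
    for bit in row:
        if runs and runs[-1][0] == bit:
            runs[-1] = (bit, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((bit, 1))              # start a new run
    return runs


def rle_decode(runs):
    """Decompress (bit, run_length) pairs back into the original row."""
    return [bit for bit, count in runs for _ in range(count)]


# Hypothetical scan line: 20 white bits of margin, a few marks, 20 more white.
ROW = [0] * 20 + [1, 1, 0, 1] + [0] * 20
```

Here 44 bits reduce to five runs, and decoding restores the row exactly, which mirrors the compress-before-storing and decompress-before-viewing cycle described above.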
Database management, document management, and workflow are accomplished using software. These functions may be part of or separate from the imaging software.
Workflow usually is accomplished over a local area network (LAN) composed of workstations staffed by individuals who perform the requested functions on imaged documents.
II.D.1. Functions
Among the most basic functions of imaging systems are storing and retrieving imaged documents to and from computer storage, but imaging seldom functions in a vacuum. As noted above, imaging usually is combined with other systems to accomplish the work that must be done on documents.
This is especially true with database management that creates, uses, and maintains indexes and more complex databases. Databases, including indexes, permit documents to be referenced, stored, retrieved and, in more sophisticated applications, summarized and reviewed to help decide whether to retrieve an entire document.
The storage and retrieval capabilities of some sophisticated database management systems include highly advanced indexing techniques and document searches using keyword descriptors in Boolean and contextual retrieval statements. Courts seldom need such high-powered systems, which usually are found in large libraries or similar document repositories.
Images consist of bit-mapped pictures of documents, and database management systems contain text and data. While these different representations of their contents make imaging and database management systems inherently different, they sometimes are part of the same system--particularly with simple databases such as indexes.
II.D.2. Equipment and Software
The primary storage medium for images is optical or magnetic disk. Optical disk typically is used for permanent storage because its greater capacity is more suitable for images, which as bit-mapped pictures consume a great deal of storage space. On the other hand, because magnetic disk retrieves and stores data faster, it often serves as a staging area for images on which work is being performed.
Typically, while an image is being input, checked, and indexed, the system stores it on magnetic disk. When this work is complete, unless further work is imminent, the image is transferred to optical disk. When there is more work to be done on the image, the system copies it from optical to magnetic disk so the work, and the storage and retrieval that go with it, can proceed using the speed of magnetic disk.
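The movement of an image between the two kinds of disk can be sketched in Python as follows. The two dictionaries stand in for magnetic and optical storage, and the class and method names are assumptions made for illustration.

```python
class ImageStore:
    def __init__(self):
        self.magnetic = {}  # fast staging area for images being worked on
        self.optical = {}   # high-capacity permanent storage

    def capture(self, doc_id, image):
        """Newly scanned images are staged on magnetic disk for checking
        and indexing."""
        self.magnetic[doc_id] = image

    def finish_work(self, doc_id):
        """When work is complete, transfer the image to optical disk."""
        self.optical[doc_id] = self.magnetic.pop(doc_id)

    def open_for_work(self, doc_id):
        """When more work is needed, copy the image from optical back to
        magnetic disk (if it is not already staged) and return it."""
        if doc_id not in self.magnetic:
            self.magnetic[doc_id] = self.optical[doc_id]
        return self.magnetic[doc_id]
```

Note that `open_for_work` copies rather than moves the image, so the permanent copy on optical disk stays in place while work proceeds on the staged copy.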
Optical disks are of two types:
- Those for which a given recording area can be written onto once but read from many times (called Write Once, Read Many or WORM optical disk); and
- Those for which, like the more familiar magnetic disk, a given recording area can be both written onto and read from many times (called erasable optical disk).
Optical disk from which information can only be read (the most well known of these is Compact Disk-Read Only Memory, or CD-ROM) is a special type of WORM technology. CD-ROM is commonly used by publishing and reference services, and WORM or erasable units are more common with imaging systems.
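The behavioral difference between WORM and erasable optical disk can be modeled in a few lines of Python. The classes below are an illustrative model only; the names and the use of an exception to signal a rejected rewrite are assumptions.

```python
class ErasableDisk:
    """A given recording area can be written and rewritten many times."""

    def __init__(self):
        self._areas = {}

    def write(self, area, data):
        self._areas[area] = data

    def read(self, area):
        return self._areas[area]


class WormDisk(ErasableDisk):
    """Write Once, Read Many: a recording area accepts exactly one write."""

    def write(self, area, data):
        if area in self._areas:
            raise PermissionError(f"recording area {area} has already been written")
        super().write(area, data)
```

The write-once property means that a stored image cannot be silently overwritten, which is one reason it may suit records that should not change after filing.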
Optical disk autochangers are control devices that move optical disks between their storage areas and the disk drive in which the computer reads from or writes onto them. Autochangers are also known as jukeboxes.
II.E.1. Functions
Most imaging systems have at least three user groups:
- Operators who scan documents into the system and index the documents. In many organizations, there is a specific workgroup for these purposes.
- Operators who process imaged documents as part of the imaging workflow. For example, in a court clerk's office, this group would use imaged pleadings to assist in performing whatever activities were called for in the pleading, such as scheduling hearings and notifying parties.
- Users and people outside the system who are not part of the workflow but use information from the imaged document in their work or for other reasons. Many courts have public imaging workstations so that individuals outside the system can view records from their case files.
The tasks performed by the first two groups are within the imaging system as described above in the capture, index, process, and store and retrieve functions. In addition, imaged documents usually are disseminated, or at least made available, to people in the third group. Sometimes people in this group depend heavily on imaged documents. Such people include judges and other attorneys who need electronic case "folders" just as they formerly needed hard copy case folders. Other installations use imaging primarily to help with data entry into a computer data processing system, and people in the third group need the imaged documents only if questions or problems arise.
The dissemination function primarily addresses the need to get imaged documents to this third group that consists of users and people outside the imaging system.
II.E.2. Equipment and Software
Dissemination to users outside the imaging system requires viewers, printers, communications, and software.
- Viewers. Those outside the imaging system may be able to use a general-purpose PC monitor for image display if that monitor (1) can display graphics, (2) has sufficiently high resolution and screen size to display legible images, and (3) can satisfy the display requirements associated with imaging for a particular user (e.g., display image and data screen simultaneously). Although many PC monitors do not have this functionality, such users do not necessarily need the high-resolution, special-purpose imaging monitors required by those who will be scanning images and performing quality assurance.
- Printers. As in any computer installation, printer requirements depend on the intended use of the printed documents. Typically, a laser or comparable printer is needed for high-quality reproduction of imaged documents. This is particularly true if official documents will be printed.
- Communications. In most imaging systems, dissemination requires that imaged documents be communicated to users not directly connected to the imaging computer (also known as the imaging server). Frequently, this requires a communications network, such as a LAN. Communications networks advance today's movement toward integrated processing, which embraces imaging, data processing, word processing, and other types of office automation. For example, if all court users are in the same building or nearby buildings, a LAN can be set up with the following elements:
  - Imaging system LAN with
    - Imaging workstation PCs to scan, index, check, and process incoming pleadings and other documents;
    - Image file server PC and disk units to manage image files and make imaged documents available to users;
    - Image print server PC and printers to print imaged documents; and
    - Image communications server PC to control imaging LAN and disseminate imaged documents to users over court LAN.
  - Court main computer that runs case processing and other centralized systems and serves as the overall communications server for court LAN and external agency computers.
  - Individual PCs and shared printers of court users that
    - Communicate with imaging system, main computer systems, and other users as part of court LAN;
    - Use court case processing system resident on main computer;
    - Use imaging system resident on imaging LAN;
    - Use word processing and other office automation functions through court LAN;
    - Use desktop applications (e.g., spreadsheet and customized programs) using PC as stand-alone device; and
    - Combine all of above LAN applications for integrated use on PC (e.g., imaged documents and case processing data displayed together for case processing system update).
- Software. The above functions require imaging, database management, and communications software within the imaging system; software to achieve integration at the PC level; and other computer systems, communications, and applications (e.g., case processing) software.
Several other technologies can substitute for or complement the type of imaging covered in this report. The substitutes need not be "high tech"; photocopiers are a type of imaging. Micrographics can serve either as a substitute or complement, and Computer Output Laser Disk (COLD) and microform output are alternate output technologies. Text retrieval is an alternate method of storing and retrieving text and should be central to the future of imaging, data processing, and word processing integration. Judicial Electronic Document and Data Interchange (JEDDI) is an emerging capability that would permit courts and those who transact business in courts to exchange documents electronically. These technologies are summarized below:
- Micrographics. This includes familiar imaging technologies such as microfilm, microfiche, and other microforms. Like the imaging systems described above, micrographics produce representations of documents that can be stored, retrieved, printed, and displayed. They can include computer-assisted microfilm retrieval (CAR) that uses indexes to locate documents. Compared to the imaging systems described above, they are slower, more cumbersome to use, and more difficult to integrate with other types of systems. Their advantages are lower cost; higher potential for retaining accurate and stable images in storage for long periods; and independence from the hardware and software that store, retrieve, and display images. Some users seek the best of both worlds through a combination of micrographics and the imaging systems described above. For example, in the process of changing from micrographics to a new imaging system for which records must be converted, the old and new systems sometimes are used concurrently with a common index that locates each document regardless of where it resides. In another example, the new imaging system may be used for active documents and micrographics retained for archived documents.
- Microform Output. This output method is an alternative to those in the imaging systems described above. Imaging system output is placed on microfilm, microfiche, or other microforms as if it were the product of micrographics. A computer output microfilm (COM) recorder converts the image directly to microform (i.e., without an intervening paper copy). This method is useful to achieve the longevity of microform records for archived documents, but it is seldom used because COM recorders are expensive.
- COLD. Some imaging systems can store and retrieve documents in character (as opposed to image) representation formatted as computer-generated output pages. COLD technology provides this capability using read/write optical disk for document storage. It includes indexes and document retrieval capabilities. The documents can be displayed at general-purpose monitors (with no graphics required) since they are computer-generated output. COLD allows imaging systems to store documents more efficiently because it uses high-volume optical disk, and computer-generated output pages consume less storage than imaged pages. It gives users on-line access to the output pages instead of manually distributing micrographics output or computer printouts.
Earlier it was noted that some imaging systems can create an official document by superimposing data from case processing systems onto pre-imaged forms that contain the necessary inscriptions and signatures. Some COLD systems provide this capability by storing images (e.g., of forms with inscriptions or signatures) that are superimposed on the output pages when they are retrieved. Other imaging systems accomplish this through, for example, customized software.
- Text Retrieval. Like imaging, text retrieval gives the capability to store and retrieve documents. Unlike imaging, the documents are stored in character (as opposed to image) representation. Whereas imaging preserves the appearance of the original document, including text and non-textual graphics such as pictures and exhibits, character representation contains only the text. To offset this limitation, text retrieval offers advantages over imaging such as more efficient storage, the ability to display documents on general-purpose monitors, and powerful indexing and retrieval capabilities that may include the full text of documents. Text retrieval is an alternative to imaging if the only objective is to store and retrieve documents, but some users need to see the original document. This leads to imaging or the integration of text retrieval and imaging as discussed later in Chapter Five.
- JEDDI. This technology, which is in the conceptual stage, would permit documents and data stored in computers to be electronically exchanged between courts, attorneys, and others who do business with courts. It would be a major step forward in the march toward the paperless court.
Filings created by attorneys on their office PCs would be formatted, checked for errors, and put in electronic packages consisting of the document(s), or simply the data in the document, and the accompanying docket and financial data for the court case processing system. Then the electronic package would be transferred to a preselected network that would have the added value or functionality of storing the package in a "mailbox" on an interchange computer. The clerk's office would "look in" the electronic mailbox several times each day (or be notified automatically) and, if something were there, transfer it into the computer. After reviewing the information on the PC screen to verify its accuracy, the clerk would enter the docket and financial information into the case processing system and either enter the data in the document(s) into an electronic file or, if necessary, convert the document(s) to imaged form and enter the image(s) into an electronic case folder. This would consummate the filing without an exchange of papers and perhaps without creating an actual electronic or hard copy document. The process could be reversed for communications, such as notices and receipts, from the clerk's office to attorneys.
Aside from the obvious savings in time, effort, and money to file pleadings and distribute notices and other court papers, JEDDI would permit (1) attorneys to put filings into the electronic mailbox at times other than normal clerk's office hours and (2) courts to coordinate and standardize electronic filing procedures. Since JEDDI would involve electronic versions of documents, its users would gain the advantages of imaging described in Chapter Three.
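Since JEDDI is described only at the conceptual level, the following Python sketch is purely illustrative. It models the electronic package and the mailbox on the interchange computer; every class, field, and method name is a hypothetical choice and does not reflect any actual JEDDI specification.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class FilingPackage:
    case_number: str
    documents: list       # the document(s), or simply the data in them
    docket_data: dict     # for the court case processing system
    financial_data: dict  # e.g., filing fee information


class Mailbox:
    """Holds packages on the interchange computer until the clerk's
    office collects them."""

    def __init__(self):
        self._packages = deque()

    def deposit(self, package):
        """An attorney's office may deposit a filing at any time."""
        self._packages.append(package)

    def collect(self):
        """The clerk's office takes everything waiting, oldest first,
        and leaves the mailbox empty."""
        waiting = list(self._packages)
        self._packages.clear()
        return waiting
```

Separating `deposit` from `collect` is what would let filings arrive outside normal office hours: the package simply waits in the mailbox until the clerk's office next looks in.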
Except where specifically stated (e.g., in the coverage of text retrieval), these other technologies will not be addressed again in this report.
