An exciting new part of the Asian Classics Input Project is the ACIP Imaging Division. Digitizing texts by typing in the words allows them to be searched for the valuable information they contain, and to be printed out easily into books for refugee scholars and others; but the scanned images of the books have their own special value. For example, a scholar who wanted to see the original page from which a particular section of data was input could call it up automatically in AsiaView, to confirm whether a doubtful reading is correct. Many of the woodblock pages also contain special illustrations of historical figures and other subjects of great value, and these will incidentally be preserved as the folios of text are imaged. Finally, imaging is much faster than manual input; by undertaking to produce images of every page in the major Tibetan collections around the world, we can truly ensure that they will never be lost during political or economic upheaval, long before the end of the 150 to 200 years that will be required to input them all.
The ACIP Imaging Division has worked with a number of imaging experts in the United States to develop advanced techniques for creating microfilm images of Tibetan text, and then converting these to digital images. Archival-quality microfilm has a life span roughly ten times that of CD-ROMs (say several hundred years, as opposed to the several decades it takes for the lamination of a CD-ROM to dry and split). This approach also allows for paper printing of the original images. This is critical because, as the treasures of Tibetan literature are digitized and more and more people use these computer files for their research, support for the native and Western libraries which hold the original printed copies of these texts may well dwindle, as have the printed book collections of public and other libraries in the West.
Digital media, though extremely powerful, cheap, and convenient for research, are by nature particularly fragile: the machinery to read them ages and breaks down easily, and unless digital media are updated frequently they can be "passed by" for a whole generation of technology, becoming unreadable by the next generation of machines. It is not at all beyond the imagination that a global-scale political or economic disruption, if it continued for more than a few years, could result in the complete loss of an electronic database such as ACIP's. And as the distribution and storage of more and more information depends on the Internet or similar electronic networks, we become increasingly vulnerable to accidental or malicious disruption, or to interference with and control of this information by governmental or special-interest groups. It is therefore important that the images, and not only the digital form, of these texts be preserved, and that repositories of printed paper copies be created and supported.
The imaging of the texts slated for input by the Project has a final, major benefit. The Project has successfully entered, manually, many tens of thousands of pages. As the corresponding images are created, artificial-intelligence programs can be written that will be able to quickly correlate even the most unusual carvings of letters to their correct representation in ACIP input code. This will lead to the successful development of the first optical character recognition (OCR) programs for carved woodblock Tibetan texts, which have not been viable to date because of the wide variation in how a single letter is carved in separate works. We will then be able to create searchable file versions of the various images, simply by feeding them into the OCR program. This in turn will shorten the time required to input the entire body of Tibetan sacred literature, perhaps by many decades.
Users are directed to a special "scans" directory on the CD-ROM that contains sample scans done with normal scanning technology, and also sample scans that utilize special new techniques that result in greatly enhanced image sharpness. A "read me" file in that directory gives further information.