VA-HIST Archives

Discussion of research and writing about Virginia history

VA-HIST@LISTLVA.LIB.VA.US

Subject:
From:
Randy Cabell <[log in to unmask]>
Reply To:
Discussion of research and writing about Virginia history <[log in to unmask]>
Date:
Thu, 21 Mar 2002 10:51:57 -0500
Thanks to all of you who responded with experience and ideas.  I'll close out the discussion with the results of a novel approach: speaking the text in.  My son read in a couple of the same pages that had also been scanned and OCR'd, and typed in by hand, so that we could get some comparisons.  One was a carbon copy of typed minutes.  The second was a page from a book, with very high quality, clear printing.

OCR on the old notes was completely unsatisfactory because the old typewriter did not close some characters, left descenders off, etc.  Manual typing-entry (me) was at about 40 words per minute, which included lotsa backspaces and corrections along the way, and came out to about 7 minutes for the page.

OCR on the good quality document took only about 2 minutes, but proofing brought that up to 6 to 8 minutes.  This was about the same as the typing, which took about 7 minutes.  (I'm a blazingly fast typist, whose entry speed is exceeded only by my error rate.)

It took him four minutes to read each page in, which gives you some time comparison with OCR and typing.  The results indicate that maybe he should have taken a bit more time.  In any case, I include some of the results of the 'readings', to illustrate the pitfalls of reading as well as for a bit of merriment.

Two from the typed notes:

'...from Richmond in private rail road car.'
became
'...from Richmond in private where road car.'


'In appreciation of this gift, on motion made...'
became
'In appreciation of diskettes, on nation mode...'



Two from the high-quality printed document rate far higher on the "merriment scale":

'and was at one time one of the three Episcopalians in the county of Halifax.  A memorial of her exists in the Bruce Fund...'
became
'...and was at one time one of the three a test of billions in the County of Halifax.  immemorial offer exists in the bruise find...'

But my favorite:
'...was the eldest son of Patrick Henry, the orator, by his second wife, Dorothea Dandridge'
became
'...was the eldest son of Patrick Henry, the war tore, by his second wife, Dorothy had dandruff edge.'

Obviously, speech technology has a bit of a way to go.

Randy Cabell

