VA-HIST Archives

Discussion of research and writing about Virginia history

VA-HIST@LISTLVA.LIB.VA.US

Subject:
From: Holly Hodges
Reply To: Discussion of research and writing about Virginia history
Date: Tue, 14 Oct 2014 12:16:24 -0400

Thanks, Bruce, an interesting idea.

I don't use a Macintosh. I've been using MS Word and simply transcribing. I've done a fair amount of transcription for the Amherst newspapers, but most of it is still in Word. I only correct text on the site when I'm up to the tedious process it requires. Working in Word also lets me "correct" the newspaper's typos more easily. After ten years of working with Amherst County history, it's easy for me to spot when the typesetter or reporter has gotten a name _almost_, but not quite, right. I simply insert the correct name into the document in brackets, as most transcriptionists would do. Then when I search the document, I can still find it, even if the typesetter or reporter did not get it right.

However, since I tire of the "Correct this Text" process easily, most of my transcriptions are not online. I would prefer to make them available to more people who use the site, but the "Correct this Text" process hurts my eyes after a short time.

I look forward to seeing more on this site, and I am most appreciative that LVA has arranged to make these available!

Many thanks!

Holly Hodges
Amherst County

 

 

-----Original Message-----
From: Bruce Harper
To: VA-HIST
Sent: Tue, Oct 14, 2014 9:00 am
Subject: [VA-HIST] Correcting OCR Text


I've been rolling through the Virginia Chronicle site, digging out
various and sundry news articles about the Virginia Polytechnic
Institute around the turn of the last century. I have also been
dealing with the "Correct this Text" process* to clean up articles I
have run across. These articles have obvious typos and other errors --
typically a missing letter in a word or transposed letters, plus the
occasional misspelled name. In most cases, I'll leave it as it is and
add [sic] to indicate that the error is in the original text and not
an error in the correction. My question is whether this is the
appropriate process: to leave things as they are.

*The line-by-line correction is tedious at best, especially when the
text is more errors than clean words. I came up with a slightly better
process, although not perfect. I select the text to be corrected, copy
it, and paste it into a BBEdit document (the greatest text editor for
Macintosh). I can then use search-and-replace, spelling check, and
general editing to clean up the text. Once done, I go back to the
Chronicle page, click "Correct this Text" and select the article, then
delete the bad text line by line. I then drag and drop from the BBEdit
document to the Chronicle page until the corrected text has
completely replaced the original OCR scan. Still a little tedious but
a little quicker and easier than line-by-line. The perfect solution
would be a way to drop in the whole corrected article at once
instead of piece by piece.

Bruce in Blacksburg

Bruce Harper
University Relations/Web Communications
101-D Media Building
101 Draper Rd. NW
Blacksburg, VA 24060
540-231-4360



 


______________________________________
To subscribe, change options, or unsubscribe please see the instructions at
http://listlva.lib.va.us/archives/va-hist.html
