Topic: OCR limited grid test

mapurves

  • Shipherd
  • Hero Member
  • *****
  • Posts: 1846
Re: OCR limited grid test
« Reply #45 on: January 12, 2017, 06:49:52 pm »
I did my morning chores and went to add to the OCR test. This is what I got, after logging in:

Quote
Great work! Looks like this project is out of data at the moment!

Kevin

  • Old Weather Team
  • Hero Member
  • *****
  • Posts: 527
Re: OCR limited grid test
« Reply #46 on: January 12, 2017, 06:52:31 pm »
I think the retirement is set to 4.

Randi

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 12134
Re: OCR limited grid test
« Reply #47 on: January 12, 2017, 06:57:33 pm »
I just did a Farragut page (9 Jan 1942) and was presented with a land station page starting with Ceylon.
When I clicked on OLDWEATHER OCR, I got the main screen with the message "Great work! Looks like this project is out of data at the moment!".
If I click on Get Started, I go back to the Ceylon page.

Hanibal94

  • Shipherd
  • Hero Member
  • *****
  • Posts: 4215
  • Better to do it, than live with the fear of it.
Re: OCR limited grid test
« Reply #48 on: January 12, 2017, 07:08:27 pm »
The site statistics say:

Retirement limit: 3
Images retired: 23 / 20
Classifications: 147 / 60

Kevin

  • Old Weather Team
  • Hero Member
  • *****
  • Posts: 527
Re: OCR limited grid test
« Reply #49 on: January 15, 2017, 04:31:24 pm »
I'll be in touch with Laura about the next step this week. Thanks, everyone, for pitching in; I'll let you know if there was a glitch in the closeout. Also, for your information, we hope the capabilities we are developing will be transferable to any project working on tabular data. The best way to use them may be to run grid or computer vision analysis as part of the initial processing stage rather than as a citizen-science activity.

mapurves

  • Shipherd
  • Hero Member
  • *****
  • Posts: 1846
Re: OCR limited grid test
« Reply #50 on: January 15, 2017, 04:40:38 pm »
How much severance pay will we be getting when the computer takes over and we're all laid off?  ;D ;D ;D

Bob

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1319
Re: OCR limited grid test
« Reply #51 on: January 15, 2017, 04:46:18 pm »
Quote
How much severance pay will we be getting when the computer takes over and we're all laid off?  ;D ;D ;D

I heard it would be at least 1.5 times our current rate.  ;)

Randi

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 12134
Re: OCR limited grid test
« Reply #52 on: January 15, 2017, 09:30:54 pm »
Thanks for the update, Kevin!

I sure hope that doesn't kick me into a higher tax bracket ;D

Kevin

  • Old Weather Team
  • Hero Member
  • *****
  • Posts: 527
Re: OCR limited grid test
« Reply #53 on: January 15, 2017, 11:13:30 pm »
Well, for those concerned with getting laid off I can assure you that won't be happening. Testing so far suggests there will be an 'eyes only' requirement: you just won't be asked to transcribe the fraction of numbers that can be OCR'd. Best-case guess: 10-40% will require review and correction. For logbooks, the need will also remain for data and information on the remarks page. Hopefully the system will be good enough for bulk processing of pure data tables like the Indian Daily (IDWR) examples in the test.

FYI, we are about to start writing a 3-year proposal to image the remainder of the pre-WW2 logs in the US National Archives. Not counting the many we've already done, that's 10,427 volumes in 118-A and 7,119 boxes in 118G-A..Z. Probably we will look at prioritizing early 20th c. typed material and the Civil War era. For the latter, Mark M is interested in also imaging a related collection of oversize muster rolls, which should open up new opportunities for historical scholarship, especially on the lives of ordinary sailors. Up to now a researcher has had to visit A1 to work with these records, but hopefully they'll be online if we are successful.

AvastMH

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7071
Re: OCR limited grid test
« Reply #54 on: January 17, 2017, 07:06:36 pm »
Quote
Well, for those concerned with getting laid off I can assure you that won't be happening.

All holidays cancelled then  :'( :'( :'( ( ;) ;D )

Kevin

  • Old Weather Team
  • Hero Member
  • *****
  • Posts: 527
Re: OCR limited grid test
« Reply #55 on: April 08, 2017, 10:14:39 pm »
To keep everyone up to date: we have now tested several OCR and script recognition systems and have found that the current off-the-shelf state of the art does not produce reliable enough results for us. So far, auto-recognition doesn't increase the efficiency of data conversion because the manual correction component is too high and too variable. However, we did learn some important things about what is possible and worth developing, and we will continue to work on these. There will be no layoffs this year.

Craig

  • Shipherd
  • Hero Member
  • *****
  • Posts: 2974
Re: OCR limited grid test
« Reply #56 on: April 08, 2017, 10:49:44 pm »
I guess that's not too surprising, Kevin, but as you say, it was worth a try. I suppose using neural network algorithms is too expensive?

Randi

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 12134
Re: OCR limited grid test
« Reply #57 on: April 08, 2017, 11:05:27 pm »
Quote
There will be no layoffs this year.

Whew!

AvastMH

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7071
Re: OCR limited grid test
« Reply #58 on: April 08, 2017, 11:45:34 pm »

mapurves

  • Shipherd
  • Hero Member
  • *****
  • Posts: 1846
Re: OCR limited grid test
« Reply #59 on: April 09, 2017, 12:16:49 am »
Quote
There will be no layoffs this year.

Great relief here, although the pain of layoffs can be relieved somewhat by a generous separation bonus...  ;D  (I'm thinking something along the lines of Goldman Sachs and other such organizations...)  ;)