If someone has the time, they can copy the pages, scan those, and then put the results into Excel. I'm guessing that's what the folks who have the info did. I guess it's possible that Check Six did it by hand. That's a lot of typing.
I think you don't see what I'm saying. Copying the pages doesn't give you the text the way typing it in would.
That's what makes it tedious. Even with the higher quality Tosaw, OCR will give you errors. It's all about verification. Context per page helps with a lot of verification. I have some other things I do that rely on context. But visual is the final verification.
I don't think many people have a list. I don't think ha.com has a list, except for what they wrote down that they verified.
To describe the scale of the problem:
Each serial is 3 + 3 + 4 + (2 or 3) chars. (The 4 might be legal to be 3, but I've not seen it in this list.)
So that's 12 or 13 characters per serial that have to be exactly right.
Multiply that by 9998 and you have between 119976 and 129974 characters to deduce in total.
And you want 0 errors.
That means doing better than 1 error in 119976 characters, i.e. an error rate below 0.0008335%.
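Here's that arithmetic as a quick Python sketch, in case anyone wants to check it or plug in different counts. The names (SERIAL_COUNT, total_chars, max_error_rate_pct) are just mine for illustration.

```python
# Rough scale of the problem: every character must be right.
SERIAL_COUNT = 9998  # number of serials, as stated above


def total_chars(chars_per_serial: int, count: int = SERIAL_COUNT) -> int:
    """Total characters that have to be deduced correctly."""
    return chars_per_serial * count


def max_error_rate_pct(total: int) -> float:
    """Error rate (in percent) of a single wrong character out of `total`."""
    return 100.0 / total


if __name__ == "__main__":
    for chars in (12, 13):
        total = total_chars(chars)
        print(chars, "chars per serial:", total, "chars total,",
              "one error =", f"{max_error_rate_pct(total):.7f}%")
```

To get 0 errors, the real per-character error rate has to be below the one-error figure for the whole list.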
EDIT: the context that I'm talking about that helps is:
We know each page is sorted
We know there aren't duplicate serials
We know there is a restricted set of legal serials. The chars are not all random. There are positional limits on the choices within each of the 12 to 13 chars. And the series year choices are fixed. Additionally, the leading letter range is just A-L (Fed Reserve choices) and the trailing letter in the serial is just A or B or C or D or *
(yes, I have seen some C and D, still verifying though. Mostly A or B or *)
Valid series years are "34", "50", "50A", "50B", "50C", "63", "63A", "69"
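A minimal sketch of how those limits could be used to flag suspicious OCR output. Assumptions on my part, not from the list itself: the serial has already been normalized to a 10-char string with no spaces, and the middle is 8 digits (that's what fits between the leading and trailing letter). The names SERIAL_RE and is_plausible are hypothetical.

```python
import re

# Series years taken from the list above.
VALID_SERIES = {"34", "50", "50A", "50B", "50C", "63", "63A", "69"}

# ASSUMPTION: leading letter A-L, then 8 digits, then A/B/C/D or a star.
SERIAL_RE = re.compile(r"[A-L]\d{8}[ABCD*]")


def is_plausible(serial: str, series: str) -> bool:
    """True if the OCR'd serial and series year fit the known limits.

    Passing this check does not prove the entry is right; it only
    catches entries that cannot be legal.
    """
    return SERIAL_RE.fullmatch(serial) is not None and series in VALID_SERIES


if __name__ == "__main__":
    # Made-up examples, not real entries from the list.
    print(is_plausible("L12345678A", "69"))   # fits the pattern
    print(is_plausible("L1234567BA", "69"))   # letter where a digit belongs
    print(is_plausible("M12345678A", "69"))   # leading letter outside A-L
```

This only rules out impossible entries. A digit misread as another digit still passes, which is why the sort order and the visual check matter.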
Interestingly, the 1934 twenty is dramatically different looking. It has different words on it.
If the chars in the serials were fully random it would be a harder task. Also, if the FBI list wasn't sorted it would be a harder task. I suspect the sort also helped the FBI's verification process, whatever it was.
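A rough sketch of the sorted-page and no-duplicates checks. Big assumption: I'm comparing entries as plain strings by default, which may not match the actual sort order of the list, so the key function is left for the caller to supply. The function names are hypothetical.

```python
from collections import Counter


def order_breaks(page, key=lambda s: s):
    """Indexes where an entry sorts before the one just above it.

    ASSUMPTION: `key` reproduces the list's real sort order; the
    default is plain string comparison.
    """
    return [i for i in range(1, len(page)) if key(page[i]) < key(page[i - 1])]


def duplicates(serials):
    """Entries that appear more than once across the given serials."""
    return sorted(s for s, n in Counter(serials).items() if n > 1)


if __name__ == "__main__":
    # Made-up page: the second entry has a digit misread.
    page = ["A12345678A", "A12345688A", "A12345679A", "B00000001A"]
    print(order_breaks(page))
    print(duplicates(page + ["B00000001A"]))
```

A break only tells you something near that position is wrong. The bad entry can be the one flagged or the one just above it, so it still needs the visual check against the page.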