Live Text is awesome; it just took us a long time, 45 years, to get there.
Like many kids, I was read to a lot by my parents when I was young; but because I am totally blind, they probably read to me much later into my life than most. Even when I was in college, Mom recorded long handouts for me because OCR wasn’t available to me yet. Professors would hand them out and require them to be read so quickly that producing braille copies was not feasible. Too bad she recorded them on cassette tapes, which had to be brought to me 30 miles away, instead of digital files that could have been transferred over the internet. Oh yeah, that wasn’t an option yet either; young readers are probably thinking “you’re old”.
Optical character recognition, often called OCR, is the process of taking images of text and converting them to actual text that a computer can use, such as text you could open in Notepad or TextEdit, or text that can be searched. OCR used to take much longer to perform, around 30 seconds per page in the 1990s on a 486 CPU.
I remember when I was young, maybe in grade school, hearing about optical character recognition for the first time. I heard that in 1975, Ray Kurzweil had invented a machine the size of a washing machine that could read books to the blind. Later, when I was in around eighth or ninth grade, Mom read an article to me about one of the first personal reading machines, the Kurzweil Personal Reader, called the KPR. It still cost about ten thousand dollars at the time, but Mom said, “Hey, when you grow up and get a job, you can buy one.”
Arkenstone formed soon after, and in the early 1990s came out with Open Book, a competitor to Kurzweil’s software, and that’s what I bought a few months after graduating from college in 1994. Both products cost about $1000 and also required a scanner at about the same price. You also needed a higher-end computer at the time, so for about $4000 I had a reading system. That was still very expensive, but I was able to read books independently for the most part, and it was amazing. The scene stayed the same for some years, though there were price drops. TextBridge and OmniPage were usable by advanced screen reader users, and for some the cognitive load was worth the money saved. I call these systems desktop readers. They still exist, and if you are scanning a long document, like a book, they are the best way to go.
There were products like hand scanners that you could slide down a page of text, and I bought one around 2004, but I could never get accurate scans out of it. Also, they still required a laptop or desktop and software to OCR the scanned images, so they weren’t what one could call completely portable yet.
Then, in 2006, truly portable reading first became available. The Kurzweil company joined up with the National Federation of the Blind, and together they came out with the first version of the KNFB Reader.
It was ingenious at the time, combining a 5MP point-and-shoot camera with a high-end Pocket PC running Windows CE in a case that held them together. The price was still very high, as it is for many adaptive technologies, at $3500, but a new era of OCR and reading was born. You had to position the camera, take a picture, and process the text, but for the first time it was possible for totally blind people to read documents accurately and quickly at work or in meetings on the fly. I remember that in years past, when I donated blood, the Red Cross required donors to read a several-page document explaining the rules and process around donating blood every time they donated. Back then, a person would have to go with me and read it to me, making my experience longer. Finally, a device like the KNFB Reader could speed things up and make blind donors more independent.
It was still very expensive, and many people in the blind community rightfully complained, as few could afford it. In 2008 a second version of the KNFB Reader software came out for the Nokia N82 phones. The software plus the phone and screen reader still cost about $2600; an improvement, though still very expensive and out of the price range of most. In 2013 we got our big break, though: the new iPhone 5C had a big camera upgrade, and the KNFB Reader was finally possible on iOS. It cost $100, and some blind people still complained. That was a huge price reduction, though; I think some people just like to hear themselves complain. Money was especially tight for me that month, but I still bought it, used it a lot, and was glad I did, even if it was painful for my wallet. It even worked on the top iPod touch at the time, so for about $400 one could buy a KNFB Reader system from scratch.
Since then, more iOS programs have come out in what I call the portable reader space, and some, like Prizmo, compete seriously with KNFB. The weakness of all camera-based OCR to this day, however, is the positioning of the camera. Some people use dedicated stands that hold the camera and documents, which often improves accuracy a lot. It’s still not instant, though; one has to position the camera, take a picture, and then process the image. Prizmo and the KNFB Reader both help blind users focus the camera on documents they want to read, and that helps a lot. Though it is a huge improvement over the past, and can scan books much faster than old flatbed scanners, it’s not quite something you could use in real time, the way sighted people can read signs in public. What we needed was OCR performed on real-time video.
Today, that is what we have. There are some apps that have been out for five years or so that take video and then perform OCR on it. Prizmo Go, Envision AI and Seeing AI are the best known, with Seeing AI being the most popular in the blind community. These apps, excluding Prizmo Go, can read text in real time, and can also do other things like read barcodes, identify colors, and identify the objects they see.
Live Text in iOS 15 is Apple’s version of these apps, reading text in real time. It’s the culmination of what OCR can do; it’s what I call instant OCR or instant reading. Envision AI and Seeing AI weren’t as well known outside of the blind community, so Live Text seems amazing to many sighted users today, as they hadn’t seen it in action before. Some only think of it as a cool novelty, though many realize how handy it can be. I invite readers to just take a moment and imagine how useful reading text in images could be if one were totally blind.
Some might think it’s cool, others might call it a productivity tool; I say it’s a huge game changer.