My first round of college exams is now officially over :). I had 3 exams: Calc III, Physics and Computer Science. The grind was on: a friend of mine was in the same Math and Physics classes as I was, and every day leading up to our exams, we locked ourselves in a classroom, chalk-boarded up, and ran through every textbook review problem and every past exam available.
Following my last exam, I crashed on the couch and binged a solid 4 hours of Seinfeld with some friends, then flew back home to Memphis the next morning. After this type of grind, I’d normally give myself a nice break, but, having been deprived of my Jupyter notebook since exams started, I was itching to build some models.
I remembered a project I had begun a while back and run into a wall with: a homework individualization platform for elementary school math programs. The problem was that I had no good way to assess students’ competencies without placing a large burden on the teacher, who would have to enter both the worksheet’s questions and the student’s answers. It was simply infeasible.
However, I decided that, with my spare time right at the start of the break, I would try to build a machine learning model that could dynamically read, and thus assess, any worksheet from a simple picture.
This was a complex problem, and I had absolutely no idea how to begin. So, as always, when I hit a wall, I make sure that wall is a whiteboard. I met up with Ramiz, one of my best friends from high school (a Math/Biochem/Classics major who loves designing solutions but hates implementing them), and we started from square one: the numpy picture array input.
Yes, we literally drew the array on the board. The initial parts of the problem, locating and reading numbers, would not be exceptionally difficult. We would just use OpenCV to do some preprocessing, catch the numbers, and run them through a series of classification models, and then we would know exactly which characters are present on the worksheet.
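To make the "catch the numbers" step concrete, here is a minimal sketch under loud assumptions: it is not the code from the repo, and it swaps OpenCV's contour detection for a plain-Python flood fill so it runs with no dependencies. The function name `find_blobs` and the tiny 0/1 "page" are hypothetical, made up for illustration. Each bounding box it returns is the kind of crop that would then be handed to a classification model.

```python
from collections import deque

def find_blobs(binary):
    """Return bounding boxes (top, left, bottom, right) of connected ink regions.

    `binary` is a list of rows of 0/1 values, where 1 marks ink.
    Hypothetical stand-in for a contour-detection step.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

# A toy "worksheet": one 2x2 blob and one 2x1 blob.
page = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
]
boxes = find_blobs(page)
```

On a real scan, the 0/1 grid would come from thresholding the grayscale image first; the toy grid above skips that step.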
However, the complex part of this problem was how to read these characters in a coherent order that returns each question paired with its answer. To solve this, we grabbed some Taco Bell and some La Croix and spent the night whiteboarding a search algorithm that would adapt to nearly all forms of worksheet. Not a line of code was written that night, but after that, I was ready to implement our idea.
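The whiteboarded search algorithm itself is not spelled out here, so the following is only an illustration of the pairing problem, not that algorithm. It assumes each detected question and answer already has a page position, and it greedily matches every question to the closest unused answer. The function name `pair_questions_with_answers` and the `(text, x, y)` tuples are hypothetical.

```python
import math

def pair_questions_with_answers(questions, answers):
    """Greedily pair each question with the closest unused answer.

    Both arguments are lists of (text, x, y) tuples, where (x, y) is the
    centre of the detected region on the page. Illustrative only.
    """
    remaining = list(answers)
    pairs = []
    for q_text, qx, qy in questions:
        if not remaining:
            # No answers left: record the question as unanswered.
            pairs.append((q_text, None))
            continue
        nearest = min(remaining, key=lambda a: math.hypot(a[1] - qx, a[2] - qy))
        remaining.remove(nearest)
        pairs.append((q_text, nearest[0]))
    return pairs

# Two questions stacked vertically, with answers detected out of order.
questions = [("3+4", 10, 10), ("6-2", 10, 50)]
answers = [("4", 40, 52), ("7", 40, 11)]
pairs = pair_questions_with_answers(questions, answers)
```

A greedy nearest-neighbour match like this breaks down on dense or multi-column layouts, which is exactly why a more adaptive search is worth a night at the whiteboard.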
The next three days were an absolute grind. Wake up, eat breakfast with my family, head to the library with 5 Peanut Butter, Banana, and Honey sandwiches, head back home 10 hours later for dinner, take a quick Star Wars Battlefront break with my little brother (the 2003 version), and then grind a little more. I built, cleaned, and processed 3 datasets, then rebuilt, recleaned, and reprocessed them because I had made a number of errors. I built 3 deep learning classification models, an OpenCV search and dictionary storage model, and 1 search algorithm. After grinding through bugs that I did not even know existed, I eventually built my worksheet reader :).
As it stands, the reader is limited by the classification models it depends on, since my data preprocessing was not nearly as thorough as I now realize it needs to be. Right now it can read the questions on any scanned worksheet, and as soon as I update my find-Squares function to be more lenient (people generally do not box their answers in perfect squares), it will be able to read worksheets with answers as well.
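One way a more lenient square check could look, as a sketch only: instead of demanding equal sides, accept any box whose sides differ by at most some fraction of the longer side. The helper `is_roughly_square` and the 25% default tolerance are assumptions for illustration, not the find-Squares function from the repo.

```python
def is_roughly_square(width, height, tolerance=0.25):
    """Accept a box whose sides differ by at most `tolerance` of the longer side.

    Hypothetical helper; the tolerance value is a guess that would need tuning.
    """
    if width <= 0 or height <= 0:
        return False
    longer, shorter = max(width, height), min(width, height)
    return (longer - shorter) / longer <= tolerance

# (width, height) of candidate answer boxes, in pixels.
candidates = [(40, 40), (40, 34), (40, 20), (0, 10)]
accepted = [box for box in candidates if is_roughly_square(*box)]
```

Here the perfect square and the slightly squashed 40x34 box pass, while the 40x20 rectangle and the degenerate box are rejected.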
I have a demonstration of my worksheet reader on my GitHub, if you’re interested in checking it out: https://github.com/ghodouss/WS_Grader_Demo
This project is not finished, but at this point I simply have to expand the foundation to account for edge cases and other types of worksheets. To build a river, you gotta love digging. It was a good 4 days :).