CommonLit Readability Project
Posted on Tuesday, August 3rd, 2021
Client: Leanne Seaward
Professor(s): Leanne Seaward
Program: Computer Programmer Diploma (0336X)
Students: Lutfi Cildir, Parnoor Singh Gill, Abdullah Zeki Ilgun, David Lee, Marcelo Monteiro de Silva
Project Description:
The CommonLit Readability Prize is a Kaggle machine learning contest that challenges participants to create algorithms that rate the complexity of text passages and place them on a 3-12 reading-level scale. The idea is to produce a cost-effective measure that CommonLit Inc, a nonprofit education technology organization, can use to provide learning materials for teachers and students. Under the guidance of Algonquin College Professor Leanne Seaward, we sought to address the problem using a combination of a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) deep learning model and other natural-language processing models.
The prototype reads a CSV (comma-separated values) file containing sample sentences and produces a CSV file of predicted target scores. It uses the fine-tuned BERT model and is currently scripted, with no user interface. The output contains pairs of unique identifiers and generated target scores, which are evaluated by the Kaggle CommonLit Readability contest to determine the quality of the results.
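The CSV-in, CSV-out flow described above can be sketched roughly as follows. This is a minimal illustration, not the team's script: the column names `id`, `excerpt`, and `target` are assumptions, the function names are hypothetical, and `placeholder_score` is a simple stand-in for the fine-tuned BERT model, which is not reproduced here.

```python
import csv
import io
from typing import Callable, TextIO


def placeholder_score(text: str) -> float:
    """Hypothetical stand-in for the fine-tuned BERT model (NOT the real scorer).

    Returns the negated mean word length, so longer words give a lower score.
    """
    words = text.split()
    if not words:
        return 0.0
    return -sum(len(w) for w in words) / len(words)


def write_predictions(
    src: TextIO,
    dst: TextIO,
    score: Callable[[str], float] = placeholder_score,
) -> int:
    """Read rows with assumed 'id' and 'excerpt' columns; write 'id,target' rows.

    Returns the number of rows scored.
    """
    reader = csv.DictReader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["id", "target"])  # assumed output header
    count = 0
    for row in reader:
        # One output pair per input row: identifier plus generated score.
        writer.writerow([row["id"], f"{score(row['excerpt']):.4f}"])
        count += 1
    return count
```

With real files, the same function could be called on handles opened with `open(path, newline="")`. Passing the scorer as a parameter keeps the file handling separate from the model, so a real model's prediction function could replace the placeholder without changing the CSV logic.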
Short Description:
A machine learning contest submission that combines natural language toolkits with a fine-tuned machine learning model to sort reading material by reading grade level. Ideally, this would be useful in the education system.