CommonLit Readability Project

Posted on Tuesday, August 3rd, 2021

Client	Leanne Seaward
Professor(s)	Leanne Seaward
Program	Computer Programmer Diploma (0336X)
Students	Lutfi Cildir, Parnoor Singh Gill, Abdullah Zeki Ilgun, David Lee, Marcelo Monteiro de Silva

Project Description:

The CommonLit Readability Prize is a Kaggle machine learning contest that challenges participants to create algorithms that rate the complexity of text passages and place them on a 3-12 grade-level scale. The goal is to produce a cost-effective measure that CommonLit Inc., a nonprofit education technology organization, can use to provide learning materials for teachers and students. Under the guidance of Algonquin College Professor Leanne Seaward, we sought to address the problem using a combination of a fine-tuned BERT (Bidirectional Encoder Representations from Transformers) deep learning model and other natural-language processing models.

The prototype reads a CSV (comma-separated values) file containing sample sentences and produces a CSV file of model-generated target scores. It uses the fine-tuned BERT model and is currently scripted, with no user interface. The output file contains pairs of unique identifiers and generated target scores, which are evaluated by the Kaggle CommonLit Readability contest to determine the quality of the results.
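
The CSV-in, CSV-out flow described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual script: the column names `id`, `excerpt`, and `target` are assumed here, and `placeholder_score` is a hypothetical stand-in for the fine-tuned BERT model, which is not reproduced.

```python
import csv
import io
from typing import Callable, TextIO


def placeholder_score(text: str) -> float:
    """Hypothetical stand-in for the fine-tuned BERT model.

    Returns the negated average word length. This is NOT a real
    readability measure; it only makes the sketch runnable.
    """
    words = text.split()
    if not words:
        return 0.0
    return -sum(len(w) for w in words) / len(words)


def write_predictions(
    src: TextIO,
    dst: TextIO,
    score: Callable[[str], float] = placeholder_score,
) -> int:
    """Read passages from `src` and write (id, target) pairs to `dst`.

    Assumes the input has `id` and `excerpt` columns (an assumption,
    not confirmed by the project description). Returns the row count.
    """
    reader = csv.DictReader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(["id", "target"])
    count = 0
    for row in reader:
        # One output row per passage: its identifier and generated score.
        writer.writerow([row["id"], f"{score(row['excerpt']):.4f}"])
        count += 1
    return count
```

To run the sketch against real files, open both with `newline=""` (as the Python `csv` module recommends) and pass the file objects to `write_predictions`; a real model's prediction function would be supplied through the `score` parameter.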

Short Description:

A machine learning contest submission combining natural language toolkits with a fine-tuned machine learning model in order to sort reading material based on reading grade level. Ideally this would be useful in the education system.
