Word Booster: creating word lists and quizzes from web pages
Andrew Byungin Kim, primary creator of Word Booster, the Oxford Dictionaries API 2017 competition winner, tells us about how he developed his language-learning resource, his inspiration for its creation, and plans for its future.
Project name: Word Booster
Developer: Citrus EduTech Inc.
Tell us about your project!
Word Booster is an automated glossary-maker for HTML documents on the web. If you are using a web article to teach a language learner, or to learn from it yourself, making a word list and quiz for the text would normally take about an hour. Word Booster automates the process and saves you time. Professional-looking PDF documents are generated with just a few clicks, ready for class and self-learning.
We now have some English teachers who have integrated Word Booster into their regular programmes: rather than giving students a random list of words to improve their vocabulary, they give them an interesting article along with a word list made with Word Booster. These teachers don’t need to copy and paste text from ordinary dictionary websites to make glossaries and quizzes for their students – or force the students to memorize a random list of words!
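For readers curious what the first step of a glossary-maker might look like, here is a minimal sketch in Python. It is not Word Booster's actual algorithm, which the interview does not describe: it simply pulls the visible text out of an HTML page and keeps the words that are not on a list of words the learner is assumed to know. The names `TextExtractor`, `COMMON_WORDS` and `candidate_words` are hypothetical, and the tiny known-word set stands in for a real frequency list.

```python
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


# Assumption: a toy stand-in for a real list of words the learner already knows.
COMMON_WORDS = {"the", "a", "is", "in", "of", "and", "to", "it",
                "near", "was", "built", "river"}


def candidate_words(html, known=COMMON_WORDS):
    """Return unfamiliar words in order of first appearance, without repeats."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    result = []
    for word in re.findall(r"[a-z]+", text):
        if word not in known and word not in result:
            result.append(word)
    return result
```

Each word returned by `candidate_words` would then be looked up in a dictionary and laid out in the word list and quiz.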
Why did you decide to create it? What gave you the idea for Word Booster?
I taught English for seven years and always spent about 30 minutes – sometimes up to an hour – just making a word list and quiz, and properly formatting the text for each class. This is very labour-intensive and tedious for teachers, but important for students. Naturally, I always wondered if all this could be somehow automated because I studied computer science in college.
While teaching, I worked as a freelancer translating computer science papers from Korean to English, and at one point I stumbled upon the topic of Natural Language Processing (NLP). It was fascinating because I could connect much of what NLP offered to what I needed for my class. By connecting NLP and what I learned while teaching English, I was able to figure out the algorithms for Word Booster.
How did you use the API? Which endpoints and languages did you use?
Before using the Oxford Dictionaries API, my team had already tried other dictionaries, both monolingual and bilingual. To generate a document with many different words, we knew we had to cache the content; otherwise it would take forever to produce the result.
I was in the Early Adopter Programme for the Oxford Dictionaries API, and thankfully Oxford allowed data caching. From my perspective, this is a critical factor in choosing the Oxford Dictionaries API over other API services. As for endpoints, I hired a couple of programmers to do the work for me. My programmers used PHP for the backend, and Python for PDF file generation.
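The caching idea can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the team's PHP backend: `CachedDictionary` and `fake_fetch` are hypothetical names, and `fake_fetch` is a placeholder for whatever function would really call a dictionary service. The point is only that each distinct word triggers one remote request, however often it recurs across documents.

```python
from typing import Callable, Dict, List


class CachedDictionary:
    """Wraps a slow lookup function and remembers every answer it returns."""

    def __init__(self, fetch: Callable[[str], List[str]]):
        self._fetch = fetch                      # the slow, remote lookup
        self._cache: Dict[str, List[str]] = {}   # word -> definitions
        self.remote_calls = 0                    # counts cache misses only

    def lookup(self, word: str) -> List[str]:
        key = word.lower()
        if key not in self._cache:
            self.remote_calls += 1
            self._cache[key] = self._fetch(key)
        return self._cache[key]


def fake_fetch(word: str) -> List[str]:
    # Placeholder: a real implementation would query a dictionary API here.
    return [f"definition of {word}"]
```

A production version would presumably persist the cache in a database rather than in memory, subject to the caching terms of the dictionary provider.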
What do you hope to do with your app next? Any future developments or new ideas?
I will improve on the interactive aspect of the app. Currently, it feels very static, so I am planning to make it easy to specify which words to include or exclude on the list. Recently I got some funding from a local investor, so I am going to use the funds for that.
In the future, I plan to make it possible for media companies to add a ‘Word Booster’ button next to the ‘Print’ and ‘Share’ buttons on their websites. With a Word Booster button on each article page, learners will be able to grab a vocabulary list conveniently and improve their English. By including adverts in the process or charging users – just like YouTube does, for example – I will be able to collect revenue and share profits with the content owner. That could be anything from a simple self-published WordPress blog to multinational media platforms such as the BBC, the Economist, and the New York Times.
How would you like to see the Oxford Dictionaries API developed further?
It would be very nice if some NLP technology could be integrated into the API, for example, word sense disambiguation. If this were implemented, the user could transmit multiple sentences surrounding the queried word, and the engine would be able to figure out which definition is most relevant to the queried word within the context of the surrounding sentences. So, the output would show whether the word ‘plant’ means vegetation or an industrial facility, depending on the context. Less relevant sample sentences could be filtered out in the process.
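To make the ‘plant’ example concrete, here is a minimal Python sketch of one classic approach, a simplified Lesk-style overlap count: the sense whose definition shares the most words with the surrounding sentences wins. This is not a feature of the Oxford Dictionaries API; the `SENSES` glosses, the `STOPWORDS` list and the function names are all made up for illustration.

```python
import re

# Assumption: hand-written glosses standing in for real dictionary definitions.
SENSES = {
    "plant": {
        "vegetation": "a living organism such as a tree, flower or herb "
                      "that grows in soil and has leaves and roots",
        "industrial facility": "a factory or facility where an industrial "
                               "process such as power generation or "
                               "manufacturing takes place",
    }
}

STOPWORDS = {"a", "an", "the", "or", "and", "that", "in", "such", "as",
             "has", "where", "of", "is", "to", "its", "it", "was", "were", "at"}


def tokens(text):
    """Lower-case word set with common function words removed."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(word, context, senses=SENSES):
    """Pick the sense whose gloss overlaps most with the context words."""
    context_tokens = tokens(context)
    best_sense, best_overlap = None, -1
    for label, gloss in senses[word].items():
        overlap = len(context_tokens & tokens(gloss))
        if overlap > best_overlap:  # on a tie, the first listed sense is kept
            best_sense, best_overlap = label, overlap
    return best_sense
```

With these toy glosses, a sentence mentioning leaves and roots selects the vegetation sense, while one mentioning a factory and power generation selects the industrial one. Real systems use far richer signals than raw word overlap.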
Is there anything else you would like to add?
Getting the award from Oxford Dictionaries API was a real turning point for Word Booster. Investors started to pay more attention and I was accepted to a local start-up accelerator program run jointly by a local publisher, Chunjae Education, and the local government of Sejong City in South Korea.
Listen to Andrew share the founding experience of Word Booster in this webinar session.