Continuing from the last exercise: detecting titles that contain Korean letters was easy, but excluding them was not. I asked Allison for help figuring out how to filter out the titles with Korean, so that ideally only English titles are collected.
The problem was that I sometimes ran into content that was neither Korean nor English. After reviewing it, I found that most of it was Japanese, and I excluded those titles as well, despite their small number.
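
For reference, here is a minimal sketch of this kind of filter, assuming the titles are plain Python strings. It is not necessarily the approach Allison suggested; the names `is_english_title`, `HANGUL`, `JAPANESE`, and the sample titles are made up for illustration. It checks Unicode ranges, so the kanji range also catches Chinese characters.

```python
import re

# Hangul Jamo, Hangul Compatibility Jamo, and Hangul Syllables
HANGUL = re.compile(r"[\u1100-\u11ff\u3130-\u318f\uac00-\ud7a3]")
# Hiragana, Katakana, and CJK Unified Ideographs (kanji; also matches Chinese)
JAPANESE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")

def is_english_title(title):
    """Return True if the title has no Korean or Japanese characters."""
    return not (HANGUL.search(title) or JAPANESE.search(title))

# Hypothetical sample data
titles = ["Hello World", "안녕하세요 World", "こんにちは", "Python tips"]
english_only = [t for t in titles if is_english_title(t)]
```

Note that this only rules out Korean and Japanese scripts; a title in French or Spanish would still pass, so "English" here really means "not written in those scripts".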
The next step was to go into more detail. One thing I tried was comparing how the same service is used in the two languages. It would be more useful if I could collect interesting keywords and filter the results based on them.
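
A rough sketch of what that keyword step might look like, assuming English titles and a simple word count. I have not settled on a method; the stopword list, the function names `keyword_counts` and `filter_by_keywords`, and the sample titles are all hypothetical.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list (an assumption, not a standard list)
STOPWORDS = frozenset({"the", "a", "of", "to", "in", "and", "for"})

def tokenize(title):
    """Lowercase the title and split it into alphabetic words."""
    return re.findall(r"[a-z']+", title.lower())

def keyword_counts(titles):
    """Count how often each non-stopword appears across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(w for w in tokenize(title) if w not in STOPWORDS)
    return counts

def filter_by_keywords(titles, keywords):
    """Keep only the titles that contain at least one of the keywords."""
    wanted = set(keywords)
    return [t for t in titles if wanted & set(tokenize(t))]

# Hypothetical sample data
titles = ["How to cook rice", "Cook pasta in ten minutes", "The history of jazz"]
counts = keyword_counts(titles)
cooking = filter_by_keywords(titles, {"cook"})
```

Running the same count on the Korean titles would need a Korean-aware tokenizer, since splitting on spaces and Latin letters does not work there.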