Final assignment for Quant Humanists and Nature of Code.
Continued from the last exercise: detecting titles with Korean letters was easy, but excluding them wasn't. I asked Allison for help figuring out how to filter out the titles containing Korean, so that ideally only English titles are collected.
The problem was that I sometimes visited pages whose content was neither Korean nor English. After reviewing them, I found that most were Japanese, and I excluded them as well, despite their small number.
The next step was going into more detail. One of the things I tried was comparing how I use the same service in the two languages. It would be more useful if I could collect interesting keywords and filter the results based on them.
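The filtering described above could look something like the following. This is only a minimal sketch, not the exact method I used: the `classify` function and the sample titles are made up for illustration, and it assumes that a title with hiragana or katakana is Japanese (kanji-only titles would slip through, since those characters are shared with Chinese). Anything that is neither Korean nor Japanese is simply labeled "other", which in my data is mostly English.

```python
import re

# Precomposed Hangul syllables (U+AC00 to U+D7A3), i.e. the range 가-힣.
HANGUL = re.compile(r"[\uac00-\ud7a3]")
# Hiragana and katakana (U+3040 to U+30FF); an assumption for spotting Japanese.
KANA = re.compile(r"[\u3040-\u30ff]")


def classify(title):
    """Label a page title as 'korean', 'japanese', or 'other'."""
    if HANGUL.search(title):
        return "korean"
    if KANA.search(title):
        return "japanese"
    return "other"


# Hypothetical titles, not taken from my real history.
sample_titles = ["다음 - Daum", "東京の天気 - Yahoo!ニュース", "Nature of Code"]

# Keep only the titles that are neither Korean nor Japanese.
remaining = [t for t in sample_titles if classify(t) == "other"]
```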
For your final projects, you will draw upon your personal data collection and the insights gained from your weekly assignments to design and develop a project that critically engages with one or more (or all) of the stages of the quantified self pipeline - track, reflect, act - covered throughout this course.
What are my "code-switching" moments:
The outcomes will be built based on Reporter Application data and keyboard input data. Reporter Application was chosen because of its flexibility, while keyboard input was chosen due to its nature of daily usage, auto-tracking, and active multilingualism in computing (specifically in reading and writing).
Create awareness that individuals consist of various sides, which make them interesting and complex. Analyzing someone's character requires more than a one-dimensional approach.
Include a quippish timeline of your reflection "thesis statements", and design a way to represent your mood throughout the course.
While thinking about different ideas, I kept collecting my IP information. One thing I noticed is that if a website's content is in a certain language, the "title" of the page is likely to contain that language.
I'm not fluent in regular expressions at any level, but since I had a class about them last week, I thought a range match would be a good idea. Just as [a-z] matches any lowercase English letter, [가-힣] matches every precomposed Korean syllable, that is, every possible combination of letters of the Korean alphabet.
I succeeded in collecting the titles that contained Korean letters, but I kept failing to filter them out. From 2/11 to 3/5, I visited 726 webpages that present Korean content. There are 4776 pages in total, so a quick way to exclude the Korean ones is 4776 - 726 = 4050 pages. However, given the small number of pages that present neither Korean nor English content, I'd like to filter those out properly.
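To show what the range match does, here is a small sketch in Python. The `has_korean` helper and the example titles are hypothetical, just for illustration; the only thing taken from my notes is the [가-힣] range itself.

```python
import re

# [가-힣] covers the precomposed Hangul syllables, U+AC00 through U+D7A3.
HANGUL = re.compile(r"[가-힣]")


def has_korean(title):
    """Return True if the title contains at least one Hangul syllable."""
    return HANGUL.search(title) is not None


# Hypothetical page titles.
titles = ["네이버 뉴스", "The New York Times", "YouTube - 한국어 강의"]

# Collect only the titles that contain Korean letters.
korean_titles = [t for t in titles if has_korean(t)]
```

Because `search` looks anywhere in the string, a mixed title such as the third one still counts as Korean.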
On the other hand, for some reason, the Chrome extension I was using to track IP information stopped working. I emailed the developer to ask whether the service is ending or just having a temporary problem. It seems to be having some issue, but the developer didn't say the extension would close permanently, so hopefully it will be back in service.
Develop a method to hack one (or more) of your tracking apps. Write a short reflection about your hack - how did you do it? What information is gained or lost? What are the implications of hacking your data in the way that you did?
So far I've been using both the Reporter Application and Chrome extensions along with Chrome History. For this week's activity, however, the Chrome extension seemed to be a suitable target. Although the Reporter Application also tracks various data such as distance and temperature, it is a "reporting" system, so I can always refuse to submit my report.
Chrome tracks site visits all the time, but to get the history in JSON format, I use another extension called History export every week. There is another way to download the history in CSV format, but for convenience I use the extension. However, the extension does not include IP location, so I have to type in that information manually.
The best way to obfuscate the Chrome history tracking is simple: use an incognito window. Under incognito, I can avoid the auto-tracking of my website visits, so those visits won't be exported to the JSON file at the end of the week. I've actually been using this method when I revisit some websites to check their IP locations because, as I mentioned above, the locations don't get saved and exported automatically. All extensions are off in incognito windows by default, so it's important to allow access for the IP tracking extension.
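The manual step could be partly scripted. The sketch below is only an assumption about how that might work: I'm guessing the export is a JSON list of records with "title" and "url" fields (the real History export format may differ), and the `manual_locations` table, its country codes, and the `attach_locations` function are all hypothetical.

```python
import json
from urllib.parse import urlparse

# Hypothetical sample shaped like an exported history file (assumed fields).
sample_export = """[
  {"title": "네이버", "url": "https://www.naver.com/"},
  {"title": "ITP", "url": "https://itp.nyu.edu/"},
  {"title": "Example", "url": "https://example.com/page"}
]"""

# Locations typed in by hand, keyed by hostname (placeholder values).
manual_locations = {"www.naver.com": "KR", "itp.nyu.edu": "US"}


def attach_locations(export_text, locations):
    """Add an 'ip_location' field to every record in the exported history."""
    records = json.loads(export_text)
    for record in records:
        host = urlparse(record["url"]).hostname
        # Hosts I have not looked up yet are marked as unknown.
        record["ip_location"] = locations.get(host, "unknown")
    return records


annotated = attach_locations(sample_export, manual_locations)
```

Keying the table by hostname means each site only has to be typed in once, no matter how many of its pages show up in the history.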
Write a short reflection about your hack - how did you do it? What information is gained or lost? What are the implications of hacking your data in the way that you did?
This week's activity was relatively simple, because the Chrome browser makes it easy to adjust tracking options. It was also something I've been doing intentionally since I began tracking websites to collect IP locations. It came from the idea that revisiting a website (to recheck its IP location) shouldn't be included in the Chrome history list, because it's only part of the "tracking" process, not a natural visit. Consequently, my website visits made while rechecking IP locations are lost forever, by my own decision. However, with decisions like this, there is always room for debate. It made me rethink the relationship between arbitrary decisions and data cleansing.
Design or prototype an intervention that would help you to change your behavior based on the data you've gathered or based on the service you designed in your Quant Self Service. Tech or community or social based interventions are fair game.
The following application focuses only on writing through a keyboard, on either a mobile device or a PC. If you are a new user, it will ask for access to data about when you change your language, because this can be sensitive information. Once the user confirms, the application lands on the Welcome (Get Started) page, where the user sets up the language types and the devices to synchronize.
After registration, the application won't ask the access question again unless the user wants to stop the data collection. Instead of the Welcome page, it now lands directly on the profile page, which shows the average usage of different languages, the list of languages, and the list of synchronized devices. This page also serves as the settings page, so any visual change or language and device reset can be done here as well. To see more details about the data, tap the top-left button to land on the Time Map page. The default view of the Time Map page is daily.
Once on the Time Map page, the user can navigate to weekly, monthly, or yearly views of the data. Each page's graphic is color-coded by language, and the colors can be adjusted on the settings page. Through these Time Map sections, the user can discover things like: "I use more Korean during weekends", "English is my main language while I'm not on vacation", and "My usage of English is increasing every year".
A short paragraph or phrase on what you learned this week.
There has been some confusion about the order of the documentation posts, so I mostly spent time reorganizing and updating existing posts. The previous comment by Joey on assignment #6 is now copied under the proper post. It was definitely my mistake to carelessly overlook the assignment details. On the other hand, after looking at some classmates' work, I realized that my assignments #4 and #5 were combined into one, and that I've often been thinking of user research and design as one chunk of practice when they are not.
It's important to conduct user research separately from the design work itself, especially to keep an objective view in both areas. One of the best ways to increase accuracy is to test as many users as possible, which wasn't an option in my assignment #4 because I was the only user on whom the persona was based. At the same time, I'm aware that this service targets a very limited user pool: people who speak more than one language. However, I still decided to proceed with the idea, given that this class is more about "I" and personal data.
Develop a mini service concept around your own personal data.
A short paragraph or phrase on what you learned this week.
The main reason for making a service that detects language usage was that it was something I had been searching for at the beginning of this class. There are many related research papers and articles about the relationship between language and mind development, but it was interesting that there wasn't any automated tool to measure it. My idea for assignment 4 has many flaws and is incomplete. However, merely imagining this tool somehow gave me a feeling of satisfaction. On another note, it's tiring to collect two different sources of data (that don't perfectly suit my purpose of tracking), but at the same time I found many unexpected things about myself. For example, during weekends my usage of Korean increased along with the number of added photos, which implies that I take far more selfies when I'm with my Korean friends than when I'm not.