Subtle Asian Meta

Shameless plug: If you, one of the 5 people who visit this blog every month, work in computational linguistics or related research, please contact me at naitian at umich.edu. I would love to pick your brain and maybe even work with you. Thank you and enjoy the rest of the show.

Subtle Asian Data

I recently published a joke paper I wrote over the course of about three months (with the help of David). While entirely a joke, it also contained entirely real analysis of subtle asian dating user demographics and language.

Here’s the abstract:

Recent subtle asian Facebook groups have gained incredible popularity among young adult Asian diaspora. We focus on one of these groups – subtle asian dating – and conduct a content analysis on its posts to examine what it can tell us about the members of Asian youth subculture, and how their identity and culture affect their view of and approach to dating and relationships.

You can find the entire paper here. I highly recommend it: what it lacks in insight, it makes up for in just how ridiculous it is.

The Critical Reaction

The obvious resulting course of action was to post my paper on subtle asian dating. The post took off in popularity, gaining over 3.4 thousand reactions and close to a thousand comments over the next three days.

The comments were pretty interesting to read at first, and for maybe the first half of the first day, I was keeping up pretty closely as they rolled in.

That’s when I made the following astute observation to my friend:

So I think the breakdown in comments right now is:

50% about how it was typeset in latex

20% why did you do this

10% umich represent!

10% I’m ashamed to go to the same school as this guy

And 10% miscellaneous

To which she responded:

HAHA

Misc tagging

always

Naturally, after the commenting had died down, I had to check myself. How far off was I with my predictions? (Hint: very)

Just how many comments were misc tagging? (Hint: not that many)

Exactly how many people were impressed with my LaTeX skills? (Hint: my skills are not very impressive)

Subtle Asian Meta

So I went ahead and scraped all the comments from the post.

Data Collection

For the curious, I did all the data collection in Chrome DevTools.

Terrible code, but it worked I guess.

Analysis

I tossed the generated JSON object into Jupyter Notebook to see what I could get. The following numbers are just for top level comments (this means replies are excluded):

There were 618 comments. Almost 95% of users tagged someone else in their comment, so “misc tagging” was pretty accurate. However, I was more interested in who commented only to tag someone else. That is, the body of their comment consisted only of a tag.

In that case, the number drops to less than 25%. But wait, there’s more.
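For anyone who wants to try the same split between "tagged someone" and "only tagged someone", here is a minimal Python sketch. The layout of the scraped JSON is not shown in this post, so the record shape (a `text` field and a `tags` list), the sample comments, and the helper names `has_tag`, `is_tag_only`, and `share` are all made up for illustration.

```python
# Hypothetical comment records: the real scraped JSON layout is not shown
# in the post, so the "text" and "tags" fields are assumptions.
comments = [
    {"text": "Alice Chen", "tags": ["Alice Chen"]},
    {"text": "Bob Li look, it is typeset in LaTeX", "tags": ["Bob Li"]},
    {"text": "why did you do this", "tags": []},
    {"text": "Carol Wu Dan Ng", "tags": ["Carol Wu", "Dan Ng"]},
]


def has_tag(comment):
    """True if the comment tags at least one person."""
    return len(comment["tags"]) > 0


def is_tag_only(comment):
    """True if only whitespace is left after removing every tagged name."""
    if not comment["tags"]:
        return False
    remainder = comment["text"]
    for name in comment["tags"]:
        remainder = remainder.replace(name, "")
    return remainder.strip() == ""


def share(predicate, items):
    """Fraction of items that satisfy predicate (0.0 for an empty list)."""
    if not items:
        return 0.0
    return sum(1 for item in items if predicate(item)) / len(items)


tagged_share = share(has_tag, comments)
tag_only_share = share(is_tag_only, comments)
print(f"tagged: {tagged_share:.1%}, tag-only: {tag_only_share:.1%}")
```

On the four toy comments, three contain a tag and two are nothing but tags, which is the same kind of gap as the 95% versus under 25% above.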

How many people were impressed with my LaTeX? Surely less than 50%. And Shirley would be correct. In fact, only 6.5% (N=40) mentioned LaTeX in their comments.

And how many people thought it was cool that I went to their school? Well, I looked through the comments and filtered by any of the following keywords:

['school', 'michigan', 'hoo', 'blue', 'umich', 'mich', 'uva', 'virginia']

which encompasses both U-M and UVa. A couple of false positives? Probably. False negatives? Also probably. But this gives us a general sense. Even fewer comments matched the criteria for this filter: 5.8% (N=36).

Finally, I took a cursory glance at who was incredulous, by filtering by the following keywords:

['wtf', 'why', 'time', 'tf', 'believe']

And it turns out that only matches 5% of comments.
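The keyword filters above can be sketched roughly like this in Python. The two keyword lists are the ones from this post. Everything else is an assumption: the post does not say whether matching was done on substrings or whole words, so the hypothetical `matches_any` helper supports both, and the sample comments are invented. The last two lines just redo the percentage arithmetic against the 618 top level comments.

```python
import re

SCHOOL_KEYWORDS = ['school', 'michigan', 'hoo', 'blue', 'umich', 'mich', 'uva', 'virginia']
INCREDULOUS_KEYWORDS = ['wtf', 'why', 'time', 'tf', 'believe']


def matches_any(text, keywords, whole_word=False):
    """True if any keyword occurs in text (case-insensitive).

    whole_word=False is a plain substring test, which is where false
    positives such as 'time' inside 'sometimes' come from.
    """
    lowered = text.lower()
    if whole_word:
        words = set(re.findall(r"[a-z0-9']+", lowered))
        return any(keyword in words for keyword in keywords)
    return any(keyword in lowered for keyword in keywords)


def count_matches(texts, keywords, whole_word=False):
    """Number of texts that match at least one keyword."""
    return sum(1 for text in texts if matches_any(text, keywords, whole_word))


# Invented sample comments, purely for illustration.
sample = [
    "go blue!",
    "wahoowa",
    "it's typeset in LaTeX",
    "sometimes I wonder",
]

school_hits = count_matches(sample, SCHOOL_KEYWORDS)
incredulous_substring = count_matches(sample, INCREDULOUS_KEYWORDS)
incredulous_whole_word = count_matches(sample, INCREDULOUS_KEYWORDS, whole_word=True)

# Percentages reported in the post, out of 618 top level comments.
latex_share = f"{40 / 618:.1%}"
school_share = f"{36 / 618:.1%}"
```

With substring matching, "sometimes I wonder" counts as incredulous because it contains "time"; with whole-word matching it does not. That is the kind of false positive mentioned above, and either way the result is a rough estimate, not a careful classification.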

In conclusion, I was totally wrong, but my version of reality was way funnier, so who really won this fight?

So there you go. subtle asian metadata.