Hi, I’m Naitian.

That’s pronounced naɪtjen like 🌙 💴.

I’m a PhD student at the UC Berkeley School of Information, where I’m advised by David Bamman and supported by the NSF Graduate Research Fellowship.

My research centers on developing computational methods to understand meaning embedded in style. This draws on a variationist sociolinguistic / cultural anthropological perspective on culture, and spans the fields of NLP, computational social science, and cultural analytics. Methodologically, I am interested in multimodal approaches to language, vision and speech. I also care a lot about the news, data journalism, data visualization and crossword puzzles.

If you are interested in doing research or grad school, I am always happy to chat about my experiences. If you are a Berkeley undergrad interested in doing cultural analytics research, you should apply for a UROP opportunity with my advisor, David Bamman.

A collection of photos of my face.

“One picture is hard to identify a person with” ~ David Fouhey

Publications

For a full list, check my Google Scholar.

Culture is Not Trivia: Sociocultural Theory for Cultural NLP

We draw on sociocultural linguistic theory to provide a theoretical framework for understanding current gaps in cultural NLP, and offer paths forward for future work. In this paper, we 1) demonstrate in a case study how this framework can clarify methodological constraints and affordances, 2) offer theoretically motivated paths forward to achieving cultural competence, and 3) argue that localization is a more useful framing for the goals of much current work in cultural NLP.
[Website] [PDF]

Once More, With Feeling: Measuring Emotion of Acting Performances in Contemporary American Film

Proceedings of Computational Humanities Research 2024
In this paper, we conduct a computational exploration of acting performance. Applying speech emotion recognition models and a variationist sociolinguistic analytical framework to a corpus of popular, contemporary American film, we find narrative structure, diachronic shifts, and genre- and dialogue-based constraints located in spoken performances.
[Website] [PDF]

Measuring diversity in Hollywood through the large-scale computational analysis of film

Proceedings of the National Academy of Sciences
We design a computational pipeline for measuring the representation of gender and race/ethnicity in film. Doing so allows us to study representation and diversity in Hollywood over this period, confirming earlier studies that see an increase in diversity over the past decade, while allowing us to use computational methods to uncover a range of ad hoc analytical findings.
[PDF]

Social Meme-ing: Measuring Linguistic Variation in Memes

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Much work in the space of NLP has used computational methods to explore sociolinguistic variation in text. In this paper, we argue that memes, as multimodal forms of language composed of visual templates and text, also exhibit meaningful social variation.
[Website] [PDF]

Condolences and empathy in online communities

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
In times of distress, we frequently go online to seek social support and condolence. But effectively providing that support to others is easier said than done. This study aims to computationally identify mechanisms and strategies for delivering effective and impactful condolence on social media.
[Website] [PDF]

Code

I am writing or have written code for: The Michigan Daily as the managing online editor, NBC News as a Data Graphics intern, the Michigan Data Science Team as a project leader, Capital One as a software engineering intern (x2 summers) and, of course, myself as naitian.

Here is a sampler of my work.

The Spalling Bie
The Spalling Bie is just like the New York Times Spelling Bee, except you only get points for fake words that sound plausible. [Link]
Where are the vaccine deserts?
I did research, data collection, analysis and graphics for an NBC News story tracking where Americans could expect to find pharmacies that would carry the Covid-19 vaccine. [Link]
Cover Story
As part of a challenge, I used a database of book titles from Amazon and a part-of-speech tagger to construct grammatically correct sentences using only book titles. Earned an honorable mention from Randall Munroe, creator of XKCD. [Link]
If you start playing…
A different kind of New Year’s countdown, IYSP takes inspiration from the internet memes about “starting the year off right.” [Link]