Big Modernism Goes Macro

Work on Big Modernism goes macro! In the past year, with the Modernist Versions Project and Compute Canada, we’ve been expanding the plain-text repository of modernist prose from thirty-two to eighty-six texts. We’ve added texts by key authors (Katherine Mansfield, Dorothy Richardson, and John Steinbeck, among others), as well as additional texts by the authors we had already included. While still unable to access many modernist texts online (I’ve written about this here), we are now able to experiment on a bigger canvas of literary modernism.

Using scripts that incorporate topic modelling software and Bayesian analysis algorithms, Compute Canada researcher Belaid Moa and I have constructed a multidimensional space in which to better understand the intricate relationships among the novels in our corpus. The scripts position texts according to their topical relevance, as I’ve explained in a previous post. With a larger data set, however, different patterns emerge (see the raw comparison data here).

In the last round of analysis, the top categories, in order of weight, were:

time, felt, day, looked, knew, work, face, hand, night
eyes, face, life, time, white, dark, round, hand, head
men, people, began, room, house, talk, suddenly, end, years

While time, space, and human experience continue to play an important role, we might consider how these central subjects are reconfigured when grouped with new terms. This round, the top three topics were, in order of weight:

looked room hand door eyes head round people stood
face eyes time hand voice turned door heard mind
time felt fact knew matter give life girl point
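The post does not say how the weight of a topic is computed, so the following is only a minimal sketch of one way such a ranking could be produced. It assumes that a topic's weight is its mean proportion across all texts; the proportions and the function name `rank_topics` are hypothetical and do not come from the project's data or scripts.

```python
# Hypothetical document-topic proportions (illustrative only):
# each row is one text, each column is one topic.
doc_topic_rows = [
    [0.2, 0.3, 0.5],
    [0.3, 0.2, 0.5],
    [0.4, 0.1, 0.5],
]


def rank_topics(rows):
    """Return (topic_index, mean_weight) pairs, heaviest topic first.

    Assumption: 'weight' means the average share of a topic across texts.
    """
    n_docs = len(rows)
    n_topics = len(rows[0])
    means = [sum(row[t] for row in rows) / n_docs for t in range(n_topics)]
    return sorted(enumerate(means), key=lambda pair: pair[1], reverse=True)


ranking = rank_topics(doc_topic_rows)
for topic, weight in ranking:
    print(f"topic {topic}: {weight:.3f}")
```

Under a different definition of weight, the order of the topics could change.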

Space, indeed, takes precedence over time in the most heavily weighted topic in the corpus. The first topic (numbered 21 by Mallet) privileges inhabited space, the ways a room can be experienced by consciousness, and the positioning of bodies within that room. Notice too that the spatial category in the first round of analysis (the third set of terms) led with “men,” while the new category includes only the more gender-neutral term “people.” Notably, but perhaps not surprisingly, it is women writers who dominate this topic, with Virginia Woolf, Katherine Mansfield, Dorothy Richardson, and May Sinclair exhibiting a high correlation with this category. Critics have long noted that women modernists were especially interested in domestic spaces, so this correlation makes intuitive sense. It also asks us to consider the pastness of most of the verbs in this topic. In the verbs, then, time reenters the topic. Are the spaces in these novels experienced as having always already passed?

The second category (15) brings time into the discussion. The world of this topic is sensed and inhabited, experienced and navigated by the human individual as a physical entity, but the self that experiences this world seems to also channel time (as discussed in the last post, the words that denote the “face” and “hand” of a person could also refer to parts of a clock). This topic plays the most significant role in the works of Joseph Conrad and Thomas Hardy, as well as in Tarr, The Awakening, The Voyage Out, Night and Day, and The Stranger. We might wonder then whether Woolf’s early interest in the spatiotemporal experience of the individual in The Voyage Out and Night and Day gives way to a decided interest in a more collective experience of space (the topic discussed above) in her later novels, or whether it is just that her interest in the subject of time becomes less obvious in her later works.

The third topic (6) seems more to position time in relation to questions of epistemology. Whereas the world of topic 15 is moved through and occupied, the world of topic 6 seems to be hovering between subjective experience and attempts to translate that experience into accepted forms of socially defined knowledge, that is, into scientific discourse. The texts that most incorporate this topic are by Henry James and Marcel Proust. Tarr, The Secret Agent, The Good Soldier, Chance, and Night and Day are also identified as key texts.

We can see how these topics play out by novel.

Top 3 Topics in Modernist Corpus

What I hadn’t considered in great detail before are the topics that differentiate texts. For example, the topic that appears significantly only in the texts of George Orwell reads:

While there is no doubt that many of the other authors in our corpus were interested in the subjects of war, nationality and class, Orwell appears to be the least subtle in his treatment of the topic. That is, he’s the only writer who commonly uses an expected lexicon to address these issues.
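One way to look for such differentiating topics can be sketched as follows. This is an illustration under stated assumptions, not the method used in the project: the titles, the topic numbers, the proportions, the 0.05 cut-off for "significant," and the function name `author_specific_topics` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of (author, title, {topic_id: proportion}); illustrative only.
corpus = [
    ("Orwell", "text_1", {0: 0.20, 1: 0.10}),
    ("Orwell", "text_2", {0: 0.15, 1: 0.12}),
    ("Woolf", "text_3", {0: 0.01, 1: 0.30}),
    ("Forster", "text_4", {0: 0.02, 1: 0.18}),
]


def author_specific_topics(rows, threshold=0.05):
    """Map a topic to an author when only that author's texts reach the threshold."""
    authors_by_topic = defaultdict(set)
    for author, _title, weights in rows:
        for topic, weight in weights.items():
            if weight >= threshold:
                authors_by_topic[topic].add(author)
    return {
        topic: next(iter(authors))
        for topic, authors in authors_by_topic.items()
        if len(authors) == 1
    }


specific = author_specific_topics(corpus)
print(specific)
```

The result depends heavily on the chosen threshold, so any cut-off of this kind would need to be justified against the actual distribution of topic proportions.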

Only appearing in American novels is topic 24, which contains the terms:

The words appear more frequently and in greater proximity in Babbitt, The Great Gatsby, The Beautiful and Damned, and Tender Is the Night than in any other modernist texts. The topic suggests the prevalence of business in American modernism and how the world of business is imagined by these writers. Temporally, it seems to take place at night (over dinner), and spatially it is networked across the office, the city, the hotel, and the club.

Another topic of particular interest is topic 19, which includes:

This topic, which suggests encounters with foreign cultures and knowledge systems, links The War of the Worlds with Burmese Days and A Passage to India, suggesting that the colonial themes present in the latter two texts might be somehow correlated with H.G. Wells’s sci-fi invasion fantasy.

What if we focus in on one text within the corpus? Can we perhaps use our analysis to envision the “thin thread” that connects Mrs. Dalloway to the other texts in her cultural milieu? Using a heat map, we can look at a quantifiable measure of the similarities between Woolf’s text and the other texts in our corpus:

Comparing Mrs. Dalloway

The heat map (see the raw data here) accounts for the distances between all the topics for all the texts. For example, how different is Mrs. Dalloway from Nightwood when it comes to topic 1? Topic 2? Topic 3? The mapping algorithm measures all of these differences to create a multidimensional space of multiple topics and thus to infer how similar each text is to every other text. For Mrs. Dalloway we see that the text is, unsurprisingly, most like itself. It is very similar to the other texts by Virginia Woolf (see the right side of the chart), but also to the works of E. M. Forster, A Portrait of the Artist as a Young Man, Nightwood, Pointed Roofs, and Sinister Street. On the other hand, it appears to be most different from the texts of Joseph Conrad and Henry James.
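The idea of measuring topic-by-topic differences and combining them into a single distance can be sketched as follows. The post does not name the distance measure used by the mapping algorithm, so this example swaps in plain Euclidean distance as an assumption. The topic proportions are invented for illustration, and `topic_distance` and `distance_row` are hypothetical helper names.

```python
import math

# Hypothetical topic proportions (illustrative only, not the project's data).
doc_topics = {
    "Mrs. Dalloway": [0.50, 0.30, 0.20],
    "Nightwood": [0.40, 0.30, 0.30],
    "Chance": [0.10, 0.20, 0.70],
}


def topic_distance(a, b):
    """Euclidean distance between two equal-length topic-weight vectors."""
    if len(a) != len(b):
        raise ValueError("topic vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_row(title, table):
    """Distances from one text to every text in the table, nearest first."""
    base = table[title]
    row = [(other, topic_distance(base, vec)) for other, vec in table.items()]
    return sorted(row, key=lambda pair: pair[1])


row = distance_row("Mrs. Dalloway", doc_topics)
for other, dist in row:
    print(f"{other}: {dist:.3f}")
```

One row of this kind per text, stacked together, would give the grid of values that a heat map colours. A text always sits at distance zero from itself, which is why it appears as most like itself.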

We can turn back to the topics to interpret how these distances were determined. Compare Mrs. Dalloway with Howards End and A Portrait of the Artist as a Young Man:

Comparison of Mrs. Dalloway, Howards End, and A Portrait of the Artist as a Young Man

Topic 21 (looked room hand door eyes head round people stood), as discussed above, is much more prevalent in Woolf’s text than in either of the other two. Many writers take up the textual relationship between Woolf’s work and Forster’s. Michael Hoffman and Ann Ter Haar address the tensions between the two authors over visual imagery and representations of space. They cite Forster’s anxious response to the effects Woolf’s writing could have on established spaces: “She wants to destroy the gallery…, and in its place build what? Something more rhythmical. Jacob’s Room suggests a spiral whirling down to a point, Mrs. Dalloway a cathedral.” Forster’s comments bring to mind the anxieties in Mrs. Dalloway about the way spaces are occupied:

The doors would be taken off their hinges; Rumpelmayer’s men were coming.

“So [Elizabeth] went to [Richard] and they stood together, now that the party was almost over, looking at the people going, and the rooms getting emptier and emptier, with things scattered on the floor.”

Unlike the other two texts, Mrs. Dalloway also extensively incorporates topic 8 (life sea tree trees world leaves beneath women read), which is only significantly present in Woolf’s texts, her signature topic if you will.

Howards End, on the other hand, draws more extensively on a topic (7) that, according to Mallet, is particularly prevalent in Forster’s works:

life people men sort world bad english years suppose

While both Mrs. Dalloway and Portrait also incorporate this topic, Forster is particularly interested in this grouping of words that brings to mind patriarchy, nation, and how these institutions unite with ideas of order, morality, and precarity.

Readers might also be interested in Portrait’s dominant category (26):

god night father soul life heart sin head world

This set of words links Portrait with Nightwood and dissociates it from Mrs. Dalloway, which does appear to sideline religious questions in the character of Miss Kilman, who perhaps experiences the world as “life” and “sin,” but who has no way of bringing this experience into the socially structured lexicon of the novel, except perhaps as a spectre haunting Clarissa. Other analysis we’ve done on Mrs. Dalloway has also marginalized Miss Kilman. As Shawna Ross has noted:

Mrs. Kilman’s repeated absences from our analyses prompts us to ask in what contexts our digital methods are developed and how they might function to further sideline already marginalized issues in modernist studies.

Even with these limitations in mind, the results are intriguing. These observations give a sense of how topic modelling and Bayesian analysis might help us compare a novel such as Mrs. Dalloway to other texts in our modernist corpus. Future work will continue the close analysis of the topics that compose modernism’s multidimensional spaces and that affect how texts are clustered in that space.

Images provided by Belaid Moa and Jana Millar Usiskin.

Jana Millar Usiskin

Graduate student, studying modernisms and the digital humanities at UVic.

