Thursday, March 16, 2017

Jackson on Norsworthy on “The Night Before Christmas”

Scott Norsworthy lists (February 9, 2017) “Recommended fixes” for my “flawed methodology” in Who Wrote “The Night Before Christmas”? Analyzing the Clement Clarke Moore vs. Henry Livingston Question. He believes I ought to have done some things differently.

Of course I agree with him that “Biography of the Heart of Clement C. Moore” should be added to the Moore poems investigated. His transcription of the manuscript now makes the whole poem readily available, whereas I had relied on the cut version of the poem, amounting to 72 per cent of the whole, that was printed by Samuel White Patterson in his The Poet of Christmas Eve: A Life of Clement Clarke Moore 1779–1863.    


In devising tests that distinguished between Moore’s verse and Livingston’s, I of course ignored all “Henry Plus” pieces and used the Livingston corpus that had been established on Mary Van Deusen’s “Henry Livingston” website several years before I began my investigations. The Moore and Livingston poems were listed in Who Wrote, 164–72. Norsworthy thinks I ought to exclude from the Livingston canon the Carrier Addresses, 1803 and 1819. He suggests (March 8) that the 1803 address is by Isaac Mitchell. He may be right. But in the test described in the last paragraph of the “Accuracy of transcription” section below, the 1803 and 1819 addresses both belong with Livingston rather than with a miscellaneous group of his contemporaries. It would be worth seeing how a Mitchell poem fares. Livingston’s descendants passed both addresses down through the family as his, along with some of his letters and scraps of other writing, and there are no obvious ulterior motives for their having done so. Gertrude Fonda Thomas’s ascription of CA 1803 would have come from Livingston’s daughter Jane, her mother. But it would be possible to try excluding these two addresses from the database and to find out whether this makes any appreciable difference in Moore versus Livingston tests of “The Night Before Christmas.”

Norsworthy also wants Moore’s translations from Italian and Greek included, repeating his insistence that verse translations are poetry and “take creative work to achieve.” This I have never denied. Of course they are, and of course they do. Whether translations from a foreign language are suitable for testing by stylometric methods (counts of rates of common words, for example) is less clear. Most attribution scholars would say that they are not. My omission of Moore’s translations was not based on “aesthetic judgements,” as Norsworthy charges, but on familiarity with the field of attribution studies. But, again, it shouldn’t be too hard to indulge Norsworthy over this matter too.

He wants me to compile statistics that from the start compare all Livingston’s poems with all Moore’s. Holding in reserve some items on which to test the efficacy of the tests is, however, common practice among attribution scholars. Norsworthy appears to imply (“too-conveniently”) some kind of sleight of hand in my following it. But in fact it is considered “best practice.” I used all the poems (except the Petrarch translation) in Moore’s manuscript notebook, including the long “Charles Elphinstone,” for this kind of checking, and found that none was, on the full combination of tests, so consistently Livingston-like as “The Night Before Christmas.” And we can now add “Biography of the Heart of Clement C. Moore” to the poems reserved for checking the tests. In Figure 4 (Who Wrote, 133), which shows percentages of Livingston-favored high-frequency and medium-high-frequency words combined, it would score 42.105 and so be placed in the percentage range occupied by the largest number of Moore poems and remote from where “The Night Before Christmas” falls, near the mean for Livingston’s poems. I’ve not yet been able to check its phoneme pairs, but it is a typical Moore poem in its rates of use of “the” + “a” (6.735 as a percentage of total words) and “that” (17.347 per 1,000 words); it includes the Moore markers “in vain,” “some,” and “many a,” and one might add that it has three instances of “Oh!”, which is used fairly frequently by Moore but never by Livingston.
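For readers unfamiliar with how rates of this kind are calculated, the two measures quoted above (a word group as a percentage of total words, and a single word per 1,000 words) can be sketched in Python. This is only an illustration under stated assumptions: the tokenizer, the folding of capitals into lower case, the function names, and the sample sentence are all invented here and are not the procedure or data used in Who Wrote.

```python
import re

def tokenize(text):
    # Assumption: lower-case everything and keep apostrophised forms
    # such as "turn'd" as single words. The book's own counting rules may differ.
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def rate_percent(tokens, words):
    # Occurrences of any word in `words`, as a percentage of all tokens.
    hits = sum(1 for t in tokens if t in words)
    return 100.0 * hits / len(tokens)

def rate_per_thousand(tokens, word):
    # Occurrences of one word per 1,000 tokens.
    return 1000.0 * tokens.count(word) / len(tokens)

# Made-up sample sentence, not a line from either poet.
sample = "The cat saw that a dog sat on the mat"
tokens = tokenize(sample)
the_a = rate_percent(tokens, {"the", "a"})
that_rate = rate_per_thousand(tokens, "that")
```

In a ten-word sample with three instances of "the" or "a" and one of "that", the first measure is 30 per cent and the second is 100 per 1,000 words. Real poems would of course yield far lower figures, like those quoted for the "Biography of the Heart."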

Some of Norsworthy’s sarcasm seems to be based on the assumption that my research took no account of phoneme pairs and high-and-medium-frequency words in “Charles Elphinstone,” but the results for this whole poem are reported in Chapter 18 (Who Wrote?, page 92 for words; 93 for phoneme pairs): they are in accord with the figures derived from Moore’s published poems plus the manuscript pieces “From St. Nicholas,” “To Fanny,” and “To Clem.”

I didn’t hold in reserve any of Livingston’s poems, because what mattered in the checking of tests was to find out whether they yielded “false positives” for Livingston—whether, in combination, they falsely ascribed Moore poems to Livingston.

High-frequency words and “lexical” words

No doubt Norsworthy and I could debate at length whether the rates of use of high-frequency words are less subject than choices of lexical or content words to a writer’s conscious and deliberate decision-making. But that issue is of little importance. What matters is that many empirical studies have shown that writers do in fact differ in their rates of use of high-frequency words (however conscious or unconscious their choices among them)—and indeed, I show that high-frequency words and phoneme pairs differentiate most poems by Moore from most poems by Livingston. Because they are so frequent in any piece of writing, they afford enough data to be, in combination, useful discriminators even for short texts such as poems. Attribution scholars have not found the less frequently occurring content words so effective for determining the authors of short poems. It may be possible to devise a reliable means of using words like “brains” and “visions,” but the difference between the sizes of Moore’s and Livingston’s bodies of verse would have to be taken into account—and Norsworthy wants Livingston’s to be reduced by a further 1,400 words.

Accuracy of transcription

Norsworthy urges me to check every word by Moore and Livingston in Mary Van Deusen’s transcriptions of their verse and to correct errors. Getting the text perfectly accurate is indeed desirable. I had amended some mistakes, but I agree that further effort is needed to amend them all. Doing so is most unlikely to make an appreciable difference to the counts of common words. Norsworthy notes the mistranscription of Moore’s “clad” as “glad” and of Livingston’s “sempstress” as “seamstress.” But since neither “clad,” “glad,” “sempstress,” nor “seamstress” occurs in “The Night Before Christmas,” and neither spelling of “sempstress” recurs in either poet’s verse, the authorship tests I employed are unaffected.

I am glad that Norsworthy recognizes that “Mary Van Deusen has made unquestionably valuable and enduring contributions to scholarship on Henry Livingston, Jr. and Clement C. Moore.”

I agree that it would be a good idea to combine counts for capitalized and uncapitalized words. I have done this for a follow-up study that finds high-frequency words and phoneme pairs that combine to distinguish Livingston’s verse from a similarly-sized corpus of verse by contributors to the newspapers and journals in which he published, and that places “The Night Before Christmas” with the Livingston poems. It remains true, however, that when capitalized and uncapitalized words are differentiated, the “Livingston-favored words” results show “The Night Before Christmas,” along with the majority of Livingston’s poems, falling beyond Moore’s range (Who Wrote, 133).


Norsworthy notes the many similes in “The Night Before Christmas” and observes that Moore is much more partial to similes than Livingston. Joe Nickell had made the same point in 2003, and I answered it in Who Wrote, 89. As Norsworthy says, most similes are signalled by “like,” a word that Moore uses at a much greater rate than Livingston. But “like” is duly counted among “Moore-favored words” in the test of “medium-high frequency words” described in Chapter 17. Despite NBC’s total of eight uses of this word—“six of them crowded into a mere nine lines, in a manner unique within either poet’s work” (Who Wrote, 89)—the overall percentage of Livingston-favored test words associates NBC with Livingston. To Norsworthy the “profusion of similes” in “The Writing of Hezekiah” “seems unusual for Livingston and possibly indicates a different writer.” But the poem was published—in the Country Journal and Poughkeepsie Advertiser, to which Livingston contributed other pieces—over his pseudonym “R,” and the four similes within eight lines derive from Isaiah 38: 9–20, on which “Hezekiah” is loosely based. “Hezekiah” nevertheless scores 64.286 for its percentage of the combined Livingston-favored words graphed in Figure 4 (Who Wrote, 133), which places it within the five per cent range that contains the largest number of Livingston poems. (It has too few test phoneme pairs to qualify for inclusion in Figure 3.) Since another of Livingston’s poems takes off from Isaiah 65: 25, “The Writing of Hezekiah” cannot reasonably be banished from the Livingston canon.

Norsworthy is right that more extended similes are to be found in Moore’s verse than in Livingston’s, but the comparison of the upward flight of St Nick’s reindeer to leaves before a hurricane does not seem beyond Livingston. While “coursers” appears once in Moore’s verse but not in Livingston’s, Livingston, in his smaller corpus, twice mentions hurricanes, and Moore never does. The extended similes by Moore that Norsworthy cites come from the long and ambitious “A Trip to Saratoga” and “Charles Elphinstone.” It is hardly surprising that elaborate similes do not appear in Livingston’s puzzle-poems. The livelier Livingston similes that Norsworthy cites are from longer pieces that are neither rebuses nor acrostics, and he might have added others, such as: “My dear native village I scarcely can see / I’ll hie to my home like the tempest-tost bee” (where the bee is windblown like the leaves in NBC); or the portrayal of the lapdog Belle, “Like a sweet pretty lady she bridled her chin / And trip’d o’er the floor like another Miss Prim”; or the description of little “Master Timmy brisk and airy / Blythe as Oberon the fairy.”

Variant versions

Norsworthy wants me to “deal with the problem of different versions” though he “can’t think of any easy fix.” I don’t believe any “fixing” is needed. I based my corpus of Moore’s verse on his Poems (1844), which gave the texts he approved for final publication, and on Moore’s autographs of poems that had not been printed. That still seems to me a reasonable decision. Norsworthy has tracked down and put on his website a few earlier published versions of poems that Moore included in his Poems (1844). He notes that in 1824 Moore’s “Lines Written after a Snow-Storm” were printed in the Troy Sentinel, where “ivy bowers” “is probably a copyist’s or printer’s error for ‘icy bowers’,” the Poems reading, and concludes that if we reject “ivy” from Moore’s corpus we should also reject from NBC the Sentinel’s “Dunder and Blixem,” “later revised by Moore” to “Donder and Blitzen.” But “ivy” is an obvious single-letter misprint, and makes no sense in the context, whereas “Dunder and Blixem” makes excellent sense and is most unlikely to be due to complex scribal or compositorial corruption, for the reasons I spelt out in my “Response to Scott Norsworthy.” Presumably this is why Norsworthy now writes of Moore’s having “revised.” If this implies that the Sentinel’s version of the NBC couplet was Moore’s own original version, Norsworthy needs to explain why that version of the couplet has the Livingston features (Dutch oath, nasal half-rhyming final syllables, idiosyncratically placed exclamation marks) that I noted in my “Response.”

If we were to take Poems (1844), instead of the original Sentinel printing, of NBC as our text for testing, the poem would gain an instance of Moore-favored “that” and lose an instance of Livingston-favored “was,” but there would be no change to Livingston-favored or Moore-favored phoneme pairs, and neither of the graphs on Who Wrote, 132–3—showing NBC beyond Moore’s range for both Livingston-favored phoneme pairs and Livingston-favored common words—would require alteration.


I agree with Norsworthy that for trigrams shared with “The Night Before Christmas” I ought to have carefully checked all Moore’s manuscript poems that I had held back for “checking the tests.” The incidence of shared trigrams was one test that I failed to check in the previously withheld material. Had I done this checking I would have discovered that shared trigrams are ineffectual for distinguishing between Moore and Livingston as candidates for the authorship of “The Night Before Christmas.”

Norsworthy (March 6) has found trigrams that the poem shares with Moore’s manuscript poems, and two in Poems (1844) that I missed because the three-word sequences are interrupted by commas: one with “The Pig and the Rooster”  that reads “Pig turn’d, with a grunt, to his mire anew,” where “with a grunt” intervenes between the beginning and end of the phrase “turn’d . . . to his mire,” so that the syntax is rather different from “turn’d with a jirk” in “The Night Before Christmas;” and another with “The Water Drinker,” where “I, in my” matches “I in my” in NBC. He also lists “and then, in” in “Charles Elphinstone.” Let’s, for the sake of argument, accept these punctuated trigrams as matching the unpunctuated counterparts in NBC.
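The effect of accepting punctuated trigrams as matches can be illustrated with a short Python sketch. It assumes a very simple tokenizer that discards commas and other punctuation before three-word sequences are formed; the function names are invented for the illustration, and this is not the search procedure that either Norsworthy or I actually used.

```python
import re

def tokenize(text):
    # Assumption: lower-case the text and drop all punctuation except
    # apostrophes inside words, so "turn’d, with a" and "turn’d with a" agree.
    return re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())

def trigrams(text):
    # Every run of three consecutive words, as a set of tuples.
    words = tokenize(text)
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

# The two phrases quoted in the paragraph above.
moore_line = "Pig turn’d, with a grunt, to his mire anew"
nbc_phrase = "turn’d with a jirk"

shared = trigrams(moore_line) & trigrams(nbc_phrase)
```

Under this assumption the only trigram the two phrases share is "turn’d with a". A punctuation-sensitive search would miss it, which is why it escaped notice the first time. The sketch says nothing about the syntactic difference between the two phrases, which a purely mechanical match ignores.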

To Livingston’s matches I can now add “his eye and,” not used by Moore.

So far as I can see, the situation is now that Livingston has three NBC trigrams that are not used by Moore (“new fallen snow” / “new fall’n snows,” “meet with an” / “meet with a,” and “his eye and”), while Moore has nine that are not used by Livingston: “I, in my,” “not a word,” “as the snow,” “and then, in,” “turn’d, with a,” “in a moment,” “of his eye,” “the breast of,” “top of the.” Moore is here credited with “top of the,” since “the top of the,” found in NBC and once in Moore’s “Irish Valentine,” contains not only “the top of” but also “top of the.” Norsworthy judges that the numbers “strongly support the claim for Clement C. Moore.” But he fails to take account of the fact that a Moore corpus including “Charles Elphinstone,” “Biography of the Heart of Clement C. Moore,” and the shorter manuscript pieces is three times the size (in terms of total numbers of words) of the Livingston corpus. “Charles Elphinstone” alone contains more words than the full Livingston corpus. So the numbers of trigrams that NBC shares exclusively with Moore (9) or with Livingston (3) are exactly proportional to the sizes of their poetic corpora (3 to 1).

Of the trigrams that are used by one of the poets but not the other, only  Livingston’s “new fall’n snows” and “meet with a” and Moore’s “turn’d, with a” are rare enough to be found thirty or fewer times in Literature Online poetry of 1750–1850. All the others occur over fifty times, most of them well over a hundred times. So Moore’s are not, as a group, less common than Livingston’s.

If we focus on trigrams shared with NBC but not necessarily exclusive to either poet, and count the numbers of instances, we find that Moore uses “the breast of” twice, “not a word” twice, “to the skies” five times, “the top of” twice, “out of sight” once, and the other seven of his exclusive ones once each, giving a total of nineteen occurrences, whereas Livingston uses “to the skies” four times, and “the top of” and “out of sight” once each, along with single instances of his three exclusive ones, giving a total of nine occurrences. So Livingston’s rate of use, in proportion to the size of his corpus, considerably exceeds Moore’s.

It might be pointed out that “the top of” occurs twice in NBC, which could be said to create four links to Moore (2 x 2) and two (2 x 1) to Livingston, and that since each instance in NBC is actually “the top of the,” which incorporates the further repeated trigram “top of the,” Moore’s one instance of “top of the” might also be doubled. Allowing these adjustments would bring the totals to ten for Livingston and twenty-two for Moore, with Livingston’s rate of use, in proportion to the size of his corpus, still exceeding Moore’s.
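The arithmetic behind these comparisons can be set out in a few lines of Python. The sketch assumes only the round 3-to-1 ratio of corpus sizes stated above, not exact word counts, and the variable and function names are invented for the illustration.

```python
from fractions import Fraction

# Assumption: relative corpus sizes in the round 3:1 ratio given in the text.
RELATIVE_SIZE = {"Moore": 3, "Livingston": 1}

def per_unit(counts):
    # Divide each poet's count by his relative corpus size, giving
    # counts per Livingston-sized body of verse.
    return {poet: Fraction(n, RELATIVE_SIZE[poet]) for poet, n in counts.items()}

# Figures taken from the preceding paragraphs.
exclusive = per_unit({"Moore": 9, "Livingston": 3})      # exclusive trigram types
occurrences = per_unit({"Moore": 19, "Livingston": 9})   # all shared instances
adjusted = per_unit({"Moore": 22, "Livingston": 10})     # with "the top of the" doubled
```

On the exclusive trigrams the two poets come out level at 3 apiece. On total occurrences Moore has 19/3 (about 6.3) against Livingston's 9, and after the adjustments 22/3 (about 7.3) against 10. The ratio is approximate, so the figures indicate direction rather than precise rates.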

I noted in my book, however, that one of Moore’s examples of “to the skies” affords a more extended correspondence with NBC, its “mount to the sky” finding a parallel in Moore’s “mounts . . . to the skies” in “The Organist.”  On the other hand, as I also noted, Livingston uses “in a moment” no fewer than five times in his far from voluminous prose. And we might add that the poems categorized as “Henry Plus,” or possibly Livingston’s, and which amount to two-thirds the size of “Charles Elphinstone,” afford three further instances of “to the skies” and one of the singular “to the sky” of NBC, and also an instance of “the top of the.”

The upshot is, I think, that trigrams end up telling us “nothing, neither way” about the authorship of “The Night Before Christmas.”

Only a small proportion of Livingston’s poems survive, mostly from the period 1776–90. You can see how the size of the corpus affects trigram results by comparing trigrams in Livingston’s verse with trigrams in “Charles Elphinstone,” which is larger by about eight hundred words than all Livingston’s verse combined. Livingston has five trigrams not in CE (the three that Moore doesn’t use anywhere plus “the top of” and “out of sight”), while CE has five not used in Livingston’s verse: “in a moment,” “the breast of” (twice), “of his eye,” “and then, in,” and “not a word.” Counting all instances of trigrams shared by Livingston or CE with NBC, we can add to CE’s total “to the skies” and the second instance of “the breast of” to get seven; and to Livingston’s total four instances of “to the skies” to get nine.

Norsworthy also cites a few collocations that link NBC to Moore’s verse, but in doing so he enters Moore in a one-horse race that he alone can win: mere gathering of similarities with NBC from a single author cannot establish his authorship.

Norsworthy mentions that “Irish Valentine” contains the rhyme “snow”–“below,” not counted in my chapter on “Rhyme Links.” But of course it wasn’t relevant to that context, since “Irish Valentine” was one of the poems reserved for checking the tests, and in Chapter 18 the instance of the rhyme in that poem is duly recorded (Who Wrote, 99).

Norsworthy has now encountered the attribution technique that analyses “word adjacency networks,” which he thinks “looks promising.” In “Attributing the Authorship of the Henry VI Plays by Word Adjacency,” Shakespeare Quarterly, 67 (2016): 232-56, Santiago Segarra, Mark Eisen, Gabriel Egan, and Alejandro Ribeiro report on the application of the technique to some Shakespeare First Folio plays. How well the technique would work on short poems remains untested. But another mathematically sophisticated technique, derived from signal testing and called “modal distance,” did well on 500-word blocks of verse: it measures the extent to which authors use, or avoid using, certain words together. It is described by Ward E. Y. Elliott and Robert J. Valenza, “A Touchstone for the Bard,” Computers and the Humanities, 25 (1991): 199-209, and “Was the Earl of Oxford the True Shakespeare? A Computer-Aided Analysis,” Notes and Queries, 236 (1991): 501-6. But I know of nobody except Elliott and Valenza who can apply it.

Happy Christmas

Norsworthy certainly shows that wishing somebody a “Happy Christmas,” rather than a “Merry Christmas,” was much less rare in Livingston’s and Moore’s time than in Literature Online, though several of his examples are not of the direct wish, as in NBC’s “Happy Christmas to all.” I happily grant that Norsworthy is brilliant at digging out this kind of information.

MacDonald P Jackson
March 13, 2017

Related posts:
  • Recommended fixes
