• http://ninebyblue.com/ Vanessa Fox

    Well, academic researchers are almost certainly taking spelling variations such as this into account (although casual researchers likely are not). Back in the day when I studied Old English and linguistics, these kinds of changes in words and letters over time were among the first things we learned, and we got lots of practice working with them during our research. (The OCR issue, not so much, since I went to school back in the dark ages before such magic.)

  • Chris Harvey

    I’m curious as to why they used both the medial and the modern (for lack of a better term) S right next to each other. Using two of one or the other, I can see (congreff!). Using both? That’s just strange. It’s also used inconsistently in the passage you quoted, where one might “fuck the blood,” to borrow a phrase.

  • http://www.pobox.com/~ogilvie Brian Ogilvie

    @Chris Harvey: The medial and modern S appear next to one another because the “modern” S is a terminal S. It appears at the end of words. In “Congress,” the first s is medial (in the middle of a word) and the second is terminal (at the end of a word).

    The German ß (Eszett) ligature, which replaces ss in some circumstances, is a combination of a medial and a terminal S.

  • http://www.stepforth.com/ scott.van.achte

    Try the chart using both “suck” and “fuck.” You can see the crossover at around 1800. It’s interesting that there are spikes for both pre-1650.

    http://ngrams.googlelabs.com/graph?content=suck,fuck&year_start=1550&year_end=2000&corpus=0&smoothing=3

    I would never have stumbled on this by accident, but if I had I would have sure been confused.

  • http://www.nexcerpt.com/ nexcerpt

    Note from your final screen grab that the “Christian directorie” reveals another common error: “k” for “h” (more forgivable with very old fonts and faces). They derived one of your hits from “…resist all treating of SUCH affairs…” Not from “suck,” but from “such.”

    How will Ngram ever be able to diagram pr0n, if they can’t tell a f-uck from a s-uck??? ;-)

  • http://roberthheath.blogspot.com/ robert.h.heath

    Presumably, one of the reasons Google built this tool was to look for oddities in the data that might reflect errors in the OCR or metadata capture. A frequency spike for a certain term might reflect something in the culture, or a systemic error in the capture.

    Until Google provides some additional capabilities, though, it’s mostly a fun tool, not one ready for prime-time research.

    More on the topic here: http://roberthheath.blogspot.com/2010/12/google-labs-has-quietly-introduced-new.html