In fact (to get my polemic on), when I was still in a linguistics department, I threw at the head of the department (an Australianist, oddly enough for a department in Australia) the old dictum from George Hatzidakis: Πᾶς μὴ φιλολογῶν, οὐ γλωσσολογεῖ. "If you're not doing philology, you're not doing linguistics." "... Maybe not your kind of linguistics", he blinked.
But of course, a lot more linguists have to do philology than admit to it (or actually do it). If you understand philology properly. It's not about poring over old manuscripts: philology is understanding the social and historical context of the texts you're working on. Which means factoring the cultural in to your linguistic analysis: language is an instrument of culture, and the choices people make in what to say and how to say it are driven by their culture. To a significant extent, they *are* their culture. And that holds as much for Australian Aboriginal languages as for Early Modern Greek, if not more.
But to the dating of linguistic changes. For historical evidence, we rely in the first instance on written texts, and only secondarily on reconstruction and comparison between modern languages. And there are two types of written text before the printing press: literary, and documentary.
Literary texts got copied time and again, which is why they survived: LOCKSS, Lots Of Copies Keep Stuff Safe. That means we're unlikely to have the autograph, the original author's manuscript: certainly out of the question for Greek antiquity, where the wax tablets and papyri did not survive millennia of Greek dirt. Nor for that matter before the 10th century: as book technology advanced, copies in the old technology were thrown out wholesale, so once lowercase was invented, the older uppercase codices were discarded.
As I commented over at Roger Pearse's, some leaves from the uppercase autograph of the Life of St Andrew the Fool have accidentally survived—thrown out, and recycled as a book binding. That helps us work out that it was a 10th century text, pretending to be from the 6th century.
Without the autograph, we rely on the fidelity of copyists, our knowledge of the original language, and philological acumen to work out what the original said. With Ancient Greek, our evidence is not perfect, but we're not that badly off. The scribes were typically conscientious. We know Ancient Greek better than the scribes did, so we can often tell if they are wandering off linguistically. We typically have multiple copies of the text, so we can triangulate between them. And as I'll say below, we have other texts to fill out our knowledge of the ancient language.
We're not always there: we suspect for instance that the text of Herodotus we have has been hypercorrected, because it has plurals like τουτέων that look Ionic, but linguistic theory says shouldn't be there (Kühner–Blass 111): some time between 450 BC and 1000 AD, a scribe thought he knew better than the text he was copying, and all subsequent copies of Herodotus reflect his intervention. (The inflection is even more pervasive in the artificial Ionic of Roman-era medical authors.) But that kind of problem with mangled grammar is not debilitating for Ancient Greek.
Things are quite different for Early Modern Greek (and from what I gather, other mediaeval vernaculars). Again, we don't have autographs until well into the Renaissance. But scribes did not revere the language of the original, like they did Ancient Greek: Early Modern Greek had no prestige, and scribes' approach to the verses was influenced by their approach to folk song: liable to adjustment on the fly. So the wording of our multiple copies varies a lot more than we're used to from classical texts, in ways where it's not obvious what the original had, if anything. Editors more often have to drop the meticulous stemmatic reconstruction of the text, and go with selectio (gut instinct), or even codex optimus (follow just one manuscript).
It gets worse. Towards the end of the age of scribes, we can see scribes systematically modernising the language of the texts they were copying; that's what the 1625 manuscript of the 14th century Entertaining Tale of Quadrupeds did for example. Scribes could also do the opposite: make the language of their text more archaic and proper. The Digenes Akrites romance must have originated in 11th century vernacular ballads, but our earliest manuscript, the 13th century Grottaferrata, is in learnèd Greek. The 15th century Escorial manuscript is in the vernacular people were hankering for, but it's unlikely to be a direct reproduction of the 11th century ballads.
So if you have two variants of a passage, one more learnèd and one more vernacular, you can't always tell which reading is original—and use that to date the change back to when the text was originally written. Classical philology has a handy rule of lectio difficilior: if one reading is interesting and the other is boring, go with the interesting reading. That rule is predicated on the ancient author being a literary genius, and the scribe being a drudge: a drudge will simplify a text in copying it, and will not exercise ingenuity. With the Early Modern corpus though, the scribe is not necessarily less of a genius than the author, and feels more entitled to show us that he is.
An example comes from the very title of the Entertaining Tale of Quadrupeds. The word "entertaining" is given in two forms in the manuscripts: παιδιόφραστος "entertainment-phrased" is the Paris manuscript, the other four have πεζόφραστος, "prosaically-phrased" (i.e. vernacular). "Entertainment-phrased" is an interesting coinage; "prosaically-phrased" is—well, prosaic. So following lectio difficilior, editors of the text have accepted the reading Entertaining Tale, rather than Vernacular Tale.
Just as George Baloglou and I were going to press with our translation of the Entertaining Tale, I came across a paper by Hans Eideneier, arguing that the Paris manuscript routinely tinkered with the text, making it more archaic or pretentious. To Eideneier's mind, this was another such instance: the scribe thought "prosaically-phrased" was prosaic too, and tarted it up to the similar-sounding "entertainment-phrased". Scribes simply would not do that with Homer; but they had no such compunction with vernacular texts. And if they routinely did so, then lectio difficilior does not count as much in reconstructing the original text.
[Eideneier, H. 2002. Η «πεζῇ φράσει» Διήγησις των τετραπόδων ζώων. In Λόγια και Δημώδης Γραμματεία του Ελληνικού Μεσαίωνα: Αφιέρωμα στον Εύδοξο Θ. Τσολάκη. Πρακτικά Θʹ Επιστημονικής Συνάντησης (11-13 Μαΐου 2000). Thessalonica: Aristotle University. 269-277.]
This all amounts to needing a degree of scepticism about dating linguistic changes from literary texts. We can often work out the date a text was originally written in; and if a preponderance of surviving copies confirm the linguistic feature we're looking at, we can date the feature back then. But if we have only one copy, or if there is a dispute between the manuscripts, the only date we can have real confidence in is the date that the ink on the manuscript dried. For Early Modern Greek, that date is almost always the 14th or 15th century, rather than the 12th.
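For the programmatically inclined, the rule of thumb in the paragraph above can be put as a small Python sketch. Everything in it is my own illustrative assumption rather than anyone's established method: the `Witness` structure, the `safe_date` name, and especially reading "preponderance" as a strict majority of at least two surviving copies. Real editorial judgement weighs manuscripts, it doesn't just count them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Witness:
    """One surviving manuscript copy (hypothetical structure, for illustration)."""
    name: str
    copied_year: int    # when the ink on this copy dried
    has_feature: bool   # does this copy show the linguistic feature at the passage?


def safe_date(composed_year: int, witnesses: List[Witness]) -> Optional[int]:
    """Return the earliest year we can confidently date the feature to.

    Assumption: "preponderance" means a strict majority of two or more copies.
    """
    attesting = [w for w in witnesses if w.has_feature]
    if not attesting:
        # No copy shows the feature: the text gives us no date at all.
        return None
    if len(witnesses) > 1 and 2 * len(attesting) > len(witnesses):
        # Several copies, and most agree: trust the date of composition.
        return composed_year
    # A single copy, or the manuscripts disagree: fall back to the
    # earliest manuscript that actually has the feature.
    return min(w.copied_year for w in attesting)
```

With made-up numbers: a text composed in 1150 with three copies, two of which show the feature, gets dated to 1150; the same text surviving in a single copy from 1400 only licenses 1400.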
In fact, of the three Early Modern Greek literary texts dated to the 12th century, at least two are philologically problematic. Spaneas' Polonius-like moralising made it hugely popular; that means lots of copies, with lots of variation, and no obvious reconstruction of a single original text. The Ptochoprodromos cycle clearly was written in the late 12th century; but again, it was popular enough and copied enough that we can't always be sure the manuscripts' language is faithful to the original.
The third such text is Michael Glycas' Prison Verses, written in 1158/9 (and edited by Evdoxos Tsolakis, whose Festschrift featured Eideneier's paper above). In this case, we've been uncharacteristically lucky. We have a date for the poem, because we know what campaign the emperor was on when Glycas was petitioning him (unsuccessfully) for pardon. We have a 13th century manuscript of the text (and we know it used to be in another 13th century manuscript), which is completely unlike other vernacular texts. We have an eponymous author writing in the vernacular, and someone who otherwise wrote in the learnèd dialect; we'll need to wait two centuries for Stephen Sachlikes for the next such eponymous text.
It's the combination of all this that has enshrined the work as the first known Modern Greek text. (A second Glycas poem, written a few years later, presents vernacular proverbs more systematically, and I presume is what TAK was referring to in comments. To my embarrassment, I was not familiar with this work at all.)
This is slightly too good to be true, and if you look at the text, it's not as strongly vernacular as Ptochoprodromos: the substrate is learnèd, with some smatterings of vernacular, especially when proverbial wisdom is brought up. There is a continuum of vernacularness which the polemic fuelled by Greek diglossia skips over: John Camaterus, also from the 12th century, is even less vernacular than the Prison Verses, but still has a few shibboleths. (When Modern Greek literature begins is a vexed question, and Martin Hinterberger has a lucid overview.)
Still, Glycas has one of the earliest attestations of the Early Modern use of subjunctive να to indicate future tense (supplanted a couple of centuries later by θέλω να "want to" > θα). This excerpt, where an Early Modern να is followed by an Ancient future tense (να συμπαθήσῃ, ῥύσεται), illustrates how macaronic the poem gets. I italicise the clearly vernacular words:
Καὶ στὰς ὁ τάλας ἄφωνος καὶ πεπηγὼς ὡς λίθος
καὶ γεγονὼς περίδακρυς ἔδοξα παρακοῦσαι,
ὥσπερ τινὸς ἐγγίσαντος καὶ πρὸς ἐμὲ λαλοῦντος:
«Ἐδά, Μιχάλη ταπεινέ, φέρε τὸν λογισμόν σου·
ὅσα καὶ ἂν εἶδες ἄφες τα, τοῦτα παιγνίδια οὐκ ἔνι,
φόβητρα δὲ καὶ βάσανα καὶ στοναχαὶ καὶ πένθη,
ἀσυμπαθεῖς ἐξετασταὶ καὶ φοβεραὶ κολάσεις·
τῷ βασιλεῖ σου πρόσδραμε, λέγε τὰ πταίσματά σου·
ὁ βασιλεὺς φιλάνθρωπος καὶ νὰ σὲ συμπαθήσῃ.
Ἐκύκλωσάν σε σήμερον ὠδῖνες τοῦ θανάτου;
Ἐπικαλοῦ τὸν Κύριον κἀκεῖνος ῥύσεταί σε.
I, poor man, standing voiceless, fixed like rock,
becoming full of tears—thought I o'erheard
someone approach and speak to me, as 'twere:
"Now then, poor Mike, put on your thinking cap.
Quit everything you've seen: this ain't no game,
but terrors, torment, moaning, and laments,
unfeeling inquisitions, fearsome penance.
Approach thine emperor, admit your faults:
the kindly emperor's gonna pardon you.
Have pangs of death encircled thee today?
Call thou upon the Lord, and He shall save thee. (Prison Verses, 515-525)
We have some evidence of Early Modern Greek before the 14th century from other literary traditions, but not a lot. The Judaeo-Greek Jonah of 1263 is a word-for-word translation; and the bits of Greek in Rumi and Sultan Walad are hard to read, and probably second-language Greek. We have the occasional vernacular proverb or song mentioned in an otherwise learnèd text, such as the proverb in the Continuators of Scylitzes which is slightly earlier evidence of the να future, the imperial acclamations, or the "Go well my hawk" song in the Alexiad. We'd like to retain that evidence, through lectio difficilior: the vernacular is so out of place that the scribe is unlikely to have tampered with it, especially since they were back to "treat text with respect" mode in their copying. Yet that's an assumption, not a proof.
There are, luckily, other kinds of texts we can use as evidence: the documentary kind. They don't yield as copious evidence as the vernacular literary texts do: the literary texts are only macaronically vernacular, whereas the documentary kind are usually vernacular only by lapse—in which we're rather less lucky than Hellenistic Greek, with its abundance of Koine correspondence. Those texts, I look at next post.