Talk:Centum and satem languages

Why not the Dacian suta (plural sute)? Why should the Dacian language be extinct? Romanian is far too vivid and rich to be of Latin origin. Jacob Stirbu

The name satem (or correctly: satəm) is taken from Avestan for historical reasons. It was just an arbitrary decision.

The question of why Dacian should be extinct is not clear. But it IS extinct! Romanian has only a few words from the pre-Roman substratum (i.e. from Dacian). But that is not a topic for the "Satem" subject.

--Grzegorj 11:54, 9 May 2005 (UTC)[reply]

Actually, Grzegorj, Romanian has a very large quantity of substratum words; the only real debate is whether they are Dacian, or Thracian, or Illyrian, et cetera. Decius 04:21, 19 July 2005 (UTC)[reply]

Oh... I am not a Romanist, so I do not know the exact number. But A. Cihac in his etymological dictionary of Romanian gives 2350 Slavic words (of course all of them are borrowings), 1150 Latin words, 950 Turkish words, 650 Greek words, 600 Hungarian words and only 50 "Albanian" ones (some of them may be Dacian, Thracian or Illyrian). Maybe later studies will let scholars find more Balkanisms in Romanian. Nevertheless, 50 out of a total of 5750 is ca. 8.7 per mille (not even 1%). I would hardly call that a very large quantity...
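As a rough check of the proportion just quoted, the figures can be summed and the share of each source computed. This is a minimal sketch: the counts are the ones cited above from Cihac, and the names `counts` and `share_per_mille` are made up for illustration.

```python
# Word counts per source, as quoted above from Cihac's dictionary.
counts = {
    "Slavic": 2350,
    "Latin": 1150,
    "Turkish": 950,
    "Greek": 650,
    "Hungarian": 600,
    "Albanian": 50,
}

total = sum(counts.values())


def share_per_mille(n, total):
    """Return n as a share of total, in parts per thousand."""
    return 1000 * n / total


# Share of every source, rounded to one decimal place.
shares = {name: round(share_per_mille(n, total), 1) for name, n in counts.items()}

for name, value in shares.items():
    print(f"{name}: {value} per mille")
```

With these figures the total is 5750, and the "Albanian" share comes out a little under 9 per mille, so still below 1%.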

--Grzegorj 20:23, 4 August 2005 (UTC)[reply]

Frankly, your source was wrong. The cognates (not loans) between Albanian and Romanian alone number about 145 (I take this number from a Hungarian source that was discussing the issue). Add to that the words of unknown etymology/origin that are not found in Albanian, and your figure and your characterization are completely wrong. Decius 20:30, 4 August 2005 (UTC)[reply]

OK, I cited a source from the 19th century. Some of the obscure Romanian words may be Bastarnian or even Sarmatian. But please, even 200 Balkan words make up only about 3% of the whole vocabulary. So it is not "a very large quantity"; do not exaggerate. Compare it with the number of native (Latin) words: there are about six times as many. And all of Romanian grammar is Romance with Slavic influence, not Dacian. So we should forget Dacian. Dacian is extinct! Only some Dacian words may have survived, just as some Gaulish words have survived in French.

--Grzegorj 21:32, 4 August 2005 (UTC)[reply]

19th century etymological dictionaries dealing with Romanian are known to be outdated and full of errors. As a rule, they tend to have slighted any notion of a large substratum element, and in retrospect this tendency to slight was a tendency based on 19th century psychology rather than science. I agree that Dacian is extinct, and I agree we must not just assume a Dacian substratum. But there is a Balkanic substratum in Romanian, whether "Dacian", "Moesian", "Thracian", or "Illyrian". I may or may not have exaggerated when I said very large quantity, but note that I have in mind a figure around 300, and I consider that a very large quantity when one is speaking of substratum words. The linguist/Thracologist Sorin Olteanu in his LTDM site says there may be thousands [1]. I do not claim thousands, and I'm not sure how he arrived at that figure (I think he was considering Aromanian as well), but I do claim around 300; quite a number of linguists would agree. Decius 21:41, 4 August 2005 (UTC)[reply]

What Jacob was asking was "Why not list Dacian 'suta' among the examples of words for hundred that show the Satem sound-change?". He wasn't asking "Why not change the article's name to 'suta'?" To answer his question: we can't list Romanian 'suta' as Dacian because it is not attested as Dacian. Though it is not attested as Dacian, a number of current linguists accept that Romanian 'suta' is not from Slavic 'sto', but more likely from the substratum. Decius 04:31, 19 July 2005 (UTC)[reply]

Why not from Slavic? The Slavic word for "100" was nothing other than *suta! Over the centuries the former short a developed regularly into o in Slavic, and short u developed into the ultra-short vowel written with the Russian "hard sign", ъ. So the oldest attested Slavic (Old Church Slavic) word for "100" was sъto. It corresponds exactly to the Romanian form, and I see virtually no reason to look for another source for the Romanian word.
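The two vowel changes just described can be sketched as a simple substitution. This is only an illustration: it assumes that every u and a in the input is short, it covers none of the other Slavic sound changes, and the names `SHIFTS` and `apply_shifts` are hypothetical.

```python
# Only the two changes named above: short u > ъ, short a > o.
# Assumption: every "u" and "a" in the input stands for a short vowel.
SHIFTS = {"u": "ъ", "a": "o"}


def apply_shifts(form):
    """Apply the two vowel shifts to a form written without the leading asterisk."""
    return "".join(SHIFTS.get(ch, ch) for ch in form)


print(apply_shifts("suta"))  # sъto
```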

--Grzegorj 20:23, 4 August 2005 (UTC)[reply]

Despite what you see, and you are entitled to your opinion (nor am I diminishing this particular opinion of yours), a trend in current linguistic sources is to no longer assume a Slavic origin for sută. See even the conservative DEX (it is online [2] ), which does not derive it from Slavic. Unless I'm mistaken (I haven't read the detailed studies), the problem is that by the time the Slavs influenced Romanian, they no longer had the word in the form *suta (unattested), so Romanians were not likely to get it from them. You seem to be interested in this, so I will pass on what linguistic references I can find on this topic onto your Talk Page eventually if not here. I would imagine though that "Slavic" (excuse the generalization) references would still assume that the Romanian word is a loan. Decius 20:37, 4 August 2005 (UTC)[reply]

But what you say is not true; please be more exact in future. "SÚT//Ă1 ~e num. card. 1) Nouăzeci plus zece. O ~ de pagini. 2) cu valoare de num. ord. Al sutălea; a suta. /<sl. suto"

So, DEX does derive the word from Slavic! So do I. Please give evidence that I am wrong, not trends.

The Slavs were able to influence Romanian very early, even before their migration to the Balkan Peninsula. Some Slavic tribes were north-eastern neighbours of the Proto-Romanians as early as the very beginning of the Middle Ages. And the form *suta may have survived up to ca. 800 AD. The evidence is the phonetic shape of old Slavic loans in Greek. I do not think that the first Slavic-Romanian contact was later than the Slavic-Greek one. So I do not think that those who reject a Slavic origin of sută are specialists in the early history of the Slavs. Of course, they may be right, but only if they know another possible and more probable source. Frankly, I do not think that any such source exists.

--Grzegorj 21:47, 4 August 2005 (UTC)[reply]

You have quoted NODEX, and the bibliography does not indicate [3] that it is published by the Academia Romana, which is more official. The DEX, which the bibliography notes is published by the Academy, gives the source as unknown (Et. nec.). Decius 21:53, 4 August 2005 (UTC)[reply]

A negative view ("unknown source") means no view. Please show why the word cannot be of Slavic origin. I cannot see such evidence.

--Grzegorj 23:34, 4 August 2005 (UTC)[reply]

When the DEX states Et. nec., it indicates that the source of the word is unknown. This means that those linguists at the Academy (who are conservative, mind you, and in most cases would rather say a word is from Slavic than say it is of unknown origin) as of 1998 did not accept the Slavic origin. Why? Well, I live in California, so I do not have much access to the references in question. The NODEX from 2002 does not seem to be published by the Academy, though it is a Romanian publication also, and it gives a Slavic origin for the word. Decius 01:14, 5 August 2005 (UTC)[reply]

Sorin Olteanu: The reason why they reject, on good grounds, the Slavic etymology is the accent: while in Slavic the stress falls on the final syllable (sŭtó), in Rom. it falls on the first, the word being pronounced sútă (as in Eng. soóter). For other issues visit my site http://soltdm.tripod.com/index.htm

Polyphyletic or Paraphyletic or

If all Indo-Europeans fall into the two categories centum and satem, it isn't possible for both groups to be paraphyletic according to the usual meaning of the word. Does it have a different meaning for linguistics, did one or the other page mean polyphyletic, or is there simply a mistake?

Answer: Yes, you are right! What is more, the view that the satəm languages are polyphyletic is only one of two possibilities that are under discussion. My opinion is that they are monophyletic. And, e.g., the close relation between Germanic and Slavic is surely a result of secondary processes, not evidence for their relationship.

Because the matter has not been unequivocally resolved so far, my suggestion is to delete the relevant statement from the article. What do you think about it?

BTW. Some modern centum languages are formally satəm in fact. See French cent [sã] which has [s] in the place of the original (Latin) [k]. Even if it is a result of secondary development, the division into kentum (centum) and satem must be taken with caution.

--Grzegorj 11:55, 9 May 2005 (UTC)[reply]

I have removed the statement because I agree with you.--Wiglaf 07:34, 13 May 2005 (UTC)[reply]

Centum languages with secondary palatalization of /k/ to /s/ are not in any way "formally satem in fact". In French cent /sã/, /s/ goes back to PIE */ḱ/, true enough; but in cinq /sɛ̃k/ it goes back to PIE */k^w/. And */ḱ/ shows up as /k/ in cœur /køːʁ/. --Angr/_{tɔk tə mi} 21:42, 22 July 2005 (UTC)[reply]

It is not true at all. These are just three examples, which prove nothing. In fact, PIE *k and *k' merged long before Classical Latin times. *k^w was a separate phoneme, but sometimes the labialization was reduced (as in quinque > cinq). There is no reason to go back to PIE times, since French developed from Latin, not directly from PIE. Latin lost the distinction between PIE *k' and *k, and French was not able to restore it in any way.

The development of Latin [k] and [k^w] in French depended on the phonetic environment, not on their deeper etymology. Namely, [k] developed into [ts] and eventually into [s] before a front vowel (e.g. merci from mercēdem, or cent /sã/ from centum /kentum/), into [tš] and eventually into [š] before [a] (e.g. Charles from Carolus, chose from causam etc.), and stayed unchanged in other positions (e.g. coude from cubitum [k], aucun from alicūnum etc.).
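The three-way split just described can be written down as a small lookup on the sound that follows Latin [k]. This is a deliberately simplified sketch of the rule as stated here: it ignores the position of [k] in the word and all intermediate stages, and the names `FRONT_VOWELS` and `french_reflex_of_k` are hypothetical.

```python
# Assumption: only these vowels count as "front" for the purposes of the rule.
FRONT_VOWELS = {"e", "ē", "i", "ī"}


def french_reflex_of_k(following):
    """Return the French outcome of Latin [k] given the sound that follows it."""
    if following in FRONT_VOWELS:
        return "s"  # via the intermediate stage [ts]
    if following in {"a", "ā"}:
        return "š"  # via the intermediate stage [tš]
    return "k"      # unchanged in other positions


# Latin word -> the sound following its [k], taken from the examples above.
examples = {"centum": "e", "Carolus": "a", "causam": "a", "cubitum": "u"}

for word, following in examples.items():
    print(word, "->", french_reflex_of_k(following))
```

The point of the sketch is that the function never looks at the PIE source of the [k]; only the Latin environment matters.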

And the change of former (Latin) [k] into [s] before a front vowel is what caused the numeral "100" to begin with [s] in French, just as in satəm languages. Because we know Latin, the language from which French developed, we can say that the [s] in the French word for "100" is secondary. But if we did not know the language which gave birth to French, we would not be able to say whether French is centum or satəm. French is formally satəm because the word for "100" begins with [s] in this language. Of course we can count this language as centum, but only because we have additional knowledge.

So, even if we knew that, e.g., the Dacian word for "100" began with [s], it would not yet be a conclusive argument that Dacian belongs to the satəm group, because it might have been secondarily satemized, just like French.

--Grzegorj 20:23, 4 August 2005 (UTC)[reply]

Thracian

The Thracian corpus has preserved a number of Satem examples, but its classification as Satem is still being discussed. Sorin Olteanu states that Thracian was a Centum language till a later period when it became Satemized under Balto-Slavic influence. We have no idea what the Thracian word for hundred was, and some Thracian words don't show the sibilant. This article notes that it is only "perhaps" Satem, not definitely, so no problem. Decius 07:07, 20 May 2005 (UTC)[reply]

I wouldn't have thought there was enough information on Phrygian and Thracian to say definitively one way or another. But there is evidence that some Anatolian languages were satem even though Hittite was centum, and Luwian seems to have been both: it kept *k and *k^w distinct but made a sibilant (spelled z, probably [ts]) out of *ḱ. --Angr/_comhrá 07:15, 20 May 2005 (UTC)[reply]

Because of insufficient data, it is hard to say anything about the above-mentioned languages. It is, however, worth mentioning that in each satəm language there are words with velars in the place of PIE palatals. For instance, the original palatals, as a rule, give velars in Slavic when the root contains an original "s". Just compare Lithuanian žąsis and Old Church Slavic gǫsь 'goose' (< PIE *gʹhans-i-).

I have not heard of satəm words in Luwian, but I have heard of such words in Lycian. However, notice that Lycian is known from inscriptions from the 5th-4th centuries BC, i.e. it is very young when compared with Hittite. Such isolated Lycian words as snta "100", esbe "horse" or sijeni "it is lying / recumbent" may be borrowings from a Satəm language closely related to Phrygian (if it really was Satəm) or Proto-Armenian. Some scholars, like V. Georgiev, believe that some late Anatolian languages had a mixed character and had two components: an older one, closely related to Hittite, and a newer one, Satəm. Three or four satəm words are known from Hittite itself. The most reliable and most widespread opinion is that they are borrowings from the early Indo-Iranian spoken by people who once inhabited Mitanni.

--Grzegorj 20:56, 4 August 2005 (UTC)[reply]

Dacian

I added Dacian, because it was at least semi-Satem. Decius 07:22, 18 July 2005 (UTC)[reply]

What does "semi-Satem" mean, and what's the evidence? --Angr/_{tɔk tə mi} 08:19, 18 July 2005 (UTC)[reply]

It's a shorthand term I used to describe a language that may have had Satem reflexes only very irregularly (Phrygian) or as a later development. The evidence for Dacian being Satem rests on examples such as the ones I've listed here: Talk:Dacian language. Decius

All right, but please do not call (s)kazat(sya) Common Slavic! It is Russian, not Common Slavic ((s)kazat'(sya) in transliteration). The Common Slavic form was kazati (attested in Old Church Slavic) and sъkazati, with the meaning "show something with a gesture", not just "show" (see Vasmer, Russisches Etymologisches Wörterbuch). The Old Church Slavic and Common Slavic middle voice was kazati sę and sъkazati sę. The PIE form with *k^w- is not very likely, see Sanskrit śāsati, śāsti 'he/she shows with gesture', but also kāśatē 'it shows up' < PIE *k'ōs-, *kōk'- or *kōg'- (as far as I know, the reconstruction of initial *k^w- is based only on Greek tekmar 'sign' < *k^wek-).

--Grzegorj 22:13, 4 August 2005 (UTC)[reply]

Yes, the form is Russian, not Common Slavic. The etymology from *kwek- (or *kweg-) is based on my AHD from 1969, and they may have been wrong, but I think Olteanu in his current LTDM site still gives an initial kw- (Read this, it may be of interest). Decius 22:20, 4 August 2005 (UTC)[reply]

There is no evidence that Thracian and Dacian were separate languages. In fact, the ethnonym Thrax (Thrāix), Threēix is attested as early as Homer. Northern Thracians were known as Getai in classical Greek times. Other Getai / Getae were known from the Dnestr region (Tyragetae), from the mouth of the Don (Thussagetae) and from the northern shore of the Caspian Sea (Massagetae). So they may have been an Iranian tribe which was Thracized. In Roman times the Dacians (Daci, Dakoi, Dakai, Dakes) appear for the first time, and they are counted among the Getae (and so among the Thracians).

The number of Daco-Thracian satəm words exceeds the number of centum ones. So we have insufficient evidence to count these languages as Centum. The term "Semi-Satem" is ridiculous, because in Daco-Thracian there are a number of words with velars in the place of PIE palatals, for different reasons (not necessarily borrowings from Centum languages), just as in other Satəm languages (no pure Satem languages are in fact known). For example, Akmonia may be native, cf. Slavic kamy, gen. kamene 'stone' (a centum form in a Satem language), related to English hammer (originally "stone hammer"), Lithuanian akmuo and ašmuo 'stone' (centum and satem forms side by side!), Latvian asmens 'blade of knife', Greek akmōn 'anvil', Sanskrit aśmā 'stone, rock'.

Other Dacian and Thracian Satəm words are:

Asamus, hydronymic, from *ak'm- 'stone', see above,
briza 'a sort of corn', cf. Slavic rъžь, German Roggen 'rye' and the English word (< *(w)rug'h-)
-diza, toponymic formant, from *deig'h- 'build in clay, in brick',
Diuzenus = Greek Diogenes (*-g'-),
-esp-, -ezb-, onomastic formant, from *ek'wos 'horse',
Kozeilas, Kozaros, Kozinthēs, cf. Slavic koza 'goat', kozьlъ 'he-goat',
Razea, Raizdos, Rēsos from *rēg'- 'to rule, to govern; king',
Zantiala from *g'enH-t- 'clan, tribe; to give birth',
Zoltes from *g'hol- 'golden, yellow'.

Other Centum words are:

Anguron (now Iron Gate) from PIE *ang'- 'narrow' (however, see also Slavic ǫglъ 'corner' (Latin angulus)),
argilos 'mouse' from *arg'- 'silvery, bright' (see Argessos above),
Decebalus parallel with Sanskrit daśabala 'having the power of 10 men',
Dekaineos also from *dek'm- 'ten',
Peuci (Dacian), Peukē, Pecetum (Thracian) from *peuk'- 'to prick' (cf. Latin picus 'woodpecker', picea 'spruce')
Trikornion, maybe of Celtic origin in fact, from *k'orn- 'horn' (however cf. Slavic *karvā 'cow', literally 'horned animal', also with k-, not with expected s-)

Most of them have parallels in other Satem languages.

--Grzegorj 23:34, 4 August 2005 (UTC)[reply]

I agree, that's why I listed Dacian as Satem. Also, Romanian substratum words indicate satem-sound changes. One of the arguments against the Romanian substratum being "Illyrian" (as some may claim), is that Illyrian was most likely Centum (cf. Wilkes, et al.). So I have come to see the Satem nature of Daco-Thracian as a plus :) . "Semi-satem" is a concoction, and I would have used a better term, but forgot it on short notice. Decius 23:42, 4 August 2005 (UTC)[reply]

However, Vladimir Georgiev and Ivan Duridanov would not agree that "there is no evidence that Dacian and Thracian were separate languages". Viewing them as separate languages on the same Indo-European branch seems likely, unless further evidence indicates something else. Given the uncertainty, it is best to list Dacian and Thracian separately. For example, many of your satem examples above are specifically Thracian (briza, -diza, Rhesos, etc., though Rhesos may also be Dacian, but I'm not sure; interestingly, there is a Dacian name Regalianus). Decius 23:50, 4 August 2005 (UTC)[reply]

Question

Where does Albanian fall? At the beginning of the article it is stated that it's satem, and yet by the end of the article Albanian ends up being neither satem nor centum. The word for 100 in Albanian is 'qind', which is very close to the sound of k/c-int of centum. ~~Xhamlliku~~

Xhamlliku, Albanian 'qind' most likely is a loan from Latin 'centum'. This is what most linguists theorize, because 'qind' is not the expected form from PIE *(d)k'm.-tom.

Article needs some fixes

The Centum/Satem distinction does not divide Indo-European into two dialects. The most accepted theory is that a sound change occurred very early in one area of PIE and spread to other areas. This explains why the Indo-Iranian languages show complete Satem (the dialects in which the sound change probably occurred), Balto-Slavic shows incomplete Satem, while branches on the edge, like Italic, have no Satem changes.

For example, taken from Language History, Language Change, and Language Relationship by H. H. Hock and B. D. Joseph (1996, page 357), Father-in-law (PIE: *swek'uros): Sanskrit: s'vas'ura, Old C. Slavonic: svekuru, Latin: socer...Hundred (PIE: *k'm.tom): Sanskrit: s'atam, Old C. Slavonic: suto, Latin: centum...Notice that Old Church Slavonic has both /k/ and /s/ from PIE /*k'/.

Also, the sound change of the palatals to velars or sibilants may have just developed independently in the various branches of Indo-European, as this sound change occurs frequently across the world's languages. Any linguists want to work on this article?

At the very least someone should point out that this is only one of many isoglosses that lie between the various IE accents. Mallory shows 24 in a diagram taken from Raimo Anttila, and for the most part the "K-S boundary" doesn't even fall along the densest bundles. — B.Bryant 09:43, 1 September 2005 (UTC)[reply]

I also can't vouch for the accuracy of the maps. Imperial78

I have edited the page so that it is more neutral in describing the sound change and, at the end, discusses the theories of which the sound change is a part. Imperial78

Tocharian has no centum sound change, explain.

Dbachmann, how does Tocharian not have the centum sound change? The palatals become velars. Also, it is likely that none of the centum languages form a node, and the same goes for the satem languages. So how is it that Albanian is not a satem language, when the term satem now just describes languages which have fricatives or affricates from the PIE palatals? There are no IE subgroups of Centum or Satem. Finally, I think you can keep the ad hominems to a minimum just because I didn't like your maps. Imperial78

Tocharian and Albanian

As far as I know, Tocharian is unambiguously Centum. The palatovelars and plain velars are merged. Keeping the labiovelars distinct is not a necessary condition for being called Centum, as merging the labiovelars with the plain velars could have happened at any time (it happened between Primitive Irish and Old Irish, for example). As for Albanian, I don't know much more about it than what I've read in Beekes, but it seems clearly Satem: the palatovelars have become the dental fricatives th, dh, while the plain velars and labiovelars largely merged. Beekes says the evidence that plain velars did not (secondarily) palatalize to s before front vowels while labiovelars did is "too meagre". So the strongest evidence is that Albanian is straightforwardly Satem. --Angr/_{tɔk tə mi} 12:19, 5 September 2005 (UTC)[reply]

Well, Albanian is mostly Satem, and for the purposes of dealing with Albanian, we can say it is Satem, no problem, but the mostly becomes essential when looking at the areal nature of Satemization (I admit, of course, that in no family was Satemization 100.000% complete, so yes, we can call Albanian Satem as long as the exceptions are noted). As for Tocharian, my point is that there is no evidence of any connection, areal or otherwise with the western Centum group. Tocharian simply melted all dorsal rows together, which really amounts to a 'null' status. Phonematically, you may as well call it Satem (palatovelars and velars collapsed), although of course the affrication is missing. I argue that evidence of labiovelars but not palatovelars at some stage is required for a Centum language. In the case of Tocharian, there possibly was such a stage, but we simply don't know. In Irish, even if we didn't have Primitive Irish, we could argue from the difference in treatment of gw and g that the labiovelars were, at some stage, separate. If we didn't have any such evidence in Celtic, I agree the situation would be just like in Tocharian. We do, however, have this evidence for Celtic, but not for Tocharian. dab (ᛏ) 13:26, 5 September 2005 (UTC)[reply]

PS, I realize that non-usage of the cuneiform "q" series by Hittite is a weak argument; they didn't use voiced vs voiceless either, and Semitic 'emphatic' stops are not equal to IE labiovelars. The Romans did still adopt Phoenician q for their labiovelar. Anyway, the main point there is that since Hittite doesn't spell its labiovelars as "q", there is no evidence either way. dab (ᛏ) 13:32, 5 September 2005 (UTC)[reply]

In Andrew Sihler's New Comparative Grammar of Greek and Latin he argues that there is no Centum grouping, that Centum is simply a cover term for those languages that did not undergo Satemization. Under that definition Tocharian is clearly Centum by virtue of being non-Satem. But Sihler's definition of Centum (which he probably lifted from somewhere without citing his sources, something that irritates me no end about that book) of course doesn't have to be the only definition of Centum. As for Q in Latin, they didn't borrow Q to stand for their labiovelar, exactly; their labiovelar was spelled QV. And they probably didn't borrow it directly from Phoenician at all, they probably borrowed it from Greek qoppa, which was sometimes used in inscriptions to indicate a backed allophone of /k/ before /u/. --Angr/_{tɔk tə mi} 14:22, 5 September 2005 (UTC)[reply]

Agreed on both counts. I will correct the Latin labiovelar to qv. Yes, we should make it clearer that while the Satem group is a result of "Satemization", the Centum group is often taken to equal "non-Satem". Since Albanian and Armenian are Satem on the surface, and the idea that they may have been satemized secondarily is no more than a suspicion, Centum/Satem was indeed an 'either/or' classification in Brugmann's time. It is precisely the 'outer' languages, Toch. and Anatolian, that require us to reevaluate the term. Still, logically, there was something like a "Centumization": either, if there were three rows, a merger of k and k', or, if there were only two rows, the (phonetic) creation of a labiovelar row. Clearly then, centum/satem cannot be a logical either/or term, because if the PIE phonology had survived in some remote pocket, it would be neither. Looking at the proto-language of each branch, it is very clear that Italic/Germanic/Greek/Celtic were Centum, and that BSl/IIr was Satem. It is unclear whether Proto-Anatolian was either, and it is unclear what happened to Tocharian. If we apply Brugmann's "centum = non-satem", of course they will be centum, but that does not give any insight. Progress lies in the realization that centum/satem is a classification of "inner" dialects. dab (ᛏ) 06:00, 6 September 2005 (UTC)[reply]
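The difference between the two definitions of Centum used in this thread can be made concrete with a toy classifier. This is only a sketch of the argument, not a linguistic tool: the two boolean inputs are a drastic simplification, the sample values for each branch merely restate claims made in the discussion, and the names `classify` and `centum_means_non_satem` are hypothetical.

```python
def classify(palatovelars_assibilated, labiovelars_attested, centum_means_non_satem):
    """Classify a branch under one of the two definitions discussed in this thread.

    centum_means_non_satem=True  : Centum is a cover term for anything not satemized.
    centum_means_non_satem=False : Centum additionally requires evidence of a
                                   distinct labiovelar row at some stage.
    """
    if palatovelars_assibilated:
        return "satem"
    if centum_means_non_satem or labiovelars_attested:
        return "centum"
    return "undetermined"


# (palatovelars_assibilated, labiovelars_attested), as claimed in the discussion.
branches = {
    "Indo-Iranian": (True, False),
    "Celtic": (False, True),
    "Tocharian": (False, False),
}

for name, (assibilated, labiovelars) in branches.items():
    loose = classify(assibilated, labiovelars, centum_means_non_satem=True)
    strict = classify(assibilated, labiovelars, centum_means_non_satem=False)
    print(f"{name}: loose={loose}, strict={strict}")
```

Under the loose definition Tocharian comes out as centum; under the stricter one it is left undetermined, which is the disagreement in a nutshell.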

I think it is not even clear that Latin has labiovelar phonemes. Latin_spelling_and_pronunciation#Summary_of_phonemes gives gw and kw, but I suppose their monophonemic status is doubtful. I'm not sure about this; I suppose one has to decide from metrical evidence. dab (ᛏ) 06:08, 6 September 2005 (UTC)[reply]