But at a certain point, a loan word becomes "adopted" by its language, right? And at that point, it doesn't make sense to carry over the foreign rules for pluralization, etc.
I'm not sure if "pierogi" is common enough yet for "pierogies" to become the valid way to pluralize it in English, but as a Russian speaker, "чипсы" sounds perfectly normal to me.
After all, if I were speaking Russian I wouldn't say "компьютерз" - I would say "компьютеры", using the correct pluralization in the language I'm speaking in.
You can try to interest "Rada Języka Polskiego" (the Polish Language Council) in this topic. After all, as we all know, every Polish speaker will follow their advice diligently :).
In the early '90s, after the COCOM embargo on importing computers into the former Eastern Bloc had been lifted, there were initiatives to translate English-sounding words into something more native to the Polish language.
An example of a word that badly needed such translation was 'interface', for which a proposed translation was 'międzymordzie' (literally: a thing between two faces) which sounded hilariously bad in Polish. There was another proposal to name 'computer mouse' as something even better, but I've forgotten the name since.
Fortunately, nobody followed, and 'computer mouse' is simply 'mysz' (mouse). I hear, that French had longer lasting successes with their translations.
> I hear, that French had longer lasting successes with their translations.
Kind of. Generally, ‶old″ words are consistently translated. For instance, we have souris for mouse (which is exactly the direct translation, because it conveys the same idea), ordinateur (which could be translated as ‶sorter/processor″) for computer, informatique (science of the information) for computer science, disquette (small disk) for floppy disk, logiciel for software, and so on.
However, newer ones (i.e. past ~1985) didn't catch on as much, mostly because the official translations were awful and/or came years too late. For instance, CD-ROM is supposed to be cédérom, i.e. the straight phonetic transliteration – losing all meaning in the process; tablet is supposed to be ardoise numérique (digital slate) – it evokes school, came years too late and was too long in comparison to the colloquial tablette; fouineur (snooper) for hacker (no comment...); cybermonnaie (cyber-money) for cryptocurrency – losing half of the semantics, and so on.
I heard the mouse was supposed to be called "manipulator stołokulotoczny" (manipulator with a ball rolling on the table - due to the way old style mice were designed).
> There's no need to know the singular for pierogi, because no one has ever eaten just one.
Ain't that the truth! My Polish friend also introduced me to 'pierogi rooski'? Apparently, a Russian take on pierogi, with more meat. Do I have the spelling correct?
> In return, I promise to continue working to stop Polish people from pluralizing potato chips as "chipsy".
As a Jamaican, where banana chips are like a national snack, it was amusing to see "chipsy bananowe" for sale. Fun times; I plan to go back.
> My Polish friend also introduced me to 'pierogi rooski'? Apparently, a Russian take on pierogi, with more meat.
If you mean "pierogi ruskie", they don't have any meat in them, just quark
with potatoes and onion, though they're often served with bacon. And this dish
comes from "Ruś" (now Ukraine), not "Rosja", otherwise it would be called
"pierogi rosyjskie". I hear that in Ukraine they call this dish "Polish
pierogi".
> As a Jamaican, [...], it was amusing to see "chipsy bananowe" for sale.
I believe "it" was not a Jamaican, however amusing it was.
Adding to the other comments: the closest thing that can be described as "Russian take on pierogi" is pelmeni. In Ukrainian cuisine, it would be vareniki.
Confusingly, Russian does have the word "pirogi" (plural; singular "pirog") - but it's a kind of pie, not a dumpling.
Not because there's anything wrong with it, that is.
It just always puts a smile on my face when I walk past a tipsy nail salon with a sign that means euphemistically, to my British ears, being on the path to getting properly drunk.
Germans love to order espressos. If they say espressi it always sounds pretentious. Sometimes it's espressis. Then everyone who speaks Italian winces (especially the Italians).
Then again, when I'm ordering an espresso in Italy, my results with just "caffè" are also mixed because it turns into the poison scene from The Princess Bride:
- is the guy speaking Italian?
- is the guy just asking for a regular coffee because he doesn't speak Italian?
- do they maybe call it different in different parts of the country?
Ok, so out of curiosity I googled "singular for pierogi", and it seems to be "pieróg". But there are a number of (seemingly) Polish people scattered around the Google results saying that this is never used, or incorrect, or refers to only the general form for filled dumplings.
Now I know that in real-world usage it would be unlikely to refer to one, but what if there was a need to? Let's say someone made a statue to pierogi, but due to budgetary problems only one of the several pierogi planned was constructed. A tourist asks you how many pierogi make up the statue. How would you respond? Would you use the singular, like "only one pieróg", or would you rework your response so there was no need for the singular, like "sadly only one of the several pierogi was built"? Which seems more natural?
I recently came across some "burekasim" in Israel. The double plural there has a taste of the entire Mediterranean, from Turkish/Greek/Arabic/Ladino/Spanish/Hebrew :) I guess it could be dialed up one more notch if an English-speaking tourist asked for some burekasims.
By the way, I was on the big island of Hawai‘i last month and I drove past Pszyk Road. As well as I can discover, this was named for a Polish immigrant whose descendants now have names like Harmony Wai‘olohe Pszyk.
Ouch, that asterisk does some seriously confusing double-duty in a thread like this...
In Internet forum speak, the asterisk means "I meant to say" or "You meant to say".
In linguistics, the asterisk means "This form is not attested (we don't have examples of people using it)" or "This form would be considered mistaken by speakers".
That means that the forum usage of asterisks is something like "I should have said X" while the linguistics use of asterisks is something like "I shouldn't have said X". :-(
Polish orthography is a disaster partly because of the choice of Latin over Cyrillic. The latter would have been a much better fit for the sounds in the language.
Making matters worse, there are homophones (ż and rz, u and ó, ch and h) that depend on when the word entered the language.
Like so many things wrong with the country, you can comfortably blame the Catholic Church for this orthographic train wreck.
Polish letters on top of Latin are no worse than German (ä, ö, ü, ß), or French (é, à, è, ù, â, ê, î, ô, û, ë, ï, ü, ÿ, ç). Cyrillic is not the answer, as each Cyrillic country has its own variations, its own letters. See https://en.wikipedia.org/wiki/List_of_Cyrillic_letters
Regarding homophones, this is not a matter of when a word entered the language, but rather of pronunciation changing so that it no longer matches the phonetic spelling.
'ó' and 'u' are homophones because the sound is now the same. Historically, however, it wasn't. And since it was historically a different sound, it also followed different rules for how it changed under inflection.
The situation is similar with 'ż' and 'rz'. There are even neat words where inflection reveals where they originated. For example, two inflected forms have the same pronunciation: 'każe' and 'karze'. The first comes from the root word 'kazać', as in telling people what to do. The second comes from the root word 'karać', meaning to punish.
I would wager that 'ch' and 'h' are the most recent homophones. Some people still alive were taught to use the correct hard 'H' vs soft 'CH' when pronouncing words.
I would say citing all those French diacritics is cheating a bit. The diaeresis is just used to indicate separate syllables, and ù and ÿ are not a thing in standard French.
Polish is uniquely bad not because of the diacritics, but because of the special phonetic behavior of digraphs ('si', 'ci', 'zi', 'rz', 'sz', 'cz', 'ch', 'dż', 'dź', 'dzi').
To strengthen your argument there is another digraph 'ni'.
But this is now approaching the discussion of writing not matching pronunciation. Pronunciation shifts. Words change. The Brothers Grimm (famous for their fairy tales) were linguists who studied how the use of consonants has changed. They describe how 'pater' (as in paternity) shifted to 'father'. But Polish has its own such shifts. Take a word like 'jabłko', which many people pronounce 'japko', or 'śliwka', which is pronounced 'ślifka'.
I wouldn’t call it “using correct pronunciation”—the whole idea of hypercorrection is that the result is not correct, because some perceived rule was mistakenly over-applied, like in English * “He gave it to Sam and I”.
(Plus I’m guessing that sound change happened pretty early on, because as far as I know, all Polish consonant clusters obey that rule of voicing & devoicing obstruents.)
The curious thing about Cyrillic is that, at least in Slavic languages, the correspondence between phonemes and letters is much more uniform than in Slavic languages that use Latin.
For example, "ч" is always the "ch" or "tsh" sound, whether we're talking about Russian, Ukrainian, Bulgarian or Serbian. OTOH, in Polish you use "cz" for it, while in Czech and Serbian it's "č".
It's not an inherent advantage of Cyrillic, of course, it's just that, by virtue of being designed specifically for Slavic languages, it had certain letters designated from the get go, while adopting Latin allowed for a lot more leeway in deciding how to represent sounds, and different languages did it differently.
It's not only that Cyrillic is designed specifically for Slavic languages, but also that the general tradition around Cyrillic is to make up new letters when adapting it to a new language, as opposed to the Latin alphabet where the two traditions are to use digraphs and to add diacritics. This choice between two possibilities is already causing the problem of "cz" vs. "č", but within those there are still the choices of "cz", "ch", "tsh", "tch", ... and "č", "ç", "ĉ", ...
I think the line between "making up new letters" and "adding diacritics" is rather blurry, and really hinges on your definition of "new letter", which is largely defined by convention. Most (all? I can't think of any that don't) Cyrillic alphabets have some letters with diacritics - ё and й in Russian; й, i and ї in Ukrainian; j in Serbian etc.
But speakers of those languages don't consider the diacritics to be a modifier in this case - we treat them as completely separate letters, that just have a disjoint element in their overall shape. I don't see why e.g. c vs č can't be treated in the same exact way as и vs й (and indeed, I wonder if Czech don't already do that?).
English is indeed a crap shoot, but it's not entirely fair to put French in that bucket as well. Much as you describe for Polish, there is a highly consistent set of writing-to-sound rules for French that you can learn quickly. French's downfall is in the other direction: with multiple ways to write some phonemes and many contexts where letters are silent, hearing a word, even perfectly, is not nearly enough to know with any certainty how to spell it.
But reading French out loud? Anybody with a week or two of practice should definitely be able to know how to pronounce well over 95% of what they read, even if they don't know what they're saying.
You're completely right. There's also a predictable syllable stress pattern. I taught a friend once to read Polish out loud fluently without understanding it.
Irish and Korean are also extremely regular. Hangul (Korean) is probably the easiest alphabet in the world to learn to pronounce. Irish suffers from the use of digraphs in modern use, but the older style of placing a dot on top of the letter instead of an h afterwards is much clearer IMO.
Could have done what Serbian (and formerly Serbo-Croat) does: freely added letters to the Latin alphabet to match the required set of phonemes, so that there's a one-to-one correspondence between the Latin orthography and the Cyrillic orthography.
(It doesn't solve the problem of missing letters and missing diacritics when trying to write your language using English letters only — but Serbs, Ukrainians etc. can't write their language correctly using only Russian letters either.)
Wikipedia says Mandarin has all of /t͡s/ ⟨z⟩, /ʈ͡ʂ/ ⟨zh⟩, /t͡ɕ/ ⟨j⟩, /t͡sʰ/ ⟨c⟩, /ʈ͡ʂʰ/ ⟨ch⟩, /t͡ɕʰ/ ⟨q⟩, /s/ ⟨s⟩, /ʂ/ ⟨sh⟩, and /ɕ/ ⟨x⟩, all contrastive. The fricative and affricate sounds in English that are kinda close to these are only /t͡ʃ/, /ʃ/, /s/, /t͡s/, /ʒ/, and /dʒ/.
I think these contrasts represent a very underacknowledged difficulty for English speakers learning Mandarin, because we're used to thinking of tones as the only hard thing. I'm still struggling to properly pronounce a friend's name that I think begins with /ʈ͡ʂʰ/. Maybe Polish speakers can deal with this easily!
My pronunciation of these consonants was improved a lot by reading the respective Wikipedia articles and imitating the described tongue position. That got me interested in the rest of the IPA, and for a while I spent idle time making strange sounds with my mouth.
Retroflex consonants are still awkward, but at least I can produce them reliably. What's really hard for me are uvulars, they feel a bit like choking every time.
I was a dropout, but I wish I had been a linguistics major!
There was a post on Language Hat or Language Log where a professor recounted meeting a student who had linguistics "as a hobby", which was astonishing to the professor -- and the student explained that this was possible nowadays because of Wikipedia. (To which I could add, for some people because of the conlang community, or because of linguistics blogs!)
Anyway, Wikipedia's coverage of linguistics topics is generally really excellent and detailed (maybe strongest in phonology, perhaps because of a few super-obsessed editors).
It's a trap! Roughly speaking, the ones on the right are those. The ones on the left are different and they correspond historically to something like сь, чь, зь (though the pronunciation is rather different from standard Russian pronunciation).
well, sure, it's not uncommon even for ppl who speak the same language and use the same letters to pronounce things differently.
the point is I don't think Latin vs Cyrillic in Poland's case was driven by Polish language phonetics per se. I would venture to guess this had more to do with political/religious affiliations etc.
Absolutely. Mieszko I chose to convert himself and his realm to the Roman faith for both political and religious reasons. Had he chosen to accept Christianity from the East (remember, this was pre-schism), it would have entailed the adoption of Cyrillic.
Czech also did a great job of this (unfortunately the word Czech itself seems to come to English through some other spelling system). A few years ago I tried to learn a bit of the language before travelling to Prague, and I was pleasantly surprised by how easy it was to learn to read and write.
Prior to changes introduced by Jan Hus (in this case, the introduction of the háček), Czech shared digraphs that are still used in Polish. "Cz" was ultimately replaced with "č". It is possible that the archaic form somehow made it into English and survived (though my sources tell me that the spelling has only been present in English since the 17th century, or after Hus' reforms).
"The latter would have been a much better fit for the sounds in the language."
The Cyrillic alphabet as originally developed for Old Church Slavonic lacks several sounds of Polish, namely the velarized /l/ (which has now become a semivowel /w/ except in peripheral dialects), and the palatalized affricates /ź/ and /ć/.
If you are a Russian speaker, you might think that Cyrillic could represent palatalized sounds by use of the soft sign, but that is not what the soft sign was actually used for originally. It was meant to represent the front reduced vowel /ĭ/ before the fall of the yers. So, Russian chose to represent its phonology by extending the Cyrillic alphabet in a way that was originally unintended, while Polish chose to represent its phonology by extending the Latin alphabet. How is either of these choices better than the other?
Worth noting that Serbian did its own thing, and added new Cyrillic letters for palatalized L and N as separate letters Љ and Њ (which are obviously digraphs of Л and Н with Ь). Although that's a late 19th century invention.
Ah, but it's not really an example of Russian in that sense, since he made them separate distinct letters rather than reusing Ь as a modifier, as Russian did. So semantically a rather different approach that just looks similar because it appropriated the shape.
Polish orthography is weird (subjective opinion!) mostly because of its weird digraphs, IMO. Czech and Serbian also use Latin, but they are easier to parse.
I'm not a linguist, but looking at the differences between the corresponding alphabets, it feels like the Polish one had a stronger German influence. The use of W rather than V stands out in particular (and makes no sense in an alphabet that doesn't use V at all!). But also, Germans love their digraphs and trigraphs.
1. Any suitable choice of orthography will become unsuitable eventually. Pronunciation is not static.
2. There is, and has been, quite a bit of dialectal variation within Poland and its diaspora. What may be suitable for one group may not be for another.
A small correction to the OP: paczki means packages and not boxes (pudełka). It used to be a much more common sign in Polish neighborhoods in the US before Poland leveled up to the first world.
http://buscon.rae.es/dpd/srv/search?id=BapzSnotjD6n0vZiTp is from the Diccionario panhispánico de dudas, 2005. It shows several examples of accented capital letters, like LA NACIÓN in the headline of a newspaper, and the phrase "ESTÁ PROHIBIDO FUMAR DENTRO DE LAS DEPENDENCIAS DE LA EMPRESA."
Yes it is. For some reason many people strongly believe that the Royal Academy (which oversees the language) forbids the use of accents on capital letters, even when it explicitly claims that this has never been the case:
The rule is to write Ç with cedilla even when capitalized, but from what I gather French keyboards make it inconvenient, so people often neglect it. On the other hand, it is true that there are no above-letter accents for capitals (in France; Quebec uses them).
French keyboards indeed have diacritics available, such as é, è, à, ç, etc. but shift+those keys is another character.
On top of that, at least on Windows (I'm not sure about Mac), the caps lock key is not a caps lock, but a shift lock. Which means it doesn't capitalize characters, but does the same as pressing shift+key.
Funnily enough, in other French locales (IIRC at least Belgium), the caps lock key on Windows is a caps lock key.
I lived in a Polish neighborhood in Brooklyn for a few years, and absorbed as much orthography and pronunciation as I could during that time. I ate a few pączki, too.
While I learned to speak a little bit, it was not enough to be functional beyond basic greetings and ordering food. It was always a lot of interesting fun, though.
The main outcome of all this is that for ten years, whenever I see a Toyota Camry, my weird brain thinks "hmm... tsahm-rih..." and I roll my eyes at it.
Ugh, this reminds me of character encoding issues... I hope I never meet the guy who invented ISO-8859-2 or CP1250 or other such nonsense for that guy's sake...
This brings back memories of Internet chats in the 90s, when almost nobody used Polish letters with diacritics (because it is faster to type without them and there were a lot of different incompatible encodings). Usually you could understand the meaning from the context, but sometimes the same kind of funny misunderstanding would happen, as there are quite a few words that become completely different but valid words without diacritics.
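To make the "incompatible encodings" problem concrete, here is a minimal Python sketch. It is only an illustration: the sample word is my own choice, and "iso8859_2" and "cp1250" are the codec names Python uses for the two encodings mentioned in this thread.

```python
# Illustration only: the same Polish text written under one legacy encoding
# and read back under another.
text = "pączki"

as_iso = text.encode("iso8859_2")  # ISO-8859-2 (Latin-2)
as_cp = text.encode("cp1250")      # Windows-1250

# The two encodings use different bytes for "ą", so the byte strings differ.
print(as_iso == as_cp)

# Bytes written as ISO-8859-2 and read as CP1250 come back garbled.
garbled = as_iso.decode("cp1250", errors="replace")
print(garbled)

# Other letters, such as "ł", happen to share a byte in both encodings,
# so mis-decoded text tends to look only partly broken.
print("ł".encode("iso8859_2") == "ł".encode("cp1250"))
```

Since only some of the Polish letters differ between the two, a reader could usually still guess the word, which probably made it easier to just drop the diacritics altogether.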
It is quite common to remove the diacritics if you are lazy or don't have access to the diacritics. That phrase becomes "Zazolc gesla jazn".
Most search engines find "Zazolc" and "Zażółć" equal because of that. This becomes a problem in case of the words like "paczki" (boxes) and "pączki" (donuts), which have their own separate meaning - as explained in the article.
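A rough sketch of how that kind of matching could work, in Python. This is an assumption about the general technique (Unicode decomposition plus dropping combining marks), not a description of what any particular search engine does, and `fold_diacritics` is a name made up for the example.

```python
import unicodedata

# "ł" has no Unicode decomposition, so NFD alone leaves it untouched.
# Assumption: map it to a plain "l", as people do when typing without diacritics.
NO_DECOMPOSITION = str.maketrans({"ł": "l", "Ł": "L"})


def fold_diacritics(text: str) -> str:
    """Return text with diacritics removed, e.g. for use as a search key."""
    decomposed = unicodedata.normalize("NFD", text.translate(NO_DECOMPOSITION))
    # Drop the combining marks (acute, ogonek, dot above) left by decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_diacritics("Zażółć"))
# The downside: distinct words collapse into the same key.
print(fold_diacritics("pączki") == fold_diacritics("paczki"))
```

The explicit entry for "ł" is the easy part to miss: every other Polish letter with a diacritic decomposes into a base letter plus a combining mark, but "ł" does not.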
In contrast to most European countries, in Poland we use the American keyboard layout with the "Polish (programmers) layout" keyboard setting in the OS.
You press ALT+A, ALT+E, ALT+L, ALT+S, ALT+C, ALT+Z, ALT+X to write "ą", "ę", "ł", "ś", "ć", "ż", "ź", respectively.
Some words written without Polish characters can become ambiguous without context. For example: word "łaska" - "mercy", written without Polish letter "ł" is "laska" - "stick".
Although, in the good old days some people used the British pound character in texts to express this letter, since '£' is visually similar to 'Ł' and more often available.
For other characters (ś,ź,ć,ż,ó,ą,ę) nothing like that was widely adopted. A workaround would be possible for 'ż' and 'ó', since they are (almost - see below) phonetically identical to 'rz' and 'u' respectively, but it wasn't popular, since it would most probably be perceived as a sign of very, very bad spelling.
*Almost, since some people claim they can distinguish these, but it's not a common ability.
When required, people just omit the diacritic signs and replace the letter with a regular ASCII letter, for example: ż->z, ł->l, ó->o, etc.
When used in (computer) writing this is very readable, but no one would do this in handwriting. I think the most common cases nowadays would be SMS messages (especially on dumb phones) or some weird displays that are unable to properly render letters with diacritics.
It's curious how we are simultaneously holding onto diacritics stubbornly (and the current orthography in general) while also continuously having lots of issues actually reproducing/handling them. One would think that we would either have gotten better at dealing with them or dropped them altogether. Especially considering that modern fixed orthographies are generally a relatively recent phenomenon.
Usually the diacritics exist because of something that is contrastive in the language's phonology. This post is mainly about a concrete example of this, where pączki and paczki refer to two different things that a shop could sell. So it's super-helpful that shops can make that distinction in writing!
In Portuguese, my strongest foreign language, I can think of examples like
a 'the (feminine)' / à 'at the (feminine)'
nó 'knot' / no 'in the'
dá 'gives' / da 'of the'
nós 'we' / nos 'in the (plural)'
sê 'I should be' / se 'oneself' / sé 'see (Catholic)'
pode 'can' / pôde 'could (past)'
avô 'grandfather' / avó 'grandmother'
tem 'he/she has' / têm 'they have'
among many others.
At least a couple of these reflect vowel differences, although some would be homophones in speech. Every language that holds on to diacritics would have pairs like this where the diacritics make a difference to the meaning. So people may really appreciate having writing systems that can reflect these differences in order to avoid confusions that would otherwise occur.
Obviously naively just stripping diacritics won't work. But that doesn't mean that diacritics are the only option for orthography; for example in nordics ä and ö (or the equivalent norwegian/danish ones) could be substituted with ae and oe, because those do not occur otherwise in the language, and so the orthography remains unambiguous.
Maybe a good example of related development is the disappearance of Þ (thorn) from english orthography, being replaced with th.
> Maybe a good example of related development is the disappearance of Þ (thorn) from english orthography, being replaced with th.
This didn't have quite the full benefit that you describe, though, because of compound words like "cathouse", "hothouse", "lighthouse", "outhouse", etc. And of course digraphs can be extra-risky whenever languages accept loanwords.
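A tiny Python sketch of why the digraph is lossy. `restore_thorn` is a hypothetical helper written for this illustration: it naively assumes every "th" stands for the old thorn, which is exactly the assumption that compound words break.

```python
def restore_thorn(word: str) -> str:
    """Naively treat every 'th' as the digraph that replaced thorn."""
    return word.replace("th", "þ")


# "thought" really does start with the th sound; the compounds do not contain it,
# but the spelling alone cannot tell the two cases apart.
for word in ["thought", "lighthouse", "outhouse"]:
    print(word, "->", restore_thorn(word))
```

Going from þ to "th" is always possible, but going back needs knowledge of where the word boundaries inside a compound fall, so the mapping is not reversible from spelling alone.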
It should be a vowel sound similar to the English words "thought", "dawn", "fall", and "straw", except nasalized. So "pawn" is probably closer than "poon". (If any one of those has a different vowel sound than the others in your dialect, use the more popular version.)
There's no need to know the singular for pierogi, because no one has ever eaten just one.
In return, I promise to continue working to stop Polish people from pluralizing potato chips as "chipsy".