Wikipedia talk:Categorization
Archives | |
---|---|
Archive no. | To approx. |
Archive 1 | |
Archive 2 | |
Archive 3 | |
Archive 4 | 20 August 2004 |
Archive 5 | 8 September 2004 |
Archive 6 | 30 September 2004 |
Archive 7 | 22 January 2005 |
Archive 8 | 7 August 2005 |
Archive 9 | 12 January 2006 |
Archive 10 | |
Archive 11 | |
Archive 12 | |
A different perspective: Topical and Instancial categories in Wikipedia
It seems obvious to me that Wikipedia has two category types:
- Topical categories, which include a broad range of articles and subcats related to a given subject. Sometimes such a cat contains mainly subcategories. Examples: Category:France, Category:Biology, Category:Literature, Category:DNA.
- Instances categories, which list articles sharing a common property: Category:1945 births, Category:Bridges in Canada, Category:Australian poets, Category:Fruit, Category:Furniture.
Sometimes, the exact boundaries are not so clear-cut, but this is typically due either to a cat's small size making a split unnecessary or to a cat that should be renamed or split: Category:Fruit could do with a split to make it a true topical cat, as it currently has mostly instances in a topically-named cat. It is also pertinent to mention that instancial cats should have as few topical articles as possible.
As far as I can see, 90% of the cases of difficult categorization are due to an article being in both a topical and an instancial cat. In the case of Suspension bridge/Category:Suspension bridges it is in an instancial cat that is a child of a topical cat (Category:Bridges). Delhi Metro/Category:Delhi Metro is a topical cat being a child of an instancial cat (Category:Metros in India). I think it evident that an article should not be in two cats of the same type if one is a child of the other. However, when the two cats are of different type, use of common sense will most often allow a user to determine a reasonable solution.
Sometimes, adding an extra cat to empty the parent can also solve the problem. For example, one could create something like Category:Bridge structures under Category:Bridges and move nearly all articles there, with the exception of Bridge. Another option is to take all the individual bridges subcats and move them into a subcat like Category:Individual bridges, made to hold only categories. This extra level allows both options for classifying Suspension bridge to be pertinent, if only because the article is then not in both a cat and its direct parent (kinda like the whole spelling issue). I used a similar scheme to clear individual bird species out of Category:Birds into Category:Birds by classification and other subcats (such as Category:Birds of prey).
Category:Articles by person is a case of strange categorization: relatively few people articles warrant a topical cat (although I can easily think of a couple more than the existing ones), making them the oddballs of Wikipedia categorization. However, since the topical cat Category:Bill Clinton is unreasonable to have as a subcat of any people cat (as pointed out above), the cat+subcat problem is not even present, in my opinion. However, I think a policy as to the categorization of Articles by person categories is necessary, if only for consistency. Look at the differences between the categorization of Category:Thomas Jefferson and Category:George Orwell, for example. Circeus 16:01, 15 January 2006 (UTC)
- I agree with your analysis about the different types of categories, if not all your conclusions. Categorization at Wikipedia is inconsistent, but that is partly by design. I think many people expect the categorization of articles to be a consistent classification system. That is NOT the case. There is a multiple overlapping tree structure which resists rigid rules. Categorization is a tool for browsing; as such, decisions have to be made category by category and topic area by topic area based on utility. The organization of cats must be useful and comprehensible. Eventually, we might determine all the different ways categories are used, and create rules to govern the categorization of each type. I don't think we are there yet; things are still evolving.
- As for your suggestions of adding more categories to divide hybrid categories. I suspect that this will make sense in some cases. But in many cases (like Category:Suspension bridges) I think it leads to overcategorization. You would be replacing one category with three, adding consistency to the classification of articles to the detriment of utility. I don't think this makes sense. If the categorization were overly confusing, and the division of one category into three removed the confusion, that might make 3 categories more useful than 1, but I don't think that is often the case. Certainly, if it does make the categories more useful, they can be divided. But I would resist the common temptation to make a pristine structure of categories the top priority. That path leads to great frustration.
- As for the categorization of Category:Thomas Jefferson and Category:George Orwell, George was classified by neither the old guidelines nor the new ones. His article belongs in many more of the cat groupings of which George was a member; the cat probably should be removed from a few. Since most of the articles relate to literature, the cat probably belongs in the literature cats and perhaps a few others. Tom was classified according to the new guidelines and looks ok.
- A category will sometimes combine both of your types in one category. It will be interesting to see how this evolves over time. In the case of Category:Bridges, the subcategories are your Category:Individual bridges by type, while the articles are your Category:Bridge structures. If this is explained in the heading of the category, it should be understandable and easy to use. Perhaps your suggestion will make sense for situations when the subcats and articles are both mixed types. Let us see how this evolves over time. -- Samuel Wantman 21:12, 15 January 2006 (UTC)
- As for your suggestions of adding more categories to divide hybrid categories. [...] My pseudo-proposal for bridge was just to make my explanations clear by applying the two solutions that are most often applicable to the current example. The extra category worked very well to unclutter Category:Birds. I did not intend them to be the normal responses in every case, but at least directions to consider when appropriate.
- A category will sometimes combine both of your types in one category. True, and that is something I believe should be avoided. Otherwise, your counterpoints on how Wikipedia is still evolving and edited by many stand true. Circeus 21:35, 15 January 2006 (UTC)
Circeus Topical categories and Instances categories correspond, more or less, to what the object model in Computer Science often calls "is-a" properties versus "has-a" properties or what Aristotle called (I believe) "essential" and "non-essential" properties. -- Jmabel | Talk 04:42, 17 January 2006 (UTC)
- I think a simpler way of talking about it is to draw an analogy to article space. Some categories are equivalent to "see also" lists (or lists of topics) in article space. Other categories are equivalent to "List of..." articles in article space. Because the category system doesn't make a clear distinction between the two, they tend to get mushed together. There are quite a few non-people in Category:People, for example. Some people care deeply about that, while others don't care at all. More people would care if categories could be used for searching, and not merely browsing. If, for example, you could take the intersection of two or more categories (including subcats), there would be more incentive to keep the lists pure. Mirror Vax 08:30, 17 January 2006 (UTC)
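The intersection idea mentioned above can be sketched in a few lines. This is only a minimal illustration, assuming a toy in-memory category graph: the dictionaries `ARTICLES` and `SUBCATS`, the placeholder article titles, and the helper names are all hypothetical and say nothing about how MediaWiki actually stores or queries categories.

```python
from typing import Dict, Optional, Set

# Hypothetical toy data: category -> articles placed directly in it.
ARTICLES: Dict[str, Set[str]] = {
    "Musicians": set(),
    "American musicians": {"Musician A", "Musician B"},
    "Bassists": {"Musician C"},
    "Female bassists": {"Musician B", "Musician D"},
}

# Hypothetical toy data: category -> its direct subcategories.
SUBCATS: Dict[str, Set[str]] = {
    "Musicians": {"American musicians", "Bassists"},
    "Bassists": {"Female bassists"},
}


def all_members(category: str, seen: Optional[Set[str]] = None) -> Set[str]:
    """Collect articles in a category and, recursively, in its subcats."""
    if seen is None:
        seen = set()
    if category in seen:  # guard against cycles in the category graph
        return set()
    seen.add(category)
    members = set(ARTICLES.get(category, set()))
    for child in SUBCATS.get(category, set()):
        members |= all_members(child, seen)
    return members


def intersect(*categories: str) -> Set[str]:
    """Articles that belong to every given category, counting subcats."""
    sets = [all_members(c) for c in categories]
    return set.intersection(*sets) if sets else set()


american_bassists = intersect("American musicians", "Bassists")
```

With this kind of query, an article filed only in the subcategory still counts toward the parent, which is why the result depends on the lists being "pure" in the sense described above.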
An interesting example of the separation of topical and instancial categories is the pair of categories Opera and Operas. The former is a topical category about Opera and the latter contains instances of operas. The difference in usage is delineated on the Opera category page. "Operas" is placed as a subcategory of "Opera". --LiniShu 13:09, 26 January 2006 (UTC)
- Another example is Disasters and Disaster, the former being a category containing Disasters, the latter being a collection of general articles related to the concept of disaster. A while ago I tidied these categories up, but since then things have gone a bit messy again, despite the text I put at the top of the category to explain the suggested structure. It seems that people just categorize without going to check whether what they have done makes any sense in the wider scheme of things. Carcharoth 19:19, 29 January 2006 (UTC)
Images in categories
I've been noticing more and more images being added to categories. There's nothing on this page about it, but I was under the impression that it was a no-no. Anyone know where this has been discussed? -- Samuel Wantman 11:53, 17 January 2006 (UTC)
- I've seen images in categories as well, and what is worse, no visible way to remove them. Here is one example I found Category:Volcanic_events. How to get rid of those silly pictures?? I've also noticed that some people insert article templates at the top of a category, instead of an explanation of the category structure (which is what I thought was supposed to go there). A good example of this is Category:Nature. Another thing I have noticed is people putting links to categories in the "See also" section of articles, which seems to violate the "no internal references" policy of Wikipedia. I've done this myself sometimes, linking to a category page from a disambiguation page, rather than linking to a list in an article. All seems to be a bit random sometimes. Carcharoth 19:25, 29 January 2006 (UTC)
- I removed all the pictures from Category:Volcanic_events and requested comments here. I'll remove pictures from other categories the same way, and see if we can generate some discussion. I think categorization of images should happen at commons. -- Samuel Wantman 10:28, 3 March 2006 (UTC)
- I agree that most of the time, adding images to categories is not helpful. However, I don't know if they should be forbidden outright, because there are times when they can be valuable, like Category:Hieronymus Bosch paintings. - EurekaLott 18:13, 4 March 2006 (UTC)
- Even the example you cite probably should NOT have pictures in it. Here's my reason: If you click on one of the paintings you do not get a page that is user friendly to a novice reader. Many people might be baffled to end up on an image page. Also, the images are so small on the category page that I can only make out one or two iconic images. A better way to display these images would be to create a new page Hieronymus Bosch paintings, add all the images, and briefly discuss them. As the discussions for each painting lengthen, they could be turned into separate articles. I have created the article and removed the images from the category. Take a look and tell us what you think. --Samuel Wantman 21:56, 4 March 2006 (UTC)
- Hello to all. Creating guidelines about this topic will be very helpful. Can the built-in ability to put images into category pages be used to advantage?
- A "summary article" for images, like Hieronymus Bosch paintings, is a good place to merge stubs about the topic, and later, for an introduction to see: Main Article: X. An appropriate lead paragraph needs to be authored for the summary, but this may duplicate information in the detailed articles. The summary can also suffer like a "List of" article - with ranges of incomplete information to fill the hierarchy.
- Using Commons can also be confusing to novice users coming over from Wikispace. For editors, Commons image categories may be sorted differently than at Wikipedia, a problem if you want "chair" photos and the only commonscat is "furniture". Links to Wiki articles are easier from a Wiki category page than from a Commons category.
- Use this interesting feature of Wiki categories to quickly find available articles - and images. Text links to detailed articles in the category are on the same page as related images. Editors can quickly put images in categories, and the thumbnails are pre-formatted, with captions. The small thumbnail size can be enough to identify paintings and other objects, even if useful only sometimes. Do you think the sample message below will help to clarify the issue? (insert at top of category pages w/ images):
Archiving
I haven't carefully checked the last few "archivings" of this page, but please be aware that archiving all discussion up to the actual day that you're making the change is probably not a good idea. One should check to make sure that earlier discussions aren't ongoing, and that appropriate time has been given for recent topics to be discussed. I think a good rule of thumb is to not archive any discussion that's more recent than a couple of weeks. - dcljr (talk) 12:14, 17 January 2006 (UTC)
- The archiving was conversation up to about a week ago. It included the discussion up to the implementation of the new re-write which just "went live". I archived the page because the discussions were related to a page that was no longer being displayed. If you read through them you might be confused about what was being discussed. That said, I should have noted this when I archived everything. For anyone looking for the discussion about the re-write, it is in the last archive, but the issues have been discussed since the time that categories were implemented in Wikipedia, roughly two years ago. It makes for an interesting read. -- Samuel Wantman 21:24, 17 January 2006 (UTC)
Categories and subcategories
There is a guideline that reads "An article should not be in both a category and its subcategory". Why is this?
I think there is a practical utility in placing an article in categories that are linked vertically. Take languages for example. People who look for a specific language, say Bulgarian, but do not know to which family it belongs, might start in a category called Languages, where items are listed alphabetically rather than genetically. At the same time, putting Bulgarian also in the category of Indo-European languages and Slavic languages can help them find the article if they know Bulgarian belongs to one of these families.
Am I missing something? --AdiJapan 07:20, 4 November 2005 (UTC)
- If you want to find Bulgarian, can't you just search for it? Another solution that some people are fond of is to create a list, like List of languages.
- Adding all subcats to the parents would quickly spin out of control. While I probably couldn't dig through the hierarchy underneath Category:Reptiles to find a Gecko on my first try, I wouldn't want to see (some large number) of all the reptile articles listed on a flat page (and, nevermind that Reptiles is actually a subcat of its own, so to be fair, we would need to list them all in the animals category too). The same could be said for all subcategories, even though your example (languages) has a fairly small cross-section so it gives the illusion of being manageable. Neier 12:58, 4 November 2005 (UTC)
- Think of the enormous redundancy if pages were listed in both. I think the guideline makes sense as it stands. Note, later in the page it allows for some exceptions. John Lennon is the example: he is in his own category and also the Beatles category which is the former's "grandparent." This makes sense in this case, but redundancy of this sort should be the exception rather than the rule. Marskell 15:25, 4 November 2005 (UTC)
Thanks guys. I guess a list would be the best thing in this case. --AdiJapan 15:52, 5 November 2005 (UTC)
- Not sure anyone's still reading this here but, in my opinion, it doesn't make sense to have both a subcategory and article listed on the same page. What is the advantage to that? Personally, I read top to bottom so I'll immediately see the subcategory and then click on it. In the subcategory, I'll see not only the article pertaining to the subcategory but also lots of other related articles. I wouldn't have seen the related articles if I'd first seen the article in the parent category and just clicked on it. In this regard, listing them both on the parent somewhat defeats doing the subcategorization at all. Detrimental and redundant. And the only advantages I've heard in scanning previous discussion of the topic is to emphasize that the article is important enough to warrant a subcategory. That seems flimsy at best. Again, just my opinion! :) wknight94 19:55, 17 December 2005 (UTC)
- Not sure that this is the right way to go about this, but I copied this from the archives page because I wanted to make a similar point.
- I strongly feel that the contents of subcategories should sometimes and maybe always be in the category pages. Take the various Musician pages, which is the category that got me thinking about this issue. There's a Category: Musicians, there's a Category: American musicians and there's a few Category: [PARTICULAR STATE] musicians. By Wikipedia rules, anyone who's in American musicians should not be in Musicians, and anyone who's in, say, New York musicians should not be in American musicians.
- Now ask yourself: What's the use of a category that is effectively "Musicians except musicians who someone has chosen to put in a geographic subcategory (or some other kind of subcategory)"? Who would possibly be interested in that particular subset of musicians? If you're interested in Musicians in general, you have to browse through the very long list of every subcategory by nationality. And if you're interested in American musicians, you can look in "American musicians who do not come from a state that has a subcategory, or who come from such a state and have not been put in that subcategory." Again, what's the use of that particular subset? And then you'd have to look at musicians from each of the subcategorized states.
- If no one is interested in looking at musicians in general, or in American musicians, why have those categories at all? Because the current system is certainly not serving people who are interested in those categories.
- The arguments against including subcategories in categories are that the categories will become very large, and that including categories in subcategories creates redundancy. I don't think either argument makes sense. If you bought a book with biographies of musicians around the world, you would expect it to be a big book--there are a lot of musicians! Likewise, a Wikipedia category that includes all musicians in Wikipedia ought to be big as well. (As for the awkwardness of such a large category, presumably it would be dealt with by breaking it into separate pages alphabetically, just like any reference work does--Musicians A-B, or even Musicians A-Ab.)
- And redundancy is really not something an online encyclopedia should be worried about--space, thank gosh, is not an issue. You can use a category tree to organize the material in every way you think people would find useful, using virtually no additional bandwidth. You ought to be able to find a category that includes all the Rock musicians and all the Punk musicians and all the Hardcore punk musicians--and why wouldn't you want to put a hardcore punk band in all three categories?
- One more example: There's a category called Female bassists. And there's a category called Bassists. One's obviously a subcategory of the other. Do you really want to remove all the female bassists from Bassists, thereby turning it into Male bassists? That's essentially what the categorization rule does with every category.
- At the risk of being a little redundant myself, the question of what the main category is for is I think key. What possible interest would people have in a category of leftover articles not placed in subcategories? Either people are interested in looking at the articles in that category, in which case the articles in subcategories should be in the category, or they're not, in which case why have the category at all?
Nareek 16:08, 20 January 2006 (UTC)
Wikipedia:Categorization just went through a rewrite which includes new guidelines that deal with this very issue. The discussion you copied from the archive was talking about the previous guideline which was different. Here are the relevant new guidelines:
- Articles should not usually be in both a category and its subcategory. For example Golden Gate Bridge is in Category:Suspension bridges, so it should not also be in Category:Bridges. However there are occasions when this guideline can and should be ignored. For example, Robert Duvall is in Category:Film actors as well as its subcategory Category:Best Actor Oscar. See #5 for another exception. For more about this see Wikipedia:Categorization/Categories and subcategories
- Check to see where siblings of the article reside. If there are few if any articles in a category, the article probably belongs in one of the subcategories.
- Articles should be placed in categories with the same name. However, the article and the category do not have to be categorized the same way. The article can also be placed in categories populated with similar articles. The category can be put into categories populated with similar subcategories. For an example of this see George W. Bush and Category:George W. Bush.
So, as you can see, it doesn't say that they can't be in both a category and its subcategory; it says that they should not USUALLY be in both. The subpage goes into more detail about this. Many people (including myself) argued for this flexibility. I believe a new emphasis should be made on making decisions about categorization that make the most sense for the subject matter, and once the decision is made, to apply those decisions to ALL the articles and subcategories that relate to the subject. If articles are going to be duplicated between a parent and child category, it makes sense for both categories to be complete -- to have ALL the articles that belong there.
That said, there are many times that it makes sense to not have the contents of subcategories in all the parent categories. If the subject is big, then it makes it very difficult to browse through a subject if everything is there. If there is a complete set of subcategories populated with a large number of articles, then it makes sense to only put the articles into the subcategories.
Please read the new guidelines. I suspect that they need some clarification about this, and perhaps you could help with this. -- Samuel Wantman 16:55, 20 January 2006 (UTC)
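The first quoted guideline (not usually in both a category and its subcategory, with agreed exceptions) can be read as a simple check over a category tree. The following is a minimal sketch under stated assumptions: the `PARENTS`, `ARTICLE_CATS` and `ALLOWED` tables are hypothetical illustration data built from the examples named in the guideline, and the Golden Gate Bridge entry is deliberately placed in both categories here only to show what would be flagged.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical data: category -> its direct parent categories.
PARENTS: Dict[str, Set[str]] = {
    "Suspension bridges": {"Bridges"},
    "Best Actor Oscar": {"Film actors"},
}

# Hypothetical data: article -> categories it has been placed in.
ARTICLE_CATS: Dict[str, Set[str]] = {
    "Golden Gate Bridge": {"Suspension bridges", "Bridges"},
    "Robert Duvall": {"Film actors", "Best Actor Oscar"},
}

# (subcategory, ancestor) pairs where duplication is an accepted exception.
ALLOWED: Set[Tuple[str, str]] = {("Best Actor Oscar", "Film actors")}


def ancestors(category: str) -> Set[str]:
    """Return every category reachable by walking up the parent links."""
    found: Set[str] = set()
    stack = list(PARENTS.get(category, set()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS.get(parent, set()))
    return found


def duplicates_to_review() -> List[Tuple[str, str, str]]:
    """List (article, subcategory, ancestor) triples not covered by an exception."""
    flagged = []
    for article, cats in sorted(ARTICLE_CATS.items()):
        for cat in sorted(cats):
            for anc in sorted(ancestors(cat) & cats):
                if (cat, anc) not in ALLOWED:
                    flagged.append((article, cat, anc))
    return flagged


flagged = duplicates_to_review()
```

The output is a list for human review rather than an automatic fix, in keeping with the guideline's "not usually" wording.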
- Thanks, that's very helpful. So I guess the upshot is that subcategory articles can be included in categories in some cases, and it should be decided on a case by case basis? If that's a fair summary, maybe I should take this discussion to the Category:Musicians talk page.
- Yes. The upshot is that it should make it easier for users to browse through the categories, and fit one of the exceptions mentioned on the subpage. If it does not fit one of the exceptions, but there is consensus to have the duplications, you should probably let us know on this page so we can consider whether the guidelines need some tweaking. -- Samuel Wantman 19:07, 20 January 2006 (UTC)
- I've tweaked it. It was out of touch with what actually happens, eg on treatment of members of hyphenated American sub-categories. But I still think it is too long and complicated and needs to be made a lot easier to follow. Golfcam 22:38, 24 February 2006 (UTC)
- I very strongly object to a swing to dogmatism in the opposite direction. There will be many cases where this sort of multiple categorisation is very messy because it will push many of the subcategories off of the first page. It will also generate immense category clutter on some article pages. And in any case, how do you decide which categories should have topic articles? There are many which don't exist, but could or should. This proposal is for a scattergun approach, when we should concentrate on flexibility and accuracy based on what is best in each area. CalJW 05:46, 2 March 2006 (UTC)
- Has anyone considered the ramifications for sorting efforts? For instance, there are a number of people (and projects) who make efforts to make sure all articles about musicians are properly categorized into all the necessary subcategories. It's much harder to see what has and hasn't already been sub-cat'ed if all the articles are still in the top level category. As has been repeatedly mentioned, there's a time and a place for everything, but I like the idea of having the general guideline be to not duplicate into higher level categories. That's my $0.02. B.Mearns*, KSC 19:52, 24 March 2006 (UTC)
Professions subcategorized by Nationality
General discussion
- Note: This conversation started at Category talk:Film actors.
I would like people working in this category to consider adding ALL film actors back into this category. The decision to subcategorize everyone was made before there were category TOCs. Nationality is an artificial distinction to many actors who are multi-national. I could see the utility of being able to browse through a category of all actors. You would still have the choice of browsing by nationality. -- Samuel Wantman 21:15, 21 January 2006 (UTC)
- I think that's crazy. I don't see how you're supposed to browse a category with thousands of articles in it. The table of contents helps if you know the name...which you don't need a category for in the first place. What you're proposing defeats the purpose of subcategories and goes against precedent in other categories. For instance, it's possible that I'd want to browse thru all American people, not just American botanists, or Californians. But we don't keep Americans in both categories, because being in the subcategory automatically puts you in the parent category. You can always browse the subcategories, too. Categories as big as film actors was last week are simply unbrowsable, but when subcategorized, they can still be browsed, while people can also check out Category:Mexican film actors and so on, getting more information without any more effort. NickelShoe 21:57, 21 January 2006 (UTC)
- As per Wikipedia:Categorization#Guidelines. It does make sense, for instance, for award-winners to be both in the subcat and the parent cat, because award-winners are a small amount of actors, and because a browser might not have any idea an actor won the particular award. But all people have nationalities, and where the nationality is a question, then we can leave it here in Category:Film actors or in multiple subcats, such as Category:American film actors and Category:Canadian film actors. Duplication here doesn't seem any more useful than duplication at other occupations, and such a large occupation makes duplication more cumbersome here. NickelShoe 22:06, 21 January 2006 (UTC)
- It is not as big as you think. Most of the film actor by nationality subcategories do not have many people in them. I suspect the combined listings would be about three times bigger than the American listings. It doesn't defeat the purpose of the subcategories; they would still exist. The past precedent was because of an inability to easily browse through large categories. With a table of contents, this would not be difficult to browse through. I am not proposing getting rid of the subcategories; I am just proposing to add the complete list of film actors here. I do think there is a point when large categories would be unwieldy, probably when there are more than about 600 names for each letter of the alphabet. I don't think this category would be that big. I see advantages, and I don't see any disadvantages. It has never been against policy to have articles listed in categories and their grandchild subcategories. The guidelines (whose rewrite I just finished facilitating) say that these decisions should be made to help people browse through categories. Since I am just proposing to add to what we already have, I don't see how it makes things worse, and it might make it better for some people who would like to see the complete list. For example, let's say someone is trying to remember the name of an actor and all they remember is that it started with a B (or was it a D?). The way things are now, it would be hard to find if you don't know the nationality of the actor. It would be easy if they were all here. -- Samuel Wantman 22:20, 21 January 2006 (UTC)
- It seems very awkward to me to have only the people with unknown or multiple nationalities listed on this page. It should be all or none. -- Samuel Wantman 22:21, 21 January 2006 (UTC)
- I think the line of unbrowsability is much lower than that, and in any case, this category was that big. I've moved hundreds of people out of it in the past few days. I know several of the subcats are pretty small--like Category:Welsh film actors. But there's a lot of nationalities, and if you don't pull out some nationalities into subcats, this category has four hundred actors per letter (because there are plenty of people not in here before I started subcatting). Well, I wasn't counting, but you're free to look at my contributions to get an idea.
- I can only speak for myself, but if I can only remember what letter something possibly started with, the category is actually unuseful to me around fifty. I'm sure others can use it up to two hundred.
- I suppose leaving unknown nationalities here is a little weird...I was thinking as a temporary solution until they were either in multiple subcats or somebody actually included their nationality in the article. NickelShoe 22:30, 21 January 2006 (UTC)
- Since you don't like browsing through more than 50 names you would not be forced to, you could browse the subcats. But many people might. It certainly helps deal with the people of unknown nationality, and in general might make for less work. The natural inclination is for editors to add people to this category. Why fight it? I can imagine that this would be a shock to you if you've just recategorized hundreds of articles. All I can say, is that I'll help you put them back. I would like to hear from others about this. And, BTW, I'd like to make the same change to film directors. Quick, can you tell me what the nationality is for Roman Polanski? -- Samuel Wantman 22:40, 21 January 2006 (UTC)
- But...if I know his name, why don't I just type that in? (And it should go without saying that I'm going to hold off any recategorization up or down until there's some kind of consensus here.) NickelShoe 22:45, 21 January 2006 (UTC)
- I guess I should be more explicit about my point about Roman Polanski. The point I'm making is that "by nationality" often constrains categories artificially. There is often nothing particularly notable about an actor or director being from one country or another. They are, like in Polanski's case, often from one country, raised in another, got famous in a third and then moved somewhere else. My point is that while some people might find the distinctions of nationality to be notable, it often is not. Because of that, I'd prefer seeing categories populated at a higher level when possible. This would not apply to Category:People by nationality because that is the entire point of the category. Everywhere else, when nationality intersects with profession, I'd like to populate the smaller categories by nationality and larger categories by profession unless nationality is integral to the profession (like politicians). -- Samuel Wantman 23:05, 21 January 2006 (UTC)
- I have also been doing some work on subcategorization. I guess my greatest concern is that a consensus be established and that the consensus then be posted at the top of the discussion page of the relevant categories. Otherwise, I believe, what we'd tend to get, (and what we probably had before the subcategorization effort) is some individuals only in the grandparent category and some individuals only in the grandchild (by nationality) category, and some in both. That, to me, is worse than the alternatives of either a very big grandparent category, or having to go to the grandchild categories to browse. Whichever way the consensus goes, I would help in the effort to make things consistent. I don't think the multi-nationals or unknown nationalities are a problem. Multi-nationals can be listed in multiple nationality categories. Unknown nationalities can be researched to determine their nationality. In general, I tend to prefer very small categories :P but I guess that with the TOC, 200 names (1 page) per letter, or less, is reasonable. -- LiniShu 23:28, 21 January 2006 (UTC)
I decided to do some analysis of this category. There are about 3300 entries in Category:Film actors. Almost none seem to be listed in the nationality subcategories. For the subcategories there are:
- 1113 Americans
- 3 Argentines
- 203 Australians
- 5 Austrians
- 83 British, which overlap with 107 English, 17 Scottish and 8 Welsh
- 59 Canadians
- 2 Danes
- 23 French
- 5 Germans
- 13 from Hong Kong
- 7 Irish
- 4 Israelis
- 4 Italians
- 71 Mexicans
- 2 Poles
- 5 Russians
- 13 from Singapore
- 3 Spaniards
- 6 Swedes
At most there would be about 5000 total names in Category:Film actors which would not make the category very much more crowded than it is now. I also notice that there is already Category:Actors by nationality which makes me wonder why we would need Film actors to also be broken up by nationality. Film actors is already a subcategorization of Actors. -- Samuel Wantman 00:12, 22 January 2006 (UTC)
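The "about 5000" estimate can be checked with a quick tally. This is only a sketch using the counts listed above: it assumes the worst case, in which none of the subcategory entries are already in Category:Film actors, and it counts the English, Scottish and Welsh figures in full even though they overlap with the British figure, so the real total would be somewhat lower.

```python
# Counts copied from the list above (as of 22 January 2006).
subcat_counts = {
    "American": 1113, "Argentine": 3, "Australian": 203, "Austrian": 5,
    "British": 83, "English": 107, "Scottish": 17, "Welsh": 8,
    "Canadian": 59, "Danish": 2, "French": 23, "German": 5,
    "Hong Kong": 13, "Irish": 7, "Israeli": 4, "Italian": 4,
    "Mexican": 71, "Polish": 2, "Russian": 5, "Singapore": 13,
    "Spanish": 3, "Swedish": 6,
}
parent_entries = 3300  # approximate size of Category:Film actors

# Worst case: every subcategory entry is new to the parent category.
subcat_total = sum(subcat_counts.values())
upper_bound = parent_entries + subcat_total

print(subcat_total)  # 1756
print(upper_bound)   # 5056
```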
- I support the inclusion of e.g. American film actors in the Film actors category. It's true that many people might find the category too big to browse, and so would use the subcategories instead--but if people are interested for whatever reason in film actors in general, then as big as that is, that's what would be best for them.
- Look, my copy of the Video Hound book has thousands of movies listed in it, and sometimes I'll browse through it to see what I'll find. Other times I'll be interested in finding a specific kind of film and I'll browse through the book's categories. Why does what works in a printed reference work not work for an online encyclopedia?
- The point I want to stress is that the Film actors category ought to have some use, otherwise it should be a category empty except for the subcategories. Who would be interested in the set of "Film actors who are not listed in a subcategory"? To whom is that a useful group?
- When a consensus is reached, putting the guidelines on the top of each category is of course a great idea.
- The question of whether you need actors by nationality and film actors by nationality is a good one. In terms of what people might be looking for, it's not clear why having separate subcategories for say French actors and French film actors would be helpful. I've thought the same thing about musicians by state and musical groups by state--I spent an evening recently separating out the individual musicians from the musical groups into the two respective subcategories for Massachusetts, and now I'm wondering if someone interested in seeing the Massachusetts musical scene at a glance wouldn't be better served by a single category that combines both the individuals and the bands. Sometimes there's an obvious way to break articles up that really is not very useful.
Nareek 04:42, 22 January 2006 (UTC)
The more general question is "When is a category too big?" The push to subcategorize started before the implementation of category table of contents. I don't know if this question has been really addressed since then. My take on this is that categories get subcategorized into tiny pieces often because it is not possible to do a database selection. These small categories like Category:Polish film directors have minimal browsing value compared to being able to browse through Category:Film directors. These multi-attribute categories are often too small. As a guideline, it seems reasonable that categories be populated at the same level as articles about the subject. We don't have an article about Polish film directors but we do have one about Film directors. It also seems that above about 6000 articles a category would become very unwieldy. -- Samuel Wantman 08:40, 22 January 2006 (UTC)
- Thanks, Samuel, for taking the time to do some numbers analysis for this discussion :) And also, thanks for raising the question of "When is a category too big?" I think it would be helpful if we also had some consensus on that. Not a hard and fast rule, and I could imagine there being exceptions in some specialized areas of knowledge, but, I would appreciate, for example, when I look at Category:American actors and think, "wow, this category is really big", being able to go back to the consensus that has been reached, and to answer myself "ok, it's not too big, yet." Your suggested guideline of categories being populated at the same level as articles about the subject is a good one, with the possible exception of when that category is just absolutely too big per the consensus that I hope we'll arrive at.
- Another point I'd like opinions on - this discussion began with the question of whether articles should be in both the grandparent category Category:Film actors and the grandchild categories (Category:American film actors, for example); I would like to know whether the articles in Category:American film actors should also be listed in their other grandparent, Category:American actors. That is a category that might be over 6000 if all entries were listed there. (Sorry, I haven't counted yet how many are there right now). Actually, as I think about it, there may be many articles in Category:American actors or Category:Actors that are not yet listed in any of the subcategories of Category:Actors by medium. So, the count of about 5000 names for Category:Film actors might be higher if everyone was listed there who should be listed there. -- LiniShu 12:44, 22 January 2006 (UTC)
- Category:American actors has 4355 entries at this time. This would not count those that are only listed in a grandchild American actors by medium subcategory (whether recategorized by myself and others, or listed directly in the grandchild subcat in the first place.) Category:American actors currently has a too large tag on it. If you read the discussion page, it looks like there was previously a too large tag, which Samuel removed, noting so, with his reasons, on the discussion page, in August 2005. The tag was added again, without discussion, on 14 January 2006. I began, that same day, slowly, to depopulate the category into the American actors by medium subcategories. --LiniShu 13:21, 22 January 2006 (UTC)
- What exactly does "unbrowsable" mean? You can browse through an unabridged dictionary or a multi-volume encyclopedia, and plenty of people do. If you mean it's not practical to look through the entire category to find someone specific whose name you've forgotten, then that might be a pretty low number (depending on your patience). But there are plenty of other reasons you might be browsing a category like "actors." Maybe you want to see what kind of names actors have. Maybe you want to see how many actors are in Wikipedia. Whatever--if there aren't any reasons people might be interested in actors in general, then why have anyone in the category? The fact that a particular size might be cumbersome for a particular use is an argument for adding subcategories, but it's not an argument for eliminating the category. It's not like it's taking up physical space.
Nareek 16:34, 22 January 2006 (UTC)
My $.02: First, since previous upgrades to the Wikimedia software, large categories no longer pose the technical problems that they once did. So the size of a category alone should not be the rationale for breaking into subcategories. Second, "profession by nationality" is an extremely arbitrary breakdown -- many professionals simply do not have any intrinsic association with a nationality. So while I have no objection to creating these subcategories for those professionals who do have a strong association by nationality, these subcats should not mean that the primary category is de-populated. older ≠ wiser 16:57, 22 January 2006 (UTC)
So it sounds like there is general consensus for the notion that categories should be fully populated at the "topic article level". I still don't know the criteria for where the upper limit is. Is it Category:Film actors, Category:Actors, Category:Entertainers, Category:Celebrities or Category:People? I strongly agree with populating Film actors, am not certain about Actors, and tend to think anything higher is too much. One criterion is to not put an article into every possible category, because that clutters the article with categories. I'd propose that we only populate to the level of notoriety (there's probably a better way to say this). If people are notable for being Directors, that is the highest level of populating film people categories. There are still grey areas with this: Poets or Writers? Film actors or Actors? Any ideas? What criteria do we use to make these decisions? -- Samuel Wantman 21:06, 22 January 2006 (UTC)
- I favor categories that are not subdivided into a hundred subcategories. Unless one already knows that a professional is "Scottish" rather than simply "British", he or she might not be found by browsing at all. Categories are already sorted by alphabet. It'd be much simpler, overall, if people were identified as "Scots" and "Actors", rather than as "Scottish actors". -Will Beback 09:08, 23 January 2006 (UTC)
Points that I think we're on the way to establishing, so far:
- There really is no such thing as a category that is too big, by any kind of absolute standard.
- It is reasonable to keep categories farther up the hierarchy populated, to the "topic article level" and/or "to the level of notoriety" (which could maybe be defined by the opening sentence of the articles?) Our articles say things like "Clark Gable was an American film actor..." or "Sir Alec Guinness was an Oscar-winning English actor..." - note the use of the word actor rather than entertainer, celebrity, or person :)
- The contributors to this discussion have different preferences for what level of categorization/subcategorization they find helpful for browsing, and presumably our encyclopedia users will also. Even to the same individual, different methods of browsing may be helpful on different occasions, depending on the object of the browse. We can accommodate all by using the subcategories that currently exist or others that might be requested, and also by keeping categories farther up the hierarchy populated too, according to the criteria in the bullet point immediately above, or as requested.
- The possibility of "cluttering" articles with too many categories and subcategories should be considered, but is a secondary concern to having useful categories, possibly at multiple levels.
I've asked Brion Vibber (a developer) about whether there are any technical concerns about having large categories. -- Samuel Wantman 07:13, 24 January 2006 (UTC)
- Thank you --LiniShu 11:28, 24 January 2006 (UTC)
This is the response from Brion Vibber. "THERE IS NO ISSUE WITH THE SERVERS DUE TO LARGE CATEGORIES". So the only issue is the awkwardness of browsing large categories. -- Samuel Wantman 02:04, 25 January 2006 (UTC)
There is certainly no need for any pause to the improvements to the accuracy of categorisation. Things are much better than they were a year ago, but there is still lots to do. No search system is as good as a category system which does some of the work for people. I have found and categorised hundreds and hundreds of articles which were only in the subject category or only in the national category, and the combined categories filter articles through so they are in both. It shouldn't be forgotten that for many occupations and nationalities there are quite small numbers of articles, and without the precise categories they would belong in two big categories, so the six "Fooian xers" would be lost among say 1,252 Fooians (upmerged from 200 different categories) and 2,213 Xers (upmerged from 150 different categories). Both of these categories would be useless seas of unfamiliar names for most people. It is not a question of whether random browsing is possible, but rather a matter of helping people to target their browsing. People who know exactly what they are looking for can use the search box. By contrast the category system is a navigational tool which helps people to identify groups of related topics. This is especially relevant to people from smaller countries, as the few articles on people of each occupation of their nationality would get lost in a sea of foreigners. In any case starting the conversation with a category where nationality is not quite as relevant as in some other occupations (though it is still very relevant, especially for actors from outside the English-speaking world) has put this discussion on a somewhat false footing. CalJW 05:10, 25 January 2006 (UTC)
- I don't think anyone is advocating that we remove any of the categories that have been added over the last year. There would still be all the "Fooian fooers" categories. This discussion is about adding all the fooers from all countries back into a single category. Some people might look at these large categories and find them useless, but they could still move to the smaller subcategories. I often find the small categories useless and wish they were combined. This option to look at the contents in small pieces or in one large category would add flexibility to the categorization scheme and would serve the needs of more users. -- Samuel Wantman 06:18, 25 January 2006 (UTC)
- I am also very much opposed to dividing categories of people into subcategories solely based on nationality. I doubt that everyone browsing Film Directors really cares to look through the lens of the person's nationality. Why limit the ability to look at the category in different ways? It doesn't affect the subcategories at all; those are still listed at the top of the page and readily accessible. Cacophony 09:10, 25 January 2006 (UTC)
Proposed consensus statement for professions subcategorized by nationality
- In cases where subcategories of the type "Fooian fooers" exist, it is allowable and acceptable to have the individual fooer articles also listed on the "fooer" category page. This can be done on a case by case basis as editors/ users express the desire to be able to browse articles at the "fooer" level and as there are editors willing to work on the repopulation of the "fooer" category if necessary.
- The object is to accommodate different preferences in browsing as expressed in the discussion on this topic.
- The question of the effect of large categories (1000's of entries) has been raised with developers, and we are told that the nature of the category interface is such that large categories are not a drain on the server.
- The intention is that "ancestor" categories be populated up to the topic article level and/or the "level of notability" as indicated by the identification of the individual's profession in the opening sentence of typical articles in the category. This can be addressed specifically on a case by case basis. We would refrain from automatically populating every category up the hierarchy to Category:People, for example, so as not to clutter articles with categories.
--LiniShu 02:11, 26 January 2006 (UTC)
- I would go even further and say "it is recommended to have the individual articles also listed in the larger categories." Also, I'm wondering if this applies to other categories besides "fooers" and subcategorization other than "fooian". I'll look around the category structure and see what I find. -- Samuel Wantman 01:54, 26 January 2006 (UTC)
- Can we look at this from the other direction? Try categorizing a musician. If I followed the suggestion to put him or her in all the categories below the level of musician I might put him in Musician, American musician, American jazz musician, American saxophonists, and California musicians, not to mention whatever bands he belongs or belonged to. I don't think many people writing an article on a musician will put him in all the relevant categories if this is the rule, leading to incomplete parent categories and more work for those trying to maintain the system. There's also the drawback of dozens of category links at the bottom of the page.
- Perhaps a technical solution would be best. Put everyone in the most restrictive categories and have a way to specify that you want to see all the pages in the current category and all child categories (but I'm in danger of bringing in the circular categories discussion into this one so I'd better stop here). --JeffW 06:00, 11 March 2006 (UTC)
- Category maintenance is a very real concern, and there is a technical solution that is already possible. It would be possible to create a bot that would populate categories by looking for specific templates. This is similar to what is already done when categories are renamed. After consensus at WP:CFD the template {{categoryredirect}} is added to the category. A bot looks for categories with this template and moves all the articles to the renamed category. We could create a template, perhaps calling it {{Category duplicated}} that would add a message that says, "all articles found here are also part of category xxx" Xxx would be the parent of the category that gets repopulated. So for example Category:Polish film directors would have the message that says "all articles found here are also part of Category:Film directors". Whenever anyone adds an article to Polish film directors, it would automatically get added to film directors. Likewise, a bot could look for film directors without a nationality and add them to a Category:Film directors of uncategorized nationality. These are tasks very well suited for a bot. -- Samuel Wantman 09:06, 14 March 2006 (UTC)
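The two bot tasks described above can be sketched on a toy data model. This is not code for any real bot framework or for the MediaWiki API: the dictionary standing in for the proposed {{Category duplicated}} template, the function names, and the sample article titles are all hypothetical, and a real bot would read and edit page wikitext instead of Python sets.

```python
# Toy model: each article title maps to the set of categories it carries.
# `duplicated_into` stands in for the proposed {{Category duplicated}} tag:
# subcategory name -> parent category that should also list its articles.

def repopulate(articles, duplicated_into):
    """Add the parent category to every article in a tagged subcategory.

    Returns the (article, parent) pairs that were added.
    """
    added = []
    for title, cats in articles.items():
        for subcat, parent in duplicated_into.items():
            if subcat in cats and parent not in cats:
                cats.add(parent)
                added.append((title, parent))
    return added


def missing_nationality(articles, parent, nationality_subcats):
    """List articles in `parent` that are in none of its nationality subcategories."""
    return sorted(
        title
        for title, cats in articles.items()
        if parent in cats and not (cats & nationality_subcats)
    )


# Hypothetical sample data.
articles = {
    "Director A": {"Polish film directors"},
    "Director B": {"Film directors"},
    "Director C": {"American film directors", "Film directors"},
}
duplicated_into = {
    "Polish film directors": "Film directors",
    "American film directors": "Film directors",
}

added = repopulate(articles, duplicated_into)
uncategorized = missing_nationality(articles, "Film directors", set(duplicated_into))

print(added)          # [('Director A', 'Film directors')]
print(uncategorized)  # ['Director B']
```

In this sketch "Director B" is the kind of article that would end up in something like Category:Film directors of uncategorized nationality.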
Action item(s)
Possible general action items if the consensus statement is accepted:
- Is there cause to update Wikipedia:Categorization/Categories and subcategories section on "Reasons for duplication"? Is our "desire to browse articles at the topic article level" a distinct reason from those already mentioned? --LiniShu 02:21, 26 January 2006 (UTC)
Comment - There have been a relatively small number of editors involved in the discussion to this point, although it is in a central location as far as categorization is concerned. I would recommend that, after leaving the discussion open a little longer, we then proceed with the effort of making the categorization of individuals in the film, television and theatrical professions consistent. I would not be surprised, though, if the general topic was revisited in other times/places. Hopefully, though, the current discussion may be good groundwork for any future discussions. Thanks, Samuel, for all of your due diligence on the issue of category size. --LiniShu 02:40, 26 January 2006 (UTC)
- It sounds like a good plan to me--wait a bit, and if there are no further objections, use the actors category as an experiment to see how well a more-or-less fully populated category tree works. Then, if it seems like a good idea, we can meet back here and use it as an example of why the recommendation maybe should be changed. I'm kind of new around here; does that sound wikikosher?
- I would argue for consistency in dealing with categories up to Actors. (I could see including a populated category of People, for a guide to all biographies in Wikipedia, but that would be outside the scope of doing one branch as a test case.) I would say that Actors should be fully populated, as well as Film actors, American actors and American film actors. That's a minimum of four categories for every actor--that does not seem like too many to me. Thus: Category:Actors; Category:NATIONALITY actors; Category:FIELD actors; Category:NATIONALITY FIELD actors.
- Is there another kind of category that every actor should have? If we want to fully populate the tree, adding another category would mean adding four more categories to every actor (e.g., DECADE actors, NATIONALITY DECADE actors, DECADE FIELD actors, DECADE NATIONALITY FIELD actors). But we don't have to be strict about this: If we wanted to do the decade thing, for instance, we could say we'll only do it for a few specific types, like American film actors and British film actors. Then only those selected types would get an extra category, and those only one (DECADE NATIONALITY FIELD actors).
- Nareek 05:24, 26 January 2006 (UTC)
- Okay, but keep in mind that most actors aren't simply "American film actors"--they're also "Jewish American actors", "American film directors", "Entertainers who died in their fifties", and "People from New Jersey". I'm not saying this is the end of the world, but let's keep in mind we're not always talking about four categories. NickelShoe 05:47, 26 January 2006 (UTC)
- Very true, many will have more than four categories. But for purposes of this project, there should be a set of categories that should be given to every actor. Are those four it, or is there some other grouping that everyone should have?
- Assume that every topic level category is broken into subcategories by nationality. According to this proposal, each article ends up in two categories. If it is also broken by another attribute, that puts the article in 4 categories. 3 attributes yield 8 categories, and so on, with the result that N attributes yield 2^N categories. For this reason, restraint is necessary. Eventually, hopefully, the software will be modified to tag articles with these attributes in a better system, or allow database searches. Until that day arrives (if it arrives) we cannot expect this system to handle multiple attributes. For this reason, I'd like to propose the following: that duplication of categorization does not go above the lowest possible topic category (for example Film actors), and that subcategorization of that lowest level should only be done with parallel subcategories, which means that they would all be subcategories of the topic category and no attributes would be combined. So for example, if the topic level category is Category:Film actors, then Category:Film actors by decade should not be combined with Category:Film actors by nationality to create Category:American film actors of the 60's. The point of this proposal started with trying to make it possible to browse through topic level categories, not to create and populate every possible subcategory. Let's do this a small step at a time and see how it goes. For now, let's just re-populate Category:Film actors and Category:Film directors and take it from there... -- Samuel Wantman 08:23, 26 January 2006 (UTC)
- Since the task involved is going to the articles in a category and making sure they're all in one or more other categories, it's not really harder to do four at once than two. If you go to "Category: American film actors" you'd open each article and paste in the following:
- All these articles would already be in Category:American film actors. You might have to then delete some categories if there's duplication.
- Since it's pretty much the same amount of work to paste in one category as three, it seems like we ought to do this the way we want it to turn out. Am I wrong in thinking that the consensus here is in favor of a fully populated tree?
- Nareek 13:27, 26 January 2006 (UTC)
- I'm attempting to kind of "redirect" the discussion on the specifics of the acting and directing professions to the section below :) Thanks, --LiniShu 13:35, 26 January 2006 (UTC)
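The growth raised in this thread (N attributes yielding 2^N categories per article in a fully populated tree, against a topic category with parallel single-attribute subcategories only) can be illustrated with a short sketch. The attribute values and function names below are made up for illustration and do not correspond to actual category names.

```python
from itertools import combinations


def fully_populated(topic, attributes):
    """Every combination of attributes (including none) gets its own category."""
    cats = []
    for r in range(len(attributes) + 1):
        for combo in combinations(attributes, r):
            cats.append(" ".join(combo + (topic,)))
    return cats


def parallel_only(topic, attributes):
    """The topic category plus one single-attribute subcategory per attribute."""
    return [topic] + [f"{attr} {topic}" for attr in attributes]


# Hypothetical attributes for one actor article.
attrs = ("American", "1960s", "Oscar-winning")

full = fully_populated("film actors", attrs)
parallel = parallel_only("film actors", attrs)

print(len(full))      # 8, i.e. 2**3
print(len(parallel))  # 4, i.e. 3 + 1
```

With parallel subcategories the number of category tags on an article grows linearly with the number of attributes rather than doubling with each one.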
Application to film, television and theatrical professions
Regarding the scope of the original discussion (and the area for which we have editors willing to work on consistency of categorization at this time): If the consensus statement proposed above is accepted, we have already established in the discussion that the following "grandparent" categories should be populated with all of the articles in their descendant "by nationality" categories: Category:Film actors, Category:Film directors, Category:Television actors, Category:Television directors, Category:Stage actors, Category:Theatre directors. Are there any other similar "fooers" categories to add to this list at this time? --LiniShu 03:01, 26 January 2006 (UTC)
Other "ancestor" categories in question- We have Category:Actors, Category:Directors, and "Actors by nationality" categories such as Category:American actors. Should any of these be populated with individual articles? If no one has a strong opinion, we could go with the status quo for now; current general usage based on numbers of articles, is to not have "Actors" and "Directors" populated with individual articles, but we do have the "Actors by nationality" categories populated with individual articles. --LiniShu 03:13, 26 January 2006 (UTC)
- I'm taking the liberty of pulling out a couple of comments from the section above, relevant to what we do here and now with the film, television, and theatrical professions (--LiniShu 13:31, 26 January 2006 (UTC)):
- From Nareek 05:24, 26 January 2006 (UTC) - Populate Category:Actors; Category:NATIONALITY actors; Category:FIELD actors; Category:NATIONALITY FIELD actors with individual actor articles.
- From Samuel Wantman 08:23, 26 January 2006 (UTC) - Let's do this a small step at a time and see how it goes. For now, let's just re-populate Category:Film actors and Category:Film directors and take it from there...
- My preference - My "pet question" :P I really want to know what to do with Category:American actors - because that's the avenue by which I became involved in this discussion. I can go through the category making sure each actor is in the appropriate Category:FIELD actors and Category:NATIONALITY FIELD actors, but do I then remove them from "American actors" or leave them in? And do I add back those I had removed before this discussion began? I'd like to make the usage consistent, one way or another. Status quo and lesser amount of edits required would probably be to populate rather than depopulate. --LiniShu 13:31, 26 January 2006 (UTC)
- I think we should move this discussion to Category talk:Actors so that if there's anyone following that page, they can have a chance to participate in the discussion. I'm going to copy and paste some of the actor-specific talk here to there--maybe some of it should then be deleted from this page? It's getting kinda unwieldy.
Nareek 15:13, 26 January 2006 (UTC)
Proceeding from here
At this time, among this group we have clear consensus on: do re-populate Category:Film actors and Category:Film directors, do not re-populate up the hierarchy tree above Category:Actors. There is less clear agreement on "Actors" itself, and the NATIONALITY actors categories. We also have the "cousin" categories of the Film categories - Stage and Television. I've been thinking this over the past couple days - I think that those who are interested in helping with the repopulation effort for the "MEDIUM (Film, Stage, Television) Actors" (or directors) categories should go ahead and begin, maybe focusing on the film category first (one has to choose somewhere to start), but at the same time, as we "touch" any articles to add in MEDIUM Actors category tags, why not make sure, as Nareek suggested, that the NATIONALITY Actors and Actors category tags are there also? Of the group that has given consensus on populating categories farther up the hierarchy tree than the NATIONALITY MEDIUM Actors categories, no one showed significant opposition to populating NATIONALITY Actors or Actors; our "populate to the level of notability" criterion could be applied either at the MEDIUM Actors or Actors level; and interest has been expressed in having Category:Actors populated for browsing.
As we begin this work, it would be good to incorporate Samuel's new "Allincluded" template on the appropriate category pages.
I am planning to proceed as I have described above; if there is strong objection, or a different consensus is reached, my changes can always be undone.
Nareek, I am adding this latest to the Category talk:Actors as well; wasn't sure where it would be most likely to be observed, here or there?
NickelShoe, you were part of this discussion at the beginning; I know that the repopulation of the higher level categories is a new direction from the way we were going before, hopefully, after the discussions above, and the checking of whether categories can be too large (from a technical standpoint), this conclusion is something that you can live with? Your opinion is valued; you do a lot of work improving the quality of Wikipedia, and you had already put in a lot of effort in organizing the Actors categories according to the generally held application of WP:MOS prior to this discussion.
Feedback from all is appreciated, as always. Thanks, Lini 13:25, 29 January 2006 (UTC)
- Yeah, I think it makes the process of categorizing articles more complicated (rather than simply using the most specific category, one has to figure out when to also use general categories), but I trust that if others say it's helpful to be able to browse categories I find unwieldy that they're telling the truth. So I wouldn't stand in the way. Hopefully some day we'll have cross-referencing cats. NickelShoe 19:31, 29 January 2006 (UTC)
- I'm against any change. An actor's nationality is important. It affects the type of parts they get etc. We don't need a huge category; some of them are more than big enough as it is. Golfcam 22:41, 24 February 2006 (UTC)
- I very strongly object to a swing to dogmatism in the opposite direction. Extrapolating from a special case such as film actors (not that I agree with any change even there) is a wrong-headed proceeding. There will be many cases where this sort of multiple categorisation is very messy because it will push many of the subcategories off the first page. It will also generate immense category clutter on some article pages. And in any case, how do you decide which categories should have topic articles? There are many which don't exist, but could or should. This proposal is for a scattergun approach, when we should concentrate on flexibility and accuracy based on what is best in each area. CalJW 05:46, 2 March 2006 (UTC)
Actors by time period
A wild and crazy idea - for those who prefer smaller subcategories for browsing - for American actors, even the most precise subcategories that we have now Category:American film actors, Category:American stage actors, and Category:American television actors are not small categories. "American film actors" at the time of this writing has 1331 entries; more will be added in the effort to consistently categorize "American actors", and the category will only keep growing with time. With the above discussions about the disadvantages of "over-subcategorizing", I'm not in a hurry to create these new subcategories, and I would not want anyone reading this to rush out and create them, but... what about the idea of actor subcategories by time period, as we have for musical groups? Some actors like George Burns, of course, would have quite a few decades of activity, but others would have only one or two. Advantages would be: a.) we'd have subcategories that at some point stop growing very much b.) we could see people grouped with their contemporaries c.) an actor's time period implies something about the cultural climate in which they worked. What do you all think? (If you are tired of more and more subcategorization, "please don't shoot me" for suggesting this :) - just an idea) Thanks, --LiniShu 03:33, 26 January 2006 (UTC)
- The only problem would be the number of categories added to individual articles. You would probably be better off creating a list of film actors by time period. —Mike 20:33, 12 February 2006 (UTC)
- Theoretically yes, but this is a wiki and it is very clear from precedent that categories are better maintained than lists in most cases and this is becoming more and more the case. For example most of the country-topic lists have been virtually abandoned, and the exceptions are mostly maintained by a single enthusiast, whereas many people contribute to category maintenance because it can be done as an adjunct to visiting an article. CalJW 06:02, 2 March 2006 (UTC)
- Category:Architects is easier to maintain than List of architects, and the category is made up of articles or stubs with real content. The list tends to accumulate vanity listings: people just tack their name on the list and contribute no article. In general, the structure of architecture is an example for this entire discussion. Architects are listed (as an 'instance') in the format discussed above, namely: Category:NATIONALITY architects, in parent Category:Architects by nationality, fooian fooers. This can go a bit too far, perhaps, in what seem to be extremely narrow categories attached to Alvar Aalto (e.g. Category:Natives of Southern Ostrobothnia).
- The time period for an architect is found in Cat:STYLE architects, such as Category:Baroque architects or Category:Postmodern architects. There's also the ambitious Timeline of architecture and parent Category:Years in architecture. Infoboxes can also be used for showing the span of years that an individual worked or played music (60's, 70s, 80s, etc.). The List of architects is arranged by decades, with some overlap between the 1990s and 2000s, which can be revised.
- From discussion above: "Populating categories with articles... ": Articles in the 'notable level' of Cat:Architects can be reduced to a set of "household name" or "notable" architects (who will be disputed, no doubt). Otherwise, the category will be hard to browse, being a mix of the notable and the less notable. If, for example, a user misspells a name in the search box (and there's no redirect), they might browse the categories for answers. But, to find Aalto, the user would have to know his country category, Category:Finnish architects. So the duplicate listing (in 'grandparent' and 'grandchild') would be helpful to a general audience, looking for a world famous architect at the level of Category:Architects. -- Dogears 09:11, 3 March 2006 (UTC)
- But why limit it to just the notables? Categories serve as Wikipedia's index and table of contents. I'd put ALL architects in Category:Architects (duplicating the architects by nationality subcategories) and put the notables in List of notable architects which can be annotated as to why they are notable. This list can be "piped" so that it is at the top of the category as List of architects already is. A user looking for the famous can find the list of notables, someone wanting to browse all the architects could look through Category:Architects, and others could look through the architect categories by nationality. This would make it more useful for more people.
- There are many reasons why someone would want to browse through all the architects. Perhaps they can't remember the name of an obscure architect (Starts with "G"?) Or perhaps they are an expert on Architecture and are scanning the category to see what architects are missing. They might just be looking to compile a list of architects. -- Samuel Wantman 10:17, 3 March 2006 (UTC)
Articles for specific dates
I just noticed a new trend: articles for specific dates, such as January 18, 2005. Not sure I like it (note the related AFD discussion), but lots of them now exist. For the purpose of this talk page, I'm wondering how they should be categorized. The one I linked to above was placed in Category:2005, definitely not a good choice (I removed it). Most of the others seem to be uncategorized at the moment. So how should they be categorized? - dcljr (talk) 22:54, 24 January 2006 (UTC)
- It should go in Category:2005, unless there's a better subcategory for it to go into. I mean, how is January 18, 2005 itself not a direct subset of the year 2005? -Sean Curtin 01:53, 25 January 2006 (UTC)
- It's now in Category:Days in 2005, which is a pretty obvious solution. CalJW 05:20, 25 January 2006 (UTC)
- Perhaps the formatting of Category:Days in 2005 could be improved. Currently the days are categorized like this ...|04/10 which is good because everything will be in chronological order, but the first 9 months will be under the heading "0" and the last three will be under the heading "1". It might look better if they were all categorized like this ...| 04/10 By adding the space there won't be any headings and the entire year will be chronological. --Samuel Wantman 08:05, 25 January 2006 (UTC)
The one problem I see with January 18, 2005 is that it is really a subpage masquerading as an article. It doesn't have the normal intro paragraph, and I would expect to see some sort of navigational template. If these were added, it would probably screw up the formatting of January 2005. —Mike 20:49, 12 February 2006 (UTC)
- The solution to this problem would be the noinclude tag. dcljr (talk) 23:37, 12 February 2006 (UTC)
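The effect of the leading space that Samuel Wantman describes above can be illustrated with a small Python sketch. It assumes, as his comment does, that a category page groups its entries under the first character of each sort key; the `headings` helper and the sample keys are hypothetical and are not taken from the MediaWiki software.

```python
from itertools import groupby


def headings(sort_keys):
    """Group sort keys under their first character, in sorted order.

    Assumption: this mimics how a category page picks its section
    headings; it is an illustration, not MediaWiki's actual code.
    """
    ordered = sorted(sort_keys)
    return {head: list(group)
            for head, group in groupby(ordered, key=lambda key: key[0])}


# Hypothetical month/day sort keys for the first day of each month.
plain = [f"{month:02d}/01" for month in range(1, 13)]
# The same keys with the suggested leading space.
spaced = [" " + key for key in plain]

print(list(headings(plain)))   # two headings: "0" and "1"
print(list(headings(spaced)))  # a single heading: the space character
```

In both cases the keys stay in chronological order; only the grouping under headings changes.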
Category purpose question
A question concerning the purpose of categories: Is the primary purpose of categories to:
- Aid the reader in finding material which may be of interest, or relevant to a particular topic?
- Produce a taxonomy, wherein being included in one or more categories is an indication--nay, a declaration by the Wikipedia community--that the subject of an article is an instance of the categories it is included in?
I seem to suspect the latter; and the article advises that "Categories appear without annotations, so be careful of NPOV when creating or filling categories. Unless it is self-evident and uncontroversial that something belongs in a category, it should not be put into a category.".
However, one side-effect of that rule is category fights, wherein two (or more) editors will have a dispute over whether a topic belongs in a particular category or not, or constantly move the article back and forth between two different categories. Some categories are seen as marks of prestige (such as Category:physics), whereas others are seen as badges of dishonor (like Category:Pseudoscience); I can think of several articles in the realm of physics which have been the subject of such disputes.
Perhaps a wider view of categories may be more appropriate--as in "this is a list of possibly relevant subjects; some of which may be of disputed relevance?" That might nip some of the edit wars in the bud. --EngineerScotty 23:01, 25 January 2006 (UTC)
- I'm going to say that the first is definitely the purpose, however readers will still interpret it as a declaration, no matter what we say they should do. Therefore we can only be so loose with category usage. If a category is inherently a value judgment, it should be deleted or renamed. NickelShoe 02:10, 26 January 2006 (UTC)
The first guideline for categories is "Categories are mainly used to browse through similar articles". Even so, there are multiple taxonomies, and it is important that articles are not mis-categorized. Often categories with the word "related" in them are brought up for deletion at WP:CFD. Inevitably they are deleted or renamed with the reasoning that EVERY category implies that the articles are related to the name of the category. Some categories ONLY contain related articles. See Category:George W. Bush for an example. This is frequently discussed and there is already language about this here. There is also discussion about this at the top of this page. At a practical level, I don't think we could ever reach consensus about a single taxonomy for all the articles in Wikipedia, and I don't see what use it would be. Categories are only good for browsing and discovering what articles exist here. I recently was involved in trying to mediate a controversy about whether the article Matthew Shepard should be in Category:Hate crimes. Technically, his murderers were never convicted of a hate crime, but the case is definitely related to the subject. The point of the category is to lead users to articles related to the subject, so it is reasonable that all articles related to hate crimes be in the hate crime category. -- Samuel Wantman 09:02, 26 January 2006 (UTC)
Both and neither. I'd say the Wikipedia articles themselves are the best way to list the articles related, or partially related, to a subject, since they can qualify how the topic is related. After all, the articles, not the categories, are the center of this project. The best use of categories is to provide an overview of the breadth and depth of articles in a broad field of Wikipedia so that the user does not have to slowly explore the branches of the tree of knowledge one twig at a time. You could say the categories provide a way to speed-read Wikipedia; they should include all the links to specific persons & objects and narrower topics & concepts found on each page. GUllman 19:03, 27 January 2006 (UTC)
- Categories should be a Wikipedia-based system, not "factualness"-based. We would either have tons of uncategorized articles for valid but controversial ideas, ... or each category would have dozens of parallel categories for "some people believe" or "usually mentioned in association with" relationships. The real world is too complex to describe in terms of yes and no, so you shouldn't expect to learn about a subject by just reading categories. Want to know why an article is listed in a category? Simply click on it and read the article! GUllman 20:54, 27 January 2006 (UTC)
Can't see categories when editing
Why can't you see the Categories section of a page when editing a page? Cigarette 15:14, 27 January 2006 (UTC)
- Yeah, somebody had to tell me too...they're all the way at the bottom of the page, under the edit box. Interwikis are over at the side. You mean when previewing, right? NickelShoe 17:22, 27 January 2006 (UTC)
- If you mean you can't see the [[Category:]] references in the edit box, note that categories are sometimes "hidden" inside of {{templates}}. Maybe that's the problem. - dcljr (talk) 19:43, 7 March 2006 (UTC)
Template for categories that include the articles in its subdirectories
I have created Template:Allincluded for use in categories that include the articles that are in its subdirectories. Here is an example of its use from Category:Bridges in New York. {{Allincluded|bridges in New York State|the bridges}} produces the following comment:
For convenience, all bridges in New York State should be included in this category. This includes all the bridges that can also be found in the subcategories.
I think this template (or some similar comment) should be used when subcategorization does not remove articles from topic categories as discussed above. -- Samuel Wantman 08:54, 28 January 2006 (UTC)
- Yes, this new template should work very nicely for that purpose. Thanks! ----Lini 12:24, 28 January 2006 (UTC)
Help -- didn't alphabetize correctly
I linked an article about a person to a category, and when the name appeared on the category page it was in the wrong place. It ended up alphabetized under her first name, instead of the last name like everyone else in the category. How did this happen? How can I fix it?
(The article was Paola Giangiacomo and the category was Category:Television journalists.) — Michael J 17:37, 28 January 2006 (UTC)
- Thanks for the info, and thanks for fixing the link! — Michael J 00:51, 29 January 2006 (UTC)
Sorting foreign names with special letters
I've noticed that some people have been changing surnames starting with Æ, Ø or Å to different letters when the pipe (|) is used to sort the articles when categorizing. So Ø might have become O and I've seen Å changed to both A and Aa. Is there any guideline or policy on how to sort these names? Just randomly substituting the actual letters with some sort of home-made standard doesn't seem to be very useful. (I'm sure there are examples of other letters as well, but being Norwegian these are letters I've noticed the most. They are the last three letters in the Norwegian alphabet.) Tskoge 14:11, 5 February 2006 (UTC)
- English speakers would have no idea where to look for these letters, so I suspect they should be put where people would look for them. I don't think there is a guideline for this, but I think it should be something like this:
- If the letter is the same as a letter in the English alphabet with an added accent, it should be piped without the accent. Thus, "Êxample" would be piped as "Example".
- If the name begins with a letter outside the 26 letters of the English alphabet, the English letter that looks most like it should be added to the beginning of the name in the pipe, so "Æxample" would be piped as "AÆxample", "Øxample" would be "OØxample", etc. This would put them in order after the listings without foreign letters.
- Any other ideas? -- Samuel Wantman 19:32, 5 February 2006 (UTC)
- I agree that in the English Wikipedia, we're catering to English speakers, who generally don't know what order letters go in foreign alphabets. I usually pipe the cats to sort by what letter I'd assume an English speaker would look under. (Not duplexed as suggested by Samuel.) It's also probably unclear to most English speakers when they're looking at a different letter as opposed to a letter with an accent mark, so I'd be afraid to make a distinction for the procedures in the different situations.
- I think that it would make the categories pretty hard to use for a lot of people if we sorted them correctly according to foreign alphabets. I think Samuel's suggestion is pretty sensible, though I can imagine issues with letters such as Æ for which alternate spellings might begin with an E, not A. NickelShoe 20:47, 5 February 2006 (UTC)
- I tend to agree, especially because each language has its own collation order. Many of these letters sort differently depending on language. - Jmabel | Talk 05:26, 10 February 2006 (UTC)
- ...And the actual sort order will be based on Unicode, anyway, not individual language conventions (Unicode breaks up many scripts into different ranges of characters, so they wouldn't necessarily be sorted "correctly" even if you knew the language). - dcljr (talk) 23:41, 12 February 2006 (UTC)
- They should definitely be sorted by the letter they most look like. Anything else is just no sorting at all to English speakers. It doesn't matter whether this looks right to foreigners or not because they have their own Wikipedias. Golfcam 22:43, 24 February 2006 (UTC)
- Nickel Shoe's suggestion about sorting by different foreign alphabets doesn't make any sense at all. We need to sort on the English alphabet. In lengthy categories, that's what we have for shortcuts at the top--exactly 26 English letters (in rare cases numbers as well). Note that you also don't get the sort order of any foreign alphabet either, if you don't use index keys. You get some crazy, arbitrary Unicode number basis, not used in sorting any language. But how in the world are you going to find someone, say in a category such as Category:1893 births or Category:Living people, if you have some crazy hodgepodge of 87 different language rules being applied, and where in the world are our editors ever going to discover what those 87 different language rules are? Actually, in some cases, there are variations in the indexing order in the same language in different countries, too. But that's irrelevant; what we need to worry about is English language sorting order.
- Sometimes the difference in whether a character is sorted as one letter or two letters depends on whether it is a ligature or a diacritical mark--that's mentioned somewhere on one of the zillion subproject pages here. But often, it also depends in part on how that person is known in English; someone with a name like Böhm might choose to spell it Boehm or Bohm or even Behm in English. Sometimes the same letter will be used in several different languages, and will be more likely to be changed to two letters in English when it comes from some of those languages and to one letter in English when it comes from another language.
- But in almost every case, whether you sort it as one letter or two letters, you will be closer to the proper English sorting order than you were with the letter with squiggles on it, even for those people who disagree with you about whether it should be sorted as one letter or two.
- Same goes for the initial letter. Almost all of our categories, especially those for people, should be indexed in a case insensitive way. That means if the indexed name starts with "de" or "von" it should be changed to "De" or "Von" in the indexing key. Gene Nygaard 03:41, 11 March 2006 (UTC)
- I think you and Nickel Shoe are pretty much in agreement. --JeffW 05:13, 11 March 2006 (UTC)
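The sort-key rules floated in this thread can be sketched in Python. This is only an illustration of Samuel Wantman's two proposed rules plus Gene Nygaard's capitalised first letter; the function names and the `LOOKALIKE` table are hypothetical, the table covers only the two letters given as examples above, and treating Å as an accented A (because Unicode decomposes it that way) is an assumption rather than anything agreed here.

```python
import unicodedata

# Hypothetical look-alike table: only the letters used as examples
# in the discussion. Anything else falls through to accent stripping.
LOOKALIKE = {"Æ": "A", "æ": "A", "Ø": "O", "ø": "O"}


def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_key(name):
    """Build a category sort key under the rules proposed above."""
    if not name:
        return name
    first = name[0]
    if first in LOOKALIKE:
        # Rule 2: prefix the look-alike English letter, keep the name.
        return LOOKALIKE[first] + name
    # Rule 1: remove accents; also capitalise the first letter so that
    # "de" and "von" sort together with "De" and "Von".
    key = strip_accents(name)
    return key[0].upper() + key[1:]


print(sort_key("Êxample"))
print(sort_key("Æxample"))
print(sort_key("von Braun"))
```

The sketch does not settle the one-letter versus two-letter question raised for names like Böhm; it simply strips the mark.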
Changing guideline
I've altered the guideline on the accompanying page to reflect the fact that there's at least a difference of opinion on the subject of populating the category trees. (See above.) There seemed to be a pro-population consensus in the discussion we just had, but perhaps it's too soon to change the guideline completely? In case I acted hastily, I left the old language up, commented out, to make it easy to restore. Nareek 23:45, 11 February 2006 (UTC)
- I think the revised (inclusionist) way should be the guideline. Nobody has given me a good reason why categorization should be limited to the smallest denominator. Why put artificial limits on categorization? It only hurts the end users by creating (sometimes) unnatural divisions. Who cares if there are a few very similar category links at the bottom of the page? Tolerating "redundant" categorization is such an ultra-minor tradeoff for having meaningful (i.e. complete) categories. Cacophony 00:23, 12 February 2006 (UTC)
- I agree with altering the guideline, but I think there is more to it. I don't think that categories should be populated ALL the way up a hierarchy. This would make it MORE difficult to navigate some categories. For example, Category:Bridges has articles about the different types of bridges along with other articles related to bridges, and all the subcategories have the articles about specific individual bridges. This is a useful division of the articles. If Category:Bridges also had the hundreds of articles about individual bridges, it would be very difficult to navigate the other articles about bridges. The discussion above talks about populating categories that were subdivided arbitrarily (like by nationality) and only up to the level of a topic article. I don't think we should condone anything more than that. -- Samuel Wantman 11:33, 12 February 2006 (UTC)
- As I've said above I endorse a flexible approach, but I am very strongly opposed to multi-level categorisation as a standard policy. The exceptions should remain exceptions. I find the idea that categorisation by nationality is "arbitrary" somewhat offensive. I value nationalities and they still play a major role in the world, even if some people like to think it is cooler to treat them as anachronisms. They are not anachronisms. CalJW 05:51, 2 March 2006 (UTC)
Category sorting finer specifications
I added some additional specifications to Wikipedia:Categorization#Category sorting. Just mentioning here that most of that was resulting from a discussion here: Wikipedia talk:Naming conventions (people)#Naming convention for Dutchmen --Francis Schonken 20:02, 12 February 2006 (UTC)
- Something for Category_talk:Peers#Sorting, Spanish, German, Chinese, Japanese, Korean personal names might be included as well. - User:Docu
'Cycles should usually be avoided'
...but why?
Is there any reason why Category:Bill Drummond and Category:Jimmy Cauty shouldn't be both members and subcategories of Category:The KLF? Isn't this a logical arrangement for the members of music groups who also have profiles as individuals? --kingboyk 18:33, 1 March 2006 (UTC)
- A cycle occurs when a category is both a parent and a child of another category. Cycles have nothing to do with the article space. Circeus 18:44, 1 March 2006 (UTC)
- Let's approach this from another angle: could you please have a look at Category:The KLF and tell me not just if anything is wrong, but if it's wrong why it's wrong? It would be much appreciated. --kingboyk 18:49, 1 March 2006 (UTC)
- I think it's a bit confusing for there to be both a subcategory and an article for the same person on the same category page. A naive user might not know the difference and the article can be accessed by going to the category first. On the other hand, it's not bad enough that I'd go in and edit it to remove the duplication. --—The preceding unsigned comment was added by JeffW (talk • contribs) .
- There is actually a value to doing it this way if it is done correctly. For example, Category:Presidents of the United States contains both Category:George W. Bush and the article George W. Bush; having both communicates that there is an article about Bush and a category where articles related to Bush can be found. This is even more important for a category like Category:Bridges. In this case Suspension bridge is an article about a specific type of bridge. Category:Suspension bridges is full of all the articles about the many individual suspension bridges. Categories and articles are very different things. I find that when categories and eponymous articles are both listed, it is usually the subcategory that is less appropriate to the category than the article. The precedent in the past was to have the category and not the article. But people's natural inclination is to put the article in the categories it belongs in (and rightly so). I think we are in a transition stage right now, and I suspect in the future, it will be more common to see the article and not the subcategory listed. These decisions should be made by examining the parent category and looking to see what logic determines what belongs and what does not. If the sibling articles are there, the article belongs. If the sibling categories belong, the subcategory belongs. -- Samuel Wantman 21:06, 9 March 2006 (UTC)
Another example
I still have not seen a justification either. I'd love to see something besides, "Don't do it." Here are some excerpts from Category talk:Education: Categorization with respect to Science. and Category talk:Education#The classification of education and loops. Rfrisbietalk 19:42, 1 March 2006 (UTC)
Categorization with respect to Science.
I was asked about the removal of Education from Category:Applied sciences and will respond here. My intent is to eliminate cycles from among the categories. Perhaps there is a better way to do this than my reversion, so I'll lay out the problem.
These are the portions of the Science tree and Education tree that apply:
Note that "Science" links to "Academic disciplines." The goal is to connect these together in a way that doesn't cause a cycle. I removed what in this diagram would be an arrow from "Education" back to "Applied sciences".
One must decide which of these categories is the most general, and thus not a subcat of any of the others. To me the candidates are "Science," "Academia" or "Education" (the most general in the current configuration). The one determined to be most general won't link back to the others.
I hope this makes some sort of sense. JonHarder 03:28, 28 February 2006 (UTC)

- Your objective is clear (eliminate loops) but not your (or anyone else's) purpose (justification). It's stated in Cycles should usually be avoided:
- Although the MediaWiki software does not prevent cycles (loops), these should usually be avoided. Here is an example existing in early November 2005: Category:Academic disciplines - Category:Interdisciplinary fields - Category:Social sciences - Category:Education - Category:Academic disciplines - Category:Interdisciplinary fields ....
- No rationale is given for not using the MediaWiki software's capability to use loops. In the real world, these also are known as self-referencing systems, in this case, education about education. Other examples include meta-evaluation (evaluating evaluation), meta-ethics (the ethics of ethics), and the history of history. The unfortunate and hopefully unintended consequence of this "no loops at all costs" approach is that the classification of education has been reduced to... Fundamental: Systems: Society: Culture: Education; and Fundamental: Systems: Education. Not only has education been removed from Applied sciences, it's been removed from Social sciences and Interdisciplinary fields as well. In effect, the message is "education is not science." Removing Academia from Education simply shifts the problem without addressing it. We could just as easily "solve" the loops problem by eliminating Science from Academic disciplines. From the Education classification perspective, that would be less disruptive.
- My main point is that eliminating loops in and of itself does not address the classification issue for Education. Clearly, leaving it out of the sphere of any type of science, as well as not acknowledging its self-referencing properties, is problematic. I hope we can keep working toward a solution that addresses all of our key issues. :-) Rfrisbietalk 16:31, 28 February 2006 (UTC)
The classification of education and loops
The current classification of education is as follows:
Apparently, this is the result of an effort to eliminate category "loops." See the above discussion. An unfortunate side-effect of this effort is that education has been eliminated from categories such as category:Applied sciences and category:Social sciences. This directly contradicts education's place in the articles, List of academic disciplines and Social sciences. I'm going to place the following comment on the category page as a compromise, at least until something better comes along. Rfrisbietalk 17:26, 28 February 2006 (UTC)
To avoid loops, the following classifications are not included in the education category structure.
- You're starting to confuse me now people. If someone can come up with a "why" in plain English please buzz me :-) --kingboyk 19:52, 1 March 2006 (UTC)
- Kingboyk, I'm asking "why" too. I'm not trying to answer your question. :-) Rfrisbietalk 20:17, 1 March 2006 (UTC)
- Yes, I had worked that much out :) --kingboyk 20:22, 1 March 2006 (UTC)
- I think it has to do with the fact the category structure is expected to be roughly semi-hierarchical:
- categories do not form a strict hierarchy or tree structure, but a more general directed acyclic graph (or close to it, see below). (under "categories do not form a tree")
- Generally, people can expect that the entire thing is organized roughly in a tree-like structure, and will be thoroughly confused when encountering cycles. I know I was. Circeus 20:25, 1 March 2006 (UTC)
- I don't see why. Visited links usually go a different colour. I was doing it this way because I felt it made navigation easier for newbies - there's no doubt to a casual surfer now that The KLF and Bill Drummond are interconnected and if you want to read about one you should probably read about the other! --kingboyk 20:31, 1 March 2006 (UTC)
- I agree this is a "bad" rule. "Self-referencing systems" exist, such as in all the meta- fields. By definition, such systems create "cycles" or "loops." Rfrisbietalk 20:50, 1 March 2006 (UTC)
- I find category cycles are mildly confusing or surprising when I encounter them, but we should definitely use a little sense and not eliminate loops when the costs are higher than the benefits. I doubt anyone goes thru the cycle more than twice before they figure it out. NickelShoe 00:53, 2 March 2006 (UTC)
There are several reasons why I prefer to keep categorization tree-like, in addition to the confusion of cycles mentioned above:
- Logically a category is more general than its subcategories. A cycle implies a category is both more general and less general than itself, or perhaps that every member of the cycle is of equal status.
- A tree structure makes it possible, in the future, to be able to request every article on a certain topic and have software automatically build a collection of all the related articles, by traversing the category tree, without wandering into unexpected directions.
- A disciplined approach to categorization helps the subcategorization process by identifying areas that relate to other trees. For example, it might be possible to link a subcategory to another field and not cause a cycle, where linking the parent category would create a loop. Another way to say this is that even if a category placement causes a cycle, working instead with one of its subcategories may not. And this in fact may turn out to be a better, more specific way to group things.
When categorizing it is helpful to consider whether X is a kind of Y or if X is a part of Y. If so, then X is a subcategory of Y. JonHarder 01:03, 3 March 2006 (UTC)
- Indeed, but again I can't see how that applies to the pop group/individual person situation. I'm probably just being thick, granted :-)
- I think I'll leave Category:The KLF how it is for now. If you can tweak the category membership of related pages and categories to make it better then please do so. Otherwise, I shall quietly WP:IAR as I'm not convinced that the current arrangement is 'bad' anyway.
- It might be that the advice needs to be softened too? --kingboyk 01:10, 3 March 2006 (UTC)
Those all are fine points above for reasons to avoid loops. So show me how to resolve this.
- Classification: Education: Academia: Academic disciplines: Social sciences: Education:
- Rfrisbietalk 03:12, 3 March 2006 (UTC)
I'd like to take a step back for a second and look at the bigger picture. Wikipedia categories are NOT a pure classification system. Articles are in categories for many reasons. Sometimes they are just related to the subject of the category (for example Category:George W. Bush), sometimes they are members of the set defined by the category (example Category:Suspension bridges). Sometimes the category is very broad like Category:Education. The reason that there is a loop in the education example is because Academia is RELATED to Education, but Academic disciplines are a SUBSET of Academia. This sort of thing is to be expected. This is a wiki. The work that it would take to create a monolithic classification system that everyone could agree to would be huge and probably so frustrating that we'd all leave the project. The beauty of Wikipedia's categories is that multiple intersecting classifications can co-exist. The problems arise when all these hierarchies intersect. So sometimes cycles happen because things have not been thought out very well, but sometimes they just happen.
BTW, I am the original author of Wikipedia:Classification. My intent, in creating the page, was to create a way of showing the multiple classifications systems that exist here. For example, there is a hierarchy under Category:Bridges and another under Category:Toll bridges. By adding a classification, you can illustrate that these are two separate hierarchies that overlap each other. -- Samuel Wantman 09:55, 3 March 2006 (UTC)
- Hi Samuel, thanks for your comments. Your "RELATED" argument makes sense IF it's TRUE. Not everyone, including me, will agree with you on that. :-) Some will say Academia IS a subset of Education. Be that as it may. I expect there must be a few cases SOMEWHERE that you would agree are self-referencing systems. The Meta- article comes to mind for some possible examples. Staying at the bigger picture, this page states that "Cycles should usually be avoided," yet nobody who strongly advocates removing them ever seems to acknowledge an instance where they should be maintained. They're only considered an inconvenience from sloppy classifications by others. So, can you give me ANY example(s) of what you would consider to be ACCEPTABLE category cycles? If so, they should be added to the page. If not, the section should be edited to say "Cycles should be avoided" (period). Thanks for working with me on this. Rfrisbietalk 14:32, 3 March 2006 (UTC)
Perhaps "related" is not the correct word, but here is what I mean: Academia is a place where Education happens. If you move to the other end of the taxonomy, education is the thing that is being studied and taught in academia. So education is a subject of study and also is the process of learning and teaching. Since education has this broad meaning the cycle happens. I find this to be an acceptable cycle. Education does have this self-referential nature. But my bigger point is that this "sloppy" classification system is by design. It allows people to browse in broadly defined categories rather than in narrow ones. I suspect you could create categorization schemes that would have no cycles, but why bother? Categorization should consider the utility of the category for a user who wants to browse. It is not intended to be collaborative work on a perfect taxonomy of all knowledge, and I think that is a good thing. -- Samuel Wantman 07:33, 6 March 2006 (UTC)
- Thanks Samuel Wantman. So, given everything about everything, how would you classify and categorize Category:Education and the related subcategories that create the loops? Rfrisbietalk 15:14, 6 March 2006 (UTC)
I would just leave the loops. I find the comments about the loops on the category pages to be confusing and jarring. The only way to remove the loops is to try and separate the different meanings of education into separate categories, and this would make the categories much less useful for browsing. I don't think loops upset many casual users of Wikipedia. They are probably most upsetting to editors who think that categories are a rigid classification system. My experience is that if you follow that view you will be extremely frustrated. It also leads to long unproductive battles over category membership. Restore the categorization loops and remove the comments about loops. Think of categories as a collection of articles related to the topic and try to avoid thinking of it as a pure taxonomy. -- Samuel Wantman 20:43, 6 March 2006 (UTC)
- Okay, thanks. Perhaps JonHarder would like to comment on your suggestion as well. Also, would you be willing to consider a rewrite of Cycles should usually be avoided? That section doesn't really reflect the conversation here and when cycles might be "okay." Thanks. :-) Rfrisbietalk 21:49, 6 March 2006 (UTC)
- I think everyone in this conversation understands the issues here and the difficulties organizing in a way that is both functional and avoids cycles. It is disappointing when one can't find a solution that is elegant while also eliminating the cycles. Unless new cycles have been introduced in the past week, this will be the only cycle in the main article space (along with the trivial KLF cycle mentioned above). I don't see a need to make any significant change to the main article; "usually should be avoided" works for me! JonHarder 00:18, 8 March 2006 (UTC)
I've restored the category loops and removed the comments. -- Samuel Wantman 07:05, 8 March 2006 (UTC)
I think the "education" example is an instance of legitimate self-inclusion, but the KLF one (which I removed a couple of days ago without being aware of this discussion) is basically using "categories as links", on which question I have to disagree with Samuel. I think the reason to be careful about such things is to preserve some sort of meaning to membership in a part of the hierarchy. Inevitably this breaks down after a certain point -- I noticed that Category:Geography of Greenland is at the base of a vertical chain of inclusion 37 (!) deep, featuring among others, it being classified (sorry) under Category:Philosophical concepts, Category:Apes, and Category:Theoretical physics. Clearly, the category system does not at present lend itself to questions such as "what are the articles on theoretical physics"; it'd currently give the answer, "almost all of them". But it would be worthwhile to work towards that as a goal. Alai 00:51, 16 March 2006 (UTC)
Third party opinions requested
There's a dispute over at WikiProject Cricket over the appropriate categorization of Pakistani, Bangladeshi and UAE cricketers. Details here. Third party opinions welcome. --Muchness 14:26, 2 March 2006 (UTC)
Recommended order for Category links
- I've been noticing that the AWB bot is alphabetizing the category links within articles. Is there a guideline for the order in which categories should be linked on a page? When I categorize I usually put the most obvious or most telling ones first (probably because that's the order in which I think of them). Is there any benefit to having a preferred way to do this, and if so, should it be discussed in the relevant help topics/style manuals, etc, where I've been looking in vain? --Dystopos 23:03, 7 March 2006 (UTC)
- I agree with you. Alphabetical isn't necessarily best. Furthermore, even when it is alphabetical, there's no particular reason to have numbers in front of letters rather than behind them--stick the Category:1873 births stuff at the end. Gene Nygaard 03:21, 11 March 2006 (UTC)
- For biographical entries, it seems natural to put the birth and death dates first. I think it would be a good idea to make that standard at least. Second best would be to always put them at the end, birth first, death second, just make it consistent. --JeffW 05:09, 11 March 2006 (UTC)
- I too have been noticing the increasing number of alphabetized categories, but was not aware (until now) that this was due to a bot. AWB is not a bot; if someone is using AWB to do these changes, we should invite them here to discuss it. I am not aware of any policy or guideline that calls for alphabetizing categories. It seems premature and a little rude to start a bot to do this. I'm hoping that the person or bot can stop making the changes and the discussion could take place here. There is good reason to organize category listings in ways that may or may not be alphabetical. For example, people with an eponymous category often had that category at the beginning or the end of the heap so that it could be found easily. In the bridge articles, the type of bridge was often put at the top of the heap. It seems that there are primary categories, and this often gets the lead or end spots. I've also seen similar types of categories clumped together in ways that a bot would never understand. -- Samuel Wantman 07:23, 12 March 2006 (UTC)
- One of the options in the AWB is to sort categories alphabetically. I've seen it used for that in the past, though I can't remember any examples right now. - EurekaLott 19:10, 13 March 2006 (UTC)
- So has the person or persons responsible for alphabetizing categories stopped doing so? Should we ask the maintainer of AWB to remove the "sort categories alphabetically" option? --JeffW 16:59, 17 March 2006 (UTC)
My opinion is that alphabetisation is generally preferable for 5 reasons; (1) when an article has many categories, it makes it easier to find a category, (2) it means that category order will appear more similar between different articles, and (3) alphabetisation isn't susceptible to people re-ordering to give different ones a higher priority/or a POV. A category order POV might sound crazy, but it will happen. Plus of course (4) alphabetical ordering is the easiest to maintain and (5) the most popular way of sorting. Martin 10:04, 15 March 2006 (UTC)
Here is a summary of some reasons to not alphabetize; (1) If there are numerous categories it is hard to find the eponymous category. For an example of this see Category:George W. Bush, (2) Alphabetization should be similar to the way categories are "piped", depending on the subject. The key word for alphabetization might not be the first word. (3) Categories might naturally go together. For example there are two bridge type categories for San Francisco-Oakland Bay Bridge. It might make sense to put "Suspension bridge" next to "Cantilever bridge". Another example is birth dates and death dates. (4) There might be good reasons to put certain categories in the same position for all articles of a certain type. For example, it was customary to make the type of bridge the first category in a bridge article. For biographical entries, it seems natural to put the birth and death dates first. --Samuel Wantman 23:13, 16 March 2006 (UTC)
The problem with subject specific rules is that subjects overlap. It could get very instruction creepy. I can see three possible universal orderings:
- Alpha
- By category tree (ok graph) - higher level cats first, sub cats together, then alpha.
- By entity after pipe, then one of the others.
I dare say there are other sensible ones. Rich Farmbrough 01:05 17 March 2006 (UTC).
- Since Wikipedia is a self-organizing chaotic system, it would make sense to me if many policy issues were dealt with by smaller groups of people. I think the Wikiproject is a good model of how to do this. People interested and knowledgeable in a subject area can get together to decide the best way to handle issues for their subject matter. Likewise, people concerned with categorization can meet here to discuss the overall rules for categories. I'd prefer to have the minimum amount of rules here and let each subject area work out the details. Perhaps the minimum is roughly alphabetical, with exceptions possible for the first and last category listed. I'd leave the first position for a main subject grouping and the last for an eponymous category. -- Samuel Wantman 07:29, 17 March 2006 (UTC)
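As a rough sketch of the "roughly alphabetical, with exceptions for the first and last category" idea, the ordering could be expressed as below. The function name `order_categories` and the sample category names are hypothetical, chosen only for illustration; no existing tool is claimed to work this way.

```python
from typing import List, Optional


def order_categories(
    cats: List[str],
    main: Optional[str] = None,
    eponymous: Optional[str] = None,
) -> List[str]:
    """Alphabetize categories, pinning `main` first and `eponymous` last."""
    # Everything that is not pinned gets a case-insensitive alphabetical sort.
    middle = sorted(
        (c for c in cats if c not in (main, eponymous)),
        key=str.casefold,
    )
    head = [main] if main in cats else []
    tail = [eponymous] if eponymous in cats and eponymous != main else []
    return head + middle + tail


# Hypothetical category names for a bridge article.
print(order_categories(
    ["Example Bridge", "Bridges in California", "Suspension bridges", "1936 works"],
    main="Suspension bridges",
    eponymous="Example Bridge",
))
```

With no pinned categories the function reduces to a plain alphabetical sort, so it covers both ends of the discussion; which categories count as "main" would still be a per-subject judgment.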
- I agree. I don't think it needs any hard rules. I prefer common sense to consistency. I think if it is taken up, it should be considered a style guideline (i.e. requiring an eye to style rather than any specific knowledge) and never a hard rule. --Dystopos 15:44, 17 March 2006 (UTC)
- I think alphabetizing the categories is generally unhelpful--given that most users probably don't have a list of categories in their head that they're looking for, it's essentially randomizing. (I can see how alphabetical order might be useful for editors in some cases, but the articles should primarily be organized for the benefit of readers.) My inclination is to make the rule as broad as possible--grouping similar categories together, and going as far as possible from more general to more specific. Nareek 16:37, 17 March 2006 (UTC)
- I agree that alphabetizing is not the way to go. I have always categorized big picture first and most specific category last because that seemed like the normal convention. Alphabetical order is basically random. Cacophony 18:15, 17 March 2006 (UTC)
- Yes, alphabetizing categories is bad. Do we have enough agreement to put that in the guidelines? --JeffW 18:39, 17 March 2006 (UTC)
Clarifying the section, "Cycles should usually be avoided"
It appears the section on this page, Cycles should usually be avoided is both controversial and ambiguous. While it states they should be avoided, it gives no justification. In addition, the discussion above about Education demonstrates at least one case of a self-referencing system that might be “acceptable,” at least to many editors. Perhaps the more active editors in this area that have weighed in with differing views on the subject, such as SamuelWantman and JonHarder can work together to clarify what types of cycles should be avoided under what circumstances and when it might be okay to use cycles. Using more examples in the "okay" – "not okay" situations would be very helpful as well. Thanks. Rfrisbietalk 18:48, 8 March 2006 (UTC)
- While I am honored to be nominated for a task, you seem to be more concerned about this issue than either JonHarder or I am. There is no reason why you could not come up with some wording and add it to the page. Perhaps a better way to deal with this issue is to take the discussion above and edit it into a short sub-page that can be linked to the Cycles should usually be avoided section. You could call it Wikipedia:Categorization/Cycles. A sub-page might be the way to go because in the two years I've been watching the conversations here, this has rarely come up. There was a good deal of discussion about the categorization of John Lennon, which you could also reference as an example. -- Samuel Wantman 08:43, 9 March 2006 (UTC)
- Well, I haven't heard from him since you made the change to the category pages, but I got the impression JonHarder thought it was important, at least enough to remove the cycles in the first place. ;-) If nobody else is all that interested in it, I might try something in a little while. I wasn't the one who claimed there was a problem with cycles in the first place, so I still would like to see a consensus from both perspectives in the revision. Rfrisbietalk 02:12, 11 March 2006 (UTC)
- I added something. I'm not going to add a subpage, at least for now. I also didn't add the John Lennon example because it looked to me more like the George W. Bush example used for "THE TOPIC ARTICLE RULE" at Reasons for duplication. Rfrisbietalk 17:08, 11 March 2006 (UTC)
Thanks Rfrisbie, that's a lot better. I knew there was an example of a useful cycle, I just couldn't think of one. ··gracefool |☺ 12:14, 15 March 2006 (UTC)
CfD reform -- Criteria for deletion based on precedent
I've started a discussion about some reforms for the Categories for Deletion page (WP:CFD). Please take a look here. Thanks. -- Samuel Wantman 10:32, 9 March 2006 (UTC)
Non-lists in Category:Episode lists
There are a whole bunch of sub-categories of this category that are filled not with lists, but with individual episodes of television shows. I perhaps too hastily started removing the Episode lists category from those pages and in several cases it was the only category so I created categories with no categorization. After getting a complaint I stopped, but the question remains what to do with these pages that don't belong.
Besides being illogical, the way it was was confusing because the same show could appear in the Category section and the Pages section so you'd have to look in both places to make sure you didn't miss the show you were looking for. To see how it was, look at the second page (the next 200).
A lot of those shows had their own Category page, like Category:Friends, in which case Category:Friends episodes could be a sub-category of that and I don't see the need for any other categories. So we could create show-specific categories for the pages that were only sub-categories of Category:Episode lists. Or they could be categorized under Category:CBS network shows or something like that.
But I think a better solution would be to create Category:Episodes by television show which would fit nicely under Category:Television series. Comments? --JeffW 20:44, 10 March 2006 (UTC)
- I followed you here from your edit of Category:My Name Is Earl episodes. I created it under Category:Episode lists following the example set by the Simpsons and others. Since category pages display lists, I think you may be taking the name of the "Episode lists" category too literally by kicking out episode subcats, but I don't object to your proposed fixes. A "Category:Episodes by television show" would be more accurate if named "Category:Episodes by television series". —RandallJones 21:34, 10 March 2006 (UTC)
- I see your point about Categories being a kind of list, but if you took that thinking to its logical extreme then every Category should go under the Lists category, but that's clearly not what was intended for it. --JeffW 21:46, 10 March 2006 (UTC)
- No argument there. —RandallJones 20:23, 11 March 2006 (UTC)
Another reason that these categories are confusing is that there are over 200 pages in Category:Episode lists and when these are sub-categorized, they will then be intermingling with all the Foo episodes categories making them harder to locate. (BTW, does anyone else think that Episode lists should be changed to something like Television series episode lists?). --JeffW 00:13, 11 March 2006 (UTC)
- Since it has only television as supercats, yes, it would help avoid mistakes. Our Gang filmography needs to be moved, and the others checked to assure there's nothing from Category: Radio programs. —RandallJones 20:23, 11 March 2006 (UTC)
- I just added the cfr to rename Category:Episode lists to Category:Lists of television series episodes. I'll try to find a place for Our Gang and check that there are no other non-television series. --JeffW 19:28, 14 March 2006 (UTC)
I think enough time has passed for discussion and I haven't received any replies requesting that I not do this, so I'm going to create Category:Episodes by television series putting it under Category:Television series and start moving the episode categories there. --JeffW 16:36, 13 March 2006 (UTC)
- I accidentally created it as "televion show" instead of "television series". Is it worth doing a CfR? --JeffW 17:40, 13 March 2006 (UTC)
- I deleted it (supposing that was what you wanted). BTW good idea to create Category:Episodes by television series, I'm glad these subcategories of Category:Episode lists can now go elsewhere. -- User:Docu
- Thanks. I've finished adding the new category to the pages that I had previously stripped of the old category. --JeffW 21:44, 13 March 2006 (UTC)
Sub-categorizing the Lists Category
I recently added and rearranged the sub-categories in the Lists category so there would be categories for the subjects Art, Culture, Geography, History, Mathematics, People, Philosophy, Religion, Science, and Technology. This seems to be working pretty well, but these categories are mixed in with Abbreviations, Books, by Country, by Form (timeline etc.), Reference material, Worst lists, and Year lists which, I think, dilutes their usefulness. I tried sorting the subject cats to the beginning using space followed by the category name, but they came out in two columns. Then someone who was just "tidying up" removed the spaces putting it back the way it was.
Is there another way to do this? Does anyone else feel that it would be a good idea to ask for something like [[Category:Lists|Art|Subjects]] from the programming crew? (with the second piping meaning to place the link to the sub-category under the heading "Subjects", instead of the default letter headings).
In the meantime would naming the categories something like "Subject: Art lists", "Subject: Culture lists", etc. be reasonable? --JeffW 08:57, 15 March 2006 (UTC)
- See Category_talk:Lists#Reorganizing_Subcats_Based_on_Standard_.28topical.29_Schemes_.3F. -- User:Docu
- Are you referring to your cat scan tool? If so, I don't see how it helps in figuring out how to sub-categorize and format a category in a way to make navigation easy for a normal user, who presumably won't be using cat scan. --JeffW 19:52, 15 March 2006 (UTC)
Divider between categories in category listing at bottom of a page
Not sure if this is the best place to ask – please redirect me if necessary – but would anyone else prefer the "middle dot" rather than the "vertical line" as the divider between category names in the list at the bottom of a page, i.e.:
Categories: This category · That category · Another category · etc.
rather than
Categories: This category | That category | Another category | etc...?
When glancing, I find it slightly easier to distinguish the categories listed using the dot rather than the line. Thanks for your thoughts, David Kernow 05:13, 15 March 2006 (UTC)
- I don't know if they're easier to distinguish, but they look cooler. --JeffW 19:12, 15 March 2006 (UTC)
"Related categories"?
I know we have examples of "related categories" being created manually, such as WTC-9/11, but is there any serious thought to making them an automated feature? That sure would make it a lot easier to create meaningful broader/narrower term hierarchies and avoid those pesky loops! >;-o Rfrisbietalk 03:17, 16 March 2006 (UTC)
- Automation as in, as part of the MediaWiki software, or via a bot, etc.? I'm not sure I immediately see how this might be done. Alai 13:40, 16 March 2006 (UTC)
- I wasn't very clear, sorry. Say I type in "[[Relatedcategory:Foo]]" on a category page. Then that shows up in its own section at the top of the page, just under the subcategories section. That way, "See also" comments can be avoided, subcategories are more likely to be true subsets, while related categories can be overlapping sets. Gyrations around loops should be much simpler too. Rfrisbietalk 20:56, 16 March 2006 (UTC)
- This suggestion has been floating around for a while. I think I made a similar suggestion about a year ago on this page. I was thinking of using a related category classification to handle the situation when one hierarchy of categories is a subset of another hierarchy, such as Category:Bridges in the United States and Category:Toll bridges in the United States. But after more than a year of thinking about this issue, I wonder if having a section of related categories is any better than what we are doing now. I wonder if it would be any clearer and easily understood. Having a comment that says "See also" seems to be a clear way of saying "This doesn't belong in this category but you might be looking for it here". Trying to create "true subsets" might be a path that creates more editing conflicts. Part of the beauty of the categorization scheme is that it is not a rigid taxonomy. -- Samuel Wantman 22:18, 16 March 2006 (UTC)
- I agree suggesting "related categories" isn't a new idea. They're a staple of any good taxonomy. I also agree the "See also" kludge is conceptually the same. And I agree a network is more useful than a hierarchy. On the other hand, a "manual" approach loses any meaningful capability for systematic database management of these category relationships. I fully expect that the easier they are to create and view in a consistent way, the more they will be used. In my opinion, that's the tipper for doing it through a related categories code. Rfrisbietalk 22:36, 16 March 2006 (UTC)
Top-Sorting
I've noticed that it has become common practice to sort the main article of a category under * so it will appear at the top of the page list. How does this interact with the catmore template? Is one preferred or should both be used?
Also, in many cases it's appropriate to sort a list (especially when all the pages in a category are instances) to the top of the page list. If there is already a main article for that category, my gut says lists should appear after the main article. Should they be sorted as ** so they appear after the main article but in the same block? Or perhaps + should be used so they appear as a second block. Or should the main article be sorted with a space instead of * and the * used for the lists? --JeffW 19:07, 16 March 2006 (UTC)
- This is currently being done several different ways. The guidelines suggest the space for main articles, but are not clear about the star. I think that eponymous articles should be sorted with a space and all others should use the *. --Samuel Wantman 22:27, 16 March 2006 (UTC)
- I've noticed this very "diversity" in the sorting of sub-categories of stub types, which possibly most often use a * as a prefix, sometimes a space, and sometimes no key prefix at all (or indeed, no key). In such cases I think it makes sense to use something, as often these are largeish categories with multiple "pages", and it's useful if all the subcats appear on the first one. Some standardisation would probably be nice (said more in faint hope than strong expectation). Alai 01:21, 18 March 2006 (UTC)
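For anyone weighing the space, `*`, `**` and `+` prefixes, here is a small sketch of how those sort keys would compare. It assumes the listing is ordered by plain character code of the sort key, which is an assumption about the software rather than something stated in this thread; the page titles and the `listing_order` helper are hypothetical.

```python
from typing import List, Tuple


def listing_order(pages: List[Tuple[str, str]]) -> List[str]:
    """Return titles ordered by (sort_key, title) using code-point comparison."""
    return [title for title, key in sorted(pages, key=lambda p: (p[1], p[0]))]


# (title, sort key) pairs; titles are made up for illustration.
pages = [
    ("Akashi Bridge", "Akashi Bridge"),   # ordinary page, keyed by its title
    ("Bridges by country", "+"),          # '+' prefix: a separate later block
    ("List of bridges", "**"),            # '**': after '*', before '+'
    ("Bridge topics", "*"),               # '*' prefix
    ("Bridge", " "),                      # space: eponymous/main article
]

for title in listing_order(pages):
    print(title)

# Code points behind the ordering: space=32, '*'=42, '+'=43, digits from 48, 'A'=65.
print([ord(c) for c in " *+0A"])
```

Under that assumption a space-keyed main article always precedes `*`-keyed pages, and `**` lands between `*` and `+`, so the schemes proposed above would not collide with one another.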
Subcategory intro wording
I recently noticed awkwardness in the wording of the automatically-generated text that precedes a category’s subcategories. The awkward “There are 62 subcategories to this category.” would be better worded as “There are 62 subcategories within this category.” —optikos 00:47, 20 March 2006 (UTC)
Is it appropriate to try regulate guideline/policy via help: namespace?
See also my comments here
There appears to be no consensus to include {{Wikipedia-specific help}} with its present content in Wikipedia:Categorization.
Trying to blank out a link to Wikipedia:Categories, lists, and series boxes in help namespace is WP:POINT, if not plain obnoxiousness, by user:omniplex – not an appreciable procedure for getting things his way. --Francis Schonken 12:46, 22 March 2006 (UTC)
- Your very own guideline isn't the problem, the recursion on the help page is. Your clumsy modification attempt cannot work, if you want to get rid of the pointless recursion kill the Template:Ph:Category redirection. Omniplex 15:23, 22 March 2006 (UTC)
I did. See: Template talk:Wikipedia-specific help#No consensus.
Your edit summary read "editing help pages here is pointless" - so don't. Four times you tried to edit away the Wikipedia:Categorization#Categories vs. Lists vs. Info boxes section on the help page, by editing the Wikipedia:Categorization with noinclude tags. Indeed stop that pointless & confusing editing of the help page here. --Francis Schonken 15:34, 22 March 2006 (UTC)
- Now Help:Category and its redundant info (redundant when included by the help page) cannot muddy the water. Your own guideline not allowing the addition of technical facts is dubious, but that issue is unrelated to included {{style}} and {{guideline}} templates clobbering the derived help page. An included "see also" is also confusing, one trick too many for any included guidelines. is not more included in
- For others, check out Help:Category#Wikipedia-specific_information, the complete Categorization article was automagically inserted there by a redirection of Template:Ph:Category. Omniplex 16:57, 22 March 2006 (UTC)