
Britannica Public Domain

This is an old revision of this page, as edited by Koyaanis Qatsi (talk | contribs) at 12:51, 31 March 2002. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Britannica Public Domain/Status

Download wikified articles

As I understand it, Project Gutenberg has published what they call the "Project Gutenberg Encyclopedia". It is actually the classic 11th edition of the Encyclopaedia Britannica. However, because Britannica is a trademark, they were unable to call it "Encyclopaedia Britannica", even though the text is now public domain (because it is so old).

Call it what you will, it is available, and we should consider putting the entire text here, in the Wikipedia. This may be a lot of work (like my own pet project, which is cutting and pasting all of the public domain CIA World Factbook over in Countries of the world -- please help me!), but it sure would be a fun thing to have in here. People could start updating/replacing the articles with Project Gutenberg Encyclopedia as the foundation.

Also, I bet there are a lot of instances where the text of the article could be wikified by cramming words together. -- Jimbo Wales


Whoa, WhatACoincidence. I just now was downloading vol#1 from it and seeing what it'd take to put it in.  :-)

The license says it's in the public domain and we can do whatever we please with it...

I will volunteer to work on chopping out all of the articles from volume 1 and posting them in the appropriate places. Other than really basic stuff, I'm not going to fiddle with the formatting nor update any inaccuracies contained in them, so anyone seeing things to change, correct, or update should lay into it.

The two MAJOR concerns I have in doing this are (a) the information may be WAY out of date in some cases, maybe even to the point of uselessness, and (b) the articles may be too long for the average wikitizen to get into - they may feel too intimidated by the length and tone of the article to add or correct as we'll need.

But, I guess we'll never know unless we try, so here goes...  ;-) -- BryceHarrington


I would suggest adding "PG" or "EB" or "11" at the end of the name of any page that is taken from EB. Then we could edit versions of that, while the original is preserved. -- Larry Sanger


It appears that only the first volume of this encyclopedia was entered into PG... It occurs to me that if people are interested in the original, then they can always just go to the Project Gutenberg site, and then be certain they're getting the straight stuff. Or they can trust my cut-and-paste skill and look at revision 0...

I think our purpose in using that material is just as a starting point for our own encyclopedia... Thus I think it'd be better if I just plugged them right into the topics, without putting 'EB' on the topicname. So we'd have just one copy of each article, plus any commentary placed on top of them by subsequent editing. Does that sound okay? -- BryceHarrington


Sounds alright -- WojPob


In looking it over, there are two rules we'll have to follow. First, we cannot use the name "Britannica" because that is a trademark. Second, Project Gutenberg requires that NO mention be made of Project Gutenberg if the text is altered from what they supply. We are free to copy and reuse the text, and edit it as we wish, but can't claim it is from Encyclopaedia Britannica nor from Project Gutenberg. Weird, eh? Anyway... Correct me if I've misinterpreted the various bits of small text. -- BryceHarrington


Where can I find the 11th edition for download? I've banged around on the EB site and can't find it. I'd like to start updating and adding the articles. Thanks.


It's not on the Britannica site. Hard to find via Google too. ftp://ftp.cdrom.com/pub/gutenberg/etext95 Look for pge*, etext #200, Jan 1995.


For this material to be truly useful, it needs to be marked as from Britannica of that edition, and NO EDITING ALLOWED! Otherwise, it's useless. On a related note, without being able to cite references, and without examining references other people have cited, the usefulness of material here is limited to, well, relatively useless stuff. Is there a way to mark up citations automatically? Or am I missing the point somewhere?


I couldn't disagree more. The text of these articles is ours--it belongs to us, the public. We can and should use it in every way that could possibly benefit our project. If that means copying some things verbatim (An article on the history of the letter "A" certainly shouldn't need much updating), then we can do that. If that means completely ignoring some articles that are too hopelessly out of date or out of context, then we should do that. But I think there's a vast area in between where we can take the information from these articles and write our own in more modern language and with other updates and additions. No credit is necessary, as "authorship" is not a relevant concept here. We might also use even the out-of-date texts as examples of outdated historical ideas. We should certainly use things like illustrations and tables. It's a great source, and while I agree that the text in Wikipedia should be written by live humans in our modern context, those people should take advantage of this source in whatever way they feel appropriate. --LDC


LDC: Thanks, nice explanation. On a moment's reflection, what I wrote previously bears qualifying, namely in how I measure usefulness for myself. I completely agree about lack of credit (thus anonymity suits me well for anything I write here).



The first volume can be found here
http://sailor.gutenberg.org/etext95/pge0112.txt

Is there an e-text of the other volumes?

A script could be used to wikify the text.
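
As a rough illustration of that idea, here is a minimal Python sketch, not anything the project actually runs. It assumes links are written with double square brackets and that a list of existing article titles is supplied by hand; the function name `wikify` and the sample titles are hypothetical.

```python
import re


def wikify(text, titles):
    """Wrap the first mention of each known title in [[...]] link brackets.

    `titles` is an assumed, hand-supplied list of existing article names.
    """
    if not titles:
        return text
    # Try longer titles first so "Falkland Islands" wins over "Falkland".
    ordered = sorted(titles, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b"
    )
    seen = set()

    def link(match):
        title = match.group(1)
        if title in seen:
            # Later mentions are left as plain text.
            return title
        seen.add(title)
        return "[[" + title + "]]"

    return pattern.sub(link, text)


sample = "Algeria borders Tunisia. Algeria is in Africa."
print(wikify(sample, ["Algeria", "Africa"]))
```

Linking only the first mention is a choice made for this sketch; the matching is case-sensitive and knows nothing about plurals or alternate spellings, so the result would still need a human pass.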


Thanks, I've downloaded it but I don't expect to be adding any of it for a while. I would like to know about the other volumes too (though I don't expect to finish 'A through ANDROPHAGI' for quite a while--it unzips to an 8 MB text file.)


I could not find any volumes other than 'A'; I guess whoever started putting it into Gutenberg must have given up after the first volume. Also, I don't know if *all* of the entries ought to be placed in Wikipedia; many of them are irrelevant to say the least. I'd encourage picking and choosing; people can always go back to the original Gutenberg text if they *must* know that the vice-bishop of cambrishire from 1889-1894 was an avid collector of moths and wrote sonnets about the Holy Ghost.  ;-)


This topic is also mentioned on Wikipedia Projects/Project Gutenberg Encyclopedia

I'm working on posting more articles from it also. Many of the articles are definitely worthless in my opinion. Obviously opinions will vary. Many basic facts on historical cities or figures may be good and worth posting. Other articles (such as 19th-century contemporary politicians or towns/villages in existence in 1911) do not seem to have contemporary relevance, so I'm skipping them.

I think it's going to be discretionary as to what is worth keeping, but that's the beauty of the Wikipedia: anyone else can fix it later :-)

- Alan Millar


Is it known whether anybody is currently scanning the other volumes? --AxelBoldt

Wouldn't it be nice to edit an article that began with something other than A?

I've got the 9th and 10th editions of this encyclopedia (1870-1900, which should be out of copyright if the 1911 edition is), but don't want to destroy the bindings to scan the text in. I'll be typing up odd bits of material from these volumes to add to pages, like the paragraph I added to Falkland Islands/History. There's a load of nice Victorian engravings in these volumes that I could scan in for illustrations; count this as another request for a picture upload facility - Malcolm Farmer


Not citing Project Gutenberg when something has been amended does not present a problem since they have no responsibility for the detailed text anyway. I still believe that even long after a work has gone into the public domain, the author still retains a moral right. Also, being able to give sources adds credibility to an article. A series of standardized abbreviations for common information sources is helpful; for the work in question I would go along with "EB11" to distinguish it from the earlier editions that Malcolm plans to use. I would also tend to use the phrase "adapted from" to allow for the fact that even if I don't change the text somebody else may. The Britannica has been repeatedly quoted ever since it first came out in the 1700s, and I don't see that merely mentioning it as a/the source violates any trademark. If that fails, PG's reference to the alternative name "Encyclopedia Anglicana" provides an alternative. There could always be a tacit understanding of just what that term means.

Choosing what articles to include can present a problem. Clearly the historical articles remain among the most valuable, while the ones dealing with technology are most likely to be obsolete. Still, even the articles about technology have a historical component of continuing interest. Someone has already entered most of the list of articles in EB11 vol 1. If someone feels that an article should not be included in the Wikipedia he can always edit the list with a brief note after that specific entry in the index.


There's a commercial CD-ROM edition of the 11th EB (scanned TIFF files, not OCR'd) available from www.classiceb.com, whose web site claims that the words of the EB are public domain but the scanned images are copyrighted with redistribution of the image files not allowed. It's a 10 disc CD set costing about 100 bucks. Buying a copy and OCR'ing the TIFF files might be a reasonable way to get the text files without having to locate a paper EB copy and scan it all again. The Gutenberg version of vol. 1 might be the result of someone scanning vol. 1 and then not having the energy to do the rest of the volumes. By the way the classiceb site has a one page sample scan, which they claim is from the "aeronautics" article. That's wrong, it's actually "aerostation" as can be seen by comparing the scan to the Gutenberg text version. That comparison confirms that the Gutenberg encyclopedia is indeed the EB.

The trademark issue seems nuts, by the way--classiceb isn't hesitating to call their product a copy of the EB 11th. It's the title of the work, after all. --Paul Rubin

Eclecticology


Britannica has put it all online now, at http://1911encyclopedia.org/ and all in plain text. I'd suggest a review of its copyright status - it's past the 75-year corporate pre-Bono rule, but I am not a lawyer. If it should prove to be open season on the text - as the existence of the Gutenberg text suggests - then we'll still have to deal with the issues described above, as to which articles to include and how to update them. I was reading the "nebula" entry and giggling to myself. How things change in 91 years... -- April


I had a hell of a time with that, going through and updating and verifying almost everything in, I think, the Algeria entry. It is definitely a project to take cautiously, and in small doses. Koyaanis Qatsi


Yes, I was reading some articles, and copying and correcting them is worth doing (slowly, as KQ points out). The articles are simply scans of the encyclopedia, with no quality control whatsoever, and there are lots of machine misreadings (in the article GALILEO, the telescope was invented around 2608). Also, Britannica has nasty pop-up ads, so porting some of them here would be a service to the pop-up hating public.

As expected, they do not claim copyright on the contents anywhere I could see.

The advice would then be: do not port articles about matters you know nothing about, but only those to which you can give, at least, a first-order correction and update... AstroNomer


We can't even call it "EB". Encyclopaedia Britannica Inc. owns a United States registered trademark (serial no. 75452723) on just those two letters. See the USPTO's record of this trademark. --Damian Yerrick

I'm not a lawyer either, but trademarks are not the same as copyrights. Trademarks can be used by others (how else could grocery stores and theaters advertise their offerings?), but they have to be used "properly" (for example Xerox should be used as an adjective, not a noun, and definitely not a verb), and a superscript TM should appear the first time the word is used. It's really not correct to quote from something and then not give the source, or refer to it as the encyclopedia that shall not be named or something ;-) -- Maybe we should just have a page somewhere stating that all trademarked words are the property of their respective owners (yeah, it sounds dumb, but...) -- Marj Tiefert

I've started the acknowledgment list right here: Wikipedia:Trademark notices

Cool! :-) -- Marj Tiefert, Thursday, March 28, 2002



If I'm understanding this right, I assume that the EB 11th edition is not necessarily free for all purposes but that it is free for *our* purposes, i.e. copyright on some sections may have been renewed, etc. I suspect the best thing to do is to enter the strict 11th edition text as first entry - so it goes as is into revision control. Then future diffs are just that, diffs, and can be rolled back to the 11th edition exactly as it was printed then...

If you find whole sections of the 11th edition EB text irrelevant, fine, then ignore those... but quote exactly what you do include.

Ideally, all attribution of this many-authored EB 11th edition, all citing from it, would be in the rev control... which any author should be able to alter knowing what s/he's doing... "Summary" is a bad word... implies subtitle.

Strict rev control would always ask four questions of any change you make:

  1. what annoyed you so much you thought you had to change this?
  2. what did you specifically plan to do when you chose to edit this page?
  3. after the fact, what do you think you managed to do, and not to do?
  4. what would be the best possible test for this entry/module as it stands? who would be its toughest reviewer?

"copyright on some sections may have been renewed"

Wrong. Under United States copyright law tradition, copyright term extensions typically do not re-copyright works that have passed into PD due to term expiration. (URAA was about technicalities of notice and registration, not term extension.) EB11 passed into PD at the end of 1967 under the 56-year rule of the Copyright Act of 1909, and no treehugging politician can change that. In the United States, nothing first published before 1923 is still under copyright.

"enter the strict 11th edition text as first entry - so it goes as is into revision control."

As long as we make it clear that what we are publishing is not EB™ but a derivative work of the contents of EB11, no revision control should be strictly necessary. --Damian Yerrick


This has been discussed to death 100 times here--anybody can claim a copyright to anything, but that doesn't make it true. The entire contents of the 1911 EB are, now and forever, in the public domain. They belong to us, and to everyone else. Anyone who makes any claim to the contrary is lying (which is perfectly legal). We can copy them and use them for any damned thing we please, with credit or without. The trademarks on the names "EB" and "Encyclopedia Britannica" are still active, so those have to be used with a bit of care, but other than that, stop worrying about it. The public domain is just that--the public's. --LDC


There's another argument strict entry of the 1911 EB articles "as is" the first time they go in - to be able to roll such articles back for historical reasons, if we really intend to be in some ways a superset of the original EB11/ProjectGutenberg, would help the credibility of the whole project... we are just writing "another fork off the 1911 understanding of the world", so there is no question about this being "a real encyclopedia"

At least until the next software switchover... --Damian Yerrick

As to the legals, you both interpret the law, US and global, to say that if something was published before 1923 and exactly those same words were also published in say 1983 by the same or different people, there's no copyright problem, anywhere in the world that someone might read the wikipedia? OK...


That first paragraph doesn't parse as complete English sentences, so I'm not sure what you're trying to say. Wikipedia has not ever expressed any desire to be associated with EB in any way, and we have our own goals that may or may not have anything to do with theirs.

As to the law, I am speaking of those actions which the present government of the United States may take with respect to actions performed by me (or any other US resident) in the U.S. In that case, the U.S. (and, so far as I know not being an expert on laws of other countries, the UK) does not regard the contents of the 1911 EB as "property" in any sense; it is as free as the air we breathe, and any use we Americans make of it will not be met with any resistance from any agency of the government. Of course I make no claims for any other--East Timor might very well decide tomorrow that it owns the text and sue us, but I doubt American courts would recognize that action, and they certainly don't have the force to back it up. --LDC


You're right. US copyright law covers only original expression (17 USC 103(b)), and EU law should be similar. You can't "re-copyright" a work simply by printing a new edition without at least putting some effort into a derivative work, and even then, copyright extends only to the new material. --Damian Yerrick, who's not a lawyer


I agree, this is generally not a problem, and the only stupid statement is this: "they certainly don't have the force to back it up"

Last I checked, anyone could mail anthrax...