Scope of collection::Project Gutenberg



Scope of collection

Growth of Project Gutenberg publications from 1994 until 2008.

As of August 2015, Project Gutenberg claimed over items in its collection, with an average of over fifty new e-books added each week.<ref>According to gutindex-2006, there were 1,653 new Project Gutenberg items posted in the first 33 weeks of 2006. This averages out to 50.09 per week. This does not include additions to affiliated projects.</ref> These are primarily works of literature from the Western cultural tradition. In addition to literature such as novels, poetry, short stories and drama, Project Gutenberg also has cookbooks, reference works and issues of periodicals. The collection also has a few non-text items such as audio files and music notation files.

Most releases are in English, but there are also significant numbers in many other languages. As of February 2013, the most represented non-English languages were French, German, Finnish, Dutch, Portuguese, and Chinese.

Whenever possible, Gutenberg releases are available in plain text, mainly using US-ASCII character encoding but frequently extended to ISO-8859-1 (needed to represent accented characters in French and the Eszett, ß, in German, for example). Besides being copyright-free, a release is required to have a text version in a Latin character set. Michael Hart made this a criterion at the founding of Project Gutenberg, believing it to be the format most likely to remain readable in the extended future.<ref>Various Project Gutenberg FAQs allude to this.</ref> Out of necessity, this criterion has had to be extended for the sizable collection of texts in East Asian languages such as Chinese and Japanese, where UTF-8 is used instead.
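The encoding progression described above (US-ASCII first, ISO-8859-1 when accented Latin characters appear, UTF-8 for East Asian text) can be illustrated with a minimal Python sketch. The helper name `narrowest_encoding` and the sample titles are assumptions made for illustration; this is not Project Gutenberg's own tooling, only one plausible way to pick the narrowest of the three encodings for a given text.

```python
# Illustrative sketch only: a hypothetical helper, not part of any
# Project Gutenberg software. It tries the three encodings named in the
# text, from narrowest to widest, and returns the first one that works.
CANDIDATE_ENCODINGS = ("ascii", "iso-8859-1", "utf-8")


def narrowest_encoding(text: str) -> str:
    """Return the first candidate encoding able to represent all of text."""
    for name in CANDIDATE_ENCODINGS:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # some character does not fit; try the next encoding
        return name
    # Reached only for strings that even UTF-8 rejects (e.g. lone surrogates).
    raise ValueError("text cannot be encoded with any candidate encoding")


if __name__ == "__main__":
    for sample in ("Moby Dick", "Les Misérables", "Straße", "紅樓夢"):
        print(sample, "->", narrowest_encoding(sample))
```

Plain English text stays in US-ASCII, French accents and the German ß need ISO-8859-1, and Chinese characters fall through to UTF-8, which mirrors the order in which the collection's encodings were extended.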

Other formats may be released as well when submitted by volunteers. The most common non-ASCII format is HTML, which allows markup and illustrations to be included. Some project members and users have requested more advanced formats, believing them to be much easier to read. However, formats that are not easily editable, such as PDF, are generally not considered to fit the goals of Project Gutenberg. Project Gutenberg also accepts two master formats, from which all other files are generated: customized versions of the Text Encoding Initiative standard (since 2005) and reStructuredText (since 2011).

Since 2009, the Project Gutenberg catalog has offered auto-generated alternate file formats, including HTML (when not already provided), EPUB and Plucker.
