Data library
A data library, data archive, or data repository is a collection of numeric and/or geospatial data sets for secondary use in research. A data library is normally part of a larger institution (academic, corporate, scientific, medical, governmental, etc.) established for research data archiving and to serve the data users of that organisation. The data library tends to house local data collections and provides access to them through various means (CD-/DVD-ROMs or central server for download). A data library may also maintain subscriptions to licensed data resources for its users to access. Whether a data library is also considered a data archive may depend on the extent of unique holdings in the collection, whether long-term preservation services are offered, and whether it serves a broader community (as national data archives do). Most public data libraries are listed in the Registry of Research Data Repositories.
Importance of data libraries and data librarianship
In August 2001, the Association of Research Libraries (ARL) published SPEC Kit 263: Numeric Data Products and Services, presenting results from a survey of ARL member institutions involved in collecting and providing services for numeric data resources.
Services offered by data libraries and data librarians
Data libraries and data librarians provide institutional-level support for the use of numerical and other types of datasets in research. The support activities typically available include:
- Reference Assistance: locating numeric or geospatial datasets containing measurable variables on a particular topic or group of topics, in response to a user query.
- User Instruction: providing hands-on training to groups of users in locating data resources on particular topics, downloading data and reading it into spreadsheet, statistical, database, or GIS packages, and interpreting codebooks and other documentation.
- Technical Assistance: easing registration procedures, troubleshooting problems with a dataset (such as errors in the documentation), reformatting data into something a user can work with, and helping with statistical methodology.
- Collection Development & Management: acquiring, maintaining, and managing a collection of data files used for secondary analysis by the local user community; purchasing institutional data subscriptions; acting as the institution's site representative to data providers and national data archives.
- Preservation and Data Sharing Services: acting on a preservation strategy for datasets in the collection, such as media refreshment and file format migration; downloading and keeping records on updated versions from a central repository; and assisting users in preparing original data for secondary use by others, either for deposit in a central or institutional repository or for less formal ways of sharing data. This may also involve marking up the data in an appropriate XML standard, such as the Data Documentation Initiative, or adding other metadata to facilitate online discovery.
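The metadata markup step mentioned in the last item can be illustrated with a minimal sketch. It uses Python's standard `xml.etree.ElementTree` module to write a small codebook whose element names (`codeBook`, `stdyDscr`, `titl`, `dataDscr`, `var`, `labl`) are loosely modeled on the Data Documentation Initiative's Codebook format. The function name `build_codebook`, the dataset title, and the variables are hypothetical, and the output is a simplified illustration that has not been validated against the DDI schema.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, variables):
    """Return a simplified DDI-style codebook as an XML string.

    `variables` maps variable names to human-readable labels.
    This is an illustrative sketch, not a schema-valid DDI document.
    """
    root = ET.Element("codeBook")

    # Study-level description: here only the title of the dataset.
    study = ET.SubElement(root, "stdyDscr")
    citation = ET.SubElement(study, "citation")
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title

    # Variable-level description: one <var> per column, each with a label.
    data = ET.SubElement(root, "dataDscr")
    for name, label in variables.items():
        var = ET.SubElement(data, "var", name=name)
        ET.SubElement(var, "labl").text = label

    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset title and variables, for illustration only.
xml_text = build_codebook(
    "Example Household Survey",
    {"age": "Age of respondent in years", "region": "Region of residence"},
)
print(xml_text)
```

A record of this kind lets a later user see what each variable means without opening the data file itself, which is the purpose of the documentation and discovery metadata described above.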
Universal digital library
The concept of a universal digital library was described as "within reach" in a 2012 Los Angeles Times article[1] ("A universal digital library is within reach"), which discussed Google's attempts to "mass-digitize" what are termed "orphan works" (out-of-print copyrighted works).
The U.S. Copyright Office and the European Union have both been working on the orphan works issue. Google has reached an agreement in France that "lets the publisher choose which works can be scanned or sold." By contrast, in the USA Google has sought a deal under which it would be "free to digitize and sell any works unless the copyright holders opted out", so far without success.[2]
Information repository
Attempts to develop what was called an "information repository" have been underway for decades:
- In 1989, IBM tried to have OfficeVision combine mainframes and PCs to enable "an information repository."[3]
- In 2003, Microsoft introduced OneNote as an extension to Microsoft Office 2003 that would support "a personal information repository."[4]
- In 1996, a library founded in 1898 obtained additional funding to expand its mission and become a major "local resource center and regional information repository."[5] The New York Times described it as "the second largest in the New York City region, second only to the New York Public Library on Fifth Avenue." Its services include "a computer information center devoted to outside-item requests."
Associations
- IASSIST (International Association for Social Science Information and Service Technology)
- DISC-UK (Data Information Specialists Committee—United Kingdom)
- APDU (Association of Public Data Users - USA)
- CAPDU (Canadian Association of Public Data Users)
Examples of data libraries
Natural sciences
The following list refers to scientific data archives.
- CISL Research Data Archive
- Dryad
- ESO/ST-ECF Science Archive Facility
- International Tree-Ring Data Bank
- Inter-university Consortium for Political and Social Research
- Knowledge Network for Biocomplexity
- National Archive of Computerized Data on Aging
- National Archive of Criminal Justice Data
- National Climatic Data Center
- National Geophysical Data Center
- National Snow and Ice Data Center
- National Oceanographic Data Center
- Oak Ridge National Laboratory Distributed Active Archive Center
- Pangaea - Data Publisher for Earth & Environmental Science
- World Data Center
- DataONE
- 4TU.Centre for Research Data
Social sciences
In the social sciences, data libraries are referred to as data archives. Data archives are professional institutions for the acquisition, preparation, preservation, and dissemination of social and behavioral data. Data archives in the social sciences evolved in the 1950s and have been perceived as an international movement:
By 1964 the International Social Science Council (ISSC) had sponsored a second conference on Social Science Data Archives and had a standing Committee on Social Science Data, both of which stimulated the data archives movement. By the beginning of the twenty-first century, most developed countries and some developing countries had organized formal and well-functioning national data archives. In addition, college and university campuses often have 'data libraries' that make data available to their faculty, staff, and students; most of these bear minimal archival responsibility, relying for that function on a national institution (Rockwell, 2001, p. 3227).[6]
- re3data.org is a global registry of research data repositories that indexes data archives from all disciplines: http://www.re3data.org
- CESSDA Members are data archives and other organisations that archive social science data and provide data for secondary use: https://www.cessda.eu/About/Consortium
- Consortium of European Social Science Data Archives: http://www.cessda.org/
- Finnish Social Science Data Archive (FSD): http://www.fsd.uta.fi/
- The Danish Data Archives: http://www.sa.dk/content/us/about_us ; specific page (only in Danish): http://www.sa.dk/dda/default.htm
- Inter-university Consortium for Political and Social Research: http://www.icpsr.umich.edu/
- The Roper Center for Public Opinion Research: https://ropercenter.cornell.edu/
- The Social Science Data Archive: http://dataarchives.ss.ucla.edu/
- The NCAR Research Data Archive: http://rda.ucar.edu
- Cornell Institute for Social and Economic Research: https://ciser.cornell.edu/data/data-archive/
References
- Clubb, J., Austin, E., and Geda, C. "Sharing research data in the social sciences." In Sharing Research Data, S. Fienberg, M. Martin, and M. Straf, Eds. National Academy Press, Washington, D.C., 1985, 39-88.
- Geraci, D., Humphrey, C., and Jacobs, J. Data Basics. Canadian Library Association, Ottawa, ON, 2005.
- Martinez, Luis & Macdonald, Stuart. "Supporting local data users in the UK academic community." Ariadne, issue 44, July 2005.
- See the IASSIST Bibliography of Selected Works for articles tracing the history of data libraries and their relationship to the archivist profession, from the 1960s and '70s up to 1996.
- See IASSIST Quarterly articles from 1993 to the present, focusing on data libraries, data archives, data support, and information technology for the social sciences.
References
- ^ Pamela Samuelson (May 1, 2012). "A universal digital library is within reach". The Los Angeles Times. https://www.latimes.com/opinion/la-xpm-2012-may-01-la-oe-samuelson-google-books-and-copyright-20120501-story.html
- ^ Eric Pfanner (August 25, 2011). "In France, Publisher and Google Reach Deal". The New York Times.
- ^ "IBM software to integrate systems". The New York Times. May 17, 1989.
- ^ John Markoff (December 11, 2003). "For Doodlers and Pack Rats, a Multi-Media Binder". NYTimes.com.
- ^ F. Romall (May 12, 1996). "Mt. Vernon Library Marks Its 100th Year". The New York Times.
- ^ Rockwell, R. C. (2001). Data Archives: International. In: Smelser, N. J. & Baltes, P. B. (eds.) International Encyclopedia of the Social and Behavioral Sciences (vol. 5, pp. 3225-3230). Amsterdam: Elsevier.