Open Data

From Wikipedia, the free encyclopedia


Open Data is a philosophy and practice requiring that certain data are freely available to everyone, without restrictions from copyright, patents or other mechanisms of control. It has a similar ethos to a number of other "Open" movements and communities such as open source and open access. However, these are not logically linked and many combinations of practice are found. The practice and ideology are well established (for example in the Mertonian tradition of science) but the term "Open Data" itself is recent. Much of the emphasis in this entry is on data from scientific research and from the data-driven web. In some cases Open Data may be considered as more properly Open Metadata and there is not yet a consistent formalisation. This article uses recent publications and activities to define the scope of the concept and term.



The concept of Open Data is not new; but although the term is currently in frequent use, there are no commonly agreed definitions (unlike, for example, Open Access where several formal declarations have been made and signed).

Open Data is often focussed on non-textual material such as maps, genomes, chemical compounds, mathematical and scientific formulae, medical data and practice, bioscience and biodiversity. Problems often arise because these are commercially valuable or can be aggregated into works of value. Access to, or re-use of, the data are controlled by organisations, both public and private. Control may be through access restrictions, licenses, copyright, patents and charges for access or re-use. Advocates of Open Data argue that these restrictions are against the communal good and that these data should be made available without restriction or fee. In addition, it is important that the data are re-usable without requiring further permission, though the types of re-use (such as the creation of derivative works) may be controlled by license.

A typical depiction of the need for Open Data:

Numerous scientists have pointed out the irony that right at the historical moment when we have the technologies to permit worldwide availability and distributed process of scientific data, broadening collaboration and accelerating the pace and depth of discovery…..we are busy locking up that data and preventing the use of correspondingly advanced technologies on knowledge
[1] John Wilbanks, Executive Director, Science Commons

Creators of data often do not consider the need to state the conditions of ownership, licensing and re-use. For example, many scientists do not regard the published data arising from their work to be theirs to control, and they treat the act of publication in a journal as an implicit release of the data into the commons. However, the lack of a license makes it difficult to determine the status of a data set and may restrict the use of data offered in an Open spirit. Because of this uncertainty it is also possible for public or private organizations such as IEEE to aggregate such data, protect it with copyright and then resell it.

Under "Toward Open Data" Connolly (2005, v.i.) gives two quotations:

  • I want my data back. (Jon Bosak circa 1997)
  • I've long believed that customers of any application own the data they enter into it. [2] (This quote refers to Veen's own heart-rate data.)

These quotations suggest that Openness refers to the metadata (formats, licenses, ontologies) rather than the data themselves.


Keith Jeffery writes:

Although the term open data is rather new, the concept is rather old. The International Geophysical Year of 1957-8 caused the setting up of several world data centres and - more importantly - set standards for descriptive metadata to be used for data exchange and utilisation.[3]

In 1995 GCDIS (US) put the position clearly in On the Full and Open Exchange of Scientific Data (a publication of the Committee on Geophysical and Environmental Data, National Research Council) [4]:

"The Earth's atmosphere, oceans, and biosphere form an integrated system that transcends national boundaries. To understand the elements of the system, the way they interact, and how they have changed with time, it is necessary to collect and analyze environmental data from all parts of the world. Studies of the global environment require international collaboration for many reasons:
  • to address global issues, it is essential to have global data sets and products derived from these data sets;
  • it is more efficient and cost-effective for each nation to share its data and information than to collect everything it needs independently; and
  • the implementation of effective policies addressing issues of the global environment requires the involvement from the outset of nearly all nations of the world.
International programs for global change research and environmental monitoring crucially depend on the principle of full and open data exchange (i.e., data and information are made available without restriction, on a non-discriminatory basis, for no more than the cost of reproduction and distribution)."

The last phrase highlights the traditional cost of disseminating information by print and post. It is the removal of this cost through the Internet which has made data vastly easier to disseminate technically. It is correspondingly cheaper to create, sell and control many data resources and this has led to the current concerns over non-Open data.

More recent uses of the term include:

  • SAFARI 2000 (South Africa, 2001) used a license informed by ICSU and NASA policies [5]
  • the human genome [6] (Kent, 2002)
  • An Open Data Consortium on geospatial data [7] (2003)
  • Manifesto for Open Chemistry [8] (Murray-Rust and Rzepa, 2004)
  • Presentations to JISC and OAI under the title "Open Data" [9] (Murray-Rust, 2005)
  • Science Commons launch [10] (2004)
  • First Open Knowledge Forums (London, UK) run by the Open Knowledge Foundation on open data in relation to civic information and geodata [11] (February and April 2005)
  • The Blue Obelisk group in chemistry (mantra: Open Data, Open Source, Open Standards) (2005) doi:10.1021/ci050400b
  • The Petition for Open Data in Crystallography, launched by the Crystallography Open Database Advisory Board [12] (2005)
  • XML Conference & Exposition 2005 [13] (Connolly 2005)
  • SPARC Open Data mailing list [14] (2005)
  • First draft of the Open Knowledge Definition explicitly references "Open Data" [15] (2005)
  • XTech [16] (Dumbill, 2005), [17] (Bray and O'Reilly 2006)

In 2004, the Science Ministers of all nations of the OECD (Organisation for Economic Co-operation and Development), which includes most developed countries of the world, signed a declaration which essentially states that all publicly-funded archive data should be made publicly available.[18] Following a request and an intense discussion with data-producing institutions in member states, the OECD published in 2007 the OECD Principles and Guidelines for Access to Research Data from Public Funding as a soft-law recommendation.[19]

In 2005 Edd Dumbill introduced an "Open Data" theme in XTech.

In 2006 Science Commons [20] ran a 2-day conference in Washington where the primary topic could be described as Open Data. It was reported that the amount of micro-protection of data (e.g. by license) in areas such as biotechnology was creating a tragedy of the anticommons, in which the cost of obtaining licenses from a large number of owners makes it uneconomic to do research in the area.

In 2007 SPARC and Science Commons announced a consolidation and enhancement of their author addenda. [21]

Fundamental Open Rights

Arguments made on behalf of Open Data include:

  • "Data belong to the human race". Typical examples are genomes, data on organisms, medical science, environmental data.
  • Public money was used to fund the work and so it should be universally available.
  • It was created by or at a government institution (this is common in US National Laboratories and government agencies).
  • Facts cannot legally be copyrighted.
  • Sponsors of research do not get full value unless the resulting data are freely available.
  • Restrictions on data re-use create an anticommons.
  • Data are required for the smooth process of running communal human activities (map data, public institutions).
  • In scientific research, the rate of discovery is accelerated by better access to data. [22]

It is generally held that factual data cannot be copyrighted.[23] However, publishers frequently add their copyright statements (often forbidding re-use) to scientific data accompanying (supporting, supplementing) a publication. It is also usually unclear whether the factual data embedded in full text are covered by the copyright.

While the human abstraction of facts from paper publications is normally accepted as legal, there is often an implied restriction on machine extraction by robots.

As the term Open Data is relatively new, it is difficult to collect arguments against it. Unlike Open Access, where groups of publishers have stated their concerns, Open Data is normally challenged by individual institutions. Their arguments may include:

  • this is a non-profit organisation and the revenue is necessary to support other activities (e.g. learned society publishing supports the society)
  • the government gives specific legitimacy for certain organisations to recover costs (NIST in US, Ordnance Survey in UK)
  • government funding may not be used to duplicate or challenge the activities of the private sector (e.g. PubChem)

Relation to Open Access

Much data is made available through scholarly publication, which now attracts intense debate under "Open Access". The Budapest Open Access Initiative (2001) coined this term:

By "open access" to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

The logic of the declaration permits re-use of the data, although the term "literature" has connotations of human-readable text and can imply a scholarly publication process. In Open Access discourse the term "full-text" is often used, which does not emphasize the data contained within or accompanying the publication.

Some Open Access publishers do not require the authors to assign copyright, and the data associated with these publications can normally be regarded as Open Data. Other publishers have Open Access strategies that require assignment of the copyright, and in these cases it is unclear whether the data in publications can be truly regarded as Open Data.

The ALPSP and STM publishers have issued a statement about the desirability of making data freely available [24]:

Publishers recognise that in many disciplines data itself, in various forms, is now a key output of research. Data searching and mining tools permit increasingly sophisticated use of raw data. Of course, journal articles provide one ‘view’ of the significance and interpretation of that data – and conference presentations and informal exchanges may provide other ‘views’ – but data itself is an increasingly important community resource. Science is best advanced by allowing as many scientists as possible to have access to as much prior data as possible; this avoids costly repetition of work, and allows creative new integration and reworking of existing data.

We believe that, as a general principle, data sets, the raw data outputs of research, and sets or sub-sets of that data which are submitted with a paper to a journal, should wherever possible be made freely accessible to other scholars. We believe that the best practice for scholarly journal publishers is to separate supporting data from the article itself, and not to require any transfer of or ownership in such data or data sets as a condition of publication of the article in question.

However, this statement has had no effect on the open availability of primary data related to publications in the journals of ALPSP and STM members. Data tables provided by authors as supplements to a paper are still available to subscribers only.

Relation to other Open Activities

A number of other "Open" philosophies are similar to, but not synonymous with, Open Data; they may overlap with it, contain it, or be contained by it. They are briefly listed and compared here.

  • Open Source (Software) is concerned with the licenses under which computer programs can be distributed and is not normally concerned primarily with data.
  • Open Content has similarities to Open Data and may be seen as a superset but differs in that it emphasizes creative works while Open Data is more oriented towards factual data and the output of the scientific research process.
  • Open Notebook Science refers to the application of the Open Data concept to as much of the scientific process as possible, including failed experiments and raw experimental data. [25]
  • Open Knowledge. The Open Knowledge Foundation argues for Openness in a range of issues including, but not limited to, those of Open Data. It covers (a) data, whether scientific, historical, geographic or otherwise, (b) content such as music, films and books, and (c) government and other administrative information. Open Data is included within the scope of the Open Knowledge Definition, which is alluded to in Science Commons' Protocol for Implementing Open Access Data.[26]

Funders' mandates

Several funding bodies which mandate Open Access also mandate Open Data. A good expression of requirements (truncated in places) is given by the Canadian Institutes of Health Research (CIHR) [27]:

  • to deposit bioinformatics, atomic and molecular coordinate data, experimental data into the appropriate public database immediately upon publication of research results.
  • to retain original data sets for a minimum of five years after the grant. This applies to all data, whether published or not.

Note the fundamental requirement to be able to replicate the experiment.

Other bodies active in promoting the deposition of data as well as full text include the Wellcome Trust.

Closed Data

Several intentional or unintentional mechanisms exist for restricting access to or re-use of data. They include:

  • compilation in databases or websites to which only registered members or customers can have access.
  • use of a proprietary or closed technology or encryption which creates a barrier for access.
  • copyright forbidding (or obfuscating) re-use of the data.
  • license forbidding (or obfuscating) re-use of the data (such as share-alike or non-commercial).
  • patent forbidding re-use of the data (for example the 3-dimensional coordinates of some experimental protein structures have been patented)
  • restriction of robots to websites, with preference to certain search engines
  • aggregating factual data into "databases" which may be covered by "database rights" or "database directives" (e.g. Directive on the legal protection of databases)
  • time-limited access to resources such as e-journals (which on traditional print were available to the purchaser indefinitely)
  • webstacles, or the provision of single data points as opposed to tabular queries or bulk downloads of data sets.
  • political, commercial or legal pressure on the activity of organisations providing Open Data (for example the American Chemical Society lobbied the US Congress to limit funding to the National Institutes of Health for its Open PubChem data). [28]
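
The robot restriction listed above is commonly expressed in a site's robots.txt file. The following is a minimal sketch, not taken from any real site: the robots.txt content, the crawler names FavouredBot and ResearchHarvester, and the example.org URLs are all hypothetical. It uses Python's standard urllib.robotparser module to show how one named crawler can be given full access while every other robot is kept away from the data.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler may fetch everything,
# while every other robot is excluded from the /data/ directory.
ROBOTS_TXT = """\
User-agent: FavouredBot
Disallow:

User-agent: *
Disallow: /data/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical data set URL used only for illustration.
DATA_URL = "https://example.org/data/compounds.csv"

favoured_allowed = may_fetch(ROBOTS_TXT, "FavouredBot", DATA_URL)
other_allowed = may_fetch(ROBOTS_TXT, "ResearchHarvester", DATA_URL)

print("FavouredBot may fetch data:", favoured_allowed)
print("ResearchHarvester may fetch data:", other_allowed)
```

Under these assumptions the favoured crawler can harvest the data set while a research harvester cannot, even though both can read the rest of the site. A robots.txt file is advisory only, so it restricts well-behaved robots rather than enforcing access control.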

References


  1. ^ Science Commons
  2. ^ Jeffrey Veen
  3. ^ Keith G Jeffery on Peter Murray-Rust's blog
  4. ^ GCDIS
  5. ^ http://mercury.ornl.gov/safari2k/s2kpolicy.pdf
  6. ^ Jim Kent 2002
  7. ^ Open Data Consortium ca. 2003
  8. ^ Peter Murray-Rust, Henry Rzepa 2004
  9. ^ "Open Data" at CERN Workshop on Innovations in Scholarly Communication (OAI4) Peter Murray-Rust, 2005
  10. ^ Report on Science Commons Dec 2004
  11. ^ [1]
  12. ^ http://www.crystallography.net/
  13. ^ Dan Connolly, Semantic Web Data Integration with hCalendar and GRDDL, From Syntax to Semantics (XML 2005), Atlanta, GA, USA. http://www.w3.org/2002/12/cal/mash/slides#(1)
  14. ^ SPARC Open Data Mailing list
  15. ^ [2]
  16. ^ XTech 2005
  17. ^ Tim Bray and Tim O'Reilly
  18. ^ OECD Declaration on Open Access to publicly-funded data
  19. ^ OECD Principles and Guidelines for Access to Research Data from Public Funding
  20. ^ Science Commons in Washington 2006
  21. ^ SPARC-OAF forum
  22. ^ How to Make the Dream Come True argues in one research area (Astronomy) that access to open data increases the rate of scientific discovery.
  23. ^ Towards a Science Commons includes an overview of the basis of Openness in science data.
  24. ^ http://www.alpsp.org/ForceDownload.asp?id=129
  25. ^ http://drexel-coas-elearning.blogspot.com/2006/09/open-notebook-science.html creation of term
  26. ^ Protocol for Implementing Open Access Data
  27. ^ SPARC-OpenData@arl.org Mailing List Archive
  28. ^ Review of history and positions by the University of California

