Glossary

Access Condition

Data licenses can specify who may be an end-user of the data. A well-governed archival repository has mechanisms in place to administer and implement such conditions.

Application Programming Interface (API)

A way for computer programs to communicate with each other: one program or system asks another to do something, such as provide a dataset.

Archival Repository

A location for the storage of data that has an appropriate governance regime in place.

BinderHub

A Kubernetes-based cloud service that allows users to share reproducible interactive computing environments from code repositories. It is the primary technology behind mybinder.org.

ATAP notebooks are made available using a Binder instance maintained by AARNet/Nectar.

CARE Principles

Four principles developed by the Global Indigenous Data Alliance (GIDA) to ensure that Indigenous communities have control over the application and use of Indigenous data and Indigenous Knowledge for collective benefit.

The principles specify four aspects of the respectful use of data:

  • Collective Benefit
  • Authority to Control
  • Responsibility
  • Ethics
Collection

A group of related Objects. Examples of collections include corpora and sub-corpora, as well as aggregations of cultural objects such as PARADISEC collections, which bring together items collected in a region or a session with consultants.

Confidentiality

The obligation to protect identity and privacy as recognised under Australian law in the Privacy Act 1988.

Copyright

The legal right of the owner of intellectual property. In simpler terms, copyright is the right to copy: the original creators of a work, and anyone they authorise, are the only ones with the right to reproduce it.

Crate-O

A tool that allows you to create and update RO-Crates. It provides researchers with a relatively simple way to describe their data using best practices in formal metadata description.

Data Commons

Cloud-based infrastructure coupled with governance strategies and principles that allow a community to use, share, manage and analyse its data.

LDaCA is a language data commons serving researchers and community groups that are interested in language data.

Data Governance

The policies and processes by which data is managed through its life cycle to ensure the quality, reliability, security, and sustainability of the data.

Data License

A legal arrangement between the creator of the data and the end-user specifying what users can do with the data.

Data Packaging

The application of widely used standards (for example, for formats, metadata, and access conditions) to the data in a collection.

Data Steward

An individual or organisation with the authority to make decisions regarding the Collection.

Describo

A tool that allows you to create and update RO-Crates. It provides researchers with a relatively simple way to describe their data using the best practices in formal metadata description.

Superseded for project purposes by Crate-O.

Elpis

A tool to obtain a first-pass transcription of untranscribed audio. It brings cutting-edge speech recognition technology within reach of language workers and researchers who don’t have backgrounds in speech engineering.

FAIR Principles

Four key principles developed in 2016 with the aim of supporting the discovery and reuse of research data.

The principles encourage us to make data:

  • Findable
  • Accessible
  • Interoperable
  • Reusable
GLAM Workbench

A suite of Jupyter notebooks developed by Tim Sherratt to help with exploring and using data from GLAM institutions. The notebooks primarily use data from Trove newspaper and magazine collections, but some extend beyond this.

Intellectual Property

Creative works protected by law via patents, copyright and trademarks.

Jupyter Notebook

An interactive computational environment in which you can combine code execution, rich text, mathematics, plots and rich media.

Metadata

The information that defines and describes data. It provides data users with information about the purpose, processes, and methods involved in the data collection. (Source: Australian Bureau of Statistics, Metadata.)

Object

A single resource or a group of tightly related resources that record a communicative event; for example, a dialogue or session in a speech study, or a work (document) in a written corpus.

Persistent Identifier (PID)

A digital identifier that is permanently assigned and provides a long-lasting reference to an object or entity, for example a Digital Object Identifier (DOI).

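As a rough illustration of how a PID serves as a lasting reference, the sketch below turns a DOI into a resolvable link by prefixing it with the doi.org resolver address. The helper name `doi_to_url` and the DOI string used are hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

# Public DOI resolver; a DOI appended to this address redirects to the object.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI (optionally written with a 'doi:' or resolver prefix) into a URL."""
    doi = doi.strip()
    # Strip common prefixes so that only the bare identifier remains.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Every DOI begins with the directory indicator "10.".
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


# Hypothetical identifier, used only to show the shape of the result.
example_url = doi_to_url("doi:10.1234/example-corpus")
print(example_url)
```

The identifier itself stays the same even if the object moves; only the location the resolver redirects to needs updating.
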
RO-Crate

Research Object Crate. A way of packaging research data that stores the data together with its associated metadata and other component files, such as the data license.

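To make the packaging idea concrete, here is a minimal sketch of the metadata file that sits alongside the data in an RO-Crate, following the general shape of the RO-Crate 1.1 specification (a JSON-LD file named `ro-crate-metadata.json`). The helper `build_ro_crate_metadata`, the collection name, and the file paths are hypothetical; in practice a tool such as Crate-O would normally produce this file.

```python
import json


def build_ro_crate_metadata(name, description, license_url, date_published, files):
    """Return a minimal RO-Crate 1.1 style metadata document as a dict."""
    # Descriptor entity: identifies this file as the crate's metadata
    # and points at the root dataset it describes.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # Root dataset: the collection itself, with its license and its parts.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_url},
        "hasPart": [{"@id": path} for path in files],
    }
    # One entity per data file packaged in the crate.
    file_entities = [{"@id": path, "@type": "File"} for path in files]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, *file_entities],
    }


# All values below are illustrative placeholders, not a real collection.
metadata = build_ro_crate_metadata(
    name="Example speech collection",
    description="Recordings and transcripts from a hypothetical study.",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    date_published="2024-01-01",
    files=["recordings/session01.wav", "transcripts/session01.txt"],
)
print(json.dumps(metadata, indent=2))
```

Because the license and descriptive metadata travel in the same package as the data files, anyone who receives the crate also receives the information needed to interpret and reuse it.
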
Sensitive Data

Data that, as a result of research, contains confidential or other ‘sensitive information’, which is defined in the Privacy Act 1988 as information or opinion about an individual’s:

  • racial or ethnic origin
  • political opinions
  • membership of a political association
  • religious beliefs or affiliations
  • philosophical beliefs
  • membership of a professional or trade association
  • membership of a trade union
  • sexual preferences or practices
  • criminal record
  • health information
  • genetic information
  • culturally sensitive data or data deemed sensitive by the data provider

Tools

Code or software developed in order to support or enhance language data accessibility and use.