SWAC Metatags

Recent technological developments have made it possible to record words and expressions systematically and to build language audio collections. With dedicated tools, 1000 words can now be recorded in less than an hour.

These audio collections can be used for:

  • Linguistic research (record and compare the pronunciation in different regions)
  • Didactics (e.g. «English Irregular Verbs»)
  • Illustration (for electronic dictionaries)

The Internet has made exchanging audio files much easier: files can be copied and downloaded with little effort. However, a recording can only be indexed or used properly if it is associated with other data (what is the word or expression? what language is it in?). It is therefore useful to have a standard recording format that carries the associated data. In this way, audio collections can easily be produced by different people, using different software on different platforms.

We suggest a simple and practical way to associate data with an audio recording. The aim is to show how such data can be defined rather than to define the data itself.

The Vorbis Comment metadata system allows you to store additional data in Ogg Vorbis, FLAC and Ogg Speex files. It is well suited to building audio word collections: it is an existing, free and widely supported technology. Audio file transfer is easy and, because the associated data is already in the audio file, in the form of metadata in a Vorbis Comment tag, there is no need for an additional description file.
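To make the storage model concrete, here is a minimal sketch of how a list of fields is laid out in a Vorbis Comment block: a length-prefixed vendor string, a field count, then one length-prefixed « NAME=value » string per field, with lengths as 32-bit little-endian integers. The function names and the vendor string « swac-example » are illustrative assumptions, and the sketch leaves out the codec-specific packet header and framing bit, so it does not produce a complete audio file.

```python
import struct


def pack_vorbis_comments(fields, vendor="swac-example"):
    """Serialize (name, value) pairs as a raw Vorbis Comment block."""

    def length_prefixed(data):
        # Each string is preceded by its byte length (32-bit little-endian).
        return struct.pack("<I", len(data)) + data

    out = length_prefixed(vendor.encode("utf-8"))
    out += struct.pack("<I", len(fields))
    for name, value in fields:
        # Field names are ASCII, values are UTF-8.
        out += length_prefixed(name.encode("ascii") + b"=" + value.encode("utf-8"))
    return out


def unpack_vorbis_comments(block):
    """Parse a raw Vorbis Comment block back into (vendor, fields)."""
    pos = 0

    def read_string():
        nonlocal pos
        (size,) = struct.unpack_from("<I", block, pos)
        pos += 4
        data = block[pos:pos + size]
        pos += size
        return data

    vendor = read_string().decode("utf-8")
    (count,) = struct.unpack_from("<I", block, pos)
    pos += 4
    fields = []
    for _ in range(count):
        name, _sep, value = read_string().decode("utf-8").partition("=")
        # Field names are case-insensitive; upper case is used here by convention.
        fields.append((name.upper(), value))
    return vendor, fields


example_fields = [("SWAC_TEXT", "house"), ("SWAC_LANG", "eng")]
example_block = pack_vorbis_comments(example_fields)
```

In practice an existing tagging tool or library would write the block into the audio file; the sketch only shows that the associated data travels as plain name/value strings.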

Below is a proposed list of standard field names with a description of their intended use. We recommend that a community producing and using audio word collections adopt the same standard field names, on the same principle as the Vorbis Comment field recommendations: for example, in a music collection, you do not have to complete a field that gives the name of the artist but, if you do, you must call the field “ARTIST” and not “BAND” or anything else.

None of these fields are intended to be mandatory, although we believe that no real automated processing can be done without the SWAC_TEXT and SWAC_LANG fields.
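A processing tool might therefore start by checking that these two fields are present. The following is a minimal sketch, assuming the tags of one recording have already been read into a dictionary; the helper name is hypothetical.

```python
REQUIRED_FOR_PROCESSING = ("SWAC_TEXT", "SWAC_LANG")


def missing_core_fields(tags):
    """Return the core SWAC fields that are absent or empty in a tag dict."""
    # Vorbis Comment field names are case-insensitive, so normalise them first.
    normalised = {name.upper(): value for name, value in tags.items()}
    return [
        field
        for field in REQUIRED_FOR_PROCESSING
        if not normalised.get(field, "").strip()
    ]
```

A recording for which the list is not empty can still be kept in the collection, since no field is mandatory; it simply cannot be indexed automatically.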

FIELDS

1. Data for the pronounced text

SWAC_TEXT
Text pronounced by the speaker
  • « house »
  • « it's raining cats and dogs ! »
SWAC_LANG
The language of the word pronounced (ISO 639-3)
record             value
« rendezvous »     eng
« rendez-vous »    fra
« crocodile »      eng
« crocodile »      fra
SWAC_ALPHAIDX
Items that allow programs to generate the alphabetical index of the audio collection automatically. The separator is «|» (U+007C)
record                                   value
« house » (eng)                          house
« It's raining cats and dogs! » (eng)    rain|cat|dog
« I am » (eng)                           be
« 啊 » (chi)                              ā
« se laver » (fra)                       laver (se)
« j'ai faim » (fra)                      avoir|faim
« ett fönster » (swe)                    fönster
« telefonul » (ron)                      telefon
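As an illustration of how a program could use SWAC_ALPHAIDX, the sketch below builds an index from a list of tag dictionaries. Falling back to SWAC_TEXT when SWAC_ALPHAIDX is missing is an assumption made for this example, and the sort is a plain code-point sort, not a language-aware collation.

```python
from collections import defaultdict


def build_alpha_index(records):
    """Map each index entry to the texts of the recordings filed under it."""
    index = defaultdict(list)
    for record in records:
        # Assumption: without SWAC_ALPHAIDX, the pronounced text is the entry.
        entries = record.get("SWAC_ALPHAIDX") or record["SWAC_TEXT"]
        for entry in entries.split("|"):  # «|» (U+007C) separates the items
            entry = entry.strip()
            if entry:
                index[entry].append(record["SWAC_TEXT"])
    # Dictionaries keep insertion order, so sorting the items sorts the index.
    return dict(sorted(index.items()))


sample_records = [
    {"SWAC_TEXT": "It's raining cats and dogs!", "SWAC_ALPHAIDX": "rain|cat|dog"},
    {"SWAC_TEXT": "house", "SWAC_ALPHAIDX": "house"},
    {"SWAC_TEXT": "I am", "SWAC_ALPHAIDX": "be"},
]
sample_index = build_alpha_index(sample_records)
```

With these three records, the expression about rain is filed under three separate entries, which is the purpose of the separator.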
SWAC_BASEFORM
When the record is a derivative form of a word, this field indicates the base word
record              value
« I was » (eng)     to be
« je vais » (fra)   aller
« друзей » (rus)    друг
SWAC_FORM_NAME
When the SWAC_BASEFORM is defined, this field indicates the name of the form
record              value
« je vais » (fra)   Present. 1p.S.
« друзей » (rus)    Gen. Pl.
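One possible use of SWAC_BASEFORM together with SWAC_FORM_NAME is to gather all recorded forms of a word under its base form. The sketch below assumes tag dictionaries as input; the function name and the « (unnamed form) » placeholder are hypothetical.

```python
from collections import defaultdict


def group_by_baseform(records):
    """Group derived forms as {(language, base form): {form name: text}}."""
    groups = defaultdict(dict)
    for record in records:
        base = record.get("SWAC_BASEFORM")
        if not base:
            continue  # the record is not marked as a derived form
        form_name = record.get("SWAC_FORM_NAME", "(unnamed form)")
        groups[(record.get("SWAC_LANG"), base)][form_name] = record["SWAC_TEXT"]
    return dict(groups)


sample_forms = group_by_baseform([
    {"SWAC_TEXT": "je vais", "SWAC_LANG": "fra",
     "SWAC_BASEFORM": "aller", "SWAC_FORM_NAME": "Present. 1p.S."},
    {"SWAC_TEXT": "друзей", "SWAC_LANG": "rus",
     "SWAC_BASEFORM": "друг", "SWAC_FORM_NAME": "Gen. Pl."},
    {"SWAC_TEXT": "house", "SWAC_LANG": "eng"},
])
```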
SWAC_FORM_REF
Name of the reference scheme used by the SWAC_FORM_NAME field (such as the LMF codification)
SWAC_HOMOGRAPHIDX
Index which can help the user to differentiate homographs in the audio collection. The SWAC_HOMOGRAPHIDX is based on the grammatical difference between homographs.
record                 value
« пропа́сть » (rus)     verb
« про́пасть » (rus)     noun
« os » (fra) /os/      sing
« os » (fra) /o/       plur
The value can also be a translation into another language (typically English) or a short explanation when the difference is not grammatical.
record              value
« мука́ » (rus)      flour
« му́ка » (rus)      pain
« bass » (eng)      fish
« bass » (eng)      music
SWAC_HOMOGRAPHIDX_REF
Name of the reference scheme used by the SWAC_HOMOGRAPHIDX field.

2. Speaker data

SWAC_SPEAK_NAME
Speaker's name
  • « Jacques Durand »
  • « Иван Иванович Иванов »
SWAC_SPEAK_GENDER
Speaker's gender [M/F]
  • M: male
  • F: female
SWAC_SPEAK_BIRTH_YEAR
Speaker's year of birth (format: YYYY)
SWAC_SPEAK_LANG
Speaker's native language (ISO 639-3)
SWAC_SPEAK_LANG_COUNTRY
Country where the speaker acquired the SWAC_SPEAK_LANG (ISO 3166-1)
SWAC_SPEAK_LANG_REGION
Region where the speaker acquired the SWAC_SPEAK_LANG
  • « Pays basque »
SWAC_SPEAK_LANG_LOC
Location of the SWAC_SPEAK_LANG_REGION (format: WGS 84 DM, i.e. degrees and decimal minutes)
  • N 48°52.233 E 2°24.232
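Programs that plot speakers on a map usually need decimal degrees. The following sketch converts a value in the degrees and decimal minutes form shown above; it assumes latitude comes first and that the hemisphere letter precedes each number, as in the example, and the function name is hypothetical.

```python
import re

# Hemisphere letter, whole degrees, then decimal minutes.
_DM_PART = re.compile(r"([NSEW])\s*(\d+)°\s*(\d+(?:\.\d+)?)")


def parse_wgs84_dm(text):
    """Convert 'N 48°52.233 E 2°24.232' to (latitude, longitude) in degrees."""
    parts = _DM_PART.findall(text)
    if len(parts) != 2 or parts[0][0] not in "NS" or parts[1][0] not in "EW":
        raise ValueError(f"not a WGS 84 DM coordinate pair: {text!r}")

    def to_decimal(hemisphere, degrees, minutes):
        value = int(degrees) + float(minutes) / 60.0
        # South and west are negative in decimal notation.
        return -value if hemisphere in "SW" else value

    return to_decimal(*parts[0]), to_decimal(*parts[1])


latitude, longitude = parse_wgs84_dm("N 48°52.233 E 2°24.232")
```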
SWAC_SPEAK_PRON
General note about the pronunciation of the speaker (for example, a speech defect)
SWAC_SPEAK_LIV_COUNTRY
Code of the country where the speaker currently lives (ISO 3166-1)
SWAC_SPEAK_LIV_TOWN
Town where the speaker currently lives
  • « Saint-Jean-Pied-de-Port »
SWAC_SPEAK_CONTACT
Contact data for the speaker
  • « jacques-durand@shtooka.net »
SWAC_SPEAK_DESC
Free note about the speaker

3. Word pronunciation data

SWAC_PRON_INTONATION
Note about the intonation
record     value
« oh »     Surprise
« oh »     Realization
SWAC_PRON_SPEED
Pronunciation speed [1/2/3]
  • 1: slow pronunciation for pedagogical use
  • 2: normal pronunciation
  • 3: fast
SWAC_PRON_COMMENT
Comments on the pronunciation of the word by the speaker
record                                   value
« abasourdir » (fra) /a.ba.zuʁ.diʁ/      Academic pronunciation
« abasourdir » (fra) /a.ba.suʁ.diʁ/      Popular pronunciation
« догово́р » (rus)                        Standard pronunciation
« до́говор » (rus)                        Popular pronunciation in the south of Russia
SWAC_PRON_API
Phonetic transcription using the International Phonetic Alphabet (IPA; « API » in French)
SWAC_PRON_PHON
Specific phonetic transcription in the language system concerned
record             value
« мука » (rus)     мука́ (with the diacritic symbol)
« 啊 » (chi)        ā (the pinyin transcription)

4. Audio collection data

SWAC_COLL_NAME
Name of the audio collection
  • « Base Audio Libre De Mots Français »
SWAC_COLL_SECTION
Section in the audio collection
SWAC_COLL_DESC
Description of the collection
SWAC_COLL_ORG
Organization producing the audio collection
SWAC_COLL_ORG_URL
URL for data on the organization producing the audio collection
SWAC_COLL_LICENSE
License which applies to the collection
SWAC_COLL_COPYRIGHT
Audio collection copyrights
SWAC_COLL_AUTHORS
Audio collection authors
SWAC_COLL_URL
URL for general information on the collection

5. Technical data

SWAC_TECH_QLT
Audio Quality [1/2/3/4/5]
  • 1: very poor
  • 2: poor
  • 3: normal
  • 4: good
  • 5: very good
SWAC_TECH_DATE
Date of recording (format: YYYY-MM-DD)
SWAC_TECH_SOFT
The program used to record the sound
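Several of the fields above have a closed set of values or a fixed format, which a tool can verify before publishing a collection. The sketch below is one way to do so, assuming the tags are available as a dictionary of strings; the function name and the choice to return the offending field names are illustrative.

```python
import re
from datetime import date

# Fields whose value must be one of a fixed list.
ALLOWED_VALUES = {
    "SWAC_SPEAK_GENDER": {"M", "F"},
    "SWAC_PRON_SPEED": {"1", "2", "3"},
    "SWAC_TECH_QLT": {"1", "2", "3", "4", "5"},
}


def check_field_formats(tags):
    """Return the names of the fields whose value does not fit its format."""
    problems = []
    for name, allowed in ALLOWED_VALUES.items():
        value = tags.get(name)
        if value is not None and value not in allowed:
            problems.append(name)

    year = tags.get("SWAC_SPEAK_BIRTH_YEAR")
    if year is not None and not re.fullmatch(r"[0-9]{4}", year):
        problems.append("SWAC_SPEAK_BIRTH_YEAR")

    recorded = tags.get("SWAC_TECH_DATE")
    if recorded is not None:
        match = re.fullmatch(r"([0-9]{4})-([0-9]{2})-([0-9]{2})", recorded)
        try:
            if match is None:
                raise ValueError("expected YYYY-MM-DD")
            # date() rejects impossible days such as 30 February.
            date(*map(int, match.groups()))
        except ValueError:
            problems.append("SWAC_TECH_DATE")
    return problems
```

Fields that are absent are not reported, in keeping with the rule that no field is mandatory.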

Note about the Vorbis Comment specifications:

Please consult the Vorbis Comment home page at: http://xiph.org/vorbis/doc/v-comment.html for more information on general comment tag specifications.

The content of tags such as TITLE, DESCRIPTION, LICENSE and COPYRIGHT can be set to any value. These fields can be completed automatically using data provided by SWAC fields. However, we recommend that the GENRE field be set to « Speech ».

GENRE
« Speech »
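As an example of completing the general tags automatically, the sketch below derives them from SWAC fields. The particular mapping (TITLE from SWAC_TEXT, DESCRIPTION from SWAC_COLL_DESC, and so on) is an assumption made for illustration; this document does not prescribe one.

```python
# Hypothetical mapping from general Vorbis Comment tags to SWAC fields.
GENERIC_FROM_SWAC = {
    "TITLE": "SWAC_TEXT",
    "DESCRIPTION": "SWAC_COLL_DESC",
    "LICENSE": "SWAC_COLL_LICENSE",
    "COPYRIGHT": "SWAC_COLL_COPYRIGHT",
}


def derive_generic_tags(swac_tags):
    """Build general tags from SWAC fields, always setting GENRE to Speech."""
    generic = {"GENRE": "Speech"}
    for generic_name, swac_name in GENERIC_FROM_SWAC.items():
        value = swac_tags.get(swac_name)
        if value:  # skip SWAC fields that are absent or empty
            generic[generic_name] = value
    return generic
```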

The general Vorbis Comment specifications allow the use of additional fields. This enables SWAC fields to coexist with other specific data. For example, electronic dictionaries can use specific fields such as « OMEGAWIKI_ARTICLEIDX » to link audio items to their articles.

Note about the ID3v2 Tagging Format:

Since version 2.4 of the ID3 Tagging Format, it has been possible to store Unicode character strings encoded as UTF-8 in MP3 audio files. We do not recommend the use of this tagging format, but SWAC fields can be stored as « TXXX » frames.

Please consult the ID3 Tagging Format home page at http://www.id3.org/ for more information.
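For readers who nevertheless need MP3 files, the sketch below shows what a single « TXXX » frame could look like in ID3v2.4, with the SWAC field name placed in the frame's description and the text encoded as UTF-8. Using the description for the field name is an assumption, since this document does not specify it, and the sketch builds one frame only, not a complete ID3 tag.

```python
def synchsafe(number):
    """Encode a size as a 4-byte ID3v2 synchsafe integer (7 bits per byte)."""
    if not 0 <= number < 2 ** 28:
        raise ValueError("size does not fit in a synchsafe integer")
    return bytes((number >> shift) & 0x7F for shift in (21, 14, 7, 0))


def txxx_frame(field_name, value):
    """Build one ID3v2.4 TXXX (user-defined text) frame."""
    # 0x03 selects UTF-8; the description ends with a null byte.
    body = b"\x03" + field_name.encode("utf-8") + b"\x00" + value.encode("utf-8")
    # Frame header: 4-byte identifier, synchsafe size, 2 flag bytes left clear.
    return b"TXXX" + synchsafe(len(body)) + b"\x00\x00" + body


lang_frame = txxx_frame("SWAC_LANG", "fra")
```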

Note about this document:

This document is distributed and licensed by the Shtooka Project under the Creative Commons BY-SA License. More information about the license at: http://creativecommons.org/licenses/by/2.0/fr/deed.en_GB