SWAC Metatags
Recent technological developments have made the systematic recording of words and expressions and the creation of language audio collections possible. With specific tools, it is now possible to record 1000 words in less than an hour.
These audio collections can be used for:
- Linguistic research (record and compare the pronunciation in different regions)
- Didactics (e.g. «English Irregular Verbs»)
- Illustration (for electronic dictionaries)
The exchange of audio files has been made much easier by the Internet. Files can be copied and downloaded easily. However, since the recordings must be associated with other data (what is the word or expression? what language is it in?) in order to be indexed or used properly, it is useful to have a standard recording format that contains the associated data. In this way, audio collections can easily be produced by different software, on different platforms and by different people.
We suggest a simple and practical way to associate data with the audio recording. The aim is to show how the data can be defined rather than to define the data itself.
The Vorbis Comment metadata system allows you to store additional data in Ogg Vorbis, FLAC and Ogg Speex files. This solution is well suited to building audio word collections: it is an existing, free and widely supported technology. Audio file transfer is easy and, because the associated data is already in the audio file, in the form of metadata in a Vorbis Comment tag, there is no need for a separate description.
Below is a proposed list of standard field names with a description of their intended use. We recommend that a community producing and using audio word collections adopt the same standard field names, on the same principle as the Vorbis Comment field recommendations: for example, in a music collection, you do not have to complete a field that gives the name of the artist but, if you do, you must call the field “ARTIST” and not “BAND” or anything else.
None of these fields are intended to be mandatory, although we believe that no real automated processing can be done without the SWAC_TEXT and SWAC_LANG fields.
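As a rough illustration of what this implies for software, the sketch below assumes the Vorbis Comment strings have already been read from the audio file by some audio library, as a list of NAME=value strings. It collects the SWAC fields and reports which of the two essential fields are missing. The function names parse_comments and missing_essential_fields are hypothetical and not part of this proposal.

```python
ESSENTIAL_FIELDS = ("SWAC_TEXT", "SWAC_LANG")


def parse_comments(comments):
    """Turn a list of 'NAME=value' strings into a dict of SWAC fields.

    Vorbis Comment field names are case-insensitive, so they are
    normalised to upper case. Non-SWAC fields are ignored. If a field
    is repeated, this sketch simply keeps the last value.
    """
    fields = {}
    for comment in comments:
        name, sep, value = comment.partition("=")
        if not sep:
            continue  # not a NAME=value pair
        name = name.upper()
        if name.startswith("SWAC_"):
            fields[name] = value
    return fields


def missing_essential_fields(fields):
    """Return the essential fields that are absent or empty."""
    return [name for name in ESSENTIAL_FIELDS if not fields.get(name)]


comments = ["SWAC_TEXT=house", "swac_lang=eng", "GENRE=Speech"]
fields = parse_comments(comments)
print(fields)
print(missing_essential_fields(fields))
```

A program could use such a check to skip, or flag for manual review, any recording that cannot be indexed automatically.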
FIELDS
1. Data for the pronounced text
- SWAC_TEXT
-
Text pronounced by the speaker
- « house »
- « it's raining cats and dogs ! »
- SWAC_LANG
-
The language of the word pronounced (ISO 639-3)
record              value
« rendezvous »      eng
« rendez-vous »     fra
« crocodile »       eng
« crocodile »       fra
- SWAC_ALPHAIDX
-
Items that allow programs to generate the alphabetical index of the audio collection automatically. The separator is «|» (U+007C).
record                                    value
« house » (eng)                           house
« It's raining cats and dogs! » (eng)     rain|cat|dog
« I am » (eng)                            be
« 啊 » (chi)                              ā
« se laver » (fra)                        laver (se)
« j'ai faim » (fra)                       avoir|faim
« ett fönster » (swe)                     fönster
« telefonul » (ron)                       telefon
- SWAC_BASEFORM
-
When the record is a derived form of a word, this field indicates the base word.
record               value
« I was » (eng)      to be
« je vais » (fra)    aller
« друзей » (rus)     друг
- SWAC_FORM_NAME
-
When the SWAC_BASEFORM is defined, this field indicates the name of the form.
record               value
« je vais » (fra)    Present. 1p.S.
« друзей » (rus)     Gen. Pl.
- SWAC_FORM_REF
-
Name of the reference scheme used by the SWAC_FORM_NAME field (such as the LMF codification)
- SWAC_HOMOGRAPHIDX
-
Index that helps the user to differentiate homographs in the audio collection.
The SWAC_HOMOGRAPHIDX is based on the grammatical difference between homographs.
record                  value
« пропа́сть » (rus)      verb
« про́пасть » (rus)      noun
« os » (fra) /os/       sing
« os » (fra) /o/        plur
record                  value
« мука́ » (rus)          flour
« му́ка » (rus)          pain
« bass » (eng)          fish
« bass » (eng)          music
- SWAC_HOMOGRAPHIDX_REF
-
Name of the reference scheme used by the SWAC_HOMOGRAPHIDX field.
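The SWAC_ALPHAIDX field described above lends itself to a simple indexing routine. The following sketch assumes each recording is represented as a dict of SWAC fields. The fallback to SWAC_TEXT when SWAC_ALPHAIDX is absent is an assumption of this example, as is the function name build_alpha_index.

```python
from collections import defaultdict


def build_alpha_index(records):
    """Map each index key to the list of texts filed under it.

    Each record is a dict of SWAC fields. The SWAC_ALPHAIDX value is
    split on the «|» separator. When the field is absent, this sketch
    falls back to SWAC_TEXT (an assumption, not part of the proposal).
    """
    index = defaultdict(list)
    for record in records:
        text = record.get("SWAC_TEXT", "")
        keys = record.get("SWAC_ALPHAIDX") or text
        for key in keys.split("|"):
            key = key.strip()
            if key:
                index[key].append(text)
    # Plain code-point ordering; a real index would need
    # language-aware collation.
    return dict(sorted(index.items()))


records = [
    {"SWAC_TEXT": "It's raining cats and dogs!", "SWAC_ALPHAIDX": "rain|cat|dog"},
    {"SWAC_TEXT": "house", "SWAC_ALPHAIDX": "house"},
    {"SWAC_TEXT": "I am", "SWAC_ALPHAIDX": "be"},
]
for key, texts in build_alpha_index(records).items():
    print(key, "->", texts)
```

With these three records, the expression « It's raining cats and dogs! » is filed under three separate keys, which is the purpose of the separator.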
2. Speaker data
- SWAC_SPEAK_NAME
-
Speaker's name
- « Jacques Durand »
- « Иван Иванович Иванов »
- SWAC_SPEAK_GENDER
-
Speaker's gender [M/F]
- M: male
- F: female
- SWAC_SPEAK_BIRTH_YEAR
-
Speaker's year of birth
(Format: YYYY)
- SWAC_SPEAK_LANG
-
Speaker's native language
(ISO 639-3)
- SWAC_SPEAK_LANG_COUNTRY
-
Country where the speaker acquired the SWAC_SPEAK_LANG language
- SWAC_SPEAK_LANG_REGION
-
Region where the speaker acquired the SWAC_SPEAK_LANG language
- « Pays basque »
- SWAC_SPEAK_LANG_LOC
-
Location of the SWAC_SPEAK_LANG_REGION
(format: WGS 84 DM, i.e. degrees and decimal minutes)
- N 48°52.233 E 2°24.232
- SWAC_SPEAK_PRON
- General note about the pronunciation of the speaker (for example, a speech defect)
- SWAC_SPEAK_LIV_COUNTRY
-
Code of the country where the speaker lives
(ISO 3166-1)
- SWAC_SPEAK_LIV_TOWN
-
Town where the speaker lives
- « Saint-Jean-Pied-de-Port »
- SWAC_SPEAK_CONTACT
-
Contact data for the speaker
- « jacques-durand@shtooka.net »
- SWAC_SPEAK_DESC
- Free note about the speaker
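Programs that want to place speakers on a map need the SWAC_SPEAK_LANG_LOC value as decimal degrees. The sketch below converts a string in the form of the example above (hemisphere letter, degrees, decimal minutes). The exact pattern accepted, and the function name parse_speak_lang_loc, are assumptions made for illustration. The proposal itself only gives the one example.

```python
import re

# Matches values such as "N 48°52.233 E 2°24.232"
# (hemisphere letter, whole degrees, decimal minutes).
_DM_PATTERN = re.compile(
    r"^\s*([NS])\s*([0-9]{1,2})°\s*([0-9]{1,2}(?:\.[0-9]+)?)\s+"
    r"([EW])\s*([0-9]{1,3})°\s*([0-9]{1,2}(?:\.[0-9]+)?)\s*$"
)


def parse_speak_lang_loc(value):
    """Convert a location string to (latitude, longitude) in decimal degrees."""
    match = _DM_PATTERN.match(value)
    if match is None:
        raise ValueError(f"unrecognised location format: {value!r}")
    ns, lat_deg, lat_min, ew, lon_deg, lon_min = match.groups()
    latitude = int(lat_deg) + float(lat_min) / 60
    longitude = int(lon_deg) + float(lon_min) / 60
    # South and west are negative in decimal-degree notation.
    if ns == "S":
        latitude = -latitude
    if ew == "W":
        longitude = -longitude
    return latitude, longitude


print(parse_speak_lang_loc("N 48°52.233 E 2°24.232"))
```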
3. Word pronunciation data
- SWAC_PRON_INTONATION
-
Note about the intonation
record      value
« oh »      Surprise
« oh »      Realization
- SWAC_PRON_SPEED
-
[1/2/3]
- 1: slow pronunciation for pedagogical use
- 2: normal pronunciation
- 3: fast
- SWAC_PRON_COMMENT
-
Comments on the pronunciation of the word by the speaker
record                                   value
« abasourdir » (fra) /a.ba.zuʁ.diʁ/      Academic pronunciation
« abasourdir » (fra) /a.ba.suʁ.diʁ/      Popular pronunciation
« догово́р » (rus)                        Standard pronunciation
« до́говор » (rus)                        Popular pronunciation in the south of Russia
- SWAC_PRON_API
- Phonetic transcription (using the International Phonetic Alphabet, IPA)
- SWAC_PRON_PHON
-
Phonetic transcription in the notation system specific to the language concerned
record             value
« мука » (rus)     мука́ (with the diacritic symbol)
« 啊 » (chi)       ā (the pinyin transcription)
4. Audio collection data
- SWAC_COLL_NAME
- Name of the audio collection
- « Base Audio Libre De Mots Français »
- SWAC_COLL_SECTION
- Section in the audio collection
- SWAC_COLL_DESC
- Description of the collection
- SWAC_COLL_ORG
- Organization producing the audio collection
- SWAC_COLL_ORG_URL
- URL for data on the organization producing the audio collection
- SWAC_COLL_LICENSE
- License which applies to the collection
- SWAC_COLL_COPYRIGHT
- Audio collection copyrights
- SWAC_COLL_AUTHORS
- Audio collection authors
- SWAC_COLL_URL
- URL for general information on the collection
5. Technical data
- SWAC_TECH_QLT
-
Audio Quality [1/2/3/4/5]
- 1: very poor
- 2: poor
- 3: normal
- 4: good
- 5: very good
- SWAC_TECH_DATE
-
Date of recording
(Format: YYYY-MM-DD)
- SWAC_TECH_SOFT
- The program used to record the sound
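Several of the fields listed above have a constrained format: SWAC_SPEAK_GENDER [M/F], SWAC_SPEAK_BIRTH_YEAR (YYYY), SWAC_PRON_SPEED [1/2/3], SWAC_TECH_QLT [1/2/3/4/5] and SWAC_TECH_DATE (YYYY-MM-DD). The following sketch shows one way a program might check them before indexing a file. It assumes the fields are held in a dict of strings. The names CHECKS and invalid_fields are hypothetical.

```python
import re
from datetime import datetime


def _is_iso_date(value):
    """True for a real calendar date written as YYYY-MM-DD."""
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value) is None:
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False  # e.g. 2008-02-30
    return True


# One check per constrained field; unconstrained fields are not listed.
CHECKS = {
    "SWAC_SPEAK_GENDER": lambda v: v in ("M", "F"),
    "SWAC_SPEAK_BIRTH_YEAR": lambda v: re.fullmatch(r"[0-9]{4}", v) is not None,
    "SWAC_PRON_SPEED": lambda v: v in ("1", "2", "3"),
    "SWAC_TECH_QLT": lambda v: v in ("1", "2", "3", "4", "5"),
    "SWAC_TECH_DATE": _is_iso_date,
}


def invalid_fields(fields):
    """Return the names of present fields whose value breaks its format."""
    return sorted(
        name
        for name, check in CHECKS.items()
        if name in fields and not check(fields[name])
    )


fields = {
    "SWAC_SPEAK_GENDER": "M",
    "SWAC_PRON_SPEED": "2",
    "SWAC_TECH_QLT": "6",
    "SWAC_TECH_DATE": "2008-02-30",
}
print(invalid_fields(fields))
```

Since no field is mandatory, absent fields are not reported. Only values that are present and malformed are returned.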
Note about the Vorbis Comment specifications:
Please consult the Vorbis Comment home page at: http://xiph.org/vorbis/doc/v-comment.html for more information on general comment tag specifications.
The content of tags such as TITLE, DESCRIPTION, LICENSE and COPYRIGHT can be set to any value.
These fields can be completed automatically using data provided by SWAC fields, although it is recommended that the GENRE field be set to « Speech ».
- GENRE
- « Speech »
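One possible way to complete the generic tags automatically is sketched below. The mapping from generic tags to SWAC fields is purely an assumption of this example, since the proposal does not say which SWAC field should feed which tag. The function name with_generic_tags is likewise hypothetical.

```python
# Hypothetical mapping: the proposal only says that these tags
# can be completed from SWAC data, not which field feeds which tag.
GENERIC_FROM_SWAC = {
    "TITLE": "SWAC_TEXT",
    "DESCRIPTION": "SWAC_COLL_DESC",
    "LICENSE": "SWAC_COLL_LICENSE",
    "COPYRIGHT": "SWAC_COLL_COPYRIGHT",
}


def with_generic_tags(fields):
    """Return a copy of `fields` with generic Vorbis Comment tags filled in.

    Generic tags that are already set are left untouched, and GENRE
    defaults to "Speech" as recommended.
    """
    tags = dict(fields)
    for generic, swac in GENERIC_FROM_SWAC.items():
        if generic not in tags and fields.get(swac):
            tags[generic] = fields[swac]
    tags.setdefault("GENRE", "Speech")
    return tags


tags = with_generic_tags({"SWAC_TEXT": "house", "SWAC_LANG": "eng"})
print(tags)
```

Leaving existing values in place reflects the statement that these tags can be set to any value: the SWAC data only supplies a default.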
According to the general Vorbis Comment specifications, the use of additional fields is allowed. This enables SWAC fields to coexist with other specific data. For example, electronic dictionaries can use specific fields such as « OMEGAWIKI_ARTICLEIDX » to link audio items to their articles.
Note about the ID3v2 Tagging Format:
Since version 2.4 of the ID3 Tagging Format, it is possible to store Unicode character strings in MP3 audio files. We do not recommend the use of this tagging format, but SWAC fields can be stored as « TXXX » frames.
Please consult the ID3 Tagging Format home page at http://www.id3.org/ for more information.
Note about this document:
This document is distributed and licensed by the Shtooka Project under the Creative Commons BY-SA License. More information about the license at: http://creativecommons.org/licenses/by/2.0/fr/deed.en_GB