Pronunciation Lexicon Specification

From Wikipedia

The Pronunciation Lexicon Specification (PLS) is a W3C Recommendation designed to enable interoperable specification of pronunciation information for both speech recognition and speech synthesis engines within voice browsing applications. The language is intended to be easy for developers to use while supporting the accurate specification of pronunciation information for international use.

The language allows one or more pronunciations for a word or phrase to be specified using a standard pronunciation alphabet or, if necessary, vendor-specific alphabets. Pronunciations are grouped together into a PLS document, which may be referenced from other markup languages such as the Speech Recognition Grammar Specification (SRGS) and the Speech Synthesis Markup Language (SSML).

Usage

Here is an example PLS document:

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>fiancé</grapheme>
    <grapheme>fiance</grapheme>
    <phoneme>fiˈɒns.eɪ</phoneme>
    <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
  </lexeme>
</lexicon>

</source>

This lexicon could be used to improve text-to-speech (TTS) output, as shown in the following SSML 1.0 document:

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0"
    xmlns="http://www.w3.org/2001/10/synthesis"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/10/synthesis
      http://www.w3.org/TR/speech-synthesis/synthesis.xsd"
    xml:lang="en-US">
  <lexicon uri="http://www.example.org/lexicon_defined_above.xml"/>
  In the judgement of my fiancé, Las Vegas is the best place for a honeymoon.
  I replied that I preferred Venice and didn't think the Venetian casino was
  an acceptable compromise.
</speak>

</source>

The same lexicon could also be used to improve automatic speech recognition (ASR), as in the following SRGS 1.0 grammar:

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0"
    xmlns="http://www.w3.org/2001/06/grammar"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2001/06/grammar
      http://www.w3.org/TR/speech-grammar/grammar.xsd"
    xml:lang="en-US" root="movies" mode="voice">
  <lexicon uri="http://www.example.org/lexicon_defined_above.xml"/>
  <rule id="movies" scope="public">
    <one-of>
      <item>Terminator 2: Judgment Day</item>
      <item>My Big Fat Obnoxious Fiance</item>
      <item>Pluto's Judgement Day</item>
    </one-of>
  </rule>
</grammar>

</source>
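To make the role of the lexicon more concrete, the following is a minimal Python sketch of how a consuming application might read a PLS document into a lookup table. It is not how any particular speech engine works; the function name `load_lexicon` and the table layout are assumptions made for illustration, and the schema attributes are left out of the embedded lexicon for brevity.

```python
import xml.etree.ElementTree as ET

# Namespace of PLS 1.0 elements, as declared in the example documents.
PLS_NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

LEXICON = """\
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>fiancé</grapheme>
    <grapheme>fiance</grapheme>
    <phoneme>fiˈɒns.eɪ</phoneme>
    <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
  </lexeme>
</lexicon>
"""


def load_lexicon(xml_text):
    """Map every grapheme to the phoneme strings of its lexeme (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    table = {}
    for lexeme in root.findall("pls:lexeme", PLS_NS):
        phonemes = [p.text for p in lexeme.findall("pls:phoneme", PLS_NS)]
        for grapheme in lexeme.findall("pls:grapheme", PLS_NS):
            table.setdefault(grapheme.text, []).extend(phonemes)
    return table


lexicon = load_lexicon(LEXICON)
print(lexicon["judgement"])  # pronunciations shared with "judgment"
print(lexicon["fiance"])     # both pronunciations, in document order
```

A synthesizer or recognizer would consult a table like this whenever one of the listed spellings appears in the SSML text or in a grammar item.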

Common Use Cases

Multiple pronunciations for the same orthography

For ASR systems it is common to rely on multiple pronunciations of the same word or phrase in order to cope with variations of pronunciation within a language. In the Pronunciation Lexicon language, multiple pronunciations are represented by more than one <phoneme> (or <alias>) element within the same <lexeme> element.

In the following example the word "Newton" has two possible pronunciations.

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    alphabet="ipa" xml:lang="en-GB">
  <lexeme>
    <grapheme>Newton</grapheme>
    <phoneme>ˈnjuːtən</phoneme>
    <phoneme>ˈnuːtən</phoneme>
  </lexeme>
</lexicon>

</source>
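As an illustration of how a recognizer-side component might use several pronunciations, here is a small Python sketch that collects every `<phoneme>` of a word and checks a heard transcription against them. The helper names `pronunciations` and `accepts` are hypothetical, and exact string comparison is a deliberate simplification; a real ASR engine matches acoustically rather than by comparing IPA strings.

```python
import xml.etree.ElementTree as ET

NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

NEWTON = """\
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-GB">
  <lexeme>
    <grapheme>Newton</grapheme>
    <phoneme>ˈnjuːtən</phoneme>
    <phoneme>ˈnuːtən</phoneme>
  </lexeme>
</lexicon>
"""


def pronunciations(xml_text, word):
    """Return all phoneme strings listed for `word`, in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for lexeme in root.findall("pls:lexeme", NS):
        graphemes = [g.text for g in lexeme.findall("pls:grapheme", NS)]
        if word in graphemes:
            found.extend(p.text for p in lexeme.findall("pls:phoneme", NS))
    return found


def accepts(xml_text, word, heard):
    """True if `heard` equals any listed pronunciation of `word` (toy matcher)."""
    return heard in pronunciations(xml_text, word)


print(pronunciations(NEWTON, "Newton"))
print(accepts(NEWTON, "Newton", "ˈnuːtən"))
```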

Multiple orthographies

In some situations there are alternative textual representations for the same word or phrase. This can arise for a number of reasons; see Section 4.5 of PLS for details. Because these are representations that have the same meaning (as opposed to homophones), it is recommended that they be represented using a single <lexeme> element that contains multiple graphemes.

Here are two simple examples of multiple orthographies: alternative spelling of an English word and multiple writings of a Japanese word.

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>colour</grapheme>
    <grapheme>color</grapheme>
    <phoneme>ˈkʌlər</phoneme>
  </lexeme>
</lexicon>

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    alphabet="ipa" xml:lang="ja">
  <!-- Japanese entry showing how multiple writing systems are handled:
       romaji, kanji and hiragana orthographies -->
  <lexeme>
    <grapheme>nihongo</grapheme>
    <grapheme>日本語</grapheme>
    <grapheme>にほんご</grapheme>
    <phoneme>ɲihoŋo</phoneme>
  </lexeme>
</lexicon>

</source>
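One consequence of grouping spellings in a single `<lexeme>` is that an application can treat them as variants of one entry. The Python sketch below maps each spelling to the first `<grapheme>` of its lexeme. Treating the first grapheme as the "canonical" form is purely an assumption of this example, not something PLS defines, and `canonical_spellings` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

ENGLISH = """\
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>colour</grapheme>
    <grapheme>color</grapheme>
    <phoneme>ˈkʌlər</phoneme>
  </lexeme>
</lexicon>
"""

JAPANESE = """\
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="ja">
  <lexeme>
    <grapheme>nihongo</grapheme>
    <grapheme>日本語</grapheme>
    <grapheme>にほんご</grapheme>
    <phoneme>ɲihoŋo</phoneme>
  </lexeme>
</lexicon>
"""


def canonical_spellings(xml_text):
    """Map each spelling to the first grapheme of its lexeme (example convention)."""
    root = ET.fromstring(xml_text)
    mapping = {}
    for lexeme in root.findall("pls:lexeme", NS):
        graphemes = [g.text for g in lexeme.findall("pls:grapheme", NS)]
        if graphemes:
            for spelling in graphemes:
                mapping[spelling] = graphemes[0]
    return mapping


print(canonical_spellings(ENGLISH)["color"])
print(canonical_spellings(JAPANESE)["にほんご"])
```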

Homophones

Most languages have homophones, words with the same pronunciation but different meanings (and possibly different spellings), for instance "seed" and "cede". It is recommended that these be represented as different lexemes.

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>cede</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>seed</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
</lexicon>

</source>
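Because homophones are kept in separate lexemes, they can be found by inverting the lexicon and grouping lexemes by pronunciation. The following Python sketch does this under the assumption that identical phoneme strings mean identical pronunciations; `homophones` is a hypothetical helper, not part of any PLS toolkit.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

LEXICON = """\
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>cede</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>seed</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
</lexicon>
"""


def homophones(xml_text):
    """Return {phoneme: [graphemes of each lexeme]} for sounds shared by 2+ lexemes."""
    root = ET.fromstring(xml_text)
    by_sound = defaultdict(list)
    for lexeme in root.findall("pls:lexeme", NS):
        spellings = [g.text for g in lexeme.findall("pls:grapheme", NS)]
        for phoneme in lexeme.findall("pls:phoneme", NS):
            by_sound[phoneme.text].append(spellings)
    # Keep only pronunciations that occur in more than one lexeme.
    return {sound: groups for sound, groups in by_sound.items() if len(groups) > 1}


print(homophones(LEXICON))
```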

Homographs

Most languages have words with different meanings but the same spelling (and sometimes different pronunciations), called homographs. For example, in English the word bass (fish) and the word bass (in music) have identical spellings but different meanings and pronunciations. It is recommended that these words be represented using separate <lexeme> elements that are distinguished by different values of the role attribute (see Section 4.4 of PLS 1.0). However, if a pronunciation lexicon author does not want to distinguish between the two words, they could simply be represented as alternative pronunciations within the same <lexeme> element. In the latter case, the TTS processor will not be able to determine when to apply the first or the second transcription.

In this example the pronunciations of the homograph "bass" are shown.

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>bass</grapheme>
    <phoneme>bæs</phoneme>
    <phoneme>beɪs</phoneme>
  </lexeme>
</lexicon>

</source>

Note that English contains numerous examples of noun-verb pairs that can be treated either as homographs or as alternative pronunciations, depending on author preference. Two examples are the noun/verb "refuse" and the noun/verb "address".

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    xmlns:mypos="http://www.example.org/my_pos_namespace"
    alphabet="ipa" xml:lang="en-US">
  <lexeme role="mypos:verb">
    <grapheme>refuse</grapheme>
    <phoneme>rɪˈfjuːz</phoneme>
  </lexeme>
  <lexeme role="mypos:noun">
    <grapheme>refuse</grapheme>
    <phoneme>ˈrefjuːs</phoneme>
  </lexeme>
</lexicon>

</source>
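The effect of the role attribute can be sketched in Python as a lookup keyed on both spelling and role. This is an illustration only: the helper `pronounce` is hypothetical, and it compares role values as plain strings such as "mypos:verb", whereas a conforming processor would presumably resolve the prefix against the declared namespace. How the caller knows the part of speech is outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET

NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

LEXICON = """\
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:mypos="http://www.example.org/my_pos_namespace"
    alphabet="ipa" xml:lang="en-US">
  <lexeme role="mypos:verb">
    <grapheme>refuse</grapheme>
    <phoneme>rɪˈfjuːz</phoneme>
  </lexeme>
  <lexeme role="mypos:noun">
    <grapheme>refuse</grapheme>
    <phoneme>ˈrefjuːs</phoneme>
  </lexeme>
</lexicon>
"""


def pronounce(xml_text, word, role):
    """Return the first phoneme of the lexeme matching both `word` and `role`."""
    root = ET.fromstring(xml_text)
    for lexeme in root.findall("pls:lexeme", NS):
        # Simplification: roles are compared as literal strings.
        roles = (lexeme.get("role") or "").split()
        graphemes = [g.text for g in lexeme.findall("pls:grapheme", NS)]
        if word in graphemes and role in roles:
            phoneme = lexeme.find("pls:phoneme", NS)
            return phoneme.text if phoneme is not None else None
    return None


print(pronounce(LEXICON, "refuse", "mypos:verb"))
print(pronounce(LEXICON, "refuse", "mypos:noun"))
```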

Pronunciation by Orthography (Acronyms, Abbreviations, etc.)

For some words and phrases pronunciation can be expressed quickly and conveniently as a sequence of other orthographies. The developer is not required to have linguistic knowledge, but instead makes use of the pronunciations that are already expected to be available. To express pronunciations using other orthographies the <alias> element may be used.

This feature is particularly useful for acronym expansion.

<source lang="xml">

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon
      http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
    alphabet="ipa" xml:lang="en-US">
  <!-- Acronym expansion -->
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
  <!-- Number representation -->
  <lexeme>
    <grapheme>101</grapheme>
    <alias>one hundred and one</alias>
  </lexeme>
  <!-- Crude pronunciation mechanism -->
  <lexeme>
    <grapheme>Thailand</grapheme>
    <alias>tie land</alias>
  </lexeme>
  <!-- Crude pronunciation mechanism and acronym expansion -->
  <lexeme>
    <grapheme>BBC 1</grapheme>
    <alias>be be sea one</alias>
  </lexeme>
</lexicon>

</source>
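A rough Python sketch of alias substitution as a text pre-processing step is given below. It assumes that a processor replaces each matching grapheme with its alias before looking up pronunciations, which is one plausible reading rather than a description of any specific engine; the helper names `load_aliases` and `expand` and the whole-token matching rule are assumptions of this example.

```python
import re
import xml.etree.ElementTree as ET

NS = {"pls": "http://www.w3.org/2005/01/pronunciation-lexicon"}

LEXICON = """\
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
  <lexeme>
    <grapheme>101</grapheme>
    <alias>one hundred and one</alias>
  </lexeme>
  <lexeme>
    <grapheme>BBC 1</grapheme>
    <alias>be be sea one</alias>
  </lexeme>
</lexicon>
"""


def load_aliases(xml_text):
    """Map each grapheme to the alias text of its lexeme, if it has one."""
    root = ET.fromstring(xml_text)
    table = {}
    for lexeme in root.findall("pls:lexeme", NS):
        alias = lexeme.find("pls:alias", NS)
        if alias is None:
            continue
        for grapheme in lexeme.findall("pls:grapheme", NS):
            table[grapheme.text] = alias.text
    return table


def expand(text, table):
    """Replace whole-token occurrences of graphemes with their aliases in one pass."""
    # Longest graphemes first so that "BBC 1" wins over any shorter overlap.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: table[m.group(0)], text)


aliases = load_aliases(LEXICON)
print(expand("The W3C is on BBC 1", aliases))
```

The single-pass substitution avoids re-expanding text that an alias has just produced.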

Status and Future

  • PLS 1.0 reached the status of W3C Recommendation on 14 October 2008.
