NRC Inuktitut Search Engine

www.inuktitutcomputing.ca       contact


About the Search Engine

The NRC Inuktitut Search Engine consists of an index of Inuktitut words drawn from several thousand Inuktitut internet pages, and of a query system that accepts either Inuktitut syllabics or Roman orthography, independently of the original encoding and font of the indexed words. The words are indexed in their original form as found in the internet pages, that is, either in syllabics or in the roman alphabet.

IN THIS PAGE:

The search query's input mode
The wildcards
The booleans AND, OR and NOT
Parentheses
The results
The highlighting of the words found in the pages
About fonts

The search query's input mode

The search query can be typed in syllabics or in the roman alphabet. In syllabics, one can choose among Unicode and the ProSyl, Nunacom, and AiPaiNunavik legacy fonts, to accommodate the user's working setup. Depending on whether the user wishes to search the words indexed in syllabics or the words indexed in the roman alphabet (see section The results below), the search query, including its wildcard characters (see section The wildcards below), is transliterated internally to the proper mode (from syllabics to roman alphabet, or from roman alphabet to syllabics).

The search query can be typed in with Unicode syllabic characters or with roman letters:
(see About fonts below)

It can also be typed in with syllabic characters in any of the ProSyl, Nunacom, and AiPaiNunavik fonts:
(see About fonts below)
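The internal transliteration described above can be illustrated with a minimal sketch. It is not the engine's actual code: the function names are hypothetical, and the syllable table is deliberately partial, covering only the characters that appear in this page's own examples (ᐃᓄᒃᑎᑐᑦ, ᐊᒻᒪ, ᐅᕝᕙᓗ). Characters outside the table are passed through unchanged, which is an assumption of this sketch.

```python
# Partial table: only the syllabic characters attested on this page.
# A real table would have to cover the whole syllabary.
ROMAN_TO_SYLLABICS = {
    "i": "ᐃ", "u": "ᐅ", "a": "ᐊ",
    "ti": "ᑎ", "tu": "ᑐ",
    "nu": "ᓄ", "ma": "ᒪ", "va": "ᕙ", "lu": "ᓗ",
    "k": "ᒃ", "t": "ᑦ", "m": "ᒻ", "v": "ᕝ",
}
SYLLABICS_TO_ROMAN = {s: r for r, s in ROMAN_TO_SYLLABICS.items()}


def to_syllabics(text: str) -> str:
    """Greedy longest-match transliteration from roman letters to syllabics."""
    text = text.lower()
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ROMAN_TO_SYLLABICS:
            # Consonant + vowel syllable (or a known letter at the very end).
            out.append(ROMAN_TO_SYLLABICS[pair])
            i += 2
        else:
            # Lone vowel, final consonant, or a character outside the table.
            out.append(ROMAN_TO_SYLLABICS.get(text[i], text[i]))
            i += 1
    return "".join(out)


def to_roman(text: str) -> str:
    """Character-by-character transliteration from syllabics to roman letters."""
    return "".join(SYLLABICS_TO_ROMAN.get(ch, ch) for ch in text)
```

With this table, to_syllabics("inuktitut") gives ᐃᓄᒃᑎᑐᑦ and to_roman("ᐅᕝᕙᓗ") gives uvvalu.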

The wildcards

Wildcards can be used in queries, both in syllabics and Roman orthography. The meaning of the wildcards depends on the search query's input mode:

Examples:

The booleans AND, OR, and - (NOT)

In terms of booleans, the Inuktitut Search Engine follows the same rules as Google. One can use three booleans in the query expressions: one meaning "AND", another one meaning "OR", and the last one meaning "NOT".

The booleans are expressed with the following reserved keywords and symbols:

      English   French   Inuktitut          Inuktitut     Symbol
                         (Roman Alphabet)   (Syllabics)
AND   AND       ET       AMMA               ᐊᒻᒪ           x
OR    OR        OU       UVVALU             ᐅᕝᕙᓗ          | or /
NOT   x         x        x                  x             -  (see note below)

Notes

about '-':
The symbol '-' is not part of the Nunacom font. So, in Nunacom, you will have to type the key '-' on the keyboard. Because the code of this key corresponds to nothing in the Nunacom font, a square will probably show up in place of the character '-'. There is no need to worry: it will be interpreted correctly.

about roman alphabet keywords:
All roman alphabet keywords may be written in uppercase or lowercase letters.
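
As an illustration of the keyword table and notes above, here is a minimal sketch of how the different spellings of each boolean could be normalized to a single form. The function name normalize and the sample search terms are hypothetical. Treating a leading '-' on a term as NOT is an assumption modeled on the Google convention mentioned above, not a documented detail of the engine.

```python
# Keywords and symbols taken from the table above; roman keywords are
# matched case-insensitively, as stated in the notes.
AND_WORDS = {"and", "et", "amma", "ᐊᒻᒪ"}
OR_WORDS = {"or", "ou", "uvvalu", "ᐅᕝᕙᓗ", "|", "/"}


def normalize(query: str) -> list:
    """Split a query on whitespace and map boolean spellings to AND/OR/NOT."""
    tokens = []
    for word in query.split():
        key = word.lower()
        if key in AND_WORDS:
            tokens.append("AND")
        elif key in OR_WORDS:
            tokens.append("OR")
        elif word == "-":
            tokens.append("NOT")
        elif word.startswith("-"):
            # Assumption: '-term' means NOT term, as in Google queries.
            tokens.extend(["NOT", word[1:]])
        else:
            tokens.append(word)
    return tokens
```

For example, normalize("nuna amma inuk") and normalize("nuna ET inuk") both return ["nuna", "AND", "inuk"].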

Parentheses

Parentheses ( and ) can be used in search queries to group a sub-expression. For example:
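
The effect of parentheses on a boolean query can be sketched with a small evaluator. This is only an illustration under stated assumptions, not the engine's implementation: the names tokenize and matches are hypothetical, only the English keywords AND and OR and the symbol '-' are handled, adjacent terms are treated as an implicit AND, and the precedence (NOT, then AND, then OR) is assumed rather than documented here.

```python
def tokenize(query: str) -> list:
    """Split a query into terms, AND/OR keywords, '-', '(' and ')'."""
    tokens = []
    for word in query.replace("(", " ( ").replace(")", " ) ").split():
        if word.startswith("-") and len(word) > 1:
            tokens += ["-", word[1:]]
        else:
            tokens.append(word)
    return tokens


def matches(query: str, page_words: set) -> bool:
    """Evaluate a boolean query against the set of words found on one page."""
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() not in (None, "OR", ")"):
            if peek() == "AND":
                take()
            rhs = parse_not()  # assumed implicit AND between adjacent terms
            value = value and rhs
        return value

    def parse_not():
        if peek() == "-":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()  # a parenthesized group is a full expression
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            take()
            return value
        if tok in ("AND", "OR", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok in page_words

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Under these assumptions, for a page containing only the words nuna and inuk, the query nuna OR iglu AND tuktu matches, while (nuna OR iglu) AND tuktu does not: the parentheses force the OR to be evaluated first.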

The results

The user may decide whether the Inuktitut Search Engine will retrieve pages where the words corresponding to the search query are written in syllabics, or in the roman alphabet, or in either format. The choice is made with the 'Search for text in' radio buttons:

Only words originally in syllabics are returned by the query:
Only words originally in Roman alphabet are returned by the query:
Words both in syllabics and in Roman alphabet are returned by the query:
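
How the 'Search for text in' choice interacts with the query's input mode can be sketched as follows. The names Script, Target and plan_search are hypothetical and only illustrate the logic described on this page: the query is transliterated whenever the index being searched is not in the script the query was typed in. Conversion of legacy-font input to Unicode is not covered by this sketch.

```python
from enum import Enum


class Script(Enum):
    SYLLABICS = "syllabics"
    ROMAN = "roman"


class Target(Enum):
    SYLLABICS = "syllabics"
    ROMAN = "roman"
    BOTH = "both"


def plan_search(query_script: Script, target: Target) -> list:
    """Return (index_script, needs_transliteration) pairs for one query."""
    if target is Target.BOTH:
        indexes = [Script.SYLLABICS, Script.ROMAN]
    else:
        indexes = [Script(target.value)]
    # Transliterate only when the index script differs from the input script.
    return [(index, index is not query_script) for index in indexes]
```

For instance, a query typed in the roman alphabet with the "both" option selected is searched as typed in the roman index and in transliterated form in the syllabic index.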

The highlighting of the words found in the pages

The results of a search with the NRC Inuktitut Search Engine, much like those of other search engines, consist of lists of URLs where words corresponding to the search query have been found. If the search was requested in both syllabic and roman texts, the results are presented in two lists displayed side by side. Each URL is presented with snippets, that is, fragments of the page in which the words corresponding to the search query are in bold. Each URL is also accompanied by a HIGHLIGHTER icon. Clicking on the URL itself will take you to the original page. Clicking on the HIGHLIGHTER icon will show you a copy of that same page with all the words corresponding to the search query highlighted in green.



About fonts

Display
All Inuktitut syllabics displayed by the NRC Inuktitut Search Engine are displayed in Unicode. If you don't already have a Unicode font for Inuktitut installed on your computer (that is, if the Inuktitut word ᐃᓄᒃᑎᑐᑦ does not show up as syllabic characters; the exact shape is not important), then you will have to download one, install it, and tell your browser to use it for the Canadian Aboriginal Syllabary. You will find here a comprehensive set of Unicode fonts for Inuktitut.
Input
Input of Inuktitut syllabics can be done in Unicode or with one of the legacy fonts ProSyl, Nunacom, and AiPaiNunavik.

If you choose to go the Unicode way, please visit this page; it explains how to do it.  You will need a Unicode font for Inuktitut.

If you would rather go the legacy font way, please check here. (By the way, if you are already using ProSyl, Nunacom, or AiPaiNunavik, you should logically have that font already installed on your computer.) If you can't see wkct`r5 as syllabic characters, this means that you do not have ProSyl. If you can't see wkct}r5 as syllabic characters, this means that you do not have Nunacom. If you can't see wkctŒ5 as syllabic characters, this means that you do not have AiPaiNunavik. You will find here a comprehensive set of Legacy Inuktitut fonts.