Trigrams for 500+ languages.
- What is this?
- When should I use this?
- Install
- Use
- API
- Data
- Compatibility
- Contribute
- Security
- License
This package exposes all trigrams for natural languages. They are based on the most translated copyright-free document on this planet: the Universal Declaration of Human Rights (UDHR).
Use this package when you are dealing with natural language detection.
This package is ESM only. In Node.js (version 18+), install with npm:
npm install trigrams
In Deno with esm.sh:
import {min, top} from 'https://esm.sh/trigrams@6'
In browsers with esm.sh:
<script type="module">
import {min, top} from 'https://esm.sh/trigrams@6?bundle'
</script>
import {min, top} from 'trigrams'
console.log((await min()).nld)
console.log((await top()).pam)
Yields:
[ // 300 top trigrams.
' ar',
'eer',
'tij',
// …
'de ',
'an ',
'en ' // Most common trigram.
]
{ // 300 top trigrams.
'isa': 6,
'upa': 6,
'i k': 6,
// …
'ang': 273,
'ing': 282,
'ng ': 572 // Most common trigram with how often it was found.
}
This package exports the identifiers min and top.
It exports no TypeScript types.
There is no default export.
min() gets the top trigrams.
It returns a promise resolving to an object mapping UDHR in Unicode codes to arrays containing the top 300 trigrams, sorted from least occurring to most occurring (Promise<Record<string, Array<string>>>).
top() gets the top trigrams mapped to occurrence counts.
It returns a promise resolving to an object mapping UDHR in Unicode codes to objects mapping the top 300 trigrams to occurrence counts (Promise<Record<string, Record<string, number>>>).
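The two result shapes carry the same trigrams, one as an ordered array and one with counts. As a rough illustration of how they relate, the sketch below turns a small object in the shape of top() into the shape of min() by sorting on count. The helper name toMinShape and the sample object are hypothetical (the counts are borrowed from the pam excerpt above, heavily truncated), and this is not necessarily how the package builds its data.

```javascript
// Hypothetical sample in the shape `top()` resolves to.
// Real results hold 300 trigrams per language code.
const sampleTop = {
  pam: {'isa': 6, 'ang': 273, 'ing': 282, 'ng ': 572}
}

// Hypothetical helper, not part of the `trigrams` API:
// map each code to its trigrams, least occurring first.
function toMinShape(top) {
  const result = {}
  for (const [code, counts] of Object.entries(top)) {
    result[code] = Object.entries(counts)
      .sort((a, b) => a[1] - b[1]) // ascending by occurrence count
      .map(([trigram]) => trigram)
  }
  return result
}

console.log(toMinShape(sampleTop).pam)
```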
The trigrams are based on the Unicode versions of the Universal Declaration of Human Rights.
The files are created from all paragraphs made available by wooorm/udhr and do not include headings and such.
Before creating trigrams,
- the Unicode characters from \u0021 to \u0040 (both inclusive) are removed
- one or more white space characters (\s+) are replaced with a single space
- alphabetic characters ([A-Z]) are lower cased
Additionally, the input is padded with two spaces on both sides.
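The cleaning steps above can be sketched as follows. This is a minimal illustration that follows the description literally, not the code used to generate the data. The function names clean and countTrigrams are hypothetical, and the order of the steps is an assumption taken from the order of the list.

```javascript
// Hypothetical helper, not part of the `trigrams` API.
function clean(value) {
  const cleaned = value
    .replace(/[\u0021-\u0040]/g, '') // remove `!` through `@` (punctuation, digits)
    .replace(/\s+/g, ' ') // collapse runs of white space into one space
    .replace(/[A-Z]/g, (character) => character.toLowerCase()) // lower case A-Z
  return '  ' + cleaned + '  ' // pad with two spaces on both sides
}

// Hypothetical helper: count every 3-character window of the cleaned input.
function countTrigrams(value) {
  const padded = clean(value)
  const counts = Object.create(null)
  for (let index = 0; index + 3 <= padded.length; index++) {
    const trigram = padded.slice(index, index + 3)
    counts[trigram] = (counts[trigram] || 0) + 1
  }
  return counts
}

console.log(JSON.stringify(clean('Hello, World!'))) // prints "  hello world  "
console.log(countTrigrams('aaa'))
```

Sorting the resulting counts and keeping the 300 most frequent entries would give data in the shape described for top().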
This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 18+. It also works in Deno and modern browsers.
Yes please! See How to Contribute to Open Source.
This package is safe.