docs: more details about regexp engines (#885)
slevithan authored Jan 5, 2025
1 parent d1a840a commit a7834b3
Showing 1 changed file with 13 additions and 13 deletions.
26 changes: 13 additions & 13 deletions docs/guide/regex-engines.md
@@ -4,11 +4,9 @@ outline: deep

# RegExp Engines

TextMate grammars are based on regular expressions to match tokens. Usually, we use [Oniguruma](https://github.com/kkos/oniguruma) (a regular expression engine written in C) to parse the grammar. To make it work in JavaScript, we compile Oniguruma to WebAssembly to run in the browser or Node.js.
TextMate grammars are based on regular expressions that match tokens. More specifically, they assume that [Oniguruma](https://github.com/kkos/oniguruma) (a powerful regex engine written in C) will be used to interpret the regular expressions. To make this work in JavaScript, we compile Oniguruma to WebAssembly to run in the browser or Node.js.

Since v1.15, we expose the ability to for users to switch the RegExp engine and provide custom implementations.

An `engine` option is added to the `createHighlighter` and `createHighlighterCore`. For example:
Since v1.15, we expose the ability for users to switch the regex engine or provide a custom implementation. To do so, add an `engine` option to `createHighlighter` or `createHighlighterCore`. For example:

```ts
import { createHighlighter } from 'shiki'
@@ -24,7 +22,7 @@ Shiki comes with two built-in engines:

## Oniguruma Engine

This is the default engine that uses the compiled Oniguruma WebAssembly. The most accurate and robust engine.
This is the default engine that uses the compiled Oniguruma WebAssembly.

```ts
import { createHighlighter } from 'shiki'
@@ -39,7 +37,7 @@ const shiki = await createShiki({

## JavaScript RegExp Engine

This engine uses JavaScript's native RegExp. As regular expressions used by TextMate grammars are written for Oniguruma, they might contain syntax that is not supported by JavaScript's RegExp, or expect different behavior for the same syntax. So we use [Oniguruma-To-ES](https://github.com/slevithan/oniguruma-to-es) to transpile Oniguruma patterns to native JavaScript RegExp.
This engine uses JavaScript's native `RegExp`. Since regular expressions used by TextMate grammars are written for Oniguruma, they might contain syntax that is not supported by JavaScript's `RegExp`, or expect different behavior for the same syntax. So we use [Oniguruma-To-ES](https://github.com/slevithan/oniguruma-to-es) to transpile Oniguruma patterns to native JavaScript regexes.

```ts {2,4,9}
import { createHighlighter } from 'shiki'
@@ -56,31 +54,33 @@ const shiki = await createHighlighter({
const html = shiki.codeToHtml('const a = 1', { lang: 'javascript', theme: 'nord' })
```
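
To see why transpilation is needed, the sketch below feeds a few Oniguruma-style patterns directly to JavaScript's native `RegExp`. The helper name `compilesNatively` is hypothetical and not part of Shiki or Oniguruma-To-ES. It only illustrates the two failure modes described above: syntax that JavaScript rejects, and syntax that JavaScript accepts but interprets differently.

```ts
// Hypothetical helper (not a Shiki API): reports whether a pattern
// source is accepted by the native JavaScript RegExp constructor.
function compilesNatively(source: string, flags = ''): boolean {
  try {
    new RegExp(source, flags)
    return true
  }
  catch {
    return false
  }
}

// Unsupported syntax: possessive quantifiers and atomic groups are valid
// in Oniguruma but are syntax errors in JavaScript.
const possessiveOk = compilesNatively('a++')
const atomicOk = compilesNatively('(?>a)')

// Different behavior: in Oniguruma `\h` matches a hex digit. Without the
// `u` or `v` flag, JavaScript treats it as an escaped literal "h".
const matchesHexDigit = new RegExp('\\h').test('f')
const matchesLetterH = new RegExp('\\h').test('h')

console.log({ possessiveOk, atomicOk, matchesHexDigit, matchesLetterH })
```

Patterns like these are the reason a grammar written for Oniguruma cannot simply be handed to `RegExp` unchanged.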

Please check the [compatibility table](/references/engine-js-compat) for the support status of the languages you are using.
The advantages of using the JavaScript engine are that it doesn't require loading a large WebAssembly file for Oniguruma and it is faster for some grammars (since the regular expressions run as native JavaScript).

Although the JavaScript engine's Oniguruma emulation is quite robust, it's not guaranteed that the highlighting will be 100% the same in some edge cases. Additionally, a few grammars are unsupported.

Unlike the Oniguruma engine, the JavaScript engine is strict by default. It will throw an error if it encounters a pattern that it cannot convert. If mismatches are acceptable and you want best-effort results whenever possible, you can enable the `forgiving` option to suppress any errors that happened during the conversion:
Please check the [compatibility table](/references/engine-js-compat) for the support status of languages you are using.

The JavaScript engine is strict by default, and will throw an error if it encounters a pattern that it cannot convert. If mismatches are acceptable and you want best-effort results for unsupported grammars, you can enable the `forgiving` option to suppress any conversion errors:

```ts
const jsEngine = createJavaScriptRegexEngine({ forgiving: true })
// ...use the engine
```
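
The difference between strict and forgiving handling can be sketched independently of Shiki. The following is a minimal illustration under stated assumptions: `compilePatterns` is a hypothetical helper, it uses the native `RegExp` constructor in place of a real Oniguruma converter, and representing an unusable pattern as `null` is an assumption made for the example, not a description of Shiki's internals.

```ts
// Hypothetical helper, not Shiki's implementation: compile a list of
// pattern sources, either strictly or on a best-effort basis.
function compilePatterns(
  sources: string[],
  options: { forgiving?: boolean } = {},
): (RegExp | null)[] {
  return sources.map((source) => {
    try {
      return new RegExp(source)
    }
    catch (error) {
      // Strict mode: surface the first pattern that cannot be compiled.
      if (!options.forgiving)
        throw error
      // Forgiving mode: keep going and mark the pattern as unusable.
      return null
    }
  })
}

// The second pattern uses an atomic group, which JavaScript rejects.
const sources = ['\\bconst\\b', '(?>a)']
const bestEffort = compilePatterns(sources, { forgiving: true })
console.log(bestEffort.map(re => re !== null))
```

Strict handling fails fast, which makes unsupported grammars obvious. Forgiving handling trades that certainty for partial highlighting.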

::: info
If you run Shiki on Node.js (or at build time) and bundle size or WebAssembly support is not a concern, we still recommend using the Oniguruma engine for the best result.
If you run Shiki on Node.js (or at build time) and bundle size or WebAssembly support is not a concern, we still recommend using the Oniguruma engine for best results.

The JavaScript engine is best when running in the browser and in cases when you want to control the bundle size.
:::

### JavaScript Runtime Target

For the best result, [Oniguruma-To-ES](https://github.com/slevithan/oniguruma-to-es) uses the [RegExp `v` flag](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets), which is available in Node.js v20+ and ES2024 ([Browser compatibility](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets#browser_compatibility)).

For older environments, it can use the `u` flag but this results in a few less grammars being supported.
For best results, the JavaScript engine uses the [RegExp `v` flag](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets), which is available in Node.js v20+ and ES2024 ([browser compatibility](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicodeSets#browser_compatibility)). For older environments, it automatically uses the `u` flag instead, but this results in a few less grammars being supported.

By default, the runtime target is automatically detected. You can override this behavior by setting the `target` option:

```ts
const jsEngine = createJavaScriptRegexEngine({
  target: 'ES2018', // or 'ES2024', default is 'auto'
  target: 'ES2018', // or 'auto' (default), 'ES2024', 'ES2025'
})
```
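
If you want to choose the target yourself, one common approach is feature detection. The sketch below is an assumption about how such a check could look, not necessarily how the engine's automatic detection works. `supportsFlag` is a hypothetical helper that tests whether the current runtime accepts a given `RegExp` flag.

```ts
// Hypothetical helper: feature-detect a RegExp flag by trying to
// construct a regex with it. Unsupported flags throw a SyntaxError.
function supportsFlag(flag: string): boolean {
  try {
    new RegExp('', flag)
    return true
  }
  catch {
    return false
  }
}

// Pick the value to pass as `target`: 'ES2024' relies on the `v` flag,
// while 'ES2018' falls back to the `u` flag.
const target: 'ES2024' | 'ES2018' = supportsFlag('v') ? 'ES2024' : 'ES2018'
console.log(target)
```

The resulting value can then be passed as the `target` option shown above.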
