Vendor the antlr4 runtime library #487

Merged
merged 10 commits into from
Oct 30, 2024
2 changes: 2 additions & 0 deletions .gitattributes
@@ -1 +1,3 @@
.git_archival.txt export-subst
cf_units/_udunits2_parser/parser/**/*.py linguist-generated=true
cf_units/_udunits2_parser/_antlr4_runtime/**/*.py linguist-generated=true
23 changes: 19 additions & 4 deletions cf_units/_udunits2_parser/README.md
@@ -13,9 +13,10 @@ a number of convenient lexical elements.

Once the Jinja2 template has been expanded, the
[ANTLR Java library](https://github.com/antlr/antlr4) is used to
compile the grammar into the targetted runtime language.
compile the grammar into the targeted runtime language.

[A script](compile.py) is provided to automate this as much as possible.
It has a dependency on pip, Jinja2, Java and ruff.

The compiled parser is committed to the repository for ease of
deployment and testing (we know it isn't ideal, but it really does make things easier).
@@ -24,9 +25,23 @@ changes to the grammar being proposed so that the two can remain in synch.

### Updating the ANTLR version

The above script downloads a Java Jar which needs updating to the same version
as antlr4-python3-runtime specified in the python requirements. Once these have
been updated, run [the script](compile.py) to regenerate the parser.
The [compile.py script](compile.py) copies the ANTLR4 runtime into the _antlr4_runtime
directory, and this should be committed to the repository. This means that we do not
have a runtime dependency on ANTLR4 (which was found to be challenging due to the
fact that you need to pin to a specific version of the ANTLR4 runtime, and aligning
this version with other libraries which also have an ANTLR4 dependency is impractical).

Since the generated code is committed to this repo, and the ANTLR4 runtime is also vendored into it, we won't ever need to run ANTLR4 unless the grammar changes.

So, we will only change the ANTLR4 version if we need new features of the
parser/lexer generators, or it becomes difficult to support the older version.

Upgrading the ANTLR4 version is a simple matter of changing `ANTLR_VERSION` in the compile.py
script, and then re-running it. This should re-generate the parser/lexer, and update
the content in the _antlr4_runtime directory. One complexity may be that the imports
of the ANTLR4 runtime need to be rewritten to support vendoring, and the code needed
to do so may change from version to version. This topic is being followed upstream
with the ANTLR4 project in the hope of making this easier and/or built in to ANTLR4.
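
As an illustration of the import rewriting mentioned above, here is a minimal sketch. It is not the code used by compile.py; the function name `rewrite_antlr4_imports`, the constant `VENDORED_PACKAGE`, and the choice of absolute (rather than relative) imports are assumptions made for the example. It only handles `from antlr4... import ...` statements, which may not cover every import form in a given ANTLR4 release.

```python
import re

# Assumed target package for the vendored runtime (hypothetical constant;
# the real compile.py may rewrite imports differently, e.g. relatively).
VENDORED_PACKAGE = "cf_units._udunits2_parser._antlr4_runtime"

# Matches "from antlr4 import ..." and "from antlr4.sub.module import ...",
# but not look-alike packages such as "antlr4x".
_FROM_IMPORT = re.compile(
    r"^(\s*from\s+)antlr4((?:\.\w+)*\s+import\b)",
    re.MULTILINE,
)


def rewrite_antlr4_imports(source: str, package: str = VENDORED_PACKAGE) -> str:
    """Point `from antlr4... import` statements at the vendored copy."""
    return _FROM_IMPORT.sub(
        lambda m: f"{m.group(1)}{package}{m.group(2)}",
        source,
    )
```

Applied to each file copied into the _antlr4_runtime directory, a rewrite along these lines would leave unrelated imports untouched while redirecting the runtime's references to itself.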

### Testing the grammar

10 changes: 7 additions & 3 deletions cf_units/_udunits2_parser/__init__.py
@@ -5,10 +5,14 @@

import unicodedata

from antlr4 import CommonTokenStream, InputStream
from antlr4.error.ErrorListener import ErrorListener

from . import graph
from ._antlr4_runtime import (
    CommonTokenStream,
    InputStream,
)
from ._antlr4_runtime.error.ErrorListener import (
    ErrorListener,
)
from .parser.udunits2Lexer import udunits2Lexer
from .parser.udunits2Parser import udunits2Parser
from .parser.udunits2ParserVisitor import udunits2ParserVisitor
309 changes: 309 additions & 0 deletions cf_units/_udunits2_parser/_antlr4_runtime/BufferedTokenStream.py