Releases: PyThaiNLP/pythainlp
PyThaiNLP v5.1.2 Released!
PyThaiNLP v5.1.2 is a bug fix release of PyThaiNLP v5.1.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.1
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.1 Change Log: #900.
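To confirm that an upgrade actually picked up this release, the installed version can be compared against 5.1.2. The following is a minimal sketch using only the standard library; the helper names release_tuple and installed_release are hypothetical and are not part of PyThaiNLP.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def release_tuple(version_string):
    """Return (major, minor, patch) from strings such as '5.1.2' or '5.1.0b2'.

    Any pre-release suffix is ignored; a missing patch number counts as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def installed_release(package="pythainlp"):
    """Return the installed release tuple, or None if the package is absent."""
    try:
        return release_tuple(version(package))
    except PackageNotFoundError:
        return None


current = installed_release()
if current is None:
    print("pythainlp is not installed; run: pip install pythainlp")
elif current < (5, 1, 2):
    print("older than 5.1.2; run: pip install -U pythainlp")
else:
    print("pythainlp is at 5.1.2 or newer")
```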
What's Changed
- Update romanize docs and keep space #1110
Full Changelog: v5.1.1...v5.1.2
Contributors
Thanks to all the contributors. (Image made with contributors-img)
We build Thai NLP.
PyThaiNLP
PyThaiNLP v5.1.1 Released!
PyThaiNLP v5.1.1 is a bug fix release of PyThaiNLP v5.1.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.1
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.1 Change Log: #900.
What's Changed
- PR Description: Refactor thai_consonants_all to Use set in syllable.py #1087 by @allrob23
- ThaiTransliterator: Select 1D CPU int64 tensor device #1089 by @jkingd0n
Full Changelog: v5.1.0...v5.1.1
PyThaiNLP v5.1.0 Released!
We released PyThaiNLP v5.1.0! This version adds new features, such as Thai Discourse Treebank (TDTB) support and Thai solar-to-lunar date conversion, and fixes a number of bugs.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.1
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.1 Change Log: #900
What is new?
New features
- Add Thai Discourse Treebank postag #910
- Add Thai Universal Dependency Treebank postag #916
- Add Thai G2P v2 Grapheme-to-Phoneme model #923
- Add support for list of strings as input to sent_tokenize() #927
- Add pythainlp.tools.safe_print to handle UnicodeEncodeError on console #969
- Add Thai solar date to Thai lunar date conversion #998
- Add Thai pangram text #1045
- Add pythainlp.llm #1043
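The problem that a console-safe print addresses (see #969) is that printing Thai text to a console with a limited encoding raises UnicodeEncodeError. The following is a minimal sketch of that idea, not the pythainlp.tools.safe_print implementation: print_safely is a hypothetical stand-in, and the fallback of replacing unencodable characters with "?" is an assumption chosen for illustration.

```python
import io
import sys


def print_safely(text, stream=None):
    """Hypothetical helper: print text, degrading gracefully on encoding errors."""
    if stream is None:
        stream = sys.stdout
    try:
        print(text, file=stream)
    except UnicodeEncodeError:
        # Re-encode with the stream's own encoding, replacing every
        # character it cannot represent with "?".
        encoding = getattr(stream, "encoding", None) or "ascii"
        fallback = text.encode(encoding, errors="replace").decode(encoding)
        print(fallback, file=stream)


# Simulate a console that only understands ASCII.
buffer = io.BytesIO()
ascii_console = io.TextIOWrapper(buffer, encoding="ascii", newline="\n")
print_safely("ภาษาไทย ok", stream=ascii_console)
ascii_console.flush()
demo_output = buffer.getvalue().decode("ascii")
```

In the simulated ASCII console, the Thai characters are replaced while the ASCII part of the message is still printed, instead of the whole call failing.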
Bug fixes
- Fix collate() to consider tonemark in ordering #926
- Fix maiyamok() expanding the wrong word #962
- Fix nlpo3.load_dict() never printing an error message on failure #979
Remove
- Remove clause_tokenize #1024
Deprecation and other API changes
- 5.1
  - pythainlp.util.is_native_thai, use pythainlp.morpheme.is_native_thai instead
- 5.2
  - pythainlp.cls, use pythainlp.classify instead
  - pythainlp.corpus.thai_synonym, use pythainlp.corpus.thai_synonyms instead
  - pythainlp.util.maiyamok, use pythainlp.util.expand_maiyamok instead
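When migrating a codebase, the renames listed above can be checked mechanically. The following is a rough sketch, assuming the code refers to the old names by their full dotted path; find_deprecated and RENAMED are hypothetical names, and forms such as "from pythainlp.util import maiyamok" are not detected.

```python
import re

# Old name -> replacement, copied from the deprecation list above.
RENAMED = {
    "pythainlp.util.is_native_thai": "pythainlp.morpheme.is_native_thai",
    "pythainlp.cls": "pythainlp.classify",
    "pythainlp.corpus.thai_synonym": "pythainlp.corpus.thai_synonyms",
    "pythainlp.util.maiyamok": "pythainlp.util.expand_maiyamok",
}


def find_deprecated(source):
    """Return (line number, old name, replacement) for each deprecated reference."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in RENAMED.items():
            # The lookarounds stop thai_synonym from matching thai_synonyms
            # and stop matches inside longer dotted names.
            pattern = rf"(?<![\w.]){re.escape(old)}(?!\w)"
            if re.search(pattern, line):
                hits.append((lineno, old, new))
    return hits


sample = (
    "import pythainlp.cls\n"
    "synonyms = pythainlp.corpus.thai_synonyms\n"
    "expand = pythainlp.util.maiyamok\n"
)
hits = find_deprecated(sample)
for lineno, old, new in hits:
    print(f"line {lineno}: replace {old} with {new}")
```

The second sample line already uses the new name, so only the first and third lines are reported.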
Improve
- Add more Thai political parties to the Thai dictionary 2252dee
- Fix inconsistency in newmm-safe engine by copilot #1063
- Update warn_deprecation to get deprecated and removal versions #1028
- Remove unnecessary enumerate in expand_maiyamok #1029
- Add SPDX FileType #1032
- Fix bug in Longest Matching tokenizer to preprocess spaces consistently #1062
- Add codemeta.json file to root directory #1053
Full Changelog: v5.0.0...v5.1.0
PyThaiNLP v5.1.0-beta2
Schedule
- First Beta release: 27 December 2024
- Production release: WIP
PyThaiNLP 5.1 Change Log #900
Docs: https://pythainlp.org/dev-docs/
What's Changed
- Add pythainlp.llm by @wannaphong in #1043
- Add How to cut a new release doc by @bact in #1051
- Update pandas requirement from ==1.4.* to ==2.2.* by @dependabot in #1041
- Bump sentence-transformers from 2.2.2 to 2.7.0 by @dependabot in #1038
- Bump pyicu from 2.8 to 2.14 by @dependabot in #1052
- Add pythainlp.lm.calculate_ngram_counts by @wannaphong in #1054
- Fixed #1055 bug: Tone detector + syllable sound bug by @wannaphong in #1056
- Fix inconsistency in newmm-safe engine by copilot by @wannaphong in #1063
- Fix bug in Longest Matching tokenizer to preprocess spaces consistently by @wannaphong in #1062
- [Ready] Reduce reload word tokenizer engine in word_tokenize by @new5558 in #1064
- Add display cell tokenizer by @wannaphong in #1058
- Add longest common subsequence algorithm by @wannaphong in #1059
- Bump transformers from 4.47.1 to 4.48.0 by @dependabot in #1068
- Bump protobuf from 5.29.2 to 5.29.3 by @dependabot in #1067
- Fix custom dict error for unsupported tokenization engines by @wannaphong in #1066
- Add pythainlp.util.spelling by @wannaphong in #1060
- Add misspell command to CLI by @wannaphong in #1057
- Add codemeta.json file to root directory by @wannaphong in #1069
- Bump epitran from 1.25.1 to 1.26.0 by @dependabot in #1072
- Bump transformers from 4.48.0 to 4.48.1 by @dependabot in #1071
- Bump transformers from 4.48.1 to 4.48.2 by @dependabot in #1074
Full Changelog: v5.1.0-beta1...v5.1.0-beta2
PyThaiNLP v5.1.0-beta1
Schedule
- First Beta release: 27 December 2024
- Production release: WIP
PyThaiNLP 5.1 Change Log #900
What's Changed
- Add Thai Universal Dependency Treebank postag by @wannaphong in #916
- Add Thai Discourse Treebank postag by @wannaphong in #910
- Update tone_detector() API description by @bact in #919
- Add save and load for pythainlp.classify.param_free.GzipModel by @wannaphong in #908
- Add Thai G2P v2 Grapheme-to-Phoneme model by @wannaphong in #923
- Bump transformers from 4.36.0 to 4.38.0 by @dependabot in #907
- Add preprocess function to split whitespace before romanize by @pavaris-pm in #924
- Fix collate() to consider tonemark in ordering by @WTFPUn in #926
- test: Add more cases to cover all possible Marttra by @HRNPH in #929
- Bump github/codeql-action from 2 to 3 by @dependabot in #939
- Bump actions/setup-python from 4 to 5 by @dependabot in #940
- Bump peaceiris/actions-gh-pages from 3 to 4 by @dependabot in #937
- Bump conda-incubator/setup-miniconda from 2 to 3 by @dependabot in #936
- Bump actions/stale from 6 to 9 by @dependabot in #938
- Add support for list of strings as input to sent_tokenize() by @ayaan-qadri in #927
- Bump python-crfsuite from 0.9.9 to 0.9.11 by @dependabot in #943
- Tidy up workflow files by @bact in #946
- Upgrade Python in CI to 3.10 by @bact in #947
- Fix nltk.downloader warning by @bact in #949
- Remove unused pytest by @bact in #950
- Unify unit test workflow across OSes by @bact in #951
- Specify a limited test suite by @bact in #952
- Use common warn_deprecation by @bact in #956
- Move sent_tokenize with default crfcut to testx by @bact in #958
- Merge new sent_tokenize test to fix-954 by @bact in #959
- Move more sent_tokenize test by @bact in #960
- Move more sent_tokenize test by @bact in #961
- Fix sent_tokenize(engine="whitespace") return value to be a list of string by @wannaphong in #957
- Fix maiyamok() expanding the wrong word by @bact in #962
- Add version to deprecation warnings by @bact in #963
- Remove tests with Sonarcloud issue by @bact in #964
- Add test_tools to test suite by @bact in #965
- Add pythainlp.tools.safe_print to handle UnicodeEncodeError on console by @bact in #969
- Make CLI able to handle Unicode characters output on Windows console by @bact in #968
- Split test_tag and testx_tag by @bact in #970
- Add test_tag to init by @bact in #971
- Add test_corpus to init by @bact in #972
- Add test coverage by @bact in #974
- Add test_khavee to test suite by @bact in #967
- Create CHANGELOG.md by @bact in #975
- Add Compact Tests (testc) by @bact in #976
- Add testc_tools (misspell) by @bact in #977
- Fix warnings and types by @bact in #978
- Fix nlpo3.load_dict() never printing an error message on failure by @bact in #979
- Add tests.compact.transliterate (PyICU test) by @bact in #980
- Add documentation about compact install option by @bact in #981
- Bump symspellpy from 6.7.7 to 6.7.8 by @dependabot in #985
- Bump sentencepiece from 0.1.99 to 0.2.0 by @dependabot in #982
- Bump tensorflow from 2.13.1 to 2.18.0 by @dependabot in #988
- Bump bpemb from 0.3.4 to 0.3.6 by @dependabot in #989
- Add nlpo3 to compact install/test by @bact in #987
- Bump h5py from 3.1.0 to 3.12.1 by @dependabot in #991
- Use "build" instead of setup.py + add "[cd build]" build trigger word by @bact in #994
- Add Thai Solar Date convert to Thai Lunar Date by @wannaphong in #998
- Update requests requirement from ==2.31.* to ==2.32.* by @dependabot in #1003
- Bump gensim from 4.3.2 to 4.3.3 by @dependabot in #1009
- Update numpy requirement from ==1.22.* to ==1.26.* by @dependabot in #1007
- Bump epitran from 1.9 to 1.25.1 by @dependabot in #1006
- Bump astral-sh/ruff-action from 1 to 2 by @dependabot in #1010
- Bump spacy-thai from 0.7.1 to 0.7.8 by @dependabot in #1014
- Bump fairseq from 0.10.2 to 0.12.2 by @dependabot in #1013
- Bump transformers from 4.38.0 to 4.47.0 by @dependabot in #1020
- Bump panphon from 0.20.0 to 0.21.2 by @dependabot in #1022
- Remove clause_tokenize by @wannaphong in #1024
- Update warn_deprecation to get deprecated and removal versions by @bact in #1028
- Remove unnecessary enumerate in expand_maiyamok by @bact in #1029
- Add SPDX FileType by @bact in #1032
- Bump spylls from 0.1.5 to 0.1.7 by @dependabot in #1035
- Bump emoji from 0.5.4 to 0.6.0 by @dependabot in #1036
- Bump wtpsplit from 1.0.1 to 1.3.0 by @dependabot in #1037
- Simplify calculate_f_year_f_dev() by @bact in #1031
- Bump sacremoses from 0.0.41 to 0.1.1 by @dependabot in #1034
- Bump protobuf from 3.20.3 to 5.29.1 by @dependabot in #1033
- Bump protobuf from 5.29.1 to 5.29.2 by @dependabot in #1042
- Bump ufal-chu-liu-edmonds from 1.0.2 to 1.0.3 by @dependabot in #1040
- Bump transformers from 4.47.0 to 4.47.1 by @dependabot in #1039
- Bump astral-sh/ruff-action from 2 to 3 by @dependabot in #1044
- Add Thai pangram text by @wannaphong in #1045
- Fixed #1004 by @wannaphong in #1046
- PyThaiNLP v5.1.0-beta1 by @wannaphong in #1047
New Contributors
- @WTFPUn made their first contribution in #926
- @ayaan-qadri made their first contribution in #927
Full Changelog: v5.0.5...v5.1.0-beta1
PyThaiNLP v5.0.5 Released!
PyThaiNLP v5.0.5 is a bug fix release of PyThaiNLP v5.0.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.0
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.0 Change Log: #788.
What's Changed
- Add clause_tokenize warnings #1026
- Fix maiyamok() (merge back from #962)
Full Changelog: v5.0.4...v5.0.5
PyThaiNLP v5.0.4 Released!
PyThaiNLP v5.0.4 is a bug fix release of PyThaiNLP v5.0.3.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.0
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.0 Change Log: #788.
What's Changed
- Fixed #914 by @wannaphong in #917
Full Changelog: v5.0.3...v5.0.4
PyThaiNLP v5.0.3 Released!
PyThaiNLP v5.0.3 is a bug fix release of PyThaiNLP v5.0.2.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.0
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.0 Change Log: #788.
What's Changed
- Create .editorconfig by @bact in #909
- Fix empty string ('') added (in some cases) when using word_tokenize with join_broken_num=True by @S2P2 in #912
Full Changelog: v5.0.2...v5.0.3
PyThaiNLP v5.0.2 Released!
PyThaiNLP v5.0.2 is a bug fix release of PyThaiNLP v5.0.1.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.0
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.0 Change Log: #788.
What's Changed
- Update README and license header by @bact in #902
- Updated crfcut.py by @varunkatiyar819 in #905
New Contributors
- @varunkatiyar819 made their first contribution in #905
Full Changelog: v5.0.1...v5.0.2
PyThaiNLP v5.0.1 Released!
PyThaiNLP v5.0.1 is a bug fix release of PyThaiNLP v5.0.0.
Install: pip install pythainlp
Upgrade: pip install -U pythainlp
- Documentation: https://pythainlp.github.io/docs/5.0
- Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See PyThaiNLP 5.0 Change Log: #788.
What's Changed
- Fixed bug: ImportError pycrfsuite #901
Full Changelog: v5.0.0...v5.0.1