Version 1.1.1 Release

@Shani-Sinojiya Shani-Sinojiya released this 11 Dec 08:52

New Features:

  1. Parquet Format Support:

    • Added support for converting ARFF files to Parquet format. This is an efficient columnar storage format that is widely used for big data processing and analytics.
    • You can now convert your ARFF files to Parquet with the following command:
      arff-format-converter -f data.arff -o output -fmt parquet
  2. Fast Mode:

    • Introduced a --fast flag that skips validation checks during conversion. (In the commands shown here, -f names the input file, so fast mode is enabled with the long form --fast.) This mode is useful when you are confident that the input and output paths are correct and you need a faster conversion.
    • To enable fast mode, use:
      arff-format-converter -f data.arff -o output -fmt json --fast
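
For converting several files at once, the commands above can be wrapped in a small script. The following is a minimal sketch, not part of the package: the helper names build_command and convert_all are hypothetical, and it assumes only the flags shown in these notes (-f, -o, -fmt, --fast) and that arff-format-converter is on your PATH.

```python
import subprocess
from pathlib import Path


def build_command(arff_path, output_dir, fmt="parquet", fast=False):
    """Assemble the CLI call using only the flags shown in these notes.

    Hypothetical helper: it does not call into the package itself.
    """
    cmd = [
        "arff-format-converter",
        "-f", str(arff_path),   # input ARFF file
        "-o", str(output_dir),  # output location
        "-fmt", fmt,            # target format, e.g. "parquet" or "json"
    ]
    if fast:
        cmd.append("--fast")    # skip validation checks
    return cmd


def convert_all(input_dir, output_dir, fmt="parquet", fast=False):
    """Run the converter once for every .arff file found in input_dir."""
    for arff_path in sorted(Path(input_dir).glob("*.arff")):
        # check=True raises CalledProcessError if the converter exits non-zero
        subprocess.run(build_command(arff_path, output_dir, fmt, fast), check=True)
```

Calling convert_all("datasets", "output", fmt="parquet", fast=True) would then invoke the converter once per file. Keeping command assembly separate from execution makes it easy to print and inspect a command before running it.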

Improvements:

  • Performance Optimization:

    • The codebase has been optimized to improve the speed of the conversion process, especially when dealing with large datasets. The Parquet and ORC format conversions benefit from these enhancements.
  • Error Handling:

    • Improved error messages and handling for file reading, writing, and format conversion issues, for a smoother user experience.

Fixed:

  • Bug Fixes:
    • Fixed issues with certain edge cases during conversion between ARFF and CSV formats, ensuring compatibility with various ARFF files.

Documentation Updates:

  • Updated the README file with examples for the new Parquet format and fast mode feature.
  • Improved CLI usage instructions for easier understanding.

Installation:

To upgrade to the latest version, run:

pip install --upgrade arff-format-converter