Add 16A8W support and test for add operation #13568


Open · Ninja91 wants to merge 3 commits into main

Conversation


@Ninja91 Ninja91 commented Aug 21, 2025

Summary:
Add 16A8W quantization support and a test for the add operation in the ExecutorTorch ARM backend.

This follows the pattern established for linear operations, extending int16 support to add operations.

Changes:

- Add INT16 dtype validation support in op_add.py
- Add test_add_tensor_16a8w_tosa_INT test function
- Enable test_add.py in test targets configuration

The 16A8W configuration uses 16-bit activations with 8-bit weights, enabling higher precision for activations while maintaining weight efficiency.
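
The precision claim is easy to see numerically. Below is a minimal, self-contained sketch in plain PyTorch (not the backend code or the new test) that compares an elementwise add under 16-bit versus 8-bit symmetric activation quantization; the `quantize_symmetric` helper is purely illustrative.

```python
import torch

def quantize_symmetric(x: torch.Tensor, n_bits: int):
    """Symmetric per-tensor quantization to a signed n_bits integer grid."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = x.abs().max().item() / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax).to(torch.int32)
    return q, scale

torch.manual_seed(0)
x, y = torch.randn(4, 8), torch.randn(4, 8)
ref = x + y

# 16-bit activations (the "16A" part): a much finer grid than int8.
qx16, sx16 = quantize_symmetric(x, 16)
qy16, sy16 = quantize_symmetric(y, 16)
add_16 = qx16 * sx16 + qy16 * sy16  # dequantize, then add in float

# 8-bit activations for comparison.
qx8, sx8 = quantize_symmetric(x, 8)
qy8, sy8 = quantize_symmetric(y, 8)
add_8 = qx8 * sx8 + qy8 * sy8

print("int16 max abs error:", (ref - add_16).abs().max().item())
print("int8  max abs error:", (ref - add_8).abs().max().item())
```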

Differential Revision: D80510463

Summary:

This diff implements a 16A8W (16-bit activations, 8-bit weights) quantization configuration utility for the ExecutorTorch ARM backend, following the feedback from D79746479.

## Key Changes

**1. New Quantization Configuration Function**
- Add `get_symmetric_a16w8_quantization_config()` in `fbcode/executorch/backends/arm/quantizer/arm_quantizer.py`
- Provides 16-bit activations with HistogramObserver (better precision than 8A8W)
- Maintains 8-bit weights with MinMaxObserver/PerChannelMinMaxObserver (memory efficient)
- **Technically supported by TOSA through [EXT-INT16 extension/profile](https://www.mlplatform.org/tosa/tosa_spec.html#_conv2d)**

## Benefits
- **Better Precision**: 16-bit activations provide higher precision than 8-bit, which is useful for carrying precision through recurrent neural networks.
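
For orientation, here is a hedged sketch of what the 16A8W pairing boils down to, written with stock `torch.ao` observers only. The real `get_symmetric_a16w8_quantization_config()` wires observers like these into the quantizer's quantization specs; the exact ranges, arguments, and observer choices in `arm_quantizer.py` may differ, and passing `torch.int16` to an observer assumes a recent PyTorch build that accepts it.

```python
import torch
from torch.ao.quantization.observer import HistogramObserver, PerChannelMinMaxObserver

# Activations ("16A"): 16-bit symmetric, calibrated with a histogram observer.
act_observer = HistogramObserver(
    dtype=torch.int16,                   # assumption: the installed PyTorch allows int16 here
    qscheme=torch.per_tensor_symmetric,
    quant_min=-32768,
    quant_max=32767,
)

# Weights ("8W"): 8-bit symmetric per-channel, min/max calibrated.
weight_observer = PerChannelMinMaxObserver(
    dtype=torch.int8,
    qscheme=torch.per_channel_symmetric,
    quant_min=-127,
    quant_max=127,
)

# Feed calibration data and inspect the resulting scales and zero points.
act_observer(torch.randn(32, 16))
weight_observer(torch.randn(8, 16))
print(act_observer.calculate_qparams())
print(weight_observer.calculate_qparams())
```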

Reviewed By: 3l1

Differential Revision: D79763381

Summary:

- Adds a linear ops test using the 16A8W config under the INT16 profile.
- Adds INT16 dtype support to view op validation (sketched below).
- Validated with the TOSA pipeline test.
- Verified that tests previously marked flaky are no longer flaky and removed the markers.

Note: not yet verified with a TOSA reference model run.
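
A minimal sketch of the view-op dtype check follows; the names `SUPPORTED_INT_PROFILE_DTYPES` and `check_int_profile_dtype` are hypothetical stand-ins for the backend's own validation helpers, not actual ExecutorTorch APIs.

```python
import torch

# Hypothetical allow-list; the real check lives in the ARM backend's node visitors.
SUPPORTED_INT_PROFILE_DTYPES = {torch.int8, torch.int16, torch.int32}  # int16 is the newly allowed entry

def check_int_profile_dtype(op_name: str, dtype: torch.dtype) -> None:
    """Reject dtypes that the TOSA INT-profile lowering cannot handle."""
    if dtype not in SUPPORTED_INT_PROFILE_DTYPES:
        raise ValueError(f"{op_name}: unsupported dtype {dtype} for the TOSA INT profile")

check_int_profile_dtype("aten.view_copy.default", torch.int16)  # passes with this change
```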

Differential Revision: D80308822

@Ninja91 Ninja91 requested a review from digantdesai as a code owner August 21, 2025 04:48

pytorch-bot bot commented Aug 21, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13568

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 994b904 with merge base 624b38e:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Aug 21, 2025
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D80510463

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example:
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

facebook-github-bot pushed a commit that referenced this pull request Aug 21, 2025
Labels: CLA Signed, fb-exported
2 participants