Skip to content

Conversation

eeisegn
Copy link
Contributor

@eeisegn eeisegn commented Aug 29, 2025

Summary by CodeRabbit

  • New Features

    • CLI flags to toggle gRPC usage (--grpc) and to ignore TLS certificate verification (--ignore-cert-errors).
    • New command support for submitting component lists for vulnerability lookups.
  • Improvements

    • REST used by default with optional gRPC routing; enhanced headers, proxy, CA/HTTPS handling, and per-request tracing.
    • Parallelized processing for dependency scans and clearer error/reporting messages.
  • Chores

    • Version bumped to 1.32.0 and changelog updated.

@eeisegn eeisegn requested a review from agustingroh August 29, 2025 16:47
Copy link

coderabbitai bot commented Aug 29, 2025

Walkthrough

Adds CLI flags to toggle gRPC usage and to ignore TLS certificate verification; propagates these flags into Scanner and Components; Components can load a "components" payload; ScanossGrpc now supports both REST and gRPC routes for dependencies and vulnerabilities and dispatches at runtime.

Changes

Cohort / File(s) Summary of changes
CLI flags & wiring
src/scanoss/cli.py
Adds --grpc to vulns subcommand and global --ignore-cert-errors; passes use_grpc and ignore_cert_errors into Scanner and Components.
Components API & payloads
src/scanoss/components.py
Adds ignore_cert_errors and use_grpc ctor params; adds load_comps(); generalizes load_purls(..., field=...); switches vulnerabilities payload key to "components" and updates related messages.
Scanner wiring
src/scanoss/scanner.py
Adds use_grpc parameter to Scanner.__init__ and forwards ignore_cert_errors and use_grpc to ScanossGrpc.
gRPC/REST client and routing
src/scanoss/scanossgrpc.py
Adds use_grpc and ignore_cert_errors to ctor; introduces DEFAULT_URI_PREFIX = '/v2'; implements REST session, headers, proxy/CA and cert-ignore handling; adds REST helpers (rest_post, _get_vulnerabilities_rest, _process_dep_file_rest) and gRPC counterparts; adds get_vulnerabilities_json dispatch; switches to ComponentsRequest / ComponentsVulnerabilityResponse; extends status enum.
API init & headers
src/scanoss/scanossapi.py
Adds env-driven defaults (SCANOSS_SCAN_URL, SCANOSS_API_KEY); makes header loading URL-aware (load_generic_headers(url)); auto-switches to premium endpoint when x-api-key given without explicit URL; consolidates User-Agent handling.
Version & changelog
src/scanoss/__init__.py, CHANGELOG.md
Bumps __version__ to 1.32.0; adds Unreleased/1.32.0 changelog entry and updates anchors.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User
  participant CLI as CLI
  participant SC as Scanner
  participant CMP as Components
  participant GR as ScanossGrpc
  participant SVC as Remote Service

  U->>CLI: run scan / vulns (--grpc?, --ignore-cert-errors?)
  CLI->>SC: init(use_grpc, ignore_cert_errors, ...)
  CLI->>CMP: init(use_grpc, ignore_cert_errors, ...)

  note right of GR #ADD8E6: GR routes REST or gRPC based on use_grpc
  SC->>GR: get_dependencies_json(file)
  alt use_grpc = true
    GR->>SVC: gRPC call (dependencies)
    SVC-->>GR: gRPC response
  else
    GR->>SVC: REST POST /v2/dependencies/dependencies
    SVC-->>GR: REST JSON
  end

  CMP->>CMP: load_comps() -> {"components":[...]}
  CMP->>GR: get_vulnerabilities_json({"components":[...]})
  alt use_grpc = true
    GR->>SVC: gRPC GetComponentsVulnerabilities
    SVC-->>GR: ComponentsVulnerabilityResponse
  else
    GR->>SVC: REST POST /v2/vulnerabilities/components
    SVC-->>GR: REST JSON
  end

  GR-->>CLI: aggregated results
  CLI-->>U: output
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

enhancement

Poem

I nibble flags with tiny paws,
Toggle gRPC without a pause.
If certs are stubborn, I hop around,
Components loaded, answers found.
A happy rabbit ships the bound! 🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/sean/SP-3132_switch-free-to-rest

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 49
  • Detected files undeclared: 0
  • Detected files declared: 49
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 49
  • Detected files undeclared: 0
  • Detected files declared: 49
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 48
  • Detected files undeclared: 0
  • Detected files declared: 48
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/scanoss/cli.py (1)

1009-1014: Fix typo: environment variable name should be grpc_proxy

The help text uses "grcp_proxy" twice. Should be "grpc_proxy".

Apply:

-            'Can also use the environment variable "HTTPS_PROXY=<ip>:<port>" '
-            'and "grcp_proxy=<ip>:<port>" for gRPC',
+            'Can also use the environment variable "HTTPS_PROXY=<ip>:<port>" '
+            'and "grpc_proxy=<ip>:<port>" for gRPC',
@@
-            'Can also use the environment variable "grcp_proxy=<ip>:<port>"',
+            'Can also use the environment variable "grpc_proxy=<ip>:<port>"',

Also applies to: 1047-1051

src/scanoss/scanner.py (1)

165-179: Pass CLI/API timeout to ScanossGrpc

Currently ScanossGrpc defaults to 600s even if the user supplies a different --timeout. Align timeouts across HTTP and gRPC/REST clients.

         grpc_api = ScanossGrpc(
             url=grpc_url,
             debug=debug,
             quiet=quiet,
             trace=trace,
             api_key=api_key,
             ver_details=ver_details,
             ca_cert=ca_cert,
             proxy=proxy,
             pac=pac,
             grpc_proxy=grpc_proxy,
             req_headers=self.req_headers,
+            timeout=timeout,
             ignore_cert_errors=ignore_cert_errors,
             use_grpc=use_grpc
         )
src/scanoss/scanossgrpc.py (1)

813-841: Propagate new flags via GrpcConfig

Without these fields, CLI flags (--ignore-cert-errors/--grpc) are dropped for callers using create_grpc_config_from_args (e.g., crypto, HFH).

 @dataclass
 class GrpcConfig:
     url: str = DEFAULT_URL
     api_key: Optional[str] = SCANOSS_API_KEY
     debug: Optional[bool] = False
     trace: Optional[bool] = False
     quiet: Optional[bool] = False
     ver_details: Optional[str] = None
     ca_cert: Optional[str] = None
     timeout: Optional[int] = DEFAULT_TIMEOUT
     proxy: Optional[str] = None
     grpc_proxy: Optional[str] = None
     pac: Optional[PACFile] = None
     req_headers: Optional[dict] = None
+    ignore_cert_errors: Optional[bool] = False
+    use_grpc: Optional[bool] = False
@@
 def create_grpc_config_from_args(args) -> GrpcConfig:
     return GrpcConfig(
         url=getattr(args, 'api2url', DEFAULT_URL),
         api_key=getattr(args, 'key', SCANOSS_API_KEY),
         debug=getattr(args, 'debug', False),
         trace=getattr(args, 'trace', False),
         quiet=getattr(args, 'quiet', False),
         ver_details=getattr(args, 'ver_details', None),
         ca_cert=getattr(args, 'ca_cert', None),
         timeout=getattr(args, 'timeout', DEFAULT_TIMEOUT),
         proxy=getattr(args, 'proxy', None),
         grpc_proxy=getattr(args, 'grpc_proxy', None),
+        pac=getattr(args, 'pac', None),
+        req_headers=getattr(args, 'header', None),
+        ignore_cert_errors=getattr(args, 'ignore_cert_errors', False),
+        use_grpc=getattr(args, 'grpc', False),
     )
🧹 Nitpick comments (7)
src/scanoss/components.py (4)

98-101: Docstring typo

"pf" → "of".

-        :param purls: list pf PURLs (optional)
+        :param purls: list of PURLs (optional)

133-139: Align error message with dynamic field name

Message still hardcodes "purls".

-        else:
-            self.print_stderr('ERROR: No purls specified to process.')
+        else:
+            self.print_stderr(f'ERROR: No {field} specified to process.')
             return None

160-166: Private helper name typo (_open_file_or_sdtout)

Consider renaming to _open_file_or_stdout and keep a backward-compat alias for readability.

-def _open_file_or_sdtout(self, filename):
+def _open_file_or_stdout(self, filename):
@@
-        file = self._open_file_or_sdtout(output_file)
+        file = self._open_file_or_stdout(output_file)

Add temporary alias elsewhere in the class:

_open_file_or_sdtout = _open_file_or_stdout

224-232: Update log message to reflect 'components' payload

Minor UX consistency.

-        self.print_msg('Sending PURLs to Vulnerability API for decoration...')
+        self.print_msg('Sending components to Vulnerability API for decoration...')
src/scanoss/scanossgrpc.py (3)

767-790: Avoid KeyError when stripping 'status' from REST response

Use pop to guard.

-            del response['status']
+            response.pop('status', None)
             return response

792-799: Typo in docstring

"Porcess" → "Process".

-        """
-        Porcess a single dependency file using REST
+        """
+        Process a single dependency file using REST

173-189: Global logging side-effects when trace is enabled

logging.basicConfig and HTTPConnection.debuglevel modify global state. If acceptable for CLI, fine; otherwise scope to a dedicated logger.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 693425b and a3e8fe3.

📒 Files selected for processing (4)
  • src/scanoss/cli.py (5 hunks)
  • src/scanoss/components.py (6 hunks)
  • src/scanoss/scanner.py (2 hunks)
  • src/scanoss/scanossgrpc.py (10 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (6)
src/scanoss/cli.py (3)

311-312: LGTM: Adds --grpc for component vulns

Flag is clearly scoped to the vulns subcommand.


963-969: LGTM: Adds --grpc for scan/container-scan

Cleanly plumbed into Scanner via use_grpc.


2148-2152: LGTM: Components now receives ignore_cert_errors and use_grpc

Correctly wires new flags into runtime.

src/scanoss/components.py (1)

55-57: LGTM: New flags are plumbed into ScanossGrpc

Consistent with CLI/Scanner changes.

Also applies to: 90-92

src/scanoss/scanner.py (1)

110-111: LGTM: use_grpc flag added to Scanner API

Matches CLI and ScanossGrpc wiring.

src/scanoss/scanossgrpc.py (1)

299-319: Thread-safety of shared requests.Session in concurrent executor

Session objects (and PACSession) aren’t guaranteed thread-safe. With 5 workers this can intermittently fail under load.

Consider:

  • Keep the Adapter mount (previous suggestion) and guard rest_post with per-call header copies (done).
  • Or create a per-thread Session via threading.local().

If you want, I can follow up with a patch that introduces a thread-local session factory.

Comment on lines +1059 to 1060
p.add_argument('--ignore-cert-errors', action='store_true', help='Ignore certificate errors')

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Surface security risk in help and ensure the flag is consistently wired

The flag is useful, but skipping TLS verification is dangerous. Also, other component commands don't pass this through to Components.

Apply to propagate ignore_cert_errors to the remaining Components constructors:

@@ def comp_semgrep(parser, args):
-    comps = Components(
+    comps = Components(
         debug=args.debug,
@@
-        req_headers=process_req_headers(args.header),
+        req_headers=process_req_headers(args.header),
+        ignore_cert_errors=args.ignore_cert_errors,
     )
@@ def comp_search(parser, args):
-    comps = Components(
+    comps = Components(
         debug=args.debug,
@@
-        req_headers=process_req_headers(args.header),
+        req_headers=process_req_headers(args.header),
+        ignore_cert_errors=args.ignore_cert_errors,
     )
@@ def comp_versions(parser, args):
-    comps = Components(
+    comps = Components(
         debug=args.debug,
@@
-        req_headers=process_req_headers(args.header),
+        req_headers=process_req_headers(args.header),
+        ignore_cert_errors=args.ignore_cert_errors,
     )
@@ def comp_provenance(parser, args):
-    comps = Components(
+    comps = Components(
         debug=args.debug,
@@
-        req_headers=process_req_headers(args.header),
+        req_headers=process_req_headers(args.header),
+        ignore_cert_errors=args.ignore_cert_errors,
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
p.add_argument('--ignore-cert-errors', action='store_true', help='Ignore certificate errors')
# In src/scanoss/cli.py, update each Components(...) call in the comp_* functions:
def comp_semgrep(parser, args):
comps = Components(
debug=args.debug,
api_url=args.api_url,
token=args.token,
req_headers=process_req_headers(args.header),
ignore_cert_errors=args.ignore_cert_errors,
)
# … rest of comp_semgrep …
def comp_search(parser, args):
comps = Components(
debug=args.debug,
api_url=args.api_url,
token=args.token,
req_headers=process_req_headers(args.header),
ignore_cert_errors=args.ignore_cert_errors,
)
# … rest of comp_search …
def comp_versions(parser, args):
comps = Components(
debug=args.debug,
api_url=args.api_url,
token=args.token,
req_headers=process_req_headers(args.header),
ignore_cert_errors=args.ignore_cert_errors,
)
# … rest of comp_versions …
def comp_provenance(parser, args):
comps = Components(
debug=args.debug,
api_url=args.api_url,
token=args.token,
req_headers=process_req_headers(args.header),
ignore_cert_errors=args.ignore_cert_errors,
)
# … rest of comp_provenance …

Comment on lines +94 to 106
def load_comps(self, json_file: Optional[str] = None, purls: Optional[List[str]] = None)-> Optional[dict]:
"""
Load the specified components and return a dictionary

:param json_file: JSON Components file (optional)
:param purls: list pf PURLs (optional)
:return: Components Request dictionary or None
"""
return self.load_purls(json_file, purls, 'components')

def load_purls(self, json_file: Optional[str] = None, purls: Optional[List[str]] = None, field:str = 'purls'
) -> Optional[dict]:
"""
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure JSON-file inputs are normalized to the expected field ('components')

When json_file is used, load_purls returns the file contents verbatim. If the file contains "purls" (legacy), get_vulnerabilities now expects "components" and will fail server-side. Normalize when field != 'purls'.

Apply:

 def load_purls(self, json_file: Optional[str] = None, purls: Optional[List[str]] = None, field:str = 'purls'
                    ) -> Optional[dict]:
@@
         if json_file:
@@
-            with open(json_file, 'r') as f:
+            with open(json_file, 'r') as f:
                 try:
-                    purl_request = json.loads(f.read())
+                    purl_request = json.loads(f.read())
+                    # Normalize legacy key to the requested field
+                    if field != 'purls' and 'purls' in purl_request and field not in purl_request:
+                        purl_request = {field: purl_request['purls']}
                 except Exception as e:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def load_comps(self, json_file: Optional[str] = None, purls: Optional[List[str]] = None)-> Optional[dict]:
"""
Load the specified components and return a dictionary
:param json_file: JSON Components file (optional)
:param purls: list pf PURLs (optional)
:return: Components Request dictionary or None
"""
return self.load_purls(json_file, purls, 'components')
def load_purls(self, json_file: Optional[str] = None, purls: Optional[List[str]] = None, field:str = 'purls'
) -> Optional[dict]:
"""
def load_purls(self, json_file: Optional[str] = None, purls: Optional[List[str]] = None, field: str = 'purls'
) -> Optional[dict]:
if json_file:
with open(json_file, 'r') as f:
try:
purl_request = json.loads(f.read())
# Normalize legacy key to the requested field
if field != 'purls' and 'purls' in purl_request and field not in purl_request:
purl_request = {field: purl_request['purls']}
except Exception as e:
# existing error handling
self._log_error(f"Error loading JSON '{json_file}': {e}")
return None
# ... rest of the method unchanged ...

Comment on lines +171 to 189
# REST setup
if self.trace:
logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
http_client.HTTPConnection.debuglevel = 1
if pac and not proxy: # Set up a PAC session if requested (and no proxy has been explicitly set)
self.print_debug('Setting up PAC session...')
self.session = PACSession(pac=pac)
else:
self.session = requests.sessions.Session()
if self.ignore_cert_errors:
self.print_debug('Ignoring cert errors...')
urllib3.disable_warnings(InsecureRequestWarning)
self.session.verify = False
elif ca_cert:
self.session.verify = ca_cert
self.proxies = {'https': proxy, 'http': proxy} if proxy else None
if self.proxies:
self.session.proxies = self.proxies

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

REST base URL and session concurrency

  • Ensure REST base URL always has a scheme.
  • Increase connection pool for parallel requests and keep sessions safe under concurrency.
@@ def __init__(...):
-        self.orig_url = self.url.strip().rstrip('/')  # Used for proxy lookup
+        self.orig_url = self.url.strip().rstrip('/')  # Used for proxy lookup
+        # Build a REST base URL that always includes a scheme
+        self.rest_base_url = self.orig_url if self.orig_url.startswith(('http://', 'https://')) else f'https://{self.orig_url}'
@@
-            self.session = requests.sessions.Session()
+            self.session = requests.sessions.Session()
+        # Increase pool size to match our concurrency and reduce connection churn
+        try:
+            from requests.adapters import HTTPAdapter
+            self.session.mount('https://', HTTPAdapter(pool_connections=MAX_CONCURRENT_REQUESTS,
+                                                      pool_maxsize=MAX_CONCURRENT_REQUESTS))
+            self.session.mount('http://', HTTPAdapter(pool_connections=MAX_CONCURRENT_REQUESTS,
+                                                     pool_maxsize=MAX_CONCURRENT_REQUESTS))
+        except Exception:
+            pass
@@ def _get_vulnerabilities_rest(...):
-        response = self.rest_post(f'{self.orig_url}{DEFAULT_URI_PREFIX}/vulnerabilities/components', request_id, purls)
+        response = self.rest_post(f'{self.rest_base_url}{DEFAULT_URI_PREFIX}/vulnerabilities/components', request_id, purls)
@@ def _process_dep_file_rest(...):
-        response = self.rest_post(f'{self.orig_url}{DEFAULT_URI_PREFIX}/dependencies/dependencies',
+        response = self.rest_post(f'{self.rest_base_url}{DEFAULT_URI_PREFIX}/dependencies/dependencies',
                                   request_id, file_request
                                   )

Also applies to: 725-766, 767-790, 801-804

Comment on lines 725 to 766
def rest_post(self, uri: str, request_id: str, data: dict) -> dict:
"""
Post the specified data to the given URI.
:param request_id: request id
:param uri: URI to post to
:param data: json data to post
:return: JSON response or None
"""
if not uri:
self.print_stderr('Error: Missing URI. Cannot search for project.')
return None
self.print_trace(f'Sending REST POST data to {uri}...')
headers = self.headers
headers['x-request-id'] = request_id # send a unique request id for each post
retry = 0 # Add some retry logic to cater for timeouts, etc.
while retry <= self.retry_limit:
retry += 1
try:
response = self.session.post(uri, headers=headers, json=data, timeout=self.timeout)
response.raise_for_status() # Raises an HTTPError for bad responses
return response.json()
except requests.exceptions.RequestException as e:
self.print_stderr(f'Error: Problem posting data to {uri}: {e}')
except (requests.exceptions.SSLError, requests.exceptions.ProxyError) as e:
self.print_stderr(f'ERROR: Exception ({e.__class__.__name__}) POSTing data - {e}.')
raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
if retry > self.retry_limit: # Timed out retry_limit or more times, fail
self.print_stderr(f'ERROR: {e.__class__.__name__} POSTing decoration data ({request_id}): {e}')
raise Exception(
f'ERROR: The SCANOSS Decoration API request timed out ({e.__class__.__name__}) for {uri}'
) from e
else:
self.print_stderr(f'Warning: {e.__class__.__name__} communicating with {self.url}. Retrying...')
time.sleep(5)
except Exception as e:
self.print_stderr(
f'ERROR: Exception ({e.__class__.__name__}) POSTing data ({request_id}) to {uri}: {e}'
)
raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
return None

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Make REST posting robust: copy headers, correct retry logic, and order exceptions

  • Avoid mutating shared self.headers across threads.
  • Retry loop is off-by-one and exception order makes specific branches unreachable because RequestException is a superclass.
-    def rest_post(self, uri: str, request_id: str, data: dict) -> dict:
+    def rest_post(self, uri: str, request_id: str, data: dict) -> dict:
@@
-        headers = self.headers
-        headers['x-request-id'] = request_id  # send a unique request id for each post
-        retry = 0  # Add some retry logic to cater for timeouts, etc.
-        while retry <= self.retry_limit:
-            retry += 1
+        headers = {**self.headers, 'x-request-id': request_id}
+        for attempt in range(self.retry_limit + 1):
             try:
                 response = self.session.post(uri, headers=headers, json=data, timeout=self.timeout)
                 response.raise_for_status()  # Raises an HTTPError for bad responses
                 return response.json()
-            except requests.exceptions.RequestException as e:
-                self.print_stderr(f'Error: Problem posting data to {uri}: {e}')
-            except (requests.exceptions.SSLError, requests.exceptions.ProxyError) as e:
+            except (requests.exceptions.SSLError, requests.exceptions.ProxyError) as e:
                 self.print_stderr(f'ERROR: Exception ({e.__class__.__name__}) POSTing data - {e}.')
                 raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
-            except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
-                if retry > self.retry_limit:  # Timed out retry_limit or more times, fail
+            except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
+                if attempt >= self.retry_limit:
                     self.print_stderr(f'ERROR: {e.__class__.__name__} POSTing decoration data ({request_id}): {e}')
                     raise Exception(
                         f'ERROR: The SCANOSS Decoration API request timed out ({e.__class__.__name__}) for {uri}'
                     ) from e
                 else:
                     self.print_stderr(f'Warning: {e.__class__.__name__} communicating with {self.url}. Retrying...')
                     time.sleep(5)
+            except requests.exceptions.RequestException as e:
+                self.print_stderr(f'Error: Problem posting data to {uri}: {e}')
+                break
             except Exception as e:
                 self.print_stderr(
                     f'ERROR: Exception ({e.__class__.__name__}) POSTing data ({request_id}) to {uri}: {e}'
                 )
                 raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
         return None
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def rest_post(self, uri: str, request_id: str, data: dict) -> dict:
"""
Post the specified data to the given URI.
:param request_id: request id
:param uri: URI to post to
:param data: json data to post
:return: JSON response or None
"""
if not uri:
self.print_stderr('Error: Missing URI. Cannot search for project.')
return None
self.print_trace(f'Sending REST POST data to {uri}...')
headers = self.headers
headers['x-request-id'] = request_id # send a unique request id for each post
retry = 0 # Add some retry logic to cater for timeouts, etc.
while retry <= self.retry_limit:
retry += 1
try:
response = self.session.post(uri, headers=headers, json=data, timeout=self.timeout)
response.raise_for_status() # Raises an HTTPError for bad responses
return response.json()
except requests.exceptions.RequestException as e:
self.print_stderr(f'Error: Problem posting data to {uri}: {e}')
except (requests.exceptions.SSLError, requests.exceptions.ProxyError) as e:
self.print_stderr(f'ERROR: Exception ({e.__class__.__name__}) POSTing data - {e}.')
raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
if retry > self.retry_limit: # Timed out retry_limit or more times, fail
self.print_stderr(f'ERROR: {e.__class__.__name__} POSTing decoration data ({request_id}): {e}')
raise Exception(
f'ERROR: The SCANOSS Decoration API request timed out ({e.__class__.__name__}) for {uri}'
) from e
else:
self.print_stderr(f'Warning: {e.__class__.__name__} communicating with {self.url}. Retrying...')
time.sleep(5)
except Exception as e:
self.print_stderr(
f'ERROR: Exception ({e.__class__.__name__}) POSTing data ({request_id}) to {uri}: {e}'
)
raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
return None
def rest_post(self, uri: str, request_id: str, data: dict) -> dict:
"""
Post the specified data to the given URI.
:param request_id: request id
:param uri: URI to post to
:param data: json data to post
:return: JSON response or None
"""
if not uri:
self.print_stderr('Error: Missing URI. Cannot search for project.')
return None
self.print_trace(f'Sending REST POST data to {uri}...')
headers = {**self.headers, 'x-request-id': request_id}
for attempt in range(self.retry_limit + 1):
try:
response = self.session.post(uri, headers=headers, json=data, timeout=self.timeout)
response.raise_for_status() # Raises an HTTPError for bad responses
return response.json()
except (requests.exceptions.SSLError, requests.exceptions.ProxyError) as e:
self.print_stderr(f'ERROR: Exception ({e.__class__.__name__}) POSTing data - {e}.')
raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
if attempt >= self.retry_limit:
self.print_stderr(f'ERROR: {e.__class__.__name__} POSTing decoration data ({request_id}): {e}')
raise Exception(
f'ERROR: The SCANOSS Decoration API request timed out ({e.__class__.__name__}) for {uri}'
) from e
else:
self.print_stderr(f'Warning: {e.__class__.__name__} communicating with {self.url}. Retrying...')
time.sleep(5)
except requests.exceptions.RequestException as e:
self.print_stderr(f'Error: Problem posting data to {uri}: {e}')
break
except Exception as e:
self.print_stderr(
f'ERROR: Exception ({e.__class__.__name__}) POSTing data ({request_id}) to {uri}: {e}'
)
raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
return None
🤖 Prompt for AI Agents
In src/scanoss/scanossgrpc.py around lines 725 to 766, the method mutates shared
self.headers, uses an off-by-one retry loop, and catches
requests.exceptions.RequestException too early making more specific exceptions
unreachable; fix by copying headers into a local dict each attempt (e.g. headers
= dict(self.headers)) before adding x-request-id, change the loop to iterate
attempts correctly (for attempt in range(1, self.retry_limit + 1) or while
attempt < self.retry_limit) so you retry exactly retry_limit times and check
exhaustion after catching retryable errors, reorder exception handlers so
SSLError/ProxyError/Timeout/ConnectionError are caught before RequestException,
and ensure after exhausting retries you raise a clear exception or return None
as intended.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
CHANGELOG.md (1)

662-664: Fix duplicate link reference label

Two [1.31.5] definitions break markdown link refs (MD053). Define a new reference for 1.32.0.

 [1.31.4]: https://github.com/scanoss/scanoss.py/compare/v1.31.3...v1.31.4
-[1.31.5]: https://github.com/scanoss/scanoss.py/compare/v1.31.4...v1.31.5
-[1.31.5]: https://github.com/scanoss/scanoss.py/compare/v1.31.5...v1.32.0
+[1.31.5]: https://github.com/scanoss/scanoss.py/compare/v1.31.4...v1.31.5
+[1.32.0]: https://github.com/scanoss/scanoss.py/compare/v1.31.5...v1.32.0
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a3e8fe3 and 3221d1b.

📒 Files selected for processing (2)
  • CHANGELOG.md (2 hunks)
  • src/scanoss/__init__.py (1 hunks)
🧰 Additional context used
🪛 LanguageTool
CHANGELOG.md

[grammar] ~12-~12: There might be a mistake here.
Context: ...ing changes... ## [1.32.0] - 2025-08-29 ### Added - Switched vulnerability and depen...

(QB_NEW_EN)


[grammar] ~13-~13: There might be a mistake here.
Context: ...s... ## [1.32.0] - 2025-08-29 ### Added - Switched vulnerability and dependency AP...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
CHANGELOG.md

663-663: Link and image reference definitions should be needed
Duplicate link or image reference definition: "1.31.5"

(MD053, link-image-reference-definitions)

🔇 Additional comments (1)
src/scanoss/__init__.py (1)

25-25: Version bump looks good

Matches the PR scope and date. No further changes in this file.

CHANGELOG.md Outdated
Comment on lines 12 to 15
## [1.32.0] - 2025-08-29
### Added
- Switched vulnerability and dependency APIs to use REST by default

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Classify as Changed and call out the behavioral switch

Switching the default transport is a behavior change (potentially breaking), not an addition. Also, note how to force gRPC.

-## [1.32.0] - 2025-08-29
-### Added
-- Switched vulnerability and dependency APIs to use REST by default
+## [1.32.0] - 2025-08-29
+### Changed
+- Default for vulnerability and dependency APIs is now REST (was gRPC). Use `--grpc` to force gRPC. (Potentially breaking)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
## [1.32.0] - 2025-08-29
### Added
- Switched vulnerability and dependency APIs to use REST by default
## [1.32.0] - 2025-08-29
### Changed
- Default for vulnerability and dependency APIs is now REST (was gRPC). Use `--grpc` to force gRPC. (Potentially breaking)
🧰 Tools
🪛 LanguageTool

[grammar] ~12-~12: There might be a mistake here.
Context: ...ing changes... ## [1.32.0] - 2025-08-29 ### Added - Switched vulnerability and depen...

(QB_NEW_EN)


[grammar] ~13-~13: There might be a mistake here.
Context: ...s... ## [1.32.0] - 2025-08-29 ### Added - Switched vulnerability and dependency AP...

(QB_NEW_EN)

🤖 Prompt for AI Agents
In CHANGELOG.md around lines 12 to 15, the release note currently lists
"Switched vulnerability and dependency APIs to use REST by default" under Added;
change it to the "Changed" section to reflect a behavior change and update the
entry to explicitly state that the default transport switched from gRPC to REST
and how to opt back into gRPC (e.g., document the flag/config option name and
example command or config key to force gRPC). Ensure the phrasing indicates
potential breaking impact and include the exact config/flag to force gRPC so
readers can revert the behavior if needed.

Copy link

github-actions bot commented Sep 1, 2025

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 48
  • Detected files undeclared: 0
  • Detected files declared: 48
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/scanoss/scanossgrpc.py (1)

107-123: Add ignore_cert_errors and use_grpc to GrpcConfig dataclass and its factory

  • In src/scanoss/scanossgrpc.py’s GrpcConfig @DataClass, append
    ignore_cert_errors: Optional[bool] = False
    use_grpc: Optional[bool] = False
  • In create_grpc_config_from_args, pass
    ignore_cert_errors=getattr(args, 'ignore_cert_errors', False)
    use_grpc=getattr(args, 'use_grpc', False)
♻️ Duplicate comments (4)
CHANGELOG.md (1)

12-15: Move entry to “Changed” and document the transport switch with opt-out flag

This is a behavior change. Call out the previous default and how to force gRPC.

Apply:

-## [1.32.0] - 2025-09-01
-### Added
-- Switched vulnerability and dependency APIs to use REST by default
+## [1.32.0] - 2025-09-01
+### Changed
+- Default transport for vulnerability and dependency APIs is now REST (was gRPC). Use `--grpc` to force gRPC. (Potentially breaking)
src/scanoss/scanossgrpc.py (3)

171-189: Build a scheme-safe REST base URL and increase session pool size

Prevents scheme-less URIs when users pass bare hosts and reduces connection churn under concurrency.

         self.orig_url = self.url.strip().rstrip('/')  # Used for proxy lookup
+        # Build a REST base URL that always includes a scheme
+        self.rest_base_url = (
+            self.orig_url if self.orig_url.startswith(('http://', 'https://'))
+            else f'https://{self.orig_url}'
+        )
@@
         else:
             self.session = requests.sessions.Session()
+        # Increase pool size to match our concurrency and reduce connection churn
+        try:
+            from requests.adapters import HTTPAdapter
+            adapter = HTTPAdapter(pool_connections=MAX_CONCURRENT_REQUESTS,
+                                  pool_maxsize=MAX_CONCURRENT_REQUESTS)
+            self.session.mount('https://', adapter)
+            self.session.mount('http://', adapter)
+        except Exception:
+            pass

725-766: Fix shared-header mutation, retry off-by-one, and exception order in REST posting

Avoids data races and unreachable branches; retries exactly retry_limit+1 attempts.

     def rest_post(self, uri: str, request_id: str, data: dict) -> dict:
@@
-        headers = self.headers
-        headers['x-request-id'] = request_id  # send a unique request id for each post
-        retry = 0  # Add some retry logic to cater for timeouts, etc.
-        while retry <= self.retry_limit:
-            retry += 1
+        headers = {**self.headers, 'x-request-id': request_id}
+        for attempt in range(self.retry_limit + 1):
             try:
                 response = self.session.post(uri, headers=headers, json=data, timeout=self.timeout)
                 response.raise_for_status()  # Raises an HTTPError for bad responses
                 return response.json()
-            except requests.exceptions.RequestException as e:
-                self.print_stderr(f'Error: Problem posting data to {uri}: {e}')
             except (requests.exceptions.SSLError, requests.exceptions.ProxyError) as e:
                 self.print_stderr(f'ERROR: Exception ({e.__class__.__name__}) POSTing data - {e}.')
                 raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
             except (requests.exceptions.Timeout, requests.exceptions.ConnectionError) as e:
-                if retry > self.retry_limit:  # Timed out retry_limit or more times, fail
+                if attempt >= self.retry_limit:
                     self.print_stderr(f'ERROR: {e.__class__.__name__} POSTing decoration data ({request_id}): {e}')
                     raise Exception(
                         f'ERROR: The SCANOSS Decoration API request timed out ({e.__class__.__name__}) for {uri}'
                     ) from e
                 else:
                     self.print_stderr(f'Warning: {e.__class__.__name__} communicating with {self.url}. Retrying...')
                     time.sleep(5)
+            except requests.exceptions.RequestException as e:
+                self.print_stderr(f'Error: Problem posting data to {uri}: {e}')
+                break
             except Exception as e:
                 self.print_stderr(
                     f'ERROR: Exception ({e.__class__.__name__}) POSTing data ({request_id}) to {uri}: {e}'
                 )
                 raise Exception(f'ERROR: The SCANOSS Decoration API request failed for {uri}') from e
         return None

767-790: Use scheme-safe base URL for REST vulnerabilities endpoint

Prevents invalid URIs when a bare host is provided.

-        response = self.rest_post(f'{self.orig_url}{DEFAULT_URI_PREFIX}/vulnerabilities/components', request_id, purls)
+        response = self.rest_post(f'{self.rest_base_url}{DEFAULT_URI_PREFIX}/vulnerabilities/components',
+                                  request_id, purls)
🧹 Nitpick comments (4)
CHANGELOG.md (1)

662-664: Fix duplicate anchor; add 1.32.0 compare link

Duplicate reference “1.31.5”; the second should be 1.32.0.

 [1.31.5]: https://github.com/scanoss/scanoss.py/compare/v1.31.4...v1.31.5
-[1.31.5]: https://github.com/scanoss/scanoss.py/compare/v1.31.5...v1.32.0
+[1.32.0]: https://github.com/scanoss/scanoss.py/compare/v1.31.5...v1.32.0
src/scanoss/scanossapi.py (1)

98-112: Load headers using the resolved URL

Pass the resolved URL to avoid double-premium switching on falsy url values.

-        self.load_generic_headers(url)
+        self.load_generic_headers(self.url)
src/scanoss/scanossgrpc.py (2)

82-85: Concurrency constant OK; consider tying HTTP pool size to this

We’ll mount HTTPAdapter below to match MAX_CONCURRENT_REQUESTS.


703-718: Case-insensitive API-key detection and mirror X-Session for LB compatibility

Aligns with header handling elsewhere and keeps legacy header present when key comes via req_headers.

-        if self.req_headers:  # Load generic headers
+        if self.req_headers:  # Load generic headers
             for key, value in self.req_headers.items():
-                if key == 'x-api-key':  # Set premium URL if x-api-key header is set
-                    if not url and not os.environ.get('SCANOSS_GRPC_URL'):
+                if key.lower() == 'x-api-key':  # Set premium URL if x-api-key header is set
+                    if not url and not os.environ.get('SCANOSS_GRPC_URL'):
                         self.url = DEFAULT_URL2  # API key specific and no alternative URL, so use the default premium
                     self.api_key = value
                 self.metadata.append((key, value))
                 self.headers[key] = value
+                if key.lower() == 'x-api-key':
+                    self.headers['X-Session'] = value
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 3221d1b and cae9273.

📒 Files selected for processing (3)
  • CHANGELOG.md (2 hunks)
  • src/scanoss/scanossapi.py (3 hunks)
  • src/scanoss/scanossgrpc.py (11 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/scanossapi.py (1)
src/scanoss/scanossgrpc.py (1)
  • load_generic_headers (703-717)
src/scanoss/scanossgrpc.py (3)
src/scanoss/api/vulnerabilities/v2/scanoss_vulnerabilities_pb2_grpc.py (3)
  • VulnerabilitiesStub (29-74)
  • GetComponentsVulnerabilities (154-166)
  • GetComponentsVulnerabilities (382-406)
src/scanoss/scanossapi.py (1)
  • load_generic_headers (270-283)
src/scanoss/api/dependencies/v2/scanoss_dependencies_pb2_grpc.py (2)
  • GetDependencies (69-74)
  • GetDependencies (142-166)
🪛 GitHub Actions: Lint
src/scanoss/scanossapi.py

[error] 25-25: I001 Import block is unsorted or unformatted. Organize imports. (Command: xargs ruff check)


[error] 93-93: PLR2004 Magic value used in comparison, consider replacing 5 with a constant variable. (Command: xargs ruff check)


[error] 134-134: PLR0912 Too many branches (23 > 12). (Command: xargs ruff check)


[error] 134-134: PLR0915 Too many statements (76 > 50). (Command: xargs ruff check)


[error] 193-193: PLR2004 Magic value used in comparison, consider replacing 503 with a constant variable. (Command: xargs ruff check)


[error] 203-203: PLR2004 Magic value used in comparison, consider replacing 400 with a constant variable. (Command: xargs ruff check)

🪛 LanguageTool
CHANGELOG.md

[grammar] ~12-~12: There might be a mistake here.
Context: ...ing changes... ## [1.32.0] - 2025-09-01 ### Added - Switched vulnerability and depen...

(QB_NEW_EN)


[grammar] ~13-~13: There might be a mistake here.
Context: ...s... ## [1.32.0] - 2025-09-01 ### Added - Switched vulnerability and dependency AP...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
CHANGELOG.md

663-663: Link and image reference definitions should be needed
Duplicate link or image reference definition: "1.31.5"

(MD053, link-image-reference-definitions)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (4)
src/scanoss/scanossapi.py (1)

270-284: Make header key check case-insensitive and mirror X-Session when API key is supplied via headers

     def load_generic_headers(self, url):
         if self.req_headers:
             for key, value in self.req_headers.items():
-                if key == 'x-api-key':
+                if key.lower() == 'x-api-key':
                     if not url and not os.environ.get('SCANOSS_SCAN_URL'):
                         self.url = DEFAULT_URL2
                     self.api_key = value
+                    self.headers['X-Session'] = value
                 self.headers[key] = value

No call-site updates are required; all existing load_generic_headers(url) invocations remain compatible.

src/scanoss/scanossgrpc.py (3)

25-46: Imports and tracing setup look fine


321-343: gRPC dependency path looks good

Request construction, metadata, and response handling are correct.


345-356: Vulnerability dispatch by transport LGTM

Runtime switch via use_grpc is straightforward.

Comment on lines +791 to 809
def _process_dep_file_rest(self, file, depth: int = 1) -> dict:
"""
Porcess a single dependency file using REST

:param file: dependency file purls
:param depth: depth to search (default: 1)
:return: response JSON or None
"""
request_id = str(uuid.uuid4())
self.print_debug(f'Sending data for Dependencies via REST (request id: {request_id})...')
file_request = {'files': [file], 'depth': depth}
response = self.rest_post(f'{self.orig_url}{DEFAULT_URI_PREFIX}/dependencies/dependencies',
request_id, file_request
)
self.print_trace(f'Received response for Dependencies via REST (request id: {request_id}): {response}')
if response:
return response
return None
#
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Fix typo and use scheme-safe base URL for REST dependencies

Also corrects “Porcess” → “Process”.

-    def _process_dep_file_rest(self, file, depth: int = 1) -> dict:
+    def _process_dep_file_rest(self, file, depth: int = 1) -> dict:
         """
-        Porcess a single dependency file using REST
+        Process a single dependency file using REST
@@
-        response = self.rest_post(f'{self.orig_url}{DEFAULT_URI_PREFIX}/dependencies/dependencies',
+        response = self.rest_post(f'{self.rest_base_url}{DEFAULT_URI_PREFIX}/dependencies/dependencies',
                                   request_id, file_request
                                   )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _process_dep_file_rest(self, file, depth: int = 1) -> dict:
"""
Porcess a single dependency file using REST
:param file: dependency file purls
:param depth: depth to search (default: 1)
:return: response JSON or None
"""
request_id = str(uuid.uuid4())
self.print_debug(f'Sending data for Dependencies via REST (request id: {request_id})...')
file_request = {'files': [file], 'depth': depth}
response = self.rest_post(f'{self.orig_url}{DEFAULT_URI_PREFIX}/dependencies/dependencies',
request_id, file_request
)
self.print_trace(f'Received response for Dependencies via REST (request id: {request_id}): {response}')
if response:
return response
return None
#
def _process_dep_file_rest(self, file, depth: int = 1) -> dict:
"""
Process a single dependency file using REST
:param file: dependency file purls
:param depth: depth to search (default: 1)
:return: response JSON or None
"""
request_id = str(uuid.uuid4())
self.print_debug(f'Sending data for Dependencies via REST (request id: {request_id})...')
file_request = {'files': [file], 'depth': depth}
response = self.rest_post(f'{self.rest_base_url}{DEFAULT_URI_PREFIX}/dependencies/dependencies',
request_id, file_request
)
self.print_trace(f'Received response for Dependencies via REST (request id: {request_id}): {response}')
if response:
return response
return None
#
🤖 Prompt for AI Agents
In src/scanoss/scanossgrpc.py around lines 791 to 809, fix the docstring typo
("Porcess" → "Process") and stop concatenating URLs by string joining; instead
build the REST endpoint with a scheme-safe join (for example use
urllib.parse.urljoin(self.orig_url, DEFAULT_URI_PREFIX +
'/dependencies/dependencies')) so the base URL and prefix are combined safely
regardless of trailing slashes or scheme, and replace the current f-string
concatenation with that joined URL when calling self.rest_post.

Copy link

github-actions bot commented Sep 1, 2025

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 48
  • Detected files undeclared: 0
  • Detected files declared: 48
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/scanoss/scanossapi.py (2)

134-166: Thread-safety: don’t mutate shared self.headers for per-request fields

Mutating self.headers to inject x-request-id is racy if scan() is called concurrently. Build a per-request copy instead and pass it to post().

-        headers = self.headers
-        headers['x-request-id'] = request_id  # send a unique request id for each post
+        headers = {**self.headers, 'x-request-id': request_id}  # per-request, thread-safe
...
-                    self.url, files=scan_files, data=form_data, headers=self.headers, timeout=self.timeout
+                    self.url, files=scan_files, data=form_data, headers=headers, timeout=self.timeout

193-217: Refine retry policy: don’t retry all 4xx; retry 429/5xx with backoff; honor Retry-After

Current logic retries on any 4xx and treats 503 as fatal. That wastes retries on auth/validation errors and mishandles transient throttling. Recommend: immediate fail for 4xx except 429; retry 429/5xx with (bounded) backoff and Retry-After.

-                elif r.status_code == requests.codes.service_unavailable:  # Service limits most likely reached
-                    self.print_stderr(
-                        f'ERROR: SCANOSS API rejected the scan request ({request_id}) due to '
-                        f'service limits being exceeded'
-                    )
-                    self.print_stderr(f'ERROR: Details: {r.text.strip()}')
-                    raise Exception(
-                        f'ERROR: {r.status_code} - The SCANOSS API request ({request_id}) rejected '
-                        f'for {self.url} due to service limits being exceeded.'
-                    )
-                elif r.status_code >= requests.codes.bad_request:
-                    if retry > self.retry_limit:  # No response retry_limit or more times, fail
-                        self.save_bad_req_wfp(scan_files, request_id, scan_id)
-                        raise Exception(
-                            f'ERROR: The SCANOSS API returned the following error: HTTP {r.status_code}, '
-                            f'{r.text.strip()}'
-                        )
-                    else:
-                        self.save_bad_req_wfp(scan_files, request_id, scan_id)
-                        self.print_stderr(
-                            f'Warning: Error response code {r.status_code} ({r.text.strip()}) from '
-                            f'{self.url}. Retrying...'
-                        )
-                        time.sleep(5)
+                elif r.status_code in (requests.codes.too_many_requests, requests.codes.service_unavailable):
+                    # Retriable throttling/overload
+                    if retry > self.retry_limit:
+                        self.save_bad_req_wfp(scan_files, request_id, scan_id)
+                        raise Exception(
+                            f'ERROR: {r.status_code} - SCANOSS API transient failure ({request_id}) for {self.url}: {r.text.strip()}'
+                        )
+                    retry_after = r.headers.get('Retry-After')
+                    try:
+                        delay = max(1, int(retry_after)) if retry_after else 5
+                    except (TypeError, ValueError):
+                        delay = 5
+                    self.print_stderr(f'Warning: {r.status_code} from {self.url}. Retrying in {delay}s...')
+                    time.sleep(delay)
+                elif 400 <= r.status_code < 500:
+                    # Non-retriable client errors
+                    self.save_bad_req_wfp(scan_files, request_id, scan_id)
+                    raise Exception(
+                        f'ERROR: {r.status_code} - SCANOSS API request ({request_id}) rejected for {self.url}: {r.text.strip()}'
+                    )
+                elif r.status_code >= 500:
+                    if retry > self.retry_limit:
+                        self.save_bad_req_wfp(scan_files, request_id, scan_id)
+                        raise Exception(
+                            f'ERROR: {r.status_code} - SCANOSS API server error ({request_id}) for {self.url}: {r.text.strip()}'
+                        )
+                    self.print_stderr(f'Warning: Server error {r.status_code} from {self.url}. Retrying in 5s...')
+                    time.sleep(5)
🧹 Nitpick comments (7)
src/scanoss/scanossapi.py (7)

45-47: Simplify env var defaults

Use the shorter, idiomatic os.environ.get with a default.

-SCANOSS_SCAN_URL = os.environ.get('SCANOSS_SCAN_URL') if os.environ.get('SCANOSS_SCAN_URL') else DEFAULT_URL
-SCANOSS_API_KEY = os.environ.get('SCANOSS_API_KEY') if os.environ.get('SCANOSS_API_KEY') else ''
+SCANOSS_SCAN_URL = os.environ.get('SCANOSS_SCAN_URL', DEFAULT_URL)
+SCANOSS_API_KEY = os.environ.get('SCANOSS_API_KEY', '')

55-72: Constructor docstring is stale

Docstring omits newer params (ignore_cert_errors, pac, retry, req_headers, ca_cert defaults). Update for consistency.


93-95: MIN_TIMEOUT guard should be inclusive

Make MIN_TIMEOUT inclusive to align with CLI check (which only flags values < MIN_TIMEOUT).

-        self.timeout = timeout if timeout > MIN_TIMEOUT else DEFAULT_TIMEOUT
+        self.timeout = timeout if timeout >= MIN_TIMEOUT else DEFAULT_TIMEOUT

105-111: Duplicate User-Agent header

Setting both 'User-Agent' and 'user-agent' is redundant; HTTP headers are case-insensitive. Keep the canonical 'User-Agent' only.

         user_agent = f'scanoss-py/{__version__}'
-        self.headers['User-Agent'] = user_agent
-        self.headers['user-agent'] = user_agent
+        self.headers['User-Agent'] = user_agent

166-183: Include request_id in raised exceptions for correlation

Where exceptions are raised (e.g., SSLError/ProxyError and generic Exception), embed request_id to ease tracing.


270-284: Header match should be case-insensitive for x-api-key

req_headers keys may arrive in different cases. Normalize for logic while preserving original key casing in self.headers.

-        if self.req_headers:  # Load generic headers
-            for key, value in self.req_headers.items():
-                if key == 'x-api-key': # Set premium URL if x-api-key header is set
-                    if not url and not os.environ.get('SCANOSS_SCAN_URL'):
-                        self.url = DEFAULT_URL2  # API key specific and no alternative URL, so use the default premium
-                    self.api_key = value
-                self.headers[key] = value
+        if self.req_headers:  # Load generic headers
+            for key, value in self.req_headers.items():
+                key_lower = key.lower()
+                if key_lower == 'x-api-key':  # Set premium URL if x-api-key header is set
+                    if not url and not os.environ.get('SCANOSS_SCAN_URL'):
+                        self.url = DEFAULT_URL2  # API key specific and no alternative URL, so use the default premium
+                    self.api_key = value
+                self.headers[key] = value

113-116: Global HTTPConnection.debuglevel

Setting http_client.HTTPConnection.debuglevel is global. If multi-tenant or used as a library, consider limiting to logging at urllib3 level or gating via an env var to avoid surprising consumers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between cae9273 and 7c25826.

📒 Files selected for processing (1)
  • src/scanoss/scanossapi.py (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/scanossapi.py (3)
src/scanoss/scanossbase.py (1)
  • ScanossBase (28-107)
src/scanoss/scanossgrpc.py (1)
  • load_generic_headers (703-717)
src/scanoss/cli.py (2)
  • scan (1281-1480)
  • wfp (1182-1246)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (1)
src/scanoss/scanossapi.py (1)

98-103: Env-driven premium endpoint fallback looks good

Auto-switching to DEFAULT_URL2 when an API key is set and no URL/env override is supplied is sensible.

@Alex-1089 Alex-1089 merged commit 84eee42 into main Sep 1, 2025
6 checks passed
@Alex-1089 Alex-1089 deleted the feat/sean/SP-3132_switch-free-to-rest branch September 1, 2025 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants