ValueError while Importing Word2Vec from gensim.models #3606

Closed
usamashami11 opened this issue Mar 19, 2025 · 1 comment

Comments

@usamashami11

usamashami11 commented Mar 19, 2025

Description of Issue

A week ago everything was working perfectly. Now, after a successful installation of the gensim package in Google Colab, importing Word2Vec from gensim.models fails with this error: ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Steps to Reproduce Error

  1. Install the library via !pip install gensim
  2. Import Word2Vec via from gensim.models import Word2Vec

Complete Error Output

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
[<ipython-input-3-d83094a7b397>](https://localhost:8080/#) in <cell line: 0>()
      1 # Import Libraries
----> 2 from gensim.models import Word2Vec
      3 import nltk
      4 nltk.download('punkt_tab')
      5 from nltk.tokenize import word_tokenize

5 frames
[/usr/local/lib/python3.11/dist-packages/gensim/__init__.py](https://localhost:8080/#) in <module>
      9 import logging
     10 
---> 11 from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
     12 
     13 

[/usr/local/lib/python3.11/dist-packages/gensim/corpora/__init__.py](https://localhost:8080/#) in <module>
      4 
      5 # bring corpus classes directly into package namespace, to save some typing
----> 6 from .indexedcorpus import IndexedCorpus  # noqa:F401 must appear before the other classes
      7 
      8 from .mmcorpus import MmCorpus  # noqa:F401

[/usr/local/lib/python3.11/dist-packages/gensim/corpora/indexedcorpus.py](https://localhost:8080/#) in <module>
     12 import numpy
     13 
---> 14 from gensim import interfaces, utils
     15 
     16 logger = logging.getLogger(__name__)

[/usr/local/lib/python3.11/dist-packages/gensim/interfaces.py](https://localhost:8080/#) in <module>
     17 import logging
     18 
---> 19 from gensim import utils, matutils
     20 
     21 

[/usr/local/lib/python3.11/dist-packages/gensim/matutils.py](https://localhost:8080/#) in <module>
   1032 try:
   1033     # try to load fast, cythonized code if possible
-> 1034     from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
   1035 
   1036 except ImportError:

/usr/local/lib/python3.11/dist-packages/gensim/_matutils.pyx in init gensim._matutils()

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

Output of Versions

Linux-6.1.85+-x86_64-with-glibc2.35
Python 3.11.11 (main, Dec  4 2024, 08:55:07) [GCC 11.4.0]
Bits 64
NumPy 2.0.2
SciPy 1.14.1
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
[<ipython-input-5-824570de713c>](https://localhost:8080/#) in <cell line: 0>()
      4 import numpy; print("NumPy", numpy.__version__)
      5 import scipy; print("SciPy", scipy.__version__)
----> 6 import gensim; print("gensim", gensim.__version__)
      7 from gensim.models import word2vec;print("FAST_VERSION", word2vec.FAST_VERSION)

5 frames
[/usr/local/lib/python3.11/dist-packages/gensim/__init__.py](https://localhost:8080/#) in <module>
      9 import logging
     10 
---> 11 from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
     12 
     13 

[/usr/local/lib/python3.11/dist-packages/gensim/corpora/__init__.py](https://localhost:8080/#) in <module>
      4 
      5 # bring corpus classes directly into package namespace, to save some typing
----> 6 from .indexedcorpus import IndexedCorpus  # noqa:F401 must appear before the other classes
      7 
      8 from .mmcorpus import MmCorpus  # noqa:F401

[/usr/local/lib/python3.11/dist-packages/gensim/corpora/indexedcorpus.py](https://localhost:8080/#) in <module>
     12 import numpy
     13 
---> 14 from gensim import interfaces, utils
     15 
     16 logger = logging.getLogger(__name__)

[/usr/local/lib/python3.11/dist-packages/gensim/interfaces.py](https://localhost:8080/#) in <module>
     17 import logging
     18 
---> 19 from gensim import utils, matutils
     20 
     21 

[/usr/local/lib/python3.11/dist-packages/gensim/matutils.py](https://localhost:8080/#) in <module>
   1032 try:
   1033     # try to load fast, cythonized code if possible
-> 1034     from gensim._matutils import logsumexp, mean_absolute_difference, dirichlet_expectation
   1035 
   1036 except ImportError:

/usr/local/lib/python3.11/dist-packages/gensim/_matutils.pyx in init gensim._matutils()

ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
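
The version report above stops at the first failing import, so the gensim and FAST_VERSION lines never print. A minimal sketch of a report that keeps going past a failed import is shown here. The helper name report_versions is hypothetical, and the three environment prints are an assumption inferred from the output lines, since the traceback only shows lines 4-7 of the original cell.

```python
import importlib
import platform
import struct
import sys


def report_versions(module_names):
    """Return {module_name: version string, or a description of the import failure}."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:
            # Catch broadly: the binary mismatch in this issue surfaces as
            # ValueError, not ImportError.
            report[name] = f"import failed: {type(exc).__name__}: {exc}"
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


# Assumed equivalents of the first three lines of the original report cell.
print(platform.platform())
print("Python", sys.version)
print("Bits", 8 * struct.calcsize("P"))

for name, result in report_versions(["numpy", "scipy", "gensim"]).items():
    print(name, result)
```

With this form, a failing gensim import is recorded as one line of the report while the NumPy and SciPy versions still appear next to it.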

Alternative Method (1)

  1. Install the library via !pip install --upgrade gensim

Output of Versions for this method

Linux-6.1.85+-x86_64-with-glibc2.35
Python 3.11.11 (main, Dec  4 2024, 08:55:07) [GCC 11.4.0]
Bits 64
NumPy 2.0.2
SciPy 1.13.1
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
[<ipython-input-2-79ea32cbff8d>](https://localhost:8080/#) in <cell line: 0>()
      4 import numpy; print("NumPy", numpy.__version__)
      5 import scipy; print("SciPy", scipy.__version__)
----> 6 import gensim; print("gensim", gensim.__version__)
      7 from gensim.models import word2vec; print("FAST_VERSION", word2vec.FAST_VERSION)

9 frames
[/usr/local/lib/python3.11/dist-packages/gensim/__init__.py](https://localhost:8080/#) in <module>
      9 import logging
     10 
---> 11 from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
     12 
     13 

[/usr/local/lib/python3.11/dist-packages/gensim/parsing/__init__.py](https://localhost:8080/#) in <module>
      2 
      3 from .porter import PorterStemmer  # noqa:F401
----> 4 from .preprocessing import (  # noqa:F401
      5     preprocess_documents,
      6     preprocess_string,

[/usr/local/lib/python3.11/dist-packages/gensim/parsing/preprocessing.py](https://localhost:8080/#) in <module>
     24 import glob
     25 
---> 26 from gensim import utils
     27 from gensim.parsing.porter import PorterStemmer
     28 

[/usr/local/lib/python3.11/dist-packages/gensim/utils.py](https://localhost:8080/#) in <module>
     33 
     34 import numpy as np
---> 35 import scipy.sparse
     36 from smart_open import open
     37 

[/usr/local/lib/python3.11/dist-packages/scipy/sparse/__init__.py](https://localhost:8080/#) in <module>
    292 import warnings as _warnings
    293 
--> 294 from ._base import *
    295 from ._csr import *
    296 from ._csc import *

[/usr/local/lib/python3.11/dist-packages/scipy/sparse/_base.py](https://localhost:8080/#) in <module>
      3 
      4 import numpy as np
----> 5 from scipy._lib._util import VisibleDeprecationWarning
      6 
      7 from ._sputils import (asmatrix, check_reshape_kwargs, check_shape,

[/usr/local/lib/python3.11/dist-packages/scipy/_lib/_util.py](https://localhost:8080/#) in <module>
     16 
     17 import numpy as np
---> 18 from scipy._lib._array_api import array_namespace
     19 
     20 

[/usr/local/lib/python3.11/dist-packages/scipy/_lib/_array_api.py](https://localhost:8080/#) in <module>
     15 
     16 from scipy._lib import array_api_compat
---> 17 from scipy._lib.array_api_compat import (
     18     is_array_api_obj,
     19     size,

[/usr/local/lib/python3.11/dist-packages/scipy/_lib/array_api_compat/numpy/__init__.py](https://localhost:8080/#) in <module>
----> 1 from numpy import *
      2 
      3 # from numpy import * doesn't overwrite these builtin names
      4 from numpy import abs, max, min, round
      5 

[/usr/local/lib/python3.11/dist-packages/numpy/__init__.py](https://localhost:8080/#) in __getattr__(attr)
    362         try:
    363             x = ones(2, dtype=float32)
--> 364             if not abs(x.dot(x) - float32(2.0)) < 1e-5:
    365                 raise AssertionError()
    366         except AssertionError:

ModuleNotFoundError: No module named 'numpy.rec'

Alternative Method (2)

  1. Install gensim from the source code (available in releases) via the following:
# Downloading tar.gz file
!wget https://github.com/piskvorky/gensim/archive/refs/tags/4.3.2.tar.gz

# Unzipping tar.gz file
!tar -xvzf /content/4.3.2.tar.gz

# Changing directory and Installing the Gensim package
import os 
os.chdir('/content/gensim-4.3.2/')
!pwd
!pip install .

Output of Versions for this method

Linux-6.1.85+-x86_64-with-glibc2.35
Python 3.11.11 (main, Dec  4 2024, 08:55:07) [GCC 11.4.0]
Bits 64
NumPy 2.0.2
SciPy 1.14.1
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
[<ipython-input-4-824570de713c>](https://localhost:8080/#) in <cell line: 0>()
      4 import numpy; print("NumPy", numpy.__version__)
      5 import scipy; print("SciPy", scipy.__version__)
----> 6 import gensim; print("gensim", gensim.__version__)
      7 from gensim.models import word2vec;print("FAST_VERSION", word2vec.FAST_VERSION)

4 frames
[/content/gensim-4.3.2/gensim/__init__.py](https://localhost:8080/#) in <module>
      9 import logging
     10 
---> 11 from gensim import parsing, corpora, matutils, interfaces, models, similarities, utils  # noqa:F401
     12 
     13 

[/content/gensim-4.3.2/gensim/corpora/__init__.py](https://localhost:8080/#) in <module>
      4 
      5 # bring corpus classes directly into package namespace, to save some typing
----> 6 from .indexedcorpus import IndexedCorpus  # noqa:F401 must appear before the other classes
      7 
      8 from .mmcorpus import MmCorpus  # noqa:F401

[/content/gensim-4.3.2/gensim/corpora/indexedcorpus.py](https://localhost:8080/#) in <module>
     12 import numpy
     13 
---> 14 from gensim import interfaces, utils
     15 
     16 logger = logging.getLogger(__name__)

[/content/gensim-4.3.2/gensim/interfaces.py](https://localhost:8080/#) in <module>
     17 import logging
     18 
---> 19 from gensim import utils, matutils
     20 
     21 

[/content/gensim-4.3.2/gensim/matutils.py](https://localhost:8080/#) in <module>
     18 import scipy.sparse
     19 from scipy.stats import entropy
---> 20 from scipy.linalg import get_blas_funcs, triu
     21 from scipy.linalg.lapack import get_lapack_funcs
     22 from scipy.special import psi  # gamma function utils

ImportError: cannot import name 'triu' from 'scipy.linalg' (/usr/local/lib/python3.11/dist-packages/scipy/linalg/__init__.py)
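
The ImportError above says that the installed scipy.linalg (SciPy 1.14.1 per the version output) does not export triu, which line 20 of matutils.py in the 4.3.2 source tree tries to import. The sketch below illustrates the general "try one location, fall back to another" import pattern. It is not gensim's code and not a recommended fix: first_available is a hypothetical helper, and treating numpy.triu as a stand-in is an assumption.

```python
import importlib


def first_available(candidates):
    """Return the first importable attribute from a list of (module, attribute) pairs."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing entirely, try the next candidate
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    raise ImportError(f"none of {candidates!r} could be imported")


# Assumption: numpy.triu is an acceptable substitute when scipy.linalg lacks triu.
try:
    triu = first_available([("scipy.linalg", "triu"), ("numpy", "triu")])
except ImportError:
    triu = None  # neither library is importable in this environment
```

The helper only handles ImportError, so a binary mismatch like the ValueError earlier in this report would still propagate.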

I have also tried reinstalling different versions of numpy, scipy and/or gensim, but with no luck. Please look into this and resolve it as soon as possible. Thanks!

@gojomo
Collaborator

gojomo commented Mar 19, 2025

Per my comment at #3605 (comment), I don't see a problem after (a) installing gensim via !pip install --upgrade gensim; then (b) restarting the session, as is often necessary when the underlying installed packages have changed.
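
One way to tell whether a restart is still pending is to compare the version of each package already loaded in the session with the version pip has installed on disk. This is a minimal sketch under stated assumptions: stale_packages is a hypothetical helper, and it assumes the import name matches the distribution name, which holds for numpy, scipy and gensim but not for every package.

```python
import sys
from importlib import metadata


def stale_packages(names, installed_version=metadata.version):
    """Return {name: (loaded, installed)} for already-imported packages whose
    in-memory version differs from the version installed on disk."""
    stale = {}
    for name in names:
        module = sys.modules.get(name)
        if module is None:
            continue  # not imported in this session, so it cannot be stale
        loaded = getattr(module, "__version__", None)
        try:
            installed = installed_version(name)
        except metadata.PackageNotFoundError:
            continue  # no installed distribution under this name
        if loaded is not None and loaded != installed:
            stale[name] = (loaded, installed)
    return stale


# An empty dict means no mismatch was detected for the listed packages.
print(stale_packages(["numpy", "scipy", "gensim"]))
```

If the result is non-empty after a pip install, the session is still running the old copy of that package, and restarting it before importing gensim is the step described in (b).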

As this seems to be roughly the same issue as #3605, let's continue the discussion there unless the errors are shown to require different workarounds. Closing this as a duplicate for now.

gojomo closed this as completed Mar 19, 2025