Iteration is not usable #260
Comments
Yes, maybe we should expose something like this:
[Symbol.iterator]() {
  return new NDIter(this)
}
Yes, this is by design since most of the functions in core/ are using the linearised indices. The coordinates are currently accessible using
Yeah, I actually meant to change this a while back, but if I remember correctly it caused issues with
Yeah, the documentation can be improved here. It is heavily inspired by numpy (see https://numpy.org/doc/stable/dev/internals.code-explanations.html#n-d-iterators).
Hmm, this is weird. Should be fixed.
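To make the distinction between linearised indices and coordinates concrete, here is a minimal sketch of converting between the two. It assumes a C-contiguous (row-major) layout with no custom strides, and `unravelIndex` and `ravelIndex` are hypothetical helper names, not part of the vectorious API.

```javascript
// Hypothetical helpers (not part of vectorious). Both assume a row-major,
// contiguous layout, so the last axis varies fastest.

// Convert a linear index into per-axis coordinates.
function unravelIndex(index, shape) {
  const size = shape.reduce((a, b) => a * b, 1)
  if (!Number.isInteger(index) || index < 0 || index >= size) {
    throw new RangeError(`index ${index} is out of bounds for size ${size}`)
  }
  const coords = new Array(shape.length)
  let rest = index
  // Peel off the fastest-varying axis first.
  for (let axis = shape.length - 1; axis >= 0; axis--) {
    coords[axis] = rest % shape[axis]
    rest = Math.floor(rest / shape[axis])
  }
  return coords
}

// Convert per-axis coordinates back into a linear index.
function ravelIndex(coords, shape) {
  let index = 0
  for (let axis = 0; axis < shape.length; axis++) {
    index = index * shape[axis] + coords[axis]
  }
  return index
}

const coords = unravelIndex(4, [2, 3]) // [1, 1]
const linear = ravelIndex(coords, [2, 3]) // 4
```

An array with non-trivial strides (for example a transposed view) would need the strides rather than the shape for this arithmetic, which is one reason an iterator that tracks coordinates itself can be more convenient.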
The linearized indices are not very useful. If that's what you really want, you can typically just iterate over
In Python, I'm thinking something along the lines of:
// if you need something fancy, yield an object handle to each element.
// note I wrote this so that the handles remain valid after the iterator advances.
// It might make sense to keep one handle and update it destructively.
NDArray.prototype.iter = function* () {
  const ndim = this.shape.length
  const data = this.data
  const it = new NDIter(this)
  while (!it.done()) {
    const offset = it.pos
    yield {
      offset,
      byteOffset: offset * data.BYTES_PER_ELEMENT,
      index: it.index,
      coords: it.coords.slice(0, ndim),
      get: () => data[offset],
      set: (value) => { data[offset] = value },
    }
    it.next()
  }
}
// iterate over coordinates and values. I think this should be the default iterator:
NDArray.prototype[Symbol.iterator] = NDArray.prototype.entries = function* () {
  for (const v of this.iter()) {
    yield [v.coords, v.get()]
  }
}
I could also see an argument that iteration should be over a single axis, not all axes (e.g. should yield matrix rows).
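As a rough illustration of the single-axis alternative, the sketch below yields the rows of a 2-D array stored in a flat typed array. It assumes a row-major, contiguous buffer. The `rows` generator is a hypothetical name and does not use NDArray or NDIter.

```javascript
// Hypothetical helper (not part of vectorious): yield each row of a
// row-major 2-D buffer as a typed-array view.
function* rows(data, shape) {
  const [nrows, ncols] = shape
  if (data.length !== nrows * ncols) {
    throw new RangeError('shape does not match the buffer length')
  }
  for (let r = 0; r < nrows; r++) {
    // subarray returns a view on the same underlying buffer, so no
    // elements are copied and writes through the view are visible in data.
    yield data.subarray(r * ncols, (r + 1) * ncols)
  }
}

const matrix = Float64Array.from([1, 2, 3, 4, 5, 6])
const asNested = Array.from(rows(matrix, [2, 3]), (row) => Array.from(row))
// asNested is [[1, 2, 3], [4, 5, 6]]
```

Yielding views keeps iteration cheap, at the cost that a caller who mutates a row also mutates the original data.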
What is the performance of Symbol.iterator? Iteration is a good programming idiom and can theoretically perform well, but in practice I don't know how it fares in the latest benchmarks, or how much function call overhead it adds. If the overhead is substantial, I expect that the JavaScript compilers in engines such as V8 and SpiderMonkey will improve over time and the cost of this technique will shrink.
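One way to approach the overhead question is a small microbenchmark that sums the same typed array with a plain index loop and with a generator consumed by `for...of`. This is only a sketch: the function names are hypothetical, it assumes the global `performance` object is available (as in browsers and recent Node.js versions), and timings vary widely with engine, warm-up, and array size, so no numbers are claimed here.

```javascript
// Generator that yields each element; stands in for an iterator-based API.
function* values(data) {
  for (let i = 0; i < data.length; i++) yield data[i]
}

function sumWithIndexLoop(data) {
  let total = 0
  for (let i = 0; i < data.length; i++) total += data[i]
  return total
}

function sumWithGenerator(data) {
  let total = 0
  for (const v of values(data)) total += v
  return total
}

// Run fn repeatedly and report the mean wall-clock time per call.
function time(label, fn, data, repeats = 50) {
  let result
  const start = performance.now()
  for (let i = 0; i < repeats; i++) result = fn(data)
  const msPerCall = (performance.now() - start) / repeats
  return { label, msPerCall, result }
}

const sample = Float64Array.from({ length: 1000 }, (_, i) => i)
const timings = [
  time('index loop', sumWithIndexLoop, sample),
  time('generator', sumWithGenerator, sample),
]
for (const t of timings) {
  console.log(`${t.label}: ${t.msPerCall.toFixed(4)} ms per call`)
}
```

Both functions compute the same sum, so any difference in the reported times comes from the iteration mechanism rather than the work done per element.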
I want to start out by saying this is the wrong question. Benchmarks don't matter if the API does not make it easy to write correct code. There's a conceptual tension between an axis being:
It's not clear which of these vectorious is trying to be good at, and what an iterator should expose depends heavily on this! It might even be some combination. For instance, a list of particles might each have some mass and some velocity.
Anyway, I spent way too much time microbenchmarking and here's what I found:
On a 1000 element
To be fair, the iterator is written mainly for internal use. My plan was to write one version in JS and another with the same API in C/Rust/Wasm for performance-critical applications, similar to the nblas/nlapack bindings, but life (work) kind of got in the way. It's an interesting idea to expose different iterators for different purposes. Also, benchmarking with only 1000 elements might yield confusing results; try 1M for better comparisons.
I kinda figured. It feels like NDArray was an attempt to generalize over vectors and matrices, but it doesn't seem like this library can do much with n>2!
Yeah. They also have very different memory patterns:
I'm sure there's a very rich theory of API design waiting to be discovered!
1M elements is not something I really use. If I had something that big and needed performance, I'd probably (1) use a plain array instead of a typed array or a wrapper class and (2) use the GPU or worker threads to squeeze out that performance. But this was just a curiosity-driven exploration on very synthetic code, and the results did not seem confusing: https://gist.github.com/rotu/799f0608bd3ede48b9f8f1280876fb05
Maybe I could see 1M elements if I had highly structured data (e.g. tensor networks, joint probability distributions, or particle-particle interaction simulations) but this library can't do some things that would be necessary there (in particular, tensor contractions!)
I'm trying to loop over an NDArray's coordinates, and it seems like iteration needs an overhaul.
- NDArray is not iterable (so for (x of my_ndarray) and Array.from(myNdArray) throw errors)
- NDIter's implementation of the iterable protocol seems to iterate over only the linearized indices (i.e. Array.from(new NDIter(my_ndarray)) gives [0,1,...], even if the ndarray is multidimensional)
- NDIter has properties that are length 32 (V_MAXDIMS) even when the array it's iterating over has fewer dimensions.
- NDIter has extraneous properties which obscure how to use it.
- NDIter's string representation uses the name f (I think because it is derived from minified code) and is excessively verbose.
prints: