[Cython] Fused type signature resolution failures

Discussion:

Pauli Virtanen

2014-03-18 22:56:36 UTC

Hi,

Here are the fused types + memoryview signature resolution failures I
promised earlier (Cython 0.20.1):

(A)

TypeError: No matching signature

----------------------------- asd.pyx
cimport numpy as cnp

ctypedef fused value_t:
    cnp.float32_t
    cnp.float64_t

cpdef foo(value_t x):
    pass
----------------------------- quux.py
import numpy as np
import asd
asd.foo(np.float32(1.0))
-----------------------------

(B)

ValueError: Buffer dtype mismatch, expected 'int64_t' but got 'double'

----------------------------- asd.pyx
cimport numpy as cnp

ctypedef fused idx_t:
    cnp.int32_t
    cnp.int64_t

ctypedef fused value_t:
    cnp.int64_t
    cnp.float64_t

cpdef foo(idx_t[:,:] i, idx_t[:,:] j, value_t[:,:] x):
    pass
----------------------------- quux.py
import numpy as np
import asd
i = np.zeros((3, 3), np.int64)
j = np.zeros((3, 3), np.int64)
x = np.zeros((3, 3), np.float64)
asd.foo(i, j, x)
-----------------------------

(C)

Then some nasty platform-dependent failures:

https://github.com/scipy/scipy/issues/3461

The relevant code is:

https://github.com/scipy/scipy/blob/master/scipy/sparse/_csparsetools.pyx#L202

The code looks like nothing special. However, the call to `lil_fancy_get`
fails with "TypeError: No matching signature found" when the inputs have types

<class 'int'> <class 'int'> <class 'numpy.ndarray'>
<class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'>
<class 'numpy.ndarray'> <class 'numpy.ndarray'>

with ndarray dtypes: object object object object int32 int32

The failure occurs only on Christoph Gohlke's Win64 build, but not on
Linux/OSX/MINGW32. This sounds like some integer size combination
issue, but I'm far from sure.
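
One way to probe the "integer size combination" suspicion is to print the native C integer widths on each build. The sketch below uses only the standard `struct` module; the helper name `native_int_sizes` is made up for illustration. It does not confirm the cause of the failure. It only shows the known platform difference that a C `long` is 32 bits on 64-bit Windows and 64 bits on 64-bit Linux/OSX.

```python
import struct


def native_int_sizes():
    """Return the byte width of a few native C integer types on this platform."""
    # struct.calcsize with no byte-order prefix reports native sizes.
    return {
        "int": struct.calcsize("i"),
        "long": struct.calcsize("l"),
        "long long": struct.calcsize("q"),
        "pointer": struct.calcsize("P"),
    }


sizes = native_int_sizes()
for name, nbytes in sizes.items():
    print(f"{name:>9}: {nbytes * 8} bits")
```

If `long` and `pointer` differ in width on the failing build only, that would be consistent with the guess above, though still not proof.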

Unfortunately, I'm not easily able to isolate/understand what's going
wrong here.

--
Pauli Virtanen

Stefan Behnel

2014-06-22 10:48:27 UTC

Hi,

it looks like no-one has replied to this so far, so here's a response.

Post by Pauli Virtanen
Here are the fused types + memoryview signature resolution failures I
promised earlier (Cython 0.20.1):
(A)
TypeError: No matching signature
----------------------------- asd.pyx
cimport numpy as cnp
ctypedef fused value_t:
    cnp.float32_t
    cnp.float64_t
cpdef foo(value_t x):
    pass
----------------------------- quux.py
import numpy as np
import asd
asd.foo(np.float32(1.0))
-----------------------------

I don't actually know how NumPy represents its types at the Python level,
but my guess is that it would be tricky to match these two without teaching
Cython itself something about NumPy (and how it wraps basic C types here).
I'd rather like to avoid that and live with the above.

Post by Pauli Virtanen
(B)
ValueError: Buffer dtype mismatch, expected 'int64_t' but got 'double'
----------------------------- asd.pyx
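cimport numpy as cnp
ctypedef fused idx_t:
    cnp.int32_t
    cnp.int64_t
ctypedef fused value_t:
    cnp.int64_t
    cnp.float64_t
cpdef foo(idx_t[:,:] i, idx_t[:,:] j, value_t[:,:] x):
    pass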
----------------------------- quux.py
import numpy as np
import asd
i = np.zeros((3, 3), np.int64)
j = np.zeros((3, 3), np.int64)
x = np.zeros((3, 3), np.float64)
asd.foo(i, j, x)
-----------------------------

This looks like a bug to me at first sight.

Post by Pauli Virtanen
(C)
https://github.com/scipy/scipy/issues/3461
https://github.com/scipy/scipy/blob/master/scipy/sparse/_csparsetools.pyx#L202
The code looks like nothing special. However, the call to `lil_fancy_get`
fails with "TypeError: No matching signature found" when the inputs have types
<class 'int'> <class 'int'> <class 'numpy.ndarray'>
<class 'numpy.ndarray'> <class 'numpy.ndarray'> <class 'numpy.ndarray'>
<class 'numpy.ndarray'> <class 'numpy.ndarray'>
with ndarray dtypes: object object object object int32 int32
The failure occurs only on Christoph Gohlke's Win64 build, but not on
Linux/OSX/MINGW32. This sounds like some integer size combination
issue, but I'm far from sure.
Unfortunately, I'm not easily able to isolate/understand what's going
wrong here.

Generally speaking, I think that the signature matching algorithm has some
room for improvements, especially the one that matches Python signatures at
runtime.

We should take a look at how other implementations do this dispatch. There
are multiple "generic functions" implementations for Python that do similar
things.
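
For reference, one such "generic functions" implementation ships with Python 3.4 and later as `functools.singledispatch`. The sketch below is only an illustration of that style of dispatch, not a proposal for what Cython should do; the function name `describe` and the returned strings are made up. Note that `singledispatch` dispatches on the type of the first argument only and resolves subclasses through the method resolution order.

```python
from functools import singledispatch


@singledispatch
def describe(x):
    # Fallback when no registered type matches the argument.
    raise TypeError("No matching signature found")


@describe.register(int)
def _(x):
    return "integer path"


@describe.register(float)
def _(x):
    return "floating-point path"


print(describe(1))     # dispatches to the int implementation
print(describe(1.0))   # dispatches to the float implementation
print(describe(True))  # bool is a subclass of int, so the int implementation runs
```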

Stefan

Pauli Virtanen

2014-06-22 11:35:24 UTC

22.06.2014 13:48, Stefan Behnel wrote:
[clip]

Post by Stefan Behnel

Post by Pauli Virtanen
(A)

[clip]

Post by Stefan Behnel

Post by Pauli Virtanen
asd.foo(np.float32(1.0))

I don't actually know how NumPy represents its types at the Python
level, but my guess is that it would be tricky to match these two
without teaching Cython itself something about NumPy (and how it
wraps basic C types here). I'd rather like to avoid that and live
with the above.

Agreed, it's probably not possible to properly deal with this without
making use of Numpy scalar type object binary layout in some form.

On the other hand, `asd.foo(np.array(1.0))` doesn't work either ---
maybe the buffer code path does not trigger for scalar values.

[clip]

Post by Stefan Behnel

Post by Pauli Virtanen
(B)
ValueError: Buffer dtype mismatch, expected 'int64_t' but got 'double'

[clip]

Post by Stefan Behnel
This looks like a bug to me at first sight.

This was fixed by my PR #284 that you merged.

Post by Stefan Behnel

Post by Pauli Virtanen
(C)

[clip]

Post by Stefan Behnel
Generally speaking, I think that the signature matching algorithm has some
room for improvements, especially the one that matches Python signatures at
runtime.
We should take a look at how other implementations do this dispatch. There
are multiple "generic functions" implementations for Python that do similar
things.

I agree that there probably is room for improvement, possibly also
speed-wise.

I'll try to revisit (at some point) the csparsetools Cython
implementation to see if there are low-hanging fixes that would be
useful there.

--
Pauli Virtanen

Stefan Behnel

2014-06-22 12:01:28 UTC

Post by Pauli Virtanen

I agree that there probably is room for improvement, possibly also
speed-wise.

Definitely speed-wise. I was considering getting rid of the "build a key and
do a dict lookup" approach and just doing sequential type checks instead,
although looking at your example with its dozen different numeric types
in one fused type makes me worry that there might be code out there that
actually benefits from dict usage. Still, I'm sure it would be possible to
speed up the most common case of exactly one fused type in a signature,
maybe also that of two, or that of one structured (i.e. array) type.
Everything else isn't worth it, I guess.
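
To make the two strategies concrete, here is a pure-Python sketch of the difference, for a single fused argument. It is not Cython's generated code; all names (`_impl_int`, `_impl_float`, `_TABLE`, `dispatch_by_key`, `dispatch_sequential`) are hypothetical and only illustrate the idea.

```python
def _impl_int(x):
    return "int specialization"


def _impl_float(x):
    return "float specialization"


# Strategy 1: build a key from the argument's type and look it up in a dict.
_TABLE = {"int": _impl_int, "float": _impl_float}


def dispatch_by_key(x):
    key = type(x).__name__
    try:
        impl = _TABLE[key]
    except KeyError:
        raise TypeError("No matching signature found") from None
    return impl(x)


# Strategy 2: test the candidate types one after another.
def dispatch_sequential(x):
    if isinstance(x, int):
        return _impl_int(x)
    if isinstance(x, float):
        return _impl_float(x)
    raise TypeError("No matching signature found")


print(dispatch_by_key(1), "/", dispatch_sequential(1))
print(dispatch_by_key(2.5), "/", dispatch_sequential(2.5))
```

With two candidate types the sequential checks are short, while the dict lookup costs about the same however many types the fused type lists. The two also differ in matching behaviour: in this sketch an exact-name key rejects a subclass such as `bool`, whereas the `isinstance` chain accepts it as an `int`.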

Post by Pauli Virtanen
I'll try to revisit (at some point) the csparsetools Cython
implementation to see if there are low-hanging fixes that would be
useful there.

That would be great.

Stefan