Discussion:
[Cython] aritmetic with arrays in Cython
Ian Henriksen
2014-08-04 19:11:13 UTC
Permalink
Hi all,
I noticed that some time ago there was a pull request (
https://github.com/cython/cython/pull/144) open that was trying to
implement basic arithmetic operations with arrays. This seems to have also
been proposed in CEP 518 (
https://github.com/cython/cython/wiki/enhancements-simd). What is the
status on this?

This is a feature I'd really like to see in Cython. I'd be happy to help
with it, though I don't currently know enough about the internals of Cython
to do it without some guidance. What would be the best way to do this? I've
had three ideas thus far that might allow something like this to work, but
I'd really like to get some second opinions before I get too far with any
of them.

1. Make these array operations available as a part of Cython's memory view
objects.

2. Write a Cython wrapper around a NumPy-like C++ library.
https://github.com/ContinuumIO/libdynd seems to be a good candidate, though
I'm not as familiar with it. (This doesn't strike me as something that
would be a part of the main Cython project, but it could be a good way to
make this sort of functionality available. It seems like a good way to
avoid a lot of duplicate work though.)

3. Write a Fortran backend for Cython that uses Fortran's arrays wherever
memory views are used.
(This could be a bit crazy, but with the iso_c_binding module, it might be
possible. It could also allow better optimizations for arrays to be applied
by the compiler.)

What would be the best way to go about this?

Thanks!

-Ian Henriksen
Julian Taylor
2014-08-05 19:38:38 UTC
Permalink
Post by Ian Henriksen
Hi all,
I noticed that some time ago there was a pull request
(https://github.com/cython/cython/pull/144) open that was trying to
implement basic arithmetic operations with arrays. This seems to have
also been proposed in CEP 518
(https://github.com/cython/cython/wiki/enhancements-simd). What is the
status on this?
This is a feature I'd really like to see in Cython. I'd be happy to help
with it, though I don't currently know enough about the internals of
Cython to do it without some guidance. What would be the best way to do
this? I've had three ideas thus far that might allow something like this
to work, but I'd really like to get some second opinions before I get
too far with any of them.
1. Make these array operations available as a part of Cython's memory
view objects.
2. Write a Cython wrapper around a NumPy-like C++
library. https://github.com/ContinuumIO/libdynd seems to be a good
candidate, though I'm not as familiar with it. (This doesn't strike me
as something that would be a part of the main Cython project, but it
could be a good way to make this sort of functionality available. It
seems like a good way to avoid a lot of duplicate work though.)
3. Write a Fortran backend for Cython that uses Fortran's arrays
wherever memory views are used.
(This could be a bit crazy, but with the iso_c_binding module, it might
be possible. It could also allow better optimizations for arrays to be
applied by the compiler.)
What would be the best way to go about this?
4. Use numpy itself. numpy since 1.8 has vectorized base math functions,
though for large arrays you have to apply manual blocking and inplace
operations to see significant gains. That is where cython can help.
Possibly numpy is also open to expose the raw data buffer functions in
its api to make blocking more efficient.
Ian Henriksen
2014-08-06 23:30:21 UTC
Permalink
Message: 3
Date: Tue, 05 Aug 2014 21:38:38 +0200
From: Julian Taylor <jtaylor.debian-gM/Ye1E23mwN+***@public.gmane.org>
To: Core developer mailing list of the Cython compiler
<cython-devel-+ZN9ApsXKcEdnm+***@public.gmane.org>
Subject: Re: [Cython] aritmetic with arrays in Cython
Message-ID: <53E132BE.3070207-gM/Ye1E23mwN+***@public.gmane.org>
Content-Type: text/plain; charset="windows-1252"
Post by Julian Taylor
Post by Ian Henriksen
Hi all,
I noticed that some time ago there was a pull request
(https://github.com/cython/cython/pull/144) open that was trying to
implement basic arithmetic operations with arrays. This seems to have
also been proposed in CEP 518
(https://github.com/cython/cython/wiki/enhancements-simd). What is the
status on this?
This is a feature I'd really like to see in Cython. I'd be happy to help
with it, though I don't currently know enough about the internals of
Cython to do it without some guidance. What would be the best way to do
this? I've had three ideas thus far that might allow something like this
to work, but I'd really like to get some second opinions before I get
too far with any of them.
1. Make these array operations available as a part of Cython's memory
view objects.
2. Write a Cython wrapper around a NumPy-like C++
library. https://github.com/ContinuumIO/libdynd seems to be a good
candidate, though I'm not as familiar with it. (This doesn't strike me
as something that would be a part of the main Cython project, but it
could be a good way to make this sort of functionality available. It
seems like a good way to avoid a lot of duplicate work though.)
3. Write a Fortran backend for Cython that uses Fortran's arrays
wherever memory views are used.
(This could be a bit crazy, but with the iso_c_binding module, it might
be possible. It could also allow better optimizations for arrays to be
applied by the compiler.)
What would be the best way to go about this?
4. Use numpy itself. numpy since 1.8 has vectorized base math functions,
though for large arrays you have to apply manual blocking and inplace
operations to see significant gains. That is where cython can help.
Possibly numpy is also open to expose the raw data buffer functions in
its api to make blocking more efficient.
Thank you for pointing that out. Using the machinery that is already built
in to numpy would be ideal. I hadn't realized that was a part of numpy's
API. If I understand correctly, you're referring to
https://github.com/numpy/numpy/blob/master/doc/neps/generalized-ufuncs.rst.
Is that the right portion of the API? I'll try and figure out how to use
it. Any working examples would be greatly appreciated. Other suggestions
are also still welcome
Thanks!
-Ian Henriksen
Matěj Laitl
2014-08-07 07:58:03 UTC
Permalink
Post by Ian Henriksen
Hi all,
I noticed that some time ago there was a pull request (
https://github.com/cython/cython/pull/144) open that was trying to
implement basic arithmetic operations with arrays. This seems to have also
been proposed in CEP 518 (
https://github.com/cython/cython/wiki/enhancements-simd). What is the
status on this?
This is a feature I'd really like to see in Cython. I'd be happy to help
with it, though I don't currently know enough about the internals of Cython
to do it without some guidance. What would be the best way to do this? I've
had three ideas thus far that might allow something like this to work, but
I'd really like to get some second opinions before I get too far with any
of them.
Hi,
you may also check out https://github.com/strohel/Ceygen if it would suit your
needs.

Cheers,
Matěj
Ian Henriksen
2014-08-07 22:58:26 UTC
Permalink
Nice module! That seems to implement most of the basic functionality I've
seen lacking. Ideally, I'm still trying to find a way to be able to do
something syntactically like this:

cpdef whateverfunction(double[:] a, float[:] b):
return a + a * b + .5

It would be ideal if that sort of thing were already a part of Cython. All
things considered though, most of what I'm looking for seems to be there.

Thanks!

-Ian Henriksen
Post by Ian Henriksen
Post by Ian Henriksen
Hi all,
I noticed that some time ago there was a pull request (
https://github.com/cython/cython/pull/144) open that was trying to
implement basic arithmetic operations with arrays. This seems to have
also
Post by Ian Henriksen
been proposed in CEP 518 (
https://github.com/cython/cython/wiki/enhancements-simd). What is the
status on this?
This is a feature I'd really like to see in Cython. I'd be happy to help
with it, though I don't currently know enough about the internals of
Cython
Post by Ian Henriksen
to do it without some guidance. What would be the best way to do this?
I've
Post by Ian Henriksen
had three ideas thus far that might allow something like this to work,
but
Post by Ian Henriksen
I'd really like to get some second opinions before I get too far with any
of them.
Hi,
you may also check out https://github.com/strohel/Ceygen if it would suit your
needs.
Cheers,
Matěj
Stefan Behnel
2014-08-08 05:05:08 UTC
Permalink
Post by Ian Henriksen
Post by Matěj Laitl
you may also check out https://github.com/strohel/Ceygen if it would suit
your needs.
Interesting project. Thanks for sharing.
Post by Ian Henriksen
Ideally, I'm still trying to find a way to be able to do
return a + a * b + .5
It would be ideal if that sort of thing were already a part of Cython. All
things considered though, most of what I'm looking for seems to be there.
Eigen can apparently do these things already:

http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html

It might not be all that much work to integrate this with Cython to enable
syntax support. Or have some kind of preprocessor, but I guess that's not
going to be simpler than have it in Cython as an optional AST transform.
Basically, whenever an arithmetic expression consists of a mixture of
memory views and (optionally) scalars, it could be mapped to the kind of
code that Ceygen uses, just at an arbitrarily complex expression length.

Specifically, this could be bound to the Matrix multiplication operator
(which Cython now supports anyway) and limited to C++ compilation mode
(which Eigen requires).

http://legacy.python.org/dev/peps/pep-0465/

Stefan
Matěj Laitl
2014-08-08 09:07:27 UTC
Permalink
Post by Stefan Behnel
Post by Ian Henriksen
Post by Matěj Laitl
you may also check out https://github.com/strohel/Ceygen if it would suit
your needs.
Interesting project. Thanks for sharing.
Post by Ian Henriksen
Ideally, I'm still trying to find a way to be able to do
return a + a * b + .5
It would be ideal if that sort of thing were already a part of Cython. All
things considered though, most of what I'm looking for seems to be there.
http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html
It might not be all that much work to integrate this with Cython to enable
syntax support. Or have some kind of preprocessor, but I guess that's not
going to be simpler than have it in Cython as an optional AST transform.
Basically, whenever an arithmetic expression consists of a mixture of
memory views and (optionally) scalars, it could be mapped to the kind of
code that Ceygen uses, just at an arbitrarily complex expression length.
Exactly! That was one of my ideas when I decided to create Ceygen, but I
resorted to take the easier route and made uglier static API because I didn't
(and still don't) have enough knowledge of Cython internals and Mark's "array
expressions" still looked like they are going to be merged.

What you suggest above would be definitely possible IMO, and it could easily
cherry-pick some work done on Ceygen, for example how memoryviews are wrapped
as Eigen arrays:
https://github.com/strohel/Ceygen/blob/master/ceygen/eigen_cpp.h

Unfortunately, I no longer work on the project (my master's thesis) that
triggered work on Ceygen, and now I can invest only little maintenance work to
it.

*However*, if such project is found worth to tacking by Cython devs, I can
imagine in my wild dreams that there may be a student at my former university
interested in doing it as their project. (this is rather optimistic statement,
please don't get too excited yet)

Regards,
Matěj
Stefan Behnel
2014-08-08 10:28:09 UTC
Permalink
Post by Matěj Laitl
Post by Stefan Behnel
http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html
It might not be all that much work to integrate this with Cython to enable
syntax support. Or have some kind of preprocessor, but I guess that's not
going to be simpler than have it in Cython as an optional AST transform.
Basically, whenever an arithmetic expression consists of a mixture of
memory views and (optionally) scalars, it could be mapped to the kind of
code that Ceygen uses, just at an arbitrarily complex expression length.
Exactly! That was one of my ideas when I decided to create Ceygen, but I
resorted to take the easier route and made uglier static API because I didn't
(and still don't) have enough knowledge of Cython internals and Mark's "array
expressions" still looked like they are going to be merged.
There are a couple of points in favour of Eigen/Ceygen. It's a way simpler
approach than moving the entire vectorised code generator into Cython. And
it lets Eigen do the hard work, leaving only the interfacing part to
Cython. Basically, both tools can do what they're good at. We're always
happy that there's a C compiler cleaning up after us to which we can
delegate the low level optimisation and tuning stuff. The same would apply
to Eigen. It's a much more cythonic approach than what's implemented in the
pie-in-the-sky array expressions branch.

It requires a C++ compiler, but I think that's an acceptable dependency
these days in places where array computation performance matters. People
even accept a dependency on huge tool stacks like Numba/LLVM for the same
kind of tasks. An array expression syntax is just so much simpler than
writing dumb nested loops.
Post by Matěj Laitl
Unfortunately, I no longer work on the project (my master's thesis) that
triggered work on Ceygen, and now I can invest only little maintenance work to
it.
I don't see me being the one to implement this, either. :)
Post by Matěj Laitl
*However*, if such project is found worth to tacking by Cython devs, I can
imagine in my wild dreams that there may be a student at my former university
interested in doing it as their project. (this is rather optimistic statement,
please don't get too excited yet)
I'm sure there are many people who would benefit from this. Some of them
might even be able to provide funding, if someone wants to pick up this
idea. But past experience shows that we shouldn't count on this either.
It's definitely a nice project to tackle for someone who likes coding.

Stefan
Ian Henriksen
2014-08-08 16:47:56 UTC
Permalink
I'd like to work on it. That was why I asked. It's been bothering me for
some time that we don't have this feature yet. This will take me some time
though since I'm not at all familiar with Cython's internals. I've used it
in relatively small settings for speeding up array operations and wrapping
external functions, but I've never had occasion to dig through much of what
it does internally. Some help navigating Cython's internals would be
greatly appreciated.

I'll also have to take some time to become more familiar with eigen itself.
Most of my work has used numpy arrays.

Eigen seems to be ideal for this since it allows us to offload the
expression analysis to another package where it is already
well-implemented. It looks like they have an array class for n-dimensional
array operations as well. My guess is that that would allow a fairly
natural translation from memory views to eigen arrays.

For now, I'll work on learning some of the details of how to use eigen.
Where would be a good place to start from there?

Would it be more helpful for me to start working on Cython bindings for
eigen, or to work on adding this directly as a part of Cython? Where should
I look in Cython's source code to do that? What kinds of modifications
would be necessary?

Thanks!

-Ian
Post by Stefan Behnel
Post by Matěj Laitl
Post by Stefan Behnel
http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html
It might not be all that much work to integrate this with Cython to
enable
Post by Matěj Laitl
Post by Stefan Behnel
syntax support. Or have some kind of preprocessor, but I guess that's
not
Post by Matěj Laitl
Post by Stefan Behnel
going to be simpler than have it in Cython as an optional AST transform.
Basically, whenever an arithmetic expression consists of a mixture of
memory views and (optionally) scalars, it could be mapped to the kind of
code that Ceygen uses, just at an arbitrarily complex expression length.
Exactly! That was one of my ideas when I decided to create Ceygen, but I
resorted to take the easier route and made uglier static API because I
didn't
Post by Matěj Laitl
(and still don't) have enough knowledge of Cython internals and Mark's
"array
Post by Matěj Laitl
expressions" still looked like they are going to be merged.
There are a couple of points in favour of Eigen/Ceygen. It's a way simpler
approach than moving the entire vectorised code generator into Cython. And
it lets Eigen do the hard work, leaving only the interfacing part to
Cython. Basically, both tools can do what they're good at. We're always
happy that there's a C compiler cleaning up after us to which we can
delegate the low level optimisation and tuning stuff. The same would apply
to Eigen. It's a much more cythonic approach than what's implemented in the
pie-in-the-sky array expressions branch.
It requires a C++ compiler, but I think that's an acceptable dependency
these days in places where array computation performance matters. People
even accept a dependency on huge tool stacks like Numba/LLVM for the same
kind of tasks. An array expression syntax is just so much simpler than
writing dumb nested loops.
Post by Matěj Laitl
Unfortunately, I no longer work on the project (my master's thesis) that
triggered work on Ceygen, and now I can invest only little maintenance
work to
Post by Matěj Laitl
it.
I don't see me being the one to implement this, either. :)
Post by Matěj Laitl
*However*, if such project is found worth to tacking by Cython devs, I
can
Post by Matěj Laitl
imagine in my wild dreams that there may be a student at my former
university
Post by Matěj Laitl
interested in doing it as their project. (this is rather optimistic
statement,
Post by Matěj Laitl
please don't get too excited yet)
I'm sure there are many people who would benefit from this. Some of them
might even be able to provide funding, if someone wants to pick up this
idea. But past experience shows that we shouldn't count on this either.
It's definitely a nice project to tackle for someone who likes coding.
Stefan
_______________________________________________
cython-devel mailing list
https://mail.python.org/mailman/listinfo/cython-devel
Stefan Behnel
2014-08-08 17:36:26 UTC
Permalink
Hi,

please don't top-post.
Post by Ian Henriksen
I'd like to work on it. That was why I asked. It's been bothering me for
some time that we don't have this feature yet. This will take me some time
though since I'm not at all familiar with Cython's internals. I've used it
in relatively small settings for speeding up array operations and wrapping
external functions, but I've never had occasion to dig through much of what
it does internally. Some help navigating Cython's internals would be
greatly appreciated.
Here's a little intro.

https://github.com/cython/cython/wiki/HackerGuide#getting-started

I recommend to start by writing a simple test that does some integer
arithmetic and enable code generation traces for AST nodes in the generated
C code. "cython -a" should make this quite navigable.
Post by Ian Henriksen
I'll also have to take some time to become more familiar with eigen itself.
Most of my work has used numpy arrays.
Eigen seems to be ideal for this since it allows us to offload the
expression analysis to another package where it is already
well-implemented. It looks like they have an array class for n-dimensional
array operations as well. My guess is that that would allow a fairly
natural translation from memory views to eigen arrays.
You should dig into the code that Ceygen uses for the mapping. That's most
likely the best start you can get.
Post by Ian Henriksen
Would it be more helpful for me to start working on Cython bindings for
eigen, or to work on adding this directly as a part of Cython? Where should
I look in Cython's source code to do that? What kinds of modifications
would be necessary?
My guess is that this is best implemented with a set of matrix specific AST
nodes for the arithmetic operators. There is a recursive tree processing
phase in Cython's pipeline called "AnalyseExpressionsTransform" or type
analysis, where it figures out what variables reference memory views, what
the result type of an arithmetic expression is, etc. This phase can easily
replace nodes by returning new nodes from the analyse_types() methods of
AST nodes. Currently, arithmetic with memory views is not allowed, so this
needs to be enabled in the corresponding NumBinopNode AST subclasses
(AddNode, MatMulNode, etc. in ExprNodes.py). Write a test and debug it to
see where it fails.

I'm not sure yet how the code generation would look like. In simple cases,
you may be able to implement this into the existing AST arithmetic operator
nodes. But my guess is that you want to have a new set of nodes (probably
in a new module in Cython/Compiler/, e.g. EigenNodes.py) that the
analyse_types() methods of the generic arithmetic AST nodes return instead
of themselves when they determine that they are operating on memory views.
Those would then know how to generate matrix specific arithmetic C++/Eigen
code, potentially even inherit from the general arithmetic nodes to
override only the code generation method. There is a
NewNodeClass.from_node() class method to create a new node from an existing
one.

These specialised AST nodes will be optional in the end, Eigen should only
be needed if in C++ mode and this feature is enabled and used, otherwise,
array arithmetic will continue to be a compile time error.

Try to get something simple like binary matrix multiplication working for
now, or even just matrix x scalar. Map them to Ceygen's API to get a quick
start. Once that's done, try building up the code generation for more
complex nested expressions.

Stefan
Julian Taylor
2014-08-08 18:02:47 UTC
Permalink
Post by Stefan Behnel
Hi,
please don't top-post.
Post by Ian Henriksen
I'd like to work on it. That was why I asked. It's been bothering me for
some time that we don't have this feature yet. This will take me some time
though since I'm not at all familiar with Cython's internals. I've used it
in relatively small settings for speeding up array operations and wrapping
external functions, but I've never had occasion to dig through much of what
it does internally. Some help navigating Cython's internals would be
greatly appreciated.
Here's a little intro.
https://github.com/cython/cython/wiki/HackerGuide#getting-started
I recommend to start by writing a simple test that does some integer
arithmetic and enable code generation traces for AST nodes in the generated
C code. "cython -a" should make this quite navigable.
Post by Ian Henriksen
I'll also have to take some time to become more familiar with eigen itself.
Most of my work has used numpy arrays.
Eigen seems to be ideal for this since it allows us to offload the
expression analysis to another package where it is already
well-implemented. It looks like they have an array class for n-dimensional
array operations as well. My guess is that that would allow a fairly
natural translation from memory views to eigen arrays.
You should dig into the code that Ceygen uses for the mapping. That's most
likely the best start you can get.
Post by Ian Henriksen
Would it be more helpful for me to start working on Cython bindings for
eigen, or to work on adding this directly as a part of Cython? Where should
I look in Cython's source code to do that? What kinds of modifications
would be necessary?
My guess is that this is best implemented with a set of matrix specific AST
nodes for the arithmetic operators. There is a recursive tree processing
phase in Cython's pipeline called "AnalyseExpressionsTransform" or type
analysis, where it figures out what variables reference memory views, what
the result type of an arithmetic expression is, etc. This phase can easily
replace nodes by returning new nodes from the analyse_types() methods of
AST nodes. Currently, arithmetic with memory views is not allowed, so this
needs to be enabled in the corresponding NumBinopNode AST subclasses
(AddNode, MatMulNode, etc. in ExprNodes.py). Write a test and debug it to
see where it fails.
If someone wants to work on an ast transformer for array expressions I
would not restrict its output to something ceygen can parse but also numpy.
E.g. transforming:

cpdef whateverfunction(double[:] a, float[:] b):
return a + a * b + .5

to

cpdef whateverfunction(double[:] a, float[:] b):
bs = 10000
r = empty_like(a)
for i in range(0, a.size, bs):
u = min(i + bs, a.size)
r[i:u] = a[i:u] * b[i:u]
r[i:u] += a[i:u]
r[i:u] += 0.5
return r

this is already very efficient in numpy.
Ian Henriksen
2014-08-08 21:09:05 UTC
Permalink
On Fri, Aug 8, 2014 at 12:02 PM, Julian Taylor <
Post by Ian Henriksen
Post by Stefan Behnel
Hi,
please don't top-post.
Post by Ian Henriksen
I'd like to work on it. That was why I asked. It's been bothering me for
some time that we don't have this feature yet. This will take me some
time
Post by Stefan Behnel
Post by Ian Henriksen
though since I'm not at all familiar with Cython's internals. I've used
it
Post by Stefan Behnel
Post by Ian Henriksen
in relatively small settings for speeding up array operations and
wrapping
Post by Stefan Behnel
Post by Ian Henriksen
external functions, but I've never had occasion to dig through much of
what
Post by Stefan Behnel
Post by Ian Henriksen
it does internally. Some help navigating Cython's internals would be
greatly appreciated.
Here's a little intro.
https://github.com/cython/cython/wiki/HackerGuide#getting-started
I recommend to start by writing a simple test that does some integer
arithmetic and enable code generation traces for AST nodes in the
generated
Post by Stefan Behnel
C code. "cython -a" should make this quite navigable.
Post by Ian Henriksen
I'll also have to take some time to become more familiar with eigen
itself.
Post by Stefan Behnel
Post by Ian Henriksen
Most of my work has used numpy arrays.
Eigen seems to be ideal for this since it allows us to offload the
expression analysis to another package where it is already
well-implemented. It looks like they have an array class for
n-dimensional
Post by Stefan Behnel
Post by Ian Henriksen
array operations as well. My guess is that that would allow a fairly
natural translation from memory views to eigen arrays.
You should dig into the code that Ceygen uses for the mapping. That's
most
Post by Stefan Behnel
likely the best start you can get.
Post by Ian Henriksen
Would it be more helpful for me to start working on Cython bindings for
eigen, or to work on adding this directly as a part of Cython? Where
should
Post by Stefan Behnel
Post by Ian Henriksen
I look in Cython's source code to do that? What kinds of modifications
would be necessary?
My guess is that this is best implemented with a set of matrix specific
AST
Post by Stefan Behnel
nodes for the arithmetic operators. There is a recursive tree processing
phase in Cython's pipeline called "AnalyseExpressionsTransform" or type
analysis, where it figures out what variables reference memory views,
what
Post by Stefan Behnel
the result type of an arithmetic expression is, etc. This phase can
easily
Post by Stefan Behnel
replace nodes by returning new nodes from the analyse_types() methods of
AST nodes. Currently, arithmetic with memory views is not allowed, so
this
Post by Stefan Behnel
needs to be enabled in the corresponding NumBinopNode AST subclasses
(AddNode, MatMulNode, etc. in ExprNodes.py). Write a test and debug it to
see where it fails.
If someone wants to work on an ast transformer for array expressions I
would not restrict its output to something ceygen can parse but also numpy.
return a + a * b + .5
to
bs = 10000
r = empty_like(a)
u = min(i + bs, a.size)
r[i:u] = a[i:u] * b[i:u]
r[i:u] += a[i:u]
r[i:u] += 0.5
return r
this is already very efficient in numpy.
Sorry about the top post.

Stefan, thanks for the direction. I think I know how to proceed now. It'll
probably take me some time to learn how to do all this, but I'll start
reading/working on it now.

Julian, from a dependency standpoint I'd really prefer to use numpy. As
near as I can tell, the people most likely to use this feature already use
numpy. That would also remove the C++ dependency for this. I think the main
reason for using eigen is that it will be easy to interface with and will
eliminate temporaries automatically. In the long term, it could be good to
offer multiple backends for these sorts of operations. Numpy would be a
primary candidate for something like that. I'll spend some time working
with numpy's ufunc api to see if I can figure out a good way to use that
instead.
Thanks!
-Ian
Ian Henriksen
2014-08-08 22:15:29 UTC
Permalink
On Fri, Aug 8, 2014 at 3:09 PM, Ian Henriksen <
Post by Ian Henriksen
On Fri, Aug 8, 2014 at 12:02 PM, Julian Taylor <
Post by Ian Henriksen
Post by Stefan Behnel
Hi,
please don't top-post.
Post by Ian Henriksen
I'd like to work on it. That was why I asked. It's been bothering me
for
Post by Stefan Behnel
Post by Ian Henriksen
some time that we don't have this feature yet. This will take me some
time
Post by Stefan Behnel
Post by Ian Henriksen
though since I'm not at all familiar with Cython's internals. I've
used it
Post by Stefan Behnel
Post by Ian Henriksen
in relatively small settings for speeding up array operations and
wrapping
Post by Stefan Behnel
Post by Ian Henriksen
external functions, but I've never had occasion to dig through much of
what
Post by Stefan Behnel
Post by Ian Henriksen
it does internally. Some help navigating Cython's internals would be
greatly appreciated.
Here's a little intro.
https://github.com/cython/cython/wiki/HackerGuide#getting-started
I recommend to start by writing a simple test that does some integer
arithmetic and enable code generation traces for AST nodes in the
generated
Post by Stefan Behnel
C code. "cython -a" should make this quite navigable.
Post by Ian Henriksen
I'll also have to take some time to become more familiar with eigen
itself.
Post by Stefan Behnel
Post by Ian Henriksen
Most of my work has used numpy arrays.
Eigen seems to be ideal for this since it allows us to offload the
expression analysis to another package where it is already
well-implemented. It looks like they have an array class for
n-dimensional
Post by Stefan Behnel
Post by Ian Henriksen
array operations as well. My guess is that that would allow a fairly
natural translation from memory views to eigen arrays.
You should dig into the code that Ceygen uses for the mapping. That's
most
Post by Stefan Behnel
likely the best start you can get.
Post by Ian Henriksen
Would it be more helpful for me to start working on Cython bindings for
eigen, or to work on adding this directly as a part of Cython? Where
should
Post by Stefan Behnel
Post by Ian Henriksen
I look in Cython's source code to do that? What kinds of modifications
would be necessary?
My guess is that this is best implemented with a set of matrix specific
AST
Post by Stefan Behnel
nodes for the arithmetic operators. There is a recursive tree processing
phase in Cython's pipeline called "AnalyseExpressionsTransform" or type
analysis, where it figures out what variables reference memory views,
what
Post by Stefan Behnel
the result type of an arithmetic expression is, etc. This phase can
easily
Post by Stefan Behnel
replace nodes by returning new nodes from the analyse_types() methods of
AST nodes. Currently, arithmetic with memory views is not allowed, so
this
Post by Stefan Behnel
needs to be enabled in the corresponding NumBinopNode AST subclasses
(AddNode, MatMulNode, etc. in ExprNodes.py). Write a test and debug it
to
Post by Stefan Behnel
see where it fails.
If someone wants to work on an ast transformer for array expressions I
would not restrict its output to something ceygen can parse but also numpy.
return a + a * b + .5
to
bs = 10000
r = empty_like(a)
u = min(i + bs, a.size)
r[i:u] = a[i:u] * b[i:u]
r[i:u] += a[i:u]
r[i:u] += 0.5
return r
this is already very efficient in numpy.
Sorry about the top post.
Stefan, thanks for the direction. I think I know how to proceed now. It'll
probably take me some time to learn how to do all this, but I'll start
reading/working on it now.
Julian, from a dependency standpoint I'd really prefer to use numpy. As
near as I can tell, the people most likely to use this feature already use
numpy. That would also remove the C++ dependency for this. I think the main
reason for using eigen is that it will be easy to interface with and will
eliminate temporaries automatically. In the long term, it could be good to
offer multiple backends for these sorts of operations. Numpy would be a
primary candidate for something like that. I'll spend some time working
with numpy's ufunc api to see if I can figure out a good way to use that
instead.
Thanks!
-Ian
Maybe I should clarify a little about why eigen is a good place to start.
According to http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html it
already takes care of things like the elimination of temporary variables
and common subexpression reduction at compile time. This should make it
possible to compile array expressions in Cython without having to
re-implement those sorts of optimizations. Ideally we would just have to
map memory view operations to corresponding equivalents from eigen. It's
not yet clear to me how to do things with arbitrary-dimensional arrays or
broadcasting, but, given some more time, a solution may present itself.
-Ian
Sturla Molden
2014-08-10 18:41:31 UTC
Permalink
Ian Henriksen
Post by Ian Henriksen
Maybe I should clarify a little about why eigen is a good place to start.
According to <a
href="http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html">http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html</a>
it
already takes care of things like the elimination of temporary variables
and common subexpression reduction at compile time. This should make it
possible to compile array expressions in Cython without having to
re-implement those sorts of optimizations. Ideally we would just have to
map memory view operations to corresponding equivalents from eigen. It's
not yet clear to me how to do things with arbitrary-dimensional arrays or
broadcasting, but, given some more time, a solution may present itself.
-Ian
cilkplus is what you want, not Eigen.

But if you are serious about number crunching, learn Fortran 95.


Sturla
Ian Henriksen
2014-08-12 02:34:36 UTC
Permalink
Post by Ian Henriksen
Ian Henriksen
Post by Ian Henriksen
Maybe I should clarify a little about why eigen is a good place to start.
According to <a
href="http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html">
http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html</a>
Post by Ian Henriksen
it
already takes care of things like the elimination of temporary variables
and common subexpression reduction at compile time. This should make it
possible to compile array expressions in Cython without having to
re-implement those sorts of optimizations. Ideally we would just have to
map memory view operations to corresponding equivalents from eigen. It's
not yet clear to me how to do things with arbitrary-dimensional arrays or
broadcasting, but, given some more time, a solution may present itself.
-Ian
cilkplus is what you want, not Eigen.
But if you are serious about number crunching, learn Fortran 95.
Sturla
_______________________________________________
cython-devel mailing list
https://mail.python.org/mailman/listinfo/cython-devel
Cilk Plus would also work really nicely for this. Thanks for the suggestion.
Fortran is a really great language for this sort of thing, but I don't
think I'm ready to tackle the difficulties of using it as a backend for
array arithmetic in Cython. It would be a really great feature to have
later on though.
Thanks!
-Ian
Stefan Behnel
2014-08-12 05:46:33 UTC
Permalink
Post by Ian Henriksen
Post by Ian Henriksen
Post by Ian Henriksen
Maybe I should clarify a little about why eigen is a good place to start.
According to <a
href="http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html">
http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html</a>
Post by Ian Henriksen
it
already takes care of things like the elimination of temporary variables
and common subexpression reduction at compile time. This should make it
possible to compile array expressions in Cython without having to
re-implement those sorts of optimizations. Ideally we would just have to
map memory view operations to corresponding equivalents from eigen. It's
not yet clear to me how to do things with arbitrary-dimensional arrays or
broadcasting, but, given some more time, a solution may present itself.
-Ian
cilkplus is what you want, not Eigen.
But if you are serious about number crunching, learn Fortran 95.
Cilk Plus would also work really nicely for this. Thanks for the suggestion.
Fortran is a really great language for this sort of thing, but I don't
think I'm ready to tackle the difficulties of using it as a backend for
array arithmetic in Cython. It would be a really great feature to have
later on though.
That clarifies a bit of the design then: The syntax support should be
somewhat generic, with specialised (sets of) node implementations as
backends that generate code for different libraries/compilers/languages.

It's ok to start only with Eigen, though. We have working example code for
it and everything else has either a much higher entry level for the
implementation or a much lower general availability of the required tools.

For the syntax/type support, a look at the array expressions branch might
also be helpful, although I doubt that there really is all that much to do
on that front.

Stefan
Sturla Molden
2014-08-12 15:18:12 UTC
Permalink
But using Eigen will taint the output with Eigen's license, since the Eigen
library is statically linked. OTOH, Cilkplus is just a compiler extension
for C and C++.

AFAIK, it is currently available for Intel C++ and Clang (also by Intel)
and GCC 4.9. On MSVC I believe it requires Intel Parallel Studio. The main
obstacle to its adoption today is MSVC and GCC 4.8. Cilkplus is also being
evaluated for becoming parts of the future C and C++ standards.

Another thing to observe is that Eigen depends on the C++ compiler to elide
temporary arrays. Currently it only/mostly happens for fixed size arrays.
Clikplus arrays does not suffer from this. You just index pointers like
typed memory views (though the indexing syntax is slightly different), and
it compiles to efficient machine code just like Fortran 90 array code.

Sturla
Post by Stefan Behnel
Post by Ian Henriksen
Post by Ian Henriksen
Post by Ian Henriksen
Maybe I should clarify a little about why eigen is a good place to start.
According to <a
href="http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html">
http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html</a>
Post by Ian Henriksen
it
already takes care of things like the elimination of temporary variables
and common subexpression reduction at compile time. This should make it
possible to compile array expressions in Cython without having to
re-implement those sorts of optimizations. Ideally we would just have to
map memory view operations to corresponding equivalents from eigen. It's
not yet clear to me how to do things with arbitrary-dimensional arrays or
broadcasting, but, given some more time, a solution may present itself.
-Ian
cilkplus is what you want, not Eigen.
But if you are serious about number crunching, learn Fortran 95.
Cilk Plus would also work really nicely for this. Thanks for the suggestion.
Fortran is a really great language for this sort of thing, but I don't
think I'm ready to tackle the difficulties of using it as a backend for
array arithmetic in Cython. It would be a really great feature to have
later on though.
That clarifies a bit of the design then: The syntax support should be
somewhat generic, with specialised (sets of) node implementations as
backends that generate code for different libraries/compilers/languages.
It's ok to start only with Eigen, though. We have working example code for
it and everything else has either a much higher entry level for the
implementation or a much lower general availability of the required tools.
For the syntax/type support, a look at the array expressions branch might
also be helpful, although I doubt that there really is all that much to do
on that front.
Stefan
Stefan Behnel
2014-08-12 15:32:26 UTC
Permalink
Post by Sturla Molden
But using Eigen will taint the output with Eigen's license
Which is ok for many users, definitely those who only run their own code
locally. And the others can eventually use a different backend.
Post by Sturla Molden
OTOH, Cilkplus is just a compiler extension for C and C++.
Which isn't available in all compilers. But anyway, if you write the code,
I have no problem with Cilkplus becoming the first backend to support.

Stefan
Matěj Laitl
2014-08-12 16:13:35 UTC
Permalink
Post by Sturla Molden
But using Eigen will taint the output with Eigen's license, since the Eigen
library is statically linked.
There is no such thing as "Eigen library". Eigen is fully implemented in
header files. Cython would just generate C++ code that uses Eigen API.

Also, Eigen is now licensed under MPL2, which is file-level and permissive one
[1]. I don't understand what you mean by 'taint' above and how this could
restrict someone license-wise.

Regards,
Matěj

[1] http://eigen.tuxfamily.org/index.php?title=Licensing_FAQ
Matěj Laitl
2014-08-12 16:22:46 UTC
Permalink
Post by Sturla Molden
Another thing to observe is that Eigen depends on the C++ compiler to elide
temporary arrays.
Either I don't understand you, or you don't understand Eigen. Eigen overloads
operator=() to circumvent need for temporary arrays. It is *not* an
optimisation on the compiler part, the code is explicitly written to do so,
where it is beneficial for performance [1].

Matěj

[1] http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html
Stefan Behnel
2014-10-24 08:36:56 UTC
Permalink
Post by Ian Henriksen
Post by Ian Henriksen
Stefan, thanks for the direction. I think I know how to proceed now. It'll
probably take me some time to learn how to do all this, but I'll start
reading/working on it now.
Julian, from a dependency standpoint I'd really prefer to use numpy. As
near as I can tell, the people most likely to use this feature already use
numpy. That would also remove the C++ dependency for this. I think the main
reason for using eigen is that it will be easy to interface with and will
eliminate temporaries automatically. In the long term, it could be good to
offer multiple backends for these sorts of operations. Numpy would be a
primary candidate for something like that. I'll spend some time working
with numpy's ufunc api to see if I can figure out a good way to use that
instead.
Maybe I should clarify a little about why eigen is a good place to start.
According to http://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html it
already takes care of things like the elimination of temporary variables
and common subexpression reduction at compile time. This should make it
possible to compile array expressions in Cython without having to
re-implement those sorts of optimizations. Ideally we would just have to
map memory view operations to corresponding equivalents from eigen. It's
not yet clear to me how to do things with arbitrary-dimensional arrays or
broadcasting, but, given some more time, a solution may present itself.
Did anything come out of this yet?

Stefan

Loading...