Discussion:
[Cython] Recent bugs in generators
Vitja Makarov
2011-04-17 15:57:28 UTC
Permalink
Hi!
list((lambda:((yield 1), (yield 2)))())
[1, 2, (None, None)]
list((lambda:((yield 1), (yield 2)))())
[1, 2]

2. GeneratorExit is initialized to StopIteration when running
generators_py doctests

This is strange behaviour of doctest module, try this:
x = __builtins__
def foo():
"""
type(x)
<type 'module'>
"""

3. check_yield_in_exception()

Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.

I'm not sure how to fix this.

import sys

def foo():
"""
list(foo())
[<type 'exceptions.ValueError'>, None]
"""
try:
raise ValueError
except ValueError:
yield sys.exc_info()[0]
yield sys.exc_info()[0] # exc_info is lost here


Here is quick fix for 1 and 2 https://github.com/cython/cython/pull/25
--
vitja.
Stefan Behnel
2011-04-17 21:05:45 UTC
Permalink
Post by Vitja Makarov
3. check_yield_in_exception()
I added this because I found a failing pyregr test that uses it (testing
Post by Vitja Makarov
Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.
I'm not sure how to fix this.
I'm not completely sure either.
Post by Vitja Makarov
import sys
"""
list(foo())
[<type 'exceptions.ValueError'>, None]
"""
raise ValueError
yield sys.exc_info()[0]
yield sys.exc_info()[0] # exc_info is lost here
I think (!), the difference here is that CPython actually keeps the
exception in the generator frame. We don't have a frame, so we have to
emulate it using the closure class. I guess we'll have to store away the
exception into the closure when we yield while an exception is being
handled, and restore it afterwards. Note: this is not the exception that is
freshly *being* raised (the "_cur*" fields in the thread state), it's the
exception that *was* raised and is now being handled, i.e. the thread state
fields without the "_cur", that are reflected by sys.exc_info().

Stefan
Vitja Makarov
2011-04-18 04:38:20 UTC
Permalink
Post by Vitja Makarov
3. check_yield_in_exception()
I added this because I found a failing pyregr test that uses it (testing the
@contextmanager decorator).
Post by Vitja Makarov
Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.
I'm not sure how to fix this.
I'm not completely sure either.
Post by Vitja Makarov
import sys
    """
    >>>  list(foo())
    [<type 'exceptions.ValueError'>, None]
    """
        raise ValueError
        yield sys.exc_info()[0]
        yield sys.exc_info()[0] # exc_info is lost here
I think (!), the difference here is that CPython actually keeps the
exception in the generator frame. We don't have a frame, so we have to
emulate it using the closure class. I guess we'll have to store away the
exception into the closure when we yield while an exception is being
handled, and restore it afterwards. Note: this is not the exception that is
freshly *being* raised (the "_cur*" fields in the thread state), it's the
exception that *was* raised and is now being handled, i.e. the thread state
fields without the "_cur", that are reflected by sys.exc_info().
Interesting difference between py2 and py3:

def foo():
try:
raise ValueError
except ValueError:
yield
raise
list(foo())

File "xxx.py", line 7, in <module>
list(foo())
File "xxx.py", line 6, in foo
raise
TypeError: exceptions must be old-style classes or derived from
BaseException, not NoneType

It seems that exception info is completely lost (tried 2.6, 2.7) and
seems to be fixed in python3.

Btw exception info temps are already saved and restored between yields.
--
vitja.
Stefan Behnel
2011-04-18 04:57:27 UTC
Permalink
Post by Vitja Makarov
Post by Vitja Makarov
3. check_yield_in_exception()
I added this because I found a failing pyregr test that uses it (testing the
@contextmanager decorator).
Post by Vitja Makarov
Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.
I'm not sure how to fix this.
I'm not completely sure either.
Post by Vitja Makarov
import sys
"""
list(foo())
[<type 'exceptions.ValueError'>, None]
"""
raise ValueError
yield sys.exc_info()[0]
yield sys.exc_info()[0] # exc_info is lost here
I think (!), the difference here is that CPython actually keeps the
exception in the generator frame. We don't have a frame, so we have to
emulate it using the closure class. I guess we'll have to store away the
exception into the closure when we yield while an exception is being
handled, and restore it afterwards. Note: this is not the exception that is
freshly *being* raised (the "_cur*" fields in the thread state), it's the
exception that *was* raised and is now being handled, i.e. the thread state
fields without the "_cur", that are reflected by sys.exc_info().
raise ValueError
yield
raise
list(foo())
File "xxx.py", line 7, in<module>
list(foo())
File "xxx.py", line 6, in foo
raise
TypeError: exceptions must be old-style classes or derived from
BaseException, not NoneType
It seems that exception info is completely lost (tried 2.6, 2.7) and
seems to be fixed in python3.
Not surprising. The implementation is completely different in Py2 and Py3,
both in CPython and in Cython. It's actually much simpler in Cython under
Py3, due to better semantics and C-API support. That also implies that
there's much less Cython can do wrong in that environment. ;-)
Post by Vitja Makarov
Btw exception info temps are already saved and restored between yields.
Right, but the exc_info itself is not reset and recovered around the yield.
As I said above, generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.

Stefan
Vitja Makarov
2011-04-18 13:19:18 UTC
Permalink
Post by Stefan Behnel
Post by Vitja Makarov
Post by Vitja Makarov
3. check_yield_in_exception()
I added this because I found a failing pyregr test that uses it (testing the
@contextmanager decorator).
Post by Vitja Makarov
Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.
I'm not sure how to fix this.
I'm not completely sure either.
Post by Vitja Makarov
import sys
    """
    >>>    list(foo())
    [<type 'exceptions.ValueError'>, None]
    """
        raise ValueError
        yield sys.exc_info()[0]
        yield sys.exc_info()[0] # exc_info is lost here
I think (!), the difference here is that CPython actually keeps the
exception in the generator frame. We don't have a frame, so we have to
emulate it using the closure class. I guess we'll have to store away the
exception into the closure when we yield while an exception is being
handled, and restore it afterwards. Note: this is not the exception that is
freshly *being* raised (the "_cur*" fields in the thread state), it's the
exception that *was* raised and is now being handled, i.e. the thread state
fields without the "_cur", that are reflected by sys.exc_info().
        raise ValueError
        yield
        raise
list(foo())
  File "xxx.py", line 7, in<module>
    list(foo())
  File "xxx.py", line 6, in foo
    raise
TypeError: exceptions must be old-style classes or derived from
BaseException, not NoneType
It seems that exception info is completely lost (tried 2.6, 2.7) and
seems to be fixed in python3.
Not surprising. The implementation is completely different in Py2 and Py3,
both in CPython and in Cython. It's actually much simpler in Cython under
Py3, due to better semantics and C-API support. That also implies that
there's much less Cython can do wrong in that environment. ;-)
Post by Vitja Makarov
Btw exception info temps are already saved and restored between yields.
Right, but the exc_info itself is not reset and recovered around the yield.
As I said above, generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
There is one more interesting thing:

def test_yield_inside_genexp():
"""
Post by Stefan Behnel
Post by Vitja Makarov
o = test_yield_inside_genexp()
list(o)
[0, None, 1, None, 2, None]
"""
return ((yield i) for i in range(3))
--
vitja.
Stefan Behnel
2011-04-19 05:17:45 UTC
Permalink
Post by Vitja Makarov
"""
o = test_yield_inside_genexp()
list(o)
[0, None, 1, None, 2, None]
"""
return ((yield i) for i in range(3))
Impressively ugly.

I guess we should start testing these things with other Python
implementations than CPython, in order to see if these corner cases are
really being accepted and knowingly being implemented, or if it's just
something that no-one has really cared about so far and that's actually
worth discussing.

Stefan
Vitja Makarov
2011-04-19 05:49:19 UTC
Permalink
Post by Stefan Behnel
Post by Vitja Makarov
    """
    >>>  o = test_yield_inside_genexp()
    >>>  list(o)
    [0, None, 1, None, 2, None]
    """
    return ((yield i) for i in range(3))
Impressively ugly.
I guess we should start testing these things with other Python
implementations than CPython, in order to see if these corner cases are
really being accepted and knowingly being implemented, or if it's just
something that no-one has really cared about so far and that's actually
worth discussing.
The case above is yield inside genexp and it works now.

Btw we've found one more funny thing this time with parser: 0lor(1)
--
vitja.
Vitja Makarov
2011-04-20 08:26:46 UTC
Permalink
Post by Stefan Behnel
Post by Vitja Makarov
Post by Vitja Makarov
3. check_yield_in_exception()
I added this because I found a failing pyregr test that uses it (testing the
@contextmanager decorator).
Post by Vitja Makarov
Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.
I'm not sure how to fix this.
I'm not completely sure either.
Post by Vitja Makarov
import sys
    """
    >>>    list(foo())
    [<type 'exceptions.ValueError'>, None]
    """
        raise ValueError
        yield sys.exc_info()[0]
        yield sys.exc_info()[0] # exc_info is lost here
I think (!), the difference here is that CPython actually keeps the
exception in the generator frame. We don't have a frame, so we have to
emulate it using the closure class. I guess we'll have to store away the
exception into the closure when we yield while an exception is being
handled, and restore it afterwards. Note: this is not the exception that is
freshly *being* raised (the "_cur*" fields in the thread state), it's the
exception that *was* raised and is now being handled, i.e. the thread state
fields without the "_cur", that are reflected by sys.exc_info().
        raise ValueError
        yield
        raise
list(foo())
  File "xxx.py", line 7, in<module>
    list(foo())
  File "xxx.py", line 6, in foo
    raise
TypeError: exceptions must be old-style classes or derived from
BaseException, not NoneType
It seems that exception info is completely lost (tried 2.6, 2.7) and
seems to be fixed in python3.
Not surprising. The implementation is completely different in Py2 and Py3,
both in CPython and in Cython. It's actually much simpler in Cython under
Py3, due to better semantics and C-API support. That also implies that
there's much less Cython can do wrong in that environment. ;-)
Post by Vitja Makarov
Btw exception info temps are already saved and restored between yields.
Right, but the exc_info itself is not reset and recovered around the yield.
As I said above, generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
Please review:
https://github.com/vitek/cython/commit/73014aaed10b82a3f632d7f86212f86280c55858

I've added __Pyx_Generator_SwapExceptions() method and call it right
before resume switch and before return from yield. It swaps generator
exception state with thread local.
--
vitja.
Stefan Behnel
2011-04-20 09:43:49 UTC
Permalink
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Vitja Makarov
3. check_yield_in_exception()
I added this because I found a failing pyregr test that uses it (testing the
@contextmanager decorator).
Post by Vitja Makarov
Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.
I'm not sure how to fix this.
I'm not completely sure either.
Post by Vitja Makarov
import sys
"""
list(foo())
[<type 'exceptions.ValueError'>, None]
"""
raise ValueError
yield sys.exc_info()[0]
yield sys.exc_info()[0] # exc_info is lost here
I think (!), the difference here is that CPython actually keeps the
exception in the generator frame. We don't have a frame, so we have to
emulate it using the closure class. I guess we'll have to store away the
exception into the closure when we yield while an exception is being
handled, and restore it afterwards. Note: this is not the exception that is
freshly *being* raised (the "_cur*" fields in the thread state), it's the
exception that *was* raised and is now being handled, i.e. the thread state
fields without the "_cur", that are reflected by sys.exc_info().
raise ValueError
yield
raise
list(foo())
File "xxx.py", line 7, in<module>
list(foo())
File "xxx.py", line 6, in foo
raise
TypeError: exceptions must be old-style classes or derived from
BaseException, not NoneType
It seems that exception info is completely lost (tried 2.6, 2.7) and
seems to be fixed in python3.
Not surprising. The implementation is completely different in Py2 and Py3,
both in CPython and in Cython. It's actually much simpler in Cython under
Py3, due to better semantics and C-API support. That also implies that
there's much less Cython can do wrong in that environment. ;-)
Post by Vitja Makarov
Btw exception info temps are already saved and restored between yields.
Right, but the exc_info itself is not reset and recovered around the yield.
As I said above, generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
https://github.com/vitek/cython/commit/73014aaed10b82a3f632d7f86212f86280c55858
I've added __Pyx_Generator_SwapExceptions() method and call it right
before resume switch and before return from yield. It swaps generator
exception state with thread local.
Looks good to me. I assume this fixes the problem? Then please push it into
mainline.

Stefan
Vitja Makarov
2011-04-20 09:50:53 UTC
Permalink
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
3. check_yield_in_exception()
I added this because I found a failing pyregr test that uses it (testing
the
@contextmanager decorator).
Post by Vitja Makarov
Cython calls __Pyx_ExceptionReset when except block is done, so when
yield is there no exception reset is called.
I'm not sure how to fix this.
I'm not completely sure either.
Post by Vitja Makarov
import sys
    """
    >>>      list(foo())
    [<type 'exceptions.ValueError'>, None]
    """
        raise ValueError
        yield sys.exc_info()[0]
        yield sys.exc_info()[0] # exc_info is lost here
I think (!), the difference here is that CPython actually keeps the
exception in the generator frame. We don't have a frame, so we have to
emulate it using the closure class. I guess we'll have to store away the
exception into the closure when we yield while an exception is being
handled, and restore it afterwards. Note: this is not the exception
that
is
freshly *being* raised (the "_cur*" fields in the thread state), it's the
exception that *was* raised and is now being handled, i.e. the thread state
fields without the "_cur", that are reflected by sys.exc_info().
        raise ValueError
        yield
        raise
list(foo())
  File "xxx.py", line 7, in<module>
    list(foo())
  File "xxx.py", line 6, in foo
    raise
TypeError: exceptions must be old-style classes or derived from
BaseException, not NoneType
It seems that exception info is completely lost (tried 2.6, 2.7) and
seems to be fixed in python3.
Not surprising. The implementation is completely different in Py2 and Py3,
both in CPython and in Cython. It's actually much simpler in Cython under
Py3, due to better semantics and C-API support. That also implies that
there's much less Cython can do wrong in that environment. ;-)
Post by Vitja Makarov
Btw exception info temps are already saved and restored between yields.
Right, but the exc_info itself is not reset and recovered around the yield.
As I said above, generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
https://github.com/vitek/cython/commit/73014aaed10b82a3f632d7f86212f86280c55858
I've added __Pyx_Generator_SwapExceptions() method and call it right
before resume switch and before return from yield. It swaps generator
exception state with thread local.
Looks good to me. I assume this fixes the problem? Then please push it into
mainline.
old pull request is still there ;)

https://github.com/cython/cython/pull/25

Does __Pyx_ExceptionReset() steal references to args, so they should
not be decrefed later?
--
vitja.
Stefan Behnel
2011-04-20 10:16:31 UTC
Permalink
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
https://github.com/vitek/cython/commit/73014aaed10b82a3f632d7f86212f86280c55858
I've added __Pyx_Generator_SwapExceptions() method and call it right
before resume switch and before return from yield. It swaps generator
exception state with thread local.
Looks good to me. I assume this fixes the problem? Then please push it into
mainline.
old pull request is still there ;)
https://github.com/cython/cython/pull/25
Does __Pyx_ExceptionReset() steal references to args, so they should
not be decrefed later?
Hmm, good call. The refcounting looks correct over the two function calls,
but it would be nicer if it could be avoided instead. We are really just
swapping around pointers here, no refcounting is needed.

I think it's worth using a dedicated utility function for this.

Oh, and one more thing: what should be done when exiting the generator
normally? Would that just raise GeneratorExit anyway, so that we can
swallow any original outer exception? I think there might be a problem with
Python 3, where the GeneratorExit should be raised in the context of the
original exception. Worth testing how CPython behaves here...

Stefan
Vitja Makarov
2011-04-20 10:51:35 UTC
Permalink
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
https://github.com/vitek/cython/commit/73014aaed10b82a3f632d7f86212f86280c55858
I've added __Pyx_Generator_SwapExceptions() method and call it right
before resume switch and before return from yield. It swaps generator
exception state with thread local.
Looks good to me. I assume this fixes the problem? Then please push it into
mainline.
old pull request is still there ;)
https://github.com/cython/cython/pull/25
Does __Pyx_ExceptionReset() steal references to args, so they should
not be decrefed later?
Hmm, good call. The refcounting looks correct over the two function calls,
but it would be nicer if it could be avoided instead. We are really just
swapping around pointers here, no refcounting is needed.
I think it's worth using a dedicated utility function for this.
Do you mean new utility call __Pyx_ExceptionSwap()?
Post by Stefan Behnel
Oh, and one more thing: what should be done when exiting the generator
normally? Would that just raise GeneratorExit anyway, so that we can swallow
any original outer exception? I think there might be a problem with Python
3, where the GeneratorExit should be raised in the context of the original
exception. Worth testing how CPython behaves here...
Right! It's better to wrap call to generator:

ExceptionSwap();
generator_body();
ExceptionSwap();

Ok, I'll take a look.
--
vitja.
Stefan Behnel
2011-04-20 11:45:02 UTC
Permalink
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
https://github.com/vitek/cython/commit/73014aaed10b82a3f632d7f86212f86280c55858
I've added __Pyx_Generator_SwapExceptions() method and call it right
before resume switch and before return from yield. It swaps generator
exception state with thread local.
Looks good to me. I assume this fixes the problem? Then please push it into
mainline.
old pull request is still there ;)
https://github.com/cython/cython/pull/25
Does __Pyx_ExceptionReset() steal references to args, so they should
not be decrefed later?
Hmm, good call. The refcounting looks correct over the two function calls,
but it would be nicer if it could be avoided instead. We are really just
swapping around pointers here, no refcounting is needed.
I think it's worth using a dedicated utility function for this.
Do you mean new utility call __Pyx_ExceptionSwap()?
Yes. I think it doesn't even have to be specific to the generator code.
Just pass in "&gen->exc_type" etc.
Post by Vitja Makarov
Post by Stefan Behnel
Oh, and one more thing: what should be done when exiting the generator
normally? Would that just raise GeneratorExit anyway, so that we can swallow
any original outer exception? I think there might be a problem with Python
3, where the GeneratorExit should be raised in the context of the original
exception. Worth testing how CPython behaves here...
ExceptionSwap();
generator_body();
ExceptionSwap();
Good idea.
Post by Vitja Makarov
Ok, I'll take a look.
Cool. Thanks!

Stefan
Vitja Makarov
2011-04-20 13:16:09 UTC
Permalink
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
Post by Vitja Makarov
Post by Stefan Behnel
generators have their own lifetime frame in CPython, and
exceptions don't leak from that. So, whenever it's the generator (or code
called by it) that raises an exception, that must be kept local to the
generator.
https://github.com/vitek/cython/commit/73014aaed10b82a3f632d7f86212f86280c55858
I've added __Pyx_Generator_SwapExceptions() method and call it right
before resume switch and before return from yield. It swaps generator
exception state with thread local.
Looks good to me. I assume this fixes the problem? Then please push it into
mainline.
old pull request is still there ;)
https://github.com/cython/cython/pull/25
Does __Pyx_ExceptionReset() steal references to args, so they should
not be decrefed later?
Hmm, good call. The refcounting looks correct over the two function calls,
but it would be nicer if it could be avoided instead. We are really just
swapping around pointers here, no refcounting is needed.
I think it's worth using a dedicated utility function for this.
Do you mean new utility call __Pyx_ExceptionSwap()?
Yes. I think it doesn't even have to be specific to the generator code. Just
pass in "&gen->exc_type" etc.
Post by Vitja Makarov
Post by Stefan Behnel
Oh, and one more thing: what should be done when exiting the generator
normally? Would that just raise GeneratorExit anyway, so that we can swallow
any original outer exception? I think there might be a problem with Python
3, where the GeneratorExit should be raised in the context of the original
exception. Worth testing how CPython behaves here...
ExceptionSwap();
generator_body();
ExceptionSwap();
Good idea.
Post by Vitja Makarov
Ok, I'll take a look.
Cool. Thanks!
https://github.com/vitek/cython/compare/73014aaed1...01286645d0

Another one try ;)
--
vitja.
Loading...