[Python-3000-checkins] r66476 - in python/branches/py3k/Doc/c-api: buffer.rst typeobj.rst
benjamin.peterson
2008-09-16 02:24:32 UTC
Author: benjamin.peterson
Date: Tue Sep 16 04:24:31 2008
New Revision: 66476

Log:
add documentation for the new buffer interface based on PEP 3118; I hope it's at least 60% right...

Modified:
python/branches/py3k/Doc/c-api/buffer.rst
python/branches/py3k/Doc/c-api/typeobj.rst

Modified: python/branches/py3k/Doc/c-api/buffer.rst
==============================================================================
--- python/branches/py3k/Doc/c-api/buffer.rst (original)
+++ python/branches/py3k/Doc/c-api/buffer.rst Tue Sep 16 04:24:31 2008
@@ -6,19 +6,20 @@
--------------

.. sectionauthor:: Greg Stein <gstein at lyra.org>
+.. sectionauthor:: Benjamin Peterson


.. index::
object: buffer
single: buffer interface

-Python objects implemented in C can export a group of functions called the
-"buffer interface." These functions can be used by an object to expose its data
-in a raw, byte-oriented format. Clients of the object can use the buffer
-interface to access the object data directly, without needing to copy it first.
+Python objects implemented in C can export a "buffer interface." These
+functions can be used by an object to expose its data in a raw, byte-oriented
+format. Clients of the object can use the buffer interface to access the object
+data directly, without needing to copy it first.

-Two examples of objects that support the buffer interface are strings and
-arrays. The string object exposes the character contents in the buffer
+Two examples of objects that support the buffer interface are bytes and
+arrays. The bytes object exposes the character contents in the buffer
interface's byte-oriented form. An array can also expose its contents, but it
should be noted that array elements may be multi-byte values.

@@ -33,87 +34,275 @@
More information on the buffer interface is provided in the section
:ref:`buffer-structs`, under the description for :ctype:`PyBufferProcs`.

-A "buffer object" is defined in the :file:`bufferobject.h` header (included by
-:file:`Python.h`). These objects look very similar to string objects at the
-Python programming level: they support slicing, indexing, concatenation, and
-some other standard string operations. However, their data can come from one of
-two sources: from a block of memory, or from another object which exports the
-buffer interface.
-
Buffer objects are useful as a way to expose the data from another object's
-buffer interface to the Python programmer. They can also be used as a zero-copy
-slicing mechanism. Using their ability to reference a block of memory, it is
-possible to expose any data to the Python programmer quite easily. The memory
+buffer interface to the Python programmer. They can also be used as a zero-copy
+slicing mechanism. Using their ability to reference a block of memory, it is
+possible to expose any data to the Python programmer quite easily. The memory
could be a large, constant array in a C extension, it could be a raw block of
memory for manipulation before passing to an operating system library, or it
could be used to pass around structured data in its native, in-memory format.


-.. ctype:: PyBufferObject
-
- This subtype of :ctype:`PyObject` represents a buffer object.
-
-
-.. cvar:: PyTypeObject PyBuffer_Type
-
- .. index:: single: BufferType (in module types)
-
- The instance of :ctype:`PyTypeObject` which represents the Python buffer type;
- it is the same object as ``buffer`` and ``types.BufferType`` in the Python
- layer. .
-
-
-.. cvar:: int Py_END_OF_BUFFER
-
- This constant may be passed as the *size* parameter to
- :cfunc:`PyBuffer_FromObject` or :cfunc:`PyBuffer_FromReadWriteObject`. It
- indicates that the new :ctype:`PyBufferObject` should refer to *base* object
- from the specified *offset* to the end of its exported buffer. Using this
- enables the caller to avoid querying the *base* object for its length.
-
-
-.. cfunction:: int PyBuffer_Check(PyObject *p)
-
- Return true if the argument has type :cdata:`PyBuffer_Type`.
-
-
-.. cfunction:: PyObject* PyBuffer_FromObject(PyObject *base, Py_ssize_t offset, Py_ssize_t size)
-
- Return a new read-only buffer object. This raises :exc:`TypeError` if *base*
- doesn't support the read-only buffer protocol or doesn't provide exactly one
- buffer segment, or it raises :exc:`ValueError` if *offset* is less than zero.
- The buffer will hold a reference to the *base* object, and the buffer's contents
- will refer to the *base* object's buffer interface, starting as position
- *offset* and extending for *size* bytes. If *size* is :const:`Py_END_OF_BUFFER`,
- then the new buffer's contents extend to the length of the *base* object's
- exported buffer data.
-
-
-.. cfunction:: PyObject* PyBuffer_FromReadWriteObject(PyObject *base, Py_ssize_t offset, Py_ssize_t size)
-
- Return a new writable buffer object. Parameters and exceptions are similar to
- those for :cfunc:`PyBuffer_FromObject`. If the *base* object does not export
- the writable buffer protocol, then :exc:`TypeError` is raised.
-
-
-.. cfunction:: PyObject* PyBuffer_FromMemory(void *ptr, Py_ssize_t size)
-
- Return a new read-only buffer object that reads from a specified location in
- memory, with a specified size. The caller is responsible for ensuring that the
- memory buffer, passed in as *ptr*, is not deallocated while the returned buffer
- object exists. Raises :exc:`ValueError` if *size* is less than zero. Note that
- :const:`Py_END_OF_BUFFER` may *not* be passed for the *size* parameter;
- :exc:`ValueError` will be raised in that case.
+.. ctype:: Py_buffer

+ .. cmember:: void *buf

-.. cfunction:: PyObject* PyBuffer_FromReadWriteMemory(void *ptr, Py_ssize_t size)
+ A pointer to the start of the memory for the object.

- Similar to :cfunc:`PyBuffer_FromMemory`, but the returned buffer is writable.
+ .. cmember:: Py_ssize_t len

+ The total length of the memory in bytes.
+
+ .. cmember:: int readonly
+
+ An indicator of whether the buffer is read only.
+
+ .. cmember:: const char *format
+
+ A *NULL* terminated string in :mod:`struct` module style syntax giving the
+ contents of the elements available through the buffer. If this is *NULL*,
+ ``"B"`` (unsigned bytes) is assumed.
+
+ .. cmember:: int ndim
+
+ The number of dimensions the memory represents as a multi-dimensional
+ array. If it is 0, :cdata:`strides` and :cdata:`suboffsets` must be
+ *NULL*.
+
+ .. cmember:: Py_ssize_t *shape
+
+ An array of :ctype:`Py_ssize_t`\s of length :cdata:`ndim` giving the
+ shape of the memory as a multi-dimensional array. Note that
+ ``shape[0] * ... * shape[ndim-1] * itemsize`` should be equal to
+ :cdata:`len`.
+
+ .. cmember:: Py_ssize_t *strides
+
+ An array of :ctype:`Py_ssize_t`\s the length of :cdata:`ndim` giving the
+ number of bytes to skip to get to a new element in each dimension.
+
+ .. cmember:: Py_ssize_t *suboffsets
+
+ An array of :ctype:`Py_ssize_t`\s the length of :cdata:`ndim`. If these
+ suboffset numbers are greater than or equal to 0, then the value stored
+ along the indicated dimension is a pointer and the suboffset value
+ dictates how many bytes to add to the pointer after de-referencing. A
+ suboffset value that is negative indicates that no de-referencing should
+ occur (striding in a contiguous memory block).
+
+ Here is a function that returns a pointer to the element in an N-D array
+ pointed to by an N-dimensional index when there are both non-NULL strides
+ and suboffsets::
+
+ void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
+ Py_ssize_t *suboffsets, Py_ssize_t *indices) {
+ char *pointer = (char*)buf;
+ int i;
+ for (i = 0; i < ndim; i++) {
+ pointer += strides[i] * indices[i];
+ if (suboffsets[i] >= 0) {
+ pointer = *((char**)pointer) + suboffsets[i];
+ }
+ }
+ return (void*)pointer;
+ }
+
+
+ .. cmember:: Py_ssize_t itemsize
+
+ This stores the item size (in bytes) of each element of the shared
+ memory. It is technically unnecessary, as it can be obtained using
+ :cfunc:`PyBuffer_SizeFromFormat`. However, an exporter may know this
+ information without parsing the format string, and the itemsize is needed
+ for proper interpretation of striding. Therefore, storing it is more
+ convenient and faster.
+
+ .. cmember:: void *internal
+
+ This is for use internally by the exporting object. For example, this
+ might be re-cast as an integer by the exporter and used to store flags
+ about whether or not the shape, strides, and suboffsets arrays must be
+ freed when the buffer is released. The consumer should never alter this
+ value.
+
+
+Buffer related functions
+========================
+
+
+.. cfunction:: int PyObject_CheckBuffer(PyObject *obj)
+
+ Return 1 if *obj* supports the buffer interface, otherwise 0.
+
+
+.. cfunction:: int PyObject_GetBuffer(PyObject *obj, Py_buffer *view, int flags)
+
+ Export *obj* into a :ctype:`Py_buffer`, *view*. These arguments must
+ never be *NULL*. The *flags* argument is a bit field indicating what kind
+ of buffer the caller is prepared to deal with and therefore what kind of
+ buffer the exporter is allowed to return. The buffer interface allows for
+ complicated memory sharing possibilities, but some callers may not be able
+ to handle all the complexity and may want to see if the exporter will
+ let them take a simpler view of its memory.
+
+ Some exporters may not be able to share memory in every possible way and
+ may need to raise errors to signal to some consumers that something is
+ just not possible. These errors should be a :exc:`BufferError` unless
+ there is another error that is actually causing the problem. The exporter
+ can use flags information to simplify how much of the :cdata:`Py_buffer`
+ structure is filled in with non-default values and/or raise an error if
+ the object can't support a simpler view of its memory.
+
+ 0 is returned on success and -1 on error.
+
+ The following table gives possible values for the *flags* argument.
+
+ +------------------------------+-----------------------------------------------+
+ | Flag | Description |
+ +==============================+===============================================+
+ | :cmacro:`PyBUF_SIMPLE` |This is the default flag state. The returned |
+ | |buffer may or may not have writable memory. |
+ | |The format will be assumed to be unsigned |
+ | |bytes. This is a "stand-alone" flag constant. |
+ | |It never needs to be |'d to the others. The |
+ | |exporter will raise an error if it cannot |
+ | |provide such a contiguous buffer of bytes. |
+ | | |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_WRITABLE` |The returned buffer must be writable. If it is |
+ | |not writable, then raise an error. |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_STRIDES` |This implies :cmacro:`PyBUF_ND`. The returned |
+ | |buffer must provide strides information |
+ | |(i.e. the strides cannot be NULL). This would |
+ | |be used when the consumer can handle strided, |
+ | |discontiguous arrays. Handling strides |
+ | |automatically assumes you can handle shape. The|
+ | |exporter may raise an error if it cannot |
+ | |provide a strided-only representation of the |
+ | |data (i.e. without the suboffsets). |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_ND` |The returned buffer must provide shape |
+ | |information. The memory will be assumed C-style|
+ | |contiguous (last dimension varies the |
+ | |fastest). The exporter may raise an error if it|
+ | |cannot provide this kind of contiguous |
+ | |buffer. If this is not given then shape will be|
+ | |*NULL*. |
+ | | |
+ | | |
+ +------------------------------+-----------------------------------------------+
+ |:cmacro:`PyBUF_C_CONTIGUOUS` |These flags indicate that the contiguity of the|
+ |:cmacro:`PyBUF_F_CONTIGUOUS` |returned buffer must be, respectively, |
+ |:cmacro:`PyBUF_ANY_CONTIGUOUS`|C-contiguous (last dimension varies the |
+ | |fastest), Fortran contiguous (first dimension |
+ | |varies the fastest) or either one. All of |
+ | |these flags imply :cmacro:`PyBUF_STRIDES` and |
+ | |guarantee that the strides buffer info |
+ | |structure will be filled in correctly. |
+ | | |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_INDIRECT` |This implies :cmacro:`PyBUF_STRIDES`. The |
+ | |returned buffer must have suboffsets |
+ | |information (which can be NULL if no suboffsets|
+ | |are needed). This would be used when the |
+ | |consumer can handle indirect array referencing |
+ | |implied by these suboffsets. |
+ | | |
+ | | |
+ | | |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_FORMAT` |The returned buffer must have true format |
+ | |information if this flag is provided. This |
+ | |would be used when the consumer is going to be |
+ | |checking for what 'kind' of data is actually |
+ | |stored. An exporter should always be able to |
+ | |provide this information if requested. If |
+ | |format is not explicitly requested then the |
+ | |format must be returned as *NULL* (which means |
+ | |``'B'``, or unsigned bytes) |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_STRIDED` |This is equivalent to ``(PyBUF_STRIDES | |
+ | |PyBUF_WRITABLE)``. |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_STRIDED_RO` |This is equivalent to ``(PyBUF_STRIDES)``. |
+ | | |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_RECORDS` |This is equivalent to ``(PyBUF_STRIDES | |
+ | |PyBUF_FORMAT | PyBUF_WRITABLE)``. |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_RECORDS_RO` |This is equivalent to ``(PyBUF_STRIDES | |
+ | |PyBUF_FORMAT)``. |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_FULL` |This is equivalent to ``(PyBUF_INDIRECT | |
+ | |PyBUF_FORMAT | PyBUF_WRITABLE)``. |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_FULL_RO` |This is equivalent to ``(PyBUF_INDIRECT | |
+ | |PyBUF_FORMAT)``. |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_CONTIG` |This is equivalent to ``(PyBUF_ND | |
+ | |PyBUF_WRITABLE)``. |
+ +------------------------------+-----------------------------------------------+
+ | :cmacro:`PyBUF_CONTIG_RO` |This is equivalent to ``(PyBUF_ND)``. |
+ | | |
+ +------------------------------+-----------------------------------------------+
+
+
+.. cfunction:: void PyBuffer_Release(PyObject *obj, Py_buffer *view)
+
+ Release the buffer *view* over *obj*. This should be called when the buffer
+ is no longer being used, as it may free memory associated with it.
+
+
+.. cfunction:: Py_ssize_t PyBuffer_SizeFromFormat(const char *)
+
+ Return the implied :cdata:`~Py_buffer.itemsize` from the struct-style
+ :cdata:`~Py_buffer.format`.
+
+
+.. cfunction:: int PyObject_CopyToObject(PyObject *obj, void *buf, Py_ssize_t len, char fortran)
+
+ Copy *len* bytes from the contiguous chunk of memory pointed to by *buf*
+ into the buffer exported by *obj*. Return 0 on success; on failure, return
+ -1 and raise an error. If the object does not have a writable buffer, an
+ error is raised. If *fortran* is ``'F'`` and the object is
+ multi-dimensional, the data will be copied into the array in Fortran-style
+ (first dimension varies the fastest). If *fortran* is ``'C'``, the data
+ will be copied into the array in C-style (last dimension varies the
+ fastest). If *fortran* is ``'A'``, the order does not matter and the copy
+ will be made in whatever way is more efficient.
+
+
+.. cfunction:: int PyBuffer_IsContiguous(Py_buffer *view, char fortran)
+
+ Return 1 if the memory defined by the *view* is C-style (*fortran* is
+ ``'C'``) or Fortran-style (*fortran* is ``'F'``) contiguous or either one
+ (*fortran* is ``'A'``). Return 0 otherwise.
+
+
+.. cfunction:: void PyBuffer_FillContiguousStrides(int ndim, Py_ssize_t *shape, Py_ssize_t *strides, Py_ssize_t itemsize, char fortran)
+
+ Fill the *strides* array with byte-strides of a contiguous (C-style if
+ *fortran* is ``'C'`` or Fortran-style if *fortran* is ``'F'``) array of the
+ given shape with the given number of bytes per element.
+
+
+.. cfunction:: int PyBuffer_FillInfo(Py_buffer *view, void *buf, Py_ssize_t len, int readonly, int infoflags)
+
+ Fill in a buffer-info structure, *view*, correctly for an exporter that can
+ only share a contiguous chunk of memory of "unsigned bytes" of the given
+ length. Return 0 on success and -1 (raising an error) on failure.
+
+
+MemoryView objects
+==================
+
+A memoryview object is an extended buffer object that could replace the buffer
+object (but doesn't have to as that could be kept as a simple 1-d memoryview
+object). Unlike :ctype:`Py_buffer`, it is a Python object (exposed as
+:class:`memoryview` in :mod:`builtins`), so it can be used from Python code.

-.. cfunction:: PyObject* PyBuffer_New(Py_ssize_t size)
+.. cfunction:: PyObject* PyMemoryView_FromObject(PyObject *obj)

- Return a new writable buffer object that maintains its own memory buffer of
- *size* bytes. :exc:`ValueError` is returned if *size* is not zero or positive.
- Note that the memory buffer (as returned by :cfunc:`PyObject_AsWriteBuffer`) is
- not specifically aligned.
+ Return a memoryview object from an object that defines the buffer interface.
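
Reviewer's note on the ``PyBuffer_FillContiguousStrides`` entry above: the stride arithmetic it describes can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the CPython implementation. The names ``ssize_sketch_t`` and ``fill_contiguous_strides`` are hypothetical, and ``ptrdiff_t`` stands in for ``Py_ssize_t`` so the sketch compiles without ``Python.h``.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for Py_ssize_t so no Python headers are needed. */
typedef ptrdiff_t ssize_sketch_t;

/* Hypothetical helper: fill strides[] with the byte strides of a
 * contiguous array of the given shape. order == 'F' selects
 * Fortran-style (first dimension varies fastest); any other value
 * selects C-style (last dimension varies fastest). */
void fill_contiguous_strides(int ndim, const ssize_sketch_t *shape,
                             ssize_sketch_t *strides,
                             ssize_sketch_t itemsize, char order)
{
    ssize_sketch_t step = itemsize;
    int i;

    assert(ndim >= 0 && itemsize > 0);
    if (order == 'F') {
        /* Walk the dimensions first to last. */
        for (i = 0; i < ndim; i++) {
            strides[i] = step;
            step *= shape[i];
        }
    } else {
        /* Walk the dimensions last to first. */
        for (i = ndim - 1; i >= 0; i--) {
            strides[i] = step;
            step *= shape[i];
        }
    }
}
```

For a shape of (2, 3, 4) with 8-byte items, this gives C-style strides of (96, 32, 8) and Fortran-style strides of (8, 16, 48).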

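Reviewer's note on the ``suboffsets`` member: the pointer-walking function from the diff can be exercised on a small array stored as a table of row pointers, which is the kind of layout suboffsets are meant for. This is a sketch under assumptions. ``ssize_sketch_t`` (a stand-in for ``Py_ssize_t``) and ``lookup_in_row_table`` are hypothetical names, and the 2x3 data is made up for the demonstration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for Py_ssize_t so no Python headers are needed. */
typedef ptrdiff_t ssize_sketch_t;

/* Same logic as the function in the documentation being added. */
void *get_item_pointer(int ndim, void *buf, const ssize_sketch_t *strides,
                       const ssize_sketch_t *suboffsets,
                       const ssize_sketch_t *indices)
{
    char *pointer = (char *)buf;
    int i;
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        if (suboffsets[i] >= 0) {
            /* This dimension stores pointers: follow one, then offset. */
            pointer = *((char **)pointer) + suboffsets[i];
        }
    }
    return (void *)pointer;
}

/* Hypothetical demo: a 2x3 int array held as two separate rows that
 * are reached through a table of row pointers. */
int lookup_in_row_table(ssize_sketch_t row, ssize_sketch_t col)
{
    static int row0[3] = {1, 2, 3};
    static int row1[3] = {4, 5, 6};
    static int *rows[2] = {row0, row1};
    /* Dimension 0 steps over pointers and is dereferenced (suboffset 0);
     * dimension 1 steps over ints inside one row (negative suboffset). */
    ssize_sketch_t strides[2] = {(ssize_sketch_t)sizeof(int *),
                                 (ssize_sketch_t)sizeof(int)};
    ssize_sketch_t suboffsets[2] = {0, -1};
    ssize_sketch_t indices[2];

    assert(row >= 0 && row < 2 && col >= 0 && col < 3);
    indices[0] = row;
    indices[1] = col;
    return *(int *)get_item_pointer(2, rows, strides, suboffsets, indices);
}
```

Here ``lookup_in_row_table(1, 2)`` follows the second row pointer and then moves two ints along that row, reaching the value 6.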
Modified: python/branches/py3k/Doc/c-api/typeobj.rst
==============================================================================
--- python/branches/py3k/Doc/c-api/typeobj.rst (original)
+++ python/branches/py3k/Doc/c-api/typeobj.rst Tue Sep 16 04:24:31 2008
@@ -1196,101 +1196,47 @@
========================

.. sectionauthor:: Greg J. Stein <greg at lyra.org>
+.. sectionauthor:: Benjamin Peterson


The buffer interface exports a model where an object can expose its internal
-data as a set of chunks of data, where each chunk is specified as a
-pointer/length pair. These chunks are called :dfn:`segments` and are presumed
-to be non-contiguous in memory.
+data.

If an object does not export the buffer interface, then its :attr:`tp_as_buffer`
member in the :ctype:`PyTypeObject` structure should be *NULL*. Otherwise, the
:attr:`tp_as_buffer` will point to a :ctype:`PyBufferProcs` structure.

-.. note::

- It is very important that your :ctype:`PyTypeObject` structure uses
- :const:`Py_TPFLAGS_DEFAULT` for the value of the :attr:`tp_flags` member rather
- than ``0``. This tells the Python runtime that your :ctype:`PyBufferProcs`
- structure contains the :attr:`bf_getcharbuffer` slot. Older versions of Python
- did not have this member, so a new Python interpreter using an old extension
- needs to be able to test for its presence before using it.
-
-.. XXX out of date!
.. ctype:: PyBufferProcs

Structure used to hold the function pointers which define an implementation of
the buffer protocol.

- The first slot is :attr:`bf_getreadbuffer`, of type :ctype:`getreadbufferproc`.
- If this slot is *NULL*, then the object does not support reading from the
- internal data. This is non-sensical, so implementors should fill this in, but
- callers should test that the slot contains a non-*NULL* value.
-
- The next slot is :attr:`bf_getwritebuffer` having type
- :ctype:`getwritebufferproc`. This slot may be *NULL* if the object does not
- allow writing into its returned buffers.
-
- The third slot is :attr:`bf_getsegcount`, with type :ctype:`getsegcountproc`.
- This slot must not be *NULL* and is used to inform the caller how many segments
- the object contains. Simple objects such as :ctype:`PyString_Type` and
- :ctype:`PyBuffer_Type` objects contain a single segment.
-
- .. index:: single: PyType_HasFeature()
-
- The last slot is :attr:`bf_getcharbuffer`, of type :ctype:`getcharbufferproc`.
- This slot will only be present if the :const:`Py_TPFLAGS_HAVE_GETCHARBUFFER`
- flag is present in the :attr:`tp_flags` field of the object's
- :ctype:`PyTypeObject`. Before using this slot, the caller should test whether it
- is present by using the :cfunc:`PyType_HasFeature` function. If the flag is
- present, :attr:`bf_getcharbuffer` may be *NULL*, indicating that the object's
- contents cannot be used as *8-bit characters*. The slot function may also raise
- an error if the object's contents cannot be interpreted as 8-bit characters.
- For example, if the object is an array which is configured to hold floating
- point values, an exception may be raised if a caller attempts to use
- :attr:`bf_getcharbuffer` to fetch a sequence of 8-bit characters. This notion of
- exporting the internal buffers as "text" is used to distinguish between objects
- that are binary in nature, and those which have character-based content.
-
- .. note::
-
- The current policy seems to state that these characters may be multi-byte
- characters. This implies that a buffer size of *N* does not mean there are *N*
- characters present.
-
-
-.. ctype:: Py_ssize_t (*readbufferproc) (PyObject *self, Py_ssize_t segment, void **ptrptr)
-
- Return a pointer to a readable segment of the buffer in ``*ptrptr``. This
- function is allowed to raise an exception, in which case it must return ``-1``.
- The *segment* which is specified must be zero or positive, and strictly less
- than the number of segments returned by the :attr:`bf_getsegcount` slot
- function. On success, it returns the length of the segment, and sets
- ``*ptrptr`` to a pointer to that memory.
-
-
-.. ctype:: Py_ssize_t (*writebufferproc) (PyObject *self, Py_ssize_t segment, void **ptrptr)
-
- Return a pointer to a writable memory buffer in ``*ptrptr``, and the length of
- that segment as the function return value. The memory buffer must correspond to
- buffer segment *segment*. Must return ``-1`` and set an exception on error.
- :exc:`TypeError` should be raised if the object only supports read-only buffers,
- and :exc:`SystemError` should be raised when *segment* specifies a segment that
- doesn't exist.
-
- .. Why doesn't it raise ValueError for this one?
- GJS: because you shouldn't be calling it with an invalid
- segment. That indicates a blatant programming error in the C code.
-
-
-.. ctype:: Py_ssize_t (*segcountproc) (PyObject *self, Py_ssize_t *lenp)
-
- Return the number of memory segments which comprise the buffer. If *lenp* is
- not *NULL*, the implementation must report the sum of the sizes (in bytes) of
- all segments in ``*lenp``. The function cannot fail.
-
+ .. cmember:: getbufferproc bf_getbuffer

-.. ctype:: Py_ssize_t (*charbufferproc) (PyObject *self, Py_ssize_t segment, const char **ptrptr)
+ This should fill a :ctype:`Py_buffer` with the necessary data for
+ exporting the type. The signature of :data:`getbufferproc` is ``int
+ (PyObject *obj, Py_buffer *view, int flags)``. *obj* is the object to
+ export, *view* is the :ctype:`Py_buffer` struct to fill, and *flags* gives
+ the conditions the caller wants the memory under. (See
+ :cfunc:`PyObject_GetBuffer` for all flags.) :cmember:`bf_getbuffer` is
+ responsible for filling *view* with the appropriate information.
+ (:cfunc:`PyBuffer_FillInfo` can be used in simple cases.) See
+ :ctype:`Py_buffer`\s docs for what needs to be filled in.
+
+
+ .. cmember:: releasebufferproc bf_releasebuffer
+
+ This should release the resources of the buffer. The signature of
+ :cdata:`releasebufferproc` is ``void (PyObject *obj, Py_buffer *view)``.
+ If the :cdata:`bf_releasebuffer` function is not provided (i.e. it is
+ *NULL*), then it does not ever need to be called.
+
+ The exporter of the buffer interface must make sure that any memory
+ pointed to in the :ctype:`Py_buffer` structure remains valid until
+ releasebuffer is called. Exporters will need to define a
+ :cdata:`bf_releasebuffer` function if they can re-allocate their memory,
+ strides, shape, suboffsets, or format variables which they might share
+ through the :ctype:`Py_buffer` structure.

- Return the size of the segment *segment* that *ptrptr* is set to. ``*ptrptr``
- is set to the memory buffer. Returns ``-1`` on error.
+ See :cfunc:`PyBuffer_Release`.

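Reviewer's note on ``bf_getbuffer``: for an exporter that can only share one contiguous run of unsigned bytes, the filling step described above might look roughly like the sketch below. Everything in it is a hypothetical stand-in so that it compiles without ``Python.h``. ``struct view_sketch`` only mirrors the documented ``Py_buffer`` members, and the ``SKETCH_*`` bits are not the real ``PyBUF_*`` values. It is loosely modeled on the documented behaviour, not copied from CPython.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for Py_ssize_t. */
typedef ptrdiff_t ssize_sketch_t;

/* Hypothetical request bits; NOT the real PyBUF_* values. */
enum {
    SKETCH_WRITABLE = 1 << 0,
    SKETCH_FORMAT   = 1 << 1,
    SKETCH_ND       = 1 << 2,
    SKETCH_STRIDES  = (1 << 3) | SKETCH_ND  /* strides imply shape */
};

/* Hypothetical mirror of the documented Py_buffer members. */
struct view_sketch {
    void *buf;
    ssize_sketch_t len;
    int readonly;
    const char *format;
    int ndim;
    ssize_sketch_t *shape;
    ssize_sketch_t *strides;
    ssize_sketch_t *suboffsets;
    ssize_sketch_t itemsize;
};

/* Fill *view for a flat block of bytes. Returns 0 on success and -1
 * when a writable view of read-only memory is requested (a real
 * exporter would also raise BufferError at that point). */
int fill_simple_view(struct view_sketch *view, void *buf,
                     ssize_sketch_t len, int readonly, int flags)
{
    assert(view != NULL);
    if ((flags & SKETCH_WRITABLE) && readonly) {
        return -1;
    }
    view->buf = buf;
    view->len = len;
    view->readonly = readonly;
    view->itemsize = 1;
    /* Format stays NULL (meaning unsigned bytes) unless requested. */
    view->format = (flags & SKETCH_FORMAT) ? "B" : NULL;
    view->ndim = 1;
    /* A 1-d byte array: shape is its length, stride is its itemsize. */
    view->shape = (flags & SKETCH_ND) ? &view->len : NULL;
    view->strides = ((flags & SKETCH_STRIDES) == SKETCH_STRIDES)
                        ? &view->itemsize : NULL;
    view->suboffsets = NULL;
    return 0;
}
```

Because ``shape`` and ``strides`` point back into the view itself, this particular sketch allocates nothing, which matches the case where no release function is needed.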