PyTables issues warnings when certain limits are exceeded. These limits are not intrinsic limitations of the underlying software; rather, they are proactive measures to avoid excessive resource consumption. The default limits should be enough for most cases, and users should try to respect them. However, in some situations it can be convenient to increase (or decrease) them.
Also, in order to get maximum performance, PyTables implements a series of sophisticated features, such as I/O buffers and different kinds of caches (for nodes, chunks and other internal metadata). These features come with a default set of parameters that ensures decent performance in most situations. But, as every use case has its own needs, it is handy to be able to fine-tune some of these parameters.
Because of this, PyTables implements a couple of ways to change these values. All of the tunable parameters live in tables/parameters.py (and tables/_parameters_pro.py, for PyTables Pro users). The user can change them in the parameter files themselves for a global and persistent change. Moreover, for finer control, she can pass any of these parameters directly to the openFile() function (see its description), and the new values will only take effect in the corresponding file (the defaults will continue to be those in the parameter files).
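The layering of per-file values over global defaults can be pictured with a small sketch. This is not PyTables code: `DEFAULT_PARAMS` and `resolve_params()` are hypothetical names used only for illustration, and the numbers are placeholders rather than the real defaults. In an actual script the overrides would simply be passed as keyword arguments to openFile(), as described above.

```python
# Placeholder values; the real defaults live in tables/parameters.py.
DEFAULT_PARAMS = {
    "MAX_COLUMNS": 512,
    "NODE_CACHE_SLOTS": 64,
}


def resolve_params(**overrides):
    """Return the effective parameters for one file (hypothetical helper)."""
    params = dict(DEFAULT_PARAMS)  # copy, so the global defaults stay untouched
    params.update(overrides)       # per-file values win over the defaults
    return params


per_file = resolve_params(NODE_CACHE_SLOTS=1024)
print(per_file["NODE_CACHE_SLOTS"])        # value used for this file only
print(DEFAULT_PARAMS["NODE_CACHE_SLOTS"])  # global default is unchanged
```

Editing the dictionary itself would correspond to changing the parameter file: every file opened afterwards sees the new value.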
A description of all of the tunable parameters follows. Please check your parameter files for the actual default values.
![]() | Warning |
---|---|
Changing the following parameters may have a very bad effect on the resource consumption and performance of your PyTables scripts. Please be careful when touching these! |
MAX_COLUMNS
Maximum number of columns in Table objects before a PerformanceWarning is issued. This limit is somewhat arbitrary and can be increased.
MAX_NODE_ATTRS
Maximum allowed number of attributes in a node.
MAX_GROUP_WIDTH
Maximum allowed number of children hanging from a group.
MAX_TREE_DEPTH
Maximum depth allowed in the object tree.
MAX_UNDO_PATH_LENGTH
Maximum length of paths allowed in undo/redo operations.
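The limits above produce warnings rather than errors, so a script keeps running after exceeding one. The following sketch shows that pattern with the standard `warnings` module. The `PerformanceWarning` class defined here is only a stand-in for the PyTables class of the same name, `check_column_count()` is a hypothetical helper, and the limit value is a placeholder.

```python
import warnings


class PerformanceWarning(Warning):
    """Stand-in for the PyTables warning class of the same name."""


MAX_COLUMNS = 512  # placeholder; check your parameter file for the real default


def check_column_count(ncolumns, max_columns=MAX_COLUMNS):
    """Warn, rather than fail, when a table is wider than the limit."""
    if ncolumns > max_columns:
        warnings.warn(
            "table has %d columns, more than the recommended maximum of %d"
            % (ncolumns, max_columns),
            PerformanceWarning,
        )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_column_count(600)  # exceeds the limit: one warning is recorded
    check_column_count(10)   # within the limit: nothing is recorded

print(len(caught))
```

A script that exceeds a limit on purpose can silence the message with `warnings.filterwarnings("ignore", category=PerformanceWarning)` instead of raising the limit.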
METADATA_CACHE_SIZE
Size (in bytes) of the HDF5 metadata cache. This only takes effect when using the HDF5 1.8.x series.
NODE_CACHE_SLOTS
Maximum number of unreferenced nodes to be kept in memory.
If positive, this is the number of unreferenced nodes to be kept in the metadata cache. The least recently used nodes are unloaded from memory when this number of loaded nodes is reached. To load a node again, simply access it as usual. Nodes referenced by user variables are neither counted nor unloaded.
A negative value means that all the touched nodes will be kept in an internal dictionary. This is the fastest way to load and retrieve nodes. However, in order to avoid large memory consumption, the user will be warned when the number of loaded nodes reaches the -NODE_CACHE_SLOTS value.
Finally, a value of zero means that any cache mechanism is disabled.
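The three regimes of NODE_CACHE_SLOTS (positive, negative, zero) can be illustrated with a toy cache. This is a minimal sketch of the behavior described above, not PyTables' actual implementation; the `NodeCache` class and its methods are hypothetical, and it assumes the warning in the negative case is issued only once.

```python
import warnings
from collections import OrderedDict


class NodeCache:
    """Toy cache of unreferenced nodes keyed by path (not PyTables' own class)."""

    def __init__(self, slots):
        self.slots = slots
        self.nodes = OrderedDict()
        self.warned = False

    def put(self, path, node):
        if self.slots == 0:
            return                       # zero: caching is disabled
        self.nodes[path] = node
        self.nodes.move_to_end(path)     # mark as most recently used
        if self.slots > 0:
            while len(self.nodes) > self.slots:
                self.nodes.popitem(last=False)  # evict least recently used
        elif len(self.nodes) >= -self.slots and not self.warned:
            self.warned = True           # negative: keep everything, warn once
            warnings.warn("%d nodes are being kept in memory" % len(self.nodes))

    def get(self, path):
        node = self.nodes.get(path)
        if node is not None and self.slots > 0:
            self.nodes.move_to_end(path)
        return node


lru = NodeCache(slots=2)
for path in ("/a", "/b", "/c"):
    lru.put(path, object())
print(list(lru.nodes))  # ['/b', '/c']

unbounded = NodeCache(slots=-2)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    for path in ("/a", "/b", "/c"):
        unbounded.put(path, object())
print(len(unbounded.nodes), len(caught))  # 3 1

disabled = NodeCache(slots=0)
disabled.put("/a", object())
print(disabled.get("/a"))  # None
```

The trade-off is visible in the sketch: the positive setting bounds memory at the cost of reloading evicted nodes, while the negative setting never evicts and only reports when the dictionary has grown past the threshold.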
CHUNKTIMES
The buffersize/chunksize ratio.
BUFFERTIMES
The maximum buffersize/rowsize ratio before issuing a PerformanceWarning.
EXPECTED_ROWS_EARRAY
Default expected number of rows for EArray objects.
EXPECTED_ROWS_TABLE
Default expected number of rows for Table objects.
PYTABLES_SYS_ATTRS
Set this to False if you don't want to create PyTables system attributes in datasets. Also, if set to False, any existing system attributes are not considered when guessing the class of the node while loading it from disk (this work is delegated to PyTables' class discoverer function for general HDF5 files).
![]() | Note |
---|---|
The following parameters are only available in PyTables Pro. |
BOUNDS_MAX_SIZE
The maximum size for bounds values cached during index lookups.
BOUNDS_MAX_SLOTS
The maximum number of slots for the BOUNDS cache.
ITERSEQ_MAX_ELEMENTS
The maximum number of iterator elements cached in data lookups.
ITERSEQ_MAX_SIZE
The maximum space that the ITERSEQ cache will take (in bytes).
ITERSEQ_MAX_SLOTS
The maximum number of slots in the ITERSEQ cache.
LIMBOUNDS_MAX_SIZE
The maximum size for the query limits (for example, (lim1, lim2) in conditions like lim1 ≤ col < lim2) cached during index lookups (in bytes).
LIMBOUNDS_MAX_SLOTS
The maximum number of slots for the LIMBOUNDS cache.
TABLE_MAX_SIZE
The maximum size for table chunks cached during index queries.
SORTED_MAX_SIZE
The maximum size for sorted values cached during index lookups.
SORTEDLR_MAX_SIZE
The maximum size for chunks in last row cached in index lookups (in bytes).
SORTEDLR_MAX_SLOTS
The maximum number of chunks for the SORTEDLR cache.
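Several of the caches above are governed by a pair of limits: a number of slots and a size. The sketch below shows how two such limits can interact. It is an illustration only, not the PyTables Pro implementation: the `BoundedCache` class is hypothetical, it assumes least-recently-used eviction, and it assumes the size limit is a total byte budget for the cache (as the ITERSEQ_MAX_SIZE description suggests).

```python
from collections import OrderedDict


class BoundedCache:
    """Toy LRU cache limited by entry count and by total bytes."""

    def __init__(self, max_slots, max_size):
        self.max_slots = max_slots
        self.max_size = max_size
        self.entries = OrderedDict()  # key -> bytes payload
        self.size = 0                 # total bytes currently cached

    def put(self, key, payload):
        if len(payload) > self.max_size:
            return False              # could never fit; do not cache it
        if key in self.entries:
            self.size -= len(self.entries.pop(key))
        self.entries[key] = payload
        self.size += len(payload)
        # Evict least recently used entries until both limits hold again.
        while len(self.entries) > self.max_slots or self.size > self.max_size:
            _, evicted = self.entries.popitem(last=False)
            self.size -= len(evicted)
        return True

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]


cache = BoundedCache(max_slots=3, max_size=10)
cache.put("q1", b"aaaa")  # 4 bytes cached
cache.put("q2", b"bbbb")  # 8 bytes cached
cache.put("q3", b"cccc")  # 12 bytes would exceed 10, so q1 is evicted
print(list(cache.entries), cache.size)  # ['q2', 'q3'] 8
```

In this example the size limit is hit before the slot limit; with many small entries the slot limit would be the one that triggers eviction.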
![]() | Warning |
---|---|
The next parameters will not be effective if passed to the
|
DISABLE_EVERY_CYCLES
The number of cycles after which a cache will be forcibly disabled if the hit ratio is lower than LOWEST_HIT_RATIO (see below). This value should provide enough time to check whether the cache is being efficient or not.
ENABLE_EVERY_CYCLES
The number of cycles after which a cache will be forced to be (re-)enabled, regardless of the hit ratio. This provides a chance to check whether conditions have become more favorable for caching again.
LOWEST_HIT_RATIO
The minimum acceptable hit ratio for a cache to avoid disabling (and freeing) it.
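The interplay of the three parameters above can be sketched as a cache that measures its own hit ratio and switches itself off and on. This is a simplified model, not the PyTables code: the `AdaptiveCache` class is hypothetical, and it assumes that one lookup counts as one cycle.

```python
class AdaptiveCache:
    """Toy cache that disables itself while its hit ratio is too low."""

    def __init__(self, disable_every, enable_every, lowest_hit_ratio):
        self.disable_every = disable_every
        self.enable_every = enable_every
        self.lowest_hit_ratio = lowest_hit_ratio
        self.data = {}
        self.enabled = True
        self.cycles = 0  # lookups since the last check or state change
        self.hits = 0

    def lookup(self, key, compute):
        self.cycles += 1
        if not self.enabled:
            # While disabled, always recompute; re-enable after enough cycles.
            if self.cycles >= self.enable_every:
                self.enabled, self.cycles, self.hits = True, 0, 0
            return compute(key)
        if key in self.data:
            self.hits += 1
            value = self.data[key]
        else:
            value = self.data[key] = compute(key)
        if self.cycles >= self.disable_every:
            if self.hits / self.cycles < self.lowest_hit_ratio:
                self.enabled = False
                self.data.clear()  # free the memory as well
            self.cycles, self.hits = 0, 0
        return value


cache = AdaptiveCache(disable_every=10, enable_every=5, lowest_hit_ratio=0.6)
for key in range(10):   # ten distinct keys, so the hit ratio is 0.0
    cache.lookup(key, lambda k: k * k)
print(cache.enabled)    # False

for key in range(5):    # five uncached lookups, then the cache comes back
    cache.lookup(key, lambda k: k * k)
print(cache.enabled)    # True
```

A small disabling interval reacts quickly to a poor access pattern but judges the cache on few samples; the periodic re-enabling keeps a temporarily bad pattern from switching the cache off for good.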