Source documentation
mrcfile – Main package
mrcfile
A pure Python implementation of the MRC2014 file format.
For a full introduction and documentation, see http://mrcfile.readthedocs.io/
Functions
new(): Create a new MRC file.
open(): Open an MRC file.
open_async(): Open an MRC file asynchronously.
mmap(): Open a memory-mapped MRC file (fast for large files).
new_mmap(): Create a new empty memory-mapped MRC file (fast for large files).
validate(): Validate an MRC file.
Basic usage
Examples assume that this package has been imported as mrcfile and numpy has been imported as np.
To open an MRC file and read a slice of data:
>>> with mrcfile.open('tests/test_data/EMD-3197.map') as mrc:
...     mrc.data[10,10]
...
array([ 2.58179283, 3.1406002 , 3.64495397, 3.63812137, 3.61837363,
        4.0115056 , 3.66981959, 2.07317996, 0.1251585 , -0.87975615,
        0.12517013, 2.07319379, 3.66982722, 4.0115037 , 3.61837196,
        3.6381247 , 3.64495087, 3.14059472, 2.58178973, 1.92690361], dtype=float32)
To create a new file with a 2D data array, and change some values:
>>> with mrcfile.new('tmp.mrc') as mrc:
...     mrc.set_data(np.zeros((5, 5), dtype=np.int8))
...     mrc.data[1:4,1:4] = 10
...     mrc.data
...
array([[ 0,  0,  0,  0,  0],
       [ 0, 10, 10, 10,  0],
       [ 0, 10, 10, 10,  0],
       [ 0, 10, 10, 10,  0],
       [ 0,  0,  0,  0,  0]], dtype=int8)
Background
The MRC2014 format was described in the Journal of Structural Biology: http://dx.doi.org/10.1016/j.jsb.2015.04.002
The format specification is available on the CCP-EM website: http://www.ccpem.ac.uk/mrc_format/mrc2014.php
Members
- mrcfile.new(name, data=None, compression=None, overwrite=False)
Create a new MRC file.
- Parameters
name – The file name to use, as a string or Path.
data – Data to put in the file, as a numpy array. The default is None, to create an empty file.
compression – The compression format to use. Acceptable values are: None (the default; for no compression), 'gzip' or 'bzip2'. It’s good practice to name compressed files with an appropriate extension (for example, .mrc.gz for gzip) but this is not enforced.
overwrite – Flag to force overwriting of an existing file. If False and a file of the same name already exists, the file is not overwritten and an exception is raised.
- Returns
An MrcFile object (or a subclass of it if compression is specified).
- Raises
ValueError – If the file already exists and overwrite is False.
ValueError – If the compression format is not recognised.
- Warns
RuntimeWarning – If the data array contains Inf or NaN values.
- mrcfile.open(name, mode='r', permissive=False, header_only=False)
Open an MRC file.
This function opens both normal and compressed MRC files. Supported compression formats are: gzip, bzip2.
It is possible to use this function to create new MRC files (using mode w+) but the new() function is more flexible.
This function offers a permissive read mode for attempting to open corrupt or invalid files. In permissive mode, warnings are issued instead of exceptions if problems with the file are encountered. See MrcInterpreter or the usage guide for more information.
- Parameters
name – The file name to open, as a string or Path.
mode – The file mode to use. This should be one of the following: r for read-only, r+ for read and write, or w+ for a new empty file. The default is r.
permissive – Read the file in permissive mode. The default is False.
header_only – Only read the header (and extended header) from the file. The default is False.
- Returns
An MrcFile object (or a GzipMrcFile object if the file is gzipped).
- Raises
ValueError – If the mode is not one of r, r+ or w+.
ValueError – If the file is not a valid MRC file and permissive is False.
ValueError – If the mode is w+ and the file already exists. (Call new() with overwrite=True to deliberately overwrite an existing file.)
OSError – If the mode is r or r+ and the file does not exist.
- Warns
RuntimeWarning – If the file appears to be a valid MRC file but the data block is longer than expected from the dimensions in the header.
RuntimeWarning – If the file is not a valid MRC file and permissive is True.
RuntimeWarning – If the header’s exttyp field is set to a known value but the extended header’s size is not a multiple of the number of bytes in the corresponding dtype.
- mrcfile.read(name)
Read an MRC file’s data into a numpy array.
This is a convenience function to read the data from an MRC file when there is no need for the file’s header information. To read the headers as well, or if you need access to an MrcFile object representing the file, use mrcfile.open() instead.
- Parameters
name – The file name to read, as a string or Path.
- Returns
A numpy array containing the data from the file.
- mrcfile.write(name, data=None, overwrite=False, voxel_size=None)
Write a new MRC file.
This is a convenience function to allow data to be quickly written to a file (with optional compression) using just a single function call. However, there is no control over the file’s metadata except for optionally setting the voxel size. For more control, or if you need access to an MrcFile object representing the new file, use mrcfile.new() instead.
- Parameters
name – The file name to use, as a string or Path. If the name ends with .gz or .bz2, the file will be compressed using gzip or bzip2 respectively.
data – Data to put in the file, as a numpy array. The default is None, to create an empty file.
overwrite – Flag to force overwriting of an existing file. If False and a file of the same name already exists, the file is not overwritten and an exception is raised.
voxel_size – The voxel size to be written in the file header, as a float or a 3-tuple.
- Raises
ValueError – If the file already exists and overwrite is False.
- Warns
RuntimeWarning – If the data array contains Inf or NaN values.
- mrcfile.open_async(name, mode='r', permissive=False)
Open an MRC file asynchronously in a separate thread.
This allows a file to be opened in the background while the main thread continues with other work. This can be a good way to improve performance if the main thread is busy with intensive computation, but will be less effective if the main thread is itself busy with disk I/O.
Multiple files can be opened in the background simultaneously. However, this implementation is relatively crude; each call to this function will start a new thread and immediately use it to start opening a file. If you try to open many large files at the same time, performance will decrease as all of the threads attempt to access the disk at once. You’ll also risk running out of memory to store the data from all the files.
This function returns a FutureMrcFile object, which deliberately mimics the API of the Future object from Python 3’s concurrent.futures module. (Future versions of this library might return genuine Future objects instead.)
To get the real MrcFile object from a FutureMrcFile, call result(). This will block until the file has been read and the MrcFile object is ready. To check if the MrcFile is ready without blocking, call running() or done().
- Parameters
name, mode, permissive – These have the same meaning as the corresponding parameters of open().
- Returns
A FutureMrcFile object.
- mrcfile.mmap(name, mode='r', permissive=False)
Open a memory-mapped MRC file.
This allows much faster opening of large files, because the data is only accessed on disk when a slice is read or written from the data array. See the MrcMemmap class documentation for more information.
Because the memory-mapped data array accesses the disk directly, compressed files cannot be opened with this function. In all other ways, mmap() behaves in exactly the same way as open(). The MrcMemmap object returned by this function can be used in exactly the same way as a normal MrcFile object.
- mrcfile.new_mmap(name, shape, mrc_mode=0, fill=None, overwrite=False, extended_header=None, exttyp=None)
Create a new, empty memory-mapped MRC file.
This function is useful for creating very large files. The initial contents of the data array can be set with the fill parameter if needed, but be aware that filling a large array can take a long time.
If fill is not set, the new data array’s contents are unspecified and system-dependent. (Some systems fill a new empty mmap with zeros, others fill it with the bytes from the disk at the newly-mapped location.) If you are definitely going to fill the entire array with new data anyway you can safely leave fill as None, otherwise it is advised to use a sensible fill value (or ensure you are on a system that fills new mmaps with a reasonable default value).
- Parameters
name – The file name to use, as a string or Path.
shape – The shape of the data array to open, as a 2-, 3- or 4-tuple of ints. For example, (nz, ny, nx) for a new 3D volume, or (ny, nx) for a new 2D image.
mrc_mode – The MRC mode to use for the new file. One of 0, 1, 2, 4 or 6, which correspond to numpy dtypes as follows:
mode 0 -> int8
mode 1 -> int16
mode 2 -> float32
mode 4 -> complex64
mode 6 -> uint16
The default is 0.
fill – An optional value to use to fill the new data array. If None, the data array will not be filled and its contents are unspecified. Numpy’s usual rules for rounding or rejecting values apply, according to the dtype of the array.
overwrite – Flag to force overwriting of an existing file. If False and a file of the same name already exists, the file is not overwritten and an exception is raised.
extended_header – The extended header object.
exttyp – The extended header type.
- Returns
A new MrcMemmap object.
- Raises
ValueError – If the MRC mode is invalid.
ValueError – If the file already exists and overwrite is False.
- mrcfile.validate(name, print_file=None)
Validate an MRC file.
This function first opens the file by calling open() (with permissive=True), then calls MrcFile.validate(), which runs a series of tests to check whether the file complies with the MRC2014 format specification.
If the file is completely valid, this function returns True, otherwise it returns False. Messages explaining the validation result will be printed to sys.stdout by default, but if a text stream is given (using the print_file argument) output will be printed to that instead.
Badly invalid files will also cause warning messages to be issued, which will be written to sys.stderr by default. See the documentation of the warnings module for information on how to suppress or capture warning output.
Because the file is opened by calling open(), gzip- and bzip2-compressed MRC files can be validated easily using this function.
After the file has been opened, it is checked for problems. The tests are:
MRC format ID string: The map field in the header should contain “MAP ”.
Machine stamp: The machine stamp should contain one of 0x44 0x44 0x00 0x00, 0x44 0x41 0x00 0x00 or 0x11 0x11 0x00 0x00.
MRC mode: The mode field should be one of the supported mode numbers: 0, 1, 2, 4, 6 or 12. (Note that MRC modes 3 and 101 are also valid according to the MRC 2014 specification but are not supported by mrcfile.)
Map and cell dimensions: The header fields nx, ny, nz, mx, my, mz, cella.x, cella.y and cella.z must all be positive numbers.
Axis mapping: Header fields mapc, mapr and maps must contain the values 1, 2, and 3 (in any order).
Volume stack dimensions: If the spacegroup is in the range 401–630, representing a volume stack, the nz field should be exactly divisible by mz to represent the number of volumes in the stack.
Header labels: The nlabl field should be set to indicate the number of labels in use, and the labels in use should appear first in the label array.
MRC format version: The nversion field should be 20140 or 20141 for compliance with the MRC2014 standard.
Extended header type: If an extended header is present, the exttyp field should be set to indicate the type of extended header.
Data statistics: The statistics in the header should be correct for the actual data in the file, or marked as undetermined.
File size: The size of the file on disk should match the expected size calculated from the MRC header.
- Parameters
name – The file name to open and validate.
print_file – The output text stream to use for printing messages about the validation. This is passed directly to the file argument of Python’s print() function. The default is None, which means output will be printed to sys.stdout.
- Returns
True if the file is valid, or False if the file does not meet the MRC format specification in any way.
- Raises
OSError – If the file does not exist or cannot be opened.
- Warns
RuntimeWarning – If the file is seriously invalid because it has no map ID string, an incorrect machine stamp, an unknown mode number, or is not the same size as expected from the header.
Submodules
mrcfile.bzip2mrcfile module
bzip2mrcfile
Module which exports the Bzip2MrcFile class.
- Classes:
Bzip2MrcFile: An object which represents a bzip2-compressed MRC file.
- class mrcfile.bzip2mrcfile.Bzip2MrcFile(name, mode='r', overwrite=False, permissive=False, header_only=False, **kwargs)
Bases: MrcFile
MrcFile subclass for handling bzip2-compressed files.
Usage is the same as for MrcFile.
- _open_file(name)
Override _open_file() to open a bzip2 file.
- _read(header_only=False)
Override _read() to ensure bzip2 file is in read mode.
- _ensure_readable_bzip2_stream()
Make sure _iostream is a bzip2 stream that can be read.
- _get_file_size()
Override _get_file_size() to ensure stream is readable first.
- _read_bytearray_from_stream(number_of_bytes)
Override because BZ2File in Python 2 does not support readinto().
mrcfile.command_line module
command_line
Module for functions used as command line entry points.
The names of the corresponding command line scripts can be found in the entry_points section of setup.py.
- mrcfile.command_line.print_headers(names=None, print_file=None)
Print the MRC header contents from a list of files.
This function opens files in permissive mode to allow headers of invalid files to be examined.
- Parameters
names – A list of file names. If not given or None, the names are taken from the command line arguments.
print_file – The output text stream to use for printing the headers. This is passed directly to the print_file argument of print_header(). The default is None, which means output will be printed to sys.stdout.
mrcfile.constants module
constants
Constants used by the mrcfile.py library.
mrcfile.dtypes module
dtypes
numpy dtypes used by the mrcfile.py library.
The dtypes are defined in a separate module because they do not interact nicely with the from __future__ import unicode_literals feature used in the rest of the package.
- mrcfile.dtypes.get_ext_header_dtype(exttyp, byte_order='=')
Get a dtype for an extended header.
- Parameters
exttyp – One of b'FEI1' or b'FEI2', which are currently the only supported extended header types.
byte_order – One of =, < or >.
- Returns
A numpy dtype object for the extended header, or None.
- Raises
ValueError – If byte_order is not one of =, < or >.
mrcfile.future_mrcfile module
future_mrcfile
Module which exports the FutureMrcFile class.
- Classes:
FutureMrcFile: An object which represents an MRC file being opened asynchronously.
- class mrcfile.future_mrcfile.FutureMrcFile(open_function, args=(), kwargs={})
Bases: object
Object representing an MRC file being opened asynchronously.
This API deliberately mimics a Future object from the concurrent.futures module in Python 3.2+ (which we do not use directly because this code still needs to run in Python 2.7).
- __init__(open_function, args=(), kwargs={})
Initialise a new FutureMrcFile object.
This constructor starts a new thread which will invoke the callable given in open_function with the given arguments.
- Parameters
open_function – The callable to use to open the MRC file. (This will normally be mrcfile.open(), but could also be MrcFile or any of its subclasses.)
args – A tuple of positional arguments to use when open_function is called. (Normally a 1-tuple containing the name of the file to open.)
kwargs – A dictionary of keyword arguments to use when open_function is called.
- _run(*args, **kwargs)
Call the open function and store the result in the holder list.
(For internal use only.)
- cancel()
Return False.
(See concurrent.futures.Future.cancel() for more details. This implementation does not allow jobs to be cancelled.)
- cancelled()
Return False.
(See concurrent.futures.Future.cancelled() for more details. This implementation does not allow jobs to be cancelled.)
- running()
Return True if the MrcFile is currently being opened.
(See concurrent.futures.Future.running() for more details.)
- done()
Return True if the file opening has finished.
(See concurrent.futures.Future.done() for more details.)
- result(timeout=None)
Return the MrcFile that has been opened.
(See concurrent.futures.Future.result() for more details.)
- Parameters
timeout – Time to wait (in seconds) for the file opening to finish. If timeout is not specified or is None, there is no limit to the wait time.
- Returns
An MrcFile object (or one of its subclasses).
- Raises
RuntimeError – If the operation has not finished within the time limit set by timeout. (Note that the type of this exception will change in future if this class is replaced by concurrent.futures.Future.)
Exception – Any exception raised by the MrcFile opening operation will be re-raised here.
- exception(timeout=None)
Return the exception raised by the file opening operation.
(See concurrent.futures.Future.exception() for more details.)
- Parameters
timeout – Time to wait (in seconds) for the operation to finish. If timeout is not specified or is None, there is no limit to the wait time.
- Returns
An Exception, if one was raised by the file opening operation, or None if no exception was raised.
- Raises
RuntimeError – If the operation has not finished within the time limit set by timeout. (Note that the type of this exception will change in future if this class is replaced by concurrent.futures.Future.)
- _get_result(timeout)
Return the result or exception from the file opening operation.
(For internal use only.)
- add_done_callback(fn)
Not implemented.
(See concurrent.futures.Future.add_done_callback() for more details.)
mrcfile.gzipmrcfile module
gzipmrcfile
Module which exports the GzipMrcFile class.
- Classes:
GzipMrcFile: An object which represents a gzipped MRC file.
- class mrcfile.gzipmrcfile.GzipMrcFile(name, mode='r', overwrite=False, permissive=False, header_only=False, **kwargs)
Bases: MrcFile
MrcFile subclass for handling gzipped files.
Usage is the same as for MrcFile.
- _open_file(name)
Override _open_file() to open both normal and gzip files.
- _close_file()
Override _close_file() to close both normal and gzip files.
- _read(header_only=False)
Override _read() to ensure gzip file is in read mode.
- _ensure_readable_gzip_stream()
Make sure _iostream is a gzip stream that can be read.
- _get_file_size()
Override _get_file_size() to avoid seeking from end.
mrcfile.load_functions module
load_functions
Module for top-level functions that open MRC files and form the main API of the package.
- mrcfile.load_functions.new(name, data=None, compression=None, overwrite=False)
Create a new MRC file.
- mrcfile.load_functions.open(name, mode='r', permissive=False, header_only=False)
Open an MRC file.
- mrcfile.load_functions.read(name)
Read an MRC file’s data into a numpy array.
- mrcfile.load_functions.write(name, data=None, overwrite=False, voxel_size=None)
Write a new MRC file.
- mrcfile.load_functions.open_async(name, mode='r', permissive=False)
Open an MRC file asynchronously in a separate thread.
- mrcfile.load_functions.mmap(name, mode='r', permissive=False)
Open a memory-mapped MRC file.
- mrcfile.load_functions.new_mmap(name, shape, mrc_mode=0, fill=None, overwrite=False, extended_header=None, exttyp=None)
Create a new, empty memory-mapped MRC file.
mrcfile.mrcfile module
mrcfile
Module which exports the MrcFile class.
- Classes:
MrcFile: An object which represents an MRC file.
- class mrcfile.mrcfile.MrcFile(name, mode='r', overwrite=False, permissive=False, header_only=False, **kwargs)
Bases: MrcInterpreter
An object which represents an MRC file.
The header and data are handled as numpy arrays - see MrcObject for details.
MrcFile supports a permissive read mode for attempting to open corrupt or invalid files. See mrcfile.mrcinterpreter.MrcInterpreter or the usage guide for more information.
- Usage:
To create a new MrcFile object, pass a file name and optional mode. To ensure the file is written to disk and closed correctly, it’s best to use the with statement:
>>> with MrcFile('tmp.mrc', 'w+') as mrc:
...     mrc.set_data(np.zeros((10, 10), dtype=np.int8))
In mode r or r+, the named file is opened from disk and read. In mode w+ a new empty file is created and will be written to disk at the end of the with block (or when flush() or close() is called).
- __init__(name, mode='r', overwrite=False, permissive=False, header_only=False, **kwargs)
Initialise a new MrcFile object.
The given file name is opened in the given mode. For mode r or r+ the header, extended header and data are read from the file. For mode w+ a new file is created with a default header and empty extended header and data arrays.
- Parameters
name – The file name to open, as a string or pathlib Path.
mode – The file mode to use. This should be one of the following: r for read-only, r+ for read and write, or w+ for a new empty file. The default is r.
overwrite – Flag to force overwriting of an existing file if the mode is w+. If False and a file of the same name already exists, the file is not overwritten and an exception is raised. The default is False.
permissive – Read the file in permissive mode. (See mrcfile.mrcinterpreter.MrcInterpreter for details.) The default is False.
header_only – Only read the header (and extended header) from the file. The default is False.
- Raises
ValueError – If the mode is not one of r, r+ or w+.
ValueError – If the file is not a valid MRC file and permissive is False.
ValueError – If the mode is w+, the file already exists and overwrite is False.
OSError – If the mode is r or r+ and the file does not exist.
- Warns
RuntimeWarning – If the file appears to be a valid MRC file but the data block is longer than expected from the dimensions in the header.
RuntimeWarning – If the file is not a valid MRC file and permissive is True.
RuntimeWarning – If the header’s exttyp field is set to a known value but the extended header’s size is not a multiple of the number of bytes in the corresponding dtype.
- _open_file(name)
Open a file object to use as the I/O stream.
- _read(header_only=False)
Override _read() to move back to start of file first.
- _read_data()
Override _read_data() to check file size matches data block size.
- _get_file_size()
Return the size of the underlying file object, in bytes.
- close()
Flush any changes to disk and close the file.
This override calls MrcInterpreter.close() to ensure the stream is flushed and closed, then closes the file object.
- _close_file()
Close the file object.
- validate(print_file=None)
Validate this MRC file.
The tests are:
MRC format ID string: The map field in the header should contain “MAP ”.
Machine stamp: The machine stamp should contain one of 0x44 0x44 0x00 0x00, 0x44 0x41 0x00 0x00 or 0x11 0x11 0x00 0x00.
MRC mode: The mode field should be one of the supported mode numbers: 0, 1, 2, 4, 6 or 12. (Note that MRC modes 3 and 101 are also valid according to the MRC 2014 specification but are not supported by mrcfile.)
Map and cell dimensions: The header fields nx, ny, nz, mx, my, mz, cella.x, cella.y and cella.z must all be positive numbers.
Axis mapping: Header fields mapc, mapr and maps must contain the values 1, 2, and 3 (in any order).
Volume stack dimensions: If the spacegroup is in the range 401–630, representing a volume stack, the nz field should be exactly divisible by mz to represent the number of volumes in the stack.
Header labels: The nlabl field should be set to indicate the number of labels in use, and the labels in use should appear first in the label array.
MRC format version: The nversion field should be 20140 or 20141 for compliance with the MRC2014 standard.
Extended header type: If an extended header is present, the exttyp field should be set to indicate the type of extended header.
Data statistics: The statistics in the header should be correct for the actual data in the file, or marked as undetermined.
File size: The size of the file on disk should match the expected size calculated from the MRC header.
- Parameters
print_file – The output text stream to use for printing messages about the validation. This is passed directly to the file argument of Python’s print() function. The default is None, which means output will be printed to sys.stdout.
- Returns
True if the file is valid, or False if the file does not meet the MRC format specification in any way.
mrcfile.mrcinterpreter module
mrcinterpreter
Module which exports the MrcInterpreter
class.
- Classes:
MrcInterpreter
: An object which can interpret an I/O stream as MRC data.
- class mrcfile.mrcinterpreter.MrcInterpreter(iostream=None, permissive=False, header_only=False, **kwargs)
Bases:
MrcObject
An object which interprets an I/O stream as MRC / CCP4 map data.
The header and data are handled as numpy arrays - see MrcObject for details.
MrcInterpreter can be used directly, but it is mostly intended as a superclass to provide common stream-handling functionality. This can be used by subclasses which will handle opening and closing the stream.
This class implements the __enter__() and __exit__() special methods, which allow it to be used as a context manager in a with block. This ensures that close() is called when the object is no longer needed.
When reading the I/O stream, a ValueError is raised if the data is invalid in one of the following ways:
The header’s map field is not set correctly to confirm the file type.
The machine stamp is invalid and so the data’s byte order cannot be determined.
The mode number is not recognised. Currently accepted modes are 0, 1, 2, 4, 6 and 12.
The file is not large enough for the specified extended header size.
The data block is not large enough for the specified data type and dimensions.
MrcInterpreter offers a permissive read mode for handling problematic files. If permissive is set to True and any of the validity checks fails, a warning is issued instead of an exception, and file interpretation continues. If the mode number is invalid or the data block is too small, the data attribute will be set to None. In this case, it might be possible to inspect and correct the header, and then call _read() again to read the data correctly. See the usage guide for more details.
Methods:
Methods relevant to subclasses:
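The difference between strict and permissive reading can be pictured with a small sketch. This is an illustration only, built around a hypothetical helper named report_problem(); it is not how MrcInterpreter is implemented internally.

```python
import warnings


def report_problem(message, permissive=False):
    """Raise in strict mode, or warn and carry on in permissive mode."""
    if permissive:
        warnings.warn(message, RuntimeWarning)
    else:
        raise ValueError(message)


# Permissive mode: the problem is recorded as a warning and execution continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    report_problem("Unrecognised mode number", permissive=True)
print(len(caught), caught[0].category.__name__)

# Strict mode: the same problem stops interpretation with an exception.
try:
    report_problem("Unrecognised mode number")
except ValueError as err:
    print("ValueError:", err)
```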
- __init__(iostream=None, permissive=False, header_only=False, **kwargs)
Initialise a new MrcInterpreter object.
This initialiser reads the stream if it is given. In general, subclasses should call __init__() without giving an iostream argument, then set the _iostream attribute themselves and call _read() when ready.
To use the MrcInterpreter class directly, pass a stream when creating the object (or for a write-only stream, create an MrcInterpreter with no stream, call _create_default_attributes() and set the _iostream attribute directly).
- Parameters
- Raises
ValueError – If iostream is given, the data it contains cannot be interpreted as a valid MRC file and permissive is False.
- Warns
RuntimeWarning – If iostream is given, the data it contains cannot be interpreted as a valid MRC file and permissive is True.
- _read(header_only=False)
Read the header, extended header and data from the I/O stream.
Before calling this method, the stream should be open and positioned at the start of the header. This method will advance the stream to the end of the data block (or the end of the extended header if header_only is True).
- Parameters
header_only – Only read the header and extended header from the stream. The default is False.
- Raises
ValueError – If the data in the stream cannot be interpreted as a valid MRC file and permissive is False.
- Warns
RuntimeWarning – If the data in the stream cannot be interpreted as a valid MRC file and permissive is True.
- _read_header()
Read the MRC header from the I/O stream.
The header will be read from the current stream position, and the stream will be advanced by 1024 bytes.
- Raises
ValueError – If the data in the stream cannot be interpreted as a valid MRC file and permissive is False.
- Warns
RuntimeWarning – If the data in the stream cannot be interpreted as a valid MRC file and permissive is True.
- _read_extended_header()
Read the extended header from the stream.
If there is no extended header, a zero-length array is assigned to the extended_header attribute.
The dtype is set as void ('V1').
- Raises
ValueError – If the stream is not long enough to contain the extended header indicated by the header and permissive is False.
- Warns
RuntimeWarning – If the stream is not long enough to contain the extended header indicated by the header and permissive is True.
- _read_data(max_bytes=0)
Read the data array from the stream.
This method uses information from the header to set the data array’s shape and dtype.
- Parameters
max_bytes – Read at most this many bytes from the stream. If zero or negative, the full size of the data block as defined in the header will be read, even if this is very large.
- Raises
ValueError – If the stream is not long enough to contain the data indicated by the header and permissive is False.
- Warns
RuntimeWarning – If the stream is not long enough to contain the data indicated by the header and permissive is True.
- _read_bytearray_from_stream(number_of_bytes)
Read a bytearray from the stream.
This default implementation relies on the stream implementing the readinto() method to avoid copying the new array while creating the mutable bytearray. Subclasses should override this if their stream does not support readinto().
- Returns
A 2-tuple of the bytearray and the number of bytes that were read from the stream.
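The readinto() approach described above can be sketched with the standard library alone. The function below is a hypothetical stand-in written for illustration, not a copy of the mrcfile method; it shows how a pre-allocated bytearray is filled in place and why the number of bytes read is returned alongside it.

```python
import io


def read_bytearray_from_stream(stream, number_of_bytes):
    """Fill a new bytearray from the stream without an intermediate copy."""
    buffer = bytearray(number_of_bytes)
    bytes_read = stream.readinto(buffer)
    return buffer, bytes_read


# The stream holds only 8 bytes, so a 16-byte request is partly satisfied.
stream = io.BytesIO(b"MAP " + bytes(4))
buffer, bytes_read = read_bytearray_from_stream(stream, 16)
print(len(buffer), bytes_read)  # 16 8
```

Comparing the second item of the tuple with the requested size is one way for a caller to detect a stream that is shorter than the header says it should be.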
- close()
Flush to the stream and clear the header and data attributes.
- flush()
Flush the header and data arrays to the I/O stream.
This implementation seeks to the start of the stream, writes the header, extended header and data arrays, and then truncates the stream.
Subclasses should override this implementation for streams which do not support seek() or truncate().
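The seek, write and truncate sequence can be sketched as follows. This is a minimal illustration using plain bytes and an in-memory stream; flush_to_stream() is a hypothetical name, and the real method writes numpy arrays rather than byte strings.

```python
import io


def flush_to_stream(stream, header, extended_header, data):
    """Rewrite the whole stream and discard anything left over at the end."""
    stream.seek(0)
    stream.write(header)
    stream.write(extended_header)
    stream.write(data)
    stream.truncate()  # cut the stream at the current position


# Stale content that is longer than what is about to be written.
stream = io.BytesIO(b"x" * 100)
flush_to_stream(stream, b"H" * 8, b"", b"D" * 4)
print(stream.getvalue())  # b'HHHHHHHHDDDD'
```

Without the final truncate() call, the old trailing bytes would remain in the stream after the new data block.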
mrcfile.mrcmemmap module
mrcmemmap
Module which exports the MrcMemmap
class.
- Classes:
MrcMemmap
: An MrcFile subclass that uses a memory-mapped data array.
- class mrcfile.mrcmemmap.MrcMemmap(name, mode='r', overwrite=False, permissive=False, header_only=False, **kwargs)
Bases:
MrcFile
MrcFile subclass that uses a numpy memmap array for the data.
Using a memmap means that the disk access is done lazily: the data array will only be read or written in small chunks when required. To access the contents of the array, use the array slice operator.
Usage is the same as for MrcFile.
Note that memmap arrays use a fairly small chunk size and so performance could be poor on file systems that are optimised for infrequent large I/O operations.
If required, it is possible to create a very large empty file by creating a new MrcMemmap and then calling _open_memmap() to create the memmap array, which can then be filled slice-by-slice. Be aware that the contents of a new, empty memmap array depend on your platform: the data values could be garbage or zeros.
- set_extended_header(extended_header)
Replace the file’s extended header.
Note that the file’s entire data block must be moved if the extended header size changes. Setting a new extended header can therefore be very time consuming with large files, if the new extended header occupies a different number of bytes than the previous one.
- flush()
Flush the header and data arrays to the file buffer.
- _read_data()
Read the data block from the file.
This method first calculates the parameters needed to read the data (block start position, endian-ness, file mode, array shape) and then opens the data as a numpy memmap array.
- _open_memmap(dtype, shape)
Open a new memmap array pointing at the file’s data block.
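To give a feel for what pointing a memmap at the data block involves, here is a sketch that uses numpy.memmap directly. It is an assumption-laden illustration rather than the body of _open_memmap(): the file is a zero-filled placeholder and not a valid MRC file, and the shape, dtype and file name are made up for the example.

```python
import os
import tempfile

import numpy as np

HEADER_BYTES = 1024          # size of the main MRC header
shape = (4, 5, 6)            # hypothetical data dimensions
dtype = np.dtype("<f4")      # little-endian float32, as used for MRC mode 2

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sketch.bin")
    # Write a zero-filled placeholder: header bytes followed by the data block.
    with open(path, "wb") as f:
        f.write(bytes(HEADER_BYTES + dtype.itemsize * 4 * 5 * 6))

    # Map only the data block; nothing is read until a slice is accessed.
    data = np.memmap(path, dtype=dtype, mode="r+", offset=HEADER_BYTES, shape=shape)
    data[0] = 1.0            # fill the first slice
    data.flush()
    del data                 # release the mapping before the file is removed

    # Read the data block back with ordinary file I/O to confirm the write.
    values = np.fromfile(path, dtype=dtype, offset=HEADER_BYTES)

print(values.size, float(values.sum()))  # 120 30.0
```

The offset argument is what keeps the header bytes out of the mapped array, so slices of the array correspond directly to slices of the data block.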
- _close_data()
Delete the existing memmap array, if it exists.
The array is flagged as read-only before deletion, so if a reference to it has been kept elsewhere, changes to it should no longer be able to change the file contents.
- _set_new_data(data)
Override of _set_new_data() to handle opening a new memmap and copying data into it.
mrcfile.mrcobject module
mrcobject
Module which exports the MrcObject
class.
- Classes:
MrcObject
: An object representing image or volume data in the MRC format.
- class mrcfile.mrcobject.MrcObject(**kwargs)
Bases:
object
An object representing image or volume data in the MRC format.
The header, extended header and data are stored as numpy arrays and exposed as read-only attributes. To replace the data or extended header, call set_data() or set_extended_header(). The header cannot be replaced but can be modified in place.
Voxel size is exposed as a writeable attribute, but is calculated on-the-fly from the header’s cella and mx/my/mz fields.
Three-dimensional data can represent either a stack of 2D images, or a 3D volume. This is indicated by the header’s ispg (space group) field, which is set to 0 for image data and >= 1 for volume data. The is_single_image(), is_image_stack(), is_volume() and is_volume_stack() methods can be used to identify the type of information stored in the data array. For 3D data, the set_image_stack() and set_volume() methods can be used to switch between image stack and volume interpretations of the data.
If the data contents have been changed, you can use the update_header_from_data() and update_header_stats() methods to make the header consistent with the data. These methods are called automatically if the data array is replaced by calling set_data(). update_header_from_data() is fast, even with very large data arrays, because it only examines the shape and type of the data array. update_header_stats() calculates statistics from all items in the data array and so can be slow for very large arrays. If necessary, the reset_header_stats() method can be called to set the header fields to indicate that the statistics are undetermined.
Attributes:
Methods:
Attributes and methods relevant to subclasses:
- __init__(**kwargs)
Initialise a new MrcObject.
This initialiser deliberately avoids creating any arrays and simply sets the header, extended header and data attributes to None. This allows subclasses to call __init__() at the start of their initialisers and then set the attributes themselves, probably by reading from a file, or by calling _create_default_attributes() for a new empty object.
Note that this behaviour might change in future: this initialiser could take optional arguments to allow the header and data to be provided by the caller, or might create the standard empty defaults rather than setting the attributes to None.
- _check_writeable()
Check that this MRC object is writeable.
- Raises
ValueError – If this object is read-only.
- _create_default_attributes()
Set valid default values for the header and data attributes.
- _create_default_header()
Create a default MRC file header.
The header is initialised with standard file type and version information, default values for some essential fields, and zeros elsewhere. The first text label is also set to indicate the file was created by this module.
- property header
Get the header as a
numpy record array
.
- property extended_header
Get the extended header as a numpy array.
The dtype will be void (raw data, dtype 'V'). If the actual data type of the extended header is known, the dtype of the array can be changed to match. For supported types (e.g. 'FEI1' and 'FEI2'), the indexed part of the extended header (excluding any zero padding) can be accessed using the indexed_extended_header property.
The extended header may be modified in place. To replace it completely, call set_extended_header().
- property indexed_extended_header
Get the indexed part of the extended header as a numpy array with the appropriate dtype set.
Currently only 'FEI1' and 'FEI2' extended headers are supported. Modifications to the indexed extended header will not change the extended header data recorded in this MrcObject. If the extended header type is unrecognised or the extended header data is not of sufficient length, a warning will be produced and the indexed extended header will be None.
- set_extended_header(extended_header)
Replace the extended header.
If you set the extended header you should also set the header.exttyp field to indicate the type of extended header.
- property data
Get the data as a
numpy array
.
- set_data(data)
Replace the data array.
This replaces the current data with the given array (or a copy of it), and updates the header to match the new data dimensions. The data statistics (min, max, mean and rms) stored in the header will also be updated.
- Warns
RuntimeWarning – If the data array contains Inf or NaN values.
- _close_data()
Close the data array.
- _set_new_data(data)
Replace the data array with a new one.
The new data array is not checked - it must already be valid for use in an MRC file.
- property voxel_size
Get or set the voxel size in angstroms.
The voxel size is returned as a structured NumPy record array with three fields (x, y and z). For example:
>>> mrc.voxel_size
rec.array((0.44825, 0.3925, 0.45874998), dtype=[('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
>>> mrc.voxel_size.x
array(0.44825, dtype=float32)
Note that changing the voxel_size array in-place will not change the voxel size in the file – to prevent this being overlooked accidentally, the writeable flag is set to False on the voxel_size array.
To set the voxel size, assign a new value to the voxel_size attribute. You may give a single number, a 3-tuple (x, y, z) or a modified version of the voxel_size array. The following examples are all equivalent:
>>> mrc.voxel_size = 1.0
>>> mrc.voxel_size = (1.0, 1.0, 1.0)
>>> vox_sizes = mrc.voxel_size
>>> vox_sizes.flags.writeable = True
>>> vox_sizes.x = 1.0
>>> vox_sizes.y = 1.0
>>> vox_sizes.z = 1.0
>>> mrc.voxel_size = vox_sizes
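The relationship between the voxel size and the header fields can be sketched in plain Python. This assumes that the voxel size along each axis is the cell dimension divided by the number of grid intervals (cella.x / mx and so on); voxel_size_from_header() is a hypothetical helper and not part of mrcfile.

```python
def voxel_size_from_header(cella, mx, my, mz):
    """Return (x, y, z) voxel sizes in angstroms from cell and grid sizes."""
    cell_x, cell_y, cell_z = cella
    return (cell_x / mx, cell_y / my, cell_z / mz)


# A 100 x 50 x 25 angstrom cell sampled on a 100 x 100 x 50 grid.
print(voxel_size_from_header((100.0, 50.0, 25.0), mx=100, my=100, mz=50))
# (1.0, 0.5, 0.5)
```

Because the value is derived from other header fields rather than stored, editing the returned array in place would have nothing to write back to, which is consistent with the writeable flag being cleared.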
- _set_voxel_size(x_size, y_size, z_size)
Set the voxel size.
- Parameters
x_size – The voxel size in the X direction, in angstroms
y_size – The voxel size in the Y direction, in angstroms
z_size – The voxel size in the Z direction, in angstroms
- property nstart
Get or set the grid start locations.
This provides a convenient way to get and set the values of the header’s nxstart, nystart and nzstart fields. Note that these fields are integers and are measured in voxels, not angstroms. The start locations are returned as a structured NumPy record array with three fields (x, y and z). For example:
>>> mrc.header.nxstart
array(0, dtype=int32)
>>> mrc.header.nystart
array(-21, dtype=int32)
>>> mrc.header.nzstart
array(-12, dtype=int32)
>>> mrc.nstart
rec.array((0, -21, -12), dtype=[('x', '<i4'), ('y', '<i4'), ('z', '<i4')])
>>> mrc.nstart.y
array(-21, dtype=int32)
Note that changing the nstart array in-place will not change the values in the file – to prevent this being overlooked accidentally, the writeable flag is set to False on the nstart array.
To set the start locations, assign a new value to the nstart attribute. You may give a single number, a 3-tuple (x, y, z) or a modified version of the nstart array. The following examples are all equivalent:
>>> mrc.nstart = -150
>>> mrc.nstart = (-150, -150, -150)
>>> starts = mrc.nstart
>>> starts.flags.writeable = True
>>> starts.x = -150
>>> starts.y = -150
>>> starts.z = -150
>>> mrc.nstart = starts
- _set_nstart(nxstart, nystart, nzstart)
Set the grid start locations.
- Parameters
nxstart – The location of the first column in the unit cell
nystart – The location of the first row in the unit cell
nzstart – The location of the first section in the unit cell
- is_single_image()
Identify whether the file represents a single image.
- Returns
True
if the data array is two-dimensional.
- is_image_stack()
Identify whether the file represents a stack of images.
- Returns
True
if the data array is three-dimensional and the space group is zero.
- is_volume()
Identify whether the file represents a volume.
- Returns
True
if the data array is three-dimensional and the space group is not zero.
- is_volume_stack()
Identify whether the file represents a stack of volumes.
- Returns
True
if the data array is four-dimensional.
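The four identification methods above amount to a small decision table over the number of data dimensions and the space group. The sketch below restates that table in plain Python; describe_data() is a hypothetical function written for illustration and does not exist in mrcfile.

```python
def describe_data(ndim, ispg):
    """Classify data from its number of dimensions and space group."""
    if ndim == 2:
        return "single image"
    if ndim == 3:
        return "image stack" if ispg == 0 else "volume"
    if ndim == 4:
        return "volume stack"
    raise ValueError("MRC data must have 2, 3 or 4 dimensions")


print(describe_data(3, 0))    # image stack
print(describe_data(3, 1))    # volume
print(describe_data(4, 401))  # volume stack
```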
- set_image_stack()
Change three-dimensional data to represent an image stack.
This method changes the space group number (header.ispg) to zero.
- Raises
ValueError – If the data array is not three-dimensional.
- set_volume()
Change three-dimensional data to represent a volume.
If the space group was previously zero (representing an image stack), this method sets it to one. Otherwise the space group is not changed.
- Raises
ValueError – If the data array is not three-dimensional.
- update_header_from_data()
Update the header from the data array.
This function updates the header byte order and machine stamp to match the byte order of the data. It also updates the file mode, space group and the dimension fields nx, ny, nz, mx, my and mz.
If the data is 2D, the space group is set to 0 (image stack). For 3D data the space group is not changed, and for 4D data the space group is set to 401 (simple P1 volume stack) unless it is already in the volume stack range (401–630).
This means that new 3D data will be treated as an image stack if the previous data was a single image or image stack, or as a volume if the previous data was a volume or volume stack.
Note that this function does not update the data statistics fields in the header (dmin, dmax, dmean and rms). Use the update_header_stats() function to update the statistics. (This is for performance reasons – updating the statistics can take a long time for large data sets, but updating the other header information is always fast because only the type and shape of the data array need to be inspected.)
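The space group rule described above can be written down as a short function. This is a sketch of the documented behaviour only, using a hypothetical name (updated_space_group()); it leaves out the byte order, mode and dimension updates that the real method also performs.

```python
def updated_space_group(ndim, ispg):
    """Return the space group to store for data with ndim dimensions."""
    if ndim == 2:
        return 0                      # image data
    if ndim == 4 and not 401 <= ispg <= 630:
        return 401                    # simple P1 volume stack
    return ispg                       # 3D data, or already a volume stack


print(updated_space_group(2, 1))    # 0
print(updated_space_group(3, 1))    # 1
print(updated_space_group(4, 0))    # 401
print(updated_space_group(4, 405))  # 405
```

The unchanged 3D case is what makes new 3D data inherit the image stack or volume interpretation of the data it replaces.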
- update_header_stats()
Update the header’s dmin, dmax, dmean and rms fields from the data.
Note that this can take some time with large files, particularly with files larger than the currently available memory.
- Warns
RuntimeWarning – If the data array contains Inf or NaN values.
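For a sense of what the four statistics represent, here is a standard-library sketch. It assumes that rms means the root-mean-square deviation from the mean (the population standard deviation), which is how the MRC2014 specification describes the field; header_stats() is a hypothetical helper, and mrcfile itself works on numpy arrays.

```python
import statistics


def header_stats(values):
    """Return (dmin, dmax, dmean, rms) for a sequence of data values."""
    values = list(values)
    return (
        min(values),
        max(values),
        statistics.fmean(values),
        statistics.pstdev(values),  # RMS deviation from the mean
    )


dmin, dmax, dmean, rms = header_stats([1.0, 2.0, 3.0, 4.0])
print(dmin, dmax, dmean, round(rms, 4))  # 1.0 4.0 2.5 1.118
```

Every value has to be visited to compute these figures, which is why the update can be slow for very large arrays.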
- reset_header_stats()
Set the header statistics to indicate that the values are unknown.
- print_header(print_file=None)
Print the contents of all header fields.
- Parameters
print_file – The output text stream to use for printing the header. This is passed directly to the file argument of Python’s print() function. The default is None, which means output will be printed to sys.stdout.
- get_labels()
Get the labels from the MRC header.
Up to ten labels are stored in the header as arrays of 80 bytes. This method returns the labels as Python strings, filtered to remove non-printable characters. To access the raw bytes (including any non-printable characters) use the header.label attribute (and note that header.nlabl stores the number of labels currently set).
- Returns
The labels, as a list of strings. The list will contain between 0 and 10 items, each containing up to 80 characters.
- add_label(label)
Add a label to the MRC header.
The new label will be stored after any labels already in the header. If all ten labels are already in use, an exception will be raised.
Future versions of this method might add checks to ensure that labels containing valid text are not overwritten even if the nlabl value is incorrect.
- Parameters
label – The label value to store, as a string containing only printable ASCII characters.
- Raises
ValueError – If the label is longer than 80 bytes or contains non-printable or non-ASCII characters.
IndexError – If the file already contains 10 labels and so an additional label cannot be stored.
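The constraints on labels can be illustrated with a list-based sketch. It is not the mrcfile implementation: the labels are held in an ordinary Python list instead of the header array, the function names are hypothetical, and the printable range (character codes 32 to 126) is an assumption about what counts as printable ASCII.

```python
MAX_LABELS = 10
LABEL_BYTES = 80


def printable_ascii(text):
    """Return True if every character is a printable ASCII character."""
    return all(32 <= ord(ch) <= 126 for ch in text)


def add_label(labels, label):
    """Append a label after the existing ones, enforcing the header limits."""
    if len(label) > LABEL_BYTES or not printable_ascii(label):
        raise ValueError("Label must be at most 80 printable ASCII characters")
    if len(labels) >= MAX_LABELS:
        raise IndexError("All ten labels are already in use")
    labels.append(label)


labels = []
add_label(labels, "Created by a sketch")
print(labels)  # ['Created by a sketch']
```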
- validate(print_file=None)
Validate this MrcObject.
This method runs a series of tests to check whether this object complies strictly with the MRC2014 format specification:
MRC format ID string: The header’s map field must contain “MAP ” (with a trailing space).
Machine stamp: The machine stamp should contain one of 0x44 0x44 0x00 0x00, 0x44 0x41 0x00 0x00 or 0x11 0x11 0x00 0x00.
MRC mode: The mode field should be one of the supported mode numbers: 0, 1, 2, 4, 6 or 12. (Note that MRC modes 3 and 101 are also valid according to the MRC2014 specification but are not supported by mrcfile.)
Map and cell dimensions: The header fields nx, ny, nz, mx, my, mz, cella.x, cella.y and cella.z must all be positive numbers.
Axis mapping: Header fields mapc, mapr and maps must contain the values 1, 2, and 3 (in any order).
Volume stack dimensions: If the spacegroup is in the range 401–630, representing a volume stack, the nz field should be exactly divisible by mz to represent the number of volumes in the stack.
Header labels: The nlabl field should be set to indicate the number of labels in use, and the labels in use should appear first in the label array (that is, there should be no blank labels between text-filled ones).
MRC format version: The nversion field should be 20140 or 20141 for compliance with the MRC2014 standard.
Extended header type: If an extended header is present, the exttyp field should be set to indicate the type of extended header.
Data statistics: The statistics in the header should be correct for the actual data, or marked as undetermined.
- Parameters
print_file – The output text stream to use for printing messages about the validation. This is passed directly to the file argument of Python’s print() function. The default is None, which means output will be printed to sys.stdout.
- Returns
True if this MrcObject is valid, or False if it does not meet the MRC format specification in any way.
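Two of the tests listed above are simple enough to sketch in a few lines. The functions below are hypothetical illustrations of the axis mapping and volume stack rules, taking plain integers in place of header fields; they are not taken from mrcfile.

```python
def axis_mapping_is_valid(mapc, mapr, maps):
    """The three fields must hold 1, 2 and 3, in any order."""
    return sorted((mapc, mapr, maps)) == [1, 2, 3]


def volume_stack_is_valid(ispg, nz, mz):
    """For volume stacks (space group 401-630), nz must be a multiple of mz."""
    if 401 <= ispg <= 630:
        return mz > 0 and nz % mz == 0
    return True  # the rule only applies to volume stacks


print(axis_mapping_is_valid(2, 1, 3))       # True
print(axis_mapping_is_valid(1, 1, 3))       # False
print(volume_stack_is_valid(401, 60, 20))   # True: three volumes of 20 sections
print(volume_stack_is_valid(401, 50, 20))   # False
```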
mrcfile.utils module
utils
Utility functions used by the other modules in the mrcfile package.
Functions
data_dtype_from_header(): Work out the data dtype from an MRC header.
data_shape_from_header(): Work out the data array shape from an MRC header.
mode_from_dtype(): Convert a numpy dtype to an MRC mode number.
dtype_from_mode(): Convert an MRC mode number to a numpy dtype.
pretty_machine_stamp(): Get a nicely-formatted string from a machine stamp.
machine_stamp_from_byte_order(): Get a machine stamp from a byte order indicator.
byte_orders_equal(): Compare two byte order indicators for equal endianness.
normalise_byte_order(): Convert a byte order indicator to '<' or '>'.
spacegroup_is_volume_stack(): Identify if a space group number represents a volume stack.
- mrcfile.utils.data_dtype_from_header(header)
Return the data dtype indicated by the given header.
This function calls dtype_from_mode() to get the basic dtype, and then makes sure that the byte order of the new dtype matches the byte order of the header’s mode field.
- Parameters
header – An MRC header as a numpy record array.
- Returns
The numpy dtype object for the data array corresponding to the given header.
- Raises
ValueError – If there is no corresponding dtype for the given mode.
- mrcfile.utils.data_shape_from_header(header)
Return the data shape indicated by the given header.
- Parameters
header – An MRC header as a numpy record array.
- Returns
The shape tuple for the data array corresponding to the given header.
- mrcfile.utils.mode_from_dtype(dtype)
Return the MRC mode number corresponding to the given numpy dtype.
The conversion is as follows:
float16 -> mode 12
float32 -> mode 2
int8 -> mode 0
int16 -> mode 1
uint8 -> mode 6 (data will be widened to 16 bits in the file)
uint16 -> mode 6
complex64 -> mode 4
Note that there is no numpy dtype which corresponds to MRC mode 3.
- Parameters
dtype – A numpy dtype object.
- Returns
The MRC mode number.
- Raises
ValueError – If there is no corresponding MRC mode for the given dtype.
- mrcfile.utils.dtype_from_mode(mode)
Return the numpy dtype corresponding to the given MRC mode number.
The mode parameter may be given as a Python scalar, numpy scalar or single-item numpy array.
The conversion is as follows:
mode 0 -> int8
mode 1 -> int16
mode 2 -> float32
mode 4 -> complex64
mode 6 -> uint16
mode 12 -> float16
Note that modes 3 and 101 are not supported as there is no matching numpy dtype.
- Parameters
mode – The MRC mode number. This may be given as any type which can be converted to an int, for example a Python scalar (int or float), a numpy scalar or a single-item numpy array.
- Returns
The numpy dtype object corresponding to the given mode.
- Raises
ValueError – If there is no corresponding dtype for the given mode, or if mode is an array and does not contain exactly one item.
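The two conversion tables documented for mode_from_dtype() and dtype_from_mode() can be restated as a pair of dictionaries. The sketch below works with dtype names as strings to stay free of dependencies; the names DTYPE_NAME_FROM_MODE, mode_from_dtype_name() and dtype_name_from_mode() are hypothetical, whereas the real functions accept and return numpy dtype objects.

```python
DTYPE_NAME_FROM_MODE = {
    0: "int8",
    1: "int16",
    2: "float32",
    4: "complex64",
    6: "uint16",
    12: "float16",
}

# The reverse table, plus uint8, which is widened to 16 bits (mode 6).
MODE_FROM_DTYPE_NAME = {name: mode for mode, name in DTYPE_NAME_FROM_MODE.items()}
MODE_FROM_DTYPE_NAME["uint8"] = 6


def mode_from_dtype_name(name):
    """Return the MRC mode for a dtype name, or raise ValueError."""
    try:
        return MODE_FROM_DTYPE_NAME[name]
    except KeyError:
        raise ValueError("dtype '{0}' cannot be converted to an MRC mode".format(name)) from None


def dtype_name_from_mode(mode):
    """Return the dtype name for an MRC mode, or raise ValueError."""
    try:
        return DTYPE_NAME_FROM_MODE[int(mode)]
    except KeyError:
        raise ValueError("Unrecognised mode '{0}'".format(mode)) from None


print(mode_from_dtype_name("float32"))                      # 2
print(dtype_name_from_mode(mode_from_dtype_name("uint8")))  # uint16
```

The second call shows that the conversion is not a perfect round trip: uint8 data comes back as uint16 after being written with mode 6.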
- mrcfile.utils.pretty_machine_stamp(machst)
Return a human-readable hex string for a machine stamp.
- mrcfile.utils.byte_order_from_machine_stamp(machst)
Return the byte order corresponding to the given machine stamp.
- Parameters
machst – The machine stamp, as a bytearray or a numpy array of bytes.
- Returns
'<' if the machine stamp represents little-endian data, or '>' if it represents big-endian.
- Raises
ValueError – If the machine stamp is invalid.
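A table lookup is one way to picture the conversion from machine stamp to byte order. The sketch below assumes that only the first two bytes of the stamp matter and that 0x44 0x44 and 0x44 0x41 both indicate little-endian data; byte_order_from_stamp() is a hypothetical name, and the real function may make its decision differently.

```python
LITTLE_ENDIAN_STAMPS = {bytes([0x44, 0x44]), bytes([0x44, 0x41])}
BIG_ENDIAN_STAMPS = {bytes([0x11, 0x11])}


def byte_order_from_stamp(machst):
    """Return '<' or '>' for a 4-byte machine stamp, or raise ValueError."""
    first_two = bytes(machst[:2])
    if first_two in LITTLE_ENDIAN_STAMPS:
        return "<"
    if first_two in BIG_ENDIAN_STAMPS:
        return ">"
    raise ValueError("Unrecognised machine stamp")


print(byte_order_from_stamp(bytearray([0x44, 0x44, 0, 0])))  # <
print(byte_order_from_stamp(bytearray([0x11, 0x11, 0, 0])))  # >
```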
- mrcfile.utils.machine_stamp_from_byte_order(byte_order='=')
Return the machine stamp corresponding to the given byte order indicator.
- Parameters
byte_order – The byte order indicator: one of '=', '<' or '>', as defined and used by numpy dtype objects.
- Returns
The machine stamp which corresponds to the given byte order, as a bytearray. This will be either (0x44, 0x44, 0, 0) for little-endian or (0x11, 0x11, 0, 0) for big-endian. If the given byte order indicator is '=', the native byte order is used.
- Raises
ValueError – If the byte order indicator is unrecognised.
- mrcfile.utils.byte_orders_equal(a, b)
Work out if the byte order indicators represent the same endianness.
- Parameters
a – The first byte order indicator: one of '=', '<' or '>', as defined and used by numpy dtype objects.
b – The second byte order indicator.
- Returns
True if the byte order indicators represent the same endianness.
- Raises
ValueError – If the byte order indicator is not recognised.
- mrcfile.utils.normalise_byte_order(byte_order)
Convert a numpy byte order indicator to one of '<' or '>'.
- Parameters
byte_order – One of '=', '<' or '>'.
- Returns
'<' if the byte order indicator represents little-endian data, or '>' if it represents big-endian. Therefore on a little-endian machine, '=' will be converted to '<', but on a big-endian machine it will be converted to '>'.
- Raises
ValueError – If byte_order is not one of '=', '<' or '>'.
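The native byte order case can be resolved with sys.byteorder from the standard library. The following is a sketch of the documented behaviour under a hypothetical function name, not a copy of the mrcfile function.

```python
import sys


def normalise(byte_order):
    """Convert '=', '<' or '>' to an explicit '<' or '>'."""
    if byte_order in ("<", ">"):
        return byte_order
    if byte_order == "=":
        # '=' means native order, so ask the interpreter which one that is.
        return "<" if sys.byteorder == "little" else ">"
    raise ValueError("Unrecognised byte order indicator '{0}'".format(byte_order))


print(normalise(">"))  # >
print(normalise("="))  # '<' on little-endian machines, '>' on big-endian ones
```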
- mrcfile.utils.spacegroup_is_volume_stack(ispg)
Identify if the given space group number represents a volume stack.
- Parameters
ispg – The space group number, as an integer, numpy scalar or single-element numpy array.
- Returns
True if the space group number is in the range 401–630.
- mrcfile.utils.is_printable_ascii(string_)
Check if a string is entirely composed of printable ASCII characters.
- mrcfile.utils.printable_string_from_bytes(bytes_)
Convert bytes into a printable ASCII string by removing non-printable characters.
- mrcfile.utils.bytes_from_string(string_)
Convert a string to bytes.
Even though this is a one-liner, the details are tricky to get right so things work properly in both Python 2 and 3. It’s broken out as a separate function so it can be thoroughly tested.
- Raises
UnicodeError – If the input contains non-ASCII characters.
mrcfile.validator module
validator
Module for top-level functions that validate MRC files.
This module is runnable to allow files to be validated easily from the command line.
- mrcfile.validator.main(args=None)
Validate a list of MRC files given as command arguments.
The return value is used as the process exit code when this function is called by running this module or from the corresponding console_scripts entry point.
- Returns
0 if all command arguments are names of valid MRC files. 1 if no file names are given or any of the files is not a valid MRC file.
- mrcfile.validator.validate_all(names, print_file=None)
Validate a list of MRC files.
This function calls validate() for each file name in the given list.
- Parameters
names – A sequence of file names to open and validate.
print_file – The output text stream to use for printing messages about the validation. This is passed directly to the print_file argument of the validate() function. The default is None, which means output will be printed to sys.stdout.
- Returns
True if all of the files are valid, or False if any of the files do not meet the MRC format specification in any way.
- Raises
OSError – If one of the files does not exist or cannot be opened.
- Warns
RuntimeWarning – If one of the files is seriously invalid because it has no map ID string, an incorrect machine stamp, an unknown mode number, or is not the same size as expected from the header.
- mrcfile.validator.validate(name, print_file=None)
Validate an MRC file.
This function first opens the file by calling open() (with permissive=True), then calls validate(), which runs a series of tests to check whether the file complies with the MRC2014 format specification.
If the file is completely valid, this function returns True, otherwise it returns False. Messages explaining the validation result will be printed to sys.stdout by default, but if a text stream is given (using the print_file argument) output will be printed to that instead.
Badly invalid files will also cause warning messages to be issued, which will be written to sys.stderr by default. See the documentation of the warnings module for information on how to suppress or capture warning output.
Because the file is opened by calling open(), gzip- and bzip2-compressed MRC files can be validated easily using this function.
After the file has been opened, it is checked for problems. The tests are:
MRC format ID string: The map field in the header should contain “MAP ” (with a trailing space).
Machine stamp: The machine stamp should contain one of 0x44 0x44 0x00 0x00, 0x44 0x41 0x00 0x00 or 0x11 0x11 0x00 0x00.
MRC mode: The mode field should be one of the supported mode numbers: 0, 1, 2, 4, 6 or 12. (Note that MRC modes 3 and 101 are also valid according to the MRC2014 specification but are not supported by mrcfile.)
Map and cell dimensions: The header fields nx, ny, nz, mx, my, mz, cella.x, cella.y and cella.z must all be positive numbers.
Axis mapping: Header fields mapc, mapr and maps must contain the values 1, 2, and 3 (in any order).
Volume stack dimensions: If the spacegroup is in the range 401–630, representing a volume stack, the nz field should be exactly divisible by mz to represent the number of volumes in the stack.
Header labels: The nlabl field should be set to indicate the number of labels in use, and the labels in use should appear first in the label array.
MRC format version: The nversion field should be 20140 or 20141 for compliance with the MRC2014 standard.
Extended header type: If an extended header is present, the exttyp field should be set to indicate the type of extended header.
Data statistics: The statistics in the header should be correct for the actual data in the file, or marked as undetermined.
File size: The size of the file on disk should match the expected size calculated from the MRC header.
- Parameters
name – The file name to open and validate.
print_file – The output text stream to use for printing messages about the validation. This is passed directly to the file argument of Python’s print() function. The default is None, which means output will be printed to sys.stdout.
- Returns
True if the file is valid, or False if the file does not meet the MRC format specification in any way.
- Raises
OSError – If the file does not exist or cannot be opened.
- Warns
RuntimeWarning – If the file is seriously invalid because it has no map ID string, an incorrect machine stamp, an unknown mode number, or is not the same size as expected from the header.