limix.io.bgen.read
limix.io.bgen.read(filepath, size=50, verbose=True, metadata_file=True, sample_file=None)
Read a given BGEN file.
Parameters:
- filepath (str) – A BGEN file path.
- size (float, optional) – Chunk size in megabytes. Defaults to 50.
- verbose (bool, optional) – True to show progress; False otherwise. Defaults to True.
- metadata_file (bool, str, optional) – If True, variant metadata is read from the metadata file filepath + ".metadata". If that file cannot be read, the metadata is read from the BGEN file itself; if the file does not exist, an attempt is made to create it under that name to speed up later reads. If False, variant metadata is read only from the BGEN file. If a file path is given instead, that metadata file is assumed to be valid and readable, and variant metadata is read from it only. Defaults to True.
- sample_file (str, optional) – A sample file in GEN format. If sample_file is provided, sample IDs are read from this file. Otherwise, they are read from the BGEN file itself, if present. Defaults to None.
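The three modes of metadata_file can be easier to follow as a small decision function. The sketch below only restates the parameter description above; it is not limix code, and the name metadata_plan and the returned dictionary keys are hypothetical.

```python
import os


def metadata_plan(filepath, metadata_file=True):
    """Summarise where variant metadata would come from.

    Illustrative model of the documented behaviour only; the function
    name and the dictionary keys are assumptions, not part of limix.
    """
    default = filepath + ".metadata"

    if metadata_file is False:
        # Metadata is read only from the BGEN file.
        return {"source": "bgen", "path": None, "may_create": False}

    if metadata_file is True:
        if os.path.isfile(default) and os.access(default, os.R_OK):
            # The default metadata file exists and is readable.
            return {"source": "metadata", "path": default, "may_create": False}
        # Fall back to the BGEN file; creating the default metadata
        # file is attempted only when it does not exist yet.
        return {
            "source": "bgen",
            "path": default,
            "may_create": not os.path.exists(default),
        }

    # An explicit path is assumed to be valid and readable.
    return {"source": "metadata", "path": str(metadata_file), "may_create": False}
```

Calling metadata_plan("example.bgen", False), for instance, reports that only the BGEN file would be consulted, while an explicit path is taken at face value without any existence check, as the description states.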
Returns:
- variants (pandas.DataFrame) – Variant positions, chromosomes, RSIDs, etc.
- samples (pandas.DataFrame) – Sample identifications.
- genotype (dask.array.Array) – Array of genotype references.
- X (dask.array.Array) – Allele probabilities.
Note
Metadata files can speed up subsequent reads considerably. However, the user often lacks write permission for the default metadata file location, filepath + ".metadata". The limix.io.bgen.create_metadata_file() function is therefore provided for creating one at a given path.
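As a minimal sketch of how a caller might handle the write-permission problem, the helper below decides what to pass as metadata_file. It uses only the standard library and does not call limix; pick_metadata_file and fallback_dir are hypothetical names introduced for illustration.

```python
import os


def pick_metadata_file(filepath, fallback_dir):
    """Return (metadata_file, must_create) for a BGEN file path.

    Hypothetical helper, not part of limix. `metadata_file` is the value
    to pass to limix.io.bgen.read; `must_create` tells the caller that
    the alternative metadata file has to be created first.
    """
    default = filepath + ".metadata"
    parent = os.path.dirname(os.path.abspath(filepath))

    # The default location works if the metadata file is already
    # readable, or if its folder is writable so it can be created.
    if os.access(default, os.R_OK) or os.access(parent, os.W_OK):
        return True, False

    # Otherwise point to a metadata file in a folder the user controls.
    alternative = os.path.join(fallback_dir, os.path.basename(default))
    return alternative, not os.path.isfile(alternative)
```

When must_create is True, the alternative file would first be created with limix.io.bgen.create_metadata_file() (see that function's documentation for its arguments), after which the returned path can be passed as metadata_file to limix.io.bgen.read.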