limix.io.bgen.read

limix.io.bgen.read(filepath, size=50, verbose=True, metadata_file=True, sample_file=None)

Read a given BGEN file.

Parameters:

filepath (str) – A BGEN file path.
size (float, optional) – Chunk size in megabytes. Defaults to 50.
verbose (bool, optional) – True to show progress; False otherwise. Defaults to True.
metadata_file (bool, str, optional) – If True, variants metadata is read from the metadata file filepath + ".metadata" when possible; otherwise it is read from the BGEN file itself. If filepath + ".metadata" does not exist, the reader tries to create it to speed up subsequent reads. If False, variants metadata is read only from the BGEN file. If a file path is given, that file is assumed to be a valid, readable metadata file, and variants metadata is read from it only. Defaults to True.
sample_file (str, optional) – A sample file in GEN format. If provided, sample IDs are read from this file; otherwise they are read from the BGEN file itself, if present. Defaults to None.
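
The three modes of metadata_file can be easier to follow as a decision function. The sketch below is only an illustration of the rules described above: resolve_metadata_source is a hypothetical helper, not part of limix, and it simplifies "if this is not possible" to a plain existence check.

```python
import os
import tempfile


def resolve_metadata_source(filepath, metadata_file=True):
    """Return (source, path) for the variants metadata.

    Hypothetical helper that mirrors the documented rules; not limix code.
    """
    if metadata_file is False:
        # Metadata is read only from the BGEN file.
        return ("bgen", None)
    if metadata_file is True:
        default = filepath + ".metadata"
        if os.path.exists(default):
            return ("metadata_file", default)
        # No metadata file yet: metadata comes from the BGEN file, and
        # `default` is where the reader would try to create one.
        return ("bgen", default)
    # Any other value is treated as an explicit, trusted metadata file path.
    return ("metadata_file", str(metadata_file))


with tempfile.TemporaryDirectory() as tmp:
    bgen_path = os.path.join(tmp, "example.bgen")
    print(resolve_metadata_source(bgen_path, False))
    print(resolve_metadata_source(bgen_path, True))
    # Simulate an existing default metadata file.
    open(bgen_path + ".metadata", "w").close()
    print(resolve_metadata_source(bgen_path, True))
    print(resolve_metadata_source(bgen_path, "custom.metadata"))
```

The second element of the returned tuple is the path that would be read, or, in the fallback case, the path where a metadata file would be created.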

Returns:

variants (pandas.DataFrame) – Variant positions, chromosomes, RSIDs, etc.
samples (pandas.DataFrame) – Sample identifications.
genotype (dask.array.Array) – Array of genotype references.
X (dask.array.Array) – Allele probabilities.
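
A call with every keyword argument spelled out might look like the sketch below. It assumes limix is installed with the signature shown at the top of this page; the wrapper name read_bgen_without_metadata_file is hypothetical. This page lists four returned objects but does not say how they are packaged, so the sketch returns the result unchanged rather than unpacking it.

```python
def read_bgen_without_metadata_file(filepath, sample_file=None):
    """Call limix.io.bgen.read with explicit keyword arguments.

    Hypothetical wrapper for illustration; assumes the signature
    documented on this page.
    """
    # Imported lazily so that defining this helper does not require limix.
    from limix.io import bgen

    return bgen.read(
        filepath,
        size=50,              # chunk size in megabytes (the documented default)
        verbose=False,        # suppress progress output
        metadata_file=False,  # read variants metadata from the BGEN file only
        sample_file=sample_file,
    )
```

Passing metadata_file=False avoids any attempt to write next to the BGEN file, at the cost of slower reads.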

Note

Metadata files can greatly speed up subsequent reads. However, users often lack write permission for the default metadata file location, filepath + ".metadata". The limix.io.bgen.create_metadata_file() function is therefore provided for creating a metadata file at a given path.
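
One way to decide where the metadata file should live is to check whether the default location is usable and otherwise fall back to a directory you can write to. The sketch below uses only the standard library; choose_metadata_path and fallback_dir are hypothetical names, not limix API.

```python
import os
import tempfile


def choose_metadata_path(filepath, fallback_dir):
    """Pick a usable location for the metadata file of `filepath`.

    Hypothetical helper; not part of limix.
    """
    default = filepath + ".metadata"
    default_dir = os.path.dirname(os.path.abspath(default))
    # Use the default location if the file already exists or its
    # directory is writable.
    if os.path.exists(default) or os.access(default_dir, os.W_OK):
        return default
    # Otherwise place the file, with the same base name, in the fallback directory.
    return os.path.join(fallback_dir, os.path.basename(default))


with tempfile.TemporaryDirectory() as tmp:
    bgen_path = os.path.join(tmp, "example.bgen")
    print(choose_metadata_path(bgen_path, fallback_dir=tempfile.gettempdir()))
```

The chosen path could then be given to limix.io.bgen.create_metadata_file() and passed as metadata_file to limix.io.bgen.read(). The exact arguments of create_metadata_file() are not shown on this page, so check its own documentation before calling it.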