limix.io.plink.read

limix.io.plink.read(prefix, verbose=True)

Read PLINK files into Pandas data frames.

Parameters:

prefix (str) – Path prefix to the set of PLINK files.
verbose (bool) – True to show progress information while reading; False to suppress it. Defaults to True.

Returns:

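alleles (pandas dataframe) – Variant information (bim in the examples below).
samples (pandas dataframe) – Sample information (fam in the examples below).
genotype (dask array) – Genotype matrix (bed in the examples below), with one row per variant and one column per sample. It is evaluated lazily; call compute() to obtain the values.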

Examples

>>> from limix.io import plink
>>> from pandas_plink import example_file_prefix
>>>
>>> (bim, fam, bed) = plink.read(example_file_prefix(), verbose=False)
>>> print(bim.head())
  chrom         snp       cm    pos a0 a1  i
0     1  rs10399749  0.00000  45162  G  C  0
1     1   rs2949420  0.00000  45257  C  T  1
2     1   rs2949421  0.00000  45413  0  0  2
3     1   rs2691310  0.00000  46844  A  T  3
4     1   rs4030303  0.00000  72434  0  G  4
>>> print(fam.head())
        fid       iid    father    mother gender trait  i
0  Sample_1  Sample_1         0         0      1    -9  0
1  Sample_2  Sample_2         0         0      2    -9  1
2  Sample_3  Sample_3  Sample_1  Sample_2      2    -9  2
>>> print(bed.compute())
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]

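Note the i column in the bim and fam data frames. In bim it gives the row of the bed matrix that holds each variant; in fam it gives the column that holds each sample. For example, to extract the genotypes of all variants on chromosome 1: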

>>> from limix.io import plink
>>> from pandas_plink import example_file_prefix
>>>
>>> (bim, fam, bed) = plink.read(example_file_prefix(), verbose=False)
>>> chrom1 = bim.query("chrom=='1'")
>>> X = bed[chrom1.i.values, :].compute()
>>> print(X)
[[ 2.  2.  1.]
 [ 2.  1.  2.]
 [nan nan nan]
 [nan nan  1.]
 [ 2.  2.  2.]
 [ 2.  2.  2.]
 [ 2.  1.  0.]
 [ 2.  2.  2.]
 [ 1.  2.  2.]
 [ 2.  1.  2.]]
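
The same i-based lookup can be applied to both axes at once. The following is a minimal sketch, not part of limix: it uses small synthetic stand-ins for bim, fam and bed (a plain NumPy array rather than the array returned by plink.read), and select_genotypes is a hypothetical helper introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the three objects returned by plink.read.
# These values are made up for illustration; they are not the example files.
bim = pd.DataFrame({
    "chrom": ["1", "1", "2", "2"],
    "snp": ["rs_a", "rs_b", "rs_c", "rs_d"],
    "i": [0, 1, 2, 3],
})
fam = pd.DataFrame({
    "iid": ["Sample_1", "Sample_2", "Sample_3"],
    "i": [0, 1, 2],
})
# Rows are variants (bim.i), columns are samples (fam.i).
bed = np.array([
    [2.0, 2.0, 1.0],
    [2.0, 1.0, 2.0],
    [np.nan, np.nan, 1.0],
    [2.0, 1.0, 0.0],
])


def select_genotypes(bed, variants, samples):
    """Hypothetical helper: pick rows by variants.i, then columns by samples.i."""
    rows = variants["i"].to_numpy()
    cols = samples["i"].to_numpy()
    # Index one axis at a time so the result is the rows-by-columns block.
    return bed[rows, :][:, cols]


chrom2 = bim.query("chrom == '2'")
subset = fam[fam["iid"].isin(["Sample_1", "Sample_3"])]
X = select_genotypes(bed, chrom2, subset)
print(X)
```

With the array returned by plink.read, the same two-step indexing would presumably be followed by a call to compute(), as in the examples above.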