This is the link to download the imputed genotypes at the approximately 4 million (3,948,884) SNPs fully genotyped by the NIEHS/Perlegen resequencing project across 15 resequenced strains. Each file is compressed with gzip. Once uncompressed, each file contains a header describing each column. The genotypes are represented in the following format.
- All genotyped SNPs (by NIEHS/Perlegen or mouse HapMap) are represented with capital letters (A, C, G, T).
- In the posterior probability file, the imputed alleles of ungenotyped SNPs are represented as the posterior probability of allele 1. The probability of allele 2 is simply 1 - Pr(allele 1).
- In the files containing high-confidence imputed calls, genotypes with posterior probability greater than 0.98 are called. Imputed genotype calls are written in lowercase letters (a, c, g, t) to distinguish them from actual genotypes. Calls with posterior probability less than 0.98 are treated as missing and represented by the letter 'n'.
- In the files containing an imputed call at every genotype, the same representation is used as above, except that no 'n' appears, because every genotype is imputed regardless of the confidence of the call.
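
The conventions above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce the files: the function names `classify_call` and `call_from_posterior` are hypothetical, and the sketch assumes that the 0.98 threshold is applied to the more probable of the two alleles. The column layout is not assumed here; consult the header in each file.

```python
def classify_call(token: str) -> str:
    """Label one genotype field using the letter-case conventions above."""
    if token in ("A", "C", "G", "T"):
        return "genotyped"   # actual genotype (NIEHS/Perlegen or mouse HapMap)
    if token in ("a", "c", "g", "t"):
        return "imputed"     # imputed call, written in lowercase
    if token == "n":
        return "missing"     # no call in the high-confidence files
    try:
        p = float(token)
    except ValueError:
        return "unknown"
    # Posterior probability files store Pr(allele 1) as a number in [0, 1].
    return "posterior" if 0.0 <= p <= 1.0 else "unknown"


def call_from_posterior(p_allele1, allele1, allele2, threshold=0.98, force=False):
    """Turn Pr(allele 1) into a lowercase imputed call, or 'n' if not confident.

    Assumption: the threshold is compared against the more probable allele.
    With force=True every genotype is called, as in the "every genotype" files.
    """
    p_allele2 = 1.0 - p_allele1          # Pr(allele 2) = 1 - Pr(allele 1)
    if p_allele1 >= p_allele2:
        best, p_best = allele1, p_allele1
    else:
        best, p_best = allele2, p_allele2
    if force or p_best > threshold:
        return best.lower()
    return "n"
```

For example, under these assumptions `call_from_posterior(0.995, "A", "G")` gives `'a'`, while `call_from_posterior(0.6, "A", "G")` gives `'n'` unless `force=True` is passed, in which case it gives `'a'`.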