Prepare input

The function polyOrigin takes two text files as input: genofile and pedfile. Both files are expected in CSV format; set the option delimchar to use a different delimiter.

Input genofile

The genofile stores the genetic map, parent genotypes, and offspring genotypes. A zipped example genofile is available for download. For example, a genofile looks like:

marker  chromosome  position  ind1  ind2  ind3  ind4  ...
snp1    1           0.14      0     2     1     4     ...
snp2    1           0.16      4     0     NA    2     ...
snp3    1           0.21      NA    3     0     1     ...

The genofile must pass the following checklist:

  • Column 1: SNP IDs must be unique.
  • Column 2: SNPs must be grouped by chromosome IDs.
  • Column 3: Positions of markers within a chromosome must be non-decreasing. The position unit must be base pairs for a physical map and centiMorgans for a genetic map.
  • Row 1, Col 4-end: Individual IDs must be unique, and must also appear in the pedfile.
  • Row 2:end, Col 4-end: every parent genotype must use one of formats 1-4 listed below, and every offspring genotype must use one of formats 1-3.
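The structural part of this checklist (columns 1-3 and the header row) can be verified before running polyOrigin. The following is a minimal sketch in Python, not part of PolyOrigin itself; the function name check_genofile and its messages are hypothetical, and it assumes a header row followed by one row per marker with a numeric position in column 3. It does not check the genotype cells or the match against the pedfile.

```python
import csv
import io


def check_genofile(text, delimiter=","):
    """Return a list of problems found in the header and columns 1-3."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], rows[1:]
    problems = []

    # Row 1, columns 4-end: individual IDs must be unique.
    individuals = header[3:]
    if len(set(individuals)) != len(individuals):
        problems.append("duplicate individual IDs in header")

    # Column 1: SNP IDs must be unique.
    snp_ids = [row[0] for row in body]
    if len(set(snp_ids)) != len(snp_ids):
        problems.append("duplicate SNP IDs")

    # Column 2: rows of one chromosome must be contiguous.
    # Column 3: positions must be non-decreasing within a chromosome.
    finished = set()
    current, last_pos = None, None
    for row in body:
        chrom, pos = row[1], float(row[2])
        if chrom != current:
            if chrom in finished:
                problems.append(f"chromosome {chrom} is not grouped")
            if current is not None:
                finished.add(current)
            current, last_pos = chrom, None
        if last_pos is not None and pos < last_pos:
            problems.append(f"position decreases at {row[0]}")
        last_pos = pos
    return problems


# The example genofile from this section, written as CSV.
example = """marker,chromosome,position,ind1,ind2,ind3,ind4
snp1,1,0.14,0,2,1,4
snp2,1,0.16,4,0,NA,2
snp3,1,0.21,NA,3,0,1
"""
print(check_genofile(example))  # prints []
```

An empty list means that none of the checked rules is violated; each string in a non-empty list names one violation.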

The four possible genotype formats are:

  1. dosage: an integer from 0 to ploidy (0, 1, ..., ploidy), or NA for a missing dosage.
  2. readcount: c1|c2, where c1 and c2 are the numbers of reads for alleles 1 and 2, respectively. Missing genotypes are denoted by 0|0.
  3. probability: p(0)|p(1)|...|p(ploidy), where p(i) denotes the probability of the observed data given dosage i, for i = 0, ..., ploidy; the probabilities are normalized so that their sum is 1.
  4. phasedgeno: g1|g2|...|g(ploidy), where g(i) = 1 or 2 for i = 1, ..., ploidy.
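To make the four formats concrete, the sketch below guesses the format of a single genotype cell. It is written in Python and is not PolyOrigin code: the function name genotype_format is hypothetical, and the tolerance of 1e-6 on the probability sum is an assumption. For ploidy 4 the formats have 1, 2, 5, and 4 fields respectively, so they cannot be confused; for other ploidy levels the field counts may collide and this simple rule would not be enough.

```python
def genotype_format(value, ploidy=4):
    """Guess which of the four formats a genotype string uses, or None."""
    if value == "NA":
        return "dosage"  # missing dosage
    fields = value.split("|")
    if len(fields) == 1:
        # Format 1: a single integer between 0 and ploidy.
        return "dosage" if value.isdecimal() and int(value) <= ploidy else None
    if len(fields) == 2 and all(f.isdecimal() for f in fields):
        # Format 2: read counts c1|c2 (0|0 denotes a missing genotype).
        return "readcount"
    if len(fields) == ploidy + 1:
        # Format 3: ploidy + 1 non-negative probabilities that sum to 1.
        try:
            probs = [float(f) for f in fields]
        except ValueError:
            return None
        # The 1e-6 tolerance is an assumption, not a documented threshold.
        if all(p >= 0 for p in probs) and abs(sum(probs) - 1.0) < 1e-6:
            return "probability"
        return None
    if len(fields) == ploidy and all(f in ("1", "2") for f in fields):
        # Format 4: one allele label (1 or 2) per homolog.
        return "phasedgeno"
    return None


for cell in ["3", "NA", "12|7", "0.1|0.2|0.4|0.2|0.1", "1|2|2|1"]:
    print(cell, "->", genotype_format(cell))
```

A return value of None flags a cell that matches none of the four formats, for example a dosage larger than the ploidy.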

Input pedfile

The pedfile stores pedigree information. A zipped example pedfile is available for download. For example, a pedfile looks like:

individual  population  motherid  fatherid  ploidy
P1          0           0         0         4
P2          0           0         0         4
P3          0           0         0         4
offspring1  pop1        P1        P2        4
offspring2  pop1        P1        P2        4
offspring3  pop2        P1        P3        4
offspring4  pop2        P1        P3        4
offspring5  pop3        P2        P3        4
offspring6  pop4        P3        P3        4

The pedigree contains three founders (parents), two offspring from the cross between parents 1 and 2, two offspring from the cross between parents 1 and 3, one offspring from the cross between parents 2 and 3, and one offspring from the selfing of parent 3.
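The summary above can be derived mechanically from the mother and father columns. The sketch below, in Python and independent of PolyOrigin, transcribes the example pedigree with founders P1, P2, and P3 and counts the offspring per parent pair; the helper name summarize is hypothetical. A pair with identical mother and father is a selfing.

```python
from collections import Counter

# (individual, population, motherid, fatherid, ploidy)
pedigree = [
    ("P1", "0", "0", "0", 4),
    ("P2", "0", "0", "0", 4),
    ("P3", "0", "0", "0", 4),
    ("offspring1", "pop1", "P1", "P2", 4),
    ("offspring2", "pop1", "P1", "P2", 4),
    ("offspring3", "pop2", "P1", "P3", 4),
    ("offspring4", "pop2", "P1", "P3", 4),
    ("offspring5", "pop3", "P2", "P3", 4),
    ("offspring6", "pop4", "P3", "P3", 4),
]


def summarize(pedigree):
    """Return the founder IDs and the number of offspring per parent pair."""
    founders = [
        ind for ind, _, mother, father, _ in pedigree
        if mother == "0" and father == "0"
    ]
    crosses = Counter(
        (mother, father) for _, _, mother, father, _ in pedigree
        if (mother, father) != ("0", "0")
    )
    return founders, crosses


founders, crosses = summarize(pedigree)
print("founders:", founders)
for (mother, father), n in crosses.items():
    kind = "selfing" if mother == father else "cross"
    print(f"{kind} {mother} x {father}: {n} offspring")
```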

The pedfile must pass the following checklist:

  • Column 1: Individual IDs must be unique, and must also appear in the genofile.
  • Column 2: Each sub-population (F1 cross or selfing) must have a unique ID. All founders must share the same sub-population ID.
  • Columns 3-4: The mother and father IDs of founders must be 0.
  • Column 5: Ploidy must be 4. Support for ploidy n = 2, 4, 6, and 8 is still a TODO.
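Part of this checklist can be automated before running polyOrigin. The sketch below is a standalone Python illustration, not PolyOrigin code; the function name check_pedfile and its messages are hypothetical. It treats a row whose mother and father IDs are both 0 as a founder, and it reads the column 2 rule as "one population ID per parent pair", which is an interpretation rather than something the checklist states explicitly. The comparison against the genofile is left out.

```python
import csv
import io


def check_pedfile(text, delimiter=","):
    """Return a list of problems found in a pedfile given as a string."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[1:]  # skip header
    problems = []

    # Column 1: individual IDs must be unique.
    ids = [row[0] for row in rows]
    if len(set(ids)) != len(ids):
        problems.append("duplicate individual IDs")

    founder_pops = set()
    pair_of_pop = {}
    for ind, pop, mother, father, ploidy in rows:
        if mother == "0" and father == "0":
            founder_pops.add(pop)
        # Assumed reading of column 2: one population ID per parent pair.
        elif pair_of_pop.setdefault(pop, (mother, father)) != (mother, father):
            problems.append(f"population {pop} mixes different crosses")
        # Column 5: only ploidy 4 is accepted.
        if ploidy != "4":
            problems.append(f"{ind}: ploidy must be 4")

    # Column 2: all founders must share one population ID.
    if len(founder_pops) > 1:
        problems.append("founders do not share one population ID")
    return problems


# The example pedfile from this section, written as CSV.
pedfile_text = """individual,population,motherid,fatherid,ploidy
P1,0,0,0,4
P2,0,0,0,4
P3,0,0,0,4
offspring1,pop1,P1,P2,4
offspring2,pop1,P1,P2,4
offspring3,pop2,P1,P3,4
offspring4,pop2,P1,P3,4
offspring5,pop3,P2,P3,4
offspring6,pop4,P3,P3,4
"""
print(check_pedfile(pedfile_text))  # prints []
```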