Run polyOrigin
Run function polyOrigin
for haplotype reconstruction. Here is a simulated example of a tetraploid population, consists of four subpopulations: three crosses among 3 parents and one selfing of the first parent. The genofile contains genetic map with 120 SNPs and dosage data for 3 parents and 120 offspring.
using PolyOrigin
workdir = joinpath(pkgdir(PolyOrigin),"docs","run_polyOrigin")
genofile = "geno_disturbed.csv"
pedfile = "ped.csv"
outstem = "outstem"
polyancestry = polyOrigin(genofile,pedfile;
chrpairing=44,
isphysmap = false,
refinemap=true,
refineorder=true,
workdir,
outstem,
)
filter(x->occursin(outstem,x), readdir(workdir))
8-element Vector{String}: "outstem.log" "outstem_genoprob.csv" "outstem_maprefined.csv" "outstem_parentphased.csv" "outstem_parentphased_corrected.csv" "outstem_polyancestry.csv" "outstem_postdoseprob.csv" "outstem_valentsummary.csv"
Here chrpairing=44
denotes bivalent and quadrivalent formations in ancestral inference. isphysmap
specifies if the input map is a physical map, refinemap
specifies if refine the input map and refineorder
specifies further if refine only inter-marker distances (refinemap
=true and refineorder
=false), workdir
specifies the directory of input and output files, and outstem
specifies the filename stem for output files.
The polyOrigin function produces the following outputfiles:
outstem.log
: log file saves messages that are printed on console.outstem_maprefined.csv
: same as the input genofile, except that input marker map is refined.outstem_parentphased.csv
: same as the input genofile, except that parental genotypes are phased.outstem_parentphased_corrected.csv
: the phased parent genotypes are corrected.outstem_polyancestry.csv
: saves the returned variable polyancestry. See alsopolyOrigin
for more details.outstem_genoprob.csv
: a concise version of the above file, including genetic map, phased parental genotypes, and posterior origin-genotype probabilities.outstem_postdoseprob.csv
: same as the input genofile, except that parent genotypes are phased and offspring genotypes are given by the posterior dose probabilities.
Here outstem_maprefined.csv
, outstem_parentphased.csv
, outstem_parentphased_corrected.csv
, or outstem_postdoseprob.csv
may be iteratively used as the input genofile.
Visualize conditional probability
The outstem_polyancestry.csv
can be read back by the function readPolyAncestry
, and ancestral conditional probability can be visualized by plotCondprob
or animCondprob
.
polyancestry = readPolyAncestry("outstem_polyancestry.csv",workdir=workdir)
truegeno = readTruegeno!("true.csv",polyancestry,workdir=workdir)
using Plots
animCondprob(polyancestry; truegeno=truegeno,
left_margin = 8Plots.mm, bottom_margin = 5Plots.mm)
Evaluate estimated map
Compare input map and estimated map with true map
polygeno=readPolyGeno(genofile,pedfile,workdir=workdir)
fig = plotMapComp(truegeno.truemap,polygeno.markermap,
xlabel="True position (cM)",
ylabel="Input position (cM)")
fig2 = plotMapComp(truegeno.truemap, polyancestry.markermap,
xlabel="True position (cM)",
ylabel="Estimated position (cM)")
using Plots
plot(fig,fig2,left_margin = 8Plots.mm, bottom_margin = 5Plots.mm)
where tau is the Kendall rank correlation between true map and comparing map.