Meta-predictor of disease causing variants
Meta-SNP has been trained and tested using a 20-fold cross-validation
procedure on a set of 35,766 variations from 8,667 proteins (SV-2009)
extracted from the Swiss-Var database (Oct. 2009).
The SV-2009 dataset is composed of 17,883 disease-related mutations
and the same number of randomly selected polymorphisms.
In the cross-validation procedure, proteins are clustered using the
blastclust algorithm from the BLAST package, and all the variations
belonging to the same cluster of similar sequences are kept in the same set.
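The cluster-aware split described above can be illustrated with a minimal sketch. It assumes that each variation has already been labelled with the blastclust cluster of its protein. The function name assign_folds, the input dictionary format, and the greedy size-balancing rule are illustrative assumptions; the text above only states that variations from the same cluster stay in the same set, not how clusters are distributed among the sets.

```python
from collections import defaultdict


def assign_folds(variant_clusters, n_folds=20):
    """Map each variant ID to a fold so that no cluster is split across folds.

    variant_clusters: dict of variant ID -> cluster ID (hypothetical format).
    """
    # Group variant IDs by the cluster their protein belongs to.
    members = defaultdict(list)
    for variant, cluster in variant_clusters.items():
        members[cluster].append(variant)

    fold_sizes = [0] * n_folds
    fold_of = {}
    # Assumed balancing rule: place the largest clusters first,
    # each into the fold that currently holds the fewest variants.
    for cluster in sorted(members, key=lambda c: (-len(members[c]), str(c))):
        target = fold_sizes.index(min(fold_sizes))
        for variant in members[cluster]:
            fold_of[variant] = target
        fold_sizes[target] += len(members[cluster])
    return fold_of


# Toy example with made-up variant and cluster identifiers.
toy = {"v1": "c1", "v2": "c1", "v3": "c2", "v4": "c3", "v5": "c3", "v6": "c3"}
folds = assign_folds(toy, n_folds=2)
```

Because whole clusters are assigned at once, similar sequences never appear in both the training and the test set of a given round, which is the purpose of the clustering step.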
The SV-2009 dataset can be downloaded from this
An additional dataset (NSV-2012), composed of 972 variants from 577
proteins that were newly annotated in a more recent version of Swiss-Var
(Feb. 2012), has been used to test Meta-SNP.
The list of NSV-2012 variations is available