Publications
Christmas et al. Evolutionary constraint and innovation across hundreds of placental mammals
https://www.science.org/doi/10.1126/science.abn3943
- Scripts are archived at https://zenodo.org/badge/latestdoi/428298370.
- The Cactus alignment and constraint scores are available at https://cglgenomics.ucsc.edu/data/cactus/ and at https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=cons241way.
- For protein-coding sequence alignments, see Kirilenko et al., below
- Information regarding genome assemblies and specimen biosamples is provided in (4) and at https://zoonomiaproject.org/
Sullivan et al. Leveraging base-pair mammalian constraint to understand genetic variation and human disease
https://www.science.org/doi/10.1126/science.abn2937
- Scripts for PhyloP and PhastCons constraint score calculation are available at https://github.com/michaeldong1/ZOONOMIA.git.
- Code for the benchmarking analyses of phyloP deleteriousness and for a case example is available at https://github.com/teone182/Zoonomia_Scripts.
- Scripts for analyses using TOGA are available at https://github.com/GMHughes/ZoonomiaScripts.
- ldsc software and annotations are available at http://www.github.com/bulik/ldsc and https://alkesgroup.broadinstitute.org/LDSCORE/
- GWAS summary statistics used in ldsc analyses are available at https://alkesgroup.broadinstitute.org/cS2G/sumstats_63/ and https://data.broadinstitute.org/alkes-group/UKBB/UKBB_409K
- S-LDXR software is available at https://alkesgroup.broadinstitute.org/S-LDXR/
- Code to perform Polyfun analyses is available at https://github.com/pfenninglab/Zoonomia_flagship2_fine-mapping.
- Polyfun annotations for fine-mapping are available at https://kilthub.cmu.edu/account/articles/19380533.
Andrews et al. Mammalian evolution of human cis-regulatory elements and transcription factor binding sites
https://www.science.org/doi/10.1126/science.abn7930
- All data and code are deposited at Zenodo (doi:10.5281/ZENODO.7149205).
Foley et al. A genomic timescale for placental mammal evolution
https://www.science.org/doi/10.1126/science.abl8189
- All datasets used in this analysis are available where indicated in the text.
- Scripts written as part of this study are available at https://github.com/VCMason/Foley2021 and also archived at https://doi.org/10.5281/zenodo.5793715.
- The HAL alignment is publicly available at https://cglgenomics.ucsc.edu/data/cactus/. Information regarding genome assemblies and specimen biosamples is provided in Ref. 1 and can be accessed at https://zoonomiaproject.org/the-data/.
- Human-referenced PhyloP scores are publicly available at http://genome.ucsc.edu/cgi-bin/hgGateway?genome=Homo_sapiens&hubUrl=http://cgl.gi.ucsc.edu/data/cactus/241-mammalian-2020v2-hub/hub.txt.
- All other data, including alignments, phylogenies, and Excel versions of the Supplementary Tables, are available at the following repository: 10.5281/zenodo.5823345.
Kaplow et al. Relating enhancer genetic variation across mammals to complex phenotypes using machine learning
https://www.science.org/doi/10.1126/science.abm7993
- Publicly available ATAC-seq data was obtained from Gene Expression Omnibus accessions GSE161374, GSE146897, GSE137311, and GSE159815; China National GeneBank accession CNP0000198; and ArrayExpress accession E-MTAB-2633.
- Unpublished ATAC-seq data generated by the Pfenning Lab can be found at GSE187366.
- Tree used for the phenotype association pipeline can be obtained by contacting the Zoonomia Consortium and will be released prior to publication.
- Publicly available genomes and annotations were downloaded from NCBI.
- Publicly available human Hi-C data was accessed at http://hugin2.genetics.unc.edu/Project/hugin/.
- Mouse cortex Dip-C data was downloaded from Gene Expression Omnibus accession GSE146397.
- Motif discovery results and machine learning models can be found at http://daphne.compbio.cs.cmu.edu/files/ikaplow/TACITSupplement/.
- Machine learning model predictions can be obtained from the UCSC Genome Browser (http://courtyard.gi.ucsc.edu/data/cactus/241-mammalian-2020v2-hub/hub.txt).
- New code for this work can be found at doi.org/10.5281/zenodo.7358830.
Keough et al. Three-dimensional genome re-wiring in loci with human accelerated regions
https://www.science.org/doi/10.1126/science.abm1696
- The Zoonomia data are available at https://zoonomiaproject.org/the-project/.
- The Nextflow pipeline to identify lineage-specific accelerated regions is available at https://github.com/keoughkath/AcceleratedRegionsNF.
- The Hi-C data are available at GSE183137.
- All other data are available in the main text or the supplementary materials.
Kirilenko et al. Integrating gene annotation with orthology inference at scale
https://www.science.org/doi/10.1126/science.abn3107
- The protein-coding sequence alignments are available at http://genome.senckenberg.de/download/TOGA/
- The source code used for this study and all scripts to run TOGA and to create training and test data sets and browser tracks are permanently archived at Zenodo: https://zenodo.org/record/640067
- Further code development will be tracked on https://github.com/hillerlab/TOGA.
- We recommend generating alignment chains with our pipeline (https://github.com/hillerlab/make_lastz_chains).
- All data are available in the manuscript or the supplementary material, and available for download at http://genome.senckenberg.de/download/TOGA/ and for browsing in our UCSC genome browser mirror at https://genome.senckenberg.de.
Moon et al. Comparative genomics of Balto, a famous historic dog, captures lost diversity of 1920s sled dogs
https://www.science.org/doi/10.1126/science.abn5887
- Raw sequencing reads for Balto and Alaskan sled dogs have been deposited to the NCBI Sequence Read Archive under BioProject accession PRJNA786530.
Osmanski et al. Insights into mammalian TE diversity via the curation of 248 mammalian genome assemblies
https://www.science.org/doi/10.1126/science.abn1430
- All assemblies are available in GenBank; TE consensus sequences are available via the Dfam database.
- All other data is available in the supplementary materials; code used in the analysis is available at zenodo.org/badge/latestdoi/431231925
Wilder et al. The contribution of historical processes to contemporary extinction risk in placental mammals
https://www.science.org/doi/10.1126/science.abn5856
- The data presented in this paper are detailed in supplementary materials.
- Summary data and analysis scripts are available at https://github.com/LaMariposa/zoonomia_biodiversity.
- NCBI accession numbers for sequence data used in analyses are given in table S1.
Xue et al. The functional and evolutionary impacts of human-specific deletions in conserved elements
https://www.science.org/doi/10.1126/science.abn2253
- Oligo libraries used in this study are available upon request. CRISPR-modified SK-N-SH for the LOXL2-associated hCONDEL-edited cell lines are available upon request.
- All additional unique/stable reagents generated in this study are available without restriction, or with a Materials Transfer Agreement.