Zachary Winger


Week 1

1. GISAID Initiative

GISAID, the Global Initiative on Sharing All Influenza Data, is a partnership between the initiative’s administrative arm, Freunde of GISAID (a non-profit), and the governments of Germany, Singapore, and the USA. The goal of GISAID is to provide a free, open-access platform where people all over the world can share influenza virus sequences, related data associated with human viruses, and data associated with avian and other animal viruses. The initiative grants this access to everyone, provided that users identify themselves and give proper credit to the source(s) of all data they use.

2. Folding@Home

Folding@Home is a distributed computing project whose purpose is to simulate protein dynamics. The project relies on citizen scientists volunteering to run these simulations on their own computers, which produces more data, faster, for scientists to use. Anyone can download the software and help run the protein simulations. The Folding@Home project aims to understand how proteins behave and to use the simulation results to help develop therapeutics. Since COVID-19 is caused by a virus, and viruses have proteins, Folding@Home is hoping to use these simulations to better understand the virus and find a way to stop it.

3. Nextstrain

Nextstrain is a tool that displays the progression and spread of the coronavirus using data provided through GISAID. Although it is known that the COVID-19 pandemic began in Wuhan, China, in November to December 2019, the exact transmission dates and spread of the virus are unclear. Sustained human-to-human transmission drives the spread of the virus, which would explain the clear genetic relationships among the sampled viruses. The visualization shows only roughly 3000 genomes in a single view, but hundreds more complete genomes become available daily.

4. Microbiology Resource Announcements

A complete genome of SARS-CoV-2 was obtained from a Nepalese patient. The patient acquired the infection in Wuhan, China, and traveled with it to Nepal. The patient was a 32-year-old male student at Wuhan University of Technology in Wuhan, China, who returned to Nepal with a cough, mild fever, and throat congestion. The sequencing was done using the Illumina MiSeq system, and the genome was assembled with the Burrows-Wheeler Aligner MEM algorithm (BWA-MEM, version 0.7.5a-r405). The sequence was deposited in GenBank and in the GISAID database.

5. Galaxy Project

The Galaxy Project’s goal is to provide public access to infrastructure and workflows for analyzing COVID-19 data. They focus on three types of data analysis: Genomics, Evolution, and Cheminformatics. The Genomics section discusses the sequencing of the virus. There are over 1,000 complete genomes on GISAID, but only a handful of raw sequencing read datasets. Galaxy Project has found that 397 sites show intra-host variation across 33 samples, and that 29 samples have fixed differences at 39 sites relative to the published reference. The Evolution section discusses analyzing which portions of the SARS-CoV-2 genome may be subject to positive or negative selection. Using data from GISAID, Galaxy Project applies comparative evolutionary techniques to identify potential candidates. So far, they have found about 5 genomic positions that may merit further investigation. The Cheminformatics section involves analyzing nonstructural proteins of SARS-CoV-2. Galaxy Project used protein-ligand docking to identify potentially inhibitory compounds that could be used to control viral proliferation. These compounds were chosen based on recently published X-ray crystal structures, with 500 high-scoring compounds being identified.
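To make the idea of "fixed differences from the published reference" concrete, here is a minimal Python sketch that lists the positions where an aligned sample differs from a reference. It is only an illustration and not Galaxy Project's actual workflow: the function name `differing_sites`, the toy sequences, and the choice to skip gaps and `N` bases are all assumptions made for this example.

```python
def differing_sites(reference, sample):
    """Return 1-based positions where sample differs from reference.

    Assumes both sequences are already aligned to the same length.
    Positions with a gap ('-') or an ambiguous base ('N') in either
    sequence are skipped rather than counted as differences.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return [
        i + 1
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s and r not in skip and s not in skip
    ]


# Toy data (hypothetical, far shorter than a real ~30 kb genome).
reference = "ATGGCTAGCTTACG"
samples = {
    "sample_A": "ATGGCTAGCTTACG",
    "sample_B": "ATGACTAGCTTGCG",
    "sample_C": "ATGGCTNGCTTAC-",
}

for name, seq in samples.items():
    print(name, differing_sites(reference, seq))
```

On real data the sequences would first need to be aligned to the reference, which this sketch does not do.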

Week 2

For week 2, I decided to create some phylogenetic trees using complete genome sequences of SARS-CoV-2. My analysis for week 2 can be found here.
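As a rough illustration of one common starting point for a distance-based tree, the Python sketch below computes pairwise p-distances (the fraction of compared sites that differ) between aligned sequences. It does not reproduce the linked analysis: the function names, the toy sequences, and the handling of gaps and `N` bases are assumptions made for this example, and a real tree would need a proper alignment and a tree-building method on top of these distances.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of compared sites at which two aligned sequences differ.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence
    are excluded from the comparison.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in skip or y in skip:
            continue
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches / compared if compared else 0.0


def distance_table(seqs):
    """Map each unordered pair of sequence names to its p-distance."""
    return {
        (m, n): p_distance(seqs[m], seqs[n])
        for m, n in combinations(sorted(seqs), 2)
    }


# Toy aligned sequences (hypothetical).
toy = {
    "genome_1": "ATGGCTAGCT",
    "genome_2": "ATGGCTAGCA",
    "genome_3": "ATGACTTGCA",
}

table = distance_table(toy)
closest = min(table, key=table.get)  # pair with the smallest distance
print(table)
print("closest pair:", closest)
```

The closest pair is the one a simple clustering method would join first when building a tree from these distances.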

Week 3

For week 3, I wanted to expand further on last week’s analysis and try to figure out the country of origin for SARS-CoV-2. My analysis for week 3 can be found here.