About SARS-CoV-2 (2019nCoV, novel Coronavirus) Envelope protein (Coronavirus E protein)
1. SARS-CoV-2 (2019nCoV) E protein: E protein is the smallest of the major structural proteins. It has an N-terminal ectodomain and a C-terminal endodomain with ion channel activity. During the replication cycle, E protein is abundantly expressed inside the infected cell, but only a small portion is incorporated into the virus envelope. The majority of the protein participates in viral assembly and budding. E protein is important in virus production and maturation. Recombinant CoVs without E have been shown to exhibit significantly reduced viral titres, crippled viral maturation, or yield incompetent progeny.
About SARS-CoV-2 (2019nCoV, novel Coronavirus) membrane protein (Coronavirus M protein)
1. SARS-CoV-2 (2019nCoV) M protein: Coronavirus M protein, which contains three transmembrane domains, is believed to define the shape of the viral envelope. It has a small N-terminal glycosylated ectodomain and a much larger C-terminal endodomain that extends 6–8 nm into the viral particle. M protein usually exists as a dimer and may adopt two different conformations, allowing it to promote membrane curvature as well as bind to the nucleocapsid.
About SARS-CoV-2 (2019nCoV, novel Coronavirus) Nucleocapsid protein (Coronavirus N protein)
1. SARS-CoV-2 (2019nCoV) N protein: Coronavirus N protein is required for coronavirus RNA synthesis and has RNA chaperone activity that may be involved in template switching. Nucleocapsid protein is the most abundant protein of coronavirus. N protein packages the positive-strand viral genomic RNA into a helical ribonucleocapsid (RNP) and plays a fundamental role during virion assembly through its interactions with the viral genome and membrane protein M. It also plays an important role in enhancing the efficiency of subgenomic viral RNA transcription as well as viral replication. Because the N protein sequence is conserved and strongly immunogenic, the N protein of coronavirus is often chosen as a diagnostic tool.
About SARS-CoV-2 (2019nCoV, novel Coronavirus) Non-structural proteins (Nsp1-Nsp16)
2019nCoV contains 16 non-structural proteins (Nsp1-Nsp16) that may be druggable targets for the discovery of antiviral compounds against COVID-19 [1].
Non-structural protein | Start position (aa) | End position (aa) | Length (aa) |
nsp1 (leader protein) | 1 | 180 | 180 |
nsp2 | 181 | 818 | 638 |
nsp3 (Papain-Like proteinase, PLpro) | 819 | 2763 | 1945 |
nsp4 | 2764 | 3263 | 500 |
nsp5 (Mpro, Main proteinase, 3C-like proteinase) | 3264 | 3569 | 306 |
nsp6 | 3570 | 3859 | 290 |
nsp7 | 3860 | 3942 | 83 |
nsp8 | 3943 | 4140 | 198 |
nsp9 | 4141 | 4253 | 113 |
nsp10 (growth-factor-like protein) | 4254 | 4392 | 139 |
nsp12 (RdRp, RNA-dependent RNA polymerase) | 4393 | 5324 | 932 |
nsp13 (RNA 5′-triphosphatase) | 5325 | 5925 | 601 |
nsp14 (3′-to-5′ exonuclease) | 5926 | 6452 | 527 |
nsp15 (endoRNAse) | 6453 | 6798 | 346 |
nsp16 (2’O-MTase, 2′-O-ribose methyltransferase) | 6799 | 7096 | 298 |
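The positions in the table can be cross-checked with a few lines of Python. The sketch below is only an illustration: the names `NSP_TABLE` and `check_table` are made up for this example, and it assumes the positions are 1-based and inclusive, so that length = end - start + 1 and each protein starts one residue after the previous one ends.

```python
# Rows copied from the table above: (name, start, end, stated length).
# NSP_TABLE and check_table are hypothetical names used only in this sketch.
NSP_TABLE = [
    ("nsp1", 1, 180, 180),
    ("nsp2", 181, 818, 638),
    ("nsp3", 819, 2763, 1945),
    ("nsp4", 2764, 3263, 500),
    ("nsp5", 3264, 3569, 306),
    ("nsp6", 3570, 3859, 290),
    ("nsp7", 3860, 3942, 83),
    ("nsp8", 3943, 4140, 198),
    ("nsp9", 4141, 4253, 113),
    ("nsp10", 4254, 4392, 139),
    ("nsp12", 4393, 5324, 932),
    ("nsp13", 5325, 5925, 601),
    ("nsp14", 5926, 6452, 527),
    ("nsp15", 6453, 6798, 346),
    ("nsp16", 6799, 7096, 298),
]


def check_table(rows):
    """Return a list of problems; an empty list means the rows are consistent."""
    problems = []
    for i, (name, start, end, length) in enumerate(rows):
        # Assumption: coordinates are 1-based and inclusive.
        if end - start + 1 != length:
            problems.append(f"{name}: stated length {length}, computed {end - start + 1}")
        # Assumption: listed proteins tile the polyprotein without gaps or overlaps.
        if i > 0 and start != rows[i - 1][2] + 1:
            problems.append(f"{name}: does not start right after {rows[i - 1][0]}")
    return problems


if __name__ == "__main__":
    issues = check_table(NSP_TABLE)
    print("\n".join(issues) if issues else "table is internally consistent")
```

Under these assumptions the listed rows are consistent with each other, and the last end position (7096) equals the orf1ab protein length given in the genome table further down.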
1. Nsp3: Nsp3 (200 kDa) is the largest protein encoded by the coronavirus (CoV) genome and an essential component of the replication and transcription complex. It comprises various domains, the organization of which differs between CoV genera because some domains are duplicated or absent. However, the N-terminal region of Nsp3 is highly conserved among CoVs, containing a ubiquitin-like (Ubl) globular fold followed by a flexible, extended acidic domain (AC domain) rich in glutamic acid (38%). Next to the AC domain is a catalytically active ADP-ribose-1″-phosphatase (ADRP, Appr-1″-pase) domain (also called the macro domain or X domain), thought to play a role during synthesis of viral subgenomic RNAs. The SARS Unique Domain (SUD), a domain not yet identified in other coronaviruses from alphacoronavirus and betacoronavirus, follows next. The SUD binds oligonucleotides known to form G-quadruplexes. Downstream of the SUD are a second Ubl domain and the catalytically active PLpro domain, which proteolytically processes the Nsp1/2, Nsp2/3 and Nsp3/4 cleavage sites. Downstream of PLpro are a nucleic acid-binding domain (NAB) with a nucleic acid chaperone function, which is conserved in betacoronavirus and gammacoronavirus, and one uncharacterized domain termed the marker domain (G2M). Following the G2M are two predicted double-pass transmembrane domains (TM1–2 and TM3–4), a putative metal-binding region (ZN) and the Y domain of unknown function (subdomains Y1–3).
2. Nsp5: Nsp5 protease (3CLpro; Mpro) mediates processing at 11 distinct cleavage sites, including its own autoproteolysis, and is essential for virus replication. Nsp5 exhibits a conserved three-domain structure and catalytic residues.
3. Nsp10: Nsp10 (18 kDa) is well conserved among coronaviruses and is encoded by ORF1a. It is thought to serve as an important multifunctional cofactor in replication. Nsp10 was shown to interact with itself, as well as with Nsp1, Nsp7, Nsp14, and Nsp16. Nsp10 also plays an important role in RNA synthesis: a murine hepatitis virus (MHV) temperature-sensitive mutant carrying a non-synonymous mutation in the Nsp10 coding sequence had a defect in minus-strand RNA synthesis at non-permissive temperatures.
4. Nsp12: Nsp12 (102 kDa) is a multidomain RNA polymerase and the most conserved protein in coronaviruses. It contains an RNA-dependent RNA polymerase (RdRp) domain at its C-terminus, which is essential for viral replication and transcription.
5. Nsp16: Nsp16 is a SAM-dependent nucleoside-2’O-methyltransferase (2’O-MTase). The mRNA cap for coronaviruses is completed by Nsp16, which ensures formation of a protective cap-1 structure that prevents recognition by either MDA5 or IFIT proteins. The Nsp16/Nsp10 complex thus completes the coronavirus capping process, permitting viral infection with reduced host recognition.
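The protease assignments in notes 1 and 2 can be lined up against the position table with a short Python sketch. It is a rough illustration, not an authoritative annotation: `NSP_ENDS`, `PLPRO_JUNCTIONS` and `junctions` are hypothetical names, the three PLpro sites are taken from note 1, and every other junction between adjacent listed proteins is assumed to be an Nsp5 (Mpro) site. Because the table has no nsp11 row, the nsp10/nsp12 boundary is counted as a single junction.

```python
# (name, last residue) pairs copied from the position table in this document.
# All names below are hypothetical and exist only for this sketch.
NSP_ENDS = [
    ("nsp1", 180), ("nsp2", 818), ("nsp3", 2763), ("nsp4", 3263),
    ("nsp5", 3569), ("nsp6", 3859), ("nsp7", 3942), ("nsp8", 4140),
    ("nsp9", 4253), ("nsp10", 4392), ("nsp12", 5324), ("nsp13", 5925),
    ("nsp14", 6452), ("nsp15", 6798), ("nsp16", 7096),
]

# Junctions that note 1 attributes to the PLpro domain of Nsp3.
PLPRO_JUNCTIONS = {("nsp1", "nsp2"), ("nsp2", "nsp3"), ("nsp3", "nsp4")}


def junctions(ends):
    """List (left, right, last residue of left, protease) for adjacent proteins."""
    result = []
    for (left, end), (right, _) in zip(ends, ends[1:]):
        # Assumption: any junction not assigned to PLpro is processed by Mpro.
        protease = "PLpro" if (left, right) in PLPRO_JUNCTIONS else "Mpro"
        result.append((left, right, end, protease))
    return result


if __name__ == "__main__":
    for left, right, end, protease in junctions(NSP_ENDS):
        print(f"{left}/{right}: cut after residue {end} ({protease})")
```

With these assumptions the 15 listed proteins give 14 junctions, 3 assigned to PLpro and 11 to Mpro, which matches the 11 cleavage sites mentioned for Nsp5.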
About the COVID-19 pandemic, coronavirus and the genome of SARS-CoV-2 (2019nCoV)
The COVID-19 pandemic is caused by infection with 2019nCoV (SARS-CoV-2, a novel coronavirus). The 2019-nCoV genome was annotated to possess 14 ORFs encoding 27 proteins [1].
Gene name of 2019nCoV (SARS-CoV-2, a novel coronavirus) | Coding region (nt) | Protein length (aa) |
orf1a | 266-13483 | 4405 |
orf1ab | 266-13468, 13468-21555 | 7096 |
S | 21563-25384 | 1273 |
3a | 25393-26220 | 275 |
3b | 25814-25882 | 22 |
| 26183-26281 | 32 |
E (envelope protein) | 26245-26472 | 75 |
M (matrix protein) | 26523-27191 | 222 |
p6 | 27202-27387 | 61 |
7a | 27394-27759 | 121 |
7b | 27756-27887 | 43 |
8b | 27894-28259 | 121 |
9b | 28284-28577 | 97 |
N (nucleocapsid) | 28274-29533 | 419 |
orf14 | 28734-28955 | 73 |
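The relationship between the coding-region and protein-length columns can be sketched in Python. The sketch assumes something the table does not state: that each coding region is 1-based, inclusive and ends with a stop codon, so protein length = nucleotides / 3 - 1. The names `ORFS` and `protein_length` are hypothetical. For orf1ab the two listed segments are simply added; they share nucleotide 13468, which is consistent with a -1 ribosomal frameshift at that position.

```python
# Coding regions and stated protein lengths copied from the table above.
# ORFS and protein_length are hypothetical names used only in this sketch.
ORFS = {
    "orf1a": ([(266, 13483)], 4405),
    "orf1ab": ([(266, 13468), (13468, 21555)], 7096),
    "S": ([(21563, 25384)], 1273),
    "3a": ([(25393, 26220)], 275),
    "3b": ([(25814, 25882)], 22),
    "unnamed row (26183-26281)": ([(26183, 26281)], 32),
    "E": ([(26245, 26472)], 75),
    "M": ([(26523, 27191)], 222),
    "p6": ([(27202, 27387)], 61),
    "7a": ([(27394, 27759)], 121),
    "7b": ([(27756, 27887)], 43),
    "8b": ([(27894, 28259)], 121),
    "9b": ([(28284, 28577)], 97),
    "N": ([(28274, 29533)], 419),
    "orf14": ([(28734, 28955)], 73),
}


def protein_length(segments):
    """Protein length in aa for a coding region given as (start, end) segments."""
    # Assumption: coordinates are 1-based and inclusive.
    nucleotides = sum(end - start + 1 for start, end in segments)
    if nucleotides % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    # Assumption: the final codon is a stop codon and is not translated.
    return nucleotides // 3 - 1


if __name__ == "__main__":
    for gene, (segments, stated) in ORFS.items():
        print(f"{gene}: computed {protein_length(segments)} aa, table says {stated} aa")
```

Under these assumptions the computed value agrees with the stated protein length for every row of the table.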
References
1. Wu, A. et al. Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China. Cell Host Microbe, doi:10.1016/j.chom.2020.02.001 (2020).