ORF3c

Gene found in coronaviruses of the subgenus Sarbecovirus
Identifiers
Organism: SARS-CoV-2
Symbol: ORF3c
UniProt: P0DTG1

ORF3c is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It was first identified in the SARS-CoV-2 genome and encodes a 41-amino-acid non-structural protein of unknown function.[1][2][3] It is also present in the SARS-CoV genome, but was not recognized there until the SARS-CoV-2 homolog had been identified.[4]

Nomenclature

There has been significant confusion in the scientific literature over the nomenclature of the SARS-CoV-2 accessory proteins, especially for the several genes that overlap ORF3a.[4] The predicted protein product of the ORF3c gene has at least once been referred to as "3b protein",[5] but it should not be confused with the non-homologous gene ORF3b.[4] It has also been described under the names ORF3h[2] and ORF3a.iORF1.[6] The recommended nomenclature for SARS-CoV-2 uses the term ORF3c for this gene.[4]

Comparative genomics

ORF3c is an overlapping gene whose open reading frame overlaps both ORF3a and ORF3d in the SARS-CoV-2 genome. This potentially represents a rare example of all three possible reading frames of the same sequence region encoding functional proteins.[7][4]
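The idea of one sequence region being read in three frames can be illustrated with a minimal sketch. The function name `translate_frames` and the nine-nucleotide input are hypothetical and chosen only for illustration; the input is a toy string, not a sequence from any viral genome. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frames(seq):
    """Translate a DNA string in each of the three forward reading frames.

    Trailing nucleotides that do not fill a complete codon are ignored.
    Characters other than A, C, G, T raise a KeyError.
    """
    seq = seq.upper()
    frames = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames.append("".join(CODON_TABLE[c] for c in codons))
    return frames


# Hypothetical toy sequence, not taken from any viral genome.
toy = "ATGGCCTGA"
for offset, protein in enumerate(translate_frames(toy)):
    print(f"frame +{offset + 1}: {protein}")
```

Each offset yields a different amino acid string from the same nucleotides, which is why a single mutation in an overlapping region can alter more than one protein at once.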

Bioinformatics analyses of Sarbecovirus sequences suggest that the sequence and length of ORF3c are well conserved, indicating that it is likely to encode a functional protein.[1][2][3] It appears to be subject to purifying selection.[1][7]
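Purifying selection is usually inferred from an excess of synonymous over nonsynonymous changes between related sequences. The sketch below only counts the two kinds of codon difference between a pair of aligned, in-frame sequences; it is not the method used in the cited studies and it is not a dN/dS estimate, which would also normalise by the number of synonymous and nonsynonymous sites. The function name `classify_codon_changes` and both input strings are hypothetical. In an overlapping gene the picture is further complicated, because a change that is synonymous in one frame may be nonsynonymous in another.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both inputs must be gap-free, aligned, in-frame DNA strings of equal
    length, and that length must be a multiple of three.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in-frame and equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue  # identical codons carry no information here
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical toy pair: GCC->GCT keeps alanine, CTG->CAG changes Leu to Gln.
print(classify_codon_changes("ATGGCCCTG", "ATGGCTCAG"))
```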

Properties

Ribosome profiling experiments confirm that the ORF3c gene expresses a protein product.[6] The relatively short 41-residue protein is predicted to contain a transmembrane domain and has features suggestive of a viroporin.[2]

References

  1. Firth AE (October 2020). "A putative new SARS-CoV protein, 3c, encoded in an ORF overlapping ORF3a". The Journal of General Virology. 101 (10): 1085–1089. doi:10.1099/jgv.0.001469. PMC 7660454. PMID 32667280.
  2. Cagliani R, Forni D, Clerici M, Sironi M (September 2020). "Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses". Infection, Genetics and Evolution. 83: 104353. doi:10.1016/j.meegid.2020.104353. PMC 7199688. PMID 32387562.
  3. Jungreis I, Sealfon R, Kellis M (May 2021). "SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes". Nature Communications. 12 (1): 2642. doi:10.1038/s41467-021-22905-7. hdl:1721.1/130581. PMC 8113528. PMID 33976134.
  4. Jungreis I, Nelson CW, Ardern Z, Finkel Y, Krogan NJ, Sato K, et al. (June 2021). "Conflicting and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: A homology-based resolution". Virology. 558: 145–151. doi:10.1016/j.virol.2021.02.013. hdl:1721.1/130363. PMC 7967279. PMID 33774510.
  5. Pavesi A (July 2020). "New insights into the evolutionary features of viral overlapping genes by discriminant analysis". Virology. 546: 51–66. doi:10.1016/j.virol.2020.03.007. PMC 7157939. PMID 32452417.
  6. Finkel Y, Mizrahi O, Nachshon A, Weingarten-Gabbay S, Morgenstern D, Yahalom-Ronen Y, et al. (January 2021). "The coding capacity of SARS-CoV-2". Nature. 589 (7840): 125–130. doi:10.1038/s41586-020-2739-1. PMID 32906143.
  7. Nelson CW, Ardern Z, Goldberg TL, Meng C, Kuo CH, Ludwig C, et al. (October 2020). "Dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic". eLife. 9: e59633. doi:10.7554/eLife.59633. PMC 7655111. PMID 33001029.