C10orf67

Chromosome 10 open reading frame 67 (C10orf67), also known as C10orf115, LINC01552, and bA215C7.4, is an uncharacterized human protein-coding gene. Several studies indicate a possible link between genetic polymorphisms of this gene, along with several other genes, and chronic inflammatory barrier diseases such as Crohn's disease and sarcoidosis.[3][4][5]

C10orf67
Identifiers
Aliases: C10orf67, C10orf115, LINC01552, bA215C7.4, chromosome 10 open reading frame 67
External IDs: HomoloGene: 82326; GeneCards: C10orf67
Gene location (Human)
Chromosome: Chromosome 10 (human)[1]
Band: 10p12.2
Start: 23,202,696 bp[1]
End: 23,344,845 bp[1]
Orthologs (Human; Mouse)
Entrez: 256815; n/a
Ensembl: ENSG00000179133; n/a
UniProt: Q8IYJ2; n/a
RefSeq (mRNA): NM_153714, NM_001351306, NM_001365862, NM_001371909; n/a
RefSeq (protein): NP_714925, NP_001338235, NP_001352791, NP_001358838; n/a
Location (UCSC): Chr 10: 23.2 – 23.34 Mb; n/a
PubMed search: [2]; n/a

Gene

A map of Chromosome 10 with the location of C10orf67 marked in red

The gene spans 142,366 base pairs and is located at the 10p12.2 locus on the minus (-) strand of chromosome 10, which is the sense strand for this gene. It is flanked upstream by the gene ARMC3[6] and downstream by the gene KIAA1217.[7][8] These genes lie approximately 150,000 bp and 350,000 bp from C10orf67, respectively.

This segment depicts approximately 1,700,000 base pairs of chromosome 10. The green lines indicate the start of transcription, while the red diamonds indicate the termination of transcription. C10orf67 is transcribed in the opposite direction to its flanking genes, which lie on the opposite strand (anti-sense relative to C10orf67).
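As a quick arithmetic check on the gene span, the sketch below computes the length of the interval between the start and end coordinates listed in the infobox. The 1-based, inclusive coordinate convention and the helper name `span_bp` are assumptions made for illustration. The result is close to, but not identical with, the 142,366 bp quoted above, which may reflect a different annotation release.

```python
# Coordinates as listed in the infobox (GRCh38, Ensembl release 89).
GENE_START = 23_202_696
GENE_END = 23_344_845

# Span quoted in the text of the Gene section.
QUOTED_SPAN = 142_366


def span_bp(start: int, end: int) -> int:
    """Length in base pairs of a closed interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


ensembl_span = span_bp(GENE_START, GENE_END)
difference = QUOTED_SPAN - ensembl_span

print(f"Span from infobox coordinates: {ensembl_span:,} bp")
print(f"Difference from quoted span:   {difference:,} bp")
```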

Transcript

There are 23 alternatively spliced exons, which encode 13 transcript variants. The primary transcript, only 2,943 bp long, is not well conserved among orthologs; the X2 variant, at 3,417 bp, has far greater identity with orthologous proteins. The X2 transcript variant contains 15 exons and encodes a polypeptide of 551 amino acids.[9][10]
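The relationship between the X2 transcript length and its 551-residue product can be checked with simple codon arithmetic. The sketch below assumes that the 3,417 bp figure refers to the mature mRNA and that the coding sequence ends with a single stop codon; the function name is illustrative only.

```python
CODON_LENGTH_NT = 3


def cds_length_nt(protein_length_aa: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein, optionally counting one stop codon."""
    codons = protein_length_aa + (1 if include_stop else 0)
    return codons * CODON_LENGTH_NT


X2_TRANSCRIPT_BP = 3417   # from the text
X2_PROTEIN_AA = 551       # from the text

coding_nt = cds_length_nt(X2_PROTEIN_AA)
untranslated_nt = X2_TRANSCRIPT_BP - coding_nt

print(f"Coding sequence: {coding_nt} nt")
print(f"Untranslated:    {untranslated_nt} nt "
      f"({untranslated_nt / X2_TRANSCRIPT_BP:.0%} of the transcript)")
```

Under these assumptions, roughly half of the X2 transcript is untranslated sequence (5' and 3' UTRs combined).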

Protein

General properties

Property | Preprotein | Cleaved protein | Mature protein
Amino acid length | 551 | 515 | 515
Isoelectric point | 9.3 | 8.6 | 8.3-8.9*
Molecular weight | 63 kDa | 59 kDa | ~59-61 kDa**

*depending on post-translational modifications (PTMs)

**ranging from no PTMs to all possible PTMs

The isoelectric point of the preprotein (9.3) is well above the average for human proteins (6.81).[11]
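For readers unfamiliar with how an isoelectric point is estimated from sequence, the following is a minimal sketch: it sums Henderson-Hasselbalch charges over the ionizable groups and bisects for the pH at which the net charge is zero. The pKa values are one commonly used approximate set (an assumption; other scales shift the result), and the two peptides are hypothetical toy sequences, not C10orf67.

```python
# Approximate side-chain and terminal pKa values (assumed; several scales exist).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))      # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))     # deprotonated C-terminus
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Bisect for the pH where the net charge crosses zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid    # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2


# Hypothetical toy peptides, for illustration only.
basic_peptide = "AKKRKLA"
acidic_peptide = "ADDEELA"
print(f"basic:  pI = {isoelectric_point(basic_peptide):.2f}")
print(f"acidic: pI = {isoelectric_point(acidic_peptide):.2f}")
```

A lysine- and arginine-rich sequence ends up well above the human average of 6.81, while an aspartate- and glutamate-rich one ends up well below it.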

Predicted tertiary structure of C10orf67, generated with the PHYRE2 server[12] from a protein template covering 74% of the protein sequence with 96% identity.

Structure

The predicted tertiary structure of the protein is marked by long alpha-helices with several coil regions, and by beta strands localized to the end of the protein opposite the N- and C-terminal ends.

Expression

Expression of C10orf67 in various tissues.[13]

C10orf67 is moderately expressed (50-75%) in most tissues in the body.[13] However, an NCBI GEO dataset examining the influence of interleukin-13 (IL-13) on gene expression[14] showed that C10orf67 expression dropped to zero in airway epithelia in the presence of IL-13.

Subcellular localization

The protein contains a mitochondrial signal peptide that localizes it to the mitochondrial matrix.[15] Analysis with subcellular localization software[16][17] confirmed this finding, although some orthologs were also predicted to localize to the nucleus. The high isoelectric point of the human protein provides further evidence for mitochondrial localization, given the high pH of the mitochondrial matrix.

Cleavage sites

After the protein is localized to the mitochondrion, its 36-amino-acid N-terminal signal peptide is cleaved off.[18]
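The lengths in the general properties table follow directly from this cleavage. The sketch below splits a placeholder sequence at residue 36 and adds a rough mass estimate; the sequence is a stand-in of the right length (not the real protein), and the figure of about 110 Da per residue is a common rule of thumb assumed here, not a value from the sources.

```python
SIGNAL_PEPTIDE_LENGTH = 36       # from the text
PREPROTEIN_LENGTH = 551          # from the general properties table
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average (assumption)


def cleave_signal_peptide(sequence: str, signal_length: int) -> tuple[str, str]:
    """Split a preprotein into (signal peptide, cleaved protein)."""
    return sequence[:signal_length], sequence[signal_length:]


def approximate_mass_kda(length_aa: int) -> float:
    """Composition-blind mass estimate from chain length alone."""
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000


# Placeholder of the correct length; the real sequence is not reproduced here.
preprotein = "X" * PREPROTEIN_LENGTH
signal_peptide, cleaved = cleave_signal_peptide(preprotein, SIGNAL_PEPTIDE_LENGTH)

print(f"Cleaved protein length: {len(cleaved)} aa")
print(f"Approximate mass: {approximate_mass_kda(len(cleaved)):.1f} kDa")
```

The length matches the 515 residues listed for the cleaved protein. The mass estimate lands a few kDa below the tabulated 59 kDa, which is unsurprising for an estimate that ignores amino acid composition.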

Phosphorylation

The possible phosphorylation sites of C10orf67. The concentration of possible phosphorylation sites is far greater near the C-terminus of the protein and far lower near the N-terminus, which contains the signal peptide.

There are a number of predicted phosphorylation sites, as well as one experimentally confirmed phosphorylation site at threonine 69.[19] The other phosphorylation sites are summarized in the protein diagram below.

Sumoylation

There are five predicted sumoylation sites within C10orf67. These are summarized by the following table:

No. | Pos. | Group | Score
1 | K461 | NSFHV LKNE MFTRH | 0.91
2 | K401 | MPKKA LKED QAVVE | 0.91
3 | K224 | EVIKE LKEE LDQYK | 0.91
4 | K136 | KFEDR LKEE SLS L | 0.91
5 | K130 | KQLLQ LKFE DRLKE | 0.91
Post-translational modifications of C10orf67. The N-terminus is on the left with the 36 amino acid signal peptide and the C-terminus is on the right.
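All five predicted sites in the sumoylation table share the same four-residue core, which fits the widely cited SUMO consensus of a hydrophobic residue, the target lysine, any residue, and an acidic residue. The sketch below scans for that pattern with a regular expression. The exact hydrophobic residue set is an assumption, and this simple scan is only an illustration, not the scoring method behind the table. The demo peptide is the K130 row from the table, whose first residue is taken to be position 124 based on the position of the central lysine.

```python
import re

# Consensus: hydrophobic residue, lysine, any residue, Asp/Glu.
# The hydrophobic set [ILVMF] is an assumption; prediction tools differ.
MOTIF = r"[ILVMF]K.[DE]"
OVERLAPPING_MOTIF = re.compile(f"(?=({MOTIF}))")  # lookahead allows overlapping hits

# Central four-residue groups from the sumoylation table, keyed by lysine position.
TABLE_CORES = {461: "LKNE", 401: "LKED", 224: "LKEE", 136: "LKEE", 130: "LKFE"}


def find_sumo_motifs(sequence: str, first_position: int = 1) -> list[tuple[int, str]]:
    """Return (lysine position, motif) for every consensus match in the sequence."""
    hits = []
    for match in OVERLAPPING_MOTIF.finditer(sequence.upper()):
        lysine_position = first_position + match.start() + 1  # K follows the hydrophobic residue
        hits.append((lysine_position, match.group(1)))
    return hits


for position, core in TABLE_CORES.items():
    print(f"K{position}: {core} matches consensus: {bool(re.fullmatch(MOTIF, core))}")

# The K130 row of the table, joined into one peptide (residues 124-137).
print(find_sumo_motifs("KQLLQLKFEDRLKE", first_position=124))
```

In the demo peptide the lysine at 136 is not reported because the fragment ends before the fourth motif residue. The K136 row of the table shows the complete motif.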

Homology and evolution

Evolution

C10orf67 has no known paralogs but has many orthologs within eukaryotes, and it retains significant identity with species as distantly related as invertebrates. Selected orthologs are listed below with some identifying information.

Genus and Species | Common Name | Organism Type | Time Since Last Common Ancestor (million years ago) | Accession # (NCBI) | Sequence Length | % Identity | Isoelectric Point (pre-protein)
Homo sapiens | Human | Primate | 0 | XP_016871518 | 551 | 100 | 9.3
Pan troglodytes | Chimpanzee | Primate | 6.65 | XP_009456334 | 573 | 95 | 9.27
Macaca nemestrina | Southern pig-tailed macaque | Primate | 29.44 | XP_011736768 | 572 | 88.1 | 9.17
Bubalus bubalis | Water Buffalo | Mammal | 96 | XP_006080042 | 565 | 56.6 | 6.24
Felis catus | Cat | Mammal | 96 | XP_019689630 | 560 | 55.1 | 7.68
Sus scrofa | Wild Boar | Mammal | 96 | XP_013835714 | 515 | 55 | 6.53
Panthera pardus | Leopard | Mammal | 96 | XP_019316071 | 504 | 53.9 | 6.24
Ovis aries | Sheep | Mammal | 96 | XP_012043724 | 516 | 53.6 | 6.61
Mustela putorius furo | Ferret | Mammal | 96 | XP_012914379 | 566 | 50.8 | 9.34
Castor canadensis | Beaver | Mammal | 90 | XP_020038711 | 617 | 44 | 8.92
Mus musculus | Mouse | Mammal | 90 | NP_081876 | 560 | 43.6 | 5.89
Myotis lucifugus | Little Brown Bat | Mammal | 96 | XP_014316001 | 598 | 38.9 | 6.22
Myotis brandtii | Brandt's bat | Mammal | 96 | XP_014394869 | 639 | 38.3 | 6.7
Elephantulus edwardii | Cape elephant shrew | Mammal | 105 | XP_006887164 | 493 | 37.9 | 5.62
Gallus gallus | Chicken | Bird | 312 | XP_003640687 | 430 | 26.3 | 5.44
Astyanax mexicanus | Mexican Tetra | Fish | 435 | XP_007253068 | 475 | 26.1 | 4.76
Lepisosteus oculatus | Spotted Gar | Fish | 435 | XP_015208957 | 479 | 25.2 | 6.73
Danio rerio | Zebrafish | Fish | 435 | XP_698346 | 461 | 24.5 | 5.93
Salmo salar | Atlantic Salmon | Fish | 435 | XP_013995887 | 455 | 21.6 | 6.18
Amphimedon queenslandica | Reniera | Invertebrate | 951.8 | XP_011402872 | 513 | 24.1 | 7.05
Branchiostoma belcheri | Branchiostoma | Invertebrate | 684 | XP_019645941 | 563 | 23.5 | 6.24

Rate of evolution

The rate of evolution of C10orf67 relative to Fibrinogen and Cytochrome c.

The rate of evolution of C10orf67 was compared to that of fibrinogen and cytochrome c, which represent fast and slow rates of evolution, respectively. The bolded species in the table were selected to represent the fibrinogen and cytochrome c orthologs to determine the rate of evolution of the respective proteins.

The rate of evolution of C10orf67 is unusual in that it follows a logarithmic trend, whereas most proteins follow a linear trend.
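One way to see the difference between the two trends is to fit both models to the ortholog table. The sketch below is an illustration under stated assumptions, not a reconstruction of the original figure: it uses raw divergence (100 minus % identity) as a stand-in for whatever measure the figure plotted, and it fits the logarithmic model against ln(t + 1) so that the human entry at t = 0 can be included.

```python
import math

# (time since last common ancestor in million years, % identity), from the ortholog table.
ORTHOLOGS = [
    (0, 100), (6.65, 95), (29.44, 88.1), (96, 56.6), (96, 55.1), (96, 55),
    (96, 53.9), (96, 53.6), (96, 50.8), (90, 44), (90, 43.6), (96, 38.9),
    (96, 38.3), (105, 37.9), (312, 26.3), (435, 26.1), (435, 25.2),
    (435, 24.5), (435, 21.6), (951.8, 24.1), (684, 23.5),
]


def least_squares(xs, ys):
    """Ordinary least squares for y = slope * x + intercept; returns (slope, intercept, R^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


times = [t for t, _ in ORTHOLOGS]
divergence = [100 - identity for _, identity in ORTHOLOGS]

# Linear model: divergence against time.
_, _, r2_linear = least_squares(times, divergence)
# Logarithmic model: divergence against ln(t + 1); the +1 shift is an assumption.
_, _, r2_log = least_squares([math.log(t + 1) for t in times], divergence)

print(f"linear R^2 = {r2_linear:.2f}, logarithmic R^2 = {r2_log:.2f}")
```

On these table values the logarithmic model gives the clearly higher R², consistent with divergence rising quickly among mammals and then levelling off in more distant species.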

Clinical significance

Sarcoidosis

While the function of C10orf67 is unknown, its response to IL-13 further suggests a role for C10orf67 in sarcoidosis, as the disease is known to involve various interleukins.

Cancer

Several NCBI GEO profiles examining the effects of various factors on gene expression show that C10orf67 is expressed at varying levels in different cancer tissues,[20][21] and its mitochondrial localization may yield some insight into a clinical function. Mitochondria have been shown to have some influence on cell proliferation. Given the high energy demand of cell proliferation, it has been hypothesized that mitochondria play a role in the cell cycle; C10orf67, being localized to the mitochondria, may be involved as well.

References

  1. GRCh38: Ensembl release 89: ENSG00000179133 - Ensembl, May 2017
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. Thiébaut R, Esmiol S, Lecine P, Mahfouz B, Hermant A, Nicoletti C, Parnis S, Perroy J, Borg JP, Pascoe L, Hugot JP, Ollendorff V (2016-01-01). "Characterization and Genetic Analyses of New Genes Coding for NOD2 Interacting Proteins". PLOS ONE. 11 (11): e0165420. doi:10.1371/journal.pone.0165420. PMC 5094585. PMID 27812135.
  4. Cozier YC, Ruiz-Narvaez EA, McKinnon CJ, Berman JS, Rosenberg L, Palmer JR (October 2012). "Fine-mapping in African-American women confirms the importance of the 10p12 locus to sarcoidosis". Genes and Immunity. 13 (7): 573–8. doi:10.1038/gene.2012.42. PMC 3475762. PMID 22972473.
  5. Franke A, Fischer A, Nothnagel M, Becker C, Grabe N, Till A, Lu T, Müller-Quernheim J, Wittig M, Hermann A, Balschun T, Hofmann S, Niemiec R, Schulz S, Hampe J, Nikolaus S, Nürnberg P, Krawczak M, Schürmann M, Rosenstiel P, Nebel A, Schreiber S (October 2008). "Genome-wide association analysis in sarcoidosis and Crohn's disease unravels a common susceptibility locus on 10p12.2". Gastroenterology. 135 (4): 1207–15. doi:10.1053/j.gastro.2008.07.017. PMID 18723019.
  6. "ARMC3 armadillo repeat containing 3 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  7. "KIAA1217 KIAA1217 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  8. "C10orf67 chromosome 10 open reading frame 67 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-04-30.
  9. "Homo sapiens chromosome 10 open reading frame 67 (C10orf67), mRNA". www.ncbi.nlm.nih.gov. Retrieved 2017-02-05.
  10. "C10orf67 Gene - GeneCards | CJ067 Protein | CJ067 Antibody". GeneCards Human Gene Database. www.genecards.org. Retrieved 2017-02-06.
  11. Kozlowski, Lukasz P. "Proteome-pI - Proteome Isoelectric Point Database statistics". isoelectricpointdb.org. Retrieved 2017-04-30.
  12. Kelley, Lawrence. "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2017-05-05.
  13. "GDS4794 / 1553845_x_at". www.ncbi.nlm.nih.gov. Retrieved 2017-04-30.
  14. "GDS4981 / ILMN_1719577". www.ncbi.nlm.nih.gov. Retrieved 2017-04-30.
  15. "uncharacterized protein C10orf67, mitochondrial [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-05-05.
  16. "PSORT II Prediction". psort.hgc.jp. Retrieved 2017-05-05.
  17. "MitoFates". mitf.cbrc.jp. Retrieved 2017-05-05.
  18. "WoLF PSORT: Protein Subcellular Localization Prediction". wolfpsort.hgc.jp. Retrieved 2017-04-30.
  19. "Thr69". www.phosphosite.org. Retrieved 2017-04-30.
  20. "GDS4080 / 1553844_a_at". www.ncbi.nlm.nih.gov. Retrieved 2017-05-06.
  21. "GDS1807 / 1553843_at". www.ncbi.nlm.nih.gov. Retrieved 2017-05-06.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.