This web page was produced as an assignment for Genetics 677, an undergraduate course at UW-Madison.
What is a DNA Motif?
DNA motifs are short, recurring sequence patterns found throughout the genome that are presumed to be functionally relevant (1). Producing a protein begins with transcribing a stretch of DNA into RNA. This RNA is then extensively processed before it is translated into a functional protein. At the DNA level, motifs function as binding sites for proteins such as transcription factors, and, at the RNA level, they serve as functional sites for mRNA processing (such as splice sites and post-transcriptional modification sites) and ribosome binding (1). Transcription factor binding, mRNA processing, and ribosome binding are all vital for creating a functional protein. Understanding DNA motifs therefore offers insight into gene regulatory networks and their effects on protein production.
DNA motifs can be identified using online databases. MOTIF and MEME were used to identify motifs, and PROSITE was used to identify their functions.
Analysis
MOTIF identified 10 motifs from the HLA-C DNA sequence. The resulting motifs are associated with a variety of functions, including the immune system and the extracellular matrix. It is possible that the pathways behind these biological functions overlap, perhaps because the motifs are bound by the same proteins.
MEME identified 5 motifs shared between the human HLA-C protein sequence and the zebrafish HLA-C homolog protein sequence. The motifs were not localized to one region of the 366-amino-acid protein but were spread across it at varied locations.
MOTIF Motifs
The MOTIF database identified 10 motifs from the HLA-C DNA sequence in FASTA format.
*Click on the hyperlinked motif name to see the PROSITE profile.
EGF_1 (Epidermal Growth Factor-like domain signature 1)
Pattern: C-x-C-x(2)-{V}-x(2)-G-{C}-x-C
Function: EGF is a polypeptide which initiates a signal transduction pathway to stimulate DNA synthesis and cell proliferation
Integrin_beta (integrins beta chain cysteine-rich domain signature)
Pattern: C-x-[GNQ]-x(1,3)-G-x-C-x-C-x(2)-C-x-C
Function: Integrins are cell surface receptors that mediate cell-cell and cell-matrix adhesion
CTCK_1 (C-terminal cystine knot signature)
Pattern: C-C-x(13)-C-x(2)-[GN]-x(12)-C-x-C-x(2,4)-C
Function: The CTCK motif occurs in many different proteins and is therefore associated with multiple functions
Anaphylatoxin_1 (anaphylatoxin domain signature)
Pattern: [CSH]-C-x(2)-[GAP]-x(7,8)-[GASTDEQR]-C-[GASTDEQL]-x(3,9)-[GASTDEQN]-x(2)-[CE]-x(6,7)-C-C
Function: Anaphylatoxins mediate a localized inflammation process by causing smooth muscle contractions
Thiolase_3 (thiolases active site)
Pattern: [AG]-[LIVMA]-[STAGCLIVM]-[STAG]-[LIVMA]-C-{Q}-[AG]-x-[AG]-x-[AG]-x-[SAG]
Function: Thiolases are used in degradative or biosynthetic pathways
Tubulin (tubulin subunits alpha, beta, and gamma signature)
Pattern: [SAG]-G-G-T-G-[SA]-G
Function: Tubulins are dimeric proteins and the major component of microtubules
VWFC_1 (VWFC domain signature)
Pattern: C-x(2,3)-C-{CG}-C-x(6,14)-C-x(3,4)-C-x(2,10)-C-x(9,16)-C-C-x(2,4)-C
Function: von Willebrand factor (VWF) type C is believed to be involved in forming large protein complexes
4FE4S_FER_1 (4Fe-4S ferredoxin-type iron-sulfur binding region signature)
Pattern: C-x-{P}-C-{C}-x-C-{CP}-x-{C}-C-[PEG]
Function: Iron-sulfur proteins mediate electron transfer in various metabolic reactions
2FE2S_FER_1 (2Fe-2S ferredoxin-type iron-sulfur binding region signature)
Pattern: C-{C}-{C}-[GA]-{C}-C-[GAST]-{CPDEKRHFYW}-C
Function: Ferredoxins are small, acidic proteins that mediate electron transfer in reduction/oxidation systems
Defensin (mammalian defensins signature)
Pattern: C-x-C-x(3,5)-C-x(7)-G-x-C-x(9)-C-C
Function: Defensins play a role in the immune system by forming voltage-dependent channels in the membrane of target cells, which kills those cells
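The PROSITE patterns listed above use a compact notation: `x` is any residue, `[...]` lists allowed residues, `{...}` lists forbidden residues, and a number in parentheses gives a repeat count or range. As a rough illustration of how such a pattern can be matched against a sequence, the sketch below translates that notation into a Python regular expression. It is not how MOTIF or PROSITE work internally. The function names and the toy sequence are hypothetical, the toy sequence is not HLA-C, and the N- and C-terminal anchors (`<` and `>`) are not handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles only x, [..], {..}, single residues, and (n) / (n,m) repeats.
    """
    parts = []
    # Drop stray spaces, then walk the dash-separated elements.
    for element in pattern.replace(" ", "").split("-"):
        # Split an optional repeat such as "(2)" or "(1,3)" off the element.
        match = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if match is None:
            raise ValueError(f"unrecognised pattern element: {element!r}")
        core, repeat = match.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        # [ABC] and single letters are already valid regex syntax.
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)


def find_motif(pattern: str, sequence: str) -> list:
    """Return (1-based start, matched text) for every hit, overlaps included."""
    # A lookahead lets hits that overlap each other all be reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Toy sequence invented for illustration only; it is NOT the HLA-C sequence.
TOY_SEQUENCE = "MKAGGTGSGLL"
TUBULIN = "[SAG]-G-G-T-G-[SA]-G"

print(prosite_to_regex(TUBULIN))
print(find_motif(TUBULIN, TOY_SEQUENCE))
```

Short patterns such as the tubulin signature can match by chance, so a hit found this way is only a candidate and says nothing on its own about function.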
Protein MEME Motifs
Motif 1
E-value: 2.3E-002
Length (amino acids): 9
Start Location: 114/116
Motif 2
E-value: 5.3E-001
Length (amino acids): 8
Start Location: 226/230
Motif 3
E-value: 8.7E-001
Length (amino acids): 9
Start Location: 252/267
Motif 4
E-value: 1.7E+000
Length (amino acids): 7
Start Location: 280/285
Motif 5
E-value: 6.0E+000
Length (amino acids): 10
Start Location: 22/19
Above are 5 HLA-C protein motifs identified using MEME. The motifs are shared between human HLA-C and the zebrafish HLA-C homolog.
Each start location includes two numbers: the left number is the human start location and the right number is the zebrafish start location.
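To make the human/zebrafish start locations easier to compare, the listed values can be put into a small data structure. The sketch below is only an illustration: the class and field names are hypothetical, and it assumes that start positions are 1-based and that the stated length applies to both species. It computes where each motif ends in the human protein and how far the zebrafish start is shifted relative to the human start.

```python
from dataclasses import dataclass

HUMAN_LENGTH = 366  # length of the human HLA-C protein, as stated on this page


@dataclass(frozen=True)
class SharedMotif:
    name: str
    e_value: float
    length: int           # motif width in amino acids
    human_start: int      # assumed to be 1-based
    zebrafish_start: int  # assumed to be 1-based

    @classmethod
    def from_row(cls, name, e_value, length, start_field):
        """Build a motif from values as listed, e.g. start_field='114/116'."""
        human, zebrafish = (int(part) for part in start_field.split("/"))
        return cls(name, float(e_value), int(length), human, zebrafish)

    @property
    def human_span(self):
        """Inclusive (start, end) of the motif in the human protein."""
        return (self.human_start, self.human_start + self.length - 1)

    @property
    def offset(self):
        """Zebrafish start minus human start."""
        return self.zebrafish_start - self.human_start


# Values copied from the motif list above.
MOTIFS = [
    SharedMotif.from_row("Motif 1", "2.3E-002", 9, "114/116"),
    SharedMotif.from_row("Motif 2", "5.3E-001", 8, "226/230"),
    SharedMotif.from_row("Motif 3", "8.7E-001", 9, "252/267"),
    SharedMotif.from_row("Motif 4", "1.7E+000", 7, "280/285"),
    SharedMotif.from_row("Motif 5", "6.0E+000", 10, "22/19"),
]

# List the motifs in order of their position along the human protein.
for motif in sorted(MOTIFS, key=lambda m: m.human_start):
    start, end = motif.human_span
    print(f"{motif.name}: human {start}-{end}, zebrafish offset {motif.offset:+d}")
```

Sorting by human start position shows the order in which the motifs occur along the protein, which is different from the E-value order used in the list.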
Below are the locations of the motifs in the 366-amino-acid HLA-C protein. The top motif is the human and the bottom motif is the zebrafish.
References
1. D'haeseleer, P. What are DNA sequence motifs? Nature Biotechnology 24, 423 - 425 (2006) doi:10.1038/nbt0406-423
2. PROSITE
3. MOTIF
4. MEME
Site created by Valeri Lapacek
Genetics 677 Assignment, Spring 2012
University of Wisconsin-Madison
Last Updated: 5/23/2012