The first step in this lab was to choose a gene family, which is a set of genes that share similar important characteristics. The family chosen was the human leukocyte antigen (HLA) complex, which helps the immune system distinguish the body's own proteins from those of foreign invaders. HLA genes are grouped into three classes, and this lab used the class I gene HLA-A. According to Markus Meissner, “Human leukocyte antigen (HLA) class I antigen defects, which are frequently present in head and neck squamous cell carcinoma (HNSCC) cells may provide the tumor with an escape mechanism from immune surveillance”. The proteins produced by HLA-A are present on the surface of the majority of cells. After the gene was chosen, various searches were conducted using the NCBI database system.
NCBI stands for the National Center for Biotechnology Information; it provides scientists with genomic and biomedical information. In section 1 of the lab, several NCBI databases were used to gather information on the gene: genomic data, mRNA/cDNA, protein, structure, PubMed, and EST (expressed sequence tag). Also in this section, the genomic sequences were obtained and saved in FASTA format, a text-based format for DNA and protein sequences that takes its name from the FASTA sequence alignment software package. In section 2, two tools were applied to the information obtained in section 1: the Expasy tool and ORF Finder. Expasy is the bioinformatics resource portal of the Swiss Institute of Bioinformatics, and the Expasy tool used here translates a nucleotide sequence into protein sequences. In Elisabeth Gasteiger's words, “it provides access to a variety of databases and analytical tools dedicated to proteins and proteomics”. The tool was used by pasting the FASTA-format nucleotide sequence of the HLA-A gene into its input box. The result was a translation into protein sequence in six different reading frames, with the open reading frames highlighted in red.
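To make the FASTA format mentioned above concrete, here is a minimal sketch of reading it in Python. A FASTA record is a header line beginning with ">" followed by one or more lines of sequence. The function name read_fasta and the demo record are made up for illustration; the record is not the real HLA-A sequence.

```python
from io import StringIO

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence may be wrapped over several lines; join them later.
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

# Hypothetical record for illustration only (NOT the real HLA-A sequence).
example = StringIO(">demo_record hypothetical example\nATGGCCGTC\natgcgctga\n")
records = list(read_fasta(example))
```

Because the sequence lines are simply joined, the same record can be pasted into a web tool or passed on to other code as a single string.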
An open reading frame (ORF) is a stretch of DNA that begins with a start codon and ends with a stop codon. Because DNA is read in groups of three nucleotides, each strand can be read in three frames, and because DNA is double stranded, there are six possible frame translations in total. In the Expasy results the open reading frames are highlighted in red, and some frames contain more start and stop codons than others. The frames with the most highlighted regions contain more ORFs and were taken to be the most likely to be translated, whereas the frames with the fewest highlighted ORFs were taken to be the least likely. Section 2 also used ORF Finder (Open Reading Frame Finder), which reports the range of each ORF and its protein translation and displays them graphically. It shows the ORFs from longest to shortest as bars, along with the exact positions of the start and stop codons and the number of amino acids in each. ORF Finder is essentially a graphical representation of the results from the Expasy tool: Expasy marks the start and stop codons by highlighting them directly in the sequence, whereas ORF Finder shows the start and stop positions graphically for the most likely and least likely proteins. Section 3 dealt with BLAST, which stands for Basic Local Alignment Search Tool.
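The six-frame translation and ORF search described for section 2 can be sketched in a few lines of Python. This is only an illustration of the idea, not how Expasy or ORF Finder are actually implemented: it assumes the standard genetic code, treats ATG as the only start codon, and uses made-up function names and a made-up 12-nucleotide sequence.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate codon by codon from position 0; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frames(dna):
    """Translate three offsets on each strand: frames +1..+3 and -1..-3."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

def find_orfs(protein, min_len=1):
    """Return (start, stop, peptide) for each M...* stretch in one translated frame.

    Positions are amino-acid indices within that frame, not nucleotide positions.
    """
    orfs, start = [], None
    for i, aa in enumerate(protein):
        if aa == "M" and start is None:
            start = i
        elif aa == "*" and start is not None:
            if i - start >= min_len:
                orfs.append((start, i, protein[start:i]))
            start = None
    return orfs

# Hypothetical toy sequence for illustration (NOT the real HLA-A sequence).
dna = "ATGGCCGTCTGA"
frames = six_frames(dna)
orfs_plus1 = find_orfs(frames["+1"])
```

For the toy sequence, frame +1 reads ATG GCC GTC TGA, so it contains one short ORF that starts at the methionine and ends at the stop codon. Raising min_len would filter out short ORFs that are unlikely to encode real proteins.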
According to Jian Ye, the “Basic local alignment search tool (BLAST) is a sequence similarity search program that can be used via a web interface or as a stand-alone tool to compare a user’s query to a database of sequences” (Ye 2016). In other words, a nucleotide or protein query is run through the appropriate BLAST program, which compares it against a database of sequences to find similar ones. There are several types of BLAST, but only four were used in this lab: nucleotide BLAST (blastn), protein BLAST (blastp), tblastn, and blastx. Nucleotide BLAST compares the FASTA-format nucleotide sequence against a database of nucleotide sequences. Protein BLAST compares protein sequences with one another, so the nucleotide sequence was first converted to a protein sequence using the Expasy tool. tblastn compares a protein query against a nucleotide sequence database translated in all six reading frames, and blastx compares a translated nucleotide query against a protein sequence database.
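The choice of BLAST program depends only on the type of the query and the type of the database, which can be summarized as a small lookup. The sketch below lists the five standard BLAST programs; the class and function names are made up for illustration, and the code does not contact NCBI or run any search.

```python
from typing import NamedTuple

class BlastProgram(NamedTuple):
    name: str
    query: str        # type of sequence the user submits
    database: str     # type of sequences stored in the database
    translated: str   # which side BLAST translates in six frames

PROGRAMS = [
    BlastProgram("blastn", "nucleotide", "nucleotide", "neither"),
    BlastProgram("blastp", "protein", "protein", "neither"),
    BlastProgram("blastx", "nucleotide", "protein", "query"),
    BlastProgram("tblastn", "protein", "nucleotide", "database"),
    BlastProgram("tblastx", "nucleotide", "nucleotide", "both"),
]

def candidate_programs(query, database):
    """Return the names of the programs that accept this query/database pairing."""
    return [p.name for p in PROGRAMS
            if p.query == query and p.database == database]

nucleotide_vs_nucleotide = candidate_programs("nucleotide", "nucleotide")
protein_vs_nucleotide = candidate_programs("protein", "nucleotide")
```

A nucleotide query against a nucleotide database has two options: blastn compares the nucleotides directly, while tblastx translates both sides and compares them at the protein level.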
Each BLAST search produces a graphical representation of the compared sequences, together with a calculation of the statistical significance of each match. Throughout this lab, all of the searches were conducted on the NCBI site, which houses a series of databases, and all were based on the similarity of the nucleotide sequences entered as queries. NCBI was helpful in providing a range of data about the nucleotide sequence of the chosen gene.