R Code and Dataset Java Project

1. For this question, we will be using the dataset from the publication:

Himes BE, Jiang X, Wagner P, Hu R, Wang Q, Klanderman B, Whitaker RM, Duan Q, Lasky-Su J, Nikolos C, Jester W, Johnson M, Panettieri R Jr, Tantisira KG, Weiss ST, Lu Q. "RNA-Seq Transcriptome Profiling Identifies CRISPLD2 as a Glucocorticoid Responsive Gene that Modulates Cytokine Function in Airway Smooth Muscle Cells." PLoS One. 2014 Jun 13;9(6):e99625. PMID: 24926665. GEO: GSE52778.

Asthma is a chronic inflammatory respiratory disease that affects over 300 million people worldwide. Glucocorticoids are a mainstay therapy for asthma because they exert anti-inflammatory effects in multiple lung tissues, including the airway smooth muscle (ASM). However, the mechanism by which glucocorticoids suppress inflammation in ASM remains poorly understood. Using RNA-Seq, a high-throughput sequencing method, the authors characterized transcriptomic changes in four primary human ASM cell lines that were treated with dexamethasone, a potent synthetic glucocorticoid (1 µM for 18 hours).

The dataset has already been preprocessed for you, and you will be using the following data files provided to you on my courses:

• count.csv, which contains a subset of the expression matrix from the above RNA-Seq study (i.e., a matrix of counts).
• Run-Info.csv, which contains some metadata on the run.
• Treatment-Info.csv, which contains metadata on the treatment type.

Using the above datasets, write R code to complete the following:

(a) Read all files as tibble objects. Identify the primary keys in all datasets and write R code to show that these keys uniquely identify observations in these datasets.

(b) Both Run-Info.csv and Treatment-Info.csv contain metadata. Using an appropriate join function, combine the information into one tibble called metadata.

(c) Write a pipe that computes the mean of the average length by treatment type and by cell type. Create a 1 x 2 grid using ggplot2 and plot the average length by treatment type and the average length by cell type. Make sure you use an appropriate legend for the plots so that each point in the plot can be visually identified, and add an appropriate title for the grid.

(d) Note that we have metadata on 16 samples but the count matrix only contains eight samples. Write R code that uses an appropriate join function to add the metadata as additional columns to the observations that are present in the count matrix. You will need to convert the count matrix from wide format to long format. Use the function pivot_longer(). Use "?pivot_longer" to get information on how to use the function.

(e) Using a side-by-side box plot, visually inspect whether there is a difference in mean (or median) expression between different treatments. Make sure you add a legend. If working with counts directly is not informative, use an appropriate transformation. What do you conclude? You may add your conclusion as a comment in your R file.

(f) Using a side-by-side box plot, visually inspect whether there is a difference in mean (or median) expression between different cell types. Make sure you add a legend. If working with counts directly is not informative, use an appropriate transformation. What do you conclude? You may add your conclusion as a comment in your R file.

(g) Using a side-by-side box plot, visually inspect whether there is a difference in mean (or median) expression between different treatment types within cell types. Make sure you add a legend. If working with counts directly is not informative, use an appropriate transformation. What do you conclude? You may add your conclusion as a comment in your R file.

(h) Extract the expression profile of gene "ENSG00000064607" and plot it against the samples on the x-axis. Add different symbols and/or colors to identify which expressions are from which treatment type.

2. Proteins are an important class of molecules that play a variety of roles in the cell, including structural (cytoskeleton), mechanical (muscle), biochemical (enzymes), and cell signaling (hormones). A protein is composed of one or more long chains of amino acids in a particular order. During translation, which is the second major step in gene expression, the mRNA is "read" according to the genetic code, which relates the DNA sequence to the amino acid sequence. Each group of three bases in mRNA constitutes a codon, and each codon specifies a particular amino acid (hence, it is a triplet code). The mRNA sequence is thus used as a template to assemble the chain of amino acids that form a protein.

You are provided with a file that contains an mRNA sequence. Download and save the mRNA.csv file in your working directory. You may read the file from your working directory using the following command: mRNA_first
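The triplet reading described in question 2 can be illustrated with a minimal base R sketch. The sequence below is a made-up example rather than the contents of mRNA.csv, and the lookup table is only a four-entry excerpt of the standard genetic code, so this shows the idea rather than a full translation routine.

```r
# Hypothetical example sequence; not taken from mRNA.csv.
mrna <- "AUGUUUGGCUAA"

# Cut the string into consecutive, non-overlapping triplets (codons).
starts <- seq(1, nchar(mrna) - 2, by = 3)
codons <- substring(mrna, starts, starts + 2)

# A deliberately tiny excerpt of the standard genetic code.
code_excerpt <- c(AUG = "Met", UUU = "Phe", GGC = "Gly", UAA = "Stop")

# Look up each codon by name; codons missing from the excerpt would give NA.
amino_acids <- unname(code_excerpt[codons])
```

A complete solution would need all 64 codons in the lookup table and a rule for where reading starts and stops.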
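For question 1, parts (a), (b), and (d), the following sketch shows one possible shape of the key check, the metadata join, and the wide-to-long conversion. It uses small toy tibbles in place of the three CSV files, and every column name and value (Run, avgLength, cell, dex, gene) is an assumption made for illustration; the real files may use different names.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy stand-ins for the three files; all names and values are hypothetical.
run_info <- tibble(
  Run       = c("R1", "R2", "R3", "R4"),
  avgLength = c(126, 126, 87, 98)
)
treatment_info <- tibble(
  Run  = c("R1", "R2", "R3", "R4"),
  cell = c("A", "A", "B", "B"),
  dex  = c("untrt", "trt", "untrt", "trt")
)
counts <- tibble(
  gene = c("g1", "g2"),
  R1   = c(10, 0),
  R2   = c(25, 3)
)

# (a) A column is a key if no value occurs more than once.
dup_runs <- run_info %>% count(Run) %>% filter(n > 1)
nrow(dup_runs)  # zero rows means Run uniquely identifies each observation

# (b) Combine the two metadata tables on the shared key.
metadata <- inner_join(run_info, treatment_info, by = "Run")

# (d) Wide to long, then attach metadata only for samples in the count matrix.
counts_long <- counts %>%
  pivot_longer(cols = -gene, names_to = "Run", values_to = "count") %>%
  left_join(metadata, by = "Run")
```

Starting the join from the long count table is what restricts the result to the samples that actually appear in the count matrix; metadata rows for the other samples are simply never matched.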
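For question 1, parts (c) and (e), a grouped summary and a box plot on a transformed scale might look like the sketch below. The long-format tibble is invented for illustration, its column names are assumptions, and log2(count + 1) is just one common choice of transformation for skewed count data, not necessarily the one the assignment expects.

```r
library(dplyr)
library(ggplot2)

# Hypothetical long-format data: one row per gene and sample.
expr_long <- tibble::tibble(
  gene      = rep(c("g1", "g2", "g3"), times = 4),
  Run       = rep(c("R1", "R2", "R3", "R4"), each = 3),
  dex       = rep(c("untrt", "trt", "untrt", "trt"), each = 3),
  cell      = rep(c("A", "A", "B", "B"), each = 3),
  avgLength = rep(c(126, 126, 87, 98), each = 3),
  count     = c(0, 12, 950, 3, 40, 2100, 1, 9, 700, 5, 55, 1800)
)

# (c) Reduce to one row per sample, then average by treatment and cell type.
mean_length <- expr_long %>%
  distinct(Run, dex, cell, avgLength) %>%
  group_by(dex, cell) %>%
  summarise(mean_avg_length = mean(avgLength), .groups = "drop")

# (e) Side-by-side box plot; adding 1 keeps zero counts defined under log2.
p <- ggplot(expr_long, aes(x = dex, y = log2(count + 1), fill = dex)) +
  geom_boxplot() +
  labs(title = "Expression by treatment", x = "Treatment",
       y = "log2(count + 1)", fill = "Treatment")
```

Swapping `dex` for `cell` in the aesthetics gives the part (f) plot, and adding `facet_wrap(~ cell)` is one way to approach part (g).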

Explanation & Answer

Attached.

R Assignment

R Assignment: Genetic Code
N...

