1. For this question, we will be using the dataset from the publication
Himes BE, Jiang X, Wagner P, Hu R, Wang Q, Klanderman B, Whitaker RM, Duan Q,
Lasky-Su J, Nikolos C, Jester W, Johnson M, Panettieri R Jr, Tantisira KG, Weiss ST, Lu Q.
"RNA-Seq Transcriptome Profiling Identifies CRISPLD2 as a Glucocorticoid Responsive
Gene that Modulates Cytokine Function in Airway Smooth Muscle Cells." PLoS One. 2014
Jun 13;9(6):e99625. PMID: 24926665. GEO: GSE52778.
Asthma is a chronic inflammatory respiratory disease that affects over 300 million people
worldwide. Glucocorticoids are a mainstay therapy for asthma because they exert anti-
inflammatory effects in multiple lung tissues, including the airway smooth muscle (ASM).
However, the mechanism by which glucocorticoids suppress inflammation in ASM remains
poorly understood. Using RNA-Seq, a high-throughput sequencing method, the authors
characterized transcriptomic changes in four primary human ASM cell lines that were treated
with dexamethasone, a potent synthetic glucocorticoid (1 µM for 18 hours). The dataset has
already been preprocessed for you and you will be using the following data files provided to
you on my courses:
• count.csv that contains a subset of the expression matrix from the above RNA-seq
study (i.e. matrix of counts).
• Run-Info.csv that contains some metadata on the run.
• Treatment-Info.csv that contains metadata on the treatment type.
Using the above datasets, write R code to complete the following:
(a) Read all files as tibble objects. Identify the primary keys in all datasets and write R
code to show that these keys uniquely identify observations in these datasets.
(b) Both Run-Info.csv and Treatment-Info.csv contain metadata.
Using an appropriate join function, combine the information into one tibble called
metadata.
(c) Write a pipe that computes the mean of the average length by treatment type and by
cell type. Create a 1 x 2 grid using ggplot2 and plot the average length by treatment
type and the average length by cell type. Make sure you use an appropriate legend for
the plots so that each point in the plot can be visually identified and add an appropriate
title for the grid.
(d) Note that we have metadata on 16 samples but the count matrix only contains eight
samples. Write R code that will use an appropriate join function to add the metadata
information as additional columns to observations that are present in the count matrix.
You will need to convert the data format of the count matrix from wide format to long
format. Use the function pivot_longer(); run ?pivot_longer to get information on
how to use the function.
(e) Using a side-by-side box plot, visually inspect whether there is a difference in mean (or
median) expression between different treatments. Make sure you add a legend. If working
with counts directly is not informative, use an appropriate transformation. What do you
conclude? You may add your conclusion as a comment in your R file.
(f) Using a side-by-side box plot, visually inspect whether there is a difference in mean (or
median) expression between different cell types. Make sure you add a legend. If working
with counts directly is not informative, use an appropriate transformation. What do you
conclude? You may add your conclusion as a comment in your R file.
(g) Using a side-by-side box plot, visually inspect whether there is a difference in mean (or
median) expression between different treatment types within cell types. Make sure you
add a legend. If working with counts directly is not informative, use an appropriate
transformation. What do you conclude? You may add your conclusion as a comment
in your R file.
(h) Extract the expression profile of gene "ENSG00000064607" and plot it against the
samples on the x-axis. Add different symbols and/or colors to identify which expressions
are from which treatment type.
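
The key check, reshaping, and join described in parts (a), (b), and (d) can be sketched on toy data. This is only an illustrative sketch: the tibbles below are made up, and the column names (gene_id, Run, treatment, cell) are assumptions, not the actual columns of the course files. The log2(count + 1) column shows one common transformation for count data; whether it is the appropriate one for parts (e) to (g) is left to the reader.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Toy stand-ins for the course files; all names and values are hypothetical.
counts <- tibble(
  gene_id = c("geneA", "geneB"),
  S1 = c(10, 0),
  S2 = c(25, 3)
)
metadata <- tibble(
  Run = c("S1", "S2", "S3"),
  treatment = c("control", "treated", "control"),
  cell = c("c1", "c1", "c2")
)

# A column is a primary key if no value occurs more than once,
# so this filtered tibble should have zero rows.
dup_keys <- metadata %>% count(Run) %>% filter(n > 1)

# Wide -> long: one row per (gene, sample) pair.
counts_long <- counts %>%
  pivot_longer(cols = -gene_id, names_to = "Run", values_to = "count")

# left_join keeps only samples present in the count matrix (S3 is dropped).
counts_annotated <- counts_long %>%
  left_join(metadata, by = "Run") %>%
  mutate(log_count = log2(count + 1))
```

A left join from the long count table is used here so that samples with metadata but no counts do not appear in the result; an inner join would give the same rows whenever every counted sample has metadata.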
2. Proteins are an important class of molecules that play a variety of roles in the cell, including
structural (cytoskeleton), mechanical (muscle), biochemical (enzymes), and cell signaling
(hormones). A protein is composed of one or more long chains of amino acids in a particular
order. During translation, which is the second major step in gene expression, the mRNA
is "read" according to the genetic code, which relates the DNA sequence to the amino acid
sequence. Each group of three bases in mRNA constitutes a codon, and each codon specifies
a particular amino acid (hence, it is a triplet code). The mRNA sequence is thus used as a
template to assemble the chain of amino acids that form a protein. You are provided with
a file that contains an mRNA sequence. Download and save the mRNA.csv file in your
working directory. You may read the file from your working directory using the following
command:
mRNA_first
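
The triplet code described above can be illustrated with a short base R sketch. The sequence and the three-entry lookup table are hypothetical examples chosen for illustration (they are not the contents of mRNA.csv, and the table covers only a tiny part of the standard genetic code).

```r
# Hypothetical example sequence; not the contents of mRNA.csv.
mrna <- "AUGGCCUAA"

# Start position of each codon: 1, 4, 7, ...
starts <- seq(1, nchar(mrna) - 2, by = 3)

# substring() is vectorised, so this extracts every three-base codon at once.
codons <- substring(mrna, starts, starts + 2)

# Partial lookup table from codon to amino acid (standard genetic code).
code <- c(AUG = "Met", GCC = "Ala", UAA = "Stop")

# Index the table by codon name to translate the sequence.
amino_acids <- unname(code[codons])
amino_acids
```

Any codon missing from the partial table would come back as NA, so a full translation would need all 64 codons in the lookup.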