The community site for and by
developmental and stem cell biologists

PhD studentship in Cardiff – Chronic radiation injury in Drosophila

Posted on 11 October 2017

Closing Date: 15 March 2021

I am looking for talented and driven candidates for a 4-year PhD programme to join my laboratory at the European Cancer Stem Cell Research Institute at Cardiff University. The studentship is funded by the GW4 BioMed Doctoral Training Partnership of the MRC, starting October 2018.

The successful candidate will have a 1st or 2:1 class degree in Biomedical or Biological Sciences, or a related discipline, and an interest in basic biomedical research and in vivo approaches. Applications from EU citizens are welcome.

The project aims to understand the molecular mechanisms underlying chronic radiation injury, which is an important limitation to the efficacy of cancer radiotherapy. The project combines transcriptional profiling, functional assays, microscopy and histology, in collaboration with Dr Pablo Orozco ter Wengel.

For more details of the project see here.

For further questions and informal enquiries contact Dr Joaquín de Navascués at deNavascuesJ@cardiff.ac.uk.

For further details of the partnership, including eligibility and expected grades of the candidates, visit the DTP website at http://www.gw4biomed.ac.uk/

 

Application: http://www.cardiff.ac.uk/study/postgraduate/funding/view/mrc-gw4-biomed-doctoral-training-partnership-phd-in-biosciences

Deadline for applications: Friday 24th November 2017, midnight

Categories: Jobs


PhD studentship in Cardiff – Intestinal stem cells in Drosophila

Posted on 11 October 2017

Closing Date: 15 March 2021

I am looking for talented and driven candidates for a 4-year PhD programme to join my laboratory at the European Cancer Stem Cell Research Institute at Cardiff University. The studentship is funded by the South-West Doctoral Training Partnership of the BBSRC, starting September 2018.

The successful candidate will have a 1st or 2:1 class degree in Biological Sciences or a related discipline, an appetite for technological development as well as an interest in quantitative approaches to stem cell biology. Applications from EU citizens are welcome.

The project aims at understanding how adult stem cells respond to the local needs for cell replacement through lineage tracing, genetic manipulation, confocal microscopy and the development of a new, Gal4-compatible, drug-inducible method for the temporal control of transgene expression in Drosophila. This will be done in collaboration with Prof Helen White-Cooper’s lab in Cardiff and that of Dr Edward Morrissey in Oxford (who will bring in his expertise in mathematical modelling).

For more details of the project see here.

For further questions and informal enquiries contact Dr Joaquín de Navascués at deNavascuesJ@cardiff.ac.uk.

For further details of the partnership, including eligibility and expected grades of the candidates, visit the DTP website at http://www.bristol.ac.uk/swbio/

 

Application: https://www.cardiff.ac.uk/study/postgraduate/funding/view/bbsrc-swbio-phd-in-biosciences

Deadline for applications: Monday 4th December 2017, midnight

Categories: Jobs

2018 JDB Travel Award

Posted on 11 October 2017

 

The Journal of Developmental Biology is inviting applications for the 2018 JDB Travel Award. The award is for postdoctoral researchers and PhD students to attend a conference of their choice in 2018.

The award will consist of 800 CHF (Swiss Francs), and nominations must be in by 15/10/17.

Find out more on the JDB website:

http://www.mdpi.com/journal/jdb/awards/185/view

 

 

 

Categories: Events, News

Two birds with one stone: CTCF control of dynamic gene expression during heart development.

Posted on 9 October 2017

CTCF binds to chromatin and is thought of as an architectural protein in the genome. If the genome were a text, CTCF would act like the punctuation marks, so that words are grouped together becoming meaningful sentences.

When I started my PhD, the Manzanares lab had been fruitfully collaborating with that of Jose Luis Gómez-Skarmeta at the CABD in Seville for several years on different projects. In one of them, they were studying IrxA gene regulation in the developing embryo, and previous results they had generated suggested that CTCF had a role in the regulation of this cluster. This is a fascinating gene cluster, duplicated during vertebrate evolution, which comprises only three homeobox-containing genes spread over more than one megabase of DNA. The first two genes (Irx1 and Irx2) show nearly identical expression patterns, while the third, Irx4, shows distinct, prominent expression in the heart.

The ideal experiment we first thought of was to delete a particular CTCF binding site. Another option was to prevent CTCF protein from binding there. This could be achieved by using a conditional mouse mutant allele, where Ctcf is flanked by two loxP sites (1), which allows the specific deletion of this gene, in our case using a Cre-driver specific for the developing heart. Other groups had already used this Ctcf floxed line and conditionally deleted the gene in several contexts, such as the developing limb or T-cells, finding that CTCF had roles in processes such as apoptosis or proliferation.

Although we initially aimed to study the function of a particular CTCF binding site, I started by characterizing the phenotype derived from the lack of CTCF in the developing heart. Quickly, this experiment acquired a life of its own, when we saw that deleting CTCF during heart development led to embryonic death. This highlighted the importance of CTCF in the process, and thus we decided to focus on the overall function of CTCF in the developing heart.

In order to obtain a global view of the effects of Ctcf deletion on gene regulation, we carried out RNA sequencing of control hearts and of two other scenarios: one or two copies of Ctcf deleted. We found that only one copy of Ctcf is enough for correct gene expression and normal heart development (mice reach adulthood with no problems), so we focused on the comparison of gene expression between wild-type and Ctcf homozygous mutant embryonic hearts.

When we carried out functional annotation of the list of approximately two thousand genes with significant changes in expression between wild-types and mutants, the Gene Ontology (GO) term “heart development” stood out in all different GO tools we used. What also caught our attention was that the “heart development” and related GO terms were only present in the down-regulated gene set. To corroborate this finding we analyzed the expression pattern of some of the genes belonging to the “heart development” GO term (such as the master cardiac regulator Nkx2-5) by in situ hybridization, and we could observe clear downregulation of all genes examined. This led us to think that CTCF controlled heart development by directly regulating those genes critical for embryonic patterning.

Our next question was pretty straightforward: what is the relationship between CTCF binding to the genome and our set of differentially expressed genes in the Ctcf knockout? Taking advantage of the previously described genome-wide distribution of enhancers and CTCF binding generated by ChIP-seq in the mouse heart by Bing Ren’s lab, we tried to tackle this question. The analysis showed that both up and down-regulated genes were closer to a CTCF binding site than expected, but that only down-regulated genes were closer to a heart enhancer. This observation suggested that CTCF is acting to bring together gene promoters and developmental enhancers to achieve proper expression in this developing system.

Following this last idea, we wanted to study in more detail whether the alteration in transcription of selected genes in the absence of CTCF was due to changes in the 3D chromatin structure. When we were developing this work, chromatin conformation capture (3C) and its derivatives were the way to go to study 3D chromatin structure, so we decided to use this emerging technology to address the issues we had at hand. At this point, two questions came up. First, how do we choose interesting candidates for further studies among 2000 genes? And second, which 3C-derived technique should we use? For the first, we chose Irx4, as it is one of the members of the IrxA gene cluster, on which the Manzanares and Gómez-Skarmeta labs had already been working, and it has an important role in ventricle identity (2,3). Most importantly, we had seen that it was strongly downregulated in our RNA-seq data, had predicted heart enhancers and ChIP-seq CTCF binding sites in its vicinity, and in situ hybridization confirmed its down-regulation in mutant hearts. Regarding the 3C-derived technique to use, we chose 4C-seq because we knew very little about the Irx4 regulatory landscape and in this way we could address all the interactions occurring from specific regions in the locus, as this is a “one-versus-all” approach.

Standard 4C-seq protocols are supposed to use 1×10⁷ cells, and this was a challenge given that we planned to carry out the analysis using E10.5 embryonic hearts, which are composed of roughly 30-40 thousand cells. For example, in a previous work showing p300 ChIP-seq in E11.5 embryonic hearts, 270 hearts were used (4). 270!! Obtaining wildtype hearts for setting up the technique was relatively straightforward, but Ctcf heart-specific mutants would only be one fourth of the litters. We managed to obtain reproducible 4Cs using pools of 45-60 for each genotype and replicate, but each pool took 2-3 months and long hours on the dissecting scope to collect. Nevertheless, the help of Claudio Badía-Careaga made this enormous effort, and the project in general, much more bearable.

Our 4C-seq showed that the Irx4 promoter interacted with surrounding CTCF binding sites. One of them was located between Irx1/Irx2 and Irx4, and was therefore a prime candidate for mediating the differences in expression of these two IrxA genes in the developing heart. To see if this site was important for regulating Irx4 in vivo, we deleted it with the new CRISPR-Cas9 system that Isabel Rollan in our lab had been working to set up. We could observe a subtle decrease of Irx4 expression levels in the heart when we compared the deletion mutant to controls. But we were so focused on heart expression that we almost missed the fact that we had Irx4 ectopic expression. And it was in an Irx1 expression territory!! Thus, this CTCF binding site is crucial for proper Irx4 expression, both by aiding in its regulation by heart-specific enhancers, and by preventing its ectopic expression in territories where Irx1/Irx2 are normally expressed, possibly by not allowing other tissue-specific enhancers to activate the gene.

 

Top panel, Irx4 expression in the heart. Bottom panel, Irx4 ectopic expression in the oral-esophageal region of foregut, an Irx1 expression territory.

 

We were very pleased with these results, but we struggled for a while in publishing our data. We presented the work at several meetings and in one of them, somebody asked what happened with the upregulated genes, what are those? And why are there no heart development genes among them?  Sure enough, we had looked at them but up to then merely described the functional enrichments. Therefore, we went back and re-checked the GO terms associated with up-regulated genes in Ctcf mutant hearts. Two categories stood out by the number of genes up-regulated: translation and mitochondria. The great majority of ribosomal protein-coding genes from both small and large subunits were among this subset, and more than 300 genes labelled with mitochondrial GO terms were up-regulated. When cardiomyocytes mature, they require two things in abundance: protein to build sarcomeres and energy to contract. Therefore, loss of Ctcf appeared to push cardiomyocytes towards maturation.

Looking at the mitochondrial genes differentially expressed in our RNA-seq, we found both functional and structural genes. We wanted to see if this upregulation led to more mitochondria and if these were functional. This was new territory for us, and we were extremely lucky that in our own institute we had colleagues who are world experts in mitochondrial biology. Ana Victoria Lechuga-Vieco, Rocio Nieto-Arellano and Jose Antonio Enriquez guided us through the analysis of the mitochondrial phenotype in our mutants. The RNA-seq analysis showed an increase in several OXPHOS (oxidative phosphorylation) system components. We also saw this increase at the protein level. However, the assembly of some OXPHOS super-complexes was impaired. Among my favorite results were the pictures Ana Lechuga-Vieco took with TEM (transmission electron microscopy). Not only were they absolutely beautiful, like something out of a biology textbook, but they showed that loss of CTCF led to immature and sick-looking mitochondria, and that the sarcomeres assembled earlier than expected in the Ctcf mutant hearts. The 6-month-long wait for the TEM to be available had been worth it! It therefore appears that, despite increased transcription of maturation genes, proper assembly of cellular components does not follow, leading to non-viable cardiac cells.

 

Transmission electron microscopy (TEM) showing swollen and larger mitochondria in the Ctcf KO hearts (Ctcf fl/fl ; Nkx2.5-Cre) in comparison to the control.

 

We concluded that CTCF was controlling two transcriptional programs in opposite directions, and this dynamic was necessary for proper formation of the heart. It is curious how the attempt to close a story opened a new one. And also, how a fresh look at your data can change the course of the story.

 

Numbers show part of the CTCF team in Miguel Manzanares lab. 1, Claudio Badía-Careaga. 2, Isabel Rollán. 3, Miguel Manzanares. 4, Melisa Gómez Velázquez. 5, Alba Alvarez

 

This post was written by Melisa Gómez-Velázquez and Miguel Manzanares

PLOS Genetics paper: CTCF counter-regulates cardiomyocyte development and maturation programs in the embryonic heart.

 

 

Bibliography

  1. Heath H, Ribeiro de Almeida C, Sleutels F, Dingjan G, van de Nobelen S, Jonkers I, et al. CTCF regulates cell cycle progression of αβ T cells in the thymus. EMBO J. 2008;27(21):2839–50. doi: 10.1038/emboj.2008.214.
  2. Bruneau BG, Bao ZZ, Fatkin D, Xavier-Neto J, Georgakopoulos D, Maguire CT, et al. Cardiomyopathy in Irx4-deficient mice is preceded by abnormal ventricular gene expression. Mol Cell Biol. 2001;21(5):1730–6. doi: 10.1128/MCB.21.5.1730-1736.2001.
  3. Bao ZZ, Bruneau BG, Seidman JG, Seidman CE, Cepko CL. Regulation of chamber-specific gene expression in the developing heart by Irx4. Science. 1999;283(5405):1161–4.
  4. Blow MJ, McCulley DJ, Li Z, Zhang T, Akiyama JA, Holt A, et al. ChIP-seq identification of weakly conserved heart enhancers. Nat Genet. 2010;42(9):806–10. doi: 10.1038/ng.650.

Categories: Research

€1000 Travel Award for Life Science Researchers

Posted on 9 October 2017

Hi Everyone,

I thought I would post our travel award for life science researchers which might be of interest.

At Antibody Genie we run a quarterly travel award for Life Science researchers. To enter, all you need to do is write an 800-word blog piece on your research area or on your life as a PhD/Post-Doc/PI.

For entrants that blog on their research area, we also ask them to submit a diagram to explain their pathway/structures etc. We redesign these diagrams and post them on the site.

In our last round we had 20+ entrants, with two researchers invited to international conferences based on promotion that we did on their blog posts.

You can find out more information here:

https://www.antibodygenie.com/travel-award/

Best of luck

Seán

Categories: Resources

Two fully-funded PhD positions in Wnt trafficking at the LSI in Exeter

Posted on 9 October 2017

Closing Date: 15 March 2021

The process of subdividing a tissue into functional units represents a classic problem in pattern formation. Signalling proteins – so-called morphogens – orchestrate this process. The traditional view is that morphogens are released from a local source and slowly diffuse through a neighbouring tissue to build up a gradient. As Wnt signals act as a key morphogen in tissue patterning, it is believed that these signalling proteins similarly diffuse over a long range to exert their morphogenetic function.

However, we have recently identified long signalling filopodia – so-called cytonemes – that tightly control transport of Wnt proteins. We have observed fast and directed distribution through expanding tissues. It is unclear how such dissemination generates a stable and robust signalling gradient.

The two fully-funded PhD projects will focus on:

(1) deciphering the molecular mechanism of Wnt transport,

(2) simulating its impact on patterning,

(3) validating the predictions in growing tissue,

(4) comparing Wnt trafficking in healthy and diseased tissue.

 

The Scholpp lab in the Living Systems Institute (LSI) Exeter is an optimal environment to conduct these doctoral training studies. The LSI offers unique training opportunities for the PhD student, as it allows the student to address key problems in life sciences with state-of-the-art equipment in an interdisciplinary environment. The centre facilitates interaction between empirical and theoretical scientists, leading to the development of predictive modelling capacity from experimental data and to accurate, mechanistic descriptions of biophysical processes for entire ‘living systems’. The project includes close collaboration with the universities of Bristol and Cardiff to complement the required skill sets.

Application deadline: 24.11.2017

Start date: Sep/Oct 2018

Detailed information can be found here:

BBSRC-funded project: Wnt transport in tissue patterning
MRC-funded project: Wnt transport in gastric cancer

Categories: Jobs

PhD or Postdoctoral Positions

Posted on 7 October 2017

Closing Date: 15 March 2021

The López-Schier laboratory at the Helmholtz Zentrum Munich in Germany is seeking creative and highly motivated PhD students or postdoctoral scholars to work within our group of 9 graduate students and postdoctoral fellows. The working language of the laboratory is English.

 

Our group focuses on understanding the development, regeneration and function of sensory systems and the neuronal control of metabolism. We use the zebrafish as an experimental model, and integrate molecular, cellular, behavioural and clinical data. We have also developed new technical approaches to understand organogenesis, including cell-fate acquisition after regeneration from tissue-resident progenitor cells. Mutations in many of the genes that we have identified are responsible for neurological diseases and cancer. Our ultimate goal is to connect our studies of fundamental mechanisms to the understanding of health disorders in humans.

 

We currently have 4 fully funded openings for the following projects:

 

  1. Cellular and genetic bases of organogenesis, including cell packing and tissue remodelling. This project combines single-cell transcriptional profiling, genome engineering using CRISPR/Cas9 and quantitative live imaging data by light-sheet microscopy. Preference will be given to candidates with theoretical or practical knowledge in cell biology or biophysics.

  2. Control of cell number, organ size and proportions. Using state-of-the-art high-resolution cell tracking, optogenetics, genome engineering and machine learning, we attempt to understand how cells self-organize and to predict cellular behavior during the regenerative response after tissue injury. This project is ideal for a candidate with a background in physics or engineering and a good command of computer programming.

  3. Neuronal control of metabolism. This project combines transcriptional profiling, genome engineering and neuronal-activity imaging by light-sheet microscopy to understand the neuronal basis of systemic metabolism, and neuronal dysfunction in metabolic syndromes. Preference will be given to candidates with theoretical or practical knowledge in neuroscience, electrophysiology and/or optogenetics.

 

 

Qualifications & skills

– University studies in biology-related sciences, physics, engineering or computer science (PhD)

– Ideally having recently completed or about to complete a PhD (Postdoctoral)

– Having published or likely to publish at least one first-author paper in a first/second tier journal (Postdoctoral)

– Candidates for all positions should have a strong inner drive, independence, and willingness to work in a highly interdisciplinary team

– A good command of the English language is essential

 

Laboratory

The team’s projects are interdisciplinary, and are aimed at understanding the basic rules that allow sensory systems to develop, regenerate and function. We use confocal, spinning-disc and light-sheet microscopy imaging, biochemistry, genome engineering by CRISPR/Cas9, laser nanosurgery, optogenetics, and machine learning.

 

Environment

The Helmholtz Zentrum is an innovative, well-equipped and scientifically stimulating élite research centre located on the outskirts of Munich, one of the most attractive and innovative major cities in Germany. Situated at the foothills of the Alps, Munich is a cosmopolitan city that has ranked among those with the highest quality of life in Europe.

 

Contact

Please apply by email only, including a cover letter with a short statement of research interests and motivation, and a Curriculum Vitae with the names and email addresses of two or three academic references, to:

 

Dr. Hernán López-Schier

Research Unit Sensory Biology & Organogenesis

Helmholtz Zentrum München
Ingolstädter Landstrasse 1
85764 Neuherberg – Munich, Germany

E-mail: hernan.lopez-schier@helmholtz-muenchen.de

 

Website: http://www.gsn.uni-muenchen.de/people/faculty/associate/lopez-schier/index.html

 

 

Categories: Jobs

Converting excellent spreadsheets to tidy data

Posted on 6 October 2017

Structuring data according to the ‘tidy data‘ standard simplifies data analysis and data visualisation. But understanding the structure of tidy data does not come naturally (in my experience), since it is quite different from the structure of data in spreadsheets or tables. Here, I explain how to convert typical spreadsheet data to tidy data to facilitate data visualisation.

When I started to use ggplot2 for data visualisation, I was impressed by its power and elegance. With ggplot2, a free, open source package for R, you can make high-quality graphs. For instance, the graphs that I made for a previous blog advocating the importance of showing the actual data instead of summaries were made with ggplot2 (scripts available here). Before I could make those graphs, however, I was struggling with the tidy data structure, which is a requirement for using ggplot2.

The reason that I was struggling is that as a researcher I am used to collecting, processing and summarising my data in spreadsheets. In addition, I am used to reading and presenting data in a tabular format. Tidy data contains exactly the same information as non-tidy, spreadsheet data, but it is organised in a different way. The tidy data standard defines a consistent way to structure datasets, facilitating data manipulation and visualization (Wickham, 2014). So, once the data is organised according to the tidy data principles, making the graphs is relatively straightforward.

Below, I will explain how data in the typical spreadsheet format (and I’m not naming names here) can be converted to tidy data using R. I will use two examples of data in a spreadsheet format that I often encounter during my research. The first example is a dataset in which a single parameter is compared across different conditions. The second example is a dataset from a time-series experiment with different conditions.

Before we can perform the conversion to the tidy format, we need to define what types of information are stored in a dataset. I will use the nomenclature that is used in Tidy Data by Hadley Wickham. Let’s say that you have measured cell lengths under a number of different experimental conditions (A,X,Y,Z). An example dataset in the typical non-tidy spreadsheet format would look like this:

A     X     Y     Z
0.8  7.1  9.9  5.1
1.9  6.7  9.5  4.5
2.6  6.3  8.5  4.4
3.6  6.2  8.7  4.3
4.4  4.5  7.3  3.3
5.7  4.3  6.2  3.1
6.3  4.1  5.5  2.5
7.1  4.8  6.8  3.4
8.2  5.1  7.3  3.6
9.2  5.5  7.9  3.9

The dataset consists of values and these are usually numbers or text. In this dataset, all the measured values represent the same characteristic (cell length) and therefore belong to a single variable. Here, we indicate that variable as “Length”. Since the values that represent cell length are experimentally determined, “Length” is a measured variable. The different conditions (A,X,Y,Z) are also assigned to a variable: “Condition”. Since this variable was known when the experiment was designed, it is a fixed variable.

The most important requirement for tidy data is that each variable occupies only a single column. This rule is violated in the example above, because values that belong to the same variable (“Length”) are distributed over different columns. The tidy version of the dataset would have two columns, one with the fixed variable “Condition” and one with the measured variable “Length”. Below, I will explain how to use R to convert the spreadsheet data into tidy data.

In the next part, which is a tutorial on tidying data in R, I assume some basic knowledge of R, including setting the working directory and installing packages. The packages that need to be installed are tidyr and ggplot2. The example dataset is available here.

 

Example 1: Tidying a spreadsheet with different conditions

First, we will read the non-tidy spreadsheet data from the file ‘test-data.csv’ and assign the data to a dataframe named ‘data_spread’:

> data_spread <- read.csv('test-data.csv')
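If the CSV file is not at hand, the same dataframe can also be built directly in R. The sketch below simply types in the ten rows from the table shown earlier; it is a convenience for following along and not part of the original workflow.

```r
# Build the example spreadsheet directly, one vector per column
# (values copied from the table shown earlier in this post).
data_spread <- data.frame(
  A = c(0.8, 1.9, 2.6, 3.6, 4.4, 5.7, 6.3, 7.1, 8.2, 9.2),
  X = c(7.1, 6.7, 6.3, 6.2, 4.5, 4.3, 4.1, 4.8, 5.1, 5.5),
  Y = c(9.9, 9.5, 8.5, 8.7, 7.3, 6.2, 5.5, 6.8, 7.3, 7.9),
  Z = c(5.1, 4.5, 4.4, 4.3, 3.3, 3.1, 2.5, 3.4, 3.6, 3.9)
)

# Number of rows (measurements) and columns (conditions)
dim(data_spread)
```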

To verify this step, you can print the first six lines of the dataframe by using the function head() with the name of the dataframe as the argument:

> head(data_spread)

Which will return:


    A   X   Y   Z
1 0.8 7.1 9.9 5.1
2 1.9 6.7 9.5 4.5
3 2.6 6.3 8.5 4.4
4 3.6 6.2 8.7 4.3
5 4.4 4.5 7.3 3.3
6 5.7 4.3 6.2 3.1

The first row of the dataframe is the header, which specifies the experimental condition of each column (A,X,Y,Z). All other rows contain the observed cell length.

To convert the dataframe to a tidy format I use the function gather(), which is part of the tidyr package. Using gather(), I specify that “Condition” is taken from the header and should go in the first column, and that all the values (which are all cell lengths) should be gathered in the second column, which should have the name “Length”. The result is assigned to the dataframe ‘data_tidy’:

> library(tidyr)
> data_tidy <- gather(data_spread, Condition, Length)
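In more recent versions of tidyr (1.0.0 and later), gather() is marked as superseded in favour of pivot_longer(). The following is a hedged sketch of the equivalent call, assuming such a version is installed. The dataframe is rebuilt inline with only the first three rows so that the block runs on its own, and the name 'data_tidy_long' is introduced here purely for illustration. Because pivot_longer() works through the spreadsheet row by row rather than column by column, the result is sorted by “Condition” to make it comparable with the output of gather().

```r
library(tidyr)

# First three rows of the example spreadsheet, typed in for illustration
data_spread <- data.frame(
  A = c(0.8, 1.9, 2.6),
  X = c(7.1, 6.7, 6.3),
  Y = c(9.9, 9.5, 8.5),
  Z = c(5.1, 4.5, 4.4)
)

# Gather every column: header names go to "Condition", values to "Length"
data_tidy_long <- pivot_longer(
  data_spread,
  cols = everything(),
  names_to = "Condition",
  values_to = "Length"
)

# Sort so that all rows of one condition are grouped together
data_tidy_long <- data_tidy_long[order(data_tidy_long$Condition), ]
head(data_tidy_long)
```

Note that pivot_longer() returns a tibble rather than a plain dataframe; for plotting with ggplot2 this makes no difference.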

To show the contents of the entire dataframe, enter the name of the dataframe at the prompt (the result is not shown here):

> data_tidy

To show the first six lines of the tidy dataframe ‘data_tidy’, use the function head():

> head(data_tidy)

  Condition Length
1         A    0.8
2         A    1.9
3         A    2.6
4         A    3.6
5         A    4.4
6         A    5.7

To show the last six lines of the tidy dataframe, use the function tail():

> tail(data_tidy)

   Condition Length
35         Z    3.3
36         Z    3.1
37         Z    2.5
38         Z    3.4
39         Z    3.6
40         Z    3.9

With the new, tidy dataframe ‘data_tidy’ it is straightforward to plot the data with ggplot2:

> library(ggplot2)
> ggplot(data_tidy, aes(x = Condition, y = Length)) +
     geom_jitter(position = position_jitter(0.3), size = 1, color = "grey40")

 

Wrap-up
In this example we have converted a spreadsheet with measurements, taken under different conditions, into tidy data. In the tidy data structure, only two columns are used, one for the fixed variable “Condition” and another one for the values that belong to the measured variable “Length”. This format, which looks like a list, may seem odd. It also uses more storage space (or memory), since the “Condition” is listed for every value. Still, this format makes perfect sense in R and it can be used to plot the data with ggplot2.
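One way to see why this list-like format is convenient: with a single “Condition” column, a calculation per condition does not require looping over spreadsheet columns. The sketch below uses only base R. The tidy dataframe is typed in again so that the block is self-contained, and 'mean_length' is just an illustrative name.

```r
# Tidy version of the example data: one row per measurement
data_tidy <- data.frame(
  Condition = rep(c("A", "X", "Y", "Z"), each = 10),
  Length = c(
    0.8, 1.9, 2.6, 3.6, 4.4, 5.7, 6.3, 7.1, 8.2, 9.2,  # A
    7.1, 6.7, 6.3, 6.2, 4.5, 4.3, 4.1, 4.8, 5.1, 5.5,  # X
    9.9, 9.5, 8.5, 8.7, 7.3, 6.2, 5.5, 6.8, 7.3, 7.9,  # Y
    5.1, 4.5, 4.4, 4.3, 3.3, 3.1, 2.5, 3.4, 3.6, 3.9   # Z
  )
)

# Mean cell length for each condition, using the formula interface
mean_length <- aggregate(Length ~ Condition, data = data_tidy, FUN = mean)
mean_length
```

The same pattern works for any other summary function (median, sd, length), which is part of what makes the single-column layout practical.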

 

Example 2: Tidying a spreadsheet with time-dependent data and different conditions
Let’s take this a step further with the same non-tidy spreadsheet data “data_spread”. Now, suppose that the data is from a time series and column A represents time. To indicate that column A is actually “Time” you can change the name of the first column using the function colnames():

> colnames(data_spread)[1] <- "Time"

You can verify that the name of the first column has been changed by using the function head():

> head(data_spread)

   Time   X   Y   Z
 1  0.8 7.1 9.9 5.1
 2  1.9 6.7 9.5 4.5
 3  2.6 6.3 8.5 4.4
 4  3.6 6.2 8.7 4.3
 5  4.4 4.5 7.3 3.3
 6  5.7 4.3 6.2 3.1

In this dataset cell length has been measured at different times and for different conditions (X,Y,Z). In addition to the variables “Length” and “Condition” a third variable “Time” is present in this dataset. Again, this dataset is not tidy since values that belong to the same variable (cell length) are spread over different columns. Note that the first column contains only values that belong to the variable “Time”, which should remain like that. So, for this dataset we want to gather all the values that represent cell lengths in one column, but these should not be mixed with the variable “Time”. To achieve this, we will use the same function gather() and add that “Time” should be excluded from gathering:

> data_tidy_time <- gather(data_spread, Condition, Length, -Time)

Let’s check the structure of the new dataframe:

> head(data_tidy_time)

   Time Condition Length
 1  0.8         X    7.1
 2  1.9         X    6.7
 3  2.6         X    6.3
 4  3.6         X    6.2
 5  4.4         X    4.5
 6  5.7         X    4.3

The new dataframe ‘data_tidy_time’ can be used as input for ggplot2 to plot a “Length” versus “Time” plot for three conditions:

> ggplot(data_tidy_time, aes(x=Time)) +
    geom_line(aes(x= Time, y=Length, color=Condition), size=0.5, alpha=1)

 

Wrap-up
In this example we have converted a spreadsheet with time-series data into tidy data. The tidy data structure consists of three columns for the three variables “Time”, “Condition” and “Length”. Each column contains only values that belong to the indicated variable. This long format, with the values for “Time” repeated several times, looks atypical. But this format offers flexibility, for instance when the measurements for the conditions are taken at different times. In addition, it is the only structure for packages that use tidy data as input (tidy tools). Finally, the tidy data structure enables plotting the data with ggplot2 and grouping the data or coloring the data according to the variable “Condition”.
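The flexibility of the long format can be illustrated with a small, made-up example: if two conditions were sampled at different time points, each set of measurements can be written in tidy form on its own and the rows simply stacked. The numbers and the names below ('cond_X', 'cond_Y', 'data_tidy_uneven') are hypothetical and not taken from the example dataset.

```r
# Hypothetical measurements: X sampled at three times, Y at two other times
cond_X <- data.frame(Time = c(1, 2, 3), Condition = "X",
                     Length = c(7.1, 6.7, 6.3))
cond_Y <- data.frame(Time = c(1.5, 2.5), Condition = "Y",
                     Length = c(9.9, 9.5))

# Both dataframes have the same three columns, so rows can be stacked
data_tidy_uneven <- rbind(cond_X, cond_Y)
data_tidy_uneven

# Number of measurements per condition (unequal, which is fine here)
table(data_tidy_uneven$Condition)
```

In a spreadsheet layout with one column per condition, the same data would need a row for every distinct time point and empty cells wherever a condition was not measured.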

 

Final words
If you do not comprehend the concept of tidy data right away, don’t worry. It actually took me quite a while to grasp it, and I’m still not confident that I fully understand it. I can highly recommend reading the paper Tidy Data by Hadley Wickham, which offers a thorough and clear explanation. In my experience, the best way to learn how to tidy your data is by doing it. Start out easy, with the example dataset or with some ‘simple’ data of your own. From there on, you can work on more complicated datasets. Many examples of tidying more complex data are out there, including a very nice tutorial on Data Tidying by Garrett Grolemund. If you are stuck, try to find solutions online, use a community site, ask a colleague (this is the right moment for me to thank Katrin Wiese) or post a tweet. I hope that this tutorial will help people who are used to working with data in spreadsheets and tables to take full advantage of the power of ggplot2, tidy tools and R.

 

Acknowledgments: A shout-out to the Twitter community, to anyone sharing code, and to Jakobus van Unen, Katrin Wiese and all other colleagues for their input.

Categories: Education, Research, Resources

SDB Puerto Rico Research Relief Grant

Posted on 6 October 2017

Reposted from the SDB’s website

 

The Society for Developmental Biology is deeply concerned about the damage caused by Hurricanes Irma and Maria to the laboratories of our colleagues in Puerto Rico. In order to facilitate the continuation of research programs in those labs or at another temporary host location, SDB is offering relief grants of up to $10,000 to each lab. Funds may be used for replacement of organisms, reagents, supplies, travel to host lab by PI or his/her trainees, core facility usage fees at host institution, etc. For questions, please contact Ida Chow at ichow@sdbonline.org.

Eligibility: Principal investigators (PIs) at institutions located in Puerto Rico who are currently conducting research projects in developmental biology. Priority will be given to PIs who are current SDB members. The PIs may request support for travel to host lab for themselves, as well as for their trainees (students and postdocs) who are active participants in the project.

Deadline: As soon as possible, no later than December 31, 2017. Proposals must be submitted as one complete PDF attachment to ichow@sdbonline.org, and they will be reviewed as they are received.

If you would like to financially support this grant, please donate online here and select “Puerto Rico Relief Fund.”

Find out about the host labs already signed up and how to apply here:

https://www.sdbonline.org/puerto_rico_relief_grant

 


Categories: News

Interested in Drosophila and adult stem cells? A 3-year DFG-funded PhD position available at the Leibniz Institute on Aging in Jena, Germany

Posted on 6 October 2017

Closing Date: 15 March 2021

A 3-year DFG-funded PhD position is available in the Jasper Collaboration Group at the FLI-Leibniz Institute on Aging. Our work focuses on the Drosophila intestine as a model for adult stem cell regulation and aging (http://www.leibniz-fli.de/research/associated-research-groups/jasper/).

In this DFG-funded project, we will focus on elucidating the mechanisms of entero-endocrine (EE) cell differentiation and the role of EE cells in age-related dysbiosis using a combination of genetics, transcriptomics and transcription-factor binding studies. Despite their importance in controlling animal metabolism, behaviour and immune responses, there is still a great deal to learn about how EE cells are formed from intestinal stem cells and the influence EE cells exert on the microbiome, organismal aging and stress responses.

We are looking for highly motivated individuals who are fascinated by stem cells, transcription factors and the genetics of aging. Experience with Drosophila, stem cell systems and/or bioinformatic analysis of NGS data is a plus. Please contact Dr. Jerome Korzelius (jerome.korzelius@leibniz-fli.de) for any details regarding this position.


Categories: Jobs, Uncategorized