2020

Efficient, quick and easy-to-use DNA replication timing analysis with START-R suite

Djihad Hadjadj, Thomas Denecker, Eva Guérin, Su-Jung Kim, Fabien Fauchereau, Giuseppe Baldacci, Chrystelle Maric, Jean-Charles Cadoret
NAR Genomics and Bioinformatics - 10.1093/nargab/lqaa045

Abstract

DNA replication must be faithful and follow a well-defined spatiotemporal program closely linked to transcriptional activity, epigenomic marks, intranuclear structures, mutation rate and cell fate determination. Among the readouts of the spatiotemporal program of DNA replication, replication timing analyses require not only complex and time-consuming experimental procedures, but also skills in bioinformatics. We developed a dedicated Shiny interactive web application, the START-R (Simple Tool for the Analysis of the Replication Timing based on R) suite, which analyzes DNA replication timing in a given organism with high-throughput data. It reduces the time required to generate and simultaneously analyze data from several samples. It automatically detects different types of timing regions and identifies significant differences between two experimental conditions in ∼15 min. In conclusion, the START-R suite allows quick, efficient and easier analyses of DNA replication timing for all organisms. This novel approach can be used by every biologist. It is now simpler to use this method in order to understand, for example, whether ‘a favorite gene or protein’ has an impact on the replication process or, indirectly, on genomic organization (as in Hi-C experiments), by comparing the replication timing profiles between wild-type and mutant cell lines.


FAIR_Bioinfo: a turnkey training course and protocol for reproducible computational biology

Thomas Denecker, Claire Toffano-Nioche
HAL - https://hal.archives-ouvertes.fr/hal-02880655

Abstract

Reproducibility plays an essential part in the success of a bioinformatics project. Indeed, reproducibility makes it possible to guarantee the validity of scientific results and to simplify the dissemination of projects. To help disseminate reproducibility principles among bioinformatics students, engineers and scientists, we created the FAIR_Bioinfo course, which presents a set of features we consider necessary to make a complete bioinformatics analysis reproducible. To illustrate the theoretical concepts of reproducibility, we use as an example a classic bioinformatics analysis (differential gene expression analysis from RNA-seq data). In short, we retrieve the data from public databases (ENA/SRA) and perform a reproducible analysis using a workflow management system (Snakemake) in a virtual environment (Docker). The entire versioned (git) code is open source (GitHub https://github.com/thomasdenecker/FAIR_Bioinfo and Docker Hub https://hub.docker.com/r/tdenecker/fair_bioinfo). The course book is available in English on GitBook (https://fair-bioinfo.gitbook.io/fair-bioinfo/) and the slides in French on GitHub. The visualization of the results is dynamic (Shiny app) and the PDF or HTML report (Rmarkdown) provides the results of the analysis and lists all user-selected parameters.


Functional networks of co-expressed genes to explore iron homeostasis processes in the pathogenic yeast Candida glabrata

Thomas Denecker, Youfang Zhou Li, Cécile Fairhead, Karine Budin, Jean-Michel Camadro, Monique Bolotin-Fukuhara, Adela Angoulvant, Gaëlle Lelandais
NAR Genomics and Bioinformatics - 10.1093/nargab/lqaa027

Abstract

Candida glabrata is a cause of life-threatening invasive infections, especially in elderly and immunocompromised patients. As part of the human digestive and urogenital microbiota, C. glabrata faces varying iron availability, low during infection or high in digestive and urogenital tracts. To maintain its homeostasis, C. glabrata must get enough iron for essential cellular processes and resist toxic iron excess. The response of this pathogen to both depletion and lethal excess of iron at 30°C has been described in the literature using different strains and iron sources. However, adaptation to iron variations at 37°C, the human body temperature, and to gentle overload is poorly understood. In this study, we performed transcriptomic experiments at 30°C and 37°C with low and high but sub-lethal ferrous concentrations. We identified iron responsive genes and clarified the potential effect of temperature on iron homeostasis. Our exploration of the datasets was facilitated by the inference of functional networks of co-expressed genes, which can be accessed through a web interface. Relying on stringent selection and independently of existing knowledge, we characterized a list of 214 genes as key elements of C. glabrata iron homeostasis and interesting candidates for medical applications.


2019

Making your R projects more accessible with Shiny

Thomas Denecker
Bioinfo-fr.net - https://bioinfo-fr.net/rendre-ses-projets-r-plus-accessibles-grace-a-shiny

Abstract

Do you have a script you would like to share with an experimental team? Do you want to keep users from modifying the code to configure your program? Do you code in R? Then this article is for you! We will see how to build a web application with R and let your users run your code without seeing it.


Pixel: a content management platform for quantitative omics data

Thomas Denecker, William Durand, Julien Maupetit, Charles Hébert, Jean-Michel Camadro, Pierre Poulain, Gaëlle Lelandais
PeerJ - 10.7717/peerj.6623

Abstract

Background: In biology, high-throughput experimental technologies, also referred to as “omics” technologies, are increasingly used in research laboratories. Several thousands of gene expression measurements can be obtained in a single experiment. Researchers routinely face the challenge of annotating, storing, exploring and mining all the biological information they have at their disposal. We present here the Pixel web application (Pixel Web App), an original content management platform to help people involved in a multi-omics biological project.

Methods: The Pixel Web App is built with open source technologies and hosted on the collaborative development platform GitHub (https://github.com/Candihub/pixel). It is written in Python using the Django framework and stores all the data in a PostgreSQL database. It is developed in the open and licensed under the BSD 3-clause license. The Pixel Web App is also heavily tested with both unit and functional tests, a strong code coverage and continuous integration provided by CircleCI. To ease the development and the deployment of the Pixel Web App, Docker and Docker Compose are used to bundle the application as well as its dependencies.

Results: The Pixel Web App offers researchers an intuitive way to annotate, store, explore and mine their multi-omics results. It can be installed on a personal computer or on a server to fit the needs of many users. In addition, anyone can enhance the application to better suit their needs, either by contributing directly on GitHub (encouraged) or by extending Pixel on their own. The Pixel Web App does not provide any computational programs to analyze the data. Still, it helps to rapidly explore and mine existing results and holds a strategic position in the management of research data.

Keywords: Data cycle analyses; Omics; Open source; Pixel Web App.


Label-free quantitative proteomics in Candida yeast species: technical and biological replicates to assess data reproducibility

Gaëlle Lelandais, Thomas Denecker, Camille Garcia, Nicolas Danila, Thibaut Léger, Jean-Michel Camadro
BMC Research Notes - 10.1186/s13104-019-4505-8

Abstract

Objective: Label-free quantitative proteomics has emerged as a powerful strategy to obtain high-quality quantitative measures of the proteome with only a very small quantity of total protein extract. Because our research projects required the application of bottom-up shotgun mass spectrometry proteomics in the pathogenic yeasts Candida glabrata and Candida albicans, we performed preliminary experiments to (i) obtain a precise list of all the proteins for which measures of abundance could be obtained and (ii) assess the reproducibility of the results arising respectively from biological and technical replicates.

Data description: Three time-courses were performed in each Candida species, and an alkaline pH stress was induced for two of them. Cells were collected 10 and 60 min after stress induction and proteins were extracted. Samples were analysed two times by mass spectrometry. Our final dataset thus comprises label-free quantitative proteomics results for 24 samples (two species, three time-courses, two time points and two runs of mass spectrometry). Statistical procedures were applied to identify proteins with differential abundances between stressed and unstressed situations. Considering that C. glabrata and C. albicans are human pathogens, which face important pH fluctuations during a human host infection, this dataset has a potential value to other researchers in the field.
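
The sample count stated above (two species, three time-courses, two time points, two mass spectrometry runs) can be checked with a minimal sketch. The numeric labels for time-courses and runs, and the dictionary keys, are hypothetical and are not taken from the published dataset.

```python
from itertools import product

# Design factors as described in the data description.
species = ["Candida glabrata", "Candida albicans"]
time_courses = [1, 2, 3]      # hypothetical labels for the three time-courses
time_points_min = [10, 60]    # minutes after stress induction
ms_runs = [1, 2]              # hypothetical labels for the two MS runs

# One entry per combination of the four factors.
samples = [
    {"species": s, "time_course": tc, "time_min": t, "ms_run": r}
    for s, tc, t, r in product(species, time_courses, time_points_min, ms_runs)
]

print(len(samples))  # 2 * 3 * 2 * 2 = 24
```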

Keywords: Alkaline pH; Candida albicans; Candida glabrata; Label-free quantitative proteomics; Mass spectrometry.


2018

Empowering the detection of ChIP-seq 'basic peaks' (bPeaks) in small eukaryotic genomes with a web user-interactive interface

Thomas Denecker, Gaëlle Lelandais
BMC Research Notes - 10.1186/s13104-018-3802-y

Abstract

Objective: bPeaks is a peak calling program to detect protein DNA-binding sites from ChIP-seq data in small eukaryotic genomes. The simplicity of the bPeaks method is well appreciated by users, but its use via an R package is challenging and time-consuming for people without programming skills. In addition, user feedback has highlighted the lack of a convenient way to carefully explore bPeaks result files. In this context, the development of a web user interface represents an important added value for expanding the bPeaks user community.

Results: We developed a new bPeaks application (bPeaks App). The application allows the user to perform all the peak-calling analysis steps with bPeaks in a few mouse clicks via a web browser. We added new features relative to the original R package, particularly the possibility to import personal annotation files to compare the location of the detected peaks with specific genomic elements of interest of the user, in any organism, and a new organization of the result files which are directly manageable via a user-interactive genome browser. This significantly improves the ability of the user to explore all detected basic peaks in detail.

Keywords: ChIP-seq; Peak calling; Protein DNA-binding sites; Small eukaryotic genomes; bPeaks.


2017

A hypothesis-driven approach identifies CDK4 and CDK6 inhibitors as candidate drugs for treatments of adrenocortical carcinomas

Djihad Hadjadj, Su-Jung Kim, Thomas Denecker, Laura Ben Driss, Jean-Charles Cadoret, Chrystelle Maric, Giuseppe Baldacci, Fabien Fauchereau
Aging - 10.18632/aging.101356

Abstract

High proliferation rate and high mutation density are both indicators of poor prognosis in adrenocortical carcinomas. We performed a hypothesis-driven association study between clinical features in adrenocortical carcinomas and the expression levels of 136 genes involved in DNA metabolism and G1/S phase transition. In 79 samples downloaded from The Cancer Genome Atlas portal, high Cyclin Dependent Kinase 6 (CDK6) mRNA levels gave the most significant association with shorter time to relapse and poorer survival of patients. A hierarchical clustering approach assembled most tumors with high levels of CDK6 mRNA into one group. These tumors tend to accumulate mutations activating the Wnt/β-catenin pathway and show reduced MIR506 expression. Indeed, the level of MIR506 RNA is inversely correlated with the levels of both CDK6 and CTNNB1 (encoding β-catenin). Together these results indicate that high CDK6 expression is found in aggressive tumors with activated Wnt/β-catenin pathway. Thus we tested the impact of Food and Drug Administration-approved CDK4 and CDK6 inhibitors, namely palbociclib and ribociclib, on SW-13 and NCI-H295R cells. While both drugs reduced viability and induced senescence in SW-13 cells, only palbociclib was effective on the retinoblastoma protein (pRB)-negative NCI-H295R cells, by inducing apoptosis. In NCI-H295R cells, palbociclib induced an increase of the active form of Glycogen Synthase Kinase 3β (GSK3β) responsible for the reduced amount of active β-catenin, and altered the amount of AXIN2 mRNA. Taken together, these data underline the impact of CDK4 and CDK6 inhibitors in treating adrenocortical carcinomas.

Keywords: CDK6; adrenocortical; cancer; palbociclib; ribociclib.


2016

Characterization of the replication timing program of 6 human model cell lines

Djihad Hadjadj, Thomas Denecker, Chrystelle Maric, Fabien Fauchereau, Giuseppe Baldacci, Jean-Charles Cadoret
Genomics Data - 10.1016/j.gdata.2016.07.003

Abstract

During the S phase, the DNA replication process is finely orchestrated and regulated by two programs: the spatial program that determines where replication will start in the genome (Cadoret et al. (2008 Oct 14), Cayrou et al. (2011 Sep), Picard et al. (2014 May 1) [1], [2], [3]), and the temporal program that determines when during the S phase different parts of the genome are replicated and when origins are activated. The temporal program is so well conserved for each cell type from independent individuals [4] that it is possible to identify a cell type from an unknown sample just by determining its replication timing program. Moreover, replicative domains are strongly correlated with the partition of the genome into topological domains (determined by the Hi-C method, Lieberman-Aiden et al. (2009 Oct 9), Pope et al. (2014 Nov 20) [5], [6]). On the one hand, replicative areas are well defined and participate in shaping the spatial organization of the genome for a given cell type. On the other hand, studies on the timing program during cell differentiation showed a certain plasticity of this program according to the stage of cell differentiation (Hiratani et al. (2008 Oct 7, 2010 Feb) [7], [8]). Domains where a replication timing change was observed went through a nuclear re-localization. Thus the temporal program of replication can be considered as an epigenetic mark (Hiratani and Gilbert (2009 Feb 16) [9]). We present the genomic data of replication timing in 6 human model cell lines: U2OS (GSM2111308), RKO (GSM2111309), HEK 293T (GSM2111310), HeLa (GSM2111311), MRC5-SV (GSM2111312) and K562 (GSM2111313). A short comparative analysis was performed that allowed us to define regions common to the 6 cell lines. These replication timing data can be taken into account when performing studies that use these model cell lines.

Keywords: DNA replication timing; HEK 293T; HeLa; K562; MRC5; RKO; U2OS.