Chemogenomics at Janssen

It has been a long time since I last blogged, but there was a very particular and important reason: I relocated to Toledo, Spain, to work for Janssen R&D (the pharmaceutical companies of Johnson & Johnson). The main research topic is chemogenomics. I prefer the term systems pharmacology, but the two are often used interchangeably.

I have already been here for several months, but that time flew by like a single moment. It's very exciting for me and an important step in my career. I will resume blogging with some very interesting topics. So, see you soon.

PS: My blogging activity is a private matter and not connected with Janssen R&D.


Buggy me...AutoLog

Dear readers,

I messed up with AutoLog, an attempt to automatically log all of my activity across the social web with IFTTT and Blogger. That was mostly easy for question-and-answer sites such as Blue Obelisk, Stack Exchange and similar, using the RSS feeds that most of those sites provide.

Sorry for flooding your RSS readers.


Famous statistician quotes

I found a very interesting post on the Cross Validated website: Famous statistician quotes. I really liked some of them.

  • All models are wrong, but some are useful. George E. P. Box
  • Statisticians, like artists, have the bad habit of falling in love with their models. George E. P. Box
  • In God we trust. All others must bring data. W. Edwards Deming
  • Statistical thinking will one day be as necessary a qualification for efficient citizenship as the ability to read and write. H.G. Wells
  • A big computer, a complex algorithm and a long time does not equal science. Robert Gentleman
  • All generalizations are false, including this one. Mark Twain
  • If you torture the data enough, nature will always confess. Ronald Coase
  • He uses statistics like a drunken man uses a lamp post, more for support than illumination. Andrew Lang
  • Everybody believes in the exponential law of errors [i.e., the Normal distribution]: the experimenters, because they think it can be proved by mathematics; and the mathematicians, because they believe it has been established by observation. Whittaker, E. T. and Robinson, G. "Normal Frequency Distribution."
  • I keep saying that the sexy job in the next 10 years will be statisticians. And I'm not kidding. Hal Varian
  • It is easy to lie with statistics. It is hard to tell the truth without statistics. Andrejs Dunkels
  • My thesis is simply this: probability does not exist. Bruno de Finetti
  • We are drowning in information and starving for knowledge. Rutherford D. Rogers
  • The Earth is round. p < .05. Jacob Cohen
  • When I see articles with lots of significance tests, I say that the statisticians are p-ing on the research. Herman Friedmann
  • Torture numbers, and they'll confess to anything. Gregg Easterbrook
  • With three constants, I can fit a dog. With four, I can make it bark. William Reifsnyder
  • The best time to plan an experiment is after you've done it. R.A. Fisher

And the one I love best, which I used as the epigraph for my PhD thesis: "He who loves practice without theory is like the sailor who boards ship without a rudder and compass and never knows where he may be cast." by Leonardo da Vinci


Coursera: the revolution in education

I really like the idea of open education. Massive open online courses (MOOCs) have now reached an entirely new level with initiatives such as Coursera, edX, Udacity, Class2Go and, of course, Khan Academy. Coursera stands out from that list because of its huge spectrum of courses, from Social Psychology and Quantum Computing to Systems Biology, offered by major universities around the world.

I just finished the Drugs and the Brain class on Coursera by Henry A. Lester. This course is amazing because it combines all aspects of neurological drug action, from molecular targets to neurons to neuronal circuits to regions of the brain and, finally, behavior. Of course I can read a book or articles, which I have already done, but I personally find it attractive to listen to and see well-known professors. Maybe it is also about feeling like a student again for a little while.

I took other courses on Coursera and can say I am really impressed. One of the key aspects for me, as a native Russian-speaking scientist, is studying some of the subjects in English. I consider myself an advanced English "user", but sometimes it's hard for me to discuss material in depth because of my limited scientific vocabulary. Just an example: I studied mathematics at school and university, and even the Latin origin of most of the words used doesn't help; some of the terms are unique to the Russian language. The same goes for other sciences. So, these courses help me to improve my English as a foreign language.

There are definitely other pros: most of the courses are taught by well-known professors from leading universities, and the interactive way of studying (tests, code submissions, forums for discussing questions and additional material, etc.) helps one to acquire new information.

Humanity is currently at a new level of information processing. We are collecting data in a geometric progression, and we urgently need effective ways to analyze it. In my opinion, MOOCs are one step toward giving education to more people and being more effective at data analysis in the near future.


A new era for drug discovery is already here: biologics vs small compound drugs

I found a very interesting post on the Biotech-Now blog, "Small Companies, Big Returns", about the best-performing biotech/pharmatech companies according to their revenue in 2012, whose market capitalization has jumped 100 percent. What struck me is that half of the first 10 companies are working with either biologics or new drug delivery systems for old or well-studied drugs.

Here is the Top 10 of the list:

Biologics: Sarepta with unique RNA-based approaches; Hemispherx Biopharma with Interferon alfa-n3; Affymax with peginesatide (a functional form of erythropoietin).

New drug delivery systems for well-known drugs: BioDelivery Sciences with a unique patch; Celsion with drug delivery liposomes.

Small compound drugs: Arena Pharmaceuticals with a rich GPCR pipeline; Sunesis with a topoisomerase II inhibitor and some other compounds; Threshold Pharmaceuticals with a DNA alkylator prodrug activated by hypoxic conditions; Repros Therapeutics with steroid small compound drugs.

Mixed pipeline: Adamis Pharmaceuticals.

The next 10 companies show an almost identical profile. The final picture looks very interesting, and we can definitely say that drug design and development nowadays is evenly distributed between biologics and small compound drugs. Even more interesting is that a new drug delivery system can be successfully applied to an old small compound drug, reducing side effects and providing significant benefits for the company.

Table of the top 20 companies according to YTD (Year-to-date) return.

Update: due to some really strange behavior of Google Chrome, I removed the direct link to the Biotech-Now blog; it was treated as malware, but the blog post is easy to find with Google.


Seminar at Abagyan group: Protein-Ligand Interaction Fingerprints

I will be giving a talk at the Abagyan group on Protein-Ligand Interaction Fingerprints, focusing on interaction fingerprints and my work on predicting them, which was done at the University of Strasbourg.

When: Tomorrow, December 11, 10AM.
Where: Skaggs School of Pharmacy and Pharmaceutical Sciences, room 4220 (conference room).

Everyone is 100% welcome.


Big pharma, crowdsourcing and cheminformatics

Open innovation and crowdsourcing are fast-growing areas. It is believed that crowdsourcing and open innovation can help to bridge the research gap in drug design and discovery and in other areas of innovation. The diversity of solutions provided by crowdsourcing will always be higher: one scientist or research group physically can't provide as many solutions as the scientific community can. There have been some crowdsourcing and open innovation attempts by academic science, but they do not deal with real-world pharmaceutical innovation. This short review is about crowdsourcing as a way of earning money.

So far, there are several crowdsourcing platforms where pharma/biotech companies ask for solutions using an open innovation approach, and where a chemoinformatician or, let's say, a life science data miner can benefit financially:
  • Innocentive - several virtual screening projects have already finished (Cytochrome BC1, Hyper-Phosphorylated Tau Protein), with rather competitive prizes ($10-15k). I participated in one of them: most of the time was spent on report writing, the rest on data preparation and model building, plus around 3 weeks of computational time.
  • Kaggle - the Merck challenge on prediction of biological activity and also HTS visualization, an interesting QSAR/cheminformatics task; Boehringer Ingelheim, also for biological activity prediction.
  • NineSights - most of the drug design and discovery projects are related to the search for seed compounds, but some modeling challenges have appeared. Sorry, can't find them now.
  • OneBillionMinds - shares the same idea of crowdsourcing, but mostly for non-profit; no traces of science were found.
For most of the mentioned projects, up to 300 teams contributed their solutions, and it will be very interesting to compare them. Kaggle already has this in mind, but it would be even more interesting to receive the results for the virtual screening projects at Innocentive. Of course, not all virtual screening results can be compared in a real HTS, but the intersection of different methods is always valuable.
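
To make the idea of intersecting different methods concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the compound IDs, the three method names and the `consensus_hits` helper are all hypothetical, and real challenge submissions would of course need a common compound identifier before they could be compared this way.

```python
from collections import Counter


def consensus_hits(hit_lists, min_votes=2):
    """Return (compound, votes) pairs for compounds selected by at least
    `min_votes` of the given hit lists, most-voted first."""
    votes = Counter()
    for hits in hit_lists:
        # Each method votes at most once per compound.
        votes.update(set(hits))
    selected = [(cid, n) for cid, n in votes.items() if n >= min_votes]
    # Sort by vote count (descending), then by compound ID for stable output.
    return sorted(selected, key=lambda item: (-item[1], item[0]))


# Hypothetical hit lists from three different virtual screening methods.
docking = ["CMPD-001", "CMPD-007", "CMPD-042"]
pharmacophore = ["CMPD-007", "CMPD-042", "CMPD-100"]
qsar = ["CMPD-042", "CMPD-100", "CMPD-555"]

for compound, n_methods in consensus_hits([docking, pharmacophore, qsar]):
    print(compound, n_methods)
```

Setting `min_votes` to the number of methods gives the strict intersection; a lower threshold gives a softer consensus list that is probably more practical when many teams contribute.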

I think there is a bright future for crowdsourcing. So many solutions die on a researcher's shelf, but crowdsourcing can help to push innovation for both researchers and pharma companies.

Some links:
Big pharma sees promise in incubators, crowdsourcing by Union-Tribune

PS: If you know of any other open innovation/crowdsourcing platforms, please drop me a line; it would be very interesting for me.

Ideaconnection - R&D-oriented crowdsourcing platform, worth registering.


CADD success stories. Part2

Just back from quite a long job search; I haven't read my collection of TOCs for 3 months. So, there are a lot of CADD examples I want to share with you.

8. β-Amyloid Aggregation Inhibitors
Tools: molecular dynamics (Markov state model molecular dynamics), shape-based virtual screening (ROCS by OpenEye).
DOI: 10.1021/jm201332p

9. RAF/VEGFR2 inhibitors
Tools: structure-based drug design with mutant proteins (GOLD).

10. Human 5-Lipoxygenase inhibitors
Tools: homology modeling, docking, molecular dynamics, virtual screening (DOCK, PSDOCK, AutoDock).
Comment: a strange combination of virtual screening tools: since the ligands were prepared with LigPrep (Schrodinger), why not use Glide?

11. Pim-1 Kinase inhibitors
Tools: fragment-based drug design, constrained docking combined with (Glide by Schrodinger).
DOI: 10.1021/jm2014698
Comment: interesting discussion on different binding mode of fragments.

12. Falcipain Inhibitors
Tools: substructure filtering, virtual screening (Glide by Schrodinger), molecular dynamics and thermodynamic-based water displacement (WaterMap by Schrodinger), lipophilicity prediction (ClogP).
DOI: 10.1021/ci2005516

13. S-Adenosyl-l-Homocysteine Hydrolase Inhibitors 
Tools: homology modeling, structure-based virtual screening

14. Dihydropteroate Synthase Inhibitors
Tools: Structure-based design supported by docking
DOI: 10.1002/cmdc.201200049

15. Human Estrogen Receptor Alpha modulators
Tools: pharmacophore-based virtual screening (fFLASH by IBM)
DOI: 10.1002/minf.201100127
Comment: does anyone know where I can get fFLASH?


GABAB receptor arrived!

Fresh from the oven: the GABA(B) receptor (PDB ID: 4F11, 4F12). Only the extracellular domain, at 2.38Å, and there are no co-crystallized ligands.
DOI: 10.1038/nn.3133


BioIT World 2012 - day three

The second day of the conference is described here.

Here is a short review of the third day of this interesting conference!

HPC trends from the Trenches - a very technical but funny talk about high performance computing, and especially cloud computing, by Chris Dagdigian (SlideShare presentation) from BioTeam. He talked about problems and solutions in Big Data computation, storage and retrieval.
Notes to take:
  • data transfer speed is a problem for cloud computing and classic methods don't cope with it, but faster solutions already exist (GridFTP, Aspera);
  • Chris said that most companies will soon have problems with data storage, because the size of data is growing rapidly and current solutions are working at their limit.

Enabling research in the Cloud - an ad-lecture by Amazon Web Services, of course about cloud computing.

Are you ready for 'in litero' drug discovery (Reverse Informatics) - manual annotation of any life science information, with ontology creation and a dictionary of synonyms as well. Very nice, but actually nothing really new.

Development and application of Chemical Ontologies (Ontochem) - a German company working with chemical ontologies. They showed a several-fold improvement in PubMed search.

Systematic Drug Repositioning - heavy artillery from GSK. It's very nice to have funds, because this lecture showed a combination of all possible methods for drug repurposing: text mining, microarray analysis (protein, gene and RNA), and QSAR and virtual screening were also used. Everything was mixed and served with a lot of practical examples.

Next Generation Model-based Drug Discovery and Development: Quantitative and Systems Pharmacology - more heavy artillery, this time from Merck. Mostly about PK/PD data processing and modeling, with an interesting model of drug distribution inside the bone using FEM, a good example of tissue modeling.

Pistoia Alliance: progress in precompetitive collaboration - the Pistoia Alliance is a collaboration of big pharma companies to develop new standards for data transfer.
Interesting: the winner of the genome compression algorithm competition was announced - Squeeze Genome by James Bonfield.

The final lecture was a live discussion of cancer treatment and cancer informatics by several experts:
  • a $100 genome was predicted within 2-3 years;
  • active promotion of disease and genome data sharing to conquer disease faster;
  • the portable medical diagnostic device Q-Poc was shown, able to predict cancer and, in the near future, possibly other diseases. Similar devices are also expected to become widespread;
  • an interesting example of research misconduct - using GWAS, a mutation was found that had been actively studied for 5 years but does not actually participate in the disease;
  • integration of various omics into drug therapy was actively discussed;
  • an interesting example of cancer prevention through aspirin use was shown.
The BioIT conference was amazing!