A new era for drug discovery is already here: biologics vs small compound drugs

I found a very interesting post on the Biotech-Now blog, "Small Companies, Big Returns", about the best-performing biotech/pharmatech companies according to their revenue in 2012, whose market capitalization has jumped 100 percent. What struck me is that half of the first 10 companies are working with either biologics or new drug delivery systems for old or well-studied drugs.

Here is the Top 10 of the list

Biologics: Sarepta with unique RNA-based approaches; Hemispherx Biopharma with interferon alfa-n3; Affymax with peginesatide (a functional form of erythropoietin);

New drug delivery systems for well-known drugs: BioDelivery Sciences with a unique patch; Celsion with drug delivery liposomes;

Small compound drugs: Arena Pharmaceuticals with a rich GPCR pipeline; Sunesis with a topoisomerase II inhibitor and some other compounds; Threshold Pharmaceuticals with a DNA alkylator prodrug activated by hypoxic conditions; Repros Therapeutics with steroid small compound drugs.
Mixed pipeline: Adamis Pharmaceuticals.

The next 10 companies show an almost identical profile. The final picture looks very interesting, and we can definitely say that drug design and development is nowadays evenly distributed between biologics and small compound drugs. Even more interesting is that a new drug delivery system can be successfully applied to an old small compound drug, reducing side effects and providing significant benefits for the company.

Table of the top 20 companies according to YTD (Year-to-date) return.

Update: due to some really strange behavior of Google Chrome I removed the direct link to the Biotech-Now blog - it was treated as malware, but the blog post is easy to find in Google.


Seminar at Abagyan group: Protein-Ligand Interaction Fingerprints

I will be presenting a talk at the Abagyan group on Protein-Ligand Interaction Fingerprints. The main topics are interaction fingerprints and my work on predicting them, which was done at the University of Strasbourg.

When: Tomorrow, December 11, 10AM.
Where: Skaggs School of Pharmacy and Pharmaceutical Sciences, room 4220 (conference room).

Everyone is 100% welcome.


Big pharma, crowdsourcing and cheminformatics

Open innovation and crowdsourcing are fast-growing areas. It is believed that crowdsourcing and open innovation can help to bridge the research gap in drug design and discovery and in other areas of innovation. The diversity of solutions provided by crowdsourcing will always be higher: one scientist or research group physically can't give as many solutions as the scientific community can. There are some crowdsourcing and open innovation attempts made by academic science, but they do not deal with real-world pharmaceutical innovation. This short review is about crowdsourcing as a way of earning money.

Up to now, there are several crowdsourcing platforms where pharma/biotech companies ask for solutions using an open innovation approach and where a chemoinformatician, or let's say a life science data miner, can benefit financially:
  • Innocentive - several virtual screening projects were already finished - Cytochrome BC1, Hyper-Phosphorylated Tau Protein - with rather competitive prizes ($10-15k). I participated in one of them: most of the time was spent on report writing, the rest on data preparation and model building, plus around 3 weeks of computational time.
  • Kaggle - the Merck challenge on prediction of biological activity and also HTS visualization - an interesting QSAR/cheminformatics task; Boehringer Ingelheim - also for biological activity prediction.
  • NineSights - most of the drug design and discovery projects are related to the search for seed compounds, but some modeling challenges have appeared. Sorry, can't find them now.
  • OneBillionMinds - shares the same idea of crowdsourcing but mostly for non-profit; I found no traces of science there.
For most of the mentioned projects, up to 300 teams contributed their solutions, and it would be very interesting to compare them. Kaggle already has this in mind, but it would be even more interesting to see the results for the virtual screening projects at Innocentive. Of course, not all virtual screening results can be checked in a real HTS, but the intersection of different methods is always valuable.
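To illustrate the idea of intersecting results from different methods, here is a minimal Python sketch. The compound IDs and the three hit lists are made up for illustration, and the function name consensus_hits is my own; it is not how any of the platforms above actually compare submissions.

```python
from collections import Counter

def consensus_hits(hit_lists, min_votes):
    """Return compound IDs that appear in at least `min_votes` hit lists."""
    # Count each compound once per list, even if a list repeats it.
    counts = Counter(cid for hits in hit_lists for cid in set(hits))
    return sorted(cid for cid, n in counts.items() if n >= min_votes)

# Hypothetical hit lists from three different virtual screening methods.
submissions = [
    ["cmpd_01", "cmpd_02", "cmpd_03"],
    ["cmpd_02", "cmpd_03", "cmpd_04"],
    ["cmpd_03", "cmpd_05"],
]

# Strict intersection: compounds found by every method.
all_agree = consensus_hits(submissions, min_votes=len(submissions))
# Softer consensus: compounds found by at least two methods.
majority = consensus_hits(submissions, min_votes=2)

print(all_agree)
print(majority)
```

With hundreds of teams a strict intersection would probably be empty, so a vote threshold like min_votes is likely the more practical choice.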

I think there is a bright future for crowdsourcing. So many solutions die on a researcher's shelf, but crowdsourcing can help to push innovation for both researchers and pharma companies.

Some links:
Big pharma sees promise in incubators, crowdsourcing by Union-Tribune

PS: If you know any other open innovation/crowdsourcing platforms, please, drop a line, it will be very interesting for me.

Ideaconnection - R&D-oriented crowdsourcing platform, worth registering.


CADD success stories. Part2

Just back from quite a long job search, and I haven't read my collection of TOCs for 3 months. So, there are a lot of CADD examples I want to share with you.

8. β-Amyloid Aggregation Inhibitors
Tools: molecular dynamics (Markov state model molecular dynamics), shape-based virtual screening (ROCS by OpenEye).
DOI: 10.1021/jm201332p

9. RAF/VEGFR2 inhibitors
Tools: structure-based drug design with mutant proteins (GOLD).

10. Human 5-Lipoxygenase inhibitors
Tools: homology modeling, docking, molecular dynamics, virtual screening (DOCK, PSDOCK, AutoDock).
Comment: a strange combination of virtual screening tools: the ligands were prepared with LigPrep (Schrodinger), so why not use Glide?

11. Pim-1 Kinase inhibitors
Tools: fragment-based drug design combined with constrained docking (Glide by Schrodinger).
DOI: 10.1021/jm2014698
Comment: interesting discussion on the different binding modes of fragments.

12. Falcipain Inhibitors
Tools: substructure filtering, virtual screening (Glide by Schrodinger), molecular dynamics and thermodynamic-based water displacement (WaterMap by Schrodinger), lipophilicity prediction (ClogP).
DOI: 10.1021/ci2005516

13. S-Adenosyl-l-Homocysteine Hydrolase Inhibitors 
Tools: homology modeling, structure-based virtual screening

14. Dihydropteroate Synthase Inhibitors
Tools: Structure-based design supported by docking
DOI: 10.1002/cmdc.201200049

15. Human Estrogen Receptor Alpha modulators
Tools: pharmacophore-based virtual screening (fFLASH by IBM)
DOI: 10.1002/minf.201100127
Comment: does anyone know where I can get fFLASH?


GABAB receptor arrived!

Fresh from the oven - the GABA(B) receptor (PDB ID: 4F11, 4F12). It is only the extracellular domain at 2.38Å, and there are no co-crystallized ligands.
DOI: 10.1038/nn.3133


BioIT World 2012 - day three

Second day of the conference is described here.

Here is the short review of the third day of this interesting conference!

HPC trends from the Trenches - a very technical but funny talk about high performance computing, and especially cloud computing, by Chris Dagdigian (SlideShare presentation) from BioTeam. He talked about problems and solutions in Big Data computation, storage and retrieval.
Notes to take:
  • data transfer speed is a problem for cloud computing and classic methods don't cope with it, but faster solutions already exist (GridFTP, Aspera);
  • Chris said that most companies will soon have problems with data storage, because the size of data is growing rapidly and current solutions are working at their limit.
Enabling research in the Cloud - an ad-lecture by Amazon Web Services, of course about cloud computing.
Are you ready for 'in litero' drug discovery (Reverse Informatics) - manual annotation of any life science information, together with ontology creation and a dictionary of synonyms. Very nice, but actually nothing really new.

Development and application of Chemical Ontologies (Ontochem) - German company working with chemical ontologies. Showed PubMed search improvement by several times.

Systematic Drug Repositioning - heavy artillery from GSK. It's very nice to have funds, because this lecture showed a combination of all possible methods for drug repurposing: text mining, microarray analysis (protein, gene and RNA), QSAR and virtual screening. Everything was mixed and served with a lot of practical examples.

Next Generation Model-based Drug Discovery and Development: Quantitative and Systems Pharmacology - more heavy artillery, this time from Merck. Mostly about PK/PD data processing and modeling, with an interesting model of drug distribution inside the bone using FEM - a good example of tissue modeling.

Pistoia Alliance: progress in precompetitive collaboration - the Pistoia Alliance is a collaboration of big pharma to develop new standards for data transfer.
Interesting: the winner of the genome compression algorithm competition was announced - Squeeze Genome by James Bonfield.

Final lecture was a live discussion of the cancer treatment and cancer informatics by several experts:
  • a $100 genome was predicted within 2-3 years;
  • active promotion of disease and genome data sharing to conquer disease faster;
  • the portable medical diagnostic device Q-Poc was shown; it should be able to predict cancer and possibly other diseases in the near future. Similar devices will also become widespread;
  • an interesting example of a research misstep - GWAS showed that a mutation that had been actively studied for 5 years actually does not participate in the disease;
  • integration of various omics into drug therapy was actively discussed;
  • interesting example of cancer prevention by aspirin usage was shown.
The BioIT conference was amazing!


BioIT World 2012 - day two

I think BioIT World is the biggest conference in the world that mixes life science and informatics. All of the molecular modeling software monsters were present: ChemAxon, OpenEye, Accelrys. GGA software was of specific interest for me: first, because its molecular modeling software development team is located in Saint Petersburg, one of the most beautiful Russian cities; second, because of their open source initiative - Bingo, Indigo and Imago. A lot of companies representing data mining, data management and data integration were there as well.

Some hardware companies were also active: I have tested the real-time rendering of the proteins - it's really hot, especially when you have two Tesla GPU cards. Personal super computers are reality nowadays - money is the only question.
Special news is cloudification of the Pipeline Pilot. Hmm...let it be the new word - cloudification, because people want to put into the cloud everything they have.

Actually, the conference has one big minus - there are 12 really interesting parallel tracks all at once!

  • IT Infrastructure – Hardware
  • IT Infrastructure – Software
  • Cloud Computing
  • Bioinformatics
  • Next-Generation Sequencing Informatics
  • Systems and Multiscale Biology
  • eClinical Solutions
  • eHealth & HIT Solutions for Personalized Medicine
  • Drug Discovery Informatics
  • Molecular Diagnostics Informatics
  • Open Source Solutions
  • Cancer Informatics
So, after thorough inspection I chose Systems and Multiscale Biology, Drug Discovery Informatics and Cloud Computing, as they are closest to my area of expertise. Here is the summary.

Introducing eTRIKS: European Translational Information & Knowledge Management Services - management and integration of the medical and life science data for translational medicine on the basis of open-source solution tranSMART.

Library Enhancement through the Wisdom of Crowds - Agrafiotis, who works for JnJ, is really a big figure in computer-aided drug design. His team randomly selected compounds from the in-house database, then added some real HTS compounds from the library with a known good physico-chemical profile. This database was then given to 1000 medicinal chemists, who manually selected good compounds for HTS using a "good-neutral-bad" scheme. Finally, it was shown that most of the manually selected compounds have good solubility, good lipophilicity and good synthetic accessibility. That was really impressive; the human brain definitely has the best learning algorithm. The only disturbing thing is that the probability of a compound selected as "good" being rejected is around 20%. So, is crowd-sourced drug design the future?
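A "good-neutral-bad" scheme like this can be turned into a simple crowd score per compound. The sketch below is only my own illustration in Python: the compound names, the votes and the +1/0/-1 scoring are assumptions, not the method presented in the talk.

```python
# Assumed numeric values for the three labels (not from the talk).
SCORES = {"good": 1, "neutral": 0, "bad": -1}

def crowd_score(votes):
    """Average the votes of all chemists for one compound, from -1 to +1."""
    return sum(SCORES[v] for v in votes) / len(votes)

# Hypothetical votes from four chemists for two compounds.
votes = {
    "cmpd_A": ["good", "good", "neutral", "good"],
    "cmpd_B": ["bad", "neutral", "bad", "good"],
}

# Rank compounds from most to least liked by the crowd.
ranked = sorted(votes, key=lambda c: crowd_score(votes[c]), reverse=True)

for name in ranked:
    print(name, crowd_score(votes[name]))
```

A real analysis would also have to look at how much the chemists disagree with each other, which a plain average hides.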

Changing the Landscape of Laboratory Informatics Systems to Enhance Innovation Life Cycle Management (ILM) - some impressive Accelrys and Scitegic lecture about scientific data management. Too expensive guys!

Chemical-Protein Interactome and its Application in Personalized Medicine and Drug Repositioning - FDA-approved drugs were docked with DOCK (not the best docking tool) to all possible targets from the PDB. Olanzapine and clozapine were used as an example: the latter causes agranulocytosis in 1% of patients. The authors found that HSP70 is over-expressed in microarray data from the cancer line (strange, all of the HSPs are over-expressed in cancer lines), and this target also appeared at the top of the inverse screening target list. Patients with agranulocytosis have a mutation of one amino acid in the binding site that gave an improved DOCK score for clozapine but not for olanzapine, and this was proposed as the cause of the side effect. So, for me these results are very interesting as a success for chemogenomics, but still very, very questionable. DOCK has only ~60% success in virtual screening.
Next Generation Bioinformatical Analysis on the Cloud - very interesting, but do we really need to put genome assembling into the cloud?

OpenEye Grapheme - very nice 2D depiction of ligand-receptor complexes (soon to be available in Vida) and a new coloring and depiction scheme for compounds.

It was also interesting to talk with the author of JSDraw and test a new product - TouchMol, a chemical editor for tablets. It is very nice and easy to draw compounds with your fingertips, but I think desktop users will like it too. I also had a small talk with one of the Ayasdi employees - interesting topological mapping as a visualization for biological networks.


Protein knots

I have stumbled upon a protein with a very interesting structural motif (PDB ID: 3UN9). The protein subunits form a functional receptor and fit into each other like pieces of a jigsaw puzzle.

Actually, knots are not very rare in the protein kingdom. Take a look, for example, at databases like pKNOT or the knot server from MIT. There is evidence that artificial knotted proteins can even be successfully designed (PDB ID: 3MLG) [1].

But one question arises when looking at this type of structure: how are these 3-to-1 or 6-to-1 proteins folded? As it turns out, knotted proteins do not require chaperones for folding, so folding happens independently under the influence of internal and external conditions: the protein itself and the cell solution. Mutations of some of the amino acids directly involved in folding do not have a great influence, so no "folding features" were found. Some of the knotted structures can be easily unfolded, but methyltransferase structures are very stable under denaturing conditions. It is also interesting to point out that the knots form at the late stages of the folding pathway [2].

Attempts to mimic folding in silico were successful [3]. A coarse-grained representation of the protein and a Langevin dynamics simulation of the YibK protein with a trefoil knot showed the early and late knot formation pathways. And, as expected, the hydrophobic amino acids play the most important role.

One of the most important applications is the design of artificial enzymes that stay stable under the very harsh conditions used in industry.

1. Structure and folding of a designed knotted protein - DOI: 10.1073/pnas.1007602107.
2. Nice review - DOI: 10.1088/0953-8984/23/3/033101.
3. Attempt to fold knot protein in silico: arxiv.org/abs/q-bio/0611073 


The structure of human soul

Finally, the people of Earth have solved the mystery of all time.
They found the structure of soul, the precious tiny material of every or mostly every human. It's an x-ray one but an important step is done!


Why ChemInformatics and not ChemOinformatics?

So, to finish with the old question "ChemInformatics or ChemOinformatics", I decided to find the official source for this name.
The term cheminformatics seems to originate from the Obernai declaration, which is said to prescribe the word cheminformatics instead of chemoinformatics. But there is no single document that states this! Moreover, the declaration itself uses the term chemOinformatics!

Some funny notes from molinspiration site:
Date: Fri, 17 Oct 1997 
From: Wendy Warr  
Subject: Re: Cheminformatics/Two new refs. 
I wonder if any of the sources define this awful neologism ("chemoinformatics" 
or "cheminformatics"). Does it really differ from "chemical information" or 
"computational chemistry".

More...from Wendy Warr
About two years ago, many people (including myself) considered that
"cheminformatics" was a nasty neologism. My survey this summer shows
that it is now an established discipline, although the tasks involved,
and even the name of the discipline, are not clearly defined yet. 
"Cheminformatics" was preferred to "chemoinformatics" by most

Even more interesting is the comment about Google Ngram from the Dalke Scientific blog.

More specifically, if you search for "cheminformatics" you'll see "About 6,960,000 results". Try to go to item 900 and you'll get the message:
In order to show you the most relevant results, we have omitted some entries very similar to the 654 already displayed.
"Chemoinformatics" returns "About 96,500 results". Almost an order of magnitude less! But try going to the end of those and you'll see:
In order to show you the most relevant results, we have omitted some entries very similar to the 671 already displayed.
Shaky indeed!

So, my conclusion is that the choice was made by some top molecular modeling people, highly connected to the Internet community, and the final line was drawn when the Journal of Cheminformatics appeared. From my point of view, chemoinformatics is more correct grammatically, but the world has its rules.


Chemoinformatics versus Cheminformatics: the other point of view

ChemInformatics or ChemOinformatics? This question is rather old, but I do have my specific point of view.

In Russian (yep, I'm originally from Russia) we use the term chemOinformatics, because it sounds more correct to a Russian-speaking audience. We already have chemometrics, chemotherapy (pronounced more like chimeotherapy), chemoreceptors - an old term from the 60s - and many, many others where the sound O in chemO connects two consonants. That is also true for the whole of Europe and especially for German, as noted by Egon. The other point is that the prefix chemO has given birth to many terms in the English literature: chemOgenomics, the already mentioned chemotherapy, chemogenesis, chemoreceptors.

Although typing cheminformatics can save you 6.25% of typing time, I prefer to use chemoinformatics in unofficial communication, but cheminformatics in official communication.
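The 6.25% figure is simply one letter out of sixteen. A tiny Python check (the function name typing_saving is mine, just for illustration):

```python
def typing_saving(short: str, long: str) -> float:
    """Fraction of keystrokes saved by typing `short` instead of `long`."""
    return (len(long) - len(short)) / len(long)

# 15 letters versus 16 letters: one keystroke saved out of 16.
saving = typing_saving("cheminformatics", "chemoinformatics")
print(f"{saving:.2%}")  # 6.25%
```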


CADD successful stories: Part 1.

When browsing CADD journals I always like to read success stories that led to the development of new drug-like compounds. I think everyone in the CADD field likes them. I decided to make a small summary of these successful examples. You are welcome to comment on it.

1. Dopamine 1 receptor agonists with selectivity over D2 receptor

Tools: Homology model, Pharmacophore model, Docking, selectivity by pharmacophore model.
DOI: 10.1002/cmdc.201100546

2. mGluR5 receptor negative nanomolar modulator

Tools: Artificial Neural Network (345 active and 155774 inactive compounds).
DOI: 10.1002/cmdc.201100510

3. Optimization of the omeprazole-based inhibitors of CYP2C19

Tools: pharmacophore, homology model

4. Inhibitors of Eg5 mitotic kinesin

Tools: Pharmacophore search followed by structure-based virtual screening of a database of 700 000 compounds. 3 hits found.

5. Inhibitors of ATP binding cassette transporter ABCC5
Tools: Homology model, structure-based virtual screening. 11 hits found.
DOI: 10.1021/jm2014666

6. Myotonic Dystrophy Type 1 RNA inhibitors

Tools: shape chemical similarity (ROCS), substructural search (RNA motif).

7. Adenosine A2A antagonists

Tools: homology model, structure-based virtual screening (Glide),  Biophysical Mapping 

After finishing, I stumbled upon excellent examples from Blue Dolphin Discovery. Take a look at them too.


AutoDock Tools bug

AutoDock has always been famous for strange file formats; here is a funny pdbqt bug.