ChEMBL Resources

Tuesday, 31 March 2009

Bio-IT World (Europe) Conference, Hannover, October 2009

We are going to speak in the Data Integration and Knowledge Management track at the Bio-IT World (Europe) meeting to be held in the beautiful city of Hannover, Germany, October 5th to 7th 2009. Should be a good meeting...

We Now Have An Office!

The ChEMBL caravan will always have a place in our hearts, but now we have an office, and we must move on from the pain. It has some walls and a door, with a nice hook for jackets and coats. It is nice, bijou even, and has brought a smile to all our faces. Most importantly it gives us a place to entertain guests and visitors, so if anyone is in the area, please pop by and have a cup o' tea and a slice o' cake.

As a sideline, to fund the tea and cakes, we have a nice T-shirt - XXXL only, minimum order 10 pieces if you are interested.

Thursday, 26 March 2009

ChEMBL Goes to A Conference - Fall ACS 2009

We are going to speak at the 238th ACS National Meeting, Washington, DC, August 16-20, 2009, on "ChEMBL: Large-scale Mapping of Medicinal Chemistry and Pharmacology Data to Genomes". The Abstract for the talk is:

Although the majority of effective therapeutics are small molecules, there is relatively little readily accessible public domain data mapping drugs to their molecular targets. When one considers clinical trial stage, or discovery stage data, the situation deteriorates further. However, this type of data is essential for Chemical Biology experiments, and is crucial for informed target selection in drug discovery. To address this issue, we have built a series of large scale databases, known as ChEMBL, that map small molecule structures to their target genes and also their functional effects. This data also captures a large amount of human and model organism pharmacological data, systems often used in pre-clinical validation and safety pharmacology testing. A variety of applications of these databases in the area of target prioritisation, lead discovery, lead optimisation and drug repurposing will be described.

Wednesday, 25 March 2009

Books and Papers - 8 - Programming Collective Intelligence

I am on holiday today - sort of. Went to Borders for a Starbucks (product placement hopefully pays well), and while queuing for my Orange Mocha Frappuccino!, I caught sight of the O'Reilly books; one stood out from the crowd - Programming Collective Intelligence. It looks a very cool collection of code (Python) implementing a whole variety of data analysis/machine learning techniques and routines to build smarter, more responsive and adaptive web 2.0 applications. Skimming the pages while I had my caffeine speedball led me to spend my cash.
%A Toby Segaran
%T Programming Collective Intelligence
%I O'Reilly
%D 2007
%O ISBN 978-0596529321

Monday, 23 March 2009

One StARlite interface

No, this isn't a hotel review, despite the picture above; however since finding this image, I now feel compelled to visit Carlsbad, NM to pose under the sign. Now for the post itself; find below some screenshots for an internal interface for StARlite data developed by our close collaborators at the Institute of Cancer Research in Sutton. This interface for StARlite shows some basic workflow themes that give some ideas as to the potential uses of StARlite 'straight out of the tin'. Several of the views will be incorporated into the EMBL-EBI public web interface ;) Bissan's group is developing an integrated system for cancer chemogenomics, called canSAR.

Compound Searching: What would an SAR database be without a compound sketcher and search mechanism? Well, here is one, implemented with Marvin and the Dotmatics Pinpoint cartridge.

Compound Browsing: A "Top Trumps" view on compounds is quite a useful paradigm for browsing and selecting compounds for further analysis. As an aside, for an example of sensory overload on a web page, try out the current Top Trumps company website.

Target Browsing: It is sometimes useful to browse the target dictionary to explore its structure and coverage, and to pre-group compounds into various activity sets.

This page uses the Google Chart API to do some of the heavy lifting. The chart itself is made using a URL of the form...

http://chart.apis.google.com/chart?cht=p3&chd=t:60,40&chs=250x100&chl=Good|Evil

Which produces the following (and you thought Google could do no Evil).
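For anyone wanting to generate such URLs from code rather than by hand, here is a minimal sketch in Python. It only assembles the URL string shown above and does not fetch anything. The function name `pie_chart_url` and its arguments are made up for illustration; the query parameters are the ones in the example URL (`cht` for chart type, with `p3` being a 3D pie; `chd` for the data; `chs` for the size in pixels; `chl` for the labels).

```python
from urllib.parse import urlencode

# Base address taken from the example URL in the post.
CHART_BASE = "http://chart.apis.google.com/chart"


def pie_chart_url(values, labels, size=(250, 100)):
    """Build a 3D pie chart URL (hypothetical helper, for illustration only)."""
    params = {
        "cht": "p3",                                    # chart type: 3D pie
        "chd": "t:" + ",".join(str(v) for v in values),  # data, text encoding
        "chs": f"{size[0]}x{size[1]}",                  # width x height in pixels
        "chl": "|".join(labels),                        # one label per slice
    }
    # Keep ':', ',' and '|' unescaped so the result matches the hand-written form.
    return CHART_BASE + "?" + urlencode(params, safe=":,|")


url = pie_chart_url([60, 40], ["Good", "Evil"])
print(url)
```

Called with the values 60 and 40 and the labels Good and Evil, this reproduces the URL given above.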

I thank Mark Halling-Brown and Bissan Al-Lazikani of the ICR for the images. However, I do not thank them for the geek joke in the use of the Star Wars™ font for StARlite (Ha, ha, ha, ha, ha.....).

Friday, 20 March 2009

Books and Papers - 7 - Population Genetics, Molecular Evolution, and The Neutral Theory by Motoo Kimura

Published in 1994 in recognition of the huge influence of Motoo Kimura on the field of theoretical studies on molecular evolution, this is a collection of papers and essays published by Kimura over the period 1955 to 1986. The writing is just truly beautiful; the prose, pace and clarity in the text humble me as a supposed native English speaker (as this blog so clearly shows!). If you're not interested in the science at all, just buy it for the masterclass of technical writing inside.

The theme of the book is The Neutral Theory, quite a contentious issue in evolution (essentially, this states that the vast majority of observed mutations at a molecular level are not adaptive; now flame me!). This book changed the way I thought about mutation, and about protein sequence, structure and function. Forever.

%T Population Genetics, Molecular Evolution, and The Neutral Theory
%A Motoo Kimura
%E J.F. Crow
%D 1994
%I Chicago
%O ISBN: 0-226-43562-8

Tuesday, 17 March 2009

Protein Ligand Affinity Databases

I came across a nice tabular summary of some existing 'public' 'primary' protein-ligand interaction databases that primarily focus on protein-ligand affinity data (so Ki, Kd, IC50, EC50, etc.), that I have reproduced below (many thanks to Helena Strömbergsson, from Uppsala University for the data).

Name         Target Class Focus   Approx size
BindingDB    All                  ~48,000
PDSP         Receptors            ~47,000
BRENDA       Enzymes              ~19,000
BindingMOAD  All                  ~3,500
PDBBind      All                  ~3,500
AffinDB      All                  ~700
PLD          All                  ~500

The comparable number from StARlite (31) is 507,645 affinity-class end-points (of which 186,370 are better than 100 nM). Oh, and we have started a new load.....

Friday, 6 March 2009

StARlite Schema walkthrough web-meeting

The next web-meeting for a walkthrough of the StARlite schema, data model assumptions and sample queries will be at 11am to noon local UK time (so at this time of year GMT/UT) on Friday 20th March. If you wish to take part in this meeting please use this link (do not modify the header of the email in any way!).

The last time we tried a web meeting, my domestic broadband connection could not cope with audio and the slides at the same time, so you will need to dial into a UK land line number; unfortunately, this will not be a freephone number.

Finally, if you can't make this time, we will set up a similar meeting in another few weeks or so.

Tuesday, 3 March 2009

ChEMBL Group Research Retreat 2009

I have provisionally planned the ChEMBL group retreat for 2009. It will be in Crieff, Scotland, and will be in late September. The format will allow detailed discussion and brainstorming of ideas for the ChEMBL project, and will be themed around the following areas.

  • Open-Source Drug Discovery and Open Science.
  • Patent data-mining and indexing.
  • Auto-curation and in-line predictive model generation.
  • An Ontology for drug discovery screening cascades.
  • A web-services primer.

The mornings and evenings will be informal discussions of science, while the afternoons will be fun, fungi, flora and photography (who said alliteration is dead!) oriented walks in the wooded areas around Crieff. The picture above is of a reasonably rare parasite of truffles (a Cordyceps sp.) found in Crieff around the same time of year in 2008. This is the fruiting stage of the fungus (the teleomorph); the non-fruiting body stage (the anamorph) of a closely related fungus is the source of the powerful immunosuppressive drug cyclosporin, wow!

I will try and get some really special VIP guests as surprise speakers and participants! ;)

Sunday, 1 March 2009

Software - Papers for the iPhone

Just downloaded and synced up Papers for the iPhone from mekentosj. What a great little app: beautiful interface, very snappy performance, and it lets you carry a whole bunch of literature in your pocket, and also search and download from your handheld. What more could a hipster mobile scientist want? (Apart from good 3G coverage, 64GB of memory, free Wi-Fi everywhere, and free journal access, of course).