ChEMBL Resources

Wednesday, 27 February 2013

ChEMBL Compound Clean Up




For the last three months, I've been busy working my way through a (sometimes headache-inducing) list of 9,000 ChEMBL compound IDs. These had been flagged for curation because, for each ChEMBL_id in the list, there were two or more compound keys from the same paper. This implied either that the paper described two compounds that are indistinguishable in their InChI representation, or that different compounds had somehow been merged together in the database.
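As a rough illustration of the selection rule described above, the sketch below groups compound records by structure and paper and keeps the pairs that carry two or more compound keys. The record layout (chembl_id, doc_id, compound_key) and the example identifiers are hypothetical, chosen only for the example; this is not the query that was actually run against the database.

```python
from collections import defaultdict


def flag_for_curation(records):
    """Return {(chembl_id, doc_id): sorted keys} for every structure that
    has two or more compound keys within a single paper."""
    keys_seen = defaultdict(set)
    for chembl_id, doc_id, compound_key in records:
        keys_seen[(chembl_id, doc_id)].add(compound_key)
    return {pair: sorted(keys) for pair, keys in keys_seen.items() if len(keys) >= 2}


# Hypothetical records: (chembl_id, doc_id, compound_key)
records = [
    ("CHEMBL_A", "doc1", "12a"),
    ("CHEMBL_A", "doc1", "12b"),  # same structure, same paper, second key
    ("CHEMBL_A", "doc2", "7"),    # different paper, so not a clash
    ("CHEMBL_B", "doc1", "3"),
]
flagged = flag_for_curation(records)
print(flagged)
```

Only the first structure is flagged here, because it is the only one with two keys coming from the same paper.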

Each ChEMBL_id was individually checked against the data in the original paper to see if there were indeed two compound keys for the same structure.

The outcome of this check gave rise to one of four cases:
  • The structure(s) was found to be incorrect and was redrawn.
  • The structure was correct for some records but not others, so a new compound was created for those selected records.
  • The structure required the definition of stereochemistry or a salt.
  • The structure was left alone - either the stereochemistry could not be shown or it was indeed a currently indistinguishable compound with separate compound keys. An example of this case is where chemists have separated enantiomers, and know that a pair of compounds only differ by stereochemistry, but they don't know the absolute configuration, just that they are 'opposite'.
It was a laborious but satisfying job to complete, allowing me to make use of my pedantic and geek-like tendencies. It has shown that there are a fairly significant number of papers in which the authors have given identical structures two different compound keys. In some cases these are duplicates that probably should have been merged in the original publication; the exercise also highlights some of the problems of representing relative stereoisomers and, sometimes, atropisomers, both of which are difficult to represent.

It has definitely been an interesting project to get through, with over 3,800 compounds redrawn, altered, or having their records moved or merged. These changes will be available with the release of ChEMBL_16, further enhancing the data you have and need!

Any questions or queries, please feel free to contact ChEMBL Help at the usual address.

Louisa

Sunday, 24 February 2013

New Drug Approvals 2013 - Pt. III - Pomalidomide (Pomalyst™)



ATC Code: L04A (partial)
Wikipedia: Pomalidomide

On February 8th, the FDA approved Pomalidomide (Tradename: Pomalyst; Research Code: CC-4047, IMiD 3), a thalidomide analogue, indicated for the treatment of multiple myeloma in patients who failed to respond to previous therapies (e.g. lenalidomide and bortezomib).

Multiple myeloma is a form of blood cancer that primarily affects older adults, and arises from the accumulation of abnormal plasma cells in the bone marrow. These abnormal plasma cells produce large amounts of unneeded antibodies, which are then deposited in various organs, causing renal failure, polyneuropathy and other myeloma-associated symptoms.

Pomalidomide, an analogue of thalidomide, is an immunomodulatory agent with antineoplastic activity. As with other thalidomide analogs, its exact mechanism of action is not yet fully understood; however, in vitro assays demonstrated that pomalidomide inhibited proliferation and induced apoptosis of hematopoietic tumor cells, including lenalidomide-resistant multiple myeloma cell lines. It has also been shown that pomalidomide enhanced T cell and natural killer (NK) cell-mediated immunity and inhibited production of pro-inflammatory cytokines (e.g., TNF-α and IL-6). For more information take a look at this review.

Pomalidomide, like other thalidomide derivatives, belongs to the -domide USAN/INN stem. Members of this class are thalidomide, lenalidomide (both approved drugs and licensed by Celgene Corporation), and Mitindomide and Endomide. Pomalidomide is a result of a quest for safer analogs of thalidomide, and has a higher potency than any of its predecessors.


Pomalidomide (IUPAC Name: 4-amino-2-(2,6-dioxopiperidin-3-yl)isoindole-1,3-dione; Canonical smiles: Nc1cccc2C(=O)N(C3CCC(=O)NC3=O)C(=O)c12 ; ChEMBL: CHEMBL43452; PubChem: 134780; ChemSpider: 118785; Standard InChI Key: UVSMNLNDYGZFPF-UHFFFAOYSA-N) is a derivative of thalidomide, with a molecular weight of 273.2 Da, 5 hydrogen bond acceptors, 2 hydrogen bond donors, and has an ALogP of -0.65. The compound is therefore fully compliant with the rule of five.
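The rule-of-five statement can be checked directly from the descriptors quoted above. The sketch below uses the usual Lipinski thresholds (molecular weight over 500, more than 10 acceptors, more than 5 donors, logP over 5) and the strict reading in which any violation counts as a fail; the function name is made up for the example.

```python
def rule_of_five_violations(mol_weight, h_acceptors, h_donors, alogp):
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        mol_weight > 500,   # molecular weight in Da
        h_acceptors > 10,   # hydrogen bond acceptors
        h_donors > 5,       # hydrogen bond donors
        alogp > 5.0,        # calculated logP (ALogP here)
    ]
    return sum(checks)


# Descriptor values for pomalidomide as quoted in the text.
pomalidomide_violations = rule_of_five_violations(273.2, 5, 2, -0.65)
passes_strictly = pomalidomide_violations == 0
print(pomalidomide_violations, passes_strictly)
```

With no limit exceeded, pomalidomide passes under both the strict reading and the more common one that tolerates a single violation.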

Pomalidomide is available in capsule form, and the recommended daily dose is 4 mg on days 1-21 of repeated 28-day cycles until disease progression. Following administration of single oral doses in patients with multiple myeloma, the systemic exposure was characterized by an AUC(Τ) of 400 ng.hr/mL and a maximum plasma concentration (Cmax) of 75 ng/mL. At steady state, the mean apparent volume of distribution (Vd/F) was 62-138 L. Pomalidomide is weakly bound to human plasma proteins (12-44%).

Pomalidomide is primarily metabolized in the liver by CYP1A2 and CYP3A4, with additional minor contributions from CYP2C19 and CYP2D6. Pomalidomide is also a substrate for P-glycoprotein (P-gp). The median plasma elimination half-life (t1/2) for pomalidomide is approximately 9.5 hours in healthy subjects and 7.5 hours in patients with multiple myeloma. Pomalidomide has a mean total body clearance (CL/F) of 7-10 L/hr.
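As a back-of-the-envelope consistency check, the quoted volume of distribution and clearance can be combined into a half-life estimate using the one-compartment relation t1/2 = ln(2) x Vd / CL. This is a simplifying assumption, not how the label values were derived, and it mixes ranges that may come from different study populations, so treat the result as a sanity check only.

```python
import math


def half_life_hours(vd_litres, cl_litres_per_hour):
    """One-compartment estimate: t1/2 = ln(2) * Vd / CL."""
    return math.log(2) * vd_litres / cl_litres_per_hour


# Extremes of the quoted ranges: Vd/F 62-138 L, CL/F 7-10 L/hr.
shortest = half_life_hours(62, 10)
longest = half_life_hours(138, 7)
print(f"{shortest:.1f} to {longest:.1f} hours")
```

The estimate spans roughly 4 to 14 hours, so the reported half-lives of 7.5 and 9.5 hours sit comfortably inside it.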

Pomalidomide has been issued with a black box warning due to its teratogenic profile, i.e., it can cause severe life-threatening birth defects, and also due to its higher risk for venous thromboembolism in patients exposed to the drug. Because of Pomalyst’s embryo-fetal risk, it is available only through the Pomalyst Risk Evaluation and Mitigation Strategy (REMS) Program.

The license holder for Pomalyst™ is Celgene Corporation, and the full prescribing information can be found here.

Thursday, 21 February 2013

Save the date: 2nd RDKit UGM, 2-4 October


We'll be organising the 2nd RDKit Users Group Meeting which will be held from the 2nd until the 4th of October 2013 here at the EMBL-EBI in Hinxton. In addition to two days of talks, tutorials and discussions, the last day will be dedicated to a coding/documentation sprint.

Stay tuned for more information, as well as a call for presenters, which will come over the next few weeks, but, in the meantime, please go ahead and block the dates in your busy calendars!


George

Monday, 18 February 2013

Sign Up Now For Our Webinars!!




A couple of weeks ago, I created a Doodle Poll to gauge interest in hosting another series of Webinars, after the success of the ones we hosted last year.
After a good response, these Webinars have now been organised, and those who are interested in signing up for them can do so here.
Most of the webinars will take only 45 minutes and will give a good overview of their topic. You can watch and listen to them from the comfort of your own desk.
The Doodle Poll signup section is hidden, so only we here at ChEMBL Towers can see your personal details. However, I must stress that it is important that you leave your name and email address on the poll when you sign up, so that I can send on the connection details for the Webinar. Without this information you will not be able to take part, as I won't know where to send the connection details.

Any issues or queries, please do not hesitate to contact ChEMBL Help.

Friday, 15 February 2013

Latest activities on the Activities table in ChEMBL_15


For the recent ChEMBL_15 release, a considerable part of our efforts was focussed on the standardisation and harmonisation of the data in the Activities table. The latter holds all the quantitative and qualitative experimental measurements across compounds, assays and targets; needless to say that without it there's no ChEMBL!

This is a summary of what we've incorporated so far:

  1. Flag missing data: Records with null published values and null activity comments were flagged as missing.
  2. Standardise activity types and units: Conversion of heterogeneous published activity type descriptions and units to a standard_type and set of standard_units (e.g., for IC50 convert mM/uM/pM measurements to nM).
  3. Flag unusual units: Records with unusual published units for their respective activity types were flagged as 'non standard'. For example, a hypothetical record with IC50 type and units in kg would be flagged!
  4. Convert the log values: The records with activity types such as pKi and logIC50 were appropriately converted to their non-log equivalents (taking the units and sign into account, of course). This updated a whopping 25% of the activities table, which means that significantly more data are now comparable in subsequent analyses.
  5. Round values: For records with a standard activity value above 10, the value was rounded to the second decimal place. Otherwise, it was rounded to three significant digits. For example, 0.00023666666 would become a more concise 0.000237.
  6. Check activity ranges: Records with a standard activity value outside the range specified by our expert biological curators, given the standard unit and type, were appropriately flagged.
  7. Detect duplicated values: For this one, we were inspired by a recent publication. We detected and flagged duplicated activity entries and potential transcription errors in activity records that come from publications. The former are records with identical compound, target, activity, type and unit values that were most likely reported as citations of measurements from previous papers, even when these measurements were subsequently rounded. The latter consist of otherwise identical entries whose activity values differ by exactly 3 or 6 orders of magnitude, indicating a likely error in the units (e.g. uM instead of nM).
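Points 2, 4 and 5 above can be sketched in a few lines. This is only an illustration of the arithmetic, not the code used for the release: the unit spellings in the lookup table, the function names, and the assumption that a "p" value is the negative log10 of a molar concentration are ours.

```python
# Multipliers that convert a molar concentration unit to nM (assumed spellings).
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}


def to_nanomolar(value, units):
    """Point 2: express a published concentration in nM."""
    return value * TO_NM[units]


def p_value_to_nanomolar(p_value):
    """Point 4: convert pKi/pIC50-style values (-log10 of molar) to nM."""
    return 10.0 ** (9 - p_value)


def round_standard_value(value):
    """Point 5: two decimal places above 10, otherwise three significant digits."""
    if value > 10:
        return round(value, 2)
    return float(f"{value:.3g}")


print(to_nanomolar(2.5, "uM"))              # 2500.0
print(p_value_to_nanomolar(6))              # 1000.0
print(round_standard_value(0.00023666666))  # 0.000237
```

A logIC50 with explicit units would need its sign and unit handled separately, which is the "units and sign" caveat in point 4.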

As a result of our efforts, we added 2 new columns in the Activities table, namely Data_validity_comment and Potential_duplicate. The former takes one out of 5 possible values: NULL, 'Potential missing data' (see point 1), 'Non standard unit for type' (see point 3), 'Outside typical range' (see point 6) and 'Potential transcription error' (see point 7). The latter column contains a binary (0,1) flag to indicate whether we think the specific activity record is a duplicate, as per point 7 above.
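The logic behind point 7 can be sketched as a comparison of two records that already share compound, target, type and units. The function name and the relative tolerance used to treat rounded values as equal are assumptions made for the example; the actual procedure may handle rounding differently.

```python
import math


def compare_activities(value_a, value_b, rounding_tol=0.05):
    """Classify two otherwise-identical activity records.

    Returns "duplicate", "Potential transcription error" or None.
    rounding_tol is a hypothetical tolerance standing in for 'same value
    after rounding'.
    """
    if value_a <= 0 or value_b <= 0:
        return None  # ratios are not meaningful for non-positive values
    if math.isclose(value_a, value_b, rel_tol=rounding_tol):
        return "duplicate"
    orders = abs(math.log10(value_a / value_b))
    if any(math.isclose(orders, n, abs_tol=1e-6) for n in (3, 6)):
        return "Potential transcription error"
    return None


print(compare_activities(12.34, 12.3))   # duplicate
print(compare_activities(2.5, 2500.0))   # Potential transcription error
print(compare_activities(2.5, 40.0))     # None
```

In terms of the new columns, a "duplicate" result would correspond to Potential_duplicate = 1, and the second outcome to the 'Potential transcription error' value of Data_validity_comment.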

Stay tuned for more posts on the changes/improvements introduced by the new ChEMBL_15 release. Meanwhile, if you have any comments/feedback on the curation process or on the activity types we should prioritise, please let us know.

George

Monday, 11 February 2013

New Drug Approvals 2013 - Pt. II - Mipomersen (Kynamro™)



ATC Code: C10AX11
Wikipedia: Mipomersen

On January 29th, the FDA approved Mipomersen (Tradename: Kynamro; Research Code: ISIS-310312), an oligonucleotide inhibitor of apolipoprotein B-100 (apo B-100) synthesis, indicated as an adjunct to lipid-lowering medications and diet to reduce low density lipoprotein-cholesterol (LDL-C), apolipoprotein B (apo B), total cholesterol (TC), and non-high density lipoprotein-cholesterol (non HDL-C) in patients with homozygous familial hypercholesterolemia (HoFH).

Familial hypercholesterolemia is a genetic disorder, characterised by high levels of cholesterol-rich low-density lipoproteins (LDL-C) in the blood. This genetic condition is generally attributed to a mutation in the LDL receptor (LDLR) gene; the receptor mediates the endocytosis of LDL-C.

Mipomersen is the first antisense oligonucleotide that targets messenger RNA (mRNA) encoding apolipoprotein B-100 (Apo B-100), the principal apolipoprotein of LDL and its metabolic precursor, very low density lipoprotein (VLDL). Mipomersen forms a duplex with the target mRNA, causing the mRNA to be cleaved by RNase H and therefore unable to be translated to apoB-100. Hepatic apoB mRNA silencing gives rise to reductions in hepatic apoB, total cholesterol and LDL-C levels in the serum (PMID: 23226021).

The binding site for mipomersen lies within the coding region of the apo B mRNA at the position 3249-3268 relative to the published sequence GenBank accession number NM_000384.1.

Mipomersen, like other antisense oligonucleotides, belongs to the -rsen USAN/INN stem group. Several members of this class are currently in clinical trials like Alicaforsen (Isis, phase III) for the treatment of inflammatory bowel disease and Oblimersen (Genta, phase II) for cancer therapy.


Mipomersen (IUPAC Name: 2'-O-(2-methoxyethyl)-P-thioguanylyl-(3'→5')-2'-O-(2-methoxyethyl)-5-methyl- P-thiocytidylyl-(3'→5')-2'-O-(2-methoxyethyl)-5-methyl-P-thiocytidylyl-(3'→5')- 2'-O-(2-methoxyethyl)-5-methyl-P-thiouridylyl-(3'→5')-2'-O-(2-methoxyethyl)-5- methyl-P-thiocytidylyl-(3'→5')-2'-deoxy-P-thioadenylyl-(3'→5')-2'-deoxy-P- thioguanylyl-(3'→5')P-thiothymidylyl-(3'→5')-2'-deoxy-5-methyl-P-thiocytidylyl- (3'→5')-P-thiothymidylyl-(3'→5')-2'-deoxy-P-thioguanylyl-(3'→5')-2'-deoxy-5- methyl-P-thiocytidylyl-(3'→5')-P-thiothymidylyl-(3'→5')-P-thiothymidylyl- (3'→5')-2'-deoxy-5-methyl-P-thiocytidylyl-(3'→5')-2'-O-(2-methoxyethyl)-P- thioguanylyl-(3'→5')-2'-O-(2-methoxyethyl)-5-methyl-P-thiocytidylyl-(3'→5')-2'- O-(2-methoxyethyl)-P-thioadenylyl-(3'→5')-2'-O-(2-methoxyethyl)-5-methyl-P- thiocytidylyl-(3'→5')-2'-O-(2-methoxyethyl)-5-methylcytidine nonadecasodium salt; Canonical smiles: COCCO[C@H]1[C@@H](O)[C@H](COP(=O)(O)S[C@H]2[C@H](COP(=O)(O)S[C@H]3[C@H](COP(=O)(O)S[C@H]4[C@H](COP(=O)(O)S[C@@H]5[C@@H](COP(=O)(O)S[C@H]6C[C@@H](O[C@@H]6COP(=O)(O)S[C@H]7C[C@@H](O[C@@H]7COP(=O)(O)S[C@H]8C[C@@H](O[C@@H]8COP(=O)(O)S[C@H]9C[C@@H](O[C@@H]9COP(=O)(O)S[C@H]%10C[C@@H](O[C@@H]%10COP(=O)(O)S[C@H]%11C[C@@H](O[C@@H]%11COP(=O)(O)S[C@H]%12C[C@@H](O[C@@H]%12COP(=O)(O)S[C@H]%13C[C@@H](O[C@@H]%13COP(=O)(O)S[C@H]%14C[C@@H](O[C@@H]%14COP(=O)(O)S[C@H]%15C[C@@H](O[C@@H]%15COP(=O)(O)S[C@@H]%16[C@@H](COP(=O)(O)S[C@@H]%17[C@@H](COP(=O)(O)S[C@@H]%18[C@@H](COP(=O)(O)S[C@@H]%19[C@@H](COP(=O)(O)S[C@@H]%20[C@@H](CO)O[C@H]([C@@H]%20OCCOC)n%21cnc%22C(=O)NC(=Nc%21%22)N)O[C@H]([C@@H]%19OCCOC)N%23C=C(C)C(=NC%23=O)N)O[C@H]([C@@H]%18OCCOC)N%24C=C(C)C(=NC%24=O)N)O[C@H]([C@@H]%17OCCOC)N%25C=C(C)C(=O)NC%25=O)O[C@H]([C@@H]%16OCCOC)N%26C=C(C)C(=NC%26=O)N)n%27cnc%28c(N)ncnc%27%28)n%29cnc%30C(=O)NC(=Nc%29%30)N)N%31C=C(C)C(=O)NC%31=O)N%32C=C(C)C(=NC%32=O)N)N%33C=C(C)C(=O)NC%33=O)n%34cnc%35C(=O)NC(=Nc%34%35)N)N%36C=C(C)C(=NC%36=O)N)N%37C=C(C)C(=O)NC%37=O)N%38C=C(C)C(=O)NC%38=O)N%39C=C(C)C(=NC%39=O)N)O[C@H]([C@@H]
5OCCOC)n%40cnc%41C(=O)NC(=Nc%40%41)N)O[C@@H]([C@H]4OCCOC)N%42C=C(C)C(=NC%42=O)N)O[C@@H]([C@H]3OCCOC)n%43cnc%44c(N)ncnc%43%44)O[C@@H]([C@H]2OCCOC)N%45C=C(C)C(=NC%45=O)N)O[C@@H]1N%46C=C(C)C(=NC%46=O)N; ChEMBL: CHEMBL1208153; Standard InChI Key: TZRFSLHOCZEXCC-PBNBMMCMSA-N) is a synthetic second-generation 20-base phosphorothioate antisense oligonucleotide, with a molecular weight of 7594.9 Da and the following sequence:

5'-GMeCMeCMeUMeCAGTMeCTGMeCTTMeCGMeCAMeCMeC-3'

where the underlined residues (the five at each end of the sequence, as the IUPAC name above shows) are 2′-O-(2-methoxyethyl) nucleosides, and the remaining ten are 2′-deoxynucleosides. Substitution at the 5-position of the cytosine (C) and uracil (U) bases with a methyl group is indicated by Me.
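The sequence notation can be unpacked with a short script. The sketch below takes the sequence line exactly as printed, splits it into residues (an optional Me prefix followed by a base letter), and writes out the RNA stretch that would pair perfectly with it. The pairing table and variable names are ours, and the output is simply the reverse complement; it has not been checked against the GenBank record.

```python
import re

# Sequence as printed in the text, without the 5'/3' markers.
SEQUENCE = "GMeCMeCMeUMeCAGTMeCTGMeCTTMeCGMeCAMeCMeC"

# Each residue is an optional "Me" prefix followed by one base letter.
residues = re.findall(r"(Me)?([ACGTU])", SEQUENCE)
bases = "".join(base for _, base in residues)
methylated = sum(1 for prefix, _ in residues if prefix)

# Watson-Crick partner of each base, written as RNA.
PARTNER = {"A": "U", "C": "G", "G": "C", "T": "A", "U": "A"}
complementary_rna = "".join(PARTNER[b] for b in reversed(bases))

print(len(bases), bases)
print(methylated, "methylated bases")
print(complementary_rna)
```

The residue count of 20 agrees with the quoted binding site, since positions 3249 to 3268 inclusive also span 20 bases.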

Mipomersen is available as an aqueous solution for subcutaneous injection, and the recommended weekly dose is a single-use pre-filled syringe containing 1 mL of a 200 mg/mL solution. Following subcutaneous injection, peak concentrations of mipomersen are typically reached in 3 to 4 hours. The estimated plasma bioavailability of mipomersen following subcutaneous administration over a dose range of 50 mg to 400 mg, relative to intravenous administration, ranged from 54% to 78%. Mipomersen is highly bound to human plasma proteins (≥ 90%).

Mipomersen is metabolised in tissues by endonucleases to form shorter oligonucleotides that are then substrates for additional metabolism by exonucleases, and finally excreted in urine. The elimination half-life (t1/2) for mipomersen is approximately 1 to 2 months.

Mipomersen has been approved with a black box warning due to an increase in transaminases (alanine aminotransferase [ALT] and/or aspartate aminotransferase [AST]) levels, and hepatic fat content (steatosis) after exposure to the drug.

The license holder for Kynamro™ is Genzyme Corporation, and the full prescribing information can be found here.

Thursday, 7 February 2013

Future Webinars



After the success of the last round of webinars, we have decided to run another set.

However, we would like to gauge the interest in which topics would be most useful. The topics that have been suggested so far are:


  • ChEMBL Overview - Basic interface walkthrough and searching
  • ChEMBL Schema - Basic overview and ChEMBL changes
  • ChEMBL Schema - Changes to ChEMBL target data model
  • UniChem - Basic overview and searching
  • Drug and USAN data content


For this, a doodle poll has been set up which will allow you to register your interest. The poll is hidden, so no one will see your details, but it will allow us at ChEMBL Towers to see if the webinar would be worthwhile to set up. Please click on the link and let us know what you think.

Additionally, if there are any webinars that we have not suggested but that you believe would be useful, please feel free to suggest them on the doodle poll, or email chembl-help@ebi.ac.uk.

Wednesday, 6 February 2013

New Drug Approvals 2013 - Pt. 1 - Alogliptin (Nesina™)


ATC Code: A10BH04
Wikipedia: Alogliptin

On January 25th 2013, the FDA approved Alogliptin (as the benzoate salt; tradename: Nesina; research code: SYR-322, TAK-322; ChEMBL: CHEMBL376359), a dipeptidyl peptidase-4 (DPP-4) inhibitor indicated as an adjunct to diet and exercise to improve glycemic control in adults with type 2 diabetes mellitus (also known as noninsulin-dependent diabetes mellitus (NIDDM)).

NIDDM is a chronic disease characterized by high blood glucose. In response to meals, increased concentrations of incretin hormones such as glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic polypeptide (GIP) are released into the bloodstream from the small intestine. These hormones cause insulin release from the pancreatic beta cells in a glucose-dependent manner but are inactivated by the DPP-4 enzyme within minutes. GLP-1 also lowers glucagon secretion from pancreatic alpha cells, reducing hepatic glucose production. In patients with NIDDM, concentrations of GLP-1 are reduced but the insulin response to GLP-1 is preserved. Alogliptin exerts its therapeutic action by inhibiting DPP-4, thereby slowing the inactivation of the incretin hormones and increasing their bloodstream concentrations.

Image from Wikipedia

Other DPP-4 inhibitors are already available on the market (some of which have already been covered here on the ChEMBL-og) and these include Linagliptin (approved in 2011 under the tradename Tradjenta; ChEMBL: CHEMBL237500; PubChem: CID10096344; ChemSpider: 8271879), Saxagliptin (approved in 2009 under the tradename Onglyza; ChEMBL: CHEMBL385517; PubChem: CID11243969; ChemSpider: 9419005) and Sitagliptin (approved in 2006 under the tradename Januvia; ChEMBL: CHEMBL1422; PubChem: CID4369359; ChemSpider: 3571948). Several other DPP-4 inhibitors are in clinical trials such as Trelagliptin (ChEMBL: CHEMBL1650443; research code: SYR-472), Omarigliptin (ChEMBL: CHEMBL2105762; research code: MK-3102), Carmegliptin (ChEMBL: CHEMBL591118; research code: R-1579, RO-4876904), Gosogliptin (ChEMBL: CHEMBL515387; research code: PF-734200), Dutogliptin (research code: PHX1149), Denagliptin (ChEMBL: CHEMBL2110666; research code: GW823093). Vildagliptin (ChEMBL: CHEMBL142703) has been approved in Europe and Japan, but not in the United States.

DPP-4 (ChEMBL: CHEMBL284; Uniprot: P27487) is a 766 amino acid-long enzyme, which is responsible for the sequential removal of N-terminal dipeptides from polypeptides having unsubstituted N termini, provided that the penultimate residue is proline. It belongs to the Dipeptidyl peptidase IV family (PFAM: PF00930).

>DPP4_HUMAN Dipeptidyl peptidase 4
MKTPWKVLLGLLGAAALVTIITVPVVLLNKGTDDATADSRKTYTLTDYLKNTYRLKLYSL
RWISDHEYLYKQENNILVFNAEYGNSSVFLENSTFDEFGHSINDYSISPDGQFILLEYNY
VKQWRHSYTASYDIYDLNKRQLITEERIPNNTQWVTWSPVGHKLAYVWNNDIYVKIEPNL
PSYRITWTGKEDIIYNGITDWVYEEEVFSAYSALWWSPNGTFLAYAQFNDTEVPLIEYSF
YSDESLQYPKTVRVPYPKAGAVNPTVKFFVVNTDSLSSVTNATSIQITAPASMLIGDHYL
CDVTWATQERISLQWLRRIQNYSVMDICDYDESSGRWNCLVARQHIEMSTTGWVGRFRPS
EPHFTLDGNSFYKIISNEEGYRHICYFQIDKKDCTFITKGTWEVIGIEALTSDYLYYISN
EYKGMPGGRNLYKIQLSDYTKVTCLSCELNPERCQYYSVSFSKEAKYYQLRCSGPGLPLY
TLHSSVNDKGLRVLEDNSALDKMLQNVQMPSKKLDFIILNETKFWYQMILPPHFDKSKKY
PLLLDVYAGPCSQKADTVFRLNWATYLASTENIIVASFDGRGSGYQGDKIMHAINRRLGT
FEVEDQIEAARQFSKMGFVDNKRIAIWGWSYGGYVTSMVLGSGSGVFKCGIAVAPVSRWE
YYDSVYTERYMGLPTPEDNLDHYRNSTVMSRAENFKQVEYLLIHGTADDNVHFQQSAQIS
KALVDVGVDFQAMWYTDEDHGIASSTAHQHIYTHMSHFIKQCFSLP

The image above shows a crystal structure of DPP-4 (in this example, two copies of DPP-4 are displayed - PDBe: 3g0b). Information on the active site of DPP-4 can be found here.


Alogliptin is an oral small molecule with a molecular weight of 339.4 Da (461.51 Da as the benzoate salt). The image on the right shows Alogliptin in the active site of DPP-4. Important features of its chemical structure are the aminopiperidine motif, which provides a salt bridge to glutamic acid residues 205/206 in the active site of DPP-4; the cyanobenzyl group, which interacts with arginine residue 125; the carbonyl group of the pyrimidinedione moiety, which participates in a hydrogen bond with the backbone NH of tyrosine 631; and the uracil ring, which π-stacks with tyrosine 547.
IUPAC: 2-[[6-[(3R)-3-aminopiperidin-1-yl]-3-methyl-2,4-dioxopyrimidin-1-yl]methyl]benzonitrile
Canonical Smiles: CN1C(=O)C=C(N2CCC[C@@H](N)C2)N(Cc3ccccc3C#N)C1=O
InChI: InChI=1S/C18H21N5O2/c1-21-17(24)9-16(22-8-4-7-15(20)12-22)23(18(21)25)11-14-6-3-2-5-13(14)10-19/h2-3,5-6,9,15H,4,7-8,11-12,20H2,1H3/t15-/m1/s1

The recommended dosage of Alogliptin is 25 mg once daily. Alogliptin has good oral bioavailability (F approximately 100%), with a volume of distribution (Vd) of 417 L and low plasma protein binding (20%). Excretion is mainly renal (76% of the dose recovered in urine) and mostly as the parent compound (60% to 71%). Alogliptin is metabolized by CYP2D6 and CYP3A4 to two minor metabolites: M-I (N-demethylated alogliptin, less than 1% of the parent drug), an active metabolite that inhibits DPP-4 similarly to the parent molecule, and M-II (N-acetylated alogliptin, less than 6% of the parent drug), which does not display any inhibitory activity towards DPP-4 or other DPP-related enzymes. The renal clearance of Alogliptin is 9.6 L/hr and the systemic clearance is 14.0 L/hr.
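The two clearance figures can be cross-checked against the excretion data with one division. This is a rough consistency check under the assumption that the renal share of elimination is simply renal clearance divided by systemic clearance; the function name is ours.

```python
def renal_fraction(renal_clearance, systemic_clearance):
    """Share of total clearance accounted for by the kidneys."""
    return renal_clearance / systemic_clearance


# Values quoted in the text, both in L/hr.
fraction = renal_fraction(9.6, 14.0)
print(f"{fraction:.0%} of clearance is renal")
```

The result, a little under 70%, falls within the 60% to 71% of the dose reported as parent compound in urine.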

The license holder is Takeda Pharmaceuticals America, Inc. and the prescribing information of Alogliptin can be found here.

Patricia

Sunday, 3 February 2013

Japan - Here I Come (in October)!


I'm out in Japan at the end of October this year - the week of October 28th 2013 for a scientific conference (the CBI Annual Conference). Japan is one of my favorite places in the whole world, and I have a routine of...

  • browsing vintage hi-fi shops,
  • hunting out high-end capacitor and choke components,
  • eating eel うなぎ bento,
  • visiting CD stores (at least the format lives on in Japan, and the Obi Strips and enhanced content make compelling browsing for an obsessive completist like me),
  • going to Bic Camera 株式会社ビックカメラ,
  • visiting Akihabara 秋葉原.


My schedule is currently empty for Wednesday 30th and Thursday 31st. I'd be delighted to visit and give a seminar, or maybe run a workshop on ChEMBL, so if you are interested in meeting up, or a visit, or an evening meal, let me know.

jpo

Friday, 1 February 2013

Updated Drug Icons

In the recent release of ChEMBL_15, we have revisited the information displayed in the drug icons used in the ChEMBL interface and in the ChEMBL-og New Drug Approvals monographs, and we have made a few changes.

The following images show the main changes (in this example, for the case of an oral synthetic small molecule):




1. We have visually separated the ingredient-specific information (icons in green) from the product-specific information (icons in blue).

2. The chirality icon will now also show if the ingredient is dosed as a racemic mixture (an image of two human hands).

3. An extra icon has been added to indicate the marketing status of a drug product. The product can be available as prescription (an image of the letters RX), over-the-counter (an image of the letters OTC) or discontinued (an image of the letters RX with a stripe across it).

In summary...

The ingredient icons (in green) display the following information (from left to right):
  • Drug class: this can be a synthetic small molecule, natural product-derived small molecule, inorganic small molecule, peptide/protein, monoclonal antibody, enzyme, oligonucleotide or oligosaccharide.
  • Rule of Five (an image of the number five): this is either pass or fail. We fail a molecule unless it passes all of the individual tests (people usually allow one failed parameter); we use AlogP for the calculations, with 5.0 as the cutoff.
  • New target (an image of a 'bullseye' target): this is either true or false. The target here refers to the molecular target responsible (or believed to be responsible) for the drug's therapeutic efficacy.
  • Chirality: an image of two human hands means the drug is dosed as a racemic mixture; an image of a single chiral human hand means the drug is dosed as a single optically active substance.
  • Prodrug (an image of a pair of scissors): the drug is essentially inactive in the dosed form and requires some chemical change in order to become pharmacologically active against its efficacy target.

The product icons (in blue) display the following information:
  • Oral delivery: an image of a capsule.
  • Parenteral delivery: an image of a syringe.
  • Topical delivery: an image of an ointment tube.
    Some drugs are dosed in multiple forms, which is why we haven't collapsed the three delivery icons down to a single state. Also, the delivery icon actually represents the absorption route (so some drugs that are actually delivered orally may in fact be absorbed sublingually).
  • Boxed warning (an image of a black box): this is either true or false.
  • Availability: an image of the letters RX means the product is available as prescription; an image of the letters OTC means the product is available over-the-counter; an image of the letters RX with a stripe across means the product is discontinued.

Patricia

Searching ChEMBL with GO terms


Here is a new little tip/trick within the ChEMBL interface. It's possible to search by GO term - for example, if you wanted to retrieve targets (and then easily get compounds that bind to these targets) with a particular GO annotation, it's really easy to do.

So, imagine you wanted targets annotated with GO:0008270 (zinc ion binding): type this into the search box, select the "target" search button, and the targets that have this GO term assigned are retrieved. This is really cool!

PS: One issue is that the leading 0s in the GO term are significant.
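Because the leading zeros matter, it can help to normalise an identifier before pasting it into the search box. GO identifiers are written as "GO:" followed by seven digits, so a small helper can pad shorter forms. The function below is a hypothetical convenience for your own scripts, not part of the ChEMBL interface.

```python
import re


def normalise_go_id(term):
    """Return a GO identifier in canonical 'GO:' + seven-digit form."""
    match = re.fullmatch(r"(?:GO:)?(\d{1,7})", term.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not a GO identifier: {term!r}")
    return "GO:" + match.group(1).zfill(7)


print(normalise_go_id("GO:8270"))     # GO:0008270
print(normalise_go_id("go:0008270"))  # GO:0008270
```

Anything that is not an optional GO: prefix followed by one to seven digits raises a ValueError rather than being silently passed through.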