I covered the issue of taxol a few weeks ago: http://www.chemspider.com/blog/?p=64. Today Taxol came up again in a post by Peter Murray-Rust. First, a couple of comments on that post.

PMR commented “The intelligible Chemspider image was hand-drawn by the PNAS authors – I don’t know how it got to Chemspider. (Personally I think it’s pretty awful – I do not like stereo bonds which are rectangular rather than wedges. Why do people use them. And You only have to scale the image to corrupt this info). So we need an Open collection of chemical structures.” In case there is any confusion, please read the original post: the structure was grabbed from a PDF file (“4 Total synthesis highlights”, Annu. Rep. Prog. Chem., Sect. B: Org. Chem., 2004, 100, 91, Royal Society of Chemistry). It is NOT on ChemSpider. The structure was located by a search using Chemrefer, now on ChemSpider. It was not drawn by us, we’re not responsible for it and, to clarify, I don’t like it either.

Oh, and we do have an Open Collection of chemical structures. The deposition process is under beta-testing and anyone can download the data (we will give away the entire structure collection shortly).

Peter commented that Wikipedia is highly curated. I use it a lot. But I am cautious…ESPECIALLY with stereochemistry. I’m trying to determine what the ACTUAL taxol structure is. My investigations suggest that one stereocenter is WRONG on the Wikipedia structure. In theory, then, the link to the PubChem record points to the incorrect structure.

Also, the systematic name is nowhere near what I would call IUPAC standard: β-(benzoylamino)-α-hydroxy-,6,12b-bis(acetyloxy)-12-(benzoyloxy)-2a,3,4,4a,5,6,9,10,11,12,12a,12b-dodecahydro-4,11-dihydroxy-4a,8,13,13-tetramethyl-5-oxo-7,11-methano-1H-cyclodeca(3,4)benz(1,2-b)oxet-9-ylester,(2aR-(2a-α,4-β,4a-β,6-β,9-α(α-R*,β-S*),11-α,12-α,12a-α,2b-α))-benzenepropanoic acid

By the way…the name on Drugbank is 5 beta,20-Epoxy-1,2a,4,7 beta,10 beta,13 alpha-hexahydroxytax-11-en-9-one 4,10-diacetate 2-benzoate 13-ester with (2 R,3S)-N-benzoyl-3-phenylisoserine….hmmm…

I would LOVE this post to get confirmation regarding what the right structure is…is Wikipedia CORRECT or Wrong? I THINK the structure on Drugbank is RIGHT. This DIFFERS from the Wikipedia structure by one stereocenter. Check out the InChIs below:

PUBCHEM
InChI=1/C47H51NO14/c1-25-31(60-43(56)36(52)35(28-16-10-7-11-17-28)48-41(54)29-18-12-8-13-19-29)23-47(57)40(61-42(55)30-20-14-9-15-21-30)38-45(6,32(51)22-33-46(38,24-58-33)62-27(3)50)39(53)37(59-26(2)49)34(25)44(47,4)5/h7-21,31-33,35-38,40,51-52,57H,22-24H2,1-6H3,(H,48,54)/t31-,32-,33+,35-,36+,37-,38-,40-,45+,46-,47+/m0/s1/f/h48H

DRUGBANK
InChI=1/C47H51NO14/c1-25-31(60-43(56)36(52)35(28-16-10-7-11-17-28)48-41(54)29-18-12-8-13-19-29)23-47(57)40(61-42(55)30-20-14-9-15-21-30)38-45(6,32(51)22-33-46(38,24-58-33)62-27(3)50)39(53)37(59-26(2)49)34(25)44(47,4)5/h7-21,31-33,35-38,40,51-52,57H,22-24H2,1-6H3,(H,48,54)/t31-,32-,33+,35-,36+,37+,38-,40-,45+,46-,47+/m0/s1/f/h48H

Compare the STEREO layer at:
t31-,32-,33+,35-,36+,37-,38-,40-,45+,46-,47+
t31-,32-,33+,35-,36+,37+,38-,40-,45+,46-,47+

and compare the stereo for stereo center 37… one is PLUS and one is MINUS. OOPS!
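The same comparison can be done mechanically rather than by eye. The following is a minimal sketch, not an official InChI tool: it assumes the stereo layer is the first layer beginning with `t`, and the function names `stereo_layer` and `stereo_diff` are made up for illustration. The connectivity and hydrogen layers are identical in both records, so they are left out of the example strings.

```python
# Connectivity (/c) and hydrogen (/h) layers are identical in both records,
# so they are omitted here; only the /t stereo layer is compared.
PUBCHEM = "InChI=1/C47H51NO14/t31-,32-,33+,35-,36+,37-,38-,40-,45+,46-,47+/m0/s1"
DRUGBANK = "InChI=1/C47H51NO14/t31-,32-,33+,35-,36+,37+,38-,40-,45+,46-,47+/m0/s1"


def stereo_layer(inchi):
    """Return {atom number: parity mark} from the first /t layer, or {}."""
    for layer in inchi.split("/"):
        if layer.startswith("t"):
            # Each item looks like "37-" or "37+": atom number, then parity.
            return {int(item[:-1]): item[-1] for item in layer[1:].split(",")}
    return {}


def stereo_diff(a, b):
    """Return {atom number: (parity in a, parity in b)} where the two differ."""
    pa, pb = stereo_layer(a), stereo_layer(b)
    return {
        atom: (pa.get(atom), pb.get(atom))
        for atom in sorted(set(pa) | set(pb))
        if pa.get(atom) != pb.get(atom)
    }


print(stereo_diff(PUBCHEM, DRUGBANK))
```

Run on the two stereo layers above, this reports a single difference, at atom 37. It says nothing about which record is right, only where they disagree.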

I’m certainly willing to be wrong but the point is, right now, I am not sure what the right structure is. Can anyone out there confirm? Can someone check “the” highly curated data source and tell us?

Until then I am in full agreement with Peter regarding what Wikipedia SHOULD be: “It’s Open, re-usable, very highly curated, and the first place that students look. That – or a derivative – is where the world’s chemistry should reside.” HOWEVER, I am calling for confirmation of the structure and correction if necessary. Either DrugBank or PubChem, both linked from Wikipedia, is wrong.

As for the comment “That – or a derivative – is where the world’s chemistry should reside”: I DO agree. We have committed to a wiki environment for Chemistry. We are presently deciding on the appropriate wiki environment (NOT necessarily MediaWiki) to layer onto ChemSpider. Email exchanges are underway with some of the players in this domain, and a sincere thanks to Joerg Wegner for his support on this! With Martin Walker on our advisory group (Walkerma on Wikipedia…a very active player in this domain) we look forward to the best advice and guidance from our collaborators.


11 Responses to “Will the Correct Structure of Taxol Please Stand Up. Part 2.”

  1. fvasconcellos says:

    Hi there from a wikipedian. The structure currently on WP matches that on DrugBank, and a quick and dirty Google search seems to confirm that is indeed the correct stereochemistry. After recreating the PubChem structure in ChemSketch and cleaning it up, the stereochemistry on that center doesn’t match that found on any other source I could locate.

  2. Antony Williams says:

    Fellow wikipedian….I BELIEVE you are correct. My research today suggests that the structure in Wikipedia shown in the DrugBox IS the correct one. I believe the structure on Drugbank is the correct one (WP and Drugbank do match). It is the identified structure on PubChem that is in error so the link to that record in PubChem should be removed. That’s TWO votes (yours and mine)…anybody else?

  3. M Karthikeyan says:

    TAXOL:
    SMILES

    [H][C@]12[C@H](OC(=O)C3=CC=CC=C3)[C@]4(O)C[C@H](OC(=O)[C@H](O)[C@@H](NC(=O)C5=CC=CC=C5)C6=CC=CC=C6)C(C)=C([C@@H](OC(C)=O)C(=O)[C@]1(C)[C@@H](O)C[C@H]7OC[C@@]27OC(C)=O)C4(C)C

    Chemical Name: (Traditional and ‘authentic’)

    Benzenepropanoic acid, b-(benzoylamino)-a-hydroxy-, (2aR,4S,4aS,6R,9S,11S,12S,12aR,12bS)-6,12b-bis(acetyloxy)-12-(benzoyloxy)-2a,3,4,4a,5,6,9,10,11,12,12a,12b-dodecahydro-4,11-dihydroxy-4a,8,13,13-tetramethyl-5-oxo-7,11-methano-1H-cyclodeca[3,4]benz[1,2-b]oxet-9-yl ester, (aR,bS)-

    Name from web:

    http://www.chemexper.com/chemicals/supplier/cas/33069-62-4.html

    S-Isoserine [2aR-[2aa,4b,4ab,6a,9a(aR,bS),11a,12a,12aa,12ba]]-b-(Benzoylamino)-a-hydroxy-6,12b-bis(acetyloxy)-12-(benzoyloxy)-2a,3,4,4a,5,6,9,10,11,12,12,12b-dodecahydro-4,11-dihydroxy-4a,8,13,13-tetramethyl-5-oxo-7,11-methano-1H-cyclodeca[3,4]benz[1,2-b]oxet-9-yl

    “is there a way to post the image here.. created from the above smiles using Jchem?”

  4. M Karthikeyan says:

    I posted the chemical structure (generated from smiles) here..

    http://chemxtreme.blogspot.com/2007/09/taxol-structure.html

  5. M Karthikeyan says:

    InChI=1/C47H51NO14/c1-25-31(60-43(56)36(52)35(28-16-10-7-11-17-28)48-41(54)29-18-12-8-13-19-29)23-47(57)40(61-42(55)30-20-14-9-15-21-30)38-45(6,32(51)22-33-46(38,24-58-33)62-27(3)50)39(53)37(59-26(2)49)34(25)44(47,4)5/h7-21,31-33,35-38,40,51-52,57H,22-24H2,1-6H3,(H,48,54)/t31-,32-,33+,35-,36+,37+,38-,40-,45+,46-,47+/m0/s1

    AuxInfo=1/1/N:39,44,59,62,63,49,35,29,10,34,36,28,30,9,11,33,37,27,31,8,12,52,15,55,38,43,58,32,26,7,16,50,53,40,22,20,41,2,46,3,24,5,18,61,48,56,13,23,45,60,51,21,47,25,6,19,14,54,42,17,4,57/E:(4,5)(10,11)(12,13)(14,15)(16,17)(18,19)(20,21)/it:im/rA:63cHCCOCOCCCCCCCOCCOCOCOCNCOCCCCCCCCCCCCCCCCOCCOCOCCCOCCOCCOCCOCCC/rB:p1;s2;P3;s4;d5;s5;d7;s8;d9;s10;s7d11;s3;N13;s13;s15;P16;s17;d18;s18;N20;s20;P22;s23;d24;s24;d26;s27;d28;s29;s26d30;s22;d32;s33;d34;s35;s32d36;s16;s38;d38;s40;N41;s42;s43;d43;s41;d46;s2s46;N48;s48;N50;s50;n52;s53;s54;s2s53s55;P56;s57;s58;d58;s13s40;s61;s61;/rC:.2089,1.3942,0;.8533,1.8993,0;.7069,1.0313,0;.0044,.5343,0;-.8388,.4818,0;-.4565,1.1939,0;-1.6388,.7336,0;-1.9244,1.5078,0;-2.7347,1.6572,0;-3.2658,1.0282,0;-2.9881,.2493,0;-2.1759,.1099,0;1.1976,.3272,0;.4902,-.1198,0;.9201,-.6158,0;2.3175,-1.1941,0;1.4272,-1.4835,0;.6381,-1.7325,0;.0962,-1.113,0;.1116,-2.3697,0;.4388,-3.1275,0;-.695,-2.5513,0;-1.1884,-3.2167,0;-1.9951,-3.4075,0;-2.6555,-2.9102,0;-2.348,-4.1587,0;-3.1709,-4.2223,0;-3.5284,-4.9664,0;-3.0635,-5.6472,0;-2.2414,-5.5851,0;-1.8836,-4.8411,0;-1.2858,-1.9758,0;-.932,-1.2315,0;-1.3977,-.5527,0;-2.2219,-.6203,0;-2.5759,-1.3614,0;-2.1074,-2.0404,0;3.3202,-.3091,0;4.3309,-.6054,0;2.6225,.5841,0;2.8122,1.425,0;3.4417,1.9674,0;4.248,1.7845,0;4.1409,.9641,0;4.9164,2.2723,0;2.354,2.1465,0;2.8826,2.7901,0;1.5279,2.3335,0;2.0276,2.9953,0;1.48,3.1973,0;1.3969,4.0302,0;.7586,3.5778,0;.0825,3.1681,0;-.733,3.198,0;-.7497,2.3807,0;.0825,2.3374,0;-.5929,1.8352,0;-1.4215,1.9408,0;-2.1736,2.3088,0;-1.5636,2.7484,0;2.0075,.3054,0;2.4381,-.2607,0;1.7971,-.4384,0;

  6. M Karthikeyan says:

    chemical structure:
    http://chemxtreme.blogspot.com/2007/09/taxol-structure.html

  7. M Karthikeyan says:

    Structure image: chemxtreme.blogspot.com

  8. M Karthikeyan says:

    http://chemxtreme.blogspot.com/

  9. Antony Williams says:

    I looked at: http://chemxtreme.blogspot.com/ and the link posted there http://www.chemexper.com/chemicals/supplier/cas/33069-62-4.html

    Problems…the molecule taxol has the name S-Isoserine? Hmmm… No

    One of the formulae, there are three, is listed as C46H51NO14. Hmmm…No

    The name listed is very confusing for me: [2aR-[2aa,4b,4ab,6a,9a(aR,bS),11a,12a,12aa,12ba]]-b-(Benzoylamino)-a-hydroxy-6,12b-bis(acetyloxy)-12-(benzoyloxy)-2a,3,4,4a,5,6,9,10,11,12,12,12b-dodecahydro-4,11-dihydroxy-4a,8,13,13-tetramethyl-5-oxo-7,11-methano-1H-cyclodeca[3,4]benz[1,2-b]oxet-9-yl

    The InChI you listed (InChI=1/C47H51NO14/c1-25-31(60-43(56)36(52)35(28-16-10-7-11-17-28)48-41(54)29-18-12-8-13-19-29)23-47(57)40(61-42(55)30-20-14-9-15-21-30)38-45(6,32(51)22-33-46(38,24-58-33)62-27(3)50)39(53)37(59-26(2)49)34(25)44(47,4)5/h7-21,31-33,35-38,40,51-52,57H,22-24H2,1-6H3,(H,48,54)/t31-,32-,33+,35-,36+,37+,38-,40-,45+,46-,47+/m0/s1)

    does connect with the ChemSpider record http://www.chemspider.com/RecordView.aspx?id=16739643

    Where did you get THIS SMILES? “[H][C@]12[C@H](OC(=O)C3=CC=CC=C3)[C@]4(O)C[C@H](OC(=O)[C@H](O)[C@@H](NC(=O)C5=CC=CC=C5)C6=CC=CC=C6)C(C)=C([C@@H](OC(C)=O)C(=O)[C@]1(C)[C@@H](O)C[C@H]7OC[C@@]27OC(C)=O)C4(C)C” It appears to be consistent with what I think taxol is…

  10. M Karthikeyan says:

    The structure is from ChemExper, for that CAS number. (Is the CAS number right? If it is, why is it cited for paclitaxel in ChemExper? Is ChemExper wrong?) It will require some time to clarify.

  11. Antony Williams says:

    This info is from ChemExper:

    RN:
    33069-62-4
    33069626 (but this is struck out online)
    330696249 (but this is struck out online)

    MF:
    C47H51NO14
    C47H51NO14
    NAME (Name is not a formula)
    C46H51NO14 (why are there four formulas for one structure….and one is the text “Name”?)

    MW:
    853.92024
    841.90924

    Why are there two MWs?

    I think the CAS number is 33069-62-4, as you comment. BUT I don’t have access to SciFinder to confirm. There are likely CAS people reading this blog, so it would be great if they could clean it up for us please!
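The two molecular weights in the ChemExper listing can be checked against the two distinct formulas listed with them. The sketch below assumes older standard atomic weights (C = 12.011, H = 1.00794, N = 14.0067, O = 15.9994). Those values are an assumption for illustration, not something ChemExper documents, and the function name `molecular_weight` is hypothetical.

```python
import re

# Assumed (older standard) atomic weights; not taken from ChemExper.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.00794, "N": 14.0067, "O": 15.9994}


def molecular_weight(formula):
    """Sum atomic weights for a simple formula such as 'C47H51NO14'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means one atom of that element.
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total


for formula in ("C47H51NO14", "C46H51NO14"):
    print(formula, f"{molecular_weight(formula):.5f}")
```

With these assumed weights, C47H51NO14 gives 853.92024 and C46H51NO14 gives 841.90924, matching the two listed values. That suggests each MW simply belongs to one of the two formulas, so the second MW goes with the C46 formula already questioned in this thread.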
