As discussed in an earlier blog post, I spent some time chatting with Paul Doherty and Peter Murray-Rust this weekend, specifically around InChIs and InChIKeys. I’d originally suggested to Paul that he put InChIs on the site so that I could use them to check whether the structures he draws are present in the ChemSpider database. Well, now that he’s started to include them in his postings, I get to check them.

I started with a search on the term Diazonamide A in PubChem. Two hits, shown below.

Diazonamide A on PubChem 

Diazonamide A is a complex structure. Those structure representations drawn above are NASTY and do need cleaning for sure. Unfortunately ChemSpider has the same issues for these two structures (see below) since we did not CLEAN the PubChem dataset. Cleaning these structures is not an easy task. We are working on improving this as discussed previously.

The two shown on PubChem have different Isomeric SMILES and different InChIs. Why? ONE stereocenter difference: see the highlighted difference below (in red) and the arrow pointing to the stereocenter that differs.

Stereo Differences

It is appropriate to have two structures in PubChem, since they are unique. But now we don’t know which one is Diazonamide A. Shucks.

And so, let’s check eMolecules. I didn’t find any hits. eMolecules did take a lot of PubChem into their dataset. The reason for not finding it might be “it’s not there”, because of eMolecules’ focus on commercial suppliers, OR “I queried incorrectly”. I don’t know.

So, to ChemSpider. A search on Diazonamide A gave SIX hits. Two of these are the exact ones from PubChem. The four others are shown below…

Marinlit Diazonamide A

Of these four, two have an “additional oxygen and two hydrogen atoms” in the molecular formula.

Is that right or wrong? TotallySynthetic has the formula as C40H34Cl2N6O6, so we’ll trust Paul and curate these two records as IN ERROR, as shown below (a primary advantage of ChemSpider is that we allow curation!). I’ll also let Marinlit know…

 

 Curate Out

The differences between Diazonamide A and the “incorrect structure” are shown below just for information. They are SIGNIFICANTLY different.

Differences in Marinlit
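The formula check described above is easy to reproduce. Here is a minimal Python sketch, assuming simple formulas without brackets or charges. The function names are made up for illustration, and the "suspect" formula is not copied from the Marinlit record: it is just the TotallySynthetic formula with one oxygen and two hydrogens added, as described in the text.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Count atoms in a simple formula such as C40H34Cl2N6O6.

    Assumes no brackets, charges or isotopes.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def formula_difference(reference: str, other: str) -> dict:
    """Return {element: count in other minus count in reference} where they differ."""
    ref, oth = parse_formula(reference), parse_formula(other)
    return {
        element: oth[element] - ref[element]
        for element in sorted(set(ref) | set(oth))
        if oth[element] != ref[element]
    }


reference = "C40H34Cl2N6O6"  # formula quoted from TotallySynthetic
suspect = "C40H36Cl2N6O7"    # assumption: reference plus one O and two H

print(formula_difference(reference, suspect))  # {'H': 2, 'O': 1}
```

A check like this only flags that the formulas differ. Deciding which record is wrong still needs a trusted reference, which is why the curation step matters.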

Let’s take a look at the different InChIs for all the structures we are considering: 2 from PubChem, 2 from ChemSpider (from Marinlit) and 1 from TotallySynthetic. They are ALL different, and all differ in the stereochemistry layer.

PubChem CID: 395475
InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-16,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27-,30-,39-,40-/m0/s1/f/h44-45H
 

PubChem CID: 5492609
InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-16,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27-,30-,39-,40u/m0/s1/f/h44-45H
 

ChemSpider CID: 10478902 

InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-16,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23?,27-,30-,39?,40-/m0/s1/f/h44-45H  

ChemSpider CID: 17212293 

InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-16,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27+,30+,39-,40+/m1/s1

Totally Synthetic

InChI=1/C40H34Cl2N6O6/c1-15(2)27-37-46-29-32(54-37)40-20-9-5-8-19(18-7-6-10-22-25(18)26(33(41)43-22)31-34(42)48-38(29)53-31)28(20)47-39(40)52-24-12-11-17(13-21(24)40)14-23(35(50)45-27)44-36(51)30(49)16(3)4/h5-13,15-16,23,27,30,39,43,47,49H,14H2,1-4H3,(H,44,51)(H,45,50)/t23-,27-,30-,39?,40-/m0/s1

 

The TotallySynthetic structure has the stereo layer t23-,27-,30-,39?,40-. In real terms this relates to five stereocenters, one of which (atom 39) is marked “?”, i.e. undefined. At this point I have to question whether the structure as drawn is correct or not. What we have is five structures with the SAME connectivities but with different stereochemistries. This is similar to the issue I blogged about previously in regards to Taxol.
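For anyone who wants to repeat the comparison, here is a rough Python sketch that pulls the /t (stereo) layer out of each InChI and lists the centers that differ from the TotallySynthetic one. It uses plain string handling rather than a cheminformatics toolkit, the function names are hypothetical, and only the tail of each InChI listed above is included to keep it short. It also ignores the /m and /s layers, which affect how the parities should be read, so treat the result as a text-level comparison only.

```python
import re


def stereo_layer(inchi: str) -> dict:
    """Return {atom_number: parity} from the first /t layer of an InChI string."""
    match = re.search(r"/t([^/]*)", inchi)
    if match is None:
        return {}
    return {
        int(atom): parity
        for atom, parity in re.findall(r"(\d+)([-+?u])", match.group(1))
    }


def differing_centers(reference: str, other: str) -> dict:
    """Return {atom: (reference_parity, other_parity)} for centers that differ."""
    ref, oth = stereo_layer(reference), stereo_layer(other)
    return {
        atom: (ref.get(atom), oth.get(atom))
        for atom in sorted(set(ref) | set(oth))
        if ref.get(atom) != oth.get(atom)
    }


# Tails of the InChIs listed above (the connectivity part is identical in all five).
RECORDS = {
    "PubChem 395475": "/t23-,27-,30-,39-,40-/m0/s1/f/h44-45H",
    "PubChem 5492609": "/t23-,27-,30-,39-,40u/m0/s1/f/h44-45H",
    "ChemSpider 10478902": "/t23?,27-,30-,39?,40-/m0/s1/f/h44-45H",
    "ChemSpider 17212293": "/t23-,27+,30+,39-,40+/m1/s1",
}
TOTALLY_SYNTHETIC = "/t23-,27-,30-,39?,40-/m0/s1"

for name, tail in RECORDS.items():
    print(name, differing_centers(TOTALLY_SYNTHETIC, tail))
```

The same functions accept full InChI strings, since they only look for the first "/t" segment.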

 

I’m not sure that we have the right structure of Diazonamide A yet, but I’m sure we’ll get there very quickly with this question out there in the open. I HOPE SO!


3 Responses to “Diazonamide A and Chats with TotallySynthetic.Com”

  1. ChemSpider Blog » Blog Archive » More Comments About Diazonamide A - other efforts to distinguish WHAT’S REAL? says:

    [...] Diazonamide A and Chats with TotallySynthetic.Com 24 09 [...]

  2. Dan Zaharevitz says:

    The problem with the PubChem structures is not that they are badly drawn, but they are badly “cleaned up”. Navigate to the substance records and set the display to view the “deposited” structure. You certainly can download the deposited structure there. I don’t know if you can bulk download deposited structures. It just points out that standardization may give, but it takes away as well.

  3. ChemSpider Blog » Blog Archive » One Day I’ll Have Lunch with Egon Willighagen Too… says:

    [...] tagging his posts with InChIKeys…not InChIStrings (I’ve talked about the value of this here and [...]
