libokj committed
Commit 03de32d · 1 Parent(s): 9dfd620

Upload 2 files

data/examples/Aspirin_CID_2244.sdf ADDED
@@ -0,0 +1,157 @@
+ 2244
+ -OEChem-12132310082D
+
+ 21 21 0 0 0 0 0 0 0999 V2000
+ 3.7320 -0.0600 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
+ 6.3301 1.4400 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
+ 4.5981 1.4400 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
+ 2.8660 -1.5600 0.0000 O 0 0 0 0 0 0 0 0 0 0 0 0
+ 4.5981 -0.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 5.4641 -0.0600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 4.5981 -1.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 6.3301 -0.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 5.4641 -2.0600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 6.3301 -1.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 5.4641 0.9400 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 2.8660 -0.5600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 2.0000 -0.0600 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
+ 4.0611 -1.8700 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 6.8671 -0.2500 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 5.4641 -2.6800 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 6.8671 -1.8700 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 2.3100 0.4769 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 1.4631 0.2500 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 1.6900 -0.5969 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 6.3301 2.0600 0.0000 H 0 0 0 0 0 0 0 0 0 0 0 0
+ 1 5 1 0 0 0 0
+ 1 12 1 0 0 0 0
+ 2 11 1 0 0 0 0
+ 2 21 1 0 0 0 0
+ 3 11 2 0 0 0 0
+ 4 12 2 0 0 0 0
+ 5 6 1 0 0 0 0
+ 5 7 2 0 0 0 0
+ 6 8 2 0 0 0 0
+ 6 11 1 0 0 0 0
+ 7 9 1 0 0 0 0
+ 7 14 1 0 0 0 0
+ 8 10 1 0 0 0 0
+ 8 15 1 0 0 0 0
+ 9 10 2 0 0 0 0
+ 9 16 1 0 0 0 0
+ 10 17 1 0 0 0 0
+ 12 13 1 0 0 0 0
+ 13 18 1 0 0 0 0
+ 13 19 1 0 0 0 0
+ 13 20 1 0 0 0 0
+ M END
+ > <PUBCHEM_COMPOUND_CID>
+ 2244
+
+ > <PUBCHEM_COMPOUND_CANONICALIZED>
+ 1
+
+ > <PUBCHEM_CACTVS_COMPLEXITY>
+ 212
+
+ > <PUBCHEM_CACTVS_HBOND_ACCEPTOR>
+ 4
+
+ > <PUBCHEM_CACTVS_HBOND_DONOR>
+ 1
+
+ > <PUBCHEM_CACTVS_ROTATABLE_BOND>
+ 3
+
+ > <PUBCHEM_CACTVS_SUBSKEYS>
+ AAADccBwOAAAAAAAAAAAAAAAAAAAAAAAAAAwAAAAAAAAAAABAAAAGgAACAAADASAmAAyDoAABgCIAiDSCAACCAAkIAAIiAEGCMgMJzaENRqCe2Cl4BEIuYeIyCCOAAAAAAAIAAAAAAAAABAAAAAAAAAAAA==
+
+ > <PUBCHEM_IUPAC_OPENEYE_NAME>
+ 2-acetoxybenzoic acid
+
+ > <PUBCHEM_IUPAC_CAS_NAME>
+ 2-acetyloxybenzoic acid
+
+ > <PUBCHEM_IUPAC_NAME_MARKUP>
+ 2-acetyloxybenzoic acid
+
+ > <PUBCHEM_IUPAC_NAME>
+ 2-acetyloxybenzoic acid
+
+ > <PUBCHEM_IUPAC_SYSTEMATIC_NAME>
+ 2-acetyloxybenzoic acid
+
+ > <PUBCHEM_IUPAC_TRADITIONAL_NAME>
+ 2-acetoxybenzoic acid
+
+ > <PUBCHEM_IUPAC_INCHI>
+ InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)
+
+ > <PUBCHEM_IUPAC_INCHIKEY>
+ BSYNRYMUTXBXSQ-UHFFFAOYSA-N
+
+ > <PUBCHEM_XLOGP3>
+ 1.2
+
+ > <PUBCHEM_EXACT_MASS>
+ 180.04225873
+
+ > <PUBCHEM_MOLECULAR_FORMULA>
+ C9H8O4
+
+ > <PUBCHEM_MOLECULAR_WEIGHT>
+ 180.16
+
+ > <PUBCHEM_OPENEYE_CAN_SMILES>
+ CC(=O)OC1=CC=CC=C1C(=O)O
+
+ > <PUBCHEM_OPENEYE_ISO_SMILES>
+ CC(=O)OC1=CC=CC=C1C(=O)O
+
+ > <PUBCHEM_CACTVS_TPSA>
+ 63.6
+
+ > <PUBCHEM_MONOISOTOPIC_WEIGHT>
+ 180.04225873
+
+ > <PUBCHEM_TOTAL_CHARGE>
+ 0
+
+ > <PUBCHEM_HEAVY_ATOM_COUNT>
+ 13
+
+ > <PUBCHEM_ATOM_DEF_STEREO_COUNT>
+ 0
+
+ > <PUBCHEM_ATOM_UDEF_STEREO_COUNT>
+ 0
+
+ > <PUBCHEM_BOND_DEF_STEREO_COUNT>
+ 0
+
+ > <PUBCHEM_BOND_UDEF_STEREO_COUNT>
+ 0
+
+ > <PUBCHEM_ISOTOPIC_ATOM_COUNT>
+ 0
+
+ > <PUBCHEM_COMPONENT_COUNT>
+ 1
+
+ > <PUBCHEM_CACTVS_TAUTO_COUNT>
+ -1
+
+ > <PUBCHEM_COORDINATE_TYPE>
+ 1
+ 5
+ 255
+
+ > <PUBCHEM_BONDANNOTATIONS>
+ 5 6 8
+ 5 7 8
+ 6 8 8
+ 7 9 8
+ 8 10 8
+ 9 10 8
+
+ $$$$
data/examples/interaction_pair_inference.csv ADDED
@@ -0,0 +1,9 @@
+ ID1,X1,ID2,X2
+ CHEMBL41355,CCOC(=O)Nc1ccc(NCc2ccc(F)cc2)cc1N,O88943,MVQKSRNGGVYPGTSGEKKLKVGFVGLDPGAPDSTRDGALLIAGSEAPKRGSVLSKPRTGGAGAGKPPKRNAFYRKLQNFLYNVLERPRGWAFIYHAYVFLLVFSCLVLSVFSTIKEYEKSSEGALYILEIVTIVVFGVEYFVRIWAAGCCCRYRGWRGRLKFARKPFCVIDIMVLIASIAVLAAGSQGNVFATSALRSLRFLQILRMIRMDRRGGTWKLLGSVVYAHSKELVTAWYIGFLCLILASFLVYLAEKGENDHFDTYADALWWGLITLTTIGYGDKYPQTWNGRLLAATFTLIGVSFFALPAGILGSGFALKVQEQHRQKHFEKRRNPAAGLIQSAWRFYATNLSRTDLHSTWQYYERTVTVPMISSQTQTYGASRLIPPLNQLEMLRNLKSKSGLTFRKEPQPEPSPSQKVSLKDRVFSSPRGVAAKGKGSPQAQTVRRSPSADQSLDDSPSKVPKSWSFGDRSRARQAFRIKGAASRQNSEEASLPGEDIVEDNKSCNCEFVTEDLTPGLKVSIRAVCVMRFLVSKRKFKESLRPYDVMDVIEQYSAGHLDMLSRIKSLQSRVDQIVGRGPTITDKDRTKGPAETELPEDPSMMGRLGKVEKQVLSMEKKLDFLVSIYTQRMGIPPAETEAYFGAKEPEPAPPYHSPEDSRDHADKHGCIIKIVRSTSSTGQRKYAAPPVMPPAECPPSTSWQQSHQRHGTSPVGDHGSLVRIPPPPAHERSLSAYSGGNRASTEFLRLEGTPACRPSEAALRDSDTSISIPSVDHEELERSFSGFSISQSKENLNALASCYAAVAPCAKVRPYIAEGESDTDSDLCTPCGPPPRSATGEGPFGDVAWAGPRK
+ CHEMBL497318,CCCCCc1cc(O)c(C/C=C(\C)CCC=C(C)C)c(O)c1,Q9Y5S1,MTSPSSSPVFRLETLDGGQEDGSEADRGKLDFGSGLPPMESQFQGEDRKFAPQIRVNLNYRKGTGASQPDPNRFDRDRLFNAVSRGVPEDLAGLPEYLSKTSKYLTDSEYTEGSTGKTCLMKAVLNLKDGVNACILPLLQIDRDSGNPQPLVNAQCTDDYYRGHSALHIAIEKRSLQCVKLLVENGANVHARACGRFFQKGQGTCFYFGELPLSLAACTKQWDVVSYLLENPHQPASLQATDSQGNTVLHALVMISDNSAENIALVTSMYDGLLQAGARLCPTVQLEDIRNLQDLTPLKLAAKEGKIEIFRHILQREFSGLSHLSRKFTEWCYGPVRVSLYDLASVDSCEENSVLEIIAFHCKSPHRHRMVVLEPLNKLLQAKWDLLIPKFFLNFLCNLIYMFIFTAVAYHQPTLKKQAAPHLKAEVGNSMLLTGHILILLGGIYLLVGQLWYFWRRHVFIWISFIDSYFEILFLFQALLTVVSQVLCFLAIEWYLPLLVSALVLGWLNLLYYTRGFQHTGIYSVMIQKVILRDLLRFLLIYLVFLFGFAVALVSLSQEAWRPEAPTGPNATESVQPMEGQEDEGNGAQYRGILEASLELFKFTIGMGELAFQEQLHFRGMVLLLLLAYVLLTYILLLNMLIALMSETVNSVATDSWSIWKLQKAISVLEMENGYWWCRKKQRAGVMLTVGTKPDGSPDERWCFRVEEVNWASWEQTLPTLCEDPSGAGVPRTLENPVLASPPKEDEDGASEENYVPVQLLQSN
+ CHEMBL497318,CCCCCc1cc(O)c(C/C=C(\C)CCC=C(C)C)c(O)c1,Q9Y5S1,MTSPSSSPVFRLETLDGGQEDGSEADRGKLDFGSGLPPMESQFQGEDRKFAPQIRVNLNYRKGTGASQPDPNRFDRDRLFNAVSRGVPEDLAGLPEYLSKTSKYLTDSEYTEGSTGKTCLMKAVLNLKDGVNACILPLLQIDRDSGNPQPLVNAQCTDDYYRGHSALHIAIEKRSLQCVKLLVENGANVHARACGRFFQKGQGTCFYFGELPLSLAACTKQWDVVSYLLENPHQPASLQATDSQGNTVLHALVMISDNSAENIALVTSMYDGLLQAGARLCPTVQLEDIRNLQDLTPLKLAAKEGKIEIFRHILQREFSGLSHLSRKFTEWCYGPVRVSLYDLASVDSCEENSVLEIIAFHCKSPHRHRMVVLEPLNKLLQAKWDLLIPKFFLNFLCNLIYMFIFTAVAYHQPTLKKQAAPHLKAEVGNSMLLTGHILILLGGIYLLVGQLWYFWRRHVFIWISFIDSYFEILFLFQALLTVVSQVLCFLAIEWYLPLLVSALVLGWLNLLYYTRGFQHTGIYSVMIQKVILRDLLRFLLIYLVFLFGFAVALVSLSQEAWRPEAPTGPNATESVQPMEGQEDEGNGAQYRGILEASLELFKFTIGMGELAFQEQLHFRGMVLLLLLAYVLLTYILLLNMLIALMSETVNSVATDSWSIWKLQKAISVLEMENGYWWCRKKQRAGVMLTVGTKPDGSPDERWCFRVEEVNWASWEQTLPTLCEDPSGAGVPRTLENPVLASPPKEDEDGASEENYVPVQLLQSN
+ CHEMBL444449,O=c1ccc2c(OCCCCOc3ccccc3)c3ccoc3cc2o1,P17658,MRSEKSLTLAAPGEVRGPEGEQQDAGDFPEAGGGGGCCSSERLVINISGLRFETQLRTLSLFPDTLLGDPGRRVRFFDPLRNEYFFDRNRPSFDAILYYYQSGGRLRRPVNVPLDIFLEEIRFYQLGDEALAAFREDEGCLPEGGEDEKPLPSQPFQRQVWLLFEYPESSGPARGIAIVSVLVILISIVIFCLETLPQFRVDGRGGNNGGVSRVSPVSRGSQEEEEDEDDSYTFHHGITPGEMGTGGSSSLSTLGGSFFTDPFFLVETLCIVWFTFELLVRFSACPSKPAFFRNIMNIIDLVAIFPYFITLGTELVQQQEQQPASGGGGQNGQQAMSLAILRVIRLVRVFRIFKLSRHSKGLQILGKTLQASMRELGLLIFFLFIGVILFSSAVYFAEADDDDSLFPSIPDAFWWAVVTMTTVGYGDMYPMTVGGKIVGSLCAIAGVLTIALPVPVIVSNFNYFYHRETEQEEQGQYTHVTCGQPAPDLRATDNGLGKPDFPEANRERRPSYLPTPHRAYAEKRMLTEV
+ CHEMBL305187,CC(C(O)c1ccc(O)cc1)N1CCC(Cc2ccccc2)CC1,O15399,MRGAGGPRGPRGPAKMLLLLALACASPFPEEAPGPGGAGGPGGGLGGARPLNVALVFSGPAYAAEAARLGPAVAAAVRSPGLDVRPVALVLNGSDPRSLVLQLCDLLSGLRVHGVVFEDDSRAPAVAPILDFLSAQTSLPIVAVHGGAALVLTPKEKGSTFLQLGSSTEQQLQVIFEVLEEYDWTSFVAVTTRAPGHRAFLSYIEVLTDGSLVGWEHRGALTLDPGAGEAVLSAQLRSVSAQIRLLFCAREEAEPVFRAAEEAGLTGSGYVWFMVGPQLAGGGGSGAPGEPPLLPGGAPLPAGLFAVRSAGWRDDLARRVAAGVAVVARGAQALLRDYGFLPELGHDCRAQNRTHRGESLHRYFMNITWDNRDYSFNEDGFLVNPSLVVISLTRDRTWEVVGSWEQQTLRLKYPLWSRYGRFLQPVDDTQHLTVATLEERPFVIVEPADPISGTCIRDSVPCRSQLNRTHSPPPDAPRPEKRCCKGFCIDILKRLAHTIGFSYDLYLVTNGKHGKKIDGVWNGMIGEVFYQRADMAIGSLTINEERSEIVDFSVPFVETGISVMVARSNGTVSPSAFLEPYSPAVWVMMFVMCLTVVAVTVFIFEYLSPVGYNRSLATGKRPGGSTFTIGKSIWLLWALVFNNSVPVENPRGTTSKIMVLVWAFFAVIFLASYTANLAAFMIQEEYVDTVSGLSDRKFQRPQEQYPPLKFGTVPNGSTEKNIRSNYPDMHSYMVRYNQPRVEEALTQLKAGKLDAFIYDAAVLNYMARKDEGCKLVTIGSGKVFATTGYGIALHKGSRWKRPIDLALLQFLGDDEIEMLERLWLSGICHNDKIEVMSSKLDIDNMAGVFYMLLVAMGLSLLVFAWEHLVYWRLRHCLGPTHRMDFLLAFSRGMYSCCSAEAAPPPAKPPPPPQPLPSPAYPAPRPAPGPAPFVPRERASVDRWRRTKGAGPPGGAGLADGFHRYYGPIEPQGLGLGLGEARAAPRGAAGRPLSPPAAQPPQKPPPSYFAIVRDKEPAEPPAGAFPGFPSPPAPPAAAATAVGPPLCRLAFEDESPPAPARWPRSDPESQPLLGPGAGGAGGTGGAGGGAPAAPPPCRAAPPPCPYLDLEPSPSDSEDSESLGGASLGGLEPWWFADFPYPYAERLGPPPGRYWSVDKLGGWRAGSWDYLPPRSGPAAWHCRHCASLELLPPPRHLSCSHDGLDGGWWAPPPPPWAAGPLPRRRARCGCPRSHPHRPRASHRTPAAAAPHHHRHRRAAGGWDLPPPAPTSRSLEDLSSCPRAAPARRLTGPSRHARRCPHAAHWGPPLPTASHRRHRGGDLGTRRGSAHFSSLESEV
+ CHEMBL305187,CC(C(O)c1ccc(O)cc1)N1CCC(Cc2ccccc2)CC1,Q14957,MGGALGPALLLTSLFGAWAGLGPGQGEQGMTVAVVFSSSGPPQAQFRARLTPQSFLDLPLEIQPLTVGVNTTNPSSLLTQICGLLGAAHVHGIVFEDNVDTEAVAQILDFISSQTHVPILSISGGSAVVLTPKEPGSAFLQLGVSLEQQLQVLFKVLEEYDWSAFAVITSLHPGHALFLEGVRAVADASHVSWRLLDVVTLELGPGGPRARTQRLLRQLDAPVFVAYCSREEAEVLFAEAAQAGLVGPGHVWLVPNLALGSTDAPPATFPVGLISVVTESWRLSLRQKVRDGVAILALGAHSYWRQHGTLPAPAGDCRVHPGPVSPAREAFYRHLLNVTWEGRDFSFSPGGYLVQPTMVVIALNRHRLWEMVGRWEHGVLYMKYPVWPRYSASLQPVVDSRHLTVATLEERPFVIVESPDPGTGGCVPNTVPCRRQSNHTFSSGDVAPYTKLCCKGFCIDILKKLARVVKFSYDLYLVTNGKHGKRVRGVWNGMIGEVYYKRADMAIGSLTINEERSEIVDFSVPFVETGISVMVARSNGTVSPSAFLEPYSPAVWVMMFVMCLTVVAITVFMFEYFSPVSYNQNLTRGKKSGGPAFTIGKSVWLLWALVFNNSVPIENPRGTTSKIMVLVWAFFAVIFLASYTANLAAFMIQEQYIDTVSGLSDKKFQRPQDQYPPFRFGTVPNGSTERNIRSNYRDMHTHMVKFNQRSVEDALTSLKMGKLDAFIYDAAVLNYMAGKDEGCKLVTIGSGKVFATTGYGIAMQKDSHWKRAIDLALLQFLGDGETQKLETVWLSGICQNEKNEVMSSKLDIDNMAGVFYMLLVAMGLALLVFAWEHLVYWKLRHSVPNSSQLDFLLAFSRGIYSCFSGVQSLASPPRQASPDLTASSAQASVLKMLQAARDMVTTAGVSSSLDRATRTIENWGGGRRAPPPSPCPTPRSGPSPCLPTPDPPPEPSPTGWGPPDGGRAALVRRAPQPPGRPPTPGPPLSDVSRVSRRPAWEARWPVRTGHCGRHLSASERPLSPARCHYSSFPRADRSGRPFLPLFPELEDLPLLGPEQLARREALLHAAWARGSRPRHASLPSSVAEAFARPSSLPAGCTGPACARPDGHSACRRLAQAQSMCLPIYREACQEGEQAGAPAWQHRQHVCLHAHAHLPFCWGAVCPHLPPCASHGSWLSGAWGPLGHRGRTLGLGTGYRDSGGLDEISRVARGTQGFPGPCTWRRISSLESEV
+ CHEMBL1098,CCCCN1CCCCC1C(=O)Nc1c(C)cccc1C,O54912,MKRQNVRTLALIVCTFTYLLVGAAVFDALESEPEMIERQRLELRQLELRARYNLSEGGYEELERVVLRLKPHKAGVQWRFAGSFYFAITVITTIGYGHAAPSTDGGKVFCMFYALLGIPLTLVMFQSLGERINTFVRYLLHRAKRGLGMRHAEVSMANMVLIGFVSCISTLCIGAAAFSYYERWTFFQAYYYCFITLTTIGFGDYVALQKDQALQTQPQYVAFSFVYILTGLTVIGAFLNLVVLRFMTMNAEDEKRDAEHRALLTHNGQAGGLGGLSCLSGSLGDGVRPRDPVTCAAAAGGMGVGVGVGGSGFRNVYAEMLHFQSMCSCLWYKSREKLQYSIPMIIPRDLSTSDTCVEHSHSSPGGGGRYSDTPSHPCLCSGTQRSAISSVSTGLHSLATFRGLMKRRSSV
+ CHEMBL1098,CCCCN1CCCCC1C(=O)Nc1c(C)cccc1C,Q9ES08,MKRQNVRTLSLIACTFTYLLVGAAVFDALESDHEMREEEKLKAEEVRLRGKYNISSDDYQQLELVILQSEPHRAGVQWKFAGSFYFAITVITTIGYGHAAPGTDAGKAFCMFYAVLGIPLTLVMFQSLGERMNTFVRYLLKRIKKCCGMRNTEVSMENMVTVGFFSCMGTLCLGAAAFSQCEDWSFFHAYYYCFITLTTIGFGDFVALQSKGALQRKPFYVAFSFMYILVGLTVIGAFLNLVVLRFLTMNTDEDLLEGEVAQILAGNPRRVVVRVPQSRKRHHPMYFLRKYGRTLCYLCFPGANWGDDDDDDDDAVENVVVTTPVPPAVAAAAAAATPGPSTRNVRATVHSVSCRVEEIPPDVLRNTYFRSPFGAIPPGMHTCGENHRLHIRRKSI