-
rdkit
2020-12-09 14:04:11 Not sure how to rebase yet, so just cherry-picked from another branch (rdkit). This question comes from the open-source project: ParmEd/ParmEd -
RDKit RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. BSD license - a business friendly license for open source Core data structures and ...
-
RDKit | Drawing chemical reactions with RDKit
2019-12-24 21:34:01
Import libraries
from rdkit import RDConfig
import unittest
import random
from rdkit import Chem
from rdkit.Chem import Draw, AllChem
from rdkit.Chem.Draw import rdMolDraw2D
from rdkit import Geometry
%matplotlib inline
from numpy.polynomial.polynomial import polyfit
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib
from IPython.display import SVG, display
import seaborn as sns; sns.set(color_codes=True)
Define the reaction
# Reactants > agents > products, written as atom-mapped SMILES
rxn = AllChem.ReactionFromSmarts('[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][NH2:6]>CC(O)C.[Pt]>[CH3:1][C:2](=[O:3])[NH:6][CH3:5].[OH2:4]', useSmiles=True)
d = Draw.MolDraw2DSVG(900, 300)
d.DrawReaction(rxn)
d.FinishDrawing()
Draw the reaction
Define and draw the reaction with highlighting
rxn = AllChem.ReactionFromSmarts('[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][NH2:6]>CC(O)C.[Pt]>[CH3:1][C:2](=[O:3])[NH:6][CH3:5].[OH2:4]', useSmiles=True)
# One RGB color per reactant; the drawer cycles through this list
colors = [(0.3,0.7,0.9), (0.9,0.7,0.9), (0.6,0.9,0.3), (0.9,0.9,0.1)]
d = Draw.MolDraw2DSVG(900, 300)
d.DrawReaction(rxn, highlightByReactant=True, highlightColorsReactants=colors)
d.FinishDrawing()
svg = d.GetDrawingText()
# Strip the 'svg:' namespace prefix so the notebook can render the markup
svg2 = svg.replace('svg:', '')
svg3 = SVG(svg2)
display(svg3)
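The first snippet above finishes the drawing but never retrieves it. As a minimal sketch outside a notebook, the same reaction can be parsed, its template counts checked, and the SVG text captured as a string. The variable names here are illustrative only, and the drawer is imported from rdMolDraw2D directly.

```python
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import rdMolDraw2D

# Same amide-formation reaction as above; agents sit between the two '>' signs.
rxn = AllChem.ReactionFromSmarts(
    '[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][NH2:6]>CC(O)C.[Pt]>'
    '[CH3:1][C:2](=[O:3])[NH:6][CH3:5].[OH2:4]',
    useSmiles=True)

# Count the templates on each side of the reaction arrow.
n_reactants = rxn.GetNumReactantTemplates()
n_agents = rxn.GetNumAgentTemplates()
n_products = rxn.GetNumProductTemplates()
print(n_reactants, n_agents, n_products)

# Render to SVG and keep the markup as a plain string.
drawer = rdMolDraw2D.MolDraw2DSVG(900, 300)
drawer.DrawReaction(rxn)
drawer.FinishDrawing()
svg_text = drawer.GetDrawingText()
```

The string in svg_text can be written to a .svg file or passed to IPython's SVG display as in the highlighted example.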
References
1. https://www.rdkit.org/docs/source/rdkit.Chem.rdChemReactions.html
3. https://www.kesci.com/home/project/5c7685191ce0af002b556cc5
-
RDKit references
2020-12-28 09:34:48 My article uses RDKit, how do I add references? This question comes from the open-source project: rdkit/rdkit -
rdkit interoperability
2020-12-31 10:49:17 s a bunch of cool stuff in rdkit which isn't even close to being in MDA (and vice versa) so rather than reinvent wheels, it would be cool to do: python import MDAnalysis as mda from ... -
RDKit | Working with RDKit Mol objects
2020-01-15 11:11:07
Composition of a Mol object
A Mol object represents a molecule, so it is naturally made up of atoms, and the molecule is formed by the bonds between those atoms.
Import libraries
from rdkit import rdBase, Chem
from rdkit.Chem import AllChem, Draw
print('rdkit version: {}'.format(rdBase.rdkitVersion))
rdkit version: 2019.09.3
Load the data
suppl = Chem.SDMolSupplier('sdf_20200114172835.sdf')
# The supplier yields None for records it cannot parse, so filter them out
mols = [x for x in suppl if x is not None]
len(mols)
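The SDF file used above is not included here, so the following sketch shows the same None-filtering idea with SMILES strings instead. The three input strings are made up for illustration; the last one is deliberately malformed.

```python
from rdkit import Chem, RDLogger

# Silence the parse error that RDKit logs for the malformed entry.
RDLogger.DisableLog('rdApp.error')

smiles_list = [
    'CC(=O)Oc1ccccc1C(=O)OC',  # methyl acetylsalicylate
    'COc1ccccc1C(=O)OC',       # methyl o-anisate
    'c1ccccc1C(',              # deliberately malformed (unclosed branch)
]

# MolFromSmiles returns None on failure, just like a supplier entry.
parsed = [Chem.MolFromSmiles(s) for s in smiles_list]
mols_ok = [m for m in parsed if m is not None]
print(len(parsed), len(mols_ok))
```

Skipping the filter would cause an AttributeError the first time a method is called on a None entry.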
Atom objects
Mol.GetAtoms()
Mol.GetAtomWithIdx(idx)
# Print the indices of sp3 carbons in the first five molecules
for mol in mols[:5]:
    print(mol.GetProp('PRODUCT_NAME'))
    for a in mol.GetAtoms():
        if a.GetSymbol() == 'C' and str(a.GetHybridization()) == 'SP3':
            print('index for sp3 carbon: {}'.format(a.GetIdx()))
    print('###')
Methyl Acetylsalicylate
index for sp3 carbon: 9
index for sp3 carbon: 12
###
Methyl o-Anisate
index for sp3 carbon: 9
index for sp3 carbon: 11
###
Dimethyl 4-Acetoxyisophthalate
index for sp3 carbon: 12
index for sp3 carbon: 13
index for sp3 carbon: 16
###
Ethyl Acetylsalicylate
index for sp3 carbon: 10
index for sp3 carbon: 11
index for sp3 carbon: 13
###
(+)-Bicuculline
index for sp3 carbon: 2
index for sp3 carbon: 4
index for sp3 carbon: 5
index for sp3 carbon: 8
index for sp3 carbon: 10
index for sp3 carbon: 16
index for sp3 carbon: 25
###
Bond objects
Mol.GetBonds()
Mol.GetBondWithIdx(idx)
Mol.GetBondBetweenAtoms()
# Print the aromatic bonds in the last three molecules
for mol in mols[-3:]:
    print(mol.GetProp('PRODUCT_NAME'))
    for b in mol.GetBonds():
        if b.GetIsAromatic():
            print('bond between {}-{} is aromatic.'.format(b.GetBeginAtomIdx(), b.GetEndAtomIdx()))
    print('###')
2-Ethylhexyl Salicylate
bond between 3-4 is aromatic.
bond between 3-8 is aromatic.
bond between 4-5 is aromatic.
bond between 5-6 is aromatic.
bond between 6-7 is aromatic.
bond between 7-8 is aromatic.
###
Propyl Salicylate
bond between 3-4 is aromatic.
bond between 3-8 is aromatic.
bond between 4-5 is aromatic.
bond between 5-6 is aromatic.
bond between 6-7 is aromatic.
bond between 7-8 is aromatic.
###
3,3,5-Trimethylcyclohexyl Salicylate (cis- and trans- mixture)
bond between 0-1 is aromatic.
bond between 1-2 is aromatic.
bond between 2-3 is aromatic.
bond between 3-4 is aromatic.
bond between 4-5 is aromatic.
bond between 0-5 is aromatic.
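###
Properties
GetProp(name)
SetProp(name, value)
GetPropNames()
Arbitrary values, called properties, can be attached to a Mol object. Use SetProp to set a property and GetProp to read it back. If properties are already defined in the source SDF, their values are set when the object is created.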
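As a small sketch of the accessors just described, properties can also be set by hand on a molecule built from SMILES. The property values below are copied from the SDF output in this post; the molecule itself is reconstructed from its name, which is an assumption.

```python
from rdkit import Chem

# Methyl acetylsalicylate, written by hand rather than read from the SDF.
mol = Chem.MolFromSmiles('CC(=O)Oc1ccccc1C(=O)OC')

# Attach string properties to the molecule.
mol.SetProp('PRODUCT_NAME', 'Methyl Acetylsalicylate')
mol.SetProp('PRODUCT_NUMBER', 'A0114')

# Read every public property back into a dictionary.
props = {name: mol.GetProp(name) for name in mol.GetPropNames()}
print(props)

# HasProp avoids the KeyError that GetProp raises for a missing name.
has_cas = bool(mol.HasProp('CAS_NUMBER'))
print(has_cas)
```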
for p in mols[0].GetPropNames():
    print('{}: {}'.format(p, mols[0].GetProp(p)))
PRODUCT_NUMBER: A0114
PRODUCT_NAME: Methyl Acetylsalicylate
MOLECULAR_FORMULA: C10H10O4
MOLECULAR_WEIGHT: 194.19
CAS_NUMBER: 580-02-9
MDL_NUMBER: MFCD00014978
2D structure depiction of molecules
1) Drawing a single molecule
a. Chem.Draw.MolToImage(mol)
b. Chem.Draw.MolToFile(mol, file_name)
2) Drawing multiple molecules
a. Chem.Draw.MolsToImage(mols)
b. Chem.Draw.MolsToGridImage(mols, molsPerRow, subImgSize, legends)
All of the molecules handled here already have atomic coordinates. If coordinates have not been initialized, compute them with AllChem.Compute2DCoords.
Draw.MolToImage(mols[0])
Draw.MolsToGridImage(mols[:9], molsPerRow=3, subImgSize=(300,200), legends=[x.GetProp('PRODUCT_NUMBER') for x in mols[:9]])
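A molecule built from SMILES starts without coordinates, which is the case the note above refers to. The sketch below checks the conformer count before and after AllChem.Compute2DCoords and then draws to SVG. It uses the SVG drawer rather than MolToImage, purely so that no image library is required.

```python
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import rdMolDraw2D

mol = Chem.MolFromSmiles('OC(=O)c1ccccc1O')  # salicylic acid

# A SMILES-derived molecule carries no conformer until one is computed.
n_before = mol.GetNumConformers()
AllChem.Compute2DCoords(mol)
n_after = mol.GetNumConformers()
print(n_before, n_after)

# Draw the molecule with its freshly computed 2D coordinates.
drawer = rdMolDraw2D.MolDraw2DSVG(300, 200)
drawer.DrawMolecule(mol)
drawer.FinishDrawing()
svg = drawer.GetDrawingText()
```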
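Aligning depictions to a template structure
Because the compounds are oriented differently in the grid image, similar structures are hard to recognize at a glance. So let's orient the molecules using salicylic acid as a template, with GenerateDepictionMatching2DStructure from Chem.AllChem. All that is needed is to create a Mol object for the template and compute its 2D coordinates.
import pubchempy as pcp
# Look up the template structure by name on PubChem
tmp = pcp.get_compounds('salicylic acid', 'name')
tmp = tmp[0]
tmp_smiles = tmp.canonical_smiles
template = Chem.MolFromSmiles(tmp_smiles)
AllChem.Compute2DCoords(template)
# Re-depict only the molecules that contain the template
for mol in mols:
    if mol.HasSubstructMatch(template):
        AllChem.GenerateDepictionMatching2DStructure(mol, template)
Draw.MolsToGridImage(mols[:9], molsPerRow=3, subImgSize=(300,200), legends=[x.GetProp('PRODUCT_NUMBER') for x in mols[:9]])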
Substructure handling
aspirin = pcp.get_compounds('aspirin', 'name')
aspirin = aspirin[0]
aspirin_sm = aspirin.canonical_smiles
aspirin_mol = Chem.MolFromSmiles(aspirin_sm)
AllChem.Compute2DCoords(aspirin_mol)
# Collect the molecules that contain aspirin as a substructure
match = []
for mol in mols:
    if mol.HasSubstructMatch(aspirin_mol):
        match.append(mol)
print(len(match)) # 6
Draw.MolsToGridImage(match, molsPerRow=3, subImgSize=(300,200), legends=[x.GetProp('PRODUCT_NAME') for x in match])
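The substructure test can be tried without PubChem or the SDF file. In this sketch the SMILES strings are written by hand (an assumption, not taken from the lookups above), and salicylic acid serves as the query.

```python
from rdkit import Chem

template = Chem.MolFromSmiles('OC(=O)c1ccccc1O')       # salicylic acid (query)
aspirin = Chem.MolFromSmiles('CC(=O)Oc1ccccc1C(=O)O')  # contains the query
benzoic = Chem.MolFromSmiles('OC(=O)c1ccccc1')         # lacks the ortho oxygen

# A query built from SMILES matches on heavy atoms and bonds, not H counts,
# so the acetylated oxygen in aspirin still matches the phenol oxygen.
has_aspirin = aspirin.HasSubstructMatch(template)
has_benzoic = benzoic.HasSubstructMatch(template)

# GetSubstructMatch returns the matched atom indices in query-atom order.
match_idx = aspirin.GetSubstructMatch(template)
print(has_aspirin, has_benzoic, match_idx)
```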
References
1. http://www.rdkit.org/docs/index.html
2. http://www.rdkit.org/docs/api-docs.html
DrugAI -
RDKit | Compound preprocessing with RDKit
2020-09-16 19:20:33
Preprocessing compound data with Python and RDKit.
Environment
MolVS is a library dedicated to compound preprocessing.
- rdkit 2020.03
- molvs 0.1.1
Compound preprocessing
RDKit: SanitizeMol
Kekulization, valence checking, aromaticity perception, conjugation, and so on.
Reference: http://rdkit.org/docs/source/rdkit.Chem.rdmolops.html
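To see what sanitization does, one option is to parse with sanitize=False and call Chem.SanitizeMol afterwards. The sketch below uses a Kekulé benzene and a pentavalent carbon as hand-picked examples; neither comes from the original post.

```python
from rdkit import Chem, RDLogger

# Silence the valence error that RDKit logs for the invalid structure.
RDLogger.DisableLog('rdApp.error')

# Kekule-form benzene, parsed without any sanitization.
raw = Chem.MolFromSmiles('C1=CC=CC=C1', sanitize=False)
aromatic_before = [a.GetIsAromatic() for a in raw.GetAtoms()]

# SanitizeMol modifies the molecule in place; aromaticity is perceived here.
Chem.SanitizeMol(raw)
aromatic_after = [a.GetIsAromatic() for a in raw.GetAtoms()]
print(aromatic_before, aromatic_after)

# With default sanitization, a carbon with five bonds fails the valence check
# and MolFromSmiles returns None.
bad = Chem.MolFromSmiles('CC(C)(C)(C)C')
print(bad is None)
```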
MolVS: Normalize
Reference: https://molvs.readthedocs.io/en/latest/guide/standardize.html
Applies a series of transformations to fix common errors and standardize functional groups.
from rdkit import Chem
from molvs.normalize import Normalizer, Normalization
old_smiles = "[Na]OC(=O)c1ccc(C[S+2]([O-])([O-]))cc1"
print("PREV:" + old_smiles)
old_mol = Chem.MolFromSmiles(old_smiles)
# Restrict the normalizer to a single rule
normalizer = Normalizer(normalizations=[Normalization('Sulfone to S(=O)(=O)', '[S+2:1]([O-:2])([O-:3])>>[S+0:1](=[O-0:2])(=[O-0:3])')])
new_mol = normalizer.normalize(old_mol)
new_smiles = Chem.MolToSmiles(new_mol)
print("NEW:" + new_smiles)
The code above selectively applies only the normalization named 'Sulfone to S(=O)(=O)'.
If a Normalizer is created without arguments, all of the normalizations predefined in MolVS are applied.
The result is shown below: the charges on the sulfur and oxygen atoms have changed.
PREV:[Na]OC(=O)c1ccc(C[S+2]([O-])([O-]))cc1
NEW:O=C(O[Na])c1ccc(C[S](=O)=O)cc1
MolVS: TautomerCanonicalizer
Reference: https://molvs.readthedocs.io/en/latest/guide/tautomer.html
Tautomers are a set of molecules that readily interconvert through the migration of a hydrogen atom.
from rdkit import Chem
from molvs.tautomer import TAUTOMER_TRANSFORMS, TAUTOMER_SCORES, MAX_TAUTOMERS, TautomerCanonicalizer, TautomerEnumerator, TautomerTransform
# Canonicalizer restricted to a single transform
tautomerCanonicalizer = TautomerCanonicalizer((
    TautomerTransform('1,7 aromatic heteroatom H shift r', '[#7,S,O,Se,Te,CX4;!H0]-[#6,#7X2]=[#6]-[#6,#7X2]=[#6,#7X2]-[#6,#7X2]=[NX2,S,O,Se,Te]'),
))
mol = Chem.MolFromSmiles("O=C1CC=CC=C1")
print("prev:" + Chem.MolToSmiles(mol))
mol2 = tautomerCanonicalizer.canonicalize(mol)
print("after: "+ Chem.MolToSmiles(mol2))
prev:O=C1C=CC=CC1
after: Oc1ccccc1
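RDKit also ships its own standardization module, rdkit.Chem.MolStandardize.rdMolStandardize, which is swapped in here in place of MolVS. The sketch below assumes RDKit 2020.03 or later and uses the 2-hydroxypyridine / 2-pyridone pair as a hand-picked example. It only checks that both tautomers collapse to the same canonical form, without claiming which form is chosen.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

# Default enumerator with RDKit's built-in transform and scoring rules.
enumerator = rdMolStandardize.TautomerEnumerator()

# Two tautomers of the same compound: 2-hydroxypyridine and 2-pyridone.
forms = ['Oc1ccccn1', 'O=c1cccc[nH]1']

canonical = []
for smi in forms:
    mol = Chem.MolFromSmiles(smi)
    canon_mol = enumerator.Canonicalize(mol)
    canonical.append(Chem.MolToSmiles(canon_mol))

# Both inputs should map to one canonical tautomer.
print(canonical)
```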
MolVS: LargestFragmentChooser
Reference: https://molvs.readthedocs.io/en/latest/api.html#molvs-fragment
When the input contains multiple fragments, it returns the largest one.
from rdkit import Chem
from molvs.fragment import LargestFragmentChooser
fragmentChooser1 = LargestFragmentChooser()
old_smiles = "O=S(=O)(Cc1[nH]c(-c2ccc(Cl)s2)c[s+]1)c1cccs1.[Br-]"
print("prev:" + old_smiles)
mol = Chem.MolFromSmiles(old_smiles)
# The chooser is callable and returns the largest fragment as a new Mol
mol2 = fragmentChooser1(mol)
print("after:" + Chem.MolToSmiles(mol2))
prev:O=S(=O)(Cc1[nH]c(-c2ccc(Cl)s2)c[s+]1)c1cccs1.[Br-]
after:O=S(=O)(Cc1[nH]c(-c2ccc(Cl)s2)c[s+]1)c1cccs1
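A rough approximation of the same idea is possible with plain RDKit: split the molecule into fragments and keep the one with the most heavy atoms. This is a simplified rule chosen for illustration, not MolVS's own ranking, and pyridinium chloride is a made-up test input.

```python
from rdkit import Chem

# Pyridinium chloride: an organic cation plus a counter-ion.
salt = Chem.MolFromSmiles('c1cccc[nH+]1.[Cl-]')

# Split the disconnected components into separate Mol objects.
frags = Chem.GetMolFrags(salt, asMols=True)

# Simplified choice: the fragment with the most heavy atoms wins.
largest = max(frags, key=lambda m: m.GetNumHeavyAtoms())
largest_smiles = Chem.MolToSmiles(largest)
print(len(frags), largest_smiles)
```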
MolVS: Uncharger
Reference: https://molvs.readthedocs.io/en/latest/api.html#molvs-charge
Attempts to neutralize ionized acids and bases on a molecule.
from rdkit import Chem
from molvs.charge import Reionizer, Uncharger
uncharger = Uncharger()
mol = Chem.MolFromSmiles("c1cccc[nH+]1")
print("prev:" + Chem.MolToSmiles(mol))
mol2 = uncharger(mol)
print("after:" + Chem.MolToSmiles(mol2))
prev:c1cc[nH+]cc1
after:c1ccncc1
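The same neutralization can be sketched with RDKit's own rdMolStandardize.Uncharger, swapped in here for the MolVS class. The acetate input is an extra example added for illustration, and the check is limited to the net formal charge.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

uncharger = rdMolStandardize.Uncharger()

# A protonated base and a deprotonated acid.
inputs = ['c1cccc[nH+]1', 'CC(=O)[O-]']

charges = []
for smi in inputs:
    mol = Chem.MolFromSmiles(smi)
    neutral = uncharger.uncharge(mol)
    # Net formal charge of the whole molecule after neutralization.
    charges.append(Chem.GetFormalCharge(neutral))
    print(smi, '->', Chem.MolToSmiles(neutral))
```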
-
Installing rdkit
2020-11-30 19:17:26
No exaggeration, this is the smoothest rdkit installation I have used
Source: https://anaconda.org/rdkit/rdkit -
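rdkit: official resources for the RDKit library (source code)
2021-03-26 00:48:07 RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. - Business-friendly open-source license; core data structures and algorithms in C++; generated using Boost.Python; Java and C# wrappers generated with SWIG; 2D and 3D molecular operations; for machine learning and generation ... -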
rdkit calculations
2021-03-02 15:51:22 import rdkit import pandas as pd from rdkit import Chem from rdkit.Chem import Descriptors from rdkit.ML.Descriptors import MoleculeDescriptors path='POSCAR-2' mols=[] files= os.listdir(path) for ... -
Issue Importing rdkit
2021-01-06 19:06:09 Failure: ImportError (No module named rdkit) ... ERROR Failure: ImportError (No module named rdkit) ... ERROR Failure: ImportError (No module named rdkit) ... ERROR Failure: ImportError (No module ... -
Installing rdkit
2021-03-08 13:43:39 Installing RDKit with pip or conda kept failing; try the following command: conda install -c conda-forge rdkit -
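After installing, a quick way to confirm that the active interpreter can actually find the package is to ask the import system. This is a minimal sketch using only the standard library; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Prints True only if rdkit is visible to this Python environment.
print(is_installed('rdkit'))
```

If this prints False while conda reports the package as installed, the interpreter being run most likely belongs to a different environment.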
RDKit | Converting SMILES to canonical SMILES with RDKit
2020-03-24 15:21:06 Converting a SMILES string to a canonical SMILES string with RDKit. Import libraries: from rdkit import Chem from rdkit.Chem import Draw from rdkit.Chem.Draw import IPythonConsole Convert SMILES to an RDKit Mol object: testsmi = '[H][C@@]12... -
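The test molecule in the snippet above is truncated, so the following sketch uses ethanol instead (an assumption made for illustration). It shows that different spellings of one structure collapse to a single canonical SMILES.

```python
from rdkit import Chem

# Three valid SMILES spellings of ethanol.
variants = ['OCC', 'C(O)C', 'CCO']

# MolToSmiles writes canonical SMILES by default.
canonical = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in variants}
print(canonical)
```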
RDKit | Pharmacophore features in RDKit
2020-04-01 17:21:10from rdkit import RDConfig from rdkit.Chem import AllChem from rdkit import Chem fdef = AllChem.BuildFeatureFactory(os.path.join(RDConfig.RDDataDir,'BaseFeatures.fdef')) print(fd... -
RDKit | Generating random SMILES with RDKit
2020-02-27 15:13:06 Environment: OS: Windows 10 (x64); Python: Python 3.7; RDKit: 2019.09.3. Generating random SMILES with RDKit ...from rdkit import Chem from rdkit.Chem import Draw from rdkit.Chem.Draw import IPythonCon... -
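The linked post's code is truncated, so this is only a sketch of one common way to generate random SMILES: the doRandom flag of Chem.MolToSmiles. Aspirin is used as an arbitrary example molecule.

```python
from rdkit import Chem

mol = Chem.MolFromSmiles('CC(=O)Oc1ccccc1C(=O)O')  # aspirin
canonical = Chem.MolToSmiles(mol)

# doRandom=True starts the SMILES from a random atom with random branching.
random_smiles = [Chem.MolToSmiles(mol, doRandom=True) for _ in range(5)]
for s in random_smiles:
    print(s)

# Every random string still encodes the same molecule.
round_trip = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in random_smiles}
```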
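RDKit | Working with molecule Mol objects in RDKit
2020-03-23 14:15:28 from rdkit import rdBase, Chem from rdkit.Chem import AllChem, Draw print('rdkit version: {}'.format(rdBase.rdkitVersion)) Load the data: suppl = Chem.SDMolSupplier('sdf_20191011152835.sdf') m... -
RDKit: Computing USRCAT with RDKit
2018-08-10 21:47:37 USRCAT is a shape-based method that runs very fast. The code is freely available; users need to install it before use. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3505738/ RDKit code: ...from rdkit import Chem... -
Getting started with rdkit
2019-06-26 20:31:40 The RDKit package for Python is a very practical library that connects chemistry with machine learning. It converts between many chemical file formats, such as mol2, mol, SMILES, and sdf, and can render them in 2D, 3D, and other forms for developers. 1. Generating depictions: 2D molecules from rdkit... -
RDKit | Neutralizing charged molecules with RDKit
2020-03-26 21:32:26 Compound preprocessing ...from rdkit import Chem from rdkit.Chem import MolStandardize Load the data: smis = ("c1cccc[nH+]1", "C[N+](C)(C)C", "c1ccccc1[NH3+]", "CC(=O)[O-]", "c1ccccc1[O-]", "CCS... -
RDKit | Drawing molecules in black and white with RDKit
2020-05-17 00:26:04 Drawing molecules in black and white with RDKit. Import libraries: from rdkit import Chem from rdkit.Chem import Draw from rdkit.Chem.Draw import IPythonConsole import rdkit rdkit.__version__ 2020.03.1 Load the data: ms = ... -
RDKit | A detailed look at RDKit molecular structure depictions
2020-06-20 16:59:29 RDKit has several drawing engines, and structures drawn with different methods differ in appearance. This post takes a closer look at RDKit structure depictions and explains the SVG drawing method, available since the 2015.03 release. There may be a lot of detail, but understanding what happens behind the scenes is usually very... -
RDKit | Computing 3D pharmacophore fingerprints with RDKit
2020-03-18 20:27:50 from rdkit import Chem, DataStructs, RDConfig from rdkit.Chem import AllChem from rdkit.Chem.Pharm2D import Gobbi_Pharm2D, Generate Load the data and generate a 3D structure: mol = Chem.MolFromSmiles( '... -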
RDKit | Getting a molecule's 3D distance matrix with RDKit
2020-03-22 16:36:46 from rdkit import Chem from rdkit.Chem import AllChem from rdkit.Chem import rdDistGeom as molDG mol = Chem.MolFromSmiles('CCC') bm = molDG.GetMoleculeBoundsMatrix(mol) bm Out[]: array([[0. ... -
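The output in the snippet above is cut off, so the sketch below only inspects the structure of the matrix for the same propane example. It relies on RDKit's convention that the upper triangle holds upper distance bounds and the lower triangle holds lower bounds.

```python
from rdkit import Chem
from rdkit.Chem import rdDistGeom

mol = Chem.MolFromSmiles('CCC')  # propane, heavy atoms only

# One row and column per atom; returned as a NumPy array.
bm = rdDistGeom.GetMoleculeBoundsMatrix(mol)
n_atoms = mol.GetNumAtoms()

# For the bonded pair (0, 1): bm[0, 1] is the upper bound, bm[1, 0] the lower.
upper_01 = bm[0, 1]
lower_01 = bm[1, 0]
print(bm.shape, lower_01, upper_01)
```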
RDKit | Converting amino acid sequences to SMILES with RDKit
2020-12-30 13:31:45 from rdkit import Chem from rdkit.Chem import Draw from rdkit.Chem.Draw import IPythonConsole IPythonConsole.ipython_useSVG = True Load the data: peptide_smiles = Chem.MolToSmiles(Chem.MolFromFASTA(... -
RDKit | Drawing molecular similarity graphs with RDKit and Cytoscape
2020-03-23 17:41:55 import os import numpy as np import igraph from py2cytoscape import util from cyjupyter import Cytoscape from rdkit import Chem from rdkit.Chem import DataStructs from rdkit.Chem import AllChem fr... -
RDKit | Manipulating molecular structures with RDKit (scaffold transformation)
2021-01-15 20:56:32 RDKit 2020.09.1, Python=3.7.9. Manipulating molecular structures with RDKit. Import libraries: import pandas as pd import numpy as np from rdkit import Chem from rdkit.Chem import AllChem, Draw Converting between Mol objects and SMILES: mol = Chem.... -
RDKit | Extracting 3D pharmacophore features from molecules with RDKit
2020-03-22 16:37:45 Extracting 3D pharmacophore features from molecules. Import libraries: import os from rdkit import Geometry from rdkit import RDConfig from rdkit.Chem import AllChem from rdkit.Chem import ...from rdkit.Chem.Pharm3D import Pharmacophore... -
rdkit_nim: Nim bindings for the C++ cheminformatics toolkit RDKit (source code)
2021-02-04 11:18:24 rdkit_nim: Nim bindings for the C++ cheminformatics toolkit RDKit
RDKit | New similarity map functions in RDKit (2019.09)
2020-04-26 16:46:53 New similarity map functions in RDKit (2019.09). Import libraries: from rdkit import Chem from rdkit.Chem import Draw from rdkit.Chem.Draw import SimilarityMaps from IPython.display import SVG import numpy as np import rdkit...