Iter-4360dd15-0038-crossref-human-trial-filter-method
knowledge experiment fact 4360dd15
Progress this round
Ran a local filter over the Crossref machine-readable references (DOI 10.1016/S2666-7568(23)00258-1, 75 entries) and obtained a reproducible "rapalog + human trial-ish" candidate set.
Reproducible method
import re
import requests

doi = '10.1016/S2666-7568(23)00258-1'
msg = requests.get(f'https://api.crossref.org/works/{doi}', timeout=30,
                   headers={'User-Agent': 'Mozilla/5.0'}).json()['message']
refs = msg['reference']  # list of reference dicts from Crossref

# Broader regex patterns; defined here but not used by the filter below.
kw = re.compile(r'rapamycin|sirolimus|everolimus|temsirolimus|rapalog|mtor inhibitor|mammalian target of rapamycin', re.I)
human = re.compile(r'clinical|trial|patients?|subjects?|healthy volunteers?|randomi[sz]ed|placebo|double blind|phase\s*[1-4]|crossover|human|proof-of-concept|pilot|feasibility|futility', re.I)

TRIAL_TERMS = ['trial', 'randomized', 'randomised', 'placebo', 'phase',
               'proof-of-concept', 'feasibility', 'futility', 'extension']
DRUG_TERMS = ['rapamycin', 'sirolimus', 'everolimus', 'temsirolimus']

trialish = []
for r in refs:
    # Drug terms are searched across all text fields; trial terms only in the title.
    txt = ' | '.join(str(r.get(k, '')) for k in ['article-title', 'journal-title', 'author', 'DOI', 'unstructured'])
    title = (r.get('article-title') or r.get('unstructured') or '').lower()
    if any(t in title for t in TRIAL_TERMS) and any(dr in txt.lower() for dr in DRUG_TERMS):
        trialish.append(r)
print('trial_title_count', len(trialish))
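The matching rule can also be isolated into a pure function so that it can be checked offline, without calling the API. The sketch below is illustrative only: the names `TRIAL_TERMS`, `DRUG_TERMS`, and `is_trialish` are assumptions introduced here, and the second sample entry is a made-up placeholder. The first sample title is the animal study reported in the results of this round.

```python
TRIAL_TERMS = ['trial', 'randomized', 'randomised', 'placebo', 'phase',
               'proof-of-concept', 'feasibility', 'futility', 'extension']
DRUG_TERMS = ['rapamycin', 'sirolimus', 'everolimus', 'temsirolimus']

def is_trialish(ref):
    """Apply the same two substring checks as the loop above to one reference dict."""
    title = (ref.get('article-title') or ref.get('unstructured') or '').lower()
    txt = ' | '.join(
        str(ref.get(k, ''))
        for k in ['article-title', 'journal-title', 'author', 'DOI', 'unstructured']
    ).lower()
    return any(t in title for t in TRIAL_TERMS) and any(d in txt for d in DRUG_TERMS)

samples = [
    # Title reported in the results of this round.
    {'article-title': 'Mechanisms of life span extension by rapamycin '
                      'in the fruit fly Drosophila melanogaster'},
    # Made-up placeholder, not an entry from the reference list.
    {'article-title': 'A made-up title about something unrelated'},
]
flags = [is_trialish(s) for s in samples]
print(flags)
```

Because 'extension' is one of the title terms, "life span extension" passes the title check, which is why a manual review of the hits is still needed.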
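Results
- Of the 75 references, trial_title_count = 14.
- 1 of these is an animal study: Mechanisms of life span extension by rapamycin in the fruit fly Drosophila melanogaster (its title matches on 'extension' and 'rapamycin').
- This leaves 13 human trial / clinical candidate articles.
Key observation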
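These 13 include several follow-up / extension / postextension articles (for example the EXIST-3 series and the RA low-dose sirolimus follow-up), so the number of articles may differ from the number of independent study families.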
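Conclusion
This round reliably reduced the Crossref reference array to a clinical candidate set small enough to check by hand. The next step is to deduplicate these candidates by study family and verify whether they converge to exactly the target of 11.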
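One possible shape for that deduplication step is sketched below. It is a rough heuristic, not a method used in this round: it assumes a study family can be recognised by a trial acronym in the title (three or more capitals with an optional "-digits" suffix, as in EXIST-3), and it leaves titles without an acronym as their own family for manual merging. The names `ACRONYM`, `family_key`, and `group_by_family` are hypothetical, and the sample titles are placeholders rather than actual entries from the reference list.

```python
import re
from collections import defaultdict

# Heuristic (an assumption): a trial acronym is 3+ capitals with an optional "-<digits>" suffix.
ACRONYM = re.compile(r'\b[A-Z]{3,}(?:-\d+)?\b')

def family_key(ref):
    """Return a coarse study-family key for one Crossref reference dict."""
    title = ref.get('article-title') or ref.get('unstructured') or ''
    m = ACRONYM.search(title)
    # No acronym found: keep the title as its own family, to be merged by hand.
    return m.group(0) if m else title.strip().lower()

def group_by_family(refs):
    """Group reference dicts by their family key."""
    families = defaultdict(list)
    for ref in refs:
        families[family_key(ref)].append(ref)
    return dict(families)

# Placeholder titles for illustration only; not actual entries from the reference list.
candidates = [
    {'article-title': 'EXIST-3 core phase (placeholder title)'},
    {'article-title': 'EXIST-3 extension phase (placeholder title)'},
    {'article-title': 'Low-dose sirolimus pilot (placeholder title)'},
]
families = group_by_family(candidates)
print(len(candidates), 'articles ->', len(families), 'families')
```

Any capitalised abbreviation in a title (a disease or gene name, for instance) would also be picked up as a key, so the grouping is a starting point for manual review and not a final family count.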