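Publication Intelligence
Scientific publications often reveal competing research directions years before patents or clinical trials do. By linking publications to authors, institutions, genes, and diseases, and to clinical variants from databases such as ClinVar, companies can identify emerging therapeutic targets, track institutional areas of expertise, and spot collaboration opportunities. This intelligence is especially valuable in pharma and biotech, where publications can signal strategic intent years before clinical data or a product launch, letting companies anticipate competitive dynamics in a therapeutic area and adjust their R&D priorities accordingly.
Scenario
A pharmaceutical competitive intelligence team monitoring Alzheimer's disease research is facing information overload: thousands of fragmented PubMed search results make it impractical to answer strategic questions such as "Which institutions lead TREM2 research?" or "Which new gene-disease associations are validated by clinical evidence?" The team needs to connect these publications with patents, clinical trials, and other research to understand the competitive landscape and identify potential collaboration opportunities.
Solution
The graph connects publications, authors, institutions, genes, proteins, diseases, and SNPs (single nucleotide polymorphisms) in one unified model. A single query can then traverse complex paths such as "Institution -> Author -> Publication -> Gene -> Clinical variant -> Disease", revealing patterns that are not visible in traditional databases.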
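As a rough illustration of such a traversal, the sketch below walks from institutions to diseases in one query. It assumes the labels and relationship types used by the demo data set in this article. Note that in this model the disease is reached through the publication that mentions the variant, not through the variant itself.

```cypher
// Sketch only: one query that spans institutions, authors, publications,
// clinical variants, genes, and diseases, using the demo model's labels.
MATCH
  (inst:Institution)<-[:AFFILIATED_WITH]-(author:Author)<-[:HAS_AUTHOR]-(pub:Publication),
  (pub)-[:MENTIONS_VARIANT]->(snp:SNP)-[:ASSOCIATED_WITH]->(gene:Gene),
  (pub)-[:MENTIONS]->(disease:Disease)
RETURN
  inst.name AS Institution,
  author.name AS Author,
  pub.pmid AS PMID,
  gene.symbol AS Gene,
  snp.rsid AS ClinicalVariant,
  disease.name AS Disease
ORDER BY Institution, Author, PMID;
```

Only publications that mention a variant appear in the result, so this is a narrower view than the per-gene queries later in the article.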
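Demo data
The demo data set contains:
- 6 academic and industry institutions
- 6 authors (with h-index and area of expertise)
- 4 top journals
- 3 major neurodegenerative diseases
- 5 genes and 4 proteins
- 3 SNPs (clinical variants from ClinVar)
- 5 keywords (research themes)
- 6 publications
- 4 citations between publications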
// ============================================
// MERGE INSTITUTIONS (Academic & Pharma)
// ============================================
MERGE (mit:Institution {name: 'MIT', type: 'Academic', country: 'US'})
MERGE (stanford:Institution {name: 'Stanford University', type: 'Academic', country: 'US'})
MERGE (ucl:Institution {name: 'University College London', type: 'Academic', country: 'UK'})
MERGE (dzne:Institution {name: 'DZNE', type: 'Academic', country: 'DE'})
MERGE (genentechRes:Institution {name: 'Genentech Research', type: 'Industry', country: 'US'})
MERGE (biogenRes:Institution {name: 'Biogen Research', type: 'Industry', country: 'US'})
// ============================================
// MERGE AUTHORS
// ============================================
MERGE (chen:Author {name: 'Dr. Li Chen', h_index: 45, expertise: 'Neurogenetics'})
MERGE (rodriguez:Author {name: 'Dr. Maria Rodriguez', h_index: 52, expertise: 'Immunology'})
MERGE (tanaka:Author {name: 'Dr. Kenji Tanaka', h_index: 38, expertise: 'Structural Biology'})
MERGE (schmidt:Author {name: 'Dr. Anna Schmidt', h_index: 41, expertise: 'Neuroscience'})
MERGE (williams:Author {name: 'Dr. James Williams', h_index: 35, expertise: 'Genomics'})
MERGE (kumar:Author {name: 'Dr. Priya Kumar', h_index: 29, expertise: 'Clinical Genetics'})
// Author-Institution relationships
MERGE (chen)-[:AFFILIATED_WITH]->(mit)
MERGE (rodriguez)-[:AFFILIATED_WITH]->(stanford)
MERGE (tanaka)-[:AFFILIATED_WITH]->(genentechRes)
MERGE (schmidt)-[:AFFILIATED_WITH]->(dzne)
MERGE (williams)-[:AFFILIATED_WITH]->(ucl)
MERGE (kumar)-[:AFFILIATED_WITH]->(biogenRes)
// ============================================
// MERGE JOURNALS
// ============================================
MERGE (nature:Journal {name: 'Nature', impact_factor: 49.9})
MERGE (cell:Journal {name: 'Cell', impact_factor: 41.6})
MERGE (natGenet:Journal {name: 'Nature Genetics', impact_factor: 31.7})
MERGE (neuron:Journal {name: 'Neuron', impact_factor: 16.2})
// ============================================
// MERGE DISEASES
// ============================================
MERGE (ad:Disease {name: "Alzheimer's Disease", icd10: 'G30', prevalence: '6.7M US'})
MERGE (pd:Disease {name: "Parkinson's Disease", icd10: 'G20', prevalence: '1M US'})
MERGE (ftd:Disease {name: 'Frontotemporal Dementia', icd10: 'G31.09', prevalence: '60K US'})
// ============================================
// MERGE GENES AND PROTEINS
// ============================================
MERGE (trem2_gene:Gene {symbol: 'TREM2', name: 'Triggering receptor expressed on myeloid cells 2', chromosome: '6'})
MERGE (apoe_gene:Gene {symbol: 'APOE', name: 'Apolipoprotein E', chromosome: '19'})
MERGE (app_gene:Gene {symbol: 'APP', name: 'Amyloid precursor protein', chromosome: '21'})
MERGE (mapt_gene:Gene {symbol: 'MAPT', name: 'Microtubule associated protein tau', chromosome: '17'})
MERGE (bin1_gene:Gene {symbol: 'BIN1', name: 'Bridging integrator 1', chromosome: '2'})
MERGE (trem2_prot:Protein {name: 'TREM2', uniprot: 'Q9NZC2', function: 'Immune receptor'})
MERGE (apoe_prot:Protein {name: 'ApoE', uniprot: 'P02649', function: 'Lipid transport'})
MERGE (tau_prot:Protein {name: 'Tau', uniprot: 'P10636', function: 'Microtubule binding'})
MERGE (abeta_prot:Protein {name: 'Amyloid-beta', uniprot: 'P05067', function: 'Peptide fragment'})
// Gene codes for Protein
MERGE (trem2_gene)-[:CODES_FOR]->(trem2_prot)
MERGE (apoe_gene)-[:CODES_FOR]->(apoe_prot)
MERGE (mapt_gene)-[:CODES_FOR]->(tau_prot)
MERGE (app_gene)-[:CODES_FOR]->(abeta_prot)
// ============================================
// MERGE SNPs (Clinical Variants from ClinVar)
// ============================================
MERGE (rs75932628:SNP {
rsid: 'rs75932628',
variant: 'R47H',
clinical_significance: 'Pathogenic',
review_status: '3-star',
allele_freq: 0.002
})
MERGE (rs429358:SNP {
rsid: 'rs429358',
variant: 'ε4 allele',
clinical_significance: 'Risk factor',
review_status: '4-star',
allele_freq: 0.14
})
MERGE (rs63750847:SNP {
rsid: 'rs63750847',
variant: 'A152T',
clinical_significance: 'Likely pathogenic',
review_status: '2-star',
allele_freq: 0.0001
})
// SNP-Gene associations
MERGE (rs75932628)-[:ASSOCIATED_WITH]->(trem2_gene)
MERGE (rs429358)-[:ASSOCIATED_WITH]->(apoe_gene)
MERGE (rs63750847)-[:ASSOCIATED_WITH]->(app_gene)
// ============================================
// MERGE KEYWORDS (Research Themes)
// ============================================
MERGE (neuroinflamm:Keyword {term: 'Neuroinflammation'})
MERGE (microglia:Keyword {term: 'Microglia'})
MERGE (geneticRisk:Keyword {term: 'Genetic Risk Factors'})
MERGE (proteinAgg:Keyword {term: 'Protein Aggregation'})
MERGE (immunotherapy:Keyword {term: 'Immunotherapy'})
// Keyword hierarchies
MERGE (microglia)-[:IS_A]->(neuroinflamm)
MERGE (immunotherapy)-[:IS_A]->(neuroinflamm)
// ============================================
// PUBLICATION 1: TREM2 Discovery Paper
// ============================================
MERGE (pub1:Publication {
pmid: 'PMID35123456',
title: 'TREM2 variants confer risk for Alzheimers disease through microglial dysfunction',
year: 2023,
citations: 342,
abstract: 'Rare variants in TREM2 increase Alzheimers disease risk 3-fold...'
})
MERGE (pub1)-[:PUBLISHED_IN]->(natGenet)
MERGE (pub1)-[:HAS_AUTHOR]->(chen)
MERGE (pub1)-[:HAS_AUTHOR]->(rodriguez)
MERGE (pub1)-[:HAS_KEYWORD]->(neuroinflamm)
MERGE (pub1)-[:HAS_KEYWORD]->(microglia)
MERGE (pub1)-[:HAS_KEYWORD]->(geneticRisk)
MERGE (pub1)-[:MENTIONS]->(trem2_gene)
MERGE (pub1)-[:MENTIONS]->(trem2_prot)
MERGE (pub1)-[:MENTIONS]->(ad)
MERGE (pub1)-[:MENTIONS_VARIANT]->(rs75932628)
// ============================================
// PUBLICATION 2: APOE Review Paper
// ============================================
MERGE (pub2:Publication {
pmid: 'PMID35234567',
title: 'APOE4: The most significant genetic risk factor for late-onset Alzheimers',
year: 2023,
citations: 589,
abstract: 'The APOE ε4 allele increases AD risk in dose-dependent manner...'
})
MERGE (pub2)-[:PUBLISHED_IN]->(neuron)
MERGE (pub2)-[:HAS_AUTHOR]->(williams)
MERGE (pub2)-[:HAS_KEYWORD]->(geneticRisk)
MERGE (pub2)-[:MENTIONS]->(apoe_gene)
MERGE (pub2)-[:MENTIONS]->(apoe_prot)
MERGE (pub2)-[:MENTIONS]->(ad)
MERGE (pub2)-[:MENTIONS_VARIANT]->(rs429358)
MERGE (pub2)-[:CITES]->(pub1)
// ============================================
// PUBLICATION 3: TREM2 Therapeutic Paper (Industry)
// ============================================
MERGE (pub3:Publication {
pmid: 'PMID35345678',
title: 'Therapeutic targeting of TREM2 in neurodegenerative diseases',
year: 2024,
citations: 127,
abstract: 'TREM2 agonists show promise in preclinical models...'
})
MERGE (pub3)-[:PUBLISHED_IN]->(cell)
MERGE (pub3)-[:HAS_AUTHOR]->(tanaka)
MERGE (pub3)-[:HAS_AUTHOR]->(rodriguez)
MERGE (pub3)-[:HAS_KEYWORD]->(immunotherapy)
MERGE (pub3)-[:HAS_KEYWORD]->(microglia)
MERGE (pub3)-[:MENTIONS]->(trem2_gene)
MERGE (pub3)-[:MENTIONS]->(trem2_prot)
MERGE (pub3)-[:MENTIONS]->(ad)
MERGE (pub3)-[:MENTIONS]->(pd)
MERGE (pub3)-[:CITES]->(pub1)
// ============================================
// PUBLICATION 4: Multi-omics Paper
// ============================================
MERGE (pub4:Publication {
pmid: 'PMID35456789',
title: 'Integrated genomics reveals novel Alzheimers disease susceptibility loci',
year: 2024,
citations: 234,
abstract: 'Genome-wide association analysis identifies BIN1 and CD33...'
})
MERGE (pub4)-[:PUBLISHED_IN]->(nature)
MERGE (pub4)-[:HAS_AUTHOR]->(schmidt)
MERGE (pub4)-[:HAS_AUTHOR]->(kumar)
MERGE (pub4)-[:HAS_AUTHOR]->(chen)
MERGE (pub4)-[:HAS_KEYWORD]->(geneticRisk)
MERGE (pub4)-[:MENTIONS]->(bin1_gene)
MERGE (pub4)-[:MENTIONS]->(trem2_gene)
MERGE (pub4)-[:MENTIONS]->(apoe_gene)
MERGE (pub4)-[:MENTIONS]->(ad)
// ============================================
// PUBLICATION 5: Tau/FTD Paper
// ============================================
MERGE (pub5:Publication {
pmid: 'PMID35567890',
title: 'MAPT mutations in frontotemporal dementia and Alzheimers overlap',
year: 2023,
citations: 178,
abstract: 'Tau protein dysfunction links multiple neurodegenerative diseases...'
})
MERGE (pub5)-[:PUBLISHED_IN]->(neuron)
MERGE (pub5)-[:HAS_AUTHOR]->(kumar)
MERGE (pub5)-[:HAS_KEYWORD]->(proteinAgg)
MERGE (pub5)-[:MENTIONS]->(mapt_gene)
MERGE (pub5)-[:MENTIONS]->(tau_prot)
MERGE (pub5)-[:MENTIONS]->(ftd)
MERGE (pub5)-[:MENTIONS]->(ad)
// ============================================
// PUBLICATION 6: Industry-Academic Collaboration
// ============================================
MERGE (pub6:Publication {
pmid: 'PMID35678901',
title: 'Clinical validation of TREM2 R47H variant in diverse populations',
year: 2024,
citations: 95,
abstract: 'Multi-center study confirms TREM2 variant pathogenicity...'
})
MERGE (pub6)-[:PUBLISHED_IN]->(natGenet)
MERGE (pub6)-[:HAS_AUTHOR]->(tanaka)
MERGE (pub6)-[:HAS_AUTHOR]->(williams)
MERGE (pub6)-[:HAS_AUTHOR]->(schmidt)
MERGE (pub6)-[:HAS_KEYWORD]->(geneticRisk)
MERGE (pub6)-[:MENTIONS]->(trem2_gene)
MERGE (pub6)-[:MENTIONS]->(ad)
MERGE (pub6)-[:MENTIONS_VARIANT]->(rs75932628)
MERGE (pub6)-[:CITES]->(pub1)
MERGE (pub6)-[:CITES]->(pub3)
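After running the load script, a quick sanity check can confirm that everything arrived. The two statements below are a minimal sketch and assume the script was run against an otherwise empty database. In that case the node counts should line up with the MERGE statements above, for example 6 Publication nodes and 3 SNP nodes.

```cypher
// Count nodes per label (assumes no other data is present in the database).
MATCH (n)
UNWIND labels(n) AS label
RETURN label AS Label, count(*) AS Nodes
ORDER BY Label;

// Count relationships per type.
MATCH ()-[r]->()
RETURN type(r) AS Relationship, count(*) AS Total
ORDER BY Relationship;
```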
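Cypher queries
The following example queries show how to extract publication intelligence from the graph.
Institutional hotspots
The queries in this section all look for hotspots of research activity, which is useful in a range of competitive intelligence scenarios.
Which institutions are publishing on a given drug target and disease?
Use this query to identify leading research centers for potential collaboration, or to monitor competitor activity in a specific therapeutic area.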
WITH
"TREM2" AS targetGene,
"Alzheimer's Disease" AS targetDisease
MATCH
(pub:Publication)-[:MENTIONS]->(gene:Gene {symbol: targetGene}),
(pub)-[:MENTIONS]->(disease:Disease {name: targetDisease}),
(pub)-[:HAS_AUTHOR]->(author:Author)-[:AFFILIATED_WITH]->(inst:Institution)
RETURN
inst.name AS Institution,
inst.type AS Type,
count(DISTINCT pub) AS Publications,
collect(DISTINCT author.name) AS Researchers,
avg(pub.citations) AS AvgCitations
ORDER BY Publications DESC, AvgCitations DESC;
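Which institutions collaborate most often on publications about a given target and disease?
This helps identify strong institutional partnerships and collaboration networks in a therapeutic area, as well as gaps where new collaborations could be fostered.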
WITH
"TREM2" AS targetGene,
"Alzheimer's Disease" AS targetDisease
MATCH
(pub:Publication)-[:MENTIONS]->(gene:Gene {symbol: targetGene}),
(pub)-[:MENTIONS]->(disease:Disease {name: targetDisease}),
(pub)-[:HAS_AUTHOR]->(a1:Author)-[:AFFILIATED_WITH]->(inst1:Institution),
(pub)-[:HAS_AUTHOR]->(a2:Author)-[:AFFILIATED_WITH]->(inst2:Institution)
WHERE inst1.name < inst2.name
RETURN
inst1.name AS Institution1,
inst2.name AS Institution2,
count(DISTINCT pub) AS SharedPublications,
collect(DISTINCT pub.pmid) AS PMIDs
ORDER BY SharedPublications DESC, Institution1, Institution2;
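Which authors are publishing on a given drug target and disease?
Use this query to identify key opinion leaders (KOLs) in a specific therapeutic area, or to monitor competitor researchers working on a specific target.
You might end up engaging some of these researchers as consultants or advisors, or contacting them about a potential collaboration.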
WITH
"TREM2" AS targetGene,
"Alzheimer's Disease" AS targetDisease
MATCH
(pub:Publication)-[:MENTIONS]->(gene:Gene {symbol: targetGene}),
(pub)-[:MENTIONS]->(disease:Disease {name: targetDisease}),
(pub)-[:HAS_AUTHOR]->(author:Author)
RETURN
author.name AS Author,
author.h_index AS `H-Index`,
count(DISTINCT pub) AS Publications
ORDER BY Publications DESC, `H-Index` DESC;
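Which keywords are associated with a given drug target and disease?
This gives a high-level view of the main research themes and trends in the literature around the target and disease, helping you identify emerging areas of interest or gaps in the research landscape.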
WITH
"TREM2" AS targetGene,
"Alzheimer's Disease" AS targetDisease
MATCH
(pub:Publication)-[:MENTIONS]->(gene:Gene {symbol: targetGene}),
(pub)-[:MENTIONS]->(disease:Disease {name: targetDisease}),
(pub)-[:HAS_KEYWORD]->(keyword:Keyword)
RETURN
keyword.term AS Keyword,
count(DISTINCT pub) AS Publications
ORDER BY Publications DESC;
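The demo data also defines an IS_A hierarchy between keywords (for example, Microglia IS_A Neuroinflammation), which the keyword query does not use. The variation below is one possible sketch that rolls each publication up to its parent themes, so a paper tagged only with a child keyword still counts toward the broader theme. The zero-length lower bound in the variable-length pattern means each keyword also counts toward itself.

```cypher
WITH
  "TREM2" AS targetGene,
  "Alzheimer's Disease" AS targetDisease
MATCH
  (pub:Publication)-[:MENTIONS]->(:Gene {symbol: targetGene}),
  (pub)-[:MENTIONS]->(:Disease {name: targetDisease}),
  // *0.. matches the keyword itself plus every ancestor reachable via IS_A
  (pub)-[:HAS_KEYWORD]->(:Keyword)-[:IS_A*0..]->(theme:Keyword)
RETURN
  theme.term AS Theme,
  count(DISTINCT pub) AS Publications
ORDER BY Publications DESC, Theme;
```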
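Clinical validation
Clinical validation of gene-disease associations lets you prioritize targets backed by strong evidence.
Discover emerging gene-disease associations through clinical variants
This query identifies disease-associated genes through clinically significant variants (single nucleotide polymorphisms, SNPs) mentioned in publications, which helps prioritize targets with strong clinical evidence. It keeps only variants classified as 'Pathogenic' or 'Likely pathogenic', so variants classified as 'Risk factor' are excluded.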
WITH ['Pathogenic', 'Likely pathogenic'] AS significantClasses
MATCH
(snp:SNP)-[:ASSOCIATED_WITH]->(gene:Gene),
(pub:Publication)-[:MENTIONS_VARIANT]->(snp),
(pub)-[:MENTIONS]->(disease:Disease)
WHERE snp.clinical_significance IN significantClasses
RETURN
gene.symbol AS Gene,
gene.name AS GeneName,
snp.rsid AS ClinicalVariant,
snp.variant AS Mutation,
disease.name AS Disease,
count(DISTINCT pub) AS Publications,
snp.clinical_significance AS ClinicalSignificance
ORDER BY Publications DESC;
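Evidence strength can also be ranked by the review status stored on each variant. The sketch below assumes that review_status always follows the 'N-star' format used in the demo data (for example '3-star') and parses the leading number. Real review-status values may be formatted differently and would need their own mapping.

```cypher
MATCH
  (pub:Publication)-[:MENTIONS_VARIANT]->(snp:SNP)-[:ASSOCIATED_WITH]->(gene:Gene)
WITH
  gene,
  snp,
  count(DISTINCT pub) AS publications,
  // Assumption: review_status looks like '3-star'; take the part before '-'
  toInteger(split(snp.review_status, '-')[0]) AS reviewStars
RETURN
  gene.symbol AS Gene,
  snp.rsid AS ClinicalVariant,
  snp.clinical_significance AS ClinicalSignificance,
  reviewStars AS ReviewStars,
  publications AS Publications
ORDER BY ReviewStars DESC, Publications DESC;
```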
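Find publications that mention clinical variants and diseases
This query finds publications that mention both a clinical variant (SNP rsID) and a disease, which helps identify studies that put genetic findings in clinical context.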
MATCH
(snp:SNP)-[:ASSOCIATED_WITH]->(gene:Gene),
(pub:Publication)-[:MENTIONS_VARIANT]->(snp),
(pub)-[:MENTIONS]->(disease:Disease)
RETURN
pub.title AS Publication,
pub.year AS Year,
snp.rsid AS ClinicalVariant,
snp.variant AS Mutation,
disease.name AS Disease
ORDER BY Year DESC, Publication;
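Collaboration networks
Research collaborations between academia and industry can accelerate drug discovery, and mapping them can surface potential partners.
Map research collaboration networks
This query looks for pairs of institutions (one academic, one industry) that have co-authored publications, along with the diseases they studied together, which helps identify strong partnerships and potential collaboration opportunities.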
MATCH
(auth1:Author)-[:AFFILIATED_WITH]->(inst1:Institution),
(auth2:Author)-[:AFFILIATED_WITH]->(inst2:Institution),
(pub:Publication)-[:HAS_AUTHOR]->(auth1),
(pub)-[:HAS_AUTHOR]->(auth2)
WHERE
inst1.type = 'Academic'
AND inst2.type = 'Industry' // Cross-sector collaboration
MATCH (pub)-[:MENTIONS]->(disease:Disease)
RETURN
inst1.name AS AcademicInstitution,
inst2.name AS IndustryPartner,
count(DISTINCT pub) AS CollaborativePublications,
collect(DISTINCT auth1.name + ' & ' + auth2.name) AS ResearcherPairs,
collect(DISTINCT disease.name) AS DiseasesStudied
ORDER BY CollaborativePublications DESC;
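Key opinion leaders (KOLs)
Identify the most influential researchers in a specific therapeutic area. These are people you may want to collaborate with or consider hiring.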
WITH
"TREM2" AS targetGene,
"Alzheimer's Disease" AS targetDisease
MATCH
(pub:Publication)-[:MENTIONS]->(gene:Gene {symbol: targetGene}),
(pub)-[:MENTIONS]->(disease:Disease {name: targetDisease}),
(pub)-[:HAS_AUTHOR]->(author:Author)
RETURN
author.name AS Author,
author.h_index AS `H-Index`,
count(DISTINCT pub) AS Publications
ORDER BY Publications DESC, `H-Index` DESC;
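Publication counts and h-index are only two signals of influence. As a possible refinement, the sketch below also sums the citations property of each author's matching publications and counts how many publications inside the graph cite them through CITES relationships. In the small demo data set the in-graph citation counts are very low, so treat the ranking as illustrative.

```cypher
WITH
  "TREM2" AS targetGene,
  "Alzheimer's Disease" AS targetDisease
MATCH
  (pub:Publication)-[:MENTIONS]->(:Gene {symbol: targetGene}),
  (pub)-[:MENTIONS]->(:Disease {name: targetDisease}),
  (pub)-[:HAS_AUTHOR]->(author:Author)
// One row per author/publication pair, so citations are not double counted
WITH DISTINCT author, pub
OPTIONAL MATCH (citing:Publication)-[:CITES]->(pub)
WITH author, pub, count(citing) AS inGraphCitations
RETURN
  author.name AS Author,
  count(pub) AS Publications,
  sum(pub.citations) AS TotalCitations,
  sum(inGraphCitations) AS CitationsInGraph
ORDER BY TotalCitations DESC, CitationsInGraph DESC;
```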
Find "bridge" researchers who connect different research themes
Identify key opinion leaders who publish across multiple areas.
MATCH (author:Author)<-[:HAS_AUTHOR]-(pub:Publication)-[:HAS_KEYWORD]->(keyword:Keyword)
WITH
author,
collect(DISTINCT keyword.term) AS themes,
count(DISTINCT pub) AS pubCount
WHERE size(themes) >= 2
MATCH (author)-[:AFFILIATED_WITH]->(inst:Institution)
RETURN
author.name AS Researcher,
inst.name AS Institution,
author.h_index AS `H-Index`,
themes AS ResearchThemes,
pubCount AS Publications
ORDER BY size(themes) DESC, author.h_index DESC;
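Multi-target research trends
Discover multi-target research trends for a disease (polygenic approaches).
Find publications that link multiple genes to the same disease (a systems biology approach).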
WITH "Alzheimer's Disease" AS targetDisease
MATCH
(pub:Publication)-[:MENTIONS]->(disease:Disease {name: targetDisease}),
(pub)-[:MENTIONS]->(gene:Gene),
(pub)-[:HAS_AUTHOR]->(author:Author)-[:AFFILIATED_WITH]->(inst:Institution)
WITH
pub,
disease,
collect(DISTINCT gene.symbol) AS genes,
inst
WHERE size(genes) >= 2 // Publications mentioning 2+ genes
RETURN
pub.title AS Publication,
pub.pmid AS PMID,
pub.year AS Year,
inst.name AS Institution,
disease.name AS Disease,
genes AS GenesStudied,
size(genes) AS NumberOfGenes,
pub.citations AS Citations
ORDER BY NumberOfGenes DESC, Citations DESC;
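Publication-based competitive intelligence (CI) integrates well with patent-based CI through the entities the two data sets share. Scientific publications often provide the foundational knowledge that patents build on, and many patents explicitly cite key publications in their applications. Entities such as genes, drug targets, and molecular variants frequently appear in both academic articles and patent claims, which makes it possible to map the innovation trajectory from basic research to intellectual property. Authors of high-impact publications may also appear on patents as inventors or advisors, making them key opinion leaders and potential collaborators. Integrating publication and patent CI lets organizations identify emerging hotspots, influential people, and the process by which novel scientific findings become protectable inventions.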
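To make the overlap concrete, here is a hypothetical sketch of a query over a combined graph. The Patent label and the COVERS and CITES_PUBLICATION relationship types are assumptions for illustration only; they are not part of the demo data set, so the query returns no rows until patent data is loaded under that (or a similar) model.

```cypher
// Hypothetical extension: Patent, COVERS, and CITES_PUBLICATION are
// illustrative names that do not exist in the demo data set.
MATCH (pub:Publication)-[:MENTIONS]->(gene:Gene)<-[:COVERS]-(patent:Patent)
OPTIONAL MATCH (patent)-[cites:CITES_PUBLICATION]->(pub)
RETURN
  gene.symbol AS Gene,
  count(DISTINCT pub) AS Publications,
  count(DISTINCT patent) AS Patents,
  count(DISTINCT cites) AS DirectPatentCitations
ORDER BY Patents DESC, Publications DESC;
```

Genes with many publications but few patents may point to areas where intellectual property has not yet caught up with the science, while direct citations show which papers the patents explicitly build on.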