Scientific knowledge has traditionally been disseminated and preserved through research articles published in journals, conference proceedings, and online archives. However, this article-centric paradigm has often been criticized because it does not allow this knowledge to be automatically processed, categorized, and reasoned over. An alternative vision is to generate a semantically rich and interlinked description of the content of research publications.
The Computer Science Knowledge Graph (CS-KG) is a large-scale automatically generated knowledge graph describing 67M statements from 14.5M articles about 24M entities (e.g., tasks, methods, materials, metrics) linked by 219 semantic relations.
It was designed to support a wide variety of intelligent services for analyzing and making sense of research dynamics, supporting researchers in their daily work, and informing the decisions of funding bodies and research policy makers.
CS-KG was generated by applying an automatic pipeline that extracts entities and relationships using four tools: DyGIE++, Stanford CoreNLP, the CSO Classifier, and a new PoS Tagger. It then integrates and filters the resulting triples using a combination of deep learning and semantic technologies in order to produce a high-quality knowledge graph. This pipeline was evaluated on a manually crafted gold standard, yielding competitive results.
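The integration step can be illustrated with a small sketch. It assumes each tool emits (subject, relation, object) triples and, in place of the deep learning and semantic filtering (whose details are not given on this page), uses a plain support-count heuristic: a triple is kept only if enough tools agree on it. The names Triple, merge_by_support, and min_support, as well as the toy data, are hypothetical and are not part of the actual pipeline.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(text.lower().split())


def merge_by_support(
    extractions: dict[str, list[Triple]], min_support: int = 2
) -> dict[Triple, set[str]]:
    """Map each merged triple to the tools that produced it.

    Only triples produced by at least `min_support` tools are returned.
    This is a stand-in heuristic, not the filter used to build CS-KG.
    """
    support: dict[Triple, set[str]] = defaultdict(set)
    for tool, triples in extractions.items():
        for t in triples:
            key = Triple(normalise(t.subject), t.relation, normalise(t.obj))
            support[key].add(tool)
    return {t: tools for t, tools in support.items() if len(tools) >= min_support}


# Made-up extractor output, for illustration only.
extractions = {
    "tool_a": [Triple("Neural Network", "usedFor", "image classification")],
    "tool_b": [
        Triple("neural network", "usedFor", "Image  Classification"),
        Triple("svm", "usedFor", "parsing"),
    ],
}
kept = merge_by_support(extractions)
```

Here the two spellings of the first triple merge into one entry supported by both tools, while the triple seen by a single tool is dropped. Lowering min_support to 1 would keep every extracted triple.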
CS-KG is available under CC BY 4.0 and can be downloaded as a dump or queried via a SPARQL endpoint.
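As a minimal sketch of querying a SPARQL endpoint from Python, the snippet below builds a SPARQL 1.1 Protocol GET request using only the standard library. The endpoint address is a placeholder, since this page does not state it, and the query is deliberately generic because the CS-KG vocabulary is not described here. The function name build_sparql_request is also an assumption made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder address: replace with the real CS-KG SPARQL endpoint.
ENDPOINT = "https://example.org/sparql"

# Generic query that assumes nothing about the CS-KG schema.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a GET request that passes the query in the `query` parameter
    and asks for results in the SPARQL JSON results format."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
```

Passing the request to urllib.request.urlopen and decoding the body with json.load should give a document whose rows sit under results.bindings, provided the endpoint honours the requested format.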
CS-KG replaces the Artificial Intelligence Knowledge Graph (AI-KG). We recommend that users switch to CS-KG, which covers a much larger number of concepts and publications. For the sake of compatibility, old versions of AI-KG will remain available in the download section.
CS-KG is aligned with the initiative of the Knowledge Graph Construction W3C Community Group for producing benchmarks, resources, and tools to support the semi-automatic generation of knowledge graphs from documents.
Danilo Dessì, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, and Enrico Motta. (2022) SCICERO: A Deep Learning and NLP Approach for Generating Scientific Knowledge Graphs in the Computer Science Domain. Knowledge-Based Systems.
Agustin Borrego, Danilo Dessì, Inma Hernández, Francesco Osborne, Diego Reforgiato Recupero, David Ruiz, Davide Buscaldi, and Enrico Motta. (2022) Completing Scientific Facts in Knowledge Graphs of Research Concepts. IEEE Access.
Danilo Dessì, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, Enrico Motta. (2022) CS-KG: A Large-Scale Knowledge Graph of Research Entities and Claims in Computer Science. In: The Semantic Web – ISWC 2022. Lecture Notes in Computer Science, Springer, Cham, 13489 pp. 678–696.
Danilo Dessì, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, Enrico Motta. (2021) Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain. Future Generation Computer Systems 2021.
Danilo Dessì, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, Enrico Motta, Harald Sack. (2020) AI-KG: an Automatically Generated Knowledge Graph of Artificial Intelligence. International Semantic Web Conference 2020.
Davide Buscaldi, Danilo Dessì, Enrico Motta, Francesco Osborne, Diego Reforgiato Recupero. (2019) Mining Scholarly Data for Fine-Grained Knowledge Graph Construction. In DL4KG@ESWC 2019: 21-30
Davide Buscaldi, Danilo Dessì, Enrico Motta, Francesco Osborne, Diego Reforgiato Recupero. (2019) Mining Scholarly Publications for Scientific Knowledge Graph Construction. In ESWC (Satellite Events) 2019: 8-12
Department of Mathematics and Computer Science, University of Cagliari (Italy)
Knowledge Media Institute, The Open University, Milton Keynes (UK)
Department of Mathematics and Computer Science, University of Cagliari (Italy)
LIPN, CNRS (UMR 7030), Université Sorbonne Paris Nord, Villetaneuse (France)
Knowledge Media Institute, The Open University, Milton Keynes (UK)
For information and questions please contact:
Danilo Dessì – danilo [dot] dessi [at] unica [dot] it
Francesco Osborne – francesco [dot] osborne [at] open [dot] ac [dot] uk