Nexus - knowledge graph of political nexus
2nd place (overall) @ HopHacks Spring '17
[ Spark | Kafka | NLTK | Neo4j | ElasticSearch | Flask | Angular2 ]
Darren Geng, Hugh Han, and Nikhil Kulkarni
Feb 2017 @ Johns Hopkins University
Nexus streams news articles in real time and tags related entities in the text. After it constructs knowledge graphs, hidden insights about specific businessmen, politicians, etc. are made publicly accessible through the Nexus RESTful API and client. Ever see Mr. Robot? A little bit like that, but completely legal.
Nexus streams both past and current New York Times articles and finds relationships between companies and people. Darren Geng, Hugh Han, Nikhil Kulkarni, and I built Nexus at HopHacks 2017, where it received the 2nd place prize, in the hope that it can be used to uncover hidden insights and connections.
Nexus streams and processes data from the New York Times using Apache Spark and Kafka. The processed articles are then tagged with StanfordNER (through the NLTK library) to create a list of all the entities linked by each article. Fully processed articles are stored in an ElasticSearch database, which is connected to a front end (built in Angular 2) via a RESTful API (Flask). Users can search the database to investigate individual articles and subjects. Neo4j is used to construct the full interactive knowledge graph, in which nodes represent companies and people, and each article that mentions two of them acts as an edge between them.
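As a rough illustration of the graph model described above (entities as nodes, articles as edges), here is a minimal Python sketch. It is not the project's actual code: the article dictionaries, the field names `id` and `entities`, the helper names `article_edges` and `build_edges`, and the `Entity` / `MENTIONED_WITH` labels in the Cypher string are all assumptions made for the example. It takes articles that have already been tagged and emits one edge per pair of entities that appear in the same article.

```python
from itertools import combinations

# Hypothetical tagged articles. In the pipeline described above these would
# come out of the NER step; the field names here are assumptions.
articles = [
    {"id": "a1", "entities": ["Acme Corp", "Jane Doe", "John Roe"]},
    {"id": "a2", "entities": ["Jane Doe", "Acme Corp"]},
    {"id": "a3", "entities": ["John Roe"]},
]

# Illustrative parameterized Cypher for writing one edge to Neo4j.
# The label and relationship type are made up for this sketch.
MERGE_EDGE = (
    "MERGE (a:Entity {name: $a}) "
    "MERGE (b:Entity {name: $b}) "
    "MERGE (a)-[:MENTIONED_WITH {article_id: $article_id}]->(b)"
)


def article_edges(article):
    """Yield (entity_a, entity_b, article_id) for each pair of distinct entities."""
    # Deduplicate and sort so each pair is produced once, in a stable order.
    names = sorted(set(article["entities"]))
    for a, b in combinations(names, 2):
        yield (a, b, article["id"])


def build_edges(articles):
    """Collect the edges contributed by every article."""
    edges = []
    for article in articles:
        edges.extend(article_edges(article))
    return edges


edges = build_edges(articles)
for a, b, article_id in edges:
    print(f"{a} <-[{article_id}]-> {b}")
```

An article that mentions only one entity contributes no edges. Each tuple could be passed as parameters to a statement like `MERGE_EDGE` through a Neo4j driver, which is left out here to keep the sketch free of external dependencies.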