Using Wikipedia as a lens to study historical, cultural, and political ties between countries through network science.
WorldGraph constructs and analyses networks of countries derived from Wikipedia article cross-references. By treating countries as nodes and their Wikipedia interlinkages as edges, the project uncovers latent structures in how the world is connected — through shared history, geography, culture, and politics.
The analysis combines network topology, sentiment analysis, and community detection to surface meaningful clusters and relationships that go beyond conventional geopolitical boundaries.
- Graph Construction — Country networks built from Wikipedia cross-link data using the Wikipedia API
- Network Analysis — Degree distribution, centrality measures, and structural properties of the world graph
- Community Detection — Louvain / modularity-based algorithms to identify clusters of historically or culturally related countries
- Sentiment Analysis — Sentiment scoring of Wikipedia article content to characterise how each country's article describes other countries
- Visualisation — Interactive and static graph visualisations of detected communities and link weights
Key libraries: networkx · wikipedia-api · community (python-louvain) · nltk / TextBlob · pandas · matplotlib · seaborn
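As a rough illustration of the graph construction, centrality, and modularity steps listed above, here is a minimal sketch. It is not the project's actual code: the `LINKS` data is made up, the function names are hypothetical, and it deliberately uses only the standard library so that it runs without any of the packages above. In practice networkx offers equivalents such as `networkx.degree_centrality` and `networkx.algorithms.community.modularity`, and python-louvain searches for a partition rather than scoring a given one.

```python
from collections import defaultdict

# Made-up toy data (an assumption for illustration): for each country
# article, the other country articles it links to. Real data would be
# collected from Wikipedia.
LINKS = {
    "Portugal": ["Spain", "Brazil", "Japan"],
    "Spain": ["Portugal", "Brazil"],
    "Brazil": ["Portugal"],
    "Japan": ["South Korea"],
    "South Korea": ["Japan", "China"],
    "China": ["Japan", "South Korea"],
}


def build_graph(links):
    """Undirected weighted edges; weight = number of directed links in a pair."""
    weights = defaultdict(int)
    for source, targets in links.items():
        for target in targets:
            if source != target:  # ignore self-links
                weights[frozenset((source, target))] += 1
    return dict(weights)


def degree_centrality(edges):
    """Fraction of the other nodes that each node is directly connected to."""
    neighbours = defaultdict(set)
    for edge in edges:
        a, b = tuple(edge)
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def modularity(edges, communities):
    """Newman modularity Q of a given partition of a weighted undirected graph."""
    total = sum(edges.values())  # m: total edge weight
    strength = defaultdict(float)  # weighted degree of each node
    for edge, weight in edges.items():
        for node in edge:
            strength[node] += weight
    q = 0.0
    for community in communities:
        # Weight of edges with both endpoints inside the community.
        internal = sum(w for edge, w in edges.items() if edge <= community)
        degree_sum = sum(strength[node] for node in community)
        q += internal / total - (degree_sum / (2 * total)) ** 2
    return q


edges = build_graph(LINKS)
centrality = degree_centrality(edges)
partition = [{"Portugal", "Spain", "Brazil"}, {"Japan", "South Korea", "China"}]
print(sorted(centrality.items(), key=lambda item: -item[1]))
print(round(modularity(edges, partition), 4))
```

A higher modularity means the partition keeps more link weight inside communities than a random graph with the same node strengths would. Louvain-style methods look for a partition that maximises this score.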
- Which countries form tight clusters based on Wikipedia cross-references?
- Do detected communities align with geopolitical blocs, historical empires, or linguistic groups?
- What sentiment do country articles express towards one another, and does it correlate with real-world relations?
- What are the most central and most peripheral countries in the world graph?
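One possible way to approach the sentiment question is to score only the sentences of one country's article that mention another country, which gives a directed score per pair. The sketch below is an assumption about how this could be done, not the project's method: the lexicon is a hypothetical stand-in for a proper sentiment tool such as those in nltk or TextBlob, the function name is invented, and the sample text uses fictional countries.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real analysis would
# rely on an established sentiment tool rather than these hand-picked words.
LEXICON = {
    "ally": 1.0, "cooperation": 1.0, "friendly": 1.0, "trade": 0.5,
    "war": -1.0, "conflict": -1.0, "invasion": -1.0, "dispute": -0.5,
}


def directed_sentiment(article_text, target):
    """Mean lexicon score of the sentences in article_text that mention target.

    Returns None if target is never mentioned alongside a lexicon word.
    """
    sentences = re.split(r"(?<=[.!?])\s+", article_text)
    scores = []
    for sentence in sentences:
        lowered = sentence.lower()
        if target.lower() not in lowered:
            continue  # sentence does not mention the target country
        hits = [LEXICON[w] for w in re.findall(r"[a-z]+", lowered) if w in LEXICON]
        if hits:
            scores.append(sum(hits) / len(hits))
    return sum(scores) / len(scores) if scores else None


# Fictional article text, used only to exercise the function.
text = (
    "Freedonia is a long-standing ally of Sylvania. "
    "A border dispute with Sylvania was settled by treaty. "
    "Freedonia maintains trade links with Osterlich."
)
print(directed_sentiment(text, "Sylvania"))
print(directed_sentiment(text, "Osterlich"))
```

The resulting scores could be attached to directed edges of the country graph and then compared with an external measure of bilateral relations.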
Maria Vendas, João Mata and Rita Silva.