Processing Data from Wikipedia
Publisher
Vysoká škola báňská - Technická univerzita Ostrava
Abstract
The goal of this master's thesis is to describe ways of processing data from Wikipedia. The first part covers how to obtain the data, process it, and store it for further analysis. The database is viewed as a network, so the focus is on pages and the links that connect them.
The analysis is carried out in Python. The thesis describes how to create a graph and how to calculate its basic properties and metrics. It then documents the procedure for finding communities, including a custom implementation of the Label Propagation algorithm. The results of each step are presented.
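As a rough illustration of the workflow outlined in the abstract, the sketch below builds a small page-link graph, computes one basic metric (density), and runs a simple asynchronous label propagation. It is not the thesis's implementation: the page names, the function names (`build_graph`, `density`, `label_propagation`), and the choice of a plain dictionary of sets instead of NetworkX or a CSR matrix are all assumptions made to keep the example dependency-free.

```python
import random
from collections import Counter, defaultdict


def build_graph(links):
    """Build an undirected adjacency map from (source, target) page links."""
    adjacency = defaultdict(set)
    for source, target in links:
        if source != target:  # ignore self-links
            adjacency[source].add(target)
            adjacency[target].add(source)
    return dict(adjacency)


def density(adjacency):
    """Fraction of all possible undirected edges that are present."""
    n = len(adjacency)
    m = sum(len(neighbours) for neighbours in adjacency.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def label_propagation(adjacency, max_iter=100, seed=0):
    """Asynchronous label propagation; returns a dict page -> community label."""
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}  # each page starts alone
    nodes = sorted(adjacency)
    for _ in range(max_iter):
        rng.shuffle(nodes)  # visit pages in random order each round
        changed = False
        for node in nodes:
            counts = Counter(labels[n] for n in adjacency[node])
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[node] in candidates:
                continue  # current label is already among the most frequent
            labels[node] = rng.choice(candidates)  # break ties at random
            changed = True
        if not changed:  # stop once a full round changes nothing
            break
    return labels


# Hypothetical page links, invented for illustration only.
links = [
    ("Graph", "Vertex"), ("Graph", "Edge"), ("Vertex", "Edge"),
    ("Python", "NumPy"), ("Python", "SciPy"), ("NumPy", "SciPy"),
]
graph = build_graph(links)
communities = label_propagation(graph)
print(len(graph), "pages, density", density(graph))
print(communities)
```

On real Wikipedia data the link graph is far larger and sparser, which is presumably why a compact representation such as CSR appears among the thesis keywords; the dictionary of sets here is only practical for small examples.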
Subject(s)
Wikipedia, data analysis, data processing, C#, Python, network, graph, CSR, NetworkX, word cloud