GIS workflows can change with current technology such as Apache Arrow, and perhaps with some other techniques as well. In Pythonland, Geopandas has become faster over time. I also wanted to try Dask, DuckDB and Apache Sedona.
Please contact me if you read this and see ways to improve the code quality or speed. I use public data that is distributed as shapefiles, a format still widely used in Germany.
The data
First, I downloaded the ALKIS (cadastral register) data for all counties in the state of Brandenburg. All the vector files are open data and are still offered as shapefiles. From the ALKIS dataset of Brandenburg I used the buildings and the parcels (with land use). The files are stored per county, so every layer has to be read once per county. The geometries have some errors, which Geopandas automatically detects and fixes. In addition, some files cannot be opened with fiona, the older I/O engine of Geopandas; it fails with an error message about multiple geometry columns. So I always use the new default engine: pyogrio.
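The per-county loading can be sketched roughly as follows. This is a minimal sketch, not the code used in the project: the folder layout (one subfolder per county), the layer file names and the helper names `county_shapefiles` and `load_layer` are assumptions for illustration. Only `geopandas.read_file(..., engine="pyogrio")` and `pandas.concat` are real library calls.

```python
from pathlib import Path


def county_shapefiles(root, layer_name):
    """Find <root>/<county>/<layer_name>.shp for every county folder.

    The folder layout is an assumption; adjust the glob pattern to the
    actual download structure.
    """
    return sorted(Path(root).glob(f"*/{layer_name}.shp"))


def load_layer(root, layer_name):
    """Read one layer for all counties into a single GeoDataFrame."""
    # Imported here so that the file discovery above also works
    # without the geospatial stack installed.
    import geopandas as gpd
    import pandas as pd

    frames = []
    for path in county_shapefiles(root, layer_name):
        # Select pyogrio explicitly instead of relying on the default engine.
        gdf = gpd.read_file(path, engine="pyogrio")
        # Remember which county folder the rows came from (hypothetical column).
        gdf["county"] = path.parent.name
        frames.append(gdf)
    # Assumes at least one file was found and that all files share one CRS.
    return pd.concat(frames, ignore_index=True)
```

A call such as `load_layer("alkis_brandenburg", "buildings")` would then return all buildings in one table, with both names being placeholders.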
We selected a party as the dominant party of a post when this party or its candidate accounted for 50 % or more of the party mentions in that post. As a result, many posts mention the candidates Söder or Aiwanger although the post is attributed to a different party. This mixed attribution, which allows a post to mention several parties as long as one party is clearly dominant, makes attribution harder, but we used it to keep as many posts as possible. Even so, in some weeks we did not record any posts for some of the smaller "dominant" parties.
I gave a presentation on this project at the Elixir MeetUp Berlin on Feb 8th, 2024. The slides can be found here.
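The 50 % rule can be illustrated with a small Python sketch. It is not the project's own code (that lives in the Livebook), and it rests on assumptions: the keyword lists are hypothetical, mentions are counted by simple substring matching, and the share is taken relative to all party mentions in the post.

```python
from collections import Counter

# Hypothetical keyword lists; the lists used in the project are not shown here.
PARTY_KEYWORDS = {
    "CSU": ["csu", "söder"],
    "Freie Wähler": ["freie wähler", "aiwanger"],
    "Grüne": ["grüne"],
    "SPD": ["spd"],
}


def count_mentions(text):
    """Count keyword occurrences per party with naive substring matching."""
    lowered = text.lower()
    counts = Counter()
    for party, keywords in PARTY_KEYWORDS.items():
        counts[party] = sum(lowered.count(keyword) for keyword in keywords)
    return counts


def dominant_party(text, threshold=0.5):
    """Return the party with at least `threshold` of all mentions, else None."""
    counts = count_mentions(text)
    total = sum(counts.values())
    if total == 0:
        return None  # the post mentions no party at all
    party, top = counts.most_common(1)[0]
    # On an exact 50/50 tie the party listed first in PARTY_KEYWORDS wins.
    return party if top / total >= threshold else None
```

A post such as "Söder und die CSU gegen Aiwanger" is attributed to the CSU (two of three mentions) although it also names Aiwanger, which is exactly the mixed attribution described above.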
The Livebook code can be found here. The code for data collection can be found here.
Abstract
We try to predict the voting results of the 2023 Bavarian state election in Germany from Mastodon posts.
The last polls before the election showed an average error of about 0.7 to 0.9 percentage points per major party. A time-weighted average of the polls from the last six weeks before the election reduced the error to 0.39 percentage points per party.
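To make the poll baseline concrete, here is a rough Python sketch of a time-weighted poll average and the mean absolute error per party. The linear weighting, the function names and the `window_days` parameter are assumptions for illustration; the weighting actually used in the project may differ.

```python
from datetime import date


def time_weighted_average(polls, election_day, window_days=42):
    """Average poll shares per party, weighting recent polls more strongly.

    `polls` is a list of (poll_date, {party: share_in_percent}) tuples.
    The linear weight is an assumption: a poll on election day gets
    weight 1, and the weight decays towards 0 at the edge of the window.
    """
    weighted_sum = {}
    weight_sum = {}
    for poll_date, shares in polls:
        age = (election_day - poll_date).days
        if not 0 <= age <= window_days:
            continue  # outside the six-week window
        weight = 1.0 - age / (window_days + 1)
        for party, share in shares.items():
            weighted_sum[party] = weighted_sum.get(party, 0.0) + weight * share
            weight_sum[party] = weight_sum.get(party, 0.0) + weight
    return {party: weighted_sum[party] / weight_sum[party] for party in weighted_sum}


def mean_absolute_error(prediction, result):
    """Mean absolute deviation in percentage points over the parties in `result`."""
    return sum(abs(prediction[p] - result[p]) for p in result) / len(result)


# Toy numbers for two placeholder parties, not real poll data.
polls = [
    (date(2023, 10, 1), {"A": 40.0, "B": 20.0}),
    (date(2023, 9, 1), {"A": 36.0, "B": 24.0}),
]
average = time_weighted_average(polls, election_day=date(2023, 10, 8))
error = mean_absolute_error(average, {"A": 40.0, "B": 20.0})
```

The newer poll dominates the average because its weight is six times that of the older one.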
Introduction and Objectives
The German power system is adding renewable energy sources in the north, where wind energy plants reach their highest efficiency due to higher wind speeds. At the same time, old power plants (e.g. nuclear, hard coal and lignite) are being phased out [^eserfrey_analyzing_2012]. These older power plants were mainly located in southern and central Germany. The energy sink, that is, industrial and private demand, is not shifting north. Therefore, the renewable energy has to be transported from north to south, which increases congestion in the power grid. The amount of offshore wind power that the German energy system can use can be greatly increased by adding new power lines 1.