Here is some technical detail.
There should be 7 or 8 files published per Ward.
Each file is renamed from something like ...
CandidateVotesPerStageReport_V0003_Ward-1-Linn_06052022_122617.pdf
to
C16_W01_CANDIDATEVOTESPERSTAGE__CandidateVotesPerStageReport_V0003_Ward-1-Linn_06052022_122617.pdf
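The renaming pattern above can be sketched in Python. This is only an illustration, not the actual renaming procedure: it assumes that "C16" is a two-digit council number, that "W01" is a two-digit ward number, and that the report-type tag is the first part of the original filename with a trailing "Report" removed and the rest upper-cased. The function name build_published_name is hypothetical.

```python
from pathlib import Path


def build_published_name(council_no: int, ward_no: int, original_name: str) -> str:
    """Prefix an original report filename with council, ward and report-type tags."""
    stem = Path(original_name).stem
    # Assumption: the report type is the text before the first underscore,
    # minus a trailing "Report", converted to upper case.
    report_type = stem.split("_", 1)[0]
    if report_type.endswith("Report"):
        report_type = report_type[: -len("Report")]
    # Assumption: council and ward numbers are zero-padded to two digits.
    return f"C{council_no:02d}_W{ward_no:02d}_{report_type.upper()}__{original_name}"


original = "CandidateVotesPerStageReport_V0003_Ward-1-Linn_06052022_122617.pdf"
print(build_published_name(16, 1, original))
```

The original filename is kept intact after the double underscore, so the source of each published file remains traceable.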
Inconsistencies discovered across the Council websites: presentation; file naming (particularly for .BLT files); file order; the file types offered for download; and only a few websites offer ZIP files.
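Since 7 or 8 files are expected per Ward, one simple way to surface missing or surplus files is to count the renamed files per council and ward. The sketch below assumes the renamed files start with a prefix of the form C16_W01_; the function name wards_with_unexpected_counts and the expected tuple are hypothetical, not part of the actual programs.

```python
import re
from collections import Counter

# Assumption: renamed files begin with "C<council>_W<ward>_", e.g. "C16_W01_".
PREFIX = re.compile(r"^C(\d+)_W(\d+)_")


def wards_with_unexpected_counts(filenames, expected=(7, 8)):
    """Return {(council, ward): file_count} for wards whose count is not expected."""
    counts = Counter()
    for name in filenames:
        match = PREFIX.match(name)
        if match:
            counts[(int(match.group(1)), int(match.group(2)))] += 1
    return {ward: n for ward, n in counts.items() if n not in expected}
```

A ward with no files at all never appears in the result, so the output would still need comparing against the full list of wards.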
I considered website “scraping”, but the Council websites were too inconsistent in style and presentation to make this feasible.
Program Driver Files
There are three driver files, each containing what I believe to be the definitive naming sources. There are plenty of very helpful comments within these files. The three files are:
- List_Councils.txt
- List_Council_Wards.txt
- List_Report_File_Specs.txt
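Reading a flat-file driver of this kind can be sketched as follows. The real layout of the driver files is not described here, so everything about the format is an assumption made for illustration: comment lines are taken to start with "#", and fields are taken to be separated by "|". The function name read_driver_lines and the sample content are hypothetical.

```python
def read_driver_lines(text, comment_prefix="#", delimiter="|"):
    """Parse driver-file text into rows of fields, skipping blanks and comments."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and lines starting with "#" carry no data.
        if not line or line.startswith(comment_prefix):
            continue
        rows.append([field.strip() for field in line.split(delimiter)])
    return rows


# Invented sample content, not taken from the real driver files.
sample = """
# council number | council name
16 | Example Council
"""
print(read_driver_lines(sample))
```

Keeping the parser this small is in keeping with the flat-file approach: the driver files stay editable in any text editor, and the comments travel with the data.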
Relational Database Management System (RDBMS)
I intentionally used a flat-file text “driver” methodology for reasons of simplicity and because of the many inconsistencies and missing data encountered in gathering the required data files from the various council websites. Going forward, I’m not convinced that adopting an RDBMS (e.g. MySQL, MariaDB, MS SQL Server, Oracle) would be sensible unless significant improvements are made to the generation and consistent naming of the underlying data files.
Software used
My Workstation Operating System: Windows 11
Basic File Downloading/Renaming/Correction
- Directory List & Print Pro
- Bulk Rename Utility
- UltraEdit
- Excel
Basic File Content Analysis
- UltraEdit
- Excel
- EncodeAnt
Programming
- Python (IDE: Anaconda/Spyder)
- R (IDE: RStudio) (early days only)
Main STV Website
- WordPress
- FileZilla / WinSCP
- PuTTY
- LibreOffice Draw (Graphics)
Consolidated Election Results Portal website
- Linux
- CoffeeCup
- FileZilla / WinSCP
- PuTTY
Websites hosted by IONOS (1&1).
Virtual Private Server (VPS)
- Ubuntu Linux