Technical

Here is some technical detail.

There should be 7 or 8 files published per Ward.
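
As an illustration only, the "7 or 8 files per Ward" rule could be checked with a short Python sketch like the one below. The function name, the ward labels and the idea of passing in one ward label per downloaded file are all assumptions made for this example; they are not taken from the actual programs.

```python
from collections import Counter

# Per the note above: each ward should have 7 or 8 published files.
EXPECTED_COUNTS = {7, 8}


def wards_with_unexpected_counts(ward_of_each_file):
    """Return {ward: file_count} for wards that do not have 7 or 8 files.

    `ward_of_each_file` is a hypothetical input: one ward label per
    downloaded file, in any order.
    """
    counts = Counter(ward_of_each_file)
    return {ward: n for ward, n in counts.items() if n not in EXPECTED_COUNTS}


# Made-up labels: two wards are complete, one has only 5 files.
labels = ["W01"] * 7 + ["W02"] * 8 + ["W03"] * 5
print(wards_with_unexpected_counts(labels))  # {'W03': 5}
```

A ward with no files at all would not appear in the result, so it would need to be checked against a full list of wards separately.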

Each downloaded file is renamed from something like:

CandidateVotesPerStageReport_V0003_Ward-1-Linn_06052022_122617.pdf

to

C16_W01_CANDIDATEVOTESPERSTAGE__CandidateVotesPerStageReport_V0003_Ward-1-Linn_06052022_122617.pdf
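
The renaming above can be sketched in Python as follows. This is a minimal sketch, not the actual program: it assumes that "C16" is a two-digit council number, that "W01" is a two-digit ward number, and that the report tag is the first part of the original filename with a trailing "Report" removed and the rest upper-cased. The real tags may well be defined elsewhere (for example in the driver files), and the function name is hypothetical.

```python
def prefixed_name(original: str, council_no: int, ward_no: int) -> str:
    """Build a prefixed filename of the form Cnn_Wnn_TAG__<original name>."""
    # Assumption: the report type is the text before the first underscore,
    # e.g. "CandidateVotesPerStageReport".
    report_type = original.split("_", 1)[0]
    # Assumption: the tag drops a trailing "Report" and is upper-cased.
    if report_type.endswith("Report"):
        report_type = report_type[: -len("Report")]
    tag = report_type.upper()
    # The original filename is kept unchanged after a double underscore.
    return f"C{council_no:02d}_W{ward_no:02d}_{tag}__{original}"


example = "CandidateVotesPerStageReport_V0003_Ward-1-Linn_06052022_122617.pdf"
print(prefixed_name(example, council_no=16, ward_no=1))
# C16_W01_CANDIDATEVOTESPERSTAGE__CandidateVotesPerStageReport_V0003_Ward-1-Linn_06052022_122617.pdf
```

Keeping the original name intact after the prefix means each renamed file can still be traced back to the file as published.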

Inconsistencies discovered across the council websites: presentation; file naming (particularly of the .BLT files); file order; the types of download offered; and ZIP availability (only a few websites offer ZIP files).

I considered website “scraping”, but the council websites were too inconsistent in style and presentation to make this feasible.


Program Driver Files
There are three driver files, each of which I believe to be the definitive source for the names it contains. The files themselves include plenty of helpful comments. The three files are:

  • List_Councils.txt
  • List_Council_Wards.txt
  • List_Report_File_Specs.txt
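
For illustration, a flat text driver file with comments could be read with a Python sketch like this one. The actual layout of the three files is not described here, so the "#" comment character, the one-entry-per-line layout and the function name are all assumptions, and the sample content is placeholder text rather than real data.

```python
def read_driver_lines(text: str, comment_char: str = "#") -> list[str]:
    """Return the non-empty entries of a driver file, with comments removed.

    Assumption: anything from `comment_char` to the end of a line is a comment.
    """
    entries = []
    for raw in text.splitlines():
        # Drop any comment, then surrounding whitespace.
        line = raw.split(comment_char, 1)[0].strip()
        if line:  # skip blank and comment-only lines
            entries.append(line)
    return entries


# Placeholder content only; the real files will differ.
sample = (
    "# List of councils (placeholder content)\n"
    "Example Council A   # first entry\n"
    "\n"
    "Example Council B\n"
)
print(read_driver_lines(sample))  # ['Example Council A', 'Example Council B']
```

To apply it to a real file, the text could be loaded first, for example with `Path("List_Councils.txt").read_text(encoding="utf-8")` from `pathlib`, assuming the file is UTF-8 encoded.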


Relational Database Management System (RDBMS)
I intentionally used a flat-file text “driver” approach, both for simplicity and because of the many inconsistencies and missing data encountered when gathering the required data files from the various council websites. Going forward, I’m not convinced that adopting an RDBMS (e.g. MySQL, MariaDB, MS SQL Server, Oracle) would be sensible unless the underlying data files are generated and named far more consistently.

Software used

 My Workstation Operating System:  Windows 11

 Basic File Downloading/Renaming/Correction
    Directory List & Print Pro
    Bulk Rename Utility
    UltraEdit
    Excel

 Basic File Content Analysis
    UltraEdit
    Excel
    EncodeAnt

 Programming
    Python   (IDE: Anaconda/Spyder)
    R        (IDE: RStudio)  (early days only)

 Main STV Website
    WordPress
    FileZilla / WinSCP
    PuTTY
    LibreOffice Draw  (Graphics)

 Consolidated Election Results Portal website
    Linux
    CoffeeCup
    FileZilla / WinSCP
    PuTTY

 Websites hosted by IONOS (1&1)
    Virtual Private Server (VPS)
    Ubuntu Linux