Apportionment scenarios

1) Introduction

This vignette explores the implementation of different seat apportionment scenarios. There are generally two ways to assign seats of a parliament: Either you distribute seats within each district independently or you distribute seats according to the vote share for the whole election region (e.g. on a national level). We will focus on the difference in seat shares compared to vote shares these two approaches can produce.

2) Setup data

We’ll mainly use the data set for the 2019 Finnish parliamentary elections. The finland2019 data set contains two data.frames: One with the number of votes for each party, the other data.frame tells us how many seats (election_mandates) each district has.

library(proporz)
str(finland2019)
#> List of 2
#>  $ votes_df         :'data.frame':   229 obs. of  3 variables:
#>   ..$ party_name   : chr [1:229] "KOK" "SDP" "VIHR" "PS" ...
#>   ..$ district_name: chr [1:229] "UUS" "UUS" "HEL" "UUS" ...
#>   ..$ votes        : int [1:229] 114243 97107 90662 86691 84141 78486 73626 66109 59722 55244 ...
#>  $ district_seats_df:'data.frame':   12 obs. of  2 variables:
#>   ..$ district_name: chr [1:12] "HEL" "HÄM" "KAA" "KES" ...
#>   ..$ seats        : int [1:12] 22 14 17 10 7 18 19 8 15 36 ...

To make comparisons easier, we’ll use matrices instead of data.frames. We create a votes_matrix from the given data.frame.

votes_matrix = pivot_to_matrix(finland2019$votes_df)
dim(votes_matrix)
#> [1] 44 12

# Let's look at all parties with at least 10k votes
knitr::kable(votes_matrix[rowSums(votes_matrix) > 10000,])

	HEL	HÄM	KAA	KES	LAP	OUL	PIR	SAT	SKA	UUS	VAA	VAR
KD	7253	11733	10875	8774	1077	5728	17180	3361	17389	14912	16741	5121
KESK	11015	20892	40715	30597	29145	78486	26536	20878	50459	35348	50053	29796
KOK	84141	39511	44224	19707	11243	28382	55244	17666	27699	114243	29530	52367
NYT	13529	0	0	0	0	1593	0	0	0	0	0	0
Nyt	0	0	0	0	0	0	0	0	0	24778	0	0
PIR	5840	996	687	1239	280	1244	2018	253	686	4083	462	1244
PS	47276	43199	46196	27851	17160	52771	51819	30090	39996	86691	42843	52913
RKP	20348	500	0	0	112	432	327	178	0	49625	52880	15238
SDP	52393	49359	59722	29201	13479	26588	66109	31475	38274	97107	33608	49156
SIN	2056	821	3212	705	483	1801	3342	0	4128	9498	2470	1427
STL	1285	1028	829	389	667	847	1371	199	769	2754	777	451
VAS	42899	15172	10371	12681	14128	33820	24341	12435	15908	26257	8315	35481
VIHR	90662	17354	23032	17639	9706	20376	37152	7461	21362	73626	10515	25309

We also create a named vector that defines how many seats each district has (district_seats). Note that the order of district names in district_seats and columns in votes_matrix differ. However, this does not affect the analysis as we access districts by name (as does biproporz() by the way).

district_seats = finland2019$district_seats_df$seats
names(district_seats) <- finland2019$district_seats_df$district_name
district_seats
#> HEL HÄM KAA KES LAP OUL PIR SAT SKA UUS VAA VAR 
#>  22  14  17  10   7  18  19   8  15  36  16  17

3) Distribute seats in every district independently

3.1) Function to distribute seats in districts

In Finland and many other jurisdictions, seats are assigned for each district independently. There is no biproportional apportionment among all districts. The following function, apply_proporz, calculates the seat distribution for each district and returns the number of seats per party and district as a matrix. The user can specify the apportionment method and a quorum threshold.

apply_proporz = function(votes_matrix, district_seats, method, quorum = 0) {
    seats_matrix = votes_matrix
    seats_matrix[] <- NA
    
    # calculate proportional apportionment for each district (matrix column)
    for(district in names(district_seats)) {
        seats_matrix[,district] <- proporz(votes_matrix[,district],
                                           district_seats[district],
                                           quorum = quorum,
                                           method = method)
    }
    return(seats_matrix)
}

3.2) Actual distribution method (D’Hondt)

The Finnish election system uses the D’Hondt method. We’ll calculate the seat distribution as a baseline to compare it with other methods.

bydistrict_v0 = apply_proporz(votes_matrix, district_seats, "d'hondt")

bydistrict_v0[rowSums(bydistrict_v0) > 0,]
#>           district_name
#> party_name HEL HÄM KAA KES LAP OUL PIR SAT SKA UUS VAA VAR
#>       KD     0   1   0   0   0   0   1   0   1   1   1   0
#>       KESK   0   1   3   3   3   6   2   1   4   2   4   2
#>       KOK    6   3   3   1   1   2   4   1   2   9   2   4
#>       Nyt    0   0   0   0   0   0   0   0   0   1   0   0
#>       PS     3   3   4   2   1   4   4   2   3   6   3   4
#>       RKP    1   0   0   0   0   0   0   0   0   3   4   1
#>       SDP    3   4   5   2   1   2   5   3   3   7   2   3
#>       VAS    3   1   0   1   1   3   1   1   1   2   0   2
#>       VIHR   6   1   2   1   0   1   2   0   1   5   0   1

3.3) Alternative methods

We’ll now look at alternative apportionment methods. The Sainte-Laguë method (standard rounding) for example is impartial to party size. On the other hand, the Huntington–Hill method favors small parties. Generally, Huntington-Hill is used with a quorum, otherwise all parties with more than zero votes get at least one seat. In this example, the quorum is 3% of votes in a district.

bydistrict_v1 = apply_proporz(votes_matrix, district_seats,
                                      method = "sainte-lague")

bydistrict_v2 = apply_proporz(votes_matrix, district_seats,
                                      method = "huntington-hill", 
                                      quorum = 0.03)

Let’s compare the seat distributions for these three methods. We get the number of party seats on the national level with rowSums.

df_bydistrict = data.frame(
    D.Hondt = rowSums(bydistrict_v0),
    Sainte.Lague = rowSums(bydistrict_v1),
    Huntington.Hill = rowSums(bydistrict_v2)
)

# sort table by D'Hondt seats
df_bydistrict <- df_bydistrict[order(df_bydistrict[[1]], decreasing = TRUE),] 

# print parties with at least one seat
knitr::kable(df_bydistrict[rowSums(df_bydistrict) > 0,])

	D.Hondt	Sainte.Lague	Huntington.Hill
SDP	40	36	36
PS	39	37	37
KOK	38	35	35
KESK	31	29	29
VIHR	20	25	27
VAS	16	18	18
RKP	9	8	8
KD	5	7	6
Nyt	1	2	2
NYT	0	1	1
SIN	0	1	0

The actual political analysis is better left for people familiar with the party system. However, it is fair to say that the election system within districts has a significant impact on the number of seats in parliament. For example, the party with the most seats changes with Sainte-Laguë or Huntington-Hill.

3.4) Compare seat share with vote share

Let’s now compare the vote shares on the national level with the seat share in parliament. Since every entry in votes_matrix is only one voter, we can simply use the row sums to get the vote shares. Otherwise, we’d have to weigh votes in each district according to the number of seats. Since disproportionality analysis is not the focus of this package, we’ll simply look at the difference in shares for the actual distribution method.

vote_shares = rowSums(votes_matrix)/sum(votes_matrix)

shares = data.frame(
    seats = rowSums(bydistrict_v0)/sum(district_seats),
    votes = vote_shares 
)
shares$difference <- shares$seats-shares$votes
shares <- round(shares, 4)

# Only look at parties with at least 0.5 % of votes
shares <- shares[shares$votes > 0.005,]
shares <- shares[order(shares$difference),]

shares
#>       seats  votes difference
#> VIHR 0.1005 0.1154    -0.0149
#> KD   0.0251 0.0391    -0.0140
#> SIN  0.0000 0.0098    -0.0098
#> PIR  0.0000 0.0062    -0.0062
#> Nyt  0.0050 0.0081    -0.0030
#> VAS  0.0804 0.0821    -0.0017
#> RKP  0.0452 0.0455    -0.0003
#> KESK 0.1558 0.1381     0.0176
#> KOK  0.1910 0.1707     0.0202
#> PS   0.1960 0.1756     0.0204
#> SDP  0.2010 0.1781     0.0229

The difference between seat and vote share varies among parties from -1.5 to +2.3 percentage points.

4) Biproportional method

4.1) Biproportional party seats

We’ll now look at biproportional apportionment and whether it better matches the seat share with the national vote share. Keep in mind that simply using existing data sets with biproporz is not really suitable from a modeling perspective. Vote distributions are likely to be different if, for example, people know their vote also counts on a national level or if smaller parties choose to run in more districts.

With that being said, let’s assume our data set is properly modeled already. We need to consider that voters in Finland can only vote for one person in each district. They can’t distribute as many votes as there are seats in their district among candidates/parties. Thus we need to use use_list_votes=FALSE as a parameter in biproporz.

seats_biproportional = biproporz(votes_matrix, 
                                 district_seats, 
                                 use_list_votes = FALSE)

# show only parties with seats
seats_biproportional[rowSums(seats_biproportional) > 0,]

	HEL	HÄM	KAA	KES	LAP	OUL	PIR	SAT	SKA	UUS	VAA	VAR
KD	0	1	1	1	0	1	1	0	1	1	1	0
KESK	1	1	3	2	2	6	2	1	3	2	3	2
KOK	5	3	3	1	1	2	4	1	2	7	2	4
KP	0	0	0	0	0	0	1	0	0	0	0	0
NYT	1	0	0	0	0	0	0	0	0	0	0	0
Nyt	0	0	0	0	0	0	0	0	0	2	0	0
PIR	1	0	0	0	0	0	0	0	0	0	0	0
PS	3	3	3	2	1	4	3	2	3	5	3	3
RKP	1	0	0	0	0	0	0	0	0	4	3	1
SDP	3	4	4	2	1	2	4	2	3	6	2	3
SIN	0	0	0	0	0	0	0	0	1	1	0	0
STL	0	0	0	0	0	0	0	0	0	1	0	0
VAS	2	1	1	1	1	2	2	1	1	2	1	2
VIHR	5	1	2	1	1	1	2	1	1	5	1	2

Let’s now look at the difference between vote and seat share.

vote_shares = rowSums(votes_matrix)/sum(votes_matrix)
seat_shares = rowSums(seats_biproportional)/sum(seats_biproportional)

range(vote_shares - seat_shares)
#> [1] -0.005144852  0.002429257

As we can see, the difference between vote and seat share ranges from -0.5 to +0.2 percentage point. Biproportional apportionment matches the national vote share better than apportionment by district. This is expected however, since biproportional apportionment actually considers the national vote share. Discussing the pros and cons of a regional representation compared to a priority on national vote shares is not within the scope of this vignette. The following chunk shows the seat changes.

seat_changes = seats_biproportional-bydistrict_v0

knitr::kable(seat_changes[rowSums(abs(seat_changes)) > 0,colSums(abs(seat_changes))>0])

	HEL	KAA	KES	LAP	OUL	PIR	SAT	SKA	UUS	VAA	VAR
KD	0	1	1	0	1	0	0	0	0	0	0
KESK	1	0	-1	-1	0	0	0	-1	0	-1	0
KOK	-1	0	0	0	0	0	0	0	-2	0	0
KP	0	0	0	0	0	1	0	0	0	0	0
NYT	1	0	0	0	0	0	0	0	0	0	0
Nyt	0	0	0	0	0	0	0	0	1	0	0
PIR	1	0	0	0	0	0	0	0	0	0	0
PS	0	-1	0	0	0	-1	0	0	-1	0	-1
RKP	0	0	0	0	0	0	0	0	1	-1	0
SDP	0	-1	0	0	0	-1	-1	0	-1	0	0
SIN	0	0	0	0	0	0	0	1	1	0	0
STL	0	0	0	0	0	0	0	0	1	0	0
VAS	-1	1	0	0	-1	1	0	0	0	1	0
VIHR	-1	0	0	1	0	0	1	0	0	1	1

4.2) Distribute seats among districts as well

Normally the number of seats per district is defined before an election, based on census data. However, you could also assign the number of seats per district, based on the actual votes cast in every district. This way, each districts seat share is proportional to its vote share. To do this, we use the total number of seats for the district_seats parameter.

full_biproportional = biproporz(votes_matrix, 
                                district_seats = sum(district_seats),
                                use_list_votes = FALSE)

# party seat distribution has not changed
rowSums(full_biproportional) - rowSums(seats_biproportional)
#>   Asyl   E117   E118   E154   E168   E169   E185   E191   E192   E287   E491 
#>      0      0      0      0      0      0      0      0      0      0      0 
#>   E492   E493    EOP     FP     IP     KD   KESK    KOK     KP    KTP   LIBE 
#>      0      0      0      0      0      0      0      0      0      0      0 
#>    LII   LIIK     LN   LNLY    LNY   LNYL    NYT    Nyt    PEL    PIR     PS 
#>      0      0      0      0      0      0      0      0      0      0      0 
#>    REF    RKP    RLI Reform    SDP    SIN    SKE    SKP    STL    VAS   VIHR 
#>      0      0      0      0      0      0      0      0      0      0      0

# district seat distribution is different
colSums(full_biproportional) - colSums(seats_biproportional)
#> HEL HÄM KAA KES LAP OUL PIR SAT SKA UUS VAA VAR 
#>   3  -1  -1   0   0  -1   0   0  -1   0   0   1

As we can see, the number of party seats does not change. However, districts where more people voted get a higher share of the seats. 2 districts gain seats, 4 districts lose a seat.

5) Effect of a system change on seat distribution

As a last example, we’ll look at the changes in seat distribution biproportional apportionment had in the 2020 parliament election in the Swiss canton of Uri. In Uri, 2020 was the first year the cantonal parliament was elected with biproportional apportionment for some municipalities. Previously, these four municipalities used the Hagenbach-Bischoff method. We’ll ignore all other municipalities which use a majoritarian electoral system.

seats_old_system = apply_proporz(uri2020$votes_matrix, uri2020$seats_vector, "hagenbach-bischoff")

seats_new_system = biproporz(uri2020$votes_matrix, uri2020$seats_vector)

seats_new_system-seats_old_system
#>      Altdorf Bürglen Erstfeld Schattdorf
#> CVP        1       0        0          0
#> SPGB      -1       0        0          0
#> FDP        0       0        0          0
#> SVP        0       0        0          0

Compared to the previous election system there was a change of one seat (out of 37).