Non-Euclidean Democracy

Democracy is as old as Geometry, but unlike that other Greek invention, Democracy has not evolved since Euclid's time. While Geometry kept perfecting itself, undergoing revisions and discovering new concepts, Democracy remained frozen in time.
After proving unstable against civil war in ancient Rome, it was supplanted by monarchy for thousands of years, until it was revived a couple of centuries ago. The great political leaders who revived it had the best intentions, and indeed they brought a vast amount of progress to the modern world.
Unfortunately, they left some old Greek bugs unfixed. Democracy, as an election algorithm, fails far too often for these bugs to remain neglected. It is high time to debug this algorithm or, better yet, redesign it completely.

Greek Election Algorithm

The pseudo-code for Presidential Election (PE) is very simple. Here is the US version; others are similar. A toy sketch of the counting steps follows the listing.

Algorithm PE_Greek
1. Let presidential candidates promote themselves and present their promises through the media.
2. Let every citizen cast his/her vote, which is 0 for every candidate except possibly one, where the vote is 1.
3. Find majority winners in every state.
4. Elect the president by majority of victories in states.
5. Do not let the same president be nominated more than twice in a row.
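
A toy sketch of steps 2 to 4 in Python (the ballots and state sizes are randomly generated placeholders; real apportionment weighs states by electors, which this sketch ignores):

import numpy as np

rng = np.random.default_rng(4)
n_states, n_candidates = 50, 3

# Placeholder ballots: vote counts per candidate in each state.
state_votes = rng.integers(0, 1_000_000, size=(n_states, n_candidates))

# Step 3: the majority (plurality) winner in every state.
state_winners = state_votes.argmax(axis=1)

# Step 4: the president is the candidate winning the most states.
victories = np.bincount(state_winners, minlength=n_candidates)
print("state victories:", victories, "-> president:", victories.argmax())
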
There are several obvious flaws in this algorithm. The simplest one (though the least significant) was demonstrated recently by the Putin-Medvedev tandem in Russia: this algorithm can get into the infinite loop PP-M-PP-M… without breaking the last rule.

Simple Correlation Algorithm

There are far more significant flaws in the PE_Greek algorithm, at the level of its design. From the point of view of Computer Science, this is a particular case of Data Mining, with the purpose of predicting the future behavior of the President.
The first principle of Data Mining, neglected in the Greek implementation of Democracy, is feedback. To predict a time series such as the behavior of Presidents, one needs some Machine Learning algorithm in place. At the end of the President's term, the results of his rule must be accounted for and then used to improve the predictive power of subsequent elections.
One way to do this would be an independent commission of internationally respected politicians and scholars, which would produce a rating of every president in the recent past, say the last ten.
Each citizen will have to produce a similar rating for each presidential candidate at voting time, reflecting his expectations of the future performance of each candidate. At the end of each term, the commission will rate the President on his accomplishments, and that rating will be compared with each citizen's rating of this candidate at election time. The correlation coefficient between these two ratings will define the voting power of that citizen in the next election.
So, here is the first improved version of the Greek algorithm, with a sketch of the weighting step after the listing.

Algorithm PE_Corr
1. Let presidential candidates promote themselves and present their promises through the media.
2. Let every citizen record his/her rating (between 0 and 100) for each presidential candidate.
3. Multiply every rating by the mean correlation coefficient between his/her past scores and the de facto ratings of the past presidents.
4. Elect the president by simple majority of ratings.
5. Do not prevent the same president from being re-elected.
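
A minimal sketch of step 3 in Python, assuming the national voting database supplies, for one citizen, his/her election-time ratings of recent presidents alongside the commission's posterior ratings (the arrays below are hypothetical placeholders; clipping negative correlations to zero is one possible convention the text leaves open):

import numpy as np

# Hypothetical history for one citizen: election-time ratings of the
# last few presidents, and the commission's posterior ratings of the
# same presidents (all on the 0..100 scale).
citizen_ratings = np.array([80.0, 35.0, 60.0, 90.0, 20.0])
commission_ratings = np.array([75.0, 30.0, 70.0, 85.0, 25.0])

# Pearson correlation between the citizen's guesses and the outcomes.
corr = np.corrcoef(citizen_ratings, commission_ratings)[0, 1]

# One convention: anti-predictive voters get zero weight rather than
# a negative one.
voting_power = max(corr, 0.0)

# The citizen's rating of a current candidate, scaled by voting power.
print(f"voting power = {voting_power:.2f}, weighted rating = {voting_power * 65.0:.1f}")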

This correlation coefficient will become more and more meaningful with every election, as more and more feedback gets into the voting database. At some point, perhaps 20 elections later, there will be enough data to set up a more sophisticated prediction engine.

Adaptive Regression Algorithm

With millions of individual ratings and just a few past elections to process, this is a particular case of data mining calling for the so-called Sparse Lasso Regression method.

The essence of this method, as applied to our case, is to fit the posterior presidential rating as a linear combination of the mean pre-electoral ratings of certain representative groups of the population, defined by race, age, gender, occupation, health, wealth, state, city, etc.

These groups may overlap; the only constraint is that each has to be demographically meaningful and contain a sufficiently large fraction of the population. The simplest would be a 50/50 split, as with sex, or equal quantiles of wealth or age, as sketched below.
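
As an illustration, assuming each voter's registration form supplies a few demographic parameters (the columns below are randomly generated placeholders), overlapping group memberships can be encoded as a boolean matrix, and the regression features of a candidate are the mean ratings within each group:

import numpy as np

rng = np.random.default_rng(0)
n_voters = 1000

# Hypothetical demographic columns (in reality, from registration forms).
sex = rng.integers(0, 2, n_voters)        # 0 or 1
age = rng.integers(18, 90, n_voters)      # years
wealth = rng.lognormal(10, 1, n_voters)   # dollars

# Overlapping groups: a 50/50 split by sex, plus wealth and age quartiles.
groups = {"female": sex == 0, "male": sex == 1}
for q in range(4):
    lo, hi = np.quantile(wealth, [q / 4, (q + 1) / 4])
    groups[f"wealth_q{q + 1}"] = (wealth >= lo) & (wealth <= hi)
    lo, hi = np.quantile(age, [q / 4, (q + 1) / 4])
    groups[f"age_q{q + 1}"] = (age >= lo) & (age <= hi)

# Membership matrix (one column per group) and the group-mean ratings
# of one candidate, which serve as the regression features.
membership = np.column_stack(list(groups.values()))
ratings = rng.uniform(0, 100, n_voters)
features = np.array([ratings[m].mean() for m in membership.T])
print(membership.shape, features.round(1))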

In view of the paramount importance of the choice of these groups, and the risk of human bias or malicious intent in the initial set of groups, special algorithms for automated clustering of the population by their correlated ratings can be suggested instead.

This corresponds to so-called unsupervised learning in Computer Science. It would be closer to the spirit of the equality of all citizens to let the computer make the initial selection of groups not by human criteria, but rather by the correlation of the ratings of people within each group.
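
A minimal sketch of such unsupervised grouping, using hierarchical clustering on rating correlations (SciPy routines on randomly generated stand-in data; the target of ten groups is an arbitrary choice):

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
# Hypothetical rating histories: rows are voters, columns are past
# elections or events they have rated on the 0..100 scale.
histories = rng.uniform(0, 100, size=(200, 12))

# Distance between two voters = 1 - correlation of their histories,
# so voters who rate alike end up in the same group.
dist = 1.0 - np.corrcoef(histories)
np.fill_diagonal(dist, 0.0)
dist = np.clip(dist, 0.0, None)  # guard against round-off

# Average-linkage hierarchical clustering, cut into ten groups.
tree = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(tree, t=10, criterion="maxclust")
print(np.bincount(labels)[1:])  # group sizes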

That was the regression part. The Lasso adds an extra constraint: only a fixed number of groups is used, with the regression coefficients of these groups chosen for the best fit. As the number of allowed groups increases one by one, each time adding the particular group that yields the biggest improvement of the fit, the in-sample fitting error, i.e., the error on the dates used for fitting, steadily decreases.

There must also be a so-called out-of-sample set of dates (say, a random 10 out of 20 elections go into the in-sample set, and the other 10 into the out-of-sample set). The out-of-sample dates are used to check that the fit quality indeed increases with the number of groups allowed.

This increase will happen in the beginning; however, after some number of groups, the out-of-sample fit quality will start deteriorating. This phenomenon is called over-fitting: the in-sample fit picks up fine details that are not repeated in other data sets. This is when one must stop adding groups to the regression and come back to the optimal number, the one giving the best fit out-of-sample.
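
A minimal sketch of this selection procedure, using scikit-learn's Lasso on synthetic data (here the strength of the L1 penalty plays the role of the group-count constraint: a weaker penalty keeps more groups, improves the in-sample fit, and eventually over-fits):

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n_events, n_groups = 20, 100  # dozens of events, hundreds of groups

# Synthetic stand-in for the archive: mean prior group ratings per
# event (X) and the posterior commission rating of each event (y).
X = rng.uniform(0, 100, size=(n_events, n_groups))
true_w = np.zeros(n_groups)
true_w[:5] = [0.4, 0.3, -0.2, 0.2, 0.1]
y = X @ true_w + rng.normal(0, 2, n_events)

# Random 50/50 split into in-sample and out-of-sample events.
idx = rng.permutation(n_events)
fit_idx, test_idx = idx[:10], idx[10:]

# Sweep the penalty; keep the model with the best out-of-sample R^2.
best = None
for alpha in [100.0, 30.0, 10.0, 3.0, 1.0, 0.3]:
    model = Lasso(alpha=alpha, max_iter=50_000).fit(X[fit_idx], y[fit_idx])
    oos = model.score(X[test_idx], y[test_idx])  # out-of-sample R^2
    kept = np.count_nonzero(model.coef_)
    print(f"alpha={alpha:5.1f}  groups kept={kept:3d}  oos R^2={oos:.3f}")
    if best is None or oos > best[0]:
        best = (oos, model)
oos_best, model = best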

As a result, not just the weights but also the groups of citizens used in the regression will be chosen automatically, without any human bias, based on their relevance for predicting the Presidential rating. Thus, the underlying idea of the equality of all citizens, which is so dear to our hearts, will still be present in all these implementations of the PE algorithm, though the final contribution of each citizen will depend on the history of his guessed ratings in past elections.

Some citizens with great predictive powers will automatically obtain a high weight (or a high negative weight, if they are consistently wrong), while those who vote randomly will effectively decouple. In other words, this would be a rudimentary AI, listening to everyone and learning to predict the future from the multitude of public opinions.

The theory and practice of Sparse Lasso Regression (SLR) are well developed in the field of Statistical Learning. The most striking feature is that one does not have to keep the number of groups much smaller than the number of processed past elections: SLR is known to work well with just dozens of events and hundreds of groups (called features in the Machine Learning literature).

So, here is the pseudo-code of the proposed PE_SLR algorithm, followed by a sketch of its prediction step.

Algorithm PE_SLR
1. Let presidential candidates promote themselves and present their promises through the media.
2. Let every citizen record his/her rating (between 0 and 100) for each presidential candidate.
3. Pool the ratings across the various voting groups (established either by a Voting Committee or by unsupervised clustering by rating correlation).
4. For each candidate, weight these group ratings with the SLR coefficients computed after previous elections by fitting the renewed National Data (in general, an exponential moving average will be used to discount older election data).
5. Elect the president by simple majority of regression predictions.
6. Do not prevent the same president from being re-elected.
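
Continuing the earlier Lasso sketch (reusing its rng, model, and n_groups), steps 3 to 5 reduce to a few lines; the candidate names are placeholders, and a comment notes where the moving-average discount would enter:

import numpy as np

# Hypothetical output of steps 2-3: the mean rating of each candidate
# within each of the n_groups voting groups.
candidates = ["A", "B", "C"]
group_ratings = rng.uniform(0, 100, size=(len(candidates), n_groups))

# Step 4: regression prediction of each candidate's posterior rating.
# (The exponential moving-average discount of older elections would be
# applied at fit time, by down-weighting old events in the archive.)
predicted = model.predict(group_ratings)

# Step 5: elect the candidate with the highest predicted rating.
winner = candidates[int(np.argmax(predicted))]
print(dict(zip(candidates, predicted.round(1))), "-> elected:", winner)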

Political Problems and Solutions

Surely, this election process (just one new data point every four years) is extremely slow compared to the speed of development of Computer Science. Once the data collection mechanism is in place (mandatory rating of every candidate by every citizen at every election), the method will have time to be perfected over the coming decades.

I am sure that, given the utter importance of this process, there will be enough input from the academic community to perfect this method. There will always be an archive of past data on which to test prediction methods.

This will still not protect us from abuses of power, such as pressure on the government from big business, nor from the abuses created by epidemics of global madness spreading through social networks.

The President should simply have enough power not to be paralyzed by such external pressures, as he is paralyzed now. No machine learning can give him/her such power; this must be done by legislation, once and for all.

The chance of electing a crazy maniac who will abuse the power and drive the country into the ground is by now much smaller than the chance that no President will be able to do anything in the existing hostile political environment.

The technical problem of creating a machine learning procedure for presidential elections is infinitely simpler than the political problem of getting these ideas accepted by the people.

It would require a lot of discussion in academic and political circles, and especially a lot of media support, even to get started with such a radical modification of the cornerstones of our society.

The first step would be simply to get started with data collection, namely the obligatory rating of all candidates at every election by every voter. While these data are being accumulated, academic researchers must have them available for experiments with prediction algorithms.

Creating a universally respected and balanced committee for rating the President after the expiration of his term would be the next difficult step. The media, as well as scholars in the political and economic sciences, produce such ratings anyway, but making them legal and official is quite another story.

The practical way around these political problems would be to make these ratings Democratic in the Greek sense, i.e., determine them by a national poll. Voting here would not be as biased as in an election, because it would take place after the fact, not before. This may be more acceptable to people raised in old-style Democracy.

The people genuinely vote on the past President, and supercomputers perform the automated corrections of their expectations about future Presidents based on past experience. Perhaps this version of the final rating would be more acceptable politically.

How to Fix The Congress

Once the President is elected, he/she has to make decisions every day, and an important part of these decisions must be approved by Congress, again by simple majority voting. Due to the small number of voters here, the bugs of Greek Democracy display themselves in an especially obvious and frustrating way.

Instead of an honest expression of opinions, voting becomes a dirty political game, with shameless feedback from personal (or party-line) political goals. This game is directly or indirectly influenced by big corporations and rich individuals. This has not changed much since Roman times.

Let us see how Machine Learning can help us fix these Congressional problems. Again, as with the Presidential elections, it is much easier to judge past decisions than to foresee the consequences of current ones. Moreover, this process of every Congressman rating past Presidential decisions and guessing the ratings of future ones can be done in a very private and secret way, at random times, to avoid political conspiracies.

There must be some technology (a lie detector?) to make a Congressman honestly answer questions such as the rating of past Presidential decisions, especially if the party leader is not present to influence the answers.

If this interrogation technique looks insulting to the Congressmen, maybe they have deserved it with their shameless behavior lately.

The details of the SLR algorithm, which uses these old ratings to correct the current rating into a regression estimate of the current decision, must be kept secret, so that congressmen cannot reverse engineer it.

The basic idea is the same as with the Presidential elections: the Machine Learning takes as input the whole history of prior and posterior ratings of past Presidential decisions by every Congressman, and produces from the prior ratings a regression estimate of the future posterior verdict of Congress.
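
A minimal sketch, on randomly generated stand-in data, assuming an archive of past decisions (rows) privately pre-rated by each of 100 congressmen (columns), with a posterior rating of how each decision turned out:

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n_decisions, n_members = 60, 100

# Archive: private prior ratings of past decisions by each congressman,
# and the posterior rating of each decision after the fact.
priors = rng.uniform(0, 100, size=(n_decisions, n_members))
outcomes = priors[:, :10].mean(axis=1) + rng.normal(0, 5, n_decisions)

# Fit: learn which congressmen's private judgments predict outcomes.
model = Lasso(alpha=1.0, max_iter=50_000).fit(priors, outcomes)

# A new decision: collect private prior ratings and output the
# regression estimate of its posterior rating, instead of a raw vote.
new_priors = rng.uniform(0, 100, size=(1, n_members))
print("estimated posterior rating:", round(float(model.predict(new_priors)[0]), 1))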

As with the general elections, the principle of equality is fulfilled, but instead of real-time feedback from the ongoing voting process (the worst feature of the Greek election), there will be automated feedback from past voting.

This will to a great extent disarm dishonest politicians from playing voting games, as these games would become too complex for a human, and humans would not have all the relevant information to play them anyway.

Going Online

The sophisticated reader must be smiling by now: this will never pass into law, just forget it! I agree that standard political methods will never make such radical changes happen in peaceful times.

I would like to remind this reader of the new mega-force unleashed by the Internet: the force of social networks. This force is capable of changing societies, as we have seen in the Arab world, and perhaps even in the Western world; I mean the Occupy Wall Street movement, which is already spreading over the whole world.

With respect to my voting algorithms, I suggest creating a global online game: the Prediction Game. All participants will voluntarily rate some political events before and after they occur. I will create and support the Prediction System, based on SLR or some other Machine Learning algorithm, which would take all these ratings, generate representative groups by Unsupervised Learning, compute weights, and keep constantly learning how to predict political events by fitting posterior ratings to an SLR built from the prior ones.

Everything will be transparent. The predictions will be published on my website before the event, protected by a password which will be revealed after the agreed time. Moreover, I will accept other machines: I will take their predicted ratings as inputs into my system, so if they have a high success rate as predictors, they will automatically get a high weight in my adaptive system.
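
One simple way to implement this publish-now, reveal-later scheme is a hash commitment: before the event, publish only a digest of the prediction plus a secret key; after the event, reveal both, so anyone can check that nothing was changed. A minimal sketch with Python's standard library:

import hashlib
import secrets

def commit(prediction: str) -> tuple[str, str]:
    """Return (public digest, secret key) for a prediction."""
    key = secrets.token_hex(16)
    digest = hashlib.sha256((key + prediction).encode()).hexdigest()
    return digest, key

def verify(prediction: str, key: str, digest: str) -> bool:
    """After the reveal, anyone can recompute and check the digest."""
    return hashlib.sha256((key + prediction).encode()).hexdigest() == digest

# Before the event: publish only the digest.
digest, key = commit("candidate A posterior rating: 71")
# After the event: publish the prediction and the key.
assert verify("candidate A posterior rating: 71", key, digest)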

This way we can save the centuries of learning the SLR system would otherwise need. With such a game, the SLR system may get smart enough in just a few years, if we keep asking people questions every few days and use their millions of answers as input for the Machine Learning.
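
As for accepting other machines: their predicted ratings can simply be appended as extra feature columns, so the SLR assigns them weights exactly as it does to human groups. Continuing the Lasso sketch above (reusing its X, y, fit_idx, and rng; the three external predictors are placeholders):

import numpy as np
from sklearn.linear_model import Lasso

# Append other machines' predicted ratings as extra columns.
machine_preds = rng.uniform(0, 100, size=(X.shape[0], 3))
X_ext = np.hstack([X, machine_preds])
model_ext = Lasso(alpha=1.0, max_iter=50_000).fit(X_ext[fit_idx], y[fit_idx])

# Successful external predictors automatically earn non-zero weights.
print("weights on external predictors:", model_ext.coef_[-3:].round(3))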

My main idea is that the technology for a global brain is already there, and this simple SLR algorithm may be a practical way to create this global brain, make it learn, and help mankind foresee the future.

There is a threshold to overcome before such a Prediction Game may become pandemic, as an efficient learning system needs large numbers for reliable statistics. I would need help from Internet activists familiar with the ways of social networking, maybe even collaboration with Facebook or Google.

The successful launch of such a game may have historical consequences on a global scale.

Security and Transparency

One of my friends responded to the first version of this document: “Politics is about big money, so the Big Money will find the way to game your system like they game the financial markets”.

I agree that the danger is there, but my hope is that the Internet, with its social networks, is a force to rival Big Money. There are well-known distributed online systems, such as Wikipedia and Bitcoin, not to mention Facebook, which have a life of their own, not serving Big Money and representing global cooperation.

The danger of hackers falsifying my online polling can be dealt with by the same methods that are used in electronic commerce. There are good enough encryption methods and firewalls that identification of voters and secrecy of the archives can be achieved.

Privacy can be addressed in a variety of ways, including the following one. As the true identity of the voter is not needed for my algorithm, only his/her demographic parameters, the user is invited to take an alias during registration (he still has to fill in the demographic form).

His identity can be established by a standard combination of username and password, the latter automatically generated for him by known algorithms. Stealing or selling a voter's identity cannot be prevented, but it would not help the thief or the buyer achieve any political goals, as the weight automatically assigned to every demographic group by the SLR algorithm depends on the correctness of its historical judgments rather than on the sheer number of voters.
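
A minimal sketch of such registration with Python's standard library (the salted-hash storage is one standard choice; the alias and iteration count are arbitrary):

import hashlib
import secrets

def register(alias: str) -> tuple[str, bytes, bytes]:
    """Auto-generate a password for the alias; store only its salted hash."""
    password = secrets.token_urlsafe(12)
    salt = secrets.token_bytes(16)
    stored = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return password, salt, stored

def login(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return secrets.compare_digest(candidate, stored)

password, salt, stored = register("voter_1789")
assert login(password, salt, stored)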

Still, some protection must be used, especially when computing the total rating of past presidents by a national poll. I believe that the organizations which currently do this, such as the Gallup Institute, do not make a secret of their protection measures, so these can be copied and used in our polling as well.

As for potential accusations of political or economic bias in the proposed Non-Euclidean Democracy, there are two ways to deflect them. First, this must be a non-profit organization with totally transparent financial books. Second, the prediction algorithm (SLR or some other) must also be visible online (read-only, of course). Machine learning experts will be welcome to improve this algorithm.

However, to protect the system from gaming by hackers, some parameters of this algorithm (without which it would not run at all, or would run differently) should be hidden from public view. The source code by itself would be sufficient to convince an expert that there is no inequality between voters. The hidden parameters (such as seeds for random number generators) would not change this equality, which is all that is needed to deflect accusations of any bias by race, gender, sexual orientation, age, etc.