After Two Years, Peru's Four-member Investigation Team Released an Anti-corruption AI

Borges's short story "Funes the Memorious" describes a prodigy named Funes who has extraordinary powers of observation and memory.
After a fall from a horse, the young Funes gained an astonishing memory and eye for detail. He could recall every detail of his past experiences and every word in a book, and could even perceive the subtle changes in life and nature, such as the growth of new buds and the withering of petals.
In the story, Funes is like a man with a third eye, able to observe, remember, understand and interpret everything.
Anti-corruption AI Funes: Finding clues of violations in contracts
This story inspired the Peruvian digital investigation agency ojo-publico. Its team sees Borges's Funes as a figure for today's algorithms, which can uncover secrets hidden beneath the surface.
The agency's investigative journalists, machine learning experts, and legal advisors worked together, basing their research on 245,000 contracts and account records covering government procurement, construction projects, and election donations, all made public by the Peruvian government.
After two years of work, the team had developed an AI contract-review model that detects signs of corruption and violations in contracts. Because the algorithm could observe and review every detail, they named the model Funes.

So far, Funes's meticulous verification work has flagged 110,000 problematic contracts out of the 245,000, with a total value of 57 billion soles (the Peruvian currency; approximately 100.9 billion yuan).
Following the leads in these problematic contracts, the team of reporters carried out deeper investigation and verification, exposing a number of corrupt and illegal procurement practices by the Peruvian government that involved many large Peruvian companies and a total amount of nearly 100 billion euros.
AI has a keen eye for the tricks in procurement
The findings touch on several major corruption cases in Peru in recent years, including:
- Petroperú, a state-owned oil refining and processing company in Peru, won government procurement projects worth nearly RMB 2.4 billion through public bidding in multiple states and provinces over the past four years. In 90% of these projects it was the only bidder, which seriously violated Peru's government procurement regulations and distorted normal market competition.
Across the 240,000 government procurement contracts and records, Funes also found that tens of millions of dollars in public spending went to companies that had been established for less than 20 days.
- For example, the catering company Melcesca was registered on October 23, 2015. Less than two weeks after its establishment, it won an open tender held by the National University of San Antonio Abad of Cusco (Unsaac) in Peru and became the supplier of the school's canteen. (There were 16 bidders in this tender.)
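The two patterns described above, a supplier that is only days old when it wins a contract and a supplier that repeatedly wins as the only bidder, can be expressed as simple checks. The sketch below is illustrative only: the function names, the 20-day threshold as a parameter, and the sample award date are assumptions made for this example, not details of how Funes is actually implemented.

```python
from datetime import date

# Assumed threshold, taken from the "less than 20 days" figure in the text.
MAX_AGE_DAYS = 20


def company_age_at_award(registered: date, awarded: date) -> int:
    """Number of days between company registration and contract award."""
    return (awarded - registered).days


def is_newly_created_supplier(registered: date, awarded: date,
                              max_age_days: int = MAX_AGE_DAYS) -> bool:
    """True when the supplier was younger than max_age_days at award time."""
    age = company_age_at_award(registered, awarded)
    return 0 <= age < max_age_days


def single_bidder_share(bidder_counts: list[int]) -> float:
    """Fraction of tenders that received exactly one bid."""
    if not bidder_counts:
        return 0.0
    return sum(1 for n in bidder_counts if n == 1) / len(bidder_counts)


# Registration date from the text; the award date is a hypothetical value
# chosen to fall "less than two weeks" later.
registered = date(2015, 10, 23)
hypothetical_award = date(2015, 11, 4)
print(company_age_at_award(registered, hypothetical_award))
print(is_newly_created_supplier(registered, hypothetical_award))
```

A check like this only surfaces candidates for review. As the Melcesca case shows, a flagged tender may still have had many bidders, so the flag is a starting point for reporters rather than a verdict.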

It has attracted huge public attention in Peru
Funes has uncovered many similar illegal operations and risky contracts. Its sharp judgment rests on well-established investigative practice and well-established algorithms.
In the field of public policy and sociology, many scholars are committed to studying government information disclosure and corruption. Funes' algorithm is also based on the research results of a senior scholar, Mihaly Fazekas.
Mihaly Fazekas is a doctoral researcher in human, social and political science at the University of Cambridge. In his research, he developed a set of corruption detection algorithms for government procurement contracts and account flows. He found that certain types of contracts have distinctive characteristics that make them entry points for uncovering violations and corruption. Once these characteristics are identified, the relevant contracts and related information can be picked out of a large volume of documents.
These characteristics include:
- The tender is not open to public bidding;
- The public notice period for the tender is significantly shorter than usual;
- There is a clear disparity in size and strength among the bidders;
- The procurement contract has been modified a number of times;
- The time taken to decide the winner is too short or too long.
Based on these judgments, he designed an evaluation model and defined a Corruption Risk Index (CRI) as a weighted sum of elementary indicators:

$$CRI_i = \sum_j w_j \cdot CI_{ji}$$

where $CRI_i$ is the corruption risk index of contract $i$,
$CI_{ji}$ is the $j$th elementary corruption indicator observed in the tender for contract $i$,
and $w_j$ is the weight of the $j$th elementary corruption indicator.
CRI = 0 indicates the lowest corruption risk.
CRI = 1 indicates the maximum observed corruption risk.
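As a rough illustration of how such an index could be computed, the sketch below treats each characteristic listed earlier as an indicator between 0 and 1 and combines them with weights. The indicator names, the equal weights, and the normalisation of the weights so they sum to 1 are all assumptions made for this example; the weights actually used by Fazekas or by Funes are not given in this article.

```python
def corruption_risk_index(indicators: dict[str, float],
                          weights: dict[str, float]) -> float:
    """Weighted sum of elementary indicators, each expected to lie in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    for name, value in indicators.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"indicator {name!r} must lie in [0, 1]")
    # Dividing by the total weight is an assumption of this sketch: it makes
    # the weights sum to 1 so the index stays between 0 and 1.
    weighted = sum(weights[name] * indicators.get(name, 0.0) for name in weights)
    return weighted / total_weight


# Hypothetical flags for one contract (1.0 = red flag present, 0.0 = absent).
flags = {
    "non_open_procedure": 1.0,
    "short_notice_period": 1.0,
    "unequal_bidders": 0.0,
    "contract_modified": 0.0,
    "unusual_decision_time": 0.0,
}
# Equal weights are a placeholder, not the weights of the published model.
equal_weights = {name: 1.0 for name in flags}

cri = corruption_risk_index(flags, equal_weights)
print(round(cri, 2))
```

With two of five equally weighted flags raised, the contract scores 0.4. A contract with no flags scores 0 and one with every flag raised scores 1, matching the range described above.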

Single bidding is modeled with a logistic regression:

$$\Pr(\text{single bidder}_i = 1) = \frac{1}{1 + e^{-Z_i}}$$

$$Z_i = \beta_0 + \beta_{1j} R_{ij} + \dots + \beta_{4m} C_{im} + \varepsilon_i$$

where single bidder$_i$ equals 1 if the $i$th contract had only one bidder and 0 if it had more;
$Z_i$ is the logit (log-odds) of contract $i$ being a single-bidder contract; $\beta_0$ is the regression constant;
$R_{ij}$ is the matrix of $j$ corruption "red flags" for the $i$th contract, such as the length of the public notice period;
$C_{im}$ is the matrix of $m$ control variables for the $i$th contract, such as the number of competitors in the market;
$\varepsilon_i$ is the error term;
and $\beta_{1j}, \dots, \beta_{4m}$ are the coefficient vectors of the explanatory and control variables.
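The variables defined above read like the terms of a logistic regression on single bidding. Assuming that form, the sketch below shows how a fitted model would turn a contract's red flags and control variables into a probability of single bidding. The function name and the coefficient values are hypothetical and chosen only to make the example concrete; they are not estimates from Fazekas's research or from Funes.

```python
import math


def single_bid_probability(beta0: float,
                           coefficients: list[float],
                           features: list[float]) -> float:
    """Logistic model: probability that a contract attracts only one bidder."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have the same length")
    # Linear predictor Z: constant plus coefficient-weighted features.
    z = beta0 + sum(b * x for b, x in zip(coefficients, features))
    # Numerically stable logistic function, equal to 1 / (1 + e^(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical coefficients: a short notice period raises the odds of single
# bidding, more competitors in the market lowers them.
coefficients = [0.8, -0.3]
short_notice_few_competitors = [1.0, 1.0]
normal_notice_many_competitors = [0.0, 5.0]

p_high = single_bid_probability(-0.5, coefficients, short_notice_few_competitors)
p_low = single_bid_probability(-0.5, coefficients, normal_notice_many_competitors)
print(p_high > p_low)
```

In a setup like this, the sign and size of each fitted coefficient indicate whether a red flag is actually associated with reduced competition, which is one way indicators for an index could be selected or weighted.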
Building on Mihaly Fazekas's algorithm, the team also improved text recognition for Spanish-language contracts and adjusted the risk indicators to conditions in Peru, and Funes achieved very good results.
Open data to achieve transparency and promote innovation
Three scholars from the School of Public Administration at Tsinghua University once pointed out in their study “Government Data Openness and Anti-Corruption: Practice and Inspiration from the UK”: Open data can promote social supervision, and the public can discover corruption through open data.
Open government data is conducive to improving government transparency and promoting economic development and social innovation. In this regard, the EU and the UK are currently at the forefront.
In 2015, the European Commission launched the project Towards a European Strategy to Reduce Corruption by Enhancing the Use of Open Data (TACOD), and the United Kingdom became one of its pilot countries.
The TACOD research team found that the largest share of corruption exposures came from law enforcement agencies (34%), followed by investigative journalists (25%), freedom of information requests (14%), whistleblowing (13%), and open data (7%).
Although only 7% of corruption is exposed through open data, a large amount of corruption could be discovered earlier if certain key data were made public sooner. Open government data has the potential to become an important tool in the fight against corruption.
Even when government information is disclosed, the threshold for handling complex contracts and massive transaction data remains high. In 2009, the media exposed a scandal in which British MPs abused public funds to reimburse personal bills. The three major political parties in the UK and more than 300 MPs were involved. Faced with the vast volume of MPs' reimbursement vouchers and application documents, media outlets such as The Daily Telegraph and The Guardian released large amounts of data on the Internet and invited British citizens to take part in the investigation through crowdsourcing.

If citizens found suspicious points in the data, they could mark the corresponding records on the website, and the investigation team would follow up. However, crowdsourcing also has many problems: citizens without investigative training cannot complete these tasks efficiently and accurately.
Funes represents a breakthrough, and a best practice, in exposing corruption through open government data. The four-person team of data scientists, investigative journalists, and legal experts has convincingly demonstrated the investigative power of humans working with AI.
Funes is still in action
To this day, Funes remains on the front lines of Peruvian investigative journalism as an open and accurate investigative tool.
In addition, since last year ojo-publico, the digital investigation agency that created Funes, has been using it to verify government procurement projects carried out during the COVID-19 pandemic.