
Chatbots Are Not Search: Algorithmic Gatekeeping and Generative AI in Education Policy


By Ethan McGowan

Ethan McGowan is a Professor of Financial Technology and Legal Analytics at the Gordon School of Business, SIAI. Originally from the United Kingdom, he works at the frontier of AI applications in financial regulation and institutional strategy, advising on governance and legal frameworks for next-generation investment vehicles. McGowan plays a key role in SIAI’s expansion into global finance hubs, including oversight of the institute’s initiatives in the Middle East and its emerging hedge fund operations.
Chatbots replace lists with a single voice, intensifying algorithmic gatekeeping
In portal-first markets like Korea, hallucination and narrowed content threaten civic learning
Mandate citations, rival answers, list-mode defaults, and independent audits in schools and platforms

Only 6% of people in South Korea go directly to news sites or apps for news. The majority access information through platforms like Naver, Daum, and YouTube. When most of a nation relies on just a few sources for public information, how those sources are designed becomes a civic issue, not just a product feature. This is the essence of algorithmic gatekeeping. In the past, recommendation engines provided lists. Users could click away, search again, or compare sources. Chatbots do more than that. They make selections, condense information, and present it in a single voice. That voice can appear calm but may be misleading. It might "hallucinate." It can introduce bias that seems helpful. If news access shifts to a chatbot interface, the old concerns about search bias become inadequate. We need policies that treat conversational responses as editorial decisions on a large scale. In Korea's portal culture, this change is urgent and has wider implications.

Algorithmic gatekeeping changes the power of defaults

In the past, the main argument for personalization was choice. Lists of links allowed users to retain control. They could type a new query or try a different portal. In chat, however, the default is an answer rather than a list. This answer influences the follow-up question. It creates context and narrows the scope. In a portal-driven market like Korea, where portals are the primary source for news and direct access is uncommon, designing a single default answer carries democratic significance. When a gate provides an answer instead of a direction, the line between curation and opinion becomes unclear. Policymakers should view this not simply as a tech upgrade, but as a change in editorial control with stakes greater than previous debates about search rankings and snippets. If algorithmic gatekeeping once organized information like shelves, it now defines the blurb on the cover. That blurb can be convincing because it appears neutral. However, it is difficult to audit without a clear paper trail.

Figure 1: Older Koreans rely on YouTube for news more than younger groups, concentrating agenda-setting power in platform gatekeepers. This makes default “single-answer” chat layers even more consequential for civic learning.

Korea's news portals reveal both opportunities and dangers. A recent peer-reviewed study comparing personalized and non-personalized feeds on Naver shows that personalization can lower content diversity while increasing source diversity, and that personalized outputs tend to appear more neutral than non-personalized ones. The user's own beliefs did not significantly affect the measured bias. This does not give a free pass. Reduced content diversity can still limit what citizens learn. More sources do not ensure more perspectives. A seemingly "neutral" tone in a single conversational response may hide what has been left out. In effect, algorithmic gatekeeping can seem fair while still limiting the scope of information. The shift from lists to voices amplifies this narrowing, especially for first-time users who rarely click through.

Algorithmic gatekeeping meets hallucination risk

Another key difference between search and chat is how errors arise. Recommendation engines might surface biased links, but they rarely create false information. Chatbots sometimes do. Research on grounded summarization indicates modest but genuine rates of hallucination for leading models, typically in the low single digits when responses rely on provided sources. Vectara's public leaderboard shows rates around 1-3% for many top systems within this limited task. That may seem small until you consider it across millions of responses each day. These low figures hold in narrow, source-grounded tests. In more open tasks, academic reviews in 2024 found hallucination rates ranging from 28% to as high as 91% across various models and prompts. Some reasoning-focused models also showed spikes, with one major system measuring over 14% in a targeted assessment. The point is clear: errors are a feature of current systems, not isolated bugs. In a chat interface, that risk sits at the entrance to the public sphere.
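To see what a low single-digit rate means at portal scale, a rough back-of-the-envelope calculation is enough. The daily answer volume in the sketch below is a hypothetical assumption for illustration, not a measured figure for any Korean portal.

```python
# Back-of-the-envelope sketch: how per-answer hallucination rates scale with volume.
# The daily volume is a hypothetical assumption, not a measured portal figure.

daily_chat_answers = 10_000_000  # assumed portal-wide chat answers per day

rates = {
    "grounded summarization (leaderboard range, ~2%)": 0.02,
    "open-ended tasks (low end of 2024 reviews, 28%)": 0.28,
}

for task, rate in rates.items():
    flawed = int(daily_chat_answers * rate)
    print(f"{task}: roughly {flawed:,} flawed answers per day")
```

Even under the optimistic grounded-task rate, the absolute number of flawed answers reaching readers each day would be large.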

Korea's regulators have begun to treat this as a user-protection issue. In early 2025, the Korea Communications Commission issued guidelines to protect users of generative AI services. These guidelines include risk management responsibilities for high-impact systems. The broader AI Framework Act promotes a risk-based approach and outlines obligations for generative AI and other high-impact uses. Competition authorities are also monitoring platform power and preferential treatment in digital markets. These developments indicate a shift from relaxed platform policies to rules that address the actual impact of algorithmic gatekeeping. If the main way to access news starts to talk, we must ask what it says when it is uncertain, how it cites information, and whether rivals can respond on the same platform. Portals that make chat the default should have responsibilities more akin to broadcasters than bulletin boards.

Algorithmic gatekeeping in a portal-first country

South Korea is a critical case because portals shape user habits more than in many democracies. The Reuters Institute's 2025 country report highlights that portals still have the largest share of news access. A Korea Times summary of the same data emphasizes the extent of intermediation: only 6% of users go directly to news sites or apps. Meanwhile, news avoidance is increasing; a Korea Press Foundation survey found that over 70% of respondents avoid the news, citing perceived political bias as a key reason. In this environment, how first-touch interfaces are designed matters significantly. If a portal transitions from lists to chat, it could result in fewer users clicking through to original sources. This would limit exposure to bylines, corrections, and the editorial differences between news and commentary. It would also complicate educators' efforts to teach source evaluation when the "source" appears as a single, blended answer.

The Korean research on personalized news adds another layer. If personalization on Naver tends to present more neutral content while offering fewer distinct topics, then a constant chat interface could amplify a narrow but calm midpoint. This may reduce polarization at the edges but could also hinder diversity and civic curiosity. Educators need students to recognize differing viewpoints, not just a concise summary. Administrators require media literacy programs that teach students how an answer was created, not just how to verify a statement. Policymakers need transparency not only in training data, but also in the live processes that fetch, rank, cite, and summarize information. In a portal-first system, these decisions will determine whether algorithmic gatekeeping expands or restricts the public's perspective. The shift to chat must include a clear link from evidence to statement, visible at the time of reading, not buried in a help page.

What schools, systems, and regulators should do next

First, schools should emphasize dialog-level source critique. Traditional media literacy teaches students to read articles and evaluate outlets. Chat requires a new skill: tracing claims back through a live answer. Teachers can ask students to expand citations within chat responses and compare answers to at least two linked originals. They can cultivate a habit of using "contrast prompts": ask the same question for two conflicting viewpoints and compare the results. This helps build resistance against the tidy, singular answers that algorithmic gatekeeping often produces. In Korea, where most students interact with news via portals, this approach is essential for civic education.

Second, administrators should set defaults that emphasize source accuracy. If schools implement chat tools, the default option should be "grounded with inline citations" instead of open-ended dialogue. Systems should show a visible uncertainty badge when the model is guessing or when sources differ. Benchmarks are crucial here. Using public metrics like the Vectara HHEM leaderboard helps leaders choose tools with lower hallucination risks for summary tasks. It also enables IT teams to conduct acceptance tests that match local curricula. The aim is not a flawless model, but predictable behavior under known prompts, especially in critical classes like civics and history.
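One way to make those acceptance tests concrete is a small harness that checks whether every answer to a curriculum-aligned prompt either carries citations or flags its own uncertainty. The sketch below is illustrative only: `ask_vendor_chat`, the sample prompts, and the pass threshold are hypothetical placeholders for whatever tool and local policy a district actually adopts, not any vendor's real API.

```python
# Minimal acceptance-test sketch for a "grounded with inline citations" default.
# ask_vendor_chat, the prompts, and the threshold are hypothetical placeholders.
from dataclasses import dataclass

@dataclass
class ChatAnswer:
    text: str
    citations: list[str]        # URLs or document IDs backing the answer
    uncertainty_flagged: bool   # True when the system signals it is guessing

def ask_vendor_chat(prompt: str) -> ChatAnswer:
    # Stand-in for the district's chosen tool; replace with the real integration.
    return ChatAnswer(text="(stub answer)", citations=[], uncertainty_flagged=True)

CURRICULUM_PROMPTS = [
    "Summarize today's top education story and cite your sources.",
    "How do most people in South Korea access news, and what is the evidence?",
]

def passes_acceptance(min_cited_share: float = 0.95) -> bool:
    cited = 0
    for prompt in CURRICULUM_PROMPTS:
        answer = ask_vendor_chat(prompt)
        if answer.citations:
            cited += 1
        elif not answer.uncertainty_flagged:
            # An uncited answer with no uncertainty badge is the failure mode to reject.
            return False
    return cited / len(CURRICULUM_PROMPTS) >= min_cited_share

print("acceptance:", passes_acceptance())
```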

Third, policymakers should ensure chat defaults allow for contestation. A portal that gives default answers should come with a "Right to a Rival Answer." If a user asks about a contested issue, the interface should automatically show a credible opposing viewpoint, linked to its own sources, even if the user does not explicitly request it. Korea's new AI user-protection guidelines and risk-based framework provide opportunities for such regulations. So do competition measures aimed at self-favoring practices. The goal is not to dictate outcomes, but to ensure viewpoint diversity is a standard component of gatekeeper services. Requiring a visible, user-controllable "list mode" alongside chat would also maintain some of the user agency from the search age. These measures are subtle but impactful. They align with user habits rather than countering them.
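The "Right to a Rival Answer" and the list-mode default become auditable once the required response structure is spelled out. The schema below is an illustrative sketch of what such a contract could look like, not text from the KCC guidelines or any regulator's draft; all field names are invented.

```python
# Illustrative response contract for a "Right to a Rival Answer" plus list mode.
# Field names are invented for this sketch; no regulator has defined this schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SourcedClaim:
    statement: str
    source_urls: list[str]            # evidence visible at the point of reading

@dataclass
class GatekeeperAnswer:
    primary: SourcedClaim             # the default conversational answer
    contested_topic: bool             # flagged by the portal's own classifier
    rival: Optional[SourcedClaim]     # credible opposing view, required when contested
    ranked_links: list[str] = field(default_factory=list)  # user-switchable list mode

def is_compliant(answer: GatekeeperAnswer) -> bool:
    """Checks the two obligations discussed above: rival answers and list mode."""
    rival_ok = (not answer.contested_topic) or (
        answer.rival is not None and bool(answer.rival.source_urls)
    )
    return rival_ok and bool(answer.primary.source_urls) and bool(answer.ranked_links)

example = GatekeeperAnswer(
    primary=SourcedClaim("Claim A, as reported.", ["https://example.org/a"]),
    contested_topic=True,
    rival=SourcedClaim("Claim B, a credible counterview.", ["https://example.org/b"]),
    ranked_links=["https://example.org/a", "https://example.org/b"],
)
print(is_compliant(example))  # True
```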

Finally, auditing must move closer to journalism standards. Academic teams in Korea are already developing datasets and methods to identify media bias across issues. Regulators should fund independent research labs that use these tools to rigorously test portal chats on topics like elections and education. The results should be made public, not just sent to vendors. Additionally, portals should provide "sandbox" APIs that let civil society groups perform audits without non-disclosure agreements. This approach aligns with Korea's recent steps towards AI governance and adheres to global best practices. In a world dominated by algorithmic gatekeeping, we need more than just transparency reports. We require active, replicated tests that reflect real user experiences on a large scale.

Anticipating the critiques

One critique argues that chat reduces polarization by softening language and eliminating the outrage incentives present in social feeds. There is some validity to this. Personalized feeds on Naver display more neutral coverage and fewer biased statements compared to non-personalized feeds. However, neutrality in tone does not equate to diversity in content. If chat limits exposure to legitimate but contrasting viewpoints, public debate may collapse into a narrow middle shaped by model biases and gaps in training data. In education, this can limit opportunities to teach students how to assess conflicting claims. The solution is not to ban chat, but to create an environment that fosters healthy debate. Offering rival answers, clear citations, and prompts for contrast allows discussion to thrive without inciting outrage.

Another critique holds that hallucination rates are falling quickly and therefore deserve less concern. It is true that in grounded tasks, many leading systems now have low single-digit hallucination rates. However, it is also true that in numerous unconstrained tasks, these rates remain high, and some reasoning-focused models see significant spikes when under pressure. In practice, classroom use falls between these extremes. Students will pose open questions, blend facts with opinions, and explore outside narrow sources. This is why policy should acknowledge the potential for error and create safeguards where it counts: defaulting to citations, displaying uncertainty, and maintaining a list-mode option. When the gatekeeper provides information, a small rate of error can pose a significant social risk. The solution isn't perfection; it's building a framework that allows users to see, verify, and switch modes as needed.

Figure 2: Quiz responses lean on two outlets for ~60% of sources, revealing a narrow upstream pool; a single default chat answer can amplify that concentration.

Lastly, some warn that stricter regulations may hinder innovation. However, Korea's recent policy trends suggest otherwise. Risk-based requirements, user-protection guidelines, and oversight of competition can target potential harms without hindering progress. Clear responsibilities often accelerate adoption by providing confidence to schools and portals to move forward. The alternative—ambiguous liabilities and unclear behaviors—impedes pilot programs and stirs public mistrust. In a portal-first market, trust is the most valuable resource. Guidelines that make algorithmic gatekeeping visible and contestable are not obstacles. They are essential for sustainable growth.

If a nation accesses news through gatekeepers, then the defaults at those gates become a public concern. South Korea illustrates the stakes involved. Portals dominate access. Direct visits are rare. A transition from lists to chat shifts control from ranking to authorship. It also brings the risk of hallucination to the forefront. We cannot view this merely as an upgrade to search. It is algorithmic gatekeeping with a new approach. The response is not to fear chat. It is to tie chat to diversity, source accuracy, and choice. Schools can empower students to demand citations and contrasting views. Administrators can opt for grounded response modes and highlight uncertainty by default. Regulators can mandate rival answers, keep list mode accessible, and fund independent audits. If we take these steps, the new gates can expand the public square instead of constricting it. If we leave this solely to product teams, we risk tidy answers to fewer questions. The critical moment is now. The path forward is clear. We should follow it.


The views expressed in this article are those of the author(s) and do not necessarily reflect the official position of the Swiss Institute of Artificial Intelligence (SIAI) or its affiliates.


References

Adel, A. (2025). Can generative AI reliably synthesise literature? Exploring hallucination risks in LLMs. AI & Society. https://doi.org/10.1007/s00146-025-02406-7
Foundation for Freedom Online. (2025, April 18). South Korea's new AI Framework Act: A balancing act between innovation and regulation. Future of Privacy Forum.
Kim & Chang. (2025, March 7). The Korea Communications Commission issues the Guidelines on the Protection of Users of Generative AI Services.
Korea Press Foundation. (2024). Media users in Korea (news avoidance findings as summarized by RSF). Reporters Without Borders country profile: South Korea.
Korea Times. (2025, June 18). YouTube dominates news consumption among older, conservative Koreans; only 6% access news directly.
Lee, S. Y. (2025). How diverse and politically biased is personalized news compared to non-personalized news? The case of Korea's internet news portals. SAGE Open.
Reuters Institute for the Study of Journalism. (2025). Digital News Report—South Korea country page.
Vectara. (2024, August 5). HHEM 2.1: A better hallucination detection model and a new leaderboard.
Vectara. (2025). LLM Hallucination Leaderboard.
Vectara. (2025, February 24). Why does DeepSeek-R1 hallucinate so much?
Yonhap/Global Competition Review. (2025, September 22). KFTC introduces new measures to regulate online players; amends merger guidelines for digital markets.


Parrondo's Paradox in AI: Turning Losing Moves into Better Education Policy


By David O'Neill

David O’Neill is a Professor of Finance and Data Analytics at the Gordon School of Business, SIAI. A Swiss-based researcher, his work explores the intersection of quantitative finance, AI, and educational innovation, particularly in designing executive-level curricula for AI-driven investment strategy. In addition to teaching, he manages the operational and financial oversight of SIAI’s education programs in Europe, contributing to the institute’s broader initiatives in hedge fund research and emerging market financial systems.

AI reveals Parrondo’s paradox can turn losing tactics into schoolwide gains
Run adaptive combined-game pilots with bandits and multi-agent learning, under clear guardrails
Guard against persuasion harms with audits, diversity, and public protocols

The most concerning number in today's learning technology debate is 64. In May 2025, a preregistered study published in Nature Human Behaviour found that GPT-4 could outperform humans in live, multi-round online debates 64% of the time when it could quietly adjust arguments to fit a listener's basic traits. In other words, when the setting becomes a multi-stage, multi-player conversation—more like a group game than a test—AI can change our expectations about what works. What seems weak alone can become strong in combination. This is the essence of Parrondo's paradox: two losing strategies, when alternated or combined, can lead to a win. The paradox is no longer just a mathematical curiosity; it signals a policy trend. If "losing" teaching techniques or governance rules can be recombined by machines into a better strategy, education will require new experimental designs and safeguards. The exact mechanics that improve learning supports can also enhance manipulation. We need to prepare for both.

What Parrondo's paradox in AI actually changes

Parrondo's paradox is easy to explain and hard to forget: under the right conditions, alternating between two strategies that each lose on their own can result in a net win. Scientific American's recent article outlines the classic setup—Game A and Game B both favor the house, yet mixing them produces a positive expected value—supported by specific numbers (for one sequence, a gain of around 1.48 cents per round). The key is structural: Game B's odds rely on the capital generated by Game A, creating an interaction between the games. This is not magic; it is coupling. In education systems, we see coupling everywhere: attendance interacts with transportation; attention interacts with device policies; curriculum pacing interacts with assessment stakes. When we introduce AI to this complex environment, we are automatically in combined-game territory. The right alternation of weak rules can outperform any single "best practice," and machine agents excel at identifying those alternations.
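The arithmetic behind that coupling is easy to verify in a short simulation. The sketch below uses the standard textbook parameters for the two coin games (a bias of 0.005 and a capital-mod-3 rule for Game B) rather than the exact sequence quoted in the Scientific American piece, so the printed numbers illustrate the effect rather than reproduce that article's figures.

```python
# Simulation of the canonical Parrondo coin games. Parameters follow the standard
# textbook construction (epsilon = 0.005, mod-3 rule for Game B), not the exact
# sequence quoted in the Scientific American article.
import random

EPS = 0.005

def play_a(capital: int) -> int:
    # Game A: a slightly unfair coin; losing on its own.
    return capital + (1 if random.random() < 0.5 - EPS else -1)

def play_b(capital: int) -> int:
    # Game B: odds depend on current capital; this coupling drives the paradox.
    win_prob = 0.1 - EPS if capital % 3 == 0 else 0.75 - EPS
    return capital + (1 if random.random() < win_prob else -1)

def average_final_capital(step, rounds: int = 100_000, trials: int = 20) -> float:
    total = 0
    for _ in range(trials):
        capital = 0
        for _ in range(rounds):
            capital = step(capital)
        total += capital
    return total / trials

random.seed(0)
print("A only     :", average_final_capital(play_a))   # drifts negative
print("B only     :", average_final_capital(play_b))   # drifts negative
print("random A/B :", average_final_capital(
    lambda c: play_a(c) if random.random() < 0.5 else play_b(c)))  # drifts positive
```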

Parrondo's paradox in AI, then, is not merely a metaphor; it is a method. Multi-agent reinforcement learning (MARL) applies game-theoretic concepts—best responses, correlated equilibria, evolutionary dynamics—and learns policies by playing in shared environments. Research from 2023 to 2024 shows a shift from simplified 2-player games to mixed-motive, multi-stage scenarios where communication, reputation, and negotiation are essential. AI systems that used to solve complex puzzles are now tackling group strategy: forming coalitions, trading short-term losses for long-term coordination, and adapting to changing norms. This shift is crucial for schools and ministries. Most education challenges—placement, scheduling, teacher allocation, behavioral nudges, formative feedback—are not single-shot optimization tasks; they involve repeated, coupled games among thousands of agents. If Parrondo effects exist anywhere, they exist here.

Figure 1: Alternating weak policies (A/B) produces higher cumulative learning gains than A-only or B-only because the alternation exploits dependencies.

Parrondo's paradox in AI, from lab games to group decisions

Two findings make the policy implications clear. First, Meta's CICERO achieved human-level performance in the negotiation game Diplomacy, which involves building trust and managing coalitions among seven players. Across 40 anonymous league games, CICERO scored more than double the human average and ranked in the top 10% of all participants. It accomplished this by combining a language model with a planning engine that predicted other players' likely actions and shaped messages to match evolving plans. This is a combined game at its finest: language plus strategy; short-term concessions paired with long-term positioning. Education leaders should view this not as a curiosity from board games but as a proof-of-concept showing that machines can leverage cross-stage dependencies to transform seemingly weak moves into strong coalitions—precisely what we need for attendance recovery, grade-level placement, and improving campus climate.

Second, persuasion is now measurable at scale. The 2025 Nature Human Behaviour study had around 900 participants engage in multi-round debates and found that large language models not only kept pace but also outperformed human persuaders 64% of the time with minimal personalization. The preregistered analysis revealed an 81.7% increase in the likelihood of changing agreement compared to human opponents in that personalized setting. Debate is a group game with feedback: arguments change the state, which influences subsequent arguments. This is where Parrondo's effects come into play, and the data suggest that AI can uncover winning combinations among rhetorical strategies that might appear weak when viewed in isolation. This is a strong capability for tutoring and civic education—if we can demonstrate improvements without undermining autonomy or trust. Conversely, it raises concerns for assessment integrity, media literacy, and platform governance.

Figure 2: With light personalization, GPT-4 persuades more often than humans (64% vs 36%), showing how combined strategies can flip expected winners.

Designing combined games for education: from pilots to policy

If Parrondo's paradox in AI applies to group decision-making, education must change how it conducts experiments. The current approach—choosing one "treatment," comparing it to a "control," and scaling the winner—reflects a single-game mindset. A better design would draw from adaptive clinical trials, where regulators now accept designs that adjust as evidence accumulates. An adaptive trial allows prespecified modifications to its procedures or interventions based on interim results. In September 2025, the U.S. Food and Drug Administration issued draft guidance (E20) on adaptive designs, establishing principles for planning, analysis, and interpretation. The reasoning is straightforward: if treatments interact with their context and with each other, we must allow the experiment itself to adapt, combining or alternating candidate strategies to reveal hidden wins. Education trials should similarly adjust scheduling rules, homework policies, and feedback timing, enabling algorithms to modify the mix as new information emerges rather than sticking to a single policy for an entire year.

A practical starting point is to regard everyday schooling as a formal multi-armed bandit problem with ethical safeguards in place. The multi-armed bandit problem is a classic dilemma in probability and statistics: a gambler facing several slot machines with unknown payoffs must decide which arm to pull next to maximize total reward over a series of pulls. In education, the analogue is choosing among teaching strategies or interventions to maximize student learning outcomes. Bandit methods—used in dose-finding and response-adaptive randomization—shift participants toward better-performing options while mitigating risk. A 2023 review in clinical dose-finding highlights their clarity and effectiveness: allocate more to what works, keep exploring, and update as outcomes arrive. In a school context, this could involve alternating two moderately effective formative feedback methods—such as nightly micro-quizzes and weekly reflection prompts—because the alternation exploits a known dependency (such as sleep consolidation midweek or teacher workload on Fridays). Either approach alone might be a "loser" in isolation; alternated by a bandit algorithm, the combination could improve attention and retention while reducing teacher burnout. The policy step is to normalize such combined-game pilots with preregistered safeguards and clear dashboards so that improvements do not compromise equity or consent.
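A minimal sketch of such a pilot, assuming an epsilon-greedy bandit with an exploration floor, is below. The two tactics and their effect sizes are invented for illustration; a real deployment would replace the simulated outcome with preregistered, audited measures and add the equity and consent guardrails described above.

```python
# Epsilon-greedy bandit sketch for alternating two formative-feedback tactics.
# Arm names and effect sizes are invented; replace simulate_outcome with real,
# preregistered outcome data in an actual pilot.
import random

ARMS = ["nightly_micro_quiz", "weekly_reflection"]
ASSUMED_EFFECT = {"nightly_micro_quiz": 0.55, "weekly_reflection": 0.48}  # hypothetical
EXPLORE_FLOOR = 0.10  # always keep some probability on the currently "losing" arm

def simulate_outcome(arm: str) -> int:
    """Stand-in for an observed outcome (1 = weekly mastery check passed)."""
    return 1 if random.random() < ASSUMED_EFFECT[arm] else 0

def run_pilot(weeks: int = 300) -> dict:
    counts = {arm: 0 for arm in ARMS}
    successes = {arm: 0 for arm in ARMS}
    for _ in range(weeks):
        if random.random() < EXPLORE_FLOOR or min(counts.values()) == 0:
            arm = random.choice(ARMS)  # exploration keeps the whole mix alive
        else:
            arm = max(ARMS, key=lambda a: successes[a] / counts[a])
        counts[arm] += 1
        successes[arm] += simulate_outcome(arm)
    return {arm: round(successes[arm] / counts[arm], 3) for arm in ARMS}

random.seed(1)
print(run_pilot())  # observed pass rates per tactic after the pilot
```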

Risk, governance, and measurement in a world of combined games

Parrondo's paradox in AI is not without its challenges. Combined games are more complex to audit than single-arm trials, and "winning" can mask unacceptable side effects. Multi-agent debate frameworks that perform well in one setting can fail in another. Several studies from 2024 to 2025 warn that multi-agent debate can sometimes reduce accuracy or amplify errors, especially if agents converge on persuasive but incorrect arguments or if there is low diversity in reasoning paths. Education has real examples of this risk: groupthink in committee decisions, educational trends that spread through persuasion rather than evidence. As we implement AI systems that coordinate across classrooms or districts, we should be prepared for similar failure modes—and proactively assess for them. A short-term solution is to ensure diversity: promote variety among agents, prompts, and evaluation criteria; penalize agreement without evidence; and require control groups where the "winning" combined strategy must outperform a strong single-agent baseline.

Measurement must evolve as well. Traditional assessment captures outcomes. Combined games require tracking progress: how quickly a policy adjusts to shocks, how outcomes shift for subgroups over time, and how often the system explores less-favored strategies to prevent lock-in. Here again, AI can assist. DeepMind's 2024–2025 work on complex reasoning—like AlphaGeometry matching Olympiad-level performance on formal geometry—demonstrates that machine support can navigate vast policy spaces that are beyond unaided human design. However, increased searching power raises ethical concerns. Education ministries should follow the example of health regulators: publish protocols for adaptive design, specify stopping rules, and clarify acceptable trade-offs before the search begins. Combined games can be a strategic advantage; they should not be kept secret.

The policy playbook: how to use losing moves to win fairly

The first step is to make adaptive, combined-game pilots standard at the district or national level. Every mixed-motive challenge—attendance, course placement, teacher assignment—should have an environment where two or more modest strategies are intentionally alternated and refined based on data. The protocol should identify the dependency that justifies the combination (for example, how scheduling changes affect homework return) and the limits on explorations (equity floors, privacy constraints, and teacher workload caps). If we expect the benefits of Parrondo's paradox, we need to plan for them.

The second step is to raise the evidence standards for any AI that claims benefits from coordination or persuasion. Systems like CICERO that plan and negotiate among agents should be assessed against human-compatible standards, not just raw scores. Systems capable of persuasion should have disclosure requirements, targeted-use limits, and regular assessments for subgroup harm. Given that AI can now win debates more often than people under light personalization, we should assume that combined rhetorical strategies—some weak individually—can manipulate as well as educate. Disclosure and logging will not solve this on their own, but they are essential for accountability in combined games.

The third step is to safeguard variability in decision-making. Parrondo's paradox thrives because alternation helps avoid local traps. In policy, that means maintaining a mix of tactics even when one appears superior. If a single rule dominates every dashboard for six months, the system is likely overfitting. Always keeping at least one "loser" in the mix allows for flexibility and tests whether the environment has changed. This approach is not indecision; it is precaution.

The fourth step is to involve educators and students. Combined games will only be legitimate if those involved can understand and influence the alternations. Inform teachers when and why the schedule shifts; let students join exploration cohorts with clear incentives; publish real-time fairness metrics. In a combined game, transparency is a key part of the process.

64 is not just about debates; it represents the new baseline of machine strategy in group contexts. In the context of Parrondo's paradox in AI, education is a system of interlinked games with noisy feedback and human stakes. The lesson is not to search for one dominant strategy. Instead, we need to design for alternation within constraints, allowing modest tactics to combine for strong outcomes while keeping the loop accountable when optimization risks becoming manipulation. The evidence is already available: combined strategies can turn weak moves into successful policies, as seen in CICERO's coalition-building and in adaptive trials that dynamically adjust. The risks are present too: debate formats can lower accuracy; personalized persuasion can exceed human defenses. The call to action is simple to lay out and challenging to execute. Establish Parrondo-aware pilots with clear guidelines. Commit to adaptive measurement and public protocols. Deliberately maintain diversity in the system. If we do that, we can let losing moves teach us how to win—without losing sight of why we play.


The views expressed in this article are those of the author(s) and do not necessarily reflect the official position of the Swiss Institute of Artificial Intelligence (SIAI) or its affiliates.


References

Bakhtin, A., Brown, N., Dinan, E., et al. (2022). Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science (technical report version). Meta FAIR Diplomacy Team.
Bischoff, M. (2025, October 16). A Mathematical Paradox Shows How Combining Losing Strategies Can Create a Win. Scientific American.
De La Fuente, N., Noguer i Alonso, M., & Casadellà, G. (2024). Game Theory and Multi-Agent Reinforcement Learning: From Nash Equilibria to Evolutionary Dynamics. arXiv.
Food and Drug Administration (FDA). (2025, September 30). E20 Adaptive Designs for Clinical Trials (Draft Guidance).
Huh, D., & Mohapatra, P. (2023). Multi-Agent Reinforcement Learning: A Comprehensive Survey. arXiv.
Kojima, M., et al. (2023). Application of multi-armed bandits to dose-finding clinical trials. European Journal of Operational Research.
Ning, Z., et al. (2024). A survey on multi-agent reinforcement learning and its applications. Intelligent Systems with Applications.
Salvi, F., Horta Ribeiro, M., Gallotti, R., & West, R. (2024/2025). On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial (preprint 2024; published 2025 as On the conversational persuasiveness of GPT-4 in Nature Human Behaviour).
Trinh, T. H., et al. (2024). Solving Olympiad geometry without human demonstrations (AlphaGeometry). Nature.
Wynn, A., et al. (2025). Understanding Failure Modes in Multi-Agent Debate. arXiv.

